From: Daniel Vetter <daniel@ffwll.ch>
To: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Cc: Lokesh Vutla <lokeshvutla@ti.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [PATCH 11/11] drm/tidss: Annotate dma-fence critical section in commit path
Date: Tue, 22 Jun 2021 18:31:34 +0200
Message-ID: <YNIQZswOVhXdhZoi@phenom.ffwll.local>
In-Reply-To: <7c07a183-cd1b-ab75-527c-cdf37cd4273c@ideasonboard.com>

On Thu, May 27, 2021 at 03:15:34PM +0300, Tomi Valkeinen wrote:
> Hi Daniel,
> 
> On 21/01/2021 17:29, Daniel Vetter wrote:
> > Ends right after hw_done(), totally standard case.
> > 
> > Acked-by: Jyri Sarha <jsarha@ti.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Jyri Sarha <jsarha@ti.com>
> > Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
> > ---
> >   drivers/gpu/drm/tidss/tidss_kms.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/tidss/tidss_kms.c b/drivers/gpu/drm/tidss/tidss_kms.c
> > index 09485c7f0d6f..95f8e0f78e32 100644
> > --- a/drivers/gpu/drm/tidss/tidss_kms.c
> > +++ b/drivers/gpu/drm/tidss/tidss_kms.c
> > @@ -4,6 +4,8 @@
> >    * Author: Tomi Valkeinen <tomi.valkeinen@ti.com>
> >    */
> > +#include <linux/dma-fence.h>
> > +
> >   #include <drm/drm_atomic.h>
> >   #include <drm/drm_atomic_helper.h>
> >   #include <drm/drm_bridge.h>
> > @@ -26,6 +28,7 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
> >   {
> >   	struct drm_device *ddev = old_state->dev;
> >   	struct tidss_device *tidss = to_tidss(ddev);
> > +	bool fence_cookie = dma_fence_begin_signalling();
> >   	dev_dbg(ddev->dev, "%s\n", __func__);
> > @@ -36,6 +39,7 @@ static void tidss_atomic_commit_tail(struct drm_atomic_state *old_state)
> >   	drm_atomic_helper_commit_modeset_enables(ddev, old_state);
> >   	drm_atomic_helper_commit_hw_done(old_state);
> > +	dma_fence_end_signalling(fence_cookie);
> >   	drm_atomic_helper_wait_for_flip_done(ddev, old_state);
> >   	drm_atomic_helper_cleanup_planes(ddev, old_state);
> > 
> 
> I bisected v5.13 rc lockdep warnings to this patch. I see this with tidss (lockdep
> report below) and omapdrm (probably caused by "drm/omapdrm: Annotate dma-fence critical
> section in commit path"). The report on omapdrm is very similar, with fs_reclaim and
> dma_fence_map, but hdmi bridge instead of mhdp bridge.
> 
> I'm unfamiliar with this piece of drm code. Do you have any hints on what the problem
> might be and where I should be looking?

This entirely fell through the cracks.

tl;dr: Strictly speaking, anything which can hold up a dma_fence from
signalling is not allowed to allocate memory. There are some exceptions to
this (allocations that cannot enter direct reclaim, e.g. GFP_NOWAIT or
GFP_ATOMIC, are ok), but that's the gist. And atomic commits can in
theory return a dma_fence that will signal when it's all done; in practice
that's used by Android only.
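
To make the reasoning behind that rule concrete, here is a small userspace
model of the dependency chain in the lockdep report quoted below. It is only
a sketch and not kernel code: `dep_graph`, `add_dep()`, `would_deadlock()`
and `report_graph()` are hypothetical names invented for this illustration,
and the real lockdep implementation is far more involved. The model records
"B was acquired while A was held" edges and checks whether a new acquisition
would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the three lock classes named in the report. */
enum lock_class { FS_RECLAIM, DMA_FENCE_MAP, LINK_MUTEX, NUM_CLASSES };

/* dep[a][b] == true means "b was acquired while a was held" (a --> b). */
struct dep_graph {
	bool dep[NUM_CLASSES][NUM_CLASSES];
};

static void add_dep(struct dep_graph *g, enum lock_class held,
		    enum lock_class acquired)
{
	assert(held < NUM_CLASSES && acquired < NUM_CLASSES);
	g->dep[held][acquired] = true;
}

/* Depth-first search: can we get from 'from' to 'to' along recorded edges? */
static bool reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_CLASSES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Acquiring 'acquired' while holding 'held' would add the edge
 * held --> acquired. That closes a cycle if 'held' is already
 * reachable from 'acquired'.
 */
static bool would_deadlock(const struct dep_graph *g, enum lock_class held,
			   enum lock_class acquired)
{
	bool seen[NUM_CLASSES] = { false };

	return reachable(g, acquired, held, seen);
}

/* The two existing edges from the report: #1 and #2. */
static struct dep_graph report_graph(void)
{
	struct dep_graph g = { 0 };

	/* #1: reclaim may wait on fences (primed by dma_resv_lockdep). */
	add_dep(&g, FS_RECLAIM, DMA_FENCE_MAP);
	/* #2: link_mutex is taken inside the signalling section. */
	add_dep(&g, DMA_FENCE_MAP, LINK_MUTEX);
	return g;
}
```

With those two edges recorded, `would_deadlock(&g, LINK_MUTEX, FS_RECLAIM)`
is true, which corresponds to entry #0 in the report: the allocation in
drm_mode_duplicate() while link_mutex is held. The same holds for
`would_deadlock(&g, DMA_FENCE_MAP, FS_RECLAIM)`, i.e. a reclaiming
allocation made directly inside the signalling section.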

This is somewhat the reason why I've not pushed this further: it's a very
drastic limitation. Also I've been snowed under with tons of issues around
dma-fence from the i915 side.

Now the issue still is that there are some real deadlocks here, and it would
be nice to annotate them all. OTOH not being able to allocate any memory
(or take locks that in other paths are held while allocating memory, which
is equivalent) is really tough, and definitely not feasible for full
modesets. So the pragmatic fix I think would be to limit the dma_fence
OUT-FENCE for Android to only the page-flip case, and similarly limit our
annotations. And if you do an OUT-FENCE which includes a modeset, we just
fall back to a synchronous modeset. The problem can only happen if
userspace (and hence some other kernel thread) can get at the out
dma_fence before we've finished all the work that might need allocations
and whatnot.
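
As a rough illustration of that policy, the decision could be modelled as
below. This is a userspace sketch under stated assumptions, not DRM code:
`struct commit_request`, `struct commit_plan` and `plan_commit()` are
hypothetical names, and the mapping from request to plan is one reading of
the proposal above rather than an existing kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of an atomic commit request; not a DRM structure. */
struct commit_request {
	bool wants_out_fence;	/* userspace asked for an OUT-FENCE */
	bool needs_modeset;	/* commit includes a full modeset */
	bool nonblocking;	/* userspace asked for a nonblocking commit */
};

/* Hypothetical outcome of the policy decision. */
struct commit_plan {
	bool run_synchronously;		/* finish all work before returning */
	bool annotate_signalling;	/* wrap commit tail in the annotation */
};

static struct commit_plan plan_commit(struct commit_request req)
{
	struct commit_plan plan;
	bool fence_with_modeset = req.wants_out_fence && req.needs_modeset;

	/*
	 * An OUT-FENCE that covers a modeset forces a synchronous commit, so
	 * nobody can wait on the fence while allocations are still pending.
	 */
	plan.run_synchronously = !req.nonblocking || fence_with_modeset;

	/* Only the pure page-flip path is annotated as a signalling section. */
	plan.annotate_signalling = !req.needs_modeset;

	/* Invariant: a fence covering a modeset is never handed out early. */
	assert(!fence_with_modeset || plan.run_synchronously);
	return plan;
}
```

In this model a nonblocking page flip with an OUT-FENCE stays asynchronous
and annotated, while a nonblocking modeset with an OUT-FENCE is downgraded
to a synchronous, unannotated commit.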

If you're bored maybe hack this up?

Also I'm kinda disappointed since in the cover letter I asked explicitly
for testing with lockdep, since I somewhat expected issues like this. But
people just blindly throw acks around without looking at the bigger
picture.

Cheers, Daniel

> 
>  Tomi
> 
> [   20.769286] ======================================================
> [   20.776918] WARNING: possible circular locking dependency detected
> [   20.783082] 5.11.0-rc2-00688-g4d56a4f08391-dirty #18 Not tainted
> [   20.789072] ------------------------------------------------------
> [   20.795232] kmstest/397 is trying to acquire lock:
> [   20.800008] ffff800011637878 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire+0x40/0xa4
> [   20.808014]
> [   20.808014] but task is already holding lock:
> [   20.813828] ffff000804fb41a8 (&mhdp->link_mutex){+.+.}-{3:3}, at: cdns_mhdp_atomic_enable+0x54/0x2c0 [cdns_mhdp8546]
> [   20.824343]
> [   20.824343] which lock already depends on the new lock.
> [   20.824343]
> [   20.832497]
> [   20.832497] the existing dependency chain (in reverse order) is:
> [   20.839958]
> [   20.839958] -> #2 (&mhdp->link_mutex){+.+.}-{3:3}:
> [   20.846214]        lock_acquire.part.0+0x178/0x380
> [   20.850999]        lock_acquire+0x6c/0x90
> [   20.854996]        __mutex_lock+0x9c/0x540
> [   20.859083]        mutex_lock_nested+0x44/0x70
> [   20.863513]        cdns_mhdp_atomic_enable+0x54/0x2c0 [cdns_mhdp8546]
> [   20.869938]        drm_atomic_bridge_chain_enable+0x60/0xd4 [drm]
> [   20.876093]        drm_atomic_helper_commit_modeset_enables+0x148/0x260 [drm_kms_helper]
> [   20.884207]        tidss_atomic_commit_tail+0x64/0xd0 [tidss]
> [   20.889947]        commit_tail+0xac/0x190 [drm_kms_helper]
> [   20.895445]        drm_atomic_helper_commit+0x194/0x3a0 [drm_kms_helper]
> [   20.902155]        drm_atomic_commit+0x58/0x70 [drm]
> [   20.907165]        drm_mode_atomic_ioctl+0x9f0/0xbbc [drm]
> [   20.912693]        drm_ioctl_kernel+0xd0/0x124 [drm]
> [   20.917701]        drm_ioctl+0x238/0x460 [drm]
> [   20.922189]        __arm64_sys_ioctl+0xb4/0x100
> [   20.926707]        el0_svc_common.constprop.0+0x80/0x1e0
> [   20.932007]        do_el0_svc+0x30/0xa0
> [   20.935831]        el0_svc+0x20/0x30
> [   20.939394]        el0_sync_handler+0x1a8/0x1b0
> [   20.943910]        el0_sync+0x174/0x180
> [   20.947734]
> [   20.947734] -> #1 (dma_fence_map){++++}-{0:0}:
> [   20.953644]        lock_acquire.part.0+0x178/0x380
> [   20.958422]        lock_acquire+0x6c/0x90
> [   20.962418]        __dma_fence_might_wait+0x60/0xdc
> [   20.967285]        dma_resv_lockdep+0x1dc/0x2b4
> [   20.971803]        do_one_initcall+0x90/0x460
> [   20.976145]        kernel_init_freeable+0x2c0/0x32c
> [   20.981012]        kernel_init+0x20/0x128
> [   20.985007]        ret_from_fork+0x10/0x3c
> [   20.989091]
> [   20.989091] -> #0 (fs_reclaim){+.+.}-{0:0}:
> [   20.994740]        check_noncircular+0x164/0x180
> [   20.999344]        __lock_acquire+0x13dc/0x1be4
> [   21.003861]        lock_acquire.part.0+0x178/0x380
> [   21.008639]        lock_acquire+0x6c/0x90
> [   21.012635]        fs_reclaim_acquire+0x7c/0xa4
> [   21.017152]        kmem_cache_alloc_trace+0x7c/0x3c0
> [   21.022103]        drm_mode_duplicate+0x34/0x70 [drm]
> [   21.027199]        cdns_mhdp_atomic_enable+0x1c4/0x2c0 [cdns_mhdp8546]
> [   21.033711]        drm_atomic_bridge_chain_enable+0x60/0xd4 [drm]
> [   21.039845]        drm_atomic_helper_commit_modeset_enables+0x148/0x260 [drm_kms_helper]
> [   21.047946]        tidss_atomic_commit_tail+0x64/0xd0 [tidss]
> [   21.053681]        commit_tail+0xac/0x190 [drm_kms_helper]
> [   21.059179]        drm_atomic_helper_commit+0x194/0x3a0 [drm_kms_helper]
> [   21.065889]        drm_atomic_commit+0x58/0x70 [drm]
> [   21.070897]        drm_mode_atomic_ioctl+0x9f0/0xbbc [drm]
> [   21.076423]        drm_ioctl_kernel+0xd0/0x124 [drm]
> [   21.081430]        drm_ioctl+0x238/0x460 [drm]
> [   21.085917]        __arm64_sys_ioctl+0xb4/0x100
> [   21.090434]        el0_svc_common.constprop.0+0x80/0x1e0
> [   21.095730]        do_el0_svc+0x30/0xa0
> [   21.099554]        el0_svc+0x20/0x30
> [   21.103117]        el0_sync_handler+0x1a8/0x1b0
> [   21.107632]        el0_sync+0x174/0x180
> [   21.111454]
> [   21.111454] other info that might help us debug this:
> [   21.111454]
> [   21.119435] Chain exists of:
> [   21.119435]   fs_reclaim --> dma_fence_map --> &mhdp->link_mutex
> [   21.119435]
> [   21.129768]  Possible unsafe locking scenario:
> [   21.129768]
> [   21.135670]        CPU0                    CPU1
> [   21.140184]        ----                    ----
> [   21.144698]   lock(&mhdp->link_mutex);
> [   21.148435]                                lock(dma_fence_map);
> [   21.154340]                                lock(&mhdp->link_mutex);
> [   21.160591]   lock(fs_reclaim);
> [   21.163721]
> [   21.163721]  *** DEADLOCK ***
> [   21.163721]
> [   21.169623] 4 locks held by kmstest/397:
> [   21.173532]  #0: ffff8000145efc28 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_modeset_acquire_init+0x48/0x70 [drm]
> [   21.184011]  #1: ffff000805dc7888 (crtc_ww_class_mutex){+.+.}-{3:3}, at: drm_modeset_lock+0x138/0x170 [drm]
> [   21.193795]  #2: ffff8000116d53f8 (dma_fence_map){++++}-{0:0}, at: tidss_atomic_commit_tail+0x2c/0xd0 [tidss]
> [   21.203701]  #3: ffff000804fb41a8 (&mhdp->link_mutex){+.+.}-{3:3}, at: cdns_mhdp_atomic_enable+0x54/0x2c0 [cdns_mhdp8546]
> [   21.214642]
> [   21.214642] stack backtrace:
> [   21.218984] CPU: 1 PID: 397 Comm: kmstest Not tainted 5.11.0-rc2-00688-g4d56a4f08391-dirty #18
> [   21.227575] Hardware name: Texas Instruments K3 J721E SoC (DT)
> [   21.233390] Call trace:
> [   21.235825]  dump_backtrace+0x0/0x1f0
> [   21.239476]  show_stack+0x24/0x80
> [   21.242781]  dump_stack+0xec/0x154
> [   21.246170]  print_circular_bug+0x1f8/0x200
> [   21.250341]  check_noncircular+0x164/0x180
> [   21.254426]  __lock_acquire+0x13dc/0x1be4
> [   21.258422]  lock_acquire.part.0+0x178/0x380
> [   21.262679]  lock_acquire+0x6c/0x90
> [   21.266155]  fs_reclaim_acquire+0x7c/0xa4
> [   21.270152]  kmem_cache_alloc_trace+0x7c/0x3c0
> [   21.274582]  drm_mode_duplicate+0x34/0x70 [drm]
> [   21.279159]  cdns_mhdp_atomic_enable+0x1c4/0x2c0 [cdns_mhdp8546]
> [   21.285151]  drm_atomic_bridge_chain_enable+0x60/0xd4 [drm]
> [   21.290764]  drm_atomic_helper_commit_modeset_enables+0x148/0x260 [drm_kms_helper]
> [   21.298343]  tidss_atomic_commit_tail+0x64/0xd0 [tidss]
> [   21.303559]  commit_tail+0xac/0x190 [drm_kms_helper]
> [   21.308535]  drm_atomic_helper_commit+0x194/0x3a0 [drm_kms_helper]
> [   21.314725]  drm_atomic_commit+0x58/0x70 [drm]
> [   21.319214]  drm_mode_atomic_ioctl+0x9f0/0xbbc [drm]
> [   21.324220]  drm_ioctl_kernel+0xd0/0x124 [drm]
> [   21.328708]  drm_ioctl+0x238/0x460 [drm]
> [   21.332676]  __arm64_sys_ioctl+0xb4/0x100
> [   21.336672]  el0_svc_common.constprop.0+0x80/0x1e0
> [   21.341449]  do_el0_svc+0x30/0xa0
> [   21.344753]  el0_svc+0x20/0x30
> [   21.347795]  el0_sync_handler+0x1a8/0x1b0
> [   21.351790]  el0_sync+0x174/0x180

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
