AMD-GFX Archive on lore.kernel.org
 help / color / Atom feed
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: DRI Development <dri-devel@lists.freedesktop.org>
Cc: linux-rdma@vger.kernel.org,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	intel-gfx@lists.freedesktop.org,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	amd-gfx@lists.freedesktop.org,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	linaro-mm-sig@lists.linaro.org,
	"Christian König" <christian.koenig@amd.com>,
	linux-media@vger.kernel.org
Subject: [RFC 00/17] dma-fence lockdep annotations
Date: Tue, 12 May 2020 10:59:27 +0200
Message-ID: <20200512085944.222637-1-daniel.vetter@ffwll.ch> (raw)

Hi all,

I've dragged my feet for years on this, hoping that cross-release lockdep
would do this for us, but well that never really happened unfortunately.
So here we are.

Cc'ed quite a pile of people since this is about the cross-driver contract
around dma_fences. Which is heavily used for dma_buf, and I'm hearing more
noises that rdma folks are looking into this, hence also on cc.

There's a bunch of different parts to this RFC:

- The annotations itself, in the 2nd patch after the prep patch to add
  might_sleep annotations. Commit message has all the motivation for what
  kind of deadlocks I want to catch, best you just read it.

  Since lockdep doesn't understand cross-release natively we need to
  cobble something together using rwlocks and a few more tricks, but from
  the test rollout in a few places in drm/vkms, amdgpu & i915 I think what
  I have now seems to actually work. Downside is that we have to
  explicitly annotate all code involved in eventual dma_fence signalling.

- Second important part is locking down the current dma-fence cross-driver
  contract, using lockdep priming like we already do for dma_resv_lock.
  I've just started with my own take on what we probably need to make the
  current code work (-ish), but both amdgpu and i915 have issues with
  that. So this needs some careful discussions, and also some thought on
  how we land it all eventually to not break lockdep completely for
  everyone.

  The important patch for that is "dma-fence: prime lockdep annotations"
  plus of course the various annotations patches and driver hacks to
  highlight some of the issues caught.

  Note that depending upon what exactly we end up deciding we might need
  to improve the annotations for fs_reclaim_acquire/release - for
  dma_fence_wait in mmu notifiers we can only allow GFP_NOWAIT (afaiui),
  and currently fs_reclaim_acquire/release only has a lockdep class for
  __GFP_FS only, we'd need to add another one for __GFP_DIRECT_RECLAIM in
  general maybe.

- Finally there's clearly some gaps in the current dma_fence driver
  interfaces: Amdgpu's hang recovery is essentially impossible to fix
  as-is - it needs to reset the display state and you can't get at modeset
  locks from tdr without deadlock potential. i915 has an internal trick
  (but it stops working once we involve real cross-driver fences) for this
  issues, but then for i915 modeset reset is only needed on very ancient
  gen2/3. Modern hw is a lot more reasonable.

  I'm kinda hoping that the annotations and priming for basic command
  submission and atomic modeset paths could be merged soonish, while we
  the tdr side clearly needs a pile more work to get going. But since we
  have to explicitly annotate all code paths anyway we can hide bugs in
  e.g. tdr code by simply not yet annotating those functions.

  I'm trying to lay out at least one idea for solving the tdr issue in the
  patch titled "drm/scheduler: use dma-fence annotations in tdr work".

Finally, once we have some agreement on where we're going with all this,
we also need some documentation. Currently that's missing because I don't
want to re-edit the text all the time while we still figure out the
details of the exact cross-driver semantics.

My goal here is that with this we can lock down the cross-driver contract
for the last bit of the dma_buf/resv/fence story and make sure this stops
being such a wobbly thing where everyone just does whatever they feel
like.

Ideas, thoughts, reviews, testing (with specific annotations for that
driver) on other drivers very much welcome.

Cheers, Daniel

Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>

Daniel Vetter (17):
  dma-fence: add might_sleep annotation to _wait()
  dma-fence: basic lockdep annotations
  dma-fence: prime lockdep annotations
  drm/vkms: Annotate vblank timer
  drm/vblank: Annotate with dma-fence signalling section
  drm/atomic-helper: Add dma-fence annotations
  drm/amdgpu: add dma-fence annotations to atomic commit path
  drm/scheduler: use dma-fence annotations in main thread
  drm/amdgpu: use dma-fence annotations in cs_submit()
  drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code
  drm/amdgpu: DC also loves to allocate stuff where it shouldn't
  drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail
  drm/scheduler: use dma-fence annotations in tdr work
  drm/amdgpu: use dma-fence annotations for gpu reset code
  Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset"
  drm/amdgpu: gpu recovery does full modesets
  drm/i915: Annotate dma_fence_work

 drivers/dma-buf/dma-fence.c                   | 56 +++++++++++++++++++
 drivers/dma-buf/dma-resv.c                    |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  5 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 22 ++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/atom.c             |  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 +++++-
 drivers/gpu/drm/amd/display/dc/core/dc.c      |  4 +-
 drivers/gpu/drm/drm_atomic_helper.c           | 16 ++++++
 drivers/gpu/drm/drm_vblank.c                  |  8 ++-
 drivers/gpu/drm/i915/i915_sw_fence_work.c     |  3 +
 drivers/gpu/drm/scheduler/sched_main.c        | 11 ++++
 drivers/gpu/drm/vkms/vkms_crtc.c              |  8 ++-
 include/linux/dma-fence.h                     | 13 +++++
 16 files changed, 160 insertions(+), 13 deletions(-)

-- 
2.26.2

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

             reply index

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-12  8:59 Daniel Vetter [this message]
2020-05-12  8:59 ` [RFC 01/17] dma-fence: add might_sleep annotation to _wait() Daniel Vetter
2020-05-12  9:03   ` Chris Wilson
2020-05-12  9:08   ` Christian König
2020-06-02  9:45     ` Maarten Lankhorst
2020-05-12  8:59 ` [RFC 02/17] dma-fence: basic lockdep annotations Daniel Vetter
2020-05-12  9:04   ` Chris Wilson
2020-05-12  9:08     ` Daniel Vetter
2020-05-12  9:19       ` Chris Wilson
2020-05-13  8:30         ` Daniel Vetter
2020-05-25 15:41     ` Daniel Vetter
2020-05-12 12:09   ` Jason Gunthorpe
2020-05-12 12:57     ` Daniel Vetter
2020-05-26 10:00   ` Maarten Lankhorst
2020-05-28 13:36   ` Thomas Hellström (Intel)
2020-05-28 14:22     ` Daniel Vetter
2020-05-28 21:54   ` Luben Tuikov
2020-05-29  5:49     ` Daniel Vetter
2020-05-12  8:59 ` [RFC 03/17] dma-fence: prime " Daniel Vetter
2020-05-12  8:59 ` [RFC 04/17] drm/vkms: Annotate vblank timer Daniel Vetter
2020-05-12  8:59 ` [RFC 05/17] drm/vblank: Annotate with dma-fence signalling section Daniel Vetter
2020-05-12  8:59 ` [RFC 06/17] drm/atomic-helper: Add dma-fence annotations Daniel Vetter
2020-05-12  8:59 ` [RFC 07/17] drm/amdgpu: add dma-fence annotations to atomic commit path Daniel Vetter
2020-05-12  8:59 ` [RFC 08/17] drm/scheduler: use dma-fence annotations in main thread Daniel Vetter
2020-05-25 15:30   ` Daniel Vetter
2020-05-12  8:59 ` [RFC 09/17] drm/amdgpu: use dma-fence annotations in cs_submit() Daniel Vetter
2020-05-13  7:02   ` Christian König
2020-05-13  7:07     ` Daniel Vetter
2020-05-12  8:59 ` [RFC 10/17] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code Daniel Vetter
2020-05-12 15:56   ` Christian König
2020-05-12 16:20     ` Daniel Vetter
2020-05-12 16:27       ` Daniel Vetter
2020-05-12 17:31         ` Christian König
2020-05-12 18:34           ` Daniel Vetter
2020-05-12  8:59 ` [RFC 11/17] drm/amdgpu: DC also loves to allocate stuff where it shouldn't Daniel Vetter
2020-05-12  8:59 ` [RFC 12/17] drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail Daniel Vetter
2020-05-12  8:59 ` [RFC 13/17] drm/scheduler: use dma-fence annotations in tdr work Daniel Vetter
2020-05-12  8:59 ` [RFC 14/17] drm/amdgpu: use dma-fence annotations for gpu reset code Daniel Vetter
2020-05-12  8:59 ` [RFC 15/17] Revert "drm/amdgpu: add fbdev suspend/resume on gpu reset" Daniel Vetter
2020-05-12  8:59 ` [RFC 16/17] drm/amdgpu: gpu recovery does full modesets Daniel Vetter
2020-05-12 12:54   ` Alex Deucher
2020-05-12 12:58     ` Daniel Vetter
2020-05-12 13:12       ` Alex Deucher
2020-05-12 13:17         ` Daniel Vetter
2020-05-12 13:29           ` Alex Deucher
2020-05-12 13:45             ` Daniel Vetter
2020-05-12 14:24               ` Alex Deucher
2020-05-12 16:12                 ` Daniel Vetter
2020-05-12 20:10                   ` Kazlauskas, Nicholas
2020-05-13  6:02                     ` Daniel Vetter
2020-05-12  8:59 ` [RFC 17/17] drm/i915: Annotate dma_fence_work Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200512085944.222637-1-daniel.vetter@ffwll.ch \
    --to=daniel.vetter@ffwll.ch \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=chris@chris-wilson.co.uk \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=maarten.lankhorst@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

AMD-GFX Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/amd-gfx/0 amd-gfx/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 amd-gfx amd-gfx/ https://lore.kernel.org/amd-gfx \
		amd-gfx@lists.freedesktop.org
	public-inbox-index amd-gfx

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.freedesktop.lists.amd-gfx


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git