[PATCH 00/11] drm/scheduler dependency tracking

From: Daniel Vetter @ 2021-06-24 14:00 UTC
  To: DRI Development; +Cc: Daniel Vetter

Hi all,

While trying to carefully audit how all the various drivers handle the
implicit dependencies in the dma-resv object I got a bit too annoyed about
all the hand-rolling. Here are some patches to unify this at least for
drivers using the drm/scheduler.

4 out of 5 drivers are converted over (but only compile-tested). I think
amdgpu would also work:

- handle the job->sync dependencies using drm_sched_job_await*

- build up the job->sched_sync fences needed for deciding whether we need
  a full flush or not before we push the job into the scheduler, instead
  of in the ->dependency callback. This also has the benefit of removing
  a bunch of allocations from scheduler callbacks, where they're not ok
  (due to recursion into mmu notifier/shrinker on direct reclaim)

- keep the vmid_grab stuff in the ->dependency callback; I've kept that
  callback as a fallback for special cases like this.
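
To illustrate the split between pre-collected dependencies and the
->dependency fallback, here is a minimal userspace sketch. It is not code
from this series: struct model_fence, struct model_job and every model_*
function are hypothetical stand-ins, and the real scheduler works on struct
dma_fence. The point it tries to show is only the ordering: dependencies
gathered before the push are drained first, the driver hook is asked
afterwards, and nothing is allocated on that path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence; not the kernel's struct dma_fence. */
struct model_fence {
	int id;
};

struct model_job;

/* Optional driver hook, standing in for the ->dependency callback. */
typedef struct model_fence *(*model_dependency_cb)(struct model_job *job);

struct model_job {
	struct model_fence **deps;      /* collected before the job is pushed */
	size_t num_deps;
	size_t next_dep;                /* next pre-collected entry to hand out */
	model_dependency_cb dependency; /* may be NULL */
	struct model_fence *late;       /* fence only known at scheduling time */
};

/*
 * Return the next fence the job has to wait for, or NULL when none is left.
 * Pre-collected dependencies come first, the driver hook is the fallback.
 * No memory is allocated here.
 */
struct model_fence *model_job_next_dependency(struct model_job *job)
{
	assert(job != NULL);

	if (job->next_dep < job->num_deps)
		return job->deps[job->next_dep++];

	if (job->dependency)
		return job->dependency(job);

	return NULL;
}

/* Example hook (hypothetical): hands out the late-bound fence exactly once. */
struct model_fence *model_example_hook(struct model_job *job)
{
	struct model_fence *fence = job->late;

	job->late = NULL;
	return fence;
}
```

A caller would fill deps[] at submission time, set dependency to
model_example_hook (or NULL), and then call model_job_next_dependency()
until it returns NULL.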

There are a few complications though:

- amdgpu_sync is both used for amdgpu_job and for other things, mostly
  amdkfd, but also some bo wait functions

- amdgpu_job is both used for pushing jobs into the scheduler, but also
  for directly pushing a job into the hw through an ib

All not insurmountable, but a bit too much when the main goal here was
just to establish the drm_sched_job_await api.

Wrt the data structure I picked: Since 3 out of 5 drivers used the xarray,
and that should at least be fairly storage efficient and easy to grow, I
went with that. We can bikeshed/tune the backing implementation later on.
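
For readers who want a feel for what such a per-job dependency store has to
do, here is a rough userspace sketch. It is not the implementation from
this series: a plain growable array stands in for the xarray, and
struct model_fence, struct model_deps and the model_deps_* functions are
hypothetical names. The deduplication rule (one slot per fence context,
newer seqno replaces older) is an assumption made for the example, and
seqno wraparound is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fence: a timeline (context) and a position on it. */
struct model_fence {
	uint64_t context;
	uint64_t seqno;
};

/* Per-job dependency store; a growable array replaces the xarray here. */
struct model_deps {
	struct model_fence *slots;
	size_t count;
	size_t capacity;
};

/*
 * Record that the job has to wait for @fence.
 * Returns 0 on success, -1 if growing the array failed.
 * Assumption: a later fence on a timeline implies all earlier ones, so one
 * slot per context is enough.
 */
int model_deps_add(struct model_deps *deps, struct model_fence fence)
{
	size_t i;

	assert(deps != NULL);

	/* Reuse the slot of the same timeline if there is one. */
	for (i = 0; i < deps->count; i++) {
		if (deps->slots[i].context != fence.context)
			continue;
		if (fence.seqno > deps->slots[i].seqno)
			deps->slots[i] = fence;
		return 0;
	}

	/* New timeline: grow the array if it is full, then append. */
	if (deps->count == deps->capacity) {
		size_t new_cap = deps->capacity ? deps->capacity * 2 : 4;
		struct model_fence *grown;

		grown = realloc(deps->slots, new_cap * sizeof(*grown));
		if (!grown)
			return -1;
		deps->slots = grown;
		deps->capacity = new_cap;
	}
	deps->slots[deps->count++] = fence;
	return 0;
}

/* Free the backing storage and reset the store to empty. */
void model_deps_release(struct model_deps *deps)
{
	free(deps->slots);
	deps->slots = NULL;
	deps->count = 0;
	deps->capacity = 0;
}
```

All allocation happens in model_deps_add(), i.e. at submission time, which
is the property that matters for the scheduler callbacks.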

Similarly the await_implicit implementation is as inefficient as the one
the drivers currently use, relying on dma_resv_get_fences(). This means we
copy all the fences to some temporary array first, which is entirely
unnecessary because we're holding the dma_resv lock.

All that can be tuned later on easily.
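
To make the copy-versus-in-place point a bit more concrete, here is a small
userspace sketch of the two approaches. It does not use the dma_resv API:
struct model_resv, its locked flag and the model_await_* functions are
hypothetical, and the visitor callback merely stands in for "add this fence
to the job's dependencies". Both variants visit the same fences; the second
one skips the temporary array on the assumption that the caller holds the
lock for the whole walk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a fence. */
struct model_fence {
	int id;
};

/* Hypothetical reservation object: the fences attached to one buffer. */
struct model_resv {
	int locked;                  /* nonzero while the caller holds the lock */
	struct model_fence **fences;
	size_t num_fences;
};

/* Called once per fence; a nonzero return value stops the walk. */
typedef int (*model_visit_fn)(struct model_fence *fence, void *data);

/* Variant 1: snapshot all fences into a temporary array, then visit the copy. */
int model_await_copy(struct model_resv *resv, model_visit_fn visit, void *data)
{
	struct model_fence **snap;
	size_t i;
	int ret = 0;

	if (resv->num_fences == 0)
		return 0;

	snap = malloc(resv->num_fences * sizeof(*snap));
	if (!snap)
		return -1;
	memcpy(snap, resv->fences, resv->num_fences * sizeof(*snap));

	for (i = 0; i < resv->num_fences && ret == 0; i++)
		ret = visit(snap[i], data);

	free(snap);
	return ret;
}

/* Variant 2: the lock is held, so the list cannot change; visit it in place. */
int model_await_in_place(struct model_resv *resv, model_visit_fn visit, void *data)
{
	size_t i;
	int ret = 0;

	assert(resv->locked);

	for (i = 0; i < resv->num_fences && ret == 0; i++)
		ret = visit(resv->fences[i], data);

	return ret;
}

/* Example visitor (hypothetical): adds each fence id to an int accumulator. */
int model_sum_ids(struct model_fence *fence, void *data)
{
	*(int *)data += fence->id;
	return 0;
}
```

The in-place variant has no allocation and therefore no allocation failure
path, at the price of requiring the lock to be held for the whole walk.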

Review, comments and especially testing very much welcome.

Cheers, Daniel

Daniel Vetter (11):
  drm/sched: Split drm_sched_job_init
  drm/sched: Add dependency tracking
  drm/sched: drop entity parameter from drm_sched_push_job
  drm/panfrost: use scheduler dependency tracking
  drm/lima: use scheduler dependency tracking
  drm/v3d: Move drm_sched_job_init to v3d_job_init
  drm/v3d: Use scheduler dependency handling
  drm/etnaviv: Use scheduler dependency handling
  drm/gem: Delete gem array fencing helpers
  drm/scheduler: Don't store self-dependencies
  drm/sched: Check locking in drm_sched_job_await_implicit

 .gitignore                                   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c      |   4 +-
 drivers/gpu/drm/drm_gem.c                    |  96 -------------
 drivers/gpu/drm/etnaviv/etnaviv_gem.h        |   5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  32 ++---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c      |  63 +--------
 drivers/gpu/drm/etnaviv/etnaviv_sched.h      |   3 +-
 drivers/gpu/drm/lima/lima_gem.c              |   7 +-
 drivers/gpu/drm/lima/lima_sched.c            |  28 +---
 drivers/gpu/drm/lima/lima_sched.h            |   6 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c      |  14 +-
 drivers/gpu/drm/panfrost/panfrost_job.c      |  39 +-----
 drivers/gpu/drm/panfrost/panfrost_job.h      |   5 +-
 drivers/gpu/drm/scheduler/sched_entity.c     |  30 +++--
 drivers/gpu/drm/scheduler/sched_fence.c      |  15 ++-
 drivers/gpu/drm/scheduler/sched_main.c       | 135 ++++++++++++++++++-
 drivers/gpu/drm/v3d/v3d_drv.h                |   5 -
 drivers/gpu/drm/v3d/v3d_gem.c                |  91 ++++---------
 drivers/gpu/drm/v3d/v3d_sched.c              |  29 +---
 include/drm/drm_gem.h                        |   5 -
 include/drm/gpu_scheduler.h                  |  40 +++++-
 22 files changed, 282 insertions(+), 375 deletions(-)

-- 
2.32.0.rc2
