* [PATCH 00/25] Parallel submission aka multi-bb execbuf
From: Matthew Brost @ 2021-10-14 17:19 UTC
  To: intel-gfx, dri-devel; +Cc: john.c.harrison

As discussed in [1] we are introducing a new parallel submission uAPI
for the i915 which allows more than 1 BB to be submitted in an execbuf
IOCTL. This is the implementation for both GuC and execlists.
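
As a rough illustration of what submitting more than 1 BB in one execbuf
means for the object list, the sketch below models which entries are
treated as batch buffers. It assumes the convention documented for this
uAPI in i915_drm.h: a parallel context slot of width N consumes N BBs,
taken from the last N objects in the list, or from the first N if
I915_EXEC_BATCH_FIRST is set. The helper name and signature are
hypothetical; this is not the code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only.
 *
 * buffer_count: number of objects in the execbuf object list
 * width:        number of BBs consumed by the parallel context slot
 * batch_first:  models the I915_EXEC_BATCH_FIRST flag
 * idx:          index into the object list
 *
 * Returns true if the object at idx is one of the BBs.
 */
static bool is_batch_index(size_t buffer_count, size_t width,
			   bool batch_first, size_t idx)
{
	/* A slot needs at least one BB and cannot use more objects than exist. */
	assert(width >= 1 && width <= buffer_count);
	assert(idx < buffer_count);

	if (batch_first)
		return idx < width;

	return idx >= buffer_count - width;
}
```

With 5 objects and a width of 2, indices 3 and 4 are the BBs by default,
and indices 0 and 1 are the BBs when batch-first is requested.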

In addition to the selftests in this series, an IGT is available; it is
implemented in the first 4 patches of [2].

The execbuf IOCTL changes have been done in a single large patch (#20)
as all the changes flow together and I believe a single patch will be
better if someone has to look up this change in the future. I can split
it into a series of smaller patches if desired.

This code is available in a public repo [3] for UMD teams to test their
code on.

v2: Drop complicated state machine to block in kernel if no guc_ids
available, perma-pin parallel contexts, rework execbuf IOCTL to be a
series of loops inside the IOCTL rather than 1 large one on the outside,
address Daniel Vetter's comments
v3: Address John Harrison's comments, add a couple of patches which fix
bugs found internally
v4: Address John Harrison's latest round of comments
v5: Address John Harrison's latest round of comments, resend for CI
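
The v2 rework of the execbuf IOCTL into a series of loops can be pictured
with the sketch below. It is a toy model under stated assumptions: the
stage names (prepare, submit), the struct, and the limit of 4 BBs are
hypothetical and are not taken from the series. It shows only the shape:
each stage loops over every BB before the next stage starts, instead of
one outer loop running all stages per BB.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BBS 4 /* hypothetical limit for this sketch */

/* Hypothetical per-IOCTL state: one flag per BB and stage. */
struct toy_execbuf {
	size_t num_bbs;
	int prepared[MAX_BBS];
	int submitted[MAX_BBS];
};

/* Returns 0 on success, -1 if BB i cannot be prepared. */
static int toy_prepare(struct toy_execbuf *eb, size_t i)
{
	if (i >= MAX_BBS)
		return -1;
	eb->prepared[i] = 1;
	return 0;
}

static void toy_submit(struct toy_execbuf *eb, size_t i)
{
	/* Only called after every BB passed toy_prepare(), so i is in range. */
	assert(i < MAX_BBS);
	eb->submitted[i] = 1;
}

static int toy_do_execbuf(struct toy_execbuf *eb)
{
	size_t i;

	/* Stage 1: prepare every BB; bail out before anything is submitted. */
	for (i = 0; i < eb->num_bbs; i++) {
		int err = toy_prepare(eb, i);

		if (err)
			return err;
	}

	/* Stage 2: only reached once all BBs are prepared. */
	for (i = 0; i < eb->num_bbs; i++)
		toy_submit(eb, i);

	return 0;
}
```

In this model a failure while preparing any BB returns before a single BB
has been submitted, which is harder to guarantee when one outer loop runs
every stage for each BB in turn.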

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

[1] https://patchwork.freedesktop.org/series/92028/
[2] https://patchwork.freedesktop.org/series/93071/
[3] https://gitlab.freedesktop.org/mbrost/mbrost-drm-intel/-/tree/drm-intel-parallel

Matthew Brost (25):
  drm/i915/guc: Move GuC guc_id allocation under submission state
    sub-struct
  drm/i915/guc: Take GT PM ref when deregistering context
  drm/i915/guc: Take engine PM when a context is pinned with GuC
    submission
  drm/i915/guc: Don't call switch_to_kernel_context with GuC submission
  drm/i915: Add logical engine mapping
  drm/i915: Expose logical engine instance to user
  drm/i915/guc: Introduce context parent-child relationship
  drm/i915/guc: Add multi-lrc context registration
  drm/i915/guc: Ensure GuC schedule operations do not operate on child
    contexts
  drm/i915/guc: Assign contexts in parent-child relationship consecutive
    guc_ids
  drm/i915/guc: Implement parallel context pin / unpin functions
  drm/i915/guc: Implement multi-lrc submission
  drm/i915/guc: Insert submit fences between requests in parent-child
    relationship
  drm/i915/guc: Implement multi-lrc reset
  drm/i915/guc: Update debugfs for GuC multi-lrc
  drm/i915/guc: Connect UAPI to GuC multi-lrc interface
  drm/i915/doc: Update parallel submit doc to point to i915_drm.h
  drm/i915/guc: Add basic GuC multi-lrc selftest
  drm/i915/guc: Implement no mid batch preemption for multi-lrc
  drm/i915: Multi-BB execbuf
  drm/i915/guc: Handle errors in multi-lrc requests
  drm/i915: Make request conflict tracking understand parallel submits
  drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
  drm/i915: Enable multi-bb execbuf
  drm/i915/execlists: Weak parallel submission support for execlists

 Documentation/gpu/rfc/i915_parallel_execbuf.h |  122 --
 Documentation/gpu/rfc/i915_scheduler.rst      |    4 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |   57 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  229 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   16 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  786 ++++++---
 drivers/gpu/drm/i915/gt/intel_context.c       |   50 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   56 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   73 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   12 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   66 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   13 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.h     |   37 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |    7 +
 .../drm/i915/gt/intel_execlists_submission.c  |   63 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h         |   14 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |    7 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |   12 +-
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |    1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   29 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   54 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |    2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |   24 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   34 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1444 ++++++++++++++---
 .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  179 ++
 drivers/gpu/drm/i915/i915_query.c             |    2 +
 drivers/gpu/drm/i915/i915_request.c           |  143 +-
 drivers/gpu/drm/i915/i915_request.h           |   23 +
 drivers/gpu/drm/i915/i915_vma.c               |   21 +-
 drivers/gpu/drm/i915/i915_vma.h               |   13 +-
 drivers/gpu/drm/i915/intel_wakeref.h          |   12 +
 .../drm/i915/selftests/i915_live_selftests.h  |    1 +
 include/uapi/drm/i915_drm.h                   |  139 +-
 34 files changed, 3053 insertions(+), 692 deletions(-)
 delete mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c

-- 
2.32.0


* [PATCH 00/25] Parallel submission aka multi-bb execbuf
From: Matthew Brost @ 2021-10-13 20:42 UTC
  To: intel-gfx, dri-devel; +Cc: john.c.harrison

As discussed in [1] we are introducing a new parallel submission uAPI
for the i915 which allows more than 1 BB to be submitted in an execbuf
IOCTL. This is the implementation for both GuC and execlists.

In addition to the selftests in this series, an IGT is available; it is
implemented in the first 4 patches of [2].

The execbuf IOCTL changes have been done in a single large patch (#20)
as all the changes flow together and I believe a single patch will be
better if someone has to look up this change in the future. I can split
it into a series of smaller patches if desired.

This code is available in a public repo [3] for UMD teams to test their
code on.

v2: Drop complicated state machine to block in kernel if no guc_ids
available, perma-pin parallel contexts, rework execbuf IOCTL to be a
series of loops inside the IOCTL rather than 1 large one on the outside,
address Daniel Vetter's comments
v3: Address John Harrison's comments, add a couple of patches which fix
bugs found internally
v4: Address John Harrison's latest round of comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

[1] https://patchwork.freedesktop.org/series/92028/
[2] https://patchwork.freedesktop.org/series/93071/
[3] https://gitlab.freedesktop.org/mbrost/mbrost-drm-intel/-/tree/drm-intel-parallel

Matthew Brost (25):
  drm/i915/guc: Move GuC guc_id allocation under submission state
    sub-struct
  drm/i915/guc: Take GT PM ref when deregistering context
  drm/i915/guc: Take engine PM when a context is pinned with GuC
    submission
  drm/i915/guc: Don't call switch_to_kernel_context with GuC submission
  drm/i915: Add logical engine mapping
  drm/i915: Expose logical engine instance to user
  drm/i915/guc: Introduce context parent-child relationship
  drm/i915/guc: Add multi-lrc context registration
  drm/i915/guc: Ensure GuC schedule operations do not operate on child
    contexts
  drm/i915/guc: Assign contexts in parent-child relationship consecutive
    guc_ids
  drm/i915/guc: Implement parallel context pin / unpin functions
  drm/i915/guc: Implement multi-lrc submission
  drm/i915/guc: Insert submit fences between requests in parent-child
    relationship
  drm/i915/guc: Implement multi-lrc reset
  drm/i915/guc: Update debugfs for GuC multi-lrc
  drm/i915/guc: Connect UAPI to GuC multi-lrc interface
  drm/i915/doc: Update parallel submit doc to point to i915_drm.h
  drm/i915/guc: Add basic GuC multi-lrc selftest
  drm/i915/guc: Implement no mid batch preemption for multi-lrc
  drm/i915: Multi-BB execbuf
  drm/i915/guc: Handle errors in multi-lrc requests
  drm/i915: Make request conflict tracking understand parallel submits
  drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
  drm/i915: Enable multi-bb execbuf
  drm/i915/execlists: Weak parallel submission support for execlists

 Documentation/gpu/rfc/i915_parallel_execbuf.h |  122 --
 Documentation/gpu/rfc/i915_scheduler.rst      |    4 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |   57 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  227 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   16 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  786 ++++++---
 drivers/gpu/drm/i915/gt/intel_context.c       |   50 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   54 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   73 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   12 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   66 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |   13 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.h     |   37 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |    7 +
 .../drm/i915/gt/intel_execlists_submission.c  |   63 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h         |   14 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |    7 +
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |   12 +-
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |    1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   29 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   54 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |    2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |   24 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   34 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1452 ++++++++++++++---
 .../drm/i915/gt/uc/selftest_guc_multi_lrc.c   |  179 ++
 drivers/gpu/drm/i915/i915_query.c             |    2 +
 drivers/gpu/drm/i915/i915_request.c           |  143 +-
 drivers/gpu/drm/i915/i915_request.h           |   23 +
 drivers/gpu/drm/i915/i915_vma.c               |   21 +-
 drivers/gpu/drm/i915/i915_vma.h               |   13 +-
 drivers/gpu/drm/i915/intel_wakeref.h          |   12 +
 .../drm/i915/selftests/i915_live_selftests.h  |    1 +
 include/uapi/drm/i915_drm.h                   |  139 +-
 34 files changed, 3056 insertions(+), 693 deletions(-)
 delete mode 100644 Documentation/gpu/rfc/i915_parallel_execbuf.h
 create mode 100644 drivers/gpu/drm/i915/gt/uc/selftest_guc_multi_lrc.c

-- 
2.32.0


Thread overview: 69+ messages
2021-10-14 17:19 [PATCH 00/25] Parallel submission aka multi-bb execbuf Matthew Brost
2021-10-14 17:19 ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 01/25] drm/i915/guc: Move GuC guc_id allocation under submission state sub-struct Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 02/25] drm/i915/guc: Take GT PM ref when deregistering context Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 03/25] drm/i915/guc: Take engine PM when a context is pinned with GuC submission Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 04/25] drm/i915/guc: Don't call switch_to_kernel_context " Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 05/25] drm/i915: Add logical engine mapping Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 06/25] drm/i915: Expose logical engine instance to user Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 07/25] drm/i915/guc: Introduce context parent-child relationship Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 08/25] drm/i915/guc: Add multi-lrc context registration Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 18:18   ` John Harrison
2021-10-14 18:18     ` [Intel-gfx] " John Harrison
2021-10-14 17:19 ` [PATCH 09/25] drm/i915/guc: Ensure GuC schedule operations do not operate on child contexts Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 10/25] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 11/25] drm/i915/guc: Implement parallel context pin / unpin functions Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 12/25] drm/i915/guc: Implement multi-lrc submission Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 13/25] drm/i915/guc: Insert submit fences between requests in parent-child relationship Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 14/25] drm/i915/guc: Implement multi-lrc reset Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 15/25] drm/i915/guc: Update debugfs for GuC multi-lrc Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 16/25] drm/i915/guc: Connect UAPI to GuC multi-lrc interface Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 18:24   ` John Harrison
2021-10-14 18:24     ` [Intel-gfx] " John Harrison
2021-10-14 17:19 ` [PATCH 17/25] drm/i915/doc: Update parallel submit doc to point to i915_drm.h Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 18/25] drm/i915/guc: Add basic GuC multi-lrc selftest Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:19 ` [PATCH 19/25] drm/i915/guc: Implement no mid batch preemption for multi-lrc Matthew Brost
2021-10-14 17:19   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:20 ` [PATCH 20/25] drm/i915: Multi-BB execbuf Matthew Brost
2021-10-14 17:20   ` [Intel-gfx] " Matthew Brost
2021-10-14 18:27   ` John Harrison
2021-10-14 18:27     ` [Intel-gfx] " John Harrison
2021-10-14 17:20 ` [PATCH 21/25] drm/i915/guc: Handle errors in multi-lrc requests Matthew Brost
2021-10-14 17:20   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:20 ` [PATCH 22/25] drm/i915: Make request conflict tracking understand parallel submits Matthew Brost
2021-10-14 17:20   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:20 ` [PATCH 23/25] drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences Matthew Brost
2021-10-14 17:20   ` [Intel-gfx] " Matthew Brost
2021-10-14 17:20 ` [PATCH 24/25] drm/i915: Enable multi-bb execbuf Matthew Brost
2021-10-14 17:20   ` [Intel-gfx] " Matthew Brost
2021-10-14 18:29   ` John Harrison
2021-10-14 18:29     ` [Intel-gfx] " John Harrison
2021-10-14 17:20 ` [PATCH 25/25] drm/i915/execlists: Weak parallel submission support for execlists Matthew Brost
2021-10-14 17:20   ` [Intel-gfx] " Matthew Brost
2021-10-14 18:42   ` John Harrison
2021-10-14 18:42     ` [Intel-gfx] " John Harrison
2021-10-14 18:55     ` Matthew Brost
2021-10-14 18:55       ` [Intel-gfx] " Matthew Brost
2021-10-14 23:50 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Parallel submission aka multi-bb execbuf (rev7) Patchwork
2021-10-14 23:51 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-10-15  0:25 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-10-15  6:12 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2021-10-13 20:42 [PATCH 00/25] Parallel submission aka multi-bb execbuf Matthew Brost
2021-10-13 20:42 ` [PATCH 15/25] drm/i915/guc: Update debugfs for GuC multi-lrc Matthew Brost
