All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Jason Ekstrand <jason@jlekstrand.net>
Cc: "Intel GFX" <intel-gfx@lists.freedesktop.org>,
	"Thomas Hellström" <thomas.hellstrom@intel.com>
Subject: Re: [Intel-gfx] [PATCH v8 01/69] drm/i915: Do not share hwsp across contexts any more, v7.
Date: Mon, 15 Mar 2021 13:08:01 +0100	[thread overview]
Message-ID: <8a997e84-6677-038c-02f4-9bcac059fcc8@linux.intel.com> (raw)
In-Reply-To: <CAOFGe96XvGm8bi3Hse44zuva=Vqks9Ht3mV9MQ1PurY9RVRfvw@mail.gmail.com>

Op 2021-03-11 om 22:22 schreef Jason Ekstrand:
> First off, I'm just here asking questions right now trying to start
> getting my head around some of this stuff.  Feel free to ignore me or
> tell me to go away if I'm being annoying. :-)
>
> On Thu, Mar 11, 2021 at 7:49 AM Maarten Lankhorst
> <maarten.lankhorst@linux.intel.com> wrote:
>> Instead of sharing pages with breadcrumbs, give each timeline a
>> single page. This allows unrelated timelines not to share locks
>> any more during command submission.
>>
>> As an additional benefit, seqno wraparound no longer requires
>> i915_vma_pin, which means we no longer need to worry about a
>> potential -EDEADLK at a point where we are ready to submit.
>>
>> Changes since v1:
>> - Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle).
>> - Extra check for completion in intel_read_hwsp().
>> Changes since v2:
>> - Fix inconsistent indent in hwsp_alloc() (kbuild)
>> - memset entire cacheline to 0.
>> Changes since v3:
>> - Do same in intel_timeline_reset_seqno(), and clflush for good measure.
>> Changes since v4:
>> - Use refcounting on timeline, instead of relying on i915_active.
>> - Fix waiting on kernel requests.
>> Changes since v5:
>> - Bump amount of slots to maximum (256), for best wraparounds.
>> - Add hwsp_offset to i915_request to fix potential wraparound hang.
>> - Ensure timeline wrap test works with the changes.
>> - Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to
>>   fix a hang.
>> Changes since v6:
>> - Rename i915_request_active_offset to i915_request_active_seqno(),
>>   and elaborate the function. (tvrtko)
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> #v1
>> Reported-by: kernel test robot <lkp@intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/gen2_engine_cs.c      |   2 +-
>>  drivers/gpu/drm/i915/gt/gen6_engine_cs.c      |   8 +-
>>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c      |  13 +-
>>  drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   1 +
>>  drivers/gpu/drm/i915/gt/intel_gt_types.h      |   4 -
>>  drivers/gpu/drm/i915/gt/intel_timeline.c      | 422 ++++--------------
>>  .../gpu/drm/i915/gt/intel_timeline_types.h    |  17 +-
>>  drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |   5 +-
>>  drivers/gpu/drm/i915/gt/selftest_timeline.c   |  83 ++--
>>  drivers/gpu/drm/i915/i915_request.c           |   4 -
>>  drivers/gpu/drm/i915/i915_request.h           |  31 +-
>>  11 files changed, 175 insertions(+), 415 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
>> index b491a64919c8..9646200d2792 100644
>> --- a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c
>> @@ -143,7 +143,7 @@ static u32 *__gen2_emit_breadcrumb(struct i915_request *rq, u32 *cs,
>>                                    int flush, int post)
>>  {
>>         GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>> -       GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>> +       GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR);
>>
>>         *cs++ = MI_FLUSH;
>>
>> diff --git a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
>> index ce38d1bcaba3..b388ceeeb1c9 100644
>> --- a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c
>> @@ -161,7 +161,7 @@ u32 *gen6_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
>>                  PIPE_CONTROL_DC_FLUSH_ENABLE |
>>                  PIPE_CONTROL_QW_WRITE |
>>                  PIPE_CONTROL_CS_STALL);
>> -       *cs++ = i915_request_active_timeline(rq)->hwsp_offset |
>> +       *cs++ = i915_request_active_seqno(rq) |
>>                 PIPE_CONTROL_GLOBAL_GTT;
>>         *cs++ = rq->fence.seqno;
>>
>> @@ -359,7 +359,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
>>                  PIPE_CONTROL_QW_WRITE |
>>                  PIPE_CONTROL_GLOBAL_GTT_IVB |
>>                  PIPE_CONTROL_CS_STALL);
>> -       *cs++ = i915_request_active_timeline(rq)->hwsp_offset;
>> +       *cs++ = i915_request_active_seqno(rq);
>>         *cs++ = rq->fence.seqno;
>>
>>         *cs++ = MI_USER_INTERRUPT;
>> @@ -374,7 +374,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
>>  u32 *gen6_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
>>  {
>>         GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>> -       GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>> +       GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR);
>>
>>         *cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
>>         *cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
>> @@ -394,7 +394,7 @@ u32 *gen7_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs)
>>         int i;
>>
>>         GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma);
>> -       GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
>> +       GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR);
>>
>>         *cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB |
>>                 MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
>> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> index cac80af7ad1c..6b9c34d3ac8d 100644
>> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>> @@ -338,15 +338,14 @@ static u32 preempt_address(struct intel_engine_cs *engine)
>>
>>  static u32 hwsp_offset(const struct i915_request *rq)
>>  {
>> -       const struct intel_timeline_cacheline *cl;
>> +       const struct intel_timeline *tl;
>>
>> -       /* Before the request is executed, the timeline/cachline is fixed */
>> +       /* Before the request is executed, the timeline is fixed */
>> +       tl = rcu_dereference_protected(rq->timeline,
>> +                                      !i915_request_signaled(rq));
> Why is Gen8+ different from Gen2/6 here?  In particular, why not use
> i915_request_active_timeline(rq) or, better yet,
> i915_request_active_seqno()?  The primary difference I see is that the
> guard on the RCU is different but it's not immediately obvious to me
> why this should be different between hardware generations.  Also,
> i915_request_active_seqno() returns a u32, but that could be fixed.

The legacy rings different locking, and it was splatting on PROVE_RCU.

I didn't want to weaken it, so I just put in a different condition for gen8 that's slightly weaker, but still better than the '1'.


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2021-03-15 12:08 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-11 13:41 [Intel-gfx] [PATCH v8 00/69] drm/i915: Remove obj->mm.lock! Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 01/69] drm/i915: Do not share hwsp across contexts any more, v7 Maarten Lankhorst
2021-03-11 21:22   ` Jason Ekstrand
2021-03-15 12:08     ` Maarten Lankhorst [this message]
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 02/69] drm/i915: Pin timeline map after first timeline pin, v3 Maarten Lankhorst
2021-03-11 21:44   ` Jason Ekstrand
2021-03-15 12:34     ` Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 03/69] drm/i915: Move cmd parser pinning to execbuffer Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 04/69] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2 Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 05/69] drm/i915: Ensure we hold the object mutex in pin correctly Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 06/69] drm/i915: Add gem object locking to madvise Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 07/69] drm/i915: Move HAS_STRUCT_PAGE to obj->flags Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 08/69] drm/i915: Rework struct phys attachment handling Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 09/69] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2 Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 10/69] drm/i915: make lockdep slightly happier about execbuf Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 11/69] drm/i915: Disable userptr pread/pwrite support Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 12/69] drm/i915: No longer allow exporting userptr through dma-buf Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 13/69] drm/i915: Reject more ioctls for userptr, v2 Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 14/69] drm/i915: Reject UNSYNCHRONIZED " Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 15/69] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 16/69] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7 Maarten Lankhorst
2021-03-11 17:24   ` Thomas Hellström (Intel)
2021-03-15 12:36     ` Maarten Lankhorst
2021-03-16  8:47       ` Thomas Hellström (Intel)
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 17/69] drm/i915: Flatten obj->mm.lock Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 18/69] drm/i915: Populate logical context during first pin Maarten Lankhorst
2021-03-11 13:41 ` [Intel-gfx] [PATCH v8 19/69] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 20/69] drm/i915: Handle ww locking in init_status_page Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 21/69] drm/i915: Rework clflush to work correctly without obj->mm.lock Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 22/69] drm/i915: Pass ww ctx to intel_pin_to_display_plane Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 23/69] drm/i915: Add object locking to vm_fault_cpu Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 24/69] drm/i915: Move pinning to inside engine_wa_list_verify() Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 25/69] drm/i915: Take reservation lock around i915_vma_pin Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 26/69] drm/i915: Make lrc_init_wa_ctx compatible with ww locking, v3 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 27/69] drm/i915: Make __engine_unpark() compatible with ww locking Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 28/69] drm/i915: Take obj lock around set_domain ioctl Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 29/69] drm/i915: Defer pin calls in buffer pool until first use by caller Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 30/69] drm/i915: Fix pread/pwrite to work with new locking rules Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 31/69] drm/i915: Fix workarounds selftest, part 1 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 32/69] drm/i915: Prepare for obj->mm.lock removal, v2 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 33/69] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 34/69] drm/i915: Add ww locking around vm_access() Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 35/69] drm/i915: Increase ww locking for perf Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 36/69] drm/i915: Lock ww in ucode objects correctly Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 37/69] drm/i915: Add ww locking to dma-buf ops Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 38/69] drm/i915: Add missing ww lock in intel_dsb_prepare Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 39/69] drm/i915: Fix ww locking in shmem_create_from_object Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 40/69] drm/i915: Use a single page table lock for each gtt Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 41/69] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 42/69] drm/i915/selftests: Prepare client blit " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 43/69] drm/i915/selftests: Prepare coherency tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 44/69] drm/i915/selftests: Prepare context " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 45/69] drm/i915/selftests: Prepare dma-buf " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 46/69] drm/i915/selftests: Prepare execbuf " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 47/69] drm/i915/selftests: Prepare mman testcases " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 48/69] drm/i915/selftests: Prepare object tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 49/69] drm/i915/selftests: Prepare object blit " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 50/69] drm/i915/selftests: Prepare igt_gem_utils " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 51/69] drm/i915/selftests: Prepare context selftest " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 52/69] drm/i915/selftests: Prepare hangcheck " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 53/69] drm/i915/selftests: Prepare execlists and lrc selftests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 54/69] drm/i915/selftests: Prepare mocs tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 55/69] drm/i915/selftests: Prepare ring submission " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 56/69] drm/i915/selftests: Prepare timeline tests " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 57/69] drm/i915/selftests: Prepare i915_request " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 58/69] drm/i915/selftests: Prepare memory region " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 59/69] drm/i915/selftests: Prepare cs engine " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 60/69] drm/i915/selftests: Prepare gtt " Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 61/69] drm/i915: Finally remove obj->mm.lock Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 62/69] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2 Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 63/69] drm/i915: Move gt_revoke() slightly Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 64/69] drm/i915: Add missing -EDEADLK path in execbuffer ggtt pinning Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 65/69] drm/i915: Fix pin_map in scheduler selftests Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 66/69] drm/i915: Add ww parameter to get_pages() callback Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 67/69] drm/i915: Add ww context to prepare_(read/write) Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 68/69] drm/i915: Pass ww ctx to pin_map Maarten Lankhorst
2021-03-11 13:42 ` [Intel-gfx] [PATCH v8 69/69] drm/i915: Pass ww ctx to i915_gem_object_pin_pages Maarten Lankhorst
2021-03-11 14:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Remove obj->mm.lock! (rev16) Patchwork
2021-03-11 14:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-03-11 14:32 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
2021-03-11 14:59 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-03-16  9:10 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Remove obj->mm.lock! (rev17) Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8a997e84-6677-038c-02f4-9bcac059fcc8@linux.intel.com \
    --to=maarten.lankhorst@linux.intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jason@jlekstrand.net \
    --cc=thomas.hellstrom@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.