From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 02/20] drm/i915/gt: Couple up old virtual breadcrumb on new sibling
Date: Tue, 12 May 2020 11:12:23 +0100
Message-ID: <eebc8a12-1204-e619-f7bd-df607e839ad7@linux.intel.com>
In-Reply-To: <158927336578.15653.17606758936318781729@build.alporthouse.com>


On 12/05/2020 09:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-05-12 09:41:01)
>> On 11/05/2020 08:57, Chris Wilson wrote:
>>> The second try at staging the transfer of the breadcrumb. In part one,
>>> we realised we could not simply move to the second engine as we were
>>> only holding the breadcrumb lock on the first. So in commit 6c81e21a4742
>>> ("drm/i915/gt: Stage the transfer of the virtual breadcrumb"), we
>>> removed it from the first engine and marked up this request to reattach
>>> the signaling on the new engine. However, this failed to take into
>>> account that we only attach the breadcrumb if the new request is added
>>> at the start of the queue, which if we are transferring, it is because
>>> we know there to be a request to be signaled (and hence we would not be
>>> attached). In this second try, we remove from the first list under its
>>> lock, take ownership of the link, and then take the second lock to
>>> complete the transfer.
>>
>> Overall just an optimisation not to call i915_request_enable_breadcrumb,
>> I mean not add to the list indirectly?
> 
> The request that we need to add already has its breadcrumb enabled. The
> request is on the veng->context.signals list, it's just that the veng is
> on siblings[0] signalers list and we are no longer guaranteed to
> generate an interrupt on engine.
> 
> There's an explosion in the current code due to the lists not moving
> as expected on enabling the breadcrumb on the next request (because of
>                  if (pos == &ce->signals) /* catch transitions from empty list */
>                          list_move_tail(&ce->signal_link, &b->signalers);
> 
> )
> 
> The explosion is on a dead list, but has on a couple of occasions looked
> like
> 
> <4> [373.551331] RIP: 0010:i915_request_enable_breadcrumb+0x144/0x380 [i915]
> <4> [373.551341] Code: c7 c2 20 f1 42 c0 48 c7 c7 77 85 28 c0 e8 44 bc f2 ec bf 01 00 00 00 e8 5a 8e f2 ec 31 f6 bf 09 00 00 00 e8 6e 09 e3 ec 0f 0b <3b> 45 80 0f 89 5d ff ff ff 48 8b 6d 08 4c 39 e5 75 ee 49 8b 4d 38
> <4> [373.551356] RSP: 0018:ffffb64d0114b9f8 EFLAGS: 00010083
> <4> [373.551363] RAX: 00000000000036b2 RBX: ffffa310385096c0 RCX: 0000000000000003
> <4> [373.551372] RDX: 00000000000036b2 RSI: 000000002ac5cf63 RDI: 00000000ffffffff
> <4> [373.551379] RBP: dead000000000122 R08: ffffa31047075a50 R09: 00000000fffffffe
> <4> [373.551385] R10: 0000000053a90a70 R11: 000000005e84b7e5 R12: ffffa3103fde38c0
> <4> [373.551392] R13: ffffa3103fde3888 R14: ffffa30ff0982328 R15: ffffa30ff0982000
> <4> [373.551401] FS:  00007f19f3359e40(0000) GS:ffffa3104ed00000(0000) knlGS:0000000000000000
> <4> [373.551410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4> [373.551414] CR2: 00007f19f2aac778 CR3: 0000000232b0c004 CR4: 00000000003606e0
> <4> [373.551421] Call Trace:
> <4> [373.551466]  ? dma_i915_sw_fence_wake+0x40/0x40 [i915]
> <4> [373.551506]  ? dma_i915_sw_fence_wake+0x40/0x40 [i915]
> <4> [373.551515]  __dma_fence_enable_signaling+0x60/0x160
> <4> [373.551558]  ? dma_i915_sw_fence_wake+0x40/0x40 [i915]
> <4> [373.551564]  dma_fence_add_callback+0x44/0xd0
> <4> [373.551605]  __i915_sw_fence_await_dma_fence+0x6f/0xc0 [i915]
> <4> [373.551665]  __i915_request_commit+0x442/0x5b0 [i915]
> <4> [373.551721]  i915_gem_do_execbuffer+0x17fb/0x2eb0 [i915]
> 
> kasan/kcsan do not complain; it's just a broken list.

Which list gets broken? But it does sound plausible that intel_lrc.c 
messing around with the breadcrumb lists directly could cause a problem, 
due to the special handling of empty <-> non-empty ce signalers transitions.
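
To make that concern concrete, here is a minimal userspace sketch of the
hazard as I understand it from the quoted snippet. Everything named toy_* is
hypothetical and is not the i915 code: a sorted array stands in for
ce->signals, a bool stands in for ce->signal_link being on b->signalers, and
seqno wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SIGNALS 8

/* Hypothetical stand-in for a context's breadcrumb state. */
struct toy_ce {
	uint32_t signals[TOY_MAX_SIGNALS]; /* pending seqnos, ascending */
	size_t count;
	bool on_signalers; /* models ce->signal_link being on b->signalers */
};

/*
 * Insert seqno in ascending order. The context is linked to the engine
 * only when the new entry lands at the head of the queue, mirroring the
 * "pos == &ce->signals" check in the quoted snippet.
 * Returns false if the toy queue is full.
 */
static bool toy_enable_breadcrumb(struct toy_ce *ce, uint32_t seqno)
{
	size_t pos, i;

	assert(ce != NULL);
	if (ce->count == TOY_MAX_SIGNALS)
		return false;

	/* Walk back from the tail to find the insertion point. */
	pos = ce->count;
	while (pos > 0 && ce->signals[pos - 1] > seqno)
		pos--;

	/* Shift later entries up by one and store the new seqno. */
	for (i = ce->count; i > pos; i--)
		ce->signals[i] = ce->signals[i - 1];
	ce->signals[pos] = seqno;
	ce->count++;

	if (pos == 0)
		ce->on_signalers = true;
	return true;
}

/* Models unlinking the context from the old engine without relinking it. */
static void toy_detach(struct toy_ce *ce)
{
	assert(ce != NULL);
	ce->on_signalers = false;
}
```

In this model, enabling seqno 1, detaching, and then enabling seqno 2 leaves
on_signalers false even though two requests are pending, because the second
insert does not land at the head of the queue.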

Maybe virtual_xfer_breadcrumbs should be moved to intel_breadcrumbs.c. 
Well, not moved wholesale: the breadcrumb-internal logic would move, while 
the veng part stays in intel_lrc.c:

static void
virtual_xfer_breadcrumbs(struct virtual_engine *ve,
			 struct intel_engine_cs *target)
{
	intel_breadcrumbs_transfer(ve->context,
				   ve->siblings[0],
				   target);
}

?
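
For illustration only, a helper of that shape could follow the two-step
locking described in the commit message: unlink under the first lock, own the
link, then link under the second lock. The sketch below is a userspace model
under stated assumptions, not the i915 implementation. All toy_* names, the
spinlock, and the list helpers are hypothetical stand-ins, and it assumes the
caller is the only one moving this context's link between the two locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, loosely modelled on list_head. */
struct toy_list { struct toy_list *prev, *next; };

static void toy_list_init(struct toy_list *n) { n->prev = n->next = n; }
static bool toy_list_empty(const struct toy_list *n) { return n->next == n; }

static void toy_list_del_init(struct toy_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	toy_list_init(n);
}

static void toy_list_add_tail(struct toy_list *n, struct toy_list *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-ins for an engine's breadcrumbs and a context. */
struct toy_breadcrumbs {
	atomic_flag lock;          /* simple spinlock for the sketch */
	struct toy_list signalers; /* contexts with pending signals */
};

struct toy_context { struct toy_list signal_link; };

static void toy_breadcrumbs_init(struct toy_breadcrumbs *b)
{
	atomic_flag_clear(&b->lock);
	toy_list_init(&b->signalers);
}

static void toy_lock(struct toy_breadcrumbs *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_breadcrumbs *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Move ce from one engine's signalers list to another, one lock at a time. */
static void toy_breadcrumbs_transfer(struct toy_context *ce,
				     struct toy_breadcrumbs *from,
				     struct toy_breadcrumbs *to)
{
	bool linked;

	assert(ce && from && to);
	if (from == to)
		return;

	/* Step 1: unlink under the old engine's lock. */
	toy_lock(from);
	linked = !toy_list_empty(&ce->signal_link);
	if (linked)
		toy_list_del_init(&ce->signal_link);
	toy_unlock(from);

	if (!linked)
		return; /* nothing was pending, so nothing to transfer */

	/* Step 2: we own the link now; attach it under the new lock. */
	toy_lock(to);
	toy_list_add_tail(&ce->signal_link, &to->signalers);
	toy_unlock(to);
}
```

The two locks are never held together, so no lock ordering between engines
is needed. The cost is a window in which the context is on neither list,
which is why the sketch relies on the caller owning the link.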

Regards,

Tvrtko