From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH v6 63/64] drm/i915: Move gt_revoke() slightly
Date: Mon, 18 Jan 2021 15:46:40 +0100
Message-ID: <4836b692-d2f3-826d-cbc2-6c29c47df6f5@linux.intel.com>
In-Reply-To: <7b56a025-4852-a172-06df-7d64d1cf8e39@linux.intel.com>

On 18-01-2021 at 14:28, Thomas Hellström wrote:
>
> On 1/18/21 2:22 PM, Thomas Hellström wrote:
>>
>> On 1/18/21 1:01 PM, Maarten Lankhorst wrote:
>>> On 18-01-2021 at 12:11, Thomas Hellström wrote:
>>>> On 1/5/21 4:35 PM, Maarten Lankhorst wrote:
>>>>> We get a lockdep splat when the reset mutex is held, because it can be
>>>>> taken from fence_wait. This conflicts with the mmu notifier we have,
>>>>> because we recurse between reset mutex and mmap lock -> mmu notifier.
>>>>>
>>>>> Remove this recursion by calling revoke_mmaps before taking the lock.
>>>> Hmm. Is the mmap_sem taken from gt_revoke()?
>>>>
>>>> If so, isn't the real problem that the mmap_sem is taken in the dma_fence critical path (where the reset code sits)?
>>> Hey,
>>>
>>> The gpu reset code specifically needs to revoke all gtt mappings, and the fault handler uses intel_gt_reset_trylock(), so this change should be ok since all those mappings are invalidated correctly and completed before this point.
>>>
>>> The reset mutex isn't actually taken inside fence code, but used for lockdep validation, so this should be ok.
>>>
>>> ~Maarten
>>
>> Hmm, OK but then we still have the following established locking order.
>>
>> lock(fence_signaling)
>> lock(i_mmap_lock)
>>
>> But in the notifier
>>
>> lock(i_mmap_lock)
>> fence_signaling(within notifier)
>>
>> So gt_revoke() is violating dma-fence rules.
>>
>> BTW it looks to me like the reset mutex annotation is actually doing much the same as the dma-fence annotations; while we can move gt_revoke() out of the reset mutex, that only gives us false hopes since it moves it out of the equivalent dma-fence annotation. I figure the reason this was not seen before the new code is that the reset mutex lockdep isn't taken when waiting for active, only when waiting for dma-fence. But IMO the root problem is pre-existing.
>>
>> /Thomas
>>
>>
> The interesting scenario is
>
> thread 1:
> take i_mmap_lock()
> enter_mmu_notifier()
> wait_fence()
>
> thread 2:
> need_to_reset_gpu_for_the_above_fence();
> take i_mmap_lock()
>
> Deadlock.
>
> /Thomas
>
>
Yeah, I think gpu reset isn't completely following the lockdep rules yet. Thread 1 isn't doing anything wrong; gpu reset should probably stop revoking gt bindings and allow some garbage during reset. I don't see another way out. :-/
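
To make the inversion discussed above concrete, here is a minimal sketch of a lockdep-style order check in Python. It is only an illustration, not kernel code: the LockOrderGraph class and its record() method are hypothetical names invented for this example, and the two lock-class names are taken from the quoted locking orders. It records "held -> acquired" edges and reports when a new edge closes a cycle.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lockdep-style tracker (hypothetical): records 'held -> acquired' edges."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search: is dst reachable from src via recorded edges?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def record(self, held, acquired):
        """Record that `acquired` was taken while `held` was held.

        Returns True if the new edge closes a cycle, i.e. a lock-order inversion.
        """
        inversion = self._reachable(acquired, held)
        self.edges[held].add(acquired)
        return inversion


graph = LockOrderGraph()

# Reset path: gt_revoke() runs inside the fence-signalling critical section
# and ends up taking i_mmap_lock.
reset_path = graph.record("fence_signaling", "i_mmap_lock")

# Notifier path: i_mmap_lock is held while the notifier waits on a fence.
notifier_path = graph.record("i_mmap_lock", "fence_signaling")

print("reset path inversion:", reset_path)
print("notifier path inversion:", notifier_path)
```

The first edge is accepted and the second one is flagged, which mirrors the thread 1 / thread 2 scenario: whichever order is established first, the other path completes the cycle. Moving gt_revoke() out of the annotated section removes the recorded edge without removing the real dependency.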
