* Re: [PATCH v3] drm/i915: Synchronize active and retire callbacks [not found] ` <20200403223528.2570-1-sultan@kerneltoast.com> @ 2020-04-04 2:41 ` Sultan Alsawaf [not found] ` <20200407064007.7599-1-sultan@kerneltoast.com> 0 siblings, 1 reply; 9+ messages in thread From: Sultan Alsawaf @ 2020-04-04 2:41 UTC (permalink / raw) To: stable, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie, Daniel Vetter, Chris Wilson, Ville Syrjälä, Matthew Auld, Tvrtko Ursulin, Mika Kuoppala, Lionel Landwerlin, intel-gfx, dri-devel, linux-kernel On Fri, Apr 03, 2020 at 03:35:15PM -0700, Sultan Alsawaf wrote: > + ref->retire(ref); > + mutex_unlock(&ref->callback_lock); Ugh, this patch is still wrong because the mutex unlock after ref->retire() is a use-after-free. Fun times... Sultan _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <20200407064007.7599-1-sultan@kerneltoast.com>]
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks [not found] ` <20200407064007.7599-1-sultan@kerneltoast.com> @ 2020-04-14 6:13 ` Sultan Alsawaf 2020-04-14 8:23 ` Chris Wilson 0 siblings, 1 reply; 9+ messages in thread From: Sultan Alsawaf @ 2020-04-14 6:13 UTC (permalink / raw) To: Chris Wilson Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, stable, Rodrigo Vivi, Matthew Auld Chris, Could you please take a look at this? This really is quite an important fix. Thanks, Sultan _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-14 6:13 ` [PATCH v4] " Sultan Alsawaf @ 2020-04-14 8:23 ` Chris Wilson 2020-04-14 14:43 ` Sultan Alsawaf 0 siblings, 1 reply; 9+ messages in thread From: Chris Wilson @ 2020-04-14 8:23 UTC (permalink / raw) To: Sultan Alsawaf Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, stable, Rodrigo Vivi, Matthew Auld Quoting Sultan Alsawaf (2020-04-14 07:13:12) > Chris, > > Could you please take a look at this? This really is quite an important fix. It's crazy. See a266bf420060 for a patch that should be applied to v5.4 -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-14 8:23 ` Chris Wilson @ 2020-04-14 14:43 ` Sultan Alsawaf 2020-04-20 5:24 ` Sultan Alsawaf 0 siblings, 1 reply; 9+ messages in thread From: Sultan Alsawaf @ 2020-04-14 14:43 UTC (permalink / raw) To: Chris Wilson Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, stable, Rodrigo Vivi, Matthew Auld On Tue, Apr 14, 2020 at 09:23:56AM +0100, Chris Wilson wrote: > Quoting Sultan Alsawaf (2020-04-14 07:13:12) > > Chris, > > > > Could you please take a look at this? This really is quite an important fix. > > It's crazy. See a266bf420060 for a patch that should be applied to v5.4 > -Chris What? a266bf420060 was part of 5.4.0-rc7, so it's already in 5.4. And if you read the commit message, you would see that the problem in question affects Linus' tree. You can break i915 in 5.6 by just adding a small delay: diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c index 6ff803f397c4..3a7968effdfd 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring.c +++ b/drivers/gpu/drm/i915/gt/intel_ring.c @@ -10,6 +10,7 @@ #include "intel_engine.h" #include "intel_ring.h" #include "intel_timeline.h" +#include <linux/delay.h> unsigned int intel_ring_update_space(struct intel_ring *ring) { @@ -92,6 +93,9 @@ void intel_ring_unpin(struct intel_ring *ring) else i915_gem_object_unpin_map(vma->obj); + mdelay(1); + ring->vaddr = NULL; + i915_vma_make_purgeable(vma); i915_vma_unpin(vma); } This is how I reproduced the race in question. I can't even reach the greeter on my laptop with this, because i915 dies before that. Sultan _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-14 14:43 ` Sultan Alsawaf @ 2020-04-20 5:24 ` Sultan Alsawaf 2020-04-20 8:21 ` Joonas Lahtinen 0 siblings, 1 reply; 9+ messages in thread From: Sultan Alsawaf @ 2020-04-20 5:24 UTC (permalink / raw) To: Chris Wilson Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, stable, Rodrigo Vivi, Matthew Auld Chris, Could you please look at this in earnest? This is a real bug that crashes my laptop without any kind of provocation. It is undeniably a bug in i915, and I've clearly described it in my patch. If you dont like the patch, I'm open to any suggestions you have for an alternative solution. My goal here is to make i915 better, but it's difficult when communication only goes one way. Thanks, Sultan _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-20 5:24 ` Sultan Alsawaf @ 2020-04-20 8:21 ` Joonas Lahtinen 2020-04-20 16:15 ` Sultan Alsawaf 0 siblings, 1 reply; 9+ messages in thread From: Joonas Lahtinen @ 2020-04-20 8:21 UTC (permalink / raw) To: Chris Wilson, Sultan Alsawaf Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, Matthew Auld, Rodrigo Vivi, stable Quoting Sultan Alsawaf (2020-04-20 08:24:19) > Chris, > > Could you please look at this in earnest? This is a real bug that crashes my > laptop without any kind of provocation. It is undeniably a bug in i915, and I've > clearly described it in my patch. If you dont like the patch, I'm open to any > suggestions you have for an alternative solution. My goal here is to make i915 > better, but it's difficult when communication only goes one way. Hi Sultan, The patch Chris pointed out was not part of 5.4 release. The commit message describes that it fixes the functions to be tolerant to running simultaneously. In doing that zeroing of ring->vaddr is removed so the test to do mdelay(1) and "ring->vaddr = NULL;" is not correct. I think you might have used the wrong git command for checking the patch history: $ git describe a266bf420060 v5.4-rc7-1996-ga266bf420060 # after -rc7 tag $ git describe --contains a266bf420060 v5.6-rc1~34^2~21^2~326 # included in v5.6-rc1 And git log to double check: $ git log --format=oneline kernel.org/stable/linux-5.4.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint" $ git log --format=oneline kernel.org/stable/linux-5.5.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint" 0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint $ git log --format=oneline kernel.org/stable/linux-5.6.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint" a754012b9f2323a5d640da7eb7b095ac3b8cd012 drm/i915/execlists: Leave resetting ring to intel_ring 0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint a266bf42006004306dd48a9082c35dfbff153307 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint So it seems that the patch got pulled into v5.6 and has been backported to v5.5 but not v5.4. Could you try applying the patch to 5.4 and seeing if the problem persists? Regards, Joonas _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-20 8:21 ` Joonas Lahtinen @ 2020-04-20 16:15 ` Sultan Alsawaf 2020-04-21 6:51 ` Joonas Lahtinen 0 siblings, 1 reply; 9+ messages in thread From: Sultan Alsawaf @ 2020-04-20 16:15 UTC (permalink / raw) To: Joonas Lahtinen Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, Chris Wilson, stable, Rodrigo Vivi, Matthew Auld On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote: > So it seems that the patch got pulled into v5.6 and has been backported > to v5.5 but not v5.4. You're right, that's my mistake. > In doing that zeroing of ring->vaddr is removed so the test to do mdelay(1) > and "ring->vaddr = NULL;" is not correct. I'm not so sure about this. Look at where `ring->vaddr` is assigned: -------------------------------------8<----------------------------------------- ret = i915_vma_pin(vma, 0, 0, flags); if (unlikely(ret)) goto err_unpin; if (i915_vma_is_map_and_fenceable(vma)) addr = (void __force *)i915_vma_pin_iomap(vma); else addr = i915_gem_object_pin_map(vma->obj, i915_coherent_map_type(vma->vm->i915)); if (IS_ERR(addr)) { ret = PTR_ERR(addr); goto err_ring; } i915_vma_make_unshrinkable(vma); /* Discard any unused bytes beyond that submitted to hw. */ intel_ring_reset(ring, ring->emit); ring->vaddr = addr; ------------------------------------->8----------------------------------------- And then the converse of that is done *before* my reproducer patch does `ring->vaddr = NULL;`: -------------------------------------8<----------------------------------------- i915_vma_unset_ggtt_write(vma); if (i915_vma_is_map_and_fenceable(vma)) i915_vma_unpin_iomap(vma); else i915_gem_object_unpin_map(vma->obj); /* mdelay(1); ring->vaddr = NULL; */ i915_vma_make_purgeable(vma); i915_vma_unpin(vma); ------------------------------------->8----------------------------------------- Isn't the value assigned to `ring->vaddr` trashed by those function calls above where I've got the mdelay? If so, why would it be correct to let the stale value sit in `ring->vaddr`? My interpretation of the zeroing of ring->vaddr being removed by Chris was that it was an unnecessary step before the ring was getting discarded anyway. Sultan _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-20 16:15 ` Sultan Alsawaf @ 2020-04-21 6:51 ` Joonas Lahtinen 2020-04-21 15:54 ` Sultan Alsawaf 0 siblings, 1 reply; 9+ messages in thread From: Joonas Lahtinen @ 2020-04-21 6:51 UTC (permalink / raw) To: Sultan Alsawaf Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, Chris Wilson, stable, Rodrigo Vivi, Matthew Auld Quoting Sultan Alsawaf (2020-04-20 19:15:14) > On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote: > > So it seems that the patch got pulled into v5.6 and has been backported > > to v5.5 but not v5.4. > > You're right, that's my mistake. Did applying the patch to v5.4 fix the issue at hand? Regards, Joonas _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v4] drm/i915: Synchronize active and retire callbacks 2020-04-21 6:51 ` Joonas Lahtinen @ 2020-04-21 15:54 ` Sultan Alsawaf 0 siblings, 0 replies; 9+ messages in thread From: Sultan Alsawaf @ 2020-04-21 15:54 UTC (permalink / raw) To: Joonas Lahtinen Cc: dri-devel, David Airlie, intel-gfx, linux-kernel, Chris Wilson, stable, Rodrigo Vivi, Matthew Auld On Tue, Apr 21, 2020 at 09:51:37AM +0300, Joonas Lahtinen wrote: > Quoting Sultan Alsawaf (2020-04-20 19:15:14) > > On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote: > > > So it seems that the patch got pulled into v5.6 and has been backported > > > to v5.5 but not v5.4. > > > > You're right, that's my mistake. > > Did applying the patch to v5.4 fix the issue at hand? Of course the patch doesn't apply as-is because the code's been shuffled around, but yes. The crashes are gone with that patch, and I don't have the motivation to check if that patch is actually correct, so hurray, problem solved. I'm not going to send the backport myself because I'll probably be ignored, so you can go ahead and do that. Sultan _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-04-22 6:55 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20200403042948.2533-1-sultan@kerneltoast.com> [not found] ` <20200403223528.2570-1-sultan@kerneltoast.com> 2020-04-04 2:41 ` [PATCH v3] drm/i915: Synchronize active and retire callbacks Sultan Alsawaf [not found] ` <20200407064007.7599-1-sultan@kerneltoast.com> 2020-04-14 6:13 ` [PATCH v4] " Sultan Alsawaf 2020-04-14 8:23 ` Chris Wilson 2020-04-14 14:43 ` Sultan Alsawaf 2020-04-20 5:24 ` Sultan Alsawaf 2020-04-20 8:21 ` Joonas Lahtinen 2020-04-20 16:15 ` Sultan Alsawaf 2020-04-21 6:51 ` Joonas Lahtinen 2020-04-21 15:54 ` Sultan Alsawaf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).