* [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Chris Wilson @ 2020-02-05 12:11 UTC (permalink / raw)
To: intel-gfx

If we have a set of active engines marked as being non-persistent, we
lose track of those if the user replaces those engines with
I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
non-persistent requests are terminated if they are no longer being
tracked by the user's context (in order to prevent a lost request
causing an untracked and so unstoppable GPU hang), so we need to apply
the same context cancellation upon changing engines.

Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
Testcase: XXX
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 52a749691a8d..5d4093266103 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1623,6 +1623,11 @@ set_engines(struct i915_gem_context *ctx,
 	}
 
 replace:
+	/* Flush stale requests off the old engines if required */
+	if (!i915_gem_context_is_persistent(ctx) ||
+	    !i915_modparams.enable_hangcheck)
+		kill_context(ctx);
+
 	mutex_lock(&ctx->engines_mutex);
 	if (args->size)
 		i915_gem_context_set_user_engines(ctx);
-- 
2.25.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [Intel-gfx] [PATCH v2] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Chris Wilson @ 2020-02-05 12:13 UTC (permalink / raw)
To: intel-gfx

If we have a set of active engines marked as being non-persistent, we
lose track of those if the user replaces those engines with
I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
non-persistent requests are terminated if they are no longer being
tracked by the user's context (in order to prevent a lost request
causing an untracked and so unstoppable GPU hang), so we need to apply
the same context cancellation upon changing engines.

Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
Testcase: XXX
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 52a749691a8d..20f1d3e0221f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx,
 
 replace:
 	mutex_lock(&ctx->engines_mutex);
+
+	/* Flush stale requests off the old engines if required */
+	if (!i915_gem_context_is_persistent(ctx) ||
+	    !i915_modparams.enable_hangcheck)
+		kill_context(ctx);
+
 	if (args->size)
 		i915_gem_context_set_user_engines(ctx);
 	else
 		i915_gem_context_clear_user_engines(ctx);
 	set.engines = rcu_replace_pointer(ctx->engines, set.engines, 1);
+
 	mutex_unlock(&ctx->engines_mutex);
 
 	call_rcu(&set.engines->rcu, free_engines_rcu);
-- 
2.25.0
* Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Tvrtko Ursulin @ 2020-02-05 16:22 UTC (permalink / raw)
To: Chris Wilson, intel-gfx

On 05/02/2020 12:13, Chris Wilson wrote:
> If we have a set of active engines marked as being non-persistent, we
> lose track of those if the user replaces those engines with
> I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
> non-persistent requests are terminated if they are no longer being
> tracked by the user's context (in order to prevent a lost request
> causing an untracked and so unstoppable GPU hang), so we need to apply
> the same context cancellation upon changing engines.
> 
> Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
> Testcase: XXX
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 52a749691a8d..20f1d3e0221f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx,
> 
>  replace:
>  	mutex_lock(&ctx->engines_mutex);
> +
> +	/* Flush stale requests off the old engines if required */
> +	if (!i915_gem_context_is_persistent(ctx) ||
> +	    !i915_modparams.enable_hangcheck)
> +		kill_context(ctx);

Is the negative effect of this that legit contexts can't keep submitting
and changing the map? Only if PREEMPT_TIMEOUT is disabled I think, but
still. Might break legitimate userspace. Not that I offer solutions.. :(

Banning changing engines once a context went non-persistent? That too
can break someone.

Regards,
Tvrtko

> +
>  	if (args->size)
>  		i915_gem_context_set_user_engines(ctx);
>  	else
>  		i915_gem_context_clear_user_engines(ctx);
>  	set.engines = rcu_replace_pointer(ctx->engines, set.engines, 1);
> +
>  	mutex_unlock(&ctx->engines_mutex);
> 
>  	call_rcu(&set.engines->rcu, free_engines_rcu);
* Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Chris Wilson @ 2020-02-05 16:44 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2020-02-05 16:22:34)
> On 05/02/2020 12:13, Chris Wilson wrote:
> > If we have a set of active engines marked as being non-persistent, we
> > lose track of those if the user replaces those engines with
> > I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
> > non-persistent requests are terminated if they are no longer being
> > tracked by the user's context (in order to prevent a lost request
> > causing an untracked and so unstoppable GPU hang), so we need to apply
> > the same context cancellation upon changing engines.
> >
> > Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
> > Testcase: XXX
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 52a749691a8d..20f1d3e0221f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx,
> >
> >  replace:
> >  	mutex_lock(&ctx->engines_mutex);
> > +
> > +	/* Flush stale requests off the old engines if required */
> > +	if (!i915_gem_context_is_persistent(ctx) ||
> > +	    !i915_modparams.enable_hangcheck)
> > +		kill_context(ctx);
> 
> Is the negative effect of this that legit contexts can't keep submitting
> and changing the map? Only if PREEMPT_TIMEOUT is disabled I think, but
> still. Might break legitimate userspace. Not that I offer solutions.. :(
> 
> Banning changing engines once a context went non-persistent? That too
> can break someone.

It closes the hole we have. To do otherwise, we need to keep track of
the old engines. Not an impossible task, certainly inconvenient.

struct old_engines {
	struct i915_active active;
	struct list_head link;
	struct i915_gem_context *ctx;
	void *engines;
	int num_engines;
};

With a list+spinlock in the ctx that we can work through in kill_context.

The biggest catch there is actually worrying about attaching the active
to an already executing request, and making sure the coupling doesn't
bug on a concurrent completion. Hmm, it's just a completion callback,
but more convenient to use a ready made one.
-Chris
* Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Tvrtko Ursulin @ 2020-02-05 18:33 UTC (permalink / raw)
To: Chris Wilson, intel-gfx

On 05/02/2020 16:44, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-02-05 16:22:34)
>> On 05/02/2020 12:13, Chris Wilson wrote:
>>> If we have a set of active engines marked as being non-persistent, we
>>> lose track of those if the user replaces those engines with
>>> I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
>>> non-persistent requests are terminated if they are no longer being
>>> tracked by the user's context (in order to prevent a lost request
>>> causing an untracked and so unstoppable GPU hang), so we need to apply
>>> the same context cancellation upon changing engines.
>>>
>>> Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
>>> Testcase: XXX
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> index 52a749691a8d..20f1d3e0221f 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
>>> @@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx,
>>>
>>>  replace:
>>>  	mutex_lock(&ctx->engines_mutex);
>>> +
>>> +	/* Flush stale requests off the old engines if required */
>>> +	if (!i915_gem_context_is_persistent(ctx) ||
>>> +	    !i915_modparams.enable_hangcheck)
>>> +		kill_context(ctx);
>>
>> Is the negative effect of this that legit contexts can't keep submitting
>> and changing the map? Only if PREEMPT_TIMEOUT is disabled I think, but
>> still. Might break legitimate userspace. Not that I offer solutions.. :(
>> Banning changing engines once a context went non-persistent? That too
>> can break someone.
> 
> It closes the hole we have. To do otherwise, we need to keep track of
> the old engines. Not an impossible task, certainly inconvenient.
> 
> struct old_engines {
> 	struct i915_active active;
> 	struct list_head link;
> 	struct i915_gem_context *ctx;
> 	void *engines;
> 	int num_engines;
> };
> 
> With a list+spinlock in the ctx that we can work through in kill_context.
> 
> The biggest catch there is actually worrying about attaching the active
> to an already executing request, and making sure the coupling doesn't
> bug on a concurrent completion. Hmm, it's just a completion callback,
> but more convenient to use a ready made one.

What would you do with the old engines? We don't have a mechanism to
mark an intel_context closed. Hm, right, it would get unreachable by
definition. But how to terminate it if it doesn't play nicely?

Regards,
Tvrtko
* Re: [Intel-gfx] [PATCH v2] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Chris Wilson @ 2020-02-05 18:39 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2020-02-05 18:33:57)
> On 05/02/2020 16:44, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2020-02-05 16:22:34)
> >> On 05/02/2020 12:13, Chris Wilson wrote:
> >>> If we have a set of active engines marked as being non-persistent, we
> >>> lose track of those if the user replaces those engines with
> >>> I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
> >>> non-persistent requests are terminated if they are no longer being
> >>> tracked by the user's context (in order to prevent a lost request
> >>> causing an untracked and so unstoppable GPU hang), so we need to apply
> >>> the same context cancellation upon changing engines.
> >>>
> >>> Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
> >>> Testcase: XXX
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>  drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++
> >>>  1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>> index 52a749691a8d..20f1d3e0221f 100644
> >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>> @@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx,
> >>>
> >>>  replace:
> >>>  	mutex_lock(&ctx->engines_mutex);
> >>> +
> >>> +	/* Flush stale requests off the old engines if required */
> >>> +	if (!i915_gem_context_is_persistent(ctx) ||
> >>> +	    !i915_modparams.enable_hangcheck)
> >>> +		kill_context(ctx);
> >>
> >> Is the negative effect of this that legit contexts can't keep submitting
> >> and changing the map? Only if PREEMPT_TIMEOUT is disabled I think, but
> >> still. Might break legitimate userspace. Not that I offer solutions.. :(
> >> Banning changing engines once a context went non-persistent? That too
> >> can break someone.
> > 
> > It closes the hole we have. To do otherwise, we need to keep track of
> > the old engines. Not an impossible task, certainly inconvenient.
> > 
> > struct old_engines {
> > 	struct i915_active active;
> > 	struct list_head link;
> > 	struct i915_gem_context *ctx;
> > 	void *engines;
> > 	int num_engines;
> > };
> > 
> > With a list+spinlock in the ctx that we can work through in kill_context.
> > 
> > The biggest catch there is actually worrying about attaching the active
> > to an already executing request, and making sure the coupling doesn't
> > bug on a concurrent completion. Hmm, it's just a completion callback,
> > but more convenient to use a ready made one.
> 
> What would you do with the old engines? We don't have a mechanism to
> mark an intel_context closed. Hm, right, it would get unreachable by
> definition. But how to terminate it if it doesn't play nicely?

Wait 30 minutes (if it passes the tests) and you'll find out :)
-Chris
* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gem: Don't leak non-persistent requests on changing engines (rev2)
From: Patchwork @ 2020-02-05 17:17 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gem: Don't leak non-persistent requests on changing engines (rev2)
URL   : https://patchwork.freedesktop.org/series/73023/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7869 -> Patchwork_16434
====================================================

Summary
-------

**SUCCESS**

No regressions found.

External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/index.html

Known issues
------------

Here are the changes found in Patchwork_16434 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_close_race@basic-threads:
    - fi-byt-j1900:       [PASS][1] -> [INCOMPLETE][2] ([i915#45])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-byt-j1900/igt@gem_close_race@basic-threads.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-byt-j1900/igt@gem_close_race@basic-threads.html

  * igt@i915_selftest@live_execlists:
    - fi-icl-y:           [PASS][3] -> [DMESG-FAIL][4] ([fdo#108569])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-icl-y/igt@i915_selftest@live_execlists.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-icl-y/igt@i915_selftest@live_execlists.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-kbl-x1275:       [PASS][5] -> [INCOMPLETE][6] ([i915#504])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-kbl-x1275/igt@i915_selftest@live_gem_contexts.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-kbl-x1275/igt@i915_selftest@live_gem_contexts.html

  * igt@i915_selftest@live_gtt:
    - fi-bdw-5557u:       [PASS][7] -> [TIMEOUT][8] ([fdo#112271])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-bdw-5557u/igt@i915_selftest@live_gtt.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-bdw-5557u/igt@i915_selftest@live_gtt.html

  * igt@prime_self_import@basic-llseek-bad:
    - fi-tgl-y:           [PASS][9] -> [DMESG-WARN][10] ([CI#94] / [i915#402]) +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-tgl-y/igt@prime_self_import@basic-llseek-bad.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-tgl-y/igt@prime_self_import@basic-llseek-bad.html

#### Possible fixes ####

  * igt@gem_exec_parallel@fds:
    - fi-icl-u3:          [INCOMPLETE][11] -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-icl-u3/igt@gem_exec_parallel@fds.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-icl-u3/igt@gem_exec_parallel@fds.html

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-tgl-y:           [FAIL][13] ([CI#94]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-tgl-y/igt@gem_exec_suspend@basic-s4-devices.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-tgl-y/igt@gem_exec_suspend@basic-s4-devices.html

  * igt@i915_getparams_basic@basic-subslice-total:
    - fi-tgl-y:           [DMESG-WARN][15] ([CI#94] / [i915#402]) -> [PASS][16] +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-tgl-y/igt@i915_getparams_basic@basic-subslice-total.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-tgl-y/igt@i915_getparams_basic@basic-subslice-total.html

  * igt@i915_module_load@reload:
    - fi-skl-6770hq:      [DMESG-WARN][17] ([i915#92]) -> [PASS][18] +1 similar issue
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-skl-6770hq/igt@i915_module_load@reload.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-skl-6770hq/igt@i915_module_load@reload.html

  * igt@i915_selftest@live_blt:
    - fi-ivb-3770:        [DMESG-FAIL][19] ([i915#725]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-ivb-3770/igt@i915_selftest@live_blt.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-ivb-3770/igt@i915_selftest@live_blt.html

  * igt@kms_chamelium@hdmi-crc-fast:
    - fi-kbl-7500u:       [FAIL][21] ([fdo#109635] / [i915#217]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-kbl-7500u/igt@kms_chamelium@hdmi-crc-fast.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-kbl-7500u/igt@kms_chamelium@hdmi-crc-fast.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][23] ([fdo#111096] / [i915#323]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
    - fi-skl-6770hq:      [SKIP][25] ([fdo#109271]) -> [PASS][26] +4 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-skl-6770hq/igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-skl-6770hq/igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-c:
    - fi-skl-6770hq:      [DMESG-WARN][27] ([i915#106] / [i915#188]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-skl-6770hq/igt@kms_pipe_crc_basic@read-crc-pipe-c.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-skl-6770hq/igt@kms_pipe_crc_basic@read-crc-pipe-c.html

#### Warnings ####

  * igt@i915_pm_rpm@module-reload:
    - fi-skl-6770hq:      [DMESG-WARN][29] ([i915#92]) -> [FAIL][30] ([i915#178])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-skl-6770hq/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770:        [DMESG-FAIL][31] ([i915#553]) -> [DMESG-FAIL][32] ([i915#553] / [i915#725])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7869/fi-hsw-4770/igt@i915_selftest@live_blt.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/fi-hsw-4770/igt@i915_selftest@live_blt.html

  [CI#94]: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/94
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109635]: https://bugs.freedesktop.org/show_bug.cgi?id=109635
  [fdo#111096]: https://bugs.freedesktop.org/show_bug.cgi?id=111096
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#106]: https://gitlab.freedesktop.org/drm/intel/issues/106
  [i915#178]: https://gitlab.freedesktop.org/drm/intel/issues/178
  [i915#188]: https://gitlab.freedesktop.org/drm/intel/issues/188
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#323]: https://gitlab.freedesktop.org/drm/intel/issues/323
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#45]: https://gitlab.freedesktop.org/drm/intel/issues/45
  [i915#504]: https://gitlab.freedesktop.org/drm/intel/issues/504
  [i915#553]: https://gitlab.freedesktop.org/drm/intel/issues/553
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92

Participating hosts (49 -> 39)
------------------------------

  Additional (2): fi-kbl-7560u fi-gdg-551
  Missing (12): fi-kbl-soraka fi-hsw-4200u fi-bdw-gvtdvm fi-bsw-cyan fi-ctg-p8600 fi-elk-e7500 fi-bdw-samus fi-byt-n2820 fi-byt-clapper fi-bsw-nick fi-skl-6600u fi-snb-2600

Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7869 -> Patchwork_16434

  CI-20190529: 20190529
  CI_DRM_7869: db0579be255412f38a450c3c577f8d10f1195034 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5419: 44913a91e77434b03001bb9ea53216cd03c476e6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16434: ff4baf44b0269a86adf6319dafac50877260ce2f @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

ff4baf44b026 drm/i915/gem: Don't leak non-persistent requests on changing engines

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16434/index.html
* [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines
From: Chris Wilson @ 2020-02-05 19:39 UTC (permalink / raw)
To: intel-gfx

If we have a set of active engines marked as being non-persistent, we
lose track of those if the user replaces those engines with
I915_CONTEXT_PARAM_ENGINES. Part of our uABI contract is that
non-persistent requests are terminated if they are no longer being
tracked by the user's context (in order to prevent a lost request
causing an untracked and so unstoppable GPU hang), so we need to apply
the same context cancellation upon changing engines.

v2: Track stale engines[] so we only reap at context closure.
Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
Testcase: igt/gem_ctx_persistence/replace
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 109 ++++++++++++++++--
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  11 +-
 drivers/gpu/drm/i915/i915_sw_fence.c          |  15 ++-
 drivers/gpu/drm/i915/i915_sw_fence.h          |   2 +-
 4 files changed, 124 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 52a749691a8d..01ffb1d1c6a0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -270,7 +270,8 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 	if (!e)
 		return ERR_PTR(-ENOMEM);
 
-	init_rcu_head(&e->rcu);
+	e->ctx = ctx;
+
 	for_each_engine(engine, gt, id) {
 		struct intel_context *ce;
 
@@ -450,7 +451,7 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce)
 	return engine;
 }
 
-static void kill_context(struct i915_gem_context *ctx)
+static void kill_engines(struct i915_gem_engines *engines)
 {
 	struct i915_gem_engines_iter it;
 	struct intel_context *ce;
@@ -462,7 +463,7 @@ static void kill_context(struct i915_gem_context *ctx)
 	 * However, we only care about pending requests, so only include
 	 * engines on which there are incomplete requests.
 	 */
-	for_each_gem_engine(ce, __context_engines_static(ctx), it) {
+	for_each_gem_engine(ce, engines, it) {
 		struct intel_engine_cs *engine;
 
 		if (intel_context_set_banned(ce))
@@ -484,8 +485,32 @@ static void kill_context(struct i915_gem_context *ctx)
 		 * the context from the GPU, we have to resort to a full
 		 * reset. We hope the collateral damage is worth it.
 		 */
-		__reset_context(ctx, engine);
+		__reset_context(engines->ctx, engine);
+	}
+}
+
+static void kill_context(struct i915_gem_context *ctx)
+{
+	struct i915_gem_engines *pos, *next;
+
+	spin_lock(&ctx->stale_lock);
+	list_for_each_entry_safe(pos, next, &ctx->stale_list, link) {
+		if (!i915_sw_fence_await(&pos->fence))
+			continue;
+
+		spin_unlock(&ctx->stale_lock);
+
+		kill_engines(pos);
+
+		spin_lock(&ctx->stale_lock);
+		list_safe_reset_next(pos, next, link);
+		list_del_init(&pos->link);
+
+		i915_sw_fence_complete(&pos->fence);
 	}
+	spin_unlock(&ctx->stale_lock);
+
+	kill_engines(__context_engines_static(ctx));
 }
 
 static void set_closed_name(struct i915_gem_context *ctx)
@@ -602,6 +627,9 @@ __create_context(struct drm_i915_private *i915)
 	ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
 	mutex_init(&ctx->mutex);
 
+	INIT_LIST_HEAD(&ctx->stale_list);
+	spin_lock_init(&ctx->stale_lock);
+
 	mutex_init(&ctx->engines_mutex);
 	e = default_engines(ctx);
 	if (IS_ERR(e)) {
@@ -1529,6 +1557,66 @@ static const i915_user_extension_fn set_engines__extensions[] = {
 	[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond,
 };
 
+static int engines_notify(struct i915_sw_fence *fence,
+			  enum i915_sw_fence_notify state)
+{
+	struct i915_gem_engines *engines =
+		container_of(fence, typeof(*engines), fence);
+
+	switch (state) {
+	case FENCE_COMPLETE:
+		if (!list_empty(&engines->link)) {
+			spin_lock(&engines->ctx->stale_lock);
+			list_del(&engines->link);
+			spin_unlock(&engines->ctx->stale_lock);
+		}
+		break;
+
+	case FENCE_FREE:
+		init_rcu_head(&engines->rcu);
+		call_rcu(&engines->rcu, free_engines_rcu);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static int engines_listen(struct i915_gem_engines *engines)
+{
+	struct i915_gem_engines_iter it;
+	struct intel_context *ce;
+
+	GEM_BUG_ON(!engines);
+	i915_sw_fence_init(&engines->fence, engines_notify);
+
+	for_each_gem_engine(ce, engines, it) {
+		struct dma_fence *fence;
+		int err;
+
+		if (!ce->timeline)
+			continue;
+
+		fence = i915_active_fence_get(&ce->timeline->last_request);
+		if (!fence)
+			continue;
+
+		err = i915_sw_fence_await_dma_fence(&engines->fence,
+						    fence, 0,
+						    GFP_KERNEL);
+
+		dma_fence_put(fence);
+		if (err < 0)
+			return err;
+	}
+
+	spin_lock(&engines->ctx->stale_lock);
+	list_add(&engines->link, &engines->ctx->stale_list);
+	spin_unlock(&engines->ctx->stale_lock);
+
+	i915_sw_fence_commit(&engines->fence);
+	return 0;
+}
+
 static int
 set_engines(struct i915_gem_context *ctx,
 	    const struct drm_i915_gem_context_param *args)
@@ -1571,7 +1659,8 @@ set_engines(struct i915_gem_context *ctx,
 	if (!set.engines)
 		return -ENOMEM;
 
-	init_rcu_head(&set.engines->rcu);
+	set.engines->ctx = ctx;
+
 	for (n = 0; n < num_engines; n++) {
 		struct i915_engine_class_instance ci;
 		struct intel_engine_cs *engine;
@@ -1631,7 +1720,11 @@ set_engines(struct i915_gem_context *ctx,
 	set.engines = rcu_replace_pointer(ctx->engines, set.engines, 1);
 	mutex_unlock(&ctx->engines_mutex);
 
-	call_rcu(&set.engines->rcu, free_engines_rcu);
+	/* Keep track of old engine sets for kill_context() */
+	if (engines_listen(set.engines)) {
+		kill_engines(set.engines); /* on error, immediately cleanup */
+		engines_notify(&set.engines->fence, FENCE_FREE);
+	}
 
 	return 0;
 }
@@ -1646,7 +1739,6 @@ __copy_engines(struct i915_gem_engines *e)
 	if (!copy)
 		return ERR_PTR(-ENOMEM);
 
-	init_rcu_head(&copy->rcu);
 	for (n = 0; n < e->num_engines; n++) {
 		if (e->engines[n])
 			copy->engines[n] = intel_context_get(e->engines[n]);
@@ -1890,7 +1982,8 @@ static int clone_engines(struct i915_gem_context *dst,
 	if (!clone)
 		goto err_unlock;
 
-	init_rcu_head(&clone->rcu);
+	clone->ctx = dst;
+
 	for (n = 0; n < e->num_engines; n++) {
 		struct intel_engine_cs *engine;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 017ca803ab47..39c2f6189684 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -20,6 +20,7 @@
 #include "gt/intel_context_types.h"
 
 #include "i915_scheduler.h"
+#include "i915_sw_fence.h"
 
 struct pid;
 
@@ -30,7 +31,12 @@ struct intel_timeline;
 struct intel_ring;
 
 struct i915_gem_engines {
-	struct rcu_head rcu;
+	union {
+		struct rcu_head rcu;
+		struct list_head link;
+	};
+	struct i915_sw_fence fence;
+	struct i915_gem_context *ctx;
 	unsigned int num_engines;
 	struct intel_context *engines[];
 };
@@ -173,6 +179,9 @@ struct i915_gem_context {
 	 * context in messages.
	 */
 	char name[TASK_COMM_LEN + 8];
+
+	struct spinlock stale_lock;
+	struct list_head stale_list;
 };
 
 #endif /* __I915_GEM_CONTEXT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 51ba97daf2a0..e47b9cf952c4 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -211,10 +211,19 @@ void i915_sw_fence_complete(struct i915_sw_fence *fence)
 	__i915_sw_fence_complete(fence, NULL);
 }
 
-void i915_sw_fence_await(struct i915_sw_fence *fence)
+bool i915_sw_fence_await(struct i915_sw_fence *fence)
 {
-	debug_fence_assert(fence);
-	WARN_ON(atomic_inc_return(&fence->pending) <= 1);
+	int old, new;
+
+	new = atomic_read(&fence->pending);
+	do {
+		if (new < 1)
+			return false;
+
+		old = new++;
+	} while ((new = atomic_cmpxchg(&fence->pending, old, new)) != old);
+
+	return true;
 }
 
 void __i915_sw_fence_init(struct i915_sw_fence *fence,
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index 19e806ce43bc..30a863353ee6 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -91,7 +91,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    unsigned long timeout,
 				    gfp_t gfp);
 
-void i915_sw_fence_await(struct i915_sw_fence *fence);
+bool i915_sw_fence_await(struct i915_sw_fence *fence);
 void i915_sw_fence_complete(struct i915_sw_fence *fence);
 
 static inline bool i915_sw_fence_signaled(const struct i915_sw_fence *fence)
-- 
2.25.0
* [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines 2020-02-05 19:39 ` [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines Chris Wilson @ 2020-02-05 20:32 ` Chris Wilson 0 siblings, 0 replies; 11+ messages in thread From: Chris Wilson @ 2020-02-05 20:32 UTC (permalink / raw) To: intel-gfx If we have a set of active engines marked as being non-persistent, we lose track of those if the user replaces those engines with I915_CONTEXT_PARAM_ENGINES. As part of our uABI contract is that non-persistent requests are terminated if they are no longer being tracked by the user's context (in order to prevent a lost request causing an untracked and so unstoppable GPU hang), we need to apply the same context cancellation upon changing engines. v2: Track stale engines[] so we only reap at context closure. Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional") Testcase: igt/gem_ctx_peristence/replace Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 118 ++++++++++++++++-- .../gpu/drm/i915/gem/i915_gem_context_types.h | 11 +- drivers/gpu/drm/i915/i915_sw_fence.c | 15 ++- drivers/gpu/drm/i915/i915_sw_fence.h | 2 +- 4 files changed, 133 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 52a749691a8d..21ce84cc1f68 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -270,7 +270,8 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) if (!e) return ERR_PTR(-ENOMEM); - init_rcu_head(&e->rcu); + e->ctx = ctx; + for_each_engine(engine, gt, id) { struct intel_context *ce; @@ -450,7 +451,7 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce) return engine; } -static void kill_context(struct i915_gem_context 
*ctx) +static void kill_engines(struct i915_gem_engines *engines) { struct i915_gem_engines_iter it; struct intel_context *ce; @@ -462,7 +463,7 @@ static void kill_context(struct i915_gem_context *ctx) * However, we only care about pending requests, so only include * engines on which there are incomplete requests. */ - for_each_gem_engine(ce, __context_engines_static(ctx), it) { + for_each_gem_engine(ce, engines, it) { struct intel_engine_cs *engine; if (intel_context_set_banned(ce)) @@ -484,10 +485,41 @@ static void kill_context(struct i915_gem_context *ctx) * the context from the GPU, we have to resort to a full * reset. We hope the collateral damage is worth it. */ - __reset_context(ctx, engine); + __reset_context(engines->ctx, engine); + } +} + +static void kill_stale_engines(struct i915_gem_context *ctx) +{ + if (!list_empty(&ctx->stale_list)) { + struct i915_gem_engines *pos, *next; + unsigned long flags; + + spin_lock_irqsave(&ctx->stale_lock, flags); + list_for_each_entry_safe(pos, next, &ctx->stale_list, link) { + if (!i915_sw_fence_await(&pos->fence)) + continue; + + spin_unlock_irqrestore(&ctx->stale_lock, flags); + + kill_engines(pos); + + spin_lock_irqsave(&ctx->stale_lock, flags); + list_safe_reset_next(pos, next, link); + list_del_init(&pos->link); + + i915_sw_fence_complete(&pos->fence); + } + spin_unlock_irqrestore(&ctx->stale_lock, flags); } } +static void kill_context(struct i915_gem_context *ctx) +{ + kill_stale_engines(ctx); + kill_engines(__context_engines_static(ctx)); +} + static void set_closed_name(struct i915_gem_context *ctx) { char *s; @@ -602,6 +634,9 @@ __create_context(struct drm_i915_private *i915) ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL); mutex_init(&ctx->mutex); + INIT_LIST_HEAD(&ctx->stale_list); + spin_lock_init(&ctx->stale_lock); + mutex_init(&ctx->engines_mutex); e = default_engines(ctx); if (IS_ERR(e)) { @@ -1529,6 +1564,71 @@ static const i915_user_extension_fn set_engines__extensions[] = { 
[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond, }; +static int engines_notify(struct i915_sw_fence *fence, + enum i915_sw_fence_notify state) +{ + struct i915_gem_engines *engines = + container_of(fence, typeof(*engines), fence); + + switch (state) { + case FENCE_COMPLETE: + if (!list_empty(&engines->link)) { + struct i915_gem_context *ctx = engines->ctx; + unsigned long flags; + + spin_lock_irqsave(&ctx->stale_lock, flags); + list_del(&engines->link); + spin_unlock_irqrestore(&ctx->stale_lock, flags); + } + break; + + case FENCE_FREE: + init_rcu_head(&engines->rcu); + call_rcu(&engines->rcu, free_engines_rcu); + break; + } + + return NOTIFY_DONE; +} + +static void engines_idle_release(struct i915_gem_engines *engines) +{ + struct i915_gem_engines_iter it; + struct intel_context *ce; + unsigned long flags; + + GEM_BUG_ON(!engines); + i915_sw_fence_init(&engines->fence, engines_notify); + + spin_lock_irqsave(&engines->ctx->stale_lock, flags); + list_add(&engines->link, &engines->ctx->stale_list); + spin_unlock_irqrestore(&engines->ctx->stale_lock, flags); + + for_each_gem_engine(ce, engines, it) { + struct dma_fence *fence; + int err; + + if (!ce->timeline) + continue; + + fence = i915_active_fence_get(&ce->timeline->last_request); + if (!fence) + continue; + + err = i915_sw_fence_await_dma_fence(&engines->fence, + fence, 0, + GFP_KERNEL); + + dma_fence_put(fence); + if (err < 0) { + kill_engines(engines); + break; + } + } + + i915_sw_fence_commit(&engines->fence); +} + static int set_engines(struct i915_gem_context *ctx, const struct drm_i915_gem_context_param *args) @@ -1571,7 +1671,8 @@ set_engines(struct i915_gem_context *ctx, if (!set.engines) return -ENOMEM; - init_rcu_head(&set.engines->rcu); + set.engines->ctx = ctx; + for (n = 0; n < num_engines; n++) { struct i915_engine_class_instance ci; struct intel_engine_cs *engine; @@ -1631,7 +1732,8 @@ set_engines(struct i915_gem_context *ctx, set.engines = rcu_replace_pointer(ctx->engines, set.engines, 1); 
mutex_unlock(&ctx->engines_mutex); - call_rcu(&set.engines->rcu, free_engines_rcu); + /* Keep track of old engine sets for kill_context() */ + engines_idle_release(set.engines); return 0; } @@ -1646,7 +1748,6 @@ __copy_engines(struct i915_gem_engines *e) if (!copy) return ERR_PTR(-ENOMEM); - init_rcu_head(©->rcu); for (n = 0; n < e->num_engines; n++) { if (e->engines[n]) copy->engines[n] = intel_context_get(e->engines[n]); @@ -1890,7 +1991,8 @@ static int clone_engines(struct i915_gem_context *dst, if (!clone) goto err_unlock; - init_rcu_head(&clone->rcu); + clone->ctx = dst; + for (n = 0; n < e->num_engines; n++) { struct intel_engine_cs *engine; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index 017ca803ab47..39c2f6189684 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -20,6 +20,7 @@ #include "gt/intel_context_types.h" #include "i915_scheduler.h" +#include "i915_sw_fence.h" struct pid; @@ -30,7 +31,12 @@ struct intel_timeline; struct intel_ring; struct i915_gem_engines { - struct rcu_head rcu; + union { + struct rcu_head rcu; + struct list_head link; + }; + struct i915_sw_fence fence; + struct i915_gem_context *ctx; unsigned int num_engines; struct intel_context *engines[]; }; @@ -173,6 +179,9 @@ struct i915_gem_context { * context in messages. 
*/ char name[TASK_COMM_LEN + 8]; + + struct spinlock stale_lock; + struct list_head stale_list; }; #endif /* __I915_GEM_CONTEXT_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 51ba97daf2a0..9a20b7246f91 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -211,10 +211,19 @@ void i915_sw_fence_complete(struct i915_sw_fence *fence) __i915_sw_fence_complete(fence, NULL); } -void i915_sw_fence_await(struct i915_sw_fence *fence) +bool i915_sw_fence_await(struct i915_sw_fence *fence) { - debug_fence_assert(fence); - WARN_ON(atomic_inc_return(&fence->pending) <= 1); + int old, new; + + new = atomic_read(&fence->pending); + do { + if (new < 1) + return false; + + old = new++; + } while ((new = atomic_cmpxchg(&fence->pending, old, new)) != old); + + return true; } void __i915_sw_fence_init(struct i915_sw_fence *fence, diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h index 19e806ce43bc..30a863353ee6 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.h +++ b/drivers/gpu/drm/i915/i915_sw_fence.h @@ -91,7 +91,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, unsigned long timeout, gfp_t gfp); -void i915_sw_fence_await(struct i915_sw_fence *fence); +bool i915_sw_fence_await(struct i915_sw_fence *fence); void i915_sw_fence_complete(struct i915_sw_fence *fence); static inline bool i915_sw_fence_signaled(const struct i915_sw_fence *fence) -- 2.25.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gem: Don't leak non-persistent requests on changing engines (rev4)
  2020-02-05 12:11 [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines Chris Wilson
                   ` (2 preceding siblings ...)
  2020-02-05 19:39 ` [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines Chris Wilson
@ 2020-02-05 22:05 ` Patchwork
  2020-02-05 22:37 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2020-02-05 22:05 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gem: Don't leak non-persistent requests on changing engines (rev4)
URL   : https://patchwork.freedesktop.org/series/73023/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
66727965053c drm/i915/gem: Don't leak non-persistent requests on changing engines
-:248: WARNING:USE_SPINLOCK_T: struct spinlock should be spinlock_t
#248: FILE: drivers/gpu/drm/i915/gem/i915_gem_context_types.h:183:
+	struct spinlock stale_lock;

total: 0 errors, 1 warnings, 0 checks, 240 lines checked
* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gem: Don't leak non-persistent requests on changing engines (rev4)
  2020-02-05 12:11 [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines Chris Wilson
                   ` (3 preceding siblings ...)
  2020-02-05 22:05 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gem: Don't leak non-persistent requests on changing engines (rev4) Patchwork
@ 2020-02-05 22:37 ` Patchwork
  4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2020-02-05 22:37 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/gem: Don't leak non-persistent requests on changing engines (rev4)
URL   : https://patchwork.freedesktop.org/series/73023/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7871 -> Patchwork_16442
====================================================

Summary
-------

**FAILURE**

Serious unknown changes coming with Patchwork_16442 absolutely need to be verified manually.

If you think the reported changes have nothing to do with the changes introduced in Patchwork_16442, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI.

External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/index.html

Possible new issues
-------------------

Here are the unknown changes that may have been introduced in Patchwork_16442:

### IGT changes ###

#### Possible regressions ####

* igt@i915_selftest@live_gem_contexts:
    - fi-kbl-x1275: [PASS][1] -> [DMESG-FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-kbl-x1275/igt@i915_selftest@live_gem_contexts.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-kbl-x1275/igt@i915_selftest@live_gem_contexts.html

Known issues
------------

Here are the changes found in Patchwork_16442 that come from known issues:

### IGT changes ###

#### Issues hit ####

* igt@i915_selftest@live_blt:
    - fi-hsw-4770r: [PASS][3] -> [DMESG-FAIL][4] ([i915#553] / [i915#725])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-hsw-4770r/igt@i915_selftest@live_blt.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-hsw-4770r/igt@i915_selftest@live_blt.html
    - fi-hsw-4770: [PASS][5] -> [DMESG-FAIL][6] ([i915#725])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-hsw-4770/igt@i915_selftest@live_blt.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-hsw-4770/igt@i915_selftest@live_blt.html

* igt@i915_selftest@live_execlists:
    - fi-icl-y: [PASS][7] -> [DMESG-FAIL][8] ([fdo#108569])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-icl-y/igt@i915_selftest@live_execlists.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-icl-y/igt@i915_selftest@live_execlists.html

* igt@kms_chamelium@hdmi-hpd-fast:
    - fi-icl-u2: [PASS][9] -> [FAIL][10] ([i915#217])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html

* igt@prime_self_import@basic-llseek-size:
    - fi-tgl-y: [PASS][11] -> [DMESG-WARN][12] ([CI#94] / [i915#402]) +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-tgl-y/igt@prime_self_import@basic-llseek-size.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-tgl-y/igt@prime_self_import@basic-llseek-size.html

#### Possible fixes ####

* igt@i915_selftest@live_gtt:
    - fi-icl-guc: [TIMEOUT][13] ([fdo#112271]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-icl-guc/igt@i915_selftest@live_gtt.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-icl-guc/igt@i915_selftest@live_gtt.html

* igt@kms_addfb_basic@bad-pitch-128:
    - fi-tgl-y: [DMESG-WARN][15] ([CI#94] / [i915#402]) -> [PASS][16] +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7871/fi-tgl-y/igt@kms_addfb_basic@bad-pitch-128.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/fi-tgl-y/igt@kms_addfb_basic@bad-pitch-128.html

[CI#94]: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/94
[fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
[fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
[i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
[i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
[i915#553]: https://gitlab.freedesktop.org/drm/intel/issues/553
[i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725

Participating hosts (45 -> 42)
------------------------------

Additional (5): fi-bdw-5557u fi-hsw-peppy fi-bwr-2160 fi-ilk-650 fi-kbl-7500u
Missing (8): fi-bsw-n3050 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-ivb-3770 fi-byt-clapper fi-bdw-samus fi-kbl-r

Build changes
-------------

* CI: CI-20190529 -> None
* Linux: CI_DRM_7871 -> Patchwork_16442

CI-20190529: 20190529
CI_DRM_7871: c9b0237ee7ffb1bbb62f864f0b2d7b290ee1313d @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5419: 44913a91e77434b03001bb9ea53216cd03c476e6 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_16442: 66727965053c0e896d166743999590651bf4ca80 @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

66727965053c drm/i915/gem: Don't leak non-persistent requests on changing engines

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16442/index.html
end of thread, other threads: [~2020-02-05 22:37 UTC | newest]

Thread overview: 11+ messages

2020-02-05 12:11 [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines Chris Wilson
2020-02-05 12:13 ` [Intel-gfx] [PATCH v2] " Chris Wilson
2020-02-05 16:22   ` Tvrtko Ursulin
2020-02-05 16:44     ` Chris Wilson
2020-02-05 18:33       ` Tvrtko Ursulin
2020-02-05 18:39         ` Chris Wilson
2020-02-05 17:17 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gem: Don't leak non-persistent requests on changing engines (rev2) Patchwork
2020-02-05 19:39 ` [Intel-gfx] [PATCH] drm/i915/gem: Don't leak non-persistent requests on changing engines Chris Wilson
2020-02-05 20:32   ` Chris Wilson
2020-02-05 22:05 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/gem: Don't leak non-persistent requests on changing engines (rev4) Patchwork
2020-02-05 22:37 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork