From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: DRI Development <dri-devel@lists.freedesktop.org>
Cc: "Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Tvrtko Ursulin" <tvrtko.ursulin@intel.com>,
	"Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>,
	"Matthew Brost" <matthew.brost@intel.com>,
	"Matthew Auld" <matthew.auld@intel.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Thomas Hellström" <thomas.hellstrom@intel.com>,
	"Lionel Landwerlin" <lionel.g.landwerlin@intel.com>,
	"Dave Airlie" <airlied@redhat.com>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: [PATCH 03/11] drm/i915: Keep gem ctx->vm alive until the final put
Date: Fri, 13 Aug 2021 22:30:25 +0200
Message-ID: <20210813203033.3179400-3-daniel.vetter@ffwll.ch>
In-Reply-To: <20210813203033.3179400-1-daniel.vetter@ffwll.ch>

The comment added in

commit b81dde719439c8f09bb61e742ed95bfc4b33946b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue May 21 22:11:29 2019 +0100

    drm/i915: Allow userspace to clone contexts on creation

and moved in

commit 27dbae8f36c1c25008b7885fc07c57054b7dfba3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 6 09:13:12 2019 +0000

    drm/i915/gem: Safely acquire the ctx->vm when copying

suggested that i915_address_space was at least intended to be managed
through SLAB_TYPESAFE_BY_RCU:

	* This ppgtt may have be reallocated between
	* the read and the kref, and reassigned to a third
	* context. In order to avoid inadvertent sharing
	* of this ppgtt with that third context (and not
	* src), we have to confirm that we have the same
	* ppgtt after passing through the strong memory
	* barrier implied by a successful
	* kref_get_unless_zero().

But an extensive git history search has not brought any such reuse to
light.
What has come to light though is that ever since

commit 2850748ef8763ab46958e43a4d1c445f29eeb37d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 4 14:39:58 2019 +0100

    drm/i915: Pull i915_vma_pin under the vm->mutex

(yes, this commit is earlier) the final i915_vm_put call has been moved
from i915_gem_context_free (now called _release) to context_close,
which means it is no longer safe to access the ctx->vm pointer without
locks held, because it might disappear at any moment.

Note that superficially things all still work, because the
i915_address_space is RCU protected since

commit b32fa811156328aea5a3c2ff05cc096490382456
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jun 20 19:37:05 2019 +0100

    drm/i915/gtt: Defer address space cleanup to an RCU worker

except that the very clever macro above (which is designed to protect
against object reuse due to SLAB_TYPESAFE_BY_RCU or similar tricks)
results in an endless loop if the refcount of the ctx->vm ever
permanently drops to 0. Which it totally now can.

Fix that by moving the final i915_vm_put to where it should be.

Note that i915_gem_context is RCU protected, but _only_ the final
kfree. This means anyone who chases a pointer to a gem ctx solely
under that protection can pretty much only call kref_get_unless_zero().
This seems to be pretty much the case, aside from a bunch of cases
that consult the scheduling information without any further
protection.
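The endless loop described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: `model_vm`, `model_ctx`, `model_get_unless_zero`, `model_put` and `model_lookup` are hypothetical names, C11 atomics stand in for kref and RCU, and a retry budget (`max_tries`) is added purely so the model terminates where the real loop would spin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for i915_address_space: only the refcount matters. */
struct model_vm {
	atomic_int ref;
};

/* Hypothetical stand-in for i915_gem_context: only the vm pointer matters. */
struct model_ctx {
	_Atomic(struct model_vm *) vm;
};

/* Same contract as kref_get_unless_zero(): take a reference only if the
 * count is still non-zero, and report whether that worked.
 */
static bool model_get_unless_zero(struct model_vm *vm)
{
	int old = atomic_load(&vm->ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&vm->ref, &old, old + 1))
			return true;
	}
	return false;
}

static void model_put(struct model_vm *vm)
{
	int old = atomic_fetch_sub(&vm->ref, 1);

	assert(old > 0);
}

/* Lookup loop in the style the commit message describes. The retry budget
 * is an addition for this model only; NULL means the budget ran out.
 */
static struct model_vm *model_lookup(struct model_ctx *ctx, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		struct model_vm *vm = atomic_load(&ctx->vm);

		if (!model_get_unless_zero(vm))
			continue;	/* refcount is 0: retry */

		if (vm == atomic_load(&ctx->vm))
			return vm;	/* still the same object: success */

		model_put(vm);		/* pointer changed under us: retry */
	}
	return NULL;
}
```

In this model, a lookup succeeds while the vm still holds a reference. Once the count has dropped to 0 while `ctx->vm` still points at the same object, every iteration takes the `continue` branch, so without the budget the loop would never exit.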
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5a053cf14948..12e2de1db1a2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -990,6 +990,7 @@ static void i915_gem_context_release_work(struct work_struct *work)
 {
 	struct i915_gem_context *ctx = container_of(work, typeof(*ctx),
 						    release_work);
+	struct i915_address_space *vm;
 
 	trace_i915_context_free(ctx);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
@@ -997,6 +998,10 @@ static void i915_gem_context_release_work(struct work_struct *work)
 	if (ctx->syncobj)
 		drm_syncobj_put(ctx->syncobj);
 
+	vm = i915_gem_context_vm(ctx);
+	if (vm)
+		i915_vm_put(vm);
+
 	mutex_destroy(&ctx->engines_mutex);
 	mutex_destroy(&ctx->lut_mutex);
 
@@ -1220,8 +1225,15 @@ static void context_close(struct i915_gem_context *ctx)
 	set_closed_name(ctx);
 
 	vm = i915_gem_context_vm(ctx);
-	if (vm)
+	if (vm) {
+		/* i915_vm_close drops the final reference, which is a bit too
+		 * early and could result in surprises with concurrent
+		 * operations racing with this ctx close. Keep a full reference
+		 * until the end.
+		 */
+		i915_vm_get(vm);
 		i915_vm_close(vm);
+	}
 
 	ctx->file_priv = ERR_PTR(-EBADF);
 
-- 
2.32.0
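The reference balance this patch aims for can be sketched with a single-threaded toy model. Everything here is hypothetical and simplified: the `toy_*` names do not exist in the driver, and the only behaviour taken from the patch is that closing the vm drops a reference, that context close now takes one first, and that the final release drops it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of an address space. */
struct toy_vm {
	int ref;	/* reference count */
	bool freed;	/* set once the last reference is gone */
};

static void toy_vm_get(struct toy_vm *vm)
{
	assert(!vm->freed);
	vm->ref++;
}

static void toy_vm_put(struct toy_vm *vm)
{
	assert(vm->ref > 0);
	if (--vm->ref == 0)
		vm->freed = true;
}

/* Assumed behaviour, following the comment in the patch: close drops a
 * reference. Anything else the real close does is left out of the model.
 */
static void toy_vm_close(struct toy_vm *vm)
{
	toy_vm_put(vm);
}

/* Context close before the patch: the context's reference goes away here. */
static void toy_ctx_close_old(struct toy_vm *vm)
{
	toy_vm_close(vm);
}

/* Context close with the patch: take a full reference before closing. */
static void toy_ctx_close_new(struct toy_vm *vm)
{
	toy_vm_get(vm);
	toy_vm_close(vm);
}

/* Final release with the patch: drop the reference kept across close. */
static void toy_ctx_release_new(struct toy_vm *vm)
{
	toy_vm_put(vm);
}
```

Starting from a vm with a single reference, `toy_ctx_close_old()` leaves it freed immediately, while `toy_ctx_close_new()` leaves the count at 1 until `toy_ctx_release_new()` runs. That is the property that lets code holding only a context reference keep using `ctx->vm` after the context has been closed.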