From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: "Lisovskiy, Stanislav" <stanislav.lisovskiy@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915: Fix global state use-after-frees with a refcount
Date: Mon, 1 Jun 2020 17:47:55 +0300
Message-ID: <20200601144755.GL6112@intel.com>
In-Reply-To: <20200601075929.GA2431@intel.com>
On Mon, Jun 01, 2020 at 10:59:29AM +0300, Lisovskiy, Stanislav wrote:
> On Fri, May 29, 2020 at 08:11:43AM +0300, Ville Syrjälä wrote:
> > On Thu, May 28, 2020 at 10:58:52PM +0300, Lisovskiy, Stanislav wrote:
> > > On Thu, May 28, 2020 at 10:38:52PM +0300, Lisovskiy, Stanislav wrote:
> > > > On Wed, May 27, 2020 at 11:02:45PM +0300, Ville Syrjala wrote:
> > > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > >
> > > > > While the current locking/serialization of the global state
> > > > > suffices for protecting the obj->state access and the actual
> > > > > hardware reprogramming, we do have a problem with accessing
> > > > > the old/new states during nonblocking commits.
> > > > >
> > > > > The state computation and swap will be protected by the crtc
> > > > > locks, but the commit_tails can finish out of order, thus also
> > > > > causing the atomic states to be cleaned up out of order. This
> > > > > would mean the commit that started first but finished last has
> > > > > had its new state freed as the no-longer-needed old state by the
> > > > > other commit.
> > > > >
> > > > > To fix this let's just refcount the states. obj->state amounts
> > > > > to one reference, and the intel_atomic_state holds extra references
> > > > > to both its new and old global obj states.
> > > > >
> > > > > Fixes: 0ef1905ecf2e ("drm/i915: Introduce better global state handling")
> > > > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > > ---
> > > > > .../gpu/drm/i915/display/intel_global_state.c | 45 ++++++++++++++++---
> > > > > .../gpu/drm/i915/display/intel_global_state.h | 3 ++
> > > > > 2 files changed, 42 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_global_state.c b/drivers/gpu/drm/i915/display/intel_global_state.c
> > > > > index 212d4ee68205..7a19215ad844 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_global_state.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_global_state.c
> > > > > @@ -10,6 +10,28 @@
> > > > > >  #include "intel_display_types.h"
> > > > > >  #include "intel_global_state.h"
> > > > > >
> > > > > > +static void __intel_atomic_global_state_free(struct kref *kref)
> > > > > > +{
> > > > > > +	struct intel_global_state *obj_state =
> > > > > > +		container_of(kref, struct intel_global_state, ref);
> > > > > > +	struct intel_global_obj *obj = obj_state->obj;
> > > > > > +
> > > > > > +	obj->funcs->atomic_destroy_state(obj, obj_state);
> > > > > > +}
> > > > > > +
> > > > > > +static void intel_atomic_global_state_put(struct intel_global_state *obj_state)
> > > > > > +{
> > > > > > +	kref_put(&obj_state->ref, __intel_atomic_global_state_free);
> > > > > > +}
> > > > > > +
> > > > > > +static struct intel_global_state *
> > > > > > +intel_atomic_global_state_get(struct intel_global_state *obj_state)
> > > > > > +{
> > > > > > +	kref_get(&obj_state->ref);
> > > > > > +
> > > > > > +	return obj_state;
> > > > > > +}
> > > > > > +
> > > > > >  void intel_atomic_global_obj_init(struct drm_i915_private *dev_priv,
> > > > > >  				  struct intel_global_obj *obj,
> > > > > >  				  struct intel_global_state *state,
> > > > > @@ -17,6 +39,10 @@ void intel_atomic_global_obj_init(struct drm_i915_private *dev_priv,
> > > > > {
> > > > > memset(obj, 0, sizeof(*obj));
> > > > >
> > > > > + state->obj = obj;
> > > > > +
> > > > > + kref_init(&state->ref);
> > > > > +
> > > > > obj->state = state;
> > > > > obj->funcs = funcs;
> > > > > list_add_tail(&obj->head, &dev_priv->global_obj_list);
> > > > > @@ -28,7 +54,9 @@ void intel_atomic_global_obj_cleanup(struct drm_i915_private *dev_priv)
> > > > >
> > > > > >  	list_for_each_entry_safe(obj, next, &dev_priv->global_obj_list, head) {
> > > > > >  		list_del(&obj->head);
> > > > > > -		obj->funcs->atomic_destroy_state(obj, obj->state);
> > > > > > +
> > > > > > +		drm_WARN_ON(&dev_priv->drm, kref_read(&obj->state->ref) != 1);
> > > > > > +		intel_atomic_global_state_put(obj->state);
> > > > > >  	}
> > > > > >  }
> > > > >
> > > > > @@ -97,10 +125,14 @@ intel_atomic_get_global_obj_state(struct intel_atomic_state *state,
> > > > > >  	if (!obj_state)
> > > > > >  		return ERR_PTR(-ENOMEM);
> > > > > >
> > > > > > +	obj_state->obj = obj;
> > > > > >  	obj_state->changed = false;
> > > > > >
> > > > > > +	kref_init(&obj_state->ref);
> > > > > > +
> > > > > >  	state->global_objs[index].state = obj_state;
> > > > > > -	state->global_objs[index].old_state = obj->state;
> > > > > > +	state->global_objs[index].old_state =
> > > > > > +		intel_atomic_global_state_get(obj->state);
> > > > > >  	state->global_objs[index].new_state = obj_state;
> > > > > >  	state->global_objs[index].ptr = obj;
> > > > > >  	obj_state->state = state;
> > > > > @@ -163,7 +195,9 @@ void intel_atomic_swap_global_state(struct intel_atomic_state *state)
> > > > > >  		new_obj_state->state = NULL;
> > > > > >
> > > > > >  		state->global_objs[i].state = old_obj_state;
> > > > > > -		obj->state = new_obj_state;
> > > > > > +
> > > > > > +		intel_atomic_global_state_put(obj->state);
> > > > > > +		obj->state = intel_atomic_global_state_get(new_obj_state);
> > > > > >  	}
> > > > > >  }
> > > > >
> > > > > @@ -172,10 +206,9 @@ void intel_atomic_clear_global_state(struct intel_atomic_state *state)
> > > > > >  	int i;
> > > > > >
> > > > > >  	for (i = 0; i < state->num_global_objs; i++) {
> > > > > > -		struct intel_global_obj *obj = state->global_objs[i].ptr;
> > > > > > +		intel_atomic_global_state_put(state->global_objs[i].old_state);
> > > > > > +		intel_atomic_global_state_put(state->global_objs[i].new_state);
> > > >
> > > > Shouldn't we clean old_state only?
> > > >
> > > > As I understand it, in the absence of any transaction you now have a pool
> > > > of global_objs, each of which has a state with a single kref taken.
> > > >
> > > > So when we get a new state, we do +1 kref on old_state (which is the
> > > > current global obj->state) in order to prevent it being freed by a
> > > > competing commit.
> > > > However, the new state doesn't have any kref taken at that moment.
> > > > Then in swap you do -1 kref for the old state and +1 kref for the new state,
> > > > which means that when you do -1 kref again for the old state in atomic_clear,
> > > > it will be destroyed. As for the new state, as I understand it,
> > > > it still has only the single kref grabbed when it was swapped,
> > > > so isn't it going to be freed now? Unless we are lucky and somebody
> > > > has already grabbed it as an old_state in the next commit?
> > > >
> > > > Stan
> > >
> > > Ah, actually I got it - I forgot that kref_init() sets the count to 1.
> > > But then you probably don't even need to increment the kref for the new
> > > state when swapping.
> > > Before assigning the new obj->state you release one kref in swap (which
> > > makes sense).
> > > Then you do only intel_atomic_global_state_put(old_state) in atomic_clear,
> > > and there is no need to do intel_atomic_global_state_get(new_state) during
> > > swap.
> > > I.e. we always call intel_atomic_global_state_get/put only on the "old"
> > > obj->state, and each new_state will be disposed of when it becomes old_state.
> >
> >
> > IMO the approach of handing off references is just hard to follow.
> > Better to just get/put explicitly whenever you assign a pointer.
> > I already dislike handing off the original kref_init() reference,
> > and almost added a get+put there too. Maybe I really should do that...
>
> Agreed, tbh I don't like that kref_init() already implicitly holds
> a reference - it even confused me initially.
> A typical smart pointer increments the ref only when an assignment
> is done.
>
>
> Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Ta. Pushed. Hopefully a few rounds of CI will show whether this fixes
things. Though I've also seen some vma-related use-after-frees in
the logs, so there may be further problems elsewhere...
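
For reference, the reference counts discussed in this thread can be traced
with a small userspace model. This is only a sketch: struct state, struct obj,
struct commit and the helper names are hypothetical stand-ins, not i915 code,
and a plain int replaces struct kref (so it is not thread-safe). It mirrors
the get/put placement in the patch and replays two commits where the second
one (B) is cleaned up before the first (A).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in i915. */
struct state {
	int ref;
};

struct obj {
	struct state *state;		/* obj->state counts as one reference */
};

struct commit {
	struct state *old_state;	/* the commit holds one reference each */
	struct state *new_state;
};

static int live_states;			/* allocated and not yet freed */

static struct state *state_alloc(void)
{
	struct state *s = malloc(sizeof(*s));

	assert(s);
	s->ref = 1;	/* like kref_init(): the creator starts with one ref */
	live_states++;
	return s;
}

static struct state *state_get(struct state *s)
{
	s->ref++;
	return s;
}

static void state_put(struct state *s)
{
	if (--s->ref == 0) {
		live_states--;
		free(s);
	}
}

/* Models intel_atomic_get_global_obj_state(). */
static void commit_begin(struct commit *c, struct obj *o)
{
	c->new_state = state_alloc();
	c->old_state = state_get(o->state);
}

/* Models intel_atomic_swap_global_state(). */
static void commit_swap(struct commit *c, struct obj *o)
{
	state_put(o->state);
	o->state = state_get(c->new_state);
}

/* Models intel_atomic_clear_global_state(). */
static void commit_clear(struct commit *c)
{
	state_put(c->old_state);
	state_put(c->new_state);
}

/* Returns the refcount of A's new state after B has already cleaned up. */
static int out_of_order_demo(void)
{
	struct obj o = { .state = state_alloc() };
	struct commit a, b;
	int a_new_ref;

	commit_begin(&a, &o);
	commit_swap(&a, &o);
	commit_begin(&b, &o);		/* b.old_state == a.new_state */
	commit_swap(&b, &o);

	commit_clear(&b);		/* B finishes first */
	a_new_ref = a.new_state->ref;	/* still valid: A holds its own ref */
	commit_clear(&a);		/* A finishes last */

	state_put(o.state);		/* teardown, like obj cleanup */
	return a_new_ref;
}
```

In this model A's new state still has one reference after B's cleanup, and
every state is freed exactly once by the end. If commit_swap() skipped the
state_get() and commit_clear() skipped the put of new_state, the same sequence
would drop that state to zero references in commit_clear(&b) while A still
points at it, at least in this simplified model.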
--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx