From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Miroslav Benes <mbenes@suse.cz>
Cc: Seth Jennings <sjenning@redhat.com>,
Jiri Kosina <jkosina@suse.cz>, Vojtech Pavlik <vojtech@suse.cz>,
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 6/9] livepatch: create per-task consistency model
Date: Wed, 11 Feb 2015 14:23:18 -0600
Message-ID: <20150211202318.GB4311@treble.redhat.com>
In-Reply-To: <alpine.LNX.2.00.1502111703570.27943@pobox.suse.cz>

On Wed, Feb 11, 2015 at 05:28:13PM +0100, Miroslav Benes wrote:
> On Tue, 10 Feb 2015, Josh Poimboeuf wrote:
>
> > On Tue, Feb 10, 2015 at 04:59:17PM +0100, Miroslav Benes wrote:
> > >
> > > On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > >
> > > > Add a basic per-task consistency model. This is the foundation which
> > > > will eventually enable us to patch those ~10% of security patches which
> > > > change function prototypes and/or data semantics.
> > > >
> > > > When a patch is enabled, livepatch enters into a transition state where
> > > > tasks are converging from the old universe to the new universe. If a
> > > > given task isn't using any of the patched functions, it's switched to
> > > > the new universe. Once all the tasks have been converged to the new
> > > > universe, patching is complete.
> > > >
> > > > The same sequence occurs when a patch is disabled, except the tasks
> > > > converge from the new universe to the old universe.
> > > >
> > > > The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> > > > is in transition. Only a single patch (the topmost patch on the stack)
> > > > can be in transition at a given time. A patch can remain in the
> > > > transition state indefinitely, if any of the tasks are stuck in the
> > > > previous universe.
> > > >
> > > > A transition can be reversed and effectively canceled by writing the
> > > > opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> > > > the transition is in progress. Then all the tasks will attempt to
> > > > converge back to the original universe.
> > >
> > > Hi Josh,
> > >
> > > first, thanks a lot for great work. I'm starting to go through it and it's
> > > gonna take me some time to do and send a complete review.
> >
> > I know there are a lot of details to look at, please take your time. I
> > really appreciate your review. (And everybody else's, for that matter
> > :-)
> >
> > > > + /* success! unpatch obsolete functions and do some cleanup */
> > > > +
> > > > + if (klp_universe_goal == KLP_UNIVERSE_OLD) {
> > > > + klp_unpatch_objects(klp_transition_patch);
> > > > +
> > > > + /* prevent ftrace handler from reading old func->transition */
> > > > + synchronize_rcu();
> > > > + }
> > > > +
> > > > + pr_notice("'%s': %s complete\n", klp_transition_patch->mod->name,
> > > > + klp_universe_goal == KLP_UNIVERSE_NEW ? "patching" :
> > > > + "unpatching");
> > > > +
> > > > + klp_complete_transition();
> > > > +}
> > >
> > > ...synchronize_rcu() could be insufficient. There still can be some
> > > process in our ftrace handler after the call.
> > >
> > > Consider the following scenario:
> > >
> > > When synchronize_rcu is called some process could have been preempted on
> > > some other cpu somewhere at the start of the ftrace handler before
> > > rcu_read_lock. synchronize_rcu waits for the grace period to pass, but that
> > > does not mean anything for our process in the handler, because it is not
> > > in rcu critical section. There is no guarantee that after synchronize_rcu
> > > the process would be away from the handler.
> > >
> > > "Meanwhile" klp_try_complete_transition continues and calls
> > > klp_complete_transition. This clears func->transition flags. Now the
> > > process in the handler could be scheduled again. It reads the wrong value
> > > of func->transition and redirection to the wrong function is done.
> > >
> > > What do you think? I hope I made myself clear.
> >
> > You really made me think. But I don't think there's a race here.
> >
> > Consider the two separate cases, patching and unpatching:
> >
> > 1. patching has completed: klp_universe_goal and all tasks'
> > klp_universes are at KLP_UNIVERSE_NEW. In this case, the value of
> > func->transition doesn't matter, because we want to use the func at
> > the top of the stack, and if klp_universe is NEW, the ftrace handler
> > will do that, regardless of the value of func->transition. This is
> > why I didn't do the synchronize_rcu() in this case. But maybe you're
> > not worried about this case anyway, I just described it for the sake
> > of completeness :-)
>
> Yes, this case shouldn't be a problem :)
>
> > 2. unpatching has completed: klp_universe_goal and all tasks'
> > klp_universes are at KLP_UNIVERSE_OLD. In this case, the value of
> > func->transition _does_ matter. However, notice that
> > klp_unpatch_objects() is called before synchronize_rcu(). That
> > removes the "new" func from the klp_ops stack. Since the ftrace
> > handler accesses the list _after_ calling rcu_read_lock(), it will
> > never see the "new" func, and thus func->transition will never be
> > set.
>
> Hm, so indeed I messed it up. Let me rework the scenario a bit. We have a
> function foo(), which has been already patched with foo_1() from patch_1
> and foo_2() from patch_2. Now we would like to unpatch patch_2. It is
> successfully completed and klp_try_complete_transition calls
> klp_unpatch_objects and synchronize_rcu. Thus foo_2() is removed from the
> RCU list in ops.
>
> Now to the funny part. After synchronize_rcu() and before
> klp_complete_transition some process might get to the ftrace handler (it
> is still there because of the patch_1 still being present). It gets foo_1
> from the list_first_or_null_rcu, sees that func->transition is 1 (it
> hasn't been cleared yet)

Same answer as in the other email: foo_1()'s func->transition will be 0 :-)

When patching, only the new klp_func gets transition set to 1. When
unpatching, only the klp_func being removed gets transition set to 1.
In your scenario that is foo_2(), not foo_1().

> , current->klp_universe is KLP_UNIVERSE_OLD... so
> it tries to get previous function. There is none and foo() is called. This
> is incorrect.
>
> It is very similar scenario to the one in my other email earlier this day.
> I think we need to clear func->transition before calling
> klp_unpatch_objects. More or less.
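
To make the rule above concrete, here is a rough userspace sketch of how
the ftrace handler picks a function, as described in this thread. It is
not the patch's code: struct klp_func is cut down to three fields, the
"prev" pointer stands in for walking the RCU list in klp_ops, and the
klp_model_*() names are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum klp_universe { KLP_UNIVERSE_OLD, KLP_UNIVERSE_NEW };

struct klp_func {
	const char *name;	/* e.g. "foo_1" */
	bool transition;	/* set only on the func being added or removed */
	struct klp_func *prev;	/* next-older func on the stack, NULL if none */
};

/*
 * Model of the handler's choice: start from the top of the stack. If that
 * func is in transition and the task is still in the old universe, fall
 * back to the previous func. A NULL result means "run the original
 * function" (foo() in the scenario above).
 */
const struct klp_func *klp_model_select(const struct klp_func *top,
					enum klp_universe task_universe)
{
	if (top && top->transition && task_universe == KLP_UNIVERSE_OLD)
		return top->prev;
	return top;
}

/*
 * Scenario from this thread: patch_2 has just been unpatched, so foo_2()
 * is already off the stack and foo_1() is the top. Only foo_2() ever had
 * transition set, so foo_1() is selected for a task in the old universe.
 */
void klp_model_scenario(void)
{
	struct klp_func foo_1 = { "foo_1", false, NULL };

	assert(klp_model_select(&foo_1, KLP_UNIVERSE_OLD) == &foo_1);
}
```

If foo_1.transition were 1 here, the model would return NULL and the
original foo() would run, which is the failure you describe. With the
rule above, that flag is never set on foo_1() during this unpatch.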
--
Josh