Re: Re: Re: [PATCH 0/2] Kernel Live Patching

From: Vojtech Pavlik <vojtech@suse.cz>
To: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Seth Jennings <sjenning@redhat.com>,
	Jiri Kosina <jkosina@suse.cz>,
	Steven Rostedt <rostedt@goodmis.org>,
	live-patching@vger.kernel.org, kpatch@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: Re: Re: [PATCH 0/2] Kernel Live Patching
Date: Wed, 12 Nov 2014 22:47:19 +0100	[thread overview]
Message-ID: <20141112214719.GA26809@suse.cz> (raw)
In-Reply-To: <546399E4.6050605@hitachi.com>

On Thu, Nov 13, 2014 at 02:33:24AM +0900, Masami Hiramatsu wrote:
> Right. Consistency model is still same as kpatch. Btw, I think
> we can just use the difference of consistency for classifying
> the patches, since we have these classes, only limited combination
> is meaningful.
> 
> >> 	LEAVE_FUNCTION
> >> 	LEAVE_PATCHED_SET
> >> 	LEAVE_KERNEL
> >>
> >> 	SWITCH_FUNCTION
> >> 	SWITCH_THREAD
> >> 	SWITCH_KERNEL
> 
> How about the below combination of consistent flags?
> 
> <flags>
> CONSISTENT_IN_THREAD - patching is consistent in a thread.
> CONSISTENT_IN_TIME - patching is atomically done.
> 
> <combination>
> (none) - the 'null' mode? same as LEAVE_FUNCTION & SWITCH_FUNCTION
> 
> CONSISTENT_IN_THREAD - kGraft mode. same as LEAVE_KERNEL & SWITCH_THREAD
> 
> CONSISTENT_IN_TIME - kpatch mode. same as LEAVE_PATCHED_SET & SWITCH_KERNEL
> 
> CONSISTENT_IN_THREAD|CONSISTENT_IN_TIME - CRIU mode. same as LEAVE_KERNEL & SWITCH_KERNEL

The reason I tried to parametrize the consistency model in a more
flexible and fine-grained manner than just describing the existing
solutions was for the purpose of exploring whether any of the remaining
combinations make sense.

It allowed me to look at what value we're getting from the consistency
models: Most importantly the ability to change function prototypes and
still make calls work.

For this, the minimum requirements are LEAVE_PATCHED_SET (what
kpatch does) and SWITCH_THREAD (which is what kGraft does). 

Both kpatch and kGraft do more, but:

I was able to show that LEAVE_KERNEL is unnecessary and any cases where
it is beneficial can be augmented by just increasing the patched set.

I believe at this point that SWITCH_KERNEL is unnecessary and that data or
locking changes - the major benefit of switching at once can be done by
shadowing/versioning of data structures, which is what both kpatch and
kGraft had planned to do anyway.

I haven't shown yet whether the strongest consistency (LEAVE_KERNEL +
SWITCH_KERNEL) is possible at all. CRIU is close, but not necessarily
doing quite that. It might be possible to just force processes to sleep
at syscall entry one by one until all are asleep. Also the benefits of
doing that are still unclear.

The goal is to find a consistency model that is best suited for the
goals of both kpatch and kGraft: Reliably apply simple to
mid-complexity kernel patches.

> So, each patch requires consistency constrains flag and livepatch tool
> chooses the mode based on the flag.
> 
> >> So, I think the patch may be classified by following four types
> >>
> >> PATCH_FUNCTION - Patching per function. This ignores context, just
> >>                change the function.
> >>                User must ensure that the new function can co-exist
> >>                with old functions on the same context (e.g. recursive
> >>                call can cause inconsistency).
> >>
> >> PATCH_THREAD - Patching per thread. If a thread leave the kernel,
> >>                changes are applied for that thread.
> >>                User must ensure that the new functions can co-exist
> >>                with old functions per-thread. Inter-thread shared
> >>                data acquisition(locks) should not be involved.
> >>
> >> PATCH_KERNEL - Patching all threads. This wait for all threads leave the
> >>                all target functions.
> >>                User must ensure that the new functions can co-exist
> >>                with old functions on a thread (note that if there is a
> >>                loop, old one can be called first n times, and new one
> >>                can be called afterwords).(**)
> > 
> > Yes, but only when the function calling it is not included in the
> > patched set, which is only a problem for semantic changes accompanied by
> > no change in the function prototyppe. This can be avoided by changing
> > the prototype deliberately.
> 
> Hmm, but what would you think about following simple case?
> 
> ----
> int func(int a) {
>   return a + 1;
> }
> 
> ...
>   b = 0;
>   for (i = 0; i < 10; i++)
>     b = func(b);
> ...
> ----
> ----
> int func(int a) {
>   return a + 2; /* Changed */
> }
> 
> ...
>   b = 0;
>   for (i = 0; i < 10; i++)
>     b = func(b);
> ...
> ----
> 
> So, after the patch, "b" will be in a range of 10 to 20, not 10 or 20.
> Of course CONSISTENT_IN_THREAD can ensure it should be 10 or 20 :)

If you force a prototype change, eg by changing func() to an unsigned
int, or simply add a parameter, the place where it is called from will
also be changed and will be included in the patched set. (Or you can
just include it manually in the set.)

Then, you can be sure that the place which calls func() is not on the
stack when patching. This way, in your classification, PATCH_KERNEL can
be as good as PATCH_THREAD. In my classification, I'm saying that
LEAVE_PATCHED_SET is as good as LEAVE_KERNEL.

> >> (*) Instead of checking stacks, at first, wait for all threads leaving
> >> the kernel once, after that, wait for refcount becomes zero and switch
> >> all the patched functions.
> > 
> > This is a very beautiful idea.
> > 
> > It does away with both the stack parsing and the kernel stopping,
> > achieving kGraft's goals, while preserving kpatch's consistency model.
> > 
> > Sadly, it combines the disadvantages of both kpatch and kGraft: From
> > kpatch it takes the inability to patch functions where threads are
> > sleeping often and as such never leave them at once. From kGraft it
> > takes the need to annotate kernel threads and wake sleepers from
> > userspace.
> 
> But how frequently the former case happens? It seems very very rare.
> And if we aim to enable both kpatch mode and kGraft mode in the kernel,
> anyway we'll have something for the latter cases.

The kpatch problem case isn't that rare. It just happened with a CVE in
futexes recently. It will happen if you try to patch anything that is on
the stack when a TTY or TCP read is waiting for data as another example. 

The kGraft problem case will happen when you load a 3rd party module
with a non-annotated kernel thread. Or a different problem will happen
when you have an application sleeping that will exit when receiving any
signal.

Both the cases can be handled with tricks and workarounds. But it'd be
much nicer to have a patching engine that is reliable.

> > So while it is beautiful, it's less practical than either kpatch or
> > kGraft alone. 
> 
> Ah, sorry for confusing, I don't tend to integrate kpatch and kGraft.
> Actually, it is just about modifying kpatch, since it may shorten
> stack-checking time.
> This means that does not change the consistency model.
> We certainly need both of kGraft mode and kpatch mode.

What I'm proposing is a LEAVE_PATCHED_SET + SWITCH_THREAD mode. It's
less consistency, but it is enough. And it is more reliable (likely to
succeed in finite time) than either kpatch or kGraft.

It'd be mostly based on your refcounting code, including stack
checking (when a process sleeps, counter gets set based on number of
patched functions on the stack), possibly including setting the counter
to 0 on syscall entry/exit, but it'd make the switch per-thread like
kGraft does, not for the whole system, when the respective counters
reach zero.

This handles the frequent sleeper case, it doesn't need annotated kernel
thread main loops, it will not need the user to wake up every process in
the system unless it sleeps in a patched function.

And it can handle all the patches that kpatch and kGraft can (it needs
shadowing for some).

> > Yes, this is what I call 'extending the patched set'. You can do that
> > either by deliberately changing the prototype of the patched function
> > being called, which causes the calling function to be considered
> > different, or just add it to the set of functions considered manually.
> 
> I'd prefer latter one :) or just gives hints of watching targets.

Me too.

-- 
Vojtech Pavlik
Director SUSE Labs