Re: [RFC kgr on klp 0/9] kGraft on the top of KLP

From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Jiri Kosina <jkosina@suse.cz>
Cc: Jiri Slaby <jslaby@suse.cz>,
	live-patching@vger.kernel.org, sjenning@redhat.com,
	vojtech@suse.cz, mingo@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC kgr on klp 0/9] kGraft on the top of KLP
Date: Tue, 5 May 2015 11:24:44 -0500
Message-ID: <20150505162444.GA11582@treble.redhat.com>
In-Reply-To: <alpine.LNX.2.00.1505050756570.17961@pobox.suse.cz>

On Tue, May 05, 2015 at 08:14:50AM +0200, Jiri Kosina wrote:
> On Mon, 4 May 2015, Josh Poimboeuf wrote:
> > > - the "immediate" one, where the code redirection flip is switched 
> > >   unconditionally and immediately (i.e. exactly what we currently have in 
> > >   Linus' tree); semantically applicable to many patches, but not all of 
> > >   them
> > > 
> > > - something that fills the "but not all of them" gap above.
> > 
> > What's the benefit of having the "immediate" model in addition to
> > the more comprehensive model?
> 
> Fair enough, I agree that in case of the hybrid approach you're proposing 
> the immediate model is not necessary.

Just to make sure I understand, would the immediate model be needed in
order to cover some of the gaps caused by not being able to patch
kthreads?

> > > - the kGraft method is not (yet) able to patch kernel threads, and allows 
> > >   for multiple instances of the patched functions to be running in 
> > >   parallel (i.e. patch author needs to be aware of this constraint, and 
> > >   write the code accordingly)
> > 
> > Not being able to patch kthreads sounds like a huge drawback, if not a
> > deal breaker.  
> 
> It depends on bringing some sanity to freezing / parking / signal handling 
> for kthreads, which is an independent work in progress in parallel.
> 
> > How does the patching state ever reach completion?
> 
> kthread context always calls the old code and it doesn't block the 
> finalization; that's basically a documented feature for now.
> 
> That surely is a limitation and something the patch author has to be aware 
> of, but I wouldn't really consider it a show stopper for now, for the 
> reason pointed out above; it'll eventually be made to work, it's not a 
> substantial issue.

Until the kthread issues are sorted out, I would call it a _very_
substantial issue.  Not being able to patch kthreads is a huge
limitation.  Also it would in many cases block the ability to properly
change data semantics, which is one of the big reasons for a consistency
model.

> > I would say it's orders of magnitude more disruptive and much riskier 
> > compared to walking the stacks (again, assuming we can make stack 
> > walking "safe").
> 
> Agreed ...  under the condition that it can be made really 100% reliable 
> *and* we'd be reasonably sure that we will be able to realistically 
> achieve the same goal on other architectures as well. Have you even 
> started exploring that space, please?

Yes.  As I postulated before [1], there are two obstacles to achieving
reliable frame pointer stack traces: 1) missing frame pointer logic and
2) exceptions.  If either 1 or 2 was involved in the creation of any of
the frames on the stack, some frame pointers might be missing, and one
or more frames could be skipped by the stack walker.

The first obstacle can be overcome and enforced at compile time using
stackvalidate [1].

The second obstacle can be overcome at run time with a future RFC:
something like a save_stack_trace_tsk_validate() function which does
some validations while it walks the stack.  It can return an error if it
detects an exception frame.

  (It can also do some sanity checks like ensuring that it walks all the
  way to the bottom of the stack and that each frame has a valid ktext
  address.  I also would propose a CONFIG_DEBUG_VALIDATE_STACK option
  which tries to validate the stack on every call to schedule.)
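
To make the checks above concrete, here is a minimal user-space sketch
of a validating walk.  It is not kernel code and not the proposed
save_stack_trace_tsk_validate() itself: the stack is simulated as an
array, and every name in it (walk_stack_validate, struct sim_frame,
struct sim_text, the WALK_* codes) is hypothetical.  It only shows the
three checks mentioned: reject exception frames, require each return
address to be a valid text address, and require the walk to reach the
bottom of the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum walk_result {
	WALK_OK = 0,
	WALK_EXCEPTION_FRAME,	/* a frame was created by an exception */
	WALK_BAD_TEXT_ADDR,	/* return address is outside the text range */
	WALK_NOT_AT_BOTTOM,	/* chain broke before the stack bottom */
};

/* Simulated frame: stands in for what a real walker reads off the stack. */
struct sim_frame {
	unsigned long ret_addr;	/* return address saved in this frame */
	int next;		/* index of the caller's frame, -1 at the bottom */
	bool exception;		/* true if the frame came from an exception */
};

/* Simulated text range: [start, end). */
struct sim_text {
	unsigned long start;
	unsigned long end;
};

static bool is_ktext(const struct sim_text *text, unsigned long addr)
{
	return addr >= text->start && addr < text->end;
}

/*
 * Walk the simulated frame chain starting at index 'top'.  On return,
 * *nr_walked holds the number of frames that passed validation.
 */
static enum walk_result walk_stack_validate(const struct sim_frame *frames,
					    size_t nr_frames, int top,
					    const struct sim_text *text,
					    size_t *nr_walked)
{
	size_t walked = 0;
	int cur = top;

	assert(frames && text && nr_walked);

	while (cur != -1) {
		const struct sim_frame *f;

		*nr_walked = walked;

		/* A link that leaves the stack means we never hit bottom. */
		if (cur < 0 || (size_t)cur >= nr_frames)
			return WALK_NOT_AT_BOTTOM;

		f = &frames[cur];
		if (f->exception)
			return WALK_EXCEPTION_FRAME;
		if (!is_ktext(text, f->ret_addr))
			return WALK_BAD_TEXT_ADDR;

		/* The caller's frame must sit deeper in the stack. */
		if (f->next != -1 && f->next <= cur)
			return WALK_NOT_AT_BOTTOM;

		walked++;
		cur = f->next;
	}

	*nr_walked = walked;
	return WALK_OK;
}
```

Any result other than WALK_OK would mean the trace cannot be trusted
for patching decisions, even if most of the frames looked fine.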

Then we can have the hybrid consistency model rely on
save_stack_trace_tsk_validate().  If the stack is deemed unsafe, we can
fall back to retrying later, or to the kGraft mode of user mode barrier
patching.
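
As a rough sketch of that fallback logic, the per-task decision could
look something like the following.  This is illustrative only: the
enum names, hybrid_decide(), and the idea of a fixed retry budget
before falling back to the user mode barrier are all assumptions made
for the example, not part of any posted patch.

```c
#include <assert.h>

/* Hypothetical outcome of a validated stack check for one task. */
enum stack_check {
	STACK_RELIABLE_CLEAR,	/* trace trusted, no patched function on it */
	STACK_RELIABLE_BUSY,	/* trace trusted, a patched function is on it */
	STACK_UNRELIABLE,	/* validation failed, trace cannot be trusted */
};

/* Hypothetical action the patching code takes for that task. */
enum patch_action {
	SWITCH_NOW,		/* safe to move the task to the new code */
	RETRY_LATER,		/* check the stack again on a later pass */
	WAIT_FOR_USER_BARRIER,	/* kGraft-style: switch at the user mode barrier */
};

/*
 * 'attempts' is the number of checks already made for this task before
 * the current one.  'max_attempts' is an assumed retry budget.
 */
static enum patch_action hybrid_decide(enum stack_check check,
				       unsigned int attempts,
				       unsigned int max_attempts)
{
	assert(max_attempts > 0);

	if (check == STACK_RELIABLE_CLEAR)
		return SWITCH_NOW;

	/* Busy or unreliable: retry while the budget allows it. */
	if (attempts + 1 < max_attempts)
		return RETRY_LATER;

	/* Budget exhausted: fall back to the user mode barrier. */
	return WAIT_FOR_USER_BARRIER;
}
```

The point is only that an unsafe or busy stack never forces a switch;
it either delays the decision or hands the task to the slower barrier
path.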

Eventually I want to try to make *all* stacks reliable, even those with
exception frames.  That would involve compile and run time validations
of DWARF data, and ensuring that DWARF and frame pointers are consistent
with each other.  But those are general improvements which aren't
prerequisites for the hybrid model.

[1] http://lkml.kernel.org/r/cover.1430770553.git.jpoimboe@redhat.com

-- 
Josh