Live-Patching Archive on lore.kernel.org
 help / color / Atom feed
From: Miroslav Benes <mbenes@suse.cz>
To: Vasily Gorbik <gor@linux.ibm.com>
Cc: heiko.carstens@de.ibm.com, borntraeger@de.ibm.com,
	jpoimboe@redhat.com, joe.lawrence@redhat.com,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	jikos@kernel.org, pmladek@suse.com, nstange@suse.de,
	live-patching@vger.kernel.org, lpechacek@suse.cz
Subject: Re: [PATCH v3 4/4] s390/livepatch: Implement reliable stack tracing for the consistency model
Date: Fri, 29 Nov 2019 19:16:38 +0100 (CET)
Message-ID: <alpine.LSU.2.21.1911291522140.23789@pobox.suse.cz> (raw)
In-Reply-To: <cover.thread-a0061f.your-ad-here.call-01575012971-ext-9115@work.hours>

On Fri, 29 Nov 2019, Vasily Gorbik wrote:

> On Wed, Nov 06, 2019 at 10:56:01AM +0100, Miroslav Benes wrote:
> > The livepatch consistency model requires reliable stack tracing
> > architecture support in order to work properly. In order to achieve
> > this, two main issues have to be solved. First, reliable and consistent
> > call chain backtracing has to be ensured. Second, the unwinder needs to
> > be able to detect stack corruptions and return errors.
> 
> I tried to address that in a patch series I pushed here:
> https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/log/?h=livepatch
> 
> It partially includes your changes from this patch which have been split
> in 2 patches:
> s390/unwind: add stack pointer alignment sanity checks
> s390/livepatch: Implement reliable stack tracing for the consistency model
> 
> The primary goal was to make our backchain unwinder reliable itself. And
> suitable for livepatching as is (we extra checks at the caller, see
> below). Besides unwinder changes few things have been improved to avoid
> special handling during unwinding.
> - all tasks now have pt_regs at the bottom of task stack.
> - final backchain always contains 0, no further references to no_dat stack.
> - successful unwinding means reaching pt_regs at the bottom of task stack,
> and unwinder guarantees that unwind_error() reflects that.
> - final pt_regs at the bottom of task stack is never included in unwinding
> results. It never was for user tasks. And kernel tasks cannot return to
> that state anyway (and in some cases pt_regs are empty).
> - unwinder starts unwinding from a reliable state (not "sp" passed as
> an argument).

Great. Thanks for doing that. It all looks good to me and the outcome is 
definitely better. I obviously still had some misconceptions about the 
code and it is much clearer now.

> There is also s390 specific unwinder testing module.
>
> > Idle tasks are a bit special. Their final back chains point to no_dat
> > stacks. See for reference CALL_ON_STACK() in smp_start_secondary()
> > callback used in __cpu_up(). The unwinding is stopped there and it is
> > not considered to be a stack corruption.
> I changed that with commit:
> s390: avoid misusing CALL_ON_STACK for task stack setup
> Now idle tasks are not special, final back chain contains 0.
> 
> > ---
> >  arch/s390/Kconfig             |  1 +
> >  arch/s390/kernel/dumpstack.c  | 11 +++++++
> >  arch/s390/kernel/stacktrace.c | 46 ++++++++++++++++++++++++++
> >  arch/s390/kernel/unwind_bc.c  | 61 +++++++++++++++++++++++++++--------
> >  4 files changed, 106 insertions(+), 13 deletions(-)
> > 
> > --- a/arch/s390/kernel/dumpstack.c
> > +++ b/arch/s390/kernel/dumpstack.c
> > @@ -94,12 +94,23 @@ int get_stack_info(unsigned long sp, struct task_struct *task,
> >  	if (!sp)
> >  		goto unknown;
> >  
> > +	/* Sanity check: ABI requires SP to be aligned 8 bytes. */
> > +	if (sp & 0x7)
> > +		goto unknown;
> > +
> This has been split in a separate commit:
> s390/unwind: add stack pointer alignment sanity checks
> 
> > +	/*
> > +	 * The reliable unwinding should not start on nodat_stack, async_stack
> > +	 * or restart_stack. The task is either current or must be inactive.
> > +	 */
> > +	if (unwind_reliable)
> > +		goto unknown;
> I moved this check in arch_stack_walk_reliable itself. See below.
> 
> >  static bool unwind_use_regs(struct unwind_state *state)
> > @@ -85,10 +94,13 @@ static bool unwind_use_frame(struct unwind_state *state, unsigned long sp,
> >  	struct stack_frame *sf;
> >  	unsigned long ip;
> >  
> > -	if (unlikely(outside_of_stack(state, sp))) {
> > -		if (!update_stack_info(state, sp))
> > -			goto out_err;
> > -	}
> > +	/*
> > +	 * No need to update stack info when unwind_reliable is true. We should
> > +	 * be on a task stack and everything else is an error.
> > +	 */
> > +	if (unlikely(outside_of_stack(state, sp)) &&
> > +	    (unwind_reliable || !update_stack_info(state, sp)))
> > +		goto out_err;
> I moved this check in arch_stack_walk_reliable itself. See below.
> 
> > +	/* Unwind reliable */
> > +	if ((unsigned long)regs != info->end - sizeof(struct pt_regs))
> > +		goto out_err;
> I moved this check in arch_stack_walk_reliable itself. See below.
> 
> 
> > @@ -136,8 +162,17 @@ bool unwind_next_frame(struct unwind_state *state, bool unwind_reliable)
> >  	sf = (struct stack_frame *) state->sp;
> >  	sp = READ_ONCE_NOCHECK(sf->back_chain);
> >  
> > -	/* Non-zero back-chain points to the previous frame */
> > -	if (likely(sp))
> > +	/*
> > +	 * Non-zero back-chain points to the previous frame
> > +	 *
> > +	 * unwind_reliable case: Idle tasks are special. The final
> > +	 * back-chain points to nodat_stack.  See CALL_ON_STACK() in
> > +	 * smp_start_secondary() callback used in __cpu_up(). We just
> > +	 * accept it and look for pt_regs.
> > +	 */
> > +	if (likely(sp) &&
> > +	    (!unwind_reliable || !(is_idle_task(state->task) &&
> > +				   outside_of_stack(state, sp))))
> >  		return unwind_use_frame(state, sp, unwind_reliable);
> This is not needed anymore.
> 
> In the end this all boils down to arch_stack_walk_reliable
> implementation. I made the following changes compaired to your version:
> ---
> - use plain unwind_for_each_frame
> - "state.stack_info.type != STACK_TYPE_TASK" guarantees that we are
>   not leaving task stack.
> - "if (state.regs)" guarantees that we have not met an program interrupt
>   pt_regs (page faults) or preempted. Corresponds to yours:
> > +	if ((unsigned long)regs != info->end - sizeof(struct pt_regs))
> > +		goto out_err;

Agreed. All this should be equivalent to the changes I made.

> - I don't see a point in storing "kernel_thread_starter", am I missing
>   something?

No, you don't. It was just for the sake of completeness and it is not 
worth it.

[...]
 
> I'd appreciate if you could give those changes a spin. And check if
> something is missing or broken. Or share your livepatching testing
> procedure. If everything goes well, I might include these changes in
> second pull request for this 5.5 merge window.
> 
> I did successfully run ./testing/selftests/livepatch/test-livepatch.sh

'make run_tests' in tools/testing/selftests/livepatch/ would call all the 
tests in the directory. Especially test-callbacks.sh which stresses the 
livepatching even more.
 
> https://github.com/lpechacek/qa_test_klp seems outdated. I was able to
> fix and run some tests of it but haven't had time to figure out all of
> them. Is there a version that would run on top of current Linus tree?

Ah, sorry. I should have mentioned that. The code became outdated with 
recent upstream changes. Libor is working on the updates (CCed).

Anyway, I ran it here and it all works fine.

Tested-by: Miroslav Benes <mbenes@suse.cz>

Thanks a lot
Miroslav

      parent reply index

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-06  9:55 [PATCH v3 0/4] " Miroslav Benes
2019-11-06  9:55 ` [PATCH v3 1/4] s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr() Miroslav Benes
2019-11-28 16:51   ` Vasily Gorbik
2019-11-06  9:55 ` [PATCH v3 2/4] s390/unwind: split unwind_next_frame() to several functions Miroslav Benes
2019-11-06  9:56 ` [PATCH v3 3/4] s390/unwind: prepare the unwinding interface for reliable stack traces Miroslav Benes
2019-11-06  9:56 ` [PATCH v3 4/4] s390/livepatch: Implement reliable stack tracing for the consistency model Miroslav Benes
2019-11-29  7:41   ` Vasily Gorbik
2019-11-29  7:41     ` [PATCH v4 1/2] s390/unwind: add stack pointer alignment sanity checks Vasily Gorbik
2019-11-29 18:16       ` Miroslav Benes
2019-11-29  7:41     ` [PATCH v4 2/2] s390/livepatch: Implement reliable stack tracing for the consistency model Vasily Gorbik
2019-11-29 18:16       ` Miroslav Benes
2019-11-29 18:16     ` Miroslav Benes [this message]

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.21.1911291522140.23789@pobox.suse.cz \
    --to=mbenes@suse.cz \
    --cc=borntraeger@de.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=jikos@kernel.org \
    --cc=joe.lawrence@redhat.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=lpechacek@suse.cz \
    --cc=nstange@suse.de \
    --cc=pmladek@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Live-Patching Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/live-patching/0 live-patching/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 live-patching live-patching/ https://lore.kernel.org/live-patching \
		live-patching@vger.kernel.org
	public-inbox-index live-patching

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.live-patching


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git