live-patching.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Question: livepatch failed for new fork() task stack unreliable
@ 2020-05-29 10:10 Wang ShaoBo
  2020-05-29 17:44 ` Josh Poimboeuf
  0 siblings, 1 reply; 13+ messages in thread
From: Wang ShaoBo @ 2020-05-29 10:10 UTC (permalink / raw)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, bobo.shaobowang, mingo,
	x86, linux-kernel, live-patching, mbenes, jpoimboe, devel, viro,
	esyr

Stack unreliable error is reported by stack_trace_save_tsk_reliable() when trying
to insmod a hot patch for module modification, this results in frequent failures
sometimes. We found this 'unreliable' stack is from task just fork.

The task just fork need to go through these steps will the problem not appear:

_do_fork
    -=> copy_process
    ...
    -=> ret_from_fork
            -=> UNWIND_HINT_REGS

Call trace as follow when stack_trace_save_tsk_reliable() return failure:
    [ 896.214710] livepatch: klp_check_stack: monitor-process:41642 has an unreliable stack
    [ 896.214735] livepatch: Call Trace:    # print trace entries by myself
    [ 896.214760] Call Trace:               # call show_stack()
    [ 896.214763] ? __switch_to_asm+0x70/0x70

Only for user mode task, there are two cases related for one task just created:

1) The task was not actually scheduled to excute, at this time UNWIND_HINT_EMPTY in
ret_from_fork() has not reset unwind_hint, it's sp_reg and end field remain default value
and end up throwing an error in unwind_next_frame() when called by arch_stack_walk_reliable();

2) The task has been scheduled but UNWIND_HINT_REGS not finished, at this time
arch_stack_walk_reliable() terminates it's backtracing loop for pt_regs unknown
and return -EINVAL because it's a user task.

As shown below, for user task, There exists a gap where ORC unwinder cannot
capture the stack state of task immediately, at this time the task has already been
created but ret_from_fork() has not complete it's mission.

We attempt to append a bit field orc_info_prepared in task_struct to probe when
related actions finished in ret_from_fork, we found scenario 1) 2) can be capatured.
It's a informal solution, just for testing our conjecture.

I am eager to purse an effective answer, welcome any ideas.
Another similar question: https://lkml.org/lkml/2020/3/12/590

Following is the draft modification:

1. Add a bit field orc_info_prepared int task_struct.

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4418f5cb8324..3ff1368b8877 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -791,6 +791,9 @@ struct task_struct {
	/* Stalled due to lack of memory */
	unsigned			in_memstall:1;
 #endif
+#ifdef CONFIG_UNWINDER_ORC
+	unsigned			orc_info_prepared:1;
+#endif
 
	unsigned long			atomic_flags; /* Flags requiring atomic access. */


2. if UNWIND_HINT_REGS complete, pt_regs can be known by orc unwinder,
   set orc_info_prepared = 1 in orc_info_prepared_fini().

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3063aa9090f9..637bdb091090 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -339,6 +339,7 @@ SYM_CODE_START(ret_from_fork)
 
 2:
 	UNWIND_HINT_REGS
+	call	orc_info_prepared_fini
 	movq	%rsp, %rdi
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
 
3. Simply judge orc_info_prepared if task is user mode process.

diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 6ad43fc44556..bf1d2887f00b 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -77,6 +77,10 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 			return -EINVAL;
 	}


+	if (!(task->flags & (PF_KTHREAD | PF_IDLE)) &&
+		!task_orc_info_prepared(task))
+		return 0;
+
 	/* Check for stack corruption */
 	if (unwind_error(&state))
 		return -EINVAL;


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-05-29 10:10 Question: livepatch failed for new fork() task stack unreliable Wang ShaoBo
@ 2020-05-29 17:44 ` Josh Poimboeuf
       [not found]   ` <a9ed9157-f3cf-7d2c-7a8e-56150a2a114e@huawei.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Josh Poimboeuf @ 2020-05-29 17:44 UTC (permalink / raw)
  To: Wang ShaoBo
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Fri, May 29, 2020 at 06:10:59PM +0800, Wang ShaoBo wrote:
> Stack unreliable error is reported by stack_trace_save_tsk_reliable() when trying
> to insmod a hot patch for module modification, this results in frequent failures
> sometimes. We found this 'unreliable' stack is from task just fork.

For livepatch, this shouldn't actually be a failure.  The patch will
just stay in the transition state until after the fork has completed.
Which should happen in a reasonable amount of time, right?

> 1) The task was not actually scheduled to excute, at this time UNWIND_HINT_EMPTY in
> ret_from_fork() has not reset unwind_hint, it's sp_reg and end field remain default value
> and end up throwing an error in unwind_next_frame() when called by arch_stack_walk_reliable();

Yes, this seems to be true for forked-but-not-yet-scheduled tasks.

I can look at fixing that.  I have some ORC cleanups in progress which
are related to UNWIND_HINT_EMPTY and the end of the stack.  I can add
this issue to the list of improvements.

> 2) The task has been scheduled but UNWIND_HINT_REGS not finished, at this time
> arch_stack_walk_reliable() terminates it's backtracing loop for pt_regs unknown
> and return -EINVAL because it's a user task.

Hm, do you see this problem with upstream?  It seems like it should
work.  arch_stack_walk_reliable() has this:

			/* Success path for user tasks */
			if (user_mode(regs))
				return 0;

Where exactly is the error coming from?

-- 
Josh


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
       [not found]   ` <a9ed9157-f3cf-7d2c-7a8e-56150a2a114e@huawei.com>
@ 2020-06-01 18:05     ` Josh Poimboeuf
  2020-06-02  1:22       ` Wangshaobo (bobo)
  0 siblings, 1 reply; 13+ messages in thread
From: Josh Poimboeuf @ 2020-06-01 18:05 UTC (permalink / raw)
  To: Wangshaobo (bobo)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Sat, May 30, 2020 at 10:21:19AM +0800, Wangshaobo (bobo) wrote:
> 1) when a user mode task just fork start excuting ret_from_fork() till
> schedule_tail, unwind_next_frame found
> 
> orc->sp_reg is ORC_REG_UNDEFINED but orc->end not equals zero, this time
> arch_stack_walk_reliable()
> 
> terminates it's backtracing loop for unwind_done() return true. then 'if
> (!(task->flags & (PF_KTHREAD | PF_IDLE)))'
> 
> in arch_stack_walk_reliable() true and return -EINVAL after.
> 
> * The stack trace looks like that:
> 
> ret_from_fork
> 
>       -=> UNWIND_HINT_EMPTY
> 
>       -=> schedule_tail             /* schedule out */
> 
>       ...
> 
>       -=> UNWIND_HINT_REGS      /*  UNDO */

Yes, makes sense.

> 2) when using call_usermodehelper_exec_async() to create a user mode task,
> ret_from_fork() still not exec whereas
> 
> the task has been scheduled in __schedule(), at this time, orc->sp_reg is
> ORC_REG_UNDEFINED but orc->end equals zero,
> 
> unwind_error() return true and also terminates arch_stack_walk_reliable()'s
> backtracing loop, end up return from
> 
> 'if (unwind_error())' branch.
> 
> * The stack trace looks like that:
> 
> -=> call_usermodehelper_exec
> 
>                  -=> do_exec
> 
>                            -=> search_binary_handler
> 
>                                       -=> load_elf_binary
> 
>                                                 -=> elf_map
> 
>                                                          -=> vm_mmap_pgoff
> 
> -=> down_write_killable
> 
> -=> _cond_resched
> 
>              -=> __schedule           /* scheduled to work */
> 
> -=> ret_from_fork       /* UNDO */

I don't quite follow the stacktrace, but it sounds like the issue is the
same as the first one you originally reported:

> 1) The task was not actually scheduled to excute, at this time
> UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint, it's
> sp_reg and end field remain default value and end up throwing an error
> in unwind_next_frame() when called by arch_stack_walk_reliable();

Or am I misunderstanding?

And to reiterate, these are not "livepatch failures", right?  Livepatch
doesn't fail when stack_trace_save_tsk_reliable() returns an error.  It
recovers gracefully and tries again later.

-- 
Josh


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-01 18:05     ` Josh Poimboeuf
@ 2020-06-02  1:22       ` Wangshaobo (bobo)
  2020-06-02 13:14         ` Josh Poimboeuf
  0 siblings, 1 reply; 13+ messages in thread
From: Wangshaobo (bobo) @ 2020-06-02  1:22 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr


在 2020/6/2 2:05, Josh Poimboeuf 写道:
> On Sat, May 30, 2020 at 10:21:19AM +0800, Wangshaobo (bobo) wrote:
>> 1) when a user mode task just fork start excuting ret_from_fork() till
>> schedule_tail, unwind_next_frame found
>>
>> orc->sp_reg is ORC_REG_UNDEFINED but orc->end not equals zero, this time
>> arch_stack_walk_reliable()
>>
>> terminates it's backtracing loop for unwind_done() return true. then 'if
>> (!(task->flags & (PF_KTHREAD | PF_IDLE)))'
>>
>> in arch_stack_walk_reliable() true and return -EINVAL after.
>>
>> * The stack trace looks like that:
>>
>> ret_from_fork
>>
>>        -=> UNWIND_HINT_EMPTY
>>
>>        -=> schedule_tail             /* schedule out */
>>
>>        ...
>>
>>        -=> UNWIND_HINT_REGS      /*  UNDO */
> Yes, makes sense.
>
>> 2) when using call_usermodehelper_exec_async() to create a user mode task,
>> ret_from_fork() still not exec whereas
>>
>> the task has been scheduled in __schedule(), at this time, orc->sp_reg is
>> ORC_REG_UNDEFINED but orc->end equals zero,
>>
>> unwind_error() return true and also terminates arch_stack_walk_reliable()'s
>> backtracing loop, end up return from
>>
>> 'if (unwind_error())' branch.
>>
>> * The stack trace looks like that:
>>
>> -=> call_usermodehelper_exec
>>
>>                   -=> do_exec
>>
>>                             -=> search_binary_handler
>>
>>                                        -=> load_elf_binary
>>
>>                                                  -=> elf_map
>>
>>                                                           -=> vm_mmap_pgoff
>>
>> -=> down_write_killable
>>
>> -=> _cond_resched
>>
>>               -=> __schedule           /* scheduled to work */
>>
>> -=> ret_from_fork       /* UNDO */
> I don't quite follow the stacktrace, but it sounds like the issue is the
> same as the first one you originally reported:

yes, true, same as the first one,  the only difference what i want to 
say is the task has been scheduled but the first one is not.

>> 1) The task was not actually scheduled to excute, at this time
>> UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint, it's
>> sp_reg and end field remain default value and end up throwing an error
>> in unwind_next_frame() when called by arch_stack_walk_reliable();
> Or am I misunderstanding?
>
> And to reiterate, these are not "livepatch failures", right?  Livepatch
> doesn't fail when stack_trace_save_tsk_reliable() returns an error.  It
> recovers gracefully and tries again later.

yes, you are right,  "livepatch failures" only indicates serveral retry 
failures, we found if frequent fork() happend in current

system, it is easier to cause retry but still always end up success.

so i think this question is related to ORC unwinder, could i ask if you 
have strategy or plan to avoid this problem ?

thanks,

Wang ShaoBo



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-02  1:22       ` Wangshaobo (bobo)
@ 2020-06-02 13:14         ` Josh Poimboeuf
  2020-06-03 14:06           ` Wangshaobo (bobo)
  0 siblings, 1 reply; 13+ messages in thread
From: Josh Poimboeuf @ 2020-06-02 13:14 UTC (permalink / raw)
  To: Wangshaobo (bobo)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Tue, Jun 02, 2020 at 09:22:30AM +0800, Wangshaobo (bobo) wrote:
> so i think this question is related to ORC unwinder, could i ask if you have
> strategy or plan to avoid this problem ?

I suspect something like this would fix it (untested):

diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 6ad43fc44556..8cf95ded1410 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -50,7 +50,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 		if (regs) {
 			/* Success path for user tasks */
 			if (user_mode(regs))
-				return 0;
+				break;
 
 			/*
 			 * Kernel mode registers on the stack indicate an
@@ -81,10 +81,6 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 	if (unwind_error(&state))
 		return -EINVAL;
 
-	/* Success path for non-user tasks, i.e. kthreads and idle tasks */
-	if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
-		return -EINVAL;
-
 	return 0;
 }
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 7f969b2d240f..d7396431261a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		state->sp = sp;
 		state->regs = NULL;
 		state->prev_regs = NULL;
-		state->signal = false;
+		state->signal = ((void *)state->ip == ret_from_fork);
 		break;
 
 	case ORC_TYPE_REGS:


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-02 13:14         ` Josh Poimboeuf
@ 2020-06-03 14:06           ` Wangshaobo (bobo)
  2020-06-03 15:33             ` Josh Poimboeuf
  0 siblings, 1 reply; 13+ messages in thread
From: Wangshaobo (bobo) @ 2020-06-03 14:06 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr


在 2020/6/2 21:14, Josh Poimboeuf 写道:
> On Tue, Jun 02, 2020 at 09:22:30AM +0800, Wangshaobo (bobo) wrote:
>> so i think this question is related to ORC unwinder, could i ask if you have
>> strategy or plan to avoid this problem ?
> I suspect something like this would fix it (untested):
>
> diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> index 6ad43fc44556..8cf95ded1410 100644
> --- a/arch/x86/kernel/stacktrace.c
> +++ b/arch/x86/kernel/stacktrace.c
> @@ -50,7 +50,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
>   		if (regs) {
>   			/* Success path for user tasks */
>   			if (user_mode(regs))
> -				return 0;
> +				break;
>   
>   			/*
>   			 * Kernel mode registers on the stack indicate an
> @@ -81,10 +81,6 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
>   	if (unwind_error(&state))
>   		return -EINVAL;
>   
> -	/* Success path for non-user tasks, i.e. kthreads and idle tasks */
> -	if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
> -		return -EINVAL;
> -
>   	return 0;
>   }
>   
> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> index 7f969b2d240f..d7396431261a 100644
> --- a/arch/x86/kernel/unwind_orc.c
> +++ b/arch/x86/kernel/unwind_orc.c
> @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
>   		state->sp = sp;
>   		state->regs = NULL;
>   		state->prev_regs = NULL;
> -		state->signal = false;
> +		state->signal = ((void *)state->ip == ret_from_fork);
>   		break;
>   
>   	case ORC_TYPE_REGS:

what a awesome job, thanks a lot, Josh

Today I test your fix, but arch_stack_walk_reliable() still return 
failed sometimes, I

found one of three scenarios mentioned failed:


1. user task (just fork) but not been scheduled

     test FAILED

     it is because unwind_next_frame() get the first frame, this time 
state->signal is false, and then return

     failed in the same place for ret_from_fork has not executed at all.


2. user task (just fork) start excuting ret_from_fork() till 
schedule_tail but not UNWIND_HINT_REGS

     test condition :loop fork() in current  system

     result : SUCCESS,

     it looks like this modification works for my perspective :

	-	/* Success path for non-user tasks, i.e. kthreads and idle tasks */
	-	if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
	-		return -EINVAL;
   but is this possible to miss one invalid judgement condition ? (1)

3. call_usermodehelper_exec_async

     test condition :loop call call_usermodehelper() in a module selfmade.

     result : SUCCESS,

    it looks state->signal==true works when unwind_next_frame() gets the 
end ret_from_fork() frame,

    but i don't know how does it work, i am confused by this sentences, 
how does the comment means sibling calls and

     calls to noreturn functions? (2)

             /*
              * Find the orc_entry associated with the text address.
              *
              * Decrement call return addresses by one so they work for 
sibling
              * calls and calls to noreturn functions.
              */
             orc = orc_find(state->signal ? state->ip : state->ip - 1);
             if (!orc) {


So i slightly modify your code, i move  state->signal = ((void 
*)state->ip == ret_from_fork) to unwind_start()

and render unwind_next_frame() remain the same as before:

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index e9cc182aa97e..ecce5051e8fd 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, 
struct task_struct *task,
                 state->sp = task->thread.sp;
                 state->bp = READ_ONCE_NOCHECK(frame->bp);
                 state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
+              state->signal = ((void *)state->ip == ret_from_fork);
         }

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 7f969b2d240f..d7396431261a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
  		state->sp = sp;
  		state->regs = NULL;
  		state->prev_regs = NULL;
-		state->signal = ((void *)state->ip == ret_from_fork);
+		state->signal = false;
  		break;


After modification all the three scenarios are captured and no longer 
return failed,  but i don't know

how does it affect the scenarios 3, because current frame->ret_addr(the 
first frame) is not ret_from_fork,

it should return failed as scenarios1, but it didn't , i really want to 
know the reason. (3)


thanks again

Wang ShaoBo



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-03 14:06           ` Wangshaobo (bobo)
@ 2020-06-03 15:33             ` Josh Poimboeuf
  2020-06-04  1:24               ` Wangshaobo (bobo)
  0 siblings, 1 reply; 13+ messages in thread
From: Josh Poimboeuf @ 2020-06-03 15:33 UTC (permalink / raw)
  To: Wangshaobo (bobo)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Wed, Jun 03, 2020 at 10:06:07PM +0800, Wangshaobo (bobo) wrote:
> Today I test your fix, but arch_stack_walk_reliable() still return failed
> sometimes, I
> 
> found one of three scenarios mentioned failed:
> 
> 
> 1. user task (just fork) but not been scheduled
> 
>     test FAILED
> 
>     it is because unwind_next_frame() get the first frame, this time
> state->signal is false, and then return
> 
>     failed in the same place for ret_from_fork has not executed at all.

Oops - I meant to do it in __unwind_start (as you discovered).

> 2. user task (just fork) start excuting ret_from_fork() till schedule_tail
> but not UNWIND_HINT_REGS
> 
>     test condition :loop fork() in current  system
> 
>     result : SUCCESS,
> 
>     it looks like this modification works for my perspective :
> 
> 	-	/* Success path for non-user tasks, i.e. kthreads and idle tasks */
> 	-	if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
> 	-		return -EINVAL;
>   but is this possible to miss one invalid judgement condition ? (1)

I'm not sure I understand your question, but I think this change
shouldn't break anything else because the ORC unwinder is careful to
always set an error if it doesn't reach the end of the stack, especially
with my recent ORC fixes which went into 5.7.

> 3. call_usermodehelper_exec_async
> 
>     test condition :loop call call_usermodehelper() in a module selfmade.
> 
>     result : SUCCESS,
> 
>    it looks state->signal==true works when unwind_next_frame() gets the end
> ret_from_fork() frame,
> 
>    but i don't know how does it work, i am confused by this sentences, how
> does the comment means sibling calls and
> 
>     calls to noreturn functions? (2)
> 
>             /*
>              * Find the orc_entry associated with the text address.
>              *
>              * Decrement call return addresses by one so they work for sibling
>              * calls and calls to noreturn functions.
>              */
>             orc = orc_find(state->signal ? state->ip : state->ip - 1);
>             if (!orc) {

To be honest, I don't remember what I meant by sibling calls.  They
don't even leave anything on the stack.

For noreturns, the code might be laid out like this:

func1:
	...
	call noreturn_foo
func2:

func2 is immediately after the call to noreturn_foo.  So the return
address on the stack will actually be 'func2'.  We want to retrieve the
ORC data for the call instruction (inside func1), instead of the
instruction at the beginning of func2.

I should probably update that comment.

-- 
Josh


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-03 15:33             ` Josh Poimboeuf
@ 2020-06-04  1:24               ` Wangshaobo (bobo)
  2020-06-04  2:40                 ` Josh Poimboeuf
  0 siblings, 1 reply; 13+ messages in thread
From: Wangshaobo (bobo) @ 2020-06-04  1:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr


在 2020/6/3 23:33, Josh Poimboeuf 写道:
> On Wed, Jun 03, 2020 at 10:06:07PM +0800, Wangshaobo (bobo) wrote:
> To be honest, I don't remember what I meant by sibling calls.  They
> don't even leave anything on the stack.
>
> For noreturns, the code might be laid out like this:
>
> func1:
> 	...
> 	call noreturn_foo
> func2:
>
> func2 is immediately after the call to noreturn_foo.  So the return
> address on the stack will actually be 'func2'.  We want to retrieve the
> ORC data for the call instruction (inside func1), instead of the
> instruction at the beginning of func2.
>
> I should probably update that comment.

So, I want to ask is there any side effects if i modify like this ? this 
modification is based on

your fix. It looks like ok with proper test.

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index e9cc182aa97e..ecce5051e8fd 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, 
struct task_struct *task,
                 state->sp = task->thread.sp;
                 state->bp = READ_ONCE_NOCHECK(frame->bp);
                 state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
+              state->signal = ((void *)state->ip == ret_from_fork);
         }

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 7f969b2d240f..d7396431261a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
          state->sp = sp;
          state->regs = NULL;
          state->prev_regs = NULL;
-        state->signal = ((void *)state->ip == ret_from_fork);
+        state->signal = false;
          break;

thanks,

Wang ShaoBo



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-04  1:24               ` Wangshaobo (bobo)
@ 2020-06-04  2:40                 ` Josh Poimboeuf
  2020-06-05  1:26                   ` Wangshaobo (bobo)
  0 siblings, 1 reply; 13+ messages in thread
From: Josh Poimboeuf @ 2020-06-04  2:40 UTC (permalink / raw)
  To: Wangshaobo (bobo)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Thu, Jun 04, 2020 at 09:24:55AM +0800, Wangshaobo (bobo) wrote:
> 
> 在 2020/6/3 23:33, Josh Poimboeuf 写道:
> > On Wed, Jun 03, 2020 at 10:06:07PM +0800, Wangshaobo (bobo) wrote:
> > To be honest, I don't remember what I meant by sibling calls.  They
> > don't even leave anything on the stack.
> > 
> > For noreturns, the code might be laid out like this:
> > 
> > func1:
> > 	...
> > 	call noreturn_foo
> > func2:
> > 
> > func2 is immediately after the call to noreturn_foo.  So the return
> > address on the stack will actually be 'func2'.  We want to retrieve the
> > ORC data for the call instruction (inside func1), instead of the
> > instruction at the beginning of func2.
> > 
> > I should probably update that comment.
> 
> So, I want to ask is there any side effects if i modify like this ? this
> modification is based on
> 
> your fix. It looks like ok with proper test.
> 
> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> index e9cc182aa97e..ecce5051e8fd 100644
> --- a/arch/x86/kernel/unwind_orc.c
> +++ b/arch/x86/kernel/unwind_orc.c
> @@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, struct
> task_struct *task,
>                 state->sp = task->thread.sp;
>                 state->bp = READ_ONCE_NOCHECK(frame->bp);
>                 state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
> +              state->signal = ((void *)state->ip == ret_from_fork);
>         }
> 
> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> index 7f969b2d240f..d7396431261a 100644
> --- a/arch/x86/kernel/unwind_orc.c
> +++ b/arch/x86/kernel/unwind_orc.c
> @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
>          state->sp = sp;
>          state->regs = NULL;
>          state->prev_regs = NULL;
> -        state->signal = ((void *)state->ip == ret_from_fork);
> +        state->signal = false;
>          break;

Yes that's correct.

-- 
Josh


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-04  2:40                 ` Josh Poimboeuf
@ 2020-06-05  1:26                   ` Wangshaobo (bobo)
  2020-06-05  1:51                     ` Josh Poimboeuf
  0 siblings, 1 reply; 13+ messages in thread
From: Wangshaobo (bobo) @ 2020-06-05  1:26 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr


在 2020/6/4 10:40, Josh Poimboeuf 写道:
> On Thu, Jun 04, 2020 at 09:24:55AM +0800, Wangshaobo (bobo) wrote:
>> 在 2020/6/3 23:33, Josh Poimboeuf 写道:
>>> On Wed, Jun 03, 2020 at 10:06:07PM +0800, Wangshaobo (bobo) wrote:
>>> To be honest, I don't remember what I meant by sibling calls.  They
>>> don't even leave anything on the stack.
>>>
>>> For noreturns, the code might be laid out like this:
>>>
>>> func1:
>>> 	...
>>> 	call noreturn_foo
>>> func2:
>>>
>>> func2 is immediately after the call to noreturn_foo.  So the return
>>> address on the stack will actually be 'func2'.  We want to retrieve the
>>> ORC data for the call instruction (inside func1), instead of the
>>> instruction at the beginning of func2.
>>>
>>> I should probably update that comment.
>> So, I want to ask is there any side effects if i modify like this ? this
>> modification is based on
>>
>> your fix. It looks like ok with proper test.
>>
>> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
>> index e9cc182aa97e..ecce5051e8fd 100644
>> --- a/arch/x86/kernel/unwind_orc.c
>> +++ b/arch/x86/kernel/unwind_orc.c
>> @@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, struct
>> task_struct *task,
>>                  state->sp = task->thread.sp;
>>                  state->bp = READ_ONCE_NOCHECK(frame->bp);
>>                  state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
>> +              state->signal = ((void *)state->ip == ret_from_fork);
>>          }
>>
>> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
>> index 7f969b2d240f..d7396431261a 100644
>> --- a/arch/x86/kernel/unwind_orc.c
>> +++ b/arch/x86/kernel/unwind_orc.c
>> @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
>>           state->sp = sp;
>>           state->regs = NULL;
>>           state->prev_regs = NULL;
>> -        state->signal = ((void *)state->ip == ret_from_fork);
>> +        state->signal = false;
>>           break;
> Yes that's correct.

Hi, josh

Could i ask when are you free to send the patch, all the tests are 
passed by.

thanks for your help.

Wang ShaoBo

>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-05  1:26                   ` Wangshaobo (bobo)
@ 2020-06-05  1:51                     ` Josh Poimboeuf
  2020-06-30 13:04                       ` Wangshaobo (bobo)
  0 siblings, 1 reply; 13+ messages in thread
From: Josh Poimboeuf @ 2020-06-05  1:51 UTC (permalink / raw)
  To: Wangshaobo (bobo)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Fri, Jun 05, 2020 at 09:26:42AM +0800, Wangshaobo (bobo) wrote:
> > > So, I want to ask is there any side effects if i modify like this ? this
> > > modification is based on
> > > 
> > > your fix. It looks like ok with proper test.
> > > 
> > > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> > > index e9cc182aa97e..ecce5051e8fd 100644
> > > --- a/arch/x86/kernel/unwind_orc.c
> > > +++ b/arch/x86/kernel/unwind_orc.c
> > > @@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, struct
> > > task_struct *task,
> > >                  state->sp = task->thread.sp;
> > >                  state->bp = READ_ONCE_NOCHECK(frame->bp);
> > >                  state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
> > > +              state->signal = ((void *)state->ip == ret_from_fork);
> > >          }
> > > 
> > > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> > > index 7f969b2d240f..d7396431261a 100644
> > > --- a/arch/x86/kernel/unwind_orc.c
> > > +++ b/arch/x86/kernel/unwind_orc.c
> > > @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
> > >           state->sp = sp;
> > >           state->regs = NULL;
> > >           state->prev_regs = NULL;
> > > -        state->signal = ((void *)state->ip == ret_from_fork);
> > > +        state->signal = false;
> > >           break;
> > Yes that's correct.
> 
> Hi, josh
> 
> Could i ask when are you free to send the patch, all the tests are passed
> by.

I want to run some regression tests, so it will probably be next week.

-- 
Josh


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-05  1:51                     ` Josh Poimboeuf
@ 2020-06-30 13:04                       ` Wangshaobo (bobo)
  2020-06-30 21:36                         ` Josh Poimboeuf
  0 siblings, 1 reply; 13+ messages in thread
From: Wangshaobo (bobo) @ 2020-06-30 13:04 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr


在 2020/6/5 9:51, Josh Poimboeuf 写道:
> On Fri, Jun 05, 2020 at 09:26:42AM +0800, Wangshaobo (bobo) wrote:
>>>> So, I want to ask is there any side effects if i modify like this ? this
>>>> modification is based on
>>>>
>>>> your fix. It looks like ok with proper test.
>>>>
>>>> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
>>>> index e9cc182aa97e..ecce5051e8fd 100644
>>>> --- a/arch/x86/kernel/unwind_orc.c
>>>> +++ b/arch/x86/kernel/unwind_orc.c
>>>> @@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, struct
>>>> task_struct *task,
>>>>                   state->sp = task->thread.sp;
>>>>                   state->bp = READ_ONCE_NOCHECK(frame->bp);
>>>>                   state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
>>>> +              state->signal = ((void *)state->ip == ret_from_fork);
>>>>           }
>>>>
>>>> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
>>>> index 7f969b2d240f..d7396431261a 100644
>>>> --- a/arch/x86/kernel/unwind_orc.c
>>>> +++ b/arch/x86/kernel/unwind_orc.c
>>>> @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
>>>>            state->sp = sp;
>>>>            state->regs = NULL;
>>>>            state->prev_regs = NULL;
>>>> -        state->signal = ((void *)state->ip == ret_from_fork);
>>>> +        state->signal = false;
>>>>            break;
>>> Yes that's correct.
>> Hi, josh
>>
>> Could i ask when are you free to send the patch, all the tests are passed
>> by.
> I want to run some regression tests, so it will probably be next week.

Hi, josh

did you send this patch, I haven't received it up to now, so i ask u to 
confirm again.

thanks,

Wang ShaoBo



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Question: livepatch failed for new fork() task stack unreliable
  2020-06-30 13:04                       ` Wangshaobo (bobo)
@ 2020-06-30 21:36                         ` Josh Poimboeuf
  0 siblings, 0 replies; 13+ messages in thread
From: Josh Poimboeuf @ 2020-06-30 21:36 UTC (permalink / raw)
  To: Wangshaobo (bobo)
  Cc: huawei.libin, xiexiuqi, cj.chengjian, mingo, x86, linux-kernel,
	live-patching, mbenes, devel, viro, esyr

On Tue, Jun 30, 2020 at 09:04:12PM +0800, Wangshaobo (bobo) wrote:
> 
> 在 2020/6/5 9:51, Josh Poimboeuf 写道:
> > On Fri, Jun 05, 2020 at 09:26:42AM +0800, Wangshaobo (bobo) wrote:
> > > > > So, I want to ask is there any side effects if i modify like this ? this
> > > > > modification is based on
> > > > > 
> > > > > your fix. It looks like ok with proper test.
> > > > > 
> > > > > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> > > > > index e9cc182aa97e..ecce5051e8fd 100644
> > > > > --- a/arch/x86/kernel/unwind_orc.c
> > > > > +++ b/arch/x86/kernel/unwind_orc.c
> > > > > @@ -620,6 +620,7 @@ void __unwind_start(struct unwind_state *state, struct
> > > > > task_struct *task,
> > > > >                   state->sp = task->thread.sp;
> > > > >                   state->bp = READ_ONCE_NOCHECK(frame->bp);
> > > > >                   state->ip = READ_ONCE_NOCHECK(frame->ret_addr);
> > > > > +              state->signal = ((void *)state->ip == ret_from_fork);
> > > > >           }
> > > > > 
> > > > > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> > > > > index 7f969b2d240f..d7396431261a 100644
> > > > > --- a/arch/x86/kernel/unwind_orc.c
> > > > > +++ b/arch/x86/kernel/unwind_orc.c
> > > > > @@ -540,7 +540,7 @@ bool unwind_next_frame(struct unwind_state *state)
> > > > >            state->sp = sp;
> > > > >            state->regs = NULL;
> > > > >            state->prev_regs = NULL;
> > > > > -        state->signal = ((void *)state->ip == ret_from_fork);
> > > > > +        state->signal = false;
> > > > >            break;
> > > > Yes that's correct.
> > > Hi, josh
> > > 
> > > Could i ask when are you free to send the patch, all the tests are passed
> > > by.
> > I want to run some regression tests, so it will probably be next week.

Sorry, I was away for a bit and I didn't get a chance to send the patch.
I should hopefully have it ready soon.

-- 
Josh


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-06-30 21:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-29 10:10 Question: livepatch failed for new fork() task stack unreliable Wang ShaoBo
2020-05-29 17:44 ` Josh Poimboeuf
     [not found]   ` <a9ed9157-f3cf-7d2c-7a8e-56150a2a114e@huawei.com>
2020-06-01 18:05     ` Josh Poimboeuf
2020-06-02  1:22       ` Wangshaobo (bobo)
2020-06-02 13:14         ` Josh Poimboeuf
2020-06-03 14:06           ` Wangshaobo (bobo)
2020-06-03 15:33             ` Josh Poimboeuf
2020-06-04  1:24               ` Wangshaobo (bobo)
2020-06-04  2:40                 ` Josh Poimboeuf
2020-06-05  1:26                   ` Wangshaobo (bobo)
2020-06-05  1:51                     ` Josh Poimboeuf
2020-06-30 13:04                       ` Wangshaobo (bobo)
2020-06-30 21:36                         ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).