* [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Miroslav Benes @ 2017-01-06 14:00 UTC (permalink / raw)
To: jpoimboe, jeyu, jikos
Cc: pmladek, corbet, live-patching, linux-doc, linux-kernel, Miroslav Benes
The Limitations section of the documentation says that anything inlined
into __schedule() cannot be livepatched. This was true until the 4.9
kernel. Thanks to commit 0100301bfdf5
("sched/x86: Rewrite the switch_to() code") from Brian Gerst, there is
now a __switch_to_asm() function (implemented in assembly) that is properly
called from context_switch(). RIP is thus saved on the stack, and a task
returns to the proper version of __schedule() and related functions.
Of course, __switch_to_asm() itself is not patchable for the reason described
in the section. But it has no __fentry__ call, and I cannot imagine a
reason to patch it anyway.
Therefore, remove the paragraphs from the section.
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
FWIW, I also tested this to be sure, on top of the consistency model
patch set. I patched the schedule() function, which calls __schedule()
(__schedule() itself cannot be patched directly because of its notrace
attribute). It works well, except:
1. The patching process does not finish, because many tasks sleep in
schedule(). Sending STOP/CONT signals does not help. I'll investigate.
2. Reverting the patch does not work as expected. The kernel
crashes after the removal of the module. Very likely a task slept in
schedule() and was not migrated properly. It might be because of the races
in klp_reverse_transition() described by Petr, or it might be somewhere
else. I'll look into it.
Documentation/livepatch/livepatch.txt | 19 -------------------
1 file changed, 19 deletions(-)
diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
index f5967316deb9..7f04e13ec53d 100644
--- a/Documentation/livepatch/livepatch.txt
+++ b/Documentation/livepatch/livepatch.txt
@@ -329,25 +329,6 @@ See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
by "notrace".
- + Anything inlined into __schedule() can not be patched.
-
- The switch_to macro is inlined into __schedule(). It switches the
- context between two processes in the middle of the macro. It does
- not save RIP in x86_64 version (contrary to 32-bit version). Instead,
- the currently used __schedule()/switch_to() handles both processes.
-
- Now, let's have two different tasks. One calls the original
- __schedule(), its registers are stored in a defined order and it
- goes to sleep in the switch_to macro and some other task is restored
- using the original __schedule(). Then there is the second task which
- calls patched__schedule(), it goes to sleep there and the first task
- is picked by the patched__schedule(). Its RSP is restored and now
- the registers should be restored as well. But the order is different
- in the new patched__schedule(), so...
-
- There is work in progress to remove this limitation.
-
-
+ Livepatch modules can not be removed.
The current implementation just redirects the functions at the very
--
2.11.0
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Petr Mladek @ 2017-01-06 15:01 UTC (permalink / raw)
To: Miroslav Benes
Cc: jpoimboe, jeyu, jikos, corbet, live-patching, linux-doc, linux-kernel
On Fri 2017-01-06 15:00:45, Miroslav Benes wrote:
> The Limitations section of the documentation describes the impossibility
> to livepatch anything that is inlined to __schedule() function. This had
> been true till 4.9 kernel came. Thanks to commit 0100301bfdf5
> ("sched/x86: Rewrite the switch_to() code") from Brian Gerst there is
> __switch_to_asm function now (implemented in assembly) called properly
> from context_switch(). RIP is thus saved on the stack and a task would
> return to proper version of __schedule() et al. functions.
>
> Of course __switch_to_asm() is not patchable for the reason described in
> the section. But there is no __fentry__ call and I cannot imagine a
> reason to do it anyway.
>
> Therefore, remove the paragraphs from the section.
>
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
It is great to get a feature for free ;-)
Reviewed-by: Petr Mladek <pmladek@suse.com>
Best Regards,
Petr
---
> FWIW, I also tested this to be sure on top of the consistency model
> patch set. I patched schedule() function which calls __schedule() (it is
> impossible to patch it directly due to notrace attribute). It works well
> except...
>
> 1. the patching process does not finish, because many tasks sleep in
> schedule. STOP/CONT signal does not help. I'll investigate.
Are these userspace processes or kthreads? Kthreads would cause
problems because they do not handle signals.
> 2. reversion of the process does not work as expected. The kernel
> crashes after the removal of the module. A task very likely slept in
> schedule and was not migrated properly. It might be because of the races
> in klp_reverse_transition() described by Petr, or might be somewhere
> else. I'll look into it.
I hope that I will be able to do another dive into the consistency
model patchset the following week.
Best Regards,
Petr
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Miroslav Benes @ 2017-01-06 15:10 UTC (permalink / raw)
To: Petr Mladek
Cc: jpoimboe, jeyu, jikos, corbet, live-patching, linux-doc, linux-kernel
On Fri, 6 Jan 2017, Petr Mladek wrote:
> On Fri 2017-01-06 15:00:45, Miroslav Benes wrote:
> > The Limitations section of the documentation describes the impossibility
> > to livepatch anything that is inlined to __schedule() function. This had
> > been true till 4.9 kernel came. Thanks to commit 0100301bfdf5
> > ("sched/x86: Rewrite the switch_to() code") from Brian Gerst there is
> > __switch_to_asm function now (implemented in assembly) called properly
> > from context_switch(). RIP is thus saved on the stack and a task would
> > return to proper version of __schedule() et al. functions.
> >
> > Of course __switch_to_asm() is not patchable for the reason described in
> > the section. But there is no __fentry__ call and I cannot imagine a
> > reason to do it anyway.
> >
> > Therefore, remove the paragraphs from the section.
> >
> > Signed-off-by: Miroslav Benes <mbenes@suse.cz>
>
> It is great to get a feature for free ;-)
>
> Reviewed-by: Petr Mladek <pmladek@suse.com>
>
> Best Regards,
> Petr
>
> ---
> > FWIW, I also tested this to be sure on top of the consistency model
> > patch set. I patched schedule() function which calls __schedule() (it is
> > impossible to patch it directly due to notrace attribute). It works well
> > except...
> >
> > 1. the patching process does not finish, because many tasks sleep in
> > schedule. STOP/CONT signal does not help. I'll investigate.
>
> Are these userspace processes or kthreads? Kthreads would cause
> problems because they do not handle signals.
Userspace processes, but I take it back. Stupid typo in my script. It
works as expected. Kthreads sleeping in schedule() are of course there and
a signal does not help.
Miroslav
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Josh Poimboeuf @ 2017-01-06 19:13 UTC (permalink / raw)
To: Miroslav Benes
Cc: jeyu, jikos, pmladek, corbet, live-patching, linux-doc, linux-kernel
On Fri, Jan 06, 2017 at 03:00:45PM +0100, Miroslav Benes wrote:
> The Limitations section of the documentation describes the impossibility
> to livepatch anything that is inlined to __schedule() function. This had
> been true till 4.9 kernel came. Thanks to commit 0100301bfdf5
> ("sched/x86: Rewrite the switch_to() code") from Brian Gerst there is
> __switch_to_asm function now (implemented in assembly) called properly
> from context_switch(). RIP is thus saved on the stack and a task would
> return to proper version of __schedule() et al. functions.
>
> Of course __switch_to_asm() is not patchable for the reason described in
> the section. But there is no __fentry__ call and I cannot imagine a
> reason to do it anyway.
>
> Therefore, remove the paragraphs from the section.
>
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
> FWIW, I also tested this to be sure on top of the consistency model
> patch set. I patched schedule() function which calls __schedule() (it is
> impossible to patch it directly due to notrace attribute). It works well
> except...
>
> 1. the patching process does not finish, because many tasks sleep in
> schedule. STOP/CONT signal does not help. I'll investigate.
>
> 2. reversion of the process does not work as expected. The kernel
> crashes after the removal of the module. A task very likely slept in
> schedule and was not migrated properly. It might be because of the races
> in klp_reverse_transition() described by Petr, or might be somewhere
> else. I'll look into it.
Hm, will be interesting to see the cause of this...
--
Josh
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Miroslav Benes @ 2017-01-09 12:50 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: jeyu, jikos, pmladek, corbet, live-patching, linux-doc, linux-kernel
On Fri, 6 Jan 2017, Josh Poimboeuf wrote:
> On Fri, Jan 06, 2017 at 03:00:45PM +0100, Miroslav Benes wrote:
> >
> > 2. reversion of the process does not work as expected. The kernel
> > crashes after the removal of the module. A task very likely slept in
> > schedule and was not migrated properly. It might be because of the races
> > in klp_reverse_transition() described by Petr, or might be somewhere
> > else. I'll look into it.
>
> Hm, will be interesting to see the cause of this...
The absence of the patched schedule() on the stack was the cause.
klp_try_switch_task() thus did not see it and happily migrated the task.
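The check described above can be illustrated with a minimal user-space
sketch. This is not the kernel's code: struct func_range, stack_hits_range()
and can_migrate() are hypothetical names, and the real logic in
klp_try_switch_task() works on a stack trace of the task. The assumed idea is
that a task may be switched only if no return address on its stack falls
inside a function that is being replaced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range [start, start + size) of one function. */
struct func_range {
	uintptr_t start;
	size_t size;
};

/* Return true if any saved return address lies inside the function. */
static bool stack_hits_range(const uintptr_t *trace, size_t n,
			     const struct func_range *f)
{
	assert(trace != NULL && f != NULL);
	for (size_t i = 0; i < n; i++)
		if (trace[i] >= f->start && trace[i] - f->start < f->size)
			return true;
	return false;
}

/* A task may be switched only if no affected function is on its stack. */
static bool can_migrate(const uintptr_t *trace, size_t n,
			const struct func_range *funcs, size_t nfuncs)
{
	for (size_t i = 0; i < nfuncs; i++)
		if (stack_hits_range(trace, n, &funcs[i]))
			return false;
	return true;
}
```

If the unwinder fails to report the patched schedule() for a sleeping task,
a check of this kind sees nothing in the way and lets the task through, which
matches the behaviour described above.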
The reason is funny. One cannot patch __schedule() (which is the function of
interest) because of the notrace attribute. So all its callers need to
be patched instead. I tried to make my life easier and patched only schedule().
GCC then inlined the new __schedule() into the new schedule(). When I added
the noinline attribute to the new __schedule(), everything was fine (because
suddenly the new schedule() was on the stack as expected).
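For illustration only, the attribute placement can be sketched in plain
user-space C. The function names are hypothetical stand-ins for the
replacement functions in my test module, and the bodies are dummies. The
attribute is spelled in its GCC form here; in kernel code one would use the
noinline macro.

```c
/*
 * Hypothetical stand-in for the new __schedule(). Without the attribute,
 * GCC is free to inline it into its only caller, in which case it never
 * gets a stack frame of its own.
 */
static __attribute__((noinline)) int new_inner_schedule(int prev)
{
	return prev + 1;
}

/* Hypothetical stand-in for the new schedule(), the function being patched. */
int new_schedule(int prev)
{
	int next = new_inner_schedule(prev);

	/* Work after the call, so the call is not turned into a tail call. */
	return next * 2;
}
```

With the attribute in place, the inner function keeps a separate frame and
the outer one appears as its caller in a stack trace.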
There is still one thing which I don't understand. Why is __schedule()
(patched or original) not on the stack? The actual "sleep"
should happen in __switch_to_asm(), which is a real function now. And there
is a call to __switch_to_asm() in __schedule(). __schedule() thus should be on
the stack, shouldn't it? What am I missing? __switch_to_asm() pushes %rbp
on the stack...
Miroslav
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Josh Poimboeuf @ 2017-01-09 14:54 UTC (permalink / raw)
To: Miroslav Benes
Cc: jeyu, jikos, pmladek, corbet, live-patching, linux-doc, linux-kernel
On Mon, Jan 09, 2017 at 01:50:19PM +0100, Miroslav Benes wrote:
> There is still one thing which I don't understand. Why __schedule()
> (patched or the original) is not on the stack. The actual "sleep"
> should happen in __switch_to_asm() which is C function now. And there is a
> call to __switch_to_asm() in __schedule(). __schedule() thus should be on
> the stack, shouldn't it? What am I missing? __switch_to_asm() pushes %rbp
> on the stack...
Ah, this is an unwinder bug. get_frame_pointer() needs to be fixed so
that for an inactive task it returns a pointer to inactive_task_frame.bp
rather than the value of inactive_task_frame.bp itself. Will fix it.
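To make the difference concrete, here is a simplified user-space model, not
the actual unwinder code. struct frame_record, unwind(), start_buggy() and
start_fixed() are hypothetical names. The model assumes that the saved bp and
the return address sit next to each other in the inactive task frame, so that
the pair can be read like any other frame-pointer record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One frame-pointer record: saved caller frame pointer, then return address. */
struct frame_record {
	struct frame_record *prev_bp;
	uintptr_t ret_addr;
};

/* Walk the chain from 'start' and collect the return addresses. */
static size_t unwind(const struct frame_record *start, uintptr_t *out,
		     size_t max)
{
	size_t n = 0;

	assert(out != NULL);
	for (const struct frame_record *f = start; f != NULL && n < max;
	     f = f->prev_bp)
		out[n++] = f->ret_addr;
	return n;
}

/* Buggy start: the value of the saved bp, which skips the innermost record. */
static const struct frame_record *
start_buggy(const struct frame_record *inactive)
{
	return inactive->prev_bp;
}

/* Fixed start: a pointer to the saved bp, so the innermost record is read. */
static const struct frame_record *
start_fixed(const struct frame_record *inactive)
{
	return inactive;
}
```

In this model the buggy start never reads the return address stored next to
the saved bp, so the function that called __switch_to_asm() is missing from
the trace.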
--
Josh
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Miroslav Benes @ 2017-01-10 10:32 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: jeyu, jikos, pmladek, corbet, live-patching, linux-doc, linux-kernel
On Mon, 9 Jan 2017, Josh Poimboeuf wrote:
> On Mon, Jan 09, 2017 at 01:50:19PM +0100, Miroslav Benes wrote:
> > There is still one thing which I don't understand. Why __schedule()
> > (patched or the original) is not on the stack. The actual "sleep"
> > should happen in __switch_to_asm() which is C function now. And there is a
> > call to __switch_to_asm() in __schedule(). __schedule() thus should be on
> > the stack, shouldn't it? What am I missing? __switch_to_asm() pushes %rbp
> > on the stack...
>
> Ah, this is an unwinder bug. get_frame_pointer() needs to be fixed so
> that for an inactive task it returns a pointer to inactive_task_frame.bp
> rather than the value of inactive_task_frame.bp itself. Will fix it.
And it works with the fix. Thanks.
Miroslav
* Re: [PATCH] Documentation/livepatch: remove the limitation for schedule() patching
From: Jiri Kosina @ 2017-01-11 1:33 UTC (permalink / raw)
To: Miroslav Benes
Cc: jpoimboe, jeyu, pmladek, corbet, live-patching, linux-doc, linux-kernel
On Fri, 6 Jan 2017, Miroslav Benes wrote:
> The Limitations section of the documentation describes the impossibility
> to livepatch anything that is inlined to __schedule() function. This had
> been true till 4.9 kernel came. Thanks to commit 0100301bfdf5
> ("sched/x86: Rewrite the switch_to() code") from Brian Gerst there is
> __switch_to_asm function now (implemented in assembly) called properly
> from context_switch(). RIP is thus saved on the stack and a task would
> return to proper version of __schedule() et al. functions.
>
> Of course __switch_to_asm() is not patchable for the reason described in
> the section. But there is no __fentry__ call and I cannot imagine a
> reason to do it anyway.
>
> Therefore, remove the paragraphs from the section.
>
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
> FWIW, I also tested this to be sure on top of the consistency model
> patch set. I patched schedule() function which calls __schedule() (it is
> impossible to patch it directly due to notrace attribute). It works well
> except...
>
> 1. the patching process does not finish, because many tasks sleep in
> schedule. STOP/CONT signal does not help. I'll investigate.
>
> 2. reversion of the process does not work as expected. The kernel
> crashes after the removal of the module. A task very likely slept in
> schedule and was not migrated properly. It might be because of the races
> in klp_reverse_transition() described by Petr, or might be somewhere
> else. I'll look into it.
Applied, thanks.
--
Jiri Kosina
SUSE Labs