linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
@ 2021-10-12  8:35 Woody Lin
  2021-10-12  9:59 ` Valentin Schneider
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Woody Lin @ 2021-10-12  8:35 UTC (permalink / raw)
  To: Ingo Molnar, Ben Segall
  Cc: Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Mel Gorman, Daniel Bristot de Oliveira,
	Valentin Schneider, linux-kernel, Woody Lin

There was a 'init_idle' that resets scs sp to base, but is removed by
f1a0a376ca0c. Without the resetting, the hot-plugging implemented by
cpu_psci_cpu_boot will use the previous scs sp as new base when starting
up a CPU core, so the usage on scs page is being stacked up until
overflow.

This only happens on idle task since __cpu_up is using idle task as the
main thread to start up a CPU core, so the overflow can be fixed by
resetting scs sp to base in 'idle_task_exit'.

Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Woody Lin <woodylin@google.com>
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1bba4128a3e6..f21714ea3db8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8795,6 +8795,7 @@ void idle_task_exit(void)
 		finish_arch_post_lock_switch();
 	}
 
+	scs_task_reset(current);
 	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 
-- 
2.33.0.882.g93a45727a2-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-12  8:35 [PATCH] sched/scs: Reset the shadow stack when idle_task_exit Woody Lin
@ 2021-10-12  9:59 ` Valentin Schneider
  2021-10-12 10:35   ` Woody Lin
  2021-10-19 15:23 ` [tip: sched/urgent] " tip-bot2 for Woody Lin
  2021-10-19 15:55 ` tip-bot2 for Woody Lin
  2 siblings, 1 reply; 9+ messages in thread
From: Valentin Schneider @ 2021-10-12  9:59 UTC (permalink / raw)
  To: Woody Lin, Ingo Molnar, Ben Segall
  Cc: Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Mel Gorman, Daniel Bristot de Oliveira,
	linux-kernel, Woody Lin

On 12/10/21 16:35, Woody Lin wrote:
> There was a 'init_idle' that resets scs sp to base, but is removed by
> f1a0a376ca0c. Without the resetting, the hot-plugging implemented by
> cpu_psci_cpu_boot will use the previous scs sp as new base when starting
> up a CPU core, so the usage on scs page is being stacked up until
> overflow.
>
> This only happens on idle task since __cpu_up is using idle task as the
> main thread to start up a CPU core, so the overflow can be fixed by
> resetting scs sp to base in 'idle_task_exit'.
>

Looking at init_idle() for similar issues, it looks like we might also want
to re-issue kasan_unpoison_task_stack() on the idle task upon hotplug.

> Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
> Signed-off-by: Woody Lin <woodylin@google.com>
> ---
>  kernel/sched/core.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1bba4128a3e6..f21714ea3db8 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8795,6 +8795,7 @@ void idle_task_exit(void)
>               finish_arch_post_lock_switch();
>       }
>
> +	scs_task_reset(current);
>       /* finish_cpu(), as ran on the BP, will clean up the active_mm state */

So AIUI for SCS that works just fine - one thing I'm unclear on is how the
following pops are going to work given the SP reset happens in the middle
of a call stack, but AFAICT that was already the case before I messed about
with init_idle(), so that must already be handled.

I'm not familiar enough with KASAN to say whether that
kasan_unpoison_task_stack() should rather happen upon hotplugging the CPU
back (rather than on hot-unplug). If that is the case, then maybe somewhere
around cpu_startup_entry() might work (and then you could bunch these two
"needs to be re-run at init for the idle task" functions into a common
helper).

>  }
>
> --
> 2.33.0.882.g93a45727a2-goog

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-12  9:59 ` Valentin Schneider
@ 2021-10-12 10:35   ` Woody Lin
  2021-10-12 10:57     ` Valentin Schneider
  0 siblings, 1 reply; 9+ messages in thread
From: Woody Lin @ 2021-10-12 10:35 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Ben Segall, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On Tue, Oct 12, 2021 at 6:00 PM Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 12/10/21 16:35, Woody Lin wrote:
> > There was a 'init_idle' that resets scs sp to base, but is removed by
> > f1a0a376ca0c. Without the resetting, the hot-plugging implemented by
> > cpu_psci_cpu_boot will use the previous scs sp as new base when starting
> > up a CPU core, so the usage on scs page is being stacked up until
> > overflow.
> >
> > This only happens on idle task since __cpu_up is using idle task as the
> > main thread to start up a CPU core, so the overflow can be fixed by
> > resetting scs sp to base in 'idle_task_exit'.
> >
>
> Looking at init_idle() for similar issues, it looks like we might also want
> to re-issue kasan_unpoison_task_stack() on the idle task upon hotplug.
>
> > Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
> > Signed-off-by: Woody Lin <woodylin@google.com>
> > ---
> >  kernel/sched/core.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 1bba4128a3e6..f21714ea3db8 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -8795,6 +8795,7 @@ void idle_task_exit(void)
> >               finish_arch_post_lock_switch();
> >       }
> >
> > +     scs_task_reset(current);
> >       /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
>
> So AIUI for SCS that works just fine - one thing I'm unclear on is how the
> following pops are going to work given the SP reset happens in the middle
> of a call stack, but AFAICT that was already the case before I messed about
> with init_idle(), so that must already be handled.

Hi Valentin,

Thanks for the question. The 'scs_task_reset' here resets only the
'.thread_info.scs_sp' of the task, so the register (on arm64 it's x18)
is still pointing to the same location for popping and storing call
frames. The register will be updated to '.thread_info.scs_sp' in
'__secondary_switched', which starts a new core and there is no popping
after the updating, so it won't introduce an underflow.

>
> I'm not familiar enough with KASAN to say whether that
> kasan_unpoison_task_stack() should rather happen upon hotplugging the CPU
> back (rather than on hot-unplug). If that is the case, then maybe somewhere
> around cpu_startup_entry() might work (and then you could bunch these two
> "needs to be re-run at init for the idle task" functions into a common
> helper).

unpoison looks more like an one-time thing to me; the idle tasks will
reuse the same stack pages until system resets, so I think we don't need
to re-unpoison that during hotplugging as long as it's unpoisoned in
'init_idle'.

>
> >  }
> >
> > --
> > 2.33.0.882.g93a45727a2-goog

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-12 10:35   ` Woody Lin
@ 2021-10-12 10:57     ` Valentin Schneider
  2021-10-13  1:22       ` Woody Lin
  0 siblings, 1 reply; 9+ messages in thread
From: Valentin Schneider @ 2021-10-12 10:57 UTC (permalink / raw)
  To: Woody Lin
  Cc: Ingo Molnar, Ben Segall, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On 12/10/21 18:35, Woody Lin wrote:
> On Tue, Oct 12, 2021 at 6:00 PM Valentin Schneider
> <valentin.schneider@arm.com> wrote:
>>
>> So AIUI for SCS that works just fine - one thing I'm unclear on is how the
>> following pops are going to work given the SP reset happens in the middle
>> of a call stack, but AFAICT that was already the case before I messed about
>> with init_idle(), so that must already be handled.
>
> Hi Valentin,
>
> Thanks for the question. The 'scs_task_reset' here resets only the
> '.thread_info.scs_sp' of the task, so the register (on arm64 it's x18)
> is still pointing to the same location for popping and storing call
> frames. The register will be updated to '.thread_info.scs_sp' in
> '__secondary_switched', which starts a new core and there is no popping
> after the updating, so it won't introduce an underflow.
>

I think I got it; __secondary_switched() -> init_cpu_task() -> scs_load()

Thanks!

>>
>> I'm not familiar enough with KASAN to say whether that
>> kasan_unpoison_task_stack() should rather happen upon hotplugging the CPU
>> back (rather than on hot-unplug). If that is the case, then maybe somewhere
>> around cpu_startup_entry() might work (and then you could bunch these two
>> "needs to be re-run at init for the idle task" functions into a common
>> helper).
>
> unpoison looks more like an one-time thing to me; the idle tasks will
> reuse the same stack pages until system resets, so I think we don't need
> to re-unpoison that during hotplugging as long as it's unpoisoned in
> 'init_idle'.
>

I would tend to agree, but was bitten by s390 freeing some memory on
hot-unplug and re-allocating it upon hotplug:

  6a942f578054 ("s390: preempt: Fix preempt_count initialization")

This makes me doubt whether we can assert the idle task stack pages are
perennial vs hotplug on all architectures.

>>
>> >  }
>> >
>> > --
>> > 2.33.0.882.g93a45727a2-goog

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-12 10:57     ` Valentin Schneider
@ 2021-10-13  1:22       ` Woody Lin
  2021-10-13 13:32         ` Valentin Schneider
  0 siblings, 1 reply; 9+ messages in thread
From: Woody Lin @ 2021-10-13  1:22 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Ben Segall, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On Tue, Oct 12, 2021 at 6:57 PM Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 12/10/21 18:35, Woody Lin wrote:
> > On Tue, Oct 12, 2021 at 6:00 PM Valentin Schneider
> > <valentin.schneider@arm.com> wrote:
> >>
> >> So AIUI for SCS that works just fine - one thing I'm unclear on is how the
> >> following pops are going to work given the SP reset happens in the middle
> >> of a call stack, but AFAICT that was already the case before I messed about
> >> with init_idle(), so that must already be handled.
> >
> > Hi Valentin,
> >
> > Thanks for the question. The 'scs_task_reset' here resets only the
> > '.thread_info.scs_sp' of the task, so the register (on arm64 it's x18)
> > is still pointing to the same location for popping and storing call
> > frames. The register will be updated to '.thread_info.scs_sp' in
> > '__secondary_switched', which starts a new core and there is no popping
> > after the updating, so it won't introduce an underflow.
> >
>
> I think I got it; __secondary_switched() -> init_cpu_task() -> scs_load()
>
> Thanks!
>
> >>
> >> I'm not familiar enough with KASAN to say whether that
> >> kasan_unpoison_task_stack() should rather happen upon hotplugging the CPU
> >> back (rather than on hot-unplug). If that is the case, then maybe somewhere
> >> around cpu_startup_entry() might work (and then you could bunch these two
> >> "needs to be re-run at init for the idle task" functions into a common
> >> helper).
> >
> > unpoison looks more like an one-time thing to me; the idle tasks will
> > reuse the same stack pages until system resets, so I think we don't need
> > to re-unpoison that during hotplugging as long as it's unpoisoned in
> > 'init_idle'.
> >
>
> I would tend to agree, but was bitten by s390 freeing some memory on
> hot-unplug and re-allocating it upon hotplug:
>
>   6a942f578054 ("s390: preempt: Fix preempt_count initialization")
>
> This makes me doubt whether we can assert the idle task stack pages are
> perennial vs hotplug on all architectures.

I made a quick study on memory-hotplug and seems that only memory contains
nothing other than migratable pages can be unplugged. So process stack
pages should not be a concern for this, since which is an unmovable
memory.

However I don't have a chance to work on a system that enables
memory-hotplug so far, so couldn't verify this assumption further. Guess
we can create a separate thread to clarify this more.

Regards,
Woody

>
> >>
> >> >  }
> >> >
> >> > --
> >> > 2.33.0.882.g93a45727a2-goog

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-13  1:22       ` Woody Lin
@ 2021-10-13 13:32         ` Valentin Schneider
  2021-10-15 13:02           ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Valentin Schneider @ 2021-10-13 13:32 UTC (permalink / raw)
  To: Woody Lin
  Cc: Ingo Molnar, Ben Segall, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On 13/10/21 09:22, Woody Lin wrote:
> On Tue, Oct 12, 2021 at 6:57 PM Valentin Schneider <valentin.schneider@arm.com> wrote:
>> On 12/10/21 18:35, Woody Lin wrote:
>> > unpoison looks more like an one-time thing to me; the idle tasks will
>> > reuse the same stack pages until system resets, so I think we don't need
>> > to re-unpoison that during hotplugging as long as it's unpoisoned in
>> > 'init_idle'.
>> >
>>
>> I would tend to agree, but was bitten by s390 freeing some memory on
>> hot-unplug and re-allocating it upon hotplug:
>>
>>   6a942f578054 ("s390: preempt: Fix preempt_count initialization")
>>
>> This makes me doubt whether we can assert the idle task stack pages are
>> perennial vs hotplug on all architectures.
>
> I made a quick study on memory-hotplug and seems that only memory contains
> nothing other than migratable pages can be unplugged. So process stack
> pages should not be a concern for this, since which is an unmovable
> memory.
>
> However I don't have a chance to work on a system that enables
> memory-hotplug so far, so couldn't verify this assumption further. Guess
> we can create a separate thread to clarify this more.
>

That sounds sensible; I'll try to dig some more into this.

As for the SCS change, someone might argue for placing this elsewhere in
the hotplug path, but that looks fine to me:

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>

> Regards,
> Woody
>
>>
>> >>
>> >> >  }
>> >> >
>> >> > --
>> >> > 2.33.0.882.g93a45727a2-goog

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-13 13:32         ` Valentin Schneider
@ 2021-10-15 13:02           ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2021-10-15 13:02 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Woody Lin, Ingo Molnar, Ben Segall, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On Wed, Oct 13, 2021 at 02:32:51PM +0100, Valentin Schneider wrote:
> As for the SCS change, someone might argue for placing this elsewhere in
> the hotplug path, but that looks fine to me:
> 
> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>

Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip: sched/urgent] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-12  8:35 [PATCH] sched/scs: Reset the shadow stack when idle_task_exit Woody Lin
  2021-10-12  9:59 ` Valentin Schneider
@ 2021-10-19 15:23 ` tip-bot2 for Woody Lin
  2021-10-19 15:55 ` tip-bot2 for Woody Lin
  2 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Woody Lin @ 2021-10-19 15:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Woody Lin, Peter Zijlstra (Intel), Valentin Schneider, x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     45dfb89b8f96643268449c25d7025b17de46717c
Gitweb:        https://git.kernel.org/tip/45dfb89b8f96643268449c25d7025b17de46717c
Author:        Woody Lin <woodylin@google.com>
AuthorDate:    Tue, 12 Oct 2021 16:35:21 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 18 Oct 2021 16:58:41 +02:00

sched/scs: Reset the shadow stack when idle_task_exit

There was a 'init_idle' that resets scs sp to base, but is removed by
f1a0a376ca0c. Without the resetting, the hot-plugging implemented by
cpu_psci_cpu_boot will use the previous scs sp as new base when starting
up a CPU core, so the usage on scs page is being stacked up until
overflow.

This only happens on idle task since __cpu_up is using idle task as the
main thread to start up a CPU core, so the overflow can be fixed by
resetting scs sp to base in 'idle_task_exit'.

Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Woody Lin <woodylin@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20211012083521.973587-1-woodylin@google.com
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1bba412..f21714e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8795,6 +8795,7 @@ void idle_task_exit(void)
 		finish_arch_post_lock_switch();
 	}
 
+	scs_task_reset(current);
 	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [tip: sched/urgent] sched/scs: Reset the shadow stack when idle_task_exit
  2021-10-12  8:35 [PATCH] sched/scs: Reset the shadow stack when idle_task_exit Woody Lin
  2021-10-12  9:59 ` Valentin Schneider
  2021-10-19 15:23 ` [tip: sched/urgent] " tip-bot2 for Woody Lin
@ 2021-10-19 15:55 ` tip-bot2 for Woody Lin
  2 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Woody Lin @ 2021-10-19 15:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Woody Lin, Peter Zijlstra (Intel), Valentin Schneider, x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     63acd42c0d4942f74710b11c38602fb14dea7320
Gitweb:        https://git.kernel.org/tip/63acd42c0d4942f74710b11c38602fb14dea7320
Author:        Woody Lin <woodylin@google.com>
AuthorDate:    Tue, 12 Oct 2021 16:35:21 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 19 Oct 2021 17:46:11 +02:00

sched/scs: Reset the shadow stack when idle_task_exit

Commit f1a0a376ca0c ("sched/core: Initialize the idle task with
preemption disabled") removed the init_idle() call from
idle_thread_get(). This was the sole call-path on hotplug that resets
the Shadow Call Stack (scs) Stack Pointer (sp).

Not resetting the scs-sp leads to scs overflow after enough hotplug
cycles. Therefore add an explicit scs_task_reset() to the hotplug code
to make sure the scs-sp does get reset on hotplug.

Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Woody Lin <woodylin@google.com>
[peterz: Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20211012083521.973587-1-woodylin@google.com
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1bba412..f21714e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8795,6 +8795,7 @@ void idle_task_exit(void)
 		finish_arch_post_lock_switch();
 	}
 
+	scs_task_reset(current);
 	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-10-19 15:55 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-12  8:35 [PATCH] sched/scs: Reset the shadow stack when idle_task_exit Woody Lin
2021-10-12  9:59 ` Valentin Schneider
2021-10-12 10:35   ` Woody Lin
2021-10-12 10:57     ` Valentin Schneider
2021-10-13  1:22       ` Woody Lin
2021-10-13 13:32         ` Valentin Schneider
2021-10-15 13:02           ` Peter Zijlstra
2021-10-19 15:23 ` [tip: sched/urgent] " tip-bot2 for Woody Lin
2021-10-19 15:55 ` tip-bot2 for Woody Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).