From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C19ECC433F5 for ; Tue, 23 Nov 2021 11:06:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234303AbhKWLKF (ORCPT ); Tue, 23 Nov 2021 06:10:05 -0500 Received: from foss.arm.com ([217.140.110.172]:51058 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233462AbhKWLKE (ORCPT ); Tue, 23 Nov 2021 06:10:04 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2C2721063; Tue, 23 Nov 2021 03:06:56 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 08E943F5A1; Tue, 23 Nov 2021 03:06:54 -0800 (PST) Date: Tue, 23 Nov 2021 11:06:48 +0000 From: Mark Rutland To: linux-kernel@vger.kernel.org Cc: catalin.marinas@arm.com, dvyukov@google.com, peterz@infradead.org, quic_qiancai@quicinc.com, valentin.schneider@arm.com, will@kernel.org, woodylin@google.com Subject: Re: [PATCH v2] Reset task stack state in bringup_cpu() Message-ID: <20211123110648.GA37253@lakrids.cambridge.arm.com> References: <20211118102927.4854-1-mark.rutland@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20211118102927.4854-1-mark.rutland@arm.com> User-Agent: Mutt/1.11.1+11 (2f07cb52) (2018-12-01) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 18, 2021 at 10:29:27AM +0000, Mark Rutland wrote: > To hot unplug a CPU, the idle task on that CPU calls a few layers of C > code before finally leaving the kernel. When KASAN is in use, poisoned > shadow is left around for each of the active stack frames, and when > shadow call stacks are in use. When shadow call stacks are in use the > task's saved SCS SP is left pointing at an arbitrary point within the > task's shadow call stack. > > When a CPU is offlined than onlined back into the kernel, this stale > state can adversely affect execution. Stale KASAN shadow can alias new > stackframes and result in bogus KASAN warnings. A stale SCS SP is > effectively a memory leak, and prevents a portion of the shadow call > stack being used. Across a number of hotplug cycles the idle task's > entire shadow call stack can become unusable. > > We previously fixed the KASAN issue in commit: > > e1b77c92981a5222 ("sched/kasan: remove stale KASAN poison after hotplug") > > ... by removing any stale KASAN stack poison immediately prior to > onlining a CPU. > > Subsequently in commit: > > f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled") > > ... the refactoring left the KASAN and SCS cleanup in one-time idle > thread initialization code rather than something invoked prior to each > CPU being onlined, breaking both as above. > > We fixed SCS (but not KASAN) in commit: > > 63acd42c0d4942f7 ("sched/scs: Reset the shadow stack when idle_task_exit") > > ... but as this runs in the context of the idle task being offlined it's > potentially fragile. > > To fix these consistently and more robustly, reset the SCS SP and KASAN > shadow of a CPU's idle task immediately before we online that CPU in > bringup_cpu(). This ensures the idle task always has a consistent state > when it is running, and removes the need to so so when exiting an idle > task. > > Whenever any thread is created, dup_task_struct() will give the task a > stack which is free of KASAN shadow, and initialize the task's SCS SP, > so there's no need to specially initialize either for idle thread within > init_idle(), as this was only necessary to handle hotplug cycles. > > I've tested this with both GCC and clang, with relevant options enabled, > offlining and onlining CPUs with: > > | while true; do > | for C in /sys/devices/system/cpu/cpu*/online; do > | echo 0 > $C; > | echo 1 > $C; > | done > | done > > Link: https://lore.kernel.org/lkml/20211012083521.973587-1-woodylin@google.com/ > Link: https://lore.kernel.org/linux-arm-kernel/YY9ECKyPtDbD9q8q@qian-HP-Z2-SFF-G5-Workstation/ > Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/ > Fixes: 1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled") Ugh, that should be f1a0a376ca0c4ef1 (with a leading 'f'), as in the body of the commit message. I can fix that up for v3, unless someone's happy to fix that up when picking this up. Thanks, Mark. > Reported-by: Qian Cai > Signed-off-by: Mark Rutland > Reviewed-by: Valentin Schneider > Tested-by: Qian Cai > Cc: Catalin Marinas > Cc: Dmitry Vyukov > Cc: Peter Zijlstra > Cc: Will Deacon > Cc: Woody Lin > --- > kernel/cpu.c | 7 +++++++ > kernel/sched/core.c | 4 ---- > 2 files changed, 7 insertions(+), 4 deletions(-) > > Since v1 [1]: > * Clarify commit message > * Fix typos > * Accumulate tags > > [1] https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/ > > diff --git a/kernel/cpu.c b/kernel/cpu.c > index 192e43a87407..407a2568f35e 100644 > --- a/kernel/cpu.c > +++ b/kernel/cpu.c > @@ -31,6 +31,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -588,6 +589,12 @@ static int bringup_cpu(unsigned int cpu) > int ret; > > /* > + * Reset stale stack state from the last time this CPU was online. > + */ > + scs_task_reset(idle); > + kasan_unpoison_task_stack(idle); > + > + /* > * Some architectures have to walk the irq descriptors to > * setup the vector space for the cpu which comes online. > * Prevent irq alloc/free across the bringup. > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 3c9b0fda64ac..76f9deeaa942 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -8619,9 +8619,6 @@ void __init init_idle(struct task_struct *idle, int cpu) > idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY; > kthread_set_per_cpu(idle, cpu); > > - scs_task_reset(idle); > - kasan_unpoison_task_stack(idle); > - > #ifdef CONFIG_SMP > /* > * It's possible that init_idle() gets called multiple times on a task, > @@ -8777,7 +8774,6 @@ void idle_task_exit(void) > finish_arch_post_lock_switch(); > } > > - scs_task_reset(current); > /* finish_cpu(), as ran on the BP, will clean up the active_mm state */ > } > > -- > 2.11.0 >