From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.4 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A85CC432C0 for ; Fri, 29 Nov 2019 14:59:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A75D72070B for ; Fri, 29 Nov 2019 14:59:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="n8TQqeRw" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A75D72070B Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0F6FD6B0597; Fri, 29 Nov 2019 09:59:20 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0D0656B0598; Fri, 29 Nov 2019 09:59:20 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F276F6B0599; Fri, 29 Nov 2019 09:59:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0197.hostedemail.com [216.40.44.197]) by kanga.kvack.org (Postfix) with ESMTP id D04596B0597 for ; Fri, 29 Nov 2019 09:59:19 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id 8F4D08248D7C for ; Fri, 29 Nov 2019 14:59:19 +0000 (UTC) X-FDA: 76209623238.28.son08_514f8812e710e X-HE-Tag: son08_514f8812e710e X-Filterd-Recvd-Size: 14747 Received: from mail-pf1-f193.google.com (mail-pf1-f193.google.com [209.85.210.193]) by imf03.hostedemail.com (Postfix) with ESMTP for ; Fri, 29 Nov 2019 14:59:18 +0000 (UTC) Received: by mail-pf1-f193.google.com with SMTP id l22so2299661pff.9 for ; Fri, 29 Nov 2019 06:59:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=n5H1IIv1KYy7RcMVpCXKnlJrqOzetyIHY1t2GO7GqIw=; b=n8TQqeRwVmaLfleK0u3kAUCjwKh2ehEIk7mcDG8UkOhMb4ql4aO4hEkNHJUmY0fseO 7hGZGPlKmCBOYyueTxoXhjsXUY61qH29tfitC93HZsHXzCaTvRC02N7B3yFwkVwNDOG3 d4kv0EjcArE0H12HfNSR+ezsYCu/ZAJwWpXY19qe7/XpJnTPJACJCp74dzZNw3MilyYc hHKZ7RA3cGzPTDeJLBXvw+emdDWeW12DrXyXwx7rlDCTu8X5uwFi9OzfdAN4aUB/7v5n RrtX9icn63WR20w2w4EhrVQqqkopnyPCVoc+bNpIpM7CRZ39Z2EPRXq+s4rEJ60q8IBS X0XA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=n5H1IIv1KYy7RcMVpCXKnlJrqOzetyIHY1t2GO7GqIw=; b=pI61/nL2DbfsVF2BucpF3hsknB81psb+CCIO+n7wOmPvD1Nu5cYCCECPYldDw9i6nk 1Re1q+4Uyeo9LHe2agZBvvEOT7HKK6BfEp48p2hl3OlHqDvZ16Q5yVUuRJqPljx4pkG1 0T3RQ7WTR3ScTM71OTO4A1rKorx7VJ0zIbdDqNtYa4Six2aaURKqzqYyS2HvZFs3UC6h mi+ubq2aqZLLTlcNvziX3aCRn50qRjEUVLBB7KcQCQGyJoKNVUFn0PfC7dXaCRVWPgDl hmqrJ/PgLbc2d6k4pPgfDpSFgmw1mXy6eU5Y+yBFnMWaaUn6jFEOuJZjRwFuVCZFmYNT ULPw== X-Gm-Message-State: APjAAAWtKheUXoSIhNOzXhlIq/Gu8ObOSEwF0zgrp5cpyUFjNOwOcCe3 AuWmIAYQ4ugZ2H+5D4NpfxSPPOeHLZARadRygmtBiQ== X-Google-Smtp-Source: APXvYqzF5+L1S8u9Hut9jvRFO0vNIggulo4kqQzQFnV3BfJQWoylDx4Bic01jzBeUUmyKt+yDGSlp+H6zrV7CXJQC4U= X-Received: by 2002:aa7:9629:: with SMTP id r9mr948517pfg.51.1575039557180; Fri, 29 Nov 2019 06:59:17 -0800 (PST) MIME-Version: 1.0 References: <20191122112621.204798-1-glider@google.com> <20191122112621.204798-25-glider@google.com> In-Reply-To: <20191122112621.204798-25-glider@google.com> From: Andrey Konovalov Date: Fri, 29 Nov 2019 15:59:05 +0100 Message-ID: Subject: Re: [PATCH RFC v3 24/36] kmsan: disable instrumentation of certain functions To: Alexander Potapenko Cc: Thomas Gleixner , Andrew Morton , Vegard Nossum , Dmitry Vyukov , Linux Memory Management List , Alexander Viro , Andreas Dilger , Andrey Ryabinin , Andy Lutomirski , Ard Biesheuvel , Arnd Bergmann , Christoph Hellwig , Christoph Hellwig , darrick.wong@oracle.com, "David S. Miller" , Dmitry Torokhov , Eric Biggers , Eric Dumazet , ericvh@gmail.com, Greg Kroah-Hartman , harry.wentland@amd.com, Herbert Xu , iii@linux.ibm.com, mingo@elte.hu, Jason Wang , Jens Axboe , Marek Szyprowski , Marco Elver , Mark Rutland , "Martin K. Petersen" , Martin Schwidefsky , Matthew Wilcox , "Michael S . Tsirkin" , Michal Simek , pmladek@suse.com, Qian Cai , Randy Dunlap , Robin Murphy , sergey.senozhatsky@gmail.com, Steven Rostedt , Takashi Iwai , "Theodore Ts'o" , gor@linux.ibm.com, wsa@the-dreams.de Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Nov 22, 2019 at 12:27 PM wrote: > > Some functions are called from handwritten assembly, and therefore don't > have their arguments' metadata fully set up by the instrumentation code. > Mark them with __no_sanitize_memory to avoid false positives from > spreading further. > Certain functions perform task switching, so that the value of |current| > is different as they proceed. Because KMSAN state pointer is only read > once at the beginning of the function, touching it after |current| has > changed may be dangerous. > > Signed-off-by: Alexander Potapenko > To: Alexander Potapenko > Cc: Thomas Gleixner > Cc: Andrew Morton > Cc: Vegard Nossum > Cc: Dmitry Vyukov > Cc: linux-mm@kvack.org > --- > v3: > - removed TODOs from comments > > Change-Id: I684d23dac5a22eb0a4cea71993cb934302b17cea > --- > arch/x86/entry/common.c | 1 + > arch/x86/include/asm/irq_regs.h | 1 + > arch/x86/include/asm/syscall_wrapper.h | 1 + > arch/x86/kernel/apic/apic.c | 1 + > arch/x86/kernel/dumpstack_64.c | 1 + > arch/x86/kernel/process_64.c | 5 +++++ > arch/x86/kernel/traps.c | 12 ++++++++++-- > arch/x86/kernel/uprobes.c | 7 ++++++- > kernel/profile.c | 1 + > kernel/sched/core.c | 6 ++++++ > 10 files changed, 33 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c > index 3f8e22615812..0dd5b2acb355 100644 > --- a/arch/x86/entry/common.c > +++ b/arch/x86/entry/common.c > @@ -275,6 +275,7 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs) > } > > #ifdef CONFIG_X86_64 > +__no_sanitize_memory > __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs) > { > struct thread_info *ti; > diff --git a/arch/x86/include/asm/irq_regs.h b/arch/x86/include/asm/irq_regs.h > index 187ce59aea28..d65a00bd6f02 100644 > --- a/arch/x86/include/asm/irq_regs.h > +++ b/arch/x86/include/asm/irq_regs.h > @@ -14,6 +14,7 @@ > > DECLARE_PER_CPU(struct pt_regs *, irq_regs); > > +__no_sanitize_memory > static inline struct pt_regs *get_irq_regs(void) > { > return __this_cpu_read(irq_regs); > diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h > index e046a405743d..43910ce1b53b 100644 > --- a/arch/x86/include/asm/syscall_wrapper.h > +++ b/arch/x86/include/asm/syscall_wrapper.h > @@ -159,6 +159,7 @@ > ALLOW_ERROR_INJECTION(__x64_sys##name, ERRNO); \ > static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \ > static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\ > + __no_sanitize_memory \ > asmlinkage long __x64_sys##name(const struct pt_regs *regs) \ > { \ > return __se_sys##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\ > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c > index 9e2dd2b296cd..7b24bda22c38 100644 > --- a/arch/x86/kernel/apic/apic.c > +++ b/arch/x86/kernel/apic/apic.c > @@ -1118,6 +1118,7 @@ static void local_apic_timer_interrupt(void) > * [ if a single-CPU system runs an SMP kernel then we call the local > * interrupt as well. Thus we cannot inline the local irq ... ] > */ > +__no_sanitize_memory /* |regs| may be uninitialized */ The comment style around __no_sanitize_memory looks very different for different call sites, perhaps it makes sense to unify it somehow. > __visible void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs) > { > struct pt_regs *old_regs = set_irq_regs(regs); > diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c > index 753b8cfe8b8a..ba883d282a43 100644 > --- a/arch/x86/kernel/dumpstack_64.c > +++ b/arch/x86/kernel/dumpstack_64.c > @@ -143,6 +143,7 @@ static bool in_irq_stack(unsigned long *stack, struct stack_info *info) > return true; > } > > +__no_sanitize_memory > int get_stack_info(unsigned long *stack, struct task_struct *task, > struct stack_info *info, unsigned long *visit_mask) > { > diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c > index af64519b2695..70e33150a83a 100644 > --- a/arch/x86/kernel/process_64.c > +++ b/arch/x86/kernel/process_64.c > @@ -500,6 +500,11 @@ void compat_start_thread(struct pt_regs *regs, u32 new_ip, u32 new_sp) > * Kprobes not supported here. Set the probe on schedule instead. > * Function graph tracer not supported too. > */ > +/* > + * Avoid touching KMSAN state or reporting anything here, as __switch_to() does > + * weird things with tasks. > + */ > +__no_sanitize_memory > __visible __notrace_funcgraph struct task_struct * > __switch_to(struct task_struct *prev_p, struct task_struct *next_p) > { > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c > index 4bb0f8447112..a94282d1f60b 100644 > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -618,7 +618,10 @@ NOKPROBE_SYMBOL(do_int3); > * Help handler running on a per-cpu (IST or entry trampoline) stack > * to switch to the normal thread stack if the interrupted code was in > * user mode. The actual stack switch is done in entry_64.S > + * > + * This function switches the registers - don't instrument it with KMSAN! > */ > +__no_sanitize_memory > asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs) > { > struct pt_regs *regs = (struct pt_regs *)this_cpu_read(cpu_current_top_of_stack) - 1; > @@ -634,6 +637,11 @@ struct bad_iret_stack { > }; > > asmlinkage __visible notrace > +/* > + * Dark magic happening here, let's not instrument this function. > + * Also avoid copying any metadata by using raw __memmove(). > + */ > +__no_sanitize_memory > struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s) > { > /* > @@ -648,10 +656,10 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s) > (struct bad_iret_stack *)this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1; > > /* Copy the IRET target to the new stack. */ > - memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8); > + __memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8); > > /* Copy the remainder of the stack from the current stack. */ > - memmove(new_stack, s, offsetof(struct bad_iret_stack, regs.ip)); > + __memmove(new_stack, s, offsetof(struct bad_iret_stack, regs.ip)); Looks like this change should go into a separate patch. > > BUG_ON(!user_mode(&new_stack->regs)); > return new_stack; > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c > index 8cd745ef8c7b..bcd4bf5a909f 100644 > --- a/arch/x86/kernel/uprobes.c > +++ b/arch/x86/kernel/uprobes.c > @@ -8,6 +8,7 @@ > * Jim Keniston > */ > #include > +#include > #include > #include > #include > @@ -997,9 +998,13 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs) > int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data) > { > struct die_args *args = data; > - struct pt_regs *regs = args->regs; > + struct pt_regs *regs; > int ret = NOTIFY_DONE; > > + kmsan_unpoison_shadow(args, sizeof(*args)); > + regs = args->regs; > + if (regs) > + kmsan_unpoison_shadow(regs, sizeof(*regs)); This one as well. > /* We are only interested in userspace traps */ > if (regs && !user_mode(regs)) > return NOTIFY_DONE; > diff --git a/kernel/profile.c b/kernel/profile.c > index af7c94bf5fa1..835a5b66d1a4 100644 > --- a/kernel/profile.c > +++ b/kernel/profile.c > @@ -399,6 +399,7 @@ void profile_hits(int type, void *__pc, unsigned int nr_hits) > } > EXPORT_SYMBOL_GPL(profile_hits); > > +__no_sanitize_memory > void profile_tick(int type) > { > struct pt_regs *regs = get_irq_regs(); > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index dd05a378631a..674d36fe9d44 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -475,6 +475,7 @@ void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task) > put_task_struct(task); > } > > +__no_sanitize_memory /* context switching here */ > void wake_up_q(struct wake_q_head *head) > { > struct wake_q_node *node = head->first; > @@ -3180,6 +3181,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev, > * past. prev == current is still correct but we need to recalculate this_rq > * because prev may have moved to another CPU. > */ > +__no_sanitize_memory /* |current| changes here */ > static struct rq *finish_task_switch(struct task_struct *prev) > __releases(rq->lock) > { > @@ -3986,6 +3988,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) > * > * WARNING: must be called with preemption disabled! > */ > +__no_sanitize_memory /* |current| changes here */ > static void __sched notrace __schedule(bool preempt) > { > struct task_struct *prev, *next; > @@ -4605,6 +4608,7 @@ int task_prio(const struct task_struct *p) > * > * Return: 1 if the CPU is currently idle. 0 otherwise. > */ > +__no_sanitize_memory /* nothing to report here */ > int idle_cpu(int cpu) > { > struct rq *rq = cpu_rq(cpu); > @@ -6544,6 +6548,7 @@ static struct kmem_cache *task_group_cache __read_mostly; > DECLARE_PER_CPU(cpumask_var_t, load_balance_mask); > DECLARE_PER_CPU(cpumask_var_t, select_idle_mask); > > +__no_sanitize_memory > void __init sched_init(void) > { > unsigned long ptr = 0; > @@ -6716,6 +6721,7 @@ static inline int preempt_count_equals(int preempt_offset) > return (nested == preempt_offset); > } > > +__no_sanitize_memory /* expect the arguments to be initialized */ > void __might_sleep(const char *file, int line, int preempt_offset) > { > /* > -- > 2.24.0.432.g9d3f5f5b63-goog >