From: Alexander Potapenko
Date: Thu, 19 Dec 2019 15:16:16 +0100
Subject: Re: [PATCH RFC v3 10/36] kmsan: add KMSAN runtime
To: Marco Elver
Cc: Wolfram Sang, Vegard Nossum, Dmitry Vyukov,
    Linux Memory Management List, Al Viro, Andreas Dilger, Andrew Morton,
    Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
    Arnd Bergmann, Christoph Hellwig, "Darrick J. Wong", David Miller,
    Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
    Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
    Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
    "Martin K. Petersen", Martin Schwidefsky, Matthew Wilcox,
    "Michael S. Tsirkin", Michal Simek, Petr Mladek, Qian Cai, Randy Dunlap,
    Robin Murphy, Sergey Senozhatsky, Steven Rostedt, Takashi Iwai,
    "Theodore Ts'o", Thomas Gleixner, Vasily Gorbik
In-Reply-To: <20191129160725.GA236510@google.com>

> Could all these just be using '.macro .endm'?

Done in v4.

> > +__no_sanitize_memory
> > +static inline unsigned long KMSAN_INIT_8(unsigned long value)
> > +{
> > +	return value;
> > +}
>
> Should the above be __always_inline?

No. __always_inline forces a non-instrumented function to be inlined
into its instrumented caller, which results in the former being
instrumented. I've updated the comment to reflect that.

> Does it make sense to use u8, u16, u32, u64 here -- just in case it's
> ported to other architectures in future?

Done in v4.

> > +	default: \
> > +		BUILD_BUG_ON(1); \
> > +	} \
> > +	__ret; \
> > +	}) /**/
>
> Is the /**/ needed?

No, as long as we use .macro and .endm.

> It would be good to add doc comments to all API functions.
Done in v4.

> > +extern bool kmsan_ready;
>
> What does this variable mean. Would 'kmsan_enabled' be more accurate?

I think kmsan_inited is a better name, if we want to change it at all.
kmsan_enabled somewhat implies KMSAN can be disabled.

> This is in include/linux -- do they need a KMSAN_ prefix to not clash
> with other definitions?

Done in v4.

> > +#define KMSAN_PARAM_SIZE 800
> > +
> > +#define PARAM_ARRAY_SIZE (KMSAN_PARAM_SIZE / sizeof(depot_stack_handle_t))
>
> Similar here -- does it need a KMSAN_ prefix?

Done in v4.

> > +void kmsan_clear_page(void *page_addr);
>
> It would be good to have doc comments for each of them.

Done in v4.

> > +
> > +KMSAN_SANITIZE := n
> > +KCOV_INSTRUMENT := n
>
> Does KMSAN work together with UBSAN? In that case may this needs a
> UBSAN_SANITIZE := n

Done.

> > +#include
> > +#include
> > +#include
> > +#include
> > +
> > +#include
>
> Why the space above the mmzone.h include?

Removed it, also fixed the include order for this file.

> > +/*
> > + * Some kernel asm() calls mention the non-existing |__force_order| variable
> > + * in the asm constraints to preserve the order of accesses to control
> > + * registers. KMSAN turns those mentions into actual memory accesses, therefore
> > + * the variable is now required to link the kernel.
> > + */
> > +unsigned long __force_order;
>
> Not sure if this is related, but when compiling with KMSAN I get
>
>   ERROR: "__force_order" [drivers/misc/lkdtm/lkdtm.ko] undefined!
>
> with a default config with KMSAN selected.

Added an EXPORT_SYMBOL to fix this.

> > +bool kmsan_ready;
> > +#define KMSAN_STACK_DEPTH 64
> > +#define MAX_CHAIN_DEPTH 7
>
> Should these defines be above the variable definitions?

Done.

> Why not just 'panic("%s: ...", __func__, ...)' ?
>
> If the BUG() should not be here, then maybe just WARN_ON?

Replaced with panic().
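For readers following the u8/u16/u32/u64 point above, here is a rough
user-space sketch of what size-based dispatch with fixed-width types can
look like. It is not the v4 code: the names init_u8() .. init_u64() and
INIT_VALUE() are hypothetical, <stdint.h> types stand in for the kernel's
u8 .. u64, _Static_assert stands in for BUILD_BUG_ON(), and the macro
relies on GNU statement expressions (GCC/Clang), as the quoted kernel
macro does.

```c
#include <stdint.h>

/*
 * Hypothetical identity helpers, one per operand width. In the kernel
 * they would be marked __no_sanitize_memory and take u8/u16/u32/u64;
 * <stdint.h> types are used here so the sketch builds in user space.
 */
static inline uint8_t init_u8(uint8_t value) { return value; }
static inline uint16_t init_u16(uint16_t value) { return value; }
static inline uint32_t init_u32(uint32_t value) { return value; }
static inline uint64_t init_u64(uint64_t value) { return value; }

/*
 * Pick the helper by sizeof(val). Integer operands only.
 * The _Static_assert rejects unsupported widths at compile time,
 * which is the job BUILD_BUG_ON() does in the quoted macro.
 */
#define INIT_VALUE(val) \
	({ \
		__typeof__(val) __ret = (val); \
		_Static_assert(sizeof(val) == 1 || sizeof(val) == 2 || \
			       sizeof(val) == 4 || sizeof(val) == 8, \
			       "unsupported operand size"); \
		switch (sizeof(val)) { \
		case 1: \
			__ret = (__typeof__(val))init_u8((uint8_t)__ret); \
			break; \
		case 2: \
			__ret = (__typeof__(val))init_u16((uint16_t)__ret); \
			break; \
		case 4: \
			__ret = (__typeof__(val))init_u32((uint32_t)__ret); \
			break; \
		case 8: \
			__ret = (__typeof__(val))init_u64((uint64_t)__ret); \
			break; \
		} \
		__ret; \
	})
```

With fixed-width helpers the width of each variant no longer depends on
what unsigned long happens to be on a given architecture, which is the
portability concern raised in the review.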
> > +
> > +/*
> > + * TODO(glider): writing an initialized byte shouldn't zero out the origin, if
> > + * the remaining three bytes are uninitialized.
> > + */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Filed https://github.com/google/kmsan/issues/70 to track this. This
isn't a showstopper.

> > +	if (checked && !metadata_is_contiguous(addr, size, META_ORIGIN)) {
> > +		kmsan_pr_locked("WARNING: not setting origin for %d bytes starting at %px, because the metadata is incontiguous\n", size, addr);
> > +		BUG();
>
> Just panic?

Done.

> > +/*
> > + * TODO(glider): this check shouldn't be performed for origin pages, because
> > + * they're always accessed after the shadow pages.
> > + */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Dropped the TODO. This is somewhat perfectionist.

> > +	if (origin_p) {
> > +		kmsan_pr_locked("Origin: %08x\n", *origin_p);
> > +		kmsan_print_origin(*origin_p);
> > +	} else {
> > +		kmsan_pr_locked("Origin: unavailable\n");
> > +	}
>
> These repeated calls to kmsan_pr_locked seem unnecessary. There is
> nothing ensuring atomicity of all these print calls w.r.t. reporting.

Replaced them with pr_err().

> > +/* Stolen from kernel/printk/internal.h */
> > +#define PRINTK_SAFE_CONTEXT_MASK 0x3fffffff
>
> Is this used anywhere?

No. Removed it.

> > +/* Called by kmsan_report.c under a lock. */
> > +#define kmsan_pr_err(...) pr_err(__VA_ARGS__)
>
> Why is this macro needed? It's never redefined, so in the places it is
> used, you can just use pr_err. For readability I would avoid unnecessary
> aliases, but if there is a genuine reason this may be needed in future,
> I would just add a comment.

I've removed the macro.

> > +/* Used in other places - doesn't require a lock. */
> > +#define kmsan_pr_locked(...) \
> > +	do { \
> > +		unsigned long flags; \
> > +		spin_lock_irqsave(&report_lock, flags); \
> > +		pr_err(__VA_ARGS__); \
> > +		spin_unlock_irqrestore(&report_lock, flags); \
> > +	} while (0)
>
> Is this macro needed? The only reason it sort of makes sense is to
> serialize a report with other printing, but otherwise pr_err already
> makes sure things are serialized properly.

Yes, this was the intention. On the other hand, this lock doesn't
prevent non-KMSAN code from messing up KMSAN reports, so it makes little
sense. Maybe we can just keep the spinlock to separate the reports from
each other.

> > +enum KMSAN_BUG_REASON {
> > +	REASON_ANY = 0,
> > +	REASON_COPY_TO_USER = 1,
> > +	REASON_USE_AFTER_FREE = 2,
> > +	REASON_SUBMIT_URB = 3,
> > +};
>
> Is it required to explicitly assign constants to these?

No. Removed the constants.

> > +#define LEAVE_RUNTIME(irq_flags) \
> > +	do { \
> > +		this_cpu_dec(kmsan_in_runtime); \
> > +		if (this_cpu_read(kmsan_in_runtime)) { \
> > +			kmsan_pr_err("kmsan_in_runtime: %d\n", \
> > +				this_cpu_read(kmsan_in_runtime)); \
> > +			BUG(); \
> > +		} \
> > +		restart_nmi(); \
> > +		local_irq_restore(irq_flags); \
> > +		preempt_enable(); } while (0)
>
> Could these not be macros, and instead be static __always_inline
> functions?

Done.

> > +static void kmsan_context_exit(void)
> > +{
> > +	int level = this_cpu_read(kmsan_context_level) - 1;
> > +
> > +	BUG_ON(level < 0);
> > +	this_cpu_write(kmsan_context_level, level);
> > +}
>
> These are not preemption-safe. this_cpu_dec_return followed by the
> BUG_ON should be sufficient. Similarly above and below (using
> this_cpu_add_return)

Good catch, thank you!

> > +void kmsan_interrupt_exit(void)
> > +{
> > +	int in_interrupt = this_cpu_read(kmsan_in_interrupt);
> > +
> > +	BUG_ON(!in_interrupt);
> > +	kmsan_context_exit();
> > +	/* Can't check preempt_count() here, it may be zero. */
> > +	this_cpu_write(kmsan_in_interrupt, in_interrupt - 1);
> > +}
> > +EXPORT_SYMBOL(kmsan_interrupt_exit);
>
> Why exactly does kmsan_in_interrupt need to be maintained here? I can't
> see them being used anywhere else. Is it only for the BUG_ON?

Yes, initially some consistency checks made sense. I think it's safe to
delete them now.

> > +void kmsan_softirq_exit(void)
> > +{
> > +	bool in_softirq = this_cpu_read(kmsan_in_softirq);
> > +
> > +	BUG_ON(!in_softirq);
> > +	kmsan_context_exit();
> > +	/* Can't check preempt_count() here, it may be zero. */
> > +	this_cpu_write(kmsan_in_softirq, false);
> > +}
> > +EXPORT_SYMBOL(kmsan_softirq_exit);
>
> Same question here for kmsan_in_softirq.

Ditto.

> > +void kmsan_nmi_exit(void)
> > +{
> > +	bool in_nmi = this_cpu_read(kmsan_in_nmi);
> > +
> > +	BUG_ON(!in_nmi);
> > +	BUG_ON(preempt_count() & NMI_MASK);
> > +	kmsan_context_exit();
> > +	this_cpu_write(kmsan_in_nmi, false);
> > +
> > +}
> > +EXPORT_SYMBOL(kmsan_nmi_exit);
>
> And same question here for kmsan_in_nmi.

Ditto.

> > +
> > +/*
> > + * Record a range of memory for which the metadata pages will be created once
> > + * the page allocator becomes available.
> > + * TODO(glider): squash together ranges belonging to the same page.
> > + */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Removed the TODO. There's a problem with non-contiguous pages which is
tracked at https://github.com/google/kmsan/issues/71

> > +	/*
> > +	 * TODO(glider): alloc_node_data() in arch/x86/mm/numa.c uses
> > +	 * sizeof(pg_data_t).
> > +	 */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

Resolved this (the code is actually correct).

> > +
> > +	if (IN_RUNTIME()) {
> > +		/*
> > +		 * TODO(glider): looks legit. depot_save_stack() may call
> > +		 * free_pages().
> > +		 */
>
> What needs to be done to address the TODO? Just adding a comment is
> fine (or if the TODO can be resolved that's also fine).

I've just dropped this if-clause.

> > +		return;
> > +	}
> > +
> > +	ENTER_RUNTIME(irq_flags);
> > +	shadow = shadow_page_for(&page[0]);
> > +	origin = origin_page_for(&page[0]);
> > +
> > +	/* TODO(glider): this is racy. */
>
> Can this be fixed or does the race not matter -- in the latter case,
> just remove the TODO and turn it into a NOTE or similar.

It doesn't matter. Removed the comment.

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Managing directors: Paul Manicle, Halimah DeLaine Prado
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
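
On the preemption-safety remark about kmsan_context_exit() above: the
quoted code reads the counter, computes the new value, and writes it back
in separate steps, so anything that runs between the read and the write
can have its own update overwritten. The sketch below is a user-space
analogue only, not kernel code. The names context_level, context_enter()
and context_exit() are hypothetical, and a C11 atomic stands in for the
per-CPU variable. Note the difference in guarantees: the kernel's
this_cpu_*_return() operations are safe against preemption and interrupts
on the local CPU, whereas the C11 atomic used here is a stronger,
cross-thread guarantee chosen simply because it is what user space offers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for one CPU's kmsan_context_level slot (hypothetical name). */
static atomic_int context_level;

/*
 * Increment and obtain the new value in one step, analogous in spirit to
 * this_cpu_add_return(): there is no gap between the read and the
 * write-back in which a conflicting update could be lost.
 */
static int context_enter(void)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one. */
	return atomic_fetch_add(&context_level, 1) + 1;
}

/*
 * Decrement and check the result, analogous in spirit to
 * this_cpu_dec_return() followed by BUG_ON(level < 0).
 */
static int context_exit(void)
{
	int level = atomic_fetch_sub(&context_level, 1) - 1;

	assert(level >= 0);	/* plays the role of BUG_ON(level < 0) */
	return level;
}
```

The point of the single-step form is that the value being checked is the
value that was actually stored, rather than one computed from a read that
may already be stale.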