From: Alexander Potapenko <glider@google.com>
To: Andrey Konovalov <andreyknvl@google.com>
Cc: Wolfram Sang <wsa@the-dreams.de>,
	Vegard Nossum <vegard.nossum@oracle.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Andy Lutomirski <luto@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Christoph Hellwig <hch@infradead.org>,
	Christoph Hellwig <hch@lst.de>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	"David S. Miller" <davem@davemloft.net>,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>,
	Eric Biggers <ebiggers@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Eric Van Hensbergen <ericvh@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Harry Wentland <harry.wentland@amd.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Ilya Leoshkevich <iii@linux.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Jason Wang <jasowang@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Marco Elver <elver@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Michal Simek <monstr@monstr.eu>,
	Petr Mladek <pmladek@suse.com>,
	Qian Cai <cai@lca.pw>,
	Randy Dunlap <rdunlap@infradead.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Takashi Iwai <tiwai@suse.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH RFC v3 10/36] kmsan: add KMSAN runtime
Date: Fri, 20 Dec 2019 19:58:25 +0100
Message-ID: <CAG_fn=V8=UfMH+c+SLgGi9M+DVE53=UPS1=ZmGFDqmHTzLLEfw@mail.gmail.com>
In-Reply-To: <CAAeHK+xoh5gjrsai5fe6_meWrskhXbiR4ubT5hy2yZFFuMr5aw@mail.gmail.com>

> 1. There's a lot of TODOs in the code. They either need to be resolved
> or removed.

Done in v4

> 2. This patch is huge, would it be possible to split it? One way to do
> this is to have two parts, one that adds the headers and empty hooks,
> and the other one that adds hooks implementations. Or something like
> that, if that's feasible at all.

I've split away kmsan_hooks.c, kmsan_entry.c and kmsan_instr.c
Let's see if that helps.

> > + * Adopted from KTSAN assembly hooks implementation by Dmitry Vyukov:
> > + * https://github.com/google/ktsan/blob/ktsan/arch/x86/include/asm/ktsan.h
>
> This link can get stale. Maybe just link the repo?

Guess there's not much code left to credit, only a series of push
instructions. Not sure it's worth it. I've removed this comment.

> > + * KMSAN checks.
>
> This comment just repeats the file name. Maybe worth mentioning what
> exactly we are checking here, and how this header is different from
> kmsan.h. Perhaps some of the functions declared here should be moved
> there as well.

I've expanded the comment a bit, added doc comments and moved the
functions not supposed to be widely used to kmsan.h

> > +struct task_struct;
> > +struct vm_struct;
> > +
> > +
>
> Remove unneeded whitespace.

Done

> > + if (checked) {
> > + kmsan_pr_locked("WARNING: not memsetting %d bytes starting at %px, because the shadow is NULL\n", to_fill, address);
>
> Why not use WARN()?

Changed this to panic().

> > +
> > + if (!kmsan_ready)
> > + return 0;
>
> Do we need this check here?
>

Right, this is an internal function; callers must ensure we're not in
the runtime.

> > +
> > + if (!id)
> > + return id;
>
> And this one?
>

This one we do need, as there are cases in which the caller may pass a
zero origin to us.

> > + /* Lowest bit is the UAF flag, higher bits hold the depth. */
> > + extra_bits = (depth << 1) | (extra_bits & 1);
>
> Please add some helper functions/macros to work with extra_bits.

Done

> > + if (checked && !metadata_is_contiguous(addr, size, META_ORIGIN)) {
> > + kmsan_pr_locked("WARNING: not setting origin for %d bytes starting at %px, because the metadata is incontiguous\n", size, addr);
>
> Why not use WARN()?

Changed this to panic().

> > +
> > +struct kmsan_context_state *task_kmsan_context_state(void);
>
> s/task_kmsan_context_state/kmsan_task_context_state/ or something like that.

Changed to kmsan_task_context_state.

> > +{
> > + int in_interrupt = this_cpu_read(kmsan_in_interrupt);
> > +
> > + /* Turns out it's possible for in_interrupt to be >0 here. */
>
> Why/how? Expand the comment.

Dropped this function.

>
> [...]
>
> > +void kmsan_nmi_enter(void)
> > +{
> > + bool in_nmi = this_cpu_read(kmsan_in_nmi);
> > +
> > + BUG_ON(in_nmi);
> > + BUG_ON(preempt_count() & NMI_MASK);
>
> BUG_ON(in_nmi())?

I've actually dropped the context-specific functions, leaving only
kmsan_context_{enter,exit}.

> > +/*
> > + * The functions may call back to instrumented code, which, in turn, may call
> > + * these hooks again. To avoid re-entrancy, we use __GFP_NO_KMSAN_SHADOW.
> > + * Instrumented functions shouldn't be called under
> > + * ENTER_RUNTIME()/LEAVE_RUNTIME(), because this will lead to skipping
> > + * effects of functions like memset() inside instrumented code.
> > + */
>
> Add empty line.

Done

> > + LEAVE_RUNTIME(irq_flags);
> > +}
> > +EXPORT_SYMBOL(kmsan_task_create);
> > +
> > +
>
> Remove empty line.

Done

> > + return;
> > +
> > + ENTER_RUNTIME(irq_flags);
> > + if (flags & __GFP_ZERO) {
>
> No {} needed here.

Done

> > + if (s->ctor)
>
> Why don't we poison if there's a ctor? Some comment is needed.

Done

> > + if (!kmsan_ready || IN_RUNTIME())
> > + return;
> > + ENTER_RUNTIME(irq_flags);
> > + if (flags & __GFP_ZERO) {
>
> No {} needed here.

Done

> > + u8 *vaddr;
> > +
> > + if (!skb || !skb->len)
>
> We either need to check !skb before skb_headlen() or drop the check.

Done

> > + kmsan_internal_check_memory(skb->data, skb_headlen(skb), 0, REASON_ANY);
>
> Use start instead of calling skb_headlen(skb) again. Or just remove
> start and always call skb_headlen(skb).

Done

> > + skb_walk_frags(skb, frag_iter)
> > + kmsan_check_skb(frag_iter);
>
> Hm, won't this recursively walk the same list multiple times?

It should not. See the implementation of skb_dump().

> > +}
> > +
> > +extern char _sdata[], _edata[];
>
> #include <asm/sections.h>?

Didn't know that, thanks!

>
> > +
> > +
> > +
>
> Remove excessive whitespace.

Done

> > +
> > + for_each_reserved_mem_region(i, &p_start, &p_end) {
>
> No need for {} here.

Done

>
> > + for_each_online_node(nid)
> > + kmsan_record_future_shadow_range(
> > + NODE_DATA(nid), (char *)NODE_DATA(nid) + nd_size);
>
> Remove tab before (char *)NODE_DATA(nid).

Done

> > + * It's unlikely that the assembly will touch more than 512 bytes.
> > + */
> > + if (size > 512)
>
> Maybe do (WARN_ON(size > 512)) if this is something that we would want
> to detect?

Added a WARN_ONCE to that branch.

> > + /* Ok to skip address check here, we'll do it later. */
> > + shadow_dst = kmsan_get_metadata(dst, n, META_SHADOW);
>
> kmsan_memmove_metadata() performs this check, do we need it here? Same
> goes for other callers of kmsan_memmove/memcpy_metadata().
>

You're right. I've removed the extra checks.

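For reference, the extra_bits layout quoted earlier in this reply (lowest bit is the UAF flag, higher bits hold the depth) could be wrapped in helpers roughly like the ones below. This is only a userspace sketch under that stated layout; the helper names are hypothetical and are not claimed to match what v4 actually uses.

```c
#include <stdbool.h>

/*
 * Illustrative helpers for the extra_bits layout quoted in this thread:
 * bit 0 is the use-after-free (UAF) flag, the remaining bits hold the depth.
 * All names here are hypothetical, chosen for this sketch only.
 */

/* Pack a depth and a UAF flag into a single extra_bits value. */
static inline unsigned int make_extra_bits(unsigned int depth, bool uaf)
{
	return (depth << 1) | (uaf ? 1u : 0u);
}

/* Extract the UAF flag (bit 0). */
static inline bool extra_bits_uaf(unsigned int extra_bits)
{
	return (extra_bits & 1u) != 0;
}

/* Extract the depth (all bits above bit 0). */
static inline unsigned int extra_bits_depth(unsigned int extra_bits)
{
	return extra_bits >> 1;
}
```

With helpers of this shape, the quoted statement `extra_bits = (depth << 1) | (extra_bits & 1);` could be written as `extra_bits = make_extra_bits(depth, extra_bits_uaf(extra_bits));`, which keeps the bit layout in one place.
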
> > + kmsan_internal_memset_shadow(dst, shadow, n, /*checked*/false);
> > + new_origin = 0;
> > + kmsan_internal_set_origin(dst, n, new_origin);
>
> Do we need variables for shadow and new_origin here?

No. They were there in the hope of making __msan_memset() use the shadow
of |c| to initialize |dst|.
See https://github.com/google/kmsan/issues/63

> > + if (!kmsan_ready || IN_RUNTIME())
> > + return;
> > +
> > + while (size_copy) {
>
> Why not call kmsan_internal_poison_shadow()/kmsan_internal_memset_shadow()
> here instead of doing this manually?

Done

> > + if (!kmsan_ready || IN_RUNTIME())
> > + return;
> > +
> > + ENTER_RUNTIME(irq_flags);
> > + /* Assuming the shadow exists. */
>
> Why do we assume that shadow exists here, but not in
> __msan_poison_alloca()? Please expand the comment.

We can safely assume that in both cases, and it's a bug if the shadow
doesn't exist. I've removed the misleading comment.

> In some cases the caller of kmsan_print_origin() performs this check
> and prints a differently formatted message (metadata_is_contiguous())
> or no message at all (kmsan_report()). Some unification would be good.
>

Let's just bail out from kmsan_print_origin() if the origin is zero.
Only metadata_is_contiguous() may actually pass a zero origin, and the
message there is enough already.

> > + kmsan_pr_err("Local variable description: %s\n", descr);
> > + kmsan_pr_err("Variable was created at:\n");
>
> A shorter way: "Local variable %s created at: ...".

Done

> > + kmsan_pr_err("Uninit was created at:\n");
> > + if (entries)
>
> Should this rather check nr_entries?

SGTM

> > + stack_trace_print(entries, nr_entries, 0);
> > + else
> > + kmsan_pr_err("No stack\n");
>
> KASAN says "(stack is not available)" here. Makes sense to unify with this.

Done

>
> > + break;
> > + }
> > +}
> > +
> > +void kmsan_report(depot_stack_handle_t origin,
> > + void *address, int size, int off_first, int off_last,
> > + const void *user_addr, int reason)
>
> It's not really clear what off_first and off_last arguments are, and
> how the range that they describe is different from [address, address +
> size). Some comment would be good.

Added a doc comment to kmsan_report().

> > +
> > + nr_entries = stack_depot_fetch(origin, &entries);
>
> Do we need this here?

No, nor do we need nr_entries and entries.

> > +#define has_origin_page(page) \
> > + (!!((page)->origin))
>
> Something like this would take less space:
>
> #define shadow_page_for(page) ((page)->shadow)
> #define origin_page_for(page) ((page)->origin)
> ...

Done

> > + * Dummy load and store pages to be used when the real metadata is unavailable.
> > + * There are separate pages for loads and stores, so that every load returns a
> > + * zero, and every store doesn't affect other stores.
>
> every store doesn't affect other _reads_?

Ack

> > + BUG_ON(is_origin && !IS_ALIGNED(addr64, ORIGIN_SIZE));
> > + if (kmsan_internal_is_vmalloc_addr(addr)) {
>
> No need for {} here.

Done

> > + * none. The caller must check the return value for being non-NULL if needed.
> > + * The return value of this function should not depend on whether we're in the
> > + * runtime or not.
> > + */
> > +void *kmsan_get_metadata(void *address, size_t size, bool is_origin)
>
> This looks very similar to kmsan_get_shadow_origin_ptr(), would it be
> possible to unify them somehow or to split out common parts into a
> helper function?

I've rewritten kmsan_get_shadow_origin_ptr() to use
kmsan_get_metadata(). It might have become a bit slower (still worth
looking into), but there is less spaghetti code now.

> > + if (kmsan_internal_is_vmalloc_addr(address) ||
> > + kmsan_internal_is_module_addr(address)) {
>
> No need for {} here.

Done

> > + for (i = 0; i < pages; i++) {
> > + cp = &page[i];
> > + ignore_page(cp);
>
> ignore_page(&page[i])

Done

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg