From: Alexander Potapenko <glider@google.com>
To: Marco Elver <elver@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrey Konovalov <andreyknvl@google.com>,
	Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Borislav Petkov <bp@alien8.de>, Christoph Hellwig <hch@lst.de>,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Ilya Leoshkevich <iii@linux.ibm.com>,
	Ingo Molnar <mingo@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Kees Cook <keescook@chromium.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Pekka Enberg <penberg@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Vegard Nossum <vegard.nossum@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	kasan-dev <kasan-dev@googlegroups.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linux-Arch <linux-arch@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 06/45] kmsan: add ReST documentation
Date: Fri, 15 Jul 2022 09:42:37 +0200
Message-ID: <CAG_fn=X5w5F1rwHuQqQ9GRYT4MiNGQLh71FRN16Wy3rGJLX_AA@mail.gmail.com>
In-Reply-To: <CANpmjNN=XO=6rpV-KS2xq=3fiV1L3wCL1DFwLes-CJsi=6ZmcQ@mail.gmail.com>

> To be consistent with other tools, I think we have settled on "The
> Kernel <...> Sanitizer (K?SAN)", see
> Documentation/dev-tools/k[ac]san.rst. So this will be "The Kernel
> Memory Sanitizer (KMSAN)".

Done (will appear in v5).


> -> "The third stack trace ..."
> (Because it looks like there's also another stack trace in the middle
> and "lower" is ambiguous)

Done

>
> > +where this variable was created.
> > +
> > +The upper stack shows where the uninit value was used - in
>
> -> "The first stack trace shows where the uninit value was used (in
> ``test_uninit_kmsan_check_memory()``)."
Done

> > +KMSAN and Clang
> > +===============
>
> The KASAN documentation has a section on "Support" which lists
> architectures and compilers supported. I'd try to mirror (or improve
> on) that.

Renamed this section to "Support" and added a line about the supported
architectures (x86_64).

>
> > +In order for KMSAN to work the kernel must be built with Clang, which so far is
> > +the only compiler that has KMSAN support. The kernel instrumentation pass is
> > +based on the userspace `MemorySanitizer tool`_.
> > +
> > +How to build
> > +============
>
> I'd call it "Usage", like in the KASAN and KCSAN documentation.
Done

>
> > +In order to build a kernel with KMSAN you will need a fresh Clang (14.0.0+).
> > +Please refer to `LLVM documentation`_ for the instructions on how to build Clang.
> > +
> > +Now configure and build the kernel with CONFIG_KMSAN enabled.
>
> I would move build/usage instructions right after introduction as
> that's most likely what users of KMSAN will want to know about first.

Done

> > +How KMSAN works
> > +===============
> > +
> > +KMSAN shadow memory
> > +-------------------
> > +
> > +KMSAN associates a metadata byte (also called shadow byte) with every byte of
> > +kernel memory. A bit in the shadow byte is set iff the corresponding bit of the
> > +kernel memory byte is uninitialized. Marking the memory uninitialized (i.e.
> > +setting its shadow bytes to ``0xff``) is called poisoning, marking it
> > +initialized (setting the shadow bytes to ``0x00``) is called unpoisoning.
> > +
> > +When a new variable is allocated on the stack, it is poisoned by default by
> > +instrumentation code inserted by the compiler (unless it is a stack variable
> > +that is immediately initialized). Any new heap allocation done without
> > +``__GFP_ZERO`` is also poisoned.
> > +
> > +Compiler instrumentation also tracks the shadow values with the help from the
> > +runtime library in ``mm/kmsan/``.
>
> This sentence might still be confusing. I think it should highlight
> that runtime and compiler go together, but depending on the scope of
> the value, the compiler invokes the runtime to persist the shadow.

Changed to:
"""
Compiler instrumentation also tracks the shadow values as they are used along
the code. When needed, instrumentation code invokes the runtime library in
``mm/kmsan/`` to persist shadow values.
"""
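
For readers who want to see the poisoning convention from the quoted text
in action, here is a small userspace model. It is only a sketch: struct
toy_mem and the toy_*() helpers are hypothetical names invented for this
illustration, not part of the KMSAN runtime, and the real shadow lives in
separate metadata pages rather than next to the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIZE 16

/* Toy model (assumption): one shadow byte per data byte, side by side. */
struct toy_mem {
	uint8_t data[TOY_SIZE];
	uint8_t shadow[TOY_SIZE];
};

/* Poison [off, off + len): set every shadow bit, i.e. "uninitialized". */
static void toy_poison(struct toy_mem *m, size_t off, size_t len)
{
	assert(off <= TOY_SIZE && len <= TOY_SIZE - off);
	memset(m->shadow + off, 0xff, len);
}

/* Unpoison [off, off + len): clear every shadow bit, i.e. "initialized". */
static void toy_unpoison(struct toy_mem *m, size_t off, size_t len)
{
	assert(off <= TOY_SIZE && len <= TOY_SIZE - off);
	memset(m->shadow + off, 0x00, len);
}

/* Storing an initialized value also clears the shadow of that byte. */
static void toy_store(struct toy_mem *m, size_t off, uint8_t v)
{
	assert(off < TOY_SIZE);
	m->data[off] = v;
	m->shadow[off] = 0x00;
}

/* Return 1 iff no bit in [off, off + len) is marked uninitialized. */
static int toy_is_initialized(const struct toy_mem *m, size_t off, size_t len)
{
	size_t i;

	assert(off <= TOY_SIZE && len <= TOY_SIZE - off);
	for (i = 0; i < len; i++)
		if (m->shadow[off + i])
			return 0;
	return 1;
}
```

A freshly poisoned buffer fails toy_is_initialized() until each byte in the
checked range has been stored to or the range has been unpoisoned.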

> > +
> > +
>
> There are 2 blank lines here, which is inconsistent with the rest of
> the document.

Fixed

> > +Origin tracking
> > +---------------
> > +
> > +Every four bytes of kernel memory also have a so-called origin assigned to
>
> Is "assigned" or "mapped" more appropriate here?

I think this wording was originally meant to cover origin values that exist
in SSA as well as in memory, so not all of them were "mapped".
On the other hand, here we are talking about bytes in memory, so "mapped" is
fine.

> > +them. This origin describes the point in program execution at which the
> > +uninitialized value was created. Every origin is associated with either the
> > +full allocation stack (for heap-allocated memory), or the function containing
> > +the uninitialized variable (for locals).
> > +
> > +When an uninitialized variable is allocated on stack or heap, a new origin
> > +value is created, and that variable's origin is filled with that value.
> > +When a value is read from memory, its origin is also read and kept together
> > +with the shadow. For every instruction that takes one or more values the origin
>
> s/values the origin/values, the origin/
Done, thanks!
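
To illustrate the four-byte origin granularity described in the quoted
paragraph, a rough userspace sketch follows. The names (toy_origins,
toy_origin_slot(), toy_set_origin()) and the 16-byte toy region are
assumptions made for this example only; the real origin values are stack
depot handles managed by the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BYTES 16
#define TOY_ORIGIN_GRANULE 4	/* one origin per four bytes of memory */

/* Toy origin storage (assumption): one 32-bit id per 4-byte granule. */
static uint32_t toy_origins[TOY_BYTES / TOY_ORIGIN_GRANULE];

/* Map a byte offset to the index of the origin slot covering it. */
static size_t toy_origin_slot(size_t off)
{
	assert(off < TOY_BYTES);
	return off / TOY_ORIGIN_GRANULE;
}

/* Record origin id for every slot that overlaps [off, off + len). */
static void toy_set_origin(size_t off, size_t len, uint32_t id)
{
	size_t first, last, i;

	assert(len > 0 && off < TOY_BYTES && len <= TOY_BYTES - off);
	first = toy_origin_slot(off);
	last = toy_origin_slot(off + len - 1);
	for (i = first; i <= last; i++)
		toy_origins[i] = id;
}
```

One consequence visible in this model is that neighbouring bytes inside the
same four-byte granule necessarily share a single origin.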


> > +
> > +If ``a`` is initialized and ``b`` is not, the shadow of the result would be
> > +0xffff0000, and the origin of the result would be the origin of ``b``.
> > +``ret.s[0]`` would have the same origin, but it will be never used, because
>
> s/be never/never be/
Done
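
The code example that the quoted paragraph refers to is trimmed from this
reply. As a rough stand-in (not the patch's code), the following userspace
model shows how the 0xffff0000 shadow arises when two 16-bit values are
packed into one 32-bit result. The struct names, toy_combine(), and the use
of 0 for "no origin" are all assumptions of this sketch, and the packing is
done with explicit shifts so that it does not depend on endianness.

```c
#include <stdint.h>

/* Toy values carrying their shadow and origin alongside (assumption). */
struct toy_val16 {
	uint16_t value;
	uint16_t shadow;	/* set bits mark uninitialized bits */
	uint32_t origin;	/* 0 means "no origin" in this model */
};

struct toy_val32 {
	uint32_t value;
	uint32_t shadow;
	uint32_t origin;
};

/*
 * Pack a into the low half and b into the high half of the result.
 * The shadow is packed the same way as the data; the origin is taken
 * from one of the uninitialized inputs, if there is any.
 */
static struct toy_val32 toy_combine(struct toy_val16 a, struct toy_val16 b)
{
	struct toy_val32 r;

	r.value = (uint32_t)a.value | ((uint32_t)b.value << 16);
	r.shadow = (uint32_t)a.shadow | ((uint32_t)b.shadow << 16);
	if (b.shadow)
		r.origin = b.origin;
	else if (a.shadow)
		r.origin = a.origin;
	else
		r.origin = 0;
	return r;
}
```

With a fully initialized a (shadow 0x0000) and a fully uninitialized b
(shadow 0xffff), the result's shadow is 0xffff0000 and its origin is that
of b, matching the quoted description.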

> > +Passing uninitialized values to functions
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +KMSAN instrumentation pass has an option, ``-fsanitize-memory-param-retval``,
>
> "KMSAN instrumentation pass" -> "Clang's instrumentation support" ?
> Because it seems wrong to say that KMSAN has the instrumentation pass.
How about "Clang's MSan instrumentation pass"?

> > +
> > +Sometimes the pointers passed into inline assembly do not point to valid memory.
> > +In such cases they are ignored at runtime.
> > +
> > +Disabling the instrumentation
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> It would be useful to move this section somewhere to the beginning,
> closer to usage and the example, as this is information that a user of
> KMSAN might want to know (but they might not want to know much about
> how KMSAN works).

I restructured the TOC as follows:

== The Kernel Memory Sanitizer (KMSAN)
== Usage
--- Building the kernel
--- Example report
--- Disabling the instrumentation
== Support
== How KMSAN works
--- KMSAN shadow memory
--- Origin tracking
~~~~ Origin chaining
--- Clang instrumentation API
~~~~ Shadow manipulation
~~~~ Handling locals
~~~~ Access to per-task data
~~~~ Passing uninitialized values to functions
~~~~ String functions
~~~~ Error reporting
~~~~ Inline assembly instrumentation
--- Runtime library
~~~~ Per-task KMSAN state
~~~~ KMSAN contexts
~~~~ Metadata allocation
== References


> > +Another function attribute supported by KMSAN is ``__no_sanitize_memory``.
> > +Applying this attribute to a function will result in KMSAN not instrumenting it,
> > +which can be helpful if we do not want the compiler to mess up some low-level
>
> s/mess up/interfere with/
Done

> > +code (e.g. that marked with ``noinstr``).
>
> maybe "... (e.g. that marked with ``noinstr``, which implicitly adds
> ``__no_sanitize_memory``)."

Done

> otherwise people might think that it's necessary to add
> __no_sanitize_memory explicitly to noinstr.

Good point!
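
For illustration, this is roughly how the attribute could be applied to a
function. The empty fallback definition of __no_sanitize_memory is only a
stand-in so that the snippet builds outside the kernel tree (in the kernel
the macro comes from the compiler headers), and toy_raw_copy() is a
hypothetical helper, not a function from the patch set.

```c
#include <stddef.h>

/* Stand-in (assumption) so the snippet compiles outside the kernel. */
#ifndef __no_sanitize_memory
#define __no_sanitize_memory
#endif

/*
 * Hypothetical low-level helper that we do not want KMSAN to instrument.
 * Code already marked noinstr would not need the explicit attribute.
 */
static __no_sanitize_memory void toy_raw_copy(unsigned char *dst,
					      const unsigned char *src,
					      size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}
```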

> > +    ...
> > +    struct kmsan_context kmsan;
> > +    ...
> > +  }
> > +
> > +
>
> 1 blank line instead of 2?
Done

> > +This means that in general for two contiguous memory pages their shadow/origin
> > +pages may not be contiguous. So, if a memory access crosses the boundary
>
> s/So, /Consequently, /
Done
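
The quoted sentence is cut off here, but the underlying concern is an
access that spans two pages whose metadata pages may not be adjacent. A
minimal sketch of detecting such an access is shown below; TOY_PAGE_SIZE
(assumed to be 4096) and both helpers are hypothetical names for this
example and do not claim to reflect how the runtime handles the case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Return 1 iff [addr, addr + size) touches more than one page. */
static int toy_crosses_page(uintptr_t addr, size_t size)
{
	assert(size > 0);
	return (addr / TOY_PAGE_SIZE) != ((addr + size - 1) / TOY_PAGE_SIZE);
}

/* Number of bytes of the access that fall into its first page. */
static size_t toy_first_chunk(uintptr_t addr, size_t size)
{
	size_t room = TOY_PAGE_SIZE - (size_t)(addr % TOY_PAGE_SIZE);

	return size < room ? size : room;
}
```

An 8-byte access at offset 4092 crosses a page boundary, with 4 bytes in
the first page; the metadata for the remaining 4 bytes would have to be
looked up separately for the second page.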


-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Managing directors: Paul Manicle, Liana Sebastian
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
