* [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure
@ 2020-03-25 16:12 glider
  2020-03-25 16:12 ` [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t glider
                   ` (37 more replies)
  0 siblings, 38 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Alexander Viro, Andreas Dilger, Andrew Morton, Andrey Konovalov,
	Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Christoph Hellwig, Christoph Hellwig, Darrick J. Wong,
	David S. Miller, Dmitry Torokhov, Dmitry Vyukov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Marco Elver,
	Mark Rutland, Martin K. Petersen, Martin Schwidefsky,
	Matthew Wilcox, Michael S. Tsirkin, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Robin Murphy, Sergey Senozhatsky,
	Steven Rostedt, Takashi Iwai, Theodore Ts'o, Thomas Gleixner,
	Vasily Gorbik, Vegard Nossum, Wolfram Sang, linux-mm
  Cc: glider, mhocko

KernelMemorySanitizer (KMSAN) is a detector of errors related to uses of
uninitialized memory. It relies on compile-time Clang instrumentation
(similar to MSan in userspace:
https://clang.llvm.org/docs/MemorySanitizer.html)
and tracks the state of every bit of kernel memory, so it can report
an error if an uninitialized value is used in a condition, dereferenced or
copied to userspace, USB or network.
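
To illustrate what bit-level tracking means, here is a small userspace
model of shadow propagation through a bitwise OR. It is only a sketch, not
the code Clang emits: struct tracked_u32, tracked_or() and
tracked_or_example() are names made up for this illustration. The rule
modelled is that a result bit is initialized when both input bits are, or
when either input has an initialized 1 in that position:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy type: a value plus its shadow (set bit = uninitialized). */
struct tracked_u32 {
	uint32_t value;
	uint32_t shadow;
};

/* Compute a | b together with the shadow of the result. */
static struct tracked_u32 tracked_or(struct tracked_u32 a, struct tracked_u32 b)
{
	struct tracked_u32 r;

	r.value = a.value | b.value;
	/*
	 * A result bit stays uninitialized if both inputs are uninitialized,
	 * or if one input is uninitialized and the other is not a known 1.
	 */
	r.shadow = (a.shadow & b.shadow) |
		   (~a.value & b.shadow) |
		   (a.shadow & ~b.value);
	return r;
}

/* Mirrors "int a = 0xff; int b; int c = a | b;" from the documentation. */
static void tracked_or_example(void)
{
	struct tracked_u32 a = { .value = 0xff, .shadow = 0 };
	/* The value of b stands in for whatever garbage the stack held. */
	struct tracked_u32 b = { .value = 0x12345678, .shadow = 0xffffffffu };
	struct tracked_u32 c = tracked_or(a, b);

	/* Only the low byte of c is known, because a supplies eight 1 bits. */
	assert(c.shadow == 0xffffff00u);
}
```

The result matches the 0xffffff00 shadow given for the same example in the
documentation patch of this series.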

KMSAN has reported more than 200 bugs in the past two years, most of
them with the help of syzkaller (http://syzkaller.appspot.com).

The proposed patchset contains the KMSAN runtime implementation together
with small changes to other subsystems needed to make KMSAN work.
The latter changes fall into several categories:
 - nice-to-have features that are independent from KMSAN but simplify
   its implementation (stackdepot changes, CONFIG_GENERIC_CSUM etc.);
 - Kconfig changes that prohibit options incompatible with KMSAN;
 - calls to KMSAN runtime functions that help KMSAN do the bookkeeping
   (e.g. tell it to allocate, copy or delete the metadata);
 - calls to KMSAN runtime functions that tell KMSAN to check memory
   escaping the kernel for uninitialized values. These are required to
   increase the number of true positive error reports;
 - calls to runtime functions that tell KMSAN to ignore certain memory
   ranges to avoid false positive reports. Most certainly there can be
   better ways to deal with every such report.
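
The bookkeeping and checking categories above can be pictured with a toy
byte-granular shadow in userspace C. This is a sketch under stated
assumptions: toy_poison(), toy_unpoison() and toy_check_initialized() are
hypothetical names for this illustration and are not the KMSAN runtime
API; the real hooks live in the patches below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MEM_SIZE 64

/* One shadow byte per byte of toy memory; 0xff means "uninitialized". */
static unsigned char toy_shadow[TOY_MEM_SIZE];

/* Guard against ranges that fall outside the toy memory. */
static void toy_assert_range(size_t off, size_t len)
{
	assert(off <= TOY_MEM_SIZE && len <= TOY_MEM_SIZE - off);
}

/* Bookkeeping: a fresh, non-zeroed allocation starts out poisoned. */
static void toy_poison(size_t off, size_t len)
{
	toy_assert_range(off, len);
	memset(&toy_shadow[off], 0xff, len);
}

/* Bookkeeping: memory that was written or must be ignored is unpoisoned. */
static void toy_unpoison(size_t off, size_t len)
{
	toy_assert_range(off, len);
	memset(&toy_shadow[off], 0x00, len);
}

/* Check: would copying this range out of the kernel leak poisoned bytes? */
static bool toy_check_initialized(size_t off, size_t len)
{
	toy_assert_range(off, len);
	for (size_t i = 0; i < len; i++) {
		if (toy_shadow[off + i])
			return false;
	}
	return true;
}
```

In this model, a check placed where data leaves the kernel adds true
positives, while an unpoison call on a range the tool cannot reason about
trades possible missed bugs for fewer false positives.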

This patchset allows one to boot and run a defconfig+KMSAN kernel on QEMU
without known major false positives. It does not, however, guarantee there
are no false positives in drivers of certain devices or less tested
subsystems, although KMSAN is actively tested on syzbot with quite a
rich config.

One may find it handy to review these patches in Gerrit:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1081
I've ensured the Change-Id: tags stay away from commit descriptions.

The patchset was generated relative to mmotm
(v5.6-rc7-mmots-2020-03-23-22-35).

Several points worth a separate discussion:
1. Right now KMSAN assumes that contiguous physical pages cannot be
accessed as such, unless they were allocated together by a single
alloc_pages() call. Some kernel code however does so, which may break
under KMSAN. Two possible solutions to this problem are:
 A. Allocate shadow and origin pages at fixed offset from the kernel page.
    This is what we already do for vmalloc, but not for page_alloc(), as
    it turned out to be quite hard.
    Ideas on how to implement this approach are still welcome, because
    it'll simplify the rest of the KMSAN runtime a lot.
 B. Make all accesses touching non-contiguous pages access dummy shadow
    pages instead, so that such accesses don't produce any uninitialized
    values.
    This is quite controversial, as it may prevent true positives from
    being reported.
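
To make option A more concrete, here is a userspace sketch of a
fixed-offset shadow mapping. Everything in it (the toy_ names, the page
size, the arena layout) is an assumption made for illustration and does
not reflect the actual kernel address space layout:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 16
#define TOY_NR_PAGES 4
#define TOY_DATA_SIZE (TOY_PAGE_SIZE * TOY_NR_PAGES)
/* Fixed distance between any data byte and its shadow byte. */
#define TOY_SHADOW_OFFSET TOY_DATA_SIZE

/* The first half of the arena holds data, the second half its shadow. */
static unsigned char toy_arena[2 * TOY_DATA_SIZE];

/* Map a data address to its shadow address by adding a constant. */
static unsigned char *toy_shadow_for(unsigned char *addr)
{
	assert(addr >= toy_arena && addr < toy_arena + TOY_DATA_SIZE);
	return addr + TOY_SHADOW_OFFSET;
}
```

Because the mapping is a constant addition, the shadow of the last byte of
one toy page is immediately followed by the shadow of the first byte of
the next page, no matter how the two pages were allocated. An access that
crosses such a page boundary therefore always sees contiguous metadata.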

2. checkpatch.pl complains a lot about the use of BUG_ON in KMSAN
source. I don't have a strong opinion on this, but KMSAN is a debugging
tool, so any runtime invariant violation in it renders the tool useless.
Therefore it doesn't make much sense to not terminate after a bug in
KMSAN.

There has been a suggestion to disable KMSAN gracefully instead of
panicking. The downside of doing so is that users may gain a false sense
of memory safety if they don't notice that the tool has shut down.

3. objtool complains a lot about calls to KMSAN runtime with UACCESS
enabled.
None of these functions is expected to touch userspace memory, but
they can be called in the uaccess context, as the compiler adds them
to every memory access.
Turns out it's not enough to just whitelist KMSAN interface functions
in tools/objtool/check.c, as they are viral: after whitelisting them
I get warnings about their callees.
On the other hand, it's unacceptable to call
user_access_save()/user_access_restore() inside these functions, as
this slows down the whole runtime heavily.
Perhaps this problem can be solved on objtool side, as the mentioned
reports aren't errors per se.



Alexander Potapenko (38):
  stackdepot: reserve 5 extra bits in depot_stack_handle_t
  kmsan: add ReST documentation
  kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__
  kmsan: reduce vmalloc space
  kmsan: add KMSAN runtime core
  kmsan: KMSAN compiler API implementation
  kmsan: add KMSAN hooks for kernel subsystems
  kmsan: stackdepot: don't allocate KMSAN metadata for stackdepot
  kmsan: define READ_ONCE_NOCHECK()
  kmsan: make READ_ONCE_TASK_STACK() return initialized values
  kmsan: x86: sync metadata pages on page fault
  kmsan: add tests for KMSAN
  crypto: kmsan: disable accelerated configs under KMSAN
  kmsan: x86: disable UNWINDER_ORC under KMSAN
  kmsan: x86/asm: softirq: add KMSAN IRQ entry hooks
  kmsan: disable KMSAN instrumentation for certain kernel parts
  kmsan: mm: call KMSAN hooks from SLUB code
  kmsan: mm: maintain KMSAN metadata for page operations
  kmsan: handle memory sent to/from USB
  kmsan: handle task creation and exiting
  kmsan: net: check the value of skb before sending it to the network
  kmsan: printk: treat the result of vscnprintf() as initialized
  kmsan: disable instrumentation of certain functions
  kmsan: unpoison |tlb| in arch_tlb_gather_mmu()
  kmsan: use __msan_ string functions where possible.
  kmsan: hooks for copy_to_user() and friends
  kmsan: init: call KMSAN initialization routines
  kmsan: enable KMSAN builds
  kmsan: handle /dev/[u]random
  kmsan: virtio: check/unpoison scatterlist in vring_map_one_sg()
  kmsan: disable strscpy() optimization under KMSAN
  kmsan: add iomap support
  kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  kmsan: disable physical page merging in biovec
  x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for
    KASAN/KMSAN
  kmsan: x86/uprobes: unpoison regs in arch_uprobe_exception_notify()
  kmsan: block: skip bio block merging logic for KMSAN

To: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: linux-mm@kvack.org


 Documentation/dev-tools/index.rst             |   1 +
 Documentation/dev-tools/kmsan.rst             | 424 ++++++++++++++
 Makefile                                      |   3 +-
 arch/x86/Kconfig                              |   5 +
 arch/x86/Kconfig.debug                        |   3 +
 arch/x86/boot/Makefile                        |   1 +
 arch/x86/boot/compressed/Makefile             |   2 +
 arch/x86/boot/compressed/misc.h               |   1 +
 arch/x86/entry/common.c                       |   2 +
 arch/x86/entry/entry_64.S                     |  16 +
 arch/x86/entry/vdso/Makefile                  |   3 +
 arch/x86/include/asm/checksum.h               |  10 +-
 arch/x86/include/asm/irq_regs.h               |   2 +
 arch/x86/include/asm/kmsan.h                  |  93 +++
 arch/x86/include/asm/page_64.h                |  13 +
 arch/x86/include/asm/pgtable_64_types.h       |  15 +
 arch/x86/include/asm/string_64.h              |  23 +-
 arch/x86/include/asm/syscall_wrapper.h        |   2 +
 arch/x86/include/asm/uaccess.h                |  10 +
 arch/x86/include/asm/unwind.h                 |  10 +-
 arch/x86/kernel/Makefile                      |   4 +
 arch/x86/kernel/apic/apic.c                   |   3 +
 arch/x86/kernel/cpu/Makefile                  |   1 +
 arch/x86/kernel/dumpstack_64.c                |   5 +
 arch/x86/kernel/process_64.c                  |   5 +
 arch/x86/kernel/traps.c                       |  13 +-
 arch/x86/kernel/uprobes.c                     |   7 +-
 arch/x86/lib/Makefile                         |   2 +
 arch/x86/mm/Makefile                          |   3 +
 arch/x86/mm/fault.c                           |  20 +
 arch/x86/mm/ioremap.c                         |   3 +
 arch/x86/realmode/rm/Makefile                 |   1 +
 block/bio.c                                   |   2 +
 block/blk.h                                   |   7 +
 crypto/Kconfig                                |  30 +
 drivers/char/random.c                         |   6 +
 drivers/firmware/efi/libstub/Makefile         |   1 +
 .../firmware/efi/libstub/efi-stub-helper.c    |   5 +
 drivers/firmware/efi/libstub/tpm.c            |   5 +
 drivers/usb/core/urb.c                        |   2 +
 drivers/virtio/virtio_ring.c                  |  10 +-
 include/asm-generic/cacheflush.h              |   7 +-
 include/asm-generic/uaccess.h                 |  12 +-
 include/linux/compiler-clang.h                |   7 +
 include/linux/compiler-gcc.h                  |   5 +
 include/linux/compiler.h                      |  14 +-
 include/linux/gfp.h                           |   4 +-
 include/linux/highmem.h                       |   3 +
 include/linux/kmsan-checks.h                  | 127 ++++
 include/linux/kmsan.h                         | 335 +++++++++++
 include/linux/mm_types.h                      |   9 +
 include/linux/sched.h                         |   5 +
 include/linux/stackdepot.h                    |   8 +
 include/linux/string.h                        |   2 +
 include/linux/uaccess.h                       |  34 +-
 init/main.c                                   |   3 +
 kernel/Makefile                               |   1 +
 kernel/dma/direct.c                           |   1 +
 kernel/exit.c                                 |   2 +
 kernel/fork.c                                 |   2 +
 kernel/kthread.c                              |   2 +
 kernel/locking/Makefile                       |   4 +
 kernel/printk/printk.c                        |   6 +
 kernel/sched/core.c                           |  22 +
 kernel/softirq.c                              |   5 +
 lib/Kconfig.debug                             |   2 +
 lib/Kconfig.kmsan                             |  22 +
 lib/Makefile                                  |   3 +
 lib/iomap.c                                   |  40 ++
 lib/ioremap.c                                 |   5 +
 lib/iov_iter.c                                |  14 +-
 lib/stackdepot.c                              |  26 +-
 lib/string.c                                  |   8 +
 lib/test_kmsan.c                              | 229 ++++++++
 lib/usercopy.c                                |   8 +-
 mm/Makefile                                   |   1 +
 mm/gup.c                                      |   3 +
 mm/kmsan/Makefile                             |  11 +
 mm/kmsan/kmsan.c                              | 547 ++++++++++++++++++
 mm/kmsan/kmsan.h                              | 161 ++++++
 mm/kmsan/kmsan_entry.c                        |  38 ++
 mm/kmsan/kmsan_hooks.c                        | 416 +++++++++++++
 mm/kmsan/kmsan_init.c                         |  79 +++
 mm/kmsan/kmsan_instr.c                        | 229 ++++++++
 mm/kmsan/kmsan_report.c                       | 143 +++++
 mm/kmsan/kmsan_shadow.c                       | 456 +++++++++++++++
 mm/kmsan/kmsan_shadow.h                       |  30 +
 mm/memory.c                                   |   2 +
 mm/mmu_gather.c                               |  10 +
 mm/page_alloc.c                               |  17 +
 mm/slub.c                                     |  29 +-
 mm/vmalloc.c                                  |  24 +-
 net/sched/sch_generic.c                       |   2 +
 scripts/Makefile.kmsan                        |  12 +
 scripts/Makefile.lib                          |   6 +
 95 files changed, 3926 insertions(+), 41 deletions(-)
 create mode 100644 Documentation/dev-tools/kmsan.rst
 create mode 100644 arch/x86/include/asm/kmsan.h
 create mode 100644 include/linux/kmsan-checks.h
 create mode 100644 include/linux/kmsan.h
 create mode 100644 lib/Kconfig.kmsan
 create mode 100644 lib/test_kmsan.c
 create mode 100644 mm/kmsan/Makefile
 create mode 100644 mm/kmsan/kmsan.c
 create mode 100644 mm/kmsan/kmsan.h
 create mode 100644 mm/kmsan/kmsan_entry.c
 create mode 100644 mm/kmsan/kmsan_hooks.c
 create mode 100644 mm/kmsan/kmsan_init.c
 create mode 100644 mm/kmsan/kmsan_instr.c
 create mode 100644 mm/kmsan/kmsan_report.c
 create mode 100644 mm/kmsan/kmsan_shadow.c
 create mode 100644 mm/kmsan/kmsan_shadow.h
 create mode 100644 scripts/Makefile.kmsan

-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
@ 2020-03-25 16:12 ` glider
  2020-03-30 13:36   ` Andrey Konovalov
  2020-03-25 16:12 ` [PATCH v5 02/38] kmsan: add ReST documentation glider
                   ` (36 subsequent siblings)
  37 siblings, 1 reply; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

Some users (currently only KMSAN) may want to use spare bits in
depot_stack_handle_t. Let them do so and provide get_dsh_extra_bits()
and set_dsh_extra_bits() to access those bits.
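
For illustration, the intended behaviour of the two accessors can be
modelled in userspace C as below. This is a sketch, not the patch itself:
the toy_ names are made up, and the 16/10/1 split of the remaining bits is
an assumption chosen so that the fields add up to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_EXTRA_BITS 5

typedef uint32_t toy_handle_t;

/* Hypothetical layout: the extra bits share the 32-bit handle. */
union toy_handle_parts {
	toy_handle_t handle;
	struct {
		uint32_t slabindex : 16;
		uint32_t offset : 10;
		uint32_t valid : 1;
		uint32_t extra : TOY_EXTRA_BITS;
	};
};

static_assert(sizeof(union toy_handle_parts) == sizeof(toy_handle_t),
	      "bitfields must fit into the handle");

/* Return a copy of the handle with its extra bits replaced. */
static toy_handle_t toy_set_extra_bits(toy_handle_t handle, uint32_t bits)
{
	union toy_handle_parts parts = { .handle = handle };

	/* Mask the input so it cannot spill into the other fields. */
	parts.extra = bits & ((1U << TOY_EXTRA_BITS) - 1);
	return parts.handle;
}

/* Read back the extra bits stored in the handle. */
static uint32_t toy_get_extra_bits(toy_handle_t handle)
{
	union toy_handle_parts parts = { .handle = handle };

	return parts.extra;
}
```

Setting the extra bits leaves the slab index, offset and valid fields
untouched, so a handle tagged this way still identifies the same stack
record once the extra bits are taken into account.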

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---

Change-Id: I23580dbde85908eeda0bdd8f83a8c3882ab3e012
---
 include/linux/stackdepot.h |  8 ++++++++
 lib/stackdepot.c           | 24 +++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 24d49c732341a..ac1b5a78d7f65 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -12,6 +12,11 @@
 #define _LINUX_STACKDEPOT_H
 
 typedef u32 depot_stack_handle_t;
+/*
+ * Number of bits in the handle that stack depot doesn't use. Users may store
+ * information in them.
+ */
+#define STACK_DEPOT_EXTRA_BITS 5
 
 depot_stack_handle_t stack_depot_save(unsigned long *entries,
 				      unsigned int nr_entries, gfp_t gfp_flags);
@@ -20,5 +25,8 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 			       unsigned long **entries);
 
 unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries);
+depot_stack_handle_t set_dsh_extra_bits(depot_stack_handle_t handle,
+					unsigned int bits);
+unsigned int get_dsh_extra_bits(depot_stack_handle_t handle);
 
 #endif
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 2caffc64e4c82..195ce3dc7c37e 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -40,8 +40,10 @@
 #define STACK_ALLOC_ALIGN 4
 #define STACK_ALLOC_OFFSET_BITS (STACK_ALLOC_ORDER + PAGE_SHIFT - \
 					STACK_ALLOC_ALIGN)
+
 #define STACK_ALLOC_INDEX_BITS (DEPOT_STACK_BITS - \
-		STACK_ALLOC_NULL_PROTECTION_BITS - STACK_ALLOC_OFFSET_BITS)
+		STACK_ALLOC_NULL_PROTECTION_BITS - \
+		STACK_ALLOC_OFFSET_BITS - STACK_DEPOT_EXTRA_BITS)
 #define STACK_ALLOC_SLABS_CAP 8192
 #define STACK_ALLOC_MAX_SLABS \
 	(((1LL << (STACK_ALLOC_INDEX_BITS)) < STACK_ALLOC_SLABS_CAP) ? \
@@ -54,6 +56,7 @@ union handle_parts {
 		u32 slabindex : STACK_ALLOC_INDEX_BITS;
 		u32 offset : STACK_ALLOC_OFFSET_BITS;
 		u32 valid : STACK_ALLOC_NULL_PROTECTION_BITS;
+		u32 extra : STACK_DEPOT_EXTRA_BITS;
 	};
 };
 
@@ -72,6 +75,24 @@ static int next_slab_inited;
 static size_t depot_offset;
 static DEFINE_SPINLOCK(depot_lock);
 
+depot_stack_handle_t set_dsh_extra_bits(depot_stack_handle_t handle,
+					u32 bits)
+{
+	union handle_parts parts = { .handle = handle };
+
+	parts.extra = bits & ((1U << STACK_DEPOT_EXTRA_BITS) - 1);
+	return parts.handle;
+}
+EXPORT_SYMBOL_GPL(set_dsh_extra_bits);
+
+u32 get_dsh_extra_bits(depot_stack_handle_t handle)
+{
+	union handle_parts parts = { .handle = handle };
+
+	return parts.extra;
+}
+EXPORT_SYMBOL_GPL(get_dsh_extra_bits);
+
 static bool init_stack_slab(void **prealloc)
 {
 	if (!*prealloc)
@@ -136,6 +157,7 @@ static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
 	stack->handle.slabindex = depot_index;
 	stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
 	stack->handle.valid = 1;
+	stack->handle.extra = 0;
 	memcpy(stack->entries, entries, size * sizeof(unsigned long));
 	depot_offset += required_size;
 
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 02/38] kmsan: add ReST documentation
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
  2020-03-25 16:12 ` [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t glider
@ 2020-03-25 16:12 ` glider
  2020-03-30 14:32   ` Andrey Konovalov
  2020-03-25 16:12 ` [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW glider
                   ` (35 subsequent siblings)
  37 siblings, 1 reply; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

Add Documentation/dev-tools/kmsan.rst and reference it in the dev-tools
index.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---
v4:
 - address comments by Marco Elver:
  - remove contractions
  - fix references
  - minor fixes

Change-Id: Iac6345065e6804ef811f1124fdf779c67ff1530e
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/kmsan.rst | 424 ++++++++++++++++++++++++++++++
 2 files changed, 425 insertions(+)
 create mode 100644 Documentation/dev-tools/kmsan.rst

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9e..a3b9579fc810c 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -19,6 +19,7 @@ whole; patches welcome!
    kcov
    gcov
    kasan
+   kmsan
    ubsan
    kmemleak
    kcsan
diff --git a/Documentation/dev-tools/kmsan.rst b/Documentation/dev-tools/kmsan.rst
new file mode 100644
index 0000000000000..591c4809d46f3
--- /dev/null
+++ b/Documentation/dev-tools/kmsan.rst
@@ -0,0 +1,424 @@
+=============================
+KernelMemorySanitizer (KMSAN)
+=============================
+
+KMSAN is a dynamic memory error detector aimed at finding uses of uninitialized
+memory.
+It is based on compiler instrumentation, and is quite similar to the userspace
+`MemorySanitizer tool`_.
+
+Example report
+==============
+Here is an example of a real KMSAN report in ``packet_bind_spkt()``::
+
+  ==================================================================
+  BUG: KMSAN: uninit-value in strlen
+  CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
+  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
+   0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
+   ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
+   0000000000000000 0000000000000092 00000000ec400911 0000000000000002
+  Call Trace:
+   [<     inline     >] __dump_stack lib/dump_stack.c:15
+   [<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
+   [<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
+   [<ffffffff818a783b>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424
+   [<     inline     >] strlen lib/string.c:484
+   [<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
+   [<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230 net/packet/af_packet.c:3132
+   [<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
+   [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
+   [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:?
+  chained origin:
+   [<ffffffff810bb787>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
+   [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
+   [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:334
+   [<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:527
+   [<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380
+   [<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
+   [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
+   [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:?
+  origin description: ----address@SYSC_bind (origin=00000000eb400911)
+  ==================================================================
+
+The report says that the local variable ``address`` was created uninitialized
+in ``SYSC_bind()`` (the ``bind`` system call implementation). The lower stack
+trace corresponds to the place where this variable was created.
+
+The upper stack shows where the uninit value was used - in ``strlen()``.
+It turned out that the contents of ``address`` were partially copied from the
+userspace, but the buffer was not zero-terminated and contained some trailing
+uninitialized bytes.
+
+``packet_bind_spkt()`` did not check the length of the buffer, but called
+``strlcpy()`` on it, which called ``strlen()``, which started reading the
+buffer byte by byte till it hit the uninitialized memory.
+
+
+
+KMSAN and Clang
+===============
+
+In order for KMSAN to work, the kernel must be
+built with Clang, which so far is the only compiler that has KMSAN support.
+The kernel instrumentation pass is based on the userspace
+`MemorySanitizer tool`_. Because of the instrumentation complexity it is
+unlikely that any other compiler will support KMSAN soon.
+
+Right now the instrumentation pass supports x86_64 only.
+
+How to build
+============
+
+In order to build a kernel with KMSAN you will need a fresh Clang (10.0.0+,
+trunk version r365008 or greater). Please refer to `LLVM documentation`_
+for the instructions on how to build Clang::
+
+  export KMSAN_CLANG_PATH=/path/to/clang
+  # Now configure and build the kernel with CONFIG_KMSAN enabled.
+  make CC=$KMSAN_CLANG_PATH
+
+How KMSAN works
+===============
+
+KMSAN shadow memory
+-------------------
+
+KMSAN associates a metadata byte (also called shadow byte) with every byte of
+kernel memory.
+A bit in the shadow byte is set iff the corresponding bit of the kernel memory
+byte is uninitialized.
+Marking the memory uninitialized (i.e. setting its shadow bytes to 0xff) is
+called poisoning, marking it initialized (setting the shadow bytes to 0x00) is
+called unpoisoning.
+
+When a new variable is allocated on the stack, it is poisoned by default by
+instrumentation code inserted by the compiler (unless it is a stack variable
+that is immediately initialized). Any new heap allocation done without
+``__GFP_ZERO`` is also poisoned.
+
+Compiler instrumentation also tracks the shadow values with the help of the
+runtime library in ``mm/kmsan/``.
+
+The shadow value of a basic or compound type is an array of bytes of the same
+length.
+When a constant value is written into memory, that memory is unpoisoned.
+When a value is read from memory, its shadow memory is also obtained and
+propagated into all the operations which use that value. For every instruction
+that takes one or more values the compiler generates code that calculates the
+shadow of the result depending on those values and their shadows.
+
+Example::
+
+  int a = 0xff;
+  int b;
+  int c = a | b;
+
+In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``,
+shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of
+``c`` are uninitialized, while the lower byte is initialized.
+
+
+Origin tracking
+---------------
+
+Every four bytes of kernel memory also have a so-called origin assigned to
+them.
+This origin describes the point in program execution at which the uninitialized
+value was created. Every origin is associated with a creation stack, which lets
+the user figure out what is going on.
+
+When an uninitialized variable is allocated on stack or heap, a new origin
+value is created, and that variable's origin is filled with that value.
+When a value is read from memory, its origin is also read and kept together
+with the shadow. For every instruction that takes one or more values the origin
+of the result is one of the origins corresponding to any of the uninitialized
+inputs.
+If a poisoned value is written into memory, its origin is written to the
+corresponding storage as well.
+
+Example 1::
+
+  int a = 0;
+  int b;
+  int c = a + b;
+
+In this case the origin of ``b`` is generated upon function entry, and is
+stored to the origin of ``c`` right before the addition result is written into
+memory.
+
+Several variables may share the same origin address, if they are stored in the
+same four-byte chunk.
+In this case every write to either variable updates the origin for all of them.
+
+Example 2::
+
+  int combine(short a, short b) {
+    union ret_t {
+      int i;
+      short s[2];
+    } ret;
+    ret.s[0] = a;
+    ret.s[1] = b;
+    return ret.i;
+  }
+
+If ``a`` is initialized and ``b`` is not, the shadow of the result would be
+0xffff0000, and the origin of the result would be the origin of ``b``.
+``ret.s[0]`` would have the same origin, but it will never be used, because
+that variable is initialized.
+
+If both function arguments are uninitialized, only the origin of the second
+argument is preserved.
+
+Origin chaining
+~~~~~~~~~~~~~~~
+To ease debugging, KMSAN creates a new origin for every memory store.
+The new origin references both its creation stack and the previous origin the
+memory location had.
+This may cause increased memory consumption, so we limit the length of origin
+chains in the runtime.
+
+Clang instrumentation API
+-------------------------
+
+The Clang instrumentation pass inserts calls to functions defined in
+``mm/kmsan/kmsan_instr.c`` into the kernel code.
+
+Shadow manipulation
+~~~~~~~~~~~~~~~~~~~
+For every memory access the compiler emits a call to a function that returns a
+pair of pointers to the shadow and origin addresses of the given memory::
+
+  typedef struct {
+    void *s, *o;
+  } shadow_origin_ptr_t;
+
+  shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr)
+  shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr)
+  shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, u64 size)
+  shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, u64 size)
+
+The function name depends on the memory access size.
+Each such function also checks if the shadow of the memory in the range
+[``addr``, ``addr + n``) is contiguous and reports an error otherwise.
+
+The compiler makes sure that for every loaded value its shadow and origin
+values are read from memory.
+When a value is stored to memory, its shadow and origin are also stored using
+the metadata pointers.
+
+Origin tracking
+~~~~~~~~~~~~~~~
+A special function is used to create a new origin value for a local variable
+and set the origin of that variable to that value::
+
+  void __msan_poison_alloca(u64 address, u64 size, char *descr)
+
+Access to per-task data
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+At the beginning of every instrumented function KMSAN inserts a call to
+``__msan_get_context_state()``::
+
+  kmsan_context_state *__msan_get_context_state(void)
+
+``kmsan_context_state`` is declared in ``include/linux/kmsan.h``::
+
+  struct kmsan_context_s {
+    char param_tls[KMSAN_PARAM_SIZE];
+    char retval_tls[RETVAL_SIZE];
+    char va_arg_tls[KMSAN_PARAM_SIZE];
+    char va_arg_origin_tls[KMSAN_PARAM_SIZE];
+    u64 va_arg_overflow_size_tls;
+    depot_stack_handle_t param_origin_tls[PARAM_ARRAY_SIZE];
+    depot_stack_handle_t retval_origin_tls;
+    depot_stack_handle_t origin_tls;
+  };
+
+This structure is used by KMSAN to pass parameter shadows and origins between
+instrumented functions.
+
+String functions
+~~~~~~~~~~~~~~~~
+
+The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the
+following functions. These functions are also called when data structures are
+initialized or copied, making sure shadow and origin values are copied alongside
+the data::
+
+  void *__msan_memcpy(void *dst, void *src, u64 n)
+  void *__msan_memmove(void *dst, void *src, u64 n)
+  void *__msan_memset(void *dst, int c, size_t n)
+
+Error reporting
+~~~~~~~~~~~~~~~
+
+For each pointer dereference and each condition the compiler emits a shadow
+check that calls ``__msan_warning()`` when a poisoned value is being
+used::
+
+  void __msan_warning(u32 origin)
+
+``__msan_warning()`` causes the KMSAN runtime to print an error report.
+
+Inline assembly instrumentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+KMSAN instruments every inline assembly output with a call to
+``__msan_instrument_asm_store()``, which unpoisons the output memory
+region::
+
+  void __msan_instrument_asm_store(u64 addr, u64 size)
+
+This approach may mask certain errors, but it also helps to avoid a lot of
+false positives in bitwise operations, atomics etc.
+
+Sometimes the pointers passed into inline assembly do not point to valid memory.
+In such cases they are ignored at runtime.
+
+Disabling the instrumentation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+A function can be marked with ``__no_sanitize_memory``.
+Doing so does not remove KMSAN instrumentation from the function, but it makes
+the compiler ignore the uninitialized values coming from the function's inputs
+and initialize the function's outputs.
+The compiler will not inline functions marked with this attribute into functions
+not marked with it, and vice versa.
+
+It is also possible to disable KMSAN for a single file (e.g. main.o)::
+
+  KMSAN_SANITIZE_main.o := n
+
+or for the whole directory::
+
+  KMSAN_SANITIZE := n
+
+in the Makefile. This comes at a cost, however: stack allocations from such files
+and parameters of instrumented functions called from them will have incorrect
+shadow/origin values. As a rule of thumb, avoid using ``KMSAN_SANITIZE``.
+
+Runtime library
+---------------
+The code is located in ``mm/kmsan/``.
+
+Per-task KMSAN state
+~~~~~~~~~~~~~~~~~~~~
+
+Every task_struct has an associated KMSAN task state that holds the KMSAN
+context (see above) and a per-task flag controlling whether reports are allowed::
+
+  struct kmsan_task_state {
+    ...
+    bool allow_reporting;
+    struct kmsan_context_state cstate;
+    ...
+  };
+
+  struct task_struct {
+    ...
+    struct kmsan_task_state kmsan;
+    ...
+  };
+
+
+KMSAN contexts
+~~~~~~~~~~~~~~
+
+When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to
+hold the metadata for function parameters and return values.
+
+When the kernel is running in interrupt, softirq or NMI context, where
+``current`` is unavailable, KMSAN switches to per-CPU interrupt state::
+
+  DEFINE_PER_CPU(kmsan_context_state[KMSAN_NESTED_CONTEXT_MAX],
+                 kmsan_percpu_cstate);
+
+Metadata allocation
+~~~~~~~~~~~~~~~~~~~
+There are several places in the kernel for which the metadata is stored.
+
+1. Each ``struct page`` instance contains two pointers to its shadow and
+origin pages::
+
+  struct page {
+    ...
+    struct page *shadow, *origin;
+    ...
+  };
+
+Every time a ``struct page`` is allocated, the runtime library allocates two
+additional pages to hold its shadow and origins. This is done by adding hooks
+to ``alloc_pages()``/``free_pages()`` in ``mm/page_alloc.c``.
+To avoid allocating the metadata for non-interesting pages (right now only the
+shadow/origin pages themselves and stackdepot storage) the
+``__GFP_NO_KMSAN_SHADOW`` flag is used.
+
+There is a problem related to this allocation algorithm: when two contiguous
+memory blocks are allocated with two different ``alloc_pages()`` calls, their
+shadow pages may not be contiguous. So, if a memory access crosses the boundary
+of a memory block, accesses to shadow/origin memory may potentially corrupt
+other pages or read incorrect values from them.
+
+As a workaround, we check the access size in
+``__msan_metadata_ptr_for_XXX_YYY()`` and return a pointer to a fake shadow
+region in the case of an error::
+
+  char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
+  char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
+
+``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes.
+All stores to ``dummy_store_page`` are ignored.
+
+Unfortunately, at boot time we need to allocate shadow and origin pages for the
+kernel data (``.data``, ``.bss`` etc.) and percpu memory regions, the size of
+which is not a power of 2. As a result, we have to allocate the metadata page by
+page, so that it is also non-contiguous, although it may be perfectly valid to
+access the corresponding kernel memory across page boundaries.
+This can probably be fixed by allocating 1<<N pages at once, splitting them and
+deallocating the rest.
+
+The LSB of the ``shadow`` pointer in a ``struct page`` may be set to 1. In this
+case shadow and origin pages are allocated, but KMSAN ignores accesses to them by
+falling back to dummy pages. Allocating the metadata pages is still needed to
+support ``vmap()/vunmap()`` operations on this struct page.
+
+2. For vmalloc memory and modules, there is a direct mapping between the memory
+range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4, making only
+the first quarter available to ``vmalloc()``. The second quarter of the vmalloc
+area contains shadow memory for the first quarter, the third one holds the
+origins. A small part of the fourth quarter contains shadow and origins for the
+kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for
+more details.
+
+When an array of pages is mapped into a contiguous virtual memory space, their
+shadow and origin pages are similarly mapped into contiguous regions.
+
+3. For the CPU entry area there are separate per-CPU arrays that hold its
+metadata::
+
+  DEFINE_PER_CPU(char[CPU_ENTRY_AREA_SIZE], cpu_entry_area_shadow);
+  DEFINE_PER_CPU(char[CPU_ENTRY_AREA_SIZE], cpu_entry_area_origin);
+
+When calculating shadow and origin addresses for a given memory address, the
+runtime checks whether the address belongs to the physical page range, the
+virtual page range or the CPU entry area.
+
+Handling ``pt_regs``
+~~~~~~~~~~~~~~~~~~~~
+
+Many functions receive a ``struct pt_regs`` holding the register state at a
+certain point. Registers do not have (easily calculable) shadow or origin
+associated with them.
+We can assume that the registers are always initialized.
+
+References
+==========
+
+E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized
+memory use in C++
+<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_.
+In Proceedings of CGO 2015.
+
+.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html
+.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
  2020-03-25 16:12 ` [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t glider
  2020-03-25 16:12 ` [PATCH v5 02/38] kmsan: add ReST documentation glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:19   ` Michal Hocko
  2020-03-25 16:12 ` [PATCH v5 04/38] kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__ glider
                   ` (34 subsequent siblings)
  37 siblings, 1 reply; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Andrew Morton, Michal Hocko, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

This flag is to be used by KMSAN runtime to mark that newly created
memory pages don't need KMSAN metadata backing them.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---
We can't decide what to do here:
 - do we need to conditionally define ___GFP_NO_KMSAN_SHADOW depending on
   CONFIG_KMSAN like LOCKDEP does?
 - if KMSAN is defined, and LOCKDEP is not, do we want to "compactify" the GFP
   bits?
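
For reference, the bit arithmetic behind the __GFP_BITS_SHIFT change can
be checked in isolation. The following is a minimal userspace sketch,
not kernel code: the flag value and the shift of 25 are copied from the
hunk below, while NO_KMSAN_SHADOW_FLAG, gfp_bits_mask() and
gfp_mask_covers() are hypothetical names introduced only for this
illustration. It assumes the old shift is at most 24, i.e.
23 + IS_ENABLED(CONFIG_LOCKDEP) with lockdep enabled.

```c
#include <assert.h>

/* Value copied from the new ___GFP_NO_KMSAN_SHADOW define in this patch. */
#define NO_KMSAN_SHADOW_FLAG	0x1000000u

/* New __GFP_BITS_SHIFT from this patch. */
#define NEW_GFP_BITS_SHIFT	25

/* Largest old value: 23 + IS_ENABLED(CONFIG_LOCKDEP) with lockdep on. */
#define OLD_GFP_BITS_SHIFT	24

/* Hypothetical helper: mask of the low @shift bits, like __GFP_BITS_MASK. */
static unsigned int gfp_bits_mask(unsigned int shift)
{
	return (1u << shift) - 1;
}

/* Hypothetical helper: nonzero if every bit of @flag lies inside the mask. */
static int gfp_mask_covers(unsigned int shift, unsigned int flag)
{
	return (flag & gfp_bits_mask(shift)) == flag;
}

/* The new flag is a single bit, namely bit 24. */
_Static_assert(NO_KMSAN_SHADOW_FLAG == (1u << 24), "flag must be bit 24");
```

With these definitions, gfp_mask_covers(OLD_GFP_BITS_SHIFT, ...) is zero
for the new flag and gfp_mask_covers(NEW_GFP_BITS_SHIFT, ...) is
nonzero, which is why the shift has to grow if the flag is defined
unconditionally.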

Change-Id: If5d0352fd5711ad103328e2c185eb885e826423a
---
 include/linux/gfp.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index be2754841369e..e1ab42b5e9ce2 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -44,6 +44,7 @@ struct vm_area_struct;
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
+#define ___GFP_NO_KMSAN_SHADOW  0x1000000u
 /* If the above are modified, __GFP_BITS_SHIFT may need updating */
 
 /*
@@ -212,12 +213,13 @@ struct vm_area_struct;
 #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)
 #define __GFP_COMP	((__force gfp_t)___GFP_COMP)
 #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)
+#define __GFP_NO_KMSAN_SHADOW  ((__force gfp_t)___GFP_NO_KMSAN_SHADOW)
 
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (25)
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 04/38] kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (2 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW glider
@ 2020-03-25 16:12 ` glider
  2020-03-30 13:37   ` Andrey Konovalov
  2020-03-25 16:12 ` [PATCH v5 05/38] kmsan: reduce vmalloc space glider
                   ` (33 subsequent siblings)
  37 siblings, 1 reply; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

__no_sanitize_memory is a function attribute that makes KMSAN
ignore the uninitialized values coming from the function's
inputs, and initialize the function's outputs.

Functions marked with this attribute can't be inlined into functions
not marked with it, and vice versa.

__SANITIZE_MEMORY__ is a macro that's defined iff the file is
instrumented with KMSAN. This is not the same as CONFIG_KMSAN, which is
defined for every file.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
Acked-by: Marco Elver <elver@google.com>

---
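To illustrate the difference between a per-file macro and a global
config option, here is a small standalone sketch. It repeats the Clang
definitions from this patch and adds a fallback for compilers without
__has_feature so that it builds anywhere. file_is_instrumented() and
passthrough() are hypothetical helpers that exist only in this example.

```c
#include <assert.h>

/* __has_feature is a Clang extension; fall back to 0 where it is missing. */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

/* Same shape as the compiler-clang.h hunk in this patch. */
#if __has_feature(memory_sanitizer)
# define __SANITIZE_MEMORY__
# define __no_sanitize_memory __attribute__((no_sanitize("kernel-memory")))
#else
# define __no_sanitize_memory
#endif

/*
 * Hypothetical helper: reports whether this translation unit is being
 * instrumented. Unlike a global config option, the answer can differ
 * from file to file within the same build.
 */
static inline int file_is_instrumented(void)
{
#ifdef __SANITIZE_MEMORY__
	return 1;
#else
	return 0;
#endif
}

/*
 * Hypothetical helper marked with the attribute. Without instrumentation
 * the attribute expands to nothing and this is an ordinary function.
 */
__no_sanitize_memory
static int passthrough(int value)
{
	return value;
}
```

In a build without the sanitizer the attribute macro is empty, so code
that uses __no_sanitize_memory compiles unchanged with either compiler.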

v4:
 - dropped an unnecessary comment as requested by Marco Elver

Change-Id: I1f1672652c8392f15f7ca8ac26cd4e71f9cc1e4b
---
 include/linux/compiler-clang.h | 7 +++++++
 include/linux/compiler-gcc.h   | 5 +++++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 2cb42d8bdedc6..d4f929b4a6705 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -33,6 +33,13 @@
 #define __no_sanitize_thread
 #endif
 
+#if __has_feature(memory_sanitizer)
+# define __SANITIZE_MEMORY__
+# define __no_sanitize_memory __attribute__((no_sanitize("kernel-memory")))
+#else
+# define __no_sanitize_memory
+#endif
+
 /*
  * Not all versions of clang implement the the type-generic versions
  * of the builtin overflow checkers. Fortunately, clang implements
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index cf294faec2f87..1121557252f88 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -151,6 +151,11 @@
 #define __no_sanitize_thread
 #endif
 
+/*
+ * GCC doesn't support KMSAN.
+ */
+#define __no_sanitize_memory
+
 #if GCC_VERSION >= 50100
 #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
 #endif
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 05/38] kmsan: reduce vmalloc space
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (3 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 04/38] kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__ glider
@ 2020-03-25 16:12 ` glider
  2020-03-30 13:48   ` Andrey Konovalov
  2020-03-25 16:12 ` [PATCH v5 06/38] kmsan: add KMSAN runtime core glider
                   ` (32 subsequent siblings)
  37 siblings, 1 reply; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

KMSAN is going to use 3/4 of existing vmalloc space to hold the
metadata, therefore we lower VMALLOC_END to make sure vmalloc() doesn't
allocate past the first 1/4.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---
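The resulting layout can be sketched with plain integer arithmetic. The
block below is a userspace illustration only: the VMALLOC_START and
VMALLOC_SIZE_TB values are example numbers chosen for this sketch (an
assumption, they depend on the kernel configuration), and the three
helper functions are hypothetical. It also assumes that metadata
addresses are obtained by adding the offsets defined in this patch to
the vmalloc address, and ignores any alignment the runtime may apply to
origin addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Example values, assumed for illustration only. */
#define VMALLOC_START		0xffffc90000000000ULL
#define VMALLOC_SIZE_TB		32ULL

/* Same arithmetic as the CONFIG_KMSAN branch in this patch. */
#define VMALLOC_QUARTER_SIZE	((VMALLOC_SIZE_TB << 40) >> 2)
#define VMALLOC_END		(VMALLOC_START + VMALLOC_QUARTER_SIZE - 1)
#define VMALLOC_SHADOW_OFFSET	VMALLOC_QUARTER_SIZE
#define VMALLOC_ORIGIN_OFFSET	(VMALLOC_QUARTER_SIZE * 2)

/* Hypothetical helper: is @addr inside the quarter left to vmalloc()? */
static int in_vmalloc_quarter(uint64_t addr)
{
	return addr >= VMALLOC_START && addr <= VMALLOC_END;
}

/* Hypothetical helper: shadow address, located in the second quarter. */
static uint64_t vmalloc_shadow_sketch(uint64_t addr)
{
	return addr + VMALLOC_SHADOW_OFFSET;
}

/* Hypothetical helper: origin address, located in the third quarter. */
static uint64_t vmalloc_origin_sketch(uint64_t addr)
{
	return addr + VMALLOC_ORIGIN_OFFSET;
}
```

With the example values each quarter is 8 TB, so the shadow of the last
vmalloc byte still lies below the origin of the first one and the three
ranges do not overlap.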

Change-Id: Iaa5e8e0fc2aa66c956f937f5a1de6e5ef40d57cc
---
 arch/x86/include/asm/pgtable_64_types.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 52e5f5f2240d9..586629e204366 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -139,7 +139,22 @@ extern unsigned int ptrs_per_p4d;
 # define VMEMMAP_START		__VMEMMAP_BASE_L4
 #endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
 
+#ifndef CONFIG_KMSAN
 #define VMALLOC_END		(VMALLOC_START + (VMALLOC_SIZE_TB << 40) - 1)
+#else
+/*
+ * In KMSAN builds vmalloc area is four times smaller, and the remaining 3/4
+ * are used to keep the metadata for virtual pages.
+ */
+#define VMALLOC_QUARTER_SIZE	((VMALLOC_SIZE_TB << 40) >> 2)
+#define VMALLOC_END		(VMALLOC_START + VMALLOC_QUARTER_SIZE - 1)
+#define VMALLOC_SHADOW_OFFSET	VMALLOC_QUARTER_SIZE
+#define VMALLOC_ORIGIN_OFFSET	(VMALLOC_QUARTER_SIZE * 2)
+#define VMALLOC_META_END	(VMALLOC_END + VMALLOC_ORIGIN_OFFSET)
+#define MODULES_SHADOW_START	(VMALLOC_META_END + 1)
+#define MODULES_ORIGIN_START	(MODULES_SHADOW_START + MODULES_LEN)
+#define MODULES_ORIGIN_END	(MODULES_ORIGIN_START + MODULES_LEN)
+#endif
 
 #define MODULES_VADDR		(__START_KERNEL_map + KERNEL_IMAGE_SIZE)
 /* The module sections ends with the start of the fixmap */
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 06/38] kmsan: add KMSAN runtime core
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (4 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 05/38] kmsan: reduce vmalloc space glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 07/38] kmsan: KMSAN compiler API implementation glider
                   ` (31 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Wolfram Sang, Christoph Hellwig,
	Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor

This patch adds the core parts of KMSAN runtime and associated files:

  - include/linux/kmsan-checks.h: user API to poison/unpoison/check memory
  - include/linux/kmsan.h: declarations of KMSAN memory hooks to be
    referenced outside KMSAN runtime
  - lib/Kconfig.kmsan: declarations for CONFIG_KMSAN and CONFIG_TEST_KMSAN
  - mm/kmsan/Makefile: boilerplate Makefile
  - mm/kmsan/kmsan.h: internal KMSAN declarations
  - mm/kmsan/kmsan.c: core functions that operate on shadow and
    origin memory and perform checks, utility functions
  - mm/kmsan/kmsan_init.c: KMSAN initialization routines
  - scripts/Makefile.kmsan: CFLAGS_KMSAN

The patch also adds the necessary bookkeeping bits to struct page and
struct task_struct:
 - each struct page now contains pointers to two struct pages holding
   KMSAN metadata (shadow and origins) for the original struct page;
 - each task_struct contains a struct kmsan_task_state used to track
   the metadata of function parameters and return values for that task.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---
v2:
 - dropped kmsan_handle_vprintk()
 - use locking for single kmsan_pr_err() calls
 - don't try to understand we're inside printk()
v3:
 - fix an endless loop in __msan_poison_alloca()
 - implement kmsan_handle_dma()
 - dropped kmsan_handle_i2c_transfer()
 - fixed compilation with UNWINDER_ORC
 - dropped assembly hooks for system calls
v4:
 - split away some runtime parts to ease the review process
 - fix a lot of comments by Marco Elver and Andrey Konovalov:
 -- clean up headers and #defines, remove debugging code
 -- dropped kmsan_pr_* macros, fixed reporting code
 -- removed TODOs
 -- simplified kmsan_get_shadow_origin_ptr()
 - actually filter out IRQ frames using filter_irq_stacks()
 - simplify kmsan_get_metadata()
 - include build_bug.h into kmsan-checks.h
 - don't instrument KMSAN files with stackprotector
 - squashed "kmsan: add KMSAN bits to struct page and struct
   task_struct" into this patch as requested by Marco Elver
v5:
 - s/kmsan_softirq/kmsan_context everywhere (spotted by kbuild test
   robot <lkp@intel.com>)
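
As a rough mental model for the poison/unpoison/check API declared in
kmsan-checks.h, consider the toy userspace sketch below. It is not the
kernel implementation: it keeps one shadow byte per data byte in a
small static array, has no origin tracking or reporting, and does no
bounds checking. All toy_* names are hypothetical and exist only in
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MEM_SIZE 64

/* Data bytes and their shadow: a nonzero shadow byte means "uninitialized". */
static unsigned char toy_mem[TOY_MEM_SIZE];
static unsigned char toy_shadow[TOY_MEM_SIZE];

/* Mark [off, off + size) as uninitialized. Caller keeps it in bounds. */
static void toy_poison(size_t off, size_t size)
{
	memset(toy_shadow + off, 0xff, size);
}

/* Mark [off, off + size) as initialized. Caller keeps it in bounds. */
static void toy_unpoison(size_t off, size_t size)
{
	memset(toy_shadow + off, 0, size);
}

/* A store writes the data byte and clears the shadow of that byte. */
static void toy_store(size_t off, unsigned char value)
{
	toy_mem[off] = value;
	toy_shadow[off] = 0;
}

/* Return the offset of the first poisoned byte in the range, or -1. */
static long toy_check(size_t off, size_t size)
{
	size_t i;

	for (i = 0; i < size; i++)
		if (toy_shadow[off + i])
			return (long)(off + i);
	return -1;
}
```

In this model a freshly poisoned range fails the check at its first
byte, a store clears the shadow of exactly the bytes it writes, and
unpoisoning the rest makes the check pass.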

Change-Id: I4b3a7aba6d5804afac4f5f7274cadf8675b6e119
---
 arch/x86/Kconfig             |   1 +
 include/linux/kmsan-checks.h | 127 ++++++++
 include/linux/kmsan.h        | 335 +++++++++++++++++++++
 include/linux/mm_types.h     |   9 +
 include/linux/sched.h        |   5 +
 lib/Kconfig.debug            |   2 +
 lib/Kconfig.kmsan            |  22 ++
 mm/kmsan/Makefile            |  11 +
 mm/kmsan/kmsan.c             | 547 +++++++++++++++++++++++++++++++++++
 mm/kmsan/kmsan.h             | 161 +++++++++++
 mm/kmsan/kmsan_init.c        |  79 +++++
 mm/kmsan/kmsan_report.c      | 143 +++++++++
 mm/kmsan/kmsan_shadow.c      | 456 +++++++++++++++++++++++++++++
 mm/kmsan/kmsan_shadow.h      |  30 ++
 scripts/Makefile.kmsan       |  12 +
 15 files changed, 1940 insertions(+)
 create mode 100644 include/linux/kmsan-checks.h
 create mode 100644 include/linux/kmsan.h
 create mode 100644 lib/Kconfig.kmsan
 create mode 100644 mm/kmsan/Makefile
 create mode 100644 mm/kmsan/kmsan.c
 create mode 100644 mm/kmsan/kmsan.h
 create mode 100644 mm/kmsan/kmsan_init.c
 create mode 100644 mm/kmsan/kmsan_report.c
 create mode 100644 mm/kmsan/kmsan_shadow.c
 create mode 100644 mm/kmsan/kmsan_shadow.h
 create mode 100644 scripts/Makefile.kmsan

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8d298164dda2a..376c13480def2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -140,6 +140,7 @@ config X86
 	select HAVE_ARCH_KASAN			if X86_64
 	select HAVE_ARCH_KASAN_VMALLOC		if X86_64
 	select HAVE_ARCH_KCSAN			if X86_64
+	select HAVE_ARCH_KMSAN			if X86_64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
diff --git a/include/linux/kmsan-checks.h b/include/linux/kmsan-checks.h
new file mode 100644
index 0000000000000..2e4b8001e8d96
--- /dev/null
+++ b/include/linux/kmsan-checks.h
@@ -0,0 +1,127 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KMSAN checks to be used for one-off annotations in subsystems.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#ifndef _LINUX_KMSAN_CHECKS_H
+#define _LINUX_KMSAN_CHECKS_H
+
+#include <linux/build_bug.h>
+#include <linux/types.h>
+
+#ifdef CONFIG_KMSAN
+
+/*
+ * Helper functions that mark the return value initialized.
+ * Note that Clang ignores the inline attribute in the cases when a no_sanitize
+ * function is called from an instrumented one. For the same reason these
+ * functions may not be declared __always_inline - in that case they dissolve in
+ * the callers and KMSAN won't be able to notice they should not be
+ * instrumented.
+ */
+
+__no_sanitize_memory
+static inline u8 KMSAN_INIT_1(u8 value)
+{
+	return value;
+}
+
+__no_sanitize_memory
+static inline u16 KMSAN_INIT_2(u16 value)
+{
+	return value;
+}
+
+__no_sanitize_memory
+static inline u32 KMSAN_INIT_4(u32 value)
+{
+	return value;
+}
+
+__no_sanitize_memory
+static inline u64 KMSAN_INIT_8(u64 value)
+{
+	return value;
+}
+
+/**
+ * KMSAN_INIT_VALUE - Make the value initialized.
+ * @val: 1-, 2-, 4- or 8-byte integer that may be treated as uninitialized by
+ *       KMSAN.
+ *
+ * Return: value of @val that KMSAN treats as initialized.
+ */
+#define KMSAN_INIT_VALUE(val)		\
+	({				\
+		typeof(val) __ret;	\
+		switch (sizeof(val)) {	\
+		case 1:						\
+			*(u8 *)&__ret = KMSAN_INIT_1((u8)val);	\
+			break;					\
+		case 2:						\
+			*(u16 *)&__ret = KMSAN_INIT_2((u16)val);\
+			break;					\
+		case 4:						\
+			*(u32 *)&__ret = KMSAN_INIT_4((u32)val);\
+			break;					\
+		case 8:						\
+			*(u64 *)&__ret = KMSAN_INIT_8((u64)val);\
+			break;					\
+		default:					\
+			BUILD_BUG_ON(1);			\
+		}						\
+		__ret;						\
+	}) /**/
+
+/**
+ * kmsan_poison_shadow() - Mark the memory range as uninitialized.
+ * @address: address to start with.
+ * @size:    size of buffer to poison.
+ * @flags:   GFP flags for allocations done by this function.
+ *
+ * Until other data is written to this range, KMSAN will treat it as
+ * uninitialized. Error reports for this memory will reference the call site of
+ * kmsan_poison_shadow() as origin.
+ */
+void kmsan_poison_shadow(const void *address, size_t size, gfp_t flags);
+
+/**
+ * kmsan_unpoison_shadow() -  Mark the memory range as initialized.
+ * @address: address to start with.
+ * @size:    size of buffer to unpoison.
+ *
+ * Until other data is written to this range, KMSAN will treat it as
+ * initialized.
+ */
+void kmsan_unpoison_shadow(const void *address, size_t size);
+
+/**
+ * kmsan_check_memory() - Check the memory range for being initialized.
+ * @address: address to start with.
+ * @size:    size of buffer to check.
+ *
+ * If any piece of the given range is marked as uninitialized, KMSAN will report
+ * an error.
+ */
+void kmsan_check_memory(const void *address, size_t size);
+
+#else
+
+#define KMSAN_INIT_VALUE(value) (value)
+
+static inline void kmsan_poison_shadow(const void *address, size_t size,
+				       gfp_t flags) {}
+static inline void kmsan_unpoison_shadow(const void *address, size_t size) {}
+static inline void kmsan_check_memory(const void *address, size_t size) {}
+
+#endif
+
+#endif /* _LINUX_KMSAN_CHECKS_H */
diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
new file mode 100644
index 0000000000000..071e75f426f7a
--- /dev/null
+++ b/include/linux/kmsan.h
@@ -0,0 +1,335 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KMSAN API for subsystems.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef LINUX_KMSAN_H
+#define LINUX_KMSAN_H
+
+#include <linux/dma-direction.h>
+#include <linux/gfp.h>
+#include <linux/stackdepot.h>
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+
+struct page;
+struct kmem_cache;
+struct task_struct;
+struct sk_buff;
+struct urb;
+
+#ifdef CONFIG_KMSAN
+
+/* These constants are defined in the MSan LLVM instrumentation pass. */
+#define KMSAN_RETVAL_SIZE 800
+#define KMSAN_PARAM_SIZE 800
+#define KMSAN_PARAM_ARRAY_SIZE (KMSAN_PARAM_SIZE / sizeof(depot_stack_handle_t))
+
+struct kmsan_context_state {
+	char param_tls[KMSAN_PARAM_SIZE];
+	char retval_tls[KMSAN_RETVAL_SIZE];
+	char va_arg_tls[KMSAN_PARAM_SIZE];
+	char va_arg_origin_tls[KMSAN_PARAM_SIZE];
+	u64 va_arg_overflow_size_tls;
+	depot_stack_handle_t param_origin_tls[KMSAN_PARAM_ARRAY_SIZE];
+	depot_stack_handle_t retval_origin_tls;
+	depot_stack_handle_t origin_tls;
+};
+
+#undef KMSAN_PARAM_ARRAY_SIZE
+#undef KMSAN_PARAM_SIZE
+#undef KMSAN_RETVAL_SIZE
+
+struct kmsan_task_state {
+	bool allow_reporting;
+	struct kmsan_context_state cstate;
+};
+
+/**
+ * kmsan_initialize_shadow() - Initialize KMSAN shadow at boot time.
+ *
+ * Allocate and initialize KMSAN metadata for early allocations.
+ */
+void __init kmsan_initialize_shadow(void);
+
+/**
+ * kmsan_initialize() - Initialize KMSAN state and enable KMSAN.
+ */
+void __init kmsan_initialize(void);
+
+/**
+ * kmsan_task_create() - Initialize KMSAN state for the task.
+ * @task: task to initialize.
+ */
+void kmsan_task_create(struct task_struct *task);
+
+/**
+ * kmsan_task_exit() - Notify KMSAN that a task has exited.
+ * @task: task about to finish.
+ */
+void kmsan_task_exit(struct task_struct *task);
+
+/**
+ * kmsan_alloc_page() - Notify KMSAN about an alloc_pages() call.
+ * @page:  struct page pointer returned by alloc_pages().
+ * @order: order of allocated struct page.
+ * @flags: GFP flags used by alloc_pages()
+ *
+ * Return:
+ * * 0       - Ok
+ * * -ENOMEM - allocation failure
+ *
+ * KMSAN allocates metadata (shadow and origin pages) for @page and marks
+ * 1<<@order pages starting at @page as uninitialized, unless @flags contain
+ * __GFP_ZERO.
+ */
+int kmsan_alloc_page(struct page *page, unsigned int order, gfp_t flags);
+
+/**
+ * kmsan_free_page() - Notify KMSAN about a free_pages() call.
+ * @page:  struct page pointer passed to free_pages().
+ * @order: order of deallocated struct page.
+ *
+ * KMSAN deallocates the metadata pages for the given struct page.
+ */
+void kmsan_free_page(struct page *page, unsigned int order);
+
+/**
+ * kmsan_split_page() - Notify KMSAN about a split_page() call.
+ * @page:  struct page pointer passed to split_page().
+ * @order: order of split struct page.
+ *
+ * KMSAN splits the metadata pages for the given struct page, so that they
+ * can be deallocated separately.
+ */
+void kmsan_split_page(struct page *page, unsigned int order);
+
+/**
+ * kmsan_copy_page_meta() - Copy KMSAN metadata between two pages.
+ * @dst: destination page.
+ * @src: source page.
+ *
+ * KMSAN copies the contents of metadata pages for @src into the metadata pages
+ * for @dst. If @dst has no associated metadata pages, nothing happens.
+ * If @src has no associated metadata pages, @dst metadata pages are unpoisoned.
+ */
+void kmsan_copy_page_meta(struct page *dst, struct page *src);
+
+/**
+ * kmsan_gup_pgd_range() - Notify KMSAN about a gup_pgd_range() call.
+ * @pages: array of struct page pointers.
+ * @nr:    array size.
+ *
+ * gup_pgd_range() creates new pages, some of which may belong to the userspace
+ * memory. In that case these pages should be initialized.
+ */
+void kmsan_gup_pgd_range(struct page **pages, int nr);
+
+/**
+ * kmsan_slab_alloc() - Notify KMSAN about a slab allocation.
+ * @s:      slab cache the object belongs to.
+ * @object: object pointer.
+ * @flags:  GFP flags passed to the allocator.
+ *
+ * Depending on cache flags and GFP flags, KMSAN sets up the metadata of the
+ * newly created object, marking it initialized or uninitialized.
+ */
+void kmsan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags);
+
+/**
+ * kmsan_slab_free() - Notify KMSAN about a slab deallocation.
+ * @s:      slab cache the object belongs to.
+ * @object: object pointer.
+ *
+ * KMSAN marks the freed object as uninitialized.
+ */
+void kmsan_slab_free(struct kmem_cache *s, void *object);
+
+/**
+ * kmsan_kmalloc_large() - Notify KMSAN about a large slab allocation.
+ * @ptr:   object pointer.
+ * @size:  object size.
+ * @flags: GFP flags passed to the allocator.
+ *
+ * Similar to kmsan_slab_alloc(), but for large allocations.
+ */
+void kmsan_kmalloc_large(const void *ptr, size_t size, gfp_t flags);
+
+/**
+ * kmsan_kfree_large() - Notify KMSAN about a large slab deallocation.
+ * @ptr: object pointer.
+ *
+ * Similar to kmsan_slab_free(), but for large allocations.
+ */
+void kmsan_kfree_large(const void *ptr);
+
+/**
+ * kmsan_vmap_page_range_noflush() - Notify KMSAN about a vmap.
+ * @start: start address of vmapped range.
+ * @end:   end address of vmapped range.
+ * @prot:  page protection flags used for vmap.
+ * @pages: array of pages.
+ *
+ * KMSAN maps shadow and origin pages of @pages into contiguous ranges in
+ * vmalloc metadata address range.
+ */
+void kmsan_vmap_page_range_noflush(unsigned long start, unsigned long end,
+				   pgprot_t prot, struct page **pages);
+
+/**
+ * kmsan_vunmap_page_range() - Notify KMSAN about a vunmap.
+ * @addr: start address of vunmapped range.
+ * @end:  end address of vunmapped range.
+ *
+ * KMSAN unmaps the contiguous metadata ranges created by
+ * kmsan_vmap_page_range_noflush().
+ */
+void kmsan_vunmap_page_range(unsigned long addr, unsigned long end);
+
+/**
+ * kmsan_ioremap_page_range() - Notify KMSAN about an ioremap_page_range() call.
+ * @addr:      range start.
+ * @end:       range end.
+ * @phys_addr: physical range start.
+ * @prot:      page protection flags used for ioremap_page_range().
+ *
+ * KMSAN creates new metadata pages for the physical pages mapped into the
+ * virtual memory.
+ */
+void kmsan_ioremap_page_range(unsigned long addr, unsigned long end,
+			      phys_addr_t phys_addr, pgprot_t prot);
+
+/**
+ * kmsan_iounmap_page_range() - Notify KMSAN about an iounmap_page_range() call.
+ * @start: range start.
+ * @end:   range end.
+ *
+ * KMSAN unmaps the metadata pages for the given range and, unlike for
+ * vunmap_page_range(), also deallocates them.
+ */
+void kmsan_iounmap_page_range(unsigned long start, unsigned long end);
+
+/**
+ * kmsan_context_enter() - Notify KMSAN about a context entry.
+ *
+ * This function should be called whenever the kernel leaves the current task
+ * and enters an IRQ, softirq or NMI context. KMSAN will switch from the task
+ * state to a per-CPU context state.
+ */
+void kmsan_context_enter(void);
+
+/**
+ * kmsan_context_exit() - Notify KMSAN about a context exit.
+ *
+ * This function should be called when the kernel leaves the previously entered
+ * context.
+ */
+void kmsan_context_exit(void);
+
+/**
+ * kmsan_copy_to_user() - Notify KMSAN about a data transfer to userspace.
+ * @to:      destination address in the userspace.
+ * @from:    source address in the kernel.
+ * @to_copy: number of bytes to copy.
+ * @left:    number of bytes not copied.
+ *
+ * If this is a real userspace data transfer, KMSAN checks the bytes that were
+ * actually copied to ensure there was no information leak. If @to belongs to
+ * the kernel space (which is possible for compat syscalls), KMSAN just copies
+ * the metadata.
+ */
+void kmsan_copy_to_user(const void *to, const void *from, size_t to_copy,
+			size_t left);
+
+/**
+ * kmsan_check_skb() - Check an sk_buff for being initialized.
+ *
+ * KMSAN checks the memory belonging to a socket buffer and reports an error if
+ * it contains uninitialized values.
+ */
+void kmsan_check_skb(const struct sk_buff *skb);
+
+/**
+ * kmsan_handle_dma() - Handle a DMA data transfer.
+ * @address:   buffer address.
+ * @size:      buffer size.
+ * @direction: one of the possible dma_data_direction values.
+ *
+ * Depending on @direction, KMSAN:
+ * * checks the buffer, if it is copied to device;
+ * * initializes the buffer, if it is copied from device;
+ * * does both, if this is a DMA_BIDIRECTIONAL transfer.
+ */
+void kmsan_handle_dma(const void *address, size_t size,
+		      enum dma_data_direction direction);
+
+/**
+ * kmsan_handle_urb() - Handle a USB data transfer.
+ * @urb:    struct urb pointer.
+ * @is_out: data transfer direction (true means output to hardware).
+ *
+ * If @is_out is true, KMSAN checks the transfer buffer of @urb. Otherwise,
+ * KMSAN initializes the transfer buffer.
+ */
+void kmsan_handle_urb(const struct urb *urb, bool is_out);
+
+#else
+
+static inline void __init kmsan_initialize_shadow(void) { }
+static inline void __init kmsan_initialize(void) { }
+
+static inline void kmsan_task_create(struct task_struct *task) {}
+static inline void kmsan_task_exit(struct task_struct *task) {}
+
+static inline int kmsan_alloc_page(struct page *page, unsigned int order,
+				   gfp_t flags)
+{
+	return 0;
+}
+static inline void kmsan_free_page(struct page *page, unsigned int order) {}
+static inline void kmsan_split_page(struct page *page, unsigned int order) {}
+static inline void kmsan_copy_page_meta(struct page *dst, struct page *src) {}
+static inline void kmsan_gup_pgd_range(struct page **pages, int nr) {}
+
+static inline void kmsan_slab_alloc(struct kmem_cache *s, void *object,
+				    gfp_t flags) {}
+static inline void kmsan_slab_free(struct kmem_cache *s, void *object) {}
+static inline void kmsan_kmalloc_large(const void *ptr, size_t size,
+				       gfp_t flags) {}
+static inline void kmsan_kfree_large(const void *ptr) {}
+
+static inline void kmsan_vmap_page_range_noflush(unsigned long start,
+						 unsigned long end,
+						 pgprot_t prot,
+						 struct page **pages) {}
+static inline void kmsan_vunmap_page_range(unsigned long start,
+					   unsigned long end) {}
+
+static inline void kmsan_ioremap_page_range(unsigned long start,
+					    unsigned long end,
+					    phys_addr_t phys_addr,
+					    pgprot_t prot) {}
+static inline void kmsan_iounmap_page_range(unsigned long start,
+					    unsigned long end) {}
+
+static inline void kmsan_context_enter(void) {}
+static inline void kmsan_context_exit(void) {}
+
+static inline void kmsan_copy_to_user(
+	const void *to, const void *from, size_t to_copy, size_t left) {}
+
+static inline void kmsan_check_skb(const struct sk_buff *skb) {}
+static inline void kmsan_handle_dma(const void *address, size_t size,
+				    enum dma_data_direction direction) {}
+static inline void kmsan_handle_urb(const struct urb *urb, bool is_out) {}
+
+#endif
+
+#endif /* LINUX_KMSAN_H */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4aba6c0c2ba80..ba8d5808259bc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -216,6 +216,15 @@ struct page {
 					   not kmapped, ie. highmem) */
 #endif /* WANT_PAGE_VIRTUAL */
 
+#ifdef CONFIG_KMSAN
+	/*
+	 * Bits in struct page are scarce, so the LSB in *shadow is used to
+	 * indicate whether the page should be ignored by KMSAN or not.
+	 */
+	struct page *shadow;
+	struct page *origin;
+#endif
+
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
 	int _last_cpupid;
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 983389c3c26d1..208bff758b9cd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -15,6 +15,7 @@
 #include <linux/sem.h>
 #include <linux/shm.h>
 #include <linux/kcov.h>
+#include <linux/kmsan.h>
 #include <linux/mutex.h>
 #include <linux/plist.h>
 #include <linux/hrtimer.h>
@@ -1199,6 +1200,10 @@ struct task_struct {
 	struct kcsan_ctx		kcsan_ctx;
 #endif
 
+#ifdef CONFIG_KMSAN
+	struct kmsan_task_state		kmsan;
+#endif
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	/* Index of current stored address in ret_stack: */
 	int				curr_ret_stack;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9f6e6edbd9949..e6f251b83437e 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -823,6 +823,8 @@ config DEBUG_STACKOVERFLOW
 
 source "lib/Kconfig.kasan"
 
+source "lib/Kconfig.kmsan"
+
 endmenu # "Memory Debugging"
 
 config DEBUG_SHIRQ
diff --git a/lib/Kconfig.kmsan b/lib/Kconfig.kmsan
new file mode 100644
index 0000000000000..187dddfcf2201
--- /dev/null
+++ b/lib/Kconfig.kmsan
@@ -0,0 +1,22 @@
+config HAVE_ARCH_KMSAN
+	bool
+
+if HAVE_ARCH_KMSAN
+
+config KMSAN
+	bool "KMSAN: detector of uninitialized memory use"
+	depends on SLUB && !KASAN
+	select STACKDEPOT
+	help
+	  KMSAN is a dynamic detector of uses of uninitialized memory in the
+	  kernel. It is based on compiler instrumentation provided by Clang
+	  and thus requires Clang 10.0.0+ to build.
+
+config TEST_KMSAN
+	tristate "Module for testing KMSAN for bug detection"
+	depends on m && KMSAN
+	help
+	  Test module that can trigger various uses of uninitialized memory
+	  detectable by KMSAN.
+
+endif
diff --git a/mm/kmsan/Makefile b/mm/kmsan/Makefile
new file mode 100644
index 0000000000000..a9778eb8a46a1
--- /dev/null
+++ b/mm/kmsan/Makefile
@@ -0,0 +1,11 @@
+obj-y := kmsan.o kmsan_instr.o kmsan_init.o kmsan_entry.o kmsan_hooks.o kmsan_report.o kmsan_shadow.o
+
+# KMSAN runtime functions may enable UACCESS checks, so build them without
+# stackprotector to avoid objtool warnings.
+CFLAGS_kmsan_instr.o := $(call cc-option, -fno-stack-protector)
+CFLAGS_kmsan_shadow.o := $(call cc-option, -fno-stack-protector)
+CFLAGS_kmsan_hooks.o := $(call cc-option, -fno-stack-protector)
+
+KMSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+UBSAN_SANITIZE := n
diff --git a/mm/kmsan/kmsan.c b/mm/kmsan/kmsan.c
new file mode 100644
index 0000000000000..037f8b5f33a57
--- /dev/null
+++ b/mm/kmsan/kmsan.c
@@ -0,0 +1,547 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN runtime library.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/page.h>
+#include <linux/compiler.h>
+#include <linux/export.h>
+#include <linux/highmem.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/kmsan.h>
+#include <linux/memory.h>
+#include <linux/mm.h>
+#include <linux/mm_types.h>
+#include <linux/mmzone.h>
+#include <linux/percpu-defs.h>
+#include <linux/preempt.h>
+#include <linux/slab.h>
+#include <linux/stackdepot.h>
+#include <linux/stacktrace.h>
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+
+#include "../slab.h"
+#include "kmsan.h"
+
+#define KMSAN_STACK_DEPTH 64
+#define MAX_CHAIN_DEPTH 7
+
+/*
+ * Some kernel asm() calls mention the non-existing |__force_order| variable
+ * in the asm constraints to preserve the order of accesses to control
+ * registers. KMSAN turns those mentions into actual memory accesses, therefore
+ * the variable is now required to link the kernel.
+ */
+unsigned long __force_order;
+EXPORT_SYMBOL(__force_order);
+
+bool kmsan_ready;
+/*
+ * According to Documentation/x86/kernel-stacks, kernel code can run on the
+ * following stacks:
+ *  - regular task stack - when executing the task code
+ *  - interrupt stack - when handling external hardware interrupts and softirqs
+ *  - NMI stack
+ * 0 is for regular interrupts, 1 for softirqs, 2 for NMI.
+ * Because interrupts may nest, try to use a new context for every new
+ * interrupt.
+ */
+/* [0] for dummy per-CPU context. */
+DEFINE_PER_CPU(struct kmsan_context_state[KMSAN_NESTED_CONTEXT_MAX],
+	       kmsan_percpu_cstate);
+/* 0 for task context, |i>0| for kmsan_percpu_cstate[i]. */
+DEFINE_PER_CPU(int, kmsan_context_level);
+DEFINE_PER_CPU(int, kmsan_in_runtime_cnt);
+
+struct kmsan_context_state *kmsan_task_context_state(void)
+{
+	int cpu = smp_processor_id();
+	int level = this_cpu_read(kmsan_context_level);
+	struct kmsan_context_state *ret;
+
+	if (!kmsan_ready || kmsan_in_runtime()) {
+		ret = &per_cpu(kmsan_percpu_cstate[0], cpu);
+		__memset(ret, 0, sizeof(struct kmsan_context_state));
+		return ret;
+	}
+
+	if (!level)
+		ret = &current->kmsan.cstate;
+	else
+		ret = &per_cpu(kmsan_percpu_cstate[level], cpu);
+	return ret;
+}
+
+void kmsan_internal_task_create(struct task_struct *task)
+{
+	struct kmsan_task_state *state = &task->kmsan;
+
+	__memset(state, 0, sizeof(struct kmsan_task_state));
+	state->allow_reporting = true;
+}
+
+void kmsan_internal_memset_shadow(void *addr, int b, size_t size,
+				  bool checked)
+{
+	void *shadow_start;
+	u64 page_offset, address = (u64)addr;
+	size_t to_fill;
+
+	BUG_ON(!metadata_is_contiguous(addr, size, META_SHADOW));
+	while (size) {
+		page_offset = address % PAGE_SIZE;
+		to_fill = min(PAGE_SIZE - page_offset, (u64)size);
+		shadow_start = kmsan_get_metadata((void *)address, to_fill,
+						  META_SHADOW);
+		if (!shadow_start) {
+			if (checked)
+				panic("%s: not memsetting %zu bytes starting at %px, because the shadow is NULL\n",
+				      __func__, to_fill, (void *)address);
+			/* Otherwise just move on. */
+		} else {
+			__memset(shadow_start, b, to_fill);
+		}
+		address += to_fill;
+		size -= to_fill;
+	}
+}
+
+void kmsan_internal_poison_shadow(void *address, size_t size,
+				gfp_t flags, unsigned int poison_flags)
+{
+	bool checked = poison_flags & KMSAN_POISON_CHECK;
+	depot_stack_handle_t handle;
+	u32 extra_bits = kmsan_extra_bits(/*depth*/0,
+					  poison_flags & KMSAN_POISON_FREE);
+
+	kmsan_internal_memset_shadow(address, -1, size, checked);
+	handle = kmsan_save_stack_with_flags(flags, extra_bits);
+	kmsan_set_origin_checked(address, size, handle, checked);
+}
+
+void kmsan_internal_unpoison_shadow(void *address, size_t size, bool checked)
+{
+	kmsan_internal_memset_shadow(address, 0, size, checked);
+	kmsan_set_origin_checked(address, size, 0, checked);
+}
+
+depot_stack_handle_t kmsan_save_stack_with_flags(gfp_t flags,
+						 unsigned int extra_bits)
+{
+	depot_stack_handle_t handle;
+	unsigned long entries[KMSAN_STACK_DEPTH];
+	unsigned int nr_entries;
+
+	nr_entries = stack_trace_save(entries, KMSAN_STACK_DEPTH, 0);
+	nr_entries = filter_irq_stacks(entries, nr_entries);
+
+	/* Don't sleep (see might_sleep_if() in __alloc_pages_nodemask()). */
+	flags &= ~__GFP_DIRECT_RECLAIM;
+
+	handle = stack_depot_save(entries, nr_entries, flags);
+	return set_dsh_extra_bits(handle, extra_bits);
+}
+
+/*
+ * Depending on the value of is_memmove, this serves as both a memcpy and a
+ * memmove implementation.
+ *
+ * As with the regular memmove, do the following:
+ * - if src and dst don't overlap, use memcpy();
+ * - if src and dst overlap:
+ *   - if src > dst, use memcpy();
+ *   - if src < dst, use reverse-memcpy.
+ * Why this is correct:
+ * - problems may arise if for some part of the overlapping region we
+ *   overwrite its shadow with a new value before copying it somewhere.
+ *   But there's a 1:1 mapping between the kernel memory and its shadow,
+ *   therefore if this doesn't happen with the kernel memory it can't happen
+ *   with the shadow.
+ */
+static void kmsan_memcpy_memmove_metadata(void *dst, void *src, size_t n,
+					  bool is_memmove)
+{
+	void *shadow_src, *shadow_dst;
+	depot_stack_handle_t *origin_src, *origin_dst;
+	int src_slots, dst_slots, i, iter, step, skip_bits;
+	depot_stack_handle_t old_origin = 0, chain_origin, new_origin = 0;
+	u32 *align_shadow_src, shadow;
+	bool backwards;
+
+	shadow_dst = kmsan_get_metadata(dst, n, META_SHADOW);
+	if (!shadow_dst)
+		return;
+	BUG_ON(!metadata_is_contiguous(dst, n, META_SHADOW));
+
+	shadow_src = kmsan_get_metadata(src, n, META_SHADOW);
+	if (!shadow_src) {
+		/*
+		 * |src| is untracked: zero out destination shadow, ignore the
+		 * origins, we're done.
+		 */
+		__memset(shadow_dst, 0, n);
+		return;
+	}
+	BUG_ON(!metadata_is_contiguous(src, n, META_SHADOW));
+
+	if (is_memmove)
+		__memmove(shadow_dst, shadow_src, n);
+	else
+		__memcpy(shadow_dst, shadow_src, n);
+
+	origin_dst = kmsan_get_metadata(dst, n, META_ORIGIN);
+	origin_src = kmsan_get_metadata(src, n, META_ORIGIN);
+	BUG_ON(!origin_dst || !origin_src);
+	BUG_ON(!metadata_is_contiguous(dst, n, META_ORIGIN));
+	BUG_ON(!metadata_is_contiguous(src, n, META_ORIGIN));
+	src_slots = (ALIGN((u64)src + n, ORIGIN_SIZE) -
+		     ALIGN_DOWN((u64)src, ORIGIN_SIZE)) / ORIGIN_SIZE;
+	dst_slots = (ALIGN((u64)dst + n, ORIGIN_SIZE) -
+		     ALIGN_DOWN((u64)dst, ORIGIN_SIZE)) / ORIGIN_SIZE;
+	BUG_ON(!src_slots || !dst_slots);
+	BUG_ON((src_slots < 1) || (dst_slots < 1));
+	BUG_ON((src_slots - dst_slots > 1) || (src_slots - dst_slots < -1));
+
+	backwards = is_memmove && (dst > src);
+	i = backwards ? min(src_slots, dst_slots) - 1 : 0;
+	iter = backwards ? -1 : 1;
+
+	align_shadow_src = (u32 *)ALIGN_DOWN((u64)shadow_src, ORIGIN_SIZE);
+	for (step = 0; step < min(src_slots, dst_slots); step++, i += iter) {
+		BUG_ON(i < 0);
+		shadow = align_shadow_src[i];
+		if (i == 0) {
+			/*
+			 * If |src| isn't aligned on ORIGIN_SIZE, don't
+			 * look at the first |src % ORIGIN_SIZE| bytes of
+			 * the first shadow slot (low bits, little-endian).
+			 */
+			skip_bits = ((u64)src % ORIGIN_SIZE) * 8;
+			shadow = (shadow >> skip_bits) << skip_bits;
+		}
+		if (i == src_slots - 1) {
+			/*
+			 * If |src + n| isn't aligned on ORIGIN_SIZE, don't
+			 * look at the bytes of the last shadow slot that
+			 * lie past |src + n| (high bits, little-endian).
+			 */
+			skip_bits = ((ORIGIN_SIZE - ((u64)src + n) %
+				      ORIGIN_SIZE) % ORIGIN_SIZE) * 8;
+			shadow = (shadow << skip_bits) >> skip_bits;
+		}
+		/*
+		 * Overwrite the origin only if the corresponding
+		 * shadow is nonempty.
+		 */
+		if (origin_src[i] && (origin_src[i] != old_origin) && shadow) {
+			old_origin = origin_src[i];
+			chain_origin = kmsan_internal_chain_origin(old_origin);
+			/*
+			 * kmsan_internal_chain_origin() may return
+			 * 0, but we don't want to lose the previous
+			 * origin value.
+			 */
+			if (chain_origin)
+				new_origin = chain_origin;
+			else
+				new_origin = old_origin;
+		}
+		if (shadow)
+			origin_dst[i] = new_origin;
+		else
+			origin_dst[i] = 0;
+	}
+}
+
+void kmsan_memcpy_metadata(void *dst, void *src, size_t n)
+{
+	kmsan_memcpy_memmove_metadata(dst, src, n, /*is_memmove*/false);
+}
+
+void kmsan_memmove_metadata(void *dst, void *src, size_t n)
+{
+	kmsan_memcpy_memmove_metadata(dst, src, n, /*is_memmove*/true);
+}
+
+depot_stack_handle_t kmsan_internal_chain_origin(depot_stack_handle_t id)
+{
+	depot_stack_handle_t handle;
+	unsigned long entries[3];
+	u64 magic = KMSAN_CHAIN_MAGIC_ORIGIN_FULL;
+	int depth = 0;
+	static int skipped;
+	u32 extra_bits;
+	bool uaf;
+
+	if (!id)
+		return id;
+	/*
+	 * Make sure we have enough spare bits in |id| to hold the UAF bit and
+	 * the chain depth.
+	 */
+	BUILD_BUG_ON((1 << STACK_DEPOT_EXTRA_BITS) <= (MAX_CHAIN_DEPTH << 1));
+
+	extra_bits = get_dsh_extra_bits(id);
+	depth = kmsan_depth_from_eb(extra_bits);
+	uaf = kmsan_uaf_from_eb(extra_bits);
+
+	if (depth >= MAX_CHAIN_DEPTH) {
+		skipped++;
+		if (skipped % 10000 == 0) {
+			pr_warn("not chained %d origins\n", skipped);
+			dump_stack();
+			kmsan_print_origin(id);
+		}
+		return id;
+	}
+	depth++;
+	extra_bits = kmsan_extra_bits(depth, uaf);
+
+	entries[0] = magic + depth;
+	entries[1] = kmsan_save_stack_with_flags(GFP_ATOMIC, extra_bits);
+	entries[2] = id;
+	handle = stack_depot_save(entries, ARRAY_SIZE(entries), GFP_ATOMIC);
+	return set_dsh_extra_bits(handle, extra_bits);
+}
+
+void kmsan_write_aligned_origin(void *var, size_t size, u32 origin)
+{
+	u32 *var_cast = (u32 *)var;
+	int i;
+
+	BUG_ON((u64)var_cast % ORIGIN_SIZE);
+	BUG_ON(size % ORIGIN_SIZE);
+	for (i = 0; i < size / ORIGIN_SIZE; i++)
+		var_cast[i] = origin;
+}
+
+void kmsan_internal_set_origin(void *addr, int size, u32 origin)
+{
+	void *origin_start;
+	u64 address = (u64)addr, page_offset;
+	size_t to_fill, pad = 0;
+
+	if (!IS_ALIGNED(address, ORIGIN_SIZE)) {
+		pad = address % ORIGIN_SIZE;
+		address -= pad;
+		size += pad;
+	}
+
+	while (size > 0) {
+		page_offset = address % PAGE_SIZE;
+		to_fill = min(PAGE_SIZE - page_offset, (u64)size);
+		/* write at least ORIGIN_SIZE bytes */
+		to_fill = ALIGN(to_fill, ORIGIN_SIZE);
+		BUG_ON(!to_fill);
+		origin_start = kmsan_get_metadata((void *)address, to_fill,
+						  META_ORIGIN);
+		address += to_fill;
+		size -= to_fill;
+		if (!origin_start)
+			/* Can happen e.g. if the memory is untracked. */
+			continue;
+		kmsan_write_aligned_origin(origin_start, to_fill, origin);
+	}
+}
+
+void kmsan_set_origin_checked(void *addr, int size, u32 origin, bool checked)
+{
+	if (checked && !metadata_is_contiguous(addr, size, META_ORIGIN))
+		panic("%s: WARNING: not setting origin for %d bytes starting at %px, because the metadata is not contiguous\n",
+		      __func__, size, addr);
+	kmsan_internal_set_origin(addr, size, origin);
+}
+
+struct page *vmalloc_to_page_or_null(void *vaddr)
+{
+	struct page *page;
+
+	if (!kmsan_internal_is_vmalloc_addr(vaddr) &&
+	    !kmsan_internal_is_module_addr(vaddr))
+		return NULL;
+	page = vmalloc_to_page(vaddr);
+	if (pfn_valid(page_to_pfn(page)))
+		return page;
+	else
+		return NULL;
+}
+
+void kmsan_internal_check_memory(void *addr, size_t size, const void *user_addr,
+				 int reason)
+{
+	unsigned long irq_flags;
+	unsigned long addr64 = (unsigned long)addr;
+	unsigned char *shadow = NULL;
+	depot_stack_handle_t *origin = NULL;
+	depot_stack_handle_t cur_origin = 0, new_origin = 0;
+	int cur_off_start = -1;
+	int i, chunk_size;
+	size_t pos = 0;
+
+	BUG_ON(!metadata_is_contiguous(addr, size, META_SHADOW));
+	if (size <= 0)
+		return;
+	while (pos < size) {
+		chunk_size = min(size - pos,
+				 PAGE_SIZE - ((addr64 + pos) % PAGE_SIZE));
+		shadow = kmsan_get_metadata((void *)(addr64 + pos), chunk_size,
+					    META_SHADOW);
+		if (!shadow) {
+			/*
+			 * This page is untracked. If there were uninitialized
+			 * bytes before, report them.
+			 */
+			if (cur_origin) {
+				irq_flags = kmsan_enter_runtime();
+				kmsan_report(cur_origin, addr, size,
+					     cur_off_start, pos - 1, user_addr,
+					     reason);
+				kmsan_leave_runtime(irq_flags);
+			}
+			cur_origin = 0;
+			cur_off_start = -1;
+			pos += chunk_size;
+			continue;
+		}
+		for (i = 0; i < chunk_size; i++) {
+			if (!shadow[i]) {
+				/*
+				 * This byte is unpoisoned. If there were
+				 * poisoned bytes before, report them.
+				 */
+				if (cur_origin) {
+					irq_flags = kmsan_enter_runtime();
+					kmsan_report(cur_origin, addr, size,
+						     cur_off_start, pos + i - 1,
+						     user_addr, reason);
+					kmsan_leave_runtime(irq_flags);
+				}
+				cur_origin = 0;
+				cur_off_start = -1;
+				continue;
+			}
+			origin = kmsan_get_metadata((void *)(addr64 + pos + i),
+						chunk_size - i, META_ORIGIN);
+			BUG_ON(!origin);
+			new_origin = *origin;
+			/*
+			 * Encountered new origin - report the previous
+			 * uninitialized range.
+			 */
+			if (cur_origin != new_origin) {
+				if (cur_origin) {
+					irq_flags = kmsan_enter_runtime();
+					kmsan_report(cur_origin, addr, size,
+						     cur_off_start, pos + i - 1,
+						     user_addr, reason);
+					kmsan_leave_runtime(irq_flags);
+				}
+				cur_origin = new_origin;
+				cur_off_start = pos + i;
+			}
+		}
+		pos += chunk_size;
+	}
+	BUG_ON(pos != size);
+	if (cur_origin) {
+		irq_flags = kmsan_enter_runtime();
+		kmsan_report(cur_origin, addr, size, cur_off_start, pos - 1,
+			     user_addr, reason);
+		kmsan_leave_runtime(irq_flags);
+	}
+}
+
+bool metadata_is_contiguous(void *addr, size_t size, bool is_origin)
+{
+	u64 cur_addr = (u64)addr, next_addr;
+	char *cur_meta = NULL, *next_meta = NULL;
+	depot_stack_handle_t *origin_p;
+	bool all_untracked = false;
+	const char *fname = is_origin ? "origin" : "shadow";
+
+	if (!size)
+		return true;
+
+	/* The whole range belongs to the same page. */
+	if (ALIGN_DOWN(cur_addr + size - 1, PAGE_SIZE) ==
+	    ALIGN_DOWN(cur_addr, PAGE_SIZE))
+		return true;
+	cur_meta = kmsan_get_metadata((void *)cur_addr, 1, is_origin);
+	if (!cur_meta)
+		all_untracked = true;
+	for (next_addr = cur_addr + PAGE_SIZE; next_addr < (u64)addr + size;
+		     cur_addr = next_addr,
+		     cur_meta = next_meta,
+		     next_addr += PAGE_SIZE) {
+		next_meta = kmsan_get_metadata((void *)next_addr, 1, is_origin);
+		if (!next_meta) {
+			if (!all_untracked)
+				goto report;
+			continue;
+		}
+		if ((u64)cur_meta == ((u64)next_meta - PAGE_SIZE))
+			continue;
+		goto report;
+	}
+	return true;
+
+report:
+	pr_err("BUG: attempting to access two shadow page ranges.\n");
+	dump_stack();
+	pr_err("\n");
+	pr_err("Access of size %zu at %px.\n", size, addr);
+	pr_err("Addresses belonging to different ranges: %px and %px\n",
+	       (void *)cur_addr, (void *)next_addr);
+	pr_err("page[0].%s: %px, page[1].%s: %px\n",
+	       fname, cur_meta, fname, next_meta);
+	origin_p = kmsan_get_metadata(addr, 1, META_ORIGIN);
+	if (origin_p) {
+		pr_err("Origin: %08x\n", *origin_p);
+		kmsan_print_origin(*origin_p);
+	} else {
+		pr_err("Origin: unavailable\n");
+	}
+	return false;
+}
+
+/*
+ * Dummy replacement for __builtin_return_address() which may crash without
+ * frame pointers.
+ */
+void *kmsan_internal_return_address(int arg)
+{
+#ifdef CONFIG_UNWINDER_FRAME_POINTER
+	switch (arg) {
+	case 1:
+		return __builtin_return_address(1);
+	case 2:
+		return __builtin_return_address(2);
+	default:
+		BUG();
+	}
+#else
+	unsigned long entries[1];
+
+	stack_trace_save(entries, 1, arg);
+	return (void *)entries[0];
+#endif
+}
+
+bool kmsan_internal_is_module_addr(void *vaddr)
+{
+	return ((u64)vaddr >= MODULES_VADDR) && ((u64)vaddr < MODULES_END);
+}
+
+bool kmsan_internal_is_vmalloc_addr(void *addr)
+{
+	return ((u64)addr >= VMALLOC_START) && ((u64)addr < VMALLOC_END);
+}
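For illustration, the address arithmetic used throughout mm/kmsan/kmsan.c can be sketched in userspace C. The first two helpers mirror how kmsan_internal_memset_shadow() and kmsan_internal_check_memory() walk a range in page-bounded chunks; the third mirrors the src_slots/dst_slots computation in kmsan_memcpy_memmove_metadata(). This is a minimal sketch, not the kernel code: every sketch_* name is hypothetical, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   4096u /* assumption: 4 KiB pages */
#define SKETCH_ORIGIN_SIZE 4u    /* same value as ORIGIN_SIZE in kmsan.h */

/* Bytes from addr up to the end of its page, capped at size. */
static size_t sketch_chunk_size(uint64_t addr, size_t size)
{
	size_t to_page_end = SKETCH_PAGE_SIZE - (size_t)(addr % SKETCH_PAGE_SIZE);

	return size < to_page_end ? size : to_page_end;
}

/* Number of page-bounded chunks needed to cover [addr, addr + size). */
static unsigned int sketch_count_chunks(uint64_t addr, size_t size)
{
	unsigned int chunks = 0;

	while (size) {
		size_t n = sketch_chunk_size(addr, size);

		addr += n;
		size -= n;
		chunks++;
	}
	return chunks;
}

/* Number of 4-byte origin slots overlapped by [addr, addr + n). */
static unsigned int sketch_origin_slots(uint64_t addr, size_t n)
{
	uint64_t mask = ~(uint64_t)(SKETCH_ORIGIN_SIZE - 1);
	uint64_t first = addr & mask;                                /* ALIGN_DOWN */
	uint64_t last = (addr + n + SKETCH_ORIGIN_SIZE - 1) & mask;  /* ALIGN */

	assert(n > 0); /* the kernel code uses BUG_ON() for empty ranges */
	return (unsigned int)((last - first) / SKETCH_ORIGIN_SIZE);
}
```

With these definitions, a 10-byte range starting 6 bytes before a page boundary is visited in two chunks, and an unaligned 4-byte access at offset 2 overlaps two origin slots, which is why the source and destination slot counts may differ by one.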
diff --git a/mm/kmsan/kmsan.h b/mm/kmsan/kmsan.h
new file mode 100644
index 0000000000000..9568b0005b5e7
--- /dev/null
+++ b/mm/kmsan/kmsan.h
@@ -0,0 +1,161 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KMSAN internal declarations.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#ifndef __MM_KMSAN_KMSAN_H
+#define __MM_KMSAN_KMSAN_H
+
+#include <asm/pgtable_64_types.h>
+#include <linux/irqflags.h>
+#include <linux/sched.h>
+#include <linux/stackdepot.h>
+#include <linux/stacktrace.h>
+#include <linux/nmi.h>
+#include <linux/mm.h>
+#include <linux/printk.h>
+
+#include "kmsan_shadow.h"
+
+#define KMSAN_MAGIC_MASK 0xffffffffff00
+#define KMSAN_ALLOCA_MAGIC_ORIGIN 0x4110c4071900
+#define KMSAN_CHAIN_MAGIC_ORIGIN_FULL 0xd419170cba00
+
+#define KMSAN_POISON_NOCHECK	0x0
+#define KMSAN_POISON_CHECK	0x1
+#define KMSAN_POISON_FREE	0x2
+
+#define ORIGIN_SIZE 4
+
+#define META_SHADOW	(false)
+#define META_ORIGIN	(true)
+
+#define KMSAN_NESTED_CONTEXT_MAX (8)
+/* [0] for dummy per-CPU context */
+DECLARE_PER_CPU(struct kmsan_context_state[KMSAN_NESTED_CONTEXT_MAX],
+		kmsan_percpu_cstate);
+/* 0 for task context, |i>0| for kmsan_percpu_cstate[i]. */
+DECLARE_PER_CPU(int, kmsan_context_level);
+
+extern spinlock_t report_lock;
+extern bool kmsan_ready;
+
+void kmsan_print_origin(depot_stack_handle_t origin);
+void kmsan_report(depot_stack_handle_t origin,
+		  void *address, int size, int off_first, int off_last,
+		  const void *user_addr, int reason);
+
+enum KMSAN_BUG_REASON {
+	REASON_ANY,
+	REASON_COPY_TO_USER,
+	REASON_USE_AFTER_FREE,
+	REASON_SUBMIT_URB,
+};
+
+/*
+ * When a compiler hook is invoked, it may make a call to instrumented code
+ * and eventually call itself recursively. To avoid that, we protect the
+ * runtime entry points with kmsan_enter_runtime()/kmsan_leave_runtime() and
+ * exit the hook if kmsan_in_runtime() is true. But when an interrupt occurs
+ * inside the runtime, the hooks won't run either, which may lead to errors.
+ * Therefore we have to disable interrupts inside the runtime.
+ */
+DECLARE_PER_CPU(int, kmsan_in_runtime_cnt);
+
+static __always_inline bool kmsan_in_runtime(void)
+{
+	return this_cpu_read(kmsan_in_runtime_cnt);
+}
+
+static __always_inline unsigned long kmsan_enter_runtime(void)
+{
+	int level;
+	unsigned long irq_flags;
+
+	preempt_disable();
+	local_irq_save(irq_flags);
+	stop_nmi();
+	level = this_cpu_inc_return(kmsan_in_runtime_cnt);
+	BUG_ON(level > 1);
+	return irq_flags;
+}
+
+static __always_inline void kmsan_leave_runtime(unsigned long irq_flags)
+{
+	int level = this_cpu_dec_return(kmsan_in_runtime_cnt);
+
+	if (level)
+		panic("kmsan_in_runtime: %d\n", level);
+	restart_nmi();
+	local_irq_restore(irq_flags);
+	preempt_enable();
+}
+
+void kmsan_memcpy_metadata(void *dst, void *src, size_t n);
+void kmsan_memmove_metadata(void *dst, void *src, size_t n);
+
+depot_stack_handle_t kmsan_save_stack(void);
+depot_stack_handle_t kmsan_save_stack_with_flags(gfp_t flags,
+						 unsigned int extra_bits);
+
+/*
+ * Pack and unpack the origin chain depth and UAF flag to/from the extra bits
+ * provided by the stack depot.
+ * The UAF flag is stored in the lowest bit, followed by the depth in the upper
+ * bits.
+ * set_dsh_extra_bits() is responsible for clamping the value.
+ */
+static __always_inline unsigned int kmsan_extra_bits(unsigned int depth,
+						     bool uaf)
+{
+	return (depth << 1) | uaf;
+}
+
+static __always_inline bool kmsan_uaf_from_eb(unsigned int extra_bits)
+{
+	return extra_bits & 1;
+}
+
+static __always_inline unsigned int kmsan_depth_from_eb(unsigned int extra_bits)
+{
+	return extra_bits >> 1;
+}
+
+void kmsan_internal_poison_shadow(void *address, size_t size, gfp_t flags,
+				  unsigned int poison_flags);
+void kmsan_internal_unpoison_shadow(void *address, size_t size, bool checked);
+void kmsan_internal_memset_shadow(void *address, int b, size_t size,
+				  bool checked);
+depot_stack_handle_t kmsan_internal_chain_origin(depot_stack_handle_t id);
+void kmsan_write_aligned_origin(void *var, size_t size, u32 origin);
+
+void kmsan_internal_task_create(struct task_struct *task);
+void kmsan_internal_set_origin(void *addr, int size, u32 origin);
+void kmsan_set_origin_checked(void *addr, int size, u32 origin, bool checked);
+
+struct kmsan_context_state *kmsan_task_context_state(void);
+
+bool metadata_is_contiguous(void *addr, size_t size, bool is_origin);
+void kmsan_internal_check_memory(void *addr, size_t size, const void *user_addr,
+				 int reason);
+
+struct page *vmalloc_to_page_or_null(void *vaddr);
+
+/* Declared in mm/vmalloc.c */
+void __vunmap_page_range(unsigned long addr, unsigned long end);
+int __vmap_page_range_noflush(unsigned long start, unsigned long end,
+				   pgprot_t prot, struct page **pages);
+
+void *kmsan_internal_return_address(int arg);
+bool kmsan_internal_is_module_addr(void *vaddr);
+bool kmsan_internal_is_vmalloc_addr(void *addr);
+
+#endif  /* __MM_KMSAN_KMSAN_H */
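The extra-bits layout described in mm/kmsan/kmsan.h (UAF flag in the lowest bit, chain depth above it) can be exercised outside the kernel. The sketch below is a userspace approximation under stated assumptions: the sketch_* names are hypothetical, SKETCH_EXTRA_BITS assumes the 5 bits reserved by the stackdepot patch in this series, and SKETCH_MAX_CHAIN_DEPTH copies the value of MAX_CHAIN_DEPTH from kmsan.c.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_EXTRA_BITS      5 /* assumption: bits reserved in the handle */
#define SKETCH_MAX_CHAIN_DEPTH 7 /* same value as MAX_CHAIN_DEPTH */

/* Same condition as the BUILD_BUG_ON() in kmsan_internal_chain_origin(). */
static_assert((1u << SKETCH_EXTRA_BITS) > (SKETCH_MAX_CHAIN_DEPTH << 1),
	      "depth and UAF bit must fit into the extra bits");

/* Pack the chain depth and the use-after-free flag. */
static unsigned int sketch_extra_bits(unsigned int depth, bool uaf)
{
	return (depth << 1) | (uaf ? 1u : 0u);
}

/* The UAF flag lives in bit 0. */
static bool sketch_uaf_from_eb(unsigned int extra_bits)
{
	return extra_bits & 1u;
}

/* The depth occupies the remaining upper bits. */
static unsigned int sketch_depth_from_eb(unsigned int extra_bits)
{
	return extra_bits >> 1;
}
```

The largest value the runtime can produce is depth 7 with the UAF flag set, which packs to 15 and therefore fits into the assumed 5 extra bits.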
diff --git a/mm/kmsan/kmsan_init.c b/mm/kmsan/kmsan_init.c
new file mode 100644
index 0000000000000..12c84efa70ff9
--- /dev/null
+++ b/mm/kmsan/kmsan_init.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN initialization routines.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include "kmsan.h"
+
+#include <asm/cpu_entry_area.h>
+#include <asm/sections.h>
+#include <linux/mm.h>
+#include <linux/memblock.h>
+
+#define NUM_FUTURE_RANGES 128
+struct start_end_pair {
+	void *start, *end;
+};
+
+static struct start_end_pair start_end_pairs[NUM_FUTURE_RANGES] __initdata;
+static int future_index __initdata;
+
+/*
+ * Record a range of memory for which the metadata pages will be created once
+ * the page allocator becomes available.
+ */
+static void __init kmsan_record_future_shadow_range(void *start, void *end)
+{
+	BUG_ON(future_index == NUM_FUTURE_RANGES);
+	BUG_ON((start >= end) || !start || !end);
+	start_end_pairs[future_index].start = start;
+	start_end_pairs[future_index].end = end;
+	future_index++;
+}
+
+/*
+ * Initialize the shadow for existing mappings during kernel initialization.
+ * These include kernel text/data sections, NODE_DATA and future ranges
+ * registered while creating other data (e.g. percpu).
+ *
+ * Allocations via memblock can be only done before slab is initialized.
+ */
+void __init kmsan_initialize_shadow(void)
+{
+	int nid;
+	u64 i;
+	const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
+	phys_addr_t p_start, p_end;
+
+	for_each_reserved_mem_region(i, &p_start, &p_end)
+		kmsan_record_future_shadow_range(phys_to_virt(p_start),
+						 phys_to_virt(p_end + 1));
+	/* Allocate shadow for .data */
+	kmsan_record_future_shadow_range(_sdata, _edata);
+
+	for_each_online_node(nid)
+		kmsan_record_future_shadow_range(
+			NODE_DATA(nid), (char *)NODE_DATA(nid) + nd_size);
+
+	for (i = 0; i < future_index; i++)
+		kmsan_init_alloc_meta_for_range(start_end_pairs[i].start,
+						start_end_pairs[i].end);
+}
+EXPORT_SYMBOL(kmsan_initialize_shadow);
+
+void __init kmsan_initialize(void)
+{
+	/* Assuming current is init_task */
+	kmsan_internal_task_create(current);
+	pr_info("Starting KernelMemorySanitizer\n");
+	kmsan_ready = true;
+}
+EXPORT_SYMBOL(kmsan_initialize);
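The record-now, allocate-later pattern behind kmsan_record_future_shadow_range() can be sketched in userspace C as follows. This is only an illustration under assumptions: the sketch_* names, the 4-entry table and the error return codes are hypothetical (the kernel code uses BUG_ON() and a 128-entry __initdata array), and summing range sizes stands in for the later metadata allocation pass.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_RANGES 4 /* the patch uses NUM_FUTURE_RANGES = 128 */

struct sketch_range {
	const char *start, *end;
};

static struct sketch_range sketch_ranges[SKETCH_NUM_RANGES];
static int sketch_index;
static char sketch_buf[64]; /* stand-in for memory that needs metadata */

/* Record [start, end) for later processing; returns 0 on success, -1 on error. */
static int sketch_record_range(const char *start, const char *end)
{
	if (sketch_index == SKETCH_NUM_RANGES)
		return -1; /* table is full */
	if (!start || !end || start >= end)
		return -1; /* empty or malformed range */
	sketch_ranges[sketch_index].start = start;
	sketch_ranges[sketch_index].end = end;
	sketch_index++;
	return 0;
}

/* Later pass over the recorded ranges; here it only sums their sizes. */
static size_t sketch_total_bytes(void)
{
	size_t total = 0;
	int i;

	assert(sketch_index <= SKETCH_NUM_RANGES);
	for (i = 0; i < sketch_index; i++)
		total += (size_t)(sketch_ranges[i].end - sketch_ranges[i].start);
	return total;
}
```

Deferring the work this way keeps the recording step free of allocations, which matters early in boot when the page allocator is not yet available.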
diff --git a/mm/kmsan/kmsan_report.c b/mm/kmsan/kmsan_report.c
new file mode 100644
index 0000000000000..7455fa7d10bb2
--- /dev/null
+++ b/mm/kmsan/kmsan_report.c
@@ -0,0 +1,143 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN error reporting routines.
+ *
+ * Copyright (C) 2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/console.h>
+#include <linux/stackdepot.h>
+#include <linux/stacktrace.h>
+
+#include "kmsan.h"
+
+DEFINE_SPINLOCK(report_lock);
+
+void kmsan_print_origin(depot_stack_handle_t origin)
+{
+	unsigned long *entries = NULL, *chained_entries = NULL;
+	unsigned long nr_entries, chained_nr_entries, magic;
+	char *descr = NULL;
+	void *pc1 = NULL, *pc2 = NULL;
+	depot_stack_handle_t head;
+
+	if (!origin)
+		return;
+
+	while (true) {
+		nr_entries = stack_depot_fetch(origin, &entries);
+		magic = nr_entries ? (entries[0] & KMSAN_MAGIC_MASK) : 0;
+		if ((nr_entries == 4) && (magic == KMSAN_ALLOCA_MAGIC_ORIGIN)) {
+			descr = (char *)entries[1];
+			pc1 = (void *)entries[2];
+			pc2 = (void *)entries[3];
+			pr_err("Local variable %s created at:\n", descr);
+			pr_err(" %pS\n", pc1);
+			pr_err(" %pS\n", pc2);
+			break;
+		}
+		if ((nr_entries == 3) &&
+		    (magic == KMSAN_CHAIN_MAGIC_ORIGIN_FULL)) {
+			head = entries[1];
+			origin = entries[2];
+			pr_err("Uninit was stored to memory at:\n");
+			chained_nr_entries =
+				stack_depot_fetch(head, &chained_entries);
+			stack_trace_print(chained_entries, chained_nr_entries,
+					  0);
+			pr_err("\n");
+			continue;
+		}
+		pr_err("Uninit was created at:\n");
+		if (nr_entries)
+			stack_trace_print(entries, nr_entries, 0);
+		else
+			pr_err("(stack is not available)\n");
+		break;
+	}
+}
+
+/**
+ * kmsan_report() - Report a use of uninitialized value.
+ * @origin:    Stack ID of the uninitialized value.
+ * @address:   Address at which the memory access happens.
+ * @size:      Memory access size.
+ * @off_first: Offset (from @address) of the first byte to be reported.
+ * @off_last:  Offset (from @address) of the last byte to be reported.
+ * @user_addr: When non-NULL, denotes the userspace address to which the kernel
+ *             is leaking data.
+ * @reason:    Error type from KMSAN_BUG_REASON enum.
+ *
+ * kmsan_report() prints an error message for a consecutive group of bytes
+ * sharing the same origin. If an uninitialized value is used in a comparison,
+ * this function is called once without specifying the addresses. When checking
+ * a memory range, KMSAN may call kmsan_report() multiple times with the same
+ * @address, @size, @user_addr and @reason, but different @off_first and
+ * @off_last corresponding to different @origin values.
+ */
+void kmsan_report(depot_stack_handle_t origin,
+		  void *address, int size, int off_first, int off_last,
+		  const void *user_addr, int reason)
+{
+	unsigned long flags;
+	bool is_uaf;
+	char *bug_type = NULL;
+
+	if (!kmsan_ready)
+		return;
+	if (!current->kmsan.allow_reporting)
+		return;
+	if (!origin)
+		return;
+
+	current->kmsan.allow_reporting = false;
+	spin_lock_irqsave(&report_lock, flags);
+	pr_err("=====================================================\n");
+	is_uaf = kmsan_uaf_from_eb(get_dsh_extra_bits(origin));
+	switch (reason) {
+	case REASON_ANY:
+		bug_type = is_uaf ? "use-after-free" : "uninit-value";
+		break;
+	case REASON_COPY_TO_USER:
+		bug_type = is_uaf ? "kernel-infoleak-after-free" :
+				    "kernel-infoleak";
+		break;
+	case REASON_SUBMIT_URB:
+		bug_type = is_uaf ? "kernel-usb-infoleak-after-free" :
+				    "kernel-usb-infoleak";
+		break;
+	}
+	pr_err("BUG: KMSAN: %s in %pS\n",
+	       bug_type, kmsan_internal_return_address(2));
+	dump_stack();
+	pr_err("\n");
+
+	kmsan_print_origin(origin);
+
+	if (size) {
+		pr_err("\n");
+		if (off_first == off_last)
+			pr_err("Byte %d of %d is uninitialized\n",
+			       off_first, size);
+		else
+			pr_err("Bytes %d-%d of %d are uninitialized\n",
+			       off_first, off_last, size);
+	}
+	if (address)
+		pr_err("Memory access of size %d starts at %px\n",
+		       size, address);
+	if (user_addr && reason == REASON_COPY_TO_USER)
+		pr_err("Data copied to user address %px\n", user_addr);
+	pr_err("=====================================================\n");
+	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
+	spin_unlock_irqrestore(&report_lock, flags);
+	if (panic_on_warn)
+		panic("panic_on_warn set ...\n");
+	current->kmsan.allow_reporting = true;
+}
diff --git a/mm/kmsan/kmsan_shadow.c b/mm/kmsan/kmsan_shadow.c
new file mode 100644
index 0000000000000..bcd4f1faa7a67
--- /dev/null
+++ b/mm/kmsan/kmsan_shadow.c
@@ -0,0 +1,456 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN shadow implementation.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <asm/cpu_entry_area.h>
+#include <asm/page.h>
+#include <asm/pgtable_64_types.h>
+#include <asm/tlbflush.h>
+#include <linux/memblock.h>
+#include <linux/mm_types.h>
+#include <linux/percpu-defs.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/stddef.h>
+
+#include "kmsan.h"
+#include "kmsan_shadow.h"
+
+#define shadow_page_for(page) ((page)->shadow)
+
+#define origin_page_for(page) ((page)->origin)
+
+#define shadow_ptr_for(page) (page_address((page)->shadow))
+
+#define origin_ptr_for(page) (page_address((page)->origin))
+
+#define has_shadow_page(page) (!!((page)->shadow))
+
+#define has_origin_page(page) (!!((page)->origin))
+
+#define set_no_shadow_origin_page(page)	\
+	do {				\
+		(page)->shadow = NULL;	\
+		(page)->origin = NULL;	\
+	} while (0) /**/
+
+#define is_ignored_page(page) (!!(((u64)((page)->shadow)) % 2))
+
+#define ignore_page(pg)	\
+	((pg)->shadow = (struct page *)((u64)((pg)->shadow) | 1))
+
+DEFINE_PER_CPU(char[CPU_ENTRY_AREA_SIZE], cpu_entry_area_shadow);
+DEFINE_PER_CPU(char[CPU_ENTRY_AREA_SIZE], cpu_entry_area_origin);
+
+/*
+ * Dummy load and store pages to be used when the real metadata is unavailable.
+ * There are separate pages for loads and stores, so that every load returns a
+ * zero, and every store doesn't affect other loads.
+ */
+char dummy_load_page[PAGE_SIZE] __aligned(PAGE_SIZE);
+char dummy_store_page[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+/*
+ * Taken from arch/x86/mm/physaddr.h to avoid using an instrumented version.
+ */
+static int kmsan_phys_addr_valid(unsigned long addr)
+{
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+	return !(addr >> boot_cpu_data.x86_phys_bits);
+#else
+	return 1;
+#endif
+}
+
+/*
+ * Taken from arch/x86/mm/physaddr.c to avoid using an instrumented version.
+ */
+static bool kmsan_virt_addr_valid(void *addr)
+{
+	unsigned long x = (unsigned long)addr;
+	unsigned long y = x - __START_KERNEL_map;
+
+	/* use the carry flag to determine if x was < __START_KERNEL_map */
+	if (unlikely(x > y)) {
+		x = y + phys_base;
+
+		if (y >= KERNEL_IMAGE_SIZE)
+			return false;
+	} else {
+		x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+		/* carry flag will be set if starting x was >= PAGE_OFFSET */
+		if ((x > y) || !kmsan_phys_addr_valid(x))
+			return false;
+	}
+
+	return pfn_valid(x >> PAGE_SHIFT);
+}
+
+static unsigned long vmalloc_meta(void *addr, bool is_origin)
+{
+	unsigned long addr64 = (unsigned long)addr, off;
+
+	BUG_ON(is_origin && !IS_ALIGNED(addr64, ORIGIN_SIZE));
+	if (kmsan_internal_is_vmalloc_addr(addr))
+		return addr64 + (is_origin ? VMALLOC_ORIGIN_OFFSET
+					   : VMALLOC_SHADOW_OFFSET);
+	if (kmsan_internal_is_module_addr(addr)) {
+		off = addr64 - MODULES_VADDR;
+		return off + (is_origin ? MODULES_ORIGIN_START
+					: MODULES_SHADOW_START);
+	}
+	return 0;
+}
+
+static void *get_cea_meta_or_null(void *addr, bool is_origin)
+{
+	int cpu = smp_processor_id();
+	int off;
+	char *metadata_array;
+
+	if (((u64)addr < CPU_ENTRY_AREA_BASE) ||
+	    ((u64)addr >= (CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE)))
+		return NULL;
+	off = (char *)addr - (char *)get_cpu_entry_area(cpu);
+	if ((off < 0) || (off >= CPU_ENTRY_AREA_SIZE))
+		return NULL;
+	metadata_array = is_origin ? cpu_entry_area_origin :
+				     cpu_entry_area_shadow;
+	return &per_cpu(metadata_array[off], cpu);
+}
+
+static struct page *virt_to_page_or_null(void *vaddr)
+{
+	if (kmsan_virt_addr_valid(vaddr))
+		return virt_to_page(vaddr);
+	else
+		return NULL;
+}
+
+struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *address, u64 size,
+						     bool store)
+{
+	struct shadow_origin_ptr ret;
+	void *shadow;
+
+	if (size > PAGE_SIZE)
+		panic("size too big in %s(%px, %llu, %d)\n",
+		     __func__, address, size, store);
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		goto return_dummy;
+
+	BUG_ON(!metadata_is_contiguous(address, size, META_SHADOW));
+	shadow = kmsan_get_metadata(address, size, META_SHADOW);
+	if (!shadow)
+		goto return_dummy;
+
+	ret.s = shadow;
+	ret.o = kmsan_get_metadata(address, size, META_ORIGIN);
+	return ret;
+
+return_dummy:
+	if (store) {
+		ret.s = dummy_store_page;
+		ret.o = dummy_store_page;
+	} else {
+		ret.s = dummy_load_page;
+		ret.o = dummy_load_page;
+	}
+	return ret;
+}
+
+/*
+ * Obtain the shadow or origin pointer for the given address, or NULL if there's
+ * none. The caller must check the return value for being non-NULL if needed.
+ * The return value of this function should not depend on whether we're in the
+ * runtime or not.
+ */
+void *kmsan_get_metadata(void *address, size_t size, bool is_origin)
+{
+	struct page *page;
+	void *ret;
+	u64 addr = (u64)address, pad, off;
+
+	if (is_origin && !IS_ALIGNED(addr, ORIGIN_SIZE)) {
+		pad = addr % ORIGIN_SIZE;
+		addr -= pad;
+		size += pad;
+	}
+	address = (void *)addr;
+	if (kmsan_internal_is_vmalloc_addr(address) ||
+	    kmsan_internal_is_module_addr(address))
+		return (void *)vmalloc_meta(address, is_origin);
+
+	ret = get_cea_meta_or_null(address, is_origin);
+	if (ret)
+		return ret;
+
+	page = virt_to_page_or_null(address);
+	if (!page)
+		return NULL;
+	if (is_ignored_page(page))
+		return NULL;
+	if (!has_shadow_page(page) || !has_origin_page(page))
+		return NULL;
+	off = addr % PAGE_SIZE;
+
+	ret = (is_origin ? origin_ptr_for(page) : shadow_ptr_for(page)) + off;
+	return ret;
+}
+
+void __init kmsan_init_alloc_meta_for_range(void *start, void *end)
+{
+	u64 addr, size;
+	struct page *page;
+	void *shadow, *origin;
+	struct page *shadow_p, *origin_p;
+
+	start = (void *)ALIGN_DOWN((u64)start, PAGE_SIZE);
+	size = ALIGN((u64)end - (u64)start, PAGE_SIZE);
+	shadow = memblock_alloc(size, PAGE_SIZE);
+	origin = memblock_alloc(size, PAGE_SIZE);
+	for (addr = 0; addr < size; addr += PAGE_SIZE) {
+		page = virt_to_page_or_null((char *)start + addr);
+		shadow_p = virt_to_page_or_null((char *)shadow + addr);
+		set_no_shadow_origin_page(shadow_p);
+		shadow_page_for(page) = shadow_p;
+		origin_p = virt_to_page_or_null((char *)origin + addr);
+		set_no_shadow_origin_page(origin_p);
+		origin_page_for(page) = origin_p;
+	}
+}
+
+/* Called from mm/memory.c */
+void kmsan_copy_page_meta(struct page *dst, struct page *src)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	if (!has_shadow_page(src)) {
+		kmsan_internal_unpoison_shadow(page_address(dst), PAGE_SIZE,
+					       /*checked*/false);
+		return;
+	}
+	if (!has_shadow_page(dst))
+		return;
+	if (is_ignored_page(src)) {
+		ignore_page(dst);
+		return;
+	}
+
+	irq_flags = kmsan_enter_runtime();
+	__memcpy(shadow_ptr_for(dst), shadow_ptr_for(src),
+		PAGE_SIZE);
+	BUG_ON(!has_origin_page(src) || !has_origin_page(dst));
+	__memcpy(origin_ptr_for(dst), origin_ptr_for(src),
+		PAGE_SIZE);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_copy_page_meta);
+
+/* Helper function to allocate page metadata. */
+static int kmsan_internal_alloc_meta_for_pages(struct page *page,
+					       unsigned int order,
+					       gfp_t flags, int node)
+{
+	struct page *shadow, *origin;
+	int pages = 1 << order;
+	int i;
+	bool initialized = (flags & __GFP_ZERO) || !kmsan_ready;
+	depot_stack_handle_t handle;
+
+	if (flags & __GFP_NO_KMSAN_SHADOW) {
+		for (i = 0; i < pages; i++)
+			set_no_shadow_origin_page(&page[i]);
+		return 0;
+	}
+
+	/*
+	 * We always want metadata allocations to succeed and to finish fast.
+	 */
+	flags = GFP_ATOMIC;
+	if (initialized)
+		flags |= __GFP_ZERO;
+	shadow = alloc_pages_node(node, flags | __GFP_NO_KMSAN_SHADOW, order);
+	if (!shadow) {
+		for (i = 0; i < pages; i++) {
+			set_no_shadow_origin_page(&page[i]);
+			set_no_shadow_origin_page(&page[i]);
+		}
+		return -ENOMEM;
+	}
+	if (!initialized)
+		__memset(page_address(shadow), -1, PAGE_SIZE * pages);
+
+	origin = alloc_pages_node(node, flags | __GFP_NO_KMSAN_SHADOW, order);
+	/* Assume we've allocated the origin. */
+	if (!origin) {
+		__free_pages(shadow, order);
+		for (i = 0; i < pages; i++)
+			set_no_shadow_origin_page(&page[i]);
+		return -ENOMEM;
+	}
+
+	if (!initialized) {
+		handle = kmsan_save_stack_with_flags(flags, /*extra_bits*/0);
+		/*
+		 * Addresses are page-aligned, pages are contiguous, so it's ok
+		 * to just fill the origin pages with |handle|.
+		 */
+		for (i = 0; i < PAGE_SIZE * pages / sizeof(handle); i++) {
+			((depot_stack_handle_t *)page_address(origin))[i] =
+						handle;
+		}
+	}
+
+	for (i = 0; i < pages; i++) {
+		shadow_page_for(&page[i]) = &shadow[i];
+		set_no_shadow_origin_page(shadow_page_for(&page[i]));
+		origin_page_for(&page[i]) = &origin[i];
+		set_no_shadow_origin_page(origin_page_for(&page[i]));
+	}
+	return 0;
+}
+
+/* Called from mm/page_alloc.c */
+int kmsan_alloc_page(struct page *page, unsigned int order, gfp_t flags)
+{
+	unsigned long irq_flags;
+	int ret;
+
+	if (kmsan_in_runtime())
+		return 0;
+	irq_flags = kmsan_enter_runtime();
+	ret = kmsan_internal_alloc_meta_for_pages(page, order, flags, -1);
+	kmsan_leave_runtime(irq_flags);
+	return ret;
+}
+
+/* Called from mm/page_alloc.c */
+void kmsan_free_page(struct page *page, unsigned int order)
+{
+	struct page *shadow, *origin, *cur_page;
+	int pages = 1 << order;
+	int i;
+	unsigned long irq_flags;
+
+	if (!shadow_page_for(page)) {
+		for (i = 0; i < pages; i++) {
+			cur_page = &page[i];
+			BUG_ON(shadow_page_for(cur_page));
+		}
+		return;
+	}
+
+	if (!kmsan_ready) {
+		for (i = 0; i < pages; i++) {
+			cur_page = &page[i];
+			set_no_shadow_origin_page(cur_page);
+		}
+		return;
+	}
+
+	irq_flags = kmsan_enter_runtime();
+	shadow = shadow_page_for(&page[0]);
+	origin = origin_page_for(&page[0]);
+
+	for (i = 0; i < pages; i++) {
+		BUG_ON(has_shadow_page(shadow_page_for(&page[i])));
+		BUG_ON(has_shadow_page(origin_page_for(&page[i])));
+		set_no_shadow_origin_page(&page[i]);
+	}
+	BUG_ON(has_shadow_page(shadow));
+	__free_pages(shadow, order);
+
+	BUG_ON(has_shadow_page(origin));
+	__free_pages(origin, order);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_free_page);
+
+/* Called from mm/page_alloc.c */
+void kmsan_split_page(struct page *page, unsigned int order)
+{
+	struct page *shadow, *origin;
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	irq_flags = kmsan_enter_runtime();
+	if (!has_shadow_page(&page[0])) {
+		BUG_ON(has_origin_page(&page[0]));
+		kmsan_leave_runtime(irq_flags);
+		return;
+	}
+	shadow = shadow_page_for(&page[0]);
+	split_page(shadow, order);
+
+	origin = origin_page_for(&page[0]);
+	split_page(origin, order);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_split_page);
+
+/* Called from mm/vmalloc.c */
+void kmsan_vmap_page_range_noflush(unsigned long start, unsigned long end,
+				   pgprot_t prot, struct page **pages)
+{
+	int nr, i, mapped;
+	struct page **s_pages, **o_pages;
+	unsigned long shadow_start, shadow_end, origin_start, origin_end;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	shadow_start = vmalloc_meta((void *)start, META_SHADOW);
+	if (!shadow_start)
+		return;
+
+	BUG_ON(start >= end);
+	nr = (end - start) / PAGE_SIZE;
+	s_pages = kcalloc(nr, sizeof(struct page *), GFP_KERNEL);
+	o_pages = kcalloc(nr, sizeof(struct page *), GFP_KERNEL);
+	if (!s_pages || !o_pages)
+		goto ret;
+	for (i = 0; i < nr; i++) {
+		s_pages[i] = shadow_page_for(pages[i]);
+		o_pages[i] = origin_page_for(pages[i]);
+	}
+	prot = __pgprot(pgprot_val(prot) | _PAGE_NX);
+	prot = PAGE_KERNEL;
+
+	shadow_end = vmalloc_meta((void *)end, META_SHADOW);
+	origin_start = vmalloc_meta((void *)start, META_ORIGIN);
+	origin_end = vmalloc_meta((void *)end, META_ORIGIN);
+	mapped = __vmap_page_range_noflush(shadow_start, shadow_end,
+					   prot, s_pages);
+	BUG_ON(mapped != nr);
+	flush_tlb_kernel_range(shadow_start, shadow_end);
+	mapped = __vmap_page_range_noflush(origin_start, origin_end,
+					   prot, o_pages);
+	BUG_ON(mapped != nr);
+	flush_tlb_kernel_range(origin_start, origin_end);
+ret:
+	kfree(s_pages);
+	kfree(o_pages);
+}
+
+void kmsan_ignore_page(struct page *page, int order)
+{
+	int i;
+
+	for (i = 0; i < 1 << order; i++)
+		ignore_page(&page[i]);
+}
diff --git a/mm/kmsan/kmsan_shadow.h b/mm/kmsan/kmsan_shadow.h
new file mode 100644
index 0000000000000..eaa7f771b6a52
--- /dev/null
+++ b/mm/kmsan/kmsan_shadow.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KMSAN shadow API.
+ *
+ * This should be agnostic to shadow implementation details.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#ifndef __MM_KMSAN_KMSAN_SHADOW_H
+#define __MM_KMSAN_KMSAN_SHADOW_H
+
+#include <asm/cpu_entry_area.h>  /* for CPU_ENTRY_AREA_MAP_SIZE */
+
+struct shadow_origin_ptr {
+	void *s, *o;
+};
+
+struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *addr, u64 size,
+						     bool store);
+void *kmsan_get_metadata(void *addr, size_t size, bool is_origin);
+void __init kmsan_init_alloc_meta_for_range(void *start, void *end);
+
+#endif  /* __MM_KMSAN_KMSAN_SHADOW_H */
diff --git a/scripts/Makefile.kmsan b/scripts/Makefile.kmsan
new file mode 100644
index 0000000000000..8b3844b66b228
--- /dev/null
+++ b/scripts/Makefile.kmsan
@@ -0,0 +1,12 @@
+ifdef CONFIG_KMSAN
+
+CFLAGS_KMSAN := -fsanitize=kernel-memory
+
+ifeq ($(call cc-option, $(CFLAGS_KMSAN) -Werror),)
+   ifneq ($(CONFIG_COMPILE_TEST),y)
+        $(warning Cannot use CONFIG_KMSAN: \
+            -fsanitize=kernel-memory is not supported by compiler)
+   endif
+endif
+
+endif
-- 
2.25.1.696.g5e7596f4ac-goog



* [PATCH v5 07/38] kmsan: KMSAN compiler API implementation
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (5 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 06/38] kmsan: add KMSAN runtime core glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 08/38] kmsan: add KMSAN hooks for kernel subsystems glider
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Wolfram Sang, Christoph Hellwig,
	Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor

kmsan_instr.c contains the functions called by KMSAN instrumentation.
These include functions that:
 - return shadow/origin pointers for memory accesses;
 - poison and unpoison local variables;
 - provide KMSAN context state to pass metadata for function arguments;
 - perform string operations (mem*) on metadata;
 - tell KMSAN to report an error.

This patch has been split away from the rest of KMSAN runtime to
simplify the review process.
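
As a rough illustration of the first item: before each memory access the
instrumentation asks the runtime for a shadow/origin pointer pair, and
the runtime falls back to dummy pages when no metadata exists. The
sketch below is a userspace toy model, not the kernel code. The
toy_get_shadow_origin_ptr() helper, the toy_* arrays and the single
tracked region are assumptions made for illustration, and origin
alignment is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_REGION_SIZE ((size_t)64)

struct toy_shadow_origin_ptr {
	void *s, *o;
};

/* One tracked region plus its shadow and origin (hypothetical layout). */
static char toy_region[TOY_REGION_SIZE];
static char toy_shadow[TOY_REGION_SIZE];
static char toy_origin[TOY_REGION_SIZE];

/* Separate dummy buffers, so stores never change what loads observe. */
static char toy_dummy_load[TOY_REGION_SIZE];
static char toy_dummy_store[TOY_REGION_SIZE];

static struct toy_shadow_origin_ptr
toy_get_shadow_origin_ptr(void *addr, size_t size, bool store)
{
	struct toy_shadow_origin_ptr ret;
	uintptr_t a = (uintptr_t)addr, base = (uintptr_t)toy_region;

	/* Real metadata exists only for accesses inside the tracked region. */
	if (a >= base && size <= TOY_REGION_SIZE &&
	    a - base <= TOY_REGION_SIZE - size) {
		ret.s = toy_shadow + (a - base);
		ret.o = toy_origin + (a - base);
		return ret;
	}
	ret.s = store ? toy_dummy_store : toy_dummy_load;
	ret.o = ret.s;
	return ret;
}

/* Roughly what an instrumented 1-byte store of an initialized value does. */
static void toy_store_byte(char *addr, char value)
{
	struct toy_shadow_origin_ptr m =
		toy_get_shadow_origin_ptr(addr, 1, /*store*/true);

	*(char *)m.s = 0;	/* shadow 0 means "initialized" */
	*addr = value;
}

/* An instrumented 1-byte load yields the value together with its shadow. */
static char toy_load_byte(char *addr, char *shadow_out)
{
	struct toy_shadow_origin_ptr m =
		toy_get_shadow_origin_ptr(addr, 1, /*store*/false);

	*shadow_out = *(char *)m.s;
	return *addr;
}
```

In this model an access outside the tracked region always reads a zero
shadow, because loads consult toy_dummy_load while stores only ever
write to toy_dummy_store.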

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

v4:
 - split this patch away as requested by Andrey Konovalov
 - removed redundant address checks when copying shadow
 - fix __msan_memmove prototype

Change-Id: I826272ed2ebe8ab8ef61a9d4cccdcf07c7b6b499
---
 mm/kmsan/kmsan_instr.c | 229 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 229 insertions(+)
 create mode 100644 mm/kmsan/kmsan_instr.c

diff --git a/mm/kmsan/kmsan_instr.c b/mm/kmsan/kmsan_instr.c
new file mode 100644
index 0000000000000..0de8aafac5101
--- /dev/null
+++ b/mm/kmsan/kmsan_instr.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN compiler API.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include "kmsan.h"
+#include <linux/gfp.h>
+#include <linux/mm.h>
+
+static bool is_bad_asm_addr(void *addr, u64 size, bool is_store)
+{
+	if ((u64)addr < TASK_SIZE)
+		return true;
+	if (!kmsan_get_metadata(addr, size, META_SHADOW))
+		return true;
+	return false;
+}
+
+struct shadow_origin_ptr __msan_metadata_ptr_for_load_n(void *addr, u64 size)
+{
+	return kmsan_get_shadow_origin_ptr(addr, size, /*store*/false);
+}
+EXPORT_SYMBOL(__msan_metadata_ptr_for_load_n);
+
+struct shadow_origin_ptr __msan_metadata_ptr_for_store_n(void *addr, u64 size)
+{
+	return kmsan_get_shadow_origin_ptr(addr, size, /*store*/true);
+}
+EXPORT_SYMBOL(__msan_metadata_ptr_for_store_n);
+
+#define DECLARE_METADATA_PTR_GETTER(size)	\
+struct shadow_origin_ptr __msan_metadata_ptr_for_load_##size(void *addr) \
+{		\
+	return kmsan_get_shadow_origin_ptr(addr, size, /*store*/false);	\
+}		\
+EXPORT_SYMBOL(__msan_metadata_ptr_for_load_##size);			\
+		\
+struct shadow_origin_ptr __msan_metadata_ptr_for_store_##size(void *addr) \
+{									\
+	return kmsan_get_shadow_origin_ptr(addr, size, /*store*/true);	\
+}									\
+EXPORT_SYMBOL(__msan_metadata_ptr_for_store_##size)
+
+DECLARE_METADATA_PTR_GETTER(1);
+DECLARE_METADATA_PTR_GETTER(2);
+DECLARE_METADATA_PTR_GETTER(4);
+DECLARE_METADATA_PTR_GETTER(8);
+
+void __msan_instrument_asm_store(void *addr, u64 size)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	/*
+	 * Most of the accesses are below 32 bytes. The two exceptions so far
+	 * are clwb() (64 bytes) and FPU state (512 bytes).
+	 * It's unlikely that the assembly will touch more than 512 bytes.
+	 */
+	if (size > 512) {
+		WARN_ONCE(1, "assembly store size too big: %llu\n", size);
+		size = 8;
+	}
+	if (is_bad_asm_addr(addr, size, /*is_store*/true))
+		return;
+	irq_flags = kmsan_enter_runtime();
+	/* Unpoisoning the memory on best effort. */
+	kmsan_internal_unpoison_shadow(addr, size, /*checked*/false);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(__msan_instrument_asm_store);
+
+void *__msan_memmove(void *dst, const void *src, size_t n)
+{
+	void *result;
+
+	result = __memmove(dst, src, n);
+	if (!n)
+		/* Some people call memmove() with zero length. */
+		return result;
+	if (!kmsan_ready || kmsan_in_runtime())
+		return result;
+
+	kmsan_memmove_metadata(dst, (void *)src, n);
+
+	return result;
+}
+EXPORT_SYMBOL(__msan_memmove);
+
+void *__msan_memmove_nosanitize(void *dst, void *src, u64 n)
+{
+	return __memmove(dst, src, n);
+}
+EXPORT_SYMBOL(__msan_memmove_nosanitize);
+
+void *__msan_memcpy(void *dst, const void *src, u64 n)
+{
+	void *result;
+
+	result = __memcpy(dst, src, n);
+	if (!n)
+		/* Some people call memcpy() with zero length. */
+		return result;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return result;
+
+	kmsan_memcpy_metadata(dst, (void *)src, n);
+
+	return result;
+}
+EXPORT_SYMBOL(__msan_memcpy);
+
+void *__msan_memcpy_nosanitize(void *dst, void *src, u64 n)
+{
+	return __memcpy(dst, src, n);
+}
+EXPORT_SYMBOL(__msan_memcpy_nosanitize);
+
+void *__msan_memset(void *dst, int c, size_t n)
+{
+	void *result;
+	unsigned long irq_flags;
+
+	result = __memset(dst, c, n);
+	if (!kmsan_ready || kmsan_in_runtime())
+		return result;
+
+	irq_flags = kmsan_enter_runtime();
+	/*
+	 * Clang doesn't pass parameter metadata here, so it is impossible to
+	 * use shadow of @c to set up the shadow for @dst.
+	 */
+	kmsan_internal_unpoison_shadow(dst, n, /*checked*/false);
+	kmsan_leave_runtime(irq_flags);
+
+	return result;
+}
+EXPORT_SYMBOL(__msan_memset);
+
+void *__msan_memset_nosanitize(void *dst, int c, size_t n)
+{
+	return __memset(dst, c, n);
+}
+EXPORT_SYMBOL(__msan_memset_nosanitize);
+
+depot_stack_handle_t __msan_chain_origin(depot_stack_handle_t origin)
+{
+	depot_stack_handle_t ret = 0;
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return ret;
+
+	/* Creating new origins may allocate memory. */
+	irq_flags = kmsan_enter_runtime();
+	ret = kmsan_internal_chain_origin(origin);
+	kmsan_leave_runtime(irq_flags);
+	return ret;
+}
+EXPORT_SYMBOL(__msan_chain_origin);
+
+void __msan_poison_alloca(void *address, u64 size, char *descr)
+{
+	depot_stack_handle_t handle;
+	unsigned long entries[4];
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	kmsan_internal_memset_shadow(address, -1, size, /*checked*/true);
+
+	entries[0] = KMSAN_ALLOCA_MAGIC_ORIGIN;
+	entries[1] = (u64)descr;
+	entries[2] = (u64)__builtin_return_address(0);
+	entries[3] = (u64)kmsan_internal_return_address(1);
+
+	/* stack_depot_save() may allocate memory. */
+	irq_flags = kmsan_enter_runtime();
+	handle = stack_depot_save(entries, ARRAY_SIZE(entries), GFP_ATOMIC);
+	kmsan_leave_runtime(irq_flags);
+	kmsan_internal_set_origin(address, size, handle);
+}
+EXPORT_SYMBOL(__msan_poison_alloca);
+
+void __msan_unpoison_alloca(void *address, u64 size)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	irq_flags = kmsan_enter_runtime();
+	kmsan_internal_unpoison_shadow(address, size, /*checked*/true);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(__msan_unpoison_alloca);
+
+void __msan_warning(u32 origin)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	irq_flags = kmsan_enter_runtime();
+	kmsan_report(origin, /*address*/0, /*size*/0,
+		/*off_first*/0, /*off_last*/0, /*user_addr*/0, REASON_ANY);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(__msan_warning);
+
+struct kmsan_context_state *__msan_get_context_state(void)
+{
+	struct kmsan_context_state *ret;
+
+	ret = kmsan_task_context_state();
+	BUG_ON(!ret);
+	return ret;
+}
+EXPORT_SYMBOL(__msan_get_context_state);
-- 
2.25.1.696.g5e7596f4ac-goog



* [PATCH v5 08/38] kmsan: add KMSAN hooks for kernel subsystems
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (6 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 07/38] kmsan: KMSAN compiler API implementation glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 09/38] kmsan: stackdepot: don't allocate KMSAN metadata for stackdepot glider
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Wolfram Sang, Christoph Hellwig,
	Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor

This patch provides hooks that subsystems use to notify KMSAN about
changes in the kernel state. Such changes include:
 - page operations (allocation, deletion, splitting, mapping);
 - memory allocation and deallocation;
 - entering and leaving IRQ/NMI/softirq contexts;
 - copying data between kernel, userspace and hardware.

This patch has been split away from the rest of KMSAN runtime to
simplify the review process.
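
For the third item, the entry hooks in kmsan_entry.c keep a per-CPU
nesting counter that is incremented on context entry and decremented on
exit. A simplified single-CPU userspace model is sketched below.
toy_context_level, TOY_NESTED_CONTEXT_MAX and its value are assumptions
for illustration (the real KMSAN_NESTED_CONTEXT_MAX is defined elsewhere
in the series), and the BUG_ON() checks are replaced by boolean return
values.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NESTED_CONTEXT_MAX 8	/* assumed limit, not from the patch */

static int toy_context_level;	/* stands in for the per-CPU counter */

/* Returns false where the kernel hook would hit BUG_ON(level >= MAX). */
static bool toy_context_enter(void)
{
	int level = ++toy_context_level;

	return level < TOY_NESTED_CONTEXT_MAX;
}

/* Returns false where the kernel hook would hit BUG_ON(level < 0). */
static bool toy_context_exit(void)
{
	int level = --toy_context_level;

	return level >= 0;
}
```

Balanced enter/exit pairs bring the counter back to its starting value;
an unmatched exit is the case the second BUG_ON() is meant to catch.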

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

v4:
 - address many review comments from Marco Elver and Andrey Konovalov:
 - clean up headers and #defines, remove debugging code
 - simplified KMSAN entry hooks
 - fixed kmsan_check_skb()

Change-Id: I99d1f34f26bef122897cb840dac8d5b34d2b6a80
---
 arch/x86/include/asm/kmsan.h |  93 ++++++++
 mm/kmsan/kmsan_entry.c       |  38 ++++
 mm/kmsan/kmsan_hooks.c       | 416 +++++++++++++++++++++++++++++++++++
 3 files changed, 547 insertions(+)
 create mode 100644 arch/x86/include/asm/kmsan.h
 create mode 100644 mm/kmsan/kmsan_entry.c
 create mode 100644 mm/kmsan/kmsan_hooks.c

diff --git a/arch/x86/include/asm/kmsan.h b/arch/x86/include/asm/kmsan.h
new file mode 100644
index 0000000000000..f924f29f90f97
--- /dev/null
+++ b/arch/x86/include/asm/kmsan.h
@@ -0,0 +1,93 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Assembly bits to safely invoke KMSAN hooks from .S files.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+#ifndef _ASM_X86_KMSAN_H
+#define _ASM_X86_KMSAN_H
+
+#ifdef CONFIG_KMSAN
+
+#ifdef __ASSEMBLY__
+.macro KMSAN_PUSH_REGS
+	pushq	%rax
+	pushq	%rcx
+	pushq	%rdx
+	pushq	%rdi
+	pushq	%rsi
+	pushq	%r8
+	pushq	%r9
+	pushq	%r10
+	pushq	%r11
+.endm
+
+.macro KMSAN_POP_REGS
+	popq	%r11
+	popq	%r10
+	popq	%r9
+	popq	%r8
+	popq	%rsi
+	popq	%rdi
+	popq	%rdx
+	popq	%rcx
+	popq	%rax
+
+.endm
+
+.macro KMSAN_CALL_HOOK fname
+	KMSAN_PUSH_REGS
+	call \fname
+	KMSAN_POP_REGS
+.endm
+
+.macro KMSAN_CONTEXT_ENTER
+	KMSAN_CALL_HOOK kmsan_context_enter
+.endm
+
+.macro KMSAN_CONTEXT_EXIT
+	KMSAN_CALL_HOOK kmsan_context_exit
+.endm
+
+#define KMSAN_INTERRUPT_ENTER KMSAN_CONTEXT_ENTER
+#define KMSAN_INTERRUPT_EXIT KMSAN_CONTEXT_EXIT
+
+#define KMSAN_SOFTIRQ_ENTER KMSAN_CONTEXT_ENTER
+#define KMSAN_SOFTIRQ_EXIT KMSAN_CONTEXT_EXIT
+
+#define KMSAN_NMI_ENTER KMSAN_CONTEXT_ENTER
+#define KMSAN_NMI_EXIT KMSAN_CONTEXT_EXIT
+
+#define KMSAN_IST_ENTER(shift_ist) KMSAN_CONTEXT_ENTER
+#define KMSAN_IST_EXIT(shift_ist) KMSAN_CONTEXT_EXIT
+
+.macro KMSAN_UNPOISON_PT_REGS
+	KMSAN_CALL_HOOK kmsan_unpoison_pt_regs
+.endm
+
+#else
+#error this header must be included into an assembly file
+#endif
+
+#else /* ifdef CONFIG_KMSAN */
+
+#define KMSAN_INTERRUPT_ENTER
+#define KMSAN_INTERRUPT_EXIT
+#define KMSAN_SOFTIRQ_ENTER
+#define KMSAN_SOFTIRQ_EXIT
+#define KMSAN_NMI_ENTER
+#define KMSAN_NMI_EXIT
+#define KMSAN_SYSCALL_ENTER
+#define KMSAN_SYSCALL_EXIT
+#define KMSAN_IST_ENTER(shift_ist)
+#define KMSAN_IST_EXIT(shift_ist)
+#define KMSAN_UNPOISON_PT_REGS
+
+#endif /* ifdef CONFIG_KMSAN */
+#endif /* ifndef _ASM_X86_KMSAN_H */
diff --git a/mm/kmsan/kmsan_entry.c b/mm/kmsan/kmsan_entry.c
new file mode 100644
index 0000000000000..7af31642cd451
--- /dev/null
+++ b/mm/kmsan/kmsan_entry.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN hooks for entry_64.S
+ *
+ * Copyright (C) 2018-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include "kmsan.h"
+
+void kmsan_context_enter(void)
+{
+	int level = this_cpu_inc_return(kmsan_context_level);
+
+	BUG_ON(level >= KMSAN_NESTED_CONTEXT_MAX);
+}
+EXPORT_SYMBOL(kmsan_context_enter);
+
+void kmsan_context_exit(void)
+{
+	int level = this_cpu_dec_return(kmsan_context_level);
+
+	BUG_ON(level < 0);
+}
+EXPORT_SYMBOL(kmsan_context_exit);
+
+void kmsan_unpoison_pt_regs(struct pt_regs *regs)
+{
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	kmsan_internal_unpoison_shadow(regs, sizeof(*regs), /*checked*/true);
+}
+EXPORT_SYMBOL(kmsan_unpoison_pt_regs);
diff --git a/mm/kmsan/kmsan_hooks.c b/mm/kmsan/kmsan_hooks.c
new file mode 100644
index 0000000000000..8ddfd91b1d115
--- /dev/null
+++ b/mm/kmsan/kmsan_hooks.c
@@ -0,0 +1,416 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KMSAN hooks for kernel subsystems.
+ *
+ * These functions handle creation of KMSAN metadata for memory allocations.
+ *
+ * Copyright (C) 2018-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <asm/cacheflush.h>
+#include <linux/dma-direction.h>
+#include <linux/gfp.h>
+#include <linux/mm.h>
+#include <linux/mm_types.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+#include <linux/usb.h>
+
+#include "../slab.h"
+#include "kmsan.h"
+
+/*
+ * The functions may call back to instrumented code, which, in turn, may call
+ * these hooks again. To avoid re-entrancy, we use __GFP_NO_KMSAN_SHADOW.
+ * Instrumented functions shouldn't be called under
+ * kmsan_enter_runtime()/kmsan_leave_runtime(), because this will lead to
+ * skipping effects of functions like memset() inside instrumented code.
+ */
+
+/* Called from kernel/kthread.c, kernel/fork.c */
+void kmsan_task_create(struct task_struct *task)
+{
+	unsigned long irq_flags;
+
+	if (!task)
+		return;
+	irq_flags = kmsan_enter_runtime();
+	kmsan_internal_task_create(task);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_task_create);
+
+/* Called from kernel/exit.c */
+void kmsan_task_exit(struct task_struct *task)
+{
+	struct kmsan_task_state *state = &task->kmsan;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	state->allow_reporting = false;
+}
+EXPORT_SYMBOL(kmsan_task_exit);
+
+/* Called from mm/slub.c */
+void kmsan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags)
+{
+	unsigned long irq_flags;
+
+	if (unlikely(object == NULL))
+		return;
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	/*
+	 * There's a ctor or this is an RCU cache - do nothing. The memory
+	 * status hasn't changed since last use.
+	 */
+	if (s->ctor || (s->flags & SLAB_TYPESAFE_BY_RCU))
+		return;
+
+	irq_flags = kmsan_enter_runtime();
+	if (flags & __GFP_ZERO)
+		kmsan_internal_unpoison_shadow(object, s->object_size,
+					       KMSAN_POISON_CHECK);
+	else
+		kmsan_internal_poison_shadow(object, s->object_size, flags,
+					     KMSAN_POISON_CHECK);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_slab_alloc);
+
+/* Called from mm/slub.c */
+void kmsan_slab_free(struct kmem_cache *s, void *object)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	/* RCU slabs could be legally used after free within the RCU period */
+	if (unlikely(s->flags & (SLAB_TYPESAFE_BY_RCU | SLAB_POISON)))
+		return;
+	/*
+	 * If there's a constructor, freed memory must remain in the same state
+	 * till the next allocation. We cannot save its state to detect
+	 * use-after-free bugs, instead we just keep it unpoisoned.
+	 */
+	if (s->ctor)
+		return;
+	irq_flags = kmsan_enter_runtime();
+	kmsan_internal_poison_shadow(object, s->object_size,
+				     GFP_KERNEL,
+				     KMSAN_POISON_CHECK | KMSAN_POISON_FREE);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_slab_free);
+
+/* Called from mm/slub.c */
+void kmsan_kmalloc_large(const void *ptr, size_t size, gfp_t flags)
+{
+	unsigned long irq_flags;
+
+	if (unlikely(ptr == NULL))
+		return;
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	irq_flags = kmsan_enter_runtime();
+	if (flags & __GFP_ZERO)
+		kmsan_internal_unpoison_shadow((void *)ptr, size,
+					       /*checked*/true);
+	else
+		kmsan_internal_poison_shadow((void *)ptr, size, flags,
+					     KMSAN_POISON_CHECK);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_kmalloc_large);
+
+/* Called from mm/slub.c */
+void kmsan_kfree_large(const void *ptr)
+{
+	struct page *page;
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	irq_flags = kmsan_enter_runtime();
+	page = virt_to_head_page((void *)ptr);
+	BUG_ON(ptr != page_address(page));
+	kmsan_internal_poison_shadow(
+		(void *)ptr, PAGE_SIZE << compound_order(page), GFP_KERNEL,
+		KMSAN_POISON_CHECK | KMSAN_POISON_FREE);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_kfree_large);
+
+static unsigned long vmalloc_shadow(unsigned long addr)
+{
+	return (unsigned long)kmsan_get_metadata((void *)addr, 1, META_SHADOW);
+}
+
+static unsigned long vmalloc_origin(unsigned long addr)
+{
+	return (unsigned long)kmsan_get_metadata((void *)addr, 1, META_ORIGIN);
+}
+
+/* Called from mm/vmalloc.c */
+void kmsan_vunmap_page_range(unsigned long start, unsigned long end)
+{
+	__vunmap_page_range(vmalloc_shadow(start), vmalloc_shadow(end));
+	__vunmap_page_range(vmalloc_origin(start), vmalloc_origin(end));
+}
+EXPORT_SYMBOL(kmsan_vunmap_page_range);
+
+/* Called from lib/ioremap.c */
+/*
+ * This function creates new shadow/origin pages for the physical pages mapped
+ * into the virtual memory. If those physical pages already had shadow/origin,
+ * those are ignored.
+ */
+void kmsan_ioremap_page_range(unsigned long start, unsigned long end,
+	phys_addr_t phys_addr, pgprot_t prot)
+{
+	unsigned long irq_flags;
+	struct page *shadow, *origin;
+	int i, nr;
+	unsigned long off = 0;
+	gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO | __GFP_NO_KMSAN_SHADOW;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	nr = (end - start) / PAGE_SIZE;
+	irq_flags = kmsan_enter_runtime();
+	for (i = 0; i < nr; i++, off += PAGE_SIZE) {
+		shadow = alloc_pages(gfp_mask, 1);
+		origin = alloc_pages(gfp_mask, 1);
+		__vmap_page_range_noflush(vmalloc_shadow(start + off),
+				vmalloc_shadow(start + off + PAGE_SIZE),
+				prot, &shadow);
+		__vmap_page_range_noflush(vmalloc_origin(start + off),
+				vmalloc_origin(start + off + PAGE_SIZE),
+				prot, &origin);
+	}
+	flush_cache_vmap(vmalloc_shadow(start), vmalloc_shadow(end));
+	flush_cache_vmap(vmalloc_origin(start), vmalloc_origin(end));
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_ioremap_page_range);
+
+void kmsan_iounmap_page_range(unsigned long start, unsigned long end)
+{
+	int i, nr;
+	struct page *shadow, *origin;
+	unsigned long v_shadow, v_origin;
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	nr = (end - start) / PAGE_SIZE;
+	irq_flags = kmsan_enter_runtime();
+	v_shadow = (unsigned long)vmalloc_shadow(start);
+	v_origin = (unsigned long)vmalloc_origin(start);
+	for (i = 0; i < nr; i++, v_shadow += PAGE_SIZE, v_origin += PAGE_SIZE) {
+		shadow = vmalloc_to_page_or_null((void *)v_shadow);
+		origin = vmalloc_to_page_or_null((void *)v_origin);
+		__vunmap_page_range(v_shadow, v_shadow + PAGE_SIZE);
+		__vunmap_page_range(v_origin, v_origin + PAGE_SIZE);
+		if (shadow)
+			__free_pages(shadow, 1);
+		if (origin)
+			__free_pages(origin, 1);
+	}
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_iounmap_page_range);
+
+/* Called from include/linux/uaccess.h */
+void kmsan_copy_to_user(const void *to, const void *from,
+			size_t to_copy, size_t left)
+{
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	/*
+	 * At this point we've copied the memory already. It's hard to check it
+	 * before copying, as the size of actually copied buffer is unknown.
+	 */
+
+	/* copy_to_user() may copy zero bytes. No need to check. */
+	if (!to_copy)
+		return;
+	/* Or maybe copy_to_user() failed to copy anything. */
+	if (to_copy == left)
+		return;
+	if ((u64)to < TASK_SIZE) {
+		/* This is a user memory access, check it. */
+		kmsan_internal_check_memory((void *)from, to_copy - left, to,
+						REASON_COPY_TO_USER);
+		return;
+	}
+	/* Otherwise this is a kernel memory access. This happens when a compat
+	 * syscall passes an argument allocated on the kernel stack to a real
+	 * syscall.
+	 * Don't check anything, just copy the shadow of the copied bytes.
+	 */
+	kmsan_memcpy_metadata((void *)to, (void *)from, to_copy - left);
+}
+EXPORT_SYMBOL(kmsan_copy_to_user);
+
+void kmsan_gup_pgd_range(struct page **pages, int nr)
+{
+	int i;
+	void *page_addr;
+
+	/*
+	 * gup_pgd_range() has just created a number of new pages that KMSAN
+	 * treats as uninitialized. In the case they belong to the userspace
+	 * memory, unpoison the corresponding kernel pages.
+	 */
+	for (i = 0; i < nr; i++) {
+		page_addr = page_address(pages[i]);
+		if (((u64)page_addr < TASK_SIZE) &&
+		    ((u64)page_addr + PAGE_SIZE < TASK_SIZE))
+			kmsan_unpoison_shadow(page_addr, PAGE_SIZE);
+	}
+
+}
+EXPORT_SYMBOL(kmsan_gup_pgd_range);
+
+/* Helper function to check an SKB. */
+void kmsan_check_skb(const struct sk_buff *skb)
+{
+	struct sk_buff *frag_iter;
+	int i;
+	skb_frag_t *f;
+	u32 p_off, p_len, copied;
+	struct page *p;
+	u8 *vaddr;
+
+	if (!skb || !skb->len)
+		return;
+
+	kmsan_internal_check_memory(skb->data, skb_headlen(skb), 0, REASON_ANY);
+	if (skb_is_nonlinear(skb)) {
+		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+			f = &skb_shinfo(skb)->frags[i];
+
+			skb_frag_foreach_page(f, skb_frag_off(f),
+					      skb_frag_size(f),
+					      p, p_off, p_len, copied) {
+
+				vaddr = kmap_atomic(p);
+				kmsan_internal_check_memory(vaddr + p_off,
+						p_len, /*user_addr*/ 0,
+						REASON_ANY);
+				kunmap_atomic(vaddr);
+			}
+		}
+	}
+	skb_walk_frags(skb, frag_iter)
+		kmsan_check_skb(frag_iter);
+}
+EXPORT_SYMBOL(kmsan_check_skb);
+
+/* Helper function to check an URB. */
+void kmsan_handle_urb(const struct urb *urb, bool is_out)
+{
+	if (!urb)
+		return;
+	if (is_out)
+		kmsan_internal_check_memory(urb->transfer_buffer,
+					    urb->transfer_buffer_length,
+					    /*user_addr*/ 0, REASON_SUBMIT_URB);
+	else
+		kmsan_internal_unpoison_shadow(urb->transfer_buffer,
+					       urb->transfer_buffer_length,
+					       /*checked*/false);
+}
+EXPORT_SYMBOL(kmsan_handle_urb);
+
+static void kmsan_handle_dma_page(const void *addr, size_t size,
+				  enum dma_data_direction dir)
+{
+	switch (dir) {
+	case DMA_BIDIRECTIONAL:
+		kmsan_internal_check_memory((void *)addr, size, /*user_addr*/0,
+					    REASON_ANY);
+		kmsan_internal_unpoison_shadow((void *)addr, size,
+					       /*checked*/false);
+		break;
+	case DMA_TO_DEVICE:
+		kmsan_internal_check_memory((void *)addr, size, /*user_addr*/0,
+					    REASON_ANY);
+		break;
+	case DMA_FROM_DEVICE:
+		kmsan_internal_unpoison_shadow((void *)addr, size,
+					       /*checked*/false);
+		break;
+	case DMA_NONE:
+		break;
+	}
+}
+
+/* Helper function to handle DMA data transfers. */
+void kmsan_handle_dma(const void *addr, size_t size,
+		      enum dma_data_direction dir)
+{
+	u64 page_offset, to_go, uaddr = (u64)addr;
+
+	/*
+	 * The kernel may occasionally give us adjacent DMA pages not belonging
+	 * to the same allocation. Process them separately to avoid triggering
+	 * internal KMSAN checks.
+	 */
+	while (size > 0) {
+		page_offset = uaddr % PAGE_SIZE;
+		to_go = min(PAGE_SIZE - page_offset, (u64)size);
+		kmsan_handle_dma_page((void *)uaddr, to_go, dir);
+		uaddr += to_go;
+		size -= to_go;
+	}
+}
+EXPORT_SYMBOL(kmsan_handle_dma);
+
+/* Functions from kmsan-checks.h follow. */
+void kmsan_poison_shadow(const void *address, size_t size, gfp_t flags)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+	irq_flags = kmsan_enter_runtime();
+	/* The users may want to poison/unpoison random memory. */
+	kmsan_internal_poison_shadow((void *)address, size, flags,
+				     KMSAN_POISON_NOCHECK);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_poison_shadow);
+
+void kmsan_unpoison_shadow(const void *address, size_t size)
+{
+	unsigned long irq_flags;
+
+	if (!kmsan_ready || kmsan_in_runtime())
+		return;
+
+	irq_flags = kmsan_enter_runtime();
+	/* The users may want to poison/unpoison random memory. */
+	kmsan_internal_unpoison_shadow((void *)address, size,
+				       KMSAN_POISON_NOCHECK);
+	kmsan_leave_runtime(irq_flags);
+}
+EXPORT_SYMBOL(kmsan_unpoison_shadow);
+
+void kmsan_check_memory(const void *addr, size_t size)
+{
+	kmsan_internal_check_memory((void *)addr, size, /*user_addr*/ 0,
+				    REASON_ANY);
+}
+EXPORT_SYMBOL(kmsan_check_memory);
-- 
2.25.1.696.g5e7596f4ac-goog
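
The per-page loop in kmsan_handle_dma() above can be modelled in
userspace. The following is only a sketch under stated assumptions: it
assumes 4 KiB pages, and for_each_page_chunk(), chunk_fn, chunk_log and
log_chunk() are hypothetical names that are not part of the patch. The
callback stands in for kmsan_handle_dma_page().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical callback standing in for kmsan_handle_dma_page(). */
typedef void (*chunk_fn)(uint64_t addr, uint64_t len, void *ctx);

/*
 * Split [addr, addr + size) at page boundaries and hand every piece to
 * fn, so that no single call spans two pages. Returns the number of
 * pieces visited.
 */
static size_t for_each_page_chunk(uint64_t addr, uint64_t size,
				  chunk_fn fn, void *ctx)
{
	size_t chunks = 0;

	while (size > 0) {
		uint64_t page_offset = addr % SKETCH_PAGE_SIZE;
		uint64_t to_go = SKETCH_PAGE_SIZE - page_offset;

		if (to_go > size)
			to_go = size;
		assert(to_go > 0); /* the loop always makes progress */
		if (fn)
			fn(addr, to_go, ctx);
		addr += to_go;
		size -= to_go;
		chunks++;
	}
	return chunks;
}

/* Hypothetical recorder used to inspect the chunk lengths. */
struct chunk_log {
	uint64_t len[8];
	size_t n;
};

static void log_chunk(uint64_t addr, uint64_t len, void *ctx)
{
	struct chunk_log *log = ctx;

	(void)addr;
	if (log->n < 8)
		log->len[log->n++] = len;
}
```

With this model, a buffer that starts 16 bytes before a page boundary
and is 0x1020 bytes long is visited as three pieces: the 16-byte tail
of the first page, one full page, and 16 bytes of the third page.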




* [PATCH v5 09/38] kmsan: stackdepot: don't allocate KMSAN metadata for stackdepot
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (7 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 08/38] kmsan: add KMSAN hooks for kernel subsystems glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 10/38] kmsan: define READ_ONCE_NOCHECK() glider
                   ` (28 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrey Ryabinin, Jens Axboe, Andy Lutomirski, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov, Christoph Hellwig,
	linux-mm
  Cc: glider, viro, adilger.kernel, akpm, ard.biesheuvel, arnd, hch,
	darrick.wong, davem, dmitry.torokhov, ebiggers, edumazet, ericvh,
	gregkh, harry.wentland, herbert, iii, mingo, jasowang,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

We assume an uninitialized value couldn't come from stackdepot, so
we don't track stackdepot allocations with KMSAN.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-mm@kvack.org
---

v4:
 - set __GFP_NO_KMSAN_SHADOW explicitly for allocations

Change-Id: Ic3ec9b3dff3fff2732d874508a3582fb26ff0b1f
---
 lib/stackdepot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 195ce3dc7c37e..ba584910ad66b 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -297,7 +297,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 		 */
 		alloc_flags &= ~GFP_ZONEMASK;
 		alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
-		alloc_flags |= __GFP_NOWARN;
+		alloc_flags |= (__GFP_NOWARN | __GFP_NO_KMSAN_SHADOW);
 		page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
 		if (page)
 			prealloc = page_address(page);
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 10/38] kmsan: define READ_ONCE_NOCHECK()
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (8 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 09/38] kmsan: stackdepot: don't allocate KMSAN metadata for stackdepot glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 11/38] kmsan: make READ_ONCE_TASK_STACK() return initialized values glider
                   ` (27 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Mark Rutland, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, martin.petersen, schwidefsky, willy, mst, mhocko,
	monstr, pmladek, cai, rdunlap, robin.murphy, sergey.senozhatsky,
	rostedt, tiwai, tytso, tglx, gor, wsa

READ_ONCE_NOCHECK() is already used by KASAN to ignore memory accesses
from e.g. stack unwinders.
Define READ_ONCE_NOCHECK() for KMSAN so that it returns initialized
values. This helps defeat false positives from leftover stack contents.
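
As a rough illustration of "returns initialized values", here is a toy
userspace model. It is not the kernel macro: struct tracked_u32 and the
tracked_*() helpers are hypothetical names, and the model assumes that a
set shadow bit marks the matching value bit as uninitialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: a value together with its shadow (1 bit = uninitialized). */
struct tracked_u32 {
	uint32_t value;
	uint32_t shadow;
};

/* A checked use would report if any shadow bit is still set. */
static bool tracked_would_report(struct tracked_u32 v)
{
	return v.shadow != 0;
}

/* Model of the "init value" step: keep the value, clear the shadow. */
static struct tracked_u32 tracked_init_value(struct tracked_u32 v)
{
	v.shadow = 0;
	return v;
}

/* Model of a no-check read: the result never carries poison. */
static struct tracked_u32 tracked_read_nocheck(const struct tracked_u32 *p)
{
	return tracked_init_value(*p);
}

/* Small self-check showing the intended behaviour of the model. */
static void tracked_demo(void)
{
	struct tracked_u32 slot = { 0xdeadbeefu, 0xffffffffu };
	struct tracked_u32 r = tracked_read_nocheck(&slot);

	assert(tracked_would_report(slot)); /* memory itself stays poisoned */
	assert(!tracked_would_report(r));   /* the value read does not */
	assert(r.value == 0xdeadbeefu);     /* the data is unchanged */
}
```

In this model only the copy handed back to the caller is marked
initialized; the underlying memory keeps its shadow, so a later checked
read of the same slot could still report.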

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---
v3:
 - removed unnecessary #ifdef as requested by Mark Rutland
v4:
 - added an #include as requested by Marco Elver

Change-Id: Ib38369ba038ab3b581d8e45b81036c3304fb79cb
---
 include/linux/compiler.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f504edebd5d71..c6c67729729e3 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -279,6 +279,7 @@ void __write_once_size(volatile void *p, void *res, int size)
  */
 #include <asm/barrier.h>
 #include <linux/kasan-checks.h>
+#include <linux/kmsan-checks.h>
 
 #define __READ_ONCE(x, check)						\
 ({									\
@@ -294,9 +295,9 @@ void __write_once_size(volatile void *p, void *res, int size)
 
 /*
  * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
- * to hide memory access from KASAN.
+ * to hide memory access from KASAN or KMSAN.
  */
-#define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
+#define READ_ONCE_NOCHECK(x) KMSAN_INIT_VALUE(__READ_ONCE(x, 0))
 
 static __no_kasan_or_inline
 unsigned long read_word_at_a_time(const void *addr)
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 11/38] kmsan: make READ_ONCE_TASK_STACK() return initialized values
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (9 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 10/38] kmsan: define READ_ONCE_NOCHECK() glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 12/38] kmsan: x86: sync metadata pages on page fault glider
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

To avoid false positives, assume that reading from the task stack
always produces initialized values.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
Acked-by: Marco Elver <elver@google.com>

---
v4:
 - added an #include as requested by Marco Elver

Change-Id: Ie73e5a41fdc8195699928e65f5cbe0d3d3c9e2fa
---
 arch/x86/include/asm/unwind.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 499578f7e6d7b..82c3bceb9999c 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched.h>
 #include <linux/ftrace.h>
+#include <linux/kmsan-checks.h>
 #include <asm/ptrace.h>
 #include <asm/stacktrace.h>
 
@@ -100,9 +101,10 @@ void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
 #endif
 
 /*
- * This disables KASAN checking when reading a value from another task's stack,
- * since the other task could be running on another CPU and could have poisoned
- * the stack in the meantime.
+ * This disables KASAN/KMSAN checking when reading a value from another task's
+ * stack, since the other task could be running on another CPU and could have
+ * poisoned the stack in the meantime. Frame pointers are uninitialized by
+ * default, so for KMSAN we mark the return value initialized unconditionally.
  */
 #define READ_ONCE_TASK_STACK(task, x)			\
 ({							\
@@ -111,7 +113,7 @@ void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
 		val = READ_ONCE(x);			\
 	else						\
 		val = READ_ONCE_NOCHECK(x);		\
-	val;						\
+	KMSAN_INIT_VALUE(val);				\
 })
 
 static inline bool task_on_another_cpu(struct task_struct *task)
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 12/38] kmsan: x86: sync metadata pages on page fault
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (10 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 11/38] kmsan: make READ_ONCE_TASK_STACK() return initialized values glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 13/38] kmsan: add tests for KMSAN glider
                   ` (25 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Ingo Molnar, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, jasowang, axboe, m.szyprowski,
	mark.rutland, martin.petersen, schwidefsky, willy, mst, mhocko,
	monstr, pmladek, cai, rdunlap, robin.murphy, sergey.senozhatsky,
	rostedt, tiwai, tytso, tglx, gor, wsa

KMSAN assumes that shadow and origin pages for every allocated page are
accessible. For pages in the vmalloc region, those metadata pages reside
in [VMALLOC_END, VMALLOC_META_END), so we must sync a larger memory
region. Module metadata in [MODULES_SHADOW_START, MODULES_ORIGIN_END) is
synced as well.
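
The widened address check can be sketched in userspace as follows. The
SK_* constants are made-up values chosen only to keep the example
readable; they are not the real x86 layout, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define SK_VMALLOC_START	0x1000000ull
#define SK_VMALLOC_END		0x2000000ull /* end of vmalloc data */
#define SK_VMALLOC_META_END	0x4000000ull /* end of vmalloc metadata */
#define SK_MODULES_SHADOW_START	0x8000000ull
#define SK_MODULES_ORIGIN_END	0x9000000ull

static bool in_range(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr < end;
}

/* Check used without KMSAN: only the vmalloc area itself qualifies. */
static bool is_vmalloc_fault_plain(uint64_t addr)
{
	return in_range(addr, SK_VMALLOC_START, SK_VMALLOC_END);
}

/* Check used with KMSAN: metadata for vmalloc and modules qualifies too. */
static bool is_vmalloc_fault_kmsan(uint64_t addr)
{
	return in_range(addr, SK_VMALLOC_START, SK_VMALLOC_META_END) ||
	       in_range(addr, SK_MODULES_SHADOW_START, SK_MODULES_ORIGIN_END);
}

/* Small self-check showing where the two predicates differ. */
static void layout_demo(void)
{
	uint64_t meta_addr = 0x3000000ull; /* inside vmalloc metadata */

	assert(!is_vmalloc_fault_plain(meta_addr));
	assert(is_vmalloc_fault_kmsan(meta_addr));
}
```

The two predicates agree on the vmalloc area proper and differ only for
addresses inside the metadata ranges, which is the case this patch is
meant to cover.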

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

Change-Id: I0d54855489870ef1180b37fe2120b601da464bf7
---
 arch/x86/mm/fault.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a51df516b87bf..d22e373fa2124 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -331,11 +331,21 @@ static void dump_pagetable(unsigned long address)
 
 void vmalloc_sync_mappings(void)
 {
+#ifndef CONFIG_KMSAN
 	/*
 	 * 64-bit mappings might allocate new p4d/pud pages
 	 * that need to be propagated to all tasks' PGDs.
 	 */
 	sync_global_pgds(VMALLOC_START & PGDIR_MASK, VMALLOC_END);
+#else
+	/*
+	 * For KMSAN, make sure metadata pages for vmalloc area and modules are
+	 * also synced.
+	 */
+	sync_global_pgds(VMALLOC_START & PGDIR_MASK, VMALLOC_META_END);
+	sync_global_pgds(MODULES_SHADOW_START & PGDIR_MASK,
+		MODULES_ORIGIN_END);
+#endif
 }
 
 void vmalloc_sync_unmappings(void)
@@ -360,7 +370,17 @@ static noinline int vmalloc_fault(unsigned long address)
 	pte_t *pte;
 
 	/* Make sure we are in vmalloc area: */
+#ifdef CONFIG_KMSAN
+	/*
+	 * For KMSAN, also handle faults on metadata pages for the vmalloc area
+	 * and modules.
+	 */
+	if (!(address >= VMALLOC_START && address < VMALLOC_META_END) &&
+		!(address >= MODULES_SHADOW_START &&
+		  address < MODULES_ORIGIN_END))
+#else
 	if (!(address >= VMALLOC_START && address < VMALLOC_END))
+#endif
 		return -1;
 
 	/*
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 13/38] kmsan: add tests for KMSAN
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (11 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 12/38] kmsan: x86: sync metadata pages on page fault glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 14/38] crypto: kmsan: disable accelerated configs under KMSAN glider
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

The initial commit adds several tests that trigger KMSAN warnings in
simple cases.
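To use, build the kernel with CONFIG_TEST_KMSAN and run
`insmod test_kmsan.ko`. The module's init function returns -EAGAIN, so
the module does not stay loaded after the tests have run.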

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---
v2:
 - added printk_test()
v4:
 - test_kmsan: don't report -Wuninitialized warnings in the test
 - test_kmsan.c: addressed comments by Andrey Konovalov

Change-Id: I287e86ae83a82b770f2baa46e5bbdce1dfa65195
---
 lib/Makefile     |   2 +
 lib/test_kmsan.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 231 insertions(+)
 create mode 100644 lib/test_kmsan.c

diff --git a/lib/Makefile b/lib/Makefile
index ab68a86743607..d8058c5c05826 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -68,6 +68,8 @@ CFLAGS_test_kasan.o += $(call cc-disable-warning, vla)
 obj-$(CONFIG_TEST_UBSAN) += test_ubsan.o
 CFLAGS_test_ubsan.o += $(call cc-disable-warning, vla)
 UBSAN_SANITIZE_test_ubsan.o := y
+obj-$(CONFIG_TEST_KMSAN) += test_kmsan.o
+CFLAGS_test_kmsan.o += $(call cc-disable-warning, uninitialized)
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LIST_SORT) += test_list_sort.o
 obj-$(CONFIG_TEST_MIN_HEAP) += test_min_heap.o
diff --git a/lib/test_kmsan.c b/lib/test_kmsan.c
new file mode 100644
index 0000000000000..f1780ed0cd315
--- /dev/null
+++ b/lib/test_kmsan.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Module for testing KMSAN.
+ *
+ * Copyright (C) 2017-2019 Google LLC
+ * Author: Alexander Potapenko <glider@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+/*
+ * Tests below use noinline and volatile to work around compiler optimizations
+ * that may mask KMSAN bugs.
+ */
+#define pr_fmt(fmt) "kmsan test: %s : " fmt, __func__
+
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+#include <linux/slab.h>
+#include <linux/kmsan-checks.h>
+
+#define CHECK(x)					\
+	do {						\
+		if (x)					\
+			pr_info(#x " is true\n");	\
+		else					\
+			pr_info(#x " is false\n");	\
+	} while (0)
+
+int signed_sum3(int a, int b, int c)
+{
+	return a + b + c;
+}
+
+noinline void uninit_kmalloc_test(void)
+{
+	int *ptr;
+
+	pr_info("-----------------------------\n");
+	pr_info("uninitialized kmalloc test (UMR report)\n");
+	ptr = kmalloc(sizeof(int), GFP_KERNEL);
+	pr_info("kmalloc returned %p\n", ptr);
+	CHECK(*ptr);
+}
+noinline void init_kmalloc_test(void)
+{
+	int *ptr;
+
+	pr_info("-----------------------------\n");
+	pr_info("initialized kmalloc test (no reports)\n");
+	ptr = kmalloc(sizeof(int), GFP_KERNEL);
+	memset(ptr, 0, sizeof(int));
+	pr_info("kmalloc returned %p\n", ptr);
+	CHECK(*ptr);
+}
+
+noinline void init_kzalloc_test(void)
+{
+	int *ptr;
+
+	pr_info("-----------------------------\n");
+	pr_info("initialized kzalloc test (no reports)\n");
+	ptr = kzalloc(sizeof(int), GFP_KERNEL);
+	pr_info("kzalloc returned %p\n", ptr);
+	CHECK(*ptr);
+}
+
+noinline void uninit_multiple_args_test(void)
+{
+	volatile int a;
+	volatile char b = 3, c;
+
+	pr_info("-----------------------------\n");
+	pr_info("uninitialized local passed to fn (UMR report)\n");
+	CHECK(signed_sum3(a, b, c));
+}
+
+noinline void uninit_stack_var_test(void)
+{
+	int cond;
+
+	pr_info("-----------------------------\n");
+	pr_info("uninitialized stack variable (UMR report)\n");
+	CHECK(cond);
+}
+
+noinline void init_stack_var_test(void)
+{
+	volatile int cond = 1;
+
+	pr_info("-----------------------------\n");
+	pr_info("initialized stack variable (no reports)\n");
+	CHECK(cond);
+}
+
+noinline void two_param_fn_2(int arg1, int arg2)
+{
+	CHECK(arg1);
+	CHECK(arg2);
+}
+
+noinline void one_param_fn(int arg)
+{
+	two_param_fn_2(arg, arg);
+	CHECK(arg);
+}
+
+noinline void two_param_fn(int arg1, int arg2)
+{
+	int init = 0;
+
+	one_param_fn(init);
+	CHECK(arg1);
+	CHECK(arg2);
+}
+
+noinline void params_test(void)
+{
+	volatile int uninit, init = 1;
+
+	pr_info("-----------------------------\n");
+	pr_info("uninit passed through a function parameter (UMR report)\n");
+	two_param_fn(uninit, init);
+}
+
+noinline void do_uninit_local_array(char *array, int start, int stop)
+{
+	int i;
+	volatile char uninit;
+
+	for (i = start; i < stop; i++)
+		array[i] = uninit;
+}
+
+noinline void uninit_kmsan_check_memory_test(void)
+{
+	volatile char local_array[8];
+
+	pr_info("-----------------------------\n");
+	pr_info("kmsan_check_memory() called on uninit local (UMR report)\n");
+	do_uninit_local_array((char *)local_array, 5, 7);
+
+	kmsan_check_memory((char *)local_array, 8);
+}
+
+noinline void init_kmsan_vmap_vunmap_test(void)
+{
+	const int npages = 2;
+	struct page *pages[npages];
+	void *vbuf;
+	int i;
+
+	pr_info("-----------------------------\n");
+	pr_info("pages initialized via vmap (no reports)\n");
+
+	for (i = 0; i < npages; i++)
+		pages[i] = alloc_page(GFP_KERNEL);
+	vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
+	memset(vbuf, 0xfe, npages * PAGE_SIZE);
+	for (i = 0; i < npages; i++)
+		kmsan_check_memory(page_address(pages[i]), PAGE_SIZE);
+
+	if (vbuf)
+		vunmap(vbuf);
+	for (i = 0; i < npages; i++)
+		if (pages[i])
+			__free_page(pages[i]);
+}
+
+noinline void init_vmalloc_test(void)
+{
+	char *buf;
+	int npages = 8, i;
+
+	pr_info("-----------------------------\n");
+	pr_info("buffer initialized via vmalloc (no reports)\n");
+	buf = vmalloc(PAGE_SIZE * npages);
+	buf[0] = 1;
+	memset(buf, 0xfe, PAGE_SIZE * npages);
+	CHECK(buf[0]);
+	for (i = 0; i < npages; i++)
+		kmsan_check_memory(&buf[PAGE_SIZE * i], PAGE_SIZE);
+	vfree(buf);
+}
+
+noinline void uaf_test(void)
+{
+	volatile int *var;
+
+	pr_info("-----------------------------\n");
+	pr_info("use-after-free in kmalloc-ed buffer (UMR report)\n");
+	var = kmalloc(80, GFP_KERNEL);
+	var[3] = 0xfeedface;
+	kfree((int *)var);
+	CHECK(var[3]);
+}
+
+noinline void printk_test(void)
+{
+	volatile int uninit;
+
+	pr_info("-----------------------------\n");
+	pr_info("uninit local passed to pr_info() (UMR report)\n");
+	pr_info("%px contains %d\n", &uninit, uninit);
+}
+
+static noinline int __init kmsan_tests_init(void)
+{
+	uninit_kmalloc_test();
+	init_kmalloc_test();
+	init_kzalloc_test();
+	uninit_multiple_args_test();
+	uninit_stack_var_test();
+	init_stack_var_test();
+	params_test();
+	uninit_kmsan_check_memory_test();
+	init_kmsan_vmap_vunmap_test();
+	init_vmalloc_test();
+	uaf_test();
+	printk_test();
+	return -EAGAIN;
+}
+
+module_init(kmsan_tests_init);
+MODULE_LICENSE("GPL");
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 14/38] crypto: kmsan: disable accelerated configs under KMSAN
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (12 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 13/38] kmsan: add tests for KMSAN glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 15/38] kmsan: x86: disable UNWINDER_ORC " glider
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Eric Biggers, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, dmitry.torokhov,
	edumazet, ericvh, gregkh, harry.wentland, iii, mingo, jasowang,
	axboe, m.szyprowski, mark.rutland, martin.petersen, schwidefsky,
	willy, mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

KMSAN is unable to understand when initialized values come from assembly.
Disable accelerated configs in KMSAN builds to prevent false positive
reports.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

v4:
 - shorten comments as requested by Marco Elver

v5:
 - move the 'depends' directives together, added missing configs as
   requested by Eric Biggers

Change-Id: Iddc71a2a27360e036d719c0940ebf15553cf8de8
---
 crypto/Kconfig | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index c24a47406f8f5..5035e8b2b033f 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -268,6 +268,7 @@ config CRYPTO_CURVE25519
 config CRYPTO_CURVE25519_X86
 	tristate "x86_64 accelerated Curve25519 scalar multiplication library"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_LIB_CURVE25519_GENERIC
 	select CRYPTO_ARCH_HAVE_LIB_CURVE25519
 
@@ -317,11 +318,13 @@ config CRYPTO_AEGIS128_SIMD
 	bool "Support SIMD acceleration for AEGIS-128"
 	depends on CRYPTO_AEGIS128 && ((ARM || ARM64) && KERNEL_MODE_NEON)
 	depends on !ARM || CC_IS_CLANG || GCC_VERSION >= 40800
+	depends on !KMSAN # avoid false positives from assembly
 	default y
 
 config CRYPTO_AEGIS128_AESNI_SSE2
 	tristate "AEGIS-128 AEAD algorithm (x86_64 AESNI+SSE2 implementation)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_AEAD
 	select CRYPTO_SIMD
 	help
@@ -458,6 +461,7 @@ config CRYPTO_NHPOLY1305
 config CRYPTO_NHPOLY1305_SSE2
 	tristate "NHPoly1305 hash function (x86_64 SSE2 implementation)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_NHPOLY1305
 	help
 	  SSE2 optimized implementation of the hash function used by the
@@ -466,6 +470,7 @@ config CRYPTO_NHPOLY1305_SSE2
 config CRYPTO_NHPOLY1305_AVX2
 	tristate "NHPoly1305 hash function (x86_64 AVX2 implementation)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_NHPOLY1305
 	help
 	  AVX2 optimized implementation of the hash function used by the
@@ -579,6 +584,7 @@ config CRYPTO_CRC32C
 config CRYPTO_CRC32C_INTEL
 	tristate "CRC32c INTEL hardware acceleration"
 	depends on X86
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_HASH
 	help
 	  In Intel processor with SSE4.2 supported, the processor will
@@ -619,6 +625,7 @@ config CRYPTO_CRC32
 config CRYPTO_CRC32_PCLMUL
 	tristate "CRC32 PCLMULQDQ hardware acceleration"
 	depends on X86
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_HASH
 	select CRC32
 	help
@@ -684,6 +691,7 @@ config CRYPTO_BLAKE2S
 config CRYPTO_BLAKE2S_X86
 	tristate "BLAKE2s digest algorithm (x86 accelerated version)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_LIB_BLAKE2S_GENERIC
 	select CRYPTO_ARCH_HAVE_LIB_BLAKE2S
 
@@ -698,6 +706,7 @@ config CRYPTO_CRCT10DIF
 config CRYPTO_CRCT10DIF_PCLMUL
 	tristate "CRCT10DIF PCLMULQDQ hardware acceleration"
 	depends on X86 && 64BIT && CRC_T10DIF
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_HASH
 	help
 	  For x86_64 processors with SSE4.2 and PCLMULQDQ supported,
@@ -745,6 +754,7 @@ config CRYPTO_POLY1305
 config CRYPTO_POLY1305_X86_64
 	tristate "Poly1305 authenticator algorithm (x86_64/SSE2/AVX2)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_LIB_POLY1305_GENERIC
 	select CRYPTO_ARCH_HAVE_LIB_POLY1305
 	help
@@ -870,6 +880,7 @@ config CRYPTO_SHA1
 config CRYPTO_SHA1_SSSE3
 	tristate "SHA1 digest algorithm (SSSE3/AVX/AVX2/SHA-NI)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SHA1
 	select CRYPTO_HASH
 	help
@@ -881,6 +892,7 @@ config CRYPTO_SHA1_SSSE3
 config CRYPTO_SHA256_SSSE3
 	tristate "SHA256 digest algorithm (SSSE3/AVX/AVX2/SHA-NI)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SHA256
 	select CRYPTO_HASH
 	help
@@ -893,6 +905,7 @@ config CRYPTO_SHA256_SSSE3
 config CRYPTO_SHA512_SSSE3
 	tristate "SHA512 digest algorithm (SSSE3/AVX/AVX2)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SHA512
 	select CRYPTO_HASH
 	help
@@ -1064,6 +1077,7 @@ config CRYPTO_WP512
 config CRYPTO_GHASH_CLMUL_NI_INTEL
 	tristate "GHASH hash function (CLMUL-NI accelerated)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_CRYPTD
 	help
 	  This is the x86_64 CLMUL-NI accelerated implementation of
@@ -1114,6 +1128,7 @@ config CRYPTO_AES_TI
 config CRYPTO_AES_NI_INTEL
 	tristate "AES cipher algorithms (AES-NI)"
 	depends on X86
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_AEAD
 	select CRYPTO_LIB_AES
 	select CRYPTO_ALGAPI
@@ -1237,6 +1252,7 @@ config CRYPTO_BLOWFISH_COMMON
 config CRYPTO_BLOWFISH_X86_64
 	tristate "Blowfish cipher algorithm (x86_64)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_BLOWFISH_COMMON
 	help
@@ -1268,6 +1284,7 @@ config CRYPTO_CAMELLIA_X86_64
 	tristate "Camellia cipher algorithm (x86_64)"
 	depends on X86 && 64BIT
 	depends on CRYPTO
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_GLUE_HELPER_X86
 	help
@@ -1285,6 +1302,7 @@ config CRYPTO_CAMELLIA_AESNI_AVX_X86_64
 	tristate "Camellia cipher algorithm (x86_64/AES-NI/AVX)"
 	depends on X86 && 64BIT
 	depends on CRYPTO
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_CAMELLIA_X86_64
 	select CRYPTO_GLUE_HELPER_X86
@@ -1305,6 +1323,7 @@ config CRYPTO_CAMELLIA_AESNI_AVX2_X86_64
 	tristate "Camellia cipher algorithm (x86_64/AES-NI/AVX2)"
 	depends on X86 && 64BIT
 	depends on CRYPTO
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_CAMELLIA_AESNI_AVX_X86_64
 	help
 	  Camellia cipher algorithm module (x86_64/AES-NI/AVX2).
@@ -1351,6 +1370,7 @@ config CRYPTO_CAST5
 config CRYPTO_CAST5_AVX_X86_64
 	tristate "CAST5 (CAST-128) cipher algorithm (x86_64/AVX)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_CAST5
 	select CRYPTO_CAST_COMMON
@@ -1373,6 +1393,7 @@ config CRYPTO_CAST6
 config CRYPTO_CAST6_AVX_X86_64
 	tristate "CAST6 (CAST-256) cipher algorithm (x86_64/AVX)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_CAST6
 	select CRYPTO_CAST_COMMON
@@ -1406,6 +1427,7 @@ config CRYPTO_DES_SPARC64
 config CRYPTO_DES3_EDE_X86_64
 	tristate "Triple DES EDE cipher algorithm (x86-64)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_DES
 	help
@@ -1473,6 +1495,7 @@ config CRYPTO_CHACHA20
 config CRYPTO_CHACHA20_X86_64
 	tristate "ChaCha stream cipher algorithms (x86_64/SSSE3/AVX2/AVX-512VL)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_CHACHA_GENERIC
 	select CRYPTO_ARCH_HAVE_LIB_CHACHA
@@ -1516,6 +1539,7 @@ config CRYPTO_SERPENT
 config CRYPTO_SERPENT_SSE2_X86_64
 	tristate "Serpent cipher algorithm (x86_64/SSE2)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_GLUE_HELPER_X86
 	select CRYPTO_SERPENT
@@ -1535,6 +1559,7 @@ config CRYPTO_SERPENT_SSE2_X86_64
 config CRYPTO_SERPENT_SSE2_586
 	tristate "Serpent cipher algorithm (i586/SSE2)"
 	depends on X86 && !64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_GLUE_HELPER_X86
 	select CRYPTO_SERPENT
@@ -1554,6 +1579,7 @@ config CRYPTO_SERPENT_SSE2_586
 config CRYPTO_SERPENT_AVX_X86_64
 	tristate "Serpent cipher algorithm (x86_64/AVX)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_GLUE_HELPER_X86
 	select CRYPTO_SERPENT
@@ -1574,6 +1600,7 @@ config CRYPTO_SERPENT_AVX_X86_64
 config CRYPTO_SERPENT_AVX2_X86_64
 	tristate "Serpent cipher algorithm (x86_64/AVX2)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SERPENT_AVX_X86_64
 	help
 	  Serpent cipher algorithm, by Anderson, Biham & Knudsen.
@@ -1669,6 +1696,7 @@ config CRYPTO_TWOFISH_586
 config CRYPTO_TWOFISH_X86_64
 	tristate "Twofish cipher algorithm (x86_64)"
 	depends on (X86 || UML_X86) && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_ALGAPI
 	select CRYPTO_TWOFISH_COMMON
 	help
@@ -1685,6 +1713,7 @@ config CRYPTO_TWOFISH_X86_64
 config CRYPTO_TWOFISH_X86_64_3WAY
 	tristate "Twofish cipher algorithm (x86_64, 3-way parallel)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_TWOFISH_COMMON
 	select CRYPTO_TWOFISH_X86_64
@@ -1706,6 +1735,7 @@ config CRYPTO_TWOFISH_X86_64_3WAY
 config CRYPTO_TWOFISH_AVX_X86_64
 	tristate "Twofish cipher algorithm (x86_64/AVX)"
 	depends on X86 && 64BIT
+	depends on !KMSAN # avoid false positives from assembly
 	select CRYPTO_SKCIPHER
 	select CRYPTO_GLUE_HELPER_X86
 	select CRYPTO_SIMD
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 15/38] kmsan: x86: disable UNWINDER_ORC under KMSAN
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (13 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 14/38] crypto: kmsan: disable accelerated configs under KMSAN glider
@ 2020-03-25 16:12 ` " glider
  2020-03-25 16:12 ` [PATCH v5 16/38] kmsan: x86/asm: softirq: add KMSAN IRQ entry hooks glider
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Qian Cai, Christoph Hellwig, Herbert Xu, Harry Wentland,
	Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, darrick.wong, davem, dmitry.torokhov,
	ebiggers, edumazet, ericvh, gregkh, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

KMSAN doesn't currently support UNWINDER_ORC, causing the kernel to
freeze at boot time.
See http://github.com/google/kmsan/issues/48.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---
This patch is part of "kmsan: Kconfig changes to disable options
incompatible with KMSAN", which was split into smaller pieces.

Change-Id: I9cb6ebbaeb9a38e9e1d015c68ab77d40420a7ad0
---
 arch/x86/Kconfig.debug | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 2e74690b028a5..ad71eb2a416ec 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -276,6 +276,9 @@ choice
 config UNWINDER_ORC
 	bool "ORC unwinder"
 	depends on X86_64
+	# KMSAN doesn't support UNWINDER_ORC yet,
+	# see https://github.com/google/kmsan/issues/48.
+	depends on !KMSAN
 	select STACK_VALIDATION
 	---help---
 	  This option enables the ORC (Oops Rewind Capability) unwinder for
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 16/38] kmsan: x86/asm: softirq: add KMSAN IRQ entry hooks
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (14 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 15/38] kmsan: x86: disable UNWINDER_ORC " glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 17/38] kmsan: disable KMSAN instrumentation for certain kernel parts glider
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Christoph Hellwig, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

Add assembly helpers to entry_64.S that invoke hooks from kmsan_entry.c and
notify KMSAN about interrupts.
Also call these hooks from kernel/softirq.c.
This is needed to switch between the several KMSAN contexts that hold
function parameter metadata.
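
As an illustration of the context switching described above, here is a
minimal userspace sketch. It is not the KMSAN runtime: every toy_* name, the
fixed nesting limit, and the choice to start each nested context with
all-zero ("initialized") parameter metadata are assumptions made only for
this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_MAX_NESTING 4
#define TOY_PARAM_SLOTS 8

/* Per-context storage for function parameter metadata (shadow). */
struct toy_ctx {
	unsigned long param_shadow[TOY_PARAM_SLOTS];
};

static struct toy_ctx toy_task_ctx;                 /* used outside interrupts */
static struct toy_ctx toy_irq_ctx[TOY_MAX_NESTING]; /* one per nesting level */
static int toy_depth;

/* Return the context that instrumented code should currently use. */
static struct toy_ctx *toy_current_ctx(void)
{
	return toy_depth ? &toy_irq_ctx[toy_depth - 1] : &toy_task_ctx;
}

/* Called on interrupt entry: switch to a fresh nested context. */
static void toy_context_enter(void)
{
	assert(toy_depth < TOY_MAX_NESTING);
	toy_depth++;
	/* Assumption: a new context starts with "initialized" metadata. */
	memset(toy_current_ctx(), 0, sizeof(struct toy_ctx));
}

/* Called on interrupt exit: switch back to the interrupted context. */
static void toy_context_exit(void)
{
	assert(toy_depth > 0);
	toy_depth--;
}
```

In this model, whatever the interrupt handler writes into its parameter
metadata stays in the nested context, so the interrupted code finds its own
metadata unchanged after toy_context_exit().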

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-mm@kvack.org
---

v4:
 - moved softirq changes to this patch

Change-Id: I3037d51672fe69d09e588b27adb2d9fdc6ad3a7d
---
 arch/x86/entry/entry_64.S | 16 ++++++++++++++++
 kernel/softirq.c          |  5 +++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0e9504fabe526..03f5a32b0af4d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -35,6 +35,7 @@
 #include <asm/asm.h>
 #include <asm/smap.h>
 #include <asm/pgtable_types.h>
+#include <asm/kmsan.h>
 #include <asm/export.h>
 #include <asm/frame.h>
 #include <asm/nospec-branch.h>
@@ -575,6 +576,7 @@ SYM_CODE_START(interrupt_entry)
 
 1:
 	ENTER_IRQ_STACK old_rsp=%rdi save_ret=1
+	KMSAN_INTERRUPT_ENTER
 	/* We entered an interrupt context - irqs are off: */
 	TRACE_IRQS_OFF
 
@@ -604,12 +606,14 @@ SYM_CODE_START_LOCAL(common_interrupt)
 	addq	$-0x80, (%rsp)			/* Adjust vector to [-256, -1] range */
 	call	interrupt_entry
 	UNWIND_HINT_REGS indirect=1
+	KMSAN_UNPOISON_PT_REGS
 	call	do_IRQ	/* rdi points to pt_regs */
 	/* 0(%rsp): old RSP */
 ret_from_intr:
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF
 
+	KMSAN_INTERRUPT_EXIT
 	LEAVE_IRQ_STACK
 
 	testb	$3, CS(%rsp)
@@ -801,6 +805,7 @@ SYM_CODE_START(\sym)
 .Lcommon_\sym:
 	call	interrupt_entry
 	UNWIND_HINT_REGS indirect=1
+	KMSAN_UNPOISON_PT_REGS
 	call	\do_sym	/* rdi points to pt_regs */
 	jmp	ret_from_intr
 SYM_CODE_END(\sym)
@@ -908,15 +913,18 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 
 	.if \shift_ist != -1
 	subq	$\ist_offset, CPU_TSS_IST(\shift_ist)
+	KMSAN_IST_ENTER(\shift_ist)
 	.endif
 
 	.if \read_cr2
 	movq	%r12, %rdx			/* Move CR2 into 3rd argument */
 	.endif
 
+	KMSAN_UNPOISON_PT_REGS
 	call	\do_sym
 
 	.if \shift_ist != -1
+	KMSAN_IST_EXIT(\shift_ist)
 	addq	$\ist_offset, CPU_TSS_IST(\shift_ist)
 	.endif
 
@@ -1079,7 +1087,9 @@ SYM_FUNC_START(do_softirq_own_stack)
 	pushq	%rbp
 	mov	%rsp, %rbp
 	ENTER_IRQ_STACK regs=0 old_rsp=%r11
+	KMSAN_SOFTIRQ_ENTER
 	call	__do_softirq
+	KMSAN_SOFTIRQ_EXIT
 	LEAVE_IRQ_STACK regs=0
 	leaveq
 	ret
@@ -1466,9 +1476,12 @@ SYM_CODE_START(nmi)
 	 * done with the NMI stack.
 	 */
 
+	KMSAN_NMI_ENTER
 	movq	%rsp, %rdi
 	movq	$-1, %rsi
+	KMSAN_UNPOISON_PT_REGS
 	call	do_nmi
+	KMSAN_NMI_EXIT
 
 	/*
 	 * Return back to user mode.  We must *not* do the normal exit
@@ -1678,10 +1691,13 @@ end_repeat_nmi:
 	call	paranoid_entry
 	UNWIND_HINT_REGS
 
+	KMSAN_NMI_ENTER
 	/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
 	movq	%rsp, %rdi
 	movq	$-1, %rsi
+	KMSAN_UNPOISON_PT_REGS
 	call	do_nmi
+	KMSAN_NMI_EXIT
 
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 0427a86743a46..98c5f4062cbfe 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -11,6 +11,7 @@
 
 #include <linux/export.h>
 #include <linux/kernel_stat.h>
+#include <linux/kmsan.h>
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/mm.h>
@@ -370,7 +371,9 @@ static inline void invoke_softirq(void)
 		 * it is the irq stack, because it should be near empty
 		 * at this stage.
 		 */
+		kmsan_context_enter();
 		__do_softirq();
+		kmsan_context_exit();
 #else
 		/*
 		 * Otherwise, irq_exit() is called on the task stack that can
@@ -600,7 +603,9 @@ static void run_ksoftirqd(unsigned int cpu)
 		 * We can safely run softirq on inline stack, as we are not deep
 		 * in the task stack here.
 		 */
+		kmsan_context_enter();
 		__do_softirq();
+		kmsan_context_exit();
 		local_irq_enable();
 		cond_resched();
 		return;
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 17/38] kmsan: disable KMSAN instrumentation for certain kernel parts
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (15 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 16/38] kmsan: x86/asm: softirq: add KMSAN IRQ entry hooks glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 18/38] kmsan: mm: call KMSAN hooks from SLUB code glider
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Ard Biesheuvel, Thomas Gleixner, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto, arnd, hch,
	hch, darrick.wong, davem, dmitry.torokhov, ebiggers, edumazet,
	ericvh, gregkh, harry.wentland, herbert, iii, mingo, jasowang,
	axboe, m.szyprowski, mark.rutland, martin.petersen, schwidefsky,
	willy, mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, gor, wsa

Instrumenting some files with KMSAN results in a kernel that fails to link
or boot, or that crashes at runtime, for various reasons (e.g. infinite
recursion caused by instrumentation hooks calling instrumented code again).

Disable KMSAN in the following places:
 - arch/x86/boot and arch/x86/realmode/rm, as KMSAN doesn't work for i386;
 - arch/x86/entry/vdso, which isn't linked with KMSAN runtime;
 - three files in arch/x86/kernel - boot problems;
 - arch/x86/mm/cpu_entry_area.c - recursion;
 - EFI stub - build failures;
 - kcov, stackdepot, lockdep - recursion.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

v4:
 - fix lockdep support by not instrumenting lockdep.c
 - unified comments with KCSAN

Change-Id: I90961eabf2dcb9ae992aed259088953bad5e4d6d
---
 arch/x86/boot/Makefile                | 1 +
 arch/x86/boot/compressed/Makefile     | 2 ++
 arch/x86/entry/vdso/Makefile          | 3 +++
 arch/x86/kernel/Makefile              | 4 ++++
 arch/x86/kernel/cpu/Makefile          | 1 +
 arch/x86/mm/Makefile                  | 3 +++
 arch/x86/realmode/rm/Makefile         | 1 +
 drivers/firmware/efi/libstub/Makefile | 1 +
 kernel/Makefile                       | 1 +
 kernel/locking/Makefile               | 4 ++++
 lib/Makefile                          | 1 +
 11 files changed, 22 insertions(+)

diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index d7aa1c3a6b25a..2ca8b9b478f3a 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -12,6 +12,7 @@
 # Sanitizer runtimes are unavailable and cannot be linked for early boot code.
 KASAN_SANITIZE			:= n
 KCSAN_SANITIZE			:= n
+KMSAN_SANITIZE			:= n
 OBJECT_FILES_NON_STANDARD	:= y
 
 # Kernel does not boot with kcov instrumentation here.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7619742f91c9a..2af62067a90ec 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -20,6 +20,8 @@
 # Sanitizer runtimes are unavailable and cannot be linked for early boot code.
 KASAN_SANITIZE			:= n
 KCSAN_SANITIZE			:= n
+# KMSAN doesn't work for i386
+KMSAN_SANITIZE			:= n
 OBJECT_FILES_NON_STANDARD	:= y
 
 # Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index ecf6128c95516..e2b1b9be89ab7 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -13,6 +13,9 @@ KBUILD_CFLAGS += $(DISABLE_LTO)
 
 # Sanitizer runtimes are unavailable and cannot be linked here.
 KASAN_SANITIZE			:= n
+KMSAN_SANITIZE_vclock_gettime.o := n
+KMSAN_SANITIZE_vgetcpu.o	:= n
+
 UBSAN_SANITIZE			:= n
 KCSAN_SANITIZE			:= n
 OBJECT_FILES_NON_STANDARD	:= y
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 1ee83df407e3b..a3b7b0452c817 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -32,6 +32,10 @@ KASAN_SANITIZE_paravirt.o				:= n
 # by several compilation units. To be safe, disable all instrumentation.
 KCSAN_SANITIZE := n
 
+# Work around reboot loop.
+KMSAN_SANITIZE_head$(BITS).o				:= n
+KMSAN_SANITIZE_nmi.o					:= n
+
 OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o	:= y
 OBJECT_FILES_NON_STANDARD_test_nx.o			:= y
 OBJECT_FILES_NON_STANDARD_paravirt_patch.o		:= y
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index dba6a83bc3493..0e299ba013868 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -12,6 +12,7 @@ endif
 # If these files are instrumented, boot hangs during the first second.
 KCOV_INSTRUMENT_common.o := n
 KCOV_INSTRUMENT_perf_event.o := n
+KMSAN_SANITIZE_common.o := n
 
 # As above, instrumenting secondary CPU boot code causes boot hangs.
 KCSAN_SANITIZE_common.o := n
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index f7fd0e868c9c8..f11848633cf5b 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -11,6 +11,9 @@ KASAN_SANITIZE_mem_encrypt_identity.o	:= n
 # reference __initdata sections.
 KCSAN_SANITIZE := n
 
+# Avoid recursion by not calling KMSAN hooks for CEA code.
+KMSAN_SANITIZE_cpu_entry_area.o := n
+
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_mem_encrypt.o		= -pg
 CFLAGS_REMOVE_mem_encrypt_identity.o	= -pg
diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile
index 83f1b6a56449f..f614009d3e4e2 100644
--- a/arch/x86/realmode/rm/Makefile
+++ b/arch/x86/realmode/rm/Makefile
@@ -10,6 +10,7 @@
 # Sanitizer runtimes are unavailable and cannot be linked here.
 KASAN_SANITIZE			:= n
 KCSAN_SANITIZE			:= n
+KMSAN_SANITIZE			:= n
 OBJECT_FILES_NON_STANDARD	:= y
 
 # Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index dd31237fba2e9..2cf047a0d2e06 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -36,6 +36,7 @@ GCOV_PROFILE			:= n
 # Sanitizer runtimes are unavailable and cannot be linked here.
 KASAN_SANITIZE			:= n
 KCSAN_SANITIZE			:= n
+KMSAN_SANITIZE			:= n
 UBSAN_SANITIZE			:= n
 OBJECT_FILES_NON_STANDARD	:= y
 
diff --git a/kernel/Makefile b/kernel/Makefile
index 6ac453daf500e..e9093daf41056 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -35,6 +35,7 @@ KCOV_INSTRUMENT_stacktrace.o := n
 KCOV_INSTRUMENT_kcov.o := n
 KASAN_SANITIZE_kcov.o := n
 KCSAN_SANITIZE_kcov.o := n
+KMSAN_SANITIZE_kcov.o := n
 CFLAGS_kcov.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
 
 # cond_syscall is currently not LTO compatible
diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
index 6d11cfb9b41f2..1dd1f7d81e691 100644
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -3,6 +3,10 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT		:= n
 
+# Instrumenting lockdep.c with KMSAN may cause deadlocks because of
+# recursive KMSAN runtime calls.
+KMSAN_SANITIZE_lockdep.o := n
+
 obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
 
 # Avoid recursion lockdep -> KCSAN -> ... -> lockdep.
diff --git a/lib/Makefile b/lib/Makefile
index d8058c5c05826..6ec959b62a55f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -234,6 +234,7 @@ obj-$(CONFIG_IRQ_POLL) += irq_poll.o
 CFLAGS_stackdepot.o += -fno-builtin
 obj-$(CONFIG_STACKDEPOT) += stackdepot.o
 KASAN_SANITIZE_stackdepot.o := n
+KMSAN_SANITIZE_stackdepot.o := n
 KCOV_INSTRUMENT_stackdepot.o := n
 
 libfdt_files = fdt.o fdt_ro.o fdt_wip.o fdt_rw.o fdt_sw.o fdt_strerror.o \
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 18/38] kmsan: mm: call KMSAN hooks from SLUB code
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (16 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 17/38] kmsan: disable KMSAN instrumentation for certain kernel parts glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 19/38] kmsan: mm: maintain KMSAN metadata for page operations glider
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

In order to report uninitialized memory coming from heap allocations,
KMSAN has to poison them unless they're created with __GFP_ZERO.

Conveniently, the KMSAN hooks are needed in the same places where
init_on_alloc/init_on_free initialization is performed.
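
The poisoning rule above can be sketched with a toy userspace allocator. This
is only an illustration under stated assumptions: the toy_* names, the
one-shadow-byte-per-data-byte layout and the TOY_ZERO flag (standing in for
__GFP_ZERO) are all hypothetical and do not reflect the actual KMSAN runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_HEAP_SIZE 256
#define TOY_ZERO 0x1u /* stands in for __GFP_ZERO */

static unsigned char toy_heap[TOY_HEAP_SIZE];
static unsigned char toy_shadow[TOY_HEAP_SIZE]; /* nonzero = uninitialized */
static size_t toy_next;

/* Bump-allocate size bytes; poison the shadow unless zeroing was requested. */
static void *toy_alloc(size_t size, unsigned int flags)
{
	void *p;

	if (size > TOY_HEAP_SIZE - toy_next)
		return NULL;
	p = &toy_heap[toy_next];
	if (flags & TOY_ZERO) {
		memset(p, 0, size);                        /* data is zeroed */
		memset(&toy_shadow[toy_next], 0, size);    /* and initialized */
	} else {
		memset(&toy_shadow[toy_next], 0xff, size); /* poison */
	}
	toy_next += size;
	return p;
}

/* Store one byte at p + off and mark it as initialized. */
static void toy_store(void *p, size_t off, unsigned char val)
{
	size_t idx = (size_t)((unsigned char *)p - toy_heap) + off;

	assert(idx < TOY_HEAP_SIZE);
	toy_heap[idx] = val;
	toy_shadow[idx] = 0;
}

/* Report whether the byte at p + off is still uninitialized. */
static bool toy_is_poisoned(const void *p, size_t off)
{
	size_t idx = (size_t)((const unsigned char *)p - toy_heap) + off;

	assert(idx < TOY_HEAP_SIZE);
	return toy_shadow[idx] != 0;
}
```

In this model, a plain toy_alloc(8, 0) returns memory whose bytes all read as
poisoned until toy_store() writes them, while toy_alloc(8, TOY_ZERO) returns
memory that is never reported.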

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---
v3:
 - reverted unrelated whitespace changes

Change-Id: I51103b7981d3aabed747d0c85cbdc85568665871
---
 mm/slub.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 332d4b459a907..67c7f76bee412 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -21,6 +21,8 @@
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/kasan.h>
+#include <linux/kmsan.h>
+#include <linux/kmsan-checks.h> /* KMSAN_INIT_VALUE */
 #include <linux/cpu.h>
 #include <linux/cpuset.h>
 #include <linux/mempolicy.h>
@@ -283,17 +285,27 @@ static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 	prefetch(object + s->offset);
 }
 
+/*
+ * When running under KMSAN, get_freepointer_safe() may return an uninitialized
+ * pointer value in the case the current thread loses the race for the next
+ * memory chunk in the freelist. In that case this_cpu_cmpxchg_double() in
+ * slab_alloc_node() will fail, so the uninitialized value won't be used, but
+ * KMSAN will still check all arguments of cmpxchg because of imperfect
+ * handling of inline assembly.
+ * To work around this problem, use KMSAN_INIT_VALUE() to force initialize the
+ * return value of get_freepointer_safe().
+ */
 static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
 {
 	unsigned long freepointer_addr;
 	void *p;
 
 	if (!debug_pagealloc_enabled_static())
-		return get_freepointer(s, object);
+		return KMSAN_INIT_VALUE(get_freepointer(s, object));
 
 	freepointer_addr = (unsigned long)object + s->offset;
 	probe_kernel_read(&p, (void **)freepointer_addr, sizeof(p));
-	return freelist_ptr(s, p, freepointer_addr);
+	return KMSAN_INIT_VALUE(freelist_ptr(s, p, freepointer_addr));
 }
 
 static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
@@ -1411,6 +1423,7 @@ static inline void *kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags)
 	ptr = kasan_kmalloc_large(ptr, size, flags);
 	/* As ptr might get tagged, call kmemleak hook after KASAN. */
 	kmemleak_alloc(ptr, size, 1, flags);
+	kmsan_kmalloc_large(ptr, size, flags);
 	return ptr;
 }
 
@@ -1418,6 +1431,7 @@ static __always_inline void kfree_hook(void *x)
 {
 	kmemleak_free(x);
 	kasan_kfree_large(x, _RET_IP_);
+	kmsan_kfree_large(x);
 }
 
 static __always_inline bool slab_free_hook(struct kmem_cache *s, void *x)
@@ -1461,6 +1475,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 	do {
 		object = next;
 		next = get_freepointer(s, object);
+		kmsan_slab_free(s, object);
 
 		if (slab_want_init_on_free(s)) {
 			/*
@@ -2784,6 +2799,7 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 	if (unlikely(slab_want_init_on_alloc(gfpflags, s)) && object)
 		memset(object, 0, s->object_size);
 
+	kmsan_slab_alloc(s, object, gfpflags);
 	slab_post_alloc_hook(s, gfpflags, 1, &object);
 
 	return object;
@@ -3167,7 +3183,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			  void **p)
 {
 	struct kmem_cache_cpu *c;
-	int i;
+	int i, j;
 
 	/* memcg and kmem_cache debug support */
 	s = slab_pre_alloc_hook(s, flags);
@@ -3217,11 +3233,11 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 
 	/* Clear memory outside IRQ disabled fastpath loop */
 	if (unlikely(slab_want_init_on_alloc(flags, s))) {
-		int j;
-
 		for (j = 0; j < i; j++)
 			memset(p[j], 0, s->object_size);
 	}
+	for (j = 0; j < i; j++)
+		kmsan_slab_alloc(s, p[j], flags);
 
 	/* memcg and kmem_cache debug support */
 	slab_post_alloc_hook(s, flags, size, p);
@@ -3829,6 +3845,7 @@ static int __init setup_slub_min_objects(char *str)
 
 __setup("slub_min_objects=", setup_slub_min_objects);
 
+__no_sanitize_memory
 void *__kmalloc(size_t size, gfp_t flags)
 {
 	struct kmem_cache *s;
@@ -5725,6 +5742,7 @@ static char *create_unique_id(struct kmem_cache *s)
 	p += sprintf(p, "%07u", s->size);
 
 	BUG_ON(p > name + ID_STR_LENGTH - 1);
+	kmsan_unpoison_shadow(name, p - name);
 	return name;
 }
 
@@ -5874,6 +5892,7 @@ static int sysfs_slab_alias(struct kmem_cache *s, const char *name)
 	al->name = name;
 	al->next = alias_list;
 	alias_list = al;
+	kmsan_unpoison_shadow(al, sizeof(struct saved_alias));
 	return 0;
 }
 
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 19/38] kmsan: mm: maintain KMSAN metadata for page operations
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (17 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 18/38] kmsan: mm: call KMSAN hooks from SLUB code glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 20/38] kmsan: handle memory sent to/from USB glider
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman, Eric Dumazet, Wolfram Sang,
	Petr Mladek, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	ericvh, harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor

Insert KMSAN hooks that make the necessary bookkeeping changes:
 - allocate/split/deallocate metadata pages in
   alloc_pages()/split_page()/free_page();
 - clear page shadow and origins in clear_page(), copy_user_highpage();
 - copy page metadata in copy_highpage(), wp_page_copy();
 - handle vmap()/vunmap()/ioremap()/iounmap().
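
The bookkeeping listed above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the KMSAN
runtime: struct toy_page and every toy_* helper are hypothetical names,
and one shadow byte per data byte is a simplification chosen for
readability.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/*
 * Hypothetical toy page: data plus one shadow byte per data byte.
 * A nonzero shadow byte means "uninitialized" (poisoned).
 */
struct toy_page {
	unsigned char data[TOY_PAGE_SIZE];
	unsigned char shadow[TOY_PAGE_SIZE];
};

/* Models allocation: in this toy, a fresh page starts fully poisoned. */
static void toy_alloc_page(struct toy_page *p)
{
	memset(p->shadow, 0xff, sizeof(p->shadow));
}

/* Models clear_page(): zero the data and unpoison the shadow. */
static void toy_clear_page(struct toy_page *p)
{
	memset(p->data, 0, sizeof(p->data));
	memset(p->shadow, 0, sizeof(p->shadow));
}

/* Models copy_highpage(): the metadata travels with the data. */
static void toy_copy_page(struct toy_page *to, const struct toy_page *from)
{
	memcpy(to->data, from->data, sizeof(to->data));
	memcpy(to->shadow, from->shadow, sizeof(to->shadow));
}

/* Returns 1 if any byte in [off, off + len) is still poisoned. */
static int toy_is_poisoned(const struct toy_page *p, size_t off, size_t len)
{
	size_t i;

	for (i = off; i < off + len; i++)
		if (p->shadow[i])
			return 1;
	return 0;
}
```

In this model, copying a poisoned page yields a poisoned destination,
while clearing a page always leaves it unpoisoned, which mirrors the
split between the "copy" and "clear" hooks in the list.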

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

This patch was previously called "kmsan: call KMSAN hooks where needed"

v2:
 - dropped call to kmsan_handle_vprintk, updated comment in printk.c

v3:
 - put KMSAN_INIT_VALUE on a separate line in vprintk_store()
 - dropped call to kmsan_handle_i2c_transfer()
 - minor style fixes

v4:
 - split mm-unrelated bits to other patches as requested by Andrey
   Konovalov
 - dropped changes to mm/compaction.c
 - use kmsan_unpoison_shadow in page_64.h and highmem.h

Change-Id: I1250a928d9263bf71fdaa067a070bdee686ef47b
---
 arch/x86/include/asm/page_64.h | 13 +++++++++++++
 arch/x86/mm/ioremap.c          |  3 +++
 include/linux/highmem.h        |  3 +++
 lib/ioremap.c                  |  5 +++++
 mm/gup.c                       |  3 +++
 mm/memory.c                    |  2 ++
 mm/page_alloc.c                | 17 +++++++++++++++++
 mm/vmalloc.c                   | 24 ++++++++++++++++++++++--
 8 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 939b1cff4a7b7..045856c38f494 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -44,14 +44,27 @@ void clear_page_orig(void *page);
 void clear_page_rep(void *page);
 void clear_page_erms(void *page);
 
+/* This is an assembly header, avoid including too much of kmsan.h */
+#ifdef CONFIG_KMSAN
+void kmsan_unpoison_shadow(const void *addr, size_t size);
+#endif
+__no_sanitize_memory
 static inline void clear_page(void *page)
 {
+#ifdef CONFIG_KMSAN
+	/* alternative_call_2() changes |page|. */
+	void *page_copy = page;
+#endif
 	alternative_call_2(clear_page_orig,
 			   clear_page_rep, X86_FEATURE_REP_GOOD,
 			   clear_page_erms, X86_FEATURE_ERMS,
 			   "=D" (page),
 			   "0" (page)
 			   : "cc", "memory", "rax", "rcx");
+#ifdef CONFIG_KMSAN
+	/* Clear KMSAN shadow for the pages that have it. */
+	kmsan_unpoison_shadow(page_copy, PAGE_SIZE);
+#endif
 }
 
 void copy_page(void *to, void *from);
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 935a91e1fd774..80399defe90aa 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -7,6 +7,7 @@
  * (C) Copyright 1995 1996 Linus Torvalds
  */
 
+#include <linux/kmsan.h>
 #include <linux/memblock.h>
 #include <linux/init.h>
 #include <linux/io.h>
@@ -469,6 +470,8 @@ void iounmap(volatile void __iomem *addr)
 		return;
 	}
 
+	kmsan_iounmap_page_range((unsigned long)addr,
+		(unsigned long)addr + get_vm_area_size(p));
 	memtype_free(p->phys_addr, p->phys_addr + get_vm_area_size(p));
 
 	/* Finally remove it */
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index ea5cdbd8c2c32..9f6efa26e9b5c 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -5,6 +5,7 @@
 #include <linux/fs.h>
 #include <linux/kernel.h>
 #include <linux/bug.h>
+#include <linux/kmsan.h>
 #include <linux/mm.h>
 #include <linux/uaccess.h>
 #include <linux/hardirq.h>
@@ -255,6 +256,7 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 	vfrom = kmap_atomic(from);
 	vto = kmap_atomic(to);
 	copy_user_page(vto, vfrom, vaddr, to);
+	kmsan_unpoison_shadow(page_address(to), PAGE_SIZE);
 	kunmap_atomic(vto);
 	kunmap_atomic(vfrom);
 }
@@ -270,6 +272,7 @@ static inline void copy_highpage(struct page *to, struct page *from)
 	vfrom = kmap_atomic(from);
 	vto = kmap_atomic(to);
 	copy_page(vto, vfrom);
+	kmsan_copy_page_meta(to, from);
 	kunmap_atomic(vto);
 	kunmap_atomic(vfrom);
 }
diff --git a/lib/ioremap.c b/lib/ioremap.c
index 3f0e18543de84..14b0325b6fa9e 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -6,6 +6,7 @@
  *
  * (C) Copyright 1995 1996 Linus Torvalds
  */
+#include <linux/kmsan.h>
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
@@ -214,6 +215,8 @@ int ioremap_page_range(unsigned long addr,
 	unsigned long start;
 	unsigned long next;
 	int err;
+	unsigned long old_addr = addr;
+	phys_addr_t old_phys_addr = phys_addr;
 
 	might_sleep();
 	BUG_ON(addr >= end);
@@ -228,6 +231,8 @@ int ioremap_page_range(unsigned long addr,
 	} while (pgd++, phys_addr += (next - addr), addr = next, addr != end);
 
 	flush_cache_vmap(start, end);
+	if (!err)
+		kmsan_ioremap_page_range(old_addr, end, old_phys_addr, prot);
 
 	return err;
 }
diff --git a/mm/gup.c b/mm/gup.c
index a212305695209..a2546215f165f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -4,6 +4,7 @@
 #include <linux/err.h>
 #include <linux/spinlock.h>
 
+#include <linux/kmsan.h>
 #include <linux/mm.h>
 #include <linux/memremap.h>
 #include <linux/pagemap.h>
@@ -2710,6 +2711,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	    gup_fast_permitted(start, end)) {
 		local_irq_save(flags);
 		gup_pgd_range(start, end, gup_flags, pages, &nr_pinned);
+		kmsan_gup_pgd_range(pages, nr_pinned);
 		local_irq_restore(flags);
 	}
 
@@ -2765,6 +2767,7 @@ static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
 	    gup_fast_permitted(start, end)) {
 		local_irq_disable();
 		gup_pgd_range(addr, end, gup_flags, pages, &nr_pinned);
+		kmsan_gup_pgd_range(pages, nr_pinned);
 		local_irq_enable();
 		ret = nr_pinned;
 	}
diff --git a/mm/memory.c b/mm/memory.c
index 8d7f387dd0c77..aa9e266449e26 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -51,6 +51,7 @@
 #include <linux/highmem.h>
 #include <linux/pagemap.h>
 #include <linux/memremap.h>
+#include <linux/kmsan.h>
 #include <linux/ksm.h>
 #include <linux/rmap.h>
 #include <linux/export.h>
@@ -2676,6 +2677,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 				put_page(old_page);
 			return 0;
 		}
+		kmsan_copy_page_meta(new_page, old_page);
 	}
 
 	if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, &memcg, false))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca1453204e667..869dc64226296 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -26,6 +26,8 @@
 #include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/kasan.h>
+#include <linux/kmsan.h>
+#include <linux/kmsan-checks.h>
 #include <linux/module.h>
 #include <linux/suspend.h>
 #include <linux/pagevec.h>
@@ -1178,6 +1180,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
 	trace_mm_page_free(page, order);
+	kmsan_free_page(page, order);
 
 	/*
 	 * Check tail pages before head page information is cleared to
@@ -3199,6 +3202,7 @@ void split_page(struct page *page, unsigned int order)
 	VM_BUG_ON_PAGE(PageCompound(page), page);
 	VM_BUG_ON_PAGE(!page_count(page), page);
 
+	kmsan_split_page(page, order);
 	for (i = 1; i < (1 << order); i++)
 		set_page_refcounted(page + i);
 	split_page_owner(page, order);
@@ -3349,6 +3353,14 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 /*
  * Allocate a page from the given zone. Use pcplists for order-0 allocations.
  */
+
+/*
+ * Do not instrument rmqueue() with KMSAN. This function may call
+ * __msan_poison_alloca() through a call to set_pfnblock_flags_mask().
+ * If __msan_poison_alloca() attempts to allocate pages for the stack depot, it
+ * may call rmqueue() again, which will result in a deadlock.
+ */
+__no_sanitize_memory
 static inline
 struct page *rmqueue(struct zone *preferred_zone,
 			struct zone *zone, unsigned int order,
@@ -4862,6 +4874,11 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
 
 	trace_mm_page_alloc(page, order, alloc_mask, ac.migratetype);
 
+	if (page)
+		if (kmsan_alloc_page(page, order, gfp_mask)) {
+			__free_pages(page, order);
+			page = NULL;
+		}
 	return page;
 }
 EXPORT_SYMBOL(__alloc_pages_nodemask);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6b8eeb0ecee51..c5577e616c33b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -29,6 +29,7 @@
 #include <linux/rcupdate.h>
 #include <linux/pfn.h>
 #include <linux/kmemleak.h>
+#include <linux/kmsan.h>
 #include <linux/atomic.h>
 #include <linux/compiler.h>
 #include <linux/llist.h>
@@ -127,7 +128,8 @@ static void vunmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end)
 	} while (p4d++, addr = next, addr != end);
 }
 
-static void vunmap_page_range(unsigned long addr, unsigned long end)
+/* Exported for KMSAN, visible in mm/kmsan/kmsan.h only. */
+void __vunmap_page_range(unsigned long addr, unsigned long end)
 {
 	pgd_t *pgd;
 	unsigned long next;
@@ -141,6 +143,13 @@ static void vunmap_page_range(unsigned long addr, unsigned long end)
 		vunmap_p4d_range(pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
+EXPORT_SYMBOL(__vunmap_page_range);
+
+static void vunmap_page_range(unsigned long addr, unsigned long end)
+{
+	kmsan_vunmap_page_range(addr, end);
+	__vunmap_page_range(addr, end);
+}
 
 static int vmap_pte_range(pmd_t *pmd, unsigned long addr,
 		unsigned long end, pgprot_t prot, struct page **pages, int *nr)
@@ -224,8 +233,11 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr,
  * will have pfns corresponding to the "pages" array.
  *
  * Ie. pte at addr+N*PAGE_SIZE shall point to pfn corresponding to pages[N]
+ *
+ * This function is exported for use in KMSAN, but is only declared in KMSAN
+ * headers.
  */
-static int vmap_page_range_noflush(unsigned long start, unsigned long end,
+int __vmap_page_range_noflush(unsigned long start, unsigned long end,
 				   pgprot_t prot, struct page **pages)
 {
 	pgd_t *pgd;
@@ -245,6 +257,14 @@ static int vmap_page_range_noflush(unsigned long start, unsigned long end,
 
 	return nr;
 }
+EXPORT_SYMBOL(__vmap_page_range_noflush);
+
+static int vmap_page_range_noflush(unsigned long start, unsigned long end,
+				   pgprot_t prot, struct page **pages)
+{
+	kmsan_vmap_page_range_noflush(start, end, prot, pages);
+	return __vmap_page_range_noflush(start, end, prot, pages);
+}
 
 static int vmap_page_range(unsigned long start, unsigned long end,
 			   pgprot_t prot, struct page **pages)
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 20/38] kmsan: handle memory sent to/from USB
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (18 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 19/38] kmsan: mm: maintain KMSAN metadata for page operations glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 21/38] kmsan: handle task creation and exiting glider
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman, Eric Dumazet, Wolfram Sang,
	Petr Mladek, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	ericvh, harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor

Depending on the value of is_out, kmsan_handle_urb() either marks the
data copied to the kernel from a USB device as initialized, or checks
that the data sent to the device is initialized.
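
The direction-dependent behaviour can be sketched in userspace as
follows. This is a toy model, not the body of kmsan_handle_urb():
struct toy_urb, toy_handle_urb() and the per-byte shadow array are
hypothetical, and the sketch counts uninitialized bytes where the real
tool would presumably emit a report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 64

/* Hypothetical transfer buffer; nonzero shadow byte = uninitialized. */
struct toy_urb {
	unsigned char buf[TOY_BUF_SIZE];
	unsigned char shadow[TOY_BUF_SIZE];
	size_t len;
};

/*
 * Toy analogue of the direction-dependent handling:
 *  - is_out != 0: the data goes to the device, so count the bytes that
 *    are still uninitialized (each one would be worth a report);
 *  - is_out == 0: the data comes from the device, so mark the buffer
 *    as initialized and return 0.
 */
static size_t toy_handle_urb(struct toy_urb *urb, int is_out)
{
	size_t i, bad = 0;

	if (!is_out) {
		memset(urb->shadow, 0, urb->len);
		return 0;
	}
	for (i = 0; i < urb->len; i++)
		if (urb->shadow[i])
			bad++;
	return bad;
}
```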

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

This patch was previously called "kmsan: call KMSAN hooks where needed"

v4:
 - split this patch away

Change-Id: Idd0f8ce858975112285706ffb7286f570bd3007b
---
 drivers/usb/core/urb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c
index da923ec176122..4a0b0ac0f52f9 100644
--- a/drivers/usb/core/urb.c
+++ b/drivers/usb/core/urb.c
@@ -8,6 +8,7 @@
 #include <linux/bitops.h>
 #include <linux/slab.h>
 #include <linux/log2.h>
+#include <linux/kmsan-checks.h>
 #include <linux/usb.h>
 #include <linux/wait.h>
 #include <linux/usb/hcd.h>
@@ -402,6 +403,7 @@ int usb_submit_urb(struct urb *urb, gfp_t mem_flags)
 			URB_SETUP_MAP_SINGLE | URB_SETUP_MAP_LOCAL |
 			URB_DMA_SG_COMBINED);
 	urb->transfer_flags |= (is_out ? URB_DIR_OUT : URB_DIR_IN);
+	kmsan_handle_urb(urb, is_out);
 
 	if (xfertype != USB_ENDPOINT_XFER_CONTROL &&
 			dev->state < USB_STATE_CONFIGURED)
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 21/38] kmsan: handle task creation and exiting
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (19 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 20/38] kmsan: handle memory sent to/from USB glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 22/38] kmsan: net: check the value of skb before sending it to the network glider
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman, Eric Dumazet, Wolfram Sang,
	Petr Mladek, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	ericvh, harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor

Tell KMSAN that a new task is created, so the tool creates a backing
metadata structure for that task. Also notify KMSAN when a task exits.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

This patch was previously called "kmsan: call KMSAN hooks where needed"

v4:
 - split this patch away

Change-Id: I7a6a83419b0e038f8993175461255f462a430205
---
 kernel/exit.c    | 2 ++
 kernel/fork.c    | 2 ++
 kernel/kthread.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/kernel/exit.c b/kernel/exit.c
index e93c6197a827c..377f9edbb28fa 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -60,6 +60,7 @@
 #include <linux/writeback.h>
 #include <linux/shm.h>
 #include <linux/kcov.h>
+#include <linux/kmsan.h>
 #include <linux/random.h>
 #include <linux/rcuwait.h>
 #include <linux/compat.h>
@@ -709,6 +710,7 @@ void __noreturn do_exit(long code)
 
 	profile_task_exit(tsk);
 	kcov_task_exit(tsk);
+	kmsan_task_exit(tsk);
 
 	WARN_ON(blk_needs_flush_plug(tsk));
 
diff --git a/kernel/fork.c b/kernel/fork.c
index d48e063a3abe7..21f7f411880d3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -37,6 +37,7 @@
 #include <linux/fdtable.h>
 #include <linux/iocontext.h>
 #include <linux/key.h>
+#include <linux/kmsan.h>
 #include <linux/binfmts.h>
 #include <linux/mman.h>
 #include <linux/mmu_notifier.h>
@@ -943,6 +944,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	account_kernel_stack(tsk, 1);
 
 	kcov_task_init(tsk);
+	kmsan_task_create(tsk);
 
 #ifdef CONFIG_FAULT_INJECTION
 	tsk->fail_nth = 0;
diff --git a/kernel/kthread.c b/kernel/kthread.c
index b262f47046ca4..33ca743ca8b54 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -17,6 +17,7 @@
 #include <linux/unistd.h>
 #include <linux/file.h>
 #include <linux/export.h>
+#include <linux/kmsan.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <linux/freezer.h>
@@ -350,6 +351,7 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
 		set_cpus_allowed_ptr(task, cpu_all_mask);
 	}
 	kfree(create);
+	kmsan_task_create(task);
 	return task;
 }
 
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 22/38] kmsan: net: check the value of skb before sending it to the network
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (20 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 21/38] kmsan: handle task creation and exiting glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 23/38] kmsan: printk: treat the result of vscnprintf() as initialized glider
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman, Eric Dumazet, Wolfram Sang,
	Petr Mladek, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	ericvh, harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor

Calling kmsan_check_skb() lets KMSAN check that the bytes to be
transferred over the network are initialized.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

This patch was previously called "kmsan: call KMSAN hooks where needed"

v4:
 - split this patch away

Change-Id: Iff48409dc50341d59e355ce3ec11d4722f0799e2
---
 net/sched/sch_generic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 2efd5b61acef1..4b2cc309bb1e3 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/kmsan-checks.h>
 #include <linux/sched.h>
 #include <linux/string.h>
 #include <linux/errno.h>
@@ -654,6 +655,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 	} else {
 		WRITE_ONCE(qdisc->empty, true);
 	}
+	kmsan_check_skb(skb);
 
 	return skb;
 }
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 23/38] kmsan: printk: treat the result of vscnprintf() as initialized
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (21 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 22/38] kmsan: net: check the value of skb before sending it to the network glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 24/38] kmsan: disable instrumentation of certain functions glider
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman, Eric Dumazet, Wolfram Sang,
	Petr Mladek, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	ericvh, harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor

In vprintk_store(), vscnprintf() may return an uninitialized text_len
value if any of its arguments are uninitialized. In that case KMSAN will
report one or more errors in vscnprintf() itself, but it doesn't make
much sense to track that value further, as it may trigger more errors in
printk. Instead, we explicitly mark it as initialized.
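
To illustrate why dropping the metadata stops the cascade, here is a
userspace sketch. It is not how KMSAN_INIT_VALUE() is implemented:
struct toy_int, toy_add() and toy_init_value() are hypothetical, and
the "OR the shadows" rule is a coarse approximation of shadow
propagation.

```c
#include <assert.h>

/*
 * Hypothetical tracked integer: the value plus a shadow mask in which
 * set bits mean "uninitialized".
 */
struct toy_int {
	int value;
	unsigned int shadow;
};

/* Approximate propagation: the sum is tainted if either operand is. */
static struct toy_int toy_add(struct toy_int a, struct toy_int b)
{
	struct toy_int r = { a.value + b.value, a.shadow | b.shadow };

	return r;
}

/* Analogue of marking a value initialized: keep it, drop the shadow. */
static struct toy_int toy_init_value(struct toy_int v)
{
	v.shadow = 0;
	return v;
}
```

Without the toy_init_value() step, every value derived from the
tainted length stays tainted, which corresponds to the follow-up
errors in printk that the patch wants to avoid.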

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
Acked-by: Petr Mladek <pmladek@suse.com>

---

This patch was split from "kmsan: call KMSAN hooks where needed", as
requested by Andrey Konovalov. Petr Mladek has previously acked the
printk part of that patch, hence the Acked-by above.

v4:
 - split this patch away

Change-Id: Ibed60b0bdd25f8ae91acee5800b5328e78e0735a
---
 kernel/printk/printk.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ad46062345452..4cadba3c1e68d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1913,6 +1913,12 @@ int vprintk_store(int facility, int level,
 	 * prefix which might be passed-in as a parameter.
 	 */
 	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	/*
+	 * If any of vscnprintf() arguments is uninitialized, KMSAN will report
+	 * one or more errors and also probably mark text_len as uninitialized.
+	 * Initialize |text_len| to prevent the errors from spreading further.
+	 */
+	text_len = KMSAN_INIT_VALUE(text_len);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 24/38] kmsan: disable instrumentation of certain functions
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (22 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 23/38] kmsan: printk: treat the result of vscnprintf() as initialized glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 25/38] kmsan: unpoison |tlb| in arch_tlb_gather_mmu() glider
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, gor,
	wsa

Some functions are called from handwritten assembly, and therefore don't
have their arguments' metadata fully set up by the instrumentation code.
Mark them with __no_sanitize_memory to keep false positives from
spreading further.

Certain functions perform task switching, so that the value of |current|
changes as they proceed. Because the KMSAN state pointer is only read
once at the beginning of the function, touching it after |current| has
changed may be dangerous.
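
The stale-pointer hazard can be shown with a userspace sketch. All
names here (toy_task, toy_state, toy_current, toy_switch_to()) are
hypothetical stand-ins; the sketch only assumes, as described above,
that an instrumented function fetches the per-task state pointer once
on entry.

```c
#include <assert.h>

/* Hypothetical per-task sanitizer state. */
struct toy_state {
	int id;
};

struct toy_task {
	struct toy_state state;
};

static struct toy_task toy_tasks[2] = { { { 0 } }, { { 1 } } };
static struct toy_task *toy_current = &toy_tasks[0];

/* Models a context switch: |current| changes under the caller. */
static void toy_switch_to(struct toy_task *next)
{
	toy_current = next;
}

/*
 * Models an instrumented function: the state pointer is fetched once
 * in the prologue. Returns 1 if that pointer no longer belongs to the
 * current task by the time the function finishes.
 */
static int toy_instrumented_switch(struct toy_task *next)
{
	struct toy_state *st = &toy_current->state; /* read once */

	toy_switch_to(next);
	return st != &toy_current->state;
}
```

Any metadata written through the cached pointer after the switch would
land in the previous task's state, which is why such functions are
left uninstrumented.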

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---
v3:
 - removed TODOs from comments

v4:
 - updated the comments, dropped __no_sanitize_memory from idle_cpu(),
   sched_init(), profile_tick()
 - split away the uprobes part as requested by Andrey Konovalov

Change-Id: I684d23dac5a22eb0a4cea71993cb934302b17cea
---
 arch/x86/entry/common.c                |  2 ++
 arch/x86/include/asm/irq_regs.h        |  2 ++
 arch/x86/include/asm/syscall_wrapper.h |  2 ++
 arch/x86/kernel/apic/apic.c            |  3 +++
 arch/x86/kernel/dumpstack_64.c         |  5 +++++
 arch/x86/kernel/process_64.c           |  5 +++++
 arch/x86/kernel/traps.c                | 13 +++++++++++--
 kernel/sched/core.c                    | 22 ++++++++++++++++++++++
 8 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index ec167d8c41cbd..5c3d0f3a14c37 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -280,6 +280,8 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_X86_64
+/* Tell KMSAN to not instrument this function and to initialize |regs|. */
+__no_sanitize_memory
 __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
 {
 	struct thread_info *ti;
diff --git a/arch/x86/include/asm/irq_regs.h b/arch/x86/include/asm/irq_regs.h
index 187ce59aea28e..a6fc1641e2861 100644
--- a/arch/x86/include/asm/irq_regs.h
+++ b/arch/x86/include/asm/irq_regs.h
@@ -14,6 +14,8 @@
 
 DECLARE_PER_CPU(struct pt_regs *, irq_regs);
 
+/* Tell KMSAN to return an initialized struct pt_regs. */
+__no_sanitize_memory
 static inline struct pt_regs *get_irq_regs(void)
 {
 	return __this_cpu_read(irq_regs);
diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index e2389ce9bf58a..098b1a8d6bc41 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -196,6 +196,8 @@ struct pt_regs;
 	ALLOW_ERROR_INJECTION(__x64_sys##name, ERRNO);			\
 	static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__));	\
 	static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+	/* Tell KMSAN to initialize |regs|. */				\
+	__no_sanitize_memory						\
 	asmlinkage long __x64_sys##name(const struct pt_regs *regs)	\
 	{								\
 		return __se_sys##name(SC_X86_64_REGS_TO_ARGS(x,__VA_ARGS__));\
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 5f973fed3c9ff..1f0250f14e462 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1127,6 +1127,9 @@ static void local_apic_timer_interrupt(void)
  * [ if a single-CPU system runs an SMP kernel then we call the local
  *   interrupt as well. Thus we cannot inline the local irq ... ]
  */
+
+/* Tell KMSAN to initialize |regs|. */
+__no_sanitize_memory
 __visible void __irq_entry smp_apic_timer_interrupt(struct pt_regs *regs)
 {
 	struct pt_regs *old_regs = set_irq_regs(regs);
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 87b97897a8810..3d1691f81cada 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -150,6 +150,11 @@ static bool in_irq_stack(unsigned long *stack, struct stack_info *info)
 	return true;
 }
 
+/*
+ * This function may touch stale uninitialized values on stack. Do not
+ * instrument it with KMSAN to avoid false positives.
+ */
+__no_sanitize_memory
 int get_stack_info(unsigned long *stack, struct task_struct *task,
 		   struct stack_info *info, unsigned long *visit_mask)
 {
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ffd497804dbc3..5e8c6767e9916 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -424,6 +424,11 @@ void compat_start_thread(struct pt_regs *regs, u32 new_ip, u32 new_sp)
  * Kprobes not supported here. Set the probe on schedule instead.
  * Function graph tracer not supported too.
  */
+/*
+ * Avoid touching KMSAN state or reporting anything here, as __switch_to() does
+ * weird things with tasks.
+ */
+__no_sanitize_memory
 __visible __notrace_funcgraph struct task_struct *
 __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index d54cffdc7cac2..917268aee054e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -638,7 +638,11 @@ NOKPROBE_SYMBOL(do_int3);
  * Help handler running on a per-cpu (IST or entry trampoline) stack
  * to switch to the normal thread stack if the interrupted code was in
  * user mode. The actual stack switch is done in entry_64.S
+ *
  */
+
+/* This function switches the registers - don't instrument it with KMSAN. */
+__no_sanitize_memory
 asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs)
 {
 	struct pt_regs *regs = (struct pt_regs *)this_cpu_read(cpu_current_top_of_stack) - 1;
@@ -654,6 +658,11 @@ struct bad_iret_stack {
 };
 
 asmlinkage __visible notrace
+/*
+ * Dark magic happening here, let's not instrument this function.
+ * Also avoid copying any metadata by using raw __memmove().
+ */
+__no_sanitize_memory
 struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
 {
 	/*
@@ -668,10 +677,10 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
 		(struct bad_iret_stack *)this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
 
 	/* Copy the IRET target to the new stack. */
-	memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8);
+	__memmove(&new_stack->regs.ip, (void *)s->regs.sp, 5*8);
 
 	/* Copy the remainder of the stack from the current stack. */
-	memmove(new_stack, s, offsetof(struct bad_iret_stack, regs.ip));
+	__memmove(new_stack, s, offsetof(struct bad_iret_stack, regs.ip));
 
 	BUG_ON(!user_mode(&new_stack->regs));
 	return new_stack;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1a5937936ac75..bb1b659c12f6a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -471,6 +471,11 @@ void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
 		put_task_struct(task);
 }
 
+/*
+ * Context switch here may lead to KMSAN task state corruption. Disable KMSAN
+ * instrumentation.
+ */
+__no_sanitize_memory
 void wake_up_q(struct wake_q_head *head)
 {
 	struct wake_q_node *node = head->first;
@@ -3217,6 +3222,12 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev,
  * past. prev == current is still correct but we need to recalculate this_rq
  * because prev may have moved to another CPU.
  */
+
+/*
+ * Context switch here may lead to KMSAN task state corruption. Disable KMSAN
+ * instrumentation.
+ */
+__no_sanitize_memory
 static struct rq *finish_task_switch(struct task_struct *prev)
 	__releases(rq->lock)
 {
@@ -4052,6 +4063,12 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
  *
  * WARNING: must be called with preemption disabled!
  */
+
+/*
+ * Context switch here may lead to KMSAN task state corruption. Disable KMSAN
+ * instrumentation.
+ */
+__no_sanitize_memory
 static void __sched notrace __schedule(bool preempt)
 {
 	struct task_struct *prev, *next;
@@ -6789,6 +6806,11 @@ static inline int preempt_count_equals(int preempt_offset)
 	return (nested == preempt_offset);
 }
 
+/*
+ * This function might be called from code that is not instrumented with KMSAN.
+ * Nevertheless, treat its arguments as initialized.
+ */
+__no_sanitize_memory
 void __might_sleep(const char *file, int line, int preempt_offset)
 {
 	/*
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 25/38] kmsan: unpoison |tlb| in arch_tlb_gather_mmu()
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (23 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 24/38] kmsan: disable instrumentation of certain functions glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 26/38] kmsan: use __msan_ string functions where possible glider
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

This is a hack to reduce stackdepot pressure.

struct mmu_gather contains 7 1-bit fields packed into a 32-bit unsigned
int value. The other 25 bits are never initialized and never used, but
KMSAN still updates their origin in zap_pXX_range() in mm/memory.c,
thus creating very long origin chains. This is technically correct, but
consumes too much memory.

Unpoisoning the whole structure will prevent creating such chains.
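
For illustration only (not part of the patch): a minimal user-space sketch
of the situation described above. It assumes a toy shadow model in which a
set shadow bit marks an uninitialized bit. The names flag_word, model_*()
and NUM_FLAGS are hypothetical and do not exist in the kernel;
model_unpoison() merely stands in for kmsan_unpoison_shadow().

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FLAGS 7u

/* Toy model: one 32-bit flag word plus its shadow (1 bit = uninitialized). */
struct flag_word {
	uint32_t value;
	uint32_t shadow;
};

/* A fresh stack object: every bit is poisoned (value zeroed for the model). */
void model_poison(struct flag_word *w)
{
	w->value = 0;
	w->shadow = UINT32_MAX;
}

/* Writing a 1-bit field initializes only that single bit. */
void model_set_flag(struct flag_word *w, unsigned int idx, unsigned int bit)
{
	uint32_t mask;

	assert(idx < NUM_FLAGS);
	mask = UINT32_C(1) << idx;
	w->value = (w->value & ~mask) | (bit ? mask : 0);
	w->shadow &= ~mask;
}

/* Hypothetical stand-in for kmsan_unpoison_shadow(w, sizeof(*w)). */
void model_unpoison(struct flag_word *w)
{
	w->shadow = 0;
}

/* Count the bits that are still marked uninitialized. */
unsigned int model_poisoned_bits(const struct flag_word *w)
{
	unsigned int n = 0;
	uint32_t s;

	for (s = w->shadow; s; s >>= 1)
		n += s & 1u;
	return n;
}
```

In this model, setting all 7 flags still leaves 25 poisoned bits in the
word, and every copy of the word would carry them along. Unpoisoning the
whole object up front removes them in one step.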

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

v4:
 - removed a TODO, updated patch description

Change-Id: I22a201e7e4f67ed74f8129072f12e5351b26103a
---
 mm/mmu_gather.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index a3538cb2bcbee..d3d57c276e301 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -1,6 +1,7 @@
 #include <linux/gfp.h>
 #include <linux/highmem.h>
 #include <linux/kernel.h>
+#include <linux/kmsan-checks.h>
 #include <linux/mmdebug.h>
 #include <linux/mm_types.h>
 #include <linux/pagemap.h>
@@ -264,6 +265,15 @@ void tlb_flush_mmu(struct mmu_gather *tlb)
 void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
 			unsigned long start, unsigned long end)
 {
+	/*
+	 * struct mmu_gather contains 7 1-bit fields packed into a 32-bit
+	 * unsigned int value. The remaining 25 bits remain uninitialized
+	 * and are never used, but KMSAN updates the origin for them in
+	 * zap_pXX_range() in mm/memory.c, thus creating very long origin
+	 * chains. This is technically correct, but consumes too much memory.
+	 * Unpoisoning the whole structure will prevent creating such chains.
+	 */
+	kmsan_unpoison_shadow(tlb, sizeof(*tlb));
 	tlb->mm = mm;
 
 	/* Is it from 0 to ~0? */
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 26/38] kmsan: use __msan_ string functions where possible.
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (24 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 25/38] kmsan: unpoison |tlb| in arch_tlb_gather_mmu() glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 27/38] kmsan: hooks for copy_to_user() and friends glider
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

Unless stated otherwise (by explicitly calling __memcpy(), __memset() or
__memmove()) we want all string functions to call their __msan_ versions
(e.g. __msan_memcpy() instead of memcpy()), so that shadow and origin
values are updated accordingly.

The bootloader must still use the default string functions to avoid crashes.
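
For illustration only (not part of the patch): a minimal user-space sketch
of the redirection idea, assuming a toy byte-granular shadow where a
nonzero shadow byte means "uninitialized". All model_* names are
hypothetical; model_msan_memcpy() and model_raw_memcpy() only stand in for
__msan_memcpy() and __memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MEM_SIZE 64

/* Toy "kernel memory" and its shadow, indexed by the same offsets. */
unsigned char model_mem[MODEL_MEM_SIZE];
unsigned char model_shadow[MODEL_MEM_SIZE];

/*
 * Shadow-aware copy (stand-in for __msan_memcpy()): moves the data and
 * the shadow together, so the destination inherits the source's state.
 * memmove() is used only to keep the toy safe for overlapping ranges.
 */
void model_msan_memcpy(size_t dst, size_t src, size_t len)
{
	assert(dst + len <= MODEL_MEM_SIZE && src + len <= MODEL_MEM_SIZE);
	memmove(&model_mem[dst], &model_mem[src], len);
	memmove(&model_shadow[dst], &model_shadow[src], len);
}

/* Raw copy (stand-in for __memcpy()): data only, shadow left untouched. */
void model_raw_memcpy(size_t dst, size_t src, size_t len)
{
	assert(dst + len <= MODEL_MEM_SIZE && src + len <= MODEL_MEM_SIZE);
	memmove(&model_mem[dst], &model_mem[src], len);
}

/* Redirect the generic name to the shadow-aware version by default. */
#define model_memcpy(dst, src, len) model_msan_memcpy(dst, src, len)
```

A caller that uses the generic name gets consistent metadata for free; a
caller that explicitly picks the raw variant copies the bytes but leaves
the destination shadow as it was.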

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---
v3:
 - use default string functions in the bootloader
v4:
 - include kmsan-checks.h into compiler.h
 - also handle memset() and memmove()
 - fix https://github.com/google/kmsan/issues/64
v5:
 - don't compile memset() and memmove() under KMSAN

Change-Id: Ib2512ce5aa8d457453dd38caa12f58f002166813
---
 arch/x86/boot/compressed/misc.h               |  1 +
 arch/x86/include/asm/string_64.h              | 23 ++++++++++++++++++-
 .../firmware/efi/libstub/efi-stub-helper.c    |  5 ++++
 drivers/firmware/efi/libstub/tpm.c            |  5 ++++
 include/linux/compiler.h                      |  9 +++++++-
 include/linux/string.h                        |  2 ++
 6 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index c8181392f70d7..dd4bd8c5d97a1 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -12,6 +12,7 @@
 #undef CONFIG_PARAVIRT_XXL
 #undef CONFIG_PARAVIRT_SPINLOCKS
 #undef CONFIG_KASAN
+#undef CONFIG_KMSAN
 
 /* cpu_feature_enabled() cannot be used this early */
 #define USE_EARLY_PGTABLE_L5
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 75314c3dbe471..0442e679b3079 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -11,11 +11,23 @@
    function. */
 
 #define __HAVE_ARCH_MEMCPY 1
+#if defined(CONFIG_KMSAN)
+#undef memcpy
+/* __msan_memcpy() is defined in compiler.h */
+#define memcpy(dst, src, len) __msan_memcpy(dst, src, len)
+#else
 extern void *memcpy(void *to, const void *from, size_t len);
+#endif
 extern void *__memcpy(void *to, const void *from, size_t len);
 
 #define __HAVE_ARCH_MEMSET
+#if defined(CONFIG_KMSAN)
+extern void *__msan_memset(void *s, int c, size_t n);
+#undef memset
+#define memset(dst, c, len) __msan_memset(dst, c, len)
+#else
 void *memset(void *s, int c, size_t n);
+#endif
 void *__memset(void *s, int c, size_t n);
 
 #define __HAVE_ARCH_MEMSET16
@@ -55,7 +67,13 @@ static inline void *memset64(uint64_t *s, uint64_t v, size_t n)
 }
 
 #define __HAVE_ARCH_MEMMOVE
+#if defined(CONFIG_KMSAN)
+#undef memmove
+void *__msan_memmove(void *dest, const void *src, size_t len);
+#define memmove(dst, src, len) __msan_memmove(dst, src, len)
+#else
 void *memmove(void *dest, const void *src, size_t count);
+#endif
 void *__memmove(void *dest, const void *src, size_t count);
 
 int memcmp(const void *cs, const void *ct, size_t count);
@@ -64,7 +82,8 @@ char *strcpy(char *dest, const char *src);
 char *strcat(char *dest, const char *src);
 int strcmp(const char *cs, const char *ct);
 
-#if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
+#if (defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)) || \
+	(defined(CONFIG_KMSAN) && !defined(__SANITIZE_MEMORY__))
 
 /*
  * For files that not instrumented (e.g. mm/slub.c) we
@@ -73,7 +92,9 @@ int strcmp(const char *cs, const char *ct);
 
 #undef memcpy
 #define memcpy(dst, src, len) __memcpy(dst, src, len)
+#undef memmove
 #define memmove(dst, src, len) __memmove(dst, src, len)
+#undef memset
 #define memset(s, c, n) __memset(s, c, n)
 
 #ifndef __NO_FORTIFY
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index 9f34c72429397..610f791c2493e 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -5,7 +5,12 @@
  * implementation files.
  *
  * Copyright 2011 Intel Corporation; author Matt Fleming
+ *
+ *
+ * This file is not linked with KMSAN runtime.
+ * Do not replace memcpy with __memcpy.
  */
+#undef CONFIG_KMSAN
 
 #include <linux/efi.h>
 #include <asm/efi.h>
diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/efi/libstub/tpm.c
index 1d59e103a2e3a..7e8906b1c1c98 100644
--- a/drivers/firmware/efi/libstub/tpm.c
+++ b/drivers/firmware/efi/libstub/tpm.c
@@ -6,7 +6,12 @@
  * Copyright (C) 2017 Google, Inc.
  *     Matthew Garrett <mjg59@google.com>
  *     Thiebaud Weksteen <tweek@google.com>
+ *
+ *
+ * This file is not linked with KMSAN runtime.
+ * Do not replace memcpy with __memcpy.
  */
+#undef CONFIG_KMSAN
 #include <linux/efi.h>
 #include <linux/tpm_eventlog.h>
 #include <asm/efi.h>
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index c6c67729729e3..f2b97241fe2d4 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -180,6 +180,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #include <uapi/linux/types.h>
 #include <linux/kcsan-checks.h>
 
+#ifdef CONFIG_KMSAN
+void *__msan_memcpy(void *dst, const void *src, u64 size);
+#define __DO_MEMCPY(res, p, size) __msan_memcpy(res, p, size)
+#else
+#define __DO_MEMCPY(res, p, size) __builtin_memcpy(res, p, size)
+#endif
+
 #define __READ_ONCE_SIZE						\
 ({									\
 	switch (size) {							\
@@ -189,7 +196,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	case 8: *(__u64 *)res = *(volatile __u64 *)p; break;		\
 	default:							\
 		barrier();						\
-		__builtin_memcpy((void *)res, (const void *)p, size);	\
+		__DO_MEMCPY((void *)res, (const void *)p, size);	\
 		barrier();						\
 	}								\
 })
diff --git a/include/linux/string.h b/include/linux/string.h
index 6dfbb2efa8157..7ef92817c082f 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -356,6 +356,7 @@ __FORTIFY_INLINE char *strncat(char *p, const char *q, __kernel_size_t count)
 	return p;
 }
 
+#ifndef CONFIG_KMSAN
 __FORTIFY_INLINE void *memset(void *p, int c, __kernel_size_t size)
 {
 	size_t p_size = __builtin_object_size(p, 0);
@@ -395,6 +396,7 @@ __FORTIFY_INLINE void *memmove(void *p, const void *q, __kernel_size_t size)
 		fortify_panic(__func__);
 	return __builtin_memmove(p, q, size);
 }
+#endif
 
 extern void *__real_memscan(void *, int, __kernel_size_t) __RENAME(memscan);
 __FORTIFY_INLINE void *memscan(void *p, int c, __kernel_size_t size)
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 27/38] kmsan: hooks for copy_to_user() and friends
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (25 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 26/38] kmsan: use __msan_ string functions where possible glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 28/38] kmsan: init: call KMSAN initialization routines glider
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Alexander Viro, Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm
  Cc: glider, adilger.kernel, akpm, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

Memory that is copied from userspace must be unpoisoned.
Before copying memory to userspace, check it and report an error if it
contains uninitialized bits.
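
For illustration only (not part of the patch): a minimal user-space sketch
of the two hooks, assuming a toy byte-granular shadow (nonzero shadow byte
= uninitialized). All model_* names and the "avail" parameter are
hypothetical; the raw copy helper only imitates the raw_copy_from_user()
convention of returning the number of bytes that were NOT copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 16

/* Toy kernel buffer plus its shadow. */
struct model_buf {
	unsigned char data[BUF_SIZE];
	unsigned char shadow[BUF_SIZE];
};

/*
 * Stand-in for raw_copy_from_user(): copies at most "avail" of the "n"
 * requested bytes and returns the number of bytes not copied.
 */
size_t model_raw_copy_from_user(struct model_buf *to,
				const unsigned char *from,
				size_t n, size_t avail)
{
	size_t done = n < avail ? n : avail;

	assert(n <= BUF_SIZE);
	memcpy(to->data, from, done);
	return n - done;
}

/* Unpoison only the n - res bytes that really arrived from userspace. */
size_t model_copy_from_user(struct model_buf *to, const unsigned char *from,
			    size_t n, size_t avail)
{
	size_t res = model_raw_copy_from_user(to, from, n, avail);

	memset(to->shadow, 0, n - res);
	return res;
}

/* Check before copying out: returns 1 if any of the first n bytes is poisoned. */
int model_check_memory(const struct model_buf *from, size_t n)
{
	size_t i;

	assert(n <= BUF_SIZE);
	for (i = 0; i < n; i++)
		if (from->shadow[i])
			return 1;
	return 0;
}
```

With a partial copy, only the bytes that were actually transferred become
initialized in the model. Sending the whole buffer back to userspace
afterwards would then be flagged by the check.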

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---

v3:
 - fixed compilation errors reported by kbuild test bot

v4:
 - minor variable fixes as requested by Andrey Konovalov
 - simplified code around copy_to_user() hooks

v5:
 - fixed use of uninitialized value spotted by kbuild test robot <lkp@intel.com>

Change-Id: I38428b9c7d1909b8441dcec1749b080494a7af99
---
 arch/x86/include/asm/uaccess.h   | 10 ++++++++++
 include/asm-generic/cacheflush.h |  7 ++++++-
 include/asm-generic/uaccess.h    | 12 +++++++++--
 include/linux/uaccess.h          | 34 ++++++++++++++++++++++++++------
 lib/iov_iter.c                   | 14 +++++++++----
 lib/usercopy.c                   |  8 ++++++--
 6 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 61d93f062a36e..bfb55fdba5df4 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -6,6 +6,7 @@
  */
 #include <linux/compiler.h>
 #include <linux/kasan-checks.h>
+#include <linux/kmsan-checks.h>
 #include <linux/string.h>
 #include <asm/asm.h>
 #include <asm/page.h>
@@ -174,6 +175,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 			ASM_CALL_CONSTRAINT				\
 		     : "0" (ptr), "i" (sizeof(*(ptr))));		\
 	(x) = (__force __typeof__(*(ptr))) __val_gu;			\
+	kmsan_unpoison_shadow(&(x), sizeof(*(ptr)));			\
 	__builtin_expect(__ret_gu, 0);					\
 })
 
@@ -248,6 +250,7 @@ extern void __put_user_8(void);
 	__chk_user_ptr(ptr);					\
 	might_fault();						\
 	__pu_val = x;						\
+	kmsan_check_memory(&(__pu_val), sizeof(*(ptr)));	\
 	switch (sizeof(*(ptr))) {				\
 	case 1:							\
 		__put_user_x(1, __pu_val, ptr, __ret_pu);	\
@@ -270,7 +273,9 @@ extern void __put_user_8(void);
 
 #define __put_user_size(x, ptr, size, label)				\
 do {									\
+	__typeof__(*(ptr)) __pus_val = x;				\
 	__chk_user_ptr(ptr);						\
+	kmsan_check_memory(&(__pus_val), size);				\
 	switch (size) {							\
 	case 1:								\
 		__put_user_goto(x, ptr, "b", "b", "iq", label);	\
@@ -295,7 +300,9 @@ do {									\
  */
 #define __put_user_size_ex(x, ptr, size)				\
 do {									\
+	__typeof__(*(ptr)) __puse_val = x;				\
 	__chk_user_ptr(ptr);						\
+	kmsan_check_memory(&(__puse_val), size);			\
 	switch (size) {							\
 	case 1:								\
 		__put_user_asm_ex(x, ptr, "b", "b", "iq");		\
@@ -363,6 +370,7 @@ do {									\
 	default:							\
 		(x) = __get_user_bad();					\
 	}								\
+	kmsan_unpoison_shadow(&(x), size);				\
 } while (0)
 
 #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
@@ -413,6 +421,7 @@ do {									\
 	default:							\
 		(x) = __get_user_bad();					\
 	}								\
+	kmsan_unpoison_shadow(&(x), size);				\
 } while (0)
 
 #define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
@@ -433,6 +442,7 @@ do {									\
 	__typeof__(ptr) __pu_ptr = (ptr);			\
 	__typeof__(size) __pu_size = (size);			\
 	__uaccess_begin();					\
+	kmsan_check_memory(&(__pu_val), size);			\
 	__put_user_size(__pu_val, __pu_ptr, __pu_size, __pu_label);	\
 	__pu_err = 0;						\
 __pu_label:							\
diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
index cac7404b2bdd2..7023c44457ef9 100644
--- a/include/asm-generic/cacheflush.h
+++ b/include/asm-generic/cacheflush.h
@@ -4,6 +4,7 @@
 
 /* Keep includes the same across arches.  */
 #include <linux/mm.h>
+#include <linux/kmsan-checks.h>
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
 
@@ -99,6 +100,7 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
 #ifndef copy_to_user_page
 #define copy_to_user_page(vma, page, vaddr, dst, src, len)	\
 	do { \
+		kmsan_check_memory(src, len); \
 		memcpy(dst, src, len); \
 		flush_icache_user_range(vma, page, vaddr, len); \
 	} while (0)
@@ -106,7 +108,10 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
 
 #ifndef copy_from_user_page
 #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
-	memcpy(dst, src, len)
+	do { \
+		memcpy(dst, src, len); \
+		kmsan_unpoison_shadow(dst, len); \
+	} while (0)
 #endif
 
 #endif /* __ASM_CACHEFLUSH_H */
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index e935318804f8a..88b626c3ef2de 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -142,7 +142,11 @@ static inline int __access_ok(unsigned long addr, unsigned long size)
 
 static inline int __put_user_fn(size_t size, void __user *ptr, void *x)
 {
-	return unlikely(raw_copy_to_user(ptr, x, size)) ? -EFAULT : 0;
+	int n;
+
+	n = raw_copy_to_user(ptr, x, size);
+	kmsan_copy_to_user(ptr, x, size, n);
+	return unlikely(n) ? -EFAULT : 0;
 }
 
 #define __put_user_fn(sz, u, k)	__put_user_fn(sz, u, k)
@@ -203,7 +207,11 @@ extern int __put_user_bad(void) __attribute__((noreturn));
 #ifndef __get_user_fn
 static inline int __get_user_fn(size_t size, const void __user *ptr, void *x)
 {
-	return unlikely(raw_copy_from_user(x, ptr, size)) ? -EFAULT : 0;
+	int res;
+
+	res = raw_copy_from_user(x, ptr, size);
+	kmsan_unpoison_shadow(x, size - res);
+	return unlikely(res) ? -EFAULT : 0;
 }
 
 #define __get_user_fn(sz, u, k)	__get_user_fn(sz, u, k)
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 8a215c5c1aed8..b38bdeb135dde 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -5,6 +5,7 @@
 #include <linux/instrumented.h>
 #include <linux/sched.h>
 #include <linux/thread_info.h>
+#include <linux/kmsan-checks.h>
 
 #define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)
 
@@ -58,18 +59,26 @@
 static __always_inline __must_check unsigned long
 __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
 {
+	unsigned long res;
+
 	instrument_copy_from_user(to, from, n);
 	check_object_size(to, n, false);
-	return raw_copy_from_user(to, from, n);
+	res = raw_copy_from_user(to, from, n);
+	kmsan_unpoison_shadow(to, n - res);
+	return res;
 }
 
 static __always_inline __must_check unsigned long
 __copy_from_user(void *to, const void __user *from, unsigned long n)
 {
+	unsigned long res;
+
 	might_fault();
 	instrument_copy_from_user(to, from, n);
 	check_object_size(to, n, false);
-	return raw_copy_from_user(to, from, n);
+	res = raw_copy_from_user(to, from, n);
+	kmsan_unpoison_shadow(to, n - res);
+	return res;
 }
 
 /**
@@ -88,18 +97,26 @@ __copy_from_user(void *to, const void __user *from, unsigned long n)
 static __always_inline __must_check unsigned long
 __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
 {
+	unsigned long res;
+
 	instrument_copy_to_user(to, from, n);
 	check_object_size(from, n, true);
-	return raw_copy_to_user(to, from, n);
+	res = raw_copy_to_user(to, from, n);
+	kmsan_copy_to_user((const void *)to, from, n, res);
+	return res;
 }
 
 static __always_inline __must_check unsigned long
 __copy_to_user(void __user *to, const void *from, unsigned long n)
 {
+	unsigned long res;
+
 	might_fault();
 	instrument_copy_to_user(to, from, n);
 	check_object_size(from, n, true);
-	return raw_copy_to_user(to, from, n);
+	res = raw_copy_to_user(to, from, n);
+	kmsan_copy_to_user((const void *)to, from, n, res);
+	return res;
 }
 
 #ifdef INLINE_COPY_FROM_USER
@@ -107,10 +124,12 @@ static inline __must_check unsigned long
 _copy_from_user(void *to, const void __user *from, unsigned long n)
 {
 	unsigned long res = n;
+
 	might_fault();
 	if (likely(access_ok(from, n))) {
 		instrument_copy_from_user(to, from, n);
 		res = raw_copy_from_user(to, from, n);
+		kmsan_unpoison_shadow(to, n - res);
 	}
 	if (unlikely(res))
 		memset(to + (n - res), 0, res);
@@ -125,12 +144,15 @@ _copy_from_user(void *, const void __user *, unsigned long);
 static inline __must_check unsigned long
 _copy_to_user(void __user *to, const void *from, unsigned long n)
 {
+	unsigned long res = n;
+
 	might_fault();
 	if (access_ok(to, n)) {
 		instrument_copy_to_user(to, from, n);
-		n = raw_copy_to_user(to, from, n);
+		res = raw_copy_to_user(to, from, n);
+		kmsan_copy_to_user(to, from, n, res);
 	}
-	return n;
+	return res;
 }
 #else
 extern __must_check unsigned long
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index bf538c2bec777..179c28455693d 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -138,20 +138,26 @@
 
 static int copyout(void __user *to, const void *from, size_t n)
 {
+	int res = n;
+
 	if (access_ok(to, n)) {
 		instrument_copy_to_user(to, from, n);
-		n = raw_copy_to_user(to, from, n);
+		res = raw_copy_to_user(to, from, n);
+		kmsan_copy_to_user(to, from, n, res);
 	}
-	return n;
+	return res;
 }
 
 static int copyin(void *to, const void __user *from, size_t n)
 {
+	size_t res = n;
+
 	if (access_ok(from, n)) {
 		instrument_copy_from_user(to, from, n);
-		n = raw_copy_from_user(to, from, n);
+		res = raw_copy_from_user(to, from, n);
+		kmsan_unpoison_shadow(to, n - res);
 	}
-	return n;
+	return res;
 }
 
 static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t bytes,
diff --git a/lib/usercopy.c b/lib/usercopy.c
index 4bb1c5e7a3eb0..bf313548c4039 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/bitops.h>
 #include <linux/instrumented.h>
+#include <linux/kmsan-checks.h>
 #include <linux/uaccess.h>
 
 /* out-of-line parts */
@@ -13,6 +14,7 @@ unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n
 	if (likely(access_ok(from, n))) {
 		instrument_copy_from_user(to, from, n);
 		res = raw_copy_from_user(to, from, n);
+		kmsan_unpoison_shadow(to, n - res);
 	}
 	if (unlikely(res))
 		memset(to + (n - res), 0, res);
@@ -24,12 +26,14 @@ EXPORT_SYMBOL(_copy_from_user);
 #ifndef INLINE_COPY_TO_USER
 unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
 {
+	unsigned long res = n;
 	might_fault();
 	if (likely(access_ok(to, n))) {
 		instrument_copy_to_user(to, from, n);
-		n = raw_copy_to_user(to, from, n);
+		res = raw_copy_to_user(to, from, n);
+		kmsan_copy_to_user(to, from, n, res);
 	}
-	return n;
+	return res;
 }
 EXPORT_SYMBOL(_copy_to_user);
 #endif
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 28/38] kmsan: init: call KMSAN initialization routines
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (26 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 27/38] kmsan: hooks for copy_to_user() and friends glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 29/38] kmsan: enable KMSAN builds glider
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Vegard Nossum, Dmitry Vyukov,
	Andrey Konovalov, Marco Elver, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

kmsan_initialize_shadow() sets up metadata pages for the mappings that
already exist at boot time.

kmsan_initialize() initializes the bookkeeping for init_task and enables
KMSAN.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Marco Elver <elver@google.com>
Cc: linux-mm@kvack.org

---

Change-Id: Ie3af251d629b911668f8651d868c544f3c11209f
---
 init/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/init/main.c b/init/main.c
index 345a9ab4450f1..4dd15063d32fe 100644
--- a/init/main.c
+++ b/init/main.c
@@ -33,6 +33,7 @@
 #include <linux/nmi.h>
 #include <linux/percpu.h>
 #include <linux/kmod.h>
+#include <linux/kmsan.h>
 #include <linux/vmalloc.h>
 #include <linux/kernel_stat.h>
 #include <linux/start_kernel.h>
@@ -772,6 +773,7 @@ static void __init mm_init(void)
 	page_ext_init_flatmem();
 	init_debug_pagealloc();
 	report_meminit();
+	kmsan_initialize_shadow();
 	mem_init();
 	kmem_cache_init();
 	kmemleak_init();
@@ -847,6 +849,7 @@ asmlinkage __visible void __init start_kernel(void)
 	sort_main_extable();
 	trap_init();
 	mm_init();
+	kmsan_initialize();
 
 	ftrace_init();
 
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 29/38] kmsan: enable KMSAN builds
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (27 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 28/38] kmsan: init: call KMSAN initialization routines glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 30/38] kmsan: handle /dev/[u]random glider
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

Make KMSAN usable by adding the necessary Makefile bits.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---
This patch was previously called "kmsan: Changing existing files to
enable KMSAN builds". Logically unrelated parts of it were split away.

v4:
 - split away changes to init/main.c as requested by Andrey Konovalov

Change-Id: I37e0b7f2d2f2b0aeac5753ff9d6b411485fc374e
---
 Makefile             | 3 ++-
 mm/Makefile          | 1 +
 scripts/Makefile.lib | 6 ++++++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index a532333b4cd02..da315f20618b3 100644
--- a/Makefile
+++ b/Makefile
@@ -482,7 +482,7 @@ export KBUILD_HOSTCXXFLAGS KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS LDFLAGS_MODULE
 
 export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS KBUILD_LDFLAGS
 export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE
-export CFLAGS_KASAN CFLAGS_KASAN_NOSANITIZE CFLAGS_UBSAN CFLAGS_KCSAN
+export CFLAGS_KASAN CFLAGS_KASAN_NOSANITIZE CFLAGS_UBSAN CFLAGS_KCSAN CFLAGS_KMSAN
 export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
 export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
 export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
@@ -901,6 +901,7 @@ KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
 endif
 
 include scripts/Makefile.kasan
+include scripts/Makefile.kmsan
 include scripts/Makefile.extrawarn
 include scripts/Makefile.ubsan
 include scripts/Makefile.kcsan
diff --git a/mm/Makefile b/mm/Makefile
index fa91e963c2f9e..7b9bce9cc0afb 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -81,6 +81,7 @@ obj-$(CONFIG_PAGE_POISONING) += page_poison.o
 obj-$(CONFIG_SLAB) += slab.o
 obj-$(CONFIG_SLUB) += slub.o
 obj-$(CONFIG_KASAN)	+= kasan/
+obj-$(CONFIG_KMSAN)	+= kmsan/
 obj-$(CONFIG_FAILSLAB) += failslab.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_MEMTEST)		+= memtest.o
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index b12dd5ba48960..e9a8c2671a4b3 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -137,6 +137,12 @@ _c_flags += $(if $(patsubst n%,, \
 		$(CFLAGS_KASAN), $(CFLAGS_KASAN_NOSANITIZE))
 endif
 
+ifeq ($(CONFIG_KMSAN),y)
+_c_flags += $(if $(patsubst n%,, \
+		$(KMSAN_SANITIZE_$(basetarget).o)$(KMSAN_SANITIZE)y), \
+		$(CFLAGS_KMSAN))
+endif
+
 ifeq ($(CONFIG_UBSAN),y)
 _c_flags += $(if $(patsubst n%,, \
 		$(UBSAN_SANITIZE_$(basetarget).o)$(UBSAN_SANITIZE)$(CONFIG_UBSAN_SANITIZE_ALL)), \
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 30/38] kmsan: handle /dev/[u]random
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (28 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 29/38] kmsan: enable KMSAN builds glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 31/38] kmsan: virtio: check/unpoison scatterlist in vring_map_one_sg() glider
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Andrew Morton, Jens Axboe, Theodore Ts'o, Dmitry Torokhov,
	Martin K. Petersen, Michael S. Tsirkin, Christoph Hellwig,
	Eric Dumazet, Eric Van Hensbergen, Takashi Iwai, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov, Matthew Wilcox,
	linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, darrick.wong, davem, ebiggers, gregkh, harry.wentland,
	herbert, iii, mingo, jasowang, m.szyprowski, mark.rutland,
	schwidefsky, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tglx, gor, wsa

The random number generator may use uninitialized memory, but it must not
return uninitialized values. Unpoison the CRNG state in _extract_crng()
before generating output to prevent false reports.
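
For readers unfamiliar with poisoning, here is a minimal userspace
sketch of the idea. It assumes a toy shadow model with one shadow byte
per data byte, where a nonzero shadow byte means "uninitialized". The
names toy_poison(), toy_unpoison() and toy_is_initialized() are
hypothetical and are not the KMSAN runtime API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model: one shadow byte per data byte, nonzero = uninitialized. */
struct toy_region {
	uint32_t state[16];
	uint8_t shadow[sizeof(uint32_t) * 16];
};

/* Mark the whole region as uninitialized. */
static void toy_poison(struct toy_region *r)
{
	assert(r != NULL);
	memset(r->shadow, 0xff, sizeof(r->shadow));
}

/* Mark the first @size bytes as initialized, whatever their source. */
static void toy_unpoison(struct toy_region *r, size_t size)
{
	assert(r != NULL);
	if (size > sizeof(r->shadow))
		size = sizeof(r->shadow);
	memset(r->shadow, 0, size);
}

/* Return true if none of the first @size bytes is poisoned. */
static bool toy_is_initialized(const struct toy_region *r, size_t size)
{
	assert(r != NULL);
	for (size_t i = 0; i < size && i < sizeof(r->shadow); i++)
		if (r->shadow[i])
			return false;
	return true;
}
```

In this model, unpoisoning only changes the metadata; the data bytes
themselves are left untouched.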

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org

---
This patch was previously known as "kmsan: unpoisoning buffers from
devices etc.", but it turned out to be possible to drop most of the
annotations from that patch, so it only relates to /dev/random now.

Change-Id: Id460e7a86ce564f1357469f53d0c7410ca08f0e9
---
 drivers/char/random.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0d10e31fd342f..7cd36c726b045 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -322,6 +322,7 @@
 #include <linux/fs.h>
 #include <linux/genhd.h>
 #include <linux/interrupt.h>
+#include <linux/kmsan-checks.h>
 #include <linux/mm.h>
 #include <linux/nodemask.h>
 #include <linux/spinlock.h>
@@ -1007,6 +1008,11 @@ static void _extract_crng(struct crng_state *crng,
 	spin_lock_irqsave(&crng->lock, flags);
 	if (arch_get_random_long(&v))
 		crng->state[14] ^= v;
+	/*
+	 * Regardless of where the random data comes from, KMSAN should treat
+	 * it as initialized.
+	 */
+	kmsan_unpoison_shadow(crng->state, sizeof(crng->state));
 	chacha20_block(&crng->state[0], out);
 	if (crng->state[12] == 0)
 		crng->state[13]++;
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 31/38] kmsan: virtio: check/unpoison scatterlist in vring_map_one_sg()
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (29 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 30/38] kmsan: handle /dev/[u]random glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 32/38] kmsan: disable strscpy() optimization under KMSAN glider
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	Michael S. Tsirkin, Jason Wang, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, axboe, m.szyprowski,
	mark.rutland, martin.petersen, schwidefsky, willy, mhocko,
	monstr, pmladek, cai, rdunlap, robin.murphy, sergey.senozhatsky,
	rostedt, tiwai, tytso, tglx, gor, wsa

If vring doesn't use the DMA API, KMSAN is unable to tell whether the
memory is initialized by hardware. Explicitly call kmsan_handle_dma()
from vring_map_one_sg() in this case to prevent false positives.
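
A rough userspace sketch of what "check/unpoison depending on the
direction" could look like. This is an assumption about the intended
semantics, not the actual kmsan_handle_dma() implementation: data going
to the device is checked, data coming from the device is unpoisoned,
and a bidirectional mapping gets both. enum toy_dma_dir and
toy_handle_dma() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's enum dma_data_direction. */
enum toy_dma_dir {
	TOY_DMA_BIDIRECTIONAL,
	TOY_DMA_TO_DEVICE,
	TOY_DMA_FROM_DEVICE,
};

/*
 * @shadow holds one byte per data byte, nonzero = uninitialized.
 * Returns false if uninitialized bytes would be sent to the device.
 */
static bool toy_handle_dma(uint8_t *shadow, size_t size,
			   enum toy_dma_dir dir)
{
	bool ok = true;

	assert(shadow != NULL);
	/* Data going to the device must already be initialized. */
	if (dir != TOY_DMA_FROM_DEVICE)
		for (size_t i = 0; i < size; i++)
			if (shadow[i])
				ok = false;
	/* Data the device may write is treated as initialized afterwards. */
	if (dir != TOY_DMA_TO_DEVICE)
		memset(shadow, 0, size);
	return ok;
}
```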

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: linux-mm@kvack.org
---

Change-Id: Icc8678289b7084139320fc503898a67aa9803458
---
 drivers/virtio/virtio_ring.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 58b96baa8d488..8b6dee1dfde58 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/hrtimer.h>
 #include <linux/dma-mapping.h>
+#include <linux/kmsan-checks.h>
 #include <xen/xen.h>
 
 #ifdef DEBUG
@@ -326,8 +327,15 @@ static dma_addr_t vring_map_one_sg(const struct vring_virtqueue *vq,
 				   struct scatterlist *sg,
 				   enum dma_data_direction direction)
 {
-	if (!vq->use_dma_api)
+	if (!vq->use_dma_api) {
+		/*
+		 * If DMA is not used, KMSAN doesn't know that the scatterlist
+		 * is initialized by the hardware. Explicitly check/unpoison it
+		 * depending on the direction.
+		 */
+		kmsan_handle_dma(sg_virt(sg), sg->length, direction);
 		return (dma_addr_t)sg_phys(sg);
+	}
 
 	/*
 	 * We can't use dma_map_sg, because we don't use scatterlists in
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 32/38] kmsan: disable strscpy() optimization under KMSAN
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (30 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 31/38] kmsan: virtio: check/unpoison scatterlist in vring_map_one_sg() glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 33/38] kmsan: add iomap support glider
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Vegard Nossum, Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, mhocko, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

Disable the efficient 8-byte reading under KMSAN to avoid false positives.
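
To illustrate why the word-sized reads are a problem, here is a small
userspace sketch. It assumes a toy shadow array (nonzero = uninitialized)
and counts how many poisoned bytes each copy strategy would load. The
helper names are hypothetical; this is not the strscpy() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WORD sizeof(unsigned long)

/*
 * A byte-at-a-time scan stops at the first NUL, so it never loads
 * anything after the terminator.
 */
static size_t toy_bytewise_poisoned_reads(const char *src,
					  const uint8_t *shadow, size_t len)
{
	size_t hits = 0;

	assert(src != NULL && shadow != NULL);
	for (size_t i = 0; i < len; i++) {
		hits += shadow[i] != 0;
		if (src[i] == '\0')
			break;
	}
	return hits;
}

/*
 * A word-at-a-time scan loads every byte of a word before it can tell
 * where the NUL is, so bytes after the terminator are loaded too.
 */
static size_t toy_wordwise_poisoned_reads(const char *src,
					  const uint8_t *shadow, size_t len)
{
	size_t hits = 0;

	assert(src != NULL && shadow != NULL);
	for (size_t off = 0; off + TOY_WORD <= len; off += TOY_WORD) {
		int saw_nul = 0;

		for (size_t i = off; i < off + TOY_WORD; i++) {
			hits += shadow[i] != 0;
			saw_nul |= src[i] == '\0';
		}
		if (saw_nul)
			break;
	}
	return hits;
}
```

For a three-byte string "hi\0" at the start of a larger buffer whose
remaining bytes are poisoned, the bytewise scan loads no poisoned
bytes, while the wordwise scan loads TOY_WORD - 3 of them.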

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

v4:
 - actually disable the optimization under KMSAN via max=0
 - use IS_ENABLED as requested by Marco Elver

Change-Id: I25d1acf5c3df6eff85894cd94f5ddbe93308271c
---
 lib/string.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/string.c b/lib/string.c
index 6012c385fb314..fec929e70f1a5 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -202,6 +202,14 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
 		max = 0;
 #endif
 
+	/*
+	 * read_word_at_a_time() below may read uninitialized bytes after the
+	 * trailing zero and use them in comparisons. Disable this optimization
+	 * under KMSAN to prevent false positive reports.
+	 */
+	if (IS_ENABLED(CONFIG_KMSAN))
+		max = 0;
+
 	while (max >= sizeof(unsigned long)) {
 		unsigned long c, data;
 
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 33/38] kmsan: add iomap support
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (31 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 32/38] kmsan: disable strscpy() optimization under KMSAN glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page() glider
                   ` (4 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Christoph Hellwig, Darrick J. Wong, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

Functions from lib/iomap.c interact with hardware, so KMSAN must ensure
that:
 - every read function returns an initialized value
 - every write function checks values before sending them to hardware.
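
The two rules above, together with the buffer sizes used by the
repeated accessors (count items of 1, 2 or 4 bytes each), can be
sketched in userspace as follows. The toy shadow array (nonzero =
uninitialized) and all toy_* names are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes covered by a repeated accessor moving @count items of @width. */
static size_t toy_rep_bytes(unsigned long count, size_t width)
{
	assert(width == 1 || width == 2 || width == 4);
	return (size_t)count * width;
}

/* Write path: refuse a source buffer that still has poisoned bytes. */
static bool toy_check_before_write(const uint8_t *shadow,
				   unsigned long count, size_t width)
{
	size_t n = toy_rep_bytes(count, width);

	for (size_t i = 0; i < n; i++)
		if (shadow[i])
			return false;
	return true;
}

/* Read path: everything the device filled in counts as initialized. */
static void toy_unpoison_after_read(uint8_t *shadow,
				    unsigned long count, size_t width)
{
	memset(shadow, 0, toy_rep_bytes(count, width));
}
```

Note that the byte count, not the item count, determines how much
metadata is checked or cleared.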

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---
v4:
 - adjust sizes of checked memory buffers as requested by Marco Elver

Change-Id: Iacd96265e56398d8c111637ddad3cad727e48c8d
---
 lib/iomap.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/lib/iomap.c b/lib/iomap.c
index e909ab71e995d..3582e8d1ca34e 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -6,6 +6,7 @@
  */
 #include <linux/pci.h>
 #include <linux/io.h>
+#include <linux/kmsan-checks.h>
 
 #include <linux/export.h>
 
@@ -70,26 +71,31 @@ static void bad_io_access(unsigned long port, const char *access)
 #define mmio_read64be(addr) swab64(readq(addr))
 #endif
 
+__no_sanitize_memory
 unsigned int ioread8(void __iomem *addr)
 {
 	IO_COND(addr, return inb(port), return readb(addr));
 	return 0xff;
 }
+__no_sanitize_memory
 unsigned int ioread16(void __iomem *addr)
 {
 	IO_COND(addr, return inw(port), return readw(addr));
 	return 0xffff;
 }
+__no_sanitize_memory
 unsigned int ioread16be(void __iomem *addr)
 {
 	IO_COND(addr, return pio_read16be(port), return mmio_read16be(addr));
 	return 0xffff;
 }
+__no_sanitize_memory
 unsigned int ioread32(void __iomem *addr)
 {
 	IO_COND(addr, return inl(port), return readl(addr));
 	return 0xffffffff;
 }
+__no_sanitize_memory
 unsigned int ioread32be(void __iomem *addr)
 {
 	IO_COND(addr, return pio_read32be(port), return mmio_read32be(addr));
@@ -142,18 +148,21 @@ static u64 pio_read64be_hi_lo(unsigned long port)
 	return lo | (hi << 32);
 }
 
+__no_sanitize_memory
 u64 ioread64_lo_hi(void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr));
 	return 0xffffffffffffffffULL;
 }
 
+__no_sanitize_memory
 u64 ioread64_hi_lo(void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64_hi_lo(port), return readq(addr));
 	return 0xffffffffffffffffULL;
 }
 
+__no_sanitize_memory
 u64 ioread64be_lo_hi(void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64be_lo_hi(port),
@@ -161,6 +170,7 @@ u64 ioread64be_lo_hi(void __iomem *addr)
 	return 0xffffffffffffffffULL;
 }
 
+__no_sanitize_memory
 u64 ioread64be_hi_lo(void __iomem *addr)
 {
 	IO_COND(addr, return pio_read64be_hi_lo(port),
@@ -188,22 +198,32 @@ EXPORT_SYMBOL(ioread64be_hi_lo);
 
 void iowrite8(u8 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, outb(val,port), writeb(val, addr));
 }
 void iowrite16(u16 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, outw(val,port), writew(val, addr));
 }
 void iowrite16be(u16 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, pio_write16be(val,port), mmio_write16be(val, addr));
 }
 void iowrite32(u32 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, outl(val,port), writel(val, addr));
 }
 void iowrite32be(u32 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, pio_write32be(val,port), mmio_write32be(val, addr));
 }
 EXPORT_SYMBOL(iowrite8);
@@ -239,24 +259,32 @@ static void pio_write64be_hi_lo(u64 val, unsigned long port)
 
 void iowrite64_lo_hi(u64 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, pio_write64_lo_hi(val, port),
 		writeq(val, addr));
 }
 
 void iowrite64_hi_lo(u64 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, pio_write64_hi_lo(val, port),
 		writeq(val, addr));
 }
 
 void iowrite64be_lo_hi(u64 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, pio_write64be_lo_hi(val, port),
 		mmio_write64be(val, addr));
 }
 
 void iowrite64be_hi_lo(u64 val, void __iomem *addr)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(&val, sizeof(val));
 	IO_COND(addr, pio_write64be_hi_lo(val, port),
 		mmio_write64be(val, addr));
 }
@@ -328,14 +356,20 @@ static inline void mmio_outsl(void __iomem *addr, const u32 *src, int count)
 void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
 {
 	IO_COND(addr, insb(port,dst,count), mmio_insb(addr, dst, count));
+	/* KMSAN must treat values read from devices as initialized. */
+	kmsan_unpoison_shadow(dst, count);
 }
 void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
 {
 	IO_COND(addr, insw(port,dst,count), mmio_insw(addr, dst, count));
+	/* KMSAN must treat values read from devices as initialized. */
+	kmsan_unpoison_shadow(dst, count * 2);
 }
 void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
 {
 	IO_COND(addr, insl(port,dst,count), mmio_insl(addr, dst, count));
+	/* KMSAN must treat values read from devices as initialized. */
+	kmsan_unpoison_shadow(dst, count * 4);
 }
 EXPORT_SYMBOL(ioread8_rep);
 EXPORT_SYMBOL(ioread16_rep);
@@ -343,14 +377,20 @@ EXPORT_SYMBOL(ioread32_rep);
 
 void iowrite8_rep(void __iomem *addr, const void *src, unsigned long count)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(src, count);
 	IO_COND(addr, outsb(port, src, count), mmio_outsb(addr, src, count));
 }
 void iowrite16_rep(void __iomem *addr, const void *src, unsigned long count)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(src, count * 2);
 	IO_COND(addr, outsw(port, src, count), mmio_outsw(addr, src, count));
 }
 void iowrite32_rep(void __iomem *addr, const void *src, unsigned long count)
 {
+	/* Make sure uninitialized memory isn't copied to devices. */
+	kmsan_check_memory(src, count * 4);
 	IO_COND(addr, outsl(port, src,count), mmio_outsl(addr, src, count));
 }
 EXPORT_SYMBOL(iowrite8_rep);
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (32 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 33/38] kmsan: add iomap support glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:19   ` Christoph Hellwig
  2020-03-25 16:12 ` [PATCH v5 35/38] kmsan: disable physical page merging in biovec glider
                   ` (3 subsequent siblings)
  37 siblings, 1 reply; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, darrick.wong, davem, dmitry.torokhov,
	ebiggers, edumazet, ericvh, gregkh, harry.wentland, herbert, iii,
	mingo, jasowang, axboe, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

KMSAN doesn't know about DMA memory writes performed by devices.
We unpoison such memory when it's mapped to avoid false positive
reports.

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---

Change-Id: Ib1019ed531fea69f88b5cdec3d1e27403f2f3d64
---
 kernel/dma/direct.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index a8560052a915f..63dc1a594964a 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -367,6 +367,7 @@ dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
 			     &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
 		return DMA_MAPPING_ERROR;
 	}
+	kmsan_handle_dma(page_address(page) + offset, size, dir);
 
 	if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		arch_sync_dma_for_device(phys, size, dir);
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 35/38] kmsan: disable physical page merging in biovec
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (33 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page() glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 36/38] x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN glider
                   ` (2 subsequent siblings)
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Jens Axboe, Andy Lutomirski, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Christoph Hellwig, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, ard.biesheuvel,
	arnd, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, tglx,
	gor, wsa

KMSAN metadata for consecutive physical pages may not be contiguous,
so accessing such pages as a single range may lead to metadata
corruption.
Disable merging of pages in biovec to prevent such corruption.
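
A toy userspace model of the problem, assuming each page descriptor
records where its data and its metadata live. struct toy_page and the
toy_* helpers are hypothetical and only illustrate why adjacency of
the data says nothing about adjacency of the metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Toy page descriptor: data location and metadata location. */
struct toy_page {
	uintptr_t data_addr;
	uintptr_t shadow_addr;
};

/* Data of @b starts exactly where data of @a ends. */
static bool toy_data_contiguous(const struct toy_page *a,
				const struct toy_page *b)
{
	assert(a != NULL && b != NULL);
	return a->data_addr + TOY_PAGE_SIZE == b->data_addr;
}

/* Metadata of @b starts exactly where metadata of @a ends. */
static bool toy_shadow_contiguous(const struct toy_page *a,
				  const struct toy_page *b)
{
	assert(a != NULL && b != NULL);
	return a->shadow_addr + TOY_PAGE_SIZE == b->shadow_addr;
}

/* Treating two pages as one range is only safe if both hold. */
static bool toy_safe_to_merge(const struct toy_page *a,
			      const struct toy_page *b)
{
	return toy_data_contiguous(a, b) && toy_shadow_contiguous(a, b);
}
```

The patch itself does not perform such a check; it takes the simpler
route of never merging when CONFIG_KMSAN is enabled.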

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-mm@kvack.org
---

v4:
 - use IS_ENABLED instead of #ifdef (as requested by Marco Elver)

Change-Id: Id2f2babaf662ac44675c4f2790f4a80ddc328fa7
---
 block/blk.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/blk.h b/block/blk.h
index 670337b7cfa0d..065dfee244118 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -79,6 +79,13 @@ static inline bool biovec_phys_mergeable(struct request_queue *q,
 	phys_addr_t addr1 = page_to_phys(vec1->bv_page) + vec1->bv_offset;
 	phys_addr_t addr2 = page_to_phys(vec2->bv_page) + vec2->bv_offset;
 
+	/*
+	 * Merging consecutive physical pages may not work correctly under KMSAN
+	 * if their metadata pages aren't contiguous. Just disable merging.
+	 */
+	if (IS_ENABLED(CONFIG_KMSAN))
+		return false;
+
 	if (addr1 + vec1->bv_len != addr2)
 		return false;
 	if (xen_domain() && !xen_biovec_phys_mergeable(vec1, vec2->bv_page))
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 36/38] x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (34 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 35/38] kmsan: disable physical page merging in biovec glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 37/38] kmsan: x86/uprobes: unpoison regs in arch_uprobe_exception_notify() glider
  2020-03-25 16:12 ` [PATCH v5 38/38] kmsan: block: skip bio block merging logic for KMSAN glider
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Arnd Bergmann, Michal Simek, Andrey Ryabinin, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov, Randy Dunlap,
	linux-mm
  Cc: glider, viro, adilger.kernel, akpm, luto, ard.biesheuvel, hch,
	hch, darrick.wong, davem, dmitry.torokhov, ebiggers, edumazet,
	ericvh, gregkh, harry.wentland, herbert, iii, mingo, jasowang,
	axboe, m.szyprowski, mark.rutland, martin.petersen, schwidefsky,
	willy, mst, mhocko, pmladek, cai, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

This is needed to allow memory tools like KASAN and KMSAN to see the
memory accesses from the checksum code. Without CONFIG_GENERIC_CSUM the
tools can't see memory accesses originating from handwritten assembly
code.
For KASAN it's a question of detecting more bugs; for KMSAN, using the C
implementation also helps avoid false positives originating from
seemingly uninitialized checksum values.
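
As an illustration of why a C implementation helps, here is a small
standalone one's-complement checksum in the style of RFC 1071. It is
not the kernel's generic checksum code, and the toy_* names are
hypothetical. The point is that every byte is read by an ordinary C
load, which compiler instrumentation can observe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of @len bytes, folded to 16 bits. */
static uint16_t toy_csum_fold_sum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	/* Add the data as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	/* An odd trailing byte is padded with a zero byte. */
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* The checksum is the complement of the folded sum. */
static uint16_t toy_csum(const uint8_t *buf, size_t len)
{
	return (uint16_t)~toy_csum_fold_sum(buf, len);
}
```

With handwritten assembly the same loads would bypass the
instrumentation entirely.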

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-mm@kvack.org
---

v2:
 - dropped the "default n" (as requested by Randy Dunlap)

v4:
 - changed "net:" to "x86:" in the patch name

Change-Id: I645e2c097253a8d5717ad87e2e2df6f6f67251f3
---
 arch/x86/Kconfig                |  4 ++++
 arch/x86/include/asm/checksum.h | 10 +++++++---
 arch/x86/lib/Makefile           |  2 ++
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 376c13480def2..c45c937682863 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -274,6 +274,10 @@ config GENERIC_ISA_DMA
 	def_bool y
 	depends on ISA_DMA_API
 
+config GENERIC_CSUM
+	bool
+	default y if KMSAN || KASAN
+
 config GENERIC_BUG
 	def_bool y
 	depends on BUG
diff --git a/arch/x86/include/asm/checksum.h b/arch/x86/include/asm/checksum.h
index d79d1e622dcf1..ab3464cbce26d 100644
--- a/arch/x86/include/asm/checksum.h
+++ b/arch/x86/include/asm/checksum.h
@@ -1,6 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifdef CONFIG_X86_32
-# include <asm/checksum_32.h>
+#ifdef CONFIG_GENERIC_CSUM
+# include <asm-generic/checksum.h>
 #else
-# include <asm/checksum_64.h>
+# ifdef CONFIG_X86_32
+#  include <asm/checksum_32.h>
+# else
+#  include <asm/checksum_64.h>
+# endif
 #endif
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 6110bce7237bd..40d6704c4767d 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -64,7 +64,9 @@ endif
         lib-$(CONFIG_X86_USE_3DNOW) += mmx_32.o
 else
         obj-y += iomap_copy_64.o
+ifneq ($(CONFIG_GENERIC_CSUM),y)
         lib-y += csum-partial_64.o csum-copy_64.o csum-wrappers_64.o
+endif
         lib-y += clear_page_64.o copy_page_64.o
         lib-y += memmove_64.o memset_64.o
         lib-y += copy_user_64.o
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 37/38] kmsan: x86/uprobes: unpoison regs in arch_uprobe_exception_notify()
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (35 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 36/38] x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN glider
@ 2020-03-25 16:12 ` glider
  2020-03-25 16:12 ` [PATCH v5 38/38] kmsan: block: skip bio block merging logic for KMSAN glider
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, aryabinin, luto, ard.biesheuvel,
	arnd, hch, hch, darrick.wong, davem, dmitry.torokhov, ebiggers,
	edumazet, ericvh, gregkh, harry.wentland, herbert, iii, mingo,
	jasowang, axboe, m.szyprowski, mark.rutland, martin.petersen,
	schwidefsky, willy, mst, mhocko, monstr, pmladek, cai, rdunlap,
	robin.murphy, sergey.senozhatsky, rostedt, tiwai, tytso, gor,
	wsa

arch_uprobe_exception_notify() may receive register state without
valid KMSAN metadata, which will lead to false positives.
Explicitly unpoison args and args->regs to avoid this.

Signed-off-by: Alexander Potapenko <glider@google.com>
To: Alexander Potapenko <glider@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org

---

This patch was split from "kmsan: disable instrumentation of certain
functions"

v4:
 - split this patch away

Change-Id: I466ef628b00362ab5eb1852c76baa8cdb06736d9
---
 arch/x86/kernel/uprobes.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 15e5aad8ac2c1..bc156b016dc57 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -8,6 +8,7 @@
  *	Jim Keniston
  */
 #include <linux/kernel.h>
+#include <linux/kmsan-checks.h>
 #include <linux/sched.h>
 #include <linux/ptrace.h>
 #include <linux/uprobes.h>
@@ -997,9 +998,13 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
 int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data)
 {
 	struct die_args *args = data;
-	struct pt_regs *regs = args->regs;
+	struct pt_regs *regs;
 	int ret = NOTIFY_DONE;
 
+	kmsan_unpoison_shadow(args, sizeof(*args));
+	regs = args->regs;
+	if (regs)
+		kmsan_unpoison_shadow(regs, sizeof(*regs));
 	/* We are only interested in userspace traps */
 	if (regs && !user_mode(regs))
 		return NOTIFY_DONE;
-- 
2.25.1.696.g5e7596f4ac-goog




* [PATCH v5 38/38] kmsan: block: skip bio block merging logic for KMSAN
  2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
                   ` (36 preceding siblings ...)
  2020-03-25 16:12 ` [PATCH v5 37/38] kmsan: x86/uprobes: unpoison regs in arch_uprobe_exception_notify() glider
@ 2020-03-25 16:12 ` glider
  37 siblings, 0 replies; 60+ messages in thread
From: glider @ 2020-03-25 16:12 UTC (permalink / raw)
  To: Eric Biggers, Jens Axboe, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, linux-mm
  Cc: glider, viro, adilger.kernel, akpm, aryabinin, luto,
	ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, edumazet, ericvh, gregkh, harry.wentland,
	herbert, iii, mingo, jasowang, m.szyprowski, mark.rutland,
	martin.petersen, schwidefsky, willy, mst, mhocko, monstr,
	pmladek, cai, rdunlap, robin.murphy, sergey.senozhatsky, rostedt,
	tiwai, tytso, tglx, gor, wsa

KMSAN doesn't allow treating adjacent memory pages as a single
contiguous range if they were allocated by different alloc_pages() calls.
The block layer, however, does so: adjacent pages end up being used
together. To prevent this, make page_is_mergeable() return false under
KMSAN.

Suggested-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-mm@kvack.org
---

Change-Id: Iff367f421d51fac549e31ed122365b7539642cff
---
 block/bio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 0985f34225561..09503ef00bc20 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -696,6 +696,8 @@ static inline bool page_is_mergeable(const struct bio_vec *bv,
 	*same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
 	if (!*same_page && pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page)
 		return false;
+	if (!*same_page && IS_ENABLED(CONFIG_KMSAN))
+		return false;
 	return true;
 }
 
-- 
2.25.1.696.g5e7596f4ac-goog




* Re: [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  2020-03-25 16:12 ` [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page() glider
@ 2020-03-25 16:19   ` Christoph Hellwig
  2020-03-27 17:03     ` Alexander Potapenko
  0 siblings, 1 reply; 60+ messages in thread
From: Christoph Hellwig @ 2020-03-25 16:19 UTC (permalink / raw)
  To: glider
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov, linux-mm, viro,
	adilger.kernel, akpm, aryabinin, luto, ard.biesheuvel, arnd, hch,
	darrick.wong, davem, dmitry.torokhov, ebiggers, edumazet, ericvh,
	gregkh, harry.wentland, herbert, iii, mingo, jasowang, axboe,
	mark.rutland, martin.petersen, schwidefsky, willy, mst, mhocko,
	monstr, pmladek, cai, rdunlap, sergey.senozhatsky, rostedt,
	tiwai, tytso, tglx, gor, wsa

On Wed, Mar 25, 2020 at 05:12:45PM +0100, glider@google.com wrote:
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index a8560052a915f..63dc1a594964a 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -367,6 +367,7 @@ dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
>  			     &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
>  		return DMA_MAPPING_ERROR;
>  	}
> +	kmsan_handle_dma(page_address(page) + offset, size, dir);

This needs to go into dma_map_page so that it also covers IOMMUs.
dma_map_sg_attrs will also need similar treatment.  Also, the page
doesn't have to be mapped into kernel address space, so you probably
want to pass the page to kmsan_handle_dma and throw in a highmem
check there.



* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 16:12 ` [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW glider
@ 2020-03-25 16:19   ` Michal Hocko
  2020-03-25 17:26     ` Alexander Potapenko
  0 siblings, 1 reply; 60+ messages in thread
From: Michal Hocko @ 2020-03-25 16:19 UTC (permalink / raw)
  To: glider
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, linux-mm, viro, adilger.kernel, aryabinin,
	luto, ard.biesheuvel, arnd, hch, hch, darrick.wong, davem,
	dmitry.torokhov, ebiggers, edumazet, ericvh, gregkh,
	harry.wentland, herbert, iii, mingo, jasowang, axboe,
	m.szyprowski, mark.rutland, martin.petersen, schwidefsky, willy,
	mst, monstr, pmladek, cai, rdunlap, robin.murphy,
	sergey.senozhatsky, rostedt, tiwai, tytso, tglx, gor, wsa

On Wed 25-03-20 17:12:14, glider@google.com wrote:
> This flag is to be used by KMSAN runtime to mark that newly created
> memory pages don't need KMSAN metadata backing them.

I really dislike the idea of a gfp flag. If you need some form of
exclusion for kmsan allocations, then follow the pattern of
memalloc_no{fs,io}_{save,restore}. History tells us that single-use-case
gfp flags are too tempting to abuse and to use incorrectly.

> Signed-off-by: Alexander Potapenko <glider@google.com>
> To: Alexander Potapenko <glider@google.com>
> Cc: Vegard Nossum <vegard.nossum@oracle.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: linux-mm@kvack.org
> 
> ---
> We can't decide what to do here:
>  - do we need to conditionally define ___GFP_NO_KMSAN_SHADOW depending on
>    CONFIG_KMSAN like LOCKDEP does?
>  - if KMSAN is defined, and LOCKDEP is not, do we want to "compactify" the GFP
>    bits?
> 
> Change-Id: If5d0352fd5711ad103328e2c185eb885e826423a
> ---
>  include/linux/gfp.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index be2754841369e..e1ab42b5e9ce2 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -44,6 +44,7 @@ struct vm_area_struct;
>  #else
>  #define ___GFP_NOLOCKDEP	0
>  #endif
> +#define ___GFP_NO_KMSAN_SHADOW  0x1000000u
>  /* If the above are modified, __GFP_BITS_SHIFT may need updating */
>  
>  /*
> @@ -212,12 +213,13 @@ struct vm_area_struct;
>  #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)
>  #define __GFP_COMP	((__force gfp_t)___GFP_COMP)
>  #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)
> +#define __GFP_NO_KMSAN_SHADOW  ((__force gfp_t)___GFP_NO_KMSAN_SHADOW)
>  
>  /* Disable lockdep for GFP context tracking */
>  #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>  
>  /* Room for N __GFP_FOO bits */
> -#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
> +#define __GFP_BITS_SHIFT (25)
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>  
>  /**
> -- 
> 2.25.1.696.g5e7596f4ac-goog
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 16:19   ` Michal Hocko
@ 2020-03-25 17:26     ` Alexander Potapenko
  2020-03-25 17:40       ` Alexander Potapenko
  2020-03-25 17:43       ` Michal Hocko
  0 siblings, 2 replies; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-25 17:26 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, Linux Memory Management List, Al Viro,
	Andreas Dilger, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > This flag is to be used by KMSAN runtime to mark that newly created
> > memory pages don't need KMSAN metadata backing them.
>
> I really dislike an idea of the gfp flag. If you need some form of
> exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> History tells us that single usecase gfp flags are too tempting to abuse
> and using incorrectly.

Great idea, will do!
I guess PF_ flags aren't a scarce resource?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 17:26     ` Alexander Potapenko
@ 2020-03-25 17:40       ` Alexander Potapenko
  2020-03-25 17:49         ` Matthew Wilcox
  2020-03-25 18:38         ` Michal Hocko
  2020-03-25 17:43       ` Michal Hocko
  1 sibling, 2 replies; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-25 17:40 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, Linux Memory Management List, Al Viro,
	Andreas Dilger, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 6:26 PM Alexander Potapenko <glider@google.com> wrote:
>
> On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > > This flag is to be used by KMSAN runtime to mark that newly created
> > > memory pages don't need KMSAN metadata backing them.
> >
> > I really dislike an idea of the gfp flag. If you need some form of
> > exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> > History tells us that single usecase gfp flags are too tempting to abuse
> > and using incorrectly.
>
> Great idea, will do!
> Guess PF_ flags isn't a scarce resource?

Actually, no, we are already out of bits in current->flags.
I could introduce a separate flag in struct task_struct, but that won't
work in interrupt contexts. How do you solve that problem for FS/IO
allocations?

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 17:26     ` Alexander Potapenko
  2020-03-25 17:40       ` Alexander Potapenko
@ 2020-03-25 17:43       ` Michal Hocko
  1 sibling, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2020-03-25 17:43 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, Linux Memory Management List, Al Viro,
	Andreas Dilger, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed 25-03-20 18:26:34, Alexander Potapenko wrote:
> On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > > This flag is to be used by KMSAN runtime to mark that newly created
> > > memory pages don't need KMSAN metadata backing them.
> >
> > I really dislike an idea of the gfp flag. If you need some form of
> > exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> > History tells us that single usecase gfp flags are too tempting to abuse
> > and using incorrectly.
> 
> Great idea, will do!
> Guess PF_ flags isn't a scarce resource?

task_struct is a monster data structure and there are surely holes you
can fit your flag in. All you need is to just store it somewhere and
then check for it wherever you hook your infrastructure into the page
or slab allocator. The primary thing is to avoid a gfp flag.

Thanks!

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 17:40       ` Alexander Potapenko
@ 2020-03-25 17:49         ` Matthew Wilcox
  2020-03-25 18:03           ` Alexander Potapenko
  2020-03-25 18:40           ` Michal Hocko
  2020-03-25 18:38         ` Michal Hocko
  1 sibling, 2 replies; 60+ messages in thread
From: Matthew Wilcox @ 2020-03-25 17:49 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Michal Hocko, Vegard Nossum, Andrew Morton, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrey Ryabinin, Andy Lutomirski,
	Ard Biesheuvel, Arnd Bergmann, Christoph Hellwig,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski,
	Mark Rutland, Martin K . Petersen, Martin Schwidefsky,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 06:40:29PM +0100, Alexander Potapenko wrote:
> On Wed, Mar 25, 2020 at 6:26 PM Alexander Potapenko <glider@google.com> wrote:
> >
> > On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > > > This flag is to be used by KMSAN runtime to mark that newly created
> > > > memory pages don't need KMSAN metadata backing them.
> > >
> > > I really dislike an idea of the gfp flag. If you need some form of
> > > exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> > > History tells us that single usecase gfp flags are too tempting to abuse
> > > and using incorrectly.
> >
> > Great idea, will do!
> > Guess PF_ flags isn't a scarce resource?
> 
> Actually, no, we are out of bits in current->flags already.
> I could introduce a separate flag into struct task, but that won't
> work in interrupt contexts - how do you solve that problem for FS/IO
> allocations?

I would suggest using bits in the section labelled:

        /* Unserialized, strictly 'current' */

since this doesn't need to be accessed from any other task.  Michal,
can we move PF_MEMALLOC_NOIO and NOFS to that area too?


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 17:49         ` Matthew Wilcox
@ 2020-03-25 18:03           ` Alexander Potapenko
  2020-03-25 18:09             ` Matthew Wilcox
  2020-03-25 18:40           ` Michal Hocko
  1 sibling, 1 reply; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-25 18:03 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Michal Hocko, Vegard Nossum, Andrew Morton, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrey Ryabinin, Andy Lutomirski,
	Ard Biesheuvel, Arnd Bergmann, Christoph Hellwig,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski,
	Mark Rutland, Martin K . Petersen, Martin Schwidefsky,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

> I would suggest using bits in the section labelled:
>
>         /* Unserialized, strictly 'current' */

The main problem is that |current| is unavailable in the interrupt
context, so we'll also need to:
 - disable interrupts when preparing for a KMSAN internal memory
allocation - sounds costly, huh?
 - store the context flag in a per-cpu variable in the case |current|
is unavailable.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 18:03           ` Alexander Potapenko
@ 2020-03-25 18:09             ` Matthew Wilcox
  2020-03-25 18:30               ` Alexander Potapenko
  0 siblings, 1 reply; 60+ messages in thread
From: Matthew Wilcox @ 2020-03-25 18:09 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Michal Hocko, Vegard Nossum, Andrew Morton, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrey Ryabinin, Andy Lutomirski,
	Ard Biesheuvel, Arnd Bergmann, Christoph Hellwig,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski,
	Mark Rutland, Martin K . Petersen, Martin Schwidefsky,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 07:03:32PM +0100, Alexander Potapenko wrote:
> > I would suggest using bits in the section labelled:
> >
> >         /* Unserialized, strictly 'current' */
> 
> The main problem is that |current| is unavailable in the interrupt
> context, so we'll also need to:
>  - disable interrupts when preparing for a KMSAN internal memory
> allocation - sounds costly, huh?
>  - store the context flag in a per-cpu variable in the case |current|
> is unavailable.

It's not /unavailable/ ... it's whatever task happens to be running at
the time the interrupt is triggered.  You can borrow its task_struct.
You'll have to save off the current value of the flag before setting it,
just like memalloc_nofs_save() does.
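
The save/restore discipline described here can be sketched in plain
userspace C. This is only a model under stated assumptions: struct
task_model, current_model and the kmsan_no_shadow_* names are
hypothetical stand-ins, not part of the patchset or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of task_struct. */
struct task_model {
	bool kmsan_no_shadow;	/* hypothetical per-task flag */
};

static struct task_model init_task_model;
/* Models 'current': the task running on this CPU. */
static struct task_model *current_model = &init_task_model;

/* Set the flag and hand back its previous value so scopes can nest. */
static inline bool kmsan_no_shadow_save(void)
{
	bool old = current_model->kmsan_no_shadow;

	current_model->kmsan_no_shadow = true;
	return old;
}

/* Put back whatever the matching save call observed. */
static inline void kmsan_no_shadow_restore(bool old)
{
	current_model->kmsan_no_shadow = old;
}

/* Usage: nested scopes leave the flag exactly as they found it. */
static bool kmsan_no_shadow_demo(void)
{
	bool outer = kmsan_no_shadow_save();
	bool inner = kmsan_no_shadow_save();

	kmsan_no_shadow_restore(inner);
	/* Still inside the outer scope, so the flag must remain set. */
	assert(current_model->kmsan_no_shadow);
	kmsan_no_shadow_restore(outer);
	return current_model->kmsan_no_shadow;
}
```

Returning the old value rather than unconditionally clearing the flag on
exit is what lets an inner scope end without cancelling an outer one.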

But this does rather call into question whether Michal's advice to use
task_struct is good advice to begin with.  For memalloc_nofs/noio,
it works well this way because allocations in interrupt context are
inherently at a more restrictive context than task level.  It's not clear
to me what this kmsan GFP flag is being used for, and whether allocations
that happen in interrupt context should inherit the kmsan setting.
I will have to read these patches more carefully to determine that;
I was really just responding to the "where can I find some free bits"
part of the question.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 18:09             ` Matthew Wilcox
@ 2020-03-25 18:30               ` Alexander Potapenko
  2020-03-25 18:43                 ` Michal Hocko
  0 siblings, 1 reply; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-25 18:30 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Michal Hocko, Vegard Nossum, Andrew Morton, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrey Ryabinin, Andy Lutomirski,
	Ard Biesheuvel, Arnd Bergmann, Christoph Hellwig,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski,
	Mark Rutland, Martin K . Petersen, Martin Schwidefsky,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 7:09 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Mar 25, 2020 at 07:03:32PM +0100, Alexander Potapenko wrote:
> > > I would suggest using bits in the section labelled:
> > >
> > >         /* Unserialized, strictly 'current' */
> >
> > The main problem is that |current| is unavailable in the interrupt
> > context, so we'll also need to:
> >  - disable interrupts when preparing for a KMSAN internal memory
> > allocation - sounds costly, huh?
> >  - store the context flag in a per-cpu variable in the case |current|
> > is unavailable.
>
> It's not /unavailable/ ... it's whatever task happens to be running at
> the time the interrupt is triggered.  You can borrow its task_struct.

That didn't come to my mind, interesting!

> You'll have to save off the current value of the flag before setting it,
> just like memalloc_nofs_save() does.
> But this does rather call into question whether Michal's advice to use
> task_struct is good advice to begin with.  For memalloc_nofs/noio,
> it works well this way because allocations in interrupt context are
> inherently at a more restrictive context than task level.  It's not clear
> to me what this kmsan GFP flag is being used for, and whether allocations
> that happen in interrupt context should inherit the kmsan setting.

The idea behind this flag is as follows.
KMSAN must allocate metadata pages for every allocation performed by
alloc_pages().
Metadata allocations are also done via alloc_pages(), and for those no
further metadata needs to be allocated.
Thus the GFP flag is used to prevent the recursion in alloc_pages().
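
A rough userspace model of that recursion guard, for illustration only:
model_alloc_page(), struct model_page and MODEL_GFP_NO_KMSAN_SHADOW are
hypothetical stand-ins for alloc_pages(), struct page and the proposed
flag, and calloc() stands in for the real page allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_GFP_NO_KMSAN_SHADOW 0x1u	/* stand-in for the proposed flag */

static unsigned int model_alloc_calls;	/* counts every allocator entry */

struct model_page {
	struct model_page *shadow;	/* metadata page, NULL for metadata */
	struct model_page *origin;	/* metadata page, NULL for metadata */
};

static struct model_page *model_alloc_page(unsigned int flags)
{
	struct model_page *page = calloc(1, sizeof(*page));

	model_alloc_calls++;
	if (!page)
		return NULL;
	/* Metadata pages get no metadata of their own: recursion stops. */
	if (flags & MODEL_GFP_NO_KMSAN_SHADOW)
		return page;

	page->shadow = model_alloc_page(flags | MODEL_GFP_NO_KMSAN_SHADOW);
	page->origin = model_alloc_page(flags | MODEL_GFP_NO_KMSAN_SHADOW);
	if (!page->shadow || !page->origin) {
		free(page->shadow);
		free(page->origin);
		free(page);
		return NULL;
	}
	return page;
}

static void model_free_page(struct model_page *page)
{
	if (!page)
		return;
	free(page->shadow);
	free(page->origin);
	free(page);
}

/* Usage: one ordinary allocation enters the allocator three times. */
static unsigned int model_alloc_demo(void)
{
	struct model_page *page;

	model_alloc_calls = 0;
	page = model_alloc_page(0);
	assert(page && !page->shadow->shadow && !page->origin->origin);
	model_free_page(page);
	return model_alloc_calls;
}
```

Without the flag check, each metadata allocation would request metadata
for itself and the call chain would never terminate.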

It is theoretically possible that a less restrictive allocation
(without __GFP_NO_KMSAN_SHADOW) happens in an interrupt on top of a
task performing a more restrictive allocation (with
__GFP_NO_KMSAN_SHADOW).

> I will have to read these patches more carefully to determine that;
> I was really just responding to the "where can I find some free bits"
> part of the question.

Thanks for the clarification.

-- 
Alexander Potapenko
Software Engineer



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 17:40       ` Alexander Potapenko
  2020-03-25 17:49         ` Matthew Wilcox
@ 2020-03-25 18:38         ` Michal Hocko
  2020-03-27 12:20           ` Alexander Potapenko
  1 sibling, 1 reply; 60+ messages in thread
From: Michal Hocko @ 2020-03-25 18:38 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, Linux Memory Management List, Al Viro,
	Andreas Dilger, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed 25-03-20 18:40:29, Alexander Potapenko wrote:
> On Wed, Mar 25, 2020 at 6:26 PM Alexander Potapenko <glider@google.com> wrote:
> >
> > On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > > > This flag is to be used by KMSAN runtime to mark that newly created
> > > > memory pages don't need KMSAN metadata backing them.
> > >
> > > I really dislike an idea of the gfp flag. If you need some form of
> > > exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> > > History tells us that single usecase gfp flags are too tempting to abuse
> > > and using incorrectly.
> >
> > Great idea, will do!
> > Guess PF_ flags isn't a scarce resource?
> 
> Actually, no, we are out of bits in current->flags already.
> I could introduce a separate flag into struct task, but that won't
> work in interrupt contexts - how do you solve that problem for FS/IO
> allocations?

NOFS/NOIO is not a problem for IRQ context because we never do reclaim
from that context.

I was also not aware that there are users in IRQ context. I
thought this would be internal KMSAN stuff. Which IRQ context
would you call this from?

Anyway, if you cannot go with the task_struct then a per-cpu value
should work, right?

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 17:49         ` Matthew Wilcox
  2020-03-25 18:03           ` Alexander Potapenko
@ 2020-03-25 18:40           ` Michal Hocko
  1 sibling, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2020-03-25 18:40 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alexander Potapenko, Vegard Nossum, Andrew Morton, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrey Ryabinin, Andy Lutomirski,
	Ard Biesheuvel, Arnd Bergmann, Christoph Hellwig,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski,
	Mark Rutland, Martin K . Petersen, Martin Schwidefsky,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed 25-03-20 10:49:16, Matthew Wilcox wrote:
> On Wed, Mar 25, 2020 at 06:40:29PM +0100, Alexander Potapenko wrote:
> > On Wed, Mar 25, 2020 at 6:26 PM Alexander Potapenko <glider@google.com> wrote:
> > >
> > > On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
> > > >
> > > > On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > > > > This flag is to be used by KMSAN runtime to mark that newly created
> > > > > memory pages don't need KMSAN metadata backing them.
> > > >
> > > > I really dislike an idea of the gfp flag. If you need some form of
> > > > exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> > > > History tells us that single usecase gfp flags are too tempting to abuse
> > > > and using incorrectly.
> > >
> > > Great idea, will do!
> > > Guess PF_ flags isn't a scarce resource?
> > 
> > Actually, no, we are out of bits in current->flags already.
> > I could introduce a separate flag into struct task, but that won't
> > work in interrupt contexts - how do you solve that problem for FS/IO
> > allocations?
> 
> I would suggest using bits in the section labelled:
> 
>         /* Unserialized, strictly 'current' */
> 
> since this doesn't need to be accessed from any other task.  Michal,
> can we move PF_MEMALLOC_NOIO and NOFS to that area too?

I wouldn't object. The only reason for using PF_$FOO is historical. It
was XFS which started to use a PF flag for NOFS, AFAIR. I haven't checked
recently, but XFS used to use its own API for this.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 18:30               ` Alexander Potapenko
@ 2020-03-25 18:43                 ` Michal Hocko
  0 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2020-03-25 18:43 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Matthew Wilcox, Vegard Nossum, Andrew Morton, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrey Ryabinin, Andy Lutomirski,
	Ard Biesheuvel, Arnd Bergmann, Christoph Hellwig,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Marek Szyprowski,
	Mark Rutland, Martin K . Petersen, Martin Schwidefsky,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed 25-03-20 19:30:09, Alexander Potapenko wrote:
> On Wed, Mar 25, 2020 at 7:09 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Wed, Mar 25, 2020 at 07:03:32PM +0100, Alexander Potapenko wrote:
> > > > I would suggest using bits in the section labelled:
> > > >
> > > >         /* Unserialized, strictly 'current' */
> > >
> > > The main problem is that |current| is unavailable in the interrupt
> > > context, so we'll also need to:
> > >  - disable interrupts when preparing for a KMSAN internal memory
> > > allocation - sounds costly, huh?
> > >  - store the context flag in a per-cpu variable in the case |current|
> > > is unavailable.
> >
> > It's not /unavailable/ ... it's whatever task happens to be running at
> > the time the interrupt is triggered.  You can borrow its task_struct.
> 
> That didn't come to my mind, interesting!
> 
> > You'll have to save off the current value of the flag before setting it,
> > just like memalloc_nofs_save() does.
> > But this does rather call into question whether Michal's advice to use
> > task_struct is good advice to begin with.  For memalloc_nofs/noio,
> > it works well this way because allocations in interrupt context are
> > inherently at a more restrictive context than task level.  It's not clear
> > to me what this kmsan GFP flag is being used for, and whether allocations
> > that happen in interrupt context should inherit the kmsan setting.
> 
> The idea behind this flag is as follows.
> KMSAN must allocate metadata pages for every allocation performed by
> alloc_pages().
> Metadata allocations are also done via alloc_pages(), and for those no
> further metadata needs to be allocated.
> Thus the GFP flag is used to prevent the recursion in alloc_pages().

Kmemleak used the same approach IIRC. It turned out to be unusable in
the presence of any heavier GFP_NOWAIT memory pressure. Anyway, talk to
Catalin Marinas about the problems he had to work through to address that.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW
  2020-03-25 18:38         ` Michal Hocko
@ 2020-03-27 12:20           ` Alexander Potapenko
  0 siblings, 0 replies; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-27 12:20 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Andrey Konovalov, Linux Memory Management List, Al Viro,
	Andreas Dilger, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Simek, Petr Mladek, Qian Cai,
	Randy Dunlap, Robin Murphy, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 7:38 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Wed 25-03-20 18:40:29, Alexander Potapenko wrote:
> > On Wed, Mar 25, 2020 at 6:26 PM Alexander Potapenko <glider@google.com> wrote:
> > >
> > > On Wed, Mar 25, 2020 at 5:19 PM Michal Hocko <mhocko@kernel.org> wrote:
> > > >
> > > > On Wed 25-03-20 17:12:14, glider@google.com wrote:
> > > > > This flag is to be used by KMSAN runtime to mark that newly created
> > > > > memory pages don't need KMSAN metadata backing them.
> > > >
> > > > I really dislike an idea of the gfp flag. If you need some form of
> > > > exclusion for kmsan allocations then follow the pattern of memalloc_no{fs,io}_{save,restore}
> > > > History tells us that single usecase gfp flags are too tempting to abuse
> > > > and using incorrectly.
> > >
> > > Great idea, will do!
> > > Guess PF_ flags isn't a scarce resource?
> >
> > Actually, no, we are out of bits in current->flags already.
> > I could introduce a separate flag into struct task, but that won't
> > work in interrupt contexts - how do you solve that problem for FS/IO
> > allocations?
>
> NOFS/NOIO is not a problem for IRQ context because we never do reclaim
> from that context.
>
> I was also not aware that there are users from the IRQ context. I
> thought this would be an internal KMSAN stuff. What would be the IRQ
> context you call this from?

KMSAN allocates its metadata lazily*, so if any code does
alloc_pages() from IRQ context, we need to call alloc_pages two more
times for shadow/origin pages.
We also unwind the stack on every creation/copy of an uninitialized
value. Those stacks are stored in the stack depot, which may also
allocate new pages to store stacks.

> Anyway, if you cannot go with the task_struct then a per-cpu value
> should work, right?

Yes, I will try that.
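
A minimal userspace model of such a per-cpu value, for illustration
only. The model_* names are hypothetical, a plain array indexed by CPU
number stands in for a real per-cpu variable, and the sketch assumes
the caller stays on the same CPU between enter and leave. How an
interrupt arriving inside that window should be treated is left open
here, as it is in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* arbitrary size for the model */

/* One nesting counter per CPU; nonzero means "inside the KMSAN runtime". */
static int model_kmsan_depth[MODEL_NR_CPUS];

/* Mark that this CPU is about to perform an internal allocation. */
static void model_kmsan_enter(int cpu)
{
	model_kmsan_depth[cpu]++;
}

/* Undo one matching model_kmsan_enter() call. */
static void model_kmsan_leave(int cpu)
{
	assert(model_kmsan_depth[cpu] > 0);
	model_kmsan_depth[cpu]--;
}

/* The allocator hook would consult this instead of a gfp flag. */
static bool model_needs_metadata(int cpu)
{
	return model_kmsan_depth[cpu] == 0;
}
```

A counter is used rather than a boolean so that nested enter/leave pairs
on the same CPU unwind correctly.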

* - as mentioned in the cover letter, a lot of problems could be
solved if we pre-allocated the metadata pages at boot time, so that for
every two contiguous physical pages their metadata pages are also
contiguous.
I haven't come up with a good idea for how to implement that.
Suggestions are very welcome.
--
Alexander Potapenko
Software Engineer



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  2020-03-25 16:19   ` Christoph Hellwig
@ 2020-03-27 17:03     ` Alexander Potapenko
  2020-03-27 17:06       ` Christoph Hellwig
  0 siblings, 1 reply; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-27 17:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrew Morton, Andrey Ryabinin,
	Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Wed, Mar 25, 2020 at 5:19 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Mar 25, 2020 at 05:12:45PM +0100, glider@google.com wrote:
> > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > index a8560052a915f..63dc1a594964a 100644
> > --- a/kernel/dma/direct.c
> > +++ b/kernel/dma/direct.c
> > @@ -367,6 +367,7 @@ dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
> >                            &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
> >               return DMA_MAPPING_ERROR;
> >       }
> > +     kmsan_handle_dma(page_address(page) + offset, size, dir);
>
> This needs to go into dma_map_page so that it also covers IOMMUs.
> dma_map_sg_attrs will also need similar treatment.

Thanks, will be done in v6!

> Also the page
> doesn't have to be mapped into kernel address space, you probably
> want to pass the page to kmsan_handle_dma and throw in a highmem
> check there.

Do you mean comparing the address to TASK_SIZE, or is there a more
portable way to check that?

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  2020-03-27 17:03     ` Alexander Potapenko
@ 2020-03-27 17:06       ` Christoph Hellwig
  2020-03-27 18:46         ` Alexander Potapenko
  0 siblings, 1 reply; 60+ messages in thread
From: Christoph Hellwig @ 2020-03-27 17:06 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	Linux Memory Management List, Al Viro, Andreas Dilger,
	Andrew Morton, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Fri, Mar 27, 2020 at 06:03:32PM +0100, Alexander Potapenko wrote:
> > Also the page
> > doesn't have to be mapped into kernel address space, you probably
> > want to pass the page to kmsan_handle_dma and throw in a highmem
> > check there.
> 
> Do you mean comparing the address to TASK_SIZE, or is there a more
> portable way to check that?

!PageHighMem(page) implies the page has a kernel direct mapping.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  2020-03-27 17:06       ` Christoph Hellwig
@ 2020-03-27 18:46         ` Alexander Potapenko
  2020-03-28  8:52           ` Christoph Hellwig
  0 siblings, 1 reply; 60+ messages in thread
From: Alexander Potapenko @ 2020-03-27 18:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, Vegard Nossum, Dmitry Vyukov,
	Marco Elver, Andrey Konovalov, Linux Memory Management List,
	Al Viro, Andreas Dilger, Andrew Morton, Andrey Ryabinin,
	Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Fri, Mar 27, 2020 at 6:06 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Fri, Mar 27, 2020 at 06:03:32PM +0100, Alexander Potapenko wrote:
> > > Also the page
> > > doesn't have to be mapped into kernel address space, you probably
> > > want to pass the page to kmsan_handle_dma and throw in a highmem
> > > check there.
> >
> > Do you mean comparing the address to TASK_SIZE, or is there a more
> > portable way to check that?
>
> !PageHighMem(page) implies the page has a kernel direct mapping.

I tried adding this check and started seeing false positives because
the virtio_ring driver actually uses highmem pages for DMA, and data
from those pages is later copied to the kernel.
Guess it's easier to just allow handling highmem pages? What problems
do you anticipate?



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page()
  2020-03-27 18:46         ` Alexander Potapenko
@ 2020-03-28  8:52           ` Christoph Hellwig
  0 siblings, 0 replies; 60+ messages in thread
From: Christoph Hellwig @ 2020-03-28  8:52 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Vegard Nossum,
	Dmitry Vyukov, Marco Elver, Andrey Konovalov,
	Linux Memory Management List, Al Viro, Andreas Dilger,
	Andrew Morton, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Darrick J. Wong, David Miller,
	Dmitry Torokhov, Eric Biggers, Eric Dumazet, Eric Van Hensbergen,
	Greg Kroah-Hartman, Harry Wentland, Herbert Xu, Ilya Leoshkevich,
	Ingo Molnar, Jason Wang, Jens Axboe, Mark Rutland,
	Martin K . Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S. Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Sergey Senozhatsky, Steven Rostedt,
	Takashi Iwai, Theodore Ts'o, Thomas Gleixner, Vasily Gorbik,
	Wolfram Sang

On Fri, Mar 27, 2020 at 07:46:08PM +0100, Alexander Potapenko wrote:
> > > Do you mean comparing the address to TASK_SIZE, or is there a more
> > > portable way to check that?
> >
> > !PageHighMem(page) implies the page has a kernel direct mapping.
> 
> I tried adding this check and started seeing false positives because
> the virtio_ring driver actually uses highmem pages for DMA, and data
> from those pages is later copied to the kernel.
> Guess it's easier to just allow handling highmem pages? What problems
> do you anticipate?

For PageHighMem(page), page_address(page) is not actually valid, so I'm
not sure how your code in this patch even worked at all.  Note that
all drivers (well, except for a few buggy legacy ones with workarounds)
can DMA from/to highmem.
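
A minimal user-space sketch of the interface shape suggested above,
assuming a hypothetical mock_page type and mock_kmsan_handle_dma()
helper (neither is kernel API, and the single-page limit is a
simplification). The hook receives the page rather than an address and
checks the highmem flag before anything would call page_address().
Skipping highmem pages is only one possible policy; as noted earlier in
this thread, it produced false positives with virtio_ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct mock_page {
	bool highmem;                   /* models PageHighMem(page) */
	uint8_t shadow[MOCK_PAGE_SIZE]; /* models the page's shadow bytes */
};

/*
 * Models a DMA hook that takes the page instead of a kernel address.
 * Returns true if the range was unpoisoned, false if it was skipped.
 */
bool mock_kmsan_handle_dma(struct mock_page *page, size_t offset, size_t size)
{
	assert(page != NULL);

	/* Reject ranges that do not fit into this single mock page. */
	if (offset > MOCK_PAGE_SIZE || size > MOCK_PAGE_SIZE - offset)
		return false;

	/* No direct mapping: a page_address() result must not be used. */
	if (page->highmem)
		return false;

	/* Unpoison: a zero shadow byte means "initialized". */
	memset(page->shadow + offset, 0, size);
	return true;
}
```

With this shape the caller never computes page_address(page) + offset
itself, so the decision about unmapped pages lives in one place.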


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t
  2020-03-25 16:12 ` [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t glider
@ 2020-03-30 13:36   ` Andrey Konovalov
  0 siblings, 0 replies; 60+ messages in thread
From: Andrey Konovalov @ 2020-03-30 13:36 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Linux Memory Management List, Alexander Viro, Andreas Dilger,
	Andrew Morton, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David S. Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K. Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S . Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Robin Murphy, Sergey Senozhatsky,
	Steven Rostedt, Takashi Iwai, Theodore Ts'o, Thomas Gleixner,
	Vasily Gorbik, Wolfram Sang

On Wed, Mar 25, 2020 at 5:12 PM <glider@google.com> wrote:
>
> Some users (currently only KMSAN) may want to use spare bits in
> depot_stack_handle_t. Let them do so and provide get_dsh_extra_bits()
> and set_dsh_extra_bits() to access those bits.
>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> To: Alexander Potapenko <glider@google.com>
> Cc: Vegard Nossum <vegard.nossum@oracle.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: linux-mm@kvack.org
> ---
>
> Change-Id: I23580dbde85908eeda0bdd8f83a8c3882ab3e012
> ---
>  include/linux/stackdepot.h |  8 ++++++++
>  lib/stackdepot.c           | 24 +++++++++++++++++++++++-
>  2 files changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
> index 24d49c732341a..ac1b5a78d7f65 100644
> --- a/include/linux/stackdepot.h
> +++ b/include/linux/stackdepot.h
> @@ -12,6 +12,11 @@
>  #define _LINUX_STACKDEPOT_H
>
>  typedef u32 depot_stack_handle_t;
> +/*
> + * Number of bits in the handle that stack depot doesn't use. Users may store
> + * information in them.
> + */

Could it be that stack depot starts using those bits at some point in
the future and then external users will get broken? If not, maybe it
makes sense to change the language here that stack depot explicitly
dedicates these 5 bits to external use. Otherwise this looks a bit
confusing IMO.

> +#define STACK_DEPOT_EXTRA_BITS 5
>
>  depot_stack_handle_t stack_depot_save(unsigned long *entries,
>                                       unsigned int nr_entries, gfp_t gfp_flags);
> @@ -20,5 +25,8 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
>                                unsigned long **entries);
>
>  unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries);
> +depot_stack_handle_t set_dsh_extra_bits(depot_stack_handle_t handle,
> +                                       unsigned int bits);
> +unsigned int get_dsh_extra_bits(depot_stack_handle_t handle);
>
>  #endif
> diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> index 2caffc64e4c82..195ce3dc7c37e 100644
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -40,8 +40,10 @@
>  #define STACK_ALLOC_ALIGN 4
>  #define STACK_ALLOC_OFFSET_BITS (STACK_ALLOC_ORDER + PAGE_SHIFT - \
>                                         STACK_ALLOC_ALIGN)
> +
>  #define STACK_ALLOC_INDEX_BITS (DEPOT_STACK_BITS - \
> -               STACK_ALLOC_NULL_PROTECTION_BITS - STACK_ALLOC_OFFSET_BITS)
> +               STACK_ALLOC_NULL_PROTECTION_BITS - \
> +               STACK_ALLOC_OFFSET_BITS - STACK_DEPOT_EXTRA_BITS)
>  #define STACK_ALLOC_SLABS_CAP 8192
>  #define STACK_ALLOC_MAX_SLABS \
>         (((1LL << (STACK_ALLOC_INDEX_BITS)) < STACK_ALLOC_SLABS_CAP) ? \
> @@ -54,6 +56,7 @@ union handle_parts {
>                 u32 slabindex : STACK_ALLOC_INDEX_BITS;
>                 u32 offset : STACK_ALLOC_OFFSET_BITS;
>                 u32 valid : STACK_ALLOC_NULL_PROTECTION_BITS;
> +               u32 extra : STACK_DEPOT_EXTRA_BITS;
>         };
>  };
>
> @@ -72,6 +75,24 @@ static int next_slab_inited;
>  static size_t depot_offset;
>  static DEFINE_SPINLOCK(depot_lock);
>
> +depot_stack_handle_t set_dsh_extra_bits(depot_stack_handle_t handle,
> +                                       u32 bits)
> +{
> +       union handle_parts parts = { .handle = handle };
> +
> +       parts.extra = bits & ((1U << STACK_DEPOT_EXTRA_BITS) - 1);
> +       return parts.handle;
> +}
> +EXPORT_SYMBOL_GPL(set_dsh_extra_bits);
> +
> +u32 get_dsh_extra_bits(depot_stack_handle_t handle)
> +{
> +       union handle_parts parts = { .handle = handle };
> +
> +       return parts.extra;
> +}
> +EXPORT_SYMBOL_GPL(get_dsh_extra_bits);
> +
>  static bool init_stack_slab(void **prealloc)
>  {
>         if (!*prealloc)
> @@ -136,6 +157,7 @@ static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
>         stack->handle.slabindex = depot_index;
>         stack->handle.offset = depot_offset >> STACK_ALLOC_ALIGN;
>         stack->handle.valid = 1;
> +       stack->handle.extra = 0;
>         memcpy(stack->entries, entries, size * sizeof(unsigned long));
>         depot_offset += required_size;
>
> --
> 2.25.1.696.g5e7596f4ac-goog
>
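
For readers following the extra-bits discussion, here is a user-space
sketch of the same idea using explicit masks instead of the bitfield
union from the patch. The names handle_t, set_extra_bits(),
get_extra_bits() and strip_extra_bits() are hypothetical, and placing
the five bits at the top of the 32-bit handle is an assumption (the
bitfield layout in the patch is up to the compiler).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t handle_t; /* stands in for depot_stack_handle_t */

#define EXTRA_BITS  5u                  /* mirrors STACK_DEPOT_EXTRA_BITS */
#define EXTRA_SHIFT (32u - EXTRA_BITS)  /* assumption: bits live on top */
#define EXTRA_MASK  (((1u << EXTRA_BITS) - 1u) << EXTRA_SHIFT)

static_assert(EXTRA_BITS < 32u, "extra bits must leave room for the handle");

/* Replace the extra field with the low EXTRA_BITS bits of 'bits'. */
handle_t set_extra_bits(handle_t handle, uint32_t bits)
{
	return (handle & ~EXTRA_MASK) | ((bits << EXTRA_SHIFT) & EXTRA_MASK);
}

/* Read the extra field back as a small integer. */
uint32_t get_extra_bits(handle_t handle)
{
	return (handle & EXTRA_MASK) >> EXTRA_SHIFT;
}

/* Recover the handle as the depot itself would have produced it. */
handle_t strip_extra_bits(handle_t handle)
{
	return handle & ~EXTRA_MASK;
}
```

Because the depot always stores the field as zero, an external user can
tag a handle and later strip the tag without the depot noticing.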


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 04/38] kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__
  2020-03-25 16:12 ` [PATCH v5 04/38] kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__ glider
@ 2020-03-30 13:37   ` Andrey Konovalov
  0 siblings, 0 replies; 60+ messages in thread
From: Andrey Konovalov @ 2020-03-30 13:37 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Linux Memory Management List, Alexander Viro, Andreas Dilger,
	Andrew Morton, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David S. Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K. Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S . Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Robin Murphy, Sergey Senozhatsky,
	Steven Rostedt, Takashi Iwai, Theodore Ts'o, Thomas Gleixner,
	Vasily Gorbik, Wolfram Sang

On Wed, Mar 25, 2020 at 5:13 PM <glider@google.com> wrote:
>
> __no_sanitize_memory is a function attribute that makes KMSAN
> ignore the uninitialized values coming from the function's
> inputs, and initialize the function's outputs.
>
> Functions marked with this attribute can't be inlined into functions
> not marked with it, and vice versa.
>
> __SANITIZE_MEMORY__ is a macro that's defined iff the file is
> instrumented with KMSAN. This is not the same as CONFIG_KMSAN, which is
> defined for every file.
>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> To: Alexander Potapenko <glider@google.com>
> Cc: Vegard Nossum <vegard.nossum@oracle.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: linux-mm@kvack.org
> Acked-by: Marco Elver <elver@google.com>

Reviewed-by: Andrey Konovalov <andreyknvl@google.com>

>
> ---
>
> v4:
>  - dropped an unnecessary comment as requested by Marco Elver
>
> Change-Id: I1f1672652c8392f15f7ca8ac26cd4e71f9cc1e4b
> ---
>  include/linux/compiler-clang.h | 7 +++++++
>  include/linux/compiler-gcc.h   | 5 +++++
>  2 files changed, 12 insertions(+)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index 2cb42d8bdedc6..d4f929b4a6705 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -33,6 +33,13 @@
>  #define __no_sanitize_thread
>  #endif
>
> +#if __has_feature(memory_sanitizer)
> +# define __SANITIZE_MEMORY__
> +# define __no_sanitize_memory __attribute__((no_sanitize("kernel-memory")))
> +#else
> +# define __no_sanitize_memory
> +#endif
> +
>  /*
>   * Not all versions of clang implement the the type-generic versions
>   * of the builtin overflow checkers. Fortunately, clang implements
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index cf294faec2f87..1121557252f88 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -151,6 +151,11 @@
>  #define __no_sanitize_thread
>  #endif
>
> +/*
> + * GCC doesn't support KMSAN.
> + */
> +#define __no_sanitize_memory
> +
>  #if GCC_VERSION >= 50100
>  #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
>  #endif
> --
> 2.25.1.696.g5e7596f4ac-goog
>
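
A user-space sketch of the same guard pattern, for anyone who wants to
try it outside the kernel. MY_SANITIZE_MEMORY, my_no_sanitize_memory and
sum_bytes() are hypothetical names, and the sketch uses the userspace
"memory" sanitizer name where the patch uses "kernel-memory". Compilers
without __has_feature() fall through to the empty definition, which is
the effect the compiler-gcc.h hunk has.

```c
#include <assert.h>
#include <stddef.h>

/* Compilers lacking __has_feature(): treat every feature as absent. */
#ifndef __has_feature
# define __has_feature(x) 0
#endif

#if __has_feature(memory_sanitizer)
# define MY_SANITIZE_MEMORY 1
# define my_no_sanitize_memory __attribute__((no_sanitize("memory")))
#else
# define MY_SANITIZE_MEMORY 0
# define my_no_sanitize_memory
#endif

/* The attribute expands to nothing unless the file is instrumented. */
my_no_sanitize_memory
unsigned int sum_bytes(const unsigned char *p, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	assert(p != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += p[i];
	return sum;
}
```

The point of the per-file macro is visible here: the check depends on
how this translation unit is compiled, not on a global config option.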


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 05/38] kmsan: reduce vmalloc space
  2020-03-25 16:12 ` [PATCH v5 05/38] kmsan: reduce vmalloc space glider
@ 2020-03-30 13:48   ` Andrey Konovalov
  0 siblings, 0 replies; 60+ messages in thread
From: Andrey Konovalov @ 2020-03-30 13:48 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Vegard Nossum, Andrew Morton, Dmitry Vyukov, Marco Elver,
	Linux Memory Management List, Alexander Viro, Andreas Dilger,
	Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Christoph Hellwig, Christoph Hellwig, Darrick J. Wong,
	David S. Miller, Dmitry Torokhov, Eric Biggers, Eric Dumazet,
	Eric Van Hensbergen, Greg Kroah-Hartman, Harry Wentland,
	Herbert Xu, Ilya Leoshkevich, Ingo Molnar, Jason Wang,
	Jens Axboe, Marek Szyprowski, Mark Rutland, Martin K. Petersen,
	Martin Schwidefsky, Matthew Wilcox, Michael S . Tsirkin,
	Michal Hocko, Michal Simek, Petr Mladek, Qian Cai, Randy Dunlap,
	Robin Murphy, Sergey Senozhatsky, Steven Rostedt, Takashi Iwai,
	Theodore Ts'o, Thomas Gleixner, Vasily Gorbik, Wolfram Sang

On Wed, Mar 25, 2020 at 5:13 PM <glider@google.com> wrote:
>
> KMSAN is going to use 3/4 of existing vmalloc space to hold the
> metadata, therefore we lower VMALLOC_END to make sure vmalloc() doesn't
> allocate past the first 1/4.
>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> To: Alexander Potapenko <glider@google.com>
> Cc: Vegard Nossum <vegard.nossum@oracle.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: linux-mm@kvack.org
>
> ---
>
> Change-Id: Iaa5e8e0fc2aa66c956f937f5a1de6e5ef40d57cc
> ---
>  arch/x86/include/asm/pgtable_64_types.h | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
> index 52e5f5f2240d9..586629e204366 100644
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -139,7 +139,22 @@ extern unsigned int ptrs_per_p4d;
>  # define VMEMMAP_START         __VMEMMAP_BASE_L4
>  #endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
>
> +#ifndef CONFIG_KMSAN
>  #define VMALLOC_END            (VMALLOC_START + (VMALLOC_SIZE_TB << 40) - 1)
> +#else
> +/*
> + * In KMSAN builds vmalloc area is four times smaller, and the remaining 3/4
> + * are used to keep the metadata for virtual pages.
> + */
> +#define VMALLOC_QUARTER_SIZE   ((VMALLOC_SIZE_TB << 40) >> 2)
> +#define VMALLOC_END            (VMALLOC_START + VMALLOC_QUARTER_SIZE - 1)
> +#define VMALLOC_SHADOW_OFFSET  VMALLOC_QUARTER_SIZE
> +#define VMALLOC_ORIGIN_OFFSET  (VMALLOC_QUARTER_SIZE * 2)

"<< 1" instead of "* 2" for consistency (since we're using ">> 2" just above)?

> +#define VMALLOC_META_END       (VMALLOC_END + VMALLOC_ORIGIN_OFFSET)
> +#define MODULES_SHADOW_START   (VMALLOC_META_END + 1)
> +#define MODULES_ORIGIN_START   (MODULES_SHADOW_START + MODULES_LEN)
> +#define MODULES_ORIGIN_END     (MODULES_ORIGIN_START + MODULES_LEN)
> +#endif

These macros are a bit hard to understand. VMALLOC_SHADOW_OFFSET and
VMALLOC_ORIGIN_OFFSET are offsets from VMALLOC_END and denote where
shadow and origin areas start? What is stored in (VMALLOC_END,
VMALLOC_END + VMALLOC_SHADOW_OFFSET] then? Maybe sorting these
constants in some logical order would help, or adding a comment on how
exactly those 3/4th of vmalloc space are split.
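
One possible reading, sketched in user space: assuming the two offsets
are added to the vmalloc address itself (not to VMALLOC_END), the
second quarter holds shadow and the third holds origins. The helpers
vmalloc_shadow() and vmalloc_origin() are hypothetical, and the
VMALLOC_START and VMALLOC_SIZE_TB values are illustrative x86_64
defaults (4-level paging, no KASLR), not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; see the assumptions stated above. */
#define VMALLOC_START         UINT64_C(0xffffc90000000000)
#define VMALLOC_SIZE_TB       UINT64_C(32)

/* Same arithmetic as the patch, with "<< 1" as suggested in review. */
#define VMALLOC_QUARTER_SIZE  ((VMALLOC_SIZE_TB << 40) >> 2)
#define VMALLOC_END           (VMALLOC_START + VMALLOC_QUARTER_SIZE - 1)
#define VMALLOC_SHADOW_OFFSET VMALLOC_QUARTER_SIZE
#define VMALLOC_ORIGIN_OFFSET (VMALLOC_QUARTER_SIZE << 1)
#define VMALLOC_META_END      (VMALLOC_END + VMALLOC_ORIGIN_OFFSET)

/* Assumed mapping: shadow lives one quarter above the data. */
uint64_t vmalloc_shadow(uint64_t addr)
{
	assert(addr >= VMALLOC_START && addr <= VMALLOC_END);
	return addr + VMALLOC_SHADOW_OFFSET;
}

/* Assumed mapping: origins live two quarters above the data. */
uint64_t vmalloc_origin(uint64_t addr)
{
	assert(addr >= VMALLOC_START && addr <= VMALLOC_END);
	return addr + VMALLOC_ORIGIN_OFFSET;
}
```

Under that assumption (VMALLOC_END, VMALLOC_END + VMALLOC_SHADOW_OFFSET]
is the shadow area, the origin area ends exactly at VMALLOC_META_END,
and the last quarter is left for the MODULES_* metadata ranges.
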

>
>  #define MODULES_VADDR          (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
>  /* The module sections ends with the start of the fixmap */
> --
> 2.25.1.696.g5e7596f4ac-goog
>


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v5 02/38] kmsan: add ReST documentation
  2020-03-25 16:12 ` [PATCH v5 02/38] kmsan: add ReST documentation glider
@ 2020-03-30 14:32   ` Andrey Konovalov
  0 siblings, 0 replies; 60+ messages in thread
From: Andrey Konovalov @ 2020-03-30 14:32 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Vegard Nossum, Dmitry Vyukov, Marco Elver,
	Linux Memory Management List, Alexander Viro, Andreas Dilger,
	Andrew Morton, Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Christoph Hellwig, Christoph Hellwig,
	Darrick J. Wong, David S. Miller, Dmitry Torokhov, Eric Biggers,
	Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman,
	Harry Wentland, Herbert Xu, Ilya Leoshkevich, Ingo Molnar,
	Jason Wang, Jens Axboe, Marek Szyprowski, Mark Rutland,
	Martin K. Petersen, Martin Schwidefsky, Matthew Wilcox,
	Michael S . Tsirkin, Michal Hocko, Michal Simek, Petr Mladek,
	Qian Cai, Randy Dunlap, Robin Murphy, Sergey Senozhatsky,
	Steven Rostedt, Takashi Iwai, Theodore Ts'o, Thomas Gleixner,
	Vasily Gorbik, Wolfram Sang

On Wed, Mar 25, 2020 at 5:13 PM <glider@google.com> wrote:
>
> Add Documentation/dev-tools/kmsan.rst and reference it in the dev-tools
> index.
>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> To: Alexander Potapenko <glider@google.com>
> Cc: Vegard Nossum <vegard.nossum@oracle.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: linux-mm@kvack.org
>
> ---
> v4:
>  - address comments by Marco Elver:
>   - remove contractions
>   - fix references
>   - minor fixes
>
> Change-Id: Iac6345065e6804ef811f1124fdf779c67ff1530e
> ---
>  Documentation/dev-tools/index.rst |   1 +
>  Documentation/dev-tools/kmsan.rst | 424 ++++++++++++++++++++++++++++++
>  2 files changed, 425 insertions(+)
>  create mode 100644 Documentation/dev-tools/kmsan.rst
>
> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> index f7809c7b1ba9e..a3b9579fc810c 100644
> --- a/Documentation/dev-tools/index.rst
> +++ b/Documentation/dev-tools/index.rst
> @@ -19,6 +19,7 @@ whole; patches welcome!
>     kcov
>     gcov
>     kasan
> +   kmsan
>     ubsan
>     kmemleak
>     kcsan
> diff --git a/Documentation/dev-tools/kmsan.rst b/Documentation/dev-tools/kmsan.rst
> new file mode 100644
> index 0000000000000..591c4809d46f3
> --- /dev/null
> +++ b/Documentation/dev-tools/kmsan.rst
> @@ -0,0 +1,424 @@
> +=============================
> +KernelMemorySanitizer (KMSAN)
> +=============================
> +
> +KMSAN is a dynamic memory error detector aimed at finding uses of uninitialized
> +memory.
> +It is based on compiler instrumentation, and is quite similar to the userspace
> +`MemorySanitizer tool`_.
> +
> +Example report
> +==============
> +Here is an example of a real KMSAN report in ``packet_bind_spkt()``::
> +
> +  ==================================================================
> +  BUG: KMSAN: uninit-value in strlen
> +  CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
> +  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> +   0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
> +   ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
> +   0000000000000000 0000000000000092 00000000ec400911 0000000000000002
> +  Call Trace:
> +   [<     inline     >] __dump_stack lib/dump_stack.c:15
> +   [<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
> +   [<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
> +   [<ffffffff818a783b>] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424
> +   [<     inline     >] strlen lib/string.c:484
> +   [<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
> +   [<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230 net/packet/af_packet.c:3132
> +   [<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
> +   [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
> +   [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:?
> +  chained origin:
> +   [<ffffffff810bb787>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
> +   [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
> +   [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:334
> +   [<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:527
> +   [<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380
> +   [<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
> +   [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
> +   [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:?
> +  origin description: ----address@SYSC_bind (origin=00000000eb400911)
> +  ==================================================================
> +
> +The report tells that the local variable ``address`` was created uninitialized
> +in ``SYSC_bind()`` (the ``bind`` system call implementation). The lower stack
> +trace corresponds to the place where this variable was created.
> +
> +The upper stack shows where the uninit value was used - in ``strlen()``.
> +It turned out that the contents of ``address`` were partially copied from the
> +userspace, but the buffer was not zero-terminated and contained some trailing
> +uninitialized bytes.
> +
> +``packet_bind_spkt()`` did not check the length of the buffer, but called
> +``strlcpy()`` on it, which called ``strlen()``, which started reading the
> +buffer byte by byte till it hit the uninitialized memory.
> +
> +
> +
> +KMSAN and Clang
> +===============
> +
> +In order for KMSAN to work the kernel must be
> +built with Clang, which so far is the only compiler that has KMSAN support.
> +The kernel instrumentation pass is based on the userspace
> +`MemorySanitizer tool`_. Because of the instrumentation complexity it is
> +unlikely that any other compiler will support KMSAN soon.
> +
> +Right now the instrumentation pass supports x86_64 only.
> +
> +How to build
> +============
> +
> +In order to build a kernel with KMSAN you will need a fresh Clang (10.0.0+,
> +trunk version r365008 or greater). Please refer to `LLVM documentation`_
> +for the instructions on how to build Clang::
> +
> +  export KMSAN_CLANG_PATH=/path/to/clang
> +  # Now configure and build the kernel with CONFIG_KMSAN enabled.
> +  make CC=$KMSAN_CLANG_PATH
> +
> +How KMSAN works
> +===============
> +
> +KMSAN shadow memory
> +-------------------
> +
> +KMSAN associates a metadata byte (also called shadow byte) with every byte of
> +kernel memory.
> +A bit in the shadow byte is set iff the corresponding bit of the kernel memory
> +byte is uninitialized.
> +Marking the memory uninitialized (i.e. setting its shadow bytes to 0xff) is
> +called poisoning, marking it initialized (setting the shadow bytes to 0x00) is
> +called unpoisoning.
> +
> +When a new variable is allocated on the stack, it is poisoned by default by
> +instrumentation code inserted by the compiler (unless it is a stack variable
> +that is immediately initialized). Any new heap allocation done without
> +``__GFP_ZERO`` is also poisoned.
> +
> +Compiler instrumentation also tracks the shadow values with the help from the
> +runtime library in ``mm/kmsan/``.
> +
> +The shadow value of a basic or compound type is an array of bytes of the same
> +length.
> +When a constant value is written into memory, that memory is unpoisoned.
> +When a value is read from memory, its shadow memory is also obtained and
> +propagated into all the operations which use that value. For every instruction
> +that takes one or more values the compiler generates code that calculates the
> +shadow of the result depending on those values and their shadows.
> +
> +Example::
> +
> +  int a = 0xff;
> +  int b;
> +  int c = a | b;
> +
> +In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``,
> +shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of
> +``c`` are uninitialized, while the lower byte is initialized.
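
The 0xffffff00 result can be reproduced with a small user-space model.
This is a sketch of the bitwise-OR propagation rule used by userspace
MemorySanitizer, assuming KMSAN instrumentation follows the same rule;
struct tracked and tracked_or() are hypothetical names, not part of the
KMSAN API.

```c
#include <assert.h>
#include <stdint.h>

/* A value paired with its shadow: a set shadow bit means "uninitialized". */
struct tracked {
	uint32_t val;
	uint32_t shadow;
};

struct tracked tracked_or(struct tracked a, struct tracked b)
{
	struct tracked c;

	c.val = a.val | b.val;
	/*
	 * A result bit is poisoned if both input bits are poisoned, or if
	 * one is poisoned and the other is an initialized 0. An initialized
	 * 1 forces the result bit to 1 whatever the other operand holds.
	 */
	c.shadow = (a.shadow & b.shadow) | (~a.val & b.shadow) |
		   (a.shadow & ~b.val);

	/* Bits where 'a' is an initialized 1 can never end up poisoned. */
	assert((c.shadow & a.val & ~a.shadow) == 0);
	return c;
}
```

With a = {0xff, 0} and b fully poisoned, the shadow of the result does
not depend on whatever garbage b.val happens to contain.
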
> +
> +
> +Origin tracking
> +---------------
> +
> +Every four bytes of kernel memory also have a so-called origin assigned to
> +them.
> +This origin describes the point in program execution at which the uninitialized
> +value was created. Every origin is associated with a creation stack, which lets
> +the user figure out what is going on.
> +
> +When an uninitialized variable is allocated on stack or heap, a new origin
> +value is created, and that variable's origin is filled with that value.
> +When a value is read from memory, its origin is also read and kept together
> +with the shadow. For every instruction that takes one or more values the origin
> +of the result is one of the origins corresponding to any of the uninitialized
> +inputs.
> +If a poisoned value is written into memory, its origin is written to the
> +corresponding storage as well.
> +
> +Example 1::
> +
> +  int a = 0;
> +  int b;
> +  int c = a + b;
> +
> +In this case the origin of ``b`` is generated upon function entry, and is
> +stored to the origin of ``c`` right before the addition result is written into
> +memory.
> +
> +Several variables may share the same origin address, if they are stored in the
> +same four-byte chunk.
> +In this case every write to either variable updates the origin for all of them.
> +
> +Example 2::
> +
> +  int combine(short a, short b) {
> +    union ret_t {
> +      int i;
> +      short s[2];
> +    } ret;
> +    ret.s[0] = a;
> +    ret.s[1] = b;
> +    return ret.i;
> +  }
> +
> +If ``a`` is initialized and ``b`` is not, the shadow of the result would be
> +0xffff0000, and the origin of the result would be the origin of ``b``.
> +``ret.s[0]`` would have the same origin, but it will be never used, because
> +that variable is initialized.
> +
> +If both function arguments are uninitialized, only the origin of the second
> +argument is preserved.
> +
> +Origin chaining
> +~~~~~~~~~~~~~~~
> +To ease debugging, KMSAN creates a new origin for every memory store.
> +The new origin references both its creation stack and the previous origin the
> +memory location had.
> +This may cause increased memory consumption, so we limit the length of origin
> +chains in the runtime.
> +
> +Clang instrumentation API
> +-------------------------
> +
> +Clang instrumentation pass inserts calls to functions defined in
> +``mm/kmsan/kmsan_instr.c`` into the kernel code.
> +
> +Shadow manipulation
> +~~~~~~~~~~~~~~~~~~~
> +For every memory access the compiler emits a call to a function that returns a
> +pair of pointers to the shadow and origin addresses of the given memory::
> +
> +  typedef struct {
> +    void *s, *o;
> +  } shadow_origin_ptr_t
> +
> +  shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr)
> +  shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr)
> +  shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, u64 size)
> +  shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, u64 size)
> +
> +The function name depends on the memory access size.
> +Each such function also checks if the shadow of the memory in the range
> +[``addr``, ``addr + n``) is contiguous and reports an error otherwise.

Makes sense to refer to the "Metadata allocation" section here, which
explains what happens in case of an error.

> +
> +The compiler makes sure that for every loaded value its shadow and origin
> +values are read from memory.
> +When a value is stored to memory, its shadow and origin are also stored using
> +the metadata pointers.
> +
> +Origin tracking
> +~~~~~~~~~~~~~~~
> +A special function is used to create a new origin value for a local variable
> +and set the origin of that variable to that value::
> +
> +  void __msan_poison_alloca(u64 address, u64 size, char *descr)
> +
> +Access to per-task data
> +~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +At the beginning of every instrumented function KMSAN inserts a call to
> +``__msan_get_context_state()``::
> +
> +  kmsan_context_state *__msan_get_context_state(void)
> +
> +``kmsan_context_state`` is declared in ``include/linux/kmsan.h``::
> +
> +  struct kmsan_context_s {
> +    char param_tls[KMSAN_PARAM_SIZE];
> +    char retval_tls[RETVAL_SIZE];
> +    char va_arg_tls[KMSAN_PARAM_SIZE];
> +    char va_arg_origin_tls[KMSAN_PARAM_SIZE];
> +    u64 va_arg_overflow_size_tls;
> +    depot_stack_handle_t param_origin_tls[PARAM_ARRAY_SIZE];
> +    depot_stack_handle_t retval_origin_tls;
> +    depot_stack_handle_t origin_tls;
> +  };
> +
> +This structure is used by KMSAN to pass parameter shadows and origins between
> +instrumented functions.
> +
> +String functions
> +~~~~~~~~~~~~~~~~
> +
> +The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the
> +following functions. These functions are also called when data structures are
> +initialized or copied, making sure shadow and origin values are copied alongside
> +with the data::
> +
> +  void *__msan_memcpy(void *dst, void *src, u64 n)
> +  void *__msan_memmove(void *dst, void *src, u64 n)
> +  void *__msan_memset(void *dst, int c, size_t n)
> +
> +Error reporting
> +~~~~~~~~~~~~~~~
> +
> +For each pointer dereference and each condition the compiler emits a shadow
> +check that calls ``__msan_warning()`` in the case a poisoned value is being
> +used::
> +
> +  void __msan_warning(u32 origin)
> +
> +``__msan_warning()`` causes KMSAN runtime to print an error report.
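
A rough sketch of the check described above, written by hand instead of being
emitted by the compiler. The ``toy_`` names and the value/shadow/origin triple
are hypothetical; the real instrumentation keeps shadow and origin in separate
metadata, not next to the value:

```c
#include <assert.h>
#include <stddef.h>

/* A value bundled with its metadata, for illustration only. */
struct toy_value {
	int val;
	unsigned int shadow;	/* non-zero: some bits are uninitialized */
	unsigned int origin;	/* identifies where the poison came from */
};

static unsigned int toy_reports;
static unsigned int toy_last_origin;

/* Stand-in for the warning hook: records the report instead of printing it. */
static void toy_msan_warning(unsigned int origin)
{
	toy_reports++;
	toy_last_origin = origin;
}

/* What `if (v)` conceptually turns into: check the shadow, then branch. */
static int toy_checked_condition(const struct toy_value *v)
{
	assert(v != NULL);
	if (v->shadow != 0)
		toy_msan_warning(v->origin);
	return v->val != 0;
}
```

Note that the condition is still evaluated after the warning; the check
reports the use, it does not change control flow.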
> +
> +Inline assembly instrumentation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +KMSAN instruments every inline assembly output with a call to::
> +
> +  void __msan_instrument_asm_store(u64 addr, u64 size)
> +
> +, which unpoisons the memory region.
> +
> +This approach may mask certain errors, but it also helps to avoid a lot of
> +false positives in bitwise operations, atomics etc.
> +
> +Sometimes the pointers passed into inline assembly do not point to valid memory.
> +In such cases they are ignored at runtime.
> +
> +Disabling the instrumentation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +A function can be marked with ``__no_sanitize_memory``.
> +Doing so does not remove KMSAN instrumentation from it; however, it makes the
> +compiler ignore the uninitialized values coming from the function's inputs,
> +and initialize the function's outputs.
> +The compiler will not inline functions marked with this attribute into functions
> +not marked with it, and vice versa.
> +
> +It is also possible to disable KMSAN for a single file (e.g. main.o)::
> +
> +  KMSAN_SANITIZE_main.o := n
> +
> +or for the whole directory::
> +
> +  KMSAN_SANITIZE := n
> +
> +in the Makefile. This comes at a cost however: stack allocations from such files
> +and parameters of instrumented functions called from them will have incorrect
> +shadow/origin values. As a rule of thumb, avoid using KMSAN_SANITIZE.
> +
> +Runtime library
> +---------------
> +The code is located in ``mm/kmsan/``.
> +
> +Per-task KMSAN state
> +~~~~~~~~~~~~~~~~~~~~
> +
> +Every task_struct has an associated KMSAN task state that holds the KMSAN
> +context (see above) and a per-task flag disallowing KMSAN reports::
> +
> +  struct kmsan_task_state {
> +    ...
> +    bool allow_reporting;
> +    struct kmsan_context_state cstate;
> +    ...
> +  };
> +
> +  struct task_struct {
> +    ...
> +    struct kmsan_task_state kmsan;
> +    ...
> +  };
> +
> +
> +KMSAN contexts
> +~~~~~~~~~~~~~~
> +
> +When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to
> +hold the metadata for function parameters and return values.
> +
> +But in the case the kernel is running in the interrupt, softirq or NMI context,
> +where ``current`` is unavailable, KMSAN switches to per-cpu interrupt state::
> +
> +  DEFINE_PER_CPU(kmsan_context_state[KMSAN_NESTED_CONTEXT_MAX],
> +                 kmsan_percpu_cstate);
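
One way to picture the switch between the two kinds of state is the sketch
below. It is a single-CPU userspace model with hypothetical ``toy_`` names;
in particular, indexing the per-CPU array by interrupt nesting depth is my
assumption about why the array has ``KMSAN_NESTED_CONTEXT_MAX`` entries, the
text above does not say so:

```c
#include <assert.h>

#define TOY_NESTED_CONTEXT_MAX 8

struct toy_context_state {
	char param_tls[64];
};

struct toy_task {
	struct toy_context_state cstate;
};

/* Single-CPU model: one "current" task and one per-CPU array of contexts. */
static struct toy_task toy_current;
static struct toy_context_state toy_percpu_cstate[TOY_NESTED_CONTEXT_MAX];
static int toy_irq_depth;	/* 0 means plain task context */

static void toy_irq_enter(void)
{
	assert(toy_irq_depth < TOY_NESTED_CONTEXT_MAX);
	toy_irq_depth++;
}

static void toy_irq_exit(void)
{
	assert(toy_irq_depth > 0);
	toy_irq_depth--;
}

/* Task context uses the task's state; nested contexts get their own slot. */
static struct toy_context_state *toy_get_context_state(void)
{
	if (toy_irq_depth == 0)
		return &toy_current.cstate;
	return &toy_percpu_cstate[toy_irq_depth - 1];
}
```

Giving each nesting level its own slot keeps an interrupt from clobbering the
parameter shadows of the code it interrupted.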
> +
> +Metadata allocation
> +~~~~~~~~~~~~~~~~~~~
> +There are several places in the kernel for which the metadata is stored.
> +
> +1. Each ``struct page`` instance contains two pointers to its shadow and
> +origin pages::
> +
> +  struct page {
> +    ...
> +    struct page *shadow, *origin;
> +    ...
> +  };
> +
> +Every time a ``struct page`` is allocated, the runtime library allocates two
> +additional pages to hold its shadow and origins. This is done by adding hooks
> +to ``alloc_pages()``/``free_pages()`` in ``mm/page_alloc.c``.
> +To avoid allocating the metadata for non-interesting pages (right now only the
> +shadow/origin pages themselves and stackdepot storage) the
> +``__GFP_NO_KMSAN_SHADOW`` flag is used.
> +
> +There is a problem related to this allocation algorithm: when two contiguous
> +memory blocks are allocated with two different ``alloc_pages()`` calls, their
> +shadow pages may not be contiguous. So, if a memory access crosses the boundary
> +of a memory block, accesses to shadow/origin memory may potentially corrupt
> +other pages or read incorrect values from them.
> +
> +As a workaround, we check the access size in
> +``__msan_metadata_ptr_for_XXX_YYY()`` and return a pointer to a fake shadow
> +region in the case of an error::
> +
> +  char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
> +  char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
> +
> +``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes.
> +All stores to ``dummy_store_page`` are ignored.
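
The fallback can be sketched as follows for a single tracked page. The
``toy_`` names are hypothetical, and treating "the access crosses the end of
the page" as the error condition is a simplification of whatever check the
runtime actually performs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096

/* Shadow of the one page this toy model tracks. */
static char toy_shadow_page[TOY_PAGE_SIZE];

/* Fallback targets: loads see zeroes, stores land somewhere harmless. */
static char toy_dummy_load_page[TOY_PAGE_SIZE];
static char toy_dummy_store_page[TOY_PAGE_SIZE];

/*
 * Return the shadow address for an access of `size` bytes at `offset` within
 * the tracked page, or a dummy page if the access does not fit in the page.
 */
static char *toy_shadow_ptr(size_t offset, size_t size, bool is_store)
{
	assert(size > 0);
	if (offset >= TOY_PAGE_SIZE || size > TOY_PAGE_SIZE - offset)
		return is_store ? toy_dummy_store_page : toy_dummy_load_page;
	return &toy_shadow_page[offset];
}
```

The cost of this workaround is that such boundary-crossing accesses are never
reported, since the load dummy always reads as initialized.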
> +
> +Unfortunately at boot time we need to allocate shadow and origin pages for the
> +kernel data (``.data``, ``.bss`` etc.) and percpu memory regions, the size of
> +which is not a power of 2. As a result, we have to allocate the metadata page by
> +page, so that it is also non-contiguous, although it may be perfectly valid to
> +access the corresponding kernel memory across page boundaries.
> +This can probably be fixed by allocating 1<<N pages at once, splitting them and
> +deallocating the rest.
> +
> +LSB of the ``shadow`` pointer in a ``struct page`` may be set to 1. In this case
> +shadow and origin pages are allocated, but KMSAN ignores accesses to them by
> +falling back to dummy pages. Allocating the metadata pages is still needed to
> +support ``vmap()/vunmap()`` operations on this struct page.

This part is not clear. We allocate shadow for vmap()'ed regions but
don't do any initialization checks for that memory?

> +
> +2. For vmalloc memory and modules, there is a direct mapping between the memory
> +range, its shadow and origin. KMSAN lessens the vmalloc area by 3/4, making only
> +the first quarter available to ``vmalloc()``. The second quarter of the vmalloc
> +area contains shadow memory for the first quarter, the third one holds the
> +origins. A small part of the fourth quarter contains shadow and origins for the
> +kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for
> +more details.
> +
> +When an array of pages is mapped into a contiguous virtual memory space, their
> +shadow and origin pages are similarly mapped into contiguous regions.
> +
> +3. For CPU entry area there are separate per-CPU arrays that hold its
> +metadata::
> +
> +  DEFINE_PER_CPU(char[CPU_ENTRY_AREA_SIZE], cpu_entry_area_shadow);
> +  DEFINE_PER_CPU(char[CPU_ENTRY_AREA_SIZE], cpu_entry_area_origin);
> +
> +When calculating shadow and origin addresses for a given memory address, the
> +runtime checks whether the address belongs to the physical page range, the
> +virtual page range or CPU entry area.
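
For the vmalloc case in item 2, the "direct mapping" reads to me like a
constant offset, roughly as sketched below. The ``toy_`` names and the
start/size constants are made up (the real bounds live in the arch headers),
and the fixed-offset interpretation is an assumption based on the quarter
layout described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds of a toy vmalloc area; real values are arch-specific. */
#define TOY_VMALLOC_START	0x10000000ULL
#define TOY_VMALLOC_SIZE	0x04000000ULL
#define TOY_QUARTER		(TOY_VMALLOC_SIZE / 4)

/* Only the first quarter is handed out to vmalloc() users. */
static bool toy_is_vmalloc_data(uint64_t addr)
{
	return addr >= TOY_VMALLOC_START &&
	       addr < TOY_VMALLOC_START + TOY_QUARTER;
}

/* Shadow lives in the second quarter, at the same offset as the data. */
static uint64_t toy_vmalloc_shadow(uint64_t addr)
{
	assert(toy_is_vmalloc_data(addr));
	return addr + TOY_QUARTER;
}

/* Origins live in the third quarter, at the same offset as the data. */
static uint64_t toy_vmalloc_origin(uint64_t addr)
{
	assert(toy_is_vmalloc_data(addr));
	return addr + 2 * TOY_QUARTER;
}
```

A full lookup would first classify the address (physical page range, vmalloc
range or CPU entry area) and only then pick the matching calculation.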
> +
> +Handling ``pt_regs``
> +~~~~~~~~~~~~~~~~~~~~
> +
> +Many functions receive a ``struct pt_regs`` holding the register state at a
> +certain point. Registers do not have (easily calculable) shadow or origin
> +associated with them.
> +We can assume that the registers are always initialized.
> +
> +References
> +==========
> +
> +E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized
> +memory use in C++
> +<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_.
> +In Proceedings of CGO 2015.
> +
> +.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html
> +.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html
> --
> 2.25.1.696.g5e7596f4ac-goog
>

Nit: some sections have empty lines after the section header, while
others don't.


Thread overview: 60+ messages
2020-03-25 16:12 [PATCH v5 00/38] Add KernelMemorySanitizer infrastructure glider
2020-03-25 16:12 ` [PATCH v5 01/38] stackdepot: reserve 5 extra bits in depot_stack_handle_t glider
2020-03-30 13:36   ` Andrey Konovalov
2020-03-25 16:12 ` [PATCH v5 02/38] kmsan: add ReST documentation glider
2020-03-30 14:32   ` Andrey Konovalov
2020-03-25 16:12 ` [PATCH v5 03/38] kmsan: gfp: introduce __GFP_NO_KMSAN_SHADOW glider
2020-03-25 16:19   ` Michal Hocko
2020-03-25 17:26     ` Alexander Potapenko
2020-03-25 17:40       ` Alexander Potapenko
2020-03-25 17:49         ` Matthew Wilcox
2020-03-25 18:03           ` Alexander Potapenko
2020-03-25 18:09             ` Matthew Wilcox
2020-03-25 18:30               ` Alexander Potapenko
2020-03-25 18:43                 ` Michal Hocko
2020-03-25 18:40           ` Michal Hocko
2020-03-25 18:38         ` Michal Hocko
2020-03-27 12:20           ` Alexander Potapenko
2020-03-25 17:43       ` Michal Hocko
2020-03-25 16:12 ` [PATCH v5 04/38] kmsan: introduce __no_sanitize_memory and __SANITIZE_MEMORY__ glider
2020-03-30 13:37   ` Andrey Konovalov
2020-03-25 16:12 ` [PATCH v5 05/38] kmsan: reduce vmalloc space glider
2020-03-30 13:48   ` Andrey Konovalov
2020-03-25 16:12 ` [PATCH v5 06/38] kmsan: add KMSAN runtime core glider
2020-03-25 16:12 ` [PATCH v5 07/38] kmsan: KMSAN compiler API implementation glider
2020-03-25 16:12 ` [PATCH v5 08/38] kmsan: add KMSAN hooks for kernel subsystems glider
2020-03-25 16:12 ` [PATCH v5 09/38] kmsan: stackdepot: don't allocate KMSAN metadata for stackdepot glider
2020-03-25 16:12 ` [PATCH v5 10/38] kmsan: define READ_ONCE_NOCHECK() glider
2020-03-25 16:12 ` [PATCH v5 11/38] kmsan: make READ_ONCE_TASK_STACK() return initialized values glider
2020-03-25 16:12 ` [PATCH v5 12/38] kmsan: x86: sync metadata pages on page fault glider
2020-03-25 16:12 ` [PATCH v5 13/38] kmsan: add tests for KMSAN glider
2020-03-25 16:12 ` [PATCH v5 14/38] crypto: kmsan: disable accelerated configs under KMSAN glider
2020-03-25 16:12 ` [PATCH v5 15/38] kmsan: x86: disable UNWINDER_ORC " glider
2020-03-25 16:12 ` [PATCH v5 16/38] kmsan: x86/asm: softirq: add KMSAN IRQ entry hooks glider
2020-03-25 16:12 ` [PATCH v5 17/38] kmsan: disable KMSAN instrumentation for certain kernel parts glider
2020-03-25 16:12 ` [PATCH v5 18/38] kmsan: mm: call KMSAN hooks from SLUB code glider
2020-03-25 16:12 ` [PATCH v5 19/38] kmsan: mm: maintain KMSAN metadata for page operations glider
2020-03-25 16:12 ` [PATCH v5 20/38] kmsan: handle memory sent to/from USB glider
2020-03-25 16:12 ` [PATCH v5 21/38] kmsan: handle task creation and exiting glider
2020-03-25 16:12 ` [PATCH v5 22/38] kmsan: net: check the value of skb before sending it to the network glider
2020-03-25 16:12 ` [PATCH v5 23/38] kmsan: printk: treat the result of vscnprintf() as initialized glider
2020-03-25 16:12 ` [PATCH v5 24/38] kmsan: disable instrumentation of certain functions glider
2020-03-25 16:12 ` [PATCH v5 25/38] kmsan: unpoison |tlb| in arch_tlb_gather_mmu() glider
2020-03-25 16:12 ` [PATCH v5 26/38] kmsan: use __msan_ string functions where possible glider
2020-03-25 16:12 ` [PATCH v5 27/38] kmsan: hooks for copy_to_user() and friends glider
2020-03-25 16:12 ` [PATCH v5 28/38] kmsan: init: call KMSAN initialization routines glider
2020-03-25 16:12 ` [PATCH v5 29/38] kmsan: enable KMSAN builds glider
2020-03-25 16:12 ` [PATCH v5 30/38] kmsan: handle /dev/[u]random glider
2020-03-25 16:12 ` [PATCH v5 31/38] kmsan: virtio: check/unpoison scatterlist in vring_map_one_sg() glider
2020-03-25 16:12 ` [PATCH v5 32/38] kmsan: disable strscpy() optimization under KMSAN glider
2020-03-25 16:12 ` [PATCH v5 33/38] kmsan: add iomap support glider
2020-03-25 16:12 ` [PATCH v5 34/38] kmsan: dma: unpoison memory mapped by dma_direct_map_page() glider
2020-03-25 16:19   ` Christoph Hellwig
2020-03-27 17:03     ` Alexander Potapenko
2020-03-27 17:06       ` Christoph Hellwig
2020-03-27 18:46         ` Alexander Potapenko
2020-03-28  8:52           ` Christoph Hellwig
2020-03-25 16:12 ` [PATCH v5 35/38] kmsan: disable physical page merging in biovec glider
2020-03-25 16:12 ` [PATCH v5 36/38] x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN glider
2020-03-25 16:12 ` [PATCH v5 37/38] kmsan: x86/uprobes: unpoison regs in arch_uprobe_exception_notify() glider
2020-03-25 16:12 ` [PATCH v5 38/38] kmsan: block: skip bio block merging logic for KMSAN glider
