From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752969AbcALT2I (ORCPT ); Tue, 12 Jan 2016 14:28:08 -0500 Received: from mail-io0-f169.google.com ([209.85.223.169]:36307 "EHLO mail-io0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752246AbcALT1v (ORCPT ); Tue, 12 Jan 2016 14:27:51 -0500 MIME-Version: 1.0 In-Reply-To: References: <1452615352-117784-1-git-send-email-dvyukov@google.com> Date: Tue, 12 Jan 2016 11:27:50 -0800 Message-ID: Subject: Re: [PATCH] kernel: add kcov code coverage From: Kees Cook To: Dmitry Vyukov Cc: Andrew Morton , David Drysdale , Quentin Casasnovas , Sasha Levin , Vegard Nossum , LKML , Eric Dumazet , Tavis Ormandy , Bjorn Helgaas , syzkaller , Kostya Serebryany , Alexander Potapenko , Andrey Ryabinin Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 12, 2016 at 11:19 AM, Dmitry Vyukov wrote: > On Tue, Jan 12, 2016 at 6:31 PM, Kees Cook wrote: >> On Tue, Jan 12, 2016 at 8:15 AM, Dmitry Vyukov wrote: >>> kcov provides code coverage collection for coverage-guided fuzzing >>> (randomized testing). Coverage-guided fuzzing is a testing technique >>> that uses coverage feedback to determine new interesting inputs to a >>> system. A notable user-space example is AFL >>> (http://lcamtuf.coredump.cx/afl/). However, this technique is not >>> widely used for kernel testing due to missing compiler and kernel >>> support. >>> >>> kcov does not aim to collect as much coverage as possible. It aims >>> to collect more or less stable coverage that is a function of syscall >>> inputs. To achieve this goal it does not collect coverage in >>> soft/hard interrupts and instrumentation of some inherently >>> non-deterministic or non-interesting parts of the kernel is disabled >>> (e.g. scheduler, locking). 
>>> >>> Currently there is a single coverage collection mode (tracing), >>> but the API anticipates additional collection modes. >>> Initially I also implemented a second mode which exposes >>> coverage in a fixed-size hash table of counters (what Quentin >>> used in his original patch). I've dropped the second mode for >>> simplicity. >>> >>> This patch adds the necessary support on the kernel side. >>> The complementary compiler support was added in gcc revision 231296. >>> >>> We've used this support to build the syzkaller system call fuzzer, >>> which has found 90 kernel bugs in just 2 months: >>> https://github.com/google/syzkaller/wiki/Found-Bugs >>> We've also found 30+ bugs in our internal systems with syzkaller. >>> Another (yet unexplored) direction where kcov coverage would greatly >>> help is more traditional "blob mutation". For example, mounting >>> a random blob as a filesystem, or receiving a random blob over the wire. >>> >>> Why not gcov? A typical fuzzing loop looks as follows: (1) reset >>> coverage, (2) execute a bit of code, (3) collect coverage, repeat. >>> A typical coverage can be just a dozen basic blocks (e.g. an >>> invalid input). In such a context gcov becomes prohibitively expensive >>> as the reset/collect coverage steps depend on the total number of basic >>> blocks/edges in the program (in the case of the kernel it is about 2M). >>> The cost of kcov depends only on the number of executed basic >>> blocks/edges. On top of >>> that, the kernel requires per-thread coverage because there are >>> always background threads and unrelated processes that also produce >>> coverage. With inlined gcov instrumentation per-thread coverage is not >>> possible. >>> >>> Based on a patch by Quentin Casasnovas. >>> Signed-off-by: Dmitry Vyukov >> >> Reviewed-by: Kees Cook >> >>> --- >>> Anticipating reasonable questions regarding usage of this feature. >>> Quentin Casasnovas and Vegard Nossum also plan to use kcov for >>> coverage-guided fuzzing. 
Currently they use a custom kernel patch >>> for their fuzzer and found several dozen bugs. >>> There is also interest from the Intel 0-DAY kernel test infrastructure. >>> >>> Based on commit 03891f9c853d5c4473224478a1e03ea00d70ff8d. >>> --- >>> Documentation/kcov.txt | 111 +++++++++++++++ >>> Makefile | 10 +- >>> arch/x86/Kconfig | 1 + >>> arch/x86/boot/Makefile | 6 + >>> arch/x86/boot/compressed/Makefile | 2 + >>> arch/x86/entry/vdso/Makefile | 2 + >>> arch/x86/kernel/Makefile | 5 + >>> arch/x86/kernel/apic/Makefile | 4 + >>> arch/x86/kernel/cpu/Makefile | 4 + >>> arch/x86/lib/Makefile | 3 + >>> arch/x86/mm/Makefile | 3 + >>> arch/x86/realmode/rm/Makefile | 2 + >>> include/linux/kcov.h | 19 +++ >>> include/linux/sched.h | 10 ++ >>> include/uapi/linux/kcov.h | 10 ++ >>> kernel/Makefile | 9 ++ >>> kernel/exit.c | 2 + >>> kernel/fork.c | 3 + >>> kernel/kcov/Makefile | 5 + >>> kernel/kcov/kcov.c | 287 ++++++++++++++++++++++++++++++++++++++ >>> kernel/locking/Makefile | 3 + >>> kernel/rcu/Makefile | 4 + >>> kernel/sched/Makefile | 4 + >>> lib/Kconfig.debug | 27 ++++ >>> lib/Makefile | 9 ++ >>> mm/Makefile | 15 ++ >>> mm/kasan/Makefile | 1 + >>> scripts/Makefile.lib | 6 + >>> 28 files changed, 566 insertions(+), 1 deletion(-) >>> create mode 100644 Documentation/kcov.txt >>> create mode 100644 include/linux/kcov.h >>> create mode 100644 include/uapi/linux/kcov.h >>> create mode 100644 kernel/kcov/Makefile >>> create mode 100644 kernel/kcov/kcov.c >>> >>> diff --git a/Documentation/kcov.txt b/Documentation/kcov.txt >>> new file mode 100644 >>> index 0000000..1fa6a3d >>> --- /dev/null >>> +++ b/Documentation/kcov.txt >>> @@ -0,0 +1,111 @@ >>> +kcov: code coverage for fuzzing >>> +=============================== >>> + >>> +kcov exposes kernel code coverage information in a form suitable for coverage- >>> +guided fuzzing (randomized testing). Coverage data of a running kernel is >>> +exported via the "kcov" debugfs file. 
Coverage collection is enabled on a task >>> +basis, and thus it can capture precise coverage of a single system call. >>> + >>> +Note that kcov does not aim to collect as much coverage as possible. It aims >>> +to collect more or less stable coverage that is a function of syscall inputs. >>> +To achieve this goal it does not collect coverage in soft/hard interrupts >>> +and instrumentation of some inherently non-deterministic parts of the kernel is >>> +disabled (e.g. scheduler, locking). >>> + >>> +Usage: >>> +====== >>> + >>> +Configure the kernel with: >>> + >>> + CONFIG_KCOV=y >>> + CONFIG_DEBUG_FS=y >>> + >>> +CONFIG_KCOV requires gcc built on revision 231296 or later. >>> +Profiling data will only become accessible once debugfs has been mounted: >>> + >>> + mount -t debugfs none /sys/kernel/debug >>> + >>> +The following program demonstrates kcov usage from within a test program: >>> + >>> +#include <stdio.h> >>> +#include <stddef.h> >>> +#include <stdint.h> >>> +#include <unistd.h> >>> +#include <fcntl.h> >>> +#include <sys/types.h> >>> +#include <sys/ioctl.h> >>> +#include <sys/mman.h> >>> + >>> +#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) >>> +#define KCOV_ENABLE _IO('c', 100) >>> +#define KCOV_DISABLE _IO('c', 101) >>> +#define COVER_SIZE (64<<10) >>> + >>> +int main(int argc, char **argv) >>> +{ >>> + int fd; >>> + uint32_t *cover, n, i; >>> + >>> + /* A single file descriptor allows coverage collection on a single >>> + * thread. >>> + */ >>> + fd = open("/sys/kernel/debug/kcov", O_RDWR); >>> + if (fd == -1) >>> + perror("open"); >>> + /* Set up trace mode and trace size. */ >>> + if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE)) >>> + perror("ioctl"); >>> + /* Mmap buffer shared between kernel- and user-space. */ >>> + cover = (uint32_t*)mmap(NULL, COVER_SIZE * sizeof(uint32_t), >>> + PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); >>> + if ((void*)cover == MAP_FAILED) >>> + perror("mmap"); >>> + /* Enable coverage collection on the current thread. 
*/ >>> + if (ioctl(fd, KCOV_ENABLE, 0)) >>> + perror("ioctl"); >>> + /* Reset coverage from the tail of the ioctl() call. */ >>> + __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED); >>> + /* That's the target syscall. */ >>> + read(-1, NULL, 0); >>> + /* Read the number of PCs collected. */ >>> + n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED); >>> + /* PCs are shortened to uint32_t, so we need to restore the upper part. */ >>> + for (i = 0; i < n; i++) >>> + printf("0xffffffff%0lx\n", (unsigned long)cover[i + 1]); >>> + /* Disable coverage collection for the current thread. After this call >>> + * coverage can be enabled for a different thread. >>> + */ >>> + if (ioctl(fd, KCOV_DISABLE, 0)) >>> + perror("ioctl"); >>> + /* Free resources. */ >>> + if (munmap(cover, COVER_SIZE * sizeof(uint32_t))) >>> + perror("munmap"); >>> + if (close(fd)) >>> + perror("close"); >>> + return 0; >>> +} >>> + >>> +After piping through addr2line, the output of the program looks as follows: >>> + >>> +SyS_read >>> +fs/read_write.c:562 >>> +__fdget_pos >>> +fs/file.c:774 >>> +__fget_light >>> +fs/file.c:746 >>> +__fget_light >>> +fs/file.c:750 >>> +__fget_light >>> +fs/file.c:760 >>> +__fdget_pos >>> +fs/file.c:784 >>> +SyS_read >>> +fs/read_write.c:562 >>> + >>> +If a program needs to collect coverage from several threads (independently), >>> +it needs to open /sys/kernel/debug/kcov in each thread separately. >>> + >>> +The interface is fine-grained to allow efficient forking of test processes. >>> +That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode, >>> +mmaps the coverage buffer and then forks child processes in a loop. Child processes >>> +only need to enable coverage (disable happens automatically on thread end). 
>>> diff --git a/Makefile b/Makefile >>> index 70dea02..9fe404a 100644 >>> --- a/Makefile >>> +++ b/Makefile >>> @@ -365,6 +365,7 @@ LDFLAGS_MODULE = >>> CFLAGS_KERNEL = >>> AFLAGS_KERNEL = >>> CFLAGS_GCOV = -fprofile-arcs -ftest-coverage >>> +CFLAGS_KCOV = -fsanitize-coverage=trace-pc >>> >>> >>> # Use USERINCLUDE when you must reference the UAPI directories only. >>> @@ -411,7 +412,7 @@ export MAKE AWK GENKSYMS INSTALLKERNEL PERL PYTHON UTS_MACHINE >>> export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS >>> >>> export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS >>> -export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KASAN >>> +export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KCOV CFLAGS_KASAN >>> export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE >>> export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE >>> export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL >>> @@ -667,6 +668,13 @@ endif >>> endif >>> KBUILD_CFLAGS += $(stackp-flag) >>> >>> +ifdef CONFIG_KCOV >>> + ifeq ($(call cc-option, $(CFLAGS_KCOV)),) >>> + $(warning Cannot use CONFIG_KCOV: \ >>> + -fsanitize-coverage=trace-pc is not supported by compiler) >>> + endif >>> +endif >>> + >>> ifeq ($(cc-name),clang) >>> KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,) >>> KBUILD_CPPFLAGS += $(call cc-option,-Wno-unknown-warning-option,) >>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >>> index 258965d..be39ab5 100644 >>> --- a/arch/x86/Kconfig >>> +++ b/arch/x86/Kconfig >>> @@ -27,6 +27,7 @@ config X86 >>> select ARCH_HAS_ELF_RANDOMIZE >>> select ARCH_HAS_FAST_MULTIPLIER >>> select ARCH_HAS_GCOV_PROFILE_ALL >>> + select ARCH_HAS_KCOV if X86_64 >>> select ARCH_HAS_PMEM_API if X86_64 >>> select ARCH_HAS_MMIO_FLUSH >>> select ARCH_HAS_SG_CHAIN >>> diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile >>> index 2ee62db..b2eb295 100644 >>> --- a/arch/x86/boot/Makefile >>> +++ b/arch/x86/boot/Makefile >>> @@ -10,6 +10,12 
@@ >>> # >>> >>> KASAN_SANITIZE := n >>> +# The kernel does not boot with kcov instrumentation here. >>> +# One of the problems observed was insertion of the __sanitizer_cov_trace_pc() >>> +# callback into the middle of the per-cpu data enabling code. Thus the callback observed >>> +# an inconsistent state and crashed. We are interested mostly in syscall coverage, >>> +# so boot code is not interesting anyway. >>> +KCOV_INSTRUMENT := n >>> >>> # If you want to preset the SVGA mode, uncomment the next line and >>> # set SVGA_MODE to whatever number you want. >>> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile >>> index 0a291cd..e625939 100644 >>> --- a/arch/x86/boot/compressed/Makefile >>> +++ b/arch/x86/boot/compressed/Makefile >>> @@ -17,6 +17,8 @@ >>> # compressed vmlinux.bin.all + u32 size of vmlinux.bin.all >>> >>> KASAN_SANITIZE := n >>> +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in. >>> +KCOV_INSTRUMENT := n >>> >>> targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \ >>> vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4 >>> diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile >>> index 265c0ed..1b663b8 100644 >>> --- a/arch/x86/entry/vdso/Makefile >>> +++ b/arch/x86/entry/vdso/Makefile >>> @@ -4,6 +4,8 @@ >>> >>> KBUILD_CFLAGS += $(DISABLE_LTO) >>> KASAN_SANITIZE := n >>> +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in. >>> +KCOV_INSTRUMENT := n >>> >>> VDSO64-$(CONFIG_X86_64) := y >>> VDSOX32-$(CONFIG_X86_X32_ABI) := y >>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile >>> index b1b78ff..4648960 100644 >>> --- a/arch/x86/kernel/Makefile >>> +++ b/arch/x86/kernel/Makefile >>> @@ -19,6 +19,11 @@ endif >>> KASAN_SANITIZE_head$(BITS).o := n >>> KASAN_SANITIZE_dumpstack.o := n >>> KASAN_SANITIZE_dumpstack_$(BITS).o := n >>> +# If instrumentation of this dir is enabled, the boot hangs during the first second. 
>>> +# Probably could be more selective here, but note that files related to irqs, >>> +# boot, dumpstack/stacktrace, etc. are either non-interesting or can lead to >>> +# non-deterministic coverage. >>> +KCOV_INSTRUMENT := n >>> >>> CFLAGS_irq.o := -I$(src)/../include/asm/trace >>> >>> diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile >>> index 8bb12dd..8f2a3d7 100644 >>> --- a/arch/x86/kernel/apic/Makefile >>> +++ b/arch/x86/kernel/apic/Makefile >>> @@ -2,6 +2,10 @@ >>> # Makefile for local APIC drivers and for the IO-APIC code >>> # >>> >>> +# Leads to non-deterministic coverage that is not a function of syscall inputs. >>> +# In particular, smp_apic_timer_interrupt() is called in random places. >>> +KCOV_INSTRUMENT := n >>> + >>> obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o ipi.o vector.o >>> obj-y += hw_nmi.o >>> >>> diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile >>> index 5803130..c108683 100644 >>> --- a/arch/x86/kernel/cpu/Makefile >>> +++ b/arch/x86/kernel/cpu/Makefile >>> @@ -8,6 +8,10 @@ CFLAGS_REMOVE_common.o = -pg >>> CFLAGS_REMOVE_perf_event.o = -pg >>> endif >>> >>> +# If these files are instrumented, the boot hangs during the first second. >>> +KCOV_INSTRUMENT_common.o := n >>> +KCOV_INSTRUMENT_perf_event.o := n >>> + >>> # Make sure load_percpu_segment has no stackprotector >>> nostackp := $(call cc-option, -fno-stack-protector) >>> CFLAGS_common.o := $(nostackp) >>> diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile >>> index a501fa2..fefca94 100644 >>> --- a/arch/x86/lib/Makefile >>> +++ b/arch/x86/lib/Makefile >>> @@ -2,6 +2,9 @@ >>> # Makefile for x86 specific library files. >>> # >>> >>> +# Produces uninteresting flaky coverage. 
>>> +KCOV_INSTRUMENT_delay.o := n >>> + >>> inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk >>> inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt >>> quiet_cmd_inat_tables = GEN $@ >>> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile >>> index f9d38a4..147def6 100644 >>> --- a/arch/x86/mm/Makefile >>> +++ b/arch/x86/mm/Makefile >>> @@ -1,3 +1,6 @@ >>> +# Kernel does not boot with instrumentation of tlb.c. >>> +KCOV_INSTRUMENT_tlb.o := n >>> + >>> obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \ >>> pat.o pgtable.o physaddr.o gup.o setup_nx.o >>> >>> diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile >>> index 2730d77..2abf667 100644 >>> --- a/arch/x86/realmode/rm/Makefile >>> +++ b/arch/x86/realmode/rm/Makefile >>> @@ -7,6 +7,8 @@ >>> # >>> # >>> KASAN_SANITIZE := n >>> +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in. >>> +KCOV_INSTRUMENT := n >>> >>> always := realmode.bin realmode.relocs >>> >>> diff --git a/include/linux/kcov.h b/include/linux/kcov.h >>> new file mode 100644 >>> index 0000000..72ff663 >>> --- /dev/null >>> +++ b/include/linux/kcov.h >>> @@ -0,0 +1,19 @@ >>> +#ifndef _LINUX_KCOV_H >>> +#define _LINUX_KCOV_H >>> + >>> +#include >>> + >>> +struct task_struct; >>> + >>> +#ifdef CONFIG_KCOV >>> + >>> +void kcov_task_init(struct task_struct *t); >>> +void kcov_task_exit(struct task_struct *t); >>> + >>> +#else >>> + >>> +static inline void kcov_task_init(struct task_struct *t) {} >>> +static inline void kcov_task_exit(struct task_struct *t) {} >>> + >>> +#endif /* CONFIG_KCOV */ >>> +#endif /* _LINUX_KCOV_H */ >>> diff --git a/include/linux/sched.h b/include/linux/sched.h >>> index 4bae8ab..299d0180 100644 >>> --- a/include/linux/sched.h >>> +++ b/include/linux/sched.h >>> @@ -1806,6 +1806,16 @@ struct task_struct { >>> /* bitmask and counter of trace recursion */ >>> unsigned long trace_recursion; >>> #endif /* CONFIG_TRACING */ >>> 
+#ifdef CONFIG_KCOV >>> + /* Coverage collection mode enabled for this task (0 if disabled). */ >>> + int kcov_mode; >>> + /* Size of the kcov_area. */ >>> + unsigned long kcov_size; >>> + /* Buffer for coverage collection. */ >>> + void *kcov_area; >>> + /* kcov descriptor wired with this task or NULL. */ >>> + void *kcov; >>> +#endif >>> #ifdef CONFIG_MEMCG >>> struct mem_cgroup *memcg_in_oom; >>> gfp_t memcg_oom_gfp_mask; >>> diff --git a/include/uapi/linux/kcov.h b/include/uapi/linux/kcov.h >>> new file mode 100644 >>> index 0000000..574e22e >>> --- /dev/null >>> +++ b/include/uapi/linux/kcov.h >>> @@ -0,0 +1,10 @@ >>> +#ifndef _LINUX_KCOV_IOCTLS_H >>> +#define _LINUX_KCOV_IOCTLS_H >>> + >>> +#include >>> + >>> +#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) >>> +#define KCOV_ENABLE _IO('c', 100) >>> +#define KCOV_DISABLE _IO('c', 101) >>> + >>> +#endif /* _LINUX_KCOV_IOCTLS_H */ >>> diff --git a/kernel/Makefile b/kernel/Makefile >>> index 53abf00..db7278b 100644 >>> --- a/kernel/Makefile >>> +++ b/kernel/Makefile >>> @@ -19,6 +19,14 @@ CFLAGS_REMOVE_cgroup-debug.o = $(CC_FLAGS_FTRACE) >>> CFLAGS_REMOVE_irq_work.o = $(CC_FLAGS_FTRACE) >>> endif >>> >>> +# Prevents flicker of uninteresting __do_softirq()/__local_bh_disable_ip() >>> +# in coverage traces. >>> +KCOV_INSTRUMENT_softirq.o := n >>> +# These are called from save_stack_trace() on the slub debug path, >>> +# and produce insane amounts of uninteresting coverage. 
>>> +KCOV_INSTRUMENT_module.o := n >>> +KCOV_INSTRUMENT_extable.o := n >>> + >>> # cond_syscall is currently not LTO compatible >>> CFLAGS_sys_ni.o = $(DISABLE_LTO) >>> >>> @@ -69,6 +77,7 @@ obj-$(CONFIG_AUDITSYSCALL) += auditsc.o >>> obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o audit_fsnotify.o >>> obj-$(CONFIG_AUDIT_TREE) += audit_tree.o >>> obj-$(CONFIG_GCOV_KERNEL) += gcov/ >>> +obj-$(CONFIG_KCOV) += kcov/ >>> obj-$(CONFIG_KPROBES) += kprobes.o >>> obj-$(CONFIG_KGDB) += debug/ >>> obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o >>> diff --git a/kernel/exit.c b/kernel/exit.c >>> index 07110c6..49a1339 100644 >>> --- a/kernel/exit.c >>> +++ b/kernel/exit.c >>> @@ -53,6 +53,7 @@ >>> #include >>> #include >>> #include >>> +#include >>> >>> #include >>> #include >>> @@ -657,6 +658,7 @@ void do_exit(long code) >>> TASKS_RCU(int tasks_rcu_i); >>> >>> profile_task_exit(tsk); >>> + kcov_task_exit(tsk); >>> >>> WARN_ON(blk_needs_flush_plug(tsk)); >>> >>> diff --git a/kernel/fork.c b/kernel/fork.c >>> index 291b08c..6b28993 100644 >>> --- a/kernel/fork.c >>> +++ b/kernel/fork.c >>> @@ -75,6 +75,7 @@ >>> #include >>> #include >>> #include >>> +#include >>> >>> #include >>> #include >>> @@ -384,6 +385,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig) >>> >>> account_kernel_stack(ti, 1); >>> >>> + kcov_task_init(tsk); >>> + >>> return tsk; >>> >>> free_ti: >>> diff --git a/kernel/kcov/Makefile b/kernel/kcov/Makefile >>> new file mode 100644 >>> index 0000000..88892b7 >>> --- /dev/null >>> +++ b/kernel/kcov/Makefile >>> @@ -0,0 +1,5 @@ >>> +KCOV_INSTRUMENT := n >>> +KASAN_SANITIZE := n >>> + >>> +obj-y := kcov.o >>> + >>> diff --git a/kernel/kcov/kcov.c b/kernel/kcov/kcov.c >>> new file mode 100644 >>> index 0000000..05ec361 >>> --- /dev/null >>> +++ b/kernel/kcov/kcov.c >>> @@ -0,0 +1,287 @@ >>> +#define pr_fmt(fmt) "kcov: " fmt >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> 
+#include >>> +#include >>> +#include >>> +#include >>> + >>> +enum kcov_mode { >>> + /* Tracing coverage collection mode. >>> + * Covered PCs are collected in a per-task buffer. >>> + */ >>> + kcov_mode_trace = 1, >>> +}; >>> + >>> +/* kcov descriptor (one per opened debugfs file). */ >>> +struct kcov { >>> + /* Reference counter. We keep one for: >>> + * - opened file descriptor >>> + * - mmapped region (including copies after fork) >>> + * - task with enabled coverage (we can't unwire it from another task) >>> + */ >>> + atomic_t rc; >>> + /* The lock protects state transitions of the descriptor: >>> + * - initial state after open() >>> + * - then there must be a single ioctl(KCOV_INIT_TRACE) call >>> + * - then, mmap() call (several calls are allowed but not useful) >>> + * - then, repeated enable/disable for a task (only one task at a time >>> + * allowed) >>> + */ >>> + spinlock_t lock; >>> + enum kcov_mode mode; >>> + unsigned long size; >>> + void *area; >>> + struct task_struct *t; >>> +}; >>> + >>> +/* Entry point from instrumented code. >>> + * This is called once per basic-block/edge. >>> + */ >>> +void __sanitizer_cov_trace_pc(void) >>> +{ >>> + struct task_struct *t; >>> + enum kcov_mode mode; >>> + >>> + t = current; >>> + /* We are interested in code coverage as a function of syscall inputs, >>> + * so we ignore code executed in interrupts. >>> + */ >>> + if (!t || in_interrupt()) >>> + return; >>> + mode = READ_ONCE(t->kcov_mode); >>> + if (mode == kcov_mode_trace) { >>> + u32 *area; >>> + u32 pos; >>> + >>> + /* There is some code that runs in interrupts but for which >>> + * in_interrupt() returns false (e.g. preempt_schedule_irq()). >>> + * READ_ONCE()/barrier() effectively provides load-acquire wrt >>> + * interrupts, there are paired barrier()/WRITE_ONCE() in >>> + * kcov_ioctl_locked(). >>> + */ >>> + barrier(); >>> + area = t->kcov_area; >>> + /* The first u32 is the number of subsequent PCs. 
*/ >>> + pos = READ_ONCE(area[0]) + 1; >>> + if (likely(pos < t->kcov_size)) { >>> + area[pos] = (u32)_RET_IP_; >>> + WRITE_ONCE(area[0], pos); >>> + } >>> + } >>> +} >>> +EXPORT_SYMBOL(__sanitizer_cov_trace_pc); >>> + >>> +static void kcov_put(struct kcov *kcov) >>> +{ >>> + if (atomic_dec_and_test(&kcov->rc)) { >>> + vfree(kcov->area); >>> + kfree(kcov); >>> + } >>> +} >>> + >>> +void kcov_task_init(struct task_struct *t) >>> +{ >>> + t->kcov_mode = 0; >>> + t->kcov_size = 0; >>> + t->kcov_area = NULL; >>> + t->kcov = NULL; >>> +} >>> + >>> +void kcov_task_exit(struct task_struct *t) >>> +{ >>> + struct kcov *kcov; >>> + >>> + kcov = t->kcov; >>> + if (kcov == NULL) >>> + return; >>> + spin_lock(&kcov->lock); >>> + BUG_ON(kcov->t != t); >>> + /* Just to not leave dangling references behind. */ >>> + kcov_task_init(t); >>> + kcov->t = NULL; >>> + spin_unlock(&kcov->lock); >>> + kcov_put(kcov); >>> +} >>> + >>> +static int kcov_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf) >>> +{ >>> + struct kcov *kcov; >>> + unsigned long off; >>> + struct page *page; >>> + >>> + /* Map the preallocated kcov->area. */ >>> + kcov = vma->vm_file->private_data; >>> + off = vmf->pgoff << PAGE_SHIFT; >>> + if (off >= kcov->size * sizeof(u32)) >>> + return -1; >>> + >>> + page = vmalloc_to_page(kcov->area + off); >>> + get_page(page); >>> + vmf->page = page; >>> + return 0; >>> +} >>> + >>> +static void kcov_unmap(struct vm_area_struct *vma) >>> +{ >>> + kcov_put(vma->vm_file->private_data); >>> +} >>> + >>> +static void kcov_map_copied(struct vm_area_struct *vma) >>> +{ >>> + struct kcov *kcov; >>> + >>> + kcov = vma->vm_file->private_data; >>> + atomic_inc(&kcov->rc); >>> +} >>> + >>> +static const struct vm_operations_struct kcov_vm_ops = { >>> + .fault = kcov_vm_fault, >>> + .close = kcov_unmap, >>> + /* Called on fork()/clone() when the mapping is copied. 
*/ >>> + .open = kcov_map_copied, >>> +}; >>> + >>> +static int kcov_mmap(struct file *filep, struct vm_area_struct *vma) >>> +{ >>> + int res = 0; >>> + void *area; >>> + struct kcov *kcov = vma->vm_file->private_data; >>> + >>> + /* Can't call vmalloc_user() under a spinlock. */ >>> + area = vmalloc_user(vma->vm_end - vma->vm_start); >>> + if (!area) >>> + return -ENOMEM; >>> + >>> + spin_lock(&kcov->lock); >>> + if (kcov->mode == 0 || vma->vm_pgoff != 0 || >>> + vma->vm_end - vma->vm_start != kcov->size * sizeof(u32)) { >>> + res = -EINVAL; >>> + goto exit; >>> + } >>> + if (!kcov->area) { >>> + kcov->area = area; >>> + area = NULL; >>> + } >>> + /* The file drops a reference on close, but the file >>> + * descriptor can be closed with the mmapping still alive so we keep >>> + * a reference for those. This is put in kcov_unmap(). >>> + */ >>> + atomic_inc(&kcov->rc); >>> + vma->vm_ops = &kcov_vm_ops; >>> +exit: >>> + spin_unlock(&kcov->lock); >>> + vfree(area); >>> + return res; >>> +} >>> + >>> +static int kcov_open(struct inode *inode, struct file *filep) >>> +{ >>> + struct kcov *kcov; >>> + >>> + kcov = kzalloc(sizeof(*kcov), GFP_KERNEL); >>> + if (!kcov) >>> + return -ENOMEM; >>> + atomic_set(&kcov->rc, 1); >>> + spin_lock_init(&kcov->lock); >>> + filep->private_data = kcov; >>> + return nonseekable_open(inode, filep); >>> +} >>> + >>> +static int kcov_close(struct inode *inode, struct file *filep) >>> +{ >>> + kcov_put(filep->private_data); >>> + return 0; >>> +} >>> + >>> +static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd, >>> + unsigned long arg) >>> +{ >>> + struct task_struct *t; >>> + >>> + switch (cmd) { >>> + case KCOV_INIT_TRACE: >>> + /* Enable kcov in trace mode and set up the buffer size. >>> + * Must happen before anything else. 
>>> + */ >>> + if (arg < 256 || arg > (128<<20) || arg & (arg - 1)) >>> + return -EINVAL; >>> + if (kcov->mode != 0) >>> + return -EBUSY; >>> + kcov->mode = kcov_mode_trace; >>> + kcov->size = arg; >>> + return 0; >>> + case KCOV_ENABLE: >>> + /* Enable coverage for the current task. >>> + * At this point the user must have enabled trace mode, >>> + * and mmapped the file. Coverage collection is disabled only >>> + * at task exit or voluntarily by KCOV_DISABLE. After that it can >>> + * be enabled for another task. >>> + */ >>> + if (kcov->mode == 0 || kcov->area == NULL) >>> + return -EINVAL; >>> + if (kcov->t != NULL) >>> + return -EBUSY; >>> + t = current; >>> + /* Cache in task struct for performance. */ >>> + t->kcov_size = kcov->size; >>> + t->kcov_area = kcov->area; >>> + /* See comment in __sanitizer_cov_trace_pc(). */ >>> + barrier(); >>> + WRITE_ONCE(t->kcov_mode, kcov->mode); >>> + t->kcov = kcov; >>> + kcov->t = t; >>> + /* This is put either in kcov_task_exit() or in KCOV_DISABLE. */ >>> + atomic_inc(&kcov->rc); >>> + return 0; >>> + case KCOV_DISABLE: >>> + /* Disable coverage for the current task. 
*/ >>> + if (current->kcov != kcov) >>> + return -EINVAL; >>> + t = current; >>> + BUG_ON(kcov->t != t); >>> + kcov_task_init(t); >>> + kcov->t = NULL; >>> + BUG_ON(atomic_dec_and_test(&kcov->rc)); >>> + return 0; >>> + default: >>> + return -EINVAL; >>> + } >>> +} >>> + >>> +static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) >>> +{ >>> + struct kcov *kcov; >>> + int res; >>> + >>> + kcov = filep->private_data; >>> + spin_lock(&kcov->lock); >>> + res = kcov_ioctl_locked(kcov, cmd, arg); >>> + spin_unlock(&kcov->lock); >>> + return res; >>> +} >>> + >>> +static const struct file_operations kcov_fops = { >>> + .open = kcov_open, >>> + .unlocked_ioctl = kcov_ioctl, >>> + .mmap = kcov_mmap, >>> + .release = kcov_close, >>> +}; >>> + >>> +static int __init kcov_init(void) >>> +{ >>> + if (!debugfs_create_file("kcov", 0666, NULL, NULL, &kcov_fops)) { >>> + pr_err("init failed\n"); >>> + return 1; >>> + } >>> + return 0; >>> +} >>> + >>> +device_initcall(kcov_init); >>> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile >>> index 8e96f6c..f816de9 100644 >>> --- a/kernel/locking/Makefile >>> +++ b/kernel/locking/Makefile >>> @@ -1,3 +1,6 @@ >>> +# Any varying coverage in these files is non-deterministic >>> +# and is generally not a function of system call inputs. >>> +KCOV_INSTRUMENT := n >>> >>> obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o >>> >>> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile >>> index 61a1656..032b2c0 100644 >>> --- a/kernel/rcu/Makefile >>> +++ b/kernel/rcu/Makefile >>> @@ -1,3 +1,7 @@ >>> +# Any varying coverage in these files is non-deterministic >>> +# and is generally not a function of system call inputs. 
>>> +KCOV_INSTRUMENT := n >>> + >>> obj-y += update.o sync.o >>> obj-$(CONFIG_SRCU) += srcu.o >>> obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o >>> diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile >>> index 6768797..f0a9265 100644 >>> --- a/kernel/sched/Makefile >>> +++ b/kernel/sched/Makefile >>> @@ -2,6 +2,10 @@ ifdef CONFIG_FUNCTION_TRACER >>> CFLAGS_REMOVE_clock.o = $(CC_FLAGS_FTRACE) >>> endif >>> >>> +# These files are disabled because they produce non-interesting flaky coverage >>> +# that is not a function of syscall inputs. E.g. involuntary context switches. >>> +KCOV_INSTRUMENT := n >>> + >>> ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y) >>> # According to Alan Modra , the -fno-omit-frame-pointer is >>> # needed for x86 only. Why this used to be enabled for all architectures is beyond >>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug >>> index c98e93c..cb71e25 100644 >>> --- a/lib/Kconfig.debug >>> +++ b/lib/Kconfig.debug >>> @@ -670,6 +670,33 @@ config DEBUG_STACKOVERFLOW >>> >>> If in doubt, say "N". >>> >>> +config ARCH_HAS_KCOV >>> + bool >>> + >>> +if ARCH_HAS_KCOV Ah! I see it now, I'd missed this if/endif section. I think it would be more readable to enhance the "config ARCH_HAS_KCOV" to have a help section that describes what an architecture needs to do to support KCOV (in this case, "test it at all"), and then instead of the if/endif wrapping, add it to the "depends" line: >>> + >>> +config KCOV >>> + bool "Code coverage for fuzzing" >>> + depends on !RANDOMIZE_BASE e.g.: depends on !RANDOMIZE_BASE && ARCH_HAS_KCOV >>> + default n >> >> Minor nit: "default n" is redundant. > > Will address this in a next version. > > >>> + help >>> + KCOV exposes kernel code coverage information in a form suitable >>> + for coverage-guided fuzzing (randomized testing). >>> + >>> + RANDOMIZE_BASE is not supported. KCOV exposes PC values that are meant >>> + to be stable on different machines and across reboots. 
RANDOMIZE_BASE >>> + breaks this assumption. Potentially it can be supported by subtracting >>> + _stext from [_stext, _send), but it is more tricky (and slow) for >>> + modules. >> >> In the future, it'd be nice if the kASLR conflict were run-time >> selectable instead of build-time selectable (as done for hibernation). > > I think in the future we will just support KASLR one way or another. > It is required for Android. It will slowdown coverage a bit, but that > code will be under #ifdef CONFIG_RANDOMIZE_BASE. Sounds good! > > > >>> + >>> + KCOV does not have any arch-specific code, but currently it is enabled >>> + only for x86_64. KCOV requires testing on other archs, and most likely >>> + disabling of instrumentation for some early boot code. >> >> I don't see where this is enforced. Should this say "is tested only on >> x86_64" instead of "enabled"? > > It is enforced with ARCH_HAS_KCOV. Thanks! -Kees > > > Thanks for the review! > >>> + >>> + For more details, see Documentation/kcov.txt. >>> + >>> +endif >>> + >>> source "lib/Kconfig.kmemcheck" >>> >>> source "lib/Kconfig.kasan" >>> diff --git a/lib/Makefile b/lib/Makefile >>> index 7f1de26..bfcc12e 100644 >>> --- a/lib/Makefile >>> +++ b/lib/Makefile >>> @@ -7,6 +7,15 @@ ORIG_CFLAGS := $(KBUILD_CFLAGS) >>> KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)) >>> endif >>> >>> +# These files are disabled because they produce lots of non-interesting and/or >>> +# flaky coverage that is not a function of syscall inputs. For example, >>> +# rbtree can be global and individual rotations don't correlate with inputs. 
>>> +KCOV_INSTRUMENT_string.o := n >>> +KCOV_INSTRUMENT_rbtree.o := n >>> +KCOV_INSTRUMENT_list_debug.o := n >>> +KCOV_INSTRUMENT_debugobjects.o := n >>> +KCOV_INSTRUMENT_dynamic_debug.o := n >>> + >>> lib-y := ctype.o string.o vsprintf.o cmdline.o \ >>> rbtree.o radix-tree.o dump_stack.o timerqueue.o\ >>> idr.o int_sqrt.o extable.o \ >>> diff --git a/mm/Makefile b/mm/Makefile >>> index 2ed4319..cf751bb 100644 >>> --- a/mm/Makefile >>> +++ b/mm/Makefile >>> @@ -5,6 +5,21 @@ >>> KASAN_SANITIZE_slab_common.o := n >>> KASAN_SANITIZE_slub.o := n >>> >>> +# These files are disabled because they produce non-interesting and/or >>> +# flaky coverage that is not a function of syscall inputs. E.g. slab is out of >>> +# free pages, or a task is migrated between nodes. >>> +KCOV_INSTRUMENT_slab_common.o := n >>> +KCOV_INSTRUMENT_slob.o := n >>> +KCOV_INSTRUMENT_slab.o := n >>> +KCOV_INSTRUMENT_slub.o := n >>> +KCOV_INSTRUMENT_page_alloc.o := n >>> +KCOV_INSTRUMENT_debug-pagealloc.o := n >>> +KCOV_INSTRUMENT_kmemleak.o := n >>> +KCOV_INSTRUMENT_kmemcheck.o := n >>> +KCOV_INSTRUMENT_memcontrol.o := n >>> +KCOV_INSTRUMENT_mmzone.o := n >>> +KCOV_INSTRUMENT_vmstat.o := n >>> + >>> mmu-y := nommu.o >>> mmu-$(CONFIG_MMU) := gup.o highmem.o memory.o mincore.o \ >>> mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \ >>> diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile >>> index 6471014..ad97f0b 100644 >>> --- a/mm/kasan/Makefile >>> +++ b/mm/kasan/Makefile >>> @@ -1,4 +1,5 @@ >>> KASAN_SANITIZE := n >>> +KCOV_INSTRUMENT := n >>> >>> CFLAGS_REMOVE_kasan.o = -pg >>> # Function splitter causes unnecessary splits in __asan_load1/__asan_store1 >>> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib >>> index 79e8661..ebf6f1b 100644 >>> --- a/scripts/Makefile.lib >>> +++ b/scripts/Makefile.lib >>> @@ -129,6 +129,12 @@ _c_flags += $(if $(patsubst n%,, \ >>> $(CFLAGS_KASAN)) >>> endif >>> >>> +ifeq ($(CONFIG_KCOV),y) >>> +_c_flags += $(if $(patsubst n%,, \ >>> + 
$(KCOV_INSTRUMENT_$(basetarget).o)$(KCOV_INSTRUMENT)y), \ >>> + $(CFLAGS_KCOV)) >>> +endif >>> + >>> # If building the kernel in a separate objtree expand all occurrences >>> # of -Idir to -I$(srctree)/dir except for absolute paths (starting with '/'). >>> >>> -- >>> 2.6.0.rc2.230.g3dd15c0 >>> >> >> Very cool! :) >> >> -Kees >> >> -- >> Kees Cook >> Chrome OS & Brillo Security -- Kees Cook Chrome OS & Brillo Security