kvm.vger.kernel.org archive mirror
* [v2, 0/4] x86 instruction emulator fuzzing
@ 2019-06-12 15:35 Sam Caccavale
  2019-06-12 15:35 ` [v2, 1/4] Build target for emulate.o as a userspace binary Sam Caccavale
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Sam Caccavale @ 2019-06-12 15:35 UTC (permalink / raw)
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel,
	Sam Caccavale

Dear all,

This series aims to provide a userspace entry point for KVM's x86 instruction
emulator and to fuzz the emulator through it.  It mirrors Xen's application of
the AFL fuzzer to its instruction emulator, in the hope of discovering
vulnerabilities.  Since this entry point also allows arbitrary execution of the
emulator's code from userspace, it may also be useful for testing.

The current 4 patches build the emulator and 2 harnesses: simple-harness is
an example of unit testing; afl-harness is a frontend for the AFL fuzzer.
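
For readers skimming the series: afl-harness reads a bounded amount of
input and only runs the emulator if the whole file fit.  A standalone
sketch of that read-then-check-EOF pattern follows.  It is not code from
the patches, and demo_read_input is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of this series: read at most cap bytes
 * from fp into buf and store the number of bytes read in *len.
 *
 * Returns 0 if the whole input fit (EOF was reached), -1 on a read error
 * or if the read stopped at cap without reaching EOF.  An input of exactly
 * cap bytes is typically rejected too, since fread() returns before it
 * observes EOF.
 */
int demo_read_input(FILE *fp, unsigned char *buf, size_t cap, size_t *len)
{
	assert(fp && buf && len);

	*len = fread(buf, 1, cap, fp);

	if (ferror(fp))
		return -1;

	return feof(fp) ? 0 : -1;
}
```

A caller would pass a buffer of the maximum input size and skip (or exit
on) any file for which this returns -1, which is roughly what the "Input
too large" branch in afl-harness.c does.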

Patches
=======

- 01: Builds and links afl-harness with the required kernel objects.
- 02: Introduces the minimal set of emulator operations and supporting code
to emulate simple instructions.
- 03: Demonstrates simple-harness as a unit test.
- 04: Adds scripts for installing, running, and crash triage.

Any comments/suggestions are greatly appreciated.

Best,
Sam Caccavale

Sam Caccavale (4):
  Build target for emulate.o as a userspace binary
  Emulate simple x86 instructions in userspace
  Demonstrating unit testing via simple-harness
  Added scripts for filtering, building, deploying

 tools/Makefile                                |   9 +
 tools/fuzz/x86ie/.gitignore                   |   2 +
 tools/fuzz/x86ie/Makefile                     |  54 +++
 tools/fuzz/x86ie/README.md                    |  12 +
 tools/fuzz/x86ie/afl-harness.c                | 151 +++++++
 tools/fuzz/x86ie/common.h                     |  87 ++++
 tools/fuzz/x86ie/emulator_ops.c               | 398 ++++++++++++++++++
 tools/fuzz/x86ie/emulator_ops.h               | 120 ++++++
 tools/fuzz/x86ie/scripts/afl-many             |  28 ++
 tools/fuzz/x86ie/scripts/bin.sh               |  49 +++
 tools/fuzz/x86ie/scripts/build.sh             |  32 ++
 tools/fuzz/x86ie/scripts/coalesce.sh          |   6 +
 tools/fuzz/x86ie/scripts/deploy.sh            |   9 +
 tools/fuzz/x86ie/scripts/deploy_remote.sh     |   9 +
 tools/fuzz/x86ie/scripts/gen_output.sh        |  11 +
 tools/fuzz/x86ie/scripts/install_afl.sh       |  14 +
 .../fuzz/x86ie/scripts/install_deps_ubuntu.sh |   5 +
 tools/fuzz/x86ie/scripts/rebuild.sh           |   6 +
 tools/fuzz/x86ie/scripts/run.sh               |  10 +
 tools/fuzz/x86ie/scripts/summarize.sh         |   9 +
 tools/fuzz/x86ie/simple-harness.c             |  42 ++
 tools/fuzz/x86ie/stubs.c                      |  56 +++
 tools/fuzz/x86ie/stubs.h                      |  52 +++
 23 files changed, 1171 insertions(+)
 create mode 100644 tools/fuzz/x86ie/.gitignore
 create mode 100644 tools/fuzz/x86ie/Makefile
 create mode 100644 tools/fuzz/x86ie/README.md
 create mode 100644 tools/fuzz/x86ie/afl-harness.c
 create mode 100644 tools/fuzz/x86ie/common.h
 create mode 100644 tools/fuzz/x86ie/emulator_ops.c
 create mode 100644 tools/fuzz/x86ie/emulator_ops.h
 create mode 100755 tools/fuzz/x86ie/scripts/afl-many
 create mode 100755 tools/fuzz/x86ie/scripts/bin.sh
 create mode 100755 tools/fuzz/x86ie/scripts/build.sh
 create mode 100755 tools/fuzz/x86ie/scripts/coalesce.sh
 create mode 100644 tools/fuzz/x86ie/scripts/deploy.sh
 create mode 100755 tools/fuzz/x86ie/scripts/deploy_remote.sh
 create mode 100755 tools/fuzz/x86ie/scripts/gen_output.sh
 create mode 100755 tools/fuzz/x86ie/scripts/install_afl.sh
 create mode 100755 tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
 create mode 100755 tools/fuzz/x86ie/scripts/rebuild.sh
 create mode 100755 tools/fuzz/x86ie/scripts/run.sh
 create mode 100755 tools/fuzz/x86ie/scripts/summarize.sh
 create mode 100644 tools/fuzz/x86ie/simple-harness.c
 create mode 100644 tools/fuzz/x86ie/stubs.c
 create mode 100644 tools/fuzz/x86ie/stubs.h

--
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879





* [v2, 1/4] Build target for emulate.o as a userspace binary
  2019-06-12 15:35 [v2, 0/4] x86 instruction emulator fuzzing Sam Caccavale
@ 2019-06-12 15:35 ` Sam Caccavale
  2019-06-21 13:33   ` Alexander Graf
  2019-06-12 15:35 ` [v2, 2/4] Emulate simple x86 instructions in userspace Sam Caccavale
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Sam Caccavale @ 2019-06-12 15:35 UTC (permalink / raw)
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel,
	Sam Caccavale

This commit contains the minimal set of functionality needed to build
afl-harness around arch/x86/kvm/emulate.c, which allows exercising code
in that source file, such as x86_emulate_insn.  The header dependencies
were resolved by get_headers.py using GCC's -H flag.
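
emulator_ops.h embeds the emulation context inside the harness state, so
a callback that is handed only the context can recover the enclosing
state with container_of (the get_state macro).  A standalone sketch of
that pattern follows.  All demo_* names are hypothetical stand-ins, and
the macro is a simplified variant without the typeof check.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_REGS 16

/* Hypothetical stand-in for struct x86_emulate_ctxt. */
struct demo_ctxt {
	unsigned long eip;
};

/* Hypothetical stand-in for the harness's struct state. */
struct demo_state {
	unsigned long regs[DEMO_NR_REGS];
	struct demo_ctxt ctxt;	/* embedded by value, not pointed to */
};

/* Simplified container_of: step back from the member to its container. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Callback in the style of an emulator op: it only receives the context. */
unsigned long demo_read_gpr(struct demo_ctxt *ctxt, unsigned int reg)
{
	struct demo_state *state =
		demo_container_of(ctxt, struct demo_state, ctxt);

	assert(reg < DEMO_NR_REGS);
	return state->regs[reg];
}
```

This only works because the context lives inside the state structure.  A
context allocated separately would make the pointer arithmetic invalid.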

CR: https://code.amazon.com/reviews/CR-8325546
---
 tools/Makefile                  |   9 ++
 tools/fuzz/x86ie/.gitignore     |   2 +
 tools/fuzz/x86ie/Makefile       |  51 +++++++++++
 tools/fuzz/x86ie/README.md      |  12 +++
 tools/fuzz/x86ie/afl-harness.c  | 149 ++++++++++++++++++++++++++++++++
 tools/fuzz/x86ie/common.h       |  87 +++++++++++++++++++
 tools/fuzz/x86ie/emulator_ops.c |  58 +++++++++++++
 tools/fuzz/x86ie/emulator_ops.h | 117 +++++++++++++++++++++++++
 tools/fuzz/x86ie/stubs.c        |  56 ++++++++++++
 tools/fuzz/x86ie/stubs.h        |  52 +++++++++++
 10 files changed, 593 insertions(+)
 create mode 100644 tools/fuzz/x86ie/.gitignore
 create mode 100644 tools/fuzz/x86ie/Makefile
 create mode 100644 tools/fuzz/x86ie/README.md
 create mode 100644 tools/fuzz/x86ie/afl-harness.c
 create mode 100644 tools/fuzz/x86ie/common.h
 create mode 100644 tools/fuzz/x86ie/emulator_ops.c
 create mode 100644 tools/fuzz/x86ie/emulator_ops.h
 create mode 100644 tools/fuzz/x86ie/stubs.c
 create mode 100644 tools/fuzz/x86ie/stubs.h

diff --git a/tools/Makefile b/tools/Makefile
index 3dfd72ae6c1a..4d68817b7e49 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -94,6 +94,12 @@ freefall: FORCE
 kvm_stat: FORCE
 	$(call descend,kvm/$@)

+fuzz: FORCE
+	$(call descend,fuzz/x86ie)
+
+fuzz_deps: FORCE
+	$(call descend,fuzz/x86ie,fuzz_deps)
+
 all: acpi cgroup cpupower gpio hv firewire liblockdep \
 		perf selftests spi turbostat usb \
 		virtio vm bpf x86_energy_perf_policy \
@@ -171,6 +177,9 @@ tmon_clean:
 freefall_clean:
 	$(call descend,laptop/freefall,clean)

+fuzz_clean:
+	$(call descend,fuzz/x86ie,clean)
+
 build_clean:
 	$(call descend,build,clean)

diff --git a/tools/fuzz/x86ie/.gitignore b/tools/fuzz/x86ie/.gitignore
new file mode 100644
index 000000000000..7d44f7ce266e
--- /dev/null
+++ b/tools/fuzz/x86ie/.gitignore
@@ -0,0 +1,2 @@
+*.o
+*-harness
diff --git a/tools/fuzz/x86ie/Makefile b/tools/fuzz/x86ie/Makefile
new file mode 100644
index 000000000000..d45fe6d266b9
--- /dev/null
+++ b/tools/fuzz/x86ie/Makefile
@@ -0,0 +1,51 @@
+ROOT_DIR=../../..
+THIS_DIR=tools/fuzz/x86ie
+
+include ../../scripts/Makefile.include
+
+.DEFAULT_GOAL := all
+
+INCLUDES := $(patsubst -I./%,-I./$(ROOT_DIR)/%, $(LINUXINCLUDE))
+INCLUDES := $(patsubst ./include/%,./$(ROOT_DIR)/include/%, $(INCLUDES))
+INCLUDES += -include ./$(ROOT_DIR)/include/linux/compiler_types.h
+
+$(ROOT_DIR)/.config:
+	make -C $(ROOT_DIR) menuconfig
+	sed -i -r 's/^#? *CONFIG_KVM(.*)=.*/CONFIG_KVM\1=y/' $(ROOT_DIR)/.config
+
+
+KBUILD_CFLAGS += -fsanitize=address
+BAD_FLAGS := -mcmodel=kernel # Causes all kinds of errors in a userspace bin
+BAD_FLAGS += -mpreferred-stack-boundary=3 # Similar to ^, breaks ubuntu 16 too
+BAD_FLAGS += -mno-sse # stdlibs use sse, would like to have it
+
+ifdef DEBUG
+KBUILD_CFLAGS += -DDEBUG
+KBUILD_CFLAGS += -O0
+BAD_FLAGS += -O3
+BAD_FLAGS += -O2
+BAD_FLAGS += -O1
+endif
+
+KBUILD_CFLAGS := $(filter-out $(BAD_FLAGS),$(KBUILD_CFLAGS))
+
+KERNEL_OBJS := arch/x86/kvm/emulate.o \
+		arch/x86/lib/retpoline.o \
+		lib/find_bit.o
+KERNEL_OBJS := $(patsubst %,$(ROOT_DIR)/%, $(KERNEL_OBJS))
+# $(KERNEL_OBJS): $(ROOT_DIR)/.config
+# 	$(error Run `./tools/fuzz/x86ie/scripts/make_deps' first (setting envvar CC if desired).)
+
+DEPS := emulator_ops.h stubs.h common.h
+%.o: %.c $(DEPS)
+	$(CC) $(KBUILD_CFLAGS) $(INCLUDES) -g -c -o $@ $<
+
+LOCAL_OBJS := emulator_ops.o stubs.o
+afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
+	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
+
+all: afl-harness
+
+.PHONY: clean
+clean:
+	$(RM) -r *.o afl-harness
diff --git a/tools/fuzz/x86ie/README.md b/tools/fuzz/x86ie/README.md
new file mode 100644
index 000000000000..a78ac51b0152
--- /dev/null
+++ b/tools/fuzz/x86ie/README.md
@@ -0,0 +1,12 @@
+# Building
+
+From the root of linux, run:
+1. `./tools/fuzz/x86ie/scripts/make_deps`
+  - Optionally, set the envvar CC with your desired compiler.
+2. `make tools/fuzz`
+
+## TODO
+
+I'd like to add the object files built in `make_deps` as real
+dependencies of `make tools/fuzz` but it causes a cycle in Make to
+add them to `tools/fuzz/x86ie/Makefile`.
\ No newline at end of file
diff --git a/tools/fuzz/x86ie/afl-harness.c b/tools/fuzz/x86ie/afl-harness.c
new file mode 100644
index 000000000000..b3b09d7f15f2
--- /dev/null
+++ b/tools/fuzz/x86ie/afl-harness.c
@@ -0,0 +1,149 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * x86 Instruction Emulation Fuzzing Wrapper
+ *
+ * Authors:
+ *   Sam Caccavale   <samcacc@amazon.de>
+ *
+ * Supporting code from xen:
+ *  xen/tools/fuzz/x86_instruction_emulation/afl-harness.c
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * From: xen/master f68f35fd2016e36ee30f8b3e7dfd46c554407ac1
+ */
+
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+#include "emulator_ops.h"
+
+/* Arbitrary, but limiting fuzz input size is important. */
+#define MAX_INPUT_SIZE 4096
+#define INSTRUCTION_BYTES (MAX_INPUT_SIZE - MIN_INPUT_SIZE)
+
+int main(int argc, char **argv)
+{
+	size_t size;
+	FILE *fp = NULL;
+	int max, count;
+	struct state *state;
+
+	setbuf(stdin, NULL);
+	setbuf(stdout, NULL);
+
+	while (1) {
+		enum { OPT_INPUT_SIZE,
+		};
+		static const struct option lopts[] = {
+			{ "input-size", no_argument, NULL, OPT_INPUT_SIZE },
+			{ 0, 0, 0, 0 }
+		};
+		int c = getopt_long_only(argc, argv, "", lopts, NULL);
+
+		if (c == -1)
+			break;
+
+		switch (c) {
+		case OPT_INPUT_SIZE:
+			printf("Min: %u\n", MIN_INPUT_SIZE);
+			printf("Max: %u\n", MAX_INPUT_SIZE);
+			exit(0);
+			break;
+
+		case '?':
+			printf("Usage: %s $FILE [$FILE...] | [--input-size]\n",
+			       argv[0]);
+			exit(-1);
+			break;
+
+		default:
+			printf("Bad getopt return %d (%c)\n", c, c);
+			exit(-1);
+			break;
+		}
+	}
+
+	max = argc - optind;
+
+	if (!max) { /* No positional parameters.  Use stdin. */
+		max = 1;
+		fp = stdin;
+	}
+
+	state = create_emulator();
+	state->data = malloc(INSTRUCTION_BYTES);
+
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+	__AFL_INIT();
+
+	/*
+	 * This is the number of times AFL's forkserver should reuse a
+	 * process to fuzz the target.  1000 is the recommended starting
+	 * point.  Future tweaking may or may not yield better results.
+	 */
+	for (count = 0; __AFL_LOOP(1000);)
+#else
+	for (count = 0; count < max; count++)
+#endif
+	{
+		if (fp != stdin) { /* If not stdin, open the provided file. */
+			printf("Opening file %s\n", argv[optind + count]);
+			fp = fopen(argv[optind + count], "rb");
+			if (fp == NULL) {
+				perror("fopen");
+				exit(-1);
+			}
+		}
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+		else {
+			/*
+			 * This will ensure we're dealing with a clean stream
+			 * state after the afl-fuzz process messes with the
+			 * open file handle.
+			 */
+			fseek(fp, 0, SEEK_SET);
+		}
+#endif
+		reset_emulator(state);
+
+		size = fread(state, 1, MIN_INPUT_SIZE, fp);
+		if (size != MIN_INPUT_SIZE) {
+			printf("Input does not populate state\n");
+			if (max == 1)
+				exit(-1);
+		}
+
+		size = fread(state->data, 1, INSTRUCTION_BYTES, fp);
+		state->data_available = size;
+
+		if (ferror(fp)) {
+			perror("fread");
+			exit(-1);
+		}
+
+		/* Only run the test if the input was smaller than MAX_INPUT_SIZE */
+		if (feof(fp)) {
+			initialize_emulator(state);
+			emulate_until_complete(state);
+		} else {
+			printf("Input too large\n");
+			/* Don't exit if we're doing batch processing */
+			if (max == 1)
+				exit(-1);
+		}
+
+		if (fp != stdin) {
+			fclose(fp);
+			fp = NULL;
+		}
+	}
+
+	free(state->data);
+	free_emulator(state);
+	return 0;
+}
diff --git a/tools/fuzz/x86ie/common.h b/tools/fuzz/x86ie/common.h
new file mode 100644
index 000000000000..9aec2bea3d50
--- /dev/null
+++ b/tools/fuzz/x86ie/common.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <stdint.h>
+
+#define PRIx64 "llx"
+#define MSR_INDEX_MAX 16
+
+#ifdef DEBUG
+#define DEBUG_PRINT 1
+#else
+#define DEBUG_PRINT 0
+#endif
+
+#define debug(...)							\
+	do {								\
+		if (DEBUG_PRINT)					\
+			fprintf(stderr, __VA_ARGS__);			\
+	} while (0)
+
+enum x86_segment {
+	/* General purpose. */
+	x86_seg_es,
+	x86_seg_cs,
+	x86_seg_ss,
+	x86_seg_ds,
+	x86_seg_fs,
+	x86_seg_gs,
+	/* System: Valid to use for implicit table references. */
+	x86_seg_tr,
+	x86_seg_ldtr,
+	x86_seg_gdtr,
+	x86_seg_idtr,
+	/* No Segment: For accesses which are already linear. */
+	x86_seg_none
+};
+
+#define NR_SEG x86_seg_none
+
+struct segment_register {
+	uint16_t sel;
+	union {
+		uint16_t attr;
+		struct {
+			uint16_t type : 4;
+			uint16_t s : 1;
+			uint16_t dpl : 2;
+			uint16_t p : 1;
+			uint16_t avl : 1;
+			uint16_t l : 1;
+			uint16_t db : 1;
+			uint16_t g : 1;
+			uint16_t pad : 4;
+		};
+	};
+	uint32_t limit;
+	uint64_t base;
+};
+
+enum user_regs {
+	REGS_RAX,
+	REGS_RCX,
+	REGS_RDX,
+	REGS_RBX,
+	REGS_RSP,
+	REGS_RBP,
+	REGS_RSI,
+	REGS_RDI,
+	REGS_R8,
+	REGS_R9,
+	REGS_R10,
+	REGS_R11,
+	REGS_R12,
+	REGS_R13,
+	REGS_R14,
+	REGS_R15,
+	NR_REGS
+};
+#define NR_VCPU_REGS NR_REGS
+
+static inline void print_n_bytes(unsigned char *bytes, size_t n)
+{
+	size_t i;
+
+	for (i = 0; i < n; i++)
+		fprintf(stderr, " %02x", bytes[i]);
+	fprintf(stderr, "\n");
+}
diff --git a/tools/fuzz/x86ie/emulator_ops.c b/tools/fuzz/x86ie/emulator_ops.c
new file mode 100644
index 000000000000..55ae4e8fbd96
--- /dev/null
+++ b/tools/fuzz/x86ie/emulator_ops.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * x86 Instruction Emulation Fuzzing Wrapper
+ *
+ * Authors:
+ *   Sam Caccavale   <samcacc@amazon.de>
+ *
+ * Supporting code from xen:
+ *  xen/tools/fuzz/x86_instruction_emulation/fuzz_emul.c
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * From: xen/master f68f35fd2016e36ee30f8b3e7dfd46c554407ac1
+ */
+
+#include <string.h>
+#include <stdio.h>
+#include <assert.h>
+#include <stdlib.h>
+
+#include "stubs.h"
+#include "emulator_ops.h"
+
+#include <linux/compiler_attributes.h>
+
+#include <asm/kvm_emulate.h>
+#include <asm/processor-flags.h>
+#include <asm/user_64.h>
+#include <asm/kvm.h>
+
+void initialize_emulator(struct state *state)
+{
+}
+
+int step_emulator(struct state *state)
+{
+	return 0;
+}
+
+int emulate_until_complete(struct state *state)
+{
+	return 0;
+}
+
+struct state *create_emulator(void)
+{
+	struct state *state = malloc(sizeof(struct state));
+	return state;
+}
+
+void reset_emulator(struct state *state)
+{
+}
+
+void free_emulator(struct state *state)
+{
+}
diff --git a/tools/fuzz/x86ie/emulator_ops.h b/tools/fuzz/x86ie/emulator_ops.h
new file mode 100644
index 000000000000..5ae072d5f205
--- /dev/null
+++ b/tools/fuzz/x86ie/emulator_ops.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef EMULATOR_OPS_H
+#define EMULATOR_OPS_H
+
+#include "common.h"
+#include "stubs.h"
+#include <asm/kvm_emulate.h>
+
+struct vcpu {
+	unsigned long cr[5];
+	uint64_t msr[MSR_INDEX_MAX];
+	uint64_t regs[NR_REGS];
+	uint64_t rflags;
+	struct segment_register segments[NR_SEG];
+};
+
+/*
+ * Internal state of the emulate harness.  Calculated initially from the input
+ * corpus, and later mutated by the emulation callbacks.
+ */
+struct state {
+	/* Bitmask which enables/disables hooks. */
+	unsigned long options;
+
+	/* Internal representation of emulated CPU. */
+	struct vcpu vcpu;
+
+	/*
+	 * Input bytes are consumed at s.data[eip + s.other_bytes_consumed]
+	 * while eip + size_requested + other_bytes_consumed < data_available
+	 *
+	 * emul_fetch consumes bytes for use as x86 instructions as eip grows
+	 *
+	 * get_bytes_and_increment consumes bytes and increments
+	 * other_bytes_consumed.  These bytes can be used as return values for
+	 * memory reads, random chances to fail, or other purposes.
+	 *
+	 *  Only these two functions should be used to access this data.
+	 *
+	 * This causes .data to be an interspliced source of instructions and
+	 * other data.  Xen's instruction emulation does this to provide
+	 * deterministic randomness on fuzz runs at the cost of complexity to
+	 * crash output.
+	 *
+	 * Other alternatives (two separate streams, getting ins bytes from
+	 * low, and random bytes from high, etc) yield byte streams which may
+	 * not bear an as close correlation with AFL's input.  This was chosen
+	 * since the only drawback to this approach is remedied by simple
+	 * bookkeeping of instruction bytes when debugging crashes.  This is
+	 * enabled by the TODO flag.
+	 */
+	unsigned char *data;
+
+	/* Real amount of data backing state->data[]. */
+	size_t data_available;
+
+	/*
+	 * Amount of bytes consumed for purposes other than instructions.
+	 * E.G. whether a memory access should fault.
+	 */
+	size_t other_bytes_consumed;
+
+	/* Emulation context */
+	struct x86_emulate_ctxt ctxt;
+};
+
+#define MIN_INPUT_SIZE (offsetof(struct state, data))
+
+#define container_of(ptr, type, member)					\
+	({								\
+		const typeof(((type *)0)->member) * __mptr = (ptr);	\
+		(type *)((char *)__mptr - offsetof(type, member));	\
+	})
+
+#define get_state(h) container_of(h, struct state, ctxt)
+
+void buffer_stderr(void) __attribute__((constructor));
+
+/*
+ * Allocates space for, and creates a `struct state`.  The user should set
+ * state->data to their instruction stream and do any modification of the
+ * state->vcpu if desired.
+ */
+extern struct state *create_emulator(void);
+
+/*
+ * memset's all fields of state except data and data_available.
+ */
+extern void reset_emulator(struct state *state);
+
+/*
+ * free_emulator does not free the state->data member, the user should free
+ * it before freeing the emulator.  The alternative implementation either
+ * restrains data to a fixed size or has the user pass in a (pointer, size)
+ * pair which would have to be copied.  Copying this would be slow to fuzz.
+ */
+extern void free_emulator(struct state *state);
+
+/*
+ * Uses the state->option field to disable certain functionality and
+ * initializes the state->ctxt to a valid state.  This is optional if
+ * intentionally testing the emulator in an invalid state.
+ */
+extern void initialize_emulator(struct state *state);
+
+/*
+ * Advances the emulator one instruction and handles any exceptions.
+ */
+extern int step_emulator(struct state *state);
+
+/*
+ * Steps the emulator until no instructions remain or something fails.
+ */
+extern int emulate_until_complete(struct state *state);
+
+#endif /* ifdef EMULATOR_OPS_H */
diff --git a/tools/fuzz/x86ie/stubs.c b/tools/fuzz/x86ie/stubs.c
new file mode 100644
index 000000000000..d2ffeb3ec599
--- /dev/null
+++ b/tools/fuzz/x86ie/stubs.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * These functions/symbols are required to build emulate.o but belong to
+ * Linux tree files which include many other headers and unnecessary
+ * symbols; pulling those in would make building emulate.o far more
+ * complicated than stubbing them out here.  In order to stub them without
+ * modifying the included source, we're declaring many of them here.
+ */
+#include <linux/types.h>
+
+#include "stubs.h"
+
+#define VMWARE_BACKDOOR_PMC_HOST_TSC 0x10000
+#define VMWARE_BACKDOOR_PMC_REAL_TIME 0x10001
+#define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002
+
+/*
+ * This is easy enough to stub out its full functionality.
+ */
+bool is_vmware_backdoor_pmc(u32 pmc_idx)
+{
+	switch (pmc_idx) {
+	case VMWARE_BACKDOOR_PMC_HOST_TSC:
+	case VMWARE_BACKDOOR_PMC_REAL_TIME:
+	case VMWARE_BACKDOOR_PMC_APPARENT_TIME:
+		return true;
+	}
+	return false;
+}
+
+/*
+ * Printk has no side effects so we don't need to worry about it.
+ */
+int printk(const char *s, ...)
+{
+	return 0;
+}
+
+/*
+ * This is required by source included from emulate.c and would be linked in.
+ */
+bool enable_vmware_backdoor;
+
+struct exception_table_entry;
+struct pt_regs;
+
+/*
+ * TODO: we likely need to implement this to handle emulated exceptions.
+ */
+bool ex_handler_default(const struct exception_table_entry *fixup,
+			struct pt_regs *regs, int trapnr,
+			unsigned long error_code, unsigned long fault_addr)
+{
+	return true;
+}
diff --git a/tools/fuzz/x86ie/stubs.h b/tools/fuzz/x86ie/stubs.h
new file mode 100644
index 000000000000..02c2f6f9bc26
--- /dev/null
+++ b/tools/fuzz/x86ie/stubs.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef STUBS_H
+#define STUBS_H
+
+
+/*
+ * Several typedefs in linux/types.h collide with ones in standard libs.
+ *
+ * kvm_emulate.h uses many other types in linux/types.h heavily, so we must
+ * include it but having access to the standard libs is also important for the
+ * harnesses.
+ *
+ * A solution would be to not include any kernel files in the harness but then
+ * useful introspection into state->ctxt is impossible as x86_emulate_ctxt is
+ * defined in kvm_emulate.h and uses a ton of linux/types.h types.
+ *
+ * Instead, we use dummy defines to avoid the collision and linux seems
+ * content to use the equivalent types from the standard libs.
+ */
+#define fd_set fake_fd_set
+#define dev_t fake_dev_t
+#define nlink_t fake_nlink_t
+#define timer_t fake_timer_t
+#define loff_t fake_loff_t
+#define u_int64_t fake_u_int64_t
+#define int64_t fake_int64_t
+#define blkcnt_t fake_blkcnt_t
+#define uint64_t fake_uint64_t
+#include <linux/types.h>
+#undef fd_set
+#undef dev_t
+#undef nlink_t
+#undef timer_t
+#undef loff_t
+#undef u_int64_t
+#undef int64_t
+#undef blkcnt_t
+#undef uint64_t
+
+/*
+ * These are identically defined in string.h and included kernel headers.
+ */
+#ifdef __always_inline
+#undef __always_inline
+#endif
+
+#ifdef __attribute_const__
+#undef __attribute_const__
+#endif
+
+#endif /* STUBS_H */
--
2.17.1





* [v2, 2/4] Emulate simple x86 instructions in userspace
  2019-06-12 15:35 [v2, 0/4] x86 instruction emulator fuzzing Sam Caccavale
  2019-06-12 15:35 ` [v2, 1/4] Build target for emulate.o as a userspace binary Sam Caccavale
@ 2019-06-12 15:35 ` Sam Caccavale
  2019-06-21 13:40   ` Alexander Graf
  2019-06-12 15:35 ` [v2, 3/4] Demonstrating unit testing via simple-harness Sam Caccavale
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Sam Caccavale @ 2019-06-12 15:35 UTC (permalink / raw)
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel,
	Sam Caccavale

Add the minimal subset of code needed to run afl-harness with a binary file
as input.  The input bytes are used to populate the vcpu structure and then
as an instruction stream for the emulator.  It does not attempt to handle
exceptions and only supports very simple ops.
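
As a rough illustration of that input layout, the sketch below splits a
buffer into a fixed-size header followed by a byte stream and hands out
stream bytes with a bounds check, in the spirit of
get_bytes_and_increment.  It is standalone and not code from this patch.
The demo_* names and the header layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the fixed-size part of the harness state. */
struct demo_header {
	unsigned long options;
	unsigned long regs[4];
};

struct demo_input {
	struct demo_header hdr;		/* filled from the first bytes */
	const unsigned char *data;	/* remaining bytes: the stream */
	size_t data_available;
	size_t bytes_consumed;
};

/* Split buf into header + stream.  Returns 0, or -1 if buf is too short. */
int demo_load(struct demo_input *in, const unsigned char *buf, size_t len)
{
	if (len < sizeof(in->hdr))
		return -1;

	memcpy(&in->hdr, buf, sizeof(in->hdr));
	in->data = buf + sizeof(in->hdr);
	in->data_available = len - sizeof(in->hdr);
	in->bytes_consumed = 0;
	return 0;
}

/* Copy n bytes out of the stream; the cursor advances only on success. */
int demo_get_bytes(struct demo_input *in, void *dst, size_t n)
{
	assert(in->bytes_consumed <= in->data_available);

	if (n > in->data_available - in->bytes_consumed)
		return -1;

	memcpy(dst, in->data + in->bytes_consumed, n);
	in->bytes_consumed += n;
	return 0;
}
```

The bounds check compares n against the bytes remaining rather than
adding n to the cursor, so it cannot wrap around for large n.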

CR: https://code.amazon.com/reviews/CR-8552453
---
 tools/fuzz/x86ie/afl-harness.c    |   8 +-
 tools/fuzz/x86ie/emulator_ops.c   | 342 +++++++++++++++++++++++++++++-
 tools/fuzz/x86ie/emulator_ops.h   |   7 +-
 tools/fuzz/x86ie/scripts/afl-many |  28 +++
 4 files changed, 379 insertions(+), 6 deletions(-)
 create mode 100755 tools/fuzz/x86ie/scripts/afl-many

diff --git a/tools/fuzz/x86ie/afl-harness.c b/tools/fuzz/x86ie/afl-harness.c
index b3b09d7f15f2..a3eeab0cfc90 100644
--- a/tools/fuzz/x86ie/afl-harness.c
+++ b/tools/fuzz/x86ie/afl-harness.c
@@ -50,7 +50,7 @@ int main(int argc, char **argv)

 		switch (c) {
 		case OPT_INPUT_SIZE:
-			printf("Min: %u\n", MIN_INPUT_SIZE);
+			printf("Min: %lu\n", MIN_INPUT_SIZE);
 			printf("Max: %u\n", MAX_INPUT_SIZE);
 			exit(0);
 			break;
@@ -77,6 +77,10 @@ int main(int argc, char **argv)

 	state = create_emulator();
 	state->data = malloc(INSTRUCTION_BYTES);
+	if (!state->data) {
+		printf("Malloc failed.\n");
+		return -1;
+	}

 #ifdef __AFL_HAVE_MANUAL_CONTROL
 	__AFL_INIT();
@@ -109,8 +113,6 @@ int main(int argc, char **argv)
 			fseek(fp, 0, SEEK_SET);
 		}
 #endif
-		reset_emulator(state);
-
 		size = fread(state, 1, MIN_INPUT_SIZE, fp);
 		if (size != MIN_INPUT_SIZE) {
 			printf("Input does not populate state\n");
diff --git a/tools/fuzz/x86ie/emulator_ops.c b/tools/fuzz/x86ie/emulator_ops.c
index 55ae4e8fbd96..370ac970ab9d 100644
--- a/tools/fuzz/x86ie/emulator_ops.c
+++ b/tools/fuzz/x86ie/emulator_ops.c
@@ -29,17 +29,349 @@
 #include <asm/user_64.h>
 #include <asm/kvm.h>

+ulong emul_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg)
+{
+	assert(reg < number_of_gprs(ctxt));
+	return get_state(ctxt)->vcpu.regs[reg];
+}
+
+void emul_write_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg, ulong val)
+{
+	assert(reg < number_of_gprs(ctxt));
+	get_state(ctxt)->vcpu.regs[reg] = val;
+}
+
+/* All read ops: */
+
+static int _get_bytes(void *dst, struct state *state, unsigned int bytes,
+		      char *callee)
+{
+	if (state->bytes_consumed + bytes > state->data_available) {
+		fprintf(stderr, "Tried retrieving %d bytes\n", bytes);
+		fprintf(stderr, "%s failed to retrieve bytes for %s.\n",
+			__func__, callee);
+		return X86EMUL_UNHANDLEABLE;
+	}
+
+	memcpy(dst, &state->data[state->bytes_consumed], bytes);
+	return X86EMUL_CONTINUE;
+}
+
+/*
+ * The only function that any x86_emulate_ops should call to retrieve bytes.
+ * See comments in struct state definition for more information.
+ */
+static int get_bytes_and_increment(void *dst, struct state *state,
+				   unsigned int bytes, char *callee)
+{
+	int rc = _get_bytes(dst, state, bytes, callee);
+
+	if (rc == X86EMUL_CONTINUE)
+		state->bytes_consumed += bytes;
+
+	return rc;
+}
+
+/*
+ * This is called by x86_decode_insn to fetch bytes.
+ */
+int emul_fetch(struct x86_emulate_ctxt *ctxt, unsigned long addr, void *val,
+	       unsigned int bytes, struct x86_exception *fault)
+{
+	if (get_bytes_and_increment(val, get_state(ctxt), bytes,
+		"emul_fetch") != X86EMUL_CONTINUE) {
+		return X86EMUL_UNHANDLEABLE;
+	}
+
+	return X86EMUL_CONTINUE;
+}
+
+int emul_read_emulated(struct x86_emulate_ctxt *ctxt,
+		       unsigned long addr, void *val, unsigned int bytes,
+		       struct x86_exception *fault)
+{
+	if (get_bytes_and_increment(val, get_state(ctxt), bytes,
+		"emul_read_emulated") != X86EMUL_CONTINUE) {
+		return X86EMUL_UNHANDLEABLE;
+	}
+
+	return X86EMUL_CONTINUE;
+}
+
+int emul_write_emulated(struct x86_emulate_ctxt *ctxt,
+		   unsigned long addr, const void *val,
+		   unsigned int bytes,
+		   struct x86_exception *fault)
+{
+	return X86EMUL_CONTINUE;
+}
+
+ulong emul_get_cr(struct x86_emulate_ctxt *ctxt, int cr)
+{
+	return get_state(ctxt)->vcpu.cr[cr];
+}
+
+int emul_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val)
+{
+	get_state(ctxt)->vcpu.cr[cr] = val;
+	return 0;
+}
+
+unsigned int emul_get_hflags(struct x86_emulate_ctxt *ctxt)
+{
+	return get_state(ctxt)->vcpu.rflags;
+}
+
+void emul_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned int hflags)
+{
+	get_state(ctxt)->vcpu.rflags = hflags;
+}
+
+/* End of emulator ops */
+
+#define SET(h) .h = emul_##h
+const struct x86_emulate_ops all_emulator_ops = {
+	SET(read_gpr),
+	SET(write_gpr),
+	SET(fetch),
+	SET(read_emulated),
+	SET(write_emulated),
+	SET(get_cr),
+	SET(set_cr),
+	SET(get_hflags),
+	SET(set_hflags),
+};
+#undef SET
+
+enum {
+	HOOK_read_gpr,
+	HOOK_write_gpr,
+	HOOK_fetch,
+	HOOK_read_emulated,
+	HOOK_write_emulated,
+	HOOK_get_cr,
+	HOOK_set_cr,
+	HOOK_get_hflags,
+	HOOK_set_hflags
+};
+
+/*
+ * Disable an x86_emulate_op if options << HOOK_op is set.
+ *
+ * Expects options to be defined.
+ */
+#define MAYBE_DISABLE_HOOK(h)						\
+	do {								\
+		if (options & (1 << HOOK_##h)) {			\
+			vcpu->ctxt.ops.h = NULL;			\
+			debug("Disabling hook " #h "\n");		\
+		}							\
+	} while (0)
+
+/*
+ * FROM XEN:
+ *
+ * Constrain input to architecturally-possible states where
+ * the emulator relies on these
+ *
+ * In general we want the emulator to be as absolutely robust as
+ * possible; which means that we want to minimize the number of things
+ * it assumes about the input state.  Testing this means minimizing and
+ * removing as much of the input constraints as possible.
+ *
+ * So we only add constraints that (in general) have been proven to
+ * cause crashes in the emulator.
+ *
+ * For future reference: other constraints which might be necessary at
+ * some point:
+ *
+ * - EFER.LMA => !EFLAGS.NT
+ * - In VM86 mode, force segment...
+ *  - ...access rights to 0xf3
+ *  - ...limits to 0xffff
+ *  - ...bases to below 1Mb, 16-byte aligned
+ *  - ...selectors to (base >> 4)
+ */
+static void sanitize_input(struct state *s)
+{
+	/* Some hooks can't be disabled. */
+	// options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
+
+	/* Zero 'private' entries */
+	// regs->error_code = 0;
+	// regs->entry_vector = 0;
+
+	// CANONICALIZE_MAYBE(rip);
+	// CANONICALIZE_MAYBE(rsp);
+	// CANONICALIZE_MAYBE(rbp);
+
+	/*
+	 * CR0.PG can't be set if CR0.PE isn't set.  Set is more interesting, so
+	 * set PE if PG is set.
+	 */
+	if (s->vcpu.cr[0] & X86_CR0_PG)
+		s->vcpu.cr[0] |= X86_CR0_PE;
+
+	/* EFLAGS.VM not available in long mode */
+	if (s->ctxt.mode == X86EMUL_MODE_PROT64)
+		s->vcpu.rflags &= ~X86_EFLAGS_VM;
+
+	/* EFLAGS.VM implies 16-bit mode */
+	if (s->vcpu.rflags & X86_EFLAGS_VM) {
+		s->vcpu.segments[x86_seg_cs].db = 0;
+		s->vcpu.segments[x86_seg_ss].db = 0;
+	}
+}
+
 void initialize_emulator(struct state *state)
 {
+	reset_emulator(state);
+	state->ctxt.ops = &all_emulator_ops;
+
+	/* See also sanitize_input, some hooks can't be disabled. */
+	// MAYBE_DISABLE_HOOK(read_gpr);
+
+	sanitize_input(state);
+}
+
+static const char *const x86emul_mode_string[] = {
+	[X86EMUL_MODE_REAL] = "X86EMUL_MODE_REAL",
+	[X86EMUL_MODE_VM86] = "X86EMUL_MODE_VM86",
+	[X86EMUL_MODE_PROT16] = "X86EMUL_MODE_PROT16",
+	[X86EMUL_MODE_PROT32] = "X86EMUL_MODE_PROT32",
+	[X86EMUL_MODE_PROT64] = "X86EMUL_MODE_PROT64",
+};
+
+static void dump_state_after(const char *desc, struct state *state)
+{
+	debug(" -- State after %s --\n", desc);
+	debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
+	debug(" cr0: %lx\n", state->vcpu.cr[0]);
+	debug(" cr3: %lx\n", state->vcpu.cr[3]);
+	debug(" cr4: %lx\n", state->vcpu.cr[4]);
+
+	debug("Decode _eip: %lu\n", state->ctxt._eip);
+	debug("Emulate eip: %lu\n", state->ctxt.eip);
+
+	debug("\n");
 }

+static void init_emulate_ctxt(struct state *state)
+{
+	struct x86_emulate_ctxt *ctxt = &state->ctxt;
+
+	ctxt->eflags = ctxt->ops->get_hflags(ctxt);
+	ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
+
+	ctxt->mode = X86EMUL_MODE_PROT64; // TODO: eventually vary this
+
+	init_decode_cache(ctxt);
+}
+
+
 int step_emulator(struct state *state)
 {
-	return 0;
+	int rc;
+	unsigned long prev_eip = state->ctxt._eip;
+	unsigned long emul_offset;
+	int decode_size = state->data_available - state->bytes_consumed;
+
+	/*
+	 * This is annoying to have to explain the reasoning behind:
+	 * ._eip is incremented by x86_decode_insn.  It will be > .eip between
+	 * decoding and emulating.
+	 * .eip is incremented by x86_emulate_insn.  It may be incremented
+	 * beyond the length of instruction emulated E.G. if a jump is taken.
+	 *
+	 * If these are out of sync before emulating, then something is
+	 * horribly wrong with the harness.
+	 */
+	assert(state->ctxt.eip == state->ctxt._eip);
+
+	if (decode_size <= 0) {
+		debug("Out of instructions\n");
+		return X86EMUL_UNHANDLEABLE;
+	}
+
+	init_emulate_ctxt(state);
+	state->ctxt.interruptibility = 0;
+	state->ctxt.have_exception = false;
+	state->ctxt.exception.vector = -1;
+	state->ctxt.perm_ok = false;
+	state->ctxt.ud = 0; // (emulation_type(0) & EMULTYPE_TRAP_UD);
+
+	/*
+	 * When decoding with NULL, 0, the emulator will use the emul_fetch
+	 * op which handles incrementing the state->data variables.  However
+	 * x86_decode_insn will always try to grab 15 bytes which may be more
+	 * than are left in the stream.
+	 *
+	 * Calling x86_decode_insn from a buffer with a length causes it to
+	 * directly memcpy those bytes into the ctxt structure and does not
+	 * increment state->bytes_consumed.  In that case, we manually
+	 * update state->bytes_consumed by the difference in the decoding
+	 * _eip.  This is gross but I cannot figure out a better way to do
+	 * this.
+	 *
+	 * We must limit the size to avoid going over the buffer and since
+	 * calling x86_decode_insn with a buffer does not go through any of
+	 * our ops, we need to update bytes_consumed.  The only improvement
+	 * I can currently think of would be a nicer way to get the size of
+	 * the decoded instruction.
+	 */
+	if (decode_size > 15)
+		decode_size = 15;
+
+	rc = x86_decode_insn(&state->ctxt,
+		&state->data[state->bytes_consumed], decode_size);
+	assert(state->ctxt._eip - prev_eip > 0); // Only move forward.
+	state->bytes_consumed += state->ctxt._eip - prev_eip;
+
+	debug("Decode result: %d\n", rc);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+
+	emul_offset = state->ctxt._eip - state->ctxt.eip;
+	debug("Instruction: ");
+	print_n_bytes(&state->data[state->bytes_consumed - emul_offset],
+		      emul_offset);
+
+	state->ctxt.exception.address = state->vcpu.cr[2];
+
+	// This is extraneous but explicit due to the above assert
+	prev_eip = state->ctxt.eip;
+	rc = x86_emulate_insn(&state->ctxt);
+	debug("Emulation result: %d\n", rc);
+	dump_state_after("emulating", state);
+
+	if (rc == -1) {
+		return rc;
+	} else if (state->ctxt.have_exception) {
+		fprintf(stderr, "Emulator propagated exception: { ");
+		fprintf(stderr, "vector: %d, ", state->ctxt.exception.vector);
+		fprintf(stderr, "error code: %d }\n",
+			state->ctxt.exception.error_code);
+		rc = X86EMUL_UNHANDLEABLE;
+	} else if (prev_eip == state->ctxt.eip) {
+		fprintf(stderr, "ctxt.eip not advanced.\n");
+		rc = X86EMUL_UNHANDLEABLE;
+	}
+
+	if (state->bytes_consumed == state->data_available)
+		debug("emulator is done\n");
+
+	return rc;
 }

 int emulate_until_complete(struct state *state)
 {
+	int count = 0;
+
+	do {
+		count++;
+	} while (step_emulator(state) == X86EMUL_CONTINUE);
+
+	debug("Emulated %d instructions\n", count);
 	return 0;
 }

@@ -51,8 +383,16 @@ struct state *create_emulator(void)

 void reset_emulator(struct state *state)
 {
+	unsigned char *data = state->data;
+	size_t data_available = state->data_available;
+
+	memset(state, 0, sizeof(struct state));
+
+	state->data = data;
+	state->data_available = data_available;
 }

 void free_emulator(struct state *state)
 {
+	free(state);
 }
diff --git a/tools/fuzz/x86ie/emulator_ops.h b/tools/fuzz/x86ie/emulator_ops.h
index 5ae072d5f205..19f3bd0ec6a3 100644
--- a/tools/fuzz/x86ie/emulator_ops.h
+++ b/tools/fuzz/x86ie/emulator_ops.h
@@ -59,7 +59,7 @@ struct state {
 	 * Amount of bytes consumed for purposes other than instructions.
 	 * E.G. whether a memory access should fault.
 	 */
-	size_t other_bytes_consumed;
+	size_t bytes_consumed;

 	/* Emulation context */
 	struct x86_emulate_ctxt ctxt;
@@ -75,7 +75,10 @@ struct state {

 #define get_state(h) container_of(h, struct state, ctxt)

-void buffer_stderr(void) __attribute__((constructor));
+static inline int number_of_gprs(struct x86_emulate_ctxt *c)
+{
+	return (c->mode == X86EMUL_MODE_PROT64 ? 16 : 8);
+}

 /*
  * Allocates space for, and creates a `struct state`.  The user should set
diff --git a/tools/fuzz/x86ie/scripts/afl-many b/tools/fuzz/x86ie/scripts/afl-many
new file mode 100755
index 000000000000..ab15258573a2
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/afl-many
@@ -0,0 +1,28 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# This is for running AFL over NPROC or `nproc` cores with normal AFL options.
+
+export AFL_NO_AFFINITY=1
+
+while [ -z "$sync_dir" ]; do
+  while getopts ":o:" opt; do
+    case "${opt}" in
+      o)
+        sync_dir="${OPTARG}"
+        ;;
+      *)
+        ;;
+    esac
+  done
+  ((OPTIND++))
+  [ $OPTIND -gt $# ] && break
+done
+
+for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
+    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
+done
+taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &
+
+sleep 5
+watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}"
+pkill afl-fuzz
--
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879





* [v2, 3/4] Demonstrating unit testing via simple-harness
  2019-06-12 15:35 [v2, 0/4] x86 instruction emulator fuzzing Sam Caccavale
  2019-06-12 15:35 ` [v2, 1/4] Build target for emulate.o as a userspace binary Sam Caccavale
  2019-06-12 15:35 ` [v2, 2/4] Emulate simple x86 instructions in userspace Sam Caccavale
@ 2019-06-12 15:35 ` Sam Caccavale
  2019-06-21 13:43   ` Alexander Graf
  2019-06-12 15:36 ` [v2, 4/4] Added scripts for filtering, building, deploying Sam Caccavale
  2019-06-21 13:30 ` [v2, 0/4] x86 instruction emulator fuzzing Alexander Graf
  4 siblings, 1 reply; 10+ messages in thread
From: Sam Caccavale @ 2019-06-12 15:35 UTC (permalink / raw)
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel,
	Sam Caccavale

simple-harness.c uses inline asm to embed a short instruction sequence in
the binary and then has the emulator emulate those bytes.  This may be
useful as a form of unit testing for the emulator.
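
The same idea can be sketched without asm labels, assuming a plain byte
array is an acceptable instruction source.  The names insn_stream and
insn_stream_len() are hypothetical and not part of this series; the bytes
are the ones embedded by foo() in simple-harness.c.

```c
#include <stddef.h>

/* Hypothetical: the byte sequence from simple-harness.c as a plain array. */
static const unsigned char insn_stream[] = {
	0x32, 0x05, 0x00, 0x00, 0x00, 0x00,	/* rip-relative xor */
	0x90,					/* nop */
	0x0f, 0x6f, 0xde,			/* MMX movq */
	0x90,					/* nop */
};

/* Length a harness would store in state->data_available. */
static size_t insn_stream_len(void)
{
	return sizeof(insn_stream);
}
```

A harness could point state->data at the array and set
state->data_available to insn_stream_len().  Since state->data is a
non-const pointer in this series, that would need a cast or a non-const
array.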

CR: https://code.amazon.com/reviews/CR-8591638
---
 tools/fuzz/x86ie/Makefile         |  7 ++++--
 tools/fuzz/x86ie/simple-harness.c | 42 +++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 2 deletions(-)
 create mode 100644 tools/fuzz/x86ie/simple-harness.c

diff --git a/tools/fuzz/x86ie/Makefile b/tools/fuzz/x86ie/Makefile
index d45fe6d266b9..e79d275e1040 100644
--- a/tools/fuzz/x86ie/Makefile
+++ b/tools/fuzz/x86ie/Makefile
@@ -44,8 +44,11 @@ LOCAL_OBJS := emulator_ops.o stubs.o
 afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
 	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie

-all: afl-harness
+simple-harness: simple-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
+	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
+
+all: afl-harness simple-harness

 .PHONY: clean
 clean:
-	$(RM) -r *.o afl-harness
+	$(RM) -r *.o afl-harness simple-harness
diff --git a/tools/fuzz/x86ie/simple-harness.c b/tools/fuzz/x86ie/simple-harness.c
new file mode 100644
index 000000000000..f21fdafe1dd1
--- /dev/null
+++ b/tools/fuzz/x86ie/simple-harness.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include "emulator_ops.h"
+#include <asm/kvm_emulate.h>
+
+extern void foo(void)
+{
+	asm volatile("__start:"
+		     ".byte 0x32, 0x05, 0x00, 0x00, 0x00, 0x00;" // xor al,BYTE PTR [rip+0x0]
+		     ".byte 0x90;"
+		     //".byte 0x0f, 0x7f, 0xde;" // movq mm6,mm3
+		     ".byte 0x0f, 0x6f, 0xde;" // movq mm3,mm6
+		     ".byte 0x90;"
+		     "__end:");
+}
+
+int main(int argc, char **argv)
+{
+	extern unsigned char __start;
+	extern unsigned char __end;
+	struct state *state = create_emulator();
+	int rc;
+
+	/* Ensures the emulator is in a valid state. */
+	initialize_emulator(state);
+
+	/* Provide the emulator with instructions to emulate. */
+	state->data = &__start;
+	state->data_available = &__end - &__start;
+
+	/* rip addressed instruction */
+	rc = emulate_until_complete(state);
+
+	/* Free the emulator. */
+	free_emulator(state);
+
+	return 0;
+}
--
2.17.1





* [v2, 4/4] Added scripts for filtering, building, deploying
  2019-06-12 15:35 [v2, 0/4] x86 instruction emulator fuzzing Sam Caccavale
                   ` (2 preceding siblings ...)
  2019-06-12 15:35 ` [v2, 3/4] Demonstrating unit testing via simple-harness Sam Caccavale
@ 2019-06-12 15:36 ` Sam Caccavale
  2019-06-21 13:50   ` Alexander Graf
  2019-06-21 13:30 ` [v2, 0/4] x86 instruction emulator fuzzing Alexander Graf
  4 siblings, 1 reply; 10+ messages in thread
From: Sam Caccavale @ 2019-06-12 15:36 UTC (permalink / raw)
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel,
	Sam Caccavale

bin.sh runs the harness on a single crash and reports whether the crash
was expected.  coalesce.sh, gen_output.sh, and summarize.sh help triage
the large crash directories that AFL produces.  deploy_remote.sh performs
all of the setup needed to launch a fuzz run via install_deps_ubuntu.sh,
install_afl.sh, build.sh, and run.sh.  rebuild.sh cleans the tree and
executes build.sh.
---
 tools/fuzz/x86ie/scripts/afl-many             |  6 +--
 tools/fuzz/x86ie/scripts/bin.sh               | 49 +++++++++++++++++++
 tools/fuzz/x86ie/scripts/build.sh             | 32 ++++++++++++
 tools/fuzz/x86ie/scripts/coalesce.sh          |  6 +++
 tools/fuzz/x86ie/scripts/deploy.sh            |  9 ++++
 tools/fuzz/x86ie/scripts/deploy_remote.sh     |  9 ++++
 tools/fuzz/x86ie/scripts/gen_output.sh        | 11 +++++
 tools/fuzz/x86ie/scripts/install_afl.sh       | 14 ++++++
 .../fuzz/x86ie/scripts/install_deps_ubuntu.sh |  5 ++
 tools/fuzz/x86ie/scripts/rebuild.sh           |  6 +++
 tools/fuzz/x86ie/scripts/run.sh               | 10 ++++
 tools/fuzz/x86ie/scripts/summarize.sh         |  9 ++++
 12 files changed, 163 insertions(+), 3 deletions(-)
 create mode 100755 tools/fuzz/x86ie/scripts/bin.sh
 create mode 100755 tools/fuzz/x86ie/scripts/build.sh
 create mode 100755 tools/fuzz/x86ie/scripts/coalesce.sh
 create mode 100644 tools/fuzz/x86ie/scripts/deploy.sh
 create mode 100755 tools/fuzz/x86ie/scripts/deploy_remote.sh
 create mode 100755 tools/fuzz/x86ie/scripts/gen_output.sh
 create mode 100755 tools/fuzz/x86ie/scripts/install_afl.sh
 create mode 100755 tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
 create mode 100755 tools/fuzz/x86ie/scripts/rebuild.sh
 create mode 100755 tools/fuzz/x86ie/scripts/run.sh
 create mode 100755 tools/fuzz/x86ie/scripts/summarize.sh

diff --git a/tools/fuzz/x86ie/scripts/afl-many b/tools/fuzz/x86ie/scripts/afl-many
index ab15258573a2..3fe6423309a6 100755
--- a/tools/fuzz/x86ie/scripts/afl-many
+++ b/tools/fuzz/x86ie/scripts/afl-many
@@ -19,10 +19,10 @@ while [ -z "$sync_dir" ]; do
 done

 for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
-    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
+    taskset -c "$i" $AFLPATH/afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
 done
-taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &
+taskset -c 0 $AFLPATH/afl-fuzz -M master $@ >/dev/null 2>&1 &

 sleep 5
-watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}"
+watch -n1 "echo \"Executing '$AFLPATH/afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && $AFLPATH/afl-whatsup -s ${sync_dir}"
 pkill afl-fuzz
diff --git a/tools/fuzz/x86ie/scripts/bin.sh b/tools/fuzz/x86ie/scripts/bin.sh
new file mode 100755
index 000000000000..6383a883ff33
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/bin.sh
@@ -0,0 +1,49 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+if [ "$#" -lt 3 ]; then
+  echo "Usage: '$0 path/to/afl-harness path/to/afl_crash path/to/linux/src/root'"
+  exit
+fi
+
+export AFL_HARNESS="$1"
+export LINUX_SRC="$3"
+
+diagnose_segfault() {
+  SOURCE=$(gdb -batch -ex r -ex 'bt 2' --args $@ 2>&1 | grep -Po '#1.* \K([^ ]+:[0-9]+)');
+  IFS=: read FILE LINE <<< "$SOURCE"
+
+  OP="$(sed -n "${LINE}p" "$LINUX_SRC/$FILE" 2>/dev/null)"
+  if [ $? -ne 0 ]; then
+    OP="$(sed -n "${LINE}p" "$LINUX_SRC/tools/fuzz/x86ie/$FILE" 2>/dev/null)"
+  fi
+
+  OP="$(echo $OP | grep -Po 'ops->\K([^(]+)')"
+  if [ -z "$OP" ]; then
+    echo "SEGV: unknown, in $FILE:$LINE"
+  else
+    echo "Expected: segfaulting on emulator->$OP"
+  fi
+}
+export -f diagnose_segfault
+
+bin() {
+  OUTPUT=$(bash -c "timeout 1s $AFL_HARNESS $1 2>&1" 2>&1)
+  RETVAL=$?
+
+  echo "$OUTPUT"
+  if [ $RETVAL -eq 0 ]; then
+    echo "Terminated successfully"
+  elif [ $RETVAL -eq 124 ]; then
+    echo "Unknown: killed due to timeout.  Loop likely."
+  elif echo "$OUTPUT" | grep -q "SEGV"; then
+    echo "$(diagnose_segfault $AFL_HARNESS $1)"
+  elif echo "$OUTPUT" | grep -q "FPE"; then
+    echo "Expected: floating point exception."
+  else
+    echo "Unknown cause of crash."
+  fi
+}
+export -f bin
+
+echo "$(bin $2 2>&1)"
diff --git a/tools/fuzz/x86ie/scripts/build.sh b/tools/fuzz/x86ie/scripts/build.sh
new file mode 100755
index 000000000000..74b893f222c1
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/build.sh
@@ -0,0 +1,32 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+kernel_objects="arch/x86/kvm/emulate.o arch/x86/lib/retpoline.o lib/find_bit.o"
+
+disable() { sed -i -r "/\b$1\b/c\# $1" .config; }
+enable() { sed -i -r "/\b$1\b/c\\$1=y" .config; }
+
+make ${CC:+ "CC=$CC"} ${DEBUG:+ "DEBUG=1"} defconfig
+
+enable "CONFIG_DEBUG_INFO"
+enable "CONFIG_STACKPROTECTOR"
+
+yes ' ' | make ${CC:+ "CC=$CC"} ${DEBUG:+ "DEBUG=1"} $kernel_objects
+
+omit_arg () { args=$(echo "$args" | sed "s/ $1//g"); }
+add_arg () { args+=" $1"; }
+
+rebuild () {
+  args="$(head -1 $(dirname $1)/.$(basename $1).cmd | sed -e 's/.*:= //g')"
+  omit_arg "-mcmodel=kernel"
+  omit_arg "-mpreferred-stack-boundary=3"
+  add_arg "-fsanitize=address"
+  echo -e "Rebuilding $1 with \n$args"
+  eval "$args"
+}
+
+for object in $kernel_objects; do
+  rebuild $object
+done
+
+make ${CC:+ "CC=$CC"} ${DEBUG:+ "DEBUG=1"} tools/fuzz
diff --git a/tools/fuzz/x86ie/scripts/coalesce.sh b/tools/fuzz/x86ie/scripts/coalesce.sh
new file mode 100755
index 000000000000..18c2ca7f2767
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/coalesce.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+mkdir -p all
+rm -rf all/*
+find . -type f -wholename '*crashes/id*' | parallel cp {} ./all/$(basename $(dirname {//})):{/}
diff --git a/tools/fuzz/x86ie/scripts/deploy.sh b/tools/fuzz/x86ie/scripts/deploy.sh
new file mode 100644
index 000000000000..f95c3aa2b5b5
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/deploy.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+REMOTE=$1
+DSTDIR=/dev/shm
+
+rsync -av $(pwd) $REMOTE:$DSTDIR
+
+ssh $REMOTE "cd $DSTDIR/$(basename $(pwd)); bash tools/fuzz/x86ie/scripts/deploy_remote.sh"
diff --git a/tools/fuzz/x86ie/scripts/deploy_remote.sh b/tools/fuzz/x86ie/scripts/deploy_remote.sh
new file mode 100755
index 000000000000..e002c5a932f5
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/deploy_remote.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+SCRIPTDIR=$(pwd)/tools/fuzz/x86ie/scripts
+
+$SCRIPTDIR/install_deps_ubuntu.sh
+source $SCRIPTDIR/install_afl.sh
+CC=$AFLPATH/afl-gcc $SCRIPTDIR/build.sh
+FUZZDIR="${FUZZDIR:-$(pwd)/fuzz}" $SCRIPTDIR/run.sh
diff --git a/tools/fuzz/x86ie/scripts/gen_output.sh b/tools/fuzz/x86ie/scripts/gen_output.sh
new file mode 100755
index 000000000000..6c0707eb6d08
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/gen_output.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+if [ "$#" -lt 3 ]; then
+  echo "Usage: '$0 path/to/afl-harness path/to/afl_crash_dir path/to/linux/src/root'"
+  exit
+fi
+
+mkdir -p output
+rm -rf output/*
+find $2 -type f | parallel ./bin.sh $1 {} $3 '>' ./output/{/}.out
diff --git a/tools/fuzz/x86ie/scripts/install_afl.sh b/tools/fuzz/x86ie/scripts/install_afl.sh
new file mode 100755
index 000000000000..b1c5612eca1c
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/install_afl.sh
@@ -0,0 +1,14 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+wget http://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz
+mkdir -p afl
+tar xzf afl-latest.tgz -C afl --strip-components 1
+
+pushd afl
+export AFL_USE_ASAN=1
+make clean all
+export AFLPATH="$(pwd)"
+popd
+
+sudo bash -c "echo core >/proc/sys/kernel/core_pattern"
diff --git a/tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh b/tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
new file mode 100755
index 000000000000..5525bc8b659c
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+sudo apt update
+sudo apt install -y make gcc wget screen build-essential libssh-dev flex bison libelf-dev bc
diff --git a/tools/fuzz/x86ie/scripts/rebuild.sh b/tools/fuzz/x86ie/scripts/rebuild.sh
new file mode 100755
index 000000000000..ecdc5aa52653
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/rebuild.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+make clean
+make tools/fuzz_clean
+FUZZDIR="./fuzz" ./tools/fuzz/x86ie/scripts/build.sh
diff --git a/tools/fuzz/x86ie/scripts/run.sh b/tools/fuzz/x86ie/scripts/run.sh
new file mode 100755
index 000000000000..9b7d69e0f0f6
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/run.sh
@@ -0,0 +1,10 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+FUZZDIR="${FUZZDIR:-$(pwd)/fuzz}"
+
+mkdir -p $FUZZDIR/in
+cp tools/fuzz/x86ie/rand_sample.bin $FUZZDIR/in
+mkdir -p $FUZZDIR/out
+
+screen bash -c "ulimit -Sv $[21999999999 << 10]; ./tools/fuzz/x86ie/scripts/afl-many -m 22000000000 -i $FUZZDIR/in -o $FUZZDIR/out tools/fuzz/x86ie/afl-harness @@"
diff --git a/tools/fuzz/x86ie/scripts/summarize.sh b/tools/fuzz/x86ie/scripts/summarize.sh
new file mode 100755
index 000000000000..27761f283ee3
--- /dev/null
+++ b/tools/fuzz/x86ie/scripts/summarize.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+if [ "$#" -lt 1 ]; then
+  echo "Usage: '$0 path/to/output/dir'"
+  exit
+fi
+
+time bash -c "find $1 -type f -exec tail -n 1 {} \; | sort | uniq -c | sort -rn"
--
2.17.1





* Re: [v2, 0/4] x86 instruction emulator fuzzing
  2019-06-12 15:35 [v2, 0/4] x86 instruction emulator fuzzing Sam Caccavale
                   ` (3 preceding siblings ...)
  2019-06-12 15:36 ` [v2, 4/4] Added scripts for filtering, building, deploying Sam Caccavale
@ 2019-06-21 13:30 ` Alexander Graf
  4 siblings, 0 replies; 10+ messages in thread
From: Alexander Graf @ 2019-06-21 13:30 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 12.06.19 17:35, Sam Caccavale wrote:
> Dear all,
>
> This series aims to provide an entrypoint for, and fuzz KVM's x86 instruction
> emulator from userspace.  It mirrors Xen's application of the AFL fuzzer to
> it's instruction emulator in the hopes of discovering vulnerabilities.
> Since this entrypoint also allows arbitrary execution of the emulators code
> from userspace, it may also be useful for testing.
>
> The current 4 patches build the emulator and 2 harnesses: simple-harness is
> an example of unit testing; afl-harness is a frontend for the AFL fuzzer.
>
> Patches
> =======
>
> - 01: Builds and links afl-harness with the required kernel objects.
> - 02: Introduces the minimal set of emulator operations and supporting code
> to emulate simple instructions.
> - 03: Demonstrates simple-harness as a unit test.
> - 04: Adds scripts for install, running, and crash triage.
>
> Any comments/suggestions are greatly appreciated.


The cover letter as well as the individual patches are missing a change 
log from v1 to v2.



Alex




* Re: [v2, 1/4] Build target for emulate.o as a userspace binary
  2019-06-12 15:35 ` [v2, 1/4] Build target for emulate.o as a userspace binary Sam Caccavale
@ 2019-06-21 13:33   ` Alexander Graf
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Graf @ 2019-06-21 13:33 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 12.06.19 17:35, Sam Caccavale wrote:
> This commit contains the minimal set of functionality to build
> afl-harness around arch/x86/emulate.c which allows exercising code
> in that source file, like x86_emulate_insn.  Resolving the
> dependencies was done via GCC's -H flag by get_headers.py.
>
> CR: https://code.amazon.com/reviews/CR-8325546


I'm fairly sure that nobody on the LKML can access this page or even 
remotely cares about it ;).

Also, your patches are missing a Signed-off-by (SoB) line.


Alex



* Re: [v2, 2/4] Emulate simple x86 instructions in userspace
  2019-06-12 15:35 ` [v2, 2/4] Emulate simple x86 instructions in userspace Sam Caccavale
@ 2019-06-21 13:40   ` Alexander Graf
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Graf @ 2019-06-21 13:40 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 12.06.19 17:35, Sam Caccavale wrote:
> Added the minimal subset of code to run afl-harness with a binary file
> as input.  These bytes are used to populate the vcpu structure and then
> as an instruction stream for the emulator.  It does not attempt to handle
> exceptions and only supports very simple ops.
>
> CR: https://code.amazon.com/reviews/CR-8552453
> ---
>   tools/fuzz/x86ie/afl-harness.c    |   8 +-
>   tools/fuzz/x86ie/emulator_ops.c   | 342 +++++++++++++++++++++++++++++-
>   tools/fuzz/x86ie/emulator_ops.h   |   7 +-
>   tools/fuzz/x86ie/scripts/afl-many |  28 +++
>   4 files changed, 379 insertions(+), 6 deletions(-)
>   create mode 100755 tools/fuzz/x86ie/scripts/afl-many
>
> diff --git a/tools/fuzz/x86ie/afl-harness.c b/tools/fuzz/x86ie/afl-harness.c
> index b3b09d7f15f2..a3eeab0cfc90 100644
> --- a/tools/fuzz/x86ie/afl-harness.c
> +++ b/tools/fuzz/x86ie/afl-harness.c
> @@ -50,7 +50,7 @@ int main(int argc, char **argv)
>
>   		switch (c) {
>   		case OPT_INPUT_SIZE:
> -			printf("Min: %u\n", MIN_INPUT_SIZE);
> +			printf("Min: %lu\n", MIN_INPUT_SIZE);


Why this change here?


>   			printf("Max: %u\n", MAX_INPUT_SIZE);
>   			exit(0);
>   			break;
> @@ -77,6 +77,10 @@ int main(int argc, char **argv)
>
>   	state = create_emulator();
>   	state->data = malloc(INSTRUCTION_BYTES);
> +	if (!state->data) {
> +		printf("Malloc failed.\n");
> +		return -1;
> +	}


Why here and not in 1/4?


>
>   #ifdef __AFL_HAVE_MANUAL_CONTROL
>   	__AFL_INIT();
> @@ -109,8 +113,6 @@ int main(int argc, char **argv)
>   			fseek(fp, 0, SEEK_SET);
>   		}
>   #endif
> -		reset_emulator(state);
> -


Same question.


>   		size = fread(state, 1, MIN_INPUT_SIZE, fp);
>   		if (size != MIN_INPUT_SIZE) {
>   			printf("Input does not populate state\n");
> diff --git a/tools/fuzz/x86ie/emulator_ops.c b/tools/fuzz/x86ie/emulator_ops.c
> index 55ae4e8fbd96..370ac970ab9d 100644
> --- a/tools/fuzz/x86ie/emulator_ops.c
> +++ b/tools/fuzz/x86ie/emulator_ops.c
> @@ -29,17 +29,349 @@
>   #include <asm/user_64.h>
>   #include <asm/kvm.h>
>
> +ulong emul_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg)
> +{
> +	assert(reg < number_of_gprs(ctxt));
> +	return get_state(ctxt)->vcpu.regs[reg];
> +}
> +
> +void emul_write_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg, ulong val)
> +{
> +	assert(reg < number_of_gprs(ctxt));
> +	get_state(ctxt)->vcpu.regs[reg] = val;
> +}
> +
> +/* All read ops: */
> +
> +static int _get_bytes(void *dst, struct state *state, unsigned int bytes,
> +		      char *callee)
> +{
> +	if (state->bytes_consumed + bytes > state->data_available) {
> +		fprintf(stderr, "Tried retrieving %d bytes\n", bytes);
> +		fprintf(stderr, "%s failed to retrieve bytes for %s.\n",
> +			__func__, callee);
> +		return X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	memcpy(dst, &state->data[state->bytes_consumed], bytes);
> +	return X86EMUL_CONTINUE;
> +}
> +
> +/*
> + * The only function that any x86_emulate_ops should call to retrieve bytes.
> + * See comments in struct state definition for more information.
> + */
> +static int get_bytes_and_increment(void *dst, struct state *state,
> +				   unsigned int bytes, char *callee)
> +{
> +	int rc = _get_bytes(dst, state, bytes, callee);
> +
> +	if (rc == X86EMUL_CONTINUE)
> +		state->bytes_consumed += bytes;
> +
> +	return rc;
> +}
> +
> +/*
> + * This is called by x86_decode_insn to fetch bytes.
> + */
> +int emul_fetch(struct x86_emulate_ctxt *ctxt, unsigned long addr, void *val,
> +	       unsigned int bytes, struct x86_exception *fault)
> +{
> +	if (get_bytes_and_increment(val, get_state(ctxt), bytes,
> +		"emul_fetch") != X86EMUL_CONTINUE) {
> +		return X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	return X86EMUL_CONTINUE;
> +}
> +
> +int emul_read_emulated(struct x86_emulate_ctxt *ctxt,
> +		       unsigned long addr, void *val, unsigned int bytes,
> +		       struct x86_exception *fault)
> +{
> +	if (get_bytes_and_increment(val, get_state(ctxt), bytes,
> +		"emul_read_emulated") != X86EMUL_CONTINUE) {
> +		return X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	return X86EMUL_CONTINUE;
> +}
> +
> +int emul_write_emulated(struct x86_emulate_ctxt *ctxt,
> +		   unsigned long addr, const void *val,
> +		   unsigned int bytes,
> +		   struct x86_exception *fault)
> +{
> +	return X86EMUL_CONTINUE;
> +}
> +
> +ulong emul_get_cr(struct x86_emulate_ctxt *ctxt, int cr)
> +{
> +	return get_state(ctxt)->vcpu.cr[cr];
> +}
> +
> +int emul_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val)
> +{
> +	get_state(ctxt)->vcpu.cr[cr] = val;
> +	return 0;
> +}
> +
> +unsigned int emul_get_hflags(struct x86_emulate_ctxt *ctxt)
> +{
> +	return get_state(ctxt)->vcpu.rflags;
> +}
> +
> +void emul_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned int hflags)
> +{
> +	get_state(ctxt)->vcpu.rflags = hflags;
> +}
> +
> +/* End of emulator ops */
> +
> +#define SET(h) .h = emul_##h


This is a pretty dangerous macro name, as it has great potential for 
conflict with a random header. I would suggest EMUL_OP() instead.
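
As a rough illustration of the rename (a sketch only, not code from the
series): struct demo_ops, demo_cr, and the two emul_* stubs below are
hypothetical stand-ins that merely mimic the shape of
struct x86_emulate_ops, so the snippet builds on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct x86_emulate_ops; not the kernel type. */
struct demo_ops {
	unsigned long (*get_cr)(int cr);
	int (*set_cr)(int cr, unsigned long val);
};

#define DEMO_NR_CR 9
static unsigned long demo_cr[DEMO_NR_CR];

static unsigned long emul_get_cr(int cr)
{
	assert(cr >= 0 && cr < DEMO_NR_CR);
	return demo_cr[cr];
}

static int emul_set_cr(int cr, unsigned long val)
{
	assert(cr >= 0 && cr < DEMO_NR_CR);
	demo_cr[cr] = val;
	return 0;
}

/* A prefixed name is far less likely to clash than a bare SET(). */
#define EMUL_OP(h) .h = emul_##h
static const struct demo_ops demo_emulator_ops = {
	EMUL_OP(get_cr),
	EMUL_OP(set_cr),
};
#undef EMUL_OP
```

The designated initializers expand exactly as with SET(); only the macro
name changes, and the #undef still limits its scope to the table.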


> +const struct x86_emulate_ops all_emulator_ops = {
> +	SET(read_gpr),
> +	SET(write_gpr),
> +	SET(fetch),
> +	SET(read_emulated),
> +	SET(write_emulated),
> +	SET(get_cr),
> +	SET(set_cr),
> +	SET(get_hflags),
> +	SET(set_hflags),
> +};
> +#undef SET
> +
> +enum {
> +	HOOK_read_gpr,
> +	HOOK_write_gpr,
> +	HOOK_fetch,
> +	HOOK_read_emulated,
> +	HOOK_write_emulated,
> +	HOOK_get_cr,
> +	HOOK_set_cr,
> +	HOOK_get_hflags,
> +	HOOK_set_hflags
> +};
> +
> +/*
> + * Disable an x86_emulate_op if options << HOOK_op is set.
> + *
> + * Expects options to be defined.
> + */
> +#define MAYBE_DISABLE_HOOK(h)						\
> +	do {								\
> +		if (options & (1 << HOOK_##h)) {			\
> +			vcpu->ctxt.ops.h = NULL;			\
> +			debug("Disabling hook " #h "\n");		\
> +		}							\
> +	} while (0)
> +
> +/*
> + * FROM XEN:
> + *
> + * Constrain input to architecturally-possible states where
> + * the emulator relies on these
> + *
> + * In general we want the emulator to be as absolutely robust as
> + * possible; which means that we want to minimize the number of things
> + * it assumes about the input state.  Testing this means minimizing and
> + * removing as much of the input constraints as possible.
> + *
> + * So we only add constraints that (in general) have been proven to
> + * cause crashes in the emulator.
> + *
> + * For future reference: other constraints which might be necessary at
> + * some point:
> + *
> + * - EFER.LMA => !EFLAGS.NT
> + * - In VM86 mode, force segment...
> + *  - ...access rights to 0xf3
> + *  - ...limits to 0xffff
> + *  - ...bases to below 1Mb, 16-byte aligned
> + *  - ...selectors to (base >> 4)
> + */
> +static void sanitize_input(struct state *s)
> +{
> +	/* Some hooks can't be disabled. */
> +	// options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
> +
> +	/* Zero 'private' entries */
> +	// regs->error_code = 0;
> +	// regs->entry_vector = 0;
> +
> +	// CANONICALIZE_MAYBE(rip);
> +	// CANONICALIZE_MAYBE(rsp);
> +	// CANONICALIZE_MAYBE(rbp);


Now that you removed the macro, please also remove all commented out lines.


> +
> +	/*
> +	 * CR0.PG can't be set if CR0.PE isn't set.  Set is more interesting, so
> +	 * set PE if PG is set.
> +	 */
> +	if (s->vcpu.cr[0] & X86_CR0_PG)
> +		s->vcpu.cr[0] |= X86_CR0_PE;
> +
> +	/* EFLAGS.VM not available in long mode */
> +	if (s->ctxt.mode == X86EMUL_MODE_PROT64)
> +		s->vcpu.rflags &= ~X86_EFLAGS_VM;
> +
> +	/* EFLAGS.VM implies 16-bit mode */
> +	if (s->vcpu.rflags & X86_EFLAGS_VM) {
> +		s->vcpu.segments[x86_seg_cs].db = 0;
> +		s->vcpu.segments[x86_seg_ss].db = 0;
> +	}
> +}
> +
>   void initialize_emulator(struct state *state)
>   {
> +	reset_emulator(state);
> +	state->ctxt.ops = &all_emulator_ops;
> +
> +	/* See also sanitize_input, some hooks can't be disabled. */
> +	// MAYBE_DISABLE_HOOK(read_gpr);
> +
> +	sanitize_input(state);
> +}
> +
> +static const char *const x86emul_mode_string[] = {
> +	[X86EMUL_MODE_REAL] = "X86EMUL_MODE_REAL",
> +	[X86EMUL_MODE_VM86] = "X86EMUL_MODE_VM86",
> +	[X86EMUL_MODE_PROT16] = "X86EMUL_MODE_PROT16",
> +	[X86EMUL_MODE_PROT32] = "X86EMUL_MODE_PROT32",
> +	[X86EMUL_MODE_PROT64] = "X86EMUL_MODE_PROT64",
> +};
> +
> +static void dump_state_after(const char *desc, struct state *state)
> +{
> +	debug(" -- State after %s --\n", desc);
> +	debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
> +	debug(" cr0: %lx\n", state->vcpu.cr[0]);
> +	debug(" cr3: %lx\n", state->vcpu.cr[3]);
> +	debug(" cr4: %lx\n", state->vcpu.cr[4]);
> +
> +	debug("Decode _eip: %lu\n", state->ctxt._eip);
> +	debug("Emulate eip: %lu\n", state->ctxt.eip);
> +
> +	debug("\n");
>   }
>
> +static void init_emulate_ctxt(struct state *state)
> +{
> +	struct x86_emulate_ctxt *ctxt = &state->ctxt;
> +
> +	ctxt->eflags = ctxt->ops->get_hflags(ctxt);
> +	ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
> +
> +	ctxt->mode = X86EMUL_MODE_PROT64; // TODO: eventually vary this
> +
> +	init_decode_cache(ctxt);
> +}
> +
> +
>   int step_emulator(struct state *state)
>   {
> -	return 0;
> +	int rc;
> +	unsigned long prev_eip = state->ctxt._eip;
> +	unsigned long emul_offset;
> +	int decode_size = state->data_available - state->bytes_consumed;
> +
> +	/*
> +	 * This is annoying to have to explain the reasoning behind:
> +	 * ._eip is incremented by x86_decode_insn.  It will be > .eip between
> +	 * decoding and emulating.
> +	 * .eip is incremented by x86_emulate_insn.  It may be incremented
> +	 * beyond the length of instruction emulated E.G. if a jump is taken.
> +	 *
> +	 * If these are out of sync before emulating, then something is
> +	 * horribly wrong with the harness.
> +	 */
> +	assert(state->ctxt.eip == state->ctxt._eip);
> +
> +	if (decode_size <= 0) {
> +		debug("Out of instructions\n");
> +		return X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	init_emulate_ctxt(state);
> +	state->ctxt.interruptibility = 0;
> +	state->ctxt.have_exception = false;
> +	state->ctxt.exception.vector = -1;
> +	state->ctxt.perm_ok = false;
> +	state->ctxt.ud = 0; // (emulation_type(0) & EMULTYPE_TRAP_UD);
> +
> +	/*
> +	 * When decoding with NULL, 0, the emulator will use the emul_fetch
> +	 * op which handles incrementing the state->data variables.  However
> +	 * x86_decode_insn will always try to grab 15 bytes which may be more
> +	 * than are left in the stream.
> +	 *
> +	 * Calling x86_decode_insn from a buffer with a length causes it to
> +	 * directly memcpy those bytes into the ctxt structure and does not
> +	 * increment state->bytes_consumed.  In that case, we manually
> +	 * update state->bytes_consumed by the difference in the decoding
> +	 * _eip.  This is gross but I cannot figure out a better way to do
> +	 * this.
> +	 *
> +	 * We must limit the size to avoid going over the buffer and since
> +	 * calling x86_decode_insn with a buffer does not go through any of
> +	 * our ops, we need to update bytes_consumed.  The only improvement
> +	 * I can currently think of would be a nicer way to get the size of
> +	 * the decoded instruction.
> +	 */
> +	if (decode_size > 15)
> +		decode_size = 15;
> +
> +	rc = x86_decode_insn(&state->ctxt,
> +		&state->data[state->bytes_consumed], decode_size);
> +	assert(state->ctxt._eip - prev_eip > 0); // Only move forward.
> +	state->bytes_consumed += state->ctxt._eip - prev_eip;
> +
> +	debug("Decode result: %d\n", rc);
> +	if (rc != X86EMUL_CONTINUE)
> +		return rc;
> +
> +	emul_offset = state->ctxt._eip - state->ctxt.eip;
> +	debug("Instruction: ");
> +	print_n_bytes(&state->data[state->bytes_consumed - emul_offset],
> +		      emul_offset);
> +
> +	state->ctxt.exception.address = state->vcpu.cr[2];
> +
> +	// This is extraneous but explicit due to the above assert
> +	prev_eip = state->ctxt.eip;
> +	rc = x86_emulate_insn(&state->ctxt);
> +	debug("Emulation result: %d\n", rc);
> +	dump_state_after("emulating", state);
> +
> +	if (rc == -1) {
> +		return rc;
> +	} else if (state->ctxt.have_exception) {
> +		fprintf(stderr, "Emulator propagated exception: { ");
> +		fprintf(stderr, "vector: %d, ", state->ctxt.exception.vector);
> +		fprintf(stderr, "error code: %d }\n",
> +			state->ctxt.exception.error_code);
> +		rc = X86EMUL_UNHANDLEABLE;
> +	} else if (prev_eip == state->ctxt.eip) {
> +		fprintf(stderr, "ctxt.eip not advanced.\n");
> +		rc = X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	if (state->bytes_consumed == state->data_available)
> +		debug("emulator is done\n");
> +
> +	return rc;
>   }
>
>   int emulate_until_complete(struct state *state)
>   {
> +	int count = 0;
> +
> +	do {
> +		count++;
> +	} while (step_emulator(state) == X86EMUL_CONTINUE);
> +
> +	debug("Emulated %d instructions\n", count);
>   	return 0;
>   }
>
> @@ -51,8 +383,16 @@ struct state *create_emulator(void)
>
>   void reset_emulator(struct state *state)
>   {
> +	unsigned char *data = state->data;
> +	size_t data_available = state->data_available;
> +
> +	memset(state, 0, sizeof(struct state));
> +
> +	state->data = data;
> +	state->data_available = data_available;
>   }
>
>   void free_emulator(struct state *state)
>   {
> +	free(state);
>   }
> diff --git a/tools/fuzz/x86ie/emulator_ops.h b/tools/fuzz/x86ie/emulator_ops.h
> index 5ae072d5f205..19f3bd0ec6a3 100644
> --- a/tools/fuzz/x86ie/emulator_ops.h
> +++ b/tools/fuzz/x86ie/emulator_ops.h
> @@ -59,7 +59,7 @@ struct state {
>   	 * Amount of bytes consumed for purposes other than instructions.
>   	 * E.G. whether a memory access should fault.
>   	 */
> -	size_t other_bytes_consumed;
> +	size_t bytes_consumed;


Why in this patch?


>
>   	/* Emulation context */
>   	struct x86_emulate_ctxt ctxt;
> @@ -75,7 +75,10 @@ struct state {
>
>   #define get_state(h) container_of(h, struct state, ctxt)
>
> -void buffer_stderr(void) __attribute__((constructor));


Same question.


> +static inline int number_of_gprs(struct x86_emulate_ctxt *c)
> +{
> +	return (c->mode == X86EMUL_MODE_PROT64 ? 16 : 8);
> +}
>
>   /*
>    * Allocates space for, and creates a `struct state`.  The user should set
> diff --git a/tools/fuzz/x86ie/scripts/afl-many b/tools/fuzz/x86ie/scripts/afl-many
> new file mode 100755
> index 000000000000..ab15258573a2
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/afl-many
> @@ -0,0 +1,28 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +# This is for running AFL over NPROC or `nproc` cores with normal AFL options.
> +
> +export AFL_NO_AFFINITY=1
> +
> +while [ -z "$sync_dir" ]; do
> +  while getopts ":o:" opt; do
> +    case "${opt}" in
> +      o)
> +        sync_dir="${OPTARG}"
> +        ;;
> +      *)
> +        ;;
> +    esac
> +  done
> +  ((OPTIND++))
> +  [ $OPTIND -gt $# ] && break
> +done
> +
> +for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
> +    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
> +done
> +taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &


You should add a comment above the taskset invocations explaining that 
the pinning is necessary for performance: without it, the Linux scheduler 
works against you and you lose local caching effects.
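
A minimal sketch of what such a comment could look like. The launch is wrapped in a dry-run helper that only prints the command line, so it can be inspected without AFL installed. The helper name `pin_cmd` is hypothetical and not part of the patch; the wording of the comment is only one possible phrasing of the review's point.

```shell
#!/bin/bash
# Hypothetical helper (not in the patch): print the command line that would
# launch one afl-fuzz instance pinned to a single core.
pin_cmd() {
  local core="$1" role="$2" name="$3"
  shift 3
  # Pin each fuzzer to its own core with taskset.  Without pinning, the
  # Linux scheduler may migrate the processes between cores, which throws
  # away warm per-core caches and costs fuzzing throughput.
  printf '%s\n' "taskset -c $core ./afl-fuzz $role $name $*"
}

# Dry run: one master on core 0, two secondaries on cores 1 and 2.
pin_cmd 0 -M master -i in -o out
for i in 1 2; do
  pin_cmd "$i" -S "slave$i" -i in -o out
done
```

In the real script the comment would sit directly above the two taskset lines; the helper exists here only to make the sketch runnable on its own.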


> +
> +sleep 5


I'm still not a big fan of the sleep. Is there really no way to 
determine a sane monitoring state for real? Can you maybe find it out 
from looking at the open FDs on the afl-fuzz processes via /proc/$pid/fd?

Alex
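
One possible way to replace the fixed sleep, sketched under assumptions: instead of inspecting /proc/$pid/fd, poll the sync directory until every instance has written its `fuzzer_stats` file (the file that afl-whatsup summarizes), with a timeout as a safety net. The helper name `wait_for_stats` and the timeout values are hypothetical; this is an alternative to the FD-based idea above, not what the patch does.

```shell
#!/bin/bash
# Hypothetical helper: wait until at least $2 fuzzer_stats files exist one
# level below sync dir $1, or give up after $3 seconds (default 30).
wait_for_stats() {
  local sync_dir="$1" want="$2" timeout="${3:-30}" waited=0 have
  while :; do
    # Count <sync_dir>/<instance>/fuzzer_stats files; a missing directory
    # simply counts as zero.
    have=$(( $(find "$sync_dir" -mindepth 2 -maxdepth 2 \
                 -name fuzzer_stats 2>/dev/null | wc -l) ))
    [ "$have" -ge "$want" ] && return 0
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}

# Small demo with a fake sync directory holding two instances.
demo=$(mktemp -d)
mkdir -p "$demo/master" "$demo/slave1"
touch "$demo/master/fuzzer_stats" "$demo/slave1/fuzzer_stats"
wait_for_stats "$demo" 2 5 && echo "all instances reporting"
rm -rf "$demo"
```

In afl-many this could stand in for `sleep 5`, for example as `wait_for_stats "$sync_dir" "${NPROC:-$(nproc)}" 60` before starting `watch`.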


> +watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}"
> +pkill afl-fuzz
> --
> 2.17.1
>


* Re: [v2, 3/4] Demonstrating unit testing via simple-harness
  2019-06-12 15:35 ` [v2, 3/4] Demonstrating unit testing via simple-harness Sam Caccavale
@ 2019-06-21 13:43   ` Alexander Graf
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Graf @ 2019-06-21 13:43 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel



On 12.06.19 17:35, Sam Caccavale wrote:
> Simple-harness.c uses inline asm support to generate asm and then has the
> emulator emulate this code.  This may be useful as a form of testing for
> the emulator.
> 
> CR: https://code.amazon.com/reviews/CR-8591638
> ---
>   tools/fuzz/x86ie/Makefile         |  7 ++++--
>   tools/fuzz/x86ie/simple-harness.c | 42 +++++++++++++++++++++++++++++++
>   2 files changed, 47 insertions(+), 2 deletions(-)
>   create mode 100644 tools/fuzz/x86ie/simple-harness.c
> 
> diff --git a/tools/fuzz/x86ie/Makefile b/tools/fuzz/x86ie/Makefile
> index d45fe6d266b9..e79d275e1040 100644
> --- a/tools/fuzz/x86ie/Makefile
> +++ b/tools/fuzz/x86ie/Makefile
> @@ -44,8 +44,11 @@ LOCAL_OBJS := emulator_ops.o stubs.o
>   afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
>   	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
> 
> -all: afl-harness
> +simple-harness: simple-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
> +	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
> +
> +all: afl-harness simple-harness
> 
>   .PHONY: clean
>   clean:
> -	$(RM) -r *.o afl-harness
> +	$(RM) -r *.o afl-harness simple-harness
> diff --git a/tools/fuzz/x86ie/simple-harness.c b/tools/fuzz/x86ie/simple-harness.c
> new file mode 100644
> index 000000000000..f21fdafe1dd1
> --- /dev/null
> +++ b/tools/fuzz/x86ie/simple-harness.c
> @@ -0,0 +1,42 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <assert.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include "emulator_ops.h"
> +#include <asm/kvm_emulate.h>
> +
> +extern void foo(void)
> +{
> +	asm volatile("__start:"
> +		     ".byte 0x32, 0x05, 0x00, 0x00, 0x00, 0x00;" // xor eax,DWORD PTR [rip+0x0]
> +		     ".byte 0x90;"
> +		     //".byte 0x0f, 0x7f, 0xde;" // movq mm6,mm3

Why?

Alex


* Re: [v2, 4/4] Added scripts for filtering, building, deploying
  2019-06-12 15:36 ` [v2, 4/4] Added scripts for filtering, building, deploying Sam Caccavale
@ 2019-06-21 13:50   ` Alexander Graf
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Graf @ 2019-06-21 13:50 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel



On 12.06.19 17:36, Sam Caccavale wrote:
> bin.sh produces output which diagnoses whether the crash was expected.
> coalesce.sh, gen_output.sh, and summarize.sh are useful for parsing
> the large crash directories that afl produces.
> deploy_remote.sh does all of the setup to launch a fuzz run via
> install_deps_ubuntu.sh, install_afl.sh, build.sh, and run.sh.
> rebuild.sh cleans the directories and executes build.sh
> ---
>   tools/fuzz/x86ie/scripts/afl-many             |  6 +--
>   tools/fuzz/x86ie/scripts/bin.sh               | 49 +++++++++++++++++++
>   tools/fuzz/x86ie/scripts/build.sh             | 32 ++++++++++++
>   tools/fuzz/x86ie/scripts/coalesce.sh          |  6 +++
>   tools/fuzz/x86ie/scripts/deploy.sh            |  9 ++++
>   tools/fuzz/x86ie/scripts/deploy_remote.sh     |  9 ++++
>   tools/fuzz/x86ie/scripts/gen_output.sh        | 11 +++++
>   tools/fuzz/x86ie/scripts/install_afl.sh       | 14 ++++++
>   .../fuzz/x86ie/scripts/install_deps_ubuntu.sh |  5 ++
>   tools/fuzz/x86ie/scripts/rebuild.sh           |  6 +++
>   tools/fuzz/x86ie/scripts/run.sh               | 10 ++++
>   tools/fuzz/x86ie/scripts/summarize.sh         |  9 ++++
>   12 files changed, 163 insertions(+), 3 deletions(-)
>   create mode 100755 tools/fuzz/x86ie/scripts/bin.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/build.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/coalesce.sh
>   create mode 100644 tools/fuzz/x86ie/scripts/deploy.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/deploy_remote.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/gen_output.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/install_afl.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/rebuild.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/run.sh
>   create mode 100755 tools/fuzz/x86ie/scripts/summarize.sh
> 
> diff --git a/tools/fuzz/x86ie/scripts/afl-many b/tools/fuzz/x86ie/scripts/afl-many
> index ab15258573a2..3fe6423309a6 100755
> --- a/tools/fuzz/x86ie/scripts/afl-many
> +++ b/tools/fuzz/x86ie/scripts/afl-many
> @@ -19,10 +19,10 @@ while [ -z "$sync_dir" ]; do
>   done
> 
>   for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
> -    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
> +    taskset -c "$i" $AFLPATH/afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
>   done
> -taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &
> +taskset -c 0 $AFLPATH/afl-fuzz -M master $@ >/dev/null 2>&1 &
> 
>   sleep 5
> -watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}"
> +watch -n1 "echo \"Executing 'AFLPATH/afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && $AFLPATH/afl-whatsup -s ${sync_dir}"

This is missing a $ sign (AFLPATH in the echo string).

>   pkill afl-fuzz
> diff --git a/tools/fuzz/x86ie/scripts/bin.sh b/tools/fuzz/x86ie/scripts/bin.sh
> new file mode 100755
> index 000000000000..6383a883ff33
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/bin.sh
> @@ -0,0 +1,49 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +if [ "$#" -lt 3 ]; then
> +  echo "Usage: './bin path/to/afl-harness path/to/afl_crash [path/to/linux/src/root]'"
> +  exit
> +fi
> +
> +export AFL_HARNESS="$1"
> +export LINUX_SRC="$3"
> +
> +diagnose_segfault() {
> +  SOURCE=$(gdb -batch -ex r -ex 'bt 2' --args $@ 2>&1 | grep -Po '#1.* \K([^ ]+:[0-9]+)');
> +  IFS=: read FILE LINE <<< "$SOURCE"
> +
> +  OP="$(sed -n "${LINE}p" "$LINUX_SRC/$FILE" 2>/dev/null)"
> +  if [ $? -ne 0 ]; then
> +    OP="$(sed -n "${LINE}p" "$LINUX_SRC/tools/fuzz/x86_instruction_emulation/$FILE" 2>/dev/null)"
> +  fi
> +
> +  OP="$(echo $OP | grep -Po 'ops->\K([^(]+)')"
> +  if [ -z "$OP" ]; then
> +    echo "SEGV: unknown, in $FILE:$LINE"
> +  else
> +    echo "Expected: segfaulting on emulator->$OP"
> +  fi
> +}
> +export -f diagnose_segfault
> +
> +bin() {
> +  OUTPUT=$(bash -c "timeout 1s $AFL_HARNESS $1 2>&1" 2>&1)
> +  RETVAL=$?
> +
> +  echo "$OUTPUT"
> +  if [ $RETVAL -eq 0 ]; then
> +    echo "Terminated successfully"
> +  elif [ $RETVAL -eq 124 ]; then
> +    echo "Unknown: killed due to timeout.  Loop likely."
> +  elif echo "$OUTPUT" | grep -q "SEGV"; then
> +    echo "$(diagnose_segfault $AFL_HARNESS $1)"
> +  elif echo "$OUTPUT" | grep -q "FPE"; then
> +    echo "Expected: floating point exception."
> +  else
> +    echo "Unknown cause of crash."
> +  fi
> +}
> +export -f bin
> +
> +echo "$(bin $2 2>&1)"
> diff --git a/tools/fuzz/x86ie/scripts/build.sh b/tools/fuzz/x86ie/scripts/build.sh
> new file mode 100755
> index 000000000000..74b893f222c1
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/build.sh
> @@ -0,0 +1,32 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +kernel_objects="arch/x86/kvm/emulate.o arch/x86/lib/retpoline.o lib/find_bit.o"
> +
> +disable() { sed -i -r "/\b$1\b/c\# $1" .config; }
> +enable() { sed -i -r "/\b$1\b/c\\$1=y" .config; }
> +
> +make ${CC:+ "CC=$CC"} ${DEBUG:+ "DEBUG=1"} defconfig
> +
> +enable "CONFIG_DEBUG_INFO"
> +enable "CONFIG_STACKPROTECTOR"
> +
> +yes ' ' | make ${CC:+ "CC=$CC"} ${DEBUG:+ "DEBUG=1"} $kernel_objects
> +
> +omit_arg () { args=$(echo "$args" | sed "s/ $1//g"); }
> +add_arg () { args+=" $1"; }
> +
> +rebuild () {
> +  args="$(head -1 $(dirname $1)/.$(basename $1).cmd | sed -e 's/.*:= //g')"
> +  omit_arg "-mcmodel=kernel"
> +  omit_arg "-mpreferred-stack-boundary=3"
> +  add_arg "-fsanitize=address"
> +  echo -e "Rebuilding $1 with \n$args"
> +  eval "$args"
> +}
> +
> +for object in $kernel_objects; do
> +  rebuild $object
> +done
> +
> +make ${CC:+ "CC=$CC"} ${DEBUG:+ "DEBUG=1"} tools/fuzz
> diff --git a/tools/fuzz/x86ie/scripts/coalesce.sh b/tools/fuzz/x86ie/scripts/coalesce.sh
> new file mode 100755
> index 000000000000..18c2ca7f2767
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/coalesce.sh
> @@ -0,0 +1,6 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +mkdir -p all
> +rm -rf all/*
> +find . -type f -wholename '*crashes/id*' | parallel cp {} ./all/$(basename $(dirname {//})):{/}
> diff --git a/tools/fuzz/x86ie/scripts/deploy.sh b/tools/fuzz/x86ie/scripts/deploy.sh
> new file mode 100644
> index 000000000000..f95c3aa2b5b5
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/deploy.sh
> @@ -0,0 +1,9 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +REMOTE=$1
> +DSTDIR=/dev/shm
> +
> +rsync -av $(pwd) $REMOTE:$DSTDIR
> +
> +ssh $REMOTE "cd $DSTDIR/$(basename $(pwd)); bash -s tools/fuzz/x86_instruction_emulation/scripts/deploy_remote.sh"

Does this really belong in here?

> diff --git a/tools/fuzz/x86ie/scripts/deploy_remote.sh b/tools/fuzz/x86ie/scripts/deploy_remote.sh
> new file mode 100755
> index 000000000000..e002c5a932f5
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/deploy_remote.sh
> @@ -0,0 +1,9 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +SCRIPTDIR=$(pwd)/tools/fuzz/x86_instruction_emulation/scripts
> +
> +$SCRIPTDIR/install_deps_ubuntu.sh
> +source $SCRIPTDIR/install_afl.sh
> +CC=$AFLPATH/afl-gcc $SCRIPTDIR/build.sh
> +FUZZDIR="${FUZZDIR:-$(pwd)/fuzz}" $SCRIPTDIR/run.sh
> diff --git a/tools/fuzz/x86ie/scripts/gen_output.sh b/tools/fuzz/x86ie/scripts/gen_output.sh
> new file mode 100755
> index 000000000000..6c0707eb6d08
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/gen_output.sh
> @@ -0,0 +1,11 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +if [ "$#" -lt 3 ]; then
> +  echo "Usage: '$0 path/to/afl-harness path/to/afl_crash_dir path/to/linux/src/root'"
> +  exit
> +fi
> +
> +mkdir -p output
> +rm -rf output/*
> +find $2 -type f | parallel ./bin.sh $1 {} $3 '>' ./output/{/}.out
> diff --git a/tools/fuzz/x86ie/scripts/install_afl.sh b/tools/fuzz/x86ie/scripts/install_afl.sh
> new file mode 100755
> index 000000000000..b1c5612eca1c
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/install_afl.sh
> @@ -0,0 +1,14 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +wget http://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz
> +mkdir -p afl
> +tar xzf afl-latest.tgz -C afl --strip-components 1
> +
> +pushd afl
> +set AFL_USE_ASAN
> +make clean all
> +export AFLPATH="$(pwd)"
> +popd
> +
> +sudo bash -c "echo core >/proc/sys/kernel/core_pattern"
> diff --git a/tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh b/tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
> new file mode 100755
> index 000000000000..5525bc8b659c
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/install_deps_ubuntu.sh
> @@ -0,0 +1,5 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +sudo apt update
> +sudo apt install -y make gcc wget screen build-essential libssh-dev flex bison libelf-dev bc

Same question here. This file could bitrot really quickly, and it doesn't 
help any non-Ubuntu users.


I think most files in here are not strictly needed. Maybe split this 
patch into one that actually contains all changes necessary to easily 
start a test run and a separate one with all your convenience scripts?


Alex
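
As a sketch of a distro-neutral alternative to an apt-based install script: check that the required tools are on PATH and report what is missing, leaving installation to the user. The helper name `missing_tools` is hypothetical, and the tool list is an assumption derived from the apt line above, limited to package names that are also command names.

```shell
#!/bin/bash
# Hypothetical helper: print every command from the argument list that is
# not available in PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool list; package names for these commands vary by distro.
missing=$(missing_tools make gcc wget screen flex bison bc)
if [ -n "$missing" ]; then
  # Unquoted on purpose: joins the one-per-line list into a single line.
  echo "Please install:" $missing >&2
fi
```

This only reports missing tools; development libraries such as libelf would still need a separate check.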

> diff --git a/tools/fuzz/x86ie/scripts/rebuild.sh b/tools/fuzz/x86ie/scripts/rebuild.sh
> new file mode 100755
> index 000000000000..ecdc5aa52653
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/rebuild.sh
> @@ -0,0 +1,6 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +make clean
> +make tools/fuzz_clean
> +FUZZDIR="./fuzz" ./tools/fuzz/x86_instruction_emulation/scripts/build.sh
> diff --git a/tools/fuzz/x86ie/scripts/run.sh b/tools/fuzz/x86ie/scripts/run.sh
> new file mode 100755
> index 000000000000..9b7d69e0f0f6
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/run.sh
> @@ -0,0 +1,10 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +FUZZDIR="${FUZZDIR:-$(pwd)/fuzz}"
> +
> +mkdir -p $FUZZDIR/in
> +cp tools/fuzz/x86_instruction_emulation/rand_sample.bin $FUZZDIR/in
> +mkdir -p $FUZZDIR/out
> +
> +screen bash -c "ulimit -Sv $[21999999999 << 10]; ./tools/fuzz/x86_instruction_emulation/scripts/afl-many -m 22000000000 -i $FUZZDIR/in -o $FUZZDIR/out tools/fuzz/x86_instruction_emulation/afl-harness @@"
> diff --git a/tools/fuzz/x86ie/scripts/summarize.sh b/tools/fuzz/x86ie/scripts/summarize.sh
> new file mode 100755
> index 000000000000..27761f283ee3
> --- /dev/null
> +++ b/tools/fuzz/x86ie/scripts/summarize.sh
> @@ -0,0 +1,9 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +
> +if [ "$#" -lt 1 ]; then
> +  echo "Usage: '$0 path/to/output/dir'"
> +  exit
> +fi
> +
> +time bash -c "find $1 -type f -exec tail -n 1 {} \; | sort | uniq -c | sort -rn"
> --
> 2.17.1
> 


end of thread, other threads:[~2019-06-21 13:51 UTC | newest]

Thread overview: 10+ messages
2019-06-12 15:35 [v2, 0/4] x86 instruction emulator fuzzing Sam Caccavale
2019-06-12 15:35 ` [v2, 1/4] Build target for emulate.o as a userspace binary Sam Caccavale
2019-06-21 13:33   ` Alexander Graf
2019-06-12 15:35 ` [v2, 2/4] Emulate simple x86 instructions in userspace Sam Caccavale
2019-06-21 13:40   ` Alexander Graf
2019-06-12 15:35 ` [v2, 3/4] Demonstrating unit testing via simple-harness Sam Caccavale
2019-06-21 13:43   ` Alexander Graf
2019-06-12 15:36 ` [v2, 4/4] Added scripts for filtering, building, deploying Sam Caccavale
2019-06-21 13:50   ` Alexander Graf
2019-06-21 13:30 ` [v2, 0/4] x86 instruction emulator fuzzing Alexander Graf
