All of lore.kernel.org
 help / color / mirror / Atom feed
* x86 instruction emulator fuzzing
@ 2019-05-21 15:39 Sam Caccavale
  2019-05-21 15:39 ` [PATCH 1/3] Build target for emulate.o as a userspace binary Sam Caccavale
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Sam Caccavale @ 2019-05-21 15:39 UTC (permalink / raw)
  Cc: samcacc, samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf,
	karahmed, andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx,
	mingo, bp, hpa, paullangton4, anirudhkaushik, x86, kvm,
	linux-kernel

Dear all,

This series aims to provide an entrypoint for, and fuzz KVM's x86 instruction
emulator from userspace.  It mirrors Xen's application of the AFL fuzzer to
it's instruction emulator in the hopes of discovering vulnerabilities.
Since this entrypoint also allows arbitrary execution of the emulators code
from userspace, it may also be useful for testing.

The current 3 patches build the emulator and 2 harnesses: simple-harness is
an example of unit testing; afl-harness is a frontend for the AFL fuzzer.
They are early POC and include some issues outlined under "Issues."

Patches
=======

- 01: Builds and links afl-harness with the required kernel objects.
- 02: Introduces the minimal set of emulator operations and supporting code
to emulate simple instructions.
- 03: Demonstrates simple-harness as a unit test.

Issues
=======

1. Currently, building requires manually running the `make_deps` script
since I was unable to make the kernel objects a dependency of the tool.
2. The code will segfault if `CONFIG_STACKPROTECTOR=y` in config.
3. The code requires stderr to be buffered or it otherwise segfaults.

The latter two issues seem related and all of them are likely fixable by
someone more familiar with the linux than me.

Concerns
=======

I was able to carve the `arch/x86/kvm/emulate.c` code, but the emulator is
constructed in such a way that a lot of the code which enforces expected
behavior lives in the x86_emulate_ops supplied in `arch/x86/kvm/x86.c`.
Testing the emulator is still valuable, but a reproducible way to use the kvm
ops would be useful.

Any comments/suggestions are greatly appreciated.

Best,
Sam Caccavale





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] Build target for emulate.o as a userspace binary
  2019-05-21 15:39 x86 instruction emulator fuzzing Sam Caccavale
@ 2019-05-21 15:39 ` Sam Caccavale
  2019-05-31  8:02   ` Alexander Graf
  2019-05-21 15:39 ` [PATCH 2/3] Emulate simple x86 instructions in userspace Sam Caccavale
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Sam Caccavale @ 2019-05-21 15:39 UTC (permalink / raw)
  Cc: samcacc, samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf,
	karahmed, andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx,
	mingo, bp, hpa, paullangton4, anirudhkaushik, x86, kvm,
	linux-kernel

This commit contains the minimal set of functionality to build
afl-harness around arch/x86/emulate.c which allows exercising code
in that source file, like x86_emulate_insn.  Resolving the
dependencies was done via GCC's -H flag by get_headers.py.

---
 tools/Makefile                                |   9 ++
 .../fuzz/x86_instruction_emulation/.gitignore |   2 +
 tools/fuzz/x86_instruction_emulation/Makefile |  57 +++++++
 .../fuzz/x86_instruction_emulation/README.md  |  12 ++
 .../x86_instruction_emulation/afl-harness.c   | 149 ++++++++++++++++++
 tools/fuzz/x86_instruction_emulation/common.h |  87 ++++++++++
 .../x86_instruction_emulation/emulator_ops.c  |  58 +++++++
 .../x86_instruction_emulation/emulator_ops.h  | 117 ++++++++++++++
 .../scripts/get_headers.py                    |  95 +++++++++++
 .../scripts/make_deps                         |   4 +
 tools/fuzz/x86_instruction_emulation/stubs.c  |  56 +++++++
 tools/fuzz/x86_instruction_emulation/stubs.h  |  52 ++++++
 12 files changed, 698 insertions(+)
 create mode 100644 tools/fuzz/x86_instruction_emulation/.gitignore
 create mode 100644 tools/fuzz/x86_instruction_emulation/Makefile
 create mode 100644 tools/fuzz/x86_instruction_emulation/README.md
 create mode 100644 tools/fuzz/x86_instruction_emulation/afl-harness.c
 create mode 100644 tools/fuzz/x86_instruction_emulation/common.h
 create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.c
 create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.h
 create mode 100644 tools/fuzz/x86_instruction_emulation/scripts/get_headers.py
 create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/make_deps
 create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.c
 create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.h

diff --git a/tools/Makefile b/tools/Makefile
index 3dfd72ae6c1a..4d68817b7e49 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -94,6 +94,12 @@ freefall: FORCE
 kvm_stat: FORCE
 	$(call descend,kvm/$@)
 
+fuzz: FORCE
+	$(call descend,fuzz/x86_instruction_emulation)
+
+fuzz_deps: FORCE
+	$(call descend,fuzz/x86_instruction_emulation,fuzz_deps)
+
 all: acpi cgroup cpupower gpio hv firewire liblockdep \
 		perf selftests spi turbostat usb \
 		virtio vm bpf x86_energy_perf_policy \
@@ -171,6 +177,9 @@ tmon_clean:
 freefall_clean:
 	$(call descend,laptop/freefall,clean)
 
+fuzz_clean:
+	$(call descend,fuzz/x86_instruction_emulation,clean)
+
 build_clean:
 	$(call descend,build,clean)
 
diff --git a/tools/fuzz/x86_instruction_emulation/.gitignore b/tools/fuzz/x86_instruction_emulation/.gitignore
new file mode 100644
index 000000000000..7d44f7ce266e
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/.gitignore
@@ -0,0 +1,2 @@
+*.o
+*-harness
diff --git a/tools/fuzz/x86_instruction_emulation/Makefile b/tools/fuzz/x86_instruction_emulation/Makefile
new file mode 100644
index 000000000000..d2854a332605
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/Makefile
@@ -0,0 +1,57 @@
+ROOT_DIR=../../..
+THIS_DIR=tools/fuzz/x86_instruction_emulation
+
+include ../../scripts/Makefile.include
+
+.DEFAULT_GOAL := all
+
+INCLUDES := $(patsubst -I./%,-I./$(ROOT_DIR)/%, $(LINUXINCLUDE))
+INCLUDES := $(patsubst ./include/%,./$(ROOT_DIR)/include/%, $(INCLUDES))
+INCLUDES += -include ./$(ROOT_DIR)/include/linux/compiler_types.h
+
+$(ROOT_DIR)/.config:
+	make -C $(ROOT_DIR) menuconfig
+	sed -i -r 's/^#? *CONFIG_KVM(.*)=.*/CONFIG_KVM\1=y/' $(ROOT_DIR)/.config
+
+
+ifdef DEBUG
+KBUILD_CFLAGS += -DDEBUG
+endif
+KBUILD_CFLAGS += -g -O0
+
+# This may still not a final solution-- but multiple build flags need to be
+# filtered out of KBUILD_FLAGS so a solution is needed.
+UNNEEDED_FLAGS:= -mno-sse
+KBUILD_CFLAGS := $(filter-out $(UNNEEDED_FLAGS),$(KBUILD_CFLAGS))
+
+DEPS := emulator_ops.h stubs.h common.h
+
+%.o: %.c $(DEPS)
+	$(CC) $(KBUILD_CFLAGS) $(INCLUDES) -g -c -o $@ $<
+
+KERNEL_OBJS := arch/x86/kvm/emulate.o \
+			arch/x86/lib/retpoline.o \
+			lib/find_bit.o
+KERNEL_OBJS := $(patsubst %,$(ROOT_DIR)/%, $(KERNEL_OBJS))
+# I don't know how to make KERNEL_OBJS a dependency of this.  I cannot call
+# make from here since it causes a cycle and I would rather not hardcode
+# these dependencies into _another_ makefile.
+$(KERNEL_OBJS): $(ROOT_DIR)/.config
+	$(error Run `./tools/fuzz/x86_instruction_emulation/scripts/make_deps' first (setting envvar CC if desired).)
+
+LOCAL_OBJS := emulator_ops.o stubs.o
+afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
+	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
+
+all: afl-harness
+
+.PHONY: fuzz_deps
+fuzz_deps:
+	cd $(ROOT_DIR) && \
+	python3 $(THIS_DIR)/scripts/get_headers.py \
+		$(THIS_DIR)/afl-harness.c \
+		arch/x86/kvm/emulate.c
+
+.PHONY: clean
+clean:
+	$(RM) -r *.o afl-harness
diff --git a/tools/fuzz/x86_instruction_emulation/README.md b/tools/fuzz/x86_instruction_emulation/README.md
new file mode 100644
index 000000000000..a78ac51b0152
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/README.md
@@ -0,0 +1,12 @@
+# Building
+
+From the root of linux, run:
+1. `./tools/fuzz/x86_instruction_emulation/scripts/make_deps`
+  - Optionally, set the envvar CC with your desired compiler.
+2. `make tools/fuzz`
+
+## TODO
+
+I'd like to add the object files built in `make_deps` as real
+dependencies of `make tools/fuzz` but it causes a cycle in Make to
+add them to `tools/fuzz/x86_instruction_emulation/Makefile`.
\ No newline at end of file
diff --git a/tools/fuzz/x86_instruction_emulation/afl-harness.c b/tools/fuzz/x86_instruction_emulation/afl-harness.c
new file mode 100644
index 000000000000..b3b09d7f15f2
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/afl-harness.c
@@ -0,0 +1,149 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * x86 Instruction Emulation Fuzzing Wrapper
+ *
+ * Authors:
+ *   Sam Caccavale   <samcacc@amazon.de>
+ *
+ * Supporting code from xen:
+ *  xen/tools/fuzz/x86_instruction_emulation/afl-harness.c
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * From: xen/master f68f35fd2016e36ee30f8b3e7dfd46c554407ac1
+ */
+
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+#include "emulator_ops.h"
+
+/* Arbitrary, but limiting fuzz input size is important. */
+#define MAX_INPUT_SIZE 4096
+#define INSTRUCTION_BYTES (MAX_INPUT_SIZE - MIN_INPUT_SIZE)
+
+int main(int argc, char **argv)
+{
+	size_t size;
+	FILE *fp = NULL;
+	int max, count;
+	struct state *state;
+
+	setbuf(stdin, NULL);
+	setbuf(stdout, NULL);
+
+	while (1) {
+		enum { OPT_INPUT_SIZE,
+		};
+		static const struct option lopts[] = {
+			{ "input-size", no_argument, NULL, OPT_INPUT_SIZE },
+			{ 0, 0, 0, 0 }
+		};
+		int c = getopt_long_only(argc, argv, "", lopts, NULL);
+
+		if (c == -1)
+			break;
+
+		switch (c) {
+		case OPT_INPUT_SIZE:
+			printf("Min: %u\n", MIN_INPUT_SIZE);
+			printf("Max: %u\n", MAX_INPUT_SIZE);
+			exit(0);
+			break;
+
+		case '?':
+			printf("Usage: %s $FILE [$FILE...] | [--input-size]\n",
+			       argv[0]);
+			exit(-1);
+			break;
+
+		default:
+			printf("Bad getopt return %d (%c)\n", c, c);
+			exit(-1);
+			break;
+		}
+	}
+
+	max = argc - optind;
+
+	if (!max) { /* No positional parameters.  Use stdin. */
+		max = 1;
+		fp = stdin;
+	}
+
+	state = create_emulator();
+	state->data = malloc(INSTRUCTION_BYTES);
+
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+	__AFL_INIT();
+
+	/*
+	 * This is the number of times AFL's forkserver should reuse a
+	 * process to fuzz the target.  1000 is the recommended starting
+	 * point.  Future tweaking may or may not yeild better results.
+	 */
+	for (count = 0; __AFL_LOOP(1000);)
+#else
+	for (count = 0; count < max; count++)
+#endif
+	{
+		if (fp != stdin) { /* If not stdin, open the provided file. */
+			printf("Opening file %s\n", argv[optind + count]);
+			fp = fopen(argv[optind + count], "rb");
+			if (fp == NULL) {
+				perror("fopen");
+				exit(-1);
+			}
+		}
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+		else {
+			/*
+			 * This will ensure we're dealing with a clean stream
+			 * state after the afl-fuzz process messes with the
+			 * open file handle.
+			 */
+			fseek(fp, 0, SEEK_SET);
+		}
+#endif
+		reset_emulator(state);
+
+		size = fread(state, 1, MIN_INPUT_SIZE, fp);
+		if (size != MIN_INPUT_SIZE) {
+			printf("Input does not populate state\n");
+			if (max == 1)
+				exit(-1);
+		}
+
+		size = fread(state->data, 1, INSTRUCTION_BYTES, fp);
+		state->data_available = size;
+
+		if (ferror(fp)) {
+			perror("fread");
+			exit(-1);
+		}
+
+		/* Only run the test if the input file was < than INPUT_SIZE */
+		if (feof(fp)) {
+			initialize_emulator(state);
+			emulate_until_complete(state);
+		} else {
+			printf("Input too large\n");
+			/* Don't exit if we're doing batch processing */
+			if (max == 1)
+				exit(-1);
+		}
+
+		if (fp != stdin) {
+			fclose(fp);
+			fp = NULL;
+		}
+	}
+
+	free(state->data);
+	free_emulator(state);
+	return 0;
+}
diff --git a/tools/fuzz/x86_instruction_emulation/common.h b/tools/fuzz/x86_instruction_emulation/common.h
new file mode 100644
index 000000000000..9aec2bea3d50
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/common.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <stdint.h>
+
+#define PRIx64 "llx"
+#define MSR_INDEX_MAX 16
+
+#ifdef DEBUG
+#define DEBUG_PRINT 1
+#else
+#define DEBUG_PRINT 0
+#endif
+
+#define debug(...)							\
+	do {								\
+		if (DEBUG_PRINT)					\
+			fprintf(stderr, __VA_ARGS__);			\
+	} while (0)
+
+enum x86_segment {
+	/* General purpose. */
+	x86_seg_es,
+	x86_seg_cs,
+	x86_seg_ss,
+	x86_seg_ds,
+	x86_seg_fs,
+	x86_seg_gs,
+	/* System: Valid to use for implicit table references. */
+	x86_seg_tr,
+	x86_seg_ldtr,
+	x86_seg_gdtr,
+	x86_seg_idtr,
+	/* No Segment: For accesses which are already linear. */
+	x86_seg_none
+};
+
+#define NR_SEG x86_seg_none
+
+struct segment_register {
+	uint16_t sel;
+	union {
+		uint16_t attr;
+		struct {
+			uint16_t type : 4;
+			uint16_t s : 1;
+			uint16_t dpl : 2;
+			uint16_t p : 1;
+			uint16_t avl : 1;
+			uint16_t l : 1;
+			uint16_t db : 1;
+			uint16_t g : 1;
+			uint16_t pad : 4;
+		};
+	};
+	uint32_t limit;
+	uint64_t base;
+};
+
+enum user_regs {
+	REGS_RAX,
+	REGS_RCX,
+	REGS_RDX,
+	REGS_RBX,
+	REGS_RSP,
+	REGS_RBP,
+	REGS_RSI,
+	REGS_RDI,
+	REGS_R8,
+	REGS_R9,
+	REGS_R10,
+	REGS_R11,
+	REGS_R12,
+	REGS_R13,
+	REGS_R14,
+	REGS_R15,
+	NR_REGS
+};
+#define NR_VCPU_REGS NR_REGS
+
+static inline void print_n_bytes(unsigned char *bytes, size_t n)
+{
+	int i;
+
+	for (i = 0; i < n; i++)
+		fprintf(stderr, " %02x", bytes[i]);
+	fprintf(stderr, "\n");
+}
diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.c b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
new file mode 100644
index 000000000000..55ae4e8fbd96
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * x86 Instruction Emulation Fuzzing Wrapper
+ *
+ * Authors:
+ *   Sam Caccavale   <samcacc@amazon.de>
+ *
+ * Supporting code from xen:
+ *  xen/tools/fuzz/x86_instruction_emulation/fuzz_emul.c
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * From: xen/master f68f35fd2016e36ee30f8b3e7dfd46c554407ac1
+ */
+
+#include <string.h>
+#include <stdio.h>
+#include <assert.h>
+#include <stdlib.h>
+
+#include "stubs.h"
+#include "emulator_ops.h"
+
+#include <linux/compiler_attributes.h>
+
+#include <asm/kvm_emulate.h>
+#include <asm/processor-flags.h>
+#include <asm/user_64.h>
+#include <asm/kvm.h>
+
+void initialize_emulator(struct state *state)
+{
+}
+
+int step_emulator(struct state *state)
+{
+	return 0;
+}
+
+int emulate_until_complete(struct state *state)
+{
+	return 0;
+}
+
+struct state *create_emulator(void)
+{
+	struct state *state = malloc(sizeof(struct state));
+	return state;
+}
+
+void reset_emulator(struct state *state)
+{
+}
+
+void free_emulator(struct state *state)
+{
+}
diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.h b/tools/fuzz/x86_instruction_emulation/emulator_ops.h
new file mode 100644
index 000000000000..5ae072d5f205
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.h
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef EMULATOR_OPS_H
+#define EMULATOR_OPS_H
+
+#include "common.h"
+#include "stubs.h"
+#include <asm/kvm_emulate.h>
+
+struct vcpu {
+	unsigned long cr[5];
+	uint64_t msr[MSR_INDEX_MAX];
+	uint64_t regs[NR_REGS];
+	uint64_t rflags;
+	struct segment_register segments[NR_SEG];
+};
+
+/*
+ * Internal state of the emulate harness.  Calculated initially from the input
+ * corpus, and later mutated by the emulation callbacks.
+ */
+struct state {
+	/* Bitmask which enables/disables hooks. */
+	unsigned long options;
+
+	/* Internal representation of emulated CPU. */
+	struct vcpu vcpu;
+
+	/*
+	 * Input bytes are consumed at s.data[eip + s.other_bytes_consumed]
+	 * while eip + size_requested + other_bytes_consumed < data_available
+	 *
+	 * emul_fetch consumes bytes for use as x86 instructions as eip grows
+	 *
+	 * get_bytes_and_increment consumes bytes and increments
+	 * other_bytes_consumed.  These bytes can be used as return values for
+	 * memory reads, random chances to fail, or other purposes.
+	 *
+	 *  Only these two functions should be used to access this data.
+	 *
+	 * This causes .data to be an interspliced source of instructions and
+	 * other data.  Xen's instruction emulation does this to provide
+	 * deterministic randomness on fuzz runs at the cost of complexity to
+	 * crash output.
+	 *
+	 * Other alternatives (two separate streams, getting ins bytes from
+	 * low, and random bytes from high, etc) yield byte streams which may
+	 * not bear an as close correlation with AFL's input.  This was chosen
+	 * since the only drawback to this approach is remedied by simple
+	 * bookkeeping of instruction bytes when debugging crashes.  This is
+	 * enabled by the TODO flag.
+	 */
+	unsigned char *data;
+
+	/* Real amount of data backing state->data[]. */
+	size_t data_available;
+
+	/*
+	 * Amount of bytes consumed for purposes other than instructions.
+	 * E.G. whether a memory access should fault.
+	 */
+	size_t other_bytes_consumed;
+
+	/* Emulation context */
+	struct x86_emulate_ctxt ctxt;
+};
+
+#define MIN_INPUT_SIZE (offsetof(struct state, data))
+
+#define container_of(ptr, type, member)					\
+	({								\
+		const typeof(((type *)0)->member) * __mptr = (ptr);	\
+		(type *)((char *)__mptr - offsetof(type, member));	\
+	})
+
+#define get_state(h) container_of(h, struct state, ctxt)
+
+void buffer_stderr(void) __attribute__((constructor));
+
+/*
+ * Allocates space for, and creates a `struct state`.  The user should set
+ * state->data to their instruction stream and do any modification of the
+ * state-vpcu if desired.
+ */
+extern struct state *create_emulator(void);
+
+/*
+ * memset's all fields of state except data and data_available.
+ */
+extern void reset_emulator(struct state *state);
+
+/*
+ * free_emulator does not free the state->data member, the user should free
+ * it before freeing the emulator.  The alternative implementation either
+ * restrains data to a fixed size or has the user pass in a (pointer, size)
+ * pair which would have to be copied.  Copying this would be slow to fuzz.
+ */
+extern void free_emulator(struct state *state);
+
+/*
+ * Uses the state->option field to disable certain functionality and
+ * initializes the state->ctxt to a valid state.  This is optional if
+ * intentionally testing the emulator in an invalid state.
+ */
+extern void initialize_emulator(struct state *state);
+
+/*
+ * Advances the emulator one instruction and handles any exceptions.
+ */
+extern int step_emulator(struct state *state);
+
+/*
+ * Steps the emulator until no instructions remain or something fails.
+ */
+extern int emulate_until_complete(struct state *state);
+
+#endif /* ifdef EMULATOR_OPS_H */
diff --git a/tools/fuzz/x86_instruction_emulation/scripts/get_headers.py b/tools/fuzz/x86_instruction_emulation/scripts/get_headers.py
new file mode 100644
index 000000000000..94e63cab4e81
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/scripts/get_headers.py
@@ -0,0 +1,95 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0+
+
+"""
+This script compiles files given via arguments with GCC's -H flag.  This
+produces the list of headers used in compilation which we (primitively)
+check for coresponding source files to recur on.  In the end, we have the
+subset of source and header files needed to compile/link the source files
+that were given as arguments.
+
+There are a lot of hacks here including:
+Fickle regex to parse the -H output
+Attempting to use the `.FILE_NAME.o.cmd` that linux produces when building
+Being totally unrelocatable and requiring some linux build env vars
+Running `subproces.run` on input read from a file on disk
+
+If anything breaks it is likely related to one of the above hacks.
+"""
+
+import os
+import sys
+import subprocess
+from pathlib import Path
+
+if len(sys.argv) < 2:
+    print("Supply a starting file(s)")
+    exit(-1)
+
+KBUILD_FLAGS = os.environ.get("KBUILD_CFLAGS", "").split()
+LINUXINCLUDE = os.environ.get("LINUXINCLUDE", "").split()
+
+if [] in (KBUILD_FLAGS, LINUXINCLUDE):
+    print("Environment variable KBUILD_FLAGS or LINUXINCLUDE is not set.")
+    print("\tThese are emitted by the root Makefile-- are you running")
+    print("\tthis script from somewhere other than `make tools/fuzz_deps`?")
+    exit(-1)
+
+queue = [Path(source_file) for source_file in sys.argv[1:]]
+all_headers = set()
+source_files = set()
+
+# While there are source files in the queue left to process
+while queue:
+    current_file = queue.pop()  # Grab a file
+
+    source_files.add(current_file)  # Add it to the set of processed files
+    queue_size = len(queue)
+    print(f"\nFinding headers included in { current_file }:")
+
+    # If there is a `.FILE_NAME.o.cmd` file, then use that to build FILE_NAME.c
+    # if not, then `gcc -H FILE_NAME $KBUILD_FLAGS $LINUXINCLUDE` probably works
+    gcc_command = ["gcc", "-H", current_file] + KBUILD_FLAGS + LINUXINCLUDE
+    cmd_file = current_file.parent / ("." + current_file.name[:-1] + "o.cmd")
+    if cmd_file.exists():
+        with open(cmd_file, "r") as f:
+            gcc_command = f.readline().split(":= ")[1].split() + ["-H"]
+
+    # Build the file, pipe the output to sed to get just the list of headers
+    p1 = subprocess.run(gcc_command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+    p2 = subprocess.run(["sed", "/^\.[^/]/!d"], input=p1.stdout, stdout=subprocess.PIPE)
+
+    headers = p2.stdout.decode("utf-8").split("\n")
+    try:
+        # Output is of type r'\.+ filename' so split on whitespace and take the latter
+        headers = {Path(header.split()[1]) for header in headers if header}
+    except IndexError:  # Fickle regex sometimes produces lines without whitespace
+        print("Breaking output:")
+        print({header for header in headers if header and len(header.split()) < 2})
+        raise
+
+    # This loop is for adding new source files back to the queue
+    for header in headers:  # Ensure all the headers are actually files.
+        assert header.exists()
+
+        # If we have already seen the header file then we have already seen its source
+        # so there is no need to add it.  This is probably redudant with the next check.
+        if header in all_headers:
+            continue
+
+        # If we found a header file with a coresponding source file that we have not
+        # processed, then add it to the queue to process
+        source = header.parent / (header.name[:-1] + "c")
+        if source.exists() and source not in source_files:
+            queue.append(source)
+
+    print(
+        f"\t{ len(headers - all_headers) } new headers found,"
+        f"{ len(headers | all_headers) } total."
+    )
+    print(f"\tQueue size was { queue_size } and is now { len(queue) }.")
+    all_headers.update(headers)  # Add all the newly found headers to all_headers
+
+print("\n\nObjects included:")
+print("".join(sorted(f"\t{ str(path) }\n" for path in source_files)))
+exit(0)
diff --git a/tools/fuzz/x86_instruction_emulation/scripts/make_deps b/tools/fuzz/x86_instruction_emulation/scripts/make_deps
new file mode 100755
index 000000000000..4e2ab38c886f
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/scripts/make_deps
@@ -0,0 +1,4 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+
+make ${CC:+ "CC=$CC"} arch/x86/kvm/emulate.o arch/x86/lib/retpoline.o lib/find_bit.o
diff --git a/tools/fuzz/x86_instruction_emulation/stubs.c b/tools/fuzz/x86_instruction_emulation/stubs.c
new file mode 100644
index 000000000000..d2ffeb3ec599
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/stubs.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * These functions/symbols are required to build emulate.o but belong in
+ * linux tree files with include many other headers/unnecessary symbols
+ * and cause building emulate.o to become far more complicated than just
+ * stubbing them out here.  In order to stub them without modifying the
+ * included source, we're declaring many of them here.
+ */
+#include <linux/types.h>
+
+#include "stubs.h"
+
+#define VMWARE_BACKDOOR_PMC_HOST_TSC 0x10000
+#define VMWARE_BACKDOOR_PMC_REAL_TIME 0x10001
+#define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002
+
+/*
+ * This is easy enough to stub out its full functionality.
+ */
+bool is_vmware_backdoor_pmc(u32 pmc_idx)
+{
+	switch (pmc_idx) {
+	case VMWARE_BACKDOOR_PMC_HOST_TSC:
+	case VMWARE_BACKDOOR_PMC_REAL_TIME:
+	case VMWARE_BACKDOOR_PMC_APPARENT_TIME:
+		return true;
+	}
+	return false;
+}
+
+/*
+ * Printk has no side effects so we don't need to worry about it.
+ */
+int printk(const char *s, ...)
+{
+	return 0;
+}
+
+/*
+ * This is required by source included from emulate.c and would be linked in.
+ */
+bool enable_vmware_backdoor;
+
+struct exception_table_entry;
+struct pt_regs;
+
+/*
+ * TODO: we likely need to implement this to handle emulated exceptions.
+ */
+bool ex_handler_default(const struct exception_table_entry *fixup,
+			struct pt_regs *regs, int trapnr,
+			unsigned long error_code, unsigned long fault_addr)
+{
+	return true;
+}
diff --git a/tools/fuzz/x86_instruction_emulation/stubs.h b/tools/fuzz/x86_instruction_emulation/stubs.h
new file mode 100644
index 000000000000..02c2f6f9bc26
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/stubs.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef STUBS_H
+#define STUBS_H
+
+
+/*
+ * Several typedefs in linux/types.h collide with ones in standard libs.
+ *
+ * kvm_emulate.h uses many other types in linux/types.h heavily, so we must
+ * include it but having access to the standard libs is also important for the
+ * harnesses.
+ *
+ * A solution would be to not include any kernel files in the harness but then
+ * useful introspection into state->ctxt is impossible as x86_emulate_ctxt is
+ * defined in kvm_emulate.h and uses a ton of linux/types.h types.
+ *
+ * Instead, we use dummy defines to avoid the collision and linux seems
+ * content to use the equivalent types from the standard libs.
+ */
+#define fd_set fake_fd_set
+#define dev_t fake_dev_t
+#define nlink_t fake_nlink_t
+#define timer_t fake_timer_t
+#define loff_t fake_loff_t
+#define u_int64_t fake_u_int64_t
+#define int64_t fake_int64_t
+#define blkcnt_t fake_blkcnt_t
+#define uint64_t fake_uint64_t
+#include <linux/types.h>
+#undef fd_set
+#undef dev_t
+#undef nlint_t
+#undef timer_t
+#undef loff_t
+#undef u_int64_t
+#undef int64_t
+#undef blkcnt_t
+#undef uint64_t
+
+/*
+ * These are identically defined in string.h and included kernel headers.
+ */
+#ifdef __always_inline
+#undef __always_inline
+#endif
+
+#ifdef __attribute_const__
+#undef __attribute_const__
+#endif
+
+#endif /* STUBS_H */
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] Emulate simple x86 instructions in userspace
  2019-05-21 15:39 x86 instruction emulator fuzzing Sam Caccavale
  2019-05-21 15:39 ` [PATCH 1/3] Build target for emulate.o as a userspace binary Sam Caccavale
@ 2019-05-21 15:39 ` Sam Caccavale
  2019-05-31  8:38   ` Alexander Graf
  2019-05-21 15:39 ` [PATCH 3/3] Demonstrating unit testing via simple-harness Sam Caccavale
  2019-05-31  8:39 ` x86 instruction emulator fuzzing Alexander Graf
  3 siblings, 1 reply; 11+ messages in thread
From: Sam Caccavale @ 2019-05-21 15:39 UTC (permalink / raw)
  Cc: samcacc, samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf,
	karahmed, andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx,
	mingo, bp, hpa, paullangton4, anirudhkaushik, x86, kvm,
	linux-kernel

Added the minimal subset of code to run afl-harness with a binary file
as input.  These bytes are used to populate the vcpu structure and then
as an instruction stream for the emulator.  It does not attempt to handle
exceptions an only supports very simple ops.

---
 .../x86_instruction_emulation/emulator_ops.c  | 324 +++++++++++++++++-
 .../scripts/afl-many                          |  28 ++
 .../scripts/bin_fuzz                          |  23 ++
 3 files changed, 374 insertions(+), 1 deletion(-)
 create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/afl-many
 create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz

diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.c b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
index 55ae4e8fbd96..38bba4c97e2f 100644
--- a/tools/fuzz/x86_instruction_emulation/emulator_ops.c
+++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
@@ -29,17 +29,331 @@
 #include <asm/user_64.h>
 #include <asm/kvm.h>
 
+/*
+ * Due to some quirk of building, the way printing to an unbuffered stream
+ * is implemented smashes the stack.  For now, we'll just buffer stderr but
+ * a fix is needed.
+ */
+char buff[BUFSIZ];
+void buffer_stderr(void)
+{
+	setvbuf(stderr, buff, _IOLBF, BUFSIZ);
+}
+
+#define number_of_registers(c) (c->mode == X86EMUL_MODE_PROT64 ? 16 : 8)
+
+ulong emul_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg)
+{
+	assert(reg < number_of_registers(ctxt));
+	return get_state(ctxt)->vcpu.regs[reg];
+}
+
+void emul_write_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg, ulong val)
+{
+	assert(reg < number_of_registers(ctxt));
+	get_state(ctxt)->vcpu.regs[reg] = val;
+}
+
+#undef number_of_registers /* (ctxt->mode == X86EMUL_MODE_PROT64 ? 16 : 8) */
+
+/* All read ops: */
+
+#define emul_offset (state->ctxt.eip + state->other_bytes_consumed)
+#define decode_offset (state->ctxt._eip + state->other_bytes_consumed)
+/*
+ * There is a very loose abstraction around ctxt.eip, ctxt._eip, and
+ * state.other_bytes_consumed.
+ *
+ * eip is incremented when an instruction is emulated by x86_emulate_insn
+ * _eip is incremeneted when an instruction is decoded by x86_decode_insn
+ *
+ * state.other_bytes_consumed is incremented when bytes are consumed by
+ *  get_bytes_and_increment for any use besides an x86 instruction.
+ *
+ * The bytes between emul_offset and decode_offset are instructions yet
+ * to be executed.
+ */
+
+#define _data_available(bytes) (decode_offset + bytes < state->data_available)
+
+static int _get_bytes(void *dst, struct state *state, unsigned int bytes,
+		      char *callee)
+{
+	if (!_data_available(bytes)) {
+		fprintf(stderr, "Tried retrieving %d bytes\n", bytes);
+		fprintf(stderr, "%s failed to retrieve bytes for %s.\n",
+			__func__, callee);
+		return X86EMUL_UNHANDLEABLE;
+	}
+
+	memcpy(dst, &state->data[decode_offset], bytes);
+	return X86EMUL_CONTINUE;
+}
+
+/*
+ * The only function that any x86_emulate_ops should call to retrieve bytes.
+ * See comments in struct state definition for more information.
+ */
+static int get_bytes_and_increment(void *dst, struct state *state,
+				   unsigned int bytes, char *callee)
+{
+	int rc = _get_bytes(dst, state, bytes, callee);
+
+	if (rc == X86EMUL_CONTINUE)
+		state->other_bytes_consumed += bytes;
+
+	return rc;
+}
+
+/*
+ * This is called by x86_decode_insn to fetch bytes and is the only function
+ * that gets bytes from state.data without incrementing .other_byes_consumed
+ * since the emulator will increment ._eip during x86_decode_insn and ._eip
+ * is used as an index into state.data.
+ */
+int emul_fetch(struct x86_emulate_ctxt *ctxt, unsigned long addr, void *val,
+	       unsigned int bytes, struct x86_exception *fault)
+{
+	if (_get_bytes(val, get_state(ctxt), bytes, "emul_fetch") !=
+	    X86EMUL_CONTINUE) {
+		return X86EMUL_UNHANDLEABLE;
+	}
+
+	return X86EMUL_CONTINUE;
+}
+
+ulong emul_get_cr(struct x86_emulate_ctxt *ctxt, int cr)
+{
+	return get_state(ctxt)->vcpu.cr[cr];
+}
+
+int emul_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val)
+{
+	get_state(ctxt)->vcpu.cr[cr] = val;
+	return 0;
+}
+
+unsigned int emul_get_hflags(struct x86_emulate_ctxt *ctxt)
+{
+	return get_state(ctxt)->vcpu.rflags;
+}
+
+void emul_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned int hflags)
+{
+	get_state(ctxt)->vcpu.rflags = hflags;
+}
+
+/* End of emulator ops */
+
+#define SET(h) .h = emul_##h
+const struct x86_emulate_ops all_emulator_ops = {
+	SET(read_gpr),
+	SET(write_gpr),
+	SET(fetch),
+	SET(get_cr),
+	SET(set_cr),
+	SET(get_hflags),
+	SET(set_hflags),
+};
+#undef SET
+
+enum {
+	HOOK_read_gpr,
+	HOOK_write_gpr,
+	HOOK_fetch,
+	HOOK_get_cr,
+	HOOK_set_cr,
+	HOOK_get_hflags,
+	HOOK_set_hflags
+};
+
+/* Put reg_x into a definitely valid state. */
+#define CANONICALIZE(reg_x)						\
+	do {								\
+		uint64_t _y = (reg_x);					\
+		if (_y & (1ULL << 47))					\
+			_y |= (~0ULL) << 48;				\
+		else							\
+			_y &= (1ULL << 48) - 1;				\
+		debug("Canonicalized %" PRIx64 " to %" PRIx64 "\n",	\
+			reg_x, y);					\
+		(reg_x) = _y;						\
+	} while (0)
+
+/*
+ * Canonicalizes reg if options << CANONICALIZE_##reg is set.
+ * This weighs the emulator to use canonical values for important registers.
+ *
+ * Expects options and regs to be defined.
+ */
+#define CANONICALIZE_MAYBE(reg)						\
+	do {								\
+		if (!(options & (1 << CANONICALIZE_##reg)))		\
+			CANONICALIZE(regs->reg);			\
+	} while (0)
+
+/*
+ * Disable an x86_emulate_op if options << HOOK_op is set.
+ *
+ * Expects options to be defined.
+ */
+#define MAYBE_DISABLE_HOOK(h)						\
+	do {								\
+		if (options & (1 << HOOK_##h)) {			\
+			vcpu->ctxt.ops.h = NULL;			\
+			debug("Disabling hook " #h "\n");		\
+		}							\
+	} while (0)
+
+/*
+ * FROM XEN:
+ *
+ * Constrain input to architecturally-possible states where
+ * the emulator relies on these
+ *
+ * In general we want the emulator to be as absolutely robust as
+ * possible; which means that we want to minimize the number of things
+ * it assumes about the input state.  Tesing this means minimizing and
+ * removing as much of the input constraints as possible.
+ *
+ * So we only add constraints that (in general) have been proven to
+ * cause crashes in the emulator.
+ *
+ * For future reference: other constraints which might be necessary at
+ * some point:
+ *
+ * - EFER.LMA => !EFLAGS.NT
+ * - In VM86 mode, force segment...
+ *  - ...access rights to 0xf3
+ *  - ...limits to 0xffff
+ *  - ...bases to below 1Mb, 16-byte aligned
+ *  - ...selectors to (base >> 4)
+ */
+static void sanitize_input(struct state *s)
+{
+	/* Some hooks can't be disabled. */
+	// options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
+
+	/* Zero 'private' entries */
+	// regs->error_code = 0;
+	// regs->entry_vector = 0;
+
+	// CANONICALIZE_MAYBE(rip);
+	// CANONICALIZE_MAYBE(rsp);
+	// CANONICALIZE_MAYBE(rbp);
+
+	/*
+	 * CR0.PG can't be set if CR0.PE isn't set.  Set is more interesting, so
+	 * set PE if PG is set.
+	 */
+	if (s->vcpu.cr[0] & X86_CR0_PG)
+		s->vcpu.cr[0] |= X86_CR0_PE;
+
+	/* EFLAGS.VM not available in long mode */
+	if (s->ctxt.mode == X86EMUL_MODE_PROT64)
+		s->vcpu.rflags &= ~X86_EFLAGS_VM;
+
+	/* EFLAGS.VM implies 16-bit mode */
+	if (s->vcpu.rflags & X86_EFLAGS_VM) {
+		s->vcpu.segments[x86_seg_cs].db = 0;
+		s->vcpu.segments[x86_seg_ss].db = 0;
+	}
+}
+
+static void init_x86_emulate_ctxt(struct x86_emulate_ctxt *ctxt)
+{
+	ctxt->ops = &all_emulator_ops;
+	ctxt->eflags = 0;
+	ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
+	ctxt->eip = 0;
+
+	ctxt->mode = X86EMUL_MODE_PROT64; // TODO: eventually vary this?
+
+	init_decode_cache(ctxt);
+}
+
 void initialize_emulator(struct state *state)
 {
+	state->ctxt.ops = &all_emulator_ops;
+
+	/* See also sanitize_input, some hooks can't be disabled. */
+	// MAYBE_DISABLE_HOOK(read_gpr);
+
+	sanitize_input(state);
+	init_x86_emulate_ctxt(&state->ctxt);
+}
+
+static const char *const x86emul_mode_string[] = {
+	[X86EMUL_MODE_REAL] = "X86EMUL_MODE_REAL",
+	[X86EMUL_MODE_VM86] = "X86EMUL_MODE_VM86",
+	[X86EMUL_MODE_PROT16] = "X86EMUL_MODE_PROT16",
+	[X86EMUL_MODE_PROT32] = "X86EMUL_MODE_PROT32",
+	[X86EMUL_MODE_PROT64] = "X86EMUL_MODE_PROT64",
+};
+
+static void dump_state_after(const char *desc, struct state *state)
+{
+	debug(" -- State after %s --\n", desc);
+	debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
+	debug(" cr0: %lx\n", state->vcpu.cr[0]);
+	debug(" cr3: %lx\n", state->vcpu.cr[3]);
+	debug(" cr4: %lx\n", state->vcpu.cr[4]);
+
+	debug("Decode _eip: %lu\n", state->ctxt._eip);
+	debug("Emulate eip: %lu\n", state->ctxt.eip);
+
+	debug("\n");
 }
 
 int step_emulator(struct state *state)
 {
-	return 0;
+	int rc, prev_eip = state->ctxt.eip;
+	int decode_size = state->data_available - decode_offset;
+
+	if (decode_size < 15) {
+		rc = x86_decode_insn(&state->ctxt, &state->data[decode_offset],
+				     decode_size);
+	} else {
+		rc = x86_decode_insn(&state->ctxt, NULL, 0);
+	}
+	debug("Decode result: %d\n", rc);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+
+	debug("Instruction: ");
+	print_n_bytes(&state->data[emul_offset],
+		      state->ctxt._eip - state->ctxt.eip);
+
+	rc = x86_emulate_insn(&state->ctxt);
+	debug("Emulation result: %d\n", rc);
+	dump_state_after("emulating", state);
+
+	if (state->ctxt.have_exception) {
+		fprintf(stderr, "Emulator propagated exception: { ");
+		fprintf(stderr, "vector: %d, ", state->ctxt.exception.vector);
+		fprintf(stderr, "error code: %d }\n",
+			state->ctxt.exception.error_code);
+		rc = X86EMUL_UNHANDLEABLE;
+	} else if (prev_eip == state->ctxt.eip) {
+		fprintf(stderr, "ctxt.eip not advanced.\n");
+		rc = X86EMUL_UNHANDLEABLE;
+	}
+
+	if (decode_offset == state->data_available)
+		debug("emulator is done\n");
+
+	return rc;
 }
 
 int emulate_until_complete(struct state *state)
 {
+	int count = 0;
+
+	do {
+		count++;
+	} while (step_emulator(state) == X86EMUL_CONTINUE);
+
+	debug("Emulated %d instructions\n", count);
 	return 0;
 }
 
@@ -51,8 +365,16 @@ struct state *create_emulator(void)
 
 void reset_emulator(struct state *state)
 {
+	unsigned char *data = state->data;
+	size_t data_available = state->data_available;
+
+	memset(state, 0, sizeof(struct state));
+
+	state->data = data;
+	state->data_available = data_available;
 }
 
 void free_emulator(struct state *state)
 {
+	free(state);
 }
diff --git a/tools/fuzz/x86_instruction_emulation/scripts/afl-many b/tools/fuzz/x86_instruction_emulation/scripts/afl-many
new file mode 100755
index 000000000000..ab15258573a2
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/scripts/afl-many
@@ -0,0 +1,28 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# This is for running AFL over NPROC or `nproc` cores with normal AFL options.
+
+export AFL_NO_AFFINITY=1
+
+while [ -z "$sync_dir" ]; do
+  while getopts ":o:" opt; do
+    case "${opt}" in
+      o)
+        sync_dir="${OPTARG}"
+        ;;
+      *)
+        ;;
+    esac
+  done
+  ((OPTIND++))
+  [ $OPTIND -gt $# ] && break
+done
+
+for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
+    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
+done
+taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &
+
+sleep 5
+watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}"
+pkill afl-fuzz
diff --git a/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
new file mode 100755
index 000000000000..e570b17f9404
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
@@ -0,0 +1,23 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# This runs the afl-harness at $1, $2 times (or 100)
+# It runs uniq and sorts the output to give an idea of what is causing the
+# most crashes.  Useful for deciding what to implement next.
+
+if [ "$#" -lt 1 ]; then
+  echo "Usage: './bin_fuzz path_to_afl-harness [number of times to run]"
+  exit
+fi
+
+mkdir -p fuzz
+rm -f fuzz/*.in fuzz/*.out
+
+for i in $(seq 1 1 ${2:-100})
+do
+  {
+  head -c 500 /dev/urandom | tee fuzz/$i.in | ./$1
+  } > fuzz/$i.out 2>&1
+
+done
+
+find ./fuzz -name '*.out' -exec tail -1 {} \; | sed 's/.* Segmen/Segman/' | sed -r 's/^(\s[0-9a-f]{2})+$/misc instruction output/' | sort | uniq -c | sort -rn
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] Demonstrating unit testing via simple-harness
  2019-05-21 15:39 x86 instruction emulator fuzzing Sam Caccavale
  2019-05-21 15:39 ` [PATCH 1/3] Build target for emulate.o as a userspace binary Sam Caccavale
  2019-05-21 15:39 ` [PATCH 2/3] Emulate simple x86 instructions in userspace Sam Caccavale
@ 2019-05-21 15:39 ` Sam Caccavale
  2019-05-31  8:39 ` x86 instruction emulator fuzzing Alexander Graf
  3 siblings, 0 replies; 11+ messages in thread
From: Sam Caccavale @ 2019-05-21 15:39 UTC (permalink / raw)
  Cc: samcacc, samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf,
	karahmed, andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx,
	mingo, bp, hpa, paullangton4, anirudhkaushik, x86, kvm,
	linux-kernel

Simple-harness.c uses inline asm support to generate asm and then has the
emulator emulate this code.  This may be useful as a form of testing for
the emulator.

---
 tools/fuzz/x86_instruction_emulation/Makefile |  7 ++-
 .../simple-harness.c                          | 49 +++++++++++++++++++
 2 files changed, 54 insertions(+), 2 deletions(-)
 create mode 100644 tools/fuzz/x86_instruction_emulation/simple-harness.c

diff --git a/tools/fuzz/x86_instruction_emulation/Makefile b/tools/fuzz/x86_instruction_emulation/Makefile
index d2854a332605..bb29149ae0f7 100644
--- a/tools/fuzz/x86_instruction_emulation/Makefile
+++ b/tools/fuzz/x86_instruction_emulation/Makefile
@@ -43,7 +43,10 @@ LOCAL_OBJS := emulator_ops.o stubs.o
 afl-harness: afl-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
 	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
 
-all: afl-harness
+simple-harness: simple-harness.o $(LOCAL_OBJS) $(KERNEL_OBJS)
+	@$(CC) -v $(KBUILD_CFLAGS) $(LOCAL_OBJS) $(KERNEL_OBJS) $< $(INCLUDES) -Istubs.h -o $@ -no-pie
+
+all: afl-harness simple-harness
 
 .PHONY: fuzz_deps
 fuzz_deps:
@@ -54,4 +57,4 @@ fuzz_deps:
 
 .PHONY: clean
 clean:
-	$(RM) -r *.o afl-harness
+	$(RM) -r *.o afl-harness simple-harness
diff --git a/tools/fuzz/x86_instruction_emulation/simple-harness.c b/tools/fuzz/x86_instruction_emulation/simple-harness.c
new file mode 100644
index 000000000000..9601aafb9423
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulation/simple-harness.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include "emulator_ops.h"
+#include <asm/kvm_emulate.h>
+
+extern void foo(void)
+{
+	asm volatile("__start:mov $0xdeadbeef, %rax;"
+		     "xor %rax, %rax;"
+		     "__end:");
+}
+
+int main(int argc, char **argv)
+{
+	extern unsigned char __start;
+	extern unsigned char __end;
+	struct state *state = create_emulator();
+	int rc;
+
+	/* Ensures the emulator is in a valid state. */
+	initialize_emulator(state);
+
+	/* Provide the emulator with instructions to emulate. */
+	state->data = &__start;
+	state->data_available = &__end - &__start;
+
+	/* Execute mov $0xdeadbeef, %rax */
+	rc = step_emulator(state);
+	/* Check that the emulator succeeded. */
+	assert(rc == X86EMUL_CONTINUE);
+	/* Check that 0xdeadbeef was moved to rax. */
+	assert(state->ctxt._regs[REGS_RAX] == 0xdeadbeef);
+
+	/* Execute xor %rax, %rax */
+	rc = step_emulator(state);
+	/* Check that the emulator succeeded. */
+	assert(rc == X86EMUL_CONTINUE);
+	/* Check that xoring rax with itself cleared rax. */
+	assert(state->ctxt._regs[REGS_RAX] == 0);
+
+	/* Free the emulator. */
+	free_emulator(state);
+
+	return 0;
+}
-- 
2.17.1




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] Build target for emulate.o as a userspace binary
  2019-05-21 15:39 ` [PATCH 1/3] Build target for emulate.o as a userspace binary Sam Caccavale
@ 2019-05-31  8:02   ` Alexander Graf
  2019-06-12 15:19     ` samcacc
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Graf @ 2019-05-31  8:02 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 21.05.19 17:39, Sam Caccavale wrote:
> This commit contains the minimal set of functionality to build
> afl-harness around arch/x86/emulate.c which allows exercising code
> in that source file, like x86_emulate_insn.  Resolving the
> dependencies was done via GCC's -H flag by get_headers.py.
>
> ---
>   tools/Makefile                                |   9 ++
>   .../fuzz/x86_instruction_emulation/.gitignore |   2 +
>   tools/fuzz/x86_instruction_emulation/Makefile |  57 +++++++
>   .../fuzz/x86_instruction_emulation/README.md  |  12 ++
>   .../x86_instruction_emulation/afl-harness.c   | 149 ++++++++++++++++++
>   tools/fuzz/x86_instruction_emulation/common.h |  87 ++++++++++
>   .../x86_instruction_emulation/emulator_ops.c  |  58 +++++++
>   .../x86_instruction_emulation/emulator_ops.h  | 117 ++++++++++++++
>   .../scripts/get_headers.py                    |  95 +++++++++++
>   .../scripts/make_deps                         |   4 +
>   tools/fuzz/x86_instruction_emulation/stubs.c  |  56 +++++++
>   tools/fuzz/x86_instruction_emulation/stubs.h  |  52 ++++++
>   12 files changed, 698 insertions(+)
>   create mode 100644 tools/fuzz/x86_instruction_emulation/.gitignore
>   create mode 100644 tools/fuzz/x86_instruction_emulation/Makefile
>   create mode 100644 tools/fuzz/x86_instruction_emulation/README.md
>   create mode 100644 tools/fuzz/x86_instruction_emulation/afl-harness.c
>   create mode 100644 tools/fuzz/x86_instruction_emulation/common.h
>   create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.c
>   create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.h
>   create mode 100644 tools/fuzz/x86_instruction_emulation/scripts/get_headers.py
>   create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/make_deps
>   create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.c
>   create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.h
>
> diff --git a/tools/Makefile b/tools/Makefile
> index 3dfd72ae6c1a..4d68817b7e49 100644
> --- a/tools/Makefile
> +++ b/tools/Makefile
> @@ -94,6 +94,12 @@ freefall: FORCE
>   kvm_stat: FORCE
>   	$(call descend,kvm/$@)
>   
> +fuzz: FORCE
> +	$(call descend,fuzz/x86_instruction_emulation)
> +
> +fuzz_deps: FORCE
> +	$(call descend,fuzz/x86_instruction_emulation,fuzz_deps)
> +
>   all: acpi cgroup cpupower gpio hv firewire liblockdep \
>   		perf selftests spi turbostat usb \
>   		virtio vm bpf x86_energy_perf_policy \
> @@ -171,6 +177,9 @@ tmon_clean:
>   freefall_clean:
>   	$(call descend,laptop/freefall,clean)
>   
> +fuzz_clean:
> +	$(call descend,fuzz/x86_instruction_emulation,clean)
> +
>   build_clean:
>   	$(call descend,build,clean)
>   
> diff --git a/tools/fuzz/x86_instruction_emulation/.gitignore b/tools/fuzz/x86_instruction_emulation/.gitignore
> new file mode 100644
> index 000000000000..7d44f7ce266e
> --- /dev/null
> +++ b/tools/fuzz/x86_instruction_emulation/.gitignore
> @@ -0,0 +1,2 @@
> +*.o
> +*-harness
> diff --git a/tools/fuzz/x86_instruction_emulation/Makefile b/tools/fuzz/x86_instruction_emulation/Makefile
> new file mode 100644
> index 000000000000..d2854a332605
> --- /dev/null
> +++ b/tools/fuzz/x86_instruction_emulation/Makefile
> @@ -0,0 +1,57 @@
> +ROOT_DIR=../../..
> +THIS_DIR=tools/fuzz/x86_instruction_emulation
> +
> +include ../../scripts/Makefile.include
> +
> +.DEFAULT_GOAL := all
> +
> +INCLUDES := $(patsubst -I./%,-I./$(ROOT_DIR)/%, $(LINUXINCLUDE))
> +INCLUDES := $(patsubst ./include/%,./$(ROOT_DIR)/include/%, $(INCLUDES))
> +INCLUDES += -include ./$(ROOT_DIR)/include/linux/compiler_types.h
> +
> +$(ROOT_DIR)/.config:
> +	make -C $(ROOT_DIR) menuconfig
> +	sed -i -r 's/^#? *CONFIG_KVM(.*)=.*/CONFIG_KVM\1=y/' $(ROOT_DIR)/.config
> +
> +
> +ifdef DEBUG
> +KBUILD_CFLAGS += -DDEBUG
> +endif
> +KBUILD_CFLAGS += -g -O0


Why -O0? I would expect a some bugs to only emerge with optimization 
enabled.

Alex


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] Emulate simple x86 instructions in userspace
  2019-05-21 15:39 ` [PATCH 2/3] Emulate simple x86 instructions in userspace Sam Caccavale
@ 2019-05-31  8:38   ` Alexander Graf
  2019-06-12 15:19     ` samcacc
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Graf @ 2019-05-31  8:38 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 21.05.19 17:39, Sam Caccavale wrote:
> Added the minimal subset of code to run afl-harness with a binary file
> as input.  These bytes are used to populate the vcpu structure and then
> as an instruction stream for the emulator.  It does not attempt to handle
> exceptions an only supports very simple ops.
>
> ---
>   .../x86_instruction_emulation/emulator_ops.c  | 324 +++++++++++++++++-
>   .../scripts/afl-many                          |  28 ++
>   .../scripts/bin_fuzz                          |  23 ++
>   3 files changed, 374 insertions(+), 1 deletion(-)
>   create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/afl-many
>   create mode 100755 tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>
> diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.c b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
> index 55ae4e8fbd96..38bba4c97e2f 100644
> --- a/tools/fuzz/x86_instruction_emulation/emulator_ops.c
> +++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
> @@ -29,17 +29,331 @@
>   #include <asm/user_64.h>
>   #include <asm/kvm.h>
>   
> +/*
> + * Due to some quirk of building, the way printing to an unbuffered stream
> + * is implemented smashes the stack.  For now, we'll just buffer stderr but
> + * a fix is needed.


I'm not sure I fully understand this comment. If printing smashes the 
stack, wouldn't that always break things?


> + */
> +char buff[BUFSIZ];
> +void buffer_stderr(void)
> +{
> +	setvbuf(stderr, buff, _IOLBF, BUFSIZ);
> +}
> +
> +#define number_of_registers(c) (c->mode == X86EMUL_MODE_PROT64 ? 16 : 8)


This can go into a header, no? Also, just say "gprs" instead of 
registers - it's shorter. In addition, you want bits like this also be 
static inline functions rather than macros to preserve type checking.


> +
> +ulong emul_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg)
> +{
> +	assert(reg < number_of_registers(ctxt));
> +	return get_state(ctxt)->vcpu.regs[reg];
> +}
> +
> +void emul_write_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg, ulong val)
> +{
> +	assert(reg < number_of_registers(ctxt));
> +	get_state(ctxt)->vcpu.regs[reg] = val;
> +}
> +
> +#undef number_of_registers /* (ctxt->mode == X86EMUL_MODE_PROT64 ? 16 : 8) */
> +
> +/* All read ops: */
> +
> +#define emul_offset (state->ctxt.eip + state->other_bytes_consumed)
> +#define decode_offset (state->ctxt._eip + state->other_bytes_consumed)


Same here, better make this static inline functions and pass state as 
parameter in. At least for me, it's otherwise really hard to remember 
what really happens when a variable appears out of the blue.


> +/*
> + * There is a very loose abstraction around ctxt.eip, ctxt._eip, and
> + * state.other_bytes_consumed.
> + *
> + * eip is incremented when an instruction is emulated by x86_emulate_insn
> + * _eip is incremeneted when an instruction is decoded by x86_decode_insn
> + *
> + * state.other_bytes_consumed is incremented when bytes are consumed by
> + *  get_bytes_and_increment for any use besides an x86 instruction.
> + *
> + * The bytes between emul_offset and decode_offset are instructions yet
> + * to be executed.
> + */
> +
> +#define _data_available(bytes) (decode_offset + bytes < state->data_available)


Please make this a static function too.


> +
> +static int _get_bytes(void *dst, struct state *state, unsigned int bytes,
> +		      char *callee)
> +{
> +	if (!_data_available(bytes)) {
> +		fprintf(stderr, "Tried retrieving %d bytes\n", bytes);
> +		fprintf(stderr, "%s failed to retrieve bytes for %s.\n",
> +			__func__, callee);
> +		return X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	memcpy(dst, &state->data[decode_offset], bytes);
> +	return X86EMUL_CONTINUE;
> +}
> +
> +/*
> + * The only function that any x86_emulate_ops should call to retrieve bytes.
> + * See comments in struct state definition for more information.
> + */
> +static int get_bytes_and_increment(void *dst, struct state *state,
> +				   unsigned int bytes, char *callee)
> +{
> +	int rc = _get_bytes(dst, state, bytes, callee);
> +
> +	if (rc == X86EMUL_CONTINUE)
> +		state->other_bytes_consumed += bytes;
> +
> +	return rc;
> +}
> +
> +/*
> + * This is called by x86_decode_insn to fetch bytes and is the only function
> + * that gets bytes from state.data without incrementing .other_byes_consumed
> + * since the emulator will increment ._eip during x86_decode_insn and ._eip
> + * is used as an index into state.data.
> + */
> +int emul_fetch(struct x86_emulate_ctxt *ctxt, unsigned long addr, void *val,
> +	       unsigned int bytes, struct x86_exception *fault)
> +{
> +	if (_get_bytes(val, get_state(ctxt), bytes, "emul_fetch") !=
> +	    X86EMUL_CONTINUE) {
> +		return X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	return X86EMUL_CONTINUE;
> +}
> +
> +ulong emul_get_cr(struct x86_emulate_ctxt *ctxt, int cr)
> +{
> +	return get_state(ctxt)->vcpu.cr[cr];
> +}
> +
> +int emul_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val)
> +{
> +	get_state(ctxt)->vcpu.cr[cr] = val;
> +	return 0;
> +}
> +
> +unsigned int emul_get_hflags(struct x86_emulate_ctxt *ctxt)
> +{
> +	return get_state(ctxt)->vcpu.rflags;
> +}
> +
> +void emul_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned int hflags)
> +{
> +	get_state(ctxt)->vcpu.rflags = hflags;
> +}
> +
> +/* End of emulator ops */
> +
> +#define SET(h) .h = emul_##h
> +const struct x86_emulate_ops all_emulator_ops = {
> +	SET(read_gpr),
> +	SET(write_gpr),
> +	SET(fetch),
> +	SET(get_cr),
> +	SET(set_cr),
> +	SET(get_hflags),
> +	SET(set_hflags),
> +};
> +#undef SET
> +
> +enum {
> +	HOOK_read_gpr,
> +	HOOK_write_gpr,
> +	HOOK_fetch,
> +	HOOK_get_cr,
> +	HOOK_set_cr,
> +	HOOK_get_hflags,
> +	HOOK_set_hflags
> +};
> +
> +/* Put reg_x into a definitely valid state. */
> +#define CANONICALIZE(reg_x)						\
> +	do {								\
> +		uint64_t _y = (reg_x);					\
> +		if (_y & (1ULL << 47))					\
> +			_y |= (~0ULL) << 48;				\
> +		else							\
> +			_y &= (1ULL << 48) - 1;				\
> +		debug("Canonicalized %" PRIx64 " to %" PRIx64 "\n",	\
> +			reg_x, y);					\
> +		(reg_x) = _y;						\
> +	} while (0)


We support >48 bits VA/PA these days. Also, why is this a macro and not 
a static function?


> +
> +/*
> + * Canonicalizes reg if options << CANONICALIZE_##reg is set.
> + * This weighs the emulator to use canonical values for important registers.
> + *
> + * Expects options and regs to be defined.
> + */
> +#define CANONICALIZE_MAYBE(reg)						\
> +	do {								\
> +		if (!(options & (1 << CANONICALIZE_##reg)))		\
> +			CANONICALIZE(regs->reg);			\
> +	} while (0)
> +
> +/*
> + * Disable an x86_emulate_op if options << HOOK_op is set.
> + *
> + * Expects options to be defined.
> + */
> +#define MAYBE_DISABLE_HOOK(h)						\
> +	do {								\
> +		if (options & (1 << HOOK_##h)) {			\
> +			vcpu->ctxt.ops.h = NULL;			\
> +			debug("Disabling hook " #h "\n");		\
> +		}							\
> +	} while (0)
> +
> +/*
> + * FROM XEN:
> + *
> + * Constrain input to architecturally-possible states where
> + * the emulator relies on these
> + *
> + * In general we want the emulator to be as absolutely robust as
> + * possible; which means that we want to minimize the number of things
> + * it assumes about the input state.  Tesing this means minimizing and
> + * removing as much of the input constraints as possible.
> + *
> + * So we only add constraints that (in general) have been proven to
> + * cause crashes in the emulator.
> + *
> + * For future reference: other constraints which might be necessary at
> + * some point:
> + *
> + * - EFER.LMA => !EFLAGS.NT
> + * - In VM86 mode, force segment...
> + *  - ...access rights to 0xf3
> + *  - ...limits to 0xffff
> + *  - ...bases to below 1Mb, 16-byte aligned
> + *  - ...selectors to (base >> 4)
> + */
> +static void sanitize_input(struct state *s)
> +{
> +	/* Some hooks can't be disabled. */
> +	// options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
> +
> +	/* Zero 'private' entries */
> +	// regs->error_code = 0;
> +	// regs->entry_vector = 0;
> +
> +	// CANONICALIZE_MAYBE(rip);
> +	// CANONICALIZE_MAYBE(rsp);
> +	// CANONICALIZE_MAYBE(rbp);
> +
> +	/*
> +	 * CR0.PG can't be set if CR0.PE isn't set.  Set is more interesting, so
> +	 * set PE if PG is set.
> +	 */
> +	if (s->vcpu.cr[0] & X86_CR0_PG)
> +		s->vcpu.cr[0] |= X86_CR0_PE;
> +
> +	/* EFLAGS.VM not available in long mode */
> +	if (s->ctxt.mode == X86EMUL_MODE_PROT64)
> +		s->vcpu.rflags &= ~X86_EFLAGS_VM;
> +
> +	/* EFLAGS.VM implies 16-bit mode */
> +	if (s->vcpu.rflags & X86_EFLAGS_VM) {
> +		s->vcpu.segments[x86_seg_cs].db = 0;
> +		s->vcpu.segments[x86_seg_ss].db = 0;
> +	}
> +}
> +
> +static void init_x86_emulate_ctxt(struct x86_emulate_ctxt *ctxt)
> +{
> +	ctxt->ops = &all_emulator_ops;
> +	ctxt->eflags = 0;
> +	ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
> +	ctxt->eip = 0;
> +
> +	ctxt->mode = X86EMUL_MODE_PROT64; // TODO: eventually vary this?
> +
> +	init_decode_cache(ctxt);
> +}
> +
>   void initialize_emulator(struct state *state)
>   {
> +	state->ctxt.ops = &all_emulator_ops;
> +
> +	/* See also sanitize_input, some hooks can't be disabled. */
> +	// MAYBE_DISABLE_HOOK(read_gpr);
> +
> +	sanitize_input(state);
> +	init_x86_emulate_ctxt(&state->ctxt);
> +}
> +
> +static const char *const x86emul_mode_string[] = {
> +	[X86EMUL_MODE_REAL] = "X86EMUL_MODE_REAL",
> +	[X86EMUL_MODE_VM86] = "X86EMUL_MODE_VM86",
> +	[X86EMUL_MODE_PROT16] = "X86EMUL_MODE_PROT16",
> +	[X86EMUL_MODE_PROT32] = "X86EMUL_MODE_PROT32",
> +	[X86EMUL_MODE_PROT64] = "X86EMUL_MODE_PROT64",
> +};
> +
> +static void dump_state_after(const char *desc, struct state *state)
> +{
> +	debug(" -- State after %s --\n", desc);
> +	debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
> +	debug(" cr0: %lx\n", state->vcpu.cr[0]);
> +	debug(" cr3: %lx\n", state->vcpu.cr[3]);
> +	debug(" cr4: %lx\n", state->vcpu.cr[4]);
> +
> +	debug("Decode _eip: %lu\n", state->ctxt._eip);
> +	debug("Emulate eip: %lu\n", state->ctxt.eip);
> +
> +	debug("\n");
>   }
>   
>   int step_emulator(struct state *state)
>   {
> -	return 0;
> +	int rc, prev_eip = state->ctxt.eip;
> +	int decode_size = state->data_available - decode_offset;
> +
> +	if (decode_size < 15) {
> +		rc = x86_decode_insn(&state->ctxt, &state->data[decode_offset],
> +				     decode_size);
> +	} else {
> +		rc = x86_decode_insn(&state->ctxt, NULL, 0);


Isn't this going to fetch instructions from data as well? Why do we need 
the < 15 special case at all?


> +	}
> +	debug("Decode result: %d\n", rc);
> +	if (rc != X86EMUL_CONTINUE)
> +		return rc;
> +
> +	debug("Instruction: ");
> +	print_n_bytes(&state->data[emul_offset],
> +		      state->ctxt._eip - state->ctxt.eip);
> +
> +	rc = x86_emulate_insn(&state->ctxt);
> +	debug("Emulation result: %d\n", rc);
> +	dump_state_after("emulating", state);
> +
> +	if (state->ctxt.have_exception) {
> +		fprintf(stderr, "Emulator propagated exception: { ");
> +		fprintf(stderr, "vector: %d, ", state->ctxt.exception.vector);
> +		fprintf(stderr, "error code: %d }\n",
> +			state->ctxt.exception.error_code);
> +		rc = X86EMUL_UNHANDLEABLE;
> +	} else if (prev_eip == state->ctxt.eip) {
> +		fprintf(stderr, "ctxt.eip not advanced.\n");


During fuzzing, do I really care about all that debug output? I mean, 
both cases above are perfectly legal. What we're trying to catch are 
cases where we're overwriting memory we shouldn't have, no?


In fact, how do we detect that? Would it make sense to have canaries in 
the vcpu struct that we can check before/after instruction execution to 
see if there was an out of bounds memory access somewhere?


> +		rc = X86EMUL_UNHANDLEABLE;
> +	}
> +
> +	if (decode_offset == state->data_available)
> +		debug("emulator is done\n");
> +
> +	return rc;
>   }
>   
>   int emulate_until_complete(struct state *state)
>   {
> +	int count = 0;
> +
> +	do {
> +		count++;
> +	} while (step_emulator(state) == X86EMUL_CONTINUE);
> +
> +	debug("Emulated %d instructions\n", count);
>   	return 0;
>   }
>   
> @@ -51,8 +365,16 @@ struct state *create_emulator(void)
>   
>   void reset_emulator(struct state *state)
>   {
> +	unsigned char *data = state->data;
> +	size_t data_available = state->data_available;
> +
> +	memset(state, 0, sizeof(struct state));
> +
> +	state->data = data;
> +	state->data_available = data_available;
>   }
>   
>   void free_emulator(struct state *state)
>   {
> +	free(state);
>   }
> diff --git a/tools/fuzz/x86_instruction_emulation/scripts/afl-many b/tools/fuzz/x86_instruction_emulation/scripts/afl-many
> new file mode 100755
> index 000000000000..ab15258573a2
> --- /dev/null
> +++ b/tools/fuzz/x86_instruction_emulation/scripts/afl-many
> @@ -0,0 +1,28 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +# This is for running AFL over NPROC or `nproc` cores with normal AFL options.
> +
> +export AFL_NO_AFFINITY=1
> +
> +while [ -z "$sync_dir" ]; do
> +  while getopts ":o:" opt; do
> +    case "${opt}" in
> +      o)
> +        sync_dir="${OPTARG}"
> +        ;;
> +      *)
> +        ;;
> +    esac
> +  done
> +  ((OPTIND++))
> +  [ $OPTIND -gt $# ] && break
> +done
> +
> +for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
> +    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
> +done
> +taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &


Why the CPU pinning?


> +
> +sleep 5


Why sleep? Shouldn't this rather wait for some async notification to see 
whether all slaves were successfully started?


> +watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)} cores.\" && ./afl-whatsup -s ${sync_dir}"
> +pkill afl-fuzz
> diff --git a/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
> new file mode 100755
> index 000000000000..e570b17f9404
> --- /dev/null
> +++ b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
> @@ -0,0 +1,23 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +# This runs the afl-harness at $1, $2 times (or 100)
> +# It runs uniq and sorts the output to give an idea of what is causing the
> +# most crashes.  Useful for deciding what to implement next.
> +
> +if [ "$#" -lt 1 ]; then
> +  echo "Usage: './bin_fuzz path_to_afl-harness [number of times to run]"
> +  exit
> +fi
> +
> +mkdir -p fuzz
> +rm -f fuzz/*.in fuzz/*.out
> +
> +for i in $(seq 1 1 ${2:-100})
> +do
> +  {
> +  head -c 500 /dev/urandom | tee fuzz/$i.in | ./$1
> +  } > fuzz/$i.out 2>&1
> +
> +done
> +
> +find ./fuzz -name '*.out' -exec tail -1 {} \; | sed 's/.* Segmen/Segman/' | sed -r 's/^(\s[0-9a-f]{2})+$/misc instruction output/' | sort | uniq -c | sort -rn


What is that Segman thing about?


Alex



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: x86 instruction emulator fuzzing
  2019-05-21 15:39 x86 instruction emulator fuzzing Sam Caccavale
                   ` (2 preceding siblings ...)
  2019-05-21 15:39 ` [PATCH 3/3] Demonstrating unit testing via simple-harness Sam Caccavale
@ 2019-05-31  8:39 ` Alexander Graf
  2019-06-12 15:19   ` samcacc
  3 siblings, 1 reply; 11+ messages in thread
From: Alexander Graf @ 2019-05-31  8:39 UTC (permalink / raw)
  To: Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 21.05.19 17:39, Sam Caccavale wrote:
> Dear all,
>
> This series aims to provide an entrypoint for, and fuzz KVM's x86 instruction
> emulator from userspace.  It mirrors Xen's application of the AFL fuzzer to
> it's instruction emulator in the hopes of discovering vulnerabilities.
> Since this entrypoint also allows arbitrary execution of the emulators code
> from userspace, it may also be useful for testing.
>
> The current 3 patches build the emulator and 2 harnesses: simple-harness is
> an example of unit testing; afl-harness is a frontend for the AFL fuzzer.
> They are early POC and include some issues outlined under "Issues."
>
> Patches
> =======
>
> - 01: Builds and links afl-harness with the required kernel objects.
> - 02: Introduces the minimal set of emulator operations and supporting code
> to emulate simple instructions.
> - 03: Demonstrates simple-harness as a unit test.
>
> Issues
> =======
>
> 1. Currently, building requires manually running the `make_deps` script
> since I was unable to make the kernel objects a dependency of the tool.
> 2. The code will segfault if `CONFIG_STACKPROTECTOR=y` in config.
> 3. The code requires stderr to be buffered or it otherwise segfaults.
>
> The latter two issues seem related and all of them are likely fixable by
> someone more familiar with the linux than me.
>
> Concerns
> =======
>
> I was able to carve the `arch/x86/kvm/emulate.c` code, but the emulator is
> constructed in such a way that a lot of the code which enforces expected
> behavior lives in the x86_emulate_ops supplied in `arch/x86/kvm/x86.c`.
> Testing the emulator is still valuable, but a reproducible way to use the kvm
> ops would be useful.
>
> Any comments/suggestions are greatly appreciated.


First off, thanks a lot for this :). The x86 emulator has been a sore 
(bug prone) point in KVM for a long time and I'm surprised it's not 
covered by fuzzing yet. It's great to see that finally happening.

A few nits:

   1) Cover letter should be [PATCH 0/3]. Just generate it with git 
format-patch --cover-letter.
   2) The directory name "x86_instruction_emulation" is a bit long, no?
   3) I think the cover letter should also detail how this relates to 
other fuzzing efforts and why we need another, separate one.


Alex



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: x86 instruction emulator fuzzing
  2019-05-31  8:39 ` x86 instruction emulator fuzzing Alexander Graf
@ 2019-06-12 15:19   ` samcacc
  0 siblings, 0 replies; 11+ messages in thread
From: samcacc @ 2019-06-12 15:19 UTC (permalink / raw)
  To: Alexander Graf, Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel

On 5/31/19 10:39 AM, Alexander Graf wrote:
> 
> On 21.05.19 17:39, Sam Caccavale wrote:
>> Dear all,
>>
>> This series aims to provide an entrypoint for, and fuzz KVM's x86
>> instruction
>> emulator from userspace.  It mirrors Xen's application of the AFL
>> fuzzer to
>> it's instruction emulator in the hopes of discovering vulnerabilities.
>> Since this entrypoint also allows arbitrary execution of the emulators
>> code
>> from userspace, it may also be useful for testing.
>>
>> The current 3 patches build the emulator and 2 harnesses:
>> simple-harness is
>> an example of unit testing; afl-harness is a frontend for the AFL fuzzer.
>> They are early POC and include some issues outlined under "Issues."
>>
>> Patches
>> =======
>>
>> - 01: Builds and links afl-harness with the required kernel objects.
>> - 02: Introduces the minimal set of emulator operations and supporting
>> code
>> to emulate simple instructions.
>> - 03: Demonstrates simple-harness as a unit test.
>>
>> Issues
>> =======
>>
>> 1. Currently, building requires manually running the `make_deps` script
>> since I was unable to make the kernel objects a dependency of the tool.
>> 2. The code will segfault if `CONFIG_STACKPROTECTOR=y` in config.
>> 3. The code requires stderr to be buffered or it otherwise segfaults.
>>
>> The latter two issues seem related and all of them are likely fixable by
>> someone more familiar with the linux than me.
>>
>> Concerns
>> =======
>>
>> I was able to carve the `arch/x86/kvm/emulate.c` code, but the
>> emulator is
>> constructed in such a way that a lot of the code which enforces expected
>> behavior lives in the x86_emulate_ops supplied in `arch/x86/kvm/x86.c`.
>> Testing the emulator is still valuable, but a reproducible way to use
>> the kvm
>> ops would be useful.
>>
>> Any comments/suggestions are greatly appreciated.
> 
> 
> First off, thanks a lot for this :). The x86 emulator has been a sore
> (bug prone) point in KVM for a long time and I'm surprised it's not
> covered by fuzzing yet. It's great to see that finally happening.
> 
> A few nits:
> 
>   1) Cover letter should be [PATCH 0/3]. Just generate it with git
> format-patch --cover-letter.

Understood and corrected in v2.
>   2) The directory name "x86_instruction_emulation" is a bit long, no?
Yes it is.  While tab completion has mostly kept me sane until now, I've
renamed the folder to `x86ie`s

>   3) I think the cover letter should also detail how this relates to
> other fuzzing efforts and why we need another, separate one.

This sounds worthwhile.  In the interest of time I haven't included this
in v2, but I'll write it up as soon as possible and update.

> 
> 
> Alex
> 
> 

Thanks for the advice, v2 to follow.

Sam

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] Build target for emulate.o as a userspace binary
  2019-05-31  8:02   ` Alexander Graf
@ 2019-06-12 15:19     ` samcacc
  0 siblings, 0 replies; 11+ messages in thread
From: samcacc @ 2019-06-12 15:19 UTC (permalink / raw)
  To: Alexander Graf, Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel

On 5/31/19 10:02 AM, Alexander Graf wrote:
> 
> On 21.05.19 17:39, Sam Caccavale wrote:
>> This commit contains the minimal set of functionality to build
>> afl-harness around arch/x86/emulate.c which allows exercising code
>> in that source file, like x86_emulate_insn.  Resolving the
>> dependencies was done via GCC's -H flag by get_headers.py.
>>
>> ---
>>   tools/Makefile                                |   9 ++
>>   .../fuzz/x86_instruction_emulation/.gitignore |   2 +
>>   tools/fuzz/x86_instruction_emulation/Makefile |  57 +++++++
>>   .../fuzz/x86_instruction_emulation/README.md  |  12 ++
>>   .../x86_instruction_emulation/afl-harness.c   | 149 ++++++++++++++++++
>>   tools/fuzz/x86_instruction_emulation/common.h |  87 ++++++++++
>>   .../x86_instruction_emulation/emulator_ops.c  |  58 +++++++
>>   .../x86_instruction_emulation/emulator_ops.h  | 117 ++++++++++++++
>>   .../scripts/get_headers.py                    |  95 +++++++++++
>>   .../scripts/make_deps                         |   4 +
>>   tools/fuzz/x86_instruction_emulation/stubs.c  |  56 +++++++
>>   tools/fuzz/x86_instruction_emulation/stubs.h  |  52 ++++++
>>   12 files changed, 698 insertions(+)
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/.gitignore
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/Makefile
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/README.md
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/afl-harness.c
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/common.h
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.c
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/emulator_ops.h
>>   create mode 100644
>> tools/fuzz/x86_instruction_emulation/scripts/get_headers.py
>>   create mode 100755
>> tools/fuzz/x86_instruction_emulation/scripts/make_deps
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.c
>>   create mode 100644 tools/fuzz/x86_instruction_emulation/stubs.h
>>
>> diff --git a/tools/Makefile b/tools/Makefile
>> index 3dfd72ae6c1a..4d68817b7e49 100644
>> --- a/tools/Makefile
>> +++ b/tools/Makefile
>> @@ -94,6 +94,12 @@ freefall: FORCE
>>   kvm_stat: FORCE
>>       $(call descend,kvm/$@)
>>   +fuzz: FORCE
>> +    $(call descend,fuzz/x86_instruction_emulation)
>> +
>> +fuzz_deps: FORCE
>> +    $(call descend,fuzz/x86_instruction_emulation,fuzz_deps)
>> +
>>   all: acpi cgroup cpupower gpio hv firewire liblockdep \
>>           perf selftests spi turbostat usb \
>>           virtio vm bpf x86_energy_perf_policy \
>> @@ -171,6 +177,9 @@ tmon_clean:
>>   freefall_clean:
>>       $(call descend,laptop/freefall,clean)
>>   +fuzz_clean:
>> +    $(call descend,fuzz/x86_instruction_emulation,clean)
>> +
>>   build_clean:
>>       $(call descend,build,clean)
>>   diff --git a/tools/fuzz/x86_instruction_emulation/.gitignore
>> b/tools/fuzz/x86_instruction_emulation/.gitignore
>> new file mode 100644
>> index 000000000000..7d44f7ce266e
>> --- /dev/null
>> +++ b/tools/fuzz/x86_instruction_emulation/.gitignore
>> @@ -0,0 +1,2 @@
>> +*.o
>> +*-harness
>> diff --git a/tools/fuzz/x86_instruction_emulation/Makefile
>> b/tools/fuzz/x86_instruction_emulation/Makefile
>> new file mode 100644
>> index 000000000000..d2854a332605
>> --- /dev/null
>> +++ b/tools/fuzz/x86_instruction_emulation/Makefile
>> @@ -0,0 +1,57 @@
>> +ROOT_DIR=../../..
>> +THIS_DIR=tools/fuzz/x86_instruction_emulation
>> +
>> +include ../../scripts/Makefile.include
>> +
>> +.DEFAULT_GOAL := all
>> +
>> +INCLUDES := $(patsubst -I./%,-I./$(ROOT_DIR)/%, $(LINUXINCLUDE))
>> +INCLUDES := $(patsubst ./include/%,./$(ROOT_DIR)/include/%, $(INCLUDES))
>> +INCLUDES += -include ./$(ROOT_DIR)/include/linux/compiler_types.h
>> +
>> +$(ROOT_DIR)/.config:
>> +    make -C $(ROOT_DIR) menuconfig
>> +    sed -i -r 's/^#? *CONFIG_KVM(.*)=.*/CONFIG_KVM\1=y/'
>> $(ROOT_DIR)/.config
>> +
>> +
>> +ifdef DEBUG
>> +KBUILD_CFLAGS += -DDEBUG
>> +endif
>> +KBUILD_CFLAGS += -g -O0
> 
> 
> Why -O0? I would expect a some bugs to only emerge with optimization
> enabled.
> 
> Alex
> 

This was supposed to be the `ifdef` actually.  Fixed in v2.

Sam

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] Emulate simple x86 instructions in userspace
  2019-05-31  8:38   ` Alexander Graf
@ 2019-06-12 15:19     ` samcacc
  2019-06-21 13:28       ` Alexander Graf
  0 siblings, 1 reply; 11+ messages in thread
From: samcacc @ 2019-06-12 15:19 UTC (permalink / raw)
  To: Alexander Graf, Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel

On 5/31/19 10:38 AM, Alexander Graf wrote:
> 
> On 21.05.19 17:39, Sam Caccavale wrote:
>> Added the minimal subset of code to run afl-harness with a binary file
>> as input.  These bytes are used to populate the vcpu structure and then
>> as an instruction stream for the emulator.  It does not attempt to handle
>> exceptions an only supports very simple ops.
>>
>> ---
>>   .../x86_instruction_emulation/emulator_ops.c  | 324 +++++++++++++++++-
>>   .../scripts/afl-many                          |  28 ++
>>   .../scripts/bin_fuzz                          |  23 ++
>>   3 files changed, 374 insertions(+), 1 deletion(-)
>>   create mode 100755
>> tools/fuzz/x86_instruction_emulation/scripts/afl-many
>>   create mode 100755
>> tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>>
>> diff --git a/tools/fuzz/x86_instruction_emulation/emulator_ops.c
>> b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
>> index 55ae4e8fbd96..38bba4c97e2f 100644
>> --- a/tools/fuzz/x86_instruction_emulation/emulator_ops.c
>> +++ b/tools/fuzz/x86_instruction_emulation/emulator_ops.c
>> @@ -29,17 +29,331 @@
>>   #include <asm/user_64.h>
>>   #include <asm/kvm.h>
>>   +/*
>> + * Due to some quirk of building, the way printing to an unbuffered
>> stream
>> + * is implemented smashes the stack.  For now, we'll just buffer
>> stderr but
>> + * a fix is needed.
> 
> 
> I'm not sure I fully understand this comment. If printing smashes the
> stack, wouldn't that always break things?

This seemed to be the result of the `-mcmodel=kernel` flag not playing
well with printf and a userspace binary.  I've started building both the
kernel objects and harness without the flag and it clears up the issue.

> 
> 
>> + */
>> +char buff[BUFSIZ];
>> +void buffer_stderr(void)
>> +{
>> +    setvbuf(stderr, buff, _IOLBF, BUFSIZ);
>> +}
>> +
>> +#define number_of_registers(c) (c->mode == X86EMUL_MODE_PROT64 ? 16 : 8)
> 
> 
> This can go into a header, no? Also, just say "gprs" instead of
> registers - it's shorter. In addition, you want bits like this also be
> static inline functions rather than macros to preserve type checking.
> 

Moved to a static inline function in emulator_ops.h.

> 
>> +
>> +ulong emul_read_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg)
>> +{
>> +    assert(reg < number_of_registers(ctxt));
>> +    return get_state(ctxt)->vcpu.regs[reg];
>> +}
>> +
>> +void emul_write_gpr(struct x86_emulate_ctxt *ctxt, unsigned int reg,
>> ulong val)
>> +{
>> +    assert(reg < number_of_registers(ctxt));
>> +    get_state(ctxt)->vcpu.regs[reg] = val;
>> +}
>> +
>> +#undef number_of_registers /* (ctxt->mode == X86EMUL_MODE_PROT64 ? 16
>> : 8) */
>> +
>> +/* All read ops: */
>> +
>> +#define emul_offset (state->ctxt.eip + state->other_bytes_consumed)
>> +#define decode_offset (state->ctxt._eip + state->other_bytes_consumed)
> 
> 
> Same here, better make this static inline functions and pass state as
> parameter in. At least for me, it's otherwise really hard to remember
> what really happens when a variable appears out of the blue.
> 

These were removed for a far less complicated method of keeping track of
bytes consumed.  Additionally, using eip was a mistake since jumps could
massively change it.

> 
>> +/*
>> + * There is a very loose abstraction around ctxt.eip, ctxt._eip, and
>> + * state.other_bytes_consumed.
>> + *
>> + * eip is incremented when an instruction is emulated by
>> x86_emulate_insn
>> + * _eip is incremeneted when an instruction is decoded by
>> x86_decode_insn
>> + *
>> + * state.other_bytes_consumed is incremented when bytes are consumed by
>> + *  get_bytes_and_increment for any use besides an x86 instruction.
>> + *
>> + * The bytes between emul_offset and decode_offset are instructions yet
>> + * to be executed.
>> + */
>> +
>> +#define _data_available(bytes) (decode_offset + bytes <
>> state->data_available)
> 
> 
> Please make this a static function too.
> 

Removed in v2 in lieu of `state.data_available` on the state structure
which is incremented as bytes are consumed.

> 
>> +
>> +static int _get_bytes(void *dst, struct state *state, unsigned int
>> bytes,
>> +              char *callee)
>> +{
>> +    if (!_data_available(bytes)) {
>> +        fprintf(stderr, "Tried retrieving %d bytes\n", bytes);
>> +        fprintf(stderr, "%s failed to retrieve bytes for %s.\n",
>> +            __func__, callee);
>> +        return X86EMUL_UNHANDLEABLE;
>> +    }
>> +
>> +    memcpy(dst, &state->data[decode_offset], bytes);
>> +    return X86EMUL_CONTINUE;
>> +}
>> +
>> +/*
>> + * The only function that any x86_emulate_ops should call to retrieve
>> bytes.
>> + * See comments in struct state definition for more information.
>> + */
>> +static int get_bytes_and_increment(void *dst, struct state *state,
>> +                   unsigned int bytes, char *callee)
>> +{
>> +    int rc = _get_bytes(dst, state, bytes, callee);
>> +
>> +    if (rc == X86EMUL_CONTINUE)
>> +        state->other_bytes_consumed += bytes;
>> +
>> +    return rc;
>> +}
>> +
>> +/*
>> + * This is called by x86_decode_insn to fetch bytes and is the only
>> function
>> + * that gets bytes from state.data without incrementing
>> .other_byes_consumed
>> + * since the emulator will increment ._eip during x86_decode_insn and
>> ._eip
>> + * is used as an index into state.data.
>> + */
>> +int emul_fetch(struct x86_emulate_ctxt *ctxt, unsigned long addr,
>> void *val,
>> +           unsigned int bytes, struct x86_exception *fault)
>> +{
>> +    if (_get_bytes(val, get_state(ctxt), bytes, "emul_fetch") !=
>> +        X86EMUL_CONTINUE) {
>> +        return X86EMUL_UNHANDLEABLE;
>> +    }
>> +
>> +    return X86EMUL_CONTINUE;
>> +}
>> +
>> +ulong emul_get_cr(struct x86_emulate_ctxt *ctxt, int cr)
>> +{
>> +    return get_state(ctxt)->vcpu.cr[cr];
>> +}
>> +
>> +int emul_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val)
>> +{
>> +    get_state(ctxt)->vcpu.cr[cr] = val;
>> +    return 0;
>> +}
>> +
>> +unsigned int emul_get_hflags(struct x86_emulate_ctxt *ctxt)
>> +{
>> +    return get_state(ctxt)->vcpu.rflags;
>> +}
>> +
>> +void emul_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned int hflags)
>> +{
>> +    get_state(ctxt)->vcpu.rflags = hflags;
>> +}
>> +
>> +/* End of emulator ops */
>> +
>> +#define SET(h) .h = emul_##h
>> +const struct x86_emulate_ops all_emulator_ops = {
>> +    SET(read_gpr),
>> +    SET(write_gpr),
>> +    SET(fetch),
>> +    SET(get_cr),
>> +    SET(set_cr),
>> +    SET(get_hflags),
>> +    SET(set_hflags),
>> +};
>> +#undef SET
>> +
>> +enum {
>> +    HOOK_read_gpr,
>> +    HOOK_write_gpr,
>> +    HOOK_fetch,
>> +    HOOK_get_cr,
>> +    HOOK_set_cr,
>> +    HOOK_get_hflags,
>> +    HOOK_set_hflags
>> +};
>> +
>> +/* Put reg_x into a definitely valid state. */
>> +#define CANONICALIZE(reg_x)                        \
>> +    do {                                \
>> +        uint64_t _y = (reg_x);                    \
>> +        if (_y & (1ULL << 47))                    \
>> +            _y |= (~0ULL) << 48;                \
>> +        else                            \
>> +            _y &= (1ULL << 48) - 1;                \
>> +        debug("Canonicalized %" PRIx64 " to %" PRIx64 "\n",    \
>> +            reg_x, y);                    \
>> +        (reg_x) = _y;                        \
>> +    } while (0)
> 
> 
> We support >48 bits VA/PA these days. Also, why is this a macro and not
> a static function?
> 

This was actually dead code kept from the Xen fuzzer, it has been removed.

> 
>> +
>> +/*
>> + * Canonicalizes reg if options << CANONICALIZE_##reg is set.
>> + * This weighs the emulator to use canonical values for important
>> registers.
>> + *
>> + * Expects options and regs to be defined.
>> + */
>> +#define CANONICALIZE_MAYBE(reg)                        \
>> +    do {                                \
>> +        if (!(options & (1 << CANONICALIZE_##reg)))        \
>> +            CANONICALIZE(regs->reg);            \
>> +    } while (0)
>> +
>> +/*
>> + * Disable an x86_emulate_op if options << HOOK_op is set.
>> + *
>> + * Expects options to be defined.
>> + */
>> +#define MAYBE_DISABLE_HOOK(h)                        \
>> +    do {                                \
>> +        if (options & (1 << HOOK_##h)) {            \
>> +            vcpu->ctxt.ops.h = NULL;            \
>> +            debug("Disabling hook " #h "\n");        \
>> +        }                            \
>> +    } while (0)
>> +
>> +/*
>> + * FROM XEN:
>> + *
>> + * Constrain input to architecturally-possible states where
>> + * the emulator relies on these
>> + *
>> + * In general we want the emulator to be as absolutely robust as
>> + * possible; which means that we want to minimize the number of things
>> + * it assumes about the input state.  Tesing this means minimizing and
>> + * removing as much of the input constraints as possible.
>> + *
>> + * So we only add constraints that (in general) have been proven to
>> + * cause crashes in the emulator.
>> + *
>> + * For future reference: other constraints which might be necessary at
>> + * some point:
>> + *
>> + * - EFER.LMA => !EFLAGS.NT
>> + * - In VM86 mode, force segment...
>> + *  - ...access rights to 0xf3
>> + *  - ...limits to 0xffff
>> + *  - ...bases to below 1Mb, 16-byte aligned
>> + *  - ...selectors to (base >> 4)
>> + */
>> +static void sanitize_input(struct state *s)
>> +{
>> +    /* Some hooks can't be disabled. */
>> +    // options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>> +
>> +    /* Zero 'private' entries */
>> +    // regs->error_code = 0;
>> +    // regs->entry_vector = 0;
>> +
>> +    // CANONICALIZE_MAYBE(rip);
>> +    // CANONICALIZE_MAYBE(rsp);
>> +    // CANONICALIZE_MAYBE(rbp);
>> +
>> +    /*
>> +     * CR0.PG can't be set if CR0.PE isn't set.  Set is more
>> interesting, so
>> +     * set PE if PG is set.
>> +     */
>> +    if (s->vcpu.cr[0] & X86_CR0_PG)
>> +        s->vcpu.cr[0] |= X86_CR0_PE;
>> +
>> +    /* EFLAGS.VM not available in long mode */
>> +    if (s->ctxt.mode == X86EMUL_MODE_PROT64)
>> +        s->vcpu.rflags &= ~X86_EFLAGS_VM;
>> +
>> +    /* EFLAGS.VM implies 16-bit mode */
>> +    if (s->vcpu.rflags & X86_EFLAGS_VM) {
>> +        s->vcpu.segments[x86_seg_cs].db = 0;
>> +        s->vcpu.segments[x86_seg_ss].db = 0;
>> +    }
>> +}
>> +
>> +static void init_x86_emulate_ctxt(struct x86_emulate_ctxt *ctxt)
>> +{
>> +    ctxt->ops = &all_emulator_ops;
>> +    ctxt->eflags = 0;
>> +    ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
>> +    ctxt->eip = 0;
>> +
>> +    ctxt->mode = X86EMUL_MODE_PROT64; // TODO: eventually vary this?
>> +
>> +    init_decode_cache(ctxt);
>> +}
>> +
>>   void initialize_emulator(struct state *state)
>>   {
>> +    state->ctxt.ops = &all_emulator_ops;
>> +
>> +    /* See also sanitize_input, some hooks can't be disabled. */
>> +    // MAYBE_DISABLE_HOOK(read_gpr);
>> +
>> +    sanitize_input(state);
>> +    init_x86_emulate_ctxt(&state->ctxt);
>> +}
>> +
>> +static const char *const x86emul_mode_string[] = {
>> +    [X86EMUL_MODE_REAL] = "X86EMUL_MODE_REAL",
>> +    [X86EMUL_MODE_VM86] = "X86EMUL_MODE_VM86",
>> +    [X86EMUL_MODE_PROT16] = "X86EMUL_MODE_PROT16",
>> +    [X86EMUL_MODE_PROT32] = "X86EMUL_MODE_PROT32",
>> +    [X86EMUL_MODE_PROT64] = "X86EMUL_MODE_PROT64",
>> +};
>> +
>> +static void dump_state_after(const char *desc, struct state *state)
>> +{
>> +    debug(" -- State after %s --\n", desc);
>> +    debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
>> +    debug(" cr0: %lx\n", state->vcpu.cr[0]);
>> +    debug(" cr3: %lx\n", state->vcpu.cr[3]);
>> +    debug(" cr4: %lx\n", state->vcpu.cr[4]);
>> +
>> +    debug("Decode _eip: %lu\n", state->ctxt._eip);
>> +    debug("Emulate eip: %lu\n", state->ctxt.eip);
>> +
>> +    debug("\n");
>>   }
>>     int step_emulator(struct state *state)
>>   {
>> -    return 0;
>> +    int rc, prev_eip = state->ctxt.eip;
>> +    int decode_size = state->data_available - decode_offset;
>> +
>> +    if (decode_size < 15) {
>> +        rc = x86_decode_insn(&state->ctxt, &state->data[decode_offset],
>> +                     decode_size);
>> +    } else {
>> +        rc = x86_decode_insn(&state->ctxt, NULL, 0);
> 
> 
> Isn't this going to fetch instructions from data as well? Why do we need
> the < 15 special case at all?
> 

I've changed the method of acquiring data in v2, but the 15 limit is
still relevant.  If x86_decode_insn is called with a NULL pointer and
instruction size 0, the bytes are fetched via the emulator_ops.fetch
function.  This would be nice, but there is no way of limiting how many
bytes it will try and fetch-- and it usually grabs 15 since that is the
longest x86 instruction (as of yet?).  When there are less than 15 bytes
left, limiting the fetch size to the remaining bytes is important.

> 
>> +    }
>> +    debug("Decode result: %d\n", rc);
>> +    if (rc != X86EMUL_CONTINUE)
>> +        return rc;
>> +
>> +    debug("Instruction: ");
>> +    print_n_bytes(&state->data[emul_offset],
>> +              state->ctxt._eip - state->ctxt.eip);
>> +
>> +    rc = x86_emulate_insn(&state->ctxt);
>> +    debug("Emulation result: %d\n", rc);
>> +    dump_state_after("emulating", state);
>> +
>> +    if (state->ctxt.have_exception) {
>> +        fprintf(stderr, "Emulator propagated exception: { ");
>> +        fprintf(stderr, "vector: %d, ", state->ctxt.exception.vector);
>> +        fprintf(stderr, "error code: %d }\n",
>> +            state->ctxt.exception.error_code);
>> +        rc = X86EMUL_UNHANDLEABLE;
>> +    } else if (prev_eip == state->ctxt.eip) {
>> +        fprintf(stderr, "ctxt.eip not advanced.\n");
> 
> 
> During fuzzing, do I really care about all that debug output? I mean,
> both cases above are perfectly legal. What we're trying to catch are
> cases where we're overwriting memory we shouldn't have, no?
> 

This is true.  Debug has been put into a compiler macro.

> 
> In fact, how do we detect that? Would it make sense to have canaries in
> the vcpu struct that we can check before/after instruction execution to
> see if there was an out of bounds memory access somewhere?
> 

As of now, I compile with ASAN, which _hopefully_ would catch most OOB
memory accesses.  We also use gcc's stack canaries.  Additional ways to
detect these kinds of errors are desirable, but I don't know of any at
this point in time.

> 
>> +        rc = X86EMUL_UNHANDLEABLE;
>> +    }
>> +
>> +    if (decode_offset == state->data_available)
>> +        debug("emulator is done\n");
>> +
>> +    return rc;
>>   }
>>     int emulate_until_complete(struct state *state)
>>   {
>> +    int count = 0;
>> +
>> +    do {
>> +        count++;
>> +    } while (step_emulator(state) == X86EMUL_CONTINUE);
>> +
>> +    debug("Emulated %d instructions\n", count);
>>       return 0;
>>   }
>>   @@ -51,8 +365,16 @@ struct state *create_emulator(void)
>>     void reset_emulator(struct state *state)
>>   {
>> +    unsigned char *data = state->data;
>> +    size_t data_available = state->data_available;
>> +
>> +    memset(state, 0, sizeof(struct state));
>> +
>> +    state->data = data;
>> +    state->data_available = data_available;
>>   }
>>     void free_emulator(struct state *state)
>>   {
>> +    free(state);
>>   }
>> diff --git a/tools/fuzz/x86_instruction_emulation/scripts/afl-many
>> b/tools/fuzz/x86_instruction_emulation/scripts/afl-many
>> new file mode 100755
>> index 000000000000..ab15258573a2
>> --- /dev/null
>> +++ b/tools/fuzz/x86_instruction_emulation/scripts/afl-many
>> @@ -0,0 +1,28 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: GPL-2.0+
>> +# This is for running AFL over NPROC or `nproc` cores with normal AFL
>> options.
>> +
>> +export AFL_NO_AFFINITY=1
>> +
>> +while [ -z "$sync_dir" ]; do
>> +  while getopts ":o:" opt; do
>> +    case "${opt}" in
>> +      o)
>> +        sync_dir="${OPTARG}"
>> +        ;;
>> +      *)
>> +        ;;
>> +    esac
>> +  done
>> +  ((OPTIND++))
>> +  [ $OPTIND -gt $# ] && break
>> +done
>> +
>> +for i in $(seq 1 $(( ${NPROC:-$(nproc)} - 1)) ); do
>> +    taskset -c "$i" ./afl-fuzz -S "slave$i" $@ >/dev/null 2>&1 &
>> +done
>> +taskset -c 0 ./afl-fuzz -M master $@ >/dev/null 2>&1 &
> 
> 
> Why the CPU pinning?
> 

AFL has a bad habit of being unable to spread its threads out to
different cores.  We cpu pin each thread to its own vcpu to help with
performance.

> 
>> +
>> +sleep 5
> 
> 
> Why sleep? Shouldn't this rather wait for some async notification to see
> whether all slaves were successfully started?
> 

Again, AFL is a little primitive in this sense.  It does not offer an
async method of notifying when the threads are up.  In the interest of
time I just put the wait in.

> 
>> +watch -n1 "echo \"Executing './afl-fuzz $@' on ${NPROC:-$(nproc)}
>> cores.\" && ./afl-whatsup -s ${sync_dir}"
>> +pkill afl-fuzz
>> diff --git a/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>> b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>> new file mode 100755
>> index 000000000000..e570b17f9404
>> --- /dev/null
>> +++ b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>> @@ -0,0 +1,23 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: GPL-2.0+
>> +# This runs the afl-harness at $1, $2 times (or 100)
>> +# It runs uniq and sorts the output to give an idea of what is
>> causing the
>> +# most crashes.  Useful for deciding what to implement next.
>> +
>> +if [ "$#" -lt 1 ]; then
>> +  echo "Usage: './bin_fuzz path_to_afl-harness [number of times to run]"
>> +  exit
>> +fi
>> +
>> +mkdir -p fuzz
>> +rm -f fuzz/*.in fuzz/*.out
>> +
>> +for i in $(seq 1 1 ${2:-100})
>> +do
>> +  {
>> +  head -c 500 /dev/urandom | tee fuzz/$i.in | ./$1
>> +  } > fuzz/$i.out 2>&1
>> +
>> +done
>> +
>> +find ./fuzz -name '*.out' -exec tail -1 {} \; | sed 's/.*
>> Segmen/Segman/' | sed -r 's/^(\s[0-9a-f]{2})+$/misc instruction
>> output/' | sort | uniq -c | sort -rn
> 
> 
> What is that Segman thing about?
> 

This was for binning crashes-- check `tools/fuzz/x86ie/scripts/bin.sh`
in v2 for the updated version.  Basically, it checks whether a
segmentation fault has happened, and if so, launches a gdb session to
see whether it was caused by an unimplemented x86_emulator_op.  This is
useful in development for prioritizing the unimplemented features which
are causing the most fake crashes.

> 
> Alex
> 
> 

Thanks for taking a look at this.  I'll be sending a v2 out shortly
which addresses all of these issues and makes other misc fixes.

Sam

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] Emulate simple x86 instructions in userspace
  2019-06-12 15:19     ` samcacc
@ 2019-06-21 13:28       ` Alexander Graf
  0 siblings, 0 replies; 11+ messages in thread
From: Alexander Graf @ 2019-06-21 13:28 UTC (permalink / raw)
  To: samcacc, Sam Caccavale
  Cc: samcaccavale, nmanthey, wipawel, dwmw, mpohlack, graf, karahmed,
	andrew.cooper3, JBeulich, pbonzini, rkrcmar, tglx, mingo, bp,
	hpa, paullangton4, anirudhkaushik, x86, kvm, linux-kernel


On 12.06.19 17:19, samcacc@amazon.com wrote:
> On 5/31/19 10:38 AM, Alexander Graf wrote:
>> On 21.05.19 17:39, Sam Caccavale wrote:
>>
>>> +static void dump_state_after(const char *desc, struct state *state)
>>> +{
>>> +    debug(" -- State after %s --\n", desc);
>>> +    debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
>>> +    debug(" cr0: %lx\n", state->vcpu.cr[0]);
>>> +    debug(" cr3: %lx\n", state->vcpu.cr[3]);
>>> +    debug(" cr4: %lx\n", state->vcpu.cr[4]);
>>> +
>>> +    debug("Decode _eip: %lu\n", state->ctxt._eip);
>>> +    debug("Emulate eip: %lu\n", state->ctxt.eip);
>>> +
>>> +    debug("\n");
>>>    }
>>>      int step_emulator(struct state *state)
>>>    {
>>> -    return 0;
>>> +    int rc, prev_eip = state->ctxt.eip;
>>> +    int decode_size = state->data_available - decode_offset;
>>> +
>>> +    if (decode_size < 15) {
>>> +        rc = x86_decode_insn(&state->ctxt, &state->data[decode_offset],
>>> +                     decode_size);
>>> +    } else {
>>> +        rc = x86_decode_insn(&state->ctxt, NULL, 0);
>>
>> Isn't this going to fetch instructions from data as well? Why do we need
>> the < 15 special case at all?
>>
> I've changed the method of acquiring data in v2, but the 15 limit is
> still relevant.  If x86_decode_insn is called with a NULL pointer and
> instruction size 0, the bytes are fetched via the emulator_ops.fetch
> function.  This would be nice, but there is no way of limiting how many
> bytes it will try and fetch-- and it usually grabs 15 since that is the
> longest x86 instruction (as of yet?).  When there are less than 15 bytes
> left, limiting the fetch size to the remaining bytes is important.


You want to at least add a comment here, detailing the fact that where 
the magic 15 comes from and that you want to exercise the normal 
prefetch path while still allowing the buffer to shrink < 15 bytes :). 
Maybe move MAX_INST_SIZE from svm.c into a .h file and reuse that while 
at it.


[...]


>>> diff --git a/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>>> b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>>> new file mode 100755
>>> index 000000000000..e570b17f9404
>>> --- /dev/null
>>> +++ b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>>> @@ -0,0 +1,23 @@
>>> +#!/bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0+
>>> +# This runs the afl-harness at $1, $2 times (or 100)
>>> +# It runs uniq and sorts the output to give an idea of what is
>>> causing the
>>> +# most crashes.  Useful for deciding what to implement next.
>>> +
>>> +if [ "$#" -lt 1 ]; then
>>> +  echo "Usage: './bin_fuzz path_to_afl-harness [number of times to run]"
>>> +  exit
>>> +fi
>>> +
>>> +mkdir -p fuzz
>>> +rm -f fuzz/*.in fuzz/*.out
>>> +
>>> +for i in $(seq 1 1 ${2:-100})
>>> +do
>>> +  {
>>> +  head -c 500 /dev/urandom | tee fuzz/$i.in | ./$1
>>> +  } > fuzz/$i.out 2>&1
>>> +
>>> +done
>>> +
>>> +find ./fuzz -name '*.out' -exec tail -1 {} \; | sed 's/.*
>>> Segmen/Segman/' | sed -r 's/^(\s[0-9a-f]{2})+$/misc instruction
>>> output/' | sort | uniq -c | sort -rn
>>
>> What is that Segman thing about?
>>
> This was for binning crashes-- check `tools/fuzz/x86ie/scripts/bin.sh`
> in v2 for the updated version.  Basically, it checks whether a
> segmentation fault has happened, and if so, launches a gdb session to
> see whether it was caused by an unimplemented x86_emulator_op.  This is
> useful in development for prioritizing the unimplemented features which
> are causing the most fake crashes.


I can see why you want to combine them, but I don't understand where 
"Segman" comes from. Where is there a man here?



Alex


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-06-21 13:28 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-21 15:39 x86 instruction emulator fuzzing Sam Caccavale
2019-05-21 15:39 ` [PATCH 1/3] Build target for emulate.o as a userspace binary Sam Caccavale
2019-05-31  8:02   ` Alexander Graf
2019-06-12 15:19     ` samcacc
2019-05-21 15:39 ` [PATCH 2/3] Emulate simple x86 instructions in userspace Sam Caccavale
2019-05-31  8:38   ` Alexander Graf
2019-06-12 15:19     ` samcacc
2019-06-21 13:28       ` Alexander Graf
2019-05-21 15:39 ` [PATCH 3/3] Demonstrating unit testing via simple-harness Sam Caccavale
2019-05-31  8:39 ` x86 instruction emulator fuzzing Alexander Graf
2019-06-12 15:19   ` samcacc

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.