All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way
@ 2017-10-11 17:52 George Dunlap
  2017-10-11 17:52 ` [PATCH v4 02/12] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Commit 849a1f10c9 was checked in inappropriately; review flagged up
that clearerr() was too big a hammer, as it would clear both the EOF
flag and stream errors.

Stream errors shouldn't be cleared; we only want the EOF and other
stream-related state reset.  To do this, it is sufficient to fseek()
to zero.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
This is a candidate for backport to 4.9 (probably merged with 849a1f10c9).

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index b4d15451b5..31ae1daef1 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -77,6 +77,17 @@ int main(int argc, char **argv)
                 exit(-1);
             }
         }
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+        else
+        {
+            /* 
+             * This will ensure we're dealing with a clean stream
+             * state after the afl-fuzz process messes with the open
+             * file handle.
+             */
+            fseek(fp, 0, SEEK_SET);
+        }
+#endif
 
         size = fread(input, 1, INPUT_SIZE, fp);
 
@@ -97,8 +108,6 @@ int main(int argc, char **argv)
             fclose(fp);
             fp = NULL;
         }
-        else
-            clearerr(fp);
 
         LLVMFuzzerTestOneInput(input, size);
     }
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 02/12] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-11 17:52 ` [PATCH v4 03/12] fuzz/x86_emulate: Implement input_read() and input_avail() George Dunlap
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

- Print the symbolic name rather than the number
- Explicitly state when data_read() fails due to EOI

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v4:
- Make array 'static const char* const'
Changes in v2:
- Add spaces around '='

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 48a879cc88..999f417716 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -52,6 +52,14 @@ struct fuzz_state
     struct x86_emulate_ops ops;
 };
 
+static const char* const x86emul_return_string[] = {
+    [X86EMUL_OKAY] = "X86EMUL_OKAY",
+    [X86EMUL_UNHANDLEABLE] = "X86EMUL_UNHANDLEABLE",
+    [X86EMUL_EXCEPTION] = "X86EMUL_EXCEPTION",
+    [X86EMUL_RETRY] = "X86EMUL_RETRY",
+    [X86EMUL_DONE] = "X86EMUL_DONE",
+};
+
 /*
  * Randomly return success or failure when processing data.  If
  * `exception` is false, this function turns _EXCEPTION to _OKAY.
@@ -84,7 +92,7 @@ static int maybe_fail(struct x86_emulate_ctxt *ctxt,
     if ( rc == X86EMUL_EXCEPTION && !exception )
         rc = X86EMUL_OKAY;
 
-    printf("maybe_fail %s: %d\n", why, rc);
+    printf("maybe_fail %s: %s\n", why, x86emul_return_string[rc]);
 
     if ( rc == X86EMUL_EXCEPTION )
         /* Fake up a pagefault. */
@@ -113,6 +121,7 @@ static int data_read(struct x86_emulate_ctxt *ctxt,
             x86_emul_hw_exception(13, 0, ctxt);
 
         rc = X86EMUL_EXCEPTION;
+        printf("data_read %s: X86EMUL_EXCEPTION (end of input)\n", why);
     }
     else
         rc = maybe_fail(ctxt, why, true);
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 03/12] fuzz/x86_emulate: Implement input_read() and input_avail()
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
  2017-10-11 17:52 ` [PATCH v4 02/12] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-11 17:52 ` [PATCH v4 04/12] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Rather than open-coding the "read" from the input file.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v3:
 - s/input_available/input_avail/;
 - Constify argument to input_avail
 - Fix off-by-one error in input_avail
 - Return false / true rather than 0 / 1 in input_read
v2:
- Use less dread-ful names
- Return bool rather than int

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 31 ++++++++++++++++++-------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 999f417716..5fb8586955 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -52,6 +52,22 @@ struct fuzz_state
     struct x86_emulate_ops ops;
 };
 
+static inline bool input_avail(const struct fuzz_state *s, size_t size)
+{
+    return s->data_index + size <= s->data_num;
+}
+
+static inline bool input_read(struct fuzz_state *s, void *dst, size_t size)
+{
+    if ( !input_avail(s, size) )
+        return false;
+
+    memcpy(dst, &s->corpus->data[s->data_index], size);
+    s->data_index += size;
+
+    return true;
+}
+
 static const char* const x86emul_return_string[] = {
     [X86EMUL_OKAY] = "X86EMUL_OKAY",
     [X86EMUL_UNHANDLEABLE] = "X86EMUL_UNHANDLEABLE",
@@ -68,10 +84,10 @@ static int maybe_fail(struct x86_emulate_ctxt *ctxt,
                       const char *why, bool exception)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
+    unsigned char c;
     int rc;
 
-    if ( s->data_index >= s->data_num )
+    if ( !input_read(s, &c, sizeof(c)) )
         rc = X86EMUL_EXCEPTION;
     else
     {
@@ -80,13 +96,12 @@ static int maybe_fail(struct x86_emulate_ctxt *ctxt,
          * 25% unhandlable
          * 25% exception
          */
-        if ( c->data[s->data_index] > 0xc0 )
+        if ( c > 0xc0 )
             rc = X86EMUL_EXCEPTION;
-        else if ( c->data[s->data_index] > 0x80 )
+        else if ( c > 0x80 )
             rc = X86EMUL_UNHANDLEABLE;
         else
             rc = X86EMUL_OKAY;
-        s->data_index++;
     }
 
     if ( rc == X86EMUL_EXCEPTION && !exception )
@@ -106,11 +121,10 @@ static int data_read(struct x86_emulate_ctxt *ctxt,
                      const char *why, void *dst, unsigned int bytes)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
     unsigned int i;
     int rc;
 
-    if ( s->data_index + bytes > s->data_num )
+    if ( !input_avail(s, bytes) )
     {
         /*
          * Fake up a segment limit violation.  System segment limit volations
@@ -128,8 +142,7 @@ static int data_read(struct x86_emulate_ctxt *ctxt,
 
     if ( rc == X86EMUL_OKAY )
     {
-        memcpy(dst, &c->data[s->data_index], bytes);
-        s->data_index += bytes;
+        input_read(s, dst, bytes);
 
         printf("%s: ", why);
         for ( i = 0; i < bytes; i++ )
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 04/12] fuzz/x86_emulate: Rename the file containing the wrapper code
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
  2017-10-11 17:52 ` [PATCH v4 02/12] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
  2017-10-11 17:52 ` [PATCH v4 03/12] fuzz/x86_emulate: Implement input_read() and input_avail() George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-11 17:52 ` [PATCH v4 05/12] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

When generating coverage output, by default gcov generates output
filenames based only on the coverage file and the "leaf" source file,
not the full path.  As a result, it uses the same name for
x86_emulate.c and x86_emulate/x86_emulate.c, generally overwriting the
second (which we actually are about) with the first (which is just a
wrapper).

Rename the user-space wrapper helpers to x86-emulate.[ch], so
that it generates separate files.

There is actually an option to gcov, `--preserve-paths`, which will
cause the full path name to be included in the filename, properly
distinguishing between the two.  However, given that the user-space
wrapper doesn't actually do any emulation (and the poor state of gcov
documentation making it difficult to find the option in the first
place), it seems to make more sense to rename the file anyway.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v3:
- x86-emulate.* rather than x86_emulate_user.*
- Also update .gitignore to ignore the new files

NB: I discovered the `-p` option to gcov after writing this patch.
But I think the patch itself still makes sense.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 .gitignore                                                |  3 ++-
 tools/fuzz/x86_instruction_emulator/Makefile              | 12 ++++++------
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c           |  2 +-
 tools/tests/x86_emulator/Makefile                         |  6 +++---
 tools/tests/x86_emulator/test_x86_emulator.c              |  2 +-
 tools/tests/x86_emulator/{x86_emulate.c => x86-emulate.c} |  2 +-
 tools/tests/x86_emulator/{x86_emulate.h => x86-emulate.h} |  0
 7 files changed, 14 insertions(+), 13 deletions(-)
 rename tools/tests/x86_emulator/{x86_emulate.c => x86-emulate.c} (99%)
 rename tools/tests/x86_emulator/{x86_emulate.h => x86-emulate.h} (100%)

diff --git a/.gitignore b/.gitignore
index 95f40f19c8..b3587f3809 100644
--- a/.gitignore
+++ b/.gitignore
@@ -163,7 +163,8 @@ tools/flask/utils/flask-set-bool
 tools/flask/utils/flask-label-pci
 tools/fuzz/libelf/afl-libelf-fuzzer
 tools/fuzz/x86_instruction_emulator/asm
-tools/fuzz/x86_instruction_emulator/x86_emulate*
+tools/fuzz/x86_instruction_emulator/x86_emulate
+tools/fuzz/x86_instruction_emulator/x86-emulate.[ch]
 tools/fuzz/x86_instruction_emulator/afl-harness
 tools/helpers/_paths.h
 tools/helpers/init-xenstore-domain
diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
index a3f6b2c754..107bf62a21 100644
--- a/tools/fuzz/x86_instruction_emulator/Makefile
+++ b/tools/fuzz/x86_instruction_emulator/Makefile
@@ -18,22 +18,22 @@ asm:
 
 asm/%: asm ;
 
-x86_emulate.c x86_emulate.h: %:
+x86-emulate.c x86-emulate.h: %:
 	[ -L $* ] || ln -sf $(XEN_ROOT)/tools/tests/x86_emulator/$*
 
 CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
 
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
-x86_emulate.h := x86_emulate.h x86_emulate/x86_emulate.h $(x86.h)
+x86_emulate.h := x86-emulate.h x86_emulate/x86_emulate.h $(x86.h)
 
-x86_emulate.o: x86_emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
+x86-emulate.o: x86-emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
 
 fuzz-emul.o: $(x86_emulate.h)
 
-x86-insn-fuzzer.a: fuzz-emul.o x86_emulate.o
+x86-insn-fuzzer.a: fuzz-emul.o x86-emulate.o
 	$(AR) rc $@ $^
 
-afl-harness: afl-harness.o fuzz-emul.o x86_emulate.o
+afl-harness: afl-harness.o fuzz-emul.o x86-emulate.o
 	$(CC) $(CFLAGS) $^ -o $@
 
 # Common targets
@@ -42,7 +42,7 @@ all: x86-insn-fuzz-all
 
 .PHONY: distclean
 distclean: clean
-	rm -f x86_emulate x86_emulate.c x86_emulate.h asm
+	rm -f x86_emulate x86-emulate.c x86-emulate.h asm
 
 .PHONY: clean
 clean:
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 5fb8586955..8998f21fe1 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -15,7 +15,7 @@
 #include <unistd.h>
 #include <xen/xen.h>
 
-#include "x86_emulate.h"
+#include "x86-emulate.h"
 
 #define MSR_INDEX_MAX 16
 
diff --git a/tools/tests/x86_emulator/Makefile b/tools/tests/x86_emulator/Makefile
index d7be77d98a..ed0fd9710e 100644
--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -77,7 +77,7 @@ $(addsuffix .h,$(TESTCASES)): %.h: %.c testcase.mk Makefile
 $(addsuffix .c,$(SIMD)) $(addsuffix -avx.c,$(filter sse%,$(SIMD))):
 	ln -sf simd.c $@
 
-$(TARGET): x86_emulate.o test_x86_emulator.o
+$(TARGET): x86-emulate.o test_x86_emulator.o
 	$(HOSTCC) $(HOSTCFLAGS) -o $@ $^
 
 .PHONY: clean
@@ -105,9 +105,9 @@ $(call cc-option-add,HOSTCFLAGS-x86_64,HOSTCC,-no-pie)
 HOSTCFLAGS += $(CFLAGS_xeninclude) -I. $(HOSTCFLAGS-$(XEN_COMPILE_ARCH))
 
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
-x86_emulate.h := x86_emulate.h x86_emulate/x86_emulate.h $(x86.h)
+x86_emulate.h := x86-emulate.h x86_emulate/x86_emulate.h $(x86.h)
 
-x86_emulate.o: x86_emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
+x86-emulate.o: x86-emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
 	$(HOSTCC) $(HOSTCFLAGS) -D__XEN_TOOLS__ -c -g -o $@ $<
 
 test_x86_emulator.o: test_x86_emulator.c $(addsuffix .h,$(TESTCASES)) $(x86_emulate.h)
diff --git a/tools/tests/x86_emulator/test_x86_emulator.c b/tools/tests/x86_emulator/test_x86_emulator.c
index 2bb4b1e04b..7a8df419cd 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -3,7 +3,7 @@
 #include <stdio.h>
 #include <sys/mman.h>
 
-#include "x86_emulate.h"
+#include "x86-emulate.h"
 #include "blowfish.h"
 #include "sse.h"
 #include "sse2.h"
diff --git a/tools/tests/x86_emulator/x86_emulate.c b/tools/tests/x86_emulator/x86-emulate.c
similarity index 99%
rename from tools/tests/x86_emulator/x86_emulate.c
rename to tools/tests/x86_emulator/x86-emulate.c
index 79661d5c2b..975ddc7e53 100644
--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86-emulate.c
@@ -1,4 +1,4 @@
-#include "x86_emulate.h"
+#include "x86-emulate.h"
 
 #include <sys/mman.h>
 
diff --git a/tools/tests/x86_emulator/x86_emulate.h b/tools/tests/x86_emulator/x86-emulate.h
similarity index 100%
rename from tools/tests/x86_emulator/x86_emulate.h
rename to tools/tests/x86_emulator/x86-emulate.h
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 05/12] fuzz/x86_emulate: Add 'afl-cov' target
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (2 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 04/12] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-11 17:52 ` [PATCH v4 06/12] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

...to generate a "normal" coverage-instrumented binary, suitable for
use with gcov or afl-cov.

This is slightly annoying because:

 - Every object file needs to have been instrumented to work
   effectively

 - You generally want to have both an afl-instrumented binary and a
   gcov-instrumented binary at the same time, but

 - gcov instrumentation and afl instrumentation are mutually exclusive

So when making the `afl-cov` target, generate a second set of object
files and a second binary with the `-cov` suffix.

While we're here, remove the redundant x86-emulate.c dependency for
x86-emulate.o.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v3:
- Rebase on new versions of previous patch (mainly x86-emulate.* rename)
- Tighten up build rules
- Add newline at the end of README.afl
- Use := for GCOV_FLAGS in Makefile
Changes in v2:
- Pull 'inputs' to x86_emulate_user* into a make variable to avoid duplication

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 .gitignore                                   |  1 +
 tools/fuzz/README.afl                        | 14 ++++++++++++++
 tools/fuzz/x86_instruction_emulator/Makefile | 17 ++++++++++++++---
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/.gitignore b/.gitignore
index b3587f3809..d64b03d06c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -166,6 +166,7 @@ tools/fuzz/x86_instruction_emulator/asm
 tools/fuzz/x86_instruction_emulator/x86_emulate
 tools/fuzz/x86_instruction_emulator/x86-emulate.[ch]
 tools/fuzz/x86_instruction_emulator/afl-harness
+tools/fuzz/x86_instruction_emulator/afl-harness-cov
 tools/helpers/_paths.h
 tools/helpers/init-xenstore-domain
 tools/helpers/xen-init-dom0
diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
index 4758de2490..8b58b8cdea 100644
--- a/tools/fuzz/README.afl
+++ b/tools/fuzz/README.afl
@@ -41,3 +41,17 @@ Use the x86 instruction emulator fuzzer as an example.
    $ $AFLPATH/afl-fuzz -t 1000 -i testcase_dir -o findings_dir -- ./afl-harness
 
 Please see AFL documentation for more information.
+
+# GENERATING COVERAGE INFORMATION
+
+To use afl-cov or gcov, you need a separate binary instrumented to
+generate coverage data.  To do this, use the target `afl-cov`:
+
+    $ make afl-cov #produces afl-harness-cov
+
+NOTE: Please also note that the coverage instrumentation hard-codes
+the absolute path for the instrumentation read and write files in the
+binary; so coverage data will always show up in the build directory no
+matter where you run the binary from.
+
+Please see afl-cov and/or gcov documentation for more information.
diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
index 107bf62a21..cb561aec3f 100644
--- a/tools/fuzz/x86_instruction_emulator/Makefile
+++ b/tools/fuzz/x86_instruction_emulator/Makefile
@@ -23,12 +23,17 @@ x86-emulate.c x86-emulate.h: %:
 
 CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
 
+GCOV_FLAGS := --coverage
+%-cov.o: %.c
+	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) $< -o $@
+
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
 x86_emulate.h := x86-emulate.h x86_emulate/x86_emulate.h $(x86.h)
 
-x86-emulate.o: x86-emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
+# x86-emulate.c will be implicit for both
+x86-emulate.o x86-emulate-cov.o: x86_emulate/x86_emulate.c $(x86_emulate.h)
 
-fuzz-emul.o: $(x86_emulate.h)
+fuzz-emul.o fuzz-emulate-cov.o: $(x86_emulate.h)
 
 x86-insn-fuzzer.a: fuzz-emul.o x86-emulate.o
 	$(AR) rc $@ $^
@@ -36,6 +41,9 @@ x86-insn-fuzzer.a: fuzz-emul.o x86-emulate.o
 afl-harness: afl-harness.o fuzz-emul.o x86-emulate.o
 	$(CC) $(CFLAGS) $^ -o $@
 
+afl-harness-cov: afl-harness-cov.o fuzz-emul-cov.o x86-emulate-cov.o
+	$(CC) $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
+
 # Common targets
 .PHONY: all
 all: x86-insn-fuzz-all
@@ -46,7 +54,7 @@ distclean: clean
 
 .PHONY: clean
 clean:
-	rm -f *.a *.o .*.d afl-harness
+	rm -f *.a *.o .*.d afl-harness afl-harness-cov *.gcda *.gcno *.gcov
 
 .PHONY: install
 install: all
@@ -55,3 +63,6 @@ install: all
 
 .PHONY: afl
 afl: afl-harness
+
+.PHONY: afl-cov
+afl-cov: afl-harness-cov
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 06/12] fuzz/x86_emulate: Take multiple test files for inputs
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (3 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 05/12] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-11 17:52 ` [PATCH v4 07/12] fuzz/x86_emulate: Move definitions into a header George Dunlap
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Finding aggregate coverage for a set of test files means running each
afl-generated test case through the harness.  At the moment, this is
done by re-executing afl-harness-cov with each input file.  When a
large number of test cases have been generated, this can take a
significant amonut of time; a recent test with 30k total files
generated by 4 parallel fuzzers took over 7 minutes.

The vast majority of this time is taken up with 'exec', however.
Since the harness is already designed to loop over multiple inputs for
llvm "persistent mode", just allow it to take a large number of inputs
on the same when *not* running in llvm "persistent mode"..  Then the
command can be efficiently executed like this:

  ls */queue/id* | xargs $path/afl-harness-cov

For the above-mentioned test on 30k files, the time to generate
coverage data was reduced from 7 minutes to under 30 seconds.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v4:
- Fix printf to print the right filename
v3:
- Combine some variable declarations
- Make sure that count is set only once no matter how it's compiled
v2:
- Make check for batch processing more clear

Jan: I took the liberty of retaining your Ack on this, since it was a
simple and obvious fix (which I think you had also suggested).

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/README.afl                             |  7 +++++++
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 25 +++++++++++++++--------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
index 8b58b8cdea..a59564985a 100644
--- a/tools/fuzz/README.afl
+++ b/tools/fuzz/README.afl
@@ -49,6 +49,13 @@ generate coverage data.  To do this, use the target `afl-cov`:
 
     $ make afl-cov #produces afl-harness-cov
 
+In order to speed up the process of checking total coverage,
+`afl-harness-cov` can take several test inputs on its command-line;
+the speed-up effect should be similar to that of using afl-clang-fast.
+You can use xargs to do this most efficiently, like so:
+
+    $ ls queue/id* | xargs $path/afl-harness-cov
+
 NOTE: Please also note that the coverage instrumentation hard-codes
 the absolute path for the instrumentation read and write files in the
 binary; so coverage data will always show up in the build directory no
diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 31ae1daef1..e0c56aadf7 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -16,6 +16,7 @@ int main(int argc, char **argv)
 {
     size_t size;
     FILE *fp = NULL;
+    int max, count;
 
     setbuf(stdin, NULL);
     setbuf(stdout, NULL);
@@ -42,8 +43,7 @@ int main(int argc, char **argv)
             break;
 
         case '?':
-        usage:
-            printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
@@ -54,10 +54,13 @@ int main(int argc, char **argv)
         }
     }
 
-    if ( optind == argc ) /* No positional parameters.  Use stdin. */
+    max = argc - optind;
+
+    if ( !max ) /* No positional parameters.  Use stdin. */
+    {
+        max = 1;
         fp = stdin;
-    else if ( optind != (argc - 1) )
-        goto usage;
+    }
 
     if ( LLVMFuzzerInitialize(&argc, &argv) )
         exit(-1);
@@ -65,12 +68,15 @@ int main(int argc, char **argv)
 #ifdef __AFL_HAVE_MANUAL_CONTROL
     __AFL_INIT();
 
-    while ( __AFL_LOOP(1000) )
+    for( count = 0; __AFL_LOOP(1000); )
+#else
+    for( count = 0; count < max; count++ )
 #endif
     {
         if ( fp != stdin ) /* If not using stdin, open the provided file. */
         {
-            fp = fopen(argv[optind], "rb");
+            printf("Opening file %s\n", argv[optind + count]);
+            fp = fopen(argv[optind + count], "rb");
             if ( fp == NULL )
             {
                 perror("fopen");
@@ -100,7 +106,10 @@ int main(int argc, char **argv)
         if ( !feof(fp) )
         {
             printf("Input too large\n");
-            exit(-1);
+            /* Don't exit if we're doing batch processing */
+            if ( max == 1 )
+                exit(-1);
+            continue;
         }
 
         if ( fp != stdin )
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 07/12] fuzz/x86_emulate: Move definitions into a header
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (4 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 06/12] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-12  9:03   ` Wei Liu
  2017-10-11 17:52 ` [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Move fuzz-emul.c function prototypes into a header.  Also share the
definition of the input size (rather than hard-coding it in
fuzz-emul.c).

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
RFC: Worth trying to BUILD_BUG_ON(INPUT_SIZE < DATA_SIZE_FULL)?

v4:
- Move earlier in the queue.
- Rebase over previous patch (getting rid of fuzz_corpus struct)
- Add build-time assert
v3:
- New in this version

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c |  6 +-----
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   |  7 +++++--
 tools/fuzz/x86_instruction_emulator/fuzz-emul.h   | 10 ++++++++++
 3 files changed, 16 insertions(+), 7 deletions(-)
 create mode 100644 tools/fuzz/x86_instruction_emulator/fuzz-emul.h

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index e0c56aadf7..d514468dd2 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -4,12 +4,8 @@
 #include <stdlib.h>
 #include <string.h>
 #include <getopt.h>
+#include "fuzz-emul.h"
 
-extern int LLVMFuzzerInitialize(int *argc, char ***argv);
-extern int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size);
-extern unsigned int fuzz_minimal_input_size(void);
-
-#define INPUT_SIZE  4096
 static uint8_t input[INPUT_SIZE];
 
 int main(int argc, char **argv)
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 8998f21fe1..964682aa1a 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -16,6 +16,7 @@
 #include <xen/xen.h>
 
 #include "x86-emulate.h"
+#include "fuzz-emul.h"
 
 #define MSR_INDEX_MAX 16
 
@@ -29,7 +30,7 @@ struct fuzz_corpus
     struct cpu_user_regs regs;
     struct segment_register segments[SEG_NUM];
     unsigned long options;
-    unsigned char data[4096];
+    unsigned char data[INPUT_SIZE];
 } input;
 #define DATA_OFFSET offsetof(struct fuzz_corpus, data)
 
@@ -827,7 +828,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
         return 1;
     }
 
-    if ( size > sizeof(input) )
+    if ( size > INPUT_SIZE )
     {
         printf("Input too large\n");
         return 1;
@@ -858,6 +859,8 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
 
 unsigned int fuzz_minimal_input_size(void)
 {
+    BUILD_BUG_ON(DATA_OFFSET > INPUT_SIZE);
+
     return DATA_OFFSET + 1;
 }
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
new file mode 100644
index 0000000000..30dd8de21e
--- /dev/null
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
@@ -0,0 +1,10 @@
+#ifndef FUZZ_EMUL_H
+# define FUZZ_EMUL_H
+
+extern int LLVMFuzzerInitialize(int *argc, char ***argv);
+extern int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size);
+extern unsigned int fuzz_minimal_input_size(void);
+
+#define INPUT_SIZE  4096
+
+#endif /* ifdef FUZZ_EMUL_H */
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (5 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 07/12] fuzz/x86_emulate: Move definitions into a header George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-12 15:16   ` Jan Beulich
  2017-10-11 17:52 ` [PATCH v4 09/12] fuzz/x86_emulate: Make input more compact George Dunlap
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

At the moment we copy data from the input into a struct named
'corpus', then read and write this state (so that it no longer
resembles the corpus that we read from).

Instead, move all "emulated cpu" state into fuzz_state, and explicitly
state that we are expecting to change it.  Get rid of 'input', and
always read data directly from the pointer passed into the fuzzer.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v4:
 - Reword commit message to make it clear it's not just about the compact state
 - Get rid of fuzz_corpus entirely, and avoid the unnecessary copy
v3:
 - Move DATA_OFFSET inside the structure
 - Remove a stray blank line
v2: Port over previous changes

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 114 +++++++++++-------------
 1 file changed, 53 insertions(+), 61 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 964682aa1a..4e3751ce50 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -22,34 +22,31 @@
 
 #define SEG_NUM x86_seg_none
 
-/* Layout of data expected as fuzzing input. */
-struct fuzz_corpus
+/*
+ * State of the fuzzing harness and emulated cpu.  Calculated
+ * initially from the input corpus, and later mutated by the emulation
+ * callbacks (and the emulator itself, in the case of regs).
+ */
+struct fuzz_state
 {
+    /* Emulated CPU state */
+    unsigned long options;
     unsigned long cr[5];
     uint64_t msr[MSR_INDEX_MAX];
-    struct cpu_user_regs regs;
     struct segment_register segments[SEG_NUM];
-    unsigned long options;
-    unsigned char data[INPUT_SIZE];
-} input;
-#define DATA_OFFSET offsetof(struct fuzz_corpus, data)
+    struct cpu_user_regs regs;
 
-/*
- * Internal state of the fuzzing harness.  Calculated initially from the input
- * corpus, and later mutates by the emulation callbacks.
- */
-struct fuzz_state
-{
     /* Fuzzer's input data. */
-    struct fuzz_corpus *corpus;
+#define DATA_OFFSET offsetof(struct fuzz_state, corpus)
+    const unsigned char * corpus;
 
-    /* Real amount of data backing corpus->data[]. */
+    /* Real amount of data backing corpus[]. */
     size_t data_num;
 
-    /* Amount of corpus->data[] consumed thus far. */
+    /* Amount of corpus[] data consumed thus far. */
     size_t data_index;
 
-    /* Emulation ops, some of which are disabled based on corpus->options. */
+    /* Emulation ops, some of which are disabled based on options. */
     struct x86_emulate_ops ops;
 };
 
@@ -63,7 +60,7 @@ static inline bool input_read(struct fuzz_state *s, void *dst, size_t size)
     if ( !input_avail(s, size) )
         return false;
 
-    memcpy(dst, &s->corpus->data[s->data_index], size);
+    memcpy(dst, &s->corpus[s->data_index], size);
     s->data_index += size;
 
     return true;
@@ -393,11 +390,10 @@ static int fuzz_read_segment(
     struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
     assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
 
-    *reg = c->segments[seg];
+    *reg = s->segments[seg];
 
     return X86EMUL_OKAY;
 }
@@ -408,7 +404,6 @@ static int fuzz_write_segment(
     struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
     int rc;
 
     assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
@@ -416,7 +411,7 @@ static int fuzz_write_segment(
     rc = maybe_fail(ctxt, "write_segment", true);
 
     if ( rc == X86EMUL_OKAY )
-        c->segments[seg] = *reg;
+        s->segments[seg] = *reg;
 
     return rc;
 }
@@ -427,12 +422,11 @@ static int fuzz_read_cr(
     struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
-    if ( reg >= ARRAY_SIZE(c->cr) )
+    if ( reg >= ARRAY_SIZE(s->cr) )
         return X86EMUL_UNHANDLEABLE;
 
-    *val = c->cr[reg];
+    *val = s->cr[reg];
 
     return X86EMUL_OKAY;
 }
@@ -443,17 +437,16 @@ static int fuzz_write_cr(
     struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
     int rc;
 
-    if ( reg >= ARRAY_SIZE(c->cr) )
+    if ( reg >= ARRAY_SIZE(s->cr) )
         return X86EMUL_UNHANDLEABLE;
 
     rc = maybe_fail(ctxt, "write_cr", true);
     if ( rc != X86EMUL_OKAY )
         return rc;
 
-    c->cr[reg] = val;
+    s->cr[reg] = val;
 
     return X86EMUL_OKAY;
 }
@@ -488,7 +481,6 @@ static int fuzz_read_msr(
     struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
     unsigned int idx;
 
     switch ( reg )
@@ -502,10 +494,10 @@ static int fuzz_read_msr(
          */
         return data_read(ctxt, x86_seg_none, "read_msr", val, sizeof(*val));
     case MSR_EFER:
-        *val = c->msr[MSRI_EFER];
+        *val = s->msr[MSRI_EFER];
         *val &= ~EFER_LMA;
-        if ( (*val & EFER_LME) && (c->cr[4] & X86_CR4_PAE) &&
-             (c->cr[0] & X86_CR0_PG) )
+        if ( (*val & EFER_LME) && (s->cr[4] & X86_CR4_PAE) &&
+             (s->cr[0] & X86_CR0_PG) )
         {
             printf("Setting EFER_LMA\n");
             *val |= EFER_LMA;
@@ -517,7 +509,7 @@ static int fuzz_read_msr(
     {
         if ( msr_index[idx] == reg )
         {
-            *val = c->msr[idx];
+            *val = s->msr[idx];
             return X86EMUL_OKAY;
         }
     }
@@ -532,7 +524,6 @@ static int fuzz_write_msr(
     struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
     unsigned int idx;
     int rc;
 
@@ -551,7 +542,7 @@ static int fuzz_write_msr(
     {
         if ( msr_index[idx] == reg )
         {
-            c->msr[idx] = val;
+            s->msr[idx] = val;
             return X86EMUL_OKAY;
         }
     }
@@ -601,15 +592,14 @@ static void setup_fpu_exception_handler(void)
 static void dump_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
     struct cpu_user_regs *regs = ctxt->regs;
     uint64_t val = 0;
 
     printf(" -- State -- \n");
     printf("addr / sp size: %d / %d\n", ctxt->addr_size, ctxt->sp_size);
-    printf(" cr0: %lx\n", c->cr[0]);
-    printf(" cr3: %lx\n", c->cr[3]);
-    printf(" cr4: %lx\n", c->cr[4]);
+    printf(" cr0: %lx\n", s->cr[0]);
+    printf(" cr3: %lx\n", s->cr[3]);
+    printf(" cr4: %lx\n", s->cr[4]);
 
     printf(" rip: %"PRIx64"\n", regs->rip);
 
@@ -630,15 +620,13 @@ static bool long_mode_active(struct x86_emulate_ctxt *ctxt)
 static bool in_longmode(struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
-    return long_mode_active(ctxt) && c->segments[x86_seg_cs].l;
+    return long_mode_active(ctxt) && s->segments[x86_seg_cs].l;
 }
 
 static void set_sizes(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
     ctxt->lma = long_mode_active(ctxt);
 
@@ -646,11 +634,20 @@ static void set_sizes(struct x86_emulate_ctxt *ctxt)
         ctxt->addr_size = ctxt->sp_size = 64;
     else
     {
-        ctxt->addr_size = c->segments[x86_seg_cs].db ? 32 : 16;
-        ctxt->sp_size   = c->segments[x86_seg_ss].db ? 32 : 16;
+        ctxt->addr_size = s->segments[x86_seg_cs].db ? 32 : 16;
+        ctxt->sp_size   = s->segments[x86_seg_ss].db ? 32 : 16;
     }
 }
 
+static void setup_state(struct x86_emulate_ctxt *ctxt)
+{
+    struct fuzz_state *s = ctxt->data;
+
+    /* Fuzz all of the emulated state in one go */
+    if (!input_read(s, s, DATA_OFFSET))
+        exit(-1);
+}
+
 #define CANONICALIZE(x)                                   \
     do {                                                  \
         uint64_t _y = (x);                                \
@@ -710,8 +707,7 @@ enum {
 static void disable_hooks(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
-    unsigned long bitmap = c->options;
+    unsigned long bitmap = s->options;
 
     /* See also sanitize_input, some hooks can't be disabled. */
     MAYBE_DISABLE_HOOK(read);
@@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
 static void sanitize_input(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
-    struct cpu_user_regs *regs = &c->regs;
-    unsigned long bitmap = c->options;
+    struct cpu_user_regs *regs = ctxt->regs;
+    unsigned long bitmap = s->options;
 
     /* Some hooks can't be disabled. */
-    c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
+    s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
 
     /* Zero 'private' entries */
     regs->error_code = 0;
@@ -780,8 +775,8 @@ static void sanitize_input(struct x86_emulate_ctxt *ctxt)
      * CR0.PG can't be set if CR0.PE isn't set.  Set is more interesting, so
      * set PE if PG is set.
      */
-    if ( c->cr[0] & X86_CR0_PG )
-        c->cr[0] |= X86_CR0_PE;
+    if ( s->cr[0] & X86_CR0_PG )
+        s->cr[0] |= X86_CR0_PE;
 
     /* EFLAGS.VM not available in long mode */
     if ( long_mode_active(ctxt) )
@@ -790,8 +785,8 @@ static void sanitize_input(struct x86_emulate_ctxt *ctxt)
     /* EFLAGS.VM implies 16-bit mode */
     if ( regs->rflags & X86_EFLAGS_VM )
     {
-        c->segments[x86_seg_cs].db = 0;
-        c->segments[x86_seg_ss].db = 0;
+        s->segments[x86_seg_cs].db = 0;
+        s->segments[x86_seg_ss].db = 0;
     }
 }
 
@@ -813,15 +808,12 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     };
     struct x86_emulate_ctxt ctxt = {
         .data = &state,
-        .regs = &input.regs,
+        .regs = &state.regs,
         .addr_size = 8 * sizeof(void *),
         .sp_size = 8 * sizeof(void *),
     };
     int rc;
 
-    /* Reset all global state variables */
-    memset(&input, 0, sizeof(input));
-
     if ( size <= DATA_OFFSET )
     {
         printf("Input too small\n");
@@ -834,10 +826,10 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
         return 1;
     }
 
-    memcpy(&input, data_p, size);
+    state.corpus = (void*)data_p;
+    state.data_num = size;
 
-    state.corpus = &input;
-    state.data_num = size - DATA_OFFSET;
+    setup_state(&ctxt);
 
     sanitize_input(&ctxt);
 
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 09/12] fuzz/x86_emulate: Make input more compact
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (6 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-11 17:52 ` [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

At the moment, AFL reckons that for any given input, 87% of it is
completely irrelevant: that is, it can change it as much as it wants
but have no impact on the result of the test; and yet it can't remove
it.

This is largely because we interpret the blob handed to us as a large
struct, including CR values, MSR values, segment registers, and a full
cpu_user_regs.

Instead, modify our interpretation to have a "set state" stanza at the
front.  Begin by reading a 16-bit value; if it is lower than a certain
threshold, set some state according to what byte it is, and repeat.
Continue until the byte is above a certain threshold.

This allows AFL to compact any given test case much smaller; to the
point where now it reckons there is not a single byte of the test file
which becomes irrelevant.  Testing have shown that this option both
allows AFL to reach coverage much faster, and to have a total coverage
higher than with the old format.

Make this an option (rather than a unilateral change) to enable
side-by-side performance comparison of the old and new formats.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v4:
 - Rebase over previous changes
v3:
 - Set default for opt_compact statically in fuzz-emul.c rather than afl-harness
 - Make input size / unconditional input reading more consistent
 - Require only minimum input read, not first instruction byte
 - Use ARRAY_SIZE() for hardcoded values
 - Use `for ( ; ; )` rather than `while(1)`
 - Some style issues
 - Move opt_compact declaration into fuzz-emul.h
v2: Port over previous changes

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c |   9 +-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 103 +++++++++++++++++++---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.h   |   2 +
 3 files changed, 99 insertions(+), 15 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index d514468dd2..23a77a73c0 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -4,6 +4,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <getopt.h>
+#include <stdbool.h>
 #include "fuzz-emul.h"
 
 static uint8_t input[INPUT_SIZE];
@@ -21,9 +22,11 @@ int main(int argc, char **argv)
     {
         enum {
             OPT_MIN_SIZE,
+            OPT_COMPACT,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
+            { "compact", required_argument, NULL, OPT_COMPACT },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -38,8 +41,12 @@ int main(int argc, char **argv)
             exit(0);
             break;
 
+        case OPT_COMPACT:
+            opt_compact = atoi(optarg);
+            break;
+            
         case '?':
-            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 4e3751ce50..3ea8a8b726 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -31,14 +31,15 @@ struct fuzz_state
 {
     /* Emulated CPU state */
     unsigned long options;
+#define DATA_SIZE_COMPACT offsetof(struct fuzz_state, cr)
     unsigned long cr[5];
     uint64_t msr[MSR_INDEX_MAX];
     struct segment_register segments[SEG_NUM];
     struct cpu_user_regs regs;
 
     /* Fuzzer's input data. */
-#define DATA_OFFSET offsetof(struct fuzz_state, corpus)
-    const unsigned char * corpus;
+#define DATA_SIZE_FULL offsetof(struct fuzz_state, corpus)
+    const unsigned char *corpus;
 
     /* Real amount of data backing corpus[]. */
     size_t data_num;
@@ -50,6 +51,15 @@ struct fuzz_state
     struct x86_emulate_ops ops;
 };
 
+bool opt_compact = true;
+
+unsigned int fuzz_minimal_input_size(void)
+{
+    BUILD_BUG_ON(DATA_SIZE_FULL > INPUT_SIZE);
+
+    return opt_compact ? DATA_SIZE_COMPACT : DATA_SIZE_FULL;
+}
+
 static inline bool input_avail(const struct fuzz_state *s, size_t size)
 {
     return s->data_index + size <= s->data_num;
@@ -643,9 +653,82 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
 
-    /* Fuzz all of the emulated state in one go */
-    if (!input_read(s, s, DATA_OFFSET))
-        exit(-1);
+    if ( !opt_compact )
+    {
+        /* Fuzz all of the emulated state in one go */
+        if ( !input_read(s, s, DATA_SIZE_FULL) )
+            exit(-1);
+        return;
+    }
+
+    /* Modify only select bits of state */
+
+    /* Always read 'options' */
+    if ( !input_read(s, s, DATA_SIZE_COMPACT) )
+        return;
+    
+    for ( ; ; )
+    {
+        uint16_t offset;
+
+        /* Read 16 bits to decide what bit of state to modify */
+        if ( !input_read(s, &offset, sizeof(offset)) )
+            return;
+
+        /* 
+         * Then decide if it's "pointing to" different bits of the
+         * state 
+         */
+
+        /* cr[]? */
+        if ( offset < ARRAY_SIZE(s->cr) )
+        {
+            if ( !input_read(s, s->cr + offset, sizeof(*s->cr)) )
+                return;
+            printf("Setting CR %d to %lx\n", offset, s->cr[offset]);
+            continue;
+        }
+        
+        offset -= ARRAY_SIZE(s->cr);
+
+        /* msr[]? */
+        if ( offset < ARRAY_SIZE(s->msr) )
+        {
+            if ( !input_read(s, s->msr + offset, sizeof(*s->msr)) )
+                return;
+            printf("Setting MSR i%d (%x) to %lx\n", offset,
+                   msr_index[offset], s->msr[offset]);
+            continue;
+        }
+
+        offset -= ARRAY_SIZE(s->msr);
+
+        /* segments[]? */
+        if ( offset < ARRAY_SIZE(s->segments) )
+        {
+            if ( !input_read(s, s->segments + offset, sizeof(*s->segments)) )
+                return;
+            printf("Setting Segment %d\n", offset);
+            continue;
+            
+        }
+
+        offset -= ARRAY_SIZE(s->segments);
+
+        /* regs? */
+        if ( offset < sizeof(struct cpu_user_regs)
+             && offset + sizeof(uint64_t) <= sizeof(struct cpu_user_regs) )
+        {
+            if ( !input_read(s, ((char *)ctxt->regs) + offset, sizeof(uint64_t)) )
+                return;
+            printf("Setting cpu_user_regs offset %x\n", offset);
+            continue;
+        }
+
+        /* None of the above -- take that as "start emulating" */
+        
+        return;
+    }
 }
 
 #define CANONICALIZE(x)                                   \
@@ -814,7 +897,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     };
     int rc;
 
-    if ( size <= DATA_OFFSET )
+    if ( size <= fuzz_minimal_input_size() )
     {
         printf("Input too small\n");
         return 1;
@@ -848,14 +931,6 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
 
     return 0;
 }
-
-unsigned int fuzz_minimal_input_size(void)
-{
-    BUILD_BUG_ON(DATA_OFFSET > INPUT_SIZE);
-
-    return DATA_OFFSET + 1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
index 30dd8de21e..85d21cbf0f 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
@@ -5,6 +5,8 @@ extern int LLVMFuzzerInitialize(int *argc, char ***argv);
 extern int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size);
 extern unsigned int fuzz_minimal_input_size(void);
 
+extern bool opt_compact;
+
 #define INPUT_SIZE  4096
 
 #endif /* ifdef FUZZ_EMUL_H */
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (7 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 09/12] fuzz/x86_emulate: Make input more compact George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-12 15:24   ` Jan Beulich
  2017-10-11 17:52 ` [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
  2017-10-11 17:52 ` [PATCH v4 12/12] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
  10 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Current stability numbers are not 100%.  In order to help track this
down, add a --rerun option which will run the same input twice,
resetting the state between each run, and comparing the state
afterwards.  If the state differs, call abort().

This allows AFL to help the process of tracking down what state is not
being reset properly between runs by proving testcases that
demonstrate the behavior.

To do this:

- Move ctxt into struct fuzz-state to simplify handling

- Rather than copying the data into input, treat the data handed as
  immutable and point each "copy" to it

- Factor out various steps (setting up fuzz state, running an
  individual test) so that they can be efficiently run either once or
  twice (as necessary)

- Compare the states afterwards, printing what's different and calling
  abort() if anything is found.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v4:
- Fix some stylistic issues
- Rename PINE macro to PRINT_NE
- Remove unnecessary printf
- Fix size_t format string
v3:
- Make new local functions static
- Avoid losing const-ness of pointer in setup_fuzz_state()
- Remove useless *_size initialization
- Remove extra blank line
- Use ARRAY_SIZE() when appropriate
- Move opt_rerun declaration into fuzz-emul.h
- Change loop variable to unsigned int
- Print segment contents when segments differ
v2:
- Fix some coding style issues
- Port over previous changes

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c |   8 +-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 188 ++++++++++++++++++----
 tools/fuzz/x86_instruction_emulator/fuzz-emul.h   |   1 +
 3 files changed, 168 insertions(+), 29 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 23a77a73c0..0489034642 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -23,10 +23,12 @@ int main(int argc, char **argv)
         enum {
             OPT_MIN_SIZE,
             OPT_COMPACT,
+            OPT_RERUN,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
             { "compact", required_argument, NULL, OPT_COMPACT },
+            { "rerun", no_argument, NULL, OPT_RERUN },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -45,8 +47,12 @@ int main(int argc, char **argv)
             opt_compact = atoi(optarg);
             break;
             
+        case OPT_RERUN:
+            opt_rerun = true;
+            break;
+
         case '?':
-            printf("Usage: %s [--compact=0|1] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] [--rerun] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 3ea8a8b726..f1621f98da 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -49,6 +49,7 @@ struct fuzz_state
 
     /* Emulation ops, some of which are disabled based on options. */
     struct x86_emulate_ops ops;
+    struct x86_emulate_ctxt ctxt;
 };
 
 bool opt_compact = true;
@@ -493,6 +494,12 @@ static int fuzz_read_msr(
     const struct fuzz_state *s = ctxt->data;
     unsigned int idx;
 
+    /* 
+     * NB at the moment dump_state() relies on the fact that this
+     * cannot fail.  If we add in fuzzed failures we'll have to handle
+     * that differently.
+     */
+    
     switch ( reg )
     {
     case MSR_TSC_AUX:
@@ -613,6 +620,7 @@ static void dump_state(struct x86_emulate_ctxt *ctxt)
 
     printf(" rip: %"PRIx64"\n", regs->rip);
 
+    /* read_msr() never fails at the moment */
     fuzz_read_msr(MSR_EFER, &val, ctxt);
     printf("EFER: %"PRIx64"\n", val);
 }
@@ -787,9 +795,8 @@ enum {
         printf("Disabling hook "#h"\n");               \
     }
 
-static void disable_hooks(struct x86_emulate_ctxt *ctxt)
+static void disable_hooks(struct fuzz_state *s)
 {
-    struct fuzz_state *s = ctxt->data;
     unsigned long bitmap = s->options;
 
     /* See also sanitize_input, some hooks can't be disabled. */
@@ -837,7 +844,7 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
  *  - ...bases to below 1Mb, 16-byte aligned
  *  - ...selectors to (base >> 4)
  */
-static void sanitize_input(struct x86_emulate_ctxt *ctxt)
+static void sanitize_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
     struct cpu_user_regs *regs = ctxt->regs;
@@ -884,20 +891,146 @@ int LLVMFuzzerInitialize(int *argc, char ***argv)
     return 0;
 }
 
-int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
+static void setup_fuzz_state(struct fuzz_state *state, const void *data_p, size_t size)
 {
-    struct fuzz_state state = {
-        .ops = all_fuzzer_ops,
-    };
-    struct x86_emulate_ctxt ctxt = {
-        .data = &state,
-        .regs = &state.regs,
-        .addr_size = 8 * sizeof(void *),
-        .sp_size = 8 * sizeof(void *),
-    };
+    memset(state, 0, sizeof(*state));
+    state->corpus = data_p;
+    state->data_num = size;
+}
+
+static int runtest(struct fuzz_state *state) {
     int rc;
 
-    if ( size <= fuzz_minimal_input_size() )
+    struct x86_emulate_ctxt *ctxt = &state->ctxt;
+    
+    state->ops = all_fuzzer_ops;
+
+    ctxt->data = state;
+    ctxt->regs = &state->regs;
+
+    setup_state(ctxt);
+
+    sanitize_state(ctxt);
+
+    disable_hooks(state);
+
+    do {
+        /* FIXME: Until we actually implement SIGFPE handling properly */
+        setup_fpu_exception_handler();
+
+        set_sizes(ctxt);
+        dump_state(ctxt);
+
+        rc = x86_emulate(ctxt, &state->ops);
+        printf("Emulation result: %d\n", rc);
+    } while ( rc == X86EMUL_OKAY );
+
+    return 0;
+}
+
+static void compare_states(struct fuzz_state state[2])
+{
+    /* First zero any "internal" pointers */
+    state[0].corpus = state[1].corpus = NULL;
+    state[0].ctxt.data = state[1].ctxt.data = NULL;
+    state[0].ctxt.regs = state[1].ctxt.regs = NULL;
+
+    if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) )
+    {
+        unsigned int i;
+
+        printf("State mismatch\n");
+
+        for ( i = 0; i < ARRAY_SIZE(state[0].cr); i++ )
+            if ( state[0].cr[i] != state[1].cr[i] )
+                printf("cr[%u]: %lx != %lx\n",
+                       i, state[0].cr[i], state[1].cr[i]);
+        
+        for ( i = 0; i < ARRAY_SIZE(state[0].msr); i++ )
+            if ( state[0].msr[i] != state[1].msr[i] )
+                printf("msr[%u]: %lx != %lx\n",
+                       i, state[0].msr[i], state[1].msr[i]);
+        
+        for ( i = 0; i < ARRAY_SIZE(state[0].segments); i++ )
+            if ( memcmp(&state[0].segments[i], &state[1].segments[i],
+                        sizeof(state[0].segments[0])) )
+                printf("segments[%u]: [%x:%x:%x:%lx] != [%x:%x:%x:%lx]!\n", i,
+                       (unsigned)state[0].segments[i].sel,
+                       (unsigned)state[0].segments[i].attr,
+                       state[0].segments[i].limit,
+                       state[0].segments[i].base,
+                       (unsigned)state[1].segments[i].sel,
+                       (unsigned)state[1].segments[i].attr,
+                       state[1].segments[i].limit,
+                       state[1].segments[i].base);
+
+        if ( state[0].data_num != state[1].data_num )
+            printf("data_num: %lx !=  %lx\n", state[0].data_num,
+                   state[1].data_num);
+        if ( state[0].data_index != state[1].data_index )
+            printf("data_index: %lx !=  %lx\n", state[0].data_index,
+                   state[1].data_index);
+
+        if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) )
+        {
+            printf("registers differ!\n");
+            /* Print If Not Equal */
+#define PRINT_NE(elem)\
+            if ( state[0].elem != state[1].elem ) \
+                printf(#elem " differ: %lx != %lx\n", \
+                       (unsigned long)state[0].elem, \
+                       (unsigned long)state[1].elem)
+            PRINT_NE(regs.r15);
+            PRINT_NE(regs.r14);
+            PRINT_NE(regs.r13);
+            PRINT_NE(regs.r12);
+            PRINT_NE(regs.rbp);
+            PRINT_NE(regs.rbx);
+            PRINT_NE(regs.r10);
+            PRINT_NE(regs.r11);
+            PRINT_NE(regs.r9);
+            PRINT_NE(regs.r8);
+            PRINT_NE(regs.rax);
+            PRINT_NE(regs.rcx);
+            PRINT_NE(regs.rdx);
+            PRINT_NE(regs.rsi);
+            PRINT_NE(regs.rdi);
+
+            for ( i = offsetof(struct cpu_user_regs, error_code) / sizeof(unsigned);
+                  i < sizeof(state[1].regs)/sizeof(unsigned); i++ )
+            {
+                printf("[%04zu] %08x %08x\n",
+                        i * sizeof(unsigned), ((unsigned *)&state[0].regs)[i],
+                       ((unsigned *)&state[1].regs)[i]);
+            }
+        }
+
+        if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
+            printf("ops differ!\n");
+
+        if ( memcmp(&state[0].ctxt, &state[1].ctxt, sizeof(state[0].ctxt)) )
+        {
+            printf("ctxt differs!\n");
+            for ( i = 0;  i < sizeof(state[0].ctxt)/sizeof(unsigned); i++ )
+            {
+                printf("[%04lu] %08x %08x\n",
+                        i * sizeof(unsigned), ((unsigned *)&state[0].ctxt)[i],
+                       ((unsigned *)&state[1].ctxt)[i]);
+            }
+            
+        }
+
+        abort();
+    }
+}
+
+bool opt_rerun = false;
+
+int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
+{
+    struct fuzz_state state[2];
+
+    if ( size < fuzz_minimal_input_size() )
     {
         printf("Input too small\n");
         return 1;
@@ -909,25 +1042,24 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
         return 1;
     }
 
-    state.corpus = (void*)data_p;
-    state.data_num = size;
-
-    setup_state(&ctxt);
+    setup_fuzz_state(&state[0], data_p, size);
+    
+    if ( opt_rerun )
+        printf("||| INITIAL RUN |||\n");
+    
+    runtest(&state[0]);
 
-    sanitize_input(&ctxt);
+    if ( !opt_rerun )
+        return 0;
 
-    disable_hooks(&ctxt);
+    /* Reset all global state variables again */
+    setup_fuzz_state(&state[1], data_p, size);
 
-    do {
-        /* FIXME: Until we actually implement SIGFPE handling properly */
-        setup_fpu_exception_handler();
+    printf("||| SECOND RUN |||\n");
 
-        set_sizes(&ctxt);
-        dump_state(&ctxt);
+    runtest(&state[1]);
 
-        rc = x86_emulate(&ctxt, &state.ops);
-        printf("Emulation result: %d\n", rc);
-    } while ( rc == X86EMUL_OKAY );
+    compare_states(state);
 
     return 0;
 }
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
index 85d21cbf0f..4863bf2166 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
@@ -6,6 +6,7 @@ extern int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size);
 extern unsigned int fuzz_minimal_input_size(void);
 
 extern bool opt_compact;
+extern bool opt_rerun;
 
 #define INPUT_SIZE  4096
 
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (8 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  2017-10-12 15:38   ` Jan Beulich
  2017-10-11 17:52 ` [PATCH v4 12/12] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
  10 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

x86_emulate() operates not only on state passed to it in
cpu_user_regs, but also on state currently found on the cpu: namely,
the FPU and XMM registers.  At the moment, we re-zero (and/or
re-initialize) cpu_user_regs on every invocation, but leave the
cpu-stored state alone.  In "persistent mode", this causes test cases
to behave differently -- sometimes significantly so -- depending on
which test cases have been run beforehand.

Zero out the state before each test run, and then fuzz it based on the
corpus input.

The Intel manual claims that, "If [certain CPUID bits] are set, the
processor deprecates FCS and FDS, and the field is saved as 0000h";
but experimentally it would be more accurate to say, "the field is
occasionally saved as 0000h".  This causes the --rerun checking to
trip non-deterministically.  Sanitize them to zero.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v4:
- Remove ineffective fxrstor
- Sanitize fcs and fds elements
v3:
- Make type 512 bytes rather than 464
- Style changes
- Change argument from 'store' to 'write'
- Add a comment explaining why we always 'save' even for a write
- Sanitize mxcsr with mxcrs_mask when writing instead of zeroing it in sanitize_state
- Get rid of redundant mxcsr_mask setting
- Add comments explaining why we're arbitrarily writing 32 bits
v2: Rebase on top of previous changes

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 102 +++++++++++++++++++++++-
 1 file changed, 101 insertions(+), 1 deletion(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index f1621f98da..881c4d03c1 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -36,6 +36,7 @@ struct fuzz_state
     uint64_t msr[MSR_INDEX_MAX];
     struct segment_register segments[SEG_NUM];
     struct cpu_user_regs regs;
+    char fxsave[512] __attribute__((aligned(16)));
 
     /* Fuzzer's input data. */
 #define DATA_SIZE_FULL offsetof(struct fuzz_state, corpus)
@@ -594,6 +595,75 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
 };
 #undef SET
 
+/*
+ * This funciton will read or write fxsave to the fpu.  When writing,
+ * it 'sanitizes' the state: It will mask off the appropriate bits in
+ * the mxcsr, 'restore' the state to the fpu, then 'save' it again so
+ * that the data in fxsave reflects what's actually in the FPU.
+ *
+ * TODO: Extend state beyond just FPU (ymm registers, &c)
+ */
+static void _set_fpu_state(char *fxsave, bool write)
+{
+    if ( cpu_has_fxsr )
+    {
+        static union __attribute__((__aligned__(16))) {
+            char x[512];
+            struct {
+                uint16_t cw, sw;
+                uint8_t  tw, _rsvd1;
+                uint16_t op;
+                uint32_t ip;
+                uint16_t cs, _rsvd2;
+                uint32_t dp;
+                uint16_t ds, _rsvd3;
+                uint32_t mxcsr;
+                uint32_t mxcsr_mask;
+                /* ... */
+            };
+        } *fxs;
+
+        fxs = (typeof(fxs))fxsave;
+
+        if ( write )
+        {
+            /* 
+             * Clear reserved bits to make sure we don't get any
+             * exceptions
+             */
+            fxs->mxcsr &= mxcsr_mask;
+
+            /*
+             * The Intel manual says that on newer models CS/DS are
+             * deprecated and that these fields "are saved as 0000h".
+             * Experimentally, however, at least on my test box,
+             * whether this saved as 0000h or as the previously
+             * written value is random; meaning that when run with
+             * --rerun, we occasionally detect a "state mismatch" in these
+             * bytes.  Instead, simply sanitize them to zero.
+             *
+             * TODO Check CPUID as specified in the manual before
+             * clearing
+             */
+            fxs->cs = fxs->ds = 0;
+
+            asm volatile( "fxrstor %0" :: "m" (*fxs) );
+        }
+
+        asm volatile( "fxsave %0" : "=m" (*fxs) );
+    }
+}
+
+static void set_fpu_state(char *fxsave)
+{
+    _set_fpu_state(fxsave, true);
+}
+
+static void save_fpu_state(char *fxsave)
+{
+    _set_fpu_state(fxsave, false);
+}
+
 static void setup_fpu_exception_handler(void)
 {
     /* FIXME - just disable exceptions for now */
@@ -669,7 +739,11 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
         return;
     }
 
-    /* Modify only select bits of state */
+    /*
+     * Modify only select bits of state.  In general, try not to fuzz less
+     * than 32 bits at a time; otherwise we're reading 2 bytes in order to fuzz only
+     * one byte. 
+     */
 
     /* Always read 'options' */
     if ( !input_read(s, s, DATA_SIZE_COMPACT) )
@@ -732,6 +806,18 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
             printf("Setting cpu_user_regs offset %x\n", offset);
             continue;
         }
+        offset -= sizeof(struct cpu_user_regs);
+
+        /* Fuzz fxsave state */
+        if ( offset < sizeof(s->fxsave) / 4 )
+        {
+            /* 32-bit size is arbitrary; see comment above */
+            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
+                return;
+            printf("Setting fxsave offset %x\n", offset * 4);
+            continue;
+        }
+        offset -= 128;
 
         /* None of the above -- take that as "start emulating" */
         
@@ -914,6 +1000,8 @@ static int runtest(struct fuzz_state *state) {
 
     disable_hooks(state);
 
+    set_fpu_state(state->fxsave);
+
     do {
         /* FIXME: Until we actually implement SIGFPE handling properly */
         setup_fpu_exception_handler();
@@ -925,6 +1013,8 @@ static int runtest(struct fuzz_state *state) {
         printf("Emulation result: %d\n", rc);
     } while ( rc == X86EMUL_OKAY );
 
+    save_fpu_state(state->fxsave);
+    
     return 0;
 }
 
@@ -1008,6 +1098,16 @@ static void compare_states(struct fuzz_state state[2])
         if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
             printf("ops differ!\n");
 
+        if ( memcmp(&state[0].fxsave, &state[1].fxsave, sizeof(state[0].fxsave)) )
+        {
+            printf("fxsave differs!\n");
+            for ( i = 0;  i < sizeof(state[0].fxsave)/sizeof(unsigned); i++ )
+            {
+                printf("[%04lu] %08x %08x\n",
+                        i * sizeof(unsigned), ((unsigned *)&state[0].fxsave)[i], ((unsigned *)&state[1].fxsave)[i]);
+            }
+        }
+
         if ( memcmp(&state[0].ctxt, &state[1].ctxt, sizeof(state[0].ctxt)) )
         {
             printf("ctxt differs!\n");
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 12/12] fuzz/x86_emulate: Add an option to limit the number of instructions executed
  2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
                   ` (9 preceding siblings ...)
  2017-10-11 17:52 ` [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
@ 2017-10-11 17:52 ` George Dunlap
  10 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-11 17:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

AFL considers a testcase to be a useful addition not only if there are
tuples exercised by that testcase which were not exercised otherwise,
but also if the *number* of times an individual tuple is exercised
changes significantly; in particular, if the number of the highest
non-zero bit changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).

One simple way to increase these stats it to execute the same (or
similar) instructions multiple times: If executing a given instruction
once hits a particular tuple 2 times, executing it twice will hit the
tuple 4 times, four times will hit the tuple 8 times, and so on.  All
of these will look different to AFL, and so it is likely that many of
the "unique test cases" will simply be the same instruction repeated
powers of 2 times until the tuple counts max out (at 128).

It is unlikely that executing a single instruction more than a handful
of times will generate any state we actually care about; but such long
testcases take exponentially longer to fuzz: the fuzzer spends more
time flipping bits looking for meaningful changes, and each execution
takes longer because it is doing more things.  So long paths which add
nothing to the actual code coverage but effectively "distract" the
fuzzer, making it less effective.

Experiments have shown that not allowing infinite number of
instruction retries for the old (non-compact) format does indeed speed
up and increase code coverage.  However, it has also shown that on the
new, more compact format, having no instruction limit causes the highest
throughput in code coverage.

So leave the option in, but have it default to 0 (no limit).

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v3:
- Change opt_instruction_limit to unsigned, default to UINT_MAX
- Simplify limit checking (now that the actual variable itself will never be 0)
- Change counter to unsigned
- Update changelog to try to be a bit more clear


CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 11 ++++++++++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   |  5 ++++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.h   |  1 +
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 0489034642..375b3333ad 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -1,4 +1,5 @@
 #include <assert.h>
+#include <limits.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -24,11 +25,13 @@ int main(int argc, char **argv)
             OPT_MIN_SIZE,
             OPT_COMPACT,
             OPT_RERUN,
+            OPT_INSTRUCTION_LIMIT,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
             { "compact", required_argument, NULL, OPT_COMPACT },
             { "rerun", no_argument, NULL, OPT_RERUN },
+            { "instruction-limit", required_argument, NULL, OPT_INSTRUCTION_LIMIT },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -51,8 +54,14 @@ int main(int argc, char **argv)
             opt_rerun = true;
             break;
 
+        case OPT_INSTRUCTION_LIMIT:
+            opt_instruction_limit = atoi(optarg);
+            if ( !opt_instruction_limit )
+                opt_instruction_limit = UINT_MAX;
+            break;
+
         case '?':
-            printf("Usage: %s [--compact=0|1] [--rerun] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] [--rerun] [--instruction-limit=N] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 881c4d03c1..d2198e71c9 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -984,10 +984,13 @@ static void setup_fuzz_state(struct fuzz_state *state, const void *data_p, size_
     state->data_num = size;
 }
 
+unsigned int opt_instruction_limit = UINT_MAX;
+
 static int runtest(struct fuzz_state *state) {
     int rc;
 
     struct x86_emulate_ctxt *ctxt = &state->ctxt;
+    unsigned int icount = 0;
     
     state->ops = all_fuzzer_ops;
 
@@ -1011,7 +1014,7 @@ static int runtest(struct fuzz_state *state) {
 
         rc = x86_emulate(ctxt, &state->ops);
         printf("Emulation result: %d\n", rc);
-    } while ( rc == X86EMUL_OKAY );
+    } while ( rc == X86EMUL_OKAY && ++icount < opt_instruction_limit );
 
     save_fpu_state(state->fxsave);
     
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
index 4863bf2166..746e1b542d 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.h
@@ -7,6 +7,7 @@ extern unsigned int fuzz_minimal_input_size(void);
 
 extern bool opt_compact;
 extern bool opt_rerun;
+extern unsigned int opt_instruction_limit;
 
 #define INPUT_SIZE  4096
 
-- 
2.14.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 07/12] fuzz/x86_emulate: Move definitions into a header
  2017-10-11 17:52 ` [PATCH v4 07/12] fuzz/x86_emulate: Move definitions into a header George Dunlap
@ 2017-10-12  9:03   ` Wei Liu
  0 siblings, 0 replies; 22+ messages in thread
From: Wei Liu @ 2017-10-12  9:03 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Wed, Oct 11, 2017 at 06:52:38PM +0100, George Dunlap wrote:
> Move fuzz-emul.c function prototypes into a header.  Also share the
> definition of the input size (rather than hard-coding it in
> fuzz-emul.c).
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state
  2017-10-11 17:52 ` [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
@ 2017-10-12 15:16   ` Jan Beulich
  2017-10-13  9:22     ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2017-10-12 15:16 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -22,34 +22,31 @@
>  
>  #define SEG_NUM x86_seg_none
>  
> -/* Layout of data expected as fuzzing input. */
> -struct fuzz_corpus
> +/*
> + * State of the fuzzing harness and emulated cpu.  Calculated
> + * initially from the input corpus, and later mutated by the emulation
> + * callbacks (and the emulator itself, in the case of regs).
> + */
> +struct fuzz_state
>  {
> +    /* Emulated CPU state */
> +    unsigned long options;
>      unsigned long cr[5];
>      uint64_t msr[MSR_INDEX_MAX];
> -    struct cpu_user_regs regs;
>      struct segment_register segments[SEG_NUM];
> -    unsigned long options;
> -    unsigned char data[INPUT_SIZE];
> -} input;
> -#define DATA_OFFSET offsetof(struct fuzz_corpus, data)
> +    struct cpu_user_regs regs;
>  
> -/*
> - * Internal state of the fuzzing harness.  Calculated initially from the input
> - * corpus, and later mutates by the emulation callbacks.
> - */
> -struct fuzz_state
> -{
>      /* Fuzzer's input data. */
> -    struct fuzz_corpus *corpus;
> +#define DATA_OFFSET offsetof(struct fuzz_state, corpus)
> +    const unsigned char * corpus;

Stray blank after *. Also any reason this can't be uint8_t,
matching LLVMFuzzerTestOneInput()'s parameter and making
it possible to avoid the cast you currently use on that
assignment?

> @@ -646,11 +634,20 @@ static void set_sizes(struct x86_emulate_ctxt *ctxt)
>          ctxt->addr_size = ctxt->sp_size = 64;
>      else
>      {
> -        ctxt->addr_size = c->segments[x86_seg_cs].db ? 32 : 16;
> -        ctxt->sp_size   = c->segments[x86_seg_ss].db ? 32 : 16;
> +        ctxt->addr_size = s->segments[x86_seg_cs].db ? 32 : 16;
> +        ctxt->sp_size   = s->segments[x86_seg_ss].db ? 32 : 16;
>      }
>  }
>  
> +static void setup_state(struct x86_emulate_ctxt *ctxt)
> +{
> +    struct fuzz_state *s = ctxt->data;
> +
> +    /* Fuzz all of the emulated state in one go */
> +    if (!input_read(s, s, DATA_OFFSET))

Missing blanks.

> @@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
>  static void sanitize_input(struct x86_emulate_ctxt *ctxt)
>  {
>      struct fuzz_state *s = ctxt->data;
> -    struct fuzz_corpus *c = s->corpus;
> -    struct cpu_user_regs *regs = &c->regs;
> -    unsigned long bitmap = c->options;
> +    struct cpu_user_regs *regs = ctxt->regs;
> +    unsigned long bitmap = s->options;
>  
>      /* Some hooks can't be disabled. */
> -    c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
> +    s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));

Mind adding the missing blanks here while you touch this?

> @@ -834,10 +826,10 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
>          return 1;
>      }
>  
> -    memcpy(&input, data_p, size);
> +    state.corpus = (void*)data_p;

If for any reason the suggested type change can't or shouldn't be
done (and hence the cast needs to stay), then please add a blank
before * and don't cast away const-ness.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability
  2017-10-11 17:52 ` [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
@ 2017-10-12 15:24   ` Jan Beulich
  2017-10-13  9:43     ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2017-10-12 15:24 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
> @@ -884,20 +891,146 @@ int LLVMFuzzerInitialize(int *argc, char ***argv)
>      return 0;
>  }
>  
> -int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
> +static void setup_fuzz_state(struct fuzz_state *state, const void *data_p, size_t size)
>  {
> -    struct fuzz_state state = {
> -        .ops = all_fuzzer_ops,
> -    };
> -    struct x86_emulate_ctxt ctxt = {
> -        .data = &state,
> -        .regs = &state.regs,
> -        .addr_size = 8 * sizeof(void *),
> -        .sp_size = 8 * sizeof(void *),
> -    };
> +    memset(state, 0, sizeof(*state));
> +    state->corpus = data_p;
> +    state->data_num = size;
> +}
> +
> +static int runtest(struct fuzz_state *state) {
>      int rc;
>  
> -    if ( size <= fuzz_minimal_input_size() )
> +    struct x86_emulate_ctxt *ctxt = &state->ctxt;

Please don't leave a blank line between declarations.

> +static void compare_states(struct fuzz_state state[2])
> +{
> +    /* First zero any "internal" pointers */
> +    state[0].corpus = state[1].corpus = NULL;
> +    state[0].ctxt.data = state[1].ctxt.data = NULL;
> +    state[0].ctxt.regs = state[1].ctxt.regs = NULL;
> +
> +    if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) )
> +    {
> +        unsigned int i;
> +
> +        printf("State mismatch\n");
> +
> +        for ( i = 0; i < ARRAY_SIZE(state[0].cr); i++ )
> +            if ( state[0].cr[i] != state[1].cr[i] )
> +                printf("cr[%u]: %lx != %lx\n",
> +                       i, state[0].cr[i], state[1].cr[i]);
> +        
> +        for ( i = 0; i < ARRAY_SIZE(state[0].msr); i++ )
> +            if ( state[0].msr[i] != state[1].msr[i] )
> +                printf("msr[%u]: %lx != %lx\n",
> +                       i, state[0].msr[i], state[1].msr[i]);
> +        
> +        for ( i = 0; i < ARRAY_SIZE(state[0].segments); i++ )
> +            if ( memcmp(&state[0].segments[i], &state[1].segments[i],
> +                        sizeof(state[0].segments[0])) )
> +                printf("segments[%u]: [%x:%x:%x:%lx] != [%x:%x:%x:%lx]!\n", i,
> +                       (unsigned)state[0].segments[i].sel,
> +                       (unsigned)state[0].segments[i].attr,
> +                       state[0].segments[i].limit,
> +                       state[0].segments[i].base,
> +                       (unsigned)state[1].segments[i].sel,
> +                       (unsigned)state[1].segments[i].attr,
> +                       state[1].segments[i].limit,
> +                       state[1].segments[i].base);
> +
> +        if ( state[0].data_num != state[1].data_num )
> +            printf("data_num: %lx !=  %lx\n", state[0].data_num,
> +                   state[1].data_num);
> +        if ( state[0].data_index != state[1].data_index )
> +            printf("data_index: %lx !=  %lx\n", state[0].data_index,
> +                   state[1].data_index);
> +
> +        if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) )
> +        {
> +            printf("registers differ!\n");
> +            /* Print If Not Equal */
> +#define PRINT_NE(elem)\
> +            if ( state[0].elem != state[1].elem ) \
> +                printf(#elem " differ: %lx != %lx\n", \
> +                       (unsigned long)state[0].elem, \
> +                       (unsigned long)state[1].elem)
> +            PRINT_NE(regs.r15);
> +            PRINT_NE(regs.r14);
> +            PRINT_NE(regs.r13);
> +            PRINT_NE(regs.r12);
> +            PRINT_NE(regs.rbp);
> +            PRINT_NE(regs.rbx);
> +            PRINT_NE(regs.r10);
> +            PRINT_NE(regs.r11);
> +            PRINT_NE(regs.r9);
> +            PRINT_NE(regs.r8);
> +            PRINT_NE(regs.rax);
> +            PRINT_NE(regs.rcx);
> +            PRINT_NE(regs.rdx);
> +            PRINT_NE(regs.rsi);
> +            PRINT_NE(regs.rdi);

Aren't these register fields all of the same type? If so, why do you
need to casts to unsigned long in the macro?

Additionally iirc Andrew had asked for eflags (and perhaps also
selector register values) to be printed separately for ease of
recognition - I support this request.

> +            for ( i = offsetof(struct cpu_user_regs, error_code) / sizeof(unsigned);
> +                  i < sizeof(state[1].regs)/sizeof(unsigned); i++ )

Blanks around binary operators please (also elsewhere).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state
  2017-10-11 17:52 ` [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
@ 2017-10-12 15:38   ` Jan Beulich
  2017-10-13 10:39     ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2017-10-12 15:38 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
> The Intel manual claims that, "If [certain CPUID bits] are set, the
> processor deprecates FCS and FDS, and the field is saved as 0000h";
> but experimentally it would be more accurate to say, "the field is
> occasionally saved as 0000h".  This causes the --rerun checking to
> trip non-deterministically.  Sanitize them to zero.

I think we've meanwhile settled on the field being saved as zero
being a side effect of using 32-bit fxsave plus a context switch in
the OS kernel.

> @@ -594,6 +595,75 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>  };
>  #undef SET
>  
> +/*
> + * This funciton will read or write fxsave to the fpu.  When writing,
> + * it 'sanitizes' the state: It will mask off the appropriate bits in
> + * the mxcsr, 'restore' the state to the fpu, then 'save' it again so
> + * that the data in fxsave reflects what's actually in the FPU.
> + *
> + * TODO: Extend state beyond just FPU (ymm registers, &c)
> + */
> +static void _set_fpu_state(char *fxsave, bool write)
> +{
> +    if ( cpu_has_fxsr )
> +    {
> +        static union __attribute__((__aligned__(16))) {
> +            char x[512];
> +            struct {
> +                uint16_t cw, sw;
> +                uint8_t  tw, _rsvd1;
> +                uint16_t op;
> +                uint32_t ip;
> +                uint16_t cs, _rsvd2;
> +                uint32_t dp;
> +                uint16_t ds, _rsvd3;
> +                uint32_t mxcsr;
> +                uint32_t mxcsr_mask;
> +                /* ... */
> +            };
> +        } *fxs;
> +
> +        fxs = (typeof(fxs))fxsave;
> +
> +        if ( write )
> +        {
> +            /* 
> +             * Clear reserved bits to make sure we don't get any
> +             * exceptions
> +             */
> +            fxs->mxcsr &= mxcsr_mask;
> +
> +            /*
> +             * The Intel manual says that on newer models CS/DS are
> +             * deprecated and that these fields "are saved as 0000h".
> +             * Experimentally, however, at least on my test box,
> +             * whether this saved as 0000h or as the previously
> +             * written value is random; meaning that when run with
> +             * --rerun, we occasionally detect a "state mismatch" in these
> +             * bytes.  Instead, simply sanitize them to zero.
> +             *
> +             * TODO Check CPUID as specified in the manual before
> +             * clearing
> +             */
> +            fxs->cs = fxs->ds = 0;

Shouldn't be needed anymore with ...

> +            asm volatile( "fxrstor %0" :: "m" (*fxs) );

rex64 (or fxrstor64) used here and ...

> +        }
> +
> +        asm volatile( "fxsave %0" : "=m" (*fxs) );

... here (of course the alternative here then is fxsave64).

Also please add blanks before the opening parentheses.

> @@ -732,6 +806,18 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>              printf("Setting cpu_user_regs offset %x\n", offset);
>              continue;
>          }
> +        offset -= sizeof(struct cpu_user_regs);
> +
> +        /* Fuzz fxsave state */
> +        if ( offset < sizeof(s->fxsave) / 4 )

You've switched to sizeof() here but ...

> +        {
> +            /* 32-bit size is arbitrary; see comment above */
> +            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
> +                return;
> +            printf("Setting fxsave offset %x\n", offset * 4);
> +            continue;
> +        }
> +        offset -= 128;

... not here.

> @@ -1008,6 +1098,16 @@ static void compare_states(struct fuzz_state state[2])
>          if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
>              printf("ops differ!\n");
>  
> +        if ( memcmp(&state[0].fxsave, &state[1].fxsave, sizeof(state[0].fxsave)) )
> +        {
> +            printf("fxsave differs!\n");
> +            for ( i = 0;  i < sizeof(state[0].fxsave)/sizeof(unsigned); i++ )

Blanks around / again please.

> +            {
> +                printf("[%04lu] %08x %08x\n",

I think I've indicated before that I consider leading zeros on decimal
numbers misleading. Could I talk you into using %4lu instead (or
really %4zu, considering the expression type) in places like this one
(i.e. also in the earlier patch, where I notice only now the l -> z
conversion wasn't done consistently either)?

> +                        i * sizeof(unsigned), ((unsigned *)&state[0].fxsave)[i], ((unsigned *)&state[1].fxsave)[i]);

Long line.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state
  2017-10-12 15:16   ` Jan Beulich
@ 2017-10-13  9:22     ` George Dunlap
  2017-10-13  9:54       ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-10-13  9:22 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

On 10/12/2017 04:16 PM, Jan Beulich wrote:
>>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -22,34 +22,31 @@
>>  
>>  #define SEG_NUM x86_seg_none
>>  
>> -/* Layout of data expected as fuzzing input. */
>> -struct fuzz_corpus
>> +/*
>> + * State of the fuzzing harness and emulated cpu.  Calculated
>> + * initially from the input corpus, and later mutated by the emulation
>> + * callbacks (and the emulator itself, in the case of regs).
>> + */
>> +struct fuzz_state
>>  {
>> +    /* Emulated CPU state */
>> +    unsigned long options;
>>      unsigned long cr[5];
>>      uint64_t msr[MSR_INDEX_MAX];
>> -    struct cpu_user_regs regs;
>>      struct segment_register segments[SEG_NUM];
>> -    unsigned long options;
>> -    unsigned char data[INPUT_SIZE];
>> -} input;
>> -#define DATA_OFFSET offsetof(struct fuzz_corpus, data)
>> +    struct cpu_user_regs regs;
>>  
>> -/*
>> - * Internal state of the fuzzing harness.  Calculated initially from the input
>> - * corpus, and later mutates by the emulation callbacks.
>> - */
>> -struct fuzz_state
>> -{
>>      /* Fuzzer's input data. */
>> -    struct fuzz_corpus *corpus;
>> +#define DATA_OFFSET offsetof(struct fuzz_state, corpus)
>> +    const unsigned char * corpus;
> 
> Stray blank after *. Also any reason this can't be uint8_t,
> matching LLVMFuzzerTestOneInput()'s parameter and making
> it possible to avoid the cast you currently use on that
> assignment?

For some reason I thought this would make things uglier; but it actually
works pretty well.

>> @@ -646,11 +634,20 @@ static void set_sizes(struct x86_emulate_ctxt *ctxt)
>>          ctxt->addr_size = ctxt->sp_size = 64;
>>      else
>>      {
>> -        ctxt->addr_size = c->segments[x86_seg_cs].db ? 32 : 16;
>> -        ctxt->sp_size   = c->segments[x86_seg_ss].db ? 32 : 16;
>> +        ctxt->addr_size = s->segments[x86_seg_cs].db ? 32 : 16;
>> +        ctxt->sp_size   = s->segments[x86_seg_ss].db ? 32 : 16;
>>      }
>>  }
>>  
>> +static void setup_state(struct x86_emulate_ctxt *ctxt)
>> +{
>> +    struct fuzz_state *s = ctxt->data;
>> +
>> +    /* Fuzz all of the emulated state in one go */
>> +    if (!input_read(s, s, DATA_OFFSET))
> 
> Missing blanks.

Ack

> 
>> @@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
>>  static void sanitize_input(struct x86_emulate_ctxt *ctxt)
>>  {
>>      struct fuzz_state *s = ctxt->data;
>> -    struct fuzz_corpus *c = s->corpus;
>> -    struct cpu_user_regs *regs = &c->regs;
>> -    unsigned long bitmap = c->options;
>> +    struct cpu_user_regs *regs = ctxt->regs;
>> +    unsigned long bitmap = s->options;
>>  
>>      /* Some hooks can't be disabled. */
>> -    c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>> +    s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
> 
> Mind adding the missing blanks here while you touch this?

Like this?

    s->options &= ~((1<<HOOK_read) | (1<<HOOK_insn_fetch));

Thanks,
 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability
  2017-10-12 15:24   ` Jan Beulich
@ 2017-10-13  9:43     ` George Dunlap
  2017-10-13  9:56       ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: George Dunlap @ 2017-10-13  9:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

On 10/12/2017 04:24 PM, Jan Beulich wrote:
>>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
>> @@ -884,20 +891,146 @@ int LLVMFuzzerInitialize(int *argc, char ***argv)
>>      return 0;
>>  }
>>  
>> -int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
>> +static void setup_fuzz_state(struct fuzz_state *state, const void *data_p, size_t size)
>>  {
>> -    struct fuzz_state state = {
>> -        .ops = all_fuzzer_ops,
>> -    };
>> -    struct x86_emulate_ctxt ctxt = {
>> -        .data = &state,
>> -        .regs = &state.regs,
>> -        .addr_size = 8 * sizeof(void *),
>> -        .sp_size = 8 * sizeof(void *),
>> -    };
>> +    memset(state, 0, sizeof(*state));
>> +    state->corpus = data_p;
>> +    state->data_num = size;
>> +}
>> +
>> +static int runtest(struct fuzz_state *state) {
>>      int rc;
>>  
>> -    if ( size <= fuzz_minimal_input_size() )
>> +    struct x86_emulate_ctxt *ctxt = &state->ctxt;
> 
> Please don't leave a blank line between declarations.
> 
>> +static void compare_states(struct fuzz_state state[2])
>> +{
>> +    /* First zero any "internal" pointers */
>> +    state[0].corpus = state[1].corpus = NULL;
>> +    state[0].ctxt.data = state[1].ctxt.data = NULL;
>> +    state[0].ctxt.regs = state[1].ctxt.regs = NULL;
>> +
>> +    if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) )
>> +    {
>> +        unsigned int i;
>> +
>> +        printf("State mismatch\n");
>> +
>> +        for ( i = 0; i < ARRAY_SIZE(state[0].cr); i++ )
>> +            if ( state[0].cr[i] != state[1].cr[i] )
>> +                printf("cr[%u]: %lx != %lx\n",
>> +                       i, state[0].cr[i], state[1].cr[i]);
>> +        
>> +        for ( i = 0; i < ARRAY_SIZE(state[0].msr); i++ )
>> +            if ( state[0].msr[i] != state[1].msr[i] )
>> +                printf("msr[%u]: %lx != %lx\n",
>> +                       i, state[0].msr[i], state[1].msr[i]);
>> +        
>> +        for ( i = 0; i < ARRAY_SIZE(state[0].segments); i++ )
>> +            if ( memcmp(&state[0].segments[i], &state[1].segments[i],
>> +                        sizeof(state[0].segments[0])) )
>> +                printf("segments[%u]: [%x:%x:%x:%lx] != [%x:%x:%x:%lx]!\n", i,
>> +                       (unsigned)state[0].segments[i].sel,
>> +                       (unsigned)state[0].segments[i].attr,
>> +                       state[0].segments[i].limit,
>> +                       state[0].segments[i].base,
>> +                       (unsigned)state[1].segments[i].sel,
>> +                       (unsigned)state[1].segments[i].attr,
>> +                       state[1].segments[i].limit,
>> +                       state[1].segments[i].base);
>> +
>> +        if ( state[0].data_num != state[1].data_num )
>> +            printf("data_num: %lx !=  %lx\n", state[0].data_num,
>> +                   state[1].data_num);
>> +        if ( state[0].data_index != state[1].data_index )
>> +            printf("data_index: %lx !=  %lx\n", state[0].data_index,
>> +                   state[1].data_index);
>> +
>> +        if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) )
>> +        {
>> +            printf("registers differ!\n");
>> +            /* Print If Not Equal */
>> +#define PRINT_NE(elem)\
>> +            if ( state[0].elem != state[1].elem ) \
>> +                printf(#elem " differ: %lx != %lx\n", \
>> +                       (unsigned long)state[0].elem, \
>> +                       (unsigned long)state[1].elem)
>> +            PRINT_NE(regs.r15);
>> +            PRINT_NE(regs.r14);
>> +            PRINT_NE(regs.r13);
>> +            PRINT_NE(regs.r12);
>> +            PRINT_NE(regs.rbp);
>> +            PRINT_NE(regs.rbx);
>> +            PRINT_NE(regs.r10);
>> +            PRINT_NE(regs.r11);
>> +            PRINT_NE(regs.r9);
>> +            PRINT_NE(regs.r8);
>> +            PRINT_NE(regs.rax);
>> +            PRINT_NE(regs.rcx);
>> +            PRINT_NE(regs.rdx);
>> +            PRINT_NE(regs.rsi);
>> +            PRINT_NE(regs.rdi);
> 
> Aren't these register fields all of the same type? If so, why do you
> need to casts to unsigned long in the macro?

As it happens, they're all the same size; when I wrote the macro it was
designed such that the same macro could be used for all the elements
regardless of what size they were.  Since there's no time pressure,
would you rather I add the segment registers (and leave the cast), or
only add rflags (and remove the cast)?

> 
>> +            for ( i = offsetof(struct cpu_user_regs, error_code) / sizeof(unsigned);
>> +                  i < sizeof(state[1].regs)/sizeof(unsigned); i++ )
> 
> Blanks around binary operators please (also elsewhere).

Ack

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state
  2017-10-13  9:22     ` George Dunlap
@ 2017-10-13  9:54       ` Jan Beulich
  2017-10-13  9:55         ` George Dunlap
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2017-10-13  9:54 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

>>> On 13.10.17 at 11:22, <george.dunlap@citrix.com> wrote:
> On 10/12/2017 04:16 PM, Jan Beulich wrote:
>>>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
>>> @@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
>>>  static void sanitize_input(struct x86_emulate_ctxt *ctxt)
>>>  {
>>>      struct fuzz_state *s = ctxt->data;
>>> -    struct fuzz_corpus *c = s->corpus;
>>> -    struct cpu_user_regs *regs = &c->regs;
>>> -    unsigned long bitmap = c->options;
>>> +    struct cpu_user_regs *regs = ctxt->regs;
>>> +    unsigned long bitmap = s->options;
>>>  
>>>      /* Some hooks can't be disabled. */
>>> -    c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>>> +    s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>> 
>> Mind adding the missing blanks here while you touch this?
> 
> Like this?
> 
>     s->options &= ~((1<<HOOK_read) | (1<<HOOK_insn_fetch));

Even farther (at the same time adding the missing number suffixes):

    s->options &= ~((1UL << HOOK_read) | (1UL << HOOK_insn_fetch));

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state
  2017-10-13  9:54       ` Jan Beulich
@ 2017-10-13  9:55         ` George Dunlap
  0 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-13  9:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

On 10/13/2017 10:54 AM, Jan Beulich wrote:
>>>> On 13.10.17 at 11:22, <george.dunlap@citrix.com> wrote:
>> On 10/12/2017 04:16 PM, Jan Beulich wrote:
>>>>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
>>>> @@ -761,12 +757,11 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
>>>>  static void sanitize_input(struct x86_emulate_ctxt *ctxt)
>>>>  {
>>>>      struct fuzz_state *s = ctxt->data;
>>>> -    struct fuzz_corpus *c = s->corpus;
>>>> -    struct cpu_user_regs *regs = &c->regs;
>>>> -    unsigned long bitmap = c->options;
>>>> +    struct cpu_user_regs *regs = ctxt->regs;
>>>> +    unsigned long bitmap = s->options;
>>>>  
>>>>      /* Some hooks can't be disabled. */
>>>> -    c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>>>> +    s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
>>>
>>> Mind adding the missing blanks here while you touch this?
>>
>> Like this?
>>
>>     s->options &= ~((1<<HOOK_read) | (1<<HOOK_insn_fetch));
> 
> Even farther (at the same time adding the missing number suffixes):
> 
>     s->options &= ~((1UL << HOOK_read) | (1UL << HOOK_insn_fetch));

Got it.  (I was actually trying to verify the 'snuggly' outer braces,
but missed spaces around the '<<'s).

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability
  2017-10-13  9:43     ` George Dunlap
@ 2017-10-13  9:56       ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2017-10-13  9:56 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

>>> On 13.10.17 at 11:43, <george.dunlap@citrix.com> wrote:
> On 10/12/2017 04:24 PM, Jan Beulich wrote:
>>>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
>>> +        if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) )
>>> +        {
>>> +            printf("registers differ!\n");
>>> +            /* Print If Not Equal */
>>> +#define PRINT_NE(elem)\
>>> +            if ( state[0].elem != state[1].elem ) \
>>> +                printf(#elem " differ: %lx != %lx\n", \
>>> +                       (unsigned long)state[0].elem, \
>>> +                       (unsigned long)state[1].elem)
>>> +            PRINT_NE(regs.r15);
>>> +            PRINT_NE(regs.r14);
>>> +            PRINT_NE(regs.r13);
>>> +            PRINT_NE(regs.r12);
>>> +            PRINT_NE(regs.rbp);
>>> +            PRINT_NE(regs.rbx);
>>> +            PRINT_NE(regs.r10);
>>> +            PRINT_NE(regs.r11);
>>> +            PRINT_NE(regs.r9);
>>> +            PRINT_NE(regs.r8);
>>> +            PRINT_NE(regs.rax);
>>> +            PRINT_NE(regs.rcx);
>>> +            PRINT_NE(regs.rdx);
>>> +            PRINT_NE(regs.rsi);
>>> +            PRINT_NE(regs.rdi);
>> 
>> Aren't these register fields all of the same type? If so, why do you
>> need to casts to unsigned long in the macro?
> 
> As it happens, they're all the same size; when I wrote the macro it was
> designed such that the same macro could be used for all the elements
> regardless of what size they were.  Since there's no time pressure,
> would you rather I add the segment registers (and leave the cast), or
> only add rflags (and remove the cast)?

Printing the selector registers separately is more important than
macro cosmetics, so if you end up using the macro for them, then
I'm of course fine with the cast left in place.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state
  2017-10-12 15:38   ` Jan Beulich
@ 2017-10-13 10:39     ` George Dunlap
  0 siblings, 0 replies; 22+ messages in thread
From: George Dunlap @ 2017-10-13 10:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Ian Jackson

On 10/12/2017 04:38 PM, Jan Beulich wrote:
>>>> On 11.10.17 at 19:52, <george.dunlap@citrix.com> wrote:
>> The Intel manual claims that, "If [certain CPUID bits] are set, the
>> processor deprecates FCS and FDS, and the field is saved as 0000h";
>> but experimentally it would be more accurate to say, "the field is
>> occasionally saved as 0000h".  This causes the --rerun checking to
>> trip non-deterministically.  Sanitize them to zero.
> 
> I think we've meanwhile settled on the field being saved as zero
> being a side effect of using 32-bit fxsave plus a context switch in
> the OS kernel.
> 
>> @@ -594,6 +595,75 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>  };
>>  #undef SET
>>  
>> +/*
>> + * This funciton will read or write fxsave to the fpu.  When writing,
>> + * it 'sanitizes' the state: It will mask off the appropriate bits in
>> + * the mxcsr, 'restore' the state to the fpu, then 'save' it again so
>> + * that the data in fxsave reflects what's actually in the FPU.
>> + *
>> + * TODO: Extend state beyond just FPU (ymm registers, &c)
>> + */
>> +static void _set_fpu_state(char *fxsave, bool write)
>> +{
>> +    if ( cpu_has_fxsr )
>> +    {
>> +        static union __attribute__((__aligned__(16))) {
>> +            char x[512];
>> +            struct {
>> +                uint16_t cw, sw;
>> +                uint8_t  tw, _rsvd1;
>> +                uint16_t op;
>> +                uint32_t ip;
>> +                uint16_t cs, _rsvd2;
>> +                uint32_t dp;
>> +                uint16_t ds, _rsvd3;
>> +                uint32_t mxcsr;
>> +                uint32_t mxcsr_mask;
>> +                /* ... */
>> +            };
>> +        } *fxs;
>> +
>> +        fxs = (typeof(fxs))fxsave;
>> +
>> +        if ( write )
>> +        {
>> +            /* 
>> +             * Clear reserved bits to make sure we don't get any
>> +             * exceptions
>> +             */
>> +            fxs->mxcsr &= mxcsr_mask;
>> +
>> +            /*
>> +             * The Intel manual says that on newer models CS/DS are
>> +             * deprecated and that these fields "are saved as 0000h".
>> +             * Experimentally, however, at least on my test box,
>> +             * whether this saved as 0000h or as the previously
>> +             * written value is random; meaning that when run with
>> +             * --rerun, we occasionally detect a "state mismatch" in these
>> +             * bytes.  Instead, simply sanitize them to zero.
>> +             *
>> +             * TODO Check CPUID as specified in the manual before
>> +             * clearing
>> +             */
>> +            fxs->cs = fxs->ds = 0;
> 
> Shouldn't be needed anymore with ...
> 
>> +            asm volatile( "fxrstor %0" :: "m" (*fxs) );
> 
> rex64 (or fxrstor64) used here and ...
> 
>> +        }
>> +
>> +        asm volatile( "fxsave %0" : "=m" (*fxs) );
> 
> ... here (of course the alternative here then is fxsave64).
> 
> Also please add blanks before the opening parentheses.
> 
>> @@ -732,6 +806,18 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>              printf("Setting cpu_user_regs offset %x\n", offset);
>>              continue;
>>          }
>> +        offset -= sizeof(struct cpu_user_regs);
>> +
>> +        /* Fuzz fxsave state */
>> +        if ( offset < sizeof(s->fxsave) / 4 )
> 
> You've switched to sizeof() here but ...
> 
>> +        {
>> +            /* 32-bit size is arbitrary; see comment above */
>> +            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
>> +                return;
>> +            printf("Setting fxsave offset %x\n", offset * 4);
>> +            continue;
>> +        }
>> +        offset -= 128;
> 
> ... not here.
> 
>> @@ -1008,6 +1098,16 @@ static void compare_states(struct fuzz_state state[2])
>>          if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
>>              printf("ops differ!\n");
>>  
>> +        if ( memcmp(&state[0].fxsave, &state[1].fxsave, sizeof(state[0].fxsave)) )
>> +        {
>> +            printf("fxsave differs!\n");
>> +            for ( i = 0;  i < sizeof(state[0].fxsave)/sizeof(unsigned); i++ )
> 
> Blanks around / again please.
> 
>> +            {
>> +                printf("[%04lu] %08x %08x\n",
> 
> I think I've indicated before that I consider leading zeros on decimal
> numbers misleading. 

Come to think of it I agree with you.

> Could I talk you into using %4lu instead (or
> really %4zu, considering the expression type) in places like this one
> (i.e. also in the earlier patch, where I notice only now the l -> z
> conversion wasn't done consistently either)?

/me looks up what %zu is supposed to do

Sure.

> 
>> +                        i * sizeof(unsigned), ((unsigned *)&state[0].fxsave)[i], ((unsigned *)&state[1].fxsave)[i]);
> 
> Long line.

Ack.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-10-13 10:39 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-11 17:52 [PATCH v4 01/12] fuzz/x86_emulate: Clear errors in the officially sanctioned way George Dunlap
2017-10-11 17:52 ` [PATCH v4 02/12] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
2017-10-11 17:52 ` [PATCH v4 03/12] fuzz/x86_emulate: Implement input_read() and input_avail() George Dunlap
2017-10-11 17:52 ` [PATCH v4 04/12] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
2017-10-11 17:52 ` [PATCH v4 05/12] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
2017-10-11 17:52 ` [PATCH v4 06/12] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
2017-10-11 17:52 ` [PATCH v4 07/12] fuzz/x86_emulate: Move definitions into a header George Dunlap
2017-10-12  9:03   ` Wei Liu
2017-10-11 17:52 ` [PATCH v4 08/12] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
2017-10-12 15:16   ` Jan Beulich
2017-10-13  9:22     ` George Dunlap
2017-10-13  9:54       ` Jan Beulich
2017-10-13  9:55         ` George Dunlap
2017-10-11 17:52 ` [PATCH v4 09/12] fuzz/x86_emulate: Make input more compact George Dunlap
2017-10-11 17:52 ` [PATCH v4 10/12] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
2017-10-12 15:24   ` Jan Beulich
2017-10-13  9:43     ` George Dunlap
2017-10-13  9:56       ` Jan Beulich
2017-10-11 17:52 ` [PATCH v4 11/12] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
2017-10-12 15:38   ` Jan Beulich
2017-10-13 10:39     ` George Dunlap
2017-10-11 17:52 ` [PATCH v4 12/12] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.