All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook
@ 2017-08-25 16:43 George Dunlap
  2017-08-25 16:43 ` [PATCH 02/14] x86emul/fuzz: add rudimentary limit checking George Dunlap
                   ` (13 more replies)
  0 siblings, 14 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

You don't need __AFL_INIT if you have __AFL_LOOP.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 154869336a..1a79ff228e 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -63,8 +63,6 @@ int main(int argc, char **argv)
         exit(-1);
 
 #ifdef __AFL_HAVE_MANUAL_CONTROL
-    __AFL_INIT();
-
     while ( __AFL_LOOP(1000) )
 #endif
     {
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/14] x86emul/fuzz: add rudimentary limit checking
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-08-25 16:43 ` [PATCH 03/14] fuzz/x86_emulate: Actually use cpu_regs input George Dunlap
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich, Andrew Cooper

From: Jan Beulich <jbeulich@suse.com>

fuzz_insn_fetch() is the only data access helper where it is possible
to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
incoming rIP untouched in the emulator itself. The check is needed here
as otherwise, after successfully fetching insn bytes, we may end up
zero-extending EIP soon after complete_insn, which collides with the
X86EMUL_EXCEPTION-conditional respective ASSERT() in
x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
complete_insn to be reached with rc set to other than X86EMUL_OKAY or
X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
exception handling for rep_* hooks"].)

Add assert()-s for all other (data) access routines, as effective
address generation in the emulator ought to guarantee in-range values.
For them to not trigger, several adjustments to the emulator's address
calculations are needed: While for DstBitBase it is really mandatory,
the specification allows for either behavior for two-part accesses.
Observed behavior on real hardware, however, is for such accesses to
silently wrap at the 2^^32 boundary in other than 64-bit mode, just
like they do at the 2^^64 boundary in 64-bit mode. While adding
truncate_ea() invocations there, also convert open coded instances of
it.

Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 32 ++++++++++++++++++++++---
 xen/arch/x86/x86_emulate/x86_emulate.c          | 22 +++++++++--------
 2 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index a2329f84a5..105145e9f9 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -139,7 +139,18 @@ static int fuzz_read(
     struct x86_emulate_ctxt *ctxt)
 {
     /* Reads expected for all user and system segments. */
-    assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
+    if ( is_x86_user_segment(seg) )
+        assert(ctxt->addr_size == 64 || !(offset >> 32));
+    else if ( seg == x86_seg_tr )
+        /*
+         * The TSS is special in that accesses below the segment base are
+         * possible, as the Interrupt Redirection Bitmap starts 32 bytes
+         * ahead of the I/O Bitmap, regardless of the value of the latter.
+         */
+        assert((long)offset < 0 ? (long)offset > -32 : !(offset >> 17));
+    else
+        assert(is_x86_system_segment(seg) &&
+               (ctxt->lma ? offset <= 0x10007 : !(offset >> 16)));
 
     return data_read(ctxt, seg, "read", p_data, bytes);
 }
@@ -162,6 +173,13 @@ static int fuzz_insn_fetch(
 {
     assert(seg == x86_seg_cs);
 
+    /* Minimal segment limit checking, until full one is being put in place. */
+    if ( ctxt->addr_size < 64 && (offset >> 32) )
+    {
+        x86_emul_hw_exception(13, 0, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
     /*
      * Zero-length instruction fetches are made at the destination of jumps,
      * to perform segmentation checks.  No data needs returning.
@@ -232,6 +250,7 @@ static int fuzz_rep_ins(
     struct x86_emulate_ctxt *ctxt)
 {
     assert(dst_seg == x86_seg_es);
+    assert(ctxt->addr_size == 64 || !(dst_offset >> 32));
 
     return _fuzz_rep_read(ctxt, "rep_ins", reps);
 }
@@ -247,6 +266,7 @@ static int fuzz_rep_movs(
 {
     assert(is_x86_user_segment(src_seg));
     assert(dst_seg == x86_seg_es);
+    assert(ctxt->addr_size == 64 || !((src_offset | dst_offset) >> 32));
 
     return _fuzz_rep_read(ctxt, "rep_movs", reps);
 }
@@ -260,6 +280,7 @@ static int fuzz_rep_outs(
     struct x86_emulate_ctxt *ctxt)
 {
     assert(is_x86_user_segment(src_seg));
+    assert(ctxt->addr_size == 64 || !(src_offset >> 32));
 
     return _fuzz_rep_write(ctxt, "rep_outs", reps);
 }
@@ -277,6 +298,7 @@ static int fuzz_rep_stos(
      * for CLZERO.
      */
     assert(is_x86_user_segment(seg));
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return _fuzz_rep_write(ctxt, "rep_stos", reps);
 }
@@ -290,6 +312,7 @@ static int fuzz_write(
 {
     /* Writes not expected for any system segments. */
     assert(is_x86_user_segment(seg));
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return maybe_fail(ctxt, "write", true);
 }
@@ -306,8 +329,10 @@ static int fuzz_cmpxchg(
      * Cmpxchg expected for user segments, and setting accessed/busy bits in
      * GDT/LDT enties, but not expected for any IDT or TR accesses.
      */
-    assert(is_x86_user_segment(seg) ||
-           seg == x86_seg_gdtr || seg == x86_seg_ldtr);
+    if ( is_x86_user_segment(seg) )
+        assert(ctxt->addr_size == 64 || !(offset >> 32));
+    else
+        assert((seg == x86_seg_gdtr || seg == x86_seg_ldtr) && !(offset >> 16));
 
     return maybe_fail(ctxt, "cmpxchg", true);
 }
@@ -319,6 +344,7 @@ static int fuzz_invlpg(
 {
     /* invlpg(), unlike all other hooks, may be called with x86_seg_none. */
     assert(is_x86_user_segment(seg) || seg == x86_seg_none);
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return maybe_fail(ctxt, "invlpg", false);
 }
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 2201852cb4..6bcf8ee93d 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1249,10 +1249,10 @@ static void __put_rep_prefix(
 
 /* Clip maximum repetitions so that the index register at most just wraps. */
 #define truncate_ea_and_reps(ea, reps, bytes_per_rep) ({                  \
-    unsigned long todo__, ea__ = truncate_word(ea, ad_bytes);             \
+    unsigned long todo__, ea__ = truncate_ea(ea);                         \
     if ( !(_regs.eflags & X86_EFLAGS_DF) )                                \
-        todo__ = truncate_word(-(ea), ad_bytes) / (bytes_per_rep);        \
-    else if ( truncate_word((ea) + (bytes_per_rep) - 1, ad_bytes) < ea__ )\
+        todo__ = truncate_ea(-ea__) / (bytes_per_rep);                    \
+    else if ( truncate_ea(ea__ + (bytes_per_rep) - 1) < ea__ )            \
         todo__ = 1;                                                       \
     else                                                                  \
         todo__ = ea__ / (bytes_per_rep) + 1;                              \
@@ -3128,6 +3128,7 @@ x86_emulate(
                     op_bytes + (((-src.val - 1) >> 3) & ~(op_bytes - 1L));
             else
                 ea.mem.off += (src.val >> 3) & ~(op_bytes - 1L);
+            ea.mem.off = truncate_ea(ea.mem.off);
         }
 
         /* Bit index always truncated to within range. */
@@ -3346,7 +3347,7 @@ x86_emulate(
         unsigned long src_val2;
         int lb, ub, idx;
         generate_exception_if(src.type != OP_MEM, EXC_UD);
-        if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
+        if ( (rc = read_ulong(src.mem.seg, truncate_ea(src.mem.off + op_bytes),
                               &src_val2, op_bytes, ctxt, ops)) )
             goto done;
         ub  = (op_bytes == 2) ? (int16_t)src_val2 : (int32_t)src_val2;
@@ -3897,7 +3898,7 @@ x86_emulate(
         seg = (b & 1) * 3; /* es = 0, ds = 3 */
     les:
         generate_exception_if(src.type != OP_MEM, EXC_UD);
-        if ( (rc = read_ulong(src.mem.seg, src.mem.off + src.bytes,
+        if ( (rc = read_ulong(src.mem.seg, truncate_ea(src.mem.off + src.bytes),
                               &dst.val, 2, ctxt, ops)) != X86EMUL_OKAY )
             goto done;
         ASSERT(is_x86_user_segment(seg));
@@ -4931,7 +4932,8 @@ x86_emulate(
         case 5: /* jmp (far, absolute indirect) */
             generate_exception_if(src.type != OP_MEM, EXC_UD);
 
-            if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
+            if ( (rc = read_ulong(src.mem.seg,
+                                  truncate_ea(src.mem.off + op_bytes),
                                   &imm2, 2, ctxt, ops)) )
                 goto done;
             imm1 = src.val;
@@ -5115,8 +5117,8 @@ x86_emulate(
             }
             if ( (rc = ops->write(ea.mem.seg, ea.mem.off, &sreg.limit,
                                   2, ctxt)) != X86EMUL_OKAY ||
-                 (rc = ops->write(ea.mem.seg, ea.mem.off + 2, &sreg.base,
-                                  op_bytes, ctxt)) != X86EMUL_OKAY )
+                 (rc = ops->write(ea.mem.seg, truncate_ea(ea.mem.off + 2),
+                                  &sreg.base, op_bytes, ctxt)) != X86EMUL_OKAY )
                 goto done;
             break;
         case 2: /* lgdt */
@@ -5125,9 +5127,9 @@ x86_emulate(
             generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->write_segment == NULL);
             memset(&sreg, 0, sizeof(sreg));
-            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
+            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off,
                                   &limit, 2, ctxt, ops)) ||
-                 (rc = read_ulong(ea.mem.seg, ea.mem.off+2,
+                 (rc = read_ulong(ea.mem.seg, truncate_ea(ea.mem.off + 2),
                                   &base, mode_64bit() ? 8 : 4, ctxt, ops)) )
                 goto done;
             generate_exception_if(!is_canonical_address(base), EXC_GP, 0);
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/14] fuzz/x86_emulate: Actually use cpu_regs input
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
  2017-08-25 16:43 ` [PATCH 02/14] x86emul/fuzz: add rudimentary limit checking George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 11:21   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 04/14] fuzz/x86_emulate: Add a better input size check George Dunlap
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Commit c07574b reorganized the way fuzzing was done, explicitly
creating a structure that the input data would be copied into.

Unfortunately, the cpu register state used by the emulator is on the
stack; it's cleared, but data is never copied into it.

If we're explicitly setting an entirely new cpu_regs struct for each
new input anyway, there's no need to have two copies around anymore;
just point to the one in the data structure.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---

This is a candidate for backporting to 4.9.

To test that this has an effect, revert the previous patch
("x86emul/fuzz: add rudimentary limit checking"): with this patch it
hits an ASSERT().

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 105145e9f9..48a879cc88 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -785,13 +785,12 @@ int LLVMFuzzerInitialize(int *argc, char ***argv)
 
 int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
 {
-    struct cpu_user_regs regs = {};
     struct fuzz_state state = {
         .ops = all_fuzzer_ops,
     };
     struct x86_emulate_ctxt ctxt = {
         .data = &state,
-        .regs = &regs,
+        .regs = &input.regs,
         .addr_size = 8 * sizeof(void *),
         .sp_size = 8 * sizeof(void *),
     };
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/14] fuzz/x86_emulate: Add a better input size check
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
  2017-08-25 16:43 ` [PATCH 02/14] x86emul/fuzz: add rudimentary limit checking George Dunlap
  2017-08-25 16:43 ` [PATCH 03/14] fuzz/x86_emulate: Actually use cpu_regs input George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-08-25 17:42   ` Andrew Cooper
  2017-09-15 11:39   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

For some reason the 'feof()' check for the file size isn't working in
llvm-clang-fast mode; the result is several kilobyte files rather than
the 4k limit files as we've requested.  This is bad in part because
AFL will spend time trying to "fuzz" bits of the input that are never
touched.

Add a new check: Offer to read INPUT_SIZE + 1; if we actually get that
many bytes, return an error.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 1a79ff228e..51e0183356 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -76,7 +76,7 @@ int main(int argc, char **argv)
             }
         }
 
-        size = fread(input, 1, INPUT_SIZE, fp);
+        size = fread(input, 1, INPUT_SIZE + 1, fp);
 
         if ( ferror(fp) )
         {
@@ -84,7 +84,7 @@ int main(int argc, char **argv)
             exit(-1);
         }
 
-        if ( !feof(fp) )
+        if ( !feof(fp) || size > INPUT_SIZE )
         {
             printf("Input too large\n");
             exit(-1);
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (2 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 04/14] fuzz/x86_emulate: Add a better input size check George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 11:41   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail() George Dunlap
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

- Print the symbolic name rather than the number
- Explicitly state when data_read() fails due to EOI

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 48a879cc88..7f9a369421 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -52,6 +52,14 @@ struct fuzz_state
     struct x86_emulate_ops ops;
 };
 
+char *x86emul_return_string[] = {
+    [X86EMUL_OKAY]="X86EMUL_OKAY",
+    [X86EMUL_UNHANDLEABLE]="X86EMUL_UNHANDLEABLE",
+    [X86EMUL_EXCEPTION]="X86EMUL_EXCEPTION",
+    [X86EMUL_RETRY]="X86EMUL_RETRY",
+    [X86EMUL_DONE]="X86EMUL_DONE",
+};
+
 /*
  * Randomly return success or failure when processing data.  If
  * `exception` is false, this function turns _EXCEPTION to _OKAY.
@@ -84,7 +92,7 @@ static int maybe_fail(struct x86_emulate_ctxt *ctxt,
     if ( rc == X86EMUL_EXCEPTION && !exception )
         rc = X86EMUL_OKAY;
 
-    printf("maybe_fail %s: %d\n", why, rc);
+    printf("maybe_fail %s: %s\n", why, x86emul_return_string[rc]);
 
     if ( rc == X86EMUL_EXCEPTION )
         /* Fake up a pagefault. */
@@ -113,6 +121,7 @@ static int data_read(struct x86_emulate_ctxt *ctxt,
             x86_emul_hw_exception(13, 0, ctxt);
 
         rc = X86EMUL_EXCEPTION;
+        printf("data_read %s: X86EMUL_EXCEPTION (end of input)\n", why);
     }
     else
         rc = maybe_fail(ctxt, why, true);
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail()
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (3 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-08-25 17:45   ` Andrew Cooper
  2017-08-25 16:43 ` [PATCH 07/14] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Rather than open-coding the "read" from the input file.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 31 ++++++++++++++++++-------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 7f9a369421..0f5ff0b265 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -52,6 +52,22 @@ struct fuzz_state
     struct x86_emulate_ops ops;
 };
 
+static inline int davail(struct fuzz_state *s, size_t size)
+{
+    return s->data_index + size < s->data_num;
+}
+
+static inline int dread(struct fuzz_state *s, void *dst, size_t size)
+{
+    if ( !davail(s, size) )
+        return 0;
+
+    memcpy(dst, &s->corpus->data[s->data_index], size);
+    s->data_index += size;
+
+    return 1;
+}
+
 char *x86emul_return_string[] = {
     [X86EMUL_OKAY]="X86EMUL_OKAY",
     [X86EMUL_UNHANDLEABLE]="X86EMUL_UNHANDLEABLE",
@@ -68,10 +84,10 @@ static int maybe_fail(struct x86_emulate_ctxt *ctxt,
                       const char *why, bool exception)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
+    unsigned char c;
     int rc;
 
-    if ( s->data_index >= s->data_num )
+    if ( !dread(s, &c, sizeof(c)) )
         rc = X86EMUL_EXCEPTION;
     else
     {
@@ -80,13 +96,12 @@ static int maybe_fail(struct x86_emulate_ctxt *ctxt,
          * 25% unhandlable
          * 25% exception
          */
-        if ( c->data[s->data_index] > 0xc0 )
+        if ( c > 0xc0 )
             rc = X86EMUL_EXCEPTION;
-        else if ( c->data[s->data_index] > 0x80 )
+        else if ( c > 0x80 )
             rc = X86EMUL_UNHANDLEABLE;
         else
             rc = X86EMUL_OKAY;
-        s->data_index++;
     }
 
     if ( rc == X86EMUL_EXCEPTION && !exception )
@@ -106,11 +121,10 @@ static int data_read(struct x86_emulate_ctxt *ctxt,
                      const char *why, void *dst, unsigned int bytes)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
     unsigned int i;
     int rc;
 
-    if ( s->data_index + bytes > s->data_num )
+    if ( !davail(s, bytes) )
     {
         /*
          * Fake up a segment limit violation.  System segment limit volations
@@ -128,8 +142,7 @@ static int data_read(struct x86_emulate_ctxt *ctxt,
 
     if ( rc == X86EMUL_OKAY )
     {
-        memcpy(dst, &c->data[s->data_index], bytes);
-        s->data_index += bytes;
+        dread(s, dst, bytes);
 
         printf("%s: ", why);
         for ( i = 0; i < bytes; i++ )
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/14] fuzz/x86_emulate: Rename the file containing the wrapper code
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (4 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail() George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 11:45   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

When generating coverage output, by default gcov generates output
filenames based only on the coverage file and the "leaf" source file,
not the full path.  As a result, it uses the same name for
x86_emulate.c and x86_emulate/x86_emulate.c, generally overwriting the
second (which we actually are about) with the first (which is just a
wrapper).

Rename the user-space wrapper helpers to x86_emulate_user.[ch], so
that it generates separate files.

There is actually an option to gcov, `--preserve-paths`, which will
cause the full path name to be included in the filename, properly
distinguishing between the two.  However, given that the user-space
wrapper doesn't actually do any emulation (and the poor state of gcov
documentation making it difficult to find the option in the first
place), it seems to make more sense to rename the file anyway.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---

NB: I discovered the `-p` option to gcov after writing this patch.
But I think the patch itself still makes sense.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/Makefile                 | 12 ++++++------
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c              |  2 +-
 tools/tests/x86_emulator/Makefile                            |  6 +++---
 tools/tests/x86_emulator/test_x86_emulator.c                 |  2 +-
 .../tests/x86_emulator/{x86_emulate.c => x86_emulate_user.c} |  2 +-
 .../tests/x86_emulator/{x86_emulate.h => x86_emulate_user.h} |  0
 6 files changed, 12 insertions(+), 12 deletions(-)
 rename tools/tests/x86_emulator/{x86_emulate.c => x86_emulate_user.c} (99%)
 rename tools/tests/x86_emulator/{x86_emulate.h => x86_emulate_user.h} (100%)

diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
index a3f6b2c754..10009dc08f 100644
--- a/tools/fuzz/x86_instruction_emulator/Makefile
+++ b/tools/fuzz/x86_instruction_emulator/Makefile
@@ -18,22 +18,22 @@ asm:
 
 asm/%: asm ;
 
-x86_emulate.c x86_emulate.h: %:
+x86_emulate_user.c x86_emulate_user.h: %:
 	[ -L $* ] || ln -sf $(XEN_ROOT)/tools/tests/x86_emulator/$*
 
 CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
 
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
-x86_emulate.h := x86_emulate.h x86_emulate/x86_emulate.h $(x86.h)
+x86_emulate.h := x86_emulate_user.h x86_emulate/x86_emulate.h $(x86.h)
 
-x86_emulate.o: x86_emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
+x86_emulate_user.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
 
 fuzz-emul.o: $(x86_emulate.h)
 
-x86-insn-fuzzer.a: fuzz-emul.o x86_emulate.o
+x86-insn-fuzzer.a: fuzz-emul.o x86_emulate_user.o
 	$(AR) rc $@ $^
 
-afl-harness: afl-harness.o fuzz-emul.o x86_emulate.o
+afl-harness: afl-harness.o fuzz-emul.o x86_emulate_user.o
 	$(CC) $(CFLAGS) $^ -o $@
 
 # Common targets
@@ -42,7 +42,7 @@ all: x86-insn-fuzz-all
 
 .PHONY: distclean
 distclean: clean
-	rm -f x86_emulate x86_emulate.c x86_emulate.h asm
+	rm -f x86_emulate x86_emulate_user.c x86_emulate_user.h asm
 
 .PHONY: clean
 clean:
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 0f5ff0b265..746e7989af 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -15,7 +15,7 @@
 #include <unistd.h>
 #include <xen/xen.h>
 
-#include "x86_emulate.h"
+#include "x86_emulate_user.h"
 
 #define MSR_INDEX_MAX 16
 
diff --git a/tools/tests/x86_emulator/Makefile b/tools/tests/x86_emulator/Makefile
index fd13ab53b1..888495a6a2 100644
--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -75,7 +75,7 @@ $(addsuffix .h,$(TESTCASES)): %.h: %.c testcase.mk Makefile
 $(addsuffix .c,$(SIMD)) $(addsuffix -avx.c,$(filter sse%,$(SIMD))):
 	ln -sf simd.c $@
 
-$(TARGET): x86_emulate.o test_x86_emulator.o
+$(TARGET): x86_emulate_user.o test_x86_emulator.o
 	$(HOSTCC) -o $@ $^
 
 .PHONY: clean
@@ -101,9 +101,9 @@ asm/%: asm ;
 HOSTCFLAGS += $(CFLAGS_xeninclude) -I.
 
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
-x86_emulate.h := x86_emulate.h x86_emulate/x86_emulate.h $(x86.h)
+x86_emulate.h := x86_emulate_user.h x86_emulate/x86_emulate.h $(x86.h)
 
-x86_emulate.o: x86_emulate.c x86_emulate/x86_emulate.c $(x86_emulate.h)
+x86_emulate_user.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
 	$(HOSTCC) $(HOSTCFLAGS) -D__XEN_TOOLS__ -c -g -o $@ $<
 
 test_x86_emulator.o: test_x86_emulator.c $(addsuffix .h,$(TESTCASES)) $(x86_emulate.h)
diff --git a/tools/tests/x86_emulator/test_x86_emulator.c b/tools/tests/x86_emulator/test_x86_emulator.c
index 76665abda5..0d8cba1ad5 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -3,7 +3,7 @@
 #include <stdio.h>
 #include <sys/mman.h>
 
-#include "x86_emulate.h"
+#include "x86_emulate_user.h"
 #include "blowfish.h"
 #include "sse.h"
 #include "sse2.h"
diff --git a/tools/tests/x86_emulator/x86_emulate.c b/tools/tests/x86_emulator/x86_emulate_user.c
similarity index 99%
rename from tools/tests/x86_emulator/x86_emulate.c
rename to tools/tests/x86_emulator/x86_emulate_user.c
index 79661d5c2b..adae6950c8 100644
--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86_emulate_user.c
@@ -1,4 +1,4 @@
-#include "x86_emulate.h"
+#include "x86_emulate_user.h"
 
 #include <sys/mman.h>
 
diff --git a/tools/tests/x86_emulator/x86_emulate.h b/tools/tests/x86_emulator/x86_emulate_user.h
similarity index 100%
rename from tools/tests/x86_emulator/x86_emulate.h
rename to tools/tests/x86_emulator/x86_emulate_user.h
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (5 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 07/14] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 12:55   ` Wei Liu
  2017-09-15 12:57   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
                   ` (6 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

...to generate a "normal" coverage-instrumented binary, suitable for
use with gcov or afl-cov.

This is slightly annoying because:

 - Every object file needs to have been instrumented to work
   effectively

 - You generally want to have both an afl-instrumented binary and a
   gcov-instrumented binary at the same time, but

 - gcov instrumentation and afl instrumentation are mutually exclusive

So when making the `afl-cov` target, generate a second set of object
files and a second binary with the `-cov` suffix.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 .gitignore                                   |  1 +
 tools/fuzz/README.afl                        | 14 ++++++++++++++
 tools/fuzz/x86_instruction_emulator/Makefile | 19 ++++++++++++++++++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 594ffd9a7f..66bceb3ebe 100644
--- a/.gitignore
+++ b/.gitignore
@@ -159,6 +159,7 @@ tools/fuzz/libelf/afl-libelf-fuzzer
 tools/fuzz/x86_instruction_emulator/asm
 tools/fuzz/x86_instruction_emulator/x86_emulate*
 tools/fuzz/x86_instruction_emulator/afl-harness
+tools/fuzz/x86_instruction_emulator/afl-harness-cov
 tools/helpers/_paths.h
 tools/helpers/init-xenstore-domain
 tools/helpers/xen-init-dom0
diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
index 4758de2490..0d955b2687 100644
--- a/tools/fuzz/README.afl
+++ b/tools/fuzz/README.afl
@@ -41,3 +41,17 @@ Use the x86 instruction emulator fuzzer as an example.
    $ $AFLPATH/afl-fuzz -t 1000 -i testcase_dir -o findings_dir -- ./afl-harness
 
 Please see AFL documentation for more information.
+
+# GENERATING COVERAGE INFORMATION
+
+To use afl-cov or gcov, you need a separate binary instrumented to
+generate coverage data.  To do this, use the target `afl-cov`:
+
+    $ make afl-cov #produces afl-harness-cov
+
+NOTE: Please also note that the coverage instrumentation hard-codes
+the absolute path for the instrumentation read and write files in the
+binary; so coverage data will always show up in the build directory no
+matter where you run the binary from.
+
+Please see afl-cov and/or gcov documentation for more information.
\ No newline at end of file
diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
index 10009dc08f..629e191029 100644
--- a/tools/fuzz/x86_instruction_emulator/Makefile
+++ b/tools/fuzz/x86_instruction_emulator/Makefile
@@ -23,19 +23,33 @@ x86_emulate_user.c x86_emulate_user.h: %:
 
 CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
 
+GCOV_FLAGS=--coverage
+
 x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
 x86_emulate.h := x86_emulate_user.h x86_emulate/x86_emulate.h $(x86.h)
 
 x86_emulate_user.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
 
+x86_emulate_user-cov.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
+	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) -o $@ x86_emulate_user.c
+
 fuzz-emul.o: $(x86_emulate.h)
 
+fuzz-emul-cov.o: fuzz-emul.c $(x86_emulate.h)
+	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) -o $@ fuzz-emul.c
+
+afl-harness-cov.o: afl-harness.c
+	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
+
 x86-insn-fuzzer.a: fuzz-emul.o x86_emulate_user.o
 	$(AR) rc $@ $^
 
 afl-harness: afl-harness.o fuzz-emul.o x86_emulate_user.o
 	$(CC) $(CFLAGS) $^ -o $@
 
+afl-harness-cov: afl-harness-cov.o fuzz-emul-cov.o x86_emulate_user-cov.o
+	$(CC) $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
+
 # Common targets
 .PHONY: all
 all: x86-insn-fuzz-all
@@ -46,7 +60,7 @@ distclean: clean
 
 .PHONY: clean
 clean:
-	rm -f *.a *.o .*.d afl-harness
+	rm -f *.a *.o .*.d afl-harness afl-harness-cov *.gcda *.gcno *.gcov
 
 .PHONY: install
 install: all
@@ -55,3 +69,6 @@ install: all
 
 .PHONY: afl
 afl: afl-harness
+
+.PHONY: afl-cov
+afl-cov: afl-harness-cov
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (6 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 13:07   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 10/14] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Finding aggregate coverage for a set of test files means running each
afl-generated test case through the harness.  At the moment, this is
done by re-executing afl-harness-cov with each input file.  When a
large number of test cases have been generated, this can take a
significant amonut of time; a recent test with 30k total files
generated by 4 parallel fuzzers took over 7 minutes.

The vast majority of this time is taken up with 'exec', however.
Since the harness is already designed to loop over multiple inputs for
llvm "persistent mode", just allow it to take a large number of inputs
on the same when *not* running in llvm "persistent mode"..  Then the
command can be efficiently executed like this:

  ls */queue/id* | xargs $path/afl-harness-cov

For the above-mentioned test on 30k files, the time to generate
coverage data was reduced from 7 minutes to under 30 seconds.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/README.afl                             |  7 +++++++
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 23 ++++++++++++++++-------
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
index 0d955b2687..e8c23d734c 100644
--- a/tools/fuzz/README.afl
+++ b/tools/fuzz/README.afl
@@ -49,6 +49,13 @@ generate coverage data.  To do this, use the target `afl-cov`:
 
     $ make afl-cov #produces afl-harness-cov
 
+In order to speed up the process of checking total coverage,
+`afl-harness-cov` can take several test inputs on its command-line;
+the speed-up effect should be similar to that of using afl-clang-fast.
+You can use xargs to do this most efficiently, like so:
+
+    $ ls queue/id* | xargs $path/afl-harness-cov
+
 NOTE: Please also note that the coverage instrumentation hard-codes
 the absolute path for the instrumentation read and write files in the
 binary; so coverage data will always show up in the build directory no
diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 51e0183356..79f8aec653 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -16,6 +16,8 @@ int main(int argc, char **argv)
 {
     size_t size;
     FILE *fp = NULL;
+    int count = 0;
+    int max;
 
     setbuf(stdin, NULL);
     setbuf(stdout, NULL);
@@ -42,8 +44,7 @@ int main(int argc, char **argv)
             break;
 
         case '?':
-        usage:
-            printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
@@ -54,21 +55,27 @@ int main(int argc, char **argv)
         }
     }
 
-    if ( optind == argc ) /* No positional parameters.  Use stdin. */
+    max = argc - optind;
+
+    if ( !max ) /* No positional parameters.  Use stdin. */
+    {
+        max = 1;
         fp = stdin;
-    else if ( optind != (argc - 1) )
-        goto usage;
+    }
 
     if ( LLVMFuzzerInitialize(&argc, &argv) )
         exit(-1);
 
 #ifdef __AFL_HAVE_MANUAL_CONTROL
     while ( __AFL_LOOP(1000) )
+#else
+    for( count = 0; count < max; count++ )
 #endif
     {
         if ( fp != stdin ) /* If not using stdin, open the provided file. */
         {
-            fp = fopen(argv[optind], "rb");
+            printf("Opening file %s\n", argv[optind]);
+            fp = fopen(argv[optind + count], "rb");
             if ( fp == NULL )
             {
                 perror("fopen");
@@ -87,7 +94,9 @@ int main(int argc, char **argv)
         if ( !feof(fp) || size > INPUT_SIZE )
         {
             printf("Input too large\n");
-            exit(-1);
+            if ( optind + 1 ==  argc )
+                exit(-1);
+            continue;
         }
 
         if ( fp != stdin )
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/14] fuzz/x86_emulate: Move all state into fuzz_state
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (7 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-08-25 16:43 ` [PATCH 11/14] fuzz/x86_emulate: Make input more compact George Dunlap
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

This is in preparation for adding the option for a more "compact"
interpretation of the fuzzing data, in which we only change select
bits of the state.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 90 +++++++++++++------------
 1 file changed, 46 insertions(+), 44 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 746e7989af..89d1714125 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -24,14 +24,8 @@
 /* Layout of data expected as fuzzing input. */
 struct fuzz_corpus
 {
-    unsigned long cr[5];
-    uint64_t msr[MSR_INDEX_MAX];
-    struct cpu_user_regs regs;
-    struct segment_register segments[SEG_NUM];
-    unsigned long options;
     unsigned char data[4096];
 } input;
-#define DATA_OFFSET offsetof(struct fuzz_corpus, data)
 
 /*
  * Internal state of the fuzzing harness.  Calculated initially from the input
@@ -39,6 +33,12 @@ struct fuzz_corpus
  */
 struct fuzz_state
 {
+    unsigned long options;
+    unsigned long cr[5];
+    uint64_t msr[MSR_INDEX_MAX];
+    struct segment_register segments[SEG_NUM];
+    struct cpu_user_regs regs;
+
     /* Fuzzer's input data. */
     struct fuzz_corpus *corpus;
 
@@ -51,6 +51,8 @@ struct fuzz_state
     /* Emulation ops, some of which are disabled based on corpus->options. */
     struct x86_emulate_ops ops;
 };
+#define DATA_OFFSET offsetof(struct fuzz_state, corpus)
+
 
 static inline int davail(struct fuzz_state *s, size_t size)
 {
@@ -392,11 +394,10 @@ static int fuzz_read_segment(
     struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
     assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
 
-    *reg = c->segments[seg];
+    *reg = s->segments[seg];
 
     return X86EMUL_OKAY;
 }
@@ -407,7 +408,6 @@ static int fuzz_write_segment(
     struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
     int rc;
 
     assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
@@ -415,7 +415,7 @@ static int fuzz_write_segment(
     rc = maybe_fail(ctxt, "write_segment", true);
 
     if ( rc == X86EMUL_OKAY )
-        c->segments[seg] = *reg;
+        s->segments[seg] = *reg;
 
     return rc;
 }
@@ -426,12 +426,11 @@ static int fuzz_read_cr(
     struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
-    if ( reg >= ARRAY_SIZE(c->cr) )
+    if ( reg >= ARRAY_SIZE(s->cr) )
         return X86EMUL_UNHANDLEABLE;
 
-    *val = c->cr[reg];
+    *val = s->cr[reg];
 
     return X86EMUL_OKAY;
 }
@@ -442,17 +441,16 @@ static int fuzz_write_cr(
     struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
     int rc;
 
-    if ( reg >= ARRAY_SIZE(c->cr) )
+    if ( reg >= ARRAY_SIZE(s->cr) )
         return X86EMUL_UNHANDLEABLE;
 
     rc = maybe_fail(ctxt, "write_cr", true);
     if ( rc != X86EMUL_OKAY )
         return rc;
 
-    c->cr[reg] = val;
+    s->cr[reg] = val;
 
     return X86EMUL_OKAY;
 }
@@ -487,7 +485,6 @@ static int fuzz_read_msr(
     struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
     unsigned int idx;
 
     switch ( reg )
@@ -501,10 +498,10 @@ static int fuzz_read_msr(
          */
         return data_read(ctxt, x86_seg_none, "read_msr", val, sizeof(*val));
     case MSR_EFER:
-        *val = c->msr[MSRI_EFER];
+        *val = s->msr[MSRI_EFER];
         *val &= ~EFER_LMA;
-        if ( (*val & EFER_LME) && (c->cr[4] & X86_CR4_PAE) &&
-             (c->cr[0] & X86_CR0_PG) )
+        if ( (*val & EFER_LME) && (s->cr[4] & X86_CR4_PAE) &&
+             (s->cr[0] & X86_CR0_PG) )
         {
             printf("Setting EFER_LMA\n");
             *val |= EFER_LMA;
@@ -516,7 +513,7 @@ static int fuzz_read_msr(
     {
         if ( msr_index[idx] == reg )
         {
-            *val = c->msr[idx];
+            *val = s->msr[idx];
             return X86EMUL_OKAY;
         }
     }
@@ -531,7 +528,6 @@ static int fuzz_write_msr(
     struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
     unsigned int idx;
     int rc;
 
@@ -550,7 +546,7 @@ static int fuzz_write_msr(
     {
         if ( msr_index[idx] == reg )
         {
-            c->msr[idx] = val;
+            s->msr[idx] = val;
             return X86EMUL_OKAY;
         }
     }
@@ -600,15 +596,14 @@ static void setup_fpu_exception_handler(void)
 static void dump_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
     struct cpu_user_regs *regs = ctxt->regs;
     uint64_t val = 0;
 
     printf(" -- State -- \n");
     printf("addr / sp size: %d / %d\n", ctxt->addr_size, ctxt->sp_size);
-    printf(" cr0: %lx\n", c->cr[0]);
-    printf(" cr3: %lx\n", c->cr[3]);
-    printf(" cr4: %lx\n", c->cr[4]);
+    printf(" cr0: %lx\n", s->cr[0]);
+    printf(" cr3: %lx\n", s->cr[3]);
+    printf(" cr4: %lx\n", s->cr[4]);
 
     printf(" rip: %"PRIx64"\n", regs->rip);
 
@@ -629,15 +624,13 @@ static bool long_mode_active(struct x86_emulate_ctxt *ctxt)
 static bool in_longmode(struct x86_emulate_ctxt *ctxt)
 {
     const struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
-    return long_mode_active(ctxt) && c->segments[x86_seg_cs].l;
+    return long_mode_active(ctxt) && s->segments[x86_seg_cs].l;
 }
 
 static void set_sizes(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
 
     ctxt->lma = long_mode_active(ctxt);
 
@@ -645,11 +638,20 @@ static void set_sizes(struct x86_emulate_ctxt *ctxt)
         ctxt->addr_size = ctxt->sp_size = 64;
     else
     {
-        ctxt->addr_size = c->segments[x86_seg_cs].db ? 32 : 16;
-        ctxt->sp_size   = c->segments[x86_seg_ss].db ? 32 : 16;
+        ctxt->addr_size = s->segments[x86_seg_cs].db ? 32 : 16;
+        ctxt->sp_size   = s->segments[x86_seg_ss].db ? 32 : 16;
     }
 }
 
+static void setup_state(struct x86_emulate_ctxt *ctxt)
+{
+    struct fuzz_state *s = ctxt->data;
+
+    /* Fuzz all of the state in one go */
+    if (!dread(s, s, DATA_OFFSET))
+        exit(-1);
+}
+
 #define CANONICALIZE(x)                                   \
     do {                                                  \
         uint64_t _y = (x);                                \
@@ -709,8 +711,7 @@ enum {
 static void disable_hooks(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    const struct fuzz_corpus *c = s->corpus;
-    unsigned long bitmap = c->options;
+    unsigned long bitmap = s->options;
 
     /* See also sanitize_input, some hooks can't be disabled. */
     MAYBE_DISABLE_HOOK(read);
@@ -760,12 +761,11 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
 static void sanitize_input(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
-    struct fuzz_corpus *c = s->corpus;
-    struct cpu_user_regs *regs = &c->regs;
-    unsigned long bitmap = c->options;
+    struct cpu_user_regs *regs = ctxt->regs;
+    unsigned long bitmap = s->options;
 
     /* Some hooks can't be disabled. */
-    c->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
+    s->options &= ~((1<<HOOK_read)|(1<<HOOK_insn_fetch));
 
     /* Zero 'private' entries */
     regs->error_code = 0;
@@ -779,8 +779,8 @@ static void sanitize_input(struct x86_emulate_ctxt *ctxt)
      * CR0.PG can't be set if CR0.PE isn't set.  Set is more interesting, so
      * set PE if PG is set.
      */
-    if ( c->cr[0] & X86_CR0_PG )
-        c->cr[0] |= X86_CR0_PE;
+    if ( s->cr[0] & X86_CR0_PG )
+        s->cr[0] |= X86_CR0_PE;
 
     /* EFLAGS.VM not available in long mode */
     if ( long_mode_active(ctxt) )
@@ -789,8 +789,8 @@ static void sanitize_input(struct x86_emulate_ctxt *ctxt)
     /* EFLAGS.VM implies 16-bit mode */
     if ( regs->rflags & X86_EFLAGS_VM )
     {
-        c->segments[x86_seg_cs].db = 0;
-        c->segments[x86_seg_ss].db = 0;
+        s->segments[x86_seg_cs].db = 0;
+        s->segments[x86_seg_ss].db = 0;
     }
 }
 
@@ -812,7 +812,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     };
     struct x86_emulate_ctxt ctxt = {
         .data = &state,
-        .regs = &input.regs,
+        .regs = &state.regs,
         .addr_size = 8 * sizeof(void *),
         .sp_size = 8 * sizeof(void *),
     };
@@ -836,7 +836,9 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     memcpy(&input, data_p, size);
 
     state.corpus = &input;
-    state.data_num = size - DATA_OFFSET;
+    state.data_num = size;
+
+    setup_state(&ctxt);
 
     sanitize_input(&ctxt);
 
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/14] fuzz/x86_emulate: Make input more compact
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (8 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 10/14] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-08-25 16:52   ` George Dunlap
  2017-08-25 17:59   ` Andrew Cooper
  2017-08-25 16:43 ` [PATCH 12/14] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

At the moment, AFL reckons that for any given input, 87% of it is
completely irrelevant: that is, it can change it as much as it wants
but have no impact on the result of the test; and yet it can't remove
it.

This is largely because we interpret the blob handed to us as a large
struct, including CR values, MSR values, segment registers, and a full
cpu_user_regs.

Instead, modify our interpretation to have a "set state" stanza at the
front.  Begin by reading a byte; if it is lower than a certain
threshold, set some state according to what byte it is, and repeat.
Continue until the byte is above a certain threshold.

This allows AFL to compact any given test case much smaller; to the
point where now it reckons there is not a single byte of the test file
which becomes irrelevant.  Testing have shown that this option both
allows AFL to reach coverage much faster, and to have a total coverage
higher than with the old format.

Make this an option (rather than a unilateral change) to enable
side-by-side performance comparison of the old and new formats.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
I'll reply to this e-mail with a graph of some tests I ran.

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 13 +++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 94 ++++++++++++++++++++---
 2 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 79f8aec653..12b3765dcc 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -4,6 +4,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <getopt.h>
+#include <stdbool.h>
 
 extern int LLVMFuzzerInitialize(int *argc, char ***argv);
 extern int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size);
@@ -12,6 +13,8 @@ extern unsigned int fuzz_minimal_input_size(void);
 #define INPUT_SIZE  4096
 static uint8_t input[INPUT_SIZE];
 
+extern bool opt_compact;
+
 int main(int argc, char **argv)
 {
     size_t size;
@@ -22,13 +25,17 @@ int main(int argc, char **argv)
     setbuf(stdin, NULL);
     setbuf(stdout, NULL);
 
+    opt_compact = true;
+
     while ( 1 )
     {
         enum {
             OPT_MIN_SIZE,
+            OPT_COMPACT,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
+            { "compact", required_argument, NULL, OPT_COMPACT },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -43,8 +50,12 @@ int main(int argc, char **argv)
             exit(0);
             break;
 
+        case OPT_COMPACT:
+            opt_compact = atoi(optarg);
+            break;
+            
         case '?':
-            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 89d1714125..48b02f2bf6 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -53,6 +53,15 @@ struct fuzz_state
 };
 #define DATA_OFFSET offsetof(struct fuzz_state, corpus)
 
+bool opt_compact;
+
+unsigned int fuzz_minimal_input_size(void)
+{
+    if ( opt_compact )
+        return sizeof(unsigned long) + 1;
+    else
+        return DATA_OFFSET + 1;
+}
 
 static inline int davail(struct fuzz_state *s, size_t size)
 {
@@ -647,9 +656,81 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
 
-    /* Fuzz all of the state in one go */
-    if (!dread(s, s, DATA_OFFSET))
-        exit(-1);
+    if ( !opt_compact )
+    {
+        /* Fuzz all of the state in one go */
+        if (!dread(s, s, DATA_OFFSET))
+            exit(-1);
+        return;
+    }
+
+    /* Modify only select bits of state */
+
+    /* Always read 'options' */
+    if ( !dread(s, &s->options, sizeof(s->options)) )
+        return;
+    
+    while(1) {
+        uint16_t offset;
+
+        /* Read 16 bits to decide what bit of state to modify */
+        if ( !dread(s, &offset, sizeof(offset)) )
+            return;
+
+        /* 
+         * Then decide if it's "pointing to" different bits of the
+         * state 
+         */
+
+        /* cr[]? */
+        if ( offset < 5 )
+        {
+            if ( !dread(s, s->cr + offset, sizeof(*s->cr)) )
+                return;
+            printf("Setting CR %d to %lx\n", offset, s->cr[offset]);
+            continue;
+        }
+        
+        offset -= 5;
+
+        /* msr[]? */
+        if ( offset < MSR_INDEX_MAX )
+        {
+            if ( !dread(s, s->msr + offset, sizeof(*s->msr)) )
+                return;
+            printf("Setting MSR i%d (%x) to %lx\n", offset,
+                   msr_index[offset], s->msr[offset]);
+            continue;
+        }
+
+        offset -= MSR_INDEX_MAX;
+
+        /* segments[]? */
+        if ( offset < SEG_NUM )
+        {
+            if ( !dread(s, s->segments + offset, sizeof(*s->segments)) )
+                return;
+            printf("Setting Segment %d\n", offset);
+            continue;
+            
+        }
+
+        offset -= SEG_NUM;
+
+        /* regs? */
+        if ( offset < sizeof(struct cpu_user_regs)
+             && offset + sizeof(uint64_t) <= sizeof(struct cpu_user_regs) )
+        {
+            if ( !dread(s, ((char *)ctxt->regs) + offset, sizeof(uint64_t)) )
+                return;
+            printf("Setting cpu_user_regs offset %x\n", offset);
+            continue;
+        }
+
+        /* None of the above -- take that as "start emulating" */
+        
+        return;
+    }
 }
 
 #define CANONICALIZE(x)                                   \
@@ -821,7 +902,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     /* Reset all global state variables */
     memset(&input, 0, sizeof(input));
 
-    if ( size <= DATA_OFFSET )
+    if ( size < fuzz_minimal_input_size() )
     {
         printf("Input too small\n");
         return 1;
@@ -858,11 +939,6 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     return 0;
 }
 
-unsigned int fuzz_minimal_input_size(void)
-{
-    return DATA_OFFSET + 1;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/14] fuzz/x86_emulate: Add --rerun option to try to track down instability
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (9 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 11/14] fuzz/x86_emulate: Make input more compact George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 13:30   ` Wei Liu
  2017-08-25 16:43 ` [PATCH 13/14] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

Current stability numbers are not 100%.  In order to help track this
down, add a --rerun option which will run the same input twice,
resetting the state in between each run, and comparing the state
afterwards.  If the state differs, call abort().

This allows AFL to help the process of tracking down what state is not
being reset properly between runs by proving testcases that
demonstrate the behavior.

To do this:

- Move ctxt into struct fuzz-state to simplify handling

- Rather than copying the data into input, treat the data handed as
  immutable and point each "copy" to it

- Factor out various steps (setting up fuzz state, running an
  individual test) so that they can be efficiently run either once or
  twice (as necessary)

- Compare the states afterwards, printing what's different and calling
  abort() if anything is found.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c |   9 +-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 182 ++++++++++++++++++----
 2 files changed, 159 insertions(+), 32 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 12b3765dcc..86c1241784 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -14,6 +14,7 @@ extern unsigned int fuzz_minimal_input_size(void);
 static uint8_t input[INPUT_SIZE];
 
 extern bool opt_compact;
+extern bool opt_rerun;
 
 int main(int argc, char **argv)
 {
@@ -32,10 +33,12 @@ int main(int argc, char **argv)
         enum {
             OPT_MIN_SIZE,
             OPT_COMPACT,
+            OPT_RERUN,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
             { "compact", required_argument, NULL, OPT_COMPACT },
+            { "rerun", no_argument, NULL, OPT_RERUN },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -54,8 +57,12 @@ int main(int argc, char **argv)
             opt_compact = atoi(optarg);
             break;
             
+        case OPT_RERUN:
+            opt_rerun = true;
+            break;
+
         case '?':
-            printf("Usage: %s [--compact=0|1] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] [--rerun] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 48b02f2bf6..1d0293e990 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -40,7 +40,7 @@ struct fuzz_state
     struct cpu_user_regs regs;
 
     /* Fuzzer's input data. */
-    struct fuzz_corpus *corpus;
+    const struct fuzz_corpus *corpus;
 
     /* Real amount of data backing corpus->data[]. */
     size_t data_num;
@@ -50,6 +50,7 @@ struct fuzz_state
 
     /* Emulation ops, some of which are disabled based on corpus->options. */
     struct x86_emulate_ops ops;
+    struct x86_emulate_ctxt ctxt;
 };
 #define DATA_OFFSET offsetof(struct fuzz_state, corpus)
 
@@ -496,6 +497,12 @@ static int fuzz_read_msr(
     const struct fuzz_state *s = ctxt->data;
     unsigned int idx;
 
+    /* 
+     * NB at the moment dump_state() relies on the fact that this
+     * cannot fail.  If we add in fuzzed failures we'll have to handle
+     * that differently.
+     */
+    
     switch ( reg )
     {
     case MSR_TSC_AUX:
@@ -616,6 +623,7 @@ static void dump_state(struct x86_emulate_ctxt *ctxt)
 
     printf(" rip: %"PRIx64"\n", regs->rip);
 
+    /* read_msr() never fails at the moment */
     fuzz_read_msr(MSR_EFER, &val, ctxt);
     printf("EFER: %"PRIx64"\n", val);
 }
@@ -660,7 +668,10 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
     {
         /* Fuzz all of the state in one go */
         if (!dread(s, s, DATA_OFFSET))
+        {
+            printf("Input size too small\n");
             exit(-1);
+        }
         return;
     }
 
@@ -789,9 +800,8 @@ enum {
         printf("Disabling hook "#h"\n");               \
     }
 
-static void disable_hooks(struct x86_emulate_ctxt *ctxt)
+static void disable_hooks(struct fuzz_state *s)
 {
-    struct fuzz_state *s = ctxt->data;
     unsigned long bitmap = s->options;
 
     /* See also sanitize_input, some hooks can't be disabled. */
@@ -839,7 +849,7 @@ static void disable_hooks(struct x86_emulate_ctxt *ctxt)
  *  - ...bases to below 1Mb, 16-byte aligned
  *  - ...selectors to (base >> 4)
  */
-static void sanitize_input(struct x86_emulate_ctxt *ctxt)
+static void sanitize_state(struct x86_emulate_ctxt *ctxt)
 {
     struct fuzz_state *s = ctxt->data;
     struct cpu_user_regs *regs = ctxt->regs;
@@ -886,21 +896,132 @@ int LLVMFuzzerInitialize(int *argc, char ***argv)
     return 0;
 }
 
-int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
+void setup_fuzz_state(struct fuzz_state *state, const uint8_t *data_p, size_t size)
 {
-    struct fuzz_state state = {
-        .ops = all_fuzzer_ops,
-    };
-    struct x86_emulate_ctxt ctxt = {
-        .data = &state,
-        .regs = &state.regs,
-        .addr_size = 8 * sizeof(void *),
-        .sp_size = 8 * sizeof(void *),
-    };
+    memset(state, 0, sizeof(*state));
+    state->corpus = (struct fuzz_corpus *)data_p;
+    state->data_num = size;
+}
+
+int runtest(struct fuzz_state *state) {
     int rc;
 
-    /* Reset all global state variables */
-    memset(&input, 0, sizeof(input));
+    struct x86_emulate_ctxt *ctxt = &state->ctxt;
+    
+    state->ops = all_fuzzer_ops;
+
+    ctxt->data = state;
+    ctxt->regs = &state->regs;
+    ctxt->addr_size = ctxt->sp_size = 8 * sizeof(void *);
+
+    setup_state(ctxt);
+
+    sanitize_state(ctxt);
+
+    disable_hooks(state);
+
+    do {
+        /* FIXME: Until we actually implement SIGFPE handling properly */
+        setup_fpu_exception_handler();
+
+        set_sizes(ctxt);
+        dump_state(ctxt);
+
+        rc = x86_emulate(ctxt, &state->ops);
+        printf("Emulation result: %d\n", rc);
+    } while ( rc == X86EMUL_OKAY );
+
+    return 0;
+}
+
+void compare_states(struct fuzz_state state[2])
+{
+    // First zero any "internal" pointers
+    state[0].corpus = state[1].corpus = NULL;
+    state[0].ctxt.data = state[1].ctxt.data = NULL;
+    state[0].ctxt.regs = state[1].ctxt.regs = NULL;
+
+    
+    if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) )
+    {
+        int i;
+
+        printf("State mismatch\n");
+
+        for ( i=0; i<5; i++)
+            if (state[0].cr[i] != state[1].cr[i])
+                printf("cr[%d]: %lx != %lx\n",
+                       i, state[0].cr[i], state[1].cr[i]);
+        
+        for ( i=0; i<MSR_INDEX_MAX; i++)
+            if (state[0].msr[i] != state[1].msr[i])
+                printf("msr[%d]: %lx != %lx\n",
+                       i, state[0].msr[i], state[1].msr[i]);
+        
+        for ( i=0; i<SEG_NUM; i++)
+            if ( memcmp(&state[0].segments[i], &state[1].segments[i], sizeof(state[0].segments[0])) )
+                printf("segments[%d] differ!\n", i);
+
+        if (state[0].data_num != state[1].data_num)
+            printf("data_num: %lx !=  %lx\n", state[0].data_num, state[1].data_num);
+        if (state[0].data_index != state[1].data_index)
+            printf("data_index: %lx !=  %lx\n", state[0].data_index, state[1].data_index);
+
+        if ( memcmp(&state[0].regs, &state[1].regs, sizeof(state[0].regs)) )
+        {
+            printf("registers differ!\n");
+            /* Print If Not Equal */
+#define PINE(elem)\
+            if ( state[0].elem != state[1].elem ) \
+                printf(#elem " differ: %lx != %lx\n", \
+                       (unsigned long)state[0].elem, (unsigned long)state[1].elem)
+            PINE(regs.r15);
+            PINE(regs.r14);
+            PINE(regs.r13);
+            PINE(regs.r12);
+            PINE(regs.rbp);
+            PINE(regs.rbx);
+            PINE(regs.r10);
+            PINE(regs.r11);
+            PINE(regs.r9);
+            PINE(regs.r8);
+            PINE(regs.rax);
+            PINE(regs.rcx);
+            PINE(regs.rdx);
+            PINE(regs.rsi);
+            PINE(regs.rdi);
+
+            for ( i = offsetof(struct cpu_user_regs, error_code) / sizeof(unsigned);
+                  i < sizeof(state[1].regs)/sizeof(unsigned); i++ )
+            {
+                printf("[%04lu] %08x %08x\n",
+                        i * sizeof(unsigned), ((unsigned *)&state[0].regs)[i], ((unsigned *)&state[1].regs)[i]);
+            }
+        }
+
+        if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
+            printf("ops differ!\n");
+
+        if ( memcmp(&state[0].ctxt, &state[1].ctxt, sizeof(state[0].ctxt)) )
+        {
+            printf("ctxt differs!\n");
+            for ( i = 0;  i < sizeof(state[0].ctxt)/sizeof(unsigned); i++ )
+            {
+                printf("[%04lu] %08x %08x\n",
+                        i * sizeof(unsigned), ((unsigned *)&state[0].ctxt)[i], ((unsigned *)&state[1].ctxt)[i]);
+            }
+            
+        }
+
+        abort();
+    }
+}
+
+bool opt_rerun = false;
+
+int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
+{
+    struct fuzz_state state[2];
 
     if ( size < fuzz_minimal_input_size() )
     {
@@ -908,7 +1029,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
         return 1;
     }
 
-    if ( size > sizeof(input) )
+    if ( size > sizeof(struct fuzz_corpus) )
     {
         printf("Input too large\n");
         return 1;
@@ -916,25 +1037,24 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
 
     memcpy(&input, data_p, size);
 
-    state.corpus = &input;
-    state.data_num = size;
-
-    setup_state(&ctxt);
+    setup_fuzz_state(&state[0], data_p, size);
+    
+    if ( opt_rerun )
+        printf("||| INITIAL RUN |||\n");
+    
+    runtest(&state[0]);
 
-    sanitize_input(&ctxt);
+    if ( !opt_rerun )
+        return 0;
 
-    disable_hooks(&ctxt);
+    /* Reset all global state variables again */
+    setup_fuzz_state(&state[1], data_p, size);
 
-    do {
-        /* FIXME: Until we actually implement SIGFPE handling properly */
-        setup_fpu_exception_handler();
+    printf("||| SECOND RUN |||\n");
 
-        set_sizes(&ctxt);
-        dump_state(&ctxt);
+    runtest(&state[1]);
 
-        rc = x86_emulate(&ctxt, &state.ops);
-        printf("Emulation result: %d\n", rc);
-    } while ( rc == X86EMUL_OKAY );
+    compare_states(state);
 
     return 0;
 }
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/14] fuzz/x86_emulate: Set and fuzz more CPU state
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (10 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 12/14] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-08-25 16:43 ` [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
  2017-08-25 17:37 ` [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook Andrew Cooper
  13 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

x86_emulate() operates not only on state passed to it in
cpu_user_regs, but also on state currently found on the cpu: namely,
the FPU and XMM registers.  At the moment, we re-zero (and/or
re-initialize) cpu_user_regs on every invocation, but leave the
cpu-stored state alone.  In "persistent mode", this causes test cases
to behave differently -- sometimes significantly so -- depending on
which test cases have been run beforehand.

Zero out the state before each test run, and then fuzz it based on the
corpus input.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 71 +++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 1d0293e990..7a07e7e37a 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -38,6 +38,8 @@ struct fuzz_state
     uint64_t msr[MSR_INDEX_MAX];
     struct segment_register segments[SEG_NUM];
     struct cpu_user_regs regs;
+    char fxsave[512] __attribute__((aligned(16)));
+
 
     /* Fuzzer's input data. */
     const struct fuzz_corpus *corpus;
@@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
 };
 #undef SET
 
+static void _set_fpu_state(char *fxsave, bool store)
+{
+    if ( cpu_has_fxsr )
+    {
+        static union __attribute__((__aligned__(16))) {
+            char x[464];
+            struct {
+                uint32_t other[6];
+                uint32_t mxcsr;
+                uint32_t mxcsr_mask;
+                /* ... */
+            };
+        } *fxs;
+
+        fxs = (typeof(fxs)) fxsave;
+
+        if ( store ) {
+            char null[512] __attribute__((aligned(16))) = { 0 };
+            asm volatile(" fxrstor %0; "::"m"(*null));
+            asm volatile(" fxrstor %0; "::"m"(*fxsave));
+        }
+        
+        asm volatile( "fxsave %0" : "=m" (*fxs) );
+
+        if ( fxs->mxcsr_mask )
+            mxcsr_mask = fxs->mxcsr_mask;
+        else
+            mxcsr_mask = 0x000ffbf;
+    }
+}
+
+static void set_fpu_state(char *fxsave)
+{
+    _set_fpu_state(fxsave, true);
+}
+
+static void save_fpu_state(char *fxsave)
+{
+    _set_fpu_state(fxsave, false);
+}
+
 static void setup_fpu_exception_handler(void)
 {
     /* FIXME - just disable exceptions for now */
@@ -737,6 +780,17 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
             printf("Setting cpu_user_regs offset %x\n", offset);
             continue;
         }
+        offset -= sizeof(struct cpu_user_regs);
+
+        /* Fuzz fxsave state */
+        if ( offset < 128 )
+        {
+            if ( !dread(s, s->fxsave + (offset * 4), 4) )
+                return;
+            printf("Setting fxsave offset %x\n", offset * 4);
+            continue;
+        }
+        offset -= 128;
 
         /* None of the above -- take that as "start emulating" */
         
@@ -883,6 +937,9 @@ static void sanitize_state(struct x86_emulate_ctxt *ctxt)
         s->segments[x86_seg_cs].db = 0;
         s->segments[x86_seg_ss].db = 0;
     }
+
+    /* Setting this value seems to cause crashes in fxrstor */
+    *((unsigned int *)(s->fxsave) + 6) = 0;
 }
 
 int LLVMFuzzerInitialize(int *argc, char ***argv)
@@ -920,6 +977,8 @@ int runtest(struct fuzz_state *state) {
 
     disable_hooks(state);
 
+    set_fpu_state(state->fxsave);
+
     do {
         /* FIXME: Until we actually implement SIGFPE handling properly */
         setup_fpu_exception_handler();
@@ -931,6 +990,8 @@ int runtest(struct fuzz_state *state) {
         printf("Emulation result: %d\n", rc);
     } while ( rc == X86EMUL_OKAY );
 
+    save_fpu_state(state->fxsave);
+    
     return 0;
 }
 
@@ -1002,6 +1063,16 @@ void compare_states(struct fuzz_state state[2])
         if ( memcmp(&state[0].ops, &state[1].ops, sizeof(state[0].ops)) )
             printf("ops differ!\n");
 
+        if ( memcmp(&state[0].fxsave, &state[1].fxsave, sizeof(state[0].fxsave)) )
+        {
+            printf("fxsave differs!\n");
+            for ( i = 0;  i < sizeof(state[0].fxsave)/sizeof(unsigned); i++ )
+            {
+                printf("[%04lu] %08x %08x\n",
+                        i * sizeof(unsigned), ((unsigned *)&state[0].fxsave)[i], ((unsigned *)&state[1].fxsave)[i]);
+            }
+        }
+
         if ( memcmp(&state[0].ctxt, &state[1].ctxt, sizeof(state[0].ctxt)) )
         {
             printf("ctxt differs!\n");
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (11 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 13/14] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
@ 2017-08-25 16:43 ` George Dunlap
  2017-09-15 13:38   ` Wei Liu
  2017-08-25 17:37 ` [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook Andrew Cooper
  13 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:43 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, George Dunlap, Jan Beulich, Andrew Cooper

AFL considers a testcase to be a useful addition not only if there are
tuples exercised by that testcase which were not exercised otherwise,
but also if the *number* of times an individual tuple is exercised
changes significantly; in particular, if the number of the highes bit
changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).

Unfortunately, one simple way to increase these stats it to execute
the same (or similar) instructions multiple times.  Such long
testcases take exponentially longer to fuzz: the fuzzer spends more
time flipping bits looking for meaningful changes, and each execution
takes longer because it is doing more things.  So long paths which add
nothing to the actual code coverage but effectively "distract" the
fuzzer, making it less effective.

Experiments have shown that not allowing infinite number of
instruction retries for the old (non-compact) format does indeed speed
up and increase code coverage.  However, it has also shown that on the
new, more compact format, having no instruction limit causes the highest
throughput in code coverage.

So leave the option in, but have it default to 0 (no limit).

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c | 9 ++++++++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 7 ++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
index 86c1241784..5cc6ba39ff 100644
--- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
+++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
@@ -15,6 +15,7 @@ static uint8_t input[INPUT_SIZE];
 
 extern bool opt_compact;
 extern bool opt_rerun;
+extern int opt_instruction_limit;
 
 int main(int argc, char **argv)
 {
@@ -34,11 +35,13 @@ int main(int argc, char **argv)
             OPT_MIN_SIZE,
             OPT_COMPACT,
             OPT_RERUN,
+            OPT_INSTRUCTION_LIMIT,
         };
         static const struct option lopts[] = {
             { "min-input-size", no_argument, NULL, OPT_MIN_SIZE },
             { "compact", required_argument, NULL, OPT_COMPACT },
             { "rerun", no_argument, NULL, OPT_RERUN },
+            { "instruction-limit", required_argument, NULL, OPT_INSTRUCTION_LIMIT },
             { 0, 0, 0, 0 }
         };
         int c = getopt_long_only(argc, argv, "", lopts, NULL);
@@ -61,8 +64,12 @@ int main(int argc, char **argv)
             opt_rerun = true;
             break;
 
+        case OPT_INSTRUCTION_LIMIT:
+            opt_instruction_limit = atoi(optarg);
+            break;
+
         case '?':
-            printf("Usage: %s [--compact=0|1] [--rerun] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
+            printf("Usage: %s [--compact=0|1] [--rerun] [--instruction-limit=N] $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
             exit(-1);
             break;
 
diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 7a07e7e37a..46c382db11 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -960,10 +960,13 @@ void setup_fuzz_state(struct fuzz_state *state, const uint8_t *data_p, size_t si
     state->data_num = size;
 }
 
+int opt_instruction_limit = 0;
+
 int runtest(struct fuzz_state *state) {
     int rc;
 
     struct x86_emulate_ctxt *ctxt = &state->ctxt;
+    int icount = 0;
     
     state->ops = all_fuzzer_ops;
 
@@ -988,7 +991,9 @@ int runtest(struct fuzz_state *state) {
 
         rc = x86_emulate(ctxt, &state->ops);
         printf("Emulation result: %d\n", rc);
-    } while ( rc == X86EMUL_OKAY );
+    } while ( rc == X86EMUL_OKAY &&
+              (!opt_instruction_limit ||
+               (++icount < opt_instruction_limit)) );
 
     save_fpu_state(state->fxsave);
     
-- 
2.14.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] fuzz/x86_emulate: Make input more compact
  2017-08-25 16:43 ` [PATCH 11/14] fuzz/x86_emulate: Make input more compact George Dunlap
@ 2017-08-25 16:52   ` George Dunlap
  2017-08-25 17:59   ` Andrew Cooper
  1 sibling, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-25 16:52 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich, Andrew Cooper

[-- Attachment #1: Type: text/plain, Size: 2710 bytes --]

On 08/25/2017 05:43 PM, George Dunlap wrote:
> At the moment, AFL reckons that for any given input, 87% of it is
> completely irrelevant: that is, it can change it as much as it wants
> but have no impact on the result of the test; and yet it can't remove
> it.
> 
> This is largely because we interpret the blob handed to us as a large
> struct, including CR values, MSR values, segment registers, and a full
> cpu_user_regs.
> 
> Instead, modify our interpretation to have a "set state" stanza at the
> front.  Begin by reading a byte; if it is lower than a certain
> threshold, set some state according to what byte it is, and repeat.
> Continue until the byte is above a certain threshold.
> 
> This allows AFL to compact any given test case much smaller; to the
> point where now it reckons there is not a single byte of the test file
> which becomes irrelevant.  Testing have shown that this option both
> allows AFL to reach coverage much faster, and to have a total coverage
> higher than with the old format.
> 
> Make this an option (rather than a unilateral change) to enable
> side-by-side performance comparison of the old and new formats.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> I'll reply to this e-mail with a graph of some tests I ran.

Here are the results of my testing justifying the new defaults for
whether to use 'compact' state definition or not, and how many
instructions to execute.

'new-format' had --compact=1, 'old-format' was run with --compact=0.

no-limit was run with --instruction-limit=0, limit-1 with
--instruction-limit=1, and so on.

In each case I ran 4 parallel AFL instances on my workstation for 24
hours.

The attached graph shows the number of unique AFL 'tuples' touched
over time.  (See [1] for more information.)  Having the line be higher
overall at the end of the run (i.e., best branch coverage) is the
primary goal; having the line shift over to the left (quickest
discovery) is a secondary goal.  The two are related in the sense that
quicker discovery should allow more time to explore further search
space.

The combination that had both the quickest discovery and the highest
overall branch coverage was the new 'compact' format with no
instruction limit (the defaults set by this patch).

In the 'compact' format, limiting the number of instructions seemed
only to slow things down and reduce the final coverage amount
(although the final coverage for limit-1 did beat the final coverages
for other limits).

In the 'non-compact' format, having unlimited instructions seems very
much to slow down discovery; but it also increased the final coverage
count.

 -George

[1] http://lcamtuf.coredump.cx/afl/technical_details.txt

[-- Attachment #2: afl-effectiveness-tuples.png --]
[-- Type: image/png, Size: 180329 bytes --]

[-- Attachment #3: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook
  2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
                   ` (12 preceding siblings ...)
  2017-08-25 16:43 ` [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
@ 2017-08-25 17:37 ` Andrew Cooper
  2017-08-28 10:34   ` George Dunlap
  2017-09-22 15:47   ` George Dunlap
  13 siblings, 2 replies; 43+ messages in thread
From: Andrew Cooper @ 2017-08-25 17:37 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

On 25/08/17 17:43, George Dunlap wrote:
> You don't need __AFL_INIT if you have __AFL_LOOP.
>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Really?  Is that covered in any documentation?

I got the contrary impression from whichever version of AFL I was using
when I put this in, and a quick look over the afl-fuzz source doesn't
appear to equate them in any way.

~Andrew

> ---
> CC: Ian Jackson <ian.jackson@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
> index 154869336a..1a79ff228e 100644
> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
> @@ -63,8 +63,6 @@ int main(int argc, char **argv)
>          exit(-1);
>  
>  #ifdef __AFL_HAVE_MANUAL_CONTROL
> -    __AFL_INIT();
> -
>      while ( __AFL_LOOP(1000) )
>  #endif
>      {


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/14] fuzz/x86_emulate: Add a better input size check
  2017-08-25 16:43 ` [PATCH 04/14] fuzz/x86_emulate: Add a better input size check George Dunlap
@ 2017-08-25 17:42   ` Andrew Cooper
  2017-09-15 11:39   ` Wei Liu
  1 sibling, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2017-08-25 17:42 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

On 25/08/17 17:43, George Dunlap wrote:
> For some reason the 'feof()' check for the file size isn't working in
> llvm-clang-fast mode; the result is several kilobyte files rather than
> the 4k limit files as we've requested.  This is bad in part because
> AFL will spend time trying to "fuzz" bits of the input that are never
> touched.
>
> Add a new check: Offer to read INPUT_SIZE + 1; if we actually get that
> many bytes, return an error.
>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Doesn't fast mode pass the corpus by shared memory?  I wonder if it is
doing something slightly wonky with hooking the library functions.

Has this issue been reported upstream?  A workaround like this shouldn't
be necessary.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail()
  2017-08-25 16:43 ` [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail() George Dunlap
@ 2017-08-25 17:45   ` Andrew Cooper
  2017-09-14 17:06     ` George Dunlap
  2017-09-25 11:40     ` George Dunlap
  0 siblings, 2 replies; 43+ messages in thread
From: Andrew Cooper @ 2017-08-25 17:45 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

On 25/08/17 17:43, George Dunlap wrote:
> Rather than open-coding the "read" from the input file.
>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

This patch fills me with dread.

How about data_read() and data_available() which are slightly more
descriptive?

Also, both should be using bool rather than int.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] fuzz/x86_emulate: Make input more compact
  2017-08-25 16:43 ` [PATCH 11/14] fuzz/x86_emulate: Make input more compact George Dunlap
  2017-08-25 16:52   ` George Dunlap
@ 2017-08-25 17:59   ` Andrew Cooper
  2017-08-28  9:10     ` George Dunlap
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2017-08-25 17:59 UTC (permalink / raw)
  To: George Dunlap, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

On 25/08/17 17:43, George Dunlap wrote:
> At the moment, AFL reckons that for any given input, 87% of it is
> completely irrelevant: that is, it can change it as much as it wants
> but have no impact on the result of the test; and yet it can't remove
> it.
>
> This is largely because we interpret the blob handed to us as a large
> struct, including CR values, MSR values, segment registers, and a full
> cpu_user_regs.
>
> Instead, modify our interpretation to have a "set state" stanza at the
> front.  Begin by reading a byte; if it is lower than a certain
> threshold, set some state according to what byte it is, and repeat.
> Continue until the byte is above a certain threshold.
>
> This allows AFL to compact any given test case much smaller; to the
> point where now it reckons there is not a single byte of the test file
> which becomes irrelevant.  Testing have shown that this option both
> allows AFL to reach coverage much faster, and to have a total coverage
> higher than with the old format.
>
> Make this an option (rather than a unilateral change) to enable
> side-by-side performance comparison of the old and new formats.
>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

I continue to think this is a bad idea.  You are taking a genuine
problem and adding a complicated algorithm to try and fool alf, rather
than fixing the problem.

The reason 87% of input is irrelevant is because it really is.  The
input state is full of 64bit values being used for a one or two bits
which we ever look at.

The solution to this problem is remove the irrelevant information from
fuzz_corpus.  I already started doing this with the alf-fast work for
the Xen 4.9 release, but I've basically been doing security work ever
since and haven't had time to continue it.

For the record, this hunk is how I intended to continue the work:

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 74e8c85..dafe435 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -24,7 +24,27 @@
 /* Layout of data expected as fuzzing input. */
 struct fuzz_corpus
 {
-    unsigned long cr[5];
+    /* %cr0 */
+    bool pe:1;
+    bool mp:1;
+    bool em:1;
+    bool ts:1;
+    bool pg:1;
+
+    /* %cr4 */
+    bool vme:1;
+    bool pvi:1;
+    bool tsd:1;
+    bool osfxsr:1;
+    bool osxmmexcpt:1;
+    bool umip:1;
+    bool fsgsbase:1;
+    bool osxsave:1;
+
+    /* EFER */
+    bool sce:1;
+    bool lme:1;
+
     uint64_t msr[MSR_INDEX_MAX];
     struct cpu_user_regs regs;
     struct segment_register segments[SEG_NUM];
@@ -50,6 +70,9 @@ struct fuzz_state
 
     /* Emulation ops, some of which are disabled based on
corpus->options. */
     struct x86_emulate_ops ops;
+
+    unsigned long cr0, cr2, cr3, cr4, cr8;
+    uint64_t efer;
 };
 
 /*

Which drops loads of useless bits out of AFLs view.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/14] fuzz/x86_emulate: Make input more compact
  2017-08-25 17:59   ` Andrew Cooper
@ 2017-08-28  9:10     ` George Dunlap
  0 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-08-28  9:10 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

On 08/25/2017 06:59 PM, Andrew Cooper wrote:
> On 25/08/17 17:43, George Dunlap wrote:
>> At the moment, AFL reckons that for any given input, 87% of it is
>> completely irrelevant: that is, it can change it as much as it wants
>> but have no impact on the result of the test; and yet it can't remove
>> it.
>>
>> This is largely because we interpret the blob handed to us as a large
>> struct, including CR values, MSR values, segment registers, and a full
>> cpu_user_regs.
>>
>> Instead, modify our interpretation to have a "set state" stanza at the
>> front.  Begin by reading a byte; if it is lower than a certain
>> threshold, set some state according to what byte it is, and repeat.
>> Continue until the byte is above a certain threshold.
>>
>> This allows AFL to compact any given test case much smaller; to the
>> point where now it reckons there is not a single byte of the test file
>> which becomes irrelevant.  Testing have shown that this option both
>> allows AFL to reach coverage much faster, and to have a total coverage
>> higher than with the old format.
>>
>> Make this an option (rather than a unilateral change) to enable
>> side-by-side performance comparison of the old and new formats.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> 
> I continue to think this is a bad idea.  You are taking a genuine
> problem and adding a complicated algorithm to try and fool alf, rather
> than fixing the problem.
> 
> The reason 87% of input is irrelevant is because it really is.  The
> input state is full of 64bit values being used for a one or two bits
> which we ever look at.

My take on it is this.

The state struct is very large -- I think it's around 500 bytes without
the FPU state, and over 1k with it.

Nearly every bit of the state has *some* instruction that interacts with
it.  But in order for AFL to notice that, it has to find a combination
<state, instruction> such that the instruction and the state actually
interact.  For any given <state, instruction> tuple, changing the vast
majority of the state will have absolutely no effect -- meaning that
AFL's fuzzing is almost entirely wasted.  AFL will spend a huge amount
of time fuzzing bits of the state struct that cannot possibly have an
effect on the outcome, since the instuctions in the test case don't
interact with those bits at all.

With the "compact" input interpretation, once AFL finds a <offset,
content, instruction> tuple that correspond, changing any of the three
is pretty much guaranteed to have some effect; finding a set that
correspond should be much easier, and in any case fuzzing it should be a
lot faster.

As such, I don't think I'm trying to "fool" AFL: I'm just giving it
tools to search more effectively.

> The solution to this problem is remove the irrelevant information from
> fuzz_corpus.  I already started doing this with the alf-fast work for
> the Xen 4.9 release, but I've basically been doing security work ever
> since and haven't had time to continue it.
> 
> For the record, this hunk is how I intended to continue the work:
> 
> diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> index 74e8c85..dafe435 100644
> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -24,7 +24,27 @@
>  /* Layout of data expected as fuzzing input. */
>  struct fuzz_corpus
>  {
> -    unsigned long cr[5];
> +    /* %cr0 */
> +    bool pe:1;
> +    bool mp:1;
> +    bool em:1;
> +    bool ts:1;
> +    bool pg:1;
> +
> +    /* %cr4 */
> +    bool vme:1;
> +    bool pvi:1;
> +    bool tsd:1;
> +    bool osfxsr:1;
> +    bool osxmmexcpt:1;
> +    bool umip:1;
> +    bool fsgsbase:1;
> +    bool osxsave:1;
> +
> +    /* EFER */
> +    bool sce:1;
> +    bool lme:1;

I'm 98% sure that the handful of bits in cr0 and cr4 aren't the problem.
 AFL isnt' looking at *bits* in the cr0 register, it's looking at that
as an interger, and I'm sure that it notices, "I add X to this and it
changes stuff, so it must be used."

The problem is these:

>      uint64_t msr[MSR_INDEX_MAX];
>      struct cpu_user_regs regs;
>      struct segment_register segments[SEG_NUM];

And in particular this one:
    char fxsave[512] __attribute__((aligned(16)));

Your proposal doesn't change the fact that for any given test case, the
vast majority of state won't interact with the instructions it contains
at all.

I've still got the scripts and analysis stuff, so if you send me a patch
you think is better, I'm happy to run it through and add it to the
comparison.

In any case, prediction based on expert intuition is good, but empirical
evidence is better*.  The graph I sent clearly shows that the 'compact'
format, with unlimited number of instructions, generates far more map
coverage in far less time than any other combination.  Unless you can
find some flaw in my methodology, or an alternate interpretation of the
data, I think we should go with the more 'compact' format.  The old
format is still available, and it would be easy to switch the default
back (or entirely remove the 'compact' format) anytime it is shown
empirically to be more effective.

-George

* For instance, if I hadn't done these graphs, I would have suggested an
instruction limit of 2 or 3, because my intiution told me that unlimited
instructions wasted AFL's time.  Which it does in the old format, but
apparently does not in the new.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook
  2017-08-25 17:37 ` [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook Andrew Cooper
@ 2017-08-28 10:34   ` George Dunlap
  2017-09-14 15:26     ` George Dunlap
  2017-09-22 15:47   ` George Dunlap
  1 sibling, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-08-28 10:34 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

On 08/25/2017 06:37 PM, Andrew Cooper wrote:
> On 25/08/17 17:43, George Dunlap wrote:
>> You don't need __AFL_INIT if you have __AFL_LOOP.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> 
> Really?  Is that covered in any documentation?
> 
> I got the contrary impression from whichever version of AFL I was using
> when I put this in, and a quick look over the afl-fuzz source doesn't
> appear to equate them in any way.

The documentation does seem a bit hazy on the subject.  However:

1. It clear from the documentation [1] that both of them work *only* in
llvm mode (i.e., when compiled with afl-clang-fast).  In particular the
last paragraph of section 4: "afl-gcc or afl-clang will
*not* generate a deferred-initialization binary".

2. The documentation does seem to speak of them as separate 'modes'
(Section 5, "Note that as with the previous mode, ...")

3. Empirically speaking, persistent mode works fine with __AFL_LOOP()
and no __AFL_INIT() (for me anyway).

 -George

[1] https://github.com/mirrorer/afl/tree/master/llvm_mode

> 
> ~Andrew
> 
>> ---
>> CC: Ian Jackson <ian.jackson@citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> ---
>>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> index 154869336a..1a79ff228e 100644
>> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> @@ -63,8 +63,6 @@ int main(int argc, char **argv)
>>          exit(-1);
>>  
>>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>> -    __AFL_INIT();
>> -
>>      while ( __AFL_LOOP(1000) )
>>  #endif
>>      {
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook
  2017-08-28 10:34   ` George Dunlap
@ 2017-09-14 15:26     ` George Dunlap
  0 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-09-14 15:26 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Ian Jackson, Wei Liu, Jan Beulich

Ping?

I realize this isn't a  major feature but it would be nice to get it
in for 4.10.

 -George

On Mon, Aug 28, 2017 at 11:34 AM, George Dunlap
<george.dunlap@citrix.com> wrote:
> On 08/25/2017 06:37 PM, Andrew Cooper wrote:
>> On 25/08/17 17:43, George Dunlap wrote:
>>> You don't need __AFL_INIT if you have __AFL_LOOP.
>>>
>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>>
>> Really?  Is that covered in any documentation?
>>
>> I got the contrary impression from whichever version of AFL I was using
>> when I put this in, and a quick look over the afl-fuzz source doesn't
>> appear to equate them in any way.
>
> The documentation does seem a bit hazy on the subject.  However:
>
> 1. It clear from the documentation [1] that both of them work *only* in
> llvm mode (i.e., when compiled with afl-clang-fast).  In particular the
> last paragraph of section 4: "afl-gcc or afl-clang will
> *not* generate a deferred-initialization binary".
>
> 2. The documentation does seem to speak of them as separate 'modes'
> (Section 5, "Note that as with the previous mode, ...")
>
> 3. Empirically speaking, persistent mode works fine with __AFL_LOOP()
> and no __AFL_INIT() (for me anyway).
>
>  -George
>
> [1] https://github.com/mirrorer/afl/tree/master/llvm_mode
>
>>
>> ~Andrew
>>
>>> ---
>>> CC: Ian Jackson <ian.jackson@citrix.com>
>>> CC: Wei Liu <wei.liu2@citrix.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> ---
>>>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>>
>>> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>>> index 154869336a..1a79ff228e 100644
>>> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
>>> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>>> @@ -63,8 +63,6 @@ int main(int argc, char **argv)
>>>          exit(-1);
>>>
>>>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>>> -    __AFL_INIT();
>>> -
>>>      while ( __AFL_LOOP(1000) )
>>>  #endif
>>>      {
>>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail()
  2017-08-25 17:45   ` Andrew Cooper
@ 2017-09-14 17:06     ` George Dunlap
  2017-09-25 11:40     ` George Dunlap
  1 sibling, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-09-14 17:06 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 6:45 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 25/08/17 17:43, George Dunlap wrote:
>> Rather than open-coding the "read" from the input file.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>
> This patch fills me with dread.
>
> How about data_read() and data_available() which are slightly more
> descriptive?
>
> Also, both should be using bool rather than int.

I'm happy to do both; I was just waiting for comments on the rest of
the series before resending.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/14] fuzz/x86_emulate: Actually use cpu_regs input
  2017-08-25 16:43 ` [PATCH 03/14] fuzz/x86_emulate: Actually use cpu_regs input George Dunlap
@ 2017-09-15 11:21   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2017-09-15 11:21 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:32PM +0100, George Dunlap wrote:
> Commit c07574b reorganized the way fuzzing was done, explicitly
> creating a structure that the input data would be copied into.
> 
> Unfortunately, the cpu register state used by the emulator is on the
> stack; it's cleared, but data is never copied into it.
> 
> If we're explicitly setting an entirely new cpu_regs struct for each
> new input anyway, there's no need to have two copies around anymore;
> just point to the one in the data structure.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/14] fuzz/x86_emulate: Add a better input size check
  2017-08-25 16:43 ` [PATCH 04/14] fuzz/x86_emulate: Add a better input size check George Dunlap
  2017-08-25 17:42   ` Andrew Cooper
@ 2017-09-15 11:39   ` Wei Liu
  2017-09-25  9:36     ` George Dunlap
  1 sibling, 1 reply; 43+ messages in thread
From: Wei Liu @ 2017-09-15 11:39 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:33PM +0100, George Dunlap wrote:
> For some reason the 'feof()' check for the file size isn't working in
> llvm-clang-fast mode; the result is several kilobyte files rather than
> the 4k limit files as we've requested.  This is bad in part because
> AFL will spend time trying to "fuzz" bits of the input that are never
> touched.
> 

You mean feof returns non-zero (true) when it shouldn't?

> Add a new check: Offer to read INPUT_SIZE + 1; if we actually get that
> many bytes, return an error.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Ian Jackson <ian.jackson@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
> index 1a79ff228e..51e0183356 100644
> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
> @@ -76,7 +76,7 @@ int main(int argc, char **argv)
>              }
>          }
>  
> -        size = fread(input, 1, INPUT_SIZE, fp);
> +        size = fread(input, 1, INPUT_SIZE + 1, fp);

You probably want to actual define input to be of INPUT_SIZE+1 byte as well.

I doubt address sanitiser will be happy with overrunning the buffer.

>  
>          if ( ferror(fp) )
>          {
> @@ -84,7 +84,7 @@ int main(int argc, char **argv)
>              exit(-1);
>          }
>  
> -        if ( !feof(fp) )
> +        if ( !feof(fp) || size > INPUT_SIZE )
>          {
>              printf("Input too large\n");
>              exit(-1);
> -- 
> 2.14.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness
  2017-08-25 16:43 ` [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
@ 2017-09-15 11:41   ` Wei Liu
  2017-09-15 11:47     ` George Dunlap
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2017-09-15 11:41 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:34PM +0100, George Dunlap wrote:
> - Print the symbolic name rather than the number
> - Explicitly state when data_read() fails due to EOI
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Ian Jackson <ian.jackson@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> index 48a879cc88..7f9a369421 100644
> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -52,6 +52,14 @@ struct fuzz_state
>      struct x86_emulate_ops ops;
>  };
>  
> +char *x86emul_return_string[] = {
> +    [X86EMUL_OKAY]="X86EMUL_OKAY",
> +    [X86EMUL_UNHANDLEABLE]="X86EMUL_UNHANDLEABLE",
> +    [X86EMUL_EXCEPTION]="X86EMUL_EXCEPTION",
> +    [X86EMUL_RETRY]="X86EMUL_RETRY",
> +    [X86EMUL_DONE]="X86EMUL_DONE",

Can you please add spaces around "=" ?

With that fixed:

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/14] fuzz/x86_emulate: Rename the file containing the wrapper code
  2017-08-25 16:43 ` [PATCH 07/14] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
@ 2017-09-15 11:45   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2017-09-15 11:45 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:36PM +0100, George Dunlap wrote:
> When generating coverage output, by default gcov generates output
> filenames based only on the coverage file and the "leaf" source file,
> not the full path.  As a result, it uses the same name for
> x86_emulate.c and x86_emulate/x86_emulate.c, generally overwriting the
> second (which we actually are about) with the first (which is just a
> wrapper).
> 
> Rename the user-space wrapper helpers to x86_emulate_user.[ch], so
> that it generates separate files.
> 
> There is actually an option to gcov, `--preserve-paths`, which will
> cause the full path name to be included in the filename, properly
> distinguishing between the two.  However, given that the user-space
> wrapper doesn't actually do any emulation (and the poor state of gcov
> documentation making it difficult to find the option in the first
> place), it seems to make more sense to rename the file anyway.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness
  2017-09-15 11:41   ` Wei Liu
@ 2017-09-15 11:47     ` George Dunlap
  0 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-09-15 11:47 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Andrew Cooper, Jan Beulich, Ian Jackson


> On Sep 15, 2017, at 12:41 PM, Wei Liu <wei.liu2@citrix.com> wrote:
> 
> On Fri, Aug 25, 2017 at 05:43:34PM +0100, George Dunlap wrote:
>> - Print the symbolic name rather than the number
>> - Explicitly state when data_read() fails due to EOI
>> 
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> ---
>> CC: Ian Jackson <ian.jackson@citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> ---
>> tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> index 48a879cc88..7f9a369421 100644
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -52,6 +52,14 @@ struct fuzz_state
>>     struct x86_emulate_ops ops;
>> };
>> 
>> +char *x86emul_return_string[] = {
>> +    [X86EMUL_OKAY]="X86EMUL_OKAY",
>> +    [X86EMUL_UNHANDLEABLE]="X86EMUL_UNHANDLEABLE",
>> +    [X86EMUL_EXCEPTION]="X86EMUL_EXCEPTION",
>> +    [X86EMUL_RETRY]="X86EMUL_RETRY",
>> +    [X86EMUL_DONE]="X86EMUL_DONE",
> 
> Can you please add spaces around "=“ ?

Of course.  Thanks!
 -G

> 
> With that fixed:
> 
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target
  2017-08-25 16:43 ` [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
@ 2017-09-15 12:55   ` Wei Liu
  2017-09-15 12:57   ` Wei Liu
  1 sibling, 0 replies; 43+ messages in thread
From: Wei Liu @ 2017-09-15 12:55 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:37PM +0100, George Dunlap wrote:
> ...to generate a "normal" coverage-instrumented binary, suitable for
> use with gcov or afl-cov.
> 
> This is slightly annoying because:
> 
>  - Every object file needs to have been instrumented to work
>    effectively
> 
>  - You generally want to have both an afl-instrumented binary and a
>    gcov-instrumented binary at the same time, but
> 
>  - gcov instrumentation and afl instrumentation are mutually exclusive
> 
> So when making the `afl-cov` target, generate a second set of object
> files and a second binary with the `-cov` suffix.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Ian Jackson <ian.jackson@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  .gitignore                                   |  1 +
>  tools/fuzz/README.afl                        | 14 ++++++++++++++
>  tools/fuzz/x86_instruction_emulator/Makefile | 19 ++++++++++++++++++-
>  3 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/.gitignore b/.gitignore
> index 594ffd9a7f..66bceb3ebe 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -159,6 +159,7 @@ tools/fuzz/libelf/afl-libelf-fuzzer
>  tools/fuzz/x86_instruction_emulator/asm
>  tools/fuzz/x86_instruction_emulator/x86_emulate*
>  tools/fuzz/x86_instruction_emulator/afl-harness
> +tools/fuzz/x86_instruction_emulator/afl-harness-cov
>  tools/helpers/_paths.h
>  tools/helpers/init-xenstore-domain
>  tools/helpers/xen-init-dom0
> diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
> index 4758de2490..0d955b2687 100644
> --- a/tools/fuzz/README.afl
> +++ b/tools/fuzz/README.afl
> @@ -41,3 +41,17 @@ Use the x86 instruction emulator fuzzer as an example.
>     $ $AFLPATH/afl-fuzz -t 1000 -i testcase_dir -o findings_dir -- ./afl-harness
>  
>  Please see AFL documentation for more information.
> +
> +# GENERATING COVERAGE INFORMATION
> +
> +To use afl-cov or gcov, you need a separate binary instrumented to
> +generate coverage data.  To do this, use the target `afl-cov`:
> +
> +    $ make afl-cov #produces afl-harness-cov
> +
> +NOTE: Please also note that the coverage instrumentation hard-codes
> +the absolute path for the instrumentation read and write files in the
> +binary; so coverage data will always show up in the build directory no
> +matter where you run the binary from.
> +
> +Please see afl-cov and/or gcov documentation for more information.
> \ No newline at end of file
> diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
> index 10009dc08f..629e191029 100644
> --- a/tools/fuzz/x86_instruction_emulator/Makefile
> +++ b/tools/fuzz/x86_instruction_emulator/Makefile
> @@ -23,19 +23,33 @@ x86_emulate_user.c x86_emulate_user.h: %:
>  
>  CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
>  
> +GCOV_FLAGS=--coverage
> +
>  x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
>  x86_emulate.h := x86_emulate_user.h x86_emulate/x86_emulate.h $(x86.h)
>  
>  x86_emulate_user.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
>  
> +x86_emulate_user-cov.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)

The dependencies should be factored out and used by this and the
non-gcov target.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target
  2017-08-25 16:43 ` [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
  2017-09-15 12:55   ` Wei Liu
@ 2017-09-15 12:57   ` Wei Liu
  2017-09-15 13:28     ` George Dunlap
  1 sibling, 1 reply; 43+ messages in thread
From: Wei Liu @ 2017-09-15 12:57 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:37PM +0100, George Dunlap wrote:
> diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
> index 10009dc08f..629e191029 100644
> --- a/tools/fuzz/x86_instruction_emulator/Makefile
> +++ b/tools/fuzz/x86_instruction_emulator/Makefile
> @@ -23,19 +23,33 @@ x86_emulate_user.c x86_emulate_user.h: %:
>  
>  CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
>  
> +GCOV_FLAGS=--coverage
> +
>  x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
>  x86_emulate.h := x86_emulate_user.h x86_emulate/x86_emulate.h $(x86.h)
>  
>  x86_emulate_user.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
>  
> +x86_emulate_user-cov.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
> +	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) -o $@ x86_emulate_user.c
> +
>  fuzz-emul.o: $(x86_emulate.h)
>  
> +fuzz-emul-cov.o: fuzz-emul.c $(x86_emulate.h)
> +	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) -o $@ fuzz-emul.c
> +
> +afl-harness-cov.o: afl-harness.c
> +	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
> +
>  x86-insn-fuzzer.a: fuzz-emul.o x86_emulate_user.o
>  	$(AR) rc $@ $^
>  
>  afl-harness: afl-harness.o fuzz-emul.o x86_emulate_user.o
>  	$(CC) $(CFLAGS) $^ -o $@
>  
> +afl-harness-cov: afl-harness-cov.o fuzz-emul-cov.o x86_emulate_user-cov.o
> +	$(CC) $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
> +

Also it is possible to use pattern rules to simplify targets. But given
this is a rather small program I won't insist on using using pattern
rules.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs
  2017-08-25 16:43 ` [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
@ 2017-09-15 13:07   ` Wei Liu
  2017-09-15 13:27     ` George Dunlap
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2017-09-15 13:07 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:38PM +0100, George Dunlap wrote:
> Finding aggregate coverage for a set of test files means running each
> afl-generated test case through the harness.  At the moment, this is
> done by re-executing afl-harness-cov with each input file.  When a
> large number of test cases have been generated, this can take a
> significant amonut of time; a recent test with 30k total files
> generated by 4 parallel fuzzers took over 7 minutes.
> 
> The vast majority of this time is taken up with 'exec', however.
> Since the harness is already designed to loop over multiple inputs for
> llvm "persistent mode", just allow it to take a large number of inputs
> on the same when *not* running in llvm "persistent mode"..  Then the
> command can be efficiently executed like this:
> 
>   ls */queue/id* | xargs $path/afl-harness-cov
> 
> For the above-mentioned test on 30k files, the time to generate
> coverage data was reduced from 7 minutes to under 30 seconds.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Ian Jackson <ian.jackson@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  tools/fuzz/README.afl                             |  7 +++++++
>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 23 ++++++++++++++++-------
>  2 files changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
> index 0d955b2687..e8c23d734c 100644
> --- a/tools/fuzz/README.afl
> +++ b/tools/fuzz/README.afl
> @@ -49,6 +49,13 @@ generate coverage data.  To do this, use the target `afl-cov`:
>  
>      $ make afl-cov #produces afl-harness-cov
>  
> +In order to speed up the process of checking total coverage,
> +`afl-harness-cov` can take several test inputs on its command-line;
> +the speed-up effect should be similar to that of using afl-clang-fast.
> +You can use xargs to do this most efficiently, like so:
> +
> +    $ ls queue/id* | xargs $path/afl-harness-cov
> +
>  NOTE: Please also note that the coverage instrumentation hard-codes
>  the absolute path for the instrumentation read and write files in the
>  binary; so coverage data will always show up in the build directory no
> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
> index 51e0183356..79f8aec653 100644
> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
> @@ -16,6 +16,8 @@ int main(int argc, char **argv)
>  {
>      size_t size;
>      FILE *fp = NULL;
> +    int count = 0;
> +    int max;

unsigned int.

>  
>      setbuf(stdin, NULL);
>      setbuf(stdout, NULL);
> @@ -42,8 +44,7 @@ int main(int argc, char **argv)
>              break;
>  
>          case '?':
> -        usage:
> -            printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]);
> +            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
>              exit(-1);
>              break;
>  
> @@ -54,21 +55,27 @@ int main(int argc, char **argv)
>          }
>      }
>  
> -    if ( optind == argc ) /* No positional parameters.  Use stdin. */
> +    max = argc - optind;
> +
> +    if ( !max ) /* No positional parameters.  Use stdin. */
> +    {
> +        max = 1;
>          fp = stdin;
> -    else if ( optind != (argc - 1) )
> -        goto usage;
> +    }
>  
>      if ( LLVMFuzzerInitialize(&argc, &argv) )
>          exit(-1);
>  
>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>      while ( __AFL_LOOP(1000) )
> +#else
> +    for( count = 0; count < max; count++ )
>  #endif
>      {
>          if ( fp != stdin ) /* If not using stdin, open the provided file. */
>          {
> -            fp = fopen(argv[optind], "rb");
> +            printf("Opening file %s\n", argv[optind]);

optind + count

> +            fp = fopen(argv[optind + count], "rb");
>              if ( fp == NULL )
>              {
>                  perror("fopen");
> @@ -87,7 +94,9 @@ int main(int argc, char **argv)
>          if ( !feof(fp) || size > INPUT_SIZE )
>          {
>              printf("Input too large\n");
> -            exit(-1);
> +            if ( optind + 1 ==  argc )

What is this for?

> +                exit(-1);
> +            continue;
>          }
>  
>          if ( fp != stdin )
> -- 
> 2.14.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs
  2017-09-15 13:07   ` Wei Liu
@ 2017-09-15 13:27     ` George Dunlap
  2017-09-15 13:42       ` Wei Liu
  0 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-09-15 13:27 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Andrew Cooper, Jan Beulich, Ian Jackson

On 09/15/2017 02:07 PM, Wei Liu wrote:
> On Fri, Aug 25, 2017 at 05:43:38PM +0100, George Dunlap wrote:
>> Finding aggregate coverage for a set of test files means running each
>> afl-generated test case through the harness.  At the moment, this is
>> done by re-executing afl-harness-cov with each input file.  When a
>> large number of test cases have been generated, this can take a
>> significant amonut of time; a recent test with 30k total files
>> generated by 4 parallel fuzzers took over 7 minutes.
>>
>> The vast majority of this time is taken up with 'exec', however.
>> Since the harness is already designed to loop over multiple inputs for
>> llvm "persistent mode", just allow it to take a large number of inputs
>> on the same when *not* running in llvm "persistent mode"..  Then the
>> command can be efficiently executed like this:
>>
>>   ls */queue/id* | xargs $path/afl-harness-cov
>>
>> For the above-mentioned test on 30k files, the time to generate
>> coverage data was reduced from 7 minutes to under 30 seconds.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> ---
>> CC: Ian Jackson <ian.jackson@citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> ---
>>  tools/fuzz/README.afl                             |  7 +++++++
>>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 23 ++++++++++++++++-------
>>  2 files changed, 23 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/fuzz/README.afl b/tools/fuzz/README.afl
>> index 0d955b2687..e8c23d734c 100644
>> --- a/tools/fuzz/README.afl
>> +++ b/tools/fuzz/README.afl
>> @@ -49,6 +49,13 @@ generate coverage data.  To do this, use the target `afl-cov`:
>>  
>>      $ make afl-cov #produces afl-harness-cov
>>  
>> +In order to speed up the process of checking total coverage,
>> +`afl-harness-cov` can take several test inputs on its command-line;
>> +the speed-up effect should be similar to that of using afl-clang-fast.
>> +You can use xargs to do this most efficiently, like so:
>> +
>> +    $ ls queue/id* | xargs $path/afl-harness-cov
>> +
>>  NOTE: Please also note that the coverage instrumentation hard-codes
>>  the absolute path for the instrumentation read and write files in the
>>  binary; so coverage data will always show up in the build directory no
>> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> index 51e0183356..79f8aec653 100644
>> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> @@ -16,6 +16,8 @@ int main(int argc, char **argv)
>>  {
>>      size_t size;
>>      FILE *fp = NULL;
>> +    int count = 0;
>> +    int max;
> 
> unsigned int.
> 
>>  
>>      setbuf(stdin, NULL);
>>      setbuf(stdout, NULL);
>> @@ -42,8 +44,7 @@ int main(int argc, char **argv)
>>              break;
>>  
>>          case '?':
>> -        usage:
>> -            printf("Usage: %s $FILE | [--min-input-size]\n", argv[0]);
>> +            printf("Usage: %s $FILE [$FILE...] | [--min-input-size]\n", argv[0]);
>>              exit(-1);
>>              break;
>>  
>> @@ -54,21 +55,27 @@ int main(int argc, char **argv)
>>          }
>>      }
>>  
>> -    if ( optind == argc ) /* No positional parameters.  Use stdin. */
>> +    max = argc - optind;
>> +
>> +    if ( !max ) /* No positional parameters.  Use stdin. */
>> +    {
>> +        max = 1;
>>          fp = stdin;
>> -    else if ( optind != (argc - 1) )
>> -        goto usage;
>> +    }
>>  
>>      if ( LLVMFuzzerInitialize(&argc, &argv) )
>>          exit(-1);
>>  
>>  #ifdef __AFL_HAVE_MANUAL_CONTROL
>>      while ( __AFL_LOOP(1000) )
>> +#else
>> +    for( count = 0; count < max; count++ )
>>  #endif
>>      {
>>          if ( fp != stdin ) /* If not using stdin, open the provided file. */
>>          {
>> -            fp = fopen(argv[optind], "rb");
>> +            printf("Opening file %s\n", argv[optind]);
> 
> optind + count
> 
>> +            fp = fopen(argv[optind + count], "rb");
>>              if ( fp == NULL )
>>              {
>>                  perror("fopen");
>> @@ -87,7 +94,9 @@ int main(int argc, char **argv)
>>          if ( !feof(fp) || size > INPUT_SIZE )
>>          {
>>              printf("Input too large\n");
>> -            exit(-1);
>> +            if ( optind + 1 ==  argc )
> 
> What is this for?

Only call exit() when the file is too large if the current testcase is
the last one.

One of the "interesting testcases" that AFL finds is always "The file
was too large".  Without this clause, "afl-harness-cov [blah] queue/id*"
would process testcases until it found that one, then stop processing,
rather than processing the full set of them.

We could make this something like "if ( max == 1 )" (only exit if not in
'batch mode') if you think that would be better.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target
  2017-09-15 12:57   ` Wei Liu
@ 2017-09-15 13:28     ` George Dunlap
  0 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-09-15 13:28 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Andrew Cooper, Jan Beulich, Ian Jackson

On 09/15/2017 01:57 PM, Wei Liu wrote:
> On Fri, Aug 25, 2017 at 05:43:37PM +0100, George Dunlap wrote:
>> diff --git a/tools/fuzz/x86_instruction_emulator/Makefile b/tools/fuzz/x86_instruction_emulator/Makefile
>> index 10009dc08f..629e191029 100644
>> --- a/tools/fuzz/x86_instruction_emulator/Makefile
>> +++ b/tools/fuzz/x86_instruction_emulator/Makefile
>> @@ -23,19 +23,33 @@ x86_emulate_user.c x86_emulate_user.h: %:
>>  
>>  CFLAGS += $(CFLAGS_xeninclude) -D__XEN_TOOLS__ -I.
>>  
>> +GCOV_FLAGS=--coverage
>> +
>>  x86.h := asm/x86-vendors.h asm/x86-defns.h asm/msr-index.h
>>  x86_emulate.h := x86_emulate_user.h x86_emulate/x86_emulate.h $(x86.h)
>>  
>>  x86_emulate_user.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
>>  
>> +x86_emulate_user-cov.o: x86_emulate_user.c x86_emulate/x86_emulate.c $(x86_emulate.h)
>> +	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) -o $@ x86_emulate_user.c
>> +
>>  fuzz-emul.o: $(x86_emulate.h)
>>  
>> +fuzz-emul-cov.o: fuzz-emul.c $(x86_emulate.h)
>> +	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) -o $@ fuzz-emul.c
>> +
>> +afl-harness-cov.o: afl-harness.c
>> +	$(CC) -c $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
>> +
>>  x86-insn-fuzzer.a: fuzz-emul.o x86_emulate_user.o
>>  	$(AR) rc $@ $^
>>  
>>  afl-harness: afl-harness.o fuzz-emul.o x86_emulate_user.o
>>  	$(CC) $(CFLAGS) $^ -o $@
>>  
>> +afl-harness-cov: afl-harness-cov.o fuzz-emul-cov.o x86_emulate_user-cov.o
>> +	$(CC) $(CFLAGS) $(GCOV_FLAGS) $^ -o $@
>> +
> 
> Also it is possible to use pattern rules to simplify targets. But given
> this is a rather small program I won't insist on using using pattern
> rules.

Indeed, a previous incarnation of this patch actually had such rules;
but it seemed a bit heavy-handed.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/14] fuzz/x86_emulate: Add --rerun option to try to track down instability
  2017-08-25 16:43 ` [PATCH 12/14] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
@ 2017-09-15 13:30   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2017-09-15 13:30 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:41PM +0100, George Dunlap wrote:
> Current stability numbers are not 100%.  In order to help track this
> down, add a --rerun option which will run the same input twice,
> resetting the state in between each run, and comparing the state
> afterwards.  If the state differs, call abort().
> 
> This allows AFL to help the process of tracking down what state is not
> being reset properly between runs by proving testcases that
> demonstrate the behavior.
> 
> To do this:
> 
> - Move ctxt into struct fuzz-state to simplify handling
> 
> - Rather than copying the data into input, treat the data handed as
>   immutable and point each "copy" to it
> 
> - Factor out various steps (setting up fuzz state, running an
>   individual test) so that they can be efficiently run either once or
>   twice (as necessary)
> 
> - Compare the states afterwards, printing what's different and calling
>   abort() if anything is found.
> 

FWIW I think this is an useful option to have. Since this patch depends
on previous ones which have comments I only skim-read it.

> +
> +void compare_states(struct fuzz_state state[2])
> +{
> +    // First zero any "internal" pointers
> +    state[0].corpus = state[1].corpus = NULL;
> +    state[0].ctxt.data = state[1].ctxt.data = NULL;
> +    state[0].ctxt.regs = state[1].ctxt.regs = NULL;
> +
> +    
> +    if ( memcmp(&state[0], &state[1], sizeof(struct fuzz_state)) )
> +    {
> +        int i;
> +
> +        printf("State mismatch\n");
> +
> +        for ( i=0; i<5; i++)
> +            if (state[0].cr[i] != state[1].cr[i])
> +                printf("cr[%d]: %lx != %lx\n",
> +                       i, state[0].cr[i], state[1].cr[i]);

Coding style issues here and below.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed
  2017-08-25 16:43 ` [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
@ 2017-09-15 13:38   ` Wei Liu
  2017-09-15 13:55     ` George Dunlap
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2017-09-15 13:38 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 05:43:43PM +0100, George Dunlap wrote:
> AFL considers a testcase to be a useful addition not only if there are
> tuples exercised by that testcase which were not exercised otherwise,
> but also if the *number* of times an individual tuple is exercised
> changes significantly; in particular, if the number of the highes bit
> changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).
> 
> Unfortunately, one simple way to increase these stats it to execute
> the same (or similar) instructions multiple times.  Such long
> testcases take exponentially longer to fuzz: the fuzzer spends more
> time flipping bits looking for meaningful changes, and each execution
> takes longer because it is doing more things.  So long paths which add
> nothing to the actual code coverage but effectively "distract" the
> fuzzer, making it less effective.
> 
> Experiments have shown that not allowing infinite number of
> instruction retries for the old (non-compact) format does indeed speed
> up and increase code coverage.  However, it has also shown that on the
> new, more compact format, having no instruction limit causes the highest
> throughput in code coverage.
> 
> So leave the option in, but have it default to 0 (no limit).

How does limiting the number of loops help afl produce better input?
Wouldn't afl still try to flip bits beyond the limit (say, the >=n+1
instructions when the limit is n)? I assume it will give up at some
point, but when?

I guess my point is having a limit but doesn't tell afl about it seems
a bit sub-optimal to me. I'm not quite sure if I understand correctly
how afl works though.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs
  2017-09-15 13:27     ` George Dunlap
@ 2017-09-15 13:42       ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2017-09-15 13:42 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Sep 15, 2017 at 02:27:35PM +0100, George Dunlap wrote:
> >>                  perror("fopen");
> >> @@ -87,7 +94,9 @@ int main(int argc, char **argv)
> >>          if ( !feof(fp) || size > INPUT_SIZE )
> >>          {
> >>              printf("Input too large\n");
> >> -            exit(-1);
> >> +            if ( optind + 1 ==  argc )
> > 
> > What is this for?
> 
> Only call exit() when the file is too large if the current testcase is
> the last one.
> 
> One of the "interesting testcases" that AFL finds is always "The file
> was too large".  Without this clause, "afl-harness-cov [blah] queue/id*"
> would process testcases until it found that one, then stop processing,
> rather than processing the full set of them.
> 
> We could make this something like "if ( max == 1 )" (only exit if not in
> 'batch mode') if you think that would be better.
> 

Yes the new check looks clearer to me.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed
  2017-09-15 13:38   ` Wei Liu
@ 2017-09-15 13:55     ` George Dunlap
  2017-09-19 10:05       ` Wei Liu
  0 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-09-15 13:55 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Andrew Cooper, Jan Beulich, Ian Jackson

On 09/15/2017 02:38 PM, Wei Liu wrote:
> On Fri, Aug 25, 2017 at 05:43:43PM +0100, George Dunlap wrote:
>> AFL considers a testcase to be a useful addition not only if there are
>> tuples exercised by that testcase which were not exercised otherwise,
>> but also if the *number* of times an individual tuple is exercised
>> changes significantly; in particular, if the number of the highes bit
>> changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).
>>
>> Unfortunately, one simple way to increase these stats it to execute
>> the same (or similar) instructions multiple times.  Such long
>> testcases take exponentially longer to fuzz: the fuzzer spends more
>> time flipping bits looking for meaningful changes, and each execution
>> takes longer because it is doing more things.  So long paths which add
>> nothing to the actual code coverage but effectively "distract" the
>> fuzzer, making it less effective.
>>
>> Experiments have shown that not allowing infinite number of
>> instruction retries for the old (non-compact) format does indeed speed
>> up and increase code coverage.  However, it has also shown that on the
>> new, more compact format, having no instruction limit causes the highest
>> throughput in code coverage.
>>
>> So leave the option in, but have it default to 0 (no limit).
> 
> How does limiting the number of loops help afl produce better input?
> Wouldn't afl still try to flip bits beyond the limit (say, the >=n+1
> instructions when the limit is n)? I assume it will give up at some
> point, but when?

* Limiting the number of loops means longer testcases don't look
different than shorter testcases

* If a testcase doesn't look different than existing testcases, then AFL
will discard it rather than adding it to the queue

* Larger "queue" testcases exhibit a quadratic effect for search time:
They are longer to fuzz (since they're bigger) and each testcase is
longer to run.

* So limiting the number of loops causes the testcases in the queue to
be shorter, which in turn decreases the time to fuzz and execute test
cases in the queue, which causes higher throughput.

> I guess my point is having a limit but doesn't tell afl about it seems
> a bit sub-optimal to me. I'm not quite sure if I understand correctly
> how afl works though.

Well I think the key thing is to look at the graph.  Either 1) there's
something wrong with my methodology, or 2) for the original layout,
limiting the number of testcases helps improve AFL's performance.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed
  2017-09-15 13:55     ` George Dunlap
@ 2017-09-19 10:05       ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2017-09-19 10:05 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Andrew Cooper, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Sep 15, 2017 at 02:55:03PM +0100, George Dunlap wrote:
> On 09/15/2017 02:38 PM, Wei Liu wrote:
> > On Fri, Aug 25, 2017 at 05:43:43PM +0100, George Dunlap wrote:
> >> AFL considers a testcase to be a useful addition not only if there are
> >> tuples exercised by that testcase which were not exercised otherwise,
> >> but also if the *number* of times an individual tuple is exercised
> >> changes significantly; in particular, if the number of the highes bit
> >> changes (i.e., if it is run 1, 2-3, 4-7, 8-15, &c).
> >>
> >> Unfortunately, one simple way to increase these stats it to execute
> >> the same (or similar) instructions multiple times.  Such long
> >> testcases take exponentially longer to fuzz: the fuzzer spends more
> >> time flipping bits looking for meaningful changes, and each execution
> >> takes longer because it is doing more things.  So long paths which add
> >> nothing to the actual code coverage but effectively "distract" the
> >> fuzzer, making it less effective.
> >>
> >> Experiments have shown that not allowing infinite number of
> >> instruction retries for the old (non-compact) format does indeed speed
> >> up and increase code coverage.  However, it has also shown that on the
> >> new, more compact format, having no instruction limit causes the highest
> >> throughput in code coverage.
> >>
> >> So leave the option in, but have it default to 0 (no limit).
> > 
> > How does limiting the number of loops help afl produce better input?
> > Wouldn't afl still try to flip bits beyond the limit (say, the >=n+1
> > instructions when the limit is n)? I assume it will give up at some
> > point, but when?
> 
> * Limiting the number of loops means longer testcases don't look
> different than shorter testcases
> 
> * If a testcase doesn't look different than existing testcases, then AFL
> will discard it rather than adding it to the queue
> 
> * Larger "queue" testcases exhibit a quadratic effect for search time:
> They are longer to fuzz (since they're bigger) and each testcase is
> longer to run.
> 
> * So limiting the number of loops causes the testcases in the queue to
> be shorter, which in turn decreases the time to fuzz and execute test
> cases in the queue, which causes higher throughput.
> 
> > I guess my point is having a limit but doesn't tell afl about it seems
> > a bit sub-optimal to me. I'm not quite sure if I understand correctly
> > how afl works though.
> 
> Well I think the key thing is to look at the graph.  Either 1) there's
> something wrong with my methodology, or 2) for the original layout,
> limiting the number of testcases helps improve AFL's performance.
> 

I was just curious how it got that behaviour, either way I don't think I
would care that much because they are all internal implementation
details which are subject to change at any time.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook
  2017-08-25 17:37 ` [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook Andrew Cooper
  2017-08-28 10:34   ` George Dunlap
@ 2017-09-22 15:47   ` George Dunlap
  2017-09-22 16:09     ` Andrew Cooper
  1 sibling, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-09-22 15:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 6:37 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 25/08/17 17:43, George Dunlap wrote:
>> You don't need __AFL_INIT if you have __AFL_LOOP.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>
> Really?  Is that covered in any documentation?
>
> I got the contrary impression from whichever version of AFL I was using
> when I put this in, and a quick look over the afl-fuzz source doesn't
> appear to equate them in any way.

Trying to answer the feof() question, I dug into llvm mode a bit more.
They are independent features but they can be used together: that is,
if you compile with afl-clang-fast, you always get a forkserver.  If
you specify the location with __AFL_INIT, that's where the forkserver
get started; otherwise, it's somewhere before main() is called.

When __AFL_LOOP(N) is present, it will execute the loop N times before
calling exit(); at which point, the forkserver will fork another
instance (either before main(), or at __AFL_INIT).

So you don't *need* __AFL_INIT; and the amount of time it will save
isn't much (the initialization every 1000 iterations), but I suppose
we might as well keep it.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook
  2017-09-22 15:47   ` George Dunlap
@ 2017-09-22 16:09     ` Andrew Cooper
  0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2017-09-22 16:09 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, Wei Liu, Jan Beulich, Ian Jackson

On 22/09/17 16:47, George Dunlap wrote:
> On Fri, Aug 25, 2017 at 6:37 PM, Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
>> On 25/08/17 17:43, George Dunlap wrote:
>>> You don't need __AFL_INIT if you have __AFL_LOOP.
>>>
>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> Really?  Is that covered in any documentation?
>>
>> I got the contrary impression from whichever version of AFL I was using
>> when I put this in, and a quick look over the afl-fuzz source doesn't
>> appear to equate them in any way.
> Trying to answer the feof() question, I dug into llvm mode a bit more.
> They are independent features but they can be used together: that is,
> if you compile with afl-clang-fast, you always get a forkserver.  If
> you specify the location with __AFL_INIT, that's where the forkserver
> get started; otherwise, it's somewhere before main() is called.
>
> When __AFL_LOOP(N) is present, it will execute the loop N times before
> calling exit(); at which point, the forkserver will fork another
> instance (either before main(), or at __AFL_INIT).
>
> So you don't *need* __AFL_INIT; and the amount of time it will save
> isn't much (the initialization every 1000 iterations), but I suppose
> we might as well keep it.

The purpose of c/s 63092064eb was very deliberately to cause the 
mprotect() remapping the stack as executable to be not repeated.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/14] fuzz/x86_emulate: Add a better input size check
  2017-09-15 11:39   ` Wei Liu
@ 2017-09-25  9:36     ` George Dunlap
  2017-09-25 11:08       ` George Dunlap
  0 siblings, 1 reply; 43+ messages in thread
From: George Dunlap @ 2017-09-25  9:36 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Ian Jackson, Jan Beulich, Andrew Cooper

On Fri, Sep 15, 2017 at 12:39 PM, Wei Liu <wei.liu2@citrix.com> wrote:
> On Fri, Aug 25, 2017 at 05:43:33PM +0100, George Dunlap wrote:
>> For some reason the 'feof()' check for the file size isn't working in
>> llvm-clang-fast mode; the result is several kilobyte files rather than
>> the 4k limit files as we've requested.  This is bad in part because
>> AFL will spend time trying to "fuzz" bits of the input that are never
>> touched.
>>
>
> You mean feof returns non-zero (true) when it shouldn't?

It looks like it does.  I modified the code thus:

        if ( !feof(fp) )
        {
            printf("Input too large\n");
            if ( optind + 1 ==  argc )
                exit(-1);
            continue;
        }

        if ( fread(input, 1, 1, fp) > 0 )
        {
            fprintf(stderr, "feof check failed to detect oversized input!");
            abort();
        }

And ran AFL for a bit in afl-clang-fast mode.  It ran fine for about
two cycles, before the massive repetition started happening; but once
the file size got larger than 4096 it found "crashes", even though
running it manually with the same input file results in a simple
"Input too large".

>> Add a new check: Offer to read INPUT_SIZE + 1; if we actually get that
>> many bytes, return an error.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> ---
>> CC: Ian Jackson <ian.jackson@citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Jan Beulich <jbeulich@suse.com>
>> ---
>>  tools/fuzz/x86_instruction_emulator/afl-harness.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/fuzz/x86_instruction_emulator/afl-harness.c b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> index 1a79ff228e..51e0183356 100644
>> --- a/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> +++ b/tools/fuzz/x86_instruction_emulator/afl-harness.c
>> @@ -76,7 +76,7 @@ int main(int argc, char **argv)
>>              }
>>          }
>>
>> -        size = fread(input, 1, INPUT_SIZE, fp);
>> +        size = fread(input, 1, INPUT_SIZE + 1, fp);
>
> You probably want to actual define input to be of INPUT_SIZE+1 byte as well.
>
> I doubt address sanitiser will be happy with overrunning the buffer.

Yeah, the check is a little suboptimal; I'll see what I can come up with.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/14] fuzz/x86_emulate: Add a better input size check
  2017-09-25  9:36     ` George Dunlap
@ 2017-09-25 11:08       ` George Dunlap
  0 siblings, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-09-25 11:08 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Ian Jackson, Jan Beulich, Andrew Cooper

On Mon, Sep 25, 2017 at 10:36 AM, George Dunlap
<george.dunlap@citrix.com> wrote:
> On Fri, Sep 15, 2017 at 12:39 PM, Wei Liu <wei.liu2@citrix.com> wrote:
>> On Fri, Aug 25, 2017 at 05:43:33PM +0100, George Dunlap wrote:
>>> For some reason the 'feof()' check for the file size isn't working in
>>> llvm-clang-fast mode; the result is several kilobyte files rather than
>>> the 4k limit files as we've requested.  This is bad in part because
>>> AFL will spend time trying to "fuzz" bits of the input that are never
>>> touched.
>>>
>>
>> You mean feof returns non-zero (true) when it shouldn't?
>
> It looks like it does.  I modified the code thus:
>
>         if ( !feof(fp) )
>         {
>             printf("Input too large\n");
>             if ( optind + 1 ==  argc )
>                 exit(-1);
>             continue;
>         }
>
>         if ( fread(input, 1, 1, fp) > 0 )
>         {
>             fprintf(stderr, "feof check failed to detect oversized input!");
>             abort();
>         }
>
> And ran AFL for a bit in afl-clang-fast mode.  It ran fine for about
> two cycles, before the massive repetition started happening; but once
> the file size got larger than 4096 it found "crashes", even though
> running it manually with the same input file results in a simple
> "Input too large".

Actually -- I had forgotten that over the weekend I came up with
another hypothesis: The reason feof() is returning true even when
we're not at an EOF is that we haven't called clearerr() on the
previous iteration.

Testing now with clearerr() -- if that works I'll send a patch that
fixes the root problem.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail()
  2017-08-25 17:45   ` Andrew Cooper
  2017-09-14 17:06     ` George Dunlap
@ 2017-09-25 11:40     ` George Dunlap
  1 sibling, 0 replies; 43+ messages in thread
From: George Dunlap @ 2017-09-25 11:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Jan Beulich, Ian Jackson

On Fri, Aug 25, 2017 at 6:45 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 25/08/17 17:43, George Dunlap wrote:
>> Rather than open-coding the "read" from the input file.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>
> This patch fills me with dread.
>
> How about data_read() and data_available() which are slightly more
> descriptive?

Actually, data_read() is actually already being used by the function
which will 'randomly' fail.

How about 'input_read' and 'input_available'?

 -george

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2017-09-25 11:40 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-25 16:43 [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook George Dunlap
2017-08-25 16:43 ` [PATCH 02/14] x86emul/fuzz: add rudimentary limit checking George Dunlap
2017-08-25 16:43 ` [PATCH 03/14] fuzz/x86_emulate: Actually use cpu_regs input George Dunlap
2017-09-15 11:21   ` Wei Liu
2017-08-25 16:43 ` [PATCH 04/14] fuzz/x86_emulate: Add a better input size check George Dunlap
2017-08-25 17:42   ` Andrew Cooper
2017-09-15 11:39   ` Wei Liu
2017-09-25  9:36     ` George Dunlap
2017-09-25 11:08       ` George Dunlap
2017-08-25 16:43 ` [PATCH 05/14] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
2017-09-15 11:41   ` Wei Liu
2017-09-15 11:47     ` George Dunlap
2017-08-25 16:43 ` [PATCH 06/14] fuzz/x86_emulate: Implement dread() and davail() George Dunlap
2017-08-25 17:45   ` Andrew Cooper
2017-09-14 17:06     ` George Dunlap
2017-09-25 11:40     ` George Dunlap
2017-08-25 16:43 ` [PATCH 07/14] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
2017-09-15 11:45   ` Wei Liu
2017-08-25 16:43 ` [PATCH 08/14] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
2017-09-15 12:55   ` Wei Liu
2017-09-15 12:57   ` Wei Liu
2017-09-15 13:28     ` George Dunlap
2017-08-25 16:43 ` [PATCH 09/14] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
2017-09-15 13:07   ` Wei Liu
2017-09-15 13:27     ` George Dunlap
2017-09-15 13:42       ` Wei Liu
2017-08-25 16:43 ` [PATCH 10/14] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
2017-08-25 16:43 ` [PATCH 11/14] fuzz/x86_emulate: Make input more compact George Dunlap
2017-08-25 16:52   ` George Dunlap
2017-08-25 17:59   ` Andrew Cooper
2017-08-28  9:10     ` George Dunlap
2017-08-25 16:43 ` [PATCH 12/14] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
2017-09-15 13:30   ` Wei Liu
2017-08-25 16:43 ` [PATCH 13/14] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
2017-08-25 16:43 ` [PATCH 14/14] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
2017-09-15 13:38   ` Wei Liu
2017-09-15 13:55     ` George Dunlap
2017-09-19 10:05       ` Wei Liu
2017-08-25 17:37 ` [PATCH 01/14] fuzz/x86_emulate: Remove redundant AFL hook Andrew Cooper
2017-08-28 10:34   ` George Dunlap
2017-09-14 15:26     ` George Dunlap
2017-09-22 15:47   ` George Dunlap
2017-09-22 16:09     ` Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.