* [PATCH v4 0/7] fuzz: improve crash case minimization
@ 2020-12-29  4:39 Qiuhao Li
  2020-12-29  4:40 ` [PATCH v4 1/7] fuzz: accelerate non-crash detection Qiuhao Li
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:39 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

Extend and refine the crash case minimization process.

Test input:
  Bug 1909261 full_reproducer
  6500 QTest instructions (write mostly)

Refined (-M1 minimization level) vs. Original version:
  real  38m31.942s  <-- real  532m57.192s
  user  28m18.188s  <-- user  89m0.536s
  sys   12m42.239s  <-- sys   50m33.074s
  2558 instructions <-- 2846 instructions

Test Environment:
  i7-8550U, 16GB LPDDR3, SSD 
  Ubuntu 20.04.1 5.4.0-58-generic x86_64
  Python 3.8.5

v4:
  Fix: messy diff in [PATCH v3 4/7]

v3:
  Fix: checkpatch.pl errors

v2: 
  New: [PATCH v2 1/7]
  New: [PATCH v2 2/7]
  New: [PATCH v2 4/7]
  New: [PATCH v2 6/7]
  New: [PATCH v2 7/7]
  Fix: [PATCH 2/4] split using binary approach
  Fix: [PATCH 3/4] typo in comments
  Discard: [PATCH 1/4] the hardcoded regex match for crash detection
  Discard: [PATCH 4/4] the delaying minimizer
  
Thanks for the suggestions from:
  Alexander Bulekov

Qiuhao Li (7):
  fuzz: accelerate non-crash detection
  fuzz: double the IOs to remove for every loop
  fuzz: split write operand using binary approach
  fuzz: loop the remove minimizer and refactoring
  fuzz: set bits in operand of write/out to zero
  fuzz: add minimization options
  fuzz: heuristic split write based on past IOs

 scripts/oss-fuzz/minimize_qtest_trace.py | 257 ++++++++++++++++++-----
 1 file changed, 209 insertions(+), 48 deletions(-)

-- 
2.25.1




* [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-07  3:42   ` Alexander Bulekov
  2021-01-07  4:18   ` Alexander Bulekov
  2020-12-29  4:40 ` [PATCH v4 2/7] fuzz: double the IOs to remove for every loop Qiuhao Li
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

During minimization we spend much of the time waiting for the timeout program
to kill QEMU after the time limit expires. This patch instead watches QTest's
output for the CLOSED notification (which indicates that the redirected input
file has been closed) and treats it as proof that the trace did not crash.
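
As a rough illustration of that early exit (a sketch only; classify_output
is a hypothetical helper, not a function in the script), the output can be
scanned line by line, stopping at whichever of the two markers appears first:

```python
def classify_output(lines, crash_token):
    """Return True if the crash token shows up before CLOSED, else False.

    `lines` is any iterable of text lines, for example the stdout pipe of
    the QEMU subprocess opened in text mode.
    """
    for line in lines:
        if "CLOSED" in line:
            # QTest reached the end of its input without crashing.
            return False
        if crash_token in line:
            return True
    # End of output without either marker: treat as "no crash".
    return False
```

Because the function returns on the first marker, a non-crashing run is
recognized as soon as QTest finishes reading the trace, instead of after the
full timeout.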

Test with quadrupled trace input at:
  https://bugs.launchpad.net/qemu/+bug/1890333/comments/1

Original version:
  real	1m37.246s
  user	0m13.069s
  sys	0m8.399s

Refined version:
  real	0m45.904s
  user	0m16.874s
  sys	0m10.042s

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 41 ++++++++++++++++--------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 5e405a0d5f..aa69c7963e 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually set a string that idenitifes the
 crash by setting CRASH_TOKEN=
 """.format((sys.argv[0])))
 
+deduplication_note = """\n\
+Note: While trimming the input, sometimes the mutated trace triggers a different
+crash output but indicates the same bug. Under this situation, our minimizer is
+incapable of recognizing and stopped from removing it. In the future, we may
+use a more sophisticated crash case deduplication method.
+\n"""
+
 def check_if_trace_crashes(trace, path):
-    global CRASH_TOKEN
     with open(path, "w") as tracefile:
         tracefile.write("".join(trace))
 
-    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 2>&1\
+    proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\
     < {trace_path}".format(timeout=TIMEOUT,
                            qemu_path=QEMU_PATH,
                            qemu_args=QEMU_ARGS,
                            trace_path=path),
                           shell=True,
                           stdin=subprocess.PIPE,
-                          stdout=subprocess.PIPE)
-    stdo = rc.communicate()[0]
-    output = stdo.decode('unicode_escape')
-    if rc.returncode == 137:    # Timed Out
-        return False
-    if len(output.splitlines()) < 2:
-        return False
-
+                          stdout=subprocess.PIPE,
+                          encoding="utf-8")
+    global CRASH_TOKEN
     if CRASH_TOKEN is None:
-        CRASH_TOKEN = output.splitlines()[-2]
+        try:
+            outs, _ = proc.communicate(timeout=5)
+            CRASH_TOKEN = outs.splitlines()[-2]
+        except subprocess.TimeoutExpired:
+            print("subprocess.TimeoutExpired")
+            return False
+        print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
+        global deduplication_note
+        print(deduplication_note)
+        return True
 
-    return CRASH_TOKEN in output
+    for line in iter(proc.stdout.readline, ""):
+        if "CLOSED" in line:
+            return False
+        if CRASH_TOKEN in line:
+            return True
+
+    return False
 
 
 def minimize_trace(inpath, outpath):
@@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
     print("Crashed in {} seconds".format(end-start))
     TIMEOUT = (end-start)*5
     print("Setting the timeout for {} seconds".format(TIMEOUT))
-    print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
 
     i = 0
     newtrace = trace[:]
-- 
2.25.1




* [PATCH v4 2/7] fuzz: double the IOs to remove for every loop
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
  2020-12-29  4:40 ` [PATCH v4 1/7] fuzz: accelerate non-crash detection Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-07  4:19   ` Alexander Bulekov
  2020-12-29  4:40 ` [PATCH v4 3/7] fuzz: split write operand using binary approach Qiuhao Li
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

Instead of removing IO instructions one by one, we can try deleting multiple
instructions at once. Relying on locality of reference, we double the number
of instructions to remove after every successful round and reset it to one as
soon as a removal fails.

The speedup is usually significant for large inputs.
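
A minimal standalone sketch of the doubling strategy follows. It is not the
code from the diff: remove_with_doubling and the still_crashes callback are
hypothetical names, and the callback stands in for running QEMU on the
candidate trace.

```python
def remove_with_doubling(trace, still_crashes):
    """Blank out runs of lines, doubling the run length after each success."""
    trace = list(trace)
    i = 0
    step = 1
    while i < len(trace):
        if i + step > len(trace):
            step = 1  # not enough lines left for a full run
        saved = trace[i:i + step]
        trace[i:i + step] = [""] * step
        if still_crashes(trace):
            i += step
            step *= 2  # success: be more aggressive next time
            continue
        trace[i:i + step] = saved  # failure: restore the run
        if step > 1:
            step = 1  # retry the same position one line at a time
            continue
        i += 1  # this single line is needed, move on
    return [line for line in trace if line != ""]
```

For a trace where only two lines matter, long stretches of irrelevant lines
are removed in a logarithmic number of attempts rather than one per line.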

Test with quadrupled trace input at:
  https://bugs.launchpad.net/qemu/+bug/1890333/comments/1

Patched 1/7 version:
  real  0m45.904s
  user  0m16.874s
  sys   0m10.042s

Refined version:
  real  0m11.412s
  user  0m6.888s
  sys   0m3.325s

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 33 +++++++++++++++---------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index aa69c7963e..0b665ae657 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -85,19 +85,28 @@ def minimize_trace(inpath, outpath):
 
     i = 0
     newtrace = trace[:]
-    # For each line
+    remove_step = 1
     while i < len(newtrace):
-        # 1.) Try to remove it completely and reproduce the crash. If it works,
-        # we're done.
-        prior = newtrace[i]
-        print("Trying to remove {}".format(newtrace[i]))
-        # Try to remove the line completely
-        newtrace[i] = ""
+        # 1.) Try to remove lines completely and reproduce the crash.
+        # If it works, we're done.
+        if (i+remove_step) >= len(newtrace):
+            remove_step = 1
+        prior = newtrace[i:i+remove_step]
+        for j in range(i, i+remove_step):
+            newtrace[j] = ""
+        print("Removing {lines} ...".format(lines=prior))
         if check_if_trace_crashes(newtrace, outpath):
-            i += 1
+            i += remove_step
+            # Double the number of lines to remove for next round
+            remove_step *= 2
             continue
-        newtrace[i] = prior
-
+        # Failed to remove multiple IOs, fast recovery
+        if remove_step > 1:
+            for j in range(i, i+remove_step):
+                newtrace[j] = prior[j-i]
+            remove_step = 1
+            continue
+        newtrace[i] = prior[0] # remove_step = 1
         # 2.) Try to replace write{bwlq} commands with a write addr, len
         # command. Since this can require swapping endianness, try both LE and
         # BE options. We do this, so we can "trim" the writes in (3)
@@ -118,7 +127,7 @@ def minimize_trace(inpath, outpath):
                 if(check_if_trace_crashes(newtrace, outpath)):
                     break
             else:
-                newtrace[i] = prior
+                newtrace[i] = prior[0]
 
         # 3.) If it is a qtest write command: write addr len data, try to split
         # it into two separate write commands. If splitting the write down the
@@ -151,7 +160,7 @@ def minimize_trace(inpath, outpath):
                 if check_if_trace_crashes(newtrace, outpath):
                     i -= 1
                 else:
-                    newtrace[i] = prior
+                    newtrace[i] = prior[0]
                     del newtrace[i+1]
         i += 1
     check_if_trace_crashes(newtrace, outpath)
-- 
2.25.1




* [PATCH v4 3/7] fuzz: split write operand using binary approach
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
  2020-12-29  4:40 ` [PATCH v4 1/7] fuzz: accelerate non-crash detection Qiuhao Li
  2020-12-29  4:40 ` [PATCH v4 2/7] fuzz: double the IOs to remove for every loop Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-07  4:28   ` Alexander Bulekov
  2020-12-29  4:40 ` [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring Qiuhao Li
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

Currently, we split the write commands' data from the middle. If that does not
work, we move the pivot left by one byte and retry until there is no space
left.

However, this method has two flaws:

1. It may fail to trim all unnecessary bytes on the right side.

For example, there is an IO write command:

  write addr uuxxxxuu

u is a byte that is unnecessary for the crash. Unlike RAM write commands, in
most cases a split IO write won't trigger the same crash. So if we split from
the middle, we get:

  write addr uu (will be removed in the next round)
  write addr xxxxuu

For xxxxuu, splitting from the middle and then moving the pivot left, all the
way to the leftmost byte, never reproduces the same crash, so the last two
bytes are never removed.

2. The algorithm complexity is O(n) since we move the pivot byte by byte.

To solve the first issue, we can try the symmetrical position on the right if
we fail on the left. As for the second issue, instead of moving by one byte,
we can approach the boundary exponentially, achieving O(log(n)).

For example:

                   xxxxuu len=6
                        +
                        |
                        +
                 xxx,xuu 6/2=3 fail
                        +
         +--------------+-------------+
         |                            |
         +                            +
  xx,xxuu 6/2^2=1 fail         xxxxu,u 6-1=5 success
                                 +   +
         +------------------+----+   |
         |                  |        +-------------+ u removed
         +                  +
   xx,xxu 5/2=2 fail  xxxx,u 6-2=4 success
                           +
                           |
                           +-----------+ u removed

In some rare cases, this algorithm will fail to trim all unnecessary bytes:

  xxxxxxxxxuxxxxxx
  xxxxxxxx-xuxxxxxx Fail
  xxxx-xxxxxuxxxxxx Fail
  xxxxxxxxxuxx-xxxx Fail
  ...

I think the trade-off is worth it.
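
To make the order of attempts concrete, here is a small sketch that only
enumerates the pivot positions (left-hand lengths) tried for a write of a
given length, assuming every attempt fails. split_pivots is a hypothetical
helper written for illustration; it mirrors the loop in the diff below but
does not run QEMU.

```python
def split_pivots(length):
    """List the left-part lengths tried, in order, if no split ever works."""
    pivots = []
    power = 1
    left = length // 2
    right = length - left
    while left > 0:
        pivots.append(left)
        if left < right:
            # Failed on the left: try the symmetrical pivot on the right.
            left, right = right, left
            continue
        # Both sides failed: halve the distance to the boundary.
        power += 1
        left = length // (2 ** power)
        right = length - left
    return pivots
```

For length 6 this gives 3, 1, 5, and for length 16 it gives 8, 4, 12, 2, 14,
1, 15, so the number of attempts grows with log(n) rather than n.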

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 29 ++++++++++++++++--------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 0b665ae657..1a26bf5b93 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -94,7 +94,7 @@ def minimize_trace(inpath, outpath):
         prior = newtrace[i:i+remove_step]
         for j in range(i, i+remove_step):
             newtrace[j] = ""
-        print("Removing {lines} ...".format(lines=prior))
+        print("Removing {lines} ...\n".format(lines=prior))
         if check_if_trace_crashes(newtrace, outpath):
             i += remove_step
             # Double the number of lines to remove for next round
@@ -107,9 +107,11 @@ def minimize_trace(inpath, outpath):
             remove_step = 1
             continue
         newtrace[i] = prior[0] # remove_step = 1
+
         # 2.) Try to replace write{bwlq} commands with a write addr, len
         # command. Since this can require swapping endianness, try both LE and
         # BE options. We do this, so we can "trim" the writes in (3)
+
         if (newtrace[i].startswith("write") and not
             newtrace[i].startswith("write ")):
             suffix = newtrace[i].split()[0][-1]
@@ -130,11 +132,15 @@ def minimize_trace(inpath, outpath):
                 newtrace[i] = prior[0]
 
         # 3.) If it is a qtest write command: write addr len data, try to split
-        # it into two separate write commands. If splitting the write down the
-        # middle does not work, try to move the pivot "left" and retry, until
-        # there is no space left. The idea is to prune unneccessary bytes from
-        # long writes, while accommodating arbitrary MemoryRegion access sizes
-        # and alignments.
+        # it into two separate write commands. If splitting the data operand
+        # from length/2^n bytes to the left does not work, try to move the pivot
+        # to the right side, then add one to n, until length/2^n == 0. The idea
+        # is to prune unneccessary bytes from long writes, while accommodating
+        # arbitrary MemoryRegion access sizes and alignments.
+
+        # This algorithm will fail under some rare situations.
+        # e.g., xxxxxxxxxuxxxxxx (u is the unnecessary byte)
+
         if newtrace[i].startswith("write "):
             addr = int(newtrace[i].split()[1], 16)
             length = int(newtrace[i].split()[2], 16)
@@ -143,6 +149,7 @@ def minimize_trace(inpath, outpath):
                 leftlength = int(length/2)
                 rightlength = length - leftlength
                 newtrace.insert(i+1, "")
+                power = 1
                 while leftlength > 0:
                     newtrace[i] = "write {addr} {size} 0x{data}\n".format(
                             addr=hex(addr),
@@ -154,9 +161,13 @@ def minimize_trace(inpath, outpath):
                             data=data[leftlength*2:])
                     if check_if_trace_crashes(newtrace, outpath):
                         break
-                    else:
-                        leftlength -= 1
-                        rightlength += 1
+                    # move the pivot to right side
+                    if leftlength < rightlength:
+                        rightlength, leftlength = leftlength, rightlength
+                        continue
+                    power += 1
+                    leftlength = int(length/pow(2, power))
+                    rightlength = length - leftlength
                 if check_if_trace_crashes(newtrace, outpath):
                     i -= 1
                 else:
-- 
2.25.1




* [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
                   ` (2 preceding siblings ...)
  2020-12-29  4:40 ` [PATCH v4 3/7] fuzz: split write operand using binary approach Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-07  4:53   ` Alexander Bulekov
  2020-12-29  4:40 ` [PATCH v4 5/7] fuzz: set bits in operand of write/out to zero Qiuhao Li
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

Currently the removal minimizer uses a one-time scan-and-remove strategy,
which is not suitable for timing-dependent instructions.

For example, instruction A sets the address where the config chunk is
located, and instruction B makes the configuration active. If we have the
following instruction sequence:

...
A1
B1
A2
B2
...

A2 and B2 are the actual instructions that trigger the bug.

If we scan from top to bottom and remove A1, the behavior of B1 becomes
unpredictable and the program may no longer crash, so A1 is kept. We will
still successfully remove B1 later, because A2 and B2 crash the process
anyway:

...
A1
A2
B2
...

One more trimming pass will then remove A1.

In the perfect case, we would need to be able to remove A and B (or C!) at
the same time. But for now, let's just add a loop around the minimizer.

Since we only remove instructions, this iterative algorithm converges.
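
The A1/B1/A2/B2 scenario can be reproduced with a toy model. Everything in
this sketch is an assumption made for illustration: crashes() is an invented
stand-in for running QEMU, one_pass() is a simplified single-line remover,
and minimize_until_stable() is a hypothetical name for the outer loop.

```python
def crashes(trace):
    """Toy device: Bn activates the config most recently set by An."""
    config = None
    for ins in trace:
        if ins.startswith("A"):
            config = ins[1:]
        elif ins.startswith("B"):
            if config is None:
                return False  # no config set: assume the run stalls
            if config == "2":
                return True   # the buggy configuration
    return False


def one_pass(trace):
    """Single top-to-bottom scan, blanking each line that is not needed."""
    trace = list(trace)
    for i in range(len(trace)):
        saved = trace[i]
        trace[i] = ""
        if not crashes([t for t in trace if t]):
            trace[i] = saved
    return trace


def minimize_until_stable(trace, minimizer):
    """Repeat the minimizer until a pass no longer shortens the trace."""
    trace = list(trace)
    old_len = len(trace) + 1
    while old_len > len(trace):
        old_len = len(trace)
        trace = [line for line in minimizer(trace) if line != ""]
    return trace
```

A single pass over A1, B1, A2, B2 leaves A1, A2, B2; the loop removes A1 on
the second pass and stops after the third, which removes nothing.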

Tested with Bug 1908062.

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 41 +++++++++++++++---------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 1a26bf5b93..378a7ccec6 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -71,21 +71,9 @@ def check_if_trace_crashes(trace, path):
     return False
 
 
-def minimize_trace(inpath, outpath):
-    global TIMEOUT
-    with open(inpath) as f:
-        trace = f.readlines()
-    start = time.time()
-    if not check_if_trace_crashes(trace, outpath):
-        sys.exit("The input qtest trace didn't cause a crash...")
-    end = time.time()
-    print("Crashed in {} seconds".format(end-start))
-    TIMEOUT = (end-start)*5
-    print("Setting the timeout for {} seconds".format(TIMEOUT))
-
-    i = 0
-    newtrace = trace[:]
+def remove_minimizer(newtrace, outpath):
     remove_step = 1
+    i = 0
     while i < len(newtrace):
         # 1.) Try to remove lines completely and reproduce the crash.
         # If it works, we're done.
@@ -174,7 +162,30 @@ def minimize_trace(inpath, outpath):
                     newtrace[i] = prior[0]
                     del newtrace[i+1]
         i += 1
-    check_if_trace_crashes(newtrace, outpath)
+
+
+def minimize_trace(inpath, outpath):
+    global TIMEOUT
+    with open(inpath) as f:
+        trace = f.readlines()
+    start = time.time()
+    if not check_if_trace_crashes(trace, outpath):
+        sys.exit("The input qtest trace didn't cause a crash...")
+    end = time.time()
+    print("Crashed in {} seconds".format(end-start))
+    TIMEOUT = (end-start)*5
+    print("Setting the timeout for {} seconds".format(TIMEOUT))
+
+    newtrace = trace[:]
+
+    # remove minimizer
+    old_len = len(newtrace) + 1
+    while(old_len > len(newtrace)):
+        old_len = len(newtrace)
+        remove_minimizer(newtrace, outpath)
+        newtrace = list(filter(lambda s: s != "", newtrace))
+
+    assert(check_if_trace_crashes(newtrace, outpath))
 
 
 if __name__ == '__main__':
-- 
2.25.1




* [PATCH v4 5/7] fuzz: set bits in operand of write/out to zero
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
                   ` (3 preceding siblings ...)
  2020-12-29  4:40 ` [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-07  5:08   ` Alexander Bulekov
  2020-12-29  4:40 ` [PATCH v4 6/7] fuzz: add minimization options Qiuhao Li
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

Simplify the crash cases by opportunistically setting bits in the operands of
out/write commands to zero. This may help debugging, since a set bit usually
turns on or triggers a function, while zero is the default turned-off setting.
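
A minimal sketch of the bit-clearing idea on a plain integer operand, under
the assumption that a still_crashes callback (hypothetical, standing in for
a QEMU run) reports whether the candidate value still reproduces the crash.
clear_bits and to_padded_hex are illustrative names, not functions from the
script.

```python
def clear_bits(value, width_bits, still_crashes):
    """Try to clear each set bit, most significant first."""
    for bit in reversed(range(width_bits)):
        mask = 1 << bit
        if value & mask:
            candidate = value & ~mask
            if still_crashes(candidate):
                value = candidate  # the bit was not needed
    return value


def to_padded_hex(value, nbytes):
    """Format with two hex digits per byte so the operand width is kept."""
    return "0x" + format(value, "0{}x".format(2 * nbytes))
```

Formatting through a fixed width avoids both odd-length hex strings and a
data operand that silently becomes shorter once its leading bits are cleared.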

Tested Bug 1908062.

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 39 ++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 378a7ccec6..70ac0c5366 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -164,6 +164,42 @@ def remove_minimizer(newtrace, outpath):
         i += 1
 
 
+def set_zero_minimizer(newtrace, outpath):
+    # try setting bits in operands of out/write to zero
+    i = 0
+    while i < len(newtrace):
+        if (not newtrace[i].startswith("write ") and not
+           newtrace[i].startswith("out")):
+           i += 1
+           continue
+        # write ADDR SIZE DATA
+        # outx ADDR VALUE
+        print("\nzero setting bits: {}".format(newtrace[i]))
+
+        prefix = " ".join(newtrace[i].split()[:-1])
+        data = newtrace[i].split()[-1]
+        data_bin = bin(int(data, 16))
+        data_bin_list = list(data_bin)
+
+        for j in range(2, len(data_bin_list)):
+            prior = newtrace[i]
+            if (data_bin_list[j] == '1'):
+                data_bin_list[j] = '0'
+                data_try = hex(int("".join(data_bin_list), 2))
+                # It seems qtest only accepts padded hex-values.
+                if len(data_try) % 2 == 1:
+                    data_try = data_try[:2] + "0" + data_try[2:]
+
+                newtrace[i] = "{prefix} {data_try}\n".format(
+                        prefix=prefix,
+                        data_try=data_try)
+
+                if not check_if_trace_crashes(newtrace, outpath):
+                    data_bin_list[j] = '1'
+                    newtrace[i] = prior
+        i += 1
+
+
 def minimize_trace(inpath, outpath):
     global TIMEOUT
     with open(inpath) as f:
@@ -184,7 +220,10 @@ def minimize_trace(inpath, outpath):
         old_len = len(newtrace)
         remove_minimizer(newtrace, outpath)
         newtrace = list(filter(lambda s: s != "", newtrace))
+    assert(check_if_trace_crashes(newtrace, outpath))
 
+    # set zero minimizer
+    set_zero_minimizer(newtrace, outpath)
     assert(check_if_trace_crashes(newtrace, outpath))
 
 
-- 
2.25.1




* [PATCH v4 6/7] fuzz: add minimization options
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
                   ` (4 preceding siblings ...)
  2020-12-29  4:40 ` [PATCH v4 5/7] fuzz: set bits in operand of write/out to zero Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-07  5:54   ` Alexander Bulekov
  2020-12-29  4:40 ` [PATCH v4 7/7] fuzz: heuristic split write based on past IOs Qiuhao Li
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

Add two opt-in minimization levels, both off by default:

-M1: loop around the remove minimizer
-M2: try setting bits in operand of write/out to zero

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 32 +++++++++++++++++++-----
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 70ac0c5366..a681984076 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -16,6 +16,10 @@ QEMU_PATH = None
 TIMEOUT = 5
 CRASH_TOKEN = None
 
+# Minimization levels
+M1 = False # loop around the remove minimizer
+M2 = False # try setting bits in operand of write/out to zero
+
 write_suffix_lookup = {"b": (1, "B"),
                        "w": (2, "H"),
                        "l": (4, "L"),
@@ -23,10 +27,20 @@ write_suffix_lookup = {"b": (1, "B"),
 
 def usage():
     sys.exit("""\
-Usage: QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} input_trace output_trace
+Usage:
+
+QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} [Options] input_trace output_trace
+
 By default, will try to use the second-to-last line in the output to identify
 whether the crash occred. Optionally, manually set a string that idenitifes the
 crash by setting CRASH_TOKEN=
+
+Options:
+
+-M1: enable a loop around the remove minimizer, which may help decrease some
+     timing-dependent instructions. Off by default.
+-M2: try setting bits in operand of write/out to zero. Off by default.
+
 """.format((sys.argv[0])))
 
 deduplication_note = """\n\
@@ -213,24 +227,30 @@ def minimize_trace(inpath, outpath):
     print("Setting the timeout for {} seconds".format(TIMEOUT))
 
     newtrace = trace[:]
-
+    global M1, M2
     # remove minimizer
     old_len = len(newtrace) + 1
     while(old_len > len(newtrace)):
         old_len = len(newtrace)
+        print("trace length = ", old_len)
         remove_minimizer(newtrace, outpath)
+        if not M1 and not M2:
+            break
         newtrace = list(filter(lambda s: s != "", newtrace))
     assert(check_if_trace_crashes(newtrace, outpath))
 
-    # set zero minimizer
-    set_zero_minimizer(newtrace, outpath)
+    if M2:
+        set_zero_minimizer(newtrace, outpath)
     assert(check_if_trace_crashes(newtrace, outpath))
 
 
 if __name__ == '__main__':
     if len(sys.argv) < 3:
         usage()
-
+    if "-M1" in sys.argv:
+        M1 = True
+    if "-M2" in sys.argv:
+        M2 = True
     QEMU_PATH = os.getenv("QEMU_PATH")
     QEMU_ARGS = os.getenv("QEMU_ARGS")
     if QEMU_PATH is None or QEMU_ARGS is None:
@@ -239,4 +259,4 @@ if __name__ == '__main__':
     #     QEMU_ARGS += " -accel qtest"
     CRASH_TOKEN = os.getenv("CRASH_TOKEN")
     QEMU_ARGS += " -qtest stdio -monitor none -serial none "
-    minimize_trace(sys.argv[1], sys.argv[2])
+    minimize_trace(sys.argv[-2], sys.argv[-1])
-- 
2.25.1




* [PATCH v4 7/7] fuzz: heuristic split write based on past IOs
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
                   ` (5 preceding siblings ...)
  2020-12-29  4:40 ` [PATCH v4 6/7] fuzz: add minimization options Qiuhao Li
@ 2020-12-29  4:40 ` Qiuhao Li
  2021-01-08  4:30   ` Alexander Bulekov
  2021-01-05  8:00 ` Ping: [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
  2021-01-08  4:32 ` Alexander Bulekov
  8 siblings, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2020-12-29  4:40 UTC
  To: alxndr, qemu-devel
  Cc: thuth, Qiuhao Li, darren.kenny, bsd, stefanha, pbonzini

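If the previous write commands all write the same length of data at the same
address step, we treat that pattern as a hint: the next write in the pattern
(last address plus step, same length) is tried first as the replacement for
the current write, before falling back to the binary split.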
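
The hint can be sketched on already-parsed (address, length) pairs. This is
an illustration under assumptions: split_hint and apply_hint are
hypothetical names, the window of three previous writes matches HINT_LEN in
the diff below, and parsing of the qtest lines is left out.

```python
def split_hint(previous_writes, window=3):
    """Predict the next (addr, length) from the last `window` writes.

    `previous_writes` holds (addr, length) tuples, oldest first.
    Returns None unless lengths and address steps are all identical.
    """
    if len(previous_writes) < window:
        return None
    recent = previous_writes[-window:]
    lengths = {length for _, length in recent}
    steps = {b[0] - a[0] for a, b in zip(recent, recent[1:])}
    if len(lengths) != 1 or len(steps) != 1:
        return None
    return recent[-1][0] + steps.pop(), lengths.pop()


def apply_hint(addr, data_hex, hint):
    """Cut the hinted range out of a write, or None if it does not fit."""
    hint_addr, hint_len = hint
    length = len(data_hex) // 2
    if hint_addr < addr or hint_addr + hint_len > addr + length:
        return None
    start = (hint_addr - addr) * 2
    return hint_addr, data_hex[start:start + hint_len * 2]
```

If the shortened write still reproduces the crash it is kept; otherwise the
original line is restored and the binary split is attempted as before.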

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 56 ++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index a681984076..6cbf2b0419 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -85,6 +85,43 @@ def check_if_trace_crashes(trace, path):
     return False
 
 
+# If previous write commands write the same length of data at the same
+# interval, we view it as a hint.
+def split_write_hint(newtrace, i):
+    HINT_LEN = 3 # > 2
+    if i <=(HINT_LEN-1):
+        return None
+
+    #find previous continuous write traces
+    k = 0
+    l = i-1
+    writes = []
+    while (k != HINT_LEN and l >= 0):
+        if newtrace[l].startswith("write "):
+            writes.append(newtrace[l])
+            k += 1
+            l -= 1
+        elif newtrace[l] == "":
+            l -= 1
+        else:
+            return None
+    if k != HINT_LEN:
+        return None
+
+    length = int(writes[0].split()[2], 16)
+    for j in range(1, HINT_LEN):
+        if length != int(writes[j].split()[2], 16):
+            return None
+
+    step = int(writes[0].split()[1], 16) - int(writes[1].split()[1], 16)
+    for j in range(1, HINT_LEN-1):
+        if step != int(writes[j].split()[1], 16) - \
+            int(writes[j+1].split()[1], 16):
+            return None
+
+    return (int(writes[0].split()[1], 16)+step, length)
+
+
 def remove_minimizer(newtrace, outpath):
     remove_step = 1
     i = 0
@@ -148,6 +185,25 @@ def remove_minimizer(newtrace, outpath):
             length = int(newtrace[i].split()[2], 16)
             data = newtrace[i].split()[3][2:]
             if length > 1:
+
+                # Can we get a hint from previous writes?
+                hint = split_write_hint(newtrace, i)
+                if hint is not None:
+                    hint_addr = hint[0]
+                    hint_len = hint[1]
+                    if hint_addr >= addr and hint_addr+hint_len <= addr+length:
+                        newtrace[i] = "write {addr} {size} 0x{data}\n".format(
+                            addr=hex(hint_addr),
+                            size=hex(hint_len),
+                            data=data[(hint_addr-addr)*2:\
+                                (hint_addr-addr)*2+hint_len*2])
+                        if check_if_trace_crashes(newtrace, outpath):
+                            # next round
+                            i += 1
+                            continue
+                        newtrace[i] = prior[0]
+
+                # Try splitting it using a binary approach
                 leftlength = int(length/2)
                 rightlength = length - leftlength
                 newtrace.insert(i+1, "")
-- 
2.25.1




* Ping: [PATCH v4 0/7] fuzz: improve crash case minimization
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
                   ` (6 preceding siblings ...)
  2020-12-29  4:40 ` [PATCH v4 7/7] fuzz: heuristic split write based on past IOs Qiuhao Li
@ 2021-01-05  8:00 ` Qiuhao Li
  2021-01-08  4:32 ` Alexander Bulekov
  8 siblings, 0 replies; 23+ messages in thread
From: Qiuhao Li @ 2021-01-05  8:00 UTC
  To: alxndr, qemu-devel; +Cc: darren.kenny, thuth, bsd, stefanha, pbonzini

Kindly ping :)

Wondering if there is anything wrong with this patch?

On Tue, 2020-12-29 at 12:39 +0800, Qiuhao Li wrote:
> Extend and refine the crash case minimization process.
> 
> Test input:
>   Bug 1909261 full_reproducer
>   6500 QTest instructions (write mostly)
> 
> Refined (-M1 minimization level) vs. Original version:
>   real  38m31.942s  <-- real  532m57.192s
>   user  28m18.188s  <-- user  89m0.536s
>   sys   12m42.239s  <-- sys   50m33.074s
>   2558 instructions <-- 2846 instructions
> 
> Test Environment:
>   i7-8550U, 16GB LPDDR3, SSD 
>   Ubuntu 20.04.1 5.4.0-58-generic x86_64
>   Python 3.8.5
> 
> v4:
>   Fix: messy diff in [PATCH v3 4/7]
> 
> v3:
>   Fix: checkpatch.pl errors
> 
> v2: 
>   New: [PATCH v2 1/7]
>   New: [PATCH v2 2/7]
>   New: [PATCH v2 4/7]
>   New: [PATCH v2 6/7]
>   New: [PATCH v2 7/7]
>   Fix: [PATCH 2/4] split using binary approach
>   Fix: [PATCH 3/4] typo in comments
>   Discard: [PATCH 1/4] the hardcoded regex match for crash detection
>   Discard: [PATCH 4/4] the delaying minimizer
>   
> Thanks for the suggestions from:
>   Alexander Bulekov
> 
> Qiuhao Li (7):
>   fuzz: accelerate non-crash detection
>   fuzz: double the IOs to remove for every loop
>   fuzz: split write operand using binary approach
>   fuzz: loop the remove minimizer and refactoring
>   fuzz: set bits in operand of write/out to zero
>   fuzz: add minimization options
>   fuzz: heuristic split write based on past IOs
> 
>  scripts/oss-fuzz/minimize_qtest_trace.py | 257 ++++++++++++++++++---
> --
>  1 file changed, 209 insertions(+), 48 deletions(-)
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2020-12-29  4:40 ` [PATCH v4 1/7] fuzz: accelerate non-crash detection Qiuhao Li
@ 2021-01-07  3:42   ` Alexander Bulekov
  2021-01-07  4:18   ` Alexander Bulekov
  1 sibling, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  3:42 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> During minimization, we spend much of the time waiting for the timeout
> program to reach its time limit. This patch instead watches QTest's output
> for the CLOSED notification (which indicates the redirected input file was
> closed) and stops waiting as soon as it appears, since the trace did not
> crash.
> 
> Test with quadrupled trace input at:
>   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> 
> Original version:
>   real	1m37.246s
>   user	0m13.069s
>   sys	0m8.399s
> 
> Refined version:
>   real	0m45.904s
>   user	0m16.874s
>   sys	0m10.042s
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

This makes a huge difference, thanks! Maybe there is some edge-case
where the crash happens after "CLOSED" (e.g. a timer fires after the
last command), but I haven't found such an example among existing
reproducers.

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
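
For readers following the thread, here is a minimal sketch of the
early-exit idea, not the patch itself. The helper names scan_output and
run_and_scan are hypothetical. It assumes the child's stdout is opened in
text mode (as with encoding="utf-8"), in which case readline() returns the
empty string "" at end of file, so "" is the sentinel used here.

```python
import subprocess


def scan_output(lines, crash_token):
    """Decide the outcome from a stream of output lines.

    Returns False as soon as "CLOSED" shows up (input exhausted, no crash),
    True as soon as crash_token shows up, and False at end of stream.
    """
    for line in lines:
        if "CLOSED" in line:
            return False
        if crash_token in line:
            return True
    return False


def run_and_scan(cmd, crash_token):
    """Hypothetical wrapper: stream a command's output through scan_output."""
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            encoding="utf-8")
    try:
        # Text-mode pipe: readline() yields "" at EOF, which ends the iterator.
        return scan_output(iter(proc.stdout.readline, ""), crash_token)
    finally:
        # Stop the child once the answer is known, then reap it.
        proc.kill()
        proc.wait()
        proc.stdout.close()
```

Because the scan returns on the first decisive line, a crash that only
happens after "CLOSED" is printed would be reported as a non-crash, which
is the edge case mentioned above.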

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 41 ++++++++++++++++--------
>  1 file changed, 28 insertions(+), 13 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 5e405a0d5f..aa69c7963e 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually set a string that idenitifes the
>  crash by setting CRASH_TOKEN=
>  """.format((sys.argv[0])))
>  
> +deduplication_note = """\n\
> +Note: While trimming the input, sometimes the mutated trace triggers a different
> +crash output but indicates the same bug. Under this situation, our minimizer is
> +incapable of recognizing and stopped from removing it. In the future, we may
> +use a more sophisticated crash case deduplication method.
> +\n"""
> +
>  def check_if_trace_crashes(trace, path):
> -    global CRASH_TOKEN
>      with open(path, "w") as tracefile:
>          tracefile.write("".join(trace))
>  
> -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 2>&1\
> +    proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\
>      < {trace_path}".format(timeout=TIMEOUT,
>                             qemu_path=QEMU_PATH,
>                             qemu_args=QEMU_ARGS,
>                             trace_path=path),
>                            shell=True,
>                            stdin=subprocess.PIPE,
> -                          stdout=subprocess.PIPE)
> -    stdo = rc.communicate()[0]
> -    output = stdo.decode('unicode_escape')
> -    if rc.returncode == 137:    # Timed Out
> -        return False
> -    if len(output.splitlines()) < 2:
> -        return False
> -
> +                          stdout=subprocess.PIPE,
> +                          encoding="utf-8")
> +    global CRASH_TOKEN
>      if CRASH_TOKEN is None:
> -        CRASH_TOKEN = output.splitlines()[-2]
> +        try:
> +            outs, _ = proc.communicate(timeout=5)
> +            CRASH_TOKEN = outs.splitlines()[-2]
> +        except subprocess.TimeoutExpired:
> +            print("subprocess.TimeoutExpired")
> +            return False
> +        print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
> +        global deduplication_note
> +        print(deduplication_note)
> +        return True
>  
> -    return CRASH_TOKEN in output
> +    for line in iter(proc.stdout.readline, b''):
> +        if "CLOSED" in line:
> +            return False
> +        if CRASH_TOKEN in line:
> +            return True
> +
> +    return False
>  
>  
>  def minimize_trace(inpath, outpath):
> @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
>      print("Crashed in {} seconds".format(end-start))
>      TIMEOUT = (end-start)*5
>      print("Setting the timeout for {} seconds".format(TIMEOUT))
> -    print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
>  
>      i = 0
>      newtrace = trace[:]
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2020-12-29  4:40 ` [PATCH v4 1/7] fuzz: accelerate non-crash detection Qiuhao Li
  2021-01-07  3:42   ` Alexander Bulekov
@ 2021-01-07  4:18   ` Alexander Bulekov
  2021-01-08  2:47     ` Qiuhao Li
  2021-01-10 13:10     ` Qiuhao Li
  1 sibling, 2 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  4:18 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> During minimization, we spend much of the time waiting for the timeout
> program to reach its time limit. This patch instead watches QTest's output
> for the CLOSED notification (which indicates the redirected input file was
> closed) and stops waiting as soon as it appears, since the trace did not
> crash.
> 
> Test with quadrupled trace input at:
>   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> 
> Original version:
>   real	1m37.246s
>   user	0m13.069s
>   sys	0m8.399s
> 
> Refined version:
>   real	0m45.904s
>   user	0m16.874s
>   sys	0m10.042s
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 41 ++++++++++++++++--------
>  1 file changed, 28 insertions(+), 13 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 5e405a0d5f..aa69c7963e 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually set a string that idenitifes the
>  crash by setting CRASH_TOKEN=
>  """.format((sys.argv[0])))
>  
> +deduplication_note = """\n\
> +Note: While trimming the input, sometimes the mutated trace triggers a different
> +crash output but indicates the same bug. Under this situation, our minimizer is
> +incapable of recognizing and stopped from removing it. In the future, we may
> +use a more sophisticated crash case deduplication method.
> +\n"""
> +
>  def check_if_trace_crashes(trace, path):
> -    global CRASH_TOKEN
>      with open(path, "w") as tracefile:
>          tracefile.write("".join(trace))
>  
> -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 2>&1\
> +    proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\

Why remove the -s 9 here? I ran into a case where the minimizer got
stuck on one iteration. Adding back "sigkill" to the timeout can be a
safety net to catch those bad cases.
-Alex

>      < {trace_path}".format(timeout=TIMEOUT,
>                             qemu_path=QEMU_PATH,
>                             qemu_args=QEMU_ARGS,
>                             trace_path=path),
>                            shell=True,
>                            stdin=subprocess.PIPE,
> -                          stdout=subprocess.PIPE)
> -    stdo = rc.communicate()[0]
> -    output = stdo.decode('unicode_escape')
> -    if rc.returncode == 137:    # Timed Out
> -        return False
> -    if len(output.splitlines()) < 2:
> -        return False
> -
> +                          stdout=subprocess.PIPE,
> +                          encoding="utf-8")
> +    global CRASH_TOKEN
>      if CRASH_TOKEN is None:
> -        CRASH_TOKEN = output.splitlines()[-2]
> +        try:
> +            outs, _ = proc.communicate(timeout=5)
> +            CRASH_TOKEN = outs.splitlines()[-2]
> +        except subprocess.TimeoutExpired:
> +            print("subprocess.TimeoutExpired")
> +            return False
> +        print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
> +        global deduplication_note
> +        print(deduplication_note)
> +        return True
>  
> -    return CRASH_TOKEN in output
> +    for line in iter(proc.stdout.readline, b''):
> +        if "CLOSED" in line:
> +            return False
> +        if CRASH_TOKEN in line:
> +            return True
> +
> +    return False
>  
>  
>  def minimize_trace(inpath, outpath):
> @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
>      print("Crashed in {} seconds".format(end-start))
>      TIMEOUT = (end-start)*5
>      print("Setting the timeout for {} seconds".format(TIMEOUT))
> -    print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
>  
>      i = 0
>      newtrace = trace[:]
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 2/7] fuzz: double the IOs to remove for every loop
  2020-12-29  4:40 ` [PATCH v4 2/7] fuzz: double the IOs to remove for every loop Qiuhao Li
@ 2021-01-07  4:19   ` Alexander Bulekov
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  4:19 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> Instead of removing IO instructions one by one, we can try deleting multiple
> instructions at once. According to the locality of reference, we double the
> number of instructions to remove for the next round and reset it to one
> once we fail.
> 
> This patch usually makes a significant difference for large inputs.
> 
> Test with quadrupled trace input at:
>   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> 
> Patched 1/6 version:
>   real  0m45.904s
>   user  0m16.874s
>   sys   0m10.042s
> 
> Refined version:
>   real  0m11.412s
>   user  0m6.888s
>   sys   0m3.325s
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
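
A minimal sketch of the doubling strategy described above, for
illustration only. The function name remove_with_doubling and the
still_crashes callback are hypothetical stand-ins for the script's loop
and for check_if_trace_crashes; removed lines are blanked in place, as in
the patch, and filtered out at the end.

```python
def remove_with_doubling(trace, still_crashes):
    """Blank out runs of lines, doubling the run length after each success."""
    trace = list(trace)
    i = 0
    step = 1
    while i < len(trace):
        # Near the end of the trace, fall back to single-line removal.
        if i + step >= len(trace):
            step = 1
        saved = trace[i:i + step]
        trace[i:i + step] = [""] * step
        if still_crashes(trace):
            i += step
            step *= 2          # success: try twice as many lines next time
            continue
        trace[i:i + step] = saved
        if step > 1:
            step = 1           # failure on a run: retry this spot line by line
            continue
        i += 1                 # this single line is needed, keep it
    return [line for line in trace if line != ""]


# Toy predicate: the "crash" needs both X and Y to survive.
needs_x_and_y = lambda t: "X" in t and "Y" in t
print(remove_with_doubling(["a", "b", "X", "c", "d", "Y", "e"], needs_x_and_y))
```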

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 33 +++++++++++++++---------
>  1 file changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index aa69c7963e..0b665ae657 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -85,19 +85,28 @@ def minimize_trace(inpath, outpath):
>  
>      i = 0
>      newtrace = trace[:]
> -    # For each line
> +    remove_step = 1
>      while i < len(newtrace):
> -        # 1.) Try to remove it completely and reproduce the crash. If it works,
> -        # we're done.
> -        prior = newtrace[i]
> -        print("Trying to remove {}".format(newtrace[i]))
> -        # Try to remove the line completely
> -        newtrace[i] = ""
> +        # 1.) Try to remove lines completely and reproduce the crash.
> +        # If it works, we're done.
> +        if (i+remove_step) >= len(newtrace):
> +            remove_step = 1
> +        prior = newtrace[i:i+remove_step]
> +        for j in range(i, i+remove_step):
> +            newtrace[j] = ""
> +        print("Removing {lines} ...".format(lines=prior))
>          if check_if_trace_crashes(newtrace, outpath):
> -            i += 1
> +            i += remove_step
> +            # Double the number of lines to remove for next round
> +            remove_step *= 2
>              continue
> -        newtrace[i] = prior
> -
> +        # Failed to remove multiple IOs, fast recovery
> +        if remove_step > 1:
> +            for j in range(i, i+remove_step):
> +                newtrace[j] = prior[j-i]
> +            remove_step = 1
> +            continue
> +        newtrace[i] = prior[0] # remove_step = 1
>          # 2.) Try to replace write{bwlq} commands with a write addr, len
>          # command. Since this can require swapping endianness, try both LE and
>          # BE options. We do this, so we can "trim" the writes in (3)
> @@ -118,7 +127,7 @@ def minimize_trace(inpath, outpath):
>                  if(check_if_trace_crashes(newtrace, outpath)):
>                      break
>              else:
> -                newtrace[i] = prior
> +                newtrace[i] = prior[0]
>  
>          # 3.) If it is a qtest write command: write addr len data, try to split
>          # it into two separate write commands. If splitting the write down the
> @@ -151,7 +160,7 @@ def minimize_trace(inpath, outpath):
>                  if check_if_trace_crashes(newtrace, outpath):
>                      i -= 1
>                  else:
> -                    newtrace[i] = prior
> +                    newtrace[i] = prior[0]
>                      del newtrace[i+1]
>          i += 1
>      check_if_trace_crashes(newtrace, outpath)
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/7] fuzz: split write operand using binary approach
  2020-12-29  4:40 ` [PATCH v4 3/7] fuzz: split write operand using binary approach Qiuhao Li
@ 2021-01-07  4:28   ` Alexander Bulekov
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  4:28 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> Currently, we split the write commands' data from the middle. If it does not
> work, try to move the pivot left by one byte and retry until there is no
> space.
> 
> But, this method has two flaws:
> 
> 1. It may fail to trim all unnecessary bytes on the right side.
> 
> For example, there is an IO write command:
> 
>   write addr uuxxxxuu
> 
> u is a byte that is unnecessary for the crash. Unlike RAM write commands, in
> most cases a split IO write won't trigger the same crash. So if we split from
> the middle, we will get:
> 
>   write addr uu (will be removed in next round)
>   write addr xxxxuu
> 
> For xxxxuu, neither splitting it from the middle nor moving the pivot left
> down to the leftmost byte reproduces the same crash, so we are prevented
> from removing the last two bytes.
> 
> 2. The algorithm complexity is O(n) since we move the pivot byte by byte.
> 
> To solve the first issue, we can try a symmetrical position on the right if
> we fail on the left. As for the second issue, instead of moving by one byte,
> we can approach the boundary exponentially, achieving O(log(n)).
> 
> For example:
> 
>                    xxxxuu len=6
>                         +
>                         |
>                         +
>                  xxx,xuu 6/2=3 fail
>                         +
>          +--------------+-------------+
>          |                            |
>          +                            +
>   x,xxxuu 6/2^2=1 fail         xxxxu,u 6-1=5 success
>                                  +   +
>          +------------------+----+   |
>          |                  |        +-------------+ u removed
>          +                  +
>    xx,xxu 5/2=2 fail  xxxx,u 6-2=4 success
>                            +
>                            |
>                            +-----------+ u removed
> 
> In some rare cases, this algorithm will fail to trim all unnecessary bytes:
> 
>   xxxxxxxxxuxxxxxx
>   xxxxxxxx-xuxxxxxx Fail
>   xxxx-xxxxxuxxxxxx Fail
>   xxxxxxxxxuxx-xxxx Fail
>   ...
> 
> I think the trade-off is worth it.
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
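
To make the pivot order concrete, here is a small sketch that only
enumerates the (left, right) split lengths in the order they would be
tried, following the update rule in the hunk quoted below. The generator
name pivot_candidates is hypothetical; in the real loop the caller stops
at the first split that still crashes.

```python
def pivot_candidates(length):
    """Yield (leftlength, rightlength) pairs in the order they are tried."""
    power = 1
    left = length // 2
    right = length - left
    while left > 0:
        yield left, right
        # First mirror the pivot to the symmetrical position on the right.
        if left < right:
            left, right = right, left
            continue
        # Then halve the distance from the boundary: length / 2^power.
        power += 1
        left = length // 2 ** power
        right = length - left


print(list(pivot_candidates(6)))
```

For a 6-byte operand this tries the pivots 3, 1 and 5, matching the first
three nodes of the diagram in the commit message. The number of
candidates grows with log2(length) rather than with length.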

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 29 ++++++++++++++++--------
>  1 file changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 0b665ae657..1a26bf5b93 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -94,7 +94,7 @@ def minimize_trace(inpath, outpath):
>          prior = newtrace[i:i+remove_step]
>          for j in range(i, i+remove_step):
>              newtrace[j] = ""
> -        print("Removing {lines} ...".format(lines=prior))
> +        print("Removing {lines} ...\n".format(lines=prior))
>          if check_if_trace_crashes(newtrace, outpath):
>              i += remove_step
>              # Double the number of lines to remove for next round
> @@ -107,9 +107,11 @@ def minimize_trace(inpath, outpath):
>              remove_step = 1
>              continue
>          newtrace[i] = prior[0] # remove_step = 1
> +
>          # 2.) Try to replace write{bwlq} commands with a write addr, len
>          # command. Since this can require swapping endianness, try both LE and
>          # BE options. We do this, so we can "trim" the writes in (3)
> +
>          if (newtrace[i].startswith("write") and not
>              newtrace[i].startswith("write ")):
>              suffix = newtrace[i].split()[0][-1]
> @@ -130,11 +132,15 @@ def minimize_trace(inpath, outpath):
>                  newtrace[i] = prior[0]
>  
>          # 3.) If it is a qtest write command: write addr len data, try to split
> -        # it into two separate write commands. If splitting the write down the
> -        # middle does not work, try to move the pivot "left" and retry, until
> -        # there is no space left. The idea is to prune unneccessary bytes from
> -        # long writes, while accommodating arbitrary MemoryRegion access sizes
> -        # and alignments.
> +        # it into two separate write commands. If splitting the data operand
> +        # from length/2^n bytes to the left does not work, try to move the pivot
> +        # to the right side, then add one to n, until length/2^n == 0. The idea
> +        # is to prune unneccessary bytes from long writes, while accommodating
> +        # arbitrary MemoryRegion access sizes and alignments.
> +
> +        # This algorithm will fail under some rare situations.
> +        # e.g., xxxxxxxxxuxxxxxx (u is the unnecessary byte)
> +
>          if newtrace[i].startswith("write "):
>              addr = int(newtrace[i].split()[1], 16)
>              length = int(newtrace[i].split()[2], 16)
> @@ -143,6 +149,7 @@ def minimize_trace(inpath, outpath):
>                  leftlength = int(length/2)
>                  rightlength = length - leftlength
>                  newtrace.insert(i+1, "")
> +                power = 1
>                  while leftlength > 0:
>                      newtrace[i] = "write {addr} {size} 0x{data}\n".format(
>                              addr=hex(addr),
> @@ -154,9 +161,13 @@ def minimize_trace(inpath, outpath):
>                              data=data[leftlength*2:])
>                      if check_if_trace_crashes(newtrace, outpath):
>                          break
> -                    else:
> -                        leftlength -= 1
> -                        rightlength += 1
> +                    # move the pivot to right side
> +                    if leftlength < rightlength:
> +                        rightlength, leftlength = leftlength, rightlength
> +                        continue
> +                    power += 1
> +                    leftlength = int(length/pow(2, power))
> +                    rightlength = length - leftlength
>                  if check_if_trace_crashes(newtrace, outpath):
>                      i -= 1
>                  else:
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring
  2020-12-29  4:40 ` [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring Qiuhao Li
@ 2021-01-07  4:53   ` Alexander Bulekov
  2021-01-08  2:49     ` Qiuhao Li
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  4:53 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> Now we use a one-time scan-and-remove strategy in the remove minimizer,
> which is not suitable for timing-dependent instructions.
> 
> For example, instruction A will indicate an address where the config
> chunk is located, and instruction B will make the configuration active. If
> we have the following instruction sequence:
> 
> ...
> A1
> B1
> A2
> B2
> ...
> 
> A2 and B2 are the actual instructions that trigger the bug.
> 
> If we scan from top to bottom, after we remove A1, the behavior of B1
> becomes unpredictable and the program may no longer crash, so A1 is kept.
> But we will successfully remove B1 later because A2 and B2 will crash the
> process anyway:
> 
> ...
> A1
> A2
> B2
> ...
> 
> Now one more trimming pass will remove A1.
> 
> In the perfect case, we would need to be able to remove A and B (or C!) at
> the same time. But for now, let's just add a loop around the minimizer.
> 
> Since we only remove instructions, this iterative algorithm converges.
> 
> Tested with Bug 1908062.
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Small note below, but otherwise:
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
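
The A1/B1/A2/B2 scenario can be reproduced with a toy model. This is a
sketch under stated assumptions, not the script's code: crashes() is an
invented stand-in for running QEMU (a B with no prior A wedges the device
so nothing crashes afterwards; a B after A2 crashes), and remove_pass /
minimize_to_fixed_point are hypothetical names.

```python
def crashes(trace):
    """Toy device model standing in for check_if_trace_crashes."""
    config = None
    for op in trace:
        if op.startswith("A"):
            config = op            # A selects the config chunk
        elif op.startswith("B"):
            if config is None:
                return False       # B without any A: device wedges, no crash
            if config == "A2":
                return True        # B activating A2's config crashes
    return False


def remove_pass(trace, still_crashes):
    """One top-to-bottom scan, dropping each line whose removal still crashes."""
    trace = list(trace)
    i = 0
    while i < len(trace):
        candidate = trace[:i] + trace[i + 1:]
        if still_crashes(candidate):
            trace = candidate
        else:
            i += 1
    return trace


def minimize_to_fixed_point(trace, still_crashes):
    """Repeat the pass until the trace stops shrinking."""
    old_len = len(trace) + 1
    while old_len > len(trace):
        old_len = len(trace)
        trace = remove_pass(trace, still_crashes)
    return trace


print(remove_pass(["A1", "B1", "A2", "B2"], crashes))
print(minimize_to_fixed_point(["A1", "B1", "A2", "B2"], crashes))
```

A single pass leaves A1 behind because removing it while B1 is still
present hides the crash; the second pass removes it. The loop terminates
because every pass either shortens the trace or leaves its length
unchanged.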

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 41 +++++++++++++++---------
>  1 file changed, 26 insertions(+), 15 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 1a26bf5b93..378a7ccec6 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -71,21 +71,9 @@ def check_if_trace_crashes(trace, path):
>      return False
>  
>  
> -def minimize_trace(inpath, outpath):
> -    global TIMEOUT
> -    with open(inpath) as f:
> -        trace = f.readlines()
> -    start = time.time()
> -    if not check_if_trace_crashes(trace, outpath):
> -        sys.exit("The input qtest trace didn't cause a crash...")
> -    end = time.time()
> -    print("Crashed in {} seconds".format(end-start))
> -    TIMEOUT = (end-start)*5
> -    print("Setting the timeout for {} seconds".format(TIMEOUT))
> -
> -    i = 0
> -    newtrace = trace[:]
> +def remove_minimizer(newtrace, outpath):

Maybe a different name for this function?
e.g. minimize_each_line or minimize_iter

-Alex

>      remove_step = 1
> +    i = 0
>      while i < len(newtrace):
>          # 1.) Try to remove lines completely and reproduce the crash.
>          # If it works, we're done.
> @@ -174,7 +162,30 @@ def minimize_trace(inpath, outpath):
>                      newtrace[i] = prior[0]
>                      del newtrace[i+1]
>          i += 1
> -    check_if_trace_crashes(newtrace, outpath)
> +
> +
> +def minimize_trace(inpath, outpath):
> +    global TIMEOUT
> +    with open(inpath) as f:
> +        trace = f.readlines()
> +    start = time.time()
> +    if not check_if_trace_crashes(trace, outpath):
> +        sys.exit("The input qtest trace didn't cause a crash...")
> +    end = time.time()
> +    print("Crashed in {} seconds".format(end-start))
> +    TIMEOUT = (end-start)*5
> +    print("Setting the timeout for {} seconds".format(TIMEOUT))
> +
> +    newtrace = trace[:]
> +
> +    # remove minimizer
> +    old_len = len(newtrace) + 1
> +    while(old_len > len(newtrace)):
> +        old_len = len(newtrace)
> +        remove_minimizer(newtrace, outpath)
> +        newtrace = list(filter(lambda s: s != "", newtrace))
> +
> +    assert(check_if_trace_crashes(newtrace, outpath))
>  
>  
>  if __name__ == '__main__':
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 5/7] fuzz: set bits in operand of write/out to zero
  2020-12-29  4:40 ` [PATCH v4 5/7] fuzz: set bits in operand of write/out to zero Qiuhao Li
@ 2021-01-07  5:08   ` Alexander Bulekov
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  5:08 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> Simplify the crash cases by opportunistically setting bits in the operands of
> out/write to zero. This may help debugging, since a one bit usually turns on
> or triggers a function, while zero is the default "off" setting.
> 
> Tested Bug 1908062.
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
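
A minimal sketch of the bit-clearing idea on a single hex operand. It is
an illustration with assumptions: zero_bits and the still_crashes callback
are hypothetical, and instead of the patch's padding step it keeps the
operand at its original number of hex digits so the data length is
unchanged.

```python
def zero_bits(data, still_crashes):
    """Clear every set bit of a "0x..." operand whose removal keeps the crash.

    still_crashes receives the candidate value as an int.
    """
    digits = data[2:]
    width = len(digits)              # preserve the original digit count
    value = int(digits, 16)
    # Walk from the most significant bit to the least significant one.
    for bit in reversed(range(width * 4)):
        mask = 1 << bit
        if not value & mask:
            continue
        candidate = value & ~mask
        if still_crashes(candidate):
            value = candidate        # bit was not needed, leave it cleared
    return "0x{:0{width}x}".format(value, width=width)


# Toy predicate: the "crash" needs bits 0x10 and 0x01 to stay set.
print(zero_bits("0xff", lambda v: (v & 0x11) == 0x11))
```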

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 39 ++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 378a7ccec6..70ac0c5366 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -164,6 +164,42 @@ def remove_minimizer(newtrace, outpath):
>          i += 1
>  
>  
> +def set_zero_minimizer(newtrace, outpath):
> +    # try setting bits in operands of out/write to zero
> +    i = 0
> +    while i < len(newtrace):
> +        if (not newtrace[i].startswith("write ") and not
> +           newtrace[i].startswith("out")):
> +           i += 1
> +           continue
> +        # write ADDR SIZE DATA
> +        # outx ADDR VALUE
> +        print("\nzero setting bits: {}".format(newtrace[i]))
> +
> +        prefix = " ".join(newtrace[i].split()[:-1])
> +        data = newtrace[i].split()[-1]
> +        data_bin = bin(int(data, 16))
> +        data_bin_list = list(data_bin)
> +
> +        for j in range(2, len(data_bin_list)):
> +            prior = newtrace[i]
> +            if (data_bin_list[j] == '1'):
> +                data_bin_list[j] = '0'
> +                data_try = hex(int("".join(data_bin_list), 2))
> +                # It seems qtest only accepts padded hex-values.
> +                if len(data_try) % 2 == 1:
> +                    data_try = data_try[:2] + "0" + data_try[2:-1]
> +
> +                newtrace[i] = "{prefix} {data_try}\n".format(
> +                        prefix=prefix,
> +                        data_try=data_try)
> +
> +                if not check_if_trace_crashes(newtrace, outpath):
> +                    data_bin_list[j] = '1'
> +                    newtrace[i] = prior
> +        i += 1
> +
> +
>  def minimize_trace(inpath, outpath):
>      global TIMEOUT
>      with open(inpath) as f:
> @@ -184,7 +220,10 @@ def minimize_trace(inpath, outpath):
>          old_len = len(newtrace)
>          remove_minimizer(newtrace, outpath)
>          newtrace = list(filter(lambda s: s != "", newtrace))
> +    assert(check_if_trace_crashes(newtrace, outpath))
>  
> +    # set zero minimizer
> +    set_zero_minimizer(newtrace, outpath)
>      assert(check_if_trace_crashes(newtrace, outpath))
>  
>  
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 6/7] fuzz: add minimization options
  2020-12-29  4:40 ` [PATCH v4 6/7] fuzz: add minimization options Qiuhao Li
@ 2021-01-07  5:54   ` Alexander Bulekov
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-07  5:54 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> -M1: loop around the remove minimizer
> -M2: try setting bits in operand of write/out to zero
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 32 +++++++++++++++++++-----
>  1 file changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 70ac0c5366..a681984076 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -16,6 +16,10 @@ QEMU_PATH = None
>  TIMEOUT = 5
>  CRASH_TOKEN = None
>  
> +# Minimization levels
> +M1 = False # loop around the remove minimizer
> +M2 = False # try setting bits in operand of write/out to zero
> +
>  write_suffix_lookup = {"b": (1, "B"),
>                         "w": (2, "H"),
>                         "l": (4, "L"),
> @@ -23,10 +27,20 @@ write_suffix_lookup = {"b": (1, "B"),
>  
>  def usage():
>      sys.exit("""\
> -Usage: QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} input_trace output_trace
> +Usage:
> +
> +QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} [Options] input_trace output_trace
> +
>  By default, will try to use the second-to-last line in the output to identify
>  whether the crash occred. Optionally, manually set a string that idenitifes the
>  crash by setting CRASH_TOKEN=
> +
> +Options:
> +
> +-M1: enable a loop around the remove minimizer, which may help decrease some
> +     timing dependant instructions. Off by default.
> +-M2: try setting bits in operand of write/out to zero. Off by default.
> +
>  """.format((sys.argv[0])))
>  
>  deduplication_note = """\n\
> @@ -213,24 +227,30 @@ def minimize_trace(inpath, outpath):
>      print("Setting the timeout for {} seconds".format(TIMEOUT))
>  
>      newtrace = trace[:]
> -
> +    global M1, M2
>      # remove minimizer
>      old_len = len(newtrace) + 1
>      while(old_len > len(newtrace)):
>          old_len = len(newtrace)
> +        print("trace lenth = ", old_len)
>          remove_minimizer(newtrace, outpath)
> +        if not M1 and not M2:
> +            break
>          newtrace = list(filter(lambda s: s != "", newtrace))
>      assert(check_if_trace_crashes(newtrace, outpath))
>  
> -    # set zero minimizer
> -    set_zero_minimizer(newtrace, outpath)
> +    if M2:
> +        set_zero_minimizer(newtrace, outpath)
>      assert(check_if_trace_crashes(newtrace, outpath))
>  
>  
>  if __name__ == '__main__':
>      if len(sys.argv) < 3:
>          usage()
> -
> +    if "-M1" in sys.argv:
> +        M1 = True
> +    if "-M2" in sys.argv:
> +        M2 = True
>      QEMU_PATH = os.getenv("QEMU_PATH")
>      QEMU_ARGS = os.getenv("QEMU_ARGS")
>      if QEMU_PATH is None or QEMU_ARGS is None:
> @@ -239,4 +259,4 @@ if __name__ == '__main__':
>      #     QEMU_ARGS += " -accel qtest"
>      CRASH_TOKEN = os.getenv("CRASH_TOKEN")
>      QEMU_ARGS += " -qtest stdio -monitor none -serial none "
> -    minimize_trace(sys.argv[1], sys.argv[2])
> +    minimize_trace(sys.argv[-2], sys.argv[-1])
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2021-01-07  4:18   ` Alexander Bulekov
@ 2021-01-08  2:47     ` Qiuhao Li
  2021-01-10 13:10     ` Qiuhao Li
  1 sibling, 0 replies; 23+ messages in thread
From: Qiuhao Li @ 2021-01-08  2:47 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> On 201229 1240, Qiuhao Li wrote:
> > During minimization, we spend much of the time waiting for the timeout
> > program to reach its time limit. This patch instead watches QTest's
> > output for the CLOSED notification (which indicates the redirected input
> > file was closed) and stops waiting as soon as it appears, since the trace
> > did not crash.
> > 
> > Test with quadrupled trace input at:
> >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > 
> > Original version:
> >   real	1m37.246s
> >   user	0m13.069s
> >   sys	0m8.399s
> > 
> > Refined version:
> >   real	0m45.904s
> >   user	0m16.874s
> >   sys	0m10.042s
> > 
> > Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> > ---
> >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 ++++++++++++++++--
> > ------
> >  1 file changed, 28 insertions(+), 13 deletions(-)
> > 
> > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > index 5e405a0d5f..aa69c7963e 100755
> > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually
> > set a string that idenitifes the
> >  crash by setting CRASH_TOKEN=
> >  """.format((sys.argv[0])))
> >  
> > +deduplication_note = """\n\
> > +Note: While trimming the input, sometimes the mutated trace
> > triggers a different
> > +crash output but indicates the same bug. Under this situation, our
> > minimizer is
> > +incapable of recognizing and stopped from removing it. In the
> > future, we may
> > +use a more sophisticated crash case deduplication method.
> > +\n"""
> > +
> >  def check_if_trace_crashes(trace, path):
> > -    global CRASH_TOKEN
> >      with open(path, "w") as tracefile:
> >          tracefile.write("".join(trace))
> >  
> > -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> > +    proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> 
> Why remove the -s 9 here? I ran into a case where the minimizer got
> stuck on one iteration. Adding back "sigkill" to the timeout can be a
> safety net to catch those bad cases.
> -Alex

Oops, I thought SIGKILL was the default signal that timeout sends.
Fixed in version 5, thanks.
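
For reference, a minimal sketch of the command line with the SIGKILL safety net restored. GNU timeout sends SIGTERM unless told otherwise; "-s 9" makes it send SIGKILL, and the original script treated exit status 137 (128 + 9) as "timed out". The helper name build_qemu_command and the sample binary and trace paths are hypothetical, chosen only for illustration.

```python
import shlex


def build_qemu_command(timeout_s, qemu_path, qemu_args, trace_path):
    """Build the shell command that replays a trace (hypothetical helper)."""
    return ("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 2>&1"
            " < {trace_path}").format(
        timeout=timeout_s,
        qemu_path=qemu_path,
        qemu_args=qemu_args,
        # Quote the path so unusual characters cannot break the redirection.
        trace_path=shlex.quote(trace_path),
    )


# Exit status reported when the command is killed by signal 9 (SIGKILL).
KILLED_BY_SIGKILL = 128 + 9

# Sample values are assumptions, not taken from the thread.
cmd = build_qemu_command(5, "./qemu-system-i386",
                         "-qtest stdio -monitor none -serial none",
                         "/tmp/trace")
print(cmd)
```

The string would still be handed to subprocess.Popen(..., shell=True) as in the patch; only the "-s 9" part differs from the v4 version.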

> 
> >      < {trace_path}".format(timeout=TIMEOUT,
> >                             qemu_path=QEMU_PATH,
> >                             qemu_args=QEMU_ARGS,
> >                             trace_path=path),
> >                            shell=True,
> >                            stdin=subprocess.PIPE,
> > -                          stdout=subprocess.PIPE)
> > -    stdo = rc.communicate()[0]
> > -    output = stdo.decode('unicode_escape')
> > -    if rc.returncode == 137:    # Timed Out
> > -        return False
> > -    if len(output.splitlines()) < 2:
> > -        return False
> > -
> > +                          stdout=subprocess.PIPE,
> > +                          encoding="utf-8")
> > +    global CRASH_TOKEN
> >      if CRASH_TOKEN is None:
> > -        CRASH_TOKEN = output.splitlines()[-2]
> > +        try:
> > +            outs, _ = proc.communicate(timeout=5)
> > +            CRASH_TOKEN = outs.splitlines()[-2]
> > +        except subprocess.TimeoutExpired:
> > +            print("subprocess.TimeoutExpired")
> > +            return False
> > +        print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> > +        global deduplication_note
> > +        print(deduplication_note)
> > +        return True
> >  
> > -    return CRASH_TOKEN in output
> > +    for line in iter(proc.stdout.readline, b''):
> > +        if "CLOSED" in line:
> > +            return False
> > +        if CRASH_TOKEN in line:
> > +            return True
> > +
> > +    return False
> >  
> >  
> >  def minimize_trace(inpath, outpath):
> > @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
> >      print("Crashed in {} seconds".format(end-start))
> >      TIMEOUT = (end-start)*5
> >      print("Setting the timeout for {} seconds".format(TIMEOUT))
> > -    print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> >  
> >      i = 0
> >      newtrace = trace[:]
> > -- 
> > 2.25.1
> > 




* Re: [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring
  2021-01-07  4:53   ` Alexander Bulekov
@ 2021-01-08  2:49     ` Qiuhao Li
  0 siblings, 0 replies; 23+ messages in thread
From: Qiuhao Li @ 2021-01-08  2:49 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On Wed, 2021-01-06 at 23:53 -0500, Alexander Bulekov wrote:
> On 201229 1240, Qiuhao Li wrote:
> > Now we use a one-time scan and remove strategy in the remval
> > minimizer,
> > which is not suitable for timing dependent instructions.
> > 
> > For example, instruction A will indicate an address where the
> > config
> > chunk locates, and instruction B will make the configuration
> > active. If
> > we have the following instruction sequence:
> > 
> > ...
> > A1
> > B1
> > A2
> > B2
> > ...
> > 
> > A2 and B2 are the actual instructions that trigger the bug.
> > 
> > If we scan from top to bottom, after we remove A1, the behavior of
> > B1
> > might be unknowable, including not to crash the program. But we
> > will
> > successfully remove B1 later cause A2 and B2 will crash the process
> > anyway:
> > 
> > ...
> > A1
> > A2
> > B2
> > ...
> > 
> > Now one more trimming will remove A1.
> > 
> > In the perfect case, we would need to be able to remove A and B (or
> > C!) at
> > the same time. But for now, let's just add a loop around the
> > minimizer.
> > 
> > Since we only remove instructions, this iterative algorithm is
> > converging.
> > 
> > Tested with Bug 1908062.
> > 
> > Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> 
> Small note below, but otherwise:
> Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
> 
> > ---
> >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 +++++++++++++++-----
> > ----
> >  1 file changed, 26 insertions(+), 15 deletions(-)
> > 
> > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > index 1a26bf5b93..378a7ccec6 100755
> > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > @@ -71,21 +71,9 @@ def check_if_trace_crashes(trace, path):
> >      return False
> >  
> >  
> > -def minimize_trace(inpath, outpath):
> > -    global TIMEOUT
> > -    with open(inpath) as f:
> > -        trace = f.readlines()
> > -    start = time.time()
> > -    if not check_if_trace_crashes(trace, outpath):
> > -        sys.exit("The input qtest trace didn't cause a crash...")
> > -    end = time.time()
> > -    print("Crashed in {} seconds".format(end-start))
> > -    TIMEOUT = (end-start)*5
> > -    print("Setting the timeout for {} seconds".format(TIMEOUT))
> > -
> > -    i = 0
> > -    newtrace = trace[:]
> > +def remove_minimizer(newtrace, outpath):
> 
> Maybe a different name for this function?
> e.g. minimize_each_line or minimize_iter
> 
> -Alex

Ok, changed to remove_lines in version 5, thanks.
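
A toy sketch of why looping the removal pass helps, using the A1/B1/A2/B2 example from the commit message. The crashes() oracle is an invented stand-in for check_if_trace_crashes(), and its rules (activating with no address ends the run cleanly, activating address 2 is the bug) are assumptions made only so the example can run without QEMU.

```python
def crashes(trace):
    """Toy oracle: 'A<n>' selects a config address, 'B<n>' activates it."""
    addr = None
    for insn in trace:
        if insn.startswith("A"):
            addr = int(insn[1:])
        elif insn.startswith("B"):
            if addr is None:
                return False  # run ends cleanly, nothing after it executes
            if addr == 2:
                return True   # the (made-up) bug
    return False


def remove_lines_once(trace):
    """One top-to-bottom pass: drop each line whose removal keeps the crash."""
    i = 0
    while i < len(trace):
        candidate = trace[:i] + trace[i + 1:]
        if crashes(candidate):
            trace = candidate
        else:
            i += 1
    return trace


def minimize(trace):
    """Repeat the pass until the trace stops shrinking."""
    old_len = len(trace) + 1
    while old_len > len(trace):
        old_len = len(trace)
        trace = remove_lines_once(trace)
    return trace


first_pass = remove_lines_once(["A1", "B1", "A2", "B2"])
final = minimize(["A1", "B1", "A2", "B2"])
print(first_pass)  # ['A1', 'A2', 'B2']
print(final)       # ['A2', 'B2']
```

A single pass cannot drop A1 because B1 still depends on it at that point; once B1 is gone, the next pass removes A1. Every extra pass either shrinks the trace or ends the loop, so the loop terminates.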

> 
> >      remove_step = 1
> > +    i = 0
> >      while i < len(newtrace):
> >          # 1.) Try to remove lines completely and reproduce the
> > crash.
> >          # If it works, we're done.
> > @@ -174,7 +162,30 @@ def minimize_trace(inpath, outpath):
> >                      newtrace[i] = prior[0]
> >                      del newtrace[i+1]
> >          i += 1
> > -    check_if_trace_crashes(newtrace, outpath)
> > +
> > +
> > +def minimize_trace(inpath, outpath):
> > +    global TIMEOUT
> > +    with open(inpath) as f:
> > +        trace = f.readlines()
> > +    start = time.time()
> > +    if not check_if_trace_crashes(trace, outpath):
> > +        sys.exit("The input qtest trace didn't cause a crash...")
> > +    end = time.time()
> > +    print("Crashed in {} seconds".format(end-start))
> > +    TIMEOUT = (end-start)*5
> > +    print("Setting the timeout for {} seconds".format(TIMEOUT))
> > +
> > +    newtrace = trace[:]
> > +
> > +    # remove minimizer
> > +    old_len = len(newtrace) + 1
> > +    while(old_len > len(newtrace)):
> > +        old_len = len(newtrace)
> > +        remove_minimizer(newtrace, outpath)
> > +        newtrace = list(filter(lambda s: s != "", newtrace))
> > +
> > +    assert(check_if_trace_crashes(newtrace, outpath))
> >  
> >  
> >  if __name__ == '__main__':
> > -- 
> > 2.25.1
> > 




* Re: [PATCH v4 7/7] fuzz: heuristic split write based on past IOs
  2020-12-29  4:40 ` [PATCH v4 7/7] fuzz: heuristic split write based on past IOs Qiuhao Li
@ 2021-01-08  4:30   ` Alexander Bulekov
  0 siblings, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-08  4:30 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1240, Qiuhao Li wrote:
> If previous write commands write the same length of data with the same step,
> we view it as a hint.
> 
> Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>

Reviewed-by: Alexander Bulekov <alxndr@bu.edu>

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 56 ++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index a681984076..6cbf2b0419 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -85,6 +85,43 @@ def check_if_trace_crashes(trace, path):
>      return False
>  
>  
> +# If previous write commands write the same length of data at the same
> +# interval, we view it as a hint.
> +def split_write_hint(newtrace, i):
> +    HINT_LEN = 3 # > 2
> +    if i <=(HINT_LEN-1):
> +        return None
> +
> +    #find previous continuous write traces
> +    k = 0
> +    l = i-1
> +    writes = []
> +    while (k != HINT_LEN and l >= 0):
> +        if newtrace[l].startswith("write "):
> +            writes.append(newtrace[l])
> +            k += 1
> +            l -= 1
> +        elif newtrace[l] == "":
> +            l -= 1
> +        else:
> +            return None
> +    if k != HINT_LEN:
> +        return None
> +
> +    length = int(writes[0].split()[2], 16)
> +    for j in range(1, HINT_LEN):
> +        if length != int(writes[j].split()[2], 16):
> +            return None
> +
> +    step = int(writes[0].split()[1], 16) - int(writes[1].split()[1], 16)
> +    for j in range(1, HINT_LEN-1):
> +        if step != int(writes[j].split()[1], 16) - \
> +            int(writes[j+1].split()[1], 16):
> +            return None
> +
> +    return (int(writes[0].split()[1], 16)+step, length)
> +
> +
>  def remove_minimizer(newtrace, outpath):
>      remove_step = 1
>      i = 0
> @@ -148,6 +185,25 @@ def remove_minimizer(newtrace, outpath):
>              length = int(newtrace[i].split()[2], 16)
>              data = newtrace[i].split()[3][2:]
>              if length > 1:
> +
> +                # Can we get a hint from previous writes?
> +                hint = split_write_hint(newtrace, i)
> +                if hint is not None:
> +                    hint_addr = hint[0]
> +                    hint_len = hint[1]
> +                    if hint_addr >= addr and hint_addr+hint_len <= addr+length:
> +                        newtrace[i] = "write {addr} {size} 0x{data}\n".format(
> +                            addr=hex(hint_addr),
> +                            size=hex(hint_len),
> +                            data=data[(hint_addr-addr)*2:\
> +                                (hint_addr-addr)*2+hint_len*2])
> +                        if check_if_trace_crashes(newtrace, outpath):
> +                            # next round
> +                            i += 1
> +                            continue
> +                        newtrace[i] = prior[0]
> +
> +                # Try splitting it using a binary approach
>                  leftlength = int(length/2)
>                  rightlength = length - leftlength
>                  newtrace.insert(i+1, "")
> -- 
> 2.25.1
> 
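
A worked sketch of the hint heuristic from the patch above, restated in a compact form. It is meant to mirror the logic of split_write_hint(), not to replace it; narrow_write() is a hypothetical helper name, and the sample trace lines are made up.

```python
def split_write_hint(trace, i, hint_len=3):
    """Return (addr, length) predicted for trace[i], or None.

    Looks at the hint_len write commands directly before index i,
    skipping lines the minimizer has already blanked out.
    """
    writes = []
    k = i - 1
    while len(writes) < hint_len and k >= 0:
        if trace[k].startswith("write "):
            _, addr, size = trace[k].split()[:3]
            writes.append((int(addr, 16), int(size, 16)))
        elif trace[k] != "":
            return None  # a non-write command interrupts the pattern
        k -= 1
    if len(writes) < hint_len:
        return None

    # writes[0] is the most recent write; all lengths must agree.
    length = writes[0][1]
    if any(size != length for _, size in writes):
        return None

    # All consecutive address differences must agree as well.
    step = writes[0][0] - writes[1][0]
    for (newer, _), (older, _) in zip(writes, writes[1:]):
        if newer - older != step:
            return None

    return writes[0][0] + step, length


def narrow_write(line, hint):
    """Shrink a write line to the hinted sub-range, or return None if the
    hint does not fall inside the bytes the line writes."""
    _, addr, size, data = line.split()
    addr, size, data = int(addr, 16), int(size, 16), data[2:]
    hint_addr, hint_len = hint
    if hint_addr < addr or hint_addr + hint_len > addr + size:
        return None
    start = (hint_addr - addr) * 2  # two hex digits per byte
    return "write {} {} 0x{}\n".format(
        hex(hint_addr), hex(hint_len), data[start:start + hint_len * 2])


# Made-up trace: three 4-byte writes, 0x10 apart, then one 16-byte write.
trace = [
    "write 0x1000 0x4 0x11111111\n",
    "write 0x1010 0x4 0x22222222\n",
    "write 0x1020 0x4 0x33333333\n",
    "write 0x1028 0x10 0x000102030405060708090a0b0c0d0e0f\n",
]
hint = split_write_hint(trace, 3)
narrowed = narrow_write(trace[3], hint)
print(hint)      # (4144, 4), i.e. address 0x1030, length 4
print(narrowed)  # write 0x1030 0x4 0x08090a0b
```

In the patch, the narrowed line is kept only if check_if_trace_crashes() still reports the crash; otherwise the original line is restored and the binary split is tried.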



* Re: [PATCH v4 0/7] fuzz: improve crash case minimization
  2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
                   ` (7 preceding siblings ...)
  2021-01-05  8:00 ` Ping: [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
@ 2021-01-08  4:32 ` Alexander Bulekov
  8 siblings, 0 replies; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-08  4:32 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 201229 1239, Qiuhao Li wrote:
> Extend and refine the crash case minimization process.
> 

Hi Qiuhao,
For this whole series:
Tested-by: Alexander Bulekov <alxndr@bu.edu>

Thank you for this effort! It is a big improvement over what we had.
-Alex

> Test input:
>   Bug 1909261 full_reproducer
>   6500 QTest instructions (write mostly)
> 
> Refined (-M1 minimization level) vs. Original version:
>   real  38m31.942s  <-- real  532m57.192s
>   user  28m18.188s  <-- user  89m0.536s
>   sys   12m42.239s  <-- sys   50m33.074s
>   2558 instructions <-- 2846 instructions
> 
> Test Enviroment:
>   i7-8550U, 16GB LPDDR3, SSD 
>   Ubuntu 20.04.1 5.4.0-58-generic x86_64
>   Python 3.8.5
> 
> v4:
>   Fix: messy diff in [PATCH v3 4/7]
> 
> v3:
>   Fix: checkpatch.pl errors
> 
> v2: 
>   New: [PATCH v2 1/7]
>   New: [PATCH v2 2/7]
>   New: [PATCH v2 4/7]
>   New: [PATCH v2 6/7]
>   New: [PATCH v2 7/7]
>   Fix: [PATCH 2/4] split using binary approach
>   Fix: [PATCH 3/4] typo in comments
>   Discard: [PATCH 1/4] the hardcoded regex match for crash detection
>   Discard: [PATCH 4/4] the delaying minimizer
>   
> Thanks for the suggestions from:
>   Alexander Bulekov
> 
> Qiuhao Li (7):
>   fuzz: accelerate non-crash detection
>   fuzz: double the IOs to remove for every loop
>   fuzz: split write operand using binary approach
>   fuzz: loop the remove minimizer and refactoring
>   fuzz: set bits in operand of write/out to zero
>   fuzz: add minimization options
>   fuzz: heuristic split write based on past IOs
> 
>  scripts/oss-fuzz/minimize_qtest_trace.py | 257 ++++++++++++++++++-----
>  1 file changed, 209 insertions(+), 48 deletions(-)
> 
> -- 
> 2.25.1
> 



* Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2021-01-07  4:18   ` Alexander Bulekov
  2021-01-08  2:47     ` Qiuhao Li
@ 2021-01-10 13:10     ` Qiuhao Li
  2021-01-10 16:00       ` Alexander Bulekov
  1 sibling, 1 reply; 23+ messages in thread
From: Qiuhao Li @ 2021-01-10 13:10 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> On 201229 1240, Qiuhao Li wrote:
> > We spend much time waiting for the timeout program during the
> > minimization
> > process until it passes a time limit. This patch hacks the CLOSED
> > (indicates
> > the redirection file closed) notification in QTest's output if it
> > doesn't
> > crash.
> > 
> > Test with quadrupled trace input at:
> >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > 
> > Original version:
> >   real	1m37.246s
> >   user	0m13.069s
> >   sys	0m8.399s
> > 
> > Refined version:
> >   real	0m45.904s
> >   user	0m16.874s
> >   sys	0m10.042s
> > 
> > Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> > ---
> >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 ++++++++++++++++--
> > ------
> >  1 file changed, 28 insertions(+), 13 deletions(-)
> > 
> > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > index 5e405a0d5f..aa69c7963e 100755
> > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually
> > set a string that idenitifes the
> >  crash by setting CRASH_TOKEN=
> >  """.format((sys.argv[0])))
> >  
> > +deduplication_note = """\n\
> > +Note: While trimming the input, sometimes the mutated trace
> > triggers a different
> > +crash output but indicates the same bug. Under this situation, our
> > minimizer is
> > +incapable of recognizing and stopped from removing it. In the
> > future, we may
> > +use a more sophisticated crash case deduplication method.
> > +\n"""
> > +
> >  def check_if_trace_crashes(trace, path):
> > -    global CRASH_TOKEN
> >      with open(path, "w") as tracefile:
> >          tracefile.write("".join(trace))
> >  
> > -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> > +    proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> 
> Why remove the -s 9 here? I ran into a case where the minimizer got
> stuck on one iteration. Adding back "sigkill" to the timeout can be a
> safety net to catch those bad cases.
> -Alex

Hi Alex,

After reviewing this patch again, I think this get-stuck bug may be
caused by this code:

-    return CRASH_TOKEN in output
+    for line in iter(rc.stdout.readline, b''):
+        if "CLOSED" in line:
+            return False
+        if CRASH_TOKEN in line:
+            return True

I assumed the output could only end in one of two ways, but while we
are trimming the trace input, the crash output (second-to-last line)
may change. In that case we go through the whole output without finding
either "CLOSED" or CRASH_TOKEN, and get stuck in the loop above.

To fix this bug and get a more trimmed input trace, we can:

Use the first three words of the second-to-last line, which indicate the
type of crash, as the token instead of the whole string.

-        CRASH_TOKEN = output.splitlines()[-2]
+        CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])

If we reach the end of a subprocess' output, return False.

+        if line == "":
+            return False

I fixed it in [PATCH v7 1/7] and gave an example. Could you review it
again? Thanks :-)

FYI, I first mentioned this situation in [PATCH 1/4], where I gave a
more detailed example:

https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html
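
A minimal sketch of the scanning loop with all three outcomes handled, assuming the output is read as text (str). The function name scan_output is hypothetical, io.StringIO stands in for the subprocess's stdout, and the sample output strings are invented for illustration.

```python
import io


def scan_output(stream, crash_token):
    """Return True if crash_token appears before CLOSED or end of output."""
    while True:
        line = stream.readline()
        if line == "":            # end of output: neither marker was seen
            return False
        if "CLOSED" in line:      # QTest reported that its input was closed
            return False
        if crash_token in line:
            return True


TOKEN = "SUMMARY: AddressSanitizer: stack-overflow"

same_crash = io.StringIO("OK\n" + TOKEN + " in some_function\n")
no_crash = io.StringIO("OK\nCLOSED\n")
# A different crash: neither CLOSED nor the token ever shows up.
other_crash = io.StringIO("OK\nSUMMARY: AddressSanitizer: heap-use-after-free\n")

print(scan_output(same_crash, TOKEN))   # True
print(scan_output(no_crash, TOKEN))     # False
print(scan_output(other_crash, TOKEN))  # False, and the loop terminates
```

The third case is the one that used to hang: without the end-of-output check, readline() keeps returning an empty string forever.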

> 
> >      < {trace_path}".format(timeout=TIMEOUT,
> >                             qemu_path=QEMU_PATH,
> >                             qemu_args=QEMU_ARGS,
> >                             trace_path=path),
> >                            shell=True,
> >                            stdin=subprocess.PIPE,
> > -                          stdout=subprocess.PIPE)
> > -    stdo = rc.communicate()[0]
> > -    output = stdo.decode('unicode_escape')
> > -    if rc.returncode == 137:    # Timed Out
> > -        return False
> > -    if len(output.splitlines()) < 2:
> > -        return False
> > -
> > +                          stdout=subprocess.PIPE,
> > +                          encoding="utf-8")
> > +    global CRASH_TOKEN
> >      if CRASH_TOKEN is None:
> > -        CRASH_TOKEN = output.splitlines()[-2]
> > +        try:
> > +            outs, _ = proc.communicate(timeout=5)
> > +            CRASH_TOKEN = outs.splitlines()[-2]
> > +        except subprocess.TimeoutExpired:
> > +            print("subprocess.TimeoutExpired")
> > +            return False
> > +        print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> > +        global deduplication_note
> > +        print(deduplication_note)
> > +        return True
> >  
> > -    return CRASH_TOKEN in output
> > +    for line in iter(proc.stdout.readline, b''):
> > +        if "CLOSED" in line:
> > +            return False
> > +        if CRASH_TOKEN in line:
> > +            return True
> > +
> > +    return False
> >  
> >  
> >  def minimize_trace(inpath, outpath):
> > @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
> >      print("Crashed in {} seconds".format(end-start))
> >      TIMEOUT = (end-start)*5
> >      print("Setting the timeout for {} seconds".format(TIMEOUT))
> > -    print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> >  
> >      i = 0
> >      newtrace = trace[:]
> > -- 
> > 2.25.1
> > 




* Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2021-01-10 13:10     ` Qiuhao Li
@ 2021-01-10 16:00       ` Alexander Bulekov
  2021-01-11  2:19         ` Qiuhao Li
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Bulekov @ 2021-01-10 16:00 UTC (permalink / raw)
  To: Qiuhao Li; +Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On 210110 2110, Qiuhao Li wrote:
> On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> > On 201229 1240, Qiuhao Li wrote:
> > > We spend much time waiting for the timeout program during the
> > > minimization
> > > process until it passes a time limit. This patch hacks the CLOSED
> > > (indicates
> > > the redirection file closed) notification in QTest's output if it
> > > doesn't
> > > crash.
> > > 
> > > Test with quadrupled trace input at:
> > >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > > 
> > > Original version:
> > >   real	1m37.246s
> > >   user	0m13.069s
> > >   sys	0m8.399s
> > > 
> > > Refined version:
> > >   real	0m45.904s
> > >   user	0m16.874s
> > >   sys	0m10.042s
> > > 
> > > Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> > > ---
> > >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 ++++++++++++++++--
> > > ------
> > >  1 file changed, 28 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > index 5e405a0d5f..aa69c7963e 100755
> > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually
> > > set a string that idenitifes the
> > >  crash by setting CRASH_TOKEN=
> > >  """.format((sys.argv[0])))
> > >  
> > > +deduplication_note = """\n\
> > > +Note: While trimming the input, sometimes the mutated trace
> > > triggers a different
> > > +crash output but indicates the same bug. Under this situation, our
> > > minimizer is
> > > +incapable of recognizing and stopped from removing it. In the
> > > future, we may
> > > +use a more sophisticated crash case deduplication method.
> > > +\n"""
> > > +
> > >  def check_if_trace_crashes(trace, path):
> > > -    global CRASH_TOKEN
> > >      with open(path, "w") as tracefile:
> > >          tracefile.write("".join(trace))
> > >  
> > > -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > > {qemu_args} 2>&1\
> > > +    proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > > {qemu_args} 2>&1\
> > 
> > Why remove the -s 9 here? I ran into a case where the minimizer got
> > stuck on one iteration. Adding back "sigkill" to the timeout can be a
> > safety net to catch those bad cases.
> > -Alex
> 
> Hi Alex,
> 
> After reviewed this patch again, I think this get-stuck bug may be
> caused by code:
> 
> -    return CRASH_TOKEN in output

Hi,
Thanks for fixing this. Strangely, I was able to fix it by swapping
the b'' for a ' ' when I was stuck on a testcase a few days ago.
                                            vvv 
> +    for line in iter(rc.stdout.readline, b''):
> +        if "CLOSED" in line:
> +            return False
> +        if CRASH_TOKEN in line:
> +            return True
> 

I think your proposed change essentially does the same?
-Alex

> I assumed there are only two end cases in lines of stdout, but while we
> are trimming the trace input, the crash output (second-to-last line)
> may changes, in which case we will go through the output and fail to
> find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.
> 
> To fix this bug and get a more trimmed input trace, we can:
> 
> Use the first three words of the second-to-last line instead of the
> whole string, which indicate the type of crash as the token.
> 
> -        CRASH_TOKEN = output.splitlines()[-2]
> +        CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
> 
> If we reach the end of a subprocess' output, return False.
> 
> +        if line == "":
> +            return False
> 
> I fix it in [PATCH v7 1/7] and give an example. Could you review again?
> Thanks :-)
> 
> FYI, I mentioned this situation firstly in [PATCH 1/4], where I gave a
> more detailed example:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html
> 
> > 
> > >      < {trace_path}".format(timeout=TIMEOUT,
> > >                             qemu_path=QEMU_PATH,
> > >                             qemu_args=QEMU_ARGS,
> > >                             trace_path=path),
> > >                            shell=True,
> > >                            stdin=subprocess.PIPE,
> > > -                          stdout=subprocess.PIPE)
> > > -    stdo = rc.communicate()[0]
> > > -    output = stdo.decode('unicode_escape')
> > > -    if rc.returncode == 137:    # Timed Out
> > > -        return False
> > > -    if len(output.splitlines()) < 2:
> > > -        return False
> > > -
> > > +                          stdout=subprocess.PIPE,
> > > +                          encoding="utf-8")
> > > +    global CRASH_TOKEN
> > >      if CRASH_TOKEN is None:
> > > -        CRASH_TOKEN = output.splitlines()[-2]
> > > +        try:
> > > +            outs, _ = proc.communicate(timeout=5)
> > > +            CRASH_TOKEN = outs.splitlines()[-2]
> > > +        except subprocess.TimeoutExpired:
> > > +            print("subprocess.TimeoutExpired")
> > > +            return False
> > > +        print("Identifying Crashes by this string:
> > > {}".format(CRASH_TOKEN))
> > > +        global deduplication_note
> > > +        print(deduplication_note)
> > > +        return True
> > >  
> > > -    return CRASH_TOKEN in output
> > > +    for line in iter(proc.stdout.readline, b''):
> > > +        if "CLOSED" in line:
> > > +            return False
> > > +        if CRASH_TOKEN in line:
> > > +            return True
> > > +
> > > +    return False
> > >  
> > >  
> > >  def minimize_trace(inpath, outpath):
> > > @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
> > >      print("Crashed in {} seconds".format(end-start))
> > >      TIMEOUT = (end-start)*5
> > >      print("Setting the timeout for {} seconds".format(TIMEOUT))
> > > -    print("Identifying Crashes by this string:
> > > {}".format(CRASH_TOKEN))
> > >  
> > >      i = 0
> > >      newtrace = trace[:]
> > > -- 
> > > 2.25.1
> > > 
> 



* Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
  2021-01-10 16:00       ` Alexander Bulekov
@ 2021-01-11  2:19         ` Qiuhao Li
  0 siblings, 0 replies; 23+ messages in thread
From: Qiuhao Li @ 2021-01-11  2:19 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: thuth, qemu-devel, darren.kenny, bsd, stefanha, pbonzini

On Sun, 2021-01-10 at 11:00 -0500, Alexander Bulekov wrote:
> On 210110 2110, Qiuhao Li wrote:
> > On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> > > On 201229 1240, Qiuhao Li wrote:
> > > > We spend much time waiting for the timeout program during the
> > > > minimization
> > > > process until it passes a time limit. This patch hacks the
> > > > CLOSED
> > > > (indicates
> > > > the redirection file closed) notification in QTest's output if
> > > > it
> > > > doesn't
> > > > crash.
> > > > 
> > > > Test with quadrupled trace input at:
> > > >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > > > 
> > > > Original version:
> > > >   real	1m37.246s
> > > >   user	0m13.069s
> > > >   sys	0m8.399s
> > > > 
> > > > Refined version:
> > > >   real	0m45.904s
> > > >   user	0m16.874s
> > > >   sys	0m10.042s
> > > > 
> > > > Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
> > > > ---
> > > >  scripts/oss-fuzz/minimize_qtest_trace.py | 41
> > > > ++++++++++++++++--
> > > > ------
> > > >  1 file changed, 28 insertions(+), 13 deletions(-)
> > > > 
> > > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > index 5e405a0d5f..aa69c7963e 100755
> > > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally,
> > > > manually
> > > > set a string that idenitifes the
> > > >  crash by setting CRASH_TOKEN=
> > > >  """.format((sys.argv[0])))
> > > >  
> > > > +deduplication_note = """\n\
> > > > +Note: While trimming the input, sometimes the mutated trace
> > > > triggers a different
> > > > +crash output but indicates the same bug. Under this situation,
> > > > our
> > > > minimizer is
> > > > +incapable of recognizing and stopped from removing it. In the
> > > > future, we may
> > > > +use a more sophisticated crash case deduplication method.
> > > > +\n"""
> > > > +
> > > >  def check_if_trace_crashes(trace, path):
> > > > -    global CRASH_TOKEN
> > > >      with open(path, "w") as tracefile:
> > > >          tracefile.write("".join(trace))
> > > >  
> > > > -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > > > {qemu_args} 2>&1\
> > > > +    proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > > > {qemu_args} 2>&1\
> > > 
> > > Why remove the -s 9 here? I ran into a case where the minimizer
> > > got
> > > stuck on one iteration. Adding back "sigkill" to the timeout can
> > > be a
> > > safety net to catch those bad cases.
> > > -Alex
> > 
> > Hi Alex,
> > 
> > After reviewed this patch again, I think this get-stuck bug may be
> > caused by code:
> > 
> > -    return CRASH_TOKEN in output
> 
> Hi,
> Thanks for fixing this. Strangely, I was able to fix it by swapping
> the b'' for a ' ' when I was stuck on a testcase a few days ago.
>                                             vvv 
> > +    for line in iter(rc.stdout.readline, b''):
> > +        if "CLOSED" in line:
> > +            return False
> > +        if CRASH_TOKEN in line:
> > +            return True
> > 
> 
> I think your proposed change essentially does the same?
> -Alex

Hi Alex,

It looks like I misused the bytes type. Instead of b'', '' (the str
type) should be used here:

-    for line in iter(rc.stdout.readline, b''):
+    for line in iter(rc.stdout.readline, ''):

And you are right, if we use iter() with the sentinel parameter '', it
does the same as:

+        if line == "":
+            return False
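
A small demonstration of the sentinel mismatch, assuming a text-mode stream (Popen with encoding="utf-8" yields str lines). io.StringIO stands in for the pipe, and islice() caps the iteration so the broken variant cannot loop forever in this example.

```python
import io
from itertools import islice

# Wrong sentinel: readline() returns '' at end of output, and '' != b'',
# so iter() never stops on its own. Cap it at five items to show this.
stream = io.StringIO("OK\nOK\n")
with_bytes_sentinel = list(islice(iter(stream.readline, b''), 5))

# Right sentinel: iteration ends as soon as readline() returns ''.
stream = io.StringIO("OK\nOK\n")
with_str_sentinel = list(iter(stream.readline, ''))

print(with_bytes_sentinel)  # ['OK\n', 'OK\n', '', '', '']
print(with_str_sentinel)    # ['OK\n', 'OK\n']
```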

But if we only fix the get-stuck bug here, we may fail
the assert(check_if_trace_crashes(newtrace, outpath)) check after
remove_lines() or clear_bits(), since the same trace input may trigger a
different output between runs.

My solution is, instead of using the whole second-to-last line as the
token, to use only the first three words, which indicate the type of crash:

-        CRASH_TOKEN = output.splitlines()[-2]
+        CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])

Example: "SUMMARY: AddressSanitizer: stack-overflow"

And thus, we may get a more trimmed input trace.
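
A short sketch of the token derivation, to show why the shorter token tolerates small differences between runs. The helper name crash_token and both sample outputs are invented for illustration; only the token format comes from the example above.

```python
def crash_token(output, words=3):
    """First `words` words of the second-to-last output line, or None."""
    lines = output.splitlines()
    if len(lines) < 2:
        return None
    return " ".join(lines[-2].split()[:words])


# Two made-up outputs for the same kind of crash, differing in the details.
run_a = "OK\nSUMMARY: AddressSanitizer: stack-overflow in frame_a\n==1==ABORTING\n"
run_b = "OK\nSUMMARY: AddressSanitizer: stack-overflow in frame_b\n==2==ABORTING\n"

token = crash_token(run_a)
print(token)                        # SUMMARY: AddressSanitizer: stack-overflow
print(token == crash_token(run_b))  # True
print(run_a.splitlines()[-2] == run_b.splitlines()[-2])  # False
```

With the whole second-to-last line as the token, run_b would not match and the trimmed trace would be rejected even though it hits the same kind of crash.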

> 
> > I assumed there are only two end cases in lines of stdout, but
> > while we
> > are trimming the trace input, the crash output (second-to-last
> > line)
> > may changes, in which case we will go through the output and fail
> > to
> > find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.
> > 
> > To fix this bug and get a more trimmed input trace, we can:
> > 
> > Use the first three words of the second-to-last line instead of the
> > whole string, which indicate the type of crash as the token.
> > 
> > -        CRASH_TOKEN = output.splitlines()[-2]
> > +        CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
> > 
> > If we reach the end of a subprocess' output, return False.
> > 
> > +        if line == "":
> > +            return False
> > 
> > I fix it in [PATCH v7 1/7] and give an example. Could you review
> > again?
> > Thanks :-)
> > 
> > FYI, I mentioned this situation firstly in [PATCH 1/4], where I
> > gave a
> > more detailed example:
> > 
> > https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html
> > 
> > > >      < {trace_path}".format(timeout=TIMEOUT,
> > > >                             qemu_path=QEMU_PATH,
> > > >                             qemu_args=QEMU_ARGS,
> > > >                             trace_path=path),
> > > >                            shell=True,
> > > >                            stdin=subprocess.PIPE,
> > > > -                          stdout=subprocess.PIPE)
> > > > -    stdo = rc.communicate()[0]
> > > > -    output = stdo.decode('unicode_escape')
> > > > -    if rc.returncode == 137:    # Timed Out
> > > > -        return False
> > > > -    if len(output.splitlines()) < 2:
> > > > -        return False
> > > > -
> > > > +                          stdout=subprocess.PIPE,
> > > > +                          encoding="utf-8")
> > > > +    global CRASH_TOKEN
> > > >      if CRASH_TOKEN is None:
> > > > -        CRASH_TOKEN = output.splitlines()[-2]
> > > > +        try:
> > > > +            outs, _ = proc.communicate(timeout=5)
> > > > +            CRASH_TOKEN = outs.splitlines()[-2]
> > > > +        except subprocess.TimeoutExpired:
> > > > +            print("subprocess.TimeoutExpired")
> > > > +            return False
> > > > +        print("Identifying Crashes by this string:
> > > > {}".format(CRASH_TOKEN))
> > > > +        global deduplication_note
> > > > +        print(deduplication_note)
> > > > +        return True
> > > >  
> > > > -    return CRASH_TOKEN in output
> > > > +    for line in iter(proc.stdout.readline, b''):
> > > > +        if "CLOSED" in line:
> > > > +            return False
> > > > +        if CRASH_TOKEN in line:
> > > > +            return True
> > > > +
> > > > +    return False
> > > >  
> > > >  
> > > >  def minimize_trace(inpath, outpath):
> > > > @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
> > > >      print("Crashed in {} seconds".format(end-start))
> > > >      TIMEOUT = (end-start)*5
> > > >      print("Setting the timeout for {}
> > > > seconds".format(TIMEOUT))
> > > > -    print("Identifying Crashes by this string:
> > > > {}".format(CRASH_TOKEN))
> > > >  
> > > >      i = 0
> > > >      newtrace = trace[:]
> > > > -- 
> > > > 2.25.1
> > > > 



Thread overview: 23+ messages
2020-12-29  4:39 [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
2020-12-29  4:40 ` [PATCH v4 1/7] fuzz: accelerate non-crash detection Qiuhao Li
2021-01-07  3:42   ` Alexander Bulekov
2021-01-07  4:18   ` Alexander Bulekov
2021-01-08  2:47     ` Qiuhao Li
2021-01-10 13:10     ` Qiuhao Li
2021-01-10 16:00       ` Alexander Bulekov
2021-01-11  2:19         ` Qiuhao Li
2020-12-29  4:40 ` [PATCH v4 2/7] fuzz: double the IOs to remove for every loop Qiuhao Li
2021-01-07  4:19   ` Alexander Bulekov
2020-12-29  4:40 ` [PATCH v4 3/7] fuzz: split write operand using binary approach Qiuhao Li
2021-01-07  4:28   ` Alexander Bulekov
2020-12-29  4:40 ` [PATCH v4 4/7] fuzz: loop the remove minimizer and refactoring Qiuhao Li
2021-01-07  4:53   ` Alexander Bulekov
2021-01-08  2:49     ` Qiuhao Li
2020-12-29  4:40 ` [PATCH v4 5/7] fuzz: set bits in operand of write/out to zero Qiuhao Li
2021-01-07  5:08   ` Alexander Bulekov
2020-12-29  4:40 ` [PATCH v4 6/7] fuzz: add minimization options Qiuhao Li
2021-01-07  5:54   ` Alexander Bulekov
2020-12-29  4:40 ` [PATCH v4 7/7] fuzz: heuristic split write based on past IOs Qiuhao Li
2021-01-08  4:30   ` Alexander Bulekov
2021-01-05  8:00 ` Ping: [PATCH v4 0/7] fuzz: improve crash case minimization Qiuhao Li
2021-01-08  4:32 ` Alexander Bulekov
