All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qiuhao Li <Qiuhao.Li@outlook.com>
To: alxndr@bu.edu, qemu-devel@nongnu.org
Cc: darren.kenny@oracle.com, bsd@redhat.com, thuth@redhat.com,
	stefanha@redhat.com, pbonzini@redhat.com
Subject: [PATCH 1/4] fuzz: refine crash detection mechanism
Date: Sun, 20 Dec 2020 02:56:24 +0800	[thread overview]
Message-ID: <ME3P282MB14924A6558A105B7FBFA579DFCC20@ME3P282MB1492.AUSP282.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <ME3P282MB1492BFA2302041F2AB420EBEFCC20@ME3P282MB1492.AUSP282.PROD.OUTLOOK.COM>

The original crash detection method is to fork a process to test our new
trace input. If the child process exits in time and the second-to-last line
is the same as the first crash, we think it is a crash triggered by the same
bug. However, in some situations, it doesn't work since it is a
hardcoded-offset string comparison.

For example, suppose an assertion failure makes the crash. In that case, the
second-to-last line will be 'timeout: the monitored command dumped core',
which doesn't contain any information about the assertion failure like where
it happened or the assertion statement. This may lead to a minimized input
triggers assertion failure but may indicate another bug. As for some
sanitizers' crashes, the direct string comparison may stop us from getting a
smaller input, since they may have a different leaf stack frame.

Perhaps we can detect crashes using both precise output string comparison
and rough pattern string match and info the user when the trace input
triggers different but a seminar output.

Tested:
Assertion failure, https://bugs.launchpad.net/qemu/+bug/1908062
AddressSanitizer, https://bugs.launchpad.net/qemu/+bug/1907497
Trace input that doesn't crash
Trace input that crashes Qtest

Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 59 ++++++++++++++++++------
 1 file changed, 46 insertions(+), 13 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 5e405a0d5f..d3b09e6567 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -10,11 +10,16 @@ import os
 import subprocess
 import time
 import struct
+import re
 
 QEMU_ARGS = None
 QEMU_PATH = None
 TIMEOUT = 5
-CRASH_TOKEN = None
+
+crash_patterns = ("Assertion.+failed",
+                  "SUMMARY.+Sanitizer")
+crash_pattern = None
+crash_string = None
 
 write_suffix_lookup = {"b": (1, "B"),
                        "w": (2, "H"),
@@ -24,13 +29,12 @@ write_suffix_lookup = {"b": (1, "B"),
 def usage():
     sys.exit("""\
 Usage: QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} input_trace output_trace
-By default, will try to use the second-to-last line in the output to identify
-whether the crash occred. Optionally, manually set a string that idenitifes the
-crash by setting CRASH_TOKEN=
+By default, we will try to search predefined crash patterns through the
+tracing output to see whether the crash occred. Optionally, manually set a
+string that idenitifes the crash by setting CRASH_PATTERN=
 """.format((sys.argv[0])))
 
 def check_if_trace_crashes(trace, path):
-    global CRASH_TOKEN
     with open(path, "w") as tracefile:
         tracefile.write("".join(trace))
 
@@ -42,17 +46,47 @@ def check_if_trace_crashes(trace, path):
                           shell=True,
                           stdin=subprocess.PIPE,
                           stdout=subprocess.PIPE)
+    if rc.returncode == 137:    # Timed Out
+        return False
+
     stdo = rc.communicate()[0]
     output = stdo.decode('unicode_escape')
-    if rc.returncode == 137:    # Timed Out
-        return False
-    if len(output.splitlines()) < 2:
+    output_lines = output.splitlines()
+    # Usually we care about the summary info in the last few lines, reverse.
+    output_lines.reverse()
+
+    global crash_pattern, crash_patterns, crash_string
+    if crash_pattern is None: # Initialization
+        for line in output_lines:
+            for c in crash_patterns:
+                if re.search(c, line) is not None:
+                    crash_pattern = c
+                    crash_string = line
+                    print("Identifying crash pattern by this string: ",\
+                          crash_string)
+                    print("Using regex pattern: ", crash_pattern)
+                    return True
+        print("Failed to initialize crash pattern: no match.")
         return False
 
-    if CRASH_TOKEN is None:
-        CRASH_TOKEN = output.splitlines()[-2]
+    # First, we search exactly the previous crash string.
+    for line in output_lines:
+        if crash_string == line:
+            return True
+
+    # Then we decide whether a similar (same pattern) crash happened.
+    # Slower now :(
+    for line in output_lines:
+        if re.search(crash_pattern, line) is not None:
+            print("\nINFO: The crash string changed during our minimization process.")
+            print("Before: ", crash_string)
+            print("After: ", line)
+            print("The original regex pattern can still match, updated the crash string.")
+            crash_string = line
+            return True
 
-    return CRASH_TOKEN in output
+    # The input did not trigger (the same type) bug.
+    return False
 
 
 def minimize_trace(inpath, outpath):
@@ -66,7 +100,6 @@ def minimize_trace(inpath, outpath):
     print("Crashed in {} seconds".format(end-start))
     TIMEOUT = (end-start)*5
     print("Setting the timeout for {} seconds".format(TIMEOUT))
-    print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
 
     i = 0
     newtrace = trace[:]
@@ -152,6 +185,6 @@ if __name__ == '__main__':
         usage()
     # if "accel" not in QEMU_ARGS:
     #     QEMU_ARGS += " -accel qtest"
-    CRASH_TOKEN = os.getenv("CRASH_TOKEN")
+    crash_pattern = os.getenv("CRASH_PATTERN")
     QEMU_ARGS += " -qtest stdio -monitor none -serial none "
     minimize_trace(sys.argv[1], sys.argv[2])
-- 
2.25.1



  reply	other threads:[~2020-12-19 19:00 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-19 18:39 [PATCH 0/4] improve crash case minimization Qiuhao Li
2020-12-19 18:56 ` Qiuhao Li [this message]
2020-12-21 18:46   ` [PATCH 1/4] fuzz: refine crash detection mechanism Alexander Bulekov
2020-12-22 11:18     ` Qiuhao Li
2020-12-22 16:47   ` Alexander Bulekov
2020-12-23  5:58     ` Li Qiuhao
2020-12-19 18:56 ` [PATCH 2/4] fuzz: split QTest writes from the rightmost byte Qiuhao Li
2020-12-21 20:01   ` Alexander Bulekov
2020-12-22 11:20     ` Qiuhao Li
2020-12-19 18:56 ` [PATCH 3/4] fuzz: setting bits in operand of out/write to zero Qiuhao Li
2020-12-21 20:35   ` Alexander Bulekov
2020-12-22 11:21     ` Qiuhao Li
2020-12-19 18:56 ` [PATCH 4/4] fuzz: delay IO until they can't trigger the crash Qiuhao Li
2020-12-21 21:17   ` Alexander Bulekov
2020-12-22 11:22     ` Qiuhao Li
2020-12-22 18:30       ` Alexander Bulekov
2020-12-23  9:20         ` Qiuhao Li
2020-12-25  0:24           ` Alexander Bulekov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ME3P282MB14924A6558A105B7FBFA579DFCC20@ME3P282MB1492.AUSP282.PROD.OUTLOOK.COM \
    --to=qiuhao.li@outlook.com \
    --cc=alxndr@bu.edu \
    --cc=bsd@redhat.com \
    --cc=darren.kenny@oracle.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.