Re: [PATCH 00/10] Retire Fork-Based Fuzzing

From: Stefan Hajnoczi <stefanha@gmail.com>
To: Laurent Vivier <lvivier@redhat.com>
Cc: "Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Alexander Bulekov" <alxndr@bu.edu>,
	qemu-devel@nongnu.org, "Bandan Das" <bsd@redhat.com>,
	"Darren Kenny" <darren.kenny@oracle.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>
Subject: Re: [PATCH 00/10] Retire Fork-Based Fuzzing
Date: Tue, 14 Feb 2023 13:46:48 -0500	[thread overview]
Message-ID: <CAJSP0QWeK-EKt_LhWE4i28zDxh78qrvqtFtSD0XQmy89+JxxgQ@mail.gmail.com> (raw)
In-Reply-To: <323ad0aa-e38f-bcc1-8c6f-0732c1cd0fc0@redhat.com>

On Tue, 14 Feb 2023 at 12:59, Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 2/14/23 17:08, Philippe Mathieu-Daudé wrote:
> > On 14/2/23 16:38, Stefan Hajnoczi wrote:
> >> On Sat, Feb 04, 2023 at 11:29:41PM -0500, Alexander Bulekov wrote:
> >>> Hello,
> >>> This series removes fork-based fuzzing.
> >>> How does fork-based fuzzing work?
> >>>   * A single parent process initializes QEMU
> >>>   * We identify the devices we wish to fuzz (fuzzer-dependent)
> >>>   * Use QTest to PCI enumerate the devices
> >>>   * After that we start a fork-server which forks the process and executes
> >>>     fuzzer inputs inside the disposable children.
> >>>
> >>> In a normal fuzzing process, everything happens in a single process.
> >>>
> >>> Pros of fork-based fuzzing:
> >>>   * We only need to do common configuration once (e.g. PCI enumeration).
> >>>   * Fork provides a strong guarantee that fuzzer inputs will not interfere with
> >>>     each-other
> >>>   * The fuzzing process can continue even after a child-process crashes
> >>>   * We can apply our-own timers to child-processes to exit slow inputs, early
> >>>
> >>> Cons of fork-based fuzzing:
> >>>   * Fork-based fuzzing is not supported by libfuzzer. We had to build our own
> >>>     fork-server and rely on tricks using linker-scripts and shared-memory to
> >>>     support fuzzing. ( https://physics.bu.edu/~alxndr/libfuzzer-forkserver/ )
> >>>   * Fork-based fuzzing is currently the main blocker preventing us from enabling
> >>>     other fuzzers such as AFL++ on OSS-Fuzz
> >>>   * Fork-based fuzzing may be a reason why coverage-builds are failing on
> >>>     OSS-Fuzz. Coverage is an important fuzzing metric which would allow us to
> >>>     find parts of the code that are not well-covered.
> >>>   * Fork-based fuzzing has high overhead. fork() is an expensive system-call,
> >>>     especially for processes running ASAN (with large/complex) VMA layouts.
> >>>   * Fork prevents us from effectively fuzzing devices that rely on
> >>>     threads (e.g. qxl).
> >>>
> >>> These patches remove fork-based fuzzing and replace it with reboot-based
> >>> fuzzing for most cases. Misc notes about this change:
> >>>   * libfuzzer appears to be no longer in active development. As such, the
> >>>     current implementation of fork-based fuzzing (while having some nice
> >>>     advantages) is likely to hold us back in the future. If these changes
> >>>     are approved and appear to run successfully on OSS-Fuzz, we should be
> >>>     able to easily experiment with other fuzzing engines (AFL++).
> >>>   * Some device do not completely reset their state. This can lead to
> >>>     non-reproducible crashes. However, in my local tests, most crashes
> >>>     were reproducible. OSS-Fuzz shouldn't send us reports unless it can
> >>>     consistently reproduce a crash.
> >>>   * In theory, the corpus-format should not change, so the existing
> >>>     corpus-inputs on OSS-Fuzz will transfer to the new reset()-able
> >>>     fuzzers.
> >>>   * Each fuzzing process will now exit after a single crash is found. To
> >>>     continue the fuzzing process, use libfuzzer flags such as -jobs=-1
> >>>   * We no long control input-timeouts (those are handled by libfuzzer).
> >>>     Since timeouts on oss-fuzz can be many seconds long, I added a limit
> >>>     on the number of DMA bytes written.
> >>>
> >>> Alexander Bulekov (10):
> >>>    hw/sparse-mem: clear memory on reset
> >>>    fuzz: add fuzz_reboot API
> >>>    fuzz/generic-fuzz: use reboots instead of forks to reset state
> >>>    fuzz/generic-fuzz: add a limit on DMA bytes written
> >>>    fuzz/virtio-scsi: remove fork-based fuzzer
> >>>    fuzz/virtio-net: remove fork-based fuzzer
> >>>    fuzz/virtio-blk: remove fork-based fuzzer
> >>>    fuzz/i440fx: remove fork-based fuzzer
> >>>    fuzz: remove fork-fuzzing scaffolding
> >>>    docs/fuzz: remove mentions of fork-based fuzzing
> >>>
> >>>   docs/devel/fuzzing.rst              |  22 +-----
> >>>   hw/mem/sparse-mem.c                 |  13 +++-
> >>>   meson.build                         |   4 -
> >>>   tests/qtest/fuzz/fork_fuzz.c        |  41 ----------
> >>>   tests/qtest/fuzz/fork_fuzz.h        |  23 ------
> >>>   tests/qtest/fuzz/fork_fuzz.ld       |  56 --------------
> >>>   tests/qtest/fuzz/fuzz.c             |   6 ++
> >>>   tests/qtest/fuzz/fuzz.h             |   2 +-
> >>>   tests/qtest/fuzz/generic_fuzz.c     | 111 +++++++---------------------
> >>>   tests/qtest/fuzz/i440fx_fuzz.c      |  27 +------
> >>>   tests/qtest/fuzz/meson.build        |   6 +-
> >>>   tests/qtest/fuzz/virtio_blk_fuzz.c  |  51 ++-----------
> >>>   tests/qtest/fuzz/virtio_net_fuzz.c  |  54 ++------------
> >>>   tests/qtest/fuzz/virtio_scsi_fuzz.c |  51 ++-----------
> >>>   14 files changed, 72 insertions(+), 395 deletions(-)
> >>>   delete mode 100644 tests/qtest/fuzz/fork_fuzz.c
> >>>   delete mode 100644 tests/qtest/fuzz/fork_fuzz.h
> >>>   delete mode 100644 tests/qtest/fuzz/fork_fuzz.ld
> >>>
> >>> --
> >>> 2.39.0
> >>>
> >>
> >> Whose tree should this go through? Laurent's qtest tree?
> >
> > Do you mean Thomas?
> >
> > $ git shortlog -cs tests/qtest/fuzz | sort -rn
> >      32  Thomas Huth
> >      26  Paolo Bonzini
> >      19  Stefan Hajnoczi
> >       6  Markus Armbruster
> >       5  Alexander Bulekov
> >       4  Marc-André Lureau
> >       3  Peter Maydell
> >       2  Laurent Vivier
> >       1  Michael S. Tsirkin
> >       1  Gerd Hoffmann
> >
> > In doubt, cc'ing both :)
>
> Yes, Thomas is the real maintainer.

Want to update the ./MAINTAINERS file? That's where I found your name.
Thomas is only listed as a reviewer.

Stefan