qemu-devel.nongnu.org archive mirror
* Ramping up Continuous Fuzzing of Virtual Devices in QEMU
@ 2020-10-22 16:19 Alexander Bulekov
  2020-10-22 16:24 ` Alexander Bulekov
  2020-10-24  3:10 ` Li Qiang
  0 siblings, 2 replies; 11+ messages in thread
From: Alexander Bulekov @ 2020-10-22 16:19 UTC
  To: qemu-devel
  Cc: peter.maydell, thuth, berrange, 0ops, liq3ea, f4bug, rjones,
	darren.kenny, bsd, stefanha, andrey.shinkevich, pbonzini,
	dimastep

Hello,
QEMU was accepted into Google's oss-fuzz continuous-fuzzing platform [1]
earlier this year. The fuzzers currently running on oss-fuzz are based on my
2019 Google Summer of Code Project, which leveraged libfuzzer, qtest and libqos
to provide a framework for writing virtual-device fuzzers. At the moment, there
are a handful of fuzzers upstream and running on oss-fuzz (located in
tests/qtest/fuzz/). They fuzz only a few devices and serve mostly as
examples.

If everything goes well, soon a generic fuzzer [2] will land upstream, which
allows us to fuzz many configurations of QEMU, without any device-specific
code. To date this fuzzer has led to ~50 bug reports on launchpad. Once the
generic-fuzzer lands upstream, OSS-Fuzz will automatically start fuzzing a
bunch [3] of fuzzer configurations, and it is likely to find bugs.  Others will
also be able to send simple patches to add additional device configurations for
fuzzing.

The oss-fuzz process looks roughly like this:
    1. oss-fuzz fuzzes QEMU
    2. When oss-fuzz finds a bug, it reports it to a few [4] people who have
    access to reports and reproducers.
    3. If a fix is merged upstream, oss-fuzz will figure this out, mark the
    bug as fixed, and make the report public 30 days later.
    4. After 90 days the bug (fixed or not) becomes public, so anyone can view
    it here: https://bugs.chromium.org/p/oss-fuzz/issues/list
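
The disclosure timing in the steps above can be sketched as follows. This is
only a reading of the description in this mail, not oss-fuzz's actual
implementation: the function name is hypothetical, and it assumes that
whichever of the two deadlines comes first is the one that applies.

```python
from datetime import date, timedelta
from typing import Optional

FIX_GRACE = timedelta(days=30)      # report opens 30 days after a fix is detected
HARD_DEADLINE = timedelta(days=90)  # report opens 90 days after filing, fixed or not


def public_on(reported: date, fixed: Optional[date] = None) -> date:
    """Return the date a report would become public under the rules above."""
    deadline = reported + HARD_DEADLINE
    if fixed is None:
        return deadline
    # Assumption: the earlier of the two deadlines wins.
    return min(deadline, fixed + FIX_GRACE)
```

Under these assumptions, a bug that is never fixed simply becomes public on
the 90-day deadline, while a quick fix shortens the embargo.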

The oss-fuzz reports look like this:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2

This means that when oss-fuzz finds new bugs, the relevant developers do not
know about them unless someone with access files a separate report to the
list/launchpad. So far this hasn't been a problem, since oss-fuzz has only been
running some small example fuzzers. Once [2] lands upstream, we should
see a significant uptick in oss-fuzz reports, and I hope that we can develop a
process to ensure these bugs are properly dealt with. One option we have is to
make the reports public immediately and send notifications to
qemu-devel. This is the approach taken by some other projects on
oss-fuzz, such as LLVM. Though it's not on oss-fuzz, bugs found by
syzkaller in the kernel are also automatically sent to a public list.
The question is:

What approach should we take for dealing with bugs found on oss-fuzz?

[1] https://github.com/google/oss-fuzz
[2] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06331.html
[3] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06345.html
[4] https://github.com/google/oss-fuzz/blob/fbf916ce14952ba192e58fe8550096b868fcf62d/projects/qemu/project.yaml#L4

For further reference, the vast majority of these bugs were found with the
generic-fuzzer:
https://bugs.launchpad.net/~a1xndr/+bugs

There are more that I haven't yet had time to write reports for.
Thank you
-Alex



* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-22 16:19 Ramping up Continuous Fuzzing of Virtual Devices in QEMU Alexander Bulekov
@ 2020-10-22 16:24 ` Alexander Bulekov
  2020-10-22 16:39   ` Daniel P. Berrangé
  2020-10-24  3:10 ` Li Qiang
  1 sibling, 1 reply; 11+ messages in thread
From: Alexander Bulekov @ 2020-10-22 16:24 UTC
  To: qemu-devel
  Cc: peter.maydell, thuth, berrange, 0ops, liq3ea, f4bug, rjones,
	darren.kenny, bsd, stefanha, andrey.shinkevich, pbonzini,
	ppandit, dimastep

+CC Prasad




* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-22 16:24 ` Alexander Bulekov
@ 2020-10-22 16:39   ` Daniel P. Berrangé
  2020-10-22 18:07     ` Alexander Bulekov
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Daniel P. Berrangé @ 2020-10-22 16:39 UTC
  To: Alexander Bulekov
  Cc: peter.maydell, thuth, rjones, 0ops, liq3ea, qemu-devel, f4bug,
	darren.kenny, bsd, stefanha, pbonzini, andrey.shinkevich,
	ppandit, dimastep

On Thu, Oct 22, 2020 at 12:24:16PM -0400, Alexander Bulekov wrote:
> +CC Prasad
> 
> On 201022 1219, Alexander Bulekov wrote:
> > Hello,
> > QEMU was accepted into Google's oss-fuzz continuous-fuzzing platform [1]
> > earlier this year. The fuzzers currently running on oss-fuzz are based on my
> > 2019 Google Summer of Code Project, which leveraged libfuzzer, qtest and libqos
> > to provide a framework for writing virtual-device fuzzers. At the moment, there
> > are a handful of fuzzers upstream and running on oss-fuzz(located in
> > tests/qtest/fuzz/). They fuzz only a few devices and serve mostly as
> > examples.
> > 
> > If everything goes well, soon a generic fuzzer [2] will land upstream, which
> > allows us to fuzz many configurations of QEMU, without any device-specific
> > code. To date this fuzzer has led to ~50 bug reports on launchpad. Once the
> > generic-fuzzer lands upstream, OSS-Fuzz will automatically start fuzzing a
> > bunch [3] of fuzzer configurations, and it is likely to find bugs.  Others will
> > also be able to send simple patches to add additional device configurations for
> > fuzzing.
> > 
> > The oss-fuzz process looks roughly like this:
> >     1. oss-fuzz fuzzes QEMU
> >     2. When oss-fuzz finds a bug, it reports it to a few [4] people that have
> >     access to reports and reproducers.
> >     3. If a fix is merged upstream, oss-fuzz will figure this out and mark the
> >     bug as fixed and make the report public 30 days later.
> >     3. After 90 days the bug(fixed or not) becomes public, so anyone can view
> >     it here https://bugs.chromium.org/p/oss-fuzz/issues/list
> > 
> > The oss-fuzz reports look like this:
> > https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2
> > 
> > This means that when oss-fuzz find new bugs, the relevant developers do not
> > know about them unless someone with access files a separate report to the
> > list/launchpad. So far this hasn't been a problem, since oss-fuzz has only been
> > running some small example fuzzers. Once [2] lands upstream, we should
> > see a significant uptick in oss-fuzz reports, and I hope that we can develop a
> > process to ensure these bugs are properly dealt with. One option we have is to
> > make the reports public immediately and send notifications to
> > qemu-devel. This is the approach taken by some other projects on
> > oss-fuzz, such as LLVM. Though its not on oss-fuzz, bugs found by
> > syzkaller in the kernel, are also automatically sent to a public list.
> > The question is: 
> > 
> > What approach should we take for dealing with bugs found on oss-fuzz?

If we assume that a non-negligible number of fuzz bugs will be exploitable
by a malicious guest OS to break out into the host, then I think it is
likely undesirable to make them public immediately without at least a basic
human triage step to catch possibly serious security issues.

Still, a large percentage are likely to be low impact and not urgent to deal
with, so we want a low-overhead way to handle the fuzz output that doesn't
create a bottleneck on a small number of people.

Overall my feeling is that we want to be able to farm out triage to the
respective subsystem maintainers, who can then decide whether the bug
needs notifying to the security team, or can be made public immediately.

I think ideally we would be doing triage in QEMU's own bug tracker, so
we don't need to have maintainers create accounts on a 3rd party tracker
to see reports.

Is it practical to identify a primary affected source file from the fuzz
crash report with any level of reliability, such that we could file a private
launchpad bug, automatically CC'ing a subsystem maintainer (and the security
team)?


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-22 16:39   ` Daniel P. Berrangé
@ 2020-10-22 18:07     ` Alexander Bulekov
  2020-10-22 21:17     ` Philippe Mathieu-Daudé
  2020-11-04 10:30     ` P J P
  2 siblings, 0 replies; 11+ messages in thread
From: Alexander Bulekov @ 2020-10-22 18:07 UTC
  To: Daniel P. Berrangé
  Cc: peter.maydell, thuth, rjones, 0ops, liq3ea, qemu-devel, f4bug,
	darren.kenny, bsd, stefanha, pbonzini, andrey.shinkevich,
	ppandit, dimastep

On 201022 1739, Daniel P. Berrangé wrote:
> On Thu, Oct 22, 2020 at 12:24:16PM -0400, Alexander Bulekov wrote:
> > +CC Prasad
> > 
> > On 201022 1219, Alexander Bulekov wrote:
> > > Hello,
> > > QEMU was accepted into Google's oss-fuzz continuous-fuzzing platform [1]
> > > earlier this year. The fuzzers currently running on oss-fuzz are based on my
> > > 2019 Google Summer of Code Project, which leveraged libfuzzer, qtest and libqos
> > > to provide a framework for writing virtual-device fuzzers. At the moment, there
> > > are a handful of fuzzers upstream and running on oss-fuzz(located in
> > > tests/qtest/fuzz/). They fuzz only a few devices and serve mostly as
> > > examples.
> > > 
> > > If everything goes well, soon a generic fuzzer [2] will land upstream, which
> > > allows us to fuzz many configurations of QEMU, without any device-specific
> > > code. To date this fuzzer has led to ~50 bug reports on launchpad. Once the
> > > generic-fuzzer lands upstream, OSS-Fuzz will automatically start fuzzing a
> > > bunch [3] of fuzzer configurations, and it is likely to find bugs.  Others will
> > > also be able to send simple patches to add additional device configurations for
> > > fuzzing.
> > > 
> > > The oss-fuzz process looks roughly like this:
> > >     1. oss-fuzz fuzzes QEMU
> > >     2. When oss-fuzz finds a bug, it reports it to a few [4] people that have
> > >     access to reports and reproducers.
> > >     3. If a fix is merged upstream, oss-fuzz will figure this out and mark the
> > >     bug as fixed and make the report public 30 days later.
> > >     3. After 90 days the bug(fixed or not) becomes public, so anyone can view
> > >     it here https://bugs.chromium.org/p/oss-fuzz/issues/list
> > > 
> > > The oss-fuzz reports look like this:
> > > https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2
> > > 
> > > This means that when oss-fuzz find new bugs, the relevant developers do not
> > > know about them unless someone with access files a separate report to the
> > > list/launchpad. So far this hasn't been a problem, since oss-fuzz has only been
> > > running some small example fuzzers. Once [2] lands upstream, we should
> > > see a significant uptick in oss-fuzz reports, and I hope that we can develop a
> > > process to ensure these bugs are properly dealt with. One option we have is to
> > > make the reports public immediately and send notifications to
> > > qemu-devel. This is the approach taken by some other projects on
> > > oss-fuzz, such as LLVM. Though its not on oss-fuzz, bugs found by
> > > syzkaller in the kernel, are also automatically sent to a public list.
> > > The question is: 
> > > 
> > > What approach should we take for dealing with bugs found on oss-fuzz?
> 
> If we assume that a non-negligible number of fuzz bugs will be exploitable
> by a malicious guest OS to break out into the host, then I think it is
> likely undesirable to make them public immediately without at least a basic
> human triage step to catch possibly serious security issues.
> 
> Still a large % are likely to be low impact / not urgent to deal with so
> we want a low overhead way to handle the fuzz output, which doesn't create
> a bottleneck on a small number of people.
> 
> Overall my feeling is that we want to be able to farm out triage to the
> respective subsystem maintainers, who can then decide whether the bug
> needs notifying to the security team, or can be made public immediately.
> 
> I think ideally we would be doing triage in QEMU's own bug tracker, so
> we don't need to have maintainers create accounts on a 3rd party tracker
> to see reports.
> 
> Is is practical to identify a primary affected source file from the fuzz
> crash report with any level reliablility such that we could file a private
> launchpad bug, automatically CC'ing a subsystem maintainer (and the security
> team)  ?

Hi Daniel,
As far as I know, there is currently no API for accessing oss-fuzz
results. We could use email-based scripts to parse the automated reports
(e.g.  [1]) and follow the links to automatically download crash-traces
and reproducers. However, accessing those requires a login through
google.com which might be tough to script against in a reliable way.
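
As a rough sketch of the email-parsing step only (not the authenticated
download, which is the hard part), something like the following could pull
issue IDs out of a notification mail. The function name is hypothetical, and
the URL shape is assumed from the report link cited in this thread; other
link formats are not handled.

```python
import re
from typing import List
from urllib.parse import parse_qs, urlparse

# Matches tracker links shaped like the one cited in this thread.
ISSUE_URL = re.compile(r"https://bugs\.chromium\.org/p/oss-fuzz/issues/detail\?\S+")


def issue_ids(mail_body: str) -> List[int]:
    """Collect oss-fuzz issue IDs linked from a notification mail body."""
    ids: List[int] = []
    for url in ISSUE_URL.findall(mail_body):
        query = parse_qs(urlparse(url).query)
        for value in query.get("id", []):
            # Ignore malformed IDs (e.g. trailing punctuation) and repeats.
            if value.isdigit() and int(value) not in ids:
                ids.append(int(value))
    return ids
```

The IDs could then be used to look up the matching report once a reliable way
to authenticate has been found.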

Assuming we have found a way to automatically download the binary fuzzer
reproducer, we should be able to automatically convert it into a qtest
reproducer that we could send to the right people. 
There are a few approaches I can think of to automatically identify the
maintainers to CC:
 1. Walk the stack-trace until we find the line likely responsible for
 the bug. This can be tricky, since the buggy line is often not the
 first line. E.g. from [2]
    #3 __GI___assert_fail assert.c:101
    #4 iov_from_buf_full util/iov.c:40
    #5 iov_from_buf iov.h:49
    #6 net_tx_pkt_update_ip_checksums hw/net/net_tx_pkt.c:139
    #7 e1000e_setup_tx_offloads hw/net/e1000e_core.c:638

  Do we cc the iov maintainer, the net_tx_pkt maintainer, or the e1000e
  maintainer? Might be tough to figure out automatically.

 2. git bisect until we find the commit where the reproducer started to
 crash and CC everyone who signed off on or reviewed that commit and who is
 in MAINTAINERS. This is also not a silver bullet, since reproducers
 might stop working for reasons unrelated to the bug: e.g. the MMIO
 range for a device changed, so the qtest reproducer now interacts with
 the wrong addresses, or the bug was obscured by another bug.
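
 To make approach 1 concrete, here is a minimal sketch that walks a trace
 shaped like the excerpt above and keeps only the frames that look like device
 code. The frame format, the function name and the "generic helper" prefix
 list are all assumptions made for illustration; a real trace would need a
 more careful parser, and the heuristic does not resolve the ambiguity
 described above, it only narrows the candidates.

```python
import re
from typing import List

# Frame lines shaped like the excerpt above: "#6 func_name path/to/file.c:139"
FRAME = re.compile(r"#(\d+)\s+(\S+)\s+(\S+):(\d+)")

# Hypothetical heuristic: frames under these prefixes are treated as generic
# helpers rather than the likely home of the bug.
GENERIC_PREFIXES = ("util/",)


def candidate_files(trace: str) -> List[str]:
    """Return device-looking source files from a trace, innermost first."""
    files: List[str] = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if not match:
            continue
        path = match.group(3)
        # Skip frames given by bare file name (libc, headers) and frames
        # under the generic-helper prefixes.
        if "/" not in path or path.startswith(GENERIC_PREFIXES):
            continue
        if path not in files:
            files.append(path)
    return files
```

 For the trace above this leaves hw/net/net_tx_pkt.c and
 hw/net/e1000e_core.c, each of which could then be passed to
 scripts/get_maintainer.pl -f to look up people to CC.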

 [1] https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2
 [2] https://bugs.launchpad.net/qemu/+bug/1878250
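
 For approach 2, the per-commit check could be a small script driven by
 "git bisect run", which treats exit status 0 as good, 125 as "skip this
 commit" and other statuses from 1 to 127 as bad. The sketch below assumes
 that any non-zero status from the reproducer means "still crashes"; the
 function names and the commands passed in are hypothetical, and hangs are
 not handled.

```python
import subprocess
from typing import Sequence

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def classify(build_ok: bool, reproducer_rc: int) -> int:
    """Map one bisect step's outcome to a `git bisect run` exit code."""
    if not build_ok:
        return SKIP  # this commit cannot be tested, let bisect pick another
    # Assumption: any non-zero status (including death by signal, which
    # subprocess reports as a negative value) means the crash reproduced.
    return BAD if reproducer_rc != 0 else GOOD


def run_step(build_cmd: Sequence[str], repro_cmd: Sequence[str]) -> int:
    """Build the current commit, run the reproducer, and classify the result."""
    build = subprocess.run(build_cmd)
    if build.returncode != 0:
        return classify(False, 0)
    repro = subprocess.run(repro_cmd)
    return classify(True, repro.returncode)
```

 A wrapper would call sys.exit(run_step(...)) with the real build and
 reproducer commands. The caveat above still applies: a reproducer that stops
 crashing for unrelated reasons will make bisect point at the wrong commit.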

-Alex

> 
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 



* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-22 16:39   ` Daniel P. Berrangé
  2020-10-22 18:07     ` Alexander Bulekov
@ 2020-10-22 21:17     ` Philippe Mathieu-Daudé
  2020-11-04 10:30     ` P J P
  2 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-10-22 21:17 UTC
  To: Daniel P. Berrangé, Alexander Bulekov
  Cc: peter.maydell, thuth, 0ops, liq3ea, rjones, qemu-devel,
	darren.kenny, bsd, stefanha, andrey.shinkevich, pbonzini,
	ppandit, dimastep

On 10/22/20 6:39 PM, Daniel P. Berrangé wrote:
> On Thu, Oct 22, 2020 at 12:24:16PM -0400, Alexander Bulekov wrote:
>> +CC Prasad
>>
>> On 201022 1219, Alexander Bulekov wrote:
>>> Hello,
>>> QEMU was accepted into Google's oss-fuzz continuous-fuzzing platform [1]
>>> earlier this year. The fuzzers currently running on oss-fuzz are based on my
>>> 2019 Google Summer of Code Project, which leveraged libfuzzer, qtest and libqos
>>> to provide a framework for writing virtual-device fuzzers. At the moment, there
>>> are a handful of fuzzers upstream and running on oss-fuzz(located in
>>> tests/qtest/fuzz/). They fuzz only a few devices and serve mostly as
>>> examples.
>>>
>>> If everything goes well, soon a generic fuzzer [2] will land upstream, which
>>> allows us to fuzz many configurations of QEMU, without any device-specific
>>> code. To date this fuzzer has led to ~50 bug reports on launchpad. Once the
>>> generic-fuzzer lands upstream, OSS-Fuzz will automatically start fuzzing a
>>> bunch [3] of fuzzer configurations, and it is likely to find bugs.  Others will
>>> also be able to send simple patches to add additional device configurations for
>>> fuzzing.
>>>
>>> The oss-fuzz process looks roughly like this:
>>>      1. oss-fuzz fuzzes QEMU
>>>      2. When oss-fuzz finds a bug, it reports it to a few [4] people that have
>>>      access to reports and reproducers.
>>>      3. If a fix is merged upstream, oss-fuzz will figure this out and mark the
>>>      bug as fixed and make the report public 30 days later.
>>>      3. After 90 days the bug(fixed or not) becomes public, so anyone can view
>>>      it here https://bugs.chromium.org/p/oss-fuzz/issues/list
>>>
>>> The oss-fuzz reports look like this:
>>> https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2
>>>
>>> This means that when oss-fuzz find new bugs, the relevant developers do not
>>> know about them unless someone with access files a separate report to the
>>> list/launchpad. So far this hasn't been a problem, since oss-fuzz has only been
>>> running some small example fuzzers. Once [2] lands upstream, we should
>>> see a significant uptick in oss-fuzz reports, and I hope that we can develop a
>>> process to ensure these bugs are properly dealt with. One option we have is to
>>> make the reports public immediately and send notifications to
>>> qemu-devel. This is the approach taken by some other projects on
>>> oss-fuzz, such as LLVM. Though its not on oss-fuzz, bugs found by
>>> syzkaller in the kernel, are also automatically sent to a public list.
>>> The question is:
>>>
>>> What approach should we take for dealing with bugs found on oss-fuzz?
> 
> If we assume that a non-negligible number of fuzz bugs will be exploitable
> by a malicious guest OS to break out into the host, then I think it is
> likely undesirable to make them public immediately without at least a basic
> human triage step to catch possibly serious security issues.
> 
> Still a large % are likely to be low impact / not urgent to deal with so
> we want a low overhead way to handle the fuzz output, which doesn't create
> a bottleneck on a small number of people.
> 
> Overall my feeling is that we want to be able to farm out triage to the
> respective subsystem maintainers, who can then decide whether the bug
> needs notifying to the security team, or can be made public immediately.
> 
> I think ideally we would be doing triage in QEMU's own bug tracker, so
> we don't need to have maintainers create accounts on a 3rd party tracker
> to see reports.
> 
> Is is practical to identify a primary affected source file from the fuzz
> crash report with any level reliablility such that we could file a private
> launchpad bug, automatically CC'ing a subsystem maintainer (and the security
> team)  ?

Also, what is not very clear is who is supposed to (or going to) fix these bugs.

I see this pattern:

a) bug found by human: human keeps asking about the bug
   1/ security issue: someone is assigned to fix it
   2/ else: if the human keeps asking, the bug eventually gets fixed.

b) bug found by fuzzer:
   1/ security issue: someone is assigned to fix it
   2/ else: nothing happens, because it is unlikely to be hit by a user

Do we want to keep tracking b.2 bug reports? I think this is the case for
the ~50 reports Alexander mentioned.

Regards,

Phil.



* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-22 16:19 Ramping up Continuous Fuzzing of Virtual Devices in QEMU Alexander Bulekov
  2020-10-22 16:24 ` Alexander Bulekov
@ 2020-10-24  3:10 ` Li Qiang
  2020-10-26 16:17   ` Alexander Bulekov
  1 sibling, 1 reply; 11+ messages in thread
From: Li Qiang @ 2020-10-24  3:10 UTC
  To: Alexander Bulekov
  Cc: Peter Maydell, Thomas Huth, Daniel P. Berrange,
	Richard W.M. Jones, 0ops, Qemu Developers,
	Philippe Mathieu-Daudé,
	Darren Kenny, bsd, Stefan Hajnoczi, andrey.shinkevich,
	Paolo Bonzini, dimastep

Alexander Bulekov <alxndr@bu.edu> wrote on Fri, Oct 23, 2020 at 12:20 AM:
>
> Hello,
> QEMU was accepted into Google's oss-fuzz continuous-fuzzing platform [1]
> earlier this year. The fuzzers currently running on oss-fuzz are based on my
> 2019 Google Summer of Code Project, which leveraged libfuzzer, qtest and libqos
> to provide a framework for writing virtual-device fuzzers. At the moment, there
> are a handful of fuzzers upstream and running on oss-fuzz(located in
> tests/qtest/fuzz/). They fuzz only a few devices and serve mostly as
> examples.
>
> If everything goes well, soon a generic fuzzer [2] will land upstream, which
> allows us to fuzz many configurations of QEMU, without any device-specific
> code. To date this fuzzer has led to ~50 bug reports on launchpad. Once the
> generic-fuzzer lands upstream, OSS-Fuzz will automatically start fuzzing a
> bunch [3] of fuzzer configurations, and it is likely to find bugs.  Others will
> also be able to send simple patches to add additional device configurations for
> fuzzing.
>
> The oss-fuzz process looks roughly like this:
>     1. oss-fuzz fuzzes QEMU
>     2. When oss-fuzz finds a bug, it reports it to a few [4] people that have
>     access to reports and reproducers.
>     3. If a fix is merged upstream, oss-fuzz will figure this out and mark the
>     bug as fixed and make the report public 30 days later.
>     3. After 90 days the bug(fixed or not) becomes public, so anyone can view
>     it here https://bugs.chromium.org/p/oss-fuzz/issues/list
>
> The oss-fuzz reports look like this:
> https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2
>
> This means that when oss-fuzz find new bugs, the relevant developers do not
> know about them unless someone with access files a separate report to the
> list/launchpad. So far this hasn't been a problem, since oss-fuzz has only been
> running some small example fuzzers. Once [2] lands upstream, we should
> see a significant uptick in oss-fuzz reports, and I hope that we can develop a
> process to ensure these bugs are properly dealt with. One option we have is to
> make the reports public immediately and send notifications to
> qemu-devel. This is the approach taken by some other projects on
> oss-fuzz, such as LLVM. Though its not on oss-fuzz, bugs found by
> syzkaller in the kernel, are also automatically sent to a public list.
> The question is:
>
> What approach should we take for dealing with bugs found on oss-fuzz?
>

Hi Alex,

I would prefer to send these bugs to a public list such as qemu-devel.

There are lots of low-impact bugs, so there is no need to set up a private
bug tracker for the less important issues.
Also, the maintainer's decision may take a long time.

With public issues, security engineers, maintainers and volunteers
can all see them and point out their
impact more quickly.



> [1] https://github.com/google/oss-fuzz
> [2] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06331.html
> [3] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06345.html
> [4] https://github.com/google/oss-fuzz/blob/fbf916ce14952ba192e58fe8550096b868fcf62d/projects/qemu/project.yaml#L4

BTW, are there any conditions for joining this list?
I'm quite interested in fixing the qemu issues.

Thanks,
Li Qiang

>
> For further reference, the vast majority of these bugs, were found with the
> generic-fuzzer:
> https://bugs.launchpad.net/~a1xndr/+bugs
>
> There are more that I haven't yet had time to write reports for.
> Thank you
> -Alex



* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-24  3:10 ` Li Qiang
@ 2020-10-26 16:17   ` Alexander Bulekov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexander Bulekov @ 2020-10-26 16:17 UTC
  To: Li Qiang
  Cc: Peter Maydell, Thomas Huth, Daniel P. Berrange,
	Richard W.M. Jones, 0ops, Qemu Developers,
	Philippe Mathieu-Daudé,
	Darren Kenny, bsd, Stefan Hajnoczi, andrey.shinkevich,
	Paolo Bonzini, dimastep

On 201024 1110, Li Qiang wrote:
> Alexander Bulekov <alxndr@bu.edu> wrote on Fri, Oct 23, 2020 at 12:20 AM:
> >
> > Hello,
> > QEMU was accepted into Google's oss-fuzz continuous-fuzzing platform [1]
> > earlier this year. The fuzzers currently running on oss-fuzz are based on my
> > 2019 Google Summer of Code Project, which leveraged libfuzzer, qtest and libqos
> > to provide a framework for writing virtual-device fuzzers. At the moment, there
> > are a handful of fuzzers upstream and running on oss-fuzz(located in
> > tests/qtest/fuzz/). They fuzz only a few devices and serve mostly as
> > examples.
> >
> > If everything goes well, soon a generic fuzzer [2] will land upstream, which
> > allows us to fuzz many configurations of QEMU, without any device-specific
> > code. To date this fuzzer has led to ~50 bug reports on launchpad. Once the
> > generic-fuzzer lands upstream, OSS-Fuzz will automatically start fuzzing a
> > bunch [3] of fuzzer configurations, and it is likely to find bugs.  Others will
> > also be able to send simple patches to add additional device configurations for
> > fuzzing.
> >
> > The oss-fuzz process looks roughly like this:
> >     1. oss-fuzz fuzzes QEMU
> >     2. When oss-fuzz finds a bug, it reports it to a few [4] people that have
> >     access to reports and reproducers.
> >     3. If a fix is merged upstream, oss-fuzz will figure this out and mark the
> >     bug as fixed and make the report public 30 days later.
> >     3. After 90 days the bug(fixed or not) becomes public, so anyone can view
> >     it here https://bugs.chromium.org/p/oss-fuzz/issues/list
> >
> > The oss-fuzz reports look like this:
> > https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=23701&q=qemu&can=2
> >
> > This means that when oss-fuzz find new bugs, the relevant developers do not
> > know about them unless someone with access files a separate report to the
> > list/launchpad. So far this hasn't been a problem, since oss-fuzz has only been
> > running some small example fuzzers. Once [2] lands upstream, we should
> > see a significant uptick in oss-fuzz reports, and I hope that we can develop a
> > process to ensure these bugs are properly dealt with. One option we have is to
> > make the reports public immediately and send notifications to
> > qemu-devel. This is the approach taken by some other projects on
> > oss-fuzz, such as LLVM. Though its not on oss-fuzz, bugs found by
> > syzkaller in the kernel, are also automatically sent to a public list.
> > The question is:
> >
> > What approach should we take for dealing with bugs found on oss-fuzz?
> >
> 
> Hi Alex,
> 
> I prefer to send these bugs to public list such as qemu-devel.
> 
> There are lots of low impact bugs so no need to prepare a private
> bugtracker for the little important issues.
> Also the maintainer's decision may take a long time.
> 
> For the public issues, the security engineer, maintainer and volunteer
> can both see them and point out its
> impact more quickly.
> 
> 
> 
> > [1] https://github.com/google/oss-fuzz
> > [2] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06331.html
> > [3] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06345.html
> > [4] https://github.com/google/oss-fuzz/blob/fbf916ce14952ba192e58fe8550096b868fcf62d/projects/qemu/project.yaml#L4
> 
> BTW, is there any condition to join this lists?
> I'm quite interested to fix the qemu issues.

I guess the original message is trying to figure out what role that list
of people will play. If we go with your option and just add qemu-devel
to the list, then we wouldn't need to worry about adding anyone else to
the list. 
Otherwise, we will need to figure that out - I don't think anyone
currently on that list signed up to triage a bunch of fuzzing bugs :)

As a side note, the generic fuzzers were just merged:
https://git.qemu.org/?p=qemu.git;a=commit;h=e75de8354ac5c67145b2f8874d3c36022d4a94bb
oss-fuzz should start using them for fuzzing sometime tomorrow.
-Alex

> Thanks,
> Li Qiang
> 
> >
> > For further reference, the vast majority of these bugs, were found with the
> > generic-fuzzer:
> > https://bugs.launchpad.net/~a1xndr/+bugs
> >
> > There are more that I haven't yet had time to write reports for.
> > Thank you
> > -Alex



* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-10-22 16:39   ` Daniel P. Berrangé
  2020-10-22 18:07     ` Alexander Bulekov
  2020-10-22 21:17     ` Philippe Mathieu-Daudé
@ 2020-11-04 10:30     ` P J P
  2020-11-04 15:25       ` Alexander Bulekov
  2 siblings, 1 reply; 11+ messages in thread
From: P J P @ 2020-11-04 10:30 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: peter.maydell, thuth, Daniel P. Berrangé,
	rjones, 0ops, liq3ea, qemu-devel, f4bug, darren.kenny, bsd,
	stefanha, pbonzini, andrey.shinkevich, ppandit, dimastep


+-- On Thu, 22 Oct 2020, Daniel P. Berrangé wrote --+
| On Thu, Oct 22, 2020 at 12:24:16PM -0400, Alexander Bulekov wrote:
| > > Once [2] lands upstream, we should see a significant uptick in oss-fuzz 
| > > reports, and I hope that we can develop a process to ensure these bugs 
| > > are properly dealt with. One option we have is to make the reports 
| > > public immediately and send notifications to qemu-devel. This is the 
| > > approach taken by some other projects on oss-fuzz, such as LLVM. Though 
| > > it's not on oss-fuzz, bugs found by syzkaller in the kernel are also 
| > > automatically sent to a public list. The question is:
| > > 
| > > What approach should we take for dealing with bugs found on oss-fuzz?
| 
| If we assume that a non-negligible number of fuzz bugs will be exploitable
| by a malicious guest OS to break out into the host, then I think it is
| likely undesirable to make them public immediately without at least a basic
| human triage step to catch possibly serious security issues.

* Maybe the proposed 'qemu-security' list can receive such issue reports.  It 
  is more closed than qemu-devel.

  But it also depends on the quantum of traffic oss-fuzz generates. We don't 
  want to flood/overwhelm qemu-security list or any other list for that 
  matter.

* Human triage is required to know potential impact of an issue before it is 
  sent to a public list. It would not be good to send guest-to-host-escape
  type issues directly to a public list.

* Ideally preliminary human triage should be done on the fuzzers' side.  
  After it hits an issue, someone should have a look at it before sending an 
  email to a list OR maintainer(s).

  Ex. TCG issues are generally not considered for CVE. They need not go to a
  security list.



Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D


* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-11-04 10:30     ` P J P
@ 2020-11-04 15:25       ` Alexander Bulekov
  2020-11-04 15:46         ` Peter Maydell
  0 siblings, 1 reply; 11+ messages in thread
From: Alexander Bulekov @ 2020-11-04 15:25 UTC (permalink / raw)
  To: P J P
  Cc: peter.maydell, thuth, Daniel P. Berrangé,
	rjones, 0ops, liq3ea, qemu-devel, f4bug, darren.kenny, bsd,
	stefanha, pbonzini, andrey.shinkevich, dimastep

On 201104 1600, P J P wrote:
> +-- On Thu, 22 Oct 2020, Daniel P. Berrangé wrote --+
> | On Thu, Oct 22, 2020 at 12:24:16PM -0400, Alexander Bulekov wrote:
> | > > Once [2] lands upstream, we should see a significant uptick in oss-fuzz 
> | > > reports, and I hope that we can develop a process to ensure these bugs 
> | > > are properly dealt with. One option we have is to make the reports 
> | > > public immediately and send notifications to qemu-devel. This is the 
> | > > approach taken by some other projects on oss-fuzz, such as LLVM. Though 
> | > > it's not on oss-fuzz, bugs found by syzkaller in the kernel are also 
> | > > automatically sent to a public list. The question is:
> | > > 
> | > > What approach should we take for dealing with bugs found on oss-fuzz?
> | 
> | If we assume that a non-negligible number of fuzz bugs will be exploitable
> | by a malicious guest OS to break out into the host, then I think it is
> | likely undesirable to make them public immediately without at least a basic
> | human triage step to catch possibly serious security issues.
> 
> * Maybe the proposed 'qemu-security' list can receive such issue reports.  It 
>   is more closed than qemu-devel.
> 
>   But it also depends on the quantum of traffic oss-fuzz generates. We don't 
>   want to flood/overwhelm qemu-security list or any other list for that 
>   matter.
> 

If I understand correctly, this is analogous to what happens with
Coverity reports. Access to Coverity is closed (not sure if there is a
process to apply for access). It also seems that there is a push to fix
CID issues prior to new releases. Maybe a similar process can be used for
fuzzing?

> * Human triage is required to know potential impact of an issue before it is 
>   sent to a public list. It would not be good to send guest-to-host-escape
>   type issues directly to a public list.
> 
> * Ideally preliminary human triage should be done on the fuzzers' side.  
>   After it hits an issue, someone should have a look at it before sending an 
>   email to a list OR maintainer(s).
> 
>   Ex. TCG issues are generally not considered for CVE. They need not go to a
>   security list.

Of all the issues found by the fuzzer, I think there are two major
categories of "false-positives".
    * Intentionally-triggerable assertion failures
      (e.g. "assert Feature X not supported !")
    * Problems only triggerable through the CPU (e.g. problems related to
      referencing the thread-local "current_cpu" variable).

These should be a minority of all issues for which we can automatically
generate qtest reproducers. As far as I know, OSS-Fuzz isn't fuzzing any
devices unsupported by KVM. 

Thanks
-Alex

> 
> 
> 
> Thank you.
> --
> Prasad J Pandit / Red Hat Product Security Team
> 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D




* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-11-04 15:25       ` Alexander Bulekov
@ 2020-11-04 15:46         ` Peter Maydell
  2020-11-04 16:52           ` Alexander Bulekov
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2020-11-04 15:46 UTC (permalink / raw)
  To: Alexander Bulekov
  Cc: Thomas Huth, Daniel P. Berrangé,
	Richard W.M. Jones, 0ops, Li Qiang, QEMU Developers, P J P,
	Darren Kenny, Bandan Das, Stefan Hajnoczi, Paolo Bonzini,
	Andrey Shinkevich, Dima Stepanov, Philippe Mathieu-Daudé

On Wed, 4 Nov 2020 at 15:26, Alexander Bulekov <alxndr@bu.edu> wrote:
> If I understand correctly, this is analogous to what happens with
> Coverity reports. Access to Coverity is closed (not sure if there is a
> process to apply for access). It also seems that there is a push to fix
> CID issues prior to new releases. Maybe a similar process can be used for
> fuzzing?

Coverity is only closed in the sense that you have to request
an account on the website. Anybody who's a QEMU developer
can look at the reports.

The attempt to fix CID issues works because:
 * Coverity reports a fairly small number of issues, so
   "fix them all" is relatively tractable, and then once you
   get down to "no outstanding issues" the only new ones
   that get found are for new changes to QEMU (not entirely
   true, but close enough)
 * Mostly issues are reported soon after the offending code
   goes into the tree, so it's often possible to quickly
   identify the patch that introduced the issue and ask
   the person who wrote that patch to fix the bug
 * Coverity reports are categorized by kind-of-failure,
   so it's easy to prioritize important stuff (buffer overflows)
   and leave less significant stuff (dead code) for later
 * Coverity's reports include the automated analysis of
   why Coverity thinks there's an issue -- this is not
   always right but it's a solid head start on "what's the
   bug here" compared to just having a repro case and an
   assertion-failure message
 * There's a set of people who care enough about Coverity
   reports to put the time in to fixing them...

thanks
-- PMM



* Re: Ramping up Continuous Fuzzing of Virtual Devices in QEMU
  2020-11-04 15:46         ` Peter Maydell
@ 2020-11-04 16:52           ` Alexander Bulekov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexander Bulekov @ 2020-11-04 16:52 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Thomas Huth, Daniel P. Berrangé,
	Richard W.M. Jones, 0ops, Li Qiang, QEMU Developers, P J P,
	Darren Kenny, Bandan Das, Stefan Hajnoczi, Paolo Bonzini,
	Andrey Shinkevich, Dima Stepanov, Philippe Mathieu-Daudé

On 201104 1546, Peter Maydell wrote:
> On Wed, 4 Nov 2020 at 15:26, Alexander Bulekov <alxndr@bu.edu> wrote:
> > If I understand correctly, this is analogous to what happens with
> > Coverity reports. Access to Coverity is closed (not sure if there is a
> > process to apply for access). It also seems that there is a push to fix
> > CID issues prior to new releases. Maybe a similar process can be used for
> > fuzzing?
> 
> Coverity is only closed in the sense that you have to request
> an account on the website. Anybody who's a QEMU developer
> can look at the reports.
> 

OK, that's good to know. Why should fuzzing reports be treated
differently? Is it because they come with a ready-to-use reproducer?

> The attempt to fix CID issues works because:
>  * Coverity reports a fairly small number of issues, so
>    "fix them all" is relatively tractable, and then once you
>    get down to "no outstanding issues" the only new ones
>    that get found are for new changes to QEMU (not entirely
>    true, but close enough)

I think fuzzing is quite similar. After an initial wave of reports,
they should slow to a trickle and eventually should just catch problems
in new changes.

>  * Mostly issues are reported soon after the offending code
>    goes into the tree, so it's often possible to quickly
>    identify the patch that introduced the issue and ask
>    the person who wrote that patch to fix the bug

While fuzzing can take some time to find issues after new code is
committed, I don't think the problem of reports on stale code is unique
to fuzzing. That should only be an issue for the initial wave of reports
(the same as I'm guessing it was when Coverity started analyzing
QEMU in 2013).
Additionally, we can filter crashes based on where they occurred, which
should give a similar level of control as the Coverity
components/patterns. If problems in e.g. ati or vmxnet are unlikely to
be looked at, we can
 1.) Not fuzz them (easy to do by removing an entry from
 tests/qtest/fuzz/generic_fuzz_configs.h)
 2.) Apply filters to ignore crash reports with ati.c or vmxnet.c in the
 callstack.
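
A minimal sketch of option 2, for illustration only. It is not an existing
OSS-Fuzz or QEMU feature: the function names, the IGNORED_FILES set and the
assumed sanitizer-style frame format ("#N 0xADDR in func /path/file.c:LINE:COL")
are all assumptions made for this example.

```python
import re
from typing import Set

# Hypothetical deny-list: source files whose crashes are unlikely to be looked at.
IGNORED_FILES = {"ati.c", "vmxnet.c"}

# Captures the base file name of a frame location such as
# "/src/qemu/hw/display/ati.c:843:9" (assumed sanitizer-style format).
FRAME_FILE_RE = re.compile(r"([^/\s]+\.[ch]):\d+")


def files_in_callstack(stack_trace: str) -> Set[str]:
    """Return the base names of all .c/.h files seen in stack frames."""
    files = set()
    for line in stack_trace.splitlines():
        # Only lines that look like frames ("#0 ...", "#1 ...") are considered.
        if not line.lstrip().startswith("#"):
            continue
        for match in FRAME_FILE_RE.finditer(line):
            files.add(match.group(1))
    return files


def should_ignore(stack_trace: str, ignored: Set[str] = IGNORED_FILES) -> bool:
    """True if any frame of the callstack is in one of the ignored files."""
    return not files_in_callstack(stack_trace).isdisjoint(ignored)
```

Matching on the base file name keeps the filter independent of the build
directory; the cost is that unrelated files sharing a name would also be
filtered.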

That said, IIUC the main purpose of the Coverity components/patterns is to
prevent false positives. This shouldn't be nearly as much of a problem
for oss-fuzz reports for which we can provide qtest reproducers.

>  * Coverity reports are categorized by kind-of-failure,
>    so it's easy to prioritize important stuff (buffer overflows)
>    and leave less significant stuff (dead code) for later

Fuzzer reports are the same. OSS-Fuzz tells us if something is an
overflow, UAF, overlapping-memcpy, double-free, assertion-failure,
null-ptr deref, etc.
If noise from assertion-failure/null-ptr deref reports is a concern,
maybe we can just ignore those for now?
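
A rough sketch of what "ignore those for now" could look like when
post-processing reports. The report structure (a dict with "id" and "kind"
keys) and the label strings are assumptions for illustration; they are not the
format OSS-Fuzz actually emits.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Crash kinds to set aside for now. The label strings are assumptions
# modelled on the kinds named in this thread.
DEFERRED_KINDS = {"assertion-failure", "null-ptr-deref"}


def split_reports(
    reports: Iterable[Dict[str, str]],
    deferred_kinds: Set[str] = DEFERRED_KINDS,
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split reports into (act_on, deferred) based on their crash kind."""
    act_on, deferred = [], []
    for report in reports:
        bucket = deferred if report["kind"] in deferred_kinds else act_on
        bucket.append(report)
    return act_on, deferred
```

Keeping the deferred reports in a separate list, rather than dropping them,
means they can be revisited once the higher-priority kinds are under control.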

>  * Coverity's reports include the automated analysis of
>    why Coverity thinks there's an issue -- this is not
>    always right but it's a solid head start on "what's the
>    bug here" compared to just having a repro case and an
>    assertion-failure message

That's true - the types of bugs found by fuzzing rather than static
analysis usually are tough to automatically suggest (pretty) fixes for.
I have been thinking about ways to make this situation better, beyond
just enabling the relevant -trace events and hoping they provide context
and make life easier.

>  * There's a set of people who care enough about Coverity
>    reports to put the time in to fixing them...

I hope this set of people grows for fuzzing reports as well. Though
fuzzing reports are nothing compared to the ~1250 fixed Coverity
defects, fuzzing has already helped highlight some serious issues that
have been hiding in the code for a long time. Unfortunately, I think
there is a gap between the types of problems reported by fuzzing and
Coverity, where I don't see someone picking up a dozen random fuzzing
reports and having a dozen patches ready by the end of the day. Multiple
people working on fuzzer-discovered issues have mentioned that they are
often quite time consuming to properly fix (for little perceived
reward). Maybe the solution is to limit the scope of fuzzer reports so
that they might be rarer, but are more likely to feature problems that
people will care about? This might mean having a narrower selection of
fuzzed devices than just "anything that works with KVM".

-Alex

> 
> thanks
> -- PMM



