From: Peter Maydell <peter.maydell@linaro.org>
To: Peter Xu <peterx@redhat.com>
Cc: "David Hildenbrand" <david@redhat.com>,
"Jason Wang" <jasowang@redhat.com>, "Li Qiang" <liq3ea@gmail.com>,
"QEMU Developers" <qemu-devel@nongnu.org>,
"Qiuhao Li" <Qiuhao.Li@outlook.com>,
"Alexander Bulekov" <alxndr@bu.edu>,
qemu-arm <qemu-arm@nongnu.org>,
"Gerd Hoffmann" <kraxel@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Edgar E . Iglesias" <edgar.iglesias@gmail.com>,
"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [RFC PATCH v2 0/5] physmem: Have flaview API check bus permission from MemTxAttrs argument
Date: Tue, 24 Aug 2021 10:49:07 +0100 [thread overview]
Message-ID: <CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com> (raw)
In-Reply-To: <YSQKHaGiJZE5OAk2@t490s>
On Mon, 23 Aug 2021 at 21:50, Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Aug 23, 2021 at 08:10:50PM +0100, Peter Maydell wrote:
> > On Mon, 23 Aug 2021 at 17:42, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> > >
> > > This series aim to kill a recent class of bug, the infamous
> > > "DMA reentrancy" issues found by Alexander while fuzzing.
> > >
> > > Introduce the 'bus_perm' field in MemTxAttrs, defining 3 bits:
> > >
> > > - MEMTXPERM_UNSPECIFIED (current default, unchanged behavior)
> > > - MEMTXPERM_UNRESTRICTED (allow list approach)
> > > - MEMTXPERM_RAM_DEVICE (example of deny list approach)
> > >
> > > If a transaction permission is not allowed (for example access
> > > to non-RAM device), we return the specific MEMTX_BUS_ERROR.
> > >
> > > Permissions are checked in after the flatview is resolved, and
> > > before the access is done, in a new function: flatview_access_allowed().
> >
> > So I'm not going to say 'no' to this, because we have a real
> > recursive-device-handling problem and I don't have a better
> > idea to hand, but the thing about this is that we end up with
> > behaviour which is not what the real hardware does. I'm not
> > aware of any DMA device which has this kind of "can only DMA
> > to/from RAM, and aborts on access to a device" behaviour...
>
> Sorry for not being familiar with the context - is there more info regarding
> the problem to fix?
So, the general problem is that we have a whole class of bugs
that look like this:
* Device A is DMA-capable. It also has a set of memory
mapped registers which can be used to control it.
* Malicious guest code (or the fuzzer) programs A's DMA
engine to do a DMA read or write to the address where
A's own registers are mapped.
* Typically, the MemoryRegionOps write function for the
register block will handle the "write to start-dma
register" by doing the DMA, ie calling address_space_write(),
pci_dma_write(), or equivalent. Because of the target address
the guest code has set, that will result in the memory
subsystem calling back into the same MemoryRegionOps
write function, recursively.
* Our code implementing the model of device A is not at
all expecting this re-entrancy, and might crash, access
freed memory, or otherwise misbehave.
You can elaborate on that basic scenario, for instance with
a loop of multiple devices where you program device A to
do a DMA write to device B's registers which starts device B doing
a DMA write to A's registers. Nor is it inherently limited to
DMA -- device A could be made to assert a qemu_irq that is
connected to device B in a way that causes device B to
do something that results in code calling back into device A;
or maybe device A DMAs to a register in device B that implements
a device-reset on device A. DMA is just the easiest for guest
code to set up and has the least restrictions on how it's
connected up.
In some specific cases we have "fixed" individual instances
of this bug by putting in checks or changes to whatever device
model the fuzzer happened to find a problem with, or put in
slightly wider-scale attempts to catch this (eg commit 22dc8663d9f
which prevents re-entering a NIC device's packet-rx callback
function if the guest has set it up so that a received packet
results in DMA that triggers another received packet). But
we don't have a coherent model of how we ought to structure
device models that can avoid this problem in a general way,
and I think that until we do we're liable to keep running into
specific bugs, some of which will be (or at least be labelled
as) "security issues".
Philippe's series here tries to fix this for at least any
variety of this bug where there is a DMA access in the
loop, by forbidding DMA accesses to MMIO regions not backed
by RAM. That works, in that it breaks the loop; as I
mentioned in my other email, it does it in a way that's
not the way real h/w behaves. Unfortunately "what does real
h/w do?" is not in this situation necessarily a helpful
guide for QEMU design: I think that real hardware:
(a) often doesn't see the same kind of problem because
the design will usually decouple the DMA engine from
the register-access logic in a way that means it
naturally doesn't literally lock up
(b) often won't have been designed to deal with "software
programs a DMA-to-self" either, but the threat model
for real hw is different, in that software has many ways
of making the overall system crash, hang or misbehave;
it often doesn't have the "need to allow untrusted
software to touch this device" situation.
One could have QEMU work somewhat like (a) by mandating
that all DMA of any kind was done in separate bottom-half
routines and not directly from the register-write code.
That would probably reduce performance and be a lot of
code to restructure. It would deal also with another class
of "guest code can make QEMU hang by programming it to do
an enormous DMA all at once or by setting up an infinitely
looping chain of DMA commands" bugs, though.
I was vaguely tossing an idea around in the back of my mind
about whether you could have a flag on devices that marked
them as "this device is currently involved in IO", such that
you could then just fail the last DMA (or qemu_irq_set, or
whatever) that would complete the loop back to a device that
was already doing IO. But that would need a lot of thinking
through to figure out if it's feasible, and it's probably
a lot of code change.
thanks
-- PMM
next prev parent reply other threads:[~2021-08-24 9:50 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-23 16:41 [RFC PATCH v2 0/5] physmem: Have flaview API check bus permission from MemTxAttrs argument Philippe Mathieu-Daudé
2021-08-23 16:41 ` [RFC PATCH v2 1/5] softmmu/physmem: Simplify flatview_write and address_space_access_valid Philippe Mathieu-Daudé
2021-08-23 18:45 ` Peter Xu
2021-08-23 18:59 ` David Hildenbrand
2021-08-24 9:03 ` Alexander Bulekov
2021-08-24 13:04 ` Stefan Hajnoczi
2021-08-23 16:41 ` [RFC PATCH v2 2/5] hw/intc/arm_gicv3: Check for !MEMTX_OK instead of MEMTX_ERROR Philippe Mathieu-Daudé
2021-08-23 18:46 ` Peter Xu
2021-08-23 19:01 ` David Hildenbrand
2021-08-23 19:07 ` Peter Maydell
2021-08-24 13:04 ` Stefan Hajnoczi
2021-08-23 16:41 ` [RFC PATCH v2 3/5] exec/memattrs: Introduce MemTxAttrs::bus_perm field Philippe Mathieu-Daudé
2021-08-23 18:41 ` Peter Xu
2021-08-23 19:04 ` David Hildenbrand
2021-12-15 17:14 ` Philippe Mathieu-Daudé
2021-08-24 13:08 ` Stefan Hajnoczi
2021-12-15 17:11 ` Philippe Mathieu-Daudé
2021-08-23 16:41 ` [RFC PATCH v2 4/5] softmmu/physmem: Introduce flatview_access_allowed() to check bus perms Philippe Mathieu-Daudé
2021-08-23 18:43 ` Peter Xu
2021-08-23 19:03 ` David Hildenbrand
2021-08-24 13:13 ` Stefan Hajnoczi
2021-08-23 16:41 ` [RFC PATCH v2 5/5] softmmu/physmem: Have flaview API check MemTxAttrs::bus_perm field Philippe Mathieu-Daudé
2021-08-23 18:45 ` Peter Xu
2021-08-23 19:10 ` David Hildenbrand
2021-08-24 13:15 ` Stefan Hajnoczi
2021-08-24 13:50 ` Philippe Mathieu-Daudé
2021-08-24 14:21 ` Peter Maydell
2021-11-18 21:04 ` Philippe Mathieu-Daudé
2021-08-23 19:10 ` [RFC PATCH v2 0/5] physmem: Have flaview API check bus permission from MemTxAttrs argument Peter Maydell
2021-08-23 20:50 ` Peter Xu
2021-08-23 22:26 ` Alexander Bulekov
2021-08-24 7:24 ` Philippe Mathieu-Daudé
2021-08-24 9:49 ` Peter Maydell [this message]
2021-08-24 12:01 ` Gerd Hoffmann
2021-08-24 12:12 ` Li Qiang
2021-08-24 19:34 ` Peter Xu
2021-08-24 9:25 ` Edgar E. Iglesias
2021-08-24 13:26 ` Stefan Hajnoczi
2021-08-24 8:58 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com \
--to=peter.maydell@linaro.org \
--cc=Qiuhao.Li@outlook.com \
--cc=alxndr@bu.edu \
--cc=david@redhat.com \
--cc=edgar.iglesias@gmail.com \
--cc=jasowang@redhat.com \
--cc=kraxel@redhat.com \
--cc=liq3ea@gmail.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@redhat.com \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).