All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@gmail.com>
To: Alexander Bulekov <alxndr@bu.edu>
Cc: qemu-devel@nongnu.org,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Mauro Matteo Cascella" <mcascell@redhat.com>,
	"Qiuhao Li" <Qiuhao.Li@outlook.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Li Qiang" <liq3ea@gmail.com>, "Thomas Huth" <thuth@redhat.com>,
	"Laurent Vivier" <lvivier@redhat.com>,
	"Bandan Das" <bsd@redhat.com>,
	"Edgar E . Iglesias" <edgar.iglesias@gmail.com>,
	"Darren Kenny" <darren.kenny@oracle.com>,
	"Bin Meng" <bin.meng@windriver.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Jon Maloy" <jmaloy@redhat.com>, "Siqi Chen" <coc.cyqh@gmail.com>
Subject: Re: [PATCH v3 1/7] memory: associate DMA accesses with the initiator Device
Date: Mon, 14 Nov 2022 15:08:01 -0500	[thread overview]
Message-ID: <CAJSP0QU3k2FwVCG4uVBqtrweLtzdHn=czyS5WVHUWndOKLRxew@mail.gmail.com> (raw)
In-Reply-To: <20221028191648.964076-2-alxndr@bu.edu>

On Fri, 28 Oct 2022 at 15:19, Alexander Bulekov <alxndr@bu.edu> wrote:
>
> Add transitionary DMA APIs which associate accesses with the device
> initiating them. The modified APIs maintain a "MemReentrancyGuard" in
> the DeviceState, which is used to prevent DMA re-entrancy issues.
> The MemReentrancyGuard is set/checked when entering IO handlers and when
> initiating a DMA access.
>
> 1.) mmio -> dma -> mmio case
> 2.) bh -> dma write -> mmio case
>
> These issues have led to problems such as stack-exhaustion and
> use-after-frees.
>
> Summary of the problem from Peter Maydell:
> https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
>
> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> ---
>  include/hw/qdev-core.h |  2 ++
>  include/sysemu/dma.h   | 41 +++++++++++++++++++++++++++++++++++++++++
>  softmmu/memory.c       | 15 +++++++++++++++
>  softmmu/trace-events   |  1 +
>  4 files changed, 59 insertions(+)
>
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 785dd5a56e..ab78d211af 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -8,6 +8,7 @@
>  #include "qom/object.h"
>  #include "hw/hotplug.h"
>  #include "hw/resettable.h"
> +#include "sysemu/dma.h"
>
>  enum {
>      DEV_NVECTORS_UNSPECIFIED = -1,
> @@ -194,6 +195,7 @@ struct DeviceState {
>      int alias_required_for_version;
>      ResettableState reset;
>      GSList *unplug_blockers;
> +    MemReentrancyGuard mem_reentrancy_guard;
>  };
>
>  struct DeviceListener {
> diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
> index a1ac5bc1b5..879b666bbb 100644
> --- a/include/sysemu/dma.h
> +++ b/include/sysemu/dma.h
> @@ -15,6 +15,10 @@
>  #include "block/block.h"
>  #include "block/accounting.h"
>
> +typedef struct {
> +    bool engaged_in_io;
> +} MemReentrancyGuard;

Please add a doc comment that explains the purpose of MemReentrancyGuard.

> +
>  typedef enum {
>      DMA_DIRECTION_TO_DEVICE = 0,
>      DMA_DIRECTION_FROM_DEVICE = 1,
> @@ -321,4 +325,41 @@ void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie,
>  uint64_t dma_aligned_pow2_mask(uint64_t start, uint64_t end,
>                                 int max_addr_bits);
>
> +#define REENTRANCY_GUARD(func, ret_type, dev, ...) \
> +    ({\
> +     ret_type retval;\
> +     MemReentrancyGuard prior_guard_state = dev->mem_reentrancy_guard;\
> +     dev->mem_reentrancy_guard.engaged_in_io = 1;\

Please use true/false for bool constants. That way it's obvious to the
reader that this is a bool and not an int.

> +     retval = func(__VA_ARGS__);\
> +     dev->mem_reentrancy_guard = prior_guard_state;\
> +     retval;\
> +     })

I'm trying to understand the purpose of this macro. It restores the
previous state of mem_reentrancy_guard, implying that this is
sometimes called when the guard is already true (i.e. from
MemoryRegion callbacks). It can also be called in the BH case and I
think that's why mem_reentrancy_guard is set to true here. Using BHs
to avoid deep stacks and re-entrancy is a valid technique though, and
this macro seems to be designed to prevent it. Can you explain a bit
more about how this is supposed to be used?

If this macro is a public API that other parts of QEMU will use, then
the following approach is more consistent with how the lock guard
macros work:

  REENTRANCY_GUARD(dev) {
      retval = func(1, 2, 3);
  }

It's also more readable then:

  REENTRANCY_GUARD(func, int, dev, 1, 2, 3);

?

> +#define REENTRANCY_GUARD_NORET(func, dev, ...) \
> +    ({\
> +     MemReentrancyGuard prior_guard_state = dev->mem_reentrancy_guard;\
> +     dev->mem_reentrancy_guard.engaged_in_io = 1;\
> +     func(__VA_ARGS__);\
> +     dev->mem_reentrancy_guard = prior_guard_state;\
> +     })
> +#define dma_memory_rw_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_memory_rw, MemTxResult, dev, __VA_ARGS__)
> +#define dma_memory_read_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_memory_read, MemTxResult, dev, __VA_ARGS__)
> +#define dma_memory_write_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_memory_write, MemTxResult, dev, __VA_ARGS__)
> +#define dma_memory_set_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_memory_set, MemTxResult, dev, __VA_ARGS__)
> +#define dma_memory_map_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_memory_map, void*, dev, __VA_ARGS__)
> +#define dma_memory_unmap_guarded(dev, ...) \
> +    REENTRANCY_GUARD_NORET(dma_memory_unmap, dev, __VA_ARGS__)
> +#define ldub_dma_guarded(dev, ...) \
> +    REENTRANCY_GUARD(ldub_dma, MemTxResult, dev, __VA_ARGS__)
> +#define stb_dma_guarded(dev, ...) \
> +    REENTRANCY_GUARD(stb_dma, MemTxResult, dev, __VA_ARGS__)
> +#define dma_buf_read_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_buf_read, MemTxResult, dev, __VA_ARGS__)
> +#define dma_buf_write_guarded(dev, ...) \
> +    REENTRANCY_GUARD(dma_buf_read, MemTxResult, dev, __VA_ARGS__)
> +
>  #endif
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 7ba2048836..c44dc75149 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -532,6 +532,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>      uint64_t access_mask;
>      unsigned access_size;
>      unsigned i;
> +    DeviceState *dev = NULL;
>      MemTxResult r = MEMTX_OK;
>
>      if (!access_size_min) {
> @@ -541,6 +542,17 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>          access_size_max = 4;
>      }
>
> +    /* Do not allow more than one simultanous access to a device's IO Regions */
> +    if (mr->owner &&
> +            !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {

Why are readonly MemoryRegions exempt?

> +        dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
> +        if (dev->mem_reentrancy_guard.engaged_in_io) {
> +            trace_memory_region_reentrant_io(get_cpu_index(), mr, addr, size);
> +            return MEMTX_ERROR;
> +        }
> +        dev->mem_reentrancy_guard.engaged_in_io = true;
> +    }
> +
>      /* FIXME: support unaligned access? */
>      access_size = MAX(MIN(size, access_size_max), access_size_min);
>      access_mask = MAKE_64BIT_MASK(0, access_size * 8);
> @@ -555,6 +567,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>                          access_mask, attrs);
>          }
>      }
> +    if (dev) {
> +        dev->mem_reentrancy_guard.engaged_in_io = false;
> +    }
>      return r;
>  }
>
> diff --git a/softmmu/trace-events b/softmmu/trace-events
> index 22606dc27b..62d04ea9a7 100644
> --- a/softmmu/trace-events
> +++ b/softmmu/trace-events
> @@ -13,6 +13,7 @@ memory_region_ops_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, u
>  memory_region_ops_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size, const char *name) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u name '%s'"
>  memory_region_subpage_read(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u"
>  memory_region_subpage_write(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u"
> +memory_region_reentrant_io(int cpu_index, void *mr, uint64_t offset, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" size %u"
>  memory_region_ram_device_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
>  memory_region_ram_device_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
>  memory_region_sync_dirty(const char *mr, const char *listener, int global) "mr '%s' listener '%s' synced (global=%d)"
> --
> 2.27.0
>
>


  reply	other threads:[~2022-11-15  1:14 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-28 19:16 [PATCH v3 0/7] memory: prevent dma-reentracy issues Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 1/7] memory: associate DMA accesses with the initiator Device Alexander Bulekov
2022-11-14 20:08   ` Stefan Hajnoczi [this message]
2022-11-14 20:31   ` Stefan Hajnoczi
2022-11-15 16:19   ` Peter Xu
2022-11-15 16:49     ` Peter Maydell
2022-11-15 17:44     ` Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 2/7] dma-helpers: switch to guarded DMA accesses Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 3/7] ahci: switch to guarded DMA acccesses Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 4/7] sdhci: switch to guarded DMA accesses Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 5/7] ehci: " Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 6/7] xhci: " Alexander Bulekov
2022-10-28 19:16 ` [PATCH v3 7/7] usb/libhw: " Alexander Bulekov
2022-11-07 17:09 ` [PATCH v3 0/7] memory: prevent dma-reentracy issues Alexander Bulekov
2022-11-10 20:50 ` Stefan Hajnoczi
2022-11-10 20:53   ` Michael S. Tsirkin
2022-11-10 22:50   ` Peter Maydell
2022-11-15 11:28   ` Philippe Mathieu-Daudé

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJSP0QU3k2FwVCG4uVBqtrweLtzdHn=czyS5WVHUWndOKLRxew@mail.gmail.com' \
    --to=stefanha@gmail.com \
    --cc=Qiuhao.Li@outlook.com \
    --cc=alxndr@bu.edu \
    --cc=berrange@redhat.com \
    --cc=bin.meng@windriver.com \
    --cc=bsd@redhat.com \
    --cc=coc.cyqh@gmail.com \
    --cc=darren.kenny@oracle.com \
    --cc=david@redhat.com \
    --cc=edgar.iglesias@gmail.com \
    --cc=eduardo@habkost.net \
    --cc=jasowang@redhat.com \
    --cc=jmaloy@redhat.com \
    --cc=kraxel@redhat.com \
    --cc=liq3ea@gmail.com \
    --cc=lvivier@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mcascell@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.