qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Bulekov <alxndr@bu.edu>
To: Qiuhao Li <Qiuhao.Li@outlook.com>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Mauro Matteo Cascella" <mcascell@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Bin Meng" <bin.meng@windriver.com>,
	"Li Qiang" <liq3ea@gmail.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"Peter Xu" <peterx@redhat.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Darren Kenny" <darren.kenny@oracle.com>,
	"Bandan Das" <bsd@redhat.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Edgar E . Iglesias" <edgar.iglesias@gmail.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [RFC PATCH] memory: Fix dma-reentrancy issues at the MMIO level
Date: Fri, 17 Dec 2021 09:20:06 -0500	[thread overview]
Message-ID: <20211217142006.opo3qko5cgd7m2p6@mozz.bu.edu> (raw)
In-Reply-To: <PN0PR01MB6352C2E496E5723275EB1878FC789@PN0PR01MB6352.INDPRD01.PROD.OUTLOOK.COM>

On 211217 0627, Qiuhao Li wrote:
> Thanks Alex. It seems this patch sets and checks if the destination device is busy. But how about the data transfers not triggered directly by PMIO/MMIO handlers? For example:
> 
> 1. Device A Timer's callback -> Device A MMIO handler
> 2. Device A BH's callback -> Device A MMIO handler
> 
> In these situations, when A launches a DMA to itself, the dev->engaged_in_direct_io is not set, so the operation is allowed. Maybe we should log the source and check the destination when we launch data transfers. Is there a way to do that?

Ahh, I forgot that this can happen in BHs as well... I think the only
place to do that, without major API changes, is at pci_dma_rw and
dma_buffer_rw (where we can catch a glimpse of the
PCIDevice/DeviceState).

I'll send a V2.

-Alex

> 
> Below is a reproducer in NVMe which triggers DMA in a timer's callback (nvme_process_sq). I can still trigger use-after-free exception with this patch on qemu-6.1.0:
> 
> cat << EOF | ./qemu-system-x86_64 -display none -machine accel=qtest \
> -machine q35 -nodefaults -drive file=null-co://,if=none,format=raw,id=disk0 \
> -device nvme,drive=disk0,serial=1 -qtest stdio \
> 
> outl 0xcf8 0x80000810               /* MLBAR (BAR0) – Memory Register Base Address, lower 32-bits */
> outl 0xcfc 0xe0000000               /* MMIO Base Address = 0xe0000000 */
> outl 0xcf8 0x80000804               /* CMD - Command */
> outw 0xcfc 0x06                     /* Bus Master Enable, Memory Space Enable */
> write 0xe0000024 0x4 0x02000200     /* [3] 3.1.8, Admin Queue Attributes */
> write 0xe0000028 0x4 0x00100000     /* asq = 0x1000 */
> write 0xe0000030 0x4 0x00200000     /* acq = 0x2000 */
> write 0xe0000014 0x4 0x01004600     /* [3] 3.1.5, Controller Configuration, start ctrl */
> write 0xe0001000 0x1 0x01           /* [3] 3.1.24, SQyTDBL – Submission Queue y Tail Doorbell */
> write 0x1000 0x1 0x02               /* cmd->opcode, NVME_ADM_CMD_GET_LOG_PAGE, nvme_get_log() */
> write 0x1018 0x4 0x140000e0         /* prp1 = 0xe0000014, NVME_REG_CC, nvme_ctrl_reset() */
> write 0x1028 0x4 0x03000004         /* cmd->cdw10, lid = 3 NVME_LOG_FW_SLOT_INFO, nvme_fw_log_info, buf_len = 4 */
> write 0x1030 0x4 0xfc010000         /* cmd->cdw12 = 0x1fc, Log Page Offset, trans_len = sizeof(fw_log) - 0x1fc = 4 */
> clock_step
> EOF
> 
> CC: Mauro Matteo Cascella and Philippe Mathieu-Daudé. Should we put the reproducer above to https://gitlab.com/qemu-project/qemu/-/issues/556?
> 
> ________________________________
> From: Alexander Bulekov <alxndr@bu.edu>
> Sent: Friday, December 17, 2021 11:08
> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
> Cc: Alexander Bulekov <alxndr@bu.edu>; Philippe Mathieu-Daudé <philmd@redhat.com>; Mauro Matteo Cascella <mcascell@redhat.com>; Qiuhao Li <Qiuhao.Li@outlook.com>; Peter Xu <peterx@redhat.com>; Jason Wang <jasowang@redhat.com>; David Hildenbrand <david@redhat.com>; Gerd Hoffmann <kraxel@redhat.com>; Peter Maydell <peter.maydell@linaro.org>; Li Qiang <liq3ea@gmail.com>; Thomas Huth <thuth@redhat.com>; Laurent Vivier <lvivier@redhat.com>; Bandan Das <bsd@redhat.com>; Edgar E . Iglesias <edgar.iglesias@gmail.com>; Darren Kenny <darren.kenny@oracle.com>; Bin Meng <bin.meng@windriver.com>; Paolo Bonzini <pbonzini@redhat.com>; Stefan Hajnoczi <stefanha@redhat.com>; Daniel P. Berrangé <berrange@redhat.com>; Eduardo Habkost <eduardo@habkost.net>
> Subject: [RFC PATCH] memory: Fix dma-reentrancy issues at the MMIO level
> 
> Here's my shot at fixing dma-reentracy issues. This patch adds a flag to
> the DeviceState, which is set/checked when we call an accessor
> associated with the device's IO MRs.
> 
> The problem, in short, as I understand it: For the vast majority of
> cases, we want to prevent a device from accessing it's own PIO/MMIO
> regions over DMA.
> 
> This patch/solution is based on some assumptions:
> 1. DMA accesses that hit mmio regions are only dangerous if they end up
> interacting with memory-regions belonging to the device initiating the
> DMA.
> Not dangerous:  sdhci_pio->dma_write->e1000_mmio
> Dangerous:      sdhci_pio->dma_write->sdhci_mmio
> 
> 2. Most devices do not interact with their own PIO/MMIO memory-regions
> using DMA.
> 
> 3. There is no way for there to be multiple simultaneous accesses to a
> device's PIO/MMIO memory-regions.
> 
> 4. All devices are QOMified :-)
> 
> With this patch, I wasn't able to reproduce the issues being tracked
> here, with QTest reproducers:
> https://gitlab.com/qemu-project/qemu/-/issues/556
> 
> This passes the i386 qos/qtests for me and I was able to boot some linux/windows
> VMs with basic devices configured, without any apparent problems.
> 
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Mauro Matteo Cascella <mcascell@redhat.com>
> Cc: Qiuhao Li <Qiuhao.Li@outlook.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Li Qiang <liq3ea@gmail.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Bandan Das <bsd@redhat.com>
> Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
> Cc: Darren Kenny <darren.kenny@oracle.com>
> Cc: Bin Meng <bin.meng@windriver.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
> ---
>  include/hw/qdev-core.h |  1 +
>  softmmu/memory.c       | 15 +++++++++++++++
>  softmmu/trace-events   |  1 +
>  3 files changed, 17 insertions(+)
> 
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 20d3066595..32f7c779ab 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -193,6 +193,7 @@ struct DeviceState {
>      int instance_id_alias;
>      int alias_required_for_version;
>      ResettableState reset;
> +    int engaged_in_direct_io;
>  };
> 
>  struct DeviceListener {
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 7340e19ff5..255c3c602f 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -532,6 +532,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>      uint64_t access_mask;
>      unsigned access_size;
>      unsigned i;
> +    DeviceState *dev = NULL;
>      MemTxResult r = MEMTX_OK;
> 
>      if (!access_size_min) {
> @@ -541,6 +542,17 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>          access_size_max = 4;
>      }
> 
> +    /* Do not allow more than one simultanous access to a device's IO Regions */
> +    if (mr->owner &&
> +        !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
> +        dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
> +        if (dev->engaged_in_direct_io) {
> +            trace_memory_region_reentrant_io(get_cpu_index(), mr, addr, size);
> +            return MEMTX_ERROR;
> +        }
> +        dev->engaged_in_direct_io = true;
> +    }
> +
>      /* FIXME: support unaligned access? */
>      access_size = MAX(MIN(size, access_size_max), access_size_min);
>      access_mask = MAKE_64BIT_MASK(0, access_size * 8);
> @@ -555,6 +567,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>                          access_mask, attrs);
>          }
>      }
> +    if (dev) {
> +        dev->engaged_in_direct_io = false;
> +    }
>      return r;
>  }
> 
> diff --git a/softmmu/trace-events b/softmmu/trace-events
> index 9c88887b3c..d7228316db 100644
> --- a/softmmu/trace-events
> +++ b/softmmu/trace-events
> @@ -13,6 +13,7 @@ memory_region_ops_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, u
>  memory_region_ops_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size, const char *name) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u name '%s'"
>  memory_region_subpage_read(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u"
>  memory_region_subpage_write(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u"
> +memory_region_reentrant_io(int cpu_index, void *mr, uint64_t offset, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" size %u"
>  memory_region_ram_device_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
>  memory_region_ram_device_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
>  memory_region_sync_dirty(const char *mr, const char *listener, int global) "mr '%s' listener '%s' synced (global=%d)"
> --
> 2.33.0
> 


  parent reply	other threads:[~2021-12-17 14:23 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-17  3:08 [RFC PATCH] memory: Fix dma-reentrancy issues at the MMIO level Alexander Bulekov
2021-12-17  6:27 ` Qiuhao Li
2021-12-17  8:37   ` Klaus Jensen
2021-12-17  9:44   ` Mauro Matteo Cascella
2021-12-17 14:20   ` Alexander Bulekov [this message]
2021-12-17 13:58 ` Philippe Mathieu-Daudé
2021-12-17 14:30   ` Alexander Bulekov
2021-12-17 15:25     ` Philippe Mathieu-Daudé
2021-12-17 16:51       ` Alexander Bulekov
2021-12-17  8:51 Qiuhao Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211217142006.opo3qko5cgd7m2p6@mozz.bu.edu \
    --to=alxndr@bu.edu \
    --cc=Qiuhao.Li@outlook.com \
    --cc=berrange@redhat.com \
    --cc=bin.meng@windriver.com \
    --cc=bsd@redhat.com \
    --cc=darren.kenny@oracle.com \
    --cc=david@redhat.com \
    --cc=edgar.iglesias@gmail.com \
    --cc=eduardo@habkost.net \
    --cc=jasowang@redhat.com \
    --cc=kraxel@redhat.com \
    --cc=liq3ea@gmail.com \
    --cc=lvivier@redhat.com \
    --cc=mcascell@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=peterx@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).