qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices
@ 2021-08-10 13:40 David Edmondson
  2021-08-10 13:40 ` [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region David Edmondson
  2021-08-20 18:19 ` [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices Philippe Mathieu-Daudé
  0 siblings, 2 replies; 5+ messages in thread
From: David Edmondson @ 2021-08-10 13:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Maydell, qemu-block, Michael S. Tsirkin,
	Xu Yandong, John Snow, Markus Armbruster, Max Reitz,
	Alex Bennée, Shannon Zhao, Zheng Xiang, qemu-arm,
	haibinzhang(张海斌),
	Stefan Hajnoczi, Paolo Bonzini, Igor Mammedov, David Edmondson,
	Philippe Mathieu-Daudé,
	Stefano Garzarella

As described in
https://lore.kernel.org/r/20201116104216.439650-1-david.edmondson@oracle.com
and
https://lore.kernel.org/r/20210222174757.2329740-1-david.edmondson@oracle.com
I'd like to reduce the amount of memory consumed by QEMU mapping UEFI
images on aarch64.

To recap:

> Currently ARM UEFI images are typically built as 2MB/768kB flash
> images for code and variables respectively. These images are both
> then padded out to 64MB before being loaded by QEMU.
>
> Because the images are 64MB each, QEMU allocates 128MB of memory to
> read them, and then proceeds to read all 128MB from disk (dirtying
> the memory). Of this 128MB less than 3MB is useful - the rest is
> zero padding.
>
> On a machine with 100 VMs this wastes over 12GB of memory.

Some of the cleanups in the previous patches were incorporated, but
the patch that reduced memory consumption was not accepted. This is
essentially that patch rebased after some unrelated changes. Having
investigated alternatives, I think that the patch here is useful as it
stands.

All read/write operations to areas outside of the underlying block
device are handled directly. Reads return 0, writes either fail
(read-only devices) or are discarded (writable devices).

This reduces the memory consumption for the AAVMF code image from
64MiB to around 2MB and that for the AAVMF vars from 64MiB to 768KiB
(presuming that the UEFI images are adjusted accordingly).

For read-only devices (such as the AAVMF code) this seems completely
safe.

For writable devices there is a change in behaviour - previously it
was possible to write anywhere in the extent of the flash device, read
back the data written and have that data persist through a restart of
QEMU. This is no longer the case - writes outside of the extent of the
underlying backing block device will be discarded. That is, a read
after a write will *not* return the written data, either immediately
or after a QEMU restart - it will return zeros.

Looking at the AAVMF implementation, it seems to me that if the
initial VARS image is prepared as being 768KiB in size (which it is),
none of AAVMF itself will attempt to write outside of that extent, and
so I believe that this is an acceptable compromise.

It would be relatively straightforward to allow writes outside of the
backing device to persist for the lifetime of a particular QEMU by
allocating memory on demand (i.e. when there is a write to the
relevant region). This would allow a read to return the relevant data,
but only until a QEMU restart, at which point the data would be lost.

It may be possible to persist writes by extending the underlying
backing device to accommodate a new extent. This would definitely add
complication, as ideally the size of the memory sub-region would also
be updated. I have not investigated this further.

There was a suggestion in a previous thread that perhaps the pflash
driver could be re-worked to use the block IO interfaces to access the
underlying device "on demand" rather than reading in the entire image
at startup (at least, that's how I understood the comment).

An implementation of this based around mapping the flash region only
for IO, which meant that every read or write had to be handled
directly by the pflash driver (there was no ROMD style operation),
made booting an aarch64 VM significantly slower - getting through the
firmware went from under 1 second to around 10 seconds. It's possible
that this could be improved by caching blocks or some other mechanism,
but I have not pursued it further.

Philippe implemented a suggestion to use mmap() to avoid the need to
allocate (and dirty) memory for read-only pflash images in
https://lore.kernel.org/qemu-devel/20210301115329.411762-1-philmd@redhat.com/.

This solution was, I believe, considered incomplete, as:
- it does not handle the case where the image underlying a pflash
  device is changed via QAPI,
- it does not handle writable devices.

There is also an assumption that multiple QEMU instances on a single
host will share the same AAVMF code image (to benefit from a shared
mapping) - this is not the case in the environment that I am looking
to support.

If using mmap() for read-only device is particularly valuable, it
could be combined with the patches here - the benefit would be
cumulative.

The only drawback that I see with this patch is the change in
behaviour for writes beyond the extent of an underlying image. Unless
the AAVMF build process is modified to generate smaller images (768kB
for the variables, for example), this will never be a problem in
reality, as the underlying image will match the size of the device.

Only when a deliberate decision is taken to use an image smaller than
the device does this drawback come to the fore, which is a tradeoff
that an administrator can choose to make if they wish.

v2:
- Unify the approach for both read-only and writable devices, saving
  another 63MiB per QEMU instance.

v3:
- Add Reviewed-by: for two changes (Philippe).
- Call blk_pread() directly rather than using
  blk_check_size_and_read_all(), given that we know how much we want
  to read.

v4:
- Remove changes already upstream.
- Rebase.

David Edmondson (1):
  hw/pflash_cfi01: Allow backing devices to be smaller than memory
    region

 hw/block/pflash_cfi01.c | 105 ++++++++++++++++++++++++++++++++--------
 hw/block/trace-events   |   3 ++
 2 files changed, 87 insertions(+), 21 deletions(-)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region
  2021-08-10 13:40 [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices David Edmondson
@ 2021-08-10 13:40 ` David Edmondson
  2021-09-09  9:27   ` Philippe Mathieu-Daudé
  2021-08-20 18:19 ` [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices Philippe Mathieu-Daudé
  1 sibling, 1 reply; 5+ messages in thread
From: David Edmondson @ 2021-08-10 13:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Maydell, qemu-block, Michael S. Tsirkin,
	Xu Yandong, John Snow, Markus Armbruster, Max Reitz,
	Alex Bennée, Shannon Zhao, Zheng Xiang, qemu-arm,
	haibinzhang(张海斌),
	Stefan Hajnoczi, Paolo Bonzini, Igor Mammedov, David Edmondson,
	Philippe Mathieu-Daudé,
	Stefano Garzarella

Allow the backing device to be smaller than the extent of the flash
device by mapping it as a subregion of the flash device region.

Return zeroes for all reads of the flash device beyond the extent of
the backing device.

For writes beyond the extent of the underlying device, fail on
read-only devices and discard them for writable devices.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
---
 hw/block/pflash_cfi01.c | 105 ++++++++++++++++++++++++++++++++--------
 hw/block/trace-events   |   3 ++
 2 files changed, 87 insertions(+), 21 deletions(-)

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index 81f9f971d8..f3289b6a2f 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -83,6 +83,8 @@ struct PFlashCFI01 {
     uint64_t counter;
     unsigned int writeblock_size;
     MemoryRegion mem;
+    MemoryRegion mem_outer;
+    char outer_name[64];
     char *name;
     void *storage;
     VMChangeStateEntry *vmstate;
@@ -434,7 +436,6 @@ static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset,
         }
         break;
     }
-
 }
 
 static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
@@ -656,8 +657,44 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
 }
 
 
-static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value,
-                                              unsigned len, MemTxAttrs attrs)
+static MemTxResult pflash_outer_read_with_attrs(void *opaque, hwaddr addr,
+                                                uint64_t *value, unsigned len,
+                                                MemTxAttrs attrs)
+{
+    PFlashCFI01 *pfl = opaque;
+
+    trace_pflash_outer_read(pfl->name, addr, len);
+    *value = 0;
+    return MEMTX_OK;
+}
+
+static MemTxResult pflash_outer_write_with_attrs(void *opaque, hwaddr addr,
+                                                 uint64_t value, unsigned len,
+                                                 MemTxAttrs attrs)
+{
+    PFlashCFI01 *pfl = opaque;
+
+    trace_pflash_outer_write(pfl->name, addr, len);
+    if (pfl->ro) {
+        return MEMTX_ERROR;
+    } else {
+        warn_report_once("%s: "
+                         "attempt to write outside of the backing block device "
+                         "(offset " TARGET_FMT_plx ") ignored",
+                         pfl->name, addr);
+        return MEMTX_OK;
+    }
+}
+
+static const MemoryRegionOps pflash_cfi01_outer_ops = {
+    .read_with_attrs = pflash_outer_read_with_attrs,
+    .write_with_attrs = pflash_outer_write_with_attrs,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr,
+                                              uint64_t *value, unsigned len,
+                                              MemTxAttrs attrs)
 {
     PFlashCFI01 *pfl = opaque;
     bool be = !!(pfl->features & (1 << PFLASH_BE));
@@ -670,8 +707,9 @@ static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_
     return MEMTX_OK;
 }
 
-static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr, uint64_t value,
-                                               unsigned len, MemTxAttrs attrs)
+static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr,
+                                               uint64_t value, unsigned len,
+                                               MemTxAttrs attrs)
 {
     PFlashCFI01 *pfl = opaque;
     bool be = !!(pfl->features & (1 << PFLASH_BE));
@@ -800,7 +838,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
 {
     ERRP_GUARD();
     PFlashCFI01 *pfl = PFLASH_CFI01(dev);
-    uint64_t total_len;
+    uint64_t outer_len, inner_len;
     int ret;
 
     if (pfl->sector_len == 0) {
@@ -816,35 +854,60 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    total_len = pfl->sector_len * pfl->nb_blocs;
-
-    memory_region_init_rom_device(
-        &pfl->mem, OBJECT(dev),
-        &pflash_cfi01_ops,
-        pfl,
-        pfl->name, total_len, errp);
-    if (*errp) {
-        return;
-    }
-
-    pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
-    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
+    outer_len = pfl->sector_len * pfl->nb_blocs;
 
     if (pfl->blk) {
         uint64_t perm;
+
         pfl->ro = !blk_supports_write_perm(pfl->blk);
         perm = BLK_PERM_CONSISTENT_READ | (pfl->ro ? 0 : BLK_PERM_WRITE);
         ret = blk_set_perm(pfl->blk, perm, BLK_PERM_ALL, errp);
         if (ret < 0) {
             return;
         }
+
+        inner_len = blk_getlength(pfl->blk);
+
+        if (inner_len > outer_len) {
+            error_setg(errp, "%s: "
+                       "block backend provides %" PRIu64 " bytes, "
+                       "device limited to %" PRIu64 " bytes",
+                       pfl->name, inner_len, outer_len);
+            return;
+        }
     } else {
         pfl->ro = false;
+        inner_len = outer_len;
     }
 
+    trace_pflash_realize(pfl->name, pfl->ro, inner_len, outer_len);
+
+    snprintf(pfl->outer_name, sizeof(pfl->outer_name),
+             "%s container", pfl->name);
+    memory_region_init_io(&pfl->mem_outer, OBJECT(dev),
+                          &pflash_cfi01_outer_ops,
+                          pfl, pfl->outer_name, outer_len);
+
+    memory_region_init_rom_device(&pfl->mem, OBJECT(dev),
+                                  &pflash_cfi01_ops,
+                                  pfl, pfl->name, inner_len, errp);
+    if (*errp) {
+        return;
+    }
+
+    memory_region_add_subregion(&pfl->mem_outer, 0, &pfl->mem);
+
+    pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
+    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem_outer);
+    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
+
     if (pfl->blk) {
-        if (!blk_check_size_and_read_all(pfl->blk, pfl->storage, total_len,
-                                         errp)) {
+        int ret = blk_pread(pfl->blk, 0, pfl->storage, inner_len);
+
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "cannot read %" PRIu64 " "
+                             "bytes from block backend", inner_len);
             vmstate_unregister_ram(&pfl->mem, DEVICE(pfl));
             return;
         }
diff --git a/hw/block/trace-events b/hw/block/trace-events
index d86b53520c..3d1e07261e 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -21,10 +21,13 @@ pflash_io_read(const char *name, uint64_t offset, unsigned int size, uint32_t va
 pflash_io_write(const char *name, uint64_t offset, unsigned int size, uint32_t value, uint8_t wcycle) "%s: offset:0x%04"PRIx64" size:%u value:0x%04x wcycle:%u"
 pflash_manufacturer_id(const char *name, uint16_t id) "%s: read manufacturer ID: 0x%04x"
 pflash_mode_read_array(const char *name) "%s: read array mode"
+pflash_outer_read(const char *name, uint64_t addr, unsigned int len) "%s: addr:0x%" PRIx64 " len:%d"
+pflash_outer_write(const char *name, uint64_t addr, unsigned int len) "%s: addr:0x%" PRIx64 " len:%d"
 pflash_postload_cb(const char *name)  "%s: updating bdrv"
 pflash_read_done(const char *name, uint64_t offset, uint64_t ret) "%s: ID:0x%" PRIx64 " ret:0x%" PRIx64
 pflash_read_status(const char *name, uint32_t ret) "%s: status:0x%x"
 pflash_read_unknown_state(const char *name, uint8_t cmd) "%s: unknown command state:0x%x"
+pflash_realize(const char *name, bool ro, uint64_t blk_len, uint64_t total_len) "%s: ro:%d blk_len:0x%" PRIx64 " total_len:0x%" PRIx64
 pflash_reset(const char *name) "%s: reset"
 pflash_sector_erase_start(const char *name, int width1, uint64_t start, int width2, uint64_t end) "%s: start sector erase at: 0x%0*" PRIx64 "-0x%0*" PRIx64
 pflash_timer_expired(const char *name, uint8_t cmd) "%s: command 0x%02x done"
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices
  2021-08-10 13:40 [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices David Edmondson
  2021-08-10 13:40 ` [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region David Edmondson
@ 2021-08-20 18:19 ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 5+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-20 18:19 UTC (permalink / raw)
  To: David Edmondson, qemu-devel
  Cc: Kevin Wolf, Peter Maydell, qemu-block, Michael S. Tsirkin,
	Xu Yandong, Alex Bennée, Markus Armbruster, Max Reitz,
	Shannon Zhao, Zheng Xiang, qemu-arm,
	haibinzhang(张海斌),
	Stefan Hajnoczi, Paolo Bonzini, Igor Mammedov, John Snow,
	Stefano Garzarella

Hi David,

On 8/10/21 3:40 PM, David Edmondson wrote:
> As described in
> https://lore.kernel.org/r/20201116104216.439650-1-david.edmondson@oracle.com
> and
> https://lore.kernel.org/r/20210222174757.2329740-1-david.edmondson@oracle.com
> I'd like to reduce the amount of memory consumed by QEMU mapping UEFI
> images on aarch64.

> v4:
> - Rebase.

Sorry I haven't forgotten about your patch, I can't make room
to think quietly about your problem. I reserved a dedicated
slot of few hours next week.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region
  2021-08-10 13:40 ` [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region David Edmondson
@ 2021-09-09  9:27   ` Philippe Mathieu-Daudé
  2021-09-14 14:01     ` David Edmondson
  0 siblings, 1 reply; 5+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-09  9:27 UTC (permalink / raw)
  To: David Edmondson, qemu-devel, Tom Lendacky, James Bottomley
  Cc: Kevin Wolf, Peter Maydell, qemu-block, Michael S. Tsirkin,
	Xu Yandong, Alex Bennée, Markus Armbruster, Max Reitz,
	Shannon Zhao, Zheng Xiang, qemu-arm,
	haibinzhang(张海斌),
	Stefan Hajnoczi, Paolo Bonzini, Igor Mammedov, John Snow,
	Stefano Garzarella

Hi David,

On 8/10/21 3:40 PM, David Edmondson wrote:
> Allow the backing device to be smaller than the extent of the flash
> device by mapping it as a subregion of the flash device region.
> 
> Return zeroes for all reads of the flash device beyond the extent of
> the backing device.
> 
> For writes beyond the extent of the underlying device, fail on
> read-only devices and discard them for writable devices.
> 
> Signed-off-by: David Edmondson <david.edmondson@oracle.com>
> ---
>  hw/block/pflash_cfi01.c | 105 ++++++++++++++++++++++++++++++++--------
>  hw/block/trace-events   |   3 ++
>  2 files changed, 87 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
> index 81f9f971d8..f3289b6a2f 100644
> --- a/hw/block/pflash_cfi01.c
> +++ b/hw/block/pflash_cfi01.c
> @@ -83,6 +83,8 @@ struct PFlashCFI01 {
>      uint64_t counter;
>      unsigned int writeblock_size;
>      MemoryRegion mem;
> +    MemoryRegion mem_outer;
> +    char outer_name[64];
>      char *name;
>      void *storage;
>      VMChangeStateEntry *vmstate;
> @@ -434,7 +436,6 @@ static inline void pflash_data_write(PFlashCFI01 *pfl, hwaddr offset,
>          }
>          break;
>      }
> -
>  }
>  
>  static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
> @@ -656,8 +657,44 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
>  }
>  
>  
> -static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_t *value,
> -                                              unsigned len, MemTxAttrs attrs)
> +static MemTxResult pflash_outer_read_with_attrs(void *opaque, hwaddr addr,
> +                                                uint64_t *value, unsigned len,
> +                                                MemTxAttrs attrs)
> +{
> +    PFlashCFI01 *pfl = opaque;
> +
> +    trace_pflash_outer_read(pfl->name, addr, len);
> +    *value = 0;

This seems to break the "width" and "old-multiple-chip-handling"
parameters ("emulating a number of flash devices wired up in parallel").

Also this breaks booting with SEV enabled on X86... See commits
9617cddb726 ("pc: add parser for OVMF reset block") and b2f73a0784b
("sev/i386: Allow AP booting under SEV-ES").

> +    return MEMTX_OK;
> +}
> +
> +static MemTxResult pflash_outer_write_with_attrs(void *opaque, hwaddr addr,
> +                                                 uint64_t value, unsigned len,
> +                                                 MemTxAttrs attrs)
> +{
> +    PFlashCFI01 *pfl = opaque;
> +
> +    trace_pflash_outer_write(pfl->name, addr, len);
> +    if (pfl->ro) {
> +        return MEMTX_ERROR;
> +    } else {
> +        warn_report_once("%s: "
> +                         "attempt to write outside of the backing block device "
> +                         "(offset " TARGET_FMT_plx ") ignored",
> +                         pfl->name, addr);

This doesn't seem acceptable on mainstream, see:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg607292.html

> +        return MEMTX_OK;
> +    }
> +}
> +
> +static const MemoryRegionOps pflash_cfi01_outer_ops = {
> +    .read_with_attrs = pflash_outer_read_with_attrs,
> +    .write_with_attrs = pflash_outer_write_with_attrs,
> +    .endianness = DEVICE_NATIVE_ENDIAN,
> +};
> +
> +static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr,
> +                                              uint64_t *value, unsigned len,
> +                                              MemTxAttrs attrs)
>  {
>      PFlashCFI01 *pfl = opaque;
>      bool be = !!(pfl->features & (1 << PFLASH_BE));
> @@ -670,8 +707,9 @@ static MemTxResult pflash_mem_read_with_attrs(void *opaque, hwaddr addr, uint64_
>      return MEMTX_OK;
>  }
>  
> -static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr, uint64_t value,
> -                                               unsigned len, MemTxAttrs attrs)
> +static MemTxResult pflash_mem_write_with_attrs(void *opaque, hwaddr addr,
> +                                               uint64_t value, unsigned len,
> +                                               MemTxAttrs attrs)
>  {
>      PFlashCFI01 *pfl = opaque;
>      bool be = !!(pfl->features & (1 << PFLASH_BE));
> @@ -800,7 +838,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
>  {
>      ERRP_GUARD();
>      PFlashCFI01 *pfl = PFLASH_CFI01(dev);
> -    uint64_t total_len;
> +    uint64_t outer_len, inner_len;
>      int ret;
>  
>      if (pfl->sector_len == 0) {
> @@ -816,35 +854,60 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>  
> -    total_len = pfl->sector_len * pfl->nb_blocs;
> -
> -    memory_region_init_rom_device(
> -        &pfl->mem, OBJECT(dev),
> -        &pflash_cfi01_ops,
> -        pfl,
> -        pfl->name, total_len, errp);
> -    if (*errp) {
> -        return;
> -    }
> -
> -    pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
> -    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
> +    outer_len = pfl->sector_len * pfl->nb_blocs;
>  
>      if (pfl->blk) {
>          uint64_t perm;
> +
>          pfl->ro = !blk_supports_write_perm(pfl->blk);
>          perm = BLK_PERM_CONSISTENT_READ | (pfl->ro ? 0 : BLK_PERM_WRITE);
>          ret = blk_set_perm(pfl->blk, perm, BLK_PERM_ALL, errp);
>          if (ret < 0) {
>              return;
>          }
> +
> +        inner_len = blk_getlength(pfl->blk);
> +
> +        if (inner_len > outer_len) {
> +            error_setg(errp, "%s: "
> +                       "block backend provides %" PRIu64 " bytes, "
> +                       "device limited to %" PRIu64 " bytes",
> +                       pfl->name, inner_len, outer_len);
> +            return;
> +        }
>      } else {
>          pfl->ro = false;
> +        inner_len = outer_len;
>      }
>  
> +    trace_pflash_realize(pfl->name, pfl->ro, inner_len, outer_len);
> +
> +    snprintf(pfl->outer_name, sizeof(pfl->outer_name),
> +             "%s container", pfl->name);
> +    memory_region_init_io(&pfl->mem_outer, OBJECT(dev),
> +                          &pflash_cfi01_outer_ops,
> +                          pfl, pfl->outer_name, outer_len);

Here you create an I/O region but name it "container" ...

> +
> +    memory_region_init_rom_device(&pfl->mem, OBJECT(dev),
> +                                  &pflash_cfi01_ops,
> +                                  pfl, pfl->name, inner_len, errp);
> +    if (*errp) {
> +        return;
> +    }
> +
> +    memory_region_add_subregion(&pfl->mem_outer, 0, &pfl->mem);

... then add it inside the previous region, so &pfl->mem is used
as container (containing &pfl->mem_outer named "container"...).
This is confusing. Anyhow we shouldn't add subregions to I/O
regions but use real containers instead, creating the container
with memory_region_init(), then adding subregions inside.

> +
> +    pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
> +    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem_outer);
> +    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);


Have you audited no code uses:

  mr = sysbus_mmio_get_region(SYS_BUS_DEVICE(pflash), 0);

Because we'd need to change 0 -> 1.

See also the problem with pflash_cfi01_get_memory():
https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg01988.html
https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg02026.html

>      if (pfl->blk) {
> -        if (!blk_check_size_and_read_all(pfl->blk, pfl->storage, total_len,
> -                                         errp)) {
> +        int ret = blk_pread(pfl->blk, 0, pfl->storage, inner_len);
> +
> +        if (ret < 0) {
> +            error_setg_errno(errp, -ret,
> +                             "cannot read %" PRIu64 " "
> +                             "bytes from block backend", inner_len);
>              vmstate_unregister_ram(&pfl->mem, DEVICE(pfl));
>              return;
>          }



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region
  2021-09-09  9:27   ` Philippe Mathieu-Daudé
@ 2021-09-14 14:01     ` David Edmondson
  0 siblings, 0 replies; 5+ messages in thread
From: David Edmondson @ 2021-09-14 14:01 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Kevin Wolf, Tom Lendacky, qemu-block, Peter Maydell, Xu Yandong,
	Alex Bennée, James Bottomley, Michael S. Tsirkin,
	qemu-devel, Max Reitz, Shannon Zhao, Zheng Xiang, qemu-arm,
	haibinzhang(张海斌),
	Stefan Hajnoczi, Paolo Bonzini, Igor Mammedov, John Snow,
	Markus Armbruster, Stefano Garzarella

On Thursday, 2021-09-09 at 11:27:56 +02, Philippe Mathieu-Daudé wrote:

> Hi David,

Philippe, thank you for the careful and considered review.

>> +static MemTxResult pflash_outer_write_with_attrs(void *opaque, hwaddr addr,
>> +                                                 uint64_t value, unsigned len,
>> +                                                 MemTxAttrs attrs)
>> +{
>> +    PFlashCFI01 *pfl = opaque;
>> +
>> +    trace_pflash_outer_write(pfl->name, addr, len);
>> +    if (pfl->ro) {
>> +        return MEMTX_ERROR;
>> +    } else {
>> +        warn_report_once("%s: "
>> +                         "attempt to write outside of the backing block device "
>> +                         "(offset " TARGET_FMT_plx ") ignored",
>> +                         pfl->name, addr);
>
> This doesn't seem acceptable on mainstream, see:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg607292.html

If this position still holds (it seems reasonable to assume so), then
the approach in this patch isn't going to be useful for writeable pflash
devices. With appropriate adjustment for your other comments, it may
well be fine for read-only devices.

Given that, I will go back to trying to find an approach that works for
writeable devices, perhaps by always intercepting reads/writes and
driving the block layer directly (while I previously had that working,
the performance was poor, which I can re-examine).


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-14 14:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-10 13:40 [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices David Edmondson
2021-08-10 13:40 ` [PATCH v4 1/1] hw/pflash_cfi01: Allow backing devices to be smaller than memory region David Edmondson
2021-09-09  9:27   ` Philippe Mathieu-Daudé
2021-09-14 14:01     ` David Edmondson
2021-08-20 18:19 ` [PATCH v4 0/1] hw/pflash_cfi01: Allow an administrator to reduce the memory consumption of flash devices Philippe Mathieu-Daudé

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).