All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read
@ 2015-11-03  0:35 Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file Gabriel L. Somlo
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03  0:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, jordan.l.justen, kraxel, pbonzini, markmb, lersek

New since v2:

	- Patches 1-3: updated to address Laszlo's suggestions for better
	  and more accurate change descriptions in commit logs, comments,
	  etc.

	- Patch 4: Per Laszlo's recommendation, this has been split into
	  two separate components for improved legibility:

		- Patch 4/5: Introduces the new generic read method, and
		  replaces the existing MMIO logic

		- Patch 5/5: Replaces the IOPort read logic, and cleans
		  up the remaining unused bits of code.

Thanks again,
  --Gabriel

>This series' main purpose is to update (and simplify) the specified
>read callback behavior. An earlier standalone patch to move qemu function
>call API documentation into fw_cfg.h should logically be part of the series.
>
>Here's the summary of what each patch does:
>
>        - Patch 1/4 is an updated version of the standalone v1 patch
>          I sent out earlier; it moves all the qemu-internal host-side
>          function call api documentation out of docs/specs/fw_cfg.txt,
>          and into the fw_cfg.h header file, next to the prototype of
>          each documented api function.
>
>        - Patch 2/4 modifies the specified behavior of read callbacks
>          (from being invoked once per byte read, to being invoked once,
>           before ANY data is read, specifically once each time an item
>           is selected).
>
>        - Patch 3/4 additionally removes the now-redundant offset argument
>          from the read callback prototype.
>
>        - Finally, 4/4 consolidates (non-DMA) reads, minimizing the number
>          of times redundant sanity checks are performed, particularly for
>          wide (> byte) sized reads.

Gabriel L. Somlo (5):
  fw_cfg: move internal function call docs to header file
  fw_cfg: amend callback behavior spec to once per select
  fw_cfg: remove offset argument from callback prototype
  fw_cfg: add generic non-DMA read method
  fw_cfg: replace ioport data read with generic method

 docs/specs/fw_cfg.txt     |  85 +-----------------------------
 hw/arm/virt-acpi-build.c  |   2 +-
 hw/i386/acpi-build.c      |   2 +-
 hw/nvram/fw_cfg.c         |  58 ++++++++-------------
 include/hw/nvram/fw_cfg.h | 128 +++++++++++++++++++++++++++++++++++++++++++++-
 trace-events              |   2 +-
 6 files changed, 151 insertions(+), 126 deletions(-)

-- 
2.4.3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file
  2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
@ 2015-11-03  0:35 ` Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select Gabriel L. Somlo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03  0:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, jordan.l.justen, kraxel, pbonzini, markmb, lersek

Move documentation for fw_cfg functions internal to qemu from
docs/specs/fw_cfg.txt to the fw_cfg.h header file, next to their
prototype declarations, formatted as doc-comments.
NOTE: Documentation for fw_cfg_add_callback() is completely
dropped by this patch, as that function has been eliminated
by commit 023e3148.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc Marí <markmb@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
---
 docs/specs/fw_cfg.txt     |  85 +-----------------------------
 include/hw/nvram/fw_cfg.h | 129 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 130 insertions(+), 84 deletions(-)

diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
index b8c794f..2099ad9 100644
--- a/docs/specs/fw_cfg.txt
+++ b/docs/specs/fw_cfg.txt
@@ -192,90 +192,7 @@ To check the result, read the "control" field:
                             today due to implementation not being async,
                             but may in the future).
 
-= Host-side API =
-
-The following functions are available to the QEMU programmer for adding
-data to a fw_cfg device during guest initialization (see fw_cfg.h for
-each function's complete prototype):
-
-== fw_cfg_add_bytes() ==
-
-Given a selector key value, starting pointer, and size, create an item
-as a raw "blob" of the given size, available by selecting the given key.
-The data referenced by the starting pointer is only linked, NOT copied,
-into the data structure of the fw_cfg device.
-
-== fw_cfg_add_string() ==
-
-Instead of a starting pointer and size, this function accepts a pointer
-to a NUL-terminated ascii string, and inserts a newly allocated copy of
-the string (including the NUL terminator) into the fw_cfg device data
-structure.
-
-== fw_cfg_add_iXX() ==
-
-Insert an XX-bit item, where XX may be 16, 32, or 64. These functions
-will convert a 16-, 32-, or 64-bit integer to little-endian, then add
-a dynamically allocated copy of the appropriately sized item to fw_cfg
-under the given selector key value.
-
-== fw_cfg_modify_iXX() ==
-
-Modify the value of an XX-bit item (where XX may be 16, 32, or 64).
-Similarly to the corresponding fw_cfg_add_iXX() function set, convert
-a 16-, 32-, or 64-bit integer to little endian, create a dynamically
-allocated copy of the required size, and replace the existing item at
-the given selector key value with the newly allocated one. The previous
-item, assumed to have been allocated during an earlier call to
-fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed
-before the function returns.
-
-== fw_cfg_add_file() ==
-
-Given a filename (i.e., fw_cfg item name), starting pointer, and size,
-create an item as a raw "blob" of the given size. Unlike fw_cfg_add_bytes()
-above, the next available selector key (above 0x0020, FW_CFG_FILE_FIRST)
-will be used, and a new entry will be added to the file directory structure
-(at key 0x0019), containing the item name, blob size, and automatically
-assigned selector key value. The data referenced by the starting pointer
-is only linked, NOT copied, into the fw_cfg data structure.
-
-== fw_cfg_add_file_callback() ==
-
-Like fw_cfg_add_file(), but additionally sets pointers to a callback
-function (and opaque argument), which will be executed host-side by
-QEMU each time a byte is read by the guest from this particular item.
-
-NOTE: The callback function is given the opaque argument set by
-fw_cfg_add_file_callback(), but also the current data offset,
-allowing it the option of only acting upon specific offset values
-(e.g., 0, before the first data byte of the selected item is
-returned to the guest).
-
-== fw_cfg_modify_file() ==
-
-Given a filename (i.e., fw_cfg item name), starting pointer, and size,
-completely replace the configuration item referenced by the given item
-name with the new given blob. If an existing blob is found, its
-callback information is removed, and a pointer to the old data is
-returned to allow the caller to free it, helping avoid memory leaks.
-If a configuration item does not already exist under the given item
-name, a new item will be created as with fw_cfg_add_file(), and NULL
-is returned to the caller. In any case, the data referenced by the
-starting pointer is only linked, NOT copied, into the fw_cfg data
-structure.
-
-== fw_cfg_add_callback() ==
-
-Like fw_cfg_add_bytes(), but additionally sets pointers to a callback
-function (and opaque argument), which will be executed host-side by
-QEMU each time a guest-side write operation to this particular item
-completes fully overwriting the item's data.
-
-NOTE: This function is deprecated, and will be completely removed
-starting with QEMU v2.4.
-
-== Externally Provided Items ==
+= Externally Provided Items =
 
 As of v2.4, "file" fw_cfg items (i.e., items with selector keys above
 FW_CFG_FILE_FIRST, and with a corresponding entry in the fw_cfg file
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index ee0cd8a..4b5e196 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -73,19 +73,148 @@ typedef struct FWCfgDmaAccess {
 typedef void (*FWCfgCallback)(void *opaque, uint8_t *data);
 typedef void (*FWCfgReadCallback)(void *opaque, uint32_t offset);
 
+/**
+ * fw_cfg_add_bytes:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @data: pointer to start of item data
+ * @len: size of item data
+ *
+ * Add a new fw_cfg item, available by selecting the given key, as a raw
+ * "blob" of the given size. The data referenced by the starting pointer
+ * is only linked, NOT copied, into the data structure of the fw_cfg device.
+ */
 void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len);
+
+/**
+ * fw_cfg_add_string:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: NUL-terminated ascii string
+ *
+ * Add a new fw_cfg item, available by selecting the given key. The item
+ * data will consist of a dynamically allocated copy of the provided string,
+ * including its NUL terminator.
+ */
 void fw_cfg_add_string(FWCfgState *s, uint16_t key, const char *value);
+
+/**
+ * fw_cfg_add_i16:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: 16-bit integer
+ *
+ * Add a new fw_cfg item, available by selecting the given key. The item
+ * data will consist of a dynamically allocated copy of the given 16-bit
+ * value, converted to little-endian representation.
+ */
 void fw_cfg_add_i16(FWCfgState *s, uint16_t key, uint16_t value);
+
+/**
+ * fw_cfg_modify_i16:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: 16-bit integer
+ *
+ * Replace the fw_cfg item available by selecting the given key. The new
+ * data will consist of a dynamically allocated copy of the given 16-bit
+ * value, converted to little-endian representation. The data being replaced,
+ * assumed to have been dynamically allocated during an earlier call to
+ * either fw_cfg_add_i16() or fw_cfg_modify_i16(), is freed before returning.
+ */
 void fw_cfg_modify_i16(FWCfgState *s, uint16_t key, uint16_t value);
+
+/**
+ * fw_cfg_add_i32:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: 32-bit integer
+ *
+ * Add a new fw_cfg item, available by selecting the given key. The item
+ * data will consist of a dynamically allocated copy of the given 32-bit
+ * value, converted to little-endian representation.
+ */
 void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value);
+
+/**
+ * fw_cfg_add_i64:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: 64-bit integer
+ *
+ * Add a new fw_cfg item, available by selecting the given key. The item
+ * data will consist of a dynamically allocated copy of the given 64-bit
+ * value, converted to little-endian representation.
+ */
 void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value);
+
+/**
+ * fw_cfg_add_file:
+ * @s: fw_cfg device being modified
+ * @filename: name of new fw_cfg file item
+ * @data: pointer to start of item data
+ * @len: size of item data
+ *
+ * Add a new NAMED fw_cfg item as a raw "blob" of the given size. The data
+ * referenced by the starting pointer is only linked, NOT copied, into the
+ * data structure of the fw_cfg device.
+ * The next available (unused) selector key starting at FW_CFG_FILE_FIRST
+ * will be used; also, a new entry will be added to the file directory
+ * structure residing at key value FW_CFG_FILE_DIR, containing the item name,
+ * data size, and assigned selector key value.
+ */
 void fw_cfg_add_file(FWCfgState *s, const char *filename, void *data,
                      size_t len);
+
+/**
+ * fw_cfg_add_file_callback:
+ * @s: fw_cfg device being modified
+ * @filename: name of new fw_cfg file item
+ * @callback: callback function
+ * @callback_opaque: argument to be passed into callback function
+ * @data: pointer to start of item data
+ * @len: size of item data
+ *
+ * Add a new NAMED fw_cfg item as a raw "blob" of the given size. The data
+ * referenced by the starting pointer is only linked, NOT copied, into the
+ * data structure of the fw_cfg device.
+ * The next available (unused) selector key starting at FW_CFG_FILE_FIRST
+ * will be used; also, a new entry will be added to the file directory
+ * structure residing at key value FW_CFG_FILE_DIR, containing the item name,
+ * data size, and assigned selector key value.
+ * Additionally, set a callback function (and argument) to be called each
+ * time a byte is read by the guest from this particular item, or, in the
+ * case of DMA, each time a read or skip request overlaps with the valid
+ * offset range of the item.
+ * NOTE: In addition to the opaque argument set here, the callback function
+ * takes the current data offset as an additional argument, allowing it the
+ * option of only acting upon specific offset values (e.g., 0, before the
+ * first data byte of the selected item is returned to the guest).
+ */
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
                               FWCfgReadCallback callback, void *callback_opaque,
                               void *data, size_t len);
+
+/**
+ * fw_cfg_modify_file:
+ * @s: fw_cfg device being modified
+ * @filename: name of new fw_cfg file item
+ * @data: pointer to start of item data
+ * @len: size of item data
+ *
+ * Replace a NAMED fw_cfg item. If an existing item is found, its callback
+ * information will be cleared, and a pointer to its data will be returned
+ * to the caller, so that it may be freed if necessary. If an existing item
+ * is not found, this call defaults to fw_cfg_add_file(), and NULL is
+ * returned to the caller.
+ * In either case, the new item data is only linked, NOT copied, into the
+ * data structure of the fw_cfg device.
+ *
+ * Returns: pointer to old item's data, or NULL if old item does not exist.
+ */
 void *fw_cfg_modify_file(FWCfgState *s, const char *filename, void *data,
                          size_t len);
+
 FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t dma_iobase,
                                 AddressSpace *dma_as);
 FWCfgState *fw_cfg_init_io(uint32_t iobase);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select
  2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file Gabriel L. Somlo
@ 2015-11-03  0:35 ` Gabriel L. Somlo
  2015-11-03  9:51   ` Laszlo Ersek
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 3/5] fw_cfg: remove offset argument from callback prototype Gabriel L. Somlo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03  0:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, jordan.l.justen, kraxel, pbonzini, markmb, lersek

Currently, the fw_cfg internal API specifies that if an item was set up
with a read callback, the callback must be run each time a byte is read
from the item. This behavior is both wasteful (most items do not have a
read callback set), and impractical for bulk transfers (e.g., DMA read).

At the time of this writing, the only items configured with a callback
are "/etc/table-loader", "/etc/acpi/tables", and "/etc/acpi/rsdp". They
all share the same callback functions: virt_acpi_build_update() on ARM
(in hw/arm/virt-acpi-build.c), and acpi_build_update() on i386 (in
hw/i386/acpi.c). Both of these callbacks are one-shot (i.e. they return
without doing anything at all after the first time they are invoked with
a given build_state; since build_state is also shared across all three
items mentioned above, the callback only ever runs *once*, the first
time either of the listed items is read).

This patch amends the specification for fw_cfg_add_file_callback() to
state that any available read callback will only be invoked once each
time the item is selected. This change has no practical effect on the
current behavior of QEMU, and it enables us to significantly optimize
the behavior of fw_cfg reads during guest firmware setup, eliminating
a large amount of redundant callback checks and invocations.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc Marí <markmb@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
---
 hw/nvram/fw_cfg.c         | 19 ++++++++++---------
 include/hw/nvram/fw_cfg.h | 10 +++-------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 73b0a81..6e6414b 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -252,7 +252,8 @@ static void fw_cfg_write(FWCfgState *s, uint8_t value)
 
 static int fw_cfg_select(FWCfgState *s, uint16_t key)
 {
-    int ret;
+    int arch, ret;
+    FWCfgEntry *e;
 
     s->cur_offset = 0;
     if ((key & FW_CFG_ENTRY_MASK) >= FW_CFG_MAX_ENTRY) {
@@ -261,6 +262,12 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
     } else {
         s->cur_entry = key;
         ret = 1;
+        /* entry successfully selected, now run callback if present */
+        arch = !!(key & FW_CFG_ARCH_LOCAL);
+        e = &s->entries[arch][key & FW_CFG_ENTRY_MASK];
+        if (e->read_callback) {
+            e->read_callback(e->callback_opaque, s->cur_offset);
+        }
     }
 
     trace_fw_cfg_select(s, key, ret);
@@ -276,9 +283,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
     if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
         ret = 0;
     else {
-        if (e->read_callback) {
-            e->read_callback(e->callback_opaque, s->cur_offset);
-        }
         ret = e->data[s->cur_offset++];
     }
 
@@ -371,10 +375,6 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
                 len = (e->len - s->cur_offset);
             }
 
-            if (e->read_callback) {
-                e->read_callback(e->callback_opaque, s->cur_offset);
-            }
-
             /* If the access is not a read access, it will be a skip access,
              * tested before.
              */
@@ -513,7 +513,8 @@ static void fw_cfg_reset(DeviceState *d)
 {
     FWCfgState *s = FW_CFG(d);
 
-    fw_cfg_select(s, 0);
+    /* we never register a read callback for FW_CFG_SIGNATURE */
+    fw_cfg_select(s, FW_CFG_SIGNATURE);
 }
 
 /* Save restore 32 bit int as uint16_t
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 4b5e196..a1cfaa4 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -183,13 +183,9 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, void *data,
  * structure residing at key value FW_CFG_FILE_DIR, containing the item name,
  * data size, and assigned selector key value.
  * Additionally, set a callback function (and argument) to be called each
- * time a byte is read by the guest from this particular item, or, in the
- * case of DMA, each time a read or skip request overlaps with the valid
- * offset range of the item.
- * NOTE: In addition to the opaque argument set here, the callback function
- * takes the current data offset as an additional argument, allowing it the
- * option of only acting upon specific offset values (e.g., 0, before the
- * first data byte of the selected item is returned to the guest).
+ * time this item is selected (by having its selector key either written to
+ * the fw_cfg control register, or passed to QEMU in FWCfgDmaAccess.control
+ * with FW_CFG_DMA_CTL_SELECT).
  */
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
                               FWCfgReadCallback callback, void *callback_opaque,
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 3/5] fw_cfg: remove offset argument from callback prototype
  2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select Gabriel L. Somlo
@ 2015-11-03  0:35 ` Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method Gabriel L. Somlo
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method Gabriel L. Somlo
  4 siblings, 0 replies; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03  0:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, jordan.l.justen, kraxel, pbonzini, markmb, lersek

Read callbacks are now only invoked at item selection, before any
data is read. As such, the value of the offset argument passed to
the callback will always be 0. Also, the two callback instances
currently in use both leave their offset argument unused.

This patch removes the offset argument from the fw_cfg read callback
prototype, and from the currently available instances. The unused
(write) callback prototype is also removed (write support was removed
earlier, in commit 023e3148).

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc Marí <markmb@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
---
 hw/arm/virt-acpi-build.c  | 2 +-
 hw/i386/acpi-build.c      | 2 +-
 hw/nvram/fw_cfg.c         | 2 +-
 include/hw/nvram/fw_cfg.h | 3 +--
 4 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 1aaff1f..4eed24d 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -626,7 +626,7 @@ static void acpi_ram_update(MemoryRegion *mr, GArray *data)
     memory_region_set_dirty(mr, 0, size);
 }
 
-static void virt_acpi_build_update(void *build_opaque, uint32_t offset)
+static void virt_acpi_build_update(void *build_opaque)
 {
     AcpiBuildState *build_state = build_opaque;
     AcpiBuildTables tables;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 95e0c65..29e30ce 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1818,7 +1818,7 @@ static void acpi_ram_update(MemoryRegion *mr, GArray *data)
     memory_region_set_dirty(mr, 0, size);
 }
 
-static void acpi_build_update(void *build_opaque, uint32_t offset)
+static void acpi_build_update(void *build_opaque)
 {
     AcpiBuildState *build_state = build_opaque;
     AcpiBuildTables tables;
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 6e6414b..c2d3a0a 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -266,7 +266,7 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
         arch = !!(key & FW_CFG_ARCH_LOCAL);
         e = &s->entries[arch][key & FW_CFG_ENTRY_MASK];
         if (e->read_callback) {
-            e->read_callback(e->callback_opaque, s->cur_offset);
+            e->read_callback(e->callback_opaque);
         }
     }
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index a1cfaa4..664eaf6 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -70,8 +70,7 @@ typedef struct FWCfgDmaAccess {
     uint64_t address;
 } QEMU_PACKED FWCfgDmaAccess;
 
-typedef void (*FWCfgCallback)(void *opaque, uint8_t *data);
-typedef void (*FWCfgReadCallback)(void *opaque, uint32_t offset);
+typedef void (*FWCfgReadCallback)(void *opaque);
 
 /**
  * fw_cfg_add_bytes:
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
                   ` (2 preceding siblings ...)
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 3/5] fw_cfg: remove offset argument from callback prototype Gabriel L. Somlo
@ 2015-11-03  0:35 ` Gabriel L. Somlo
  2015-11-03 10:53   ` Laszlo Ersek
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method Gabriel L. Somlo
  4 siblings, 1 reply; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03  0:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, jordan.l.justen, kraxel, pbonzini, markmb, lersek

Introduce fw_cfg_data_read(), a generic read method which works
on all access widths (1 through 8 bytes, inclusive), and can be
used during both IOPort and MMIO read accesses.

To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
data read method) is replaced by this patch. The new method
essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
combo, but without unnecessarily repeating all the validity
checks performed by the latter on each byte being read.

This patch also modifies the trace_fw_cfg_read prototype to
accept a 64-bit value argument, allowing it to work properly
with the new read method, but also remain backward compatible
with existing call sites.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc Marí <markmb@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
---
 hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
 trace-events      |  2 +-
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index c2d3a0a..8aa980c 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
     return ret;
 }
 
+static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
+{
+    FWCfgState *s = opaque;
+    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
+    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
+    uint64_t value = 0;
+
+    assert(size <= sizeof(value));
+    if (s->cur_entry != FW_CFG_INVALID && e->data) {
+        while (size-- && s->cur_offset < e->len) {
+            value = (value << 8) | e->data[s->cur_offset++];
+        }
+    }
+
+    trace_fw_cfg_read(s, value);
+    return value;
+}
+
 static uint8_t fw_cfg_read(FWCfgState *s)
 {
     int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
@@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
     return ret;
 }
 
-static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
-                                     unsigned size)
-{
-    FWCfgState *s = opaque;
-    uint64_t value = 0;
-    unsigned i;
-
-    for (i = 0; i < size; ++i) {
-        value = (value << 8) | fw_cfg_read(s);
-    }
-    return value;
-}
-
 static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
                                   uint64_t value, unsigned size)
 {
@@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
 };
 
 static const MemoryRegionOps fw_cfg_data_mem_ops = {
-    .read = fw_cfg_data_mem_read,
+    .read = fw_cfg_data_read,
     .write = fw_cfg_data_mem_write,
     .endianness = DEVICE_BIG_ENDIAN,
     .valid = {
diff --git a/trace-events b/trace-events
index 72136b9..5073040 100644
--- a/trace-events
+++ b/trace-events
@@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
 
 # hw/nvram/fw_cfg.c
 fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
-fw_cfg_read(void *s, uint8_t ret) "%p = %d"
+fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
 fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
 
 # hw/block/hd-geometry.c
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method
  2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
                   ` (3 preceding siblings ...)
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method Gabriel L. Somlo
@ 2015-11-03  0:35 ` Gabriel L. Somlo
  2015-11-03 11:11   ` Laszlo Ersek
  4 siblings, 1 reply; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03  0:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, jordan.l.justen, kraxel, pbonzini, markmb, lersek

IOPort read access is limited to one byte at a time by
fw_cfg_comb_valid(). As such, fw_cfg_comb_read() may safely
ignore its size argument (which will always be 1), and simply
call its fw_cfg_read() helper function once, returning 8 bits
via the least significant byte of a 64-bit return value.

This patch replaces fw_cfg_comb_read() with the generic method
fw_cfg_data_read(), and removes the unused fw_cfg_read() helper.

When called with size = 1, fw_cfg_data_read() acts exactly like
fw_cfg_read(), performing the same set of sanity checks, and
executing the while loop at most once (subject to the current
read offset being within range).

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc Marí <markmb@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
---
 hw/nvram/fw_cfg.c | 24 +-----------------------
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 8aa980c..7d216f0 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -292,22 +292,6 @@ static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
     return value;
 }
 
-static uint8_t fw_cfg_read(FWCfgState *s)
-{
-    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
-    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
-    uint8_t ret;
-
-    if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
-        ret = 0;
-    else {
-        ret = e->data[s->cur_offset++];
-    }
-
-    trace_fw_cfg_read(s, ret);
-    return ret;
-}
-
 static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
                                   uint64_t value, unsigned size)
 {
@@ -456,12 +440,6 @@ static bool fw_cfg_ctl_mem_valid(void *opaque, hwaddr addr,
     return is_write && size == 2;
 }
 
-static uint64_t fw_cfg_comb_read(void *opaque, hwaddr addr,
-                                 unsigned size)
-{
-    return fw_cfg_read(opaque);
-}
-
 static void fw_cfg_comb_write(void *opaque, hwaddr addr,
                               uint64_t value, unsigned size)
 {
@@ -499,7 +477,7 @@ static const MemoryRegionOps fw_cfg_data_mem_ops = {
 };
 
 static const MemoryRegionOps fw_cfg_comb_mem_ops = {
-    .read = fw_cfg_comb_read,
+    .read = fw_cfg_data_read,
     .write = fw_cfg_comb_write,
     .endianness = DEVICE_LITTLE_ENDIAN,
     .valid.accepts = fw_cfg_comb_valid,
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select Gabriel L. Somlo
@ 2015-11-03  9:51   ` Laszlo Ersek
  0 siblings, 0 replies; 14+ messages in thread
From: Laszlo Ersek @ 2015-11-03  9:51 UTC (permalink / raw)
  To: Gabriel L. Somlo, qemu-devel
  Cc: peter.maydell, jordan.l.justen, markmb, kraxel, pbonzini

On 11/03/15 01:35, Gabriel L. Somlo wrote:
> Currently, the fw_cfg internal API specifies that if an item was set up
> with a read callback, the callback must be run each time a byte is read
> from the item. This behavior is both wasteful (most items do not have a
> read callback set), and impractical for bulk transfers (e.g., DMA read).
> 
> At the time of this writing, the only items configured with a callback
> are "/etc/table-loader", "/etc/acpi/tables", and "/etc/acpi/rsdp". They
> all share the same callback functions: virt_acpi_build_update() on ARM
> (in hw/arm/virt-acpi-build.c), and acpi_build_update() on i386 (in
> hw/i386/acpi.c). Both of these callbacks are one-shot (i.e. they return
> without doing anything at all after the first time they are invoked with
> a given build_state; since build_state is also shared across all three
> items mentioned above, the callback only ever runs *once*, the first
> time either of the listed items is read).
> 
> This patch amends the specification for fw_cfg_add_file_callback() to
> state that any available read callback will only be invoked once each
> time the item is selected. This change has no practical effect on the
> current behavior of QEMU, and it enables us to significantly optimize
> the behavior of fw_cfg reads during guest firmware setup, eliminating
> a large amount of redundant callback checks and invocations.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Marc Marí <markmb@redhat.com>
> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> ---
>  hw/nvram/fw_cfg.c         | 19 ++++++++++---------
>  include/hw/nvram/fw_cfg.h | 10 +++-------
>  2 files changed, 13 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 73b0a81..6e6414b 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -252,7 +252,8 @@ static void fw_cfg_write(FWCfgState *s, uint8_t value)
>  
>  static int fw_cfg_select(FWCfgState *s, uint16_t key)
>  {
> -    int ret;
> +    int arch, ret;
> +    FWCfgEntry *e;
>  
>      s->cur_offset = 0;
>      if ((key & FW_CFG_ENTRY_MASK) >= FW_CFG_MAX_ENTRY) {
> @@ -261,6 +262,12 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>      } else {
>          s->cur_entry = key;
>          ret = 1;
> +        /* entry successfully selected, now run callback if present */
> +        arch = !!(key & FW_CFG_ARCH_LOCAL);
> +        e = &s->entries[arch][key & FW_CFG_ENTRY_MASK];
> +        if (e->read_callback) {
> +            e->read_callback(e->callback_opaque, s->cur_offset);
> +        }
>      }
>  
>      trace_fw_cfg_select(s, key, ret);
> @@ -276,9 +283,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>      if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
>          ret = 0;
>      else {
> -        if (e->read_callback) {
> -            e->read_callback(e->callback_opaque, s->cur_offset);
> -        }
>          ret = e->data[s->cur_offset++];
>      }
>  
> @@ -371,10 +375,6 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
>                  len = (e->len - s->cur_offset);
>              }
>  
> -            if (e->read_callback) {
> -                e->read_callback(e->callback_opaque, s->cur_offset);
> -            }
> -
>              /* If the access is not a read access, it will be a skip access,
>               * tested before.
>               */
> @@ -513,7 +513,8 @@ static void fw_cfg_reset(DeviceState *d)
>  {
>      FWCfgState *s = FW_CFG(d);
>  
> -    fw_cfg_select(s, 0);
> +    /* we never register a read callback for FW_CFG_SIGNATURE */
> +    fw_cfg_select(s, FW_CFG_SIGNATURE);
>  }
>  
>  /* Save restore 32 bit int as uint16_t
> diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> index 4b5e196..a1cfaa4 100644
> --- a/include/hw/nvram/fw_cfg.h
> +++ b/include/hw/nvram/fw_cfg.h
> @@ -183,13 +183,9 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, void *data,
>   * structure residing at key value FW_CFG_FILE_DIR, containing the item name,
>   * data size, and assigned selector key value.
>   * Additionally, set a callback function (and argument) to be called each
> - * time a byte is read by the guest from this particular item, or, in the
> - * case of DMA, each time a read or skip request overlaps with the valid
> - * offset range of the item.
> - * NOTE: In addition to the opaque argument set here, the callback function
> - * takes the current data offset as an additional argument, allowing it the
> - * option of only acting upon specific offset values (e.g., 0, before the
> - * first data byte of the selected item is returned to the guest).
> + * time this item is selected (by having its selector key either written to
> + * the fw_cfg control register, or passed to QEMU in FWCfgDmaAccess.control
> + * with FW_CFG_DMA_CTL_SELECT).
>   */
>  void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
>                                FWCfgReadCallback callback, void *callback_opaque,
> 

diffed this against the v2 counterpart, looks good

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method Gabriel L. Somlo
@ 2015-11-03 10:53   ` Laszlo Ersek
  2015-11-03 17:55     ` Gabriel L. Somlo
  0 siblings, 1 reply; 14+ messages in thread
From: Laszlo Ersek @ 2015-11-03 10:53 UTC (permalink / raw)
  To: Gabriel L. Somlo, qemu-devel
  Cc: peter.maydell, jordan.l.justen, markmb, kraxel, pbonzini

Thank you for splitting out this patch; it makes it easier to review.
However,

On 11/03/15 01:35, Gabriel L. Somlo wrote:
> Introduce fw_cfg_data_read(), a generic read method which works
> on all access widths (1 through 8 bytes, inclusive), and can be
> used during both IOPort and MMIO read accesses.
> 
> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
> data read method) is replaced by this patch. The new method
> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
> combo, but without unnecessarily repeating all the validity
> checks performed by the latter on each byte being read.

this unwinding caused a bug to creep in.

Namely, we have to identify the set of data that remains constant
between *all* "size" calls that fw_cfg_data_mem_read() makes to
fw_cfg_read(), and hoist / eliminate the checks on those *only*.

Specifically,

> This patch also modifies the trace_fw_cfg_read prototype to
> accept a 64-bit value argument, allowing it to work properly
> with the new read method, but also remain backward compatible
> with existing call sites.
> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Marc Marí <markmb@redhat.com>
> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> ---
>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
>  trace-events      |  2 +-
>  2 files changed, 20 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index c2d3a0a..8aa980c 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>      return ret;
>  }
>  
> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    FWCfgState *s = opaque;

This is good.

> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);

Okay too.

> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];

(1) Side point: the conversion here is faithful to the original code in
fw_cfg_read(), but even in the original code, the expression uses
"s->cur_entry" as a (masked) subscript *before* comparing it against
FW_CFG_INVALID. I don't think that's right.

The same issue is present in fw_cfg_dma_transfer(). Care to write a
patch (before the restructuring) that fixes both?

Note, I am aware that the expression in both of the above mentioned
functions only calculates the *address* of the nonexistent element
belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:

  e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];

But it doesn't matter; it's undefined behavior just the same. Instead,
*both* locations should say:

 e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
     &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];

(I share the blame for not noticing this earlier -- I too reviewed
fw_cfg_dma_transfer().)

NULL is a valid pointer to *evaluate* (not to dereference), whereas the
current address-of expression is not valid even for evaluation. Also, in
practice, dereferencing NULL would give us a nice (as in, non-garbage)
SIGSEGV.

Anyway, back to the topic at hand:

> +    uint64_t value = 0;
> +
> +    assert(size <= sizeof(value));
> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {

Right, good conversion. (Side note: this does protect against
*dereferencing* "e", but it's already too late, as far as undefined
behavior is concerned.)

> +        while (size-- && s->cur_offset < e->len) {
> +            value = (value << 8) | e->data[s->cur_offset++];
> +        }

(2) So, this is the bug. The pre-conversion code would keep shifting
"value" to the left until "size" was reached, regardless of the
underlying blob size, and just leave the least significant bytes zeroed
if the item ended too early. Whereas this loop *stops shifting* when the
blob ends.

Since the wide data register (which is big-endian) implements a
substring-preserving transfer (on top of QEMU's integer preserving
device r/w infrastructure), this change breaks the case when the
firmware reads, say, 8 bytes from the register in a single access, when
only 3 are left in the blob, and then uses only the three *lowest
address* bytes from the uint64_t value read. Although no known firmware
does this at the moment, it would be valid, and the above hunk would
break it.

Hence please

(2a) either append the missing "cumulative" shift after the loop:

    while (size && s->cur_offset < e->len) {
        --size;
        value = (value << 8) | e->data[s->cur_offset++];
    }
    value <<= 8 * size;

(2b) or move the offset check from the loop's controlling expression
into the value composition:

        while (size--) {
            value = (value << 8) | (s->cur_offset < e->len ?
                                    e->data[s->cur_offset++] :
                                    0);
        }

The rest looks good.

Thanks
Laszlo

> +    }
> +
> +    trace_fw_cfg_read(s, value);
> +    return value;
> +}
> +
>  static uint8_t fw_cfg_read(FWCfgState *s)
>  {
>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>      return ret;
>  }
>  
> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
> -                                     unsigned size)
> -{
> -    FWCfgState *s = opaque;
> -    uint64_t value = 0;
> -    unsigned i;
> -
> -    for (i = 0; i < size; ++i) {
> -        value = (value << 8) | fw_cfg_read(s);
> -    }
> -    return value;
> -}
> -
>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
>                                    uint64_t value, unsigned size)
>  {
> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
>  };
>  
>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
> -    .read = fw_cfg_data_mem_read,
> +    .read = fw_cfg_data_read,
>      .write = fw_cfg_data_mem_write,
>      .endianness = DEVICE_BIG_ENDIAN,
>      .valid = {
> diff --git a/trace-events b/trace-events
> index 72136b9..5073040 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
>  
>  # hw/nvram/fw_cfg.c
>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
>  
>  # hw/block/hd-geometry.c
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method
  2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method Gabriel L. Somlo
@ 2015-11-03 11:11   ` Laszlo Ersek
  0 siblings, 0 replies; 14+ messages in thread
From: Laszlo Ersek @ 2015-11-03 11:11 UTC (permalink / raw)
  To: Gabriel L. Somlo, qemu-devel
  Cc: peter.maydell, jordan.l.justen, markmb, kraxel, pbonzini

On 11/03/15 01:35, Gabriel L. Somlo wrote:
> IOPort read access is limited to one byte at a time by
> fw_cfg_comb_valid(). As such, fw_cfg_comb_read() may safely
> ignore its size argument (which will always be 1), and simply
> call its fw_cfg_read() helper function once, returning 8 bits
> via the least significant byte of a 64-bit return value.
> 
> This patch replaces fw_cfg_comb_read() with the generic method
> fw_cfg_data_read(), and removes the unused fw_cfg_read() helper.
> 
> When called with size = 1, fw_cfg_data_read() acts exactly like
> fw_cfg_read(), performing the same set of sanity checks, and
> executing the while loop at most once (subject to the current
> read offset being within range).

If you choose (2a) for v3 4/5, then the commit message here can stay as-is.

If you choose (2b), then the last sentence should say

  When called with size = 1, fw_cfg_data_read() acts exactly like
  fw_cfg_read(), performing the same set of sanity checks, and
  executing the while loop exactly once (where the value composition
  honors the current read offset being within range).

With that update (as appropriate),

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks
Laszlo

> 
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Marc Marí <markmb@redhat.com>
> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> ---
>  hw/nvram/fw_cfg.c | 24 +-----------------------
>  1 file changed, 1 insertion(+), 23 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 8aa980c..7d216f0 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -292,22 +292,6 @@ static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>      return value;
>  }
>  
> -static uint8_t fw_cfg_read(FWCfgState *s)
> -{
> -    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> -    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> -    uint8_t ret;
> -
> -    if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)
> -        ret = 0;
> -    else {
> -        ret = e->data[s->cur_offset++];
> -    }
> -
> -    trace_fw_cfg_read(s, ret);
> -    return ret;
> -}
> -
>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
>                                    uint64_t value, unsigned size)
>  {
> @@ -456,12 +440,6 @@ static bool fw_cfg_ctl_mem_valid(void *opaque, hwaddr addr,
>      return is_write && size == 2;
>  }
>  
> -static uint64_t fw_cfg_comb_read(void *opaque, hwaddr addr,
> -                                 unsigned size)
> -{
> -    return fw_cfg_read(opaque);
> -}
> -
>  static void fw_cfg_comb_write(void *opaque, hwaddr addr,
>                                uint64_t value, unsigned size)
>  {
> @@ -499,7 +477,7 @@ static const MemoryRegionOps fw_cfg_data_mem_ops = {
>  };
>  
>  static const MemoryRegionOps fw_cfg_comb_mem_ops = {
> -    .read = fw_cfg_comb_read,
> +    .read = fw_cfg_data_read,
>      .write = fw_cfg_comb_write,
>      .endianness = DEVICE_LITTLE_ENDIAN,
>      .valid.accepts = fw_cfg_comb_valid,
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03 10:53   ` Laszlo Ersek
@ 2015-11-03 17:55     ` Gabriel L. Somlo
  2015-11-03 21:35       ` Laszlo Ersek
  0 siblings, 1 reply; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03 17:55 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: peter.maydell, jordan.l.justen, qemu-devel, kraxel, pbonzini, markmb

On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
> Thank you for splitting out this patch; it makes it easier to review.
> However,
> 
> On 11/03/15 01:35, Gabriel L. Somlo wrote:
> > Introduce fw_cfg_data_read(), a generic read method which works
> > on all access widths (1 through 8 bytes, inclusive), and can be
> > used during both IOPort and MMIO read accesses.
> > 
> > To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
> > data read method) is replaced by this patch. The new method
> > essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
> > combo, but without unnecessarily repeating all the validity
> > checks performed by the latter on each byte being read.
> 
> this unwinding caused a bug to creep in.
> 
> Namely, we have to identify the set of data that remains constant
> between *all* "size" calls that fw_cfg_data_mem_read() makes to
> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
> 
> Specifically,
> 
> > This patch also modifies the trace_fw_cfg_read prototype to
> > accept a 64-bit value argument, allowing it to work properly
> > with the new read method, but also remain backward compatible
> > with existing call sites.
> > 
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Cc: Gerd Hoffmann <kraxel@redhat.com>
> > Cc: Marc Marí <markmb@redhat.com>
> > Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> > ---
> >  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
> >  trace-events      |  2 +-
> >  2 files changed, 20 insertions(+), 15 deletions(-)
> > 
> > diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> > index c2d3a0a..8aa980c 100644
> > --- a/hw/nvram/fw_cfg.c
> > +++ b/hw/nvram/fw_cfg.c
> > @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
> >      return ret;
> >  }
> >  
> > +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> > +{
> > +    FWCfgState *s = opaque;
> 
> This is good.
> 
> > +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> 
> Okay too.
> 
> > +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> 
> (1) Side point: the conversion here is faithful to the original code in
> fw_cfg_read(), but even in the original code, the expression uses
> "s->cur_entry" as a (masked) subscript *before* comparing it against
> FW_CFG_INVALID. I don't think that's right.
> 
> The same issue is present in fw_cfg_dma_transfer(). Care to write a
> patch (before the restructuring) that fixes both?
> 
> Note, I am aware that the expression in both of the above mentioned
> functions only calculates the *address* of the nonexistent element
> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
> 
>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> 
> But it doesn't matter; it's undefined behavior just the same. Instead,
> *both* locations should say:
> 
>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> 
> (I share the blame for not noticing this earlier -- I too reviewed
> fw_cfg_dma_transfer().)
> 
> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
> current address-of expression is not valid even for evaluation. Also, in
> practice, dereferencing NULL would give us a nice (as in, non-garbage)
> SIGSEGV.

Done.

> 
> Anyway, back to the topic at hand:
> 
> > +    uint64_t value = 0;
> > +
> > +    assert(size <= sizeof(value));
> > +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> 
> Right, good conversion. (Side note: this does protect against
> *dereferencing* "e", but it's already too late, as far as undefined
> behavior is concerned.)
> 
> > +        while (size-- && s->cur_offset < e->len) {
> > +            value = (value << 8) | e->data[s->cur_offset++];
> > +        }
> 
> (2) So, this is the bug. The pre-conversion code would keep shifting
> "value" to the left until "size" was reached, regardless of the
> underlying blob size, and just leave the least significant bytes zeroed
> if the item ended too early. Whereas this loop *stops shifting* when the
> blob ends.

D'OH!!! That should teach me to pay more attention -- thanks for
catching it!

> Since the wide data register (which is big-endian) implements a
> substring-preserving transfer (on top of QEMU's integer preserving
> device r/w infrastructure), this change breaks the case when the
> firmware reads, say, 8 bytes from the register in a single access, when
> only 3 are left in the blob, and then uses only the three *lowest
> address* bytes from the uint64_t value read. Although no known firmware
> does this at the moment, it would be valid, and the above hunk would
> break it.
> 
> Hence please
> 
> (2a) either append the missing "cumulative" shift after the loop:
> 
>     while (size && s->cur_offset < e->len) {
>         --size;
>         value = (value << 8) | e->data[s->cur_offset++];
>     }
>     value <<= 8 * size;

I went with 2a. Also added a comment to make things painfully obvious
to any potential future archaeologists:

+static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
+{
+    FWCfgState *s = opaque;
+    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
+    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
+                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
+    uint64_t value = 0;
+
+    assert(size <= sizeof(value));
+    if (s->cur_entry != FW_CFG_INVALID && e->data) {
+        /* The least significant 'size' bytes of the return value are
+         * expected to contain a string preserving portion of the item
+         * data, padded with zeros to the right in case we run out early.
+         */
+        while (size && s->cur_offset < e->len) {
+            value = (value << 8) | e->data[s->cur_offset++];
+            size--;
+        }
+        /* If size is still not zero, we *did* run out early, so finish
+         * left-shifting to add the appropriate number of padding zeros
+         * on the right.
+         */
+        value <<= 8 * size;
+    }
+
+    trace_fw_cfg_read(s, value);
+    return value;
+}

Version 4 should be out by the end of today.

Thanks again,
--Gabriel

> 
> (2b) or move the offset check from the loop's controlling expression
> into the value composition:
> 
>         while (size--) {
>             value = (value << 8) | (s->cur_offset < e->len ?
>                                     e->data[s->cur_offset++] :
>                                     0);
>         }
> 
> The rest looks good.
> 
> Thanks
> Laszlo
> 
> > +    }
> > +
> > +    trace_fw_cfg_read(s, value);
> > +    return value;
> > +}
> > +
> >  static uint8_t fw_cfg_read(FWCfgState *s)
> >  {
> >      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> > @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
> >      return ret;
> >  }
> >  
> > -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
> > -                                     unsigned size)
> > -{
> > -    FWCfgState *s = opaque;
> > -    uint64_t value = 0;
> > -    unsigned i;
> > -
> > -    for (i = 0; i < size; ++i) {
> > -        value = (value << 8) | fw_cfg_read(s);
> > -    }
> > -    return value;
> > -}
> > -
> >  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
> >                                    uint64_t value, unsigned size)
> >  {
> > @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
> >  };
> >  
> >  static const MemoryRegionOps fw_cfg_data_mem_ops = {
> > -    .read = fw_cfg_data_mem_read,
> > +    .read = fw_cfg_data_read,
> >      .write = fw_cfg_data_mem_write,
> >      .endianness = DEVICE_BIG_ENDIAN,
> >      .valid = {
> > diff --git a/trace-events b/trace-events
> > index 72136b9..5073040 100644
> > --- a/trace-events
> > +++ b/trace-events
> > @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
> >  
> >  # hw/nvram/fw_cfg.c
> >  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
> > -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
> > +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
> >  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
> >  
> >  # hw/block/hd-geometry.c
> > 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03 17:55     ` Gabriel L. Somlo
@ 2015-11-03 21:35       ` Laszlo Ersek
  2015-11-03 21:44         ` Gabriel L. Somlo
  2015-11-03 22:03         ` Gabriel L. Somlo
  0 siblings, 2 replies; 14+ messages in thread
From: Laszlo Ersek @ 2015-11-03 21:35 UTC (permalink / raw)
  To: Gabriel L. Somlo
  Cc: peter.maydell, jordan.l.justen, qemu-devel, kraxel, pbonzini, markmb

On 11/03/15 18:55, Gabriel L. Somlo wrote:
> On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
>> Thank you for splitting out this patch; it makes it easier to review.
>> However,
>>
>> On 11/03/15 01:35, Gabriel L. Somlo wrote:
>>> Introduce fw_cfg_data_read(), a generic read method which works
>>> on all access widths (1 through 8 bytes, inclusive), and can be
>>> used during both IOPort and MMIO read accesses.
>>>
>>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
>>> data read method) is replaced by this patch. The new method
>>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
>>> combo, but without unnecessarily repeating all the validity
>>> checks performed by the latter on each byte being read.
>>
>> this unwinding caused a bug to creep in.
>>
>> Namely, we have to identify the set of data that remains constant
>> between *all* "size" calls that fw_cfg_data_mem_read() makes to
>> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
>>
>> Specifically,
>>
>>> This patch also modifies the trace_fw_cfg_read prototype to
>>> accept a 64-bit value argument, allowing it to work properly
>>> with the new read method, but also remain backward compatible
>>> with existing call sites.
>>>
>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>> Cc: Marc Marí <markmb@redhat.com>
>>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
>>> ---
>>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
>>>  trace-events      |  2 +-
>>>  2 files changed, 20 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
>>> index c2d3a0a..8aa980c 100644
>>> --- a/hw/nvram/fw_cfg.c
>>> +++ b/hw/nvram/fw_cfg.c
>>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>>>      return ret;
>>>  }
>>>  
>>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>>> +{
>>> +    FWCfgState *s = opaque;
>>
>> This is good.
>>
>>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>
>> Okay too.
>>
>>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>
>> (1) Side point: the conversion here is faithful to the original code in
>> fw_cfg_read(), but even in the original code, the expression uses
>> "s->cur_entry" as a (masked) subscript *before* comparing it against
>> FW_CFG_INVALID. I don't think that's right.
>>
>> The same issue is present in fw_cfg_dma_transfer(). Care to write a
>> patch (before the restructuring) that fixes both?
>>
>> Note, I am aware that the expression in both of the above mentioned
>> functions only calculates the *address* of the nonexistent element
>> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
>>
>>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>
>> But it doesn't matter; it's undefined behavior just the same. Instead,
>> *both* locations should say:
>>
>>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>
>> (I share the blame for not noticing this earlier -- I too reviewed
>> fw_cfg_dma_transfer().)
>>
>> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
>> current address-of expression is not valid even for evaluation. Also, in
>> practice, dereferencing NULL would give us a nice (as in, non-garbage)
>> SIGSEGV.
> 
> Done.
> 
>>
>> Anyway, back to the topic at hand:
>>
>>> +    uint64_t value = 0;
>>> +
>>> +    assert(size <= sizeof(value));
>>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
>>
>> Right, good conversion. (Side note: this does protect against
>> *dereferencing* "e", but it's already too late, as far as undefined
>> behavior is concerned.)
>>
>>> +        while (size-- && s->cur_offset < e->len) {
>>> +            value = (value << 8) | e->data[s->cur_offset++];
>>> +        }
>>
>> (2) So, this is the bug. The pre-conversion code would keep shifting
>> "value" to the left until "size" was reached, regardless of the
>> underlying blob size, and just leave the least significant bytes zeroed
>> if the item ended too early. Whereas this loop *stops shifting* when the
>> blob ends.
> 
> D'OH!!! That should teach me to pay more attention -- thanks for
> catching it!
> 
>> Since the wide data register (which is big-endian) implements a
>> substring-preserving transfer (on top of QEMU's integer preserving
>> device r/w infrastructure), this change breaks the case when the
>> firmware reads, say, 8 bytes from the register in a single access, when
>> only 3 are left in the blob, and then uses only the three *lowest
>> address* bytes from the uint64_t value read. Although no known firmware
>> does this at the moment, it would be valid, and the above hunk would
>> break it.
>>
>> Hence please
>>
>> (2a) either append the missing "cumulative" shift after the loop:
>>
>>     while (size && s->cur_offset < e->len) {
>>         --size;
>>         value = (value << 8) | e->data[s->cur_offset++];
>>     }
>>     value <<= 8 * size;
> 
> I went with 2a. Also added a comment to make things painfully obvious
> to any potential future archaeologists:
> 
> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    FWCfgState *s = opaque;
> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> +    uint64_t value = 0;
> +
> +    assert(size <= sizeof(value));
> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> +        /* The least significant 'size' bytes of the return value are
> +         * expected to contain a string preserving portion of the item
> +         * data, padded with zeros to the right in case we run out early.

Please say "*on* the right" here, just like it reads below (emphasis
added only for review purposes).

Also, while the above seems correct, I prefer my own wording from commit
3c23402d4032:

  The solution is to compose the host-endian representation [...] of
  the big endian interpretation [...] of the fw_cfg string [...]

I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
;)) -- my preference could be harder to interpret for "future
archeologist". So I'll leave it to you whether to keep yours, pick mine,
or run with a mixture / union.

Cheers!
Laszlo

> +         */
> +        while (size && s->cur_offset < e->len) {
> +            value = (value << 8) | e->data[s->cur_offset++];
> +            size--;
> +        }
> +        /* If size is still not zero, we *did* run out early, so finish
> +         * left-shifting to add the appropriate number of padding zeros
> +         * on the right.
> +         */
> +        value <<= 8 * size;
> +    }
> +
> +    trace_fw_cfg_read(s, value);
> +    return value;
> +}
> 
> Version 4 should be out by the end of today.
> 
> Thanks again,
> --Gabriel
> 
>>
>> (2b) or move the offset check from the loop's controlling expression
>> into the value composition:
>>
>>         while (size--) {
>>             value = (value << 8) | (s->cur_offset < e->len ?
>>                                     e->data[s->cur_offset++] :
>>                                     0);
>>         }
>>
>> The rest looks good.
>>
>> Thanks
>> Laszlo
>>
>>> +    }
>>> +
>>> +    trace_fw_cfg_read(s, value);
>>> +    return value;
>>> +}
>>> +
>>>  static uint8_t fw_cfg_read(FWCfgState *s)
>>>  {
>>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>>>      return ret;
>>>  }
>>>  
>>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
>>> -                                     unsigned size)
>>> -{
>>> -    FWCfgState *s = opaque;
>>> -    uint64_t value = 0;
>>> -    unsigned i;
>>> -
>>> -    for (i = 0; i < size; ++i) {
>>> -        value = (value << 8) | fw_cfg_read(s);
>>> -    }
>>> -    return value;
>>> -}
>>> -
>>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
>>>                                    uint64_t value, unsigned size)
>>>  {
>>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
>>>  };
>>>  
>>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
>>> -    .read = fw_cfg_data_mem_read,
>>> +    .read = fw_cfg_data_read,
>>>      .write = fw_cfg_data_mem_write,
>>>      .endianness = DEVICE_BIG_ENDIAN,
>>>      .valid = {
>>> diff --git a/trace-events b/trace-events
>>> index 72136b9..5073040 100644
>>> --- a/trace-events
>>> +++ b/trace-events
>>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
>>>  
>>>  # hw/nvram/fw_cfg.c
>>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
>>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
>>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
>>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
>>>  
>>>  # hw/block/hd-geometry.c
>>>
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03 21:35       ` Laszlo Ersek
@ 2015-11-03 21:44         ` Gabriel L. Somlo
  2015-11-03 22:03         ` Gabriel L. Somlo
  1 sibling, 0 replies; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03 21:44 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: peter.maydell, jordan.l.justen, qemu-devel, kraxel, pbonzini, markmb

On Tue, Nov 03, 2015 at 10:35:36PM +0100, Laszlo Ersek wrote:
> On 11/03/15 18:55, Gabriel L. Somlo wrote:
> > On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
> >> Thank you for splitting out this patch; it makes it easier to review.
> >> However,
> >>
> >> On 11/03/15 01:35, Gabriel L. Somlo wrote:
> >>> Introduce fw_cfg_data_read(), a generic read method which works
> >>> on all access widths (1 through 8 bytes, inclusive), and can be
> >>> used during both IOPort and MMIO read accesses.
> >>>
> >>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
> >>> data read method) is replaced by this patch. The new method
> >>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
> >>> combo, but without unnecessarily repeating all the validity
> >>> checks performed by the latter on each byte being read.
> >>
> >> this unwinding caused a bug to creep in.
> >>
> >> Namely, we have to identify the set of data that remains constant
> >> between *all* "size" calls that fw_cfg_data_mem_read() makes to
> >> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
> >>
> >> Specifically,
> >>
> >>> This patch also modifies the trace_fw_cfg_read prototype to
> >>> accept a 64-bit value argument, allowing it to work properly
> >>> with the new read method, but also remain backward compatible
> >>> with existing call sites.
> >>>
> >>> Cc: Laszlo Ersek <lersek@redhat.com>
> >>> Cc: Gerd Hoffmann <kraxel@redhat.com>
> >>> Cc: Marc Marí <markmb@redhat.com>
> >>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> >>> ---
> >>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
> >>>  trace-events      |  2 +-
> >>>  2 files changed, 20 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> >>> index c2d3a0a..8aa980c 100644
> >>> --- a/hw/nvram/fw_cfg.c
> >>> +++ b/hw/nvram/fw_cfg.c
> >>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
> >>>      return ret;
> >>>  }
> >>>  
> >>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> >>> +{
> >>> +    FWCfgState *s = opaque;
> >>
> >> This is good.
> >>
> >>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> >>
> >> Okay too.
> >>
> >>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> (1) Side point: the conversion here is faithful to the original code in
> >> fw_cfg_read(), but even in the original code, the expression uses
> >> "s->cur_entry" as a (masked) subscript *before* comparing it against
> >> FW_CFG_INVALID. I don't think that's right.
> >>
> >> The same issue is present in fw_cfg_dma_transfer(). Care to write a
> >> patch (before the restructuring) that fixes both?
> >>
> >> Note, I am aware that the expression in both of the above mentioned
> >> functions only calculates the *address* of the nonexistent element
> >> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
> >>
> >>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> But it doesn't matter; it's undefined behavior just the same. Instead,
> >> *both* locations should say:
> >>
> >>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> >>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> (I share the blame for not noticing this earlier -- I too reviewed
> >> fw_cfg_dma_transfer().)
> >>
> >> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
> >> current address-of expression is not valid even for evaluation. Also, in
> >> practice, dereferencing NULL would give us a nice (as in, non-garbage)
> >> SIGSEGV.
> > 
> > Done.
> > 
> >>
> >> Anyway, back to the topic at hand:
> >>
> >>> +    uint64_t value = 0;
> >>> +
> >>> +    assert(size <= sizeof(value));
> >>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> >>
> >> Right, good conversion. (Side note: this does protect against
> >> *dereferencing* "e", but it's already too late, as far as undefined
> >> behavior is concerned.)
> >>
> >>> +        while (size-- && s->cur_offset < e->len) {
> >>> +            value = (value << 8) | e->data[s->cur_offset++];
> >>> +        }
> >>
> >> (2) So, this is the bug. The pre-conversion code would keep shifting
> >> "value" to the left until "size" was reached, regardless of the
> >> underlying blob size, and just leave the least significant bytes zeroed
> >> if the item ended too early. Whereas this loop *stops shifting* when the
> >> blob ends.
> > 
> > D'OH!!! That should teach me to pay more attention -- thanks for
> > catching it!
> > 
> >> Since the wide data register (which is big-endian) implements a
> >> substring-preserving transfer (on top of QEMU's integer preserving
> >> device r/w infrastructure), this change breaks the case when the
> >> firmware reads, say, 8 bytes from the register in a single access, when
> >> only 3 are left in the blob, and then uses only the three *lowest
> >> address* bytes from the uint64_t value read. Although no known firmware
> >> does this at the moment, it would be valid, and the above hunk would
> >> break it.
> >>
> >> Hence please
> >>
> >> (2a) either append the missing "cumulative" shift after the loop:
> >>
> >>     while (size && s->cur_offset < e->len) {
> >>         --size;
> >>         value = (value << 8) | e->data[s->cur_offset++];
> >>     }
> >>     value <<= 8 * size;
> > 
> > I went with 2a. Also added a comment to make things painfully obvious
> > to any potential future archaeologists:
> > 
> > +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> > +{
> > +    FWCfgState *s = opaque;
> > +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> > +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> > +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> > +    uint64_t value = 0;
> > +
> > +    assert(size <= sizeof(value));
> > +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> > +        /* The least significant 'size' bytes of the return value are
> > +         * expected to contain a string preserving portion of the item
> > +         * data, padded with zeros to the right in case we run out early.
> 
> Please say "*on* the right" here, just like it reads below (emphasis
> added only for review purposes).
> 
> Also, while the above seems correct, I prefer my own wording from commit
> 3c23402d4032:
> 
>   The solution is to compose the host-endian representation [...] of
>   the big endian interpretation [...] of the fw_cfg string [...]
> 
> I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
> ;)) -- my preference could be harder to interpret for "future
> archeologist". So I'll leave it to you whether to keep yours, pick mine,
> or run with a mixture / union.

Oops, I just fired off v4 literally a few seconds before this email
came in :)

I'll make the changes, and queue them for v5. I'll send that out in a
few days, or as soon as I get any more feedback on the series...

Thanks,
--Gabriel

> 
> Cheers!
> Laszlo
> 
> > +         */
> > +        while (size && s->cur_offset < e->len) {
> > +            value = (value << 8) | e->data[s->cur_offset++];
> > +            size--;
> > +        }
> > +        /* If size is still not zero, we *did* run out early, so finish
> > +         * left-shifting to add the appropriate number of padding zeros
> > +         * on the right.
> > +         */
> > +        value <<= 8 * size;
> > +    }
> > +
> > +    trace_fw_cfg_read(s, value);
> > +    return value;
> > +}
> > 
> > Version 4 should be out by the end of today.
> > 
> > Thanks again,
> > --Gabriel
> > 
> >>
> >> (2b) or move the offset check from the loop's controlling expression
> >> into the value composition:
> >>
> >>         while (size--) {
> >>             value = (value << 8) | (s->cur_offset < e->len ?
> >>                                     e->data[s->cur_offset++] :
> >>                                     0);
> >>         }
> >>
> >> The rest looks good.
> >>
> >> Thanks
> >> Laszlo
> >>
> >>> +    }
> >>> +
> >>> +    trace_fw_cfg_read(s, value);
> >>> +    return value;
> >>> +}
> >>> +
> >>>  static uint8_t fw_cfg_read(FWCfgState *s)
> >>>  {
> >>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> >>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
> >>>      return ret;
> >>>  }
> >>>  
> >>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
> >>> -                                     unsigned size)
> >>> -{
> >>> -    FWCfgState *s = opaque;
> >>> -    uint64_t value = 0;
> >>> -    unsigned i;
> >>> -
> >>> -    for (i = 0; i < size; ++i) {
> >>> -        value = (value << 8) | fw_cfg_read(s);
> >>> -    }
> >>> -    return value;
> >>> -}
> >>> -
> >>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
> >>>                                    uint64_t value, unsigned size)
> >>>  {
> >>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
> >>>  };
> >>>  
> >>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
> >>> -    .read = fw_cfg_data_mem_read,
> >>> +    .read = fw_cfg_data_read,
> >>>      .write = fw_cfg_data_mem_write,
> >>>      .endianness = DEVICE_BIG_ENDIAN,
> >>>      .valid = {
> >>> diff --git a/trace-events b/trace-events
> >>> index 72136b9..5073040 100644
> >>> --- a/trace-events
> >>> +++ b/trace-events
> >>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
> >>>  
> >>>  # hw/nvram/fw_cfg.c
> >>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
> >>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
> >>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
> >>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
> >>>  
> >>>  # hw/block/hd-geometry.c
> >>>
> >>
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03 21:35       ` Laszlo Ersek
  2015-11-03 21:44         ` Gabriel L. Somlo
@ 2015-11-03 22:03         ` Gabriel L. Somlo
  2015-11-04 13:01           ` Laszlo Ersek
  1 sibling, 1 reply; 14+ messages in thread
From: Gabriel L. Somlo @ 2015-11-03 22:03 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: peter.maydell, jordan.l.justen, qemu-devel, kraxel, pbonzini, markmb

On Tue, Nov 03, 2015 at 10:35:36PM +0100, Laszlo Ersek wrote:
> On 11/03/15 18:55, Gabriel L. Somlo wrote:
> > On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
> >> Thank you for splitting out this patch; it makes it easier to review.
> >> However,
> >>
> >> On 11/03/15 01:35, Gabriel L. Somlo wrote:
> >>> Introduce fw_cfg_data_read(), a generic read method which works
> >>> on all access widths (1 through 8 bytes, inclusive), and can be
> >>> used during both IOPort and MMIO read accesses.
> >>>
> >>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
> >>> data read method) is replaced by this patch. The new method
> >>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
> >>> combo, but without unnecessarily repeating all the validity
> >>> checks performed by the latter on each byte being read.
> >>
> >> this unwinding caused a bug to creep in.
> >>
> >> Namely, we have to identify the set of data that remains constant
> >> between *all* "size" calls that fw_cfg_data_mem_read() makes to
> >> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
> >>
> >> Specifically,
> >>
> >>> This patch also modifies the trace_fw_cfg_read prototype to
> >>> accept a 64-bit value argument, allowing it to work properly
> >>> with the new read method, but also remain backward compatible
> >>> with existing call sites.
> >>>
> >>> Cc: Laszlo Ersek <lersek@redhat.com>
> >>> Cc: Gerd Hoffmann <kraxel@redhat.com>
> >>> Cc: Marc Marí <markmb@redhat.com>
> >>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
> >>> ---
> >>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
> >>>  trace-events      |  2 +-
> >>>  2 files changed, 20 insertions(+), 15 deletions(-)
> >>>
> >>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> >>> index c2d3a0a..8aa980c 100644
> >>> --- a/hw/nvram/fw_cfg.c
> >>> +++ b/hw/nvram/fw_cfg.c
> >>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
> >>>      return ret;
> >>>  }
> >>>  
> >>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> >>> +{
> >>> +    FWCfgState *s = opaque;
> >>
> >> This is good.
> >>
> >>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> >>
> >> Okay too.
> >>
> >>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> (1) Side point: the conversion here is faithful to the original code in
> >> fw_cfg_read(), but even in the original code, the expression uses
> >> "s->cur_entry" as a (masked) subscript *before* comparing it against
> >> FW_CFG_INVALID. I don't think that's right.
> >>
> >> The same issue is present in fw_cfg_dma_transfer(). Care to write a
> >> patch (before the restructuring) that fixes both?
> >>
> >> Note, I am aware that the expression in both of the above mentioned
> >> functions only calculates the *address* of the nonexistent element
> >> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
> >>
> >>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> But it doesn't matter; it's undefined behavior just the same. Instead,
> >> *both* locations should say:
> >>
> >>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> >>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> >>
> >> (I share the blame for not noticing this earlier -- I too reviewed
> >> fw_cfg_dma_transfer().)
> >>
> >> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
> >> current address-of expression is not valid even for evaluation. Also, in
> >> practice, dereferencing NULL would give us a nice (as in, non-garbage)
> >> SIGSEGV.
> > 
> > Done.
> > 
> >>
> >> Anyway, back to the topic at hand:
> >>
> >>> +    uint64_t value = 0;
> >>> +
> >>> +    assert(size <= sizeof(value));
> >>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> >>
> >> Right, good conversion. (Side note: this does protect against
> >> *dereferencing* "e", but it's already too late, as far as undefined
> >> behavior is concerned.)
> >>
> >>> +        while (size-- && s->cur_offset < e->len) {
> >>> +            value = (value << 8) | e->data[s->cur_offset++];
> >>> +        }
> >>
> >> (2) So, this is the bug. The pre-conversion code would keep shifting
> >> "value" to the left until "size" was reached, regardless of the
> >> underlying blob size, and just leave the least significant bytes zeroed
> >> if the item ended too early. Whereas this loop *stops shifting* when the
> >> blob ends.
> > 
> > D'OH!!! That should teach me to pay more attention -- thanks for
> > catching it!
> > 
> >> Since the wide data register (which is big-endian) implements a
> >> substring-preserving transfer (on top of QEMU's integer preserving
> >> device r/w infrastructure), this change breaks the case when the
> >> firmware reads, say, 8 bytes from the register in a single access, when
> >> only 3 are left in the blob, and then uses only the three *lowest
> >> address* bytes from the uint64_t value read. Although no known firmware
> >> does this at the moment, it would be valid, and the above hunk would
> >> break it.
> >>
> >> Hence please
> >>
> >> (2a) either append the missing "cumulative" shift after the loop:
> >>
> >>     while (size && s->cur_offset < e->len) {
> >>         --size;
> >>         value = (value << 8) | e->data[s->cur_offset++];
> >>     }
> >>     value <<= 8 * size;
> > 
> > I went with 2a. Also added a comment to make things painfully obvious
> > to any potential future archaeologists:
> > 
> > +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
> > +{
> > +    FWCfgState *s = opaque;
> > +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> > +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
> > +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> > +    uint64_t value = 0;
> > +
> > +    assert(size <= sizeof(value));
> > +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
> > +        /* The least significant 'size' bytes of the return value are
> > +         * expected to contain a string preserving portion of the item
> > +         * data, padded with zeros to the right in case we run out early.
> 
> Please say "*on* the right" here, just like it reads below (emphasis
> added only for review purposes).

Done.

> Also, while the above seems correct, I prefer my own wording from commit
> 3c23402d4032:
> 
>   The solution is to compose the host-endian representation [...] of
>   the big endian interpretation [...] of the fw_cfg string [...]
> 
> I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
> ;)) -- my preference could be harder to interpret for "future
> archeologist". So I'll leave it to you whether to keep yours, pick mine,
> or run with a mixture / union.

You mean commit  36b62ae, I think :) I'm going to go with a "union",
since the "string preserving" verbiage is also in use (by our mutual
agreement) in docs/specs/fw_cfg.txt :)

So that comment will read:

        /* The least significant 'size' bytes of the return value are
         * expected to contain a string preserving portion of the item
         * data, padded with zeros on the right in case we run out early.
         * In technical terms, we're composing the host-endian representation
         * of the big endian interpretation of the fw_cfg string.
         */

... when I'll send out v5.

Thanks,
--Gabriel

> > +         */
> > +        while (size && s->cur_offset < e->len) {
> > +            value = (value << 8) | e->data[s->cur_offset++];
> > +            size--;
> > +        }
> > +        /* If size is still not zero, we *did* run out early, so finish
> > +         * left-shifting to add the appropriate number of padding zeros
> > +         * on the right.
> > +         */
> > +        value <<= 8 * size;
> > +    }
> > +
> > +    trace_fw_cfg_read(s, value);
> > +    return value;
> > +}
> > 
> > Version 4 should be out by the end of today.
> > 
> > Thanks again,
> > --Gabriel
> > 
> >>
> >> (2b) or move the offset check from the loop's controlling expression
> >> into the value composition:
> >>
> >>         while (size--) {
> >>             value = (value << 8) | (s->cur_offset < e->len ?
> >>                                     e->data[s->cur_offset++] :
> >>                                     0);
> >>         }
> >>
> >> The rest looks good.
> >>
> >> Thanks
> >> Laszlo
> >>
> >>> +    }
> >>> +
> >>> +    trace_fw_cfg_read(s, value);
> >>> +    return value;
> >>> +}
> >>> +
> >>>  static uint8_t fw_cfg_read(FWCfgState *s)
> >>>  {
> >>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> >>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
> >>>      return ret;
> >>>  }
> >>>  
> >>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
> >>> -                                     unsigned size)
> >>> -{
> >>> -    FWCfgState *s = opaque;
> >>> -    uint64_t value = 0;
> >>> -    unsigned i;
> >>> -
> >>> -    for (i = 0; i < size; ++i) {
> >>> -        value = (value << 8) | fw_cfg_read(s);
> >>> -    }
> >>> -    return value;
> >>> -}
> >>> -
> >>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
> >>>                                    uint64_t value, unsigned size)
> >>>  {
> >>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
> >>>  };
> >>>  
> >>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
> >>> -    .read = fw_cfg_data_mem_read,
> >>> +    .read = fw_cfg_data_read,
> >>>      .write = fw_cfg_data_mem_write,
> >>>      .endianness = DEVICE_BIG_ENDIAN,
> >>>      .valid = {
> >>> diff --git a/trace-events b/trace-events
> >>> index 72136b9..5073040 100644
> >>> --- a/trace-events
> >>> +++ b/trace-events
> >>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
> >>>  
> >>>  # hw/nvram/fw_cfg.c
> >>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
> >>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
> >>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
> >>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
> >>>  
> >>>  # hw/block/hd-geometry.c
> >>>
> >>
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method
  2015-11-03 22:03         ` Gabriel L. Somlo
@ 2015-11-04 13:01           ` Laszlo Ersek
  0 siblings, 0 replies; 14+ messages in thread
From: Laszlo Ersek @ 2015-11-04 13:01 UTC (permalink / raw)
  To: Gabriel L. Somlo
  Cc: peter.maydell, jordan.l.justen, qemu-devel, kraxel, pbonzini, markmb

On 11/03/15 23:03, Gabriel L. Somlo wrote:
> On Tue, Nov 03, 2015 at 10:35:36PM +0100, Laszlo Ersek wrote:
>> On 11/03/15 18:55, Gabriel L. Somlo wrote:
>>> On Tue, Nov 03, 2015 at 11:53:55AM +0100, Laszlo Ersek wrote:
>>>> Thank you for splitting out this patch; it makes it easier to review.
>>>> However,
>>>>
>>>> On 11/03/15 01:35, Gabriel L. Somlo wrote:
>>>>> Introduce fw_cfg_data_read(), a generic read method which works
>>>>> on all access widths (1 through 8 bytes, inclusive), and can be
>>>>> used during both IOPort and MMIO read accesses.
>>>>>
>>>>> To maintain legibility, only fw_cfg_data_mem_read() (the MMIO
>>>>> data read method) is replaced by this patch. The new method
>>>>> essentially unwinds the fw_cfg_data_mem_read() + fw_cfg_read()
>>>>> combo, but without unnecessarily repeating all the validity
>>>>> checks performed by the latter on each byte being read.
>>>>
>>>> this unwinding caused a bug to creep in.
>>>>
>>>> Namely, we have to identify the set of data that remains constant
>>>> between *all* "size" calls that fw_cfg_data_mem_read() makes to
>>>> fw_cfg_read(), and hoist / eliminate the checks on those *only*.
>>>>
>>>> Specifically,
>>>>
>>>>> This patch also modifies the trace_fw_cfg_read prototype to
>>>>> accept a 64-bit value argument, allowing it to work properly
>>>>> with the new read method, but also remain backward compatible
>>>>> with existing call sites.
>>>>>
>>>>> Cc: Laszlo Ersek <lersek@redhat.com>
>>>>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>>>>> Cc: Marc Marí <markmb@redhat.com>
>>>>> Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
>>>>> ---
>>>>>  hw/nvram/fw_cfg.c | 33 +++++++++++++++++++--------------
>>>>>  trace-events      |  2 +-
>>>>>  2 files changed, 20 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
>>>>> index c2d3a0a..8aa980c 100644
>>>>> --- a/hw/nvram/fw_cfg.c
>>>>> +++ b/hw/nvram/fw_cfg.c
>>>>> @@ -274,6 +274,24 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
>>>>>      return ret;
>>>>>  }
>>>>>  
>>>>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>>>>> +{
>>>>> +    FWCfgState *s = opaque;
>>>>
>>>> This is good.
>>>>
>>>>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>>>
>>>> Okay too.
>>>>
>>>>> +    FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>>>
>>>> (1) Side point: the conversion here is faithful to the original code in
>>>> fw_cfg_read(), but even in the original code, the expression uses
>>>> "s->cur_entry" as a (masked) subscript *before* comparing it against
>>>> FW_CFG_INVALID. I don't think that's right.
>>>>
>>>> The same issue is present in fw_cfg_dma_transfer(). Care to write a
>>>> patch (before the restructuring) that fixes both?
>>>>
>>>> Note, I am aware that the expression in both of the above mentioned
>>>> functions only calculates the *address* of the nonexistent element
>>>> belonging to (FW_CFG_INVALID & FW_CFG_ENTRY_MASK) == 0x3FFF:
>>>>
>>>>   e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>>>
>>>> But it doesn't matter; it's undefined behavior just the same. Instead,
>>>> *both* locations should say:
>>>>
>>>>  e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>>>>      &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>>>
>>>> (I share the blame for not noticing this earlier -- I too reviewed
>>>> fw_cfg_dma_transfer().)
>>>>
>>>> NULL is a valid pointer to *evaluate* (not to dereference), whereas the
>>>> current address-of expression is not valid even for evaluation. Also, in
>>>> practice, dereferencing NULL would give us a nice (as in, non-garbage)
>>>> SIGSEGV.
>>>
>>> Done.
>>>
>>>>
>>>> Anyway, back to the topic at hand:
>>>>
>>>>> +    uint64_t value = 0;
>>>>> +
>>>>> +    assert(size <= sizeof(value));
>>>>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
>>>>
>>>> Right, good conversion. (Side note: this does protect against
>>>> *dereferencing* "e", but it's already too late, as far as undefined
>>>> behavior is concerned.)
>>>>
>>>>> +        while (size-- && s->cur_offset < e->len) {
>>>>> +            value = (value << 8) | e->data[s->cur_offset++];
>>>>> +        }
>>>>
>>>> (2) So, this is the bug. The pre-conversion code would keep shifting
>>>> "value" to the left until "size" was reached, regardless of the
>>>> underlying blob size, and just leave the least significant bytes zeroed
>>>> if the item ended too early. Whereas this loop *stops shifting* when the
>>>> blob ends.
>>>
>>> D'OH!!! That should teach me to pay more attention -- thanks for
>>> catching it!
>>>
>>>> Since the wide data register (which is big-endian) implements a
>>>> substring-preserving transfer (on top of QEMU's integer preserving
>>>> device r/w infrastructure), this change breaks the case when the
>>>> firmware reads, say, 8 bytes from the register in a single access, when
>>>> only 3 are left in the blob, and then uses only the three *lowest
>>>> address* bytes from the uint64_t value read. Although no known firmware
>>>> does this at the moment, it would be valid, and the above hunk would
>>>> break it.
>>>>
>>>> Hence please
>>>>
>>>> (2a) either append the missing "cumulative" shift after the loop:
>>>>
>>>>     while (size && s->cur_offset < e->len) {
>>>>         --size;
>>>>         value = (value << 8) | e->data[s->cur_offset++];
>>>>     }
>>>>     value <<= 8 * size;
>>>
>>> I went with 2a. Also added a comment to make things painfully obvious
>>> to any potential future archaeologists:
>>>
>>> +static uint64_t fw_cfg_data_read(void *opaque, hwaddr addr, unsigned size)
>>> +{
>>> +    FWCfgState *s = opaque;
>>> +    int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>> +    FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
>>> +                    &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
>>> +    uint64_t value = 0;
>>> +
>>> +    assert(size <= sizeof(value));
>>> +    if (s->cur_entry != FW_CFG_INVALID && e->data) {
>>> +        /* The least significant 'size' bytes of the return value are
>>> +         * expected to contain a string preserving portion of the item
>>> +         * data, padded with zeros to the right in case we run out early.
>>
>> Please say "*on* the right" here, just like it reads below (emphasis
>> added only for review purposes).
> 
> Done.
> 
>> Also, while the above seems correct, I prefer my own wording from commit
>> 3c23402d4032:
>>
>>   The solution is to compose the host-endian representation [...] of
>>   the big endian interpretation [...] of the fw_cfg string [...]
>>
>> I'm admittedly biased (I have deep scars that read "FW CFG" if I squint
>> ;)) -- my preference could be harder to interpret for "future
>> archeologist". So I'll leave it to you whether to keep yours, pick mine,
>> or run with a mixture / union.
> 
> You mean commit  36b62ae, I think :)

Sigh, yes. Sorry. Must have been multi-tasking too heavily.

> I'm going to go with a "union",
> since the "string preserving" verbiage is also in use (by our mutual
> agreement) in docs/specs/fw_cfg.txt :)
> 
> So that comment will read:
> 
>         /* The least significant 'size' bytes of the return value are
>          * expected to contain a string preserving portion of the item
>          * data, padded with zeros on the right in case we run out early.
>          * In technical terms, we're composing the host-endian representation
>          * of the big endian interpretation of the fw_cfg string.
>          */

Sounds great, thanks!
Laszlo

> 
> ... when I'll send out v5.
> 
> Thanks,
> --Gabriel
> 
>>> +         */
>>> +        while (size && s->cur_offset < e->len) {
>>> +            value = (value << 8) | e->data[s->cur_offset++];
>>> +            size--;
>>> +        }
>>> +        /* If size is still not zero, we *did* run out early, so finish
>>> +         * left-shifting to add the appropriate number of padding zeros
>>> +         * on the right.
>>> +         */
>>> +        value <<= 8 * size;
>>> +    }
>>> +
>>> +    trace_fw_cfg_read(s, value);
>>> +    return value;
>>> +}
>>>
>>> Version 4 should be out by the end of today.
>>>
>>> Thanks again,
>>> --Gabriel
>>>
>>>>
>>>> (2b) or move the offset check from the loop's controlling expression
>>>> into the value composition:
>>>>
>>>>         while (size--) {
>>>>             value = (value << 8) | (s->cur_offset < e->len ?
>>>>                                     e->data[s->cur_offset++] :
>>>>                                     0);
>>>>         }
>>>>
>>>> The rest looks good.
>>>>
>>>> Thanks
>>>> Laszlo
>>>>
>>>>> +    }
>>>>> +
>>>>> +    trace_fw_cfg_read(s, value);
>>>>> +    return value;
>>>>> +}
>>>>> +
>>>>>  static uint8_t fw_cfg_read(FWCfgState *s)
>>>>>  {
>>>>>      int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
>>>>> @@ -290,19 +308,6 @@ static uint8_t fw_cfg_read(FWCfgState *s)
>>>>>      return ret;
>>>>>  }
>>>>>  
>>>>> -static uint64_t fw_cfg_data_mem_read(void *opaque, hwaddr addr,
>>>>> -                                     unsigned size)
>>>>> -{
>>>>> -    FWCfgState *s = opaque;
>>>>> -    uint64_t value = 0;
>>>>> -    unsigned i;
>>>>> -
>>>>> -    for (i = 0; i < size; ++i) {
>>>>> -        value = (value << 8) | fw_cfg_read(s);
>>>>> -    }
>>>>> -    return value;
>>>>> -}
>>>>> -
>>>>>  static void fw_cfg_data_mem_write(void *opaque, hwaddr addr,
>>>>>                                    uint64_t value, unsigned size)
>>>>>  {
>>>>> @@ -483,7 +488,7 @@ static const MemoryRegionOps fw_cfg_ctl_mem_ops = {
>>>>>  };
>>>>>  
>>>>>  static const MemoryRegionOps fw_cfg_data_mem_ops = {
>>>>> -    .read = fw_cfg_data_mem_read,
>>>>> +    .read = fw_cfg_data_read,
>>>>>      .write = fw_cfg_data_mem_write,
>>>>>      .endianness = DEVICE_BIG_ENDIAN,
>>>>>      .valid = {
>>>>> diff --git a/trace-events b/trace-events
>>>>> index 72136b9..5073040 100644
>>>>> --- a/trace-events
>>>>> +++ b/trace-events
>>>>> @@ -196,7 +196,7 @@ ecc_diag_mem_readb(uint64_t addr, uint32_t ret) "Read diagnostic %"PRId64"= %02x
>>>>>  
>>>>>  # hw/nvram/fw_cfg.c
>>>>>  fw_cfg_select(void *s, uint16_t key, int ret) "%p key %d = %d"
>>>>> -fw_cfg_read(void *s, uint8_t ret) "%p = %d"
>>>>> +fw_cfg_read(void *s, uint64_t ret) "%p = %"PRIx64
>>>>>  fw_cfg_add_file(void *s, int index, char *name, size_t len) "%p #%d: %s (%zd bytes)"
>>>>>  
>>>>>  # hw/block/hd-geometry.c
>>>>>
>>>>
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-11-04 13:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-03  0:35 [Qemu-devel] [PATCH v3 0/5] fw_cfg: spec update, misc. cleanup, optimize read Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 1/5] fw_cfg: move internal function call docs to header file Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 2/5] fw_cfg: amend callback behavior spec to once per select Gabriel L. Somlo
2015-11-03  9:51   ` Laszlo Ersek
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 3/5] fw_cfg: remove offset argument from callback prototype Gabriel L. Somlo
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 4/5] fw_cfg: add generic non-DMA read method Gabriel L. Somlo
2015-11-03 10:53   ` Laszlo Ersek
2015-11-03 17:55     ` Gabriel L. Somlo
2015-11-03 21:35       ` Laszlo Ersek
2015-11-03 21:44         ` Gabriel L. Somlo
2015-11-03 22:03         ` Gabriel L. Somlo
2015-11-04 13:01           ` Laszlo Ersek
2015-11-03  0:35 ` [Qemu-devel] [PATCH v3 5/5] fw_cfg: replace ioport data read with generic method Gabriel L. Somlo
2015-11-03 11:11   ` Laszlo Ersek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.