* [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
@ 2021-06-30 19:07 Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 01/10] ACPI ERST: bios-tables-test.c steps 1 and 2 Eric DeVolder
                   ` (11 more replies)
  0 siblings, 12 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

=============================
I believe I have addressed all feedback on v4; responses to
specific points of feedback are below.

In patch 1/6, Igor asks:
"you are adding empty template files here
but the later matching bios-tables-test is nowhere to be found
Was testcase lost somewhere along the way?

also it seems you add ERST only to pc/q35,
so why tests/data/acpi/microvm/ERST is here?"

I did miss setting up microvm. That has been corrected.

As for the question about lost test cases, if you are referring
to the new binary blobs for pc and q35, those were in patch
6/6. There is a qtest in patch 5/6. If I have misunderstood the
question, please let me know.


In patch 3/6, Igor asks:
"Also spec (ERST) is rather (maybe intentionally) vague on specifics,
so it would be better that before a patch that implements hw part
were a doc patch describing concrete implementation. As model
you can use docs/specs/acpi_hest_ghes.rst or other docs/specs/acpi_* files.
I'd start posting/discussing that spec within these thread
to avoid spamming list until doc is settled up."

I'm thinking that this cover letter is the bulk of the spec? But as
you say, to avoid spamming the group, we can use this thread to make
suggested changes to this cover letter which I will then convert
into a spec, for v6.


In patch 3/6, in many places Igor mentions utilizing the hostmem
mapped directly in the guest in order to avoid needless copying.

It is true that the ERST has an "NVRAM" mode that would allow for
all the simplifications Igor points out; however, Linux does not
support this mode. This mode puts the burden of managing the NVRAM
space on the OS. So this implementation, like BIOS, uses the
non-NVRAM mode.

I did go ahead and separate the registers from the exchange buffer,
which would facilitate the support of NVRAM mode.

 linux/drivers/acpi/apei/erst.c:
 /* NVRAM ERST Error Log Address Range is not supported yet */
 static void pr_unimpl_nvram(void)
 {
    if (printk_ratelimit())
        pr_warn("NVRAM ERST Log Address Range not implemented yet.\n");
 }

 static int __erst_write_to_nvram(const struct cper_record_header *record)
 {
    /* do not print message, because printk is not safe for NMI */
    return -ENOSYS;
 }

 static int __erst_read_to_erange_from_nvram(u64 record_id, u64 *offset)
 {
    pr_unimpl_nvram();
    return -ENOSYS;
 }

 static int __erst_clear_from_nvram(u64 record_id)
 {
    pr_unimpl_nvram();
    return -ENOSYS;
 }

=============================

This patchset introduces support for the ACPI Error Record
Serialization Table, ERST.

For background and implementation information, please see
docs/specs/acpi_erst.txt, which is patch 2/10.

Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>

---
v5: 30jun2021
 - Create docs/specs/acpi_erst.txt, per Igor
 - Separate PCI BARs for registers and memory, per Igor
 - Convert debugging to use trace infrastructure, per Igor
 - Various other fixups, per Igor

v4: 11jun2021
 - Converted to a PCI device, per Igor.
 - Updated qtest.
 - Rearranged patches, per Igor.

v3: 28may2021
 - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
   internal array with explicit file operations, per Igor.
 - Changed the way the qdev and base address are handled, allowing
   ERST to be disabled at run-time. Also aligns better with other
   existing code.

v2: 8feb2021
 - Added qtest/smoke test per Paolo Bonzini
 - Split patch into smaller chunks, per Igor Mammedov
 - Did away with use of ACPI packed structures, per Igor Mammedov

v1: 26oct2020
 - initial post

---

Eric DeVolder (10):
  ACPI ERST: bios-tables-test.c steps 1 and 2
  ACPI ERST: specification for ERST support
  ACPI ERST: PCI device_id for ERST
  ACPI ERST: header file for ERST
  ACPI ERST: support for ACPI ERST feature
  ACPI ERST: build the ACPI ERST table
  ACPI ERST: trace support
  ACPI ERST: create ACPI ERST table for pc/x86 machines.
  ACPI ERST: qtest for ERST
  ACPI ERST: step 6 of bios-tables-test.c

 docs/specs/acpi_erst.txt     | 152 +++++++
 hw/acpi/erst.c               | 918 +++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build          |   1 +
 hw/acpi/trace-events         |  14 +
 hw/i386/acpi-build.c         |   9 +
 hw/i386/acpi-microvm.c       |   9 +
 include/hw/acpi/erst.h       |  84 ++++
 include/hw/pci/pci.h         |   1 +
 tests/data/acpi/microvm/ERST | Bin 0 -> 976 bytes
 tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
 tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
 tests/qtest/erst-test.c      | 129 ++++++
 tests/qtest/meson.build      |   2 +
 13 files changed, 1319 insertions(+)
 create mode 100644 docs/specs/acpi_erst.txt
 create mode 100644 hw/acpi/erst.c
 create mode 100644 include/hw/acpi/erst.h
 create mode 100644 tests/data/acpi/microvm/ERST
 create mode 100644 tests/data/acpi/pc/ERST
 create mode 100644 tests/data/acpi/q35/ERST
 create mode 100644 tests/qtest/erst-test.c

-- 
1.8.3.1




* [PATCH v5 01/10] ACPI ERST: bios-tables-test.c steps 1 and 2
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 02/10] ACPI ERST: specification for ERST support Eric DeVolder
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

Following the guidelines in tests/qtest/bios-tables-test.c, this
change adds empty placeholder files per step 1 for the new ERST
table, and excludes resulting changed files in bios-tables-test-allowed-diff.h
per step 2.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 tests/data/acpi/microvm/ERST                | 0
 tests/data/acpi/pc/ERST                     | 0
 tests/data/acpi/q35/ERST                    | 0
 tests/qtest/bios-tables-test-allowed-diff.h | 4 ++++
 4 files changed, 4 insertions(+)
 create mode 100644 tests/data/acpi/microvm/ERST
 create mode 100644 tests/data/acpi/pc/ERST
 create mode 100644 tests/data/acpi/q35/ERST

diff --git a/tests/data/acpi/microvm/ERST b/tests/data/acpi/microvm/ERST
new file mode 100644
index 0000000..e69de29
diff --git a/tests/data/acpi/pc/ERST b/tests/data/acpi/pc/ERST
new file mode 100644
index 0000000..e69de29
diff --git a/tests/data/acpi/q35/ERST b/tests/data/acpi/q35/ERST
new file mode 100644
index 0000000..e69de29
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523..e004c71 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,5 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/pc/ERST",
+"tests/data/acpi/q35/ERST",
+"tests/data/acpi/microvm/ERST",
+
-- 
1.8.3.1




* [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 01/10] ACPI ERST: bios-tables-test.c steps 1 and 2 Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-06-30 19:26   ` Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST Eric DeVolder
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

This change adds a document describing the implementation of the
ACPI ERST support.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)
 create mode 100644 docs/specs/acpi_erst.txt

diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
new file mode 100644
index 0000000..79f8eb9
--- /dev/null
+++ b/docs/specs/acpi_erst.txt
@@ -0,0 +1,152 @@
+ACPI ERST DEVICE
+================
+
+The ACPI ERST device is utilized to support the ACPI Error Record
+Serialization Table, ERST, functionality. The functionality is
+designed for storing error records in persistent storage for
+future reference/debugging.
+
+The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
+(APEI)", and specifically subsection "Error Serialization", outlines
+a method for storing error records into persistent storage.
+
+The format of error records is described in the UEFI specification[2],
+in Appendix N "Common Platform Error Record".
+
+While the ACPI specification allows for an NVRAM "mode" (see
+GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
+exposed for direct access by the OS/guest, this implements
+the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
+by most BIOS (since flash memory requires programming operations
+in order to update its contents). Furthermore, as of the time of this
+writing, Linux does not support the NVRAM "mode".
+
+
+Background/Motivation
+---------------------
+Linux uses the persistent storage filesystem, pstore, to record
+information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
+independent of, and runs before, kdump.  In certain scenarios (ie.
+hosts/guests with root filesystems on NFS/iSCSI where networking
+software and/or hardware fails), pstore may contain the only
+information available for post-mortem debugging.
+
+Two common storage backends for the pstore filesystem are ACPI ERST
+and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
+all guests. With QEMU supporting ACPI ERST, it becomes a viable
+pstore storage backend for virtual machines (as it is now for
+bare metal machines).
+
+Enabling support for ACPI ERST facilitates a consistent method to
+capture kernel panic information in a wide range of guests: from
+resource-constrained microvms to very large guests, and in
+particular, in direct-boot environments (which would lack UEFI
+run-time services).
+
+Note that Microsoft Windows also utilizes the ACPI ERST for certain
+crash information, if available.
+
+
+Invocation
+----------
+
+To utilize ACPI ERST, a memory-backend-file object and acpi-erst
+device must be created, for example:
+
+ qemu ...
+ -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
+  size=0x10000,share=on
+ -device acpi-erst,memdev=erstnvram
+
+For proper operation, the ACPI ERST device needs a memory-backend-file
+object with the following parameters:
+
+ - id: The id of the memory-backend-file object is used to associate
+   this memory with the acpi-erst device.
+ - size: The size of the ACPI ERST backing storage. This parameter is
+   required.
+ - mem-path: The location of the ACPI ERST backing storage file. This
+   parameter is also required.
+ - share: The share=on parameter is required so that updates to the
+   ERST back store are written to the file immediately as well. Without
+   it, updates to the backing file are unpredictable and may not
+   properly persist (eg. if qemu should crash).
+
+The ACPI ERST device is a simple PCI device, and requires this one
+parameter:
+
+ - memdev: Is the object id of the memory-backend-file.
+
+
+PCI Interface
+-------------
+
+The ERST device is a PCI device with two BARs, one for accessing
+the programming registers, and the other for accessing the
+record exchange buffer.
+
+BAR0 contains the programming interface consisting of just two
+64-bit registers. The two registers are an ACTION (cmd) and a
+VALUE (data). All ERST actions/operations/side effects happen
+on the write to the ACTION, by design. Thus any data needed
+by the action must be placed into VALUE prior to writing
+ACTION. Reading the VALUE simply returns the register contents,
+which can be updated by a previous ACTION. This behavior is
+encoded in the ACPI ERST table generated by QEMU.
+
+BAR1 contains the record exchange buffer, and the size of this
+buffer sets the maximum record size. This record exchange
+buffer size is 8KiB.
+
+Backing File
+------------
+
+The ACPI ERST persistent storage is contained within a single backing
+file. The size and location of the backing file are specified upon
+QEMU startup of the ACPI ERST device.
+
+Records are stored in the backing file in a simple fashion.
+The backing file is essentially divided into fixed size
+"slots", ERST_RECORD_SIZE in length, with each "slot"
+storing a single record. No attempt is made to optimize
+storage through compression, compaction, etc.
+NOTE that any change to ERST_RECORD_SIZE will make any
+pre-existing backing files, not of the same ERST_RECORD_SIZE,
+unusable to the guest.
+
+Below is an example layout of the backing store file.
+The size of the file is a multiple of ERST_RECORD_SIZE,
+and contains N number of "slots" to store records. The
+example below shows two records (in CPER format) in the
+backing file, while the remaining slots are empty/
+available.
+
+ Slot   Record
+        +--------------------------------------------+
+    0   | empty/available                            |
+        +--------------------------------------------+
+    1   | CPER                                       |
+        +--------------------------------------------+
+    2   | CPER                                       |
+        +--------------------------------------------+
+  ...   |                                            |
+        +--------------------------------------------+
+    N   | empty/available                            |
+        +--------------------------------------------+
+        <-------------- ERST_RECORD_SIZE ------------>
+
+Not all slots need to be occupied, and they need not be
+occupied in a contiguous fashion. The ability to clear/erase
+specific records allows for the formation of unoccupied
+slots.
+
+
+References
+----------
+
+[1] "Advanced Configuration and Power Interface Specification",
+    version 4.0, June 2009.
+
+[2] "Unified Extensible Firmware Interface Specification",
+    version 2.1, October 2008.
+
-- 
1.8.3.1
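
Illustrative note on the backing-file layout described in
docs/specs/acpi_erst.txt above: because every slot has the same size,
locating a record is plain offset arithmetic. The following is a minimal
standalone sketch, not code from this series. SLOT_SIZE mirrors the 8KiB
ERST_RECORD_SIZE; slot_offset() and count_records() are hypothetical
helper names, and treating a slot as occupied when it begins with the
"CPER" signature is an assumption based on the layout diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: mirrors the 8KiB ERST_RECORD_SIZE used by the series. */
#define SLOT_SIZE 8192u

/*
 * Compute the byte offset of slot 'index'. Returns false when the whole
 * slot does not fit inside a backing store of 'store_size' bytes.
 */
static bool slot_offset(size_t store_size, unsigned index, uint64_t *offset)
{
    uint64_t off = (uint64_t)index * SLOT_SIZE;

    if (off > store_size || store_size - off < SLOT_SIZE) {
        return false;
    }
    *offset = off;
    return true;
}

/*
 * Count occupied slots. A slot is treated as occupied when it starts
 * with the "CPER" signature (an assumption made for this sketch).
 */
static unsigned count_records(const uint8_t *store, size_t store_size)
{
    unsigned count = 0;
    uint64_t off;

    assert(store != NULL);
    for (unsigned i = 0; slot_offset(store_size, i, &off); i++) {
        if (memcmp(store + off, "CPER", 4) == 0) {
            count++;
        }
    }
    return count;
}
```

With a 32KiB store holding records in slots 1 and 2 (as in the layout
diagram), count_records() reports 2, and slot_offset() rejects index 4
as out of range.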




* [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 01/10] ACPI ERST: bios-tables-test.c steps 1 and 2 Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 02/10] ACPI ERST: specification for ERST support Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-19 15:06   ` Igor Mammedov
  2021-06-30 19:07 ` [PATCH v5 04/10] ACPI ERST: header file " Eric DeVolder
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

This change declares the PCI device_id for the new ACPI ERST
device.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 include/hw/pci/pci.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 6be4e0c..eef3ef4 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -108,6 +108,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
 #define PCI_DEVICE_ID_REDHAT_NVME        0x0010
 #define PCI_DEVICE_ID_REDHAT_PVPANIC     0x0011
+#define PCI_DEVICE_ID_REDHAT_ACPI_ERST   0x0012
 #define PCI_DEVICE_ID_REDHAT_QXL         0x0100
 
 #define FMT_PCIBUS                      PRIx64
-- 
1.8.3.1




* [PATCH v5 04/10] ACPI ERST: header file for ERST
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (2 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-06-30 19:07 ` [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature Eric DeVolder
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

This change introduces the definitions for ACPI ERST.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 include/hw/acpi/erst.h | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)
 create mode 100644 include/hw/acpi/erst.h

diff --git a/include/hw/acpi/erst.h b/include/hw/acpi/erst.h
new file mode 100644
index 0000000..07a3fa5
--- /dev/null
+++ b/include/hw/acpi/erst.h
@@ -0,0 +1,84 @@
+/*
+ * ACPI Error Record Serialization Table, ERST, Implementation
+ *
+ * Copyright (c) 2021 Oracle and/or its affiliates.
+ *
+ * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
+ * ACPI Platform Error Interfaces : Error Serialization
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+#ifndef HW_ACPI_ERST_H
+#define HW_ACPI_ERST_H
+
+void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
+                const char *oem_id, const char *oem_table_id);
+
+#define TYPE_ACPI_ERST "acpi-erst"
+#define ACPI_ERST_MEMDEV_PROP "memdev"
+
+#define ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION         0x0
+#define ACPI_ERST_ACTION_BEGIN_READ_OPERATION          0x1
+#define ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION         0x2
+#define ACPI_ERST_ACTION_END_OPERATION                 0x3
+#define ACPI_ERST_ACTION_SET_RECORD_OFFSET             0x4
+#define ACPI_ERST_ACTION_EXECUTE_OPERATION             0x5
+#define ACPI_ERST_ACTION_CHECK_BUSY_STATUS             0x6
+#define ACPI_ERST_ACTION_GET_COMMAND_STATUS            0x7
+#define ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER         0x8
+#define ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER         0x9
+#define ACPI_ERST_ACTION_GET_RECORD_COUNT              0xA
+#define ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION   0xB
+#define ACPI_ERST_ACTION_RESERVED                      0xC
+#define ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE   0xD
+#define ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH  0xE
+#define ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES 0xF
+#define ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS 0x10
+#define ACPI_ERST_MAX_ACTIONS \
+    (ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS + 1)
+
+#define ACPI_ERST_STATUS_SUCCESS                0x00
+#define ACPI_ERST_STATUS_NOT_ENOUGH_SPACE       0x01
+#define ACPI_ERST_STATUS_HARDWARE_NOT_AVAILABLE 0x02
+#define ACPI_ERST_STATUS_FAILED                 0x03
+#define ACPI_ERST_STATUS_RECORD_STORE_EMPTY     0x04
+#define ACPI_ERST_STATUS_RECORD_NOT_FOUND       0x05
+
+#define ACPI_ERST_INST_READ_REGISTER                 0x00
+#define ACPI_ERST_INST_READ_REGISTER_VALUE           0x01
+#define ACPI_ERST_INST_WRITE_REGISTER                0x02
+#define ACPI_ERST_INST_WRITE_REGISTER_VALUE          0x03
+#define ACPI_ERST_INST_NOOP                          0x04
+#define ACPI_ERST_INST_LOAD_VAR1                     0x05
+#define ACPI_ERST_INST_LOAD_VAR2                     0x06
+#define ACPI_ERST_INST_STORE_VAR1                    0x07
+#define ACPI_ERST_INST_ADD                           0x08
+#define ACPI_ERST_INST_SUBTRACT                      0x09
+#define ACPI_ERST_INST_ADD_VALUE                     0x0A
+#define ACPI_ERST_INST_SUBTRACT_VALUE                0x0B
+#define ACPI_ERST_INST_STALL                         0x0C
+#define ACPI_ERST_INST_STALL_WHILE_TRUE              0x0D
+#define ACPI_ERST_INST_SKIP_NEXT_INSTRUCTION_IF_TRUE 0x0E
+#define ACPI_ERST_INST_GOTO                          0x0F
+#define ACPI_ERST_INST_SET_SRC_ADDRESS_BASE          0x10
+#define ACPI_ERST_INST_SET_DST_ADDRESS_BASE          0x11
+#define ACPI_ERST_INST_MOVE_DATA                     0x12
+
+/* returns NULL unless there is exactly one device */
+static inline Object *find_erst_dev(void)
+{
+    return object_resolve_path_type("", TYPE_ACPI_ERST, NULL);
+}
+#endif
+
-- 
1.8.3.1




* [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (3 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 04/10] ACPI ERST: header file " Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-20 12:17   ` Igor Mammedov
  2021-06-30 19:07 ` [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table Eric DeVolder
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

This change implements the support for the ACPI ERST feature.

This implements a PCI device for ACPI ERST. This implements the
non-NVRAM "mode" of operation for ERST.

This change also includes erst.c in the build of general ACPI support.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 hw/acpi/erst.c      | 704 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build |   1 +
 2 files changed, 705 insertions(+)
 create mode 100644 hw/acpi/erst.c

diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
new file mode 100644
index 0000000..6e9bd2e
--- /dev/null
+++ b/hw/acpi/erst.c
@@ -0,0 +1,704 @@
+/*
+ * ACPI Error Record Serialization Table, ERST, Implementation
+ *
+ * Copyright (c) 2021 Oracle and/or its affiliates.
+ *
+ * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
+ * ACPI Platform Error Interfaces : Error Serialization
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/qdev-core.h"
+#include "exec/memory.h"
+#include "qom/object.h"
+#include "hw/pci/pci.h"
+#include "qom/object_interfaces.h"
+#include "qemu/error-report.h"
+#include "migration/vmstate.h"
+#include "hw/qdev-properties.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/acpi-defs.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "exec/address-spaces.h"
+#include "sysemu/hostmem.h"
+#include "hw/acpi/erst.h"
+#include "trace.h"
+
+/* UEFI 2.1: Appendix N Common Platform Error Record */
+#define UEFI_CPER_RECORD_MIN_SIZE 128U
+#define UEFI_CPER_RECORD_LENGTH_OFFSET 20U
+#define UEFI_CPER_RECORD_ID_OFFSET 96U
+#define IS_UEFI_CPER_RECORD(ptr) \
+    (((ptr)[0] == 'C') && \
+     ((ptr)[1] == 'P') && \
+     ((ptr)[2] == 'E') && \
+     ((ptr)[3] == 'R'))
+#define THE_UEFI_CPER_RECORD_ID(ptr) \
+    (*(uint64_t *)(&(ptr)[UEFI_CPER_RECORD_ID_OFFSET]))
+
+/*
+ * This implementation is an ACTION (cmd) and VALUE (data)
+ * interface consisting of just two 64-bit registers.
+ */
+#define ERST_REG_SIZE (2UL * sizeof(uint64_t))
+#define ERST_CSR_ACTION (0UL << 3) /* action (cmd) */
+#define ERST_CSR_VALUE  (1UL << 3) /* argument/value (data) */
+
+/*
+ * ERST_RECORD_SIZE is the buffer size for exchanging ERST
+ * record contents. Thus, it defines the maximum record size.
+ * As this is mapped through a PCI BAR, it must be a power of
+ * two, and should be at least PAGE_SIZE.
+ * Records are stored in the backing file in a simple fashion.
+ * The backing file is essentially divided into fixed size
+ * "slots", ERST_RECORD_SIZE in length, with each "slot"
+ * storing a single record. No attempt is made to optimize
+ * storage through compression, compaction, etc.
+ * NOTE that any change to this value will make any
+ * pre-existing backing files, not of the same ERST_RECORD_SIZE,
+ * unusable to the guest.
+ */
+/* 8KiB records, not too small, not too big */
+#define ERST_RECORD_SIZE (2UL * 4096)
+
+#define ERST_INVALID_RECORD_ID (~0ULL) /* 64-bit all-ones on every host */
+#define ERST_EXECUTE_OPERATION_MAGIC 0x9CUL
+
+/*
+ * Object cast macro
+ */
+#define ACPIERST(obj) \
+    OBJECT_CHECK(ERSTDeviceState, (obj), TYPE_ACPI_ERST)
+
+/*
+ * Main ERST device state structure
+ */
+typedef struct {
+    PCIDevice parent_obj;
+
+    HostMemoryBackend *hostmem;
+    MemoryRegion *hostmem_mr;
+
+    MemoryRegion iomem; /* programming registers */
+    MemoryRegion nvmem; /* exchange buffer */
+    uint32_t prop_size;
+    hwaddr bar0; /* programming registers */
+    hwaddr bar1; /* exchange buffer */
+
+    uint8_t operation;
+    uint8_t busy_status;
+    uint8_t command_status;
+    uint32_t record_offset;
+    uint32_t record_count;
+    uint64_t reg_action;
+    uint64_t reg_value;
+    uint64_t record_identifier;
+
+    unsigned next_record_index;
+    uint8_t record[ERST_RECORD_SIZE]; /* read/written directly by guest */
+    uint8_t tmp_record[ERST_RECORD_SIZE]; /* intermediate manipulation buffer */
+
+} ERSTDeviceState;
+
+/*******************************************************************/
+/*******************************************************************/
+
+static unsigned copy_from_nvram_by_index(ERSTDeviceState *s, unsigned index)
+{
+    /* Read an nvram entry into tmp_record */
+    unsigned rc = ACPI_ERST_STATUS_FAILED;
+    off_t offset = (index * ERST_RECORD_SIZE);
+
+    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
+        if (s->hostmem_mr) {
+            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
+            memcpy(s->tmp_record, p + offset, ERST_RECORD_SIZE);
+            rc = ACPI_ERST_STATUS_SUCCESS;
+        }
+    }
+    return rc;
+}
+
+static unsigned copy_to_nvram_by_index(ERSTDeviceState *s, unsigned index)
+{
+    /* Write entry in tmp_record into nvram, and backing file */
+    unsigned rc = ACPI_ERST_STATUS_FAILED;
+    off_t offset = (index * ERST_RECORD_SIZE);
+
+    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
+        if (s->hostmem_mr) {
+            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
+            memcpy(p + offset, s->tmp_record, ERST_RECORD_SIZE);
+            rc = ACPI_ERST_STATUS_SUCCESS;
+        }
+    }
+    return rc;
+}
+
+static int lookup_erst_record_by_identifier(ERSTDeviceState *s,
+    uint64_t record_identifier, bool *record_found, bool alloc_for_write)
+{
+    int rc = -1;
+    int empty_index = -1;
+    int index = 0;
+    unsigned rrc;
+
+    *record_found = 0;
+
+    do {
+        rrc = copy_from_nvram_by_index(s, (unsigned)index);
+        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
+            uint64_t this_identifier;
+            this_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
+            if (IS_UEFI_CPER_RECORD(s->tmp_record) &&
+                (this_identifier == record_identifier)) {
+                rc = index;
+                *record_found = 1;
+                break;
+            }
+            if ((this_identifier == ERST_INVALID_RECORD_ID) &&
+                (empty_index < 0)) {
+                empty_index = index; /* first available for write */
+            }
+        }
+        ++index;
+    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
+
+    /* Record not found, allocate for writing */
+    if ((rc < 0) && alloc_for_write) {
+        rc = empty_index;
+    }
+
+    return rc;
+}
+
+static unsigned clear_erst_record(ERSTDeviceState *s)
+{
+    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
+    bool record_found;
+    int index;
+
+    index = lookup_erst_record_by_identifier(s,
+        s->record_identifier, &record_found, 0);
+    if (record_found) {
+        memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
+        rc = copy_to_nvram_by_index(s, (unsigned)index);
+        if (rc == ACPI_ERST_STATUS_SUCCESS) {
+            s->record_count -= 1;
+        }
+    }
+
+    return rc;
+}
+
+static unsigned write_erst_record(ERSTDeviceState *s)
+{
+    unsigned rc = ACPI_ERST_STATUS_FAILED;
+
+    if (s->record_offset < (ERST_RECORD_SIZE - UEFI_CPER_RECORD_MIN_SIZE)) {
+        uint64_t record_identifier;
+        uint8_t *record = &s->record[s->record_offset];
+        bool record_found;
+        int index;
+
+        record_identifier = (s->record_identifier == ERST_INVALID_RECORD_ID)
+            ? THE_UEFI_CPER_RECORD_ID(record) : s->record_identifier;
+
+        index = lookup_erst_record_by_identifier(s,
+            record_identifier, &record_found, 1);
+        if (index < 0) {
+            rc = ACPI_ERST_STATUS_NOT_ENOUGH_SPACE;
+        } else {
+            if (0 != s->record_offset) {
+                memset(&s->tmp_record[ERST_RECORD_SIZE - s->record_offset],
+                    0xFF, s->record_offset);
+            }
+            memcpy(s->tmp_record, record, ERST_RECORD_SIZE - s->record_offset);
+            rc = copy_to_nvram_by_index(s, (unsigned)index);
+            if (rc == ACPI_ERST_STATUS_SUCCESS) {
+                if (!record_found) { /* not overwriting existing record */
+                    s->record_count += 1; /* writing new record */
+                }
+            }
+        }
+    }
+
+    return rc;
+}
+
+static unsigned next_erst_record(ERSTDeviceState *s,
+    uint64_t *record_identifier)
+{
+    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
+    unsigned index;
+    unsigned rrc;
+
+    *record_identifier = ERST_INVALID_RECORD_ID;
+
+    index = s->next_record_index;
+    do {
+        rrc = copy_from_nvram_by_index(s, (unsigned)index);
+        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
+            if (IS_UEFI_CPER_RECORD(s->tmp_record)) {
+                s->next_record_index = index + 1; /* where to start next time */
+                *record_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
+                rc = ACPI_ERST_STATUS_SUCCESS;
+                break;
+            }
+            ++index;
+        } else {
+            if (s->next_record_index == 0) {
+                rc = ACPI_ERST_STATUS_RECORD_STORE_EMPTY;
+            }
+            s->next_record_index = 0; /* at end, reset */
+        }
+    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
+
+    return rc;
+}
+
+static unsigned read_erst_record(ERSTDeviceState *s)
+{
+    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
+    bool record_found;
+    int index;
+
+    index = lookup_erst_record_by_identifier(s,
+        s->record_identifier, &record_found, 0);
+    if (record_found) {
+        rc = copy_from_nvram_by_index(s, (unsigned)index);
+        if (rc == ACPI_ERST_STATUS_SUCCESS) {
+            if (s->record_offset < ERST_RECORD_SIZE) {
+                memcpy(&s->record[s->record_offset], s->tmp_record,
+                    ERST_RECORD_SIZE - s->record_offset);
+            }
+        }
+    }
+
+    return rc;
+}
+
+static unsigned get_erst_record_count(ERSTDeviceState *s)
+{
+    /* Compute record_count */
+    unsigned index = 0;
+
+    s->record_count = 0;
+    while (copy_from_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
+        uint8_t *ptr = &s->tmp_record[0];
+        uint64_t record_identifier = THE_UEFI_CPER_RECORD_ID(ptr);
+        if (IS_UEFI_CPER_RECORD(ptr) &&
+            (ERST_INVALID_RECORD_ID != record_identifier)) {
+            s->record_count += 1;
+        }
+        ++index;
+    }
+    return s->record_count;
+}
+
+/*******************************************************************/
+
+static uint64_t erst_rd_reg64(hwaddr addr,
+    uint64_t reg, unsigned size)
+{
+    uint64_t rdval;
+    uint64_t mask;
+    unsigned shift;
+
+    if (size == sizeof(uint64_t)) {
+        /* 64b access */
+        mask = 0xFFFFFFFFFFFFFFFFUL;
+        shift = 0;
+    } else {
+        /* 32b access */
+        mask = 0x00000000FFFFFFFFUL;
+        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
+    }
+
+    rdval = reg;
+    rdval >>= shift;
+    rdval &= mask;
+
+    return rdval;
+}
+
+static uint64_t erst_wr_reg64(hwaddr addr,
+    uint64_t reg, uint64_t val, unsigned size)
+{
+    uint64_t wrval;
+    uint64_t mask;
+    unsigned shift;
+
+    if (size == sizeof(uint64_t)) {
+        /* 64b access */
+        mask = 0xFFFFFFFFFFFFFFFFUL;
+        shift = 0;
+    } else {
+        /* 32b access */
+        mask = 0x00000000FFFFFFFFUL;
+        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
+    }
+
+    val &= mask;
+    val <<= shift;
+    mask <<= shift;
+    wrval = reg;
+    wrval &= ~mask;
+    wrval |= val;
+
+    return wrval;
+}
+
+static void erst_reg_write(void *opaque, hwaddr addr,
+    uint64_t val, unsigned size)
+{
+    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
+
+    /*
+     * NOTE: All actions/operations/side effects happen on the WRITE,
+     * by design. The READs simply return the reg_value contents.
+     */
+    trace_acpi_erst_reg_write(addr, val, size);
+
+    switch (addr) {
+    case ERST_CSR_VALUE + 0:
+    case ERST_CSR_VALUE + 4:
+        s->reg_value = erst_wr_reg64(addr, s->reg_value, val, size);
+        break;
+    case ERST_CSR_ACTION + 0:
+/*  case ERST_CSR_ACTION+4: as coded, not really a 64b register */
+        switch (val) {
+        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
+        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
+        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
+        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
+        case ACPI_ERST_ACTION_END_OPERATION:
+            s->operation = val;
+            break;
+        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
+            s->record_offset = s->reg_value;
+            break;
+        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
+            if ((uint8_t)s->reg_value == ERST_EXECUTE_OPERATION_MAGIC) {
+                s->busy_status = 1;
+                switch (s->operation) {
+                case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
+                    s->command_status = write_erst_record(s);
+                    break;
+                case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
+                    s->command_status = read_erst_record(s);
+                    break;
+                case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
+                    s->command_status = clear_erst_record(s);
+                    break;
+                case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
+                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
+                    break;
+                case ACPI_ERST_ACTION_END_OPERATION:
+                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
+                    break;
+                default:
+                    s->command_status = ACPI_ERST_STATUS_FAILED;
+                    break;
+                }
+                s->record_identifier = ERST_INVALID_RECORD_ID;
+                s->busy_status = 0;
+            }
+            break;
+        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
+            s->reg_value = s->busy_status;
+            break;
+        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
+            s->reg_value = s->command_status;
+            break;
+        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
+            s->command_status = next_erst_record(s, &s->reg_value);
+            break;
+        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
+            s->record_identifier = s->reg_value;
+            break;
+        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
+            s->reg_value = s->record_count;
+            break;
+        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
+            s->reg_value = s->bar1;
+            break;
+        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
+            s->reg_value = ERST_RECORD_SIZE;
+            break;
+        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
+            s->reg_value = 0x0; /* intentional, not NVRAM mode */
+            break;
+        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
+            /*
+             * 100UL is max, 10UL is nominal
+             */
+            s->reg_value = ((100ULL << 32) | (10ULL << 0));
+            break;
+        case ACPI_ERST_ACTION_RESERVED:
+        default:
+            /*
+             * Unknown action/command, NOP
+             */
+            break;
+        }
+        break;
+    default:
+        /*
+         * This should not happen, but if it does, NOP
+         */
+        break;
+    }
+}
+
+static uint64_t erst_reg_read(void *opaque, hwaddr addr,
+                                unsigned size)
+{
+    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
+    uint64_t val = 0;
+
+    switch (addr) {
+    case ERST_CSR_ACTION + 0:
+    case ERST_CSR_ACTION + 4:
+        val = erst_rd_reg64(addr, s->reg_action, size);
+        break;
+    case ERST_CSR_VALUE + 0:
+    case ERST_CSR_VALUE + 4:
+        val = erst_rd_reg64(addr, s->reg_value, size);
+        break;
+    default:
+        break;
+    }
+    trace_acpi_erst_reg_read(addr, val, size);
+    return val;
+}
+
+static const MemoryRegionOps erst_reg_ops = {
+    .read = erst_reg_read,
+    .write = erst_reg_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void erst_mem_write(void *opaque, hwaddr addr,
+    uint64_t val, unsigned size)
+{
+    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
+    uint8_t *ptr = &s->record[addr - 0];
+    trace_acpi_erst_mem_write(addr, val, size);
+    switch (size) {
+    default:
+    case sizeof(uint8_t):
+        *(uint8_t *)ptr = (uint8_t)val;
+        break;
+    case sizeof(uint16_t):
+        *(uint16_t *)ptr = (uint16_t)val;
+        break;
+    case sizeof(uint32_t):
+        *(uint32_t *)ptr = (uint32_t)val;
+        break;
+    case sizeof(uint64_t):
+        *(uint64_t *)ptr = (uint64_t)val;
+        break;
+    }
+}
+
+static uint64_t erst_mem_read(void *opaque, hwaddr addr,
+                                unsigned size)
+{
+    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
+    uint8_t *ptr = &s->record[addr - 0];
+    uint64_t val = 0;
+    switch (size) {
+    default:
+    case sizeof(uint8_t):
+        val = *(uint8_t *)ptr;
+        break;
+    case sizeof(uint16_t):
+        val = *(uint16_t *)ptr;
+        break;
+    case sizeof(uint32_t):
+        val = *(uint32_t *)ptr;
+        break;
+    case sizeof(uint64_t):
+        val = *(uint64_t *)ptr;
+        break;
+    }
+    trace_acpi_erst_mem_read(addr, val, size);
+    return val;
+}
+
+static const MemoryRegionOps erst_mem_ops = {
+    .read = erst_mem_read,
+    .write = erst_mem_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+/*******************************************************************/
+/*******************************************************************/
+
+static const VMStateDescription erst_vmstate  = {
+    .name = "acpi-erst",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8(operation, ERSTDeviceState),
+        VMSTATE_UINT8(busy_status, ERSTDeviceState),
+        VMSTATE_UINT8(command_status, ERSTDeviceState),
+        VMSTATE_UINT32(record_offset, ERSTDeviceState),
+        VMSTATE_UINT32(record_count, ERSTDeviceState),
+        VMSTATE_UINT64(reg_action, ERSTDeviceState),
+        VMSTATE_UINT64(reg_value, ERSTDeviceState),
+        VMSTATE_UINT64(record_identifier, ERSTDeviceState),
+        VMSTATE_UINT8_ARRAY(record, ERSTDeviceState, ERST_RECORD_SIZE),
+        VMSTATE_UINT8_ARRAY(tmp_record, ERSTDeviceState, ERST_RECORD_SIZE),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
+{
+    ERSTDeviceState *s = ACPIERST(pci_dev);
+    unsigned index = 0;
+    bool share;
+
+    trace_acpi_erst_realizefn_in();
+
+    if (!s->hostmem) {
+        error_setg(errp, "'" ACPI_ERST_MEMDEV_PROP "' property is not set");
+        return;
+    } else if (host_memory_backend_is_mapped(s->hostmem)) {
+        error_setg(errp, "can't use already busy memdev: %s",
+                   object_get_canonical_path_component(OBJECT(s->hostmem)));
+        return;
+    }
+
+    share = object_property_get_bool(OBJECT(s->hostmem), "share", &error_fatal);
+    if (!share) {
+        error_setg(errp, "ACPI ERST requires hostmem property share=on: %s",
+                   object_get_canonical_path_component(OBJECT(s->hostmem)));
+        return;
+    }
+
+    s->hostmem_mr = host_memory_backend_get_memory(s->hostmem);
+
+    /* HostMemoryBackend size will be multiple of PAGE_SIZE */
+    s->prop_size = object_property_get_int(OBJECT(s->hostmem), "size", &error_fatal);
+
+    /* Convert prop_size to integer multiple of ERST_RECORD_SIZE */
+    s->prop_size -= (s->prop_size % ERST_RECORD_SIZE);
+
+    /*
+     * MemoryBackend initializes contents to zero, but we actually
+     * want contents initialized to 0xFF, ERST_INVALID_RECORD_ID.
+     */
+    if (copy_from_nvram_by_index(s, 0) == ACPI_ERST_STATUS_SUCCESS) {
+        if (s->tmp_record[0] == 0x00) {
+            memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
+            index = 0;
+            while (copy_to_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
+                ++index;
+            }
+        }
+    }
+
+    /* Initialize record_count */
+    get_erst_record_count(s);
+
+    /* BAR 0: Programming registers */
+    memory_region_init_io(&s->iomem, OBJECT(pci_dev), &erst_reg_ops, s,
+                          TYPE_ACPI_ERST, ERST_REG_SIZE);
+    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->iomem);
+
+    /* BAR 1: Exchange buffer memory */
+    memory_region_init_io(&s->nvmem, OBJECT(pci_dev), &erst_mem_ops, s,
+                          TYPE_ACPI_ERST, ERST_RECORD_SIZE);
+    pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->nvmem);
+
+    /*
+     * The vmstate_register_ram_global() puts the memory in
+     * migration stream, where it is written back to the memory
+     * upon reaching the destination, which causes the backing
+     * file to be updated (with share=on).
+     */
+    vmstate_register_ram_global(s->hostmem_mr);
+
+    trace_acpi_erst_realizefn_out(s->prop_size);
+}
+
+static void erst_reset(DeviceState *dev)
+{
+    ERSTDeviceState *s = ACPIERST(dev);
+
+    trace_acpi_erst_reset_in(s->record_count);
+    s->operation = 0;
+    s->busy_status = 0;
+    s->command_status = ACPI_ERST_STATUS_SUCCESS;
+    /* indicate empty/no-more until further notice */
+    s->record_identifier = ERST_INVALID_RECORD_ID;
+    s->record_offset = 0;
+    s->next_record_index = 0;
+    /* NOTE: record_count and nvram are initialized elsewhere */
+    trace_acpi_erst_reset_out(s->record_count);
+}
+
+static Property erst_properties[] = {
+    DEFINE_PROP_LINK(ACPI_ERST_MEMDEV_PROP, ERSTDeviceState, hostmem,
+                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void erst_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+    trace_acpi_erst_class_init_in();
+    k->realize = erst_realizefn;
+    k->vendor_id = PCI_VENDOR_ID_REDHAT;
+    k->device_id = PCI_DEVICE_ID_REDHAT_ACPI_ERST;
+    k->revision = 0x00;
+    k->class_id = PCI_CLASS_OTHERS;
+    dc->reset = erst_reset;
+    dc->vmsd = &erst_vmstate;
+    dc->user_creatable = true;
+    device_class_set_props(dc, erst_properties);
+    dc->desc = "ACPI Error Record Serialization Table (ERST) device";
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    trace_acpi_erst_class_init_out();
+}
+
+static const TypeInfo erst_type_info = {
+    .name          = TYPE_ACPI_ERST,
+    .parent        = TYPE_PCI_DEVICE,
+    .class_init    = erst_class_init,
+    .instance_size = sizeof(ERSTDeviceState),
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+        { }
+    }
+};
+
+static void erst_register_types(void)
+{
+    type_register_static(&erst_type_info);
+}
+
+type_init(erst_register_types)
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index dd69577..262a8ee 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -4,6 +4,7 @@ acpi_ss.add(files(
   'aml-build.c',
   'bios-linker-loader.c',
   'utils.c',
+  'erst.c',
 ))
 acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))
-- 
1.8.3.1
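
Note on erst_rd_reg64()/erst_wr_reg64() above: both helpers let a 64-bit
register be accessed either whole or as two 32-bit halves, with bit 2 of
the address selecting the upper half. The following is a minimal
standalone sketch of that masking and shifting, not the device code
itself. The names rd_reg64, wr_reg64 and reg64_selftest, and the offsets
0x8/0xC, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 64-bit register through a 64-bit or 32-bit access window. */
static uint64_t rd_reg64(uint64_t addr, uint64_t reg, unsigned size)
{
    int full = (size == sizeof(uint64_t));
    uint64_t mask = full ? UINT64_MAX : 0xFFFFFFFFULL;
    /* For 32-bit accesses, address bit 2 selects the upper half. */
    unsigned shift = (!full && (addr & 0x4)) ? 32 : 0;

    return (reg >> shift) & mask;
}

/* Merge a 64-bit or 32-bit write into the current register value. */
static uint64_t wr_reg64(uint64_t addr, uint64_t reg, uint64_t val,
                         unsigned size)
{
    int full = (size == sizeof(uint64_t));
    uint64_t mask = full ? UINT64_MAX : 0xFFFFFFFFULL;
    unsigned shift = (!full && (addr & 0x4)) ? 32 : 0;

    /* Clear only the addressed bits, then OR in the new value. */
    return (reg & ~(mask << shift)) | ((val & mask) << shift);
}

/* Two 32-bit writes (hypothetical offsets 0x8 and 0xC) build one value. */
static void reg64_selftest(void)
{
    uint64_t reg = 0;

    reg = wr_reg64(0x8, reg, 0x55667788ULL, 4); /* lower half */
    reg = wr_reg64(0xC, reg, 0x11223344ULL, 4); /* upper half */
    assert(reg == 0x1122334455667788ULL);
    assert(rd_reg64(0xC, reg, 4) == 0x11223344ULL);
    assert(rd_reg64(0x8, reg, 8) == reg);
}
```

Calling reg64_selftest() from a test harness exercises both halves; a
write to one half leaves the other half untouched.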




* [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (4 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-20 13:16   ` Igor Mammedov
  2021-06-30 19:07 ` [PATCH v5 07/10] ACPI ERST: trace support Eric DeVolder
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

The machine code calls build_erst() (if ACPI is supported) to
generate the ACPI ERST table.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
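For reviewers: each call to build_serialization_instruction_entry()
below emits one 32-byte Serialization Instruction Entry: four 1-byte
fields, a 12-byte Generic Address Structure, then an 8-byte value and
an 8-byte mask. The sketch below packs the same layout into a plain
byte buffer to make the offsets explicit. It is an illustration only;
put_le and pack_entry are hypothetical helpers, not QEMU functions, and
the sample address and value used with them are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append 'size' bytes of 'val', little-endian; returns the new length. */
static size_t put_le(uint8_t *buf, size_t len, uint64_t val, unsigned size)
{
    for (unsigned i = 0; i < size; i++) {
        buf[len++] = (uint8_t)(val >> (8 * i));
    }
    return len;
}

/* Pack one Serialization Instruction Entry; 'buf' must hold 32 bytes. */
static size_t pack_entry(uint8_t *buf, uint8_t action, uint8_t instruction,
                         uint8_t flags, uint8_t bit_width,
                         uint8_t access_width, uint64_t address,
                         uint64_t value, uint64_t mask)
{
    size_t len = 0;

    len = put_le(buf, len, action, 1);       /* offset 0: action */
    len = put_le(buf, len, instruction, 1);  /* offset 1: instruction */
    len = put_le(buf, len, flags, 1);        /* offset 2: flags */
    len = put_le(buf, len, 0, 1);            /* offset 3: reserved */
    /* Generic Address Structure, 12 bytes, offsets 4..15 */
    len = put_le(buf, len, 0, 1);            /* space id 0: system memory */
    len = put_le(buf, len, bit_width, 1);
    len = put_le(buf, len, 0, 1);            /* bit offset */
    len = put_le(buf, len, access_width, 1);
    len = put_le(buf, len, address, 8);
    len = put_le(buf, len, value, 8);        /* offset 16: value */
    len = put_le(buf, len, mask, 8);         /* offset 24: mask */
    return len;
}
```

Because every entry has this fixed size, the table length after the
header fields is simply 32 times the instruction entry count.
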
 hw/acpi/erst.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 214 insertions(+)

diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
index 6e9bd2e..1f1dbbc 100644
--- a/hw/acpi/erst.c
+++ b/hw/acpi/erst.c
@@ -555,6 +555,220 @@ static const MemoryRegionOps erst_mem_ops = {
 /*******************************************************************/
 /*******************************************************************/
 
+/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
+static void build_serialization_instruction_entry(GArray *table_data,
+    uint8_t serialization_action,
+    uint8_t instruction,
+    uint8_t flags,
+    uint8_t register_bit_width,
+    uint64_t register_address,
+    uint64_t value,
+    uint64_t mask)
+{
+    /* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
+    struct AcpiGenericAddress gas;
+
+    /* Serialization Action */
+    build_append_int_noprefix(table_data, serialization_action, 1);
+    /* Instruction */
+    build_append_int_noprefix(table_data, instruction         , 1);
+    /* Flags */
+    build_append_int_noprefix(table_data, flags               , 1);
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0                   , 1);
+    /* Register Region */
+    gas.space_id = AML_SYSTEM_MEMORY;
+    gas.bit_width = register_bit_width;
+    gas.bit_offset = 0;
+    switch (register_bit_width) {
+    case 8:
+        gas.access_width = 1;
+        break;
+    case 16:
+        gas.access_width = 2;
+        break;
+    case 32:
+        gas.access_width = 3;
+        break;
+    case 64:
+        gas.access_width = 4;
+        break;
+    default:
+        gas.access_width = 0;
+        break;
+    }
+    gas.address = register_address;
+    build_append_gas_from_struct(table_data, &gas);
+    /* Value */
+    build_append_int_noprefix(table_data, value  , 8);
+    /* Mask */
+    build_append_int_noprefix(table_data, mask   , 8);
+}
+
+/* ACPI 4.0: 17.4.1 Serialization Action Table */
+void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
+    const char *oem_id, const char *oem_table_id)
+{
+    ERSTDeviceState *s = ACPIERST(erst_dev);
+    unsigned action;
+    unsigned erst_start = table_data->len;
+
+    s->bar0 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
+    trace_acpi_erst_pci_bar_0(s->bar0);
+    s->bar1 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);
+    trace_acpi_erst_pci_bar_1(s->bar1);
+
+    acpi_data_push(table_data, sizeof(AcpiTableHeader));
+    /* serialization_header_length */
+    build_append_int_noprefix(table_data, 48, 4);
+    /* reserved */
+    build_append_int_noprefix(table_data,  0, 4);
+    /*
+     * instruction_entry_count - changes to the number of serialization
+     * instructions in the ACTIONs below must be reflected in this
+     * pre-computed value.
+     */
+    build_append_int_noprefix(table_data, 29, 4);
+
+#define MASK8  0x00000000000000FFUL
+#define MASK16 0x000000000000FFFFUL
+#define MASK32 0x00000000FFFFFFFFUL
+#define MASK64 0xFFFFFFFFFFFFFFFFUL
+
+    for (action = 0; action < ACPI_ERST_MAX_ACTIONS; ++action) {
+        switch (action) {
+        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_END_OPERATION:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
+                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
+                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
+            break;
+        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
+            break;
+        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
+            break;
+        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
+                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
+            break;
+        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_RESERVED:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            break;
+        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
+            break;
+        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
+            break;
+        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
+            break;
+        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
+                s->bar0 + ERST_CSR_ACTION, action, MASK8);
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
+                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
+        default:
+            build_serialization_instruction_entry(table_data, action,
+                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
+            break;
+        }
+    }
+    build_header(linker, table_data,
+                 (void *)(table_data->data + erst_start),
+                 "ERST", table_data->len - erst_start,
+                 1, oem_id, oem_table_id);
+}
+
+/*******************************************************************/
+/*******************************************************************/
+
 static const VMStateDescription erst_vmstate  = {
     .name = "acpi-erst",
     .version_id = 1,
-- 
1.8.3.1




* [PATCH v5 07/10] ACPI ERST: trace support
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (5 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-20 13:15   ` Igor Mammedov
  2021-06-30 19:07 ` [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines Eric DeVolder
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

Provide the definitions needed to support tracing in ACPI ERST.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
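As a quick illustration of what the register trace lines look like,
the sketch below feeds the acpi_erst_reg_write format string from this
patch through snprintf(). The wrapper name format_reg_write and the
sample values are assumptions for the example; the real output is
produced by QEMU's generated trace code, not by this function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Render one register-write trace line using the patch's format string. */
static int format_reg_write(char *buf, size_t len, uint64_t addr,
                            uint64_t val, unsigned size)
{
    return snprintf(buf, len,
                    "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)",
                    addr, val, size);
}
```

The address is zero-padded to 4 hex digits and the value to 16, so
32-bit and 64-bit accesses line up in the trace log.
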
 hw/acpi/trace-events | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events
index dcc1438..a5c2755 100644
--- a/hw/acpi/trace-events
+++ b/hw/acpi/trace-events
@@ -55,3 +55,17 @@ piix4_gpe_writeb(uint64_t addr, unsigned width, uint64_t val) "addr: 0x%" PRIx64
 # tco.c
 tco_timer_reload(int ticks, int msec) "ticks=%d (%d ms)"
 tco_timer_expired(int timeouts_no, bool strap, bool no_reboot) "timeouts_no=%d no_reboot=%d/%d"
+
+# erst.c
+acpi_erst_reg_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
+acpi_erst_reg_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%04" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
+acpi_erst_mem_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%06" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
+acpi_erst_mem_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%06" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
+acpi_erst_pci_bar_0(uint64_t addr) "BAR0: 0x%016" PRIx64
+acpi_erst_pci_bar_1(uint64_t addr) "BAR1: 0x%016" PRIx64
+acpi_erst_realizefn_in(void)
+acpi_erst_realizefn_out(unsigned size) "total nvram size %u bytes"
+acpi_erst_reset_in(unsigned record_count) "record_count %u"
+acpi_erst_reset_out(unsigned record_count) "record_count %u"
+acpi_erst_class_init_in(void)
+acpi_erst_class_init_out(void)
-- 
1.8.3.1




* [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines.
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (6 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 07/10] ACPI ERST: trace support Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-20 13:19   ` Igor Mammedov
  2021-06-30 19:07 ` [PATCH v5 09/10] ACPI ERST: qtest for ERST Eric DeVolder
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

This change exposes ACPI ERST support for x86 guests.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 hw/i386/acpi-build.c   | 9 +++++++++
 hw/i386/acpi-microvm.c | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index de98750..d2026cc 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -43,6 +43,7 @@
 #include "sysemu/tpm.h"
 #include "hw/acpi/tpm.h"
 #include "hw/acpi/vmgenid.h"
+#include "hw/acpi/erst.h"
 #include "hw/boards.h"
 #include "sysemu/tpm_backend.h"
 #include "hw/rtc/mc146818rtc_regs.h"
@@ -2327,6 +2328,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
     GArray *tables_blob = tables->table_data;
     AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
     Object *vmgenid_dev;
+    Object *erst_dev;
     char *oem_id;
     char *oem_table_id;
 
@@ -2388,6 +2390,13 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
                     ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
                     x86ms->oem_table_id);
 
+    erst_dev = find_erst_dev();
+    if (erst_dev) {
+        acpi_add_table(table_offsets, tables_blob);
+        build_erst(tables_blob, tables->linker, erst_dev,
+                   x86ms->oem_id, x86ms->oem_table_id);
+    }
+
     vmgenid_dev = find_vmgenid_dev();
     if (vmgenid_dev) {
         acpi_add_table(table_offsets, tables_blob);
diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
index ccd3303..0099b13 100644
--- a/hw/i386/acpi-microvm.c
+++ b/hw/i386/acpi-microvm.c
@@ -30,6 +30,7 @@
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/generic_event_device.h"
 #include "hw/acpi/utils.h"
+#include "hw/acpi/erst.h"
 #include "hw/boards.h"
 #include "hw/i386/fw_cfg.h"
 #include "hw/i386/microvm.h"
@@ -160,6 +161,7 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
     X86MachineState *x86ms = X86_MACHINE(mms);
     GArray *table_offsets;
     GArray *tables_blob = tables->table_data;
+    Object *erst_dev;
     unsigned dsdt, xsdt;
     AcpiFadtData pmfadt = {
         /* ACPI 5.0: 4.1 Hardware-Reduced ACPI */
@@ -209,6 +211,13 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
                     ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
                     x86ms->oem_table_id);
 
+    erst_dev = find_erst_dev();
+    if (erst_dev) {
+        acpi_add_table(table_offsets, tables_blob);
+        build_erst(tables_blob, tables->linker, erst_dev,
+                   x86ms->oem_id, x86ms->oem_table_id);
+    }
+
     xsdt = tables_blob->len;
     build_xsdt(tables_blob, tables->linker, table_offsets, x86ms->oem_id,
                x86ms->oem_table_id);
-- 
1.8.3.1




* [PATCH v5 09/10] ACPI ERST: qtest for ERST
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (7 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-20 13:38   ` Igor Mammedov
  2021-06-30 19:07 ` [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c Eric DeVolder
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

This change provides a qtest that locates and then does a simple
interrogation of the ERST feature within the guest.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 tests/qtest/erst-test.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build |   2 +
 2 files changed, 131 insertions(+)
 create mode 100644 tests/qtest/erst-test.c

diff --git a/tests/qtest/erst-test.c b/tests/qtest/erst-test.c
new file mode 100644
index 0000000..ce014c1
--- /dev/null
+++ b/tests/qtest/erst-test.c
@@ -0,0 +1,129 @@
+/*
+ * QTest testcase for ACPI ERST
+ *
+ * Copyright (c) 2021 Oracle
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/bitmap.h"
+#include "qemu/uuid.h"
+#include "hw/acpi/acpi-defs.h"
+#include "boot-sector.h"
+#include "acpi-utils.h"
+#include "libqos/libqtest.h"
+#include "qapi/qmp/qdict.h"
+
+#define RSDP_ADDR_INVALID 0x100000 /* RSDP must be below this address */
+
+static uint64_t acpi_find_erst(QTestState *qts)
+{
+    uint32_t rsdp_offset;
+    uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
+    uint32_t rsdt_len, table_length;
+    uint8_t *rsdt, *ent;
+    uint64_t base = 0;
+
+    /* Wait for guest firmware to finish and start the payload. */
+    boot_sector_test(qts);
+
+    /* Tables should be initialized now. */
+    rsdp_offset = acpi_find_rsdp_address(qts);
+
+    g_assert_cmphex(rsdp_offset, <, RSDP_ADDR_INVALID);
+
+    acpi_fetch_rsdp_table(qts, rsdp_offset, rsdp_table);
+    acpi_fetch_table(qts, &rsdt, &rsdt_len, &rsdp_table[16 /* RsdtAddress */],
+                     4, "RSDT", true);
+
+    ACPI_FOREACH_RSDT_ENTRY(rsdt, rsdt_len, ent, 4 /* Entry size */) {
+        uint8_t *table_aml;
+        acpi_fetch_table(qts, &table_aml, &table_length, ent, 4, NULL, true);
+        if (!memcmp(table_aml + 0 /* Header Signature */, "ERST", 4)) {
+            /*
+             * Picking up ERST base address from the Register Region
+             * specified as part of the first Serialization Instruction
+             * Action (which is a Begin Write Operation).
+             */
+            memcpy(&base, &table_aml[56], sizeof(base));
+            g_free(table_aml);
+            break;
+        }
+        g_free(table_aml);
+    }
+    g_free(rsdt);
+    return base;
+}
+
+static char disk[] = "tests/erst-test-disk-XXXXXX";
+
+#define ERST_CMD()                              \
+    "-accel kvm -accel tcg "                    \
+    "-object memory-backend-file," \
+      "id=erstnvram,mem-path=tests/acpi-erst-XXXXXX,size=0x10000,share=on " \
+    "-device acpi-erst,memdev=erstnvram " \
+    "-drive id=hd0,if=none,file=%s,format=raw " \
+    "-device ide-hd,drive=hd0 ", disk
+
+static void erst_get_error_log_address_range(void)
+{
+    QTestState *qts;
+    uint64_t log_address_range = 0;
+    unsigned log_address_length = 0;
+    unsigned log_address_attr = 0;
+
+    qts = qtest_initf(ERST_CMD());
+
+    uint64_t base = acpi_find_erst(qts);
+    g_assert(base != 0);
+
+    /* Issue GET_ERROR_LOG_ADDRESS_RANGE command */
+    qtest_writel(qts, base + 0, 0xD);
+    /* Read GET_ERROR_LOG_ADDRESS_RANGE result */
+    log_address_range = qtest_readq(qts, base + 8);
+
+    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_LENGTH command */
+    qtest_writel(qts, base + 0, 0xE);
+    /* Read GET_ERROR_LOG_ADDRESS_RANGE_LENGTH result */
+    log_address_length = qtest_readq(qts, base + 8);
+
+    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES command */
+    qtest_writel(qts, base + 0, 0xF);
+    /* Read GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES result */
+    log_address_attr = qtest_readq(qts, base + 8);
+
+    /* Check log_address_range is not 0,~0 or base */
+    g_assert(log_address_range != base);
+    g_assert(log_address_range != 0);
+    g_assert(log_address_range != ~0ULL);
+
+    /* Check log_address_length is ERST_RECORD_SIZE */
+    g_assert(log_address_length == (8 * 1024));
+
+    /* Check log_address_attr is 0 */
+    g_assert(log_address_attr == 0);
+
+    qtest_quit(qts);
+}
+
+int main(int argc, char **argv)
+{
+    int ret;
+
+    ret = boot_sector_init(disk);
+    if (ret) {
+        return ret;
+    }
+
+    g_test_init(&argc, &argv, NULL);
+
+    qtest_add_func("/erst/get-error-log-address-range",
+                   erst_get_error_log_address_range);
+
+    ret = g_test_run();
+    boot_sector_cleanup(disk);
+
+    return ret;
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 0c76738..deae443 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -66,6 +66,7 @@ qtests_i386 = \
   (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
   (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) +   \
   (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +                 \
+  (config_all_devices.has_key('CONFIG_ACPI') ? ['erst-test'] : []) +                 \
   qtests_pci +                                                                              \
   ['fdc-test',
    'ide-test',
@@ -237,6 +238,7 @@ qtests = {
   'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
   'cdrom-test': files('boot-sector.c'),
   'dbus-vmstate-test': files('migration-helpers.c') + dbus_vmstate1,
+  'erst-test': files('erst-test.c', 'boot-sector.c', 'acpi-utils.c'),
   'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
   'migration-test': files('migration-helpers.c'),
   'pxe-test': files('boot-sector.c'),
-- 
1.8.3.1
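
For readers following acpi_find_erst() in the patch above: the offset 56
passed to memcpy() is where the 64-bit Address of the first entry's
Register Region sits. The sketch below derives it, assuming the ERST
layout from the ACPI specification (a 36-byte table header, a 12-byte
serialization header, and instruction entries that begin with four
1-byte fields followed by a Generic Address Structure). The helper and
constant names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field sizes assumed from the ACPI ERST table layout. */
enum {
    ACPI_TABLE_HEADER_LEN = 36, /* standard ACPI table header */
    ERST_SER_HEADER_LEN   = 12, /* header size, reserved, entry count */
    ERST_ENTRY_PREFIX_LEN = 4,  /* action, instruction, flags, reserved */
    GAS_ADDRESS_OFFSET    = 4   /* space id, bit width, bit offset, access size */
};

/* Byte offset of the Address field in the first entry's Register Region. */
size_t erst_first_entry_address_offset(void)
{
    return ACPI_TABLE_HEADER_LEN + ERST_SER_HEADER_LEN +
           ERST_ENTRY_PREFIX_LEN + GAS_ADDRESS_OFFSET;
}

/*
 * Hypothetical helper: read that address from a raw ERST blob.
 * ACPI tables are little-endian, so the bytes are assembled by hand
 * and the result does not depend on host byte order.
 * Returns 0 if the blob is too short.
 */
uint64_t erst_first_entry_address(const uint8_t *table, size_t len)
{
    size_t off = erst_first_entry_address_offset();
    uint64_t addr = 0;
    size_t i;

    assert(table != NULL);
    if (len < off + sizeof(addr)) {
        return 0;
    }
    for (i = 0; i < sizeof(addr); i++) {
        addr |= (uint64_t)table[off + i] << (8 * i);
    }
    return addr;
}
```

Naming the offset this way would also let the test fail cleanly on a
truncated table rather than read past its end.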




* [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (8 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 09/10] ACPI ERST: qtest for ERST Eric DeVolder
@ 2021-06-30 19:07 ` Eric DeVolder
  2021-07-20 13:24   ` Igor Mammedov
  2021-07-13 20:38 ` [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Michael S. Tsirkin
  2021-07-20 14:57 ` Igor Mammedov
  11 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, konrad.wilk, pbonzini, imammedo, boris.ostrovsky, rth

Following the guidelines in tests/qtest/bios-tables-test.c, this
is step 6, the re-generated ACPI tables binary blobs.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 tests/data/acpi/microvm/ERST                | Bin 0 -> 976 bytes
 tests/data/acpi/pc/ERST                     | Bin 0 -> 976 bytes
 tests/data/acpi/q35/ERST                    | Bin 0 -> 976 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |   4 ----
 4 files changed, 4 deletions(-)

diff --git a/tests/data/acpi/microvm/ERST b/tests/data/acpi/microvm/ERST
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..db2adaa8d9b45e295f9976d6bb5a07a813214f52 100644
GIT binary patch
literal 976
zcmaKqTMmLS5Jd+l50TdfOjv?(1qNf{pGN#}aW2XoVQ=kJawAMa;r8^<4tl<ik9Q&x
z9fs@aGWNsscIs_KB7$e!_x3{VFxa)yyAdhW<ewtq@KMTR;_(*;o)AYwsc#_I{R=ny
z8zx&whJ53fsGoYS{%e8zX-SD^^!|)F8lIhx`_IYG$#;3?dmQ>N$k#r!KbMbUbUyg_
zK(;pcerufGzoGM$#7pML|ITms2HKLp#iT7ge?`3d;=pU-HFM;Z{u=Td@?Bo=v9u+>
bCEw+h{yXwJ@?BooAHQFxe`xQi@1uMGuJKX<

literal 0
HcmV?d00001

diff --git a/tests/data/acpi/pc/ERST b/tests/data/acpi/pc/ERST
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7236018951f9d111d8cacaa93ee07a8dc3294f18 100644
GIT binary patch
literal 976
zcmaKqSq_3Q6h#Y^dE9^rOK=GWV&b1h{BUvZ#VzQD#NN_}<VJW2!{zj}Jj(Gp++KlF
z-m^RRr=jicm%cUSDW!0a>)srw9ZqJfYH@yl5T!<U;}M6C67CcCCp`0jI3h}X4Z*CR
z@cQFuhiLM(wSRu-xcHA1F8zhXBbq;Aj)oWS$Nk6T$K>0*@ExA}PsmTmxA~y7^f&wF
z`=C;Mzb#Jlr!;>?JY$ah@BPi%Ksot29-5N<Er=Hro_R^UWRASiUqyaJzRfE>hSucQ
c<lDT_e?xvlzRfG^WB(fYq22#4zMDpU0r#ed0RR91

literal 0
HcmV?d00001

diff --git a/tests/data/acpi/q35/ERST b/tests/data/acpi/q35/ERST
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7236018951f9d111d8cacaa93ee07a8dc3294f18 100644
GIT binary patch
literal 976
zcmaKqSq_3Q6h#Y^dE9^rOK=GWV&b1h{BUvZ#VzQD#NN_}<VJW2!{zj}Jj(Gp++KlF
z-m^RRr=jicm%cUSDW!0a>)srw9ZqJfYH@yl5T!<U;}M6C67CcCCp`0jI3h}X4Z*CR
z@cQFuhiLM(wSRu-xcHA1F8zhXBbq;Aj)oWS$Nk6T$K>0*@ExA}PsmTmxA~y7^f&wF
z`=C;Mzb#Jlr!;>?JY$ah@BPi%Ksot29-5N<Er=Hro_R^UWRASiUqyaJzRfE>hSucQ
c<lDT_e?xvlzRfG^WB(fYq22#4zMDpU0r#ed0RR91

literal 0
HcmV?d00001

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index e004c71..dfb8523 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,5 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/ERST",
-"tests/data/acpi/q35/ERST",
-"tests/data/acpi/microvm/ERST",
-
-- 
1.8.3.1




* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-06-30 19:07 ` [PATCH v5 02/10] ACPI ERST: specification for ERST support Eric DeVolder
@ 2021-06-30 19:26   ` Eric DeVolder
  2021-07-19 15:02     ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-06-30 19:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, mst, Konrad Wilk, pbonzini, imammedo, Boris Ostrovsky, rth

[-- Attachment #1: Type: text/plain, Size: 7492 bytes --]

Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
rather than "non-NVRAM mode", which contradicts everything I stated prior.
Eric.
________________________________
From: Eric DeVolder <eric.devolder@oracle.com>
Sent: Wednesday, June 30, 2021 2:07 PM
To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support

Information on the implementation of the ACPI ERST support.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)
 create mode 100644 docs/specs/acpi_erst.txt

diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
new file mode 100644
index 0000000..79f8eb9
--- /dev/null
+++ b/docs/specs/acpi_erst.txt
@@ -0,0 +1,152 @@
+ACPI ERST DEVICE
+================
+
+The ACPI ERST device is utilized to support the ACPI Error Record
+Serialization Table, ERST, functionality. The functionality is
+designed for storing error records in persistent storage for
+future reference/debugging.
+
+The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
+(APEI)", and specifically subsection "Error Serialization", outlines
+a method for storing error records into persistent storage.
+
+The format of error records is described in the UEFI specification[2],
+in Appendix N "Common Platform Error Record".
+
+While the ACPI specification allows for an NVRAM "mode" (see
+GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
+directly exposed for direct access by the OS/guest, this implements
+the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
+by most BIOS (since flash memory requires programming operations
+in order to update its contents). Furthermore, as of the time of this
+writing, Linux does not support the non-NVRAM "mode".
+
+
+Background/Motivation
+---------------------
+Linux uses the persistent storage filesystem, pstore, to record
+information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
+independent of, and runs before, kdump.  In certain scenarios (ie.
+hosts/guests with root filesystems on NFS/iSCSI where networking
+software and/or hardware fails), pstore may contain the only
+information available for post-mortem debugging.
+
+Two common storage backends for the pstore filesystem are ACPI ERST
+and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
+all guests. With QEMU supporting ACPI ERST, it becomes a viable
+pstore storage backend for virtual machines (as it is now for
+bare metal machines).
+
+Enabling support for ACPI ERST facilitates a consistent method to
+capture kernel panic information in a wide range of guests: from
+resource-constrained microvms to very large guests, and in
+particular, in direct-boot environments (which would lack UEFI
+run-time services).
+
+Note that Microsoft Windows also utilizes the ACPI ERST for certain
+crash information, if available.
+
+
+Invocation
+----------
+
+To utilize ACPI ERST, a memory-backend-file object and acpi-erst
+device must be created, for example:
+
+ qemu ...
+ -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
+  size=0x10000,share=on
+ -device acpi-erst,memdev=erstnvram
+
+For proper operation, the ACPI ERST device needs a memory-backend-file
+object with the following parameters:
+
+ - id: The id of the memory-backend-file object is used to associate
+   this memory with the acpi-erst device.
+ - size: The size of the ACPI ERST backing storage. This parameter is
+   required.
+ - mem-path: The location of the ACPI ERST backing storage file. This
+   parameter is also required.
+ - share: The share=on parameter is required so that updates to the
+   ERST back store are written to the file immediately as well. Without
+   it, updates to the backing file are unpredictable and may not
+   properly persist (eg. if qemu should crash).
+
+The ACPI ERST device is a simple PCI device, and requires this one
+parameter:
+
+ - memdev: Is the object id of the memory-backend-file.
+
+
+PCI Interface
+-------------
+
+The ERST device is a PCI device with two BARs, one for accessing
+the programming registers, and the other for accessing the
+record exchange buffer.
+
+BAR0 contains the programming interface consisting of just two
+64-bit registers. The two registers are an ACTION (cmd) and a
+VALUE (data). All ERST actions/operations/side effects happen
+on the write to the ACTION, by design. Thus any data needed
+by the action must be placed into VALUE prior to writing
+ACTION. Reading the VALUE simply returns the register contents,
+which can be updated by a previous ACTION. This behavior is
+encoded in the ACPI ERST table generated by QEMU.
+
+BAR1 contains the record exchange buffer, and the size of this
+buffer sets the maximum record size. This record exchange
+buffer size is 8KiB.
+
+Backing File
+------------
+
+The ACPI ERST persistent storage is contained within a single backing
+file. The size and location of the backing file is specified upon
+QEMU startup of the ACPI ERST device.
+
+Records are stored in the backing file in a simple fashion.
+The backing file is essentially divided into fixed size
+"slots", ERST_RECORD_SIZE in length, with each "slot"
+storing a single record. No attempt at optimizing storage
+through compression, compaction, etc is attempted.
+NOTE that any change to this value will make any pre-
+existing backing files, not of the same ERST_RECORD_SIZE,
+unusable to the guest.
+
+Below is an example layout of the backing store file.
+The size of the file is a multiple of ERST_RECORD_SIZE,
+and contains N number of "slots" to store records. The
+example below shows two records (in CPER format) in the
+backing file, while the remaining slots are empty/
+available.
+
+ Slot   Record
+        +--------------------------------------------+
+    0   | empty/available                            |
+        +--------------------------------------------+
+    1   | CPER                                       |
+        +--------------------------------------------+
+    2   | CPER                                       |
+        +--------------------------------------------+
+  ...   |                                            |
+        +--------------------------------------------+
+    N   | empty/available                            |
+        +--------------------------------------------+
+        <-------------- ERST_RECORD_SIZE ------------>
+
+Not all slots need to be occupied, and they need not be
+occupied in a contiguous fashion. The ability to clear/erase
+specific records allows for the formation of unoccupied
+slots.
+
+
+References
+----------
+
+[1] "Advanced Configuration and Power Interface Specification",
+    version 4.0, June 2009.
+
+[2] "Unified Extensible Firmware Interface Specification",
+    version 2.1, October 2008.
+
--
1.8.3.1


[-- Attachment #2: Type: text/html, Size: 10670 bytes --]


* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (9 preceding siblings ...)
  2021-06-30 19:07 ` [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c Eric DeVolder
@ 2021-07-13 20:38 ` Michael S. Tsirkin
  2021-07-21 15:23   ` Eric DeVolder
  2021-07-20 14:57 ` Igor Mammedov
  11 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2021-07-13 20:38 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, konrad.wilk, qemu-devel, pbonzini, imammedo,
	boris.ostrovsky, rth

On Wed, Jun 30, 2021 at 03:07:11PM -0400, Eric DeVolder wrote:
> =============================
> I believe I have corrected for all feedback on v4, but with
> responses to certain feedback below.
> 
> In patch 1/6, Igor asks:
> "you are adding empty template files here
> but the later matching bios-tables-test is nowhere to be found
> Was testcase lost somewhere along the way?
> 
> also it seems you add ERST only to pc/q35,
> so why tests/data/acpi/microvm/ERST is here?"
> 
> I did miss setting up microvm. That has been corrected.
> 
> As for the question about lost test cases, if you are referring
> to the new binary blobs for pc,q35, those were in patch
> 6/6. There is a qtest in patch 5/6. If I don't understand the
> question, please indicate as such.
> 
> 
> In patch 3/6, Igor asks:
> "Also spec (ERST) is rather (maybe intentionally) vague on specifics,
> so it would be better that before a patch that implements hw part
> were a doc patch describing concrete implementation. As model
> you can use docs/specs/acpi_hest_ghes.rst or other docs/specs/acpi_* files.
> I'd start posting/discussing that spec within these thread
> to avoid spamming list until doc is settled up."
> 
> I'm thinking that this cover letter is the bulk of the spec? But as
> you say, to avoid spamming the group, we can use this thread to make
> suggested changes to this cover letter which I will then convert
> into a spec, for v6.
> 
> 
> In patch 3/6, in many places Igor mentions utilizing the hostmem
> mapped directly in the guest in order to avoid need-less copying.
> 
> It is true that the ERST has an "NVRAM" mode that would allow for
> all the simplifications Igor points out, however, Linux does not
> support this mode. This mode puts the burden of managing the NVRAM
> space on the OS. So this implementation, like BIOS, is the non-NVRAM
> mode.
> 
> I did go ahead and separate the registers from the exchange buffer,
> which would facilitate the support of NVRAM mode.
> 
>  linux/drivers/acpi/apei/erst.c:
>  /* NVRAM ERST Error Log Address Range is not supported yet */
>  static void pr_unimpl_nvram(void)
>  {
>     if (printk_ratelimit())
>         pr_warn("NVRAM ERST Log Address Range not implemented yet.\n");
>  }
> 
>  static int __erst_write_to_nvram(const struct cper_record_header *record)
>  {
>     /* do not print message, because printk is not safe for NMI */
>     return -ENOSYS;
>  }
> 
>  static int __erst_read_to_erange_from_nvram(u64 record_id, u64 *offset)
>  {
>     pr_unimpl_nvram();
>     return -ENOSYS;
>  }
> 
>  static int __erst_clear_from_nvram(u64 record_id)
>  {
>     pr_unimpl_nvram();
>     return -ENOSYS;
>  }
> 
> =============================
> 
> This patchset introduces support for the ACPI Error Record
> Serialization Table, ERST.
> 
> For background and implementation information, please see
> docs/specs/acpi_erst.txt, which is patch 2/10.
> 
> Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>


../hw/acpi/erst.c: In function ‘build_erst’:
../hw/acpi/erst.c:754:13: error: this statement may fall through [-Werror=implicit-fallthrough=]
  754 |             build_serialization_instruction_entry(table_data, action,
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  755 |                 ACPI_ERST_INST_READ_REGISTER       , 0, 64,
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  756 |                 s->bar0 + ERST_CSR_VALUE, 0, MASK64);
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../hw/acpi/erst.c:757:9: note: here
  757 |         default:
      |         ^~~~~~~
cc1: all warnings being treated as errors


Pls correct.
mingw32 build also failed. Pls take a look.
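
For context, -Wimplicit-fallthrough fires when a case body can run into
the next label without a break. The sketch below shows the usual shape
of the fix, assuming the fall through at erst.c:754 was unintended (the
surrounding code is not shown in the error output, so this is only a
guess). All names are hypothetical.

```c
/* Hypothetical action codes, for illustration only. */
enum demo_action {
    DEMO_ACTION_GET_VALUE = 1,
    DEMO_ACTION_UNKNOWN   = 2
};

/* Returns the number of entries emitted, or -1 for an unhandled action. */
int demo_entries_for_action(enum demo_action action)
{
    int entries = 0;

    switch (action) {
    case DEMO_ACTION_GET_VALUE:
        entries += 1;   /* stands in for one entry-building call */
        break;          /* without this, control falls into default */
    default:
        entries = -1;   /* unhandled action */
        break;
    }
    return entries;
}
```

If the fall through is intended instead, a "/* fall through */" comment
placed directly before the next label is recognized by GCC at its
default warning level and silences the diagnostic.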


Thanks!


> ---
> v5: 30jun2021
>  - Create docs/specs/acpi_erst.txt, per Igor
>  - Separate PCI BARs for registers and memory, per Igor
>  - Convert debugging to use trace infrastructure, per Igor
>  - Various other fixups, per Igor
> 
> v4: 11jun2021
>  - Converted to a PCI device, per Igor.
>  - Updated qtest.
>  - Rearranged patches, per Igor.
> 
> v3: 28may2021
>  - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
>    internal array with explicit file operations, per Igor.
>  - Changed the way the qdev and base address are handled, allowing
>    ERST to be disabled at run-time. Also aligns better with other
>    existing code.
> 
> v2: 8feb2021
>  - Added qtest/smoke test per Paolo Bonzini
>  - Split patch into smaller chunks, per Igor Mammedov
>  - Did away with use of ACPI packed structures, per Igor Mammedov
> 
> v1: 26oct2020
>  - initial post
> 
> ---
> 
> Eric DeVolder (10):
>   ACPI ERST: bios-tables-test.c steps 1 and 2
>   ACPI ERST: specification for ERST support
>   ACPI ERST: PCI device_id for ERST
>   ACPI ERST: header file for ERST
>   ACPI ERST: support for ACPI ERST feature
>   ACPI ERST: build the ACPI ERST table
>   ACPI ERST: trace support
>   ACPI ERST: create ACPI ERST table for pc/x86 machines.
>   ACPI ERST: qtest for ERST
>   ACPI ERST: step 6 of bios-tables-test.c
> 
>  docs/specs/acpi_erst.txt     | 152 +++++++
>  hw/acpi/erst.c               | 918 +++++++++++++++++++++++++++++++++++++++++++
>  hw/acpi/meson.build          |   1 +
>  hw/acpi/trace-events         |  14 +
>  hw/i386/acpi-build.c         |   9 +
>  hw/i386/acpi-microvm.c       |   9 +
>  include/hw/acpi/erst.h       |  84 ++++
>  include/hw/pci/pci.h         |   1 +
>  tests/data/acpi/microvm/ERST | Bin 0 -> 976 bytes
>  tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
>  tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
>  tests/qtest/erst-test.c      | 129 ++++++
>  tests/qtest/meson.build      |   2 +
>  13 files changed, 1319 insertions(+)
>  create mode 100644 docs/specs/acpi_erst.txt
>  create mode 100644 hw/acpi/erst.c
>  create mode 100644 include/hw/acpi/erst.h
>  create mode 100644 tests/data/acpi/microvm/ERST
>  create mode 100644 tests/data/acpi/pc/ERST
>  create mode 100644 tests/data/acpi/q35/ERST
>  create mode 100644 tests/qtest/erst-test.c
> 
> -- 
> 1.8.3.1




* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-06-30 19:26   ` Eric DeVolder
@ 2021-07-19 15:02     ` Igor Mammedov
  2021-07-21 15:42       ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-19 15:02 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, Konrad Wilk, qemu-devel, pbonzini,
	Boris Ostrovsky, Eric Blake, rth

On Wed, 30 Jun 2021 19:26:39 +0000
Eric DeVolder <eric.devolder@oracle.com> wrote:

> Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
> rather than "non-NVRAM mode", which contradicts everything I stated prior.
> Eric.
> ________________________________
> From: Eric DeVolder <eric.devolder@oracle.com>
> Sent: Wednesday, June 30, 2021 2:07 PM
> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
> Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support
> 
> Information on the implementation of the ACPI ERST support.
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 152 insertions(+)
>  create mode 100644 docs/specs/acpi_erst.txt
> 
> diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
> new file mode 100644
> index 0000000..79f8eb9
> --- /dev/null
> +++ b/docs/specs/acpi_erst.txt
> @@ -0,0 +1,152 @@
> +ACPI ERST DEVICE
> +================
> +
> +The ACPI ERST device is utilized to support the ACPI Error Record
> +Serialization Table, ERST, functionality. The functionality is
> +designed for storing error records in persistent storage for
> +future reference/debugging.
> +
> +The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
> +(APEI)", and specifically subsection "Error Serialization", outlines
> +a method for storing error records into persistent storage.
> +
> +The format of error records is described in the UEFI specification[2],
> +in Appendix N "Common Platform Error Record".
> +
> +While the ACPI specification allows for an NVRAM "mode" (see
> +GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
> +directly exposed for direct access by the OS/guest, this implements
> +the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
> +by most BIOS (since flash memory requires programming operations
> +in order to update its contents). Furthermore, as of the time of this
> +writing, Linux does not support the non-NVRAM "mode".

shouldn't it be s/non-NVRAM/NVRAM/ ?

> +
> +
> +Background/Motivation
> +---------------------
> +Linux uses the persistent storage filesystem, pstore, to record
> +information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
> +independent of, and runs before, kdump.  In certain scenarios (ie.
> +hosts/guests with root filesystems on NFS/iSCSI where networking
> +software and/or hardware fails), pstore may contain the only
> +information available for post-mortem debugging.

well,
it's not the only way: one can use the existing pvpanic device to notify
the mgmt layer about a crash, and the mgmt layer can then take appropriate
measures for post-mortem debugging, including dumping guest state.
That is superior to anything pstore can offer, as the VM still exists
and the mgmt layer can inspect the VM's crashed state directly or dump
the necessary parts of it.

So ERST shouldn't be portrayed as the only way here, but rather
as a limited alternative to pvpanic with regard to post-mortem debugging
(it's the only way only on bare metal).

It would be better to describe here the other use cases you've mentioned
in earlier reviews that justify adding an alternative to pvpanic.

> +Two common storage backends for the pstore filesystem are ACPI ERST
> +and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
> +all guests. With QEMU supporting ACPI ERST, it becomes a viable
> +pstore storage backend for virtual machines (as it is now for
> +bare metal machines).
> +

> +Enabling support for ACPI ERST facilitates a consistent method to
> +capture kernel panic information in a wide range of guests: from
> +resource-constrained microvms to very large guests, and in
> +particular, in direct-boot environments (which would lack UEFI
> +run-time services).
this hunk is probably not necessary

> +
> +Note that Microsoft Windows also utilizes the ACPI ERST for certain
> +crash information, if available.
a pointer to a relevant source would be helpful here.

> +Invocation
s/^^/Configuration|Usage/

> +----------
> +
> +To utilize ACPI ERST, a memory-backend-file object and acpi-erst
s/utilize/use/

> +device must be created, for example:
s/must/can/

> +
> + qemu ...
> + -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
> +  size=0x10000,share=on
I'd put ^^^ on the same line as -object and use '\' at the end of the
line so the example can be easily copy-pasted

> + -device acpi-erst,memdev=erstnvram
> +
> +For proper operation, the ACPI ERST device needs a memory-backend-file
> +object with the following parameters:
> +
> + - id: The id of the memory-backend-file object is used to associate
> +   this memory with the acpi-erst device.
> + - size: The size of the ACPI ERST backing storage. This parameter is
> +   required.
> + - mem-path: The location of the ACPI ERST backing storage file. This
> +   parameter is also required.
> + - share: The share=on parameter is required so that updates to the
> +   ERST back store are written to the file immediately as well. Without
> +   it, updates to the backing file are unpredictable and may not
> +   properly persist (eg. if qemu should crash).

mmap manpage says:
  MAP_SHARED
             Updates to the mapping ... are carried through to the underlying file.
It doesn't guarantee 'written to the file immediately', though.
So I'd rephrase it to something like this:

- share: The share=on parameter is required so that updates to the ERST back store
         are written back to the file.
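
To illustrate the distinction: with MAP_SHARED a store reaches the file
eventually, and an explicit msync() is what waits for the write-back.
The following is a minimal POSIX sketch of that behaviour. It is not
how QEMU's memory-backend-file is implemented, and the function name is
hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy len bytes to offset 0 of a MAP_SHARED mapping of path and flush
 * them to the file. The file must already be at least map_len bytes.
 * Returns 0 on success, -1 on error.
 */
int shared_store(const char *path, size_t map_len,
                 const void *data, size_t len)
{
    int ret = -1;
    int fd;
    void *p;

    assert(path != NULL && data != NULL);
    if (len > map_len) {
        return -1;
    }
    fd = open(path, O_RDWR);
    if (fd < 0) {
        return -1;
    }
    p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED) {
        memcpy(p, data, len);
        /*
         * MAP_SHARED alone only promises that the store is carried
         * through to the file at some point; msync(MS_SYNC) blocks
         * until the write-back has completed.
         */
        ret = msync(p, map_len, MS_SYNC);
        munmap(p, map_len);
    }
    close(fd);
    return ret;
}
```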

> +
> +The ACPI ERST device is a simple PCI device, and requires this one
> +parameter:
s/^.*:/and ERST device:/

> +
> + - memdev: Is the object id of the memory-backend-file.
> +
> +
> +PCI Interface
> +-------------
> +
> +The ERST device is a PCI device with two BARs, one for accessing
> +the programming registers, and the other for accessing the
> +record exchange buffer.
> +
> +BAR0 contains the programming interface consisting of just two
> +64-bit registers. The two registers are an ACTION (cmd) and a
> +VALUE (data). All ERST actions/operations/side effects happen
s/consisting of... All ERST/consisting of ACTION and VALUE 64-bit registers. All ERST/

> +on the write to the ACTION, by design. Thus any data needed
s/Thus//

> +by the action must be placed into VALUE prior to writing
> +ACTION. Reading the VALUE simply returns the register contents,
> +which can be updated by a previous ACTION.

> This behavior is
> +encoded in the ACPI ERST table generated by QEMU.
it's too vague. Either drop the sentence or add a reference to the relevant place in the spec.


> +
> +BAR1 contains the record exchange buffer, and the size of this
> +buffer sets the maximum record size. This record exchange
> +buffer size is 8KiB.
s/^^^/
BAR1 contains the 8KiB record exchange buffer, which is the implemented maximum record size limit.


> +Backing File

s/^^^/Backing Storage Format/

> +------------


> +
> +The ACPI ERST persistent storage is contained within a single backing
> +file. The size and location of the backing file is specified upon
> +QEMU startup of the ACPI ERST device.

I'd drop the above paragraph and describe the file format here;
ultimately the backend used doesn't have to be a file. For
example, if the user doesn't need it to persist over QEMU restarts,
a ram backend could be used, and the guest will still be able to see
its own crash log after the guest is rebooted. Or it could be a
memfd backend passed to QEMU by the mgmt layer.


> +Records are stored in the backing file in a simple fashion.
s/backing file/backend storage/
ditto for other occurrences

> +The backing file is essentially divided into fixed size
> +"slots", ERST_RECORD_SIZE in length, with each "slot"
> +storing a single record.

> No attempt at optimizing storage
> +through compression, compaction, etc is attempted.
s/^^^//

> +NOTE that any change to this value will make any pre-
> +existing backing files, not of the same ERST_RECORD_SIZE,
> +unusable to the guest.
When can that happen? Can we detect it and error out?


> +Below is an example layout of the backing store file.
> +The size of the file is a multiple of ERST_RECORD_SIZE,
> +and contains N number of "slots" to store records. The
> +example below shows two records (in CPER format) in the
> +backing file, while the remaining slots are empty/
> +available.
> +
> + Slot   Record
> +        +--------------------------------------------+
> +    0   | empty/available                            |
> +        +--------------------------------------------+
> +    1   | CPER                                       |
> +        +--------------------------------------------+
> +    2   | CPER                                       |
> +        +--------------------------------------------+
> +  ...   |                                            |
> +        +--------------------------------------------+
> +    N   | empty/available                            |
> +        +--------------------------------------------+
> +        <-------------- ERST_RECORD_SIZE ------------>


> +Not all slots need to be occupied, and they need not be
> +occupied in a contiguous fashion. The ability to clear/erase
> +specific records allows for the formation of unoccupied
> +slots.
I'd drop this as not necessary


> +
> +
> +References
> +----------
> +
> +[1] "Advanced Configuration and Power Interface Specification",
> +    version 4.0, June 2009.
> +
> +[2] "Unified Extensible Firmware Interface Specification",
> +    version 2.1, October 2008.
> +
> --
> 1.8.3.1
> 




* Re: [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST
  2021-06-30 19:07 ` [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST Eric DeVolder
@ 2021-07-19 15:06   ` Igor Mammedov
  2021-07-21 15:42     ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-19 15:06 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:14 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> This change declares the PCI device_id for the new ACPI ERST

s/This change declares/Reserve/

> device.
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  include/hw/pci/pci.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 6be4e0c..eef3ef4 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -108,6 +108,7 @@ extern bool pci_available;
>  #define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
>  #define PCI_DEVICE_ID_REDHAT_NVME        0x0010
>  #define PCI_DEVICE_ID_REDHAT_PVPANIC     0x0011
> +#define PCI_DEVICE_ID_REDHAT_ACPI_ERST   0x0012
>  #define PCI_DEVICE_ID_REDHAT_QXL         0x0100
>  
>  #define FMT_PCIBUS                      PRIx64




* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-06-30 19:07 ` [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature Eric DeVolder
@ 2021-07-20 12:17   ` Igor Mammedov
  2021-07-21 16:07     ` Eric DeVolder
  2021-07-21 17:36     ` Eric DeVolder
  0 siblings, 2 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 12:17 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:16 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> This change implements the support for the ACPI ERST feature.
Drop this

> 
> This implements a PCI device for ACPI ERST. This implments the
s/implments/implements/

> non-NVRAM "mode" of operation for ERST.
Add here why the non-NVRAM "mode" is implemented.

Also, even though this is a non-NVRAM implementation, there is still
a lot of unnecessary data copying (see below), so drop it
or justify why it's there.
 
> This change also includes erst.c in the build of general ACPI support.
Drop this as well


> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  hw/acpi/erst.c      | 704 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/acpi/meson.build |   1 +
>  2 files changed, 705 insertions(+)
>  create mode 100644 hw/acpi/erst.c
> 
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> new file mode 100644
> index 0000000..6e9bd2e
> --- /dev/null
> +++ b/hw/acpi/erst.c
> @@ -0,0 +1,704 @@
> +/*
> + * ACPI Error Record Serialization Table, ERST, Implementation
> + *
> + * Copyright (c) 2021 Oracle and/or its affiliates.
> + *
> + * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
> + * ACPI Platform Error Interfaces : Error Serialization
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation;
> + * version 2 of the License.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
> + */
> +
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <unistd.h>
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/qdev-core.h"
> +#include "exec/memory.h"
> +#include "qom/object.h"
> +#include "hw/pci/pci.h"
> +#include "qom/object_interfaces.h"
> +#include "qemu/error-report.h"
> +#include "migration/vmstate.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/acpi/acpi.h"
> +#include "hw/acpi/acpi-defs.h"
> +#include "hw/acpi/aml-build.h"
> +#include "hw/acpi/bios-linker-loader.h"
> +#include "exec/address-spaces.h"
> +#include "sysemu/hostmem.h"
> +#include "hw/acpi/erst.h"
> +#include "trace.h"
> +
> +/* UEFI 2.1: Append N Common Platform Error Record */
> +#define UEFI_CPER_RECORD_MIN_SIZE 128U
> +#define UEFI_CPER_RECORD_LENGTH_OFFSET 20U
> +#define UEFI_CPER_RECORD_ID_OFFSET 96U
> +#define IS_UEFI_CPER_RECORD(ptr) \
> +    (((ptr)[0] == 'C') && \
> +     ((ptr)[1] == 'P') && \
> +     ((ptr)[2] == 'E') && \
> +     ((ptr)[3] == 'R'))
> +#define THE_UEFI_CPER_RECORD_ID(ptr) \
> +    (*(uint64_t *)(&(ptr)[UEFI_CPER_RECORD_ID_OFFSET]))
> +
> +/*
> + * This implementation is an ACTION (cmd) and VALUE (data)
> + * interface consisting of just two 64-bit registers.
> + */
> +#define ERST_REG_SIZE (2UL * sizeof(uint64_t))

> +#define ERST_CSR_ACTION (0UL << 3) /* action (cmd) */
> +#define ERST_CSR_VALUE  (1UL << 3) /* argument/value (data) */
What does CSR mean?
Looking at the patch, both should be called ERST_[ACTION|VALUE]_OFFSET.
Please use explicit offset values instead of shifting bits.
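
Something along these lines is what I have in mind. The names follow the suggestion above and are not in the posted patch; the helper is only there to show how the offsets would be used and is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Explicit byte offsets within BAR0 (same values as the shifts in the patch). */
#define ERST_ACTION_OFFSET 0x0 /* ACTION (cmd) register */
#define ERST_VALUE_OFFSET  0x8 /* VALUE (data) register */
#define ERST_REG_SIZE      (ERST_VALUE_OFFSET + sizeof(uint64_t))

/*
 * Hypothetical helper: true if a 'size'-byte access at 'addr' falls
 * entirely inside the 64-bit register that starts at 'reg_offset'.
 */
static int erst_access_in_register(uint64_t addr, unsigned size,
                                   uint64_t reg_offset)
{
    return addr >= reg_offset &&
           addr + size <= reg_offset + sizeof(uint64_t);
}
```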


> +/*
> + * ERST_RECORD_SIZE is the buffer size for exchanging ERST
> + * record contents. Thus, it defines the maximum record size.
> + * As this is mapped through a PCI BAR, it must be a power of
> + * two, and should be at least PAGE_SIZE.
> + * Records are stored in the backing file in a simple fashion.
> + * The backing file is essentially divided into fixed size
> + * "slots", ERST_RECORD_SIZE in length, with each "slot"
> + * storing a single record. No attempt at optimizing storage
> + * through compression, compaction, etc is attempted.
> + * NOTE that any change to this value will make any pre-
> + * existing backing files, not of the same ERST_RECORD_SIZE,
> + * unusable to the guest.
> + */
> +/* 8KiB records, not too small, not too big */
> +#define ERST_RECORD_SIZE (2UL * 4096)
> +
> +#define ERST_INVALID_RECORD_ID (~0UL)
> +#define ERST_EXECUTE_OPERATION_MAGIC 0x9CUL
> +
> +/*
> + * Object cast macro
> + */
> +#define ACPIERST(obj) \
> +    OBJECT_CHECK(ERSTDeviceState, (obj), TYPE_ACPI_ERST)
> +
> +/*
> + * Main ERST device state structure
> + */
> +typedef struct {
> +    PCIDevice parent_obj;
> +
> +    HostMemoryBackend *hostmem;
> +    MemoryRegion *hostmem_mr;
> +
> +    MemoryRegion iomem; /* programming registes */
> +    MemoryRegion nvmem; /* exchange buffer */
> +    uint32_t prop_size;
s/^^^/storage_size/

> +    hwaddr bar0; /* programming registers */
> +    hwaddr bar1; /* exchange buffer */
Why do you need to keep these addresses around?
I suggest dropping these fields and using local variables or pci_get_bar_addr() at the call site.

> +
> +    uint8_t operation;
> +    uint8_t busy_status;
> +    uint8_t command_status;
> +    uint32_t record_offset;
> +    uint32_t record_count;
> +    uint64_t reg_action;
> +    uint64_t reg_value;
> +    uint64_t record_identifier;
> +
> +    unsigned next_record_index;


> +    uint8_t record[ERST_RECORD_SIZE]; /* read/written directly by guest */
> +    uint8_t tmp_record[ERST_RECORD_SIZE]; /* intermediate manipulation buffer */
drop these see [**] below

> +
> +} ERSTDeviceState;
> +
> +/*******************************************************************/
> +/*******************************************************************/
> +
> +static unsigned copy_from_nvram_by_index(ERSTDeviceState *s, unsigned index)
> +{
> +    /* Read an nvram entry into tmp_record */
> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
> +    off_t offset = (index * ERST_RECORD_SIZE);
> +
> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
> +        if (s->hostmem_mr) {
> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
> +            memcpy(s->tmp_record, p + offset, ERST_RECORD_SIZE);
> +            rc = ACPI_ERST_STATUS_SUCCESS;
> +        }
> +    }
> +    return rc;
> +}
> +
> +static unsigned copy_to_nvram_by_index(ERSTDeviceState *s, unsigned index)
> +{
> +    /* Write entry in tmp_record into nvram, and backing file */
> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
> +    off_t offset = (index * ERST_RECORD_SIZE);
> +
> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
> +        if (s->hostmem_mr) {
> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
> +            memcpy(p + offset, s->tmp_record, ERST_RECORD_SIZE);
> +            rc = ACPI_ERST_STATUS_SUCCESS;
> +        }
> +    }
> +    return rc;
> +}
> +
> +static int lookup_erst_record_by_identifier(ERSTDeviceState *s,
> +    uint64_t record_identifier, bool *record_found, bool alloc_for_write)
> +{
> +    int rc = -1;
> +    int empty_index = -1;
> +    int index = 0;
> +    unsigned rrc;
> +
> +    *record_found = 0;
> +
> +    do {
> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);

You have direct access to the backend memory, so there is no need
whatsoever to copy records from it to an intermediate buffer
everywhere. Almost all operations on records can be done
in place, except for the EXECUTE_OPERATION action in BEGIN_[READ|WRITE]
context, where a record is moved between the backend and the guest buffer.

So please eliminate all unnecessary copying.
(For fun, time the operations with the backend size set to some huge
value to see how expensive this code is.)
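
A rough standalone sketch of an in-place lookup, to illustrate the point. Assumptions: 'base' stands in for the pointer returned by memory_region_get_ram_ptr(), SLOT_SIZE and CPER_ID_OFFSET mirror ERST_RECORD_SIZE and UEFI_CPER_RECORD_ID_OFFSET from the patch, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE      8192u /* mirrors ERST_RECORD_SIZE */
#define CPER_ID_OFFSET 96u   /* mirrors UEFI_CPER_RECORD_ID_OFFSET */

/* Read the record identifier from a slot without assuming alignment. */
static uint64_t slot_record_id(const uint8_t *slot)
{
    uint64_t id;
    memcpy(&id, slot + CPER_ID_OFFSET, sizeof(id));
    return id;
}

/*
 * Scan the backend storage in place; only the 4-byte signature and the
 * 8-byte identifier of each slot are inspected, nothing is copied to a
 * temporary record buffer.
 * Returns the slot index, or -1 if no CPER record has that identifier.
 */
static long find_slot_in_place(const uint8_t *base, size_t storage_size,
                               uint64_t record_id)
{
    size_t nslots = storage_size / SLOT_SIZE;

    for (size_t i = 0; i < nslots; i++) {
        const uint8_t *slot = base + i * SLOT_SIZE;

        if (memcmp(slot, "CPER", 4) == 0 &&
            slot_record_id(slot) == record_id) {
            return (long)i;
        }
    }
    return -1;
}
```

Compared with the loop in the patch, each iteration touches 12 bytes of a slot instead of copying all 8KiB of it.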

> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
> +            uint64_t this_identifier;
> +            this_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
> +            if (IS_UEFI_CPER_RECORD(s->tmp_record) &&
> +                (this_identifier == record_identifier)) {
> +                rc = index;
> +                *record_found = 1;
> +                break;
> +            }
> +            if ((this_identifier == ERST_INVALID_RECORD_ID) &&
> +                (empty_index < 0)) {
> +                empty_index = index; /* first available for write */
> +            }
> +        }
> +        ++index;
> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
> +
> +    /* Record not found, allocate for writing */
> +    if ((rc < 0) && alloc_for_write) {
> +        rc = empty_index;
> +    }
> +
> +    return rc;
> +}
> +
> +static unsigned clear_erst_record(ERSTDeviceState *s)
> +{
> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
> +    bool record_found;
> +    int index;
> +
> +    index = lookup_erst_record_by_identifier(s,
> +        s->record_identifier, &record_found, 0);
> +    if (record_found) {
> +        memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
> +        rc = copy_to_nvram_by_index(s, (unsigned)index);
> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
> +            s->record_count -= 1;
> +        }
> +    }
> +
> +    return rc;
> +}
> +
> +static unsigned write_erst_record(ERSTDeviceState *s)
> +{
> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
> +
> +    if (s->record_offset < (ERST_RECORD_SIZE - UEFI_CPER_RECORD_MIN_SIZE)) {
> +        uint64_t record_identifier;
> +        uint8_t *record = &s->record[s->record_offset];
> +        bool record_found;
> +        int index;
> +
> +        record_identifier = (s->record_identifier == ERST_INVALID_RECORD_ID)
> +            ? THE_UEFI_CPER_RECORD_ID(record) : s->record_identifier;
> +
> +        index = lookup_erst_record_by_identifier(s,
> +            record_identifier, &record_found, 1);
> +        if (index < 0) {
> +            rc = ACPI_ERST_STATUS_NOT_ENOUGH_SPACE;
> +        } else {
> +            if (0 != s->record_offset) {
> +                memset(&s->tmp_record[ERST_RECORD_SIZE - s->record_offset],
> +                    0xFF, s->record_offset);
> +            }
> +            memcpy(s->tmp_record, record, ERST_RECORD_SIZE - s->record_offset);
> +            rc = copy_to_nvram_by_index(s, (unsigned)index);
> +            if (rc == ACPI_ERST_STATUS_SUCCESS) {
> +                if (!record_found) { /* not overwriting existing record */
> +                    s->record_count += 1; /* writing new record */
> +                }
> +            }
> +        }
> +    }
> +
> +    return rc;
> +}
> +
> +static unsigned next_erst_record(ERSTDeviceState *s,
> +    uint64_t *record_identifier)
> +{
> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
> +    unsigned index;
> +    unsigned rrc;
> +
> +    *record_identifier = ERST_INVALID_RECORD_ID;
> +
> +    index = s->next_record_index;
> +    do {
> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);
> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
> +            if (IS_UEFI_CPER_RECORD(s->tmp_record)) {
> +                s->next_record_index = index + 1; /* where to start next time */
> +                *record_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
> +                rc = ACPI_ERST_STATUS_SUCCESS;
> +                break;
> +            }
> +            ++index;
> +        } else {
> +            if (s->next_record_index == 0) {
> +                rc = ACPI_ERST_STATUS_RECORD_STORE_EMPTY;
> +            }
> +            s->next_record_index = 0; /* at end, reset */
> +        }
> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
> +
> +    return rc;
> +}
> +
> +static unsigned read_erst_record(ERSTDeviceState *s)
> +{
> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
> +    bool record_found;
> +    int index;
> +
> +    index = lookup_erst_record_by_identifier(s,
> +        s->record_identifier, &record_found, 0);
> +    if (record_found) {
> +        rc = copy_from_nvram_by_index(s, (unsigned)index);
> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
> +            if (s->record_offset < ERST_RECORD_SIZE) {
> +                memcpy(&s->record[s->record_offset], s->tmp_record,
> +                    ERST_RECORD_SIZE - s->record_offset);
> +            }
> +        }
> +    }
> +
> +    return rc;
> +}
> +
> +static unsigned get_erst_record_count(ERSTDeviceState *s)
> +{
> +    /* Compute record_count */
> +    unsigned index = 0;
> +
> +    s->record_count = 0;
> +    while (copy_from_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
> +        uint8_t *ptr = &s->tmp_record[0];
> +        uint64_t record_identifier = THE_UEFI_CPER_RECORD_ID(ptr);
> +        if (IS_UEFI_CPER_RECORD(ptr) &&
> +            (ERST_INVALID_RECORD_ID != record_identifier)) {
> +            s->record_count += 1;
> +        }
> +        ++index;
> +    }
> +    return s->record_count;
> +}
> +
> +/*******************************************************************/
> +
> +static uint64_t erst_rd_reg64(hwaddr addr,
> +    uint64_t reg, unsigned size)
> +{
> +    uint64_t rdval;
> +    uint64_t mask;
> +    unsigned shift;
> +
> +    if (size == sizeof(uint64_t)) {
> +        /* 64b access */
> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> +        shift = 0;
> +    } else {
> +        /* 32b access */
> +        mask = 0x00000000FFFFFFFFUL;
> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> +    }
> +
> +    rdval = reg;
> +    rdval >>= shift;
> +    rdval &= mask;
> +
> +    return rdval;
> +}
> +
> +static uint64_t erst_wr_reg64(hwaddr addr,
> +    uint64_t reg, uint64_t val, unsigned size)
> +{
> +    uint64_t wrval;
> +    uint64_t mask;
> +    unsigned shift;
> +
> +    if (size == sizeof(uint64_t)) {
> +        /* 64b access */
> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> +        shift = 0;
> +    } else {
> +        /* 32b access */
> +        mask = 0x00000000FFFFFFFFUL;
> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> +    }
> +
> +    val &= mask;
> +    val <<= shift;
> +    mask <<= shift;
> +    wrval = reg;
> +    wrval &= ~mask;
> +    wrval |= val;
> +
> +    return wrval;
> +}
(I see in the next patch that it's us defining the access width in the ACPI tables.)
So the question is: do we have to support mixed register access widths?
Can't all register accesses be 64-bit?

> +static void erst_reg_write(void *opaque, hwaddr addr,
> +    uint64_t val, unsigned size)
> +{
> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> +
> +    /*
> +     * NOTE: All actions/operations/side effects happen on the WRITE,
> +     * by design. The READs simply return the reg_value contents.
> +     */
> +    trace_acpi_erst_reg_write(addr, val, size);
> +
> +    switch (addr) {
> +    case ERST_CSR_VALUE + 0:
> +    case ERST_CSR_VALUE + 4:
> +        s->reg_value = erst_wr_reg64(addr, s->reg_value, val, size);
> +        break;
> +    case ERST_CSR_ACTION + 0:
> +/*  case ERST_CSR_ACTION+4: as coded, not really a 64b register */
> +        switch (val) {
> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> +        case ACPI_ERST_ACTION_END_OPERATION:
> +            s->operation = val;
> +            break;
> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
> +            s->record_offset = s->reg_value;
> +            break;
> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
> +            if ((uint8_t)s->reg_value == ERST_EXECUTE_OPERATION_MAGIC) {
> +                s->busy_status = 1;
> +                switch (s->operation) {
> +                case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
> +                    s->command_status = write_erst_record(s);
> +                    break;
> +                case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> +                    s->command_status = read_erst_record(s);
> +                    break;
> +                case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> +                    s->command_status = clear_erst_record(s);
> +                    break;
> +                case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
> +                    break;
> +                case ACPI_ERST_ACTION_END_OPERATION:
> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
> +                    break;
> +                default:
> +                    s->command_status = ACPI_ERST_STATUS_FAILED;
> +                    break;
> +                }
> +                s->record_identifier = ERST_INVALID_RECORD_ID;
> +                s->busy_status = 0;
> +            }
> +            break;
> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
> +            s->reg_value = s->busy_status;
> +            break;
> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
> +            s->reg_value = s->command_status;
> +            break;
> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
> +            s->command_status = next_erst_record(s, &s->reg_value);
> +            break;
> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
> +            s->record_identifier = s->reg_value;
> +            break;
> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
> +            s->reg_value = s->record_count;
> +            break;
> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
> +            s->reg_value = s->bar1;
> +            break;
> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
> +            s->reg_value = ERST_RECORD_SIZE;
> +            break;
> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
> +            s->reg_value = 0x0; /* intentional, not NVRAM mode */
> +            break;
> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
> +            /*
> +             * 100UL is max, 10UL is nominal
100/10 of what? Also add a reference to the spec/table it comes from
and explain in the comment why these values were chosen.
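
For reference, as I read the serialization actions table in the ACPI spec, GET_EXECUTE_OPERATION_TIMINGS returns an encoded QWORD with the maximum time in bits [63:32] and the nominal time in bits [31:0], both in microseconds. A small sketch that makes the units explicit (the helper names are hypothetical, please double check the units against the spec before copying this into the comment):

```c
#include <assert.h>
#include <stdint.h>

/* Pack max/nominal EXECUTE_OPERATION times (microseconds) into one QWORD. */
static uint64_t erst_encode_timings(uint32_t max_us, uint32_t nominal_us)
{
    return ((uint64_t)max_us << 32) | nominal_us;
}

/* Bits [63:32]: maximum time, in microseconds. */
static uint32_t erst_timings_max_us(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Bits [31:0]: nominal time, in microseconds. */
static uint32_t erst_timings_nominal_us(uint64_t v)
{
    return (uint32_t)v;
}
```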

> +             */
> +            s->reg_value = ((100UL << 32) | (10UL << 0));
> +            break;
> +        case ACPI_ERST_ACTION_RESERVED:
not necessary, it will be handled by 'default:' 

> +        default:
> +            /*
> +             * Unknown action/command, NOP
> +             */
> +            break;
> +        }
> +        break;
> +    default:
> +        /*
> +         * This should not happen, but if it does, NOP
> +         */
> +        break;
> +    }
> +}
> +
> +static uint64_t erst_reg_read(void *opaque, hwaddr addr,
> +                                unsigned size)
> +{
> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> +    uint64_t val = 0;
> +
> +    switch (addr) {
> +    case ERST_CSR_ACTION + 0:
> +    case ERST_CSR_ACTION + 4:
> +        val = erst_rd_reg64(addr, s->reg_action, size);
> +        break;
> +    case ERST_CSR_VALUE + 0:
> +    case ERST_CSR_VALUE + 4:
> +        val = erst_rd_reg64(addr, s->reg_value, size);
> +        break;
> +    default:
> +        break;
> +    }
> +    trace_acpi_erst_reg_read(addr, val, size);
> +    return val;
> +}
> +
> +static const MemoryRegionOps erst_reg_ops = {
> +    .read = erst_reg_read,
> +    .write = erst_reg_write,
> +    .endianness = DEVICE_NATIVE_ENDIAN,
> +};
> +
> +static void erst_mem_write(void *opaque, hwaddr addr,
> +    uint64_t val, unsigned size)
> +{
> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;

> +    uint8_t *ptr = &s->record[addr - 0];
> +    trace_acpi_erst_mem_write(addr, val, size);
> +    switch (size) {
> +    default:
> +    case sizeof(uint8_t):
> +        *(uint8_t *)ptr = (uint8_t)val;
> +        break;
> +    case sizeof(uint16_t):
> +        *(uint16_t *)ptr = (uint16_t)val;
> +        break;
> +    case sizeof(uint32_t):
> +        *(uint32_t *)ptr = (uint32_t)val;
> +        break;
> +    case sizeof(uint64_t):
> +        *(uint64_t *)ptr = (uint64_t)val;
> +        break;
> +    }
> +}
> +
> +static uint64_t erst_mem_read(void *opaque, hwaddr addr,
> +                                unsigned size)
> +{
> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> +    uint8_t *ptr = &s->record[addr - 0];
> +    uint64_t val = 0;
> +    switch (size) {
> +    default:
> +    case sizeof(uint8_t):
> +        val = *(uint8_t *)ptr;
> +        break;
> +    case sizeof(uint16_t):
> +        val = *(uint16_t *)ptr;
> +        break;
> +    case sizeof(uint32_t):
> +        val = *(uint32_t *)ptr;
> +        break;
> +    case sizeof(uint64_t):
> +        val = *(uint64_t *)ptr;
> +        break;
> +    }
> +    trace_acpi_erst_mem_read(addr, val, size);
> +    return val;
> +}
> +
> +static const MemoryRegionOps erst_mem_ops = {
> +    .read = erst_mem_read,
> +    .write = erst_mem_write,
> +    .endianness = DEVICE_NATIVE_ENDIAN,
> +};
> +
> +/*******************************************************************/
> +/*******************************************************************/
> +
> +static const VMStateDescription erst_vmstate  = {
> +    .name = "acpi-erst",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT8(operation, ERSTDeviceState),
> +        VMSTATE_UINT8(busy_status, ERSTDeviceState),
> +        VMSTATE_UINT8(command_status, ERSTDeviceState),
> +        VMSTATE_UINT32(record_offset, ERSTDeviceState),
> +        VMSTATE_UINT32(record_count, ERSTDeviceState),
> +        VMSTATE_UINT64(reg_action, ERSTDeviceState),
> +        VMSTATE_UINT64(reg_value, ERSTDeviceState),
> +        VMSTATE_UINT64(record_identifier, ERSTDeviceState),
> +        VMSTATE_UINT8_ARRAY(record, ERSTDeviceState, ERST_RECORD_SIZE),
> +        VMSTATE_UINT8_ARRAY(tmp_record, ERSTDeviceState, ERST_RECORD_SIZE),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
> +{
> +    ERSTDeviceState *s = ACPIERST(pci_dev);
> +    unsigned index = 0;
> +    bool share;
> +
> +    trace_acpi_erst_realizefn_in();
> +
> +    if (!s->hostmem) {
> +        error_setg(errp, "'" ACPI_ERST_MEMDEV_PROP "' property is not set");
> +        return;
> +    } else if (host_memory_backend_is_mapped(s->hostmem)) {
> +        error_setg(errp, "can't use already busy memdev: %s",
> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
> +        return;
> +    }
> +
> +    share = object_property_get_bool(OBJECT(s->hostmem), "share", &error_fatal);
s/&error_fatal/errp/

> +    if (!share) {
> +        error_setg(errp, "ACPI ERST requires hostmem property share=on: %s",
> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
> +    }
This limits the usable backends to file|memfd only, so
I wonder if we really need this limitation. What if the user doesn't
care about preserving the contents across QEMU restarts? (For example, the
use case where the storage is a means to troubleshoot a guest crash,
i.e. QEMU is not restarted in between.)

Maybe instead of enforcing it we should just document that if the user
wishes to preserve the contents, they should use a file|memfd backend with
the share=on option.

> +
> +    s->hostmem_mr = host_memory_backend_get_memory(s->hostmem);
> +
> +    /* HostMemoryBackend size will be multiple of PAGE_SIZE */
> +    s->prop_size = object_property_get_int(OBJECT(s->hostmem), "size", &error_fatal);
s/&error_fatal/errp/

> +
> +    /* Convert prop_size to integer multiple of ERST_RECORD_SIZE */
> +    s->prop_size -= (s->prop_size % ERST_RECORD_SIZE);

Please, no fixups on behalf of the user. If the size is not what it should be,
error out with a suggestion on how to fix it.
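
Roughly like the standalone sketch below. In the device itself the message would go through error_setg(errp, ...) followed by a return from realize; here a plain buffer stands in for errp, and the function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ERST_RECORD_SIZE 8192u /* same value as in the patch */

/*
 * Reject a backend whose size is not a non-zero multiple of the record
 * size, instead of silently rounding it down. On failure, 'err' receives
 * a message that tells the user what size is expected.
 */
static bool erst_check_storage_size(uint64_t size, char *err, size_t errlen)
{
    if (size == 0 || size % ERST_RECORD_SIZE != 0) {
        snprintf(err, errlen,
                 "memdev size %llu must be a non-zero multiple of %u bytes",
                 (unsigned long long)size, ERST_RECORD_SIZE);
        return false;
    }
    return true;
}
```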

> +
> +    /*
> +     * MemoryBackend initializes contents to zero, but we actually
> +     * want contents initialized to 0xFF, ERST_INVALID_RECORD_ID.
> +     */
> +    if (copy_from_nvram_by_index(s, 0) == ACPI_ERST_STATUS_SUCCESS) {
> +        if (s->tmp_record[0] == 0x00) {
> +            memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
This doesn't scale.
(Set the backend size to more than host physical RAM, put it on slow storage and have fun.)

Is it possible to use 0 as the invalid record id, or to change the storage format
so you would not have to rewrite the whole file at startup? (Maybe some sort
of metadata header/records book-keeping table before the actual records,
and initialize the file only if the header is invalid.)
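
To illustrate the header idea, a purely hypothetical sketch follows. The layout, the magic value and all names are invented here, and the header is stored in host byte order for brevity; a real format would need to pin the endianness. A fresh zero-filled backend fails the magic check, so only the header is written at startup, and a record size mismatch is detected instead of leaving the storage unusable to the guest.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary magic for this sketch; not taken from any spec. */
#define SKETCH_MAGIC 0x5453524554535245ULL

typedef struct {
    uint64_t magic;        /* identifies initialized storage */
    uint32_t record_size;  /* slot size the storage was created with */
    uint32_t record_count; /* number of occupied slots */
} SketchErstHeader;

typedef enum {
    SKETCH_OK,          /* valid header found, contents kept */
    SKETCH_INITIALIZED, /* no valid header, a fresh one was written */
    SKETCH_MISMATCH     /* valid header, but for another record size */
} SketchOpenResult;

/* Check (and if needed create) the header at the start of the storage. */
static SketchOpenResult sketch_open_storage(uint8_t *base,
                                            uint32_t record_size)
{
    SketchErstHeader h;

    memcpy(&h, base, sizeof(h));
    if (h.magic != SKETCH_MAGIC) {
        /* Only the header is written; the slots are left untouched. */
        h.magic = SKETCH_MAGIC;
        h.record_size = record_size;
        h.record_count = 0;
        memcpy(base, &h, sizeof(h));
        return SKETCH_INITIALIZED;
    }
    if (h.record_size != record_size) {
        return SKETCH_MISMATCH; /* caller should error out */
    }
    return SKETCH_OK;
}
```

With such a header, empty slots could simply be those that lack the "CPER" signature, so zero-filled storage would not need to be rewritten with 0xFF.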

> +            index = 0;
> +            while (copy_to_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
> +                ++index;
> +            }
Also, the back-and-forth copying here is not really necessary.

> +        }
> +    }
> +
> +    /* Initialize record_count */
> +    get_erst_record_count(s);
why not put it into reset?

> +
> +    /* BAR 0: Programming registers */
> +    memory_region_init_io(&s->iomem, OBJECT(pci_dev), &erst_reg_ops, s,
> +                          TYPE_ACPI_ERST, ERST_REG_SIZE);
> +    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->iomem);
> +

> +    /* BAR 1: Exchange buffer memory */
> +    memory_region_init_io(&s->nvmem, OBJECT(pci_dev), &erst_mem_ops, s,
> +                          TYPE_ACPI_ERST, ERST_RECORD_SIZE);
> +    pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->nvmem);

**)
Instead of using mmio for the buffer, where each write causes
a guest exit to QEMU, map the memory region directly into the guest.
See ivshmem_bar2; the only difference from ivshmem is that you'd
create the memory region manually (for example, you can use
memory_region_init_resizeable_ram).

This way you can speed up access and drop erst_mem_ops and the
[tmp_]record intermediate buffers.

Instead of using [tmp_]record, you can copy the record content
directly between the buffer and backend memory regions.
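
A standalone sketch of such a direct copy for the write path. Assumptions: 'exchange' stands in for the RAM behind BAR1, 'backend' for the hostmem RAM pointer, SLOT_SIZE mirrors ERST_RECORD_SIZE, and the function name is made up. The tail padding with 0xFF matches what write_erst_record() does today via tmp_record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 8192u /* mirrors ERST_RECORD_SIZE */

/*
 * Copy the record that starts at 'record_offset' in the exchange buffer
 * straight into slot 'index' of the backend, with no intermediate buffer.
 * The unused tail of the slot is filled with 0xFF.
 */
static bool store_record_direct(uint8_t *backend, size_t storage_size,
                                size_t index, const uint8_t *exchange,
                                size_t record_offset)
{
    if (record_offset >= SLOT_SIZE || index >= storage_size / SLOT_SIZE) {
        return false; /* out of bounds, nothing is written */
    }

    uint8_t *slot = backend + index * SLOT_SIZE;
    size_t len = SLOT_SIZE - record_offset;

    memcpy(slot, exchange + record_offset, len);
    memset(slot + len, 0xFF, record_offset);
    return true;
}
```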

> +    /*
> +     * The vmstate_register_ram_global() puts the memory in
> +     * migration stream, where it is written back to the memory
> +     * upon reaching the destination, which causes the backing
> +     * file to be updated (with share=on).
> +     */
> +    vmstate_register_ram_global(s->hostmem_mr);
> +
> +    trace_acpi_erst_realizefn_out(s->prop_size);
> +}
> +
> +static void erst_reset(DeviceState *dev)
> +{
> +    ERSTDeviceState *s = ACPIERST(dev);
> +
> +    trace_acpi_erst_reset_in(s->record_count);
> +    s->operation = 0;
> +    s->busy_status = 0;
> +    s->command_status = ACPI_ERST_STATUS_SUCCESS;

> +    /* indicate empty/no-more until further notice */
Please rephrase; I'm not sure what it's trying to say.

> +    s->record_identifier = ERST_INVALID_RECORD_ID;
> +    s->record_offset = 0;
> +    s->next_record_index = 0;

> +    /* NOTE: record_count and nvram are initialized elsewhere */
> +    trace_acpi_erst_reset_out(s->record_count);
> +}
> +
> +static Property erst_properties[] = {
> +    DEFINE_PROP_LINK(ACPI_ERST_MEMDEV_PROP, ERSTDeviceState, hostmem,
> +                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void erst_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    trace_acpi_erst_class_init_in();
> +    k->realize = erst_realizefn;
> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> +    k->device_id = PCI_DEVICE_ID_REDHAT_ACPI_ERST;
> +    k->revision = 0x00;
> +    k->class_id = PCI_CLASS_OTHERS;
> +    dc->reset = erst_reset;
> +    dc->vmsd = &erst_vmstate;
> +    dc->user_creatable = true;
> +    device_class_set_props(dc, erst_properties);
> +    dc->desc = "ACPI Error Record Serialization Table (ERST) device";
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> +    trace_acpi_erst_class_init_out();
> +}
> +
> +static const TypeInfo erst_type_info = {
> +    .name          = TYPE_ACPI_ERST,
> +    .parent        = TYPE_PCI_DEVICE,
> +    .class_init    = erst_class_init,
> +    .instance_size = sizeof(ERSTDeviceState),
> +    .interfaces = (InterfaceInfo[]) {
> +        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
what is this for here?

> +        { }
> +    }
> +};
> +
> +static void erst_register_types(void)
> +{
> +    type_register_static(&erst_type_info);
> +}
> +
> +type_init(erst_register_types)
> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
> index dd69577..262a8ee 100644
> --- a/hw/acpi/meson.build
> +++ b/hw/acpi/meson.build
> @@ -4,6 +4,7 @@ acpi_ss.add(files(
>    'aml-build.c',
>    'bios-linker-loader.c',
>    'utils.c',
> +  'erst.c',
>  ))
>  acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
>  acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))




* Re: [PATCH v5 07/10] ACPI ERST: trace support
  2021-06-30 19:07 ` [PATCH v5 07/10] ACPI ERST: trace support Eric DeVolder
@ 2021-07-20 13:15   ` Igor Mammedov
  2021-07-21 16:14     ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 13:15 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:18 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> Provide the definitions needed to support tracing in ACPI ERST.
Trace points should be introduced in the patches that first use them;
as it stands now, the series breaks bisection.

> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  hw/acpi/trace-events | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events
> index dcc1438..a5c2755 100644
> --- a/hw/acpi/trace-events
> +++ b/hw/acpi/trace-events
> @@ -55,3 +55,17 @@ piix4_gpe_writeb(uint64_t addr, unsigned width, uint64_t val) "addr: 0x%" PRIx64
>  # tco.c
>  tco_timer_reload(int ticks, int msec) "ticks=%d (%d ms)"
>  tco_timer_expired(int timeouts_no, bool strap, bool no_reboot) "timeouts_no=%d no_reboot=%d/%d"
> +
> +# erst.c
> +acpi_erst_reg_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
> +acpi_erst_reg_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%04" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
> +acpi_erst_mem_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%06" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
> +acpi_erst_mem_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%06" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
> +acpi_erst_pci_bar_0(uint64_t addr) "BAR0: 0x%016" PRIx64
> +acpi_erst_pci_bar_1(uint64_t addr) "BAR1: 0x%016" PRIx64
> +acpi_erst_realizefn_in(void)
> +acpi_erst_realizefn_out(unsigned size) "total nvram size %u bytes"
> +acpi_erst_reset_in(unsigned record_count) "record_count %u"
> +acpi_erst_reset_out(unsigned record_count) "record_count %u"
> +acpi_erst_class_init_in(void)
> +acpi_erst_class_init_out(void)




* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-06-30 19:07 ` [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table Eric DeVolder
@ 2021-07-20 13:16   ` Igor Mammedov
  2021-07-20 14:59     ` Igor Mammedov
  2021-07-21 16:12     ` Eric DeVolder
  0 siblings, 2 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 13:16 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:17 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> This code is called from the machine code (if ACPI supported)
> to generate the ACPI ERST table.
should be along the lines of:
This builds the ACPI ERST table /spec ref/ to inform OSPM
how to communicate with ... device.

> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  hw/acpi/erst.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 214 insertions(+)
> 
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> index 6e9bd2e..1f1dbbc 100644
> --- a/hw/acpi/erst.c
> +++ b/hw/acpi/erst.c
> @@ -555,6 +555,220 @@ static const MemoryRegionOps erst_mem_ops = {
>  /*******************************************************************/
>  /*******************************************************************/
>  
> +/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
> +static void build_serialization_instruction_entry(GArray *table_data,
> +    uint8_t serialization_action,
> +    uint8_t instruction,
> +    uint8_t flags,
> +    uint8_t register_bit_width,
> +    uint64_t register_address,
> +    uint64_t value,
> +    uint64_t mask)
As I mentioned in the previous patch, this could be simplified
a lot if it's possible to use fixed 64-bit access with every
action, along with a mask of the same width.
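
To illustrate the simplification suggested above, here is a minimal sketch
(plain C, not QEMU code) of what an entry builder could look like if every
register access were 64 bits wide. The helper names put_le() and
erst_entry_fixed64() are hypothetical, and whether the device and OSPM can
actually use 64-bit access for every action is an assumption that still has
to be confirmed. With the width fixed, the bit-width switch and the per-call
mask argument go away:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ERST_ENTRY_SIZE = 32 };  /* 4 + 12 (GAS) + 8 (Value) + 8 (Mask) */

/* Store 'val' as 'size' little-endian bytes at 'p'; return the next position. */
static uint8_t *put_le(uint8_t *p, uint64_t val, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        *p++ = (uint8_t)(val >> (8 * i));
    }
    return p;
}

/*
 * Hypothetical simplified builder: every register access is 64 bits wide,
 * so the bit width, access size and mask are constants, not arguments.
 */
static size_t erst_entry_fixed64(uint8_t out[ERST_ENTRY_SIZE],
                                 uint8_t action, uint8_t instruction,
                                 uint8_t flags, uint64_t reg_addr,
                                 uint64_t value)
{
    uint8_t *p = out;

    p = put_le(p, action, 1);       /* Serialization Action */
    p = put_le(p, instruction, 1);  /* Instruction */
    p = put_le(p, flags, 1);        /* Flags */
    p = put_le(p, 0, 1);            /* Reserved */
    /* Register Region: Generic Address Structure, 12 bytes */
    p = put_le(p, 0, 1);            /* Address Space ID: System Memory */
    p = put_le(p, 64, 1);           /* Register Bit Width */
    p = put_le(p, 0, 1);            /* Register Bit Offset */
    p = put_le(p, 4, 1);            /* Access Size: 4 = QWord */
    p = put_le(p, reg_addr, 8);     /* Address */
    p = put_le(p, value, 8);        /* Value */
    p = put_le(p, UINT64_MAX, 8);   /* Mask: all 64 bits */

    assert((size_t)(p - out) == ERST_ENTRY_SIZE);
    return (size_t)(p - out);
}
```

Every call site then passes only the action, instruction, flags, register
address and value.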

> +{
> +    /* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
> +    struct AcpiGenericAddress gas;
> +
> +    /* Serialization Action */
> +    build_append_int_noprefix(table_data, serialization_action, 1);
> +    /* Instruction */
> +    build_append_int_noprefix(table_data, instruction         , 1);
> +    /* Flags */
> +    build_append_int_noprefix(table_data, flags               , 1);
> +    /* Reserved */
> +    build_append_int_noprefix(table_data, 0                   , 1);
> +    /* Register Region */
> +    gas.space_id = AML_SYSTEM_MEMORY;
> +    gas.bit_width = register_bit_width;
> +    gas.bit_offset = 0;
> +    switch (register_bit_width) {
> +    case 8:
> +        gas.access_width = 1;
> +        break;
> +    case 16:
> +        gas.access_width = 2;
> +        break;
> +    case 32:
> +        gas.access_width = 3;
> +        break;
> +    case 64:
> +        gas.access_width = 4;
> +        break;
> +    default:
> +        gas.access_width = 0;
> +        break;
> +    }
> +    gas.address = register_address;
> +    build_append_gas_from_struct(table_data, &gas);
> +    /* Value */
> +    build_append_int_noprefix(table_data, value  , 8);
> +    /* Mask */
> +    build_append_int_noprefix(table_data, mask   , 8);
> +}
> +
> +/* ACPI 4.0: 17.4.1 Serialization Action Table */
> +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
> +    const char *oem_id, const char *oem_table_id)
> +{
> +    ERSTDeviceState *s = ACPIERST(erst_dev);

globals are not welcomed in new code,
pass erst_dev as argument here (ex: find_vmgenid_dev)

> +    unsigned action;
> +    unsigned erst_start = table_data->len;
> +

> +    s->bar0 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
> +    trace_acpi_erst_pci_bar_0(s->bar0);
> +    s->bar1 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);

Just store pci_get_bar_addr(PCI_DEVICE(erst_dev), 0) in a local variable;
BAR 1 is not used in this function, so you don't need it here.


> +    trace_acpi_erst_pci_bar_1(s->bar1);
> +
> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> +    /* serialization_header_length */
Comments documenting table entries should be a verbatim copy from the spec;
see build_amd_iommu() as an example of the preferred style.

> +    build_append_int_noprefix(table_data, 48, 4);
> +    /* reserved */
> +    build_append_int_noprefix(table_data,  0, 4);
> +    /*
> +     * instruction_entry_count - changes to the number of serialization
> +     * instructions in the ACTIONs below must be reflected in this
> +     * pre-computed value.
> +     */
> +    build_append_int_noprefix(table_data, 29, 4);
This is a bit fragile, as it can easily diverge from the actual number later on.
Maybe instead of building the instruction entries in place, build them
in a separate array and, when done, use the actual count to fill in
instruction_entry_count.
Pseudo code could look like:

     /* prepare instructions in advance because ... */
     GArray *table_instruction_data = g_array_new(false, true, 1);
     build_serialization_instruction_entry(table_instruction_data, ...);
     ...
     build_serialization_instruction_entry(table_instruction_data, ...);
     /* instructions count */
     build_append_int_noprefix(table_data,
                               table_instruction_data->len / entry_size, 4);
     /* copy prepared in advance instructions */
     g_array_append_vals(table_data, table_instruction_data->data,
                         table_instruction_data->len);
     g_array_free(table_instruction_data, true);
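
The same idea as a self-contained sketch, assuming a plain byte buffer in
place of GArray. ByteBuf, buf_append(), append_entry() and
emit_instructions() are made-up names for illustration, and the entries are
placeholders rather than real instructions. The count written to the table
is derived from the bytes actually produced, so it cannot drift:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { ENTRY_SIZE = 32 };   /* size of one Serialization Instruction Entry */

/* Minimal growable byte buffer, standing in for a GArray of bytes. */
typedef struct {
    uint8_t *data;
    size_t len;
} ByteBuf;

static void buf_append(ByteBuf *b, const void *src, size_t n)
{
    uint8_t *p;

    if (n == 0) {
        return;
    }
    p = realloc(b->data, b->len + n);
    assert(p != NULL);
    memcpy(p + b->len, src, n);
    b->data = p;
    b->len += n;
}

/* Append 'val' as a 'size'-byte little-endian integer. */
static void buf_append_int(ByteBuf *b, uint64_t val, size_t size)
{
    uint8_t tmp[8];

    assert(size <= sizeof(tmp));
    for (size_t i = 0; i < size; i++) {
        tmp[i] = (uint8_t)(val >> (8 * i));
    }
    buf_append(b, tmp, size);
}

/* Placeholder entry: only the action byte is filled in, the rest is zero. */
static void append_entry(ByteBuf *b, uint8_t action)
{
    uint8_t entry[ENTRY_SIZE] = { action };

    buf_append(b, entry, sizeof(entry));
}

/* Emit the count field, then the entries; returns the count written. */
static uint32_t emit_instructions(ByteBuf *table,
                                  const uint8_t *actions, size_t n)
{
    ByteBuf insns = { NULL, 0 };
    uint32_t count;

    /* Build all entries in a scratch buffer first. */
    for (size_t i = 0; i < n; i++) {
        append_entry(&insns, actions[i]);
    }
    count = (uint32_t)(insns.len / ENTRY_SIZE);
    buf_append_int(table, count, 4);            /* Instruction Entry Count */
    buf_append(table, insns.data, insns.len);   /* the entries themselves */
    free(insns.data);
    return count;
}
```

Adding or dropping an entry only touches the list of calls; the count field
follows automatically.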
   

> +
> +#define MASK8  0x00000000000000FFUL
> +#define MASK16 0x000000000000FFFFUL
> +#define MASK32 0x00000000FFFFFFFFUL
> +#define MASK64 0xFFFFFFFFFFFFFFFFUL
> +
> +    for (action = 0; action < ACPI_ERST_MAX_ACTIONS; ++action) {
I'd unroll this loop and just code the entries directly in the required order.
Also drop the reserved and nop actions/instructions, or explain why they are necessary.
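
A rough sketch of what the unrolled form could look like, in plain C rather
than against the QEMU helpers. Insn, InsnList, emit() and the three pattern
helpers are hypothetical names, the register offsets 0 and 8 are assumed
from what the qtest in this series uses, and only four actions are shown.
Each action becomes one line, and the three recurring instruction patterns
are named once:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Instruction and action codes as numbered in the ACPI ERST chapter. */
enum { READ_REGISTER = 0x0, WRITE_REGISTER = 0x2, WRITE_REGISTER_VALUE = 0x3 };
enum {
    BEGIN_WRITE_OPERATION = 0x0,
    END_OPERATION         = 0x3,
    SET_RECORD_OFFSET     = 0x4,
    GET_RECORD_COUNT      = 0xA,
};

/* Assumed register offsets within BAR0. */
enum { CSR_ACTION = 0, CSR_VALUE = 8 };

enum { MAX_INSNS = 64 };

typedef struct {
    uint8_t action;
    uint8_t instruction;
    uint64_t reg;
    uint64_t value;
} Insn;

typedef struct {
    Insn insns[MAX_INSNS];
    size_t count;
} InsnList;

static void emit(InsnList *l, uint8_t action, uint8_t instruction,
                 uint64_t reg, uint64_t value)
{
    assert(l->count < MAX_INSNS);
    l->insns[l->count++] = (Insn){ action, instruction, reg, value };
}

/* Action that only writes its own code to the ACTION register. */
static void action_only(InsnList *l, uint64_t bar0, uint8_t action)
{
    emit(l, action, WRITE_REGISTER_VALUE, bar0 + CSR_ACTION, action);
}

/* OSPM writes an operand to VALUE, then the action code to ACTION. */
static void operand_then_action(InsnList *l, uint64_t bar0, uint8_t action)
{
    emit(l, action, WRITE_REGISTER, bar0 + CSR_VALUE, 0);
    action_only(l, bar0, action);
}

/* Action code goes to ACTION, then OSPM reads the result from VALUE. */
static void action_then_read(InsnList *l, uint64_t bar0, uint8_t action)
{
    action_only(l, bar0, action);
    emit(l, action, READ_REGISTER, bar0 + CSR_VALUE, 0);
}

/* Unrolled: one line per action, in table order, with no loop or switch. */
static void build_actions(InsnList *l, uint64_t bar0)
{
    action_only(l, bar0, BEGIN_WRITE_OPERATION);
    action_only(l, bar0, END_OPERATION);
    operand_then_action(l, bar0, SET_RECORD_OFFSET);
    action_then_read(l, bar0, GET_RECORD_COUNT);
    /* ...remaining actions follow the same three patterns... */
}
```

With no default case there is nothing left to emit NOOP entries, and a
reserved action simply has no line.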

> +        switch (action) {
> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
given these names will/should never be exposed outside of hw/acpi/erst.c
I'd drop ACPI_ERST_ACTION_/ACPI_ERST_INST_ prefixes (i.e. use names as defined in spec)
if it doesn't cause build issues.

> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_END_OPERATION:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
> +                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
> +                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> +            break;
> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
> +                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> +            break;
> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_RESERVED:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            break;
> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> +            break;
> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> +            break;
> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> +            break;
> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> +        default:
> +            build_serialization_instruction_entry(table_data, action,
> +                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
> +            break;
> +        }
> +    }
> +    build_header(linker, table_data,
> +                 (void *)(table_data->data + erst_start),
> +                 "ERST", table_data->len - erst_start,
> +                 1, oem_id, oem_table_id);
> +}
> +
> +/*******************************************************************/
> +/*******************************************************************/
> +
>  static const VMStateDescription erst_vmstate  = {
>      .name = "acpi-erst",
>      .version_id = 1,




* Re: [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines.
  2021-06-30 19:07 ` [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines Eric DeVolder
@ 2021-07-20 13:19   ` Igor Mammedov
  2021-07-21 16:16     ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 13:19 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:19 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> This change exposes ACPI ERST support for x86 guests.
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Looks good to me. Maybe move the find_erst_dev() implementation here as well,
if this is the patch where it's first used.

> ---
>  hw/i386/acpi-build.c   | 9 +++++++++
>  hw/i386/acpi-microvm.c | 9 +++++++++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index de98750..d2026cc 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -43,6 +43,7 @@
>  #include "sysemu/tpm.h"
>  #include "hw/acpi/tpm.h"
>  #include "hw/acpi/vmgenid.h"
> +#include "hw/acpi/erst.h"
>  #include "hw/boards.h"
>  #include "sysemu/tpm_backend.h"
>  #include "hw/rtc/mc146818rtc_regs.h"
> @@ -2327,6 +2328,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>      GArray *tables_blob = tables->table_data;
>      AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
>      Object *vmgenid_dev;
> +    Object *erst_dev;
>      char *oem_id;
>      char *oem_table_id;
>  
> @@ -2388,6 +2390,13 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>                      ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>                      x86ms->oem_table_id);
>  
> +    erst_dev = find_erst_dev();
> +    if (erst_dev) {
> +        acpi_add_table(table_offsets, tables_blob);
> +        build_erst(tables_blob, tables->linker, erst_dev,
> +                   x86ms->oem_id, x86ms->oem_table_id);
> +    }
> +
>      vmgenid_dev = find_vmgenid_dev();
>      if (vmgenid_dev) {
>          acpi_add_table(table_offsets, tables_blob);
> diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
> index ccd3303..0099b13 100644
> --- a/hw/i386/acpi-microvm.c
> +++ b/hw/i386/acpi-microvm.c
> @@ -30,6 +30,7 @@
>  #include "hw/acpi/bios-linker-loader.h"
>  #include "hw/acpi/generic_event_device.h"
>  #include "hw/acpi/utils.h"
> +#include "hw/acpi/erst.h"
>  #include "hw/boards.h"
>  #include "hw/i386/fw_cfg.h"
>  #include "hw/i386/microvm.h"
> @@ -160,6 +161,7 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>      X86MachineState *x86ms = X86_MACHINE(mms);
>      GArray *table_offsets;
>      GArray *tables_blob = tables->table_data;
> +    Object *erst_dev;
>      unsigned dsdt, xsdt;
>      AcpiFadtData pmfadt = {
>          /* ACPI 5.0: 4.1 Hardware-Reduced ACPI */
> @@ -209,6 +211,13 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>                      ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>                      x86ms->oem_table_id);
>  
> +    erst_dev = find_erst_dev();
> +    if (erst_dev) {
> +        acpi_add_table(table_offsets, tables_blob);
> +        build_erst(tables_blob, tables->linker, erst_dev,
> +                   x86ms->oem_id, x86ms->oem_table_id);
> +    }
> +
>      xsdt = tables_blob->len;
>      build_xsdt(tables_blob, tables->linker, table_offsets, x86ms->oem_id,
>                 x86ms->oem_table_id);




* Re: [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c
  2021-06-30 19:07 ` [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c Eric DeVolder
@ 2021-07-20 13:24   ` Igor Mammedov
  2021-07-21 16:19     ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 13:24 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:21 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> Following the guidelines in tests/qtest/bios-tables-test.c, this
> is step 6, the re-generated ACPI tables binary blobs.

Looks like the test case itself got lost somewhere along the way.
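
As a sanity check on the blob sizes in the diffstat below, the arithmetic
can be written down as a small sketch. The constant and function names are
made up for illustration; the sizes come from the standard 36-byte ACPI
table header, the three 4-byte fields that build_erst() appends after it,
and the entry layout in build_serialization_instruction_entry():

```c
#include <assert.h>
#include <stddef.h>

enum {
    ACPI_TABLE_HEADER_SIZE = 36,  /* standard ACPI table header */
    ERST_FIXED_FIELDS_SIZE = 12,  /* header size, reserved, entry count */
    ACPI_GAS_SIZE          = 12,  /* Generic Address Structure */
    ERST_ENTRY_SIZE        = 4 + ACPI_GAS_SIZE + 8 + 8,
};

/* Total table length for a given number of instruction entries. */
static size_t erst_table_size(size_t entry_count)
{
    return ACPI_TABLE_HEADER_SIZE + ERST_FIXED_FIELDS_SIZE
           + entry_count * ERST_ENTRY_SIZE;
}

/* Byte offset of the 64-bit register address inside entry 'index'. */
static size_t erst_entry_address_offset(size_t index)
{
    size_t entry_start = erst_table_size(index);

    assert(ERST_ENTRY_SIZE == 32);
    /* Skip action/instruction/flags/reserved (4) and the first 4 GAS bytes. */
    return entry_start + 4 + 4;
}
```

With the pre-computed count of 29 entries from patch 06/10,
erst_table_size(29) is 48 + 29 * 32 = 976, which matches the blobs, and
erst_entry_address_offset(0) is 56, the offset the qtest in patch 09/10
reads the register address from.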
 
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  tests/data/acpi/microvm/ERST                | Bin 0 -> 976 bytes
>  tests/data/acpi/pc/ERST                     | Bin 0 -> 976 bytes
>  tests/data/acpi/q35/ERST                    | Bin 0 -> 976 bytes
>  tests/qtest/bios-tables-test-allowed-diff.h |   4 ----
>  4 files changed, 4 deletions(-)
> 
> diff --git a/tests/data/acpi/microvm/ERST b/tests/data/acpi/microvm/ERST
> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..db2adaa8d9b45e295f9976d6bb5a07a813214f52 100644
> GIT binary patch
> literal 976
> zcmaKqTMmLS5Jd+l50TdfOjv?(1qNf{pGN#}aW2XoVQ=kJawAMa;r8^<4tl<ik9Q&x
> z9fs@aGWNsscIs_KB7$e!_x3{VFxa)yyAdhW<ewtq@KMTR;_(*;o)AYwsc#_I{R=ny
> z8zx&whJ53fsGoYS{%e8zX-SD^^!|)F8lIhx`_IYG$#;3?dmQ>N$k#r!KbMbUbUyg_
> zK(;pcerufGzoGM$#7pML|ITms2HKLp#iT7ge?`3d;=pU-HFM;Z{u=Td@?Bo=v9u+>
> bCEw+h{yXwJ@?BooAHQFxe`xQi@1uMGuJKX<
> 
> literal 0
> HcmV?d00001
> 
> diff --git a/tests/data/acpi/pc/ERST b/tests/data/acpi/pc/ERST
> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7236018951f9d111d8cacaa93ee07a8dc3294f18 100644
> GIT binary patch
> literal 976
> zcmaKqSq_3Q6h#Y^dE9^rOK=GWV&b1h{BUvZ#VzQD#NN_}<VJW2!{zj}Jj(Gp++KlF
> z-m^RRr=jicm%cUSDW!0a>)srw9ZqJfYH@yl5T!<U;}M6C67CcCCp`0jI3h}X4Z*CR
> z@cQFuhiLM(wSRu-xcHA1F8zhXBbq;Aj)oWS$Nk6T$K>0*@ExA}PsmTmxA~y7^f&wF
> z`=C;Mzb#Jlr!;>?JY$ah@BPi%Ksot29-5N<Er=Hro_R^UWRASiUqyaJzRfE>hSucQ  
> c<lDT_e?xvlzRfG^WB(fYq22#4zMDpU0r#ed0RR91
> 
> literal 0
> HcmV?d00001
> 
> diff --git a/tests/data/acpi/q35/ERST b/tests/data/acpi/q35/ERST
> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7236018951f9d111d8cacaa93ee07a8dc3294f18 100644
> GIT binary patch
> literal 976
> zcmaKqSq_3Q6h#Y^dE9^rOK=GWV&b1h{BUvZ#VzQD#NN_}<VJW2!{zj}Jj(Gp++KlF
> z-m^RRr=jicm%cUSDW!0a>)srw9ZqJfYH@yl5T!<U;}M6C67CcCCp`0jI3h}X4Z*CR
> z@cQFuhiLM(wSRu-xcHA1F8zhXBbq;Aj)oWS$Nk6T$K>0*@ExA}PsmTmxA~y7^f&wF
> z`=C;Mzb#Jlr!;>?JY$ah@BPi%Ksot29-5N<Er=Hro_R^UWRASiUqyaJzRfE>hSucQ  
> c<lDT_e?xvlzRfG^WB(fYq22#4zMDpU0r#ed0RR91
> 
> literal 0
> HcmV?d00001
> 
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
> index e004c71..dfb8523 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1,5 +1 @@
>  /* List of comma-separated changed AML files to ignore */
> -"tests/data/acpi/pc/ERST",
> -"tests/data/acpi/q35/ERST",
> -"tests/data/acpi/microvm/ERST",
> -




* Re: [PATCH v5 09/10] ACPI ERST: qtest for ERST
  2021-06-30 19:07 ` [PATCH v5 09/10] ACPI ERST: qtest for ERST Eric DeVolder
@ 2021-07-20 13:38   ` Igor Mammedov
  2021-07-21 16:18     ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 13:38 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:20 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> This change provides a qtest that locates and then does a simple
> interrogation of the ERST feature within the guest.
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  tests/qtest/erst-test.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/qtest/meson.build |   2 +
>  2 files changed, 131 insertions(+)
>  create mode 100644 tests/qtest/erst-test.c
> 
> diff --git a/tests/qtest/erst-test.c b/tests/qtest/erst-test.c
> new file mode 100644
> index 0000000..ce014c1
> --- /dev/null
> +++ b/tests/qtest/erst-test.c
> @@ -0,0 +1,129 @@
> +/*
> + * QTest testcase for ACPI ERST
> + *
> + * Copyright (c) 2021 Oracle
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/bitmap.h"
> +#include "qemu/uuid.h"
> +#include "hw/acpi/acpi-defs.h"
> +#include "boot-sector.h"
> +#include "acpi-utils.h"
> +#include "libqos/libqtest.h"
> +#include "qapi/qmp/qdict.h"
> +
> +#define RSDP_ADDR_INVALID 0x100000 /* RSDP must be below this address */
> +
> +static uint64_t acpi_find_erst(QTestState *qts)
> +{
> +    uint32_t rsdp_offset;
> +    uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
> +    uint32_t rsdt_len, table_length;
> +    uint8_t *rsdt, *ent;
> +    uint64_t base = 0;
> +
> +    /* Wait for guest firmware to finish and start the payload. */
> +    boot_sector_test(qts);
> +
> +    /* Tables should be initialized now. */
> +    rsdp_offset = acpi_find_rsdp_address(qts);
> +
> +    g_assert_cmphex(rsdp_offset, <, RSDP_ADDR_INVALID);
> +
> +    acpi_fetch_rsdp_table(qts, rsdp_offset, rsdp_table);
> +    acpi_fetch_table(qts, &rsdt, &rsdt_len, &rsdp_table[16 /* RsdtAddress */],
> +                     4, "RSDT", true);
> +
> +    ACPI_FOREACH_RSDT_ENTRY(rsdt, rsdt_len, ent, 4 /* Entry size */) {
> +        uint8_t *table_aml;
> +        acpi_fetch_table(qts, &table_aml, &table_length, ent, 4, NULL, true);
> +        if (!memcmp(table_aml + 0 /* Header Signature */, "ERST", 4)) {
> +            /*
> +             * Picking up ERST base address from the Register Region
> +             * specified as part of the first Serialization Instruction
> +             * Action (which is a Begin Write Operation).
> +             */
> +            memcpy(&base, &table_aml[56], sizeof(base));
> +            g_free(table_aml);
> +            break;
> +        }
> +        g_free(table_aml);
> +    }
> +    g_free(rsdt);
> +    return base;
> +}
I'd drop this; bios-tables-test should do the ACPI table check.
As for the PCI device itself, you can test it with the qtest accelerator,
which lets you instantiate it and access its registers directly,
without the overhead of running an actual guest.

As an example, you can look at megasas-test.c, ivshmem-test.c
or other PCI device tests.

> +static char disk[] = "tests/erst-test-disk-XXXXXX";
> +
> +#define ERST_CMD()                              \
> +    "-accel kvm -accel tcg "                    \
> +    "-object memory-backend-file," \
> +      "id=erstnvram,mem-path=tests/acpi-erst-XXXXXX,size=0x10000,share=on " \
> +    "-device acpi-erst,memdev=erstnvram " \
> +    "-drive id=hd0,if=none,file=%s,format=raw " \
> +    "-device ide-hd,drive=hd0 ", disk
> +
> +static void erst_get_error_log_address_range(void)
> +{
> +    QTestState *qts;
> +    uint64_t log_address_range = 0;
> +    unsigned log_address_length = 0;
> +    unsigned log_address_attr = 0;
> +
> +    qts = qtest_initf(ERST_CMD());
> +
> +    uint64_t base = acpi_find_erst(qts);
> +    g_assert(base != 0);
> +
> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE command */
> +    qtest_writel(qts, base + 0, 0xD);
> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE result */
> +    log_address_range = qtest_readq(qts, base + 8);\
> +
> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_LENGTH command */
> +    qtest_writel(qts, base + 0, 0xE);
> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_LENGTH result */
> +    log_address_length = qtest_readq(qts, base + 8);\
> +
> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES command */
> +    qtest_writel(qts, base + 0, 0xF);
> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES result */
> +    log_address_attr = qtest_readq(qts, base + 8);\
> +
> +    /* Check log_address_range is not 0,~0 or base */
> +    g_assert(log_address_range != base);
> +    g_assert(log_address_range != 0);
> +    g_assert(log_address_range != ~0UL);
> +
> +    /* Check log_address_length is ERST_RECORD_SIZE */
> +    g_assert(log_address_length == (8 * 1024));
> +
> +    /* Check log_address_attr is 0 */
> +    g_assert(log_address_attr == 0);
> +
> +    qtest_quit(qts);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    int ret;
> +
> +    ret = boot_sector_init(disk);
> +    if (ret) {
> +        return ret;
> +    }
> +
> +    g_test_init(&argc, &argv, NULL);
> +
> +    qtest_add_func("/erst/get-error-log-address-range",
> +                   erst_get_error_log_address_range);
> +
> +    ret = g_test_run();
> +    boot_sector_cleanup(disk);
> +
> +    return ret;
> +}
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index 0c76738..deae443 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -66,6 +66,7 @@ qtests_i386 = \
>    (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
>    (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) +   \
>    (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +                 \
> +  (config_all_devices.has_key('CONFIG_ACPI') ? ['erst-test'] : []) +                 \
>    qtests_pci +                                                                              \
>    ['fdc-test',
>     'ide-test',
> @@ -237,6 +238,7 @@ qtests = {
>    'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
>    'cdrom-test': files('boot-sector.c'),
>    'dbus-vmstate-test': files('migration-helpers.c') + dbus_vmstate1,
> +  'erst-test': files('erst-test.c', 'boot-sector.c', 'acpi-utils.c'),
>    'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
>    'migration-test': files('migration-helpers.c'),
>    'pxe-test': files('boot-sector.c'),




* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
                   ` (10 preceding siblings ...)
  2021-07-13 20:38 ` [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Michael S. Tsirkin
@ 2021-07-20 14:57 ` Igor Mammedov
  2021-07-21 15:26   ` Eric DeVolder
                     ` (2 more replies)
  11 siblings, 3 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 14:57 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 30 Jun 2021 15:07:11 -0400
Eric DeVolder <eric.devolder@oracle.com> wrote:

> =============================
> I believe I have corrected for all feedback on v4, but with
> responses to certain feedback below.
> 
> In patch 1/6, Igor asks:
> "you are adding empty template files here
> but the later matching bios-tables-test is nowhere to be found
> Was testcase lost somewhere along the way?
> 
> also it seems you add ERST only to pc/q35,
> so why tests/data/acpi/microvm/ERST is here?"
> 
> I did miss setting up microvm. That has been corrected.
> 
> As for the question about lost test cases, if you are referring
> to the new binary blobs for pc,q35, those were in patch
> 6/6. There is a qtest in patch 5/6. If I don't understand the
> question, please indicate as such.

All I see in this series is
 [PATCH v5 09/10] ACPI ERST: qtest for ERST
which is not related to bios-tables-test and blobs whatsoever.

Blobs are for use with bios-tables-test, and I'm referring to
the missing test case in bios-tables-test.c

> 
> 
> In patch 3/6, Igor asks:
> "Also spec (ERST) is rather (maybe intentionally) vague on specifics,
> so it would be better that before a patch that implements hw part
> were a doc patch describing concrete implementation. As model
> you can use docs/specs/acpi_hest_ghes.rst or other docs/specs/acpi_* files.
> I'd start posting/discussing that spec within these thread
> to avoid spamming list until doc is settled up."
> 
> I'm thinking that this cover letter is the bulk of the spec? But as
> you say, to avoid spamming the group, we can use this thread to make
> suggested changes to this cover letter which I will then convert
> into a spec, for v6.
> 
> 
> In patch 3/6, in many places Igor mentions utilizing the hostmem
> mapped directly in the guest in order to avoid need-less copying.
> 
> It is true that the ERST has an "NVRAM" mode that would allow for
> all the simplifications Igor points out, however, Linux does not
> support this mode. This mode puts the burden of managing the NVRAM
> space on the OS. So this implementation, like BIOS, is the non-NVRAM
> mode.
See the per-patch comments: copying is not necessary regardless of
the implemented mode.


> I did go ahead and separate the registers from the exchange buffer,
> which would facilitate the support of NVRAM mode.
> 
>  linux/drivers/acpi/apei/erst.c:
>  /* NVRAM ERST Error Log Address Range is not supported yet */
>  static void pr_unimpl_nvram(void)
>  {
>     if (printk_ratelimit())
>         pr_warn("NVRAM ERST Log Address Range not implemented yet.\n");
>  }
> 
>  static int __erst_write_to_nvram(const struct cper_record_header *record)
>  {
>     /* do not print message, because printk is not safe for NMI */
>     return -ENOSYS;
>  }
> 
>  static int __erst_read_to_erange_from_nvram(u64 record_id, u64 *offset)
>  {
>     pr_unimpl_nvram();
>     return -ENOSYS;
>  }
> 
>  static int __erst_clear_from_nvram(u64 record_id)
>  {
>     pr_unimpl_nvram();
>     return -ENOSYS;
>  }
> 
> =============================
PS:
It's inconvenient when you copy questions/parts of an unfinished discussion
from the previous revision with little context.
Usually discussion should continue in the original thread, and
once some sort of consensus is reached, a new series based on it
is posted. The above blob shouldn't be here. (You can look at how others
handle multiple revisions.)

The way you do it now makes the reviewer repeat work done earlier
to point out the same issues, so it wastes your time and the reviewer's.
So please finish discussions in the threads they started in and then post
a new revision.

> This patchset introduces support for the ACPI Error Record
> Serialization Table, ERST.
> 
> For background and implementation information, please see
> docs/specs/acpi_erst.txt, which is patch 2/10.
> 
> Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> 
> ---
> v5: 30jun2021
>  - Create docs/specs/acpi_erst.txt, per Igor
>  - Separate PCI BARs for registers and memory, per Igor
>  - Convert debugging to use trace infrastructure, per Igor
>  - Various other fixups, per Igor
> 
> v4: 11jun2021
>  - Converted to a PCI device, per Igor.
>  - Updated qtest.
>  - Rearranged patches, per Igor.
> 
> v3: 28may2021
>  - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
>    internal array with explicit file operations, per Igor.
>  - Changed the way the qdev and base address are handled, allowing
>    ERST to be disabled at run-time. Also aligns better with other
>    existing code.
> 
> v2: 8feb2021
>  - Added qtest/smoke test per Paolo Bonzini
>  - Split patch into smaller chunks, per Igor Mammedov
>  - Did away with use of ACPI packed structures, per Igor Mammedov
> 
> v1: 26oct2020
>  - initial post
> 
> ---
> 
> Eric DeVolder (10):
>   ACPI ERST: bios-tables-test.c steps 1 and 2
>   ACPI ERST: specification for ERST support
>   ACPI ERST: PCI device_id for ERST
>   ACPI ERST: header file for ERST
>   ACPI ERST: support for ACPI ERST feature
>   ACPI ERST: build the ACPI ERST table
>   ACPI ERST: trace support
>   ACPI ERST: create ACPI ERST table for pc/x86 machines.
>   ACPI ERST: qtest for ERST
>   ACPI ERST: step 6 of bios-tables-test.c
> 
>  docs/specs/acpi_erst.txt     | 152 +++++++
>  hw/acpi/erst.c               | 918 +++++++++++++++++++++++++++++++++++++++++++
>  hw/acpi/meson.build          |   1 +
>  hw/acpi/trace-events         |  14 +
>  hw/i386/acpi-build.c         |   9 +
>  hw/i386/acpi-microvm.c       |   9 +
>  include/hw/acpi/erst.h       |  84 ++++
>  include/hw/pci/pci.h         |   1 +
>  tests/data/acpi/microvm/ERST | Bin 0 -> 976 bytes
>  tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
>  tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
>  tests/qtest/erst-test.c      | 129 ++++++
>  tests/qtest/meson.build      |   2 +
>  13 files changed, 1319 insertions(+)
>  create mode 100644 docs/specs/acpi_erst.txt
>  create mode 100644 hw/acpi/erst.c
>  create mode 100644 include/hw/acpi/erst.h
>  create mode 100644 tests/data/acpi/microvm/ERST
>  create mode 100644 tests/data/acpi/pc/ERST
>  create mode 100644 tests/data/acpi/q35/ERST
>  create mode 100644 tests/qtest/erst-test.c
> 



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-07-20 13:16   ` Igor Mammedov
@ 2021-07-20 14:59     ` Igor Mammedov
  2021-07-21 16:12     ` Eric DeVolder
  1 sibling, 0 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-07-20 14:59 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Tue, 20 Jul 2021 15:16:40 +0200
Igor Mammedov <imammedo@redhat.com> wrote:

> On Wed, 30 Jun 2021 15:07:17 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
[...]
> > +/* ACPI 4.0: 17.4.1 Serialization Action Table */
> > +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
> > +    const char *oem_id, const char *oem_table_id)
> > +{
> > +    ERSTDeviceState *s = ACPIERST(erst_dev);  
> 
> globals are not welcomed in new code,
> pass erst_dev as argument here (ex: find_vmgenid_dev)
ignore this, I didn't notice that it's passed as argument.




^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-07-13 20:38 ` [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Michael S. Tsirkin
@ 2021-07-21 15:23   ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 15:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ehabkost, konrad.wilk, qemu-devel, pbonzini, imammedo,
	boris.ostrovsky, rth



On 7/13/21 3:38 PM, Michael S. Tsirkin wrote:
> On Wed, Jun 30, 2021 at 03:07:11PM -0400, Eric DeVolder wrote:
>> =============================
>> I believe I have corrected for all feedback on v4, but with
>> responses to certain feedback below.
>>
>> In patch 1/6, Igor asks:
>> "you are adding empty template files here
>> but the later matching bios-tables-test is nowhere to be found
>> Was testcase lost somewhere along the way?
>>
>> also it seems you add ERST only to pc/q35,
>> so why tests/data/acpi/microvm/ERST is here?"
>>
>> I did miss setting up microvm. That has been corrected.
>>
>> As for the question about lost test cases, if you are referring
>> to the new binary blobs for pc,q35, those were in patch
>> 6/6. There is a qtest in patch 5/6. If I don't understand the
>> question, please indicate as such.
>>
>>
>> In patch 3/6, Igor asks:
>> "Also spec (ERST) is rather (maybe intentionally) vague on specifics,
>> so it would be better that before a patch that implements hw part
>> were a doc patch describing concrete implementation. As model
>> you can use docs/specs/acpi_hest_ghes.rst or other docs/specs/acpi_* files.
>> I'd start posting/discussing that spec within these thread
>> to avoid spamming list until doc is settled up."
>>
>> I'm thinking that this cover letter is the bulk of the spec? But as
>> you say, to avoid spamming the group, we can use this thread to make
>> suggested changes to this cover letter which I will then convert
>> into a spec, for v6.
>>
>>
>> In patch 3/6, in many places Igor mentions utilizing the hostmem
>> mapped directly in the guest in order to avoid need-less copying.
>>
>> It is true that the ERST has an "NVRAM" mode that would allow for
>> all the simplifications Igor points out, however, Linux does not
>> support this mode. This mode puts the burden of managing the NVRAM
>> space on the OS. So this implementation, like BIOS, is the non-NVRAM
>> mode.
>>
>> I did go ahead and separate the registers from the exchange buffer,
>> which would facilitate the support of NVRAM mode.
>>
>>   linux/drivers/acpi/apei/erst.c:
>>   /* NVRAM ERST Error Log Address Range is not supported yet */
>>   static void pr_unimpl_nvram(void)
>>   {
>>      if (printk_ratelimit())
>>          pr_warn("NVRAM ERST Log Address Range not implemented yet.\n");
>>   }
>>
>>   static int __erst_write_to_nvram(const struct cper_record_header *record)
>>   {
>>      /* do not print message, because printk is not safe for NMI */
>>      return -ENOSYS;
>>   }
>>
>>   static int __erst_read_to_erange_from_nvram(u64 record_id, u64 *offset)
>>   {
>>      pr_unimpl_nvram();
>>      return -ENOSYS;
>>   }
>>
>>   static int __erst_clear_from_nvram(u64 record_id)
>>   {
>>      pr_unimpl_nvram();
>>      return -ENOSYS;
>>   }
>>
>> =============================
>>
>> This patchset introduces support for the ACPI Error Record
>> Serialization Table, ERST.
>>
>> For background and implementation information, please see
>> docs/specs/acpi_erst.txt, which is patch 2/10.
>>
>> Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> 
> 
> ../hw/acpi/erst.c: In function ‘build_erst’:
> ../hw/acpi/erst.c:754:13: error: this statement may fall through [-Werror=implicit-fallthrough=]
>    754 |             build_serialization_instruction_entry(table_data, action,
>        |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    755 |                 ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>        |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    756 |                 s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>        |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../hw/acpi/erst.c:757:9: note: here
>    757 |         default:
>        |         ^~~~~~~
> cc1: all warnings being treated as errors
> 
> 
> Pls correct.
> mingw32 build also failed. Pls take a look.

I've corrected the above build error.
I've also corrected the mingw32 build error.
eric
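
For readers unfamiliar with this diagnostic, here is a minimal sketch of one common way to resolve -Wimplicit-fallthrough: end each case with an explicit break. The enum values and the demo_emit_entry() helper are hypothetical stand-ins and are not the actual code or the actual fix in hw/acpi/erst.c.

```c
/* Hypothetical action codes, for illustration only. */
enum demo_action { DEMO_GET_COUNT, DEMO_SET_ID, DEMO_UNKNOWN };

/* Hypothetical counter standing in for the table entries emitted so far. */
static int demo_entries;

/* Hypothetical helper: pretend to append one instruction entry. */
static void demo_emit_entry(enum demo_action action)
{
    (void)action;
    demo_entries++;
}

/* Emit the entries for one action and return how many were emitted. */
int demo_build_action(enum demo_action action)
{
    demo_entries = 0;
    switch (action) {
    case DEMO_GET_COUNT:
        demo_emit_entry(action);
        break;
    case DEMO_SET_ID:
        demo_emit_entry(action);
        demo_emit_entry(action);
        break; /* without this break, control would fall into default */
    default:
        demo_emit_entry(DEMO_UNKNOWN);
        break;
    }
    return demo_entries;
}
```

If falling through is actually intended, GCC at its default warning level also accepts a `/* fall through */` comment placed immediately before the next case label.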

> 
> 
> Thanks!
> 
> 
>> ---
>> v5: 30jun2021
>>   - Create docs/specs/acpi_erst.txt, per Igor
>>   - Separate PCI BARs for registers and memory, per Igor
>>   - Convert debugging to use trace infrastructure, per Igor
>>   - Various other fixups, per Igor
>>
>> v4: 11jun2021
>>   - Converted to a PCI device, per Igor.
>>   - Updated qtest.
>>   - Rearranged patches, per Igor.
>>
>> v3: 28may2021
>>   - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
>>     internal array with explicit file operations, per Igor.
>>   - Changed the way the qdev and base address are handled, allowing
>>     ERST to be disabled at run-time. Also aligns better with other
>>     existing code.
>>
>> v2: 8feb2021
>>   - Added qtest/smoke test per Paolo Bonzini
>>   - Split patch into smaller chunks, per Igor Mammedov
>>   - Did away with use of ACPI packed structures, per Igor Mammedov
>>
>> v1: 26oct2020
>>   - initial post
>>
>> ---
>>
>> Eric DeVolder (10):
>>    ACPI ERST: bios-tables-test.c steps 1 and 2
>>    ACPI ERST: specification for ERST support
>>    ACPI ERST: PCI device_id for ERST
>>    ACPI ERST: header file for ERST
>>    ACPI ERST: support for ACPI ERST feature
>>    ACPI ERST: build the ACPI ERST table
>>    ACPI ERST: trace support
>>    ACPI ERST: create ACPI ERST table for pc/x86 machines.
>>    ACPI ERST: qtest for ERST
>>    ACPI ERST: step 6 of bios-tables-test.c
>>
>>   docs/specs/acpi_erst.txt     | 152 +++++++
>>   hw/acpi/erst.c               | 918 +++++++++++++++++++++++++++++++++++++++++++
>>   hw/acpi/meson.build          |   1 +
>>   hw/acpi/trace-events         |  14 +
>>   hw/i386/acpi-build.c         |   9 +
>>   hw/i386/acpi-microvm.c       |   9 +
>>   include/hw/acpi/erst.h       |  84 ++++
>>   include/hw/pci/pci.h         |   1 +
>>   tests/data/acpi/microvm/ERST | Bin 0 -> 976 bytes
>>   tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
>>   tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
>>   tests/qtest/erst-test.c      | 129 ++++++
>>   tests/qtest/meson.build      |   2 +
>>   13 files changed, 1319 insertions(+)
>>   create mode 100644 docs/specs/acpi_erst.txt
>>   create mode 100644 hw/acpi/erst.c
>>   create mode 100644 include/hw/acpi/erst.h
>>   create mode 100644 tests/data/acpi/microvm/ERST
>>   create mode 100644 tests/data/acpi/pc/ERST
>>   create mode 100644 tests/data/acpi/q35/ERST
>>   create mode 100644 tests/qtest/erst-test.c
>>
>> -- 
>> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-07-20 14:57 ` Igor Mammedov
@ 2021-07-21 15:26   ` Eric DeVolder
  2021-07-23 16:26   ` Eric DeVolder
  2021-07-27 12:55   ` Igor Mammedov
  2 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 15:26 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 9:57 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:11 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> =============================
>> I believe I have corrected for all feedback on v4, but with
>> responses to certain feedback below.
>>
>> In patch 1/6, Igor asks:
>> "you are adding empty template files here
>> but the later matching bios-tables-test is nowhere to be found
>> Was testcase lost somewhere along the way?
>>
>> also it seems you add ERST only to pc/q35,
>> so why tests/data/acpi/microvm/ERST is here?"
>>
>> I did miss setting up microvm. That has been corrected.
>>
>> As for the question about lost test cases, if you are referring
>> to the new binary blobs for pc,q35, those were in patch
>> 6/6. There is a qtest in patch 5/6. If I don't understand the
>> question, please indicate as such.
> 
> All I see in this series is
>   [PATCH v5 09/10] ACPI ERST: qtest for ERST
> which is not related to bios-tables-test and blobs whatsoever.
> 
> Blobs are for use with bios-tables-test and I'm referring to
> missing test case in bios-tables-test.c

I now understand that the "missing test case" refers to bios-tables-test.c.
I've got those implemented, though I'm debugging a problem with the
microvm version.

> 
>>
>>
>> In patch 3/6, Igor asks:
>> "Also spec (ERST) is rather (maybe intentionally) vague on specifics,
>> so it would be better that before a patch that implements hw part
>> were a doc patch describing concrete implementation. As model
>> you can use docs/specs/acpi_hest_ghes.rst or other docs/specs/acpi_* files.
>> I'd start posting/discussing that spec within these thread
>> to avoid spamming list until doc is settled up."
>>
>> I'm thinking that this cover letter is the bulk of the spec? But as
>> you say, to avoid spamming the group, we can use this thread to make
>> suggested changes to this cover letter which I will then convert
>> into a spec, for v6.
>>
>>
>> In patch 3/6, in many places Igor mentions utilizing the hostmem
>> mapped directly in the guest in order to avoid need-less copying.
>>
>> It is true that the ERST has an "NVRAM" mode that would allow for
>> all the simplifications Igor points out, however, Linux does not
>> support this mode. This mode puts the burden of managing the NVRAM
>> space on the OS. So this implementation, like BIOS, is the non-NVRAM
>> mode.
> see per patch comments where copying is not necessary regardless of
> the implemented mode.
> 
> 
>> I did go ahead and separate the registers from the exchange buffer,
>> which would facilitate the support of NVRAM mode.
>>
>>   linux/drivers/acpi/apei/erst.c:
>>   /* NVRAM ERST Error Log Address Range is not supported yet */
>>   static void pr_unimpl_nvram(void)
>>   {
>>      if (printk_ratelimit())
>>          pr_warn("NVRAM ERST Log Address Range not implemented yet.\n");
>>   }
>>
>>   static int __erst_write_to_nvram(const struct cper_record_header *record)
>>   {
>>      /* do not print message, because printk is not safe for NMI */
>>      return -ENOSYS;
>>   }
>>
>>   static int __erst_read_to_erange_from_nvram(u64 record_id, u64 *offset)
>>   {
>>      pr_unimpl_nvram();
>>      return -ENOSYS;
>>   }
>>
>>   static int __erst_clear_from_nvram(u64 record_id)
>>   {
>>      pr_unimpl_nvram();
>>      return -ENOSYS;
>>   }
>>
>> =============================
> PS:
> it's inconvenient when you copy questions/parts of unfinished discussion
> from previous revision with a little context.
> Usually discussion should continue in the original thread and
> once some sort of consensus is reached new series based on it
> is posted. Above blob shouldn't be here. (You can look at how others
> handle multiple revisions)
> 
> The way you do it now, makes reviewer to repeat job done earlier
> to point to the the same issues, so it wastes your and reviewer's time.
> So please finish discussions in threads they started at and then post
> new revision.

Yes, I apologize. The email tool I was using did not handle threads very well.
I've switched email tools and I'm able to respond to each thread much better now.

> 
>> This patchset introduces support for the ACPI Error Record
>> Serialization Table, ERST.
>>
>> For background and implementation information, please see
>> docs/specs/acpi_erst.txt, which is patch 2/10.
>>
>> Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>
>> ---
>> v5: 30jun2021
>>   - Create docs/specs/acpi_erst.txt, per Igor
>>   - Separate PCI BARs for registers and memory, per Igor
>>   - Convert debugging to use trace infrastructure, per Igor
>>   - Various other fixups, per Igor
>>
>> v4: 11jun2021
>>   - Converted to a PCI device, per Igor.
>>   - Updated qtest.
>>   - Rearranged patches, per Igor.
>>
>> v3: 28may2021
>>   - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
>>     internal array with explicit file operations, per Igor.
>>   - Changed the way the qdev and base address are handled, allowing
>>     ERST to be disabled at run-time. Also aligns better with other
>>     existing code.
>>
>> v2: 8feb2021
>>   - Added qtest/smoke test per Paolo Bonzini
>>   - Split patch into smaller chunks, per Igor Mammedov
>>   - Did away with use of ACPI packed structures, per Igor Mammedov
>>
>> v1: 26oct2020
>>   - initial post
>>
>> ---
>>
>> Eric DeVolder (10):
>>    ACPI ERST: bios-tables-test.c steps 1 and 2
>>    ACPI ERST: specification for ERST support
>>    ACPI ERST: PCI device_id for ERST
>>    ACPI ERST: header file for ERST
>>    ACPI ERST: support for ACPI ERST feature
>>    ACPI ERST: build the ACPI ERST table
>>    ACPI ERST: trace support
>>    ACPI ERST: create ACPI ERST table for pc/x86 machines.
>>    ACPI ERST: qtest for ERST
>>    ACPI ERST: step 6 of bios-tables-test.c
>>
>>   docs/specs/acpi_erst.txt     | 152 +++++++
>>   hw/acpi/erst.c               | 918 +++++++++++++++++++++++++++++++++++++++++++
>>   hw/acpi/meson.build          |   1 +
>>   hw/acpi/trace-events         |  14 +
>>   hw/i386/acpi-build.c         |   9 +
>>   hw/i386/acpi-microvm.c       |   9 +
>>   include/hw/acpi/erst.h       |  84 ++++
>>   include/hw/pci/pci.h         |   1 +
>>   tests/data/acpi/microvm/ERST | Bin 0 -> 976 bytes
>>   tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
>>   tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
>>   tests/qtest/erst-test.c      | 129 ++++++
>>   tests/qtest/meson.build      |   2 +
>>   13 files changed, 1319 insertions(+)
>>   create mode 100644 docs/specs/acpi_erst.txt
>>   create mode 100644 hw/acpi/erst.c
>>   create mode 100644 include/hw/acpi/erst.h
>>   create mode 100644 tests/data/acpi/microvm/ERST
>>   create mode 100644 tests/data/acpi/pc/ERST
>>   create mode 100644 tests/data/acpi/q35/ERST
>>   create mode 100644 tests/qtest/erst-test.c
>>
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-07-19 15:02     ` Igor Mammedov
@ 2021-07-21 15:42       ` Eric DeVolder
  2021-07-26 10:06         ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 15:42 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Konrad Wilk, qemu-devel, pbonzini,
	Boris Ostrovsky, Eric Blake, rth



On 7/19/21 10:02 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 19:26:39 +0000
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
>> rather than "non-NVRAM mode", which contradicts everything I stated prior.
>> Eric.
>> ________________________________
>> From: Eric DeVolder <eric.devolder@oracle.com>
>> Sent: Wednesday, June 30, 2021 2:07 PM
>> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
>> Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support
>>
>> Information on the implementation of the ACPI ERST support.
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 152 insertions(+)
>>   create mode 100644 docs/specs/acpi_erst.txt
>>
>> diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
>> new file mode 100644
>> index 0000000..79f8eb9
>> --- /dev/null
>> +++ b/docs/specs/acpi_erst.txt
>> @@ -0,0 +1,152 @@
>> +ACPI ERST DEVICE
>> +================
>> +
>> +The ACPI ERST device is utilized to support the ACPI Error Record
>> +Serialization Table, ERST, functionality. The functionality is
>> +designed for storing error records in persistent storage for
>> +future reference/debugging.
>> +
>> +The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
>> +(APEI)", and specifically subsection "Error Serialization", outlines
>> +a method for storing error records into persistent storage.
>> +
>> +The format of error records is described in the UEFI specification[2],
>> +in Appendix N "Common Platform Error Record".
>> +
>> +While the ACPI specification allows for an NVRAM "mode" (see
>> +GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
>> +directly exposed for direct access by the OS/guest, this implements
>> +the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
>> +by most BIOS (since flash memory requires programming operations
>> +in order to update its contents). Furthermore, as of the time of this
>> +writing, Linux does not support the non-NVRAM "mode".
> 
> shouldn't it be s/non-NVRAM/NVRAM/ ?

Yes, it has been corrected.

> 
>> +
>> +
>> +Background/Motivation
>> +---------------------
>> +Linux uses the persistent storage filesystem, pstore, to record
>> +information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
>> +independent of, and runs before, kdump.  In certain scenarios (ie.
>> +hosts/guests with root filesystems on NFS/iSCSI where networking
>> +software and/or hardware fails), pstore may contain the only
>> +information available for post-mortem debugging.
> 
> well,
> it's not the only way, one can use existing pvpanic device to notify
> mgmt layer about crash and mgmt layer can take appropriate measures
> to for post-mortem debugging, including dumping guest state,
> which is superior to anything pstore can offer as VM is still exists
> and mgmt layer can inspect VMs crashed state directly or dump
> necessary parts of it.
> 
> So ERST shouldn't be portrayed as the only way here but rather
> as limited alternative to pvpanic in regards to post-mortem debugging
> (it's the only way only on bare-metal).
> 
> It would be better to describe here other use-cases you've mentioned
> in earlier reviews, that justify adding alternative to pvpanic.

I'm not sure how I would change this. I do say "may contain", which means it
is not the only way. Pvpanic is a way to notify the mgmt layer/host, but
this is a method that resides solely within the guest. Each serves a
different purpose; each plugs a different hole.

As noted in a separate message, my company has intentions of storing other
data in ERST beyond panics.

> 
>> +Two common storage backends for the pstore filesystem are ACPI ERST
>> +and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
>> +all guests. With QEMU supporting ACPI ERST, it becomes a viable
>> +pstore storage backend for virtual machines (as it is now for
>> +bare metal machines).
>> +
> 
>> +Enabling support for ACPI ERST facilitates a consistent method to
>> +capture kernel panic information in a wide range of guests: from
>> +resource-constrained microvms to very large guests, and in
>> +particular, in direct-boot environments (which would lack UEFI
>> +run-time services).
> this hunk probably not necessary
> 
>> +
>> +Note that Microsoft Windows also utilizes the ACPI ERST for certain
>> +crash information, if available.
> a pointer to a relevant source would be helpful here.

I've included the reference; here it is for your benefit:
Windows Hardware Error Architecture (WHEA), specifically the Error Record Persistence Mechanism
https://docs.microsoft.com/en-us/windows-hardware/drivers/whea/error-record-persistence-mechanism

> 
>> +Invocation
> s/^^/Configuration|Usage/

Corrected

> 
>> +----------
>> +
>> +To utilize ACPI ERST, a memory-backend-file object and acpi-erst
> s/utilize/use/

Corrected

> 
>> +device must be created, for example:
> s/must/can/

Corrected

> 
>> +
>> + qemu ...
>> + -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
>> +  size=0x10000,share=on
> I'd put ^^^ on the same line as -object and use '\' at the end the
> so example could be easily copy-pasted

Corrected

> 
>> + -device acpi-erst,memdev=erstnvram
>> +
>> +For proper operation, the ACPI ERST device needs a memory-backend-file
>> +object with the following parameters:
>> +
>> + - id: The id of the memory-backend-file object is used to associate
>> +   this memory with the acpi-erst device.
>> + - size: The size of the ACPI ERST backing storage. This parameter is
>> +   required.
>> + - mem-path: The location of the ACPI ERST backing storage file. This
>> +   parameter is also required.
>> + - share: The share=on parameter is required so that updates to the
>> +   ERST back store are written to the file immediately as well. Without
>> +   it, updates the the backing file are unpredictable and may not
>> +   properly persist (eg. if qemu should crash).
> 
> mmap manpage says:
>    MAP_SHARED
>               Updates to the mapping ... are carried through to the underlying file.
> it doesn't guarantee 'written to the file immediately', though.
> So I'd rephrase it to something like that:
> 
> - share: The share=on parameter is required so that updates to the ERST back store
>           are written back to the file.

Corrected
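
To illustrate the MAP_SHARED point above, here is a minimal POSIX sketch, not QEMU code: a store through a shared mapping is carried through to the file, but only an explicit msync() asks the kernel to write it back at a known point. The function name demo_shared_store() and its parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy len bytes to offset off of the file at path
 * through a MAP_SHARED mapping of file_size bytes.
 * Returns 0 on success, -1 on any failure.
 */
int demo_shared_store(const char *path, size_t file_size, size_t off,
                      const void *data, size_t len)
{
    if (off > file_size || len > file_size - off) {
        return -1; /* the write would not fit inside the mapping */
    }

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t)file_size) != 0) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, file_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED) {
        return -1;
    }

    memcpy((char *)map + off, data, len);

    /* MAP_SHARED alone does not say when the data reaches the file;
     * msync(MS_SYNC) blocks until the write-back has completed. */
    int rc = msync(map, file_size, MS_SYNC);
    munmap(map, file_size);
    return rc == 0 ? 0 : -1;
}
```

With MAP_PRIVATE instead of MAP_SHARED, the same store would stay in a private copy and never reach the file, which is why share=on matters for the backing storage.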

> 
>> +
>> +The ACPI ERST device is a simple PCI device, and requires this one
>> +parameter:
> s/^.*:/and ERST device:/

Corrected

> 
>> +
>> + - memdev: Is the object id of the memory-backend-file.
>> +
>> +
>> +PCI Interface
>> +-------------
>> +
>> +The ERST device is a PCI device with two BARs, one for accessing
>> +the programming registers, and the other for accessing the
>> +record exchange buffer.
>> +
>> +BAR0 contains the programming interface consisting of just two
>> +64-bit registers. The two registers are an ACTION (cmd) and a
>> +VALUE (data). All ERST actions/operations/side effects happen
> s/consisting of... All ERST/consisting of ACTION and VALUE 64-bit registers. All ERST/

Corrected

> 
>> +on the write to the ACTION, by design. Thus any data needed
> s/Thus//
Corrected

> 
>> +by the action must be placed into VALUE prior to writing
>> +ACTION. Reading the VALUE simply returns the register contents,
>> +which can be updated by a previous ACTION.
> 
>> This behavior is
>> +encoded in the ACPI ERST table generated by QEMU.
> it's too vague, Either drop sentence or add a reference to relevant place in spec.
Corrected
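
As a rough illustration of the ACTION/VALUE behavior described above, the following sketch models a device where a write to VALUE only stores data and every side effect happens on the write to ACTION. The register offsets, the action codes, and the struct are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical register offsets within BAR0. */
#define DEMO_REG_ACTION 0x0
#define DEMO_REG_VALUE  0x8

/* Hypothetical action codes, for illustration only. */
#define DEMO_ACT_SET_RECORD_ID 0x1
#define DEMO_ACT_GET_RECORD_ID 0x2

struct demo_erst {
    uint64_t value;     /* VALUE register: plain storage */
    uint64_t record_id; /* internal device state */
};

/* Guest write to BAR0. Only a write to ACTION has side effects. */
void demo_write(struct demo_erst *s, uint64_t addr, uint64_t data)
{
    switch (addr) {
    case DEMO_REG_VALUE:
        s->value = data; /* latch the operand; nothing else happens */
        break;
    case DEMO_REG_ACTION:
        switch (data) {
        case DEMO_ACT_SET_RECORD_ID:
            s->record_id = s->value; /* consume the operand from VALUE */
            break;
        case DEMO_ACT_GET_RECORD_ID:
            s->value = s->record_id; /* leave the result in VALUE */
            break;
        default:
            break; /* unknown actions are ignored in this sketch */
        }
        break;
    default:
        break;
    }
}

/* Guest read from BAR0. Reading VALUE returns its current contents. */
uint64_t demo_read(const struct demo_erst *s, uint64_t addr)
{
    return addr == DEMO_REG_VALUE ? s->value : 0;
}
```

A guest would therefore write the operand to VALUE first, then write the action code to ACTION, and finally read VALUE to fetch any result.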

> 
> 
>> +
>> +BAR1 contains the record exchange buffer, and the size of this
>> +buffer sets the maximum record size. This record exchange
>> +buffer size is 8KiB.
> s/^^^/
> BAR1 contains the 8KiB record exchange buffer, which is the implemented maximum record size limit.
Corrected

> 
> 
>> +Backing File
> 
> s/^^^/Backing Storage Format/
Corrected

> 
>> +------------
> 
> 
>> +
>> +The ACPI ERST persistent storage is contained within a single backing
>> +file. The size and location of the backing file is specified upon
>> +QEMU startup of the ACPI ERST device.
> 
> I'd drop above paragraph and describe file format here,
> ultimately used backend doesn't have to be a file. For
> example if user doesn't need it persist over QEMU restarts,
> ram backend could be used, guest will still be able to see
> it's own crash log after guest is reboot, or it could be
> memfd backend passed to QEMU by mgmt layer.
Dropped

> 
> 
>> +Records are stored in the backing file in a simple fashion.
> s/backing file/backend storage/
> ditto for other occurrences
Corrected

> 
>> +The backing file is essentially divided into fixed size
>> +"slots", ERST_RECORD_SIZE in length, with each "slot"
>> +storing a single record.
> 
>> No attempt at optimizing storage
>> +through compression, compaction, etc is attempted.
> s/^^^//

I'd like to keep this statement. It is there because in a number of
hardware BIOS I tested, these kinds of features lead to bugs in the
ERST support.

> 
>> +NOTE that any change to this value will make any pre-
>> +existing backing files, not of the same ERST_RECORD_SIZE,
>> +unusable to the guest.
> when that can happen, can we detect it and error out?
I've dropped this statement. That value is hard coded, and not a
parameter, so there is no simple way to change it. This comment
does exist next to the ERST_RECORD_SIZE declaration in the code.

> 
> 
>> +Below is an example layout of the backing store file.
>> +The size of the file is a multiple of ERST_RECORD_SIZE,
>> +and contains N number of "slots" to store records. The
>> +example below shows two records (in CPER format) in the
>> +backing file, while the remaining slots are empty/
>> +available.
>> +
>> + Slot   Record
>> +        +--------------------------------------------+
>> +    0   | empty/available                            |
>> +        +--------------------------------------------+
>> +    1   | CPER                                       |
>> +        +--------------------------------------------+
>> +    2   | CPER                                       |
>> +        +--------------------------------------------+
>> +  ...   |                                            |
>> +        +--------------------------------------------+
>> +    N   | empty/available                            |
>> +        +--------------------------------------------+
>> +        <-------------- ERST_RECORD_SIZE ------------>
> 
> 
>> +Not all slots need to be occupied, and they need not be
>> +occupied in a contiguous fashion. The ability to clear/erase
>> +specific records allows for the formation of unoccupied
>> +slots.
> I'd drop this as not necessary

I'd like to keep this statement. Again, several BIOS on which I tested
ERST had bugs around non-contiguous record storage.
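
The fixed-size slot layout discussed above can be sketched as follows. This is only an illustration under two assumptions that are not confirmed by the patch: that the slot size equals the 8KiB exchange buffer, and that an occupied slot is recognized by the "CPER" signature that begins a CPER record header. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed slot size, matching the 8KiB record exchange buffer. */
#define DEMO_RECORD_SIZE 8192u

/* Byte offset of a slot within the backing storage. */
size_t demo_slot_offset(size_t slot)
{
    return slot * DEMO_RECORD_SIZE;
}

/* Number of whole slots that fit in the backing storage. */
size_t demo_slot_count(size_t storage_size)
{
    return storage_size / DEMO_RECORD_SIZE;
}

/* Assumption: a slot is occupied if it starts with the "CPER" signature. */
int demo_slot_occupied(const uint8_t *storage, size_t slot)
{
    return memcmp(storage + demo_slot_offset(slot), "CPER", 4) == 0;
}

/* Return the index of the first empty slot, or -1 if every slot is used.
 * Occupied slots need not be contiguous, so every slot is checked. */
long demo_find_empty_slot(const uint8_t *storage, size_t storage_size)
{
    size_t n = demo_slot_count(storage_size);

    for (size_t i = 0; i < n; i++) {
        if (!demo_slot_occupied(storage, i)) {
            return (long)i;
        }
    }
    return -1;
}
```

Because no compaction is attempted, clearing a record only marks its slot as available again, and the linear scan above finds such holes without any extra bookkeeping.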
> 
> 
>> +
>> +
>> +References
>> +----------
>> +
>> +[1] "Advanced Configuration and Power Interface Specification",
>> +    version 4.0, June 2009.
>> +
>> +[2] "Unified Extensible Firmware Interface Specification",
>> +    version 2.1, October 2008.
>> +
>> --
>> 1.8.3.1
>>
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST
  2021-07-19 15:06   ` Igor Mammedov
@ 2021-07-21 15:42     ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 15:42 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/19/21 10:06 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:14 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> This change declares the PCI device_id for the new ACPI ERST
> 
> s/This change declares/Reserve/
Corrected

> 
>> device.
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   include/hw/pci/pci.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>> index 6be4e0c..eef3ef4 100644
>> --- a/include/hw/pci/pci.h
>> +++ b/include/hw/pci/pci.h
>> @@ -108,6 +108,7 @@ extern bool pci_available;
>>   #define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
>>   #define PCI_DEVICE_ID_REDHAT_NVME        0x0010
>>   #define PCI_DEVICE_ID_REDHAT_PVPANIC     0x0011
>> +#define PCI_DEVICE_ID_REDHAT_ACPI_ERST   0x0012
>>   #define PCI_DEVICE_ID_REDHAT_QXL         0x0100
>>   
>>   #define FMT_PCIBUS                      PRIx64
> 



* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-20 12:17   ` Igor Mammedov
@ 2021-07-21 16:07     ` Eric DeVolder
  2021-07-26 10:42       ` Igor Mammedov
  2021-07-21 17:36     ` Eric DeVolder
  1 sibling, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 16:07 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 7:17 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:16 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> This change implements the support for the ACPI ERST feature.
> Drop this
Done

> 
>>
>> This implements a PCI device for ACPI ERST. This implments the
> s/implments/implements/
Corrected

> 
>> non-NVRAM "mode" of operation for ERST.
> add here why non-NVRAM "mode" is implemented.
How about:
This implements a PCI device for ACPI ERST. This implements the
non-NVRAM "mode" of operation for ERST as it is supported by
Linux and Windows and aligns with ERST support in most BIOS.


> 
> Also even if this non-NVRAM implementation, there is still
> a lot of not necessary data copying (see below) so drop it
> or justify why it's there.
>   
>> This change also includes erst.c in the build of general ACPI support.
> Drop this as well
Done

> 
> 
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   hw/acpi/erst.c      | 704 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   hw/acpi/meson.build |   1 +
>>   2 files changed, 705 insertions(+)
>>   create mode 100644 hw/acpi/erst.c
>>
>> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
>> new file mode 100644
>> index 0000000..6e9bd2e
>> --- /dev/null
>> +++ b/hw/acpi/erst.c
>> @@ -0,0 +1,704 @@
>> +/*
>> + * ACPI Error Record Serialization Table, ERST, Implementation
>> + *
>> + * Copyright (c) 2021 Oracle and/or its affiliates.
>> + *
>> + * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
>> + * ACPI Platform Error Interfaces : Error Serialization
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation;
>> + * version 2 of the License.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
>> + */
>> +
>> +#include <sys/types.h>
>> +#include <sys/stat.h>
>> +#include <unistd.h>
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "hw/qdev-core.h"
>> +#include "exec/memory.h"
>> +#include "qom/object.h"
>> +#include "hw/pci/pci.h"
>> +#include "qom/object_interfaces.h"
>> +#include "qemu/error-report.h"
>> +#include "migration/vmstate.h"
>> +#include "hw/qdev-properties.h"
>> +#include "hw/acpi/acpi.h"
>> +#include "hw/acpi/acpi-defs.h"
>> +#include "hw/acpi/aml-build.h"
>> +#include "hw/acpi/bios-linker-loader.h"
>> +#include "exec/address-spaces.h"
>> +#include "sysemu/hostmem.h"
>> +#include "hw/acpi/erst.h"
>> +#include "trace.h"
>> +
>> +/* UEFI 2.1: Append N Common Platform Error Record */
>> +#define UEFI_CPER_RECORD_MIN_SIZE 128U
>> +#define UEFI_CPER_RECORD_LENGTH_OFFSET 20U
>> +#define UEFI_CPER_RECORD_ID_OFFSET 96U
>> +#define IS_UEFI_CPER_RECORD(ptr) \
>> +    (((ptr)[0] == 'C') && \
>> +     ((ptr)[1] == 'P') && \
>> +     ((ptr)[2] == 'E') && \
>> +     ((ptr)[3] == 'R'))
>> +#define THE_UEFI_CPER_RECORD_ID(ptr) \
>> +    (*(uint64_t *)(&(ptr)[UEFI_CPER_RECORD_ID_OFFSET]))
>> +
>> +/*
>> + * This implementation is an ACTION (cmd) and VALUE (data)
>> + * interface consisting of just two 64-bit registers.
>> + */
>> +#define ERST_REG_SIZE (2UL * sizeof(uint64_t))
> 
>> +#define ERST_CSR_ACTION (0UL << 3) /* action (cmd) */
>> +#define ERST_CSR_VALUE  (1UL << 3) /* argument/value (data) */
> what's meaning of CRS?
CSR = control status register
> Looking at patch both should be called ERST_[ACTION|VALUE]_OFFSET
Done
> pls use explicit offset values instead of shifting bit.
Done
> 
> 
>> +/*
>> + * ERST_RECORD_SIZE is the buffer size for exchanging ERST
>> + * record contents. Thus, it defines the maximum record size.
>> + * As this is mapped through a PCI BAR, it must be a power of
>> + * two, and should be at least PAGE_SIZE.
>> + * Records are stored in the backing file in a simple fashion.
>> + * The backing file is essentially divided into fixed size
>> + * "slots", ERST_RECORD_SIZE in length, with each "slot"
>> + * storing a single record. No attempt at optimizing storage
>> + * through compression, compaction, etc is attempted.
>> + * NOTE that any change to this value will make any pre-
>> + * existing backing files, not of the same ERST_RECORD_SIZE,
>> + * unusable to the guest.
>> + */
>> +/* 8KiB records, not too small, not too big */
>> +#define ERST_RECORD_SIZE (2UL * 4096)
>> +
>> +#define ERST_INVALID_RECORD_ID (~0UL)
>> +#define ERST_EXECUTE_OPERATION_MAGIC 0x9CUL
>> +
>> +/*
>> + * Object cast macro
>> + */
>> +#define ACPIERST(obj) \
>> +    OBJECT_CHECK(ERSTDeviceState, (obj), TYPE_ACPI_ERST)
>> +
>> +/*
>> + * Main ERST device state structure
>> + */
>> +typedef struct {
>> +    PCIDevice parent_obj;
>> +
>> +    HostMemoryBackend *hostmem;
>> +    MemoryRegion *hostmem_mr;
>> +
>> +    MemoryRegion iomem; /* programming registes */
>> +    MemoryRegion nvmem; /* exchange buffer */
>> +    uint32_t prop_size;
> s/^^^/storage_size/
Corrected

> 
>> +    hwaddr bar0; /* programming registers */
>> +    hwaddr bar1; /* exchange buffer */
> why do you need to keep this addresses around?
> Suggest to drop these fields and use local variables or pci_get_bar_addr() at call site.
Corrected

> 
>> +
>> +    uint8_t operation;
>> +    uint8_t busy_status;
>> +    uint8_t command_status;
>> +    uint32_t record_offset;
>> +    uint32_t record_count;
>> +    uint64_t reg_action;
>> +    uint64_t reg_value;
>> +    uint64_t record_identifier;
>> +
>> +    unsigned next_record_index;
> 
> 
>> +    uint8_t record[ERST_RECORD_SIZE]; /* read/written directly by guest */
>> +    uint8_t tmp_record[ERST_RECORD_SIZE]; /* intermediate manipulation buffer */
> drop these see [**] below
Corrected

> 
>> +
>> +} ERSTDeviceState;
>> +
>> +/*******************************************************************/
>> +/*******************************************************************/
>> +
>> +static unsigned copy_from_nvram_by_index(ERSTDeviceState *s, unsigned index)
>> +{
>> +    /* Read an nvram entry into tmp_record */
>> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
>> +    off_t offset = (index * ERST_RECORD_SIZE);
>> +
>> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
>> +        if (s->hostmem_mr) {
>> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
>> +            memcpy(s->tmp_record, p + offset, ERST_RECORD_SIZE);
>> +            rc = ACPI_ERST_STATUS_SUCCESS;
>> +        }
>> +    }
>> +    return rc;
>> +}
>> +
>> +static unsigned copy_to_nvram_by_index(ERSTDeviceState *s, unsigned index)
>> +{
>> +    /* Write entry in tmp_record into nvram, and backing file */
>> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
>> +    off_t offset = (index * ERST_RECORD_SIZE);
>> +
>> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
>> +        if (s->hostmem_mr) {
>> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
>> +            memcpy(p + offset, s->tmp_record, ERST_RECORD_SIZE);
>> +            rc = ACPI_ERST_STATUS_SUCCESS;
>> +        }
>> +    }
>> +    return rc;
>> +}
>> +
>> +static int lookup_erst_record_by_identifier(ERSTDeviceState *s,
>> +    uint64_t record_identifier, bool *record_found, bool alloc_for_write)
>> +{
>> +    int rc = -1;
>> +    int empty_index = -1;
>> +    int index = 0;
>> +    unsigned rrc;
>> +
>> +    *record_found = 0;
>> +
>> +    do {
>> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);
> 
> you have direct access to backend memory so there is no need
> whatsoever to copy records from it to an intermediate buffer
> everywhere. Almost all operations with records can be done
> in place modulo EXECUTE_OPERATION action in BEGIN_[READ|WRITE]
> context, where record is moved between backend and guest buffer.
> 
> So please eliminate all not necessary copying.
> (for fun, time operations and set backend size to some huge
> value to see how expensive this code is)

I've corrected this. In our previous exchanges, I thought the reference
to copying was about having the guest directly write/read the appropriate
record in the backend storage. After reading this comment I realized that
yes, I was doing a lot of copying (an artifact of the transition away from
direct file i/o to MemoryBackend). So good find, and I've eliminated the
intermediate copying.
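
As a rough illustration of what "in place" means here (a standalone
sketch, not the actual v6 code), the lookup can walk the backend storage
through a pointer and never touch an intermediate buffer. The names
lookup_in_place(), slot_is_cper() and slot_record_id() are made up for
this example; the offsets and record size are the ones from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERST_RECORD_SIZE        8192u       /* slot size from the patch */
#define CPER_RECORD_ID_OFFSET   96u         /* UEFI CPER record id field */
#define ERST_INVALID_RECORD_ID  UINT64_MAX  /* all ones marks an empty slot */

/* Hypothetical helper: true if the slot starts with the "CPER" signature. */
static bool slot_is_cper(const uint8_t *slot)
{
    return memcmp(slot, "CPER", 4) == 0;
}

/* Hypothetical helper: read the record id without an unaligned dereference. */
static uint64_t slot_record_id(const uint8_t *slot)
{
    uint64_t id;

    memcpy(&id, slot + CPER_RECORD_ID_OFFSET, sizeof(id));
    return id;
}

/*
 * Scan the storage in place. Returns the index of the slot holding
 * record_id, or -1 if it is absent. If first_empty is non-NULL it
 * receives the first empty slot seen before the scan stopped, or -1.
 */
static int lookup_in_place(const uint8_t *storage, size_t storage_size,
                           uint64_t record_id, int *first_empty)
{
    size_t nslots = storage_size / ERST_RECORD_SIZE;
    int empty = -1;
    int found = -1;

    for (size_t i = 0; i < nslots; i++) {
        const uint8_t *slot = storage + i * ERST_RECORD_SIZE;
        uint64_t id = slot_record_id(slot);

        if (slot_is_cper(slot) && id == record_id) {
            found = (int)i;
            break;
        }
        if (id == ERST_INVALID_RECORD_ID && empty < 0) {
            empty = (int)i; /* first slot available for a write */
        }
    }
    if (first_empty) {
        *first_empty = empty;
    }
    return found;
}
```

In the device, storage would presumably be the pointer obtained from
memory_region_get_ram_ptr() on the backend region, as the v5 code
already does in copy_from_nvram_by_index().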

> 
>> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
>> +            uint64_t this_identifier;
>> +            this_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
>> +            if (IS_UEFI_CPER_RECORD(s->tmp_record) &&
>> +                (this_identifier == record_identifier)) {
>> +                rc = index;
>> +                *record_found = 1;
>> +                break;
>> +            }
>> +            if ((this_identifier == ERST_INVALID_RECORD_ID) &&
>> +                (empty_index < 0)) {
>> +                empty_index = index; /* first available for write */
>> +            }
>> +        }
>> +        ++index;
>> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
>> +
>> +    /* Record not found, allocate for writing */
>> +    if ((rc < 0) && alloc_for_write) {
>> +        rc = empty_index;
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>> +static unsigned clear_erst_record(ERSTDeviceState *s)
>> +{
>> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
>> +    bool record_found;
>> +    int index;
>> +
>> +    index = lookup_erst_record_by_identifier(s,
>> +        s->record_identifier, &record_found, 0);
>> +    if (record_found) {
>> +        memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
>> +        rc = copy_to_nvram_by_index(s, (unsigned)index);
>> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
>> +            s->record_count -= 1;
>> +        }
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>> +static unsigned write_erst_record(ERSTDeviceState *s)
>> +{
>> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
>> +
>> +    if (s->record_offset < (ERST_RECORD_SIZE - UEFI_CPER_RECORD_MIN_SIZE)) {
>> +        uint64_t record_identifier;
>> +        uint8_t *record = &s->record[s->record_offset];
>> +        bool record_found;
>> +        int index;
>> +
>> +        record_identifier = (s->record_identifier == ERST_INVALID_RECORD_ID)
>> +            ? THE_UEFI_CPER_RECORD_ID(record) : s->record_identifier;
>> +
>> +        index = lookup_erst_record_by_identifier(s,
>> +            record_identifier, &record_found, 1);
>> +        if (index < 0) {
>> +            rc = ACPI_ERST_STATUS_NOT_ENOUGH_SPACE;
>> +        } else {
>> +            if (0 != s->record_offset) {
>> +                memset(&s->tmp_record[ERST_RECORD_SIZE - s->record_offset],
>> +                    0xFF, s->record_offset);
>> +            }
>> +            memcpy(s->tmp_record, record, ERST_RECORD_SIZE - s->record_offset);
>> +            rc = copy_to_nvram_by_index(s, (unsigned)index);
>> +            if (rc == ACPI_ERST_STATUS_SUCCESS) {
>> +                if (!record_found) { /* not overwriting existing record */
>> +                    s->record_count += 1; /* writing new record */
>> +                }
>> +            }
>> +        }
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>> +static unsigned next_erst_record(ERSTDeviceState *s,
>> +    uint64_t *record_identifier)
>> +{
>> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
>> +    unsigned index;
>> +    unsigned rrc;
>> +
>> +    *record_identifier = ERST_INVALID_RECORD_ID;
>> +
>> +    index = s->next_record_index;
>> +    do {
>> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);
>> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
>> +            if (IS_UEFI_CPER_RECORD(s->tmp_record)) {
>> +                s->next_record_index = index + 1; /* where to start next time */
>> +                *record_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
>> +                rc = ACPI_ERST_STATUS_SUCCESS;
>> +                break;
>> +            }
>> +            ++index;
>> +        } else {
>> +            if (s->next_record_index == 0) {
>> +                rc = ACPI_ERST_STATUS_RECORD_STORE_EMPTY;
>> +            }
>> +            s->next_record_index = 0; /* at end, reset */
>> +        }
>> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
>> +
>> +    return rc;
>> +}
>> +
>> +static unsigned read_erst_record(ERSTDeviceState *s)
>> +{
>> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
>> +    bool record_found;
>> +    int index;
>> +
>> +    index = lookup_erst_record_by_identifier(s,
>> +        s->record_identifier, &record_found, 0);
>> +    if (record_found) {
>> +        rc = copy_from_nvram_by_index(s, (unsigned)index);
>> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
>> +            if (s->record_offset < ERST_RECORD_SIZE) {
>> +                memcpy(&s->record[s->record_offset], s->tmp_record,
>> +                    ERST_RECORD_SIZE - s->record_offset);
>> +            }
>> +        }
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>> +static unsigned get_erst_record_count(ERSTDeviceState *s)
>> +{
>> +    /* Compute record_count */
>> +    unsigned index = 0;
>> +
>> +    s->record_count = 0;
>> +    while (copy_from_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
>> +        uint8_t *ptr = &s->tmp_record[0];
>> +        uint64_t record_identifier = THE_UEFI_CPER_RECORD_ID(ptr);
>> +        if (IS_UEFI_CPER_RECORD(ptr) &&
>> +            (ERST_INVALID_RECORD_ID != record_identifier)) {
>> +            s->record_count += 1;
>> +        }
>> +        ++index;
>> +    }
>> +    return s->record_count;
>> +}
>> +
>> +/*******************************************************************/
>> +
>> +static uint64_t erst_rd_reg64(hwaddr addr,
>> +    uint64_t reg, unsigned size)
>> +{
>> +    uint64_t rdval;
>> +    uint64_t mask;
>> +    unsigned shift;
>> +
>> +    if (size == sizeof(uint64_t)) {
>> +        /* 64b access */
>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
>> +        shift = 0;
>> +    } else {
>> +        /* 32b access */
>> +        mask = 0x00000000FFFFFFFFUL;
>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
>> +    }
>> +
>> +    rdval = reg;
>> +    rdval >>= shift;
>> +    rdval &= mask;
>> +
>> +    return rdval;
>> +}
>> +
>> +static uint64_t erst_wr_reg64(hwaddr addr,
>> +    uint64_t reg, uint64_t val, unsigned size)
>> +{
>> +    uint64_t wrval;
>> +    uint64_t mask;
>> +    unsigned shift;
>> +
>> +    if (size == sizeof(uint64_t)) {
>> +        /* 64b access */
>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
>> +        shift = 0;
>> +    } else {
>> +        /* 32b access */
>> +        mask = 0x00000000FFFFFFFFUL;
>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
>> +    }
>> +
>> +    val &= mask;
>> +    val <<= shift;
>> +    mask <<= shift;
>> +    wrval = reg;
>> +    wrval &= ~mask;
>> +    wrval |= val;
>> +
>> +    return wrval;
>> +}
> (I see in next patch it's us defining access width in the ACPI tables)
> so question is: do we have to have mixed register width access?
> can't all register accesses be 64-bit?

Initially I attempted to just use 64-bit exclusively. The problem is that,
for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
register access into two. Here's an example of reading the exchange buffer
address, which is coded as a 64b access:

acpi_erst_reg_write addr: 0x0000 <== 0x000000000000000d (size: 4)
acpi_erst_reg_read  addr: 0x0008 ==> 0x00000000c1010000 (size: 4)
acpi_erst_reg_read  addr: 0x000c ==> 0x0000000000000000 (size: 4)

So I went ahead and made ACTION register accesses 32b; otherwise there
would be two 32-bit reads, of which the second is useless.
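
To make the split concrete, here is a standalone sketch of how a 64-bit
register can be served when the guest accesses it as two 32-bit halves.
It mirrors the erst_rd_reg64()/erst_wr_reg64() logic in the patch; the
names reg64_read() and reg64_write() are made up for this example.

```c
#include <stdint.h>

/* Read 'size' bytes (4 or 8) at byte offset 'addr' of a 64-bit register. */
static uint64_t reg64_read(uint64_t addr, uint64_t reg, unsigned size)
{
    unsigned shift;

    if (size == sizeof(uint64_t)) {
        return reg; /* full-width access */
    }
    /* 32-bit access: bit 2 of the address selects the upper half */
    shift = (addr & 0x4) ? 32 : 0;
    return (reg >> shift) & UINT64_C(0xFFFFFFFF);
}

/* Merge a 4- or 8-byte write into a 64-bit register, return the new value. */
static uint64_t reg64_write(uint64_t addr, uint64_t reg, uint64_t val,
                            unsigned size)
{
    unsigned shift;
    uint64_t mask;

    if (size == sizeof(uint64_t)) {
        return val; /* full-width access replaces everything */
    }
    shift = (addr & 0x4) ? 32 : 0;
    mask = UINT64_C(0xFFFFFFFF) << shift;
    /* keep the untouched half, replace the addressed half */
    return (reg & ~mask) | ((val << shift) & mask);
}
```

With the trace above, reads of size 4 at offsets 0x8 and 0xc of the VALUE
register return the low and high halves of 0x00000000c1010000.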

> 
>> +static void erst_reg_write(void *opaque, hwaddr addr,
>> +    uint64_t val, unsigned size)
>> +{
>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>> +
>> +    /*
>> +     * NOTE: All actions/operations/side effects happen on the WRITE,
>> +     * by design. The READs simply return the reg_value contents.
>> +     */
>> +    trace_acpi_erst_reg_write(addr, val, size);
>> +
>> +    switch (addr) {
>> +    case ERST_CSR_VALUE + 0:
>> +    case ERST_CSR_VALUE + 4:
>> +        s->reg_value = erst_wr_reg64(addr, s->reg_value, val, size);
>> +        break;
>> +    case ERST_CSR_ACTION + 0:
>> +/*  case ERST_CSR_ACTION+4: as coded, not really a 64b register */
>> +        switch (val) {
>> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
>> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>> +        case ACPI_ERST_ACTION_END_OPERATION:
>> +            s->operation = val;
>> +            break;
>> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
>> +            s->record_offset = s->reg_value;
>> +            break;
>> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
>> +            if ((uint8_t)s->reg_value == ERST_EXECUTE_OPERATION_MAGIC) {
>> +                s->busy_status = 1;
>> +                switch (s->operation) {
>> +                case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
>> +                    s->command_status = write_erst_record(s);
>> +                    break;
>> +                case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>> +                    s->command_status = read_erst_record(s);
>> +                    break;
>> +                case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>> +                    s->command_status = clear_erst_record(s);
>> +                    break;
>> +                case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
>> +                    break;
>> +                case ACPI_ERST_ACTION_END_OPERATION:
>> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
>> +                    break;
>> +                default:
>> +                    s->command_status = ACPI_ERST_STATUS_FAILED;
>> +                    break;
>> +                }
>> +                s->record_identifier = ERST_INVALID_RECORD_ID;
>> +                s->busy_status = 0;
>> +            }
>> +            break;
>> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
>> +            s->reg_value = s->busy_status;
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
>> +            s->reg_value = s->command_status;
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
>> +            s->command_status = next_erst_record(s, &s->reg_value);
>> +            break;
>> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
>> +            s->record_identifier = s->reg_value;
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
>> +            s->reg_value = s->record_count;
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
>> +            s->reg_value = s->bar1;
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
>> +            s->reg_value = ERST_RECORD_SIZE;
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
>> +            s->reg_value = 0x0; /* intentional, not NVRAM mode */
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
>> +            /*
>> +             * 100UL is max, 10UL is nominal
> 100/10 of what, also add reference to spec/table it comes from
> and explain in comment why theses values were chosen
I've changed the comment and style to be similar to build_amd_iommu().
These are merely sane non-zero max/min times.
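
For reference, a small sketch of the encoding, assuming (as the v5 code
does) that the maximum time sits in bits 63:32 and the nominal time in
bits 31:0, and assuming the unit is microseconds as I read the ACPI
spec. The helper name erst_encode_timings() is made up for this example.
Using fixed-width types also sidesteps the "100UL << 32" shift on hosts
where unsigned long is 32 bits.

```c
#include <stdint.h>

/*
 * Pack the EXECUTE_OPERATION timings into one 64-bit value:
 *   bits 63:32  maximum time (assumed microseconds)
 *   bits 31:0   nominal time (assumed microseconds)
 */
static uint64_t erst_encode_timings(uint32_t max_us, uint32_t nominal_us)
{
    return ((uint64_t)max_us << 32) | (uint64_t)nominal_us;
}
```

The values used in the patch would then be erst_encode_timings(100, 10).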

> 
>> +             */
>> +            s->reg_value = ((100UL << 32) | (10UL << 0));
>> +            break;
>> +        case ACPI_ERST_ACTION_RESERVED:
> not necessary, it will be handled by 'default:'
Corrected

> 
>> +        default:
>> +            /*
>> +             * Unknown action/command, NOP
>> +             */
>> +            break;
>> +        }
>> +        break;
>> +    default:
>> +        /*
>> +         * This should not happen, but if it does, NOP
>> +         */
>> +        break;
>> +    }
>> +}
>> +
>> +static uint64_t erst_reg_read(void *opaque, hwaddr addr,
>> +                                unsigned size)
>> +{
>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>> +    uint64_t val = 0;
>> +
>> +    switch (addr) {
>> +    case ERST_CSR_ACTION + 0:
>> +    case ERST_CSR_ACTION + 4:
>> +        val = erst_rd_reg64(addr, s->reg_action, size);
>> +        break;
>> +    case ERST_CSR_VALUE + 0:
>> +    case ERST_CSR_VALUE + 4:
>> +        val = erst_rd_reg64(addr, s->reg_value, size);
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +    trace_acpi_erst_reg_read(addr, val, size);
>> +    return val;
>> +}
>> +
>> +static const MemoryRegionOps erst_reg_ops = {
>> +    .read = erst_reg_read,
>> +    .write = erst_reg_write,
>> +    .endianness = DEVICE_NATIVE_ENDIAN,
>> +};
>> +
>> +static void erst_mem_write(void *opaque, hwaddr addr,
>> +    uint64_t val, unsigned size)
>> +{
>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> 
>> +    uint8_t *ptr = &s->record[addr - 0];
>> +    trace_acpi_erst_mem_write(addr, val, size);
>> +    switch (size) {
>> +    default:
>> +    case sizeof(uint8_t):
>> +        *(uint8_t *)ptr = (uint8_t)val;
>> +        break;
>> +    case sizeof(uint16_t):
>> +        *(uint16_t *)ptr = (uint16_t)val;
>> +        break;
>> +    case sizeof(uint32_t):
>> +        *(uint32_t *)ptr = (uint32_t)val;
>> +        break;
>> +    case sizeof(uint64_t):
>> +        *(uint64_t *)ptr = (uint64_t)val;
>> +        break;
>> +    }
>> +}
>> +
>> +static uint64_t erst_mem_read(void *opaque, hwaddr addr,
>> +                                unsigned size)
>> +{
>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>> +    uint8_t *ptr = &s->record[addr - 0];
>> +    uint64_t val = 0;
>> +    switch (size) {
>> +    default:
>> +    case sizeof(uint8_t):
>> +        val = *(uint8_t *)ptr;
>> +        break;
>> +    case sizeof(uint16_t):
>> +        val = *(uint16_t *)ptr;
>> +        break;
>> +    case sizeof(uint32_t):
>> +        val = *(uint32_t *)ptr;
>> +        break;
>> +    case sizeof(uint64_t):
>> +        val = *(uint64_t *)ptr;
>> +        break;
>> +    }
>> +    trace_acpi_erst_mem_read(addr, val, size);
>> +    return val;
>> +}
>> +
>> +static const MemoryRegionOps erst_mem_ops = {
>> +    .read = erst_mem_read,
>> +    .write = erst_mem_write,
>> +    .endianness = DEVICE_NATIVE_ENDIAN,
>> +};
>> +
>> +/*******************************************************************/
>> +/*******************************************************************/
>> +
>> +static const VMStateDescription erst_vmstate  = {
>> +    .name = "acpi-erst",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_UINT8(operation, ERSTDeviceState),
>> +        VMSTATE_UINT8(busy_status, ERSTDeviceState),
>> +        VMSTATE_UINT8(command_status, ERSTDeviceState),
>> +        VMSTATE_UINT32(record_offset, ERSTDeviceState),
>> +        VMSTATE_UINT32(record_count, ERSTDeviceState),
>> +        VMSTATE_UINT64(reg_action, ERSTDeviceState),
>> +        VMSTATE_UINT64(reg_value, ERSTDeviceState),
>> +        VMSTATE_UINT64(record_identifier, ERSTDeviceState),
>> +        VMSTATE_UINT8_ARRAY(record, ERSTDeviceState, ERST_RECORD_SIZE),
>> +        VMSTATE_UINT8_ARRAY(tmp_record, ERSTDeviceState, ERST_RECORD_SIZE),
>> +        VMSTATE_END_OF_LIST()
>> +    }
>> +};
>> +
>> +static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
>> +{
>> +    ERSTDeviceState *s = ACPIERST(pci_dev);
>> +    unsigned index = 0;
>> +    bool share;
>> +
>> +    trace_acpi_erst_realizefn_in();
>> +
>> +    if (!s->hostmem) {
>> +        error_setg(errp, "'" ACPI_ERST_MEMDEV_PROP "' property is not set");
>> +        return;
>> +    } else if (host_memory_backend_is_mapped(s->hostmem)) {
>> +        error_setg(errp, "can't use already busy memdev: %s",
>> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
>> +        return;
>> +    }
>> +
>> +    share = object_property_get_bool(OBJECT(s->hostmem), "share", &error_fatal);
> s/&error_fatal/errp/
Corrected

> 
>> +    if (!share) {
>> +        error_setg(errp, "ACPI ERST requires hostmem property share=on: %s",
>> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
>> +    }
> This limits possible to use backends to file|memfd only, so
> I wonder if really need this limitation, what if user doesn't
> care about preserving it across QEMU restarts. (i.e. usecase
> where storage is used as a means to troubleshoot guest crash
> i.e. QEMU is not restarted in between)
> 
> Maybe instead of enforcing we should just document that if user
> wishes to preserve content they should use file|memfd backend with
> share=on option.

I've removed this check. It is documented the way it is intended to be used.

> 
>> +
>> +    s->hostmem_mr = host_memory_backend_get_memory(s->hostmem);
>> +
>> +    /* HostMemoryBackend size will be multiple of PAGE_SIZE */
>> +    s->prop_size = object_property_get_int(OBJECT(s->hostmem), "size", &error_fatal);
> s/&error_fatal/errp/
Corrected

> 
>> +
>> +    /* Convert prop_size to integer multiple of ERST_RECORD_SIZE */
>> +    s->prop_size -= (s->prop_size % ERST_RECORD_SIZE);
> 
> pls, no fixups on behalf of user, if size is not what it should be
> error out with suggestion how to fix it.
Removed

> 
>> +
>> +    /*
>> +     * MemoryBackend initializes contents to zero, but we actually
>> +     * want contents initialized to 0xFF, ERST_INVALID_RECORD_ID.
>> +     */
>> +    if (copy_from_nvram_by_index(s, 0) == ACPI_ERST_STATUS_SUCCESS) {
>> +        if (s->tmp_record[0] == 0x00) {
>> +            memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
> this doesn't scale,
> (set backend size to more than host physical RAM, put it on slow storage and have fun.)
Of course, which is why I think we need to have an upper bound (my early
submissions did).

> 
> Is it possible to use 0 as invalid record id or change storage format
> so you would not have to rewrite whole file at startup (maybe some sort
no

> of metadata header/records book-keeping table before actual records.
> And initialize file only if header is invalid.)
I have to scan the backend storage anyway in order to initialize the record
count, so I've combined that scan with a test to see if the backend storage
needs to be initialized.
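
Roughly, the combined pass could look like the standalone sketch below.
This is not the actual v6 code: the function name scan_and_init() is made
up, and the test for "needs to be initialized" (first byte of the slot is
0x00, as in the v5 realize function) is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERST_RECORD_SIZE        8192u       /* slot size from the patch */
#define CPER_RECORD_ID_OFFSET   96u         /* UEFI CPER record id field */
#define ERST_INVALID_RECORD_ID  UINT64_MAX  /* all ones marks an empty slot */

/*
 * Single pass over the backend storage: count the valid CPER records
 * and fill any slot that still looks freshly zeroed with 0xFF so that
 * it reads as empty/available. Returns the record count.
 */
static unsigned scan_and_init(uint8_t *storage, size_t storage_size)
{
    size_t nslots = storage_size / ERST_RECORD_SIZE;
    unsigned count = 0;

    for (size_t i = 0; i < nslots; i++) {
        uint8_t *slot = storage + i * ERST_RECORD_SIZE;

        if (memcmp(slot, "CPER", 4) == 0) {
            uint64_t id;

            /* memcpy avoids an unaligned 64-bit load */
            memcpy(&id, slot + CPER_RECORD_ID_OFFSET, sizeof(id));
            if (id != ERST_INVALID_RECORD_ID) {
                count++;
            }
        } else if (slot[0] == 0x00) {
            /* assumed marker of a never-initialized slot */
            memset(slot, 0xFF, ERST_RECORD_SIZE);
        }
    }
    return count;
}
```

A second call on already initialized storage only reads, so the cost of
rewriting the file is paid once per fresh backend.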

> 
>> +            index = 0;
>> +            while (copy_to_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
>> +                ++index;
>> +            }
> also back&forth copying here is not really necessary.
corrected

> 
>> +        }
>> +    }
>> +
>> +    /* Initialize record_count */
>> +    get_erst_record_count(s);
> why not put it into reset?
It is initialized once, then subsequent write/clear operations update
the counter as needed.

> 
>> +
>> +    /* BAR 0: Programming registers */
>> +    memory_region_init_io(&s->iomem, OBJECT(pci_dev), &erst_reg_ops, s,
>> +                          TYPE_ACPI_ERST, ERST_REG_SIZE);
>> +    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->iomem);
>> +
> 
>> +    /* BAR 1: Exchange buffer memory */
>> +    memory_region_init_io(&s->nvmem, OBJECT(pci_dev), &erst_mem_ops, s,
>> +                          TYPE_ACPI_ERST, ERST_RECORD_SIZE);
>> +    pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->nvmem);
> 
> **)
> instead of using mmio for buffer where each write causes
> guest exit to QEMU, map memory region directly to guest.
> see ivshmem_bar2, the only difference with ivshmem, you'd
> create memory region manually (for example you can use
> memory_region_init_resizeable_ram)
> 
> this way you can speedup access and drop erst_mem_ops and
> [tmp_]record intermediate buffers.
> 
> Instead of [tmp_]record you can copy record content
> directly between buffer and backend memory regions.

I've changed the exchange buffer into a MemoryBackend object and
eliminated the erst_mem_ops.

> 
>> +    /*
>> +     * The vmstate_register_ram_global() puts the memory in
>> +     * migration stream, where it is written back to the memory
>> +     * upon reaching the destination, which causes the backing
>> +     * file to be updated (with share=on).
>> +     */
>> +    vmstate_register_ram_global(s->hostmem_mr);
>> +
>> +    trace_acpi_erst_realizefn_out(s->prop_size);
>> +}
>> +
>> +static void erst_reset(DeviceState *dev)
>> +{
>> +    ERSTDeviceState *s = ACPIERST(dev);
>> +
>> +    trace_acpi_erst_reset_in(s->record_count);
>> +    s->operation = 0;
>> +    s->busy_status = 0;
>> +    s->command_status = ACPI_ERST_STATUS_SUCCESS;
> 
>> +    /* indicate empty/no-more until further notice */
> pls rephrase, I'm not sure what it's trying to say
Eliminated; I don't know what I was trying to say there either
> 
>> +    s->record_identifier = ERST_INVALID_RECORD_ID;
>> +    s->record_offset = 0;
>> +    s->next_record_index = 0;
> 
>> +    /* NOTE: record_count and nvram are initialized elsewhere */
>> +    trace_acpi_erst_reset_out(s->record_count);
>> +}
>> +
>> +static Property erst_properties[] = {
>> +    DEFINE_PROP_LINK(ACPI_ERST_MEMDEV_PROP, ERSTDeviceState, hostmem,
>> +                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void erst_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>> +
>> +    trace_acpi_erst_class_init_in();
>> +    k->realize = erst_realizefn;
>> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
>> +    k->device_id = PCI_DEVICE_ID_REDHAT_ACPI_ERST;
>> +    k->revision = 0x00;
>> +    k->class_id = PCI_CLASS_OTHERS;
>> +    dc->reset = erst_reset;
>> +    dc->vmsd = &erst_vmstate;
>> +    dc->user_creatable = true;
>> +    device_class_set_props(dc, erst_properties);
>> +    dc->desc = "ACPI Error Record Serialization Table (ERST) device";
>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>> +    trace_acpi_erst_class_init_out();
>> +}
>> +
>> +static const TypeInfo erst_type_info = {
>> +    .name          = TYPE_ACPI_ERST,
>> +    .parent        = TYPE_PCI_DEVICE,
>> +    .class_init    = erst_class_init,
>> +    .instance_size = sizeof(ERSTDeviceState),
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> what is this for here?
> 
>> +        { }
>> +    }
>> +};
>> +
>> +static void erst_register_types(void)
>> +{
>> +    type_register_static(&erst_type_info);
>> +}
>> +
>> +type_init(erst_register_types)
>> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
>> index dd69577..262a8ee 100644
>> --- a/hw/acpi/meson.build
>> +++ b/hw/acpi/meson.build
>> @@ -4,6 +4,7 @@ acpi_ss.add(files(
>>     'aml-build.c',
>>     'bios-linker-loader.c',
>>     'utils.c',
>> +  'erst.c',
>>   ))
>>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
>>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))
> 



* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-07-20 13:16   ` Igor Mammedov
  2021-07-20 14:59     ` Igor Mammedov
@ 2021-07-21 16:12     ` Eric DeVolder
  2021-07-26 11:00       ` Igor Mammedov
  1 sibling, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 16:12 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 8:16 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:17 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> This code is called from the machine code (if ACPI supported)
>> to generate the ACPI ERST table.
> should be along lines:
> This builds ACPI ERST table /spec ref/ to inform OSPM
> how to communicate with ... device.

Like this?
This builds the ACPI ERST table to inform OSPM how to communicate
with the acpi-erst device.



> 
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   hw/acpi/erst.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 214 insertions(+)
>>
>> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
>> index 6e9bd2e..1f1dbbc 100644
>> --- a/hw/acpi/erst.c
>> +++ b/hw/acpi/erst.c
>> @@ -555,6 +555,220 @@ static const MemoryRegionOps erst_mem_ops = {
>>   /*******************************************************************/
>>   /*******************************************************************/
>>   
>> +/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
>> +static void build_serialization_instruction_entry(GArray *table_data,
>> +    uint8_t serialization_action,
>> +    uint8_t instruction,
>> +    uint8_t flags,
>> +    uint8_t register_bit_width,
>> +    uint64_t register_address,
>> +    uint64_t value,
>> +    uint64_t mask)
> like I mentioned in previous patch, It could be simplified
> a lot if it's possible to use fixed 64-bit access with every
> action and the same width mask.
See previous response.
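
For reference, the switch in the quoted helper maps the register bit width
to the Access Size code of the Generic Address Structure. Below is a minimal
standalone sketch of that mapping; the function name is made up for
illustration and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): translate a register bit width
 * into the ACPI Generic Address Structure "Access Size" encoding:
 * 0 = undefined, 1 = byte, 2 = word, 3 = dword, 4 = qword.
 */
static uint8_t gas_access_size_from_bit_width(uint8_t bit_width)
{
    switch (bit_width) {
    case 8:
        return 1;
    case 16:
        return 2;
    case 32:
        return 3;
    case 64:
        return 4;
    default:
        return 0; /* unknown width: leave the access size undefined */
    }
}
```

If every action used a fixed 64-bit access, as suggested above, the helper
would reduce to the constant 4.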

> 
>> +{
>> +    /* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
>> +    struct AcpiGenericAddress gas;
>> +
>> +    /* Serialization Action */
>> +    build_append_int_noprefix(table_data, serialization_action, 1);
>> +    /* Instruction */
>> +    build_append_int_noprefix(table_data, instruction         , 1);
>> +    /* Flags */
>> +    build_append_int_noprefix(table_data, flags               , 1);
>> +    /* Reserved */
>> +    build_append_int_noprefix(table_data, 0                   , 1);
>> +    /* Register Region */
>> +    gas.space_id = AML_SYSTEM_MEMORY;
>> +    gas.bit_width = register_bit_width;
>> +    gas.bit_offset = 0;
>> +    switch (register_bit_width) {
>> +    case 8:
>> +        gas.access_width = 1;
>> +        break;
>> +    case 16:
>> +        gas.access_width = 2;
>> +        break;
>> +    case 32:
>> +        gas.access_width = 3;
>> +        break;
>> +    case 64:
>> +        gas.access_width = 4;
>> +        break;
>> +    default:
>> +        gas.access_width = 0;
>> +        break;
>> +    }
>> +    gas.address = register_address;
>> +    build_append_gas_from_struct(table_data, &gas);
>> +    /* Value */
>> +    build_append_int_noprefix(table_data, value  , 8);
>> +    /* Mask */
>> +    build_append_int_noprefix(table_data, mask   , 8);
>> +}
>> +
>> +/* ACPI 4.0: 17.4.1 Serialization Action Table */
>> +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
>> +    const char *oem_id, const char *oem_table_id)
>> +{
>> +    ERSTDeviceState *s = ACPIERST(erst_dev);
> 
> globals are not welcomed in new code,
> pass erst_dev as argument here (ex: find_vmgenid_dev)
> 
>> +    unsigned action;
>> +    unsigned erst_start = table_data->len;
>> +
> 
>> +    s->bar0 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
>> +    trace_acpi_erst_pci_bar_0(s->bar0);
>> +    s->bar1 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);
> 
> just store pci_get_bar_addr(PCI_DEVICE(erst_dev), 0) in local variable,
> Bar 1 is not used in this function so you don't need it here.
Corrected

> 
> 
>> +    trace_acpi_erst_pci_bar_1(s->bar1);
>> +
>> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
>> +    /* serialization_header_length */
> comments documenting table entries should be verbatim copy from spec,
> see build_amd_iommu() as example of preferred style.
Corrected

> 
>> +    build_append_int_noprefix(table_data, 48, 4);
>> +    /* reserved */
>> +    build_append_int_noprefix(table_data,  0, 4);
>> +    /*
>> +     * instruction_entry_count - changes to the number of serialization
>> +     * instructions in the ACTIONs below must be reflected in this
>> +     * pre-computed value.
>> +     */
>> +    build_append_int_noprefix(table_data, 29, 4);
> a bit fragile as it can easily diverge from actual number later on.
> maybe instead of building instruction entries in place, build it
> in separate array and when done, use actual count to fill instruction_entry_count.
> pseudo code could look like:
> 
>       /* prepare instructions in advance because ... */
>       GArray table_instruction_data;
>       build_serialization_instruction_entry(table_instruction_data,...);;
>       ...
>       build_serialization_instruction_entry(table_instruction_data,...);
>       /* instructions count */
>       build_append_int_noprefix(table_data, table_instruction_data.len/entry_size, 4);
>       /* copy prepared in advance instructions */
>       g_array_append_vals(table_data, table_instruction_data.data, table_instruction_data.len);
Corrected
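
To make the idea concrete, here is a rough standalone sketch of the
"build the entries first, then derive the count" approach. It uses a tiny
hand-rolled byte buffer instead of GArray so that it compiles on its own;
struct buf, buf_append(), append_entry() and build_body() are all
hypothetical names, and the three entries are only sample data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* 4 one-byte fields + 12-byte GAS + 8-byte value + 8-byte mask */
#define ENTRY_SIZE 32

/* Hypothetical stand-in for GArray: a growable byte buffer. */
struct buf {
    uint8_t *data;
    size_t len;
};

static void buf_append(struct buf *b, const void *src, size_t n)
{
    if (n == 0) {
        return;
    }
    uint8_t *p = realloc(b->data, b->len + n);
    assert(p != NULL);
    memcpy(p + b->len, src, n);
    b->data = p;
    b->len += n;
}

/* Append the low n bytes of v in little-endian order (n <= 8). */
static void buf_append_le(struct buf *b, uint64_t v, size_t n)
{
    assert(n <= 8);
    for (size_t i = 0; i < n; i++) {
        uint8_t byte = (uint8_t)(v >> (8 * i));
        buf_append(b, &byte, 1);
    }
}

/* One 32-byte instruction entry with a 32-bit system-memory register. */
static void append_entry(struct buf *b, uint8_t action, uint8_t instruction,
                         uint64_t address, uint64_t value, uint64_t mask)
{
    buf_append_le(b, action, 1);      /* Serialization Action */
    buf_append_le(b, instruction, 1); /* Instruction */
    buf_append_le(b, 0, 1);           /* Flags */
    buf_append_le(b, 0, 1);           /* Reserved */
    buf_append_le(b, 0, 1);           /* GAS: space id (system memory) */
    buf_append_le(b, 32, 1);          /* GAS: register bit width */
    buf_append_le(b, 0, 1);           /* GAS: register bit offset */
    buf_append_le(b, 3, 1);           /* GAS: access size (dword) */
    buf_append_le(b, address, 8);     /* GAS: address */
    buf_append_le(b, value, 8);       /* Value */
    buf_append_le(b, mask, 8);        /* Mask */
}

/* Build the instructions first, then emit the header with the real count. */
static uint32_t build_body(struct buf *table, uint64_t base)
{
    struct buf instr = { NULL, 0 };

    append_entry(&instr, 0x0, 0x3, base, 0x0, 0xFF); /* sample entries */
    append_entry(&instr, 0x1, 0x3, base, 0x1, 0xFF);
    append_entry(&instr, 0x2, 0x3, base, 0x2, 0xFF);

    uint32_t count = (uint32_t)(instr.len / ENTRY_SIZE);

    buf_append_le(table, 48, 4);    /* header length, as in the patch */
    buf_append_le(table, 0, 4);     /* reserved */
    buf_append_le(table, count, 4); /* instruction entry count */
    buf_append(table, instr.data, instr.len);
    free(instr.data);
    return count;
}
```

Because the count is derived from the staged buffer length, adding or
dropping an append_entry() call cannot leave a stale hard-coded value behind.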

>     
> 
>> +
>> +#define MASK8  0x00000000000000FFUL
>> +#define MASK16 0x000000000000FFFFUL
>> +#define MASK32 0x00000000FFFFFFFFUL
>> +#define MASK64 0xFFFFFFFFFFFFFFFFUL
>> +
>> +    for (action = 0; action < ACPI_ERST_MAX_ACTIONS; ++action) {
> I'd unroll this loop and just directly code entries in required order.
> also drop reserved and nop actions/instructions or explain why they are necessary.
Unrolled. Dropped the NOP.

> 
>> +        switch (action) {
>> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
> given these names will/should never be exposed outside of hw/acpi/erst.c
> I'd drop ACPI_ERST_ACTION_/ACPI_ERST_INST_ prefixes (i.e. use names as defined in spec)
> if it doesn't cause build issues.
These are in include/hw/acpi/erst.h which is included by hw/i386/acpi-build.c,
which includes many other hardware files.
Removing the prefix leaves a rather generic name.
I'd prefer to keep the prefixes, since they uniquely differentiate the names.


> 
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_END_OPERATION:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
>> +                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>> +            break;
>> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>> +            break;
>> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_RESERVED:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>> +            break;
>> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>> +        default:
>> +            build_serialization_instruction_entry(table_data, action,
>> +                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
>> +            break;
>> +        }
>> +    }
>> +    build_header(linker, table_data,
>> +                 (void *)(table_data->data + erst_start),
>> +                 "ERST", table_data->len - erst_start,
>> +                 1, oem_id, oem_table_id);
>> +}
>> +
>> +/*******************************************************************/
>> +/*******************************************************************/
>> +
>>   static const VMStateDescription erst_vmstate  = {
>>       .name = "acpi-erst",
>>       .version_id = 1,
> 



* Re: [PATCH v5 07/10] ACPI ERST: trace support
  2021-07-20 13:15   ` Igor Mammedov
@ 2021-07-21 16:14     ` Eric DeVolder
  2021-07-26 11:08       ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 16:14 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 8:15 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:18 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> Provide the definitions needed to support tracing in ACPI ERST.
> trace points should be introduced in patches that use them for the first time,
> as it stands now series breaks bisection.

Are you asking to move this patch before the patch that introduces erst.c (which
uses these trace points)?
Or are you asking to include this patch with the patch that introduces erst.c?

Also, you requested that I split the building of the ERST table and the
implementation of the erst device into separate patches. Doesn't that also
break bisection?


> 
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   hw/acpi/trace-events | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events
>> index dcc1438..a5c2755 100644
>> --- a/hw/acpi/trace-events
>> +++ b/hw/acpi/trace-events
>> @@ -55,3 +55,17 @@ piix4_gpe_writeb(uint64_t addr, unsigned width, uint64_t val) "addr: 0x%" PRIx64
>>   # tco.c
>>   tco_timer_reload(int ticks, int msec) "ticks=%d (%d ms)"
>>   tco_timer_expired(int timeouts_no, bool strap, bool no_reboot) "timeouts_no=%d no_reboot=%d/%d"
>> +
>> +# erst.c
>> +acpi_erst_reg_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
>> +acpi_erst_reg_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%04" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
>> +acpi_erst_mem_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%06" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
>> +acpi_erst_mem_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%06" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
>> +acpi_erst_pci_bar_0(uint64_t addr) "BAR0: 0x%016" PRIx64
>> +acpi_erst_pci_bar_1(uint64_t addr) "BAR1: 0x%016" PRIx64
>> +acpi_erst_realizefn_in(void)
>> +acpi_erst_realizefn_out(unsigned size) "total nvram size %u bytes"
>> +acpi_erst_reset_in(unsigned record_count) "record_count %u"
>> +acpi_erst_reset_out(unsigned record_count) "record_count %u"
>> +acpi_erst_class_init_in(void)
>> +acpi_erst_class_init_out(void)
> 



* Re: [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines.
  2021-07-20 13:19   ` Igor Mammedov
@ 2021-07-21 16:16     ` Eric DeVolder
  2021-07-26 11:30       ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 16:16 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 8:19 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:19 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> This change exposes ACPI ERST support for x86 guests.
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> looks good to me, maybe move find_erst_dev() impl. here as well
> if it's the patch it's first used.

I've followed your previous suggestion of mimicking find_vmgenid_dev(), which
declares it in its header file. I've done the same, find_erst_dev() is
declared in its header file and used in these files.


> 
>> ---
>>   hw/i386/acpi-build.c   | 9 +++++++++
>>   hw/i386/acpi-microvm.c | 9 +++++++++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index de98750..d2026cc 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -43,6 +43,7 @@
>>   #include "sysemu/tpm.h"
>>   #include "hw/acpi/tpm.h"
>>   #include "hw/acpi/vmgenid.h"
>> +#include "hw/acpi/erst.h"
>>   #include "hw/boards.h"
>>   #include "sysemu/tpm_backend.h"
>>   #include "hw/rtc/mc146818rtc_regs.h"
>> @@ -2327,6 +2328,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>>       GArray *tables_blob = tables->table_data;
>>       AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
>>       Object *vmgenid_dev;
>> +    Object *erst_dev;
>>       char *oem_id;
>>       char *oem_table_id;
>>   
>> @@ -2388,6 +2390,13 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>>                       ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>>                       x86ms->oem_table_id);
>>   
>> +    erst_dev = find_erst_dev();
>> +    if (erst_dev) {
>> +        acpi_add_table(table_offsets, tables_blob);
>> +        build_erst(tables_blob, tables->linker, erst_dev,
>> +                   x86ms->oem_id, x86ms->oem_table_id);
>> +    }
>> +
>>       vmgenid_dev = find_vmgenid_dev();
>>       if (vmgenid_dev) {
>>           acpi_add_table(table_offsets, tables_blob);
>> diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
>> index ccd3303..0099b13 100644
>> --- a/hw/i386/acpi-microvm.c
>> +++ b/hw/i386/acpi-microvm.c
>> @@ -30,6 +30,7 @@
>>   #include "hw/acpi/bios-linker-loader.h"
>>   #include "hw/acpi/generic_event_device.h"
>>   #include "hw/acpi/utils.h"
>> +#include "hw/acpi/erst.h"
>>   #include "hw/boards.h"
>>   #include "hw/i386/fw_cfg.h"
>>   #include "hw/i386/microvm.h"
>> @@ -160,6 +161,7 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>>       X86MachineState *x86ms = X86_MACHINE(mms);
>>       GArray *table_offsets;
>>       GArray *tables_blob = tables->table_data;
>> +    Object *erst_dev;
>>       unsigned dsdt, xsdt;
>>       AcpiFadtData pmfadt = {
>>           /* ACPI 5.0: 4.1 Hardware-Reduced ACPI */
>> @@ -209,6 +211,13 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>>                       ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>>                       x86ms->oem_table_id);
>>   
>> +    erst_dev = find_erst_dev();
>> +    if (erst_dev) {
>> +        acpi_add_table(table_offsets, tables_blob);
>> +        build_erst(tables_blob, tables->linker, erst_dev,
>> +                   x86ms->oem_id, x86ms->oem_table_id);
>> +    }
>> +
>>       xsdt = tables_blob->len;
>>       build_xsdt(tables_blob, tables->linker, table_offsets, x86ms->oem_id,
>>                  x86ms->oem_table_id);
> 



* Re: [PATCH v5 09/10] ACPI ERST: qtest for ERST
  2021-07-20 13:38   ` Igor Mammedov
@ 2021-07-21 16:18     ` Eric DeVolder
  2021-07-26 11:45       ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 16:18 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 8:38 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:20 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> This change provides a qtest that locates and then does a simple
>> interrogation of the ERST feature within the guest.
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   tests/qtest/erst-test.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   tests/qtest/meson.build |   2 +
>>   2 files changed, 131 insertions(+)
>>   create mode 100644 tests/qtest/erst-test.c
>>
>> diff --git a/tests/qtest/erst-test.c b/tests/qtest/erst-test.c
>> new file mode 100644
>> index 0000000..ce014c1
>> --- /dev/null
>> +++ b/tests/qtest/erst-test.c
>> @@ -0,0 +1,129 @@
>> +/*
>> + * QTest testcase for ACPI ERST
>> + *
>> + * Copyright (c) 2021 Oracle
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/bitmap.h"
>> +#include "qemu/uuid.h"
>> +#include "hw/acpi/acpi-defs.h"
>> +#include "boot-sector.h"
>> +#include "acpi-utils.h"
>> +#include "libqos/libqtest.h"
>> +#include "qapi/qmp/qdict.h"
>> +
>> +#define RSDP_ADDR_INVALID 0x100000 /* RSDP must be below this address */
>> +
>> +static uint64_t acpi_find_erst(QTestState *qts)
>> +{
>> +    uint32_t rsdp_offset;
>> +    uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
>> +    uint32_t rsdt_len, table_length;
>> +    uint8_t *rsdt, *ent;
>> +    uint64_t base = 0;
>> +
>> +    /* Wait for guest firmware to finish and start the payload. */
>> +    boot_sector_test(qts);
>> +
>> +    /* Tables should be initialized now. */
>> +    rsdp_offset = acpi_find_rsdp_address(qts);
>> +
>> +    g_assert_cmphex(rsdp_offset, <, RSDP_ADDR_INVALID);
>> +
>> +    acpi_fetch_rsdp_table(qts, rsdp_offset, rsdp_table);
>> +    acpi_fetch_table(qts, &rsdt, &rsdt_len, &rsdp_table[16 /* RsdtAddress */],
>> +                     4, "RSDT", true);
>> +
>> +    ACPI_FOREACH_RSDT_ENTRY(rsdt, rsdt_len, ent, 4 /* Entry size */) {
>> +        uint8_t *table_aml;
>> +        acpi_fetch_table(qts, &table_aml, &table_length, ent, 4, NULL, true);
>> +        if (!memcmp(table_aml + 0 /* Header Signature */, "ERST", 4)) {
>> +            /*
>> +             * Picking up ERST base address from the Register Region
>> +             * specified as part of the first Serialization Instruction
>> +             * Action (which is a Begin Write Operation).
>> +             */
>> +            memcpy(&base, &table_aml[56], sizeof(base));
>> +            g_free(table_aml);
>> +            break;
>> +        }
>> +        g_free(table_aml);
>> +    }
>> +    g_free(rsdt);
>> +    return base;
>> +}
> I'd drop this, bios-tables-test should do ACPI table check
> as for PCI device itself you can test it with qtest accelerator
> that allows to instantiate it and access registers directly
> without overhead of running actual guest.
Yes, bios-tables-test checks the ACPI table, but not the functionality.
This test has actually caught a problem/bug during development.

> 
> As example you can look into megasas-test.c, ivshmem-test.c
> or other PCI device tests.
But I'll look at these and see about migrating to this approach.
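
As a note on what the quoted test exercises: it writes an action code to
offset 0 of the register region and reads the result back from offset 8.
The sketch below models that handshake against a mock device so it can be
compiled on its own. The struct, the function names and the default case
are invented for illustration and do not reflect the real device model.

```c
#include <assert.h>
#include <stdint.h>

#define REG_ACTION 0x0 /* offset the test writes the action code to */
#define REG_VALUE  0x8 /* offset the test reads the result from */

/* Hypothetical mock device: only models the three queries in the test. */
struct mock_erst {
    uint64_t value;    /* last result latched by an action write */
    uint64_t log_base; /* made-up error log address range base */
};

static void mock_write(struct mock_erst *d, uint64_t offset, uint64_t val)
{
    if (offset != REG_ACTION) {
        return;
    }
    switch (val) {
    case 0xD: /* GET_ERROR_LOG_ADDRESS_RANGE */
        d->value = d->log_base;
        break;
    case 0xE: /* GET_ERROR_LOG_ADDRESS_LENGTH */
        d->value = 8 * 1024;
        break;
    case 0xF: /* GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES */
        d->value = 0;
        break;
    default:  /* arbitrary choice for this mock only */
        d->value = ~0ULL;
        break;
    }
}

static uint64_t mock_read(const struct mock_erst *d, uint64_t offset)
{
    return offset == REG_VALUE ? d->value : 0;
}

/* Write the action, then read the value register, as the qtest does. */
static uint64_t run_action(struct mock_erst *d, uint64_t action)
{
    mock_write(d, REG_ACTION, action);
    return mock_read(d, REG_VALUE);
}
```

A PCI-device-style qtest would presumably drive the same two offsets
through BAR accessors instead of guest physical addresses found via the
ACPI table.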

> 
>> +static char disk[] = "tests/erst-test-disk-XXXXXX";
>> +
>> +#define ERST_CMD()                              \
>> +    "-accel kvm -accel tcg "                    \
>> +    "-object memory-backend-file," \
>> +      "id=erstnvram,mem-path=tests/acpi-erst-XXXXXX,size=0x10000,share=on " \
>> +    "-device acpi-erst,memdev=erstnvram " \
>> +    "-drive id=hd0,if=none,file=%s,format=raw " \
>> +    "-device ide-hd,drive=hd0 ", disk
>> +
>> +static void erst_get_error_log_address_range(void)
>> +{
>> +    QTestState *qts;
>> +    uint64_t log_address_range = 0;
>> +    unsigned log_address_length = 0;
>> +    unsigned log_address_attr = 0;
>> +
>> +    qts = qtest_initf(ERST_CMD());
>> +
>> +    uint64_t base = acpi_find_erst(qts);
>> +    g_assert(base != 0);
>> +
>> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE command */
>> +    qtest_writel(qts, base + 0, 0xD);
>> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE result */
>> +    log_address_range = qtest_readq(qts, base + 8);\
>> +
>> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_LENGTH command */
>> +    qtest_writel(qts, base + 0, 0xE);
>> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_LENGTH result */
>> +    log_address_length = qtest_readq(qts, base + 8);\
>> +
>> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES command */
>> +    qtest_writel(qts, base + 0, 0xF);
>> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES result */
>> +    log_address_attr = qtest_readq(qts, base + 8);\
>> +
>> +    /* Check log_address_range is not 0,~0 or base */
>> +    g_assert(log_address_range != base);
>> +    g_assert(log_address_range != 0);
>> +    g_assert(log_address_range != ~0UL);
>> +
>> +    /* Check log_address_length is ERST_RECORD_SIZE */
>> +    g_assert(log_address_length == (8 * 1024));
>> +
>> +    /* Check log_address_attr is 0 */
>> +    g_assert(log_address_attr == 0);
>> +
>> +    qtest_quit(qts);
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +    int ret;
>> +
>> +    ret = boot_sector_init(disk);
>> +    if (ret) {
>> +        return ret;
>> +    }
>> +
>> +    g_test_init(&argc, &argv, NULL);
>> +
>> +    qtest_add_func("/erst/get-error-log-address-range",
>> +                   erst_get_error_log_address_range);
>> +
>> +    ret = g_test_run();
>> +    boot_sector_cleanup(disk);
>> +
>> +    return ret;
>> +}
>> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
>> index 0c76738..deae443 100644
>> --- a/tests/qtest/meson.build
>> +++ b/tests/qtest/meson.build
>> @@ -66,6 +66,7 @@ qtests_i386 = \
>>     (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
>>     (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) +   \
>>     (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +                 \
>> +  (config_all_devices.has_key('CONFIG_ACPI') ? ['erst-test'] : []) +                 \
>>     qtests_pci +                                                                              \
>>     ['fdc-test',
>>      'ide-test',
>> @@ -237,6 +238,7 @@ qtests = {
>>     'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
>>     'cdrom-test': files('boot-sector.c'),
>>     'dbus-vmstate-test': files('migration-helpers.c') + dbus_vmstate1,
>> +  'erst-test': files('erst-test.c', 'boot-sector.c', 'acpi-utils.c'),
>>     'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
>>     'migration-test': files('migration-helpers.c'),
>>     'pxe-test': files('boot-sector.c'),
> 



* Re: [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c
  2021-07-20 13:24   ` Igor Mammedov
@ 2021-07-21 16:19     ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 16:19 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 8:24 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:21 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> Following the guidelines in tests/qtest/bios-tables-test.c, this
>> is step 6, the re-generated ACPI tables binary blobs.
> 
> looks like test case itself got lost somewhere along the way.
I now understand that this refers to the test cases in bios-tables-test.c.
I have tests in there now, though I am still working through a microvm failure.


>   
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   tests/data/acpi/microvm/ERST                | Bin 0 -> 976 bytes
>>   tests/data/acpi/pc/ERST                     | Bin 0 -> 976 bytes
>>   tests/data/acpi/q35/ERST                    | Bin 0 -> 976 bytes
>>   tests/qtest/bios-tables-test-allowed-diff.h |   4 ----
>>   4 files changed, 4 deletions(-)
>>
>> diff --git a/tests/data/acpi/microvm/ERST b/tests/data/acpi/microvm/ERST
>> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..db2adaa8d9b45e295f9976d6bb5a07a813214f52 100644
>> GIT binary patch
>> literal 976
>> zcmaKqTMmLS5Jd+l50TdfOjv?(1qNf{pGN#}aW2XoVQ=kJawAMa;r8^<4tl<ik9Q&x
>> z9fs@aGWNsscIs_KB7$e!_x3{VFxa)yyAdhW<ewtq@KMTR;_(*;o)AYwsc#_I{R=ny
>> z8zx&whJ53fsGoYS{%e8zX-SD^^!|)F8lIhx`_IYG$#;3?dmQ>N$k#r!KbMbUbUyg_
>> zK(;pcerufGzoGM$#7pML|ITms2HKLp#iT7ge?`3d;=pU-HFM;Z{u=Td@?Bo=v9u+>
>> bCEw+h{yXwJ@?BooAHQFxe`xQi@1uMGuJKX<
>>
>> literal 0
>> HcmV?d00001
>>
>> diff --git a/tests/data/acpi/pc/ERST b/tests/data/acpi/pc/ERST
>> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7236018951f9d111d8cacaa93ee07a8dc3294f18 100644
>> GIT binary patch
>> literal 976
>> zcmaKqSq_3Q6h#Y^dE9^rOK=GWV&b1h{BUvZ#VzQD#NN_}<VJW2!{zj}Jj(Gp++KlF
>> z-m^RRr=jicm%cUSDW!0a>)srw9ZqJfYH@yl5T!<U;}M6C67CcCCp`0jI3h}X4Z*CR
>> z@cQFuhiLM(wSRu-xcHA1F8zhXBbq;Aj)oWS$Nk6T$K>0*@ExA}PsmTmxA~y7^f&wF
>> z`=C;Mzb#Jlr!;>?JY$ah@BPi%Ksot29-5N<Er=Hro_R^UWRASiUqyaJzRfE>hSucQ
>> c<lDT_e?xvlzRfG^WB(fYq22#4zMDpU0r#ed0RR91
>>
>> literal 0
>> HcmV?d00001
>>
>> diff --git a/tests/data/acpi/q35/ERST b/tests/data/acpi/q35/ERST
>> index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7236018951f9d111d8cacaa93ee07a8dc3294f18 100644
>> GIT binary patch
>> literal 976
>> zcmaKqSq_3Q6h#Y^dE9^rOK=GWV&b1h{BUvZ#VzQD#NN_}<VJW2!{zj}Jj(Gp++KlF
>> z-m^RRr=jicm%cUSDW!0a>)srw9ZqJfYH@yl5T!<U;}M6C67CcCCp`0jI3h}X4Z*CR
>> z@cQFuhiLM(wSRu-xcHA1F8zhXBbq;Aj)oWS$Nk6T$K>0*@ExA}PsmTmxA~y7^f&wF
>> z`=C;Mzb#Jlr!;>?JY$ah@BPi%Ksot29-5N<Er=Hro_R^UWRASiUqyaJzRfE>hSucQ
>> c<lDT_e?xvlzRfG^WB(fYq22#4zMDpU0r#ed0RR91
>>
>> literal 0
>> HcmV?d00001
>>
>> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
>> index e004c71..dfb8523 100644
>> --- a/tests/qtest/bios-tables-test-allowed-diff.h
>> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
>> @@ -1,5 +1 @@
>>   /* List of comma-separated changed AML files to ignore */
>> -"tests/data/acpi/pc/ERST",
>> -"tests/data/acpi/q35/ERST",
>> -"tests/data/acpi/microvm/ERST",
>> -
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-20 12:17   ` Igor Mammedov
  2021-07-21 16:07     ` Eric DeVolder
@ 2021-07-21 17:36     ` Eric DeVolder
  2021-07-26 10:13       ` Igor Mammedov
  1 sibling, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-21 17:36 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/20/21 7:17 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:16 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
>> +
>> +static const TypeInfo erst_type_info = {
>> +    .name          = TYPE_ACPI_ERST,
>> +    .parent        = TYPE_PCI_DEVICE,
>> +    .class_init    = erst_class_init,
>> +    .instance_size = sizeof(ERSTDeviceState),
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> what is this for here?

Without this, at run-time, I obtain:
qemu-system-x86_64: ../hw/pci/pci.c:2673: pci_device_class_base_init: Assertion `conventional || pcie' failed.

> 
>> +        { }
>> +    }
>> +};
>> +
>> +static void erst_register_types(void)
>> +{
>> +    type_register_static(&erst_type_info);
>> +}
>> +
>> +type_init(erst_register_types)
>> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
>> index dd69577..262a8ee 100644
>> --- a/hw/acpi/meson.build
>> +++ b/hw/acpi/meson.build
>> @@ -4,6 +4,7 @@ acpi_ss.add(files(
>>     'aml-build.c',
>>     'bios-linker-loader.c',
>>     'utils.c',
>> +  'erst.c',
>>   ))
>>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
>>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-07-20 14:57 ` Igor Mammedov
  2021-07-21 15:26   ` Eric DeVolder
@ 2021-07-23 16:26   ` Eric DeVolder
  2021-07-27 12:55   ` Igor Mammedov
  2 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-23 16:26 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

Igor,
Pending your responses to a few questions, I have v6 ready to go.
Thanks,
eric

On 7/20/21 9:57 AM, Igor Mammedov wrote:
> On Wed, 30 Jun 2021 15:07:11 -0400
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> =============================
>> I believe I have corrected for all feedback on v4, but with
>> responses to certain feedback below.
>>
>> In patch 1/6, Igor asks:
>> "you are adding empty template files here
>> but the later matching bios-tables-test is nowhere to be found
>> Was testcase lost somewhere along the way?
>>
>> also it seems you add ERST only to pc/q35,
>> so why tests/data/acpi/microvm/ERST is here?"
>>
>> I did miss setting up microvm. That has been corrected.
>>
>> As for the question about lost test cases, if you are referring
>> to the new binary blobs for pc,q35, those were in patch
>> 6/6. There is a qtest in patch 5/6. If I don't understand the
>> question, please indicate as such.
> 
> All I see in this series is
>   [PATCH v5 09/10] ACPI ERST: qtest for ERST
> which is not related to bios-tables-test and blobs whatsoever.
> 
> Blobs are for use with bios-tables-test and I'm referring to
> missing test case in bios-tables-test.c
> 
>>
>>
>> In patch 3/6, Igor asks:
>> "Also spec (ERST) is rather (maybe intentionally) vague on specifics,
>> so it would be better that before a patch that implements hw part
>> were a doc patch describing concrete implementation. As model
>> you can use docs/specs/acpi_hest_ghes.rst or other docs/specs/acpi_* files.
>> I'd start posting/discussing that spec within these thread
>> to avoid spamming list until doc is settled up."
>>
>> I'm thinking that this cover letter is the bulk of the spec? But as
>> you say, to avoid spamming the group, we can use this thread to make
>> suggested changes to this cover letter which I will then convert
>> into a spec, for v6.
>>
>>
>> In patch 3/6, in many places Igor mentions utilizing the hostmem
>> mapped directly in the guest in order to avoid need-less copying.
>>
>> It is true that the ERST has an "NVRAM" mode that would allow for
>> all the simplifications Igor points out, however, Linux does not
>> support this mode. This mode puts the burden of managing the NVRAM
>> space on the OS. So this implementation, like BIOS, is the non-NVRAM
>> mode.
> see per patch comments where copying is not necessary regardless of
> the implemented mode.
> 
> 
>> I did go ahead and separate the registers from the exchange buffer,
>> which would facilitate the support of NVRAM mode.
>>
>>   linux/drivers/acpi/apei/erst.c:
>>   /* NVRAM ERST Error Log Address Range is not supported yet */
>>   static void pr_unimpl_nvram(void)
>>   {
>>      if (printk_ratelimit())
>>          pr_warn("NVRAM ERST Log Address Range not implemented yet.\n");
>>   }
>>
>>   static int __erst_write_to_nvram(const struct cper_record_header *record)
>>   {
>>      /* do not print message, because printk is not safe for NMI */
>>      return -ENOSYS;
>>   }
>>
>>   static int __erst_read_to_erange_from_nvram(u64 record_id, u64 *offset)
>>   {
>>      pr_unimpl_nvram();
>>      return -ENOSYS;
>>   }
>>
>>   static int __erst_clear_from_nvram(u64 record_id)
>>   {
>>      pr_unimpl_nvram();
>>      return -ENOSYS;
>>   }
>>
>> =============================
> PS:
> it's inconvenient when you copy questions/parts of unfinished discussion
> from previous revision with a little context.
> Usually discussion should continue in the original thread and
> once some sort of consensus is reached new series based on it
> is posted. Above blob shouldn't be here. (You can look at how others
> handle multiple revisions)
> 
> The way you do it now, makes reviewer to repeat job done earlier
> to point to the same issues, so it wastes your and reviewer's time.
> So please finish discussions in threads they started at and then post
> new revision.
> 
>> This patchset introduces support for the ACPI Error Record
>> Serialization Table, ERST.
>>
>> For background and implementation information, please see
>> docs/specs/acpi_erst.txt, which is patch 2/10.
>>
>> Suggested-by: Konrad Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>
>> ---
>> v5: 30jun2021
>>   - Create docs/specs/acpi_erst.txt, per Igor
>>   - Separate PCI BARs for registers and memory, per Igor
>>   - Convert debugging to use trace infrastructure, per Igor
>>   - Various other fixups, per Igor
>>
>> v4: 11jun2021
>>   - Converted to a PCI device, per Igor.
>>   - Updated qtest.
>>   - Rearranged patches, per Igor.
>>
>> v3: 28may2021
>>   - Converted to using a TYPE_MEMORY_BACKEND_FILE object rather than
>>     internal array with explicit file operations, per Igor.
>>   - Changed the way the qdev and base address are handled, allowing
>>     ERST to be disabled at run-time. Also aligns better with other
>>     existing code.
>>
>> v2: 8feb2021
>>   - Added qtest/smoke test per Paolo Bonzini
>>   - Split patch into smaller chunks, per Igor Mammedov
>>   - Did away with use of ACPI packed structures, per Igor Mammedov
>>
>> v1: 26oct2020
>>   - initial post
>>
>> ---
>>
>> Eric DeVolder (10):
>>    ACPI ERST: bios-tables-test.c steps 1 and 2
>>    ACPI ERST: specification for ERST support
>>    ACPI ERST: PCI device_id for ERST
>>    ACPI ERST: header file for ERST
>>    ACPI ERST: support for ACPI ERST feature
>>    ACPI ERST: build the ACPI ERST table
>>    ACPI ERST: trace support
>>    ACPI ERST: create ACPI ERST table for pc/x86 machines.
>>    ACPI ERST: qtest for ERST
>>    ACPI ERST: step 6 of bios-tables-test.c
>>
>>   docs/specs/acpi_erst.txt     | 152 +++++++
>>   hw/acpi/erst.c               | 918 +++++++++++++++++++++++++++++++++++++++++++
>>   hw/acpi/meson.build          |   1 +
>>   hw/acpi/trace-events         |  14 +
>>   hw/i386/acpi-build.c         |   9 +
>>   hw/i386/acpi-microvm.c       |   9 +
>>   include/hw/acpi/erst.h       |  84 ++++
>>   include/hw/pci/pci.h         |   1 +
>>   tests/data/acpi/microvm/ERST | Bin 0 -> 976 bytes
>>   tests/data/acpi/pc/ERST      | Bin 0 -> 976 bytes
>>   tests/data/acpi/q35/ERST     | Bin 0 -> 976 bytes
>>   tests/qtest/erst-test.c      | 129 ++++++
>>   tests/qtest/meson.build      |   2 +
>>   13 files changed, 1319 insertions(+)
>>   create mode 100644 docs/specs/acpi_erst.txt
>>   create mode 100644 hw/acpi/erst.c
>>   create mode 100644 include/hw/acpi/erst.h
>>   create mode 100644 tests/data/acpi/microvm/ERST
>>   create mode 100644 tests/data/acpi/pc/ERST
>>   create mode 100644 tests/data/acpi/q35/ERST
>>   create mode 100644 tests/qtest/erst-test.c
>>
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-07-21 15:42       ` Eric DeVolder
@ 2021-07-26 10:06         ` Igor Mammedov
  2021-07-26 19:52           ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 10:06 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, Konrad Wilk, qemu-devel, pbonzini,
	Boris Ostrovsky, Eric Blake, rth

On Wed, 21 Jul 2021 10:42:33 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/19/21 10:02 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 19:26:39 +0000
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
> >> rather than "non-NVRAM mode", which contradicts everything I stated prior.
> >> Eric.
> >> ________________________________
> >> From: Eric DeVolder <eric.devolder@oracle.com>
> >> Sent: Wednesday, June 30, 2021 2:07 PM
> >> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
> >> Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >> Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support
> >>
> >> Information on the implementation of the ACPI ERST support.
> >>
> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >> ---
> >>   docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
> >>   1 file changed, 152 insertions(+)
> >>   create mode 100644 docs/specs/acpi_erst.txt
> >>
> >> diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
> >> new file mode 100644
> >> index 0000000..79f8eb9
> >> --- /dev/null
> >> +++ b/docs/specs/acpi_erst.txt
> >> @@ -0,0 +1,152 @@
> >> +ACPI ERST DEVICE
> >> +================
> >> +
> >> +The ACPI ERST device is utilized to support the ACPI Error Record
> >> +Serialization Table, ERST, functionality. The functionality is
> >> +designed for storing error records in persistent storage for
> >> +future reference/debugging.
> >> +
> >> +The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
> >> +(APEI)", and specifically subsection "Error Serialization", outlines
> >> +a method for storing error records into persistent storage.
> >> +
> >> +The format of error records is described in the UEFI specification[2],
> >> +in Appendix N "Common Platform Error Record".
> >> +
> >> +While the ACPI specification allows for an NVRAM "mode" (see
> >> +GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
> >> +directly exposed for direct access by the OS/guest, this implements
> >> +the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
> >> +by most BIOS (since flash memory requires programming operations
> >> +in order to update its contents). Furthermore, as of the time of this
> >> +writing, Linux does not support the non-NVRAM "mode".  
> > 
> > shouldn't it be s/non-NVRAM/NVRAM/ ?  
> 
> Yes, it has been corrected.
> 
> >   
> >> +
> >> +
> >> +Background/Motivation
> >> +---------------------
> >> +Linux uses the persistent storage filesystem, pstore, to record
> >> +information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
> >> +independent of, and runs before, kdump.  In certain scenarios (ie.
> >> +hosts/guests with root filesystems on NFS/iSCSI where networking
> >> +software and/or hardware fails), pstore may contain the only
> >> +information available for post-mortem debugging.  
> > 
> > well,
> > it's not the only way, one can use existing pvpanic device to notify
> > mgmt layer about crash and mgmt layer can take appropriate measures
> > to for post-mortem debugging, including dumping guest state,
> > which is superior to anything pstore can offer as VM is still exists
> > and mgmt layer can inspect VMs crashed state directly or dump
> > necessary parts of it.
> > 
> > So ERST shouldn't be portrayed as the only way here but rather
> > as limited alternative to pvpanic in regards to post-mortem debugging
> > (it's the only way only on bare-metal).
> > 
> > It would be better to describe here other use-cases you've mentioned
> > in earlier reviews, that justify adding alternative to pvpanic.  
> 
> I'm not sure how I would change this. I do say "may contain", which means it
> is not the only way. Pvpanic is a way to notify the mgmt layer/host, but
> this is a method solely with the guest. Each serves a different purpose;
> plugs a different hole.
> 

I'd suggest changing "pstore may contain the only information" to "pstore may contain information".

> As noted in a separate message, my company has intentions of storing other
> data in ERST beyond panics.
perhaps add your use cases here as well.
 

> >> +Two common storage backends for the pstore filesystem are ACPI ERST
> >> +and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
> >> +all guests. With QEMU supporting ACPI ERST, it becomes a viable
> >> +pstore storage backend for virtual machines (as it is now for
> >> +bare metal machines).
> >> +  
> >   
> >> +Enabling support for ACPI ERST facilitates a consistent method to
> >> +capture kernel panic information in a wide range of guests: from
> >> +resource-constrained microvms to very large guests, and in
> >> +particular, in direct-boot environments (which would lack UEFI
> >> +run-time services).  
> > this hunk probably not necessary
> >   
> >> +
> >> +Note that Microsoft Windows also utilizes the ACPI ERST for certain
> >> +crash information, if available.  
> > a pointer to a relevant source would be helpful here.  
> 
> I've included the reference, here for your benefit.
> Windows Hardware Error Architecture, specifically Persistence Mechanism
> https://docs.microsoft.com/en-us/windows-hardware/drivers/whea/error-record-persistence-mechanism
> 
> >   
> >> +Invocation  
> > s/^^/Configuration|Usage/  
> 
> Corrected
> 
> >   
> >> +----------
> >> +
> >> +To utilize ACPI ERST, a memory-backend-file object and acpi-erst  
> > s/utilize/use/  
> 
> Corrected
> 
> >   
> >> +device must be created, for example:  
> > s/must/can/  
> 
> Corrected
> 
> >   
> >> +
> >> + qemu ...
> >> + -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
> >> +  size=0x10000,share=on  
> > I'd put ^^^ on the same line as -object and use '\' at the end the
> > so example could be easily copy-pasted  
> 
> Corrected
> 
> >   
> >> + -device acpi-erst,memdev=erstnvram
> >> +
> >> +For proper operation, the ACPI ERST device needs a memory-backend-file
> >> +object with the following parameters:
> >> +
> >> + - id: The id of the memory-backend-file object is used to associate
> >> +   this memory with the acpi-erst device.
> >> + - size: The size of the ACPI ERST backing storage. This parameter is
> >> +   required.
> >> + - mem-path: The location of the ACPI ERST backing storage file. This
> >> +   parameter is also required.
> >> + - share: The share=on parameter is required so that updates to the
> >> +   ERST back store are written to the file immediately as well. Without
> >> +   it, updates the the backing file are unpredictable and may not
> >> +   properly persist (eg. if qemu should crash).  
> > 
> > mmap manpage says:
> >    MAP_SHARED
> >               Updates to the mapping ... are carried through to the underlying file.
> > it doesn't guarantee 'written to the file immediately', though.
> > So I'd rephrase it to something like that:
> > 
> > - share: The share=on parameter is required so that updates to the ERST back store
> >           are written back to the file.  
> 
> Corrected
> 
> >   
> >> +
> >> +The ACPI ERST device is a simple PCI device, and requires this one
> >> +parameter:  
> > s/^.*:/and ERST device:/  
> 
> Corrected
> 
> >   
> >> +
> >> + - memdev: Is the object id of the memory-backend-file.
> >> +
> >> +
> >> +PCI Interface
> >> +-------------
> >> +
> >> +The ERST device is a PCI device with two BARs, one for accessing
> >> +the programming registers, and the other for accessing the
> >> +record exchange buffer.
> >> +
> >> +BAR0 contains the programming interface consisting of just two
> >> +64-bit registers. The two registers are an ACTION (cmd) and a
> >> +VALUE (data). All ERST actions/operations/side effects happen  
> > s/consisting of... All ERST/consisting of ACTION and VALUE 64-bit registers. All ERST/  
> 
> Corrected
> 
> >   
> >> +on the write to the ACTION, by design. Thus any data needed  
> > s/Thus//  
> Corrected
> 
> >   
> >> +by the action must be placed into VALUE prior to writing
> >> +ACTION. Reading the VALUE simply returns the register contents,
> >> +which can be updated by a previous ACTION.  
> >   
> >> This behavior is
> >> +encoded in the ACPI ERST table generated by QEMU.  
> > it's too vague, Either drop sentence or add a reference to relevant place in spec.  
> Corrected
> 
> > 
> >   
> >> +
> >> +BAR1 contains the record exchange buffer, and the size of this
> >> +buffer sets the maximum record size. This record exchange
> >> +buffer size is 8KiB.  
> > s/^^^/
> > BAR1 contains the 8KiB record exchange buffer, which is the implemented maximum record size limit.  
> Corrected
> 
> > 
> >   
> >> +Backing File  
> > 
> > s/^^^/Backing Storage Format/  
> Corrected
> 
> >   
> >> +------------  
> > 
> >   
> >> +
> >> +The ACPI ERST persistent storage is contained within a single backing
> >> +file. The size and location of the backing file is specified upon
> >> +QEMU startup of the ACPI ERST device.  
> > 
> > I'd drop above paragraph and describe file format here,
> > ultimately used backend doesn't have to be a file. For
> > example if user doesn't need it persist over QEMU restarts,
> > ram backend could be used, guest will still be able to see
> > it's own crash log after guest is reboot, or it could be
> > memfd backend passed to QEMU by mgmt layer.  
> Dropped
> 
> > 
> >   
> >> +Records are stored in the backing file in a simple fashion.  
> > s/backing file/backend storage/
> > ditto for other occurrences  
> Corrected
> 
> >   
> >> +The backing file is essentially divided into fixed size
> >> +"slots", ERST_RECORD_SIZE in length, with each "slot"
> >> +storing a single record.  
> >   
> >> No attempt at optimizing storage
> >> +through compression, compaction, etc is attempted.  
> > s/^^^//  
> 
> I'd like to keep this statement. It is there because in a number of
> hardware BIOS I tested, these kinds of features lead to bugs in the
> ERST support.
This doc is not about issues in other BIOSes; it's about the concrete
QEMU impl. So the sentence starting with "No attempt" is not relevant here at all.
  
> >> +NOTE that any change to this value will make any pre-
> >> +existing backing files, not of the same ERST_RECORD_SIZE,
> >> +unusable to the guest.  
> > when that can happen, can we detect it and error out?  
> I've dropped this statement. That value is hard coded, and not a
> parameter, so there is no simple way to change it. This comment
> does exist next to the ERST_RECORD_SIZE declaration in the code.

It's not a problem with the current impl., but rather with possible
future expansion.

If you add a header containing the record size at the start of storage,
it wouldn't be an issue, as ERST would be able to read the record size
that was used for that storage. That will help avoid compat issues
later on.
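
A minimal sketch of what such a header could look like, purely as an
illustration. The struct name, the magic value, the version field and
both helper functions are hypothetical and do not exist in the patch;
host byte order is used for brevity, whereas a real format would pin
the endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value; not defined anywhere in the patch. */
#define ERST_STORE_MAGIC 0x5453524545525354ULL

/* Hypothetical header occupying the start of the backend storage. */
typedef struct {
    uint64_t magic;        /* marks the storage as formatted */
    uint32_t version;      /* layout version, for future changes */
    uint32_t record_size;  /* slot size used when storage was formatted */
} ErstStoreHeader;

/* Zero the storage and write a fresh header at offset 0. */
static void erst_store_format(uint8_t *storage, size_t size,
                              uint32_t record_size)
{
    ErstStoreHeader hdr = { ERST_STORE_MAGIC, 1, record_size };

    assert(size >= sizeof(hdr));
    memset(storage, 0, size);
    memcpy(storage, &hdr, sizeof(hdr));
}

/*
 * Return the record size recorded in the header, or 0 when the
 * storage is unformatted or the header is not plausible.
 */
static uint32_t erst_store_record_size(const uint8_t *storage, size_t size)
{
    ErstStoreHeader hdr;

    if (size < sizeof(hdr)) {
        return 0;
    }
    memcpy(&hdr, storage, sizeof(hdr));
    if (hdr.magic != ERST_STORE_MAGIC ||
        hdr.record_size == 0 || hdr.record_size > size) {
        return 0;
    }
    return hdr.record_size;
}
```

With something along these lines, the device could compare the stored
value against its own record size at realize time and error out on a
mismatch instead of misreading old storage.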

> >> +Below is an example layout of the backing store file.
> >> +The size of the file is a multiple of ERST_RECORD_SIZE,
> >> +and contains N number of "slots" to store records. The
> >> +example below shows two records (in CPER format) in the
> >> +backing file, while the remaining slots are empty/
> >> +available.
> >> +
> >> + Slot   Record
> >> +        +--------------------------------------------+
> >> +    0   | empty/available                            |
> >> +        +--------------------------------------------+
> >> +    1   | CPER                                       |
> >> +        +--------------------------------------------+
> >> +    2   | CPER                                       |
> >> +        +--------------------------------------------+
> >> +  ...   |                                            |
> >> +        +--------------------------------------------+
> >> +    N   | empty/available                            |
> >> +        +--------------------------------------------+
> >> +        <-------------- ERST_RECORD_SIZE ------------>  
> > 
> >   
> >> +Not all slots need to be occupied, and they need not be
> >> +occupied in a contiguous fashion. The ability to clear/erase
> >> +specific records allows for the formation of unoccupied
> >> +slots.  
> > I'd drop this as not necessary  
> 
> I'd like to keep this statement. Again, several BIOS on which I tested
> ERST had bugs around non-contiguous record storage.

I'd drop this and alter the sentence above ending with " in a simple fashion."
to describe sparse usage of storage; the example diagram then follows it.

I'd like this part to start with an unambiguous, concise description of
the format and to finish with the example diagram.
It is the part that will be considered the only true source for
how the file should be formatted when it comes to fixing bugs or
modifying the original impl. later on. So it's important to have a clear
description without any unnecessary information here.
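
To make the intended format concrete, here is a rough sketch of the
slot arithmetic for sparse, fixed-size slots. The 8KiB slot size and
the record ID offset of 96 are taken from the patch; the helper names
are made up for illustration, and treating "does not start with the
CPER signature" as "empty" is an assumption of this sketch. Any
storage header is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERST_RECORD_SIZE      8192u  /* 8KiB slots, as in the patch */
#define CPER_RECORD_ID_OFFSET 96u    /* record ID offset, as in the patch */

/* Assumption: a slot is occupied when it starts with "CPER". */
static int slot_is_occupied(const uint8_t *slot)
{
    return memcmp(slot, "CPER", 4) == 0;
}

/* Byte offset of slot 'index', or -1 when it lies outside storage. */
static long slot_offset(size_t storage_size, unsigned index)
{
    size_t nslots = storage_size / ERST_RECORD_SIZE;

    if (index >= nslots) {
        return -1;
    }
    return (long)((size_t)index * ERST_RECORD_SIZE);
}

/* Index of the first empty slot, or -1 when every slot is occupied. */
static int find_empty_slot(const uint8_t *storage, size_t storage_size)
{
    size_t nslots = storage_size / ERST_RECORD_SIZE;

    for (size_t i = 0; i < nslots; i++) {
        if (!slot_is_occupied(storage + i * ERST_RECORD_SIZE)) {
            return (int)i;
        }
    }
    return -1;
}

/* Index of the occupied slot holding 'record_id', or -1 if absent. */
static int find_record(const uint8_t *storage, size_t storage_size,
                       uint64_t record_id)
{
    size_t nslots = storage_size / ERST_RECORD_SIZE;

    for (size_t i = 0; i < nslots; i++) {
        const uint8_t *slot = storage + i * ERST_RECORD_SIZE;
        uint64_t id;

        if (!slot_is_occupied(slot)) {
            continue;  /* occupied slots need not be contiguous */
        }
        memcpy(&id, slot + CPER_RECORD_ID_OFFSET, sizeof(id));
        if (id == record_id) {
            return (int)i;
        }
    }
    return -1;
}
```

Both lookups scan every slot and skip empty ones, which is all that
"sparse usage" has to mean in the format description.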


> > 
> >   
> >> +
> >> +
> >> +References
> >> +----------
> >> +
> >> +[1] "Advanced Configuration and Power Interface Specification",
> >> +    version 4.0, June 2009.
> >> +
> >> +[2] "Unified Extensible Firmware Interface Specification",
> >> +    version 2.1, October 2008.
> >> +
> >> --
> >> 1.8.3.1
> >>  
> >   
> 



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-21 17:36     ` Eric DeVolder
@ 2021-07-26 10:13       ` Igor Mammedov
  0 siblings, 0 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 10:13 UTC (permalink / raw)
  To: Eric DeVolder; +Cc: jusual, qemu-devel, mst

On Wed, 21 Jul 2021 12:36:20 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/20/21 7:17 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 15:07:16 -0400
> > Eric DeVolder <eric.devolder@oracle.com> wrote:  
> >> +
> >> +static const TypeInfo erst_type_info = {
> >> +    .name          = TYPE_ACPI_ERST,
> >> +    .parent        = TYPE_PCI_DEVICE,
> >> +    .class_init    = erst_class_init,
> >> +    .instance_size = sizeof(ERSTDeviceState),
> >> +    .interfaces = (InterfaceInfo[]) {
> >> +        { INTERFACE_CONVENTIONAL_PCI_DEVICE },  
> > what is this for here?  
> 
> Without this, at run-time, I obtain:
> qemu-system-x86_64: ../hw/pci/pci.c:2673: pci_device_class_base_init: Assertion `conventional || pcie' failed.

Michael,
should we make it a PCI or a PCIe device?
Potential users are the arm-virt/q35/microvm machines (PCIe capable)
and the old 'pc' machine (PCI).

> >   
> >> +        { }
> >> +    }
> >> +};
> >> +
> >> +static void erst_register_types(void)
> >> +{
> >> +    type_register_static(&erst_type_info);
> >> +}
> >> +
> >> +type_init(erst_register_types)
> >> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
> >> index dd69577..262a8ee 100644
> >> --- a/hw/acpi/meson.build
> >> +++ b/hw/acpi/meson.build
> >> @@ -4,6 +4,7 @@ acpi_ss.add(files(
> >>     'aml-build.c',
> >>     'bios-linker-loader.c',
> >>     'utils.c',
> >> +  'erst.c',
> >>   ))
> >>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
> >>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))  
> >   
> 



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-21 16:07     ` Eric DeVolder
@ 2021-07-26 10:42       ` Igor Mammedov
  2021-07-26 20:01         ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 10:42 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 21 Jul 2021 11:07:40 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/20/21 7:17 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 15:07:16 -0400
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> This change implements the support for the ACPI ERST feature.  
> > Drop this  
> Done
> 
> >   
> >>
> >> This implements a PCI device for ACPI ERST. This implments the  
> > s/implments/implements/  
> Corrected
> 
> >   
> >> non-NVRAM "mode" of operation for ERST.  
> > add here why non-NVRAM "mode" is implemented.  
> How about:
> This implements a PCI device for ACPI ERST. This implments the
> non-NVRAM "mode" of operation for ERST as it is supported by
> Linux and Windows and aligns with ERST support in most BIOS.

Modulo typos, looks good to me.
Please consider using a spell checker to check commit messages/comments.

> 
> > 
> > Also even if this non-NVRAM implementation, there is still
> > a lot of not necessary data copying (see below) so drop it
> > or justify why it's there.
> >     
> >> This change also includes erst.c in the build of general ACPI support.  
> > Drop this as well  
> Done
> 
> > 
> >   
> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >> ---
> >>   hw/acpi/erst.c      | 704 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>   hw/acpi/meson.build |   1 +
> >>   2 files changed, 705 insertions(+)
> >>   create mode 100644 hw/acpi/erst.c
> >>
> >> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> >> new file mode 100644
> >> index 0000000..6e9bd2e
> >> --- /dev/null
> >> +++ b/hw/acpi/erst.c
> >> @@ -0,0 +1,704 @@
> >> +/*
> >> + * ACPI Error Record Serialization Table, ERST, Implementation
> >> + *
> >> + * Copyright (c) 2021 Oracle and/or its affiliates.
> >> + *
> >> + * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
> >> + * ACPI Platform Error Interfaces : Error Serialization
> >> + *
> >> + * This library is free software; you can redistribute it and/or
> >> + * modify it under the terms of the GNU Lesser General Public
> >> + * License as published by the Free Software Foundation;
> >> + * version 2 of the License.
> >> + *
> >> + * This library is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >> + * Lesser General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU Lesser General Public
> >> + * License along with this library; if not, see <http://www.gnu.org/licenses/>

consider adding SPDX-License-Identifier to all new files.
 
> >> + */
> >> +
> >> +#include <sys/types.h>
> >> +#include <sys/stat.h>
> >> +#include <unistd.h>
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "qapi/error.h"
> >> +#include "hw/qdev-core.h"
> >> +#include "exec/memory.h"
> >> +#include "qom/object.h"
> >> +#include "hw/pci/pci.h"
> >> +#include "qom/object_interfaces.h"
> >> +#include "qemu/error-report.h"
> >> +#include "migration/vmstate.h"
> >> +#include "hw/qdev-properties.h"
> >> +#include "hw/acpi/acpi.h"
> >> +#include "hw/acpi/acpi-defs.h"
> >> +#include "hw/acpi/aml-build.h"
> >> +#include "hw/acpi/bios-linker-loader.h"
> >> +#include "exec/address-spaces.h"
> >> +#include "sysemu/hostmem.h"
> >> +#include "hw/acpi/erst.h"
> >> +#include "trace.h"
> >> +
> >> +/* UEFI 2.1: Append N Common Platform Error Record */
> >> +#define UEFI_CPER_RECORD_MIN_SIZE 128U
> >> +#define UEFI_CPER_RECORD_LENGTH_OFFSET 20U
> >> +#define UEFI_CPER_RECORD_ID_OFFSET 96U
> >> +#define IS_UEFI_CPER_RECORD(ptr) \
> >> +    (((ptr)[0] == 'C') && \
> >> +     ((ptr)[1] == 'P') && \
> >> +     ((ptr)[2] == 'E') && \
> >> +     ((ptr)[3] == 'R'))
> >> +#define THE_UEFI_CPER_RECORD_ID(ptr) \
> >> +    (*(uint64_t *)(&(ptr)[UEFI_CPER_RECORD_ID_OFFSET]))
> >> +
> >> +/*
> >> + * This implementation is an ACTION (cmd) and VALUE (data)
> >> + * interface consisting of just two 64-bit registers.
> >> + */
> >> +#define ERST_REG_SIZE (2UL * sizeof(uint64_t))  
> >   
> >> +#define ERST_CSR_ACTION (0UL << 3) /* action (cmd) */
> >> +#define ERST_CSR_VALUE  (1UL << 3) /* argument/value (data) */  
> > what's meaning of CRS?  
> CSR = control status register
> > Looking at patch both should be called ERST_[ACTION|VALUE]_OFFSET  
> Done
> > pls use explicit offset values instead of shifting bit.  
> Done
> > 
> >   
> >> +/*
> >> + * ERST_RECORD_SIZE is the buffer size for exchanging ERST
> >> + * record contents. Thus, it defines the maximum record size.
> >> + * As this is mapped through a PCI BAR, it must be a power of
> >> + * two, and should be at least PAGE_SIZE.
> >> + * Records are stored in the backing file in a simple fashion.
> >> + * The backing file is essentially divided into fixed size
> >> + * "slots", ERST_RECORD_SIZE in length, with each "slot"
> >> + * storing a single record. No attempt at optimizing storage
> >> + * through compression, compaction, etc is attempted.
> >> + * NOTE that any change to this value will make any pre-
> >> + * existing backing files, not of the same ERST_RECORD_SIZE,
> >> + * unusable to the guest.
> >> + */
> >> +/* 8KiB records, not too small, not too big */
> >> +#define ERST_RECORD_SIZE (2UL * 4096)
> >> +
> >> +#define ERST_INVALID_RECORD_ID (~0UL)
> >> +#define ERST_EXECUTE_OPERATION_MAGIC 0x9CUL
> >> +
> >> +/*
> >> + * Object cast macro
> >> + */
> >> +#define ACPIERST(obj) \
> >> +    OBJECT_CHECK(ERSTDeviceState, (obj), TYPE_ACPI_ERST)
> >> +
> >> +/*
> >> + * Main ERST device state structure
> >> + */
> >> +typedef struct {
> >> +    PCIDevice parent_obj;
> >> +
> >> +    HostMemoryBackend *hostmem;
> >> +    MemoryRegion *hostmem_mr;
> >> +
> >> +    MemoryRegion iomem; /* programming registes */
> >> +    MemoryRegion nvmem; /* exchange buffer */
> >> +    uint32_t prop_size;  
> > s/^^^/storage_size/  
> Corrected
> 
> >   
> >> +    hwaddr bar0; /* programming registers */
> >> +    hwaddr bar1; /* exchange buffer */  
> > why do you need to keep this addresses around?
> > Suggest to drop these fields and use local variables or pci_get_bar_addr() at call site.  
> Corrected
> 
> >   
> >> +
> >> +    uint8_t operation;
> >> +    uint8_t busy_status;
> >> +    uint8_t command_status;
> >> +    uint32_t record_offset;
> >> +    uint32_t record_count;
> >> +    uint64_t reg_action;
> >> +    uint64_t reg_value;
> >> +    uint64_t record_identifier;
> >> +
> >> +    unsigned next_record_index;  
> > 
> >   
> >> +    uint8_t record[ERST_RECORD_SIZE]; /* read/written directly by guest */
> >> +    uint8_t tmp_record[ERST_RECORD_SIZE]; /* intermediate manipulation buffer */  
> > drop these see [**] below  
> Corrected
> 
> >   
> >> +
> >> +} ERSTDeviceState;
> >> +
> >> +/*******************************************************************/
> >> +/*******************************************************************/
> >> +
> >> +static unsigned copy_from_nvram_by_index(ERSTDeviceState *s, unsigned index)
> >> +{
> >> +    /* Read an nvram entry into tmp_record */
> >> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
> >> +    off_t offset = (index * ERST_RECORD_SIZE);
> >> +
> >> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
> >> +        if (s->hostmem_mr) {
> >> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
> >> +            memcpy(s->tmp_record, p + offset, ERST_RECORD_SIZE);
> >> +            rc = ACPI_ERST_STATUS_SUCCESS;
> >> +        }
> >> +    }
> >> +    return rc;
> >> +}
> >> +
> >> +static unsigned copy_to_nvram_by_index(ERSTDeviceState *s, unsigned index)
> >> +{
> >> +    /* Write entry in tmp_record into nvram, and backing file */
> >> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
> >> +    off_t offset = (index * ERST_RECORD_SIZE);
> >> +
> >> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
> >> +        if (s->hostmem_mr) {
> >> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
> >> +            memcpy(p + offset, s->tmp_record, ERST_RECORD_SIZE);
> >> +            rc = ACPI_ERST_STATUS_SUCCESS;
> >> +        }
> >> +    }
> >> +    return rc;
> >> +}
> >> +
> >> +static int lookup_erst_record_by_identifier(ERSTDeviceState *s,
> >> +    uint64_t record_identifier, bool *record_found, bool alloc_for_write)
> >> +{
> >> +    int rc = -1;
> >> +    int empty_index = -1;
> >> +    int index = 0;
> >> +    unsigned rrc;
> >> +
> >> +    *record_found = 0;
> >> +
> >> +    do {
> >> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);  
> > 
> > you have direct access to backend memory so there is no need
> > whatsoever to copy records from it to an intermediate buffer
> > everywhere. Almost all operations with records can be done
> > in place modulo EXECUTE_OPERATION action in BEGIN_[READ|WRITE]
> > context, where record is moved between backend and guest buffer.
> > 
> > So please eliminate all unnecessary copying.
> > (for fun, time operations and set backend size to some huge
> > value to see how expensive this code is)  
> 
> I've corrected this. In our previous exchanges, I thought the reference
> to copying was about trying to directly have guest write/read the appropriate
> record in the backend storage. After reading this comment I realized that
> yes I was doing a lot of copying (an artifact of the transition away from
> direct file i/o to MemoryBackend). So good find, and I've eliminated the
> intermediate copying.
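
A minimal standalone sketch of what "in place" access could look like,
assuming the backend is a flat array of fixed-size slots as in the patch.
The names slot_ptr(), clear_slot() and SLOT_SIZE are hypothetical and are
not taken from the revised patch; a plain byte buffer stands in for the
pointer returned by memory_region_get_ram_ptr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE 8192u /* assumption: mirrors ERST_RECORD_SIZE in the patch */

/*
 * Return a pointer to slot 'index' inside the backend, or NULL when the
 * slot would extend past the end of the backend.
 */
static uint8_t *slot_ptr(uint8_t *backend, size_t backend_size, unsigned index)
{
    size_t offset = (size_t)index * SLOT_SIZE;

    if (offset > backend_size || backend_size - offset < SLOT_SIZE) {
        return NULL;
    }
    return backend + offset;
}

/*
 * Clear a slot directly in the backend by filling it with 0xFF
 * (the invalid record id pattern); no intermediate buffer is used.
 */
static int clear_slot(uint8_t *backend, size_t backend_size, unsigned index)
{
    uint8_t *p = slot_ptr(backend, backend_size, index);

    if (!p) {
        return -1;
    }
    memset(p, 0xFF, SLOT_SIZE);
    return 0;
}
```

With this shape, lookup and clear touch only the slot they need, and a
copy remains only where a record moves between the backend and the
guest-visible exchange buffer.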
> 
> >   
> >> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
> >> +            uint64_t this_identifier;
> >> +            this_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
> >> +            if (IS_UEFI_CPER_RECORD(s->tmp_record) &&
> >> +                (this_identifier == record_identifier)) {
> >> +                rc = index;
> >> +                *record_found = 1;
> >> +                break;
> >> +            }
> >> +            if ((this_identifier == ERST_INVALID_RECORD_ID) &&
> >> +                (empty_index < 0)) {
> >> +                empty_index = index; /* first available for write */
> >> +            }
> >> +        }
> >> +        ++index;
> >> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
> >> +
> >> +    /* Record not found, allocate for writing */
> >> +    if ((rc < 0) && alloc_for_write) {
> >> +        rc = empty_index;
> >> +    }
> >> +
> >> +    return rc;
> >> +}
> >> +
> >> +static unsigned clear_erst_record(ERSTDeviceState *s)
> >> +{
> >> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
> >> +    bool record_found;
> >> +    int index;
> >> +
> >> +    index = lookup_erst_record_by_identifier(s,
> >> +        s->record_identifier, &record_found, 0);
> >> +    if (record_found) {
> >> +        memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
> >> +        rc = copy_to_nvram_by_index(s, (unsigned)index);
> >> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
> >> +            s->record_count -= 1;
> >> +        }
> >> +    }
> >> +
> >> +    return rc;
> >> +}
> >> +
> >> +static unsigned write_erst_record(ERSTDeviceState *s)
> >> +{
> >> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
> >> +
> >> +    if (s->record_offset < (ERST_RECORD_SIZE - UEFI_CPER_RECORD_MIN_SIZE)) {
> >> +        uint64_t record_identifier;
> >> +        uint8_t *record = &s->record[s->record_offset];
> >> +        bool record_found;
> >> +        int index;
> >> +
> >> +        record_identifier = (s->record_identifier == ERST_INVALID_RECORD_ID)
> >> +            ? THE_UEFI_CPER_RECORD_ID(record) : s->record_identifier;
> >> +
> >> +        index = lookup_erst_record_by_identifier(s,
> >> +            record_identifier, &record_found, 1);
> >> +        if (index < 0) {
> >> +            rc = ACPI_ERST_STATUS_NOT_ENOUGH_SPACE;
> >> +        } else {
> >> +            if (0 != s->record_offset) {
> >> +                memset(&s->tmp_record[ERST_RECORD_SIZE - s->record_offset],
> >> +                    0xFF, s->record_offset);
> >> +            }
> >> +            memcpy(s->tmp_record, record, ERST_RECORD_SIZE - s->record_offset);
> >> +            rc = copy_to_nvram_by_index(s, (unsigned)index);
> >> +            if (rc == ACPI_ERST_STATUS_SUCCESS) {
> >> +                if (!record_found) { /* not overwriting existing record */
> >> +                    s->record_count += 1; /* writing new record */
> >> +                }
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    return rc;
> >> +}
> >> +
> >> +static unsigned next_erst_record(ERSTDeviceState *s,
> >> +    uint64_t *record_identifier)
> >> +{
> >> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
> >> +    unsigned index;
> >> +    unsigned rrc;
> >> +
> >> +    *record_identifier = ERST_INVALID_RECORD_ID;
> >> +
> >> +    index = s->next_record_index;
> >> +    do {
> >> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);
> >> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
> >> +            if (IS_UEFI_CPER_RECORD(s->tmp_record)) {
> >> +                s->next_record_index = index + 1; /* where to start next time */
> >> +                *record_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
> >> +                rc = ACPI_ERST_STATUS_SUCCESS;
> >> +                break;
> >> +            }
> >> +            ++index;
> >> +        } else {
> >> +            if (s->next_record_index == 0) {
> >> +                rc = ACPI_ERST_STATUS_RECORD_STORE_EMPTY;
> >> +            }
> >> +            s->next_record_index = 0; /* at end, reset */
> >> +        }
> >> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
> >> +
> >> +    return rc;
> >> +}
> >> +
> >> +static unsigned read_erst_record(ERSTDeviceState *s)
> >> +{
> >> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
> >> +    bool record_found;
> >> +    int index;
> >> +
> >> +    index = lookup_erst_record_by_identifier(s,
> >> +        s->record_identifier, &record_found, 0);
> >> +    if (record_found) {
> >> +        rc = copy_from_nvram_by_index(s, (unsigned)index);
> >> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
> >> +            if (s->record_offset < ERST_RECORD_SIZE) {
> >> +                memcpy(&s->record[s->record_offset], s->tmp_record,
> >> +                    ERST_RECORD_SIZE - s->record_offset);
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    return rc;
> >> +}
> >> +
> >> +static unsigned get_erst_record_count(ERSTDeviceState *s)
> >> +{
> >> +    /* Compute record_count */
> >> +    unsigned index = 0;
> >> +
> >> +    s->record_count = 0;
> >> +    while (copy_from_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
> >> +        uint8_t *ptr = &s->tmp_record[0];
> >> +        uint64_t record_identifier = THE_UEFI_CPER_RECORD_ID(ptr);
> >> +        if (IS_UEFI_CPER_RECORD(ptr) &&
> >> +            (ERST_INVALID_RECORD_ID != record_identifier)) {
> >> +            s->record_count += 1;
> >> +        }
> >> +        ++index;
> >> +    }
> >> +    return s->record_count;
> >> +}
> >> +
> >> +/*******************************************************************/
> >> +
> >> +static uint64_t erst_rd_reg64(hwaddr addr,
> >> +    uint64_t reg, unsigned size)
> >> +{
> >> +    uint64_t rdval;
> >> +    uint64_t mask;
> >> +    unsigned shift;
> >> +
> >> +    if (size == sizeof(uint64_t)) {
> >> +        /* 64b access */
> >> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> >> +        shift = 0;
> >> +    } else {
> >> +        /* 32b access */
> >> +        mask = 0x00000000FFFFFFFFUL;
> >> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> >> +    }
> >> +
> >> +    rdval = reg;
> >> +    rdval >>= shift;
> >> +    rdval &= mask;
> >> +
> >> +    return rdval;
> >> +}
> >> +
> >> +static uint64_t erst_wr_reg64(hwaddr addr,
> >> +    uint64_t reg, uint64_t val, unsigned size)
> >> +{
> >> +    uint64_t wrval;
> >> +    uint64_t mask;
> >> +    unsigned shift;
> >> +
> >> +    if (size == sizeof(uint64_t)) {
> >> +        /* 64b access */
> >> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> >> +        shift = 0;
> >> +    } else {
> >> +        /* 32b access */
> >> +        mask = 0x00000000FFFFFFFFUL;
> >> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> >> +    }
> >> +
> >> +    val &= mask;
> >> +    val <<= shift;
> >> +    mask <<= shift;
> >> +    wrval = reg;
> >> +    wrval &= ~mask;
> >> +    wrval |= val;
> >> +
> >> +    return wrval;
> >> +}  
> > (I see in next patch it's us defining access width in the ACPI tables)
> > so question is: do we have to have mixed register width access?
> > can't all register accesses be 64-bit?  
> 
> Initially I attempted to just use 64-bit exclusively. The problem is that,
> for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
> register access into two. Here's an example of reading the exchange buffer
> address, which is coded as a 64b access:
> 
> acpi_erst_reg_write addr: 0x0000 <== 0x000000000000000d (size: 4)
> acpi_erst_reg_read  addr: 0x0008 ==> 0x00000000c1010000 (size: 4)
> acpi_erst_reg_read  addr: 0x000c ==> 0x0000000000000000 (size: 4)
> 
> So I went ahead and made ACTION register accesses 32b, else there would
> be two reads of 32 bits, of which the second is useless.

could you post the decompiled ERST table here?
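
For reference, the split access seen in the trace above can be modeled
with a small standalone sketch. It follows the same mask-and-shift idea
as erst_rd_reg64()/erst_wr_reg64() in the patch, but reg64_read() and
reg64_write() are hypothetical names, and the sketch assumes only 4-byte
and 8-byte accesses occur.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Merge a 4- or 8-byte write at byte address 'addr' into the 64-bit
 * register value 'reg' and return the updated register.
 * A 4-byte write at an address with bit 2 set lands in the upper half.
 */
static uint64_t reg64_write(uint64_t reg, uint64_t addr,
                            uint64_t val, unsigned size)
{
    uint64_t mask = (size == 8) ? UINT64_MAX : 0xFFFFFFFFull;
    unsigned shift = (size == 8) ? 0 : ((addr & 0x4) ? 32 : 0);

    return (reg & ~(mask << shift)) | ((val & mask) << shift);
}

/* Return the 4- or 8-byte view of 'reg' selected by 'addr' and 'size'. */
static uint64_t reg64_read(uint64_t reg, uint64_t addr, unsigned size)
{
    uint64_t mask = (size == 8) ? UINT64_MAX : 0xFFFFFFFFull;
    unsigned shift = (size == 8) ? 0 : ((addr & 0x4) ? 32 : 0);

    return (reg >> shift) & mask;
}
```

Two 4-byte accesses at offsets 0x8 and 0xc then see the low and high
halves of the same 64-bit value, which matches the pair of reads in the
trace.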

> 
> >   
> >> +static void erst_reg_write(void *opaque, hwaddr addr,
> >> +    uint64_t val, unsigned size)
> >> +{
> >> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> >> +
> >> +    /*
> >> +     * NOTE: All actions/operations/side effects happen on the WRITE,
> >> +     * by design. The READs simply return the reg_value contents.
> >> +     */
> >> +    trace_acpi_erst_reg_write(addr, val, size);
> >> +
> >> +    switch (addr) {
> >> +    case ERST_CSR_VALUE + 0:
> >> +    case ERST_CSR_VALUE + 4:
> >> +        s->reg_value = erst_wr_reg64(addr, s->reg_value, val, size);
> >> +        break;
> >> +    case ERST_CSR_ACTION + 0:
> >> +/*  case ERST_CSR_ACTION+4: as coded, not really a 64b register */
> >> +        switch (val) {
> >> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
> >> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> >> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> >> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> >> +        case ACPI_ERST_ACTION_END_OPERATION:
> >> +            s->operation = val;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
> >> +            s->record_offset = s->reg_value;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
> >> +            if ((uint8_t)s->reg_value == ERST_EXECUTE_OPERATION_MAGIC) {
> >> +                s->busy_status = 1;
> >> +                switch (s->operation) {
> >> +                case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
> >> +                    s->command_status = write_erst_record(s);
> >> +                    break;
> >> +                case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> >> +                    s->command_status = read_erst_record(s);
> >> +                    break;
> >> +                case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> >> +                    s->command_status = clear_erst_record(s);
> >> +                    break;
> >> +                case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> >> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
> >> +                    break;
> >> +                case ACPI_ERST_ACTION_END_OPERATION:
> >> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
> >> +                    break;
> >> +                default:
> >> +                    s->command_status = ACPI_ERST_STATUS_FAILED;
> >> +                    break;
> >> +                }
> >> +                s->record_identifier = ERST_INVALID_RECORD_ID;
> >> +                s->busy_status = 0;
> >> +            }
> >> +            break;
> >> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
> >> +            s->reg_value = s->busy_status;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
> >> +            s->reg_value = s->command_status;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
> >> +            s->command_status = next_erst_record(s, &s->reg_value);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
> >> +            s->record_identifier = s->reg_value;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
> >> +            s->reg_value = s->record_count;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
> >> +            s->reg_value = s->bar1;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
> >> +            s->reg_value = ERST_RECORD_SIZE;
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
> >> +            s->reg_value = 0x0; /* intentional, not NVRAM mode */
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
> >> +            /*
> >> +             * 100UL is max, 10UL is nominal  
> > 100/10 of what, also add reference to spec/table it comes from
> > and explain in comment why these values were chosen  
> I've changed the comment and style to be similar to build_amd_iommu().
> These are merely sane non-zero max/min times.
> 
> >   
> >> +             */
> >> +            s->reg_value = ((100UL << 32) | (10UL << 0));
> >> +            break;
> >> +        case ACPI_ERST_ACTION_RESERVED:  
> > not necessary, it will be handled by 'default:'  
> Corrected
> 
> >   
> >> +        default:
> >> +            /*
> >> +             * Unknown action/command, NOP
> >> +             */
> >> +            break;
> >> +        }
> >> +        break;
> >> +    default:
> >> +        /*
> >> +         * This should not happen, but if it does, NOP
> >> +         */
> >> +        break;
> >> +    }
> >> +}
> >> +
> >> +static uint64_t erst_reg_read(void *opaque, hwaddr addr,
> >> +                                unsigned size)
> >> +{
> >> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> >> +    uint64_t val = 0;
> >> +
> >> +    switch (addr) {
> >> +    case ERST_CSR_ACTION + 0:
> >> +    case ERST_CSR_ACTION + 4:
> >> +        val = erst_rd_reg64(addr, s->reg_action, size);
> >> +        break;
> >> +    case ERST_CSR_VALUE + 0:
> >> +    case ERST_CSR_VALUE + 4:
> >> +        val = erst_rd_reg64(addr, s->reg_value, size);
> >> +        break;
> >> +    default:
> >> +        break;
> >> +    }
> >> +    trace_acpi_erst_reg_read(addr, val, size);
> >> +    return val;
> >> +}
> >> +
> >> +static const MemoryRegionOps erst_reg_ops = {
> >> +    .read = erst_reg_read,
> >> +    .write = erst_reg_write,
> >> +    .endianness = DEVICE_NATIVE_ENDIAN,
> >> +};
> >> +
> >> +static void erst_mem_write(void *opaque, hwaddr addr,
> >> +    uint64_t val, unsigned size)
> >> +{
> >> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;  
> >   
> >> +    uint8_t *ptr = &s->record[addr - 0];
> >> +    trace_acpi_erst_mem_write(addr, val, size);
> >> +    switch (size) {
> >> +    default:
> >> +    case sizeof(uint8_t):
> >> +        *(uint8_t *)ptr = (uint8_t)val;
> >> +        break;
> >> +    case sizeof(uint16_t):
> >> +        *(uint16_t *)ptr = (uint16_t)val;
> >> +        break;
> >> +    case sizeof(uint32_t):
> >> +        *(uint32_t *)ptr = (uint32_t)val;
> >> +        break;
> >> +    case sizeof(uint64_t):
> >> +        *(uint64_t *)ptr = (uint64_t)val;
> >> +        break;
> >> +    }
> >> +}
> >> +
> >> +static uint64_t erst_mem_read(void *opaque, hwaddr addr,
> >> +                                unsigned size)
> >> +{
> >> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
> >> +    uint8_t *ptr = &s->record[addr - 0];
> >> +    uint64_t val = 0;
> >> +    switch (size) {
> >> +    default:
> >> +    case sizeof(uint8_t):
> >> +        val = *(uint8_t *)ptr;
> >> +        break;
> >> +    case sizeof(uint16_t):
> >> +        val = *(uint16_t *)ptr;
> >> +        break;
> >> +    case sizeof(uint32_t):
> >> +        val = *(uint32_t *)ptr;
> >> +        break;
> >> +    case sizeof(uint64_t):
> >> +        val = *(uint64_t *)ptr;
> >> +        break;
> >> +    }
> >> +    trace_acpi_erst_mem_read(addr, val, size);
> >> +    return val;
> >> +}
> >> +
> >> +static const MemoryRegionOps erst_mem_ops = {
> >> +    .read = erst_mem_read,
> >> +    .write = erst_mem_write,
> >> +    .endianness = DEVICE_NATIVE_ENDIAN,
> >> +};
> >> +
> >> +/*******************************************************************/
> >> +/*******************************************************************/
> >> +
> >> +static const VMStateDescription erst_vmstate  = {
> >> +    .name = "acpi-erst",
> >> +    .version_id = 1,
> >> +    .minimum_version_id = 1,
> >> +    .fields = (VMStateField[]) {
> >> +        VMSTATE_UINT8(operation, ERSTDeviceState),
> >> +        VMSTATE_UINT8(busy_status, ERSTDeviceState),
> >> +        VMSTATE_UINT8(command_status, ERSTDeviceState),
> >> +        VMSTATE_UINT32(record_offset, ERSTDeviceState),
> >> +        VMSTATE_UINT32(record_count, ERSTDeviceState),
> >> +        VMSTATE_UINT64(reg_action, ERSTDeviceState),
> >> +        VMSTATE_UINT64(reg_value, ERSTDeviceState),
> >> +        VMSTATE_UINT64(record_identifier, ERSTDeviceState),
> >> +        VMSTATE_UINT8_ARRAY(record, ERSTDeviceState, ERST_RECORD_SIZE),
> >> +        VMSTATE_UINT8_ARRAY(tmp_record, ERSTDeviceState, ERST_RECORD_SIZE),
> >> +        VMSTATE_END_OF_LIST()
> >> +    }
> >> +};
> >> +
> >> +static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
> >> +{
> >> +    ERSTDeviceState *s = ACPIERST(pci_dev);
> >> +    unsigned index = 0;
> >> +    bool share;
> >> +
> >> +    trace_acpi_erst_realizefn_in();
> >> +
> >> +    if (!s->hostmem) {
> >> +        error_setg(errp, "'" ACPI_ERST_MEMDEV_PROP "' property is not set");
> >> +        return;
> >> +    } else if (host_memory_backend_is_mapped(s->hostmem)) {
> >> +        error_setg(errp, "can't use already busy memdev: %s",
> >> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
> >> +        return;
> >> +    }
> >> +
> >> +    share = object_property_get_bool(OBJECT(s->hostmem), "share", &error_fatal);  
> > s/&error_fatal/errp/  
> Corrected
> 
> >   
> >> +    if (!share) {
> >> +        error_setg(errp, "ACPI ERST requires hostmem property share=on: %s",
> >> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
> >> +    }  
> > This limits the usable backends to file|memfd only, so
> > I wonder if we really need this limitation; what if the user doesn't
> > care about preserving it across QEMU restarts. (i.e. usecase
> > where storage is used as a means to troubleshoot guest crash
> > i.e. QEMU is not restarted in between)
> > 
> > Maybe instead of enforcing we should just document that if user
> > wishes to preserve content they should use file|memfd backend with
> > share=on option.  
> 
> I've removed this check. It is documented the way it is intended to be used.
> 
> >   
> >> +
> >> +    s->hostmem_mr = host_memory_backend_get_memory(s->hostmem);
> >> +
> >> +    /* HostMemoryBackend size will be multiple of PAGE_SIZE */
> >> +    s->prop_size = object_property_get_int(OBJECT(s->hostmem), "size", &error_fatal);  
> > s/&error_fatal/errp/  
> Corrected
> 
> >   
> >> +
> >> +    /* Convert prop_size to integer multiple of ERST_RECORD_SIZE */
> >> +    s->prop_size -= (s->prop_size % ERST_RECORD_SIZE);  
> > 
> > pls, no fixups on behalf of user, if size is not what it should be
> > error out with suggestion how to fix it.  
> Removed
> 
> >   
> >> +
> >> +    /*
> >> +     * MemoryBackend initializes contents to zero, but we actually
> >> +     * want contents initialized to 0xFF, ERST_INVALID_RECORD_ID.
> >> +     */
> >> +    if (copy_from_nvram_by_index(s, 0) == ACPI_ERST_STATUS_SUCCESS) {
> >> +        if (s->tmp_record[0] == 0x00) {
> >> +            memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);  
> > this doesn't scale,
> > (set backend size to more than host physical RAM, put it on slow storage and have fun.)  
> of course, which is why I think we need to have an upper bound (my early
> submissions did).
> 
> > 
> > Is it possible to use 0 as invalid record id or change storage format
> > so you would not have to rewrite whole file at startup (maybe some sort  
> no
> 
> > of metadata header/records book-keeping table before actual records.
> > And initialize file only if header is invalid.)  
> I have to scan the backend storage anyway in order to initialize the record
> count, so I've combined that scan with a test to see if the backend storage
> needs to be initialized.


if you add a records table at the start of the backend,
then you won't need to read/write the whole file.
It would be enough to read/initialize the header only and access
actual records only when necessary. Header could look something like:

|erst magic string|
|slot size|
|slots nr|
+++++++++++++++++ slots header ++++++++++++
|is_empty, record offset from file start, maybe something else that would speed up access|
...
|last record descriptor|
++++++++++ actual records +++++++++++++
|slot 0|
...
|slot n|
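
A rough C sketch of that layout, only to make the idea concrete. All
names here (erst_store_header, erst_slot_desc, ERST_HDR_MAGIC,
erst_store_init, erst_store_is_valid) and the field widths are
assumptions for illustration, not a proposed on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERST_HDR_MAGIC "ERSTBACK" /* hypothetical 8-byte magic string */

struct erst_store_header {
    uint8_t  magic[8];
    uint32_t slot_size;  /* bytes per record slot */
    uint32_t slot_count; /* number of slots that fit in the backend */
};

struct erst_slot_desc {
    uint32_t is_empty;      /* 1 when the slot holds no record */
    uint32_t reserved;
    uint64_t record_offset; /* offset of the slot from file start */
};

/* Check only the header; no record slot is touched. */
static int erst_store_is_valid(const uint8_t *backend, size_t size,
                               uint32_t slot_size)
{
    struct erst_store_header hdr;

    if (size < sizeof(hdr)) {
        return 0;
    }
    memcpy(&hdr, backend, sizeof(hdr));
    return memcmp(hdr.magic, ERST_HDR_MAGIC, sizeof(hdr.magic)) == 0 &&
           hdr.slot_size == slot_size;
}

/*
 * Write the header and one descriptor per slot; the slots themselves are
 * left untouched. Returns the slot count, or 0 if nothing fits.
 */
static uint32_t erst_store_init(uint8_t *backend, size_t size,
                                uint32_t slot_size)
{
    struct erst_store_header hdr;
    size_t n, i;

    if (slot_size == 0 || size < sizeof(hdr)) {
        return 0;
    }
    n = (size - sizeof(hdr)) / (sizeof(struct erst_slot_desc) + slot_size);
    if (n == 0) {
        return 0;
    }
    memcpy(hdr.magic, ERST_HDR_MAGIC, sizeof(hdr.magic));
    hdr.slot_size = slot_size;
    hdr.slot_count = (uint32_t)n;
    memcpy(backend, &hdr, sizeof(hdr));
    for (i = 0; i < n; i++) {
        struct erst_slot_desc d;

        d.is_empty = 1;
        d.reserved = 0;
        d.record_offset = sizeof(hdr) + n * sizeof(d) + i * (size_t)slot_size;
        memcpy(backend + sizeof(hdr) + i * sizeof(d), &d, sizeof(d));
    }
    return (uint32_t)n;
}
```

At startup the device would call the validity check first and run the
init path only for a fresh backend, so the cost depends on the number of
slots rather than on the total backend size.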

> >> +            index = 0;
> >> +            while (copy_to_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
> >> +                ++index;
> >> +            }  
> > also back&forth copying here is not really necessary.  
> corrected
> 
> >   
> >> +        }
> >> +    }
> >> +
> >> +    /* Initialize record_count */
> >> +    get_erst_record_count(s);  
> > why not put it into reset?  
> It is initialized once, then subsequent write/clear operations update
> the counter as needed.

ok

> >   
> >> +
> >> +    /* BAR 0: Programming registers */
> >> +    memory_region_init_io(&s->iomem, OBJECT(pci_dev), &erst_reg_ops, s,
> >> +                          TYPE_ACPI_ERST, ERST_REG_SIZE);
> >> +    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->iomem);
> >> +  
> >   
> >> +    /* BAR 1: Exchange buffer memory */
> >> +    memory_region_init_io(&s->nvmem, OBJECT(pci_dev), &erst_mem_ops, s,
> >> +                          TYPE_ACPI_ERST, ERST_RECORD_SIZE);
> >> +    pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->nvmem);  
> > 
> > **)
> > instead of using mmio for buffer where each write causes
> > guest exit to QEMU, map memory region directly to guest.
> > see ivshmem_bar2, the only difference with ivshmem, you'd
> > create memory region manually (for example you can use
> > memory_region_init_resizeable_ram)
> > 
> > this way you can speedup access and drop erst_mem_ops and
> > [tmp_]record intermediate buffers.
> > 
> > Instead of [tmp_]record you can copy record content
> > directly between buffer and backend memory regions.  
> 
> I've changed the exchange buffer into a MemoryBackend object and
> eliminated the erst_mem_ops.
> 
> >   
> >> +    /*
> >> +     * The vmstate_register_ram_global() puts the memory in
> >> +     * migration stream, where it is written back to the memory
> >> +     * upon reaching the destination, which causes the backing
> >> +     * file to be updated (with share=on).
> >> +     */
> >> +    vmstate_register_ram_global(s->hostmem_mr);
> >> +
> >> +    trace_acpi_erst_realizefn_out(s->prop_size);
> >> +}
> >> +
> >> +static void erst_reset(DeviceState *dev)
> >> +{
> >> +    ERSTDeviceState *s = ACPIERST(dev);
> >> +
> >> +    trace_acpi_erst_reset_in(s->record_count);
> >> +    s->operation = 0;
> >> +    s->busy_status = 0;
> >> +    s->command_status = ACPI_ERST_STATUS_SUCCESS;  
> >   
> >> +    /* indicate empty/no-more until further notice */  
> > pls rephrase, I'm not sure what it's trying to say  
> Eliminated; I don't know what I was trying to say there either
> >   
> >> +    s->record_identifier = ERST_INVALID_RECORD_ID;
> >> +    s->record_offset = 0;
> >> +    s->next_record_index = 0;  
> >   
> >> +    /* NOTE: record_count and nvram are initialized elsewhere */
> >> +    trace_acpi_erst_reset_out(s->record_count);
> >> +}
> >> +
> >> +static Property erst_properties[] = {
> >> +    DEFINE_PROP_LINK(ACPI_ERST_MEMDEV_PROP, ERSTDeviceState, hostmem,
> >> +                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> >> +    DEFINE_PROP_END_OF_LIST(),
> >> +};
> >> +
> >> +static void erst_class_init(ObjectClass *klass, void *data)
> >> +{
> >> +    DeviceClass *dc = DEVICE_CLASS(klass);
> >> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> >> +
> >> +    trace_acpi_erst_class_init_in();
> >> +    k->realize = erst_realizefn;
> >> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> >> +    k->device_id = PCI_DEVICE_ID_REDHAT_ACPI_ERST;
> >> +    k->revision = 0x00;
> >> +    k->class_id = PCI_CLASS_OTHERS;
> >> +    dc->reset = erst_reset;
> >> +    dc->vmsd = &erst_vmstate;
> >> +    dc->user_creatable = true;
> >> +    device_class_set_props(dc, erst_properties);
> >> +    dc->desc = "ACPI Error Record Serialization Table (ERST) device";
> >> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >> +    trace_acpi_erst_class_init_out();
> >> +}
> >> +
> >> +static const TypeInfo erst_type_info = {
> >> +    .name          = TYPE_ACPI_ERST,
> >> +    .parent        = TYPE_PCI_DEVICE,
> >> +    .class_init    = erst_class_init,
> >> +    .instance_size = sizeof(ERSTDeviceState),
> >> +    .interfaces = (InterfaceInfo[]) {
> >> +        { INTERFACE_CONVENTIONAL_PCI_DEVICE },  
> > what is this for here?
> >   
> >> +        { }
> >> +    }
> >> +};
> >> +
> >> +static void erst_register_types(void)
> >> +{
> >> +    type_register_static(&erst_type_info);
> >> +}
> >> +
> >> +type_init(erst_register_types)
> >> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
> >> index dd69577..262a8ee 100644
> >> --- a/hw/acpi/meson.build
> >> +++ b/hw/acpi/meson.build
> >> @@ -4,6 +4,7 @@ acpi_ss.add(files(
> >>     'aml-build.c',
> >>     'bios-linker-loader.c',
> >>     'utils.c',
> >> +  'erst.c',
> >>   ))
> >>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
> >>   acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))  
> >   
> 




* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-07-21 16:12     ` Eric DeVolder
@ 2021-07-26 11:00       ` Igor Mammedov
  2021-07-26 20:02         ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 11:00 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 21 Jul 2021 11:12:41 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/20/21 8:16 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 15:07:17 -0400
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> This code is called from the machine code (if ACPI supported)
> >> to generate the ACPI ERST table.  
> > should be along lines:
> > This builds ACPI ERST table /spec ref/ to inform OSPM
> > how to communicate with ... device.  
> 
> Like this?
> This builds the ACPI ERST table to inform OSPM how to communicate
                                 ^ [1]
> with the acpi-erst device.
> 

1) ACPI spec vX.Y, chapter foo

> 
> >   
> >>
> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >> ---
> >>   hw/acpi/erst.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>   1 file changed, 214 insertions(+)
> >>
> >> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> >> index 6e9bd2e..1f1dbbc 100644
> >> --- a/hw/acpi/erst.c
> >> +++ b/hw/acpi/erst.c
> >> @@ -555,6 +555,220 @@ static const MemoryRegionOps erst_mem_ops = {
> >>   /*******************************************************************/
> >>   /*******************************************************************/
> >>   
> >> +/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
> >> +static void build_serialization_instruction_entry(GArray *table_data,
> >> +    uint8_t serialization_action,
> >> +    uint8_t instruction,
> >> +    uint8_t flags,
> >> +    uint8_t register_bit_width,
> >> +    uint64_t register_address,
> >> +    uint64_t value,
> >> +    uint64_t mask)  
> > like I mentioned in previous patch, It could be simplified
> > a lot if it's possible to use fixed 64-bit access with every
> > action and the same width mask.  
> See previous response.
let's see if it's a guest OS issue first, and then decide what to do with it.

> 
> >   
> >> +{
> >> +    /* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
> >> +    struct AcpiGenericAddress gas;
> >> +
> >> +    /* Serialization Action */
> >> +    build_append_int_noprefix(table_data, serialization_action, 1);
> >> +    /* Instruction */
> >> +    build_append_int_noprefix(table_data, instruction         , 1);
> >> +    /* Flags */
> >> +    build_append_int_noprefix(table_data, flags               , 1);
> >> +    /* Reserved */
> >> +    build_append_int_noprefix(table_data, 0                   , 1);
> >> +    /* Register Region */
> >> +    gas.space_id = AML_SYSTEM_MEMORY;
> >> +    gas.bit_width = register_bit_width;
> >> +    gas.bit_offset = 0;
> >> +    switch (register_bit_width) {
> >> +    case 8:
> >> +        gas.access_width = 1;
> >> +        break;
> >> +    case 16:
> >> +        gas.access_width = 2;
> >> +        break;
> >> +    case 32:
> >> +        gas.access_width = 3;
> >> +        break;
> >> +    case 64:
> >> +        gas.access_width = 4;
> >> +        break;
> >> +    default:
> >> +        gas.access_width = 0;
> >> +        break;
> >> +    }
> >> +    gas.address = register_address;
> >> +    build_append_gas_from_struct(table_data, &gas);
> >> +    /* Value */
> >> +    build_append_int_noprefix(table_data, value  , 8);
> >> +    /* Mask */
> >> +    build_append_int_noprefix(table_data, mask   , 8);
> >> +}
> >> +
> >> +/* ACPI 4.0: 17.4.1 Serialization Action Table */
> >> +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
> >> +    const char *oem_id, const char *oem_table_id)
> >> +{
> >> +    ERSTDeviceState *s = ACPIERST(erst_dev);  
> > 
> > globals are not welcomed in new code,
> > pass erst_dev as argument here (ex: find_vmgenid_dev)
> >   
> >> +    unsigned action;
> >> +    unsigned erst_start = table_data->len;
> >> +  
> >   
> >> +    s->bar0 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
> >> +    trace_acpi_erst_pci_bar_0(s->bar0);
> >> +    s->bar1 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);  
> > 
> > just store pci_get_bar_addr(PCI_DEVICE(erst_dev), 0) in local variable,
> > Bar 1 is not used in this function so you don't need it here.  
> Corrected
> 
> > 
> >   
> >> +    trace_acpi_erst_pci_bar_1(s->bar1);
> >> +
> >> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
> >> +    /* serialization_header_length */  
> > comments documenting table entries should be verbatim copy from spec,
> > see build_amd_iommu() as example of preferred style.  
> Corrected
> 
> >   
> >> +    build_append_int_noprefix(table_data, 48, 4);
> >> +    /* reserved */
> >> +    build_append_int_noprefix(table_data,  0, 4);
> >> +    /*
> >> +     * instruction_entry_count - changes to the number of serialization
> >> +     * instructions in the ACTIONs below must be reflected in this
> >> +     * pre-computed value.
> >> +     */
> >> +    build_append_int_noprefix(table_data, 29, 4);  
> > a bit fragile as it can easily diverge from actual number later on.
> > maybe instead of building instruction entries in place, build it
> > in separate array and when done, use actual count to fill instruction_entry_count.
> > pseudo code could look like:
> > 
> >       /* prepare instructions in advance because ... */
> >       GArray *table_instruction_data = g_array_new(false, true, 1 /* element_size */);
> >       build_serialization_instruction_entry(table_instruction_data,...);
> >       ...
> >       build_serialization_instruction_entry(table_instruction_data,...);
> >       /* instructions count */
> >       build_append_int_noprefix(table_data, table_instruction_data->len / entry_size, 4);
> >       /* copy prepared in advance instructions */
> >       g_array_append_vals(table_data, table_instruction_data->data, table_instruction_data->len);
> >       g_array_free(table_instruction_data, true);  
> Corrected
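
The count-from-length idea in the pseudo code above can be sketched outside
QEMU. This is only an illustration under stated assumptions: struct byte_buf,
buf_append_le(), append_entry() and entry_count() are hypothetical stand-ins
(QEMU itself uses GArray and build_append_int_noprefix()), and the 32-byte
entry layout is the ACPI Serialization Instruction Entry as I understand it
(four one-byte fields, a 12-byte Generic Address Structure, an 8-byte Value
and an 8-byte Mask).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One Serialization Instruction Entry: 1 + 1 + 1 + 1 + 12 (GAS) + 8 + 8. */
#define ERST_ENTRY_SIZE 32

/* Hypothetical growable byte buffer standing in for QEMU's GArray. */
struct byte_buf {
    uint8_t *data;
    size_t len;
    size_t cap;
};

/* Append the low n bytes (n <= 8) of v in little-endian order. */
static void buf_append_le(struct byte_buf *b, uint64_t v, size_t n)
{
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap * 2 : 64;
        while (cap < b->len + n) {
            cap *= 2;
        }
        uint8_t *p = realloc(b->data, cap);
        assert(p != NULL);
        b->data = p;
        b->cap = cap;
    }
    for (size_t i = 0; i < n; i++) {
        b->data[b->len++] = (uint8_t)(v >> (8 * i));
    }
}

/* GAS access size encoding: 1 = byte, 2 = word, 3 = dword, 4 = qword. */
static uint8_t gas_access_size(unsigned bit_width)
{
    switch (bit_width) {
    case 8:  return 1;
    case 16: return 2;
    case 32: return 3;
    case 64: return 4;
    default: return 0; /* undefined */
    }
}

/* Append one instruction entry; the register lives in system memory. */
static void append_entry(struct byte_buf *b, uint8_t action,
                         uint8_t instruction, uint8_t flags,
                         unsigned bit_width, uint64_t address,
                         uint64_t value, uint64_t mask)
{
    buf_append_le(b, action, 1);
    buf_append_le(b, instruction, 1);
    buf_append_le(b, flags, 1);
    buf_append_le(b, 0, 1);                          /* Reserved */
    buf_append_le(b, 0, 1);                          /* GAS: system memory */
    buf_append_le(b, bit_width, 1);                  /* GAS: bit width */
    buf_append_le(b, 0, 1);                          /* GAS: bit offset */
    buf_append_le(b, gas_access_size(bit_width), 1); /* GAS: access size */
    buf_append_le(b, address, 8);                    /* GAS: address */
    buf_append_le(b, value, 8);
    buf_append_le(b, mask, 8);
}

/* The count is derived from what was actually emitted, never hard-coded. */
static uint32_t entry_count(const struct byte_buf *b)
{
    assert(b->len % ERST_ENTRY_SIZE == 0);
    return (uint32_t)(b->len / ERST_ENTRY_SIZE);
}
```

With this shape, adding or dropping an append_entry() call changes the value
written for instruction_entry_count automatically, which is the fragility the
review comment is pointing at.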
> 
> >     
> >   
> >> +
> >> +#define MASK8  0x00000000000000FFUL
> >> +#define MASK16 0x000000000000FFFFUL
> >> +#define MASK32 0x00000000FFFFFFFFUL
> >> +#define MASK64 0xFFFFFFFFFFFFFFFFUL
> >> +
> >> +    for (action = 0; action < ACPI_ERST_MAX_ACTIONS; ++action) {  
> > I'd unroll this loop and just directly code entries in required order.
> > also drop reserved and nop actions/instructions or explain why they are necessary.  
> Unrolled. Dropped the NOP.

What about 'reserved'?

> 
> >   
> >> +        switch (action) {
> >> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:  
> > given these names will/should never be exposed outside of hw/acpi/erst.c
> > I'd drop ACPI_ERST_ACTION_/ACPI_ERST_INST_ prefixes (i.e. use names as defined in spec)
> > if it doesn't cause build issues.  
> These are in include/hw/acpi/erst.h which is included by hw/i386/acpi-build.c,
> which includes many other hardware files.
> Removing the prefix leaves a rather generic name.
> I'd prefer to leave them as it uniquely differentiates.
Is there a reason to put them into erst.h and expose them to the outside world?
If not, then it might be better to move them into erst.c.

> 
> 
> >   
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_END_OPERATION:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
> >> +                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
> >> +                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
> >> +                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_RESERVED:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> >> +            break;
> >> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> >> +        default:
> >> +            build_serialization_instruction_entry(table_data, action,
> >> +                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
> >> +            break;
> >> +        }
> >> +    }
> >> +    build_header(linker, table_data,
> >> +                 (void *)(table_data->data + erst_start),
> >> +                 "ERST", table_data->len - erst_start,
> >> +                 1, oem_id, oem_table_id);
> >> +}
> >> +
> >> +/*******************************************************************/
> >> +/*******************************************************************/
> >> +
> >>   static const VMStateDescription erst_vmstate  = {
> >>       .name = "acpi-erst",
> >>       .version_id = 1,  
> >   
> 
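
For readers following the CHECK_BUSY_STATUS and EXECUTE_OPERATION entries in
the quoted build_erst(): the Value and Mask fields only make sense once you
know how OSPM is expected to execute READ_REGISTER_VALUE and
WRITE_REGISTER_VALUE. A minimal sketch, assuming the semantics given by the
pseudo code in the ACPI specification as I read it; struct mock_reg and both
function names are hypothetical, and a plain variable stands in for the
device register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for one device register. */
struct mock_reg {
    uint64_t value;
};

/*
 * READ_REGISTER_VALUE: read the register, shift by the bit offset, apply
 * the mask, and report whether the result equals the entry's Value field.
 */
static bool read_register_value(const struct mock_reg *r, unsigned bit_offset,
                                uint64_t mask, uint64_t value)
{
    uint64_t x = (r->value >> bit_offset) & mask;
    return x == value;
}

/*
 * WRITE_REGISTER_VALUE: write the entry's Value field, limited by the mask.
 * With the preserve flag set, bits outside the mask keep their old contents.
 */
static void write_register_value(struct mock_reg *r, unsigned bit_offset,
                                 uint64_t mask, uint64_t value, bool preserve)
{
    uint64_t x = (value & mask) << bit_offset;
    if (preserve) {
        x |= r->value & ~(mask << bit_offset);
    }
    r->value = x;
}
```

Under these assumptions, the CHECK_BUSY_STATUS entry with Value 0x01 and
MASK8 reports "busy" exactly when the low byte of the VALUE register reads
back as 1.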




* Re: [PATCH v5 07/10] ACPI ERST: trace support
  2021-07-21 16:14     ` Eric DeVolder
@ 2021-07-26 11:08       ` Igor Mammedov
  2021-07-26 20:03         ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 11:08 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 21 Jul 2021 11:14:37 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/20/21 8:15 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 15:07:18 -0400
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> Provide the definitions needed to support tracing in ACPI ERST.  
> > trace points should be introduced in patches that use them for the first time,
> > as it stands now series breaks bisection.  
> 
> Are you asking to move this patch before the patch that introduces erst.c (which
> uses these trace points)?
> Or are you asking to include this patch with the patch that introduces erst.c?

I'd squash it into the patch that introduces the corresponding functions.

> Also, you requested I separate the building of ERST table from the implementation
> of the erst device as separate patches. Doesn't that also break bisection?

It should be possible to compile the series patch by patch without breaking
'make check' after any patch.

The ACPI table is not part of the device; it is a separate part that describes
to OSPM how to work with the device. So if the code is split correctly between
patches, it shouldn't break bisection.

> 
> >   
> >>
> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >> ---
> >>   hw/acpi/trace-events | 14 ++++++++++++++
> >>   1 file changed, 14 insertions(+)
> >>
> >> diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events
> >> index dcc1438..a5c2755 100644
> >> --- a/hw/acpi/trace-events
> >> +++ b/hw/acpi/trace-events
> >> @@ -55,3 +55,17 @@ piix4_gpe_writeb(uint64_t addr, unsigned width, uint64_t val) "addr: 0x%" PRIx64
> >>   # tco.c
> >>   tco_timer_reload(int ticks, int msec) "ticks=%d (%d ms)"
> >>   tco_timer_expired(int timeouts_no, bool strap, bool no_reboot) "timeouts_no=%d no_reboot=%d/%d"
> >> +
> >> +# erst.c
> >> +acpi_erst_reg_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
> >> +acpi_erst_reg_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%04" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
> >> +acpi_erst_mem_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%06" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
> >> +acpi_erst_mem_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%06" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
> >> +acpi_erst_pci_bar_0(uint64_t addr) "BAR0: 0x%016" PRIx64
> >> +acpi_erst_pci_bar_1(uint64_t addr) "BAR1: 0x%016" PRIx64
> >> +acpi_erst_realizefn_in(void)
> >> +acpi_erst_realizefn_out(unsigned size) "total nvram size %u bytes"
> >> +acpi_erst_reset_in(unsigned record_count) "record_count %u"
> >> +acpi_erst_reset_out(unsigned record_count) "record_count %u"
> >> +acpi_erst_class_init_in(void)
> >> +acpi_erst_class_init_out(void)  
> >   
> 
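
To see what one of the quoted trace events would print, the format string can
be exercised with plain snprintf(). This is a sketch, assuming the tracing
backend renders events printf-style (as the log backend does);
format_reg_write() is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format string copied from the quoted acpi_erst_reg_write trace event. */
#define ERST_REG_WRITE_FMT \
    "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"

/*
 * Hypothetical helper: render the event text into buf.
 * Returns what snprintf() returns (the length it wanted to write).
 */
static int format_reg_write(char *buf, size_t len, uint64_t addr,
                            uint64_t val, unsigned size)
{
    return snprintf(buf, len, ERST_REG_WRITE_FMT, addr, val, size);
}
```

For example, a 4-byte write of 0xd to offset 0x8 renders as
"addr: 0x0008 <== 0x000000000000000d (size: 4)".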




* Re: [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines.
  2021-07-21 16:16     ` Eric DeVolder
@ 2021-07-26 11:30       ` Igor Mammedov
  2021-07-26 20:03         ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 11:30 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 21 Jul 2021 11:16:42 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/20/21 8:19 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 15:07:19 -0400
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> This change exposes ACPI ERST support for x86 guests.
> >>
> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>  
> > looks good to me, maybe move find_erst_dev() impl. here as well
> > if it's the patch it's first used.  
> 
> I've followed your previous suggestion of mimicking find_vmgenid_dev(), which
> declares it in its header file. I've done the same, find_erst_dev() is
> declared in its header file and used in these files.

It's fine to do it like this, but
it would be easier to follow if this were part of [6/10],
so that the function is introduced and used in the same patch.


> >   
> >> ---
> >>   hw/i386/acpi-build.c   | 9 +++++++++
> >>   hw/i386/acpi-microvm.c | 9 +++++++++
> >>   2 files changed, 18 insertions(+)
> >>
> >> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> >> index de98750..d2026cc 100644
> >> --- a/hw/i386/acpi-build.c
> >> +++ b/hw/i386/acpi-build.c
> >> @@ -43,6 +43,7 @@
> >>   #include "sysemu/tpm.h"
> >>   #include "hw/acpi/tpm.h"
> >>   #include "hw/acpi/vmgenid.h"
> >> +#include "hw/acpi/erst.h"
> >>   #include "hw/boards.h"
> >>   #include "sysemu/tpm_backend.h"
> >>   #include "hw/rtc/mc146818rtc_regs.h"
> >> @@ -2327,6 +2328,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
> >>       GArray *tables_blob = tables->table_data;
> >>       AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
> >>       Object *vmgenid_dev;
> >> +    Object *erst_dev;
> >>       char *oem_id;
> >>       char *oem_table_id;
> >>   
> >> @@ -2388,6 +2390,13 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
> >>                       ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
> >>                       x86ms->oem_table_id);
> >>   
> >> +    erst_dev = find_erst_dev();
> >> +    if (erst_dev) {
> >> +        acpi_add_table(table_offsets, tables_blob);
> >> +        build_erst(tables_blob, tables->linker, erst_dev,
> >> +                   x86ms->oem_id, x86ms->oem_table_id);
> >> +    }
> >> +
> >>       vmgenid_dev = find_vmgenid_dev();
> >>       if (vmgenid_dev) {
> >>           acpi_add_table(table_offsets, tables_blob);
> >> diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
> >> index ccd3303..0099b13 100644
> >> --- a/hw/i386/acpi-microvm.c
> >> +++ b/hw/i386/acpi-microvm.c
> >> @@ -30,6 +30,7 @@
> >>   #include "hw/acpi/bios-linker-loader.h"
> >>   #include "hw/acpi/generic_event_device.h"
> >>   #include "hw/acpi/utils.h"
> >> +#include "hw/acpi/erst.h"
> >>   #include "hw/boards.h"
> >>   #include "hw/i386/fw_cfg.h"
> >>   #include "hw/i386/microvm.h"
> >> @@ -160,6 +161,7 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
> >>       X86MachineState *x86ms = X86_MACHINE(mms);
> >>       GArray *table_offsets;
> >>       GArray *tables_blob = tables->table_data;
> >> +    Object *erst_dev;
> >>       unsigned dsdt, xsdt;
> >>       AcpiFadtData pmfadt = {
> >>           /* ACPI 5.0: 4.1 Hardware-Reduced ACPI */
> >> @@ -209,6 +211,13 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
> >>                       ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
> >>                       x86ms->oem_table_id);
> >>   
> >> +    erst_dev = find_erst_dev();
> >> +    if (erst_dev) {
> >> +        acpi_add_table(table_offsets, tables_blob);
> >> +        build_erst(tables_blob, tables->linker, erst_dev,
> >> +                   x86ms->oem_id, x86ms->oem_table_id);
> >> +    }
> >> +
> >>       xsdt = tables_blob->len;
> >>       build_xsdt(tables_blob, tables->linker, table_offsets, x86ms->oem_id,
> >>                  x86ms->oem_table_id);  
> >   
> 




* Re: [PATCH v5 09/10] ACPI ERST: qtest for ERST
  2021-07-21 16:18     ` Eric DeVolder
@ 2021-07-26 11:45       ` Igor Mammedov
  2021-07-26 20:06         ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-26 11:45 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 21 Jul 2021 11:18:44 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/20/21 8:38 AM, Igor Mammedov wrote:
> > On Wed, 30 Jun 2021 15:07:20 -0400
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> This change provides a qtest that locates and then does a simple
> >> interrogation of the ERST feature within the guest.
> >>
> >> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >> ---
> >>   tests/qtest/erst-test.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>   tests/qtest/meson.build |   2 +
> >>   2 files changed, 131 insertions(+)
> >>   create mode 100644 tests/qtest/erst-test.c
> >>
> >> diff --git a/tests/qtest/erst-test.c b/tests/qtest/erst-test.c
> >> new file mode 100644
> >> index 0000000..ce014c1
> >> --- /dev/null
> >> +++ b/tests/qtest/erst-test.c
> >> @@ -0,0 +1,129 @@
> >> +/*
> >> + * QTest testcase for ACPI ERST
> >> + *
> >> + * Copyright (c) 2021 Oracle
> >> + *
> >> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> >> + * See the COPYING file in the top-level directory.
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "qemu/bitmap.h"
> >> +#include "qemu/uuid.h"
> >> +#include "hw/acpi/acpi-defs.h"
> >> +#include "boot-sector.h"
> >> +#include "acpi-utils.h"
> >> +#include "libqos/libqtest.h"
> >> +#include "qapi/qmp/qdict.h"
> >> +
> >> +#define RSDP_ADDR_INVALID 0x100000 /* RSDP must be below this address */
> >> +
> >> +static uint64_t acpi_find_erst(QTestState *qts)
> >> +{
> >> +    uint32_t rsdp_offset;
> >> +    uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
> >> +    uint32_t rsdt_len, table_length;
> >> +    uint8_t *rsdt, *ent;
> >> +    uint64_t base = 0;
> >> +
> >> +    /* Wait for guest firmware to finish and start the payload. */
> >> +    boot_sector_test(qts);
> >> +
> >> +    /* Tables should be initialized now. */
> >> +    rsdp_offset = acpi_find_rsdp_address(qts);
> >> +
> >> +    g_assert_cmphex(rsdp_offset, <, RSDP_ADDR_INVALID);
> >> +
> >> +    acpi_fetch_rsdp_table(qts, rsdp_offset, rsdp_table);
> >> +    acpi_fetch_table(qts, &rsdt, &rsdt_len, &rsdp_table[16 /* RsdtAddress */],
> >> +                     4, "RSDT", true);
> >> +
> >> +    ACPI_FOREACH_RSDT_ENTRY(rsdt, rsdt_len, ent, 4 /* Entry size */) {
> >> +        uint8_t *table_aml;
> >> +        acpi_fetch_table(qts, &table_aml, &table_length, ent, 4, NULL, true);
> >> +        if (!memcmp(table_aml + 0 /* Header Signature */, "ERST", 4)) {
> >> +            /*
> >> +             * Picking up ERST base address from the Register Region
> >> +             * specified as part of the first Serialization Instruction
> >> +             * Action (which is a Begin Write Operation).
> >> +             */
> >> +            memcpy(&base, &table_aml[56], sizeof(base));
> >> +            g_free(table_aml);
> >> +            break;
> >> +        }
> >> +        g_free(table_aml);
> >> +    }
> >> +    g_free(rsdt);
> >> +    return base;
> >> +}  
> > I'd drop this; bios-tables-test should do the ACPI table check.
> > As for the PCI device itself, you can test it with the qtest accelerator,
> > which allows you to instantiate it and access registers directly
> > without the overhead of running an actual guest.  
> Yes, bios-tables-test checks the ACPI table, but not the functionality.
> This test has actually caught a problem/bug during development.

What I'm saying is not to drop the test, but rather to use the qtest
accelerator to test the PCI hardware registers. It doesn't run
guest code, but lets you program/access the PCI device directly.

So instead of searching/parsing ERST table, you'd program BARs
manually on behalf of BIOS, and then test that it works as expected.
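
The register-level check described here boils down to "write an action code,
read back the result". A minimal sketch of that round trip, assuming the
register offsets and action codes used by the quoted erst-test.c (ACTION at
offset 0, VALUE at offset 8, codes 0xD/0xE/0xF). struct mock_erst,
mock_writel(), mock_readq() and erst_query() are hypothetical; in a real
qtest the accesses would go through the PCI helpers after the BARs have been
programmed, and the device model lives in hw/acpi/erst.c.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets as used by the quoted test. */
enum {
    ERST_CSR_ACTION = 0x0,
    ERST_CSR_VALUE  = 0x8,
};

/* Action codes as used by the quoted test. */
enum {
    GET_ERROR_LOG_ADDRESS_RANGE            = 0xD,
    GET_ERROR_LOG_ADDRESS_LENGTH           = 0xE,
    GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES = 0xF,
};

#define MOCK_RECORD_SIZE (8 * 1024)

/* Hypothetical device model, only good enough to exercise the protocol. */
struct mock_erst {
    uint64_t exchange_addr; /* where the record exchange buffer is mapped */
    uint64_t value;         /* contents of the VALUE register */
};

/* A 32-bit write to ACTION latches the result into VALUE. */
static void mock_writel(struct mock_erst *d, uint64_t offset, uint32_t v)
{
    if (offset != ERST_CSR_ACTION) {
        return;
    }
    switch (v) {
    case GET_ERROR_LOG_ADDRESS_RANGE:
        d->value = d->exchange_addr;
        break;
    case GET_ERROR_LOG_ADDRESS_LENGTH:
        d->value = MOCK_RECORD_SIZE;
        break;
    case GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
        d->value = 0;
        break;
    default:
        break;
    }
}

/* A 64-bit read from VALUE returns the latched result. */
static uint64_t mock_readq(const struct mock_erst *d, uint64_t offset)
{
    return offset == ERST_CSR_VALUE ? d->value : 0;
}

/* One action round trip: write the action code, read back the result. */
static uint64_t erst_query(struct mock_erst *d, uint32_t action)
{
    mock_writel(d, ERST_CSR_ACTION, action);
    return mock_readq(d, ERST_CSR_VALUE);
}
```

The point of the sketch is that none of this needs the ERST table or a
booted guest; once the register base is known, the same three queries can be
asserted directly.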

As for ACPI tables, we don't have a complete testing infrastructure
in tree; bios-tables-test only tests that tables match the committed
reference blobs. Verifying that a reference blob is correct
is currently a manual process.

To test whole stack one could write an optional acceptance test,
which would run actual guest (if you wish to add that, you can look at
docs/devel/testing.rst "Acceptance tests ...").
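
On the table-layout side, the hard-coded table_aml[56] in the quoted
acpi_find_erst() is an example of what has to be checked by hand. A small
sketch of where 56 comes from, assuming the standard 36-byte ACPI table
header, the three 4-byte serialization header fields emitted by the quoted
build_erst(), and the entry and GAS layouts from the ACPI specification. The
enum names and the function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    ACPI_TABLE_HEADER_SIZE = 36, /* standard ACPI table header */
    ERST_SERIALIZATION_HDR = 12, /* header length, reserved, entry count */
    ENTRY_BYTES_BEFORE_GAS = 4,  /* action, instruction, flags, reserved */
    GAS_BYTES_BEFORE_ADDR  = 4,  /* space id, bit width, bit offset, access size */
};

/* Offset of the 64-bit register address inside the first instruction entry. */
static unsigned erst_first_register_address_offset(void)
{
    return ACPI_TABLE_HEADER_SIZE + ERST_SERIALIZATION_HDR +
           ENTRY_BYTES_BEFORE_GAS + GAS_BYTES_BEFORE_ADDR;
}
```

The first two terms add up to 48, which is also the value the quoted
build_erst() writes as serialization_header_length.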



> > As example you can look into megasas-test.c, ivshmem-test.c
> > or other PCI device tests.  
> But I'll look at these and see about migrating to this approach.
> 
> >   
> >> +static char disk[] = "tests/erst-test-disk-XXXXXX";
> >> +
> >> +#define ERST_CMD()                              \
> >> +    "-accel kvm -accel tcg "                    \
> >> +    "-object memory-backend-file," \
> >> +      "id=erstnvram,mem-path=tests/acpi-erst-XXXXXX,size=0x10000,share=on " \
> >> +    "-device acpi-erst,memdev=erstnvram " \
> >> +    "-drive id=hd0,if=none,file=%s,format=raw " \
> >> +    "-device ide-hd,drive=hd0 ", disk
> >> +
> >> +static void erst_get_error_log_address_range(void)
> >> +{
> >> +    QTestState *qts;
> >> +    uint64_t log_address_range = 0;
> >> +    unsigned log_address_length = 0;
> >> +    unsigned log_address_attr = 0;
> >> +
> >> +    qts = qtest_initf(ERST_CMD());
> >> +
> >> +    uint64_t base = acpi_find_erst(qts);
> >> +    g_assert(base != 0);
> >> +
> >> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE command */
> >> +    qtest_writel(qts, base + 0, 0xD);
> >> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE result */
> >> +    log_address_range = qtest_readq(qts, base + 8);
> >> +
> >> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_LENGTH command */
> >> +    qtest_writel(qts, base + 0, 0xE);
> >> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_LENGTH result */
> >> +    log_address_length = qtest_readq(qts, base + 8);
> >> +
> >> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES command */
> >> +    qtest_writel(qts, base + 0, 0xF);
> >> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES result */
> >> +    log_address_attr = qtest_readq(qts, base + 8);
> >> +
> >> +    /* Check log_address_range is not 0,~0 or base */
> >> +    g_assert(log_address_range != base);
> >> +    g_assert(log_address_range != 0);
> >> +    g_assert(log_address_range != ~0UL);
> >> +
> >> +    /* Check log_address_length is ERST_RECORD_SIZE */
> >> +    g_assert(log_address_length == (8 * 1024));
> >> +
> >> +    /* Check log_address_attr is 0 */
> >> +    g_assert(log_address_attr == 0);
> >> +
> >> +    qtest_quit(qts);
> >> +}
> >> +
> >> +int main(int argc, char **argv)
> >> +{
> >> +    int ret;
> >> +
> >> +    ret = boot_sector_init(disk);
> >> +    if (ret) {
> >> +        return ret;
> >> +    }
> >> +
> >> +    g_test_init(&argc, &argv, NULL);
> >> +
> >> +    qtest_add_func("/erst/get-error-log-address-range",
> >> +                   erst_get_error_log_address_range);
> >> +
> >> +    ret = g_test_run();
> >> +    boot_sector_cleanup(disk);
> >> +
> >> +    return ret;
> >> +}
> >> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> >> index 0c76738..deae443 100644
> >> --- a/tests/qtest/meson.build
> >> +++ b/tests/qtest/meson.build
> >> @@ -66,6 +66,7 @@ qtests_i386 = \
> >>     (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
> >>     (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) +   \
> >>     (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +                 \
> >> +  (config_all_devices.has_key('CONFIG_ACPI') ? ['erst-test'] : []) +                 \
> >>     qtests_pci +                                                                              \
> >>     ['fdc-test',
> >>      'ide-test',
> >> @@ -237,6 +238,7 @@ qtests = {
> >>     'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
> >>     'cdrom-test': files('boot-sector.c'),
> >>     'dbus-vmstate-test': files('migration-helpers.c') + dbus_vmstate1,
> >> +  'erst-test': files('erst-test.c', 'boot-sector.c', 'acpi-utils.c'),
> >>     'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
> >>     'migration-test': files('migration-helpers.c'),
> >>     'pxe-test': files('boot-sector.c'),  
> >   
> 




* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-07-26 10:06         ` Igor Mammedov
@ 2021-07-26 19:52           ` Eric DeVolder
  2021-07-27 11:45             ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-26 19:52 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Konrad Wilk, qemu-devel, pbonzini,
	Boris Ostrovsky, Eric Blake, rth



On 7/26/21 5:06 AM, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 10:42:33 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/19/21 10:02 AM, Igor Mammedov wrote:
>>> On Wed, 30 Jun 2021 19:26:39 +0000
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
>>>> rather than "non-NVRAM mode", which contradicts everything I stated prior.
>>>> Eric.
>>>> ________________________________
>>>> From: Eric DeVolder <eric.devolder@oracle.com>
>>>> Sent: Wednesday, June 30, 2021 2:07 PM
>>>> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
>>>> Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support
>>>>
>>>> Information on the implementation of the ACPI ERST support.
>>>>
>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>>> ---
>>>>    docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>    1 file changed, 152 insertions(+)
>>>>    create mode 100644 docs/specs/acpi_erst.txt
>>>>
>>>> diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
>>>> new file mode 100644
>>>> index 0000000..79f8eb9
>>>> --- /dev/null
>>>> +++ b/docs/specs/acpi_erst.txt
>>>> @@ -0,0 +1,152 @@
>>>> +ACPI ERST DEVICE
>>>> +================
>>>> +
>>>> +The ACPI ERST device is utilized to support the ACPI Error Record
>>>> +Serialization Table, ERST, functionality. The functionality is
>>>> +designed for storing error records in persistent storage for
>>>> +future reference/debugging.
>>>> +
>>>> +The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
>>>> +(APEI)", and specifically subsection "Error Serialization", outlines
>>>> +a method for storing error records into persistent storage.
>>>> +
>>>> +The format of error records is described in the UEFI specification[2],
>>>> +in Appendix N "Common Platform Error Record".
>>>> +
>>>> +While the ACPI specification allows for an NVRAM "mode" (see
>>>> +GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
>>>> +directly exposed for direct access by the OS/guest, this implements
>>>> +the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
>>>> +by most BIOS (since flash memory requires programming operations
>>>> +in order to update its contents). Furthermore, as of the time of this
>>>> +writing, Linux does not support the non-NVRAM "mode".
>>>
>>> shouldn't it be s/non-NVRAM/NVRAM/ ?
>>
>> Yes, it has been corrected.
>>
>>>    
>>>> +
>>>> +
>>>> +Background/Motivation
>>>> +---------------------
>>>> +Linux uses the persistent storage filesystem, pstore, to record
>>>> +information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
>>>> +independent of, and runs before, kdump.  In certain scenarios (ie.
>>>> +hosts/guests with root filesystems on NFS/iSCSI where networking
>>>> +software and/or hardware fails), pstore may contain the only
>>>> +information available for post-mortem debugging.
>>>
>>> well,
>>> it's not the only way, one can use existing pvpanic device to notify
>>> mgmt layer about crash and mgmt layer can take appropriate measures
>>> for post-mortem debugging, including dumping guest state,
>>> which is superior to anything pstore can offer as the VM still exists
>>> and mgmt layer can inspect the VM's crashed state directly or dump
>>> necessary parts of it.
>>>
>>> So ERST shouldn't be portrayed as the only way here but rather
>>> as limited alternative to pvpanic in regards to post-mortem debugging
>>> (it's the only way only on bare-metal).
>>>
>>> It would be better to describe here other use-cases you've mentioned
>>> in earlier reviews, that justify adding alternative to pvpanic.
>>
>> I'm not sure how I would change this. I do say "may contain", which means it
>> is not the only way. Pvpanic is a way to notify the mgmt layer/host, but
>> this is a method solely within the guest. Each serves a different purpose;
>> plugs a different hole.
>>
> 
> I'd suggest editing "pstore may contain the only information" to "pstore may contain information"
> 
Done

>> As noted in a separate message, my company has intentions of storing other
>> data in ERST beyond panics.
> perhaps add your use cases here as well.
>   
> 
>>>> +Two common storage backends for the pstore filesystem are ACPI ERST
>>>> +and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
>>>> +all guests. With QEMU supporting ACPI ERST, it becomes a viable
>>>> +pstore storage backend for virtual machines (as it is now for
>>>> +bare metal machines).
>>>> +
>>>    
>>>> +Enabling support for ACPI ERST facilitates a consistent method to
>>>> +capture kernel panic information in a wide range of guests: from
>>>> +resource-constrained microvms to very large guests, and in
>>>> +particular, in direct-boot environments (which would lack UEFI
>>>> +run-time services).
>>> this hunk probably not necessary
>>>    
>>>> +
>>>> +Note that Microsoft Windows also utilizes the ACPI ERST for certain
>>>> +crash information, if available.
>>> a pointer to a relevant source would be helpful here.
>>
>> I've included the reference, here for your benefit.
>> Windows Hardware Error Architecture, specifically Persistence Mechanism
>> https://docs.microsoft.com/en-us/windows-hardware/drivers/whea/error-record-persistence-mechanism
>>
>>>    
>>>> +Invocation
>>> s/^^/Configuration|Usage/
>>
>> Corrected
>>
>>>    
>>>> +----------
>>>> +
>>>> +To utilize ACPI ERST, a memory-backend-file object and acpi-erst
>>> s/utilize/use/
>>
>> Corrected
>>
>>>    
>>>> +device must be created, for example:
>>> s/must/can/
>>
>> Corrected
>>
>>>    
>>>> +
>>>> + qemu ...
>>>> + -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
>>>> +  size=0x10000,share=on
>>> I'd put ^^^ on the same line as -object and use '\' at the end of the
>>> line so the example could be easily copy-pasted
>>
>> Corrected
>>
>>>    
>>>> + -device acpi-erst,memdev=erstnvram
>>>> +
>>>> +For proper operation, the ACPI ERST device needs a memory-backend-file
>>>> +object with the following parameters:
>>>> +
>>>> + - id: The id of the memory-backend-file object is used to associate
>>>> +   this memory with the acpi-erst device.
>>>> + - size: The size of the ACPI ERST backing storage. This parameter is
>>>> +   required.
>>>> + - mem-path: The location of the ACPI ERST backing storage file. This
>>>> +   parameter is also required.
>>>> + - share: The share=on parameter is required so that updates to the
>>>> +   ERST back store are written to the file immediately as well. Without
>>>> +   it, updates to the backing file are unpredictable and may not
>>>> +   properly persist (eg. if qemu should crash).
>>>
>>> mmap manpage says:
>>>     MAP_SHARED
>>>                Updates to the mapping ... are carried through to the underlying file.
>>> it doesn't guarantee 'written to the file immediately', though.
>>> So I'd rephrase it to something like that:
>>>
>>> - share: The share=on parameter is required so that updates to the ERST back store
>>>            are written back to the file.
>>
>> Corrected
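To illustrate the MAP_SHARED point above: updates through a shared mapping are carried through to the file, but only an explicit msync() gives a point at which they are known to have been written back. The sketch below is a stand-alone host-side example, not code from the patch; the helper name shared_write_and_sync() and its parameters are hypothetical, and nothing here claims QEMU itself calls msync() for the ERST backend.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): size the file at `path`
 * to `size` bytes, write `len` bytes of `fill` at offset 0 through a
 * MAP_SHARED mapping, then msync() so the update is written back before
 * returning. Returns 0 on success, -1 on any error.
 */
static int shared_write_and_sync(const char *path, size_t size,
                                 uint8_t fill, size_t len)
{
    int rc = -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0) {
        return -1;
    }
    if (len <= size && ftruncate(fd, (off_t)size) == 0) {
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memset(p, fill, len);            /* update via the mapping */
            if (msync(p, size, MS_SYNC) == 0) {
                rc = 0;                      /* write-back has completed */
            }
            munmap(p, size);
        }
    }
    close(fd);
    return rc;
}
```

With a private mapping (MAP_PRIVATE, the share=off case) the same memset() would never reach the file at all, which is why the documentation wording settled on "written back to the file" rather than "immediately".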
>>
>>>    
>>>> +
>>>> +The ACPI ERST device is a simple PCI device, and requires this one
>>>> +parameter:
>>> s/^.*:/and ERST device:/
>>
>> Corrected
>>
>>>    
>>>> +
>>>> + - memdev: Is the object id of the memory-backend-file.
>>>> +
>>>> +
>>>> +PCI Interface
>>>> +-------------
>>>> +
>>>> +The ERST device is a PCI device with two BARs, one for accessing
>>>> +the programming registers, and the other for accessing the
>>>> +record exchange buffer.
>>>> +
>>>> +BAR0 contains the programming interface consisting of just two
>>>> +64-bit registers. The two registers are an ACTION (cmd) and a
>>>> +VALUE (data). All ERST actions/operations/side effects happen
>>> s/consisting of... All ERST/consisting of ACTION and VALUE 64-bit registers. All ERST/
>>
>> Corrected
>>
>>>    
>>>> +on the write to the ACTION, by design. Thus any data needed
>>> s/Thus//
>> Corrected
>>
>>>    
>>>> +by the action must be placed into VALUE prior to writing
>>>> +ACTION. Reading the VALUE simply returns the register contents,
>>>> +which can be updated by a previous ACTION.
>>>    
>>>> This behavior is
>>>> +encoded in the ACPI ERST table generated by QEMU.
>>> it's too vague, Either drop sentence or add a reference to relevant place in spec.
>> Corrected
>>
>>>
>>>    
>>>> +
>>>> +BAR1 contains the record exchange buffer, and the size of this
>>>> +buffer sets the maximum record size. This record exchange
>>>> +buffer size is 8KiB.
>>> s/^^^/
>>> BAR1 contains the 8KiB record exchange buffer, which is the implemented maximum record size limit.
>> Corrected
>>
>>>
>>>    
>>>> +Backing File
>>>
>>> s/^^^/Backing Storage Format/
>> Corrected
>>
>>>    
>>>> +------------
>>>
>>>    
>>>> +
>>>> +The ACPI ERST persistent storage is contained within a single backing
>>>> +file. The size and location of the backing file is specified upon
>>>> +QEMU startup of the ACPI ERST device.
>>>
>>> I'd drop above paragraph and describe file format here,
>>> ultimately used backend doesn't have to be a file. For
>>> example if user doesn't need it persist over QEMU restarts,
>>> ram backend could be used, guest will still be able to see
>>> its own crash log after the guest is rebooted, or it could be
>>> memfd backend passed to QEMU by mgmt layer.
>> Dropped
>>
>>>
>>>    
>>>> +Records are stored in the backing file in a simple fashion.
>>> s/backing file/backend storage/
>>> ditto for other occurrences
>> Corrected
>>
>>>    
>>>> +The backing file is essentially divided into fixed size
>>>> +"slots", ERST_RECORD_SIZE in length, with each "slot"
>>>> +storing a single record.
>>>    
>>>> No attempt at optimizing storage
>>>> +through compression, compaction, etc is attempted.
>>> s/^^^//
>>
>> I'd like to keep this statement. It is there because in a number of
>> hardware BIOS I tested, these kinds of features lead to bugs in the
>> ERST support.
> this doc is not about issues in other BIOSes, it's about the concrete
> QEMU impl. So the sentence starting with "No attempt" is not relevant here at all.
Dropped

>    
>>>> +NOTE that any change to this value will make any pre-
>>>> +existing backing files, not of the same ERST_RECORD_SIZE,
>>>> +unusable to the guest.
>>> when that can happen, can we detect it and error out?
>> I've dropped this statement. That value is hard coded, and not a
>> parameter, so there is no simple way to change it. This comment
>> does exist next to the ERST_RECORD_SIZE declaration in the code.
> 
> It's not a problem with current impl. but rather with possible
> future expansion.
> 
> If you'd add a header with record size at the start of storage,
> it wouldn't be issue as ERST would be able to get used record
> size for storage. That will help with avoiding compat issues
> later on.
I'll go ahead and add the header. I'll put the magic and record size in there,
but I do not intend to put any data that would be "cached" from the records
themselves. So no recordids, in particular, will be cached in this header.
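A minimal sketch of what such a header could look like, holding only a magic value and the record size and nothing cached from the records. The names ErstStoreHeader, ERST_STORE_MAGIC, erst_header_init() and erst_header_check(), the magic constant, and the field widths are all hypothetical; the fields are stored in host byte order here purely for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constant and layout, for illustration only. */
#define ERST_STORE_MAGIC 0x54535245U  /* arbitrary example value */

typedef struct {
    uint32_t magic;        /* marks an initialized store */
    uint32_t record_size;  /* slot size in use when the store was created */
} ErstStoreHeader;

/* Stamp a header at the start of the storage (caller ensures it fits). */
static void erst_header_init(uint8_t *storage, uint32_t record_size)
{
    ErstStoreHeader hdr = { ERST_STORE_MAGIC, record_size };

    memcpy(storage, &hdr, sizeof(hdr));
}

/*
 * Validate the header. Returns 0 and sets *record_size when the magic
 * matches and the recorded size is a non-zero power of two that fits in
 * the storage; returns -1 otherwise.
 */
static int erst_header_check(const uint8_t *storage, size_t storage_size,
                             uint32_t *record_size)
{
    ErstStoreHeader hdr;

    if (storage_size < sizeof(hdr)) {
        return -1;
    }
    memcpy(&hdr, storage, sizeof(hdr)); /* avoids unaligned access */
    if (hdr.magic != ERST_STORE_MAGIC) {
        return -1;
    }
    if (hdr.record_size == 0 ||
        (hdr.record_size & (hdr.record_size - 1)) != 0 ||
        hdr.record_size > storage_size) {
        return -1;
    }
    *record_size = hdr.record_size;
    return 0;
}
```

A device could then read the slot size from the store instead of a compile-time constant, and treat a failed check on a fresh store (for example one filled with 0xFF) as "needs initializing".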

I'm not even sure I want to record/cache the number of records because:
  - it has almost no use (undermined by the fact overall storage size is not exposed, imho)
  - we backed off requiring the share=on, so it is conceivable this header value could
    encounter data integrity issues, should a guest crash...
  - scans still happen (see next)

While having it (number of records cached in header) would avoid a startup scan
to compute it, the write operation requires a scan to determine if the incoming
recordid is present or not, in order to determine overwrite or allocate-and-write.

And typically the first operation that Linux does is effectively a scan to
populate the /sys/fs/pstore entries via the GET_RECORD_IDENTIFIER action.

And the typical size of the ERST storage [on hardware systems] is 64 to 128KiB;
so not much storage to examine, especially since only looking at 12 bytes of each
8KiB record.
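To put numbers on that: the 12 bytes are the 4-byte "CPER" signature plus the 8-byte record identifier, so a 64KiB store has 8 slots (96 bytes examined) and a 128KiB store has 16 slots (192 bytes examined). A rough sketch of such a counting scan follows; it uses the offsets from the patch (record id at byte 96, all-ones meaning "no record"), but the function name count_records() and the macro names are hypothetical and the storage is passed in as a plain buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RECORD_SIZE    8192U      /* ERST_RECORD_SIZE in the patch */
#define SKETCH_SIG_LEN        4U         /* "CPER" */
#define SKETCH_RECORD_ID_OFF  96U        /* UEFI_CPER_RECORD_ID_OFFSET */
#define SKETCH_INVALID_ID     UINT64_MAX /* ERST_INVALID_RECORD_ID */

/*
 * Count occupied slots by examining only the signature and the record
 * identifier of each slot (12 bytes per 8KiB slot). Hypothetical helper.
 */
static unsigned count_records(const uint8_t *storage, size_t storage_size)
{
    unsigned count = 0;
    size_t off;

    for (off = 0; off + SKETCH_RECORD_SIZE <= storage_size;
         off += SKETCH_RECORD_SIZE) {
        const uint8_t *slot = storage + off;
        uint64_t id;

        memcpy(&id, slot + SKETCH_RECORD_ID_OFF, sizeof(id));
        if (memcmp(slot, "CPER", SKETCH_SIG_LEN) == 0 &&
            id != SKETCH_INVALID_ID) {
            count++;
        }
    }
    return count;
}
```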

I'd still be in favor of putting an upper bound on the hostmem object, to avoid
your worst case fears...


> 
>>>> +Below is an example layout of the backing store file.
>>>> +The size of the file is a multiple of ERST_RECORD_SIZE,
>>>> +and contains N number of "slots" to store records. The
>>>> +example below shows two records (in CPER format) in the
>>>> +backing file, while the remaining slots are empty/
>>>> +available.
>>>> +
>>>> + Slot   Record
>>>> +        +--------------------------------------------+
>>>> +    0   | empty/available                            |
>>>> +        +--------------------------------------------+
>>>> +    1   | CPER                                       |
>>>> +        +--------------------------------------------+
>>>> +    2   | CPER                                       |
>>>> +        +--------------------------------------------+
>>>> +  ...   |                                            |
>>>> +        +--------------------------------------------+
>>>> +    N   | empty/available                            |
>>>> +        +--------------------------------------------+
>>>> +        <-------------- ERST_RECORD_SIZE ------------>
>>>
>>>    
>>>> +Not all slots need to be occupied, and they need not be
>>>> +occupied in a contiguous fashion. The ability to clear/erase
>>>> +specific records allows for the formation of unoccupied
>>>> +slots.
>>> I'd drop this as not necessary
>>
>> I'd like to keep this statement. Again, several BIOS on which I tested
>> ERST had bugs around non-contiguous record storage.
> 
> I'd drop this and alter sentence above ending with " in a simple fashion."
> to describe sparse usage of storage and then after that comes example diagram.
Done

> 
> I'd like this part to start with unambiguous concise description of
> format and to be finished with example diagram.
> It is the part that will be considered the only true source of
> how the file should be formatted, when it comes to fixing bugs or
> modifying original impl. later on. So it's important to have clear
> description without any unnecessary information here.
Done

> 
> 
>>>
>>>    
>>>> +
>>>> +
>>>> +References
>>>> +----------
>>>> +
>>>> +[1] "Advanced Configuration and Power Interface Specification",
>>>> +    version 4.0, June 2009.
>>>> +
>>>> +[2] "Unified Extensible Firmware Interface Specification",
>>>> +    version 2.1, October 2008.
>>>> +
>>>> --
>>>> 1.8.3.1
>>>>   
>>>    
>>
> 



* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-26 10:42       ` Igor Mammedov
@ 2021-07-26 20:01         ` Eric DeVolder
  2021-07-27 12:52           ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-26 20:01 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/26/21 5:42 AM, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 11:07:40 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/20/21 7:17 AM, Igor Mammedov wrote:
>>> On Wed, 30 Jun 2021 15:07:16 -0400
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> This change implements the support for the ACPI ERST feature.
>>> Drop this
>> Done
>>
>>>    
>>>>
>>>> This implements a PCI device for ACPI ERST. This implments the
>>> s/implments/implements/
>> Corrected
>>
>>>    
>>>> non-NVRAM "mode" of operation for ERST.
>>> add here why non-NVRAM "mode" is implemented.
>> How about:
>> This implements a PCI device for ACPI ERST. This implments the
>> non-NVRAM "mode" of operation for ERST as it is supported by
>> Linux and Windows and aligns with ERST support in most BIOS.
> 
> modulo typos looks good to me.
> pls consider using a spell checker to check commit messages/comments.
done

> 
>>
>>>
>>> Also even if this non-NVRAM implementation, there is still
>>> a lot of not necessary data copying (see below) so drop it
>>> or justify why it's there.
>>>      
>>>> This change also includes erst.c in the build of general ACPI support.
>>> Drop this as well
>> Done
>>
>>>
>>>    
>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>>> ---
>>>>    hw/acpi/erst.c      | 704 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    hw/acpi/meson.build |   1 +
>>>>    2 files changed, 705 insertions(+)
>>>>    create mode 100644 hw/acpi/erst.c
>>>>
>>>> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
>>>> new file mode 100644
>>>> index 0000000..6e9bd2e
>>>> --- /dev/null
>>>> +++ b/hw/acpi/erst.c
>>>> @@ -0,0 +1,704 @@
>>>> +/*
>>>> + * ACPI Error Record Serialization Table, ERST, Implementation
>>>> + *
>>>> + * Copyright (c) 2021 Oracle and/or its affiliates.
>>>> + *
>>>> + * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
>>>> + * ACPI Platform Error Interfaces : Error Serialization
>>>> + *
>>>> + * This library is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU Lesser General Public
>>>> + * License as published by the Free Software Foundation;
>>>> + * version 2 of the License.
>>>> + *
>>>> + * This library is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * Lesser General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU Lesser General Public
>>>> + * License along with this library; if not, see <http://www.gnu.org/licenses/>
> 
> consider adding SPDX-License-Identifier to all new files.
Done

>   
>>>> + */
>>>> +
>>>> +#include <sys/types.h>
>>>> +#include <sys/stat.h>
>>>> +#include <unistd.h>
>>>> +
>>>> +#include "qemu/osdep.h"
>>>> +#include "qapi/error.h"
>>>> +#include "hw/qdev-core.h"
>>>> +#include "exec/memory.h"
>>>> +#include "qom/object.h"
>>>> +#include "hw/pci/pci.h"
>>>> +#include "qom/object_interfaces.h"
>>>> +#include "qemu/error-report.h"
>>>> +#include "migration/vmstate.h"
>>>> +#include "hw/qdev-properties.h"
>>>> +#include "hw/acpi/acpi.h"
>>>> +#include "hw/acpi/acpi-defs.h"
>>>> +#include "hw/acpi/aml-build.h"
>>>> +#include "hw/acpi/bios-linker-loader.h"
>>>> +#include "exec/address-spaces.h"
>>>> +#include "sysemu/hostmem.h"
>>>> +#include "hw/acpi/erst.h"
>>>> +#include "trace.h"
>>>> +
>>>> +/* UEFI 2.1: Append N Common Platform Error Record */
>>>> +#define UEFI_CPER_RECORD_MIN_SIZE 128U
>>>> +#define UEFI_CPER_RECORD_LENGTH_OFFSET 20U
>>>> +#define UEFI_CPER_RECORD_ID_OFFSET 96U
>>>> +#define IS_UEFI_CPER_RECORD(ptr) \
>>>> +    (((ptr)[0] == 'C') && \
>>>> +     ((ptr)[1] == 'P') && \
>>>> +     ((ptr)[2] == 'E') && \
>>>> +     ((ptr)[3] == 'R'))
>>>> +#define THE_UEFI_CPER_RECORD_ID(ptr) \
>>>> +    (*(uint64_t *)(&(ptr)[UEFI_CPER_RECORD_ID_OFFSET]))
>>>> +
>>>> +/*
>>>> + * This implementation is an ACTION (cmd) and VALUE (data)
>>>> + * interface consisting of just two 64-bit registers.
>>>> + */
>>>> +#define ERST_REG_SIZE (2UL * sizeof(uint64_t))
>>>    
>>>> +#define ERST_CSR_ACTION (0UL << 3) /* action (cmd) */
>>>> +#define ERST_CSR_VALUE  (1UL << 3) /* argument/value (data) */
>>> what's the meaning of CSR?
>> CSR = control status register
>>> Looking at patch both should be called ERST_[ACTION|VALUE]_OFFSET
>> Done
>>> pls use explicit offset values instead of shifting bit.
>> Done
>>>
>>>    
>>>> +/*
>>>> + * ERST_RECORD_SIZE is the buffer size for exchanging ERST
>>>> + * record contents. Thus, it defines the maximum record size.
>>>> + * As this is mapped through a PCI BAR, it must be a power of
>>>> + * two, and should be at least PAGE_SIZE.
>>>> + * Records are stored in the backing file in a simple fashion.
>>>> + * The backing file is essentially divided into fixed size
>>>> + * "slots", ERST_RECORD_SIZE in length, with each "slot"
>>>> + * storing a single record. No attempt at optimizing storage
>>>> + * through compression, compaction, etc is attempted.
>>>> + * NOTE that any change to this value will make any pre-
>>>> + * existing backing files, not of the same ERST_RECORD_SIZE,
>>>> + * unusable to the guest.
>>>> + */
>>>> +/* 8KiB records, not too small, not too big */
>>>> +#define ERST_RECORD_SIZE (2UL * 4096)
>>>> +
>>>> +#define ERST_INVALID_RECORD_ID (~0UL)
>>>> +#define ERST_EXECUTE_OPERATION_MAGIC 0x9CUL
>>>> +
>>>> +/*
>>>> + * Object cast macro
>>>> + */
>>>> +#define ACPIERST(obj) \
>>>> +    OBJECT_CHECK(ERSTDeviceState, (obj), TYPE_ACPI_ERST)
>>>> +
>>>> +/*
>>>> + * Main ERST device state structure
>>>> + */
>>>> +typedef struct {
>>>> +    PCIDevice parent_obj;
>>>> +
>>>> +    HostMemoryBackend *hostmem;
>>>> +    MemoryRegion *hostmem_mr;
>>>> +
>>>> +    MemoryRegion iomem; /* programming registers */
>>>> +    MemoryRegion nvmem; /* exchange buffer */
>>>> +    uint32_t prop_size;
>>> s/^^^/storage_size/
>> Corrected
>>
>>>    
>>>> +    hwaddr bar0; /* programming registers */
>>>> +    hwaddr bar1; /* exchange buffer */
>>> why do you need to keep this addresses around?
>>> Suggest to drop these fields and use local variables or pci_get_bar_addr() at call site.
>> Corrected
>>
>>>    
>>>> +
>>>> +    uint8_t operation;
>>>> +    uint8_t busy_status;
>>>> +    uint8_t command_status;
>>>> +    uint32_t record_offset;
>>>> +    uint32_t record_count;
>>>> +    uint64_t reg_action;
>>>> +    uint64_t reg_value;
>>>> +    uint64_t record_identifier;
>>>> +
>>>> +    unsigned next_record_index;
>>>
>>>    
>>>> +    uint8_t record[ERST_RECORD_SIZE]; /* read/written directly by guest */
>>>> +    uint8_t tmp_record[ERST_RECORD_SIZE]; /* intermediate manipulation buffer */
>>> drop these see [**] below
>> Corrected
>>
>>>    
>>>> +
>>>> +} ERSTDeviceState;
>>>> +
>>>> +/*******************************************************************/
>>>> +/*******************************************************************/
>>>> +
>>>> +static unsigned copy_from_nvram_by_index(ERSTDeviceState *s, unsigned index)
>>>> +{
>>>> +    /* Read an nvram entry into tmp_record */
>>>> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
>>>> +    off_t offset = (index * ERST_RECORD_SIZE);
>>>> +
>>>> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
>>>> +        if (s->hostmem_mr) {
>>>> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
>>>> +            memcpy(s->tmp_record, p + offset, ERST_RECORD_SIZE);
>>>> +            rc = ACPI_ERST_STATUS_SUCCESS;
>>>> +        }
>>>> +    }
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static unsigned copy_to_nvram_by_index(ERSTDeviceState *s, unsigned index)
>>>> +{
>>>> +    /* Write entry in tmp_record into nvram, and backing file */
>>>> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
>>>> +    off_t offset = (index * ERST_RECORD_SIZE);
>>>> +
>>>> +    if ((offset + ERST_RECORD_SIZE) <= s->prop_size) {
>>>> +        if (s->hostmem_mr) {
>>>> +            uint8_t *p = (uint8_t *)memory_region_get_ram_ptr(s->hostmem_mr);
>>>> +            memcpy(p + offset, s->tmp_record, ERST_RECORD_SIZE);
>>>> +            rc = ACPI_ERST_STATUS_SUCCESS;
>>>> +        }
>>>> +    }
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static int lookup_erst_record_by_identifier(ERSTDeviceState *s,
>>>> +    uint64_t record_identifier, bool *record_found, bool alloc_for_write)
>>>> +{
>>>> +    int rc = -1;
>>>> +    int empty_index = -1;
>>>> +    int index = 0;
>>>> +    unsigned rrc;
>>>> +
>>>> +    *record_found = 0;
>>>> +
>>>> +    do {
>>>> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);
>>>
>>> you have direct access to backend memory so there is no need
>>> whatsoever to copy records from it to an intermediate buffer
>>> everywhere. Almost all operations with records can be done
>>> in place modulo EXECUTE_OPERATION action in BEGIN_[READ|WRITE]
>>> context, where record is moved between backend and guest buffer.
>>>
>>> So please eliminate all not necessary copying.
>>> (for fun, time operations and set backend size to some huge
>>> value to see how expensive this code is)
>>
>> I've corrected this. In our previous exchanges, I thought the reference
>> to copying was about trying to directly have guest write/read the appropriate
>> record in the backend storage. After reading this comment I realized that
>> yes I was doing a lot of copying (an artifact of the transition away from
>> direct file i/o to MemoryBackend). So good find, and I've eliminated the
>> intermediate copying.
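As a rough idea of what the lookup looks like once the intermediate buffer is gone, the sketch below inspects each slot directly in the backend memory. It follows the logic of lookup_erst_record_by_identifier() quoted above, but it is not the revised patch code: the name lookup_slot_in_place(), the macro names, and passing the storage as a plain pointer (standing in for what memory_region_get_ram_ptr() returns) are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOOKUP_RECORD_SIZE   8192U      /* ERST_RECORD_SIZE in the patch */
#define LOOKUP_RECORD_ID_OFF 96U        /* UEFI_CPER_RECORD_ID_OFFSET */
#define LOOKUP_INVALID_ID    UINT64_MAX /* ERST_INVALID_RECORD_ID */

/*
 * Find the slot holding `record_id` by reading the storage in place.
 * Returns the slot index, or, if not found and `alloc_for_write` is set,
 * the first empty slot; returns -1 when neither exists.
 */
static int lookup_slot_in_place(const uint8_t *storage, size_t storage_size,
                                uint64_t record_id, bool alloc_for_write,
                                bool *found)
{
    int empty_index = -1;
    int index = 0;
    size_t off;

    *found = false;
    for (off = 0; off + LOOKUP_RECORD_SIZE <= storage_size;
         off += LOOKUP_RECORD_SIZE, index++) {
        const uint8_t *slot = storage + off; /* no copy of the 8KiB slot */
        uint64_t id;

        memcpy(&id, slot + LOOKUP_RECORD_ID_OFF, sizeof(id));
        if (memcmp(slot, "CPER", 4) == 0 && id == record_id) {
            *found = true;
            return index;
        }
        if (id == LOOKUP_INVALID_ID && empty_index < 0) {
            empty_index = index; /* first slot available for write */
        }
    }
    return alloc_for_write ? empty_index : -1;
}
```

Only 12 bytes per slot are read, so the cost of a lookup scales with the number of slots rather than with the total storage size.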
>>
>>>    
>>>> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
>>>> +            uint64_t this_identifier;
>>>> +            this_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
>>>> +            if (IS_UEFI_CPER_RECORD(s->tmp_record) &&
>>>> +                (this_identifier == record_identifier)) {
>>>> +                rc = index;
>>>> +                *record_found = 1;
>>>> +                break;
>>>> +            }
>>>> +            if ((this_identifier == ERST_INVALID_RECORD_ID) &&
>>>> +                (empty_index < 0)) {
>>>> +                empty_index = index; /* first available for write */
>>>> +            }
>>>> +        }
>>>> +        ++index;
>>>> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
>>>> +
>>>> +    /* Record not found, allocate for writing */
>>>> +    if ((rc < 0) && alloc_for_write) {
>>>> +        rc = empty_index;
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static unsigned clear_erst_record(ERSTDeviceState *s)
>>>> +{
>>>> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
>>>> +    bool record_found;
>>>> +    int index;
>>>> +
>>>> +    index = lookup_erst_record_by_identifier(s,
>>>> +        s->record_identifier, &record_found, 0);
>>>> +    if (record_found) {
>>>> +        memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
>>>> +        rc = copy_to_nvram_by_index(s, (unsigned)index);
>>>> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
>>>> +            s->record_count -= 1;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static unsigned write_erst_record(ERSTDeviceState *s)
>>>> +{
>>>> +    unsigned rc = ACPI_ERST_STATUS_FAILED;
>>>> +
>>>> +    if (s->record_offset < (ERST_RECORD_SIZE - UEFI_CPER_RECORD_MIN_SIZE)) {
>>>> +        uint64_t record_identifier;
>>>> +        uint8_t *record = &s->record[s->record_offset];
>>>> +        bool record_found;
>>>> +        int index;
>>>> +
>>>> +        record_identifier = (s->record_identifier == ERST_INVALID_RECORD_ID)
>>>> +            ? THE_UEFI_CPER_RECORD_ID(record) : s->record_identifier;
>>>> +
>>>> +        index = lookup_erst_record_by_identifier(s,
>>>> +            record_identifier, &record_found, 1);
>>>> +        if (index < 0) {
>>>> +            rc = ACPI_ERST_STATUS_NOT_ENOUGH_SPACE;
>>>> +        } else {
>>>> +            if (0 != s->record_offset) {
>>>> +                memset(&s->tmp_record[ERST_RECORD_SIZE - s->record_offset],
>>>> +                    0xFF, s->record_offset);
>>>> +            }
>>>> +            memcpy(s->tmp_record, record, ERST_RECORD_SIZE - s->record_offset);
>>>> +            rc = copy_to_nvram_by_index(s, (unsigned)index);
>>>> +            if (rc == ACPI_ERST_STATUS_SUCCESS) {
>>>> +                if (!record_found) { /* not overwriting existing record */
>>>> +                    s->record_count += 1; /* writing new record */
>>>> +                }
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static unsigned next_erst_record(ERSTDeviceState *s,
>>>> +    uint64_t *record_identifier)
>>>> +{
>>>> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
>>>> +    unsigned index;
>>>> +    unsigned rrc;
>>>> +
>>>> +    *record_identifier = ERST_INVALID_RECORD_ID;
>>>> +
>>>> +    index = s->next_record_index;
>>>> +    do {
>>>> +        rrc = copy_from_nvram_by_index(s, (unsigned)index);
>>>> +        if (rrc == ACPI_ERST_STATUS_SUCCESS) {
>>>> +            if (IS_UEFI_CPER_RECORD(s->tmp_record)) {
>>>> +                s->next_record_index = index + 1; /* where to start next time */
>>>> +                *record_identifier = THE_UEFI_CPER_RECORD_ID(s->tmp_record);
>>>> +                rc = ACPI_ERST_STATUS_SUCCESS;
>>>> +                break;
>>>> +            }
>>>> +            ++index;
>>>> +        } else {
>>>> +            if (s->next_record_index == 0) {
>>>> +                rc = ACPI_ERST_STATUS_RECORD_STORE_EMPTY;
>>>> +            }
>>>> +            s->next_record_index = 0; /* at end, reset */
>>>> +        }
>>>> +    } while (rrc == ACPI_ERST_STATUS_SUCCESS);
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static unsigned read_erst_record(ERSTDeviceState *s)
>>>> +{
>>>> +    unsigned rc = ACPI_ERST_STATUS_RECORD_NOT_FOUND;
>>>> +    bool record_found;
>>>> +    int index;
>>>> +
>>>> +    index = lookup_erst_record_by_identifier(s,
>>>> +        s->record_identifier, &record_found, 0);
>>>> +    if (record_found) {
>>>> +        rc = copy_from_nvram_by_index(s, (unsigned)index);
>>>> +        if (rc == ACPI_ERST_STATUS_SUCCESS) {
>>>> +            if (s->record_offset < ERST_RECORD_SIZE) {
>>>> +                memcpy(&s->record[s->record_offset], s->tmp_record,
>>>> +                    ERST_RECORD_SIZE - s->record_offset);
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +static unsigned get_erst_record_count(ERSTDeviceState *s)
>>>> +{
>>>> +    /* Compute record_count */
>>>> +    unsigned index = 0;
>>>> +
>>>> +    s->record_count = 0;
>>>> +    while (copy_from_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
>>>> +        uint8_t *ptr = &s->tmp_record[0];
>>>> +        uint64_t record_identifier = THE_UEFI_CPER_RECORD_ID(ptr);
>>>> +        if (IS_UEFI_CPER_RECORD(ptr) &&
>>>> +            (ERST_INVALID_RECORD_ID != record_identifier)) {
>>>> +            s->record_count += 1;
>>>> +        }
>>>> +        ++index;
>>>> +    }
>>>> +    return s->record_count;
>>>> +}
>>>> +
>>>> +/*******************************************************************/
>>>> +
>>>> +static uint64_t erst_rd_reg64(hwaddr addr,
>>>> +    uint64_t reg, unsigned size)
>>>> +{
>>>> +    uint64_t rdval;
>>>> +    uint64_t mask;
>>>> +    unsigned shift;
>>>> +
>>>> +    if (size == sizeof(uint64_t)) {
>>>> +        /* 64b access */
>>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
>>>> +        shift = 0;
>>>> +    } else {
>>>> +        /* 32b access */
>>>> +        mask = 0x00000000FFFFFFFFUL;
>>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
>>>> +    }
>>>> +
>>>> +    rdval = reg;
>>>> +    rdval >>= shift;
>>>> +    rdval &= mask;
>>>> +
>>>> +    return rdval;
>>>> +}
>>>> +
>>>> +static uint64_t erst_wr_reg64(hwaddr addr,
>>>> +    uint64_t reg, uint64_t val, unsigned size)
>>>> +{
>>>> +    uint64_t wrval;
>>>> +    uint64_t mask;
>>>> +    unsigned shift;
>>>> +
>>>> +    if (size == sizeof(uint64_t)) {
>>>> +        /* 64b access */
>>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
>>>> +        shift = 0;
>>>> +    } else {
>>>> +        /* 32b access */
>>>> +        mask = 0x00000000FFFFFFFFUL;
>>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
>>>> +    }
>>>> +
>>>> +    val &= mask;
>>>> +    val <<= shift;
>>>> +    mask <<= shift;
>>>> +    wrval = reg;
>>>> +    wrval &= ~mask;
>>>> +    wrval |= val;
>>>> +
>>>> +    return wrval;
>>>> +}
>>> (I see in next patch it's us defining access width in the ACPI tables)
>>> so question is: do we have to have mixed register width access?
>>> can't all register accesses be 64-bit?
>>
>> Initially I attempted to just use 64-bit exclusively. The problem is that,
>> for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
>> register access into two. Here's an example of reading the exchange buffer
>> address, which is coded as a 64b access:
>>
>> acpi_erst_reg_write addr: 0x0000 <== 0x000000000000000d (size: 4)
>> acpi_erst_reg_read  addr: 0x0008 ==> 0x00000000c1010000 (size: 4)
>> acpi_erst_reg_read  addr: 0x000c ==> 0x0000000000000000 (size: 4)
>>
>> So I went ahead and made ACTION register accesses 32b, else there would
>> be two reads of 32 bits, of which the second is useless.
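For reference, the split access in the trace above can be modeled in a few lines: a 64-bit register that accepts either one 64-bit access or two 32-bit halves selected by bit 2 of the address. This is a simplified sketch of the same masking and shifting that erst_rd_reg64()/erst_wr_reg64() perform, not the patch code; reg64_read() and reg64_write() are hypothetical names, and the offsets are written as explicit values in the ERST_[ACTION|VALUE]_OFFSET style suggested in the review.

```c
#include <stdint.h>

#define ERST_ACTION_OFFSET 0x0 /* was (0UL << 3) in the patch */
#define ERST_VALUE_OFFSET  0x8 /* was (1UL << 3) in the patch */

/* Read all of `reg` (size 8) or the 32-bit half selected by addr bit 2. */
static uint64_t reg64_read(uint64_t reg, uint64_t addr, unsigned size)
{
    unsigned shift;

    if (size == sizeof(uint64_t)) {
        return reg;
    }
    shift = (addr & 0x4) ? 32 : 0;
    return (reg >> shift) & UINT64_C(0xFFFFFFFF);
}

/* Return `reg` with all of it (size 8) or one 32-bit half replaced. */
static uint64_t reg64_write(uint64_t reg, uint64_t addr, uint64_t val,
                            unsigned size)
{
    unsigned shift;
    uint64_t mask;

    if (size == sizeof(uint64_t)) {
        return val;
    }
    shift = (addr & 0x4) ? 32 : 0;
    mask = UINT64_C(0xFFFFFFFF) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}
```

Reading 0x00000000c1010000 as two 32-bit accesses at 0x8 and 0xc yields 0xc1010000 and 0x0, matching the two acpi_erst_reg_read lines in the trace.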
> 
> could you post here decompiled ERST table?
As it is long, I posted it to the end of this message.

> 
>>
>>>    
>>>> +static void erst_reg_write(void *opaque, hwaddr addr,
>>>> +    uint64_t val, unsigned size)
>>>> +{
>>>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>>>> +
>>>> +    /*
>>>> +     * NOTE: All actions/operations/side effects happen on the WRITE,
>>>> +     * by design. The READs simply return the reg_value contents.
>>>> +     */
>>>> +    trace_acpi_erst_reg_write(addr, val, size);
>>>> +
>>>> +    switch (addr) {
>>>> +    case ERST_CSR_VALUE + 0:
>>>> +    case ERST_CSR_VALUE + 4:
>>>> +        s->reg_value = erst_wr_reg64(addr, s->reg_value, val, size);
>>>> +        break;
>>>> +    case ERST_CSR_ACTION + 0:
>>>> +/*  case ERST_CSR_ACTION+4: as coded, not really a 64b register */
>>>> +        switch (val) {
>>>> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
>>>> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>>>> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>>>> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>>>> +        case ACPI_ERST_ACTION_END_OPERATION:
>>>> +            s->operation = val;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
>>>> +            s->record_offset = s->reg_value;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
>>>> +            if ((uint8_t)s->reg_value == ERST_EXECUTE_OPERATION_MAGIC) {
>>>> +                s->busy_status = 1;
>>>> +                switch (s->operation) {
>>>> +                case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
>>>> +                    s->command_status = write_erst_record(s);
>>>> +                    break;
>>>> +                case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>>>> +                    s->command_status = read_erst_record(s);
>>>> +                    break;
>>>> +                case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>>>> +                    s->command_status = clear_erst_record(s);
>>>> +                    break;
>>>> +                case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>>>> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
>>>> +                    break;
>>>> +                case ACPI_ERST_ACTION_END_OPERATION:
>>>> +                    s->command_status = ACPI_ERST_STATUS_SUCCESS;
>>>> +                    break;
>>>> +                default:
>>>> +                    s->command_status = ACPI_ERST_STATUS_FAILED;
>>>> +                    break;
>>>> +                }
>>>> +                s->record_identifier = ERST_INVALID_RECORD_ID;
>>>> +                s->busy_status = 0;
>>>> +            }
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
>>>> +            s->reg_value = s->busy_status;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
>>>> +            s->reg_value = s->command_status;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
>>>> +            s->command_status = next_erst_record(s, &s->reg_value);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
>>>> +            s->record_identifier = s->reg_value;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
>>>> +            s->reg_value = s->record_count;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
>>>> +            s->reg_value = s->bar1;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
>>>> +            s->reg_value = ERST_RECORD_SIZE;
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
>>>> +            s->reg_value = 0x0; /* intentional, not NVRAM mode */
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
>>>> +            /*
>>>> +             * 100UL is max, 10UL is nominal
>>> 100/10 of what, also add reference to spec/table it comes from
>>> and explain in comment why these values were chosen
>> I've changed the comment and style to be similar to build_amd_iommu().
>> These are merely sane non-zero max/min times.
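For clarity on the encoding: the value returned for GET_EXECUTE_OPERATION_TIMINGS packs the maximum time into bits [63:32] and the nominal time into bits [31:0], which is what ((100UL << 32) | (10UL << 0)) in the patch does. The helpers below are a small sketch with hypothetical names; the unit is left to the spec (my understanding is that ACPI gives these in microseconds, but treat that as an assumption).

```c
#include <stdint.h>

/* Pack max time into bits [63:32] and nominal time into bits [31:0]. */
static uint64_t erst_pack_timings(uint32_t max_time, uint32_t nominal_time)
{
    return ((uint64_t)max_time << 32) | (uint64_t)nominal_time;
}

/* Extract the maximum time from a packed timings value. */
static uint32_t erst_timings_max(uint64_t timings)
{
    return (uint32_t)(timings >> 32);
}

/* Extract the nominal time from a packed timings value. */
static uint32_t erst_timings_nominal(uint64_t timings)
{
    return (uint32_t)(timings & UINT64_C(0xFFFFFFFF));
}
```

Using a uint64_t cast rather than a UL suffix keeps the shift by 32 well defined on hosts where unsigned long is 32 bits wide.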
>>
>>>    
>>>> +             */
>>>> +            s->reg_value = ((100UL << 32) | (10UL << 0));
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_RESERVED:
>>> not necessary, it will be handled by 'default:'
>> Corrected
>>
>>>    
>>>> +        default:
>>>> +            /*
>>>> +             * Unknown action/command, NOP
>>>> +             */
>>>> +            break;
>>>> +        }
>>>> +        break;
>>>> +    default:
>>>> +        /*
>>>> +         * This should not happen, but if it does, NOP
>>>> +         */
>>>> +        break;
>>>> +    }
>>>> +}
>>>> +
>>>> +static uint64_t erst_reg_read(void *opaque, hwaddr addr,
>>>> +                                unsigned size)
>>>> +{
>>>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>>>> +    uint64_t val = 0;
>>>> +
>>>> +    switch (addr) {
>>>> +    case ERST_CSR_ACTION + 0:
>>>> +    case ERST_CSR_ACTION + 4:
>>>> +        val = erst_rd_reg64(addr, s->reg_action, size);
>>>> +        break;
>>>> +    case ERST_CSR_VALUE + 0:
>>>> +    case ERST_CSR_VALUE + 4:
>>>> +        val = erst_rd_reg64(addr, s->reg_value, size);
>>>> +        break;
>>>> +    default:
>>>> +        break;
>>>> +    }
>>>> +    trace_acpi_erst_reg_read(addr, val, size);
>>>> +    return val;
>>>> +}
>>>> +
>>>> +static const MemoryRegionOps erst_reg_ops = {
>>>> +    .read = erst_reg_read,
>>>> +    .write = erst_reg_write,
>>>> +    .endianness = DEVICE_NATIVE_ENDIAN,
>>>> +};
>>>> +
>>>> +static void erst_mem_write(void *opaque, hwaddr addr,
>>>> +    uint64_t val, unsigned size)
>>>> +{
>>>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>>>    
>>>> +    uint8_t *ptr = &s->record[addr - 0];
>>>> +    trace_acpi_erst_mem_write(addr, val, size);
>>>> +    switch (size) {
>>>> +    default:
>>>> +    case sizeof(uint8_t):
>>>> +        *(uint8_t *)ptr = (uint8_t)val;
>>>> +        break;
>>>> +    case sizeof(uint16_t):
>>>> +        *(uint16_t *)ptr = (uint16_t)val;
>>>> +        break;
>>>> +    case sizeof(uint32_t):
>>>> +        *(uint32_t *)ptr = (uint32_t)val;
>>>> +        break;
>>>> +    case sizeof(uint64_t):
>>>> +        *(uint64_t *)ptr = (uint64_t)val;
>>>> +        break;
>>>> +    }
>>>> +}
>>>> +
>>>> +static uint64_t erst_mem_read(void *opaque, hwaddr addr,
>>>> +                                unsigned size)
>>>> +{
>>>> +    ERSTDeviceState *s = (ERSTDeviceState *)opaque;
>>>> +    uint8_t *ptr = &s->record[addr - 0];
>>>> +    uint64_t val = 0;
>>>> +    switch (size) {
>>>> +    default:
>>>> +    case sizeof(uint8_t):
>>>> +        val = *(uint8_t *)ptr;
>>>> +        break;
>>>> +    case sizeof(uint16_t):
>>>> +        val = *(uint16_t *)ptr;
>>>> +        break;
>>>> +    case sizeof(uint32_t):
>>>> +        val = *(uint32_t *)ptr;
>>>> +        break;
>>>> +    case sizeof(uint64_t):
>>>> +        val = *(uint64_t *)ptr;
>>>> +        break;
>>>> +    }
>>>> +    trace_acpi_erst_mem_read(addr, val, size);
>>>> +    return val;
>>>> +}
>>>> +
>>>> +static const MemoryRegionOps erst_mem_ops = {
>>>> +    .read = erst_mem_read,
>>>> +    .write = erst_mem_write,
>>>> +    .endianness = DEVICE_NATIVE_ENDIAN,
>>>> +};
>>>> +
>>>> +/*******************************************************************/
>>>> +/*******************************************************************/
>>>> +
>>>> +static const VMStateDescription erst_vmstate  = {
>>>> +    .name = "acpi-erst",
>>>> +    .version_id = 1,
>>>> +    .minimum_version_id = 1,
>>>> +    .fields = (VMStateField[]) {
>>>> +        VMSTATE_UINT8(operation, ERSTDeviceState),
>>>> +        VMSTATE_UINT8(busy_status, ERSTDeviceState),
>>>> +        VMSTATE_UINT8(command_status, ERSTDeviceState),
>>>> +        VMSTATE_UINT32(record_offset, ERSTDeviceState),
>>>> +        VMSTATE_UINT32(record_count, ERSTDeviceState),
>>>> +        VMSTATE_UINT64(reg_action, ERSTDeviceState),
>>>> +        VMSTATE_UINT64(reg_value, ERSTDeviceState),
>>>> +        VMSTATE_UINT64(record_identifier, ERSTDeviceState),
>>>> +        VMSTATE_UINT8_ARRAY(record, ERSTDeviceState, ERST_RECORD_SIZE),
>>>> +        VMSTATE_UINT8_ARRAY(tmp_record, ERSTDeviceState, ERST_RECORD_SIZE),
>>>> +        VMSTATE_END_OF_LIST()
>>>> +    }
>>>> +};
>>>> +
>>>> +static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
>>>> +{
>>>> +    ERSTDeviceState *s = ACPIERST(pci_dev);
>>>> +    unsigned index = 0;
>>>> +    bool share;
>>>> +
>>>> +    trace_acpi_erst_realizefn_in();
>>>> +
>>>> +    if (!s->hostmem) {
>>>> +        error_setg(errp, "'" ACPI_ERST_MEMDEV_PROP "' property is not set");
>>>> +        return;
>>>> +    } else if (host_memory_backend_is_mapped(s->hostmem)) {
>>>> +        error_setg(errp, "can't use already busy memdev: %s",
>>>> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    share = object_property_get_bool(OBJECT(s->hostmem), "share", &error_fatal);
>>> s/&error_fatal/errp/
>> Corrected
>>
>>>    
>>>> +    if (!share) {
>>>> +        error_setg(errp, "ACPI ERST requires hostmem property share=on: %s",
>>>> +                   object_get_canonical_path_component(OBJECT(s->hostmem)));
>>>> +    }
>>> This limits the usable backends to file|memfd only, so
>>> I wonder if we really need this limitation. What if the user doesn't
>>> care about preserving the contents across QEMU restarts? (i.e. a use case
>>> where the storage is used as a means to troubleshoot a guest crash,
>>> and QEMU is not restarted in between)
>>>
>>> Maybe instead of enforcing we should just document that if user
>>> wishes to preserve content they should use file|memfd backend with
>>> share=on option.
>>
>> I've removed this check. The intended usage is documented instead.
>>
>>>    
>>>> +
>>>> +    s->hostmem_mr = host_memory_backend_get_memory(s->hostmem);
>>>> +
>>>> +    /* HostMemoryBackend size will be multiple of PAGE_SIZE */
>>>> +    s->prop_size = object_property_get_int(OBJECT(s->hostmem), "size", &error_fatal);
>>> s/&error_fatal/errp/
>> Corrected
>>
>>>    
>>>> +
>>>> +    /* Convert prop_size to integer multiple of ERST_RECORD_SIZE */
>>>> +    s->prop_size -= (s->prop_size % ERST_RECORD_SIZE);
>>>
>>> pls, no fixups on behalf of the user; if the size is not what it should be,
>>> error out with a suggestion on how to fix it.
>> Removed
>>
>>>    
>>>> +
>>>> +    /*
>>>> +     * MemoryBackend initializes contents to zero, but we actually
>>>> +     * want contents initialized to 0xFF, ERST_INVALID_RECORD_ID.
>>>> +     */
>>>> +    if (copy_from_nvram_by_index(s, 0) == ACPI_ERST_STATUS_SUCCESS) {
>>>> +        if (s->tmp_record[0] == 0x00) {
>>>> +            memset(s->tmp_record, 0xFF, ERST_RECORD_SIZE);
>>> this doesn't scale,
>>> (set backend size to more than host physical RAM, put it on slow storage and have fun.)
>> Of course, which is why I think we need to have an upper bound (my earlier
>> submissions had one).
>>
>>>
>>> Is it possible to use 0 as invalid record id or change storage format
>>> so you would not have to rewrite whole file at startup (maybe some sort
>> no
>>
>>> of metadata header/records book-keeping table before actual records.
>>> And initialize file only if header is invalid.)
>> I have to scan the backend storage anyway in order to initialize the record
>> count, so I've combined that scan with a test to see if the backend storage
>> needs to be initialized.
> 
> 
> if you add a records table at the start of backend,
> then you won't need to read/write whole file.
> It would be enough to read/initialize header only and access
> actual records only when necessary. Header could look something like:
> 
> |erst magic string|
> |slot size|
> |slots nr|
> +++++++++++++++++ slots header ++++++++++++
> |is_empty, record offset from file start, maybe something else that would speed up access|
> ...
> |last record descriptor|
> ++++++++++ actual records +++++++++++++
> |slot 0|
> ...
> |slot n|
> 

See my concerns in the response to [02/10].
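
For reference, the header layout proposed in the quoted text could be sketched roughly as below. Everything here is an assumption made for illustration: the magic string, the field widths, the type names and the host-endian on-disk format are not taken from the patch or from any agreed spec.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte magic identifying an initialized backend. */
#define ERST_STORE_MAGIC "ERSTFMT1"

/* Fixed header at the start of the backend (16 bytes). */
typedef struct {
    uint8_t  magic[8];
    uint32_t slot_size;   /* size of one record slot, in bytes */
    uint32_t slot_count;  /* number of slots that follow the table */
} ErstStoreHeader;

/* One descriptor per slot, stored right after the header (16 bytes). */
typedef struct {
    uint64_t record_offset; /* offset of the slot from file start */
    uint8_t  is_empty;      /* non-zero when the slot holds no record */
    uint8_t  reserved[7];
} ErstSlotDesc;

static_assert(sizeof(ErstStoreHeader) == 16, "unexpected header padding");
static_assert(sizeof(ErstSlotDesc) == 16, "unexpected descriptor padding");

/* Only the header needs to be read to decide whether to initialize. */
static inline int erst_header_valid(const ErstStoreHeader *h)
{
    return memcmp(h->magic, ERST_STORE_MAGIC, sizeof(h->magic)) == 0 &&
           h->slot_size != 0 && h->slot_count != 0;
}

/* Byte offset of slot 'index': header, then descriptor table, then slots. */
static inline uint64_t erst_slot_offset(const ErstStoreHeader *h,
                                        uint32_t index)
{
    assert(index < h->slot_count);
    return sizeof(ErstStoreHeader) +
           (uint64_t)h->slot_count * sizeof(ErstSlotDesc) +
           (uint64_t)index * h->slot_size;
}
```

With such a layout, startup would touch only the header and the descriptor table rather than every record slot.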


>>>> +            index = 0;
>>>> +            while (copy_to_nvram_by_index(s, index) == ACPI_ERST_STATUS_SUCCESS) {
>>>> +                ++index;
>>>> +            }
>>> also back&forth copying here is not really necessary.
>> corrected
>>
>>>    
>>>> +        }
>>>> +    }
>>>> +
>>>> +    /* Initialize record_count */
>>>> +    get_erst_record_count(s);
>>> why not put it into reset?
>> It is initialized once, then subsequent write/clear operations update
>> the counter as needed.
> 
> ok
> 
>>>    
>>>> +
>>>> +    /* BAR 0: Programming registers */
>>>> +    memory_region_init_io(&s->iomem, OBJECT(pci_dev), &erst_reg_ops, s,
>>>> +                          TYPE_ACPI_ERST, ERST_REG_SIZE);
>>>> +    pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->iomem);
>>>> +
>>>    
>>>> +    /* BAR 1: Exchange buffer memory */
>>>> +    memory_region_init_io(&s->nvmem, OBJECT(pci_dev), &erst_mem_ops, s,
>>>> +                          TYPE_ACPI_ERST, ERST_RECORD_SIZE);
>>>> +    pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->nvmem);
>>>
>>> **)
>>> instead of using mmio for buffer where each write causes
>>> guest exit to QEMU, map memory region directly to guest.
>>> see ivshmem_bar2, the only difference with ivshmem, you'd
>>> create memory region manually (for example you can use
>>> memory_region_init_resizeable_ram)
>>>
>>> this way you can speedup access and drop erst_mem_ops and
>>> [tmp_]record intermediate buffers.
>>>
>>> Instead of [tmp_]record you can copy record content
>>> directly between buffer and backend memory regions.
>>
>> I've changed the exchange buffer into a MemoryBackend object and
>> eliminated the erst_mem_ops.
>>
>>>    
>>>> +    /*
>>>> +     * The vmstate_register_ram_global() puts the memory in
>>>> +     * migration stream, where it is written back to the memory
>>>> +     * upon reaching the destination, which causes the backing
>>>> +     * file to be updated (with share=on).
>>>> +     */
>>>> +    vmstate_register_ram_global(s->hostmem_mr);
>>>> +
>>>> +    trace_acpi_erst_realizefn_out(s->prop_size);
>>>> +}
>>>> +
>>>> +static void erst_reset(DeviceState *dev)
>>>> +{
>>>> +    ERSTDeviceState *s = ACPIERST(dev);
>>>> +
>>>> +    trace_acpi_erst_reset_in(s->record_count);
>>>> +    s->operation = 0;
>>>> +    s->busy_status = 0;
>>>> +    s->command_status = ACPI_ERST_STATUS_SUCCESS;
>>>    
>>>> +    /* indicate empty/no-more until further notice */
>>> pls rephrase, I'm not sure what it's trying to say
>> Eliminated; I don't know what I was trying to say there either
>>>    
>>>> +    s->record_identifier = ERST_INVALID_RECORD_ID;
>>>> +    s->record_offset = 0;
>>>> +    s->next_record_index = 0;
>>>    
>>>> +    /* NOTE: record_count and nvram are initialized elsewhere */
>>>> +    trace_acpi_erst_reset_out(s->record_count);
>>>> +}
>>>> +
>>>> +static Property erst_properties[] = {
>>>> +    DEFINE_PROP_LINK(ACPI_ERST_MEMDEV_PROP, ERSTDeviceState, hostmem,
>>>> +                     TYPE_MEMORY_BACKEND, HostMemoryBackend *),
>>>> +    DEFINE_PROP_END_OF_LIST(),
>>>> +};
>>>> +
>>>> +static void erst_class_init(ObjectClass *klass, void *data)
>>>> +{
>>>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>>>> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>>> +
>>>> +    trace_acpi_erst_class_init_in();
>>>> +    k->realize = erst_realizefn;
>>>> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
>>>> +    k->device_id = PCI_DEVICE_ID_REDHAT_ACPI_ERST;
>>>> +    k->revision = 0x00;
>>>> +    k->class_id = PCI_CLASS_OTHERS;
>>>> +    dc->reset = erst_reset;
>>>> +    dc->vmsd = &erst_vmstate;
>>>> +    dc->user_creatable = true;
>>>> +    device_class_set_props(dc, erst_properties);
>>>> +    dc->desc = "ACPI Error Record Serialization Table (ERST) device";
>>>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>>>> +    trace_acpi_erst_class_init_out();
>>>> +}
>>>> +
>>>> +static const TypeInfo erst_type_info = {
>>>> +    .name          = TYPE_ACPI_ERST,
>>>> +    .parent        = TYPE_PCI_DEVICE,
>>>> +    .class_init    = erst_class_init,
>>>> +    .instance_size = sizeof(ERSTDeviceState),
>>>> +    .interfaces = (InterfaceInfo[]) {
>>>> +        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>>> what is this for here?
>>>    
>>>> +        { }
>>>> +    }
>>>> +};
>>>> +
>>>> +static void erst_register_types(void)
>>>> +{
>>>> +    type_register_static(&erst_type_info);
>>>> +}
>>>> +
>>>> +type_init(erst_register_types)
>>>> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
>>>> index dd69577..262a8ee 100644
>>>> --- a/hw/acpi/meson.build
>>>> +++ b/hw/acpi/meson.build
>>>> @@ -4,6 +4,7 @@ acpi_ss.add(files(
>>>>      'aml-build.c',
>>>>      'bios-linker-loader.c',
>>>>      'utils.c',
>>>> +  'erst.c',
>>>>    ))
>>>>    acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c'))
>>>>    acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))
>>>    
>>
> 


Obtained via a running guest with:
iasl -d -vt /sys/firmware/acpi/tables/ERST

/*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20180105 (64-bit version)
  * Copyright (c) 2000 - 2018 Intel Corporation
  *
  * Disassembly of ERST.blob, Mon Jul 26 14:31:21 2021
  *
  * ACPI Data Table [ERST]
  *
  * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
  */

[000h 0000   4]                    Signature : "ERST"    [Error Record Serialization Table]
[004h 0004   4]                 Table Length : 00000390
[008h 0008   1]                     Revision : 01
[009h 0009   1]                     Checksum : C8
[00Ah 0010   6]                       Oem ID : "BOCHS "
[010h 0016   8]                 Oem Table ID : "BXPC    "
[018h 0024   4]                 Oem Revision : 00000001
[01Ch 0028   4]              Asl Compiler ID : "BXPC"
[020h 0032   4]        Asl Compiler Revision : 00000001

[024h 0036   4]  Serialization Header Length : 00000030
[028h 0040   4]                     Reserved : 00000000
[02Ch 0044   4]      Instruction Entry Count : 0000001B
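
As a cross-check of the header fields above, the table length can be derived from the layout visible in this dump: a 36-byte ACPI header, 12 bytes of serialization header fields, then one 32-byte entry per instruction. The struct and function names below are hypothetical and only mirror the field offsets shown by iasl.

```c
#include <assert.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Generic Address Structure as laid out in the dump (12 bytes). */
typedef struct {
    uint8_t  space_id;
    uint8_t  bit_width;
    uint8_t  bit_offset;
    uint8_t  access_width;
    uint64_t address;
} ErstGas;

/* One serialization instruction entry (32 bytes). */
typedef struct {
    uint8_t  action;
    uint8_t  instruction;
    uint8_t  flags;
    uint8_t  reserved;
    ErstGas  register_region;
    uint64_t value;
    uint64_t mask;
} ErstInstructionEntry;
#pragma pack(pop)

static_assert(sizeof(ErstGas) == 12, "GAS must be 12 bytes");
static_assert(sizeof(ErstInstructionEntry) == 32, "entry must be 32 bytes");

enum {
    ERST_ACPI_HEADER_LEN   = 36, /* offsets 000h..023h in the dump */
    ERST_SERIAL_HEADER_LEN = 12, /* offsets 024h..02Fh in the dump */
};

/* Total table length for a given number of instruction entries. */
static inline uint32_t erst_table_length(uint32_t entry_count)
{
    return ERST_ACPI_HEADER_LEN + ERST_SERIAL_HEADER_LEN +
           entry_count * (uint32_t)sizeof(ErstInstructionEntry);
}
```

With the Instruction Entry Count of 0x1B (27) reported above, this gives 48 + 27 * 32 = 912 bytes, which matches the Table Length of 0x390.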

[030h 0048   1]                       Action : 00 [Begin Write Operation]
[031h 0049   1]                  Instruction : 03 [Write Register Value]
[032h 0050   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[033h 0051   1]                     Reserved : 00

[034h 0052  12]              Register Region : [Generic Address Structure]
[034h 0052   1]                     Space ID : 00 [SystemMemory]
[035h 0053   1]                    Bit Width : 20
[036h 0054   1]                   Bit Offset : 00
[037h 0055   1]         Encoded Access Width : 03 [DWord Access:32]
[038h 0056   8]                      Address : 00000000C1063000

[040h 0064   8]                        Value : 0000000000000000
[048h 0072   8]                         Mask : 00000000000000FF

[050h 0080   1]                       Action : 01 [Begin Read Operation]
[051h 0081   1]                  Instruction : 03 [Write Register Value]
[052h 0082   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[053h 0083   1]                     Reserved : 00

[054h 0084  12]              Register Region : [Generic Address Structure]
[054h 0084   1]                     Space ID : 00 [SystemMemory]
[055h 0085   1]                    Bit Width : 20
[056h 0086   1]                   Bit Offset : 00
[057h 0087   1]         Encoded Access Width : 03 [DWord Access:32]
[058h 0088   8]                      Address : 00000000C1063000

[060h 0096   8]                        Value : 0000000000000001
[068h 0104   8]                         Mask : 00000000000000FF

[070h 0112   1]                       Action : 02 [Begin Clear Operation]
[071h 0113   1]                  Instruction : 03 [Write Register Value]
[072h 0114   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[073h 0115   1]                     Reserved : 00

[074h 0116  12]              Register Region : [Generic Address Structure]
[074h 0116   1]                     Space ID : 00 [SystemMemory]
[075h 0117   1]                    Bit Width : 20
[076h 0118   1]                   Bit Offset : 00
[077h 0119   1]         Encoded Access Width : 03 [DWord Access:32]
[078h 0120   8]                      Address : 00000000C1063000

[080h 0128   8]                        Value : 0000000000000002
[088h 0136   8]                         Mask : 00000000000000FF

[090h 0144   1]                       Action : 03 [End Operation]
[091h 0145   1]                  Instruction : 03 [Write Register Value]
[092h 0146   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[093h 0147   1]                     Reserved : 00

[094h 0148  12]              Register Region : [Generic Address Structure]
[094h 0148   1]                     Space ID : 00 [SystemMemory]
[095h 0149   1]                    Bit Width : 20
[096h 0150   1]                   Bit Offset : 00
[097h 0151   1]         Encoded Access Width : 03 [DWord Access:32]
[098h 0152   8]                      Address : 00000000C1063000

[0A0h 0160   8]                        Value : 0000000000000003
[0A8h 0168   8]                         Mask : 00000000000000FF

[0B0h 0176   1]                       Action : 04 [Set Record Offset]
[0B1h 0177   1]                  Instruction : 02 [Write Register]
[0B2h 0178   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[0B3h 0179   1]                     Reserved : 00

[0B4h 0180  12]              Register Region : [Generic Address Structure]
[0B4h 0180   1]                     Space ID : 00 [SystemMemory]
[0B5h 0181   1]                    Bit Width : 20
[0B6h 0182   1]                   Bit Offset : 00
[0B7h 0183   1]         Encoded Access Width : 03 [DWord Access:32]
[0B8h 0184   8]                      Address : 00000000C1063008

[0C0h 0192   8]                        Value : 0000000000000000
[0C8h 0200   8]                         Mask : 00000000FFFFFFFF

[0D0h 0208   1]                       Action : 04 [Set Record Offset]
[0D1h 0209   1]                  Instruction : 03 [Write Register Value]
[0D2h 0210   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[0D3h 0211   1]                     Reserved : 00

[0D4h 0212  12]              Register Region : [Generic Address Structure]
[0D4h 0212   1]                     Space ID : 00 [SystemMemory]
[0D5h 0213   1]                    Bit Width : 20
[0D6h 0214   1]                   Bit Offset : 00
[0D7h 0215   1]         Encoded Access Width : 03 [DWord Access:32]
[0D8h 0216   8]                      Address : 00000000C1063000

[0E0h 0224   8]                        Value : 0000000000000004
[0E8h 0232   8]                         Mask : 00000000000000FF

[0F0h 0240   1]                       Action : 05 [Execute Operation]
[0F1h 0241   1]                  Instruction : 03 [Write Register Value]
[0F2h 0242   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[0F3h 0243   1]                     Reserved : 00

[0F4h 0244  12]              Register Region : [Generic Address Structure]
[0F4h 0244   1]                     Space ID : 00 [SystemMemory]
[0F5h 0245   1]                    Bit Width : 20
[0F6h 0246   1]                   Bit Offset : 00
[0F7h 0247   1]         Encoded Access Width : 03 [DWord Access:32]
[0F8h 0248   8]                      Address : 00000000C1063008

[100h 0256   8]                        Value : 000000000000009C
[108h 0264   8]                         Mask : 00000000000000FF

[110h 0272   1]                       Action : 05 [Execute Operation]
[111h 0273   1]                  Instruction : 03 [Write Register Value]
[112h 0274   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[113h 0275   1]                     Reserved : 00

[114h 0276  12]              Register Region : [Generic Address Structure]
[114h 0276   1]                     Space ID : 00 [SystemMemory]
[115h 0277   1]                    Bit Width : 20
[116h 0278   1]                   Bit Offset : 00
[117h 0279   1]         Encoded Access Width : 03 [DWord Access:32]
[118h 0280   8]                      Address : 00000000C1063000

[120h 0288   8]                        Value : 0000000000000005
[128h 0296   8]                         Mask : 00000000000000FF

[130h 0304   1]                       Action : 06 [Check Busy Status]
[131h 0305   1]                  Instruction : 03 [Write Register Value]
[132h 0306   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[133h 0307   1]                     Reserved : 00

[134h 0308  12]              Register Region : [Generic Address Structure]
[134h 0308   1]                     Space ID : 00 [SystemMemory]
[135h 0309   1]                    Bit Width : 20
[136h 0310   1]                   Bit Offset : 00
[137h 0311   1]         Encoded Access Width : 03 [DWord Access:32]
[138h 0312   8]                      Address : 00000000C1063000

[140h 0320   8]                        Value : 0000000000000006
[148h 0328   8]                         Mask : 00000000000000FF

[150h 0336   1]                       Action : 06 [Check Busy Status]
[151h 0337   1]                  Instruction : 01 [Read Register Value]
[152h 0338   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[153h 0339   1]                     Reserved : 00

[154h 0340  12]              Register Region : [Generic Address Structure]
[154h 0340   1]                     Space ID : 00 [SystemMemory]
[155h 0341   1]                    Bit Width : 20
[156h 0342   1]                   Bit Offset : 00
[157h 0343   1]         Encoded Access Width : 03 [DWord Access:32]
[158h 0344   8]                      Address : 00000000C1063008

[160h 0352   8]                        Value : 0000000000000001
[168h 0360   8]                         Mask : 00000000000000FF

[170h 0368   1]                       Action : 07 [Get Command Status]
[171h 0369   1]                  Instruction : 03 [Write Register Value]
[172h 0370   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[173h 0371   1]                     Reserved : 00

[174h 0372  12]              Register Region : [Generic Address Structure]
[174h 0372   1]                     Space ID : 00 [SystemMemory]
[175h 0373   1]                    Bit Width : 20
[176h 0374   1]                   Bit Offset : 00
[177h 0375   1]         Encoded Access Width : 03 [DWord Access:32]
[178h 0376   8]                      Address : 00000000C1063000

[180h 0384   8]                        Value : 0000000000000007
[188h 0392   8]                         Mask : 00000000000000FF

[190h 0400   1]                       Action : 07 [Get Command Status]
[191h 0401   1]                  Instruction : 00 [Read Register]
[192h 0402   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[193h 0403   1]                     Reserved : 00

[194h 0404  12]              Register Region : [Generic Address Structure]
[194h 0404   1]                     Space ID : 00 [SystemMemory]
[195h 0405   1]                    Bit Width : 20
[196h 0406   1]                   Bit Offset : 00
[197h 0407   1]         Encoded Access Width : 03 [DWord Access:32]
[198h 0408   8]                      Address : 00000000C1063008

[1A0h 0416   8]                        Value : 0000000000000000
[1A8h 0424   8]                         Mask : 00000000000000FF

[1B0h 0432   1]                       Action : 08 [Get Record Identifier]
[1B1h 0433   1]                  Instruction : 03 [Write Register Value]
[1B2h 0434   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[1B3h 0435   1]                     Reserved : 00

[1B4h 0436  12]              Register Region : [Generic Address Structure]
[1B4h 0436   1]                     Space ID : 00 [SystemMemory]
[1B5h 0437   1]                    Bit Width : 20
[1B6h 0438   1]                   Bit Offset : 00
[1B7h 0439   1]         Encoded Access Width : 03 [DWord Access:32]
[1B8h 0440   8]                      Address : 00000000C1063000

[1C0h 0448   8]                        Value : 0000000000000008
[1C8h 0456   8]                         Mask : 00000000000000FF

[1D0h 0464   1]                       Action : 08 [Get Record Identifier]
[1D1h 0465   1]                  Instruction : 00 [Read Register]
[1D2h 0466   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[1D3h 0467   1]                     Reserved : 00

[1D4h 0468  12]              Register Region : [Generic Address Structure]
[1D4h 0468   1]                     Space ID : 00 [SystemMemory]
[1D5h 0469   1]                    Bit Width : 40
[1D6h 0470   1]                   Bit Offset : 00
[1D7h 0471   1]         Encoded Access Width : 04 [QWord Access:64]
[1D8h 0472   8]                      Address : 00000000C1063008

[1E0h 0480   8]                        Value : 0000000000000000
[1E8h 0488   8]                         Mask : FFFFFFFFFFFFFFFF

[1F0h 0496   1]                       Action : 09 [Set Record Identifier]
[1F1h 0497   1]                  Instruction : 02 [Write Register]
[1F2h 0498   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[1F3h 0499   1]                     Reserved : 00

[1F4h 0500  12]              Register Region : [Generic Address Structure]
[1F4h 0500   1]                     Space ID : 00 [SystemMemory]
[1F5h 0501   1]                    Bit Width : 40
[1F6h 0502   1]                   Bit Offset : 00
[1F7h 0503   1]         Encoded Access Width : 04 [QWord Access:64]
[1F8h 0504   8]                      Address : 00000000C1063008

[200h 0512   8]                        Value : 0000000000000000
[208h 0520   8]                         Mask : FFFFFFFFFFFFFFFF

[210h 0528   1]                       Action : 09 [Set Record Identifier]
[211h 0529   1]                  Instruction : 03 [Write Register Value]
[212h 0530   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[213h 0531   1]                     Reserved : 00

[214h 0532  12]              Register Region : [Generic Address Structure]
[214h 0532   1]                     Space ID : 00 [SystemMemory]
[215h 0533   1]                    Bit Width : 20
[216h 0534   1]                   Bit Offset : 00
[217h 0535   1]         Encoded Access Width : 03 [DWord Access:32]
[218h 0536   8]                      Address : 00000000C1063000

[220h 0544   8]                        Value : 0000000000000009
[228h 0552   8]                         Mask : 00000000000000FF

[230h 0560   1]                       Action : 0A [Get Record Count]
[231h 0561   1]                  Instruction : 03 [Write Register Value]
[232h 0562   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[233h 0563   1]                     Reserved : 00

[234h 0564  12]              Register Region : [Generic Address Structure]
[234h 0564   1]                     Space ID : 00 [SystemMemory]
[235h 0565   1]                    Bit Width : 20
[236h 0566   1]                   Bit Offset : 00
[237h 0567   1]         Encoded Access Width : 03 [DWord Access:32]
[238h 0568   8]                      Address : 00000000C1063000

[240h 0576   8]                        Value : 000000000000000A
[248h 0584   8]                         Mask : 00000000000000FF

[250h 0592   1]                       Action : 0A [Get Record Count]
[251h 0593   1]                  Instruction : 00 [Read Register]
[252h 0594   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[253h 0595   1]                     Reserved : 00

[254h 0596  12]              Register Region : [Generic Address Structure]
[254h 0596   1]                     Space ID : 00 [SystemMemory]
[255h 0597   1]                    Bit Width : 20
[256h 0598   1]                   Bit Offset : 00
[257h 0599   1]         Encoded Access Width : 03 [DWord Access:32]
[258h 0600   8]                      Address : 00000000C1063008

[260h 0608   8]                        Value : 0000000000000000
[268h 0616   8]                         Mask : 00000000FFFFFFFF

[270h 0624   1]                       Action : 0B [Begin Dummy Write]
[271h 0625   1]                  Instruction : 03 [Write Register Value]
[272h 0626   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[273h 0627   1]                     Reserved : 00

[274h 0628  12]              Register Region : [Generic Address Structure]
[274h 0628   1]                     Space ID : 00 [SystemMemory]
[275h 0629   1]                    Bit Width : 20
[276h 0630   1]                   Bit Offset : 00
[277h 0631   1]         Encoded Access Width : 03 [DWord Access:32]
[278h 0632   8]                      Address : 00000000C1063000

[280h 0640   8]                        Value : 000000000000000B
[288h 0648   8]                         Mask : 00000000000000FF

[290h 0656   1]                       Action : 0D [Get Error Address Range]
[291h 0657   1]                  Instruction : 03 [Write Register Value]
[292h 0658   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[293h 0659   1]                     Reserved : 00

[294h 0660  12]              Register Region : [Generic Address Structure]
[294h 0660   1]                     Space ID : 00 [SystemMemory]
[295h 0661   1]                    Bit Width : 20
[296h 0662   1]                   Bit Offset : 00
[297h 0663   1]         Encoded Access Width : 03 [DWord Access:32]
[298h 0664   8]                      Address : 00000000C1063000

[2A0h 0672   8]                        Value : 000000000000000D
[2A8h 0680   8]                         Mask : 00000000000000FF

[2B0h 0688   1]                       Action : 0D [Get Error Address Range]
[2B1h 0689   1]                  Instruction : 00 [Read Register]
[2B2h 0690   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[2B3h 0691   1]                     Reserved : 00

[2B4h 0692  12]              Register Region : [Generic Address Structure]
[2B4h 0692   1]                     Space ID : 00 [SystemMemory]
[2B5h 0693   1]                    Bit Width : 40
[2B6h 0694   1]                   Bit Offset : 00
[2B7h 0695   1]         Encoded Access Width : 04 [QWord Access:64]
[2B8h 0696   8]                      Address : 00000000C1063008

[2C0h 0704   8]                        Value : 0000000000000000
[2C8h 0712   8]                         Mask : FFFFFFFFFFFFFFFF

[2D0h 0720   1]                       Action : 0E [Get Error Address Length]
[2D1h 0721   1]                  Instruction : 03 [Write Register Value]
[2D2h 0722   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[2D3h 0723   1]                     Reserved : 00

[2D4h 0724  12]              Register Region : [Generic Address Structure]
[2D4h 0724   1]                     Space ID : 00 [SystemMemory]
[2D5h 0725   1]                    Bit Width : 20
[2D6h 0726   1]                   Bit Offset : 00
[2D7h 0727   1]         Encoded Access Width : 03 [DWord Access:32]
[2D8h 0728   8]                      Address : 00000000C1063000

[2E0h 0736   8]                        Value : 000000000000000E
[2E8h 0744   8]                         Mask : 00000000000000FF

[2F0h 0752   1]                       Action : 0E [Get Error Address Length]
[2F1h 0753   1]                  Instruction : 00 [Read Register]
[2F2h 0754   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[2F3h 0755   1]                     Reserved : 00

[2F4h 0756  12]              Register Region : [Generic Address Structure]
[2F4h 0756   1]                     Space ID : 00 [SystemMemory]
[2F5h 0757   1]                    Bit Width : 40
[2F6h 0758   1]                   Bit Offset : 00
[2F7h 0759   1]         Encoded Access Width : 04 [QWord Access:64]
[2F8h 0760   8]                      Address : 00000000C1063008

[300h 0768   8]                        Value : 0000000000000000
[308h 0776   8]                         Mask : 00000000FFFFFFFF

[310h 0784   1]                       Action : 0F [Get Error Attributes]
[311h 0785   1]                  Instruction : 03 [Write Register Value]
[312h 0786   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[313h 0787   1]                     Reserved : 00

[314h 0788  12]              Register Region : [Generic Address Structure]
[314h 0788   1]                     Space ID : 00 [SystemMemory]
[315h 0789   1]                    Bit Width : 20
[316h 0790   1]                   Bit Offset : 00
[317h 0791   1]         Encoded Access Width : 03 [DWord Access:32]
[318h 0792   8]                      Address : 00000000C1063000

[320h 0800   8]                        Value : 000000000000000F
[328h 0808   8]                         Mask : 00000000000000FF

[330h 0816   1]                       Action : 0F [Get Error Attributes]
[331h 0817   1]                  Instruction : 00 [Read Register]
[332h 0818   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[333h 0819   1]                     Reserved : 00

[334h 0820  12]              Register Region : [Generic Address Structure]
[334h 0820   1]                     Space ID : 00 [SystemMemory]
[335h 0821   1]                    Bit Width : 20
[336h 0822   1]                   Bit Offset : 00
[337h 0823   1]         Encoded Access Width : 03 [DWord Access:32]
[338h 0824   8]                      Address : 00000000C1063008

[340h 0832   8]                        Value : 0000000000000000
[348h 0840   8]                         Mask : 00000000FFFFFFFF

[350h 0848   1]                       Action : 10 [Execute Timings]
[351h 0849   1]                  Instruction : 03 [Write Register Value]
[352h 0850   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[353h 0851   1]                     Reserved : 00

[354h 0852  12]              Register Region : [Generic Address Structure]
[354h 0852   1]                     Space ID : 00 [SystemMemory]
[355h 0853   1]                    Bit Width : 20
[356h 0854   1]                   Bit Offset : 00
[357h 0855   1]         Encoded Access Width : 03 [DWord Access:32]
[358h 0856   8]                      Address : 00000000C1063000

[360h 0864   8]                        Value : 0000000000000010
[368h 0872   8]                         Mask : 00000000000000FF

[370h 0880   1]                       Action : 10 [Execute Timings]
[371h 0881   1]                  Instruction : 00 [Read Register]
[372h 0882   1]        Flags (decoded below) : 00
                       Preserve Register Bits : 0
[373h 0883   1]                     Reserved : 00

[374h 0884  12]              Register Region : [Generic Address Structure]
[374h 0884   1]                     Space ID : 00 [SystemMemory]
[375h 0885   1]                    Bit Width : 40
[376h 0886   1]                   Bit Offset : 00
[377h 0887   1]         Encoded Access Width : 04 [QWord Access:64]
[378h 0888   8]                      Address : 00000000C1063008

[380h 0896   8]                        Value : 0000000000000000
[388h 0904   8]                         Mask : FFFFFFFFFFFFFFFF

Raw Table Data: Length 912 (0x390)

   0000: 45 52 53 54 90 03 00 00 01 C8 42 4F 43 48 53 20  // ERST......BOCHS
   0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
   0020: 01 00 00 00 30 00 00 00 00 00 00 00 1B 00 00 00  // ....0...........
   0030: 00 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0040: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0050: 01 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0060: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0070: 02 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0080: 02 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0090: 03 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   00A0: 03 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   00B0: 04 02 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
   00C0: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
   00D0: 04 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   00E0: 04 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   00F0: 05 03 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
   0100: 9C 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0110: 05 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0120: 05 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0130: 06 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0140: 06 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0150: 06 01 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
   0160: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0170: 07 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0180: 07 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0190: 07 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
   01A0: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   01B0: 08 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   01C0: 08 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   01D0: 08 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
   01E0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
   01F0: 09 02 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
   0200: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
   0210: 09 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0220: 09 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0230: 0A 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0240: 0A 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0250: 0A 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
   0260: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
   0270: 0B 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0280: 0B 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0290: 0D 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   02A0: 0D 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   02B0: 0D 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
   02C0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
   02D0: 0E 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   02E0: 0E 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   02F0: 0E 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
   0300: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
   0310: 0F 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0320: 0F 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0330: 0F 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
   0340: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
   0350: 10 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
   0360: 10 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
   0370: 10 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
   0380: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
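
For anyone cross-checking the raw bytes against the decoded listing, here is
a minimal sketch of how one 32-byte Serialization Instruction Entry splits
into fields: 4 bytes of action/instruction/flags/reserved, a 12-byte Generic
Address Structure, then 8-byte Value and Mask, all little-endian. The names
erst_entry, le64 and erst_decode_entry are made up for illustration and are
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one Serialization Instruction Entry. */
struct erst_entry {
    uint8_t action;
    uint8_t instruction;
    uint8_t flags;
    uint8_t space_id;
    uint8_t bit_width;
    uint8_t bit_offset;
    uint8_t access_width;
    uint64_t address;
    uint64_t value;
    uint64_t mask;
};

/* Read a little-endian 64-bit field byte by byte (host-endian neutral). */
static uint64_t le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Decode the 32 bytes starting at p into the struct above. */
static struct erst_entry erst_decode_entry(const uint8_t *p)
{
    struct erst_entry e;

    assert(p != NULL);
    e.action = p[0];
    e.instruction = p[1];
    e.flags = p[2];
    /* p[3] is Reserved */
    e.space_id = p[4];        /* Generic Address Structure starts here */
    e.bit_width = p[5];
    e.bit_offset = p[6];
    e.access_width = p[7];
    e.address = le64(p + 8);
    e.value = le64(p + 16);
    e.mask = le64(p + 24);
    return e;
}
```

Feeding it the two rows at 02F0/0300 gives action 0x0E, instruction 0x00,
bit width 0x40, access width 4, address 0xC1063008 and mask 0xFFFFFFFF,
which agrees with the decoded listing for that entry.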



* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-07-26 11:00       ` Igor Mammedov
@ 2021-07-26 20:02         ` Eric DeVolder
  2021-07-27 12:01           ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-26 20:02 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/26/21 6:00 AM, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 11:12:41 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/20/21 8:16 AM, Igor Mammedov wrote:
>>> On Wed, 30 Jun 2021 15:07:17 -0400
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> This code is called from the machine code (if ACPI supported)
>>>> to generate the ACPI ERST table.
>>> should be along lines:
>>> This builds ACPI ERST table /spec ref/ to inform OSPM
>>> how to communicate with ... device.
>>
>> Like this?
>> This builds the ACPI ERST table to inform OSPM how to communicate
>                                   ^ [1]
>> with the acpi-erst device.
>>
> 
> 1) ACPI spec vX.Y, chapter foo
done

> 
>>
>>>    
>>>>
>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>>> ---
>>>>    hw/acpi/erst.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    1 file changed, 214 insertions(+)
>>>>
>>>> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
>>>> index 6e9bd2e..1f1dbbc 100644
>>>> --- a/hw/acpi/erst.c
>>>> +++ b/hw/acpi/erst.c
>>>> @@ -555,6 +555,220 @@ static const MemoryRegionOps erst_mem_ops = {
>>>>    /*******************************************************************/
>>>>    /*******************************************************************/
>>>>    
>>>> +/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
>>>> +static void build_serialization_instruction_entry(GArray *table_data,
>>>> +    uint8_t serialization_action,
>>>> +    uint8_t instruction,
>>>> +    uint8_t flags,
>>>> +    uint8_t register_bit_width,
>>>> +    uint64_t register_address,
>>>> +    uint64_t value,
>>>> +    uint64_t mask)
>>> like I mentioned in previous patch, It could be simplified
>>> a lot if it's possible to use fixed 64-bit access with every
>>> action and the same width mask.
>> See previous response.
> let's see if it's a guest OS issue first, and then decide what to do with it.
> 
>>
>>>    
>>>> +{
>>>> +    /* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
>>>> +    struct AcpiGenericAddress gas;
>>>> +
>>>> +    /* Serialization Action */
>>>> +    build_append_int_noprefix(table_data, serialization_action, 1);
>>>> +    /* Instruction */
>>>> +    build_append_int_noprefix(table_data, instruction         , 1);
>>>> +    /* Flags */
>>>> +    build_append_int_noprefix(table_data, flags               , 1);
>>>> +    /* Reserved */
>>>> +    build_append_int_noprefix(table_data, 0                   , 1);
>>>> +    /* Register Region */
>>>> +    gas.space_id = AML_SYSTEM_MEMORY;
>>>> +    gas.bit_width = register_bit_width;
>>>> +    gas.bit_offset = 0;
>>>> +    switch (register_bit_width) {
>>>> +    case 8:
>>>> +        gas.access_width = 1;
>>>> +        break;
>>>> +    case 16:
>>>> +        gas.access_width = 2;
>>>> +        break;
>>>> +    case 32:
>>>> +        gas.access_width = 3;
>>>> +        break;
>>>> +    case 64:
>>>> +        gas.access_width = 4;
>>>> +        break;
>>>> +    default:
>>>> +        gas.access_width = 0;
>>>> +        break;
>>>> +    }
>>>> +    gas.address = register_address;
>>>> +    build_append_gas_from_struct(table_data, &gas);
>>>> +    /* Value */
>>>> +    build_append_int_noprefix(table_data, value  , 8);
>>>> +    /* Mask */
>>>> +    build_append_int_noprefix(table_data, mask   , 8);
>>>> +}
>>>> +
>>>> +/* ACPI 4.0: 17.4.1 Serialization Action Table */
>>>> +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
>>>> +    const char *oem_id, const char *oem_table_id)
>>>> +{
>>>> +    ERSTDeviceState *s = ACPIERST(erst_dev);
>>>
>>> globals are not welcomed in new code,
>>> pass erst_dev as argument here (ex: find_vmgenid_dev)
>>>    
>>>> +    unsigned action;
>>>> +    unsigned erst_start = table_data->len;
>>>> +
>>>    
>>>> +    s->bar0 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
>>>> +    trace_acpi_erst_pci_bar_0(s->bar0);
>>>> +    s->bar1 = (hwaddr)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);
>>>
>>> just store pci_get_bar_addr(PCI_DEVICE(erst_dev), 0) in local variable,
>>> Bar 1 is not used in this function so you don't need it here.
>> Corrected
>>
>>>
>>>    
>>>> +    trace_acpi_erst_pci_bar_1(s->bar1);
>>>> +
>>>> +    acpi_data_push(table_data, sizeof(AcpiTableHeader));
>>>> +    /* serialization_header_length */
>>> comments documenting table entries should be verbatim copy from spec,
>>> see build_amd_iommu() as example of preferred style.
>> Corrected
>>
>>>    
>>>> +    build_append_int_noprefix(table_data, 48, 4);
>>>> +    /* reserved */
>>>> +    build_append_int_noprefix(table_data,  0, 4);
>>>> +    /*
>>>> +     * instruction_entry_count - changes to the number of serialization
>>>> +     * instructions in the ACTIONs below must be reflected in this
>>>> +     * pre-computed value.
>>>> +     */
>>>> +    build_append_int_noprefix(table_data, 29, 4);
>>> a bit fragile as it can easily diverge from actual number later on.
>>> maybe instead of building instruction entries in place, build it
>>> in separate array and when done, use actual count to fill instruction_entry_count.
>>> pseudo code could look like:
>>>
>>>        /* prepare instructions in advance because ... */
>>>        GArray table_instruction_data;
>>>        build_serialization_instruction_entry(table_instruction_data,...);
>>>        ...
>>>        build_serialization_instruction_entry(table_instruction_data,...);
>>>        /* instructions count */
>>>        build_append_int_noprefix(table_data, table_instruction_data.len/entry_size, 4);
>>>        /* copy prepared in advance instructions */
>>>        g_array_append_vals(table_data, table_instruction_data.data, table_instruction_data.len);
>> Corrected
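
The idea in the pseudo code above can be sketched outside of QEMU as
follows. This is only an illustration under stated assumptions: struct buf
is a plain growable byte buffer standing in for GArray, and buf_append,
buf_append_le, append_entry and emit_instructions are hypothetical helpers,
not functions from the patch or from GLib.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ENTRY_SIZE 32u   /* one Serialization Instruction Entry */

/* Hypothetical stand-in for GArray: a growable byte buffer. */
struct buf {
    uint8_t *data;
    size_t len;
};

static void buf_append(struct buf *b, const void *src, size_t n)
{
    uint8_t *p;

    if (n == 0) {
        return;
    }
    p = realloc(b->data, b->len + n);
    assert(p != NULL);
    memcpy(p + b->len, src, n);
    b->data = p;
    b->len += n;
}

/* Append the low n bytes of v in little-endian order. */
static void buf_append_le(struct buf *b, uint64_t v, size_t n)
{
    uint8_t tmp[8];

    assert(n <= sizeof(tmp));
    for (size_t i = 0; i < n; i++) {
        tmp[i] = (uint8_t)(v >> (8 * i));
    }
    buf_append(b, tmp, n);
}

/* Append a zero-filled entry carrying only action and instruction. */
static void append_entry(struct buf *b, uint8_t action, uint8_t insn)
{
    uint8_t e[ENTRY_SIZE] = { action, insn };

    buf_append(b, e, sizeof(e));
}

/*
 * Emit the entry count derived from the prepared bytes, then the entries
 * themselves, so the count cannot drift from what was actually built.
 */
static uint32_t emit_instructions(struct buf *table, const struct buf *insns)
{
    uint32_t count = (uint32_t)(insns->len / ENTRY_SIZE);

    assert(insns->len % ENTRY_SIZE == 0);
    buf_append_le(table, count, 4);
    buf_append(table, insns->data, insns->len);
    return count;
}
```

With three calls to append_entry() followed by emit_instructions(), the
table buffer starts with a 4-byte count of 3 followed by the 96 bytes of
entries; adding or removing an entry changes the count without touching any
hard-coded constant.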
>>
>>>      
>>>    
>>>> +
>>>> +#define MASK8  0x00000000000000FFUL
>>>> +#define MASK16 0x000000000000FFFFUL
>>>> +#define MASK32 0x00000000FFFFFFFFUL
>>>> +#define MASK64 0xFFFFFFFFFFFFFFFFUL
>>>> +
>>>> +    for (action = 0; action < ACPI_ERST_MAX_ACTIONS; ++action) {
>>> I'd unroll this loop and just directly code entries in required order.
>>> also drop reserved and nop actions/instructions or explain why they are necessary.
>> Unrolled. Dropped the NOP.
> 
> What about 'reserved'?
I dropped Reserved as it was composed of a NOP, and isn't needed either.

> 
>>
>>>    
>>>> +        switch (action) {
>>>> +        case ACPI_ERST_ACTION_BEGIN_WRITE_OPERATION:
>>> given these names will/should never be exposed outside of hw/acpi/erst.c
>>> I'd drop ACPI_ERST_ACTION_/ACPI_ERST_INST_ prefixes (i.e. use names as defined in spec)
>>> if it doesn't cause build issues.
>> These are in include/hw/acpi/erst.h which is included by hw/i386/acpi-build.c,
>> which includes many other hardware files.
>> Removing the prefix leaves a rather generic name.
>> I'd prefer to leave them as it uniquely differentiates.
> is there a reason to put them into erst.h and expose to outside world?
> If not then it might be better to move them into erst.c
I've moved them into erst.c

> 
>>
>>
>>>    
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_END_OPERATION:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
>>>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
>>>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_RESERVED:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>>>> +            break;
>>>> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>>>> +        default:
>>>> +            build_serialization_instruction_entry(table_data, action,
>>>> +                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +    build_header(linker, table_data,
>>>> +                 (void *)(table_data->data + erst_start),
>>>> +                 "ERST", table_data->len - erst_start,
>>>> +                 1, oem_id, oem_table_id);
>>>> +}
>>>> +
>>>> +/*******************************************************************/
>>>> +/*******************************************************************/
>>>> +
>>>>    static const VMStateDescription erst_vmstate  = {
>>>>        .name = "acpi-erst",
>>>>        .version_id = 1,
>>>    
>>
> 



* Re: [PATCH v5 07/10] ACPI ERST: trace support
  2021-07-26 11:08       ` Igor Mammedov
@ 2021-07-26 20:03         ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-26 20:03 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/26/21 6:08 AM, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 11:14:37 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/20/21 8:15 AM, Igor Mammedov wrote:
>>> On Wed, 30 Jun 2021 15:07:18 -0400
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> Provide the definitions needed to support tracing in ACPI ERST.
>>> trace points should be introduced in patches that use them for the first time,
>>> as it stands now series breaks bisection.
>>
>> Are you asking to move this patch before the patch that introduces erst.c (which
>> uses these trace points)?
>> Or are you asking to include this patch with the patch that introduces erst.c?
> 
> I'd squash it into patch that introduces corresponding functions.
Done

> 
>> Also, you requested I separate the building of ERST table from the implementation
>> of the erst device as separate patches. Doesn't that also break bisection?
> 
> it should be possible to compile series patch by patch and not break 'make check'
> after each patch.
> 
> ACPI table is not part of device, it's separate part that describes to OSPM
> how to work with device. So if code split correctly between patches
> it shouldn't break bisection.
<nods>

> 
>>
>>>    
>>>>
>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>>> ---
>>>>    hw/acpi/trace-events | 14 ++++++++++++++
>>>>    1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events
>>>> index dcc1438..a5c2755 100644
>>>> --- a/hw/acpi/trace-events
>>>> +++ b/hw/acpi/trace-events
>>>> @@ -55,3 +55,17 @@ piix4_gpe_writeb(uint64_t addr, unsigned width, uint64_t val) "addr: 0x%" PRIx64
>>>>    # tco.c
>>>>    tco_timer_reload(int ticks, int msec) "ticks=%d (%d ms)"
>>>>    tco_timer_expired(int timeouts_no, bool strap, bool no_reboot) "timeouts_no=%d no_reboot=%d/%d"
>>>> +
>>>> +# erst.c
>>>> +acpi_erst_reg_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
>>>> +acpi_erst_reg_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%04" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
>>>> +acpi_erst_mem_write(uint64_t addr, uint64_t val, unsigned size) "addr: 0x%06" PRIx64 " <== 0x%016" PRIx64 " (size: %u)"
>>>> +acpi_erst_mem_read(uint64_t addr, uint64_t val, unsigned size) " addr: 0x%06" PRIx64 " ==> 0x%016" PRIx64 " (size: %u)"
>>>> +acpi_erst_pci_bar_0(uint64_t addr) "BAR0: 0x%016" PRIx64
>>>> +acpi_erst_pci_bar_1(uint64_t addr) "BAR1: 0x%016" PRIx64
>>>> +acpi_erst_realizefn_in(void)
>>>> +acpi_erst_realizefn_out(unsigned size) "total nvram size %u bytes"
>>>> +acpi_erst_reset_in(unsigned record_count) "record_count %u"
>>>> +acpi_erst_reset_out(unsigned record_count) "record_count %u"
>>>> +acpi_erst_class_init_in(void)
>>>> +acpi_erst_class_init_out(void)
>>>    
>>
> 
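
To show what one of these trace lines looks like once formatted, here is a
small standalone sketch that applies the acpi_erst_reg_write format string
with snprintf(). The wrapper name format_reg_write is made up for the
example; QEMU's trace backends generate their own code from trace-events.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a register-write trace line using the same format string as the
 * acpi_erst_reg_write trace point. Returns what snprintf() returns.
 */
static int format_reg_write(char *out, size_t outlen,
                            uint64_t addr, uint64_t val, unsigned size)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen,
                    "addr: 0x%04" PRIx64 " <== 0x%016" PRIx64 " (size: %u)",
                    addr, val, size);
}
```

Calling it with addr 0x8, val 0x1 and size 4 yields
"addr: 0x0008 <== 0x0000000000000001 (size: 4)".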



* Re: [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines.
  2021-07-26 11:30       ` Igor Mammedov
@ 2021-07-26 20:03         ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-26 20:03 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/26/21 6:30 AM, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 11:16:42 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/20/21 8:19 AM, Igor Mammedov wrote:
>>> On Wed, 30 Jun 2021 15:07:19 -0400
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> This change exposes ACPI ERST support for x86 guests.
>>>>
>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>> looks good to me, maybe move find_erst_dev() impl. here as well
>>> if this is the patch where it's first used.
>>
>> I've followed your previous suggestion of mimicking find_vmgenid_dev(), which
>> declares it in its header file. I've done the same, find_erst_dev() is
>> declared in its header file and used in these files.
> 
> it's fine doing like this but
> it would be easier to follow if this were part of [6/10],
> so that function is introduced and used in the same patch.
Done

> 
> 
>>>    
>>>> ---
>>>>    hw/i386/acpi-build.c   | 9 +++++++++
>>>>    hw/i386/acpi-microvm.c | 9 +++++++++
>>>>    2 files changed, 18 insertions(+)
>>>>
>>>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>>>> index de98750..d2026cc 100644
>>>> --- a/hw/i386/acpi-build.c
>>>> +++ b/hw/i386/acpi-build.c
>>>> @@ -43,6 +43,7 @@
>>>>    #include "sysemu/tpm.h"
>>>>    #include "hw/acpi/tpm.h"
>>>>    #include "hw/acpi/vmgenid.h"
>>>> +#include "hw/acpi/erst.h"
>>>>    #include "hw/boards.h"
>>>>    #include "sysemu/tpm_backend.h"
>>>>    #include "hw/rtc/mc146818rtc_regs.h"
>>>> @@ -2327,6 +2328,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>>>>        GArray *tables_blob = tables->table_data;
>>>>        AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
>>>>        Object *vmgenid_dev;
>>>> +    Object *erst_dev;
>>>>        char *oem_id;
>>>>        char *oem_table_id;
>>>>    
>>>> @@ -2388,6 +2390,13 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>>>>                        ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>>>>                        x86ms->oem_table_id);
>>>>    
>>>> +    erst_dev = find_erst_dev();
>>>> +    if (erst_dev) {
>>>> +        acpi_add_table(table_offsets, tables_blob);
>>>> +        build_erst(tables_blob, tables->linker, erst_dev,
>>>> +                   x86ms->oem_id, x86ms->oem_table_id);
>>>> +    }
>>>> +
>>>>        vmgenid_dev = find_vmgenid_dev();
>>>>        if (vmgenid_dev) {
>>>>            acpi_add_table(table_offsets, tables_blob);
>>>> diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
>>>> index ccd3303..0099b13 100644
>>>> --- a/hw/i386/acpi-microvm.c
>>>> +++ b/hw/i386/acpi-microvm.c
>>>> @@ -30,6 +30,7 @@
>>>>    #include "hw/acpi/bios-linker-loader.h"
>>>>    #include "hw/acpi/generic_event_device.h"
>>>>    #include "hw/acpi/utils.h"
>>>> +#include "hw/acpi/erst.h"
>>>>    #include "hw/boards.h"
>>>>    #include "hw/i386/fw_cfg.h"
>>>>    #include "hw/i386/microvm.h"
>>>> @@ -160,6 +161,7 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>>>>        X86MachineState *x86ms = X86_MACHINE(mms);
>>>>        GArray *table_offsets;
>>>>        GArray *tables_blob = tables->table_data;
>>>> +    Object *erst_dev;
>>>>        unsigned dsdt, xsdt;
>>>>        AcpiFadtData pmfadt = {
>>>>            /* ACPI 5.0: 4.1 Hardware-Reduced ACPI */
>>>> @@ -209,6 +211,13 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>>>>                        ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>>>>                        x86ms->oem_table_id);
>>>>    
>>>> +    erst_dev = find_erst_dev();
>>>> +    if (erst_dev) {
>>>> +        acpi_add_table(table_offsets, tables_blob);
>>>> +        build_erst(tables_blob, tables->linker, erst_dev,
>>>> +                   x86ms->oem_id, x86ms->oem_table_id);
>>>> +    }
>>>> +
>>>>        xsdt = tables_blob->len;
>>>>        build_xsdt(tables_blob, tables->linker, table_offsets, x86ms->oem_id,
>>>>                   x86ms->oem_table_id);
>>>    
>>
> 



* Re: [PATCH v5 09/10] ACPI ERST: qtest for ERST
  2021-07-26 11:45       ` Igor Mammedov
@ 2021-07-26 20:06         ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-26 20:06 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/26/21 6:45 AM, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 11:18:44 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/20/21 8:38 AM, Igor Mammedov wrote:
>>> On Wed, 30 Jun 2021 15:07:20 -0400
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> This change provides a qtest that locates and then does a simple
>>>> interrogation of the ERST feature within the guest.
>>>>
>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>>> ---
>>>>    tests/qtest/erst-test.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    tests/qtest/meson.build |   2 +
>>>>    2 files changed, 131 insertions(+)
>>>>    create mode 100644 tests/qtest/erst-test.c
>>>>
>>>> diff --git a/tests/qtest/erst-test.c b/tests/qtest/erst-test.c
>>>> new file mode 100644
>>>> index 0000000..ce014c1
>>>> --- /dev/null
>>>> +++ b/tests/qtest/erst-test.c
>>>> @@ -0,0 +1,129 @@
>>>> +/*
>>>> + * QTest testcase for ACPI ERST
>>>> + *
>>>> + * Copyright (c) 2021 Oracle
>>>> + *
>>>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>>>> + * See the COPYING file in the top-level directory.
>>>> + */
>>>> +
>>>> +#include "qemu/osdep.h"
>>>> +#include "qemu/bitmap.h"
>>>> +#include "qemu/uuid.h"
>>>> +#include "hw/acpi/acpi-defs.h"
>>>> +#include "boot-sector.h"
>>>> +#include "acpi-utils.h"
>>>> +#include "libqos/libqtest.h"
>>>> +#include "qapi/qmp/qdict.h"
>>>> +
>>>> +#define RSDP_ADDR_INVALID 0x100000 /* RSDP must be below this address */
>>>> +
>>>> +static uint64_t acpi_find_erst(QTestState *qts)
>>>> +{
>>>> +    uint32_t rsdp_offset;
>>>> +    uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
>>>> +    uint32_t rsdt_len, table_length;
>>>> +    uint8_t *rsdt, *ent;
>>>> +    uint64_t base = 0;
>>>> +
>>>> +    /* Wait for guest firmware to finish and start the payload. */
>>>> +    boot_sector_test(qts);
>>>> +
>>>> +    /* Tables should be initialized now. */
>>>> +    rsdp_offset = acpi_find_rsdp_address(qts);
>>>> +
>>>> +    g_assert_cmphex(rsdp_offset, <, RSDP_ADDR_INVALID);
>>>> +
>>>> +    acpi_fetch_rsdp_table(qts, rsdp_offset, rsdp_table);
>>>> +    acpi_fetch_table(qts, &rsdt, &rsdt_len, &rsdp_table[16 /* RsdtAddress */],
>>>> +                     4, "RSDT", true);
>>>> +
>>>> +    ACPI_FOREACH_RSDT_ENTRY(rsdt, rsdt_len, ent, 4 /* Entry size */) {
>>>> +        uint8_t *table_aml;
>>>> +        acpi_fetch_table(qts, &table_aml, &table_length, ent, 4, NULL, true);
>>>> +        if (!memcmp(table_aml + 0 /* Header Signature */, "ERST", 4)) {
>>>> +            /*
>>>> +             * Picking up ERST base address from the Register Region
>>>> +             * specified as part of the first Serialization Instruction
>>>> +             * Action (which is a Begin Write Operation).
>>>> +             */
>>>> +            memcpy(&base, &table_aml[56], sizeof(base));
>>>> +            g_free(table_aml);
>>>> +            break;
>>>> +        }
>>>> +        g_free(table_aml);
>>>> +    }
>>>> +    g_free(rsdt);
>>>> +    return base;
>>>> +}
>>> I'd drop this, bios-tables-test should do ACPI table check
>>> as for PCI device itself you can test it with qtest accelerator
>>> that allows to instantiate it and access registers directly
>>> without overhead of running actual guest.
>> Yes, bios-tables-test checks the ACPI table, but not the functionality.
>> This test has actually caught a problem/bug during development.
> 
> What I'm saying is not to drop the test, but rather to use the qtest
> accelerator to test the PCI hardware registers. It doesn't run
> guest code, but lets you directly program/access the PCI device.
> 
> So instead of searching/parsing ERST table, you'd program BARs
> manually on behalf of BIOS, and then test that it works as expected.
> 
> As for ACPI tables, we don't have a complete testing infrastructure
> in tree; bios-tables-test only tests for a match against the committed
> reference blobs. Verifying that a reference blob is correct
> is currently a manual process.
> 
> To test whole stack one could write an optional acceptance test,
> which would run actual guest (if you wish to add that, you can look at
> docs/devel/testing.rst "Acceptance tests ...").
> 

I've reworked this to pattern it after ivshmem test.
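
For readers following the thread, the register handshake that the test
exercises can be modelled in plain C. This is only a rough sketch, not
the reworked test: the action codes 0xD/0xE/0xF, the 8 KiB length, and
the zero attributes come from the quoted test above, while the struct,
the function names, and the exchange buffer address are assumptions
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Action codes as used by the quoted test; the enum names are made up. */
enum {
    ACT_GET_ERROR_LOG_ADDRESS_RANGE            = 0xD,
    ACT_GET_ERROR_LOG_ADDRESS_LENGTH           = 0xE,
    ACT_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES = 0xF,
};

#define MODEL_RECORD_SIZE (8 * 1024) /* 8 KiB record exchange buffer */

/* Hypothetical stand-in for the device state behind BAR0. */
struct erst_model {
    uint64_t value;         /* VALUE register (offset 8 in the test) */
    uint64_t exchange_base; /* where the exchange buffer is mapped */
};

/* A write to ACTION (offset 0) does the work and latches a result in VALUE. */
static void model_write_action(struct erst_model *m, uint32_t action)
{
    assert(m != NULL);
    switch (action) {
    case ACT_GET_ERROR_LOG_ADDRESS_RANGE:
        m->value = m->exchange_base;
        break;
    case ACT_GET_ERROR_LOG_ADDRESS_LENGTH:
        m->value = MODEL_RECORD_SIZE;
        break;
    case ACT_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
        m->value = 0; /* the test expects no attributes (non-NVRAM mode) */
        break;
    default:
        /* Other actions are not modelled; VALUE is left untouched. */
        break;
    }
}

/* Reading VALUE has no side effects; it returns the latched contents. */
static uint64_t model_read_value(const struct erst_model *m)
{
    return m->value;
}

/* The two-step sequence a test performs for each query. */
static uint64_t model_query(struct erst_model *m, uint32_t action)
{
    model_write_action(m, action);
    return model_read_value(m);
}
```

A qtest written against the real device would perform the same
write-ACTION-then-read-VALUE pair, only through the PCI BAR accessors
rather than a local struct.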

> 
> 
>>> As example you can look into megasas-test.c, ivshmem-test.c
>>> or other PCI device tests.
>> But I'll look at these and see about migrating to this approach.
>>
>>>    
>>>> +static char disk[] = "tests/erst-test-disk-XXXXXX";
>>>> +
>>>> +#define ERST_CMD()                              \
>>>> +    "-accel kvm -accel tcg "                    \
>>>> +    "-object memory-backend-file," \
>>>> +      "id=erstnvram,mem-path=tests/acpi-erst-XXXXXX,size=0x10000,share=on " \
>>>> +    "-device acpi-erst,memdev=erstnvram " \
>>>> +    "-drive id=hd0,if=none,file=%s,format=raw " \
>>>> +    "-device ide-hd,drive=hd0 ", disk
>>>> +
>>>> +static void erst_get_error_log_address_range(void)
>>>> +{
>>>> +    QTestState *qts;
>>>> +    uint64_t log_address_range = 0;
>>>> +    unsigned log_address_length = 0;
>>>> +    unsigned log_address_attr = 0;
>>>> +
>>>> +    qts = qtest_initf(ERST_CMD());
>>>> +
>>>> +    uint64_t base = acpi_find_erst(qts);
>>>> +    g_assert(base != 0);
>>>> +
>>>> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE command */
>>>> +    qtest_writel(qts, base + 0, 0xD);
>>>> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE result */
>>>> +    log_address_range = qtest_readq(qts, base + 8);
>>>> +
>>>> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_LENGTH command */
>>>> +    qtest_writel(qts, base + 0, 0xE);
>>>> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_LENGTH result */
>>>> +    log_address_length = qtest_readq(qts, base + 8);
>>>> +
>>>> +    /* Issue GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES command */
>>>> +    qtest_writel(qts, base + 0, 0xF);
>>>> +    /* Read GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES result */
>>>> +    log_address_attr = qtest_readq(qts, base + 8);
>>>> +
>>>> +    /* Check log_address_range is not 0,~0 or base */
>>>> +    g_assert(log_address_range != base);
>>>> +    g_assert(log_address_range != 0);
>>>> +    g_assert(log_address_range != ~0UL);
>>>> +
>>>> +    /* Check log_address_length is ERST_RECORD_SIZE */
>>>> +    g_assert(log_address_length == (8 * 1024));
>>>> +
>>>> +    /* Check log_address_attr is 0 */
>>>> +    g_assert(log_address_attr == 0);
>>>> +
>>>> +    qtest_quit(qts);
>>>> +}
>>>> +
>>>> +int main(int argc, char **argv)
>>>> +{
>>>> +    int ret;
>>>> +
>>>> +    ret = boot_sector_init(disk);
>>>> +    if (ret) {
>>>> +        return ret;
>>>> +    }
>>>> +
>>>> +    g_test_init(&argc, &argv, NULL);
>>>> +
>>>> +    qtest_add_func("/erst/get-error-log-address-range",
>>>> +                   erst_get_error_log_address_range);
>>>> +
>>>> +    ret = g_test_run();
>>>> +    boot_sector_cleanup(disk);
>>>> +
>>>> +    return ret;
>>>> +}
>>>> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
>>>> index 0c76738..deae443 100644
>>>> --- a/tests/qtest/meson.build
>>>> +++ b/tests/qtest/meson.build
>>>> @@ -66,6 +66,7 @@ qtests_i386 = \
>>>>      (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
>>>>      (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) +   \
>>>>      (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +                 \
>>>> +  (config_all_devices.has_key('CONFIG_ACPI') ? ['erst-test'] : []) +                 \
>>>>      qtests_pci +                                                                              \
>>>>      ['fdc-test',
>>>>       'ide-test',
>>>> @@ -237,6 +238,7 @@ qtests = {
>>>>      'bios-tables-test': [io, 'boot-sector.c', 'acpi-utils.c', 'tpm-emu.c'],
>>>>      'cdrom-test': files('boot-sector.c'),
>>>>      'dbus-vmstate-test': files('migration-helpers.c') + dbus_vmstate1,
>>>> +  'erst-test': files('erst-test.c', 'boot-sector.c', 'acpi-utils.c'),
>>>>      'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
>>>>      'migration-test': files('migration-helpers.c'),
>>>>      'pxe-test': files('boot-sector.c'),
>>>    
>>
> 



* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-07-26 19:52           ` Eric DeVolder
@ 2021-07-27 11:45             ` Igor Mammedov
  2021-07-28 15:16               ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-27 11:45 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, Konrad Wilk, qemu-devel, pbonzini,
	Boris Ostrovsky, Eric Blake, rth

On Mon, 26 Jul 2021 14:52:15 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/26/21 5:06 AM, Igor Mammedov wrote:
> > On Wed, 21 Jul 2021 10:42:33 -0500
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> On 7/19/21 10:02 AM, Igor Mammedov wrote:  
> >>> On Wed, 30 Jun 2021 19:26:39 +0000
> >>> Eric DeVolder <eric.devolder@oracle.com> wrote:
> >>>      
> >>>> Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
> >>>> rather than "non-NVRAM mode", which contradicts everything I stated prior.
> >>>> Eric.
> >>>> ________________________________
> >>>> From: Eric DeVolder <eric.devolder@oracle.com>
> >>>> Sent: Wednesday, June 30, 2021 2:07 PM
> >>>> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
> >>>> Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >>>> Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support
> >>>>
> >>>> Information on the implementation of the ACPI ERST support.
> >>>>
> >>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> >>>> ---
> >>>>    docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>>    1 file changed, 152 insertions(+)
> >>>>    create mode 100644 docs/specs/acpi_erst.txt
> >>>>
> >>>> diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
> >>>> new file mode 100644
> >>>> index 0000000..79f8eb9
> >>>> --- /dev/null
> >>>> +++ b/docs/specs/acpi_erst.txt
> >>>> @@ -0,0 +1,152 @@
> >>>> +ACPI ERST DEVICE
> >>>> +================
> >>>> +
> >>>> +The ACPI ERST device is utilized to support the ACPI Error Record
> >>>> +Serialization Table, ERST, functionality. The functionality is
> >>>> +designed for storing error records in persistent storage for
> >>>> +future reference/debugging.
> >>>> +
> >>>> +The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
> >>>> +(APEI)", and specifically subsection "Error Serialization", outlines
> >>>> +a method for storing error records into persistent storage.
> >>>> +
> >>>> +The format of error records is described in the UEFI specification[2],
> >>>> +in Appendix N "Common Platform Error Record".
> >>>> +
> >>>> +While the ACPI specification allows for an NVRAM "mode" (see
> >>>> +GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
> >>>> +directly exposed for direct access by the OS/guest, this implements
> >>>> +the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
> >>>> +by most BIOS (since flash memory requires programming operations
> >>>> +in order to update its contents). Furthermore, as of the time of this
> >>>> +writing, Linux does not support the non-NVRAM "mode".  
> >>>
> >>> shouldn't it be s/non-NVRAM/NVRAM/ ?  
> >>
> >> Yes, it has been corrected.
> >>  
> >>>      
> >>>> +
> >>>> +
> >>>> +Background/Motivation
> >>>> +---------------------
> >>>> +Linux uses the persistent storage filesystem, pstore, to record
> >>>> +information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
> >>>> +independent of, and runs before, kdump.  In certain scenarios (ie.
> >>>> +hosts/guests with root filesystems on NFS/iSCSI where networking
> >>>> +software and/or hardware fails), pstore may contain the only
> >>>> +information available for post-mortem debugging.  
> >>>
> >>> well,
> >>> it's not the only way; one can use the existing pvpanic device to notify
> >>> the mgmt layer about a crash, and the mgmt layer can take appropriate measures
> >>> for post-mortem debugging, including dumping guest state,
> >>> which is superior to anything pstore can offer as the VM still exists
> >>> and the mgmt layer can inspect the VM's crashed state directly or dump
> >>> necessary parts of it.
> >>>
> >>> So ERST shouldn't be portrayed as the only way here but rather
> >>> as limited alternative to pvpanic in regards to post-mortem debugging
> >>> (it's the only way only on bare-metal).
> >>>
> >>> It would be better to describe here other use-cases you've mentioned
> >>> in earlier reviews, that justify adding alternative to pvpanic.  
> >>
> >> I'm not sure how I would change this. I do say "may contain", which means it
> >> is not the only way. Pvpanic is a way to notify the mgmt layer/host, but
> >> this is a method solely with the guest. Each serves a different purpose;
> >> plugs a different hole.
> >>  
> > 
> > I'd suggest edit  "pstore may contain the only information" as "pstore may contain information"
> >   
> Done
> 
> >> As noted in a separate message, my company has intentions of storing other
> >> data in ERST beyond panics.  
> > perhaps add your use cases here as well.
> >   
> >   
> >>>> +Two common storage backends for the pstore filesystem are ACPI ERST
> >>>> +and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
> >>>> +all guests. With QEMU supporting ACPI ERST, it becomes a viable
> >>>> +pstore storage backend for virtual machines (as it is now for
> >>>> +bare metal machines).
> >>>> +  
> >>>      
> >>>> +Enabling support for ACPI ERST facilitates a consistent method to
> >>>> +capture kernel panic information in a wide range of guests: from
> >>>> +resource-constrained microvms to very large guests, and in
> >>>> +particular, in direct-boot environments (which would lack UEFI
> >>>> +run-time services).  
> >>> this hunk probably not necessary
> >>>      
> >>>> +
> >>>> +Note that Microsoft Windows also utilizes the ACPI ERST for certain
> >>>> +crash information, if available.  
> >>> a pointer to a relevant source would be helpful here.  
> >>
> >> I've included the reference, here for your benefit.
> >> Windows Hardware Error Architecture, specifically Persistence Mechanism
> >> https://docs.microsoft.com/en-us/windows-hardware/drivers/whea/error-record-persistence-mechanism
> >>  
> >>>      
> >>>> +Invocation  
> >>> s/^^/Configuration|Usage/  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> +----------
> >>>> +
> >>>> +To utilize ACPI ERST, a memory-backend-file object and acpi-erst  
> >>> s/utilize/use/  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> +device must be created, for example:  
> >>> s/must/can/  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> +
> >>>> + qemu ...
> >>>> + -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
> >>>> +  size=0x10000,share=on  
> >>> I'd put ^^^ on the same line as -object and use '\' at the end the
> >>> so example could be easily copy-pasted  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> + -device acpi-erst,memdev=erstnvram
> >>>> +
> >>>> +For proper operation, the ACPI ERST device needs a memory-backend-file
> >>>> +object with the following parameters:
> >>>> +
> >>>> + - id: The id of the memory-backend-file object is used to associate
> >>>> +   this memory with the acpi-erst device.
> >>>> + - size: The size of the ACPI ERST backing storage. This parameter is
> >>>> +   required.
> >>>> + - mem-path: The location of the ACPI ERST backing storage file. This
> >>>> +   parameter is also required.
> >>>> + - share: The share=on parameter is required so that updates to the
> >>>> +   ERST back store are written to the file immediately as well. Without
> >>>> +   it, updates to the backing file are unpredictable and may not
> >>>> +   properly persist (eg. if qemu should crash).  
> >>>
> >>> mmap manpage says:
> >>>     MAP_SHARED
> >>>                Updates to the mapping ... are carried through to the underlying file.
> >>> it doesn't guarantee 'written to the file immediately', though.
> >>> So I'd rephrase it to something like that:
> >>>
> >>> - share: The share=on parameter is required so that updates to the ERST back store
> >>>            are written back to the file.  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> +
> >>>> +The ACPI ERST device is a simple PCI device, and requires this one
> >>>> +parameter:  
> >>> s/^.*:/and ERST device:/  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> +
> >>>> + - memdev: Is the object id of the memory-backend-file.
> >>>> +
> >>>> +
> >>>> +PCI Interface
> >>>> +-------------
> >>>> +
> >>>> +The ERST device is a PCI device with two BARs, one for accessing
> >>>> +the programming registers, and the other for accessing the
> >>>> +record exchange buffer.
> >>>> +
> >>>> +BAR0 contains the programming interface consisting of just two
> >>>> +64-bit registers. The two registers are an ACTION (cmd) and a
> >>>> +VALUE (data). All ERST actions/operations/side effects happen  
> >>> s/consisting of... All ERST/consisting of ACTION and VALUE 64-bit registers. All ERST/  
> >>
> >> Corrected
> >>  
> >>>      
> >>>> +on the write to the ACTION, by design. Thus any data needed  
> >>> s/Thus//  
> >> Corrected
> >>  
> >>>      
> >>>> +by the action must be placed into VALUE prior to writing
> >>>> +ACTION. Reading the VALUE simply returns the register contents,
> >>>> +which can be updated by a previous ACTION.  
> >>>      
> >>>> This behavior is
> >>>> +encoded in the ACPI ERST table generated by QEMU.  
> >>> it's too vague, Either drop sentence or add a reference to relevant place in spec.  
> >> Corrected
> >>  
> >>>
> >>>      
> >>>> +
> >>>> +BAR1 contains the record exchange buffer, and the size of this
> >>>> +buffer sets the maximum record size. This record exchange
> >>>> +buffer size is 8KiB.  
> >>> s/^^^/
> >>> BAR1 contains the 8KiB record exchange buffer, which is the implemented maximum record size limit.  
> >> Corrected
> >>  
> >>>
> >>>      
> >>>> +Backing File  
> >>>
> >>> s/^^^/Backing Storage Format/  
> >> Corrected
> >>  
> >>>      
> >>>> +------------  
> >>>
> >>>      
> >>>> +
> >>>> +The ACPI ERST persistent storage is contained within a single backing
> >>>> +file. The size and location of the backing file is specified upon
> >>>> +QEMU startup of the ACPI ERST device.  
> >>>
> >>> I'd drop the above paragraph and describe the file format here;
> >>> ultimately the backend used doesn't have to be a file. For
> >>> example, if the user doesn't need it to persist over QEMU restarts,
> >>> a ram backend could be used (the guest will still be able to see
> >>> its own crash log after the guest is rebooted), or it could be
> >>> a memfd backend passed to QEMU by the mgmt layer.
> >> Dropped
> >>  
> >>>
> >>>      
> >>>> +Records are stored in the backing file in a simple fashion.  
> >>> s/backing file/backend storage/
> >>> ditto for other occurrences  
> >> Corrected
> >>  
> >>>      
> >>>> +The backing file is essentially divided into fixed size
> >>>> +"slots", ERST_RECORD_SIZE in length, with each "slot"
> >>>> +storing a single record.  
> >>>      
> >>>> No attempt at optimizing storage
> >>>> +through compression, compaction, etc is attempted.  
> >>> s/^^^//  
> >>
> >> I'd like to keep this statement. It is there because in a number of
> >> hardware BIOS I tested, these kinds of features lead to bugs in the
> >> ERST support.  
> > this doc is not about issues in other BIOSes, it's about the concrete
> > QEMU impl. So the sentence starting with "No attempt" is not relevant here at all.
> Dropped
> 
> >      
> >>>> +NOTE that any change to this value will make any pre-
> >>>> +existing backing files, not of the same ERST_RECORD_SIZE,
> >>>> +unusable to the guest.  
> >>> when that can happen, can we detect it and error out?  
> >> I've dropped this statement. That value is hard coded, and not a
> >> parameter, so there is no simple way to change it. This comment
> >> does exist next to the ERST_RECORD_SIZE declaration in the code.  
> > 
> > It's not a problem with current impl. but rather with possible
> > future expansion.
> > 
> > If you'd add a header with record size at the start of storage,
> > it wouldn't be issue as ERST would be able to get used record
> > size for storage. That will help with avoiding compat issues
> > later on.  
> I'll go ahead and add the header. I'll put the magic and record size in there,
> but I do not intend to put any data that would be "cached" from the records
> themselves. So no recordids, in particular, will be cached in this header.
Maybe also add the offset of the 1st slot, so whoever comes later
to fix performance issues will have fewer compatibility issues.
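
To make the suggestion concrete, here is a minimal sketch of such a
header holding the magic, the record size, and the offset of the first
slot. Everything in it is an assumption for discussion: the struct
layout, the magic value, the function names, and the choice to reserve
one record-sized region for the header are hypothetical, and the
fields are stored in native byte order, whereas a real on-disk format
would have to fix the endianness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary placeholder value, not taken from any patch. */
#define STORE_MAGIC 0x4552535453544F52ULL

/* Hypothetical header placed at offset 0 of the backend storage. */
struct store_header {
    uint64_t magic;
    uint32_t record_size;       /* size of each slot in bytes */
    uint32_t first_slot_offset; /* byte offset of slot 0 */
};

/* Format fresh storage: write the header, slots start one record in. */
static bool store_header_init(uint8_t *store, size_t store_len,
                              uint32_t record_size)
{
    struct store_header h;

    if (record_size < sizeof(h) || store_len < 2 * (size_t)record_size) {
        return false; /* no room for the header plus at least one slot */
    }
    h.magic = STORE_MAGIC;
    h.record_size = record_size;
    h.first_slot_offset = record_size;
    memcpy(store, &h, sizeof(h));
    return true;
}

/* Read and sanity-check the header of existing storage. */
static bool store_header_load(const uint8_t *store, size_t store_len,
                              struct store_header *out)
{
    assert(out != NULL);
    if (store_len < sizeof(*out)) {
        return false;
    }
    memcpy(out, store, sizeof(*out));
    if (out->magic != STORE_MAGIC || out->record_size == 0) {
        return false;
    }
    if (out->first_slot_offset < sizeof(*out) ||
        out->first_slot_offset > store_len) {
        return false;
    }
    return true;
}

/* Number of whole slots that fit after the header. */
static size_t store_slot_count(const struct store_header *h, size_t store_len)
{
    return (store_len - h->first_slot_offset) / h->record_size;
}
```

With the 0x10000-byte backend from the example command line and 8 KiB
records, this layout would leave seven slots after the header. Because
the record size and slot offset are read back from the storage, a later
implementation could change either without breaking existing files.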

> 
> I'm not even sure I want to record/cache the number of records because:
>   - it has almost no use (undermined by the fact overall storage size is not exposed, imho)
>   - we backed off requiring the share=on, so it is conceivable this header value could
>     encounter data integrity issues, should a guest crash...
A guest crash won't affect the data, and if the backend is not shared,
the data won't be stored persistently anyway; it will live only for the
lifetime of the QEMU instance.
The only time integrity is affected is a host crash, and we already
agreed that we don't care about that case.

>   - scans still happen (see next)
> 
> While having it (number of records cached in header) would avoid a startup scan
> to compute it, the write operation requires a scan to determine if the incoming
> recordid is present or not, in order to determine overwrite or allocate-and-write.
If the present/not-present status of each slot is cached, you don't have to do
an expensive full scan when the guest scans slots.
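
As a sketch of what is meant: a small in-memory table, filled once,
can answer the "overwrite or allocate" question without touching the
records in storage. The names, the fixed slot count, and the layout
below are all hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8 /* hypothetical, e.g. 64 KiB of storage / 8 KiB slots */

/* Per-slot status kept in memory instead of re-reading each record. */
struct slot_cache {
    bool     used[CACHE_SLOTS];
    uint64_t id[CACHE_SLOTS];
};

/* Slot currently holding record_id, or -1 if it is not present. */
static int cache_find(const struct slot_cache *c, uint64_t record_id)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (c->used[i] && c->id[i] == record_id) {
            return i;
        }
    }
    return -1;
}

/* First unoccupied slot, or -1 if the storage is full. */
static int cache_first_free(const struct slot_cache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!c->used[i]) {
            return i;
        }
    }
    return -1;
}

/* Overwrite the slot of an existing id, else allocate; -1 when full. */
static int cache_slot_for_write(struct slot_cache *c, uint64_t record_id)
{
    int slot;

    assert(c != NULL);
    slot = cache_find(c, record_id);
    if (slot < 0) {
        slot = cache_first_free(c);
    }
    if (slot >= 0) {
        c->used[slot] = true;
        c->id[slot] = record_id;
    }
    return slot;
}

/* Clearing a record only flips the cached status for its slot. */
static void cache_clear(struct slot_cache *c, uint64_t record_id)
{
    int slot = cache_find(c, record_id);

    if (slot >= 0) {
        c->used[slot] = false;
    }
}
```

The lookups are still linear, but over a few bytes per slot in memory
rather than over the backend storage itself.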

> And typically the first operation that Linux does is effectively a scan to
> populate the /sys/fs/pstore entries via the GET_RECORD_IDENTIFIER action.
> 
> And the typical size of the ERST storage [on hardware systems] is 64 to 128KiB;
> so not much storage to examine, especially since only looking at 12 bytes of each
> 8KiB record.
> 
> I'd still be in favor of putting an upper bound on the hostmem object, to avoid
> your worst case fears...

Considering the device is not present by default, I give up on
trying to convince you to design it efficiently.

If one wishes to use this with container-like workloads
where fast startup matters, one will have to live with crappy
performance or rewrite your impl.

> >>>> +Below is an example layout of the backing store file.
> >>>> +The size of the file is a multiple of ERST_RECORD_SIZE,
> >>>> +and contains N number of "slots" to store records. The
> >>>> +example below shows two records (in CPER format) in the
> >>>> +backing file, while the remaining slots are empty/
> >>>> +available.
> >>>> +
> >>>> + Slot   Record
> >>>> +        +--------------------------------------------+
> >>>> +    0   | empty/available                            |
> >>>> +        +--------------------------------------------+
> >>>> +    1   | CPER                                       |
> >>>> +        +--------------------------------------------+
> >>>> +    2   | CPER                                       |
> >>>> +        +--------------------------------------------+
> >>>> +  ...   |                                            |
> >>>> +        +--------------------------------------------+
> >>>> +    N   | empty/available                            |
> >>>> +        +--------------------------------------------+
> >>>> +        <-------------- ERST_RECORD_SIZE ------------>  
> >>>
> >>>      
> >>>> +Not all slots need to be occupied, and they need not be
> >>>> +occupied in a contiguous fashion. The ability to clear/erase
> >>>> +specific records allows for the formation of unoccupied
> >>>> +slots.  
> >>> I'd drop this as not necessary  
> >>
> >> I'd like to keep this statement. Again, several BIOS on which I tested
> >> ERST had bugs around non-contiguous record storage.  
> > 
> > I'd drop this and alter sentence above ending with " in a simple fashion."
> > to describe sparse usage of storage and then after that comes example diagram.  
> Done
> 
> > 
> > I'd like this part to start with unambiguous concise description of
> > format and to be finished with example diagram.
> > It is the part that will be considered the only true source
> > for how the file should be formatted, when it comes to fixing bugs or
> > modifying original impl. later on. So it's important to have clear
> > description without any unnecessary information here.  
> Done
> 
> > 
> >   
> >>>
> >>>      
> >>>> +
> >>>> +
> >>>> +References
> >>>> +----------
> >>>> +
> >>>> +[1] "Advanced Configuration and Power Interface Specification",
> >>>> +    version 4.0, June 2009.
> >>>> +
> >>>> +[2] "Unified Extensible Firmware Interface Specification",
> >>>> +    version 2.1, October 2008.
> >>>> +
> >>>> --
> >>>> 1.8.3.1
> >>>>     
> >>>      
> >>  
> >   
> 




* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-07-26 20:02         ` Eric DeVolder
@ 2021-07-27 12:01           ` Igor Mammedov
  2021-07-28 15:18             ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-27 12:01 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Mon, 26 Jul 2021 15:02:55 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

[...]
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_END_OPERATION:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
> >>>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
> >>>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_RESERVED:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
> >>>> +            break;
> >>>> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
> >>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
> >>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
> >>>> +        default:
> >>>> +            build_serialization_instruction_entry(table_data, action,
> >>>> +                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
> >>>> +            break;
> >>>> +        }

../../builds/imammedo/qemu/hw/acpi/erst.c: In function ‘build_erst’:
../../builds/imammedo/qemu/hw/acpi/erst.c:754:13: error: this statement may fall through [-Werror=implicit-fallthrough=]
             build_serialization_instruction_entry(table_data, action,
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 ACPI_ERST_INST_READ_REGISTER       , 0, 64,
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 s->bar0 + ERST_CSR_VALUE, 0, MASK64);
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../builds/imammedo/qemu/hw/acpi/erst.c:757:9: note: here
         default:
         ^~~~~~~
cc1: all warnings being treated as errors
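
A minimal sketch of the likely fix, assuming the fall-through into "default:" is unintentional: end the GET_EXECUTE_OPERATION_TIMINGS case with a break. The constants and the counter below are hypothetical stand-ins, not code from the patch; each increment represents one build_serialization_instruction_entry() call.

```c
#include <assert.h>

/* Hypothetical stand-ins for the ACPI_ERST_ACTION_* constants in the patch. */
enum {
    SKETCH_ACTION_GET_EXECUTE_OPERATION_TIMINGS = 0x10,
    SKETCH_ACTION_UNHANDLED = 0x7f,
};

/*
 * Returns how many instruction entries the switch would emit for one action.
 * Each counter increment stands in for one
 * build_serialization_instruction_entry() call.
 */
static int sketch_entries_for_action(int action)
{
    int entries = 0;

    switch (action) {
    case SKETCH_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
        entries++;  /* WRITE_REGISTER_VALUE to the action register */
        entries++;  /* READ_REGISTER from the value register */
        break;      /* absent in the quoted patch, hence the warning */
    default:
        entries++;  /* single NOOP entry for an unhandled action */
        break;
    }
    return entries;
}

/* With the break in place, a handled action does not gain a NOOP entry. */
static void sketch_check_entry_counts(void)
{
    assert(sketch_entries_for_action(
               SKETCH_ACTION_GET_EXECUTE_OPERATION_TIMINGS) == 2);
    assert(sketch_entries_for_action(SKETCH_ACTION_UNHANDLED) == 1);
}
```

Without the break, the handled action would also pick up the NOOP entry from the default branch, so the first count would be 3 instead of 2.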


> >>>> +    }
> >>>> +    build_header(linker, table_data,
> >>>> +                 (void *)(table_data->data + erst_start),
> >>>> +                 "ERST", table_data->len - erst_start,
> >>>> +                 1, oem_id, oem_table_id);
> >>>> +}
> >>>> +
> >>>> +/*******************************************************************/
> >>>> +/*******************************************************************/
> >>>> +
> >>>>    static const VMStateDescription erst_vmstate  = {
> >>>>        .name = "acpi-erst",
> >>>>        .version_id = 1,  
> >>>      
> >>  
> >   
> 




* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-26 20:01         ` Eric DeVolder
@ 2021-07-27 12:52           ` Igor Mammedov
  2021-08-04 22:13             ` Eric DeVolder
  0 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-27 12:52 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Mon, 26 Jul 2021 15:01:05 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/26/21 5:42 AM, Igor Mammedov wrote:
> > On Wed, 21 Jul 2021 11:07:40 -0500
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> On 7/20/21 7:17 AM, Igor Mammedov wrote:  
> >>> On Wed, 30 Jun 2021 15:07:16 -0400
> >>> Eric DeVolder <eric.devolder@oracle.com> wrote:
> >>>      
[..]
> >>>> +
> >>>> +static uint64_t erst_rd_reg64(hwaddr addr,
> >>>> +    uint64_t reg, unsigned size)
> >>>> +{
> >>>> +    uint64_t rdval;
> >>>> +    uint64_t mask;
> >>>> +    unsigned shift;
> >>>> +
> >>>> +    if (size == sizeof(uint64_t)) {
> >>>> +        /* 64b access */
> >>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> >>>> +        shift = 0;
> >>>> +    } else {
> >>>> +        /* 32b access */
> >>>> +        mask = 0x00000000FFFFFFFFUL;
> >>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> >>>> +    }
> >>>> +
> >>>> +    rdval = reg;
> >>>> +    rdval >>= shift;
> >>>> +    rdval &= mask;
> >>>> +
> >>>> +    return rdval;
> >>>> +}
> >>>> +
> >>>> +static uint64_t erst_wr_reg64(hwaddr addr,
> >>>> +    uint64_t reg, uint64_t val, unsigned size)
> >>>> +{
> >>>> +    uint64_t wrval;
> >>>> +    uint64_t mask;
> >>>> +    unsigned shift;
> >>>> +
> >>>> +    if (size == sizeof(uint64_t)) {
> >>>> +        /* 64b access */
> >>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> >>>> +        shift = 0;
> >>>> +    } else {
> >>>> +        /* 32b access */
> >>>> +        mask = 0x00000000FFFFFFFFUL;
> >>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> >>>> +    }
> >>>> +
> >>>> +    val &= mask;
> >>>> +    val <<= shift;
> >>>> +    mask <<= shift;
> >>>> +    wrval = reg;
> >>>> +    wrval &= ~mask;
> >>>> +    wrval |= val;
> >>>> +
> >>>> +    return wrval;
> >>>> +}  
> >>> (I see in the next patch that it's us defining the access width in the ACPI tables)
> >>> So the question is: do we have to have mixed register width access?
> >>> Can't all register accesses be 64-bit?
> >>
> >> Initially I attempted to just use 64-bit exclusively. The problem is that,
> >> for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
> >> register access into two. Here's an example of reading the exchange buffer
> >> address, which is coded as a 64b access:
> >>
> >> acpi_erst_reg_write addr: 0x0000 <== 0x000000000000000d (size: 4)
> >> acpi_erst_reg_read  addr: 0x0008 ==> 0x00000000c1010000 (size: 4)
> >> acpi_erst_reg_read  addr: 0x000c ==> 0x0000000000000000 (size: 4)
> >>
> >> So I went ahead and made ACTION register accesses 32b, else there would
> >> be two reads of 32 bits, of which the second is useless.
> > 
> > Could you post the decompiled ERST table here?
> As it is long, I posted it to the end of this message.

Both RHEL8 and Fedora 34 report that the ERST is an invalid table,
so I can't help trace what is going on there.

You'll have to figure out on your own why the access is not 64-bit.
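
For reference, the effect of the split access in the trace above can be reproduced with standalone copies of the quoted erst_rd_reg64()/erst_wr_reg64() helpers. This is only a sketch: the sketch_ names are hypothetical, hwaddr is replaced by uint64_t, and the register value is the one shown in the trace.

```c
#include <assert.h>
#include <stdint.h>

/* Standalone copy of the quoted read helper; hwaddr becomes uint64_t. */
static uint64_t sketch_rd_reg64(uint64_t addr, uint64_t reg, unsigned size)
{
    /* Anything other than an 8-byte access is treated as a 32-bit access. */
    uint64_t mask = (size == sizeof(uint64_t)) ? UINT64_MAX : UINT32_MAX;
    /* A 32-bit access at offset +4 selects the upper half of the register. */
    unsigned shift = (size != sizeof(uint64_t) && (addr & 0x4)) ? 32 : 0;

    return (reg >> shift) & mask;
}

/* Standalone copy of the quoted write helper: merges val into reg. */
static uint64_t sketch_wr_reg64(uint64_t addr, uint64_t reg, uint64_t val,
                                unsigned size)
{
    uint64_t mask = (size == sizeof(uint64_t)) ? UINT64_MAX : UINT32_MAX;
    unsigned shift = (size != sizeof(uint64_t) && (addr & 0x4)) ? 32 : 0;

    val = (val & mask) << shift;
    mask <<= shift;
    return (reg & ~mask) | val;
}

static void sketch_check_split_access(void)
{
    const uint64_t value_reg = 0x00000000c1010000ULL; /* from the trace */
    uint64_t reg = 0;

    /* Two 4-byte reads at 0x8 and 0xc return the low and high halves. */
    assert(sketch_rd_reg64(0x8, value_reg, 4) == 0xc1010000ULL);
    assert(sketch_rd_reg64(0xc, value_reg, 4) == 0);
    /* A single 8-byte read returns the whole register. */
    assert(sketch_rd_reg64(0x8, value_reg, 8) == value_reg);

    /* Two 4-byte writes rebuild a 64-bit value one half at a time. */
    reg = sketch_wr_reg64(0x8, reg, 0x89abcdefULL, 4);
    reg = sketch_wr_reg64(0xc, reg, 0x01234567ULL, 4);
    assert(reg == 0x0123456789abcdefULL);
}
```

The second 4-byte read in the trace returns 0 only because the upper half of the exchange buffer address happens to be zero.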

[...]

> Obtained via a running guest with:
> iasl -d -vt /sys/firmware/acpi/tables/ERST
> 
> /*
>   * Intel ACPI Component Architecture
>   * AML/ASL+ Disassembler version 20180105 (64-bit version)
>   * Copyright (c) 2000 - 2018 Intel Corporation
>   *
>   * Disassembly of ERST.blob, Mon Jul 26 14:31:21 2021
>   *
>   * ACPI Data Table [ERST]
>   *
>   * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
>   */
> 
> [000h 0000   4]                    Signature : "ERST"    [Error Record Serialization Table]
> [004h 0004   4]                 Table Length : 00000390
> [008h 0008   1]                     Revision : 01
> [009h 0009   1]                     Checksum : C8
> [00Ah 0010   6]                       Oem ID : "BOCHS "
> [010h 0016   8]                 Oem Table ID : "BXPC    "
> [018h 0024   4]                 Oem Revision : 00000001
> [01Ch 0028   4]              Asl Compiler ID : "BXPC"
> [020h 0032   4]        Asl Compiler Revision : 00000001
> 
> [024h 0036   4]  Serialization Header Length : 00000030
> [028h 0040   4]                     Reserved : 00000000
> [02Ch 0044   4]      Instruction Entry Count : 0000001B
> 
> [030h 0048   1]                       Action : 00 [Begin Write Operation]
> [031h 0049   1]                  Instruction : 03 [Write Register Value]
> [032h 0050   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [033h 0051   1]                     Reserved : 00
> 
> [034h 0052  12]              Register Region : [Generic Address Structure]
> [034h 0052   1]                     Space ID : 00 [SystemMemory]
> [035h 0053   1]                    Bit Width : 20
> [036h 0054   1]                   Bit Offset : 00
> [037h 0055   1]         Encoded Access Width : 03 [DWord Access:32]
> [038h 0056   8]                      Address : 00000000C1063000
> 
> [040h 0064   8]                        Value : 0000000000000000
> [048h 0072   8]                         Mask : 00000000000000FF
> 
> [050h 0080   1]                       Action : 01 [Begin Read Operation]
> [051h 0081   1]                  Instruction : 03 [Write Register Value]
> [052h 0082   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [053h 0083   1]                     Reserved : 00
> 
> [054h 0084  12]              Register Region : [Generic Address Structure]
> [054h 0084   1]                     Space ID : 00 [SystemMemory]
> [055h 0085   1]                    Bit Width : 20
> [056h 0086   1]                   Bit Offset : 00
> [057h 0087   1]         Encoded Access Width : 03 [DWord Access:32]
> [058h 0088   8]                      Address : 00000000C1063000
> 
> [060h 0096   8]                        Value : 0000000000000001
> [068h 0104   8]                         Mask : 00000000000000FF
> 
> [070h 0112   1]                       Action : 02 [Begin Clear Operation]
> [071h 0113   1]                  Instruction : 03 [Write Register Value]
> [072h 0114   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [073h 0115   1]                     Reserved : 00
> 
> [074h 0116  12]              Register Region : [Generic Address Structure]
> [074h 0116   1]                     Space ID : 00 [SystemMemory]
> [075h 0117   1]                    Bit Width : 20
> [076h 0118   1]                   Bit Offset : 00
> [077h 0119   1]         Encoded Access Width : 03 [DWord Access:32]
> [078h 0120   8]                      Address : 00000000C1063000
> 
> [080h 0128   8]                        Value : 0000000000000002
> [088h 0136   8]                         Mask : 00000000000000FF
> 
> [090h 0144   1]                       Action : 03 [End Operation]
> [091h 0145   1]                  Instruction : 03 [Write Register Value]
> [092h 0146   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [093h 0147   1]                     Reserved : 00
> 
> [094h 0148  12]              Register Region : [Generic Address Structure]
> [094h 0148   1]                     Space ID : 00 [SystemMemory]
> [095h 0149   1]                    Bit Width : 20
> [096h 0150   1]                   Bit Offset : 00
> [097h 0151   1]         Encoded Access Width : 03 [DWord Access:32]
> [098h 0152   8]                      Address : 00000000C1063000
> 
> [0A0h 0160   8]                        Value : 0000000000000003
> [0A8h 0168   8]                         Mask : 00000000000000FF
> 
> [0B0h 0176   1]                       Action : 04 [Set Record Offset]
> [0B1h 0177   1]                  Instruction : 02 [Write Register]
> [0B2h 0178   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [0B3h 0179   1]                     Reserved : 00
> 
> [0B4h 0180  12]              Register Region : [Generic Address Structure]
> [0B4h 0180   1]                     Space ID : 00 [SystemMemory]
> [0B5h 0181   1]                    Bit Width : 20
> [0B6h 0182   1]                   Bit Offset : 00
> [0B7h 0183   1]         Encoded Access Width : 03 [DWord Access:32]
> [0B8h 0184   8]                      Address : 00000000C1063008
> 
> [0C0h 0192   8]                        Value : 0000000000000000
> [0C8h 0200   8]                         Mask : 00000000FFFFFFFF
> 
> [0D0h 0208   1]                       Action : 04 [Set Record Offset]
> [0D1h 0209   1]                  Instruction : 03 [Write Register Value]
> [0D2h 0210   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [0D3h 0211   1]                     Reserved : 00
> 
> [0D4h 0212  12]              Register Region : [Generic Address Structure]
> [0D4h 0212   1]                     Space ID : 00 [SystemMemory]
> [0D5h 0213   1]                    Bit Width : 20
> [0D6h 0214   1]                   Bit Offset : 00
> [0D7h 0215   1]         Encoded Access Width : 03 [DWord Access:32]
> [0D8h 0216   8]                      Address : 00000000C1063000
> 
> [0E0h 0224   8]                        Value : 0000000000000004
> [0E8h 0232   8]                         Mask : 00000000000000FF
> 
> [0F0h 0240   1]                       Action : 05 [Execute Operation]
> [0F1h 0241   1]                  Instruction : 03 [Write Register Value]
> [0F2h 0242   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [0F3h 0243   1]                     Reserved : 00
> 
> [0F4h 0244  12]              Register Region : [Generic Address Structure]
> [0F4h 0244   1]                     Space ID : 00 [SystemMemory]
> [0F5h 0245   1]                    Bit Width : 20
> [0F6h 0246   1]                   Bit Offset : 00
> [0F7h 0247   1]         Encoded Access Width : 03 [DWord Access:32]
> [0F8h 0248   8]                      Address : 00000000C1063008
> 
> [100h 0256   8]                        Value : 000000000000009C
> [108h 0264   8]                         Mask : 00000000000000FF
> 
> [110h 0272   1]                       Action : 05 [Execute Operation]
> [111h 0273   1]                  Instruction : 03 [Write Register Value]
> [112h 0274   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [113h 0275   1]                     Reserved : 00
> 
> [114h 0276  12]              Register Region : [Generic Address Structure]
> [114h 0276   1]                     Space ID : 00 [SystemMemory]
> [115h 0277   1]                    Bit Width : 20
> [116h 0278   1]                   Bit Offset : 00
> [117h 0279   1]         Encoded Access Width : 03 [DWord Access:32]
> [118h 0280   8]                      Address : 00000000C1063000
> 
> [120h 0288   8]                        Value : 0000000000000005
> [128h 0296   8]                         Mask : 00000000000000FF
> 
> [130h 0304   1]                       Action : 06 [Check Busy Status]
> [131h 0305   1]                  Instruction : 03 [Write Register Value]
> [132h 0306   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [133h 0307   1]                     Reserved : 00
> 
> [134h 0308  12]              Register Region : [Generic Address Structure]
> [134h 0308   1]                     Space ID : 00 [SystemMemory]
> [135h 0309   1]                    Bit Width : 20
> [136h 0310   1]                   Bit Offset : 00
> [137h 0311   1]         Encoded Access Width : 03 [DWord Access:32]
> [138h 0312   8]                      Address : 00000000C1063000
> 
> [140h 0320   8]                        Value : 0000000000000006
> [148h 0328   8]                         Mask : 00000000000000FF
> 
> [150h 0336   1]                       Action : 06 [Check Busy Status]
> [151h 0337   1]                  Instruction : 01 [Read Register Value]
> [152h 0338   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [153h 0339   1]                     Reserved : 00
> 
> [154h 0340  12]              Register Region : [Generic Address Structure]
> [154h 0340   1]                     Space ID : 00 [SystemMemory]
> [155h 0341   1]                    Bit Width : 20
> [156h 0342   1]                   Bit Offset : 00
> [157h 0343   1]         Encoded Access Width : 03 [DWord Access:32]
> [158h 0344   8]                      Address : 00000000C1063008
> 
> [160h 0352   8]                        Value : 0000000000000001
> [168h 0360   8]                         Mask : 00000000000000FF
> 
> [170h 0368   1]                       Action : 07 [Get Command Status]
> [171h 0369   1]                  Instruction : 03 [Write Register Value]
> [172h 0370   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [173h 0371   1]                     Reserved : 00
> 
> [174h 0372  12]              Register Region : [Generic Address Structure]
> [174h 0372   1]                     Space ID : 00 [SystemMemory]
> [175h 0373   1]                    Bit Width : 20
> [176h 0374   1]                   Bit Offset : 00
> [177h 0375   1]         Encoded Access Width : 03 [DWord Access:32]
> [178h 0376   8]                      Address : 00000000C1063000
> 
> [180h 0384   8]                        Value : 0000000000000007
> [188h 0392   8]                         Mask : 00000000000000FF
> 
> [190h 0400   1]                       Action : 07 [Get Command Status]
> [191h 0401   1]                  Instruction : 00 [Read Register]
> [192h 0402   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [193h 0403   1]                     Reserved : 00
> 
> [194h 0404  12]              Register Region : [Generic Address Structure]
> [194h 0404   1]                     Space ID : 00 [SystemMemory]
> [195h 0405   1]                    Bit Width : 20
> [196h 0406   1]                   Bit Offset : 00
> [197h 0407   1]         Encoded Access Width : 03 [DWord Access:32]
> [198h 0408   8]                      Address : 00000000C1063008
> 
> [1A0h 0416   8]                        Value : 0000000000000000
> [1A8h 0424   8]                         Mask : 00000000000000FF
> 
> [1B0h 0432   1]                       Action : 08 [Get Record Identifier]
> [1B1h 0433   1]                  Instruction : 03 [Write Register Value]
> [1B2h 0434   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [1B3h 0435   1]                     Reserved : 00
> 
> [1B4h 0436  12]              Register Region : [Generic Address Structure]
> [1B4h 0436   1]                     Space ID : 00 [SystemMemory]
> [1B5h 0437   1]                    Bit Width : 20
> [1B6h 0438   1]                   Bit Offset : 00
> [1B7h 0439   1]         Encoded Access Width : 03 [DWord Access:32]
> [1B8h 0440   8]                      Address : 00000000C1063000
> 
> [1C0h 0448   8]                        Value : 0000000000000008
> [1C8h 0456   8]                         Mask : 00000000000000FF
> 
> [1D0h 0464   1]                       Action : 08 [Get Record Identifier]
> [1D1h 0465   1]                  Instruction : 00 [Read Register]
> [1D2h 0466   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [1D3h 0467   1]                     Reserved : 00
> 
> [1D4h 0468  12]              Register Region : [Generic Address Structure]
> [1D4h 0468   1]                     Space ID : 00 [SystemMemory]
> [1D5h 0469   1]                    Bit Width : 40
> [1D6h 0470   1]                   Bit Offset : 00
> [1D7h 0471   1]         Encoded Access Width : 04 [QWord Access:64]
> [1D8h 0472   8]                      Address : 00000000C1063008
> 
> [1E0h 0480   8]                        Value : 0000000000000000
> [1E8h 0488   8]                         Mask : FFFFFFFFFFFFFFFF
> 
> [1F0h 0496   1]                       Action : 09 [Set Record Identifier]
> [1F1h 0497   1]                  Instruction : 02 [Write Register]
> [1F2h 0498   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [1F3h 0499   1]                     Reserved : 00
> 
> [1F4h 0500  12]              Register Region : [Generic Address Structure]
> [1F4h 0500   1]                     Space ID : 00 [SystemMemory]
> [1F5h 0501   1]                    Bit Width : 40
> [1F6h 0502   1]                   Bit Offset : 00
> [1F7h 0503   1]         Encoded Access Width : 04 [QWord Access:64]
> [1F8h 0504   8]                      Address : 00000000C1063008
> 
> [200h 0512   8]                        Value : 0000000000000000
> [208h 0520   8]                         Mask : FFFFFFFFFFFFFFFF
> 
> [210h 0528   1]                       Action : 09 [Set Record Identifier]
> [211h 0529   1]                  Instruction : 03 [Write Register Value]
> [212h 0530   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [213h 0531   1]                     Reserved : 00
> 
> [214h 0532  12]              Register Region : [Generic Address Structure]
> [214h 0532   1]                     Space ID : 00 [SystemMemory]
> [215h 0533   1]                    Bit Width : 20
> [216h 0534   1]                   Bit Offset : 00
> [217h 0535   1]         Encoded Access Width : 03 [DWord Access:32]
> [218h 0536   8]                      Address : 00000000C1063000
> 
> [220h 0544   8]                        Value : 0000000000000009
> [228h 0552   8]                         Mask : 00000000000000FF
> 
> [230h 0560   1]                       Action : 0A [Get Record Count]
> [231h 0561   1]                  Instruction : 03 [Write Register Value]
> [232h 0562   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [233h 0563   1]                     Reserved : 00
> 
> [234h 0564  12]              Register Region : [Generic Address Structure]
> [234h 0564   1]                     Space ID : 00 [SystemMemory]
> [235h 0565   1]                    Bit Width : 20
> [236h 0566   1]                   Bit Offset : 00
> [237h 0567   1]         Encoded Access Width : 03 [DWord Access:32]
> [238h 0568   8]                      Address : 00000000C1063000
> 
> [240h 0576   8]                        Value : 000000000000000A
> [248h 0584   8]                         Mask : 00000000000000FF
> 
> [250h 0592   1]                       Action : 0A [Get Record Count]
> [251h 0593   1]                  Instruction : 00 [Read Register]
> [252h 0594   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [253h 0595   1]                     Reserved : 00
> 
> [254h 0596  12]              Register Region : [Generic Address Structure]
> [254h 0596   1]                     Space ID : 00 [SystemMemory]
> [255h 0597   1]                    Bit Width : 20
> [256h 0598   1]                   Bit Offset : 00
> [257h 0599   1]         Encoded Access Width : 03 [DWord Access:32]
> [258h 0600   8]                      Address : 00000000C1063008
> 
> [260h 0608   8]                        Value : 0000000000000000
> [268h 0616   8]                         Mask : 00000000FFFFFFFF
> 
> [270h 0624   1]                       Action : 0B [Begin Dummy Write]
> [271h 0625   1]                  Instruction : 03 [Write Register Value]
> [272h 0626   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [273h 0627   1]                     Reserved : 00
> 
> [274h 0628  12]              Register Region : [Generic Address Structure]
> [274h 0628   1]                     Space ID : 00 [SystemMemory]
> [275h 0629   1]                    Bit Width : 20
> [276h 0630   1]                   Bit Offset : 00
> [277h 0631   1]         Encoded Access Width : 03 [DWord Access:32]
> [278h 0632   8]                      Address : 00000000C1063000
> 
> [280h 0640   8]                        Value : 000000000000000B
> [288h 0648   8]                         Mask : 00000000000000FF
> 
> [290h 0656   1]                       Action : 0D [Get Error Address Range]
> [291h 0657   1]                  Instruction : 03 [Write Register Value]
> [292h 0658   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [293h 0659   1]                     Reserved : 00
> 
> [294h 0660  12]              Register Region : [Generic Address Structure]
> [294h 0660   1]                     Space ID : 00 [SystemMemory]
> [295h 0661   1]                    Bit Width : 20
> [296h 0662   1]                   Bit Offset : 00
> [297h 0663   1]         Encoded Access Width : 03 [DWord Access:32]
> [298h 0664   8]                      Address : 00000000C1063000
> 
> [2A0h 0672   8]                        Value : 000000000000000D
> [2A8h 0680   8]                         Mask : 00000000000000FF
> 
> [2B0h 0688   1]                       Action : 0D [Get Error Address Range]
> [2B1h 0689   1]                  Instruction : 00 [Read Register]
> [2B2h 0690   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [2B3h 0691   1]                     Reserved : 00
> 
> [2B4h 0692  12]              Register Region : [Generic Address Structure]
> [2B4h 0692   1]                     Space ID : 00 [SystemMemory]
> [2B5h 0693   1]                    Bit Width : 40
> [2B6h 0694   1]                   Bit Offset : 00
> [2B7h 0695   1]         Encoded Access Width : 04 [QWord Access:64]
> [2B8h 0696   8]                      Address : 00000000C1063008
> 
> [2C0h 0704   8]                        Value : 0000000000000000
> [2C8h 0712   8]                         Mask : FFFFFFFFFFFFFFFF
> 
> [2D0h 0720   1]                       Action : 0E [Get Error Address Length]
> [2D1h 0721   1]                  Instruction : 03 [Write Register Value]
> [2D2h 0722   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [2D3h 0723   1]                     Reserved : 00
> 
> [2D4h 0724  12]              Register Region : [Generic Address Structure]
> [2D4h 0724   1]                     Space ID : 00 [SystemMemory]
> [2D5h 0725   1]                    Bit Width : 20
> [2D6h 0726   1]                   Bit Offset : 00
> [2D7h 0727   1]         Encoded Access Width : 03 [DWord Access:32]
> [2D8h 0728   8]                      Address : 00000000C1063000
> 
> [2E0h 0736   8]                        Value : 000000000000000E
> [2E8h 0744   8]                         Mask : 00000000000000FF
> 
> [2F0h 0752   1]                       Action : 0E [Get Error Address Length]
> [2F1h 0753   1]                  Instruction : 00 [Read Register]
> [2F2h 0754   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [2F3h 0755   1]                     Reserved : 00
> 
> [2F4h 0756  12]              Register Region : [Generic Address Structure]
> [2F4h 0756   1]                     Space ID : 00 [SystemMemory]
> [2F5h 0757   1]                    Bit Width : 40
> [2F6h 0758   1]                   Bit Offset : 00
> [2F7h 0759   1]         Encoded Access Width : 04 [QWord Access:64]
> [2F8h 0760   8]                      Address : 00000000C1063008
> 
> [300h 0768   8]                        Value : 0000000000000000
> [308h 0776   8]                         Mask : 00000000FFFFFFFF
> 
> [310h 0784   1]                       Action : 0F [Get Error Attributes]
> [311h 0785   1]                  Instruction : 03 [Write Register Value]
> [312h 0786   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [313h 0787   1]                     Reserved : 00
> 
> [314h 0788  12]              Register Region : [Generic Address Structure]
> [314h 0788   1]                     Space ID : 00 [SystemMemory]
> [315h 0789   1]                    Bit Width : 20
> [316h 0790   1]                   Bit Offset : 00
> [317h 0791   1]         Encoded Access Width : 03 [DWord Access:32]
> [318h 0792   8]                      Address : 00000000C1063000
> 
> [320h 0800   8]                        Value : 000000000000000F
> [328h 0808   8]                         Mask : 00000000000000FF
> 
> [330h 0816   1]                       Action : 0F [Get Error Attributes]
> [331h 0817   1]                  Instruction : 00 [Read Register]
> [332h 0818   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [333h 0819   1]                     Reserved : 00
> 
> [334h 0820  12]              Register Region : [Generic Address Structure]
> [334h 0820   1]                     Space ID : 00 [SystemMemory]
> [335h 0821   1]                    Bit Width : 20
> [336h 0822   1]                   Bit Offset : 00
> [337h 0823   1]         Encoded Access Width : 03 [DWord Access:32]
> [338h 0824   8]                      Address : 00000000C1063008
> 
> [340h 0832   8]                        Value : 0000000000000000
> [348h 0840   8]                         Mask : 00000000FFFFFFFF
> 
> [350h 0848   1]                       Action : 10 [Execute Timings]
> [351h 0849   1]                  Instruction : 03 [Write Register Value]
> [352h 0850   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [353h 0851   1]                     Reserved : 00
> 
> [354h 0852  12]              Register Region : [Generic Address Structure]
> [354h 0852   1]                     Space ID : 00 [SystemMemory]
> [355h 0853   1]                    Bit Width : 20
> [356h 0854   1]                   Bit Offset : 00
> [357h 0855   1]         Encoded Access Width : 03 [DWord Access:32]
> [358h 0856   8]                      Address : 00000000C1063000
> 
> [360h 0864   8]                        Value : 0000000000000010
> [368h 0872   8]                         Mask : 00000000000000FF
> 
> [370h 0880   1]                       Action : 10 [Execute Timings]
> [371h 0881   1]                  Instruction : 00 [Read Register]
> [372h 0882   1]        Flags (decoded below) : 00
>                        Preserve Register Bits : 0
> [373h 0883   1]                     Reserved : 00
> 
> [374h 0884  12]              Register Region : [Generic Address Structure]
> [374h 0884   1]                     Space ID : 00 [SystemMemory]
> [375h 0885   1]                    Bit Width : 40
> [376h 0886   1]                   Bit Offset : 00
> [377h 0887   1]         Encoded Access Width : 04 [QWord Access:64]
> [378h 0888   8]                      Address : 00000000C1063008
> 
> [380h 0896   8]                        Value : 0000000000000000
> [388h 0904   8]                         Mask : FFFFFFFFFFFFFFFF
> 
> Raw Table Data: Length 912 (0x390)
> 
>    0000: 45 52 53 54 90 03 00 00 01 C8 42 4F 43 48 53 20  // ERST......BOCHS
>    0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
>    0020: 01 00 00 00 30 00 00 00 00 00 00 00 1B 00 00 00  // ....0...........
>    0030: 00 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0040: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0050: 01 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0060: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0070: 02 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0080: 02 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0090: 03 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    00A0: 03 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    00B0: 04 02 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>    00C0: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>    00D0: 04 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    00E0: 04 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    00F0: 05 03 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>    0100: 9C 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0110: 05 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0120: 05 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0130: 06 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0140: 06 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0150: 06 01 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>    0160: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0170: 07 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0180: 07 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0190: 07 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>    01A0: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    01B0: 08 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    01C0: 08 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    01D0: 08 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>    01E0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>    01F0: 09 02 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>    0200: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>    0210: 09 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0220: 09 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0230: 0A 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0240: 0A 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0250: 0A 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>    0260: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>    0270: 0B 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0280: 0B 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0290: 0D 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    02A0: 0D 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    02B0: 0D 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>    02C0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>    02D0: 0E 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    02E0: 0E 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    02F0: 0E 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>    0300: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>    0310: 0F 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0320: 0F 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0330: 0F 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>    0340: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>    0350: 10 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>    0360: 10 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>    0370: 10 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>    0380: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
> 




* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-07-20 14:57 ` Igor Mammedov
  2021-07-21 15:26   ` Eric DeVolder
  2021-07-23 16:26   ` Eric DeVolder
@ 2021-07-27 12:55   ` Igor Mammedov
  2021-07-28 15:19     ` Eric DeVolder
  2 siblings, 1 reply; 58+ messages in thread
From: Igor Mammedov @ 2021-07-27 12:55 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

PS:
If I haven't said it already, use checkpatch script before posting patches.




* Re: [PATCH v5 02/10] ACPI ERST: specification for ERST support
  2021-07-27 11:45             ` Igor Mammedov
@ 2021-07-28 15:16               ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-28 15:16 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, Konrad Wilk, qemu-devel, pbonzini,
	Boris Ostrovsky, Eric Blake, rth



On 7/27/21 6:45 AM, Igor Mammedov wrote:
> On Mon, 26 Jul 2021 14:52:15 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/26/21 5:06 AM, Igor Mammedov wrote:
>>> On Wed, 21 Jul 2021 10:42:33 -0500
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> On 7/19/21 10:02 AM, Igor Mammedov wrote:
>>>>> On Wed, 30 Jun 2021 19:26:39 +0000
>>>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>>>       
>>>>>> Oops, at the end of the 4th paragraph, I meant to state that "Linux does not support the NVRAM mode."
>>>>>> rather than "non-NVRAM mode", which contradicts everything I stated prior.
>>>>>> Eric.
>>>>>> ________________________________
>>>>>> From: Eric DeVolder <eric.devolder@oracle.com>
>>>>>> Sent: Wednesday, June 30, 2021 2:07 PM
>>>>>> To: qemu-devel@nongnu.org <qemu-devel@nongnu.org>
>>>>>> Cc: mst@redhat.com <mst@redhat.com>; imammedo@redhat.com <imammedo@redhat.com>; marcel.apfelbaum@gmail.com <marcel.apfelbaum@gmail.com>; pbonzini@redhat.com <pbonzini@redhat.com>; rth@twiddle.net <rth@twiddle.net>; ehabkost@redhat.com <ehabkost@redhat.com>; Konrad Wilk <konrad.wilk@oracle.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>>> Subject: [PATCH v5 02/10] ACPI ERST: specification for ERST support
>>>>>>
>>>>>> Information on the implementation of the ACPI ERST support.
>>>>>>
>>>>>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>>>>>> ---
>>>>>>     docs/specs/acpi_erst.txt | 152 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>     1 file changed, 152 insertions(+)
>>>>>>     create mode 100644 docs/specs/acpi_erst.txt
>>>>>>
>>>>>> diff --git a/docs/specs/acpi_erst.txt b/docs/specs/acpi_erst.txt
>>>>>> new file mode 100644
>>>>>> index 0000000..79f8eb9
>>>>>> --- /dev/null
>>>>>> +++ b/docs/specs/acpi_erst.txt
>>>>>> @@ -0,0 +1,152 @@
>>>>>> +ACPI ERST DEVICE
>>>>>> +================
>>>>>> +
>>>>>> +The ACPI ERST device is utilized to support the ACPI Error Record
>>>>>> +Serialization Table, ERST, functionality. The functionality is
>>>>>> +designed for storing error records in persistent storage for
>>>>>> +future reference/debugging.
>>>>>> +
>>>>>> +The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
>>>>>> +(APEI)", and specifically subsection "Error Serialization", outlines
>>>>>> +a method for storing error records into persistent storage.
>>>>>> +
>>>>>> +The format of error records is described in the UEFI specification[2],
>>>>>> +in Appendix N "Common Platform Error Record".
>>>>>> +
>>>>>> +While the ACPI specification allows for an NVRAM "mode" (see
>>>>>> +GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
>>>>>> +directly exposed for direct access by the OS/guest, this implements
>>>>>> +the non-NVRAM "mode". This non-NVRAM "mode" is what is implemented
>>>>>> +by most BIOS (since flash memory requires programming operations
>>>>>> +in order to update its contents). Furthermore, as of the time of this
>>>>>> +writing, Linux does not support the non-NVRAM "mode".
>>>>>
>>>>> shouldn't it be s/non-NVRAM/NVRAM/ ?
>>>>
>>>> Yes, it has been corrected.
>>>>   
>>>>>       
>>>>>> +
>>>>>> +
>>>>>> +Background/Motivation
>>>>>> +---------------------
>>>>>> +Linux uses the persistent storage filesystem, pstore, to record
>>>>>> +information (eg. dmesg tail) upon panics and shutdowns.  Pstore is
>>>>>> +independent of, and runs before, kdump.  In certain scenarios (ie.
>>>>>> +hosts/guests with root filesystems on NFS/iSCSI where networking
>>>>>> +software and/or hardware fails), pstore may contain the only
>>>>>> +information available for post-mortem debugging.
>>>>>
>>>>> well,
>>>>> it's not the only way, one can use existing pvpanic device to notify
>>>>> mgmt layer about crash and mgmt layer can take appropriate measures
>>>>> for post-mortem debugging, including dumping guest state,
>>>>> which is superior to anything pstore can offer as the VM still exists
>>>>> and mgmt layer can inspect the VM's crashed state directly or dump
>>>>> necessary parts of it.
>>>>>
>>>>> So ERST shouldn't be portrayed as the only way here but rather
>>>>> as limited alternative to pvpanic in regards to post-mortem debugging
>>>>> (it's the only way only on bare-metal).
>>>>>
>>>>> It would be better to describe here other use-cases you've mentioned
>>>>> in earlier reviews, that justify adding alternative to pvpanic.
>>>>
>>>> I'm not sure how I would change this. I do say "may contain", which means it
>>>> is not the only way. Pvpanic is a way to notify the mgmt layer/host, but
>>>> this is a method solely with the guest. Each serves a different purpose;
>>>> plugs a different hole.
>>>>   
>>>
>>> I'd suggest edit  "pstore may contain the only information" as "pstore may contain information"
>>>    
>> Done
>>
>>>> As noted in a separate message, my company has intentions of storing other
>>>> data in ERST beyond panics.
>>> perhaps add your use cases here as well.
>>>    
>>>    
>>>>>> +Two common storage backends for the pstore filesystem are ACPI ERST
>>>>>> +and UEFI. Most BIOS implement ACPI ERST.  UEFI is not utilized in
>>>>>> +all guests. With QEMU supporting ACPI ERST, it becomes a viable
>>>>>> +pstore storage backend for virtual machines (as it is now for
>>>>>> +bare metal machines).
>>>>>> +
>>>>>       
>>>>>> +Enabling support for ACPI ERST facilitates a consistent method to
>>>>>> +capture kernel panic information in a wide range of guests: from
>>>>>> +resource-constrained microvms to very large guests, and in
>>>>>> +particular, in direct-boot environments (which would lack UEFI
>>>>>> +run-time services).
>>>>> this hunk probably not necessary
>>>>>       
>>>>>> +
>>>>>> +Note that Microsoft Windows also utilizes the ACPI ERST for certain
>>>>>> +crash information, if available.
>>>>> a pointer to a relevant source would be helpful here.
>>>>
>>>> I've included the reference, here for your benefit.
>>>> Windows Hardware Error Architecture, specifically Persistence Mechanism
>>>> https://docs.microsoft.com/en-us/windows-hardware/drivers/whea/error-record-persistence-mechanism
>>>>   
>>>>>       
>>>>>> +Invocation
>>>>> s/^^/Configuration|Usage/
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +----------
>>>>>> +
>>>>>> +To utilize ACPI ERST, a memory-backend-file object and acpi-erst
>>>>> s/utilize/use/
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +device must be created, for example:
>>>>> s/must/can/
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +
>>>>>> + qemu ...
>>>>>> + -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,
>>>>>> +  size=0x10000,share=on
>>>>> I'd put ^^^ on the same line as -object and use '\' at the end of the line
>>>>> so the example could be easily copy-pasted
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> + -device acpi-erst,memdev=erstnvram
>>>>>> +
>>>>>> +For proper operation, the ACPI ERST device needs a memory-backend-file
>>>>>> +object with the following parameters:
>>>>>> +
>>>>>> + - id: The id of the memory-backend-file object is used to associate
>>>>>> +   this memory with the acpi-erst device.
>>>>>> + - size: The size of the ACPI ERST backing storage. This parameter is
>>>>>> +   required.
>>>>>> + - mem-path: The location of the ACPI ERST backing storage file. This
>>>>>> +   parameter is also required.
>>>>>> + - share: The share=on parameter is required so that updates to the
>>>>>> +   ERST back store are written to the file immediately as well. Without
>>>>>> +   it, updates to the backing file are unpredictable and may not
>>>>>> +   properly persist (eg. if qemu should crash).
>>>>>
>>>>> mmap manpage says:
>>>>>      MAP_SHARED
>>>>>                 Updates to the mapping ... are carried through to the underlying file.
>>>>> it doesn't guarantee 'written to the file immediately', though.
>>>>> So I'd rephrase it to something like that:
>>>>>
>>>>> - share: The share=on parameter is required so that updates to the ERST back store
>>>>>             are written back to the file.
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +
>>>>>> +The ACPI ERST device is a simple PCI device, and requires this one
>>>>>> +parameter:
>>>>> s/^.*:/and ERST device:/
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +
>>>>>> + - memdev: Is the object id of the memory-backend-file.
>>>>>> +
>>>>>> +
>>>>>> +PCI Interface
>>>>>> +-------------
>>>>>> +
>>>>>> +The ERST device is a PCI device with two BARs, one for accessing
>>>>>> +the programming registers, and the other for accessing the
>>>>>> +record exchange buffer.
>>>>>> +
>>>>>> +BAR0 contains the programming interface consisting of just two
>>>>>> +64-bit registers. The two registers are an ACTION (cmd) and a
>>>>>> +VALUE (data). All ERST actions/operations/side effects happen
>>>>> s/consisting of... All ERST/consisting of ACTION and VALUE 64-bit registers. All ERST/
>>>>
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +on the write to the ACTION, by design. Thus any data needed
>>>>> s/Thus//
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +by the action must be placed into VALUE prior to writing
>>>>>> +ACTION. Reading the VALUE simply returns the register contents,
>>>>>> +which can be updated by a previous ACTION.
>>>>>       
>>>>>> This behavior is
>>>>>> +encoded in the ACPI ERST table generated by QEMU.
>>>>> it's too vague, Either drop sentence or add a reference to relevant place in spec.
>>>> Corrected
>>>>   
>>>>>
>>>>>       
>>>>>> +
>>>>>> +BAR1 contains the record exchange buffer, and the size of this
>>>>>> +buffer sets the maximum record size. This record exchange
>>>>>> +buffer size is 8KiB.
>>>>> s/^^^/
>>>>> BAR1 contains the 8KiB record exchange buffer, which is the implemented maximum record size limit.
>>>> Corrected
>>>>   
>>>>>
>>>>>       
>>>>>> +Backing File
>>>>>
>>>>> s/^^^/Backing Storage Format/
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +------------
>>>>>
>>>>>       
>>>>>> +
>>>>>> +The ACPI ERST persistent storage is contained within a single backing
>>>>>> +file. The size and location of the backing file is specified upon
>>>>>> +QEMU startup of the ACPI ERST device.
>>>>>
>>>>> I'd drop above paragraph and describe file format here,
>>>>> ultimately used backend doesn't have to be a file. For
>>>>> example if user doesn't need it persist over QEMU restarts,
>>>>> ram backend could be used, guest will still be able to see
>>>>> its own crash log after the guest is rebooted, or it could be
>>>>> memfd backend passed to QEMU by mgmt layer.
>>>> Dropped
>>>>   
>>>>>
>>>>>       
>>>>>> +Records are stored in the backing file in a simple fashion.
>>>>> s/backing file/backend storage/
>>>>> ditto for other occurrences
>>>> Corrected
>>>>   
>>>>>       
>>>>>> +The backing file is essentially divided into fixed size
>>>>>> +"slots", ERST_RECORD_SIZE in length, with each "slot"
>>>>>> +storing a single record.
>>>>>       
>>>>>> No attempt at optimizing storage
>>>>>> +through compression, compaction, etc is attempted.
>>>>> s/^^^//
>>>>
>>>> I'd like to keep this statement. It is there because in a number of
>>>> hardware BIOS I tested, these kinds of features lead to bugs in the
>>>> ERST support.
>>> this doc is not about issues in other BIOSes, it's about the concrete
>>> QEMU impl. So sentence starting with "No attempt" is not relevant here at all.
>> Dropped
>>
>>>       
>>>>>> +NOTE that any change to this value will make any pre-
>>>>>> +existing backing files, not of the same ERST_RECORD_SIZE,
>>>>>> +unusable to the guest.
>>>>> when that can happen, can we detect it and error out?
>>>> I've dropped this statement. That value is hard coded, and not a
>>>> parameter, so there is no simple way to change it. This comment
>>>> does exist next to the ERST_RECORD_SIZE declaration in the code.
>>>
>>> It's not a problem with current impl. but rather with possible
>>> future expansion.
>>>
>>> If you'd add a header with record size at the start of storage,
>>> it wouldn't be issue as ERST would be able to get used record
>>> size for storage. That will help with avoiding compat issues
>>> later on.
>> I'll go ahead and add the header. I'll put the magic and record size in there,
>> but I do not intend to put any data that would be "cached" from the records
>> themselves. So no recordids, in particular, will be cached in this header.
> maybe also add offset of the 1st slot, so whoever comes later
> to fix performance issues will have less compatibility issues.
Done

> 
>>
>> I'm not even sure I want to record/cache the number of records because:
>>    - it has almost no use (undermined by the fact overall storage size is not exposed, imho)
>>    - we backed off requiring the share=on, so it is conceivable this header value could
>>      encounter data integrity issues, should a guest crash...
> guest crash won't affect data,  and if backend is not shared then,
> data won't be persistently stored anyways, they will live only for
> lifetime of QEMU instance.
> The only time where integrity is affected is host crash and we already
> agreed that we don't care about this case.
See further below

> 
>>    - scans still happen (see next)
>>
>> While having it (number of records cached in header) would avoid a startup scan
>> to compute it, the write operation requires a scan to determine if the incoming
>> recordid is present or not, in order to determine overwrite or allocate-and-write.
> if present/non present per slot status is cached, you don't have to do
> expensive full scan when guest scans slots.
Done

> 
>> And typically the first operation that Linux does is effectively a scan to
>> populate the /sys/fs/pstore entries via the GET_RECORD_IDENTIFIER action.
>>
>> And the typical size of the ERST storage [on hardware systems] is 64 to 128KiB;
>> so not much storage to examine, especially since only looking at 12 bytes of each
>> 8KiB record.
>>
>> I'd still be in favor of putting an upper bound on the hostmem object, to avoid
>> your worst case fears...
> 
> Considering device is not present by default, I give up on
> trying to convince you to design it efficiently.
> 
> If one would wish to use this with container like workloads
> where fast startup matters, one would have to live with crappy
> performance or rewrite your impl.

I've embraced your assurance of no data integrity issues, and have changed
the implementation to include a header that also tracks/caches the record_ids.
This eliminates all scanning of the entire backend storage.
My original goal was to offer ERST as BIOS do, so backend storage size of about
64 to 128KiB; where the current implementation would be just fine.
But I did mention that we were looking to do more with ERST, and
the backend storage can be quite large, so you are right to push for
better implementation.
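
To make the idea concrete, here is a rough sketch of what such a header and
its cached record_id table could look like. This is not the code from the
next revision; every name, the magic value, the 7-slot sizing and the use of
0 as the "free slot" marker are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and values throughout; not taken from the patch. */
#define EX_MAGIC       0x4558414D504C4521ULL /* placeholder magic */
#define EX_RECORD_SIZE 8192u                 /* 8KiB slots */
#define EX_MAX_SLOTS   7u                    /* 64KiB minus one 8KiB header area */
#define EX_EMPTY_ID    0ULL                  /* assumed marker for a free slot */

typedef struct {
    uint64_t magic;                    /* identifies the storage format */
    uint32_t record_size;              /* size of each slot in bytes */
    uint32_t first_slot_offset;        /* byte offset of slot 0 */
    uint64_t record_ids[EX_MAX_SLOTS]; /* cached id per slot, EX_EMPTY_ID if free */
} ExHeader;

static void ex_header_init(ExHeader *h)
{
    h->magic = EX_MAGIC;
    h->record_size = EX_RECORD_SIZE;
    h->first_slot_offset = EX_RECORD_SIZE; /* header occupies the first 8KiB */
    for (unsigned i = 0; i < EX_MAX_SLOTS; i++) {
        h->record_ids[i] = EX_EMPTY_ID;
    }
}

/* Look up a record id in the cached table only; returns slot index or -1. */
static int ex_find_slot(const ExHeader *h, uint64_t id)
{
    for (unsigned i = 0; i < EX_MAX_SLOTS; i++) {
        if (h->record_ids[i] == id) {
            return (int)i;
        }
    }
    return -1;
}

/* Byte offset of a slot within the backend storage. */
static size_t ex_slot_offset(const ExHeader *h, unsigned slot)
{
    assert(slot < EX_MAX_SLOTS);
    return (size_t)h->first_slot_offset + (size_t)slot * h->record_size;
}

/* Pick the slot for a write: overwrite if the id exists, else first free slot. */
static int ex_slot_for_write(ExHeader *h, uint64_t id)
{
    int slot = ex_find_slot(h, id);

    if (slot < 0) {
        slot = ex_find_slot(h, EX_EMPTY_ID);
    }
    if (slot >= 0) {
        h->record_ids[slot] = id;
    }
    return slot; /* -1 means the storage is full */
}
```

With a table like this, both the overwrite-or-allocate decision on write and
the guest's GET_RECORD_IDENTIFIER walk only touch the header, never the
record slots themselves.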


> 
>>>>>> +Below is an example layout of the backing store file.
>>>>>> +The size of the file is a multiple of ERST_RECORD_SIZE,
>>>>>> +and contains N number of "slots" to store records. The
>>>>>> +example below shows two records (in CPER format) in the
>>>>>> +backing file, while the remaining slots are empty/
>>>>>> +available.
>>>>>> +
>>>>>> + Slot   Record
>>>>>> +        +--------------------------------------------+
>>>>>> +    0   | empty/available                            |
>>>>>> +        +--------------------------------------------+
>>>>>> +    1   | CPER                                       |
>>>>>> +        +--------------------------------------------+
>>>>>> +    2   | CPER                                       |
>>>>>> +        +--------------------------------------------+
>>>>>> +  ...   |                                            |
>>>>>> +        +--------------------------------------------+
>>>>>> +    N   | empty/available                            |
>>>>>> +        +--------------------------------------------+
>>>>>> +        <-------------- ERST_RECORD_SIZE ------------>
>>>>>
>>>>>       
>>>>>> +Not all slots need to be occupied, and they need not be
>>>>>> +occupied in a contiguous fashion. The ability to clear/erase
>>>>>> +specific records allows for the formation of unoccupied
>>>>>> +slots.
>>>>> I'd drop this as not necessary
>>>>
>>>> I'd like to keep this statement. Again, several BIOS on which I tested
>>>> ERST had bugs around non-contiguous record storage.
>>>
>>> I'd drop this and alter sentence above ending with " in a simple fashion."
>>> to describe sparse usage of storage and then after that comes example diagram.
>> Done
>>
>>>
>>> I'd like this part to start with unambiguous concise description of
>>> format and to be finished with example diagram.
>>> It is the part that will be considered as the only true source
>>> how file should be formatted, when it comes to fixing bugs or
>>> modifying original impl. later on. So it's important to have clear
>>> description without any unnecessary information here.
>> Done
>>
>>>
>>>    
>>>>>
>>>>>       
>>>>>> +
>>>>>> +
>>>>>> +References
>>>>>> +----------
>>>>>> +
>>>>>> +[1] "Advanced Configuration and Power Interface Specification",
>>>>>> +    version 4.0, June 2009.
>>>>>> +
>>>>>> +[2] "Unified Extensible Firmware Interface Specification",
>>>>>> +    version 2.1, October 2008.
>>>>>> +
>>>>>> --
>>>>>> 1.8.3.1
>>>>>>      
>>>>>       
>>>>   
>>>    
>>
> 



* Re: [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table
  2021-07-27 12:01           ` Igor Mammedov
@ 2021-07-28 15:18             ` Eric DeVolder
  0 siblings, 0 replies; 58+ messages in thread
From: Eric DeVolder @ 2021-07-28 15:18 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/27/21 7:01 AM, Igor Mammedov wrote:
> On Mon, 26 Jul 2021 15:02:55 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
> [...]
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_BEGIN_READ_OPERATION:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_BEGIN_CLEAR_OPERATION:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_END_OPERATION:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_SET_RECORD_OFFSET:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK32);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_EXECUTE_OPERATION:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_VALUE , ERST_EXECUTE_OPERATION_MAGIC, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_CHECK_BUSY_STATUS:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER_VALUE , 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0x01, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_COMMAND_STATUS:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_RECORD_IDENTIFIER:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_SET_RECORD_IDENTIFIER:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER      , 0, 64,
>>>>>> +                s->bar0 + ERST_CSR_VALUE , 0, MASK64);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_RECORD_COUNT:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_BEGIN_DUMMY_WRITE_OPERATION:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_RESERVED:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_LENGTH:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK32);
>>>>>> +            break;
>>>>>> +        case ACPI_ERST_ACTION_GET_EXECUTE_OPERATION_TIMINGS:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_WRITE_REGISTER_VALUE, 0, 32,
>>>>>> +                s->bar0 + ERST_CSR_ACTION, action, MASK8);
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>>>>>> +                s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>>>>>> +        default:
>>>>>> +            build_serialization_instruction_entry(table_data, action,
>>>>>> +                ACPI_ERST_INST_NOOP, 0, 0, 0, action, 0);
>>>>>> +            break;
>>>>>> +        }
> 
> ../../builds/imammedo/qemu/hw/acpi/erst.c: In function ‘build_erst’:
> ../../builds/imammedo/qemu/hw/acpi/erst.c:754:13: error: this statement may fall through [-Werror=implicit-fallthrough=]
>               build_serialization_instruction_entry(table_data, action,
>               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                   ACPI_ERST_INST_READ_REGISTER       , 0, 64,
>                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                   s->bar0 + ERST_CSR_VALUE, 0, MASK64);
>                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../builds/imammedo/qemu/hw/acpi/erst.c:757:9: note: here
>           default:
>           ^~~~~~~
> cc1: all warnings being treated as errors

Michael pointed this one out last week, I've since corrected it.
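
For anyone following along: the warning is because the
GET_EXECUTE_OPERATION_TIMINGS case has no break before default:, so a NOOP
entry would be emitted right after its two entries. A minimal stand-alone
sketch of the corrected shape, using made-up names rather than the QEMU
helpers (the entry counter is only there to make the effect visible):

```c
#include <assert.h>

/* Hypothetical stand-ins for the action numbers and the entry emitter. */
enum { EX_ACTION_GET_TIMINGS = 1, EX_ACTION_UNKNOWN = 99 };

static int ex_entries; /* number of table entries emitted for one action */

static void ex_emit_entry(void)
{
    ex_entries++;
}

/* Mirrors the shape of the corrected switch: every case ends in break. */
static int ex_build_action(int action)
{
    ex_entries = 0;
    switch (action) {
    case EX_ACTION_GET_TIMINGS:
        ex_emit_entry(); /* write ACTION */
        ex_emit_entry(); /* read VALUE */
        break;           /* without this, default: would add a third entry */
    default:
        ex_emit_entry(); /* NOOP entry for unhandled actions */
        break;
    }
    return ex_entries;
}
```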

> 
> 
>>>>>> +    }
>>>>>> +    build_header(linker, table_data,
>>>>>> +                 (void *)(table_data->data + erst_start),
>>>>>> +                 "ERST", table_data->len - erst_start,
>>>>>> +                 1, oem_id, oem_table_id);
>>>>>> +}
>>>>>> +
>>>>>> +/*******************************************************************/
>>>>>> +/*******************************************************************/
>>>>>> +
>>>>>>     static const VMStateDescription erst_vmstate  = {
>>>>>>         .name = "acpi-erst",
>>>>>>         .version_id = 1,
>>>>>       
>>>>   
>>>    
>>
> 



* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-07-27 12:55   ` Igor Mammedov
@ 2021-07-28 15:19     ` Eric DeVolder
  2021-07-29  8:07       ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-07-28 15:19 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/27/21 7:55 AM, Igor Mammedov wrote:
> PS:
> If I haven't said it already, use checkpatch script before posting patches.
> 

I do run checkpatch. On occasion I allow a warning about a line too long. And
there is the MAINTAINERS message due to the new files. Is there something else
that I'm missing?



* Re: [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU
  2021-07-28 15:19     ` Eric DeVolder
@ 2021-07-29  8:07       ` Igor Mammedov
  0 siblings, 0 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-07-29  8:07 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 28 Jul 2021 10:19:51 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/27/21 7:55 AM, Igor Mammedov wrote:
> > PS:
> > If I haven't said it already, use checkpatch script before posting patches.
> >   
> 
> I do run checkpatch. On occasion I allow a warning about a line too long. And
> there is the MAINTAINERS message due to the new files. Is there something else
> that I'm missing?
> 

there were warnings about new line or something like this.




* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-07-27 12:52           ` Igor Mammedov
@ 2021-08-04 22:13             ` Eric DeVolder
  2021-08-05  9:05               ` Igor Mammedov
  0 siblings, 1 reply; 58+ messages in thread
From: Eric DeVolder @ 2021-08-04 22:13 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth



On 7/27/21 7:52 AM, Igor Mammedov wrote:
> On Mon, 26 Jul 2021 15:01:05 -0500
> Eric DeVolder <eric.devolder@oracle.com> wrote:
> 
>> On 7/26/21 5:42 AM, Igor Mammedov wrote:
>>> On Wed, 21 Jul 2021 11:07:40 -0500
>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>    
>>>> On 7/20/21 7:17 AM, Igor Mammedov wrote:
>>>>> On Wed, 30 Jun 2021 15:07:16 -0400
>>>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
>>>>>       
> [..]
>>>>>> +
>>>>>> +static uint64_t erst_rd_reg64(hwaddr addr,
>>>>>> +    uint64_t reg, unsigned size)
>>>>>> +{
>>>>>> +    uint64_t rdval;
>>>>>> +    uint64_t mask;
>>>>>> +    unsigned shift;
>>>>>> +
>>>>>> +    if (size == sizeof(uint64_t)) {
>>>>>> +        /* 64b access */
>>>>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
>>>>>> +        shift = 0;
>>>>>> +    } else {
>>>>>> +        /* 32b access */
>>>>>> +        mask = 0x00000000FFFFFFFFUL;
>>>>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
>>>>>> +    }
>>>>>> +
>>>>>> +    rdval = reg;
>>>>>> +    rdval >>= shift;
>>>>>> +    rdval &= mask;
>>>>>> +
>>>>>> +    return rdval;
>>>>>> +}
>>>>>> +
>>>>>> +static uint64_t erst_wr_reg64(hwaddr addr,
>>>>>> +    uint64_t reg, uint64_t val, unsigned size)
>>>>>> +{
>>>>>> +    uint64_t wrval;
>>>>>> +    uint64_t mask;
>>>>>> +    unsigned shift;
>>>>>> +
>>>>>> +    if (size == sizeof(uint64_t)) {
>>>>>> +        /* 64b access */
>>>>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
>>>>>> +        shift = 0;
>>>>>> +    } else {
>>>>>> +        /* 32b access */
>>>>>> +        mask = 0x00000000FFFFFFFFUL;
>>>>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
>>>>>> +    }
>>>>>> +
>>>>>> +    val &= mask;
>>>>>> +    val <<= shift;
>>>>>> +    mask <<= shift;
>>>>>> +    wrval = reg;
>>>>>> +    wrval &= ~mask;
>>>>>> +    wrval |= val;
>>>>>> +
>>>>>> +    return wrval;
>>>>>> +}
>>>>> (I see in next patch it's us defining access width in the ACPI tables)
>>>>> so question is: do we have to have mixed register width access?
>>>>> can't all register accesses be 64-bit?
>>>>
>>>> Initially I attempted to just use 64-bit exclusively. The problem is that,
>>>> for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
>>>> register access into two. Here's an example of reading the exchange buffer
>>>> address, which is coded as a 64b access:
>>>>
>>>> acpi_erst_reg_write addr: 0x0000 <== 0x000000000000000d (size: 4)
>>>> acpi_erst_reg_read  addr: 0x0008 ==> 0x00000000c1010000 (size: 4)
>>>> acpi_erst_reg_read  addr: 0x000c ==> 0x0000000000000000 (size: 4)
>>>>
>>>> So I went ahead and made ACTION register accesses 32b, else there would
>>>> be two reads of 32 bits, of which the second is useless.
>>>
>>> could you post here decompiled ERST table?
>> As it is long, I posted it to the end of this message.
> 
> RHEL8 or Fedora 34 says that erst is invalid table,
> so I can't help tracing what's going on there.
> 
> You'll have to figure out why access is not 64 bit on your own.

Today I downloaded Fedora 34 Server and created a guest. Using my
qemu-6 branch with ERST, it appears to work just fine. I was able to
create entries in it.


[    0.010215] ACPI: ERST 0x000000007F9F3000 000390 (v01 BOCHS  BXPC     00000001 BXPC 00000001)
[    0.010250] ACPI: Reserving ERST table memory at [mem 0x7f9f3000-0x7f9f338f]
[    1.056244] ERST: Error Record Serialization Table (ERST) support is initialized.
[    1.056279] pstore: Registered erst as persistent store backend

total 0
drwxr-x---.  2 root root     0 Aug  4 18:05 .
drwxr-xr-x. 10 root root     0 Aug  4 18:05 ..
-r--r--r--.  1 root root 17700 Aug  4 17:54 dmesg-erst-6992696633267847169
-r--r--r--.  1 root root 17714 Aug  4 17:54 dmesg-erst-6992696633267847170


It appears that the Linux OSPM is taking the 64-bit register access and
breaking it into two 32-bit accesses. If this is the case, then the fix
would be in Linux and not in this code.
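
To illustrate why the split is harmless on the device side, here is a
minimal standalone sketch of the read path. rd_reg64() mirrors the logic
of erst_rd_reg64() from the patch quoted above, with hwaddr swapped for
uint64_t so that it compiles outside QEMU. split_read64() is a
hypothetical helper, not part of the patch, that models a guest issuing
two 32-bit reads in place of one 64-bit read.

```c
#include <stdint.h>

/*
 * Same logic as erst_rd_reg64() in the quoted patch. Assumption for this
 * sketch: QEMU's hwaddr is replaced by uint64_t so it builds standalone.
 */
static uint64_t rd_reg64(uint64_t addr, uint64_t reg, unsigned size)
{
    uint64_t mask;
    unsigned shift;

    if (size == sizeof(uint64_t)) {
        /* 64b access: return the whole register */
        mask = UINT64_MAX;
        shift = 0;
    } else {
        /* 32b access: bit 2 of the address selects the upper half */
        mask = UINT32_MAX;
        shift = (addr & 0x4) ? 32 : 0;
    }

    return (reg >> shift) & mask;
}

/*
 * Hypothetical helper (not in the patch): what a guest effectively gets
 * when it splits a 64-bit read at 'base' into two 32-bit reads.
 */
static uint64_t split_read64(uint64_t base, uint64_t reg)
{
    uint64_t lo = rd_reg64(base, reg, 4);
    uint64_t hi = rd_reg64(base + 4, reg, 4);

    return (hi << 32) | lo;
}
```

With the register holding 0x00000000c1010000, as in the trace quoted
above, split_read64(0x8, reg) returns the same value a single 64-bit read
would; the second 32-bit read only contributes the zero upper half.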

Pending your response to this finding, I have v6 ready to go.
Thanks
eric


> 
> [...]
> 
>> Obtained via a running guest with:
>> iasl -d -vt /sys/firmware/acpi/tables/ERST
>>
>> /*
>>    * Intel ACPI Component Architecture
>>    * AML/ASL+ Disassembler version 20180105 (64-bit version)
>>    * Copyright (c) 2000 - 2018 Intel Corporation
>>    *
>>    * Disassembly of ERST.blob, Mon Jul 26 14:31:21 2021
>>    *
>>    * ACPI Data Table [ERST]
>>    *
>>    * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
>>    */
>>
>> [000h 0000   4]                    Signature : "ERST"    [Error Record Serialization Table]
>> [004h 0004   4]                 Table Length : 00000390
>> [008h 0008   1]                     Revision : 01
>> [009h 0009   1]                     Checksum : C8
>> [00Ah 0010   6]                       Oem ID : "BOCHS "
>> [010h 0016   8]                 Oem Table ID : "BXPC    "
>> [018h 0024   4]                 Oem Revision : 00000001
>> [01Ch 0028   4]              Asl Compiler ID : "BXPC"
>> [020h 0032   4]        Asl Compiler Revision : 00000001
>>
>> [024h 0036   4]  Serialization Header Length : 00000030
>> [028h 0040   4]                     Reserved : 00000000
>> [02Ch 0044   4]      Instruction Entry Count : 0000001B
>>
>> [030h 0048   1]                       Action : 00 [Begin Write Operation]
>> [031h 0049   1]                  Instruction : 03 [Write Register Value]
>> [032h 0050   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [033h 0051   1]                     Reserved : 00
>>
>> [034h 0052  12]              Register Region : [Generic Address Structure]
>> [034h 0052   1]                     Space ID : 00 [SystemMemory]
>> [035h 0053   1]                    Bit Width : 20
>> [036h 0054   1]                   Bit Offset : 00
>> [037h 0055   1]         Encoded Access Width : 03 [DWord Access:32]
>> [038h 0056   8]                      Address : 00000000C1063000
>>
>> [040h 0064   8]                        Value : 0000000000000000
>> [048h 0072   8]                         Mask : 00000000000000FF
>>
>> [050h 0080   1]                       Action : 01 [Begin Read Operation]
>> [051h 0081   1]                  Instruction : 03 [Write Register Value]
>> [052h 0082   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [053h 0083   1]                     Reserved : 00
>>
>> [054h 0084  12]              Register Region : [Generic Address Structure]
>> [054h 0084   1]                     Space ID : 00 [SystemMemory]
>> [055h 0085   1]                    Bit Width : 20
>> [056h 0086   1]                   Bit Offset : 00
>> [057h 0087   1]         Encoded Access Width : 03 [DWord Access:32]
>> [058h 0088   8]                      Address : 00000000C1063000
>>
>> [060h 0096   8]                        Value : 0000000000000001
>> [068h 0104   8]                         Mask : 00000000000000FF
>>
>> [070h 0112   1]                       Action : 02 [Begin Clear Operation]
>> [071h 0113   1]                  Instruction : 03 [Write Register Value]
>> [072h 0114   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [073h 0115   1]                     Reserved : 00
>>
>> [074h 0116  12]              Register Region : [Generic Address Structure]
>> [074h 0116   1]                     Space ID : 00 [SystemMemory]
>> [075h 0117   1]                    Bit Width : 20
>> [076h 0118   1]                   Bit Offset : 00
>> [077h 0119   1]         Encoded Access Width : 03 [DWord Access:32]
>> [078h 0120   8]                      Address : 00000000C1063000
>>
>> [080h 0128   8]                        Value : 0000000000000002
>> [088h 0136   8]                         Mask : 00000000000000FF
>>
>> [090h 0144   1]                       Action : 03 [End Operation]
>> [091h 0145   1]                  Instruction : 03 [Write Register Value]
>> [092h 0146   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [093h 0147   1]                     Reserved : 00
>>
>> [094h 0148  12]              Register Region : [Generic Address Structure]
>> [094h 0148   1]                     Space ID : 00 [SystemMemory]
>> [095h 0149   1]                    Bit Width : 20
>> [096h 0150   1]                   Bit Offset : 00
>> [097h 0151   1]         Encoded Access Width : 03 [DWord Access:32]
>> [098h 0152   8]                      Address : 00000000C1063000
>>
>> [0A0h 0160   8]                        Value : 0000000000000003
>> [0A8h 0168   8]                         Mask : 00000000000000FF
>>
>> [0B0h 0176   1]                       Action : 04 [Set Record Offset]
>> [0B1h 0177   1]                  Instruction : 02 [Write Register]
>> [0B2h 0178   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [0B3h 0179   1]                     Reserved : 00
>>
>> [0B4h 0180  12]              Register Region : [Generic Address Structure]
>> [0B4h 0180   1]                     Space ID : 00 [SystemMemory]
>> [0B5h 0181   1]                    Bit Width : 20
>> [0B6h 0182   1]                   Bit Offset : 00
>> [0B7h 0183   1]         Encoded Access Width : 03 [DWord Access:32]
>> [0B8h 0184   8]                      Address : 00000000C1063008
>>
>> [0C0h 0192   8]                        Value : 0000000000000000
>> [0C8h 0200   8]                         Mask : 00000000FFFFFFFF
>>
>> [0D0h 0208   1]                       Action : 04 [Set Record Offset]
>> [0D1h 0209   1]                  Instruction : 03 [Write Register Value]
>> [0D2h 0210   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [0D3h 0211   1]                     Reserved : 00
>>
>> [0D4h 0212  12]              Register Region : [Generic Address Structure]
>> [0D4h 0212   1]                     Space ID : 00 [SystemMemory]
>> [0D5h 0213   1]                    Bit Width : 20
>> [0D6h 0214   1]                   Bit Offset : 00
>> [0D7h 0215   1]         Encoded Access Width : 03 [DWord Access:32]
>> [0D8h 0216   8]                      Address : 00000000C1063000
>>
>> [0E0h 0224   8]                        Value : 0000000000000004
>> [0E8h 0232   8]                         Mask : 00000000000000FF
>>
>> [0F0h 0240   1]                       Action : 05 [Execute Operation]
>> [0F1h 0241   1]                  Instruction : 03 [Write Register Value]
>> [0F2h 0242   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [0F3h 0243   1]                     Reserved : 00
>>
>> [0F4h 0244  12]              Register Region : [Generic Address Structure]
>> [0F4h 0244   1]                     Space ID : 00 [SystemMemory]
>> [0F5h 0245   1]                    Bit Width : 20
>> [0F6h 0246   1]                   Bit Offset : 00
>> [0F7h 0247   1]         Encoded Access Width : 03 [DWord Access:32]
>> [0F8h 0248   8]                      Address : 00000000C1063008
>>
>> [100h 0256   8]                        Value : 000000000000009C
>> [108h 0264   8]                         Mask : 00000000000000FF
>>
>> [110h 0272   1]                       Action : 05 [Execute Operation]
>> [111h 0273   1]                  Instruction : 03 [Write Register Value]
>> [112h 0274   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [113h 0275   1]                     Reserved : 00
>>
>> [114h 0276  12]              Register Region : [Generic Address Structure]
>> [114h 0276   1]                     Space ID : 00 [SystemMemory]
>> [115h 0277   1]                    Bit Width : 20
>> [116h 0278   1]                   Bit Offset : 00
>> [117h 0279   1]         Encoded Access Width : 03 [DWord Access:32]
>> [118h 0280   8]                      Address : 00000000C1063000
>>
>> [120h 0288   8]                        Value : 0000000000000005
>> [128h 0296   8]                         Mask : 00000000000000FF
>>
>> [130h 0304   1]                       Action : 06 [Check Busy Status]
>> [131h 0305   1]                  Instruction : 03 [Write Register Value]
>> [132h 0306   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [133h 0307   1]                     Reserved : 00
>>
>> [134h 0308  12]              Register Region : [Generic Address Structure]
>> [134h 0308   1]                     Space ID : 00 [SystemMemory]
>> [135h 0309   1]                    Bit Width : 20
>> [136h 0310   1]                   Bit Offset : 00
>> [137h 0311   1]         Encoded Access Width : 03 [DWord Access:32]
>> [138h 0312   8]                      Address : 00000000C1063000
>>
>> [140h 0320   8]                        Value : 0000000000000006
>> [148h 0328   8]                         Mask : 00000000000000FF
>>
>> [150h 0336   1]                       Action : 06 [Check Busy Status]
>> [151h 0337   1]                  Instruction : 01 [Read Register Value]
>> [152h 0338   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [153h 0339   1]                     Reserved : 00
>>
>> [154h 0340  12]              Register Region : [Generic Address Structure]
>> [154h 0340   1]                     Space ID : 00 [SystemMemory]
>> [155h 0341   1]                    Bit Width : 20
>> [156h 0342   1]                   Bit Offset : 00
>> [157h 0343   1]         Encoded Access Width : 03 [DWord Access:32]
>> [158h 0344   8]                      Address : 00000000C1063008
>>
>> [160h 0352   8]                        Value : 0000000000000001
>> [168h 0360   8]                         Mask : 00000000000000FF
>>
>> [170h 0368   1]                       Action : 07 [Get Command Status]
>> [171h 0369   1]                  Instruction : 03 [Write Register Value]
>> [172h 0370   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [173h 0371   1]                     Reserved : 00
>>
>> [174h 0372  12]              Register Region : [Generic Address Structure]
>> [174h 0372   1]                     Space ID : 00 [SystemMemory]
>> [175h 0373   1]                    Bit Width : 20
>> [176h 0374   1]                   Bit Offset : 00
>> [177h 0375   1]         Encoded Access Width : 03 [DWord Access:32]
>> [178h 0376   8]                      Address : 00000000C1063000
>>
>> [180h 0384   8]                        Value : 0000000000000007
>> [188h 0392   8]                         Mask : 00000000000000FF
>>
>> [190h 0400   1]                       Action : 07 [Get Command Status]
>> [191h 0401   1]                  Instruction : 00 [Read Register]
>> [192h 0402   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [193h 0403   1]                     Reserved : 00
>>
>> [194h 0404  12]              Register Region : [Generic Address Structure]
>> [194h 0404   1]                     Space ID : 00 [SystemMemory]
>> [195h 0405   1]                    Bit Width : 20
>> [196h 0406   1]                   Bit Offset : 00
>> [197h 0407   1]         Encoded Access Width : 03 [DWord Access:32]
>> [198h 0408   8]                      Address : 00000000C1063008
>>
>> [1A0h 0416   8]                        Value : 0000000000000000
>> [1A8h 0424   8]                         Mask : 00000000000000FF
>>
>> [1B0h 0432   1]                       Action : 08 [Get Record Identifier]
>> [1B1h 0433   1]                  Instruction : 03 [Write Register Value]
>> [1B2h 0434   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [1B3h 0435   1]                     Reserved : 00
>>
>> [1B4h 0436  12]              Register Region : [Generic Address Structure]
>> [1B4h 0436   1]                     Space ID : 00 [SystemMemory]
>> [1B5h 0437   1]                    Bit Width : 20
>> [1B6h 0438   1]                   Bit Offset : 00
>> [1B7h 0439   1]         Encoded Access Width : 03 [DWord Access:32]
>> [1B8h 0440   8]                      Address : 00000000C1063000
>>
>> [1C0h 0448   8]                        Value : 0000000000000008
>> [1C8h 0456   8]                         Mask : 00000000000000FF
>>
>> [1D0h 0464   1]                       Action : 08 [Get Record Identifier]
>> [1D1h 0465   1]                  Instruction : 00 [Read Register]
>> [1D2h 0466   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [1D3h 0467   1]                     Reserved : 00
>>
>> [1D4h 0468  12]              Register Region : [Generic Address Structure]
>> [1D4h 0468   1]                     Space ID : 00 [SystemMemory]
>> [1D5h 0469   1]                    Bit Width : 40
>> [1D6h 0470   1]                   Bit Offset : 00
>> [1D7h 0471   1]         Encoded Access Width : 04 [QWord Access:64]
>> [1D8h 0472   8]                      Address : 00000000C1063008
>>
>> [1E0h 0480   8]                        Value : 0000000000000000
>> [1E8h 0488   8]                         Mask : FFFFFFFFFFFFFFFF
>>
>> [1F0h 0496   1]                       Action : 09 [Set Record Identifier]
>> [1F1h 0497   1]                  Instruction : 02 [Write Register]
>> [1F2h 0498   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [1F3h 0499   1]                     Reserved : 00
>>
>> [1F4h 0500  12]              Register Region : [Generic Address Structure]
>> [1F4h 0500   1]                     Space ID : 00 [SystemMemory]
>> [1F5h 0501   1]                    Bit Width : 40
>> [1F6h 0502   1]                   Bit Offset : 00
>> [1F7h 0503   1]         Encoded Access Width : 04 [QWord Access:64]
>> [1F8h 0504   8]                      Address : 00000000C1063008
>>
>> [200h 0512   8]                        Value : 0000000000000000
>> [208h 0520   8]                         Mask : FFFFFFFFFFFFFFFF
>>
>> [210h 0528   1]                       Action : 09 [Set Record Identifier]
>> [211h 0529   1]                  Instruction : 03 [Write Register Value]
>> [212h 0530   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [213h 0531   1]                     Reserved : 00
>>
>> [214h 0532  12]              Register Region : [Generic Address Structure]
>> [214h 0532   1]                     Space ID : 00 [SystemMemory]
>> [215h 0533   1]                    Bit Width : 20
>> [216h 0534   1]                   Bit Offset : 00
>> [217h 0535   1]         Encoded Access Width : 03 [DWord Access:32]
>> [218h 0536   8]                      Address : 00000000C1063000
>>
>> [220h 0544   8]                        Value : 0000000000000009
>> [228h 0552   8]                         Mask : 00000000000000FF
>>
>> [230h 0560   1]                       Action : 0A [Get Record Count]
>> [231h 0561   1]                  Instruction : 03 [Write Register Value]
>> [232h 0562   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [233h 0563   1]                     Reserved : 00
>>
>> [234h 0564  12]              Register Region : [Generic Address Structure]
>> [234h 0564   1]                     Space ID : 00 [SystemMemory]
>> [235h 0565   1]                    Bit Width : 20
>> [236h 0566   1]                   Bit Offset : 00
>> [237h 0567   1]         Encoded Access Width : 03 [DWord Access:32]
>> [238h 0568   8]                      Address : 00000000C1063000
>>
>> [240h 0576   8]                        Value : 000000000000000A
>> [248h 0584   8]                         Mask : 00000000000000FF
>>
>> [250h 0592   1]                       Action : 0A [Get Record Count]
>> [251h 0593   1]                  Instruction : 00 [Read Register]
>> [252h 0594   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [253h 0595   1]                     Reserved : 00
>>
>> [254h 0596  12]              Register Region : [Generic Address Structure]
>> [254h 0596   1]                     Space ID : 00 [SystemMemory]
>> [255h 0597   1]                    Bit Width : 20
>> [256h 0598   1]                   Bit Offset : 00
>> [257h 0599   1]         Encoded Access Width : 03 [DWord Access:32]
>> [258h 0600   8]                      Address : 00000000C1063008
>>
>> [260h 0608   8]                        Value : 0000000000000000
>> [268h 0616   8]                         Mask : 00000000FFFFFFFF
>>
>> [270h 0624   1]                       Action : 0B [Begin Dummy Write]
>> [271h 0625   1]                  Instruction : 03 [Write Register Value]
>> [272h 0626   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [273h 0627   1]                     Reserved : 00
>>
>> [274h 0628  12]              Register Region : [Generic Address Structure]
>> [274h 0628   1]                     Space ID : 00 [SystemMemory]
>> [275h 0629   1]                    Bit Width : 20
>> [276h 0630   1]                   Bit Offset : 00
>> [277h 0631   1]         Encoded Access Width : 03 [DWord Access:32]
>> [278h 0632   8]                      Address : 00000000C1063000
>>
>> [280h 0640   8]                        Value : 000000000000000B
>> [288h 0648   8]                         Mask : 00000000000000FF
>>
>> [290h 0656   1]                       Action : 0D [Get Error Address Range]
>> [291h 0657   1]                  Instruction : 03 [Write Register Value]
>> [292h 0658   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [293h 0659   1]                     Reserved : 00
>>
>> [294h 0660  12]              Register Region : [Generic Address Structure]
>> [294h 0660   1]                     Space ID : 00 [SystemMemory]
>> [295h 0661   1]                    Bit Width : 20
>> [296h 0662   1]                   Bit Offset : 00
>> [297h 0663   1]         Encoded Access Width : 03 [DWord Access:32]
>> [298h 0664   8]                      Address : 00000000C1063000
>>
>> [2A0h 0672   8]                        Value : 000000000000000D
>> [2A8h 0680   8]                         Mask : 00000000000000FF
>>
>> [2B0h 0688   1]                       Action : 0D [Get Error Address Range]
>> [2B1h 0689   1]                  Instruction : 00 [Read Register]
>> [2B2h 0690   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [2B3h 0691   1]                     Reserved : 00
>>
>> [2B4h 0692  12]              Register Region : [Generic Address Structure]
>> [2B4h 0692   1]                     Space ID : 00 [SystemMemory]
>> [2B5h 0693   1]                    Bit Width : 40
>> [2B6h 0694   1]                   Bit Offset : 00
>> [2B7h 0695   1]         Encoded Access Width : 04 [QWord Access:64]
>> [2B8h 0696   8]                      Address : 00000000C1063008
>>
>> [2C0h 0704   8]                        Value : 0000000000000000
>> [2C8h 0712   8]                         Mask : FFFFFFFFFFFFFFFF
>>
>> [2D0h 0720   1]                       Action : 0E [Get Error Address Length]
>> [2D1h 0721   1]                  Instruction : 03 [Write Register Value]
>> [2D2h 0722   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [2D3h 0723   1]                     Reserved : 00
>>
>> [2D4h 0724  12]              Register Region : [Generic Address Structure]
>> [2D4h 0724   1]                     Space ID : 00 [SystemMemory]
>> [2D5h 0725   1]                    Bit Width : 20
>> [2D6h 0726   1]                   Bit Offset : 00
>> [2D7h 0727   1]         Encoded Access Width : 03 [DWord Access:32]
>> [2D8h 0728   8]                      Address : 00000000C1063000
>>
>> [2E0h 0736   8]                        Value : 000000000000000E
>> [2E8h 0744   8]                         Mask : 00000000000000FF
>>
>> [2F0h 0752   1]                       Action : 0E [Get Error Address Length]
>> [2F1h 0753   1]                  Instruction : 00 [Read Register]
>> [2F2h 0754   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [2F3h 0755   1]                     Reserved : 00
>>
>> [2F4h 0756  12]              Register Region : [Generic Address Structure]
>> [2F4h 0756   1]                     Space ID : 00 [SystemMemory]
>> [2F5h 0757   1]                    Bit Width : 40
>> [2F6h 0758   1]                   Bit Offset : 00
>> [2F7h 0759   1]         Encoded Access Width : 04 [QWord Access:64]
>> [2F8h 0760   8]                      Address : 00000000C1063008
>>
>> [300h 0768   8]                        Value : 0000000000000000
>> [308h 0776   8]                         Mask : 00000000FFFFFFFF
>>
>> [310h 0784   1]                       Action : 0F [Get Error Attributes]
>> [311h 0785   1]                  Instruction : 03 [Write Register Value]
>> [312h 0786   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [313h 0787   1]                     Reserved : 00
>>
>> [314h 0788  12]              Register Region : [Generic Address Structure]
>> [314h 0788   1]                     Space ID : 00 [SystemMemory]
>> [315h 0789   1]                    Bit Width : 20
>> [316h 0790   1]                   Bit Offset : 00
>> [317h 0791   1]         Encoded Access Width : 03 [DWord Access:32]
>> [318h 0792   8]                      Address : 00000000C1063000
>>
>> [320h 0800   8]                        Value : 000000000000000F
>> [328h 0808   8]                         Mask : 00000000000000FF
>>
>> [330h 0816   1]                       Action : 0F [Get Error Attributes]
>> [331h 0817   1]                  Instruction : 00 [Read Register]
>> [332h 0818   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [333h 0819   1]                     Reserved : 00
>>
>> [334h 0820  12]              Register Region : [Generic Address Structure]
>> [334h 0820   1]                     Space ID : 00 [SystemMemory]
>> [335h 0821   1]                    Bit Width : 20
>> [336h 0822   1]                   Bit Offset : 00
>> [337h 0823   1]         Encoded Access Width : 03 [DWord Access:32]
>> [338h 0824   8]                      Address : 00000000C1063008
>>
>> [340h 0832   8]                        Value : 0000000000000000
>> [348h 0840   8]                         Mask : 00000000FFFFFFFF
>>
>> [350h 0848   1]                       Action : 10 [Execute Timings]
>> [351h 0849   1]                  Instruction : 03 [Write Register Value]
>> [352h 0850   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [353h 0851   1]                     Reserved : 00
>>
>> [354h 0852  12]              Register Region : [Generic Address Structure]
>> [354h 0852   1]                     Space ID : 00 [SystemMemory]
>> [355h 0853   1]                    Bit Width : 20
>> [356h 0854   1]                   Bit Offset : 00
>> [357h 0855   1]         Encoded Access Width : 03 [DWord Access:32]
>> [358h 0856   8]                      Address : 00000000C1063000
>>
>> [360h 0864   8]                        Value : 0000000000000010
>> [368h 0872   8]                         Mask : 00000000000000FF
>>
>> [370h 0880   1]                       Action : 10 [Execute Timings]
>> [371h 0881   1]                  Instruction : 00 [Read Register]
>> [372h 0882   1]        Flags (decoded below) : 00
>>                         Preserve Register Bits : 0
>> [373h 0883   1]                     Reserved : 00
>>
>> [374h 0884  12]              Register Region : [Generic Address Structure]
>> [374h 0884   1]                     Space ID : 00 [SystemMemory]
>> [375h 0885   1]                    Bit Width : 40
>> [376h 0886   1]                   Bit Offset : 00
>> [377h 0887   1]         Encoded Access Width : 04 [QWord Access:64]
>> [378h 0888   8]                      Address : 00000000C1063008
>>
>> [380h 0896   8]                        Value : 0000000000000000
>> [388h 0904   8]                         Mask : FFFFFFFFFFFFFFFF
>>
>> Raw Table Data: Length 912 (0x390)
>>
>>     0000: 45 52 53 54 90 03 00 00 01 C8 42 4F 43 48 53 20  // ERST......BOCHS
>>     0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
>>     0020: 01 00 00 00 30 00 00 00 00 00 00 00 1B 00 00 00  // ....0...........
>>     0030: 00 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0040: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0050: 01 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0060: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0070: 02 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0080: 02 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0090: 03 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     00A0: 03 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     00B0: 04 02 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>>     00C0: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>>     00D0: 04 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     00E0: 04 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     00F0: 05 03 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>>     0100: 9C 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0110: 05 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0120: 05 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0130: 06 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0140: 06 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0150: 06 01 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>>     0160: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0170: 07 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0180: 07 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0190: 07 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>>     01A0: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     01B0: 08 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     01C0: 08 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     01D0: 08 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>>     01E0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>>     01F0: 09 02 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>>     0200: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>>     0210: 09 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0220: 09 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0230: 0A 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0240: 0A 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0250: 0A 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>>     0260: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>>     0270: 0B 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0280: 0B 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0290: 0D 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     02A0: 0D 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     02B0: 0D 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>>     02C0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>>     02D0: 0E 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     02E0: 0E 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     02F0: 0E 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>>     0300: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>>     0310: 0F 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0320: 0F 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0330: 0F 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
>>     0340: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
>>     0350: 10 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
>>     0360: 10 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
>>     0370: 10 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
>>     0380: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
>>
> 
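
For anyone reading the raw dump above by hand: each serialization
instruction entry is 32 bytes, laid out as in the disassembly (action,
instruction, flags, reserved, a 12-byte Generic Address Structure, then a
64-bit value and a 64-bit mask). A minimal sketch of decoding one entry
follows. struct erst_entry, le64() and erst_decode() are names made up
for this illustration and are not taken from QEMU or ACPICA.

```c
#include <stdint.h>

/* One decoded ERST serialization instruction entry (illustrative type). */
struct erst_entry {
    uint8_t  action;
    uint8_t  instruction;
    uint8_t  flags;
    uint8_t  space_id;      /* Generic Address Structure starts here */
    uint8_t  bit_width;
    uint8_t  bit_offset;
    uint8_t  access_width;
    uint64_t address;
    uint64_t value;
    uint64_t mask;
};

/* Read a little-endian 64-bit value independent of host byte order. */
static uint64_t le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Decode the 32 raw bytes of one entry, following the offsets in the dump. */
static struct erst_entry erst_decode(const uint8_t raw[32])
{
    struct erst_entry e;

    e.action       = raw[0];
    e.instruction  = raw[1];
    e.flags        = raw[2];
    /* raw[3] is reserved */
    e.space_id     = raw[4];
    e.bit_width    = raw[5];
    e.bit_offset   = raw[6];
    e.access_width = raw[7];
    e.address      = le64(raw + 8);
    e.value        = le64(raw + 16);
    e.mask         = le64(raw + 24);

    return e;
}
```

Feeding it the 32 bytes at offset 0x1D0 of the dump yields action 0x08,
instruction 0x00, a 64-bit wide access at 0xC1063008 and an all-ones
mask, matching the disassembled Get Record Identifier / Read Register
entry.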



* Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature
  2021-08-04 22:13             ` Eric DeVolder
@ 2021-08-05  9:05               ` Igor Mammedov
  0 siblings, 0 replies; 58+ messages in thread
From: Igor Mammedov @ 2021-08-05  9:05 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: ehabkost, mst, konrad.wilk, qemu-devel, pbonzini, boris.ostrovsky, rth

On Wed, 4 Aug 2021 17:13:40 -0500
Eric DeVolder <eric.devolder@oracle.com> wrote:

> On 7/27/21 7:52 AM, Igor Mammedov wrote:
> > On Mon, 26 Jul 2021 15:01:05 -0500
> > Eric DeVolder <eric.devolder@oracle.com> wrote:
> >   
> >> On 7/26/21 5:42 AM, Igor Mammedov wrote:  
> >>> On Wed, 21 Jul 2021 11:07:40 -0500
> >>> Eric DeVolder <eric.devolder@oracle.com> wrote:
> >>>      
> >>>> On 7/20/21 7:17 AM, Igor Mammedov wrote:  
> >>>>> On Wed, 30 Jun 2021 15:07:16 -0400
> >>>>> Eric DeVolder <eric.devolder@oracle.com> wrote:
> >>>>>         
> > [..]  
> >>>>>> +
> >>>>>> +static uint64_t erst_rd_reg64(hwaddr addr,
> >>>>>> +    uint64_t reg, unsigned size)
> >>>>>> +{
> >>>>>> +    uint64_t rdval;
> >>>>>> +    uint64_t mask;
> >>>>>> +    unsigned shift;
> >>>>>> +
> >>>>>> +    if (size == sizeof(uint64_t)) {
> >>>>>> +        /* 64b access */
> >>>>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> >>>>>> +        shift = 0;
> >>>>>> +    } else {
> >>>>>> +        /* 32b access */
> >>>>>> +        mask = 0x00000000FFFFFFFFUL;
> >>>>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> >>>>>> +    }
> >>>>>> +
> >>>>>> +    rdval = reg;
> >>>>>> +    rdval >>= shift;
> >>>>>> +    rdval &= mask;
> >>>>>> +
> >>>>>> +    return rdval;
> >>>>>> +}
> >>>>>> +
> >>>>>> +static uint64_t erst_wr_reg64(hwaddr addr,
> >>>>>> +    uint64_t reg, uint64_t val, unsigned size)
> >>>>>> +{
> >>>>>> +    uint64_t wrval;
> >>>>>> +    uint64_t mask;
> >>>>>> +    unsigned shift;
> >>>>>> +
> >>>>>> +    if (size == sizeof(uint64_t)) {
> >>>>>> +        /* 64b access */
> >>>>>> +        mask = 0xFFFFFFFFFFFFFFFFUL;
> >>>>>> +        shift = 0;
> >>>>>> +    } else {
> >>>>>> +        /* 32b access */
> >>>>>> +        mask = 0x00000000FFFFFFFFUL;
> >>>>>> +        shift = ((addr & 0x4) == 0x4) ? 32 : 0;
> >>>>>> +    }
> >>>>>> +
> >>>>>> +    val &= mask;
> >>>>>> +    val <<= shift;
> >>>>>> +    mask <<= shift;
> >>>>>> +    wrval = reg;
> >>>>>> +    wrval &= ~mask;
> >>>>>> +    wrval |= val;
> >>>>>> +
> >>>>>> +    return wrval;
> >>>>>> +}  
> >>>>> (I see in the next patch that it's us defining the access width in the ACPI tables)
> >>>>> so question is: do we have to have mixed register width access?
> >>>>> can't all register accesses be 64-bit?  
> >>>>
> >>>> Initially I attempted to just use 64-bit exclusively. The problem is that,
> >>>> for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
> >>>> register access into two. Here's an example of reading the exchange buffer
> >>>> address, which is coded as a 64b access:
> >>>>
> >>>> acpi_erst_reg_write addr: 0x0000 <== 0x000000000000000d (size: 4)
> >>>> acpi_erst_reg_read  addr: 0x0008 ==> 0x00000000c1010000 (size: 4)
> >>>> acpi_erst_reg_read  addr: 0x000c ==> 0x0000000000000000 (size: 4)
> >>>>
> >>>> So I went ahead and made ACTION register accesses 32b, else there would
> >>>> be two 32-bit reads, of which the second is useless.  
> >>>
> >>> could you post here decompiled ERST table?  
> >> As it is long, I posted it to the end of this message.  
> > 
> > RHEL8 or Fedora 34 says that ERST is an invalid table,
> > so I can't help tracing what's going on there.
> > 
> > You'll have to figure out why access is not 64 bit on your own.  
> 
> Today I downloaded Fedora 34 Server and created a guest. Using my
> qemu-6 branch with ERST, it appears to work just fine. I was able to
> create entries into it.
> 
> 
> [    0.010215] ACPI: ERST 0x000000007F9F3000 000390 (v01 BOCHS  BXPC     00000001 BXPC 00000001)
> [    0.010250] ACPI: Reserving ERST table memory at [mem 0x7f9f3000-0x7f9f338f]
> [    1.056244] ERST: Error Record Serialization Table (ERST) support is initialized.
> [    1.056279] pstore: Registered erst as persistent store backend
> 
> total 0
> drwxr-x---.  2 root root     0 Aug  4 18:05 .
> drwxr-xr-x. 10 root root     0 Aug  4 18:05 ..
> -r--r--r--.  1 root root 17700 Aug  4 17:54 dmesg-erst-6992696633267847169
> -r--r--r--.  1 root root 17714 Aug  4 17:54 dmesg-erst-6992696633267847170
> 
> 
> It appears that the Linux OSPM is taking the 64-bit register access and breaking it
> into two 32-bit accesses. 
We need to figure out why this happens.

> If this is the case, then the fix would be in
> Linux and not this code.
> 
> Pending your response to this finding, I have v6 ready to go.
ok, let's go with v6, I hope it will work for me.
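
For reference, a minimal sketch of why the split access in the trace above is
harmless from the device's side. The helpers are condensed from the quoted
erst_rd_reg64()/erst_wr_reg64(); the names rd_reg64(), wr_reg64() and
check_split_access() are hypothetical, used only for this illustration, and the
register value is the one from the quoted trace. It only shows that two 32-bit
accesses at offsets 0x8 and 0xc compose to the same result as one 64-bit
access. It does not explain why the guest splits the access.

```c
#include <assert.h>
#include <stdint.h>

/* Read: return the whole register for an 8-byte access, otherwise the
 * 32-bit half selected by bit 2 of the address. */
static uint64_t rd_reg64(uint64_t addr, uint64_t reg, unsigned size)
{
    int full = (size == sizeof(uint64_t));
    uint64_t mask = full ? UINT64_MAX : 0xFFFFFFFFULL;
    unsigned shift = full ? 0 : ((addr & 0x4) ? 32 : 0);

    return (reg >> shift) & mask;
}

/* Write: merge val into the addressed part of reg and return the result. */
static uint64_t wr_reg64(uint64_t addr, uint64_t reg, uint64_t val,
                         unsigned size)
{
    int full = (size == sizeof(uint64_t));
    uint64_t mask = full ? UINT64_MAX : 0xFFFFFFFFULL;
    unsigned shift = full ? 0 : ((addr & 0x4) ? 32 : 0);

    val = (val & mask) << shift;
    mask <<= shift;
    return (reg & ~mask) | val;
}

/* Hypothetical self-check: split and full-width accesses must agree. */
static void check_split_access(void)
{
    const uint64_t reg = 0x00000000c1010000ULL; /* value seen in the trace */
    uint64_t lo = rd_reg64(0x8, reg, 4);        /* first 32-bit read */
    uint64_t hi = rd_reg64(0xc, reg, 4);        /* second 32-bit read */
    uint64_t r = 0;

    assert(((hi << 32) | lo) == rd_reg64(0x8, reg, 8));

    r = wr_reg64(0x8, r, 0x89abcdefULL, 4);     /* low half */
    r = wr_reg64(0xc, r, 0x01234567ULL, 4);     /* high half */
    assert(r == wr_reg64(0x8, 0, 0x0123456789abcdefULL, 8));
}
```

Calling check_split_access() from a small main() should pass both assertions
without output.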

> Thanks
> eric
> 
> 
> > 
> > [...]
> >   
> >> Obtained via a running guest with:
> >> iasl -d -vt /sys/firmware/acpi/tables/ERST
> >>
> >> /*
> >>    * Intel ACPI Component Architecture
> >>    * AML/ASL+ Disassembler version 20180105 (64-bit version)
> >>    * Copyright (c) 2000 - 2018 Intel Corporation
> >>    *
> >>    * Disassembly of ERST.blob, Mon Jul 26 14:31:21 2021
> >>    *
> >>    * ACPI Data Table [ERST]
> >>    *
> >>    * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
> >>    */
> >>
> >> [000h 0000   4]                    Signature : "ERST"    [Error Record Serialization Table]
> >> [004h 0004   4]                 Table Length : 00000390
> >> [008h 0008   1]                     Revision : 01
> >> [009h 0009   1]                     Checksum : C8
> >> [00Ah 0010   6]                       Oem ID : "BOCHS "
> >> [010h 0016   8]                 Oem Table ID : "BXPC    "
> >> [018h 0024   4]                 Oem Revision : 00000001
> >> [01Ch 0028   4]              Asl Compiler ID : "BXPC"
> >> [020h 0032   4]        Asl Compiler Revision : 00000001
> >>
> >> [024h 0036   4]  Serialization Header Length : 00000030
> >> [028h 0040   4]                     Reserved : 00000000
> >> [02Ch 0044   4]      Instruction Entry Count : 0000001B
> >>
> >> [030h 0048   1]                       Action : 00 [Begin Write Operation]
> >> [031h 0049   1]                  Instruction : 03 [Write Register Value]
> >> [032h 0050   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [033h 0051   1]                     Reserved : 00
> >>
> >> [034h 0052  12]              Register Region : [Generic Address Structure]
> >> [034h 0052   1]                     Space ID : 00 [SystemMemory]
> >> [035h 0053   1]                    Bit Width : 20
> >> [036h 0054   1]                   Bit Offset : 00
> >> [037h 0055   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [038h 0056   8]                      Address : 00000000C1063000
> >>
> >> [040h 0064   8]                        Value : 0000000000000000
> >> [048h 0072   8]                         Mask : 00000000000000FF
> >>
> >> [050h 0080   1]                       Action : 01 [Begin Read Operation]
> >> [051h 0081   1]                  Instruction : 03 [Write Register Value]
> >> [052h 0082   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [053h 0083   1]                     Reserved : 00
> >>
> >> [054h 0084  12]              Register Region : [Generic Address Structure]
> >> [054h 0084   1]                     Space ID : 00 [SystemMemory]
> >> [055h 0085   1]                    Bit Width : 20
> >> [056h 0086   1]                   Bit Offset : 00
> >> [057h 0087   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [058h 0088   8]                      Address : 00000000C1063000
> >>
> >> [060h 0096   8]                        Value : 0000000000000001
> >> [068h 0104   8]                         Mask : 00000000000000FF
> >>
> >> [070h 0112   1]                       Action : 02 [Begin Clear Operation]
> >> [071h 0113   1]                  Instruction : 03 [Write Register Value]
> >> [072h 0114   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [073h 0115   1]                     Reserved : 00
> >>
> >> [074h 0116  12]              Register Region : [Generic Address Structure]
> >> [074h 0116   1]                     Space ID : 00 [SystemMemory]
> >> [075h 0117   1]                    Bit Width : 20
> >> [076h 0118   1]                   Bit Offset : 00
> >> [077h 0119   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [078h 0120   8]                      Address : 00000000C1063000
> >>
> >> [080h 0128   8]                        Value : 0000000000000002
> >> [088h 0136   8]                         Mask : 00000000000000FF
> >>
> >> [090h 0144   1]                       Action : 03 [End Operation]
> >> [091h 0145   1]                  Instruction : 03 [Write Register Value]
> >> [092h 0146   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [093h 0147   1]                     Reserved : 00
> >>
> >> [094h 0148  12]              Register Region : [Generic Address Structure]
> >> [094h 0148   1]                     Space ID : 00 [SystemMemory]
> >> [095h 0149   1]                    Bit Width : 20
> >> [096h 0150   1]                   Bit Offset : 00
> >> [097h 0151   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [098h 0152   8]                      Address : 00000000C1063000
> >>
> >> [0A0h 0160   8]                        Value : 0000000000000003
> >> [0A8h 0168   8]                         Mask : 00000000000000FF
> >>
> >> [0B0h 0176   1]                       Action : 04 [Set Record Offset]
> >> [0B1h 0177   1]                  Instruction : 02 [Write Register]
> >> [0B2h 0178   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [0B3h 0179   1]                     Reserved : 00
> >>
> >> [0B4h 0180  12]              Register Region : [Generic Address Structure]
> >> [0B4h 0180   1]                     Space ID : 00 [SystemMemory]
> >> [0B5h 0181   1]                    Bit Width : 20
> >> [0B6h 0182   1]                   Bit Offset : 00
> >> [0B7h 0183   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [0B8h 0184   8]                      Address : 00000000C1063008
> >>
> >> [0C0h 0192   8]                        Value : 0000000000000000
> >> [0C8h 0200   8]                         Mask : 00000000FFFFFFFF
> >>
> >> [0D0h 0208   1]                       Action : 04 [Set Record Offset]
> >> [0D1h 0209   1]                  Instruction : 03 [Write Register Value]
> >> [0D2h 0210   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [0D3h 0211   1]                     Reserved : 00
> >>
> >> [0D4h 0212  12]              Register Region : [Generic Address Structure]
> >> [0D4h 0212   1]                     Space ID : 00 [SystemMemory]
> >> [0D5h 0213   1]                    Bit Width : 20
> >> [0D6h 0214   1]                   Bit Offset : 00
> >> [0D7h 0215   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [0D8h 0216   8]                      Address : 00000000C1063000
> >>
> >> [0E0h 0224   8]                        Value : 0000000000000004
> >> [0E8h 0232   8]                         Mask : 00000000000000FF
> >>
> >> [0F0h 0240   1]                       Action : 05 [Execute Operation]
> >> [0F1h 0241   1]                  Instruction : 03 [Write Register Value]
> >> [0F2h 0242   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [0F3h 0243   1]                     Reserved : 00
> >>
> >> [0F4h 0244  12]              Register Region : [Generic Address Structure]
> >> [0F4h 0244   1]                     Space ID : 00 [SystemMemory]
> >> [0F5h 0245   1]                    Bit Width : 20
> >> [0F6h 0246   1]                   Bit Offset : 00
> >> [0F7h 0247   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [0F8h 0248   8]                      Address : 00000000C1063008
> >>
> >> [100h 0256   8]                        Value : 000000000000009C
> >> [108h 0264   8]                         Mask : 00000000000000FF
> >>
> >> [110h 0272   1]                       Action : 05 [Execute Operation]
> >> [111h 0273   1]                  Instruction : 03 [Write Register Value]
> >> [112h 0274   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [113h 0275   1]                     Reserved : 00
> >>
> >> [114h 0276  12]              Register Region : [Generic Address Structure]
> >> [114h 0276   1]                     Space ID : 00 [SystemMemory]
> >> [115h 0277   1]                    Bit Width : 20
> >> [116h 0278   1]                   Bit Offset : 00
> >> [117h 0279   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [118h 0280   8]                      Address : 00000000C1063000
> >>
> >> [120h 0288   8]                        Value : 0000000000000005
> >> [128h 0296   8]                         Mask : 00000000000000FF
> >>
> >> [130h 0304   1]                       Action : 06 [Check Busy Status]
> >> [131h 0305   1]                  Instruction : 03 [Write Register Value]
> >> [132h 0306   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [133h 0307   1]                     Reserved : 00
> >>
> >> [134h 0308  12]              Register Region : [Generic Address Structure]
> >> [134h 0308   1]                     Space ID : 00 [SystemMemory]
> >> [135h 0309   1]                    Bit Width : 20
> >> [136h 0310   1]                   Bit Offset : 00
> >> [137h 0311   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [138h 0312   8]                      Address : 00000000C1063000
> >>
> >> [140h 0320   8]                        Value : 0000000000000006
> >> [148h 0328   8]                         Mask : 00000000000000FF
> >>
> >> [150h 0336   1]                       Action : 06 [Check Busy Status]
> >> [151h 0337   1]                  Instruction : 01 [Read Register Value]
> >> [152h 0338   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [153h 0339   1]                     Reserved : 00
> >>
> >> [154h 0340  12]              Register Region : [Generic Address Structure]
> >> [154h 0340   1]                     Space ID : 00 [SystemMemory]
> >> [155h 0341   1]                    Bit Width : 20
> >> [156h 0342   1]                   Bit Offset : 00
> >> [157h 0343   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [158h 0344   8]                      Address : 00000000C1063008
> >>
> >> [160h 0352   8]                        Value : 0000000000000001
> >> [168h 0360   8]                         Mask : 00000000000000FF
> >>
> >> [170h 0368   1]                       Action : 07 [Get Command Status]
> >> [171h 0369   1]                  Instruction : 03 [Write Register Value]
> >> [172h 0370   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [173h 0371   1]                     Reserved : 00
> >>
> >> [174h 0372  12]              Register Region : [Generic Address Structure]
> >> [174h 0372   1]                     Space ID : 00 [SystemMemory]
> >> [175h 0373   1]                    Bit Width : 20
> >> [176h 0374   1]                   Bit Offset : 00
> >> [177h 0375   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [178h 0376   8]                      Address : 00000000C1063000
> >>
> >> [180h 0384   8]                        Value : 0000000000000007
> >> [188h 0392   8]                         Mask : 00000000000000FF
> >>
> >> [190h 0400   1]                       Action : 07 [Get Command Status]
> >> [191h 0401   1]                  Instruction : 00 [Read Register]
> >> [192h 0402   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [193h 0403   1]                     Reserved : 00
> >>
> >> [194h 0404  12]              Register Region : [Generic Address Structure]
> >> [194h 0404   1]                     Space ID : 00 [SystemMemory]
> >> [195h 0405   1]                    Bit Width : 20
> >> [196h 0406   1]                   Bit Offset : 00
> >> [197h 0407   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [198h 0408   8]                      Address : 00000000C1063008
> >>
> >> [1A0h 0416   8]                        Value : 0000000000000000
> >> [1A8h 0424   8]                         Mask : 00000000000000FF
> >>
> >> [1B0h 0432   1]                       Action : 08 [Get Record Identifier]
> >> [1B1h 0433   1]                  Instruction : 03 [Write Register Value]
> >> [1B2h 0434   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [1B3h 0435   1]                     Reserved : 00
> >>
> >> [1B4h 0436  12]              Register Region : [Generic Address Structure]
> >> [1B4h 0436   1]                     Space ID : 00 [SystemMemory]
> >> [1B5h 0437   1]                    Bit Width : 20
> >> [1B6h 0438   1]                   Bit Offset : 00
> >> [1B7h 0439   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [1B8h 0440   8]                      Address : 00000000C1063000
> >>
> >> [1C0h 0448   8]                        Value : 0000000000000008
> >> [1C8h 0456   8]                         Mask : 00000000000000FF
> >>
> >> [1D0h 0464   1]                       Action : 08 [Get Record Identifier]
> >> [1D1h 0465   1]                  Instruction : 00 [Read Register]
> >> [1D2h 0466   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [1D3h 0467   1]                     Reserved : 00
> >>
> >> [1D4h 0468  12]              Register Region : [Generic Address Structure]
> >> [1D4h 0468   1]                     Space ID : 00 [SystemMemory]
> >> [1D5h 0469   1]                    Bit Width : 40
> >> [1D6h 0470   1]                   Bit Offset : 00
> >> [1D7h 0471   1]         Encoded Access Width : 04 [QWord Access:64]
> >> [1D8h 0472   8]                      Address : 00000000C1063008
> >>
> >> [1E0h 0480   8]                        Value : 0000000000000000
> >> [1E8h 0488   8]                         Mask : FFFFFFFFFFFFFFFF
> >>
> >> [1F0h 0496   1]                       Action : 09 [Set Record Identifier]
> >> [1F1h 0497   1]                  Instruction : 02 [Write Register]
> >> [1F2h 0498   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [1F3h 0499   1]                     Reserved : 00
> >>
> >> [1F4h 0500  12]              Register Region : [Generic Address Structure]
> >> [1F4h 0500   1]                     Space ID : 00 [SystemMemory]
> >> [1F5h 0501   1]                    Bit Width : 40
> >> [1F6h 0502   1]                   Bit Offset : 00
> >> [1F7h 0503   1]         Encoded Access Width : 04 [QWord Access:64]
> >> [1F8h 0504   8]                      Address : 00000000C1063008
> >>
> >> [200h 0512   8]                        Value : 0000000000000000
> >> [208h 0520   8]                         Mask : FFFFFFFFFFFFFFFF
> >>
> >> [210h 0528   1]                       Action : 09 [Set Record Identifier]
> >> [211h 0529   1]                  Instruction : 03 [Write Register Value]
> >> [212h 0530   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [213h 0531   1]                     Reserved : 00
> >>
> >> [214h 0532  12]              Register Region : [Generic Address Structure]
> >> [214h 0532   1]                     Space ID : 00 [SystemMemory]
> >> [215h 0533   1]                    Bit Width : 20
> >> [216h 0534   1]                   Bit Offset : 00
> >> [217h 0535   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [218h 0536   8]                      Address : 00000000C1063000
> >>
> >> [220h 0544   8]                        Value : 0000000000000009
> >> [228h 0552   8]                         Mask : 00000000000000FF
> >>
> >> [230h 0560   1]                       Action : 0A [Get Record Count]
> >> [231h 0561   1]                  Instruction : 03 [Write Register Value]
> >> [232h 0562   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [233h 0563   1]                     Reserved : 00
> >>
> >> [234h 0564  12]              Register Region : [Generic Address Structure]
> >> [234h 0564   1]                     Space ID : 00 [SystemMemory]
> >> [235h 0565   1]                    Bit Width : 20
> >> [236h 0566   1]                   Bit Offset : 00
> >> [237h 0567   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [238h 0568   8]                      Address : 00000000C1063000
> >>
> >> [240h 0576   8]                        Value : 000000000000000A
> >> [248h 0584   8]                         Mask : 00000000000000FF
> >>
> >> [250h 0592   1]                       Action : 0A [Get Record Count]
> >> [251h 0593   1]                  Instruction : 00 [Read Register]
> >> [252h 0594   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [253h 0595   1]                     Reserved : 00
> >>
> >> [254h 0596  12]              Register Region : [Generic Address Structure]
> >> [254h 0596   1]                     Space ID : 00 [SystemMemory]
> >> [255h 0597   1]                    Bit Width : 20
> >> [256h 0598   1]                   Bit Offset : 00
> >> [257h 0599   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [258h 0600   8]                      Address : 00000000C1063008
> >>
> >> [260h 0608   8]                        Value : 0000000000000000
> >> [268h 0616   8]                         Mask : 00000000FFFFFFFF
> >>
> >> [270h 0624   1]                       Action : 0B [Begin Dummy Write]
> >> [271h 0625   1]                  Instruction : 03 [Write Register Value]
> >> [272h 0626   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [273h 0627   1]                     Reserved : 00
> >>
> >> [274h 0628  12]              Register Region : [Generic Address Structure]
> >> [274h 0628   1]                     Space ID : 00 [SystemMemory]
> >> [275h 0629   1]                    Bit Width : 20
> >> [276h 0630   1]                   Bit Offset : 00
> >> [277h 0631   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [278h 0632   8]                      Address : 00000000C1063000
> >>
> >> [280h 0640   8]                        Value : 000000000000000B
> >> [288h 0648   8]                         Mask : 00000000000000FF
> >>
> >> [290h 0656   1]                       Action : 0D [Get Error Address Range]
> >> [291h 0657   1]                  Instruction : 03 [Write Register Value]
> >> [292h 0658   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [293h 0659   1]                     Reserved : 00
> >>
> >> [294h 0660  12]              Register Region : [Generic Address Structure]
> >> [294h 0660   1]                     Space ID : 00 [SystemMemory]
> >> [295h 0661   1]                    Bit Width : 20
> >> [296h 0662   1]                   Bit Offset : 00
> >> [297h 0663   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [298h 0664   8]                      Address : 00000000C1063000
> >>
> >> [2A0h 0672   8]                        Value : 000000000000000D
> >> [2A8h 0680   8]                         Mask : 00000000000000FF
> >>
> >> [2B0h 0688   1]                       Action : 0D [Get Error Address Range]
> >> [2B1h 0689   1]                  Instruction : 00 [Read Register]
> >> [2B2h 0690   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [2B3h 0691   1]                     Reserved : 00
> >>
> >> [2B4h 0692  12]              Register Region : [Generic Address Structure]
> >> [2B4h 0692   1]                     Space ID : 00 [SystemMemory]
> >> [2B5h 0693   1]                    Bit Width : 40
> >> [2B6h 0694   1]                   Bit Offset : 00
> >> [2B7h 0695   1]         Encoded Access Width : 04 [QWord Access:64]
> >> [2B8h 0696   8]                      Address : 00000000C1063008
> >>
> >> [2C0h 0704   8]                        Value : 0000000000000000
> >> [2C8h 0712   8]                         Mask : FFFFFFFFFFFFFFFF
> >>
> >> [2D0h 0720   1]                       Action : 0E [Get Error Address Length]
> >> [2D1h 0721   1]                  Instruction : 03 [Write Register Value]
> >> [2D2h 0722   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [2D3h 0723   1]                     Reserved : 00
> >>
> >> [2D4h 0724  12]              Register Region : [Generic Address Structure]
> >> [2D4h 0724   1]                     Space ID : 00 [SystemMemory]
> >> [2D5h 0725   1]                    Bit Width : 20
> >> [2D6h 0726   1]                   Bit Offset : 00
> >> [2D7h 0727   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [2D8h 0728   8]                      Address : 00000000C1063000
> >>
> >> [2E0h 0736   8]                        Value : 000000000000000E
> >> [2E8h 0744   8]                         Mask : 00000000000000FF
> >>
> >> [2F0h 0752   1]                       Action : 0E [Get Error Address Length]
> >> [2F1h 0753   1]                  Instruction : 00 [Read Register]
> >> [2F2h 0754   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [2F3h 0755   1]                     Reserved : 00
> >>
> >> [2F4h 0756  12]              Register Region : [Generic Address Structure]
> >> [2F4h 0756   1]                     Space ID : 00 [SystemMemory]
> >> [2F5h 0757   1]                    Bit Width : 40
> >> [2F6h 0758   1]                   Bit Offset : 00
> >> [2F7h 0759   1]         Encoded Access Width : 04 [QWord Access:64]
> >> [2F8h 0760   8]                      Address : 00000000C1063008
> >>
> >> [300h 0768   8]                        Value : 0000000000000000
> >> [308h 0776   8]                         Mask : 00000000FFFFFFFF
> >>
> >> [310h 0784   1]                       Action : 0F [Get Error Attributes]
> >> [311h 0785   1]                  Instruction : 03 [Write Register Value]
> >> [312h 0786   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [313h 0787   1]                     Reserved : 00
> >>
> >> [314h 0788  12]              Register Region : [Generic Address Structure]
> >> [314h 0788   1]                     Space ID : 00 [SystemMemory]
> >> [315h 0789   1]                    Bit Width : 20
> >> [316h 0790   1]                   Bit Offset : 00
> >> [317h 0791   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [318h 0792   8]                      Address : 00000000C1063000
> >>
> >> [320h 0800   8]                        Value : 000000000000000F
> >> [328h 0808   8]                         Mask : 00000000000000FF
> >>
> >> [330h 0816   1]                       Action : 0F [Get Error Attributes]
> >> [331h 0817   1]                  Instruction : 00 [Read Register]
> >> [332h 0818   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [333h 0819   1]                     Reserved : 00
> >>
> >> [334h 0820  12]              Register Region : [Generic Address Structure]
> >> [334h 0820   1]                     Space ID : 00 [SystemMemory]
> >> [335h 0821   1]                    Bit Width : 20
> >> [336h 0822   1]                   Bit Offset : 00
> >> [337h 0823   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [338h 0824   8]                      Address : 00000000C1063008
> >>
> >> [340h 0832   8]                        Value : 0000000000000000
> >> [348h 0840   8]                         Mask : 00000000FFFFFFFF
> >>
> >> [350h 0848   1]                       Action : 10 [Execute Timings]
> >> [351h 0849   1]                  Instruction : 03 [Write Register Value]
> >> [352h 0850   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [353h 0851   1]                     Reserved : 00
> >>
> >> [354h 0852  12]              Register Region : [Generic Address Structure]
> >> [354h 0852   1]                     Space ID : 00 [SystemMemory]
> >> [355h 0853   1]                    Bit Width : 20
> >> [356h 0854   1]                   Bit Offset : 00
> >> [357h 0855   1]         Encoded Access Width : 03 [DWord Access:32]
> >> [358h 0856   8]                      Address : 00000000C1063000
> >>
> >> [360h 0864   8]                        Value : 0000000000000010
> >> [368h 0872   8]                         Mask : 00000000000000FF
> >>
> >> [370h 0880   1]                       Action : 10 [Execute Timings]
> >> [371h 0881   1]                  Instruction : 00 [Read Register]
> >> [372h 0882   1]        Flags (decoded below) : 00
> >>                         Preserve Register Bits : 0
> >> [373h 0883   1]                     Reserved : 00
> >>
> >> [374h 0884  12]              Register Region : [Generic Address Structure]
> >> [374h 0884   1]                     Space ID : 00 [SystemMemory]
> >> [375h 0885   1]                    Bit Width : 40
> >> [376h 0886   1]                   Bit Offset : 00
> >> [377h 0887   1]         Encoded Access Width : 04 [QWord Access:64]
> >> [378h 0888   8]                      Address : 00000000C1063008
> >>
> >> [380h 0896   8]                        Value : 0000000000000000
> >> [388h 0904   8]                         Mask : FFFFFFFFFFFFFFFF
> >>
> >> Raw Table Data: Length 912 (0x390)
> >>
> >>     0000: 45 52 53 54 90 03 00 00 01 C8 42 4F 43 48 53 20  // ERST......BOCHS
> >>     0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
> >>     0020: 01 00 00 00 30 00 00 00 00 00 00 00 1B 00 00 00  // ....0...........
> >>     0030: 00 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0040: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0050: 01 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0060: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0070: 02 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0080: 02 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0090: 03 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     00A0: 03 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     00B0: 04 02 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
> >>     00C0: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
> >>     00D0: 04 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     00E0: 04 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     00F0: 05 03 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0100: 9C 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0110: 05 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0120: 05 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0130: 06 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0140: 06 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0150: 06 01 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0160: 01 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0170: 07 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0180: 07 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0190: 07 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
> >>     01A0: 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     01B0: 08 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     01C0: 08 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     01D0: 08 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
> >>     01E0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
> >>     01F0: 09 02 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
> >>     0200: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
> >>     0210: 09 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0220: 09 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0230: 0A 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0240: 0A 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0250: 0A 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0260: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
> >>     0270: 0B 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0280: 0B 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0290: 0D 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     02A0: 0D 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     02B0: 0D 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
> >>     02C0: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
> >>     02D0: 0E 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     02E0: 0E 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     02F0: 0E 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
> >>     0300: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
> >>     0310: 0F 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0320: 0F 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0330: 0F 00 00 00 00 20 00 03 08 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0340: 00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00  // ................
> >>     0350: 10 03 00 00 00 20 00 03 00 30 06 C1 00 00 00 00  // ..... ...0......
> >>     0360: 10 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00  // ................
> >>     0370: 10 00 00 00 00 40 00 04 08 30 06 C1 00 00 00 00  // .....@...0......
> >>     0380: 00 00 00 00 00 00 00 00 FF FF FF FF FF FF FF FF  // ................
> >>  
> >   
> 
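
For anyone reading the quoted dump by hand: each pair of 16-byte rows from
offset 0x30 onward appears to be one 32-byte ERST serialization instruction
entry (action, instruction, flags, reserved, a 12-byte Generic Address
Structure, then a 64-bit value and a 64-bit mask). Below is a minimal Python
sketch of that decoding, assuming the entry layout from the ACPI spec. The
names decode_entry, ENTRY, INSTRUCTIONS and SAMPLE are made up for
illustration and are not part of the patch series.

```python
import struct

# One ERST serialization instruction entry is 32 bytes, little-endian:
#   action, instruction, flags, reserved              (4 x u8)
#   Generic Address Structure: space_id, bit_width,
#     bit_offset, access_size (4 x u8), address (u64)
#   value (u64), mask (u64)
ENTRY = struct.Struct("<BBBB BBBBQ QQ")

# Assumption: instruction names per the ACPI ERST chapter; only the codes
# that occur in the quoted dump (0x00..0x03) are listed here.
INSTRUCTIONS = {
    0x00: "READ_REGISTER",
    0x01: "READ_REGISTER_VALUE",
    0x02: "WRITE_REGISTER",
    0x03: "WRITE_REGISTER_VALUE",
}


def decode_entry(raw: bytes) -> dict:
    """Decode a single 32-byte instruction entry into named fields."""
    (action, instruction, flags, _reserved,
     space_id, bit_width, bit_offset, access_size, address,
     value, mask) = ENTRY.unpack(raw)
    return {
        "action": action,
        "instruction": INSTRUCTIONS.get(instruction, hex(instruction)),
        "flags": flags,
        "space_id": space_id,
        "bit_width": bit_width,
        "bit_offset": bit_offset,
        "access_size": access_size,
        "address": address,
        "value": value,
        "mask": mask,
    }


# Rows 00B0 and 00C0 of the quoted dump, copied verbatim.
SAMPLE = bytes.fromhex(
    "04 02 00 00 00 20 00 03 08 30 06 C1 00 00 00 00 "
    "00 00 00 00 00 00 00 00 FF FF FF FF 00 00 00 00"
)

if __name__ == "__main__":
    for name, field in decode_entry(SAMPLE).items():
        # Addresses and masks are easier to compare against the dump in hex.
        print(f"{name:12} {field if isinstance(field, str) else hex(field)}")
```

Read this way, the sample is action 0x04 with a 32-bit WRITE_REGISTER to
address 0xC1063008 under mask 0xFFFFFFFF, which matches the byte order in the
rows above.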




end of thread (~2021-08-05  9:06 UTC)

Thread overview: 58+ messages
2021-06-30 19:07 [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 01/10] ACPI ERST: bios-tables-test.c steps 1 and 2 Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 02/10] ACPI ERST: specification for ERST support Eric DeVolder
2021-06-30 19:26   ` Eric DeVolder
2021-07-19 15:02     ` Igor Mammedov
2021-07-21 15:42       ` Eric DeVolder
2021-07-26 10:06         ` Igor Mammedov
2021-07-26 19:52           ` Eric DeVolder
2021-07-27 11:45             ` Igor Mammedov
2021-07-28 15:16               ` Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 03/10] ACPI ERST: PCI device_id for ERST Eric DeVolder
2021-07-19 15:06   ` Igor Mammedov
2021-07-21 15:42     ` Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 04/10] ACPI ERST: header file " Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature Eric DeVolder
2021-07-20 12:17   ` Igor Mammedov
2021-07-21 16:07     ` Eric DeVolder
2021-07-26 10:42       ` Igor Mammedov
2021-07-26 20:01         ` Eric DeVolder
2021-07-27 12:52           ` Igor Mammedov
2021-08-04 22:13             ` Eric DeVolder
2021-08-05  9:05               ` Igor Mammedov
2021-07-21 17:36     ` Eric DeVolder
2021-07-26 10:13       ` Igor Mammedov
2021-06-30 19:07 ` [PATCH v5 06/10] ACPI ERST: build the ACPI ERST table Eric DeVolder
2021-07-20 13:16   ` Igor Mammedov
2021-07-20 14:59     ` Igor Mammedov
2021-07-21 16:12     ` Eric DeVolder
2021-07-26 11:00       ` Igor Mammedov
2021-07-26 20:02         ` Eric DeVolder
2021-07-27 12:01           ` Igor Mammedov
2021-07-28 15:18             ` Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 07/10] ACPI ERST: trace support Eric DeVolder
2021-07-20 13:15   ` Igor Mammedov
2021-07-21 16:14     ` Eric DeVolder
2021-07-26 11:08       ` Igor Mammedov
2021-07-26 20:03         ` Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 08/10] ACPI ERST: create ACPI ERST table for pc/x86 machines Eric DeVolder
2021-07-20 13:19   ` Igor Mammedov
2021-07-21 16:16     ` Eric DeVolder
2021-07-26 11:30       ` Igor Mammedov
2021-07-26 20:03         ` Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 09/10] ACPI ERST: qtest for ERST Eric DeVolder
2021-07-20 13:38   ` Igor Mammedov
2021-07-21 16:18     ` Eric DeVolder
2021-07-26 11:45       ` Igor Mammedov
2021-07-26 20:06         ` Eric DeVolder
2021-06-30 19:07 ` [PATCH v5 10/10] ACPI ERST: step 6 of bios-tables-test.c Eric DeVolder
2021-07-20 13:24   ` Igor Mammedov
2021-07-21 16:19     ` Eric DeVolder
2021-07-13 20:38 ` [PATCH v5 00/10] acpi: Error Record Serialization Table, ERST, support for QEMU Michael S. Tsirkin
2021-07-21 15:23   ` Eric DeVolder
2021-07-20 14:57 ` Igor Mammedov
2021-07-21 15:26   ` Eric DeVolder
2021-07-23 16:26   ` Eric DeVolder
2021-07-27 12:55   ` Igor Mammedov
2021-07-28 15:19     ` Eric DeVolder
2021-07-29  8:07       ` Igor Mammedov
