All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xiang Zheng <zhengxiang9@huawei.com>
To: <pbonzini@redhat.com>, <mst@redhat.com>, <imammedo@redhat.com>,
	<shannon.zhaosl@gmail.com>, <peter.maydell@linaro.org>,
	<lersek@redhat.com>, <james.morse@arm.com>,
	<gengdongjiu@huawei.com>, <mtosatti@redhat.com>,
	<rth@twiddle.net>, <ehabkost@redhat.com>,
	<jonathan.cameron@huawei.com>, <xuwei5@huawei.com>,
	<kvm@vger.kernel.org>, <qemu-devel@nongnu.org>,
	<qemu-arm@nongnu.org>, <linuxarm@huawei.com>
Cc: <zhengxiang9@huawei.com>, <wanghaibin.wang@huawei.com>
Subject: [PATCH v18 2/6] docs: APEI GHES generation and CPER record description
Date: Fri, 6 Sep 2019 16:31:48 +0800	[thread overview]
Message-ID: <20190906083152.25716-3-zhengxiang9@huawei.com> (raw)
In-Reply-To: <20190906083152.25716-1-zhengxiang9@huawei.com>

From: Dongjiu Geng <gengdongjiu@huawei.com>

Add APEI/GHES detailed design document

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
---
 docs/specs/acpi_hest_ghes.txt | 88 +++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 docs/specs/acpi_hest_ghes.txt

diff --git a/docs/specs/acpi_hest_ghes.txt b/docs/specs/acpi_hest_ghes.txt
new file mode 100644
index 0000000000..690d4b2bd0
--- /dev/null
+++ b/docs/specs/acpi_hest_ghes.txt
@@ -0,0 +1,88 @@
+APEI tables generating and CPER record
+=============================
+
+Copyright (C) 2019 Huawei Corporation.
+
+Design Details:
+-------------------
+
+       etc/acpi/tables                                 etc/hardware_errors
+    ====================                      ==========================================
++ +--------------------------+            +-----------------------+
+| | HEST                     |            |    address            |            +--------------+
+| +--------------------------+            |    registers          |            | Error Status |
+| | GHES1                    |            | +---------------------+            | Data Block 1 |
+| +--------------------------+ +--------->| |error_block_address1 |----------->| +------------+
+| | .................        | |          | +---------------------+            | |  CPER      |
+| | error_status_address-----+-+ +------->| |error_block_address2 |--------+   | |  CPER      |
+| | .................        |   |        | +---------------------+        |   | |  ....      |
+| | read_ack_register--------+-+ |        | |    ..............   |        |   | |  CPER      |
+| | read_ack_preserve        | | |        +-----------------------+        |   | +------------+
+| | read_ack_write           | | | +----->| |error_block_addressN |------+ |   | Error Status |
++ +--------------------------+ | | |      | +---------------------+      | |   | Data Block 2 |
+| | GHES2                    | +-+-+----->| |read_ack_register1   |      | +-->| +------------+
++ +--------------------------+   | |      | +---------------------+      |     | |  CPER      |
+| | .................        |   | | +--->| |read_ack_register2   |      |     | |  CPER      |
+| | error_status_address-----+---+ | |    | +---------------------+      |     | |  ....      |
+| | .................        |     | |    | |  .............      |      |     | |  CPER      |
+| | read_ack_register--------+-----+-+    | +---------------------+      |     +-+------------+
+| | read_ack_preserve        |     |   +->| |read_ack_registerN   |      |     | |..........  |
+| | read_ack_write           |     |   |  | +---------------------+      |     | +------------+
++ +--------------------------|     |   |                                 |     | Error Status |
+| | ...............          |     |   |                                 |     | Data Block N |
++ +--------------------------+     |   |                                 +---->| +------------+
+| | GHESN                    |     |   |                                       | |  CPER      |
++ +--------------------------+     |   |                                       | |  CPER      |
+| | .................        |     |   |                                       | |  ....      |
+| | error_status_address-----+-----+   |                                       | |  CPER      |
+| | .................        |         |                                       +-+------------+
+| | read_ack_register--------+---------+
+| | read_ack_preserve        |
+| | read_ack_write           |
++ +--------------------------+
+
+(1) QEMU generates the ACPI HEST table. This table goes in the current
+    "etc/acpi/tables" fw_cfg blob. Each error source has different
+    notification types.
+
+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
+    also need to populate this blob. The "etc/hardwre_errors" fw_cfg blob
+    contains an address registers table and an Error Status Data Block table.
+
+(3) The address registers table contains N Error Block Address entries
+    and N Read Ack Register entries, the size for each entry is 8-byte.
+    The Error Status Data Block table contains N Error Status Data Block
+    entries, the size for each entry is 4096(0x1000) bytes. The total size
+    for "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
+    N is the kinds of hardware error sources.
+
+(4) QEMU generates the ACPI linker/loader script for the firmware, the
+    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
+    and copies blobs content there.
+
+(5) QEMU generates N ADD_POINTER commands, which patch address in the
+    "error_status_address" fields of the HEST table with a pointer to the
+    corresponding "address registers" in "etc/hardware_errors" blob.
+
+(6) QEMU generates N ADD_POINTER commands, which patch address in the
+    "read_ack_register" fields of the HEST table with a pointer to the
+    corresponding "address registers" in "etc/hardware_errors" blob.
+
+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
+    address in the " error_block_address" fields with a pointer to the
+    respective "Error Status Data Block" in "etc/hardware_errors" blob.
+
+(8) QEMU defines a third and write-only fw_cfg blob which is called
+    "etc/hardware_errors_addr". Through that blob, the firmware can send back
+    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
+    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER commands
+    for the firmware, the firmware will write back the start address of
+    "etc/hardware_errors" blob to fw_cfg file "etc/hardware_errors_addr".
+
+(9) When QEMU gets SIGBUS from the kernel, QEMU formats the CPER right into
+    guest memory, and then injects whatever interrupt (or assert whatever GPIO
+    line) as a notification which is necessary for notifying the guest.
+
+(10) This notification (in virtual hardware) will be handled by guest kernel,
+    guest APEI driver will read the CPER which is recorded by QEMU and do the
+    recovery.
-- 
2.19.1



WARNING: multiple messages have this Message-ID (diff)
From: Xiang Zheng <zhengxiang9@huawei.com>
To: <pbonzini@redhat.com>, <mst@redhat.com>, <imammedo@redhat.com>,
	<shannon.zhaosl@gmail.com>, <peter.maydell@linaro.org>,
	<lersek@redhat.com>, <james.morse@arm.com>,
	<gengdongjiu@huawei.com>, <mtosatti@redhat.com>,
	<rth@twiddle.net>, <ehabkost@redhat.com>,
	<jonathan.cameron@huawei.com>, <xuwei5@huawei.com>,
	<kvm@vger.kernel.org>, <qemu-devel@nongnu.org>,
	<qemu-arm@nongnu.org>, <linuxarm@huawei.com>
Cc: wanghaibin.wang@huawei.com, zhengxiang9@huawei.com
Subject: [Qemu-devel] [PATCH v18 2/6] docs: APEI GHES generation and CPER record description
Date: Fri, 6 Sep 2019 16:31:48 +0800	[thread overview]
Message-ID: <20190906083152.25716-3-zhengxiang9@huawei.com> (raw)
In-Reply-To: <20190906083152.25716-1-zhengxiang9@huawei.com>

From: Dongjiu Geng <gengdongjiu@huawei.com>

Add APEI/GHES detailed design document

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
---
 docs/specs/acpi_hest_ghes.txt | 88 +++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 docs/specs/acpi_hest_ghes.txt

diff --git a/docs/specs/acpi_hest_ghes.txt b/docs/specs/acpi_hest_ghes.txt
new file mode 100644
index 0000000000..690d4b2bd0
--- /dev/null
+++ b/docs/specs/acpi_hest_ghes.txt
@@ -0,0 +1,88 @@
+APEI tables generating and CPER record
+=============================
+
+Copyright (C) 2019 Huawei Corporation.
+
+Design Details:
+-------------------
+
+       etc/acpi/tables                                 etc/hardware_errors
+    ====================                      ==========================================
++ +--------------------------+            +-----------------------+
+| | HEST                     |            |    address            |            +--------------+
+| +--------------------------+            |    registers          |            | Error Status |
+| | GHES1                    |            | +---------------------+            | Data Block 1 |
+| +--------------------------+ +--------->| |error_block_address1 |----------->| +------------+
+| | .................        | |          | +---------------------+            | |  CPER      |
+| | error_status_address-----+-+ +------->| |error_block_address2 |--------+   | |  CPER      |
+| | .................        |   |        | +---------------------+        |   | |  ....      |
+| | read_ack_register--------+-+ |        | |    ..............   |        |   | |  CPER      |
+| | read_ack_preserve        | | |        +-----------------------+        |   | +------------+
+| | read_ack_write           | | | +----->| |error_block_addressN |------+ |   | Error Status |
++ +--------------------------+ | | |      | +---------------------+      | |   | Data Block 2 |
+| | GHES2                    | +-+-+----->| |read_ack_register1   |      | +-->| +------------+
++ +--------------------------+   | |      | +---------------------+      |     | |  CPER      |
+| | .................        |   | | +--->| |read_ack_register2   |      |     | |  CPER      |
+| | error_status_address-----+---+ | |    | +---------------------+      |     | |  ....      |
+| | .................        |     | |    | |  .............      |      |     | |  CPER      |
+| | read_ack_register--------+-----+-+    | +---------------------+      |     +-+------------+
+| | read_ack_preserve        |     |   +->| |read_ack_registerN   |      |     | |..........  |
+| | read_ack_write           |     |   |  | +---------------------+      |     | +------------+
++ +--------------------------|     |   |                                 |     | Error Status |
+| | ...............          |     |   |                                 |     | Data Block N |
++ +--------------------------+     |   |                                 +---->| +------------+
+| | GHESN                    |     |   |                                       | |  CPER      |
++ +--------------------------+     |   |                                       | |  CPER      |
+| | .................        |     |   |                                       | |  ....      |
+| | error_status_address-----+-----+   |                                       | |  CPER      |
+| | .................        |         |                                       +-+------------+
+| | read_ack_register--------+---------+
+| | read_ack_preserve        |
+| | read_ack_write           |
++ +--------------------------+
+
+(1) QEMU generates the ACPI HEST table. This table goes in the current
+    "etc/acpi/tables" fw_cfg blob. Each error source has different
+    notification types.
+
+(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
+    also need to populate this blob. The "etc/hardwre_errors" fw_cfg blob
+    contains an address registers table and an Error Status Data Block table.
+
+(3) The address registers table contains N Error Block Address entries
+    and N Read Ack Register entries, the size for each entry is 8-byte.
+    The Error Status Data Block table contains N Error Status Data Block
+    entries, the size for each entry is 4096(0x1000) bytes. The total size
+    for "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
+    N is the kinds of hardware error sources.
+
+(4) QEMU generates the ACPI linker/loader script for the firmware, the
+    firmware pre-allocates memory for "etc/acpi/tables", "etc/hardware_errors"
+    and copies blobs content there.
+
+(5) QEMU generates N ADD_POINTER commands, which patch address in the
+    "error_status_address" fields of the HEST table with a pointer to the
+    corresponding "address registers" in "etc/hardware_errors" blob.
+
+(6) QEMU generates N ADD_POINTER commands, which patch address in the
+    "read_ack_register" fields of the HEST table with a pointer to the
+    corresponding "address registers" in "etc/hardware_errors" blob.
+
+(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
+    address in the " error_block_address" fields with a pointer to the
+    respective "Error Status Data Block" in "etc/hardware_errors" blob.
+
+(8) QEMU defines a third and write-only fw_cfg blob which is called
+    "etc/hardware_errors_addr". Through that blob, the firmware can send back
+    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
+    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER commands
+    for the firmware, the firmware will write back the start address of
+    "etc/hardware_errors" blob to fw_cfg file "etc/hardware_errors_addr".
+
+(9) When QEMU gets SIGBUS from the kernel, QEMU formats the CPER right into
+    guest memory, and then injects whatever interrupt (or assert whatever GPIO
+    line) as a notification which is necessary for notifying the guest.
+
+(10) This notification (in virtual hardware) will be handled by guest kernel,
+    guest APEI driver will read the CPER which is recorded by QEMU and do the
+    recovery.
-- 
2.19.1




  parent reply	other threads:[~2019-09-06  8:33 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-06  8:31 [PATCH v18 0/6] Add ARMv8 RAS virtualization support in QEMU Xiang Zheng
2019-09-06  8:31 ` [Qemu-devel] " Xiang Zheng
2019-09-06  8:31 ` [PATCH v18 1/6] hw/arm/virt: Introduce RAS platform version and RAS machine option Xiang Zheng
2019-09-06  8:31   ` [Qemu-devel] " Xiang Zheng
2019-09-27 14:02   ` Peter Maydell
2019-09-27 14:02     ` Peter Maydell
2019-09-29  2:04     ` Xiang Zheng
2019-09-29  2:04       ` Xiang Zheng
2019-09-06  8:31 ` Xiang Zheng [this message]
2019-09-06  8:31   ` [Qemu-devel] [PATCH v18 2/6] docs: APEI GHES generation and CPER record description Xiang Zheng
2019-09-19 13:25   ` Peter Maydell
2019-09-19 13:25     ` [Qemu-devel] " Peter Maydell
2019-09-20  1:45     ` Xiang Zheng
2019-09-20  1:45       ` Xiang Zheng
2019-10-04  8:20   ` [Qemu-devel] " Igor Mammedov
2019-10-04  8:20     ` Igor Mammedov
2019-10-08 13:25     ` Xiang Zheng
2019-10-08 13:25       ` Xiang Zheng
2019-09-06  8:31 ` [PATCH v18 3/6] ACPI: Add APEI GHES table generation support Xiang Zheng
2019-09-06  8:31   ` [Qemu-devel] " Xiang Zheng
2019-09-27 15:43   ` Michael S. Tsirkin
2019-09-27 15:43     ` Michael S. Tsirkin
2019-10-08  6:00     ` Xiang Zheng
2019-10-08  6:00       ` Xiang Zheng
2019-10-08  7:45       ` Michael S. Tsirkin
2019-10-08  7:45         ` Michael S. Tsirkin
2019-10-08 12:48         ` Xiang Zheng
2019-10-08 12:48           ` Xiang Zheng
2019-09-06  8:31 ` [PATCH v18 4/6] KVM: Move hwpoison page related functions into include/sysemu/kvm_int.h Xiang Zheng
2019-09-06  8:31   ` [Qemu-devel] " Xiang Zheng
2019-09-27 13:19   ` [Qemu-arm] " Peter Maydell
2019-09-27 13:19     ` Peter Maydell
2019-10-08  7:01     ` Xiang Zheng
2019-10-08  7:01       ` Xiang Zheng
2019-09-06  8:31 ` [PATCH v18 5/6] target-arm: kvm64: inject synchronous External Abort Xiang Zheng
2019-09-06  8:31   ` [Qemu-devel] " Xiang Zheng
2019-09-27 13:33   ` Peter Maydell
2019-09-27 13:33     ` Peter Maydell
2019-10-08  8:05     ` Xiang Zheng
2019-10-08  8:05       ` Xiang Zheng
2019-09-06  8:31 ` [PATCH v18 6/6] target-arm: kvm64: handle SIGBUS signal from kernel or KVM Xiang Zheng
2019-09-06  8:31   ` [Qemu-devel] " Xiang Zheng
2019-09-27 13:57   ` Peter Maydell
2019-09-27 13:57     ` Peter Maydell
2019-10-08 12:42     ` Xiang Zheng
2019-10-08 12:42       ` Xiang Zheng
2019-09-17 12:39 ` [PATCH v18 0/6] Add ARMv8 RAS virtualization support in QEMU Xiang Zheng
2019-09-17 12:39   ` [Qemu-devel] " Xiang Zheng
2019-09-20  2:07   ` gengdongjiu
2019-09-20  2:07     ` gengdongjiu
2019-09-27 14:03 ` [Qemu-arm] " Peter Maydell
2019-09-27 14:03   ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190906083152.25716-3-zhengxiang9@huawei.com \
    --to=zhengxiang9@huawei.com \
    --cc=ehabkost@redhat.com \
    --cc=gengdongjiu@huawei.com \
    --cc=imammedo@redhat.com \
    --cc=james.morse@arm.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=kvm@vger.kernel.org \
    --cc=lersek@redhat.com \
    --cc=linuxarm@huawei.com \
    --cc=mst@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=shannon.zhaosl@gmail.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=xuwei5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.