From: gengdongjiu <gengdongjiu@huawei.com>
To: Laszlo Ersek <lersek@redhat.com>, Achin Gupta <achin.gupta@arm.com>
Cc: <ard.biesheuvel@linaro.org>, <edk2-devel@ml01.01.org>,
	<qemu-devel@nongnu.org>, <zhaoshenglong@huawei.com>,
	James Morse <james.morse@arm.com>,
	Christoffer Dall <cdall@linaro.org>, <xiexiuqi@huawei.com>,
	Marc Zyngier <marc.zyngier@arm.com>, <catalin.marinas@arm.com>,
	<will.deacon@arm.com>, <christoffer.dall@linaro.org>,
	<rkrcmar@redhat.com>, <suzuki.poulose@arm.com>,
	<andre.przywara@arm.com>, <mark.rutland@arm.com>,
	<vladimir.murzin@arm.com>, <linux-arm-kernel@lists.infradead.org>,
	<kvmarm@lists.cs.columbia.edu>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <wangxiongfeng2@huawei.com>,
	<wuquanming@huawei.com>, <huangshaoyu@huawei.com>,
	<Leif.Lindholm@linaro.com>, <nd@arm.com>,
	Michael Tsirkin <mtsirkin@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [PATCH] kvm: pass the virtual SEI syndrome to guest OS
Date: Thu, 6 Apr 2017 20:35:26 +0800
Message-ID: <7c5c8ab7-8fcc-1c98-0bc1-cccb66c4c84d@huawei.com>
In-Reply-To: <2a427164-9b37-6711-3a56-906634ba7f12@redhat.com>

Dear Laszlo,
   Thanks for your detailed explanation.

On 2017/3/29 19:58, Laszlo Ersek wrote:
> (This ought to be one of the longest address lists I've ever seen :)
> Thanks for the CC. I'm glad Shannon is already on the CC list. For good
> measure, I'm adding MST and Igor.)
> 
> On 03/29/17 12:36, Achin Gupta wrote:
>> Hi gengdongjiu,
>>
>> On Wed, Mar 29, 2017 at 05:36:37PM +0800, gengdongjiu wrote:
>>>
>>> Hi Laszlo/Biesheuvel/Qemu developer,
>>>
>>>    I have run into an issue on the ARM64 platform and would like to consult you, as described below:
>>>
>>> When a synchronous or asynchronous abort happens in the guest OS, kvm needs
>>> to send the error address to Qemu or UEFI through SIGBUS so that the
>>> APEI table can be generated dynamically. From my investigation, there are
>>> two ways:
>>>
>>> (1) Qemu gets the error address and generates the APEI table, then
>>> notifies UEFI about it, then injects the abort error into the
>>> guest OS; the guest OS reads the APEI table.
>>> (2) Qemu gets the error address and lets UEFI generate the APEI
>>> table, then injects the abort error into the guest OS; the guest OS
>>> reads the APEI table.
>>
>> Just being pedantic! I don't think we are talking about creating the APEI table
>> dynamically here. The issue is: Once KVM has received an error that is destined
>> for a guest it will raise a SIGBUS to Qemu. Now before Qemu can inject the error
>> into the guest OS, a CPER (Common Platform Error Record) has to be generated
>> corresponding to the error source (GHES corresponding to memory subsystem,
>> processor etc) to allow the guest OS to do anything meaningful with the
>> error. So who should create the CPER is the question.
>>
>> At the EL3/EL2 interface (Secure Firmware and OS/Hypervisor), an error arrives
>> at EL3 and secure firmware (at EL3 or a lower secure exception level) is
>> responsible for creating the CPER. ARM is experimenting with using a Standalone
>> MM EDK2 image in the secure world to do the CPER creation. This will avoid
>> adding the same code in ARM TF in EL3 (better for security). The error will then
>> be injected into the OS/Hypervisor (through SEA/SEI/SDEI) through ARM Trusted
>> Firmware.
>>
>> Qemu is essentially fulfilling the role of secure firmware at the EL2/EL1
>> interface (as discussed with Christoffer below). So it should generate the CPER
>> before injecting the error.
>>
>> This corresponds to (1) above apart from notifying UEFI (I am assuming you
>> mean guest UEFI). At this time, the guest OS already knows where to pick up the
>> CPER from through the HEST. Qemu has to create the CPER and populate its address
>> at the address exported in the HEST. Guest UEFI should not be involved in this
>> flow. Its job was to create the HEST at boot and that has been done by this
>> stage.
>>
>> Qemu folk will be able to add but it looks like support for CPER generation will
>> need to be added to Qemu. We need to resolve this.
>>
>> Do shout if I am missing anything above.
> 
> After reading this email, the use case looks *very* similar to what
> we've just done with VMGENID for QEMU 2.9.
> 
> We have a facility between QEMU and the guest firmware, called "ACPI
> linker/loader", with which QEMU instructs the firmware to
> 
> - allocate and download blobs into guest RAM (AcpiNVS type memory) --
> ALLOCATE command,
> 
> - relocate pointers in those blobs, to fields in other (or the same)
> blobs -- ADD_POINTER command,
> 
> - set ACPI table checksums -- ADD_CHECKSUM command,
> 
> - and send GPAs of fields within such blobs back to QEMU --
> WRITE_POINTER command.
> 
> This is how I imagine we can map the facility to the current use case
> (note that this is the first time I read about HEST / GHES / CPER):
> 
>     etc/acpi/tables                 etc/hardware_errors
>     ================     ==========================================
>                          +-----------+
>     +--------------+     | address   |         +-> +--------------+
>     |    HEST      +     | registers |         |   | Error Status |
>     + +------------+     | +---------+         |   | Data Block 1 |
>     | | GHES       | --> | | address | --------+   | +------------+
>     | | GHES       | --> | | address | ------+     | |  CPER      |
>     | | GHES       | --> | | address | ----+ |     | |  CPER      |
>     | | GHES       | --> | | address | -+  | |     | |  CPER      |
>     +-+------------+     +-+---------+  |  | |     +-+------------+
>                                         |  | |
>                                         |  | +---> +--------------+
>                                         |  |       | Error Status |
>                                         |  |       | Data Block 2 |
>                                         |  |       | +------------+
>                                         |  |       | |  CPER      |
>                                         |  |       | |  CPER      |
>                                         |  |       +-+------------+
>                                         |  |
>                                         |  +-----> +--------------+
>                                         |          | Error Status |
>                                         |          | Data Block 3 |
>                                         |          | +------------+
>                                         |          | |  CPER      |
>                                         |          +-+------------+
>                                         |
>                                         +--------> +--------------+
>                                                    | Error Status |
>                                                    | Data Block 4 |
>                                                    | +------------+
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    +-+------------+
> 
> (1) QEMU generates the HEST ACPI table. This table goes in the current
> "etc/acpi/tables" fw_cfg blob. Given N error sources, there will be N
> GHES objects in the HEST.
> 
> (2) We introduce a new fw_cfg blob called "etc/hardware_errors". QEMU
> also populates this blob.
> 
> (2a) Given N error sources, the (unnamed) table of address registers
> will contain N address registers.
> 
> (2b) Given N error sources, the "etc/hardware_errors" fw_cfg blob will
> also contain N Error Status Data Blocks.
> 
> I don't know about the sizing (number of CPERs) each Error Status Data
> Block has to contain, but I understand it is all pre-allocated as far as
> the OS is concerned, which matches our capabilities well.
I have a question here. You wrote that the "etc/hardware_errors" fw_cfg blob
will also contain N Error Status Data Blocks. Because the number of CPERs is
not fixed, how do we assign a size to each Error Status Data Block when they
all live in one "etc/hardware_errors" fw_cfg blob?
With a single etc/hardware_errors blob, will the N Error Status Data Blocks
share one contiguous buffer, as shown below? If so, it may be inconvenient to
extend the size of an individual data block later.
I see that bios_linker_loader_alloc allocates one contiguous buffer per blob
(for example VMGENID_GUID_FW_CFG_FILE):

    /* Allocate guest memory for the Data fw_cfg blob */
    bios_linker_loader_alloc(linker, VMGENID_GUID_FW_CFG_FILE, guid, 4096,
                             false /* page boundary, high memory */);



     etc/acpi/tables                 etc/hardware_errors
     ================     ==========================================
                          +-----------+
     +--------------+     | address   |             +--------------+
     |    HEST      +     | registers |             | Error Status |
     + +------------+     | +---------+             | Data Block   |
     | | GHES       | --> | | address | --------+-->| +------------+
     | | GHES       | --> | | address | ------+     | |  CPER      |
     | | GHES       | --> | | address | ----+ |     | |  CPER      |
     | | GHES       | --> | | address | -+  | |     | |  CPER      |
     +-+------------+     +-+---------+  |  | +---> +--------------+
                                         |  |       | |  CPER      |
                                         |  |       | |  CPER      |
                                         |  +-----> +--------------+
                                         |          | |  CPER      |
                                         +--------> +--------------+
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    +-+------------+
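
To make the sizing question concrete, here is a minimal sketch (not QEMU
code) of how the offsets inside one contiguous blob could be computed when
the data blocks have different sizes. The function name hw_errors_layout,
the 8-byte address register size and the layout "register table first,
blocks after" are my assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_REG_SIZE 8u   /* assumed: one 64-bit address register per source */

/*
 * Assumed layout of the single blob:
 *   [N address registers][block 0][block 1]...[block N-1]
 * block_size[] holds hypothetical per-source block sizes chosen up front.
 * block_off[k] receives the offset of block k; the total size is returned.
 */
static size_t hw_errors_layout(const size_t *block_size, size_t n,
                               size_t *block_off)
{
    size_t off = n * ADDR_REG_SIZE;   /* blocks follow the register table */

    for (size_t k = 0; k < n; k++) {
        assert(block_size[k] % 8 == 0);   /* keep every block 8-byte aligned */
        block_off[k] = off;
        off += block_size[k];
    }
    return off;
}
```

With sizes fixed like this at build time, the blocks can differ in size, but
growing one block later moves every block behind it, which is the
inconvenience I mean.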



So how about using a separate etc/hardware_errorsN blob for each Error Status Data Block? Then we would have:

etc/hardware_errors0
etc/hardware_errors1
...
etc/hardware_errors10
(the maximum N is 10)


N can be one of the values below, according to the ACPI spec, "Table 18-345 Hardware Error Notification Structure":
0 - Polled
1 - External Interrupt
2 - Local Interrupt
3 - SCI
4 - NMI
5 - CMCI
6 - MCE
7 - GPIO-Signal
8 - ARMv8 SEA
9 - ARMv8 SEI
10 - External Interrupt - GSIV
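
As a sketch of the naming idea only: the list above can be written as an
enum, and the per-type blob name derived from it. The identifiers
(hw_error_notify_type, hw_errors_file_name) and the "etc/hardware_errorsN"
naming scheme are hypothetical; nothing like this exists in QEMU today.

```c
#include <assert.h>
#include <stdio.h>

/* Notification types, numbered as in the list above. */
enum hw_error_notify_type {
    NOTIFY_POLLED             = 0,
    NOTIFY_EXTERNAL_INTERRUPT = 1,
    NOTIFY_LOCAL_INTERRUPT    = 2,
    NOTIFY_SCI                = 3,
    NOTIFY_NMI                = 4,
    NOTIFY_CMCI               = 5,
    NOTIFY_MCE                = 6,
    NOTIFY_GPIO_SIGNAL        = 7,
    NOTIFY_ARMV8_SEA          = 8,
    NOTIFY_ARMV8_SEI          = 9,
    NOTIFY_GSIV               = 10,
    NOTIFY_TYPE_COUNT
};

/*
 * Build the hypothetical per-type fw_cfg file name into buf.
 * Returns 0 on success, -1 if the type is out of range or buf is too small.
 */
static int hw_errors_file_name(int type, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    if (type < 0 || type >= NOTIFY_TYPE_COUNT) {
        return -1;
    }
    n = snprintf(buf, len, "etc/hardware_errors%d", type);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```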




> 
> (3) QEMU generates the ACPI linker/loader script for the firmware, as
> always.
> 
> (3a) The HEST table is part of "etc/acpi/tables", which the firmware
> already allocates memory for, and downloads (because QEMU already
> generates an ALLOCATE linker/loader command for it).
> 
> (3b) QEMU will have to create another ALLOCATE command for the
> "etc/hardware_errors" blob. The firmware allocates memory for this blob,
> and downloads it.
> 
> (4) QEMU generates, in the ACPI linker/loader script for the firmware, N
> ADD_POINTER commands, which point the GHES."Error Status
> Address" fields in the HEST table, to the corresponding address
> registers in the downloaded "etc/hardware_errors" blob.
> 
> (5) QEMU generates an ADD_CHECKSUM command for the firmware, so that the
> HEST table is correctly checksummed after executing the N ADD_POINTER
> commands from (4).
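For my own understanding of step (5): an ACPI table is valid when all of its
bytes sum to zero modulo 256, so the checksum byte has to be recomputed after
the pointers are patched. A minimal sketch of that arithmetic follows; the
function names are made up, and this is not the firmware's actual
ADD_CHECKSUM implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes of a table modulo 256; a valid ACPI table sums to zero. */
static uint8_t acpi_byte_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + table[i]);
    }
    return sum;
}

/* Recompute the checksum byte at csum_off so the whole table sums to zero. */
static void acpi_fix_checksum(uint8_t *table, size_t len, size_t csum_off)
{
    assert(csum_off < len);
    table[csum_off] = 0;   /* the checksum byte must not count itself */
    table[csum_off] = (uint8_t)(0u - acpi_byte_sum(table, len));
}
```

In the standard ACPI table header the checksum byte sits at offset 9, which
is the value I would pass as csum_off for the HEST.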
> 
> (6) QEMU generates N ADD_POINTER commands for the firmware, pointing the
> address registers (located in guest memory, in the downloaded
> "etc/hardware_errors" blob) to the respective Error Status Data Blocks.
> 
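If I read steps (4) and (6) correctly, each pointer field initially holds an
offset into the target blob, and ADD_POINTER adds the guest address where
that blob was allocated. A small sketch of this, assuming 8-byte
little-endian fields only (the helper names are mine, not QEMU's or the
firmware's):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 64-bit little-endian value from an unaligned byte buffer. */
static uint64_t ld_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Store a 64-bit value in little-endian byte order. */
static void st_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/*
 * Emulate ADD_POINTER on an 8-byte field: the field at ptr_off in dst holds
 * an offset into the source blob; add the source blob's allocation base so
 * that it becomes an absolute guest address.
 */
static void add_pointer(uint8_t *dst, size_t dst_len, size_t ptr_off,
                        uint64_t src_base)
{
    assert(ptr_off + 8 <= dst_len);
    st_le64(dst + ptr_off, ld_le64(dst + ptr_off) + src_base);
}
```

For step (6) the source and destination are the same blob: the address
register holds the block's offset inside "etc/hardware_errors" and src_base
is where the firmware placed that blob.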
> (7) (This is the trick.) For this step, we need a third, write-only
> fw_cfg blob, called "etc/hardware_errors_addr". Through that blob, the
> firmware can send back the guest-side allocation addresses to QEMU.
> 
> Namely, the "etc/hardware_errors_addr" blob contains N 8-byte entries.
> QEMU generates N WRITE_POINTER commands for the firmware.
> 
> For error source K (0 <= K < N), QEMU instructs the firmware to
> calculate the guest address of Error Status Data Block K, from the
> QEMU-dictated offset within "etc/hardware_errors", and from the
> guest-determined allocation base address for "etc/hardware_errors". The
> firmware then writes the calculated address back to fw_cfg file
> "etc/hardware_errors_addr", at offset K*8, according to the
> WRITE_POINTER command.
> 
> This way QEMU will know the GPA of each Error Status Data Block.
> 
> (In fact this can be simplified to a single WRITE_POINTER command: the
> address of the "address register table" can be sent back to QEMU as
> well, which already contains all Error Status Data Block addresses.)
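Step (7) as I understand it, in sketch form: for error source K the firmware
computes allocation base plus the QEMU-dictated offset and writes the result
at offset K*8 of "etc/hardware_errors_addr". The code below only models that
arithmetic on a byte buffer; the function names are assumptions and it does
not use the real fw_cfg interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value in little-endian byte order. */
static void put_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Load a 64-bit little-endian value. */
static uint64_t get_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Model WRITE_POINTER for error source k: the guest address of its Error
 * Status Data Block is alloc_base + block_off, and it is written back into
 * the address file at offset k * 8.
 */
static void write_pointer(uint8_t *addr_file, size_t file_len, size_t k,
                          uint64_t alloc_base, uint64_t block_off)
{
    assert((k + 1) * 8 <= file_len);
    put_le64(addr_file + k * 8, alloc_base + block_off);
}
```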
> 
> (8) When QEMU gets SIGBUS from the kernel -- I hope that's going to come
> through a signalfd -- QEMU can format the CPER right into guest memory,
> and then inject whatever interrupt (or assert whatever GPIO line) is
> necessary for notifying the guest.
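For step (8), a minimal Linux-only sketch of receiving SIGBUS through a
signalfd and extracting the address. This is not QEMU code and the function
names are made up. It also assumes the SIGBUS is queued asynchronously; as
far as I know a SIGBUS raised synchronously by a faulting access cannot be
collected this way while it is blocked, and how KVM will deliver the signal
is not decided in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Block normal SIGBUS delivery and return a signalfd reporting it, or -1. */
static int open_sigbus_fd(void)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGBUS);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0) {
        return -1;
    }
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/*
 * Read one queued SIGBUS from fd (blocks until one is pending).
 * On success stores the reported address and si_code and returns 0.
 */
static int read_sigbus(int fd, uint64_t *addr, int32_t *code)
{
    struct signalfd_siginfo si;
    ssize_t n;

    assert(addr != NULL && code != NULL);
    n = read(fd, &si, sizeof(si));
    if (n != (ssize_t)sizeof(si) || si.ssi_signo != (uint32_t)SIGBUS) {
        return -1;
    }
    *addr = si.ssi_addr;   /* address associated with the error, if any */
    *code = si.ssi_code;   /* e.g. a BUS_MCEERR_* value for memory errors */
    return 0;
}
```

The address read here would then have to be translated to a guest physical
address before the CPER is written into the matching Error Status Data Block.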
> 
> (9) This notification (in virtual hardware) can either be handled by the
> guest kernel stand-alone, or else the guest kernel can invoke an ACPI
> event handler method with it (which would be in the DSDT or one of the
> SSDTs, also generated by QEMU). The ACPI event handler method could
> invoke the specific guest kernel driver for error handling via a
> Notify() operation.
> 
> I'm attracted to the above design because:
> - it would leave the firmware alone after OS boot, and
> - it would leave the firmware blissfully ignorant about HEST, GHES,
> CPER, and the like. (That's why QEMU's ACPI linker/loader was invented
> in the first place.)
> 
> Thanks
> Laszlo
> 
>>>    Which module do you think should generate the APEI table: UEFI or Qemu?
>>>
>>>
>>>
>>>
>>> On 2017/3/28 21:40, James Morse wrote:
>>>> Hi gengdongjiu,
>>>>
>>>> On 28/03/17 13:16, gengdongjiu wrote:
>>>>> On 2017/3/28 19:54, Achin Gupta wrote:
>>>>>> On Tue, Mar 28, 2017 at 01:23:28PM +0200, Christoffer Dall wrote:
>>>>>>> On Tue, Mar 28, 2017 at 11:48:08AM +0100, James Morse wrote:
>>>>>>>> On the host, part of UEFI is involved to generate the CPER records.
>>>>>>>> In a guest?, I don't know.
>>>>>>>> Qemu could generate the records, or drive some other component to do it.
>>>>>>>
>>>>>>> I think I am beginning to understand this a bit.  Since the guest UEFI
>>>>>>> instance is specifically built for the machine it runs on, QEMU's virt
>>>>>>> machine in this case, they could simply agree (by some contract) to
>>>>>>> place the records at some specific location in memory, and if the guest
>>>>>>> kernel asks its guest UEFI for that location, things should just work by
>>>>>>> having logic in QEMU to process error reports and populate guest memory.
>>>>>>>
>>>>>>> Is this how others see the world too?
>>>>>>
>>>>>> I think so!
>>>>>>
>>>>>> AFAIU, the memory where CPERs will reside should be specified in a GHES entry in
>>>>>> the HEST. Is this not the case with a guest kernel i.e. the guest UEFI creates a
>>>>>> HEST for the guest Kernel?
>>>>>>
>>>>>> If so, then the question is how the guest UEFI finds out where QEMU (acting as
>>>>>> EL3 firmware) will populate the CPERs. This could either be a contract between
>>>>>> the two or a guest DXE driver uses the MM_COMMUNICATE call (see [1]) to ask QEMU
>>>>>> where the memory is.
>>>>>
>>>>> Wouldn't involving the guest UEFI be complex? I don't see the advantage. It seems x86 Qemu
>>>>> generates the ACPI tables directly, but I am not sure; we are checking the qemu logic.
>>>>> Letting Qemu generate the CPER record may be clearer.
>>>>
>>>> At boot UEFI in the guest will need to make sure the areas of memory that may be
>>>> used for CPER records are reserved. Whether UEFI or Qemu decides where these are
>>>> needs deciding, (but probably not here)...
>>>>
>>>> At runtime, when an error has occurred, I agree it would be simpler (fewer
>>>> components involved) if Qemu generates the CPER records. But if UEFI made the
>>>> memory choice above they need to interact and it gets complicated again. The
>>>> CPER records are defined in the UEFI spec, so I would expect UEFI to contain
>>>> code to generate/parse them.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> James
>>>>
>>>>
>>>> .
>>>>
>>>
> 
> 
> .
> 

WARNING: multiple messages have this Message-ID (diff)
From: gengdongjiu <gengdongjiu@huawei.com>
To: Laszlo Ersek <lersek@redhat.com>, Achin Gupta <achin.gupta@arm.com>
Cc: <ard.biesheuvel@linaro.org>, <edk2-devel@lists.01.org>,
	<qemu-devel@nongnu.org>, <zhaoshenglong@huawei.com>,
	James Morse <james.morse@arm.com>,
	Christoffer Dall <cdall@linaro.org>, <xiexiuqi@huawei.com>,
	Marc Zyngier <marc.zyngier@arm.com>, <catalin.marinas@arm.com>,
	<will.deacon@arm.com>, <christoffer.dall@linaro.org>,
	<rkrcmar@redhat.com>, <suzuki.poulose@arm.com>,
	<andre.przywara@arm.com>, <mark.rutland@arm.com>,
	<vladimir.murzin@arm.com>, <linux-arm-kernel@lists.infradead.org>,
	<kvmarm@lists.cs.columbia.edu>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <wangxiongfeng2@huawei.com>,
	<wuquanming@huawei.com>, <huangshaoyu@huawei.com>,
	<Leif.Lindholm@linaro.com>, <nd@arm.com>,
	Michael Tsirkin <mtsirkin@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [PATCH] kvm: pass the virtual SEI syndrome to guest OS
Date: Thu, 6 Apr 2017 20:35:26 +0800	[thread overview]
Message-ID: <7c5c8ab7-8fcc-1c98-0bc1-cccb66c4c84d@huawei.com> (raw)
In-Reply-To: <2a427164-9b37-6711-3a56-906634ba7f12@redhat.com>

Dear, Laszlo
   Thanks for your detailed explanation.

On 2017/3/29 19:58, Laszlo Ersek wrote:
> (This ought to be one of the longest address lists I've ever seen :)
> Thanks for the CC. I'm glad Shannon is already on the CC list. For good
> measure, I'm adding MST and Igor.)
> 
> On 03/29/17 12:36, Achin Gupta wrote:
>> Hi gengdongjiu,
>>
>> On Wed, Mar 29, 2017 at 05:36:37PM +0800, gengdongjiu wrote:
>>>
>>> Hi Laszlo/Biesheuvel/Qemu developer,
>>>
>>>    Now I encounter a issue and want to consult with you in ARM64 platform, as described below:
>>>
>>> when guest OS happen synchronous or asynchronous abort, kvm needs
>>> to send the error address to Qemu or UEFI through sigbus to
>>> dynamically generate APEI table. from my investigation, there are
>>> two ways:
>>>
>>> (1) Qemu get the error address, and generate the APEI table, then
>>> notify UEFI to know this generation, then inject abort error to
>>> guest OS, guest OS read the APEI table.
>>> (2) Qemu get the error address, and let UEFI to generate the APEI
>>> table, then inject abort error to guest OS, guest OS read the APEI
>>> table.
>>
>> Just being pedantic! I don't think we are talking about creating the APEI table
>> dynamically here. The issue is: Once KVM has received an error that is destined
>> for a guest it will raise a SIGBUS to Qemu. Now before Qemu can inject the error
>> into the guest OS, a CPER (Common Platform Error Record) has to be generated
>> corresponding to the error source (GHES corresponding to memory subsystem,
>> processor etc) to allow the guest OS to do anything meaningful with the
>> error. So who should create the CPER is the question.
>>
>> At the EL3/EL2 interface (Secure Firmware and OS/Hypervisor), an error arrives
>> at EL3 and secure firmware (at EL3 or a lower secure exception level) is
>> responsible for creating the CPER. ARM is experimenting with using a Standalone
>> MM EDK2 image in the secure world to do the CPER creation. This will avoid
>> adding the same code in ARM TF in EL3 (better for security). The error will then
>> be injected into the OS/Hypervisor (through SEA/SEI/SDEI) through ARM Trusted
>> Firmware.
>>
>> Qemu is essentially fulfilling the role of secure firmware at the EL2/EL1
>> interface (as discussed with Christoffer below). So it should generate the CPER
>> before injecting the error.
>>
>> This is corresponds to (1) above apart from notifying UEFI (I am assuming you
>> mean guest UEFI). At this time, the guest OS already knows where to pick up the
>> CPER from through the HEST. Qemu has to create the CPER and populate its address
>> at the address exported in the HEST. Guest UEFI should not be involved in this
>> flow. Its job was to create the HEST at boot and that has been done by this
>> stage.
>>
>> Qemu folk will be able to add but it looks like support for CPER generation will
>> need to be added to Qemu. We need to resolve this.
>>
>> Do shout if I am missing anything above.
> 
> After reading this email, the use case looks *very* similar to what
> we've just done with VMGENID for QEMU 2.9.
> 
> We have a facility between QEMU and the guest firmware, called "ACPI
> linker/loader", with which QEMU instructs the firmware to
> 
> - allocate and download blobs into guest RAM (AcpiNVS type memory) --
> ALLOCATE command,
> 
> - relocate pointers in those blobs, to fields in other (or the same)
> blobs -- ADD_POINTER command,
> 
> - set ACPI table checksums -- ADD_CHECKSUM command,
> 
> - and send GPAs of fields within such blobs back to QEMU --
> WRITE_POINTER command.
> 
> This is how I imagine we can map the facility to the current use case
> (note that this is the first time I read about HEST / GHES / CPER):
> 
>     etc/acpi/tables                 etc/hardware_errors
>     ================     ==========================================
>                          +-----------+
>     +--------------+     | address   |         +-> +--------------+
>     |    HEST      +     | registers |         |   | Error Status |
>     + +------------+     | +---------+         |   | Data Block 1 |
>     | | GHES       | --> | | address | --------+   | +------------+
>     | | GHES       | --> | | address | ------+     | |  CPER      |
>     | | GHES       | --> | | address | ----+ |     | |  CPER      |
>     | | GHES       | --> | | address | -+  | |     | |  CPER      |
>     +-+------------+     +-+---------+  |  | |     +-+------------+
>                                         |  | |
>                                         |  | +---> +--------------+
>                                         |  |       | Error Status |
>                                         |  |       | Data Block 2 |
>                                         |  |       | +------------+
>                                         |  |       | |  CPER      |
>                                         |  |       | |  CPER      |
>                                         |  |       +-+------------+
>                                         |  |
>                                         |  +-----> +--------------+
>                                         |          | Error Status |
>                                         |          | Data Block 3 |
>                                         |          | +------------+
>                                         |          | |  CPER      |
>                                         |          +-+------------+
>                                         |
>                                         +--------> +--------------+
>                                                    | Error Status |
>                                                    | Data Block 4 |
>                                                    | +------------+
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    +-+------------+
> 
> (1) QEMU generates the HEST ACPI table. This table goes in the current
> "etc/acpi/tables" fw_cfg blob. Given N error sources, there will be N
> GHES objects in the HEST.
> 
> (2) We introduce a new fw_cfg blob called "etc/hardware_errors". QEMU
> also populates this blob.
> 
> (2a) Given N error sources, the (unnamed) table of address registers
> will contain N address registers.
> 
> (2b) Given N error sources, the "etc/hardwre_errors" fw_cfg blob will
> also contain N Error Status Data Blocks.
> 
> I don't know about the sizing (number of CPERs) each Error Status Data
> Block has to contain, but I understand it is all pre-allocated as far as
> the OS is concerned, which matches our capabilities well.
here I have a question. as you comment: " 'etc/hardwre_errors' fw_cfg blob will also contain N Error Status Data Blocks",
Because the CPER numbers is not fixed, how to assign each "Error Status Data Block" size using one "etc/hardwre_errors" fw_cfg blob.
when use one etc/hardwre_errors, will the N Error Status Data Block use one continuous buffer? as shown below. if so, maybe it not convenient for each data block size extension.
I see the bios_linker_loader_alloc will allocate one continuous buffer for a blob(such as VMGENID_GUID_FW_CFG_FILE)

    /* Allocate guest memory for the Data fw_cfg blob */
    bios_linker_loader_alloc(linker, VMGENID_GUID_FW_CFG_FILE, guid, 4096,
                             false /* page boundary, high memory */);



-> +--------------+
     |    HEST      +     | registers |             | Error Status |
     + +------------+     | +---------+             | Data Block  |
     | | GHES       | --> | | address | --------+-->| +------------+
     | | GHES       | --> | | address | ------+     | |  CPER      |
     | | GHES       | --> | | address | ----+ |     | |  CPER      |
     | | GHES       | --> | | address | -+  | |     | |  CPER      |
     +-+------------+     +-+---------+  |  | +---> +--------------+
                                         |  |       | |  CPER      |
                                         |  |       | |  CPER      |
                                         |  +-----> +--------------+
                                         |          | |  CPER      |
                                         +--------> +--------------+
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    +-+------------+



so how about we use separate etc/hardwre_errorsN for each Error Status status Block? then

etc/hardwre_errors0
etc/hardwre_errors1
...................
etc/hardwre_errors10
(the max N is 10)


the N can be one of below values, according to ACPI spec "Table 18-345 Hardware Error Notification Structure"
0 – Polled
1 – External Interrupt
2 – Local Interrupt
3 – SCI
4 – NMI
5 - CMCI
6 - MCE
7 - GPIO-Signal
8 - ARMv8 SEA
9 - ARMv8 SEI
10 - External Interrupt - GSIV




> 
> (3) QEMU generates the ACPI linker/loader script for the firmware, as
> always.
> 
> (3a) The HEST table is part of "etc/acpi/tables", which the firmware
> already allocates memory for, and downloads (because QEMU already
> generates an ALLOCATE linker/loader command for it already).
> 
> (3b) QEMU will have to create another ALLOCATE command for the
> "etc/hardware_errors" blob. The firmware allocates memory for this blob,
> and downloads it.
> 
> (4) QEMU generates, in the ACPI linker/loader script for the firwmare, N
> ADD_POINTER commands, which point the GHES."Error Status
> Address" fields in the HEST table, to the corresponding address
> registers in the downloaded "etc/hardware_errors" blob.
> 
> (5) QEMU generates an ADD_CHECKSUM command for the firmware, so that the
> HEST table is correctly checksummed after executing the N ADD_POINTER
> commands from (4).
> 
> (6) QEMU generates N ADD_POINTER commands for the firmware, pointing the
> address registers (located in guest memory, in the downloaded
> "etc/hardware_errors" blob) to the respective Error Status Data Blocks.
> 
> (7) (This is the trick.) For this step, we need a third, write-only
> fw_cfg blob, called "etc/hardware_errors_addr". Through that blob, the
> firmware can send back the guest-side allocation addresses to QEMU.
> 
> Namely, the "etc/hardware_errors_addr" blob contains N 8-byte entries.
> QEMU generates N WRITE_POINTER commands for the firmware.
> 
> For error source K (0 <= K < N), QEMU instructs the firmware to
> calculate the guest address of Error Status Data Block K, from the
> QEMU-dictated offset within "etc/hardware_errors", and from the
> guest-determined allocation base address for "etc/hardware_errors". The
> firmware then writes the calculated address back to fw_cfg file
> "etc/hardware_errors_addr", at offset K*8, according to the
> WRITE_POINTER command.
> 
> This way QEMU will know the GPA of each Error Status Data Block.
> 
> (In fact this can be simplified to a single WRITE_POINTER command: the
> address of the "address register table" can be sent back to QEMU as
> well, which already contains all Error Status Data Block addresses.)
> 
> (8) When QEMU gets SIGBUS from the kernel -- I hope that's going to come
> through a signalfd -- QEMU can format the CPER right into guest memory,
> and then inject whatever interrupt (or assert whatever GPIO line) is
> necessary for notifying the guest.
> 
> (9) This notification (in virtual hardware) can either be handled by the
> guest kernel stand-alone, or else the guest kernel can invoke an ACPI
> event handler method with it (which would be in the DSDT or one of the
> SSDTs, also generated by QEMU). The ACPI event handler method could
> invoke the specific guest kernel driver for errror handling via a
> Notify() operation.
> 
> I'm attracted to the above design because:
> - it would leave the firmware alone after OS boot, and
> - it would leave the firmware blissfully ignorant about HEST, GHES,
> CPER, and the like. (That's why QEMU's ACPI linker/loader was invented
> in the first place.)
> 
> Thanks
> Laszlo
> 
>>>    Do you think which modules generates the APEI table is better? UEFI or Qemu?
>>>
>>>
>>>
>>>
>>> On 2017/3/28 21:40, James Morse wrote:
>>>> Hi gengdongjiu,
>>>>
>>>> On 28/03/17 13:16, gengdongjiu wrote:
>>>>> On 2017/3/28 19:54, Achin Gupta wrote:
>>>>>> On Tue, Mar 28, 2017 at 01:23:28PM +0200, Christoffer Dall wrote:
>>>>>>> On Tue, Mar 28, 2017 at 11:48:08AM +0100, James Morse wrote:
>>>>>>>> On the host, part of UEFI is involved to generate the CPER records.
>>>>>>>> In a guest?, I don't know.
>>>>>>>> Qemu could generate the records, or drive some other component to do it.
>>>>>>>
>>>>>>> I think I am beginning to understand this a bit.  Since the guet UEFI
>>>>>>> instance is specifically built for the machine it runs on, QEMU's virt
>>>>>>> machine in this case, they could simply agree (by some contract) to
>>>>>>> place the records at some specific location in memory, and if the guest
>>>>>>> kernel asks its guest UEFI for that location, things should just work by
>>>>>>> having logic in QEMU to process error reports and populate guest memory.
>>>>>>>
>>>>>>> Is this how others see the world too?
>>>>>>
>>>>>> I think so!
>>>>>>
>>>>>> AFAIU, the memory where CPERs will reside should be specified in a GHES entry in
>>>>>> the HEST. Is this not the case with a guest kernel i.e. the guest UEFI creates a
>>>>>> HEST for the guest Kernel?
>>>>>>
>>>>>> If so, then the question is how the guest UEFI finds out where QEMU (acting as
>>>>>> EL3 firmware) will populate the CPERs. This could either be a contract between
>>>>>> the two or a guest DXE driver uses the MM_COMMUNICATE call (see [1]) to ask QEMU
>>>>>> where the memory is.
>>>>>
>>>>> whether invoke the guest UEFI will be complex? not see the advantage. it seems x86 Qemu
>>>>> directly generate the ACPI table, but I am not sure, we are checking the qemu
>>>> logical.
>>>>> let Qemu generate CPER record may be clear.
>>>>
>>>> At boot UEFI in the guest will need to make sure the areas of memory that may be
>>>> used for CPER records are reserved. Whether UEFI or Qemu decides where these are
>>>> needs deciding, (but probably not here)...
>>>>
>>>> At runtime, when an error has occurred, I agree it would be simpler (fewer
>>>> components involved) if Qemu generates the CPER records. But if UEFI made the
>>>> memory choice above they need to interact and it gets complicated again. The
>>>> CPER records are defined in the UEFI spec, so I would expect UEFI to contain
>>>> code to generate/parse them.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> James
>>>>
>>>>
>>>> .
>>>>
>>>
> 
> 
> .
> 

WARNING: multiple messages have this Message-ID (diff)
From: gengdongjiu <gengdongjiu@huawei.com>
To: Laszlo Ersek <lersek@redhat.com>, Achin Gupta <achin.gupta@arm.com>
Cc: ard.biesheuvel@linaro.org, edk2-devel@lists.01.org,
	qemu-devel@nongnu.org, zhaoshenglong@huawei.com,
	James Morse <james.morse@arm.com>,
	Christoffer Dall <cdall@linaro.org>,
	xiexiuqi@huawei.com, Marc Zyngier <marc.zyngier@arm.com>,
	catalin.marinas@arm.com, will.deacon@arm.com,
	christoffer.dall@linaro.org, rkrcmar@redhat.com,
	suzuki.poulose@arm.com, andre.przywara@arm.com,
	mark.rutland@arm.com, vladimir.murzin@arm.com,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, wangxiongfeng2@huawei.com,
	wuquanming@huawei.com, huangshaoyu@huawei.com,
	Leif.Lindholm@linaro.com, nd@arm.com,
	Michael Tsirkin <mtsirkin@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] kvm: pass the virtual SEI syndrome to guest OS
Date: Thu, 6 Apr 2017 20:35:26 +0800	[thread overview]
Message-ID: <7c5c8ab7-8fcc-1c98-0bc1-cccb66c4c84d@huawei.com> (raw)
In-Reply-To: <2a427164-9b37-6711-3a56-906634ba7f12@redhat.com>

Dear Laszlo,
   Thanks for your detailed explanation.

On 2017/3/29 19:58, Laszlo Ersek wrote:
> (This ought to be one of the longest address lists I've ever seen :)
> Thanks for the CC. I'm glad Shannon is already on the CC list. For good
> measure, I'm adding MST and Igor.)
> 
> On 03/29/17 12:36, Achin Gupta wrote:
>> Hi gengdongjiu,
>>
>> On Wed, Mar 29, 2017 at 05:36:37PM +0800, gengdongjiu wrote:
>>>
>>> Hi Laszlo/Biesheuvel/Qemu developer,
>>>
>>>    I have run into an issue on the ARM64 platform and would like to consult you, as described below:
>>>
>>> When the guest OS takes a synchronous or asynchronous abort, KVM needs
>>> to send the error address to Qemu or UEFI through SIGBUS so that the
>>> APEI table can be generated dynamically. From my investigation, there
>>> are two ways:
>>>
>>> (1) Qemu gets the error address and generates the APEI table, then
>>> notifies UEFI of this, then injects the abort into the guest OS, and
>>> the guest OS reads the APEI table.
>>> (2) Qemu gets the error address and lets UEFI generate the APEI
>>> table, then injects the abort into the guest OS, and the guest OS
>>> reads the APEI table.
>>
>> Just being pedantic! I don't think we are talking about creating the APEI table
>> dynamically here. The issue is: Once KVM has received an error that is destined
>> for a guest it will raise a SIGBUS to Qemu. Now before Qemu can inject the error
>> into the guest OS, a CPER (Common Platform Error Record) has to be generated
>> corresponding to the error source (GHES corresponding to memory subsystem,
>> processor etc) to allow the guest OS to do anything meaningful with the
>> error. So who should create the CPER is the question.
>>
>> At the EL3/EL2 interface (Secure Firmware and OS/Hypervisor), an error arrives
>> at EL3 and secure firmware (at EL3 or a lower secure exception level) is
>> responsible for creating the CPER. ARM is experimenting with using a Standalone
>> MM EDK2 image in the secure world to do the CPER creation. This will avoid
>> adding the same code in ARM TF in EL3 (better for security). The error will then
>> be injected into the OS/Hypervisor (through SEA/SEI/SDEI) through ARM Trusted
>> Firmware.
>>
>> Qemu is essentially fulfilling the role of secure firmware at the EL2/EL1
>> interface (as discussed with Christoffer below). So it should generate the CPER
>> before injecting the error.
>>
>> This corresponds to (1) above, apart from notifying UEFI (I am assuming you
>> mean guest UEFI). At this time, the guest OS already knows where to pick up the
>> CPER from through the HEST. Qemu has to create the CPER and populate its address
>> at the address exported in the HEST. Guest UEFI should not be involved in this
>> flow. Its job was to create the HEST at boot and that has been done by this
>> stage.
>>
>> Qemu folk will be able to add but it looks like support for CPER generation will
>> need to be added to Qemu. We need to resolve this.
>>
>> Do shout if I am missing anything above.
> 
> After reading this email, the use case looks *very* similar to what
> we've just done with VMGENID for QEMU 2.9.
> 
> We have a facility between QEMU and the guest firmware, called "ACPI
> linker/loader", with which QEMU instructs the firmware to
> 
> - allocate and download blobs into guest RAM (AcpiNVS type memory) --
> ALLOCATE command,
> 
> - relocate pointers in those blobs, to fields in other (or the same)
> blobs -- ADD_POINTER command,
> 
> - set ACPI table checksums -- ADD_CHECKSUM command,
> 
> - and send GPAs of fields within such blobs back to QEMU --
> WRITE_POINTER command.
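
For readers new to the linker/loader, here is a minimal sketch in C of what the ADD_POINTER and ADD_CHECKSUM commands listed above amount to once a blob sits in guest RAM. It is not QEMU or firmware code: the function names, the 8-byte pointer width and the little-endian helpers are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read an 8-byte little-endian value from buf. */
static uint64_t load_le64(const uint8_t *buf)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | buf[i];
    }
    return v;
}

/* Write an 8-byte little-endian value to buf. */
static void store_le64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * ADD_POINTER (sketch): the field at ptr_offset initially holds an offset
 * into the target blob; the firmware adds the guest address at which it
 * placed the target blob, turning the offset into an absolute pointer.
 */
static void sketch_add_pointer(uint8_t *blob, size_t ptr_offset,
                               uint64_t target_base_gpa)
{
    store_le64(blob + ptr_offset,
               load_le64(blob + ptr_offset) + target_base_gpa);
}

/*
 * ADD_CHECKSUM (sketch): store a byte at csum_offset such that all bytes
 * in [start, start + len) sum to zero modulo 256.
 */
static void sketch_add_checksum(uint8_t *blob, size_t csum_offset,
                                size_t start, size_t len)
{
    uint8_t sum = 0;

    /* The checksum byte must lie inside the range it covers. */
    assert(csum_offset >= start && csum_offset < start + len);

    blob[csum_offset] = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + blob[start + i]);
    }
    blob[csum_offset] = (uint8_t)(0 - sum);
}
```

The checksum has to be computed after every pointer in the table has been patched, since patching changes the bytes that are summed.
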
> 
> This is how I imagine we can map the facility to the current use case
> (note that this is the first time I read about HEST / GHES / CPER):
> 
>     etc/acpi/tables                 etc/hardware_errors
>     ================     ==========================================
>                          +-----------+
>     +--------------+     | address   |         +-> +--------------+
>     |    HEST      +     | registers |         |   | Error Status |
>     + +------------+     | +---------+         |   | Data Block 1 |
>     | | GHES       | --> | | address | --------+   | +------------+
>     | | GHES       | --> | | address | ------+     | |  CPER      |
>     | | GHES       | --> | | address | ----+ |     | |  CPER      |
>     | | GHES       | --> | | address | -+  | |     | |  CPER      |
>     +-+------------+     +-+---------+  |  | |     +-+------------+
>                                         |  | |
>                                         |  | +---> +--------------+
>                                         |  |       | Error Status |
>                                         |  |       | Data Block 2 |
>                                         |  |       | +------------+
>                                         |  |       | |  CPER      |
>                                         |  |       | |  CPER      |
>                                         |  |       +-+------------+
>                                         |  |
>                                         |  +-----> +--------------+
>                                         |          | Error Status |
>                                         |          | Data Block 3 |
>                                         |          | +------------+
>                                         |          | |  CPER      |
>                                         |          +-+------------+
>                                         |
>                                         +--------> +--------------+
>                                                    | Error Status |
>                                                    | Data Block 4 |
>                                                    | +------------+
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    +-+------------+
> 
> (1) QEMU generates the HEST ACPI table. This table goes in the current
> "etc/acpi/tables" fw_cfg blob. Given N error sources, there will be N
> GHES objects in the HEST.
> 
> (2) We introduce a new fw_cfg blob called "etc/hardware_errors". QEMU
> also populates this blob.
> 
> (2a) Given N error sources, the (unnamed) table of address registers
> will contain N address registers.
> 
> (2b) Given N error sources, the "etc/hardware_errors" fw_cfg blob will
> also contain N Error Status Data Blocks.
> 
> I don't know about the sizing (number of CPERs) each Error Status Data
> Block has to contain, but I understand it is all pre-allocated as far as
> the OS is concerned, which matches our capabilities well.
Here I have a question. You wrote that the "etc/hardware_errors" fw_cfg blob
will also contain N Error Status Data Blocks. Because the number of CPERs is
not fixed, how do we assign a size to each Error Status Data Block within a
single "etc/hardware_errors" fw_cfg blob?
With a single "etc/hardware_errors" blob, will the N Error Status Data Blocks
share one contiguous buffer, as shown below? If so, it may be inconvenient to
extend the size of an individual data block.
I see that bios_linker_loader_alloc() allocates one contiguous buffer per blob
(for example VMGENID_GUID_FW_CFG_FILE):

    /* Allocate guest memory for the Data fw_cfg blob */
    bios_linker_loader_alloc(linker, VMGENID_GUID_FW_CFG_FILE, guid, 4096,
                             false /* page boundary, high memory */);
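
To make the sizing question concrete, here is a small sketch in C of how block offsets could be computed if all N blocks live in one contiguous blob. It is only an illustration: the function name, the 8-byte address register size and the per-source block sizes are assumptions of mine, not existing QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ADDR_REG_SIZE 8u   /* assumed: one 64-bit register per source */

/*
 * Compute where each Error Status Data Block starts inside one contiguous
 * blob laid out as: [N address registers][block 0] ... [block N-1].
 * block_sizes[k] is the assumed size in bytes of block k.
 * Writes the N start offsets to block_offsets and returns the blob size.
 */
static size_t sketch_layout_blob(const size_t *block_sizes, size_t n,
                                 size_t *block_offsets)
{
    size_t offset = n * SKETCH_ADDR_REG_SIZE;

    assert(n == 0 || (block_sizes != NULL && block_offsets != NULL));

    for (size_t k = 0; k < n; k++) {
        block_offsets[k] = offset;
        offset += block_sizes[k];
    }
    return offset;
}
```

With this layout, blocks of different sizes fit in one blob, but growing one block later moves every block after it. That is the inconvenience I mean.
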


                          +-----------+
     +--------------+     | address   |             +--------------+
     |    HEST      +     | registers |             | Error Status |
     + +------------+     | +---------+             | Data Block   |
     | | GHES       | --> | | address | --------+-->| +------------+
     | | GHES       | --> | | address | ------+     | |  CPER      |
     | | GHES       | --> | | address | ----+ |     | |  CPER      |
     | | GHES       | --> | | address | -+  | |     | |  CPER      |
     +-+------------+     +-+---------+  |  | +---> +--------------+
                                         |  |       | |  CPER      |
                                         |  |       | |  CPER      |
                                         |  +-----> +--------------+
                                         |          | |  CPER      |
                                         +--------> +--------------+
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    +-+------------+

So how about using a separate etc/hardware_errorsN blob for each Error Status Data Block? Then we would have:

etc/hardware_errors0
etc/hardware_errors1
...................
etc/hardware_errors10
(the maximum N is 10)


N can be one of the values below, taken from the ACPI spec, "Table 18-345 Hardware Error Notification Structure":
0 - Polled
1 - External Interrupt
2 - Local Interrupt
3 - SCI
4 - NMI
5 - CMCI
6 - MCE
7 - GPIO-Signal
8 - ARMv8 SEA
9 - ARMv8 SEI
10 - External Interrupt - GSIV
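
As a rough sketch of the naming scheme, the notification types above could index the blobs like this. The enum names and the helper are hypothetical and exist only in this mail. Only the numeric values come from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Notification types from the list above; the identifiers are made up. */
enum sketch_hest_notify {
    SKETCH_NOTIFY_POLLED = 0,
    SKETCH_NOTIFY_EXTERNAL = 1,
    SKETCH_NOTIFY_LOCAL = 2,
    SKETCH_NOTIFY_SCI = 3,
    SKETCH_NOTIFY_NMI = 4,
    SKETCH_NOTIFY_CMCI = 5,
    SKETCH_NOTIFY_MCE = 6,
    SKETCH_NOTIFY_GPIO = 7,
    SKETCH_NOTIFY_SEA = 8,
    SKETCH_NOTIFY_SEI = 9,
    SKETCH_NOTIFY_GSIV = 10,
    SKETCH_NOTIFY_COUNT
};

/*
 * Build the proposed per-type blob name "etc/hardware_errorsN" in buf.
 * Returns 0 on success, -1 for an unknown type or a buffer that is too small.
 */
static int sketch_blob_name(char *buf, size_t len, int type)
{
    int n;

    assert(buf != NULL);

    if (type < 0 || type >= SKETCH_NOTIFY_COUNT) {
        return -1;
    }
    n = snprintf(buf, len, "etc/hardware_errors%d", type);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```
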




> 
> (3) QEMU generates the ACPI linker/loader script for the firmware, as
> always.
> 
> (3a) The HEST table is part of "etc/acpi/tables", which the firmware
> already allocates memory for, and downloads (because QEMU already
> generates an ALLOCATE linker/loader command for it).
> 
> (3b) QEMU will have to create another ALLOCATE command for the
> "etc/hardware_errors" blob. The firmware allocates memory for this blob,
> and downloads it.
> 
> (4) QEMU generates, in the ACPI linker/loader script for the firmware, N
> ADD_POINTER commands, which point the GHES."Error Status
> Address" fields in the HEST table, to the corresponding address
> registers in the downloaded "etc/hardware_errors" blob.
> 
> (5) QEMU generates an ADD_CHECKSUM command for the firmware, so that the
> HEST table is correctly checksummed after executing the N ADD_POINTER
> commands from (4).
> 
> (6) QEMU generates N ADD_POINTER commands for the firmware, pointing the
> address registers (located in guest memory, in the downloaded
> "etc/hardware_errors" blob) to the respective Error Status Data Blocks.
> 
> (7) (This is the trick.) For this step, we need a third, write-only
> fw_cfg blob, called "etc/hardware_errors_addr". Through that blob, the
> firmware can send back the guest-side allocation addresses to QEMU.
> 
> Namely, the "etc/hardware_errors_addr" blob contains N 8-byte entries.
> QEMU generates N WRITE_POINTER commands for the firmware.
> 
> For error source K (0 <= K < N), QEMU instructs the firmware to
> calculate the guest address of Error Status Data Block K, from the
> QEMU-dictated offset within "etc/hardware_errors", and from the
> guest-determined allocation base address for "etc/hardware_errors". The
> firmware then writes the calculated address back to fw_cfg file
> "etc/hardware_errors_addr", at offset K*8, according to the
> WRITE_POINTER command.
> 
> This way QEMU will know the GPA of each Error Status Data Block.
> 
> (In fact this can be simplified to a single WRITE_POINTER command: the
> address of the "address register table" can be sent back to QEMU as
> well, which already contains all Error Status Data Block addresses.)
> 
> (8) When QEMU gets SIGBUS from the kernel -- I hope that's going to come
> through a signalfd -- QEMU can format the CPER right into guest memory,
> and then inject whatever interrupt (or assert whatever GPIO line) is
> necessary for notifying the guest.
> 
> (9) This notification (in virtual hardware) can either be handled by the
> guest kernel stand-alone, or else the guest kernel can invoke an ACPI
> event handler method with it (which would be in the DSDT or one of the
> SSDTs, also generated by QEMU). The ACPI event handler method could
> invoke the specific guest kernel driver for error handling via a
> Notify() operation.
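
To check my understanding of steps (7) and (8), here is a minimal sketch in C. It assumes the five 32-bit header fields of the ACPI Generic Error Status Block, a little-endian host, and a plain byte buffer standing in for guest memory. All names are hypothetical and none of this is existing QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Header of a Generic Error Status Block: five 32-bit fields, followed in
 * memory by data_length bytes of error data entries (the CPER payload).
 */
struct sketch_error_status_block {
    uint32_t block_status;
    uint32_t raw_data_offset;
    uint32_t raw_data_length;
    uint32_t data_length;
    uint32_t error_severity;
};

/* Step (7): guest address of block K = guest base of the blob + offset. */
static uint64_t sketch_block_gpa(uint64_t blob_base_gpa, uint64_t block_offset)
{
    return blob_base_gpa + block_offset;
}

/*
 * Step (8): write a status block header plus its entries into the buffer
 * that stands in for one pre-allocated Error Status Data Block.
 * Fields are stored in host byte order (assumed little-endian here).
 * Returns 0 on success, -1 if the record does not fit in the block.
 */
static int sketch_write_status_block(uint8_t *block, size_t block_size,
                                     uint32_t block_status, uint32_t severity,
                                     const void *entries, uint32_t entries_len)
{
    struct sketch_error_status_block hdr;

    assert(block != NULL);

    if (block_size < sizeof(hdr) || entries_len > block_size - sizeof(hdr)) {
        return -1;
    }
    memset(&hdr, 0, sizeof(hdr));
    hdr.block_status = block_status;
    hdr.data_length = entries_len;
    hdr.error_severity = severity;
    memcpy(block, &hdr, sizeof(hdr));
    if (entries_len > 0) {
        memcpy(block + sizeof(hdr), entries, entries_len);
    }
    return 0;
}
```

The size check is where a fixed pre-allocated block size matters: a record larger than the block has to be rejected or truncated.
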
> 
> I'm attracted to the above design because:
> - it would leave the firmware alone after OS boot, and
> - it would leave the firmware blissfully ignorant about HEST, GHES,
> CPER, and the like. (That's why QEMU's ACPI linker/loader was invented
> in the first place.)
> 
> Thanks
> Laszlo
> 
>>>    Which module do you think should generate the APEI table: UEFI or Qemu?
>>>
>>>
>>>
>>>
>>> On 2017/3/28 21:40, James Morse wrote:
>>>> Hi gengdongjiu,
>>>>
>>>> On 28/03/17 13:16, gengdongjiu wrote:
>>>>> On 2017/3/28 19:54, Achin Gupta wrote:
>>>>>> On Tue, Mar 28, 2017 at 01:23:28PM +0200, Christoffer Dall wrote:
>>>>>>> On Tue, Mar 28, 2017 at 11:48:08AM +0100, James Morse wrote:
>>>>>>>> On the host, part of UEFI is involved to generate the CPER records.
>>>>>>>> In a guest?, I don't know.
>>>>>>>> Qemu could generate the records, or drive some other component to do it.
>>>>>>>
>>>>>>> I think I am beginning to understand this a bit.  Since the guest UEFI
>>>>>>> instance is specifically built for the machine it runs on, QEMU's virt
>>>>>>> machine in this case, they could simply agree (by some contract) to
>>>>>>> place the records at some specific location in memory, and if the guest
>>>>>>> kernel asks its guest UEFI for that location, things should just work by
>>>>>>> having logic in QEMU to process error reports and populate guest memory.
>>>>>>>
>>>>>>> Is this how others see the world too?
>>>>>>
>>>>>> I think so!
>>>>>>
>>>>>> AFAIU, the memory where CPERs will reside should be specified in a GHES entry in
>>>>>> the HEST. Is this not the case with a guest kernel i.e. the guest UEFI creates a
>>>>>> HEST for the guest Kernel?
>>>>>>
>>>>>> If so, then the question is how the guest UEFI finds out where QEMU (acting as
>>>>>> EL3 firmware) will populate the CPERs. This could either be a contract between
>>>>>> the two or a guest DXE driver uses the MM_COMMUNICATE call (see [1]) to ask QEMU
>>>>>> where the memory is.
>>>>>
>>>>> Would invoking the guest UEFI be complex? I do not see the advantage. It seems x86 Qemu
>>>>> generates the ACPI tables directly, but I am not sure; we are checking the Qemu logic.
>>>>> Letting Qemu generate the CPER record may be clearer.
>>>>
>>>> At boot UEFI in the guest will need to make sure the areas of memory that may be
>>>> used for CPER records are reserved. Whether UEFI or Qemu decides where these are
>>>> needs deciding, (but probably not here)...
>>>>
>>>> At runtime, when an error has occurred, I agree it would be simpler (fewer
>>>> components involved) if Qemu generates the CPER records. But if UEFI made the
>>>> memory choice above they need to interact and it gets complicated again. The
>>>> CPER records are defined in the UEFI spec, so I would expect UEFI to contain
>>>> code to generate/parse them.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> James
>>>>
>>>>
>>>> .
>>>>
>>>
> 
> 
> .
> 

From: gengdongjiu@huawei.com (gengdongjiu)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] kvm: pass the virtual SEI syndrome to guest OS
Date: Thu, 6 Apr 2017 20:35:26 +0800	[thread overview]
Message-ID: <7c5c8ab7-8fcc-1c98-0bc1-cccb66c4c84d@huawei.com> (raw)
In-Reply-To: <2a427164-9b37-6711-3a56-906634ba7f12@redhat.com>

Dear Laszlo,
   Thanks for your detailed explanation.

On 2017/3/29 19:58, Laszlo Ersek wrote:
> (This ought to be one of the longest address lists I've ever seen :)
> Thanks for the CC. I'm glad Shannon is already on the CC list. For good
> measure, I'm adding MST and Igor.)
> 
> On 03/29/17 12:36, Achin Gupta wrote:
>> Hi gengdongjiu,
>>
>> On Wed, Mar 29, 2017 at 05:36:37PM +0800, gengdongjiu wrote:
>>>
>>> Hi Laszlo/Biesheuvel/Qemu developer,
>>>
>>>    Now I encounter an issue and want to consult with you in ARM64 platform, as described below:
>>>
>>> when guest OS happen synchronous or asynchronous abort, kvm needs
>>> to send the error address to Qemu or UEFI through sigbus to
>>> dynamically generate APEI table. from my investigation, there are
>>> two ways:
>>>
>>> (1) Qemu get the error address, and generate the APEI table, then
>>> notify UEFI to know this generation, then inject abort error to
>>> guest OS, guest OS read the APEI table.
>>> (2) Qemu get the error address, and let UEFI to generate the APEI
>>> table, then inject abort error to guest OS, guest OS read the APEI
>>> table.
>>
>> Just being pedantic! I don't think we are talking about creating the APEI table
>> dynamically here. The issue is: Once KVM has received an error that is destined
>> for a guest it will raise a SIGBUS to Qemu. Now before Qemu can inject the error
>> into the guest OS, a CPER (Common Platform Error Record) has to be generated
>> corresponding to the error source (GHES corresponding to memory subsystem,
>> processor etc) to allow the guest OS to do anything meaningful with the
>> error. So who should create the CPER is the question.
>>
>> At the EL3/EL2 interface (Secure Firmware and OS/Hypervisor), an error arrives
>> at EL3 and secure firmware (at EL3 or a lower secure exception level) is
>> responsible for creating the CPER. ARM is experimenting with using a Standalone
>> MM EDK2 image in the secure world to do the CPER creation. This will avoid
>> adding the same code in ARM TF in EL3 (better for security). The error will then
>> be injected into the OS/Hypervisor (through SEA/SEI/SDEI) through ARM Trusted
>> Firmware.
>>
>> Qemu is essentially fulfilling the role of secure firmware at the EL2/EL1
>> interface (as discussed with Christoffer below). So it should generate the CPER
>> before injecting the error.
>>
>> This corresponds to (1) above apart from notifying UEFI (I am assuming you
>> mean guest UEFI). At this time, the guest OS already knows where to pick up the
>> CPER from through the HEST. Qemu has to create the CPER and populate its address
>> at the address exported in the HEST. Guest UEFI should not be involved in this
>> flow. Its job was to create the HEST at boot and that has been done by this
>> stage.
>>
>> Qemu folk will be able to add but it looks like support for CPER generation will
>> need to be added to Qemu. We need to resolve this.
>>
>> Do shout if I am missing anything above.
> 
> After reading this email, the use case looks *very* similar to what
> we've just done with VMGENID for QEMU 2.9.
> 
> We have a facility between QEMU and the guest firmware, called "ACPI
> linker/loader", with which QEMU instructs the firmware to
> 
> - allocate and download blobs into guest RAM (AcpiNVS type memory) --
> ALLOCATE command,
> 
> - relocate pointers in those blobs, to fields in other (or the same)
> blobs -- ADD_POINTER command,
> 
> - set ACPI table checksums -- ADD_CHECKSUM command,
> 
> - and send GPAs of fields within such blobs back to QEMU --
> WRITE_POINTER command.
> 
> This is how I imagine we can map the facility to the current use case
> (note that this is the first time I read about HEST / GHES / CPER):
> 
>     etc/acpi/tables                 etc/hardware_errors
>     ================     ==========================================
>                          +-----------+
>     +--------------+     | address   |         +-> +--------------+
>     |    HEST      +     | registers |         |   | Error Status |
>     + +------------+     | +---------+         |   | Data Block 1 |
>     | | GHES       | --> | | address | --------+   | +------------+
>     | | GHES       | --> | | address | ------+     | |  CPER      |
>     | | GHES       | --> | | address | ----+ |     | |  CPER      |
>     | | GHES       | --> | | address | -+  | |     | |  CPER      |
>     +-+------------+     +-+---------+  |  | |     +-+------------+
>                                         |  | |
>                                         |  | +---> +--------------+
>                                         |  |       | Error Status |
>                                         |  |       | Data Block 2 |
>                                         |  |       | +------------+
>                                         |  |       | |  CPER      |
>                                         |  |       | |  CPER      |
>                                         |  |       +-+------------+
>                                         |  |
>                                         |  +-----> +--------------+
>                                         |          | Error Status |
>                                         |          | Data Block 3 |
>                                         |          | +------------+
>                                         |          | |  CPER      |
>                                         |          +-+------------+
>                                         |
>                                         +--------> +--------------+
>                                                    | Error Status |
>                                                    | Data Block 4 |
>                                                    | +------------+
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    | |  CPER      |
>                                                    +-+------------+
> 
> (1) QEMU generates the HEST ACPI table. This table goes in the current
> "etc/acpi/tables" fw_cfg blob. Given N error sources, there will be N
> GHES objects in the HEST.
> 
> (2) We introduce a new fw_cfg blob called "etc/hardware_errors". QEMU
> also populates this blob.
> 
> (2a) Given N error sources, the (unnamed) table of address registers
> will contain N address registers.
> 
> (2b) Given N error sources, the "etc/hardware_errors" fw_cfg blob will
> also contain N Error Status Data Blocks.
> 
> I don't know about the sizing (number of CPERs) each Error Status Data
> Block has to contain, but I understand it is all pre-allocated as far as
> the OS is concerned, which matches our capabilities well.
I have a question here. You wrote that the "etc/hardware_errors" fw_cfg
blob will also contain N Error Status Data Blocks. Because the number of
CPERs is not fixed, how do we assign a size to each Error Status Data
Block when they all live in a single "etc/hardware_errors" fw_cfg blob?
With a single blob, will the N Error Status Data Blocks share one
contiguous buffer, as shown below? If so, it may be inconvenient to
extend the size of an individual data block later.
I see that bios_linker_loader_alloc allocates one contiguous buffer per
blob (such as VMGENID_GUID_FW_CFG_FILE):

    /* Allocate guest memory for the Data fw_cfg blob */
    bios_linker_loader_alloc(linker, VMGENID_GUID_FW_CFG_FILE, guid, 4096,
                             false /* page boundary, high memory */);



                                                 -> +--------------+
     |    HEST      +     | registers |             | Error Status |
     + +------------+     | +---------+             | Data Block  |
     | | GHES       | --> | | address | --------+-->| +------------+
     | | GHES       | --> | | address | ------+     | |  CPER      |
     | | GHES       | --> | | address | ----+ |     | |  CPER      |
     | | GHES       | --> | | address | -+  | |     | |  CPER      |
     +-+------------+     +-+---------+  |  | +---> +--------------+
                                         |  |       | |  CPER      |
                                         |  |       | |  CPER      |
                                         |  +-----> +--------------+
                                         |          | |  CPER      |
                                         +--------> +--------------+
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    | |  CPER      |
                                                    +-+------------+
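
For reference, this is roughly how I picture the single-blob layout. It is
only a sketch: the struct, the function name and the 16-source limit are my
own placeholders, not QEMU code, and I assume the table of N 8-byte address
registers sits at the start of the blob as in your diagram. Each block's
offset is the running sum of the sizes before it, so every offset after
block K shifts if block K grows, which is the inconvenience I mean.

```c
#include <assert.h>
#include <stddef.h>

#define ADDR_REG_SIZE 8u    /* one 64-bit address register per error source */
#define MAX_SOURCES   16u   /* arbitrary limit for this sketch */

/* Hypothetical description of where everything lands inside the blob. */
struct hw_err_layout {
    size_t n;                     /* number of error sources */
    size_t reg_off[MAX_SOURCES];  /* offset of address register K */
    size_t blk_off[MAX_SOURCES];  /* offset of Error Status Data Block K */
    size_t total;                 /* total size of the blob */
};

/* Place N address registers first, then the N data blocks back to back. */
void hw_err_layout_build(struct hw_err_layout *l,
                         const size_t *blk_size, size_t n)
{
    size_t off;
    size_t k;

    assert(n <= MAX_SOURCES);
    l->n = n;
    for (k = 0; k < n; k++) {
        l->reg_off[k] = k * ADDR_REG_SIZE;
    }
    off = n * ADDR_REG_SIZE;
    for (k = 0; k < n; k++) {
        l->blk_off[k] = off;      /* block K starts where block K-1 ended */
        off += blk_size[k];
    }
    l->total = off;
}
```

So the per-block sizes can differ, but they have to be decided when the blob
is built.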



So how about using a separate "etc/hardware_errorsN" blob for each Error
Status Data Block? Then we would have:

etc/hardware_errors0
etc/hardware_errors1
...
etc/hardware_errors10
(the maximum N is 10)


N can be one of the values below, according to the ACPI spec,
"Table 18-345 Hardware Error Notification Structure":
0 - Polled
1 - External Interrupt
2 - Local Interrupt
3 - SCI
4 - NMI
5 - CMCI
6 - MCE
7 - GPIO-Signal
8 - ARMv8 SEA
9 - ARMv8 SEI
10 - External Interrupt - GSIV
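
A small sketch of the naming I have in mind, one blob per notification type.
The enum constants and the helper name are made up for illustration; only
the numeric values come from the table above.

```c
#include <assert.h>
#include <stdio.h>

/* Notification types from the list above; the constant names are hypothetical. */
enum hest_notify_type {
    NOTIFY_POLLED       = 0,
    NOTIFY_EXTERNAL_IRQ = 1,
    NOTIFY_LOCAL_IRQ    = 2,
    NOTIFY_SCI          = 3,
    NOTIFY_NMI          = 4,
    NOTIFY_CMCI         = 5,
    NOTIFY_MCE          = 6,
    NOTIFY_GPIO         = 7,
    NOTIFY_SEA          = 8,
    NOTIFY_SEI          = 9,
    NOTIFY_GSIV         = 10,
    NOTIFY_TYPE_COUNT
};

/*
 * Format the per-type blob name, e.g. "etc/hardware_errors8" for SEA.
 * Returns what snprintf returns: the length of the full name.
 */
int hw_err_blob_name(char *buf, size_t len, int type)
{
    assert(type >= 0 && type < NOTIFY_TYPE_COUNT);
    return snprintf(buf, len, "etc/hardware_errors%d", type);
}
```

Each such blob would then get its own allocation, so its size could be
chosen independently of the others.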




> 
> (3) QEMU generates the ACPI linker/loader script for the firmware, as
> always.
> 
> (3a) The HEST table is part of "etc/acpi/tables", which the firmware
> already allocates memory for, and downloads (because QEMU already
> generates an ALLOCATE linker/loader command for it already).
> 
> (3b) QEMU will have to create another ALLOCATE command for the
> "etc/hardware_errors" blob. The firmware allocates memory for this blob,
> and downloads it.
> 
> (4) QEMU generates, in the ACPI linker/loader script for the firmware, N
> ADD_POINTER commands, which point the GHES."Error Status
> Address" fields in the HEST table, to the corresponding address
> registers in the downloaded "etc/hardware_errors" blob.
> 
> (5) QEMU generates an ADD_CHECKSUM command for the firmware, so that the
> HEST table is correctly checksummed after executing the N ADD_POINTER
> commands from (4).
> 
> (6) QEMU generates N ADD_POINTER commands for the firmware, pointing the
> address registers (located in guest memory, in the downloaded
> "etc/hardware_errors" blob) to the respective Error Status Data Blocks.
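
If I understand steps (4) and (6) correctly, ADD_POINTER patches a pointer
field by adding the guest address at which the firmware placed the target
blob to the offset that QEMU stored in that field beforehand. A toy model of
that follows. The function names are hypothetical, a plain buffer stands in
for guest memory, and I assume 8-byte little-endian pointer fields; this is
not the firmware's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a little-endian 64-bit value from p. */
uint64_t ld_le64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Store v at p as a little-endian 64-bit value. */
void st_le64(uint8_t *p, uint64_t v)
{
    int i;

    for (i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Model of ADD_POINTER: the 8-byte field at dst[dst_off] initially holds an
 * offset into the source blob; add the source blob's guest base address so
 * that the field becomes a guest physical address.
 */
void add_pointer(uint8_t *dst, size_t dst_len, size_t dst_off,
                 uint64_t src_base)
{
    assert(dst_off + 8 <= dst_len);
    st_le64(dst + dst_off, ld_le64(dst + dst_off) + src_base);
}
```

For step (6) the destination and the source would both be the
"etc/hardware_errors" blob, so src_base is that blob's own allocation
address.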
> 
> (7) (This is the trick.) For this step, we need a third, write-only
> fw_cfg blob, called "etc/hardware_errors_addr". Through that blob, the
> firmware can send back the guest-side allocation addresses to QEMU.
> 
> Namely, the "etc/hardware_errors_addr" blob contains N 8-byte entries.
> QEMU generates N WRITE_POINTER commands for the firmware.
> 
> For error source K (0 <= K < N), QEMU instructs the firmware to
> calculate the guest address of Error Status Data Block K, from the
> QEMU-dictated offset within "etc/hardware_errors", and from the
> guest-determined allocation base address for "etc/hardware_errors". The
> firmware then writes the calculated address back to fw_cfg file
> "etc/hardware_errors_addr", at offset K*8, according to the
> WRITE_POINTER command.
> 
> This way QEMU will know the GPA of each Error Status Data Block.
> 
> (In fact this can be simplified to a single WRITE_POINTER command: the
> address of the "address register table" can be sent back to QEMU as
> well, which already contains all Error Status Data Block addresses.)
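
To check my understanding of step (7), here is a toy model of the N
WRITE_POINTER commands: for each error source K, the value written back at
offset K*8 is the guest allocation base of "etc/hardware_errors" plus the
QEMU-dictated offset of block K. The function names are hypothetical, the
buffer stands in for the "etc/hardware_errors_addr" file, and little-endian
8-byte entries are my assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a little-endian 64-bit value from p. */
uint64_t ld_le64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Store v at p as a little-endian 64-bit value. */
void st_le64(uint8_t *p, uint64_t v)
{
    int i;

    for (i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Model of N WRITE_POINTER commands: entry K of the address file receives
 * base + blk_off[K], the guest address of Error Status Data Block K.
 */
void report_block_addrs(uint8_t *addr_file, size_t addr_len,
                        uint64_t base, const uint64_t *blk_off, size_t n)
{
    size_t k;

    assert(n * 8 <= addr_len);
    for (k = 0; k < n; k++) {
        st_le64(addr_file + k * 8, base + blk_off[k]);
    }
}
```

On the QEMU side, reading entry K back then gives the GPA where the CPER for
error source K has to be written.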
> 
> (8) When QEMU gets SIGBUS from the kernel -- I hope that's going to come
> through a signalfd -- QEMU can format the CPER right into guest memory,
> and then inject whatever interrupt (or assert whatever GPIO line) is
> necessary for notifying the guest.
> 
> (9) This notification (in virtual hardware) can either be handled by the
> guest kernel stand-alone, or else the guest kernel can invoke an ACPI
> event handler method with it (which would be in the DSDT or one of the
> SSDTs, also generated by QEMU). The ACPI event handler method could
> invoke the specific guest kernel driver for error handling via a
> Notify() operation.
> 
> I'm attracted to the above design because:
> - it would leave the firmware alone after OS boot, and
> - it would leave the firmware blissfully ignorant about HEST, GHES,
> CPER, and the like. (That's why QEMU's ACPI linker/loader was invented
> in the first place.)
> 
> Thanks
> Laszlo
> 
>>>    Do you think which modules generates the APEI table is better? UEFI or Qemu?
>>>
>>>
>>>
>>>
>>> On 2017/3/28 21:40, James Morse wrote:
>>>> Hi gengdongjiu,
>>>>
>>>> On 28/03/17 13:16, gengdongjiu wrote:
>>>>> On 2017/3/28 19:54, Achin Gupta wrote:
>>>>>> On Tue, Mar 28, 2017 at 01:23:28PM +0200, Christoffer Dall wrote:
>>>>>>> On Tue, Mar 28, 2017 at 11:48:08AM +0100, James Morse wrote:
>>>>>>>> On the host, part of UEFI is involved to generate the CPER records.
>>>>>>>> In a guest?, I don't know.
>>>>>>>> Qemu could generate the records, or drive some other component to do it.
>>>>>>>
>>>>>>> I think I am beginning to understand this a bit.  Since the guest UEFI
>>>>>>> instance is specifically built for the machine it runs on, QEMU's virt
>>>>>>> machine in this case, they could simply agree (by some contract) to
>>>>>>> place the records at some specific location in memory, and if the guest
>>>>>>> kernel asks its guest UEFI for that location, things should just work by
>>>>>>> having logic in QEMU to process error reports and populate guest memory.
>>>>>>>
>>>>>>> Is this how others see the world too?
>>>>>>
>>>>>> I think so!
>>>>>>
>>>>>> AFAIU, the memory where CPERs will reside should be specified in a GHES entry in
>>>>>> the HEST. Is this not the case with a guest kernel i.e. the guest UEFI creates a
>>>>>> HEST for the guest Kernel?
>>>>>>
>>>>>> If so, then the question is how the guest UEFI finds out where QEMU (acting as
>>>>>> EL3 firmware) will populate the CPERs. This could either be a contract between
>>>>>> the two or a guest DXE driver uses the MM_COMMUNICATE call (see [1]) to ask QEMU
>>>>>> where the memory is.
>>>>>
>>>>> Would invoking the guest UEFI be complex? I do not see the advantage. It seems x86 Qemu
>>>>> directly generates the ACPI table, but I am not sure; we are checking the Qemu logic.
>>>>> Letting Qemu generate the CPER record may be clearer.
>>>>
>>>> At boot UEFI in the guest will need to make sure the areas of memory that may be
>>>> used for CPER records are reserved. Whether UEFI or Qemu decides where these are
>>>> needs deciding, (but probably not here)...
>>>>
>>>> At runtime, when an error has occurred, I agree it would be simpler (fewer
>>>> components involved) if Qemu generates the CPER records. But if UEFI made the
>>>> memory choice above they need to interact and it gets complicated again. The
>>>> CPER records are defined in the UEFI spec, so I would expect UEFI to contain
>>>> code to generate/parse them.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> James
>>>>
>>>>
>>>> .
>>>>
>>>
> 
> 
> .
> 
