From: Xiongfeng Wang <wangxiongfeng2@huawei.com>
To: James Morse <james.morse@arm.com>, <punit.agrawal@arm.com>
Cc: <mark.rutland@arm.com>, <linux-efi@vger.kernel.org>,
	<kvm@vger.kernel.org>, <rkrcmar@redhat.com>,
	<matt@codeblueprint.co.uk>, <catalin.marinas@arm.com>,
	Tyler Baicar <tbaicar@codeaurora.org>, <will.deacon@arm.com>,
	<robert.moore@intel.com>, <paul.gortmaker@windriver.com>,
	<lv.zheng@intel.com>, <kvmarm@lists.cs.columbia.edu>,
	<fu.wei@linaro.org>, <tn@semihalf.com>, <zjzhang@codeaurora.org>,
	<linux@armlinux.org.uk>, <linux-acpi@vger.kernel.org>,
	<eun.taik.lee@samsung.com>, <shijie.huang@arm.com>,
	<labbott@redhat.com>, <lenb@kernel.org>, <harba@codeaurora.org>,
	<Suzuki.Poulose@arm.com>, <marc.zyngier@arm.com>,
	<john.garry@huawei.com>, <rostedt@goodmis.org>,
	<nkaje@codeaurora.org>, <sandeepa.s.prabhu@gmail.com>,
	<linux-arm-kernel@lists.infradead.org>, <devel@acpica.org>,
	<rjw@rjwysocki.net>, <rruigrok@codeaurora.org>,
	<linux-kernel@vger.kernel.org>, <astone@redhat.com>,
	<hanjun.guo@linaro.org>, <joe@perches.com>, <pbonzini@redhat.com>,
	<akpm@linux-foundation.org>, <bristot@redhat.com>,
	<christoffer.dall@linaro.org>, <shiju.jose@huawei.com>
Subject: Re: [PATCH V11 10/10] arm/arm64: KVM: add guest SEA support
Date: Tue, 28 Feb 2017 14:25:07 +0800
Message-ID: <3df82a7b-1a32-e2d7-ae78-7132a4eab1a0@huawei.com>
In-Reply-To: <58B43092.6040401@arm.com>

Hi James,

On 2017/2/27 21:58, James Morse wrote:
> Hi Wang Xiongfeng,
> 
> On 25/02/17 07:15, Xiongfeng Wang wrote:
>> On 2017/2/22 5:22, Tyler Baicar wrote:
>>> Currently external aborts are unsupported by the guest abort
>>> handling. Add handling for SEAs so that the host kernel reports
>>> SEAs which occur in the guest kernel.
> 
>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>> index a5265ed..04f1dd50 100644
>>> --- a/arch/arm/kvm/mmu.c
>>> +++ b/arch/arm/kvm/mmu.c
>>> @@ -1444,8 +1445,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>>  
>>>  	/* Check the stage-2 fault is trans. fault or write fault */
>>>  	fault_status = kvm_vcpu_trap_get_fault_type(vcpu);
>>> -	if (fault_status != FSC_FAULT && fault_status != FSC_PERM &&
>>> -	    fault_status != FSC_ACCESS) {
>>> +
>>> +	/* The host kernel will handle the synchronous external abort. There
>>> +	 * is no need to pass the error into the guest.
>>> +	 */
> 
>> Can we inject an SEA into the guest, so that the guest can kill the
>> application which causes the error if the guest won't be terminated
>> later? I'm not sure whether ghes_handle_memory_failure() called in
>> ghes_do_proc() will kill the qemu process. I think it only kills user
>> processes marked with PF_MCE_PROCESS & PF_MCE_EARLY.
> 
> My understanding is the pages will get unmapped and recovered where possible
> (e.g. re-read from disk), the user space process will get SIGBUS/SIGSEGV when it
> next tries to access that page, which could be some time later.
> These flags in find_early_kill_thread() are a way to make the memory-failure
> code signal the process early, before it does any recovery. The 'MCE' makes me
> think it's x86 specific.
> (early and late are described more in [0])
> 
> 
> Guests are a special case as QEMU may never access the faulty memory itself, so
> it won't receive the 'late' signal. It looks like ARM/arm64 KVM lacks support
> for KVM_PFN_ERR_HWPOISON which sends SIGBUS from KVM's fault-handling code. I
> have patches to add support for this which I intend to send at rc1.
> 
> [0] suggests 'KVM qemu' sets these MCE flags to take the 'early' path, but given
> x86's KVM_PFN_ERR_HWPOISON, this may be out of date.
> 
> 
> Either way, once QEMU gets a signal indicating the virtual address, it can
> generate its own APEI CPER records and use the KVM APIs to mock up a
> Synchronous External Abort (or inject an IRQ or run the vcpu waiting for the
> guest's polling thread to come round, whichever was described to the guest via
> the HEST/GHES tables).
> 
> We can't hand the APEI CPER records we have in the kernel to the guest, as they
> hold a host physical address, and maybe a host virtual address. We don't know
> where in guest memory we could write new APEI CPER records as these locations
> have to be reserved in the guests-UEFI memory map, and only QEMU knows where
> they are.
> 
> To deliver RAS events to a guest we have to get QEMU involved.

Thanks for your reply!

I have another idea about the SEA handling procedure. Can we split SEA
handling into two procedures? The first procedure does the more urgent
work, sending SIGBUS to the user process or panicking, just as
PATCH 04/10 does. The second procedure does the APEI analysis work,
including calling memory_failure(). The second procedure is executed
when an actual error is detected in memory, such as a 2-bit ECC error
on a memory read or write, in which case the memory controller
generates a fault handling interrupt, as the RAS Extension
specification says.

We can route this fault handling interrupt to EL3. After the BIOS has
filled in the HEST table, it can notify the OS with an IRQ, and the
second procedure is executed in the IRQ handler. The notification type
in the HEST/GHES tables is GSIV.

When an uncorrectable data error is detected on write data, a fault
handling interrupt is generated and no SEA is generated, as section
6.4.4 of the RAS Extension specification says. In this situation the
second procedure should still be executed, since the error occurred in
memory.

In the ARM/arm64 KVM case, when an SEA takes place, an SEA is injected
directly into the guest OS in kvm_handle_guest_abort(), and the guest
OS can execute the first procedure.

When the host OS executes the second procedure and analyses the HEST
table, it sends SIGBUS to the qemu process in memory_failure(). The
qemu process can then mock up a HEST table with the IPA of the
erroneous data and notify the guest OS with an IRQ, so that the second
procedure is executed in the guest OS. Is this idea reasonable?
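
To make the split concrete, here is a rough user-space sketch of the
two pieces. It is only an illustration of the idea: every name in it
(sea_first_procedure, struct ram_slot, hva_to_ipa) is hypothetical and
is not a kernel, KVM or qemu interface. It also assumes that an SEA
taken in host kernel context ends in a panic, and that guest RAM is
described to qemu as a few contiguous regions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none are kernel, KVM or qemu APIs. */

enum sea_origin { SEA_HOST_USER, SEA_HOST_KERNEL, SEA_GUEST };
enum urgent_action { SEND_SIGBUS, PANIC, INJECT_SEA_INTO_GUEST };

/* First procedure: urgent work decided directly in the abort handler. */
enum urgent_action sea_first_procedure(enum sea_origin origin)
{
	switch (origin) {
	case SEA_HOST_USER:
		return SEND_SIGBUS;
	case SEA_GUEST:
		return INJECT_SEA_INTO_GUEST;
	default:
		/* Assumption: an SEA in host kernel context is fatal. */
		return PANIC;
	}
}

/* One contiguous guest RAM region as the VMM sees it (assumed layout). */
struct ram_slot {
	uint64_t hva_base;	/* host virtual address of the region */
	uint64_t ipa_base;	/* guest intermediate physical address */
	uint64_t size;		/* length in bytes */
};

/*
 * Second procedure, VMM side: translate the host virtual address that
 * arrives with SIGBUS into the IPA the guest-visible error record needs.
 * Returns false when the address is not backed by guest RAM.
 */
bool hva_to_ipa(const struct ram_slot *slots, size_t n,
		uint64_t hva, uint64_t *ipa)
{
	for (size_t i = 0; i < n; i++) {
		if (hva >= slots[i].hva_base &&
		    hva - slots[i].hva_base < slots[i].size) {
			*ipa = slots[i].ipa_base + (hva - slots[i].hva_base);
			return true;
		}
	}
	return false;
}
```

The first function only picks the urgent action; the second is the
address translation qemu would need before it can describe the error
to the guest and raise the IRQ.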


Thanks!
Wang Xiongfeng
