From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752385AbdB0PTE (ORCPT );
        Mon, 27 Feb 2017 10:19:04 -0500
Received: from foss.arm.com ([217.140.101.70]:54742 "EHLO foss.arm.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751854AbdB0PTC (ORCPT );
        Mon, 27 Feb 2017 10:19:02 -0500
Message-ID: <58B43092.6040401@arm.com>
Date: Mon, 27 Feb 2017 13:58:42 +0000
From: James Morse
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.6.0
MIME-Version: 1.0
To: Xiongfeng Wang , punit.agrawal@arm.com
CC: Tyler Baicar , christoffer.dall@linaro.org, marc.zyngier@arm.com,
        pbonzini@redhat.com, rkrcmar@redhat.com, linux@armlinux.org.uk,
        catalin.marinas@arm.com, will.deacon@arm.com, rjw@rjwysocki.net,
        lenb@kernel.org, matt@codeblueprint.co.uk, robert.moore@intel.com,
        lv.zheng@intel.com, nkaje@codeaurora.org, zjzhang@codeaurora.org,
        mark.rutland@arm.com, akpm@linux-foundation.org,
        eun.taik.lee@samsung.com, sandeepa.s.prabhu@gmail.com,
        labbott@redhat.com, shijie.huang@arm.com, rruigrok@codeaurora.org,
        paul.gortmaker@windriver.com, tn@semihalf.com, fu.wei@linaro.org,
        rostedt@goodmis.org, bristot@redhat.com,
        linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu,
        kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
        linux-acpi@vger.kernel.org, linux-efi@vger.kernel.org,
        devel@acpica.org, Suzuki.Poulose@arm.com, astone@redhat.com,
        harba@codeaurora.org, hanjun.guo@linaro.org, john.garry@huawei.com,
        shiju.jose@huawei.com, joe@perches.com
Subject: Re: [PATCH V11 10/10] arm/arm64: KVM: add guest SEA support
References: <1487712121-16688-1-git-send-email-tbaicar@codeaurora.org>
        <1487712121-16688-11-git-send-email-tbaicar@codeaurora.org>
In-Reply-To:
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Wang Xiongfeng,

On 25/02/17 07:15,
Xiongfeng Wang wrote:
> On 2017/2/22 5:22, Tyler Baicar wrote:
>> Currently external aborts are unsupported by the guest abort
>> handling. Add handling for SEAs so that the host kernel reports
>> SEAs which occur in the guest kernel.

>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index a5265ed..04f1dd50 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -1444,8 +1445,21 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>
>>  	/* Check the stage-2 fault is trans. fault or write fault */
>>  	fault_status = kvm_vcpu_trap_get_fault_type(vcpu);
>> -	if (fault_status != FSC_FAULT && fault_status != FSC_PERM &&
>> -	    fault_status != FSC_ACCESS) {
>> +
>> +	/* The host kernel will handle the synchronous external abort. There
>> +	 * is no need to pass the error into the guest.
>> +	 */

> Can we inject an sea into the guest, so that the guest can kill the
> application which causes the error if the guest won't be terminated
> later. I'm not sure whether ghes_handle_memory_failure() called in
> ghes_do_proc() will kill the qemu process. I think it only kill user
> processes marked with PF_MCE_PROCESS & PF_MCE_EARLY.

My understanding is that the pages will get unmapped and recovered where
possible (e.g. re-read from disk); the user space process will then get
SIGBUS/SIGSEGV when it next tries to access that page, which could be
some time later.

These flags in find_early_kill_thread() are a way to make the
memory-failure code signal the process early, before it does any
recovery. The 'MCE' makes me think it's x86 specific.
('early' and 'late' are described in more detail in [0].)

Guests are a special case, as QEMU may never access the faulty memory
itself, so it won't receive the 'late' signal. It looks like ARM/arm64
KVM lacks support for KVM_PFN_ERR_HWPOISON, which sends SIGBUS from
KVM's fault-handling code. I have patches to add support for this which
I intend to send at rc1.

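
As an illustration of the 'early' versus 'late' signalling discussed
above, here is a minimal user-space sketch in C. It assumes the Linux
prctl(2) option PR_MCE_KILL and the SIGBUS si_code values BUS_MCEERR_AO
and BUS_MCEERR_AR described in the man pages; none of this is taken from
the patch series. The helper names (is_memory_failure_code,
describe_poison, setup_poison_reporting) are hypothetical, and whether
a given kernel or architecture honours the early-kill request is not
something this sketch can guarantee.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>

/* Last poisoned address and si_code seen by the handler (illustrative). */
static void *volatile last_poison_addr;
static volatile sig_atomic_t last_poison_code;

/* Returns 1 if si_code is one of the two memory-failure SIGBUS codes. */
static int is_memory_failure_code(int si_code)
{
	return si_code == BUS_MCEERR_AR || si_code == BUS_MCEERR_AO;
}

/* Human-readable label; must not be called from the signal handler. */
static const char *describe_poison(int si_code)
{
	assert(is_memory_failure_code(si_code));
	return si_code == BUS_MCEERR_AR ? "action required"
					: "action optional";
}

/* SIGBUS handler: only records the faulting virtual address and code. */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	if (is_memory_failure_code(info->si_code)) {
		last_poison_addr = info->si_addr;
		last_poison_code = info->si_code;
	}
}

/*
 * Install the handler, then ask for 'early' notification.
 * Returns 0 on success and -1 if either step fails; the prctl() call
 * can fail on kernels that do not support PR_MCE_KILL.
 */
static int setup_poison_reporting(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, NULL) != 0)
		return -1;

	return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}
```

A process would call setup_poison_reporting() once at start-up and
later inspect last_poison_addr outside the handler. The handler only
stores two values so that it stays async-signal-safe.
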
[0] suggests 'KVM qemu' sets these MCE flags to take the 'early' path,
but given x86's KVM_PFN_ERR_HWPOISON, this may be out of date.

Either way, once QEMU gets a signal indicating the virtual address, it
can generate its own APEI CPER records and use the KVM APIs to mock up a
Synchronous External Abort (or inject an IRQ, or run the vcpu and wait
for the guest's polling thread to come round, whichever was described to
the guest via the HEST/GHES tables).

We can't hand the APEI CPER records we have in the kernel to the guest,
as they hold a host physical address, and maybe a host virtual address.
We don't know where in guest memory we could write new APEI CPER
records, as these locations have to be reserved in the guest's UEFI
memory map, and only QEMU knows where they are.

To deliver RAS events to a guest we have to get QEMU involved.


Thanks,

James


[0] https://www.kernel.org/doc/Documentation/vm/hwpoison.txt
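
The point above, that the signal carries a host virtual address while
any records built for the guest would presumably need a guest address,
can be sketched as a lookup over the regions a VMM has mapped as guest
RAM. This is a minimal sketch in C under stated assumptions: struct
ram_slot and hva_to_gpa() are hypothetical names invented for
illustration, and this is not how QEMU or KVM actually store their
memory slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest RAM region as the VMM mapped it. */
struct ram_slot {
	uintptr_t host_va;	/* start of the mapping in the VMM's address space */
	uint64_t guest_pa;	/* guest physical address the region is exposed at */
	uint64_t size;		/* length of the region in bytes */
};

/*
 * Translate a host virtual address (e.g. si_addr from SIGBUS) into a
 * guest physical address. Returns 0 and stores the result in *gpa if
 * the address falls inside one of the slots, or -1 if it does not
 * belong to guest RAM.
 */
static int hva_to_gpa(const struct ram_slot *slots, size_t nslots,
		      uintptr_t hva, uint64_t *gpa)
{
	assert(gpa != NULL);

	for (size_t i = 0; i < nslots; i++) {
		const struct ram_slot *s = &slots[i];

		if (hva >= s->host_va &&
		    (uint64_t)(hva - s->host_va) < s->size) {
			*gpa = s->guest_pa + (hva - s->host_va);
			return 0;
		}
	}
	return -1;
}
```

A -1 result would mean the poisoned page is not guest memory at all, in
which case there is nothing to report to the guest.
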