From: Peter Maydell <peter.maydell@linaro.org> To: gengdongjiu <gengdongjiu@huawei.com> Cc: Fam Zheng <fam@euphon.net>, Eduardo Habkost <ehabkost@redhat.com>, kvm-devel <kvm@vger.kernel.org>, "Michael S. Tsirkin" <mst@redhat.com>, Jonathan Cameron <jonathan.cameron@huawei.com>, Marcelo Tosatti <mtosatti@redhat.com>, QEMU Developers <qemu-devel@nongnu.org>, Linuxarm <linuxarm@huawei.com>, Shannon Zhao <shannon.zhaosl@gmail.com>, Zheng Xiang <zhengxiang9@huawei.com>, qemu-arm <qemu-arm@nongnu.org>, James Morse <james.morse@arm.com>, "xuwei (O)" <xuwei5@huawei.com>, Igor Mammedov <imammedo@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <rth@twiddle.net> Subject: Re: [PATCH v22 8/9] target-arm: kvm64: handle SIGBUS signal from kernel or KVM Date: Mon, 20 Jan 2020 12:15:21 +0000 Message-ID: <CAFEAcA_Qs3p=iEU+D5iqjyZYpPQO0D16AWvjp0wcvbvRNdGAGg@mail.gmail.com> (raw) In-Reply-To: <c89db331-cb94-8e0b-edf8-25bfb64f826d@huawei.com> On Fri, 17 Jan 2020 at 10:05, gengdongjiu <gengdongjiu@huawei.com> wrote: > > On 2020/1/17 0:28, Peter Maydell wrote: > > On Wed, 8 Jan 2020 at 11:33, Dongjiu Geng <gengdongjiu@huawei.com> wrote: > > > >> +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr) > >> +{ > >> + ram_addr_t ram_addr; > >> + hwaddr paddr; > >> + > >> + assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO); > >> + > >> + if (acpi_enabled && addr && > >> + object_property_get_bool(qdev_get_machine(), "ras", NULL)) { > >> + ram_addr = qemu_ram_addr_from_host(addr); > >> + if (ram_addr != RAM_ADDR_INVALID && > >> + kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) { > >> + kvm_hwpoison_page_add(ram_addr); > >> + /* > >> + * Asynchronous signal will be masked by main thread, so > >> + * only handle synchronous signal. > >> + */ > > > > I don't understand this comment. (I think we've had discussions > > about it before, but it's still not clear to me.) > > > > This function (kvm_arch_on_sigbus_vcpu()) will be called in two contexts: > > > > (1) in the vcpu thread: > > * the real SIGBUS handler sigbus_handler() sets a flag and arranges > > for an immediate vcpu exit > > * the vcpu thread reads the flag on exit from KVM_RUN and > > calls kvm_arch_on_sigbus_vcpu() directly > > * the error could be MCEERR_AR or MCEERR_AOFor the vcpu thread, the error can be MCEERR_AR or MCEERR_AO, > but kernel/KVM usually uses MCEERR_AR(action required) instead of MCEERR_AO, because it needs do action immediately. For MCEERR_AO error, the action is optional and the error can be ignored. > At least I do not find Linux kernel/KVM deliver MCEERR_AO in the vcpu threads. > > > (2) MCE errors on other threads: > > * here SIGBUS is blocked, so MCEERR_AR (action-required) > > errors will cause the kernel to just kill the QEMU process > > * MCEERR_AO errors will be handled via the iothread's use > > of signalfd(), so kvm_on_sigbus() will get called from > > the main thread, and it will call kvm_arch_on_sigbus_vcpu() > > * in this case the passed in CPUState will (arbitrarily) be that > > for the first vCPU > > For the MCE errors on other threads, it can only handle MCEERR_AO. If it is MCEERR_AR, the QEMU will assert and exit[2]. > > Case1: Other APP indeed can send MCEERR_AO to QEMU, QEMU handle it via the iothread's use of signalfd() through above path. > Case2: But if the MCEERR_AO is delivered by kernel, I see QEMU ignore it because SIGBUS is masked in main thread[3], for this case, I do not see QEMU handle it via signalfd() for MCEERR_AO errors from my test. SIGBUS is blocked in the main thread because we use signalfd(). The function sigfd_handler() should be called and it will then manually invoke the correct function for the signal. > For Case1,I think we should not let guest know it, because it is not triggered by guest. only other APP send SIGBUS to tell QEMU do somethings. I don't understand what you mean here by "other app" or "guest" triggering of MCEERR. I thought that an MCEERR meant "the hardware has detected that there is a problem with the RAM". If there's a problem with the RAM and it's the RAM that's being used as guest RAM, we need to tell the guest, surely ? > For Case2,it does not call call kvm_arch_on_sigbus_vcpu(). It should do. The code you quote calls that function for that case: > [1]: > /* Called synchronously (via signalfd) in main thread. */ > int kvm_on_sigbus(int code, void *addr) > { > #ifdef KVM_HAVE_MCE_INJECTION > /* Action required MCE kills the process if SIGBUS is blocked. Because > * that's what happens in the I/O thread, where we handle MCE via signalfd, > * we can only get action optional here. > */ > [2]: assert(code != BUS_MCEERR_AR); > kvm_arch_on_sigbus_vcpu(first_cpu, code, addr); > return 0; > #else > return 1; > #endif > } > Above all, from my test, for MCEERR_AO error which is triggered by guest, it not call kvm_arch_on_sigbus_vcpu(). I'm not sure what you mean by "triggered by guest". I assume that exactly what kind of errors the kernel can report and when will depend to some extent on the underlying hardware/firmware implementation of reporting of memory errors, but in principle the ABI allows the kernel to send SIGBUS_(BUS_MCEERR_AO) to the main thread, the signal should be handled by signalfd, our code for working with multiple fds should mean that the main thread calls sigfd_handler() to deal with reading bytes from the signalfd fd, and that function should then call sigbus_handler(), which calls kvm_on_sigbus(), which calls kvm_arch_on_sigbus_vcpu(). If something in that code path is not working then we need to find out what it is. thanks -- PMM
next prev parent reply index Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-01-08 11:32 [PATCH v22 0/9] Add ARMv8 RAS virtualization support in QEMU Dongjiu Geng 2020-01-08 11:32 ` [PATCH v22 1/9] hw/arm/virt: Introduce a RAS machine option Dongjiu Geng 2020-01-08 11:32 ` [PATCH v22 2/9] docs: APEI GHES generation and CPER record description Dongjiu Geng 2020-01-15 15:11 ` Igor Mammedov 2020-01-08 11:32 ` [PATCH v22 3/9] ACPI: Build related register address fields via hardware error fw_cfg blob Dongjiu Geng 2020-01-23 15:14 ` Igor Mammedov 2020-02-02 14:01 ` gengdongjiu 2020-01-08 11:32 ` [PATCH v22 4/9] ACPI: Build Hardware Error Source Table Dongjiu Geng 2020-01-23 15:48 ` Igor Mammedov 2020-02-02 14:21 ` gengdongjiu 2020-02-05 16:43 ` Jonathan Cameron 2020-02-10 11:18 ` gengdongjiu 2020-01-08 11:32 ` [PATCH v22 5/9] ACPI: Record the Generic Error Status Block address Dongjiu Geng 2020-01-16 16:44 ` Peter Maydell 2020-01-17 10:36 ` gengdongjiu 2020-02-13 15:28 ` gengdongjiu 2020-01-17 7:39 ` Philippe Mathieu-Daudé 2020-01-17 10:47 ` gengdongjiu 2020-01-17 11:20 ` Philippe Mathieu-Daudé 2020-01-28 14:41 ` Igor Mammedov 2020-01-28 16:19 ` Igor Mammedov 2020-02-02 12:44 ` gengdongjiu 2020-02-03 7:51 ` Igor Mammedov 2020-01-08 11:32 ` [PATCH v22 6/9] KVM: Move hwpoison page related functions into kvm-all.c Dongjiu Geng 2020-01-08 11:32 ` [PATCH v22 7/9] ACPI: Record Generic Error Status Block(GESB) table Dongjiu Geng 2020-01-28 15:29 ` Igor Mammedov 2020-02-02 13:42 ` gengdongjiu 2020-02-03 7:55 ` Igor Mammedov 2020-01-08 11:32 ` [PATCH v22 8/9] target-arm: kvm64: handle SIGBUS signal from kernel or KVM Dongjiu Geng 2020-01-16 16:28 ` Peter Maydell 2020-01-16 16:40 ` Peter Maydell 2020-01-17 10:04 ` gengdongjiu 2020-01-20 12:15 ` Peter Maydell [this message] 2020-01-22 15:30 ` gengdongjiu 2020-01-08 11:32 ` [PATCH v22 9/9] MAINTAINERS: Add ACPI/HEST/GHES entries Dongjiu Geng 2020-01-16 16:46 ` Peter Maydell 2020-01-17 7:22 ` Philippe Mathieu-Daudé 2020-01-17 10:16 ` gengdongjiu 2020-01-17 11:09 ` Peter Maydell 2020-01-17 11:22 ` Philippe Mathieu-Daudé 2020-01-19 8:19 ` gengdongjiu 2020-01-17 12:19 ` Michael S. Tsirkin 2020-01-09 3:38 ` [PATCH v22 0/9] Add ARMv8 RAS virtualization support in QEMU gengdongjiu
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAFEAcA_Qs3p=iEU+D5iqjyZYpPQO0D16AWvjp0wcvbvRNdGAGg@mail.gmail.com' \ --to=peter.maydell@linaro.org \ --cc=ehabkost@redhat.com \ --cc=fam@euphon.net \ --cc=gengdongjiu@huawei.com \ --cc=imammedo@redhat.com \ --cc=james.morse@arm.com \ --cc=jonathan.cameron@huawei.com \ --cc=kvm@vger.kernel.org \ --cc=linuxarm@huawei.com \ --cc=mst@redhat.com \ --cc=mtosatti@redhat.com \ --cc=pbonzini@redhat.com \ --cc=qemu-arm@nongnu.org \ --cc=qemu-devel@nongnu.org \ --cc=rth@twiddle.net \ --cc=shannon.zhaosl@gmail.com \ --cc=xuwei5@huawei.com \ --cc=zhengxiang9@huawei.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
KVM Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \ kvm@vger.kernel.org public-inbox-index kvm Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.kvm AGPL code for this site: git clone https://public-inbox.org/public-inbox.git