KVM Archive on lore.kernel.org
From: Xiang Zheng <zhengxiang9@huawei.com>
To: Beata Michalska <beata.michalska@linaro.org>
Cc: <pbonzini@redhat.com>, <mst@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>, <shannon.zhaosl@gmail.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Laszlo Ersek <lersek@redhat.com>, <james.morse@arm.com>,
	gengdongjiu <gengdongjiu@huawei.com>, <mtosatti@redhat.com>,
	<rth@twiddle.net>, <ehabkost@redhat.com>,
	<jonathan.cameron@huawei.com>, <xuwei5@huawei.com>,
	<kvm@vger.kernel.org>, <qemu-devel@nongnu.org>,
	<qemu-arm@nongnu.org>, <linuxarm@huawei.com>,
	<wanghaibin.wang@huawei.com>
Subject: Re: [RESEND PATCH v21 5/6] target-arm: kvm64: handle SIGBUS signal from kernel or KVM
Date: Tue, 3 Dec 2019 11:35:07 +0800
Message-ID: <9e22a655-5333-ba65-a00d-712b5b144ff4@huawei.com> (raw)
In-Reply-To: <CADSWDzsEFNMKrC6h4=r70KMzG8XX_5DS1CfGBGBCMmOTfu6qyA@mail.gmail.com>



On 2019/11/27 22:17, Beata Michalska wrote:
> Hi
> 
> On Wed, 27 Nov 2019 at 12:47, Xiang Zheng <zhengxiang9@huawei.com> wrote:
>>
>> Hi Beata,
>>
>> Thanks for your review!
>>
> YAW
> 
>> On 2019/11/22 23:47, Beata Michalska wrote:
>>> Hi,
>>>
>>> On Mon, 11 Nov 2019 at 01:48, Xiang Zheng <zhengxiang9@huawei.com> wrote:
>>>>
>>>> From: Dongjiu Geng <gengdongjiu@huawei.com>
>>>>
>>>> Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type,
>>>> translates the host VA delivered by host to guest PA, then fills this PA
>>>> to guest APEI GHES memory, then notifies guest according to the SIGBUS
>>>> type.
>>>>
>>>> When the guest accesses the poisoned memory, it generates a Synchronous
>>>> External Abort (SEA). The host kernel then gets an APEI notification and
>>>> calls memory_failure() to unmap the affected page in stage 2, and finally
>>>> returns to the guest.
>>>>
>>>> If the guest continues to access the PG_hwpoison page, it traps to KVM as
>>>> a stage 2 fault, and a SIGBUS_MCEERR_AR synchronous signal is delivered to
>>>> QEMU. QEMU records this error address in guest APEI GHES memory and
>>>> notifies the guest using a Synchronous External Abort (SEA).
>>>>
>>>> In order to inject a vSEA, we introduce the kvm_inject_arm_sea() function
>>>> in which we can setup the type of exception and the syndrome information.
>>>> When switching to guest, the target vcpu will jump to the synchronous
>>>> external abort vector table entry.
>>>>
>>>> The ESR_ELx.DFSC is set to synchronous external abort (0x10), and the
>>>> ESR_ELx.FnV is set to not valid (0x1), which tells the guest that FAR is
>>>> not valid and holds an UNKNOWN value. These values are written to the KVM
>>>> register structures through the KVM_SET_ONE_REG ioctl.
>>>>
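
To make the syndrome described above concrete, here is a minimal sketch of
how such a value could be assembled. It is not the code from this patch:
sea_syndrome() and the macro names are hypothetical, and the field positions
(EC in bits [31:26], IL in bit 25, FnV in bit 10, DFSC in bits [5:0]) are
taken from the Arm ARM description of ESR_ELx for a Data Abort.

```c
#include <stdint.h>

/* ESR_ELx field positions for a Data Abort (assumed from the Arm ARM). */
#define ESR_EC_SHIFT        26
#define ESR_IL              (1u << 25)  /* 32-bit instruction length */
#define ESR_FNV             (1u << 10)  /* FAR not valid */
#define ESR_DFSC_SEA        0x10u       /* synchronous external abort */

/* Exception classes for data aborts. */
#define EC_DATAABORT_LOWER  0x24u       /* taken from a lower EL */
#define EC_DATAABORT_SAME   0x25u       /* taken without a change in EL */

/*
 * Hypothetical helper: build the syndrome for an injected SEA.
 * same_el is non-zero when the guest was already running at EL1.
 */
static uint32_t sea_syndrome(int same_el)
{
    uint32_t ec = same_el ? EC_DATAABORT_SAME : EC_DATAABORT_LOWER;

    return (ec << ESR_EC_SHIFT) | ESR_IL | ESR_FNV | ESR_DFSC_SEA;
}
```

With these assumptions, an abort taken from EL0 would give 0x92000410 and
one taken from EL1 would give 0x96000410; in both cases the low six bits
are the DFSC value 0x10 and bit 10 marks FAR as not valid.
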
>>>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>>>> Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>> ---
>>>>  hw/acpi/acpi_ghes.c         | 297 ++++++++++++++++++++++++++++++++++++
>>>>  include/hw/acpi/acpi_ghes.h |   4 +
>>>>  include/sysemu/kvm.h        |   3 +-
>>>>  target/arm/cpu.h            |   4 +
>>>>  target/arm/helper.c         |   2 +-
>>>>  target/arm/internals.h      |   5 +-
>>>>  target/arm/kvm64.c          |  64 ++++++++
>>>>  target/arm/tlb_helper.c     |   2 +-
>>>>  target/i386/cpu.h           |   2 +
>>>>  9 files changed, 377 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/hw/acpi/acpi_ghes.c b/hw/acpi/acpi_ghes.c
>>>> index 42c00ff3d3..f5b54990c0 100644
>>>> --- a/hw/acpi/acpi_ghes.c
>>>> +++ b/hw/acpi/acpi_ghes.c
>>>> @@ -39,6 +39,34 @@
>>>>  /* The max size in bytes for one error block */
>>>>  #define ACPI_GHES_MAX_RAW_DATA_LENGTH       0x1000
>>>>
>>>> +/*
>>>> + * The total size of Generic Error Data Entry
>>>> + * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
>>>> + * Table 18-343 Generic Error Data Entry
>>>> + */
>>>> +#define ACPI_GHES_DATA_LENGTH               72
>>>> +
>>>> +/*
>>>> + * The memory section CPER size,
>>>> + * UEFI 2.6: N.2.5 Memory Error Section
>>>> + */
>>>> +#define ACPI_GHES_MEM_CPER_LENGTH           80
>>>> +
>>>> +/*
>>>> + * Masks for block_status flags
>>>> + */
>>>> +#define ACPI_GEBS_UNCORRECTABLE         1
>>>
>>> Why not list all supported statuses, similar to error severity below?
>>>
>>
>> We now only use the first bit for uncorrectable error. The correctable errors
>> are handled in host and would not be delivered to QEMU.
>>
>> I think it's unnecessary to list all the bit masks.
> 
> I'm not sure we are using all the error severity types either, but fair enough.
>>
>>>> +
>>>> +/*
>>>> + * Values for error_severity field
>>>> + */
>>>> +enum AcpiGenericErrorSeverity {
>>>> +    ACPI_CPER_SEV_RECOVERABLE,
>>>> +    ACPI_CPER_SEV_FATAL,
>>>> +    ACPI_CPER_SEV_CORRECTED,
>>>> +    ACPI_CPER_SEV_NONE,
>>>> +};
>>>> +
>>>>  /*
>>>>   * Now only support ARMv8 SEA notification type error source
>>>>   */
>>>> @@ -49,6 +77,16 @@
>>>>   */
>>>>  #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
>>>>
>>>> +#define UUID_BE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)        \
>>>> +    {{{ ((a) >> 24) & 0xff, ((a) >> 16) & 0xff, ((a) >> 8) & 0xff, (a) & 0xff, \
>>>> +    ((b) >> 8) & 0xff, (b) & 0xff,                   \
>>>> +    ((c) >> 8) & 0xff, (c) & 0xff,                    \
>>>> +    (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) } } }
>>>> +
>>>> +#define UEFI_CPER_SEC_PLATFORM_MEM                   \
>>>> +    UUID_BE(0xA5BC1114, 0x6F64, 0x4EDE, 0xB8, 0x63, 0x3E, 0x83, \
>>>> +    0xED, 0x7C, 0x83, 0xB1)
>>>> +
>>>>  /*
> 
> As suggested in different thread - could this be also made common with
> NVMe code ?

Sure, I will make it common in a separate patch.
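
For reference, a rough sketch of the byte layout the UUID_BE() macro above
produces, written against a plain byte array rather than QemuUUID so that it
stands alone. UUID_BE_BYTES and cper_sec_platform_mem are names made up for
this illustration and say nothing about what the common helper will end up
looking like.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone variant of UUID_BE(): the first three fields
 * are stored most-significant byte first, the last eight bytes as given.
 */
#define UUID_BE_BYTES(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)  \
    { ((a) >> 24) & 0xff, ((a) >> 16) & 0xff,                   \
      ((a) >> 8) & 0xff, (a) & 0xff,                            \
      ((b) >> 8) & 0xff, (b) & 0xff,                            \
      ((c) >> 8) & 0xff, (c) & 0xff,                            \
      (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }

/* Same arguments as UEFI_CPER_SEC_PLATFORM_MEM in the patch. */
static const uint8_t cper_sec_platform_mem[16] =
    UUID_BE_BYTES(0xA5BC1114, 0x6F64, 0x4EDE,
                  0xB8, 0x63, 0x3E, 0x83, 0xED, 0x7C, 0x83, 0xB1);
```

The array then starts with A5 BC 11 14 6F 64 4E DE, followed by the eight
trailing bytes unchanged.
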

>>>> @@ -1036,6 +1062,44 @@ int kvm_arch_get_registers(CPUState *cs)
>>>>      return ret;
>>>>  }
>>>>
>>>> +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>>>> +{
>>>> +    ram_addr_t ram_addr;
>>>> +    hwaddr paddr;
>>>> +
>>>> +    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
>>>> +
>>>> +    if (acpi_enabled && addr &&
>>>> +            object_property_get_bool(qdev_get_machine(), "ras", NULL)) {
>>>> +        ram_addr = qemu_ram_addr_from_host(addr);
>>>> +        if (ram_addr != RAM_ADDR_INVALID &&
>>>> +            kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
>>>> +            kvm_hwpoison_page_add(ram_addr);
>>>> +            /*
>>>> +             * Asynchronous signal will be masked by main thread, so
>>>> +             * only handle synchronous signal.
>>>> +             */
>>>
>>> I'm not entirely sure that the comment above is correct (it has been
>>> pointed out before). I would expect the AO signal to be handled here as
>>> well. Not having proper support to do that just yet is another story but
>>> the comment might be bit misleading.
>>>
>>
>> We also expect the AO signal can be handled here. Maybe we could add the comment like:
>>
>> "Asynchronous signal is masked by main thread now. Once it can be asserted, we could
>> handle it." :)
>>
> Still not entirely there - if I'm not mistaken. Both BUS_MCEERR_AR and
> BUS_MCEERR_AO can end up here.
> I'm not entirely sure what you mean by "masked by main thread"? Both will be
> handled by sigbus_handler and as such both will end up here, either
> directly through kvm_on_sigbus
> or through kvm_cpu_exec with pending sigbus. Or am I misguided?
> 

In fact, BUS_MCEERR_AO cannot reach this point, because the QEMU main thread
masks the SIGBUS signal[1] and vcpu threads can only handle BUS_MCEERR_AR.

         QEMU Main Thread   VCPU Threads

Kernel:  Mask SIGBUS        AO SIGBUS would be sent to the QEMU main thread by the kernel (kill_proc())

KVM:     Mask SIGBUS        Only AR SIGBUS is sent to VCPU threads by KVM (kvm_send_hwpoison_signal())


However, maybe we shouldn't rely on the behavior of the kernel or KVM, and
should just keep the logic for handling the AO signal in
kvm_arch_on_sigbus_vcpu(), as the x86 version does.
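
A minimal sketch of what keeping both cases in the handler could look like,
reduced to the decision itself. classify_sigbus() and the sigbus_action enum
are hypothetical names for this illustration only, and the policy for an
address outside guest RAM is an assumption, not something taken from this
series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>

/* Fallbacks for C libraries that do not expose the Linux si_code values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

enum sigbus_action {
    SIGBUS_INJECT_SEA,   /* action required: notify the guest synchronously */
    SIGBUS_RECORD_ONLY,  /* action optional: nothing to inject right now */
    SIGBUS_FATAL,        /* action required, but the address is not guest RAM */
};

/*
 * Hypothetical decision helper. addr_is_guest_ram is non-zero when the
 * faulting host address could be translated to a guest physical address.
 */
static enum sigbus_action classify_sigbus(int code, int addr_is_guest_ram)
{
    assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);

    if (addr_is_guest_ram) {
        return code == BUS_MCEERR_AR ? SIGBUS_INJECT_SEA
                                     : SIGBUS_RECORD_ONLY;
    }
    /* Assumed policy: an untranslatable AO error is tolerated, AR is not. */
    return code == BUS_MCEERR_AO ? SIGBUS_RECORD_ONLY : SIGBUS_FATAL;
}
```

Written this way, the handler does not depend on which thread the kernel or
KVM happens to deliver the signal to; the AO branch is simply never taken
while the main thread keeps SIGBUS masked.
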


[1] https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg03575.html

-- 

Thanks,
Xiang

