linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 16:08 [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg Dongjiu Geng
@ 2017-08-07 15:57 ` Marc Zyngier
  2017-08-07 16:23   ` gengdongjiu
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2017-08-07 15:57 UTC (permalink / raw)
  To: Dongjiu Geng, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel, James Morse

+James, since he deals with all things RAS. Please keep him on CC at all
times.

On 07/08/17 17:08, Dongjiu Geng wrote:
> For the firmware-first RAS solution, SEA and SEI is injected
> by the user space, user space needs to know the esr_el2 and
> far_el2's value, so add them to sysreg. user space uses
> the IOCTL KVM_GET_ONE_REG can get their value.

No.

This has zero purpose being exposed to userspace. Userspace sees a VM
that runs at EL1, and nothing else, so exposing EL2 registers doesn't
make *any* sense.

If you want something to be exposed to userspace, it has to be properly
abstracted and describe something that is relevant to the VM. An EL2
register satisfies none of these conditions.

> 
> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> Signed-off-by: Quanming Wu <wuquanming@huawei.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 6 ++++--
>  arch/arm64/kvm/sys_regs.c         | 6 ++++++
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index b6242fb..6063eec 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -103,10 +103,12 @@ enum vcpu_sysreg {
>  	TTBR0_EL1,	/* Translation Table Base Register 0 */
>  	TTBR1_EL1,	/* Translation Table Base Register 1 */
>  	TCR_EL1,	/* Translation Control Register */
> -	ESR_EL1,	/* Exception Syndrome Register */
> +	ESR_EL1,	/* Exception Syndrome Register for EL1 */
> +	ESR_EL2,	/* Exception Syndrome Register for EL2 */
>  	AFSR0_EL1,	/* Auxiliary Fault Status Register 0 */
>  	AFSR1_EL1,	/* Auxiliary Fault Status Register 1 */
> -	FAR_EL1,	/* Fault Address Register */
> +	FAR_EL1,	/* Fault Address Register for EL1 */
> +	FAR_EL2,	/* Fault Address Register for EL2 */
>  	MAIR_EL1,	/* Memory Attribute Indirection Register */
>  	VBAR_EL1,	/* Vector Base Address Register */
>  	CONTEXTIDR_EL1,	/* Context ID Register */
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 0e26f8c..0c286bf 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -987,9 +987,15 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  	/* ESR_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000),
>  	  access_vm_reg, reset_unknown, ESR_EL1 },
> +	/* ESR_EL2 */
> +	{ Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b0010), Op2(0b000),
> +	  access_vm_reg, reset_unknown, ESR_EL2 },
>  	/* FAR_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
>  	  access_vm_reg, reset_unknown, FAR_EL1 },
> +	/* FAR_EL2 */
> +	{ Op0(0b11), Op1(0b100), CRn(0b0110), CRm(0b0000), Op2(0b000),
> +	  access_vm_reg, reset_unknown, FAR_EL2 },
>  	/* PAR_EL1 */
>  	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
>  	  NULL, reset_unknown, PAR_EL1 },
> 

Also, what do you return here? All you're doing to return to userspace
is 0x1de7ec7edbadc0deULL (which perfectly matches this patch).

So for all intents and purposes, this patch is pretty useless.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
@ 2017-08-07 16:08 Dongjiu Geng
  2017-08-07 15:57 ` Marc Zyngier
  0 siblings, 1 reply; 9+ messages in thread
From: Dongjiu Geng @ 2017-08-07 16:08 UTC (permalink / raw)
  To: christoffer.dall, marc.zyngier, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel, gengdongjiu

For the firmware-first RAS solution, SEA and SEI is injected
by the user space, user space needs to know the esr_el2 and
far_el2's value, so add them to sysreg. user space uses
the IOCTL KVM_GET_ONE_REG can get their value.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Quanming Wu <wuquanming@huawei.com>
---
 arch/arm64/include/asm/kvm_host.h | 6 ++++--
 arch/arm64/kvm/sys_regs.c         | 6 ++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index b6242fb..6063eec 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -103,10 +103,12 @@ enum vcpu_sysreg {
 	TTBR0_EL1,	/* Translation Table Base Register 0 */
 	TTBR1_EL1,	/* Translation Table Base Register 1 */
 	TCR_EL1,	/* Translation Control Register */
-	ESR_EL1,	/* Exception Syndrome Register */
+	ESR_EL1,	/* Exception Syndrome Register for EL1 */
+	ESR_EL2,	/* Exception Syndrome Register for EL2 */
 	AFSR0_EL1,	/* Auxiliary Fault Status Register 0 */
 	AFSR1_EL1,	/* Auxiliary Fault Status Register 1 */
-	FAR_EL1,	/* Fault Address Register */
+	FAR_EL1,	/* Fault Address Register for EL1 */
+	FAR_EL2,	/* Fault Address Register for EL2 */
 	MAIR_EL1,	/* Memory Attribute Indirection Register */
 	VBAR_EL1,	/* Vector Base Address Register */
 	CONTEXTIDR_EL1,	/* Context ID Register */
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0e26f8c..0c286bf 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -987,9 +987,15 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	/* ESR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000),
 	  access_vm_reg, reset_unknown, ESR_EL1 },
+	/* ESR_EL2 */
+	{ Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b0010), Op2(0b000),
+	  access_vm_reg, reset_unknown, ESR_EL2 },
 	/* FAR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
 	  access_vm_reg, reset_unknown, FAR_EL1 },
+	/* FAR_EL2 */
+	{ Op0(0b11), Op1(0b100), CRn(0b0110), CRm(0b0000), Op2(0b000),
+	  access_vm_reg, reset_unknown, FAR_EL2 },
 	/* PAR_EL1 */
 	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
 	  NULL, reset_unknown, PAR_EL1 },
-- 
2.10.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 15:57 ` Marc Zyngier
@ 2017-08-07 16:23   ` gengdongjiu
  2017-08-07 16:56     ` Marc Zyngier
  2017-08-07 16:59     ` James Morse
  0 siblings, 2 replies; 9+ messages in thread
From: gengdongjiu @ 2017-08-07 16:23 UTC (permalink / raw)
  To: Marc Zyngier, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel, James Morse

Hi Marc,
  As James's suggestion, I move injection SEA Error logic to the user space(Qemu), Qemu sets the related guest OS esr/elr/pstate/spsr
through IOCTL KVM_SET_ONE_REG. For the SEA, when Qemu sets the esr_el1.IL bit, it needs to refer to esr_el2.IL, else Qemu does not know the trapped
instruction was a 16-bit or a 32-bit instruction, also it needs to set far_el1 using far_el2, because this is synchronization abort.




On 2017/8/7 23:57, Marc Zyngier wrote:
> +James, since he deals with all things RAS. Please keep him on CC at all
> times.
> 
> On 07/08/17 17:08, Dongjiu Geng wrote:
>> For the firmware-first RAS solution, SEA and SEI is injected
>> by the user space, user space needs to know the esr_el2 and
>> far_el2's value, so add them to sysreg. user space uses
>> the IOCTL KVM_GET_ONE_REG can get their value.
> 
> No.
> 
> This has zero purpose being exposed to userspace. Userspace sees a VM
> that runs at EL1, and nothing else, so exposing EL2 registers doesn't
> make *any* sense.
> 
> If you want something to be exposed to userspace, it has to be properly
> abstracted and describe something that is relevant to the VM. An EL2
> register satisfies none of these conditions.
> 
>>
>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>> Signed-off-by: Quanming Wu <wuquanming@huawei.com>
>> ---
>>  arch/arm64/include/asm/kvm_host.h | 6 ++++--
>>  arch/arm64/kvm/sys_regs.c         | 6 ++++++
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index b6242fb..6063eec 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -103,10 +103,12 @@ enum vcpu_sysreg {
>>  	TTBR0_EL1,	/* Translation Table Base Register 0 */
>>  	TTBR1_EL1,	/* Translation Table Base Register 1 */
>>  	TCR_EL1,	/* Translation Control Register */
>> -	ESR_EL1,	/* Exception Syndrome Register */
>> +	ESR_EL1,	/* Exception Syndrome Register for EL1 */
>> +	ESR_EL2,	/* Exception Syndrome Register for EL2 */
>>  	AFSR0_EL1,	/* Auxiliary Fault Status Register 0 */
>>  	AFSR1_EL1,	/* Auxiliary Fault Status Register 1 */
>> -	FAR_EL1,	/* Fault Address Register */
>> +	FAR_EL1,	/* Fault Address Register for EL1 */
>> +	FAR_EL2,	/* Fault Address Register for EL2 */
>>  	MAIR_EL1,	/* Memory Attribute Indirection Register */
>>  	VBAR_EL1,	/* Vector Base Address Register */
>>  	CONTEXTIDR_EL1,	/* Context ID Register */
>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>> index 0e26f8c..0c286bf 100644
>> --- a/arch/arm64/kvm/sys_regs.c
>> +++ b/arch/arm64/kvm/sys_regs.c
>> @@ -987,9 +987,15 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>>  	/* ESR_EL1 */
>>  	{ Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000),
>>  	  access_vm_reg, reset_unknown, ESR_EL1 },
>> +	/* ESR_EL2 */
>> +	{ Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b0010), Op2(0b000),
>> +	  access_vm_reg, reset_unknown, ESR_EL2 },
>>  	/* FAR_EL1 */
>>  	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
>>  	  access_vm_reg, reset_unknown, FAR_EL1 },
>> +	/* FAR_EL2 */
>> +	{ Op0(0b11), Op1(0b100), CRn(0b0110), CRm(0b0000), Op2(0b000),
>> +	  access_vm_reg, reset_unknown, FAR_EL2 },
>>  	/* PAR_EL1 */
>>  	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
>>  	  NULL, reset_unknown, PAR_EL1 },
>>
> 
> Also, what do you return here? All you're doing to return to userspace
> is 0x1de7ec7edbadc0deULL (which perfectly matches this patch).
> 
> So for all intents and purposes, this patch is pretty useless.
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 16:23   ` gengdongjiu
@ 2017-08-07 16:56     ` Marc Zyngier
  2017-08-07 18:59       ` gengdongjiu
  2017-08-07 16:59     ` James Morse
  1 sibling, 1 reply; 9+ messages in thread
From: Marc Zyngier @ 2017-08-07 16:56 UTC (permalink / raw)
  To: gengdongjiu, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel, James Morse

On 07/08/17 17:23, gengdongjiu wrote:
> Hi Marc,
>   As James's suggestion, I move injection SEA Error logic to the user space(Qemu), Qemu sets the related guest OS esr/elr/pstate/spsr
> through IOCTL KVM_SET_ONE_REG. For the SEA, when Qemu sets the esr_el1.IL bit, it needs to refer to esr_el2.IL, else Qemu does not know the trapped
> instruction was a 16-bit or a 32-bit instruction, also it needs to set far_el1 using far_el2, because this is synchronization abort.

Usespace may need some fault information, but certainly not the full set
of FAR_EL2/ESR_EL2. What it needs is a very small set of well defined
information, properly abstracted, and not data that is completely
private to the hypervisor.

Thanks,

	M.
> 
> 
> 
> 
> On 2017/8/7 23:57, Marc Zyngier wrote:
>> +James, since he deals with all things RAS. Please keep him on CC at all
>> times.
>>
>> On 07/08/17 17:08, Dongjiu Geng wrote:
>>> For the firmware-first RAS solution, SEA and SEI is injected
>>> by the user space, user space needs to know the esr_el2 and
>>> far_el2's value, so add them to sysreg. user space uses
>>> the IOCTL KVM_GET_ONE_REG can get their value.
>>
>> No.
>>
>> This has zero purpose being exposed to userspace. Userspace sees a VM
>> that runs at EL1, and nothing else, so exposing EL2 registers doesn't
>> make *any* sense.
>>
>> If you want something to be exposed to userspace, it has to be properly
>> abstracted and describe something that is relevant to the VM. An EL2
>> register satisfies none of these conditions.
>>
>>>
>>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>>> Signed-off-by: Quanming Wu <wuquanming@huawei.com>
>>> ---
>>>  arch/arm64/include/asm/kvm_host.h | 6 ++++--
>>>  arch/arm64/kvm/sys_regs.c         | 6 ++++++
>>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>> index b6242fb..6063eec 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -103,10 +103,12 @@ enum vcpu_sysreg {
>>>  	TTBR0_EL1,	/* Translation Table Base Register 0 */
>>>  	TTBR1_EL1,	/* Translation Table Base Register 1 */
>>>  	TCR_EL1,	/* Translation Control Register */
>>> -	ESR_EL1,	/* Exception Syndrome Register */
>>> +	ESR_EL1,	/* Exception Syndrome Register for EL1 */
>>> +	ESR_EL2,	/* Exception Syndrome Register for EL2 */
>>>  	AFSR0_EL1,	/* Auxiliary Fault Status Register 0 */
>>>  	AFSR1_EL1,	/* Auxiliary Fault Status Register 1 */
>>> -	FAR_EL1,	/* Fault Address Register */
>>> +	FAR_EL1,	/* Fault Address Register for EL1 */
>>> +	FAR_EL2,	/* Fault Address Register for EL2 */
>>>  	MAIR_EL1,	/* Memory Attribute Indirection Register */
>>>  	VBAR_EL1,	/* Vector Base Address Register */
>>>  	CONTEXTIDR_EL1,	/* Context ID Register */
>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>> index 0e26f8c..0c286bf 100644
>>> --- a/arch/arm64/kvm/sys_regs.c
>>> +++ b/arch/arm64/kvm/sys_regs.c
>>> @@ -987,9 +987,15 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>>>  	/* ESR_EL1 */
>>>  	{ Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000),
>>>  	  access_vm_reg, reset_unknown, ESR_EL1 },
>>> +	/* ESR_EL2 */
>>> +	{ Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b0010), Op2(0b000),
>>> +	  access_vm_reg, reset_unknown, ESR_EL2 },
>>>  	/* FAR_EL1 */
>>>  	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
>>>  	  access_vm_reg, reset_unknown, FAR_EL1 },
>>> +	/* FAR_EL2 */
>>> +	{ Op0(0b11), Op1(0b100), CRn(0b0110), CRm(0b0000), Op2(0b000),
>>> +	  access_vm_reg, reset_unknown, FAR_EL2 },
>>>  	/* PAR_EL1 */
>>>  	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
>>>  	  NULL, reset_unknown, PAR_EL1 },
>>>
>>
>> Also, what do you return here? All you're doing to return to userspace
>> is 0x1de7ec7edbadc0deULL (which perfectly matches this patch).
>>
>> So for all intents and purposes, this patch is pretty useless.
>>
>> Thanks,
>>
>> 	M.
>>
> 


-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 16:23   ` gengdongjiu
  2017-08-07 16:56     ` Marc Zyngier
@ 2017-08-07 16:59     ` James Morse
  2017-08-07 17:43       ` gengdongjiu
  1 sibling, 1 reply; 9+ messages in thread
From: James Morse @ 2017-08-07 16:59 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Marc Zyngier, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel

Hi gengdongjiu,

On 07/08/17 17:23, gengdongjiu wrote:
>   As James's suggestion, I move injection SEA Error logic to the user space(Qemu), Qemu sets the related guest OS esr/elr/pstate/spsr

(because for firmware-first its the CPER records that matter, and only QEMU
knows where it reserved the memory for these, and what it told the guest it
would use as the notification method).

> through IOCTL KVM_SET_ONE_REG. For the SEA, when Qemu sets the esr_el1.IL bit, it needs to refer to esr_el2.IL, else Qemu does not know the trapped
> instruction was a 16-bit or a 32-bit instruction, also it needs to set far_el1 using far_el2, because this is synchronization abort.

The 32bit kernel doesn't support ACPI firmware first, and aarch64 doesn't
support 16-bit instructions.


James

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 16:59     ` James Morse
@ 2017-08-07 17:43       ` gengdongjiu
  2017-08-08 16:27         ` James Morse
  0 siblings, 1 reply; 9+ messages in thread
From: gengdongjiu @ 2017-08-07 17:43 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel

Ok, thanks for James's confirmation.

Another question, For the SEI, I want to also use SIGBUS both for the KVM user and non-kvm user, if SEA and SEI Error all use the SIGBUS to notify user space(Qemu),
the user space(Qemu) will be confused, and do not know whether this is SEA or SEI error. so here I pass the sysreg ESR_EL2(vcpu->arch.fault.esr_el2) to the user space, let user space judges the (vcpu->arch.fault.esr_el2)'s value
to know this is a SEA or SEI Error. do you agree my solution for the SEI? thanks.

because the vcpu->arch.fault.esr_el2 can not directly passed to userspace, so I defined the vcpu->arch.fault.esr_el2 to sysreg ESR_EL2/FAR_EL2, sysreg register can pass to user space.


+		vcpu_sys_reg(vcpu,ESR_EL2) = kvm_vcpu_get_hsr(vcpu);
+		vcpu_sys_reg(vcpu,FAR_EL2) = kvm_vcpu_get_hfar(vcpu);



On 2017/8/8 0:59, James Morse wrote:
> Hi gengdongjiu,
> 
> On 07/08/17 17:23, gengdongjiu wrote:
>>   As James's suggestion, I move injection SEA Error logic to the user space(Qemu), Qemu sets the related guest OS esr/elr/pstate/spsr
> 
> (because for firmware-first its the CPER records that matter, and only QEMU
> knows where it reserved the memory for these, and what it told the guest it
> would use as the notification method).
> 
>> through IOCTL KVM_SET_ONE_REG. For the SEA, when Qemu sets the esr_el1.IL bit, it needs to refer to esr_el2.IL, else Qemu does not know the trapped
>> instruction was a 16-bit or a 32-bit instruction, also it needs to set far_el1 using far_el2, because this is synchronization abort.
> 
> The 32bit kernel doesn't support ACPI firmware first, and aarch64 doesn't
> support 16-bit instructions.
 thanks, so how about the SEA's error FAR_EL1's value? may be FAR_EL1's value get from FAR_EL2's value.

> 
> 
> James
> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 16:56     ` Marc Zyngier
@ 2017-08-07 18:59       ` gengdongjiu
  0 siblings, 0 replies; 9+ messages in thread
From: gengdongjiu @ 2017-08-07 18:59 UTC (permalink / raw)
  To: Marc Zyngier, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel, James Morse

Marc,

On 2017/8/8 0:56, Marc Zyngier wrote:
> On 07/08/17 17:23, gengdongjiu wrote:
>> Hi Marc,
>>   As James's suggestion, I move injection SEA Error logic to the user space(Qemu), Qemu sets the related guest OS esr/elr/pstate/spsr
>> through IOCTL KVM_SET_ONE_REG. For the SEA, when Qemu sets the esr_el1.IL bit, it needs to refer to esr_el2.IL, else Qemu does not know the trapped
>> instruction was a 16-bit or a 32-bit instruction, also it needs to set far_el1 using far_el2, because this is synchronization abort.
> 
> Usespace may need some fault information, but certainly not the full set
> of FAR_EL2/ESR_EL2. What it needs is a very small set of well defined
> information, properly abstracted, and not data that is completely
> private to the hypervisor.

Marc, just now I update the patch, may be use the vcpu->arch.fault.esr_el2 and vcpu->arch.fault.far_el2 to set the vcpu_sys_reg(vcpu,FAR_EL2)/vcpu_sys_reg(vcpu,ESR_EL2) can be better.
Now the user space can not directly get the vcpu->arch.faul.xxxx value, so need use vcpu_sys_reg to pass.



> 
> Thanks,
> 
> 	M.
>>
>>
>>
>>
>> On 2017/8/7 23:57, Marc Zyngier wrote:
>>> +James, since he deals with all things RAS. Please keep him on CC at all
>>> times.
>>>
>>> On 07/08/17 17:08, Dongjiu Geng wrote:
>>>> For the firmware-first RAS solution, SEA and SEI is injected
>>>> by the user space, user space needs to know the esr_el2 and
>>>> far_el2's value, so add them to sysreg. user space uses
>>>> the IOCTL KVM_GET_ONE_REG can get their value.
>>>
>>> No.
>>>
>>> This has zero purpose being exposed to userspace. Userspace sees a VM
>>> that runs at EL1, and nothing else, so exposing EL2 registers doesn't
>>> make *any* sense.
>>>
>>> If you want something to be exposed to userspace, it has to be properly
>>> abstracted and describe something that is relevant to the VM. An EL2
>>> register satisfies none of these conditions.
>>>
>>>>
>>>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>>>> Signed-off-by: Quanming Wu <wuquanming@huawei.com>
>>>> ---
>>>>  arch/arm64/include/asm/kvm_host.h | 6 ++++--
>>>>  arch/arm64/kvm/sys_regs.c         | 6 ++++++
>>>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>>> index b6242fb..6063eec 100644
>>>> --- a/arch/arm64/include/asm/kvm_host.h
>>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>>> @@ -103,10 +103,12 @@ enum vcpu_sysreg {
>>>>  	TTBR0_EL1,	/* Translation Table Base Register 0 */
>>>>  	TTBR1_EL1,	/* Translation Table Base Register 1 */
>>>>  	TCR_EL1,	/* Translation Control Register */
>>>> -	ESR_EL1,	/* Exception Syndrome Register */
>>>> +	ESR_EL1,	/* Exception Syndrome Register for EL1 */
>>>> +	ESR_EL2,	/* Exception Syndrome Register for EL2 */
>>>>  	AFSR0_EL1,	/* Auxiliary Fault Status Register 0 */
>>>>  	AFSR1_EL1,	/* Auxiliary Fault Status Register 1 */
>>>> -	FAR_EL1,	/* Fault Address Register */
>>>> +	FAR_EL1,	/* Fault Address Register for EL1 */
>>>> +	FAR_EL2,	/* Fault Address Register for EL2 */
>>>>  	MAIR_EL1,	/* Memory Attribute Indirection Register */
>>>>  	VBAR_EL1,	/* Vector Base Address Register */
>>>>  	CONTEXTIDR_EL1,	/* Context ID Register */
>>>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>>>> index 0e26f8c..0c286bf 100644
>>>> --- a/arch/arm64/kvm/sys_regs.c
>>>> +++ b/arch/arm64/kvm/sys_regs.c
>>>> @@ -987,9 +987,15 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>>>>  	/* ESR_EL1 */
>>>>  	{ Op0(0b11), Op1(0b000), CRn(0b0101), CRm(0b0010), Op2(0b000),
>>>>  	  access_vm_reg, reset_unknown, ESR_EL1 },
>>>> +	/* ESR_EL2 */
>>>> +	{ Op0(0b11), Op1(0b100), CRn(0b0101), CRm(0b0010), Op2(0b000),
>>>> +	  access_vm_reg, reset_unknown, ESR_EL2 },
>>>>  	/* FAR_EL1 */
>>>>  	{ Op0(0b11), Op1(0b000), CRn(0b0110), CRm(0b0000), Op2(0b000),
>>>>  	  access_vm_reg, reset_unknown, FAR_EL1 },
>>>> +	/* FAR_EL2 */
>>>> +	{ Op0(0b11), Op1(0b100), CRn(0b0110), CRm(0b0000), Op2(0b000),
>>>> +	  access_vm_reg, reset_unknown, FAR_EL2 },
>>>>  	/* PAR_EL1 */
>>>>  	{ Op0(0b11), Op1(0b000), CRn(0b0111), CRm(0b0100), Op2(0b000),
>>>>  	  NULL, reset_unknown, PAR_EL1 },
>>>>
>>>
>>> Also, what do you return here? All you're doing to return to userspace
>>> is 0x1de7ec7edbadc0deULL (which perfectly matches this patch).
>>>
>>> So for all intents and purposes, this patch is pretty useless.
>>>
>>> Thanks,
>>>
>>> 	M.
>>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-07 17:43       ` gengdongjiu
@ 2017-08-08 16:27         ` James Morse
  2017-08-08 18:27           ` gengdongjiu
  0 siblings, 1 reply; 9+ messages in thread
From: James Morse @ 2017-08-08 16:27 UTC (permalink / raw)
  To: gengdongjiu
  Cc: Marc Zyngier, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel

Hi gengdongjiu,

On 07/08/17 18:43, gengdongjiu wrote:
> Another question, For the SEI, I want to also use SIGBUS both for the KVM user and non-kvm user,
> if SEA and SEI Error all use the SIGBUS to notify user space(Qemu),

User-space shouldn't necessarily be notified about Synchronous External Aborts
or SError Interrupts. You're really asking about RAS firmware-first
notifications that use these as the notification mechanism.

We should not notify user-space that the guest happened to be interrupted by a
RAS firmware-first notification. It may not be relevant, and we can't know until
we parse the CPER records. The notification mechanism is between firmware and
the host kernel, we should never expose anything about it to user space or a guest.

Linux should act on the CPER records first to determine if the host kernel can
keep running. Once it has done this it can deliver signals to affected
processes, but which signal and its properties depends on the CPER records.

The example here is BUS_MCEERR_AO and BUS_MCEERR_AR. These notify userspace that
si_addr_lsb bits of memory are corrupt at si_addr, this is either
Action-Optional or Action-Required.

For arm64 we just needed to turn this code on, it already presents the minimum
necessary information to user-space in an architecture-agnostic way. We didn't
need to do anything to this code to support NOTIFY_SEA, the notification
mechanism is irrelevant, this is all driven by the CPER records.

If you have a class of error that isn't covered by the memory-failure code, then
we need to add something similar. This should be based on the CPER records, and
should work in exactly the same way for all processes on all ACPI platforms.


> do you agree my solution for the SEI? thanks.

No, you are trying to notify userspace that firmware notified the host. This
creates an ABI between EL3 firmware and EL0 user space that we can't possibly
support.

I think you've come to this because you are merging two steps together:
1. The OS uses the v8.2 RAS extensions to isolate errors and notify firmware.
2. If firmware has to tell the OS about the error, firmware generates CPER records.
3. Firmware triggers the GHES notification mechanism for this error source.
4. Linux receives the notification and calls ghes_proc(), (if KVM gets the
notification because a guest happened to be running, it should switch back to
the host and arrange for ghes_proc() to be called).
5. ghes_proc() parses the CPER records and calls other kernel helpers to handle
the specific type of error, e.g. memory_failure().
6. If the helper knows the kernel can keep running, the error is visible to
user-space and user space could do further processing to correct the error, an
error-specific signal is sent.
7. User-space reloads the webpage, notifies the guest or whatever is appropriate.

You are merging steps 3 and 7.

The notification method is an abstraction that only matters to steps 3&4, this
lets us add more without rewriting the world.
User-space signals are an abstraction between steps 6&7, this works across
architectures, even those not using APEI firmware first.
There are no shortcuts here. Doing anything else creates more work for user
space, another platform or another architecture.


Thanks,

James

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg
  2017-08-08 16:27         ` James Morse
@ 2017-08-08 18:27           ` gengdongjiu
  0 siblings, 0 replies; 9+ messages in thread
From: gengdongjiu @ 2017-08-08 18:27 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, christoffer.dall, pbonzini, rkrcmar,
	catalin.marinas, will.deacon, linux-arm-kernel, kvmarm, kvm,
	linux-kernel

Hi James,

On 2017/8/9 0:27, James Morse wrote:
> Hi gengdongjiu,
> 
> On 07/08/17 18:43, gengdongjiu wrote:
>> Another question, For the SEI, I want to also use SIGBUS both for the KVM user and non-kvm user,
>> if SEA and SEI Error all use the SIGBUS to notify user space(Qemu),
> 
> User-space shouldn't necessarily be notified about Synchronous External Aborts
> or SError Interrupts. You're really asking about RAS firmware-first
> notifications that use these as the notification mechanism.
Firstly, we am talking the RAS firmware-first solution. I mainly
want to let user space to know what is the Error type for this hardware error(Synchronous or asynchronous).
we do not care the notification mechanism. As our agreement before , Qemu will record the CPER for the
guest OS. if Qemu does not know the Error type, it can not record the CPER. because in the ghes there is a field to
fill the error type.

I paste the APEI table layout:
https://wiki.linaro.org/LEG/Engineering/Kernel/RAS/APEITables

usual the notification type is classified by the hardware type




> 
> We should not notify user-space that the guest happened to be interrupted by a
> RAS firmware-first notification. It may not be relevant, and we can't know until
> we parse the CPER records. The notification mechanism is between firmware and
> the host kernel, we should never expose anything about it to user space or a guest.
I agree with you this sentence, for the hardware error, host kernel will firstly deal with, and then decided whether to
notified Qemu/KVM tools. In this process, we do not care what is the notification mechanism between
firmware and host kernel. we only concern the hardware error type. different type, Qemu/KVM tools will have different behavior.

> 
> Linux should act on the CPER records first to determine if the host kernel can
> keep running. Once it has done this it can deliver signals to affected
> processes, but which signal and its properties depends on the CPER records.

 if want Qemu to handle this Error, I think qemu/kvmtools should know hardware error type, else it will be confused and do not know how to deal with.

> 
> The example here is BUS_MCEERR_AO and BUS_MCEERR_AR. These notify userspace that
> si_addr_lsb bits of memory are corrupt at si_addr, this is either
> Action-Optional or Action-Required.
> 
> For arm64 we just needed to turn this code on, it already presents the minimum
> necessary information to user-space in an architecture-agnostic way. We didn't
> need to do anything to this code to support NOTIFY_SEA, the notification
> mechanism is irrelevant, this is all driven by the CPER records.
 we do not care the notification type, we are only care the hardware error type.
 if user-space do not know the error type, it can not record the CPER and can not inject the proper Error to guest OS.
 because record CPER and inject the Error to guest OS need this hardware error type. different Error type, there is
 different behavior.

 For different hardware error type, X86 Qemu/kvm tools code also have different behaviour in

> 
> If you have a class of error that isn't covered by the memory-failure code, then
> we need to add something similar. This should be based on the CPER records, and
> should work in exactly the same way for all processes on all ACPI platforms.
> 
> 
>> do you agree my solution for the SEI? thanks.
> 
> No, you are trying to notify userspace that firmware notified the host. This
> creates an ABI between EL3 firmware and EL0 user space that we can't possibly
> support.

you may misunderstand I mentioned solution here. I mean using memory-failure code to signal user space
for the SError(SEI), this way does not creates any ABI. SEA/SEI all use same method.
Qemu can judge the ESR to know the hardware type.


> 
> I think you've come to this because you are merging two steps together:
> 1. The OS uses the v8.2 RAS extensions to isolate errors and notify firmware.
> 2. If firmware has to tell the OS about the error, firmware generates CPER records.
> 3. Firmware triggers the GHES notification mechanism for this error source.
> 4. Linux receives the notification and calls ghes_proc(), (if KVM gets the
> notification because a guest happened to be running, it should switch back to
> the host and arrange for ghes_proc() to be called).
> 5. ghes_proc() parses the CPER records and calls other kernel helpers to handle
> the specific type of error, e.g. memory_failure().
> 6. If the helper knows the kernel can keep running, the error is visible to
> user-space and user space could do further processing to correct the error, an
> error-specific signal is sent.
> 7. User-space reloads the webpage, notifies the guest or whatever is appropriate.
> 
> You are merging steps 3 and 7.
  No, not merge them. in above steps, the steps 7 needs to know the hardware type. if it does not know the
  hardware error type, user space can not correctly do further processing. we do not care the notification type.
  we only care what is the hardware error type.

> 
> The notification method is an abstraction that only matters to steps 3&4, this
> lets us add more without rewriting the world.
> User-space signals are an abstraction between steps 6&7, this works across
> architectures, even those not using APEI firmware first.
  so here can I understand that you also agree that SError type can also use signals to be delivered to user space?


> There are no shortcuts here. Doing anything else creates more work for user
> space, another platform or another architecture.
> 
> 
> Thanks,
> 
> James
> 
> .
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-08-08 18:36 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-07 16:08 [PATCH] KVM: arm64: add esr_el2 and far_el2 to sysreg Dongjiu Geng
2017-08-07 15:57 ` Marc Zyngier
2017-08-07 16:23   ` gengdongjiu
2017-08-07 16:56     ` Marc Zyngier
2017-08-07 18:59       ` gengdongjiu
2017-08-07 16:59     ` James Morse
2017-08-07 17:43       ` gengdongjiu
2017-08-08 16:27         ` James Morse
2017-08-08 18:27           ` gengdongjiu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).