linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries
@ 2019-06-20  1:46 Suraj Jitindar Singh
  2019-06-20  1:46 ` [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr Suraj Jitindar Singh
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Suraj Jitindar Singh @ 2019-06-20  1:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: clg, kvm-ppc, sjitindarsingh

When a guest vcpu moves from one physical thread to another it is
necessary for the host to perform a tlb flush on the previous core if
another vcpu from the same guest is going to run there. This is because the
guest may use the local form of the tlb invalidation instruction meaning
stale tlb entries would persist where it previously ran. This is handled
on guest entry in kvmppc_check_need_tlb_flush() which calls
flush_guest_tlb() to perform the tlb flush.

Previously the generic radix__local_flush_tlb_lpid_guest() function was
used, however the functionality was reimplemented in flush_guest_tlb()
to avoid the trace_tlbie() call as the flushing may be done in real
mode. The reimplementation in flush_guest_tlb() was missing an erat
invalidation after flushing the tlb.

This led to observable memory corruption in the guest due to the
caching of stale translations. Fix this by adding the erat invalidation.

Fixes: 70ea13f6e609 "KVM: PPC: Book3S HV: Flush TLB on secondary radix threads"

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/kvm/book3s_hv_builtin.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 6035d24f1d1d..a46286f73eec 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -833,6 +833,7 @@ static void flush_guest_tlb(struct kvm *kvm)
 		}
 	}
 	asm volatile("ptesync": : :"memory");
+	asm volatile(PPC_INVALIDATE_ERAT : : :"memory");
 }
 
 void kvmppc_check_need_tlb_flush(struct kvm *kvm, int pcpu,
-- 
2.13.6
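
Illustration only, not part of the patch: the failure mode described in
the commit message above can be sketched with a toy model in C. The ERAT
is modelled as a small first-level cache sitting in front of the TLB, so a
flush that clears only the TLB leaves a stale translation reachable. Every
name here (toy_thread, toy_flush_guest, and so on) is hypothetical, and
the model ignores almost everything real hardware does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SLOTS 4

/* One cached translation: effective page -> real page, tagged by guest LPID. */
struct toy_entry {
	bool valid;
	uint32_t lpid;
	uint64_t epn;
	uint64_t rpn;
};

/* Toy hardware thread: a first-level cache (ERAT) in front of the TLB. */
struct toy_thread {
	struct toy_entry erat[TOY_SLOTS];
	struct toy_entry tlb[TOY_SLOTS];
};

/* Install a translation in both levels, as a first access would. */
void toy_install(struct toy_thread *t, unsigned int slot, uint32_t lpid,
		 uint64_t epn, uint64_t rpn)
{
	struct toy_entry e = { true, lpid, epn, rpn };

	assert(slot < TOY_SLOTS);
	t->tlb[slot] = e;
	t->erat[slot] = e;
}

/* Look up a translation: the ERAT is consulted first, then the TLB. */
bool toy_lookup(const struct toy_thread *t, uint32_t lpid, uint64_t epn,
		uint64_t *rpn)
{
	const struct toy_entry *levels[2] = { t->erat, t->tlb };

	for (int l = 0; l < 2; l++) {
		for (int i = 0; i < TOY_SLOTS; i++) {
			const struct toy_entry *e = &levels[l][i];

			if (e->valid && e->lpid == lpid && e->epn == epn) {
				*rpn = e->rpn;
				return true;
			}
		}
	}
	return false;
}

/*
 * Flush all TLB entries of one guest. With invalidate_erat == false this
 * mimics the bug: the ERAT still answers lookups with stale entries.
 */
void toy_flush_guest(struct toy_thread *t, uint32_t lpid, bool invalidate_erat)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		if (t->tlb[i].lpid == lpid)
			t->tlb[i].valid = false;
	}
	if (invalidate_erat) {
		/* The toy invalidates the whole ERAT, not just one guest. */
		for (int i = 0; i < TOY_SLOTS; i++)
			t->erat[i].valid = false;
	}
}
```

In this model, a lookup after toy_flush_guest(..., false) still hits in
the ERAT, while the same lookup after toy_flush_guest(..., true) misses.
That is the behaviour the one-line fix is meant to restore.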



* [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr
  2019-06-20  1:46 [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries Suraj Jitindar Singh
@ 2019-06-20  1:46 ` Suraj Jitindar Singh
  2019-06-20  7:56   ` Laurent Vivier
  2019-06-30  8:37   ` Michael Ellerman
  2019-06-20  1:46 ` [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry Suraj Jitindar Singh
  2019-06-23 10:34 ` [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries Michael Ellerman
  2 siblings, 2 replies; 9+ messages in thread
From: Suraj Jitindar Singh @ 2019-06-20  1:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: clg, kvm-ppc, sjitindarsingh

On POWER9 the decrementer can operate in large decrementer mode where
the decrementer is 56 bits and sign extended to 64 bits. When not
operating in this mode the decrementer behaves as a 32 bit decrementer
which is NOT sign extended (as on POWER8).

Currently when reading a guest decrementer value we don't take into
account whether the large decrementer is enabled or not, and this means
the value will be incorrect when the guest is not using the large
decrementer. Fix this by sign extending the value read when the guest
isn't using the large decrementer.

Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d3684509da35..719fd2529eec 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3607,6 +3607,8 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	vcpu->arch.slb_max = 0;
 	dec = mfspr(SPRN_DEC);
+	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
+		dec = (s32) dec;
 	tb = mftb();
 	vcpu->arch.dec_expires = dec + tb;
 	vcpu->cpu = -1;
-- 
2.13.6
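
Illustration only, not part of the patch: a minimal standalone sketch of
why the cast matters. The helpers dec_value() and dec_expiry() are
hypothetical, not kernel functions. They assume the raw register read
arrives in a 64-bit variable, and that the compiler converts an
out-of-range value to int32_t by wrapping, as GCC and Clang do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Interpret a raw DEC reading. In large decrementer mode the value is
 * taken as already sign extended to 64 bits. Otherwise only the low
 * 32 bits are meaningful and bit 31 is the sign bit.
 */
int64_t dec_value(uint64_t raw, bool large_decr)
{
	if (!large_decr)
		return (int32_t)(uint32_t)raw;
	return (int64_t)raw;
}

/* Expiry in timebase ticks, mirroring "dec_expires = dec + tb". */
int64_t dec_expiry(uint64_t raw, bool large_decr, int64_t tb)
{
	return dec_value(raw, large_decr) + tb;
}

/* Small self-check showing the two interpretations side by side. */
void dec_demo(void)
{
	/* A 32-bit decrementer that has just gone negative reads 0xffffffff. */
	assert(dec_value(0xffffffffu, false) == -1);
	/* Read as a 64-bit value, the same bits are a large positive number. */
	assert(dec_value(0xffffffffu, true) == 4294967295LL);
	/* So the expiry is either just past, or far in the future. */
	assert(dec_expiry(0xffffffffu, false, 1000) == 999);
}
```

Without the sign extension, an expired 32-bit decrementer would be
recorded as expiring roughly 2^32 ticks in the future.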



* [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry
  2019-06-20  1:46 [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries Suraj Jitindar Singh
  2019-06-20  1:46 ` [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr Suraj Jitindar Singh
@ 2019-06-20  1:46 ` Suraj Jitindar Singh
  2019-06-20  7:57   ` Laurent Vivier
  2019-06-30  8:37   ` Michael Ellerman
  2019-06-23 10:34 ` [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries Michael Ellerman
  2 siblings, 2 replies; 9+ messages in thread
From: Suraj Jitindar Singh @ 2019-06-20  1:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: clg, kvm-ppc, sjitindarsingh

If we enter an L1 guest with a pending decrementer exception then this
is cleared on guest exit if the guest has written a positive value into
the decrementer (indicating that it handled the decrementer exception)
since there is no other way to detect that the guest has handled the
pending exception and that it should be dequeued. In the event that the
L1 guest tries to run a nested (L2) guest immediately after this and the
L2 guest decrementer is negative (which is loaded by L1 before making
the H_ENTER_NESTED hcall), then the pending decrementer exception
isn't cleared and the L2 entry is blocked since L1 has a pending
exception, even though L1 may have already handled the exception and
written a positive value for its decrementer. This results in a loop of
L1 trying to enter the L2 guest and L0 blocking the entry since L1 has
an interrupt pending with the outcome being that L2 never gets to run
and hangs.

Fix this by clearing any pending decrementer exceptions when L1 makes
the H_ENTER_NESTED hcall since it won't do this if its decrementer has
gone negative, and anyway its decrementer has been communicated to L0
in the hdec_expires field and L0 will return control to L1 when this
goes negative by delivering an H_DECREMENTER exception.

Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 719fd2529eec..4a5eb29b952f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4128,8 +4128,15 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,
 
 	preempt_enable();
 
-	/* cancel pending decrementer exception if DEC is now positive */
-	if (get_tb() < vcpu->arch.dec_expires && kvmppc_core_pending_dec(vcpu))
+	/*
+	 * cancel pending decrementer exception if DEC is now positive, or if
+	 * entering a nested guest in which case the decrementer is now owned
+	 * by L2 and the L1 decrementer is provided in hdec_expires
+	 */
+	if (kvmppc_core_pending_dec(vcpu) &&
+			((get_tb() < vcpu->arch.dec_expires) ||
+			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
+			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
 		kvmppc_core_dequeue_dec(vcpu);
 
 	trace_kvm_guest_exit(vcpu);
-- 
2.13.6
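
Illustration only, not part of the patch: the old and new dequeue
conditions can be compared as standalone predicates. The struct toy_exit
and both functions are hypothetical, and the two TOY_ constants are
stand-ins for BOOK3S_INTERRUPT_SYSCALL and H_ENTER_NESTED whose values
here are assumptions chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values; the real definitions live in the kernel headers. */
#define TOY_INTERRUPT_SYSCALL	0xc00u
#define TOY_H_ENTER_NESTED	0xf804u

/* The pieces of vcpu state that the condition looks at on guest exit. */
struct toy_exit {
	bool dec_pending;	/* a decrementer exception is queued for L1 */
	uint64_t tb;		/* current timebase */
	uint64_t dec_expires;	/* timebase value at which L1's DEC expires */
	unsigned int trap;	/* reason for the exit */
	uint64_t gpr3;		/* hcall number when the exit is a syscall */
};

/* Before the patch: dequeue only if DEC is positive again. */
bool old_should_dequeue(const struct toy_exit *e)
{
	return e->tb < e->dec_expires && e->dec_pending;
}

/* After the patch: also dequeue when L1 asks to enter a nested guest. */
bool new_should_dequeue(const struct toy_exit *e)
{
	if (!e->dec_pending)
		return false;
	if (e->tb < e->dec_expires)
		return true;
	return e->trap == TOY_INTERRUPT_SYSCALL &&
	       e->gpr3 == TOY_H_ENTER_NESTED;
}

/* The hang scenario: DEC holds L2's negative value at H_ENTER_NESTED. */
void toy_exit_demo(void)
{
	struct toy_exit e = {
		.dec_pending = true,
		.tb = 2000,
		.dec_expires = 1500,
		.trap = TOY_INTERRUPT_SYSCALL,
		.gpr3 = TOY_H_ENTER_NESTED,
	};

	assert(!old_should_dequeue(&e));	/* exception stays queued */
	assert(new_should_dequeue(&e));		/* exception is cleared */
}
```

In the scenario from the commit message the old predicate leaves the
exception queued, so the L2 entry keeps being blocked, while the new one
clears it.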



* Re: [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr
  2019-06-20  1:46 ` [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr Suraj Jitindar Singh
@ 2019-06-20  7:56   ` Laurent Vivier
  2019-06-30  8:37   ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Laurent Vivier @ 2019-06-20  7:56 UTC (permalink / raw)
  To: Suraj Jitindar Singh, linuxppc-dev; +Cc: clg, kvm-ppc

On 20/06/2019 03:46, Suraj Jitindar Singh wrote:
> On POWER9 the decrementer can operate in large decrementer mode where
> the decrementer is 56 bits and sign extended to 64 bits. When not
> operating in this mode the decrementer behaves as a 32 bit decrementer
> which is NOT sign extended (as on POWER8).
> 
> Currently when reading a guest decrementer value we don't take into
> account whether the large decrementer is enabled or not, and this means
> the value will be incorrect when the guest is not using the large
> decrementer. Fix this by sign extending the value read when the guest
> isn't using the large decrementer.
> 
> Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index d3684509da35..719fd2529eec 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3607,6 +3607,8 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
>  
>  	vcpu->arch.slb_max = 0;
>  	dec = mfspr(SPRN_DEC);
> +	if (!(lpcr & LPCR_LD)) /* Sign extend if not using large decrementer */
> +		dec = (s32) dec;
>  	tb = mftb();
>  	vcpu->arch.dec_expires = dec + tb;
>  	vcpu->cpu = -1;
> 

Patches 2 and 3: I tested that I can boot and run an L2 nested guest with
qemu v4.0.0 and caps-large-decr=on, a configuration that previously hung.

Tested-by: Laurent Vivier <lvivier@redhat.com>


* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry
  2019-06-20  1:46 ` [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry Suraj Jitindar Singh
@ 2019-06-20  7:57   ` Laurent Vivier
  2019-06-20  8:19     ` Cédric Le Goater
  2019-06-30  8:37   ` Michael Ellerman
  1 sibling, 1 reply; 9+ messages in thread
From: Laurent Vivier @ 2019-06-20  7:57 UTC (permalink / raw)
  To: Suraj Jitindar Singh, linuxppc-dev; +Cc: clg, kvm-ppc

On 20/06/2019 03:46, Suraj Jitindar Singh wrote:
> If we enter an L1 guest with a pending decrementer exception then this
> is cleared on guest exit if the guest has written a positive value into
> the decrementer (indicating that it handled the decrementer exception)
> since there is no other way to detect that the guest has handled the
> pending exception and that it should be dequeued. In the event that the
> L1 guest tries to run a nested (L2) guest immediately after this and the
> L2 guest decrementer is negative (which is loaded by L1 before making
> the H_ENTER_NESTED hcall), then the pending decrementer exception
> isn't cleared and the L2 entry is blocked since L1 has a pending
> exception, even though L1 may have already handled the exception and
> written a positive value for its decrementer. This results in a loop of
> L1 trying to enter the L2 guest and L0 blocking the entry since L1 has
> an interrupt pending with the outcome being that L2 never gets to run
> and hangs.
> 
> Fix this by clearing any pending decrementer exceptions when L1 makes
> the H_ENTER_NESTED hcall since it won't do this if its decrementer has
> gone negative, and anyway its decrementer has been communicated to L0
> in the hdec_expires field and L0 will return control to L1 when this
> goes negative by delivering an H_DECREMENTER exception.
> 
> Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 719fd2529eec..4a5eb29b952f 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4128,8 +4128,15 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,
>  
>  	preempt_enable();
>  
> -	/* cancel pending decrementer exception if DEC is now positive */
> -	if (get_tb() < vcpu->arch.dec_expires && kvmppc_core_pending_dec(vcpu))
> +	/*
> +	 * cancel pending decrementer exception if DEC is now positive, or if
> +	 * entering a nested guest in which case the decrementer is now owned
> +	 * by L2 and the L1 decrementer is provided in hdec_expires
> +	 */
> +	if (kvmppc_core_pending_dec(vcpu) &&
> +			((get_tb() < vcpu->arch.dec_expires) ||
> +			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
> +			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
>  		kvmppc_core_dequeue_dec(vcpu);
>  
>  	trace_kvm_guest_exit(vcpu);
> 

Patches 2 and 3: I tested that I can boot and run an L2 nested guest with
qemu v4.0.0 and caps-large-decr=on, a configuration that previously hung.

Tested-by: Laurent Vivier <lvivier@redhat.com>



* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry
  2019-06-20  7:57   ` Laurent Vivier
@ 2019-06-20  8:19     ` Cédric Le Goater
  0 siblings, 0 replies; 9+ messages in thread
From: Cédric Le Goater @ 2019-06-20  8:19 UTC (permalink / raw)
  To: Laurent Vivier, Suraj Jitindar Singh, linuxppc-dev; +Cc: kvm-ppc

On 20/06/2019 09:57, Laurent Vivier wrote:
> On 20/06/2019 03:46, Suraj Jitindar Singh wrote:
>> If we enter an L1 guest with a pending decrementer exception then this
>> is cleared on guest exit if the guest has written a positive value into
>> the decrementer (indicating that it handled the decrementer exception)
>> since there is no other way to detect that the guest has handled the
>> pending exception and that it should be dequeued. In the event that the
>> L1 guest tries to run a nested (L2) guest immediately after this and the
>> L2 guest decrementer is negative (which is loaded by L1 before making
>> the H_ENTER_NESTED hcall), then the pending decrementer exception
>> isn't cleared and the L2 entry is blocked since L1 has a pending
>> exception, even though L1 may have already handled the exception and
>> written a positive value for its decrementer. This results in a loop of
>> L1 trying to enter the L2 guest and L0 blocking the entry since L1 has
>> an interrupt pending with the outcome being that L2 never gets to run
>> and hangs.
>>
>> Fix this by clearing any pending decrementer exceptions when L1 makes
>> the H_ENTER_NESTED hcall since it won't do this if its decrementer has
>> gone negative, and anyway its decrementer has been communicated to L0
>> in the hdec_expires field and L0 will return control to L1 when this
>> goes negative by delivering an H_DECREMENTER exception.
>>
>> Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"
>>
>> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
>> ---
>>  arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 719fd2529eec..4a5eb29b952f 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -4128,8 +4128,15 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,
>>  
>>  	preempt_enable();
>>  
>> -	/* cancel pending decrementer exception if DEC is now positive */
>> -	if (get_tb() < vcpu->arch.dec_expires && kvmppc_core_pending_dec(vcpu))
>> +	/*
>> +	 * cancel pending decrementer exception if DEC is now positive, or if
>> +	 * entering a nested guest in which case the decrementer is now owned
>> +	 * by L2 and the L1 decrementer is provided in hdec_expires
>> +	 */
>> +	if (kvmppc_core_pending_dec(vcpu) &&
>> +			((get_tb() < vcpu->arch.dec_expires) ||
>> +			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
>> +			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
>>  		kvmppc_core_dequeue_dec(vcpu);
>>  
>>  	trace_kvm_guest_exit(vcpu);
>>
> 
> Patches 2 and 3: I tested that I can boot and run an L2 nested guest with
> qemu v4.0.0 and caps-large-decr=on, a configuration that previously hung.
> 
> Tested-by: Laurent Vivier <lvivier@redhat.com>

You beat me to it. All works fine on L0, L1, L2.

  Tested-by: Cédric Le Goater <clg@kaod.org>

Tested with QEMU 4.1. In this configuration, L2 now runs with the
(emulated) XIVE interrupt mode by default (kernel_irqchip=allowed,
ic-mode=dual).

Thanks,

C.




* Re: [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries
  2019-06-20  1:46 [PATCH 1/3] KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries Suraj Jitindar Singh
  2019-06-20  1:46 ` [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr Suraj Jitindar Singh
  2019-06-20  1:46 ` [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry Suraj Jitindar Singh
@ 2019-06-23 10:34 ` Michael Ellerman
  2 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2019-06-23 10:34 UTC (permalink / raw)
  To: Suraj Jitindar Singh, linuxppc-dev; +Cc: clg, kvm-ppc, sjitindarsingh

On Thu, 2019-06-20 at 01:46:49 UTC, Suraj Jitindar Singh wrote:
> When a guest vcpu moves from one physical thread to another it is
> necessary for the host to perform a tlb flush on the previous core if
> another vcpu from the same guest is going to run there. This is because the
> guest may use the local form of the tlb invalidation instruction meaning
> stale tlb entries would persist where it previously ran. This is handled
> on guest entry in kvmppc_check_need_tlb_flush() which calls
> flush_guest_tlb() to perform the tlb flush.
> 
> Previously the generic radix__local_flush_tlb_lpid_guest() function was
> used, however the functionality was reimplemented in flush_guest_tlb()
> to avoid the trace_tlbie() call as the flushing may be done in real
> mode. The reimplementation in flush_guest_tlb() was missing an erat
> invalidation after flushing the tlb.
> 
> This led to observable memory corruption in the guest due to the
> caching of stale translations. Fix this by adding the erat invalidation.
> 
> Fixes: 70ea13f6e609 "KVM: PPC: Book3S HV: Flush TLB on secondary radix threads"
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/50087112592016a3fc10b394a55f1f1a1bde6908

cheers


* Re: [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr
  2019-06-20  1:46 ` [PATCH 2/3] KVM: PPC: Book3S HV: Signed extend decrementer value if not using large decr Suraj Jitindar Singh
  2019-06-20  7:56   ` Laurent Vivier
@ 2019-06-30  8:37   ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2019-06-30  8:37 UTC (permalink / raw)
  To: Suraj Jitindar Singh, linuxppc-dev; +Cc: clg, kvm-ppc, sjitindarsingh

On Thu, 2019-06-20 at 01:46:50 UTC, Suraj Jitindar Singh wrote:
> On POWER9 the decrementer can operate in large decrementer mode where
> the decrementer is 56 bits and sign extended to 64 bits. When not
> operating in this mode the decrementer behaves as a 32 bit decrementer
> which is NOT sign extended (as on POWER8).
> 
> Currently when reading a guest decrementer value we don't take into
> account whether the large decrementer is enabled or not, and this means
> the value will be incorrect when the guest is not using the large
> decrementer. Fix this by sign extending the value read when the guest
> isn't using the large decrementer.
> 
> Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
> Tested-by: Laurent Vivier <lvivier@redhat.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/869537709ebf1dc865e75c3fc97b23f8acf37c16

cheers


* Re: [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry
  2019-06-20  1:46 ` [PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry Suraj Jitindar Singh
  2019-06-20  7:57   ` Laurent Vivier
@ 2019-06-30  8:37   ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2019-06-30  8:37 UTC (permalink / raw)
  To: Suraj Jitindar Singh, linuxppc-dev; +Cc: clg, kvm-ppc, sjitindarsingh

On Thu, 2019-06-20 at 01:46:51 UTC, Suraj Jitindar Singh wrote:
> If we enter an L1 guest with a pending decrementer exception then this
> is cleared on guest exit if the guest has written a positive value into
> the decrementer (indicating that it handled the decrementer exception)
> since there is no other way to detect that the guest has handled the
> pending exception and that it should be dequeued. In the event that the
> L1 guest tries to run a nested (L2) guest immediately after this and the
> L2 guest decrementer is negative (which is loaded by L1 before making
> the H_ENTER_NESTED hcall), then the pending decrementer exception
> isn't cleared and the L2 entry is blocked since L1 has a pending
> exception, even though L1 may have already handled the exception and
> written a positive value for its decrementer. This results in a loop of
> L1 trying to enter the L2 guest and L0 blocking the entry since L1 has
> an interrupt pending with the outcome being that L2 never gets to run
> and hangs.
> 
> Fix this by clearing any pending decrementer exceptions when L1 makes
> the H_ENTER_NESTED hcall since it won't do this if its decrementer has
> gone negative, and anyway its decrementer has been communicated to L0
> in the hdec_expires field and L0 will return control to L1 when this
> goes negative by delivering an H_DECREMENTER exception.
> 
> Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
> Tested-by: Laurent Vivier <lvivier@redhat.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/3c25ab35fbc8526ac0c9b298e8a78e7ad7a55479

cheers
