linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
@ 2020-04-16  5:03 Paul Mackerras
  2020-04-16  8:07 ` Cédric Le Goater
  0 siblings, 1 reply; 3+ messages in thread
From: Paul Mackerras @ 2020-04-16  5:03 UTC (permalink / raw)
  To: kvm, kvm-ppc; +Cc: linux-kernel, linuxppc-dev, groug, clg, David Gibson

Since cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT
page fault handler", it's been possible in fairly rare circumstances to
load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a
guest on a POWER8 host.

Because that case wasn't checked for, we could misinterpret the non-present
PTE as being a cache-inhibited PTE.  That could mismatch with the
corresponding hash PTE, which would cause the function to fail with -EFAULT
a little further down.  That would propagate up to the KVM_RUN ioctl()
generally causing the KVM userspace (usually qemu) to fall over.

This addresses the problem by catching that case and returning to the guest
instead, letting it fault again, and retrying the whole page fault from
the beginning.

For completeness, this fixes the radix page fault handler in the same
way.  For radix this didn't cause any obvious misbehaviour, because we
ended up putting the non-present PTE into the guest's partition-scoped
page tables, leading immediately to another hypervisor data/instruction
storage interrupt, which would go through the page fault path again
and fix things up.

Fixes: cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler"
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
This is a reworked version of the patch David Gibson sent recently,
with the fix applied to the radix case as well. The commit message
is mostly stolen from David's patch.

 arch/powerpc/kvm/book3s_64_mmu_hv.c    | 9 +++++----
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 9 +++++----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 3aecec8..20b7dce 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -604,18 +604,19 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	 */
 	local_irq_disable();
 	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
+	pte = __pte(0);
+	if (ptep)
+		pte = *ptep;
+	local_irq_enable();
 	/*
 	 * If the PTE disappeared temporarily due to a THP
 	 * collapse, just return and let the guest try again.
 	 */
-	if (!ptep) {
-		local_irq_enable();
+	if (!pte_present(pte)) {
 		if (page)
 			put_page(page);
 		return RESUME_GUEST;
 	}
-	pte = *ptep;
-	local_irq_enable();
 	hpa = pte_pfn(pte) << PAGE_SHIFT;
 	pte_size = PAGE_SIZE;
 	if (shift)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 134fbc1..7bf94ba 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -815,18 +815,19 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	 */
 	local_irq_disable();
 	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
+	pte = __pte(0);
+	if (ptep)
+		pte = *ptep;
+	local_irq_enable();
 	/*
 	 * If the PTE disappeared temporarily due to a THP
 	 * collapse, just return and let the guest try again.
 	 */
-	if (!ptep) {
-		local_irq_enable();
+	if (!pte_present(pte)) {
 		if (page)
 			put_page(page);
 		return RESUME_GUEST;
 	}
-	pte = *ptep;
-	local_irq_enable();
 
 	/* If we're logging dirty pages, always map single pages */
 	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
  2020-04-16  5:03 [PATCH] KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions Paul Mackerras
@ 2020-04-16  8:07 ` Cédric Le Goater
  2020-04-17  0:47   ` David Gibson
  0 siblings, 1 reply; 3+ messages in thread
From: Cédric Le Goater @ 2020-04-16  8:07 UTC (permalink / raw)
  To: Paul Mackerras, kvm, kvm-ppc
  Cc: linux-kernel, linuxppc-dev, groug, David Gibson

On 4/16/20 7:03 AM, Paul Mackerras wrote:
> Since cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT
> page fault handler", it's been possible in fairly rare circumstances to
> load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a
> guest on a POWER8 host.
> 
> Because that case wasn't checked for, we could misinterpret the non-present
> PTE as being a cache-inhibited PTE.  That could mismatch with the
> corresponding hash PTE, which would cause the function to fail with -EFAULT
> a little further down.  That would propagate up to the KVM_RUN ioctl()
> generally causing the KVM userspace (usually qemu) to fall over.
> 
> This addresses the problem by catching that case and returning to the guest
> instead, letting it fault again, and retrying the whole page fault from
> the beginning.
> 
> For completeness, this fixes the radix page fault handler in the same
> way.  For radix this didn't cause any obvious misbehaviour, because we
> ended up putting the non-present PTE into the guest's partition-scoped
> page tables, leading immediately to another hypervisor data/instruction
> storage interrupt, which would go through the page fault path again
> and fix things up.
> 
> Fixes: cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler"
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402
> Reported-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

I didn't see the reported issue with the current 5.7-rc1. Anyhow I gave
this patch a try on a P8 host and a P9 host with a radix guest and a hash 
guest (using rhel6). Passthrough is fine also.

Tested-by: Cédric Le Goater <clg@kaod.org>

The code looks correct,

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C. 


> ---
> This is a reworked version of the patch David Gibson sent recently,
> with the fix applied to the radix case as well. The commit message
> is mostly stolen from David's patch.
> 
>  arch/powerpc/kvm/book3s_64_mmu_hv.c    | 9 +++++----
>  arch/powerpc/kvm/book3s_64_mmu_radix.c | 9 +++++----
>  2 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 3aecec8..20b7dce 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -604,18 +604,19 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
>  	 */
>  	local_irq_disable();
>  	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
> +	pte = __pte(0);
> +	if (ptep)
> +		pte = *ptep;
> +	local_irq_enable();
>  	/*
>  	 * If the PTE disappeared temporarily due to a THP
>  	 * collapse, just return and let the guest try again.
>  	 */
> -	if (!ptep) {
> -		local_irq_enable();
> +	if (!pte_present(pte)) {
>  		if (page)
>  			put_page(page);
>  		return RESUME_GUEST;
>  	}
> -	pte = *ptep;
> -	local_irq_enable();
>  	hpa = pte_pfn(pte) << PAGE_SHIFT;
>  	pte_size = PAGE_SIZE;
>  	if (shift)
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 134fbc1..7bf94ba 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -815,18 +815,19 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
>  	 */
>  	local_irq_disable();
>  	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
> +	pte = __pte(0);
> +	if (ptep)
> +		pte = *ptep;
> +	local_irq_enable();
>  	/*
>  	 * If the PTE disappeared temporarily due to a THP
>  	 * collapse, just return and let the guest try again.
>  	 */
> -	if (!ptep) {
> -		local_irq_enable();
> +	if (!pte_present(pte)) {
>  		if (page)
>  			put_page(page);
>  		return RESUME_GUEST;
>  	}
> -	pte = *ptep;
> -	local_irq_enable();
>  
>  	/* If we're logging dirty pages, always map single pages */
>  	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
  2020-04-16  8:07 ` Cédric Le Goater
@ 2020-04-17  0:47   ` David Gibson
  0 siblings, 0 replies; 3+ messages in thread
From: David Gibson @ 2020-04-17  0:47 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Paul Mackerras, kvm, kvm-ppc, linux-kernel, linuxppc-dev, groug

[-- Attachment #1: Type: text/plain, Size: 4724 bytes --]

On Thu, Apr 16, 2020 at 10:07:49AM +0200, Cédric Le Goater wrote:
> On 4/16/20 7:03 AM, Paul Mackerras wrote:
> > Since cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT
> > page fault handler", it's been possible in fairly rare circumstances to
> > load a non-present PTE in kvmppc_book3s_hv_page_fault() when running a
> > guest on a POWER8 host.
> > 
> > Because that case wasn't checked for, we could misinterpret the non-present
> > PTE as being a cache-inhibited PTE.  That could mismatch with the
> > corresponding hash PTE, which would cause the function to fail with -EFAULT
> > a little further down.  That would propagate up to the KVM_RUN ioctl()
> > generally causing the KVM userspace (usually qemu) to fall over.
> > 
> > This addresses the problem by catching that case and returning to the guest
> > instead, letting it fault again, and retrying the whole page fault from
> > the beginning.
> > 
> > For completeness, this fixes the radix page fault handler in the same
> > way.  For radix this didn't cause any obvious misbehaviour, because we
> > ended up putting the non-present PTE into the guest's partition-scoped
> > page tables, leading immediately to another hypervisor data/instruction
> > storage interrupt, which would go through the page fault path again
> > and fix things up.
> > 
> > Fixes: cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler"
> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1820402
> > Reported-by: David Gibson <david@gibson.dropbear.id.au>
> > Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
> 
> I didn't see the reported issue with the current 5.7-rc1. Anyhow I gave
> this patch a try on a P8 host and a P9 host with a radix guest and a hash 
> guest (using rhel6). Passthrough is fine also.
> 
> Tested-by: Cédric Le Goater <clg@kaod.org>
> 
> The code looks correct,
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

I ran my test case overnight with this patch for over 1000 iterations,
without any apparent problems so

Tested-by: David Gibson <david@gibson.dropbear.id.au>

> 
> Thanks,
> 
> C. 
> 
> 
> > ---
> > This is a reworked version of the patch David Gibson sent recently,
> > with the fix applied to the radix case as well. The commit message
> > is mostly stolen from David's patch.
> > 
> >  arch/powerpc/kvm/book3s_64_mmu_hv.c    | 9 +++++----
> >  arch/powerpc/kvm/book3s_64_mmu_radix.c | 9 +++++----
> >  2 files changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> > index 3aecec8..20b7dce 100644
> > --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> > +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> > @@ -604,18 +604,19 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
> >  	 */
> >  	local_irq_disable();
> >  	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
> > +	pte = __pte(0);
> > +	if (ptep)
> > +		pte = *ptep;
> > +	local_irq_enable();
> >  	/*
> >  	 * If the PTE disappeared temporarily due to a THP
> >  	 * collapse, just return and let the guest try again.
> >  	 */
> > -	if (!ptep) {
> > -		local_irq_enable();
> > +	if (!pte_present(pte)) {
> >  		if (page)
> >  			put_page(page);
> >  		return RESUME_GUEST;
> >  	}
> > -	pte = *ptep;
> > -	local_irq_enable();
> >  	hpa = pte_pfn(pte) << PAGE_SHIFT;
> >  	pte_size = PAGE_SIZE;
> >  	if (shift)
> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > index 134fbc1..7bf94ba 100644
> > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > @@ -815,18 +815,19 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
> >  	 */
> >  	local_irq_disable();
> >  	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
> > +	pte = __pte(0);
> > +	if (ptep)
> > +		pte = *ptep;
> > +	local_irq_enable();
> >  	/*
> >  	 * If the PTE disappeared temporarily due to a THP
> >  	 * collapse, just return and let the guest try again.
> >  	 */
> > -	if (!ptep) {
> > -		local_irq_enable();
> > +	if (!pte_present(pte)) {
> >  		if (page)
> >  			put_page(page);
> >  		return RESUME_GUEST;
> >  	}
> > -	pte = *ptep;
> > -	local_irq_enable();
> >  
> >  	/* If we're logging dirty pages, always map single pages */
> >  	large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES);
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-04-17  1:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-16  5:03 [PATCH] KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions Paul Mackerras
2020-04-16  8:07 ` Cédric Le Goater
2020-04-17  0:47   ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).