All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 4.19 1/3] KVM: do not assume PTE is writable after follow_pfn
@ 2021-07-27  8:29 Ovidiu Panait
  2021-07-27  8:29 ` [PATCH v2 4.19 2/3] KVM: do not allow mapping valid but non-reference-counted pages Ovidiu Panait
  2021-07-27  8:29 ` [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
  0 siblings, 2 replies; 5+ messages in thread
From: Ovidiu Panait @ 2021-07-27  8:29 UTC (permalink / raw)
  To: stable; +Cc: pbonzini

From: Paolo Bonzini <pbonzini@redhat.com>

commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream.

In order to convert an HVA to a PFN, KVM usually tries to use
the get_user_pages family of functinso.  This however is not
possible for VM_IO vmas; in that case, KVM instead uses follow_pfn.

In doing this however KVM loses the information on whether the
PFN is writable.  That is usually not a problem because the main
use of VM_IO vmas with KVM is for BARs in PCI device assignment,
however it is a bug.  To fix it, use follow_pte and check pte_write
while under the protection of the PTE lock.  The information can
be used to fail hva_to_pfn_remapped or passed back to the
caller via *writable.

Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix
up page faults before giving up", 2016-07-05); however, even older version
have the same issue, all the way back to commit 2e2e3738af33 ("KVM:
Handle vma regions with no backing page", 2008-07-20), as they also did
not check whether the PFN was writable.

Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page")
Reported-by: David Stevens <stevensd@google.com>
Cc: 3pvd@google.com
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[OP: backport to 4.19, adjust follow_pte() -> follow_pte_pmd()]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
 virt/kvm/kvm_main.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1ecb27b3421a..6aeac96bf147 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1495,9 +1495,11 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 			       kvm_pfn_t *p_pfn)
 {
 	unsigned long pfn;
+	pte_t *ptep;
+	spinlock_t *ptl;
 	int r;
 
-	r = follow_pfn(vma, addr, &pfn);
+	r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
 	if (r) {
 		/*
 		 * get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
@@ -1512,14 +1514,19 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 		if (r)
 			return r;
 
-		r = follow_pfn(vma, addr, &pfn);
+		r = follow_pte_pmd(vma->vm_mm, addr, NULL, NULL, &ptep, NULL, &ptl);
 		if (r)
 			return r;
+	}
 
+	if (write_fault && !pte_write(*ptep)) {
+		pfn = KVM_PFN_ERR_RO_FAULT;
+		goto out;
 	}
 
 	if (writable)
-		*writable = true;
+		*writable = pte_write(*ptep);
+	pfn = pte_pfn(*ptep);
 
 	/*
 	 * Get a reference here because callers of *hva_to_pfn* and
@@ -1534,6 +1541,8 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 	 */ 
 	kvm_get_pfn(pfn);
 
+out:
+	pte_unmap_unlock(ptep, ptl);
 	*p_pfn = pfn;
 	return 0;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 4.19 2/3] KVM: do not allow mapping valid but non-reference-counted pages
  2021-07-27  8:29 [PATCH v2 4.19 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
@ 2021-07-27  8:29 ` Ovidiu Panait
  2021-07-27  8:29 ` [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
  1 sibling, 0 replies; 5+ messages in thread
From: Ovidiu Panait @ 2021-07-27  8:29 UTC (permalink / raw)
  To: stable; +Cc: pbonzini

From: Nicholas Piggin <npiggin@gmail.com>

commit f8be156be163a052a067306417cd0ff679068c97 upstream.

It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.

Fix this by only taking a reference on valid pages if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).

This addresses CVE-2021-22543.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
 virt/kvm/kvm_main.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6aeac96bf147..3559eba5f502 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1489,6 +1489,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
 	return true;
 }
 
+static int kvm_try_get_pfn(kvm_pfn_t pfn)
+{
+	if (kvm_is_reserved_pfn(pfn))
+		return 1;
+	return get_page_unless_zero(pfn_to_page(pfn));
+}
+
 static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 			       unsigned long addr, bool *async,
 			       bool write_fault, bool *writable,
@@ -1538,13 +1545,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 	 * Whoever called remap_pfn_range is also going to call e.g.
 	 * unmap_mapping_range before the underlying pages are freed,
 	 * causing a call to our MMU notifier.
+	 *
+	 * Certain IO or PFNMAP mappings can be backed with valid
+	 * struct pages, but be allocated without refcounting e.g.,
+	 * tail pages of non-compound higher order allocations, which
+	 * would then underflow the refcount when the caller does the
+	 * required put_page. Don't allow those pages here.
 	 */ 
-	kvm_get_pfn(pfn);
+	if (!kvm_try_get_pfn(pfn))
+		r = -EFAULT;
 
 out:
 	pte_unmap_unlock(ptep, ptl);
 	*p_pfn = pfn;
-	return 0;
+
+	return r;
 }
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped()
  2021-07-27  8:29 [PATCH v2 4.19 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
  2021-07-27  8:29 ` [PATCH v2 4.19 2/3] KVM: do not allow mapping valid but non-reference-counted pages Ovidiu Panait
@ 2021-07-27  8:29 ` Ovidiu Panait
  2021-07-27 11:01   ` Paolo Bonzini
  1 sibling, 1 reply; 5+ messages in thread
From: Ovidiu Panait @ 2021-07-27  8:29 UTC (permalink / raw)
  To: stable; +Cc: pbonzini

From: Sean Christopherson <seanjc@google.com>

commit a9545779ee9e9e103648f6f2552e73cfe808d0f4 upstream

Use kvm_pfn_t, a.k.a. u64, for the local 'pfn' variable when retrieving
a so called "remapped" hva/pfn pair.  In theory, the hva could resolve to
a pfn in high memory on a 32-bit kernel.

This bug was inadvertantly exposed by commit bd2fae8da794 ("KVM: do not
assume PTE is writable after follow_pfn"), which added an error PFN value
to the mix, causing gcc to comlain about overflowing the unsigned long.

  arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function ‘hva_to_pfn_remapped’:
  include/linux/kvm_host.h:89:30: error: conversion from ‘long long unsigned int’
                                  to ‘long unsigned int’ changes value from
                                  ‘9218868437227405314’ to ‘2’ [-Werror=overflow]
   89 | #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2)
      |                              ^
virt/kvm/kvm_main.c:1935:9: note: in expansion of macro ‘KVM_PFN_ERR_RO_FAULT’

Cc: stable@vger.kernel.org
Fixes: add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210208201940.1258328-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3559eba5f502..a3d82113ae1c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1501,7 +1501,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 			       bool write_fault, bool *writable,
 			       kvm_pfn_t *p_pfn)
 {
-	unsigned long pfn;
+	kvm_pfn_t pfn;
 	pte_t *ptep;
 	spinlock_t *ptl;
 	int r;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped()
  2021-07-27  8:29 ` [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
@ 2021-07-27 11:01   ` Paolo Bonzini
  2021-07-27 11:20     ` Greg KH
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2021-07-27 11:01 UTC (permalink / raw)
  To: Ovidiu Panait, stable

On 27/07/21 10:29, Ovidiu Panait wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> commit a9545779ee9e9e103648f6f2552e73cfe808d0f4 upstream
> 
> Use kvm_pfn_t, a.k.a. u64, for the local 'pfn' variable when retrieving
> a so called "remapped" hva/pfn pair.  In theory, the hva could resolve to
> a pfn in high memory on a 32-bit kernel.
> 
> This bug was inadvertantly exposed by commit bd2fae8da794 ("KVM: do not
> assume PTE is writable after follow_pfn"), which added an error PFN value
> to the mix, causing gcc to comlain about overflowing the unsigned long.
> 
>    arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function ‘hva_to_pfn_remapped’:
>    include/linux/kvm_host.h:89:30: error: conversion from ‘long long unsigned int’
>                                    to ‘long unsigned int’ changes value from
>                                    ‘9218868437227405314’ to ‘2’ [-Werror=overflow]
>     89 | #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2)
>        |                              ^
> virt/kvm/kvm_main.c:1935:9: note: in expansion of macro ‘KVM_PFN_ERR_RO_FAULT’
> 
> Cc: stable@vger.kernel.org
> Fixes: add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Message-Id: <20210208201940.1258328-1-seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> ---
>   virt/kvm/kvm_main.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 3559eba5f502..a3d82113ae1c 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1501,7 +1501,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
>   			       bool write_fault, bool *writable,
>   			       kvm_pfn_t *p_pfn)
>   {
> -	unsigned long pfn;
> +	kvm_pfn_t pfn;
>   	pte_t *ptep;
>   	spinlock_t *ptl;
>   	int r;
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

and the other two are the same as v1.

Paolo


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped()
  2021-07-27 11:01   ` Paolo Bonzini
@ 2021-07-27 11:20     ` Greg KH
  0 siblings, 0 replies; 5+ messages in thread
From: Greg KH @ 2021-07-27 11:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Ovidiu Panait, stable

On Tue, Jul 27, 2021 at 01:01:01PM +0200, Paolo Bonzini wrote:
> On 27/07/21 10:29, Ovidiu Panait wrote:
> > From: Sean Christopherson <seanjc@google.com>
> > 
> > commit a9545779ee9e9e103648f6f2552e73cfe808d0f4 upstream
> > 
> > Use kvm_pfn_t, a.k.a. u64, for the local 'pfn' variable when retrieving
> > a so called "remapped" hva/pfn pair.  In theory, the hva could resolve to
> > a pfn in high memory on a 32-bit kernel.
> > 
> > This bug was inadvertantly exposed by commit bd2fae8da794 ("KVM: do not
> > assume PTE is writable after follow_pfn"), which added an error PFN value
> > to the mix, causing gcc to comlain about overflowing the unsigned long.
> > 
> >    arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function ‘hva_to_pfn_remapped’:
> >    include/linux/kvm_host.h:89:30: error: conversion from ‘long long unsigned int’
> >                                    to ‘long unsigned int’ changes value from
> >                                    ‘9218868437227405314’ to ‘2’ [-Werror=overflow]
> >     89 | #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2)
> >        |                              ^
> > virt/kvm/kvm_main.c:1935:9: note: in expansion of macro ‘KVM_PFN_ERR_RO_FAULT’
> > 
> > Cc: stable@vger.kernel.org
> > Fixes: add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > Message-Id: <20210208201940.1258328-1-seanjc@google.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
> > ---
> >   virt/kvm/kvm_main.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 3559eba5f502..a3d82113ae1c 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -1501,7 +1501,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> >   			       bool write_fault, bool *writable,
> >   			       kvm_pfn_t *p_pfn)
> >   {
> > -	unsigned long pfn;
> > +	kvm_pfn_t pfn;
> >   	pte_t *ptep;
> >   	spinlock_t *ptl;
> >   	int r;
> > 
> 
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> and the other two are the same as v1.

Thanks, now queued up.  I'll push out a new -rc with this added.

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-07-27 11:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-27  8:29 [PATCH v2 4.19 1/3] KVM: do not assume PTE is writable after follow_pfn Ovidiu Panait
2021-07-27  8:29 ` [PATCH v2 4.19 2/3] KVM: do not allow mapping valid but non-reference-counted pages Ovidiu Panait
2021-07-27  8:29 ` [PATCH v2 4.19 3/3] KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() Ovidiu Panait
2021-07-27 11:01   ` Paolo Bonzini
2021-07-27 11:20     ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.