From: Nicholas Piggin <npiggin@gmail.com>
To: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	Marc Zyngier <maz@kernel.org>, Paul Mackerras <paulus@ozlabs.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	David Stevens <stevensd@chromium.org>,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhi Wang <zhi.a.wang@intel.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-gvt-dev@lists.freedesktop.org,
	James Morse <james.morse@arm.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvmarm@lists.cs.columbia.edu, kvm-ppc@vger.kernel.org,
	kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org,
	Sean Christopherson <seanjc@google.com>,
	David Stevens <stevensd@google.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>, Will Deacon <will@kernel.org>
Subject: Re: [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU
Date: Thu, 24 Jun 2021 21:42:44 +1000	[thread overview]
Message-ID: <1624534759.nj0ylor2eh.astroid@bobo.none> (raw)
In-Reply-To: <1624530624.8jff1f4u11.astroid@bobo.none>

Excerpts from Nicholas Piggin's message of June 24, 2021 8:34 pm:
> Excerpts from David Stevens's message of June 24, 2021 1:57 pm:
>> KVM supports mapping VM_IO and VM_PFNMAP memory into the guest by using
>> follow_pte in gfn_to_pfn. However, the resolved pfns may not have
>> associated struct pages, so they should not be passed to pfn_to_page.
>> This series removes such calls from the x86 and arm64 secondary MMU. To
>> do this, this series modifies gfn_to_pfn to return a struct page in
>> addition to a pfn, if the hva was resolved by gup. This allows the
>> caller to call put_page only when necessitated by gup.
>> 
>> This series provides a helper function that unwraps the new return type
>> of gfn_to_pfn to provide behavior identical to the old behavior. As I
>> have no hardware to test powerpc/mips changes, the function is used
>> there for minimally invasive changes. Additionally, as gfn_to_page and
>> gfn_to_pfn_cache are not integrated with mmu notifier, they cannot be
>> easily changed over to only use pfns.
>> 
>> This addresses CVE-2021-22543 on x86 and arm64.
> 
> Does this fix the problem? (untested, I don't have a POC setup at hand,
> but at least in concept)

This one actually compiles at least. Unfortunately I don't have much 
time in the near future to test, and I only just found out about this
CVE a few hours ago.

---


It's possible to create a memory region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher-order allocations). These
host pages can then be returned by the gfn_to_page, gfn_to_pfn, etc. family
of APIs, which take a reference to the page and thereby bump its refcount
from 0 to 1. When that reference is later dropped, the page is freed
incorrectly.

Fix this by only taking a reference on the page if its refcount was already
non-zero, which indicates the page is participating in normal refcounting
(and can therefore be released with put_page).
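
(Not part of the patch: below is a minimal user-space sketch of the
"increment unless zero" idea that get_page_unless_zero implements, written
with C11 atomics. The names toy_page and toy_get_page_unless_zero are
made up for illustration only; the kernel's real implementation operates
on page->_refcount.)

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a page's refcount; not the kernel's struct page. */
struct toy_page {
	atomic_int refcount;
};

/*
 * Same idea as get_page_unless_zero(): take a reference only if the
 * count is already non-zero, i.e. the page participates in normal
 * refcounting. Returns true if a reference was taken.
 */
static bool toy_get_page_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

int main(void)
{
	struct toy_page refcounted = { .refcount = 1 };  /* normal page */
	struct toy_page raw        = { .refcount = 0 };  /* non-refcounted */

	printf("refcounted: %d\n", toy_get_page_unless_zero(&refcounted)); /* 1 */
	printf("raw:        %d\n", toy_get_page_unless_zero(&raw));        /* 0 */
	return 0;
}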

---
 virt/kvm/kvm_main.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6a6bc7af0e28..46fb042837d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2055,6 +2055,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
 	return true;
 }
 
+static int kvm_try_get_pfn(kvm_pfn_t pfn)
+{
+	if (kvm_is_reserved_pfn(pfn))
+		return 1;
+	return get_page_unless_zero(pfn_to_page(pfn));
+}
+
 static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 			       unsigned long addr, bool *async,
 			       bool write_fault, bool *writable,
@@ -2104,13 +2111,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 	 * Whoever called remap_pfn_range is also going to call e.g.
 	 * unmap_mapping_range before the underlying pages are freed,
 	 * causing a call to our MMU notifier.
+	 *
+	 * Certain IO or PFNMAP mappings can be backed with valid
+	 * struct pages, but be allocated without refcounting e.g.,
+	 * tail pages of non-compound higher order allocations, which
+	 * would then underflow the refcount when the caller does the
+	 * required put_page. Don't allow those pages here.
 	 */ 
-	kvm_get_pfn(pfn);
+	if (!kvm_try_get_pfn(pfn))
+		r = -EFAULT;
 
 out:
 	pte_unmap_unlock(ptep, ptl);
 	*p_pfn = pfn;
-	return 0;
+
+	return r;
 }
 
 /*
-- 
2.23.0

