linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 4.18 086/135] KVM: PPC: Book3S HV: Dont use compound_order to determine host mapping size
       [not found] <20181016170515.447235311@linuxfoundation.org>
@ 2018-10-16 17:05 ` Greg Kroah-Hartman
  2018-10-16 22:32   ` Paul Mackerras
  0 siblings, 1 reply; 3+ messages in thread
From: Greg Kroah-Hartman @ 2018-10-16 17:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Sasha Levin, Greg Kroah-Hartman, Nicholas Piggin, kvm-ppc,
	stable, Aneesh Kumar K.V, linuxppc-dev, David Gibson

4.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <npiggin@gmail.com>

[ Upstream commit 71d29f43b6332badc5598c656616a62575e83342 ]

THP paths can defer splitting compound pages until after the actual
remap and TLB flushes to split a huge PMD/PUD. This causes radix
partition scope page table mappings to get out of synch with the host
qemu page table mappings.

This results in random memory corruption in the guest when running
with THP. The easiest way to reproduce is use KVM balloon to free up
a lot of memory in the guest and then shrink the balloon to give the
memory back, while some work is being done in the guest.

Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c |   91 +++++++++++++--------------------
 1 file changed, 37 insertions(+), 54 deletions(-)

--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -538,8 +538,8 @@ int kvmppc_book3s_radix_page_fault(struc
 				   unsigned long ea, unsigned long dsisr)
 {
 	struct kvm *kvm = vcpu->kvm;
-	unsigned long mmu_seq, pte_size;
-	unsigned long gpa, gfn, hva, pfn;
+	unsigned long mmu_seq;
+	unsigned long gpa, gfn, hva;
 	struct kvm_memory_slot *memslot;
 	struct page *page = NULL;
 	long ret;
@@ -636,9 +636,10 @@ int kvmppc_book3s_radix_page_fault(struc
 	 */
 	hva = gfn_to_hva_memslot(memslot, gfn);
 	if (upgrade_p && __get_user_pages_fast(hva, 1, 1, &page) == 1) {
-		pfn = page_to_pfn(page);
 		upgrade_write = true;
 	} else {
+		unsigned long pfn;
+
 		/* Call KVM generic code to do the slow-path check */
 		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
 					   writing, upgrade_p);
@@ -652,63 +653,45 @@ int kvmppc_book3s_radix_page_fault(struc
 		}
 	}
 
-	/* See if we can insert a 1GB or 2MB large PTE here */
-	level = 0;
-	if (page && PageCompound(page)) {
-		pte_size = PAGE_SIZE << compound_order(compound_head(page));
-		if (pte_size >= PUD_SIZE &&
-		    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
-		    (hva & (PUD_SIZE - PAGE_SIZE))) {
-			level = 2;
-			pfn &= ~((PUD_SIZE >> PAGE_SHIFT) - 1);
-		} else if (pte_size >= PMD_SIZE &&
-			   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
-			   (hva & (PMD_SIZE - PAGE_SIZE))) {
-			level = 1;
-			pfn &= ~((PMD_SIZE >> PAGE_SHIFT) - 1);
-		}
-	}
-
 	/*
-	 * Compute the PTE value that we need to insert.
+	 * Read the PTE from the process' radix tree and use that
+	 * so we get the shift and attribute bits.
 	 */
-	if (page) {
-		pgflags = _PAGE_READ | _PAGE_EXEC | _PAGE_PRESENT | _PAGE_PTE |
-			_PAGE_ACCESSED;
-		if (writing || upgrade_write)
-			pgflags |= _PAGE_WRITE | _PAGE_DIRTY;
-		pte = pfn_pte(pfn, __pgprot(pgflags));
+	local_irq_disable();
+	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
+	pte = *ptep;
+	local_irq_enable();
+
+	/* Get pte level from shift/size */
+	if (shift == PUD_SHIFT &&
+	    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
+	    (hva & (PUD_SIZE - PAGE_SIZE))) {
+		level = 2;
+	} else if (shift == PMD_SHIFT &&
+		   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
+		   (hva & (PMD_SIZE - PAGE_SIZE))) {
+		level = 1;
 	} else {
-		/*
-		 * Read the PTE from the process' radix tree and use that
-		 * so we get the attribute bits.
-		 */
-		local_irq_disable();
-		ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
-		pte = *ptep;
-		local_irq_enable();
-		if (shift == PUD_SHIFT &&
-		    (gpa & (PUD_SIZE - PAGE_SIZE)) ==
-		    (hva & (PUD_SIZE - PAGE_SIZE))) {
-			level = 2;
-		} else if (shift == PMD_SHIFT &&
-			   (gpa & (PMD_SIZE - PAGE_SIZE)) ==
-			   (hva & (PMD_SIZE - PAGE_SIZE))) {
-			level = 1;
-		} else if (shift && shift != PAGE_SHIFT) {
-			/* Adjust PFN */
-			unsigned long mask = (1ul << shift) - PAGE_SIZE;
-			pte = __pte(pte_val(pte) | (hva & mask));
-		}
-		pte = __pte(pte_val(pte) | _PAGE_EXEC | _PAGE_ACCESSED);
-		if (writing || upgrade_write) {
-			if (pte_val(pte) & _PAGE_WRITE)
-				pte = __pte(pte_val(pte) | _PAGE_DIRTY);
-		} else {
-			pte = __pte(pte_val(pte) & ~(_PAGE_WRITE | _PAGE_DIRTY));
+		level = 0;
+		if (shift > PAGE_SHIFT) {
+			/*
+			 * If the pte maps more than one page, bring over
+			 * bits from the virtual address to get the real
+			 * address of the specific single page we want.
+			 */
+			unsigned long rpnmask = (1ul << shift) - PAGE_SIZE;
+			pte = __pte(pte_val(pte) | (hva & rpnmask));
 		}
 	}
 
+	pte = __pte(pte_val(pte) | _PAGE_EXEC | _PAGE_ACCESSED);
+	if (writing || upgrade_write) {
+		if (pte_val(pte) & _PAGE_WRITE)
+			pte = __pte(pte_val(pte) | _PAGE_DIRTY);
+	} else {
+		pte = __pte(pte_val(pte) & ~(_PAGE_WRITE | _PAGE_DIRTY));
+	}
+
 	/* Allocate space in the tree and write the PTE */
 	ret = kvmppc_create_pte(kvm, pte, gpa, level, mmu_seq);
 



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 4.18 086/135] KVM: PPC: Book3S HV: Dont use compound_order to determine host mapping size
  2018-10-16 17:05 ` [PATCH 4.18 086/135] KVM: PPC: Book3S HV: Dont use compound_order to determine host mapping size Greg Kroah-Hartman
@ 2018-10-16 22:32   ` Paul Mackerras
  2018-10-17  7:48     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 3+ messages in thread
From: Paul Mackerras @ 2018-10-16 22:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Aneesh Kumar K.V, Nicholas Piggin, linux-kernel, kvm-ppc,
	Sasha Levin, stable, linuxppc-dev, David Gibson

On Tue, Oct 16, 2018 at 07:05:16PM +0200, Greg Kroah-Hartman wrote:
> 4.18-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Nicholas Piggin <npiggin@gmail.com>
> 
> [ Upstream commit 71d29f43b6332badc5598c656616a62575e83342 ]

If you take 71d29f43b633 then you also need 6579804c4317 ("KVM: PPC:
Book3S HV: Avoid crash from THP collapse during radix page fault",
2018-10-04).

Thanks,
Paul.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 4.18 086/135] KVM: PPC: Book3S HV: Dont use compound_order to determine host mapping size
  2018-10-16 22:32   ` Paul Mackerras
@ 2018-10-17  7:48     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 3+ messages in thread
From: Greg Kroah-Hartman @ 2018-10-17  7:48 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Aneesh Kumar K.V, Nicholas Piggin, linux-kernel, kvm-ppc,
	Sasha Levin, stable, linuxppc-dev, David Gibson

On Wed, Oct 17, 2018 at 09:32:25AM +1100, Paul Mackerras wrote:
> On Tue, Oct 16, 2018 at 07:05:16PM +0200, Greg Kroah-Hartman wrote:
> > 4.18-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Nicholas Piggin <npiggin@gmail.com>
> > 
> > [ Upstream commit 71d29f43b6332badc5598c656616a62575e83342 ]
> 
> If you take 71d29f43b633 then you also need 6579804c4317 ("KVM: PPC:
> Book3S HV: Avoid crash from THP collapse during radix page fault",
> 2018-10-04).

Thanks, now queued up.

greg k-h

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-10-17  7:51 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20181016170515.447235311@linuxfoundation.org>
2018-10-16 17:05 ` [PATCH 4.18 086/135] KVM: PPC: Book3S HV: Dont use compound_order to determine host mapping size Greg Kroah-Hartman
2018-10-16 22:32   ` Paul Mackerras
2018-10-17  7:48     ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).