kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
To: kvm-ppc@vger.kernel.org
Cc: paulus@ozlabs.org, kvm@vger.kernel.org
Subject: [PATCH 01/23] KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler
Date: Mon, 26 Aug 2019 16:20:47 +1000	[thread overview]
Message-ID: <20190826062109.7573-2-sjitindarsingh@gmail.com> (raw)
In-Reply-To: <20190826062109.7573-1-sjitindarsingh@gmail.com>

From: Paul Mackerras <paulus@ozlabs.org>

This makes the same changes in the page fault handler for HPT guests
that commits 31c8b0d0694a ("KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot()
in page fault handler", 2018-03-01), 71d29f43b633 ("KVM: PPC: Book3S HV:
Don't use compound_order to determine host mapping size", 2018-09-11)
and 6579804c4317 ("KVM: PPC: Book3S HV: Avoid crash from THP collapse
during radix page fault", 2018-10-04) made for the page fault handler
for radix guests.

In summary, where we used to call get_user_pages_fast() and then do
special handling for VM_PFNMAP vmas, we now call __get_user_pages_fast()
and then __gfn_to_pfn_memslot() if that fails, followed by reading the
Linux PTE to get the host PFN, host page size and mapping attributes.

This also brings in the change from SetPageDirty() to set_page_dirty_lock()
which was done for the radix page fault handler in commit c3856aeb2940
("KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault
handler", 2018-02-23).

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
---
 arch/powerpc/kvm/book3s_64_mmu_hv.c | 118 +++++++++++++++++-------------------
 1 file changed, 57 insertions(+), 61 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 9a75f0e1933b..a485bb018193 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -497,17 +497,18 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	__be64 *hptep;
 	unsigned long mmu_seq, psize, pte_size;
 	unsigned long gpa_base, gfn_base;
-	unsigned long gpa, gfn, hva, pfn;
+	unsigned long gpa, gfn, hva, pfn, hpa;
 	struct kvm_memory_slot *memslot;
 	unsigned long *rmap;
 	struct revmap_entry *rev;
-	struct page *page, *pages[1];
-	long index, ret, npages;
+	struct page *page;
+	long index, ret;
 	bool is_ci;
-	unsigned int writing, write_ok;
-	struct vm_area_struct *vma;
+	bool writing, write_ok;
+	unsigned int shift;
 	unsigned long rcbits;
 	long mmio_update;
+	pte_t pte, *ptep;
 
 	if (kvm_is_radix(kvm))
 		return kvmppc_book3s_radix_page_fault(run, vcpu, ea, dsisr);
@@ -581,59 +582,62 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	smp_rmb();
 
 	ret = -EFAULT;
-	is_ci = false;
-	pfn = 0;
 	page = NULL;
-	pte_size = PAGE_SIZE;
 	writing = (dsisr & DSISR_ISSTORE) != 0;
 	/* If writing != 0, then the HPTE must allow writing, if we get here */
 	write_ok = writing;
 	hva = gfn_to_hva_memslot(memslot, gfn);
-	npages = get_user_pages_fast(hva, 1, writing ? FOLL_WRITE : 0, pages);
-	if (npages < 1) {
-		/* Check if it's an I/O mapping */
-		down_read(&current->mm->mmap_sem);
-		vma = find_vma(current->mm, hva);
-		if (vma && vma->vm_start <= hva && hva + psize <= vma->vm_end &&
-		    (vma->vm_flags & VM_PFNMAP)) {
-			pfn = vma->vm_pgoff +
-				((hva - vma->vm_start) >> PAGE_SHIFT);
-			pte_size = psize;
-			is_ci = pte_ci(__pte((pgprot_val(vma->vm_page_prot))));
-			write_ok = vma->vm_flags & VM_WRITE;
-		}
-		up_read(&current->mm->mmap_sem);
-		if (!pfn)
-			goto out_put;
+
+	/*
+	 * Do a fast check first, since __gfn_to_pfn_memslot doesn't
+	 * do it with !atomic && !async, which is how we call it.
+	 * We always ask for write permission since the common case
+	 * is that the page is writable.
+	 */
+	if (__get_user_pages_fast(hva, 1, 1, &page) == 1) {
+		write_ok = true;
 	} else {
-		page = pages[0];
-		pfn = page_to_pfn(page);
-		if (PageHuge(page)) {
-			page = compound_head(page);
-			pte_size <<= compound_order(page);
-		}
-		/* if the guest wants write access, see if that is OK */
-		if (!writing && hpte_is_writable(r)) {
-			pte_t *ptep, pte;
-			unsigned long flags;
-			/*
-			 * We need to protect against page table destruction
-			 * hugepage split and collapse.
-			 */
-			local_irq_save(flags);
-			ptep = find_current_mm_pte(current->mm->pgd,
-						   hva, NULL, NULL);
-			if (ptep) {
-				pte = kvmppc_read_update_linux_pte(ptep, 1);
-				if (__pte_write(pte))
-					write_ok = 1;
-			}
-			local_irq_restore(flags);
+		/* Call KVM generic code to do the slow-path check */
+		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
+					   writing, &write_ok);
+		if (is_error_noslot_pfn(pfn))
+			return -EFAULT;
+		page = NULL;
+		if (pfn_valid(pfn)) {
+			page = pfn_to_page(pfn);
+			if (PageReserved(page))
+				page = NULL;
 		}
 	}
 
+	/*
+	 * Read the PTE from the process' radix tree and use that
+	 * so we get the shift and attribute bits.
+	 */
+	local_irq_disable();
+	ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
+	/*
+	 * If the PTE disappeared temporarily due to a THP
+	 * collapse, just return and let the guest try again.
+	 */
+	if (!ptep) {
+		local_irq_enable();
+		if (page)
+			put_page(page);
+		return RESUME_GUEST;
+	}
+	pte = *ptep;
+	local_irq_enable();
+	hpa = pte_pfn(pte) << PAGE_SHIFT;
+	pte_size = PAGE_SIZE;
+	if (shift)
+		pte_size = 1ul << shift;
+	is_ci = pte_ci(pte);
+
 	if (psize > pte_size)
 		goto out_put;
+	if (pte_size > psize)
+		hpa |= hva & (pte_size - psize);
 
 	/* Check WIMG vs. the actual page we're accessing */
 	if (!hpte_cache_flags_ok(r, is_ci)) {
@@ -647,14 +651,13 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	}
 
 	/*
-	 * Set the HPTE to point to pfn.
-	 * Since the pfn is at PAGE_SIZE granularity, make sure we
+	 * Set the HPTE to point to hpa.
+	 * Since the hpa is at PAGE_SIZE granularity, make sure we
 	 * don't mask out lower-order bits if psize < PAGE_SIZE.
 	 */
 	if (psize < PAGE_SIZE)
 		psize = PAGE_SIZE;
-	r = (r & HPTE_R_KEY_HI) | (r & ~(HPTE_R_PP0 - psize)) |
-					((pfn << PAGE_SHIFT) & ~(psize - 1));
+	r = (r & HPTE_R_KEY_HI) | (r & ~(HPTE_R_PP0 - psize)) | hpa;
 	if (hpte_is_writable(r) && !write_ok)
 		r = hpte_make_readonly(r);
 	ret = RESUME_GUEST;
@@ -719,20 +722,13 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 	asm volatile("ptesync" : : : "memory");
 	preempt_enable();
 	if (page && hpte_is_writable(r))
-		SetPageDirty(page);
+		set_page_dirty_lock(page);
 
  out_put:
 	trace_kvm_page_fault_exit(vcpu, hpte, ret);
 
-	if (page) {
-		/*
-		 * We drop pages[0] here, not page because page might
-		 * have been set to the head page of a compound, but
-		 * we have to drop the reference on the correct tail
-		 * page to match the get inside gup()
-		 */
-		put_page(pages[0]);
-	}
+	if (page)
+		put_page(page);
 	return ret;
 
  out_unlock:
-- 
2.13.6


  reply	other threads:[~2019-08-26  6:21 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-26  6:20 [PATCH 00/23] KVM: PPC: BOok3S HV: Support for nested HPT guests Suraj Jitindar Singh
2019-08-26  6:20 ` Suraj Jitindar Singh [this message]
2019-08-26  6:20 ` [PATCH 02/23] KVM: PPC: Book3S HV: Increment mmu_notifier_seq when modifying radix pte rc bits Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 03/23] KVM: PPC: Book3S HV: Nested: Don't allow hash guests to run nested guests Suraj Jitindar Singh
2019-10-23  4:47   ` Paul Mackerras
2019-08-26  6:20 ` [PATCH 04/23] KVM: PPC: Book3S HV: Handle making H_ENTER_NESTED hcall in a separate function Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 05/23] KVM: PPC: Book3S HV: Enable calling kvmppc_hpte_hv_fault in virtual mode Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 06/23] KVM: PPC: Book3S HV: Allow hpt manipulation hcalls to be called " Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 07/23] KVM: PPC: Book3S HV: Make kvmppc_invalidate_hpte() take lpid not a kvm struct Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 08/23] KVM: PPC: Book3S HV: Nested: Allow pseries hypervisor to run hpt nested guest Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 09/23] KVM: PPC: Book3S HV: Nested: Improve comments and naming of nest rmap functions Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 10/23] KVM: PPC: Book3S HV: Nested: Increase gpa field in nest rmap to 46 bits Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 11/23] KVM: PPC: Book3S HV: Nested: Remove single nest rmap entries Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 12/23] KVM: PPC: Book3S HV: Nested: add kvmhv_remove_all_nested_rmap_lpid() Suraj Jitindar Singh
2019-08-26  6:20 ` [PATCH 13/23] KVM: PPC: Book3S HV: Nested: Infrastructure for nested hpt guest setup Suraj Jitindar Singh
2019-10-24  3:43   ` Paul Mackerras
2019-08-26  6:21 ` [PATCH 14/23] KVM: PPC: Book3S HV: Nested: Context switch slb for nested hpt guest Suraj Jitindar Singh
2019-10-24  4:48   ` Paul Mackerras
2019-08-26  6:21 ` [PATCH 15/23] KVM: PPC: Book3S HV: Store lpcr and hdec_exp in the vcpu struct Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 16/23] KVM: PPC: Book3S HV: Nested: Make kvmppc_run_vcpu() entry path nested capable Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 17/23] KVM: PPC: Book3S HV: Nested: Rename kvmhv_xlate_addr_nested_radix Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 18/23] KVM: PPC: Book3S HV: Separate out hashing from kvmppc_hv_find_lock_hpte() Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 19/23] KVM: PPC: Book3S HV: Nested: Implement nested hpt mmu translation Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 20/23] KVM: PPC: Book3S HV: Nested: Handle tlbie hcall for nested hpt guest Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 21/23] KVM: PPC: Book3S HV: Nested: Implement nest rmap invalidations for hpt guests Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 22/23] KVM: PPC: Book3S HV: Nested: Enable nested " Suraj Jitindar Singh
2019-08-26  6:21 ` [PATCH 23/23] KVM: PPC: Book3S HV: Add nested hpt pte information to debugfs Suraj Jitindar Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190826062109.7573-2-sjitindarsingh@gmail.com \
    --to=sjitindarsingh@gmail.com \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=paulus@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).