KVM Archive on lore.kernel.org
 help / color / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Keqian Zhu <zhukeqian1@huawei.com>
Cc: <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <kvm@vger.kernel.org>,
	<kvmarm@lists.cs.columbia.edu>, <wanghaibin.wang@huawei.com>
Subject: Re: [PATCH v4 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO
Date: Fri, 16 Apr 2021 15:44:22 +0100
Message-ID: <87a6py2ss9.wl-maz@kernel.org> (raw)
In-Reply-To: <8f55b64f-b4dd-700e-c997-8de9c5ea282f@huawei.com>

On Thu, 15 Apr 2021 15:08:09 +0100,
Keqian Zhu <zhukeqian1@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2021/4/15 22:03, Keqian Zhu wrote:
> > The MMIO region of a device maybe huge (GB level), try to use
> > block mapping in stage2 to speedup both map and unmap.
> > 
> > Compared to normal memory mapping, we should consider two more
> > points when try block mapping for MMIO region:
> > 
> > 1. For normal memory mapping, the PA(host physical address) and
> > HVA have same alignment within PUD_SIZE or PMD_SIZE when we use
> > the HVA to request hugepage, so we don't need to consider PA
> > alignment when verifing block mapping. But for device memory
> > mapping, the PA and HVA may have different alignment.
> > 
> > 2. For normal memory mapping, we are sure hugepage size properly
> > fit into vma, so we don't check whether the mapping size exceeds
> > the boundary of vma. But for device memory mapping, we should pay
> > attention to this.
> > 
> > This adds get_vma_page_shift() to get page shift for both normal
> > memory and device MMIO region, and check these two points when
> > selecting block mapping size for MMIO region.
> > 
> > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > ---
> >  arch/arm64/kvm/mmu.c | 61 ++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 51 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index c59af5ca01b0..5a1cc7751e6d 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -738,6 +738,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >  	return PAGE_SIZE;
> >  }
> >  
> > +static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
> > +{
> > +	unsigned long pa;
> > +
> > +	if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
> > +		return huge_page_shift(hstate_vma(vma));
> > +
> > +	if (!(vma->vm_flags & VM_PFNMAP))
> > +		return PAGE_SHIFT;
> > +
> > +	VM_BUG_ON(is_vm_hugetlb_page(vma));
> > +
> > +	pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> > +
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > +	if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> > +	    ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> > +	    ALIGN(hva, PUD_SIZE) <= vma->vm_end)
> > +		return PUD_SHIFT;
> > +#endif
> > +
> > +	if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> > +	    ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
> > +	    ALIGN(hva, PMD_SIZE) <= vma->vm_end)
> > +		return PMD_SHIFT;
> > +
> > +	return PAGE_SHIFT;
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  			  struct kvm_memory_slot *memslot, unsigned long hva,
> >  			  unsigned long fault_status)
> > @@ -769,7 +798,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >  
> > -	/* Let's check if we will get back a huge page backed by hugetlbfs */
> > +	/*
> > +	 * Let's check if we will get back a huge page backed by hugetlbfs, or
> > +	 * get block mapping for device MMIO region.
> > +	 */
> >  	mmap_read_lock(current->mm);
> >  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> >  	if (unlikely(!vma)) {
> > @@ -778,15 +810,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >  
> > -	if (is_vm_hugetlb_page(vma))
> > -		vma_shift = huge_page_shift(hstate_vma(vma));
> > -	else
> > -		vma_shift = PAGE_SHIFT;
> > -
> > -	if (logging_active ||
> > -	    (vma->vm_flags & VM_PFNMAP)) {
> > +	/*
> > +	 * logging_active is guaranteed to never be true for VM_PFNMAP
> > +	 * memslots.
> > +	 */
> > +	if (logging_active) {
> >  		force_pte = true;
> >  		vma_shift = PAGE_SHIFT;
> > +	} else {
> > +		vma_shift = get_vma_page_shift(vma, hva);
> >  	}
> I use a if/else manner in v4, please check that. Thanks very much!

That's fine. However, it is getting a bit late for 5.13, and we don't
have much time to left it simmer in -next. I'll probably wait until
after the merge window to pick it up.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

  reply index

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-15 14:03 [PATCH v4 0/2] " Keqian Zhu
2021-04-15 14:03 ` [PATCH v4 1/2] kvm/arm64: Remove the creation time's mapping of MMIO regions Keqian Zhu
2021-04-21  6:38   ` Gavin Shan
2021-04-21  6:28     ` Keqian Zhu
2021-04-22  2:12       ` Gavin Shan
2021-04-22  7:41         ` Keqian Zhu
2021-04-23  1:35           ` Gavin Shan
2021-04-23  1:36             ` Keqian Zhu
2021-04-23  0:36   ` Gavin Shan
2021-04-15 14:03 ` [PATCH v4 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO Keqian Zhu
2021-04-15 14:08   ` Keqian Zhu
2021-04-16 14:44     ` Marc Zyngier [this message]
2021-04-17  1:05       ` Keqian Zhu
2021-04-21  7:52   ` Gavin Shan
2021-04-21  6:36     ` Keqian Zhu
2021-04-22  2:25       ` Gavin Shan
2021-04-22  6:51         ` Marc Zyngier
2021-04-23  0:42           ` Gavin Shan
2021-04-23  0:37   ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a6py2ss9.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=wanghaibin.wang@huawei.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git