Date: Thu, 22 Apr 2021 07:51:36 +0100
Message-ID: <87tunyq0av.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Gavin Shan <gshan@redhat.com>
Cc: Keqian Zhu <zhukeqian1@huawei.com>, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	kvmarm@lists.cs.columbia.edu, wanghaibin.wang@huawei.com
Subject: Re: [PATCH v4 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO
In-Reply-To: <46606f3e-ef41-6520-6647-88c0f76a83e0@redhat.com>
References: <20210415140328.24200-1-zhukeqian1@huawei.com>
	<20210415140328.24200-3-zhukeqian1@huawei.com>
	<960e097d-818b-00bc-b2ee-0da17857f862@redhat.com>
	<105a403a-e48b-15bc-44ff-0ff34f7d2194@huawei.com>
	<46606f3e-ef41-6520-6647-88c0f76a83e0@redhat.com>

On Thu, 22 Apr 2021 03:25:23 +0100,
Gavin Shan <gshan@redhat.com> wrote:
> 
> Hi Keqian,
> 
> On 4/21/21 4:36 PM, Keqian Zhu wrote:
> > On 2021/4/21 15:52, Gavin Shan wrote:
> >> On 4/16/21 12:03 AM, Keqian Zhu wrote:
> >>> The MMIO region of a device may be huge (GB level), so try to use
> >>> block mapping in stage2 to speed up both map and unmap.
> >>>
> >>> Compared to normal memory mapping, we should consider two more
> >>> points when trying block mapping for an MMIO region:
> >>>
> >>> 1. For normal memory mapping, the PA (host physical address) and
> >>> HVA have the same alignment within PUD_SIZE or PMD_SIZE when we use
> >>> the HVA to request a hugepage, so we don't need to consider PA
> >>> alignment when verifying block mapping. But for device memory
> >>> mapping, the PA and HVA may have different alignment.
> >>>
> >>> 2. For normal memory mapping, we are sure the hugepage size properly
> >>> fits into the vma, so we don't check whether the mapping size exceeds
> >>> the boundary of the vma. But for device memory mapping, we should pay
> >>> attention to this.
> >>>
> >>> This adds get_vma_page_shift() to get the page shift for both normal
> >>> memory and device MMIO regions, and checks these two points when
> >>> selecting the block mapping size for an MMIO region.
> >>>
> >>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> >>> ---
> >>>  arch/arm64/kvm/mmu.c | 61 ++++++++++++++++++++++++++++++++++++--------
> >>>  1 file changed, 51 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> >>> index c59af5ca01b0..5a1cc7751e6d 100644
> >>> --- a/arch/arm64/kvm/mmu.c
> >>> +++ b/arch/arm64/kvm/mmu.c
> >>> @@ -738,6 +738,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >>>  	return PAGE_SIZE;
> >>>  }
> >>>  
> >>> +static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
> >>> +{
> >>> +	unsigned long pa;
> >>> +
> >>> +	if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
> >>> +		return huge_page_shift(hstate_vma(vma));
> >>> +
> >>> +	if (!(vma->vm_flags & VM_PFNMAP))
> >>> +		return PAGE_SHIFT;
> >>> +
> >>> +	VM_BUG_ON(is_vm_hugetlb_page(vma));
> >>> +
> >>
> >> I don't understand how VM_PFNMAP is set for hugetlbfs-related vmas.
> >> I think they are exclusive, meaning the flag is never set for a
> >> hugetlbfs vma. If that's true, VM_PFNMAP needn't be checked on
> >> hugetlbfs vmas and the VM_BUG_ON() becomes unnecessary.
> > Yes, but we're not sure all drivers follow this rule. Adding a
> > BUG_ON() is a way to catch the issue.
> > 
> 
> I think I didn't make things clear. What I meant is that VM_PFNMAP can't
> be set for hugetlbfs VMAs. So the checks here can be simplified as
> below if you agree:
> 
>    if (is_vm_hugetlb_page(vma))
>        return huge_page_shift(hstate_vma(vma));
> 
>    if (!(vma->vm_flags & VM_PFNMAP))
>        return PAGE_SHIFT;
> 
>    VM_BUG_ON(is_vm_hugetlb_page(vma));   /* Can be dropped */

No. If this case happens, I want to see it. I have explicitly
asked for it, and this check stays.

	M.

-- 
Without deviation from the norm, progress is not possible.
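
[Editorial note: to make the two considerations from the quoted commit message concrete, below is a minimal standalone C sketch, not code from the patch. The helper fits_block(), the block-size constants and the sample addresses are made up for illustration; in the kernel, the PA of a PFNMAP VMA would be derived from vma->vm_pgoff plus the offset of the HVA within the VMA rather than passed in directly.]

/*
 * Sketch of the two checks a PFNMAP (device MMIO) mapping needs before
 * a block mapping can be used, per the commit message:
 *   1. PA and HVA must share the same offset within the block size,
 *      so a block-aligned IPA also maps to a block-aligned PA;
 *   2. the block-aligned range around the HVA must stay inside the VMA.
 */
#include <stdint.h>
#include <stdio.h>

#define SZ_2M	(2UL * 1024 * 1024)		/* typical PMD_SIZE with 4K pages */
#define SZ_1G	(1024UL * 1024 * 1024)		/* typical PUD_SIZE with 4K pages */

static int fits_block(uint64_t hva, uint64_t pa, uint64_t block,
		      uint64_t vma_start, uint64_t vma_end)
{
	uint64_t mask = block - 1;

	/* Point 1: PA and HVA must be congruent modulo the block size. */
	if ((hva & mask) != (pa & mask))
		return 0;

	/* Point 2: the whole block must lie inside the VMA. */
	if ((hva & ~mask) < vma_start || ((hva & ~mask) + block) > vma_end)
		return 0;

	return 1;
}

int main(void)
{
	uint64_t vma_start = 0x40000000, vma_end = 0x80000000;
	uint64_t hva = 0x40200000;		/* 2M-aligned, inside the VMA */
	uint64_t pa  = 0x8010200000ULL;		/* same offset as hva within 2M, not within 1G */

	if (fits_block(hva, pa, SZ_1G, vma_start, vma_end))
		printf("1G block mapping possible\n");
	else if (fits_block(hva, pa, SZ_2M, vma_start, vma_end))
		printf("2M block mapping possible\n");
	else
		printf("fall back to PAGE_SIZE\n");

	return 0;
}

With the sample values above the 1G check fails on point 1 (PA and HVA differ modulo 1G) while the 2M check passes both points, so the sketch prints "2M block mapping possible".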