From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S933072AbdKPHHp (ORCPT ); Thu, 16 Nov 2017 02:07:45 -0500
Received: from mga07.intel.com ([134.134.136.100]:40003 "EHLO mga07.intel.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1750814AbdKPHHl (ORCPT ); Thu, 16 Nov 2017 02:07:41 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,402,1505804400"; d="scan'208";a="2159558"
Date: Thu, 16 Nov 2017 15:08:02 +0800
From: Haozhong Zhang
To: David Hildenbrand , Paolo Bonzini , rkrcmar@redhat.com, Xiao Guangrong
Cc: kvm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org,
    Dan Williams , ivan.d.cuevas.escareno@intel.com, karthik.kumar@intel.com,
    Konrad Rzeszutek Wilk , Olif Chapman , Mikulas Patocka , Thomas Gleixner ,
    Ingo Molnar , "H. Peter Anvin" , Borislav Petkov , Tom Lendacky
Subject: Re: [PATCH v5 1/2] x86/mm: add a function to check if a pfn is UC/UC-
Message-ID: <20171116070801.ksc2rnly322ibhye@hz-desktop>
Mail-Followup-To: David Hildenbrand , Paolo Bonzini , rkrcmar@redhat.com,
    Xiao Guangrong , kvm@vger.kernel.org, x86@kernel.org,
    linux-kernel@vger.kernel.org, Dan Williams ,
    ivan.d.cuevas.escareno@intel.com, karthik.kumar@intel.com,
    Konrad Rzeszutek Wilk , Olif Chapman , Mikulas Patocka , Thomas Gleixner ,
    Ingo Molnar , "H. Peter Anvin" , Borislav Petkov , Tom Lendacky
References: <20171108075630.16991-1-haozhong.zhang@intel.com>
    <20171108075630.16991-2-haozhong.zhang@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: NeoMutt/20170714 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/15/17 11:44 +0100, David Hildenbrand wrote:
> On 08.11.2017 08:56, Haozhong Zhang wrote:
> > It will be used by KVM to check whether a pfn should be
> > mapped to guest as UC.
> >
> > Signed-off-by: Haozhong Zhang
> > ---
> >  arch/x86/include/asm/pat.h |  2 ++
> >  arch/x86/mm/pat.c          | 16 ++++++++++++++++
> >  2 files changed, 18 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
> > index fffb2794dd89..fabb0cf00e77 100644
> > --- a/arch/x86/include/asm/pat.h
> > +++ b/arch/x86/include/asm/pat.h
> > @@ -21,4 +21,6 @@ int io_reserve_memtype(resource_size_t start, resource_size_t end,
> >
> >  void io_free_memtype(resource_size_t start, resource_size_t end);
> >
> > +bool pat_pfn_is_uc_or_uc_minus(unsigned long pfn);
> > +
> >  #endif /* _ASM_X86_PAT_H */
> > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> > index fe7d57a8fb60..e1282dd4eeb8 100644
> > --- a/arch/x86/mm/pat.c
> > +++ b/arch/x86/mm/pat.c
> > @@ -677,6 +677,22 @@ static enum page_cache_mode lookup_memtype(u64 paddr)
> >  	return rettype;
> >  }
> >
> > +/**
> > + * Check with PAT whether the memory type of a pfn is UC or UC-.
> > + *
> > + * Only to be called when PAT is enabled.
> > + *
> > + * Returns true, if the memory type of @pfn is UC or UC-.
> > + * Otherwise, returns false.
> > + */
> > +bool pat_pfn_is_uc_or_uc_minus(unsigned long pfn)
> > +{
> > +	enum page_cache_mode cm = lookup_memtype(PFN_PHYS(pfn));
> > +
> > +	return cm == _PAGE_CACHE_MODE_UC || cm == _PAGE_CACHE_MODE_UC_MINUS;
> > +}
> > +EXPORT_SYMBOL_GPL(pat_pfn_is_uc_or_uc_minus);
> > +
> >  /**
> >   * io_reserve_memtype - Request a memory type mapping for a region of memory
> >   * @start: start (physical address) of the region
> >
>
> Wonder if we should check for pat internally. And if we should simply
> return the memtype via lookup_memtype() instead of creating such a
> strange named function (by providing e.g. a lookup_memtype() variant
> that can be called with !pat_enabled()).
>
> The caller can easily check against _PAGE_CACHE_MODE_UC ...
>

Yes, the better solution should work for both the PAT-enabled and
PAT-disabled cases, like what __vm_insert_mixed() does: use
vma->vm_page_prot if PAT is disabled, and additionally consult
track_pfn_insert() if PAT is enabled.

The early RFC patch [1] got the cache mode in a similar way via a new
function kvm_vcpu_gfn_to_pgprot(). However, as explained in the RFC, it
does not work, because the existing MMIO check in KVM (where
kvm_vcpu_gfn_to_pgprot() is called) is performed with a spinlock
(vcpu->kvm->mmu_lock) held, while kvm_vcpu_gfn_to_pgprot() has to take
a semaphore (vcpu->kvm->mm->mmap_sem), which may sleep. Besides, KVM
may prefetch and check MMIO of other pfns while holding
vcpu->kvm->mmu_lock, and the prefetched pfns cannot be predicted in
advance, which means we have to keep the MMIO check within
vcpu->kvm->mmu_lock.

Therefore, this patchset only makes a suboptimal fix that covers the
PAT-enabled case, which I suppose is the usual usage scenario of
NVDIMM.

[1] https://patchwork.kernel.org/patch/10016261/

Haozhong
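
For readers following the thread, here is a minimal userspace sketch of
the check discussed above. It is not kernel code: the model_* names, the
range table and the default for untracked addresses are all assumptions
made for illustration, standing in for lookup_memtype() and the
_PAGE_CACHE_MODE_* values from the quoted patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's page_cache_mode values. */
enum model_cache_mode {
	MODEL_CACHE_MODE_WB,
	MODEL_CACHE_MODE_WC,
	MODEL_CACHE_MODE_UC_MINUS,
	MODEL_CACHE_MODE_UC,
};

/* Assumed 4 KiB pages, so a pfn maps to paddr = pfn << 12. */
#define MODEL_PAGE_SHIFT 12

/* Hypothetical table of tracked physical ranges; end is exclusive. */
struct model_memtype_range {
	uint64_t start;
	uint64_t end;
	enum model_cache_mode mode;
};

static const struct model_memtype_range model_ranges[] = {
	{ 0x0fe000000ULL, 0x0fe100000ULL, MODEL_CACHE_MODE_UC },
	{ 0x100000000ULL, 0x100100000ULL, MODEL_CACHE_MODE_WB },
};

/*
 * Model of a memtype lookup: return the mode of the tracked range that
 * contains paddr. Treating untracked addresses as UC- is an assumption
 * of this model.
 */
static enum model_cache_mode model_lookup_memtype(uint64_t paddr)
{
	size_t n = sizeof(model_ranges) / sizeof(model_ranges[0]);

	for (size_t i = 0; i < n; i++) {
		if (paddr >= model_ranges[i].start && paddr < model_ranges[i].end)
			return model_ranges[i].mode;
	}
	return MODEL_CACHE_MODE_UC_MINUS;
}

/* Same shape as the predicate in the quoted patch, on the model types. */
static bool model_pfn_is_uc_or_uc_minus(uint64_t pfn)
{
	enum model_cache_mode cm =
		model_lookup_memtype(pfn << MODEL_PAGE_SHIFT);

	return cm == MODEL_CACHE_MODE_UC || cm == MODEL_CACHE_MODE_UC_MINUS;
}
```

With the alternative David suggests, a caller would instead take the
value returned by the lookup and compare it against the UC modes itself.
In this model that is the comparison inside
model_pfn_is_uc_or_uc_minus() moved to the call site.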