From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752265AbdFMJ1J (ORCPT ); Tue, 13 Jun 2017 05:27:09 -0400 Received: from mail.skyhub.de ([5.9.137.197]:41194 "EHLO mail.skyhub.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751895AbdFMJ1I (ORCPT ); Tue, 13 Jun 2017 05:27:08 -0400 Date: Tue, 13 Jun 2017 11:26:46 +0200 From: Borislav Petkov To: Andy Lutomirski Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Tom Lendacky , Juergen Gross , xen-devel , Boris Ostrovsky Subject: Re: [PATCH] x86/mm: Split read_cr3() into read_cr3_pa() and __read_cr3() Message-ID: <20170613092646.l5wgvfrgoeb3fksz@pd.tnic> References: <883f8fb121f4616c1c1427ad87350bb2f5ffeca1.1497288170.git.luto@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <883f8fb121f4616c1c1427ad87350bb2f5ffeca1.1497288170.git.luto@kernel.org> User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 12, 2017 at 10:26:14AM -0700, Andy Lutomirski wrote: > The kernel has several code paths that read CR3. Most of them assume that > CR3 contains the PGD's physical address, whereas some of them awkwardly > use PHYSICAL_PAGE_MASK to mask off low bits. > > Add explicit mask macros for CR3 and convert all of the CR3 readers. > This will keep them from breaking when PCID is enabled. ... > +/* > + * CR3's layout varies depending on several things. > + * > + * If CR4.PCIDE is set (64-bit only), then CR3[11:0] is the address space ID. > + * If PAE is enabled, then CR3[11:5] is part of the PDPT address > + * (i.e. it's 32-byte aligned, not page-aligned) and CR3[4:0] is ignored. > + * Otherwise (non-PAE, non-PCID), CR3[3] is PWT, CR3[4] is PCD, and > + * CR3[2:0] and CR3[11:5] are ignored. > + * > + * In all cases, Linux puts zeros in the low ignored bits and in PWT and PCD. > + * > + * CR3[63] is always read as zero. If CR4.PCIDE is set, then CR3[63] may be > + * written as 1 to prevent the write to CR3 from flushing the TLB. > + * > + * On systems with SME, one bit (in a variable position!) is stolen to indicate > + * that the top-level paging structure is encrypted. > + * > + * All of the remaining bits indicate the physical address of the top-level > + * paging structure. > + * > + * CR3_ADDR_MASK is the mask used by read_cr3_pa(). > + */ > +#ifdef CONFIG_X86_64 > +/* Mask off the address space ID bits. */ > +#define CR3_ADDR_MASK 0x7FFFFFFFFFFFF000ull > +#define CR3_PCID_MASK 0xFFFull > +#else > +/* > + * CR3_ADDR_MASK needs at least bits 31:5 set on PAE systems, and we save > + * a tiny bit of code size by setting all the bits. > + */ > +#define CR3_ADDR_MASK 0xFFFFFFFFull > +#define CR3_PCID_MASK 0ull All those can do GENMASK_ULL for better readability. > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c > index 3586996fc50d..bc0a849589bb 100644 > --- a/arch/x86/kernel/paravirt.c > +++ b/arch/x86/kernel/paravirt.c > @@ -391,7 +391,7 @@ struct pv_mmu_ops pv_mmu_ops __ro_after_init = { > > .read_cr2 = native_read_cr2, > .write_cr2 = native_write_cr2, > - .read_cr3 = native_read_cr3, > + .read_cr3 = __native_read_cr3, > .write_cr3 = native_write_cr3, > > .flush_tlb_user = native_flush_tlb, > diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c > index ffeae818aa7a..c6d6dc5f8bb2 100644 > --- a/arch/x86/kernel/process_32.c > +++ b/arch/x86/kernel/process_32.c > @@ -92,7 +92,7 @@ void __show_regs(struct pt_regs *regs, int all) > > cr0 = read_cr0(); > cr2 = read_cr2(); > - cr3 = read_cr3(); > + cr3 = __read_cr3(); > cr4 = __read_cr4(); This is a good example for my confusion. So we have __read_cr4() - with the underscores - but not read_cr4(). Now we get __read_cr3 *and* read_cr3_pa(). So __read_cr3() could just as well lose the "__", right? Or are the PCID series bringing a read_cr3() without "__" too? Oh, and to confuse me even more, there's __native_read_cr3() which is *finally* accessing %cr3 :-) But there's native_write_cr3() without underscores. So can we make those names a bit more balanced please? Thanks. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.