From: Andy Lutomirski
Date: Tue, 10 Mar 2015 13:06:37 -0700
Subject: Re: [PATCH 3/3] x86_32: Document our abuse of ss1 and sp1
To: Denys Vlasenko
Cc: X86 ML, "linux-kernel@vger.kernel.org", Borislav Petkov, Oleg Nesterov
In-Reply-To: <54FF4244.5080600@redhat.com>
References: <54FF4244.5080600@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 10, 2015 at 12:13 PM, Denys Vlasenko wrote:
> On 03/10/2015 07:06 PM, Andy Lutomirski wrote:
>> This has confused me for a while.  Now that I figured it out,
>> document it.
>
> Great!
>
>> Signed-off-by: Andy Lutomirski
>> ---
>>  arch/x86/include/asm/processor.h | 21 ++++++++++++++++++---
>>  1 file changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
>> index fc6d8d0d8d53..b26208998b7c 100644
>> --- a/arch/x86/include/asm/processor.h
>> +++ b/arch/x86/include/asm/processor.h
>> @@ -209,9 +209,24 @@ struct x86_hw_tss {
>>       unsigned short          back_link, __blh;
>>       unsigned long           sp0;
>>       unsigned short          ss0, __ss0h;
>> -     unsigned long           sp1;
>> -     /* ss1 caches MSR_IA32_SYSENTER_CS: */
>> -     unsigned short          ss1, __ss1h;
>> +
>> +     /*
>> +      * We don't use ring 1, so sp1 and ss1 are convenient scratch
>> +      * spaces in the same cacheline as sp0.  We use them to cache
>> +      * some MSR values to avoid unnecessary wrmsr instructions.
>
> I don't see where exactly tss.ss1/sp1 is getting used as cache.
> Grepping for "sp1" string, I found only this:
>
> $ grep -r '[.>]e*sp1' .
> ./kernel/cpu/common.c:  tss->x86_tss.sp1 = sizeof(struct tss_struct) + (unsigned long) tss;
> ./kernel/cpu/common.c:  wrmsr(MSR_IA32_SYSENTER_ESP, tss->x86_tss.sp1, 0);
>
> void enable_sep_cpu(void)
> {
>         int cpu = get_cpu();
>         struct tss_struct *tss = &per_cpu(init_tss, cpu);
> ...
>         tss->x86_tss.ss1 = __KERNEL_CS;
>         tss->x86_tss.sp1 = sizeof(struct tss_struct) + (unsigned long) tss;
>         wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);
>         wrmsr(MSR_IA32_SYSENTER_ESP, tss->x86_tss.sp1, 0);
>         wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long) ia32_sysenter_target, 0);
>         put_cpu();
> }
>
> It's trivial to rewrite this wrmsr(MSR_IA32_SYSENTER_ESP)
> without the detour through x86_tss.sp1.
>
> Apart from this, x86_tss.sp1 appears unused... ????confused????
>

Hmm.  Perhaps I hallucinated it.  Maybe we should just remove this
instead.  We change sp0, but not SYSENTER_ESP.

I'll add a fourth patch to the series.

>
>
> .ss1 also seems to be a write-only field:
>
> $ grep -r '[.>]ss1' .
> ./include/asm/processor.h:      if (unlikely(tss->x86_tss.ss1 != thread->sysenter_cs)) {

This is a read :)

> ./include/asm/processor.h:              tss->x86_tss.ss1 = thread->sysenter_cs;
> ./include/asm/processor.h:      .ss1                    = __KERNEL_CS,                          \
> ./kernel/cpu/common.c:  tss->x86_tss.ss1 = __KERNEL_CS;
>
>
>
>> +      *
>> +      * We use SYSENTER_ESP to find sp0 and for the NMI emergency
>> +      * stack,
>
> We use what? SYSENTER_ESP is a MSR, right? We don't use it (the MSR)
> to find anything... I don't understand what you are saying here.
>

As noted above, I'm wrong, so I won't bother clarifying.  Will fix.

>
>> but we need to context switch it because we do
>> +      * horrible things to the kernel stack in vm86 mode.
>> +      *
>> +      * We use SYSENTER_CS to disable sysenter in vm86 mode to avoid
>> +      * corrupting the stack if we went through the sysenter path
>> +      * from vm86 mode.
>> +      */
>
> I'm confused how loading ss1/sp1 with anything can disable sysenter.
> SYSENTER insn does not use those fields.
>
> What you _can_ disable is you can make it impossible to enter RING1
> if tss.ss1 is invalid.

Does it make sense now that I pointed out the read of ss1?  If not,
I'll improve the comments.

>
>
>> +     unsigned long           sp1;    /* MSR_IA32_SYSENTER_ESP */
>> +     unsigned short          ss1;    /* MSR_IA32_SYSENTER_CS */
>
> The comments in the right don't explain anything (to me, at least).
>
> Sorry for sounding negative.

No problem :)

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
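
For readers following the ss1 question, here is a rough user-space sketch of
the caching pattern that the grep hits on tss->x86_tss.ss1 suggest: compare
the cached copy against the incoming thread's sysenter_cs and write the MSR
only when they differ.  Everything named fake_* is a hypothetical stand-in
invented for this illustration (the MSR is modelled as a plain variable with
a write counter), and FAKE_KERNEL_CS is an arbitrary placeholder value.  It is
not the kernel's code, and the assumption that a vm86 task carries
sysenter_cs == 0 is mine, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Arbitrary placeholder, NOT the kernel's __KERNEL_CS value. */
#define FAKE_KERNEL_CS 0x60u

/* Hypothetical cut-down TSS: only the two fields the sketch needs. */
struct fake_tss {
	unsigned long sp0;
	unsigned short ss1;	/* cached copy of the last SYSENTER_CS write */
};

/* Hypothetical per-thread state. */
struct fake_thread {
	unsigned long sp0;
	unsigned short sysenter_cs;	/* assumed 0 while in vm86 mode */
};

/* Simulated MSR plus a counter so the saved writes are observable. */
unsigned int fake_msr_sysenter_cs = FAKE_KERNEL_CS;
unsigned int fake_wrmsr_count;

/* Stand-in for wrmsr(MSR_IA32_SYSENTER_CS, val, 0). */
void fake_wrmsr_sysenter_cs(unsigned int val)
{
	fake_msr_sysenter_cs = val;
	fake_wrmsr_count++;
}

/*
 * Load the incoming thread's sp0 and update SYSENTER_CS only if the
 * value cached in ss1 is stale.  The comparison is the "read" of ss1.
 */
void fake_load_sp0(struct fake_tss *tss, const struct fake_thread *thread)
{
	assert(tss != NULL && thread != NULL);

	tss->sp0 = thread->sp0;
	if (tss->ss1 != thread->sysenter_cs) {
		tss->ss1 = thread->sysenter_cs;
		fake_wrmsr_sysenter_cs(thread->sysenter_cs);
	}
}
```

In this model, switching between two ordinary threads never touches the
simulated MSR; only a switch into or out of a thread with a different
sysenter_cs costs a write.  The ss1 field itself is never consumed by the
(simulated) hardware, which matches the point that SYSENTER does not use
those TSS fields: it is only a software-side record of what was last written.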