Date: Fri, 24 Nov 2017 00:44:34 +0100 (CET)
From: Thomas Gleixner
To: Andy Lutomirski
cc: X86 ML, Borislav Petkov, "linux-kernel@vger.kernel.org", Brian Gerst,
    Dave Hansen, Linus Torvalds, Josh Poimboeuf
Subject: Re: [PATCH v2 13/18] x86/asm/64: Use a percpu trampoline stack for IDT entries
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 21 Nov 2017, Andy Lutomirski wrote:
> The asm isn't exactly beautiful,

Delightful euphemism :)

> but I think that fully refactoring
> it can wait.

> @@ -560,6 +560,14 @@ END(irq_entries_start)
>  .macro interrupt func
>  	cld
>  	ALLOC_PT_GPREGS_ON_STACK
> +
> +	testb	$3, CS(%rsp)
> +	jz	1f
> +	SWAPGS
> +	call	switch_to_thread_stack
> +	SWAPGS

I'm surely missing something subtle, but the register saving really does
not care which GS it runs on. This swapgs orgy looks odd.

> +1:
> +
>  	SAVE_C_REGS
>  	SAVE_EXTRA_REGS
>  	ENCODE_FRAME_POINTER
> @@ -827,6 +835,33 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
>   */
>  #define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
>
> +/*
> + * Switch to the thread stack.  This is called with the IRET frame and
> + * orig_ax in pt_regs and the rest of pt_regs allocated, but with all GPRs
> + * in the CPU registers.

It took me several attempts to grok why you left ALLOC_PT_GPREGS_ON_STACK
in place in the interrupt macro above. In theory it would be sufficient to
push %rdi on the entry stack and operate from there, but it spares only the
'addq %rsp'. Not worth the trouble of dealing with different register
offsets. A comment to that effect would be useful if you look at that
3 months from now.

> + */
> +ENTRY(switch_to_thread_stack)
> +	UNWIND_HINT_IRET_REGS offset=17*8
> +
> +	movq	%rdi, RDI+8(%rsp)
> +	movq	%rsp, %rdi
> +	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> +	UNWIND_HINT_IRET_REGS offset=17*8 base=%rdi
> +
> +	pushq	SS+8(%rdi)		/* regs->ss */
> +	pushq	RSP+8(%rdi)		/* regs->rsp */
> +	pushq	EFLAGS+8(%rdi)		/* regs->eflags */
> +	pushq	CS+8(%rdi)		/* regs->cs */
> +	pushq	RIP+8(%rdi)		/* regs->ip */
> +	pushq	ORIG_RAX+8(%rdi)	/* regs->orig_ax */
> +	ALLOC_PT_GPREGS_ON_STACK	/* allocate the rest of regs */
> +	pushq	(%rdi)			/* return address */
> +
> +	movq	RDI+8(%rdi), %rdi
> +	UNWIND_HINT_IRET_REGS offset=17*8
> +	ret
> +END(switch_to_thread_stack)

Thanks,

	tglx
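
For anyone trying to follow the offsets in the quoted switch_to_thread_stack,
here is a rough user-space C model of the copy it performs. It is only a
sketch and not kernel code: the function names, the PT_* enum and the use of
plain arrays as stacks are made up for illustration, and the slot order
(r15 lowest, ss highest, 15 GPRs below orig_ax) is an assumption that matches
the 17*8 unwind hint offset in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Word indices of the pt_regs slots (assumed layout, r15 lowest). */
enum {
	PT_R15, PT_R14, PT_R13, PT_R12, PT_RBP, PT_RBX, PT_R11, PT_R10,
	PT_R9, PT_R8, PT_RAX, PT_RCX, PT_RDX, PT_RSI, PT_RDI,
	PT_ORIG_RAX, PT_RIP, PT_CS, PT_EFLAGS, PT_RSP, PT_SS,
	PT_REGS_WORDS
};
#define NR_GPREGS 15	/* slots reserved by ALLOC_PT_GPREGS_ON_STACK */

/* Models 'testb $3, CS(%rsp)': nonzero low two CS bits mean CPL != 0. */
static inline int came_from_user_mode(uint64_t cs)
{
	return (cs & 3) != 0;
}

/*
 * entry_sp[0] is the return address pushed by 'call', entry_sp[1..] is
 * pt_regs on the entry stack. thread_top points one past the highest
 * word of the thread stack. *rdi models the live %rdi register.
 * Returns the stack pointer as it is just before the final 'ret'.
 */
static inline uint64_t *switch_to_thread_stack_model(uint64_t *entry_sp,
						     uint64_t *thread_top,
						     uint64_t *rdi)
{
	assert(entry_sp && thread_top && rdi);

	entry_sp[1 + PT_RDI] = *rdi;	/* movq %rdi, RDI+8(%rsp) */
	uint64_t *old = entry_sp;	/* movq %rsp, %rdi */
	uint64_t *sp = thread_top;	/* movq cpu_current_top_of_stack, %rsp */

	*--sp = old[1 + PT_SS];		/* pushq SS+8(%rdi) */
	*--sp = old[1 + PT_RSP];	/* pushq RSP+8(%rdi) */
	*--sp = old[1 + PT_EFLAGS];	/* pushq EFLAGS+8(%rdi) */
	*--sp = old[1 + PT_CS];		/* pushq CS+8(%rdi) */
	*--sp = old[1 + PT_RIP];	/* pushq RIP+8(%rdi) */
	*--sp = old[1 + PT_ORIG_RAX];	/* pushq ORIG_RAX+8(%rdi) */
	sp -= NR_GPREGS;		/* ALLOC_PT_GPREGS_ON_STACK */
	*--sp = old[0];			/* pushq (%rdi): return address */

	*rdi = old[1 + PT_RDI];		/* movq RDI+8(%rdi), %rdi */
	return sp;
}
```

The returned pointer addresses the copied return address. Once 'ret' pops
it, the stack pointer sits at a pt_regs on the thread stack whose IRET frame
and orig_ax are filled in and whose GPR slots are allocated but still empty,
so the SAVE_C_REGS / SAVE_EXTRA_REGS that follow label 1 in the interrupt
macro work with the same offsets on either path.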