linux-kernel.vger.kernel.org archive mirror
From: "H. Peter Anvin" <hpa@zytor.com>
To: Andrew Lutomirski <amluto@gmail.com>,
	"H. Peter Anvin" <hpa@linux.intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Alexander van Heukelum <heukelum@fastmail.fm>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Borislav Petkov <bp@alien8.de>,
	Arjan van de Ven <arjan.van.de.ven@intel.com>,
	Brian Gerst <brgerst@gmail.com>,
	Alexandre Julliard <julliard@winehq.com>,
	Andi Kleen <andi@firstfloor.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] x86-64: espfix for 64-bit mode *PROTOTYPE*
Date: Mon, 21 Apr 2014 16:29:45 -0700
Message-ID: <5355A9E9.9070102@zytor.com>
In-Reply-To: <CAObL_7EJi5+m-oDXRy4hu+-OTZ=9wZ9WEivTMsdDtccU00wfWA@mail.gmail.com>

On 04/21/2014 04:19 PM, Andrew Lutomirski wrote:
> 
> Hahaha! :)
> 
> Some comments:
> 
> Does returning to 64-bit CS with 16-bit SS not need espfix?

There is no such thing.  With a 64-bit CS, the flags on SS are ignored
(although you still have to have a non-null SS... the conditions are a
bit complex.)

> Conversely, does 16-bit CS and 32-bit SS need espfix?

It does not, at least to the best of my knowledge (it is controlled by
the SS size, not the CS size.)

I'm going to double-check the corner cases just out of healthy paranoia,
but I'm 98% sure this is correct (and if not, the 32-bit code needs to
be fixed, too.)
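
As a rough illustration of the rule stated in the two answers above, the
decision can be written as a small predicate. This is only a sketch and is
not code from the patch: struct ret_segs, its fields, and needs_espfix() are
hypothetical names, and the logic assumes the "controlled by the SS size"
statement holds in the corner cases still to be double-checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the segments in the frame being returned to. */
struct ret_segs {
	bool cs_long;	/* CS describes 64-bit code */
	bool ss_big;	/* SS describes a 32-bit stack (false: 16-bit) */
};

/*
 * espfix is keyed on the stack segment size, not the code segment size.
 * With a 64-bit CS the size flag on SS is ignored, so that combination
 * never qualifies.
 */
static bool needs_espfix(struct ret_segs s)
{
	if (s.cs_long)
		return false;
	return !s.ss_big;
}

/* The two combinations asked about above, plus the classic case. */
static void needs_espfix_selftest(void)
{
	assert(!needs_espfix((struct ret_segs){ .cs_long = true,  .ss_big = false }));
	assert(!needs_espfix((struct ret_segs){ .cs_long = false, .ss_big = true  }));
	assert(needs_espfix((struct ret_segs){ .cs_long = false, .ss_big = false }));
}
```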

>> @@ -1058,6 +1095,7 @@ bad_iret:
>>          * So pretend we completed the iret and took the #GPF in user mode.
>>          *
>>          * We are now running with the kernel GS after exception recovery.
>> +        * Exception entry will have removed us from the espfix stack.
>>          * But error_entry expects us to have user GS to match the user %cs,
>>          * so swap back.
>>          */
> 
> What is that referring to?

It means that we have already switched back from the espfix stack to the
real stack.

>> +       /*
>> +        * Switch from the espfix stack to the proper stack: tricky stuff.
>> +        * On the stack right now is 5 words of exception frame,
>> +        * error code/oldeax, RDI, and the return value, so no additional
>> +        * stack is available.
>> +        *
>> +        * We will always be using the user space GS on entry.
>> +       */
>> +ENTRY(espfix_fix_stack)
>> +       SWAPGS
>> +       cld
>> +       movq PER_CPU_VAR(kernel_stack),%rdi
>> +       subq $8*8,%rdi
>> +       /* Use the real stack to hold these registers for now */
>> +       movq %rsi,-8(%rdi)
>> +       movq %rcx,-16(%rdi)
>> +       movq %rsp,%rsi
>> +       movl $8,%ecx
>> +       rep;movsq
>> +       leaq -(10*8)(%rdi),%rsp
>> +       popq %rcx
>> +       popq %rsi
>> +       SWAPGS
>> +       retq
>>
> 
> Is it guaranteed that the userspace thread that caused this is dead?
> If not, do you need to change RIP so that espfix gets invoked again
> when you return from the exception?

It is not guaranteed to be dead at all.  Why would you need to change
RIP, though?
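
For readers following the quoted espfix_fix_stack routine, its data movement
can be modeled in plain C. This is a simplified sketch under stated
assumptions, not the kernel code: switch_to_real_stack() and FRAME_WORDS are
hypothetical names, the stacks are ordinary arrays, and the SWAPGS pair and
the two scratch slots used to save RSI and RCX are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return value, RDI, error code/oldeax, and 5 words of exception frame. */
#define FRAME_WORDS 8

/*
 * Copy the FRAME_WORDS live words at espfix_sp to the area just below
 * kernel_top and return the new stack pointer.  Stacks grow down, so the
 * word at espfix_sp[0] (the return value) ends up at new_sp[0].
 */
static uint64_t *switch_to_real_stack(const uint64_t *espfix_sp,
				      uint64_t *kernel_top)
{
	uint64_t *new_sp;

	assert(espfix_sp && kernel_top);

	new_sp = kernel_top - FRAME_WORDS;	/* subq $8*8,%rdi */
	memcpy(new_sp, espfix_sp,		/* rep;movsq with %ecx = 8 */
	       FRAME_WORDS * sizeof(uint64_t));
	return new_sp;				/* %rsp once RCX/RSI are popped */
}
```

In this model the caller resumes with the same eight words on top of the
real stack, which is why no further fixup of the frame contents is shown.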

>> +
>> +void init_espfix_cpu(void)
>> +{
>> +       int cpu = smp_processor_id();
>> +       unsigned long addr;
>> +       pgd_t pgd, *pgd_p;
>> +       pud_t pud, *pud_p;
>> +       pmd_t pmd, *pmd_p;
>> +       pte_t pte, *pte_p;
>> +       int n;
>> +       void *stack_page;
>> +
>> +       cpu = smp_processor_id();
>> +       BUG_ON(cpu >= (8 << 20)/ESPFIX_STACK_SIZE);
>> +
>> +       /* We only have to do this once... */
>> +       if (likely(this_cpu_read(espfix_stack)))
>> +               return;         /* Already initialized */
>> +
>> +       addr = espfix_base_addr(cpu);
>> +
>> +       /* Did another CPU already set this up? */
>> +       if (likely(espfix_already_there(addr)))
>> +               goto done;
>> +
>> +       mutex_lock(&espfix_init_mutex);
>> +
>> +       if (unlikely(espfix_already_there(addr)))
>> +               goto unlock_done;
> 
> Wouldn't it be simpler to just have a single static bool to indicate
> whether espfix is initialized?

No, you would have to allocate memory for every possible CPU, which I
wanted to avoid in case NR_CPUS >> actual CPUs (I don't know if we have
already done that for percpu, but we *should* if we haven't yet.)
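
The check / lock / re-check sequence in the quoted init_espfix_cpu() is the
usual double-checked lazy initialization pattern, which is what lets the
setup happen only for CPUs that actually come up. A minimal user-space
sketch follows; it is not the patch's code. slot_init(), slot_page,
MAX_SLOTS, and pages_allocated are hypothetical names, a spinning
atomic_flag stands in for espfix_init_mutex, and each slot gets its own
page, whereas the patch lets several CPUs share one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_SLOTS 64			/* hypothetical stand-in for NR_CPUS */

static _Atomic(void *) slot_page[MAX_SLOTS];
static atomic_flag init_lock = ATOMIC_FLAG_INIT;
static int pages_allocated;		/* only modified under init_lock */

/* Return the page for a slot, allocating it on first use only. */
static void *slot_init(int slot)
{
	void *page;

	assert(slot >= 0 && slot < MAX_SLOTS);

	/* Fast path: already set up, no lock taken. */
	page = atomic_load(&slot_page[slot]);
	if (page)
		return page;

	while (atomic_flag_test_and_set(&init_lock))
		;			/* spin; the patch sleeps on a mutex */

	/* Re-check under the lock: another thread may have won the race. */
	page = atomic_load(&slot_page[slot]);
	if (!page) {
		page = calloc(1, 4096);
		if (page) {
			pages_allocated++;
			atomic_store(&slot_page[slot], page);
		}
	}

	atomic_flag_clear(&init_lock);
	return page;
}
```

Nothing is allocated for a slot that is never initialized, which is the
property a single global "initialized" flag would give up.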

> Even better: why not separate the percpu init from the pagetable init
> and just do the pagetable init once from main or even modify_ldt?

It needs to be done once per CPU.  I wanted to do it late enough that
the page allocator is fully functional, so we don't have to do the ugly
hacks to call one allocator or another as the percpu initialization code
does (otherwise it would have made a lot of sense to co-locate with percpu.)

	-hpa


