All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lai Jiangshan <jiangshanlai@gmail.com>
To: Jann Horn <jannh@google.com>
Cc: Lai Jiangshan <laijs@linux.alibaba.com>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	kernel list <linux-kernel@vger.kernel.org>,
	Kees Cook <keescook@chromium.org>
Subject: Re: [PATCH V3 1/3] x86/entry: avoid calling into sync_regs() when entering from userspace
Date: Tue, 15 Sep 2020 15:55:51 +0800	[thread overview]
Message-ID: <CAJhGHyDL2SppJRL83JzKcEf_Zy9zDRMMW_1PrsEwu2ZaHQAkNQ@mail.gmail.com> (raw)
In-Reply-To: <CAG48ez23rw8oj7gT6Kyj3epXsDze+Cq30Jj_pV7sa54gxKEOpQ@mail.gmail.com>

On Sat, Sep 12, 2020 at 5:24 AM Jann Horn <jannh@google.com> wrote:


Hello Jann

Thanks for your review.

> As far as I can see, on systems affected by Meltdown, this patch fixes
> register state leakage between tasks because any data that is written
> to the per-CPU trampoline stacks must be considered visible to all
> userspace. I think that makes this a fix that should go into stable
> kernels.

I know what you mean since I did similarly as you said in another
project. But I hesitated to claim that. Not everyone goes too
paranoid to hide all registers. I consider them a nice cleanup
the makes future entry_64.S better.

>
> Therefore, please add:
>
> Fixes: 7f2590a110b8 ("x86/entry/64: Use a per-CPU trampoline stack for
> IDT entries")
> Cc: stable@vger.kernel.org
>
>
> > ---
> >  arch/x86/entry/entry_64.S | 43 +++++++++++++++++++++++++++++++--------
> >  1 file changed, 34 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > index 70dea9337816..1a7715430da3 100644
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -928,19 +928,42 @@ SYM_CODE_END(paranoid_exit)
> >  SYM_CODE_START_LOCAL(error_entry)
> >         UNWIND_HINT_FUNC
> >         cld
> > -       PUSH_AND_CLEAR_REGS save_ret=1
> > -       ENCODE_FRAME_POINTER 8
> > -       testb   $3, CS+8(%rsp)
> > +       testb   $3, CS-ORIG_RAX+8(%rsp)
> >         jz      .Lerror_kernelspace
> >
> > -       /*
> > -        * We entered from user mode or we're pretending to have entered
> > -        * from user mode due to an IRET fault.
> > -        */
>
> As far as I can tell, this comment is still correct, and it is
> helpful. Why are you removing it?

This comment actually describes .Lerror_entry_from_usermode_after_swapgs
which is kind of far from here.

"We entered from user mode" is already self-explained by the code above.

"we're pretending to have entered from user mode due to an IRET fault"
notes code reviewers that .Lerror_bad_iret can jump to
.Lerror_entry_from_usermode_after_swapgs.

.Lerror_entry_from_usermode_after_swapgs will be removed in the next
patch.

Since the comment is too far from its target, and the target
will be removed in the next patch, so I remove the comments.

Maybe I should move the removal of the comment in the next patch?
But should I rewrite the comments before that?

>
> >         SWAPGS
> >         FENCE_SWAPGS_USER_ENTRY
> > -       /* We have user CR3.  Change to kernel CR3. */
> > -       SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
> > +       /*
> > +        * Switch to the thread stack. The IRET frame and orig_ax are
> > +        * on the stack, as well as the return address. RDI..R12 are
>
> Did you mean RDI..R15?

Good catch!

The whole code along with the comments is copied from original
interrupt_entry. The "RDI..R12" in the comment has been moved
several times and can be traced back to 7f2590a110b8
("x86/entry/64: Use a per-CPU trampoline stack for IDT entries")
(You already just referred to this commit).

>
> > +        * not (yet) on the stack and space has not (yet) been
> > +        * allocated for them.
> > +        */
> > +       pushq   %rdx
> > +
> > +       /* Need to switch before accessing the thread stack. */
> > +       SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
> > +       movq    %rsp, %rdx
> > +       movq    PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> Can we avoid spilling %rdx to the meltdown-readable entry stack here?
> We could do something similar to what entry_SYSCALL_64 does, roughly
> like this:

We must spill at least one register besides %rsp. I once used %rbp
for scratch_reg in a not-yet-opensourced project. I think
spilling %rbp is safer than spilling other registers, but I can't
find strong enough reasoning.

The feeble reason at hand is that %rbp is often used as the frame
pointer in userspace which is safer to be leaked since %rsp is
already leaked.

See https://github.com/google/gvisor/pull/2256#discussion_r399005782
("implement KPTI for gvisor")
I once recommended spilling %rbp in gVisor.
In the pull-request in gVisor, even Google didn't show their
eagerness to hide the kernel along with the registers from APPs.

If you also consider spilling %rbp is better than spilling
%rdx and other x86 people also want that, I can do it
in other patches.

1) This patch uses "PUSH_AND_CLEAR_REGS rdx=(%rdx), save_ret=1".
   And using %rbp needs to change PUSH_AND_CLEAR_REGS which
   is better to be in a separated commit.
2) The code in entry_64.S spills other %rdi %rdx etc. to
   entry stack. They need to be changed to %rbp too.

>
>
> /*
>  * While there is an iret frame, it won't be easy to find for a
>  * few instructions, so let's pretend it doesn't exist.
>  */
> UNWIND_HINT_EMPTY
>
> /*
>  * Switch to kernel CR3 and stack. To avoid spilling secret
>  * userspace register state to the trampoline stack, we use
>  * RSP as scratch - we can reconstruct the old RSP afterwards
>  * using TSS_sp0.
>  */
> SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp

It is already in the fine state in the proper place
in the stack.

Abusing %rsp is the source of perdition. Linus, tglx and
many others who have been touched entry_64.S have hated
syscall instruction for not automatically switching %rsp
and it is the source of many hateful IST exceptions.

See https://lore.kernel.org/lkml/875z98jkof.fsf@nanos.tec.linutronix.de/

I would not like to raise any objection for this reason
since I also hate abusing %rsp.

I expected you to get tonnes of replies if tglx ever notices
your such suggestion.

> movq    PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>
> pushq %rdx /* scratch, will be replaced with regs->ss later */
> mov PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rdx
> sub $7*8, %rdx /* return address, orig_ax, IRET frame */
> /*
>  * We have return address and orig_ax on the stack on
>  * top of the IRET frame. That means offset=2*8
>  */
> UNWIND_HINT_IRET_REGS base=%rdx offset=-5*8
>
> pushq   -2*8(%rdx)               /* regs->rsp */
> pushq   -3*8(%rdx)               /* regs->eflags */
> pushq   -4*8(%rdx)               /* regs->cs */
> pushq   -5*8(%rdx)               /* regs->ip */
> pushq   -6*8(%rdx)               /* regs->orig_ax */
> pushq   -7*8(%rdx)               /* return address */
> UNWIND_HINT_FUNC
>
> PUSH_AND_CLEAR_REGS rdx=7*8(%rsp), save_ret=1
>
> /* copy regs->ss from trampoline stack */
> movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rax
> mov -1*8(%rax), %rax
> movq %rax, 20*8(%rsp)
>
> ENCODE_FRAME_POINTER 8
>
> ret


Thanks for your review and your recommendations.

Lai.

  reply	other threads:[~2020-09-15  7:56 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-27  7:31 [PATCH 0/5] x86/entry: simply stack switching when exception on userspace Lai Jiangshan
2020-05-27  7:31 ` [PATCH 1/5] x86/entry: introduce macro idtentry_swapgs_and_switch_to_kernel_stack Lai Jiangshan
2020-05-28 19:12   ` Thomas Gleixner
2020-05-29  8:26     ` [PATCH V2 0/4] x86/entry: simply stack switching when exception on userspace Lai Jiangshan
2020-05-29  8:26       ` [PATCH V2 1/4] x86/entry: avoid calling into sync_regs() when entering from userspace Lai Jiangshan
2020-05-29  8:26       ` [PATCH V2 2/4] x86/entry: directly switch to kernel stack when .Lerror_bad_iret Lai Jiangshan
2020-05-29  8:26       ` [PATCH V2 3/4] x86/entry: remove unused sync_regs() Lai Jiangshan
2020-05-29  8:26       ` [PATCH V2 4/4] x86/entry: don't copy to tmp in fixup_bad_iret Lai Jiangshan
2020-05-29 18:32       ` [PATCH V2 0/4] x86/entry: simply stack switching when exception on userspace Andy Lutomirski
2020-06-16  1:56         ` Lai Jiangshan
2020-06-18 13:52           ` Lai Jiangshan
2020-06-18 14:10             ` Thomas Gleixner
2020-06-27 21:03               ` Andy Lutomirski
     [not found]                 ` <20200817062355.2884-1-jiangshanlai@gmail.com>
2020-08-17  5:31                   ` [PATCH V3 0/3] " Lai Jiangshan
2020-09-10 10:12                     ` Lai Jiangshan
2020-08-17  6:23                   ` [PATCH V3 1/3] x86/entry: avoid calling into sync_regs() when entering from userspace Lai Jiangshan
2020-09-11 21:24                     ` Jann Horn
2020-09-15  7:55                       ` Lai Jiangshan [this message]
2020-08-17  6:23                   ` [PATCH V3 2/3] x86/entry: directly switch to kernel stack when .Lerror_bad_iret Lai Jiangshan
2020-08-17  6:23                   ` [PATCH V3 3/3] x86/entry: remove unused sync_regs() Lai Jiangshan
2020-05-27  7:31 ` [PATCH 2/5] x86/entry: avoid calling into sync_regs() when entering from userspace Lai Jiangshan
2020-05-27  7:31 ` [PATCH 3/5] x86/entry: directly switch to kernel stack when .Lerror_bad_iret Lai Jiangshan
2020-05-27  7:31 ` [PATCH 4/5] x86/entry: remove unused sync_regs() Lai Jiangshan
2020-05-27  7:31 ` [PATCH 5/5] x86/entry: don't copy to tmp in fixup_bad_iret Lai Jiangshan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJhGHyDL2SppJRL83JzKcEf_Zy9zDRMMW_1PrsEwu2ZaHQAkNQ@mail.gmail.com \
    --to=jiangshanlai@gmail.com \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jannh@google.com \
    --cc=keescook@chromium.org \
    --cc=laijs@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.