All of lore.kernel.org
 help / color / mirror / Atom feed
From: Denys Vlasenko <dvlasenk@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>,
	Andrey Wagin <avagin@gmail.com>, Ingo Molnar <mingo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Frederic Weisbecker <fweisbec@gmail.com>, X86 ML <x86@kernel.org>,
	Alexei Starovoitov <ast@plumgrid.com>,
	Will Drewry <wad@chromium.org>, Kees Cook <keescook@chromium.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/3 v3] x86: entry_64.S: always allocate complete "struct pt_regs"
Date: Wed, 25 Feb 2015 22:28:52 +0100	[thread overview]
Message-ID: <54EE3E94.7060208@redhat.com> (raw)
In-Reply-To: <CALCETrWWHxqZo+aaDrPhJKEORMWG5Fn4-8Hq6B19yNX++GLb5g@mail.gmail.com>

On 02/25/2015 09:10 PM, Andy Lutomirski wrote:
> On Wed, Feb 25, 2015 at 11:59 AM, Andrey Wagin <avagin@gmail.com> wrote:
>> 2015-02-25 21:42 GMT+03:00 Denys Vlasenko <dvlasenk@redhat.com>:
>>> On 02/25/2015 01:37 PM, Andrey Wagin wrote:
>>>> 2015-02-13 0:54 GMT+03:00 Denys Vlasenko <dvlasenk@redhat.com>:
>>>>> 64-bit code was using six stack slots less by not saving/restoring
>>>>> registers which are callee-preserved according to C ABI,
>>>>> and not allocating space for them.
>>>>> Only when syscall needed a complete "struct pt_regs",
>>>>> the complete area was allocated and filled in.
>>>>> As an additional twist, on interrupt entry a "slightly less truncated pt_regs"
>>>>> trick is used, to make nested interrupt stacks easier to unwind.
>>>>>
>>>>> This proved to be a source of significant obfuscation and subtle bugs.
>>>>> For example, stub_fork had to pop the return address,
>>>>> extend the struct, save registers, and push return address back. Ugly.
>>>>> ia32_ptregs_common pops return address and "returns" via jmp insn,
>>>>> throwing a wrench into CPU return stack cache.
>>>>>
>>>>> This patch changes code to always allocate a complete "struct pt_regs".
>>>>> The saving of registers is still done lazily.
>>>>>
>>>>> "Partial pt_regs" trick on interrupt stack is retained.
>>>>>
>>>>> Macros which manipulate "struct pt_regs" on stack are reworked:
>>>>> ALLOC_PT_GPREGS_ON_STACK allocates the structure.
>>>>> SAVE_C_REGS saves to it those registers which are clobbered by C code.
>>>>> SAVE_EXTRA_REGS saves to it all other registers.
>>>>> Corresponding RESTORE_* and REMOVE_PT_GPREGS_FROM_STACK macros reverse it.
>>>>>
>>>>> ia32_ptregs_common, stub_fork and friends lost their ugly dance with
>>>>> return pointer.
>>>>>
>>>>> LOAD_ARGS32 in ia32entry.S now uses symbolic stack offsets
>>>>> instead of magic numbers.
>>>>>
>>>>> error_entry and save_paranoid now use SAVE_C_REGS + SAVE_EXTRA_REGS
>>>>> instead of having it open-coded yet again.
>>>>>
>>>>> Patch was run-tested: 64-bit executables, 32-bit executables,
>>>>> strace works.
>>>>> Timing tests did not show measurable difference in 32-bit
>>>>> and 64-bit syscalls.
>>>>
>>>> Hello Denys,
>>>>
>>>> My test vm doesn't boot with this patch. Could you help to investigate
>>>> this issue?
>>>
>>> I think I found it. This part of my patch is possibly wrong:
>>>
>>> @@ -171,9 +171,9 @@ static inline int arch_irqs_disabled(void)
>>>  #define ARCH_LOCKDEP_SYS_EXIT_IRQ      \
>>>         TRACE_IRQS_ON; \
>>>         sti; \
>>> -       SAVE_REST; \
>>> +       SAVE_EXTRA_REGS; \
>>>         LOCKDEP_SYS_EXIT; \
>>> -       RESTORE_REST; \
>>> +       RESTORE_EXTRA_REGS; \
>>>         cli; \
>>>         TRACE_IRQS_OFF;
>>>
>>> The "SAVE_REST" here is intended to really *push* extra regs on stack,
>>> but the patch changed it so that they are written to existing stack
>>> slots above.
>>>
>>> From code inspection it should work in almost all cases, but some
>>> locations where it is used are really obscure.
>>>
>>> If there are places where *pushing* regs is really necessary,
>>> this can corrupt rbp,rbx,r12-15 registers.
>>>
>>> Your config has CONFIG_LOCKDEP=y, I think it's worth trying whether the bug
>>> was here.
>>> Please find updated patch attached. Can you try it?
>>
>> It doesn't work

Thanks for testing it anyway.


>> [    3.016262] traps: systemd-cgroups[390] general protection
>> ip:7f456f7b6028 sp:7fffdc059718 error:0 in
>> ld-2.18.so[7f456f79e000+20000]

This is what I know about these crashes. The SEGV itself is caused by
HLT instruction executed by dynamic loader, ld-2.NN.so.
The instruction is in _exit function, and is only reachable if
exit_group and exit syscalls fail to terminate the process.
So it seems that syscall execution is getting badly broken somehow
at some point.

This happens to both reporters.

My theory that it is related to lockdep seems to be wrong, because
Sabrina's kernel is not lockdep-enabled, yet it sees the same failure.

Both kernels are paravirtualized, both are booted under KVM,
Andrey runs it with four virtual CPUs, Sabrina runs with two.

My next theory is that I missed something related to paravirt.
I am looking at that code, so far I don't see anything suspicious.

Unfortunately, it doesn't happen to me: I have Sabrina's bzImage,
I run it under "qemu-system-x86_64 -enable-kvm -smp 2",
I see in dmesg that kernel does detect that it is being run under KVM,
but it works for me. No mysterious segfaults.

Andrey, can you send me your bzImage? Maybe it will trigger
the problem for me.


> The change to stub_\func looks wrong to me.  It saves and restores
> regs, but those regs might already have been saved if we're on the
> slow path.  (Yes, all that code is quite buggy even without all these
> patches.)  So is execve.
> 
> This means that, for example, execve called in the slow path will
> save/restore regs twice.  If the values in the regs after the first
> save and before the second save are different, then we corrupt user
> state.

This part?

        .macro FORK_LIKE func
 ENTRY(stub_\func)
        CFI_STARTPROC
-       popq    %r11                    /* save return address */
-       PARTIAL_FRAME 0
-       SAVE_REST
-       pushq   %r11                    /* put it back on stack */
+       DEFAULT_FRAME 0, 8              /* offset 8: return address */
+       SAVE_EXTRA_REGS 8
        FIXUP_TOP_OF_STACK %r11, 8
-       DEFAULT_FRAME 0 8               /* offset 8: return address */
        call sys_\func
        RESTORE_TOP_OF_STACK %r11, 8
-       ret $REST_SKIP          /* pop extended registers */
+       ret
        CFI_ENDPROC
 END(stub_\func)
        .endm

        FORK_LIKE  clone
        FORK_LIKE  fork
        FORK_LIKE  vfork

But the old code (SAVE_REST thing) was also saving registers here.
It had to jump through hoops (pop return address, SAVE_REST,
push return address) to do that.
After the patch, "SAVE_EXTRA_REGS 8" does the same, just without
pop/push pair.

I just don't see what's wrong with it. Can you elaborate?

And this area of code has no paravirt gunk, so if the bug is here,
why it doesn't fail for people running this natively?

-- 
vda


  reply	other threads:[~2015-02-25 21:29 UTC|newest]

Thread overview: 130+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-12 21:54 [PATCH 1/3 v3] x86: entry_64.S: fix wrong symbolic constant usage: R11->ARGOFFSET Denys Vlasenko
2015-02-12 21:54 ` [PATCH 2/3 v3] x86: entry_64.S: always allocate complete "struct pt_regs" Denys Vlasenko
2015-02-18 20:22   ` Andy Lutomirski
2015-02-25 12:37   ` Andrey Wagin
2015-02-25 13:55     ` Denys Vlasenko
2015-02-25 14:48       ` Sabrina Dubroca
2015-02-25 16:52     ` Denys Vlasenko
2015-02-25 18:42     ` Denys Vlasenko
2015-02-25 19:59       ` Andrey Wagin
2015-02-25 20:10         ` Andy Lutomirski
2015-02-25 21:28           ` Denys Vlasenko [this message]
2015-02-25 21:59             ` Andy Lutomirski
2015-02-25 22:40               ` Sabrina Dubroca
2015-02-25 23:34                 ` Sabrina Dubroca
2015-02-26  1:12                   ` Denys Vlasenko
2015-02-26  5:18                     ` Andrew Morton
2015-02-26  6:25                       ` Stephen Rothwell
2015-02-26  9:55               ` Denys Vlasenko
2015-02-26 12:11                 ` Denys Vlasenko
2015-02-26 13:54                   ` Denys Vlasenko
2015-02-26 14:26                     ` Sabrina Dubroca
2015-02-26 15:14                 ` Andy Lutomirski
2015-02-12 21:54 ` [PATCH 3/3 v3] x86: entry_64.S: fix comments. No code changes Denys Vlasenko
2015-02-18 20:25   ` Andy Lutomirski
2015-02-18 20:00 ` [PATCH 1/3 v3] x86: entry_64.S: fix wrong symbolic constant usage: R11->ARGOFFSET Andy Lutomirski
  -- strict thread matches above, loose matches on Subject: below --
2015-02-26 22:40 [PATCH 00/16] x86/asm changes for 4.1 for review Andy Lutomirski
2015-02-26 22:40 ` [PATCH 01/16] x86: open-code register save/restore in trace_hardirqs thunks Andy Lutomirski
2015-03-04 22:52   ` [tip:x86/asm] x86/asm/64: Open-code register save/ restore in trace_hardirqs*() thunks tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 02/16] x86: introduce push/pop macros which generate CFI_REL_OFFSET and CFI_RESTORE Andy Lutomirski
2015-03-04 22:52   ` [tip:x86/asm] x86/asm: Introduce push/ pop " tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 03/16] x86: entry_64.S: fix wrong symbolic constant usage: R11->ARGOFFSET Andy Lutomirski
2015-03-04 22:53   ` [tip:x86/asm] x86/asm/entry/64: Fix incorrect " tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 04/16] x86: entry_64.S: always allocate complete "struct pt_regs" Andy Lutomirski
2015-03-04 22:53   ` [tip:x86/asm] x86/asm/entry/64: Always allocate a complete " struct pt_regs" on the kernel stack tip-bot for Denys Vlasenko
2015-03-21 22:51     ` Brian Gerst
2015-03-22 14:15       ` Denys Vlasenko
2015-02-26 22:40 ` [PATCH 05/16] x86: entry_64.S: fix comments. No code changes Andy Lutomirski
2015-03-04 22:53   ` [tip:x86/asm] x86/asm/entry/64: Fix comments tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 06/16] x86: code shrink in paranoid_exit Andy Lutomirski
2015-03-04 22:53   ` [tip:x86/asm] x86/asm/entry/64: Shrink code in 'paranoid_exit' tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 07/16] x86: mass removal of ARGOFFSET. No code changes Andy Lutomirski
2015-03-04 22:54   ` [tip:x86/asm] x86/asm/entry: Do mass removal of 'ARGOFFSET' tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 08/16] x86, entry: Remove int_check_syscall_exit_work Andy Lutomirski
2015-03-04 22:54   ` [tip:x86/asm] x86/asm/entry/64: Remove ' int_check_syscall_exit_work' tip-bot for Andy Lutomirski
2015-02-26 22:40 ` [PATCH 09/16] x86: add comments about various syscall instructions, no code changes Andy Lutomirski
2015-03-04 22:54   ` [tip:x86/asm] x86/asm/entry: Add comments about various syscall instructions tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 10/16] x86: entry_64.S: move save_paranoid and ret_from_fork closer to their users Andy Lutomirski
2015-03-04 22:55   ` [tip:x86/asm] x86/asm/entry/64: Move 'save_paranoid' and ' ret_from_fork' " tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 11/16] x86: entry_64.S: rename save_paranoid to paranoid_entry, no code changes Andy Lutomirski
2015-03-04 22:55   ` [tip:x86/asm] x86/asm/entry/64: Clean up and document various entry code details tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 12/16] x86: ia32entry.S: fold IA32_ARG_FIXUP macro into its callers Andy Lutomirski
2015-03-04 22:55   ` [tip:x86/asm] x86/asm/entry/64/compat: Fold the " tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 13/16] x86: entry_64.S: use more understandable constants Andy Lutomirski
2015-03-04 22:56   ` [tip:x86/asm] x86/asm/entry/64: Use more readable constants tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 14/16] x86: ia32entry.S: use more understandable constant Andy Lutomirski
2015-03-04 22:56   ` [tip:x86/asm] x86/asm/entry/64/compat: Use more readable constant tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 15/16] x86: entry.S: simplify optimistic SYSRET Andy Lutomirski
2015-03-04 21:40   ` Ingo Molnar
2015-03-04 22:56   ` [tip:x86/asm] x86/asm/entry/64: Simplify " tip-bot for Denys Vlasenko
2015-02-26 22:40 ` [PATCH 16/16] x86_64, entry: Remove a bogus ret_from_fork optimization Andy Lutomirski
2015-03-04 22:57   ` [tip:x86/asm] x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization tip-bot for Andy Lutomirski
2015-03-05 11:49   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2015-03-04 21:50 ` [PATCH 00/16] x86/asm changes for 4.1 for review Ingo Molnar
2015-03-04 21:55   ` Andy Lutomirski
2015-01-14 21:48 [PATCH 01/11] x86: entry_64.S: always allocate complete "struct pt_regs" Denys Vlasenko
2015-01-14 21:48 ` [PATCH 02/11] x86: code shrink in paranoid_exit Denys Vlasenko
2015-02-11 20:36   ` Andy Lutomirski
2015-02-11 21:01     ` H. Peter Anvin
2015-02-11 21:13     ` Denys Vlasenko
2015-02-11 22:09       ` Andy Lutomirski
2015-02-18 23:26   ` Andy Lutomirski
2015-01-14 21:48 ` [PATCH 03/11] x86: mass removal of ARGOFFSET Denys Vlasenko
2015-02-21  0:31   ` Andy Lutomirski
2015-02-23 14:31     ` Denys Vlasenko
2015-02-23 16:06       ` Andy Lutomirski
2015-01-14 21:48 ` [PATCH 04/11] x86: rename some macros and labels, no code changes Denys Vlasenko
2015-01-14 21:48 ` [PATCH 05/11] x86: add comments about various syscall instructions, " Denys Vlasenko
2015-01-14 21:48 ` [PATCH 06/11] x86: entry_64.S: move save_paranoid and ret_from_fork closer to their users Denys Vlasenko
2015-01-14 21:48 ` [PATCH 07/11] x86: entry_64.S: rename save_paranoid to paranoid_entry, no code changes Denys Vlasenko
2015-02-11 20:39   ` Andy Lutomirski
2015-01-14 21:48 ` [PATCH 08/11] x86: entry_64.S: fold test_in_nmi macro into its only user Denys Vlasenko
2015-02-11 20:40   ` Andy Lutomirski
2015-02-12  2:17     ` Steven Rostedt
2015-01-14 21:48 ` [PATCH 09/11] x86: get rid of KERNEL_STACK_OFFSET Denys Vlasenko
2015-01-14 21:48 ` [PATCH 10/11] x86: ia32entry.S: fold IA32_ARG_FIXUP macro into its callers Denys Vlasenko
2015-01-14 21:48 ` [PATCH 11/11] x86: entry_64.S: use more understandable constants Denys Vlasenko
2015-01-14 22:17 ` [PATCH 01/11] x86: entry_64.S: always allocate complete "struct pt_regs" Denys Vlasenko
2015-01-14 22:29   ` Andy Lutomirski
2015-01-14 22:41     ` Borislav Petkov
2015-01-14 22:50       ` Denys Vlasenko
2015-02-11 20:30 ` Andy Lutomirski
2015-02-11 21:55   ` Denys Vlasenko
2015-02-11 22:03     ` Andy Lutomirski
2015-01-11 23:07 [PATCH] x86: introduce push/pop macros which generate CFI_REL_OFFSET and CFI_RESTORE Denys Vlasenko
2015-01-12  0:38 ` Andy Lutomirski
2015-01-12  6:23   ` Denys Vlasenko
2015-01-12 19:23 ` Borislav Petkov
2015-01-12 19:25   ` Andy Lutomirski
2015-01-12 19:37     ` Borislav Petkov
2015-01-12 19:46       ` Andy Lutomirski
2015-01-12 20:11         ` Borislav Petkov
2015-01-12 20:14           ` Andy Lutomirski
2015-01-12 20:22             ` H. Peter Anvin
2015-01-12 20:26               ` Andy Lutomirski
2015-01-12 21:03                 ` Borislav Petkov
2015-01-13 12:07                 ` Denys Vlasenko
2015-01-12 20:32             ` Borislav Petkov
2015-02-11 20:24 ` Andy Lutomirski
2015-01-10 22:00 [PATCH 0/4 v2] x86: entry.S cleanup Denys Vlasenko
2015-01-10 22:00 ` [PATCH 1/4] x86: entry_64.S: delete unused code Denys Vlasenko
2015-01-10 22:12   ` Andy Lutomirski
2015-01-10 22:00 ` [PATCH 2/4] x86: ia32entry.S: fix wrong symbolic constant usage: R11->ARGOFFSET Denys Vlasenko
2015-01-10 22:13   ` Andy Lutomirski
2015-01-10 22:27     ` Linus Torvalds
2015-01-10 22:35       ` Borislav Petkov
2015-01-10 22:41         ` Linus Torvalds
2015-01-10 22:45           ` Borislav Petkov
2015-01-10 22:37       ` Linus Torvalds
2015-01-10 23:27       ` Andy Lutomirski
2015-01-13 22:11   ` Andy Lutomirski
2015-01-10 22:00 ` [PATCH 3/4] x86: open-code register save/restore in trace_hardirqs thunks Denys Vlasenko
2015-01-10 22:07   ` Linus Torvalds
2015-01-10 22:35     ` Denys Vlasenko
2015-01-10 22:41       ` Borislav Petkov
2015-01-11  3:33         ` Denys Vlasenko
2015-01-11 10:54           ` Borislav Petkov
2015-01-11 23:06             ` Denys Vlasenko
2015-02-11  2:38   ` Andy Lutomirski
2015-01-10 22:00 ` [PATCH 4/4] x86: entry_64.S: fold SAVE_ARGS_IRQ macro into its sole user Denys Vlasenko
2015-01-13 22:26   ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54EE3E94.7060208@redhat.com \
    --to=dvlasenk@redhat.com \
    --cc=ast@plumgrid.com \
    --cc=avagin@gmail.com \
    --cc=bp@alien8.de \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=wad@chromium.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.