From: Josh Poimboeuf <email@example.com> To: Linus Torvalds <firstname.lastname@example.org> Cc: kernel test robot <email@example.com>, Ingo Molnar <firstname.lastname@example.org>, Andrey Ryabinin <email@example.com>, Matthias Kaehlcke <firstname.lastname@example.org>, Alexander Potapenko <email@example.com>, Andy Lutomirski <firstname.lastname@example.org>, Arnd Bergmann <email@example.com>, Dmitriy Vyukov <firstname.lastname@example.org>, Miguel Bernal Marin <email@example.com>, Peter Zijlstra <firstname.lastname@example.org>, Thomas Gleixner <email@example.com>, LKML <firstname.lastname@example.org>, LKP <email@example.com> Subject: Re: [lkp-robot] [x86/asm] f5caf621ee: PANIC:double_fault Date: Thu, 28 Sep 2017 11:44:22 -0500 [thread overview] Message-ID: <20170928164422.sl4z4sfbkyscbxrk@treble> (raw) In-Reply-To: <CA+55aFwL0DGcxciSbPEGNpOgxveBsr3=qnm6Xx4CfFrxhkMQxg@mail.gmail.com> On Thu, Sep 28, 2017 at 09:21:07AM -0700, Linus Torvalds wrote: > On Thu, Sep 28, 2017 at 12:47 AM, kernel test robot > <firstname.lastname@example.org> wrote: > > > > [ 10.587519] RIP: 0010:compat_sock_ioctl+0xfea/0x103e > > [ 10.587974] RSP: 0000:0000000000277d78 EFLAGS: 00010283 > > [ 10.588448] RAX: 0000000000277d78 RBX: 0000000000008933 RCX: ffff8800141a8000 > > [ 10.589103] RDX: 0000000000000020 RSI: 00000000fffbea00 RDI: 00000000fffbea50 > > [ 10.589757] RBP: ffffc90000277e18 R08: fffbea50fffbea34 R09: ffffffff814a68c9 > > [ 10.590407] R10: ffffff9c00000002 R11: 00000000fffbea50 R12: 0000000000000000 > > [ 10.591056] R13: ffff880012c8c880 R14: 00000000fffbea50 R15: 00000000fffbea00 > > [ 10.591708] FS: 0000000000000000(0000) GS:ffff880019a00000(0063) knlGS:00000000f7fab9a0 > > [ 10.592446] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 > > [ 10.592973] CR2: 0000000000277d68 CR3: 000000001807f000 CR4: 00000000000006b0 > > [ 10.593623] Call Trace: > > [ 10.593858] Code: 02 0f ff 65 48 8b 04 25 80 d1 00 00 48 8b 80 28 25 00 00 48 83 e8 20 49 39 c7 77 34 89 e0 4c 89 f7 4c 89 fe ba 20 00 00 00 89 c4 <e8> b3 52 05 00 85 c0 74 22 eb 1a 4c 89 fa 89 de 4c 89 ef e8 c6 > > [ 10.595705] Kernel panic - not syncing: Machine halted. > > That is some _funky_ code, and yes, this may well be triggered by the > inline asm changes. > > The code decodes to (after ignoring a few bytes at the beginning that > were in the middle of an instruction) > > 0: 65 48 8b 04 25 80 d1 mov %gs:0xd180,%rax > 7: 00 00 > 9: 48 8b 80 28 25 00 00 mov 0x2528(%rax),%rax > 10: 48 83 e8 20 sub $0x20,%rax > 14: 49 39 c7 cmp %rax,%r15 > 17: 77 34 ja 0x4d > 19: 89 e0 mov %esp,%eax > 1b: 4c 89 f7 mov %r14,%rdi > 1e: 4c 89 fe mov %r15,%rsi > 21: ba 20 00 00 00 mov $0x20,%edx > 26: 89 c4 mov %eax,%esp > 28:* e8 b3 52 05 00 callq 0x552e0 <-- trapping instruction > 2d: 85 c0 test %eax,%eax > 2f: 74 22 je 0x53 > 31: eb 1a jmp 0x4d > 33: 4c 89 fa mov %r15,%rdx > 36: 89 de mov %ebx,%esi > 38: 4c 89 ef mov %r13,%rdi > > and it's worth noting that insane > > mov %eax,%esp > > instruction, and how RAX (and RSP) both have that bad value of > 0000000000277d78 in them. > > So double fault is correct - we've corrupted the stack. > > And NOTE! It's reloading 32 bits, not 64 bits, and that's the basic bug there. > > I do note that when I build a kernel, I do see that pattern of > > movl $32, %edx > call <something> > > and in every case it's a a call to a user copy. One is "call > _copy_from_user", while the other ones are all the > alternative_call_2() in copy_user_generic(). > > Judging by the offset within the function, and judging by the bug, > it's almost certainly that alternative_call_2() case. > > So it does sound like the clang fix has now introduced a gcc regression. > > And yes, in both cases it seems to be a compiler bug, but I'm not > convinced it's a good idea to fix a clang bug by introducing a gcc > one. > > Anyway, I think the real hint here is that 32-bit reload. > > Lookie here: > > register unsigned int __asm_call_sp asm("esp"); > #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp) > > yeah, that's just garbage. It sure as hell should not be "unsigned int". > > Yeah. yeah, gcc shouldn't do that insane reload in the first place, > but once that gcc bug has triggered, then the "unsigned int" is what > makes the code go really bad. > > I bet that changing it to "unsigned long" will just fix things. > > Josh? Agreed, changing it to "unsigned long" and "rsp" will probably fix it. I had made it "unsigned int" because of a clang issue with "unsigned long": CC arch/x86/entry/vdso/vdso32/vclock_gettime.o In file included from arch/x86/entry/vdso/vdso32/vclock_gettime.c:32: In file included from arch/x86/entry/vdso/vdso32/../vclock_gettime.c:15: In file included from ./arch/x86/include/asm/vgtod.h:5: In file included from ./include/linux/clocksource.h:12: In file included from ./include/linux/timex.h:56: In file included from ./include/uapi/linux/timex.h:56: In file included from ./include/linux/time.h:5: In file included from ./include/linux/seqlock.h:35: In file included from ./include/linux/spinlock.h:50: In file included from ./include/linux/preempt.h:10: In file included from ./include/linux/list.h:8: In file included from ./include/linux/kernel.h:10: In file included from ./include/linux/bitops.h:37: In file included from ./arch/x86/include/asm/bitops.h:16: In file included from ./arch/x86/include/asm/alternative.h:9: ./arch/x86/include/asm/asm.h:142:42: error: register 'rsp' unsuitable for global register variables on this target register unsigned long __asm_call_sp asm("rsp"); And I think we saw the same error in the realmode code. So we may need to tweak the macro a bit. -- Josh
next prev parent reply other threads:[~2017-09-28 16:44 UTC|newest] Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-09-28 7:47 kernel test robot 2017-09-28 7:59 ` Ingo Molnar 2017-09-28 8:18 ` Peter Zijlstra 2017-09-28 8:49 ` Ingo Molnar 2017-09-28 11:49 ` Peter Zijlstra 2017-09-28 16:21 ` Linus Torvalds 2017-09-28 16:44 ` Josh Poimboeuf [this message] 2017-09-28 17:01 ` Josh Poimboeuf 2017-09-28 19:10 ` Josh Poimboeuf 2017-09-28 21:58 ` [PATCH] x86/asm: Fix inline asm call constraints for GCC 4.4 Josh Poimboeuf 2017-09-28 23:53 ` Linus Torvalds 2017-09-29 1:40 ` Josh Poimboeuf 2017-09-29 8:01 ` Ingo Molnar 2017-09-29 10:32 ` Ye Xiaolong 2017-09-29 7:51 ` Ingo Molnar 2017-09-29 15:29 ` Arnd Bergmann 2017-09-29 9:27 ` [tip:x86/urgent] " tip-bot for Josh Poimboeuf 2017-09-29 11:18 ` tip-bot for Josh Poimboeuf
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20170928164422.sl4z4sfbkyscbxrk@treble \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --subject='Re: [lkp-robot] [x86/asm] f5caf621ee: PANIC:double_fault' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.