* [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" @ 2017-07-12 21:27 Matthias Kaehlcke 2017-07-12 22:12 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-12 21:27 UTC (permalink / raw) To: Chris J Arges, Josh Poimboeuf, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin Cc: x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer, Matthias Kaehlcke Commit f05058c4d652 supposedly "forces a stack frame to be created before the inline asm code if CONFIG_FRAME_POINTER is enabled by listing the stack pointer as an output operand for the get_user() inline assembly statement.". This doesn't work as intended, at least with gcc v4.9.2 and x86-64 the generated code is exactly the same with and without the patch. However clang adds an extra instruction that adjusts %rsp, which ends up causing double faults all over the place. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> --- arch/x86/include/asm/uaccess.h | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 476ea27f490b..9ec2beab73df 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -161,11 +161,10 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) ({ \ int __ret_gu; \ register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ - register void *__sp asm(_ASM_SP); \ __chk_user_ptr(ptr); \ might_fault(); \ - asm volatile("call __get_user_%P4" \ - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ + asm volatile("call __get_user_%P3" \ + : "=a" (__ret_gu), "=r" (__val_gu) \ : "0" (ptr), "i" (sizeof(*(ptr)))); \ (x) = (__force __typeof__(*(ptr))) __val_gu; \ __builtin_expect(__ret_gu, 0); \ -- 2.13.2.932.g7449e964c-goog ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-12 21:27 [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" Matthias Kaehlcke @ 2017-07-12 22:12 ` Josh Poimboeuf 2017-07-12 22:20 ` Matthias Kaehlcke 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-12 22:12 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Wed, Jul 12, 2017 at 02:27:44PM -0700, Matthias Kaehlcke wrote: > Commit f05058c4d652 supposedly "forces a stack frame to be created before > the inline asm code if CONFIG_FRAME_POINTER is enabled by listing the > stack pointer as an output operand for the get_user() inline assembly > statement.". This doesn't work as intended, at least with gcc v4.9.2 and > x86-64 the generated code is exactly the same with and without the patch. > However clang adds an extra instruction that adjusts %rsp, which ends up > causing double faults all over the place. I don't think reverting it is the right approach, because that will still break frame pointers in certain cases. The original commit probably should have clarified: " ... forces a stack frame *if it doesn't already exist*." In *most* cases it will have no effect, as you saw, because users of get_user() tend to do other function calls beforehand, so they will have already saved the frame pointer before calling it. However, that isn't always the case. We found that certain configs change GCC's behavior such that, for certain get_user() call sites, the containing function doesn't saved the frame pointer before inserting get_user()'s inline asm. GCC completely ignores inline asm, so it has no idea that it has a call instruction in it. So in general, *any* inline asm with a call instruction needs this constraint, to force the frame pointer to be saved, if it hasn't already. This is admittedly an awkward way of achieving this goal, but it's the only way I know how to do it with GCC. What extra instruction does clang add? -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-12 22:12 ` Josh Poimboeuf @ 2017-07-12 22:20 ` Matthias Kaehlcke 2017-07-12 22:35 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-12 22:20 UTC (permalink / raw) To: Josh Poimboeuf Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer Hi Josh, thanks for your prompt reply. El Wed, Jul 12, 2017 at 05:12:42PM -0500 Josh Poimboeuf ha dit: > On Wed, Jul 12, 2017 at 02:27:44PM -0700, Matthias Kaehlcke wrote: > > Commit f05058c4d652 supposedly "forces a stack frame to be created before > > the inline asm code if CONFIG_FRAME_POINTER is enabled by listing the > > stack pointer as an output operand for the get_user() inline assembly > > statement.". This doesn't work as intended, at least with gcc v4.9.2 and > > x86-64 the generated code is exactly the same with and without the patch. > > However clang adds an extra instruction that adjusts %rsp, which ends up > > causing double faults all over the place. > > I don't think reverting it is the right approach, because that will > still break frame pointers in certain cases. > > The original commit probably should have clarified: > > " ... forces a stack frame *if it doesn't already exist*." > > > In *most* cases it will have no effect, as you saw, because users of > get_user() tend to do other function calls beforehand, so they will have > already saved the frame pointer before calling it. > > However, that isn't always the case. We found that certain configs > change GCC's behavior such that, for certain get_user() call sites, the > containing function doesn't saved the frame pointer before inserting > get_user()'s inline asm. > > GCC completely ignores inline asm, so it has no idea that it has a call > instruction in it. So in general, *any* inline asm with a call > instruction needs this constraint, to force the frame pointer to be > saved, if it hasn't already. Thanks for the clarification! > This is admittedly an awkward way of achieving this goal, but it's the > only way I know how to do it with GCC. > > What extra instruction does clang add? I was looking at the get_user() call in drm_mode_setcrtc(). The code generated by clang without the patch is: if (get_user(out_id, &set_connectors_ptr[i])) { ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax ffffffff8138695c: 00 ffffffff8138695d: 49 03 06 add (%r14),%rax ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4> And with the patch: if (get_user(out_id, &set_connectors_ptr[i])) { ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax ffffffff81386a5d: 00 ffffffff81386a5e: 49 03 06 add (%r14),%rax ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp ffffffff81386a66: e8 15 a5 f0 ff callq ffffffff81290f80 <__get_user_4> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-12 22:20 ` Matthias Kaehlcke @ 2017-07-12 22:35 ` Josh Poimboeuf 2017-07-12 22:36 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-12 22:35 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Wed, Jul 12, 2017 at 03:20:40PM -0700, Matthias Kaehlcke wrote: > > This is admittedly an awkward way of achieving this goal, but it's the > > only way I know how to do it with GCC. > > > > What extra instruction does clang add? > > I was looking at the get_user() call in drm_mode_setcrtc(). The code > generated by clang without the patch is: > > if (get_user(out_id, &set_connectors_ptr[i])) { > ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > ffffffff8138695c: 00 > ffffffff8138695d: 49 03 06 add (%r14),%rax > ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4> > > And with the patch: > > if (get_user(out_id, &set_connectors_ptr[i])) { > ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > ffffffff81386a5d: 00 > ffffffff81386a5e: 49 03 06 add (%r14),%rax > ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp > ffffffff81386a66: e8 15 a5 f0 ff callq > ffffffff81290f80 <__get_user_4> Hm, that seems odd. Can you sure the disassembly for the whole function? -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-12 22:35 ` Josh Poimboeuf @ 2017-07-12 22:36 ` Josh Poimboeuf 2017-07-12 23:22 ` Matthias Kaehlcke 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-12 22:36 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Wed, Jul 12, 2017 at 05:35:47PM -0500, Josh Poimboeuf wrote: > On Wed, Jul 12, 2017 at 03:20:40PM -0700, Matthias Kaehlcke wrote: > > > This is admittedly an awkward way of achieving this goal, but it's the > > > only way I know how to do it with GCC. > > > > > > What extra instruction does clang add? > > > > I was looking at the get_user() call in drm_mode_setcrtc(). The code > > generated by clang without the patch is: > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > ffffffff8138695c: 00 > > ffffffff8138695d: 49 03 06 add (%r14),%rax > > ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4> > > > > And with the patch: > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > ffffffff81386a5d: 00 > > ffffffff81386a5e: 49 03 06 add (%r14),%rax > > ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp > > ffffffff81386a66: e8 15 a5 f0 ff callq > > ffffffff81290f80 <__get_user_4> > > Hm, that seems odd. Can you sure the disassembly for the whole > function? Er, share :-) -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-12 22:36 ` Josh Poimboeuf @ 2017-07-12 23:22 ` Matthias Kaehlcke 2017-07-13 18:00 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-12 23:22 UTC (permalink / raw) To: Josh Poimboeuf Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer El Wed, Jul 12, 2017 at 05:36:30PM -0500 Josh Poimboeuf ha dit: > On Wed, Jul 12, 2017 at 05:35:47PM -0500, Josh Poimboeuf wrote: > > On Wed, Jul 12, 2017 at 03:20:40PM -0700, Matthias Kaehlcke wrote: > > > > This is admittedly an awkward way of achieving this goal, but it's the > > > > only way I know how to do it with GCC. > > > > > > > > What extra instruction does clang add? > > > > > > I was looking at the get_user() call in drm_mode_setcrtc(). The code > > > generated by clang without the patch is: > > > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > > ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > > ffffffff8138695c: 00 > > > ffffffff8138695d: 49 03 06 add (%r14),%rax > > > ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4> > > > > > > And with the patch: > > > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > > ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > > ffffffff81386a5d: 00 > > > ffffffff81386a5e: 49 03 06 add (%r14),%rax > > > ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp > > > ffffffff81386a66: e8 15 a5 f0 ff callq > > > ffffffff81290f80 <__get_user_4> > > > > Hm, that seems odd. Can you sure the disassembly for the whole > > function? > > Er, share :-) Sure, please find below the disassemblies with and without the patch. The exact extra instruction differs from the one above, the disassembly above is from a debug session with some 'random' kernel version (bisect), the ones below from a v4.12ish kernel. At the bottom you also find a log of a double faults observed with the patch. If you are interested in building the kernel with clang yourself I can provide instructions, it is fairly painless nowadays as long as you have a recent version of clang (a somewhat older version should also do for this issue with some extra kernel patches). With f05058c4d652: ffffffff81367550 <drm_mode_setcrtc>: * Returns: * Zero on success, negative errno on failure. */ int drm_mode_setcrtc(struct drm_device *dev, void *data, struct drm_file *file_priv) { ffffffff81367550: e8 db 71 2e 00 callq ffffffff8164e730 <__fentry__> ffffffff81367555: 55 push %rbp ffffffff81367556: 48 89 e5 mov %rsp,%rbp ffffffff81367559: 41 57 push %r15 ffffffff8136755b: 41 56 push %r14 ffffffff8136755d: 41 55 push %r13 ffffffff8136755f: 41 54 push %r12 ffffffff81367561: 53 push %rbx ffffffff81367562: 48 81 ec c0 00 00 00 sub $0xc0,%rsp ffffffff81367569: 49 89 f6 mov %rsi,%r14 ffffffff8136756c: 48 89 fb mov %rdi,%rbx #define DRM_SWITCH_POWER_DYNAMIC_OFF 3 static __inline__ int drm_core_check_feature(struct drm_device *dev, int feature) { return ((dev->driver->driver_features & feature) ? 1 : 0); ffffffff8136756f: 48 8b 43 20 mov 0x20(%rbx),%rax uint32_t __user *set_connectors_ptr; struct drm_modeset_acquire_ctx ctx; int ret; int i; if (!drm_core_check_feature(dev, DRIVER_MODESET)) ffffffff81367573: f6 80 81 01 00 00 20 testb $0x20,0x181(%rax) ffffffff8136757a: 75 09 jne ffffffff81367585 <drm_mode_setcrtc+0x35> ffffffff8136757c: 6a ea pushq $0xffffffffffffffea ffffffff8136757e: 41 5c pop %r12 ffffffff81367580: e9 d5 04 00 00 jmpq ffffffff81367a5a <drm_mode_setcrtc+0x50a> ffffffff81367585: 6a de pushq $0xffffffffffffffde ffffffff81367587: 41 5c pop %r12 /* * Universal plane src offsets are only 16.16, prevent havoc for * drivers using universal plane code internally. */ if (crtc_req->x & 0xffff0000 || crtc_req->y & 0xffff0000) ffffffff81367589: 41 81 7e 14 ff ff 00 cmpl $0xffff,0x14(%r14) ffffffff81367590: 00 ffffffff81367591: 0f 87 c3 04 00 00 ja ffffffff81367a5a <drm_mode_setcrtc+0x50a> ffffffff81367597: 41 81 7e 18 ff ff 00 cmpl $0xffff,0x18(%r14) ffffffff8136759e: 00 ffffffff8136759f: 0f 87 b5 04 00 00 ja ffffffff81367a5a <drm_mode_setcrtc+0x50a> return -ERANGE; crtc = drm_crtc_find(dev, crtc_req->crtc_id); ffffffff813675a5: 41 8b 76 0c mov 0xc(%r14),%esi ffffffff813675a9: 48 89 df mov %rbx,%rdi ffffffff813675ac: e8 15 fe ff ff callq ffffffff813673c6 <drm_crtc_find> ffffffff813675b1: 49 89 c5 mov %rax,%r13 if (!crtc) { ffffffff813675b4: 4d 85 ed test %r13,%r13 ffffffff813675b7: 0f 84 7d 04 00 00 je ffffffff81367a3a <drm_mode_setcrtc+0x4ea> ffffffff813675bd: 48 89 5d b0 mov %rbx,-0x50(%rbp) DRM_DEBUG_KMS("Unknown CRTC ID %d\n", crtc_req->crtc_id); return -ENOENT; } DRM_DEBUG_KMS("[CRTC:%d:%s]\n", crtc->base.id, crtc->name); ffffffff813675c1: 41 8b 4d 78 mov 0x78(%r13),%ecx ffffffff813675c5: 4d 8b 45 20 mov 0x20(%r13),%r8 ffffffff813675c9: 6a 04 pushq $0x4 ffffffff813675cb: 5e pop %rsi ffffffff813675cc: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff813675d3: 48 c7 c2 e8 55 9b 81 mov $0xffffffff819b55e8,%rdx ffffffff813675da: 31 c0 xor %eax,%eax ffffffff813675dc: e8 45 c6 ff ff callq ffffffff81363c26 <drm_printk> mutex_lock(&crtc->dev->mode_config.mutex); ffffffff813675e1: 49 8b 7d 00 mov 0x0(%r13),%rdi ffffffff813675e5: 48 81 c7 b0 02 00 00 add $0x2b0,%rdi ffffffff813675ec: e8 b2 3c 2e 00 callq ffffffff8164b2a3 <mutex_lock> ffffffff813675f1: 48 8d 9d 38 ff ff ff lea -0xc8(%rbp),%rbx drm_modeset_acquire_init(&ctx, 0); ffffffff813675f8: 31 f6 xor %esi,%esi ffffffff813675fa: 48 89 df mov %rbx,%rdi ffffffff813675fd: e8 ef 7d 00 00 callq ffffffff8136f3f1 <drm_modeset_acquire_init> ffffffff81367602: 49 8d 46 24 lea 0x24(%r14),%rax ffffffff81367606: 48 89 45 a0 mov %rax,-0x60(%rbp) ffffffff8136760a: 31 c0 xor %eax,%eax ffffffff8136760c: 48 89 45 c0 mov %rax,-0x40(%rbp) ffffffff81367610: 31 c0 xor %eax,%eax ffffffff81367612: 48 89 45 d0 mov %rax,-0x30(%rbp) ffffffff81367616: 31 c0 xor %eax,%eax ffffffff81367618: 48 89 45 c8 mov %rax,-0x38(%rbp) ffffffff8136761c: 45 31 ff xor %r15d,%r15d ffffffff8136761f: 48 89 45 a8 mov %rax,-0x58(%rbp) ffffffff81367623: 4c 89 6d b8 mov %r13,-0x48(%rbp) ffffffff81367627: e9 0a 03 00 00 jmpq ffffffff81367936 <drm_mode_setcrtc+0x3e6> ffffffff8136762c: 48 8d 9d 38 ff ff ff lea -0xc8(%rbp),%rbx } } kfree(connector_set); drm_mode_destroy(dev, mode); if (ret == -EDEADLK) { drm_modeset_backoff(&ctx); ffffffff81367633: 48 89 df mov %rbx,%rdi ffffffff81367636: e8 88 7e 00 00 callq ffffffff8136f4c3 <drm_modeset_backoff> ffffffff8136763b: e9 f6 02 00 00 jmpq ffffffff81367936 <drm_mode_setcrtc+0x3e6> goto out; if (crtc_req->mode_valid) { /* If we have a mode we need a framebuffer. */ /* If we pass -1, set the mode with the currently bound fb */ if (crtc_req->fb_id == -1) { if (!crtc->primary->fb) { ffffffff81367640: 49 8b 85 98 00 00 00 mov 0x98(%r13),%rax ffffffff81367647: 48 8b 98 b0 00 00 00 mov 0xb0(%rax),%rbx ffffffff8136764e: 48 85 db test %rbx,%rbx ffffffff81367651: 0f 84 19 01 00 00 je ffffffff81367770 <drm_mode_setcrtc+0x220> * * This function increments the framebuffer's reference count. */ static inline void drm_framebuffer_get(struct drm_framebuffer *fb) { drm_mode_object_get(&fb->base); ffffffff81367657: 48 8d 7b 18 lea 0x18(%rbx),%rdi ffffffff8136765b: e8 97 de 00 00 callq ffffffff813754f7 <drm_mode_object_get> ffffffff81367660: 48 89 5d d0 mov %rbx,-0x30(%rbp) ffffffff81367664: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff81367668: 48 8b 5d b0 mov -0x50(%rbp),%rbx ret = -ENOENT; goto out; } } mode = drm_mode_create(dev); ffffffff8136766c: 48 89 df mov %rbx,%rdi ffffffff8136766f: e8 32 08 00 00 callq ffffffff81367ea6 <drm_mode_create> if (!mode) { ffffffff81367674: 48 85 c0 test %rax,%rax ffffffff81367677: 74 38 je ffffffff813676b1 <drm_mode_setcrtc+0x161> ffffffff81367679: 48 89 c1 mov %rax,%rcx ret = -ENOMEM; goto out; } ret = drm_mode_convert_umode(mode, &crtc_req->mode); ffffffff8136767c: 48 89 4d c0 mov %rcx,-0x40(%rbp) ffffffff81367680: 48 89 c7 mov %rax,%rdi ffffffff81367683: 48 8b 75 a0 mov -0x60(%rbp),%rsi ffffffff81367687: e8 3a 1b 00 00 callq ffffffff813691c6 <drm_mode_convert_umode> ffffffff8136768c: 41 89 c4 mov %eax,%r12d if (ret) { ffffffff8136768f: 45 85 e4 test %r12d,%r12d ffffffff81367692: 74 2c je ffffffff813676c0 <drm_mode_setcrtc+0x170> DRM_DEBUG_KMS("Invalid mode\n"); ffffffff81367694: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff8136769b: 48 c7 c2 1c 7b 9b 81 mov $0xffffffff819b7b1c,%rdx ffffffff813676a2: 31 c0 xor %eax,%eax ffffffff813676a4: 6a 04 pushq $0x4 ffffffff813676a6: 5e pop %rsi ffffffff813676a7: e8 7a c5 ff ff callq ffffffff81363c26 <drm_printk> ffffffff813676ac: e9 a0 02 00 00 jmpq ffffffff81367951 <drm_mode_setcrtc+0x401> ffffffff813676b1: 31 c0 xor %eax,%eax ffffffff813676b3: 48 89 45 c0 mov %rax,-0x40(%rbp) ffffffff813676b7: 6a f4 pushq $0xfffffffffffffff4 ffffffff813676b9: 41 5c pop %r12 ffffffff813676bb: e9 91 02 00 00 jmpq ffffffff81367951 <drm_mode_setcrtc+0x401> * Drivers not implementing the universal planes API use a * default formats list provided by the DRM core which doesn't * match real hardware capabilities. Skip the check in that * case. */ if (!crtc->primary->format_default) { ffffffff813676c0: 49 8b bd 98 00 00 00 mov 0x98(%r13),%rdi ffffffff813676c7: 80 bf a4 00 00 00 00 cmpb $0x0,0xa4(%rdi) ffffffff813676ce: 0f 84 cf 00 00 00 je ffffffff813677a3 <drm_mode_setcrtc+0x253> &format_name)); goto out; } } ret = drm_crtc_check_viewport(crtc, crtc_req->x, crtc_req->y, ffffffff813676d4: 41 8b 76 14 mov 0x14(%r14),%esi ffffffff813676d8: 41 8b 56 18 mov 0x18(%r14),%edx ffffffff813676dc: 4c 89 ef mov %r13,%rdi ffffffff813676df: 48 8b 4d c0 mov -0x40(%rbp),%rcx ffffffff813676e3: 4c 8b 45 d0 mov -0x30(%rbp),%r8 ffffffff813676e7: e8 de fd ff ff callq ffffffff813674ca <drm_crtc_check_viewport> ffffffff813676ec: 41 89 c4 mov %eax,%r12d mode, fb); if (ret) ffffffff813676ef: 45 85 e4 test %r12d,%r12d ffffffff813676f2: 0f 85 59 02 00 00 jne ffffffff81367951 <drm_mode_setcrtc+0x401> goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff813676f8: 41 8b 4e 08 mov 0x8(%r14),%ecx ffffffff813676fc: 48 8b 45 c0 mov -0x40(%rbp),%rax ffffffff81367700: 48 85 c0 test %rax,%rax ffffffff81367703: 74 1e je ffffffff81367723 <drm_mode_setcrtc+0x1d3> ffffffff81367705: 85 c9 test %ecx,%ecx ffffffff81367707: 75 1a jne ffffffff81367723 <drm_mode_setcrtc+0x1d3> DRM_DEBUG_KMS("Count connectors is 0 but mode set\n"); ffffffff81367709: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff81367710: 48 c7 c2 43 7b 9b 81 mov $0xffffffff819b7b43,%rdx ffffffff81367717: 31 c0 xor %eax,%eax ffffffff81367719: 6a 04 pushq $0x4 ffffffff8136771b: 5e pop %rsi ffffffff8136771c: e8 05 c5 ff ff callq ffffffff81363c26 <drm_printk> ffffffff81367721: eb 44 jmp ffffffff81367767 <drm_mode_setcrtc+0x217> if (ret) goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff81367723: 48 85 c0 test %rax,%rax ffffffff81367726: 0f 95 c0 setne %al DRM_DEBUG_KMS("Count connectors is 0 but mode set\n"); ret = -EINVAL; goto out; } if (crtc_req->count_connectors > 0 && (!mode || !fb)) { ffffffff81367729: 48 83 7d d0 00 cmpq $0x0,-0x30(%rbp) ffffffff8136772e: 0f 95 c2 setne %dl if (ret) goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff81367731: 85 c9 test %ecx,%ecx DRM_DEBUG_KMS("Count connectors is 0 but mode set\n"); ret = -EINVAL; goto out; } if (crtc_req->count_connectors > 0 && (!mode || !fb)) { ffffffff81367733: 74 1e je ffffffff81367753 <drm_mode_setcrtc+0x203> ffffffff81367735: 20 d0 and %dl,%al ffffffff81367737: 75 1a jne ffffffff81367753 <drm_mode_setcrtc+0x203> DRM_DEBUG_KMS("Count connectors is %d but no mode or fb set\n", ffffffff81367739: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff81367740: 48 c7 c2 67 7b 9b 81 mov $0xffffffff819b7b67,%rdx ffffffff81367747: 31 c0 xor %eax,%eax ffffffff81367749: 6a 04 pushq $0x4 ffffffff8136774b: 5e pop %rsi ffffffff8136774c: e8 d5 c4 ff ff callq ffffffff81363c26 <drm_printk> ffffffff81367751: eb 14 jmp ffffffff81367767 <drm_mode_setcrtc+0x217> if (ret) goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff81367753: 85 c9 test %ecx,%ecx crtc_req->count_connectors); ret = -EINVAL; goto out; } if (crtc_req->count_connectors > 0) { ffffffff81367755: 74 3e je ffffffff81367795 <drm_mode_setcrtc+0x245> u32 out_id; /* Avoid unbounded kernel memory allocation */ if (crtc_req->count_connectors > config->num_connector) { ffffffff81367757: 48 8b 45 b0 mov -0x50(%rbp),%rax ffffffff8136775b: 3b 88 10 04 00 00 cmp 0x410(%rax),%ecx ffffffff81367761: 0f 86 8d 00 00 00 jbe ffffffff813677f4 <drm_mode_setcrtc+0x2a4> ffffffff81367767: 6a ea pushq $0xffffffffffffffea ffffffff81367769: 41 5c pop %r12 ffffffff8136776b: e9 da 01 00 00 jmpq ffffffff8136794a <drm_mode_setcrtc+0x3fa> if (crtc_req->mode_valid) { /* If we have a mode we need a framebuffer. */ /* If we pass -1, set the mode with the currently bound fb */ if (crtc_req->fb_id == -1) { if (!crtc->primary->fb) { DRM_DEBUG_KMS("CRTC doesn't have current FB\n"); ffffffff81367770: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff81367777: 48 c7 c2 ed 7a 9b 81 mov $0xffffffff819b7aed,%rdx ffffffff8136777e: 31 c0 xor %eax,%eax ffffffff81367780: 6a 04 pushq $0x4 ffffffff81367782: 5e pop %rsi ffffffff81367783: e8 9e c4 ff ff callq ffffffff81363c26 <drm_printk> ffffffff81367788: 6a ea pushq $0xffffffffffffffea ffffffff8136778a: 41 5c pop %r12 ffffffff8136778c: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff81367790: e9 b5 01 00 00 jmpq ffffffff8136794a <drm_mode_setcrtc+0x3fa> ffffffff81367795: 31 d2 xor %edx,%edx ffffffff81367797: 48 8d b5 38 ff ff ff lea -0xc8(%rbp),%rsi ffffffff8136779e: e9 46 01 00 00 jmpq ffffffff813678e9 <drm_mode_setcrtc+0x399> * match real hardware capabilities. Skip the check in that * case. */ if (!crtc->primary->format_default) { ret = drm_plane_check_pixel_format(crtc->primary, fb->format->format); ffffffff813677a3: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff813677a7: 48 8b 40 38 mov 0x38(%rax),%rax ffffffff813677ab: 8b 30 mov (%rax),%esi * default formats list provided by the DRM core which doesn't * match real hardware capabilities. Skip the check in that * case. */ if (!crtc->primary->format_default) { ret = drm_plane_check_pixel_format(crtc->primary, ffffffff813677ad: e8 40 f4 00 00 callq ffffffff81376bf2 <drm_plane_check_pixel_format> ffffffff813677b2: 41 89 c4 mov %eax,%r12d fb->format->format); if (ret) { ffffffff813677b5: 45 85 e4 test %r12d,%r12d ffffffff813677b8: 0f 84 16 ff ff ff je ffffffff813676d4 <drm_mode_setcrtc+0x184> struct drm_format_name_buf format_name; DRM_DEBUG_KMS("Invalid pixel format %s\n", ffffffff813677be: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff813677c2: 48 8b 40 38 mov 0x38(%rax),%rax ffffffff813677c6: 8b 38 mov (%rax),%edi ffffffff813677c8: 48 8d b5 18 ff ff ff lea -0xe8(%rbp),%rsi ffffffff813677cf: e8 c7 03 00 00 callq ffffffff81367b9b <drm_get_format_name> ffffffff813677d4: 48 89 c1 mov %rax,%rcx ffffffff813677d7: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff813677de: 48 c7 c2 2a 7b 9b 81 mov $0xffffffff819b7b2a,%rdx ffffffff813677e5: 31 c0 xor %eax,%eax ffffffff813677e7: 6a 04 pushq $0x4 ffffffff813677e9: 5e pop %rsi ffffffff813677ea: e8 37 c4 ff ff callq ffffffff81363c26 <drm_printk> ffffffff813677ef: e9 5d 01 00 00 jmpq ffffffff81367951 <drm_mode_setcrtc+0x401> { if (size != 0 && n > SIZE_MAX / size) return NULL; if (__builtin_constant_p(n) && __builtin_constant_p(size)) return kmalloc(n * size, flags); return __kmalloc(n * size, flags); ffffffff813677f4: 48 c1 e1 03 shl $0x3,%rcx ffffffff813677f8: be c0 00 40 01 mov $0x14000c0,%esi ffffffff813677fd: 48 89 cf mov %rcx,%rdi ffffffff81367800: e8 85 78 dd ff callq ffffffff8113f08a <__kmalloc> } connector_set = kmalloc_array(crtc_req->count_connectors, sizeof(struct drm_connector *), GFP_KERNEL); if (!connector_set) { ffffffff81367805: 48 85 c0 test %rax,%rax ffffffff81367808: 48 8d b5 38 ff ff ff lea -0xc8(%rbp),%rsi ffffffff8136780f: 0f 84 c2 00 00 00 je ffffffff813678d7 <drm_mode_setcrtc+0x387> ffffffff81367815: 31 db xor %ebx,%ebx ffffffff81367817: 48 89 c2 mov %rax,%rdx ffffffff8136781a: 48 89 55 c8 mov %rdx,-0x38(%rbp) ffffffff8136781e: eb 35 jmp ffffffff81367855 <drm_mode_setcrtc+0x305> DRM_DEBUG_KMS("Connector id %d unknown\n", out_id); ret = -ENOENT; goto out; } DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", ffffffff81367820: 41 8b 4d 28 mov 0x28(%r13),%ecx ffffffff81367824: 4d 8b 45 48 mov 0x48(%r13),%r8 ffffffff81367828: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff8136782f: 48 c7 c2 94 57 9b 81 mov $0xffffffff819b5794,%rdx ffffffff81367836: 31 c0 xor %eax,%eax ffffffff81367838: 6a 04 pushq $0x4 ffffffff8136783a: 5e pop %rsi ffffffff8136783b: e8 e6 c3 ff ff callq ffffffff81363c26 <drm_printk> ffffffff81367840: 48 8b 45 c8 mov -0x38(%rbp),%rax connector->base.id, connector->name); connector_set[i] = connector; ffffffff81367844: 4e 89 2c f8 mov %r13,(%rax,%r15,8) if (!connector_set) { ret = -ENOMEM; goto out; } for (i = 0; i < crtc_req->count_connectors; i++) { ffffffff81367848: ff c3 inc %ebx ffffffff8136784a: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff8136784e: 48 8d b5 38 ff ff ff lea -0xc8(%rbp),%rsi ffffffff81367855: 41 8b 56 08 mov 0x8(%r14),%edx ffffffff81367859: 39 d3 cmp %edx,%ebx ffffffff8136785b: 0f 83 85 00 00 00 jae ffffffff813678e6 <drm_mode_setcrtc+0x396> connector_set[i] = NULL; ffffffff81367861: 4c 63 fb movslq %ebx,%r15 ffffffff81367864: 4a 83 24 f8 00 andq $0x0,(%rax,%r15,8) set_connectors_ptr = (uint32_t __user *)(unsigned long)crtc_req->set_connectors_ptr; if (get_user(out_id, &set_connectors_ptr[i])) { ffffffff81367869: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax ffffffff81367870: 00 ffffffff81367871: 49 03 06 add (%r14),%rax ffffffff81367874: 48 8b 65 a8 mov -0x58(%rbp),%rsp ffffffff81367878: e8 73 26 f1 ff callq ffffffff81279ef0 <__get_user_4> ffffffff8136787d: 49 89 d4 mov %rdx,%r12 ffffffff81367880: 48 89 65 a8 mov %rsp,-0x58(%rbp) ffffffff81367884: 85 c0 test %eax,%eax ffffffff81367886: 0f 85 a0 00 00 00 jne ffffffff8136792c <drm_mode_setcrtc+0x3dc> */ static inline struct drm_connector *drm_connector_lookup(struct drm_device *dev, uint32_t id) { struct drm_mode_object *mo; mo = drm_mode_object_find(dev, id, DRM_MODE_OBJECT_CONNECTOR); ffffffff8136788c: ba c0 c0 c0 c0 mov $0xc0c0c0c0,%edx ffffffff81367891: 48 8b 7d b0 mov -0x50(%rbp),%rdi ffffffff81367895: 44 89 e6 mov %r12d,%esi ffffffff81367898: e8 f7 db 00 00 callq ffffffff81375494 <drm_mode_object_find> ffffffff8136789d: 49 89 c5 mov %rax,%r13 return mo ? obj_to_connector(mo) : NULL; ffffffff813678a0: 4d 85 ed test %r13,%r13 ret = -EFAULT; goto out; } connector = drm_connector_lookup(dev, out_id); if (!connector) { ffffffff813678a3: 74 0a je ffffffff813678af <drm_mode_setcrtc+0x35f> ffffffff813678a5: 49 83 c5 d8 add $0xffffffffffffffd8,%r13 ffffffff813678a9: 0f 85 71 ff ff ff jne ffffffff81367820 <drm_mode_setcrtc+0x2d0> DRM_DEBUG_KMS("Connector id %d unknown\n", ffffffff813678af: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff813678b6: 48 c7 c2 95 7b 9b 81 mov $0xffffffff819b7b95,%rdx ffffffff813678bd: 31 c0 xor %eax,%eax ffffffff813678bf: 6a 04 pushq $0x4 ffffffff813678c1: 5e pop %rsi ffffffff813678c2: 44 89 e1 mov %r12d,%ecx ffffffff813678c5: e8 5c c3 ff ff callq ffffffff81363c26 <drm_printk> ffffffff813678ca: 6a fe pushq $0xfffffffffffffffe ffffffff813678cc: 41 5c pop %r12 ffffffff813678ce: 4c 8b 7d c8 mov -0x38(%rbp),%r15 ffffffff813678d2: e9 b5 fe ff ff jmpq ffffffff8136778c <drm_mode_setcrtc+0x23c> if (crtc_req->count_connectors > config->num_connector) { ret = -EINVAL; goto out; } connector_set = kmalloc_array(crtc_req->count_connectors, ffffffff813678d7: 49 89 c7 mov %rax,%r15 ffffffff813678da: 6a f4 pushq $0xfffffffffffffff4 ffffffff813678dc: 41 5c pop %r12 ffffffff813678de: 31 c0 xor %eax,%eax ffffffff813678e0: 48 89 45 c8 mov %rax,-0x38(%rbp) ffffffff813678e4: eb 64 jmp ffffffff8136794a <drm_mode_setcrtc+0x3fa> ffffffff813678e6: 49 89 c7 mov %rax,%r15 connector_set[i] = connector; } } set.crtc = crtc; ffffffff813678e9: 4c 89 ad 78 ff ff ff mov %r13,-0x88(%rbp) set.x = crtc_req->x; ffffffff813678f0: 41 8b 4e 14 mov 0x14(%r14),%ecx ffffffff813678f4: 89 4d 88 mov %ecx,-0x78(%rbp) set.y = crtc_req->y; ffffffff813678f7: 41 8b 4e 18 mov 0x18(%r14),%ecx ffffffff813678fb: 89 4d 8c mov %ecx,-0x74(%rbp) set.mode = mode; ffffffff813678fe: 48 8b 45 c0 mov -0x40(%rbp),%rax ffffffff81367902: 48 89 45 80 mov %rax,-0x80(%rbp) set.connectors = connector_set; ffffffff81367906: 4c 89 7d 90 mov %r15,-0x70(%rbp) set.num_connectors = crtc_req->count_connectors; ffffffff8136790a: 89 d0 mov %edx,%eax ffffffff8136790c: 48 89 45 98 mov %rax,-0x68(%rbp) set.fb = fb; ffffffff81367910: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff81367914: 48 89 85 70 ff ff ff mov %rax,-0x90(%rbp) ret = __drm_mode_set_config_internal(&set, &ctx); ffffffff8136791b: 48 8d bd 70 ff ff ff lea -0x90(%rbp),%rdi ffffffff81367922: e8 bf fa ff ff callq ffffffff813673e6 <__drm_mode_set_config_internal> ffffffff81367927: 41 89 c4 mov %eax,%r12d ffffffff8136792a: eb 1e jmp ffffffff8136794a <drm_mode_setcrtc+0x3fa> ffffffff8136792c: 6a f2 pushq $0xfffffffffffffff2 ffffffff8136792e: 41 5c pop %r12 ffffffff81367930: 4c 8b 7d c8 mov -0x38(%rbp),%r15 ffffffff81367934: eb 14 jmp ffffffff8136794a <drm_mode_setcrtc+0x3fa> DRM_DEBUG_KMS("[CRTC:%d:%s]\n", crtc->base.id, crtc->name); mutex_lock(&crtc->dev->mode_config.mutex); drm_modeset_acquire_init(&ctx, 0); retry: ret = drm_modeset_lock_all_ctx(crtc->dev, &ctx); ffffffff81367936: 49 8b 7d 00 mov 0x0(%r13),%rdi ffffffff8136793a: 48 89 de mov %rbx,%rsi ffffffff8136793d: e8 08 7b 00 00 callq ffffffff8136f44a <drm_modeset_lock_all_ctx> ffffffff81367942: 41 89 c4 mov %eax,%r12d if (ret) ffffffff81367945: 45 85 e4 test %r12d,%r12d ffffffff81367948: 74 16 je ffffffff81367960 <drm_mode_setcrtc+0x410> set.num_connectors = crtc_req->count_connectors; set.fb = fb; ret = __drm_mode_set_config_internal(&set, &ctx); out: if (fb) ffffffff8136794a: 48 83 7d d0 00 cmpq $0x0,-0x30(%rbp) ffffffff8136794f: 74 6b je ffffffff813679bc <drm_mode_setcrtc+0x46c> * This function decrements the framebuffer's reference count and frees the * framebuffer if the reference count drops to zero. */ static inline void drm_framebuffer_put(struct drm_framebuffer *fb) { drm_mode_object_put(&fb->base); ffffffff81367951: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff81367955: 48 8d 78 18 lea 0x18(%rax),%rdi ffffffff81367959: e8 46 db 00 00 callq ffffffff813754a4 <drm_mode_object_put> ffffffff8136795e: eb 62 jmp ffffffff813679c2 <drm_mode_setcrtc+0x472> drm_modeset_acquire_init(&ctx, 0); retry: ret = drm_modeset_lock_all_ctx(crtc->dev, &ctx); if (ret) goto out; if (crtc_req->mode_valid) { ffffffff81367960: 41 83 7e 20 00 cmpl $0x0,0x20(%r14) ffffffff81367965: 0f 84 8d fd ff ff je ffffffff813676f8 <drm_mode_setcrtc+0x1a8> /* If we have a mode we need a framebuffer. */ /* If we pass -1, set the mode with the currently bound fb */ if (crtc_req->fb_id == -1) { ffffffff8136796b: 41 8b 76 10 mov 0x10(%r14),%esi ffffffff8136796f: 83 fe ff cmp $0xffffffff,%esi ffffffff81367972: 0f 84 c8 fc ff ff je ffffffff81367640 <drm_mode_setcrtc+0xf0> ffffffff81367978: 48 8b 5d b0 mov -0x50(%rbp),%rbx } fb = crtc->primary->fb; /* Make refcounting symmetric with the lookup path. */ drm_framebuffer_get(fb); } else { fb = drm_framebuffer_lookup(dev, crtc_req->fb_id); ffffffff8136797c: 48 89 df mov %rbx,%rdi ffffffff8136797f: e8 a3 b3 00 00 callq ffffffff81372d27 <drm_framebuffer_lookup> ffffffff81367984: 48 89 c1 mov %rax,%rcx if (!fb) { ffffffff81367987: 48 89 4d d0 mov %rcx,-0x30(%rbp) ffffffff8136798b: 48 85 c0 test %rax,%rax ffffffff8136798e: 0f 85 d8 fc ff ff jne ffffffff8136766c <drm_mode_setcrtc+0x11c> DRM_DEBUG_KMS("Unknown FB ID%d\n", ffffffff81367994: 41 8b 4e 10 mov 0x10(%r14),%ecx ffffffff81367998: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff8136799f: 48 c7 c2 0b 7b 9b 81 mov $0xffffffff819b7b0b,%rdx ffffffff813679a6: 31 c0 xor %eax,%eax ffffffff813679a8: 6a 04 pushq $0x4 ffffffff813679aa: 5e pop %rsi ffffffff813679ab: e8 76 c2 ff ff callq ffffffff81363c26 <drm_printk> ffffffff813679b0: 31 c0 xor %eax,%eax ffffffff813679b2: 48 89 45 d0 mov %rax,-0x30(%rbp) ffffffff813679b6: 6a fe pushq $0xfffffffffffffffe ffffffff813679b8: 41 5c pop %r12 ffffffff813679ba: eb 06 jmp ffffffff813679c2 <drm_mode_setcrtc+0x472> ffffffff813679bc: 31 c0 xor %eax,%eax ffffffff813679be: 48 89 45 d0 mov %rax,-0x30(%rbp) out: if (fb) drm_framebuffer_put(fb); if (connector_set) { ffffffff813679c2: 4d 85 ff test %r15,%r15 ffffffff813679c5: 74 27 je ffffffff813679ee <drm_mode_setcrtc+0x49e> ffffffff813679c7: 31 db xor %ebx,%ebx ffffffff813679c9: eb 17 jmp ffffffff813679e2 <drm_mode_setcrtc+0x492> for (i = 0; i < crtc_req->count_connectors; i++) { if (connector_set[i]) ffffffff813679cb: 48 63 c3 movslq %ebx,%rax ffffffff813679ce: 49 8b 3c c7 mov (%r15,%rax,8),%rdi ffffffff813679d2: 48 85 ff test %rdi,%rdi ffffffff813679d5: 74 09 je ffffffff813679e0 <drm_mode_setcrtc+0x490> * This function decrements the connector's reference count and frees the * object if the reference count drops to zero. */ static inline void drm_connector_put(struct drm_connector *connector) { drm_mode_object_put(&connector->base); ffffffff813679d7: 48 83 c7 28 add $0x28,%rdi ffffffff813679db: e8 c4 da 00 00 callq ffffffff813754a4 <drm_mode_object_put> out: if (fb) drm_framebuffer_put(fb); if (connector_set) { for (i = 0; i < crtc_req->count_connectors; i++) { ffffffff813679e0: ff c3 inc %ebx ffffffff813679e2: 41 3b 5e 08 cmp 0x8(%r14),%ebx ffffffff813679e6: 72 e3 jb ffffffff813679cb <drm_mode_setcrtc+0x47b> ffffffff813679e8: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff813679ec: eb 03 jmp ffffffff813679f1 <drm_mode_setcrtc+0x4a1> ffffffff813679ee: 45 31 ff xor %r15d,%r15d if (connector_set[i]) drm_connector_put(connector_set[i]); } } kfree(connector_set); ffffffff813679f1: 48 8b 7d c8 mov -0x38(%rbp),%rdi ffffffff813679f5: e8 55 78 dd ff callq ffffffff8113f24f <kfree> drm_mode_destroy(dev, mode); ffffffff813679fa: 48 8b 7d b0 mov -0x50(%rbp),%rdi ffffffff813679fe: 48 8b 75 c0 mov -0x40(%rbp),%rsi ffffffff81367a02: e8 ef 04 00 00 callq ffffffff81367ef6 <drm_mode_destroy> if (ret == -EDEADLK) { ffffffff81367a07: 41 83 fc dd cmp $0xffffffdd,%r12d ffffffff81367a0b: 0f 84 1b fc ff ff je ffffffff8136762c <drm_mode_setcrtc+0xdc> ffffffff81367a11: 4c 8d b5 38 ff ff ff lea -0xc8(%rbp),%r14 drm_modeset_backoff(&ctx); goto retry; } drm_modeset_drop_locks(&ctx); ffffffff81367a18: 4c 89 f7 mov %r14,%rdi ffffffff81367a1b: e8 62 7b 00 00 callq ffffffff8136f582 <drm_modeset_drop_locks> drm_modeset_acquire_fini(&ctx); ffffffff81367a20: 4c 89 f7 mov %r14,%rdi ffffffff81367a23: e8 ad 7a 00 00 callq ffffffff8136f4d5 <drm_modeset_acquire_fini> mutex_unlock(&crtc->dev->mode_config.mutex); ffffffff81367a28: 49 8b 7d 00 mov 0x0(%r13),%rdi ffffffff81367a2c: 48 81 c7 b0 02 00 00 add $0x2b0,%rdi ffffffff81367a33: e8 b0 38 2e 00 callq ffffffff8164b2e8 <mutex_unlock> ffffffff81367a38: eb 20 jmp ffffffff81367a5a <drm_mode_setcrtc+0x50a> if (crtc_req->x & 0xffff0000 || crtc_req->y & 0xffff0000) return -ERANGE; crtc = drm_crtc_find(dev, crtc_req->crtc_id); if (!crtc) { DRM_DEBUG_KMS("Unknown CRTC ID %d\n", crtc_req->crtc_id); ffffffff81367a3a: 41 8b 4e 0c mov 0xc(%r14),%ecx ffffffff81367a3e: 6a 04 pushq $0x4 ffffffff81367a40: 5e pop %rsi ffffffff81367a41: 48 c7 c7 5d 82 95 81 mov $0xffffffff8195825d,%rdi ffffffff81367a48: 48 c7 c2 d9 7a 9b 81 mov $0xffffffff819b7ad9,%rdx ffffffff81367a4f: 31 c0 xor %eax,%eax ffffffff81367a51: e8 d0 c1 ff ff callq ffffffff81363c26 <drm_printk> ffffffff81367a56: 6a fe pushq $0xfffffffffffffffe ffffffff81367a58: 41 5c pop %r12 drm_modeset_drop_locks(&ctx); drm_modeset_acquire_fini(&ctx); mutex_unlock(&crtc->dev->mode_config.mutex); return ret; } ffffffff81367a5a: 44 89 e0 mov %r12d,%eax ffffffff81367a5d: 48 81 c4 c0 00 00 00 add $0xc0,%rsp ffffffff81367a64: 5b pop %rbx ffffffff81367a65: 41 5c pop %r12 ffffffff81367a67: 41 5d pop %r13 ffffffff81367a69: 41 5e pop %r14 ffffffff81367a6b: 41 5f pop %r15 ffffffff81367a6d: 5d pop %rbp ffffffff81367a6e: c3 retq ============================================================ Without f05058c4d652: ffffffff8136744c <drm_mode_setcrtc>: * Returns: * Zero on success, negative errno on failure. */ int drm_mode_setcrtc(struct drm_device *dev, void *data, struct drm_file *file_priv) { ffffffff8136744c: e8 1f 71 2e 00 callq ffffffff8164e570 <__fentry__> ffffffff81367451: 55 push %rbp ffffffff81367452: 48 89 e5 mov %rsp,%rbp ffffffff81367455: 41 57 push %r15 ffffffff81367457: 41 56 push %r14 ffffffff81367459: 41 55 push %r13 ffffffff8136745b: 41 54 push %r12 ffffffff8136745d: 53 push %rbx ffffffff8136745e: 48 81 ec b8 00 00 00 sub $0xb8,%rsp ffffffff81367465: 49 89 f6 mov %rsi,%r14 ffffffff81367468: 48 89 fb mov %rdi,%rbx #define DRM_SWITCH_POWER_DYNAMIC_OFF 3 static __inline__ int drm_core_check_feature(struct drm_device *dev, int feature) { return ((dev->driver->driver_features & feature) ? 1 : 0); ffffffff8136746b: 48 8b 43 20 mov 0x20(%rbx),%rax uint32_t __user *set_connectors_ptr; struct drm_modeset_acquire_ctx ctx; int ret; int i; if (!drm_core_check_feature(dev, DRIVER_MODESET)) ffffffff8136746f: f6 80 81 01 00 00 20 testb $0x20,0x181(%rax) ffffffff81367476: 75 09 jne ffffffff81367481 <drm_mode_setcrtc+0x35> ffffffff81367478: 6a ea pushq $0xffffffffffffffea ffffffff8136747a: 41 5c pop %r12 ffffffff8136747c: e9 be 04 00 00 jmpq ffffffff8136793f <drm_mode_setcrtc+0x4f3> ffffffff81367481: 6a de pushq $0xffffffffffffffde ffffffff81367483: 41 5c pop %r12 /* * Universal plane src offsets are only 16.16, prevent havoc for * drivers using universal plane code internally. */ if (crtc_req->x & 0xffff0000 || crtc_req->y & 0xffff0000) ffffffff81367485: 41 81 7e 14 ff ff 00 cmpl $0xffff,0x14(%r14) ffffffff8136748c: 00 ffffffff8136748d: 0f 87 ac 04 00 00 ja ffffffff8136793f <drm_mode_setcrtc+0x4f3> ffffffff81367493: 41 81 7e 18 ff ff 00 cmpl $0xffff,0x18(%r14) ffffffff8136749a: 00 ffffffff8136749b: 0f 87 9e 04 00 00 ja ffffffff8136793f <drm_mode_setcrtc+0x4f3> return -ERANGE; crtc = drm_crtc_find(dev, crtc_req->crtc_id); ffffffff813674a1: 41 8b 76 0c mov 0xc(%r14),%esi ffffffff813674a5: 48 89 df mov %rbx,%rdi ffffffff813674a8: e8 15 fe ff ff callq ffffffff813672c2 <drm_crtc_find> ffffffff813674ad: 49 89 c5 mov %rax,%r13 if (!crtc) { ffffffff813674b0: 4d 85 ed test %r13,%r13 ffffffff813674b3: 0f 84 66 04 00 00 je ffffffff8136791f <drm_mode_setcrtc+0x4d3> ffffffff813674b9: 48 89 5d b0 mov %rbx,-0x50(%rbp) DRM_DEBUG_KMS("Unknown CRTC ID %d\n", crtc_req->crtc_id); return -ENOENT; } DRM_DEBUG_KMS("[CRTC:%d:%s]\n", crtc->base.id, crtc->name); ffffffff813674bd: 41 8b 4d 78 mov 0x78(%r13),%ecx ffffffff813674c1: 4d 8b 45 20 mov 0x20(%r13),%r8 ffffffff813674c5: 6a 04 pushq $0x4 ffffffff813674c7: 5e pop %rsi ffffffff813674c8: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff813674cf: 48 c7 c2 d8 55 9b 81 mov $0xffffffff819b55d8,%rdx ffffffff813674d6: 31 c0 xor %eax,%eax ffffffff813674d8: e8 45 c6 ff ff callq ffffffff81363b22 <drm_printk> mutex_lock(&crtc->dev->mode_config.mutex); ffffffff813674dd: 49 8b 7d 00 mov 0x0(%r13),%rdi ffffffff813674e1: 48 81 c7 b0 02 00 00 add $0x2b0,%rdi ffffffff813674e8: e8 e6 3b 2e 00 callq ffffffff8164b0d3 <mutex_lock> ffffffff813674ed: 48 8d 9d 40 ff ff ff lea -0xc0(%rbp),%rbx drm_modeset_acquire_init(&ctx, 0); ffffffff813674f4: 31 f6 xor %esi,%esi ffffffff813674f6: 48 89 df mov %rbx,%rdi ffffffff813674f9: e8 d8 7d 00 00 callq ffffffff8136f2d6 <drm_modeset_acquire_init> ffffffff813674fe: 49 8d 46 24 lea 0x24(%r14),%rax ffffffff81367502: 48 89 45 a8 mov %rax,-0x58(%rbp) ffffffff81367506: 31 c0 xor %eax,%eax ffffffff81367508: 48 89 45 c0 mov %rax,-0x40(%rbp) ffffffff8136750c: 31 c0 xor %eax,%eax ffffffff8136750e: 48 89 45 d0 mov %rax,-0x30(%rbp) ffffffff81367512: 31 c0 xor %eax,%eax ffffffff81367514: 48 89 45 c8 mov %rax,-0x38(%rbp) ffffffff81367518: 45 31 ff xor %r15d,%r15d ffffffff8136751b: 4c 89 6d b8 mov %r13,-0x48(%rbp) ffffffff8136751f: e9 f7 02 00 00 jmpq ffffffff8136781b <drm_mode_setcrtc+0x3cf> ffffffff81367524: 48 8d 9d 40 ff ff ff lea -0xc0(%rbp),%rbx } } kfree(connector_set); drm_mode_destroy(dev, mode); if (ret == -EDEADLK) { drm_modeset_backoff(&ctx); ffffffff8136752b: 48 89 df mov %rbx,%rdi ffffffff8136752e: e8 75 7e 00 00 callq ffffffff8136f3a8 <drm_modeset_backoff> ffffffff81367533: e9 e3 02 00 00 jmpq ffffffff8136781b <drm_mode_setcrtc+0x3cf> goto out; if (crtc_req->mode_valid) { /* If we have a mode we need a framebuffer. */ /* If we pass -1, set the mode with the currently bound fb */ if (crtc_req->fb_id == -1) { if (!crtc->primary->fb) { ffffffff81367538: 49 8b 85 98 00 00 00 mov 0x98(%r13),%rax ffffffff8136753f: 48 8b 98 b0 00 00 00 mov 0xb0(%rax),%rbx ffffffff81367546: 48 85 db test %rbx,%rbx ffffffff81367549: 0f 84 19 01 00 00 je ffffffff81367668 <drm_mode_setcrtc+0x21c> * * This function increments the framebuffer's reference count. */ static inline void drm_framebuffer_get(struct drm_framebuffer *fb) { drm_mode_object_get(&fb->base); ffffffff8136754f: 48 8d 7b 18 lea 0x18(%rbx),%rdi ffffffff81367553: e8 45 de 00 00 callq ffffffff8137539d <drm_mode_object_get> ffffffff81367558: 48 89 5d d0 mov %rbx,-0x30(%rbp) ffffffff8136755c: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff81367560: 48 8b 5d b0 mov -0x50(%rbp),%rbx ret = -ENOENT; goto out; } } mode = drm_mode_create(dev); ffffffff81367564: 48 89 df mov %rbx,%rdi ffffffff81367567: e8 1f 08 00 00 callq ffffffff81367d8b <drm_mode_create> if (!mode) { ffffffff8136756c: 48 85 c0 test %rax,%rax ffffffff8136756f: 74 38 je ffffffff813675a9 <drm_mode_setcrtc+0x15d> ffffffff81367571: 48 89 c1 mov %rax,%rcx ret = -ENOMEM; goto out; } ret = drm_mode_convert_umode(mode, &crtc_req->mode); ffffffff81367574: 48 89 4d c0 mov %rcx,-0x40(%rbp) ffffffff81367578: 48 89 c7 mov %rax,%rdi ffffffff8136757b: 48 8b 75 a8 mov -0x58(%rbp),%rsi ffffffff8136757f: e8 27 1b 00 00 callq ffffffff813690ab <drm_mode_convert_umode> ffffffff81367584: 41 89 c4 mov %eax,%r12d if (ret) { ffffffff81367587: 45 85 e4 test %r12d,%r12d ffffffff8136758a: 74 2c je ffffffff813675b8 <drm_mode_setcrtc+0x16c> DRM_DEBUG_KMS("Invalid mode\n"); ffffffff8136758c: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff81367593: 48 c7 c2 0c 7b 9b 81 mov $0xffffffff819b7b0c,%rdx ffffffff8136759a: 31 c0 xor %eax,%eax ffffffff8136759c: 6a 04 pushq $0x4 ffffffff8136759e: 5e pop %rsi ffffffff8136759f: e8 7e c5 ff ff callq ffffffff81363b22 <drm_printk> ffffffff813675a4: e9 8d 02 00 00 jmpq ffffffff81367836 <drm_mode_setcrtc+0x3ea> ffffffff813675a9: 31 c0 xor %eax,%eax ffffffff813675ab: 48 89 45 c0 mov %rax,-0x40(%rbp) ffffffff813675af: 6a f4 pushq $0xfffffffffffffff4 ffffffff813675b1: 41 5c pop %r12 ffffffff813675b3: e9 7e 02 00 00 jmpq ffffffff81367836 <drm_mode_setcrtc+0x3ea> * Drivers not implementing the universal planes API use a * default formats list provided by the DRM core which doesn't * match real hardware capabilities. Skip the check in that * case. */ if (!crtc->primary->format_default) { ffffffff813675b8: 49 8b bd 98 00 00 00 mov 0x98(%r13),%rdi ffffffff813675bf: 80 bf a4 00 00 00 00 cmpb $0x0,0xa4(%rdi) ffffffff813675c6: 0f 84 cf 00 00 00 je ffffffff8136769b <drm_mode_setcrtc+0x24f> &format_name)); goto out; } } ret = drm_crtc_check_viewport(crtc, crtc_req->x, crtc_req->y, ffffffff813675cc: 41 8b 76 14 mov 0x14(%r14),%esi ffffffff813675d0: 41 8b 56 18 mov 0x18(%r14),%edx ffffffff813675d4: 4c 89 ef mov %r13,%rdi ffffffff813675d7: 48 8b 4d c0 mov -0x40(%rbp),%rcx ffffffff813675db: 4c 8b 45 d0 mov -0x30(%rbp),%r8 ffffffff813675df: e8 e2 fd ff ff callq ffffffff813673c6 <drm_crtc_check_viewport> ffffffff813675e4: 41 89 c4 mov %eax,%r12d mode, fb); if (ret) ffffffff813675e7: 45 85 e4 test %r12d,%r12d ffffffff813675ea: 0f 85 46 02 00 00 jne ffffffff81367836 <drm_mode_setcrtc+0x3ea> goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff813675f0: 41 8b 4e 08 mov 0x8(%r14),%ecx ffffffff813675f4: 48 8b 45 c0 mov -0x40(%rbp),%rax ffffffff813675f8: 48 85 c0 test %rax,%rax ffffffff813675fb: 74 1e je ffffffff8136761b <drm_mode_setcrtc+0x1cf> ffffffff813675fd: 85 c9 test %ecx,%ecx ffffffff813675ff: 75 1a jne ffffffff8136761b <drm_mode_setcrtc+0x1cf> DRM_DEBUG_KMS("Count connectors is 0 but mode set\n"); ffffffff81367601: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff81367608: 48 c7 c2 33 7b 9b 81 mov $0xffffffff819b7b33,%rdx ffffffff8136760f: 31 c0 xor %eax,%eax ffffffff81367611: 6a 04 pushq $0x4 ffffffff81367613: 5e pop %rsi ffffffff81367614: e8 09 c5 ff ff callq ffffffff81363b22 <drm_printk> ffffffff81367619: eb 44 jmp ffffffff8136765f <drm_mode_setcrtc+0x213> if (ret) goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff8136761b: 48 85 c0 test %rax,%rax ffffffff8136761e: 0f 95 c0 setne %al DRM_DEBUG_KMS("Count connectors is 0 but mode set\n"); ret = -EINVAL; goto out; } if (crtc_req->count_connectors > 0 && (!mode || !fb)) { ffffffff81367621: 48 83 7d d0 00 cmpq $0x0,-0x30(%rbp) ffffffff81367626: 0f 95 c2 setne %dl if (ret) goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff81367629: 85 c9 test %ecx,%ecx DRM_DEBUG_KMS("Count connectors is 0 but mode set\n"); ret = -EINVAL; goto out; } if (crtc_req->count_connectors > 0 && (!mode || !fb)) { ffffffff8136762b: 74 1e je ffffffff8136764b <drm_mode_setcrtc+0x1ff> ffffffff8136762d: 20 d0 and %dl,%al ffffffff8136762f: 75 1a jne ffffffff8136764b <drm_mode_setcrtc+0x1ff> DRM_DEBUG_KMS("Count connectors is %d but no mode or fb set\n", ffffffff81367631: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff81367638: 48 c7 c2 57 7b 9b 81 mov $0xffffffff819b7b57,%rdx ffffffff8136763f: 31 c0 xor %eax,%eax ffffffff81367641: 6a 04 pushq $0x4 ffffffff81367643: 5e pop %rsi ffffffff81367644: e8 d9 c4 ff ff callq ffffffff81363b22 <drm_printk> ffffffff81367649: eb 14 jmp ffffffff8136765f <drm_mode_setcrtc+0x213> if (ret) goto out; } if (crtc_req->count_connectors == 0 && mode) { ffffffff8136764b: 85 c9 test %ecx,%ecx crtc_req->count_connectors); ret = -EINVAL; goto out; } if (crtc_req->count_connectors > 0) { ffffffff8136764d: 74 3e je ffffffff8136768d <drm_mode_setcrtc+0x241> u32 out_id; /* Avoid unbounded kernel memory allocation */ if (crtc_req->count_connectors > config->num_connector) { ffffffff8136764f: 48 8b 45 b0 mov -0x50(%rbp),%rax ffffffff81367653: 3b 88 10 04 00 00 cmp 0x410(%rax),%ecx ffffffff81367659: 0f 86 8d 00 00 00 jbe ffffffff813676ec <drm_mode_setcrtc+0x2a0> ffffffff8136765f: 6a ea pushq $0xffffffffffffffea ffffffff81367661: 41 5c pop %r12 ffffffff81367663: e9 c7 01 00 00 jmpq ffffffff8136782f <drm_mode_setcrtc+0x3e3> if (crtc_req->mode_valid) { /* If we have a mode we need a framebuffer. */ /* If we pass -1, set the mode with the currently bound fb */ if (crtc_req->fb_id == -1) { if (!crtc->primary->fb) { DRM_DEBUG_KMS("CRTC doesn't have current FB\n"); ffffffff81367668: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff8136766f: 48 c7 c2 dd 7a 9b 81 mov $0xffffffff819b7add,%rdx ffffffff81367676: 31 c0 xor %eax,%eax ffffffff81367678: 6a 04 pushq $0x4 ffffffff8136767a: 5e pop %rsi ffffffff8136767b: e8 a2 c4 ff ff callq ffffffff81363b22 <drm_printk> ffffffff81367680: 6a ea pushq $0xffffffffffffffea ffffffff81367682: 41 5c pop %r12 ffffffff81367684: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff81367688: e9 a2 01 00 00 jmpq ffffffff8136782f <drm_mode_setcrtc+0x3e3> ffffffff8136768d: 31 d2 xor %edx,%edx ffffffff8136768f: 48 8d b5 40 ff ff ff lea -0xc0(%rbp),%rsi ffffffff81367696: e9 36 01 00 00 jmpq ffffffff813677d1 <drm_mode_setcrtc+0x385> * match real hardware capabilities. Skip the check in that * case. */ if (!crtc->primary->format_default) { ret = drm_plane_check_pixel_format(crtc->primary, fb->format->format); ffffffff8136769b: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff8136769f: 48 8b 40 38 mov 0x38(%rax),%rax ffffffff813676a3: 8b 30 mov (%rax),%esi * default formats list provided by the DRM core which doesn't * match real hardware capabilities. Skip the check in that * case. */ if (!crtc->primary->format_default) { ret = drm_plane_check_pixel_format(crtc->primary, ffffffff813676a5: e8 ee f3 00 00 callq ffffffff81376a98 <drm_plane_check_pixel_format> ffffffff813676aa: 41 89 c4 mov %eax,%r12d fb->format->format); if (ret) { ffffffff813676ad: 45 85 e4 test %r12d,%r12d ffffffff813676b0: 0f 84 16 ff ff ff je ffffffff813675cc <drm_mode_setcrtc+0x180> struct drm_format_name_buf format_name; DRM_DEBUG_KMS("Invalid pixel format %s\n", ffffffff813676b6: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff813676ba: 48 8b 40 38 mov 0x38(%rax),%rax ffffffff813676be: 8b 38 mov (%rax),%edi ffffffff813676c0: 48 8d b5 20 ff ff ff lea -0xe0(%rbp),%rsi ffffffff813676c7: e8 b4 03 00 00 callq ffffffff81367a80 <drm_get_format_name> ffffffff813676cc: 48 89 c1 mov %rax,%rcx ffffffff813676cf: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff813676d6: 48 c7 c2 1a 7b 9b 81 mov $0xffffffff819b7b1a,%rdx ffffffff813676dd: 31 c0 xor %eax,%eax ffffffff813676df: 6a 04 pushq $0x4 ffffffff813676e1: 5e pop %rsi ffffffff813676e2: e8 3b c4 ff ff callq ffffffff81363b22 <drm_printk> ffffffff813676e7: e9 4a 01 00 00 jmpq ffffffff81367836 <drm_mode_setcrtc+0x3ea> { if (size != 0 && n > SIZE_MAX / size) return NULL; if (__builtin_constant_p(n) && __builtin_constant_p(size)) return kmalloc(n * size, flags); return __kmalloc(n * size, flags); ffffffff813676ec: 48 c1 e1 03 shl $0x3,%rcx ffffffff813676f0: be c0 00 40 01 mov $0x14000c0,%esi ffffffff813676f5: 48 89 cf mov %rcx,%rdi ffffffff813676f8: e8 cc 78 dd ff callq ffffffff8113efc9 <__kmalloc> } connector_set = kmalloc_array(crtc_req->count_connectors, sizeof(struct drm_connector *), GFP_KERNEL); if (!connector_set) { ffffffff813676fd: 48 85 c0 test %rax,%rax ffffffff81367700: 48 8d b5 40 ff ff ff lea -0xc0(%rbp),%rsi ffffffff81367707: 0f 84 b2 00 00 00 je ffffffff813677bf <drm_mode_setcrtc+0x373> ffffffff8136770d: 31 db xor %ebx,%ebx ffffffff8136770f: 48 89 c2 mov %rax,%rdx ffffffff81367712: 48 89 55 c8 mov %rdx,-0x38(%rbp) ffffffff81367716: eb 35 jmp ffffffff8136774d <drm_mode_setcrtc+0x301> DRM_DEBUG_KMS("Connector id %d unknown\n", out_id); ret = -ENOENT; goto out; } DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", ffffffff81367718: 41 8b 4d 28 mov 0x28(%r13),%ecx ffffffff8136771c: 4d 8b 45 48 mov 0x48(%r13),%r8 ffffffff81367720: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff81367727: 48 c7 c2 84 57 9b 81 mov $0xffffffff819b5784,%rdx ffffffff8136772e: 31 c0 xor %eax,%eax ffffffff81367730: 6a 04 pushq $0x4 ffffffff81367732: 5e pop %rsi ffffffff81367733: e8 ea c3 ff ff callq ffffffff81363b22 <drm_printk> ffffffff81367738: 48 8b 45 c8 mov -0x38(%rbp),%rax connector->base.id, connector->name); connector_set[i] = connector; ffffffff8136773c: 4e 89 2c f8 mov %r13,(%rax,%r15,8) if (!connector_set) { ret = -ENOMEM; goto out; } for (i = 0; i < crtc_req->count_connectors; i++) { ffffffff81367740: ff c3 inc %ebx ffffffff81367742: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff81367746: 48 8d b5 40 ff ff ff lea -0xc0(%rbp),%rsi ffffffff8136774d: 41 8b 56 08 mov 0x8(%r14),%edx ffffffff81367751: 39 d3 cmp %edx,%ebx ffffffff81367753: 73 79 jae ffffffff813677ce <drm_mode_setcrtc+0x382> connector_set[i] = NULL; ffffffff81367755: 4c 63 fb movslq %ebx,%r15 ffffffff81367758: 4a 83 24 f8 00 andq $0x0,(%rax,%r15,8) set_connectors_ptr = (uint32_t __user *)(unsigned long)crtc_req->set_connectors_ptr; if (get_user(out_id, &set_connectors_ptr[i])) { ffffffff8136775d: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax ffffffff81367764: 00 ffffffff81367765: 49 03 06 add (%r14),%rax ffffffff81367768: e8 83 26 f1 ff callq ffffffff81279df0 <__get_user_4> ffffffff8136776d: 49 89 d4 mov %rdx,%r12 ffffffff81367770: 85 c0 test %eax,%eax ffffffff81367772: 0f 85 99 00 00 00 jne ffffffff81367811 <drm_mode_setcrtc+0x3c5> */ static inline struct drm_connector *drm_connector_lookup(struct drm_device *dev, uint32_t id) { struct drm_mode_object *mo; mo = drm_mode_object_find(dev, id, DRM_MODE_OBJECT_CONNECTOR); ffffffff81367778: ba c0 c0 c0 c0 mov $0xc0c0c0c0,%edx ffffffff8136777d: 48 8b 7d b0 mov -0x50(%rbp),%rdi ffffffff81367781: 44 89 e6 mov %r12d,%esi ffffffff81367784: e8 b1 db 00 00 callq ffffffff8137533a <drm_mode_object_find> ffffffff81367789: 49 89 c5 mov %rax,%r13 return mo ? obj_to_connector(mo) : NULL; ffffffff8136778c: 4d 85 ed test %r13,%r13 ret = -EFAULT; goto out; } connector = drm_connector_lookup(dev, out_id); if (!connector) { ffffffff8136778f: 74 06 je ffffffff81367797 <drm_mode_setcrtc+0x34b> ffffffff81367791: 49 83 c5 d8 add $0xffffffffffffffd8,%r13 ffffffff81367795: 75 81 jne ffffffff81367718 <drm_mode_setcrtc+0x2cc> DRM_DEBUG_KMS("Connector id %d unknown\n", ffffffff81367797: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff8136779e: 48 c7 c2 85 7b 9b 81 mov $0xffffffff819b7b85,%rdx ffffffff813677a5: 31 c0 xor %eax,%eax ffffffff813677a7: 6a 04 pushq $0x4 ffffffff813677a9: 5e pop %rsi ffffffff813677aa: 44 89 e1 mov %r12d,%ecx ffffffff813677ad: e8 70 c3 ff ff callq ffffffff81363b22 <drm_printk> ffffffff813677b2: 6a fe pushq $0xfffffffffffffffe ffffffff813677b4: 41 5c pop %r12 ffffffff813677b6: 4c 8b 7d c8 mov -0x38(%rbp),%r15 ffffffff813677ba: e9 c5 fe ff ff jmpq ffffffff81367684 <drm_mode_setcrtc+0x238> if (crtc_req->count_connectors > config->num_connector) { ret = -EINVAL; goto out; } connector_set = kmalloc_array(crtc_req->count_connectors, ffffffff813677bf: 49 89 c7 mov %rax,%r15 ffffffff813677c2: 6a f4 pushq $0xfffffffffffffff4 ffffffff813677c4: 41 5c pop %r12 ffffffff813677c6: 31 c0 xor %eax,%eax ffffffff813677c8: 48 89 45 c8 mov %rax,-0x38(%rbp) ffffffff813677cc: eb 61 jmp ffffffff8136782f <drm_mode_setcrtc+0x3e3> ffffffff813677ce: 49 89 c7 mov %rax,%r15 connector_set[i] = connector; } } set.crtc = crtc; ffffffff813677d1: 4c 89 6d 80 mov %r13,-0x80(%rbp) set.x = crtc_req->x; ffffffff813677d5: 41 8b 4e 14 mov 0x14(%r14),%ecx ffffffff813677d9: 89 4d 90 mov %ecx,-0x70(%rbp) set.y = crtc_req->y; ffffffff813677dc: 41 8b 4e 18 mov 0x18(%r14),%ecx ffffffff813677e0: 89 4d 94 mov %ecx,-0x6c(%rbp) set.mode = mode; ffffffff813677e3: 48 8b 45 c0 mov -0x40(%rbp),%rax ffffffff813677e7: 48 89 45 88 mov %rax,-0x78(%rbp) set.connectors = connector_set; ffffffff813677eb: 4c 89 7d 98 mov %r15,-0x68(%rbp) set.num_connectors = crtc_req->count_connectors; ffffffff813677ef: 89 d0 mov %edx,%eax ffffffff813677f1: 48 89 45 a0 mov %rax,-0x60(%rbp) set.fb = fb; ffffffff813677f5: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff813677f9: 48 89 85 78 ff ff ff mov %rax,-0x88(%rbp) ret = __drm_mode_set_config_internal(&set, &ctx); ffffffff81367800: 48 8d bd 78 ff ff ff lea -0x88(%rbp),%rdi ffffffff81367807: e8 d6 fa ff ff callq ffffffff813672e2 <__drm_mode_set_config_internal> ffffffff8136780c: 41 89 c4 mov %eax,%r12d ffffffff8136780f: eb 1e jmp ffffffff8136782f <drm_mode_setcrtc+0x3e3> ffffffff81367811: 6a f2 pushq $0xfffffffffffffff2 ffffffff81367813: 41 5c pop %r12 ffffffff81367815: 4c 8b 7d c8 mov -0x38(%rbp),%r15 ffffffff81367819: eb 14 jmp ffffffff8136782f <drm_mode_setcrtc+0x3e3> DRM_DEBUG_KMS("[CRTC:%d:%s]\n", crtc->base.id, crtc->name); mutex_lock(&crtc->dev->mode_config.mutex); drm_modeset_acquire_init(&ctx, 0); retry: ret = drm_modeset_lock_all_ctx(crtc->dev, &ctx); ffffffff8136781b: 49 8b 7d 00 mov 0x0(%r13),%rdi ffffffff8136781f: 48 89 de mov %rbx,%rsi ffffffff81367822: e8 08 7b 00 00 callq ffffffff8136f32f <drm_modeset_lock_all_ctx> ffffffff81367827: 41 89 c4 mov %eax,%r12d if (ret) ffffffff8136782a: 45 85 e4 test %r12d,%r12d ffffffff8136782d: 74 16 je ffffffff81367845 <drm_mode_setcrtc+0x3f9> set.num_connectors = crtc_req->count_connectors; set.fb = fb; ret = __drm_mode_set_config_internal(&set, &ctx); out: if (fb) ffffffff8136782f: 48 83 7d d0 00 cmpq $0x0,-0x30(%rbp) ffffffff81367834: 74 6b je ffffffff813678a1 <drm_mode_setcrtc+0x455> * This function decrements the framebuffer's reference count and frees the * framebuffer if the reference count drops to zero. */ static inline void drm_framebuffer_put(struct drm_framebuffer *fb) { drm_mode_object_put(&fb->base); ffffffff81367836: 48 8b 45 d0 mov -0x30(%rbp),%rax ffffffff8136783a: 48 8d 78 18 lea 0x18(%rax),%rdi ffffffff8136783e: e8 07 db 00 00 callq ffffffff8137534a <drm_mode_object_put> ffffffff81367843: eb 62 jmp ffffffff813678a7 <drm_mode_setcrtc+0x45b> drm_modeset_acquire_init(&ctx, 0); retry: ret = drm_modeset_lock_all_ctx(crtc->dev, &ctx); if (ret) goto out; if (crtc_req->mode_valid) { ffffffff81367845: 41 83 7e 20 00 cmpl $0x0,0x20(%r14) ffffffff8136784a: 0f 84 a0 fd ff ff je ffffffff813675f0 <drm_mode_setcrtc+0x1a4> /* If we have a mode we need a framebuffer. */ /* If we pass -1, set the mode with the currently bound fb */ if (crtc_req->fb_id == -1) { ffffffff81367850: 41 8b 76 10 mov 0x10(%r14),%esi ffffffff81367854: 83 fe ff cmp $0xffffffff,%esi ffffffff81367857: 0f 84 db fc ff ff je ffffffff81367538 <drm_mode_setcrtc+0xec> ffffffff8136785d: 48 8b 5d b0 mov -0x50(%rbp),%rbx } fb = crtc->primary->fb; /* Make refcounting symmetric with the lookup path. */ drm_framebuffer_get(fb); } else { fb = drm_framebuffer_lookup(dev, crtc_req->fb_id); ffffffff81367861: 48 89 df mov %rbx,%rdi ffffffff81367864: e8 64 b3 00 00 callq ffffffff81372bcd <drm_framebuffer_lookup> ffffffff81367869: 48 89 c1 mov %rax,%rcx if (!fb) { ffffffff8136786c: 48 89 4d d0 mov %rcx,-0x30(%rbp) ffffffff81367870: 48 85 c0 test %rax,%rax ffffffff81367873: 0f 85 eb fc ff ff jne ffffffff81367564 <drm_mode_setcrtc+0x118> DRM_DEBUG_KMS("Unknown FB ID%d\n", ffffffff81367879: 41 8b 4e 10 mov 0x10(%r14),%ecx ffffffff8136787d: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff81367884: 48 c7 c2 fb 7a 9b 81 mov $0xffffffff819b7afb,%rdx ffffffff8136788b: 31 c0 xor %eax,%eax ffffffff8136788d: 6a 04 pushq $0x4 ffffffff8136788f: 5e pop %rsi ffffffff81367890: e8 8d c2 ff ff callq ffffffff81363b22 <drm_printk> ffffffff81367895: 31 c0 xor %eax,%eax ffffffff81367897: 48 89 45 d0 mov %rax,-0x30(%rbp) ffffffff8136789b: 6a fe pushq $0xfffffffffffffffe ffffffff8136789d: 41 5c pop %r12 ffffffff8136789f: eb 06 jmp ffffffff813678a7 <drm_mode_setcrtc+0x45b> ffffffff813678a1: 31 c0 xor %eax,%eax ffffffff813678a3: 48 89 45 d0 mov %rax,-0x30(%rbp) out: if (fb) drm_framebuffer_put(fb); if (connector_set) { ffffffff813678a7: 4d 85 ff test %r15,%r15 ffffffff813678aa: 74 27 je ffffffff813678d3 <drm_mode_setcrtc+0x487> ffffffff813678ac: 31 db xor %ebx,%ebx ffffffff813678ae: eb 17 jmp ffffffff813678c7 <drm_mode_setcrtc+0x47b> for (i = 0; i < crtc_req->count_connectors; i++) { if (connector_set[i]) ffffffff813678b0: 48 63 c3 movslq %ebx,%rax ffffffff813678b3: 49 8b 3c c7 mov (%r15,%rax,8),%rdi ffffffff813678b7: 48 85 ff test %rdi,%rdi ffffffff813678ba: 74 09 je ffffffff813678c5 <drm_mode_setcrtc+0x479> * This function decrements the connector's reference count and frees the * object if the reference count drops to zero. */ static inline void drm_connector_put(struct drm_connector *connector) { drm_mode_object_put(&connector->base); ffffffff813678bc: 48 83 c7 28 add $0x28,%rdi ffffffff813678c0: e8 85 da 00 00 callq ffffffff8137534a <drm_mode_object_put> out: if (fb) drm_framebuffer_put(fb); if (connector_set) { for (i = 0; i < crtc_req->count_connectors; i++) { ffffffff813678c5: ff c3 inc %ebx ffffffff813678c7: 41 3b 5e 08 cmp 0x8(%r14),%ebx ffffffff813678cb: 72 e3 jb ffffffff813678b0 <drm_mode_setcrtc+0x464> ffffffff813678cd: 4c 8b 6d b8 mov -0x48(%rbp),%r13 ffffffff813678d1: eb 03 jmp ffffffff813678d6 <drm_mode_setcrtc+0x48a> ffffffff813678d3: 45 31 ff xor %r15d,%r15d if (connector_set[i]) drm_connector_put(connector_set[i]); } } kfree(connector_set); ffffffff813678d6: 48 8b 7d c8 mov -0x38(%rbp),%rdi ffffffff813678da: e8 af 78 dd ff callq ffffffff8113f18e <kfree> drm_mode_destroy(dev, mode); ffffffff813678df: 48 8b 7d b0 mov -0x50(%rbp),%rdi ffffffff813678e3: 48 8b 75 c0 mov -0x40(%rbp),%rsi ffffffff813678e7: e8 ef 04 00 00 callq ffffffff81367ddb <drm_mode_destroy> if (ret == -EDEADLK) { ffffffff813678ec: 41 83 fc dd cmp $0xffffffdd,%r12d ffffffff813678f0: 0f 84 2e fc ff ff je ffffffff81367524 <drm_mode_setcrtc+0xd8> ffffffff813678f6: 4c 8d b5 40 ff ff ff lea -0xc0(%rbp),%r14 drm_modeset_backoff(&ctx); goto retry; } drm_modeset_drop_locks(&ctx); ffffffff813678fd: 4c 89 f7 mov %r14,%rdi ffffffff81367900: e8 62 7b 00 00 callq ffffffff8136f467 <drm_modeset_drop_locks> drm_modeset_acquire_fini(&ctx); ffffffff81367905: 4c 89 f7 mov %r14,%rdi ffffffff81367908: e8 ad 7a 00 00 callq ffffffff8136f3ba <drm_modeset_acquire_fini> mutex_unlock(&crtc->dev->mode_config.mutex); ffffffff8136790d: 49 8b 7d 00 mov 0x0(%r13),%rdi ffffffff81367911: 48 81 c7 b0 02 00 00 add $0x2b0,%rdi ffffffff81367918: e8 fb 37 2e 00 callq ffffffff8164b118 <mutex_unlock> ffffffff8136791d: eb 20 jmp ffffffff8136793f <drm_mode_setcrtc+0x4f3> if (crtc_req->x & 0xffff0000 || crtc_req->y & 0xffff0000) return -ERANGE; crtc = drm_crtc_find(dev, crtc_req->crtc_id); if (!crtc) { DRM_DEBUG_KMS("Unknown CRTC ID %d\n", crtc_req->crtc_id); ffffffff8136791f: 41 8b 4e 0c mov 0xc(%r14),%ecx ffffffff81367923: 6a 04 pushq $0x4 ffffffff81367925: 5e pop %rsi ffffffff81367926: 48 c7 c7 4d 82 95 81 mov $0xffffffff8195824d,%rdi ffffffff8136792d: 48 c7 c2 c9 7a 9b 81 mov $0xffffffff819b7ac9,%rdx ffffffff81367934: 31 c0 xor %eax,%eax ffffffff81367936: e8 e7 c1 ff ff callq ffffffff81363b22 <drm_printk> ffffffff8136793b: 6a fe pushq $0xfffffffffffffffe ffffffff8136793d: 41 5c pop %r12 drm_modeset_drop_locks(&ctx); drm_modeset_acquire_fini(&ctx); mutex_unlock(&crtc->dev->mode_config.mutex); return ret; } ffffffff8136793f: 44 89 e0 mov %r12d,%eax ffffffff81367942: 48 81 c4 b8 00 00 00 add $0xb8,%rsp ffffffff81367949: 5b pop %rbx ffffffff8136794a: 41 5c pop %r12 ffffffff8136794c: 41 5d pop %r13 ffffffff8136794e: 41 5e pop %r14 ffffffff81367950: 41 5f pop %r15 ffffffff81367952: 5d pop %rbp ffffffff81367953: c3 retq ============================================================ Double fault with f05058c4d652: [ 3.798722] PANIC: double fault, error_code: 0x0 [ 3.807387] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107 [ 3.816040] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 [ 3.824792] task: ffff880075b92f00 task.stack: ffffc90000d6c000 [ 3.829599] EXT4-fs (mmcblk0p1): re-mounted. Opts: commit=600,data=ordered [ 3.839092] RIP: 0010:drm_mode_setcrtc+0x328/0x51f [ 3.844443] RSP: 0018:0000000000000000 EFLAGS: 00010206 [ 3.850280] RAX: 0000559e707c4d60 RBX: 0000000000000000 RCX: 0000000000000008 [ 3.858253] RDX: 0000000000000001 RSI: ffffc90000d6fcc8 RDI: ffffffff81367805 [ 3.866225] RBP: ffffc90000d6fd90 R08: 00000000014000c0 R09: 0000000000000308 [ 3.874199] R10: 0000000000000300 R11: 0000000000000556 R12: 0000000000000000 [ 3.882163] R13: ffff880077a25000 R14: ffffc90000d6fdd0 R15: 0000000000000000 [ 3.890136] FS: 00007fb2dbd62740(0000) GS:ffff88007ad00000(0000) knlGS:0000000000000000 [ 3.899177] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.905595] CR2: fffffffffffffff8 CR3: 000000007596b000 CR4: 00000000001006e0 [ 3.913568] Call Trace: [ 3.916296] Code: b8 48 8d b5 38 ff ff ff 41 8b 56 08 39 d3 0f 83 85 00 00 00 4c 63 fb 4a 83 24 f8 00 4a 8d 04 bd 00 00 00 00 49 03 06 48 8b 65 a8 <e8> 73 26 f1 ff 49 89 d4 48 89 65 a8 85 c0 0f 85 a0 00 00 00 ba [ 3.937440] Kernel panic - not syncing: Machine halted. [ 3.943268] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107 [ 3.951921] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 [ 3.960671] Call Trace: [ 3.963398] <#DF> [ 3.965637] __dump_stack+0x19/0x1b [ 3.969531] dump_stack+0x42/0x60 [ 3.969541] PANIC: double fault, error_code: 0x0 [ 3.969545] CPU: 0 PID: 719 Comm: rsyslogd Not tainted 4.12.0-00023-g711d82c128ff #107 [ 3.969546] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 [ 3.969548] task: ffff880075210000 task.stack: ffffc90000554000 [ 3.969554] RIP: 0010:SyS_setgroups+0x6b/0xd4 [ 3.969556] RSP: 0018:ffffffff8106360d EFLAGS: 00010297 [ 3.969558] RAX: 0000558afd6664e0 RBX: 0000000000000000 RCX: 0000000000000001 [ 3.969559] RDX: 000000000000000c RSI: 0000000000000000 RDI: ffffffff8106360d [ 3.969560] RBP: ffffc90000557f48 R08: ffffffffffffffea R09: ffffc90000557e50 [ 3.969562] R10: 00007fb4acc1d0d0 R11: ffff880075210000 R12: 0000558afd6664e0 [ 3.969563] R13: 00007fb4acdf9780 R14: fffffffffffffff2 R15: ffff880076d13f50 [ 3.969565] FS: 00007fb4acdf9780(0000) GS:ffff88007ac00000(0000) knlGS:0000000000000000 [ 3.969566] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.969568] CR2: ffffffff810635f8 CR3: 00000000751fb000 CR4: 00000000001006f0 [ 3.969569] Call Trace: [ 3.969571] Code: 5d c3 89 df e8 9c fd ff ff 49 89 c7 4d 85 ff 74 5b 41 8b 4f 04 31 f6 6a f2 41 5e 6a ea 41 58 eb 22 48 63 de 49 8d 04 9c 48 89 fc <e8> 7c 66 21 00 85 c0 75 41 83 fa ff 74 39 48 89 e7 41 89 54 9f [ 4.100376] panic+0xf0/0x23e [ 4.103688] df_debug+0x31/0x31 [ 4.107193] do_double_fault+0xd8/0xfb [ 4.111379] double_fault+0x22/0x30 [ 4.115274] RIP: 0010:drm_mode_setcrtc+0x328/0x51f [ 4.120624] RSP: 0018:0000000000000000 EFLAGS: 00010206 [ 4.126451] RAX: 0000559e707c4d60 RBX: 0000000000000000 RCX: 0000000000000008 [ 4.134424] RDX: 0000000000000001 RSI: ffffc90000d6fcc8 RDI: ffffffff81367805 [ 4.142397] RBP: ffffc90000d6fd90 R08: 00000000014000c0 R09: 0000000000000308 [ 4.150369] R10: 0000000000000300 R11: 0000000000000556 R12: 0000000000000000 [ 4.158341] R13: ffff880077a25000 R14: ffffc90000d6fdd0 R15: 0000000000000000 [ 4.166316] ? drm_mode_setcrtc+0x2b5/0x51f [ 4.170988] WARNING: kernel stack regs at ffff88007ad05f58 in frecon:605 has bad 'bp' value ffffc90000d6fd90 [ 4.170990] unwind stack type:0 next_sp: (null) mask:0x10 graph_idx:0 [ 4.170993] ffff88007ad05d40: ffff88007ad05e20 (0xffff88007ad05e20) [ 4.170995] ffff88007ad05d48: ffffffff810184b0 (show_trace_log_lvl+0x163/0x23a) [ 4.170996] ffff88007ad05d50: 0000000000000000 ... [ 4.170998] ffff88007ad05d58: ffff88007ad05000 (0xffff88007ad05000) [ 4.170999] ffff88007ad05d60: ffff88007ad06000 (0xffff88007ad06000) [ 4.171000] ffff88007ad05d68: 0000000000000000 ... [ 4.171001] ffff88007ad05d70: 0000000000000010 (0x10) [ 4.171003] ffff88007ad05d78: ffff880075b92f00 (0xffff880075b92f00) [ 4.171004] ffff88007ad05d80: 0000010100000000 (0x10100000000) [ 4.171005] ffff88007ad05d88: 0000000000000000 ... [ 4.171006] ffff88007ad05d90: ffff88007ad05d40 (0xffff88007ad05d40) [ 4.171008] ffff88007ad05d98: ffff88007ad05f58 (0xffff88007ad05f58) [ 4.171010] ffff88007ad05da0: ffffffff81367878 (drm_mode_setcrtc+0x328/0x51f) [ 4.171011] ffff88007ad05da8: 0000000000000004 (0x4) [ 4.171013] ffff88007ad05db0: ffff88007ad05000 (0xffff88007ad05000) [ 4.171014] ffff88007ad05db8: ffff88007ad06000 (0xffff88007ad06000) [ 4.171015] ffff88007ad05dc0: 0000000000000000 ... [ 4.171016] ffff88007ad05dc8: 0000000000000010 (0x10) [ 4.171018] ffff88007ad05dd0: ffff88007ad05f58 (0xffff88007ad05f58) [ 4.171019] ffff88007ad05dd8: ffff880075b92f00 (0xffff880075b92f00) [ 4.171020] ffff88007ad05de0: ffffffff8195ba29 (0xffffffff8195ba29) [ 4.171022] ffff88007ad05de8: 0000000081e70fc0 (0x81e70fc0) [ 4.171023] ffff88007ad05df0: ffffffff8195b9f9 (0xffffffff8195b9f9) [ 4.171024] ffff88007ad05df8: 0000000000000086 (0x86) [ 4.171025] ffff88007ad05e00: 0000000000000000 ... [ 4.171027] ffff88007ad05e08: ffff880077a25000 (0xffff880077a25000) [ 4.171028] ffff88007ad05e10: ffff88007ad05f00 (0xffff88007ad05f00) [ 4.171029] ffff88007ad05e18: ffff880075b92f00 (0xffff880075b92f00) [ 4.171031] ffff88007ad05e20: ffff88007ad05e30 (0xffff88007ad05e30) [ 4.171033] ffff88007ad05e28: ffffffff81018647 (show_stack+0x45/0x47) [ 4.171034] ffff88007ad05e30: ffff88007ad05e40 (0xffff88007ad05e40) [ 4.171038] ffff88007ad05e38: ffffffff8126f18f (__dump_stack+0x19/0x1b) [ 4.171039] ffff88007ad05e40: ffff88007ad05e70 (0xffff88007ad05e70) [ 4.171042] ffff88007ad05e48: ffffffff8126f158 (dump_stack+0x42/0x60) [ 4.171043] ffff88007ad05e50: 0000000000000086 (0x86) [ 4.171044] ffff88007ad05e58: 0000000000000000 ... [ 4.171045] ffff88007ad05e60: 0000000200000000 (0x200000000) [ 4.171046] ffff88007ad05e68: ffffffff819641ff (0xffffffff819641ff) [ 4.171048] ffff88007ad05e70: ffff88007ad05f08 (0xffff88007ad05f08) [ 4.171050] ffff88007ad05e78: ffffffff81047bea (panic+0xf0/0x23e) [ 4.171051] ffff88007ad05e80: ffff88007ad05e90 (0xffff88007ad05e90) [ 4.171052] ffff88007ad05e88: 000000007596b000 (0x7596b000) [ 4.171054] ffff88007ad05e90: 00000000001006e0 (0x1006e0) [ 4.171055] ffff88007ad05e98: 0000000000000002 (0x2) [ 4.171056] ffff88007ad05ea0: 0000000000000000 ... [ 4.171057] ffff88007ad05ea8: ffffffff81c42180 (0xffffffff81c42180) [ 4.171058] ffff88007ad05eb0: 0000000200000000 (0x200000000) [ 4.171060] ffff88007ad05eb8: 00000000ffff0a30 (0xffff0a30) [ 4.171061] ffff88007ad05ec0: 0000003000000008 (0x3000000008) [ 4.171062] ffff88007ad05ec8: ffff88007ad05f18 (0xffff88007ad05f18) [ 4.171064] ffff88007ad05ed0: ffff88007ad05e90 (0xffff88007ad05e90) [ 4.171065] ffff88007ad05ed8: ba00000000000024 (0xba00000000000024) [ 4.171066] ffff88007ad05ee0: ffff88007ad05f58 (0xffff88007ad05f58) [ 4.171067] ffff88007ad05ee8: 0000000000000000 ... [ 4.171069] ffff88007ad05ef0: ffff880077a25000 (0xffff880077a25000) [ 4.171070] ffff88007ad05ef8: ffff88007ad05f58 (0xffff88007ad05f58) [ 4.171071] ffff88007ad05f00: ffff880075b92f00 (0xffff880075b92f00) [ 4.171073] ffff88007ad05f08: ffff88007ad05f20 (0xffff88007ad05f20) [ 4.171075] ffff88007ad05f10: ffffffff81037185 (df_debug+0x31/0x31) [ 4.171075] ffff88007ad05f18: 0000000000000000 ... [ 4.171077] ffff88007ad05f20: ffff88007ad05f48 (0xffff88007ad05f48) [ 4.171079] ffff88007ad05f28: ffffffff81016626 (do_double_fault+0xd8/0xfb) [ 4.171080] ffff88007ad05f30: 0000000000000001 (0x1) [ 4.171082] ffff88007ad05f38: ffffc90000d6fdd0 (0xffffc90000d6fdd0) [ 4.171082] ffff88007ad05f40: 0000000000000000 ... [ 4.171084] ffff88007ad05f48: ffff88007ad05f59 (0xffff88007ad05f59) [ 4.171086] ffff88007ad05f50: ffffffff8164df72 (double_fault+0x22/0x30) [ 4.171087] ffff88007ad05f58: 0000000000000000 ... [ 4.171088] ffff88007ad05f60: ffffc90000d6fdd0 (0xffffc90000d6fdd0) [ 4.171090] ffff88007ad05f68: ffff880077a25000 (0xffff880077a25000) [ 4.171091] ffff88007ad05f70: 0000000000000000 ... [ 4.171092] ffff88007ad05f78: ffffc90000d6fd90 (0xffffc90000d6fd90) [ 4.171093] ffff88007ad05f80: 0000000000000000 ... [ 4.171094] ffff88007ad05f88: 0000000000000556 (0x556) [ 4.171095] ffff88007ad05f90: 0000000000000300 (0x300) [ 4.171096] ffff88007ad05f98: 0000000000000308 (0x308) [ 4.171098] ffff88007ad05fa0: 00000000014000c0 (0x14000c0) [ 4.171099] ffff88007ad05fa8: 0000559e707c4d60 (0x559e707c4d60) [ 4.171100] ffff88007ad05fb0: 0000000000000008 (0x8) [ 4.171101] ffff88007ad05fb8: 0000000000000001 (0x1) [ 4.171103] ffff88007ad05fc0: ffffc90000d6fcc8 (0xffffc90000d6fcc8) [ 4.171105] ffff88007ad05fc8: ffffffff81367805 (drm_mode_setcrtc+0x2b5/0x51f) [ 4.171107] ffff88007ad05fd0: ffffffffffffffff (0xffffffffffffffff) [ 4.171109] ffff88007ad05fd8: ffffffff81367878 (drm_mode_setcrtc+0x328/0x51f) [ 4.171110] ffff88007ad05fe0: 0000000000000010 (0x10) [ 4.171111] ffff88007ad05fe8: 0000000000010206 (0x10206) [ 4.171112] ffff88007ad05ff0: 0000000000000000 ... [ 4.171114] ffff88007ad05ff8: 0000000000000018 (0x18) [ 4.171115] </#DF> [ 5.819749] Shutting down cpus with NMI [ 5.824035] Kernel Offset: disabled [ 5.827930] ACPI MEMORY or I/O RESET_REG. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-12 23:22 ` Matthias Kaehlcke @ 2017-07-13 18:00 ` Josh Poimboeuf 2017-07-13 18:47 ` Matthias Kaehlcke 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-13 18:00 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Wed, Jul 12, 2017 at 04:22:13PM -0700, Matthias Kaehlcke wrote: > El Wed, Jul 12, 2017 at 05:36:30PM -0500 Josh Poimboeuf ha dit: > > > On Wed, Jul 12, 2017 at 05:35:47PM -0500, Josh Poimboeuf wrote: > > > On Wed, Jul 12, 2017 at 03:20:40PM -0700, Matthias Kaehlcke wrote: > > > > > This is admittedly an awkward way of achieving this goal, but it's the > > > > > only way I know how to do it with GCC. > > > > > > > > > > What extra instruction does clang add? > > > > > > > > I was looking at the get_user() call in drm_mode_setcrtc(). The code > > > > generated by clang without the patch is: > > > > > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > > > ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > > > ffffffff8138695c: 00 > > > > ffffffff8138695d: 49 03 06 add (%r14),%rax > > > > ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4> > > > > > > > > And with the patch: > > > > > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > > > ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > > > ffffffff81386a5d: 00 > > > > ffffffff81386a5e: 49 03 06 add (%r14),%rax > > > > ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp > > > > ffffffff81386a66: e8 15 a5 f0 ff callq > > > > ffffffff81290f80 <__get_user_4> > > > > > > Hm, that seems odd. Can you sure the disassembly for the whole > > > function? > > > > Er, share :-) > > Sure, please find below the disassemblies with and without the > patch. The exact extra instruction differs from the one above, the > disassembly above is from a debug session with some 'random' kernel > version (bisect), the ones below from a v4.12ish kernel. At the bottom > you also find a log of a double faults observed with the patch. > > If you are interested in building the kernel with clang yourself I can > provide instructions, it is fairly painless nowadays as long as you > have a recent version of clang (a somewhat older version should also > do for this issue with some extra kernel patches). Here's the reason for the double fault. First it puts zero on the stack at offset -0x58: > ffffffff81367616: 31 c0 xor %eax,%eax > ffffffff81367618: 48 89 45 c8 mov %rax,-0x38(%rbp) > ffffffff8136761c: 45 31 ff xor %r15d,%r15d > ffffffff8136761f: 48 89 45 a8 mov %rax,-0x58(%rbp) Then, later, it copies that zeroed word from the stack to RSP: > ffffffff81367874: 48 8b 65 a8 mov -0x58(%rbp),%rsp Then it double faults because the call instruction tries to write RIP on the stack, but RSP is zero: > ffffffff81367878: e8 73 26 f1 ff callq ffffffff81279ef0 <__get_user_4> Then clang tries to put RSP's value on the stack, at the same stack slot where the original zero was stored (though it never reaches this point): > ffffffff8136787d: 49 89 d4 mov %rdx,%r12 > ffffffff81367880: 48 89 65 a8 mov %rsp,-0x58(%rbp) The panic is consistent with the above. RIP points to the call instruction, RSP is zero: > [ 3.798722] PANIC: double fault, error_code: 0x0 > [ 3.807387] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107 > [ 3.816040] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 > [ 3.824792] task: ffff880075b92f00 task.stack: ffffc90000d6c000 > [ 3.829599] EXT4-fs (mmcblk0p1): re-mounted. Opts: commit=600,data=ordered > [ 3.839092] RIP: 0010:drm_mode_setcrtc+0x328/0x51f > [ 3.844443] RSP: 0018:0000000000000000 EFLAGS: 00010206 > [ 3.850280] RAX: 0000559e707c4d60 RBX: 0000000000000000 RCX: 0000000000000008 > [ 3.858253] RDX: 0000000000000001 RSI: ffffc90000d6fcc8 RDI: ffffffff81367805 > [ 3.866225] RBP: ffffc90000d6fd90 R08: 00000000014000c0 R09: 0000000000000308 > [ 3.874199] R10: 0000000000000300 R11: 0000000000000556 R12: 0000000000000000 > [ 3.882163] R13: ffff880077a25000 R14: ffffc90000d6fdd0 R15: 0000000000000000 > [ 3.890136] FS: 00007fb2dbd62740(0000) GS:ffff88007ad00000(0000) knlGS:0000000000000000 > [ 3.899177] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 3.905595] CR2: fffffffffffffff8 CR3: 000000007596b000 CR4: 00000000001006e0 > [ 3.913568] Call Trace: > [ 3.916296] Code: b8 48 8d b5 38 ff ff ff 41 8b 56 08 39 d3 0f 83 85 00 00 00 4c 63 fb 4a 83 24 f8 00 4a 8d 04 bd 00 00 00 00 49 03 06 48 8b 65 a8 <e8> 73 26 f1 ff 49 89 d4 48 89 65 a8 85 c0 0f 85 a0 00 00 00 ba > [ 3.937440] Kernel panic - not syncing: Machine halted. > [ 3.943268] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107 > [ 3.951921] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 > [ 3.960671] Call Trace: > [ 3.963398] <#DF> > [ 3.965637] __dump_stack+0x19/0x1b > [ 3.969531] dump_stack+0x42/0x60 clang is obviously getting confused by the RSP output constraint. I think it tries to take the constraint literally, since it takes RSP as an output from the inline asm and stores it on the stack. However, that behavior doesn't really make sense for a "register" variable. It also doesn't explain why it's zeroing the register out first. What happens if you try the below patch instead of the revert? Any chance the offending instruction goes away? diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 11433f9..beac907 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) might_fault(); \ asm volatile("call __get_user_%P4" \ : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ - : "0" (ptr), "i" (sizeof(*(ptr)))); \ + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ (x) = (__force __typeof__(*(ptr))) __val_gu; \ __builtin_expect(__ret_gu, 0); \ }) ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 18:00 ` Josh Poimboeuf @ 2017-07-13 18:47 ` Matthias Kaehlcke 2017-07-13 19:25 ` Josh Poimboeuf 2017-07-13 20:20 ` Andrey Rybainin 0 siblings, 2 replies; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-13 18:47 UTC (permalink / raw) To: Josh Poimboeuf Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer El Thu, Jul 13, 2017 at 01:00:01PM -0500 Josh Poimboeuf ha dit: > On Wed, Jul 12, 2017 at 04:22:13PM -0700, Matthias Kaehlcke wrote: > > El Wed, Jul 12, 2017 at 05:36:30PM -0500 Josh Poimboeuf ha dit: > > > > > On Wed, Jul 12, 2017 at 05:35:47PM -0500, Josh Poimboeuf wrote: > > > > On Wed, Jul 12, 2017 at 03:20:40PM -0700, Matthias Kaehlcke wrote: > > > > > > This is admittedly an awkward way of achieving this goal, but it's the > > > > > > only way I know how to do it with GCC. > > > > > > > > > > > > What extra instruction does clang add? > > > > > > > > > > I was looking at the get_user() call in drm_mode_setcrtc(). The code > > > > > generated by clang without the patch is: > > > > > > > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > > > > ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > > > > ffffffff8138695c: 00 > > > > > ffffffff8138695d: 49 03 06 add (%r14),%rax > > > > > ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4> > > > > > > > > > > And with the patch: > > > > > > > > > > if (get_user(out_id, &set_connectors_ptr[i])) { > > > > > ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax > > > > > ffffffff81386a5d: 00 > > > > > ffffffff81386a5e: 49 03 06 add (%r14),%rax > > > > > ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp > > > > > ffffffff81386a66: e8 15 a5 f0 ff callq > > > > > ffffffff81290f80 <__get_user_4> > > > > > > > > Hm, that seems odd. Can you sure the disassembly for the whole > > > > function? > > > > > > Er, share :-) > > > > Sure, please find below the disassemblies with and without the > > patch. The exact extra instruction differs from the one above, the > > disassembly above is from a debug session with some 'random' kernel > > version (bisect), the ones below from a v4.12ish kernel. At the bottom > > you also find a log of a double faults observed with the patch. > > > > If you are interested in building the kernel with clang yourself I can > > provide instructions, it is fairly painless nowadays as long as you > > have a recent version of clang (a somewhat older version should also > > do for this issue with some extra kernel patches). > > Here's the reason for the double fault. First it puts zero on the stack > at offset -0x58: > > > ffffffff81367616: 31 c0 xor %eax,%eax > > ffffffff81367618: 48 89 45 c8 mov %rax,-0x38(%rbp) > > ffffffff8136761c: 45 31 ff xor %r15d,%r15d > > ffffffff8136761f: 48 89 45 a8 mov %rax,-0x58(%rbp) > > Then, later, it copies that zeroed word from the stack to RSP: > > > ffffffff81367874: 48 8b 65 a8 mov -0x58(%rbp),%rsp > > Then it double faults because the call instruction tries to write RIP on > the stack, but RSP is zero: > > > ffffffff81367878: e8 73 26 f1 ff callq ffffffff81279ef0 <__get_user_4> > > Then clang tries to put RSP's value on the stack, at the same stack slot > where the original zero was stored (though it never reaches this point): > > > ffffffff8136787d: 49 89 d4 mov %rdx,%r12 > > ffffffff81367880: 48 89 65 a8 mov %rsp,-0x58(%rbp) > > The panic is consistent with the above. RIP points to the call > instruction, RSP is zero: > > > [ 3.798722] PANIC: double fault, error_code: 0x0 > > [ 3.807387] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107 > > [ 3.816040] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 > > [ 3.824792] task: ffff880075b92f00 task.stack: ffffc90000d6c000 > > [ 3.829599] EXT4-fs (mmcblk0p1): re-mounted. Opts: commit=600,data=ordered > > [ 3.839092] RIP: 0010:drm_mode_setcrtc+0x328/0x51f > > [ 3.844443] RSP: 0018:0000000000000000 EFLAGS: 00010206 > > [ 3.850280] RAX: 0000559e707c4d60 RBX: 0000000000000000 RCX: 0000000000000008 > > [ 3.858253] RDX: 0000000000000001 RSI: ffffc90000d6fcc8 RDI: ffffffff81367805 > > [ 3.866225] RBP: ffffc90000d6fd90 R08: 00000000014000c0 R09: 0000000000000308 > > [ 3.874199] R10: 0000000000000300 R11: 0000000000000556 R12: 0000000000000000 > > [ 3.882163] R13: ffff880077a25000 R14: ffffc90000d6fdd0 R15: 0000000000000000 > > [ 3.890136] FS: 00007fb2dbd62740(0000) GS:ffff88007ad00000(0000) knlGS:0000000000000000 > > [ 3.899177] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 3.905595] CR2: fffffffffffffff8 CR3: 000000007596b000 CR4: 00000000001006e0 > > [ 3.913568] Call Trace: > > [ 3.916296] Code: b8 48 8d b5 38 ff ff ff 41 8b 56 08 39 d3 0f 83 85 00 00 00 4c 63 fb 4a 83 24 f8 00 4a 8d 04 bd 00 00 00 00 49 03 06 48 8b 65 a8 <e8> 73 26 f1 ff 49 89 d4 48 89 65 a8 85 c0 0f 85 a0 00 00 00 ba > > [ 3.937440] Kernel panic - not syncing: Machine halted. > > [ 3.943268] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107 > > [ 3.951921] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016 > > [ 3.960671] Call Trace: > > [ 3.963398] <#DF> > > [ 3.965637] __dump_stack+0x19/0x1b > > [ 3.969531] dump_stack+0x42/0x60 > > clang is obviously getting confused by the RSP output constraint. I > think it tries to take the constraint literally, since it takes RSP as > an output from the inline asm and stores it on the stack. However, that > behavior doesn't really make sense for a "register" variable. It also > doesn't explain why it's zeroing the register out first. Thanks for your analysis! > What happens if you try the below patch instead of the revert? Any > chance the offending instruction goes away? > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > index 11433f9..beac907 100644 > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > might_fault(); \ > asm volatile("call __get_user_%P4" \ > : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > __builtin_expect(__ret_gu, 0); \ > }) The generated code is basically the same, only that now the value from the stack is stored in a register and written twice to RSP: ffffffff813676ba: 31 c0 xor %eax,%eax ffffffff813676bc: 48 89 45 c8 mov %rax,-0x38(%rbp) ffffffff813676c0: 45 31 ff xor %r15d,%r15d ffffffff813676c3: 48 89 45 a8 mov %rax,-0x58(%rbp) ... ffffffff81367918: 48 8b 4d a8 mov -0x58(%rbp),%rcx ffffffff8136791c: 48 89 cc mov %rcx,%rsp ffffffff8136791f: 48 89 cc mov %rcx,%rsp ffffffff81367922: e8 69 26 f1 ff callq ffffffff81279f90 <__get_user_4> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 18:47 ` Matthias Kaehlcke @ 2017-07-13 19:25 ` Josh Poimboeuf 2017-07-13 19:38 ` Michael Davidson 2017-07-13 20:20 ` Andrey Rybainin 1 sibling, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-13 19:25 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Thu, Jul 13, 2017 at 11:47:48AM -0700, Matthias Kaehlcke wrote: > > What happens if you try the below patch instead of the revert? Any > > chance the offending instruction goes away? > > > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > > index 11433f9..beac907 100644 > > --- a/arch/x86/include/asm/uaccess.h > > +++ b/arch/x86/include/asm/uaccess.h > > @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > > might_fault(); \ > > asm volatile("call __get_user_%P4" \ > > : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > > + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ > > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > > __builtin_expect(__ret_gu, 0); \ > > }) > > The generated code is basically the same, only that now the value from > the stack is stored in a register and written twice to RSP: > > ffffffff813676ba: 31 c0 xor %eax,%eax > ffffffff813676bc: 48 89 45 c8 mov %rax,-0x38(%rbp) > ffffffff813676c0: 45 31 ff xor %r15d,%r15d > ffffffff813676c3: 48 89 45 a8 mov %rax,-0x58(%rbp) > ... > ffffffff81367918: 48 8b 4d a8 mov -0x58(%rbp),%rcx > ffffffff8136791c: 48 89 cc mov %rcx,%rsp > ffffffff8136791f: 48 89 cc mov %rcx,%rsp > ffffffff81367922: e8 69 26 f1 ff callq ffffffff81279f90 <__get_user_4> LOL. Why corrupt the stack pointer with a single instruction (reading a zero from memory, no less) when you can instead do it with three instructions, including two duplicates? Anyway this seems like a clang bug to me. If I specify RSP as an input register then the compiler shouldn't overwrite it first. For that matter it has no reason to overwrite it if it's an output register either. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 19:25 ` Josh Poimboeuf @ 2017-07-13 19:38 ` Michael Davidson 2017-07-13 20:18 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Michael Davidson @ 2017-07-13 19:38 UTC (permalink / raw) To: Josh Poimboeuf Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, X86 ML, LKML, Douglas Anderson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Thu, Jul 13, 2017 at 12:25 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > Anyway this seems like a clang bug to me. If I specify RSP as an input > register then the compiler shouldn't overwrite it first. For that > matter it has no reason to overwrite it if it's an output register > either. > It's certainly a difference in behavior between clang and gcc. My question is whether this particular construct is really a "supported" (or, at least, reasonably guaranteed) way of forcing gcc to create a stack frame if none exists. or whether it is something that "just happens to work". If someone could explain the rationale behind *why* this works the way that it does on gcc that might help convince the clang people that this is actually a bug rather than just a piece of undefined behavior which gcc and clang happen to handle differently. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 19:38 ` Michael Davidson @ 2017-07-13 20:18 ` Josh Poimboeuf 0 siblings, 0 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-13 20:18 UTC (permalink / raw) To: Michael Davidson Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, X86 ML, LKML, Douglas Anderson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Thu, Jul 13, 2017 at 12:38:32PM -0700, Michael Davidson wrote: > On Thu, Jul 13, 2017 at 12:25 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > Anyway this seems like a clang bug to me. If I specify RSP as an input > > register then the compiler shouldn't overwrite it first. For that > > matter it has no reason to overwrite it if it's an output register > > either. > > > > It's certainly a difference in behavior between clang and gcc. > > My question is whether this particular construct is really a > "supported" (or, at least, reasonably guaranteed) way of forcing gcc > to create a stack frame if none exists. or whether it is something > that "just happens to work". > > If someone could explain the rationale behind *why* this works the way > that it does on gcc that might help convince the clang people that > this is actually a bug rather than just a piece of undefined behavior > which gcc and clang happen to handle differently. Disclaimer: I'm no compiler expert, and there are usually a variety of opinions about compiler undefined behavior. So it would probably be good for real compiler people to participate in the discussion. But I think there are two separate issues here. 1) The first issue is whether it's supported behavior to specify RSP as an output constraint in order to force GCC to create a stack frame. As far as I know, this is a quirk of GCC, and not really considered defined behavior. However, the idea was suggested by some GCC developers: https://gcc.gnu.org/ml/gcc/2015-07/msg00079.html So at least it seems to be endorsed by GCC to some degree. If you need details on why it works, that thread has the details. 2) The second issue is whether clang should corrupt RSP. I don't see a reason for clang to do that. IMO, when using a local register variable as an input or output to inline asm, the compiler should leave the contents of the register alone. FWIW, my reading of the GCC manual seems to support that: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 18:47 ` Matthias Kaehlcke 2017-07-13 19:25 ` Josh Poimboeuf @ 2017-07-13 20:20 ` Andrey Rybainin 2017-07-13 20:34 ` Josh Poimboeuf 2017-07-13 21:14 ` Matthias Kaehlcke 1 sibling, 2 replies; 32+ messages in thread From: Andrey Rybainin @ 2017-07-13 20:20 UTC (permalink / raw) To: Matthias Kaehlcke, Josh Poimboeuf Cc: Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On 07/13/2017 09:47 PM, Matthias Kaehlcke wrote: > Thanks for your analysis! > >> What happens if you try the below patch instead of the revert? Any >> chance the offending instruction goes away? >> >> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h >> index 11433f9..beac907 100644 >> --- a/arch/x86/include/asm/uaccess.h >> +++ b/arch/x86/include/asm/uaccess.h >> @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) >> might_fault(); \ >> asm volatile("call __get_user_%P4" \ >> : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ >> - : "0" (ptr), "i" (sizeof(*(ptr)))); \ >> + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ >> (x) = (__force __typeof__(*(ptr))) __val_gu; \ >> __builtin_expect(__ret_gu, 0); \ >> }) > > The generated code is basically the same, only that now the value from > the stack is stored in a register and written twice to RSP: > AFAIR clang works much better with global named registers. Could you check if the patch bellow helps? --- arch/x86/include/asm/uaccess.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index a059aac9e937..121204387978 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -157,15 +157,18 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) * Clang/LLVM cares about the size of the register, but still wants * the base register for something that ends up being a pair. */ + +register unsigned long __current_sp asm(_ASM_SP); + #define get_user(x, ptr) \ ({ \ int __ret_gu; \ register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ - register void *__sp asm(_ASM_SP); \ __chk_user_ptr(ptr); \ might_fault(); \ asm volatile("call __get_user_%P4" \ - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ + : "=a" (__ret_gu), "=r" (__val_gu), \ + "+r" (__current_sp) \ : "0" (ptr), "i" (sizeof(*(ptr)))); \ (x) = (__force __typeof__(*(ptr))) __val_gu; \ __builtin_expect(__ret_gu, 0); \ -- 2.13.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 20:20 ` Andrey Rybainin @ 2017-07-13 20:34 ` Josh Poimboeuf 2017-07-13 21:12 ` Matthias Kaehlcke 2017-07-13 21:14 ` Matthias Kaehlcke 1 sibling, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-13 20:34 UTC (permalink / raw) To: Andrey Rybainin Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Thu, Jul 13, 2017 at 11:20:04PM +0300, Andrey Rybainin wrote: > On 07/13/2017 09:47 PM, Matthias Kaehlcke wrote: > > > Thanks for your analysis! > > > >> What happens if you try the below patch instead of the revert? Any > >> chance the offending instruction goes away? > >> > >> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > >> index 11433f9..beac907 100644 > >> --- a/arch/x86/include/asm/uaccess.h > >> +++ b/arch/x86/include/asm/uaccess.h > >> @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > >> might_fault(); \ > >> asm volatile("call __get_user_%P4" \ > >> : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > >> - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > >> + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ > >> (x) = (__force __typeof__(*(ptr))) __val_gu; \ > >> __builtin_expect(__ret_gu, 0); \ > >> }) > > > > The generated code is basically the same, only that now the value from > > the stack is stored in a register and written twice to RSP: > > > > AFAIR clang works much better with global named registers. > Could you check if the patch bellow helps? And yet another one to try (clobbering sp) :-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 11433f9..21f0c39 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -166,12 +166,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) ({ \ int __ret_gu; \ register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ - register void *__sp asm(_ASM_SP); \ __chk_user_ptr(ptr); \ might_fault(); \ - asm volatile("call __get_user_%P4" \ - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ - : "0" (ptr), "i" (sizeof(*(ptr)))); \ + asm volatile("call __get_user_%P3" \ + : "=a" (__ret_gu), "=r" (__val_gu) \ + : "0" (ptr), "i" (sizeof(*(ptr))) \ + : "sp"); \ (x) = (__force __typeof__(*(ptr))) __val_gu; \ __builtin_expect(__ret_gu, 0); \ }) ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 20:34 ` Josh Poimboeuf @ 2017-07-13 21:12 ` Matthias Kaehlcke 2017-07-13 21:34 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-13 21:12 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Rybainin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer El Thu, Jul 13, 2017 at 03:34:16PM -0500 Josh Poimboeuf ha dit: > On Thu, Jul 13, 2017 at 11:20:04PM +0300, Andrey Rybainin wrote: > > On 07/13/2017 09:47 PM, Matthias Kaehlcke wrote: > > > > > Thanks for your analysis! > > > > > >> What happens if you try the below patch instead of the revert? Any > > >> chance the offending instruction goes away? > > >> > > >> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > > >> index 11433f9..beac907 100644 > > >> --- a/arch/x86/include/asm/uaccess.h > > >> +++ b/arch/x86/include/asm/uaccess.h > > >> @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > > >> might_fault(); \ > > >> asm volatile("call __get_user_%P4" \ > > >> : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > > >> - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > > >> + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ > > >> (x) = (__force __typeof__(*(ptr))) __val_gu; \ > > >> __builtin_expect(__ret_gu, 0); \ > > >> }) > > > > > > The generated code is basically the same, only that now the value from > > > the stack is stored in a register and written twice to RSP: > > > > > > > AFAIR clang works much better with global named registers. > > Could you check if the patch bellow helps? > > And yet another one to try (clobbering sp) :-) > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > index 11433f9..21f0c39 100644 > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -166,12 +166,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > ({ \ > int __ret_gu; \ > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > - register void *__sp asm(_ASM_SP); \ > __chk_user_ptr(ptr); \ > might_fault(); \ > - asm volatile("call __get_user_%P4" \ > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > + asm volatile("call __get_user_%P3" \ > + : "=a" (__ret_gu), "=r" (__val_gu) \ > + : "0" (ptr), "i" (sizeof(*(ptr))) \ > + : "sp"); \ > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > __builtin_expect(__ret_gu, 0); \ > }) This compiles with both gcc and clang, clang does not corrupt the stack pointer. I wouldn't be able to tell though if it forces a stack frame if it doesn't already exist, as the original patch intends. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 21:12 ` Matthias Kaehlcke @ 2017-07-13 21:34 ` Josh Poimboeuf 2017-07-13 21:57 ` Matthias Kaehlcke 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-13 21:34 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Andrey Rybainin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Thu, Jul 13, 2017 at 02:12:45PM -0700, Matthias Kaehlcke wrote: > El Thu, Jul 13, 2017 at 03:34:16PM -0500 Josh Poimboeuf ha dit: > > And yet another one to try (clobbering sp) :-) > > > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > > index 11433f9..21f0c39 100644 > > --- a/arch/x86/include/asm/uaccess.h > > +++ b/arch/x86/include/asm/uaccess.h > > @@ -166,12 +166,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > > ({ \ > > int __ret_gu; \ > > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > > - register void *__sp asm(_ASM_SP); \ > > __chk_user_ptr(ptr); \ > > might_fault(); \ > > - asm volatile("call __get_user_%P4" \ > > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > > + asm volatile("call __get_user_%P3" \ > > + : "=a" (__ret_gu), "=r" (__val_gu) \ > > + : "0" (ptr), "i" (sizeof(*(ptr))) \ > > + : "sp"); \ > > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > > __builtin_expect(__ret_gu, 0); \ > > }) > > This compiles with both gcc and clang, clang does not corrupt the > stack pointer. I wouldn't be able to tell though if it forces a stack > frame if it doesn't already exist, as the original patch intends. Whether it forces the stack frame on clang is a very minor issue compared to the double fault. That really only matters when you want to use CONFIG_STACK_VALIDATION to get 100% reliable stacktraces with frame pointers. And that feature is currently very GCC-specific. So you probably don't need to worry about that for now, at least until you want to do live patching with a clang-compiled kernel. IIRC, clobbering SP does at least force the stack frame on GCC, though I need to double check that. I can try to work up an official patch in the next week or so (need to do some testing first). -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 21:34 ` Josh Poimboeuf @ 2017-07-13 21:57 ` Matthias Kaehlcke 2017-07-19 17:46 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-13 21:57 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Rybainin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer El Thu, Jul 13, 2017 at 04:34:06PM -0500 Josh Poimboeuf ha dit: > On Thu, Jul 13, 2017 at 02:12:45PM -0700, Matthias Kaehlcke wrote: > > El Thu, Jul 13, 2017 at 03:34:16PM -0500 Josh Poimboeuf ha dit: > > > And yet another one to try (clobbering sp) :-) > > > > > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > > > index 11433f9..21f0c39 100644 > > > --- a/arch/x86/include/asm/uaccess.h > > > +++ b/arch/x86/include/asm/uaccess.h > > > @@ -166,12 +166,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > > > ({ \ > > > int __ret_gu; \ > > > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > > > - register void *__sp asm(_ASM_SP); \ > > > __chk_user_ptr(ptr); \ > > > might_fault(); \ > > > - asm volatile("call __get_user_%P4" \ > > > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > > > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > > > + asm volatile("call __get_user_%P3" \ > > > + : "=a" (__ret_gu), "=r" (__val_gu) \ > > > + : "0" (ptr), "i" (sizeof(*(ptr))) \ > > > + : "sp"); \ > > > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > > > __builtin_expect(__ret_gu, 0); \ > > > }) > > > > This compiles with both gcc and clang, clang does not corrupt the > > stack pointer. I wouldn't be able to tell though if it forces a stack > > frame if it doesn't already exist, as the original patch intends. > > Whether it forces the stack frame on clang is a very minor issue > compared to the double fault. I totally agree, I was mainly concerned about not breaking the solution that currently works with gcc. > That really only matters when you want to use > CONFIG_STACK_VALIDATION to get 100% reliable stacktraces with frame > pointers. And that feature is currently very GCC-specific. So you > probably don't need to worry about that for now, at least until you want > to do live patching with a clang-compiled kernel. Eventually I expect that there will be interest in live patching clang-compiled kernels, however at this stage it probably isn't an overly important feature. > IIRC, clobbering SP does at least force the stack frame on GCC, though I > need to double check that. I can try to work up an official patch in > the next week or so (need to do some testing first). Sounds great. Thanks again for looking into this and coming up with a solution! Matthias ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 21:57 ` Matthias Kaehlcke @ 2017-07-19 17:46 ` Josh Poimboeuf 2017-07-19 21:50 ` Matthias Kaehlcke 2017-07-20 10:01 ` Andrey Ryabinin 0 siblings, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-19 17:46 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Andrey Rybainin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer On Thu, Jul 13, 2017 at 02:57:04PM -0700, Matthias Kaehlcke wrote: > El Thu, Jul 13, 2017 at 04:34:06PM -0500 Josh Poimboeuf ha dit: > > > On Thu, Jul 13, 2017 at 02:12:45PM -0700, Matthias Kaehlcke wrote: > > > El Thu, Jul 13, 2017 at 03:34:16PM -0500 Josh Poimboeuf ha dit: > > > > And yet another one to try (clobbering sp) :-) > > > > > > > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > > > > index 11433f9..21f0c39 100644 > > > > --- a/arch/x86/include/asm/uaccess.h > > > > +++ b/arch/x86/include/asm/uaccess.h > > > > @@ -166,12 +166,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > > > > ({ \ > > > > int __ret_gu; \ > > > > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > > > > - register void *__sp asm(_ASM_SP); \ > > > > __chk_user_ptr(ptr); \ > > > > might_fault(); \ > > > > - asm volatile("call __get_user_%P4" \ > > > > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > > > > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > > > > + asm volatile("call __get_user_%P3" \ > > > > + : "=a" (__ret_gu), "=r" (__val_gu) \ > > > > + : "0" (ptr), "i" (sizeof(*(ptr))) \ > > > > + : "sp"); \ > > > > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > > > > __builtin_expect(__ret_gu, 0); \ > > > > }) > > > > > > This compiles with both gcc and clang, clang does not corrupt the > > > stack pointer. I wouldn't be able to tell though if it forces a stack > > > frame if it doesn't already exist, as the original patch intends. > > > > Whether it forces the stack frame on clang is a very minor issue > > compared to the double fault. > > I totally agree, I was mainly concerned about not breaking the > solution that currently works with gcc. > > > That really only matters when you want to use > > CONFIG_STACK_VALIDATION to get 100% reliable stacktraces with frame > > pointers. And that feature is currently very GCC-specific. So you > > probably don't need to worry about that for now, at least until you want > > to do live patching with a clang-compiled kernel. > > Eventually I expect that there will be interest in live patching > clang-compiled kernels, however at this stage it probably isn't an > overly important feature. > > > IIRC, clobbering SP does at least force the stack frame on GCC, though I > > need to double check that. I can try to work up an official patch in > > the next week or so (need to do some testing first). > > Sounds great. > > Thanks again for looking into this and coming up with a solution! After doing some testing, I don't think this approach is going to work after all. In addition to forcing the stack frame, it also causes GCC to add an unnecessary extra instruction to the epilogue of each affected function: lea -0x10(%rbp),%rsp We shouldn't be inserting extra instructions like that. I also don't think the other suggestion of turning the '__sp' register variable into a global variable is a very good solution either, as that just wastes memory for no reason. It would be nice if both compilers could agree on a way to support this. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-19 17:46 ` Josh Poimboeuf @ 2017-07-19 21:50 ` Matthias Kaehlcke 2017-07-20 10:01 ` Andrey Ryabinin 1 sibling, 0 replies; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-19 21:50 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Rybainin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, linux-kernel, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard.Rosenkranzer El Wed, Jul 19, 2017 at 12:46:31PM -0500 Josh Poimboeuf ha dit: > On Thu, Jul 13, 2017 at 02:57:04PM -0700, Matthias Kaehlcke wrote: > > El Thu, Jul 13, 2017 at 04:34:06PM -0500 Josh Poimboeuf ha dit: > > > > > On Thu, Jul 13, 2017 at 02:12:45PM -0700, Matthias Kaehlcke wrote: > > > > El Thu, Jul 13, 2017 at 03:34:16PM -0500 Josh Poimboeuf ha dit: > > > > > And yet another one to try (clobbering sp) :-) > > > > > > > > > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > > > > > index 11433f9..21f0c39 100644 > > > > > --- a/arch/x86/include/asm/uaccess.h > > > > > +++ b/arch/x86/include/asm/uaccess.h > > > > > @@ -166,12 +166,12 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > > > > > ({ \ > > > > > int __ret_gu; \ > > > > > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > > > > > - register void *__sp asm(_ASM_SP); \ > > > > > __chk_user_ptr(ptr); \ > > > > > might_fault(); \ > > > > > - asm volatile("call __get_user_%P4" \ > > > > > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > > > > > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > > > > > + asm volatile("call __get_user_%P3" \ > > > > > + : "=a" (__ret_gu), "=r" (__val_gu) \ > > > > > + : "0" (ptr), "i" (sizeof(*(ptr))) \ > > > > > + : "sp"); \ > > > > > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > > > > > __builtin_expect(__ret_gu, 0); \ > > > > > }) > > > > > > > > This compiles with both gcc and clang, clang does not corrupt the > > > > stack pointer. I wouldn't be able to tell though if it forces a stack > > > > frame if it doesn't already exist, as the original patch intends. > > > > > > Whether it forces the stack frame on clang is a very minor issue > > > compared to the double fault. > > > > I totally agree, I was mainly concerned about not breaking the > > solution that currently works with gcc. > > > > > That really only matters when you want to use > > > CONFIG_STACK_VALIDATION to get 100% reliable stacktraces with frame > > > pointers. And that feature is currently very GCC-specific. So you > > > probably don't need to worry about that for now, at least until you want > > > to do live patching with a clang-compiled kernel. > > > > Eventually I expect that there will be interest in live patching > > clang-compiled kernels, however at this stage it probably isn't an > > overly important feature. > > > > > IIRC, clobbering SP does at least force the stack frame on GCC, though I > > > need to double check that. I can try to work up an official patch in > > > the next week or so (need to do some testing first). > > > > Sounds great. > > > > Thanks again for looking into this and coming up with a solution! > > After doing some testing, I don't think this approach is going to work > after all. In addition to forcing the stack frame, it also causes GCC > to add an unnecessary extra instruction to the epilogue of each affected > function: > > lea -0x10(%rbp),%rsp > > We shouldn't be inserting extra instructions like that. I also don't > think the other suggestion of turning the '__sp' register variable into > a global variable is a very good solution either, as that just wastes > memory for no reason. > > It would be nice if both compilers could agree on a way to support this. Thanks for the update, Josh. I will ask compiler folks to bring this up with LLVM. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-19 17:46 ` Josh Poimboeuf 2017-07-19 21:50 ` Matthias Kaehlcke @ 2017-07-20 10:01 ` Andrey Ryabinin 2017-07-20 15:18 ` Josh Poimboeuf 1 sibling, 1 reply; 32+ messages in thread From: Andrey Ryabinin @ 2017-07-20 10:01 UTC (permalink / raw) To: Josh Poimboeuf Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer 2017-07-19 20:46 GMT+03:00 Josh Poimboeuf <jpoimboe@redhat.com>: > > After doing some testing, I don't think this approach is going to work > after all. In addition to forcing the stack frame, it also causes GCC > to add an unnecessary extra instruction to the epilogue of each affected > function: > > lea -0x10(%rbp),%rsp > > We shouldn't be inserting extra instructions like that. I also don't > think the other suggestion of turning the '__sp' register variable into > a global variable is a very good solution either, as that just wastes > memory for no reason. > Wastes memory? How is that wastes memory? That doesn't make any sense. > It would be nice if both compilers could agree on a way to support this. > > -- > Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-20 10:01 ` Andrey Ryabinin @ 2017-07-20 15:18 ` Josh Poimboeuf 2017-07-20 15:30 ` Andrey Ryabinin 0 siblings, 1 reply; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-20 15:18 UTC (permalink / raw) To: Andrey Ryabinin Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On Thu, Jul 20, 2017 at 01:01:39PM +0300, Andrey Ryabinin wrote: > 2017-07-19 20:46 GMT+03:00 Josh Poimboeuf <jpoimboe@redhat.com>: > > > > > After doing some testing, I don't think this approach is going to work > > after all. In addition to forcing the stack frame, it also causes GCC > > to add an unnecessary extra instruction to the epilogue of each affected > > function: > > > > lea -0x10(%rbp),%rsp > > > > We shouldn't be inserting extra instructions like that. I also don't > > think the other suggestion of turning the '__sp' register variable into > > a global variable is a very good solution either, as that just wastes > > memory for no reason. > > > > Wastes memory? How is that wastes memory? That doesn't make any sense. Yes, you're right, that doesn't make any sense. I think I was trying to wrap my head around what it means to have a global register variable -- in a header file no less -- and why clang would treat that differently than a local register variable. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-20 15:18 ` Josh Poimboeuf @ 2017-07-20 15:30 ` Andrey Ryabinin 2017-07-20 20:56 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Andrey Ryabinin @ 2017-07-20 15:30 UTC (permalink / raw) To: Josh Poimboeuf Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer 2017-07-20 18:18 GMT+03:00 Josh Poimboeuf <jpoimboe@redhat.com>: > On Thu, Jul 20, 2017 at 01:01:39PM +0300, Andrey Ryabinin wrote: >> 2017-07-19 20:46 GMT+03:00 Josh Poimboeuf <jpoimboe@redhat.com>: >> >> > >> > After doing some testing, I don't think this approach is going to work >> > after all. In addition to forcing the stack frame, it also causes GCC >> > to add an unnecessary extra instruction to the epilogue of each affected >> > function: >> > >> > lea -0x10(%rbp),%rsp >> > >> > We shouldn't be inserting extra instructions like that. I also don't >> > think the other suggestion of turning the '__sp' register variable into >> > a global variable is a very good solution either, as that just wastes >> > memory for no reason. >> > >> >> Wastes memory? How is that wastes memory? That doesn't make any sense. > > Yes, you're right, that doesn't make any sense. I think I was trying to > wrap my head around what it means to have a global register variable -- > in a header file no less -- and why clang would treat that differently > than a local register variable. > FWIW bellow is my understanding of what's going on. It seems clang treats local named register almost the same as ordinary local variables. The only difference is that before reading the register variable clang puts variable's value into the specified register. So clang just assigns stack slot for the variable __sp where it's going to keep variable's value. But since __sp is unitialized (we haven't assign anything to it), the value of the __sp is some garbage from stack. inline asm specifies __sp as input, so clang assumes that it have to load __sp into 'rsp' because inline asm is going to use it. And it just loads garbage from stack into 'rsp' In fact, such behavior (I mean storing the value on stack and loading into reg before the use) is very useful. Clang's behavior allows to keep the value assigned to the call-clobbered register across the function calls. Unlike clang, gcc assigns value to the register right away and doesn't store the value anywhere else. So if the reg is call clobbered register you have to be absolutely sure that there is no subsequent function call that might clobber the register. E.g. see some real examples https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: Fix improper register assignment"). These bugs shouldn't happen with clang. But the global named register works slightly differently in clang. For the global, the value is just the value of the register itself, whatever it is. Read/write from global named register is just like direct read/write to the register ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-20 15:30 ` Andrey Ryabinin @ 2017-07-20 20:56 ` Josh Poimboeuf 2017-07-21 9:13 ` Andrey Ryabinin 2017-07-29 0:38 ` Matthias Kaehlcke 0 siblings, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-20 20:56 UTC (permalink / raw) To: Andrey Ryabinin Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote: > FWIW bellow is my understanding of what's going on. > > It seems clang treats local named register almost the same as ordinary > local variables. > The only difference is that before reading the register variable clang > puts variable's value into the specified register. > > So clang just assigns stack slot for the variable __sp where it's > going to keep variable's value. > But since __sp is unitialized (we haven't assign anything to it), the > value of the __sp is some garbage from stack. > inline asm specifies __sp as input, so clang assumes that it have to > load __sp into 'rsp' because inline asm is going to use > it. And it just loads garbage from stack into 'rsp' > > In fact, such behavior (I mean storing the value on stack and loading > into reg before the use) is very useful. > Clang's behavior allows to keep the value assigned to the > call-clobbered register across the function calls. > > Unlike clang, gcc assigns value to the register right away and doesn't > store the value anywhere else. So if the reg is > call clobbered register you have to be absolutely sure that there is > no subsequent function call that might clobber the register. > > E.g. see some real examples > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: > Fix improper register assignment"). > These bugs shouldn't happen with clang. > > But the global named register works slightly differently in clang. For > the global, the value is just the value of the register itself, > whatever it is. Read/write from global named register is just like > direct read/write to the register Thanks, that clears up a lot of the confusion for me. Still, unfortunately, I don't think that's going to work for GCC. Changing the '__sp' register variable to global in the header file causes it to make a *bunch* of changes across the kernel, even in functions which don't do inline asm. It seems to be disabling some optimizations across the board. I do have another idea, which is to replace all uses of asm(" ... call foo ... " : outputs : inputs : clobbers); with a new ASM_CALL macro: ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers); Then the compiler differences can be abstracted out, with GCC adding "sp" as an output constraint and clang doing nothing (for now). -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-20 20:56 ` Josh Poimboeuf @ 2017-07-21 9:13 ` Andrey Ryabinin 2017-07-21 13:24 ` Josh Poimboeuf 2017-07-29 0:38 ` Matthias Kaehlcke 1 sibling, 1 reply; 32+ messages in thread From: Andrey Ryabinin @ 2017-07-21 9:13 UTC (permalink / raw) To: Josh Poimboeuf Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On 07/20/2017 11:56 PM, Josh Poimboeuf wrote: > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote: >> FWIW bellow is my understanding of what's going on. >> >> It seems clang treats local named register almost the same as ordinary >> local variables. >> The only difference is that before reading the register variable clang >> puts variable's value into the specified register. >> >> So clang just assigns stack slot for the variable __sp where it's >> going to keep variable's value. >> But since __sp is unitialized (we haven't assign anything to it), the >> value of the __sp is some garbage from stack. >> inline asm specifies __sp as input, so clang assumes that it have to >> load __sp into 'rsp' because inline asm is going to use >> it. And it just loads garbage from stack into 'rsp' >> >> In fact, such behavior (I mean storing the value on stack and loading >> into reg before the use) is very useful. >> Clang's behavior allows to keep the value assigned to the >> call-clobbered register across the function calls. >> >> Unlike clang, gcc assigns value to the register right away and doesn't >> store the value anywhere else. So if the reg is >> call clobbered register you have to be absolutely sure that there is >> no subsequent function call that might clobber the register. >> >> E.g. see some real examples >> https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: >> Fix improper register assignment"). >> These bugs shouldn't happen with clang. >> >> But the global named register works slightly differently in clang. For >> the global, the value is just the value of the register itself, >> whatever it is. Read/write from global named register is just like >> direct read/write to the register > > Thanks, that clears up a lot of the confusion for me. > > Still, unfortunately, I don't think that's going to work for GCC. > Changing the '__sp' register variable to global in the header file > causes it to make a *bunch* of changes across the kernel, even in > functions which don't do inline asm. It seems to be disabling some > optimizations across the board. All I see is just bunch of reordering of independent instructions, like this: -ffffffff81012760: 5b pop %rbx -ffffffff81012761: 31 c0 xor %eax,%eax +ffffffff81012760: 31 c0 xor %eax,%eax +ffffffff81012762: 5b pop %rbx -ffffffff810c29ae: 48 83 c4 28 add $0x28,%rsp -ffffffff810c29b2: 89 d8 mov %ebx,%eax +ffffffff810c29ae: 89 d8 mov %ebx,%eax +ffffffff810c29b0: 48 83 c4 28 add $0x28,%rsp I haven't noticed any single bad/harmful change. The size of .text remained the same. And btw, arm/arm64 already use global current_stack_pointer just fine. > I do have another idea, which is to replace all uses of > > asm(" ... call foo ... " : outputs : inputs : clobbers); > > with a new ASM_CALL macro: > > ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers); > > Then the compiler differences can be abstracted out, with GCC adding > "sp" as an output constraint and clang doing nothing (for now). > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-21 9:13 ` Andrey Ryabinin @ 2017-07-21 13:24 ` Josh Poimboeuf 0 siblings, 0 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-21 13:24 UTC (permalink / raw) To: Andrey Ryabinin Cc: Matthias Kaehlcke, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On Fri, Jul 21, 2017 at 12:13:31PM +0300, Andrey Ryabinin wrote: > > Still, unfortunately, I don't think that's going to work for GCC. > > Changing the '__sp' register variable to global in the header file > > causes it to make a *bunch* of changes across the kernel, even in > > functions which don't do inline asm. It seems to be disabling some > > optimizations across the board. > > All I see is just bunch of reordering of independent instructions, like this: > > -ffffffff81012760: 5b pop %rbx > -ffffffff81012761: 31 c0 xor %eax,%eax > +ffffffff81012760: 31 c0 xor %eax,%eax > +ffffffff81012762: 5b pop %rbx > > -ffffffff810c29ae: 48 83 c4 28 add $0x28,%rsp > -ffffffff810c29b2: 89 d8 mov %ebx,%eax > +ffffffff810c29ae: 89 d8 mov %ebx,%eax > +ffffffff810c29b0: 48 83 c4 28 add $0x28,%rsp > > I haven't noticed any single bad/harmful change. The size of .text remained the same. I compiled with -ffunction-sections to make the comparisons easier. The reordering is much more extreme than your example. (This is with GCC 7, btw). And it's not just reordering of instructions. It's control flow changes as well. Also, the text size grew a little: text data bss dec hex filename 10630602 8295074 16461824 35387500 21bf86c vmlinux.before 10634013 8295074 16461824 35390911 21c05bf vmlinux.after A small two-line change, which is supposed to be a noop, or at least should only affect a small number of functions, but which instead affects optimization decisions across the entire kernel, is actively harmful IMO. > And btw, arm/arm64 already use global current_stack_pointer just fine. I wonder if they looked for the impact. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-20 20:56 ` Josh Poimboeuf 2017-07-21 9:13 ` Andrey Ryabinin @ 2017-07-29 0:38 ` Matthias Kaehlcke 2017-07-29 0:55 ` Josh Poimboeuf 1 sibling, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-29 0:38 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer El Thu, Jul 20, 2017 at 03:56:52PM -0500 Josh Poimboeuf ha dit: > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote: > > FWIW bellow is my understanding of what's going on. > > > > It seems clang treats local named register almost the same as ordinary > > local variables. > > The only difference is that before reading the register variable clang > > puts variable's value into the specified register. > > > > So clang just assigns stack slot for the variable __sp where it's > > going to keep variable's value. > > But since __sp is unitialized (we haven't assign anything to it), the > > value of the __sp is some garbage from stack. > > inline asm specifies __sp as input, so clang assumes that it have to > > load __sp into 'rsp' because inline asm is going to use > > it. And it just loads garbage from stack into 'rsp' > > > > In fact, such behavior (I mean storing the value on stack and loading > > into reg before the use) is very useful. > > Clang's behavior allows to keep the value assigned to the > > call-clobbered register across the function calls. > > > > Unlike clang, gcc assigns value to the register right away and doesn't > > store the value anywhere else. So if the reg is > > call clobbered register you have to be absolutely sure that there is > > no subsequent function call that might clobber the register. > > > > E.g. see some real examples > > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: > > Fix improper register assignment"). > > These bugs shouldn't happen with clang. > > > > But the global named register works slightly differently in clang. For > > the global, the value is just the value of the register itself, > > whatever it is. Read/write from global named register is just like > > direct read/write to the register > > Thanks, that clears up a lot of the confusion for me. > > Still, unfortunately, I don't think that's going to work for GCC. > Changing the '__sp' register variable to global in the header file > causes it to make a *bunch* of changes across the kernel, even in > functions which don't do inline asm. It seems to be disabling some > optimizations across the board. > > I do have another idea, which is to replace all uses of > > asm(" ... call foo ... " : outputs : inputs : clobbers); > > with a new ASM_CALL macro: > > ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers); > > Then the compiler differences can be abstracted out, with GCC adding > "sp" as an output constraint and clang doing nothing (for now). The idea sounds interesting, however I see two issues with ASM_CALL(): Inline assembly expects the individual elements of outputs, inputs and clobbers to be comma separated, and so does the macro for it's parameters. The assembler template can refer to the position of output and input operands, adding "sp" as output changes the positions of the inputs wrt to the clang version. Not sure how to move forward from here. Not even using an ugly #ifdef seems to be a halfway reasonable solution, since get_user() isn't the only place using this construct and #ifdefs would result in highly redundant macros in multiple places. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-29 0:38 ` Matthias Kaehlcke @ 2017-07-29 0:55 ` Josh Poimboeuf 2017-07-29 0:58 ` Josh Poimboeuf 2017-07-29 1:06 ` Matthias Kaehlcke 0 siblings, 2 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-29 0:55 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Andrey Ryabinin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On Fri, Jul 28, 2017 at 05:38:52PM -0700, Matthias Kaehlcke wrote: > El Thu, Jul 20, 2017 at 03:56:52PM -0500 Josh Poimboeuf ha dit: > > > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote: > > > FWIW bellow is my understanding of what's going on. > > > > > > It seems clang treats local named register almost the same as ordinary > > > local variables. > > > The only difference is that before reading the register variable clang > > > puts variable's value into the specified register. > > > > > > So clang just assigns stack slot for the variable __sp where it's > > > going to keep variable's value. > > > But since __sp is unitialized (we haven't assign anything to it), the > > > value of the __sp is some garbage from stack. > > > inline asm specifies __sp as input, so clang assumes that it have to > > > load __sp into 'rsp' because inline asm is going to use > > > it. And it just loads garbage from stack into 'rsp' > > > > > > In fact, such behavior (I mean storing the value on stack and loading > > > into reg before the use) is very useful. > > > Clang's behavior allows to keep the value assigned to the > > > call-clobbered register across the function calls. > > > > > > Unlike clang, gcc assigns value to the register right away and doesn't > > > store the value anywhere else. So if the reg is > > > call clobbered register you have to be absolutely sure that there is > > > no subsequent function call that might clobber the register. > > > > > > E.g. see some real examples > > > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: > > > Fix improper register assignment"). > > > These bugs shouldn't happen with clang. > > > > > > But the global named register works slightly differently in clang. For > > > the global, the value is just the value of the register itself, > > > whatever it is. Read/write from global named register is just like > > > direct read/write to the register > > > > Thanks, that clears up a lot of the confusion for me. > > > > Still, unfortunately, I don't think that's going to work for GCC. > > Changing the '__sp' register variable to global in the header file > > causes it to make a *bunch* of changes across the kernel, even in > > functions which don't do inline asm. It seems to be disabling some > > optimizations across the board. > > > > I do have another idea, which is to replace all uses of > > > > asm(" ... call foo ... " : outputs : inputs : clobbers); > > > > with a new ASM_CALL macro: > > > > ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers); > > > > Then the compiler differences can be abstracted out, with GCC adding > > "sp" as an output constraint and clang doing nothing (for now). > > The idea sounds interesting, however I see two issues with ASM_CALL(): > > Inline assembly expects the individual elements of outputs, inputs and > clobbers to be comma separated, and so does the macro for it's > parameters. I think this isn't a problem, see the (untested and unfinished) patch below for an idea of how the arguments can be handled. > The assembler template can refer to the position of output and input > operands, adding "sp" as output changes the positions of the inputs > wrt to the clang version. True, though I think we could convert all ASM_CALL users to use named operands instead of the (bug-prone) numbered ones. It would be an improvement regardless. > Not sure how to move forward from here. Not even using an ugly #ifdef > seems to be a halfway reasonable solution, since get_user() isn't the > only place using this construct and #ifdefs would result in highly > redundant macros in multiple places. I still think ASM_CALL might work. The below patch is what I came up with last week before I got pulled into some other work. I'll be out on vacation next week but I hope to finish the patch after that. diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1b020381ab38..5da4bcbeebab 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -216,14 +216,14 @@ static inline int alternatives_text_reserved(void *start, void *end) * Otherwise, old function is used. */ #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2, \ - output, input...) \ + output, input, clobbers...) \ { \ - register void *__sp asm(_ASM_SP); \ - asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\ - "call %P[new2]", feature2) \ - : output, "+r" (__sp) \ - : [old] "i" (oldfunc), [new1] "i" (newfunc1), \ - [new2] "i" (newfunc2), ## input); \ + ASM_CALL(ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1, \ + "call %P[new2]", feature2), \ + ARGS(output), \ + ARGS([old] "i" (oldfunc), [new1] "i" (newfunc1), \ + [new2] "i" (newfunc2), ARGS(input)), \ + ## clobbers); \ } /* diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h index b4a0d43248cf..21adcc8e739f 100644 --- a/arch/x86/include/asm/page_64.h +++ b/arch/x86/include/asm/page_64.h @@ -45,8 +45,8 @@ static inline void clear_page(void *page) clear_page_rep, X86_FEATURE_REP_GOOD, clear_page_erms, X86_FEATURE_ERMS, "=D" (page), - "0" (page) - : "memory", "rax", "rcx"); + "0" (page), + ARGS("memory", "rax", "rcx")); } void copy_page(void *to, void *from); diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index cb976bab6299..2a585b799a3e 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -471,8 +471,7 @@ int paravirt_disable_iospace(void); */ #ifdef CONFIG_X86_32 #define PVOP_VCALL_ARGS \ - unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx; \ - register void *__sp asm("esp") + unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx; #define PVOP_CALL_ARGS PVOP_VCALL_ARGS #define PVOP_CALL_ARG1(x) "a" ((unsigned long)(x)) @@ -492,8 +491,7 @@ int paravirt_disable_iospace(void); /* [re]ax isn't an arg, but the return val */ #define PVOP_VCALL_ARGS \ unsigned long __edi = __edi, __esi = __esi, \ - __edx = __edx, __ecx = __ecx, __eax = __eax; \ - register void *__sp asm("rsp") + __edx = __edx, __ecx = __ecx, __eax = __eax; #define PVOP_CALL_ARGS PVOP_VCALL_ARGS #define PVOP_CALL_ARG1(x) "D" ((unsigned long)(x)) @@ -541,24 +539,24 @@ int paravirt_disable_iospace(void); /* This is 32-bit specific, but is okay in 64-bit */ \ /* since this condition will never hold */ \ if (sizeof(rettype) > sizeof(unsigned long)) { \ - asm volatile(pre \ - paravirt_alt(PARAVIRT_CALL) \ - post \ - : call_clbr, "+r" (__sp) \ - : paravirt_type(op), \ - paravirt_clobber(clbr), \ - ##__VA_ARGS__ \ - : "memory", "cc" extra_clbr); \ + ASM_CALL(pre \ + paravirt_alt(PARAVIRT_CALL) \ + post, \ + ARGS(call_clbr), \ + ARGS(paravirt_type(op), \ + paravirt_clobber(clbr), \ + ##__VA_ARGS__), \ + ARGS("memory", "cc" extra_clbr)); \ __ret = (rettype)((((u64)__edx) << 32) | __eax); \ } else { \ - asm volatile(pre \ - paravirt_alt(PARAVIRT_CALL) \ - post \ - : call_clbr, "+r" (__sp) \ - : paravirt_type(op), \ - paravirt_clobber(clbr), \ - ##__VA_ARGS__ \ - : "memory", "cc" extra_clbr); \ + ASM_CALL(pre \ + paravirt_alt(PARAVIRT_CALL) \ + post, \ + ARGS(call_clbr), \ + ARGS(paravirt_type(op), \ + paravirt_clobber(clbr), \ + ##__VA_ARGS__), \ + ARGS("memory", "cc" extra_clbr)); \ __ret = (rettype)(__eax & PVOP_RETMASK(rettype)); \ } \ __ret; \ @@ -578,14 +576,14 @@ int paravirt_disable_iospace(void); ({ \ PVOP_VCALL_ARGS; \ PVOP_TEST_NULL(op); \ - asm volatile(pre \ - paravirt_alt(PARAVIRT_CALL) \ - post \ - : call_clbr, "+r" (__sp) \ - : paravirt_type(op), \ - paravirt_clobber(clbr), \ - ##__VA_ARGS__ \ - : "memory", "cc" extra_clbr); \ + ASM_CALL(pre \ + paravirt_alt(PARAVIRT_CALL) \ + post, \ + ARGS(call_clbr), \ + ARGS(paravirt_type(op), \ + paravirt_clobber(clbr), \ + ##__VA_ARGS__), \ + ARGS("memory", "cc" extra_clbr)); \ }) #define __PVOP_VCALL(op, pre, post, ...) \ diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h index ec1f3c651150..19a034cbde37 100644 --- a/arch/x86/include/asm/preempt.h +++ b/arch/x86/include/asm/preempt.h @@ -102,16 +102,14 @@ static __always_inline bool should_resched(int preempt_offset) extern asmlinkage void ___preempt_schedule(void); # define __preempt_schedule() \ ({ \ - register void *__sp asm(_ASM_SP); \ - asm volatile ("call ___preempt_schedule" : "+r"(__sp)); \ + ASM_CALL("call ___preempt_schedule",,); \ }) extern asmlinkage void preempt_schedule(void); extern asmlinkage void ___preempt_schedule_notrace(void); # define __preempt_schedule_notrace() \ ({ \ - register void *__sp asm(_ASM_SP); \ - asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp)); \ + ASM_CALL("call ___preempt_schedule_notrace",,); \ }) extern asmlinkage void preempt_schedule_notrace(void); #endif diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 550bc4ab0df4..2bbfb777c8bb 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -677,20 +677,19 @@ static inline void sync_core(void) * Like all of Linux's memory ordering operations, this is a * compiler barrier as well. */ - register void *__sp asm(_ASM_SP); #ifdef CONFIG_X86_32 - asm volatile ( + ASM_CALL( "pushfl\n\t" "pushl %%cs\n\t" "pushl $1f\n\t" "iret\n\t" - "1:" - : "+r" (__sp) : : "memory"); + "1:", + , , "memory"); #else unsigned int tmp; - asm volatile ( + ASM_CALL( UNWIND_HINT_SAVE "mov %%ss, %0\n\t" "pushq %q0\n\t" @@ -702,8 +701,8 @@ static inline void sync_core(void) "pushq $1f\n\t" "iretq\n\t" UNWIND_HINT_RESTORE - "1:" - : "=&r" (tmp), "+r" (__sp) : : "cc", "memory"); + "1:", + "=&r" (tmp), , ARGS("cc", "memory")); #endif } diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h index a34e0d4b957d..9d20f4fb5d7c 100644 --- a/arch/x86/include/asm/rwsem.h +++ b/arch/x86/include/asm/rwsem.h @@ -99,25 +99,24 @@ static inline bool __down_read_trylock(struct rw_semaphore *sem) /* * lock for writing */ -#define ____down_write(sem, slow_path) \ -({ \ - long tmp; \ - struct rw_semaphore* ret; \ - register void *__sp asm(_ASM_SP); \ - \ - asm volatile("# beginning down_write\n\t" \ - LOCK_PREFIX " xadd %1,(%4)\n\t" \ - /* adds 0xffff0001, returns the old value */ \ - " test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \ - /* was the active mask 0 before? */\ - " jz 1f\n" \ - " call " slow_path "\n" \ - "1:\n" \ - "# ending down_write" \ - : "+m" (sem->count), "=d" (tmp), "=a" (ret), "+r" (__sp) \ - : "a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS) \ - : "memory", "cc"); \ - ret; \ +#define ____down_write(sem, slow_path) \ +({ \ + long tmp; \ + struct rw_semaphore* ret; \ + \ + ASM_CALL("# beginning down_write\n\t" \ + LOCK_PREFIX " xadd %1,(%3)\n\t" \ + /* adds 0xffff0001, returns the old value */ \ + " test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \ + /* was the active mask 0 before? */ \ + " jz 1f\n" \ + " call " slow_path "\n" \ + "1:\n" \ + "# ending down_write", \ + ARGS("+m" (sem->count), "=d" (tmp), "=a" (ret)), \ + ARGS("a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS)), \ + ARGS("memory", "cc")); \ + ret; \ }) static inline void __down_write(struct rw_semaphore *sem) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 184eb9894dae..c7be076ee703 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -166,12 +166,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) ({ \ int __ret_gu; \ register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ - register void *__sp asm(_ASM_SP); \ __chk_user_ptr(ptr); \ might_fault(); \ - asm volatile("call __get_user_%P4" \ - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ - : "0" (ptr), "i" (sizeof(*(ptr)))); \ + ASM_CALL("call __get_user_%P3", \ + ARGS("=a" (__ret_gu), "=r" (__val_gu)), \ + ARGS("0" (ptr), "i" (sizeof(*(ptr))))); \ (x) = (__force __typeof__(*(ptr))) __val_gu; \ __builtin_expect(__ret_gu, 0); \ }) diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index b16f6a1d8b26..8b63e2cb1eaf 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -38,10 +38,9 @@ copy_user_generic(void *to, const void *from, unsigned len) X86_FEATURE_REP_GOOD, copy_user_enhanced_fast_string, X86_FEATURE_ERMS, - ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from), - "=d" (len)), - "1" (to), "2" (from), "3" (len) - : "memory", "rcx", "r8", "r9", "r10", "r11"); + ARGS("=a" (ret), "=D" (to), "=S" (from), "=d" (len)), + ARGS("1" (to), "2" (from), "3" (len)), + ARGS("memory", "rcx", "r8", "r9", "r10", "r11")); return ret; } diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h index 11071fcd630e..2feddeeec1d1 100644 --- a/arch/x86/include/asm/xen/hypercall.h +++ b/arch/x86/include/asm/xen/hypercall.h @@ -114,9 +114,8 @@ extern struct { char _entry[32]; } hypercall_page[]; register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \ register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \ register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \ - register void *__sp asm(_ASM_SP); -#define __HYPERCALL_0PARAM "=r" (__res), "+r" (__sp) +#define __HYPERCALL_0PARAM "=r" (__res) #define __HYPERCALL_1PARAM __HYPERCALL_0PARAM, "+r" (__arg1) #define __HYPERCALL_2PARAM __HYPERCALL_1PARAM, "+r" (__arg2) #define __HYPERCALL_3PARAM __HYPERCALL_2PARAM, "+r" (__arg3) @@ -146,10 +145,10 @@ extern struct { char _entry[32]; } hypercall_page[]; ({ \ __HYPERCALL_DECLS; \ __HYPERCALL_0ARG(); \ - asm volatile (__HYPERCALL \ - : __HYPERCALL_0PARAM \ - : __HYPERCALL_ENTRY(name) \ - : __HYPERCALL_CLOBBER0); \ + ASM_CALL(__HYPERCALL, \ + __HYPERCALL_0PARAM, \ + __HYPERCALL_ENTRY(name), \ + __HYPERCALL_CLOBBER0); \ (type)__res; \ }) @@ -157,10 +156,10 @@ extern struct { char _entry[32]; } hypercall_page[]; ({ \ __HYPERCALL_DECLS; \ __HYPERCALL_1ARG(a1); \ - asm volatile (__HYPERCALL \ - : __HYPERCALL_1PARAM \ - : __HYPERCALL_ENTRY(name) \ - : __HYPERCALL_CLOBBER1); \ + ASM_CALL(__HYPERCALL, \ + __HYPERCALL_1PARAM, \ + __HYPERCALL_ENTRY(name), \ + __HYPERCALL_CLOBBER1); \ (type)__res; \ }) @@ -168,10 +167,10 @@ extern struct { char _entry[32]; } hypercall_page[]; ({ \ __HYPERCALL_DECLS; \ __HYPERCALL_2ARG(a1, a2); \ - asm volatile (__HYPERCALL \ - : __HYPERCALL_2PARAM \ - : __HYPERCALL_ENTRY(name) \ - : __HYPERCALL_CLOBBER2); \ + ASM_CALL(__HYPERCALL, \ + __HYPERCALL_2PARAM, \ + __HYPERCALL_ENTRY(name), \ + __HYPERCALL_CLOBBER2); \ (type)__res; \ }) @@ -179,10 +178,10 @@ extern struct { char _entry[32]; } hypercall_page[]; ({ \ __HYPERCALL_DECLS; \ __HYPERCALL_3ARG(a1, a2, a3); \ - asm volatile (__HYPERCALL \ - : __HYPERCALL_3PARAM \ - : __HYPERCALL_ENTRY(name) \ - : __HYPERCALL_CLOBBER3); \ + ASM_CALL(__HYPERCALL, \ + __HYPERCALL_3PARAM, \ + __HYPERCALL_ENTRY(name), \ + __HYPERCALL_CLOBBER3); \ (type)__res; \ }) @@ -190,10 +189,10 @@ extern struct { char _entry[32]; } hypercall_page[]; ({ \ __HYPERCALL_DECLS; \ __HYPERCALL_4ARG(a1, a2, a3, a4); \ - asm volatile (__HYPERCALL \ - : __HYPERCALL_4PARAM \ - : __HYPERCALL_ENTRY(name) \ - : __HYPERCALL_CLOBBER4); \ + ASM_CALL(__HYPERCALL, \ + __HYPERCALL_4PARAM, \ + __HYPERCALL_ENTRY(name), \ + __HYPERCALL_CLOBBER4); \ (type)__res; \ }) @@ -201,10 +200,10 @@ extern struct { char _entry[32]; } hypercall_page[]; ({ \ __HYPERCALL_DECLS; \ __HYPERCALL_5ARG(a1, a2, a3, a4, a5); \ - asm volatile (__HYPERCALL \ - : __HYPERCALL_5PARAM \ - : __HYPERCALL_ENTRY(name) \ - : __HYPERCALL_CLOBBER5); \ + ASM_CALL(__HYPERCALL, \ + __HYPERCALL_5PARAM, \ + __HYPERCALL_ENTRY(name), \ + __HYPERCALL_CLOBBER5); \ (type)__res; \ }) @@ -218,10 +217,10 @@ privcmd_call(unsigned call, __HYPERCALL_5ARG(a1, a2, a3, a4, a5); stac(); - asm volatile("call *%[call]" - : __HYPERCALL_5PARAM - : [call] "a" (&hypercall_page[call]) - : __HYPERCALL_CLOBBER5); + ASM_CALL("call *%[call]", + __HYPERCALL_5PARAM, + [call] "a" (&hypercall_page[call]), + __HYPERCALL_CLOBBER5); clac(); return (long)__res; diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index fb0055953fbc..4d0d326029a7 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -5284,16 +5284,15 @@ static void fetch_possible_mmx_operand(struct x86_emulate_ctxt *ctxt, static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *)) { - register void *__sp asm(_ASM_SP); ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF; if (!(ctxt->d & ByteOp)) fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE; - asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n" - : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags), - [fastop]"+S"(fop), "+r"(__sp) - : "c"(ctxt->src2.val)); + ASM_CALL("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n", + ARGS("+a"(ctxt->dst.val), "+d"(ctxt->src.val), + [flags]"+D"(flags), [fastop]"+S"(fop)), + ARGS("c"(ctxt->src2.val))); ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); if (!fop) /* exception is returned in fop variable */ diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index ffd469ecad57..d42806ad3c9c 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -8671,7 +8671,6 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx) static void vmx_handle_external_intr(struct kvm_vcpu *vcpu) { u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO); - register void *__sp asm(_ASM_SP); if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) == (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) { @@ -8686,7 +8685,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu) vector = exit_intr_info & INTR_INFO_VECTOR_MASK; desc = (gate_desc *)vmx->host_idt_base + vector; entry = gate_offset(*desc); - asm volatile( + ASM_CALL( #ifdef CONFIG_X86_64 "mov %%" _ASM_SP ", %[sp]\n\t" "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t" @@ -8695,16 +8694,14 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu) #endif "pushf\n\t" __ASM_SIZE(push) " $%c[cs]\n\t" - "call *%[entry]\n\t" - : + "call *%[entry]\n\t", #ifdef CONFIG_X86_64 - [sp]"=&r"(tmp), + [sp]"=&r"(tmp) #endif - "+r"(__sp) - : - [entry]"r"(entry), - [ss]"i"(__KERNEL_DS), - [cs]"i"(__KERNEL_CS) + , + ARGS([entry]"r"(entry), + [ss]"i"(__KERNEL_DS), + [cs]"i"(__KERNEL_CS)) ); } } diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 2a1fa10c6a98..ca642ac3a390 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -802,7 +802,6 @@ no_context(struct pt_regs *regs, unsigned long error_code, if (is_vmalloc_addr((void *)address) && (((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) || address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) { - register void *__sp asm("rsp"); unsigned long stack = this_cpu_read(orig_ist.ist[DOUBLEFAULT_STACK]) - sizeof(void *); /* * We're likely to be running with very little stack space @@ -814,13 +813,12 @@ no_context(struct pt_regs *regs, unsigned long error_code, * and then double-fault, though, because we're likely to * break the console driver and lose most of the stack dump. */ - asm volatile ("movq %[stack], %%rsp\n\t" - "call handle_stack_overflow\n\t" - "1: jmp 1b" - : "+r" (__sp) - : "D" ("kernel stack overflow (page fault)"), - "S" (regs), "d" (address), - [stack] "rm" (stack)); + ASM_CALL("movq %[stack], %%rsp\n\t" + "call handle_stack_overflow\n\t" + "1: jmp 1b", + , + ARGS("D" ("kernel stack overflow (page fault)"), + "S" (regs), "d" (address), [stack] "rm" (stack))); unreachable(); } #endif diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 3f8c88e29a46..3a57f32a0bfd 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -589,4 +589,18 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s (_________p1); \ }) +#define ARGS(args...) args + +#ifdef CONFIG_FRAME_POINTER + +#define ASM_CALL(str, outputs, inputs, clobbers...) \ + asm volatile(str : outputs : inputs : "sp", ## clobbers) + +#else + +#define ASM_CALL(str, outputs, inputs, clobbers...) \ + asm volatile(str : outputs : inputs : clobbers); + +#endif + #endif /* __LINUX_COMPILER_H */ diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt index 6a1af43862df..766a00ebf80c 100644 --- a/tools/objtool/Documentation/stack-validation.txt +++ b/tools/objtool/Documentation/stack-validation.txt @@ -193,13 +193,8 @@ they mean, and suggestions for how to fix them. If it's a GCC-compiled .c file, the error may be because the function uses an inline asm() statement which has a "call" instruction. An - asm() statement with a call instruction must declare the use of the - stack pointer in its output operand. For example, on x86_64: - - register void *__sp asm("rsp"); - asm volatile("call func" : "+r" (__sp)); - - Otherwise the stack frame may not get created before the call. + asm() statement with a call instruction must use the ASM_CALL macro, + which forces the frame pointer to be saved before the call. 2. file.o: warning: objtool: .text+0x53: unreachable instruction @@ -221,7 +216,7 @@ they mean, and suggestions for how to fix them. change ENDPROC to END. -4. file.o: warning: objtool: func(): can't find starting instruction +3. file.o: warning: objtool: func(): can't find starting instruction or file.o: warning: objtool: func()+0x11dd: can't decode instruction @@ -230,7 +225,7 @@ they mean, and suggestions for how to fix them. section like .data or .rodata. -5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function +4. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function This is a kernel entry/exit instruction like sysenter or iret. Such instructions aren't allowed in a callable function, and are most @@ -239,7 +234,7 @@ they mean, and suggestions for how to fix them. annotated with the unwind hint macros in asm/unwind_hints.h. -6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame +5. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame This is a dynamic jump or a jump to an undefined symbol. Objtool assumed it's a sibling call and detected that the frame pointer @@ -253,7 +248,7 @@ they mean, and suggestions for how to fix them. the unwind hint macros in asm/unwind_hints.h. -7. file: warning: objtool: func()+0x5c: stack state mismatch +6. file: warning: objtool: func()+0x5c: stack state mismatch The instruction's frame pointer state is inconsistent, depending on which execution path was taken to reach the instruction. @@ -270,7 +265,7 @@ they mean, and suggestions for how to fix them. asm/unwind_hints.h. -8. file.o: warning: objtool: funcA() falls through to next function funcB() +7. file.o: warning: objtool: funcA() falls through to next function funcB() This means that funcA() doesn't end with a return instruction or an unconditional jump, and that objtool has determined that the function ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-29 0:55 ` Josh Poimboeuf @ 2017-07-29 0:58 ` Josh Poimboeuf 2017-07-29 1:06 ` Matthias Kaehlcke 1 sibling, 0 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-29 0:58 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Andrey Ryabinin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On Fri, Jul 28, 2017 at 07:55:21PM -0500, Josh Poimboeuf wrote: > +#define ASM_CALL(str, outputs, inputs, clobbers...) \ > + asm volatile(str : outputs : inputs : "sp", ## clobbers) And note this part isn't right, the sp should be in the output operands. -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-29 0:55 ` Josh Poimboeuf 2017-07-29 0:58 ` Josh Poimboeuf @ 2017-07-29 1:06 ` Matthias Kaehlcke 1 sibling, 0 replies; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-29 1:06 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, LKML, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer El Fri, Jul 28, 2017 at 07:55:21PM -0500 Josh Poimboeuf ha dit: > On Fri, Jul 28, 2017 at 05:38:52PM -0700, Matthias Kaehlcke wrote: > > El Thu, Jul 20, 2017 at 03:56:52PM -0500 Josh Poimboeuf ha dit: > > > > > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote: > > > > FWIW bellow is my understanding of what's going on. > > > > > > > > It seems clang treats local named register almost the same as ordinary > > > > local variables. > > > > The only difference is that before reading the register variable clang > > > > puts variable's value into the specified register. > > > > > > > > So clang just assigns stack slot for the variable __sp where it's > > > > going to keep variable's value. > > > > But since __sp is unitialized (we haven't assign anything to it), the > > > > value of the __sp is some garbage from stack. > > > > inline asm specifies __sp as input, so clang assumes that it have to > > > > load __sp into 'rsp' because inline asm is going to use > > > > it. And it just loads garbage from stack into 'rsp' > > > > > > > > In fact, such behavior (I mean storing the value on stack and loading > > > > into reg before the use) is very useful. > > > > Clang's behavior allows to keep the value assigned to the > > > > call-clobbered register across the function calls. > > > > > > > > Unlike clang, gcc assigns value to the register right away and doesn't > > > > store the value anywhere else. So if the reg is > > > > call clobbered register you have to be absolutely sure that there is > > > > no subsequent function call that might clobber the register. > > > > > > > > E.g. see some real examples > > > > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm: > > > > Fix improper register assignment"). > > > > These bugs shouldn't happen with clang. > > > > > > > > But the global named register works slightly differently in clang. For > > > > the global, the value is just the value of the register itself, > > > > whatever it is. Read/write from global named register is just like > > > > direct read/write to the register > > > > > > Thanks, that clears up a lot of the confusion for me. > > > > > > Still, unfortunately, I don't think that's going to work for GCC. > > > Changing the '__sp' register variable to global in the header file > > > causes it to make a *bunch* of changes across the kernel, even in > > > functions which don't do inline asm. It seems to be disabling some > > > optimizations across the board. > > > > > > I do have another idea, which is to replace all uses of > > > > > > asm(" ... call foo ... " : outputs : inputs : clobbers); > > > > > > with a new ASM_CALL macro: > > > > > > ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers); > > > > > > Then the compiler differences can be abstracted out, with GCC adding > > > "sp" as an output constraint and clang doing nothing (for now). > > > > The idea sounds interesting, however I see two issues with ASM_CALL(): > > > > Inline assembly expects the individual elements of outputs, inputs and > > clobbers to be comma separated, and so does the macro for it's > > parameters. > > I think this isn't a problem, see the (untested and unfinished) patch > below for an idea of how the arguments can be handled. Good point! > > The assembler template can refer to the position of output and input > > operands, adding "sp" as output changes the positions of the inputs > > wrt to the clang version. > > True, though I think we could convert all ASM_CALL users to use named > operands instead of the (bug-prone) numbered ones. It would be an > improvement regardless. Agreed, that sounds like a good solution and a desirable improvement. > > Not sure how to move forward from here. Not even using an ugly #ifdef > > seems to be a halfway reasonable solution, since get_user() isn't the > > only place using this construct and #ifdefs would result in highly > > redundant macros in multiple places. > > I still think ASM_CALL might work. The below patch is what I came up > with last week before I got pulled into some other work. I'll be out on > vacation next week but I hope to finish the patch after that. Thanks for working on this. Enjoy your vacation! Matthias > diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h > index 1b020381ab38..5da4bcbeebab 100644 > --- a/arch/x86/include/asm/alternative.h > +++ b/arch/x86/include/asm/alternative.h > @@ -216,14 +216,14 @@ static inline int alternatives_text_reserved(void *start, void *end) > * Otherwise, old function is used. > */ > #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2, \ > - output, input...) \ > + output, input, clobbers...) \ > { \ > - register void *__sp asm(_ASM_SP); \ > - asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\ > - "call %P[new2]", feature2) \ > - : output, "+r" (__sp) \ > - : [old] "i" (oldfunc), [new1] "i" (newfunc1), \ > - [new2] "i" (newfunc2), ## input); \ > + ASM_CALL(ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1, \ > + "call %P[new2]", feature2), \ > + ARGS(output), \ > + ARGS([old] "i" (oldfunc), [new1] "i" (newfunc1), \ > + [new2] "i" (newfunc2), ARGS(input)), \ > + ## clobbers); \ > } > > /* > diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h > index b4a0d43248cf..21adcc8e739f 100644 > --- a/arch/x86/include/asm/page_64.h > +++ b/arch/x86/include/asm/page_64.h > @@ -45,8 +45,8 @@ static inline void clear_page(void *page) > clear_page_rep, X86_FEATURE_REP_GOOD, > clear_page_erms, X86_FEATURE_ERMS, > "=D" (page), > - "0" (page) > - : "memory", "rax", "rcx"); > + "0" (page), > + ARGS("memory", "rax", "rcx")); > } > > void copy_page(void *to, void *from); > diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h > index cb976bab6299..2a585b799a3e 100644 > --- a/arch/x86/include/asm/paravirt_types.h > +++ b/arch/x86/include/asm/paravirt_types.h > @@ -471,8 +471,7 @@ int paravirt_disable_iospace(void); > */ > #ifdef CONFIG_X86_32 > #define PVOP_VCALL_ARGS \ > - unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx; \ > - register void *__sp asm("esp") > + unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx; > #define PVOP_CALL_ARGS PVOP_VCALL_ARGS > > #define PVOP_CALL_ARG1(x) "a" ((unsigned long)(x)) > @@ -492,8 +491,7 @@ int paravirt_disable_iospace(void); > /* [re]ax isn't an arg, but the return val */ > #define PVOP_VCALL_ARGS \ > unsigned long __edi = __edi, __esi = __esi, \ > - __edx = __edx, __ecx = __ecx, __eax = __eax; \ > - register void *__sp asm("rsp") > + __edx = __edx, __ecx = __ecx, __eax = __eax; > #define PVOP_CALL_ARGS PVOP_VCALL_ARGS > > #define PVOP_CALL_ARG1(x) "D" ((unsigned long)(x)) > @@ -541,24 +539,24 @@ int paravirt_disable_iospace(void); > /* This is 32-bit specific, but is okay in 64-bit */ \ > /* since this condition will never hold */ \ > if (sizeof(rettype) > sizeof(unsigned long)) { \ > - asm volatile(pre \ > - paravirt_alt(PARAVIRT_CALL) \ > - post \ > - : call_clbr, "+r" (__sp) \ > - : paravirt_type(op), \ > - paravirt_clobber(clbr), \ > - ##__VA_ARGS__ \ > - : "memory", "cc" extra_clbr); \ > + ASM_CALL(pre \ > + paravirt_alt(PARAVIRT_CALL) \ > + post, \ > + ARGS(call_clbr), \ > + ARGS(paravirt_type(op), \ > + paravirt_clobber(clbr), \ > + ##__VA_ARGS__), \ > + ARGS("memory", "cc" extra_clbr)); \ > __ret = (rettype)((((u64)__edx) << 32) | __eax); \ > } else { \ > - asm volatile(pre \ > - paravirt_alt(PARAVIRT_CALL) \ > - post \ > - : call_clbr, "+r" (__sp) \ > - : paravirt_type(op), \ > - paravirt_clobber(clbr), \ > - ##__VA_ARGS__ \ > - : "memory", "cc" extra_clbr); \ > + ASM_CALL(pre \ > + paravirt_alt(PARAVIRT_CALL) \ > + post, \ > + ARGS(call_clbr), \ > + ARGS(paravirt_type(op), \ > + paravirt_clobber(clbr), \ > + ##__VA_ARGS__), \ > + ARGS("memory", "cc" extra_clbr)); \ > __ret = (rettype)(__eax & PVOP_RETMASK(rettype)); \ > } \ > __ret; \ > @@ -578,14 +576,14 @@ int paravirt_disable_iospace(void); > ({ \ > PVOP_VCALL_ARGS; \ > PVOP_TEST_NULL(op); \ > - asm volatile(pre \ > - paravirt_alt(PARAVIRT_CALL) \ > - post \ > - : call_clbr, "+r" (__sp) \ > - : paravirt_type(op), \ > - paravirt_clobber(clbr), \ > - ##__VA_ARGS__ \ > - : "memory", "cc" extra_clbr); \ > + ASM_CALL(pre \ > + paravirt_alt(PARAVIRT_CALL) \ > + post, \ > + ARGS(call_clbr), \ > + ARGS(paravirt_type(op), \ > + paravirt_clobber(clbr), \ > + ##__VA_ARGS__), \ > + ARGS("memory", "cc" extra_clbr)); \ > }) > > #define __PVOP_VCALL(op, pre, post, ...) \ > diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h > index ec1f3c651150..19a034cbde37 100644 > --- a/arch/x86/include/asm/preempt.h > +++ b/arch/x86/include/asm/preempt.h > @@ -102,16 +102,14 @@ static __always_inline bool should_resched(int preempt_offset) > extern asmlinkage void ___preempt_schedule(void); > # define __preempt_schedule() \ > ({ \ > - register void *__sp asm(_ASM_SP); \ > - asm volatile ("call ___preempt_schedule" : "+r"(__sp)); \ > + ASM_CALL("call ___preempt_schedule",,); \ > }) > > extern asmlinkage void preempt_schedule(void); > extern asmlinkage void ___preempt_schedule_notrace(void); > # define __preempt_schedule_notrace() \ > ({ \ > - register void *__sp asm(_ASM_SP); \ > - asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp)); \ > + ASM_CALL("call ___preempt_schedule_notrace",,); \ > }) > extern asmlinkage void preempt_schedule_notrace(void); > #endif > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index 550bc4ab0df4..2bbfb777c8bb 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -677,20 +677,19 @@ static inline void sync_core(void) > * Like all of Linux's memory ordering operations, this is a > * compiler barrier as well. > */ > - register void *__sp asm(_ASM_SP); > > #ifdef CONFIG_X86_32 > - asm volatile ( > + ASM_CALL( > "pushfl\n\t" > "pushl %%cs\n\t" > "pushl $1f\n\t" > "iret\n\t" > - "1:" > - : "+r" (__sp) : : "memory"); > + "1:", > + , , "memory"); > #else > unsigned int tmp; > > - asm volatile ( > + ASM_CALL( > UNWIND_HINT_SAVE > "mov %%ss, %0\n\t" > "pushq %q0\n\t" > @@ -702,8 +701,8 @@ static inline void sync_core(void) > "pushq $1f\n\t" > "iretq\n\t" > UNWIND_HINT_RESTORE > - "1:" > - : "=&r" (tmp), "+r" (__sp) : : "cc", "memory"); > + "1:", > + "=&r" (tmp), , ARGS("cc", "memory")); > #endif > } > > diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h > index a34e0d4b957d..9d20f4fb5d7c 100644 > --- a/arch/x86/include/asm/rwsem.h > +++ b/arch/x86/include/asm/rwsem.h > @@ -99,25 +99,24 @@ static inline bool __down_read_trylock(struct rw_semaphore *sem) > /* > * lock for writing > */ > -#define ____down_write(sem, slow_path) \ > -({ \ > - long tmp; \ > - struct rw_semaphore* ret; \ > - register void *__sp asm(_ASM_SP); \ > - \ > - asm volatile("# beginning down_write\n\t" \ > - LOCK_PREFIX " xadd %1,(%4)\n\t" \ > - /* adds 0xffff0001, returns the old value */ \ > - " test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \ > - /* was the active mask 0 before? */\ > - " jz 1f\n" \ > - " call " slow_path "\n" \ > - "1:\n" \ > - "# ending down_write" \ > - : "+m" (sem->count), "=d" (tmp), "=a" (ret), "+r" (__sp) \ > - : "a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS) \ > - : "memory", "cc"); \ > - ret; \ > +#define ____down_write(sem, slow_path) \ > +({ \ > + long tmp; \ > + struct rw_semaphore* ret; \ > + \ > + ASM_CALL("# beginning down_write\n\t" \ > + LOCK_PREFIX " xadd %1,(%3)\n\t" \ > + /* adds 0xffff0001, returns the old value */ \ > + " test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \ > + /* was the active mask 0 before? */ \ > + " jz 1f\n" \ > + " call " slow_path "\n" \ > + "1:\n" \ > + "# ending down_write", \ > + ARGS("+m" (sem->count), "=d" (tmp), "=a" (ret)), \ > + ARGS("a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS)), \ > + ARGS("memory", "cc")); \ > + ret; \ > }) > > static inline void __down_write(struct rw_semaphore *sem) > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > index 184eb9894dae..c7be076ee703 100644 > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -166,12 +166,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > ({ \ > int __ret_gu; \ > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > - register void *__sp asm(_ASM_SP); \ > __chk_user_ptr(ptr); \ > might_fault(); \ > - asm volatile("call __get_user_%P4" \ > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > + ASM_CALL("call __get_user_%P3", \ > + ARGS("=a" (__ret_gu), "=r" (__val_gu)), \ > + ARGS("0" (ptr), "i" (sizeof(*(ptr))))); \ > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > __builtin_expect(__ret_gu, 0); \ > }) > diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h > index b16f6a1d8b26..8b63e2cb1eaf 100644 > --- a/arch/x86/include/asm/uaccess_64.h > +++ b/arch/x86/include/asm/uaccess_64.h > @@ -38,10 +38,9 @@ copy_user_generic(void *to, const void *from, unsigned len) > X86_FEATURE_REP_GOOD, > copy_user_enhanced_fast_string, > X86_FEATURE_ERMS, > - ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from), > - "=d" (len)), > - "1" (to), "2" (from), "3" (len) > - : "memory", "rcx", "r8", "r9", "r10", "r11"); > + ARGS("=a" (ret), "=D" (to), "=S" (from), "=d" (len)), > + ARGS("1" (to), "2" (from), "3" (len)), > + ARGS("memory", "rcx", "r8", "r9", "r10", "r11")); > return ret; > } > > diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h > index 11071fcd630e..2feddeeec1d1 100644 > --- a/arch/x86/include/asm/xen/hypercall.h > +++ b/arch/x86/include/asm/xen/hypercall.h > @@ -114,9 +114,8 @@ extern struct { char _entry[32]; } hypercall_page[]; > register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \ > register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \ > register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \ > - register void *__sp asm(_ASM_SP); > > -#define __HYPERCALL_0PARAM "=r" (__res), "+r" (__sp) > +#define __HYPERCALL_0PARAM "=r" (__res) > #define __HYPERCALL_1PARAM __HYPERCALL_0PARAM, "+r" (__arg1) > #define __HYPERCALL_2PARAM __HYPERCALL_1PARAM, "+r" (__arg2) > #define __HYPERCALL_3PARAM __HYPERCALL_2PARAM, "+r" (__arg3) > @@ -146,10 +145,10 @@ extern struct { char _entry[32]; } hypercall_page[]; > ({ \ > __HYPERCALL_DECLS; \ > __HYPERCALL_0ARG(); \ > - asm volatile (__HYPERCALL \ > - : __HYPERCALL_0PARAM \ > - : __HYPERCALL_ENTRY(name) \ > - : __HYPERCALL_CLOBBER0); \ > + ASM_CALL(__HYPERCALL, \ > + __HYPERCALL_0PARAM, \ > + __HYPERCALL_ENTRY(name), \ > + __HYPERCALL_CLOBBER0); \ > (type)__res; \ > }) > > @@ -157,10 +156,10 @@ extern struct { char _entry[32]; } hypercall_page[]; > ({ \ > __HYPERCALL_DECLS; \ > __HYPERCALL_1ARG(a1); \ > - asm volatile (__HYPERCALL \ > - : __HYPERCALL_1PARAM \ > - : __HYPERCALL_ENTRY(name) \ > - : __HYPERCALL_CLOBBER1); \ > + ASM_CALL(__HYPERCALL, \ > + __HYPERCALL_1PARAM, \ > + __HYPERCALL_ENTRY(name), \ > + __HYPERCALL_CLOBBER1); \ > (type)__res; \ > }) > > @@ -168,10 +167,10 @@ extern struct { char _entry[32]; } hypercall_page[]; > ({ \ > __HYPERCALL_DECLS; \ > __HYPERCALL_2ARG(a1, a2); \ > - asm volatile (__HYPERCALL \ > - : __HYPERCALL_2PARAM \ > - : __HYPERCALL_ENTRY(name) \ > - : __HYPERCALL_CLOBBER2); \ > + ASM_CALL(__HYPERCALL, \ > + __HYPERCALL_2PARAM, \ > + __HYPERCALL_ENTRY(name), \ > + __HYPERCALL_CLOBBER2); \ > (type)__res; \ > }) > > @@ -179,10 +178,10 @@ extern struct { char _entry[32]; } hypercall_page[]; > ({ \ > __HYPERCALL_DECLS; \ > __HYPERCALL_3ARG(a1, a2, a3); \ > - asm volatile (__HYPERCALL \ > - : __HYPERCALL_3PARAM \ > - : __HYPERCALL_ENTRY(name) \ > - : __HYPERCALL_CLOBBER3); \ > + ASM_CALL(__HYPERCALL, \ > + __HYPERCALL_3PARAM, \ > + __HYPERCALL_ENTRY(name), \ > + __HYPERCALL_CLOBBER3); \ > (type)__res; \ > }) > > @@ -190,10 +189,10 @@ extern struct { char _entry[32]; } hypercall_page[]; > ({ \ > __HYPERCALL_DECLS; \ > __HYPERCALL_4ARG(a1, a2, a3, a4); \ > - asm volatile (__HYPERCALL \ > - : __HYPERCALL_4PARAM \ > - : __HYPERCALL_ENTRY(name) \ > - : __HYPERCALL_CLOBBER4); \ > + ASM_CALL(__HYPERCALL, \ > + __HYPERCALL_4PARAM, \ > + __HYPERCALL_ENTRY(name), \ > + __HYPERCALL_CLOBBER4); \ > (type)__res; \ > }) > > @@ -201,10 +200,10 @@ extern struct { char _entry[32]; } hypercall_page[]; > ({ \ > __HYPERCALL_DECLS; \ > __HYPERCALL_5ARG(a1, a2, a3, a4, a5); \ > - asm volatile (__HYPERCALL \ > - : __HYPERCALL_5PARAM \ > - : __HYPERCALL_ENTRY(name) \ > - : __HYPERCALL_CLOBBER5); \ > + ASM_CALL(__HYPERCALL, \ > + __HYPERCALL_5PARAM, \ > + __HYPERCALL_ENTRY(name), \ > + __HYPERCALL_CLOBBER5); \ > (type)__res; \ > }) > > @@ -218,10 +217,10 @@ privcmd_call(unsigned call, > __HYPERCALL_5ARG(a1, a2, a3, a4, a5); > > stac(); > - asm volatile("call *%[call]" > - : __HYPERCALL_5PARAM > - : [call] "a" (&hypercall_page[call]) > - : __HYPERCALL_CLOBBER5); > + ASM_CALL("call *%[call]", > + __HYPERCALL_5PARAM, > + [call] "a" (&hypercall_page[call]), > + __HYPERCALL_CLOBBER5); > clac(); > > return (long)__res; > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index fb0055953fbc..4d0d326029a7 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -5284,16 +5284,15 @@ static void fetch_possible_mmx_operand(struct x86_emulate_ctxt *ctxt, > > static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *)) > { > - register void *__sp asm(_ASM_SP); > ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF; > > if (!(ctxt->d & ByteOp)) > fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE; > > - asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n" > - : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags), > - [fastop]"+S"(fop), "+r"(__sp) > - : "c"(ctxt->src2.val)); > + ASM_CALL("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n", > + ARGS("+a"(ctxt->dst.val), "+d"(ctxt->src.val), > + [flags]"+D"(flags), [fastop]"+S"(fop)), > + ARGS("c"(ctxt->src2.val))); > > ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); > if (!fop) /* exception is returned in fop variable */ > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index ffd469ecad57..d42806ad3c9c 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -8671,7 +8671,6 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx) > static void vmx_handle_external_intr(struct kvm_vcpu *vcpu) > { > u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO); > - register void *__sp asm(_ASM_SP); > > if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) > == (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) { > @@ -8686,7 +8685,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu) > vector = exit_intr_info & INTR_INFO_VECTOR_MASK; > desc = (gate_desc *)vmx->host_idt_base + vector; > entry = gate_offset(*desc); > - asm volatile( > + ASM_CALL( > #ifdef CONFIG_X86_64 > "mov %%" _ASM_SP ", %[sp]\n\t" > "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t" > @@ -8695,16 +8694,14 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu) > #endif > "pushf\n\t" > __ASM_SIZE(push) " $%c[cs]\n\t" > - "call *%[entry]\n\t" > - : > + "call *%[entry]\n\t", > #ifdef CONFIG_X86_64 > - [sp]"=&r"(tmp), > + [sp]"=&r"(tmp) > #endif > - "+r"(__sp) > - : > - [entry]"r"(entry), > - [ss]"i"(__KERNEL_DS), > - [cs]"i"(__KERNEL_CS) > + , > + ARGS([entry]"r"(entry), > + [ss]"i"(__KERNEL_DS), > + [cs]"i"(__KERNEL_CS)) > ); > } > } > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > index 2a1fa10c6a98..ca642ac3a390 100644 > --- a/arch/x86/mm/fault.c > +++ b/arch/x86/mm/fault.c > @@ -802,7 +802,6 @@ no_context(struct pt_regs *regs, unsigned long error_code, > if (is_vmalloc_addr((void *)address) && > (((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) || > address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) { > - register void *__sp asm("rsp"); > unsigned long stack = this_cpu_read(orig_ist.ist[DOUBLEFAULT_STACK]) - sizeof(void *); > /* > * We're likely to be running with very little stack space > @@ -814,13 +813,12 @@ no_context(struct pt_regs *regs, unsigned long error_code, > * and then double-fault, though, because we're likely to > * break the console driver and lose most of the stack dump. > */ > - asm volatile ("movq %[stack], %%rsp\n\t" > - "call handle_stack_overflow\n\t" > - "1: jmp 1b" > - : "+r" (__sp) > - : "D" ("kernel stack overflow (page fault)"), > - "S" (regs), "d" (address), > - [stack] "rm" (stack)); > + ASM_CALL("movq %[stack], %%rsp\n\t" > + "call handle_stack_overflow\n\t" > + "1: jmp 1b", > + , > + ARGS("D" ("kernel stack overflow (page fault)"), > + "S" (regs), "d" (address), [stack] "rm" (stack))); > unreachable(); > } > #endif > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index 3f8c88e29a46..3a57f32a0bfd 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -589,4 +589,18 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s > (_________p1); \ > }) > > +#define ARGS(args...) args > + > +#ifdef CONFIG_FRAME_POINTER > + > +#define ASM_CALL(str, outputs, inputs, clobbers...) \ > + asm volatile(str : outputs : inputs : "sp", ## clobbers) > + > +#else > + > +#define ASM_CALL(str, outputs, inputs, clobbers...) \ > + asm volatile(str : outputs : inputs : clobbers); > + > +#endif > + > #endif /* __LINUX_COMPILER_H */ > diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt > index 6a1af43862df..766a00ebf80c 100644 > --- a/tools/objtool/Documentation/stack-validation.txt > +++ b/tools/objtool/Documentation/stack-validation.txt > @@ -193,13 +193,8 @@ they mean, and suggestions for how to fix them. > > If it's a GCC-compiled .c file, the error may be because the function > uses an inline asm() statement which has a "call" instruction. An > - asm() statement with a call instruction must declare the use of the > - stack pointer in its output operand. For example, on x86_64: > - > - register void *__sp asm("rsp"); > - asm volatile("call func" : "+r" (__sp)); > - > - Otherwise the stack frame may not get created before the call. > + asm() statement with a call instruction must use the ASM_CALL macro, > + which forces the frame pointer to be saved before the call. > > > 2. file.o: warning: objtool: .text+0x53: unreachable instruction > @@ -221,7 +216,7 @@ they mean, and suggestions for how to fix them. > change ENDPROC to END. > > > -4. file.o: warning: objtool: func(): can't find starting instruction > +3. file.o: warning: objtool: func(): can't find starting instruction > or > file.o: warning: objtool: func()+0x11dd: can't decode instruction > > @@ -230,7 +225,7 @@ they mean, and suggestions for how to fix them. > section like .data or .rodata. > > > -5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function > +4. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function > > This is a kernel entry/exit instruction like sysenter or iret. Such > instructions aren't allowed in a callable function, and are most > @@ -239,7 +234,7 @@ they mean, and suggestions for how to fix them. > annotated with the unwind hint macros in asm/unwind_hints.h. > > > -6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame > +5. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame > > This is a dynamic jump or a jump to an undefined symbol. Objtool > assumed it's a sibling call and detected that the frame pointer > @@ -253,7 +248,7 @@ they mean, and suggestions for how to fix them. > the unwind hint macros in asm/unwind_hints.h. > > > -7. file: warning: objtool: func()+0x5c: stack state mismatch > +6. file: warning: objtool: func()+0x5c: stack state mismatch > > The instruction's frame pointer state is inconsistent, depending on > which execution path was taken to reach the instruction. > @@ -270,7 +265,7 @@ they mean, and suggestions for how to fix them. > asm/unwind_hints.h. > > > -8. file.o: warning: objtool: funcA() falls through to next function funcB() > +7. file.o: warning: objtool: funcA() falls through to next function funcB() > > This means that funcA() doesn't end with a return instruction or an > unconditional jump, and that objtool has determined that the function ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 20:20 ` Andrey Rybainin 2017-07-13 20:34 ` Josh Poimboeuf @ 2017-07-13 21:14 ` Matthias Kaehlcke 2017-07-13 21:25 ` Andrey Rybainin 1 sibling, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-13 21:14 UTC (permalink / raw) To: Andrey Rybainin Cc: Josh Poimboeuf, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, Linux Kernel Mailing List, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer El Thu, Jul 13, 2017 at 11:20:04PM +0300 Andrey Rybainin ha dit: > On 07/13/2017 09:47 PM, Matthias Kaehlcke wrote: > > > Thanks for your analysis! > > > >> What happens if you try the below patch instead of the revert? Any > >> chance the offending instruction goes away? > >> > >> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > >> index 11433f9..beac907 100644 > >> --- a/arch/x86/include/asm/uaccess.h > >> +++ b/arch/x86/include/asm/uaccess.h > >> @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > >> might_fault(); \ > >> asm volatile("call __get_user_%P4" \ > >> : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > >> - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > >> + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ > >> (x) = (__force __typeof__(*(ptr))) __val_gu; \ > >> __builtin_expect(__ret_gu, 0); \ > >> }) > > > > The generated code is basically the same, only that now the value from > > the stack is stored in a register and written twice to RSP: > > > > AFAIR clang works much better with global named registers. > Could you check if the patch bellow helps? > > > --- > arch/x86/include/asm/uaccess.h | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > index a059aac9e937..121204387978 100644 > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -157,15 +157,18 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > * Clang/LLVM cares about the size of the register, but still wants > * the base register for something that ends up being a pair. > */ > + > +register unsigned long __current_sp asm(_ASM_SP); > + > #define get_user(x, ptr) \ > ({ \ > int __ret_gu; \ > register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > - register void *__sp asm(_ASM_SP); \ > __chk_user_ptr(ptr); \ > might_fault(); \ > asm volatile("call __get_user_%P4" \ > - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > + : "=a" (__ret_gu), "=r" (__val_gu), \ > + "+r" (__current_sp) \ > : "0" (ptr), "i" (sizeof(*(ptr)))); \ > (x) = (__force __typeof__(*(ptr))) __val_gu; \ > __builtin_expect(__ret_gu, 0); \ Thanks for the suggestion, however it fails to build with both gcc and clang: fs/ioctl.c:585:6: error: use of undeclared identifier '__current_sp' if (get_user(count, &argp->dest_count)) { ^ arch/x86/include/asm/uaccess.h:168:16: note: expanded from macro 'get_user' "+r" (__current_sp) \ The references I found refer to __current_sp as an intrinsic function for ARM32. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 21:14 ` Matthias Kaehlcke @ 2017-07-13 21:25 ` Andrey Rybainin 2017-07-13 21:43 ` Matthias Kaehlcke 0 siblings, 1 reply; 32+ messages in thread From: Andrey Rybainin @ 2017-07-13 21:25 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Josh Poimboeuf, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, Linux Kernel Mailing List, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On 07/14/2017 12:14 AM, Matthias Kaehlcke wrote: > El Thu, Jul 13, 2017 at 11:20:04PM +0300 Andrey Rybainin ha dit: > >> On 07/13/2017 09:47 PM, Matthias Kaehlcke wrote: >> >>> Thanks for your analysis! >>> >>>> What happens if you try the below patch instead of the revert? Any >>>> chance the offending instruction goes away? >>>> >>>> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h >>>> index 11433f9..beac907 100644 >>>> --- a/arch/x86/include/asm/uaccess.h >>>> +++ b/arch/x86/include/asm/uaccess.h >>>> @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) >>>> might_fault(); \ >>>> asm volatile("call __get_user_%P4" \ >>>> : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ >>>> - : "0" (ptr), "i" (sizeof(*(ptr)))); \ >>>> + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ >>>> (x) = (__force __typeof__(*(ptr))) __val_gu; \ >>>> __builtin_expect(__ret_gu, 0); \ >>>> }) >>> >>> The generated code is basically the same, only that now the value from >>> the stack is stored in a register and written twice to RSP: >>> >> >> AFAIR clang works much better with global named registers. >> Could you check if the patch bellow helps? >> >> >> --- >> arch/x86/include/asm/uaccess.h | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h >> index a059aac9e937..121204387978 100644 >> --- a/arch/x86/include/asm/uaccess.h >> +++ b/arch/x86/include/asm/uaccess.h >> @@ -157,15 +157,18 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) >> * Clang/LLVM cares about the size of the register, but still wants >> * the base register for something that ends up being a pair. >> */ >> + >> +register unsigned long __current_sp asm(_ASM_SP); >> + >> #define get_user(x, ptr) \ >> ({ \ >> int __ret_gu; \ >> register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ >> - register void *__sp asm(_ASM_SP); \ >> __chk_user_ptr(ptr); \ >> might_fault(); \ >> asm volatile("call __get_user_%P4" \ >> - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ >> + : "=a" (__ret_gu), "=r" (__val_gu), \ >> + "+r" (__current_sp) \ >> : "0" (ptr), "i" (sizeof(*(ptr)))); \ >> (x) = (__force __typeof__(*(ptr))) __val_gu; \ >> __builtin_expect(__ret_gu, 0); \ > > Thanks for the suggestion, however it fails to build with both gcc and clang: > > fs/ioctl.c:585:6: error: use of undeclared identifier '__current_sp' > if (get_user(count, &argp->dest_count)) { > ^ > arch/x86/include/asm/uaccess.h:168:16: note: expanded from macro 'get_user' > "+r" (__current_sp) > \ > > The references I found refer to __current_sp as an intrinsic function > for ARM32. What? __current_sp declared right above get_user() as "register unsigned long __current_sp asm(_ASM_SP);" Did you actually applied my patch or you just modified the code yourself but have missed "register unsigned long __current_sp asm(_ASM_SP);" ? FWIW patch works (builds) for me with gcc. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 21:25 ` Andrey Rybainin @ 2017-07-13 21:43 ` Matthias Kaehlcke 2017-07-13 21:52 ` Josh Poimboeuf 0 siblings, 1 reply; 32+ messages in thread From: Matthias Kaehlcke @ 2017-07-13 21:43 UTC (permalink / raw) To: Andrey Rybainin Cc: Josh Poimboeuf, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, Linux Kernel Mailing List, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer El Fri, Jul 14, 2017 at 12:25:42AM +0300 Andrey Rybainin ha dit: > > > On 07/14/2017 12:14 AM, Matthias Kaehlcke wrote: > > El Thu, Jul 13, 2017 at 11:20:04PM +0300 Andrey Rybainin ha dit: > > > >> On 07/13/2017 09:47 PM, Matthias Kaehlcke wrote: > >> > >>> Thanks for your analysis! > >>> > >>>> What happens if you try the below patch instead of the revert? Any > >>>> chance the offending instruction goes away? > >>>> > >>>> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > >>>> index 11433f9..beac907 100644 > >>>> --- a/arch/x86/include/asm/uaccess.h > >>>> +++ b/arch/x86/include/asm/uaccess.h > >>>> @@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > >>>> might_fault(); \ > >>>> asm volatile("call __get_user_%P4" \ > >>>> : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > >>>> - : "0" (ptr), "i" (sizeof(*(ptr)))); \ > >>>> + : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \ > >>>> (x) = (__force __typeof__(*(ptr))) __val_gu; \ > >>>> __builtin_expect(__ret_gu, 0); \ > >>>> }) > >>> > >>> The generated code is basically the same, only that now the value from > >>> the stack is stored in a register and written twice to RSP: > >>> > >> > >> AFAIR clang works much better with global named registers. > >> Could you check if the patch bellow helps? > >> > >> > >> --- > >> arch/x86/include/asm/uaccess.h | 7 +++++-- > >> 1 file changed, 5 insertions(+), 2 deletions(-) > >> > >> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > >> index a059aac9e937..121204387978 100644 > >> --- a/arch/x86/include/asm/uaccess.h > >> +++ b/arch/x86/include/asm/uaccess.h > >> @@ -157,15 +157,18 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL)) > >> * Clang/LLVM cares about the size of the register, but still wants > >> * the base register for something that ends up being a pair. > >> */ > >> + > >> +register unsigned long __current_sp asm(_ASM_SP); > >> + > >> #define get_user(x, ptr) \ > >> ({ \ > >> int __ret_gu; \ > >> register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \ > >> - register void *__sp asm(_ASM_SP); \ > >> __chk_user_ptr(ptr); \ > >> might_fault(); \ > >> asm volatile("call __get_user_%P4" \ > >> - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \ > >> + : "=a" (__ret_gu), "=r" (__val_gu), \ > >> + "+r" (__current_sp) \ > >> : "0" (ptr), "i" (sizeof(*(ptr)))); \ > >> (x) = (__force __typeof__(*(ptr))) __val_gu; \ > >> __builtin_expect(__ret_gu, 0); \ > > > > Thanks for the suggestion, however it fails to build with both gcc and clang: > > > > fs/ioctl.c:585:6: error: use of undeclared identifier '__current_sp' > > if (get_user(count, &argp->dest_count)) { > > ^ > > arch/x86/include/asm/uaccess.h:168:16: note: expanded from macro 'get_user' > > "+r" (__current_sp) > > \ > > > > The references I found refer to __current_sp as an intrinsic function > > for ARM32. > > What? __current_sp declared right above get_user() as "register unsigned long __current_sp asm(_ASM_SP);" > Did you actually applied my patch or you just modified the code yourself but have missed > "register unsigned long __current_sp asm(_ASM_SP);" ? Indeed, since the patch is only a few lines and I had the function already open in the editor it seemed easier to change the affected lines than to apply the patch and I missed the definition <:‑| After adding the missing line the code builds with clang and the stack pointer is not corrupted. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" 2017-07-13 21:43 ` Matthias Kaehlcke @ 2017-07-13 21:52 ` Josh Poimboeuf 0 siblings, 0 replies; 32+ messages in thread From: Josh Poimboeuf @ 2017-07-13 21:52 UTC (permalink / raw) To: Matthias Kaehlcke Cc: Andrey Rybainin, Chris J Arges, Borislav Petkov, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86, Linux Kernel Mailing List, Douglas Anderson, Michael Davidson, Greg Hackmann, Nick Desaulniers, Stephen Hines, Kees Cook, Arnd Bergmann, Bernhard Rosenkränzer On Thu, Jul 13, 2017 at 02:43:26PM -0700, Matthias Kaehlcke wrote: > > >> AFAIR clang works much better with global named registers. Is that a bug or a feature? -- Josh ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2017-07-29 1:06 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-07-12 21:27 [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" Matthias Kaehlcke 2017-07-12 22:12 ` Josh Poimboeuf 2017-07-12 22:20 ` Matthias Kaehlcke 2017-07-12 22:35 ` Josh Poimboeuf 2017-07-12 22:36 ` Josh Poimboeuf 2017-07-12 23:22 ` Matthias Kaehlcke 2017-07-13 18:00 ` Josh Poimboeuf 2017-07-13 18:47 ` Matthias Kaehlcke 2017-07-13 19:25 ` Josh Poimboeuf 2017-07-13 19:38 ` Michael Davidson 2017-07-13 20:18 ` Josh Poimboeuf 2017-07-13 20:20 ` Andrey Rybainin 2017-07-13 20:34 ` Josh Poimboeuf 2017-07-13 21:12 ` Matthias Kaehlcke 2017-07-13 21:34 ` Josh Poimboeuf 2017-07-13 21:57 ` Matthias Kaehlcke 2017-07-19 17:46 ` Josh Poimboeuf 2017-07-19 21:50 ` Matthias Kaehlcke 2017-07-20 10:01 ` Andrey Ryabinin 2017-07-20 15:18 ` Josh Poimboeuf 2017-07-20 15:30 ` Andrey Ryabinin 2017-07-20 20:56 ` Josh Poimboeuf 2017-07-21 9:13 ` Andrey Ryabinin 2017-07-21 13:24 ` Josh Poimboeuf 2017-07-29 0:38 ` Matthias Kaehlcke 2017-07-29 0:55 ` Josh Poimboeuf 2017-07-29 0:58 ` Josh Poimboeuf 2017-07-29 1:06 ` Matthias Kaehlcke 2017-07-13 21:14 ` Matthias Kaehlcke 2017-07-13 21:25 ` Andrey Rybainin 2017-07-13 21:43 ` Matthias Kaehlcke 2017-07-13 21:52 ` Josh Poimboeuf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).