All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v14 0/9] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction
@ 2017-02-08  8:09 ` Kyle Huey
  0 siblings, 0 replies; 54+ messages in thread
From: Kyle Huey @ 2017-02-08  8:09 UTC (permalink / raw)
  To: Robert O'Callahan, Thomas Gleixner, Andy Lutomirski,
	Ingo Molnar, H. Peter Anvin, x86, Paolo Bonzini,
	Radim Krčmář,
	Jeff Dike, Richard Weinberger, Alexander Viro, Shuah Khan,
	Dave Hansen, Borislav Petkov, Peter Zijlstra, Boris Ostrovsky,
	Len Brown, Rafael J. Wysocki, Dmitry Safonov, David Matlack,
	Nadav Amit, Andi Kleen
  Cc: linux-kernel, user-mode-linux-devel, user-mode-linux-user,
	linux-fsdevel, linux-kselftest, kvm

rr (http://rr-project.org/), a userspace record-and-replay reverse-
execution debugger, would like to trap and emulate the CPUID instruction.
This would allow us to a) mask away certain hardware features that rr does
not support (e.g. RDRAND) and b) enable trace portability across machines
by providing constant results.

Newer Intel CPUs (Ivy Bridge and later) can fault when CPUID is executed at
CPL > 0. Expose this capability to userspace as a new pair of arch_prctls,
ARCH_GET_CPUID and ARCH_SET_CPUID.

Since v13:
All: rebased on top of tglx's __switch_to_xtra patches
(https://lkml.org/lkml/2016/12/15/432)

Patch 6: x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
- Removed bogus assertion about interrupts

Patch 9: x86/arch_prctl: Rename 'code' argument to 'option'
- New

Three issues were raised last year on the v13 patches. tglx pointed out
that the lock checking assertion in patch 6 was bogus, and it has been
removed. ingo asked that we rename the second argument of arch_prctl to
option, which patch 9 was added to do.

The third issue raised was about performance and code generation. With the
__switch_to_xtra optimizations I suggested in my response that are
implemented in tglx's patches, the extra burden this feature imposes on
context switches that fall into __switch_to_xtra but do not use CPUID faulting
is a single AND and branch.  Compare,

Before:
276:	49 31 dc		xor    %rbx,%r12
279:	41 f7 c4 00 00 00 02 	test   $0x2000000,%r12d
280:	75 43                	jne    2c5 <__switch_to_xtra+0x95>
282:	41 f7 c4 00 00 01 00 	test   $0x10000,%r12d
289:    74 17                   je     2a2 <__switch_to_xtra+0x72>
28b:    65 48 8b 05 00 00 00    mov    %gs:0x0(%rip),%rax        # 293 <__switch_to_xtra+0x63>
292:    00
293:	48 83 f0 04		xor    $0x4,%rax
297:	65 48 89 05 00 00 00 	mov    %rax,%gs:0x0(%rip)        # 29f <__switch_to_xtra+0x6f>
29e:	00
29f:	0f 22 e0		mov    %rax,%cr4

After:
306:	4c 31 e3		xor    %r12,%rbx
309:	f7 c3 00 00 00 02    	test   $0x2000000,%ebx
30f:	0f 85 87 00 00 00    	jne    39c <__switch_to_xtra+0xdc>
315:    f7 c3 00 00 01 00       test   $0x10000,%ebx
31b:    74 17                   je     334 <__switch_to_xtra+0x74>
31d:    65 48 8b 05 00 00 00    mov    %gs:0x0(%rip),%rax        # 325 <__switch_to_xtra+0x65>
324:    00
325:    48 83 f0 04		xor    $0x4,%rax
329:    65 48 89 05 00 00 00 	mov    %rax,%gs:0x0(%rip)        # 331 <__switch_to_xtra+0x71>
330:	00
331:	0f 22 e0		mov    %rax,%cr4
334:    80 e7 80             	and    $0x80,%bh
337:    75 23                   jne    35c <__switch_to_xtra+0x9c>

And this is after the optimizations removed 3 conditional branches from
__switch_to_xtra, so we're still ahead a net 2 branches.

The generated code for set_cpuid_faulting is,

With CONFIG_PARAVIRT=n, inlined into __switch_to_xtra:
35c:    65 48 8b 05 00 00 00	mov    %gs:0x0(%rip),%rax        # 364 <__switch_to_xtra+0xa4>
363:	00
364:	48 83 e0 fe		and    $0xfffffffffffffffe,%rax
368:	b9 40 01 00 00       	mov    $0x140,%ecx
36d:    48 89 c2                mov    %rax,%rdx
370:    4c 89 e0                mov    %r12,%rax
373:    48 c1 e8 0f             shr    $0xf,%rax
377:    83 e0 01                and    $0x1,%eax
37a:    48 09 d0                or     %rdx,%rax
37d:    48 89 c2                mov    %rax,%rdx
380:	65 48 89 05 00 00 00    mov    %rax,%gs:0x0(%rip)        # 388 <__switch_to_xtra+0xc8>
387:	00
388:    48 c1 ea 20		shr    $0x20,%rdx
38c:    0f 30                   wrmsr

With CONFIG_PARAVIRT=y:

in __switch_to_xtra:
354:    80 e7 80                and    $0x80,%bh
357:    74 0f                   je     368 <__switch_to_xtra+0x88>
359:    4c 89 e7                mov    %r12,%rdi
35c:    48 c1 ef 0f             shr    $0xf,%rdi
360:    83 e7 01                and    $0x1,%edi
363:    e8 98 fc ff ff          callq  0 <set_cpuid_faulting>

0000000000000000 <set_cpuid_faulting>:
0:    e8 00 00 00 00          callq  5 <set_cpuid_faulting+0x5>
5:    55                      push   %rbp
6:    65 48 8b 15 00 00 00    mov    %gs:0x0(%rip),%rdx        # e <set_cpuid_faulting+0xe>
d:    00
e:    48 89 d0                mov    %rdx,%rax
11:   40 0f b6 d7             movzbl %dil,%edx
15:   48 89 e5                mov    %rsp,%rbp
18:   48 83 e0 fe             and    $0xfffffffffffffffe,%rax
1c:   bf 40 01 00 00          mov    $0x140,%edi
21:   48 09 c2                or     %rax,%rdx
24:   89 d6                   mov    %edx,%esi
26:   65 48 89 15 00 00 00    mov    %rdx,%gs:0x0(%rip)        # 2e <set_cpuid_faulting+0x2e>
2d:   00
2e:   48 c1 ea 20             shr    $0x20,%rdx
32:   ff 14 25 00 00 00 00    callq  *0x0
39:   5d                      pop    %rbp
3a:   c3                      retq

While these sequences are less efficient than the "load msr from the task struct and wrmsr"
sequence that Ingo's suggestion would produce, this is entirely reasonable code and avoids
taking up 8 bytes in the task_struct that will almost never be used.

As I said last year, obviously I want to get this into the kernel or I wouldn't be here.
So if Ingo or others insist on caching the MSR in the task_struct I'll do it, but I still
think this is a good approach.

- Kyle

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2018-07-27 22:58 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-08  8:09 [PATCH v14 0/9] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Kyle Huey
2017-02-08  8:09 ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 1/9] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 2/9] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64 Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 3/9] x86/arch_prctl: Add do_arch_prctl_common Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 4/9] x86/syscalls/32: Wire up arch_prctl on x86-32 Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 5/9] x86/cpufeature: Detect CPUID faulting support Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 6/9] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 7/9] x86/arch_prctl: Selftest for ARCH_[GET|SET]_CPUID Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-08  8:09 ` [PATCH v14 8/9] KVM: x86: virtualize cpuid faulting Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2018-07-27 17:15   ` Jim Mattson
2018-07-27 17:15     ` Jim Mattson
2018-07-27 17:15     ` Jim Mattson
2018-07-27 17:15     ` jmattson
2018-07-27 19:41   ` Andy Lutomirski
2018-07-27 19:41     ` Andy Lutomirski
2018-07-27 19:41     ` Andy Lutomirski
2018-07-27 19:41     ` luto
2018-07-27 20:28     ` Jim Mattson
2018-07-27 20:28       ` Jim Mattson
2018-07-27 20:28       ` Jim Mattson
2018-07-27 20:28       ` jmattson
2018-07-27 20:46       ` Andy Lutomirski
2018-07-27 20:46         ` Andy Lutomirski
2018-07-27 20:46         ` Andy Lutomirski
2018-07-27 20:46         ` luto
2018-07-27 21:03         ` Jim Mattson
2018-07-27 21:03           ` Jim Mattson
2018-07-27 21:03           ` Jim Mattson
2018-07-27 21:03           ` jmattson
2018-07-27 21:05           ` Andy Lutomirski
2018-07-27 21:05             ` Andy Lutomirski
2018-07-27 21:05             ` Andy Lutomirski
2018-07-27 21:05             ` luto
2018-07-27 21:30             ` Jim Mattson
2018-07-27 21:30               ` Jim Mattson
2018-07-27 21:30               ` Jim Mattson
2018-07-27 21:30               ` jmattson
2018-07-27 22:58               ` Andy Lutomirski
2018-07-27 22:58                 ` Andy Lutomirski
2018-07-27 22:58                 ` Andy Lutomirski
2018-07-27 22:58                 ` luto
2017-02-08  8:09 ` [PATCH v14 9/9] x86/arch_prctl: Rename 'code' argument to 'option' Kyle Huey
2017-02-08  8:09   ` Kyle Huey
2017-02-10 13:07 ` [PATCH v14 0/9] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Thomas Gleixner
2017-02-10 13:07   ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.