linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, "Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Jeff Dike" <jdike@addtoit.com>,
	"Richard Weinberger" <richard@nod.at>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shuah Khan" <shuah@kernel.org>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Len Brown" <len.brown@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	"Dmitry Safonov" <dsafonov@virtuozzo.com>,
	"David Matlack" <dmatlack@google.com>,
	"Nadav Amit" <nadav.amit@gmail.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	linux-kernel@vger.kernel.org,
	user-mode-linux-devel@lists.sourceforge.net,
	user-mode-linux-user@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction
Date: Fri, 2 Dec 2016 11:29:42 +0100	[thread overview]
Message-ID: <20161202102942.GA17332@gmail.com> (raw)
In-Reply-To: <20161128030521.4423-1-khuey@kylehuey.com>


* Kyle Huey <me@kylehuey.com> wrote:

> rr (http://rr-project.org/), a userspace record-and-replay reverse-
> execution debugger, would like to trap and emulate the CPUID instruction.
> This would allow us to a) mask away certain hardware features that rr does
> not support (e.g. RDRAND) and b) enable trace portability across machines
> by providing constant results.
> 
> Newer Intel CPUs (Ivy Bridge and later) can fault when CPUID is executed at
> CPL > 0. Expose this capability to userspace as a new pair of arch_prctls,
> ARCH_GET_CPUID and ARCH_SET_CPUID.
> 
> Since v12:
> Patch 4: x86/syscalls/32: Wire up arch_prctl on x86-32
> - compat_sys_arch_prctl prototype has argument names.

So while I am fine with the feature, I'm still unconvinced about the 
implementation:

1)

As I pointed out before, the arbitrary 'code' argument name x86-ism should be 
changed to 'option' like the canonical core kernel option name is for prctls(). 

This is still unfixed.

2)

As I complained about in my first review, TIF_NOCPUID flag is too far removed from 
the value what will be written into the MSR.

The result is poor code generation on 64-bit defconfig+CONFIG_PREEMPT=y:

        if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
            test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
		set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));

is compiled as:

  476:   49 8b 06                mov    (%r14),%rax
  479:   49 8b 55 00             mov    0x0(%r13),%rdx
  47d:   48 c1 e8 0f             shr    $0xf,%rax
  481:   48 c1 ea 0f             shr    $0xf,%rdx
  485:   83 e2 01                and    $0x1,%edx
  488:   83 e0 01                and    $0x1,%eax
  48b:   38 c2                   cmp    %al,%dl
  48d:   74 10                   je     49f <__switch_to_xtra+0x9f>
  48f:   49 8b 7d 00             mov    0x0(%r13),%rdi
  493:   48 c1 ef 0f             shr    $0xf,%rdi
  497:   83 e7 01                and    $0x1,%edi
  49a:   e8 61 fb ff ff          callq  0 <set_cpuid_faulting>

... the first 7 instructions burdens all __switch_to_xtra() users, not just the 
faulting-CPUID users.

The set_cpuid_faulting() call is also unnecessary and the set_cpuid_faulting() 
call generates into an obscene sequence of:

0000000000000000 <set_cpuid_faulting>:
   0:	8b 15 00 00 00 00    	mov    0x0(%rip),%edx        # 6 <set_cpuid_faulting+0x6>
   6:	55                   	push   %rbp
   7:	48 89 e5             	mov    %rsp,%rbp
   a:	53                   	push   %rbx
   b:	40 0f b6 df          	movzbl %dil,%ebx
   f:	85 d2                	test   %edx,%edx
  11:	75 07                	jne    1a <set_cpuid_faulting+0x1a>
  13:	9c                   	pushfq 
  14:	58                   	pop    %rax
  15:	f6 c4 02             	test   $0x2,%ah
  18:	75 48                	jne    62 <set_cpuid_faulting+0x62>
  1a:	65 48 8b 05 00 00 00 	mov    %gs:0x0(%rip),%rax        # 22 <set_cpuid_faulting+0x22>
  21:	00 
  22:	48 83 e0 fe          	and    $0xfffffffffffffffe,%rax
  26:	b9 40 01 00 00       	mov    $0x140,%ecx
  2b:	48 09 d8             	or     %rbx,%rax
  2e:	48 89 c2             	mov    %rax,%rdx
  31:	65 48 89 05 00 00 00 	mov    %rax,%gs:0x0(%rip)        # 39 <set_cpuid_faulting+0x39>
  38:	00 
  39:	48 c1 ea 20          	shr    $0x20,%rdx
  3d:	0f 30                	wrmsr  
  3f:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  44:	5b                   	pop    %rbx
  45:	5d                   	pop    %rbp
  46:	c3                   	retq   
  47:	48 c1 e2 20          	shl    $0x20,%rdx
  4b:	89 c0                	mov    %eax,%eax
  4d:	bf 40 01 00 00       	mov    $0x140,%edi
  52:	48 09 d0             	or     %rdx,%rax
  55:	31 d2                	xor    %edx,%edx
  57:	48 89 c6             	mov    %rax,%rsi
  5a:	e8 00 00 00 00       	callq  5f <set_cpuid_faulting+0x5f>
  5f:	5b                   	pop    %rbx
  60:	5d                   	pop    %rbp
  61:	c3                   	retq   
  62:	e8 00 00 00 00       	callq  67 <set_cpuid_faulting+0x67>
  67:	85 c0                	test   %eax,%eax
  69:	74 af                	je     1a <set_cpuid_faulting+0x1a>
  6b:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 71 <set_cpuid_faulting+0x71>
  71:	85 c0                	test   %eax,%eax
  73:	75 a5                	jne    1a <set_cpuid_faulting+0x1a>
  75:	48 c7 c1 00 00 00 00 	mov    $0x0,%rcx
  7c:	48 c7 c2 00 00 00 00 	mov    $0x0,%rdx
  83:	be b9 00 00 00       	mov    $0xb9,%esi
  88:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
  8f:	e8 00 00 00 00       	callq  94 <set_cpuid_faulting+0x94>
  94:	eb 84                	jmp    1a <set_cpuid_faulting+0x1a>
  96:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  9d:	00 00 00 

The affected object file code size blows up as well, by 17%:

 arch/x86/kernel/process.o:
    text    data     bss     dec     hex filename
    3325    8577      32   11934    2e9e process.o.before
    3889    8609      32   12530    30f2 process.o.after

A good deal of this overhead and complexity comes from the implementation 
inefficiency I pointed out, and all this can be avoided with the method I 
suggested in my previous review, by caching the per task MSR value in the thread 
struct.

So sorry, NAK for this implementation - especially considering how relatively 
straightforward the changes I suggested are to implement.

Thanks,

	Ingo

  parent reply	other threads:[~2016-12-02 10:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-28  3:05 [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Kyle Huey
2016-11-28  3:05 ` [PATCH v13 1/8] x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl Kyle Huey
2016-11-28  3:05 ` [PATCH v13 2/8] x86/arch_prctl/64: Rename do_arch_prctl to do_arch_prctl_64 Kyle Huey
2016-11-28  3:05 ` [PATCH v13 3/8] x86/arch_prctl: Add do_arch_prctl_common Kyle Huey
2016-11-28  3:05 ` [PATCH v13 4/8] x86/syscalls/32: Wire up arch_prctl on x86-32 Kyle Huey
2016-11-28  3:05 ` [PATCH v13 5/8] x86/cpufeature: Detect CPUID faulting support Kyle Huey
2016-11-28  3:05 ` [PATCH v13 6/8] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID Kyle Huey
2016-11-28 14:06   ` Thomas Gleixner
2016-11-28 16:13     ` Kyle Huey
2016-11-28  3:05 ` [PATCH v13 7/8] x86/arch_prctl: Selftest for ARCH_[GET|SET]_CPUID Kyle Huey
2016-11-28  3:05 ` [PATCH v13 8/8] KVM: x86: virtualize cpuid faulting Kyle Huey
2016-12-02 10:29 ` Ingo Molnar [this message]
2016-12-03 15:37   ` [PATCH v13 0/8] x86/arch_prctl Add ARCH_[GET|SET]_CPUID for controlling the CPUID instruction Kyle Huey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161202102942.GA17332@gmail.com \
    --to=mingo@kernel.org \
    --cc=andi@firstfloor.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=dsafonov@virtuozzo.com \
    --cc=hpa@zytor.com \
    --cc=jdike@addtoit.com \
    --cc=kvm@vger.kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=me@kylehuey.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=richard@nod.at \
    --cc=rkrcmar@redhat.com \
    --cc=robert@ocallahan.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=user-mode-linux-devel@lists.sourceforge.net \
    --cc=user-mode-linux-user@lists.sourceforge.net \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).