linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [x86] 5ac0c41bf3:  WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
       [not found] <57614955.l3IQMtSfaGPEBdCy%fengguang.wu@intel.com>
@ 2016-06-15 14:25 ` Borislav Petkov
  2016-06-15 14:51   ` Eduardo Habkost
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Borislav Petkov @ 2016-06-15 14:25 UTC (permalink / raw)
  To: kernel test robot, Andy Lutomirski, Paolo Bonzini, Eduardo Habkost
  Cc: LKP, wfg, lkml

On Wed, Jun 15, 2016 at 08:25:57PM +0800, kernel test robot wrote:
> [    0.556833] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
> [    0.559888] ------------[ cut here ]------------
> [    0.559888] ------------[ cut here ]------------
> [    0.561405] WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x44/0x70
> [    0.561405] WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x44/0x70
> [    0.567649] unchecked MSR access error: RDMSR from 0x1b0
> [    0.567649] unchecked MSR access error: RDMSR from 0x1b0

Btw, Andy, this error message is completely useless - I
wanna know *where* the RDMSR in the code is, not point me at
ex_handler_rdmsr_unsafe().

IOW, I wanna convert the current thing into this:

[    0.028003] unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f
[    0.030343] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[    0.032003] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
[    0.036003] unchecked MSR access error: WRMSR to 0x1b0 (tried to write 0x0000000000000006) at rIP: 0xffffffff81026de1

i.e.,

---
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 4bb53b89f3c5..2028a5ad3433 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -46,8 +46,8 @@ EXPORT_SYMBOL(ex_handler_ext);
 bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
 			     struct pt_regs *regs, int trapnr)
 {
-	WARN_ONCE(1, "unchecked MSR access error: RDMSR from 0x%x\n",
-		  (unsigned int)regs->cx);
+	pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx\n",
+		     (unsigned int)regs->cx, regs->ip);
 
 	/* Pretend that the read succeeded and returned 0. */
 	regs->ip = ex_fixup_addr(fixup);
@@ -60,9 +60,9 @@ EXPORT_SYMBOL(ex_handler_rdmsr_unsafe);
 bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup,
 			     struct pt_regs *regs, int trapnr)
 {
-	WARN_ONCE(1, "unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x)\n",
-		  (unsigned int)regs->cx,
-		  (unsigned int)regs->dx, (unsigned int)regs->ax);
+	pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx\n",
+		     (unsigned int)regs->cx, (unsigned int)regs->dx,
+		     (unsigned int)regs->ax,  regs->ip);
 
 	/* Pretend that the write succeeded. */
 	regs->ip = ex_fixup_addr(fixup);
---

Ok?

As to the error message, dear LKP friends, it happens because -cpu kvm64
on native Intel hands in CPUID bits of the host, i.e., if you do this in
the guest:

$ grep epb /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc nopl eagerfpu pni cx16 x2apic hypervisor epb
					  ^^^

you should have "epb" there too which is among those bits.

I can reproduce the same issue on an AMD host too by booting my guest
with

"-cpu kvm64,vendor=GenuineIntel"

Paolo, Eduardo, question: can we hide certain CPUID bits from the guest
when booting with -cpu kvm64?

In general, is there a way I can set or clear arbitrary CPUID bits so
that the guest sees what I want it to see?

And I don't mean the predefined CPUID flags which you toggle with "+" or
"-" followed by flag name. Because -cpu kvm64,-epb doesn't work.

Is there a way to make this work or should we hack it into qemu so that
we are able to do that? I.e., something like

-cpu=kvm64,cpuid=leaf6_ecx_bit3=0,...

or something smarter. But you get the idea...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3:  WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 14:25 ` [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe Borislav Petkov
@ 2016-06-15 14:51   ` Eduardo Habkost
  2016-06-15 17:07     ` Borislav Petkov
  2016-06-15 14:53   ` Paolo Bonzini
  2016-06-15 17:23   ` Andy Lutomirski
  2 siblings, 1 reply; 11+ messages in thread
From: Eduardo Habkost @ 2016-06-15 14:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: kernel test robot, Andy Lutomirski, Paolo Bonzini, LKP, wfg, lkml

On Wed, Jun 15, 2016 at 04:25:32PM +0200, Borislav Petkov wrote:
[...]
> 
> As to the error message, dear LKP friends, it happens because -cpu kvm64
> on native Intel hands in CPUID bits of the host, i.e., if you do this in
> the guest:
> 
> $ grep epb /proc/cpuinfo
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc nopl eagerfpu pni cx16 x2apic hypervisor epb
> 					  ^^^
> 
> you should have "epb" there too which is among those bits.

It shouldn't. -cpu kvm64 should give exactly the same set of
CPUID bits on every host (unless you see "missing feature"
warnings. "-cpu kvm64,enforce" would be even more strict and
refuse to start if any feature is missing).

> 
> I can reproduce the same issue on an AMD host too by booting my guest
> with
> 
> "-cpu kvm64,vendor=GenuineIntel"
> 
> Paolo, Eduardo, question: can we hide certain CPUID bits from the guest
> when booting with -cpu kvm64?
> 
> In general, is there a way I can set or clear arbitrary CPUID bits so
> that the guest sees what I want it to see?
> 
> And I don't mean the predefined CPUID flags which you toggle with "+" or
> "-" followed by flag name. Because -cpu kvm64,-epb doesn't work.

If -cpu kvm64,-epb doesn't work, -cpu kvm64 is not supposed to
expose ebp to the guest in the first place.

CPUID[6].ECX, specifically, is hardcoded to 0 in QEMU.

> 
> Is there a way to make this work or should we hack it into qemu so that
> we are able to do that? I.e., something like
> 
> -cpu=kvm64,cpuid=leaf6_ecx_bit3=0,...
> 
> or something smarter. But you get the idea...

We have that: it is "-feature" (or "feature=off"). Features that
can't be configured in the command-line should never appear to
the guest unless you use "-cpu host,migratable=off".

Something that allows disabling feature flags that still don't
have a name in QEMU would be useful for "-cpu
host,migratable=off" only (and that's an interesting suggestion).
But that's not the case here, I guess.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 14:25 ` [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe Borislav Petkov
  2016-06-15 14:51   ` Eduardo Habkost
@ 2016-06-15 14:53   ` Paolo Bonzini
  2016-06-15 17:23   ` Andy Lutomirski
  2 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2016-06-15 14:53 UTC (permalink / raw)
  To: Borislav Petkov, kernel test robot, Andy Lutomirski, Eduardo Habkost
  Cc: LKP, wfg, lkml



On 15/06/2016 16:25, Borislav Petkov wrote:
> As to the error message, dear LKP friends, it happens because -cpu kvm64
> on native Intel hands in CPUID bits of the host, i.e., if you do this in
> the guest:
> 
> $ grep epb /proc/cpuinfo
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm constant_tsc nopl eagerfpu pni cx16 x2apic hypervisor epb
> 
> you should have "epb" there too which is among those bits.

Hmm, no, it doesn't work like that.

EPB is bit 3 of CPUID[6].ECX.  Under KVM you should only ever see bit 2
of EAX set in that leaf (ARAT).

> I can reproduce the same issue on an AMD host too by booting my guest
> with
> 
> "-cpu kvm64,vendor=GenuineIntel"

I cannot reproduce it with 4.6.0-rc3 in the (Fedora 22 AMD) host and
4.7.0-rc2 in the (Fedora 21) guest.  QEMU is 2.4.1.

> Paolo, Eduardo, question: can we hide certain CPUID bits from the guest
> when booting with -cpu kvm64?
> 
> In general, is there a way I can set or clear arbitrary CPUID bits so
> that the guest sees what I want it to see?
> 
> And I don't mean the predefined CPUID flags which you toggle with "+" or
> "-" followed by flag name. Because -cpu kvm64,-epb doesn't work.

It doesn't work because QEMU has no idea of what EPB even is.  That bit
really shouldn't be set.

Can you bounce me the original report?

Paolo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3:  WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 14:51   ` Eduardo Habkost
@ 2016-06-15 17:07     ` Borislav Petkov
  0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2016-06-15 17:07 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: kernel test robot, Andy Lutomirski, Paolo Bonzini, LKP, wfg, lkml

On Wed, Jun 15, 2016 at 11:51:50AM -0300, Eduardo Habkost wrote:
> We have that: it is "-feature" (or "feature=off"). Features that
> can't be configured in the command-line should never appear to
> the guest unless you use "-cpu host,migratable=off".
> 
> Something that allows disabling feature flags that still don't
> have a name in QEMU would be useful for "-cpu
> host,migratable=off" only (and that's an interesting suggestion).
> But that's not the case here, I guess.

Right, and I'm wondering whether something like that would be useful for
testing purposes. For example, I want to enable a CPUID feature on the
command line and see how the kernel I'm booting in the guest reacts. And
that feature bit is not necessarily known to qemu/kvm.

Can we, say, pass whole CPUID leafs to qemu to report to the guest?
I.e., something like:

 ... -cpuid=0x5,eax=0x40,ebx=0xdeadbeef,ecx=0xcaffee ...

and so on.

Would that be interesting to add to qemu? I'd find something like that
very useful for testing kernels...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 14:25 ` [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe Borislav Petkov
  2016-06-15 14:51   ` Eduardo Habkost
  2016-06-15 14:53   ` Paolo Bonzini
@ 2016-06-15 17:23   ` Andy Lutomirski
  2016-06-15 17:50     ` Borislav Petkov
  2 siblings, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2016-06-15 17:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Jun 15, 2016 7:25 AM, "Borislav Petkov" <bp@alien8.de> wrote:
>
> On Wed, Jun 15, 2016 at 08:25:57PM +0800, kernel test robot wrote:
> > [    0.556833] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
> > [    0.559888] ------------[ cut here ]------------
> > [    0.559888] ------------[ cut here ]------------
> > [    0.561405] WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x44/0x70
> > [    0.561405] WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x44/0x70
> > [    0.567649] unchecked MSR access error: RDMSR from 0x1b0
> > [    0.567649] unchecked MSR access error: RDMSR from 0x1b0
>
> Btw, Andy, this error message is completely useless - I
> wanna know *where* the RDMSR in the code is, not point me at
> ex_handler_rdmsr_unsafe().

Did the "Call Trace" not show up?

>
> IOW, I wanna convert the current thing into this:
>
> [    0.028003] unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f
> [    0.030343] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
> [    0.032003] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
> [    0.036003] unchecked MSR access error: WRMSR to 0x1b0 (tried to write 0x0000000000000006) at rIP: 0xffffffff81026de1
>
> i.e.,
>
> ---
> diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
> index 4bb53b89f3c5..2028a5ad3433 100644
> --- a/arch/x86/mm/extable.c
> +++ b/arch/x86/mm/extable.c
> @@ -46,8 +46,8 @@ EXPORT_SYMBOL(ex_handler_ext);
>  bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
>                              struct pt_regs *regs, int trapnr)
>  {
> -       WARN_ONCE(1, "unchecked MSR access error: RDMSR from 0x%x\n",
> -                 (unsigned int)regs->cx);
> +       pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx\n",
> +                    (unsigned int)regs->cx, regs->ip);

I have no fundamental issue adding ip to this, but let's keep it
WARN_ONCE (so we notice loudly and so we get the call trace) and use
%pF or whatever it's called instead of %lx.

Also, I want to add a variant of WARN that takes pt_regs as parameters
at some point.  You'd get much better output.  Even without that, Josh
Poimboeuf and I (mainly Josh) have some work slowly afoot that will
greatly improve call trace quality when crossing an exception
boundary.

--Andy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 17:23   ` Andy Lutomirski
@ 2016-06-15 17:50     ` Borislav Petkov
  2016-06-15 17:55       ` Andy Lutomirski
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-06-15 17:50 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Wed, Jun 15, 2016 at 10:23:39AM -0700, Andy Lutomirski wrote:
> Did the "Call Trace" not show up?

It did, but it is bollocks too:

[    0.020003] ------------[ cut here ]------------
[    0.024009] WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x6f/0x80
[    0.026455] unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f
[    0.028008] Modules linked in:
[    0.032008] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc3+ #26
[    0.035185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    0.036000]  0000000000000000 ffffffff81c03d20 ffffffff81298b65 ffffffff81c03d70
[    0.036000]  0000000000000000 ffffffff81c03d60 ffffffff81050fdb 0000003200000001
[    0.036000]  ffffffff81c03e38 ffffffff816691a0 0000000000000000 ffffffff81d662e0
[    0.036000] Call Trace:
[    0.036000]  [<ffffffff81298b65>] dump_stack+0x67/0x92
[    0.036000]  [<ffffffff81050fdb>] __warn+0xcb/0xf0
[    0.036000]  [<ffffffff8105104f>] warn_slowpath_fmt+0x4f/0x60
[    0.036000]  [<ffffffff81026d9f>] ? init_intel_energy_perf.part.3+0xf/0xd0
[    0.036000]  [<ffffffff810448ef>] ex_handler_rdmsr_unsafe+0x6f/0x80
[    0.036000]  [<ffffffff810449c9>] fixup_exception+0x39/0x50
[    0.036000]  [<ffffffff8101658f>] do_general_protection+0x7f/0x150
[    0.036000]  [<ffffffff816649bf>] general_protection+0x1f/0x30
[    0.036000]  [<ffffffff81026d9f>] ? init_intel_energy_perf.part.3+0xf/0xd0
[    0.036000]  [<ffffffff8102731f>] init_intel+0xdf/0x2b0
[    0.036000]  [<ffffffff8102597d>] identify_cpu+0x2ed/0x4f0
[    0.036000]  [<ffffffff81cdc1b8>] identify_boot_cpu+0x10/0x7a
[    0.036000]  [<ffffffff81cdc256>] check_bugs+0x9/0x2d
[    0.036000]  [<ffffffff81cd1ece>] start_kernel+0x3bd/0x3d9
[    0.036000]  [<ffffffff81cd1294>] x86_64_start_reservations+0x2f/0x31
[    0.036000]  [<ffffffff81cd13fe>] x86_64_start_kernel+0x168/0x176
[    0.036007] ---[ end trace df7f3cc4a52dae6d ]---

> I have no fundamental issue adding ip to this, but let's keep it
> WARN_ONCE (so we notice loudly and so we get the call trace)

Bah, I'm sceptical about that.

> and use %pF or whatever it's called instead of %lx.

Ok.

> Also, I want to add a variant of WARN that takes pt_regs as parameters
> at some point.  You'd get much better output.  Even without that, Josh
> Poimboeuf and I (mainly Josh) have some work slowly afoot that will
> greatly improve call trace quality when crossing an exception
> boundary.

Yes, because that call trace is almost worthless to me as the function
which actually causes it - init_intel_energy_perf - is not even on the
current stack frame.

And we don't really need the whole trace - we just need to be able to
say:

unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f (init_intel_energy_perf.part.3)

and that function name in there could probably be looked up with
kallsyms by doing something like "get me the function name containing
this rIP".

 (I haven't even looked whether that's doable but it should be).

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 17:50     ` Borislav Petkov
@ 2016-06-15 17:55       ` Andy Lutomirski
  2016-06-15 18:07         ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2016-06-15 17:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Wed, Jun 15, 2016 at 10:50 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Jun 15, 2016 at 10:23:39AM -0700, Andy Lutomirski wrote:
>> Did the "Call Trace" not show up?
>
> It did, but it is bollocks too:
>
> [    0.020003] ------------[ cut here ]------------
> [    0.024009] WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x6f/0x80
> [    0.026455] unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f
> [    0.028008] Modules linked in:
> [    0.032008] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc3+ #26
> [    0.035185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [    0.036000]  0000000000000000 ffffffff81c03d20 ffffffff81298b65 ffffffff81c03d70
> [    0.036000]  0000000000000000 ffffffff81c03d60 ffffffff81050fdb 0000003200000001
> [    0.036000]  ffffffff81c03e38 ffffffff816691a0 0000000000000000 ffffffff81d662e0
> [    0.036000] Call Trace:
> [    0.036000]  [<ffffffff81298b65>] dump_stack+0x67/0x92
> [    0.036000]  [<ffffffff81050fdb>] __warn+0xcb/0xf0
> [    0.036000]  [<ffffffff8105104f>] warn_slowpath_fmt+0x4f/0x60
> [    0.036000]  [<ffffffff81026d9f>] ? init_intel_energy_perf.part.3+0xf/0xd0
> [    0.036000]  [<ffffffff810448ef>] ex_handler_rdmsr_unsafe+0x6f/0x80
> [    0.036000]  [<ffffffff810449c9>] fixup_exception+0x39/0x50
> [    0.036000]  [<ffffffff8101658f>] do_general_protection+0x7f/0x150
> [    0.036000]  [<ffffffff816649bf>] general_protection+0x1f/0x30

> [    0.036000]  [<ffffffff81026d9f>] ? init_intel_energy_perf.part.3+0xf/0xd0

Isn't it this one?

There's a general bug where we have bogus '?' entries when tracing
through an exception frame, but I think we should fix that
generically.  I have a really ugly patch, and I think that fixing it
better is in Josh's queue.  If we fix it generically, then we get this
benefit everywhere in the kernel.

> [    0.036000]  [<ffffffff8102731f>] init_intel+0xdf/0x2b0
> [    0.036000]  [<ffffffff8102597d>] identify_cpu+0x2ed/0x4f0
> [    0.036000]  [<ffffffff81cdc1b8>] identify_boot_cpu+0x10/0x7a
> [    0.036000]  [<ffffffff81cdc256>] check_bugs+0x9/0x2d
> [    0.036000]  [<ffffffff81cd1ece>] start_kernel+0x3bd/0x3d9
> [    0.036000]  [<ffffffff81cd1294>] x86_64_start_reservations+0x2f/0x31
> [    0.036000]  [<ffffffff81cd13fe>] x86_64_start_kernel+0x168/0x176
> [    0.036007] ---[ end trace df7f3cc4a52dae6d ]---
>
>> I have no fundamental issue adding ip to this, but let's keep it
>> WARN_ONCE (so we notice loudly and so we get the call trace)
>
> Bah, I'm sceptical about that.

I'm not.  If %pF points at some silly helper, we still want the frames below it.

>
>> and use %pF or whatever it's called instead of %lx.
>
> Ok.
>
>> Also, I want to add a variant of WARN that takes pt_regs as parameters
>> at some point.  You'd get much better output.  Even without that, Josh
>> Poimboeuf and I (mainly Josh) have some work slowly afoot that will
>> greatly improve call trace quality when crossing an exception
>> boundary.
>
> Yes, because that call trace is almost worthless to me as the function
> which actually causes it - init_intel_energy_perf - is not even on the
> current stack frame.
>
> And we don't really need the whole trace - we just need to be able to
> say:
>
> unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f (init_intel_energy_perf.part.3)
>
> and that function name in there could probably be looked up with
> kallsyms by doing something like "get me the function name containing
> this rIP".
>
>  (I haven't even looked whether that's doable but it should be).

%pF.

--Andy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 17:55       ` Andy Lutomirski
@ 2016-06-15 18:07         ` Borislav Petkov
  2016-06-15 18:12           ` Andy Lutomirski
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-06-15 18:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Wed, Jun 15, 2016 at 10:55:34AM -0700, Andy Lutomirski wrote:
> Isn't it this one?

Yes, it is.

> I'm not.  If %pF points at some silly helper, we still want the frames below it.

Why silly helper? It points to the rIP where the *MSR instruction is.
Can't get more precise than that.

And yes, %pF is better:

[    0.030879] unchecked MSR access error: RDMSR from 0x1b0 at rIP: 0xffffffff81026d9f (init_intel_energy_perf.part.3+0xf/0xd0)

Ok, I'll send a patch.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 18:07         ` Borislav Petkov
@ 2016-06-15 18:12           ` Andy Lutomirski
  2016-06-15 18:23             ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2016-06-15 18:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Wed, Jun 15, 2016 at 11:07 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Jun 15, 2016 at 10:55:34AM -0700, Andy Lutomirski wrote:
>> Isn't it this one?
>
> Yes, it is.
>
>> I'm not.  If %pF points at some silly helper, we still want the frames below it.
>
> Why silly helper? It points to the rIP where the *MSR instruction is.
> Can't get more precise than that.

A hypothetical helper.

void do_thing(unsigned long msr)
{
  rdmsr(...);
}

void actual_problem(void)
{
  do_thing(0xbaadc0de);
}

I want to see actual_problem in the log.

--Andy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 18:12           ` Andy Lutomirski
@ 2016-06-15 18:23             ` Borislav Petkov
  2016-06-15 18:44               ` Andy Lutomirski
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-06-15 18:23 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Wed, Jun 15, 2016 at 11:12:37AM -0700, Andy Lutomirski wrote:
> I want to see actual_problem in the log.

So the optimal thing to do for that is to return an error to the calling
site which says "exception got handled" and then the *calling* site can
dump_stack().

And the old code did that:

static inline unsigned long long native_read_msr_safe(unsigned int msr,
                                                      int *err)
{
        DECLARE_ARGS(val, low, high);

        asm volatile("2: rdmsr ; xor %[err],%[err]\n"
                     "1:\n\t"
                     ".section .fixup,\"ax\"\n\t"
                     "3:  mov %[fault],%[err] ; jmp 1b\n\t"
                     ".previous\n\t"
                     _ASM_EXTABLE(2b, 3b)
                     : [err] "=r" (*err), EAX_EDX_RET(val, low, high)
                     : "c" (msr), [fault] "i" (-EIO));
						^^^^


-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe
  2016-06-15 18:23             ` Borislav Petkov
@ 2016-06-15 18:44               ` Andy Lutomirski
  0 siblings, 0 replies; 11+ messages in thread
From: Andy Lutomirski @ 2016-06-15 18:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Paolo Bonzini, wfg, LKP, lkml, Fengguang Wu, Eduardo Habkost,
	Josh Poimboeuf

On Wed, Jun 15, 2016 at 11:23 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Jun 15, 2016 at 11:12:37AM -0700, Andy Lutomirski wrote:
>> I want to see actual_problem in the log.
>
> So the optimal thing to do for that is to return an error to the calling
> site which says "exception got handled" and then the *calling* site can
> dump_stack().
>
> And the old code did that:
>
> static inline unsigned long long native_read_msr_safe(unsigned int msr,
>                                                       int *err)
> {
>         DECLARE_ARGS(val, low, high);
>
>         asm volatile("2: rdmsr ; xor %[err],%[err]\n"
>                      "1:\n\t"
>                      ".section .fixup,\"ax\"\n\t"
>                      "3:  mov %[fault],%[err] ; jmp 1b\n\t"
>                      ".previous\n\t"
>                      _ASM_EXTABLE(2b, 3b)
>                      : [err] "=r" (*err), EAX_EDX_RET(val, low, high)
>                      : "c" (msr), [fault] "i" (-EIO));
>                                                 ^^^^
>

Then we end up with bloat.  The current thing generates very compact code.

--Andy

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-06-15 18:45 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <57614955.l3IQMtSfaGPEBdCy%fengguang.wu@intel.com>
2016-06-15 14:25 ` [x86] 5ac0c41bf3: WARNING: CPU: 0 PID: 0 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe Borislav Petkov
2016-06-15 14:51   ` Eduardo Habkost
2016-06-15 17:07     ` Borislav Petkov
2016-06-15 14:53   ` Paolo Bonzini
2016-06-15 17:23   ` Andy Lutomirski
2016-06-15 17:50     ` Borislav Petkov
2016-06-15 17:55       ` Andy Lutomirski
2016-06-15 18:07         ` Borislav Petkov
2016-06-15 18:12           ` Andy Lutomirski
2016-06-15 18:23             ` Borislav Petkov
2016-06-15 18:44               ` Andy Lutomirski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).