* AMD erratum 665 on f15h processor?
@ 2017-12-17  9:04 Andrew Randrianasulu
  2017-12-17 20:52 ` Borislav Petkov
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Randrianasulu @ 2017-12-17  9:04 UTC (permalink / raw)
  To: linux-kernel

Hello!

I was trying to find out why none of my old kernels boot on my relatively
new machine. Kernels 4.10+ boot fine (I am running 4.14.3 right now), but
older kernels die early in boot.

After some digging I found this
https://patchwork.kernel.org/patch/9311567/

The patch talks about family 12h, but my machine has this CPU:

[    0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor (family: 0x15, model: 0x2, stepping: 0x0)
[    0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.


Because the fix is applied unconditionally, it probably helps me too, so
please don't remove it.
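
To make the family comparison above concrete, here is a minimal sketch of how
a CPUID leaf 1 signature decodes into the family/model/stepping values that
smpboot prints. The raw signature 0x00600F20 is an assumption (the thread only
shows the decoded values), and the function and constant names are made up for
illustration.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    model = base_model
    if base_family == 0xF:
        # AMD rule: the extended fields only count when the base family is 0xF.
        family += ext_family
        model |= ext_model << 4
    return family, model, stepping


# Assumption: a raw signature that decodes to the values in the boot log.
ASSUMED_SIGNATURE = 0x00600F20
# The patch linked above talks about family 12h.
ERRATUM_665_FAMILY = 0x12

family, model, stepping = decode_cpuid_signature(ASSUMED_SIGNATURE)
print(hex(family), hex(model), hex(stepping))
print("same family as the patch targets:", family == ERRATUM_665_FAMILY)
```

Under that assumption the decoded family is 0x15, so a check keyed on family
0x12 would not match this CPU.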

The failure log from qemu running kernel 4.2 is below.


.text : 0xc0100000 - 0xc046ceb7   (3507 kB)
 Checking if this processor honours the WP bit even in supervisor mode...Ok.
 SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
 Hierarchical RCU implementation.
  Build-time adjustment of leaf fanout to 32.
  RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=1.
 RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=1
 NR_IRQS:2304 nr_irqs:256 16
 Console: colour VGA+ 80x60
 console [tty0] enabled
 clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
 tsc: Fast TSC calibration failed
 tsc: Unable to calibrate against PIT
 tsc: HPET/PMTIMER calibration failed
 tsc: Marking TSC unstable due to could not calculate TSC khz
 Calibrating delay loop... 1253.37 BogoMIPS (lpj=2506752)
 pid_max: default: 32768 minimum: 301
 ACPI: Core revision 20150619
 ACPI: All ACPI Tables successfully acquired
 Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
 Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
 Initializing cgroup subsys net_cls
 general protection fault: 0000 [#1] SMP
 Modules linked in:
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486 #7
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
 task: c05dba40 ti: c05d4000 task.ti: c05d4000
 EIP: 0060:[<c010ec47>] EFLAGS: 00210202 CPU: 0
 EIP is at cpu_has_amd_erratum+0x23/0xb2
 EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c0470b2c
 ESI: c0630d00 EDI: c0470b30 EBP: c05d5f24 ESP: c05d5f14
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 CR0: 8005003b CR2: ffc77000 CR3: 006d2000 CR4: 00040690
 Stack:
  02008140 00000000 c0630d00 00000000 c05d5f70 c010f446 000000d0 c05d5f5c
  c01e0571 00000010 0000001e 00000000 00000000 00000009 00000010 00000000
  c0630d00 00000000 c05d5f70 c010d74d 00000020 c0630d00 c0630d8b c05d5f9c
 Call Trace:
  [<c010f446>] init_amd+0x4e8/0x662
  [<c01e0571>] ? kmem_cache_alloc_trace+0xbe/0xc8
  [<c010d74d>] ? get_cpu_cap+0x127/0x12c
  [<c010d936>] identify_cpu+0x1e4/0x366
  [<c01e044c>] ? kmem_cache_alloc+0x90/0xf7
  [<c01c7869>] ? kmem_cache_create+0x118/0x15b
  [<c063f1ea>] identify_boot_cpu+0x10/0x99
  [<c018fb35>] ? __delayacct_tsk_init+0x15/0x28
  [<c063f2a6>] check_bugs+0x9/0x39
  [<c0638ae3>] start_kernel+0x3a3/0x3b3
  [<c063854d>] ? set_init_arg+0x52/0x52
  [<c06382b8>] i386_start_kernel+0x82/0x86
 Code: e0 eb 5d c0 89 e5 5d c3 55 89 e5 57 56 89 c6 53 51 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 54 8b 40 2c f6 c4 02 74 4c b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 39 72 05 3b 5d f0 73 32
 EIP: [<c010ec47>] cpu_has_amd_erratum+0x23/0xb2 SS:ESP 0068:c05d5f14
 ---[ end trace 8bfd5e6fa0a4fcb2 ]---
 Kernel panic - not syncing: Attempted to kill the idle task!
 ---[ end Kernel panic - not syncing: Attempted to kill the idle task!

Well, since this bug has apparently been fixed and the fix has propagated
to -stable, it shouldn't concern me too much. But maybe someone will
rearrange those checks in the future and assume that only some old AMD CPUs
were affected, so I am leaving this message.

qemu cmd line:
qemu-system-x86_64 -M q35 -enable-kvm -cdrom /dev/shm/slax_16_12_2017_test.iso -m 512 -soundhw es1370 -cpu host -device sga -curses

-cpu host is really important here. I used VGA mode 6 (vga=6) blindly to get
as much output on screen as possible.
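
The "Code:" line in the log above already says which instruction trapped: the
byte in angle brackets is where EIP pointed. As a rough sketch (the function
name is made up, and it only recognizes the one pattern seen here), this picks
out a trapping rdmsr and the MSR index loaded into ECX just before it:

```python
RDMSR_OPCODE = b"\x0f\x32"
MOV_IMM32_TO_ECX = 0xB9


def find_trapping_msr(code_line):
    """Return the MSR index if the oops trapped on 'mov $imm32,%ecx; rdmsr'.

    Returns None when the trapping instruction is not rdmsr, or when the
    instruction right before it is not a plain 'mov $imm32,%ecx'.
    """
    tokens = code_line.split()
    # The byte at the faulting EIP is printed as <xx>.
    trap_index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    raw = [int(tok.strip("<>"), 16) for tok in tokens]

    if bytes(raw[trap_index:trap_index + 2]) != RDMSR_OPCODE:
        return None
    if trap_index < 5 or raw[trap_index - 5] != MOV_IMM32_TO_ECX:
        return None
    # The immediate operand is stored little-endian.
    return int.from_bytes(bytes(raw[trap_index - 4:trap_index]), "little")


# Excerpt of the Code: line from the log above.
CODE_EXCERPT = "f6 c4 02 74 4c b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8"

msr = find_trapping_msr(CODE_EXCERPT)
print(hex(msr))
```

The result agrees with the register dump, where ECX holds c0010140 at the time
of the fault.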


* Re: AMD erratum 665 on f15h processor?
  2017-12-17  9:04 AMD erratum 665 on f15h processor? Andrew Randrianasulu
@ 2017-12-17 20:52 ` Borislav Petkov
  2017-12-18  3:01   ` Andrew Randrianasulu
  0 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2017-12-17 20:52 UTC (permalink / raw)
  To: Andrew Randrianasulu; +Cc: linux-kernel

On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> Hello!
> 
> I was trying to investigate why all my old kernels can't be booted on my 
> relatively new machine. Kernels 4.10+ naturally boot - I use 4.14.3 right now - 
> but old kernels die early ...
> 
> After some digging I found this
> https://patchwork.kernel.org/patch/9311567/
> 
> Patch talk about family 12h, but my machine has this CPU:
> 
> [    0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor (family: 0x15, 
> model: 0x2, stepping: 0x0)
> [    0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.

Yes, your machine is not affected by that erratum. So far so good.

The rest of your mail I have hard time understanding: you're talking
about old kernels not booting on a new machine but then you paste a qemu
32-bit guest kernel boot log and after that I'm lost.

Perhaps you should try again by explaining in detail what exactly you're
trying to do and how exactly you're going about doing that...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: AMD erratum 665 on f15h processor?
  2017-12-17 20:52 ` Borislav Petkov
@ 2017-12-18  3:01   ` Andrew Randrianasulu
  2017-12-18 13:22     ` Borislav Petkov
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Randrianasulu @ 2017-12-18  3:01 UTC (permalink / raw)
  To: Borislav Petkov, linux-kernel

On Sunday 17 December 2017 23:52:05 you wrote:
> On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > Hello!
> >
> > I was trying to investigate why all my old kernels can't be booted on my
> > relatively new machine. Kernels 4.10+ naturally boot - I use 4.14.3 right
> > now - but old kernels die early ...
> >
> > After some digging I found this
> > https://patchwork.kernel.org/patch/9311567/
> >
> > Patch talk about family 12h, but my machine has this CPU:
> >
> > [    0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > (family: 0x15, model: 0x2, stepping: 0x0)
> > [    0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.
>
> Yes, your machine is not affected by that erratum. So far so good.
>
> The rest of your mail I have hard time understanding: you're talking
> about old kernels not booting on a new machine but then you paste a qemu
> 32-bit guest kernel boot log and after that I'm lost.
>
> Perhaps you should try again by explaining in detail what exactly you're
> trying to do and how exactly you're going about doing that...

Hi, Borislav!

I was trying to boot a few self-made live CDs/DVDs that use self-compiled
kernels in the 3.2-4.2 range. None of those old disks boot in qemu if I set
the CPU type to 'host'. I have a whole collection of old kernels going back
to 2011, and none of them work anymore! An even older CD with
2.6.23.something simply rebooted on the physical machine after isolinux
loaded the kernel and initrd. But 2.6.27.9 worked, at least in qemu (I don't
really want to reboot the machine because of some stuff in tmpfs). So, since
4.2.0-i486 was my previous failsafe kernel and it most likely will not work
anymore, I guess I will use 4.12.0-x64. I was just trying to find any change
that would explain this error, and your fix was the closest one I could find
in this time interval (2015-2017). Maybe it was just some unrelated, purely
software bug in the AMD detection code. I spent some time trying to figure
out how to copy/paste from qemu; the -curses interface finally worked.

I think I missed this misbehavior because I mostly used plain qemu without
-cpu host (but with -enable-kvm), and that worked without problems.

When I first got this machine in early 2017, I already had 4.9+ as one of
the kernels in the lilo menu, so when 4.2 failed I quickly booted the newer
kernel and forgot about it. Later I compiled 4.12 for use on a friend's
machine with a new AMD video card, but the default in syslinux/isolinux was
still set to 4.2.0, and it worked on another AMD machine. A few days ago I
decided to make a new 'live backup' of my running system, and while playing
with a new qemu I discovered this oddity.

Still, for me it raises an interesting question. As far as I understand,
qemu's BIOS (SeaBIOS) doesn't apply all those CPU-specific workarounds/fixes,
but with qemu -cpu host the guest kernel will see a nearly exact CPU model
and will try to apply some fixups (or not, assuming the BIOS/firmware has
already set everything up correctly?), or at least run some detection code.
Of course I can just compile a new kernel with those checks disabled, but
the older kernels are already compiled ... and disabling those workarounds
would lead to crashes later on, so a runtime switch to disable them is not a
good idea?

I am not sure I will be able to get a real boot log from a boot on the
physical machine: I don't think I compiled those old kernels with any way to
store an early oops/panic ..:/

Thanks for answering, and sorry for a possible false-positive bug report.


* Re: AMD erratum 665 on f15h processor?
  2017-12-18  3:01   ` Andrew Randrianasulu
@ 2017-12-18 13:22     ` Borislav Petkov
       [not found]       ` <201712181954.52740.randrianasulu@gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2017-12-18 13:22 UTC (permalink / raw)
  To: Andrew Randrianasulu; +Cc: linux-kernel, kvm ML

+ kvm ML.

On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> On Sunday 17 December 2017 23:52:05 you wrote:
> > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > Hello!
> > >
> > > I was trying to investigate why all my old kernels can't be booted on my
> > > relatively new machine. Kernels 4.10+ naturally boot - I use 4.14.3 right
> > > now - but old kernels die early ...
> > >
> > > After some digging I found this
> > > https://patchwork.kernel.org/patch/9311567/
> > >
> > > Patch talk about family 12h, but my machine has this CPU:
> > >
> > > [    0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > [    0.056000] Performance Events: Fam15h core perfctr, AMD PMU driver.
> >
> > Yes, your machine is not affected by that erratum. So far so good.
> >
> > The rest of your mail I have hard time understanding: you're talking
> > about old kernels not booting on a new machine but then you paste a qemu
> > 32-bit guest kernel boot log and after that I'm lost.
> >
> > Perhaps you should try again by explaining in detail what exactly you're
> > trying to do and how exactly you're going about doing that...
> 
> Hi, Borislav!
> 
> I was trying to boot few self-made liveCD/DVDs - they use self-compiled kernels 
> in 3.2-4.2 range. None of those old disks boots in qemu if I set it to cpu 
> type 'host'. I have whole collection of old kernels since 2011, and none work 
> anymore ! Even older CD with 2.6.23.something plainly rebooted after kernel and 
> initrd were loaded by isolinux on physical machine! But 2.6.27.9 worked at 
> least in qemu (not really want to reboot machine due to some stuff in tmpfs). 
> So, because 4.2.0-i486  was my previous failsafe kernel, and it most likely 
> will not work anymore - I guess I will use 4.12.0-x64.. I was just trying to 
> find any change explaining this error, and your fix was closer I was able to 
> find in this time interval (2015-2017). May be it was just some unrelated 
> purely software bug in amd detection code.. I spend some time trying to figure 
> out how to copy/paste from qemu, finally -curses interface worked.
> 
> I think I missed this misbehavior because I mostly used just qemu, without -cpu 
> host (but with -enable-kvm), so it worked without problems.

So -cpu host means:

x86             host  KVM processor with all supported host features (only available in KVM mode)

which would theoretically mean that those guest kernel configs shouldn't
boot on the baremetal box either, if they fail on the guest.

But who knows what's happening.

You can give me a guest kernel .config of a kernel which fails along
with the exact qemu cmdline to try out here.

(Leaving in the rest for reference.)

> When I first got this machine in early 2017 I already had 4.9+ as one of 
> possible kernels in lilo menu, so, when 4.2 failed I quickly booted new kernel, 
> and forgot about it. Lately I compiled 4.12 for using it on friend's machine 
> with new AMD videocard - but default in syslinux/isolinux was still set to 
> 4.2.0, and it worked on another AMD machine. Few days ago i decided to make 
> new 'live backup' of my running system, and while playing with new quemu 
> discovered this oddity.
> 
> Still, for me it raises interesting question: as far as I understand qemu's BIOS 
> (SeaBIOS) doesn't set all those cpu-specific workarounds/fixes - but with 
> qemu -cpu host guest kernel will see nearly exact cpu model, and will try to 
> apply (or not, assuming BIOS/firmware already set everything correctly?) some 
> fixups, or at least run some detection code? Of course I can just compile new 
> kernel with those checks disabled, but older kernels already compiled ... and 
> disabling those workarounds will lead to crashes later on, so having runtime 
> disable for them is not good idea ?
> 
> Not sure if I will able to get real boot log from physical machine boot - I 
> don't think I compiled those old kernels with any way to store early 
> oops/panic ..:/
> 
> Thanks for answering and sorry for possible false positive bug report.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: AMD erratum 665 on f15h processor?
       [not found]       ` <201712181954.52740.randrianasulu@gmail.com>
@ 2017-12-18 21:05         ` Borislav Petkov
  2017-12-19  5:22           ` Andrew Randrianasulu
  0 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2017-12-18 21:05 UTC (permalink / raw)
  To: Andrew Randrianasulu; +Cc: kvm ML, lkml

When you reply, please hit reply-to-all in your mail client so that
mailing lists get CCed too.

On Mon, Dec 18, 2017 at 07:54:52PM +0300, Andrew Randrianasulu wrote:
> On Monday 18 December 2017 16:22:15 you wrote:
> > + kvm ML.
> >
> > On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> > > On Sunday 17 December 2017 23:52:05 you wrote:
> > > > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > > > Hello!
> > > > >
> > > > > I was trying to investigate why all my old kernels can't be booted on
> > > > > my relatively new machine. Kernels 4.10+ naturally boot - I use
> > > > > 4.14.3 right now - but old kernels die early ...
> > > > >
> > > > > After some digging I found this
> > > > > https://patchwork.kernel.org/patch/9311567/
> > > > >
> > > > > Patch talk about family 12h, but my machine has this CPU:
> > > > >
> > > > > [    0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > > > [    0.056000] Performance Events: Fam15h core perfctr, AMD PMU
> > > > > driver.
> > > >
> > > > Yes, your machine is not affected by that erratum. So far so good.
> > > >
> > > > The rest of your mail I have hard time understanding: you're talking
> > > > about old kernels not booting on a new machine but then you paste a
> > > > qemu 32-bit guest kernel boot log and after that I'm lost.
> > > >
> > > > Perhaps you should try again by explaining in detail what exactly
> > > > you're trying to do and how exactly you're going about doing that...
> > >
> > > Hi, Borislav!
> > >
> > > I was trying to boot few self-made liveCD/DVDs - they use self-compiled
> > > kernels in 3.2-4.2 range. None of those old disks boots in qemu if I set
> > > it to cpu type 'host'. I have whole collection of old kernels since 2011,
> > > and none work anymore ! Even older CD with 2.6.23.something plainly
> > > rebooted after kernel and initrd were loaded by isolinux on physical
> > > machine! But 2.6.27.9 worked at least in qemu (not really want to reboot
> > > machine due to some stuff in tmpfs). So, because 4.2.0-i486  was my
> > > previous failsafe kernel, and it most likely will not work anymore - I
> > > guess I will use 4.12.0-x64.. I was just trying to find any change
> > > explaining this error, and your fix was closer I was able to find in this
> > > time interval (2015-2017). May be it was just some unrelated purely
> > > software bug in amd detection code.. I spend some time trying to figure
> > > out how to copy/paste from qemu, finally -curses interface worked.
> > >
> > > I think I missed this misbehavior because I mostly used just qemu,
> > > without -cpu host (but with -enable-kvm), so it worked without problems.
> >
> > So -cpu host means:
> >
> > x86             host  KVM processor with all supported host features (only
> > available in KVM mode)
> >
> > which would theoretically mean that those guest kernel configs shouldn't
> > boot on the baremetal box either, if they fail on the guest.
> >
> > But who knows what's happening.
> >
> > You can give me a guest kernel .config of a kernel which fails along
> > with the exact qemu cmdline to try out here.
> 
> .config attached.
> 
> To reproduce, just launch qemu like this:
> 
> qemu-system-i386 -kernel /home/admin/slax-build/boot/vmlinuz -cpu host --enable-kvm
> (just tried).
> 
> Of course, replace the path to the kernel image with your own. I can also
> attach the binary image, but I think it would be of little use to you.

Nah, I built it using your .config.

So my guest stops very early in the BIOS with 

"Failed to allocate space for phdrs

-- System halted."

Then I looked at this:

https://bugzilla.kernel.org/show_bug.cgi?id=114671

and there's a patch

https://bugzilla.kernel.org/attachment.cgi?id=209601&action=diff&collapsed=&headers=1&format=raw

With it, it booted a bit further. But I still couldn't see any output.

So I booted with my cmdline to see more output and it did say:

general protection fault: 0000 [#1] SMP 
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
task: c05b9a80 ti: c05b2000 task.ti: c05b2000
EIP: 0060:[<c010e390>] EFLAGS: 00210293 CPU: 0
EIP is at cpu_has_amd_erratum+0x24/0xb0
EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c044ccf4
ESI: c0616900 EDI: c044ccf8 EBP: c05b3f68 ESP: c05b3f58
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: ffc77000 CR3: 006ae000 CR4: 00040690
Stack:
 02008140 00000000 c0616900 00000000 c05b3fa8 c010ec8b f5001d80 0000001e
 00000000 00000000 00000009 00000010 00000000 c0616900 00000000 c05b3fa8
 c010cf58 c0616900 c0616900 c061695c c05b3fc8 c010d156 c061698b c061695c
Call Trace:
 [<c010ec8b>] init_amd+0x5ee/0x631
 [<c010cf58>] ? get_cpu_cap+0x121/0x126
 [<c010d156>] identify_cpu+0x1f9/0x37d
 [<c0624a18>] identify_boot_cpu+0xd/0x80
 [<c0624abd>] check_bugs+0x8/0x35
 [<c061ea42>] start_kernel+0x32a/0x339
 [<c061e2c2>] i386_start_kernel+0x8c/0x90
Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
EIP: [<c010e390>] cpu_has_amd_erratum+0x24/0xb0 SS:ESP 0068:c05b3f58
---[ end trace 7fb9e71b486a229a ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task!

Which is exactly like the splat you've posted and that fails:

Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
All code
========
   0:   cf                      iret   
   1:   5b                      pop    %rbx
   2:   c0 89 e5 5d c3 55 89    rorb   $0x89,0x55c35de5(%rcx)
   9:   e5 57                   in     $0x57,%eax
   b:   56                      push   %rsi
   c:   53                      push   %rbx
   d:   51                      push   %rcx
   e:   89 c6                   mov    %eax,%esi
  10:   8b 1a                   mov    (%rdx),%ebx
  12:   8d 7a 04                lea    0x4(%rdx),%edi
  15:   81 fb ff ff 00 00       cmp    $0xffff,%ebx
  1b:   77 57                   ja     0x74
  1d:   8b 40 2c                mov    0x2c(%rax),%eax
  20:   0f ba e0 09             bt     $0x9,%eax
  24:   73 4e                   jae    0x74
  26:   b9 40 01 01 c0          mov    $0xc0010140,%ecx
  2b:*  0f 32                   rdmsr           <-- trapping instruction
  2d:   89 45 f0                mov    %eax,-0x10(%rbp)
  30:   89 d8                   mov    %ebx,%eax
  32:   89 d1                   mov    %edx,%ecx
  34:   99                      cltd
  35:   39 ca                   cmp    %ecx,%edx
  37:   77 3b                   ja     0x74
  39:   72 05                   jb     0x40
  3b:   3b 5d f0                cmp    -0x10(%rbp),%ebx
  3e:   73 34                   jae    0x74

because it tries to read from a non-existent MSR - 0xc0010140 - and
maybe it is because of the -cpu host emulation or so but those MSRs do
get virtualized, see

2b036c6b861d ("KVM: SVM: Add support for AMD's OSVW feature in guests")

but I'd refer to the kvm/qemu people to explain what the deal here
exactly is.

What I do, is use -cpu Opteron_G5 which is also F15h and that works.
Oh, and I'd use 64-bit kernels - 32-bit is not really being tested as
extensively.
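
The disassembly above shows the shape of the failing check: test bit 9 of a
capability word, load 0xc0010140 into ECX, then rdmsr. What follows is a
minimal Python model of that sequence, not the kernel's code. FakeCpu, the
exception class, and the function name are made up for illustration. The
status register at 0xc0010141 and the one-bit-per-ID layout are assumptions
about how OSVW is organized and should be checked against AMD's
documentation.

```python
MSR_OSVW_ID_LENGTH = 0xC0010140   # the MSR named in the oops
MSR_OSVW_STATUS = 0xC0010141      # assumption: status bits follow the length


class GeneralProtectionFault(Exception):
    """Stands in for the #GP raised by rdmsr on an unimplemented MSR."""


class FakeCpu:
    """Toy CPU: a CPUID feature flag plus a table of implemented MSRs."""

    def __init__(self, has_osvw_flag, msrs):
        self.has_osvw_flag = has_osvw_flag
        self.msrs = dict(msrs)

    def rdmsr(self, index):
        if index not in self.msrs:
            raise GeneralProtectionFault(hex(index))
        return self.msrs[index]


def osvw_erratum_present(cpu, osvw_id):
    """True/False if OSVW answers for this ID, None if it cannot say."""
    if not cpu.has_osvw_flag:
        return None
    length = cpu.rdmsr(MSR_OSVW_ID_LENGTH)   # faults if the MSR is missing
    if osvw_id >= length:
        return None
    status = cpu.rdmsr(MSR_OSVW_STATUS + (osvw_id >> 6))
    return bool(status & (1 << (osvw_id & 0x3F)))


# Guest that advertises the feature bit but has no backing MSRs.
inconsistent_guest = FakeCpu(has_osvw_flag=True, msrs={})
# Guest where the MSRs are provided (values here are arbitrary).
consistent_guest = FakeCpu(
    has_osvw_flag=True,
    msrs={MSR_OSVW_ID_LENGTH: 3, MSR_OSVW_STATUS: 0b010},
)

try:
    osvw_erratum_present(inconsistent_guest, 1)
except GeneralProtectionFault as fault:
    print("fault reading MSR", fault)

print(osvw_erratum_present(consistent_guest, 1))
```

In this model the crash needs both conditions at once: the feature flag is
visible to the guest, and the MSR read is not backed by anything. Whether that
is what actually happens with -cpu host here is the question for the kvm/qemu
people.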

HTH.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: AMD erratum 665 on f15h processor?
  2017-12-18 21:05         ` Borislav Petkov
@ 2017-12-19  5:22           ` Andrew Randrianasulu
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Randrianasulu @ 2017-12-19  5:22 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: kvm ML, lkml

On Tuesday 19 December 2017 00:05:40 Borislav Petkov wrote:
> When you git reply, please hit reply-to-all in your mail client so that
> mailing lists get CCed too.

ok.

>
> On Mon, Dec 18, 2017 at 07:54:52PM +0300, Andrew Randrianasulu wrote:
> > On Monday 18 December 2017 16:22:15 you wrote:
> > > + kvm ML.
> > >
> > > On Mon, Dec 18, 2017 at 06:01:21AM +0300, Andrew Randrianasulu wrote:
> > > > On Sunday 17 December 2017 23:52:05 you wrote:
> > > > > On Sun, Dec 17, 2017 at 12:04:28PM +0300, Andrew Randrianasulu wrote:
> > > > > > Hello!
> > > > > >
> > > > > > I was trying to investigate why all my old kernels can't be
> > > > > > booted on my relatively new machine. Kernels 4.10+ naturally boot
> > > > > > - I use 4.14.3 right now - but old kernels die early ...
> > > > > >
> > > > > > After some digging I found this
> > > > > > https://patchwork.kernel.org/patch/9311567/
> > > > > >
> > > > > > Patch talk about family 12h, but my machine has this CPU:
> > > > > >
> > > > > > [    0.056000] smpboot: CPU0: AMD FX(tm)-4300 Quad-Core Processor
> > > > > > (family: 0x15, model: 0x2, stepping: 0x0)
> > > > > > [    0.056000] Performance Events: Fam15h core perfctr, AMD PMU
> > > > > > driver.
> > > > >
> > > > > Yes, your machine is not affected by that erratum. So far so good.
> > > > >
> > > > > The rest of your mail I have hard time understanding: you're
> > > > > talking about old kernels not booting on a new machine but then you
> > > > > paste a qemu 32-bit guest kernel boot log and after that I'm lost.
> > > > >
> > > > > Perhaps you should try again by explaining in detail what exactly
> > > > > you're trying to do and how exactly you're going about doing
> > > > > that...
> > > >
> > > > Hi, Borislav!
> > > >
> > > > I was trying to boot few self-made liveCD/DVDs - they use
> > > > self-compiled kernels in 3.2-4.2 range. None of those old disks boots
> > > > in qemu if I set it to cpu type 'host'. I have whole collection of
> > > > old kernels since 2011, and none work anymore ! Even older CD with
> > > > 2.6.23.something plainly rebooted after kernel and initrd were loaded
> > > > by isolinux on physical machine! But 2.6.27.9 worked at least in qemu
> > > > (not really want to reboot machine due to some stuff in tmpfs). So,
> > > > because 4.2.0-i486  was my previous failsafe kernel, and it most
> > > > likely will not work anymore - I guess I will use 4.12.0-x64.. I was
> > > > just trying to find any change explaining this error, and your fix
> > > > was closer I was able to find in this time interval (2015-2017). May
> > > > be it was just some unrelated purely software bug in amd detection
> > > > code.. I spend some time trying to figure out how to copy/paste from
> > > > qemu, finally -curses interface worked.
> > > >
> > > > I think I missed this misbehavior because I mostly used just qemu,
> > > > without -cpu host (but with -enable-kvm), so it worked without
> > > > problems.
> > >
> > > So -cpu host means:
> > >
> > > x86             host  KVM processor with all supported host features
> > > (only available in KVM mode)
> > >
> > > which would theoretically mean that those guest kernel configs
> > > shouldn't boot on the baremetal box either, if they fail on the guest.
> > >
> > > But who knows what's happening.
> > >
> > > You can give me a guest kernel .config of a kernel which fails along
> > > with the exact qemu cmdline to try out here.
> >
> > .config attached.
> >
> > for reproducting just launch qemu like this:
> >
> > qemu-system-i386 -kernel /home/admin/slax-build/boot/vmlinuz -cpu
> > host --enable-kvm (just tried).
> >
> >  Of course replace path to kernel image with your own. I can also attach
> > binary image, but I think it will be of little use for you.....
>
> Nah, I built it using your .config.
>
> So my guest stops very early in the BIOS with
>
> "Failed to allocate space for phdrs
>
> -- System halted."
>
> Then I looked at this:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=114671
>
> and there's a patch
>
> https://bugzilla.kernel.org/attachment.cgi?id=209601&action=diff&collapsed=&headers=1&format=raw


Thanks. It looks like I will have more fun building a 32-bit kernel, because
I have already updated binutils.

>
> With it, it booted a bit further. But I still couldn't see any output.
>
> So I booted with my cmdline to see more output and it did say:
>
> general protection fault: 0000 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-i486+ #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> task: c05b9a80 ti: c05b2000 task.ti: c05b2000
> EIP: 0060:[<c010e390>] EFLAGS: 00210293 CPU: 0
> EIP is at cpu_has_amd_erratum+0x24/0xb0
> EAX: 00210bf7 EBX: 00000001 ECX: c0010140 EDX: c044ccf4
> ESI: c0616900 EDI: c044ccf8 EBP: c05b3f68 ESP: c05b3f58
>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: ffc77000 CR3: 006ae000 CR4: 00040690
> Stack:
>  02008140 00000000 c0616900 00000000 c05b3fa8 c010ec8b f5001d80 0000001e
>  00000000 00000000 00000009 00000010 00000000 c0616900 00000000 c05b3fa8
>  c010cf58 c0616900 c0616900 c061695c c05b3fc8 c010d156 c061698b c061695c
> Call Trace:
>  [<c010ec8b>] init_amd+0x5ee/0x631
>  [<c010cf58>] ? get_cpu_cap+0x121/0x126
>  [<c010d156>] identify_cpu+0x1f9/0x37d
>  [<c0624a18>] identify_boot_cpu+0xd/0x80
>  [<c0624abd>] check_bugs+0x8/0x35
>  [<c061ea42>] start_kernel+0x32a/0x339
>  [<c061e2c2>] i386_start_kernel+0x8c/0x90
> Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
> EIP: [<c010e390>] cpu_has_amd_erratum+0x24/0xb0 SS:ESP 0068:c05b3f58
> ---[ end trace 7fb9e71b486a229a ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> ---[ end Kernel panic - not syncing: Attempted to kill the idle task!
>
> Which is exactly like the splat you've posted and that fails:
>
> Code: cf 5b c0 89 e5 5d c3 55 89 e5 57 56 53 51 89 c6 8b 1a 8d 7a 04 81 fb ff ff 00 00 77 57 8b 40 2c 0f ba e0 09 73 4e b9 40 01 01 c0 <0f> 32 89 45 f0 89 d8 89 d1 99 39 ca 77 3b 72 05 3b 5d f0 73 34
> All code
> ========
>    0:   cf                      iret
>    1:   5b                      pop    %rbx
>    2:   c0 89 e5 5d c3 55 89    rorb   $0x89,0x55c35de5(%rcx)
>    9:   e5 57                   in     $0x57,%eax
>    b:   56                      push   %rsi
>    c:   53                      push   %rbx
>    d:   51                      push   %rcx
>    e:   89 c6                   mov    %eax,%esi
>   10:   8b 1a                   mov    (%rdx),%ebx
>   12:   8d 7a 04                lea    0x4(%rdx),%edi
>   15:   81 fb ff ff 00 00       cmp    $0xffff,%ebx
>   1b:   77 57                   ja     0x74
>   1d:   8b 40 2c                mov    0x2c(%rax),%eax
>   20:   0f ba e0 09             bt     $0x9,%eax
>   24:   73 4e                   jae    0x74
>   26:   b9 40 01 01 c0          mov    $0xc0010140,%ecx
>   2b:*  0f 32                   rdmsr           <-- trapping instruction
>   2d:   89 45 f0                mov    %eax,-0x10(%rbp)
>   30:   89 d8                   mov    %ebx,%eax
>   32:   89 d1                   mov    %edx,%ecx
>   34:   99                      cltd
>   35:   39 ca                   cmp    %ecx,%edx
>   37:   77 3b                   ja     0x74
>   39:   72 05                   jb     0x40
>   3b:   3b 5d f0                cmp    -0x10(%rbp),%ebx
>   3e:   73 34                   jae    0x74
>
> because it tries to read from a non-existent MSR - 0xc0010140 - and
> maybe it is because of the -cpu host emulation or so but those MSRs do
> get virtualized, see
>
> 2b036c6b861d ("KVM: SVM: Add support for AMD's OSVW feature in guests")

Thanks again. The patch "KVM: SVM: Add support for AMD's OSVW feature in
guests" answered my question about virtualizing somewhat buggy CPUs.

>
> but I'd refer to the kvm/qemu people to explain what the deal here
> exactly is.
>
> What I do, is use -cpu Opteron_G5 which is also F15h and that works.
> Oh, and I'd use 64-bit kernels - 32-bit is not really being tested as
> extensively.

-cpu Opteron_G5 works here, too.
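
For anyone scripting this comparison, a small sketch of building the two
command lines side by side. It only assembles the argument lists using the
options already quoted in this thread and does not start qemu. The function
name and the kernel path are placeholders.

```python
import shlex


def qemu_command(kernel_path, cpu_model, binary="qemu-system-i386"):
    """Build a qemu argument list that boots a kernel image directly."""
    return [binary, "-kernel", kernel_path, "-cpu", cpu_model, "-enable-kvm"]


# Placeholder path: substitute your own kernel image.
KERNEL = "/path/to/vmlinuz"

failing = qemu_command(KERNEL, "host")
working = qemu_command(KERNEL, "Opteron_G5")

for argv in (failing, working):
    # Quote each argument so the line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping every option except -cpu identical makes it easier to attribute a
difference in boot behaviour to the CPU model alone.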


>
> HTH.

