* OpenBSD 5.0 kernel panic in AMD K10 cpu power state
@ 2011-11-08  9:25 Walter Haidinger
  2011-11-09 10:39 ` Avi Kivity
  0 siblings, 1 reply; 6+ messages in thread
From: Walter Haidinger @ 2011-11-08  9:25 UTC (permalink / raw)
  To: kvm

Hi!

OpenBSD 5.0/i386 throws a kernel panic when I try to
boot it inside a Linux KVM (host: vanilla 3.0.4,
openSUSE 11.4/x86_64) under qemu-kvm 0.14.1 and 0.15.1.
Note that OpenBSD 4.9/i386 works.

The OpenBSD developers say:
"the virtual machine emulator you are using has a bug.  it declares
a cpu type from upstream and then does not emulate certain functions
of that cpu."

Therefore I'm reporting this here.

More from misc@openbsd.org:
  > OpenBSD 5.0 (GENERIC) #43: Wed Aug 17 10:10:52 MDT 2011
  >   deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
  > cpu0: AMD Phenom(tm) II X6 1100T Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.31 GHz
  > ...
  > kernel: protection fault trap, code=0
  > Stopped at      k1x_init+0x56:  rdmsr
  > k1x_init(d0ad7540,d09ae620,d0b8ce58,d059ce20,30000002) at k1x_init+0x56

  k1x_init() is not related to vmt, it is from k1x-pstate.c, which
  is the cpu power state driver for K10 processors.

Thread on misc@openbsd.org with full OpenBSD dmesg:
http://marc.info/?l=openbsd-misc&m=132067866208188&w=2

Since both qemu-kvm 0.14.1 and 0.15.1 show identical
symptoms, I assume this is indeed a KVM kernel bug.

Can somebody reproduce this?

Please CC: me when replying, thanks.
I'll follow the kvm@vger archives, though.

Regards,
Walter


* Re: OpenBSD 5.0 kernel panic in AMD K10 cpu power state
  2011-11-08  9:25 OpenBSD 5.0 kernel panic in AMD K10 cpu power state Walter Haidinger
@ 2011-11-09 10:39 ` Avi Kivity
  2011-11-09 13:40   ` Avi Kivity
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2011-11-09 10:39 UTC (permalink / raw)
  To: Walter Haidinger; +Cc: kvm

On 11/08/2011 11:25 AM, Walter Haidinger wrote:
> Hi!
>
> OpenBSD 5.0/i386 throws a kernel panic when I try to
> boot it inside a Linux KVM (host: vanilla 3.0.4,
> openSUSE 11.4/x86_64) under qemu-kvm 0.14.1 and 0.15.1.
> Note that OpenBSD 4.9/i386 works.
>
> The OpenBSD developers say:
> "the virtual machine emulator you are using has a bug.  it declares
> a cpu type from upstream and then does not emulate certain functions
> of that cpu."
>
> Therefore I'm reporting this here.

Thanks.

> More from misc@openbsd.org:
>   > OpenBSD 5.0 (GENERIC) #43: Wed Aug 17 10:10:52 MDT 2011
>   >   deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
>   > cpu0: AMD Phenom(tm) II X6 1100T Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.31 GHz
>   > ...
>   > kernel: protection fault trap, code=0
>   > Stopped at      k1x_init+0x56:  rdmsr
>   > k1x_init(d0ad7540,d09ae620,d0b8ce58,d059ce20,30000002) at k1x_init+0x56
>
>   k1x_init() is not related to vmt, it is from k1x-pstate.c, which
>   is cpu power state driver for K10 processors. 
>
> Thread on misc@openbsd.org with full OpenBSD dmesg:
> http://marc.info/?l=openbsd-misc&m=132067866208188&w=2
>
> Since both qemu-kvm 0.14.1 and 0.15.1 show identical
> symptoms, I assume this is indeed a KVM kernel bug.

It doesn't actually follow, but happens to be correct.

> Can somebody reproduce this?

I'll try it out and see.

-- 
error compiling committee.c: too many arguments to function



* Re: OpenBSD 5.0 kernel panic in AMD K10 cpu power state
  2011-11-09 10:39 ` Avi Kivity
@ 2011-11-09 13:40   ` Avi Kivity
  2011-11-09 14:19     ` Walter Haidinger
       [not found]     ` <4EBAD609.4050307@gmx.at>
  0 siblings, 2 replies; 6+ messages in thread
From: Avi Kivity @ 2011-11-09 13:40 UTC (permalink / raw)
  To: Walter Haidinger; +Cc: kvm

On 11/09/2011 12:39 PM, Avi Kivity wrote:
> > More from misc@openbsd.org:
> >   > OpenBSD 5.0 (GENERIC) #43: Wed Aug 17 10:10:52 MDT 2011
> >   >   deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
> >   > cpu0: AMD Phenom(tm) II X6 1100T Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.31 GHz
> >   > ...
> >   > kernel: protection fault trap, code=0
> >   > Stopped at      k1x_init+0x56:  rdmsr
> >   > k1x_init(d0ad7540,d09ae620,d0b8ce58,d059ce20,30000002) at k1x_init+0x56
> >
> >   k1x_init() is not related to vmt, it is from k1x-pstate.c, which
> >   is cpu power state driver for K10 processors. 
> >
> > Thread on misc@openbsd.org with full OpenBSD dmesg:
> > http://marc.info/?l=openbsd-misc&m=132067866208188&w=2
> >
> > Since both qemu-kvm 0.14.1 and 0.15.1 show identical
> > symptoms, I assume this is indeed a KVM kernel bug.
>
> It doesn't actually follow, but happens to be correct.
>
> > Can somebody reproduce this?
>
> I'll try it out and see.
>

Actually, it looks like an OpenBSD bug.  According to the AMD documentation:

"The current P-state value can be read using the P-State Status
Register. The P-State Current Limit Register and the P-State Status
Register are read-only registers. Writes to these registers cause a
#GP exception. Support for hardware P-state control is indicated by
EDX bit 7 as returned by CPUID function 8000_0007h. Figure 18-1 shows
the format of the P-State Current Limit register."

Can you check what cpuid 80000007 returns by running 'x86info -r | grep
80000007' in a Linux guest with the same command line?  If EDX is
zero, then it's OpenBSD not checking cpuid correctly.
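
The check requested above can also be scripted. What follows is only a
minimal sketch: it assumes the line layout printed by x86info v1.25 with
-r ("eax in: 0x..., eax = ... ebx = ... ecx = ... edx = ..."), and the
function names are hypothetical helpers, not part of x86info or KVM. The
sample lines at the bottom are made up for illustration.

```python
import re

# One line of `x86info -r` output, e.g.
# "eax in: 0x80000007, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000"
# (layout assumed from x86info v1.25).
LINE_RE = re.compile(
    r"eax in: 0x([0-9a-f]{8}), eax = ([0-9a-f]{8}) ebx = ([0-9a-f]{8}) "
    r"ecx = ([0-9a-f]{8}) edx = ([0-9a-f]{8})"
)

# EDX bit 7 of CPUID function 8000_0007h, per the AMD text quoted above.
HW_PSTATE_BIT = 7


def parse_cpuid_line(line):
    """Return (leaf, eax, ebx, ecx, edx) as ints, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return tuple(int(group, 16) for group in match.groups())


def advertises_hw_pstate(line):
    """True only if the line is leaf 0x80000007 and EDX bit 7 is set."""
    parsed = parse_cpuid_line(line)
    if parsed is None or parsed[0] != 0x80000007:
        return False
    edx = parsed[4]
    return bool((edx >> HW_PSTATE_BIT) & 1)


if __name__ == "__main__":
    # Made-up sample lines: one with the bit clear, one with it set.
    clear = "eax in: 0x80000007, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000"
    is_set = "eax in: 0x80000007, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000080"
    print(advertises_hw_pstate(clear), advertises_hw_pstate(is_set))
```

Feeding it the grep output from the guest should be enough to tell
whether the guest CPUID advertises hardware P-state control.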


-- 
error compiling committee.c: too many arguments to function



* Re: OpenBSD 5.0 kernel panic in AMD K10 cpu power state
  2011-11-09 13:40   ` Avi Kivity
@ 2011-11-09 14:19     ` Walter Haidinger
       [not found]     ` <4EBAD609.4050307@gmx.at>
  1 sibling, 0 replies; 6+ messages in thread
From: Walter Haidinger @ 2011-11-09 14:19 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On 09.11.2011 14:40, Avi Kivity wrote:
> Actually, it looks like an OpenBSD bug.  According to the AMD documentation:
> 
> "The current P-state value can be read using the P-State Status 
> Register. The P-State Current Limit Register and the P-State Status
> Register are read-only registers. Writes to these registers cause a
> #GP exception. Support for hardware P-state control is indicated by
> EDX bit 7 as returned by CPUID function 8000_0007h. Figure 18-1 shows
> the format of the P-State Current Limit register."

I'll forward this to the openbsd mailing-list.

> Can you check what cpuid 80000007 returns by running 'x86info -r |
> grep 80000007' in a Linux guest with the same command line?  if edx
> returns zero, then it's OpenBSD not checking cpuid correctly.
 
EDX for 0x80000007 is zero. Checked in both i386 and x86_64 guests
running the grml Linux live CD (2011.05 with 2.6.38 kernel); full
x86info output appended below.

Walter

grml@grml ~ % x86info -r
x86info v1.25.  Dave Jones 2001-2009
Feedback to <davej@redhat.com>.

Found 1 CPU
--------------------------------------------------------------------------
EFamily: 1 EModel: 0 Family: 15 Model: 10 Stepping: 0
CPU Model: Unknown CPU
Processor name string: AMD Phenom(tm) II X6 1100T Processor
Monitor/Mwait: min/max line size 0/0, ecx bit 0 support, enumeration extension
SVM: revision 1, 16 ASIDs, np, NRIPSave
Address Size: 48 bits virtual, 40 bits physical
eax in: 0x00000000, eax = 00000006 ebx = 68747541 ecx = 444d4163 edx = 69746e65
eax in: 0x00000001, eax = 00100fa0 ebx = 00000800 ecx = 80802001 edx = 078bfbff
eax in: 0x00000002, eax = 00000001 ebx = 00000000 ecx = 00000000 edx = 002c307d
eax in: 0x00000003, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x00000004, eax = 00000121 ebx = 01c0003f ecx = 0000003f edx = 00000001
eax in: 0x00000005, eax = 00000000 ebx = 00000000 ecx = 00000003 edx = 00000000
eax in: 0x00000006, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000

eax in: 0x80000000, eax = 8000001b ebx = 68747541 ecx = 444d4163 edx = 69746e65
eax in: 0x80000001, eax = 00100fa0 ebx = 00000000 ecx = 000001f7 edx = 27d3fbff
eax in: 0x80000002, eax = 20444d41 ebx = 6e656850 ecx = 74286d6f edx = 4920296d
eax in: 0x80000003, eax = 36582049 ebx = 30313120 ecx = 50205430 edx = 65636f72
eax in: 0x80000004, eax = 726f7373 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000005, eax = 01ff01ff ebx = 01ff01ff ecx = 40020140 edx = 40020140
eax in: 0x80000006, eax = 00000000 ebx = 42004200 ecx = 02008140 edx = 00000000
eax in: 0x80000007, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000008, eax = 00003028 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000a, eax = 00000001 ebx = 00000010 ecx = 00000000 edx = 00000009
eax in: 0x8000000b, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000c, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000d, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000e, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000000f, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000010, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000011, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000012, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000013, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000014, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000015, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000016, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000017, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000018, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x80000019, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000001a, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
eax in: 0x8000001b, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
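
For readers who want to see how the raw registers above map to the
summary lines, here is a small sketch. It assumes the AMD convention
that the extended family and model fields of CPUID function 1 EAX only
apply when the base family is 0xF; the function names are hypothetical.
The register values are copied from the dump above.

```python
import struct


def decode_signature(eax):
    """Split a CPUID function 1 EAX value into family, model and stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # Assumed AMD rule: extended fields count only when the base family is 0xF.
    if base_family == 0xF:
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family, model = base_family, base_model
    return {"family": family, "model": model, "stepping": stepping}


def name_string(regs):
    """Join the EAX..EDX values of leaves 0x80000002-4 into the name string."""
    raw = b"".join(struct.pack("<I", reg) for reg in regs)
    return raw.split(b"\0", 1)[0].decode("ascii").strip()


# eax, ebx, ecx, edx of leaves 0x80000002, 0x80000003, 0x80000004 from the dump.
NAME_REGS = [
    0x20444D41, 0x6E656850, 0x74286D6F, 0x4920296D,
    0x36582049, 0x30313120, 0x50205430, 0x65636F72,
    0x726F7373, 0x00000000, 0x00000000, 0x00000000,
]

if __name__ == "__main__":
    sig = decode_signature(0x00100FA0)  # eax of leaf 0x00000001 above
    print(hex(sig["family"]), sig["model"], sig["stepping"])  # 0x10 10 0
    print(name_string(NAME_REGS))  # AMD Phenom(tm) II X6 1100T Processor
```

Family 15 plus extended family 1 gives family 0x10, which is why a
driver matching on "family 10h or higher" attaches in this guest even
though leaf 0x80000007 is all zeros.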




* Re: OpenBSD 5.0 kernel panic in AMD K10 cpu power state
       [not found]     ` <4EBAD609.4050307@gmx.at>
@ 2011-11-10  8:46       ` Avi Kivity
  2011-11-10 22:52         ` Andre Przywara
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2011-11-10  8:46 UTC (permalink / raw)
  To: Walter Haidinger; +Cc: KVM list

(re-adding cc)


On 11/09/2011 09:35 PM, Walter Haidinger wrote:
> On 09.11.2011 14:40, Avi Kivity wrote:
> > Actually, it looks like an OpenBSD bug.  According to the AMD 
> > documentation:
>
> Well, the OpenBSD developers are very confident that this is
> a bug in the KVM cpu emulation and _not_ in OpenBSD.
>
> Basically they say that [despite -cpu host], the emulated
> cpu looks not like a real cpu but like a _nonexistent_ one.
> Virtualization should look like _existing_ hardware.

That is true.  But OpenBSD is not following the vendor's recommendation
for how software should access the hardware.

> Since the list archive at 
> http://marc.info/?l=openbsd-misc&m=132077741910464&w=2
> lags a bit, I'm attaching some parts of the thread below:
>
> However, please remember it's OpenBSD, so the tone is, let's just
> say, rough.

Less than expected, actually.

> > The panic you hit is for an msr read, not a write. I'm aware those 
> > registers are read-only. The CPUID check isn't done, it matches on 
> > all family 10 and/or higher AMD processors. They're pretending to be
> >  an AMD K10 processor. On all real hardware I've tested this works 
> > fine. If you wish to be pedantic, patches are welcome.

So they're actually open to adding the cpuid check.

> They sent me a patch as a workaround, which:
>
> > The previous patch avoids touching the msr at all if ACPI indicates 
> > speed scaling is unavailable, this should prevent your panic.
>
> with -cpu host, OpenBSD dmesg showed the 1100T:
> >> cpu0: AMD Phenom(tm) II X6 1100T Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.31 GHz cpu0:
> >> FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT
> >> ...
> >> bios0: vendor Bochs version "Bochs" date 01/01/2007 bios0: Bochs
> >> Bochs
> > They shouldn't be pretending to be AMD, especially if that emulation
> > is very incompatible.
>
> but the bug is in the Linux KVM:
>
> >> They're pretending to be an AMD K10 processor.
> >> 
> > Exactly.  What they are doing is wrong. They are pretending to be a 
> > AMD K10 processor _badly_, and then they think they can say "oh, but 
> > you need to check all these other registers too". A machine with that
> > setup has never physically existed.
>
> Is this all because I used -cpu host? 
>

-cpu host is not to blame, you could get the same result from other
combinations of cpu model and family.

I'll look at adding support for this MSR; should be simple.  But in
general processor features need to be qualified by cpuid, not by model.
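
To make the distinction concrete, here is a rough sketch of the two
probe strategies. It is a model only, not OpenBSD or KVM code: the
function names are hypothetical, and "family >= 0x10" is an assumption
standing in for the "family 10 and/or higher" match described in the
quoted reply.

```python
AMD_VENDOR = "AuthenticAMD"
HW_PSTATE = 1 << 7  # CPUID function 8000_0007h, EDX bit 7


def attach_by_family(vendor, family):
    """Model-based probe: attach on any AMD CPU of family 10h or higher."""
    return vendor == AMD_VENDOR and family >= 0x10


def attach_by_cpuid(vendor, family, max_ext_leaf, edx_80000007):
    """Feature-based probe: also require the advertised CPUID bit."""
    if not attach_by_family(vendor, family):
        return False
    if max_ext_leaf < 0x80000007:
        # The leaf does not exist, so the feature cannot be advertised.
        return False
    return bool(edx_80000007 & HW_PSTATE)


if __name__ == "__main__":
    # Values reported for the guest in this thread: family 0x10,
    # maximum extended leaf 0x8000001b, leaf 0x80000007 EDX = 0.
    print(attach_by_family(AMD_VENDOR, 0x10))                    # True
    print(attach_by_cpuid(AMD_VENDOR, 0x10, 0x8000001B, 0x0))    # False
```

With the guest's values the family-based probe attaches and goes on to
read the MSR, while the cpuid-qualified probe declines.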

-- 
error compiling committee.c: too many arguments to function



* Re: OpenBSD 5.0 kernel panic in AMD K10 cpu power state
  2011-11-10  8:46       ` Avi Kivity
@ 2011-11-10 22:52         ` Andre Przywara
  0 siblings, 0 replies; 6+ messages in thread
From: Andre Przywara @ 2011-11-10 22:52 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Walter Haidinger, KVM list

On 11/10/2011 09:46 AM, Avi Kivity wrote:
> (re-adding cc)
>
>
> On 11/09/2011 09:35 PM, Walter Haidinger wrote:
>> On 09.11.2011 14:40, Avi Kivity wrote:
>>> Actually, it looks like an OpenBSD bug.  According to the AMD
>>> documentation:
>>
>> Well, the OpenBSD developers are very confident that this is
>> a bug in the KVM cpu emulation and _not_ in OpenBSD.
>>
>> Basically they say that [despite -cpu host], the emulated
>> cpu looks not like a real cpu but like a _nonexistent_ one.
>> Virtualization should look like _existing_ hardware.
>
> That is true.  But OpenBSD is not following the vendor's recommendation
> for how software should access the hardware.
>
>> Since the list archive at
>> http://marc.info/?l=openbsd-misc&m=132077741910464&w=2
>> lags a bit, I'm attaching some parts of the thread below:
>>
>> However, please remember it's OpenBSD, so the tone is, let's just
>> say, rough.
>
> Less than expected, actually.
>
>>> The panic you hit is for an msr read, not a write. I'm aware those
>>> registers are read-only. The CPUID check isn't done, it matches on
>>> all family 10 and/or higher AMD processors. They're pretending to be
>>>   an AMD K10 processor. On all real hardware I've tested this works
>>> fine. If you wish to be pedantic, patches are welcome.

Avi, thanks for taking care of that.

The manual is clear here: no CPUID bit, no MSRs. Besides that, the
emulated ACPI tables probably don't provide any info here either, right?
The fact that it runs "on all family 10 and/or higher AMD processors"
is just an empirical observation, not a law. You would be astonished what
can be fused off...

We had a similar discussion here with unconditional AMD Northbridge PCI 
accesses when detecting certain AMD CPU family/model/steppings in the 
Linux kernel already (...but every AMD CPU has a northbridge...)
We (as virtualization guys) should not step back so easily here, 
especially if the spec is so clear. That spec argument should actually 
appeal to the OpenBSD guys, too. I got the impression that their design 
is, well, actually well designed.

>
> So they're actually open to adding the cpuid check.
>
>> They sent me a patch as a workaround, which:
>>
>>> The previous patch avoids touching the msr at all if ACPI indicates
>>> speed scaling is unavailable, this should prevent your panic.
>>
>> with -cpu host, OpenBSD dmesg showed the 1100T:
>>>> cpu0: AMD Phenom(tm) II X6 1100T Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.31 GHz cpu0:
>>>> FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT
>>>> ...
>>>> bios0: vendor Bochs version "Bochs" date 01/01/2007 bios0: Bochs
>>>> Bochs
>>> They shouldn't be pretending to be AMD, especially if that emulation
>>> is very incompatible.
>>
>> but the bug is in the Linux KVM:
>>
>>>> They're pretending to be an AMD K10 processor.
>>>>
>>> Exactly.  What they are doing is wrong. They are pretending to be a
>>> AMD K10 processor _badly_, and then they think they can say "oh, but
>>> you need to check all these other registers too". A machine with that
>>> setup has never physically existed.
>>
>> Is this all because I used -cpu host?
>>
>
> -cpu host is not to blame, you could get the same result from other
> combinations of cpu model and family.
>
> I'll look at adding support for this MSR; should be simple.  But in
> general processor features need to be qualified by cpuid, not by model.

I guess emulating part of P-states will open up a can of worms. Besides
the generic MSRs (0xC001006[1-3]) there are actual family-specific ones
which are selected by the CPUID family. So you would end up emulating
them, too. I have a hard time thinking of a strategy for emulating
this in general. So unless there is a real framework for dealing with
P-state "hints" from the guest OS, I'd be reluctant to add quick and dirty
emulations.
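
For reference, the shorthand 0xC001006[1-3] expands to three MSR
addresses. The sketch below spells that out; the helper names are
hypothetical, and the register names in the table are an assumption
based on the AMD text quoted earlier in the thread (it names the Current
Limit and Status registers) and should be checked against the AMD manual
for the family in question.

```python
import re

# Assumed names for the generic P-state MSRs; verify against the AMD manual.
GENERIC_PSTATE_MSRS = {
    0xC0010061: "P-State Current Limit",
    0xC0010062: "P-State Control",
    0xC0010063: "P-State Status",
}


def expand_msr_shorthand(text):
    """Expand e.g. '0xC001006[1-3]' into a list of MSR addresses."""
    match = re.fullmatch(r"(0x[0-9A-Fa-f]+)\[([0-9A-Fa-f])-([0-9A-Fa-f])\]", text)
    if match is None:
        raise ValueError("unsupported shorthand: %r" % text)
    prefix = int(match.group(1), 16)
    first = int(match.group(2), 16)
    last = int(match.group(3), 16)
    # The bracketed part stands for the final hex digit of the address.
    return [prefix * 16 + digit for digit in range(first, last + 1)]


def is_generic_pstate_msr(address):
    """True if the address is one of the three generic P-state MSRs."""
    return address in GENERIC_PSTATE_MSRS


if __name__ == "__main__":
    for address in expand_msr_shorthand("0xC001006[1-3]"):
        print(hex(address), GENERIC_PSTATE_MSRS[address])
```

The family-specific MSRs mentioned above are deliberately left out,
since the thread does not list them.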

Thanks,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany


