* Solaris 10 doesn't work under KVM
@ 2007-01-28 14:40 Waba
  2007-01-28 17:38 ` Michael Riepe
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Waba @ 2007-01-28 14:40 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hello,

I tried to install Solaris 10 under KVM, without success, so Avi told me
to post the details here. My CPU is an AMD X2, and I am using KVM trunk
on a 2.6.19.2 kernel, started with the following command line:

  kvm -hda solaris.qcow -cdrom sol-10-u3-ga-x86-dvd.iso -boot d -m 512 -no-acpi

(I tried with and without -no-acpi).

If I follow the normal installation path, I soon get messages like:

  svc.startd[7]: svc:/milestone/single-user:default: Method "/sbin/rcS start" failed due to signal ILL

and get dropped to a shell where even a simple "ls" produces a SIGILL. I
don't know Solaris (yet :), but I suspect that it happens when it tries
to start Java for svcs. If I choose to spawn a shell rather than
installing, I can run commands without problem.

Disabling KVM (-no-kvm) lets the installation proceed, but it is so slow
that I cancelled it before the end.

Now, how to reproduce my setup:
- Browse http://www.sun.com/software/solaris/get.jsp
- Click "Download Solaris 10"
- If asked for a user/pass, use bugmenot0/bugmenot (or make your own
  account)
- Select the medium at the top of the page. You can reproduce this bug
  using only the first CD (308MB), rather than downloading it all.
- Unzip the ISO somewhere
- Create a temporary disk image and boot.

Please let me know if you need any additional information. Thank you
very much in advance,
-Waba.

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-01-28 14:40 Solaris 10 doesn't work under KVM Waba
@ 2007-01-28 17:38 ` Michael Riepe
       [not found]   ` <45BCDF8C.1000508-0QoEqw4nQxo@public.gmane.org>
  2007-01-28 19:27 ` Avi Kivity
  2007-01-29 11:49 ` Avi Kivity
  2 siblings, 1 reply; 39+ messages in thread
From: Michael Riepe @ 2007-01-28 17:38 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hi!

Waba wrote:

> I tried to install Solaris 10 under KVM, without success, so Avi told me
> to post the details here. My CPU is an AMD X2, and I am using KVM trunk
> on a 2.6.19.2, started with the following command-line:
> 
>   kvm -hda solaris.qcow -cdrom sol-10-u3-ga-x86-dvd.iso -boot d -m 512 -no-acpi
> 
> (I tried with and without -no-acpi).

-no-acpi should not be necessary for Solaris/x86.

> If I follow the normal installation path, I soon get messages like:
> 
>   svc.startd[7]: svc:/milestone/single-user:default: Method "/sbin/rcS start" failed due to signal ILL

I've seen SIGILLs on AMD (Opteron 8218) as well. It worked on an Intel
CPU, however.

Is your host running in 32 or 64 bit mode? I used 64 bit mode (host OS
was SLES 10).

-- 
Michael "Tired" Riepe <michael-0QoEqw4nQxo@public.gmane.org>
X-Tired: Each morning I get up I die a little


* Re: Solaris 10 doesn't work under KVM
       [not found]   ` <45BCDF8C.1000508-0QoEqw4nQxo@public.gmane.org>
@ 2007-01-28 18:28     ` Waba
  0 siblings, 0 replies; 39+ messages in thread
From: Waba @ 2007-01-28 18:28 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Sun, Jan 28, 2007 at 06:38:20PM +0100, Michael Riepe wrote:
> -no-acpi should not be necessary for Solaris/x86.
Ok.

> I've seen SIGILLs on AMD (Opteron 8218) as well. It worked on an Intel
> CPU, however.
> 
> Is your host running in 32 or 64 bit mode? I used 64 bit mode (host OS
> was SLES 10).
I'm in 64 bit mode too (host OS: Debian sid amd64).

-Waba.


* Re: Solaris 10 doesn't work under KVM
  2007-01-28 14:40 Solaris 10 doesn't work under KVM Waba
  2007-01-28 17:38 ` Michael Riepe
@ 2007-01-28 19:27 ` Avi Kivity
       [not found]   ` <45BCF91E.2030704-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-01-29 11:49 ` Avi Kivity
  2 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-01-28 19:27 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> Now, how to reproduce my setup:
> - Browse http://www.sun.com/software/solaris/get.jsp
> - Click "Download Solaris 10"
> - If asked for a user/pass, use bugmenot0/bugmenot (or make your own
>   account)
> - Select the medium at the top of the page. You can reproduce this bug
>   using only the first CD (308MB), rather than downloading it all.
> - Unzip the ISO somewhere
> - Create a temporary disk image and boot.
>
> Please let me know if you need any additional information. Thank you
> much in advance,
>   

The qemu BIOS complains that it can't boot from the CDROM 
(sol-10-u3-ga-x86-v1.iso).  This happens with and without kvm.  Do I 
need some other image?

'file' says:
/data/images/solaris/sol-10-u3-ga-x86-v1.iso: ISO 9660 CD-ROM filesystem 
data 'SOL_10_1106_X86                ' (bootable)

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: Solaris 10 doesn't work under KVM
       [not found]   ` <45BCF91E.2030704-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-01-28 20:25     ` Anthony Liguori
  2007-01-28 22:23     ` Waba
  1 sibling, 0 replies; 39+ messages in thread
From: Anthony Liguori @ 2007-01-28 20:25 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel

Avi Kivity wrote:
> Waba wrote:
>> Now, how to reproduce my setup:
>> - Browse http://www.sun.com/software/solaris/get.jsp
>> - Click "Download Solaris 10"
>> - If asked for a user/pass, use bugmenot0/bugmenot (or make your own
>>   account)
>> - Select the medium at the top of the page. You can reproduce this bug
>>   using only the first CD (308MB), rather than downloading it all.
>> - Unzip the ISO somewhere
>> - Create a temporary disk image and boot.
>>
>> Please let me know if you need any additional information. Thank you
>> much in advance,
>>   
> 
> The qemu BIOS complains that it can't boot from the CDROM 
> (sol-10-u3-ga-x86-v1.iso).  This happens with and without kvm.  Do I 
> need some other image?

It boots fine with QEMU CVS.

Regards,

Anthony Liguori


> 'file' says:
> /data/images/solaris/sol-10-u3-ga-x86-v1.iso: ISO 9660 CD-ROM filesystem 
> data 'SOL_10_1106_X86                ' (bootable)
> 



* Re: Solaris 10 doesn't work under KVM
       [not found]   ` <45BCF91E.2030704-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-01-28 20:25     ` Anthony Liguori
@ 2007-01-28 22:23     ` Waba
  2007-01-29  8:28       ` Avi Kivity
  1 sibling, 1 reply; 39+ messages in thread
From: Waba @ 2007-01-28 22:23 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Sun, Jan 28, 2007 at 09:27:26PM +0200, Avi Kivity wrote:
> The qemu BIOS complains that it can't boot from the CDROM 
> (sol-10-u3-ga-x86-v1.iso).  This happens with and without kvm.  Do I 
> need some other image?
> 
> 'file' says:
> /data/images/solaris/sol-10-u3-ga-x86-v1.iso: ISO 9660 CD-ROM filesystem 
> data 'SOL_10_1106_X86                ' (bootable)

That's strange. That looks like the same file that I got here, and it
works for me (past the GRUB screen at least). Just in case, here is the
MD5 sum of my version:

  c6f82c36195609b8dad67f61a927c7bc  sol-10-u3-ga-x86-v1.iso

-Waba.


* Re: Solaris 10 doesn't work under KVM
  2007-01-28 22:23     ` Waba
@ 2007-01-29  8:28       ` Avi Kivity
       [not found]         ` <45BDB03B.8000305-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-01-29  8:28 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> On Sun, Jan 28, 2007 at 09:27:26PM +0200, Avi Kivity wrote:
>   
>> The qemu BIOS complains that it can't boot from the CDROM 
>> (sol-10-u3-ga-x86-v1.iso).  This happens with and without kvm.  Do I 
>> need some other image?
>>
>> 'file' says:
>> /data/images/solaris/sol-10-u3-ga-x86-v1.iso: ISO 9660 CD-ROM filesystem 
>> data 'SOL_10_1106_X86                ' (bootable)
>>     
>
> That's strange. That looks like the same file that I got here, and it
> works for me (past the GRUB screen at least). Just in case, here is the
> MD5 sum of my version:
>
>   c6f82c36195609b8dad67f61a927c7bc  sol-10-u3-ga-x86-v1.iso
>
>   

Mine's different: I ran out of disk space.

I'll re-download and try again (I killed the .zip).


-- 
error compiling committee.c: too many arguments to function



* Re: Solaris 10 doesn't work under KVM
       [not found]         ` <45BDB03B.8000305-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-01-29  9:40           ` Avi Kivity
  0 siblings, 0 replies; 39+ messages in thread
From: Avi Kivity @ 2007-01-29  9:40 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Avi Kivity wrote:
> Waba wrote:
>> On Sun, Jan 28, 2007 at 09:27:26PM +0200, Avi Kivity wrote:
>>  
>>> The qemu BIOS complains that it can't boot from the CDROM 
>>> (sol-10-u3-ga-x86-v1.iso).  This happens with and without kvm.  Do I 
>>> need some other image?
>>>
>>> 'file' says:
>>> /data/images/solaris/sol-10-u3-ga-x86-v1.iso: ISO 9660 CD-ROM 
>>> filesystem data 'SOL_10_1106_X86                ' (bootable)
>>>     
>>
>> That's strange. That looks like the same file that I got here, and it
>> works for me (past the GRUB screen at least). Just in case, here is the
>> MD5 sum of my version:
>>
>>   c6f82c36195609b8dad67f61a927c7bc  sol-10-u3-ga-x86-v1.iso
>>
>>   
>
> Mine's different: I ran out of disk space.
>
> I'll re-download and try again (I killed the .zip).
>
>

Well, it's a tough one.  The SIGILL is not injected by kvm, instead the 
guest gets confused.  Probably an mmu bug.


-- 
error compiling committee.c: too many arguments to function



* Re: Solaris 10 doesn't work under KVM
  2007-01-28 14:40 Solaris 10 doesn't work under KVM Waba
  2007-01-28 17:38 ` Michael Riepe
  2007-01-28 19:27 ` Avi Kivity
@ 2007-01-29 11:49 ` Avi Kivity
       [not found]   ` <45BDDF32.3010607-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-01-29 11:49 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

[-- Attachment #1: Type: text/plain, Size: 643 bytes --]

Waba wrote:
> Hello,
>
> I tried to install Solaris 10 under KVM, without success, so Avi told me
> to post the details here. My CPU is an AMD X2, and I am using KVM trunk
> on a 2.6.19.2, started with the following command-line:
>
>   kvm -hda solaris.qcow -cdrom sol-10-u3-ga-x86-dvd.iso -boot d -m 512 -no-acpi
>
> (I tried with and without -no-acpi).
>
> If I follow the normal installation path, I soon get messages like:
>
>   svc.startd[7]: svc:/milestone/single-user:default: Method "/sbin/rcS start" failed due to signal ILL
>
>   

The attached patch should fix it.
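
For readers following along, the one-line change in the attached patch
can be illustrated outside the kernel. The sketch below is not KVM code:
'effective_cr0' and 'guest_sees_wp' are hypothetical names, and only the
two mask values are architectural (on x86, CR0.WP is bit 16 and CR0.PG is
bit 31). It assumes, as the patch suggests, that the value loaded into
the VMCB always has both bits set, while the copy recording what the
guest asked for is stored unmodified.

```c
#include <stdint.h>

/* Architectural x86 CR0 bit positions. */
#define CR0_WP_MASK (UINT64_C(1) << 16) /* write protect */
#define CR0_PG_MASK (UINT64_C(1) << 31) /* paging */

/*
 * Hypothetical helper, not a KVM function: the CR0 value the hardware
 * would run with, given the value the guest requested. Paging and
 * write protection are forced on regardless of the guest's choice.
 */
uint64_t effective_cr0(uint64_t guest_cr0)
{
    return guest_cr0 | CR0_PG_MASK | CR0_WP_MASK;
}

/*
 * Hypothetical helper: whether the guest's own (unmodified) CR0 copy
 * has WP set. This copy is independent of the forced hardware value.
 */
int guest_sees_wp(uint64_t guest_cr0)
{
    return (guest_cr0 & CR0_WP_MASK) != 0;
}
```

For example, a guest value of 0x00000010 would give 0x80010010 for the
hardware, while the guest's own copy still has WP clear.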


-- 
error compiling committee.c: too many arguments to function


[-- Attachment #2: kvm-svm-cr0-wp.patch --]
[-- Type: text/x-patch, Size: 324 bytes --]

Index: svm.c
===================================================================
--- svm.c	(revision 4344)
+++ svm.c	(working copy)
@@ -727,7 +727,7 @@
 	}
 #endif
 	vcpu->svm->cr0 = cr0;
-	vcpu->svm->vmcb->save.cr0 = cr0 | CR0_PG_MASK;
+	vcpu->svm->vmcb->save.cr0 = cr0 | CR0_PG_MASK | CR0_WP_MASK;
 	vcpu->cr0 = cr0;
 }
 


[-- Attachment #4: Type: text/plain, Size: 186 bytes --]

_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel


* Re: Solaris 10 doesn't work under KVM
       [not found]   ` <45BDDF32.3010607-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-01 21:49     ` Waba
  2007-02-02 19:19       ` Joerg Roedel
  0 siblings, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-01 21:49 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Mon, Jan 29, 2007 at 01:49:06PM +0200, Avi Kivity wrote:
> The attached patch should fix it.

Update for those who have not been following IRC in the meantime: this
patch does fix the bug on Opteron-based systems, but doesn't improve
anything on my X2 4600+. Avi therefore suggests that someone (_joro?)
with the knowledge and access to an X2 take a look at it.

Thanks in advance,
-Waba.

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642


* Re: Solaris 10 doesn't work under KVM
  2007-02-01 21:49     ` Waba
@ 2007-02-02 19:19       ` Joerg Roedel
       [not found]         ` <20070202191942.GB8804-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Joerg Roedel @ 2007-02-02 19:19 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Thu, Feb 01, 2007 at 10:49:24PM +0100, Waba wrote:
> On Mon, Jan 29, 2007 at 01:49:06PM +0200, Avi Kivity wrote:
> > The attached patch should fix it.
> 
> Update for those who wouldn't have followed IRC meanwhile: this patch
> does fix the bug for Opteron-based systems, but doesn't improve anything
> on my X2 4600+. Avi therefore suggests that someone (_joro?) with
> knowledge and access to an X2 has a look at it.

I was able to reproduce the bug on an SVM machine here and dug a little
deeper. I intercepted the #UD exception and printed out the opcode. The
opcode was all zero the first time and then changed randomly to other
undefined values. I also saved the last exit code before the #UD
intercept, and that was a #PF intercept. The guest is in 32-bit PAE mode
when this happens.
Based on this, I assume the bug is not SVM related; I think something
goes wrong in the MMU.

Joerg

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




* Re: Solaris 10 doesn't work under KVM
       [not found]         ` <20070202191942.GB8804-5C7GfCeVMHo@public.gmane.org>
@ 2007-02-04  9:50           ` Avi Kivity
  2007-02-04 18:31           ` Waba
  1 sibling, 0 replies; 39+ messages in thread
From: Avi Kivity @ 2007-02-04  9:50 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

[-- Attachment #1: Type: text/plain, Size: 1357 bytes --]

Joerg Roedel wrote:
> On Thu, Feb 01, 2007 at 10:49:24PM +0100, Waba wrote:
>   
>> On Mon, Jan 29, 2007 at 01:49:06PM +0200, Avi Kivity wrote:
>>     
>>> The attached patch should fix it.
>>>       
>> Update for those who wouldn't have followed IRC meanwhile: this patch
>> does fix the bug for Opteron-based systems, but doesn't improve anything
>> on my X2 4600+. Avi therefore suggests that someone (_joro?) with
>> knowledge and access to an X2 has a look at it.
>>     
>
> I was able to reproduce the bug on a SVM machine here and did some
> deeper research. I intercepted the #UD exception and printed out the
> opcode. This opcode was all zero in the first time and changed randomly
> to other undefined values. I also saved the last exit code before the UD
> intercept and that was a PF intercept. The guest is in 32 bit PAE mode
> when this happens.
> Regarding this research I assume this bug is not SVM related, I think
> something in the MMU goes wrong here.
>
>   

kvm-trunk has a fix for this which is both mmu and svm related, see 
revision 4348.  It seems to fix the exact same problem on opterons but 
not on the athlons.

Waba, can you apply the attached patch and post dmesg after the error 
occurs?  (it also has a small fix which may help).
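
The check_cr0_wp() helper in the attached patch logs only when the WP
bit of the VMCB's CR0 differs from the value seen on the previous call.
A user-space sketch of the same logic follows. The 'wp_tracker' struct
and 'report_wp_change' function are hypothetical stand-ins (the patch
itself uses a static variable and printk), and the return value is added
here only so that the behaviour can be checked.

```c
#include <stdint.h>
#include <stdio.h>

#define CR0_WP_MASK (UINT64_C(1) << 16) /* architectural CR0.WP bit */

/* Hypothetical replacement for the patch's static 'last_cr0_wp'. */
struct wp_tracker {
    int last_wp; /* -1 means nothing has been observed yet */
};

/*
 * Print a line only when the WP bit differs from the previous
 * observation. Returns 1 if a line was printed, 0 otherwise.
 */
int report_wp_change(struct wp_tracker *t, uint64_t cr0, const char *where)
{
    int wp = !!(cr0 & CR0_WP_MASK);
    int changed = (wp != t->last_wp);

    if (changed)
        printf("cr0_wp: %d (%s)\n", wp, where);
    t->last_wp = wp;
    return changed;
}
```

Under this logic, a single 'cr0_wp: 1 (before)' line in the log would
mean that the bit was set at the first check and was never observed to
change afterwards.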


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


[-- Attachment #2: kvm-cr0-wp-test.patch --]
[-- Type: text/x-patch, Size: 1248 bytes --]

Index: svm.c
===================================================================
--- svm.c	(revision 4382)
+++ svm.c	(working copy)
@@ -553,7 +553,7 @@
 	 * cr0 val on cpu init should be 0x60000010, we enable cpu
 	 * cache by default. the orderly way is to enable cache in bios.
 	 */
-	save->cr0 = 0x00000010 | CR0_PG_MASK;
+	save->cr0 = 0x00000010 | CR0_PG_MASK | CR0_WP_MASK;
 	save->cr4 = CR4_PAE_MASK;
 	/* rdx = ?? */
 }
@@ -1430,6 +1430,17 @@
 	asm volatile ("mov %0, %%dr3" : : "r"(db_regs[3]));
 }
 
+static void check_cr0_wp(struct kvm_vcpu *vcpu, const char *where)
+{
+	static int last_cr0_wp = -1;
+	int cr0_wp;
+
+	cr0_wp = !!(vcpu->svm->vmcb->save.cr0 & CR0_WP_MASK);
+	if (cr0_wp != last_cr0_wp)
+		printk("cr0_wp: %d (%s)\n", cr0_wp, where);
+	last_cr0_wp = cr0_wp;
+}
+
 static int svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	u16 fs_selector;
@@ -1463,6 +1474,8 @@
 	fx_save(vcpu->host_fx_image);
 	fx_restore(vcpu->guest_fx_image);
 
+	check_cr0_wp(vcpu, "before");
+
 	asm volatile (
 #ifdef CONFIG_X86_64
 		"push %%rbx; push %%rcx; push %%rdx;"
@@ -1572,6 +1585,8 @@
 #endif
 		: "cc", "memory" );
 
+	check_cr0_wp(vcpu, "after");
+
 	fx_save(vcpu->guest_fx_image);
 	fx_restore(vcpu->host_fx_image);
 


* Re: Solaris 10 doesn't work under KVM
       [not found]         ` <20070202191942.GB8804-5C7GfCeVMHo@public.gmane.org>
  2007-02-04  9:50           ` Avi Kivity
@ 2007-02-04 18:31           ` Waba
  2007-02-07  9:42             ` Avi Kivity
  1 sibling, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-04 18:31 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

> Waba, can you apply the attached patch and post dmesg after the error 
> occurs?  (it also has a small fix which may help).

No luck with the fix, it still SIGILLs :( Here is the dmesg:

[ 4800.373717] cr0_wp: 1 (before)
[ 4808.442199] kvm: emulating exchange as write

Thanks a lot for your help,
-Waba.


* Re: Solaris 10 doesn't work under KVM
  2007-02-04 18:31           ` Waba
@ 2007-02-07  9:42             ` Avi Kivity
       [not found]               ` <45C99EE9.3010306-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-07  9:42 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
>> Waba, can you apply the attached patch and post dmesg after the error 
>> occurs?  (it also has a small fix which may help).
>>     
>
> No luck with the fix, it still SIGILLs :( Here is the dmesg:
>
> [ 4800.373717] cr0_wp: 1 (before)
> [ 4808.442199] kvm: emulating exchange as write
>
>   

OK.  Please keep the patch applied and, in addition:

- change '#undef AUDIT' to '#define AUDIT' in mmu.c
- in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
- 'echo 9 > /proc/sysrq-trigger'

and run again.  Note that kvm will be very slow with this.  Watch dmesg 
for errors from kvm.
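
The first two steps above rely on a common C pattern: a preprocessor
symbol decides whether checking code is compiled in at all, and a
separate variable controls how chatty the debug output is. The sketch
below is generic and does not reproduce mmu.c. Only the names AUDIT and
dbg come from the steps above; 'audit', 'debug_note' and the two
counters are hypothetical, and how dbg and AUDIT interact in the real
file is an assumption.

```c
#include <stdio.h>

#define AUDIT /* change to '#undef AUDIT' to compile the checks out */

static int dbg = 0; /* 0 keeps verbose debug messages quiet */

int audit_runs = 0;  /* hypothetical counter: audit passes executed */
int debug_lines = 0; /* hypothetical counter: debug lines printed */

/* Print a debug line only when dbg is non-zero. */
void debug_note(const char *msg)
{
    if (dbg) {
        printf("debug: %s\n", msg);
        debug_lines++;
    }
}

#ifdef AUDIT
/* With AUDIT defined, every call performs (here: counts) a check. */
void audit(const char *msg)
{
    audit_runs++;
    debug_note(msg);
}
#else
/* With AUDIT undefined, the call is an empty stub. */
void audit(const char *msg)
{
    (void)msg;
}
#endif
```

A switch like this costs nothing when it is off, which fits the warning
that things become very slow once the checks are compiled in.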




-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: Solaris 10 doesn't work under KVM
       [not found]               ` <45C99EE9.3010306-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-07 23:04                 ` Waba
  2007-02-08  9:27                   ` Avi Kivity
  0 siblings, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-07 23:04 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
> ok.  please keep the patch applied, and an addition:
> 
> - change '#undef AUDIT' to '#define AUDIT' in mmu.c
> - in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
> - 'echo 9 > /proc/sysrq-trigger'
> 
> and run again.  Note that kvm will be very slow with this.  Watch dmesg 
> for errors from kvm.

When you say 'very slow', you mean it :) I couldn't reproduce the bug in
the usual fashion because the involved processes timed out and were
killed by the service manager.

However, simply invoking "svccfg" at the prompt seems to bring in Java or
whatever heavy stuff is causing our problem, and I got a nice SIGILL
this way.

Sadly, nothing interesting appeared in the log:

[ 8544.402301] Loglevel set to 9
[ 8548.697286] cr0_wp: 1 (before)
[ 9868.532303] kvm: emulating exchange as write
[16785.771040] SysRq : Changing Loglevel
[16785.771049] Loglevel set to 0


Please tell me if there is anything else I can try out.
-Waba.


* Re: Solaris 10 doesn't work under KVM
  2007-02-07 23:04                 ` Waba
@ 2007-02-08  9:27                   ` Avi Kivity
       [not found]                     ` <45CAECEB.4000701-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-08  9:27 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
>   
>> ok.  please keep the patch applied, and an addition:
>>
>> - change '#undef AUDIT' to '#define AUDIT' in mmu.c
>> - in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
>> - 'echo 9 > /proc/sysrq-trigger'
>>
>> and run again.  Note that kvm will be very slow with this.  Watch dmesg 
>> for errors from kvm.
>>     
>
> When you say 'very slow', you mean it :) I couldn't reproduce the bug in
> the usual fashion because the involved processes timed out and were
> killed by the service manager.
>
> However, simply invoking "svccfg" at the prompt seems to bring in Java or
> whatever heavy stuff is causing our problem, and I got a nice SIGILL
> this way.
>
> Sadly, nothing interesting appeared in the log:
>
> [ 8544.402301] Loglevel set to 9
> [ 8548.697286] cr0_wp: 1 (before)
> [ 9868.532303] kvm: emulating exchange as write
> [16785.771040] SysRq : Changing Loglevel
> [16785.771049] Loglevel set to 0
>
>
> Please tell me if there is anything else I can try out.
>   

Can you try to isolate the process which fails?

e.g. install with qemu, start in kvm in single user mode, and use truss 
or something similar.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: Solaris 10 doesn't work under KVM
       [not found]                     ` <45CAECEB.4000701-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-08  9:58                       ` Joerg Roedel
       [not found]                         ` <20070208095816.GA5204-5C7GfCeVMHo@public.gmane.org>
  2007-02-10 13:34                       ` Waba
  1 sibling, 1 reply; 39+ messages in thread
From: Joerg Roedel @ 2007-02-08  9:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

On Thu, Feb 08, 2007 at 11:27:07AM +0200, Avi Kivity wrote:
> Waba wrote:
> > On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
> >   
> >> ok.  please keep the patch applied, and an addition:
> >>
> >> - change '#undef AUDIT' to '#define AUDIT' in mmu.c
> >> - in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
> >> - 'echo 9 > /proc/sysrq-trigger'
> >>
> >> and run again.  Note that kvm will be very slow with this.  Watch dmesg 
> >> for errors from kvm.
> >>     
> >
> > When you say 'very slow', you mean it :) I couldn't reproduce the bug in
> > the usual fashion because the involved processes timed out and were
> > killed by the service manager.
> >
> > However, simply invoking "svccfg" at the prompt seems to bring in Java or
> > whatever heavy stuff is causing our problem, and I got a nice SIGILL
> > this way.
> >
> > Sadly, nothing interesting appeared in the log:
> >
> > [ 8544.402301] Loglevel set to 9
> > [ 8548.697286] cr0_wp: 1 (before)
> > [ 9868.532303] kvm: emulating exchange as write
> > [16785.771040] SysRq : Changing Loglevel
> > [16785.771049] Loglevel set to 0
> >
> >
> > Please tell me if there is anything else I can try out.
> >   
> 
> Can you try to isolate the process which fails?
> 
> e.g. install with qemu, start in kvm in single user mode, and use truss 
> or something similar.

When I tried it, I got a SIGILL whenever I invoked a new program. The
first SIGILL is in the starting process, which is then killed. Afterwards
I get a shell. Every simple command there (ls, for example) then caused
a SIGILL.

Joerg

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




* Re: Solaris 10 doesn't work under KVM
       [not found]                         ` <20070208095816.GA5204-5C7GfCeVMHo@public.gmane.org>
@ 2007-02-08 10:04                           ` Avi Kivity
       [not found]                             ` <45CAF5C6.8020104-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-08 10:04 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

Joerg Roedel wrote:
> On Thu, Feb 08, 2007 at 11:27:07AM +0200, Avi Kivity wrote:
>   
>> Waba wrote:
>>     
>>> On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
>>>   
>>>       
>>>> ok.  please keep the patch applied, and an addition:
>>>>
>>>> - change '#undef AUDIT' to '#define AUDIT' in mmu.c
>>>> - in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
>>>> - 'echo 9 > /proc/sysrq-trigger'
>>>>
>>>> and run again.  Note that kvm will be very slow with this.  Watch dmesg 
>>>> for errors from kvm.
>>>>     
>>>>         
>>> When you say 'very slow', you mean it :) I couldn't reproduce the bug in
>>> the usual fashion because the involved processes timed out and were
>>> killed by the service manager.
>>>
>>> However, simply invoking "svccfg" at the prompt seems to bring in Java or
>>> whatever heavy stuff is causing our problem, and I got a nice SIGILL
>>> this way.
>>>
>>> Sadly, nothing interesting appeared in the log:
>>>
>>> [ 8544.402301] Loglevel set to 9
>>> [ 8548.697286] cr0_wp: 1 (before)
>>> [ 9868.532303] kvm: emulating exchange as write
>>> [16785.771040] SysRq : Changing Loglevel
>>> [16785.771049] Loglevel set to 0
>>>
>>>
>>> Please tell me if there is anything else I can try out.
>>>   
>>>       
>> Can you try to isolate the process which fails?
>>
>> e.g. install with qemu, start in kvm in single user mode, and use truss 
>> or something similar.
>>     
>
> When I tried it I got a SIGILL whenever I invoked a new program. The
> first SIGILL is in the starting process which is killed then. Afterwards
> I get a shell. Every simple command there (like ls for example) caused a
> SIGILL then.
>
>   

Was this with the cr0.wp patch applied (commit 4348)?



-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                             ` <45CAF5C6.8020104-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-08 10:19                               ` Joerg Roedel
       [not found]                                 ` <20070208101945.GB5204-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Joerg Roedel @ 2007-02-08 10:19 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

On Thu, Feb 08, 2007 at 12:04:54PM +0200, Avi Kivity wrote:
> Joerg Roedel wrote:
> >On Thu, Feb 08, 2007 at 11:27:07AM +0200, Avi Kivity wrote:
> >  
> >>Waba wrote:
> >>    
> >>>On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
> >>>        
> >>>>ok.  please keep the patch applied, and an addition:
> >>>>
> >>>>- change '#undef AUDIT' to '#define AUDIT' in mmu.c
> >>>>- in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
> >>>>- 'echo 9 > /proc/sysrq-trigger'
> >>>>
> >>>>and run again.  Note that kvm will be very slow with this.  Watch dmesg for errors from 
> >>>>kvm.
> >>>>            
> >>>When you say 'very slow', you mean it :) I couldn't reproduce the bug in
> >>>the usual fashion because the involved processes timed out and were
> >>>killed by the service manager.
> >>>
> >>>However, simply invoking "svccfg" at the prompt seems to bring in Java or
> >>>whatever heavy stuff is causing our problem, and I got a nice SIGILL
> >>>this way.
> >>>
> >>>Sadly, nothing interesting appeared in the log:
> >>>
> >>>[ 8544.402301] Loglevel set to 9
> >>>[ 8548.697286] cr0_wp: 1 (before)
> >>>[ 9868.532303] kvm: emulating exchange as write
> >>>[16785.771040] SysRq : Changing Loglevel
> >>>[16785.771049] Loglevel set to 0
> >>>
> >>>
> >>>Please tell me if there is anything else I can try out.
> >>>        
> >>Can you try to isolate the process which fails?
> >>
> >>e.g. install with qemu, start in kvm in single user mode, and use truss or something 
> >>similar.
> >>    
> >
> > >When I tried it I got a SIGILL whenever I invoked a new program. The
> >first SIGILL is in the starting process which is killed then. Afterwards
> >I get a shell. Every simple command there (like ls for example) caused a
> >SIGILL then.
> >
> >  
> 
> Was this with the cr0.wp patch applied (commit 4348)?

Yes. It was with kvm-current from last friday. The patch was already in
there.


-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                 ` <20070208101945.GB5204-5C7GfCeVMHo@public.gmane.org>
@ 2007-02-08 10:39                                   ` Avi Kivity
       [not found]                                     ` <45CAFDF6.4020402-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-08 10:39 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

Joerg Roedel wrote:
> On Thu, Feb 08, 2007 at 12:04:54PM +0200, Avi Kivity wrote:
>   
>> Joerg Roedel wrote:
>>     
>>> On Thu, Feb 08, 2007 at 11:27:07AM +0200, Avi Kivity wrote:
>>>  
>>>       
>>>> Waba wrote:
>>>>    
>>>>         
>>>>> On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
>>>>>        
>>>>>           
>>>>>> ok.  please keep the patch applied, and an addition:
>>>>>>
>>>>>> - change '#undef AUDIT' to '#define AUDIT' in mmu.c
>>>>>> - in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
>>>>>> - 'echo 9 > /proc/sysrq-trigger'
>>>>>>
>>>>>> and run again.  Note that kvm will be very slow with this.  Watch dmesg for errors from 
>>>>>> kvm.
>>>>>>            
>>>>>>             
>>>>> When you say 'very slow', you mean it :) I couldn't reproduce the bug in
>>>>> the usual fashion because the involved processes timed out and were
>>>>> killed by the service manager.
>>>>>
>>>>> However, simply invoking "svccfg" at the prompt seems to bring in Java or
>>>>> whatever heavy stuff is causing our problem, and I got a nice SIGILL
>>>>> this way.
>>>>>
>>>>> Sadly, nothing interesting appeared in the log:
>>>>>
>>>>> [ 8544.402301] Loglevel set to 9
>>>>> [ 8548.697286] cr0_wp: 1 (before)
>>>>> [ 9868.532303] kvm: emulating exchange as write
>>>>> [16785.771040] SysRq : Changing Loglevel
>>>>> [16785.771049] Loglevel set to 0
>>>>>
>>>>>
>>>>> Please tell me if there is anything else I can try out.
>>>>>        
>>>>>           
>>>> Can you try to isolate the process which fails?
>>>>
>>>> e.g. install with qemu, start in kvm in single user mode, and use truss or something 
>>>> similar.
>>>>    
>>>>         
>>> When I tried it I got a SIGILL whenever I invoked a new program. The
>>> first SIGILL is in the starting process which is killed then. Afterwards
>>> I get a shell. Every simple command there (like ls for example) caused a
>>> SIGILL then.
>>>
>>>  
>>>       
>> Was this with the cr0.wp patch applied (commit 4348)?
>>     
>
> Yes. It was with kvm-current from last friday. The patch was already in
> there.
>   

Can this be sysenter support? i.e. is sysenter supported on opteron but 
not on athlon x2?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                     ` <45CAFDF6.4020402-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-08 11:00                                       ` Cyril Plisko
       [not found]                                         ` <c7dddeaa0702080300i31eb933fjfcdb4570f82b0a79-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Cyril Plisko @ 2007-02-08 11:00 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

On 2/8/07, Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> Joerg Roedel wrote:
> > On Thu, Feb 08, 2007 at 12:04:54PM +0200, Avi Kivity wrote:
> >
> >> Joerg Roedel wrote:
> >>
> >>> On Thu, Feb 08, 2007 at 11:27:07AM +0200, Avi Kivity wrote:
> >>>
> >>>
> >>>> Waba wrote:
> >>>>
> >>>>
> >>>>> On Wed, Feb 07, 2007 at 11:42:01AM +0200, Avi Kivity wrote:
> >>>>>
> >>>>>
> >>>>>> ok.  please keep the patch applied, and an addition:
> >>>>>>
> >>>>>> - change '#undef AUDIT' to '#define AUDIT' in mmu.c
> >>>>>> - in the same file, change 'static int dbg = 1;' to 'static int dbg = 0;'
> >>>>>> - 'echo 9 > /proc/sysrq-trigger'
> >>>>>>
> >>>>>> and run again.  Note that kvm will be very slow with this.  Watch dmesg for errors from
> >>>>>> kvm.
> >>>>>>
> >>>>>>
> >>>>> When you say 'very slow', you mean it :) I couldn't reproduce the bug in
> >>>>> the usual fashion because the involved processes timed out and were
> >>>>> killed by the service manager.
> >>>>>
> >>>>> However, simply invoking "svccfg" at the prompt seems to bring in Java or
> >>>>> whatever heavy stuff is causing our problem, and I got a nice SIGILL
> >>>>> this way.
> >>>>>
> >>>>> Sadly, nothing interesting appeared in the log:
> >>>>>
> >>>>> [ 8544.402301] Loglevel set to 9
> >>>>> [ 8548.697286] cr0_wp: 1 (before)
> >>>>> [ 9868.532303] kvm: emulating exchange as write
> >>>>> [16785.771040] SysRq : Changing Loglevel
> >>>>> [16785.771049] Loglevel set to 0
> >>>>>
> >>>>>
> >>>>> Please tell me if there is anything else I can try out.
> >>>>>
> >>>>>
> >>>> Can you try to isolate the process which fails?
> >>>>
> >>>> e.g. install with qemu, start in kvm in single user mode, and use truss or something
> >>>> similar.
> >>>>
> >>>>
> >>> When I tried it I got a SIGILL whenever I invoked a new program. The
> >>> first SIGILL is in the starting process which is killed then. Afterwards
> >>> I get a shell. Every simple command there (like ls for example) caused a
> >>> SIGILL then.
> >>>
> >>>
> >>>
> >> Was this with the cr0.wp patch applied (commit 4348)?
> >>
> >
> > Yes. It was with kvm-current from last friday. The patch was already in
> > there.
> >
>
> Can this be sysenter support? i.e. is sysenter supported on opteron but
> not on athlon x2?


AFAIK, on both Opteron and Athlon X2, Solaris uses syscall rather than
sysenter. The catch here is that it starts with the least optimized libc
at boot, and at some point SMF uses moe(1) to find the most suitable
version of libc and lofs-mounts it on top of the original. From this
point on, an AMD machine will use syscall.

-- 
Regards,
        Cyril


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                         ` <c7dddeaa0702080300i31eb933fjfcdb4570f82b0a79-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-02-08 12:21                                           ` Avi Kivity
       [not found]                                             ` <45CB15E3.7010803-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-08 12:21 UTC (permalink / raw)
  To: Cyril Plisko; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

Cyril Plisko wrote:
>>
>> Can this be sysenter support? i.e. is sysenter supported on opteron but
>> not on athlon x2?
>
>
> AFAIK, on both Opteron and Athlon X2 Solaris uses syscall, rather than
> sysenter.
> The catch here is that it starts with the least optimized libc at
> boot, and at some
> point SMF uses moe(1) to find out most suitable version of libc and
> lofs-mount it
> on top of the original. From this point on AMD machine will use syscall.
>

Is this true on 32-bit solaris as well?

While the host is 64-bit, the guest is 32 in this case IIRC.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                             ` <45CB15E3.7010803-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-08 13:45                                               ` Cyril Plisko
  2007-02-08 14:45                                               ` Joerg Roedel
  1 sibling, 0 replies; 39+ messages in thread
From: Cyril Plisko @ 2007-02-08 13:45 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

On 2/8/07, Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> Cyril Plisko wrote:
> >>
> >> Can this be sysenter support? i.e. is sysenter supported on opteron but
> >> not on athlon x2?
> >
> >
> > AFAIK, on both Opteron and Athlon X2 Solaris uses syscall, rather than
> > sysenter.
> > The catch here is that it starts with the least optimized libc at
> > boot, and at some
> > point SMF uses moe(1) to find out most suitable version of libc and
> > lofs-mount it
> > on top of the original. From this point on AMD machine will use syscall.
> >
>
> Is this true on 32-bit solaris as well?
>

It is true _only_ for 32-bit. 64-bit userland doesn't have to check these
things, as syscall is available by virtue of being 64-bit.


> While the host is 64-bit, the guest is 32 in this case IIRC.
>
> --
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
>
>


-- 
Regards,
        Cyril


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                             ` <45CB15E3.7010803-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-02-08 13:45                                               ` Cyril Plisko
@ 2007-02-08 14:45                                               ` Joerg Roedel
       [not found]                                                 ` <20070208144530.GC5204-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 39+ messages in thread
From: Joerg Roedel @ 2007-02-08 14:45 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

On Thu, Feb 08, 2007 at 02:21:55PM +0200, Avi Kivity wrote:
> Cyril Plisko wrote:
> >>
> >>Can this be sysenter support? i.e. is sysenter supported on opteron but
> >>not on athlon x2?
> >
> >
> > >AFAIK, on both Opteron and Athlon X2 Solaris uses syscall, rather than sysenter.
> >The catch here is that it starts with the least optimized libc at
> >boot, and at some
> >point SMF uses moe(1) to find out most suitable version of libc and
> >lofs-mount it
> >on top of the original. From this point on AMD machine will use syscall.
> >
> 
> Is this true on 32-bit solaris as well?
> 
> While the host is 64-bit, the guest is 32 in this case IIRC.

Yeah, the guest is running in 32bit PAE when the SIGILLs happen.

Joerg

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                                 ` <20070208144530.GC5204-5C7GfCeVMHo@public.gmane.org>
@ 2007-02-08 14:58                                                   ` Cyril Plisko
  0 siblings, 0 replies; 39+ messages in thread
From: Cyril Plisko @ 2007-02-08 14:58 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Waba

On 2/8/07, Joerg Roedel <joerg.roedel-5C7GfCeVMHo@public.gmane.org> wrote:
> On Thu, Feb 08, 2007 at 02:21:55PM +0200, Avi Kivity wrote:
> > Cyril Plisko wrote:
> > >>
> > >>Can this be sysenter support? i.e. is sysenter supported on opteron but
> > >>not on athlon x2?
> > >
> > >
> > > >AFAIK, on both Opteron and Athlon X2 Solaris uses syscall, rather than sysenter.
> > >The catch here is that it starts with the least optimized libc at
> > >boot, and at some
> > >point SMF uses moe(1) to find out most suitable version of libc and
> > >lofs-mount it
> > >on top of the original. From this point on AMD machine will use syscall.
> > >
> >
> > Is this true on 32-bit solaris as well?
> >
> > While the host is 64-bit, the guest is 32 in this case IIRC.
>
> Yeah, the guest is running in 32bit PAE when the SIGILLs happen.

Joerg,

you may want to boot Solaris with the "-m verbose" flag to see which
of the SMF services fails.

>
> Joerg
>
> --
> Joerg Roedel
> Operating System Research Center
> AMD Saxony LLC & Co. KG
>
>
>


-- 
Regards,
        Cyril


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                     ` <45CAECEB.4000701-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-02-08  9:58                       ` Joerg Roedel
@ 2007-02-10 13:34                       ` Waba
  2007-02-11  9:14                         ` Avi Kivity
                                           ` (2 more replies)
  1 sibling, 3 replies; 39+ messages in thread
From: Waba @ 2007-02-10 13:34 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

It took me a while, but I figured it out... nearly!

Everything SIGILLs after the fs-root service is started. Its start
method does several things, but the problematic bit is replacing the
libc with an optimised version (namely, /usr/lib/libc/libc_hwcap1.so.1,
which makes use of the SSE, MMX, CMOV, SEP and FPU instruction sets
according to file(1)).

All these flags are indeed advertised in the CPUID (isainfo -v: sse2 sse
fxsr mmx cmov sep cx8 tsc fpu). If the amd_sysc bit had been present,
I guess moe(1) would have selected the hwcap2 version instead (it adds
SSE2 support and replaces SEP with AMD_SYSC).
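
The selection described above can be sketched in C as follows. This is
only an illustration: the CPUID bit positions are the architectural ones,
but pick_libc_variant() and its policy are hypothetical, guessed from the
capability lists above rather than taken from moe(1) or the Solaris
sources, and amd_sysc is assumed to correspond to the SYSCALL bit of
extended leaf 0x80000001.

```c
#include <stdint.h>

/* CPUID leaf 1, EDX feature bits (architectural positions). */
#define CPUID_FPU   (1u << 0)
#define CPUID_TSC   (1u << 4)
#define CPUID_CX8   (1u << 8)
#define CPUID_SEP   (1u << 11)   /* SYSENTER/SYSEXIT */
#define CPUID_CMOV  (1u << 15)
#define CPUID_MMX   (1u << 23)
#define CPUID_FXSR  (1u << 24)
#define CPUID_SSE   (1u << 25)
#define CPUID_SSE2  (1u << 26)

/* CPUID leaf 0x80000001, EDX: SYSCALL/SYSRET (assumed to be "amd_sysc"). */
#define CPUID_EXT_SYSCALL (1u << 11)

/*
 * Hypothetical policy, not the real moe(1) logic:
 *   2 -> libc_hwcap2 (base features + SSE2 + SYSCALL)
 *   1 -> libc_hwcap1 (base features + SEP)
 *   0 -> generic libc
 */
int pick_libc_variant(uint32_t edx_std, uint32_t edx_ext)
{
    const uint32_t base = CPUID_FPU | CPUID_CMOV | CPUID_MMX | CPUID_SSE;

    if ((edx_std & base) != base)
        return 0;
    if ((edx_std & CPUID_SSE2) && (edx_ext & CPUID_EXT_SYSCALL))
        return 2;
    if (edx_std & CPUID_SEP)
        return 1;
    return 0;
}
```

With the guest flags listed above and no SYSCALL bit, this sketch returns
1 (hwcap1); setting the SYSCALL bit makes it return 2.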

Disabling the libc replacement in /lib/svc/method/fs-root entirely
works around the problem.

Investigating further, I tricked ls(1) into using the optimised libc
through LD_LIBRARY_PATH and obtained a core. mdb(1) told me that the
culprit was hiding at libc`memset+0x74. And finally, dis(1) revealed
that the faulting instruction is "movups (%esp), %xmm0", an SSE
instruction. The %xmm0 register is apparently used for storage only, as
the only instructions that access it are movups, movntps and movaps.

At this point I hope that it makes a lot of sense to you, because I
have no idea why it works fine on Avi's Opteron, etc.

Let me know if you need any additional debugging.
-Waba.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-10 13:34                       ` Waba
@ 2007-02-11  9:14                         ` Avi Kivity
       [not found]                           ` <45CEDE92.4090204-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-02-12  9:48                         ` Avi Kivity
  2007-02-12 17:58                         ` Joerg Roedel
  2 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-11  9:14 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> It took me a while, but I figured it out... nearly!
>
> Everything SIGILLs after the fs-root service is started. Its start
> method does several things, but the problematic bit is replacing the
> libc with an optimised version (namely, /usr/lib/libc/libc_hwcap1.so.1,
> which makes use of the SSE, MMX, CMOV, SEP and FPU instruction sets
> according to file(1)).
>
> All these flags are indeed advertised in the CPUID (isainfo -v: sse2 sse
> fxsr mmx cmov sep cx8 tsc fpu)). If the amd_sysc bit had been present,
> the hwcap2 version would have been selected by moe(1), I guess (adds
> SSE2 support and replaces SEP by AMD_SYSC).
>   

The guest's cpuid is 100% faked by qemu.

> Disabling the libc replacement in /lib/svc/method/fs-root entirely
> works around the problem.
>
> Further investigating, I tricked ls(1) into using the optimised libc
> through LD_LIBRARY_PATH and obtained a core. mdb(1) told me that the
> culprit was hiding at libc`memset+0x74. And finally, dis(1) revealed
> that the faulty instruction is "movups (%esp), %xmm0", a SSE feature.
> The %xmm0 register is apparently for storage purposes only, as the only
> instructions used to access it are movups, movntps and movaps.
>
> At this point I hope that it makes a lot of sense to you, because I
> have no idea why it works fine on Avi's Opteron, etc.
>
> Let me know if you need any additional debugging.

Can you post the host's /proc/cpuinfo? I'll compare it with my opteron.


Anyway, good debug job.

-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                           ` <45CEDE92.4090204-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-11 10:43                             ` Waba
  2007-02-11 10:58                               ` Avi Kivity
  0 siblings, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-11 10:43 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Sun, Feb 11, 2007 at 11:14:58AM +0200, Avi Kivity wrote:
> Can you post the host's /proc/cpuinfo? I'll compare it with my opteron.

processor	: 0-1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 75
model name	: AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
stepping	: 2
cpu MHz		: 2400.000
cache size	: 512 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm
cr8_legacy
bogomips	: 4822.56
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-11 10:43                             ` Waba
@ 2007-02-11 10:58                               ` Avi Kivity
  0 siblings, 0 replies; 39+ messages in thread
From: Avi Kivity @ 2007-02-11 10:58 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> On Sun, Feb 11, 2007 at 11:14:58AM +0200, Avi Kivity wrote:
>   
>> Can you post the host's /proc/cpuinfo? I'll compare it with my opteron.
>>     
>
> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
> fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm
> cr8_legacy

Well, the flags are identical.

-- 
error compiling committee.c: too many arguments to function



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-10 13:34                       ` Waba
  2007-02-11  9:14                         ` Avi Kivity
@ 2007-02-12  9:48                         ` Avi Kivity
       [not found]                           ` <45D03801.4040006-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-02-12 17:58                         ` Joerg Roedel
  2 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-12  9:48 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

[-- Attachment #1: Type: text/plain, Size: 1555 bytes --]

Waba wrote:
> It took me a while, but I figured it out... nearly!
>
> Everything SIGILLs after the fs-root service is started. Its start
> method does several things, but the problematic bit is replacing the
> libc with an optimised version (namely, /usr/lib/libc/libc_hwcap1.so.1,
> which makes use of the SSE, MMX, CMOV, SEP and FPU instruction sets
> according to file(1)).
>
> All these flags are indeed advertised in the CPUID (isainfo -v: sse2 sse
> fxsr mmx cmov sep cx8 tsc fpu)). If the amd_sysc bit had been present,
> the hwcap2 version would have been selected by moe(1), I guess (adds
> SSE2 support and replaces SEP by AMD_SYSC).
>
> Disabling the libc replacement in /lib/svc/method/fs-root entirely
> works around the problem.
>
> Further investigating, I tricked ls(1) into using the optimised libc
> through LD_LIBRARY_PATH and obtained a core. mdb(1) told me that the
> culprit was hiding at libc`memset+0x74. And finally, dis(1) revealed
> that the faulty instruction is "movups (%esp), %xmm0", a SSE feature.
> The %xmm0 register is apparently for storage purposes only, as the only
> instructions used to access it are movups, movntps and movaps.
>
> At this point I hope that it makes a lot of sense to you, because I
> have no idea why it works fine on Avi's Opteron, etc.
>
> Let me know if you need any additional debugging.
>   

Let's look at the control registers at the time of the SIGILL.  Can you 
reproduce the error with the attached patch and send dmesg?
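
For reading the cr0/cr4 values that such a dump prints, a minimal sketch
may help. It encodes the architectural rule that an SSE instruction such
as movups raises #UD when CR0.EM is set or CR4.OSFXSR is clear, and #NM
when CR0.TS is set. The function and enum names are made up for
illustration, and the sketch assumes the CPU itself supports SSE; it does
not claim that this is the cause of the failure discussed here.

```c
#include <stdint.h>

#define X86_CR0_EM      (1ull << 2)   /* x87 emulation */
#define X86_CR0_TS      (1ull << 3)   /* task switched */
#define X86_CR4_OSFXSR  (1ull << 9)   /* OS enabled FXSAVE/FXRSTOR and SSE */

/* Hypothetical names, used only in this sketch. */
enum sse_outcome { SSE_OK, SSE_UD, SSE_NM };

/*
 * Predict what an SSE instruction does for the given control register
 * values, assuming CPUID reports SSE support.
 */
enum sse_outcome sse_outcome_for(uint64_t cr0, uint64_t cr4)
{
    if ((cr0 & X86_CR0_EM) || !(cr4 & X86_CR4_OSFXSR))
        return SSE_UD;      /* invalid opcode, seen by the guest as SIGILL */
    if (cr0 & X86_CR0_TS)
        return SSE_NM;      /* device not available, used for lazy FPU switching */
    return SSE_OK;
}
```

The patch prints both the value kvm tracks and the value in the VMCB save
area, so feeding each pair to such a helper would show whether the two
views disagree about SSE being enabled.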

-- 
error compiling committee.c: too many arguments to function


[-- Attachment #2: ud-print-cr0-cr4.patch --]
[-- Type: text/x-patch, Size: 1233 bytes --]

Index: svm.c
===================================================================
--- svm.c	(revision 4412)
+++ svm.c	(working copy)
@@ -481,7 +481,7 @@
 					INTERCEPT_DR5_MASK |
 					INTERCEPT_DR7_MASK;
 
-	control->intercept_exceptions = 1 << PF_VECTOR;
+	control->intercept_exceptions = (1 << PF_VECTOR) | (1 << UD_VECTOR);
 
 
 	control->intercept = 	(1ULL << INTERCEPT_INTR) |
@@ -1247,6 +1247,15 @@
 	return 1;
 }
 
+static int ud_interception(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+	printk(KERN_ERR "#ud: cr0 %lx (%llx) cr4 %lx (%llx)\n",
+	       vcpu->cr0, vcpu->svm->vmcb->save.cr0,
+	       vcpu->cr4, vcpu->svm->vmcb->save.cr4);
+	run->exit_reason = KVM_EXIT_SHUTDOWN;
+	return 0;
+}
+
 static int (*svm_exit_handlers[])(struct kvm_vcpu *vcpu,
 				      struct kvm_run *kvm_run) = {
 	[SVM_EXIT_READ_CR0]           		= emulate_on_interception,
@@ -1267,6 +1276,7 @@
 	[SVM_EXIT_WRITE_DR5]			= emulate_on_interception,
 	[SVM_EXIT_WRITE_DR7]			= emulate_on_interception,
 	[SVM_EXIT_EXCP_BASE + PF_VECTOR] 	= pf_interception,
+	[SVM_EXIT_EXCP_BASE + UD_VECTOR] 	= ud_interception,
 	[SVM_EXIT_INTR] 			= nop_on_interception,
 	[SVM_EXIT_NMI]				= nop_on_interception,
 	[SVM_EXIT_SMI]				= nop_on_interception,

[-- Attachment #4: Type: text/plain, Size: 186 bytes --]

_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                           ` <45D03801.4040006-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-12 14:57                             ` Gregory Haskins
       [not found]                               ` <45D03A33.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
  2007-02-13 14:06                             ` Waba
  1 sibling, 1 reply; 39+ messages in thread
From: Gregory Haskins @ 2007-02-12 14:57 UTC (permalink / raw)
  To: Avi Kivity, Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

>>> On Mon, Feb 12, 2007 at  4:48 AM, in message <45D03801.4040006-atKUWr5tajBWk0Htik3J/w@public.gmane.org>,
Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote: 
> Waba wrote:
>> It took me a while, but I figured it out... nearly!
>>
>> Everything SIGILLs after the fs-root service is started. Its start
>> method does several things, but the problematic bit is replacing the
>> libc with an optimised version (namely, /usr/lib/libc/libc_hwcap1.so.1,
>> which makes use of the SSE, MMX, CMOV, SEP and FPU instruction sets
>> according to file(1)).
>>
>> All these flags are indeed advertised in the CPUID (isainfo - v: sse2 sse
>> fxsr mmx cmov sep cx8 tsc fpu)). If the amd_sysc bit had been present,
>> the hwcap2 version would have been selected by moe(1), I guess (adds
>> SSE2 support and replaces SEP by AMD_SYSC).
>>
>> Disabling the libc replacement in /lib/svc/method/fs- root entirely
>> workarounds the problem.
>>
>> Further investigating, I tricked ls(1) into using the optimised libc
>> through LD_LIBRARY_PATH and obtained a core. mdb(1) told me that the
>> culprit was hiding at libc`memset+0x74. And finally, dis(1) revealed
>> that the faulty instruction is "movups (%esp), %xmm0", a SSE feature.
>> The %xmm0 register is apparently for storage purposes only, as the only
>> instructions used to access it are movups, movntps and movaps.
>>
>> At this point I hope that it makes a lot of sense to you, because I
>> have no idea why it works fine on Avi's Opteron, etc.
>>
>> Let me know if you need any additional debugging.
>>   
> 
> Let's look at the control registers at the time of the SIGILL.  Can you 
> reproduce the error with the attached patch and send dmesg?


Hi Avi,
  I have a sneaking suspicion that this may have the same root cause as my findings with #UD on SLES.  I wrote a program that takes MD5 sums of the pages of a running program's text sections so that they can be compared.  I then compared the output for GRUB running on bare metal and as a KVM guest, and they were identical (except for the expected text that is affected by relocation).  This was not what I was expecting, since we had speculated MMU corruption.  Admittedly the test is not conclusive, since the page mappings could well be different under the load of the target app's execution versus the delta program.  But I was hoping for a smoking gun ;)

Note that I am seeing #UD under other apps as well (Firefox for instance).  If there were a disparity between the advertised and actual CPUID flags, and SLES is using libraries that interpret the flags, that could explain the behavior here.  Note that grub is blowing up in libc for me as well.  I will explore a CPUID disparity as a possibility next and report back.  What I did notice is that KVM seems to report the CPU as an AMD, even though I am running on a Woodcrest.  I would speculate that the problem is that some AMD-specific flag (e.g. amd_sysc) is set when it should not be.
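
One way to check the vendor part of such a disparity is to compare the CPUID vendor string seen inside the guest with the one seen on the bare-metal host. The following is only a minimal sketch under stated assumptions: it is not taken from KVM or QEMU, the helper name cpuid_vendor_string is hypothetical, and it only shows how the 12-byte vendor string is assembled from the EBX, EDX and ECX values that CPUID leaf 0 returns (in that order, least significant byte first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of KVM or QEMU): build the NUL-terminated
 * vendor string from the register values returned by CPUID leaf 0.
 * The string is stored in EBX, EDX, ECX, in that order, with the least
 * significant byte of each register first. */
void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                         char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    size_t i;

    assert(out != NULL);
    for (i = 0; i < 12; i++)
        out[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xff);
    out[12] = '\0';
}
```

With GCC or Clang on x86, the three register values could be obtained with __get_cpuid(0, &eax, &ebx, &ecx, &edx) from <cpuid.h>. An Intel host yields "GenuineIntel" and an AMD host "AuthenticAMD", so a guest that reports the AMD string on a Woodcrest host is seeing a vendor string that does not come from the physical CPU.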

Note that I am currently being pulled off KVM work for about a week so I will be silent for a bit.

-Greg

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                               ` <45D03A33.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
@ 2007-02-12 17:35                                 ` Dor Laor
  0 siblings, 0 replies; 39+ messages in thread
From: Dor Laor @ 2007-02-12 17:35 UTC (permalink / raw)
  To: Gregory Haskins, Avi Kivity, Waba
  Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

>Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
>> Waba wrote:
>>> It took me a while, but I figured it out... nearly!
>>>
>>> Everything SIGILLs after the fs-root service is started. Its start
>>> method does several things, but the problematic bit is replacing the
>>> libc with an optimised version (namely, /usr/lib/libc/libc_hwcap1.so.1,
>>> which makes use of the SSE, MMX, CMOV, SEP and FPU instruction sets
>>> according to file(1)).
>>>
>>> All these flags are indeed advertised in the CPUID (isainfo -v: sse2 sse
>>> fxsr mmx cmov sep cx8 tsc fpu). If the amd_sysc bit had been present,
>>> the hwcap2 version would have been selected by moe(1), I guess (adds
>>> SSE2 support and replaces SEP by AMD_SYSC).
>>>
>>> Disabling the libc replacement in /lib/svc/method/fs-root entirely
>>> workarounds the problem.
>>>
>>> Further investigating, I tricked ls(1) into using the optimised libc
>>> through LD_LIBRARY_PATH and obtained a core. mdb(1) told me that the
>>> culprit was hiding at libc`memset+0x74. And finally, dis(1) revealed
>>> that the faulty instruction is "movups (%esp), %xmm0", a SSE feature.
>>> The %xmm0 register is apparently for storage purposes only, as the only
>>> instructions used to access it are movups, movntps and movaps.
>>>
>>> At this point I hope that it makes a lot of sense to you, because I
>>> have no idea why it works fine on Avi's Opteron, etc.
>>>
>>> Let me know if you need any additional debugging.
>>>
>>
>> Let's look at the control registers at the time of the SIGILL.  Can you
>> reproduce the error with the attached patch and send dmesg?
>
>
>Hi Avi,
>  I have a sneaking suspicion that this may be the same root-cause of my
>findings with #UD on SLES.  I wrote a program that allows you to take MD5
>sum pages of a running program's text sections and compare them.  I then
>compared the output of GRUB running on bare-metal and as a KVM guest and
>they were identical (except for the expected text that is affected by
>relocation).  This was not what I was expecting since we speculated MMU
>corruption.  Admittedly the test is not conclusive since the page mappings
>could surely be different under the load of the target apps execution
>verses the delta program.  But I was hoping for a smoking gun ;)
>
>Note that I am seeing #UD under other apps as well (Firefox for instance).
>If there were a disparity between the advertised and actual CPUID flags and
>SLES is using libraries that interpret the flags, that could explain the
>behavior here.  Note that grub is blowing up in libc for me as well.  I
>will explore a CPUID disparity as a possibility next and report back.
>What I did notice is that KVM seems to report the CPU as an AMD, even
>though I am running on a Woodcrest.  I would speculate that the problem is
>that some AMD specific flag (e.g. amd_sysc) is set when it should not be.
>
>Note that I am currently being pulled off KVM work for about a week so I
>will be silent for a bit.

Just for your blessed testing, you can change qemu to pass through the
cpuid instruction. I've done it in the past and it worked just fine (on
an AMD host, though).

Good luck.
>-Greg
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-10 13:34                       ` Waba
  2007-02-11  9:14                         ` Avi Kivity
  2007-02-12  9:48                         ` Avi Kivity
@ 2007-02-12 17:58                         ` Joerg Roedel
  2 siblings, 0 replies; 39+ messages in thread
From: Joerg Roedel @ 2007-02-12 17:58 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Sat, Feb 10, 2007 at 02:34:43PM +0100, Waba wrote:
> It took me a while, but I figured it out... nearly!

Great. We are getting closer to the real problem.

> Further investigating, I tricked ls(1) into using the optimised libc
> through LD_LIBRARY_PATH and obtained a core. mdb(1) told me that the
> culprit was hiding at libc`memset+0x74. And finally, dis(1) revealed
> that the faulty instruction is "movups (%esp), %xmm0", a SSE feature.
> The %xmm0 register is apparently for storage purposes only, as the only
> instructions used to access it are movups, movntps and movaps.

This differs a bit from my investigations. In SVM I always got the #UD at
the same RIP (which is unlikely if it is triggered in user mode). I
assume the error comes from the lazy FPU switching code inside the
kernel, triggered by the SSE instruction. It is odd that this #UD in the
kernel results in a SIGILL being delivered to the userspace process, but
maybe Solaris does such things.
Is it possible that the kernel and userspace on Solaris make different
assumptions about the CPU capabilities?

> At this point I hope that it makes a lot of sense to you, because I
> have no idea why it works fine on Avi's Opteron, etc.

Yes, that's another open question...

Joerg

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                           ` <45D03801.4040006-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  2007-02-12 14:57                             ` Gregory Haskins
@ 2007-02-13 14:06                             ` Waba
  2007-02-13 14:37                               ` Avi Kivity
  1 sibling, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-13 14:06 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Mon, Feb 12, 2007 at 11:48:49AM +0200, Avi Kivity wrote:
> Let's look at the control registers at the time of the SIGILL.  Can you 
> reproduce the error with the attached patch and send dmesg?

#ud: cr0 8005002b (8005003b) cr4 b8 (b8)

Qemu also aborted with "unhandled vm exit: 08" or similar, but I guess
that the important part is the printk.

-Waba.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-13 14:06                             ` Waba
@ 2007-02-13 14:37                               ` Avi Kivity
       [not found]                                 ` <45D1CD1F.907-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-13 14:37 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

[-- Attachment #1: Type: text/plain, Size: 639 bytes --]

Waba wrote:
> On Mon, Feb 12, 2007 at 11:48:49AM +0200, Avi Kivity wrote:
>   
>> Let's look at the control registers at the time of the SIGILL.  Can you 
>> reproduce the error with the attached patch and send dmesg?
>>     
>
> #ud: cr0 8005002b (8005003b) cr4 b8 (b8)
>
> Qemu also aborted with "unhandled vm exit: 08" or similar, but I guess
> that the important part is the printk.
>
>   
Right.

bit 9 of cr4 (osfxsr) is clear, which according to the docs generates 
#ud on any sse instruction.

can you try the attached test patch (can be on top of the last patch)?
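
To make the reading of the logged value explicit: 0xb8 has bits 3, 4, 5 and 7 set and bit 9 (0x200) clear. The following is a minimal user-space sketch, not KVM code; the helper names cr4_osfxsr_set and cr4_describe are hypothetical, and the bit names are the architectural names of CR4 bits 0 to 10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CR4_OSFXSR (1UL << 9)   /* 0x200, the bit discussed above */

/* Architectural names of CR4 bits 0..10. */
static const char *const cr4_bit_names[] = {
    "VME", "PVI", "TSD", "DE", "PSE", "PAE",
    "MCE", "PGE", "PCE", "OSFXSR", "OSXMMEXCPT",
};

/* Hypothetical helper: nonzero when CR4.OSFXSR is set. When it is clear,
 * SSE instructions such as movups raise #UD. */
int cr4_osfxsr_set(unsigned long cr4)
{
    return (cr4 & CR4_OSFXSR) != 0;
}

/* Hypothetical helper: write the names of the set bits (0..10) into buf,
 * separated by spaces, and return the length of the resulting string. */
size_t cr4_describe(unsigned long cr4, char *buf, size_t len)
{
    const size_t nbits = sizeof(cr4_bit_names) / sizeof(cr4_bit_names[0]);
    size_t used = 0;
    size_t bit;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (bit = 0; bit < nbits; bit++) {
        int n;

        if (!(cr4 & (1UL << bit)))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", cr4_bit_names[bit]);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* drop the partial name and stop */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

For the value in the printk, cr4_describe(0xb8, ...) produces "DE PSE PAE PGE" and cr4_osfxsr_set(0xb8) returns 0, which matches the observation that osfxsr is clear.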

-- 
error compiling committee.c: too many arguments to function


[-- Attachment #2: force-osfxsr.patch --]
[-- Type: text/x-patch, Size: 657 bytes --]

Index: svm.c
===================================================================
--- svm.c	(revision 4418)
+++ svm.c	(working copy)
@@ -555,7 +555,7 @@
 	 * cache by default. the orderly way is to enable cache in bios.
 	 */
 	save->cr0 = 0x00000010 | CR0_PG_MASK | CR0_WP_MASK;
-	save->cr4 = CR4_PAE_MASK;
+	save->cr4 = CR4_PAE_MASK | 0x200;
 	/* rdx = ?? */
 }
 
@@ -741,7 +741,7 @@
 static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
        vcpu->cr4 = cr4;
-       vcpu->svm->vmcb->save.cr4 = cr4 | CR4_PAE_MASK;
+       vcpu->svm->vmcb->save.cr4 = cr4 | CR4_PAE_MASK | 0x200;
 }
 
 static void svm_set_segment(struct kvm_vcpu *vcpu,

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                 ` <45D1CD1F.907-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-13 21:44                                   ` Waba
  2007-02-14 13:20                                     ` Avi Kivity
  0 siblings, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-13 21:44 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Tue, Feb 13, 2007 at 04:37:19PM +0200, Avi Kivity wrote:
> bit 9 of cr4 (osfxsr) is clear, which according to the docs generates 
> #ud on any sse instruction.

I still have no idea why this bit was not set when running on my CPU,
but with the register set up this way, no more SIGILL. I re-enabled the
libc mount and rebooted the guest to be sure, and it does work!

Good catch!
-Waba.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-13 21:44                                   ` Waba
@ 2007-02-14 13:20                                     ` Avi Kivity
       [not found]                                       ` <45D30C92.8000808-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 39+ messages in thread
From: Avi Kivity @ 2007-02-14 13:20 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> On Tue, Feb 13, 2007 at 04:37:19PM +0200, Avi Kivity wrote:
>   
>> bit 9 of cr4 (osfxsr) is clear, which according to the docs generates 
>> #ud on any sse instruction.
>>     
>
> I still have no idea why this bit was not set when running on my CPU,
> but with the register set up this way, no more SIGILL. I re-enabled the
> libc mount and rebooted the guest to be sure, and it does work!
>
>   

Well, there's probably an emulator bug somewhere.

Can you add a printk() to set_cr4() in kvm_main.c and see what the guest 
does?  The documentation states that it's up to the OS to enable the 
bit, so I can't just apply the previous patch, even though it fixes the 
problem.


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
       [not found]                                       ` <45D30C92.8000808-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-16 12:14                                         ` Waba
  2007-02-18  9:44                                           ` Avi Kivity
  0 siblings, 1 reply; 39+ messages in thread
From: Waba @ 2007-02-16 12:14 UTC (permalink / raw)
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Wed, Feb 14, 2007 at 03:20:18PM +0200, Avi Kivity wrote:
> Well, there's probably an emulator bug somewhere.
> 
> Can you add a printk() to set_cr4() in kvm_main.c and see what the guest 
> does?  The documentation states that it's up to the OS to enable the 
> bit, so I can't just apply the previous patch, even though it fixes the 
> problem.

set_cr4: set 0000000000000010
set_cr4: set 0000000000000090
set_cr4: set 0000000000000098
set_cr4: set 00000000000000b8

Here it is. If I get it right, the guest never sets 0x200, then?
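
The same conclusion can be reached mechanically from the trace. The following is a minimal user-space sketch under stated assumptions: the helper names first_cr4_write_with and cr4_newly_set are hypothetical, the values are the four set_cr4 lines above, and the first write is compared against 0 rather than against whatever value CR4 actually held before it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the first logged CR4 write
 * that has any bit of mask set, or -1 if no write in the trace sets it. */
int first_cr4_write_with(const unsigned long *writes, size_t count,
                         unsigned long mask)
{
    size_t i;

    assert(writes != NULL);
    for (i = 0; i < count; i++)
        if (writes[i] & mask)
            return (int)i;
    return -1;
}

/* Hypothetical helper: bits that write i turns on relative to the
 * previous write (relative to 0 for the first write, an assumption). */
unsigned long cr4_newly_set(const unsigned long *writes, size_t i)
{
    unsigned long prev = i ? writes[i - 1] : 0;

    assert(writes != NULL);
    return writes[i] & ~prev;
}
```

With the trace { 0x10, 0x90, 0x98, 0xb8 }, the newly set bits are 0x10, 0x80, 0x08 and 0x20 in turn, and first_cr4_write_with(trace, 4, 0x200) returns -1, i.e. none of the logged writes sets 0x200.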

-Waba.

PS: I'll be on holidays and unable to read my mail until Thursday.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Solaris 10 doesn't work under KVM
  2007-02-16 12:14                                         ` Waba
@ 2007-02-18  9:44                                           ` Avi Kivity
  0 siblings, 0 replies; 39+ messages in thread
From: Avi Kivity @ 2007-02-18  9:44 UTC (permalink / raw)
  To: Waba; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Waba wrote:
> On Wed, Feb 14, 2007 at 03:20:18PM +0200, Avi Kivity wrote:
>   
>> Well, there's probably an emulator bug somewhere.
>>
>> Can you add a printk() to set_cr4() in kvm_main.c and see what the guest 
>> does?  The documentation states that it's up to the OS to enable the 
>> bit, so I can't just apply the previous patch, even though it fixes the 
>> problem.
>>     
>
> set_cr4: set 0000000000000010
> set_cr4: set 0000000000000090
> set_cr4: set 0000000000000098
> set_cr4: set 00000000000000b8
>
> Here it is. If I get it right, the guest never sets 0x200, then?
>
>   

Yes.

At this point it'll be hard going without the sources.  Perhaps you can 
reproduce this with opensolaris?


> -Waba.
>
> PS: I'll be on holidays and unable to read my mail until Thursday.
>   

We'll be waiting...  enjoy your vacation.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2007-02-18  9:44 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-28 14:40 Solaris 10 doesn't work under KVM Waba
2007-01-28 17:38 ` Michael Riepe
     [not found]   ` <45BCDF8C.1000508-0QoEqw4nQxo@public.gmane.org>
2007-01-28 18:28     ` Waba
2007-01-28 19:27 ` Avi Kivity
     [not found]   ` <45BCF91E.2030704-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-01-28 20:25     ` Anthony Liguori
2007-01-28 22:23     ` Waba
2007-01-29  8:28       ` Avi Kivity
     [not found]         ` <45BDB03B.8000305-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-01-29  9:40           ` Avi Kivity
2007-01-29 11:49 ` Avi Kivity
     [not found]   ` <45BDDF32.3010607-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-01 21:49     ` Waba
2007-02-02 19:19       ` Joerg Roedel
     [not found]         ` <20070202191942.GB8804-5C7GfCeVMHo@public.gmane.org>
2007-02-04  9:50           ` Avi Kivity
2007-02-04 18:31           ` Waba
2007-02-07  9:42             ` Avi Kivity
     [not found]               ` <45C99EE9.3010306-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-07 23:04                 ` Waba
2007-02-08  9:27                   ` Avi Kivity
     [not found]                     ` <45CAECEB.4000701-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-08  9:58                       ` Joerg Roedel
     [not found]                         ` <20070208095816.GA5204-5C7GfCeVMHo@public.gmane.org>
2007-02-08 10:04                           ` Avi Kivity
     [not found]                             ` <45CAF5C6.8020104-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-08 10:19                               ` Joerg Roedel
     [not found]                                 ` <20070208101945.GB5204-5C7GfCeVMHo@public.gmane.org>
2007-02-08 10:39                                   ` Avi Kivity
     [not found]                                     ` <45CAFDF6.4020402-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-08 11:00                                       ` Cyril Plisko
     [not found]                                         ` <c7dddeaa0702080300i31eb933fjfcdb4570f82b0a79-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-02-08 12:21                                           ` Avi Kivity
     [not found]                                             ` <45CB15E3.7010803-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-08 13:45                                               ` Cyril Plisko
2007-02-08 14:45                                               ` Joerg Roedel
     [not found]                                                 ` <20070208144530.GC5204-5C7GfCeVMHo@public.gmane.org>
2007-02-08 14:58                                                   ` Cyril Plisko
2007-02-10 13:34                       ` Waba
2007-02-11  9:14                         ` Avi Kivity
     [not found]                           ` <45CEDE92.4090204-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-11 10:43                             ` Waba
2007-02-11 10:58                               ` Avi Kivity
2007-02-12  9:48                         ` Avi Kivity
     [not found]                           ` <45D03801.4040006-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-12 14:57                             ` Gregory Haskins
     [not found]                               ` <45D03A33.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
2007-02-12 17:35                                 ` Dor Laor
2007-02-13 14:06                             ` Waba
2007-02-13 14:37                               ` Avi Kivity
     [not found]                                 ` <45D1CD1F.907-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-13 21:44                                   ` Waba
2007-02-14 13:20                                     ` Avi Kivity
     [not found]                                       ` <45D30C92.8000808-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-16 12:14                                         ` Waba
2007-02-18  9:44                                           ` Avi Kivity
2007-02-12 17:58                         ` Joerg Roedel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.