* Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
@ 2010-05-24 15:51 Xu, Jiajun
  2010-05-24 16:53 ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Xu, Jiajun @ 2010-05-24 15:51 UTC (permalink / raw)
  To: xen-devel

Hi all,
	This is our bi-weekly test report for the Xen-unstable tree. We found 3 new bugs in the past two weeks. 64-bit testing is blocked because guest creation on a 64-bit host causes a Xen panic. XenU guests and CPU offline do not work.
	On the bug-fixing side, Save/Restore and Live Migration work again. The VT-d issues with 2 assigned NICs and with the Myricom NIC are both resolved.
	We use pv_ops (xen/master, 2.6.31.13) as Dom0 in our testing.

Status Summary
====================================================================
Feature				Result
------------------------------------------------------
VT-x/VT-x2			PASS
RAS 				Buggy
VT-d					Buggy
SR-IOV				Buggy
TXT					PASS
PowerMgmt			PASS
Other				Buggy

New Bugs (3):
====================================================================
1. xen hypervisor hang when create guest on 32e platform
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
2. CPU panic when running cpu offline
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
3. xenu guest can't boot up
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1618

Fixed Bugs (3):
====================================================================
1. Save/Restore and Live Migration can not work
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1609
2. [VT-d]Xen crash when booting guest with device assigned and Myricom driver loaded in Dom0
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1608
3. [VT-d] Guest with 2 NIC assigned may hang when booting
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1610

Old P1 Bugs (1):
=====================================================================
1. stubdom based guest hangs at starting when using qcow image.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1372

Old P2 Bugs (12):
=====================================================================
1. Failed to install FC10
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1461
2. Two onboard 82576 NICs assigned to HVM guest cannot work stable if use INTx interrupt.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1459
3. stubdom based guest hangs when assigning hdc to it.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1373
4. [stubdom]The xm save command hangs while saving <Domain-dm>.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1377
5. [stubdom] cannot restore stubdom based domain.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1378
6. Live Migration with md5sum running cause dma_timer_expiry error in guest
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1530
7. Very slow mouse/keyboard and no USB thumbdrive detected w/Core i7 & Pvops
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1541
8. Linux guest boots up very slow with SDL rendering
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1478
9. [RAS] CPUs are not in the correct NUMA node after hot-add memory
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1573
10. [SR-IOV] Qemu report pci_msix_writel error while assigning VF to guest
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1575
11. Can't create guest with big memory if do not limit Dom0 memory
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1604
12. Add fix for TBOOT/Xen and S3 flow
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1611


Xen Info:
============================================================================
Service OS : Red Hat Enterprise Linux Server release 5.1 (Tikanga)
xen-changeset:   21438:840f269d95fb

pvops git:
commit a3e7c7b82c09450487a7e7f5f47b165c49474fd4
Merge: f3d5fe8... a47d360...
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

ioemu git:
commit 01626771cf2e9285fbfddcbded2820fc77745e4b
Author: Ian Jackson <ian.jackson@eu.citrix.com>
Date:   Fri Apr 30 17:41:45 2010 +0100

Test Environment:
==========================================================================
Service OS : Red Hat Enterprise Linux Server release 5.1 (Tikanga)
Hardware   : Westmere-HEDT

   PAE           Summary Test Report of Last Session 
=====================================================================
  	                    Total   Pass    Fail    NoResult   Crash
=====================================================================
vtd_ept_vpid                13      13      0         0        0
control_panel_ept_vpid       10      8       2         0        0
ras_ept_vpid                1       0       0         0        1
gtest_ept_vpid              23      23      0         0        0
acpi_ept_vpid               5       3       2         0        0
sriov_ept_vpid              2       2       0         0        0
=====================================================================
vtd_ept_vpid                13      13      0         0        0
 :lm_pci_up_nomsi_PAE_gPA   1       1       0         0        0
 :two_dev_scp_nomsi_PAE_g   1       1       0         0        0
 :one_pcie_smp_PAE_gPAE     1       1       0         0        0
 :two_dev_up_PAE_gPAE       1       1       0         0        0
 :lm_pcie_smp_PAE_gPAE      1       1       0         0        0
 :one_pcie_smp_nomsi_PAE_   1       1       0         0        0
 :two_dev_smp_nomsi_PAE_g   1       1       0         0        0
 :two_dev_smp_PAE_gPAE      1       1       0         0        0
 :two_dev_up_nomsi_PAE_gP   1       1       0         0        0
 :hp_pci_up_PAE_gPAE        1       1       0         0        0
 :two_dev_scp_PAE_gPAE      1       1       0         0        0
 :lm_pci_smp_nomsi_PAE_gP   1       1       0         0        0
 :lm_pcie_up_PAE_gPAE       1       1       0         0        0
control_panel_ept_vpid      10      8       2         0        0
 :XEN_SR_SMP_PAE_gPAE       1       1       0         0        0
 :XEN_linux_win_PAE_gPAE    1       1       0         0        0
 :XEN_SR_Continuity_PAE_g   1       1       0         0        0
 :XEN_LM_SMP_PAE_gPAE       1       1       0         0        0
 :XEN_vmx_vcpu_pin_PAE_gP   1       1       0         0        0
 :XEN_1500M_guest_PAE_gPA   1       0       1         0        0
 :XEN_LM_Continuity_PAE_g   1       1       0         0        0
 :XEN_two_winxp_PAE_gPAE    1       0       1         0        0
 :XEN_256M_guest_PAE_gPAE   1       1       0         0        0
 :XEN_vmx_2vcpu_PAE_gPAE    1       1       0         0        0
ras_ept_vpid                1       0       0         0        1
 :cpu_online_offline_PAE_   1       0       0         0        1
gtest_ept_vpid              23      23      0         0        0
 :ltp_nightly_PAE_gPAE      1       1       0         0        0
 :boot_up_acpi_PAE_gPAE     1       1       0         0        0
 :reboot_xp_PAE_gPAE        1       1       0         0        0
 :boot_up_acpi_xp_PAE_gPA   1       1       0         0        0
 :boot_fc9_PAE_gPAE         1       1       0         0        0
 :boot_up_vista_PAE_gPAE    1       1       0         0        0
 :boot_up_acpi_win2k3_PAE   1       1       0         0        0
 :boot_smp_win7_ent_PAE_g   1       1       0         0        0
 :boot_smp_acpi_win2k3_PA   1       1       0         0        0
 :boot_smp_acpi_xp_PAE_gP   1       1       0         0        0
 :boot_smp_win7_ent_debug   1       1       0         0        0
 :boot_smp_vista_PAE_gPAE   1       1       0         0        0
 :boot_up_noacpi_win2k3_P   1       1       0         0        0
 :boot_nevada_PAE_gPAE      1       1       0         0        0
 :boot_solaris10u5_PAE_gP   1       1       0         0        0
 :boot_indiana_PAE_gPAE     1       1       0         0        0
 :boot_rhel5u1_PAE_gPAE     1       1       0         0        0
 :boot_base_kernel_PAE_gP   1       1       0         0        0
 :boot_up_win2008_PAE_gPA   1       1       0         0        0
 :boot_up_noacpi_xp_PAE_g   1       1       0         0        0
 :boot_smp_win2008_PAE_gP   1       1       0         0        0
 :reboot_fc6_PAE_gPAE       1       1       0         0        0
 :kb_nightly_PAE_gPAE       1       1       0         0        0
acpi_ept_vpid               5       3       2         0        0
 :monitor_p_status_PAE_gP   1       1       0         0        0
 :hvm_s3_smp_sr_PAE_gPAE    1       0       1         0        0
 :Dom0_S3_PAE_gPAE          1       1       0         0        0
 :monitor_c_status_PAE_gP   1       1       0         0        0
 :hvm_s3_smp_PAE_gPAE       1       0       1         0        0
sriov_ept_vpid              2       2       0         0        0
 :serial_vfs_smp_PAE_gPAE   1       1       0         0        0
 :one_vf_smp_PAE_gPAE       1       1       0         0        0
=====================================================================
Total                       54      49      4         0        1

Service OS : Red Hat Enterprise Linux Server release 5.1 (Tikanga)
Hardware   : Stoakley

   PAE           Summary Test Report of Last Session
=====================================================================
  	                    Total   Pass    Fail    NoResult   Crash
=====================================================================
vtd_ept_vpid                13      12      1         0        0
control_panel_ept_vpid      12      9       3         0        0
ras_ept_vpid                1       0       0         0        1
gtest_ept_vpid              23      23      0         0        0
acpi_ept_vpid               3       3       0         0        0
=====================================================================
vtd_ept_vpid                13      12      1         0        0
 :two_dev_scp_nomsi_PAE_g   1       1       0         0        0
 :lm_pci_up_nomsi_PAE_gPA   1       1       0         0        0
 :one_pcie_smp_PAE_gPAE     1       1       0         0        0
 :two_dev_up_PAE_gPAE       1       1       0         0        0
 :lm_pcie_smp_PAE_gPAE      1       1       0         0        0
 :two_dev_smp_nomsi_PAE_g   1       1       0         0        0
 :two_dev_smp_PAE_gPAE      1       0       1         0        0
 :one_pcie_smp_nomsi_PAE_   1       1       0         0        0
 :hp_pci_up_PAE_gPAE        1       1       0         0        0
 :two_dev_up_nomsi_PAE_gP   1       1       0         0        0
 :two_dev_scp_PAE_gPAE      1       1       0         0        0
 :lm_pci_smp_nomsi_PAE_gP   1       1       0         0        0
 :lm_pcie_up_PAE_gPAE       1       1       0         0        0
control_panel_ept_vpid      12      9       3         0        0
 :XEN_4G_guest_PAE_gPAE     1       0       1         0        0
 :XEN_linux_win_PAE_gPAE    1       1       0         0        0
 :XEN_SR_SMP_PAE_gPAE       1       1       0         0        0
 :XEN_LM_SMP_PAE_gPAE       1       1       0         0        0
 :XEN_SR_Continuity_PAE_g   1       1       0         0        0
 :XEN_vmx_vcpu_pin_PAE_gP   1       1       0         0        0
 :XEN_LM_Continuity_PAE_g   1       1       0         0        0
 :XEN_256M_guest_PAE_gPAE   1       1       0         0        0
 :XEN_1500M_guest_PAE_gPA   1       0       1         0        0
 :XEN_256M_xenu_PAE_gPAE    1       0       1         0        0
 :XEN_two_winxp_PAE_gPAE    1       1       0         0        0
 :XEN_vmx_2vcpu_PAE_gPAE    1       1       0         0        0
ras_ept_vpid                1       0       0         0        1
 :cpu_online_offline_PAE_   1       0       0         0        1
gtest_ept_vpid              23      23      0         0        0
 :ltp_nightly_PAE_gPAE      1       1       0         0        0
 :boot_up_acpi_PAE_gPAE     1       1       0         0        0
 :reboot_xp_PAE_gPAE        1       1       0         0        0
 :boot_up_acpi_xp_PAE_gPA   1       1       0         0        0
 :boot_up_vista_PAE_gPAE    1       1       0         0        0
 :boot_fc9_PAE_gPAE         1       1       0         0        0
 :boot_smp_win7_ent_PAE_g   1       1       0         0        0
 :boot_up_acpi_win2k3_PAE   1       1       0         0        0
 :boot_smp_acpi_win2k3_PA   1       1       0         0        0
 :boot_smp_acpi_xp_PAE_gP   1       1       0         0        0
 :boot_smp_win7_ent_debug   1       1       0         0        0
 :boot_smp_vista_PAE_gPAE   1       1       0         0        0
 :boot_up_noacpi_win2k3_P   1       1       0         0        0
 :boot_nevada_PAE_gPAE      1       1       0         0        0
 :boot_rhel5u1_PAE_gPAE     1       1       0         0        0
 :boot_indiana_PAE_gPAE     1       1       0         0        0
 :boot_solaris10u5_PAE_gP   1       1       0         0        0
 :boot_base_kernel_PAE_gP   1       1       0         0        0
 :boot_up_win2008_PAE_gPA   1       1       0         0        0
 :boot_up_noacpi_xp_PAE_g   1       1       0         0        0
 :boot_smp_win2008_PAE_gP   1       1       0         0        0
 :reboot_fc6_PAE_gPAE       1       1       0         0        0
 :kb_nightly_PAE_gPAE       1       1       0         0        0
acpi_ept_vpid               3       3       0         0        0
 :Dom0_S3_PAE_gPAE          1       1       0         0        0
 :hvm_s3_smp_sr_PAE_gPAE    1       1       0         0        0
 :hvm_s3_smp_PAE_gPAE       1       1       0         0        0
=====================================================================
Total                       52      47      4         0        1

Best Regards,
Jiajun


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-24 15:51 Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7 Xu, Jiajun
@ 2010-05-24 16:53 ` Keir Fraser
  2010-05-25  8:17   ` Xu, Jiajun
  0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2010-05-24 16:53 UTC (permalink / raw)
  To: Xu, Jiajun, xen-devel

On 24/05/2010 16:51, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:

> 1. xen hypervisor hang when create guest on 32e platform
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
> 2. CPU panic when running cpu offline
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616

Please attach the backtrace. Also some indication of how easily this bug
triggers (is it on every cpu offline on your system, or do you have to cycle
the test a while?).

> 3. xenu guest can't boot up
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1618

This is probably fixed at xen-unstable tip.

 -- Keir


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-24 16:53 ` Keir Fraser
@ 2010-05-25  8:17   ` Xu, Jiajun
  2010-05-25  8:27     ` Dulloor
  2010-05-25  9:13     ` Keir Fraser
  0 siblings, 2 replies; 19+ messages in thread
From: Xu, Jiajun @ 2010-05-25  8:17 UTC (permalink / raw)
  To: Keir Fraser, xen-devel

> On 24/05/2010 16:51, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
> 
>> 1. xen hypervisor hang when create guest on 32e platform
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617

The bug occurs every time I create the guest. I have attached the serial output to the bugzilla.

>> 2. CPU panic when running cpu offline
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616

Xen panics every time I offline a CPU. The log is also attached to the bugzilla.

> Please attach the backtrace. Also some indication of how easily this
> bug triggers (is it on every cpu offline on your system, or do you
> have to cycle the test a while?).
> 
>> 3. xenu guest can't boot up
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1618
> 
> This is probably fixed at xen-unstable tip.

Thanks a lot. We will verify the issue with the tip.

Best Regards,
Jiajun


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-25  8:17   ` Xu, Jiajun
@ 2010-05-25  8:27     ` Dulloor
  2010-05-25  8:29       ` Xu, Jiajun
  2010-05-26 13:48       ` Xu, Jiajun
  2010-05-25  9:13     ` Keir Fraser
  1 sibling, 2 replies; 19+ messages in thread
From: Dulloor @ 2010-05-25  8:27 UTC (permalink / raw)
  To: Xu, Jiajun; +Cc: xen-devel, Keir Fraser

On Tue, May 25, 2010 at 4:17 AM, Xu, Jiajun <jiajun.xu@intel.com> wrote:
>> On 24/05/2010 16:51, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>>
>>> 1. xen hypervisor hang when create guest on 32e platform
>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>
> The bug occurs each time when I created the guest. I have attached the serial output on the bugzilla.
I see the same hang, but on a 64-bit platform. Can you verify whether changeset 21433
is the culprit? That is the case for me.

>
>>> 2. CPU panic when running cpu offline
>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
>
> Xen will panic when I offline cpu each time. The log is also attached on the bugzilla.
>
>> Please attach the backtrace. Also some indication of how easily this
>> bug triggers (is it on every cpu offline on your system, or do you
>> have to cycle the test a while?).
>>
>>> 3. xenu guest can't boot up
>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1618
>>
>> This is probably fixed at xen-unstable tip.
>
> Thanks a lot. We will verify the issue with the tip.
>
> Best Regards,
> Jiajun
>
>
>
-dulloor


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-25  8:27     ` Dulloor
@ 2010-05-25  8:29       ` Xu, Jiajun
  2010-05-26 13:48       ` Xu, Jiajun
  1 sibling, 0 replies; 19+ messages in thread
From: Xu, Jiajun @ 2010-05-25  8:29 UTC (permalink / raw)
  To: Dulloor; +Cc: xen-devel, Keir Fraser

> On Tue, May 25, 2010 at 4:17 AM, Xu, Jiajun <jiajun.xu@intel.com> wrote:
>>> On 24/05/2010 16:51, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>>> 
>>>> 1. xen hypervisor hang when create guest on 32e platform
>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>> 
>> The bug occurs each time when I created the guest. I have attached
>> the serial output on the bugzilla.
> I see the same hang, but on a 64-bit platform. Can you verify if 21433
> is the culprit, which is the case with me.

Yes, we also see it on a 64-bit host. :)
Thanks, Dulloor. We will check with changeset 21433.

Best Regards,
Jiajun


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-25  8:17   ` Xu, Jiajun
  2010-05-25  8:27     ` Dulloor
@ 2010-05-25  9:13     ` Keir Fraser
  2010-05-25  9:15       ` Keir Fraser
  1 sibling, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2010-05-25  9:13 UTC (permalink / raw)
  To: Xu, Jiajun, xen-devel

On 25/05/2010 09:17, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:

>>> 1. xen hypervisor hang when create guest on 32e platform
>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
> 
> The bug occurs each time when I created the guest. I have attached the serial
> output on the bugzilla.

I haven't been able to reproduce this.

>>> 2. CPU panic when running cpu offline
>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
> 
> Xen will panic when I offline cpu each time. The log is also attached on the
> bugzilla.

Nor this. I even installed 32-bit Xen to match your environment more
closely.

 -- Keir


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-25  9:13     ` Keir Fraser
@ 2010-05-25  9:15       ` Keir Fraser
  2010-06-01  7:43         ` Jiang, Yunhong
  0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2010-05-25  9:15 UTC (permalink / raw)
  To: Xu, Jiajun, xen-devel

On 25/05/2010 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>>>> 1. xen hypervisor hang when create guest on 32e platform
>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>> 
>> The bug occurs each time when I created the guest. I have attached the serial
>> output on the bugzilla.
> 
> I haven't been able to reproduce this.
> 
>>>> 2. CPU panic when running cpu offline
>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
>> 
>> Xen will panic when I offline cpu each time. The log is also attached on the
>> bugzilla.
> 
> Nor this. I even installed 32-bit Xen to match your environment more
> closely.

I'm running xen-unstable:21447 by the way. I ran 64-bit Xen for testing (1)
above, and both 64-bit and 32-bit Xen for testing (2).

 K.


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-25  8:27     ` Dulloor
  2010-05-25  8:29       ` Xu, Jiajun
@ 2010-05-26 13:48       ` Xu, Jiajun
  1 sibling, 0 replies; 19+ messages in thread
From: Xu, Jiajun @ 2010-05-26 13:48 UTC (permalink / raw)
  To: Dulloor; +Cc: xen-devel, Keir Fraser

> On Tue, May 25, 2010 at 4:17 AM, Xu, Jiajun <jiajun.xu@intel.com> wrote:
>>> On 24/05/2010 16:51, "Xu, Jiajun" <jiajun.xu@intel.com> wrote:
>>> 
>>>> 1. xen hypervisor hang when create guest on 32e platform
>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>> 
>> The bug occurs each time when I created the guest. I have attached
>> the serial output on the bugzilla.
> I see the same hang, but on a 64-bit platform. Can you verify if 21433
> is the culprit, which is the case with me.

Hi Keir, Dulloor,
We confirmed that changeset 21433 caused the issue on our platform.

Best Regards,
Jiajun


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-05-25  9:15       ` Keir Fraser
@ 2010-06-01  7:43         ` Jiang, Yunhong
  2010-06-01  9:30           ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Jiang, Yunhong @ 2010-06-01  7:43 UTC (permalink / raw)
  To: Keir Fraser, Xu, Jiajun, xen-devel


For issue 2 (CPU panic when running cpu offline), it appears to come from the periodic_timer.

When a CPU is pulled down, cpu_disable_scheduler() migrates the singleshot timer, but the periodic_timer is not migrated.
After the vcpu is scheduled on another pCPU and later scheduled out from that new pCPU, stop_timer(&prev->periodic_timer) tries to access the per_cpu structure, which still points to the offlined CPU's per_cpu area, and that causes trouble. This should be caused by the per_cpu changes.

I tried migrating the periodic_timer as well in cpu_disable_scheduler(), and that seems to work (commenting out the migration in cpu_disable_scheduler triggers the printk below).
It seems that on your side the timer is always triggered before the schedule-out?

--jyh

diff -r 96917cf25bf3 xen/common/schedule.c
--- a/xen/common/schedule.c	Fri May 28 10:54:07 2010 +0100
+++ b/xen/common/schedule.c	Tue Jun 01 15:35:21 2010 +0800
@@ -487,6 +487,15 @@ int cpu_disable_scheduler(unsigned int c
                 migrate_timer(&v->singleshot_timer, cpu_mig);
             }
 
+/*
+            if ( v->periodic_timer.cpu == cpu )
+            {
+                int cpu_mig = first_cpu(c->cpu_valid);
+                if ( cpu_mig == cpu )
+                    cpu_mig = next_cpu(cpu_mig, c->cpu_valid);
+                migrate_timer(&v->periodic_timer, cpu_mig);
+            }
+*/
             if ( v->processor == cpu )
             {
                 set_bit(_VPF_migrating, &v->pause_flags);
@@ -505,7 +514,10 @@ int cpu_disable_scheduler(unsigned int c
              * all locks.
              */
             if ( v->processor == cpu )
+            {
+                printk("we hit the EAGAIN here\n");
                 ret = -EAGAIN;
+            }
         }
     }
     return ret;
@@ -1005,6 +1017,11 @@ static void schedule(void)
 
     perfc_incr(sched_ctx);
 
+    if (prev->periodic_timer.cpu != smp_processor_id() && !cpu_online(prev->periodic_timer.cpu))
+    {
+        printk("I'm now at cpu %x, timer's cpu is %x\n", smp_processor_id(), prev->periodic_timer.cpu);
+    }
+
     stop_timer(&prev->periodic_timer);
 
     /* Ensure that the domain has an up-to-date time base. */



--jyh

>-----Original Message-----
>From: xen-devel-bounces@lists.xensource.com
>[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Keir Fraser
>Sent: Tuesday, May 25, 2010 5:15 PM
>To: Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>#a3e7c7...
>
>On 25/05/2010 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>
>>>>> 1. xen hypervisor hang when create guest on 32e platform
>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>>>
>>> The bug occurs each time when I created the guest. I have attached the serial
>>> output on the bugzilla.
>>
>> I haven't been able to reproduce this.
>>
>>>>> 2. CPU panic when running cpu offline
>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
>>>
>>> Xen will panic when I offline cpu each time. The log is also attached on the
>>> bugzilla.
>>
>> Nor this. I even installed 32-bit Xen to match your environment more
>> closely.
>
>I'm running xen-unstable:21447 by the way. I ran 64-bit Xen for testing (1)
>above, and both 64-bit and 32-bit Xen for testing (2).
>
> K.
>
>
>



* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-01  7:43         ` Jiang, Yunhong
@ 2010-06-01  9:30           ` Keir Fraser
  2010-06-02  7:28             ` Jiang, Yunhong
  0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2010-06-01  9:30 UTC (permalink / raw)
  To: Jiang, Yunhong, Xu, Jiajun, xen-devel

On 01/06/2010 08:43, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> For issue 2, CPU panic when running cpu offline, it should comes from the
> periodic_timer.
> 
> When a CPU is pull down, cpu_disable_scheduler will remove the single shot
> timer, but the periodic_timer is not migrated.
> After the vcpu is scheduled on another pCPU later, and then schedule out from
> that new pcpu, the stop_timer(&prev->periodic_timer) will try to access the
> per_cpu strucutre, whic still poiting to the offlined CPU's per_cpu area and
> will cause trouble. This should be caused by the per_cpu changes.

Which xen-unstable changeset are you testing? All timers should be
automatically migrated off a dead CPU and onto CPU0 by changeset 21424. Is
that not working okay for you?

 -- Keir

> I try to migrate the periodic_timer also when cpu_disable_scheduler() and
> seems it works. (comments the migration in cpu_disable_scheudler will trigger
> the printk).
> Seems on your side, the timer will always be triggered before schedule out?
> 
> --jyh
> 
> diff -r 96917cf25bf3 xen/common/schedule.c
> --- a/xen/common/schedule.c Fri May 28 10:54:07 2010 +0100
> +++ b/xen/common/schedule.c Tue Jun 01 15:35:21 2010 +0800
> @@ -487,6 +487,15 @@ int cpu_disable_scheduler(unsigned int c
>                  migrate_timer(&v->singleshot_timer, cpu_mig);
>              }
>  
> +/*
> +            if ( v->periodic_timer.cpu == cpu )
> +            {
> +                int cpu_mig = first_cpu(c->cpu_valid);
> +                if ( cpu_mig == cpu )
> +                    cpu_mig = next_cpu(cpu_mig, c->cpu_valid);
> +                migrate_timer(&v->periodic_timer, cpu_mig);
> +            }
> +*/
>              if ( v->processor == cpu )
>              {
>                  set_bit(_VPF_migrating, &v->pause_flags);
> @@ -505,7 +514,10 @@ int cpu_disable_scheduler(unsigned int c
>               * all locks.
>               */
>              if ( v->processor == cpu )
> +            {
> +                printk("we hit the EAGAIN here\n");
>                  ret = -EAGAIN;
> +            }
>          }
>      }
>      return ret;
> @@ -1005,6 +1017,11 @@ static void schedule(void)
>  
>      perfc_incr(sched_ctx);
>  
> +    if (prev->periodic_timer.cpu != smp_processor_id() &&
> !cpu_online(prev->periodic_timer.cpu))
> +    {
> +        printk("I'm now at cpu %x, timer's cpu is %x\n", smp_processor_id(),
> prev->periodic_timer.cpu);
> +    }
> +
>      stop_timer(&prev->periodic_timer);
>  
>      /* Ensure that the domain has an up-to-date time base. */
> 
> 
> 
> --jyh
> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Keir Fraser
>> Sent: Tuesday, May 25, 2010 5:15 PM
>> To: Xu, Jiajun; xen-devel@lists.xensource.com
>> Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>> #a3e7c7...
>> 
>> On 25/05/2010 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>> 
>>>>>> 1. xen hypervisor hang when create guest on 32e platform
>>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>>>> 
>>>> The bug occurs each time when I created the guest. I have attached the
>>>> serial
>>>> output on the bugzilla.
>>> 
>>> I haven't been able to reproduce this.
>>> 
>>>>>> 2. CPU panic when running cpu offline
>>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
>>>> 
>>>> Xen will panic when I offline cpu each time. The log is also attached on
>>>> the
>>>> bugzilla.
>>> 
>>> Nor this. I even installed 32-bit Xen to match your environment more
>>> closely.
>> 
>> I'm running xen-unstable:21447 by the way. I ran 64-bit Xen for testing (1)
>> above, and both 64-bit and 32-bit Xen for testing (2).
>> 
>> K.
>> 
>> 
>> 


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-01  9:30           ` Keir Fraser
@ 2010-06-02  7:28             ` Jiang, Yunhong
  2010-06-02  8:01               ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Jiang, Yunhong @ 2010-06-02  7:28 UTC (permalink / raw)
  To: Keir Fraser, Xu, Jiajun, xen-devel



>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Tuesday, June 01, 2010 5:31 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>#a3e7c7...
>
>On 01/06/2010 08:43, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> For issue 2, CPU panic when running cpu offline, it should comes from the
>> periodic_timer.
>>
>> When a CPU is pull down, cpu_disable_scheduler will remove the single shot
>> timer, but the periodic_timer is not migrated.
>> After the vcpu is scheduled on another pCPU later, and then schedule out from
>> that new pcpu, the stop_timer(&prev->periodic_timer) will try to access the
>> per_cpu strucutre, whic still poiting to the offlined CPU's per_cpu area and
>> will cause trouble. This should be caused by the per_cpu changes.
>
>Which xen-unstable changeset are you testing? All timers should be
>automatically migrated off a dead CPU and onto CPU0 by changeset 21424. Is
>that not working okay for you?

We are testing on 21492.

After more investigation, the root cause is that the periodic_timer is stopped before take_cpu_down (in schedule()), so it is not covered by changeset 21424.
When v->periodic_period == 0, the next vcpu's periodic_timer is not updated by schedule(); thus, in a later scheduling round, stop_timer() runs into trouble.

With following small patch, it works, but I'm not sure if this is good solution.

--jyh

diff -r 96917cf25bf3 xen/common/schedule.c
--- a/xen/common/schedule.c	Fri May 28 10:54:07 2010 +0100
+++ b/xen/common/schedule.c	Wed Jun 02 15:18:56 2010 +0800
@@ -893,7 +893,10 @@ static void vcpu_periodic_timer_work(str
     ASSERT(!active_timer(&v->periodic_timer));
 
     if ( v->periodic_period == 0 )
+    {
+        v->periodic_timer.cpu = smp_processor_id();
         return;
+    }
 
     periodic_next_event = v->periodic_last_event + v->periodic_period;
 
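The failure mode and the fix can be sketched in a small self-contained model. This is illustrative C only: `struct timer`, `NR_CPUS`, `cpu_online` and the function names are simplified stand-ins for Xen's real structures, not its actual API.

```c
#include <assert.h>

#define NR_CPUS 4

/* Minimal stand-in for Xen's timer: ->cpu names the per-CPU heap the
 * timer lives on. */
struct timer {
    int cpu;
    int active;
};

static int cpu_online[NR_CPUS] = { 1, 1, 1, 1 };

/* stop_timer() must touch the per-CPU area of t->cpu; if that CPU has
 * been offlined, the access faults.  Here the fault is modelled by
 * returning -1 instead of crashing. */
int stop_timer(struct timer *t)
{
    if (!cpu_online[t->cpu])
        return -1;              /* models the fatal page fault */
    t->active = 0;
    return 0;
}

/* The idea of the patch above: even when the period is zero and the
 * timer stays inactive, repoint ->cpu at the CPU we are running on,
 * so a later stop_timer() never chases a dead CPU's per-CPU area. */
void periodic_timer_work(struct timer *t, int period, int this_cpu)
{
    t->cpu = this_cpu;          /* keep ->cpu valid in all cases */
    if (period == 0)
        return;                 /* inactive timer: nothing to arm */
    t->active = 1;
}
```

In this model, a vcpu whose idle periodic timer still names an offlined CPU hits the fault on the next schedule round; repointing the timer on every pass avoids it.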
>
> -- Keir
>
>> I tried migrating the periodic_timer as well in cpu_disable_scheduler(),
>> and it seems to work (commenting out the migration in
>> cpu_disable_scheduler() triggers the printk).
>> Does the timer always get triggered before schedule-out on your side?
>>
>> --jyh
>>
>> diff -r 96917cf25bf3 xen/common/schedule.c
>> --- a/xen/common/schedule.c Fri May 28 10:54:07 2010 +0100
>> +++ b/xen/common/schedule.c Tue Jun 01 15:35:21 2010 +0800
>> @@ -487,6 +487,15 @@ int cpu_disable_scheduler(unsigned int c
>>                  migrate_timer(&v->singleshot_timer, cpu_mig);
>>              }
>>
>> +/*
>> +            if ( v->periodic_timer.cpu == cpu )
>> +            {
>> +                int cpu_mig = first_cpu(c->cpu_valid);
>> +                if ( cpu_mig == cpu )
>> +                    cpu_mig = next_cpu(cpu_mig, c->cpu_valid);
>> +                migrate_timer(&v->periodic_timer, cpu_mig);
>> +            }
>> +*/
>>              if ( v->processor == cpu )
>>              {
>>                  set_bit(_VPF_migrating, &v->pause_flags);
>> @@ -505,7 +514,10 @@ int cpu_disable_scheduler(unsigned int c
>>               * all locks.
>>               */
>>              if ( v->processor == cpu )
>> +            {
>> +                printk("we hit the EAGAIN here\n");
>>                  ret = -EAGAIN;
>> +            }
>>          }
>>      }
>>      return ret;
>> @@ -1005,6 +1017,11 @@ static void schedule(void)
>>
>>      perfc_incr(sched_ctx);
>>
>> +    if (prev->periodic_timer.cpu != smp_processor_id() &&
>> !cpu_online(prev->periodic_timer.cpu))
>> +    {
>> +        printk("I'm now at cpu %x, timer's cpu is %x\n", smp_processor_id(),
>> prev->periodic_timer.cpu);
>> +    }
>> +
>>      stop_timer(&prev->periodic_timer);
>>
>>      /* Ensure that the domain has an up-to-date time base. */
>>
>>
>>
>> --jyh
>>
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xensource.com
>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Keir Fraser
>>> Sent: Tuesday, May 25, 2010 5:15 PM
>>> To: Xu, Jiajun; xen-devel@lists.xensource.com
>>> Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>>> #a3e7c7...
>>>
>>> On 25/05/2010 10:13, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>>>
>>>>>>> 1. xen hypervisor hang when create guest on 32e platform
>>>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1617
>>>>>
>>>>> The bug occurs each time when I created the guest. I have attached the
>>>>> serial
>>>>> output on the bugzilla.
>>>>
>>>> I haven't been able to reproduce this.
>>>>
>>>>>>> 2. CPU panic when running cpu offline
>>>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1616
>>>>>
>>>>> Xen will panic when I offline cpu each time. The log is also attached on
>>>>> the
>>>>> bugzilla.
>>>>
>>>> Nor this. I even installed 32-bit Xen to match your environment more
>>>> closely.
>>>
>>> I'm running xen-unstable:21447 by the way. I ran 64-bit Xen for testing (1)
>>> above, and both 64-bit and 32-bit Xen for testing (2).
>>>
>>> K.
>>>
>>>
>>>
>




* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02  7:28             ` Jiang, Yunhong
@ 2010-06-02  8:01               ` Keir Fraser
  2010-06-02  8:49                 ` Jiang, Yunhong
                                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Keir Fraser @ 2010-06-02  8:01 UTC (permalink / raw)
  To: Jiang, Yunhong, Xu, Jiajun, xen-devel

On 02/06/2010 08:28, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

>> Which xen-unstable changeset are you testing? All timers should be
>> automatically migrated off a dead CPU and onto CPU0 by changeset 21424. Is
>> that not working okay for you?
> 
> We are testing on 21492.
> 
> After more investigation, the root cause is that the periodic_timer is
> stopped before take_cpu_down() (in schedule()), so it is not covered by
> changeset 21424. When v->periodic_period == 0, the next vcpu's
> periodic_timer is not updated by schedule(); thus, in the next scheduling
> round, stop_timer() will run into trouble.
> 
> With the following small patch it works, but I'm not sure whether this is
> a good solution.

I forgot about inactive timers in c/s 21424. Hm, I will fix this in the
timer subsystem and get back to you.

 -- Keir


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02  8:01               ` Keir Fraser
@ 2010-06-02  8:49                 ` Jiang, Yunhong
  2010-06-02  9:24                 ` Jiang, Yunhong
  2010-06-02 12:14                 ` Keir Fraser
  2 siblings, 0 replies; 19+ messages in thread
From: Jiang, Yunhong @ 2010-06-02  8:49 UTC (permalink / raw)
  To: Keir Fraser, Xu, Jiajun, xen-devel


>
>I forgot about inactive timers in c/s 21424. Hm, I will fix this in the
>timer subsystem and get back to you.
>
> -- Keir

Thanks!
--jyh


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02  8:01               ` Keir Fraser
  2010-06-02  8:49                 ` Jiang, Yunhong
@ 2010-06-02  9:24                 ` Jiang, Yunhong
  2010-06-02  9:41                   ` Keir Fraser
  2010-06-02 12:14                 ` Keir Fraser
  2 siblings, 1 reply; 19+ messages in thread
From: Jiang, Yunhong @ 2010-06-02  9:24 UTC (permalink / raw)
  To: Keir Fraser, Xu, Jiajun, xen-devel

BTW, I get the following failure after looping CPU online/offline for about 95 times.

(XEN) Xen call trace:
(XEN)    [<ffff82c48014b3e9>] clear_page_sse2+0x9/0x30
(XEN)    [<ffff82c4801b9922>] vmx_cpu_up_prepare+0x43/0x88
(XEN)    [<ffff82c4801a13fa>] cpu_callback+0x4a/0x94
(XEN)    [<ffff82c480112d95>] notifier_call_chain+0x68/0x84
(XEN)    [<ffff82c480100e5b>] cpu_up+0x7b/0x12f
(XEN)    [<ffff82c480173b7d>] arch_do_sysctl+0x770/0x833
(XEN)    [<ffff82c480121672>] do_sysctl+0x992/0x9ec
(XEN)    [<ffff82c4801fa3cf>] syscall_enter+0xef/0x149
(XEN)
(XEN) Pagetable walk from ffff83022fe1d000:
(XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
(XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
(XEN)  L2[0x17f] = 000000022ff2a063 5555555555555555
(XEN)  L1[0x01d] = 000000022fe1d262 5555555555555555

I really can't imagine how this can happen, considering vmx_alloc_vmcs() is so straightforward. My test machine is really magic.

Another fault is as follows:

(XEN) Xen call trace:
(XEN)    [<ffff82c480173459>] memcpy+0x11/0x1e
(XEN)    [<ffff82c4801722bf>] cpu_smpboot_callback+0x207/0x235
(XEN)    [<ffff82c480112d95>] notifier_call_chain+0x68/0x84
(XEN)    [<ffff82c480100e5b>] cpu_up+0x7b/0x12f
(XEN)    [<ffff82c480173c1d>] arch_do_sysctl+0x770/0x833
(XEN)    [<ffff82c480121712>] do_sysctl+0x992/0x9ec
(XEN)    [<ffff82c4801fa46f>] syscall_enter+0xef/0x149
(XEN)
(XEN) Pagetable walk from ffff830228ce5000:
(XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
(XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
(XEN)  L2[0x146] = 000000022fea3063 5555555555555555
(XEN)  L1[0x0e5] = 0000000228ce5262 000000000001fd49
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: ffff830228ce5000
(XEN) ****************************************
(XEN)

--jyh

>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Wednesday, June 02, 2010 4:01 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>#a3e7c7...
>
>On 02/06/2010 08:28, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>>> Which xen-unstable changeset are you testing? All timers should be
>>> automatically migrated off a dead CPU and onto CPU0 by changeset 21424. Is
>>> that not working okay for you?
>>
>> We are testing on 21492.
>>
>> After more investigation, the root cause is that the periodic_timer is
>> stopped before take_cpu_down() (in schedule()), so it is not covered by
>> changeset 21424. When v->periodic_period == 0, the next vcpu's
>> periodic_timer is not updated by schedule(); thus, in the next scheduling
>> round, stop_timer() will run into trouble.
>>
>> With the following small patch it works, but I'm not sure whether this is
>> a good solution.
>
>I forgot about inactive timers in c/s 21424. Hm, I will fix this in the
>timer subsystem and get back to you.
>
> -- Keir
>


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02  9:24                 ` Jiang, Yunhong
@ 2010-06-02  9:41                   ` Keir Fraser
  2010-06-02 10:23                     ` Jiang, Yunhong
  0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2010-06-02  9:41 UTC (permalink / raw)
  To: Jiang, Yunhong, Xu, Jiajun, xen-devel

On 02/06/2010 10:24, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> (XEN) Pagetable walk from ffff83022fe1d000:
> (XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
> (XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
> (XEN)  L2[0x17f] = 000000022ff2a063 5555555555555555
> (XEN)  L1[0x01d] = 000000022fe1d262 5555555555555555
> 
> I really can't imagine how this can happen, considering vmx_alloc_vmcs() is
> so straightforward. My test machine is really magic.

Not at all. The free-memory pool was getting spiked with guarded (mapped
not-present) pages. The later unlucky allocator is the one that then
crashes.

I've just fixed this as xen-unstable:21504. The bug was a silly typo.

 Thanks,
 Keir


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02  9:41                   ` Keir Fraser
@ 2010-06-02 10:23                     ` Jiang, Yunhong
  2010-06-02 12:17                       ` Keir Fraser
  0 siblings, 1 reply; 19+ messages in thread
From: Jiang, Yunhong @ 2010-06-02 10:23 UTC (permalink / raw)
  To: Keir Fraser, Xu, Jiajun, xen-devel

But in alloc_xenheap_pages() we do unguard the page again; is that not useful?

--jyh

>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Wednesday, June 02, 2010 5:41 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>#a3e7c7...
>
>On 02/06/2010 10:24, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> (XEN) Pagetable walk from ffff83022fe1d000:
>> (XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
>> (XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
>> (XEN)  L2[0x17f] = 000000022ff2a063 5555555555555555
>> (XEN)  L1[0x01d] = 000000022fe1d262 5555555555555555
>>
>> I really can't imagine how this can happen, considering vmx_alloc_vmcs() is
>> so straightforward. My test machine is really magic.
>
>Not at all. The free-memory pool was getting spiked with guarded (mapped
>not-present) pages. The later unlucky allocator is the one that then
>crashes.
>
>I've just fixed this as xen-unstable:21504. The bug was a silly typo.
>
> Thanks,
> Keir
>


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02  8:01               ` Keir Fraser
  2010-06-02  8:49                 ` Jiang, Yunhong
  2010-06-02  9:24                 ` Jiang, Yunhong
@ 2010-06-02 12:14                 ` Keir Fraser
  2 siblings, 0 replies; 19+ messages in thread
From: Keir Fraser @ 2010-06-02 12:14 UTC (permalink / raw)
  To: Jiang, Yunhong, Xu, Jiajun, xen-devel

On 02/06/2010 09:01, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> With the following small patch it works, but I'm not sure whether this is
>> a good solution.
> 
> I forgot about inactive timers in c/s 21424. Hm, I will fix this in the
> timer subsystem and get back to you.

Fixed by xen-unstable:21508.

 K.


* Re: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02 10:23                     ` Jiang, Yunhong
@ 2010-06-02 12:17                       ` Keir Fraser
  2010-06-02 13:33                         ` Jiang, Yunhong
  0 siblings, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2010-06-02 12:17 UTC (permalink / raw)
  To: Jiang, Yunhong, Xu, Jiajun, xen-devel

That version of alloc_xenheap_pages is not built for x86_64.

 K.

On 02/06/2010 11:23, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> But in alloc_xenheap_pages(), we do unguard the page again, is that useful?
> 
> --jyh
> 
>> -----Original Message-----
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Sent: Wednesday, June 02, 2010 5:41 PM
>> To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>> Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>> #a3e7c7...
>> 
>> On 02/06/2010 10:24, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>> 
>>> (XEN) Pagetable walk from ffff83022fe1d000:
>>> (XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
>>> (XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
>>> (XEN)  L2[0x17f] = 000000022ff2a063 5555555555555555
>>> (XEN)  L1[0x01d] = 000000022fe1d262 5555555555555555
>>> 
>>> I really can't imagine how this can happen, considering vmx_alloc_vmcs()
>>> is so straightforward. My test machine is really magic.
>> 
>> Not at all. The free-memory pool was getting spiked with guarded (mapped
>> not-present) pages. The later unlucky allocator is the one that then
>> crashes.
>> 
>> I've just fixed this as xen-unstable:21504. The bug was a silly typo.
>> 
>> Thanks,
>> Keir
>> 
> 


* RE: Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7...
  2010-06-02 12:17                       ` Keir Fraser
@ 2010-06-02 13:33                         ` Jiang, Yunhong
  0 siblings, 0 replies; 19+ messages in thread
From: Jiang, Yunhong @ 2010-06-02 13:33 UTC (permalink / raw)
  To: Keir Fraser, Xu, Jiajun, xen-devel

Oops, I didn't notice this.
Thanks for your patches; I will test them tomorrow.

--jyh

>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Wednesday, June 02, 2010 8:17 PM
>To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>#a3e7c7...
>
>That version of alloc_xenheap_pages is not built for x86_64.
>
> K.
>
>On 02/06/2010 11:23, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> But in alloc_xenheap_pages(), we do unguard the page again, is that useful?
>>
>> --jyh
>>
>>> -----Original Message-----
>>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>>> Sent: Wednesday, June 02, 2010 5:41 PM
>>> To: Jiang, Yunhong; Xu, Jiajun; xen-devel@lists.xensource.com
>>> Subject: Re: [Xen-devel] Biweekly VMX status report. Xen: #21438 & Xen0:
>>> #a3e7c7...
>>>
>>> On 02/06/2010 10:24, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>>
>>>> (XEN) Pagetable walk from ffff83022fe1d000:
>>>> (XEN)  L4[0x106] = 00000000cfc8d027 5555555555555555
>>>> (XEN)  L3[0x008] = 00000000cfef9063 5555555555555555
>>>> (XEN)  L2[0x17f] = 000000022ff2a063 5555555555555555
>>>> (XEN)  L1[0x01d] = 000000022fe1d262 5555555555555555
>>>>
>>>> I really can't imagine how this can happen, considering vmx_alloc_vmcs()
>>>> is so straightforward. My test machine is really magic.
>>>
>>> Not at all. The free-memory pool was getting spiked with guarded (mapped
>>> not-present) pages. The later unlucky allocator is the one that then
>>> crashes.
>>>
>>> I've just fixed this as xen-unstable:21504. The bug was a silly typo.
>>>
>>> Thanks,
>>> Keir
>>>
>>
>


end of thread, other threads:[~2010-06-02 13:33 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-24 15:51 Biweekly VMX status report. Xen: #21438 & Xen0: #a3e7c7 Xu, Jiajun
2010-05-24 16:53 ` Keir Fraser
2010-05-25  8:17   ` Xu, Jiajun
2010-05-25  8:27     ` Dulloor
2010-05-25  8:29       ` Xu, Jiajun
2010-05-26 13:48       ` Xu, Jiajun
2010-05-25  9:13     ` Keir Fraser
2010-05-25  9:15       ` Keir Fraser
2010-06-01  7:43         ` Jiang, Yunhong
2010-06-01  9:30           ` Keir Fraser
2010-06-02  7:28             ` Jiang, Yunhong
2010-06-02  8:01               ` Keir Fraser
2010-06-02  8:49                 ` Jiang, Yunhong
2010-06-02  9:24                 ` Jiang, Yunhong
2010-06-02  9:41                   ` Keir Fraser
2010-06-02 10:23                     ` Jiang, Yunhong
2010-06-02 12:17                       ` Keir Fraser
2010-06-02 13:33                         ` Jiang, Yunhong
2010-06-02 12:14                 ` Keir Fraser
