* Problem running latest Xen unstable hypervisor with latest Linux kernels
From: Dave McCracken @ 2010-05-21 13:55 UTC
  To: Keir Fraser, Jeremy Fitzhardinge; +Cc: Xen Developers List


I have a test box set up to run mainline Xen unstable hypervisor and a guest 
running the latest Linux builds from Jeremy.  Last week it all worked fine.  
This week when I pulled the latest unstable it stopped working.

Specifically, I can boot up dom0 just fine.  When I try to start the guest, Linux
gets as far as "blkfront: xvdb1: barriers enabled", then hangs.  This happens
with the xen/master branch as well as xen/core.  Returning to last week's 
hypervisor lets me boot the guest just fine.

I assume I have some combination of config options set/not set that no longer 
works with the new hypervisor code.  Is there something simple I need to 
change?

Thanks,
Dave McCracken


* Re: Problem running latest Xen unstable hypervisor with latest Linux kernels
From: Keir Fraser @ 2010-05-21 15:25 UTC
  To: Dave McCracken, Jeremy Fitzhardinge
  Cc: Xen Developers List, Stefano Stabellini

On 21/05/2010 14:55, "Dave McCracken" <dcm@mccr.org> wrote:

> I have a test box set up to run mainline Xen unstable hypervisor and a guest
> running the latest Linux builds from Jeremy.  Last week it all worked fine.
> This week when I pulled the latest unstable it stopped working.

This is a hypervisor bug. I would have bet that my recent smpboot/hotplug
changes might have caused it, but happily I can report that I am not to
blame. The offending changeset is 21339:804304d4e05d "x86: TSC handling
cleanups" by Stefano (cc'ed). The automated regression tests have also been
failing ever since it went in, so clearly PV guest startup failure is not
hard to repro with this patch applied. For now I have reverted it
(xen-unstable:21444), so please re-pull, re-build, re-test.

 -- Keir

> Specifically, I can boot up dom0 just fine.  When I try to start the guest, Linux
> gets as far as "blkfront: xvdb1: barriers enabled", then hangs.  This happens
> with the xen/master branch as well as xen/core.  Returning to last week's
> hypervisor lets me boot the guest just fine.
> 
> I assume I have some combination of config options set/not set that no longer
> works with the new hypervisor code.  Is there something simple I need to
> change?
> 
> Thanks,
> Dave McCracken


* Re: Problem running latest Xen unstable hypervisor with latest Linux kernels
From: Stefano Stabellini @ 2010-05-21 16:31 UTC
  To: Keir Fraser
  Cc: Xen Developers List, Jeremy Fitzhardinge, Dave McCracken,
	Stefano Stabellini

On Fri, 21 May 2010, Keir Fraser wrote:
> On 21/05/2010 14:55, "Dave McCracken" <dcm@mccr.org> wrote:
> 
> > I have a test box set up to run mainline Xen unstable hypervisor and a guest
> > running the latest Linux builds from Jeremy.  Last week it all worked fine.
> > This week when I pulled the latest unstable it stopped working.
> 
> This is a hypervisor bug. I would have bet that my recent smpboot/hotplug
> changes might have caused it, but happily I can report that I am not to
> blame. The offending changeset is 21339:804304d4e05d "x86: TSC handling
> cleanups" by Stefano (cc'ed). The automated regression tests have also been
> failing ever since it went in, so clearly PV guest startup failure is not
> hard to repro with this patch applied. For now I have reverted it
> (xen-unstable:21444), so please re-pull, re-build, re-test.
> 

Found the bug. I am attaching the full patch to this email; the following
is the diff against the previous version:

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

---

diff -r d84c1921e442 xen/arch/x86/time.c
--- a/xen/arch/x86/time.c	Fri May 21 17:11:21 2010 +0100
+++ b/xen/arch/x86/time.c	Fri May 21 17:28:59 2010 +0100
@@ -1636,7 +1636,6 @@
 {
     s_time_t now = get_s_time();
     struct domain *d = v->domain;
-    u64 tsc;
 
     spin_lock(&d->arch.vtsc_lock);
 
@@ -1652,7 +1651,7 @@
 
     spin_unlock(&d->arch.vtsc_lock);
 
-    tsc = gtime_to_gtsc(d, now);
+    now = gtime_to_gtsc(d, now);
 
     regs->eax = (uint32_t)now;
     regs->edx = (uint32_t)(now >> 32);
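
The hunk above is the entire fix: in the previous version the scaled result
was assigned to a local "tsc" that was never read, so the emulated rdtsc
handed the guest raw system time in nanoseconds instead of TSC ticks. A
side-by-side sketch of the tail of pv_soft_rdtsc(), reconstructed from the
diff above (illustration only, with the surrounding locking elided):

    /* Before (changeset 21339): the conversion is computed, then dropped. */
    u64 tsc = gtime_to_gtsc(d, now);   /* result never used again */
    regs->eax = (uint32_t)now;         /* 'now' is still nanoseconds */
    regs->edx = (uint32_t)(now >> 32);

    /* After: the converted value is what reaches the guest. */
    now = gtime_to_gtsc(d, now);       /* 'now' now holds guest TSC ticks */
    regs->eax = (uint32_t)now;
    regs->edx = (uint32_t)(now >> 32);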


* Re: Problem running latest Xen unstable hypervisor with latest Linux kernels
From: Stefano Stabellini @ 2010-05-21 16:32 UTC
  To: Stefano Stabellini
  Cc: Xen Developers List, Jeremy Fitzhardinge, Dave McCracken, Keir Fraser


On Fri, 21 May 2010, Stefano Stabellini wrote:
> On Fri, 21 May 2010, Keir Fraser wrote:
> > On 21/05/2010 14:55, "Dave McCracken" <dcm@mccr.org> wrote:
> > 
> > > I have a test box set up to run mainline Xen unstable hypervisor and a guest
> > > running the latest Linux builds from Jeremy.  Last week it all worked fine.
> > > This week when I pulled the latest unstable it stopped working.
> > 
> > This is a hypervisor bug. I would have bet that my recent smpboot/hotplug
> > changes might have caused it, but happily I can report that I am not to
> > blame. The offending changeset is 21339:804304d4e05d "x86: TSC handling
> > cleanups" by Stefano (cc'ed). The automated regression tests have also been
> > failing ever since it went in, so clearly PV guest startup failure is not
> > hard to repro with this patch applied. For now I have reverted it
> > (xen-unstable:21444), so please re-pull, re-build, re-test.
> > 
> 
> Found the bug. I am attaching the full patch to this email; the following
> is the diff against the previous version:
> 

Oops, forgot the attachment.


[-- Attachment #2: Type: text/plain, Size: 8139 bytes --]

diff -r d0420ab97345 xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c	Fri May 21 16:21:39 2010 +0100
+++ b/xen/arch/x86/hvm/hvm.c	Fri May 21 17:28:17 2010 +0100
@@ -205,32 +205,6 @@
         hvm_funcs.set_rdtsc_exiting(v, enable);
 }
 
-int hvm_gtsc_need_scale(struct domain *d)
-{
-    uint32_t gtsc_mhz, htsc_mhz;
-
-    if ( d->arch.vtsc )
-        return 0;
-
-    gtsc_mhz = d->arch.hvm_domain.gtsc_khz / 1000;
-    htsc_mhz = (uint32_t)cpu_khz / 1000;
-
-    d->arch.hvm_domain.tsc_scaled = (gtsc_mhz && (gtsc_mhz != htsc_mhz));
-    return d->arch.hvm_domain.tsc_scaled;
-}
-
-static u64 hvm_h2g_scale_tsc(struct vcpu *v, u64 host_tsc)
-{
-    uint32_t gtsc_khz, htsc_khz;
-
-    if ( !v->domain->arch.hvm_domain.tsc_scaled )
-        return host_tsc;
-
-    htsc_khz = cpu_khz;
-    gtsc_khz = v->domain->arch.hvm_domain.gtsc_khz;
-    return muldiv64(host_tsc, gtsc_khz, htsc_khz);
-}
-
 void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc)
 {
     uint64_t tsc;
@@ -238,11 +212,11 @@
     if ( v->domain->arch.vtsc )
     {
         tsc = hvm_get_guest_time(v);
+        tsc = gtime_to_gtsc(v->domain, tsc);
     }
     else
     {
         rdtscll(tsc);
-        tsc = hvm_h2g_scale_tsc(v, tsc);
     }
 
     v->arch.hvm_vcpu.cache_tsc_offset = guest_tsc - tsc;
@@ -256,12 +230,12 @@
     if ( v->domain->arch.vtsc )
     {
         tsc = hvm_get_guest_time(v);
+        tsc = gtime_to_gtsc(v->domain, tsc);
         v->domain->arch.vtsc_kerncount++;
     }
     else
     {
         rdtscll(tsc);
-        tsc = hvm_h2g_scale_tsc(v, tsc);
     }
 
     return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
diff -r d0420ab97345 xen/arch/x86/hvm/save.c
--- a/xen/arch/x86/hvm/save.c	Fri May 21 16:21:39 2010 +0100
+++ b/xen/arch/x86/hvm/save.c	Fri May 21 17:28:17 2010 +0100
@@ -33,7 +33,7 @@
     hdr->cpuid = eax;
 
     /* Save guest's preferred TSC. */
-    hdr->gtsc_khz = d->arch.hvm_domain.gtsc_khz;
+    hdr->gtsc_khz = d->arch.tsc_khz;
 }
 
 int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
@@ -62,8 +62,8 @@
 
     /* Restore guest's preferred TSC frequency. */
     if ( hdr->gtsc_khz )
-        d->arch.hvm_domain.gtsc_khz = hdr->gtsc_khz;
-    if ( hvm_gtsc_need_scale(d) )
+        d->arch.tsc_khz = hdr->gtsc_khz;
+    if ( d->arch.vtsc )
     {
         hvm_set_rdtsc_exiting(d, 1);
         gdprintk(XENLOG_WARNING, "Domain %d expects freq %uMHz "
diff -r d0420ab97345 xen/arch/x86/hvm/vpt.c
--- a/xen/arch/x86/hvm/vpt.c	Fri May 21 16:21:39 2010 +0100
+++ b/xen/arch/x86/hvm/vpt.c	Fri May 21 17:28:17 2010 +0100
@@ -32,9 +32,6 @@
     spin_lock_init(&pl->pl_time_lock);
     pl->stime_offset = -(u64)get_s_time();
     pl->last_guest_time = 0;
-
-    d->arch.hvm_domain.gtsc_khz = cpu_khz;
-    d->arch.hvm_domain.tsc_scaled = 0;
 }
 
 u64 hvm_get_guest_time(struct vcpu *v)
diff -r d0420ab97345 xen/arch/x86/time.c
--- a/xen/arch/x86/time.c	Fri May 21 16:21:39 2010 +0100
+++ b/xen/arch/x86/time.c	Fri May 21 17:28:17 2010 +0100
@@ -804,8 +804,13 @@
 
     if ( d->arch.vtsc )
     {
-        u64 delta = max_t(s64, t->stime_local_stamp - d->arch.vtsc_offset, 0);
-        tsc_stamp = scale_delta(delta, &d->arch.ns_to_vtsc);
+        u64 stime = t->stime_local_stamp;
+        if ( is_hvm_domain(d) )
+        {
+            struct pl_time *pl = &v->domain->arch.hvm_domain.pl_time;
+            stime += pl->stime_offset + v->arch.hvm_vcpu.stime_offset;
+        }
+        tsc_stamp = gtime_to_gtsc(d, stime);
     }
     else
     {
@@ -828,6 +833,8 @@
         _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
         _u.tsc_shift         = (s8)t->tsc_scale.shift;
     }
+    if ( is_hvm_domain(d) )
+        _u.tsc_timestamp += v->arch.hvm_vcpu.cache_tsc_offset;
 
     /* Don't bother unless timestamp record has changed or we are forced. */
     _u.version = u->version; /* make versions match for memcmp test */
@@ -1591,11 +1598,17 @@
  * PV SoftTSC Emulation.
  */
 
+u64 gtime_to_gtsc(struct domain *d, u64 tsc)
+{
+    if ( !is_hvm_domain(d) )
+        tsc = max_t(s64, tsc - d->arch.vtsc_offset, 0);
+    return scale_delta(tsc, &d->arch.ns_to_vtsc);
+}
+
 void pv_soft_rdtsc(struct vcpu *v, struct cpu_user_regs *regs, int rdtscp)
 {
     s_time_t now = get_s_time();
     struct domain *d = v->domain;
-    u64 delta;
 
     spin_lock(&d->arch.vtsc_lock);
 
@@ -1611,8 +1624,7 @@
 
     spin_unlock(&d->arch.vtsc_lock);
 
-    delta = max_t(s64, now - d->arch.vtsc_offset, 0);
-    now = scale_delta(delta, &d->arch.ns_to_vtsc);
+    now = gtime_to_gtsc(d, now);
 
     regs->eax = (uint32_t)now;
     regs->edx = (uint32_t)(now >> 32);
@@ -1753,8 +1765,10 @@
         d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
         d->arch.tsc_khz = gtsc_khz ? gtsc_khz : cpu_khz;
         set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
-        /* use native TSC if initial host has safe TSC and not migrated yet */
-        if ( host_tsc_is_safe() && incarnation == 0 )
+        /* use native TSC if initial host has safe TSC, has not migrated
+         * yet and tsc_khz == cpu_khz */
+        if ( host_tsc_is_safe() && incarnation == 0 &&
+                d->arch.tsc_khz == cpu_khz )
             d->arch.vtsc = 0;
         else 
             d->arch.ns_to_vtsc = scale_reciprocal(d->arch.vtsc_to_ns);
@@ -1779,7 +1793,7 @@
     }
     d->arch.incarnation = incarnation + 1;
     if ( is_hvm_domain(d) )
-        hvm_set_rdtsc_exiting(d, d->arch.vtsc || hvm_gtsc_need_scale(d));
+        hvm_set_rdtsc_exiting(d, d->arch.vtsc);
 }
 
 /* vtsc may incur measurable performance degradation, diagnose with this */
diff -r d0420ab97345 xen/common/kernel.c
--- a/xen/common/kernel.c	Fri May 21 16:21:39 2010 +0100
+++ b/xen/common/kernel.c	Fri May 21 17:28:17 2010 +0100
@@ -259,6 +259,8 @@
                 fi.submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) |
                              (1U << XENFEAT_highmem_assist) |
                              (1U << XENFEAT_gnttab_map_avail_bits);
+            else
+                fi.submap |= (1U << XENFEAT_hvm_safe_pvclock);
 #endif
             break;
         default:
diff -r d0420ab97345 xen/include/asm-x86/hvm/domain.h
--- a/xen/include/asm-x86/hvm/domain.h	Fri May 21 16:21:39 2010 +0100
+++ b/xen/include/asm-x86/hvm/domain.h	Fri May 21 17:28:17 2010 +0100
@@ -45,8 +45,6 @@
     struct hvm_ioreq_page  ioreq;
     struct hvm_ioreq_page  buf_ioreq;
 
-    uint32_t               gtsc_khz; /* kHz */
-    bool_t                 tsc_scaled;
     struct pl_time         pl_time;
 
     struct hvm_io_handler  io_handler;
diff -r d0420ab97345 xen/include/asm-x86/hvm/hvm.h
--- a/xen/include/asm-x86/hvm/hvm.h	Fri May 21 16:21:39 2010 +0100
+++ b/xen/include/asm-x86/hvm/hvm.h	Fri May 21 17:28:17 2010 +0100
@@ -296,7 +296,6 @@
 uint8_t hvm_combine_hw_exceptions(uint8_t vec1, uint8_t vec2);
 
 void hvm_set_rdtsc_exiting(struct domain *d, bool_t enable);
-int hvm_gtsc_need_scale(struct domain *d);
 
 static inline int hvm_cpu_up(void)
 {
diff -r d0420ab97345 xen/include/asm-x86/time.h
--- a/xen/include/asm-x86/time.h	Fri May 21 16:21:39 2010 +0100
+++ b/xen/include/asm-x86/time.h	Fri May 21 17:28:17 2010 +0100
@@ -57,6 +57,7 @@
 uint64_t ns_to_acpi_pm_tick(uint64_t ns);
 
 void pv_soft_rdtsc(struct vcpu *v, struct cpu_user_regs *regs, int rdtscp);
+u64 gtime_to_gtsc(struct domain *d, u64 tsc);
 
 void tsc_set_info(struct domain *d, uint32_t tsc_mode, uint64_t elapsed_nsec,
                   uint32_t gtsc_khz, uint32_t incarnation);
diff -r d0420ab97345 xen/include/public/features.h
--- a/xen/include/public/features.h	Fri May 21 16:21:39 2010 +0100
+++ b/xen/include/public/features.h	Fri May 21 17:28:17 2010 +0100
@@ -68,6 +68,9 @@
  */
 #define XENFEAT_gnttab_map_avail_bits      7
 
+/* x86: pvclock algorithm is safe to use on HVM */
+#define XENFEAT_hvm_safe_pvclock           9
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
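
Taken together, the patch folds the HVM-only TSC scaling path
(hvm_h2g_scale_tsc(), gtsc_khz, tsc_scaled) into the common vtsc machinery:
the guest TSC frequency now lives in d->arch.tsc_khz for PV and HVM alike,
gtime_to_gtsc() performs the nanoseconds-to-ticks conversion in both cases,
and rdtsc exiting is enabled exactly when d->arch.vtsc is set. A guest whose
requested frequency differs from cpu_khz simply stays in emulated (vtsc)
mode rather than taking a separate HVM scaling path, and HVM guests are told
via XENFEAT_hvm_safe_pvclock that the pvclock algorithm is safe to use.

As a rough illustration of the conversion itself, here is a standalone
sketch of the fixed-point scaling that scale_delta() applies inside
gtime_to_gtsc(). The mul_frac/shift encoding is assumed from the pvclock
convention (value * mul_frac / 2^32, pre-shifted by 'shift'); it is not
copied from Xen's actual helpers:

    #include <stdint.h>
    #include <stdio.h>

    struct time_scale { int shift; uint32_t mul_frac; };

    /* Scale delta by mul_frac / 2^32 after shifting left by 'shift'
     * (a negative shift means shift right). The multiply needs a 96-bit
     * intermediate; unsigned __int128 is a GCC/Clang extension. */
    static uint64_t scale_delta(uint64_t delta, const struct time_scale *s)
    {
        if (s->shift < 0)
            delta >>= -s->shift;
        else
            delta <<= s->shift;
        return (uint64_t)(((unsigned __int128)delta * s->mul_frac) >> 32);
    }

    int main(void)
    {
        /* A 2.4 GHz guest: 2.4 ticks per ns, encoded as (x << 2) * 0.6,
         * i.e. shift = 2, mul_frac = 0.6 * 2^32. */
        struct time_scale ns_to_vtsc = {
            .shift = 2, .mul_frac = (uint32_t)(0.6 * 4294967296.0)
        };
        uint64_t guest_ns = 1000000000ULL;  /* one second of guest time */
        printf("%llu ticks\n",
               (unsigned long long)scale_delta(guest_ns, &ns_to_vtsc));
        /* prints ~2400000000 (within rounding): one second at 2.4 GHz */
        return 0;
    }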



* Re: Problem running latest Xen unstable hypervisor with latest Linux kernels
From: Dave McCracken @ 2010-05-21 23:22 UTC
  To: Keir Fraser; +Cc: Jeremy Fitzhardinge, Xen Developers List, Stefano Stabellini

On Friday, May 21, 2010, Keir Fraser wrote:
> > I have a test box set up to run mainline Xen unstable hypervisor and a
> > guest running the latest Linux builds from Jeremy.  Last week it all
> > worked fine. This week when I pulled the latest unstable it stopped
> > working.
> 
> This is a hypervisor bug. I would have bet that my recent smpboot/hotplug
> changes might have caused it, but happily I can report that I am not to
> blame. The offending changeset is 21339:804304d4e05d "x86: TSC handling
> cleanups" by Stefano (cc'ed). The automated regression tests have also been
> failing ever since it went in, so clearly PV guest startup failure is not
> hard to repro with this patch applied. For now I have reverted it
> (xen-unstable:21444), so please re-pull, re-build, re-test.

The guest now boots just fine.  Thanks.

Dave
