* RE: DomU crash during migration when suspendingsource domain
@ 2007-02-14 13:57 Graham, Simon
  2007-02-14 14:35 ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Graham, Simon @ 2007-02-14 13:57 UTC
  To: Keir Fraser, xen-devel

> Are you migrating between unlike boxes? My guess is that the original box
> has processors supporting cacheinfo cpuid leaves and the target box does
> not. Migrating to older less-capable CPUs is definitely hit-and-miss I'm
> afraid. It really is best not to do it!
>

I think this is indeed what is happening -- supporting this is kind of
important for HA/FT - you need to be able to keep the domains running
when upgrading/replacing hardware.

I guess I'm still a tad confused, but presumably the CPU_DEAD processing
is not completely uninitializing the cache info. It seems to me that if
it discarded the cache info and NULLed the pointer in the CPU_DEAD
processing, then the info should get recreated when the CPU_ONLINE is done,
so presumably there is some path where this is not done when it should be.

I'll do some more digging and get back with a proposed fix.
Simon
 


* Re: DomU crash during migration when suspendingsource domain
  2007-02-14 13:57 DomU crash during migration when suspendingsource domain Graham, Simon
@ 2007-02-14 14:35 ` Keir Fraser
  0 siblings, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2007-02-14 14:35 UTC
  To: Graham, Simon, xen-devel

In general we *cannot* expect to support CPUs with different features in
CPUID. We plan to fix this in two ways:
 1. Allow a guest to be given a restricted CPUID view (e.g., with features
masked out, or cacheinfo leaves missing).
 2. Where a guest has been exposed to extended features and leaves, prevent
it from being migrated to a less-capable CPU.

A further option (3) for cache info might be to fake out the leaves for CPUs
that do not support them. But I'm not sure whether, for example, this would
be compatible with AMD's CPUID instruction.
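
As a rough illustration of options 1 and 2, here is a minimal sketch in C.
It assumes the guest-visible features can be summarised by the two feature
words of CPUID leaf 1; struct cpu_features, restrict_view() and
can_migrate() are hypothetical names made up for this example, not existing
Xen interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example feature bits from CPUID leaf 1, EDX. */
#define FEAT_EDX_PAE  (1u << 6)
#define FEAT_EDX_SSE  (1u << 25)
#define FEAT_EDX_SSE2 (1u << 26)

/* Hypothetical container for the feature words of CPUID leaf 1. */
struct cpu_features {
    uint32_t ecx;
    uint32_t edx;
};

/* Option 1: the guest sees only host features that the policy allows. */
struct cpu_features restrict_view(struct cpu_features host,
                                  struct cpu_features allowed)
{
    struct cpu_features view = { host.ecx & allowed.ecx,
                                 host.edx & allowed.edx };
    return view;
}

/*
 * Option 2: migration is acceptable only if every feature the guest has
 * been shown is also present on the target CPU.
 */
bool can_migrate(struct cpu_features guest_view, struct cpu_features target)
{
    return (guest_view.ecx & ~target.ecx) == 0 &&
           (guest_view.edx & ~target.edx) == 0;
}
```

With a check like this, a guest that was shown SSE2 would be refused on a
target without SSE2, while a guest started with a restricted view could
still move there.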

This issue is hardly specific to HA/FT. You can safely build yourself a
HA/FT cluster out of homogeneous hardware. Building it out of odds and ends
you have already is going to be hard or impossible to guarantee safety of in
general. I don't believe anyone sells or supports software to allow you to
do this, and there's a reason for that.

 -- Keir

On 14/2/07 13:57, "Graham, Simon" <Simon.Graham@stratus.com> wrote:

> I think this is indeed what is happening -- supporting this is kind of
> important for HA/FT - you need to be able to keep the domains running
> when upgrading/replacing hardware.
> 
> I guess I'm still a tad confused, but presumably the CPU_DEAD processing
> is not completely uninitializing the cache info. It seems to me that if
> it discarded the cache info and NULLed the pointer in the CPU_DEAD
> processing, then the info should get recreated when the CPU_ONLINE is done,
> so presumably there is some path where this is not done when it should be.
> 
> I'll do some more digging and get back with a proposed fix.


* Re: DomU crash during migration when suspendingsource domain
  2007-02-14 15:08 Graham, Simon
@ 2007-02-14 15:43 ` Keir Fraser
  0 siblings, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2007-02-14 15:43 UTC
  To: Graham, Simon, Keir Fraser, xen-devel

On 14/2/07 15:08, "Graham, Simon" <Simon.Graham@stratus.com> wrote:

> Let me try that out here and get back to you -- I can submit a patch
> with this specific fix in if it solves the problem.
> 
> Since, as you say, this is just one aspect of dealing with hot plugging
> completely different processors, I somehow feel that a point fix like
> this wouldn't be accepted upstream and instead we'd need to think about
> a more complete solution (If, indeed, this is feasible).

Possibly true. In fact I think if you fix that function then you're going to
die in kobject_unregister() instead. The loop in cache_remove_dev() is simply
bogus in your case since num_cache_leaves cannot be trusted.

A broader set of fixes might get accepted upstream because cache_add_dev()
can fail for other reasons too (at least out-of-memory) and any such failure
will cause cache_remove_dev() to barf. But it's not such a simple thing to
fix and it does not solve the general problem for us.
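
One possible shape for such a broader fix, sketched as a simplified
user-space model rather than the real sysfs code: record per CPU how many
leaves were actually registered, so that the remove path never relies on a
global count. All names here (model_cache_add_dev, model_cache_remove_dev,
leaves_added, the fail_at parameter) are hypothetical and exist only for
this example.

```c
#include <stddef.h>

#define MAX_CPUS   8
#define MAX_LEAVES 4

/* Hypothetical stand-in for a registered per-leaf object. */
struct leaf_obj {
    int registered;
};

static struct leaf_obj leaves[MAX_CPUS][MAX_LEAVES];
/* Leaves actually registered for each CPU, instead of one global count. */
static unsigned int leaves_added[MAX_CPUS];

/*
 * Register `want` leaves for `cpu`. `fail_at` simulates a failure (for
 * example out-of-memory) at that index; pass a value >= want for success.
 * On failure everything registered so far is undone. Returns 0 or -1.
 */
int model_cache_add_dev(unsigned int cpu, unsigned int want,
                        unsigned int fail_at)
{
    unsigned int i;

    if (cpu >= MAX_CPUS || want > MAX_LEAVES)
        return -1;

    for (i = 0; i < want; i++) {
        if (i == fail_at) {
            while (i-- > 0)
                leaves[cpu][i].registered = 0;
            leaves_added[cpu] = 0;
            return -1;
        }
        leaves[cpu][i].registered = 1;
    }
    leaves_added[cpu] = want;
    return 0;
}

/* Unregister only what was registered; returns how many leaves that was. */
unsigned int model_cache_remove_dev(unsigned int cpu)
{
    unsigned int i, n;

    if (cpu >= MAX_CPUS)
        return 0;

    n = leaves_added[cpu];
    for (i = 0; i < n; i++)
        leaves[cpu][i].registered = 0;
    leaves_added[cpu] = 0;
    return n;
}
```

In this model a failed add leaves the per-CPU count at zero, so a later
remove for that CPU is a harmless no-op.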

 -- Keir


* RE: DomU crash during migration when suspendingsource domain
  2007-02-14 14:43 Graham, Simon
  2007-02-14 14:56 ` Keir Fraser
@ 2007-02-14 15:15 ` Petersson, Mats
  1 sibling, 0 replies; 7+ messages in thread
From: Petersson, Mats @ 2007-02-14 15:15 UTC
  To: Graham, Simon, Keir Fraser, xen-devel

Simon,  

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com 
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of 
> Graham, Simon
> Sent: 14 February 2007 14:43
> To: Keir Fraser; xen-devel@lists.xensource.com
> Subject: RE: [Xen-devel] DomU crash during migration when 
> suspendingsource domain
> 
> 
> > In general we *cannot* expect to support CPUs with different features in
> > CPUID. We plan to fix this in two ways:
> >  1. Allow a guest to be given a restricted CPUID view (e.g., with features
> > masked out, or cacheinfo leaves missing).
> 
> Do you plan to do this for PV domains as well as HVM?

PV guests already have an "emulated" CPUID instruction (by prefixing the
regular CPUID so that it turns into "illegal opcode", "CPUID").

At present, there's not much filtering going on in that code, but it's
capable of filtering any and all CPUID functionality used by a PV guest.


HVM also has the same capability of filtering. 

Would it make sense to have a sparse array of CPUID leaves and masks for
the respective entries, or did you have some better idea?
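
To make the sparse-array idea concrete, here is a minimal sketch in C. It
assumes each table entry holds one leaf number plus AND-masks for the four
result registers, and that leaves without an entry pass through unchanged.
The type and function names (cpuid_regs, cpuid_mask, cpuid_filter) are
invented for this example and are not existing Xen code.

```c
#include <stddef.h>
#include <stdint.h>

/* The four registers returned by one CPUID invocation. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* One sparse-table entry: AND-masks applied to the raw result of a leaf. */
struct cpuid_mask {
    uint32_t leaf;
    struct cpuid_regs mask;
};

/*
 * Apply the first matching mask to the raw CPUID result. Leaves that have
 * no entry in the table are returned unmodified.
 */
struct cpuid_regs cpuid_filter(const struct cpuid_mask *table, size_t n,
                               uint32_t leaf, struct cpuid_regs raw)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (table[i].leaf == leaf) {
            raw.eax &= table[i].mask.eax;
            raw.ebx &= table[i].mask.ebx;
            raw.ecx &= table[i].mask.ecx;
            raw.edx &= table[i].mask.edx;
            break;
        }
    }
    return raw;
}
```

A table with an all-zero mask for a leaf hides that leaf completely, while
a mask with a single bit cleared hides just one feature.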

> 
> >  2. Where a guest has been exposed to extended features and leaves, prevent
> > it from being migrated to a less-capable CPU.
> > 
> 
> I guess I'm not quite sure I fully understand -- since we hot remove all
> the processors (but one - I guess that is an issue) and then hot add
> them again after migration, you would think it would be OK to hot add a
> completely different processor -- of course there will be issues with
> the Linux code given that you can't actually test this on a
> non-virtualized system.

The real problem with migrating to a "lesser" platform is things like:
Linux starts by determining which method is the best for calculating TCP
checksums, copying disk-blocks, etc, etc. Let's say that the kernel
decides to use SSE registers for this purpose. You then migrate this to
a processor that doesn't have SSE instructions... Fail Fail Fail. 

Other features of the same sort would be large pages (not supported by
Xen at present), PAE support, etc, etc. 

Or indeed, just knowing how many sets of cache information are available
to a particular CPU type. 
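
The pattern behind this can be sketched in a few lines of C. This is only a
toy model: csum_select(), csum_current() and the two checksum routines are
hypothetical, and the "SSE" variant is a plain stand-in that uses no SSE
instructions. The point is that the choice is made once and then kept.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*csum_fn)(const uint8_t *buf, size_t len);

/* Portable byte-sum, usable on any CPU. */
uint32_t csum_generic(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Stand-in for a routine that would use SSE registers in a real kernel. */
uint32_t csum_sse_standin(const uint8_t *buf, size_t len)
{
    return csum_generic(buf, len);
}

/* Selected once at boot and never revisited afterwards. */
static csum_fn active_csum;

void csum_select(int cpu_has_sse)
{
    active_csum = cpu_has_sse ? csum_sse_standin : csum_generic;
}

csum_fn csum_current(void)
{
    return active_csum;
}
```

Nothing re-runs csum_select() after a migration, so the SSE variant stays
in use even if the new processor cannot execute it.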
> 
> > A further option (3) for cache info might be to fake out the leaves for CPUs
> > that do not support them. But I'm not sure whether, for example, this would
> > be compatible with AMD's CPUID instruction.
> > 

I don't see anything wrong with this in general. AMD has cache-info in
the 80000xxx range of CPUID (for recent CPUs; older ones don't have
any cache info in the processor). So selectively faking that into
something saying "not available" (such as setting it to zero) would be
fine for those.

It gets interesting of course when moving from AMD to Intel or other way
around, as the code may "remember" that it's on one or t'other, and not
look in the right place for the info. 

--
Mats
> 
> Agreed.
> 
> > This issue is hardly specific to HA/FT. You can safely build yourself a
> > HA/FT cluster out of homogeneous hardware. Building it out of odds and ends
> > you have already is going to be hard or impossible to guarantee safety of in
> > general. I don't believe anyone sells or supports software to allow you to
> > do this, and there's a reason for that.
> 
> You misunderstand my point -- in an FT environment, you MUST be able to
> upgrade and repair hardware without taking the domain down -- clearly
> this would normally be to an equivalent or higher functionality system
> but we can't guarantee that there won't be a new spiffy processor that
> causes this same issue to arise or that we won't run into some similar
> issue when replacing faulty hardware (the original system might no
> longer be available for example).
> 
> Simon

* RE: DomU crash during migration when suspendingsource domain
@ 2007-02-14 15:08 Graham, Simon
  2007-02-14 15:43 ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Graham, Simon @ 2007-02-14 15:08 UTC
  To: Keir Fraser, xen-devel

> In this particular case it is quite arguable that
> cache_remove_shared_cpu_map() should check cpuid4_info[i]!=NULL, just as
> done in cache_shared_cpu_map_setup(). I can make this fix in our tree but
> something similar ought to be submitted upstream too. I'm pretty certain
> that this will fix your crash.
> 

Let me try that out here and get back to you -- I can submit a patch
with this specific fix in if it solves the problem. 

Since, as you say, this is just one aspect of dealing with hot plugging
completely different processors, I somehow feel that a point fix like
this wouldn't be accepted upstream and instead we'd need to think about
a more complete solution (If, indeed, this is feasible).

> 
> Upgrading upwards actually tends to be okay. I can't think of any practical
> examples of how that might fail. After all, worst case we can hide the extra
> features from the guest since we have some control over CPUID. *Downgrading*
> is the problem!

Understood... I can conceive of cases where this would not be true, but
I agree that Intel/AMD usually do a good job of ensuring backward
compatibility so we could hide the newer features until all systems have
the newer processors in place and you reboot the domains.

Simon


* Re: DomU crash during migration when suspendingsource domain
  2007-02-14 14:43 Graham, Simon
@ 2007-02-14 14:56 ` Keir Fraser
  2007-02-14 15:15 ` Petersson, Mats
  1 sibling, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2007-02-14 14:56 UTC
  To: Graham, Simon, Keir Fraser, xen-devel

On 14/2/07 14:43, "Graham, Simon" <Simon.Graham@stratus.com> wrote:

> Do you plan to do this for PV domains as well as HVM?

Yes, we have a special paravirtualised CPUID interface which Linux uses. So
this can be done.
 
> I guess I'm not quite sure I fully understand -- since we hot remove all
> the processors (but one - I guess that is an issue) and then hot add
> them again after migration, you would think it would be OK to hot add a
> completely different processor -- of course there will be issues with
> the Linux code given that you can't actually test this on a
> non-virtualized system.

You might indeed think that. Unfortunately code can depend on the fact that
all x86 systems (at least so far) have symmetric cache hierarchies. In the
case of this particular code, num_cache_leaves is latched during boot based
on CPU0's CPUID result. This value is then considered safe to use for all
CPUs forever more, which is not a good assumption in your case.

In this particular case it is quite arguable that
cache_remove_shared_cpu_map() should check cpuid4_info[i]!=NULL, just as
done in cache_shared_cpu_map_setup(). I can make this fix in our tree but
something similar ought to be submitted upstream too. I'm pretty certain
that this will fix your crash.
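
A simplified user-space model of that guard, for illustration only. It
assumes a per-CPU pointer that stays NULL when setup fails and a leaf count
latched once at boot; the names (cpuid4_info_model, num_cache_leaves_model,
model_latch_leaves, model_cache_setup, model_cache_teardown) are made up
for this sketch and the code is not the kernel's implementation.

```c
#include <stdlib.h>

#define MAX_CPUS 4

struct cache_leaf_info {
    unsigned int shared_cpu_map;
};

/* Per-CPU array of leaf descriptors; NULL if setup failed or never ran. */
static struct cache_leaf_info *cpuid4_info_model[MAX_CPUS];
/* Latched once (from CPU0 at boot) and then trusted for every CPU. */
static unsigned int num_cache_leaves_model;

void model_latch_leaves(unsigned int n)
{
    num_cache_leaves_model = n;
}

/* Returns 0 on success, -1 if this CPU cannot provide cache leaves. */
int model_cache_setup(unsigned int cpu, int cpu_has_cache_leaves)
{
    if (cpu >= MAX_CPUS || !cpu_has_cache_leaves ||
        num_cache_leaves_model == 0)
        return -1;  /* pointer stays NULL, as after a failed lookup */

    cpuid4_info_model[cpu] = calloc(num_cache_leaves_model,
                                    sizeof(*cpuid4_info_model[cpu]));
    return cpuid4_info_model[cpu] ? 0 : -1;
}

/* Returns the number of leaves torn down; 0 if there was nothing set up. */
unsigned int model_cache_teardown(unsigned int cpu)
{
    unsigned int i, cleared = 0;

    /* The guard under discussion: never walk a NULL per-CPU array. */
    if (cpu >= MAX_CPUS || cpuid4_info_model[cpu] == NULL)
        return 0;

    for (i = 0; i < num_cache_leaves_model; i++) {
        cpuid4_info_model[cpu][i].shared_cpu_map = 0;
        cleared++;
    }
    free(cpuid4_info_model[cpu]);
    cpuid4_info_model[cpu] = NULL;
    return cleared;
}
```

Without the NULL test, teardown on a CPU whose setup failed would
dereference a NULL pointer as soon as the loop runs, because the latched
count is non-zero.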

> You misunderstand my point -- in an FT environment, you MUST be able to
> upgrade and repair hardware without taking the domain down -- clearly
> this would normally be to an equivalent or higher functionality system
> but we can't guarantee that there won't be a new spiffy processor that
> causes this same issue to arise or that we won't run into some similar
> issue when replacing faulty hardware (the original system might no
> longer be available for example).

Upgrading upwards actually tends to be okay. I can't think of any practical
examples of how that might fail. After all, worst case we can hide the extra
features from the guest since we have some control over CPUID. *Downgrading*
is the problem!

 -- Keir


* RE: DomU crash during migration when suspendingsource domain
@ 2007-02-14 14:43 Graham, Simon
  2007-02-14 14:56 ` Keir Fraser
  2007-02-14 15:15 ` Petersson, Mats
  0 siblings, 2 replies; 7+ messages in thread
From: Graham, Simon @ 2007-02-14 14:43 UTC
  To: Keir Fraser, xen-devel


> In general we *cannot* expect to support CPUs with different features in
> CPUID. We plan to fix this in two ways:
>  1. Allow a guest to be given a restricted CPUID view (e.g., with features
> masked out, or cacheinfo leaves missing).

Do you plan to do this for PV domains as well as HVM?

>  2. Where a guest has been exposed to extended features and leaves, prevent
> it from being migrated to a less-capable CPU.
> 

I guess I'm not quite sure I fully understand -- since we hot remove all
the processors (but one - I guess that is an issue) and then hot add
them again after migration, you would think it would be OK to hot add a
completely different processor -- of course there will be issues with
the Linux code given that you can't actually test this on a
non-virtualized system.

> A further option (3) for cache info might be to fake out the leaves for CPUs
> that do not support them. But I'm not sure whether, for example, this would
> be compatible with AMD's CPUID instruction.
> 

Agreed.

> This issue is hardly specific to HA/FT. You can safely build yourself a
> HA/FT cluster out of homogeneous hardware. Building it out of odds and ends
> you have already is going to be hard or impossible to guarantee safety of in
> general. I don't believe anyone sells or supports software to allow you to
> do this, and there's a reason for that.

You misunderstand my point -- in an FT environment, you MUST be able to
upgrade and repair hardware without taking the domain down -- clearly
this would normally be to an equivalent or higher functionality system
but we can't guarantee that there won't be a new spiffy processor that
causes this same issue to arise or that we won't run into some similar
issue when replacing faulty hardware (the original system might no
longer be available for example).

Simon


end of thread, other threads:[~2007-02-14 15:43 UTC | newest]

Thread overview: 7+ messages
2007-02-14 13:57 DomU crash during migration when suspendingsource domain Graham, Simon
2007-02-14 14:35 ` Keir Fraser
2007-02-14 14:43 Graham, Simon
2007-02-14 14:56 ` Keir Fraser
2007-02-14 15:15 ` Petersson, Mats
2007-02-14 15:08 Graham, Simon
2007-02-14 15:43 ` Keir Fraser
