All of lore.kernel.org
 help / color / mirror / Atom feed
* DomU crash during migration when suspending source domain
@ 2007-02-14  3:42 Graham, Simon
  2007-02-14 10:13 ` Keir Fraser
  0 siblings, 1 reply; 10+ messages in thread
From: Graham, Simon @ 2007-02-14  3:42 UTC (permalink / raw)
  To: xen-devel

Just run into an odd DomU crash doing live migration of a 4-VCPU domain (with 3.0.4 but the code looks the same in 2.6.18/unstable to me) - the actual panic is attached at the end of this, but the bottom line is that the code in cache_remove_shared_cpu_map (in arch/i385/kernel/cpu/intel_cacheinfo.c) is attempting to clean up the cache info for a processor that does not yet have this info setup - the code is dereferencing a pointer in the cpuid4_info[] array and looking at the dump I can see that this is NULL.

My working theory here is that we attempted the migration waaay early and the initialization of the array of cache info pointers was not setup for all processors yet; it would be relatively easy to protect against this by checking for NULL, but I'm not sure if this is the correct solution or not -- if anyone is familiar with this code and can comment on an appropriate fix I'd be grateful.

One thing I'm really not sure about is the timing of marking the CPUs up with respect to the trace re initializing CPUs (see console output below) -- I can see that the four VCPUs are setup in the cpu_sys_devices array (which is setup by the code that outputs the 'Initializing CPU#n' trace) but the array of cache info structures only has an entry for VCPU 0:

crash> cpu_sys_devices
cpu_sys_devices = $3 =
 {0xc0464448, 0xc046448c, 0xc04644d0, 0xc0464514, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0}

crash> cpuid4_info
cpuid4_info = $4 =
 {0xc7971180, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}

Any suggestions for appropriate fixes here?
Simon

--- console output ---

Enabling SMP...
Initializing CPU#3
Initializing CPU#2
Initializing CPU#1
eth0: no IPv6 routers present
Unable to handle kernel NULL pointer dereference at virtual address 00000010
 printing eip:
c010dd3a
0204a000 -> *pde = 00000001:0d8ec001
06a9c000 -> *pme = 00000000:00000000
Oops: 0000 [#1]
SMP 
Modules linked in: ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core binfmt_misc dm_mirror dm_mod bnx2 ata_piix libata mptscsih mptfc mptspi mptsas mptscsi scsi_mod mptbase
CPU:    0
EIP:    0061:[<c010dd3a>]    Tainted: GF    VLI
EFLAGS: 00010202  (2.6.16.29-xen #1) 
EIP is at cache_remove_shared_cpu_map+0x1a/0x90
eax: 00000000  ebx: 00000001  ecx: 00000001  edx: 00000000
esi: 00000000  edi: 00000010  ebp: c3913f14  esp: c3913f08
ds: 007b  es: 007b  ss: 0069
Process suspend (pid: 4038, threadinfo=c3912000 task=c2244570)
Stack: <0>00000001 00000001 00000000 c3913f28 c010e3ba 00000007 00000001 00000007 
      c3913f34 c010e425 c03bd804 c3913f48 c012fae8 ffffffea 00000001 c568c570 
      c3913f7c c013b889 c3913fc0 00000002 00000001 00000000 00000003 00000000 
Call Trace:
 [<c0105401>] show_stack_log_lvl+0xa1/0xe0
 [<c01055f1>] show_registers+0x181/0x200
 [<c0105810>] die+0x100/0x1a0
 [<c01156f6>] do_page_fault+0x3c6/0x8b1
 [<c0105067>] error_code+0x2b/0x30
 [<c010e3ba>] cache_remove_dev+0x2a/0x60
 [<c010e425>] cacheinfo_cpu_callback+0x35/0x40
 [<c012fae8>] notifier_call_chain+0x18/0x40
 [<c013b889>] cpu_down+0x139/0x260
 [<c028bc9f>] smp_suspend+0x7f/0x100
 [<c028ca80>] __do_suspend+0x40/0x180
 [<c0136a06>] kthread+0x96/0xe0
 [<c0102e95>] kernel_thread_helper+0x5/0x10
Code: 0c 5b 5e 5f 5d c3 8d 74 26 00 8d bc 27 00 00 00 00 55 89 e5 57 56 89 d6 53 89 c3 8d 04 92 8b 14 9d 20 4d 46 c0 8d 04 82 8d 78 10 <8b> 40 10 ba 20 00 00 00 85 c0 74 03 0f bc d0 83 fa 21 b9 20 00 

-and-

crash> bt
PID: 4038   TASK: c2244570  CPU: 0   COMMAND: "suspend"
 #0 [c3913ddc] xen_panic_event at c010a527
 #1 [c3913df8] notifier_call_chain at c012fae6
 #2 [c3913e0c] panic at c0120b16
 #3 [c3913e20] die at c0105866
 #4 [c3913e6c] do_page_fault at c01156f1
 #5 [c3913ed0] error_code (via page_fault) at c0105065
    EAX: 00000000  EBX: 00000001  ECX: 00000001  EDX: 00000000  EBP: c3913f14
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000010
    CS:  0061      EIP: c010dd3a  ERR: ffffffff  EFLAGS: 00010202
 #6 [c3913f04] cache_remove_shared_cpu_map at c010dd3a
 #7 [c3913f18] cache_remove_dev at c010e3b5
 #8 [c3913f2c] cacheinfo_cpu_callback at c010e420
 #9 [c3913f38] notifier_call_chain at c012fae6
#10 [c3913f4c] cpu_down at c013b884
#11 [c3913f80] smp_suspend at c028bc9a
#12 [c3913f98] __do_suspend at c028ca7b
#13 [c3913fc4] kthread at c0136a03
#14 [c3913fe8] kernel_thread_helper at c0102e93
crash>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* RE: DomU crash during migration when suspending source domain
@ 2007-02-14 13:02 Graham, Simon
  0 siblings, 0 replies; 10+ messages in thread
From: Graham, Simon @ 2007-02-14 13:02 UTC (permalink / raw)
  To: Keir Fraser, xen-devel

> 
> Is this with a 2.6.16 guest from 3.0.4? This would most likely be a
CPU
> hotplug issue in Linux, but we did so lots of testing of that...
> 
>  -- Keir
>

Yes -- in fact, we're using the 3.0.4 kernel for both Dom0 and DomU.
I've some more info from failures last night:

1. I was wrong; it isn't a way early migration; it's a 2nd migration -
it seems that
   the first migration leaves the system in a bad way
2. It is only happening on specific machines - interestingly enough, in
every case, we
   migrate between dual core Xeon and quad core E5310 systems; these
systems have different
   cache hierarchies I think; maybe the CPU hotplug code cant handle a
change in the
   processor type?

Simon
 

^ permalink raw reply	[flat|nested] 10+ messages in thread
* RE: DomU crash during migration when suspending source domain
@ 2007-02-15  2:47 Graham, Simon
  2007-02-15  8:12 ` Keir Fraser
  0 siblings, 1 reply; 10+ messages in thread
From: Graham, Simon @ 2007-02-15  2:47 UTC (permalink / raw)
  To: Keir Fraser, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1314 bytes --]

> A broader set of fixes might get accepted upstream because
> cache_add_dev()
> can fail for other reasons too (at least out-of-memory) and any such
> failure
> will cause cache_remove_dev() to barf. But it's not such a simple
thing
> to
> fix and it does not solve the general problem for us.

OK -- after some work today, I have a patch that would perhaps be
acceptable upstream -- I started off looking into ways of ensuring the
information on the cache got reset properly after migration but quickly
decided that this was not going to be at all easy (plus it wouldn't
address the larger issues of processor compatibility), so instead I
concentrated on handling all the cases where pointers might be NULL
because of an allocation failure (which also happens to fix the crash we
see here).

Attached is a patch to intel_cacheinfo-xen.c in unstable -- I basically
found everywhere that you might have a NULL due to allocation failure
and tested for it; I've run this a little and confirmed that it solves
the original crash migrating back and forth between the systems with
different processor families. Will run more extensive regressions on it
tonight and tomorrow, but I thought I'd send it now for review; will
resend with signed-off-by and comment if you think it is OK.

Simon


[-- Attachment #2: xen-unstable-intel-cachinfo.patch --]
[-- Type: application/octet-stream, Size: 4888 bytes --]

Index: trunk/linux-2.6-xen-sparse/arch/i386/kernel/cpu/intel_cacheinfo-xen.c
===================================================================
--- trunk/linux-2.6-xen-sparse/arch/i386/kernel/cpu/intel_cacheinfo-xen.c	(revision 10421)
+++ trunk/linux-2.6-xen-sparse/arch/i386/kernel/cpu/intel_cacheinfo-xen.c	(working copy)
@@ -418,7 +418,7 @@
 
 /* pointer to _cpuid4_info array (for each cache leaf) */
 static struct _cpuid4_info *cpuid4_info[NR_CPUS];
-#define CPUID4_INFO_IDX(x,y)    (&((cpuid4_info[x])[y]))
+#define CPUID4_INFO_IDX(x,y)    (cpuid4_info[x]?(&((cpuid4_info[x])[y])):NULL)
 
 #ifdef CONFIG_SMP
 static void __cpuinit cache_shared_cpu_map_setup(unsigned int cpu, int index)
@@ -429,20 +429,23 @@
 	struct cpuinfo_x86 *c = cpu_data;
 
 	this_leaf = CPUID4_INFO_IDX(cpu, index);
-	num_threads_sharing = 1 + this_leaf->eax.split.num_threads_sharing;
+	if (this_leaf) {
+		num_threads_sharing = 1 + this_leaf->eax.split.num_threads_sharing;
 
-	if (num_threads_sharing == 1)
-		cpu_set(cpu, this_leaf->shared_cpu_map);
-	else {
-		index_msb = get_count_order(num_threads_sharing);
+		if (num_threads_sharing == 1)
+			cpu_set(cpu, this_leaf->shared_cpu_map);
+		else {
+			index_msb = get_count_order(num_threads_sharing);
 
-		for_each_online_cpu(i) {
-			if (c[i].apicid >> index_msb ==
-			    c[cpu].apicid >> index_msb) {
-				cpu_set(i, this_leaf->shared_cpu_map);
-				if (i != cpu && cpuid4_info[i])  {
-					sibling_leaf = CPUID4_INFO_IDX(i, index);
-					cpu_set(cpu, sibling_leaf->shared_cpu_map);
+			for_each_online_cpu(i) {
+				if (c[i].apicid >> index_msb ==
+				    c[cpu].apicid >> index_msb) {
+					cpu_set(i, this_leaf->shared_cpu_map);
+					if (i != cpu && cpuid4_info[i])  {
+						sibling_leaf = CPUID4_INFO_IDX(i, index);
+						if (sibling_leaf)
+							cpu_set(cpu, sibling_leaf->shared_cpu_map);
+					}
 				}
 			}
 		}
@@ -454,9 +457,11 @@
 	int sibling;
 
 	this_leaf = CPUID4_INFO_IDX(cpu, index);
-	for_each_cpu_mask(sibling, this_leaf->shared_cpu_map) {
-		sibling_leaf = CPUID4_INFO_IDX(sibling, index);	
-		cpu_clear(cpu, sibling_leaf->shared_cpu_map);
+	if (this_leaf) {
+		for_each_cpu_mask(sibling, this_leaf->shared_cpu_map) {
+			sibling_leaf = CPUID4_INFO_IDX(sibling, index);	
+			cpu_clear(cpu, sibling_leaf->shared_cpu_map);
+		}
 	}
 }
 #else
@@ -496,10 +501,12 @@
 	retval = 0;
 	for (j = 0; j < num_cache_leaves; j++) {
 		this_leaf = CPUID4_INFO_IDX(cpu, j);
-		retval = cpuid4_cache_lookup(j, this_leaf);
-		if (unlikely(retval < 0))
-			break;
-		cache_shared_cpu_map_setup(cpu, j);
+		if (this_leaf) {
+			retval = cpuid4_cache_lookup(j, this_leaf);
+			if (unlikely(retval < 0))
+				break;
+			cache_shared_cpu_map_setup(cpu, j);
+		}
 	}
 	set_cpus_allowed(current, oldmask);
 
@@ -527,7 +534,7 @@
 
 /* pointer to array of kobjects for cpuX/cache/indexY */
 static struct _index_kobject *index_kobject[NR_CPUS];
-#define INDEX_KOBJECT_PTR(x,y)    (&((index_kobject[x])[y]))
+#define INDEX_KOBJECT_PTR(x,y)    (index_kobject[x]?(&((index_kobject[x])[y])):NULL)
 
 #define show_one_plus(file_name, object, val)				\
 static ssize_t show_##file_name						\
@@ -608,12 +615,14 @@
 static ssize_t show(struct kobject * kobj, struct attribute * attr, char * buf)
 {
 	struct _cache_attr *fattr = to_attr(attr);
-	struct _index_kobject *this_leaf = to_object(kobj);
+	struct _index_kobject *this_leaf_obj = to_object(kobj);
+	struct _cpuid4_info *this_leaf = CPUID4_INFO_IDX(this_leaf_obj->cpu,
+							 this_leaf_obj->index);
+
 	ssize_t ret;
 
-	ret = fattr->show ?
-		fattr->show(CPUID4_INFO_IDX(this_leaf->cpu, this_leaf->index),
-			buf) :
+	ret = fattr->show && this_leaf?
+		fattr->show(this_leaf,buf) :
 	       	0;
 	return ret;
 }
@@ -696,6 +705,9 @@
 
 	for (i = 0; i < num_cache_leaves; i++) {
 		this_object = INDEX_KOBJECT_PTR(cpu,i);
+		if (!this_object)
+			continue;
+
 		this_object->cpu = cpu;
 		this_object->index = i;
 		this_object->kobj.parent = cache_kobject[cpu];
@@ -704,8 +716,10 @@
 		retval = kobject_register(&(this_object->kobj));
 		if (unlikely(retval)) {
 			for (j = 0; j < i; j++) {
-				kobject_unregister(
-					&(INDEX_KOBJECT_PTR(cpu,j)->kobj));
+				struct _index_kobject *robj = 
+					INDEX_KOBJECT_PTR(cpu,j);
+				if (robj)
+					kobject_unregister(&robj->kobj);
 			}
 			kobject_unregister(cache_kobject[cpu]);
 			cpuid4_cache_sysfs_exit(cpu);
@@ -719,12 +733,16 @@
 {
 	unsigned int cpu = sys_dev->id;
 	unsigned long i;
+	struct _index_kobject *this_object;
 
 	for (i = 0; i < num_cache_leaves; i++) {
 		cache_remove_shared_cpu_map(cpu, i);
-		kobject_unregister(&(INDEX_KOBJECT_PTR(cpu,i)->kobj));
+		this_object = INDEX_KOBJECT_PTR(cpu,i);
+		if (this_object)
+			kobject_unregister(&this_object->kobj);
 	}
-	kobject_unregister(cache_kobject[cpu]);
+	if (cache_kobject[cpu])
+		kobject_unregister(cache_kobject[cpu]);
 	cpuid4_cache_sysfs_exit(cpu);
 	return;
 }

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread
* RE: DomU crash during migration when suspending source domain
@ 2007-02-15 12:45 Graham, Simon
  2007-02-15 13:22 ` Keir Fraser
  2007-02-15 19:52 ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 10+ messages in thread
From: Graham, Simon @ 2007-02-15 12:45 UTC (permalink / raw)
  To: Keir Fraser, Keir Fraser, xen-devel

> The question is whether it has a chance of being accepted upstream.
> Since
> it's rather bigger than the one-liner we originally hoped for, we
don't
> want
> to carry it if it has no chance of acceptance.
> 

You have a better idea of that than I do - however, it seemed to me that
a complete fix for the cases where allocation failed would be more
likely to be accepted than something that only fixes one place.

Another thing to consider -- hitting allocation failures is MUCH more
likely in the Xen environment when live migration is in use; how often
is a processor taken down and then restarted outside of live migration?
However, since live migration can occur when the domain has been up for
a long time, it's much more likely for the allocation to fail (although
presumably still somewhat low)...

Excuse my ignorance but what's the process for pushing something like
this upstream?
/simgr

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-02-15 19:52 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-14  3:42 DomU crash during migration when suspending source domain Graham, Simon
2007-02-14 10:13 ` Keir Fraser
2007-02-14 10:36   ` Keir Fraser
2007-02-14 10:48     ` Keir Fraser
2007-02-14 13:02 Graham, Simon
2007-02-15  2:47 Graham, Simon
2007-02-15  8:12 ` Keir Fraser
2007-02-15 12:45 Graham, Simon
2007-02-15 13:22 ` Keir Fraser
2007-02-15 19:52 ` Jeremy Fitzhardinge

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.