* HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
From: Gianni Tedesco @ 2010-07-07 18:42 UTC
To: Xen Devel
Hi,
I've spent a few weeks investigating a highly reproducible guest hang
which appears to affect all hypervisors from at least 3.4.2 through 4.0
to unstable.
To reproduce, set up a RHEL5.2 guest for a kickstart network install,
something like this:
vmlinuz ks=nfs:1.2.3.4:ks-rhel52.cfg ksdevice=eth0 console=tty0
console=ttyS0,9600n8 serial initrd=initrd.img root=/dev/ram0
With a vm profile something like this:
kernel = "/usr/lib/xen/boot/hvmloader"
builder = 'hvm'
memory = 128
name = "RHEL5.2-ks"
vcpus = 2
vif = [ 'type=ioemu,bridge=xenbr0,mac=00:26:b9:87:0e:d3' ]
disk = [ 'phy:/dev/sdb1,hda,w' ]
device_model = '/usr/lib/xen/bin/qemu-dm'
As long as vcpus > 1 the guest repeatedly hangs after userspace has
started. In all cases xen-hvmctx reports that the kernel is spinning in
cpu_idle() as if waiting for an interrupt, with EFLAGS.IF = 1. If a key
is pressed, either on the keyboard or via serial, the system unhangs
itself. The system still responds to network traffic (e.g. ping) during
this time, but that doesn't unhang it.
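For reference, the EFLAGS.IF = 1 claim can be checked against the rflags
value 0x246 that appears in the dumps. This is a minimal Python sketch
that assumes only the standard x86 flag bit positions; the helper name
decode_rflags is made up for illustration and is not part of any Xen tool.

```python
# Standard x86 status/control flag positions in (R)FLAGS.
RFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
    8: "TF", 9: "IF", 10: "DF", 11: "OF",
}

def decode_rflags(value):
    """Return the names of the flags set in an rflags value, lowest bit first."""
    return [name for bit, name in sorted(RFLAGS_BITS.items())
            if value & (1 << bit)]

flags = decode_rflags(0x246)
print(flags)           # ['PF', 'ZF', 'IF']
print("IF" in flags)   # True: interrupts are enabled on this VCPU
```

Bit 1 of rflags is reserved and always reads as 1, which accounts for the
remaining 0x2 in 0x246.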
I have ruled out all the usual suspects: timer modes, vpt_align, guest
kernel clock sources, acpi, hpet, hap and oos. With a little debugging I
was able to show that timer IRQs from HPET as well as RESCHED IPIs
were still getting delivered during the hangs. Full ctx dump follows.
Help! :)
HVM save record for domain 94
Entry 0: type 1 instance 0, length 24
Header: magic 0x54381286, version 1
Xen changeset 0
CPUID[0][%eax] 0x000106a5
gtsc_khz 2666735
Entry 1: type 2 instance 0, length 1024
CPU: rax 0x0000000000000000 rbx 0xffffffff8006ad3b
rcx 0x0000000000000000 rdx 0x0000000000000000
rbp 0x0000000000030000 rsi 0x0000000000000001
rdi 0xffffffff802e5658 rsp 0xffffffff803cff90
r8 0xffffffff803ce000 r9 0x000000000000003e
r10 0xffff8100070a0038 r11 0xffff81000769f7a0
r12 0x0000000000000000 r13 0x0000000000000000
r14 0x0000000000000000 r15 0x0000000000000000
rip 0xffffffff8006ad64 rflags 0x0000000000000246
cr0 0x000000008005003b cr2 0x000000000042cc00
cr3 0x0000000006bf9000 cr4 0x00000000000006e0
dr0 0x0000000000000000 dr1 0x0000000000000000
dr2 0x0000000000000000 dr3 0x0000000000000000
dr6 0x00000000ffff0ff0 dr7 0x0000000000000400
cs 0x00000010 (0x0000000000000000 + 0xffffffff / 0x00a9b)
ds 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
es 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
fs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
gs 0x00000000 (0xffffffff8039e000 + 0xffffffff / 0x00c00)
ss 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
tr 0x00000040 (0xffff810001033000 + 0x0000206f / 0x0008b)
ldtr 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
itdr (0xffffffff8041d000 + 0x00000fff)
gdtr (0xffffffff803d0000 + 0x00000080)
sysenter cs 0x00000010 eip 0xffffffff80061408 esp 0x0000000000000000
shadow gs 0x0000000000000000
MSR flags 0x0000000000000007 lstar 0xffffffff8005d098
star 0x0023001000000000 cstar 0xffffffff80061584
sfmask 0x0000000000003700 efer 0x0000000000000d01
tsc 0x0000001af07d0e03
event 0x00000000 error 0x00000000
FPU: fcw 0x037f fsw 0x0000
ftw 0x00 (0x00) fop 0x0000
fpuip 0x0000000000000000 fpudp 0x0000000000000000
mxcsr 0x00001fa0 mask 0x0000ffff
mm0 0x00000000000000000000 (0x000000000000)
mm1 0x00000000000000000000 (0x000000000000)
mm2 0x00000000000000000000 (0x000000000000)
mm3 0x00000000000000000000 (0x000000000000)
mm4 0x00000000000000000000 (0x000000000000)
mm5 0x00000000000000000000 (0x000000000000)
mm6 0x00000000000000000000 (0x000000000000)
mm7 0x00000000000000000000 (0x000000000000)
xmm00 0x00000000000000003fe333333f19999a
xmm01 0x00000000000000000000000040266666
xmm02 0x00000000000000000000000000000000
xmm03 0x00000000000000000000000000000000
xmm04 0x00000000000000000000000000000000
xmm05 0x00000000000000000000000000000000
xmm06 0x00000000000000000000000000000000
xmm07 0x00000000000000000000000000000000
xmm08 0x00000000000000000000000000000000
xmm09 0x00000000000000000000000000000000
xmm10 0x00000000000000000000000000000000
xmm11 0x00000000000000000000000000000000
xmm12 0x00000000000000000000000000000000
xmm13 0x00000000000000000000000000000000
xmm14 0x00000000000000000000000000000000
xmm15 0x00000000000000000000000000000000
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
Entry 2: type 2 instance 1, length 1024
CPU: rax 0x0000000000000000 rbx 0xffffffff8006ad3b
rcx 0x0000000000000000 rdx 0x0000000000000000
rbp 0x0000000000000001 rsi 0x0000000000000001
rdi 0xffffffff802e5658 rsp 0xffff81000708fef0
r8 0xffff81000708e000 r9 0x000000000000003f
r10 0xffff8100070a0008 r11 0xffff810006b5a200
r12 0x00000000000000ff r13 0xffffffff803a6080
r14 0x0000000000000100 r15 0xffffffff803c8280
rip 0xffffffff8006ad64 rflags 0x0000000000000246
cr0 0x000000008005003b cr2 0x0000000000866290
cr3 0x0000000000201000 cr4 0x00000000000006e0
dr0 0x0000000000000000 dr1 0x0000000000000000
dr2 0x0000000000000000 dr3 0x0000000000000000
dr6 0x00000000ffff0ff0 dr7 0x0000000000000400
cs 0x00000010 (0x0000000000000000 + 0xffffffff / 0x00a9b)
ds 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
es 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
fs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
gs 0x00000000 (0xffff810007080b40 + 0xffffffff / 0x00c00)
ss 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
tr 0x00000040 (0xffff81000103b580 + 0x0000206f / 0x0008b)
ldtr 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
itdr (0xffffffff8041d000 + 0x00000fff)
gdtr (0xffff810007085000 + 0x00000080)
sysenter cs 0x00000010 eip 0xffffffff80061408 esp 0x0000000000000000
shadow gs 0x0000000000000000
MSR flags 0x0000000000000007 lstar 0xffffffff8005d098
star 0x0023001000000000 cstar 0xffffffff80061584
sfmask 0x0000000000003700 efer 0x0000000000000d01
tsc 0x0000001af07d53b6
event 0x00000000 error 0x00000000
FPU: fcw 0x037f fsw 0x0000
ftw 0x00 (0x00) fop 0x0000
fpuip 0x0000000000000000 fpudp 0x0000000000000000
mxcsr 0x00001fa0 mask 0x0000ffff
mm0 0x00000000000000000000 (0x000000000000)
mm1 0x00000000000000000000 (0x000000000000)
mm2 0x00000000000000000000 (0x000000000000)
mm3 0x00000000000000000000 (0x000000000000)
mm4 0x00000000000000000000 (0x000000000000)
mm5 0x00000000000000000000 (0x000000000000)
mm6 0x00000000000000000000 (0x000000000000)
mm7 0x00000000000000000000 (0x000000000000)
xmm00 0x00000000000000003fe333333f19999a
xmm01 0x00000000000000000000000040266666
xmm02 0x00000000000000000000000000000000
xmm03 0x00000000000000000000000000000000
xmm04 0x00000000000000000000000000000000
xmm05 0x00000000000000000000000000000000
xmm06 0x00000000000000000000000000000000
xmm07 0x00000000000000000000000000000000
xmm08 0x00000000000000000000000000000000
xmm09 0x00000000000000000000000000000000
xmm10 0x00000000000000000000000000000000
xmm11 0x00000000000000000000000000000000
xmm12 0x00000000000000000000000000000000
xmm13 0x00000000000000000000000000000000
xmm14 0x00000000000000000000000000000000
xmm15 0x00000000000000000000000000000000
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
Entry 3: type 3 instance 0, length 8
PIC: IRQ base 0x20, irr 0x1, imr 0xfa, isr 0
init_state 0, priority_add 0, readsel_isr 0, poll 0
auto_eoi 1, rotate_on_auto_eoi 0
special_fully_nested_mode 0, special_mask_mode 0
is_master 1, elcr 0x24, int_output 0x1
Entry 4: type 3 instance 1, length 8
PIC: IRQ base 0x28, irr 0, imr 0xff, isr 0
init_state 0, priority_add 0, readsel_isr 0, poll 0
auto_eoi 0, rotate_on_auto_eoi 0
special_fully_nested_mode 0, special_mask_mode 0
is_master 0, elcr 0xc, int_output 0
Entry 5: type 4 instance 0, length 400
IOAPIC: base_address 0xfec00000, ioregsel 0x1c id 0x1
pin 00: 0x0000000000010000
pin 01: 0x0000000000000039
pin 02: 0x0000000000000031
pin 03: 0x0000000000000041
pin 04: 0x0000000000000049
pin 05: 0x000000000001a051
pin 06: 0x0000000000000059
pin 07: 0x0000000000000061
pin 08: 0x0000000000000069
pin 09: 0x0000000000000071
pin 10: 0x000000000001a079
pin 11: 0x000000000001a081
pin 12: 0x0000000000000089
pin 13: 0x0000000000000091
pin 14: 0x0000000000000099
pin 15: 0x00000000000000a1
pin 16: 0x0000000000010000
pin 17: 0x0000000000010000
pin 18: 0x0000000000010000
pin 19: 0x0000000000010000
pin 20: 0x0000000000010000
pin 21: 0x0000000000010000
pin 22: 0x0000000000010000
pin 23: 0x0000000000010000
pin 24: 0x0000000000010000
pin 25: 0x0000000000010000
pin 26: 0x0000000000010000
pin 27: 0x0000000000010000
pin 28: 0x0000000000010000
pin 29: 0x0000000000010000
pin 30: 0x0000000000010000
pin 31: 0x0000000000010000
pin 32: 0x0000000000010000
pin 33: 0x0000000000010000
pin 34: 0x0000000000010000
pin 35: 0x0000000000010000
pin 36: 0x0000000000010000
pin 37: 0x0000000000010000
pin 38: 0x0000000000010000
pin 39: 0x0000000000010000
pin 40: 0x0000000000010000
pin 41: 0x0000000000010000
pin 42: 0x0000000000010000
pin 43: 0x0000000000010000
pin 44: 0x0000000000010000
pin 45: 0x0000000000010000
pin 46: 0x0000000000010000
pin 47: 0x0000000000010000
Entry 6: type 5 instance 0, length 16
LAPIC: base_msr 0xfee00900, disabled 0, timer_divisor 0x10
Entry 7: type 5 instance 1, length 16
LAPIC: base_msr 0xfee00800, disabled 0, timer_divisor 0x10
Entry 8: type 6 instance 0, length 1024
LAPIC registers:
0x0000: 0x0000000000000000 0x0010: 0x0000000000000000
0x0020: 0x0000000000000000 0x0030: 0x0000000000050014
0x0040: 0x0000000000000000 0x0050: 0x0000000000000000
0x0060: 0x0000000000000000 0x0070: 0x0000000000000000
0x0080: 0x0000000000000000 0x0090: 0x0000000000000000
0x00a0: 0x0000000000000000 0x00b0: 0x0000000000000000
0x00c0: 0x0000000000000000 0x00d0: 0x0000000001000000
0x00e0: 0x00000000ffffffff 0x00f0: 0x00000000000001ff
0x0100: 0x0000000000000000 0x0110: 0x0000000000000000
0x0120: 0x0000000000000000 0x0130: 0x0000000000000000
0x0140: 0x0000000000000000 0x0150: 0x0000000000000000
0x0160: 0x0000000000000000 0x0170: 0x0000000000000000
0x0180: 0x0000000000000000 0x0190: 0x0000000000000000
0x01a0: 0x0000000000000000 0x01b0: 0x0000000000000000
0x01c0: 0x0000000000000000 0x01d0: 0x0000000000000000
0x01e0: 0x0000000000000000 0x01f0: 0x0000000000000000
0x0200: 0x0000000000000000 0x0210: 0x0000000000000000
0x0220: 0x0000000000000000 0x0230: 0x0000000000000000
0x0240: 0x0000000000000000 0x0250: 0x0000000000000000
0x0260: 0x0000000000000000 0x0270: 0x0000000000000000
0x0280: 0x0000000000000000 0x0290: 0x0000000000000000
0x02a0: 0x0000000000000000 0x02b0: 0x0000000000000000
0x02c0: 0x0000000000000000 0x02d0: 0x0000000000000000
0x02e0: 0x0000000000000000 0x02f0: 0x0000000000000000
0x0300: 0x00000000000000fc 0x0310: 0x0000000002000000
0x0320: 0x00000000000200ef 0x0330: 0x0000000000010000
0x0340: 0x0000000000010000 0x0350: 0x0000000000000400
0x0360: 0x0000000000000400 0x0370: 0x00000000000000fe
0x0380: 0x000000000000186a 0x0390: 0x0000000000000000
0x03a0: 0x0000000000000000 0x03b0: 0x0000000000000000
0x03c0: 0x0000000000000000 0x03d0: 0x0000000000000000
0x03e0: 0x0000000000000003 0x03f0: 0x0000000000000000
Entry 9: type 6 instance 1, length 1024
LAPIC registers:
0x0000: 0x0000000000000000 0x0010: 0x0000000000000000
0x0020: 0x0000000002000000 0x0030: 0x0000000000050014
0x0040: 0x0000000000000000 0x0050: 0x0000000000000000
0x0060: 0x0000000000000000 0x0070: 0x0000000000000000
0x0080: 0x0000000000000000 0x0090: 0x0000000000000000
0x00a0: 0x0000000000000000 0x00b0: 0x0000000000000000
0x00c0: 0x0000000000000000 0x00d0: 0x0000000002000000
0x00e0: 0x00000000ffffffff 0x00f0: 0x00000000000001ff
0x0100: 0x0000000000000000 0x0110: 0x0000000000000000
0x0120: 0x0000000000000000 0x0130: 0x0000000000000000
0x0140: 0x0000000000000000 0x0150: 0x0000000000000000
0x0160: 0x0000000000000000 0x0170: 0x0000000000000000
0x0180: 0x0000000000000000 0x0190: 0x0000000000000000
0x01a0: 0x0000000000000000 0x01b0: 0x0000000000000000
0x01c0: 0x0000000000000000 0x01d0: 0x0000000000000000
0x01e0: 0x0000000000000000 0x01f0: 0x0000000000000000
0x0200: 0x0000000000000000 0x0210: 0x0000000000000000
0x0220: 0x0000000000000000 0x0230: 0x0000000000000000
0x0240: 0x0000000000000000 0x0250: 0x0000000000000000
0x0260: 0x0000000000000000 0x0270: 0x0000000000000000
0x0280: 0x0000000000000000 0x0290: 0x0000000000000000
0x02a0: 0x0000000000000000 0x02b0: 0x0000000000000000
0x02c0: 0x0000000000000000 0x02d0: 0x0000000000000000
0x02e0: 0x0000000000000000 0x02f0: 0x0000000000000000
0x0300: 0x00000000000000fd 0x0310: 0x0000000000000000
0x0320: 0x00000000000200ef 0x0330: 0x0000000000010000
0x0340: 0x0000000000010000 0x0350: 0x0000000000000400
0x0360: 0x0000000000010400 0x0370: 0x00000000000000fe
0x0380: 0x000000000000186a 0x0390: 0x0000000000000000
0x03a0: 0x0000000000000000 0x03b0: 0x0000000000000000
0x03c0: 0x0000000000000000 0x03d0: 0x0000000000000000
0x03e0: 0x0000000000000003 0x03f0: 0x0000000000000000
Entry 10: type 7 instance 0, length 16
PCI IRQs: 0x00000000000000000000000000000000
Entry 11: type 8 instance 0, length 8
ISA IRQs: 0x0001
Entry 12: type 9 instance 0, length 8
PCI LINK: 5 10 11 5
Entry 13: type 10 instance 0, length 56
PIT: speaker off
ch 0: count 0x4a9, latched_count 0x4a5, count_latched 0
status 0, status_latched 0
rd_state 0x3, wr_state 0x3, wr_latch 0xa9, rw_mode 0x3
mode 0x2, bcd 0, gate 0x1
ch 1: count 0x10000, latched_count 0, count_latched 0
status 0, status_latched 0
rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
mode 0xff, bcd 0, gate 0x1
Entry 14: type 11 instance 0, length 16
RTC: regs 0x18 0x00 0x36 0x00 0x18 0x00 0x03 0x07
0x07 0x10 0x26 0x02 0x00 0x80, index 0x10
Entry 15: type 12 instance 0, length 1048
HPET: capability 0xf424008086a201 config 0
isr 0 counter 0xa1ad6b9c
timer0 config 0xf0000000000030 cmp 0
timer0 period 0 fsb 0
timer1 config 0xf0000000000030 cmp 0
timer1 period 0 fsb 0
timer2 config 0xf0000000000030 cmp 0
timer2 period 0 fsb 0
Entry 16: type 13 instance 0, length 8
ACPI PM: TMR_VAL 0x8fff, PM1a_STS 0x0, PM1a_EN 0x0
Entry 17: type 14 instance 0, length 240
MTRR: PAT 0x7040600070406, cap 0x508, default 0xc06
var 0 0x00000000f0000000 0x0000000ff8000800
var 1 0x00000000f8000000 0x0000000ffc000800
var 2 0x0000000000000000 0x0000000000000000
var 3 0x0000000000000000 0x0000000000000000
var 4 0x0000000000000000 0x0000000000000000
var 5 0x0000000000000000 0x0000000000000000
var 6 0x0000000000000000 0x0000000000000000
var 7 0x0000000000000000 0x0000000000000000
fixed 00 0x0606060606060606
fixed 01 0x0606060606060606
fixed 02 0x0101010101010101
fixed 03 0x0606060606060606
fixed 04 0x0606060606060606
fixed 05 0x0606060606060606
fixed 06 0x0606060606060606
fixed 07 0x0606060606060606
fixed 08 0x0606060606060606
fixed 09 0x0606060606060606
fixed 10 0x0606060606060606
Entry 18: type 14 instance 1, length 240
MTRR: PAT 0x7040600070406, cap 0x508, default 0xc06
var 0 0x00000000f0000000 0x0000000ff8000800
var 1 0x00000000f8000000 0x0000000ffc000800
var 2 0x0000000000000000 0x0000000000000000
var 3 0x0000000000000000 0x0000000000000000
var 4 0x0000000000000000 0x0000000000000000
var 5 0x0000000000000000 0x0000000000000000
var 6 0x0000000000000000 0x0000000000000000
var 7 0x0000000000000000 0x0000000000000000
fixed 00 0x0606060606060606
fixed 01 0x0606060606060606
fixed 02 0x0101010101010101
fixed 03 0x0606060606060606
fixed 04 0x0606060606060606
fixed 05 0x0606060606060606
fixed 06 0x0606060606060606
fixed 07 0x0606060606060606
fixed 08 0x0606060606060606
fixed 09 0x0606060606060606
fixed 10 0x0606060606060606
Entry 19: type 0 instance 0, length 0
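As a reading aid for the IOAPIC pin values in the dump above, here is a
small Python sketch that splits a redirection table entry into its
fields. It assumes the standard 82093AA-style entry layout (vector in
bits 0-7, mask in bit 16, and so on); decode_rte is a hypothetical helper
name, not something xen-hvmctx provides.

```python
def decode_rte(entry):
    """Split a 64-bit IOAPIC redirection table entry into named fields."""
    return {
        "vector": entry & 0xFF,
        "delivery_mode": (entry >> 8) & 0x7,
        "dest_mode": "logical" if entry & (1 << 11) else "physical",
        "polarity": "low" if entry & (1 << 13) else "high",
        "trigger": "level" if entry & (1 << 15) else "edge",
        "masked": bool(entry & (1 << 16)),
        "dest": (entry >> 56) & 0xFF,
    }

# A few pin values copied from the dump.
PINS = {0: 0x10000, 1: 0x39, 2: 0x31, 5: 0x1A051}

for pin, raw in sorted(PINS.items()):
    f = decode_rte(raw)
    print("pin %02d: vector 0x%02x %s/%s masked=%s"
          % (pin, f["vector"], f["trigger"], f["polarity"], f["masked"]))
```

Under that layout, pins 1 and 2 are unmasked edge-triggered entries,
while pin 5 (0x1a051) is a masked, level-triggered, active-low entry with
vector 0x51.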
[scara@habil xen-unstable.hg]$ sudo xen-hvmctx 94 | grep rip
rip 0xffffffff8006ad64 rflags 0x0000000000000246
rip 0xffffffff8006ad64 rflags 0x0000000000000246
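The grep above can be turned into a repeatable check. The following is a
rough Python sketch, assuming xen-hvmctx is on the PATH, is run with
sufficient privileges, and prints one "rip ... rflags ..." line per VCPU
as it does here. The function names are invented for this example. Note
that a healthy idle VCPU also sits at a fixed rip, so this only flags
candidates for a closer look.

```python
import re
import subprocess

RIP_RE = re.compile(r"rip\s+(0x[0-9a-fA-F]+)\s+rflags\s+(0x[0-9a-fA-F]+)")

def parse_vcpus(text):
    """Return a list of (rip, rflags) integer pairs, one per VCPU."""
    return [(int(rip, 16), int(rfl, 16)) for rip, rfl in RIP_RE.findall(text)]

def snapshot(domid):
    """Run xen-hvmctx for a domain and parse its output (needs a Xen host)."""
    out = subprocess.run(["xen-hvmctx", str(domid)],
                         capture_output=True, text=True, check=True).stdout
    return parse_vcpus(out)

def looks_idle_hung(before, after):
    """True if every VCPU kept the same rip and has EFLAGS.IF (bit 9) set."""
    return (len(before) == len(after)
            and all(b[0] == a[0] for b, a in zip(before, after))
            and all(rfl & (1 << 9) for _, rfl in after))

# The two lines printed by the grep above, used here as sample input.
SAMPLE = """rip 0xffffffff8006ad64 rflags 0x0000000000000246
rip 0xffffffff8006ad64 rflags 0x0000000000000246
"""

vcpus = parse_vcpus(SAMPLE)
print([hex(rip) for rip, _ in vcpus])
print(looks_idle_hung(vcpus, vcpus))
```

On a real host, two calls to snapshot(94) a few seconds apart would take
the place of the sample text.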
[scara@habil xen-unstable.hg]$ sudo xen-hvmctx 94
HVM save record for domain 94
Entry 0: type 1 instance 0, length 24
Header: magic 0x54381286, version 1
Xen changeset 0
CPUID[0][%eax] 0x000106a5
gtsc_khz 2666735
Entry 1: type 2 instance 0, length 1024
CPU: rax 0x0000000000000000 rbx 0xffffffff8006ad3b
rcx 0x0000000000000000 rdx 0x0000000000000000
rbp 0x0000000000030000 rsi 0x0000000000000001
rdi 0xffffffff802e5658 rsp 0xffffffff803cff90
r8 0xffffffff803ce000 r9 0x000000000000003e
r10 0xffff8100070a0038 r11 0xffff81000769f7a0
r12 0x0000000000000000 r13 0x0000000000000000
r14 0x0000000000000000 r15 0x0000000000000000
rip 0xffffffff8006ad64 rflags 0x0000000000000246
cr0 0x000000008005003b cr2 0x000000000042cc00
cr3 0x0000000006bf9000 cr4 0x00000000000006e0
dr0 0x0000000000000000 dr1 0x0000000000000000
dr2 0x0000000000000000 dr3 0x0000000000000000
dr6 0x00000000ffff0ff0 dr7 0x0000000000000400
cs 0x00000010 (0x0000000000000000 + 0xffffffff / 0x00a9b)
ds 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
es 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
fs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
gs 0x00000000 (0xffffffff8039e000 + 0xffffffff / 0x00c00)
ss 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
tr 0x00000040 (0xffff810001033000 + 0x0000206f / 0x0008b)
ldtr 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
itdr (0xffffffff8041d000 + 0x00000fff)
gdtr (0xffffffff803d0000 + 0x00000080)
sysenter cs 0x00000010 eip 0xffffffff80061408 esp 0x0000000000000000
shadow gs 0x0000000000000000
MSR flags 0x0000000000000007 lstar 0xffffffff8005d098
star 0x0023001000000000 cstar 0xffffffff80061584
sfmask 0x0000000000003700 efer 0x0000000000000d01
tsc 0x0000001ffb1191c5
event 0x00000000 error 0x00000000
FPU: fcw 0x037f fsw 0x0000
ftw 0x00 (0x00) fop 0x0000
fpuip 0x0000000000000000 fpudp 0x0000000000000000
mxcsr 0x00001fa0 mask 0x0000ffff
mm0 0x00000000000000000000 (0x000000000000)
mm1 0x00000000000000000000 (0x000000000000)
mm2 0x00000000000000000000 (0x000000000000)
mm3 0x00000000000000000000 (0x000000000000)
mm4 0x00000000000000000000 (0x000000000000)
mm5 0x00000000000000000000 (0x000000000000)
mm6 0x00000000000000000000 (0x000000000000)
mm7 0x00000000000000000000 (0x000000000000)
xmm00 0x00000000000000003fe333333f19999a
xmm01 0x00000000000000000000000040266666
xmm02 0x00000000000000000000000000000000
xmm03 0x00000000000000000000000000000000
xmm04 0x00000000000000000000000000000000
xmm05 0x00000000000000000000000000000000
xmm06 0x00000000000000000000000000000000
xmm07 0x00000000000000000000000000000000
xmm08 0x00000000000000000000000000000000
xmm09 0x00000000000000000000000000000000
xmm10 0x00000000000000000000000000000000
xmm11 0x00000000000000000000000000000000
xmm12 0x00000000000000000000000000000000
xmm13 0x00000000000000000000000000000000
xmm14 0x00000000000000000000000000000000
xmm15 0x00000000000000000000000000000000
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
Entry 2: type 2 instance 1, length 1024
CPU: rax 0x0000000000000000 rbx 0xffffffff8006ad3b
rcx 0x0000000000000000 rdx 0x0000000000000000
rbp 0x0000000000000001 rsi 0x0000000000000001
rdi 0xffffffff802e5658 rsp 0xffff81000708fef0
r8 0xffff81000708e000 r9 0x000000000000003f
r10 0xffff8100070a0008 r11 0xffff810006b5a480
r12 0x00000000000000ff r13 0xffffffff803a6080
r14 0x0000000000000100 r15 0xffffffff803c8280
rip 0xffffffff8006ad64 rflags 0x0000000000000246
cr0 0x000000008005003b cr2 0x0000000000866290
cr3 0x0000000000201000 cr4 0x00000000000006e0
dr0 0x0000000000000000 dr1 0x0000000000000000
dr2 0x0000000000000000 dr3 0x0000000000000000
dr6 0x00000000ffff0ff0 dr7 0x0000000000000400
cs 0x00000010 (0x0000000000000000 + 0xffffffff / 0x00a9b)
ds 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
es 0x00000018 (0x0000000000000000 + 0xffffffff / 0x00c93)
fs 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
gs 0x00000000 (0xffff810007080b40 + 0xffffffff / 0x00c00)
ss 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
tr 0x00000040 (0xffff81000103b580 + 0x0000206f / 0x0008b)
ldtr 0x00000000 (0x0000000000000000 + 0xffffffff / 0x00c00)
itdr (0xffffffff8041d000 + 0x00000fff)
gdtr (0xffff810007085000 + 0x00000080)
sysenter cs 0x00000010 eip 0xffffffff80061408 esp 0x0000000000000000
shadow gs 0x0000000000000000
MSR flags 0x0000000000000007 lstar 0xffffffff8005d098
star 0x0023001000000000 cstar 0xffffffff80061584
sfmask 0x0000000000003700 efer 0x0000000000000d01
tsc 0x0000001ffb11d380
event 0x00000000 error 0x00000000
FPU: fcw 0x037f fsw 0x0000
ftw 0x00 (0x00) fop 0x0000
fpuip 0x0000000000000000 fpudp 0x0000000000000000
mxcsr 0x00001fa0 mask 0x0000ffff
mm0 0x00000000000000000000 (0x000000000000)
mm1 0x00000000000000000000 (0x000000000000)
mm2 0x00000000000000000000 (0x000000000000)
mm3 0x00000000000000000000 (0x000000000000)
mm4 0x00000000000000000000 (0x000000000000)
mm5 0x00000000000000000000 (0x000000000000)
mm6 0x00000000000000000000 (0x000000000000)
mm7 0x00000000000000000000 (0x000000000000)
xmm00 0x00000000000000003fe333333f19999a
xmm01 0x00000000000000000000000040266666
xmm02 0x00000000000000000000000000000000
xmm03 0x00000000000000000000000000000000
xmm04 0x00000000000000000000000000000000
xmm05 0x00000000000000000000000000000000
xmm06 0x00000000000000000000000000000000
xmm07 0x00000000000000000000000000000000
xmm08 0x00000000000000000000000000000000
xmm09 0x00000000000000000000000000000000
xmm10 0x00000000000000000000000000000000
xmm11 0x00000000000000000000000000000000
xmm12 0x00000000000000000000000000000000
xmm13 0x00000000000000000000000000000000
xmm14 0x00000000000000000000000000000000
xmm15 0x00000000000000000000000000000000
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
(0x00000000000000000000000000000000)
Entry 3: type 3 instance 0, length 8
PIC: IRQ base 0x20, irr 0x1, imr 0xfa, isr 0
init_state 0, priority_add 0, readsel_isr 0, poll 0
auto_eoi 1, rotate_on_auto_eoi 0
special_fully_nested_mode 0, special_mask_mode 0
is_master 1, elcr 0x24, int_output 0x1
Entry 4: type 3 instance 1, length 8
PIC: IRQ base 0x28, irr 0, imr 0xff, isr 0
init_state 0, priority_add 0, readsel_isr 0, poll 0
auto_eoi 0, rotate_on_auto_eoi 0
special_fully_nested_mode 0, special_mask_mode 0
is_master 0, elcr 0xc, int_output 0
Entry 5: type 4 instance 0, length 400
IOAPIC: base_address 0xfec00000, ioregsel 0x1c id 0x1
pin 00: 0x0000000000010000
pin 01: 0x0000000000000039
pin 02: 0x0000000000000031
pin 03: 0x0000000000000041
pin 04: 0x0000000000000049
pin 05: 0x000000000001a051
pin 06: 0x0000000000000059
pin 07: 0x0000000000000061
pin 08: 0x0000000000000069
pin 09: 0x0000000000000071
pin 10: 0x000000000001a079
pin 11: 0x000000000001a081
pin 12: 0x0000000000000089
pin 13: 0x0000000000000091
pin 14: 0x0000000000000099
pin 15: 0x00000000000000a1
pin 16: 0x0000000000010000
pin 17: 0x0000000000010000
pin 18: 0x0000000000010000
pin 19: 0x0000000000010000
pin 20: 0x0000000000010000
pin 21: 0x0000000000010000
pin 22: 0x0000000000010000
pin 23: 0x0000000000010000
pin 24: 0x0000000000010000
pin 25: 0x0000000000010000
pin 26: 0x0000000000010000
pin 27: 0x0000000000010000
pin 28: 0x0000000000010000
pin 29: 0x0000000000010000
pin 30: 0x0000000000010000
pin 31: 0x0000000000010000
pin 32: 0x0000000000010000
pin 33: 0x0000000000010000
pin 34: 0x0000000000010000
pin 35: 0x0000000000010000
pin 36: 0x0000000000010000
pin 37: 0x0000000000010000
pin 38: 0x0000000000010000
pin 39: 0x0000000000010000
pin 40: 0x0000000000010000
pin 41: 0x0000000000010000
pin 42: 0x0000000000010000
pin 43: 0x0000000000010000
pin 44: 0x0000000000010000
pin 45: 0x0000000000010000
pin 46: 0x0000000000010000
pin 47: 0x0000000000010000
Entry 6: type 5 instance 0, length 16
LAPIC: base_msr 0xfee00900, disabled 0, timer_divisor 0x10
Entry 7: type 5 instance 1, length 16
LAPIC: base_msr 0xfee00800, disabled 0, timer_divisor 0x10
Entry 8: type 6 instance 0, length 1024
LAPIC registers:
0x0000: 0x0000000000000000 0x0010: 0x0000000000000000
0x0020: 0x0000000000000000 0x0030: 0x0000000000050014
0x0040: 0x0000000000000000 0x0050: 0x0000000000000000
0x0060: 0x0000000000000000 0x0070: 0x0000000000000000
0x0080: 0x0000000000000000 0x0090: 0x0000000000000000
0x00a0: 0x0000000000000000 0x00b0: 0x0000000000000000
0x00c0: 0x0000000000000000 0x00d0: 0x0000000001000000
0x00e0: 0x00000000ffffffff 0x00f0: 0x00000000000001ff
0x0100: 0x0000000000000000 0x0110: 0x0000000000000000
0x0120: 0x0000000000000000 0x0130: 0x0000000000000000
0x0140: 0x0000000000000000 0x0150: 0x0000000000000000
0x0160: 0x0000000000000000 0x0170: 0x0000000000000000
0x0180: 0x0000000000000000 0x0190: 0x0000000000000000
0x01a0: 0x0000000000000000 0x01b0: 0x0000000000000000
0x01c0: 0x0000000000000000 0x01d0: 0x0000000000000000
0x01e0: 0x0000000000000000 0x01f0: 0x0000000000000000
0x0200: 0x0000000000000000 0x0210: 0x0000000000000000
0x0220: 0x0000000000000000 0x0230: 0x0000000000000000
0x0240: 0x0000000000000000 0x0250: 0x0000000000000000
0x0260: 0x0000000000000000 0x0270: 0x0000000000000000
0x0280: 0x0000000000000000 0x0290: 0x0000000000000000
0x02a0: 0x0000000000000000 0x02b0: 0x0000000000000000
0x02c0: 0x0000000000000000 0x02d0: 0x0000000000000000
0x02e0: 0x0000000000000000 0x02f0: 0x0000000000000000
0x0300: 0x00000000000000fc 0x0310: 0x0000000002000000
0x0320: 0x00000000000200ef 0x0330: 0x0000000000010000
0x0340: 0x0000000000010000 0x0350: 0x0000000000000400
0x0360: 0x0000000000000400 0x0370: 0x00000000000000fe
0x0380: 0x000000000000186a 0x0390: 0x0000000000000000
0x03a0: 0x0000000000000000 0x03b0: 0x0000000000000000
0x03c0: 0x0000000000000000 0x03d0: 0x0000000000000000
0x03e0: 0x0000000000000003 0x03f0: 0x0000000000000000
Entry 9: type 6 instance 1, length 1024
LAPIC registers:
0x0000: 0x0000000000000000 0x0010: 0x0000000000000000
0x0020: 0x0000000002000000 0x0030: 0x0000000000050014
0x0040: 0x0000000000000000 0x0050: 0x0000000000000000
0x0060: 0x0000000000000000 0x0070: 0x0000000000000000
0x0080: 0x0000000000000000 0x0090: 0x0000000000000000
0x00a0: 0x0000000000000000 0x00b0: 0x0000000000000000
0x00c0: 0x0000000000000000 0x00d0: 0x0000000002000000
0x00e0: 0x00000000ffffffff 0x00f0: 0x00000000000001ff
0x0100: 0x0000000000000000 0x0110: 0x0000000000000000
0x0120: 0x0000000000000000 0x0130: 0x0000000000000000
0x0140: 0x0000000000000000 0x0150: 0x0000000000000000
0x0160: 0x0000000000000000 0x0170: 0x0000000000000000
0x0180: 0x0000000000000000 0x0190: 0x0000000000000000
0x01a0: 0x0000000000000000 0x01b0: 0x0000000000000000
0x01c0: 0x0000000000000000 0x01d0: 0x0000000000000000
0x01e0: 0x0000000000000000 0x01f0: 0x0000000000000000
0x0200: 0x0000000000000000 0x0210: 0x0000000000000000
0x0220: 0x0000000000000000 0x0230: 0x0000000000000000
0x0240: 0x0000000000000000 0x0250: 0x0000000000000000
0x0260: 0x0000000000000000 0x0270: 0x0000000000000000
0x0280: 0x0000000000000000 0x0290: 0x0000000000000000
0x02a0: 0x0000000000000000 0x02b0: 0x0000000000000000
0x02c0: 0x0000000000000000 0x02d0: 0x0000000000000000
0x02e0: 0x0000000000000000 0x02f0: 0x0000000000000000
0x0300: 0x00000000000000fd 0x0310: 0x0000000000000000
0x0320: 0x00000000000200ef 0x0330: 0x0000000000010000
0x0340: 0x0000000000010000 0x0350: 0x0000000000000400
0x0360: 0x0000000000010400 0x0370: 0x00000000000000fe
0x0380: 0x000000000000186a 0x0390: 0x0000000000000000
0x03a0: 0x0000000000000000 0x03b0: 0x0000000000000000
0x03c0: 0x0000000000000000 0x03d0: 0x0000000000000000
0x03e0: 0x0000000000000003 0x03f0: 0x0000000000000000
Entry 10: type 7 instance 0, length 16
PCI IRQs: 0x00000000000000000000000000000000
Entry 11: type 8 instance 0, length 8
ISA IRQs: 0x0001
Entry 12: type 9 instance 0, length 8
PCI LINK: 5 10 11 5
Entry 13: type 10 instance 0, length 56
PIT: speaker off
ch 0: count 0x4a9, latched_count 0x4a7, count_latched 0
status 0, status_latched 0
rd_state 0x3, wr_state 0x3, wr_latch 0xa9, rw_mode 0x3
mode 0x2, bcd 0, gate 0x1
ch 1: count 0x10000, latched_count 0, count_latched 0
status 0, status_latched 0
rd_state 0, wr_state 0, wr_latch 0, rw_mode 0
mode 0xff, bcd 0, gate 0x1
Entry 14: type 11 instance 0, length 16
RTC: regs 0x26 0x00 0x36 0x00 0x18 0x00 0x03 0x07
0x07 0x10 0x26 0x02 0x00 0x80, index 0x10
Entry 15: type 12 instance 0, length 1048
HPET: capability 0xf424008086a201 config 0
isr 0 counter 0xbfed04a9
timer0 config 0xf0000000000030 cmp 0
timer0 period 0 fsb 0
timer1 config 0xf0000000000030 cmp 0
timer1 period 0 fsb 0
timer2 config 0xf0000000000030 cmp 0
timer2 period 0 fsb 0
Entry 16: type 13 instance 0, length 8
ACPI PM: TMR_VAL 0x8fff, PM1a_STS 0x0, PM1a_EN 0x0
Entry 17: type 14 instance 0, length 240
MTRR: PAT 0x7040600070406, cap 0x508, default 0xc06
var 0 0x00000000f0000000 0x0000000ff8000800
var 1 0x00000000f8000000 0x0000000ffc000800
var 2 0x0000000000000000 0x0000000000000000
var 3 0x0000000000000000 0x0000000000000000
var 4 0x0000000000000000 0x0000000000000000
var 5 0x0000000000000000 0x0000000000000000
var 6 0x0000000000000000 0x0000000000000000
var 7 0x0000000000000000 0x0000000000000000
fixed 00 0x0606060606060606
fixed 01 0x0606060606060606
fixed 02 0x0101010101010101
fixed 03 0x0606060606060606
fixed 04 0x0606060606060606
fixed 05 0x0606060606060606
fixed 06 0x0606060606060606
fixed 07 0x0606060606060606
fixed 08 0x0606060606060606
fixed 09 0x0606060606060606
fixed 10 0x0606060606060606
Entry 18: type 14 instance 1, length 240
MTRR: PAT 0x7040600070406, cap 0x508, default 0xc06
var 0 0x00000000f0000000 0x0000000ff8000800
var 1 0x00000000f8000000 0x0000000ffc000800
var 2 0x0000000000000000 0x0000000000000000
var 3 0x0000000000000000 0x0000000000000000
var 4 0x0000000000000000 0x0000000000000000
var 5 0x0000000000000000 0x0000000000000000
var 6 0x0000000000000000 0x0000000000000000
var 7 0x0000000000000000 0x0000000000000000
fixed 00 0x0606060606060606
fixed 01 0x0606060606060606
fixed 02 0x0101010101010101
fixed 03 0x0606060606060606
fixed 04 0x0606060606060606
fixed 05 0x0606060606060606
fixed 06 0x0606060606060606
fixed 07 0x0606060606060606
fixed 08 0x0606060606060606
fixed 09 0x0606060606060606
fixed 10 0x0606060606060606
Entry 19: type 0 instance 0, length 0
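The two dumps can also be used to estimate how much guest time passed
between them while rip stayed at the same address. This is a
back-of-the-envelope Python sketch using only values copied from the
dumps for VCPU 0. It assumes that gtsc_khz is the guest TSC frequency in
kHz, that the upper 32 bits of the HPET capability register hold the
counter period in femtoseconds (as in the HPET specification), and that
RTC register 0 holds the seconds in BCD.

```python
# Values copied from the first and second dump (VCPU 0, HPET and RTC entries).
GTSC_KHZ = 2666735
TSC_FIRST, TSC_SECOND = 0x0000001AF07D0E03, 0x0000001FFB1191C5
HPET_CAP = 0xF424008086A201
HPET_FIRST, HPET_SECOND = 0xA1AD6B9C, 0xBFED04A9
RTC_SEC_FIRST, RTC_SEC_SECOND = 0x18, 0x26

def bcd(byte):
    """Convert one BCD-encoded byte to an integer."""
    return (byte >> 4) * 10 + (byte & 0xF)

# Elapsed guest time according to the saved TSC values.
tsc_seconds = (TSC_SECOND - TSC_FIRST) / (GTSC_KHZ * 1000)

# Elapsed time according to the HPET main counter.
hpet_period_fs = HPET_CAP >> 32
hpet_seconds = (HPET_SECOND - HPET_FIRST) * hpet_period_fs / 1e15

# Elapsed whole seconds according to the RTC.
rtc_seconds = bcd(RTC_SEC_SECOND) - bcd(RTC_SEC_FIRST)

print("TSC:  %.2f s" % tsc_seconds)
print("HPET: %.2f s" % hpet_seconds)
print("RTC:  %d s" % rtc_seconds)
```

Under those assumptions all three clocks agree on roughly 8 seconds
between the snapshots, which fits the observation that time sources keep
advancing while both VCPUs remain at the same rip.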
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
From: George Dunlap @ 2010-07-08 10:03 UTC
To: Gianni Tedesco; +Cc: Xen Devel
If both cpus are idling with EFLAGS.IF=1, this would imply that the
kernel thinks it's waiting on a device, yes? One thing you could do
is to track the interaction between the guest and the devices, and see
if you can figure out what it's waiting for and why the thing it's
waiting for isn't happening. You can use xentrace + xenalyze
(http://xenbits.xensource.com/ext/xenalyze.hg) to see all the PIO,
MMIO, and interrupts delivered to the guest.
Unfortunately this would mean understanding at some level the
interface the device presents, which may involve a lot of going
through driver code / going through QEMU, which doesn't sound fun. :-/
Maybe someone else will have some suggestions...
I ended up with a similar-looking problem during boot with a stock
2.6.18.8 kernel, after hacking up a work-around to allow it to get
past the timer synchronization stage. It might be easier to track
down if you have a failure mode that's quicker to reproduce and a
guest kernel that's easier to modify. (But of course there's always
the possibility that it's a different bug with similar symptoms...)
-George
On Wed, Jul 7, 2010 at 7:42 PM, Gianni Tedesco
<gianni.tedesco@citrix.com> wrote:
> Hi,
>
> I've spent a few weeks investigating a very reproducible guest-hangs bug
> which appears to affect all hypervisors from at least 3.4.2 through 4.0
> to unstable.
>
> To reproduce setup an RHEL5.2 guest for kickstart network install
> something like this:
>
> vmlinuz ks=nfs:1.2.3.4:ks-rhel52.cfg ksdevice=eth0 console=tty0
> console=ttyS0,9600n8 serial initrd=initrd.img root=/dev/ram0
>
> With a vm profile something like this:
>
> kernel = "/usr/lib/xen/boot/hvmloader"
> builder = 'hvm'
> memory = 128
> name = "RHEL5.2-ks"
> vcpus = 2
> vif = [ 'type=ioemu,bridge=xenbr0,mac=00:26:b9:87:0e:d3' ]
> disk = [ 'phy:/dev/sdb1,hda,w' ]
> device_model = '/usr/lib/xen/bin/qemu-dm'
>
> As long as VCPU's > 1 the guest repeatedly hangs after userspace has
> started. In all cases hvmctx reports that the kernel is spinning away in
> cpu_idle() as if waiting for an interrupt and EFLAGS.IF = 1. If a key is
> pressed either on kb or via serial the system unhangs itself. The system
> still responds to network traffic (eg. ping) during this time but that
> doesn't unhang it.
>
> I have ruled out all the usual suspects, timer modes, vpt_align, guest
> kernel clock sources, acpi, hpet, hap and oos. With a little debugging I
> was able to show that timer IRQ's from HPET as well as RESCHED IPI's
> were still getting delivered during the hangs. Full ctx dump follows.
>
> Help! :)
>
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-08 10:03 ` George Dunlap
@ 2010-07-08 11:55 ` Gianni Tedesco
2010-07-08 13:28 ` George Dunlap
0 siblings, 1 reply; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-08 11:55 UTC (permalink / raw)
To: George Dunlap; +Cc: Xen Devel
On Thu, 2010-07-08 at 11:03 +0100, George Dunlap wrote:
> If both cpus are idling with EFLAGS.IF=1, this would imply that the
> kernel thinks it's waiting on a device, yes? One thing you could do
> is to track the interaction between the guest and the devices, and see
> if you can figure out what it's waiting for and why the thing it's
> waiting for isn't happening. You can use xentrace + xenalyze
> (http://xenbits.xensource.com/ext/xenalyze.hg) to see all the PIO,
> MMIO, and interrupts delivered to the guest.
>
> Unfortunately this would mean understanding at some level the
> interface the device presents, which may involve a lot of going
> through driver code / going through QEMU, which doesn't sound fun. :-/
> Maybe someone else will have some suggestions...
Hmm, yeah, usually that's a headache to do for one device never mind the
whole system...
> I ended up with a similar-looking problem during boot with a stock
> 2.6.18.8 kernel, after hacking up a work-around to allow it to get
> past the timer synchronization stage. It might be easier to track
> down if you have a failure mode that's quicker to reproduce and a
> guest kernel that's easier to modify. (But of course there's always
> the possibility that it's a different bug with similar symptoms...)
Well, this reproduces relatively quickly, but because it's a vendor kernel +
custom initrd it's a bit harder to modify components. Just re-building
the original turns out to be a pain.
I think for now my time is probably best spent trying to minimise the
code required to reproduce the thing and hopefully, in turn, minimise
the amount of PIO + MMIO + IRQ traces to go through.
Argh :)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-08 11:55 ` Gianni Tedesco
@ 2010-07-08 13:28 ` George Dunlap
2010-07-09 16:20 ` Gianni Tedesco
2010-07-21 16:46 ` Gianni Tedesco
0 siblings, 2 replies; 18+ messages in thread
From: George Dunlap @ 2010-07-08 13:28 UTC (permalink / raw)
To: Gianni Tedesco; +Cc: Xen Devel
On Thu, Jul 8, 2010 at 12:55 PM, Gianni Tedesco
<gianni.tedesco@citrix.com> wrote:
> Hmm, yeah, usually that's a headache to do for one device never mind the
> whole system...
But realistically, there's only a handful of devices which it might be
waiting on -- seems like the disk is the most likely culprit.
I'm happy to help with the tracing / analysis bit.
In any case, thanks for doing this, and good luck.
-George
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-08 13:28 ` George Dunlap
@ 2010-07-09 16:20 ` Gianni Tedesco
2010-07-21 16:46 ` Gianni Tedesco
1 sibling, 0 replies; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-09 16:20 UTC (permalink / raw)
To: George Dunlap; +Cc: Xen Devel
On Thu, 2010-07-08 at 14:28 +0100, George Dunlap wrote:
> On Thu, Jul 8, 2010 at 12:55 PM, Gianni Tedesco
> <gianni.tedesco@citrix.com> wrote:
> > Hmm, yeah, usually that's a headache to do for one device never mind the
> > whole system...
>
> But realistically, there's only a handful of devices which it might be
> waiting on -- seems like the disk is the most likely culprit.
>
> I'm happy to help with the tracing / analysis bit.
>
> In any case, thanks for doing this, and good luck.
Problem is I seem to have ruled out most of that now. IDE is firing off
IRQs and the host is ACKing them properly. There are even hangs during
periods of no IDE activity - just working from ramdisks. Timers are
getting through. Networking always seems to work fine and the hang
occurs regardless of e1000 vs rtl8139. The bug reproduces without serial
and regardless of acpi, std-vga, etc., so there isn't much else that could
be going wrong here.
I managed to get a shell up and running while the system is hung and
just spawning "busybox ls" also hangs in an uninterruptible state
(although the uninterruptible part may have been due to lack of job
control in the shell.)
Since the device model and IRQ delivery seem to work as expected I am
wondering if there could be an artefact that causes, for example, the
kernel's semaphore implementation to not work as it should?
Gianni
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-07 18:42 HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1 Gianni Tedesco
2010-07-08 10:03 ` George Dunlap
@ 2010-07-12 14:04 ` Konrad Rzeszutek Wilk
2010-07-12 14:24 ` George Dunlap
2010-07-12 15:09 ` Gianni Tedesco
2010-07-21 17:29 ` Gianni Tedesco
2 siblings, 2 replies; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-12 14:04 UTC (permalink / raw)
To: Gianni Tedesco; +Cc: Xen Devel
On Wed, Jul 07, 2010 at 07:42:35PM +0100, Gianni Tedesco wrote:
> Hi,
>
> I've spent a few weeks investigating a very reproducible guest-hangs bug
> which appears to affect all hypervisors from at least 3.4.2 through 4.0
> to unstable.
>
> To reproduce setup an RHEL5.2 guest for kickstart network install
> something like this:
Does this happen with RHEL5.4? CentOS 5.4?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-12 14:04 ` Konrad Rzeszutek Wilk
@ 2010-07-12 14:24 ` George Dunlap
2010-07-12 15:09 ` Gianni Tedesco
1 sibling, 0 replies; 18+ messages in thread
From: George Dunlap @ 2010-07-12 14:24 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: Xen Devel, Gianni Tedesco
Something superficially similar happens when booting an
(already-installed) Debian Etch system with a 2.6.18-6 kernel, and a
kernel.org version of 2.6.18.8; but there's no guarantee it's the same
root cause.
-George
On Mon, Jul 12, 2010 at 3:04 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Wed, Jul 07, 2010 at 07:42:35PM +0100, Gianni Tedesco wrote:
>> Hi,
>>
>> I've spent a few weeks investigating a very reproducible guest-hangs bug
>> which appears to affect all hypervisors from at least 3.4.2 through 4.0
>> to unstable.
>>
>> To reproduce setup an RHEL5.2 guest for kickstart network install
>> something like this:
>
> Does this happen with RHEL5.4? CentOS 5.4?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-12 14:04 ` Konrad Rzeszutek Wilk
2010-07-12 14:24 ` George Dunlap
@ 2010-07-12 15:09 ` Gianni Tedesco
2010-07-12 15:44 ` Konrad Rzeszutek Wilk
1 sibling, 1 reply; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-12 15:09 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: George Dunlap, Xen Devel
On Mon, 2010-07-12 at 15:04 +0100, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 07, 2010 at 07:42:35PM +0100, Gianni Tedesco wrote:
> > Hi,
> >
> > I've spent a few weeks investigating a very reproducible guest-hangs bug
> > which appears to affect all hypervisors from at least 3.4.2 through 4.0
> > to unstable.
> >
> > To reproduce setup an RHEL5.2 guest for kickstart network install
> > something like this:
>
> Does this happen with RHEL5.4? CentOS 5.4?
I used the same setup to test RHEL5, RHEL5.[1234] and can only reproduce
on 5.1 and 5.2.
As George pointed out, this may be an issue with a far wider range of
kernels but difficult to reproduce.
My most recent finding is that this is reproduced by loading modules
which create a kernel thread. A sysrq-t during the hangs is showing
every task waiting for kthread_create to complete except for anaconda
which is either waiting on a tty_ioctl() (serial install) or poll (vga
install).
Gianni Tedesco
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-12 15:09 ` Gianni Tedesco
@ 2010-07-12 15:44 ` Konrad Rzeszutek Wilk
2010-07-13 16:31 ` Gianni Tedesco
0 siblings, 1 reply; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-12 15:44 UTC (permalink / raw)
To: Gianni Tedesco; +Cc: George Dunlap, Xen Devel
On Mon, Jul 12, 2010 at 04:09:36PM +0100, Gianni Tedesco wrote:
> On Mon, 2010-07-12 at 15:04 +0100, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 07, 2010 at 07:42:35PM +0100, Gianni Tedesco wrote:
> > > Hi,
> > >
> > > I've spent a few weeks investigating a very reproducible guest-hangs bug
> > > which appears to affect all hypervisors from at least 3.4.2 through 4.0
> > > to unstable.
> > >
> > > To reproduce setup an RHEL5.2 guest for kickstart network install
> > > something like this:
> >
> > Does this happen with RHEL5.4? CentOS 5.4?
>
> I used the same setup to test RHEL5, RHEL5.[1234] and can only reproduce
> on 5.1 and 5.2.
Ok, did you look in the changelog for RHEL5.3 and above? It might be
that you are hitting a bug that was fixed.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-12 15:44 ` Konrad Rzeszutek Wilk
@ 2010-07-13 16:31 ` Gianni Tedesco
2010-07-13 18:13 ` Gianni Tedesco
0 siblings, 1 reply; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-13 16:31 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: George Dunlap, Xen Devel
On Mon, 2010-07-12 at 16:44 +0100, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 12, 2010 at 04:09:36PM +0100, Gianni Tedesco wrote:
> > On Mon, 2010-07-12 at 15:04 +0100, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Jul 07, 2010 at 07:42:35PM +0100, Gianni Tedesco wrote:
> > > > Hi,
> > > >
> > > > I've spent a few weeks investigating a very reproducible guest-hangs bug
> > > > which appears to affect all hypervisors from at least 3.4.2 through 4.0
> > > > to unstable.
> > > >
> > > > To reproduce setup an RHEL5.2 guest for kickstart network install
> > > > something like this:
> > >
> > > Does this happen with RHEL5.4? CentOS 5.4?
> >
> > I used the same setup to test RHEL5, RHEL5.[1234] and can only reproduce
> > on 5.1 and 5.2.
>
> Ok, did you look in the changelog for RHEL5.3 and above. It might be
> that you are hitting a bug that was fixed.
There are several potential candidates, but it is difficult to get further
info due to Red Hat Bugzilla. Trying to get in touch with a relevant engineer
to confirm / falsify that theory.
Thanks
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-13 16:31 ` Gianni Tedesco
@ 2010-07-13 18:13 ` Gianni Tedesco
2010-07-13 18:42 ` Dan Magenheimer
0 siblings, 1 reply; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-13 18:13 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: George Dunlap, Xen Devel
On Tue, 2010-07-13 at 17:31 +0100, Gianni Tedesco wrote:
> > > I used the same setup to test RHEL5, RHEL5.[1234] and can only reproduce
> > > on 5.1 and 5.2.
> >
> > Ok, did you look in the changelog for RHEL5.3 and above. It might be
> > that you are hitting a bug that was fixed.
>
> There are several potential candidates but difficult to get further info
> due to redhat bugzilla. Trying to get in touch with a relevant engineer
> to confirm / falsify that theory.
The patch "Fix gettimeofday reliability issues with TSC, HPET, and
PM-Timer" seems to mask the bug and make it much less reproducible. This
patch was to fix some gettimeofday-goes-backwards issues on bare metal.
As a result of this I can now confirm the bug is still present in
RHEL5.3 at least - I shall test the others shortly.
Looks like TSC/PIT timesource is either a) unreliable in xen, b)
unreliable in RHEL kernels or c) all of the above.
Gianni
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-13 18:13 ` Gianni Tedesco
@ 2010-07-13 18:42 ` Dan Magenheimer
2010-07-13 19:13 ` Gianni Tedesco
0 siblings, 1 reply; 18+ messages in thread
From: Dan Magenheimer @ 2010-07-13 18:42 UTC (permalink / raw)
To: Gianni Tedesco, Konrad Wilk; +Cc: George Dunlap, Xen Devel
This may be totally unrelated, but just in case...
Are you using xl to create your problem domains?
If so, you might want to set timer_mode=1 in your
vm.cfg. (See other xen-devel thread "xen tsc problems".)
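For reference, a sketch of where that setting would sit in the guest profile. The other keys are copied from the profile in the original report; timer_mode is the only addition. The value 1 is simply the one suggested here (0 and 3 are the other values tried in this thread), not a confirmed fix.

```python
# Fragment of an HVM guest config; xm/xl config files use Python-style assignments.
# Keys other than timer_mode are copied from the profile in the original report.
kernel = "/usr/lib/xen/boot/hvmloader"
builder = 'hvm'
memory = 128
name = "RHEL5.2-ks"
vcpus = 2

# Suggested setting from this message; 0 and 3 were also tried in this thread.
timer_mode = 1
```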
> -----Original Message-----
> From: Gianni Tedesco [mailto:gianni.tedesco@citrix.com]
> Sent: Tuesday, July 13, 2010 12:13 PM
> To: Konrad Rzeszutek Wilk
> Cc: George Dunlap; Xen Devel
> Subject: Re: [Xen-devel] HVM SMP linux guest hangs in cpu_idle() with
> EFLAGS.IF = 1
>
> On Tue, 2010-07-13 at 17:31 +0100, Gianni Tedesco wrote:
> > > > I used the same setup to test RHEL5, RHEL5.[1234] and can only
> reproduce
> > > > on 5.1 and 5.2.
> > >
> > > Ok, did you look in the changelog for RHEL5.3 and above. It might
> be
> > > that you are hitting a bug that was fixed.
> >
> > There are several potential candidates but difficult to get further
> info
> > due to redhat bugzilla. Trying to get in touch with a relevant
> engineer
> > to confirm / falsify that theory.
>
> The patch "Fix gettimeofday reliability issues with TSC, HPET, and
> PM-Timer" seems to mask the bug and make it much less reproducable.
> This
> patch was to fix some gettimeofday-goes-backwards issues on bare metal.
> As a result of this I can now confirm the bug is still present in
> RHEL5.3 at least - I shall test the others shortly.
>
> Looks like TSC/PIT timesource is either a) unreliable in xen, b)
> unreliable in RHEL kernels or c) all of the above.
>
> Gianni
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-13 18:42 ` Dan Magenheimer
@ 2010-07-13 19:13 ` Gianni Tedesco
2010-07-13 19:27 ` Gianni Tedesco
0 siblings, 1 reply; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-13 19:13 UTC (permalink / raw)
To: Dan Magenheimer; +Cc: George Dunlap, Xen Devel, Konrad Wilk
On Tue, 2010-07-13 at 19:42 +0100, Dan Magenheimer wrote:
> This may be totally unrelated, but just in case...
>
> Are you using xl to create your problem domains?
> If so, you might want to set timer_mode=1 in your
> vm.cfg. (See other xen-devel thread "xen tsc problems".)
Yeah, I just had that chat with Stefano, but I am running with
timer_mode=1 now and there is no change. I had timer_mode=3 before, and 0
before that.
Currently instrumenting my kernel to check for any time-sources going
backwards...
Gianni
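A rough userspace analogue of that kind of check, as a minimal sketch only: it samples a clock repeatedly and reports every reading that is smaller than the one before it. This is not the kernel instrumentation referred to above; the function names find_backward_steps and sample_clock are hypothetical, and time.time is used merely as a stand-in for a guest-visible wall clock.

```python
import time


def find_backward_steps(samples):
    """Return (index, previous, current) for each sample smaller than its predecessor."""
    return [
        (index, previous, current)
        for index, (previous, current) in enumerate(zip(samples, samples[1:]), start=1)
        if current < previous
    ]


def sample_clock(read_clock, count):
    """Read the given clock function `count` times back to back."""
    return [read_clock() for _ in range(count)]


if __name__ == "__main__":
    # time.time is the wall clock as seen by this process (an assumption made
    # for illustration; the thread is about in-kernel time sources).
    readings = sample_clock(time.time, 100000)
    for index, previous, current in find_backward_steps(readings):
        print("sample %d went backwards: %.9f -> %.9f" % (index, previous, current))
```

Run inside the guest, any printed line would indicate a reading that stepped backwards between two consecutive samples; no output means none was observed during the run.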
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-13 19:13 ` Gianni Tedesco
@ 2010-07-13 19:27 ` Gianni Tedesco
0 siblings, 0 replies; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-13 19:27 UTC (permalink / raw)
To: Dan Magenheimer; +Cc: George Dunlap, Xen Devel, Konrad Wilk
On Tue, 2010-07-13 at 20:13 +0100, Gianni Tedesco wrote:
> On Tue, 2010-07-13 at 19:42 +0100, Dan Magenheimer wrote:
> > This may be totally unrelated, but just in case...
> >
> > Are you using xl to create your problem domains?
> > If so, you might want to set timer_mode=1 in your
> > vm.cfg. (See other xen-devel thread "xen tsc problems".)
>
> Yeah I just had that chat with stefano but I am running with
> timer_mode=1 now and no changes, I had timer_mode=3 before, and 0 before
> that.
>
> Currently instrumenting my kernel to check for any time-sources going
> backwards...
In fact, strike this: the clocksource is jiffies, which is being bumped
by interrupts. According to hvmctx, HPET is delivering ISR 0, so there is
no way wall time should be going backwards.
Damn
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-08 13:28 ` George Dunlap
2010-07-09 16:20 ` Gianni Tedesco
@ 2010-07-21 16:46 ` Gianni Tedesco
1 sibling, 0 replies; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-21 16:46 UTC (permalink / raw)
To: George Dunlap; +Cc: Xen Devel
On Thu, 2010-07-08 at 14:28 +0100, George Dunlap wrote:
> On Thu, Jul 8, 2010 at 12:55 PM, Gianni Tedesco
> <gianni.tedesco@citrix.com> wrote:
> > Hmm, yeah, usually that's a headache to do for one device never mind the
> > whole system...
>
> But realistically, there's only a handful of devices which it might be
> waiting on -- seems like the disk is the most likely culprit.
I can now *categorically* rule this out. I have injected IRQs for every
installed device and none of this un-sticks the system...
Gianni
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-07 18:42 HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1 Gianni Tedesco
2010-07-08 10:03 ` George Dunlap
2010-07-12 14:04 ` Konrad Rzeszutek Wilk
@ 2010-07-21 17:29 ` Gianni Tedesco
2010-07-21 17:56 ` Nakajima, Jun
2 siblings, 1 reply; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-21 17:29 UTC (permalink / raw)
To: Xen Devel
Another data-point I have on this but haven't mentioned here yet is that
the userspace (anaconda) processes are usually hanging in
tty_wait_until_sent() which is apparently why things are woken up by
receiving a keypress on the serial line.
At other times the hang occurs while userspace (hotplug, also anaconda)
is waiting in do_poll() on a kernel netlink socket. In that case a
keypress on serial line also wakes it (!!)
Gianni Tedesco
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-21 17:29 ` Gianni Tedesco
@ 2010-07-21 17:56 ` Nakajima, Jun
2010-07-21 17:59 ` Gianni Tedesco
0 siblings, 1 reply; 18+ messages in thread
From: Nakajima, Jun @ 2010-07-21 17:56 UTC (permalink / raw)
To: Gianni Tedesco, Xen Devel
[-- Attachment #1: Type: text/plain, Size: 832 bytes --]
Gianni Tedesco wrote on Wed, 21 Jul 2010 at 10:29:43:
> Another data-point I have on this but haven't mentioned here yet is that
> the userspace (anaconda) processes are usually hanging in
> tty_wait_until_sent() which is apparently why things are woken up by
> receiving a keypress on the serial line.
>
> At other times the hang occurs while userspace (hotplug, also anaconda)
> is waiting in do_poll() on a kernel netlink socket. In that case a
> keypress on serial line also wakes it (!!)
>
I'm not sure if I followed all the emails on the thread, but did you
see such a hang without a serial connection? I had the impression (i.e. I
kind of remember) that a serial connection of Linux at boot time caused a
hang, especially with SMP (on native).
> Gianni Tedesco
>
Jun
___
Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
2010-07-21 17:56 ` Nakajima, Jun
@ 2010-07-21 17:59 ` Gianni Tedesco
0 siblings, 0 replies; 18+ messages in thread
From: Gianni Tedesco @ 2010-07-21 17:59 UTC (permalink / raw)
To: Nakajima, Jun; +Cc: Xen Devel
On Wed, 2010-07-21 at 18:56 +0100, Nakajima, Jun wrote:
> Gianni Tedesco wrote on Wed, 21 Jul 2010 at 10:29:43:
>
> > Another data-point I have on this but haven't mentioned here yet is that
> > the userspace (anaconda) processes are usually hanging in
> > tty_wait_until_sent() which is apparently why things are woken up by
> > receiving a keypress on the serial line.
> >
> > At other times the hang occurs while userspace (hotplug, also anaconda)
> > is waiting in do_poll() on a kernel netlink socket. In that case a
> > keypress on serial line also wakes it (!!)
> >
>
> I'm not sure if I followed all the emails on the thread, but did you
> see such a hang without serial connection? I had impression (i.e. kind
> of remember) serial connection of Linux at boot time caused a hang
> especially with SMP (on native).
I have seen the hang without serial console enabled but with serial
still attached... In that case the hangs still happen but much more
rarely. I am going to try some further tests to remove serial from the
equation and see what happens.
Thanks for the hint
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2010-07-21 17:59 UTC | newest]
Thread overview: 18+ messages
2010-07-07 18:42 HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1 Gianni Tedesco
2010-07-08 10:03 ` George Dunlap
2010-07-08 11:55 ` Gianni Tedesco
2010-07-08 13:28 ` George Dunlap
2010-07-09 16:20 ` Gianni Tedesco
2010-07-21 16:46 ` Gianni Tedesco
2010-07-12 14:04 ` Konrad Rzeszutek Wilk
2010-07-12 14:24 ` George Dunlap
2010-07-12 15:09 ` Gianni Tedesco
2010-07-12 15:44 ` Konrad Rzeszutek Wilk
2010-07-13 16:31 ` Gianni Tedesco
2010-07-13 18:13 ` Gianni Tedesco
2010-07-13 18:42 ` Dan Magenheimer
2010-07-13 19:13 ` Gianni Tedesco
2010-07-13 19:27 ` Gianni Tedesco
2010-07-21 17:29 ` Gianni Tedesco
2010-07-21 17:56 ` Nakajima, Jun
2010-07-21 17:59 ` Gianni Tedesco