* Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
       [not found] <1313577856.13030.17.camel@scarafaggio>
@ 2011-08-22  9:00 ` Ian Campbell
  2011-08-22  9:49   ` Giuseppe Sacco
  2011-08-24 20:24   ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 12+ messages in thread
From: Ian Campbell @ 2011-08-22  9:00 UTC (permalink / raw)
  To: Giuseppe Sacco, 638172; +Cc: xen-devel, Ben Hutchings

@xen-devel:

Does this look familiar to anyone? This is (I expect; hopefully Giuseppe
will confirm) from Debian Squeeze, which has Xen 4.0.x with a PVops
dom0 kernel based on xen.git from last summer (e73f4955a821) with more
recent upstream longterm kernels (up to and including 2.6.32.41) merged
in. While it does seem to have the switch from level- to edge-triggered
interrupts, the Debian kernel doesn't appear to have the switch to fasteoi
for pirqs (0672fb44a111 plus a few followups) -- could that be related
to this? (I'm not sure if that was a cleanup or a fix.)

Might the tsc unstable message be relevant?

@Giuseppe:

Can you confirm the versions of the xen and qemu-dm packages which you
have installed, please?

Also, I think it would be useful to see the guest configuration file and
details of the storage (filesystems, SCSI controllers etc.) backing the
guest storage which you have configured.

Full history of this report can be found at
http://bugs.debian.org/638172

Ian.

Can you also provide configuration details 
On Wed, 2011-08-17 at 12:44 +0200, Giuseppe Sacco wrote:
> Package: linux-image-2.6.32-5-xen-686
> Version: 2.6.32-35
> Severity: important
> 
> Hi,
> I am experiencing a few outages on a XEN server. Often I have to
> poweroff the server, but last time I found some information in syslog.
> Here it is:
> 
> Aug 17 12:35:45 centrum kernel: [ 1424.037532] Clocksource tsc unstable (delta = -103103328 ns)
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Modules linked in: xt_state xt_physdev iptable_filter tun cpufreq_userspace cpufreq_powersave cpufreq_c
> onservative cpufreq_stats dummy bridge stp xen_evtchn xenfs xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tab
> les xfs exportfs loop snd_hda_codec_atihdmi snd_hda_intel snd_hda_codec radeon snd_hwdep ttm snd_pcm snd_timer drm_kms_helper drm snd soundcore snd_pa
> ge_alloc i2c_algo_bit shpchp i2c_piix4 pcspkr k8temp pci_hotplug i2c_core evdev button ext3 jbd mbcache dm_mod aacraid 3w_9xxx 3w_xxxx raid10 raid456 
> async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 md_mod sata_nv sata_sil sata_via sd_mod crc_t10dif ata_generic ahc
> i pata_atiixp ohci_hcd libata processor ehci_hcd r8169 mii scsi_mod thermal usbcore nls_base thermal_sys acpi_processor [last unloaded: dummy]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] 
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Pid: 3205, comm: qemu-dm Tainted: G        W  (2.6.32-5-xen-686 #1) MS-7368
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EIP: 0061:[<c1002227>] EFLAGS: 00200246 CPU: 0
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EIP is at hypercall_page+0x227/0x1001
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EAX: 00040000 EBX: 00000000 ECX: 00000000 EDX: ec8fa828
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] ESI: ec8fa800 EDI: c24d9600 EBP: c27d4800 ESP: e4207d64
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] CR0: 8005003b CR2: b7712200 CR3: 241f0000 CR4: 00000660
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] DR6: ffff0ff0 DR7: 00000400
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Call Trace:
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006034>] ? xen_force_evtchn_callback+0xc/0x10
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006764>] ? check_events+0x8/0xc
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006723>] ? xen_irq_enable_direct_end+0x0/0x1
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93f457>] ? scsi_request_fn+0x440/0x47a [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1132541>] ? __blk_run_queue+0x2e/0x5a
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c11325f3>] ? blk_run_queue+0x18/0x27
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93eaca>] ? scsi_run_queue+0x281/0x308 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93f639>] ? scsi_next_command+0x25/0x2f [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93ffa1>] ? scsi_io_completion+0x383/0x3a4 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93a723>] ? scsi_finish_command+0xaa/0xc2 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1135bb3>] ? blk_done_softirq+0x53/0x5f
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103c8ea>] ? __do_softirq+0xaa/0x156
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103c9c7>] ? do_softirq+0x31/0x3c
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103caa1>] ? irq_exit+0x26/0x58
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1199be6>] ? xen_evtchn_do_upcall+0x22/0x2c
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1009b3f>] ? xen_do_upcall+0x7/0xc
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1002407>] ? hypercall_page+0x407/0x1001
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10015>] ? HYPERVISOR_event_channel_op+0x15/0x4c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10937c6>] ? __alloc_pages_nodemask+0xf3/0x4d9
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EIP is at hypercall_page+0x227/0x1001
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EAX: 00040000 EBX: 00000000 ECX: 00000000 EDX: ec8fa828
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] ESI: ec8fa800 EDI: c24d9600 EBP: c27d4800 ESP: e4207d64
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] CR0: 8005003b CR2: b7712200 CR3: 241f0000 CR4: 00000660
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] DR6: ffff0ff0 DR7: 00000400
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Call Trace:
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006034>] ? xen_force_evtchn_callback+0xc/0x10
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006764>] ? check_events+0x8/0xc
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006723>] ? xen_irq_enable_direct_end+0x0/0x1
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93f457>] ? scsi_request_fn+0x440/0x47a [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1132541>] ? __blk_run_queue+0x2e/0x5a
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c11325f3>] ? blk_run_queue+0x18/0x27
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93eaca>] ? scsi_run_queue+0x281/0x308 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93f639>] ? scsi_next_command+0x25/0x2f [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93ffa1>] ? scsi_io_completion+0x383/0x3a4 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93a723>] ? scsi_finish_command+0xaa/0xc2 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1135bb3>] ? blk_done_softirq+0x53/0x5f
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103c8ea>] ? __do_softirq+0xaa/0x156
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103c9c7>] ? do_softirq+0x31/0x3c
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103caa1>] ? irq_exit+0x26/0x58
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1199be6>] ? xen_evtchn_do_upcall+0x22/0x2c
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1009b3f>] ? xen_do_upcall+0x7/0xc
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1002407>] ? hypercall_page+0x407/0x1001
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10015>] ? HYPERVISOR_event_channel_op+0x15/0x4c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10937c6>] ? __alloc_pages_nodemask+0xf3/0x4d9
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10484>] ? evtchn_ioctl+0x22e/0x28c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10256>] ? evtchn_ioctl+0x0/0x28c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10c6520>] ? vfs_ioctl+0x1c/0x5f
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10c6ab4>] ? do_vfs_ioctl+0x4aa/0x4e5
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10bb8e6>] ? fsnotify_modify+0x5a/0x61
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda104e2>] ? evtchn_write+0x0/0xda [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10bc4c4>] ? vfs_write+0x9e/0xd6
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10c6b30>] ? sys_ioctl+0x41/0x58
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1008f7c>] ? syscall_call+0x7/0xb
> 
> Bye,
> Giuseppe
> 
> 
> 
> 

-- 
Ian Campbell
Current Noise: Converge - For You (Live)

(null cookie; hope that's ok)

* Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-22  9:00 ` Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205] Ian Campbell
@ 2011-08-22  9:49   ` Giuseppe Sacco
  2011-08-24 20:24   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 12+ messages in thread
From: Giuseppe Sacco @ 2011-08-22  9:49 UTC (permalink / raw)
  To: Ian Campbell; +Cc: 638172, xen-devel, Ben Hutchings

On Mon, 22/08/2011 at 10.00 +0100, Ian Campbell wrote:
[...]
> @Giuseppe:
> 
> Can you confirm the versions of the xen and qemu-dm packages which you
> have got installed please.

$ COLUMNS=120 dpkg -l \*linux-image\* \*qemu-\*  | grep ^ii
ii  linux-image-2.6.32-5-xen 2.6.32-35                Linux 2.6.32 for modern PCs, Xen dom0 support
ii  linux-image-xen-686      2.6.32+29                Linux for modern PCs (meta-package), Xen dom0 support
ii  qemu-keymaps             0.12.5+dfsg-3squeeze1    QEMU keyboard maps
ii  qemu-system              0.12.5+dfsg-3squeeze1    QEMU full system emulation binaries
ii  qemu-utils               0.12.5+dfsg-3squeeze1    QEMU utilities
ii  xen-qemu-dm-4.0          4.0.1-2                  Xen Qemu Device Model virtual machine hardware emulator

> Also I think it would be useful to see the guest configuration file and
> details of the storage (filesystems, SCSI controllers etc) backing the
> guest storage which you have got configured.

I host two VMs:

1.
kernel = "/usr/lib/xen-default/boot/hvmloader"
builder='hvm'
memory = 1024
shadow_memory = 8
name = "piero"
#vif = [ 'mac=00:16:3e:6f:81:0a, type=ioemu, bridge=dummy0' ]
vif = [ 'type=ioemu, bridge=dummy0' ]
disk = [ 'phy:mapper/rootvg-piero--disk,hda,w' ]
device_model = '/usr/' + arch_libdir + '/xen-default/bin/qemu-dm'

2.
kernel = "/usr/lib/xen-default/boot/hvmloader"
builder='hvm'
memory = 512
shadow_memory = 8
name = "suse"
vif = [ 'mac=00:16:3e:6f:81:0a, type=ioemu, bridge=dummy0' ]
disk = [ 'phy:mapper/rootvg-suse32--disk,hda,w' ]
device_model = '/usr/' + arch_libdir + '/xen-default/bin/qemu-dm'
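
For readers following along: each `disk` entry above has the form
`backend:path,device,mode`, and the LVM volumes appear under device-mapper
names, where a hyphen that belongs to a VG or LV name is doubled. A minimal
Python sketch of decoding one entry follows. The function names are
hypothetical and not part of any Xen or LVM tool, and the sketch assumes
names with no leading or trailing hyphens.

```python
import re

def parse_disk_spec(spec):
    """Split an xm-style disk entry 'backend:path,device,mode' into its parts."""
    target, device, mode = spec.split(",")
    backend, path = target.split(":", 1)
    return {"backend": backend, "path": path, "device": device, "mode": mode}

def split_dm_name(dm_name):
    """Split a device-mapper name into (vg, lv).

    The VG and LV are joined by a single hyphen, and any hyphen inside
    either name is doubled, so we split on the one lone hyphen.
    """
    vg, lv = re.split(r"(?<!-)-(?!-)", dm_name, maxsplit=1)
    return vg.replace("--", "-"), lv.replace("--", "-")

spec = parse_disk_spec("phy:mapper/rootvg-piero--disk,hda,w")
vg, lv = split_dm_name(spec["path"].split("/")[-1])
print(spec["backend"], spec["device"], spec["mode"], vg, lv)
```

Read this way, both guests use logical volumes in the same volume group
(rootvg), exported to the guest as hda in read-write mode.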

Xen is really version 4:

$ ls -ld /usr/lib/xen-* /etc/alternatives/xen-default
lrwxrwxrwx 1 root root   16  4 gen  2011 /etc/alternatives/xen-default -> /usr/lib/xen-4.0
drwxr-xr-x 5 root root 4096  3 gen  2011 /usr/lib/xen-4.0
drwxr-xr-x 3 root root 4096  9 dic  2010 /usr/lib/xen-common
lrwxrwxrwx 1 root root   29  4 gen  2011 /usr/lib/xen-default -> /etc/alternatives/xen-default

Storage for both machines is on LVM: two volumes in the same volume
group, rootvg. rootvg has one physical volume: a RAID1 array (md2, as the
pvs output below shows) built from two SATA disks connected to the same
controller.

centrum:~# vgs
  VG     #PV #LV #SN Attr   VSize   VFree  
  rootvg   1  13   0 wz--n- 370,35g 121,07g
centrum:~# pvs
  PV         VG     Fmt  Attr PSize   PFree  
  /dev/md2   rootvg lvm2 a-   370,35g 121,07g
centrum:~# cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md2 : active raid1 sda3[0] sdb3[1]
      388339136 blocks [2/2] [UU]
      
md1 : active raid1 sda2[0] sdb2[1]
      264960 blocks [2/2] [UU]
      
md0 : active raid1 sda1[0] sdb1[1]
      2102464 blocks [2/2] [UU]
      
unused devices: <none>
centrum:~# cat /sys/devices/pci0000:00/0000:00:12.0/host4/target4:0:0/4:0:0:0/model
SAMSUNG HD403LJ 
centrum:~# cat /sys/devices/pci0000:00/0000:00:12.0/host2/target2:0:0/2:0:0:0/model
SAMSUNG HD403LJ 
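
As an aside, the mapping from md arrays to member partitions can be read
off `/proc/mdstat` mechanically. The sketch below is illustrative only: the
function name is hypothetical, and it assumes active arrays whose members
are written as `name[slot]` with no status flags, as in the output above.

```python
import re

def parse_mdstat(text):
    """Return {array: {"level": ..., "members": [...]}} from /proc/mdstat text."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\d+) : (\w+) (\w+) (.+)$", line)
        if m:
            name, _state, level, rest = m.groups()
            # Members look like 'sda3[0]'; strip the slot index in brackets.
            members = [re.sub(r"\[\d+\]$", "", dev) for dev in rest.split()]
            arrays[name] = {"level": level, "members": members}
    return arrays

sample = """\
md2 : active raid1 sda3[0] sdb3[1]
      388339136 blocks [2/2] [UU]
md0 : active raid1 sda1[0] sdb1[1]
      2102464 blocks [2/2] [UU]
"""
for name, info in parse_mdstat(sample).items():
    print(name, info["level"], info["members"])
```

Combined with the `pvs` output, this shows that rootvg sits on md2, which
mirrors sda3 and sdb3.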

Bye,
Giuseppe

* Re: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-22  9:00 ` Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205] Ian Campbell
  2011-08-22  9:49   ` Giuseppe Sacco
@ 2011-08-24 20:24   ` Konrad Rzeszutek Wilk
  2011-08-24 21:24     ` Ian Campbell
  1 sibling, 1 reply; 12+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-08-24 20:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: 638172, Ben Hutchings, xen-devel, Giuseppe Sacco

On Mon, Aug 22, 2011 at 10:00:11AM +0100, Ian Campbell wrote:
> @xen-devel:
> 
> Does this look familiar to anyone, this is (I expect, hopefully Giuseppe
> will confirm) from Debian Squeeze which has a Xen 4.0.x with a PVops
> dom0 kernel based on xen.git from last summer (e73f4955a821) with more
> recent upstream longterm kernels (up to and including 2.6.32.41) merged
> in. While it does seem to have the switch from level to edge triggered
> interrupt the Debian kernel doesn't appear to have the switch to fasteoi
> for pirqs (0672fb44a111 plus a few followups) -- could that be related
> to this? (I'm not sure if that was a cleanup or a fix)

It was a fix. We had some interrupts getting wedged, but I don't recall
the stack exactly. There are also some follow-ups, such as
e5ac0bda96c495321dbad9b57a4b1a93a5a72e7f
7e186bdd0098b34c69fb8067c67340ae610ea499

> 
> Might the tsc unstable message be relevant?

Hm, not sure. I keep on getting those on my guests but life seems to go on.


The interesting thing about the stack trace is that it looks similar to:

http://groups.google.com/group/linux.kernel/browse_thread/thread/39a397566cafc979

which has some fixes https://patchwork.kernel.org/patch/1091772/
but they may not help.
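
For anyone wanting to compare two such traces by function name rather than
by eye, a rough Python sketch follows. It is an illustration only: the
regular expression is an assumption based on the frame format in the log
quoted in this thread, and the helper names are hypothetical.

```python
import re

# Matches entries such as '[<c1132541>] ? __blk_run_queue+0x2e/0x5a',
# with an optional trailing '[module]'.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] \?? ?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)

def trace_symbols(log_text):
    """Return (symbol, module-or-None) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in FRAME.finditer(log_text)]

def shared_symbols(trace_a, trace_b):
    """Symbols present in both traces, keeping the order of trace_a."""
    in_b = {sym for sym, _ in trace_symbols(trace_b)}
    return [sym for sym, _ in trace_symbols(trace_a) if sym in in_b]

sample = (
    " [<ed93f457>] ? scsi_request_fn+0x440/0x47a [scsi_mod]\n"
    " [<c1132541>] ? __blk_run_queue+0x2e/0x5a\n"
)
print(trace_symbols(sample))
```

A large overlap in symbols only suggests that two reports share a code
path; it does not show that they share a root cause.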

* Re: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-24 20:24   ` Konrad Rzeszutek Wilk
@ 2011-08-24 21:24     ` Ian Campbell
  2011-08-25  6:52       ` Bug#638172: [Xen-devel] " Giuseppe Sacco
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2011-08-24 21:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Giuseppe Sacco, xen-devel, Ben Hutchings, 638172


On Wed, 2011-08-24 at 21:24 +0100, Konrad Rzeszutek Wilk wrote:
> On Mon, Aug 22, 2011 at 10:00:11AM +0100, Ian Campbell wrote:
> > @xen-devel:
> > 
> > Does this look familiar to anyone, this is (I expect, hopefully Giuseppe
> > will confirm) from Debian Squeeze which has a Xen 4.0.x with a PVops
> > dom0 kernel based on xen.git from last summer (e73f4955a821) with more
> > recent upstream longterm kernels (up to and including 2.6.32.41) merged
> > in. While it does seem to have the switch from level to edge triggered
> > interrupt the Debian kernel doesn't appear to have the switch to fasteoi
> > for pirqs (0672fb44a111 plus a few followups) -- could that be related
> > to this? (I'm not sure if that was a cleanup or a fix)
> 
> It was a fix. We had some interrupts getting wedged - but I don't recall
> the stack exactly.

OK, sounds very much like those fixes are worth a try then. Thanks.

>  But there are some follows - like
> e5ac0bda96c495321dbad9b57a4b1a93a5a72e7f
> 7e186bdd0098b34c69fb8067c67340ae610ea499

The list of changesets against drivers/xen/events.c which are not in the
Debian kernel which I came up with is below [0]. A small number are
false positives (Debian already got them via the longterm branches) but
most are not.

The majority look like real fixes to me either for this particular issue
or other problems. I would consider them all candidates for inclusion in
a future update of the Debian kernel.

Giuseppe, are you able to reproduce the issue you are seeing at will? If
I build a test kernel, would you be able to try it? You are using a -686
kernel, right (as opposed to amd64)? Out of interest, which hypervisor
flavour do you use?

> The interesting about the stack trace is that it looks similiar to:
> 
> http://groups.google.com/group/linux.kernel/browse_thread/thread/39a397566cafc979
> 
> which has some fixes https://patchwork.kernel.org/patch/1091772/
> but they may not help.

Looks like it is an issue on native too. If it is an issue as far back
as 2.6.32 as well I expect we'll see the fix via the longterm channels
at some point.

Ian.

[0]

652c98bac315a2253628885f05cfd5f30b553ae5 xen: Use IRQF_FORCE_RESUME
f9f09329407e3a11140827ba71d8f9d9ede42823 xen: events: do not unmask event channels on resume
ea2020837ca7dc2c9bcfc477fb4d261cf067db4f xen: do not try to allocate the callback vector again at restore time
acad13511ebe1db666aab5807117d3ac647ea58d xen: events: Remove redundant clear of l2i at end of round-robin loop
0e2ec1fb16f9ca84f91de3d9427a0964d679738a xen: events: Make round-robin scan fairer by snapshotting each l2 word
188449f889c6c30709c7e9e8710b9eff14fd963f xen: events: Clean up round-robin evtchn scan.
1acdebd2d67f71d230f5857c28843e636b7dd92e xen: events: Make last processed event channel a per-cpu variable.
2d9c33e1b47b800e43a1444a65353fcb96e27165 xen: events: Process event channels notifications in round-robin order.
2b1c9503c615f68262ae2e96ee26ee128b486287 xen/events: only unmask irq if enabled
c756a6e7f711308ce85afc7d4c79213cce58a033 xen: allocate irq descriptors on any numa node
b1a003a2aa9ee0d3d69237725c91839f4b6a8559 xen/events: use locked set|clear_bit() for cpu_evtchn_mask
cca68cf2d344eb3c4ff996e99f36cf8f8382bc2b xen/evtchn: clear secondary CPUs' cpu_evtchn_mask[] after restore
c7ff70d2824191af119091d3af8db3bb57b06f77 xen: events: do not unmask event channels on resume
d4283609c7504309b8b93d7582857ff4623105f3 xen: improvements to VIRQ_DEBUG output
7c42097171f2e0beafa16e007a06e464b3014bea xen: correct parameter type for pirq_eoi
97708051c14157e95e25d112c26902f1c6fbb462 xen: ensure that all event channels start off bound to VCPU 0
e05885b24a55db82fbdb5cbc3f31426b976d7fc1 xen: set up IRQ before binding virq to evtchn
f0d4a0552f03b52027fb2c7958a1cbbe210cf418 xen/apic: fix pirq_eoi_gmfn resume
d2ea486300ca6e207ba178a425fbd023b8621bb1 xen/pirq: use fasteoi for MSI too
158d6550716687486000a828c601706b55322ad0 xen/pirq: use eoi as enable
2390c371ecd32d9f06e22871636185382bf70ab7 xen/events: use PHYSDEVOP_pirq_eoi_gmfn to get pirq need-EOI info
cb23e8d58ca35b6f9e10e1ea5682bd61f2442ebd xen/evtchn: correction, pirq hypercall does not unmask
43d8a5030a502074f3c4aafed4d6095ebd76067c xen/evtchn: pirq_eoi does unmask
f4526f9a78ffb3d3fc9f81636c5b0357fc1beccd xen/evtchn: make pirq enable/disable unmask/mask
c6a16a778f86699b339585ba5b9197035d77c40f xen/evtchn: rename retrigger_dynirq -> irq
d0936845a856816af2af48ddf019366be68e96ba xen/evtchn: rename enable/disable_dynirq -> unmask/mask_irq
2789ef00cbe2cdb38deb30ee4085b88befadb1b0 xen: make pirq interrupts use fasteoi
0672fb44a111dfb6386022071725c5b15c9de584 xen/events: change to using fasteoi
9fa90aa72d6af5cc2c2eddf56f9a586035e13ae7 xen: use dynamic_irq_init_keep_chip_data
f55ce8740101c54016544a0d633dc1b6b21244ae Introduce CONFIG_XEN_PVHVM compile option
f61692642a2a2b83a52dd7e64619ba3bb29998af xen/pirq: do EOI properly for pirq events
47cd3eb068a8a0cea124495e525ac16876fa08f6 xen/pci: fix compile error when CONFIG_PCI_XEN disabled
29a2e2a7bd19233c62461b104c69233f15ce99ec xen/apic: use handle_edge_irq for pirq events
6dc7b8080195ed43ee6de5b1d60c65aa719208ad xen/irq: replace boot boot allocator
66fd3052fec7e7c21a9d88ba1a03bc062f5fb53d xen: handle events as edge-triggered
8401e9b96f80f9c0128e7c8fc5a01abfabbfa021 xen: use percpu interrupts for IPIs and VIRQs

-- 
Ian Campbell


A Fortran compiler is the hobgoblin of little minis.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

* Bug#638172: [Xen-devel] Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-24 21:24     ` Ian Campbell
@ 2011-08-25  6:52       ` Giuseppe Sacco
  2011-08-25  6:56         ` Ian Campbell
  0 siblings, 1 reply; 12+ messages in thread
From: Giuseppe Sacco @ 2011-08-25  6:52 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Konrad Rzeszutek Wilk, 638172, Ben Hutchings, xen-devel

On Wed, 24/08/2011 at 22.24 +0100, Ian Campbell wrote:
[...]
> Giuseppe, are you able to reproduce the issue you are seeing at will? If
> I build a test kernel would you be able to try it? You are using a -686
> kernel right (as opposed to amd64). OOI which hypervisor flavour do you
> use?

Unfortunately not. This server crashes often, but I do not have any
way to make it crash on demand. In the worst case it crashed just after
a reboot; at the other extreme it ran for a month after a reboot.

But I can run a different kernel and let the server go until it
crashes: rebooting it for a kernel update is no problem.

And yes, it is a 32-bit Debian squeeze system on a 64-bit Athlon CPU.

Bye,
Giuseppe

* Re: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-25  6:52       ` Bug#638172: [Xen-devel] " Giuseppe Sacco
@ 2011-08-25  6:56         ` Ian Campbell
  2011-08-25  7:23           ` Bug#638172: [Xen-devel] " Giuseppe Sacco
  2011-08-26  7:25           ` Bug#638172: " Ian Campbell
  0 siblings, 2 replies; 12+ messages in thread
From: Ian Campbell @ 2011-08-25  6:56 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: 638172, Ben Hutchings, xen-devel, Konrad Rzeszutek Wilk


On Thu, 2011-08-25 at 08:52 +0200, Giuseppe Sacco wrote:
> On Wed, 24/08/2011 at 22.24 +0100, Ian Campbell wrote:
> [...]
> > Giuseppe, are you able to reproduce the issue you are seeing at will? If
> > I build a test kernel would you be able to try it? You are using a -686
> > kernel right (as opposed to amd64). OOI which hypervisor flavour do you
> > use?
> 
> Unfortunately not. This server crashes often, but I do not have any
> method to let it crashes. The worst case is when it crashes just after
> reboot, the opposite was after a month from reboot.
> 
> But, I may use a different kernel and let the server goes until crashes:
> no problem in rebooting it for kernel update.

Thanks, that would be useful. I'll put something together and let you
know.

> And yes, it is a 32bit debian squeeze system on a 64 bit Athlon cpu.

But are you running the amd64 or 686 flavour of the hypervisor? Both are
available in 32bit Debian. FWIW I would always recommend running the 64
bit hypervisor (even with 32 bit dom0) if you are able to.

Ian.

-- 
Ian Campbell


Once the toothpaste is out of the tube, it's hard to get it back in.
		-- H. R. Haldeman

* Bug#638172: [Xen-devel] Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-25  6:56         ` Ian Campbell
@ 2011-08-25  7:23           ` Giuseppe Sacco
  2011-08-26  7:25           ` Bug#638172: " Ian Campbell
  1 sibling, 0 replies; 12+ messages in thread
From: Giuseppe Sacco @ 2011-08-25  7:23 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Konrad Rzeszutek Wilk, 638172, Ben Hutchings, xen-devel

On Thu, 25/08/2011 at 07.56 +0100, Ian Campbell wrote:
[...]
> > And yes, it is a 32bit debian squeeze system on a 64 bit Athlon cpu.
> 
> But are you running the amd64 or 686 flavour of the hypervisor? Both are
> available in 32bit Debian. FWIW I would always recommend running the 64
> bit hypervisor (even with 32 bit dom0) if you are able to.

The only hypervisor package installed is xen-hypervisor-4.0-i386. I was
not aware that it is possible to use the amd64 version when running a
32-bit dom0 kernel. If you suggest it, I can switch to the 64-bit
version when I install your patched kernel.

Thanks,
Giuseppe

* Re: Bug#638172: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-25  6:56         ` Ian Campbell
  2011-08-25  7:23           ` Bug#638172: [Xen-devel] " Giuseppe Sacco
@ 2011-08-26  7:25           ` Ian Campbell
  2011-08-26  8:28             ` Giuseppe Sacco
  1 sibling, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2011-08-26  7:25 UTC (permalink / raw)
  To: 638172; +Cc: Konrad Rzeszutek Wilk, Ben Hutchings, xen-devel, Giuseppe Sacco


Hi Giuseppe,

On Thu, 2011-08-25 at 07:56 +0100, Ian Campbell wrote:
> 
> > But, I may use a different kernel and let the server goes until crashes:
> > no problem in rebooting it for kernel update.
> 
> Thanks that would be useful, I'll put something together and let you
> know.

I'm in the process of uploading a kernel to
http://xenbits.xen.org/people/ianc/2.6.32-36~xen0/ which has a bunch of
patches to the event channel (aka IRQ) subsystem backported. I think the
kernel flavour you want is there already. Could you please give it a go
when you get the chance?

Thanks,
Ian.

-- 
Ian Campbell


On-line, adj.:
	The idea that a human being should always be accessible to a computer.

* Re: Bug#638172: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-26  7:25           ` Bug#638172: " Ian Campbell
@ 2011-08-26  8:28             ` Giuseppe Sacco
  2011-08-26  8:31               ` Ian Campbell
  2011-10-18 10:54               ` Ian Campbell
  0 siblings, 2 replies; 12+ messages in thread
From: Giuseppe Sacco @ 2011-08-26  8:28 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Konrad Rzeszutek Wilk, Ben Hutchings, xen-devel, 638172

Hi,
I just installed the new kernel and switched to the 64-bit hypervisor.
I'll let you know if there is any news.

On Fri, 26/08/2011 at 08.25 +0100, Ian Campbell wrote:
[...]
> I'm in the process of uploading a kernel to
> http://xenbits.xen.org/people/ianc/2.6.32-36~xen0/ which has a bunch of
> patches to the event channel (aka IRQ) subsystem backported. I think the
> kernel flavour you want is there already please could you give it a go
> when you get the chance.

During boot I got this message. Is it related to this bug or to the new
kernel?

[    0.004000] ------------[ cut here ]------------
[    0.004000] WARNING: at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/arch/x86/xen/enlighten.c:726 perf_events_lapic_init+0x28/0x29()
[    0.004000] Hardware name: MS-7368
[    0.004000] Modules linked in:
[    0.004000] Pid: 0, comm: swapper Not tainted 2.6.32-5-xen-686 #1
[    0.004000] Call Trace:
[    0.004000]  [<c1037839>] ? warn_slowpath_common+0x5e/0x8a
[    0.004000]  [<c103786f>] ? warn_slowpath_null+0xa/0xc
[    0.004000]  [<c1011db0>] ? perf_events_lapic_init+0x28/0x29
[    0.004000]  [<c14033dd>] ? init_hw_perf_events+0x2dd/0x376
[    0.004000]  [<c1403030>] ? check_bugs+0x8/0xd8
[    0.004000]  [<c13fb808>] ? start_kernel+0x309/0x31d
[    0.004000]  [<c13fd410>] ? xen_start_kernel+0x564/0x56b
[    0.004000]  [<c1409045>] ? check_nmi_watchdog+0xcd/0x1f2
[    0.004000] ---[ end trace a7919e7f17c0a725 ]---

Thanks,
Giuseppe

* Re: Bug#638172: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-26  8:28             ` Giuseppe Sacco
@ 2011-08-26  8:31               ` Ian Campbell
  2011-10-18 10:54               ` Ian Campbell
  1 sibling, 0 replies; 12+ messages in thread
From: Ian Campbell @ 2011-08-26  8:31 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: Konrad Rzeszutek Wilk, Ben Hutchings, xen-devel, 638172

On Fri, 2011-08-26 at 10:28 +0200, Giuseppe Sacco wrote:
> Hi,
> I just installed the new kernel and switched to 64bit hypervisor. I'll
> let you know about any news.
> 
> On Fri, 26/08/2011 at 08.25 +0100, Ian Campbell wrote:
> [...]
> > I'm in the process of uploading a kernel to
> > http://xenbits.xen.org/people/ianc/2.6.32-36~xen0/ which has a bunch of
> > patches to the event channel (aka IRQ) subsystem backported. I think the
> > kernel flavour you want is there already please could you give it a go
> > when you get the chance.
> 
> During boot I got this message. Is this related to this bug or to new
> kernel?

It's a benign (but annoying) warning. I'm angling to get it dropped:
http://marc.info/?l=xen-devel&m=131400621622691

> 
> [    0.004000] ------------[ cut here ]------------
> [    0.004000] WARNING: at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/arch/x86/xen/enlighten.c:726 perf_events_lapic_init+0x28/0x29()
> [    0.004000] Hardware name: MS-7368
> [    0.004000] Modules linked in:
> [    0.004000] Pid: 0, comm: swapper Not tainted 2.6.32-5-xen-686 #1
> [    0.004000] Call Trace:
> [    0.004000]  [<c1037839>] ? warn_slowpath_common+0x5e/0x8a
> [    0.004000]  [<c103786f>] ? warn_slowpath_null+0xa/0xc
> [    0.004000]  [<c1011db0>] ? perf_events_lapic_init+0x28/0x29
> [    0.004000]  [<c14033dd>] ? init_hw_perf_events+0x2dd/0x376
> [    0.004000]  [<c1403030>] ? check_bugs+0x8/0xd8
> [    0.004000]  [<c13fb808>] ? start_kernel+0x309/0x31d
> [    0.004000]  [<c13fd410>] ? xen_start_kernel+0x564/0x56b
> [    0.004000]  [<c1409045>] ? check_nmi_watchdog+0xcd/0x1f2
> [    0.004000] ---[ end trace a7919e7f17c0a725 ]---
> 
> Thanks,
> Giuseppe
> 
> 

-- 
Ian Campbell

May your Tongue stick to the Roof of your Mouth with the Force of a
Thousand Caramels.

* Re: Bug#638172: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-08-26  8:28             ` Giuseppe Sacco
  2011-08-26  8:31               ` Ian Campbell
@ 2011-10-18 10:54               ` Ian Campbell
  2011-10-18 11:00                 ` Giuseppe Sacco
  1 sibling, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2011-10-18 10:54 UTC (permalink / raw)
  To: Giuseppe Sacco; +Cc: Konrad Rzeszutek Wilk, Ben Hutchings, xen-devel, 638172

On Fri, 2011-08-26 at 10:28 +0200, Giuseppe Sacco wrote:
> I just installed the new kernel and switched to 64bit hypervisor. I'll
> let you know about any news.

Has everything been OK since you switched?

Ian.

> 
> On Fri, 26/08/2011 at 08.25 +0100, Ian Campbell wrote:
> [...]
> > I'm in the process of uploading a kernel to
> > http://xenbits.xen.org/people/ianc/2.6.32-36~xen0/ which has a bunch of
> > patches to the event channel (aka IRQ) subsystem backported. I think the
> > kernel flavour you want is there already please could you give it a go
> > when you get the chance.
> 
> During boot I got this message. Is this related to this bug or to new
> kernel?
> 
> [    0.004000] ------------[ cut here ]------------
> [    0.004000] WARNING: at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/arch/x86/xen/enlighten.c:726 perf_events_lapic_init+0x28/0x29()
> [    0.004000] Hardware name: MS-7368
> [    0.004000] Modules linked in:
> [    0.004000] Pid: 0, comm: swapper Not tainted 2.6.32-5-xen-686 #1
> [    0.004000] Call Trace:
> [    0.004000]  [<c1037839>] ? warn_slowpath_common+0x5e/0x8a
> [    0.004000]  [<c103786f>] ? warn_slowpath_null+0xa/0xc
> [    0.004000]  [<c1011db0>] ? perf_events_lapic_init+0x28/0x29
> [    0.004000]  [<c14033dd>] ? init_hw_perf_events+0x2dd/0x376
> [    0.004000]  [<c1403030>] ? check_bugs+0x8/0xd8
> [    0.004000]  [<c13fb808>] ? start_kernel+0x309/0x31d
> [    0.004000]  [<c13fd410>] ? xen_start_kernel+0x564/0x56b
> [    0.004000]  [<c1409045>] ? check_nmi_watchdog+0xcd/0x1f2
> [    0.004000] ---[ end trace a7919e7f17c0a725 ]---
> 
> Thanks,
> Giuseppe
> 
> 

-- 
Ian Campbell
Current Noise: Anathema - Flying

There's no future in time travel.

* Re: Bug#638172: Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
  2011-10-18 10:54               ` Ian Campbell
@ 2011-10-18 11:00                 ` Giuseppe Sacco
  0 siblings, 0 replies; 12+ messages in thread
From: Giuseppe Sacco @ 2011-10-18 11:00 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Konrad Rzeszutek Wilk, Ben Hutchings, xen-devel, 638172

On Tue, 18/10/2011 at 11.54 +0100, Ian Campbell wrote:
> On Fri, 2011-08-26 at 10:28 +0200, Giuseppe Sacco wrote:
> > I just installed the new kernel and switched to 64bit hypervisor. I'll
> > let you know about any news.
> 
> Has everything been OK since you switched?

Yes: no crashes since then.

Thanks,
Giuseppe
