On 2012-03-30 22:13, Jason Baron wrote: > On Fri, Mar 30, 2012 at 09:29:23PM +0200, Jan Kiszka wrote: >> On 2012-03-30 21:18, Jason Baron wrote: >>> We've hit a kernel host panic, when issuing a 'system_reset' with a 82576 nic >>> assigned and a Windows guest. Host system is a PowerEdge R815. >>> >>> [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 32993 >>> [Hardware Error]: APEI generic hardware error status >>> [Hardware Error]: severity: 1, fatal >>> [Hardware Error]: section: 0, severity: 1, fatal >>> [Hardware Error]: flags: 0x01 >>> [Hardware Error]: primary >>> [Hardware Error]: section_type: PCIe error >>> [Hardware Error]: port_type: 0, PCIe end point >>> [Hardware Error]: version: 1.0 >>> [Hardware Error]: command: 0x0000, status: 0x0010 >>> [Hardware Error]: device_id: 0000:08:00.0 >>> [Hardware Error]: slot: 1 >>> [Hardware Error]: secondary_bus: 0x00 >>> [Hardware Error]: vendor_id: 0x8086, device_id: 0x10c9 >>> [Hardware Error]: class_code: 000002 >>> [Hardware Error]: aer_status: 0x00100000, aer_mask: 0x00018000 >>> [Hardware Error]: Unsupported Request >>> [Hardware Error]: aer_layer=Transaction Layer, aer_agent=Requester ID >>> [Hardware Error]: aer_uncor_severity: 0x00067011 >>> [Hardware Error]: aer_tlp_header: 40001001 0020000f edbf800c 01000000 >>> [Hardware Error]: section: 1, severity: 1, fatal >>> [Hardware Error]: flags: 0x01 >>> [Hardware Error]: primary >>> [Hardware Error]: section_type: PCIe error >>> [Hardware Error]: port_type: 0, PCIe end point >>> [Hardware Error]: version: 1.0 >>> [Hardware Error]: command: 0x0000, status: 0x0010 >>> [Hardware Error]: device_id: 0000:08:00.0 >>> [Hardware Error]: slot: 1 >>> [Hardware Error]: secondary_bus: 0x00 >>> [Hardware Error]: vendor_id: 0x8086, device_id: 0x10c9 >>> [Hardware Error]: class_code: 000002 >>> [Hardware Error]: aer_status: 0x00100000, aer_mask: 0x00018000 >>> [Hardware Error]: Unsupported Request >>> [Hardware Error]: aer_layer=Transaction Layer, aer_agent=Requester ID >>> [Hardware Error]: aer_uncor_severity: 0x00067011 >>> [Hardware Error]: aer_tlp_header: 40001001 0020000f edbf800c 01000000 >>> Kernel panic - not syncing: Fatal hardware error! >>> Pid: 0, comm: swapper Not tainted 2.6.32-242.el6.x86_64 #1 >>> Call Trace: >>> [] ? panic+0xa0/0x168 >>> [] ? ghes_notify_nmi+0x17c/0x180 >>> [] ? notifier_call_chain+0x55/0x80 >>> [] ? atomic_notifier_call_chain+0x1a/0x20 >>> [] ? notify_die+0x2e/0x30 >>> [] ? do_nmi+0x1a1/0x2b0 >>> [] ? nmi+0x20/0x30 >>> [] ? native_safe_halt+0xb/0x10 >>> <> [] ? default_idle+0x4d/0xb0 >>> [] ? cpu_idle+0xb6/0x110 >>> [] ? rest_init+0x7a/0x80 >>> [] ? start_kernel+0x424/0x430 >>> [] ? x86_64_start_reservations+0x125/0x129 >>> [] ? x86_64_start_kernel+0xfa/0x109 >>> >>> The root cause of the problem is that the 'reset_assigned_device()' code >>> first writes a 0 to the command register. Then, when qemu subsequently does >>> a kvm_deassign_irq() (called by assign_irq(), in the system_reset path), >>> the kernel ends up calling '__msix_mask_irq()', which performs a write to >>> the memory mapped msi vector space. Since, we've explicitly told the device >>> to disallow mmio access (via the 0 write to the command register), we end >>> up with the above 'Unsupported Request'. >>> >>> The fix here is to first call kvm_deassign_irq(), before doing the reset, >> >> s/fix/workaround/. This is a kernel bug if userspace can crash the >> system like this, no? Let's fix the kernel first and then look at what >> needs to be changed here. >> >> Jan >> > > But don't I need special privalege to run the device assignment bits? Yes, but even that might be moderated by a management component like libvirt. > For example, this crash is precipitated by a write of '0' to the pci > device config register from userspace. Surely, not every is allowed to > do that write. So it seems to me, that this patch is in keeping with the > current model of how things work. No user should needlessly be able to crash the host by issuing valid commands in a special order. Jan