* 4.2.2 pci-passthrough crashes Dell Poweredge R710
@ 2013-05-14  8:29 Alexander Bienzeisler
  2013-05-14 13:32 ` Jan Beulich
  2013-05-15  0:31 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 6+ messages in thread
From: Alexander Bienzeisler @ 2013-05-14  8:29 UTC (permalink / raw)
  To: xen-devel

Hello everyone,

I just updated from Xen 4.2.1 to 4.2.2. If I try to fire up my win2k8 domU 
with a PCI device attached, the dom0 machine hard-crashes.

My system log (iDRAC) shows the following:

CPU 2 has an internal error (IERR).
A bus fatal error was detected on a component at bus 0 device 0 function 0.
CPU 1 machine check detected.

along with plenty of other entries. The machine then hard-resets.
If I leave the faulty machine down after a reboot, nothing like this 
happens.

xl info:
> host                   : susi-0
> release                : 3.8.2-ipmi
> version                : #4 SMP Mon Mar 11 12:54:31 CET 2013
> machine                : x86_64
> nr_cpus                : 12
> max_cpu_id             : 31
> nr_nodes               : 2
> cores_per_socket       : 6
> threads_per_core       : 1
> cpu_mhz                : 3325
> hw_caps                : 
> bfebfbff:2c100800:00000000:00003f40:029ee3ff:00000000:00000001:00000000
> virt_caps              : hvm
> total_memory           : 98291
> free_memory            : 62390
> sharing_freed_memory   : 0
> sharing_used_memory    : 0
> free_cpus              : 0
> xen_major              : 4
> xen_minor              : 2
> xen_extra              : .2
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 
> hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler          : credit
> xen_pagesize           : 4096
> platform_params        : virt_start=0xffff800000000000
> xen_changeset          : unavailable
> xen_commandline        : placeholder loglvl=all dom0_mem=2048M 
> dom0_max_vcpus=2 com2=115200 console=com2,vga
> cc_compiler            : gcc (Debian 4.4.5-8) 4.4.5
> cc_compile_by          : root
> cc_compile_domain      : wsk.tu-chemnitz.de
> cc_compile_date        : Tue May 14 09:16:43 CEST 2013
> xend_config_format     : 4

* Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
  2013-05-14  8:29 4.2.2 pci-passthrough crashes Dell Poweredge R710 Alexander Bienzeisler
@ 2013-05-14 13:32 ` Jan Beulich
  2013-05-15  0:31 ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2013-05-14 13:32 UTC (permalink / raw)
  To: Alexander Bienzeisler; +Cc: xen-devel

>>> On 14.05.13 at 10:29, Alexander Bienzeisler <chosi@amd.co.at> wrote:
> I just updated from Xen 4.2.1 to 4.2.2. If I try to fire up my win2k8 domU 
> with a PCI device attached, the dom0 machine hard-crashes.
> 
> My system log (iDRAC) shows the following:
> 
> CPU 2 has an internal error (IERR).
> A bus fatal error was detected on a component at bus 0 device 0 function 0.
> CPU 1 machine check detected.

Machine checks and CPU internal errors aren't normally caused by
software, but in most cases point at faulty hardware. In any case,
if you're suspecting Xen, you'd need to provide full logs (hypervisor
and kernel) covering the crash, with suitable debugging options
enabled.

Jan

* Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
  2013-05-14  8:29 4.2.2 pci-passthrough crashes Dell Poweredge R710 Alexander Bienzeisler
  2013-05-14 13:32 ` Jan Beulich
@ 2013-05-15  0:31 ` Konrad Rzeszutek Wilk
  2013-05-15  1:58   ` Alexander Bienzeisler
  1 sibling, 1 reply; 6+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-15  0:31 UTC (permalink / raw)
  To: Alexander Bienzeisler; +Cc: xen-devel

On Tue, May 14, 2013 at 10:29:27AM +0200, Alexander Bienzeisler wrote:
> Hello everyone,
> 
> I just updated from Xen 4.2.1 to 4.2.2. If I try to fire up my win2k8
> domU with a PCI device attached, the dom0 machine hard-crashes.

How do you pass in the PCI device? I don't see in your xl info the
capability listed to do VT-d. Does your machine do VT-d?
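
One way to check this from dom0, as a minimal sketch assuming the xl toolstack: when Xen has an IOMMU in use, the virt_caps line of xl info lists hvm_directio next to hvm. The helper name has_directio is made up for illustration.

```shell
# Succeeds if the "virt_caps" line of `xl info` output (read from stdin)
# lists hvm_directio, which Xen reports when an IOMMU such as VT-d is in use.
has_directio() {
    grep -E '^virt_caps[[:space:]]*:' | grep -qw 'hvm_directio'
}

# Usage on a live dom0 (needs root and the xl toolstack):
#   xl info | has_directio && echo "IOMMU in use" || echo "no hvm_directio"
```

Run against the xl info output quoted in this thread (virt_caps: hvm), the check would fail.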

* Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
  2013-05-15  0:31 ` Konrad Rzeszutek Wilk
@ 2013-05-15  1:58   ` Alexander Bienzeisler
  2013-05-15  6:32     ` Pasi Kärkkäinen
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Bienzeisler @ 2013-05-15  1:58 UTC (permalink / raw)
  To: xen-devel

Hello Konrad,

I pass it exactly the same way I did with 4.2.1; I have changed nothing. 
I have now reverted to 4.2.1 and no longer see the crash. The 
remaining problem is that the hard crashes seem to have corrupted 
my HVM LV filesystems, so things are broken now. However, that is 
unrelated. I don't see the crash with 4.2.1.

I pass the device through like this:

module options: xen-pciback.hide=(00:1a.0), which is a USB controller
win2k8.cfg (xl): pci = [ '00:1a.0' ]
xl pci-assignable-list output: 0000:00:1a.0
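
A rough sketch for double-checking that the device really is held by pciback before starting the guest, assuming a Linux dom0 with sysfs mounted at /sys. The function name bound_driver is hypothetical, and the driver name pciback is what xen-pciback usually registers, which may differ on other kernels.

```shell
# Print the name of the kernel driver bound to a PCI device, or "none".
#   $1: full PCI address (domain:bus:device.function), e.g. 0000:00:1a.0
#   $2: sysfs mount point; defaults to /sys (overridable for testing)
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver's name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Usage in dom0:
#   bound_driver 0000:00:1a.0
```

If this prints something other than pciback (ehci_hcd, for example), the hide option probably did not take effect.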

It worked flawlessly before and started breaking with 4.2.2. I'm 
actually pretty sure this is not a hardware issue, since I can't 
reproduce it with 4.2.1. I think I might lack the skills to provide 
proper logs and crash dumps of any sort. That's all I can tell you 
right now.

cheers,
Alex


* Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
  2013-05-15  1:58   ` Alexander Bienzeisler
@ 2013-05-15  6:32     ` Pasi Kärkkäinen
  2013-05-15  8:33       ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Pasi Kärkkäinen @ 2013-05-15  6:32 UTC (permalink / raw)
  To: Alexander Bienzeisler; +Cc: xen-devel

On Wed, May 15, 2013 at 03:58:19AM +0200, Alexander Bienzeisler wrote:
> Hello Konrad,
> 
> I pass it exactly the same way I did with 4.2.1; I have changed
> nothing. I have now reverted to 4.2.1 and no longer see the
> crash. The remaining problem is that the hard crashes seem to
> have corrupted my HVM LV filesystems, so things are broken
> now. However, that is unrelated. I don't see the crash with 4.2.1.
> 
> I pass the device through like this:
> 
> module options: xen-pciback.hide=(00:1a.0), which is a USB controller
> win2k8.cfg (xl): pci = [ '00:1a.0' ]
> xl pci-assignable-list output: 0000:00:1a.0
> 
> It worked flawlessly before and started breaking with 4.2.2. I'm
> actually pretty sure this is not a hardware issue, since I can't
> reproduce it with 4.2.1. I think I might lack the skills to provide
> proper logs and crash dumps of any sort. That's all I can tell
> you right now.
>

The subject says you're using a Dell R710 server, so you should enable SOL
(Serial Over LAN) on the iDRAC management processor. Xen sees the SOL device
as a serial port, so configure the Xen hypervisor to log everything to that serial port,
and configure the dom0 Linux kernel to log to the Xen hypervisor console.

Then ssh to the iDRAC and connect to the SOL console (iDRAC6 command: "console com2").
Make sure your ssh client has a large enough scrollback buffer (or redirect the ssh output to a file)
so you can capture a full log with all the boot-time and crash-time output from both Xen and dom0 Linux.
Power-on (or restart) your R710 server and you're ready to reproduce the crash.
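
For the "redirect the ssh output to a file" part, something along these lines might do. It is only a sketch: the function name capture_session and the host name in the usage comment are placeholders, and it assumes tee is available on the machine you ssh from.

```shell
# Run a command and keep a copy of everything it prints.
#   $1:   log file (overwritten on each run)
#   rest: the command to run, e.g. an ssh session to the iDRAC
capture_session() {
    logfile=$1
    shift
    # Merge stderr into stdout so both end up on screen and in the log.
    "$@" 2>&1 | tee "$logfile"
}

# Usage (user and host are placeholders):
#   capture_session "sol-$(date +%Y%m%d-%H%M%S).log" ssh root@idrac.example.org
#   ...then type "console com2" at the iDRAC prompt.
```

Because the log is written as the output arrives, whatever was printed up to the moment of the crash stays on disk even if the session hangs afterwards.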

See: http://wiki.xen.org/wiki/Xen_Serial_Console

So you probably need to add these options to grub (modify to match your iDRAC/BIOS settings):

for xen.gz: loglvl=all guest_loglvl=all com2=115200,8n1 console=com2 sync_console lapic=debug apic_verbosity=debug apic=debug iommu=verbose
for vmlinuz: earlyprintk=xen console=hvc0 initcall_debug debug loglevel=10
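
As a sketch of where these would go, assuming a GRUB 2 setup that generates its Xen entries from /etc/default/grub (the variable names below are the ones GRUB 2's Xen script reads; your installation may differ):

```shell
# /etc/default/grub (fragment); regenerate grub.cfg afterwards, e.g. with update-grub.
# Options for the Xen hypervisor (xen.gz), copied from the list above:
GRUB_CMDLINE_XEN_DEFAULT="loglvl=all guest_loglvl=all com2=115200,8n1 console=com2 sync_console lapic=debug apic_verbosity=debug apic=debug iommu=verbose"
# Options for the dom0 Linux kernel (vmlinuz):
GRUB_CMDLINE_LINUX_DEFAULT="earlyprintk=xen console=hvc0 initcall_debug debug loglevel=10"
```

With GRUB legacy there is no such file; the same options would instead be appended to the xen.gz kernel line and the vmlinuz module line of the boot entry in menu.lst. Note that GRUB_CMDLINE_LINUX_DEFAULT also applies to non-Xen boot entries.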

Hopefully that helps..

-- Pasi

* Re: 4.2.2 pci-passthrough crashes Dell Poweredge R710
  2013-05-15  6:32     ` Pasi Kärkkäinen
@ 2013-05-15  8:33       ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2013-05-15  8:33 UTC (permalink / raw)
  To: Alexander Bienzeisler, Pasi Kärkkäinen; +Cc: xen-devel

>>> On 15.05.13 at 08:32, Pasi Kärkkäinen<pasik@iki.fi> wrote:
> for xen.gz: loglvl=all guest_loglvl=all com2=115200,8n1 console=com2 
> sync_console lapic=debug apic_verbosity=debug apic=debug iommu=verbose

That last element should be "iommu=debug", which on 4.2.2 and
-unstable implies verbose.
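
If the options are already sitting in a config file, a small filter like this could make the swap. It is only an illustration; the function name use_iommu_debug is made up, and it assumes a sed that supports -E.

```shell
# Read a command line on stdin and replace a standalone "iommu=verbose"
# token with "iommu=debug", leaving every other option untouched.
use_iommu_debug() {
    sed -E 's/(^|[[:space:]])iommu=verbose([[:space:]]|$)/\1iommu=debug\2/'
}

# Usage:
#   echo "console=com2 sync_console iommu=verbose" | use_iommu_debug
```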

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
