All of lore.kernel.org
 help / color / mirror / Atom feed
* PCI BAR register space written with garbage in HVM guest.
@ 2010-03-16  1:09 Dan Gora
  2010-03-16  1:48 ` Konrad Rzeszutek Wilk
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Dan Gora @ 2010-03-16  1:09 UTC (permalink / raw)
  To: xen-devel

Hi All,

I'm having a problem where if I pass through two instances of my
device to a HVM domU, one of the board instances is having it's PCI
BAR registers overwritten with garbage by some unknown actor 30
seconds to a minute after I load my driver.  I cannnot for the life of
me find what might possibly be overwriting the BAR registers.

I've added a debugging printf to XEN in
xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR
address space being overwritten with garbage:

(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0
bytes=4 value=0xffffffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0
bytes=4 value=0x1600ffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0
bytes=4 value=0x64d5323e
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0
bytes=4 value=0x450008
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0
bytes=4 value=0xa7e54002
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0
bytes=4 value=0x11400000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0
bytes=4 value=0x693
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0
bytes=4 value=0xffff0000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0
bytes=4 value=0x4400ffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0
bytes=4 value=0x2c024300
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0
bytes=4 value=0x1012dac
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0
bytes=4 value=0xa1c30006
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080030 offset=0x0
bytes=4 value=0xa00040d
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080034 offset=0x0
bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080038 offset=0x0
bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008003c offset=0x0
bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080040 offset=0x0
bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080044 offset=0x0
bytes=4 value=0x16000000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080048 offset=0x0
bytes=4 value=0x64d5323e
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008004c offset=0x0
bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080050 offset=0x0
bytes=4 value=0x0
<snipped, rest of PCI BAR registers written with 0x0...>

I've added printks to the dom0 and domU kernels in the
pci_bus_write_config_##size() macros in drivers/pci/access.c and in
arch/x86/pci/direct.c to print every time the kernel accesses PCI
configuration space, but I only see these printfs when my driver
access my board's PCI configuration space or some other driver
accesses PCI configuration space, but I do NOT see them when this PCI
BAR register space trashing happens.

So I noticed also that lspci does not cause these kernel printfs to
occur and upon reading the pciutils source code I learned that pretty
much anything which can do an outl() to 0xcf8/0xcfc can mess with PCI
configuration space.

So now I figure it must be some user space thing unless I'm just
missing some other way which the kernel or XEN can access PCI
configuration space, but what could it possibly be?

This problem only occurs in HVM guests and only seems to occur when I
pass two instances of my device to the domU and only occurs many many
seconds after I load my driver (30-60 seconds).  I'm absolutely sure
that it's not my driver because the kernel printfs show up when my
driver accesses PCI configuration space.

I'm really pretty much at a loss as even how to debug this.  There
doesn't appear to be any dump_stack() in XEN so that I can see what
called pci_conf_write() in XEN, but even then it appears that it only
gets called as a trap from the dom0 or domU.  It's not clear to me if
you can even see what process/stack actually caused the trap back in
the dom0 or domU.  Is that possible?

Is there anything else that I should look at?  qemu?  pciback?
pcifront?  Am I missing some access method to PCI configuration space
down in the kernel or is pci_confl_read/write pretty much it?  Any
ideas what would possibly be trying to overwrite all of PCI
configuration space like this?

_any_ ideas are most welcome..

thanks
dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-16  1:09 PCI BAR register space written with garbage in HVM guest Dan Gora
@ 2010-03-16  1:48 ` Konrad Rzeszutek Wilk
  2010-03-16  2:55   ` Dan Gora
  2010-03-16 10:24 ` Jan Beulich
  2010-03-17  0:14 ` Dan Gora
  2 siblings, 1 reply; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-16  1:48 UTC (permalink / raw)
  To: Dan Gora; +Cc: xen-devel

On Mon, Mar 15, 2010 at 10:09:28PM -0300, Dan Gora wrote:
> Hi All,
> 
> I'm having a problem where if I pass through two instances of my
> device to a HVM domU, one of the board instances is having it's PCI
> BAR registers overwritten with garbage by some unknown actor 30
> seconds to a minute after I load my driver.  I cannnot for the life of
> me find what might possibly be overwriting the BAR registers.
> 
> I've added a debugging printf to XEN in
> xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR
> address space being overwritten with garbage:
> 
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0
> bytes=4 value=0xffffffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0
> bytes=4 value=0x1600ffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0
> bytes=4 value=0x64d5323e
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0
> bytes=4 value=0x450008
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0
> bytes=4 value=0xa7e54002
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0
> bytes=4 value=0x11400000

Wow.. That is impressive. Are the values always the same? Or are they
truly random?

> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0
> bytes=4 value=0x693
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0
> bytes=4 value=0xffff0000
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0
> bytes=4 value=0x4400ffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0
> bytes=4 value=0x2c024300
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0
> bytes=4 value=0x1012dac
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0
> bytes=4 value=0xa1c30006
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080030 offset=0x0
> bytes=4 value=0xa00040d
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080034 offset=0x0
> bytes=4 value=0x0
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080038 offset=0x0
> bytes=4 value=0x0
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008003c offset=0x0
> bytes=4 value=0x0
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080040 offset=0x0
> bytes=4 value=0x0
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080044 offset=0x0
> bytes=4 value=0x16000000
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080048 offset=0x0
> bytes=4 value=0x64d5323e
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008004c offset=0x0
> bytes=4 value=0x0
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080050 offset=0x0
> bytes=4 value=0x0
> <snipped, rest of PCI BAR registers written with 0x0...>
> 
> I've added printks to the dom0 and domU kernels in the
> pci_bus_write_config_##size() macros in drivers/pci/access.c and in
> arch/x86/pci/direct.c to print every time the kernel accesses PCI
> configuration space, but I only see these printfs when my driver
> access my board's PCI configuration space or some other driver
> accesses PCI configuration space, but I do NOT see them when this PCI
> BAR register space trashing happens.
> 
> So I noticed also that lspci does not cause these kernel printfs to
> occur and upon reading the pciutils source code I learned that pretty
> much anything which can do an outl() to 0xcf8/0xcfc can mess with PCI
> configuration space.
> 
> So now I figure it must be some user space thing unless I'm just
> missing some other way which the kernel or XEN can access PCI
> configuration space, but what could it possibly be?
> 
> This problem only occurs in HVM guests and only seems to occur when I
> pass two instances of my device to the domU and only occurs many many
> seconds after I load my driver (30-60 seconds).  I'm absolutely sure
> that it's not my driver because the kernel printfs show up when my
> driver accesses PCI configuration space.
> 
> I'm really pretty much at a loss as even how to debug this.  There
> doesn't appear to be any dump_stack() in XEN so that I can see what
> called pci_conf_write() in XEN, but even then it appears that it only
> gets called as a trap from the dom0 or domU.  It's not clear to me if
> you can even see what process/stack actually caused the trap back in
> the dom0 or domU.  Is that possible?

You could instrument the code (Xen) to crash the DomU domain when you
detect garbage. Then you can pick at with xenctx to look at its stack..etc
> 
> Is there anything else that I should look at?  qemu?  pciback?
> pcifront?  Am I missing some access method to PCI configuration space
> down in the kernel or is pci_confl_read/write pretty much it?  Any

QEMU uses libpci, which is the same as lspci, and that looks to work.

You can crank up the verbosity of pciback and pcifront with its
parameters to see if they are the ones doing this. But your domain is
HVM DomU so the pcifront/pciback is not utilized.

That narrows it down to QEMU or the Dom0 kernel.

> ideas what would possibly be trying to overwrite all of PCI
> configuration space like this?
> 
> _any_ ideas are most welcome..
> 
> thanks
> dan
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-16  1:48 ` Konrad Rzeszutek Wilk
@ 2010-03-16  2:55   ` Dan Gora
  2010-03-16 10:27     ` Jan Beulich
  2010-03-16 15:20     ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 14+ messages in thread
From: Dan Gora @ 2010-03-16  2:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Mon, Mar 15, 2010 at 10:48 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:

>> I've added a debugging printf to XEN in
>> xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR
>> address space being overwritten with garbage:
>>
>> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0
>> bytes=4 value=0xffffffff
>> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0
>> bytes=4 value=0x1600ffff
>> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0
>> bytes=4 value=0x64d5323e
>> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0
>> bytes=4 value=0x450008
>> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0
>> bytes=4 value=0xa7e54002
>> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0
>> bytes=4 value=0x11400000
>
> Wow.. That is impressive. Are the values always the same? Or are they
> truly random?


I'm pretty sure that they are always the same...

>> I'm really pretty much at a loss as even how to debug this.  There
>> doesn't appear to be any dump_stack() in XEN so that I can see what
>> called pci_conf_write() in XEN, but even then it appears that it only
>> gets called as a trap from the dom0 or domU.  It's not clear to me if
>> you can even see what process/stack actually caused the trap back in
>> the dom0 or domU.  Is that possible?
>
> You could instrument the code (Xen) to crash the DomU domain when you
> detect garbage. Then you can pick at with xenctx to look at its stack..etc

This is where I could really use some help... Is it possible crash
only the domU from with xen?  What would it leave behind to analyze
with xenctx?

>>
>> Is there anything else that I should look at?  qemu?  pciback?
>> pcifront?  Am I missing some access method to PCI configuration space
>> down in the kernel or is pci_confl_read/write pretty much it?  Any
>
> QEMU uses libpci, which is the same as lspci, and that looks to work.
>
> You can crank up the verbosity of pciback and pcifront with its
> parameters to see if they are the ones doing this. But your domain is
> HVM DomU so the pcifront/pciback is not utilized.
>

Yeah I tried that and there doesn't appear to be anything going on there.

<aside>
What is pcifront/pciback's role for HVM guests exactly?  I understand
that you "hide" the devices from the dom0 with pciback and it
definately loads and does *something* when the HVM guest comes up, but
accesses from the domU don't appear to go through it at all (I
understand that it works with qemu somehow, but that channel too is
not at all clear how it works...)
I've looked through qemu enough to kind of understand that it's
responsible for abstracting the PCI configuration space for the domU,
but I don't really understand how accesses get channeled through to it
from the domU.  Does it use hypercalls somehow?  Can someone explain
how this whole flow is supposed to work for PCI configuration space
accesses?
</aside>

> That narrows it down to QEMU or the Dom0 kernel.

Yeah that's what I figured.. I instrumented qemu a bit and also did
not see anything going on in there, but I'm not at the office anymore
so I cannot tell you exactly what functions I instrumented.  I'm going
to look into this more tomorrow.

thanks a lot for your help...
dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-16  1:09 PCI BAR register space written with garbage in HVM guest Dan Gora
  2010-03-16  1:48 ` Konrad Rzeszutek Wilk
@ 2010-03-16 10:24 ` Jan Beulich
  2010-03-17  0:14 ` Dan Gora
  2 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2010-03-16 10:24 UTC (permalink / raw)
  To: Dan Gora; +Cc: xen-devel

>>> Dan Gora <dan.gora@gmail.com> 16.03.10 02:09 >>>
>I'm really pretty much at a loss as even how to debug this.  There
>doesn't appear to be any dump_stack() in XEN so that I can see what
>called pci_conf_write() in XEN, but even then it appears that it only
>gets called as a trap from the dom0 or domU.  It's not clear to me if
>you can even see what process/stack actually caused the trap back in
>the dom0 or domU.  Is that possible?

show_execution_state(guest_cpu_user_regs());

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM  guest.
  2010-03-16  2:55   ` Dan Gora
@ 2010-03-16 10:27     ` Jan Beulich
  2010-03-16 15:20     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2010-03-16 10:27 UTC (permalink / raw)
  To: Dan Gora; +Cc: xen-devel, Konrad Rzeszutek Wilk

>>> Dan Gora <dan.gora@gmail.com> 16.03.10 03:55 >>>
>This is where I could really use some help... Is it possible crash
>only the domU from with xen?  What would it leave behind to analyze
>with xenctx?

domain_crash(current->domain);

assuming the access happens in the context of the DomU. If it happens
in Dom0's context (e.g. coming from qemu) you'd still kill the whole
system with this, so be careful.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-16  2:55   ` Dan Gora
  2010-03-16 10:27     ` Jan Beulich
@ 2010-03-16 15:20     ` Konrad Rzeszutek Wilk
  2010-03-16 15:54       ` Jan Beulich
  1 sibling, 1 reply; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-16 15:20 UTC (permalink / raw)
  To: Dan Gora; +Cc: xen-devel

> <aside>
> What is pcifront/pciback's role for HVM guests exactly?  I understand

None functionaly. The purpose is to "bind" the PCI devices to pciback
(or pcistub) so that no other kernel module usurps it and starts
utilizing it.

> that you "hide" the devices from the dom0 with pciback and it
> definately loads and does *something* when the HVM guest comes up, but
> accesses from the domU don't appear to go through it at all (I
> understand that it works with qemu somehow, but that channel too is
> not at all clear how it works...)

With HVM pciback is not used. You need an virtualization aware IOMMU to
pass through PCI devices to the guest. PCIback/PCI front is for PV
guests and on machine where you don't neccessarily have this fancy
IOMMU.

> I've looked through qemu enough to kind of understand that it's
> responsible for abstracting the PCI configuration space for the domU,
> but I don't really understand how accesses get channeled through to it
> from the domU.  Does it use hypercalls somehow?  Can someone explain

The Hypervisor gets "trapped" when an outb is made (look for
emulate_privileged_op function and specifically out).

Then it somehow injects the fault in QEMU which does the rest. I don't
remember the details of how it does thought :-(

> how this whole flow is supposed to work for PCI configuration space
> accesses?
> </aside>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-16 15:20     ` Konrad Rzeszutek Wilk
@ 2010-03-16 15:54       ` Jan Beulich
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2010-03-16 15:54 UTC (permalink / raw)
  To: Dan Gora, Konrad Rzeszutek Wilk; +Cc: xen-devel

>>> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 16.03.10 16:20 >>>
>The Hypervisor gets "trapped" when an outb is made (look for
>emulate_privileged_op function and specifically out).

No, that's only the path taken for pv guests. HVM guests use special
exit events for trapping I/O instructions.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-16  1:09 PCI BAR register space written with garbage in HVM guest Dan Gora
  2010-03-16  1:48 ` Konrad Rzeszutek Wilk
  2010-03-16 10:24 ` Jan Beulich
@ 2010-03-17  0:14 ` Dan Gora
  2010-03-17  2:46   ` Konrad Rzeszutek Wilk
  2010-03-18  2:14   ` Dan Gora
  2 siblings, 2 replies; 14+ messages in thread
From: Dan Gora @ 2010-03-17  0:14 UTC (permalink / raw)
  To: xen-devel

Hi all,

So I have definately tracked the writes as coming from qemu-dm, but
I'm still having trouble with finding out what is causing it.  I can
now see the writes occuring down in the kernel (I only had
pci_read_bus_XX traced, I didn't have pci_read_user_XX traced before..
sigh...) and I can also see the problem occuring with strace as well,
but I still cannot find where in qemu corresponds to this behavior...

So here is what I see:

Recall that from Xen HV I can see the bad writes occuring:

> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0
> bytes=4 value=0xffffffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0
> bytes=4 value=0x1600ffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0
> bytes=4 value=0x64d5323e
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0
> bytes=4 value=0x450008

I also traced the dom0 kernel better and can now also see these same
writes occuring:

Mar 16 21:35:49 sunblade1 kernel: pci_conf1_write: writing 8:0 reg 0x0
len=4 value=0xffffffff
Mar 16 21:35:49 sunblade1 kernel: Pid: 5904, comm: qemu-dm Tainted: G
        2.6.27.19-5-xen.dan #7
Mar 16 21:35:49 sunblade1 kernel:
Mar 16 21:35:49 sunblade1 kernel: Call Trace:
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020c55a>] dump_trace+0xba/0x379
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020c867>]
show_trace_log_lvl+0x4e/0x69
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020d5d8>] show_trace+0x15/0x19
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8048d873>] dump_stack+0x77/0x80
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff80407a92>]
pci_conf1_write+0x86/0x113
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8040940f>] raw_pci_write+0x24/0x3d
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff80409482>] pci_write+0x2c/0x2e
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8035ec0f>]
pci_user_write_config_dword+0x62/0xba
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff80364e49>]
pci_write_config+0x110/0x192
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff802ff3eb>] write+0xf5/0x12f
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff802a9bea>] vfs_write+0xb3/0x15c
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff802a9d61>] sys_write+0x4c/0x75
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020b7f2>] tracesys+0xaf/0xb4
Mar 16 21:35:49 sunblade1 kernel:  [<00007f05d53f97fb>] 0x7f05d53f97fb
Mar 16 21:35:49 sunblade1 kernel:
Mar 16 21:35:49 sunblade1 kernel: pci_conf1_write: writing 8:0 reg 0x4
len=4 value=0x1600ffff
Mar 16 21:35:49 sunblade1 kernel: Pid: 5904, comm: qemu-dm Tainted: G
        2.6.27.19-5-xen.dan #7
Mar 16 21:35:49 sunblade1 kernel:
Mar 16 21:35:49 sunblade1 kernel: Call Trace:
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020c55a>] dump_trace+0xba/0x379
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020c867>]
show_trace_log_lvl+0x4e/0x69
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020d5d8>] show_trace+0x15/0x19
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8048d873>] dump_stack+0x77/0x80
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff80407a92>]
pci_conf1_write+0x86/0x113
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8040940f>] raw_pci_write+0x24/0x3d
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff80409482>] pci_write+0x2c/0x2e
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8035ec0f>]
pci_user_write_config_dword+0x62/0xba
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff80364e49>]
pci_write_config+0x110/0x192
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff802ff3eb>] write+0xf5/0x12f
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff802a9bea>] vfs_write+0xb3/0x15c
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff802a9d61>] sys_write+0x4c/0x75
Mar 16 21:35:49 sunblade1 kernel:  [<ffffffff8020b7f2>] tracesys+0xaf/0xb4
Mar 16 21:35:50 sunblade1 kernel:  [<00007f05d53f97fb>] 0x7f05d53f97fb
Mar 16 21:35:50 sunblade1 kernel:


I also traced qemu-dm with strace and found the culprit:
5904  21:35:47 [    7f05d3b68b2b]
open("/sys/bus/pci/devices/0000:08:00.0/config", O_RDWR) = 6
5904  21:35:47 [    7f05d53fa3c8] pwrite(6, "\6\1", 2, 4) = 2
5904  21:35:47 [    7f05d3b6eb77] ioctl(16, EVIOCGKEYCODE, 0x7fffdde98890) = 0
5904  21:35:47 [    7f05d53f987b] read(4, 0x7fffdde98870, 16) = -1
EAGAIN (Resource temporarily unavailable)
5904  21:35:47 [    7fffddf437dc] clock_gettime(CLOCK_MONOTONIC,
{1724, 868201462}) = 0
5904  21:35:47 [    7fffddf437dc] clock_gettime(CLOCK_MONOTONIC,
{1724, 868259075}) = 0

<snip>

5904  21:35:48 [    7f05d53f987b] read(16, "o\0\0\0", 4) = 4
5904  21:35:48 [    7f05d53f97fb] write(16, "o\0\0\0", 4) = 4
5904  21:35:48 [    7f05d53f97fb] write(6,
"\377\377\377\377\377\377\0\26>2\325d\10\0E\0\2@\0\354\0\0@\21w\302\0\0\0\0\377\377"...,
590) = 256

Notice the length of 590 bytes!  The EIP also matches what the kernel
thinks was the syscall which caused the write.

Anyone have any idea where this could be happening in qemu?  I'm
pretty sure that it's not originating with the domU because I traced
the domU kenrel's pci config access routines pci_conf1_write/read the
same way as the dom0's and there are no printfs in the domU kernel
when this occurs.

Again, thanks for all the help,
dan

On Mon, Mar 15, 2010 at 10:09 PM, Dan Gora <dan.gora@gmail.com> wrote:
> Hi All,
>
> I'm having a problem where if I pass through two instances of my
> device to a HVM domU, one of the board instances is having it's PCI
> BAR registers overwritten with garbage by some unknown actor 30
> seconds to a minute after I load my driver.  I cannnot for the life of
> me find what might possibly be overwriting the BAR registers.
>
> I've added a debugging printf to XEN in
> xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR
> address space being overwritten with garbage:
>
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0
> bytes=4 value=0xffffffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0
> bytes=4 value=0x1600ffff
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0
> bytes=4 value=0x64d5323e
> (XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0
> bytes=4 value=0x450008

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-17  0:14 ` Dan Gora
@ 2010-03-17  2:46   ` Konrad Rzeszutek Wilk
  2010-03-17  3:31     ` Dan Gora
  2010-03-18  2:14   ` Dan Gora
  1 sibling, 1 reply; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-17  2:46 UTC (permalink / raw)
  To: Dan Gora; +Cc: xen-devel

> open("/sys/bus/pci/devices/0000:08:00.0/config", O_RDWR) = 6
> 5904  21:35:47 [    7f05d53fa3c8] pwrite(6, "\6\1", 2, 4) = 2
> 5904  21:35:47 [    7f05d3b6eb77] ioctl(16, EVIOCGKEYCODE, 0x7fffdde98890) = 0
> 5904  21:35:47 [    7f05d53f987b] read(4, 0x7fffdde98870, 16) = -1
> EAGAIN (Resource temporarily unavailable)
> 5904  21:35:47 [    7fffddf437dc] clock_gettime(CLOCK_MONOTONIC,
> {1724, 868201462}) = 0
> 5904  21:35:47 [    7fffddf437dc] clock_gettime(CLOCK_MONOTONIC,
> {1724, 868259075}) = 0

There is a snippet of QEMU that has that (hw/pass-through.c),
function pt_pci_write_config:

.. snip ..
      ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len);

        if (!ret)
            PT_LOG("Error: pci_write_block failed. return value[%d].\n",
ret);
    }    

    if (pm_state != NULL && pm_state->flags & PT_FLAG_TRANSITING)
        /* set QEMUTimer */
        qemu_mod_timer(pm_state->pm_timer,
            (qemu_get_clock(rt_clock) + pm_state->pm_delay));

> 
> <snip>
> 
> 5904  21:35:48 [    7f05d53f987b] read(16, "o\0\0\0", 4) = 4
> 5904  21:35:48 [    7f05d53f97fb] write(16, "o\0\0\0", 4) = 4
> 5904  21:35:48 [    7f05d53f97fb] write(6,
> "\377\377\377\377\377\377\0\26>2\325d\10\0E\0\2@\0\354\0\0@\21w\302\0\0\0\0\377\377"...,
> 590) = 256

I would turn on all of those debug options and see if anything is
happening.

Actually I would instrument all of the 'pci_write_block' calls.

> 
> Notice the length of 590 bytes!  The EIP also matches what the kernel
> thinks was the syscall which caused the write.
> 
> Anyone have any idea where this could be happening in qemu?  I'm

Can you attach debug to it? Or maybe run 'objdump' on the qemu-dm and
see if the address correspond to some code in it?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-17  2:46   ` Konrad Rzeszutek Wilk
@ 2010-03-17  3:31     ` Dan Gora
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Gora @ 2010-03-17  3:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Tue, Mar 16, 2010 at 11:46 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> There is a snippet of QEMU that has that (hw/pass-through.c),
> function pt_pci_write_config:
>
> .. snip ..
>      ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len);
>
>        if (!ret)
>            PT_LOG("Error: pci_write_block failed. return value[%d].\n",
> ret);
>    }
>
>    if (pm_state != NULL && pm_state->flags & PT_FLAG_TRANSITING)
>        /* set QEMUTimer */
>        qemu_mod_timer(pm_state->pm_timer,
>            (qemu_get_clock(rt_clock) + pm_state->pm_delay));

I turned on all the debugging in there and I see messages, but none
when the bad writes occur, which made me think that they were not
coming from there...

>> <snip>
>>
>> 5904  21:35:48 [    7f05d53f987b] read(16, "o\0\0\0", 4) = 4
>> 5904  21:35:48 [    7f05d53f97fb] write(16, "o\0\0\0", 4) = 4
>> 5904  21:35:48 [    7f05d53f97fb] write(6,
>> "\377\377\377\377\377\377\0\26>2\325d\10\0E\0\2@\0\354\0\0@\21w\302\0\0\0\0\377\377"...,
>> 590) = 256
>
> I would turn on all of those debug options and see if anything is
> happening.

done.. nothing :(  At least not during the error.

> Actually I would instrument all of the 'pci_write_block' calls.

I'll try that tomorrow...

> Can you attach debug to it? Or maybe run 'objdump' on the qemu-dm and
> see if the address correspond to some code in it?

Yes I can debug it, but really the only thing that I have to hang my
hat on is write().  So I figure that it must be coming from pciutils
since it's messing with /sys/bus/pci, but I don't really know much
about pciutils so wasn't sure what to grep around for.

I'll try and trace the pci_write_block calls and put gdb on it
tomorrow and see if anything pops up.

thanks,
dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-17  0:14 ` Dan Gora
  2010-03-17  2:46   ` Konrad Rzeszutek Wilk
@ 2010-03-18  2:14   ` Dan Gora
       [not found]     ` <20100318143207.GF14445@phenom.dumpdata.com>
  1 sibling, 1 reply; 14+ messages in thread
From: Dan Gora @ 2010-03-18  2:14 UTC (permalink / raw)
  To: xen-devel

Hi all,

I apologize for the length of this message, but I figured that it was
probably better to supply too much info rather than too little...
I've made some good progress debugging this today.  The problem looks
pretty bizarre...

I finally figured out how to trap the problem in gdb with the help of
the conditional break point.  Since I know that it's a write to fd 6
of length 590 bytes I just make gdb break on that:

(gdb) disass write
Dump of assembler code for function write:
0x00007f2c3afe07b0 <write+0>:   cmpl   $0x0,0x20db59(%rip)        #
0x7f2c3b1ee310 <__pthread_multiple_threads>
0x00007f2c3afe07b7 <write+7>:   jne    0x7f2c3afe07c9 <write+25>
0x00007f2c3afe07b9 <__write_nocancel+0>:        mov    $0x1,%eax
0x00007f2c3afe07be <__write_nocancel+5>:        syscall
0x00007f2c3afe07c0 <__write_nocancel+7>:        cmp    $0xfffffffffffff001,%rax
0x00007f2c3afe07c6 <__write_nocancel+13>:       jae    0x7f2c3afe081b
<write+107>
0x00007f2c3afe07c8 <__write_nocancel+15>:       retq
0x00007f2c3afe07c9 <write+25>:  sub    $0x28,%rsp
0x00007f2c3afe07cd <write+29>:  mov    %rdi,0x8(%rsp)
0x00007f2c3afe07d2 <write+34>:  mov    %rsi,0x10(%rsp)
0x00007f2c3afe07d7 <write+39>:  mov    %rdx,0x18(%rsp)
0x00007f2c3afe07dc <write+44>:  callq  0x7f2c3afe0380
<__pthread_enable_asynccancel>
0x00007f2c3afe07e1 <write+49>:  mov    0x8(%rsp),%rdi
0x00007f2c3afe07e6 <write+54>:  mov    0x10(%rsp),%rsi
0x00007f2c3afe07eb <write+59>:  mov    0x18(%rsp),%rdx
0x00007f2c3afe07f0 <write+64>:  mov    %rax,(%rsp)
0x00007f2c3afe07f4 <write+68>:  mov    $0x1,%eax
0x00007f2c3afe07f9 <write+73>:  syscall
0x00007f2c3afe07fb <write+75>:  mov    (%rsp),%rdi
0x00007f2c3afe07ff <write+79>:  mov    %rax,0x8(%rsp)
0x00007f2c3afe0804 <write+84>:  callq  0x7f2c3afe0350
<__pthread_disable_asynccancel>
0x00007f2c3afe0809 <write+89>:  mov    0x8(%rsp),%rax
0x00007f2c3afe080e <write+94>:  add    $0x28,%rsp
0x00007f2c3afe0812 <write+98>:  cmp    $0xfffffffffffff001,%rax
0x00007f2c3afe0818 <write+104>: jae    0x7f2c3afe081b <write+107>
0x00007f2c3afe081a <write+106>: retq
0x00007f2c3afe081b <write+107>: mov    0x20977e(%rip),%rcx        #
0x7f2c3b1e9fa0
0x00007f2c3afe0822 <write+114>: xor    %edx,%edx
0x00007f2c3afe0824 <write+116>: sub    %rax,%rdx
0x00007f2c3afe0827 <write+119>: mov    %edx,%fs:(%rcx)
0x00007f2c3afe082a <write+122>: or     $0xffffffffffffffff,%rax
0x00007f2c3afe082e <write+126>: jmp    0x7f2c3afe081a <write+106>
End of assembler dump.
(gdb) break *0x00007f2c3afe07f4 if $rdi == 6 && $rdx == 590
Breakpoint 1 at 0x7f2c3afe07f4
(gdb) info break
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00007f2c3afe07f4 <write+68>
        stop only if $rdi == 6 && $rdx == 590

===================
Here the breakpoint fires, but the bad writes do not occur when we perform
the syscall.
===================

Breakpoint 1, 0x00007f2c3afe07f4 in write () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007f2c3afe07f4 in write () from /lib64/libpthread.so.0
#1  0x000000000040746d in tap_receive (opaque=0xa940f0,
    buf=0x7fff43a7f080 "������", size=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079
#2  0x0000000000407081 in qemu_send_packet (vc1=0xaf99d0,
    buf=0x7fff43a7f080 "������", size=590)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:3865
#3  0x00000000004235a0 in rtl8139_transmit_one (s=0xaf9780,
    descriptor=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:1810
#4  0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6,
    addr=<value optimized out>, val=524878)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367
#5  0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>,
    buf=0x7f2c3ba7e018 <Address 0x7f2c3ba7e018 out of bounds>,
    _len=<value optimized out>, is_write=1) at exec-dm.c:510
#6  0x00000000004a4703 in cpu_ioreq_move (env=<value optimized out>,
    req=0x7f2c3ba7e000) at helper2.c:319
#7  0x00000000004a4fb1 in cpu_handle_ioreq (opaque=<value optimized out>)
    at helper2.c:508
#8  0x000000000040aeb4 in main_loop_wait (timeout=10)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:7175
#9  0x00000000004a4dbd in main_loop () at helper2.c:571
#10 0x000000000040feb3 in main (argc=20, argv=0x7fff43a82968)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:8950

========================
Here is the data buffer that is being written by rtl8139_transmit():
Notice that it looks like an ethernet frame!  But it is being written
down the sysfs pci configuration space descriptor!
========================

(gdb) x/590xb 0x7fff43a7f080
0x7fff43a7f080: 0xff    0xff    0xff    0xff    0xff    0xff    0x00    0x16
0x7fff43a7f088: 0x3e    0x32    0xd5    0x64    0x08    0x00    0x45    0x00
0x7fff43a7f090: 0x02    0x40    0x91    0x0d    0x00    0x00    0x40    0x11
0x7fff43a7f098: 0xe7    0xa0    0x00    0x00    0x00    0x00    0xff    0xff
0x7fff43a7f0a0: 0xff    0xff    0x00    0x44    0x00    0x43    0x02    0x2c
0x7fff43a7f0a8: 0xc4    0xb7    0x01    0x01    0x06    0x00    0xae    0xb8
0x7fff43a7f0b0: 0x09    0x63    0x00    0x0a    0x00    0x00    0x00    0x00
0x7fff43a7f0b8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fff43a7f0c0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x16
0x7fff43a7f0c8: 0x3e    0x32    0xd5    0x64    0x00    0x00    0x00    0x00
0x7fff43a7f0d0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fff43a7f0d8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fff43a7f0e0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fff43a7f0e8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fff43a7f0f0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fff43a7f0f8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

=================================
So we load and unload my driver in the domU.  After several
seconds the breakpoint hits again and the bad writes actually happen
when we do the write() system call
================================

Breakpoint 1, 0x00007f2c3afe07f4 in write () from /lib64/libpthread.so.0
(gdb) info regi
rax            0x0      0
rbx            0x24e    590
rcx            0x0      0
rdx            0x24e    590
rsi            0x7fff43a7f080   140734328467584
rdi            0x6      6
rbp            0x7fff43a7f080   0x7fff43a7f080
rsp            0x7fff43a7f000   0x7fff43a7f000
r8             0x6      6
r9             0x43e    1086
r10            0x7fff43a811d0   140734328476112
r11            0x202    514
r12            0xa940f0 11092208
r13            0x7fff43a7f080   140734328467584
r14            0x7f2c3ba7e018   139827956146200
r15            0x1      1
rip            0x7f2c3afe07f4   0x7f2c3afe07f4 <write+68>
eflags         0x202    [ IF ]
cs             0x33     51
ss             0xe02b   57387
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
fctrl          0x37f    895
fstat          0x0      0
ftag           0xffff   65535
fiseg          0x0      0
fioff          0x0      0
foseg          0x0      0
fooff          0x0      0
fop            0x0      0
mxcsr          0x1f80   [ IM DM ZM OM UM PM ]
(gdb) x/10i $rip
0x7f2c3afe07f4 <write+68>:      mov    $0x1,%eax
0x7f2c3afe07f9 <write+73>:      syscall
0x7f2c3afe07fb <write+75>:      mov    (%rsp),%rdi
0x7f2c3afe07ff <write+79>:      mov    %rax,0x8(%rsp)
0x7f2c3afe0804 <write+84>:
    callq  0x7f2c3afe0350 <__pthread_disable_asynccancel>
0x7f2c3afe0809 <write+89>:      mov    0x8(%rsp),%rax
0x7f2c3afe080e <write+94>:      add    $0x28,%rsp
0x7f2c3afe0812 <write+98>:      cmp    $0xfffffffffffff001,%rax
0x7f2c3afe0818 <write+104>:     jae    0x7f2c3afe081b <write+107>
0x7f2c3afe081a <write+106>:     retq
(gdb) x/20xb 0x7fff43a7f080
0x7fff43a7f080: 0xff    0xff    0xff    0xff    0xff    0xff    0x00    0x16
0x7fff43a7f088: 0x3e    0x32    0xd5    0x64    0x08    0x00    0x45    0x00
0x7fff43a7f090: 0x02    0x40    0x91    0x0f
=========================
Now do the system call...
=========================
(gdb) si

Breakpoint 2, 0x00007f2c3afe07f9 in write () from /lib64/libpthread.so.0
(gdb) si
0x00007f2c3afe07fb in write () from /lib64/libpthread.so.0

====================================
Here the problem occurs.. I see the PCI BAR space being overwritten
on the XEN console.  The bad BAR registers exactly match the data.
====================================

(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 bytes=4 val
ue=0xffffffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 bytes=4 val
ue=0x1600ffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 bytes=4 val
ue=0x64d5323e
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 bytes=4 val
ue=0x450008
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0 bytes=4 val
ue=0xf914002
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0 bytes=4 val
ue=0x11400000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0 bytes=4 val
ue=0x9ee7
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0 bytes=4 val
ue=0xffff0000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0 bytes=4 val
ue=0x4400ffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0 bytes=4 val
ue=0x2c024300
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0 bytes=4 val
ue=0x101b7c4
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0 bytes=4 val
ue=0xb8ae0006

========================
so let's go see what we can see....
========================
(gdb) bt
#0  0x00007f2c3afe07fb in write () from /lib64/libpthread.so.0
#1  0x000000000040746d in tap_receive (opaque=0xa940f0,
    buf=0x7fff43a7f080 "������", size=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079
#2  0x0000000000407081 in qemu_send_packet (vc1=0xaf99d0,
    buf=0x7fff43a7f080 "������", size=590)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:3865
#3  0x00000000004235a0 in rtl8139_transmit_one (s=0xaf9780,
    descriptor=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:1810
#4  0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6,
    addr=<value optimized out>, val=524878)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367
#5  0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>,
    buf=0x7f2c3ba7e018 <Address 0x7f2c3ba7e018 out of bounds>,
    _len=<value optimized out>, is_write=1) at exec-dm.c:510
#6  0x00000000004a4703 in cpu_ioreq_move (env=<value optimized out>,
    req=0x7f2c3ba7e000) at helper2.c:319
#7  0x00000000004a4fb1 in cpu_handle_ioreq (opaque=<value optimized out>)
    at helper2.c:508
#8  0x000000000040aeb4 in main_loop_wait (timeout=10)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:7175
#9  0x00000000004a4dbd in main_loop () at helper2.c:571
#10 0x000000000040feb3 in main (argc=20, argv=0x7fff43a82968)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:8950
(gdb)
(gdb) up
#1  0x000000000040746d in tap_receive (opaque=0xa940f0,
    buf=0x7fff43a7f080 "������", size=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079
4079            ret = write(s->fd, buf, size); /* No error handling ?! */
(gdb) print *((TAPState*) opaque)
$2 = {vc = 0xa94910, fd = 6,
  down_script = "/etc/xen/qemu-ifdown", '\0' <repeats 1003 times>,
  script_arg = "br0", '\0' <repeats 1020 times>, next = 0x0}
(gdb) print *(VLANClientState*) 0xa94910
$3 = {fd_read = 0x407450 <tap_receive>, fd_can_read = 0, opaque = 0xa940f0,
  next = 0xaf99d0, vlan = 0xa940a0,
  info_str = "tap: ifname=tap1.0 setup_script=/etc/xen/qemu-ifup",
'\0' <repeats 205 times>}
(gdb) print *(VLANState*)0xa940a0
$4 = {id = 1, first_client = 0xa94910, next = 0x0, nb_guest_devs = 1,
  nb_host_devs = 1}
(gdb) print *(VLANClientState*) 0xaf99d0
$5 = {fd_read = 0x4235e0 <rtl8139_receive>,
  fd_can_read = 0x422770 <rtl8139_can_receive>, opaque = 0xaf9780, next = 0x0,
  vlan = 0xa940a0,
  info_str = "rtl8139 pci macaddr=00:16:3e:32:d5:64", '\0' <repeats 218 times>}

(gdb) up
#4  0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6,
    addr=<value optimized out>, val=524878)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367
2367        if (rtl8139_transmit_one(s, descriptor))
(gdb) up
#5  0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>,
    buf=0x7f2c3ba7e018 <Address 0x7f2c3ba7e018 out of bounds>,
    _len=<value optimized out>, is_write=1) at exec-dm.c:510
510
io_mem_write[io_index][2](io_mem_opaque[io_index], addr, val);

================================================
Here are the open files at the time.. Notice that fd 6 is open read only..
I don't understand how this write() is working....
================================================

sunblade1:root:bash 2047 => lsof -p 5820
<snip>
qemu-dm 5820 root    0r   CHR                1,3      0t0       3278 /dev/null
qemu-dm 5820 root    1w   REG              253,5   335301     606514
/var/log/xen/qemu-dm-node1.log
qemu-dm 5820 root    2w   REG              253,5   335301     606514
/var/log/xen/qemu-dm-node1.log
qemu-dm 5820 root    3r   DIR              253,5     4096     196635
/etc/init.d/rc3.d
qemu-dm 5820 root    4r  FIFO                0,6      0t0      13302 pipe
qemu-dm 5820 root    5w  FIFO                0,6      0t0      13302 pipe
qemu-dm 5820 root    6r   REG                0,0      256       4132
/sys/devices/pci0000:00/0000:00:03.0/0000:07:00.0/0000:08:00.0/config
qemu-dm 5820 root    7u   REG                0,3        0 4026538410
/proc/xen/privcmd
qemu-dm 5820 root    8u  unix 0xffff8803cd8e6e80      0t0      13325 socket
qemu-dm 5820 root    9u   BLK                8,6    0t512       1226 /dev/sda6
qemu-dm 5820 root   10u  IPv4              13327      0t0        TCP
localhost:5901 (LISTEN)
qemu-dm 5820 root   11u   CHR                5,2      0t0       3280 /dev/ptmx
qemu-dm 5820 root   12u   CHR              136,2      0t0          4 /dev/pts/2
qemu-dm 5820 root   13w   REG              253,5   881440     606516
/var/log/xen/xend.log
qemu-dm 5820 root   14u  FIFO              253,5      0t0     461914
/var/run/tap/qemu-read-1
qemu-dm 5820 root   15u  FIFO              253,5      0t0     461935
/var/run/tap/qemu-write-1
qemu-dm 5820 root   16u   CHR              10,63      0t0       4375
/dev/xen/evtchn
qemu-dm 5820 root   18r   REG                0,0      256       4080
/sys/devices/pci0000:00/0000:00:14.3/config
qemu-dm 5820 root   19r   REG                0,0      256       4080
/sys/devices/pci0000:00/0000:00:14.3/config
qemu-dm 5820 root   20r   REG                0,0      256       4123
/sys/devices/pci0000:00/0000:00:1f.3/config
qemu-dm 5820 root   21r   REG                0,0      256       4123
/sys/devices/pci0000:00/0000:00:1f.3/config
qemu-dm 5820 root   22r  FIFO                0,6      0t0      13336 pipe
qemu-dm 5820 root   23w  FIFO                0,6      0t0      13336 pipe
qemu-dm 5820 root   28u  FIFO              253,5      0t0     461937
/var/run/xend/dm-1-1268876464.fifo

======================================
So, I don't get it... There seem to be several questions here:

1) Why is the rtl driver trying to write into my board's sysfs configuration
space file?  Is it possible that it's just getting the wrong path in sysfs
somehow?  It seems like it should probably be opening a socket or something.

2) If the file descriptor is open read only, how is the write working?


So I changed the configuration file to only pass through one board:
Here is the configuration file:
=======================================
sunblade1:root:bash 2061 => cat node2.conf
import os, re

arch = os.uname()[4]
if re.search('64', arch):
    arch_libdir = 'lib64'
else:
    arch_libdir = 'lib'

kernel        = "/usr/lib/xen/boot/hvmloader"
builder       = 'hvm'
#memory        =  512
memory        =  2048
name          = "node2"
vcpus         =  8
pae           =  1
acpi          =  1
apic          =  1
vif           =['mac=00:16:3e:32:d5:64, bridge=br0, model=rtl8139']
disk          =['phy:sda6,hda,w']
#disk         =['phy:sda6,hda,w', 'phy:/dev/loop0,hdc,r']
on_poweroff   = 'destroy'
on_reboot     = 'restart'
on_crash      = 'restart'
device_model  = '/usr/' + arch_libdir + '/xen/bin/qemu-dm'
boot          = "cd"
snapshot      =  0
sdl           =  0
opengl        =  1
vnc           =  1
vnclisten     = "127.0.0.1"
vncdisplay    =  1
vncunused     =  0
vncpasswd     = ''
nographic     =  0
stdvga        =  0
serial        = 'pty'
#pci           = ['08:00.0']
pci           = ['0e:00.0']
#pci           = ['08:00.0', '0e:00.0']
#pci           = ['0e:00.0', '08:00.0']
#usb          =  1
#usbdevice    = 'mouse'

====================================
Which only differs from the original configuration file in the devices
that it exports:
====================================
sunblade1:root:bash 2050 => diff node1.conf node2.conf
13c13
< name          = "node1"
---
> name          = "node2"
38c38
< #pci           = ['0e:00.0']
---
> pci           = ['0e:00.0']
40c40
< pci           = ['0e:00.0', '08:00.0']
---
> #pci           = ['0e:00.0', '08:00.0']

====================================
After starting this domU, qemu doesn't have fd 6 open any more...
====================================

sunblade1:root:bash 2047 => lsof -p 5843
<snip>
qemu-dm 5843 root    0r   CHR                1,3      0t0       3278 /dev/null
qemu-dm 5843 root    1w   REG              253,5   304274     606587
/var/log/xen/qemu-dm-node2.log
qemu-dm 5843 root    2w   REG              253,5   304274     606587
/var/log/xen/qemu-dm-node2.log
qemu-dm 5843 root    3r   DIR              253,5     4096     196635
/etc/init.d/rc3.d
qemu-dm 5843 root    4r  FIFO                0,6      0t0      13281 pipe
qemu-dm 5843 root    5w  FIFO                0,6      0t0      13281 pipe
qemu-dm 5843 root    7u   REG                0,3        0 4026538410
/proc/xen/privcmd
qemu-dm 5843 root    8u  unix 0xffff8803cc809400      0t0      13313 socket
qemu-dm 5843 root    9u   BLK                8,6    0t512       1281 /dev/sda6
qemu-dm 5843 root   10u  IPv4              13315      0t0        TCP
localhost:5901 (LISTEN)
qemu-dm 5843 root   11u   CHR                5,2      0t0       3280 /dev/ptmx
qemu-dm 5843 root   12u   CHR              136,2      0t0          4 /dev/pts/2
qemu-dm 5843 root   13w   REG              253,5   906611     606516
/var/log/xen/xend.log
qemu-dm 5843 root   14u  FIFO              253,5      0t0     461914
/var/run/tap/qemu-read-1
qemu-dm 5843 root   15u  FIFO              253,5      0t0     461935
/var/run/tap/qemu-write-1
qemu-dm 5843 root   16u   CHR              10,63      0t0       4370
/dev/xen/evtchn
qemu-dm 5843 root   17u   REG                0,0      256       4139
/sys/devices/pci0000:00/0000:00:05.0/0000:0d:00.0/0000:0e:00.0/config
qemu-dm 5843 root   18r   REG                0,0      256       4080
/sys/devices/pci0000:00/0000:00:14.3/config
qemu-dm 5843 root   19r   REG                0,0      256       4080
/sys/devices/pci0000:00/0000:00:14.3/config
qemu-dm 5843 root   20r   REG                0,0      256       4123
/sys/devices/pci0000:00/0000:00:1f.3/config
qemu-dm 5843 root   21r   REG                0,0      256       4123
/sys/devices/pci0000:00/0000:00:1f.3/config
qemu-dm 5843 root   22r  FIFO                0,6      0t0      13324 pipe
qemu-dm 5843 root   23w  FIFO                0,6      0t0      13324 pipe
qemu-dm 5843 root   28u  FIFO              253,5      0t0     461937
/var/run/xend/dm-1-1268880243.fifo

======================================
I then started the domU and redebugged qemu-dm with the same break point:
======================================

sunblade1:root:bash 2055 => gdb -p 5843
GNU gdb (GDB; SUSE Linux Enterprise 11) 6.8.50.20081120-cvs
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Attaching to process 5843
<snip>
(gdb) b *0x00007fbce8b0c7f4 if $rdi == 6 && $rdx == 590
Breakpoint 2 at 0x7fbce8b0c7f4
(gdb) c
Continuing.

Breakpoint 2, 0x00007fbce8b0c7f4 in write () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007fbce8b0c7f4 in write () from /lib64/libpthread.so.0
#1  0x000000000040746d in tap_receive (opaque=0xa940f0,
    buf=0x7ffff15a98b0 "������", size=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:4079
#2  0x0000000000407081 in qemu_send_packet (vc1=0xaf93b0,
    buf=0x7ffff15a98b0 "������", size=590)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:3865
#3  0x00000000004235a0 in rtl8139_transmit_one (s=0xaf9190,
    descriptor=<value optimized out>)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:1810
#4  0x0000000000424ae7 in rtl8139_io_writel (opaque=0x6,
    addr=<value optimized out>, val=524878)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/hw/rtl8139.c:2367
#5  0x00000000004726a3 in cpu_physical_memory_rw (_addr=<value optimized out>,
    buf=0x7fbce95aa018 <Address 0x7fbce95aa018 out of bounds>,
    _len=<value optimized out>, is_write=1) at exec-dm.c:510
#6  0x00000000004a4703 in cpu_ioreq_move (env=<value optimized out>,
    req=0x7fbce95aa000) at helper2.c:319
#7  0x00000000004a4fb1 in cpu_handle_ioreq (opaque=<value optimized out>)
    at helper2.c:508
#8  0x000000000040aeb4 in main_loop_wait (timeout=10)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:7175
#9  0x00000000004a4dbd in main_loop () at helper2.c:571
#10 0x000000000040feb3 in main (argc=20, argv=0x7ffff15ad198)
    at /home/dg/rpmbuild/BUILD/xen-3.3.1-testing/tools/ioemu-dir/vl.c:8950

(gdb) info regi
rax            0x0      0
rbx            0x24e    590
rcx            0x0      0
rdx            0x24e    590
rsi            0x7ffff15a98b0   140737242634416
rdi            0x6      6
rbp            0x7ffff15a98b0   0x7ffff15a98b0
rsp            0x7ffff15a9830   0x7ffff15a9830
r8             0x6      6
r9             0x1d6    470
r10            0x7ffff15aba00   140737242642944
r11            0x206    518
r12            0xa940f0 11092208
r13            0x7ffff15a98b0   140737242634416
r14            0x7fbce95aa018   140449345609752
r15            0x1      1
rip            0x7fbce8b0c7f4   0x7fbce8b0c7f4 <write+68>
eflags         0x206    [ PF IF ]
cs             0xe033   57395
ss             0xe02b   57387
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
fctrl          0x37f    895
fstat          0x0      0
ftag           0xffff   65535
fiseg          0x0      0
fioff          0x0      0
foseg          0x0      0
fooff          0x0      0
fop            0x0      0
mxcsr          0x1f80   [ IM DM ZM OM UM PM ]

=======================================
Notice that it's writing to fd 6!  And fd 6 is not even open... Is
this fd just uninitialized?

That's about as far as I got.

Anyone have any clue as to what is going wrong here?

Should I be posting this on the qemu mailing list?

thanks,
dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-18 18:27       ` Dan Gora
@ 2010-03-18 18:10         ` Konrad Rzeszutek Wilk
  2010-03-18 19:56           ` Dan Gora
  0 siblings, 1 reply; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-18 18:10 UTC (permalink / raw)
  To: Dan Gora; +Cc: xen-devel

On Thu, Mar 18, 2010 at 03:27:30PM -0300, Dan Gora wrote:
> On Thu, Mar 18, 2010 at 11:32 AM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
> > On Wed, Mar 17, 2010 at 11:14:13PM -0300, Dan Gora wrote:
> >> Hi all,
> >>
> >> I apologize for the length of this message, but I figured that it was
> >> probably better to supply too much info rather than too little...
> >> I've made some good progress debugging this today.  The problem looks
> >> pretty bizarre...
> >
> > I just skimmed through your e-mail and didn't dive in the r8169 code to
> > see if there is something there, but I thought you might want to try
> > using e1000 instead of r8169 driver and see if that makes it work?
> 
> Sure enough that fixed the problem!  No more writes on fd 6, no more
> PCI BAR regsiter corruption!
> That's good enough for me as a workaround.  I guess someone else can
> have a look at this rtl8139 driver.

That is not what I expected, but I am glad to hear that it
makes it possible for you to continue with your project.

It is good to know that the failure path is contained in the r8169
implementation. If it wouldn't be too much trouble, could you send an
e-mail to the QEMU mailing list this thread and your solution in
case there is somebody there itching to work on this.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Re: PCI BAR register space written with garbage in HVM guest.
       [not found]     ` <20100318143207.GF14445@phenom.dumpdata.com>
@ 2010-03-18 18:27       ` Dan Gora
  2010-03-18 18:10         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Gora @ 2010-03-18 18:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel

On Thu, Mar 18, 2010 at 11:32 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Wed, Mar 17, 2010 at 11:14:13PM -0300, Dan Gora wrote:
>> Hi all,
>>
>> I apologize for the length of this message, but I figured that it was
>> probably better to supply too much info rather than too little...
>> I've made some good progress debugging this today.  The problem looks
>> pretty bizarre...
>
> I just skimmed through your e-mail and didn't dive in the r8169 code to
> see if there is something there, but I thought you might want to try
> using e1000 instead of r8169 driver and see if that makes it work?

Sure enough that fixed the problem!  No more writes on fd 6, no more
PCI BAR regsiter corruption!
That's good enough for me as a workaround.  I guess someone else can
have a look at this rtl8139 driver.

thanks!
dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Re: PCI BAR register space written with garbage in HVM guest.
  2010-03-18 18:10         ` Konrad Rzeszutek Wilk
@ 2010-03-18 19:56           ` Dan Gora
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Gora @ 2010-03-18 19:56 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Thu, Mar 18, 2010 at 3:10 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> That is not what I expected, but I am glad to hear that it
> makes it possible for you to continue with your project.
>

yeah me either.. :)

> It is good to know that the failure path is contained in the r8169
> implementation. If it wouldn't be too much trouble, could you send an
> e-mail to the QEMU mailing list this thread and your solution in
> case there is somebody there itching to work on this.

done, but there doesn't seem to be much uptake there..

thanks
dan

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-03-18 19:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-16  1:09 PCI BAR register space written with garbage in HVM guest Dan Gora
2010-03-16  1:48 ` Konrad Rzeszutek Wilk
2010-03-16  2:55   ` Dan Gora
2010-03-16 10:27     ` Jan Beulich
2010-03-16 15:20     ` Konrad Rzeszutek Wilk
2010-03-16 15:54       ` Jan Beulich
2010-03-16 10:24 ` Jan Beulich
2010-03-17  0:14 ` Dan Gora
2010-03-17  2:46   ` Konrad Rzeszutek Wilk
2010-03-17  3:31     ` Dan Gora
2010-03-18  2:14   ` Dan Gora
     [not found]     ` <20100318143207.GF14445@phenom.dumpdata.com>
2010-03-18 18:27       ` Dan Gora
2010-03-18 18:10         ` Konrad Rzeszutek Wilk
2010-03-18 19:56           ` Dan Gora

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.