* MSI message data register configuration in Xen guests
@ 2012-06-26  2:38 Deep Debroy
  2012-06-26  2:51 ` Rolu
  0 siblings, 1 reply; 11+ messages in thread
From: Deep Debroy @ 2012-06-26  2:38 UTC (permalink / raw)
  To: xen-devel

Hi, I was playing around with an MSI-capable virtual device (so far
submitted as patches only) in the upstream qemu tree but am having
trouble getting it to work on a Xen hvm guest. The device is a QEMU
implementation of VMware's pvscsi controller. The device works fine
in a Xen guest when I switch the device's code to force usage of
legacy interrupts with upstream QEMU. With MSI-based interrupts, the
device works fine on a KVM guest but, as stated before, not on a Xen
guest. After digging a bit, it appears the reason for the failure in
Xen guests is that the MSI data register in the Xen guest ends up
with a value of 0x4300, where the Delivery Mode value of 3 is
reserved (per spec) and therefore illegal. The vmsi_deliver routine
in Xen rejects MSI interrupts with such data as illegal (as
expected), causing all commands issued by the guest OS on the device
to time out.
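
To make the failure concrete, here is a minimal decoder for the MSI
data word (my own sketch, following the standard PCI MSI layout rather
than any of the code under discussion). Fed the two Data values from
the lspci dumps below, it reports delivery mode 1 (lowest priority)
with vector 0x61 for the working KVM case, and the reserved delivery
mode 3 with vector 0 for the failing Xen case:

    #include <stdint.h>
    #include <stdio.h>

    /* Field layout of the MSI data word per the PCI/x86 spec. */
    #define MSI_VECTOR_MASK     0xffu  /* bits 0-7:  vector        */
    #define MSI_DELIVERY_SHIFT  8      /* bits 8-10: delivery mode */
    #define MSI_LEVEL_SHIFT     14     /* bit 14:    level assert  */
    #define MSI_TRIGGER_SHIFT   15     /* bit 15:    trigger mode  */

    static void decode_msi_data(uint32_t data)
    {
        printf("data=0x%04x vector=0x%02x delivery=%u level=%u trigger=%s\n",
               data,
               data & MSI_VECTOR_MASK,
               (data >> MSI_DELIVERY_SHIFT) & 0x7u,
               (data >> MSI_LEVEL_SHIFT) & 0x1u,
               ((data >> MSI_TRIGGER_SHIFT) & 0x1u) ? "level" : "edge");
    }

    int main(void)
    {
        decode_msi_data(0x4161); /* KVM guest: delivery mode 1, vector 0x61 */
        decode_msi_data(0x4300); /* Xen guest: reserved mode 3, vector 0    */
        return 0;
    }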

Given the above scenario, I was wondering if anyone can shed some
light on how to debug this further for Xen. Something I would
specifically like to know is where the MSI data register configuration
actually happens. Is it done by some code specific to Xen and within
the Xen codebase, or is it all done within QEMU?

Thanks,
Deep

P.S. Some details on the device's PCI configuration:

lspci output for a working instance in KVM:

00:07.0 SCSI storage controller: VMware PVSCSI SCSI Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 255
Interrupt: pin A routed to IRQ 45
Region 0: Memory at febf0000 (32-bit, non-prefetchable) [size=32K]
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee0300c  Data: 4161
Kernel driver in use: vmw_pvscsi
Kernel modules: vmw_pvscsi

Here is the lspci output for the scenario where it's failing to work in Xen:

00:04.0 SCSI storage controller: VMware PVSCSI SCSI Controller
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 255
Interrupt: pin A routed to IRQ 80
Region 0: Memory at f3020000 (32-bit, non-prefetchable) [size=32K]
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee36000  Data: 4300
Kernel driver in use: vmw_pvscsi
Kernel modules: vmw_pvscsi

* Re: MSI message data register configuration in Xen guests
  2012-06-26  2:38 MSI message data register configuration in Xen guests Deep Debroy
@ 2012-06-26  2:51 ` Rolu
  2012-06-27 23:18   ` Deep Debroy
  2012-06-28 20:52   ` MSI message data register configuration in Xen guests Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 11+ messages in thread
From: Rolu @ 2012-06-26  2:51 UTC (permalink / raw)
  To: Deep Debroy; +Cc: xen-devel

On Tue, Jun 26, 2012 at 4:38 AM, Deep Debroy <ddebroy@gmail.com> wrote:
> Hi, I was playing around with an MSI-capable virtual device (so far
> submitted as patches only) in the upstream qemu tree but am having
> trouble getting it to work on a Xen hvm guest.
> [...]
> Is it done by some code specific to Xen and within
> the Xen codebase, or is it all done within QEMU?
>

This seems like the same issue I ran into, though in my case it is
with passed through physical devices. See
http://lists.xen.org/archives/html/xen-devel/2012-06/msg01423.html and
the older messages in that thread for more info on what's going on. No
fix yet but help debugging is very welcome.

* Re: MSI message data register configuration in Xen guests
  2012-06-26  2:51 ` Rolu
@ 2012-06-27 23:18   ` Deep Debroy
  2012-06-28 20:52     ` Deep Debroy
  2012-06-28 20:52   ` MSI message data register configuration in Xen guests Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 11+ messages in thread
From: Deep Debroy @ 2012-06-27 23:18 UTC (permalink / raw)
  To: Rolu, xen-devel

On Mon, Jun 25, 2012 at 7:51 PM, Rolu <rolu@roce.org> wrote:
>
> On Tue, Jun 26, 2012 at 4:38 AM, Deep Debroy <ddebroy@gmail.com> wrote:
> > [...]
>
> This seems like the same issue I ran into, though in my case it is
> with passed through physical devices. See
> http://lists.xen.org/archives/html/xen-devel/2012-06/msg01423.html and
> the older messages in that thread for more info on what's going on. No
> fix yet but help debugging is very welcome.

Thanks, Rolu, for pointing out the other thread - it was very useful.
Some of the symptoms appear to be identical in my case. However, I am
not using a pass-through device. Instead, in my case it's a fully
virtualized device, pretty much identical to a raw file-backed disk
image, where the controller is pvscsi rather than lsi. Therefore I
guess some of the later discussion in the other thread around
pass-through-specific areas of code in qemu is not relevant? Please
correct me if I am wrong. Also note that I am using upstream qemu,
where neither the #define for PT_PCI_MSITRANSLATE_DEFAULT nor
xenstore.c exists (which is where Stefano's suggested change appeared
to be).

So far, here's what I am observing in the hvm linux guest:

On the guest side, as discussed in the other thread,
xen_hvm_setup_msi_irqs is invoked for the device and a value of 0x4300
is composed by xen_msi_compose_msg and written into the data register.
On the qemu (upstream) side, when the virtualized controller tries to
complete a request, it invokes the following chain of calls:
stl_le_phys -> xen_apic_mem_write -> xen_hvm_inject_msi.
On the xen side, this ends up in hvmop_inject_msi -> hvm_inject_msi ->
vmsi_deliver. vmsi_deliver, as previously discussed, rejects the
delivery mode of 0x3.
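
For anyone tracing the same path: the hop from qemu into the
hypervisor at the end of that chain is the HVMOP_inject_msi hypercall.
Its argument structure (as I read it in xen's public hvm/hvm_op.h -
quoting from memory, so treat this as a sketch) simply hands Xen the
raw address/data pair to interpret:

    /* Sketch of the HVMOP_inject_msi argument. qemu's
     * xen_hvm_inject_msi() fills it with the guest-programmed MSI
     * address and data; Xen's hvm_inject_msi() then decodes them. */
    struct xen_hvm_inject_msi {
        domid_t  domid;  /* domain to inject the MSI into */
        uint32_t data;   /* MSI data register value (0x4300 here) */
        uint64_t addr;   /* MSI address register value (0xfeexxxxx) */
    };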

Is the above sequence of interactions the expected path for an HVM
guest trying to use a fully virtualized device/controller that uses
MSI in upstream qemu? If so, and if a standard linux guest always
populates the value of 0x4300 in the MSI data register through
xen_hvm_setup_msi_irqs, how are MSI notifications from a device in
qemu supposed to work and bypass the vmsi_deliver check, given the
delivery type of 0x3 is indeed reserved?

Thanks,
Deep

* Re: MSI message data register configuration in Xen guests
  2012-06-26  2:51 ` Rolu
  2012-06-27 23:18   ` Deep Debroy
@ 2012-06-28 20:52   ` Konrad Rzeszutek Wilk
  2012-06-28 21:26     ` Deep Debroy
  2012-06-28 21:27     ` Rolu
  1 sibling, 2 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-06-28 20:52 UTC (permalink / raw)
  To: Rolu; +Cc: Deep Debroy, xen-devel

On Tue, Jun 26, 2012 at 04:51:29AM +0200, Rolu wrote:
> On Tue, Jun 26, 2012 at 4:38 AM, Deep Debroy <ddebroy@gmail.com> wrote:
> > [...]
>
> This seems like the same issue I ran into, though in my case it is
> with passed through physical devices. See
> http://lists.xen.org/archives/html/xen-devel/2012-06/msg01423.html and
> the older messages in that thread for more info on what's going on. No
> fix yet but help debugging is very welcome.

Huh? You said in http://lists.xen.org/archives/html/xen-devel/2012-06/msg01653.html
"This worked!"

* Re: MSI message data register configuration in Xen guests
  2012-06-27 23:18   ` Deep Debroy
@ 2012-06-28 20:52     ` Deep Debroy
  2012-06-29 11:10       ` Stefano Stabellini
  2012-07-03 10:28       ` [PATCH] xen: event channel remapping for emulated MSIs Stefano Stabellini
  0 siblings, 2 replies; 11+ messages in thread
From: Deep Debroy @ 2012-06-28 20:52 UTC (permalink / raw)
  To: Rolu, xen-devel

On Wed, Jun 27, 2012 at 4:18 PM, Deep Debroy <ddebroy@gmail.com> wrote:
> On Mon, Jun 25, 2012 at 7:51 PM, Rolu <rolu@roce.org> wrote:
>> [...]
>
> [...]
>
> Is the above sequence of interactions the expected path for an HVM
> guest trying to use a fully virtualized device/controller that uses
> MSI in upstream qemu? If so, and if a standard linux guest always
> populates the value of 0x4300 in the MSI data register through
> xen_hvm_setup_msi_irqs, how are MSI notifications from a device in
> qemu supposed to work and bypass the vmsi_deliver check, given the
> delivery type of 0x3 is indeed reserved?

I wanted to see whether the HVM guest can interact with the MSI
virtualized controller properly without any of the Xen-specific code
in the linux kernel kicking in (i.e., allowing the regular PCI/MSI
code in linux to fire). So I rebuilt the kernel with CONFIG_XEN
disabled, such that pci_xen_hvm_init no longer sets x86_msi.*msi_irqs
to xen-specific routines like xen_hvm_setup_msi_irqs, which is where
the 0x4300 is getting populated. This seems to work properly: the MSI
data register for the controller ends up with a valid value like
0x4049, vmsi_deliver no longer complains, all MSI notifications are
delivered in the expected way to the guest, and the raw, file-backed
disks attached to the controller show up in fdisk -l.
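
For anyone following along, the override I am referring to looks
roughly like this in arch/x86/pci/xen.c (paraphrased from the kernel
I tested, so treat it as a sketch rather than an exact quote):

    /* Sketch: on a PV-on-HVM guest with CONFIG_XEN, early PCI init
     * redirects the x86 MSI setup hooks to Xen-specific routines. */
    int __init pci_xen_hvm_init(void)
    {
            if (!xen_feature(XENFEAT_hvm_pirqs))
                    return 0;
    #ifdef CONFIG_PCI_MSI
            /* From here on, every MSI enable goes through the Xen
             * path that composes the 0x4300 data value and expects
             * event channel delivery instead of a vectored interrupt. */
            x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
            x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
    #endif
            return 0;
    }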

My conclusion: the linux kernel's xen-specific code, specifically
routines like xen_hvm_setup_msi_irqs, needs to be tweaked to work with
fully virtualized qemu devices that use MSI. I will follow up on that
on LKML.

Thanks,
Deep

* Re: MSI message data register configuration in Xen guests
  2012-06-28 20:52   ` MSI message data register configuration in Xen guests Konrad Rzeszutek Wilk
@ 2012-06-28 21:26     ` Deep Debroy
  2012-06-28 21:27     ` Rolu
  1 sibling, 0 replies; 11+ messages in thread
From: Deep Debroy @ 2012-06-28 21:26 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Rolu, xen-devel

On Thu, Jun 28, 2012 at 1:52 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Tue, Jun 26, 2012 at 04:51:29AM +0200, Rolu wrote:
>> On Tue, Jun 26, 2012 at 4:38 AM, Deep Debroy <ddebroy@gmail.com> wrote:
>> > [...]
>>
>> This seems like the same issue I ran into, though in my case it is
>> with passed through physical devices.
>> [...]
>> No fix yet but help debugging is very welcome.
>
> Huh? You said in http://lists.xen.org/archives/html/xen-devel/2012-06/msg01653.html
> "This worked!"

Hi Konrad, I believe Rolu's response in the thread you pointed to was
with respect to pass-through devices. This current thread is not about
pass-through devices but about a fully virtualized qemu device -
specifically, a disk controller that exposes raw-image-backed files
from the host to the guest as disks, very similar to the default LSI
scsi controller in qemu.

* Re: MSI message data register configuration in Xen guests
  2012-06-28 20:52   ` MSI message data register configuration in Xen guests Konrad Rzeszutek Wilk
  2012-06-28 21:26     ` Deep Debroy
@ 2012-06-28 21:27     ` Rolu
  1 sibling, 0 replies; 11+ messages in thread
From: Rolu @ 2012-06-28 21:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Deep Debroy, xen-devel

On Thu, Jun 28, 2012 at 10:52 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Tue, Jun 26, 2012 at 04:51:29AM +0200, Rolu wrote:
>> On Tue, Jun 26, 2012 at 4:38 AM, Deep Debroy <ddebroy@gmail.com> wrote:
>> > [...]
>>
>> This seems like the same issue I ran into, though in my case it is
>> with passed through physical devices.
>> [...]
>> No fix yet but help debugging is very welcome.
>
> Huh? You said in http://lists.xen.org/archives/html/xen-devel/2012-06/msg01653.html
> "This worked!"

That's a day and a half later.

* Re: MSI message data register configuration in Xen guests
  2012-06-28 20:52     ` Deep Debroy
@ 2012-06-29 11:10       ` Stefano Stabellini
  2012-07-03  5:54         ` Deep Debroy
  2012-07-03 10:28       ` [PATCH] xen: event channel remapping for emulated MSIs Stefano Stabellini
  1 sibling, 1 reply; 11+ messages in thread
From: Stefano Stabellini @ 2012-06-29 11:10 UTC (permalink / raw)
  To: Deep Debroy; +Cc: Rolu, xen-devel

On Thu, 28 Jun 2012, Deep Debroy wrote:
> [...]
>
> I wanted to see whether the HVM guest can interact with the MSI
> virtualized controller properly without any of the Xen-specific code
> in the linux kernel kicking in (i.e., allowing the regular PCI/MSI
> code in linux to fire). So I rebuilt the kernel with CONFIG_XEN
> disabled, such that pci_xen_hvm_init no longer sets x86_msi.*msi_irqs
> to xen-specific routines like xen_hvm_setup_msi_irqs, which is where
> the 0x4300 is getting populated. This seems to work properly: the MSI
> data register for the controller ends up with a valid value like
> 0x4049, vmsi_deliver no longer complains, all MSI notifications are
> delivered in the expected way to the guest, and the raw, file-backed
> disks attached to the controller show up in fdisk -l.
>
> My conclusion: the linux kernel's xen-specific code, specifically
> routines like xen_hvm_setup_msi_irqs, needs to be tweaked to work with
> fully virtualized qemu devices that use MSI. I will follow up on that
> on LKML.

Thanks for your analysis of the problem; I think it is correct: Linux
PV on HVM is trying to set up event channel delivery for the MSI, as
it always does (therefore choosing 0x3 as the delivery mode). However,
emulated devices in QEMU don't support that. To be honest, emulated
devices in QEMU didn't support MSIs at all until very recently, which
is why we are seeing this issue only now.
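
For context, the way the remapping works on the guest side is that
Linux composes an MSI message with vector 0 and stashes the pirq
number in the address bits. A rough sketch of that encode step,
written to mirror the decode in the patch below (the real code is
xen_msi_compose_msg in Linux's arch/x86/pci/xen.c; the helper name
here is made up):

    /* Sketch: encode "deliver this MSI as pirq N" into an MSI message.
     * pirq[7:0] goes into the address destination-ID field (bits
     * 12-19) and pirq[31:8] into the high half of the address; vector
     * 0 with delivery mode 3 (reserved by the spec) marks the message
     * as remapped. */
    static void compose_pirq_msi(uint32_t pirq, uint64_t *addr, uint32_t *data)
    {
        *addr = 0xfee00000ull                          /* MSI address base */
              | (((uint64_t)pirq & 0xff) << 12)        /* pirq[7:0]        */
              | (((uint64_t)pirq & 0xffffff00) << 32); /* pirq[31:8]       */
        *data = (1u << 14)  /* level assert                   */
              | (3u << 8)   /* delivery mode 3 marker         */
              | 0u;         /* vector 0 -> 0x4300 all told    */
    }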

Could you please try this Xen patch and let me know if it makes things
better?


diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index a90927a..f44f3b9 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -281,6 +281,31 @@ void hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
         >> MSI_DATA_TRIGGER_SHIFT;
     uint8_t vector = data & MSI_DATA_VECTOR_MASK;
 
+    if ( !vector )
+    {
+        int pirq = ((addr >> 32) & 0xffffff00) | ((addr >> 12) & 0xff);
+        if ( pirq > 0 )
+        {
+            struct pirq *info = pirq_info(d, pirq);
+
+            /* if it is the first time, allocate the pirq */
+            if (info->arch.hvm.emuirq == IRQ_UNBOUND)
+            {
+                spin_lock(&d->event_lock);
+                map_domain_emuirq_pirq(d, pirq, IRQ_MSI_EMU);
+                spin_unlock(&d->event_lock);
+            } else if (info->arch.hvm.emuirq != IRQ_MSI_EMU)
+            {
+                printk("%s: pirq %d does not correspond to an emulated MSI\n", __func__, pirq);
+                return;
+            }
+            send_guest_pirq(d, info);
+            return;
+        } else {
+            printk("%s: error getting pirq from MSI: pirq = %d\n", __func__, pirq);
+        }
+    }
+
     vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
 }
 
diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
index 40e2245..066f64d 100644
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -188,6 +188,7 @@ void cleanup_domain_irq_mapping(struct domain *);
 })
 #define IRQ_UNBOUND -1
 #define IRQ_PT -2
+#define IRQ_MSI_EMU -3
 
 bool_t cpu_has_pending_apic_eoi(void);
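
As a sanity check of the decode against the failing lspci dump from
the start of the thread (Address 0xfee36000, Data 0x4300): the vector
is 0x4300 & 0xff = 0, so the new branch is taken, and the pirq comes
out as 0x36 (54). My arithmetic, worth double-checking:

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t addr = 0xfee36000ull; /* from the failing lspci dump */
        uint32_t data = 0x4300;        /* vector 0, delivery mode 3   */
        assert((data & 0xff) == 0);    /* remapped-MSI marker         */
        int pirq = (int)(((addr >> 32) & 0xffffff00) | ((addr >> 12) & 0xff));
        assert(pirq == 0x36);          /* event channel pirq 54       */
        return 0;
    }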
 


* Re: MSI message data register configuration in Xen guests
  2012-06-29 11:10       ` Stefano Stabellini
@ 2012-07-03  5:54         ` Deep Debroy
  2012-07-03 10:20           ` Stefano Stabellini
  0 siblings, 1 reply; 11+ messages in thread
From: Deep Debroy @ 2012-07-03  5:54 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Rolu, xen-devel

On Fri, Jun 29, 2012 at 4:10 AM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> [...]
>
> Thanks for your analysis of the problem; I think it is correct: Linux
> PV on HVM is trying to set up event channel delivery for the MSI, as
> it always does (therefore choosing 0x3 as the delivery mode). However,
> emulated devices in QEMU don't support that. To be honest, emulated
> devices in QEMU didn't support MSIs at all until very recently, which
> is why we are seeing this issue only now.
>
> Could you please try this Xen patch and let me know if it makes things
> better?

Thanks, Stefano. I have tested the patch with the MSI device and it's
now working (without any additional changes to the linux guest
kernel).

> [...]

* Re: MSI message data register configuration in Xen guests
  2012-07-03  5:54         ` Deep Debroy
@ 2012-07-03 10:20           ` Stefano Stabellini
  0 siblings, 0 replies; 11+ messages in thread
From: Stefano Stabellini @ 2012-07-03 10:20 UTC (permalink / raw)
  To: Deep Debroy; +Cc: Rolu, xen-devel, Stefano Stabellini

On Tue, 3 Jul 2012, Deep Debroy wrote:
> [...]
>
> Thanks, Stefano. I have tested the patch with the MSI device and it's
> now working (without any additional changes to the linux guest
> kernel).

Thanks! I'll submit the patch and add your Tested-by.

* [PATCH] xen: event channel remapping for emulated MSIs
  2012-06-28 20:52     ` Deep Debroy
  2012-06-29 11:10       ` Stefano Stabellini
@ 2012-07-03 10:28       ` Stefano Stabellini
  1 sibling, 0 replies; 11+ messages in thread
From: Stefano Stabellini @ 2012-07-03 10:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Deep Debroy, Ian Campbell, Stefano Stabellini

Linux PV on HVM guests remap all MSIs onto event channels, including
the MSIs corresponding to QEMU's emulated devices. This patch makes
sure that we correctly handle the case of emulated MSIs that have been
remapped, sending a pirq to the guest instead.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Deep Debroy <ddebroy@gmail.com>

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index a90927a..f44f3b9 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -281,6 +281,31 @@ void hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
         >> MSI_DATA_TRIGGER_SHIFT;
     uint8_t vector = data & MSI_DATA_VECTOR_MASK;
 
+    if ( !vector )
+    {
+        int pirq = ((addr >> 32) & 0xffffff00) | ((addr >> 12) & 0xff);
+        if ( pirq > 0 )
+        {
+            struct pirq *info = pirq_info(d, pirq);
+
+            /* if it is the first time, allocate the pirq */
+            if (info->arch.hvm.emuirq == IRQ_UNBOUND)
+            {
+                spin_lock(&d->event_lock);
+                map_domain_emuirq_pirq(d, pirq, IRQ_MSI_EMU);
+                spin_unlock(&d->event_lock);
+            } else if (info->arch.hvm.emuirq != IRQ_MSI_EMU)
+            {
+                printk("%s: pirq %d does not correspond to an emulated MSI\n", __func__, pirq);
+                return;
+            }
+            send_guest_pirq(d, info);
+            return;
+        } else {
+            printk("%s: error getting pirq from MSI: pirq = %d\n", __func__, pirq);
+        }
+    }
+
     vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
 }
 
diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
index 40e2245..066f64d 100644
--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -188,6 +188,7 @@ void cleanup_domain_irq_mapping(struct domain *);
 })
 #define IRQ_UNBOUND -1
 #define IRQ_PT -2
+#define IRQ_MSI_EMU -3
 
 bool_t cpu_has_pending_apic_eoi(void);
