All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xen: reuse the same pirq allocated when driver load first time
@ 2013-05-08  8:18 Zhenzhong Duan
  2013-05-10 18:53 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Zhenzhong Duan @ 2013-05-08  8:18 UTC (permalink / raw)
  To: xen-devel, linux-kernel
  Cc: Konrad Rzeszutek Wilk, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

When driver load and unload in a loop, pirq will exhaust finally.
Try to use the same pirq which was already mapped and binded at first time
when driver loaded.

Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
when driver unload, it's always true in second load.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
---
 arch/x86/pci/xen.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 94e7662..4289eef 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -230,8 +230,7 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 		__read_msi_msg(msidesc, &msg);
 		pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
 			((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
-		if (msg.data != XEN_PIRQ_MSI_DATA ||
-		    xen_irq_from_pirq(pirq) < 0) {
+		if (msg.data != XEN_PIRQ_MSI_DATA) {
 			pirq = xen_allocate_pirq_msi(dev, msidesc);
 			if (pirq < 0) {
 				irq = -ENODEV;
-- 
1.7.3


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-08  8:18 [PATCH] xen: reuse the same pirq allocated when driver load first time Zhenzhong Duan
@ 2013-05-10 18:53 ` Konrad Rzeszutek Wilk
  2013-05-13  7:44   ` Zhenzhong Duan
  2013-05-13 11:06   ` Stefano Stabellini
  0 siblings, 2 replies; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-10 18:53 UTC (permalink / raw)
  To: Zhenzhong Duan, Stefano Stabellini
  Cc: xen-devel, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

[-- Attachment #1: Type: text/plain, Size: 2513 bytes --]

On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> When driver load and unload in a loop, pirq will exhaust finally.
> Try to use the same pirq which was already mapped and binded at first time

So what happens if I unload and reload two drivers in random order?

> when driver loaded.
> 
> Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> when driver unload, it's always true in second load.

If my understanding is right the issue at hand is that the caching
information about the pirq disappears once the driver has been
unloaded b/c the event's irq-info is removed (as the driver is
unloaded and free_irq is called).

Stefano,
Is there a specific write to the MSI structure that would cause the
hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
PIRQ in usage?

I presume that on the first load the msg.data value is
ALWAYS != XEN_PIRQ_MSI_DATA right? Is that somethign we can ALWAYS
depend? Or is there a possibility that it could be that when the
PCI device is unplugged from one guest and plugged in another?

Could we use PHYSDEVOP_irq_status_query to figure out if the PIRQ
has been already allocated? Ah, does not look like it.

I am a bit worried about depending on just msg.data to tells us
whether the device had been initialized before.

Perhaps a different solution could be to utilize the
'xen_device_domain_owner' code. See attached patch (not
compile tested) to retain whether we had initialized this
device in the past and we can skip the hypercall.

On a second thought it could also use some extra checks
to make sure that the pirq that would be used on the second
run is the same as on the first one.

> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
> Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
> ---
>  arch/x86/pci/xen.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> index 94e7662..4289eef 100644
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -230,8 +230,7 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>  		__read_msi_msg(msidesc, &msg);
>  		pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
>  			((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
> -		if (msg.data != XEN_PIRQ_MSI_DATA ||
> -		    xen_irq_from_pirq(pirq) < 0) {
> +		if (msg.data != XEN_PIRQ_MSI_DATA) {

>  			if (pirq < 0) {
>  				irq = -ENODEV;
> -- 
> 1.7.3
> 

[-- Attachment #2: use-xen-device-domain-owner.patch --]
[-- Type: text/plain, Size: 1857 bytes --]

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 56ab749..0150bc9 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -225,7 +225,7 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 		pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
 			((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
 		if (msg.data != XEN_PIRQ_MSI_DATA ||
-		    xen_irq_from_pirq(pirq) < 0) {
+		    xen_device_pirq_alloced(dev)) {
 			pirq = xen_allocate_pirq_msi(dev, msidesc);
 			if (pirq < 0) {
 				irq = -ENODEV;
@@ -234,6 +234,7 @@ static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 			xen_msi_compose_msg(dev, pirq, &msg);
 			__write_msi_msg(msidesc, &msg);
 			dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
+			xen_device_allocated(dev);
 		} else {
 			dev_dbg(&dev->dev,
 				"xen: msi already bound to pirq=%d\n", pirq);
@@ -500,6 +501,7 @@ int __init pci_xen_initial_domain(void)
 
 struct xen_device_domain_owner {
 	domid_t domain;
+	bool used;
 	struct pci_dev *dev;
 	struct list_head list;
 };
@@ -570,4 +572,32 @@ int xen_unregister_device_domain_owner(struct pci_dev *dev)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(xen_unregister_device_domain_owner);
+
+int xen_device_allocated(struct pci_dev *dev)
+{
+	struct xen_device_domain_owner *owner;
+
+	int rc = xen_register_device_domain_owner(dev, DOMID_SELF);
+	if (rc)
+		return rc;
+
+	spin_lock(&dev_domain_list_spinlock);
+	owner = find_device(dev);
+	if (owner)
+		owner->used = true;
+	spin_unlock(&dev_domain_list_spinlock);
+	return 0;
+}
+bool xen_device_pirq_alloced(struct pci_dev *dev)
+{
+	struct xen_device_domain_owner *owner;
+	bool rc = false;
+
+	spin_lock(&dev_domain_list_spinlock);
+	owner = find_device(dev);
+	if (owner && owner->used)
+		rc = true;
+	spin_unlock(&dev_domain_list_spinlock);
+	return rc;
+}
 #endif

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-10 18:53 ` Konrad Rzeszutek Wilk
@ 2013-05-13  7:44   ` Zhenzhong Duan
  2013-05-13 11:06   ` Stefano Stabellini
  1 sibling, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-05-13  7:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen


On 2013-05-11 02:53, Konrad Rzeszutek Wilk wrote:
> On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
>> When driver load and unload in a loop, pirq will exhaust finally.
>> Try to use the same pirq which was already mapped and binded at first time
> So what happens if I unload and reload two drivers in random order?
Same result. In fact, not only pirq exhuast issue, also driver can't be 
reloaded successfully everytime.
When trying to load driver which need to setup msix (ex. mlx4_core) it once
succeed and once failed (3rd time succeed, 4th failed etc).

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-10 18:53 ` Konrad Rzeszutek Wilk
  2013-05-13  7:44   ` Zhenzhong Duan
@ 2013-05-13 11:06   ` Stefano Stabellini
  2013-05-13 14:07     ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-13 11:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Zhenzhong Duan, Stefano Stabellini, xen-devel, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Yuval Shaia,
	Feng Jin, Chien Yen

On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > When driver load and unload in a loop, pirq will exhaust finally.
> > Try to use the same pirq which was already mapped and binded at first time
> 
> So what happens if I unload and reload two drivers in random order?
> 
> > when driver loaded.
> > 
> > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > when driver unload, it's always true in second load.
> 
> If my understanding is right the issue at hand is that the caching
> information about the pirq disappears once the driver has been
> unloaded b/c the event's irq-info is removed (as the driver is
> unloaded and free_irq is called).
> 
> Stefano,
> Is there a specific write to the MSI structure that would cause the
> hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> PIRQ in usage?

We already have a "free PIRQ" hypercall, it's called
PHYSDEVOP_unmap_pirq and should be called by QEMU.
Linux should disable the MSI bit in the PCI config options of the
device:

pci_disable_msi -> pci_msi_shutdown -> msi_set_enable(0)

That should cause QEMU to issue a xc_physdev_unmap_pirq that actually
unmaps the pirq. If it doesn't, it's a bug :)


> I presume that on the first load the msg.data value is
> ALWAYS != XEN_PIRQ_MSI_DATA right? Is that somethign we can ALWAYS
> depend? Or is there a possibility that it could be that when the
> PCI device is unplugged from one guest and plugged in another?
> 
> Could we use PHYSDEVOP_irq_status_query to figure out if the PIRQ
> has been already allocated? Ah, does not look like it.
> 
> I am a bit worried about depending on just msg.data to tells us
> whether the device had been initialized before.

Data comes from QEMU, I think it is zeroed at init time.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-13 11:06   ` Stefano Stabellini
@ 2013-05-13 14:07     ` Konrad Rzeszutek Wilk
  2013-05-13 14:50       ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-13 14:07 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > When driver load and unload in a loop, pirq will exhaust finally.
> > > Try to use the same pirq which was already mapped and binded at first time
> > 
> > So what happens if I unload and reload two drivers in random order?
> > 
> > > when driver loaded.
> > > 
> > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > when driver unload, it's always true in second load.
> > 
> > If my understanding is right the issue at hand is that the caching
> > information about the pirq disappears once the driver has been
> > unloaded b/c the event's irq-info is removed (as the driver is
> > unloaded and free_irq is called).
> > 
> > Stefano,
> > Is there a specific write to the MSI structure that would cause the
> > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > PIRQ in usage?
> 
> We already have a "free PIRQ" hypercall, it's called
> PHYSDEVOP_unmap_pirq and should be called by QEMU.

Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
it in the Linux kernel (and not in QEMU), perhaps that should be done in the
Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?

It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
unregister_real_device which is only called during pci unplug?

> Linux should disable the MSI bit in the PCI config options of the
> device:
> 
> pci_disable_msi -> pci_msi_shutdown -> msi_set_enable(0)

Zhenzhong, does it do that? Looking at the driver it certainly seems that way.
> 
> That should cause QEMU to issue a xc_physdev_unmap_pirq that actually
> unmaps the pirq. If it doesn't, it's a bug :)

<sigh> It doesn't do that. So two bugs:
 - QEMU doing that
 - Linux kernel doing the hypercall as well.

> 
> 
> > I presume that on the first load the msg.data value is
> > ALWAYS != XEN_PIRQ_MSI_DATA right? Is that somethign we can ALWAYS
> > depend? Or is there a possibility that it could be that when the
> > PCI device is unplugged from one guest and plugged in another?
> > 
> > Could we use PHYSDEVOP_irq_status_query to figure out if the PIRQ
> > has been already allocated? Ah, does not look like it.
> > 
> > I am a bit worried about depending on just msg.data to tells us
> > whether the device had been initialized before.
> 
> Data comes from QEMU, I think it is zeroed at init time.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-13 14:07     ` Konrad Rzeszutek Wilk
@ 2013-05-13 14:50       ` Stefano Stabellini
  2013-05-13 16:17         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-13 14:50 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Zhenzhong Duan, xen-devel, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Yuval Shaia,
	Feng Jin, Chien Yen

On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > Try to use the same pirq which was already mapped and binded at first time
> > > 
> > > So what happens if I unload and reload two drivers in random order?
> > > 
> > > > when driver loaded.
> > > > 
> > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > when driver unload, it's always true in second load.
> > > 
> > > If my understanding is right the issue at hand is that the caching
> > > information about the pirq disappears once the driver has been
> > > unloaded b/c the event's irq-info is removed (as the driver is
> > > unloaded and free_irq is called).
> > > 
> > > Stefano,
> > > Is there a specific write to the MSI structure that would cause the
> > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > PIRQ in usage?
> > 
> > We already have a "free PIRQ" hypercall, it's called
> > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> 
> Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?

I think it would confuse QEMU. It is probably better to let the unmap
being handled by it.


> It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> unregister_real_device which is only called during pci unplug?

You are right! I would think that this behaviour is erroneous unless it
was done on purpose to avoid allocating MSIs twice.
If that is the case we would need to do something similar in Linux too.

I think that the issue is the mismatch between QEMU's and Linux's
behaviours: either both should be allocating MSIs once, or they should
both be allocating and deallocating MSIs every time the driver is loaded
and unloaded.


> > Linux should disable the MSI bit in the PCI config options of the
> > device:
> > 
> > pci_disable_msi -> pci_msi_shutdown -> msi_set_enable(0)
> 
> Zhenzhong, does it do that? Looking at the driver it certainly seems that way.
> > 
> > That should cause QEMU to issue a xc_physdev_unmap_pirq that actually
> > unmaps the pirq. If it doesn't, it's a bug :)
> 
> <sigh> It doesn't do that. So two bugs:
>  - QEMU doing that
>  - Linux kernel doing the hypercall as well.

At first sight I also thought that the Linux kernel was issuing
PHYSDEVOP_unmap_pirq too but actually Linux is only doing it if
xen_initial_domain().

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-13 14:50       ` Stefano Stabellini
@ 2013-05-13 16:17         ` Konrad Rzeszutek Wilk
  2013-05-13 17:24           ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-13 16:17 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > > Try to use the same pirq which was already mapped and binded at first time
> > > > 
> > > > So what happens if I unload and reload two drivers in random order?
> > > > 
> > > > > when driver loaded.
> > > > > 
> > > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > > when driver unload, it's always true in second load.
> > > > 
> > > > If my understanding is right the issue at hand is that the caching
> > > > information about the pirq disappears once the driver has been
> > > > unloaded b/c the event's irq-info is removed (as the driver is
> > > > unloaded and free_irq is called).
> > > > 
> > > > Stefano,
> > > > Is there a specific write to the MSI structure that would cause the
> > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > > PIRQ in usage?
> > > 
> > > We already have a "free PIRQ" hypercall, it's called
> > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > 
> > Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> > it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> > Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
> 
> I think it would confuse QEMU. It is probably better to let the unmap
> being handled by it.
> 
> 
> > It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> > unregister_real_device which is only called during pci unplug?
> 
> You are right! I would think that this behaviour is erroneous unless it
> was done on purpose to avoid allocating MSIs twice.
> If that is the case we would need to do something similar in Linux too.
> 
> I think that the issue is the mismatch between QEMU's and Linux's
> behaviours: either both should be allocating MSIs once, or they should
> both be allocating and deallocating MSIs every time the driver is loaded
> and unloaded.

Right. But we also have the scenario that QEMU and Linux are going to 
be out of sync. So we need fixes in both places - I think.

> 
> 
> > > Linux should disable the MSI bit in the PCI config options of the
> > > device:
> > > 
> > > pci_disable_msi -> pci_msi_shutdown -> msi_set_enable(0)
> > 
> > Zhenzhong, does it do that? Looking at the driver it certainly seems that way.
> > > 
> > > That should cause QEMU to issue a xc_physdev_unmap_pirq that actually
> > > unmaps the pirq. If it doesn't, it's a bug :)
> > 
> > <sigh> It doesn't do that. So two bugs:
> >  - QEMU doing that
> >  - Linux kernel doing the hypercall as well.
> 
> At first sight I also thought that the Linux kernel was issuing
> PHYSDEVOP_unmap_pirq too but actually Linux is only doing it if
> xen_initial_domain().

That seems like an easy fix. Just do 'if (xen_initial_domain()
|| xen_hvm_domain())'. I think the only one we cannot do it for
is 'xen_pv_domain()' (so PCI in PV guests) as the "owner" of the PIRQ is
actually dom0.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-13 16:17         ` Konrad Rzeszutek Wilk
@ 2013-05-13 17:24           ` Stefano Stabellini
  2013-05-13 18:20             ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-13 17:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Zhenzhong Duan, xen-devel, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Yuval Shaia,
	Feng Jin, Chien Yen

On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
> > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > > > Try to use the same pirq which was already mapped and binded at first time
> > > > > 
> > > > > So what happens if I unload and reload two drivers in random order?
> > > > > 
> > > > > > when driver loaded.
> > > > > > 
> > > > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > > > when driver unload, it's always true in second load.
> > > > > 
> > > > > If my understanding is right the issue at hand is that the caching
> > > > > information about the pirq disappears once the driver has been
> > > > > unloaded b/c the event's irq-info is removed (as the driver is
> > > > > unloaded and free_irq is called).
> > > > > 
> > > > > Stefano,
> > > > > Is there a specific write to the MSI structure that would cause the
> > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > > > PIRQ in usage?
> > > > 
> > > > We already have a "free PIRQ" hypercall, it's called
> > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > 
> > > Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> > > it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> > > Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
> > 
> > I think it would confuse QEMU. It is probably better to let the unmap
> > being handled by it.
> > 
> > 
> > > It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> > > unregister_real_device which is only called during pci unplug?
> > 
> > You are right! I would think that this behaviour is erroneous unless it
> > was done on purpose to avoid allocating MSIs twice.
> > If that is the case we would need to do something similar in Linux too.
> > 
> > I think that the issue is the mismatch between QEMU's and Linux's
> > behaviours: either both should be allocating MSIs once, or they should
> > both be allocating and deallocating MSIs every time the driver is loaded
> > and unloaded.
> 
> Right. But we also have the scenario that QEMU and Linux are going to 
> be out of sync. So we need fixes in both places - I think.

QEMU is the owner of the pirq, in fact it is the one that creates and
destroys the mapping. I think that the right place to fix this problem
is in QEMU, the ABI would be much cleaner as a result. As a side effect
we don't need to make any changes in Linux.

> > 
> > > > Linux should disable the MSI bit in the PCI config options of the
> > > > device:
> > > > 
> > > > pci_disable_msi -> pci_msi_shutdown -> msi_set_enable(0)
> > > 
> > > Zhenzhong, does it do that? Looking at the driver it certainly seems that way.
> > > > 
> > > > That should cause QEMU to issue a xc_physdev_unmap_pirq that actually
> > > > unmaps the pirq. If it doesn't, it's a bug :)
> > > 
> > > <sigh> It doesn't do that. So two bugs:
> > >  - QEMU doing that
> > >  - Linux kernel doing the hypercall as well.
> > 
> > At first sight I also thought that the Linux kernel was issuing
> > PHYSDEVOP_unmap_pirq too but actually Linux is only doing it if
> > xen_initial_domain().
> 
> That seems like an easy fix. Just do 'if (xen_initial_domain()
> || xen_hvm_domain())'. I think the only one we cannot do it for
> is 'xen_pv_domain()' (so PCI in PV guests) as the "owner" of the PIRQ is
> actually dom0.

We could do that and it might be easier than changing QEMU, but I think
it would be more clear and consistent if QEMU was the one to do the
unmapping.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-13 17:24           ` Stefano Stabellini
@ 2013-05-13 18:20             ` Konrad Rzeszutek Wilk
  2013-05-14 13:49               ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-13 18:20 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
> > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > > > > Try to use the same pirq which was already mapped and binded at first time
> > > > > > 
> > > > > > So what happens if I unload and reload two drivers in random order?
> > > > > > 
> > > > > > > when driver loaded.
> > > > > > > 
> > > > > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > > > > when driver unload, it's always true in second load.
> > > > > > 
> > > > > > If my understanding is right the issue at hand is that the caching
> > > > > > information about the pirq disappears once the driver has been
> > > > > > unloaded b/c the event's irq-info is removed (as the driver is
> > > > > > unloaded and free_irq is called).
> > > > > > 
> > > > > > Stefano,
> > > > > > Is there a specific write to the MSI structure that would cause the
> > > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > > > > PIRQ in usage?
> > > > > 
> > > > > We already have a "free PIRQ" hypercall, it's called
> > > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > > 
> > > > Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> > > > it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> > > > Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
> > > 
> > > I think it would confuse QEMU. It is probably better to let the unmap
> > > being handled by it.
> > > 
> > > 
> > > > It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> > > > unregister_real_device which is only called during pci unplug?
> > > 
> > > You are right! I would think that this behaviour is erroneous unless it
> > > was done on purpose to avoid allocating MSIs twice.
> > > If that is the case we would need to do something similar in Linux too.
> > > 
> > > I think that the issue is the mismatch between QEMU's and Linux's
> > > behaviours: either both should be allocating MSIs once, or they should
> > > both be allocating and deallocating MSIs every time the driver is loaded
> > > and unloaded.
> > 
> > Right. But we also have the scenario that QEMU and Linux are going to 
> > be out of sync. So we need fixes in both places - I think.
> 
> QEMU is the owner of the pirq, in fact it is the one that creates and
> destroys the mapping. I think that the right place to fix this problem
> is in QEMU, the ABI would be much cleaner as a result. As a side effect
> we don't need to make any changes in Linux.

You do. You need to remove the PHYSDEVOP_get_free_pirq call in that case.
> 
> > > 
> > > > > Linux should disable the MSI bit in the PCI config options of the
> > > > > device:
> > > > > 
> > > > > pci_disable_msi -> pci_msi_shutdown -> msi_set_enable(0)
> > > > 
> > > > Zhenzhong, does it do that? Looking at the driver it certainly seems that way.
> > > > > 
> > > > > That should cause QEMU to issue a xc_physdev_unmap_pirq that actually
> > > > > unmaps the pirq. If it doesn't, it's a bug :)
> > > > 
> > > > <sigh> It doesn't do that. So two bugs:
> > > >  - QEMU doing that
> > > >  - Linux kernel doing the hypercall as well.
> > > 
> > > At first sight I also thought that the Linux kernel was issuing
> > > PHYSDEVOP_unmap_pirq too but actually Linux is only doing it if
> > > xen_initial_domain().
> > 
> > That seems like an easy fix. Just do 'if (xen_initial_domain()
> > || xen_hvm_domain())'. I think the only one we cannot do it for
> > is 'xen_pv_domain()' (so PCI in PV guests) as the "owner" of the PIRQ is
> > actually dom0.
> 
> We could do that and it might be easier than changing QEMU, but I think
> it would be more clear and consistent if QEMU was the one to do the
> unmapping.

And allocating the PIRQ?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-13 18:20             ` Konrad Rzeszutek Wilk
@ 2013-05-14 13:49               ` Stefano Stabellini
  2013-05-14 14:20                 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-14 13:49 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Zhenzhong Duan, xen-devel, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Yuval Shaia,
	Feng Jin, Chien Yen

On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
> > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
> > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > > > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > > > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > > > > > Try to use the same pirq which was already mapped and binded at first time
> > > > > > > 
> > > > > > > So what happens if I unload and reload two drivers in random order?
> > > > > > > 
> > > > > > > > when driver loaded.
> > > > > > > > 
> > > > > > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > > > > > when driver unload, it's always true in second load.
> > > > > > > 
> > > > > > > If my understanding is right the issue at hand is that the caching
> > > > > > > information about the pirq disappears once the driver has been
> > > > > > > unloaded b/c the event's irq-info is removed (as the driver is
> > > > > > > unloaded and free_irq is called).
> > > > > > > 
> > > > > > > Stefano,
> > > > > > > Is there a specific write to the MSI structure that would cause the
> > > > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > > > > > PIRQ in usage?
> > > > > > 
> > > > > > We already have a "free PIRQ" hypercall, it's called
> > > > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > > > 
> > > > > Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> > > > > it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> > > > > Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
> > > > 
> > > > I think it would confuse QEMU. It is probably better to let the unmap
> > > > being handled by it.
> > > > 
> > > > 
> > > > > It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> > > > > unregister_real_device which is only called during pci unplug?
> > > > 
> > > > You are right! I would think that this behaviour is erroneous unless it
> > > > was done on purpose to avoid allocating MSIs twice.
> > > > If that is the case we would need to do something similar in Linux too.
> > > > 
> > > > I think that the issue is the mismatch between QEMU's and Linux's
> > > > behaviours: either both should be allocating MSIs once, or they should
> > > > both be allocating and deallocating MSIs every time the driver is loaded
> > > > and unloaded.
> > > 
> > > Right. But we also have the scenario that QEMU and Linux are going to 
> > > be out of sync. So we need fixes in both places - I think.
> > 
> > QEMU is the owner of the pirq, in fact it is the one that creates and
> > destroys the mapping. I think that the right place to fix this problem
> > is in QEMU, the ABI would be much cleaner as a result. As a side effect
> > we don't need to make any changes in Linux.
> 
> You do. You need to remove the PHYSDEVOP_get_free_pirq call in that case.
 
PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
pirq that QEMU is going to use.

However I would let QEMU handle the mapping (it already does that in
pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
I think the problem is that pt_msi_disable is only called on
unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
the guest disables MSIs.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-14 13:49               ` Stefano Stabellini
@ 2013-05-14 14:20                 ` Konrad Rzeszutek Wilk
  2013-05-15  9:41                   ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-14 14:20 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

On Tue, May 14, 2013 at 02:49:50PM +0100, Stefano Stabellini wrote:
> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
> > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
> > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > > > > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > > > > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > > > > > > Try to use the same pirq which was already mapped and binded at first time
> > > > > > > > 
> > > > > > > > So what happens if I unload and reload two drivers in random order?
> > > > > > > > 
> > > > > > > > > when driver loaded.
> > > > > > > > > 
> > > > > > > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > > > > > > when driver unload, it's always true in second load.
> > > > > > > > 
> > > > > > > > If my understanding is right the issue at hand is that the caching
> > > > > > > > information about the pirq disappears once the driver has been
> > > > > > > > unloaded b/c the event's irq-info is removed (as the driver is
> > > > > > > > unloaded and free_irq is called).
> > > > > > > > 
> > > > > > > > Stefano,
> > > > > > > > Is there a specific write to the MSI structure that would cause the
> > > > > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > > > > > > PIRQ in usage?
> > > > > > > 
> > > > > > > We already have a "free PIRQ" hypercall, it's called
> > > > > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > > > > 
> > > > > > Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> > > > > > it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> > > > > > Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
> > > > > 
> > > > > I think it would confuse QEMU. It is probably better to let the unmap
> > > > > being handled by it.
> > > > > 
> > > > > 
> > > > > > It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> > > > > > unregister_real_device which is only called during pci unplug?
> > > > > 
> > > > > You are right! I would think that this behaviour is erroneous unless it
> > > > > was done on purpose to avoid allocating MSIs twice.
> > > > > If that is the case we would need to do something similar in Linux too.
> > > > > 
> > > > > I think that the issue is the mismatch between QEMU's and Linux's
> > > > > behaviours: either both should be allocating MSIs once, or they should
> > > > > both be allocating and deallocating MSIs every time the driver is loaded
> > > > > and unloaded.
> > > > 
> > > > Right. But we also have the scenario that QEMU and Linux are going to 
> > > > be out of sync. So we need fixes in both places - I think.
> > > 
> > > QEMU is the owner of the pirq, in fact it is the one that creates and
> > > destroys the mapping. I think that the right place to fix this problem
> > > is in QEMU, the ABI would be much cleaner as a result. As a side effect
> > > we don't need to make any changes in Linux.
> > 
> > You do. You need to remove the PHYSDEVOP_get_free_pirq call in that case.
>  
> PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
> pirq that QEMU is going to use.

That looks like an API violation. We have an hypercall that
allocates the PIRQ in the Linux, then two hypercalls in the QEMU
layer - one to map, and the other to unmap and free.


> 
> However I would let QEMU handle the mapping (it already does that in
> pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
> done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
> I think the problem is that pt_msi_disable is only called on
> unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
> the guest disables MSIs.

Sure, I am not disputing that. I think the fix in QEMU to call the
unmap is correct.

But I am also wondering whether it makes sense to do that in the Linux
kernel - as it does the alloc in the first place. Seems like a bit of
duct-tape has been used to connect this plumbing together.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-14 14:20                 ` Konrad Rzeszutek Wilk
@ 2013-05-15  9:41                   ` Stefano Stabellini
  2013-05-15 14:18                     ` Zhenzhong Duan
  2013-05-17  2:22                     ` Zhenzhong Duan
  0 siblings, 2 replies; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-15  9:41 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, Zhenzhong Duan, xen-devel, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Yuval Shaia,
	Feng Jin, Chien Yen

On Tue, 14 May 2013, Konrad Rzeszutek Wilk wrote:
> On Tue, May 14, 2013 at 02:49:50PM +0100, Stefano Stabellini wrote:
> > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
> > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
> > > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
> > > > > > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
> > > > > > > > > > When driver load and unload in a loop, pirq will exhaust finally.
> > > > > > > > > > Try to use the same pirq which was already mapped and binded at first time
> > > > > > > > > 
> > > > > > > > > So what happens if I unload and reload two drivers in random order?
> > > > > > > > > 
> > > > > > > > > > when driver loaded.
> > > > > > > > > > 
> > > > > > > > > > Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
> > > > > > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
> > > > > > > > > > when driver unload, it's always true in second load.
> > > > > > > > > 
> > > > > > > > > If my understanding is right the issue at hand is that the caching
> > > > > > > > > information about the pirq disappears once the driver has been
> > > > > > > > > unloaded b/c the event's irq-info is removed (as the driver is
> > > > > > > > > unloaded and free_irq is called).
> > > > > > > > > 
> > > > > > > > > Stefano,
> > > > > > > > > Is there a specific write to the MSI structure that would cause the
> > > > > > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
> > > > > > > > > PIRQ in usage?
> > > > > > > > 
> > > > > > > > We already have a "free PIRQ" hypercall, it's called
> > > > > > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > > > > > 
> > > > > > > Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
> > > > > > > it in the Linux kernel (and not in QEMU), perhaps that should be done in the
> > > > > > > Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
> > > > > > 
> > > > > > I think it would confuse QEMU. It is probably better to let the unmap
> > > > > > being handled by it.
> > > > > > 
> > > > > > 
> > > > > > > It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
> > > > > > > unregister_real_device which is only called during pci unplug?
> > > > > > 
> > > > > > You are right! I would think that this behaviour is erroneous unless it
> > > > > > was done on purpose to avoid allocating MSIs twice.
> > > > > > If that is the case we would need to do something similar in Linux too.
> > > > > > 
> > > > > > I think that the issue is the mismatch between QEMU's and Linux's
> > > > > > behaviours: either both should be allocating MSIs once, or they should
> > > > > > both be allocating and deallocating MSIs every time the driver is loaded
> > > > > > and unloaded.
> > > > > 
> > > > > Right. But we also have the scenario that QEMU and Linux are going to 
> > > > > be out of sync. So we need fixes in both places - I think.
> > > > 
> > > > QEMU is the owner of the pirq, in fact it is the one that creates and
> > > > destroys the mapping. I think that the right place to fix this problem
> > > > is in QEMU, the ABI would be much cleaner as a result. As a side effect
> > > > we don't need to make any changes in Linux.
> > > 
> > > You do. You need to remove the PHYSDEVOP_get_free_pirq call in that case.
> >  
> > PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
> > pirq that QEMU is going to use.
> 
> That looks like an API violation. We have an hypercall that
> allocates the PIRQ in the Linux, then two hypercalls in the QEMU
> layer - one to map, and the other to unmap and free.
> 
> > However I would let QEMU handle the mapping (it already does that in
> > pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
> > done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
> > I think the problem is that pt_msi_disable is only called on
> > unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
> > the guest disables MSIs.
> 
> Sure, I am not disputing that. I think the fix in QEMU to call the
> unmap is correct.
> 
> But I am also wondering whether it makes sense to do that in the Linux
> kernel - as it does the alloc in the first place. Seems like a bit of
> duct-tape has been used to connect this plumbing together.


I admit that it is not a great interface.
I would be open to options that move the entire setup/freeing in Linux,
but keep in mind that we need to retain the pirq code in QEMU for pure
HVM guests.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-15  9:41                   ` Stefano Stabellini
@ 2013-05-15 14:18                     ` Zhenzhong Duan
  2013-05-17  2:22                     ` Zhenzhong Duan
  1 sibling, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-05-15 14:18 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen


On 2013-05-15 17:41, Stefano Stabellini wrote:
> On Tue, 14 May 2013, Konrad Rzeszutek Wilk wrote:
>> On Tue, May 14, 2013 at 02:49:50PM +0100, Stefano Stabellini wrote:
>>> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
>>>> On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
>>>>> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>> On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
>>>>>>> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>> On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
>>>>>>>>> On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>> On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
>>>>>>>>>>> When driver load and unload in a loop, pirq will exhaust finally.
>>>>>>>>>>> Try to use the same pirq which was already mapped and binded at first time
>>>>>>>>>> So what happens if I unload and reload two drivers in random order?
>>>>>>>>>>
>>>>>>>>>>> when driver loaded.
>>>>>>>>>>>
>>>>>>>>>>> Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
>>>>>>>>>>> xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
>>>>>>>>>>> when driver unload, it's always true in second load.
>>>>>>>>>> If my understanding is right the issue at hand is that the caching
>>>>>>>>>> information about the pirq disappears once the driver has been
>>>>>>>>>> unloaded b/c the event's irq-info is removed (as the driver is
>>>>>>>>>> unloaded and free_irq is called).
>>>>>>>>>>
>>>>>>>>>> Stefano,
>>>>>>>>>> Is there a specific write to the MSI structure that would cause the
>>>>>>>>>> hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
>>>>>>>>>> PIRQ in usage?
>>>>>>>>> We already have a "free PIRQ" hypercall, it's called
>>>>>>>>> PHYSDEVOP_unmap_pirq and should be called by QEMU.
>>>>>>>> Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
>>>>>>>> it in the Linux kernel (and not in QEMU), perhaps that should be done in the
>>>>>>>> Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
>>>>>>> I think it would confuse QEMU. It is probably better to let the unmap
>>>>>>> being handled by it.
>>>>>>>
>>>>>>>
>>>>>>>> It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
>>>>>>>> unregister_real_device which is only called during pci unplug?
>>>>>>> You are right! I would think that this behaviour is erroneous unless it
>>>>>>> was done on purpose to avoid allocating MSIs twice.
>>>>>>> If that is the case we would need to do something similar in Linux too.
>>>>>>>
>>>>>>> I think that the issue is the mismatch between QEMU's and Linux's
>>>>>>> behaviours: either both should be allocating MSIs once, or they should
>>>>>>> both be allocating and deallocating MSIs every time the driver is loaded
>>>>>>> and unloaded.
>>>>>> Right. But we also have the scenario that QEMU and Linux are going to
>>>>>> be out of sync. So we need fixes in both places - I think.
>>>>> QEMU is the owner of the pirq, in fact it is the one that creates and
>>>>> destroys the mapping. I think that the right place to fix this problem
>>>>> is in QEMU, the ABI would be much cleaner as a result. As a side effect
>>>>> we don't need to make any changes in Linux.
>>>> You do. You need to remove the PHYSDEVOP_get_free_pirq call in that case.
>>>   
>>> PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
>>> pirq that QEMU is going to use.
>> That looks like an API violation. We have an hypercall that
>> allocates the PIRQ in the Linux, then two hypercalls in the QEMU
>> layer - one to map, and the other to unmap and free.
>>
>>> However I would let QEMU handle the mapping (it already does that in
>>> pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
>>> done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
>>> I think the problem is that pt_msi_disable is only called on
>>> unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
>>> the guest disables MSIs.
>> Sure, I am not disputing that. I think the fix in QEMU to call the
>> unmap is correct.
>>
>> But I am also wondering whether it makes sense to do that in the Linux
>> kernel - as it does the alloc in the first place. Seems like a bit of
>> duct-tape has been used to connect this plumbing together.
>
> I admit that it is not a great interface.
> I would be open to options that move the entire setup/freeing in Linux,
> but keep in mind that we need to retain the pirq code in QEMU for pure
> HVM guests.
It seems MAP_PIRQ_TYPE_MSI is unsupported in physdev_hvm_map_pirq yet.
Move setup/freeing in Linux need to add that support first.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-15  9:41                   ` Stefano Stabellini
  2013-05-15 14:18                     ` Zhenzhong Duan
@ 2013-05-17  2:22                     ` Zhenzhong Duan
  2013-05-20 10:24                       ` Stefano Stabellini
  1 sibling, 1 reply; 58+ messages in thread
From: Zhenzhong Duan @ 2013-05-17  2:22 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen


On 2013-05-15 17:41, Stefano Stabellini wrote:
> On Tue, 14 May 2013, Konrad Rzeszutek Wilk wrote:
>> On Tue, May 14, 2013 at 02:49:50PM +0100, Stefano Stabellini wrote:
>>> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
>>>> On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
>>>>> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>> On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini wrote:
>>>>>>> On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>> On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini wrote:
>>>>>>>>> On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>> On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan wrote:
>>>>>>>>>>> When driver load and unload in a loop, pirq will exhaust finally.
>>>>>>>>>>> Try to use the same pirq which was already mapped and binded at first time
>>>>>>>>>> So what happens if I unload and reload two drivers in random order?
>>>>>>>>>>
>>>>>>>>>>> when driver loaded.
>>>>>>>>>>>
>>>>>>>>>>> Read pirq from msix entry and test if data is XEN_PIRQ_MSI_DATA
>>>>>>>>>>> xen_irq_from_pirq(pirq) < 0 checking is wrong as irq will be freed
>>>>>>>>>>> when driver unload, it's always true in second load.
>>>>>>>>>> If my understanding is right the issue at hand is that the caching
>>>>>>>>>> information about the pirq disappears once the driver has been
>>>>>>>>>> unloaded b/c the event's irq-info is removed (as the driver is
>>>>>>>>>> unloaded and free_irq is called).
>>>>>>>>>>
>>>>>>>>>> Stefano,
>>>>>>>>>> Is there a specific write to the MSI structure that would cause the
>>>>>>>>>> hypervisor to drop the PIRQ? Or a nice hypercall to "free" an
>>>>>>>>>> PIRQ in usage?
>>>>>>>>> We already have a "free PIRQ" hypercall, it's called
>>>>>>>>> PHYSDEVOP_unmap_pirq and should be called by QEMU.
>>>>>>>> Considering that we call function that allocates (PHYSDEVOP_get_free_pirq)
>>>>>>>> it in the Linux kernel (and not in QEMU), perhaps that should be done in the
>>>>>>>> Linux kernel as part of xen_destroy_irq()? Or would that confuse QEMU?
>>>>>>> I think it would confuse QEMU. It is probably better to let the unmap
>>>>>>> being handled by it.
>>>>>>>
>>>>>>>
>>>>>>>> It looks like QEMU only does that hypercall (via xc_physdev_unmap_pirq)
>>>>>>>> unregister_real_device which is only called during pci unplug?
>>>>>>> You are right! I would think that this behaviour is erroneous unless it
>>>>>>> was done on purpose to avoid allocating MSIs twice.
>>>>>>> If that is the case we would need to do something similar in Linux too.
>>>>>>>
>>>>>>> I think that the issue is the mismatch between QEMU's and Linux's
>>>>>>> behaviours: either both should be allocating MSIs once, or they should
>>>>>>> both be allocating and deallocating MSIs every time the driver is loaded
>>>>>>> and unloaded.
>>>>>> Right. But we also have the scenario that QEMU and Linux are going to
>>>>>> be out of sync. So we need fixes in both places - I think.
>>>>> QEMU is the owner of the pirq, in fact it is the one that creates and
>>>>> destroys the mapping. I think that the right place to fix this problem
>>>>> is in QEMU, the ABI would be much cleaner as a result. As a side effect
>>>>> we don't need to make any changes in Linux.
>>>> You do. You need to remove the PHYSDEVOP_get_free_pirq call in that case.
>>>   
>>> PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
>>> pirq that QEMU is going to use.
>> That looks like an API violation. We have an hypercall that
>> allocates the PIRQ in the Linux, then two hypercalls in the QEMU
>> layer - one to map, and the other to unmap and free.
>>
>>> However I would let QEMU handle the mapping (it already does that in
>>> pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
>>> done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
>>> I think the problem is that pt_msi_disable is only called on
>>> unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
>>> the guest disables MSIs.
>> Sure, I am not disputing that. I think the fix in QEMU to call the
>> unmap is correct.
>>
>> But I am also wondering whether it makes sense to do that in the Linux
>> kernel - as it does the alloc in the first place. Seems like a bit of
>> duct-tape has been used to connect this plumbing together.
>
> I admit that it is not a great interface.
> I would be open to options that move the entire setup/freeing in Linux,
> but keep in mind that we need to retain the pirq code in QEMU for pure
> HVM guests.
Hi Stefano,

do you work out a patch for me to test?

thanks
zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-17  2:22                     ` Zhenzhong Duan
@ 2013-05-20 10:24                       ` Stefano Stabellini
  2013-05-20 15:24                         ` Konrad Rzeszutek Wilk
  2013-05-20 17:57                         ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-20 10:24 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, xen-devel,
	linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Yuval Shaia, Feng Jin, Chien Yen

On Fri, 17 May 2013, Zhenzhong Duan wrote:
> On 2013-05-15 17:41, Stefano Stabellini wrote:
> > On Tue, 14 May 2013, Konrad Rzeszutek Wilk wrote:
> > > On Tue, May 14, 2013 at 02:49:50PM +0100, Stefano Stabellini wrote:
> > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
> > > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini
> > > > > > > wrote:
> > > > > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini
> > > > > > > > > wrote:
> > > > > > > > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan
> > > > > > > > > > > wrote:
> > > > > > > > > > > > When driver load and unload in a loop, pirq will exhaust
> > > > > > > > > > > > finally.
> > > > > > > > > > > > Try to use the same pirq which was already mapped and
> > > > > > > > > > > > binded at first time
> > > > > > > > > > > So what happens if I unload and reload two drivers in
> > > > > > > > > > > random order?
> > > > > > > > > > > 
> > > > > > > > > > > > when driver loaded.
> > > > > > > > > > > > 
> > > > > > > > > > > > Read pirq from msix entry and test if data is
> > > > > > > > > > > > XEN_PIRQ_MSI_DATA
> > > > > > > > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq
> > > > > > > > > > > > will be freed
> > > > > > > > > > > > when driver unload, it's always true in second load.
> > > > > > > > > > > If my understanding is right the issue at hand is that the
> > > > > > > > > > > caching
> > > > > > > > > > > information about the pirq disappears once the driver has
> > > > > > > > > > > been
> > > > > > > > > > > unloaded b/c the event's irq-info is removed (as the
> > > > > > > > > > > driver is
> > > > > > > > > > > unloaded and free_irq is called).
> > > > > > > > > > > 
> > > > > > > > > > > Stefano,
> > > > > > > > > > > Is there a specific write to the MSI structure that would
> > > > > > > > > > > cause the
> > > > > > > > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free"
> > > > > > > > > > > an
> > > > > > > > > > > PIRQ in usage?
> > > > > > > > > > We already have a "free PIRQ" hypercall, it's called
> > > > > > > > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > > > > > > > Considering that we call function that allocates
> > > > > > > > > (PHYSDEVOP_get_free_pirq)
> > > > > > > > > it in the Linux kernel (and not in QEMU), perhaps that should
> > > > > > > > > be done in the
> > > > > > > > > Linux kernel as part of xen_destroy_irq()? Or would that
> > > > > > > > > confuse QEMU?
> > > > > > > > I think it would confuse QEMU. It is probably better to let the
> > > > > > > > unmap
> > > > > > > > being handled by it.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > It looks like QEMU only does that hypercall (via
> > > > > > > > > xc_physdev_unmap_pirq)
> > > > > > > > > unregister_real_device which is only called during pci unplug?
> > > > > > > > You are right! I would think that this behaviour is erroneous
> > > > > > > > unless it
> > > > > > > > was done on purpose to avoid allocating MSIs twice.
> > > > > > > > If that is the case we would need to do something similar in
> > > > > > > > Linux too.
> > > > > > > > 
> > > > > > > > I think that the issue is the mismatch between QEMU's and
> > > > > > > > Linux's
> > > > > > > > behaviours: either both should be allocating MSIs once, or they
> > > > > > > > should
> > > > > > > > both be allocating and deallocating MSIs every time the driver
> > > > > > > > is loaded
> > > > > > > > and unloaded.
> > > > > > > Right. But we also have the scenario that QEMU and Linux are going
> > > > > > > to
> > > > > > > be out of sync. So we need fixes in both places - I think.
> > > > > > QEMU is the owner of the pirq, in fact it is the one that creates
> > > > > > and
> > > > > > destroys the mapping. I think that the right place to fix this
> > > > > > problem
> > > > > > is in QEMU, the ABI would be much cleaner as a result. As a side
> > > > > > effect
> > > > > > we don't need to make any changes in Linux.
> > > > > You do. You need to remove the PHYSDEVOP_get_free_pirq call in that
> > > > > case.
> > > >   PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
> > > > pirq that QEMU is going to use.
> > > That looks like an API violation. We have an hypercall that
> > > allocates the PIRQ in the Linux, then two hypercalls in the QEMU
> > > layer - one to map, and the other to unmap and free.
> > > 
> > > > However I would let QEMU handle the mapping (it already does that in
> > > > pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
> > > > done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
> > > > I think the problem is that pt_msi_disable is only called on
> > > > unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
> > > > the guest disables MSIs.
> > > Sure, I am not disputing that. I think the fix in QEMU to call the
> > > unmap is correct.
> > > 
> > > But I am also wondering whether it makes sense to do that in the Linux
> > > kernel - as it does the alloc in the first place. Seems like a bit of
> > > duct-tape has been used to connect this plumbing together.
> > 
> > I admit that it is not a great interface.
> > I would be open to options that move the entire setup/freeing in Linux,
> > but keep in mind that we need to retain the pirq code in QEMU for pure
> > HVM guests.
> Hi Stefano,
> 
> do you work out a patch for me to test?

I'll be traveling/busy for a few weeks, maybe it's best if someone else
picks up this work item.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-20 10:24                       ` Stefano Stabellini
@ 2013-05-20 15:24                         ` Konrad Rzeszutek Wilk
  2013-05-20 17:57                         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-20 15:24 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

On Mon, May 20, 2013 at 11:24:04AM +0100, Stefano Stabellini wrote:
> On Fri, 17 May 2013, Zhenzhong Duan wrote:
> > On 2013-05-15 17:41, Stefano Stabellini wrote:
> > > On Tue, 14 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > On Tue, May 14, 2013 at 02:49:50PM +0100, Stefano Stabellini wrote:
> > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > On Mon, May 13, 2013 at 06:24:46PM +0100, Stefano Stabellini wrote:
> > > > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > On Mon, May 13, 2013 at 03:50:52PM +0100, Stefano Stabellini
> > > > > > > > wrote:
> > > > > > > > > On Mon, 13 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > On Mon, May 13, 2013 at 12:06:43PM +0100, Stefano Stabellini
> > > > > > > > > > wrote:
> > > > > > > > > > > On Fri, 10 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > > > On Wed, May 08, 2013 at 04:18:24PM +0800, Zhenzhong Duan
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > When driver load and unload in a loop, pirq will exhaust
> > > > > > > > > > > > > finally.
> > > > > > > > > > > > > Try to use the same pirq which was already mapped and
> > > > > > > > > > > > > binded at first time
> > > > > > > > > > > > So what happens if I unload and reload two drivers in
> > > > > > > > > > > > random order?
> > > > > > > > > > > > 
> > > > > > > > > > > > > when driver loaded.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Read pirq from msix entry and test if data is
> > > > > > > > > > > > > XEN_PIRQ_MSI_DATA
> > > > > > > > > > > > > xen_irq_from_pirq(pirq) < 0 checking is wrong as irq
> > > > > > > > > > > > > will be freed
> > > > > > > > > > > > > when driver unload, it's always true in second load.
> > > > > > > > > > > > If my understanding is right the issue at hand is that the
> > > > > > > > > > > > caching
> > > > > > > > > > > > information about the pirq disappears once the driver has
> > > > > > > > > > > > been
> > > > > > > > > > > > unloaded b/c the event's irq-info is removed (as the
> > > > > > > > > > > > driver is
> > > > > > > > > > > > unloaded and free_irq is called).
> > > > > > > > > > > > 
> > > > > > > > > > > > Stefano,
> > > > > > > > > > > > Is there a specific write to the MSI structure that would
> > > > > > > > > > > > cause the
> > > > > > > > > > > > hypervisor to drop the PIRQ? Or a nice hypercall to "free"
> > > > > > > > > > > > an
> > > > > > > > > > > > PIRQ in usage?
> > > > > > > > > > > We already have a "free PIRQ" hypercall, it's called
> > > > > > > > > > > PHYSDEVOP_unmap_pirq and should be called by QEMU.
> > > > > > > > > > Considering that we call function that allocates
> > > > > > > > > > (PHYSDEVOP_get_free_pirq)
> > > > > > > > > > it in the Linux kernel (and not in QEMU), perhaps that should
> > > > > > > > > > be done in the
> > > > > > > > > > Linux kernel as part of xen_destroy_irq()? Or would that
> > > > > > > > > > confuse QEMU?
> > > > > > > > > I think it would confuse QEMU. It is probably better to let the
> > > > > > > > > unmap
> > > > > > > > > being handled by it.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > It looks like QEMU only does that hypercall (via
> > > > > > > > > > xc_physdev_unmap_pirq)
> > > > > > > > > > unregister_real_device which is only called during pci unplug?
> > > > > > > > > You are right! I would think that this behaviour is erroneous
> > > > > > > > > unless it
> > > > > > > > > was done on purpose to avoid allocating MSIs twice.
> > > > > > > > > If that is the case we would need to do something similar in
> > > > > > > > > Linux too.
> > > > > > > > > 
> > > > > > > > > I think that the issue is the mismatch between QEMU's and
> > > > > > > > > Linux's
> > > > > > > > > behaviours: either both should be allocating MSIs once, or they
> > > > > > > > > should
> > > > > > > > > both be allocating and deallocating MSIs every time the driver
> > > > > > > > > is loaded
> > > > > > > > > and unloaded.
> > > > > > > > Right. But we also have the scenario that QEMU and Linux are going
> > > > > > > > to
> > > > > > > > be out of sync. So we need fixes in both places - I think.
> > > > > > > QEMU is the owner of the pirq, in fact it is the one that creates
> > > > > > > and
> > > > > > > destroys the mapping. I think that the right place to fix this
> > > > > > > problem
> > > > > > > is in QEMU, the ABI would be much cleaner as a result. As a side
> > > > > > > effect
> > > > > > > we don't need to make any changes in Linux.
> > > > > > You do. You need to remove the PHYSDEVOP_get_free_pirq call in that
> > > > > > case.
> > > > >   PHYSDEVOP_get_free_pirq needs to stay, because Linux needs to know the
> > > > > pirq that QEMU is going to use.
> > > > That looks like an API violation. We have an hypercall that
> > > > allocates the PIRQ in the Linux, then two hypercalls in the QEMU
> > > > layer - one to map, and the other to unmap and free.
> > > > 
> > > > > However I would let QEMU handle the mapping (it already does that in
> > > > > pt_msi_setup calling xc_physdev_map_pirq_msi) and unmapping (that is
> > > > > done by calling xc_domain_unbind_msi_irq from pt_msi_disable).
> > > > > I think the problem is that pt_msi_disable is only called on
> > > > > unregister_real_device and pt_reset_interrupt_and_io_mapping, not when
> > > > > the guest disables MSIs.
> > > > Sure, I am not disputing that. I think the fix in QEMU to call the
> > > > unmap is correct.
> > > > 
> > > > But I am also wondering whether it makes sense to do that in the Linux
> > > > kernel - as it does the alloc in the first place. Seems like a bit of
> > > > duct-tape has been used to connect this plumbing together.
> > > 
> > > I admit that it is not a great interface.
> > > I would be open to options that move the entire setup/freeing in Linux,
> > > but keep in mind that we need to retain the pirq code in QEMU for pure
> > > HVM guests.
> > Hi Stefano,
> > 
> > do you work out a patch for me to test?

I did the patch that which calls PHYSDEVOP_unmap_pirq from within
Linux kernel. And it did not fix a thing. Here is the patch (sorry about
the copy-n-paste):
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 6a6bbe4..d122ca9 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -789,21 +789,28 @@ error_irq:
        return ret;
 }
 #endif
-
+static bool unmap_pirq = false;
+static int __init unmap_pirq_setup(char *s)
+{
+       unmap_pirq = true;
+       return 1;
+}
+__setup("unmap_pirq", unmap_pirq_setup);
 int xen_destroy_irq(int irq)
 {
        struct irq_desc *desc;
        struct physdev_unmap_pirq unmap_irq;
        struct irq_info *info = info_for_irq(irq);
        int rc = -ENOENT;
-
        mutex_lock(&irq_mapping_update_lock);

        desc = irq_to_desc(irq);
        if (!desc)
                goto out;

-       if (xen_initial_domain()) {
+       if (xen_initial_domain() || unmap_pirq) {
+               printk(KERN_INFO "irq: %d, pirq:%d unmapping\n",
+                      irq, info->u.pirq.pirq);
                unmap_irq.pirq = info->u.pirq.pirq;
                unmap_irq.domid = info->u.pirq.domid;
                rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);


Meaning that the later call to xen_allocate_pirq_msi would still get new PIRQs
value.

On the other hand, the original patch proposed by Zhenzhong works nicely.


.. I am wondering what is going wrong here.
> 
> I'll be traveling/busy for a few weeks, maybe it's best if someone else
> picks up this work item.

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-20 10:24                       ` Stefano Stabellini
  2013-05-20 15:24                         ` Konrad Rzeszutek Wilk
@ 2013-05-20 17:57                         ` Konrad Rzeszutek Wilk
  2013-05-20 20:38                           ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-20 17:57 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

> > Hi Stefano,
> > 
> > do you work out a patch for me to test?
> 
> I'll be traveling/busy for a few weeks, maybe it's best if someone else
> picks up this work item.

This little test-case below should have worked:

#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/pagemap.h>
#include <linux/init.h>
#include <xen/xen.h>
#include <xen/page.h>
#include <asm/xen/hypervisor.h>
#include <xen/features.h>
#include <xen/events.h>

MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
MODULE_DESCRIPTION("alloc_and_unmap");
MODULE_LICENSE("GPL");
MODULE_VERSION("0.1");

static int do_it(void)
{
	struct physdev_get_free_pirq op_get_free_pirq;
	struct physdev_unmap_pirq unmap_irq;
	int rc, pirq;

	op_get_free_pirq.type = MAP_PIRQ_TYPE_MSI;
 	rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_free_pirq, &op_get_free_pirq);
	if (rc) {
		printk(KERN_WARNING "%s:%d rc:%d\n", __func__, __LINE__, rc);
		return rc;
	}
	pirq = op_get_free_pirq.pirq;
	unmap_irq.pirq = pirq;
	unmap_irq.domid = DOMID_SELF;
	rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
	if (rc) {
		printk(KERN_WARNING "unmap irq failed %d\n", rc);
		return rc;
	}
	printk("PIRQ: %d\n", pirq);
	return 0;
}
static int __init alloc_and_unmap_init(void)
{
	int i;

	for (i = 0; i < 10; i++)
		if (do_it())
			break;
	return 0;
}
static void __exit alloc_and_unmap_exit(void)
{
}
module_init(alloc_and_unmap_init);
module_exit(alloc_and_unmap_exit);


But I get:

# insmod /alloc_and_unmap.ko 
[   34.899277] PIRQ: 55
[   34.901846] PIRQ: 54
[   34.904351] PIRQ: 53
[   34.906921] PIRQ: 52
[   34.909451] PIRQ: 51
[   34.912038] PIRQ: 50
[   34.914650] PIRQ: 49
[   34.917205] PIRQ: 48
[   34.919776] PIRQ: 47
[   34.922339] PIRQ: 46

Which means there is some bug in the hypervisor as well (This is with Xen 4.3
and traditional QEMU - not that it matters as I am just doing these hypercalls).


At this point I think that upstream option is to save the PIRQ value and re-use it.
Will post a patch for it.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-20 17:57                         ` Konrad Rzeszutek Wilk
@ 2013-05-20 20:38                           ` Konrad Rzeszutek Wilk
  2013-05-21 10:07                             ` [Xen-devel] " David Vrabel
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-20 20:38 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Zhenzhong Duan, xen-devel, linux-kernel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Yuval Shaia, Feng Jin, Chien Yen

> At this point I think that upstream option is to save the PIRQ value and re-use it.
> Will post a patch for it.

Here is the patch. It works for me when passing in a NIC driver.

>From 509499568d1cdf1f2a3fb53773c991f4b063eb56 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Mon, 20 May 2013 16:08:16 -0400
Subject: [PATCH] xen/pci: Track PVHVM PIRQs.

The PIRQs that the hypervisor provides for the guest are a limited
resource. They are acquired via PHYSDEVOP_get_free_pirq and in
theory should be returned back to the hypervisor via PHYSDEVOP_unmap_pirq
hypercall. Unfortunatly that is not the case.

This means that if there is a PCI device that has been passed in
the guest and does a loop of 'rmmod <driver>;modprobe <driver>"
we end up exhausting all of the PIRQs that are available.

For example (with kernel built as debug), we get this:
00:05.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
[  152.659396] e1000e 0000:00:05.0: xen: msi bound to pirq=53
[  152.665856] e1000e 0000:00:05.0: xen: msi --> pirq=53 --> irq=73
.. snip
[  188.276835] e1000e 0000:00:05.0: xen: msi bound to pirq=51
[  188.283194] e1000e 0000:00:05.0: xen: msi --> pirq=51 --> irq=73

.. and so on, until the pirq value is zero. This is an acute problem
when many PCI devices with many MSI-X entries are passed in the guest.

There is an alternative solution where we assume that on PCI
initialization (so when user passes in the PCI device) QEMU will init
the MSI and MSI-X entries to zero. Based on that assumptions and
that the Linux MSI API will write the PIRQ value to the MSI/MSI-X
(and used by QEMU through the life-cycle of the PCI device), we can
also depend on that. That means if MSI (or MSI-X entries) are read back
and are not 0, we can re-use that PIRQ value. However this patch
guards against future changes in QEMU in case that assumption
is incorrect.

Reported-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/events.c |  124 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 6a6bbe4..8aae21a 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -112,6 +112,27 @@ struct irq_info {
 #define PIRQ_NEEDS_EOI	(1 << 0)
 #define PIRQ_SHAREABLE	(1 << 1)
 
+/*
+ * The PHYSDEVOP_get_free_pirq allocates a set of PIRQs for the guest and
+ * the PHYSDEVOP_unmap_pirq is suppose to return them to the hypervisor.
+ * Unfortunatly that is not the case and we exhaust all of the PIRQs that are
+ * allocated for the domain if a driver is loaded/unloaded in a loop.
+ * The pirq_info serves a cache of the allocated PIRQs so that we can reuse
+ * for drivers. Note, it is only used by the MSI, MSI-X routines.
+ */
+#ifdef CONFIG_PCI_MSI
+static LIST_HEAD(xen_pirq_list_head);
+#endif
+enum xen_pirq_status {
+	PIRQ_FREE = 1,
+	PIRQ_BUSY
+};
+struct pirq_info {
+	struct list_head list;
+	int pirq;
+	enum xen_pirq_status status;
+};
+
 static int *evtchn_to_irq;
 #ifdef CONFIG_X86
 static unsigned long *pirq_eoi_map;
@@ -145,6 +166,86 @@ static struct irq_chip xen_pirq_chip;
 static void enable_dynirq(struct irq_data *data);
 static void disable_dynirq(struct irq_data *data);
 
+#ifdef CONFIG_PCI_MSI
+static void xen_pirq_add(int pirq)
+{
+	struct pirq_info *p_info;
+
+	if (!xen_hvm_domain())
+		return;
+
+	p_info = kzalloc(sizeof(*p_info), GFP_KERNEL);
+	if (!p_info)
+		return;
+
+	p_info->pirq = pirq;
+	p_info->status = PIRQ_FREE;
+
+	mutex_lock(&irq_mapping_update_lock);
+	list_add_tail(&p_info->list, &xen_pirq_list_head);
+	mutex_unlock(&irq_mapping_update_lock);
+}
+static void xen_update_pirq_status(int pirq, enum xen_pirq_status status)
+{
+	struct pirq_info *p_info = NULL;
+	bool found = false;
+
+	if (!xen_hvm_domain())
+		return;
+
+	/*
+	 * The irq_mapping_update_lock is being held while this is called.
+	 */
+	list_for_each_entry(p_info, &xen_pirq_list_head, list) {
+
+		if (p_info->pirq != pirq)
+			continue;
+		found = true;
+		break;
+	}
+	if (found && p_info)
+		p_info->status = status;
+}
+static int xen_pirq_get_free(void)
+{
+	int pirq = -EBUSY;
+	struct pirq_info *p_info = NULL;
+
+	if (!xen_hvm_domain())
+		return -ENODEV;
+
+	mutex_lock(&irq_mapping_update_lock);
+	list_for_each_entry(p_info, &xen_pirq_list_head, list) {
+		if (p_info->status == PIRQ_BUSY)
+			continue;
+		pirq = p_info->pirq;
+		break;
+	}
+	mutex_unlock(&irq_mapping_update_lock);
+	return pirq;
+}
+static void xen_pirq_clear_all(void)
+{
+	struct pirq_info *p_info = NULL, *tmp;
+
+	if (!xen_hvm_domain())
+		return;
+
+	list_for_each_entry_safe(p_info, tmp, &xen_pirq_list_head, list) {
+		/*
+		 * When migrating to a new host, the PT PCI devices _should_
+		 * be unplugged!
+		 */
+		WARN_ON(p_info->status == PIRQ_BUSY);
+		list_del(&p_info->list);
+		kfree(p_info);
+	}
+}
+#else
+static void xen_update_pirq_status(int pirq, enum xen_pirq_status status) { }
+static void xen_pirq_clear_all(void) { }
+#endif
+
 /* Get info for IRQ */
 static struct irq_info *info_for_irq(unsigned irq)
 {
@@ -222,6 +323,8 @@ static void xen_irq_info_pirq_init(unsigned irq,
 	info->u.pirq.gsi = gsi;
 	info->u.pirq.domid = domid;
 	info->u.pirq.flags = flags;
+
+	xen_update_pirq_status(pirq, PIRQ_BUSY);
 }
 
 /*
@@ -520,6 +622,12 @@ static void xen_free_irq(unsigned irq)
 	if (WARN_ON(!info))
 		return;
 
+	/*
+	 * Update the list of PIRQs (for PVHVM guests) that it is free.
+	 */
+	if (info->type == IRQT_PIRQ)
+		xen_update_pirq_status(info->u.pirq.pirq, PIRQ_FREE);
+
 	list_del(&info->list);
 
 	irq_set_handler_data(irq, NULL);
@@ -750,15 +858,22 @@ out:
 #ifdef CONFIG_PCI_MSI
 int xen_allocate_pirq_msi(struct pci_dev *dev, struct msi_desc *msidesc)
 {
-	int rc;
+	int rc, pirq;
 	struct physdev_get_free_pirq op_get_free_pirq;
 
+	pirq = xen_pirq_get_free();
+	if (pirq >= 0)
+		return pirq;
+
 	op_get_free_pirq.type = MAP_PIRQ_TYPE_MSI;
 	rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_free_pirq, &op_get_free_pirq);
 
 	WARN_ONCE(rc == -ENOSYS,
 		  "hypervisor does not support the PHYSDEVOP_get_free_pirq interface\n");
 
+	if (rc == 0)
+		xen_pirq_add(op_get_free_pirq.pirq);
+
 	return rc ? -1 : op_get_free_pirq.pirq;
 }
 
@@ -1631,6 +1746,11 @@ static void restore_pirqs(void)
 
 		__startup_pirq(irq);
 	}
+
+	/* For PVHVM guests with PCI passthrough devices (that got unplugged
+	 * before the migrate) we MUST clear our cache.
+	 */
+	xen_pirq_clear_all();
 }
 
 static void restore_cpu_virqs(unsigned int cpu)
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-20 20:38                           ` Konrad Rzeszutek Wilk
@ 2013-05-21 10:07                             ` David Vrabel
  2013-05-21 13:40                               ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: David Vrabel @ 2013-05-21 10:07 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, xen-devel, Feng Jin, Zhenzhong Duan,
	Yuval Shaia, linux-kernel, Chien Yen, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner

On 20/05/13 21:38, Konrad Rzeszutek Wilk wrote:
>> At this point I think that upstream option is to save the PIRQ value and re-use it.
>> Will post a patch for it.
> 
> Here is the patch. It works for me when passing in a NIC driver.
> 
>>From 509499568d1cdf1f2a3fb53773c991f4b063eb56 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Mon, 20 May 2013 16:08:16 -0400
> Subject: [PATCH] xen/pci: Track PVHVM PIRQs.
> 
> The PIRQs that the hypervisor provides for the guest are a limited
> resource. They are acquired via PHYSDEVOP_get_free_pirq and in
> theory should be returned back to the hypervisor via PHYSDEVOP_unmap_pirq
> hypercall. Unfortunatly that is not the case.
> 
> This means that if there is a PCI device that has been passed in
> the guest and does a loop of 'rmmod <driver>;modprobe <driver>"
> we end up exhausting all of the PIRQs that are available.
> 
> For example (with kernel built as debug), we get this:
> 00:05.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
> [  152.659396] e1000e 0000:00:05.0: xen: msi bound to pirq=53
> [  152.665856] e1000e 0000:00:05.0: xen: msi --> pirq=53 --> irq=73
> .. snip
> [  188.276835] e1000e 0000:00:05.0: xen: msi bound to pirq=51
> [  188.283194] e1000e 0000:00:05.0: xen: msi --> pirq=51 --> irq=73
> 
> .. and so on, until the pirq value is zero. This is an acute problem
> when many PCI devices with many MSI-X entries are passed in the guest.
> 
> There is an alternative solution where we assume that on PCI
> initialization (so when user passes in the PCI device) QEMU will init
> the MSI and MSI-X entries to zero. Based on that assumptions and
> that the Linux MSI API will write the PIRQ value to the MSI/MSI-X
> (and used by QEMU through the life-cycle of the PCI device), we can
> also depend on that. That means if MSI (or MSI-X entries) are read back
> and are not 0, we can re-use that PIRQ value. However this patch
> guards against future changes in QEMU in case that assumption
> is incorrect.
> 
> Reported-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/xen/events.c |  124 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 122 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index 6a6bbe4..8aae21a 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -112,6 +112,27 @@ struct irq_info {
>  #define PIRQ_NEEDS_EOI	(1 << 0)
>  #define PIRQ_SHAREABLE	(1 << 1)
>  
> +/*
> + * The PHYSDEVOP_get_free_pirq allocates a set of PIRQs for the guest and
> + * the PHYSDEVOP_unmap_pirq is suppose to return them to the hypervisor.
> + * Unfortunatly that is not the case and we exhaust all of the PIRQs that are
> + * allocated for the domain if a driver is loaded/unloaded in a loop.
> + * The pirq_info serves a cache of the allocated PIRQs so that we can reuse
> + * for drivers. Note, it is only used by the MSI, MSI-X routines.
> + */

Ick.  Let's fix the bug in the hypervisor instead of hacking up the
kernel like this.

Looking at the hypervisor code I couldn't see anything obviously wrong.
 I do note that Xen doesn't free the pirq until it has been unbound by
the guest.  Xen will warn if the guest unmaps a pirq that is still bound
("domD: forcing unbind of pirq P").  Is this what is happening?  If so,
that would suggest a bug in the guest rather than the hypervisor.

David

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 10:07                             ` [Xen-devel] " David Vrabel
@ 2013-05-21 13:40                               ` Konrad Rzeszutek Wilk
  2013-05-21 16:51                                 ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-21 13:40 UTC (permalink / raw)
  To: David Vrabel
  Cc: Stefano Stabellini, xen-devel, Feng Jin, Zhenzhong Duan,
	Yuval Shaia, linux-kernel, Chien Yen, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 2976 bytes --]

> Looking at the hypervisor code I couldn't see anything obviously wrong.

I think the culprit is "physdev_unmap_pirq":

   if ( is_hvm_domain(d) )                                                     
    {                                                                           
        spin_lock(&d->event_lock);                                              
        gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
            d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
            domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
            domain_pirq_to_irq(d, pirq));                                       
                                                                                
        if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
            ret = unmap_domain_pirq_emuirq(d, pirq);                            
        spin_unlock(&d->event_lock);                                            
        if ( domid == DOMID_SELF || ret )                                       
            goto free_domain;                                             

It always tells me unbound:

(XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
(XEN) irq.c:1873:d14 14, nr_pirqs: 56
(a bit older debug code, so the 'unbound' does not show up here).

Which means that the call to unmap_domain_pirq_emuirq does not happen.
The checks in unmap_domain_pirq_emuirq also look to be depend
on the code being IRQ_UNBOUND.

In other words, all of that code looks to only clear things when
they are !IRQ_UNBOUND.

But the other logic (IRQ_UNBOUND) looks to be missing a removal
in the radix tree:

  if ( emuirq != IRQ_PT )                                                     
        radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
                                                                        
And I think that is what is causing the leak - the radix tree
needs to be pruned? Or perhaps the allocate_pirq should check
the radix tree for IRQ_UNBOUND ones and re-use them?

>  I do note that Xen doesn't free the pirq until it has been unbound by
> the guest.  Xen will warn if the guest unmaps a pirq that is still bound
> ("domD: forcing unbind of pirq P").  Is this what is happening?  If so,

No. It does not b/c of this check in  physdev_unmap_pirq:

if ( domid == DOMID_SELF || ret )
   goto free_domain

which jumps over the call to unmap_domain_pirq.

> that would suggest a bug in the guest rather than the hypervisor.

No. But then I am not even using it. See attached module.

> 
> David

[-- Attachment #2: alloc_and_unmap.c --]
[-- Type: text/plain, Size: 1232 bytes --]

#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/pagemap.h>
#include <linux/init.h>
#include <xen/xen.h>
#include <xen/page.h>
#include <asm/xen/hypervisor.h>
#include <xen/features.h>
#include <xen/events.h>

MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
MODULE_DESCRIPTION("alloc_and_unmap");
MODULE_LICENSE("GPL");
MODULE_VERSION("0.1");

static int do_it(void)
{
	int rc;
	struct physdev_get_free_pirq op_get_free_pirq;
	struct physdev_unmap_pirq unmap_irq;
	int pirq;

	op_get_free_pirq.type = MAP_PIRQ_TYPE_MSI;
 	rc = HYPERVISOR_physdev_op(PHYSDEVOP_get_free_pirq, &op_get_free_pirq);
	if (rc) {
		printk(KERN_WARNING "%s:%d rc:%d\n", __func__, __LINE__, rc);
		return rc;
	}
	pirq = op_get_free_pirq.pirq;
	unmap_irq.pirq = pirq;
	unmap_irq.domid = DOMID_SELF;
	rc = HYPERVISOR_physdev_op(PHYSDEVOP_unmap_pirq, &unmap_irq);
	if (rc) {
		printk(KERN_WARNING "unmap irq failed %d\n", rc);
		return rc;
	}
	printk("PIRQ: %d\n", pirq);
	return 0;
}
static int __init alloc_and_unmap_init(void)
{
	int i;

	for (i = 0; i < 10; i++)
		if (do_it())
			break;
	return 0;
}
static void __exit alloc_and_unmap_exit(void)
{
}
module_init(alloc_and_unmap_init);
module_exit(alloc_and_unmap_exit);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 13:40                               ` Konrad Rzeszutek Wilk
@ 2013-05-21 16:51                                 ` Stefano Stabellini
  2013-05-21 20:42                                   ` Konrad Rzeszutek Wilk
  2013-05-29 17:50                                   ` Stefano Stabellini
  0 siblings, 2 replies; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-21 16:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, Stefano Stabellini, xen-devel, Feng Jin,
	Zhenzhong Duan, Yuval Shaia, linux-kernel, Chien Yen,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner

On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > Looking at the hypervisor code I couldn't see anything obviously wrong.
> 
> I think the culprit is "physdev_unmap_pirq":
> 
>    if ( is_hvm_domain(d) )                                                     
>     {                                                                           
>         spin_lock(&d->event_lock);                                              
>         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
>             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
>             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
>             domain_pirq_to_irq(d, pirq));                                       
>                                                                                 
>         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
>             ret = unmap_domain_pirq_emuirq(d, pirq);                            
>         spin_unlock(&d->event_lock);                                            
>         if ( domid == DOMID_SELF || ret )                                       
>             goto free_domain;                                             
> 
> It always tells me unbound:
> 
> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (a bit older debug code, so the 'unbound' does not show up here).
> 
> Which means that the call to unmap_domain_pirq_emuirq does not happen.
> The checks in unmap_domain_pirq_emuirq also look to be depend
> on the code being IRQ_UNBOUND.
> 
> In other words, all of that code looks to only clear things when
> they are !IRQ_UNBOUND.
> 
> But the other logic (IRQ_UNBOUND) looks to be missing a removal
> in the radix tree:
> 
>   if ( emuirq != IRQ_PT )                                                     
>         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
>                                                                         
> And I think that is what is causing the leak - the radix tree
> needs to be pruned? Or perhaps the allocate_pirq should check
> the radix tree for IRQ_UNBOUND ones and re-use them?

I think that you are looking in the wrong place.
The issue is that QEMU doesn't call pt_msi_disable in
pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).

The code above is correct as is because it is trying to handle emulated
IRQs and MSIs, not real passthrough MSIs. They latter are not added to
that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 16:51                                 ` Stefano Stabellini
@ 2013-05-21 20:42                                   ` Konrad Rzeszutek Wilk
  2013-05-21 21:50                                     ` Stefano Stabellini
  2013-05-29 17:50                                   ` Stefano Stabellini
  1 sibling, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-21 20:42 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: David Vrabel, xen-devel, Feng Jin, Zhenzhong Duan, Yuval Shaia,
	linux-kernel, Chien Yen, Ingo Molnar, H. Peter Anvin,
	Thomas Gleixner

On Tue, May 21, 2013 at 05:51:02PM +0100, Stefano Stabellini wrote:
> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > 
> > I think the culprit is "physdev_unmap_pirq":
> > 
> >    if ( is_hvm_domain(d) )                                                     
> >     {                                                                           
> >         spin_lock(&d->event_lock);                                              
> >         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
> >             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
> >             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
> >             domain_pirq_to_irq(d, pirq));                                       
> >                                                                                 
> >         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
> >             ret = unmap_domain_pirq_emuirq(d, pirq);                            
> >         spin_unlock(&d->event_lock);                                            
> >         if ( domid == DOMID_SELF || ret )                                       
> >             goto free_domain;                                             
> > 
> > It always tells me unbound:
> > 
> > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (a bit older debug code, so the 'unbound' does not show up here).
> > 
> > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > The checks in unmap_domain_pirq_emuirq also look to be depend
> > on the code being IRQ_UNBOUND.
> > 
> > In other words, all of that code looks to only clear things when
> > they are !IRQ_UNBOUND.
> > 
> > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > in the radix tree:
> > 
> >   if ( emuirq != IRQ_PT )                                                     
> >         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
> >                                                                         
> > And I think that is what is causing the leak - the radix tree
> > needs to be pruned? Or perhaps the allocate_pirq should check
> > the radix tree for IRQ_UNBOUND ones and re-use them?
> 
> I think that you are looking in the wrong place.
> The issue is that QEMU doesn't call pt_msi_disable in
> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).

In my test-case I am not even calling QEMU. I am just doing two hypercalls 
hypercall - get_free_pirq and unmap.
> 
> The code above is correct as is because it is trying to handle emulated
> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.

The bug is in the hypervisor. This little patch solves the test-case
(I hadn't tried to do the PCI passthrough yet)


diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index b0b0c65..b78717a 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1851,8 +1851,8 @@ static int pirq_guest_force_unbind(struct domain *d, struct pirq *pirq)
 static inline bool_t is_free_pirq(const struct domain *d,
                                   const struct pirq *pirq)
 {
-    return !pirq || (!pirq->arch.irq && (!is_hvm_domain(d) ||
-        pirq->arch.hvm.emuirq == IRQ_UNBOUND));
+    return !pirq || ((pirq->arch.irq == 0 || (pirq->arch.irq == PIRQ_ALLOCATED)) &&
+           (!is_hvm_domain(d) || pirq->arch.hvm.emuirq == IRQ_UNBOUND));
 }
 
 int get_free_pirq(struct domain *d, int type)


The reason is that pirq->arch.irq in PHYSDEVOP_get_free_pirq is set to
from the value of zero to -1 (PIRQ_ALLOCATED). Then in map_domain_pirq
we check it first:

904     old_irq = domain_pirq_to_irq(d, pirq);                  
.. snip..                    
1907     if ( (old_irq > 0 && (old_irq != irq) ) ||                                  

and since the 'old_irq' is -1 (or zero), and the irq passed in
is different, then all checks pass and the value is over-written:

 1988         set_domain_irq_pirq(d, irq, info);                                      

And that is it.


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 20:42                                   ` Konrad Rzeszutek Wilk
@ 2013-05-21 21:50                                     ` Stefano Stabellini
  2013-05-21 22:41                                       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-21 21:50 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Stefano Stabellini, David Vrabel, xen-devel, Feng Jin,
	Zhenzhong Duan, Yuval Shaia, linux-kernel, Chien Yen,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner

On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> On Tue, May 21, 2013 at 05:51:02PM +0100, Stefano Stabellini wrote:
> > On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > > 
> > > I think the culprit is "physdev_unmap_pirq":
> > > 
> > >    if ( is_hvm_domain(d) )                                                     
> > >     {                                                                           
> > >         spin_lock(&d->event_lock);                                              
> > >         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
> > >             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
> > >             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
> > >             domain_pirq_to_irq(d, pirq));                                       
> > >                                                                                 
> > >         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
> > >             ret = unmap_domain_pirq_emuirq(d, pirq);                            
> > >         spin_unlock(&d->event_lock);                                            
> > >         if ( domid == DOMID_SELF || ret )                                       
> > >             goto free_domain;                                             
> > > 
> > > It always tells me unbound:
> > > 
> > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (a bit older debug code, so the 'unbound' does not show up here).
> > > 
> > > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > on the code being IRQ_UNBOUND.
> > > 
> > > In other words, all of that code looks to only clear things when
> > > they are !IRQ_UNBOUND.
> > > 
> > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > in the radix tree:
> > > 
> > >   if ( emuirq != IRQ_PT )                                                     
> > >         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
> > >                                                                         
> > > And I think that is what is causing the leak - the radix tree
> > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > 
> > I think that you are looking in the wrong place.
> > The issue is that QEMU doesn't call pt_msi_disable in
> > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> 
> In my test-case I am not even calling QEMU. I am just doing two hypercalls 
> hypercall - get_free_pirq and unmap.
> > 
> > The code above is correct as is because it is trying to handle emulated
> > IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
> 
> The bug is in the hypervisor. This little patch solves the test-case
> (I hadn't tried to do the PCI passthrough yet)
> 
> 
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index b0b0c65..b78717a 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1851,8 +1851,8 @@ static int pirq_guest_force_unbind(struct domain *d, struct pirq *pirq)
>  static inline bool_t is_free_pirq(const struct domain *d,
>                                    const struct pirq *pirq)
>  {
> -    return !pirq || (!pirq->arch.irq && (!is_hvm_domain(d) ||
> -        pirq->arch.hvm.emuirq == IRQ_UNBOUND));
> +    return !pirq || ((pirq->arch.irq == 0 || (pirq->arch.irq == PIRQ_ALLOCATED)) &&
> +           (!is_hvm_domain(d) || pirq->arch.hvm.emuirq == IRQ_UNBOUND));
>  }
>  
>  int get_free_pirq(struct domain *d, int type)
> 
> 
> The reason is that pirq->arch.irq in PHYSDEVOP_get_free_pirq is set to
> from the value of zero to -1 (PIRQ_ALLOCATED). Then in map_domain_pirq
> we check it first:
> 
> 904     old_irq = domain_pirq_to_irq(d, pirq);                  
> .. snip..                    
> 1907     if ( (old_irq > 0 && (old_irq != irq) ) ||                                  
> 
> and since the 'old_irq' is -1 (or zero), and the irq passed in
> is different, then all checks pass and the value is over-written:
> 
>  1988         set_domain_irq_pirq(d, irq, info);                                      
> 
> And that is it.


We have to be careful about this: the point of PHYSDEVOP_get_free_pirq is
that Linux can know for sure the pirq that is going to be used to map the
MSI by QEMU. If you modify is_free_pirq that way, suddenly the pirq
could be allocated for something else after Linux called
PHYSDEVOP_get_free_pirq and before QEMU called xc_physdev_map_pirq_msi.

Right now the unmap is supposed to be done by QEMU, not Linux. So I
think that it is "normal" (although counterintuitive) that your little
test works that way.

pirq allocated via PHYSDEVOP_get_free_pirq should be passed to QEMU,
mapped by QEMU, unmapped by QEMU and eventually freed by QEMU.

This is not the bestest interface ever written of course but that's how
it works now.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 21:50                                     ` Stefano Stabellini
@ 2013-05-21 22:41                                       ` Konrad Rzeszutek Wilk
  2013-05-22  9:37                                         ` Jan Beulich
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-21 22:41 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: David Vrabel, xen-devel, Feng Jin, Zhenzhong Duan, Yuval Shaia,
	linux-kernel, Chien Yen, Ingo Molnar, H. Peter Anvin,
	Thomas Gleixner

On Tue, May 21, 2013 at 10:50:09PM +0100, Stefano Stabellini wrote:
> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > On Tue, May 21, 2013 at 05:51:02PM +0100, Stefano Stabellini wrote:
> > > On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > > > 
> > > > I think the culprit is "physdev_unmap_pirq":
> > > > 
> > > >    if ( is_hvm_domain(d) )                                                     
> > > >     {                                                                           
> > > >         spin_lock(&d->event_lock);                                              
> > > >         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
> > > >             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
> > > >             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
> > > >             domain_pirq_to_irq(d, pirq));                                       
> > > >                                                                                 
> > > >         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
> > > >             ret = unmap_domain_pirq_emuirq(d, pirq);                            
> > > >         spin_unlock(&d->event_lock);                                            
> > > >         if ( domid == DOMID_SELF || ret )                                       
> > > >             goto free_domain;                                             
> > > > 
> > > > It always tells me unbound:
> > > > 
> > > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > (a bit older debug code, so the 'unbound' does not show up here).
> > > > 
> > > > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > > on the code being IRQ_UNBOUND.
> > > > 
> > > > In other words, all of that code looks to only clear things when
> > > > they are !IRQ_UNBOUND.
> > > > 
> > > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > > in the radix tree:
> > > > 
> > > >   if ( emuirq != IRQ_PT )                                                     
> > > >         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
> > > >                                                                         
> > > > And I think that is what is causing the leak - the radix tree
> > > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > > 
> > > I think that you are looking in the wrong place.
> > > The issue is that QEMU doesn't call pt_msi_disable in
> > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > 
> > In my test-case I am not even calling QEMU. I am just doing two hypercalls 
> > hypercall - get_free_pirq and unmap.
> > > 
> > > The code above is correct as is because it is trying to handle emulated
> > > IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> > > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
> > 
> > The bug is in the hypervisor. This little patch solves the test-case
> > (I hadn't tried to do the PCI passthrough yet)
> > 
> > 
> > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > index b0b0c65..b78717a 100644
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -1851,8 +1851,8 @@ static int pirq_guest_force_unbind(struct domain *d, struct pirq *pirq)
> >  static inline bool_t is_free_pirq(const struct domain *d,
> >                                    const struct pirq *pirq)
> >  {
> > -    return !pirq || (!pirq->arch.irq && (!is_hvm_domain(d) ||
> > -        pirq->arch.hvm.emuirq == IRQ_UNBOUND));
> > +    return !pirq || ((pirq->arch.irq == 0 || (pirq->arch.irq == PIRQ_ALLOCATED)) &&
> > +           (!is_hvm_domain(d) || pirq->arch.hvm.emuirq == IRQ_UNBOUND));
> >  }
> >  
> >  int get_free_pirq(struct domain *d, int type)

To be fair, this diff is just to demonstrate that the pirq->arch.irq is
the one that seems to gate things. I am not suggesting that this is the
final patch - just the 'aha, this is what is happening!'.
> > 
> > 
> > The reason is that pirq->arch.irq in PHYSDEVOP_get_free_pirq is set to
> > from the value of zero to -1 (PIRQ_ALLOCATED). Then in map_domain_pirq
> > we check it first:
> > 
> > 904     old_irq = domain_pirq_to_irq(d, pirq);                  
> > .. snip..                    
> > 1907     if ( (old_irq > 0 && (old_irq != irq) ) ||                                  
> > 
> > and since the 'old_irq' is -1 (or zero), and the irq passed in
> > is different, then all checks pass and the value is over-written:
> > 
> >  1988         set_domain_irq_pirq(d, irq, info);                                      
> > 
> > And that is it.
> 
> 
> We have to be careful about this: the point of PHYSDEVOP_get_free_pirq is
> that Linux can know for sure the pirq that is going to be used to map the
> MSI by QEMU. If you modify is_free_pirq that way, suddenly the pirq
> could be allocated for something else after Linux called
> PHYSDEVOP_get_free_pirq and before QEMU called xc_physdev_map_pirq_msi.

Yes. And I think the 'is_free_pirq' modification above is incorrect.

I think the fix should be in the unmap_pirq code (hypervisor) to check
if the arch.irq is either zero or PIRQ_ALLOCATED. Right now it only
checks for zero.

Then as you say there is also xc_physdev_map_pirq_msi, but the call
chain looks to be INTx-MSIx. The pt_msi_update is the call that is
used when guest writes the PIRQ (xc_domain_update_msi_irq -> XEN_DOMCTL_bind_pt_irq)

And that looks to be it. The pt_msi_update can be called multiple times if
the guest decides to use a different PIRQ.

> 
> Right now the unmap is supposed to be done by QEMU, not Linux. So I
> think that it is "normal" (although counterintuitive) that your little
> test works that way.

Yes, the test-case is flawed.
> 
> pirq allocated via PHYSDEVOP_get_free_pirq should be passed to QEMU,
> mapped by QEMU, unmapped by QEMU and eventually freed by QEMU.
> 
> This is not the bestest interface ever written of course but that's how
> it works now.

That is kindly said :-)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 22:41                                       ` Konrad Rzeszutek Wilk
@ 2013-05-22  9:37                                         ` Jan Beulich
  2013-05-22 15:14                                           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Jan Beulich @ 2013-05-22  9:37 UTC (permalink / raw)
  To: Stefano Stabellini, Konrad Rzeszutek Wilk
  Cc: David Vrabel, Thomas Gleixner, xen-devel, Chien Yen, Feng Jin,
	Yuval Shaia, Zhenzhong Duan, Ingo Molnar, linux-kernel,
	H. Peter Anvin

>>> On 22.05.13 at 00:41, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Tue, May 21, 2013 at 10:50:09PM +0100, Stefano Stabellini wrote:
>> We have to be careful about this: the point of PHYSDEVOP_get_free_pirq is
>> that Linux can know for sure the pirq that is going to be used to map the
>> MSI by QEMU. If you modify is_free_pirq that way, suddenly the pirq
>> could be allocated for something else after Linux called
>> PHYSDEVOP_get_free_pirq and before QEMU called xc_physdev_map_pirq_msi.
> 
> Yes. And I think the 'is_free_pirq' modification above is incorrect.
> 
> I think the fix should be in the unmap_pirq code (hypervisor) to check
> if the arch.irq is either zero or PIRQ_ALLOCATED. Right now it only
> checks for zero.

Which check are you talking about? Looking at physdev_unmap_pirq()
I see none at all, unmap_domain_pirq() has a <= 0 check, and
unmap_domain_pirq_emuirq() again doesn't appear to have any.

If you're talking about unmap_domain_pirq(), then you'll need to
be careful: A negative value here doesn't necessarily mean
PIRQ_ALLOCATED, but could also come from another run that
found it necessary to force the unbind. Hence the definition of
PIRQ_ALLOCATED would then collide with the (unlikely?) case of
IRQ1 having got assigned to a guest. To be on the safe side, we
should therefore redefine PIRQ_ALLOCATED to say INT_MIN.

Jan


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-22  9:37                                         ` Jan Beulich
@ 2013-05-22 15:14                                           ` Konrad Rzeszutek Wilk
  2013-05-22 15:25                                             ` Jan Beulich
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-22 15:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, David Vrabel, Thomas Gleixner, xen-devel,
	Chien Yen, Feng Jin, Yuval Shaia, Zhenzhong Duan, Ingo Molnar,
	linux-kernel, H. Peter Anvin

On Wed, May 22, 2013 at 10:37:39AM +0100, Jan Beulich wrote:
> >>> On 22.05.13 at 00:41, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > On Tue, May 21, 2013 at 10:50:09PM +0100, Stefano Stabellini wrote:
> >> We have to be careful about this: the point of PHYSDEVOP_get_free_pirq is
> >> that Linux can know for sure the pirq that is going to be used to map the
> >> MSI by QEMU. If you modify is_free_pirq that way, suddenly the pirq
> >> could be allocated for something else after Linux called
> >> PHYSDEVOP_get_free_pirq and before QEMU called xc_physdev_map_pirq_msi.
> > 
> > Yes. And I think the 'is_free_pirq' modification above is incorrect.
> > 
> > I think the fix should be in the unmap_pirq code (hypervisor) to check
> > if the arch.irq is either zero or PIRQ_ALLOCATED. Right now it only
> > checks for zero.
> 
> Which check are you talking about? Looking at physdev_unmap_pirq()

Sorry about being so haphazard here. I am still digging in the code
and trying to get a sense of how QEMU and hypervisor are suppose to
dance together.

The check was on the PHYSDEVOP_get_free_pirq, which calls get_free_pirq
and uses the is_free_pirq check. After the get_free_pirq call, the logic
in PHYSDEVOP_get_free_pirq sets info->arch.pirq = PIRQ_ALLOCATED to
protect itself from giving the same PIRQ twice.

The physdev_unmap_pirq (from PHYSDEVOP_unmap_pirq), only has this
check:
 if (domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND)

and since the arch.hvm.emuirq is IRQ_UNBOUND (-1), it does not
call unmap_domain_pirq_emuirq. It probably shouldn't, but it should
at least remove the info->arch.pirq = PIRQ_ALLOCATED as we are
telling the hypervisor: "hey, I am done with this, return to the
pool." But since that is not cleared, the PHYSDEVOP_get_free_pirq
will skip this pirq as arch.pirq is still set to PIRQ_ALLOCATED.

> I see none at all, unmap_domain_pirq() has a <= 0 check, and
> unmap_domain_pirq_emuirq() again doesn't appear to have any.

The 'unmap_domain_pirq' path would be if dom0 (so QEMU) did the
unmap for the guest. That is via the PHYSDEVOP_unmap_pirq. And
I think if that path was taken (as Stefano suggests QEMU should
do when a MSI or MSI-X driver is unloaded and zero is writen as
an PIRQ), we would end up calling clear_domain_irq_pirq, which
would set arch.pirq = 0.

Or to a negative value as you pointed out later. Which then
means we won't be ever able to re-use the PIRQ (as
PHYSDEVOP_get_free_pirq or rather get_free_pirq would skip over it
as arch.pirq != 0).
> 
> If you're talking about unmap_domain_pirq(), then you'll need to
> be careful: A negative value here doesn't necessarily mean
> PIRQ_ALLOCATED, but could also come from another run that
> found it necessary to force the unbind. Hence the definition of
> PIRQ_ALLOCATED would then collide with the (unlikely?) case of
> IRQ1 having got assigned to a guest. To be on the safe side, we
> should therefore redefine PIRQ_ALLOCATED to say INT_MIN.

You are right about being cautious - this is a bit of complex
code interaction between Xen, QEMU, and Linux kernel.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-22 15:14                                           ` Konrad Rzeszutek Wilk
@ 2013-05-22 15:25                                             ` Jan Beulich
  2013-05-22 16:41                                               ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 58+ messages in thread
From: Jan Beulich @ 2013-05-22 15:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, Stefano Stabellini, Thomas Gleixner, xen-devel,
	Chien Yen, Feng Jin, Yuval Shaia, Zhenzhong Duan, Ingo Molnar,
	linux-kernel, H. Peter Anvin

>>> On 22.05.13 at 17:14, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> The physdev_unmap_pirq (from PHYSDEVOP_unmap_pirq), only has this
> check:
>  if (domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND)
> 
> and since the arch.hvm.emuirq is IRQ_UNBOUND (-1), it does not
> call unmap_domain_pirq_emuirq. It probably shouldn't, but it should
> at least remove the info->arch.pirq = PIRQ_ALLOCATED as we are
> telling the hypervisor: "hey, I am done with this, return to the
> pool." But since that is not cleared, the PHYSDEVOP_get_free_pirq
> will skip this pirq as arch.pirq is still set to PIRQ_ALLOCATED.

Okay, that clarifies it quite a bit. For one, I'll leave any of the
emuirq stuff to Stefano, who wrote this originally. And then, from
the beginning of this thread, I'm not convinced that freeing a pirq
is really the right thing here: unmap_pirq() is the counterpart of
map_pirq(), not get_free_pirq(). I would think that is a guest
allocates a pirq and then unmaps it without first mapping it, it's
the guest's fault that it now lost one pirq resource. It should not
have allocated one in the first place if it didn't mean to use it for
anything.

>> I see none at all, unmap_domain_pirq() has a <= 0 check, and
>> unmap_domain_pirq_emuirq() again doesn't appear to have any.
> 
> The 'unmap_domain_pirq' path would be if dom0 (so QEMU) did the
> unmap for the guest. That is via the PHYSDEVOP_unmap_pirq. And
> I think if that path was taken (as Stefano suggests QEMU should
> do when a MSI or MSI-X driver is unloaded and zero is writen as
> an PIRQ), we would end up calling clear_domain_irq_pirq, which
> would set arch.pirq = 0.
> 
> Or to a negative value as you pointed out later. Which then
> means we won't be ever able to re-use the PIRQ (as
> PHYSDEVOP_get_free_pirq or rather get_free_pirq would skip over it
> as arch.pirq != 0).

That setting to a negative value is not causing the slot to be
permanently lost, it merely defers its freeing. It was the only
way I could find back then to reasonably handle an unmap
being done before the matched unbind.

Jan


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-22 15:25                                             ` Jan Beulich
@ 2013-05-22 16:41                                               ` Konrad Rzeszutek Wilk
  2013-05-23  6:31                                                 ` Jan Beulich
  0 siblings, 1 reply; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-22 16:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: David Vrabel, Stefano Stabellini, Thomas Gleixner, xen-devel,
	Chien Yen, Feng Jin, Yuval Shaia, Zhenzhong Duan, Ingo Molnar,
	linux-kernel, H. Peter Anvin

On Wed, May 22, 2013 at 04:25:10PM +0100, Jan Beulich wrote:
> >>> On 22.05.13 at 17:14, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > The physdev_unmap_pirq (from PHYSDEVOP_unmap_pirq), only has this
> > check:
> >  if (domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND)
> > 
> > and since the arch.hvm.emuirq is IRQ_UNBOUND (-1), it does not
> > call unmap_domain_pirq_emuirq. It probably shouldn't, but it should
> > at least remove the info->arch.pirq = PIRQ_ALLOCATED as we are
> > telling the hypervisor: "hey, I am done with this, return to the
> > pool." But since that is not cleared, the PHYSDEVOP_get_free_pirq
> > will skip this pirq as arch.pirq is still set to PIRQ_ALLOCATED.
> 
> Okay, that clarifies it quite a bit. For one, I'll leave any of the
> emuirq stuff to Stefano, who wrote this originally. And then, from
> the beginning of this thread, I'm not convinced that freeing a pirq
> is really the right thing here: unmap_pirq() is the counterpart of
> map_pirq(), not get_free_pirq(). I would think that is a guest
> allocates a pirq and then unmaps it without first mapping it, it's
> the guest's fault that it now lost one pirq resource. It should not
> have allocated one in the first place if it didn't mean to use it for
> anything.

It does use it, but if you do run this in a loop:
  rmmod e1000e;modprobe e1000e

it ends up doing thse three hypercalls: PHYSDEVOP_get_free_pirq,
PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq and so on. The reason is that
drivers/xen/events.c keeps track of the Linux IRQ <-> PIRQ just as long
as needed - if the driver does a free_irq, well, then the mapping is
de-allocated and lost.

One patch I posted (for Linux) keeps track of the PIRQ so that if
free_irq is called and we remove the Linux IRQ <-> PIRQ association,
we still have the PIRQ saved away and can re-use it.

In other words, the loop ends up doing:
 PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq

> 
> >> I see none at all, unmap_domain_pirq() has a <= 0 check, and
> >> unmap_domain_pirq_emuirq() again doesn't appear to have any.
> > 
> > The 'unmap_domain_pirq' path would be if dom0 (so QEMU) did the
> > unmap for the guest. That is via the PHYSDEVOP_unmap_pirq. And
> > I think if that path was taken (as Stefano suggests QEMU should
> > do when a MSI or MSI-X driver is unloaded and zero is writen as
> > an PIRQ), we would end up calling clear_domain_irq_pirq, which
> > would set arch.pirq = 0.
> > 
> > Or to a negative value as you pointed out later. Which then
> > means we won't be ever able to re-use the PIRQ (as
> > PHYSDEVOP_get_free_pirq or rather get_free_pirq would skip over it
> > as arch.pirq != 0).
> 
> That setting to a negative value is not causing the slot to be
> permanently lost, it merely defers its freeing. It was the only
> way I could find back then to reasonably handle an unmap
> being done before the matched unbind.

Ah, so pt_irq_destroy_bind (XEN_DOMCTL_unbind_pt_irq) is the counterpart
to PHYSDEVOP_get_free_pirq in some form. Which looks to be on QEMU side
only called when the PCI device is put in sleep or pulled out of the guest.

It probably shouldn't be called when the device is merely de-activated.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-22 16:41                                               ` Konrad Rzeszutek Wilk
@ 2013-05-23  6:31                                                 ` Jan Beulich
  0 siblings, 0 replies; 58+ messages in thread
From: Jan Beulich @ 2013-05-23  6:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, Stefano Stabellini, Thomas Gleixner, xen-devel,
	Chien Yen, Feng Jin, Yuval Shaia, Zhenzhong Duan, Ingo Molnar,
	linux-kernel, H. Peter Anvin

>>> On 22.05.13 at 18:41, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Wed, May 22, 2013 at 04:25:10PM +0100, Jan Beulich wrote:
>> Okay, that clarifies it quite a bit. For one, I'll leave any of the
>> emuirq stuff to Stefano, who wrote this originally. And then, from
>> the beginning of this thread, I'm not convinced that freeing a pirq
>> is really the right thing here: unmap_pirq() is the counterpart of
>> map_pirq(), not get_free_pirq(). I would think that is a guest
>> allocates a pirq and then unmaps it without first mapping it, it's
>> the guest's fault that it now lost one pirq resource. It should not
>> have allocated one in the first place if it didn't mean to use it for
>> anything.
> 
> It does use it, but if you do run this in a loop:
>   rmmod e1000e;modprobe e1000e
> 
> it ends up doing thse three hypercalls: PHYSDEVOP_get_free_pirq,
> PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq and so on.

Yeah, my comment was more towards your simplified repro
module...

> The reason is that
> drivers/xen/events.c keeps track of the Linux IRQ <-> PIRQ just as long
> as needed - if the driver does a free_irq, well, then the mapping is
> de-allocated and lost.
> 
> One patch I posted (for Linux) keeps track of the PIRQ so that if
> free_irq is called and we remove the Linux IRQ <-> PIRQ association,
> we still have the PIRQ saved away and can re-use it.
> 
> In other words, the loop ends up doing:
>  PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq

And that's the right solution here imo. Remember that
get_free_pirq was introduced only for the pv-ops kernel, the
classic one never needed it (and hence couldn't leak pIRQ-s).

>> That setting to a negative value is not causing the slot to be
>> permanently lost, it merely defers its freeing. It was the only
>> way I could find back then to reasonably handle an unmap
>> being done before the matched unbind.
> 
> Ah, so pt_irq_destroy_bind (XEN_DOMCTL_unbind_pt_irq) is the counterpart
> to PHYSDEVOP_get_free_pirq in some form. Which looks to be on QEMU side
> only called when the PCI device is put in sleep or pulled out of the guest.
> 
> It probably shouldn't be called when the device is merely de-activated.

Right.

Jan


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-21 16:51                                 ` Stefano Stabellini
  2013-05-21 20:42                                   ` Konrad Rzeszutek Wilk
@ 2013-05-29 17:50                                   ` Stefano Stabellini
  2013-05-30 17:48                                     ` Konrad Rzeszutek Wilk
  2013-06-05  5:27                                     ` Zhenzhong Duan
  1 sibling, 2 replies; 58+ messages in thread
From: Stefano Stabellini @ 2013-05-29 17:50 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Zhenzhong Duan, Yuval Shaia, linux-kernel, Chien Yen,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner

On Tue, 21 May 2013, Stefano Stabellini wrote:
> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > 
> > I think the culprit is "physdev_unmap_pirq":
> > 
> >    if ( is_hvm_domain(d) )                                                     
> >     {                                                                           
> >         spin_lock(&d->event_lock);                                              
> >         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
> >             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
> >             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
> >             domain_pirq_to_irq(d, pirq));                                       
> >                                                                                 
> >         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
> >             ret = unmap_domain_pirq_emuirq(d, pirq);                            
> >         spin_unlock(&d->event_lock);                                            
> >         if ( domid == DOMID_SELF || ret )                                       
> >             goto free_domain;                                             
> > 
> > It always tells me unbound:
> > 
> > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > (a bit older debug code, so the 'unbound' does not show up here).
> > 
> > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > The checks in unmap_domain_pirq_emuirq also look to be depend
> > on the code being IRQ_UNBOUND.
> > 
> > In other words, all of that code looks to only clear things when
> > they are !IRQ_UNBOUND.
> > 
> > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > in the radix tree:
> > 
> >   if ( emuirq != IRQ_PT )                                                     
> >         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
> >                                                                         
> > And I think that is what is causing the leak - the radix tree
> > needs to be pruned? Or perhaps the allocate_pirq should check
> > the radix tree for IRQ_UNBOUND ones and re-use them?
> 
> I think that you are looking in the wrong place.
> The issue is that QEMU doesn't call pt_msi_disable in
> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> 
> The code above is correct as is because it is trying to handle emulated
> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.


This patch fixes the issue, I have only tested MSI (MSI-X completely
untested).


diff --git a/hw/pass-through.c b/hw/pass-through.c
index 304c438..079e465 100644
--- a/hw/pass-through.c
+++ b/hw/pass-through.c
@@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev *ptdev,
         ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
     }
     else
-        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
+    {
+        if (ptdev->msi->flags & PT_MSI_MAPPED) {
+            pt_msi_disable(ptdev);
+        }
+    }
 
     /* pass through MSI_ENABLE bit when no MSI-INTx translation */
     if (!ptdev->msi_trans_en) {
@@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev *ptdev,
             pt_disable_msi_translate(ptdev);
         }
         pt_msix_update(ptdev);
+    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
+        pt_msix_delete(ptdev);
     }
 
     ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
diff --git a/hw/pt-msi.c b/hw/pt-msi.c
index b03b989..65fa7d6 100644
--- a/hw/pt-msi.c
+++ b/hw/pt-msi.c
@@ -213,7 +213,8 @@ void pt_msi_disable(struct pt_dev *dev)
 
 out:
     /* clear msi info */
-    dev->msi->flags &= ~(MSI_FLAG_UNINIT | PT_MSI_MAPPED | PCI_MSI_FLAGS_ENABLE);
+    dev->msi->flags &= ~(PT_MSI_MAPPED | PCI_MSI_FLAGS_ENABLE);
+    dev->msi->flags |= MSI_FLAG_UNINIT;
     dev->msi->pirq = -1;
     dev->msi_trans_en = 0;
 }

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-29 17:50                                   ` Stefano Stabellini
@ 2013-05-30 17:48                                     ` Konrad Rzeszutek Wilk
  2013-06-05  5:27                                     ` Zhenzhong Duan
  1 sibling, 0 replies; 58+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-30 17:48 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: David Vrabel, xen-devel, Feng Jin, Zhenzhong Duan, Yuval Shaia,
	linux-kernel, Chien Yen, Ingo Molnar, H. Peter Anvin,
	Thomas Gleixner

On Wed, May 29, 2013 at 06:50:41PM +0100, Stefano Stabellini wrote:
> On Tue, 21 May 2013, Stefano Stabellini wrote:
> > On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > > 
> > > I think the culprit is "physdev_unmap_pirq":
> > > 
> > >    if ( is_hvm_domain(d) )                                                     
> > >     {                                                                           
> > >         spin_lock(&d->event_lock);                                              
> > >         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
> > >             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
> > >             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
> > >             domain_pirq_to_irq(d, pirq));                                       
> > >                                                                                 
> > >         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
> > >             ret = unmap_domain_pirq_emuirq(d, pirq);                            
> > >         spin_unlock(&d->event_lock);                                            
> > >         if ( domid == DOMID_SELF || ret )                                       
> > >             goto free_domain;                                             
> > > 
> > > It always tells me unbound:
> > > 
> > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (a bit older debug code, so the 'unbound' does not show up here).
> > > 
> > > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > on the code being IRQ_UNBOUND.
> > > 
> > > In other words, all of that code looks to only clear things when
> > > they are !IRQ_UNBOUND.
> > > 
> > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > in the radix tree:
> > > 
> > >   if ( emuirq != IRQ_PT )                                                     
> > >         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
> > >                                                                         
> > > And I think that is what is causing the leak - the radix tree
> > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > 
> > I think that you are looking in the wrong place.
> > The issue is that QEMU doesn't call pt_msi_disable in
> > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > 
> > The code above is correct as is because it is trying to handle emulated
> > IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
> 
> 
> This patch fixes the issue, I have only tested MSI (MSI-X completely
> untested).

I tested it on my NIC which has MSI and it worked nicely. The other box
with MSI-X is giving me a bit of trouble so will have to wait.

Duan,

Could you - when you get a moment - also try this with the PCI device
that was having issues please?
> 
> 
> diff --git a/hw/pass-through.c b/hw/pass-through.c
> index 304c438..079e465 100644
> --- a/hw/pass-through.c
> +++ b/hw/pass-through.c
> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev *ptdev,
>          ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>      }
>      else
> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> +    {
> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> +            pt_msi_disable(ptdev);
> +        }
> +    }
>  
>      /* pass through MSI_ENABLE bit when no MSI-INTx translation */
>      if (!ptdev->msi_trans_en) {
> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev *ptdev,
>              pt_disable_msi_translate(ptdev);
>          }
>          pt_msix_update(ptdev);
> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
> +        pt_msix_delete(ptdev);
>      }
>  
>      ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index b03b989..65fa7d6 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -213,7 +213,8 @@ void pt_msi_disable(struct pt_dev *dev)
>  
>  out:
>      /* clear msi info */
> -    dev->msi->flags &= ~(MSI_FLAG_UNINIT | PT_MSI_MAPPED | PCI_MSI_FLAGS_ENABLE);
> +    dev->msi->flags &= ~(PT_MSI_MAPPED | PCI_MSI_FLAGS_ENABLE);
> +    dev->msi->flags |= MSI_FLAG_UNINIT;
>      dev->msi->pirq = -1;
>      dev->msi_trans_en = 0;
>  }

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-05-29 17:50                                   ` Stefano Stabellini
  2013-05-30 17:48                                     ` Konrad Rzeszutek Wilk
@ 2013-06-05  5:27                                     ` Zhenzhong Duan
  2013-06-05 12:50                                       ` [Xen-devel] " Stefano Stabellini
  1 sibling, 1 reply; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-05  5:27 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Konrad Rzeszutek Wilk, Feng Jin, linux-kernel,
	Yuval Shaia, Chien Yen, Ingo Molnar, David Vrabel,
	H. Peter Anvin, Thomas Gleixner


[-- Attachment #1.1: Type: text/plain, Size: 7385 bytes --]

Stefano Stabellini wrote:
> On Tue, 21 May 2013, Stefano Stabellini wrote:
>    
>> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>      
>>>> Looking at the hypervisor code I couldn't see anything obviously wrong.
>>>>          
>>> I think the culprit is "physdev_unmap_pirq":
>>>
>>>     if ( is_hvm_domain(d) )
>>>      {
>>>          spin_lock(&d->event_lock);
>>>          gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
>>>              d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>              domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",
>>>              domain_pirq_to_irq(d, pirq));
>>>                                                                                  
>>>          if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>              ret = unmap_domain_pirq_emuirq(d, pirq);
>>>          spin_unlock(&d->event_lock);
>>>          if ( domid == DOMID_SELF || ret )
>>>              goto free_domain;
>>>
>>> It always tells me unbound:
>>>
>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>
>>> Which means that the call to unmap_domain_pirq_emuirq does not happen.
>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>> on the code being IRQ_UNBOUND.
>>>
>>> In other words, all of that code looks to only clear things when
>>> they are !IRQ_UNBOUND.
>>>
>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>> in the radix tree:
>>>
>>>    if ( emuirq != IRQ_PT )
>>>          radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
>>>                                                                          
>>> And I think that is what is causing the leak - the radix tree
>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>        
>> I think that you are looking in the wrong place.
>> The issue is that QEMU doesn't call pt_msi_disable in
>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>
>> The code above is correct as is because it is trying to handle emulated
>> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>      
>
>
> This patch fixes the issue, I have only tested MSI (MSI-X completely
> untested).
>
>
> diff --git a/hw/pass-through.c b/hw/pass-through.c
> index 304c438..079e465 100644
> --- a/hw/pass-through.c
> +++ b/hw/pass-through.c
> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev *ptdev,
>           ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>       }
>       else
> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> +    {
> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> +            pt_msi_disable(ptdev);
> +        }
> +    }
>   
>       /* pass through MSI_ENABLE bit when no MSI-INTx translation */
>       if (!ptdev->msi_trans_en) {
> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev *ptdev,
>               pt_disable_msi_translate(ptdev);
>           }
>           pt_msix_update(ptdev);
> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
> +        pt_msix_delete(ptdev);
>    
Hi Stefano,
I made a test with this patch, os reboot when driver reload. If use 
pt_msix_disable instead of pt_msix_delete, driver could be reloaded.
But I still see some error in qemu.log and xen console. Seems four IRQs 
are not freed when unmap.
--------------first load---------------------------
pt_msix_update_one: pt_msix_update_one requested pirq = 103
pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 102
pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 101
pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 100
pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
------------- first unload---------------------------
pt_msix_disable: Unbind msix with pirq 67, gvec 0
pt_msix_disable: Unmap msix with pirq 67
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
pt_msix_disable: Unbind msix with pirq 66, gvec 0
pt_msix_disable: Unmap msix with pirq 66
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
pt_msix_disable: Unbind msix with pirq 65, gvec 0
pt_msix_disable: Unmap msix with pirq 65
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
pt_msix_disable: Unbind msix with pirq 64, gvec 0
pt_msix_disable: Unmap msix with pirq 64
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
--------------second load---------------------------
pt_msix_update_one: pt_msix_update_one requested pirq = 99
pt_msix_update_one: Update msix entry 0 with pirq 63 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 98
pt_msix_update_one: Update msix entry 1 with pirq 62 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 97
pt_msix_update_one: Update msix entry 2 with pirq 61 gvec 0
pt_msix_update_one: pt_msix_update_one requested pirq = 96
pt_msix_update_one: Update msix entry 3 with pirq 60 gvec 0

xm debug-keys i
(XEN)    IRQ: 222 affinity:00000000,00000000,00000000,00000000,00002000 
vec:a9 type=PCI-MSI         status=00000042 mapped, unbound
(XEN)    IRQ: 223 affinity:00000000,00000000,00000000,00000000,00002000 
vec:c1 type=PCI-MSI         status=00000042 mapped, unbound
(XEN)    IRQ: 224 affinity:00000000,00000000,00000000,00000000,00002000 
vec:22 type=PCI-MSI         status=00000042 mapped, unbound
(XEN)    IRQ: 225 affinity:00000000,00000000,00000000,00000000,00000001 
vec:33 type=PCI-MSI         status=00000002 mapped, unbound
(XEN)    IRQ: 226 affinity:00000000,00000000,00000000,00000000,00000400 
vec:b2 type=PCI-MSI         status=00000010 in-flight=0 domain-list=7: 
99(----),
(XEN)    IRQ: 227 affinity:00000000,00000000,00000000,00000000,00000001 
vec:63 type=PCI-MSI         status=00000050 in-flight=0 domain-list=7: 
98(----),
(XEN)    IRQ: 228 affinity:00000000,00000000,00000000,00000000,00000001 
vec:6b type=PCI-MSI         status=00000050 in-flight=0 domain-list=7: 
97(----),
(XEN)    IRQ: 229 affinity:00000000,00000000,00000000,00000000,00000004 
vec:8c type=PCI-MSI         status=00000051 in-flight=0 domain-list=7: 
96(----),

>       }
>   
>       ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index b03b989..65fa7d6 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -213,7 +213,8 @@ void pt_msi_disable(struct pt_dev *dev)
>   
>   out:
>       /* clear msi info */
> -    dev->msi->flags &= ~(MSI_FLAG_UNINIT | PT_MSI_MAPPED | PCI_MSI_FLAGS_ENABLE);
> +    dev->msi->flags &= ~(PT_MSI_MAPPED | PCI_MSI_FLAGS_ENABLE);
> +    dev->msi->flags |= MSI_FLAG_UNINIT;
>       dev->msi->pirq = -1;
>       dev->msi_trans_en = 0;
>   }
>    

[-- Attachment #1.2: Type: text/html, Size: 9326 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-05  5:27                                     ` Zhenzhong Duan
@ 2013-06-05 12:50                                       ` Stefano Stabellini
  2013-06-20  2:57                                         ` Zhenzhong Duan
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-05 12:50 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner

On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> Stefano Stabellini wrote:
> 
> On Tue, 21 May 2013, Stefano Stabellini wrote:
>   
> 
> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>     
> 
> Looking at the hypervisor code I couldn't see anything obviously wrong.
>         
> 
> I think the culprit is "physdev_unmap_pirq":
> 
>    if ( is_hvm_domain(d) )                                                     
>     {                                                                           
>         spin_lock(&d->event_lock);                                              
>         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",            
>             d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),                 
>             domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",        
>             domain_pirq_to_irq(d, pirq));                                       
>                                                                                 
>         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )                    
>             ret = unmap_domain_pirq_emuirq(d, pirq);                            
>         spin_unlock(&d->event_lock);                                            
>         if ( domid == DOMID_SELF || ret )                                       
>             goto free_domain;                                             
> 
> It always tells me unbound:
> 
> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> (a bit older debug code, so the 'unbound' does not show up here).
> 
> Which means that the call to unmap_domain_pirq_emuirq does not happen.
> The checks in unmap_domain_pirq_emuirq also look to be depend
> on the code being IRQ_UNBOUND.
> 
> In other words, all of that code looks to only clear things when
> they are !IRQ_UNBOUND.
> 
> But the other logic (IRQ_UNBOUND) looks to be missing a removal
> in the radix tree:
> 
>   if ( emuirq != IRQ_PT )                                                     
>         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);             
>                                                                         
> And I think that is what is causing the leak - the radix tree
> needs to be pruned? Or perhaps the allocate_pirq should check
> the radix tree for IRQ_UNBOUND ones and re-use them?
>       
> 
> I think that you are looking in the wrong place.
> The issue is that QEMU doesn't call pt_msi_disable in
> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> 
> The code above is correct as is because it is trying to handle emulated
> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>     
> 
> 
> This patch fixes the issue, I have only tested MSI (MSI-X completely
> untested).
> 
> 
> diff --git a/hw/pass-through.c b/hw/pass-through.c
> index 304c438..079e465 100644
> --- a/hw/pass-through.c
> +++ b/hw/pass-through.c
> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev *ptdev,
>          ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>      }
>      else
> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> +    {
> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> +            pt_msi_disable(ptdev);
> +        }
> +    }
>  
>      /* pass through MSI_ENABLE bit when no MSI-INTx translation */
>      if (!ptdev->msi_trans_en) {
> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev *ptdev,
>              pt_disable_msi_translate(ptdev);
>          }
>          pt_msix_update(ptdev);
> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
> +        pt_msix_delete(ptdev);
>   
> 
> Hi Stefano,
> I made a test with this patch, os reboot when driver reload. If use pt_msix_disable
> instead of pt_msix_delete, driver could be reloaded.
> But I still see some error in qemu.log and xen console. Seems four IRQs are not freed
> when unmap.
> --------------first load---------------------------
> pt_msix_update_one: pt_msix_update_one requested pirq = 103
> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
> pt_msix_update_one: pt_msix_update_one requested pirq = 102
> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
> pt_msix_update_one: pt_msix_update_one requested pirq = 101
> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
> pt_msix_update_one: pt_msix_update_one requested pirq = 100
> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
> ------------- first unload---------------------------
> pt_msix_disable: Unbind msix with pirq 67, gvec 0
> pt_msix_disable: Unmap msix with pirq 67
> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> pt_msix_disable: Unbind msix with pirq 66, gvec 0
> pt_msix_disable: Unmap msix with pirq 66
> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> pt_msix_disable: Unbind msix with pirq 65, gvec 0
> pt_msix_disable: Unmap msix with pirq 65
> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> pt_msix_disable: Unbind msix with pirq 64, gvec 0
> pt_msix_disable: Unmap msix with pirq 64
> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]

Can you add some printks in Xen (the hypercall name is
PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-05 12:50                                       ` [Xen-devel] " Stefano Stabellini
@ 2013-06-20  2:57                                         ` Zhenzhong Duan
  2013-06-20 14:21                                           ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-20  2:57 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner


On 2013-06-05 20:50, Stefano Stabellini wrote:
> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>> Stefano Stabellini wrote:
>>
>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>    
>>
>> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>      
>>
>> Looking at the hypervisor code I couldn't see anything obviously wrong.
>>          
>>
>> I think the culprit is "physdev_unmap_pirq":
>>
>>     if ( is_hvm_domain(d) )
>>      {
>>          spin_lock(&d->event_lock);
>>          gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
>>              d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>              domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",
>>              domain_pirq_to_irq(d, pirq));
>>                                                                                  
>>          if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>              ret = unmap_domain_pirq_emuirq(d, pirq);
>>          spin_unlock(&d->event_lock);
>>          if ( domid == DOMID_SELF || ret )
>>              goto free_domain;
>>
>> It always tells me unbound:
>>
>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>> (a bit older debug code, so the 'unbound' does not show up here).
>>
>> Which means that the call to unmap_domain_pirq_emuirq does not happen.
>> The checks in unmap_domain_pirq_emuirq also look to be depend
>> on the code being IRQ_UNBOUND.
>>
>> In other words, all of that code looks to only clear things when
>> they are !IRQ_UNBOUND.
>>
>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>> in the radix tree:
>>
>>    if ( emuirq != IRQ_PT )
>>          radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
>>                                                                          
>> And I think that is what is causing the leak - the radix tree
>> needs to be pruned? Or perhaps the allocate_pirq should check
>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>        
>>
>> I think that you are looking in the wrong place.
>> The issue is that QEMU doesn't call pt_msi_disable in
>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>
>> The code above is correct as is because it is trying to handle emulated
>> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>      
>>
>>
>> This patch fixes the issue, I have only tested MSI (MSI-X completely
>> untested).
>>
>>
>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>> index 304c438..079e465 100644
>> --- a/hw/pass-through.c
>> +++ b/hw/pass-through.c
>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev *ptdev,
>>           ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>       }
>>       else
>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>> +    {
>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>> +            pt_msi_disable(ptdev);
>> +        }
>> +    }
>>   
>>       /* pass through MSI_ENABLE bit when no MSI-INTx translation */
>>       if (!ptdev->msi_trans_en) {
>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev *ptdev,
>>               pt_disable_msi_translate(ptdev);
>>           }
>>           pt_msix_update(ptdev);
>> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
>> +        pt_msix_delete(ptdev);
>>    
>>
>> Hi Stefano,
>> I made a test with this patch, os reboot when driver reload. If use pt_msix_disable
>> instead of pt_msix_delete, driver could be reloaded.
>> But I still see some error in qemu.log and xen console. Seems four IRQs are not freed
>> when unmap.
>> --------------first load---------------------------
>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>> ------------- first unload---------------------------
>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>> pt_msix_disable: Unmap msix with pirq 67
>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>> pt_msix_disable: Unmap msix with pirq 66
>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>> pt_msix_disable: Unmap msix with pirq 65
>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>> pt_msix_disable: Unmap msix with pirq 64
>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> Can you add some printks in Xen (the hypercall name is
> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in 
physdev_unmap_pirq.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-20  2:57                                         ` Zhenzhong Duan
@ 2013-06-20 14:21                                           ` Stefano Stabellini
  2013-06-24  7:19                                               ` Zhenzhong Duan
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-20 14:21 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen,
	Ingo Molnar, H. Peter Anvin, Thomas Gleixner

On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
> On 2013-06-05 20:50, Stefano Stabellini wrote:
> > On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> > > Stefano Stabellini wrote:
> > > 
> > > On Tue, 21 May 2013, Stefano Stabellini wrote:
> > >    
> > > On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > >      
> > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > >          
> > > I think the culprit is "physdev_unmap_pirq":
> > > 
> > >     if ( is_hvm_domain(d) )
> > >      {
> > >          spin_lock(&d->event_lock);
> > >          gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
> > >              d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
> > >              domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" :
> > > "",
> > >              domain_pirq_to_irq(d, pirq));
> > >                                                                                           if
> > > ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > >              ret = unmap_domain_pirq_emuirq(d, pirq);
> > >          spin_unlock(&d->event_lock);
> > >          if ( domid == DOMID_SELF || ret )
> > >              goto free_domain;
> > > 
> > > It always tells me unbound:
> > > 
> > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (a bit older debug code, so the 'unbound' does not show up here).
> > > 
> > > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > on the code being IRQ_UNBOUND.
> > > 
> > > In other words, all of that code looks to only clear things when
> > > they are !IRQ_UNBOUND.
> > > 
> > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > in the radix tree:
> > > 
> > >    if ( emuirq != IRQ_PT )
> > >          radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
> > >                                                                          And
> > > I think that is what is causing the leak - the radix tree
> > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > >        
> > > I think that you are looking in the wrong place.
> > > The issue is that QEMU doesn't call pt_msi_disable in
> > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > > 
> > > The code above is correct as is because it is trying to handle emulated
> > > IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> > > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
> > >      
> > > 
> > > This patch fixes the issue, I have only tested MSI (MSI-X completely
> > > untested).
> > > 
> > > 
> > > diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > index 304c438..079e465 100644
> > > --- a/hw/pass-through.c
> > > +++ b/hw/pass-through.c
> > > @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
> > > *ptdev,
> > >           ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
> > >       }
> > >       else
> > > -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> > > +    {
> > > +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> > > +            pt_msi_disable(ptdev);
> > > +        }
> > > +    }
> > >         /* pass through MSI_ENABLE bit when no MSI-INTx translation */
> > >       if (!ptdev->msi_trans_en) {
> > > @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
> > > *ptdev,
> > >               pt_disable_msi_translate(ptdev);
> > >           }
> > >           pt_msix_update(ptdev);
> > > +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
> > > +        pt_msix_delete(ptdev);
> > >    
> > > Hi Stefano,
> > > I made a test with this patch, os reboot when driver reload. If use
> > > pt_msix_disable
> > > instead of pt_msix_delete, driver could be reloaded.
> > > But I still see some error in qemu.log and xen console. Seems four IRQs
> > > are not freed
> > > when unmap.
> > > --------------first load---------------------------
> > > pt_msix_update_one: pt_msix_update_one requested pirq = 103
> > > pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
> > > pt_msix_update_one: pt_msix_update_one requested pirq = 102
> > > pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
> > > pt_msix_update_one: pt_msix_update_one requested pirq = 101
> > > pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
> > > pt_msix_update_one: pt_msix_update_one requested pirq = 100
> > > pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
> > > ------------- first unload---------------------------
> > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > pt_msix_disable: Unmap msix with pirq 67
> > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > pt_msix_disable: Unbind msix with pirq 66, gvec 0
> > > pt_msix_disable: Unmap msix with pirq 66
> > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > pt_msix_disable: Unbind msix with pirq 65, gvec 0
> > > pt_msix_disable: Unmap msix with pirq 65
> > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > pt_msix_disable: Unbind msix with pirq 64, gvec 0
> > > pt_msix_disable: Unmap msix with pirq 64
> > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > Can you add some printks in Xen (the hypercall name is
> > PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
> physdev_unmap_pirq.

That means that Linux didn't call irq_enable on the MSI-X in question:

irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq

EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
be IRQ_PT.

I don't know if that's a normal condition, but in any case it should
not create any problems to physdev_unmap_pirq, in fact the folloing
check:

    if ( domid == DOMID_SELF || ret )
            goto free_domain;

should fail so Xen should continue and execute unmap_domain_pirq. That's
what we want.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-20 14:21                                           ` Stefano Stabellini
@ 2013-06-24  7:19                                               ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-24  7:19 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner


On 2013-06-20 22:21, Stefano Stabellini wrote:
> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>> Stefano Stabellini wrote:
>>>>
>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>     
>>>> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>       
>>>> Looking at the hypervisor code I couldn't see anything obviously wrong.
>>>>           
>>>> I think the culprit is "physdev_unmap_pirq":
>>>>
>>>>      if ( is_hvm_domain(d) )
>>>>       {
>>>>           spin_lock(&d->event_lock);
>>>>           gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
>>>>               d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>>               domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" :
>>>> "",
>>>>               domain_pirq_to_irq(d, pirq));
>>>>                                                                                            if
>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>               ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>           spin_unlock(&d->event_lock);
>>>>           if ( domid == DOMID_SELF || ret )
>>>>               goto free_domain;
>>>>
>>>> It always tells me unbound:
>>>>
>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>>
>>>> Which means that the call to unmap_domain_pirq_emuirq does not happen.
>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>> on the code being IRQ_UNBOUND.
>>>>
>>>> In other words, all of that code looks to only clear things when
>>>> they are !IRQ_UNBOUND.
>>>>
>>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>>> in the radix tree:
>>>>
>>>>     if ( emuirq != IRQ_PT )
>>>>           radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
>>>>                                                                           And
>>>> I think that is what is causing the leak - the radix tree
>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>         
>>>> I think that you are looking in the wrong place.
>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>
>>>> The code above is correct as is because it is trying to handle emulated
>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
>>>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>>>       
>>>>
>>>> This patch fixes the issue, I have only tested MSI (MSI-X completely
>>>> untested).
>>>>
>>>>
>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>> index 304c438..079e465 100644
>>>> --- a/hw/pass-through.c
>>>> +++ b/hw/pass-through.c
>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
>>>> *ptdev,
>>>>            ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>        }
>>>>        else
>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>> +    {
>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>> +            pt_msi_disable(ptdev);
>>>> +        }
>>>> +    }
>>>>          /* pass through MSI_ENABLE bit when no MSI-INTx translation */
>>>>        if (!ptdev->msi_trans_en) {
>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
>>>> *ptdev,
>>>>                pt_disable_msi_translate(ptdev);
>>>>            }
>>>>            pt_msix_update(ptdev);
>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
>>>> +        pt_msix_delete(ptdev);
>>>>     
>>>> Hi Stefano,
>>>> I made a test with this patch, os reboot when driver reload. If use
>>>> pt_msix_disable
>>>> instead of pt_msix_delete, driver could be reloaded.
>>>> But I still see some error in qemu.log and xen console. Seems four IRQs
>>>> are not freed
>>>> when unmap.
>>>> --------------first load---------------------------
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>> ------------- first unload---------------------------
>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 67
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 66
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 65
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 64
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>> Can you add some printks in Xen (the hypercall name is
>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
>> physdev_unmap_pirq.
> That means that Linux didn't call irq_enable on the MSI-X in question:
>
> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>
> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
> be IRQ_PT.
>
> I don't know if that's a normal condition, but in any case it should
> not create any problems to physdev_unmap_pirq, in fact the folloing
> check:
>
>      if ( domid == DOMID_SELF || ret )
>              goto free_domain;
>
> should fail so Xen should continue and execute unmap_domain_pirq. That's
> what we want.
 From linux side, request_irq->  request_threaded_irq-> __setup_irq->  
irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
If irq_enable isn't called, how does the driver receive interrupt, I did 
see four interrupts in /proc/interrupt and driver works ok.

Could you have a look if there is something wrong in xen side of 
clearing the mapping?

zduan


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
@ 2013-06-24  7:19                                               ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-24  7:19 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner


On 2013-06-20 22:21, Stefano Stabellini wrote:
> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>> Stefano Stabellini wrote:
>>>>
>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>     
>>>> On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>       
>>>> Looking at the hypervisor code I couldn't see anything obviously wrong.
>>>>           
>>>> I think the culprit is "physdev_unmap_pirq":
>>>>
>>>>      if ( is_hvm_domain(d) )
>>>>       {
>>>>           spin_lock(&d->event_lock);
>>>>           gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
>>>>               d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>>               domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" :
>>>> "",
>>>>               domain_pirq_to_irq(d, pirq));
>>>>                                                                                            if
>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>               ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>           spin_unlock(&d->event_lock);
>>>>           if ( domid == DOMID_SELF || ret )
>>>>               goto free_domain;
>>>>
>>>> It always tells me unbound:
>>>>
>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>>
>>>> Which means that the call to unmap_domain_pirq_emuirq does not happen.
>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>> on the code being IRQ_UNBOUND.
>>>>
>>>> In other words, all of that code looks to only clear things when
>>>> they are !IRQ_UNBOUND.
>>>>
>>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>>> in the radix tree:
>>>>
>>>>     if ( emuirq != IRQ_PT )
>>>>           radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
>>>>                                                                           And
>>>> I think that is what is causing the leak - the radix tree
>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>         
>>>> I think that you are looking in the wrong place.
>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>
>>>> The code above is correct as is because it is trying to handle emulated
>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
>>>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>>>       
>>>>
>>>> This patch fixes the issue, I have only tested MSI (MSI-X completely
>>>> untested).
>>>>
>>>>
>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>> index 304c438..079e465 100644
>>>> --- a/hw/pass-through.c
>>>> +++ b/hw/pass-through.c
>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
>>>> *ptdev,
>>>>            ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>        }
>>>>        else
>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>> +    {
>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>> +            pt_msi_disable(ptdev);
>>>> +        }
>>>> +    }
>>>>          /* pass through MSI_ENABLE bit when no MSI-INTx translation */
>>>>        if (!ptdev->msi_trans_en) {
>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
>>>> *ptdev,
>>>>                pt_disable_msi_translate(ptdev);
>>>>            }
>>>>            pt_msix_update(ptdev);
>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
>>>> +        pt_msix_delete(ptdev);
>>>>     
>>>> Hi Stefano,
>>>> I made a test with this patch, os reboot when driver reload. If use
>>>> pt_msix_disable
>>>> instead of pt_msix_delete, driver could be reloaded.
>>>> But I still see some error in qemu.log and xen console. Seems four IRQs
>>>> are not freed
>>>> when unmap.
>>>> --------------first load---------------------------
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>> ------------- first unload---------------------------
>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 67
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 66
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 65
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>> pt_msix_disable: Unmap msix with pirq 64
>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>> Can you add some printks in Xen (the hypercall name is
>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
>> physdev_unmap_pirq.
> That means that Linux didn't call irq_enable on the MSI-X in question:
>
> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>
> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
> be IRQ_PT.
>
> I don't know if that's a normal condition, but in any case it should
> not create any problems to physdev_unmap_pirq, in fact the folloing
> check:
>
>      if ( domid == DOMID_SELF || ret )
>              goto free_domain;
>
> should fail so Xen should continue and execute unmap_domain_pirq. That's
> what we want.
 From linux side, request_irq->  request_threaded_irq-> __setup_irq->  
irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
If irq_enable isn't called, how does the driver receive interrupt, I did 
see four interrupts in /proc/interrupt and driver works ok.

Could you have a look if there is something wrong in xen side of 
clearing the mapping?

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-24  7:19                                               ` Zhenzhong Duan
  (?)
@ 2013-06-24 17:18                                               ` Stefano Stabellini
  2013-06-25  5:33                                                   ` DuanZhenzhong
  -1 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-24 17:18 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen

Trimming some of the people in CC

On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> On 2013-06-20 22:21, Stefano Stabellini wrote:
> > On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
> > > On 2013-06-05 20:50, Stefano Stabellini wrote:
> > > > On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> > > > > Stefano Stabellini wrote:
> > > > > 
> > > > > On Tue, 21 May 2013, Stefano Stabellini wrote:
> > > > >     On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > >       Looking at the hypervisor code I couldn't see anything obviously
> > > > > wrong.
> > > > >           I think the culprit is "physdev_unmap_pirq":
> > > > > 
> > > > >      if ( is_hvm_domain(d) )
> > > > >       {
> > > > >           spin_lock(&d->event_lock);
> > > > >           gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
> > > > >               d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
> > > > >               domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
> > > > > "unbound" :
> > > > > "",
> > > > >               domain_pirq_to_irq(d, pirq));
> > > > >                                                                                            if
> > > > > ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > > > >               ret = unmap_domain_pirq_emuirq(d, pirq);
> > > > >           spin_unlock(&d->event_lock);
> > > > >           if ( domid == DOMID_SELF || ret )
> > > > >               goto free_domain;
> > > > > 
> > > > > It always tells me unbound:
> > > > > 
> > > > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > (a bit older debug code, so the 'unbound' does not show up here).
> > > > > 
> > > > > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > > > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > > > on the code being IRQ_UNBOUND.
> > > > > 
> > > > > In other words, all of that code looks to only clear things when
> > > > > they are !IRQ_UNBOUND.
> > > > > 
> > > > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > > > in the radix tree:
> > > > > 
> > > > >     if ( emuirq != IRQ_PT )
> > > > >           radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
> > > > >                                                                           And
> > > > > I think that is what is causing the leak - the radix tree
> > > > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > > > >         I think that you are looking in the wrong place.
> > > > > The issue is that QEMU doesn't call pt_msi_disable in
> > > > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > > > > 
> > > > > The code above is correct as is because it is trying to handle
> > > > > emulated
> > > > > IRQs and MSIs, not real passthrough MSIs. They latter are not added to
> > > > > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
> > > > >       
> > > > > This patch fixes the issue, I have only tested MSI (MSI-X completely
> > > > > untested).
> > > > > 
> > > > > 
> > > > > diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > > > index 304c438..079e465 100644
> > > > > --- a/hw/pass-through.c
> > > > > +++ b/hw/pass-through.c
> > > > > @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
> > > > > *ptdev,
> > > > >            ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
> > > > >        }
> > > > >        else
> > > > > -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> > > > > +    {
> > > > > +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> > > > > +            pt_msi_disable(ptdev);
> > > > > +        }
> > > > > +    }
> > > > >          /* pass through MSI_ENABLE bit when no MSI-INTx translation
> > > > > */
> > > > >        if (!ptdev->msi_trans_en) {
> > > > > @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
> > > > > *ptdev,
> > > > >                pt_disable_msi_translate(ptdev);
> > > > >            }
> > > > >            pt_msix_update(ptdev);
> > > > > +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
> > > > > +        pt_msix_delete(ptdev);
> > > > >     Hi Stefano,
> > > > > I made a test with this patch, os reboot when driver reload. If use
> > > > > pt_msix_disable
> > > > > instead of pt_msix_delete, driver could be reloaded.
> > > > > But I still see some error in qemu.log and xen console. Seems four
> > > > > IRQs
> > > > > are not freed
> > > > > when unmap.
> > > > > --------------first load---------------------------
> > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 103
> > > > > pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
> > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 102
> > > > > pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
> > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 101
> > > > > pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
> > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 100
> > > > > pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
> > > > > ------------- first unload---------------------------
> > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > pt_msix_disable: Unbind msix with pirq 66, gvec 0
> > > > > pt_msix_disable: Unmap msix with pirq 66
> > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > pt_msix_disable: Unbind msix with pirq 65, gvec 0
> > > > > pt_msix_disable: Unmap msix with pirq 65
> > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > pt_msix_disable: Unbind msix with pirq 64, gvec 0
> > > > > pt_msix_disable: Unmap msix with pirq 64
> > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > Can you add some printks in Xen (the hypercall name is
> > > > PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
> > > Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
> > > physdev_unmap_pirq.
> > That means that Linux didn't call irq_enable on the MSI-X in question:
> > 
> > irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
> > 
> > EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
> > map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
> > be IRQ_PT.
> > 
> > I don't know if that's a normal condition, but in any case it should
> > not create any problems to physdev_unmap_pirq, in fact the folloing
> > check:
> > 
> >      if ( domid == DOMID_SELF || ret )
> >              goto free_domain;
> > 
> > should fail so Xen should continue and execute unmap_domain_pirq. That's
> > what we want.
> From linux side, request_irq->  request_threaded_irq-> __setup_irq->
> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
> If irq_enable isn't called, how does the driver receive interrupt, I did see
> four interrupts in /proc/interrupt and driver works ok.

Good to know

> Could you have a look if there is something wrong in xen side of clearing the
> mapping?

What I am saying is that the error you are getting:

pt_msix_disable: Unbind msix with pirq 67, gvec 0
pt_msix_disable: Unmap msix with pirq 67
pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]

cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
IRQ_UNBOUND.
So, why are you getting this error? What is failing?
I am ready to believe the problem is in Xen but Without understanding
why you are getting the error it's hard to find a solution.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-24 17:18                                               ` Stefano Stabellini
@ 2013-06-25  5:33                                                   ` DuanZhenzhong
  0 siblings, 0 replies; 58+ messages in thread
From: DuanZhenzhong @ 2013-06-25  5:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen

Stefano Stabellini wrote:
> Trimming some of the people in CC
>
> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>   
>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>     
>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>       
>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>         
>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>           
>>>>>> Stefano Stabellini wrote:
>>>>>>
>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>     On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>       Looking at the hypervisor code I couldn't see anything obviously
>>>>>> wrong.
>>>>>>           I think the culprit is "physdev_unmap_pirq":
>>>>>>
>>>>>>      if ( is_hvm_domain(d) )
>>>>>>       {
>>>>>>           spin_lock(&d->event_lock);
>>>>>>           gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
>>>>>>               d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>>>>               domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
>>>>>> "unbound" :
>>>>>> "",
>>>>>>               domain_pirq_to_irq(d, pirq));
>>>>>>                                                                                            if
>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>               ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>           spin_unlock(&d->event_lock);
>>>>>>           if ( domid == DOMID_SELF || ret )
>>>>>>               goto free_domain;
>>>>>>
>>>>>> It always tells me unbound:
>>>>>>
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>>>>
>>>>>> Which means that the call to unmap_domain_pirq_emuirq does not happen.
>>>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>>>> on the code being IRQ_UNBOUND.
>>>>>>
>>>>>> In other words, all of that code looks to only clear things when
>>>>>> they are !IRQ_UNBOUND.
>>>>>>
>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>>>>> in the radix tree:
>>>>>>
>>>>>>     if ( emuirq != IRQ_PT )
>>>>>>           radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
>>>>>>                                                                           And
>>>>>> I think that is what is causing the leak - the radix tree
>>>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>         I think that you are looking in the wrong place.
>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>
>>>>>> The code above is correct as is because it is trying to handle
>>>>>> emulated
>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
>>>>>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>>>>>       
>>>>>> This patch fixes the issue, I have only tested MSI (MSI-X completely
>>>>>> untested).
>>>>>>
>>>>>>
>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>> index 304c438..079e465 100644
>>>>>> --- a/hw/pass-through.c
>>>>>> +++ b/hw/pass-through.c
>>>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
>>>>>> *ptdev,
>>>>>>            ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>        }
>>>>>>        else
>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>> +    {
>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>> +            pt_msi_disable(ptdev);
>>>>>> +        }
>>>>>> +    }
>>>>>>          /* pass through MSI_ENABLE bit when no MSI-INTx translation
>>>>>> */
>>>>>>        if (!ptdev->msi_trans_en) {
>>>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
>>>>>> *ptdev,
>>>>>>                pt_disable_msi_translate(ptdev);
>>>>>>            }
>>>>>>            pt_msix_update(ptdev);
>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
>>>>>> +        pt_msix_delete(ptdev);
>>>>>>     Hi Stefano,
>>>>>> I made a test with this patch, os reboot when driver reload. If use
>>>>>> pt_msix_disable
>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>> But I still see some error in qemu.log and xen console. Seems four
>>>>>> IRQs
>>>>>> are not freed
>>>>>> when unmap.
>>>>>> --------------first load---------------------------
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>>>> ------------- first unload---------------------------
>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>             
>>>>> Can you add some printks in Xen (the hypercall name is
>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
>>>>>           
>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
>>>> physdev_unmap_pirq.
>>>>         
>>> That means that Linux didn't call irq_enable on the MSI-X in question:
>>>
>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>
>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
>>> be IRQ_PT.
>>>
>>> I don't know if that's a normal condition, but in any case it should
>>> not create any problems to physdev_unmap_pirq, in fact the folloing
>>> check:
>>>
>>>      if ( domid == DOMID_SELF || ret )
>>>              goto free_domain;
>>>
>>> should fail so Xen should continue and execute unmap_domain_pirq. That's
>>> what we want.
>>>       
>> From linux side, request_irq->  request_threaded_irq-> __setup_irq->
>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>> If irq_enable isn't called, how does the driver receive interrupt, I did see
>> four interrupts in /proc/interrupt and driver works ok.
>>     
>
> Good to know
>
>   
>> Could you have a look if there is something wrong in xen side of clearing the
>> mapping?
>>     
>
> What I am saying is that the error you are getting:
>
> pt_msix_disable: Unbind msix with pirq 67, gvec 0
> pt_msix_disable: Unmap msix with pirq 67
> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>
> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> IRQ_UNBOUND.
> So, why are you getting this error? What is failing?
> I am ready to believe the problem is in Xen but Without understanding
> why you are getting the error it's hard to find a solution.
>   
I found the reason, you are looking at xen-unstable, I was working with 
4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
That patch set ret to -EINVAL initially. After remove that line, unmap 
succeed.
But we still need below patch to let driver reload succeed everytime. 
Without that, 1st reload failed, 2nd succeed, 3 failed, ...

diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
--- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
+++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
@@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
         return;
     }
 
-    if ( offset != 3 && entry->io_mem[offset] != val )
+    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq == 
-1))
         entry->flags = 1;
     entry->io_mem[offset] = val;

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
@ 2013-06-25  5:33                                                   ` DuanZhenzhong
  0 siblings, 0 replies; 58+ messages in thread
From: DuanZhenzhong @ 2013-06-25  5:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen

Stefano Stabellini wrote:
> Trimming some of the people in CC
>
> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>   
>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>     
>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>       
>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>         
>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>           
>>>>>> Stefano Stabellini wrote:
>>>>>>
>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>     On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>       Looking at the hypervisor code I couldn't see anything obviously
>>>>>> wrong.
>>>>>>           I think the culprit is "physdev_unmap_pirq":
>>>>>>
>>>>>>      if ( is_hvm_domain(d) )
>>>>>>       {
>>>>>>           spin_lock(&d->event_lock);
>>>>>>           gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
>>>>>>               d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>>>>               domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
>>>>>> "unbound" :
>>>>>> "",
>>>>>>               domain_pirq_to_irq(d, pirq));
>>>>>>                                                                                            if
>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>               ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>           spin_unlock(&d->event_lock);
>>>>>>           if ( domid == DOMID_SELF || ret )
>>>>>>               goto free_domain;
>>>>>>
>>>>>> It always tells me unbound:
>>>>>>
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>>>>
>>>>>> Which means that the call to unmap_domain_pirq_emuirq does not happen.
>>>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>>>> on the code being IRQ_UNBOUND.
>>>>>>
>>>>>> In other words, all of that code looks to only clear things when
>>>>>> they are !IRQ_UNBOUND.
>>>>>>
>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>>>>> in the radix tree:
>>>>>>
>>>>>>     if ( emuirq != IRQ_PT )
>>>>>>           radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
>>>>>>                                                                           And
>>>>>> I think that is what is causing the leak - the radix tree
>>>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>         I think that you are looking in the wrong place.
>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>
>>>>>> The code above is correct as is because it is trying to handle
>>>>>> emulated
>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not added to
>>>>>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>>>>>       
>>>>>> This patch fixes the issue, I have only tested MSI (MSI-X completely
>>>>>> untested).
>>>>>>
>>>>>>
>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>> index 304c438..079e465 100644
>>>>>> --- a/hw/pass-through.c
>>>>>> +++ b/hw/pass-through.c
>>>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct pt_dev
>>>>>> *ptdev,
>>>>>>            ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>        }
>>>>>>        else
>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>> +    {
>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>> +            pt_msi_disable(ptdev);
>>>>>> +        }
>>>>>> +    }
>>>>>>          /* pass through MSI_ENABLE bit when no MSI-INTx translation
>>>>>> */
>>>>>>        if (!ptdev->msi_trans_en) {
>>>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct pt_dev
>>>>>> *ptdev,
>>>>>>                pt_disable_msi_translate(ptdev);
>>>>>>            }
>>>>>>            pt_msix_update(ptdev);
>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) && ptdev->msix->enabled) {
>>>>>> +        pt_msix_delete(ptdev);
>>>>>>     Hi Stefano,
>>>>>> I made a test with this patch, os reboot when driver reload. If use
>>>>>> pt_msix_disable
>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>> But I still see some error in qemu.log and xen console. Seems four
>>>>>> IRQs
>>>>>> are not freed
>>>>>> when unmap.
>>>>>> --------------first load---------------------------
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>>>> ------------- first unload---------------------------
>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>             
>>>>> Can you add some printks in Xen (the hypercall name is
>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
>>>>>           
>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
>>>> physdev_unmap_pirq.
>>>>         
>>> That means that Linux didn't call irq_enable on the MSI-X in question:
>>>
>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>
>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
>>> be IRQ_PT.
>>>
>>> I don't know if that's a normal condition, but in any case it should
>>> not create any problems to physdev_unmap_pirq, in fact the folloing
>>> check:
>>>
>>>      if ( domid == DOMID_SELF || ret )
>>>              goto free_domain;
>>>
>>> should fail so Xen should continue and execute unmap_domain_pirq. That's
>>> what we want.
>>>       
>> From linux side, request_irq->  request_threaded_irq-> __setup_irq->
>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>> If irq_enable isn't called, how does the driver receive interrupt, I did see
>> four interrupts in /proc/interrupt and driver works ok.
>>     
>
> Good to know
>
>   
>> Could you have a look if there is something wrong in xen side of clearing the
>> mapping?
>>     
>
> What I am saying is that the error you are getting:
>
> pt_msix_disable: Unbind msix with pirq 67, gvec 0
> pt_msix_disable: Unmap msix with pirq 67
> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>
> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> IRQ_UNBOUND.
> So, why are you getting this error? What is failing?
> I am ready to believe the problem is in Xen but Without understanding
> why you are getting the error it's hard to find a solution.
>   
I found the reason, you are looking at xen-unstable, I was working with 
4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
That patch set ret to -EINVAL initially. After remove that line, unmap 
succeed.
But we still need below patch to let driver reload succeed everytime. 
Without that, 1st reload failed, 2nd succeed, 3 failed, ...

diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
--- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
+++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
@@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
         return;
     }
 
-    if ( offset != 3 && entry->io_mem[offset] != val )
+    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq == 
-1))
         entry->flags = 1;
     entry->io_mem[offset] = val;

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time)
  2013-06-25  5:33                                                   ` DuanZhenzhong
  (?)
@ 2013-06-25  7:21                                                   ` Jan Beulich
  2013-06-25  7:44                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix DuanZhenzhong
  2013-06-25 10:56                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Stefano Stabellini
  -1 siblings, 2 replies; 58+ messages in thread
From: Jan Beulich @ 2013-06-25  7:21 UTC (permalink / raw)
  To: Stefano Stabellini, DuanZhenzhong
  Cc: xen-devel, Konrad Rzeszutek Wilk, Andrew Cooper, Feng Jin,
	Yuval Shaia, Chien Yen, David Vrabel

[-- Attachment #1: Type: text/plain, Size: 1782 bytes --]

>>> On 25.06.13 at 07:33, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
> Stefano Stabellini wrote:
>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>> Could you have a look if there is something wrong in xen side of clearing 
>>> the mapping?
>>
>> What I am saying is that the error you are getting:
>>
>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>> pt_msix_disable: Unmap msix with pirq 67
>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>
>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>> IRQ_UNBOUND.
>> So, why are you getting this error? What is failing?
>> I am ready to believe the problem is in Xen but Without understanding
>> why you are getting the error it's hard to find a solution.
>>   
> I found the reason, you are looking at xen-unstable, I was working with 
> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> That patch set ret to -EINVAL initially. After remove that line, unmap 
> succeed.

Removing that line certainly isn't right. The proper fix is the one
below/attached.

Jan

****************************************************
x86: fix emuirq regression from XSA-21 fix

The XSA-21 patch broke the assumption of "ret" being zero prior to the
IRQ_UNBOUND check.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
         spin_lock(&d->event_lock);
         if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
             ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
+        else
+            ret = 0;
         spin_unlock(&d->event_lock);
         if ( unmap->domid == DOMID_SELF || ret )
             goto free_domain;




[-- Attachment #2: x86-xsa21-emuirq-fix.patch --]
[-- Type: text/plain, Size: 641 bytes --]

x86: fix emuirq regression from XSA-21 fix

The XSA-21 patch broke the assumption of "ret" being zero prior to the
IRQ_UNBOUND check.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
         spin_lock(&d->event_lock);
         if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
             ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
+        else
+            ret = 0;
         spin_unlock(&d->event_lock);
         if ( unmap->domid == DOMID_SELF || ret )
             goto free_domain;

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix
  2013-06-25  7:21                                                   ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Jan Beulich
@ 2013-06-25  7:44                                                     ` DuanZhenzhong
  2013-06-25  8:36                                                       ` Jan Beulich
  2013-06-25 10:56                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Stefano Stabellini
  1 sibling, 1 reply; 58+ messages in thread
From: DuanZhenzhong @ 2013-06-25  7:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Stefano Stabellini, Andrew Cooper,
	Konrad Rzeszutek Wilk, Feng Jin, Yuval Shaia, Chien Yen,
	David Vrabel

Jan Beulich wrote:
>>>> On 25.06.13 at 07:33, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
>>>>         
>> Stefano Stabellini wrote:
>>     
>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>       
>>>> Could you have a look if there is something wrong in xen side of clearing 
>>>> the mapping?
>>>>         
>>> What I am saying is that the error you are getting:
>>>
>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>> pt_msix_disable: Unmap msix with pirq 67
>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>
>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>> IRQ_UNBOUND.
>>> So, why are you getting this error? What is failing?
>>> I am ready to believe the problem is in Xen but Without understanding
>>> why you are getting the error it's hard to find a solution.
>>>   
>>>       
>> I found the reason, you are looking at xen-unstable, I was working with 
>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>> That patch set ret to -EINVAL initially. After remove that line, unmap 
>> succeed.
>>     
>
> Removing that line certainly isn't right. The proper fix is the one
> below/attached.
>
> Jan
>
> ****************************************************
> x86: fix emuirq regression from XSA-21 fix
>
> The XSA-21 patch broke the assumption of "ret" being zero prior to the
> IRQ_UNBOUND check.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
>          spin_lock(&d->event_lock);
>          if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
>              ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
> +        else
> +            ret = 0;
>          spin_unlock(&d->event_lock);
>          if ( unmap->domid == DOMID_SELF || ret )
>              goto free_domain;
>
>
>
>   
Any reason of not right? below patch does work as ret is already zero if it could get to that line.

zduan

****************************************************
x86: fix emuirq regression from XSA-21 fix

The XSA-21 patch broke the assumption of "ret" being zero prior to the
IRQ_UNBOUND check.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>

--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -240,7 +240,6 @@ static int physdev_unmap_pirq(struct phy
     if ( ret )
         return ret;
 
-    ret = -EINVAL;
     if ( unmap->pirq < 0 || unmap->pirq >= d->nr_pirqs )
         goto free_domain;

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix
  2013-06-25  7:44                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix DuanZhenzhong
@ 2013-06-25  8:36                                                       ` Jan Beulich
  2013-06-25  8:43                                                         ` DuanZhenzhong
  0 siblings, 1 reply; 58+ messages in thread
From: Jan Beulich @ 2013-06-25  8:36 UTC (permalink / raw)
  To: DuanZhenzhong
  Cc: xen-devel, Stefano Stabellini, Andrew Cooper,
	Konrad Rzeszutek Wilk, Feng Jin, Yuval Shaia, Chien Yen,
	David Vrabel

>>> On 25.06.13 at 09:44, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
> Any reason of not right? below patch does work as ret is already zero if it 
> could get to that line.
> 
> zduan
> 
> ****************************************************
> x86: fix emuirq regression from XSA-21 fix
> 
> The XSA-21 patch broke the assumption of "ret" being zero prior to the
> IRQ_UNBOUND check.
> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
> 
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -240,7 +240,6 @@ static int physdev_unmap_pirq(struct phy
>      if ( ret )
>          return ret;
>  
> -    ret = -EINVAL;
>      if ( unmap->pirq < 0 || unmap->pirq >= d->nr_pirqs )
>          goto free_domain;

So if unmap->pirq is out of range, you'd return 0 (success)
nevertheless - how can that be correct?

Jan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix
  2013-06-25  8:36                                                       ` Jan Beulich
@ 2013-06-25  8:43                                                         ` DuanZhenzhong
  0 siblings, 0 replies; 58+ messages in thread
From: DuanZhenzhong @ 2013-06-25  8:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Stefano Stabellini, Andrew Cooper,
	Konrad Rzeszutek Wilk, Feng Jin, Yuval Shaia, Chien Yen,
	David Vrabel

Jan Beulich wrote:
>>>> On 25.06.13 at 09:44, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
>>>>         
>> Any reason of not right? below patch does work as ret is already zero if it 
>> could get to that line.
>>
>> zduan
>>
>> ****************************************************
>> x86: fix emuirq regression from XSA-21 fix
>>
>> The XSA-21 patch broke the assumption of "ret" being zero prior to the
>> IRQ_UNBOUND check.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
>>
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -240,7 +240,6 @@ static int physdev_unmap_pirq(struct phy
>>      if ( ret )
>>          return ret;
>>  
>> -    ret = -EINVAL;
>>      if ( unmap->pirq < 0 || unmap->pirq >= d->nr_pirqs )
>>          goto free_domain;
>>     
>
> So if unmap->pirq is out of range, you'd return 0 (success)
> nevertheless - how can that be correct?
>
> Jan
>
>   
You are right.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time)
  2013-06-25  7:21                                                   ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Jan Beulich
  2013-06-25  7:44                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix DuanZhenzhong
@ 2013-06-25 10:56                                                     ` Stefano Stabellini
  2013-06-25 11:03                                                       ` Stefano Stabellini
  1 sibling, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-25 10:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Stefano Stabellini, Andrew Cooper,
	Konrad Rzeszutek Wilk, Feng Jin, DuanZhenzhong, Yuval Shaia,
	Chien Yen, David Vrabel

On Tue, 25 Jun 2013, Jan Beulich wrote:
> >>> On 25.06.13 at 07:33, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
> > Stefano Stabellini wrote:
> >> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> >>> Could you have a look if there is something wrong in xen side of clearing 
> >>> the mapping?
> >>
> >> What I am saying is that the error you are getting:
> >>
> >> pt_msix_disable: Unbind msix with pirq 67, gvec 0
> >> pt_msix_disable: Unmap msix with pirq 67
> >> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> >>
> >> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> >> IRQ_UNBOUND.
> >> So, why are you getting this error? What is failing?
> >> I am ready to believe the problem is in Xen but Without understanding
> >> why you are getting the error it's hard to find a solution.
> >>   
> > I found the reason, you are looking at xen-unstable, I was working with 
> > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> > That patch set ret to -EINVAL initially. After remove that line, unmap 
> > succeed.
> 
> Removing that line certainly isn't right. The proper fix is the one
> below/attached.
> 
> Jan
> 
> ****************************************************
> x86: fix emuirq regression from XSA-21 fix
> 
> The XSA-21 patch broke the assumption of "ret" being zero prior to the
> IRQ_UNBOUND check.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/physdev.c
> +++ b/xen/arch/x86/physdev.c
> @@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
>          spin_lock(&d->event_lock);
>          if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
>              ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
> +        else
> +            ret = 0;
>          spin_unlock(&d->event_lock);
>          if ( unmap->domid == DOMID_SELF || ret )
>              goto free_domain;


This is unnecessary.
ret is 0 regardless because of the previous:

ret = xsm_unmap_domain_pirq(XSM_TARGET, d);

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time)
  2013-06-25 10:56                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Stefano Stabellini
@ 2013-06-25 11:03                                                       ` Stefano Stabellini
  2013-06-27  8:34                                                         ` Jan Beulich
  0 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-25 11:03 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Konrad Rzeszutek Wilk, Andrew Cooper, Feng Jin,
	DuanZhenzhong, Yuval Shaia, Chien Yen, David Vrabel, Jan Beulich

On Tue, 25 Jun 2013, Stefano Stabellini wrote:
> On Tue, 25 Jun 2013, Jan Beulich wrote:
> > >>> On 25.06.13 at 07:33, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
> > > Stefano Stabellini wrote:
> > >> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> > >>> Could you have a look if there is something wrong in xen side of clearing 
> > >>> the mapping?
> > >>
> > >> What I am saying is that the error you are getting:
> > >>
> > >> pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > >> pt_msix_disable: Unmap msix with pirq 67
> > >> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > >>
> > >> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> > >> IRQ_UNBOUND.
> > >> So, why are you getting this error? What is failing?
> > >> I am ready to believe the problem is in Xen but Without understanding
> > >> why you are getting the error it's hard to find a solution.
> > >>   
> > > I found the reason, you are looking at xen-unstable, I was working with 
> > > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> > > That patch set ret to -EINVAL initially. After remove that line, unmap 
> > > succeed.
> > 
> > Removing that line certainly isn't right. The proper fix is the one
> > below/attached.
> > 
> > Jan
> > 
> > ****************************************************
> > x86: fix emuirq regression from XSA-21 fix
> > 
> > The XSA-21 patch broke the assumption of "ret" being zero prior to the
> > IRQ_UNBOUND check.
> > 
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > --- a/xen/arch/x86/physdev.c
> > +++ b/xen/arch/x86/physdev.c
> > @@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
> >          spin_lock(&d->event_lock);
> >          if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
> >              ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
> > +        else
> > +            ret = 0;
> >          spin_unlock(&d->event_lock);
> >          if ( unmap->domid == DOMID_SELF || ret )
> >              goto free_domain;
> 
> 
> This is unnecessary.
> ret is 0 regardless because of the previous:
> 
> ret = xsm_unmap_domain_pirq(XSM_TARGET, d);
>

Ops, I read now the patch for XSA-21.
The change above is fine (and makes the code more readable anyway).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-25  5:33                                                   ` DuanZhenzhong
  (?)
  (?)
@ 2013-06-25 17:51                                                   ` Stefano Stabellini
  2013-06-26  4:00                                                       ` Zhenzhong Duan
  -1 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-25 17:51 UTC (permalink / raw)
  To: DuanZhenzhong
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen

On Tue, 25 Jun 2013, DuanZhenzhong wrote:
> Stefano Stabellini wrote:
> > Trimming some of the people in CC
> > 
> > On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> >   
> > > On 2013-06-20 22:21, Stefano Stabellini wrote:
> > >     
> > > > On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
> > > >       
> > > > > On 2013-06-05 20:50, Stefano Stabellini wrote:
> > > > >         
> > > > > > On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> > > > > >           
> > > > > > > Stefano Stabellini wrote:
> > > > > > > 
> > > > > > > On Tue, 21 May 2013, Stefano Stabellini wrote:
> > > > > > >     On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > >       Looking at the hypervisor code I couldn't see anything
> > > > > > > obviously
> > > > > > > wrong.
> > > > > > >           I think the culprit is "physdev_unmap_pirq":
> > > > > > > 
> > > > > > >      if ( is_hvm_domain(d) )
> > > > > > >       {
> > > > > > >           spin_lock(&d->event_lock);
> > > > > > >           gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq:
> > > > > > > %d\n",
> > > > > > >               d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
> > > > > > >               domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
> > > > > > > "unbound" :
> > > > > > > "",
> > > > > > >               domain_pirq_to_irq(d, pirq));
> > > > > > >                                                                                            if
> > > > > > > ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > > > > > >               ret = unmap_domain_pirq_emuirq(d, pirq);
> > > > > > >           spin_unlock(&d->event_lock);
> > > > > > >           if ( domid == DOMID_SELF || ret )
> > > > > > >               goto free_domain;
> > > > > > > 
> > > > > > > It always tells me unbound:
> > > > > > > 
> > > > > > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > (a bit older debug code, so the 'unbound' does not show up here).
> > > > > > > 
> > > > > > > Which means that the call to unmap_domain_pirq_emuirq does not
> > > > > > > happen.
> > > > > > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > > > > > on the code being IRQ_UNBOUND.
> > > > > > > 
> > > > > > > In other words, all of that code looks to only clear things when
> > > > > > > they are !IRQ_UNBOUND.
> > > > > > > 
> > > > > > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > > > > > in the radix tree:
> > > > > > > 
> > > > > > >     if ( emuirq != IRQ_PT )
> > > > > > >           radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
> > > > > > > emuirq);
> > > > > > >                                                                           And
> > > > > > > I think that is what is causing the leak - the radix tree
> > > > > > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > > > > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > > > > > >         I think that you are looking in the wrong place.
> > > > > > > The issue is that QEMU doesn't call pt_msi_disable in
> > > > > > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > > > > > > 
> > > > > > > The code above is correct as is because it is trying to handle
> > > > > > > emulated
> > > > > > > IRQs and MSIs, not real passthrough MSIs. They latter are not
> > > > > > > added to
> > > > > > > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
> > > > > > >       This patch fixes the issue, I have only tested MSI (MSI-X
> > > > > > > completely
> > > > > > > untested).
> > > > > > > 
> > > > > > > 
> > > > > > > diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > > > > > index 304c438..079e465 100644
> > > > > > > --- a/hw/pass-through.c
> > > > > > > +++ b/hw/pass-through.c
> > > > > > > @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct
> > > > > > > pt_dev
> > > > > > > *ptdev,
> > > > > > >            ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
> > > > > > >        }
> > > > > > >        else
> > > > > > > -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> > > > > > > +    {
> > > > > > > +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> > > > > > > +            pt_msi_disable(ptdev);
> > > > > > > +        }
> > > > > > > +    }
> > > > > > >          /* pass through MSI_ENABLE bit when no MSI-INTx
> > > > > > > translation
> > > > > > > */
> > > > > > >        if (!ptdev->msi_trans_en) {
> > > > > > > @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct
> > > > > > > pt_dev
> > > > > > > *ptdev,
> > > > > > >                pt_disable_msi_translate(ptdev);
> > > > > > >            }
> > > > > > >            pt_msix_update(ptdev);
> > > > > > > +    } else if (!(*value & PCI_MSIX_ENABLE) &&
> > > > > > > ptdev->msix->enabled) {
> > > > > > > +        pt_msix_delete(ptdev);
> > > > > > >     Hi Stefano,
> > > > > > > I made a test with this patch, os reboot when driver reload. If
> > > > > > > use
> > > > > > > pt_msix_disable
> > > > > > > instead of pt_msix_delete, driver could be reloaded.
> > > > > > > But I still see some error in qemu.log and xen console. Seems four
> > > > > > > IRQs
> > > > > > > are not freed
> > > > > > > when unmap.
> > > > > > > --------------first load---------------------------
> > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 103
> > > > > > > pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
> > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 102
> > > > > > > pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
> > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 101
> > > > > > > pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
> > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 100
> > > > > > > pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
> > > > > > > ------------- first unload---------------------------
> > > > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > pt_msix_disable: Unbind msix with pirq 66, gvec 0
> > > > > > > pt_msix_disable: Unmap msix with pirq 66
> > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > pt_msix_disable: Unbind msix with pirq 65, gvec 0
> > > > > > > pt_msix_disable: Unmap msix with pirq 65
> > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > pt_msix_disable: Unbind msix with pirq 64, gvec 0
> > > > > > > pt_msix_disable: Unmap msix with pirq 64
> > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > >             
> > > > > > Can you add some printks in Xen (the hypercall name is
> > > > > > PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
> > > > > >           
> > > > > Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
> > > > > physdev_unmap_pirq.
> > > > >         
> > > > That means that Linux didn't call irq_enable on the MSI-X in question:
> > > > 
> > > > irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
> > > > 
> > > > EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
> > > > map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
> > > > be IRQ_PT.
> > > > 
> > > > I don't know if that's a normal condition, but in any case it should
> > > > not create any problems to physdev_unmap_pirq, in fact the folloing
> > > > check:
> > > > 
> > > >      if ( domid == DOMID_SELF || ret )
> > > >              goto free_domain;
> > > > 
> > > > should fail so Xen should continue and execute unmap_domain_pirq. That's
> > > > what we want.
> > > >       
> > > From linux side, request_irq->  request_threaded_irq-> __setup_irq->
> > > irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
> > > If irq_enable isn't called, how does the driver receive interrupt, I did
> > > see
> > > four interrupts in /proc/interrupt and driver works ok.
> > >     
> > 
> > Good to know
> > 
> >   
> > > Could you have a look if there is something wrong in xen side of clearing
> > > the
> > > mapping?
> > >     
> > 
> > What I am saying is that the error you are getting:
> > 
> > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > pt_msix_disable: Unmap msix with pirq 67
> > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > 
> > cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> > IRQ_UNBOUND.
> > So, why are you getting this error? What is failing?
> > I am ready to believe the problem is in Xen but Without understanding
> > why you are getting the error it's hard to find a solution.
> >   
> I found the reason, you are looking at xen-unstable, I was working with
> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> That patch set ret to -EINVAL initially. After remove that line, unmap
> succeed.
> But we still need below patch to let driver reload succeed everytime. Without
> that, 1st reload failed, 2nd succeed, 3 failed, ...
> 
> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>         return;
>     }
> 
> -    if ( offset != 3 && entry->io_mem[offset] != val )
> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq == -1))
>         entry->flags = 1;
>     entry->io_mem[offset] = val;

Interesting. I don't think this is the proper fix though.
Does the appended patch change anything?

diff --git a/hw/pt-msi.c b/hw/pt-msi.c
index 71fa6f0..cd5d9c7 100644
--- a/hw/pt-msi.c
+++ b/hw/pt-msi.c
@@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr)
     uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
     int ret;
 
-    if ( !entry->flags )
+    if ( !entry->flags && ptdev->msix->enabled )
         return 0;
 
     if (!gvec) {


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-25 17:51                                                   ` [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time Stefano Stabellini
@ 2013-06-26  4:00                                                       ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-26  4:00 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen


On 2013-06-26 01:51, Stefano Stabellini wrote:
> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>> Stefano Stabellini wrote:
>>> Trimming some of the people in CC
>>>
>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>    
>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>      
>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>        
>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>          
>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>            
>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>
>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>      On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>        Looking at the hypervisor code I couldn't see anything
>>>>>>>> obviously
>>>>>>>> wrong.
>>>>>>>>            I think the culprit is "physdev_unmap_pirq":
>>>>>>>>
>>>>>>>>       if ( is_hvm_domain(d) )
>>>>>>>>        {
>>>>>>>>            spin_lock(&d->event_lock);
>>>>>>>>            gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq:
>>>>>>>> %d\n",
>>>>>>>>                d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>>>>>>                domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
>>>>>>>> "unbound" :
>>>>>>>> "",
>>>>>>>>                domain_pirq_to_irq(d, pirq));
>>>>>>>>                                                                                             if
>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>                ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>            spin_unlock(&d->event_lock);
>>>>>>>>            if ( domid == DOMID_SELF || ret )
>>>>>>>>                goto free_domain;
>>>>>>>>
>>>>>>>> It always tells me unbound:
>>>>>>>>
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>>>>>>
>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does not
>>>>>>>> happen.
>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>
>>>>>>>> In other words, all of that code looks to only clear things when
>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>
>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>>>>>>> in the radix tree:
>>>>>>>>
>>>>>>>>      if ( emuirq != IRQ_PT )
>>>>>>>>            radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>> emuirq);
>>>>>>>>                                                                            And
>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>          I think that you are looking in the wrong place.
>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>
>>>>>>>> The code above is correct as is because it is trying to handle
>>>>>>>> emulated
>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not
>>>>>>>> added to
>>>>>>>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>>>>>>>        This patch fixes the issue, I have only tested MSI (MSI-X
>>>>>>>> completely
>>>>>>>> untested).
>>>>>>>>
>>>>>>>>
>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>> index 304c438..079e465 100644
>>>>>>>> --- a/hw/pass-through.c
>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct
>>>>>>>> pt_dev
>>>>>>>> *ptdev,
>>>>>>>>             ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>         }
>>>>>>>>         else
>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>> +    {
>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>> +        }
>>>>>>>> +    }
>>>>>>>>           /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>> translation
>>>>>>>> */
>>>>>>>>         if (!ptdev->msi_trans_en) {
>>>>>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct
>>>>>>>> pt_dev
>>>>>>>> *ptdev,
>>>>>>>>                 pt_disable_msi_translate(ptdev);
>>>>>>>>             }
>>>>>>>>             pt_msix_update(ptdev);
>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>> ptdev->msix->enabled) {
>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>      Hi Stefano,
>>>>>>>> I made a test with this patch, os reboot when driver reload. If
>>>>>>>> use
>>>>>>>> pt_msix_disable
>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>> But I still see some error in qemu.log and xen console. Seems four
>>>>>>>> IRQs
>>>>>>>> are not freed
>>>>>>>> when unmap.
>>>>>>>> --------------first load---------------------------
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>>>>>> ------------- first unload---------------------------
>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>              
>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
>>>>>>>            
>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
>>>>>> physdev_unmap_pirq.
>>>>>>          
>>>>> That means that Linux didn't call irq_enable on the MSI-X in question:
>>>>>
>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>
>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
>>>>> be IRQ_PT.
>>>>>
>>>>> I don't know if that's a normal condition, but in any case it should
>>>>> not create any problems to physdev_unmap_pirq, in fact the folloing
>>>>> check:
>>>>>
>>>>>       if ( domid == DOMID_SELF || ret )
>>>>>               goto free_domain;
>>>>>
>>>>> should fail so Xen should continue and execute unmap_domain_pirq. That's
>>>>> what we want.
>>>>>        
>>>>  From linux side, request_irq->  request_threaded_irq-> __setup_irq->
>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>> If irq_enable isn't called, how does the driver receive interrupt, I did
>>>> see
>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>      
>>> Good to know
>>>
>>>    
>>>> Could you have a look if there is something wrong in xen side of clearing
>>>> the
>>>> mapping?
>>>>      
>>> What I am saying is that the error you are getting:
>>>
>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>> pt_msix_disable: Unmap msix with pirq 67
>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>
>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>> IRQ_UNBOUND.
>>> So, why are you getting this error? What is failing?
>>> I am ready to believe the problem is in Xen but Without understanding
>>> why you are getting the error it's hard to find a solution.
>>>    
>> I found the reason, you are looking at xen-unstable, I was working with
>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>> That patch set ret to -EINVAL initially. After remove that line, unmap
>> succeed.
>> But we still need below patch to let driver reload succeed everytime. Without
>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>
>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>          return;
>>      }
>>
>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq == -1))
>>          entry->flags = 1;
>>      entry->io_mem[offset] = val;
> Interesting. I don't think this is the proper fix though.
> Does the appended patch change anything?
>
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index 71fa6f0..cd5d9c7 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr)
>       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>       int ret;
>   
> -    if ( !entry->flags )
> +    if ( !entry->flags && ptdev->msix->enabled )
>           return 0;
>   
>       if (!gvec) {
Tested, not work.
If you look at msix_capability_init in kernel, line 707,722, 
dev->msix->enabled is already set when pt_msix_update is called.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
@ 2013-06-26  4:00                                                       ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-26  4:00 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen


On 2013-06-26 01:51, Stefano Stabellini wrote:
> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>> Stefano Stabellini wrote:
>>> Trimming some of the people in CC
>>>
>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>    
>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>      
>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>        
>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>          
>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>            
>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>
>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>      On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>        Looking at the hypervisor code I couldn't see anything
>>>>>>>> obviously
>>>>>>>> wrong.
>>>>>>>>            I think the culprit is "physdev_unmap_pirq":
>>>>>>>>
>>>>>>>>       if ( is_hvm_domain(d) )
>>>>>>>>        {
>>>>>>>>            spin_lock(&d->event_lock);
>>>>>>>>            gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq:
>>>>>>>> %d\n",
>>>>>>>>                d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
>>>>>>>>                domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
>>>>>>>> "unbound" :
>>>>>>>> "",
>>>>>>>>                domain_pirq_to_irq(d, pirq));
>>>>>>>>                                                                                             if
>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>                ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>            spin_unlock(&d->event_lock);
>>>>>>>>            if ( domid == DOMID_SELF || ret )
>>>>>>>>                goto free_domain;
>>>>>>>>
>>>>>>>> It always tells me unbound:
>>>>>>>>
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>> (a bit older debug code, so the 'unbound' does not show up here).
>>>>>>>>
>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does not
>>>>>>>> happen.
>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>
>>>>>>>> In other words, all of that code looks to only clear things when
>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>
>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a removal
>>>>>>>> in the radix tree:
>>>>>>>>
>>>>>>>>      if ( emuirq != IRQ_PT )
>>>>>>>>            radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>> emuirq);
>>>>>>>>                                                                            And
>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>          I think that you are looking in the wrong place.
>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>
>>>>>>>> The code above is correct as is because it is trying to handle
>>>>>>>> emulated
>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not
>>>>>>>> added to
>>>>>>>> that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>>>>>>>>        This patch fixes the issue, I have only tested MSI (MSI-X
>>>>>>>> completely
>>>>>>>> untested).
>>>>>>>>
>>>>>>>>
>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>> index 304c438..079e465 100644
>>>>>>>> --- a/hw/pass-through.c
>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct
>>>>>>>> pt_dev
>>>>>>>> *ptdev,
>>>>>>>>             ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>         }
>>>>>>>>         else
>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>> +    {
>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>> +        }
>>>>>>>> +    }
>>>>>>>>           /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>> translation
>>>>>>>> */
>>>>>>>>         if (!ptdev->msi_trans_en) {
>>>>>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct
>>>>>>>> pt_dev
>>>>>>>> *ptdev,
>>>>>>>>                 pt_disable_msi_translate(ptdev);
>>>>>>>>             }
>>>>>>>>             pt_msix_update(ptdev);
>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>> ptdev->msix->enabled) {
>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>      Hi Stefano,
>>>>>>>> I made a test with this patch, os reboot when driver reload. If
>>>>>>>> use
>>>>>>>> pt_msix_disable
>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>> But I still see some error in qemu.log and xen console. Seems four
>>>>>>>> IRQs
>>>>>>>> are not freed
>>>>>>>> when unmap.
>>>>>>>> --------------first load---------------------------
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>>>>>> ------------- first unload---------------------------
>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>              
>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are failing?
>>>>>>>            
>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND in
>>>>>> physdev_unmap_pirq.
>>>>>>          
>>>>> That means that Linux didn't call irq_enable on the MSI-X in question:
>>>>>
>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>
>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and calls
>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq) should
>>>>> be IRQ_PT.
>>>>>
>>>>> I don't know if that's a normal condition, but in any case it should
>>>>> not create any problems to physdev_unmap_pirq, in fact the folloing
>>>>> check:
>>>>>
>>>>>       if ( domid == DOMID_SELF || ret )
>>>>>               goto free_domain;
>>>>>
>>>>> should fail so Xen should continue and execute unmap_domain_pirq. That's
>>>>> what we want.
>>>>>        
>>>>  From linux side, request_irq->  request_threaded_irq-> __setup_irq->
>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>> If irq_enable isn't called, how does the driver receive interrupt, I did
>>>> see
>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>      
>>> Good to know
>>>
>>>    
>>>> Could you have a look if there is something wrong in xen side of clearing
>>>> the
>>>> mapping?
>>>>      
>>> What I am saying is that the error you are getting:
>>>
>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>> pt_msix_disable: Unmap msix with pirq 67
>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>
>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>> IRQ_UNBOUND.
>>> So, why are you getting this error? What is failing?
>>> I am ready to believe the problem is in Xen but Without understanding
>>> why you are getting the error it's hard to find a solution.
>>>    
>> I found the reason, you are looking at xen-unstable, I was working with
>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>> That patch set ret to -EINVAL initially. After remove that line, unmap
>> succeed.
>> But we still need below patch to let driver reload succeed everytime. Without
>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>
>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>          return;
>>      }
>>
>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq == -1))
>>          entry->flags = 1;
>>      entry->io_mem[offset] = val;
> Interesting. I don't think this is the proper fix though.
> Does the appended patch change anything?
>
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index 71fa6f0..cd5d9c7 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr)
>       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>       int ret;
>   
> -    if ( !entry->flags )
> +    if ( !entry->flags && ptdev->msix->enabled )
>           return 0;
>   
>       if (!gvec) {
Tested, not work.
If you look at msix_capability_init in kernel, line 707,722, 
dev->msix->enabled is already set when pt_msix_update is called.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-26  4:00                                                       ` Zhenzhong Duan
  (?)
@ 2013-06-26 18:08                                                       ` Stefano Stabellini
  2013-06-27  4:01                                                           ` Zhenzhong Duan
  -1 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-26 18:08 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen

On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
> On 2013-06-26 01:51, Stefano Stabellini wrote:
> > On Tue, 25 Jun 2013, DuanZhenzhong wrote:
> > > Stefano Stabellini wrote:
> > > > Trimming some of the people in CC
> > > > 
> > > > On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> > > >    
> > > > > On 2013-06-20 22:21, Stefano Stabellini wrote:
> > > > >      
> > > > > > On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
> > > > > >        
> > > > > > > On 2013-06-05 20:50, Stefano Stabellini wrote:
> > > > > > >          
> > > > > > > > On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> > > > > > > >            
> > > > > > > > > Stefano Stabellini wrote:
> > > > > > > > > 
> > > > > > > > > On Tue, 21 May 2013, Stefano Stabellini wrote:
> > > > > > > > >      On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > >        Looking at the hypervisor code I couldn't see anything
> > > > > > > > > obviously
> > > > > > > > > wrong.
> > > > > > > > >            I think the culprit is "physdev_unmap_pirq":
> > > > > > > > > 
> > > > > > > > >       if ( is_hvm_domain(d) )
> > > > > > > > >        {
> > > > > > > > >            spin_lock(&d->event_lock);
> > > > > > > > >            gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s,
> > > > > > > > > irq:
> > > > > > > > > %d\n",
> > > > > > > > >                d->domain_id, pirq, domain_pirq_to_emuirq(d,
> > > > > > > > > pirq),
> > > > > > > > >                domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
> > > > > > > > > "unbound" :
> > > > > > > > > "",
> > > > > > > > >                domain_pirq_to_irq(d, pirq));
> > > > > > > > >                                                                                             if
> > > > > > > > > ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > > > > > > > >                ret = unmap_domain_pirq_emuirq(d, pirq);
> > > > > > > > >            spin_unlock(&d->event_lock);
> > > > > > > > >            if ( domid == DOMID_SELF || ret )
> > > > > > > > >                goto free_domain;
> > > > > > > > > 
> > > > > > > > > It always tells me unbound:
> > > > > > > > > 
> > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > (a bit older debug code, so the 'unbound' does not show up
> > > > > > > > > here).
> > > > > > > > > 
> > > > > > > > > Which means that the call to unmap_domain_pirq_emuirq does not
> > > > > > > > > happen.
> > > > > > > > > The checks in unmap_domain_pirq_emuirq also look to be depend
> > > > > > > > > on the code being IRQ_UNBOUND.
> > > > > > > > > 
> > > > > > > > > In other words, all of that code looks to only clear things
> > > > > > > > > when
> > > > > > > > > they are !IRQ_UNBOUND.
> > > > > > > > > 
> > > > > > > > > But the other logic (IRQ_UNBOUND) looks to be missing a
> > > > > > > > > removal
> > > > > > > > > in the radix tree:
> > > > > > > > > 
> > > > > > > > >      if ( emuirq != IRQ_PT )
> > > > > > > > >            radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
> > > > > > > > > emuirq);
> > > > > > > > >                                                                            And
> > > > > > > > > I think that is what is causing the leak - the radix tree
> > > > > > > > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > > > > > > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > > > > > > > >          I think that you are looking in the wrong place.
> > > > > > > > > The issue is that QEMU doesn't call pt_msi_disable in
> > > > > > > > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > > > > > > > > 
> > > > > > > > > The code above is correct as is because it is trying to handle
> > > > > > > > > emulated
> > > > > > > > > IRQs and MSIs, not real passthrough MSIs. They latter are not
> > > > > > > > > added to
> > > > > > > > > that radix tree, see physdev_hvm_map_pirq and
> > > > > > > > > physdev_map_pirq.
> > > > > > > > >        This patch fixes the issue, I have only tested MSI
> > > > > > > > > (MSI-X
> > > > > > > > > completely
> > > > > > > > > untested).
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > > > > > > > index 304c438..079e465 100644
> > > > > > > > > --- a/hw/pass-through.c
> > > > > > > > > +++ b/hw/pass-through.c
> > > > > > > > > @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct
> > > > > > > > > pt_dev
> > > > > > > > > *ptdev,
> > > > > > > > >             ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
> > > > > > > > >         }
> > > > > > > > >         else
> > > > > > > > > -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> > > > > > > > > +    {
> > > > > > > > > +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> > > > > > > > > +            pt_msi_disable(ptdev);
> > > > > > > > > +        }
> > > > > > > > > +    }
> > > > > > > > >           /* pass through MSI_ENABLE bit when no MSI-INTx
> > > > > > > > > translation
> > > > > > > > > */
> > > > > > > > >         if (!ptdev->msi_trans_en) {
> > > > > > > > > @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct
> > > > > > > > > pt_dev
> > > > > > > > > *ptdev,
> > > > > > > > >                 pt_disable_msi_translate(ptdev);
> > > > > > > > >             }
> > > > > > > > >             pt_msix_update(ptdev);
> > > > > > > > > +    } else if (!(*value & PCI_MSIX_ENABLE) &&
> > > > > > > > > ptdev->msix->enabled) {
> > > > > > > > > +        pt_msix_delete(ptdev);
> > > > > > > > >      Hi Stefano,
> > > > > > > > > I made a test with this patch, os reboot when driver reload.
> > > > > > > > > If
> > > > > > > > > use
> > > > > > > > > pt_msix_disable
> > > > > > > > > instead of pt_msix_delete, driver could be reloaded.
> > > > > > > > > But I still see some error in qemu.log and xen console. Seems
> > > > > > > > > four
> > > > > > > > > IRQs
> > > > > > > > > are not freed
> > > > > > > > > when unmap.
> > > > > > > > > --------------first load---------------------------
> > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 103
> > > > > > > > > pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
> > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 102
> > > > > > > > > pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
> > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 101
> > > > > > > > > pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
> > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq = 100
> > > > > > > > > pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
> > > > > > > > > ------------- first unload---------------------------
> > > > > > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > > > pt_msix_disable: Unbind msix with pirq 66, gvec 0
> > > > > > > > > pt_msix_disable: Unmap msix with pirq 66
> > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > > > pt_msix_disable: Unbind msix with pirq 65, gvec 0
> > > > > > > > > pt_msix_disable: Unmap msix with pirq 65
> > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > > > pt_msix_disable: Unbind msix with pirq 64, gvec 0
> > > > > > > > > pt_msix_disable: Unmap msix with pirq 64
> > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > > >              
> > > > > > > > Can you add some printks in Xen (the hypercall name is
> > > > > > > > PHYSDEVOP_unmap_pirq) to figure out exactly why they are
> > > > > > > > failing?
> > > > > > > >            
> > > > > > > Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND
> > > > > > > in
> > > > > > > physdev_unmap_pirq.
> > > > > > >          
> > > > > > That means that Linux didn't call irq_enable on the MSI-X in
> > > > > > question:
> > > > > > 
> > > > > > irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
> > > > > > 
> > > > > > EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
> > > > > > calls
> > > > > > map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
> > > > > > should
> > > > > > be IRQ_PT.
> > > > > > 
> > > > > > I don't know if that's a normal condition, but in any case it should
> > > > > > not create any problems to physdev_unmap_pirq, in fact the folloing
> > > > > > check:
> > > > > > 
> > > > > >       if ( domid == DOMID_SELF || ret )
> > > > > >               goto free_domain;
> > > > > > 
> > > > > > should fail so Xen should continue and execute unmap_domain_pirq.
> > > > > > That's
> > > > > > what we want.
> > > > > >        
> > > > >  From linux side, request_irq->  request_threaded_irq-> __setup_irq->
> > > > > irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
> > > > > If irq_enable isn't called, how does the driver receive interrupt, I
> > > > > did
> > > > > see
> > > > > four interrupts in /proc/interrupt and driver works ok.
> > > > >      
> > > > Good to know
> > > > 
> > > >    
> > > > > Could you have a look if there is something wrong in xen side of
> > > > > clearing
> > > > > the
> > > > > mapping?
> > > > >      
> > > > What I am saying is that the error you are getting:
> > > > 
> > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > pt_msix_disable: Unmap msix with pirq 67
> > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > 
> > > > cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> > > > IRQ_UNBOUND.
> > > > So, why are you getting this error? What is failing?
> > > > I am ready to believe the problem is in Xen but Without understanding
> > > > why you are getting the error it's hard to find a solution.
> > > >    
> > > I found the reason, you are looking at xen-unstable, I was working with
> > > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> > > That patch set ret to -EINVAL initially. After remove that line, unmap
> > > succeed.
> > > But we still need below patch to let driver reload succeed everytime.
> > > Without
> > > that, 1st reload failed, 2nd succeed, 3 failed, ...
> > > 
> > > diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
> > > --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
> > > +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
> > > @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
> > >          return;
> > >      }
> > > 
> > > -    if ( offset != 3 && entry->io_mem[offset] != val )
> > > +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq ==
> > > -1))
> > >          entry->flags = 1;
> > >      entry->io_mem[offset] = val;
> > Interesting. I don't think this is the proper fix though.
> > Does the appended patch change anything?
> > 
> > diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> > index 71fa6f0..cd5d9c7 100644
> > --- a/hw/pt-msi.c
> > +++ b/hw/pt-msi.c
> > @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int
> > entry_nr)
> >       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
> >       int ret;
> >   -    if ( !entry->flags )
> > +    if ( !entry->flags && ptdev->msix->enabled )
> >           return 0;
> >         if (!gvec) {
> Tested, not work.
> If you look at msix_capability_init in kernel, line 707,722,
> dev->msix->enabled is already set when pt_msix_update is called.

Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
pt_msix_update.


I was assuming that you needed to add "|| entry->pirq == -1" because you
needed to pass the check:

    if ( !entry->flags )
        return 0;

at the beginning of pt_msix_update_one. Am I getting it right?

If that is case that I thought that we just needed to make sure that
when ptdev->msix->enabled is still zero then we go through the test in
pt_msix_update_one. Where is the mistake?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-26 18:08                                                       ` Stefano Stabellini
@ 2013-06-27  4:01                                                           ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-27  4:01 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen


On 2013-06-27 02:08, Stefano Stabellini wrote:
> On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-26 01:51, Stefano Stabellini wrote:
>>> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>>>> Stefano Stabellini wrote:
>>>>> Trimming some of the people in CC
>>>>>
>>>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>>>     
>>>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>>>       
>>>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>         
>>>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>>>           
>>>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>             
>>>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>>>       On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>>         Looking at the hypervisor code I couldn't see anything
>>>>>>>>>> obviously
>>>>>>>>>> wrong.
>>>>>>>>>>             I think the culprit is "physdev_unmap_pirq":
>>>>>>>>>>
>>>>>>>>>>        if ( is_hvm_domain(d) )
>>>>>>>>>>         {
>>>>>>>>>>             spin_lock(&d->event_lock);
>>>>>>>>>>             gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s,
>>>>>>>>>> irq:
>>>>>>>>>> %d\n",
>>>>>>>>>>                 d->domain_id, pirq, domain_pirq_to_emuirq(d,
>>>>>>>>>> pirq),
>>>>>>>>>>                 domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
>>>>>>>>>> "unbound" :
>>>>>>>>>> "",
>>>>>>>>>>                 domain_pirq_to_irq(d, pirq));
>>>>>>>>>>                                                                                              if
>>>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>>>                 ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>>>             spin_unlock(&d->event_lock);
>>>>>>>>>>             if ( domid == DOMID_SELF || ret )
>>>>>>>>>>                 goto free_domain;
>>>>>>>>>>
>>>>>>>>>> It always tells me unbound:
>>>>>>>>>>
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (a bit older debug code, so the 'unbound' does not show up
>>>>>>>>>> here).
>>>>>>>>>>
>>>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does not
>>>>>>>>>> happen.
>>>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>>>
>>>>>>>>>> In other words, all of that code looks to only clear things
>>>>>>>>>> when
>>>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>>>
>>>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a
>>>>>>>>>> removal
>>>>>>>>>> in the radix tree:
>>>>>>>>>>
>>>>>>>>>>       if ( emuirq != IRQ_PT )
>>>>>>>>>>             radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>>>> emuirq);
>>>>>>>>>>                                                                             And
>>>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>>>           I think that you are looking in the wrong place.
>>>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>>>
>>>>>>>>>> The code above is correct as is because it is trying to handle
>>>>>>>>>> emulated
>>>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not
>>>>>>>>>> added to
>>>>>>>>>> that radix tree, see physdev_hvm_map_pirq and
>>>>>>>>>> physdev_map_pirq.
>>>>>>>>>>         This patch fixes the issue, I have only tested MSI
>>>>>>>>>> (MSI-X
>>>>>>>>>> completely
>>>>>>>>>> untested).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>>>> index 304c438..079e465 100644
>>>>>>>>>> --- a/hw/pass-through.c
>>>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct
>>>>>>>>>> pt_dev
>>>>>>>>>> *ptdev,
>>>>>>>>>>              ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>          }
>>>>>>>>>>          else
>>>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>> +    {
>>>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>>>> +        }
>>>>>>>>>> +    }
>>>>>>>>>>            /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>>>> translation
>>>>>>>>>> */
>>>>>>>>>>          if (!ptdev->msi_trans_en) {
>>>>>>>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct
>>>>>>>>>> pt_dev
>>>>>>>>>> *ptdev,
>>>>>>>>>>                  pt_disable_msi_translate(ptdev);
>>>>>>>>>>              }
>>>>>>>>>>              pt_msix_update(ptdev);
>>>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>>>> ptdev->msix->enabled) {
>>>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>>>       Hi Stefano,
>>>>>>>>>> I made a test with this patch, os reboot when driver reload.
>>>>>>>>>> If
>>>>>>>>>> use
>>>>>>>>>> pt_msix_disable
>>>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>>>> But I still see some error in qemu.log and xen console. Seems
>>>>>>>>>> four
>>>>>>>>>> IRQs
>>>>>>>>>> are not freed
>>>>>>>>>> when unmap.
>>>>>>>>>> --------------first load---------------------------
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>>>>>>>> ------------- first unload---------------------------
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>>               
>>>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are
>>>>>>>>> failing?
>>>>>>>>>             
>>>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND
>>>>>>>> in
>>>>>>>> physdev_unmap_pirq.
>>>>>>>>           
>>>>>>> That means that Linux didn't call irq_enable on the MSI-X in
>>>>>>> question:
>>>>>>>
>>>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>>>
>>>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
>>>>>>> calls
>>>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
>>>>>>> should
>>>>>>> be IRQ_PT.
>>>>>>>
>>>>>>> I don't know if that's a normal condition, but in any case it should
>>>>>>> not create any problems to physdev_unmap_pirq, in fact the folloing
>>>>>>> check:
>>>>>>>
>>>>>>>        if ( domid == DOMID_SELF || ret )
>>>>>>>                goto free_domain;
>>>>>>>
>>>>>>> should fail so Xen should continue and execute unmap_domain_pirq.
>>>>>>> That's
>>>>>>> what we want.
>>>>>>>         
>>>>>>   From linux side, request_irq->  request_threaded_irq-> __setup_irq->
>>>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>>>> If irq_enable isn't called, how does the driver receive interrupt, I
>>>>>> did
>>>>>> see
>>>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>>>       
>>>>> Good to know
>>>>>
>>>>>     
>>>>>> Could you have a look if there is something wrong in xen side of
>>>>>> clearing
>>>>>> the
>>>>>> mapping?
>>>>>>       
>>>>> What I am saying is that the error you are getting:
>>>>>
>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>
>>>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>>>> IRQ_UNBOUND.
>>>>> So, why are you getting this error? What is failing?
>>>>> I am ready to believe the problem is in Xen but Without understanding
>>>>> why you are getting the error it's hard to find a solution.
>>>>>     
>>>> I found the reason, you are looking at xen-unstable, I was working with
>>>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>>>> That patch set ret to -EINVAL initially. After remove that line, unmap
>>>> succeed.
>>>> But we still need below patch to let driver reload succeed everytime.
>>>> Without
>>>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>>>
>>>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>>>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>>>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>>>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>>>           return;
>>>>       }
>>>>
>>>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>>>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq ==
>>>> -1))
>>>>           entry->flags = 1;
>>>>       entry->io_mem[offset] = val;
>>> Interesting. I don't think this is the proper fix though.
>>> Does the appended patch change anything?
>>>
>>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>>> index 71fa6f0..cd5d9c7 100644
>>> --- a/hw/pt-msi.c
>>> +++ b/hw/pt-msi.c
>>> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int
>>> entry_nr)
>>>        uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>>        int ret;
>>>    -    if ( !entry->flags )
>>> +    if ( !entry->flags && ptdev->msix->enabled )
>>>            return 0;
>>>          if (!gvec) {
>> Tested, not work.
>> If you look at msix_capability_init in kernel, line 707,722,
>> dev->msix->enabled is already set when pt_msix_update is called.
> Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
> dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
> pt_msix_update.
It does.
line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
                  this will set dev->msix->enabled first time
line 722, PCI_MSIX_FLAGS_ENABLE set
                  this trigger call of pt_msix_update
                  then dev->msix->enabled was set a second time
> I was assuming that you needed to add "|| entry->pirq == -1" because you
> needed to pass the check:
>
>      if ( !entry->flags )
>          return 0;
>
> at the beginning of pt_msix_update_one. Am I getting it right?
Right, as entry->pirq is set to -1 when driver unload.
> If that is case that I thought that we just needed to make sure that
> when ptdev->msix->enabled is still zero then we go through the test in
> pt_msix_update_one. Where is the mistake?
If you want to use dev->msix->enabled for checking, below patch could work.
But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE 
any more,
but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK

zduan
***********************************************************
diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
--- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
+++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
@@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
          pt_msix_disable(ptdev);
      }

-    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
+    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value & 
PCI_MSIX_MASK);

      return 0;
  }
diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
--- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
+++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
@@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
      uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
      int ret;

-    if ( !entry->flags )
+    if ( !entry->flags && dev->msix->enabled )
          return 0;

      if (!gvec) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
@ 2013-06-27  4:01                                                           ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-27  4:01 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen


On 2013-06-27 02:08, Stefano Stabellini wrote:
> On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-26 01:51, Stefano Stabellini wrote:
>>> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>>>> Stefano Stabellini wrote:
>>>>> Trimming some of the people in CC
>>>>>
>>>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>>>     
>>>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>>>       
>>>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>         
>>>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>>>           
>>>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>             
>>>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>>>       On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>>         Looking at the hypervisor code I couldn't see anything
>>>>>>>>>> obviously
>>>>>>>>>> wrong.
>>>>>>>>>>             I think the culprit is "physdev_unmap_pirq":
>>>>>>>>>>
>>>>>>>>>>        if ( is_hvm_domain(d) )
>>>>>>>>>>         {
>>>>>>>>>>             spin_lock(&d->event_lock);
>>>>>>>>>>             gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s,
>>>>>>>>>> irq:
>>>>>>>>>> %d\n",
>>>>>>>>>>                 d->domain_id, pirq, domain_pirq_to_emuirq(d,
>>>>>>>>>> pirq),
>>>>>>>>>>                 domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ?
>>>>>>>>>> "unbound" :
>>>>>>>>>> "",
>>>>>>>>>>                 domain_pirq_to_irq(d, pirq));
>>>>>>>>>>                                                                                              if
>>>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>>>                 ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>>>             spin_unlock(&d->event_lock);
>>>>>>>>>>             if ( domid == DOMID_SELF || ret )
>>>>>>>>>>                 goto free_domain;
>>>>>>>>>>
>>>>>>>>>> It always tells me unbound:
>>>>>>>>>>
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>> (a bit older debug code, so the 'unbound' does not show up
>>>>>>>>>> here).
>>>>>>>>>>
>>>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does not
>>>>>>>>>> happen.
>>>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be depend
>>>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>>>
>>>>>>>>>> In other words, all of that code looks to only clear things
>>>>>>>>>> when
>>>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>>>
>>>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a
>>>>>>>>>> removal
>>>>>>>>>> in the radix tree:
>>>>>>>>>>
>>>>>>>>>>       if ( emuirq != IRQ_PT )
>>>>>>>>>>             radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>>>> emuirq);
>>>>>>>>>>                                                                             And
>>>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should check
>>>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>>>           I think that you are looking in the wrong place.
>>>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>>>
>>>>>>>>>> The code above is correct as is because it is trying to handle
>>>>>>>>>> emulated
>>>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are not
>>>>>>>>>> added to
>>>>>>>>>> that radix tree, see physdev_hvm_map_pirq and
>>>>>>>>>> physdev_map_pirq.
>>>>>>>>>>         This patch fixes the issue, I have only tested MSI
>>>>>>>>>> (MSI-X
>>>>>>>>>> completely
>>>>>>>>>> untested).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>>>> index 304c438..079e465 100644
>>>>>>>>>> --- a/hw/pass-through.c
>>>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>>>> @@ -3866,7 +3866,11 @@ static int pt_msgctrl_reg_write(struct
>>>>>>>>>> pt_dev
>>>>>>>>>> *ptdev,
>>>>>>>>>>              ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>          }
>>>>>>>>>>          else
>>>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>> +    {
>>>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>>>> +        }
>>>>>>>>>> +    }
>>>>>>>>>>            /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>>>> translation
>>>>>>>>>> */
>>>>>>>>>>          if (!ptdev->msi_trans_en) {
>>>>>>>>>> @@ -4013,6 +4017,8 @@ static int pt_msixctrl_reg_write(struct
>>>>>>>>>> pt_dev
>>>>>>>>>> *ptdev,
>>>>>>>>>>                  pt_disable_msi_translate(ptdev);
>>>>>>>>>>              }
>>>>>>>>>>              pt_msix_update(ptdev);
>>>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>>>> ptdev->msix->enabled) {
>>>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>>>       Hi Stefano,
>>>>>>>>>> I made a test with this patch, os reboot when driver reload.
>>>>>>>>>> If
>>>>>>>>>> use
>>>>>>>>>> pt_msix_disable
>>>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>>>> But I still see some error in qemu.log and xen console. Seems
>>>>>>>>>> four
>>>>>>>>>> IRQs
>>>>>>>>>> are not freed
>>>>>>>>>> when unmap.
>>>>>>>>>> --------------first load---------------------------
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 103
>>>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec 0
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 102
>>>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec 0
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 101
>>>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec 0
>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq = 100
>>>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec 0
>>>>>>>>>> ------------- first unload---------------------------
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>>>>               
>>>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are
>>>>>>>>> failing?
>>>>>>>>>             
>>>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) = IRQ_UNBOUND
>>>>>>>> in
>>>>>>>> physdev_unmap_pirq.
>>>>>>>>           
>>>>>>> That means that Linux didn't call irq_enable on the MSI-X in
>>>>>>> question:
>>>>>>>
>>>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>>>
>>>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
>>>>>>> calls
>>>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
>>>>>>> should
>>>>>>> be IRQ_PT.
>>>>>>>
>>>>>>> I don't know if that's a normal condition, but in any case it should
>>>>>>> not create any problems to physdev_unmap_pirq, in fact the folloing
>>>>>>> check:
>>>>>>>
>>>>>>>        if ( domid == DOMID_SELF || ret )
>>>>>>>                goto free_domain;
>>>>>>>
>>>>>>> should fail so Xen should continue and execute unmap_domain_pirq.
>>>>>>> That's
>>>>>>> what we want.
>>>>>>>         
>>>>>>   From linux side, request_irq->  request_threaded_irq-> __setup_irq->
>>>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>>>> If irq_enable isn't called, how does the driver receive interrupt, I
>>>>>> did
>>>>>> see
>>>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>>>       
>>>>> Good to know
>>>>>
>>>>>     
>>>>>> Could you have a look if there is something wrong in xen side of
>>>>>> clearing
>>>>>> the
>>>>>> mapping?
>>>>>>       
>>>>> What I am saying is that the error you are getting:
>>>>>
>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>
>>>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>>>> IRQ_UNBOUND.
>>>>> So, why are you getting this error? What is failing?
>>>>> I am ready to believe the problem is in Xen but Without understanding
>>>>> why you are getting the error it's hard to find a solution.
>>>>>     
>>>> I found the reason, you are looking at xen-unstable, I was working with
>>>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>>>> That patch set ret to -EINVAL initially. After remove that line, unmap
>>>> succeed.
>>>> But we still need below patch to let driver reload succeed everytime.
>>>> Without
>>>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>>>
>>>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>>>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>>>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>>>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>>>           return;
>>>>       }
>>>>
>>>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>>>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq ==
>>>> -1))
>>>>           entry->flags = 1;
>>>>       entry->io_mem[offset] = val;
>>> Interesting. I don't think this is the proper fix though.
>>> Does the appended patch change anything?
>>>
>>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>>> index 71fa6f0..cd5d9c7 100644
>>> --- a/hw/pt-msi.c
>>> +++ b/hw/pt-msi.c
>>> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int
>>> entry_nr)
>>>        uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>>        int ret;
>>>    -    if ( !entry->flags )
>>> +    if ( !entry->flags && ptdev->msix->enabled )
>>>            return 0;
>>>          if (!gvec) {
>> Tested, not work.
>> If you look at msix_capability_init in kernel, line 707,722,
>> dev->msix->enabled is already set when pt_msix_update is called.
> Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
> dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
> pt_msix_update.
It does.
line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
                  this will set dev->msix->enabled first time
line 722, PCI_MSIX_FLAGS_ENABLE set
                  this trigger call of pt_msix_update
                  then dev->msix->enabled was set a second time
> I was assuming that you needed to add "|| entry->pirq == -1" because you
> needed to pass the check:
>
>      if ( !entry->flags )
>          return 0;
>
> at the beginning of pt_msix_update_one. Am I getting it right?
Right, as entry->pirq is set to -1 when driver unload.
> If that is case that I thought that we just needed to make sure that
> when ptdev->msix->enabled is still zero then we go through the test in
> pt_msix_update_one. Where is the mistake?
If you want to use dev->msix->enabled for checking, below patch could work.
But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE 
any more,
but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK

zduan
***********************************************************
diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
--- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
+++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
@@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
          pt_msix_disable(ptdev);
      }

-    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
+    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value & 
PCI_MSIX_MASK);

      return 0;
  }
diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
--- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
+++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
@@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
      uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
      int ret;

-    if ( !entry->flags )
+    if ( !entry->flags && dev->msix->enabled )
          return 0;

      if (!gvec) {

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time)
  2013-06-25 11:03                                                       ` Stefano Stabellini
@ 2013-06-27  8:34                                                         ` Jan Beulich
  2013-06-27 10:46                                                           ` Stefano Stabellini
  0 siblings, 1 reply; 58+ messages in thread
From: Jan Beulich @ 2013-06-27  8:34 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Konrad Rzeszutek Wilk, Andrew Cooper, Feng Jin,
	DuanZhenzhong, YuvalShaia, Chien Yen, DavidVrabel

>>> On 25.06.13 at 13:03, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
wrote:
> On Tue, 25 Jun 2013, Stefano Stabellini wrote:
>> On Tue, 25 Jun 2013, Jan Beulich wrote:
>> > >>> On 25.06.13 at 07:33, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
>> > > Stefano Stabellini wrote:
>> > >> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>> > >>> Could you have a look if there is something wrong in xen side of clearing 
>> > >>> the mapping?
>> > >>
>> > >> What I am saying is that the error you are getting:
>> > >>
>> > >> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>> > >> pt_msix_disable: Unmap msix with pirq 67
>> > >> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>> > >>
>> > >> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>> > >> IRQ_UNBOUND.
>> > >> So, why are you getting this error? What is failing?
>> > >> I am ready to believe the problem is in Xen but Without understanding
>> > >> why you are getting the error it's hard to find a solution.
>> > >>   
>> > > I found the reason, you are looking at xen-unstable, I was working with 
>> > > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>> > > That patch set ret to -EINVAL initially. After remove that line, unmap 
>> > > succeed.
>> > 
>> > Removing that line certainly isn't right. The proper fix is the one
>> > below/attached.
>> > 
>> > Jan
>> > 
>> > ****************************************************
>> > x86: fix emuirq regression from XSA-21 fix
>> > 
>> > The XSA-21 patch broke the assumption of "ret" being zero prior to the
>> > IRQ_UNBOUND check.
>> > 
>> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> > 
>> > --- a/xen/arch/x86/physdev.c
>> > +++ b/xen/arch/x86/physdev.c
>> > @@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
>> >          spin_lock(&d->event_lock);
>> >          if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
>> >              ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
>> > +        else
>> > +            ret = 0;
>> >          spin_unlock(&d->event_lock);
>> >          if ( unmap->domid == DOMID_SELF || ret )
>> >              goto free_domain;
>> 
>> 
>> This is unnecessary.
>> ret is 0 regardless because of the previous:
>> 
>> ret = xsm_unmap_domain_pirq(XSM_TARGET, d);
>>
> 
> Ops, I read now the patch for XSA-21.
> The change above is fine (and makes the code more readable anyway).

So is this some odd way of saying Reviewed-by?

Jan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time)
  2013-06-27  8:34                                                         ` Jan Beulich
@ 2013-06-27 10:46                                                           ` Stefano Stabellini
  0 siblings, 0 replies; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-27 10:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Stefano Stabellini, Andrew Cooper,
	Konrad Rzeszutek Wilk, Feng Jin, DuanZhenzhong, YuvalShaia,
	Chien Yen, DavidVrabel

On Thu, 27 Jun 2013, Jan Beulich wrote:
> >>> On 25.06.13 at 13:03, Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> wrote:
> > On Tue, 25 Jun 2013, Stefano Stabellini wrote:
> >> On Tue, 25 Jun 2013, Jan Beulich wrote:
> >> > >>> On 25.06.13 at 07:33, DuanZhenzhong <zhenzhong.duan@oracle.com> wrote:
> >> > > Stefano Stabellini wrote:
> >> > >> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> >> > >>> Could you have a look if there is something wrong in xen side of clearing 
> >> > >>> the mapping?
> >> > >>
> >> > >> What I am saying is that the error you are getting:
> >> > >>
> >> > >> pt_msix_disable: Unbind msix with pirq 67, gvec 0
> >> > >> pt_msix_disable: Unmap msix with pirq 67
> >> > >> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> >> > >>
> >> > >> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> >> > >> IRQ_UNBOUND.
> >> > >> So, why are you getting this error? What is failing?
> >> > >> I am ready to believe the problem is in Xen but Without understanding
> >> > >> why you are getting the error it's hard to find a solution.
> >> > >>   
> >> > > I found the reason, you are looking at xen-unstable, I was working with 
> >> > > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> >> > > That patch set ret to -EINVAL initially. After remove that line, unmap 
> >> > > succeed.
> >> > 
> >> > Removing that line certainly isn't right. The proper fix is the one
> >> > below/attached.
> >> > 
> >> > Jan
> >> > 
> >> > ****************************************************
> >> > x86: fix emuirq regression from XSA-21 fix
> >> > 
> >> > The XSA-21 patch broke the assumption of "ret" being zero prior to the
> >> > IRQ_UNBOUND check.
> >> > 
> >> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> > 
> >> > --- a/xen/arch/x86/physdev.c
> >> > +++ b/xen/arch/x86/physdev.c
> >> > @@ -243,6 +243,8 @@ static int physdev_unmap_pirq(struct phy
> >> >          spin_lock(&d->event_lock);
> >> >          if ( domain_pirq_to_emuirq(d, unmap->pirq) != IRQ_UNBOUND )
> >> >              ret = unmap_domain_pirq_emuirq(d, unmap->pirq);
> >> > +        else
> >> > +            ret = 0;
> >> >          spin_unlock(&d->event_lock);
> >> >          if ( unmap->domid == DOMID_SELF || ret )
> >> >              goto free_domain;
> >> 
> >> 
> >> This is unnecessary.
> >> ret is 0 regardless because of the previous:
> >> 
> >> ret = xsm_unmap_domain_pirq(XSM_TARGET, d);
> >>
> > 
> > Ops, I read now the patch for XSA-21.
> > The change above is fine (and makes the code more readable anyway).
> 
> So is this some odd way of saying Reviewed-by?
 
Yes :)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-27  4:01                                                           ` Zhenzhong Duan
  (?)
@ 2013-06-27 11:52                                                           ` Stefano Stabellini
  2013-06-28  2:33                                                               ` Zhenzhong Duan
  -1 siblings, 1 reply; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-27 11:52 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen

On Thu, 27 Jun 2013, Zhenzhong Duan wrote:
> On 2013-06-27 02:08, Stefano Stabellini wrote:
> > On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
> > > On 2013-06-26 01:51, Stefano Stabellini wrote:
> > > > On Tue, 25 Jun 2013, DuanZhenzhong wrote:
> > > > > Stefano Stabellini wrote:
> > > > > > Trimming some of the people in CC
> > > > > > 
> > > > > > On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> > > > > >     
> > > > > > > On 2013-06-20 22:21, Stefano Stabellini wrote:
> > > > > > >       
> > > > > > > > On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
> > > > > > > >         
> > > > > > > > > On 2013-06-05 20:50, Stefano Stabellini wrote:
> > > > > > > > >           
> > > > > > > > > > On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> > > > > > > > > >             
> > > > > > > > > > > Stefano Stabellini wrote:
> > > > > > > > > > > 
> > > > > > > > > > > On Tue, 21 May 2013, Stefano Stabellini wrote:
> > > > > > > > > > >       On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > > > > > > > >         Looking at the hypervisor code I couldn't see
> > > > > > > > > > > anything
> > > > > > > > > > > obviously
> > > > > > > > > > > wrong.
> > > > > > > > > > >             I think the culprit is "physdev_unmap_pirq":
> > > > > > > > > > > 
> > > > > > > > > > >        if ( is_hvm_domain(d) )
> > > > > > > > > > >         {
> > > > > > > > > > >             spin_lock(&d->event_lock);
> > > > > > > > > > >             gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x
> > > > > > > > > > > %s,
> > > > > > > > > > > irq:
> > > > > > > > > > > %d\n",
> > > > > > > > > > >                 d->domain_id, pirq,
> > > > > > > > > > > domain_pirq_to_emuirq(d,
> > > > > > > > > > > pirq),
> > > > > > > > > > >                 domain_pirq_to_emuirq(d, pirq) ==
> > > > > > > > > > > IRQ_UNBOUND ?
> > > > > > > > > > > "unbound" :
> > > > > > > > > > > "",
> > > > > > > > > > >                 domain_pirq_to_irq(d, pirq));
> > > > > > > > > > >                                                                                              if
> > > > > > > > > > > ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > > > > > > > > > >                 ret = unmap_domain_pirq_emuirq(d, pirq);
> > > > > > > > > > >             spin_unlock(&d->event_lock);
> > > > > > > > > > >             if ( domid == DOMID_SELF || ret )
> > > > > > > > > > >                 goto free_domain;
> > > > > > > > > > > 
> > > > > > > > > > > It always tells me unbound:
> > > > > > > > > > > 
> > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > (a bit older debug code, so the 'unbound' does not show up
> > > > > > > > > > > here).
> > > > > > > > > > > 
> > > > > > > > > > > Which means that the call to unmap_domain_pirq_emuirq does
> > > > > > > > > > > not
> > > > > > > > > > > happen.
> > > > > > > > > > > The checks in unmap_domain_pirq_emuirq also look to be
> > > > > > > > > > > depend
> > > > > > > > > > > on the code being IRQ_UNBOUND.
> > > > > > > > > > > 
> > > > > > > > > > > In other words, all of that code looks to only clear
> > > > > > > > > > > things
> > > > > > > > > > > when
> > > > > > > > > > > they are !IRQ_UNBOUND.
> > > > > > > > > > > 
> > > > > > > > > > > But the other logic (IRQ_UNBOUND) looks to be missing a
> > > > > > > > > > > removal
> > > > > > > > > > > in the radix tree:
> > > > > > > > > > > 
> > > > > > > > > > >       if ( emuirq != IRQ_PT )
> > > > > > > > > > >             radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
> > > > > > > > > > > emuirq);
> > > > > > > > > > >                                                                             And
> > > > > > > > > > > I think that is what is causing the leak - the radix tree
> > > > > > > > > > > needs to be pruned? Or perhaps the allocate_pirq should
> > > > > > > > > > > check
> > > > > > > > > > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > > > > > > > > > >           I think that you are looking in the wrong place.
> > > > > > > > > > > The issue is that QEMU doesn't call pt_msi_disable in
> > > > > > > > > > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > > > > > > > > > > 
> > > > > > > > > > > The code above is correct as is because it is trying to
> > > > > > > > > > > handle
> > > > > > > > > > > emulated
> > > > > > > > > > > IRQs and MSIs, not real passthrough MSIs. They latter are
> > > > > > > > > > > not
> > > > > > > > > > > added to
> > > > > > > > > > > that radix tree, see physdev_hvm_map_pirq and
> > > > > > > > > > > physdev_map_pirq.
> > > > > > > > > > >         This patch fixes the issue, I have only tested MSI
> > > > > > > > > > > (MSI-X
> > > > > > > > > > > completely
> > > > > > > > > > > untested).
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > > > > > > > > > index 304c438..079e465 100644
> > > > > > > > > > > --- a/hw/pass-through.c
> > > > > > > > > > > +++ b/hw/pass-through.c
> > > > > > > > > > > @@ -3866,7 +3866,11 @@ static int
> > > > > > > > > > > pt_msgctrl_reg_write(struct
> > > > > > > > > > > pt_dev
> > > > > > > > > > > *ptdev,
> > > > > > > > > > >              ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
> > > > > > > > > > >          }
> > > > > > > > > > >          else
> > > > > > > > > > > -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> > > > > > > > > > > +    {
> > > > > > > > > > > +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> > > > > > > > > > > +            pt_msi_disable(ptdev);
> > > > > > > > > > > +        }
> > > > > > > > > > > +    }
> > > > > > > > > > >            /* pass through MSI_ENABLE bit when no MSI-INTx
> > > > > > > > > > > translation
> > > > > > > > > > > */
> > > > > > > > > > >          if (!ptdev->msi_trans_en) {
> > > > > > > > > > > @@ -4013,6 +4017,8 @@ static int
> > > > > > > > > > > pt_msixctrl_reg_write(struct
> > > > > > > > > > > pt_dev
> > > > > > > > > > > *ptdev,
> > > > > > > > > > >                  pt_disable_msi_translate(ptdev);
> > > > > > > > > > >              }
> > > > > > > > > > >              pt_msix_update(ptdev);
> > > > > > > > > > > +    } else if (!(*value & PCI_MSIX_ENABLE) &&
> > > > > > > > > > > ptdev->msix->enabled) {
> > > > > > > > > > > +        pt_msix_delete(ptdev);
> > > > > > > > > > >       Hi Stefano,
> > > > > > > > > > > I made a test with this patch, os reboot when driver
> > > > > > > > > > > reload.
> > > > > > > > > > > If
> > > > > > > > > > > use
> > > > > > > > > > > pt_msix_disable
> > > > > > > > > > > instead of pt_msix_delete, driver could be reloaded.
> > > > > > > > > > > But I still see some error in qemu.log and xen console.
> > > > > > > > > > > Seems
> > > > > > > > > > > four
> > > > > > > > > > > IRQs
> > > > > > > > > > > are not freed
> > > > > > > > > > > when unmap.
> > > > > > > > > > > --------------first load---------------------------
> > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq =
> > > > > > > > > > > 103
> > > > > > > > > > > pt_msix_update_one: Update msix entry 0 with pirq 67 gvec
> > > > > > > > > > > 0
> > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq =
> > > > > > > > > > > 102
> > > > > > > > > > > pt_msix_update_one: Update msix entry 1 with pirq 66 gvec
> > > > > > > > > > > 0
> > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq =
> > > > > > > > > > > 101
> > > > > > > > > > > pt_msix_update_one: Update msix entry 2 with pirq 65 gvec
> > > > > > > > > > > 0
> > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq =
> > > > > > > > > > > 100
> > > > > > > > > > > pt_msix_update_one: Update msix entry 3 with pirq 64 gvec
> > > > > > > > > > > 0
> > > > > > > > > > > ------------- first unload---------------------------
> > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > [00:04.0]
> > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 66, gvec 0
> > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 66
> > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > [00:04.0]
> > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 65, gvec 0
> > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 65
> > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > [00:04.0]
> > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 64, gvec 0
> > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 64
> > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > [00:04.0]
> > > > > > > > > > >               
> > > > > > > > > > Can you add some printks in Xen (the hypercall name is
> > > > > > > > > > PHYSDEVOP_unmap_pirq) to figure out exactly why they are
> > > > > > > > > > failing?
> > > > > > > > > >             
> > > > > > > > > Did some test, domain_pirq_to_emuirq(d, unmap->pirq) =
> > > > > > > > > IRQ_UNBOUND
> > > > > > > > > in
> > > > > > > > > physdev_unmap_pirq.
> > > > > > > > >           
> > > > > > > > That means that Linux didn't call irq_enable on the MSI-X in
> > > > > > > > question:
> > > > > > > > 
> > > > > > > > irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
> > > > > > > > 
> > > > > > > > EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
> > > > > > > > calls
> > > > > > > > map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
> > > > > > > > should
> > > > > > > > be IRQ_PT.
> > > > > > > > 
> > > > > > > > I don't know if that's a normal condition, but in any case it
> > > > > > > > should
> > > > > > > > not create any problems to physdev_unmap_pirq, in fact the
> > > > > > > > folloing
> > > > > > > > check:
> > > > > > > > 
> > > > > > > >        if ( domid == DOMID_SELF || ret )
> > > > > > > >                goto free_domain;
> > > > > > > > 
> > > > > > > > should fail so Xen should continue and execute
> > > > > > > > unmap_domain_pirq.
> > > > > > > > That's
> > > > > > > > what we want.
> > > > > > > >         
> > > > > > >   From linux side, request_irq->  request_threaded_irq->
> > > > > > > __setup_irq->
> > > > > > > irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
> > > > > > > If irq_enable isn't called, how does the driver receive interrupt,
> > > > > > > I
> > > > > > > did
> > > > > > > see
> > > > > > > four interrupts in /proc/interrupt and driver works ok.
> > > > > > >       
> > > > > > Good to know
> > > > > > 
> > > > > >     
> > > > > > > Could you have a look if there is something wrong in xen side of
> > > > > > > clearing
> > > > > > > the
> > > > > > > mapping?
> > > > > > >       
> > > > > > What I am saying is that the error you are getting:
> > > > > > 
> > > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > 
> > > > > > cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> > > > > > IRQ_UNBOUND.
> > > > > > So, why are you getting this error? What is failing?
> > > > > > I am ready to believe the problem is in Xen but Without
> > > > > > understanding
> > > > > > why you are getting the error it's hard to find a solution.
> > > > > >     
> > > > > I found the reason, you are looking at xen-unstable, I was working
> > > > > with
> > > > > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> > > > > That patch set ret to -EINVAL initially. After remove that line, unmap
> > > > > succeed.
> > > > > But we still need below patch to let driver reload succeed everytime.
> > > > > Without
> > > > > that, 1st reload failed, 2nd succeed, 3 failed, ...
> > > > > 
> > > > > diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
> > > > > --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
> > > > > +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
> > > > > @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
> > > > >           return;
> > > > >       }
> > > > > 
> > > > > -    if ( offset != 3 && entry->io_mem[offset] != val )
> > > > > +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq
> > > > > ==
> > > > > -1))
> > > > >           entry->flags = 1;
> > > > >       entry->io_mem[offset] = val;
> > > > Interesting. I don't think this is the proper fix though.
> > > > Does the appended patch change anything?
> > > > 
> > > > diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> > > > index 71fa6f0..cd5d9c7 100644
> > > > --- a/hw/pt-msi.c
> > > > +++ b/hw/pt-msi.c
> > > > @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev,
> > > > int
> > > > entry_nr)
> > > >        uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
> > > >        int ret;
> > > >    -    if ( !entry->flags )
> > > > +    if ( !entry->flags && ptdev->msix->enabled )
> > > >            return 0;
> > > >          if (!gvec) {
> > > Tested, not work.
> > > If you look at msix_capability_init in kernel, line 707,722,
> > > dev->msix->enabled is already set when pt_msix_update is called.
> > Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
> > dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
> > pt_msix_update.
> It does.
> line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
>                  this will set dev->msix->enabled first time
> line 722, PCI_MSIX_FLAGS_ENABLE set
>                  this trigger call of pt_msix_update
>                  then dev->msix->enabled was set a second time
> > I was assuming that you needed to add "|| entry->pirq == -1" because you
> > needed to pass the check:
> > 
> >      if ( !entry->flags )
> >          return 0;
> > 
> > at the beginning of pt_msix_update_one. Am I getting it right?
> Right, as entry->pirq is set to -1 when driver unload.
> > If that is case that I thought that we just needed to make sure that
> > when ptdev->msix->enabled is still zero then we go through the test in
> > pt_msix_update_one. Where is the mistake?
> If you want to use dev->msix->enabled for checking, below patch could work.
> But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE any
> more,
> but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK
> 
> zduan
> ***********************************************************
> diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
> --- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
> +++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
> @@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
>          pt_msix_disable(ptdev);
>      }
> 
> -    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
> +    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value &
> PCI_MSIX_MASK);
> 
>      return 0;
>  }
> diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
> --- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
> +++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
> @@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
>      uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>      int ret;
> 
> -    if ( !entry->flags )
> +    if ( !entry->flags && dev->msix->enabled )
>          return 0;
> 
>      if (!gvec) {

I understand now, thanks for the explanation.
Among the two alternatives, I think that your first change is actually
better.

However we can probably still improve it a little bit by setting
entry->flags to 1 directly in pt_msix_disable? So that we don't confuse
the driver reload case from the first driver initialization.

diff --git a/hw/pt-msi.c b/hw/pt-msi.c
index 71fa6f0..cc4e280 100644
--- a/hw/pt-msi.c
+++ b/hw/pt-msi.c
@@ -408,7 +408,7 @@ void pt_msix_disable(struct pt_dev *dev)
         }
         /* clear msi-x info */
         entry->pirq = -1;
-        entry->flags = 0;
+        entry->flags = 1;
     }
 }
 

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-27 11:52                                                           ` Stefano Stabellini
@ 2013-06-28  2:33                                                               ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-28  2:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen


On 2013-06-27 19:52, Stefano Stabellini wrote:
> On Thu, 27 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-27 02:08, Stefano Stabellini wrote:
>>> On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
>>>> On 2013-06-26 01:51, Stefano Stabellini wrote:
>>>>> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>>>>>> Stefano Stabellini wrote:
>>>>>>> Trimming some of the people in CC
>>>>>>>
>>>>>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>      
>>>>>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>>>>>        
>>>>>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>          
>>>>>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>>>>>            
>>>>>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>>>              
>>>>>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>>>>>        On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>>>>          Looking at the hypervisor code I couldn't see
>>>>>>>>>>>> anything
>>>>>>>>>>>> obviously
>>>>>>>>>>>> wrong.
>>>>>>>>>>>>              I think the culprit is "physdev_unmap_pirq":
>>>>>>>>>>>>
>>>>>>>>>>>>         if ( is_hvm_domain(d) )
>>>>>>>>>>>>          {
>>>>>>>>>>>>              spin_lock(&d->event_lock);
>>>>>>>>>>>>              gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x
>>>>>>>>>>>> %s,
>>>>>>>>>>>> irq:
>>>>>>>>>>>> %d\n",
>>>>>>>>>>>>                  d->domain_id, pirq,
>>>>>>>>>>>> domain_pirq_to_emuirq(d,
>>>>>>>>>>>> pirq),
>>>>>>>>>>>>                  domain_pirq_to_emuirq(d, pirq) ==
>>>>>>>>>>>> IRQ_UNBOUND ?
>>>>>>>>>>>> "unbound" :
>>>>>>>>>>>> "",
>>>>>>>>>>>>                  domain_pirq_to_irq(d, pirq));
>>>>>>>>>>>>                                                                                               if
>>>>>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>>>>>                  ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>>>>>              spin_unlock(&d->event_lock);
>>>>>>>>>>>>              if ( domid == DOMID_SELF || ret )
>>>>>>>>>>>>                  goto free_domain;
>>>>>>>>>>>>
>>>>>>>>>>>> It always tells me unbound:
>>>>>>>>>>>>
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (a bit older debug code, so the 'unbound' does not show up
>>>>>>>>>>>> here).
>>>>>>>>>>>>
>>>>>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does
>>>>>>>>>>>> not
>>>>>>>>>>>> happen.
>>>>>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be
>>>>>>>>>>>> depend
>>>>>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> In other words, all of that code looks to only clear
>>>>>>>>>>>> things
>>>>>>>>>>>> when
>>>>>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a
>>>>>>>>>>>> removal
>>>>>>>>>>>> in the radix tree:
>>>>>>>>>>>>
>>>>>>>>>>>>        if ( emuirq != IRQ_PT )
>>>>>>>>>>>>              radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>>>>>> emuirq);
>>>>>>>>>>>>                                                                              And
>>>>>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should
>>>>>>>>>>>> check
>>>>>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>>>>>            I think that you are looking in the wrong place.
>>>>>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>>>>>
>>>>>>>>>>>> The code above is correct as is because it is trying to
>>>>>>>>>>>> handle
>>>>>>>>>>>> emulated
>>>>>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are
>>>>>>>>>>>> not
>>>>>>>>>>>> added to
>>>>>>>>>>>> that radix tree, see physdev_hvm_map_pirq and
>>>>>>>>>>>> physdev_map_pirq.
>>>>>>>>>>>>          This patch fixes the issue, I have only tested MSI
>>>>>>>>>>>> (MSI-X
>>>>>>>>>>>> completely
>>>>>>>>>>>> untested).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>>>>>> index 304c438..079e465 100644
>>>>>>>>>>>> --- a/hw/pass-through.c
>>>>>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>>>>>> @@ -3866,7 +3866,11 @@ static int
>>>>>>>>>>>> pt_msgctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>               ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>>           }
>>>>>>>>>>>>           else
>>>>>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>> +    {
>>>>>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>>>>>> +        }
>>>>>>>>>>>> +    }
>>>>>>>>>>>>             /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>>>>>> translation
>>>>>>>>>>>> */
>>>>>>>>>>>>           if (!ptdev->msi_trans_en) {
>>>>>>>>>>>> @@ -4013,6 +4017,8 @@ static int
>>>>>>>>>>>> pt_msixctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>                   pt_disable_msi_translate(ptdev);
>>>>>>>>>>>>               }
>>>>>>>>>>>>               pt_msix_update(ptdev);
>>>>>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>>>>>> ptdev->msix->enabled) {
>>>>>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>>>>>        Hi Stefano,
>>>>>>>>>>>> I made a test with this patch, os reboot when driver
>>>>>>>>>>>> reload.
>>>>>>>>>>>> If
>>>>>>>>>>>> use
>>>>>>>>>>>> pt_msix_disable
>>>>>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>>>>>> But I still see some error in qemu.log and xen console.
>>>>>>>>>>>> Seems
>>>>>>>>>>>> four
>>>>>>>>>>>> IRQs
>>>>>>>>>>>> are not freed
>>>>>>>>>>>> when unmap.
>>>>>>>>>>>> --------------first load---------------------------
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 103
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 102
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 101
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 100
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> ------------- first unload---------------------------
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>>                
>>>>>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are
>>>>>>>>>>> failing?
>>>>>>>>>>>              
>>>>>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) =
>>>>>>>>>> IRQ_UNBOUND
>>>>>>>>>> in
>>>>>>>>>> physdev_unmap_pirq.
>>>>>>>>>>            
>>>>>>>>> That means that Linux didn't call irq_enable on the MSI-X in
>>>>>>>>> question:
>>>>>>>>>
>>>>>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>>>>>
>>>>>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
>>>>>>>>> calls
>>>>>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
>>>>>>>>> should
>>>>>>>>> be IRQ_PT.
>>>>>>>>>
>>>>>>>>> I don't know if that's a normal condition, but in any case it
>>>>>>>>> should
>>>>>>>>> not create any problems to physdev_unmap_pirq, in fact the
>>>>>>>>> folloing
>>>>>>>>> check:
>>>>>>>>>
>>>>>>>>>         if ( domid == DOMID_SELF || ret )
>>>>>>>>>                 goto free_domain;
>>>>>>>>>
>>>>>>>>> should fail so Xen should continue and execute
>>>>>>>>> unmap_domain_pirq.
>>>>>>>>> That's
>>>>>>>>> what we want.
>>>>>>>>>          
>>>>>>>>    From linux side, request_irq->  request_threaded_irq->
>>>>>>>> __setup_irq->
>>>>>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>>>>>> If irq_enable isn't called, how does the driver receive interrupt,
>>>>>>>> I
>>>>>>>> did
>>>>>>>> see
>>>>>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>>>>>        
>>>>>>> Good to know
>>>>>>>
>>>>>>>      
>>>>>>>> Could you have a look if there is something wrong in xen side of
>>>>>>>> clearing
>>>>>>>> the
>>>>>>>> mapping?
>>>>>>>>        
>>>>>>> What I am saying is that the error you are getting:
>>>>>>>
>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>
>>>>>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>>>>>> IRQ_UNBOUND.
>>>>>>> So, why are you getting this error? What is failing?
>>>>>>> I am ready to believe the problem is in Xen but Without
>>>>>>> understanding
>>>>>>> why you are getting the error it's hard to find a solution.
>>>>>>>      
>>>>>> I found the reason, you are looking at xen-unstable, I was working
>>>>>> with
>>>>>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>>>>>> That patch set ret to -EINVAL initially. After remove that line, unmap
>>>>>> succeed.
>>>>>> But we still need below patch to let driver reload succeed everytime.
>>>>>> Without
>>>>>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>>>>>
>>>>>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>>>>>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>>>>>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>>>>>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>>>>>            return;
>>>>>>        }
>>>>>>
>>>>>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>>>>>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq
>>>>>> ==
>>>>>> -1))
>>>>>>            entry->flags = 1;
>>>>>>        entry->io_mem[offset] = val;
>>>>> Interesting. I don't think this is the proper fix though.
>>>>> Does the appended patch change anything?
>>>>>
>>>>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>>>>> index 71fa6f0..cd5d9c7 100644
>>>>> --- a/hw/pt-msi.c
>>>>> +++ b/hw/pt-msi.c
>>>>> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev,
>>>>> int
>>>>> entry_nr)
>>>>>         uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>>>>         int ret;
>>>>>     -    if ( !entry->flags )
>>>>> +    if ( !entry->flags && ptdev->msix->enabled )
>>>>>             return 0;
>>>>>           if (!gvec) {
>>>> Tested, not work.
>>>> If you look at msix_capability_init in kernel, line 707,722,
>>>> dev->msix->enabled is already set when pt_msix_update is called.
>>> Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
>>> dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
>>> pt_msix_update.
>> It does.
>> line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
>>                   this will set dev->msix->enabled first time
>> line 722, PCI_MSIX_FLAGS_ENABLE set
>>                   this trigger call of pt_msix_update
>>                   then dev->msix->enabled was set a second time
>>> I was assuming that you needed to add "|| entry->pirq == -1" because you
>>> needed to pass the check:
>>>
>>>       if ( !entry->flags )
>>>           return 0;
>>>
>>> at the beginning of pt_msix_update_one. Am I getting it right?
>> Right, as entry->pirq is set to -1 when driver unload.
>>> If that is case that I thought that we just needed to make sure that
>>> when ptdev->msix->enabled is still zero then we go through the test in
>>> pt_msix_update_one. Where is the mistake?
>> If you want to use dev->msix->enabled for checking, below patch could work.
>> But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE any
>> more,
>> but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK
>>
>> zduan
>> ***********************************************************
>> diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
>> --- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
>> +++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
>> @@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
>>           pt_msix_disable(ptdev);
>>       }
>>
>> -    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
>> +    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value &
>> PCI_MSIX_MASK);
>>
>>       return 0;
>>   }
>> diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
>> --- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
>> +++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
>> @@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
>>       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>       int ret;
>>
>> -    if ( !entry->flags )
>> +    if ( !entry->flags && dev->msix->enabled )
>>           return 0;
>>
>>       if (!gvec) {
> I understand now, thanks for the explanation.
> Among the two alternatives, I think that your first change is actually
> better.
>
> However we can probably still improve it a little bit by setting
> entry->flags to 1 directly in pt_msix_disable? So that we don't confuse
> the driver reload case from the first driver initialization.
>
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index 71fa6f0..cc4e280 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -408,7 +408,7 @@ void pt_msix_disable(struct pt_dev *dev)
>           }
>           /* clear msi-x info */
>           entry->pirq = -1;
> -        entry->flags = 0;
> +        entry->flags = 1;
>       }
>   }
Test passed.
But this change set entry->flags in all entrys, dev->msix->total_entries 
count of pirqs are
mapped when driver reload, no matter how many msix entrys driver is 
initializing.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
@ 2013-06-28  2:33                                                               ` Zhenzhong Duan
  0 siblings, 0 replies; 58+ messages in thread
From: Zhenzhong Duan @ 2013-06-28  2:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Konrad Rzeszutek Wilk, David Vrabel, xen-devel, Feng Jin,
	Yuval Shaia, linux-kernel, Chien Yen


On 2013-06-27 19:52, Stefano Stabellini wrote:
> On Thu, 27 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-27 02:08, Stefano Stabellini wrote:
>>> On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
>>>> On 2013-06-26 01:51, Stefano Stabellini wrote:
>>>>> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>>>>>> Stefano Stabellini wrote:
>>>>>>> Trimming some of the people in CC
>>>>>>>
>>>>>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>      
>>>>>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>>>>>        
>>>>>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>          
>>>>>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>>>>>            
>>>>>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>>>              
>>>>>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>>>>>        On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>>>>          Looking at the hypervisor code I couldn't see
>>>>>>>>>>>> anything
>>>>>>>>>>>> obviously
>>>>>>>>>>>> wrong.
>>>>>>>>>>>>              I think the culprit is "physdev_unmap_pirq":
>>>>>>>>>>>>
>>>>>>>>>>>>         if ( is_hvm_domain(d) )
>>>>>>>>>>>>          {
>>>>>>>>>>>>              spin_lock(&d->event_lock);
>>>>>>>>>>>>              gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x
>>>>>>>>>>>> %s,
>>>>>>>>>>>> irq:
>>>>>>>>>>>> %d\n",
>>>>>>>>>>>>                  d->domain_id, pirq,
>>>>>>>>>>>> domain_pirq_to_emuirq(d,
>>>>>>>>>>>> pirq),
>>>>>>>>>>>>                  domain_pirq_to_emuirq(d, pirq) ==
>>>>>>>>>>>> IRQ_UNBOUND ?
>>>>>>>>>>>> "unbound" :
>>>>>>>>>>>> "",
>>>>>>>>>>>>                  domain_pirq_to_irq(d, pirq));
>>>>>>>>>>>>                                                                                               if
>>>>>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>>>>>                  ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>>>>>              spin_unlock(&d->event_lock);
>>>>>>>>>>>>              if ( domid == DOMID_SELF || ret )
>>>>>>>>>>>>                  goto free_domain;
>>>>>>>>>>>>
>>>>>>>>>>>> It always tells me unbound:
>>>>>>>>>>>>
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (a bit older debug code, so the 'unbound' does not show up
>>>>>>>>>>>> here).
>>>>>>>>>>>>
>>>>>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does
>>>>>>>>>>>> not
>>>>>>>>>>>> happen.
>>>>>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be
>>>>>>>>>>>> depend
>>>>>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> In other words, all of that code looks to only clear
>>>>>>>>>>>> things
>>>>>>>>>>>> when
>>>>>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a
>>>>>>>>>>>> removal
>>>>>>>>>>>> in the radix tree:
>>>>>>>>>>>>
>>>>>>>>>>>>        if ( emuirq != IRQ_PT )
>>>>>>>>>>>>              radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>>>>>> emuirq);
>>>>>>>>>>>>                                                                              And
>>>>>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should
>>>>>>>>>>>> check
>>>>>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>>>>>            I think that you are looking in the wrong place.
>>>>>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>>>>>
>>>>>>>>>>>> The code above is correct as is because it is trying to
>>>>>>>>>>>> handle
>>>>>>>>>>>> emulated
>>>>>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are
>>>>>>>>>>>> not
>>>>>>>>>>>> added to
>>>>>>>>>>>> that radix tree, see physdev_hvm_map_pirq and
>>>>>>>>>>>> physdev_map_pirq.
>>>>>>>>>>>>          This patch fixes the issue, I have only tested MSI
>>>>>>>>>>>> (MSI-X
>>>>>>>>>>>> completely
>>>>>>>>>>>> untested).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>>>>>> index 304c438..079e465 100644
>>>>>>>>>>>> --- a/hw/pass-through.c
>>>>>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>>>>>> @@ -3866,7 +3866,11 @@ static int
>>>>>>>>>>>> pt_msgctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>               ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>>           }
>>>>>>>>>>>>           else
>>>>>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>> +    {
>>>>>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>>>>>> +        }
>>>>>>>>>>>> +    }
>>>>>>>>>>>>             /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>>>>>> translation
>>>>>>>>>>>> */
>>>>>>>>>>>>           if (!ptdev->msi_trans_en) {
>>>>>>>>>>>> @@ -4013,6 +4017,8 @@ static int
>>>>>>>>>>>> pt_msixctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>                   pt_disable_msi_translate(ptdev);
>>>>>>>>>>>>               }
>>>>>>>>>>>>               pt_msix_update(ptdev);
>>>>>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>>>>>> ptdev->msix->enabled) {
>>>>>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>>>>>        Hi Stefano,
>>>>>>>>>>>> I made a test with this patch, os reboot when driver
>>>>>>>>>>>> reload.
>>>>>>>>>>>> If
>>>>>>>>>>>> use
>>>>>>>>>>>> pt_msix_disable
>>>>>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>>>>>> But I still see some error in qemu.log and xen console.
>>>>>>>>>>>> Seems
>>>>>>>>>>>> four
>>>>>>>>>>>> IRQs
>>>>>>>>>>>> are not freed
>>>>>>>>>>>> when unmap.
>>>>>>>>>>>> --------------first load---------------------------
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 103
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 102
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 101
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 100
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> ------------- first unload---------------------------
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>>                
>>>>>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are
>>>>>>>>>>> failing?
>>>>>>>>>>>              
>>>>>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) =
>>>>>>>>>> IRQ_UNBOUND
>>>>>>>>>> in
>>>>>>>>>> physdev_unmap_pirq.
>>>>>>>>>>            
>>>>>>>>> That means that Linux didn't call irq_enable on the MSI-X in
>>>>>>>>> question:
>>>>>>>>>
>>>>>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>>>>>
>>>>>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
>>>>>>>>> calls
>>>>>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
>>>>>>>>> should
>>>>>>>>> be IRQ_PT.
>>>>>>>>>
>>>>>>>>> I don't know if that's a normal condition, but in any case it
>>>>>>>>> should
>>>>>>>>> not create any problems to physdev_unmap_pirq, in fact the
>>>>>>>>> folloing
>>>>>>>>> check:
>>>>>>>>>
>>>>>>>>>         if ( domid == DOMID_SELF || ret )
>>>>>>>>>                 goto free_domain;
>>>>>>>>>
>>>>>>>>> should fail so Xen should continue and execute
>>>>>>>>> unmap_domain_pirq.
>>>>>>>>> That's
>>>>>>>>> what we want.
>>>>>>>>>          
>>>>>>>>    From linux side, request_irq->  request_threaded_irq->
>>>>>>>> __setup_irq->
>>>>>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>>>>>> If irq_enable isn't called, how does the driver receive interrupt,
>>>>>>>> I
>>>>>>>> did
>>>>>>>> see
>>>>>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>>>>>        
>>>>>>> Good to know
>>>>>>>
>>>>>>>      
>>>>>>>> Could you have a look if there is something wrong in xen side of
>>>>>>>> clearing
>>>>>>>> the
>>>>>>>> mapping?
>>>>>>>>        
>>>>>>> What I am saying is that the error you are getting:
>>>>>>>
>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>
>>>>>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>>>>>> IRQ_UNBOUND.
>>>>>>> So, why are you getting this error? What is failing?
>>>>>>> I am ready to believe the problem is in Xen but Without
>>>>>>> understanding
>>>>>>> why you are getting the error it's hard to find a solution.
>>>>>>>      
>>>>>> I found the reason, you are looking at xen-unstable, I was working
>>>>>> with
>>>>>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>>>>>> That patch set ret to -EINVAL initially. After remove that line, unmap
>>>>>> succeed.
>>>>>> But we still need below patch to let driver reload succeed everytime.
>>>>>> Without
>>>>>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>>>>>
>>>>>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>>>>>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>>>>>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>>>>>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>>>>>            return;
>>>>>>        }
>>>>>>
>>>>>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>>>>>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq
>>>>>> ==
>>>>>> -1))
>>>>>>            entry->flags = 1;
>>>>>>        entry->io_mem[offset] = val;
>>>>> Interesting. I don't think this is the proper fix though.
>>>>> Does the appended patch change anything?
>>>>>
>>>>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>>>>> index 71fa6f0..cd5d9c7 100644
>>>>> --- a/hw/pt-msi.c
>>>>> +++ b/hw/pt-msi.c
>>>>> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev,
>>>>> int
>>>>> entry_nr)
>>>>>         uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>>>>         int ret;
>>>>>     -    if ( !entry->flags )
>>>>> +    if ( !entry->flags && ptdev->msix->enabled )
>>>>>             return 0;
>>>>>           if (!gvec) {
>>>> Tested, not work.
>>>> If you look at msix_capability_init in kernel, line 707,722,
>>>> dev->msix->enabled is already set when pt_msix_update is called.
>>> Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
>>> dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
>>> pt_msix_update.
>> It does.
>> line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
>>                   this will set dev->msix->enabled first time
>> line 722, PCI_MSIX_FLAGS_ENABLE set
>>                   this trigger call of pt_msix_update
>>                   then dev->msix->enabled was set a second time
>>> I was assuming that you needed to add "|| entry->pirq == -1" because you
>>> needed to pass the check:
>>>
>>>       if ( !entry->flags )
>>>           return 0;
>>>
>>> at the beginning of pt_msix_update_one. Am I getting it right?
>> Right, as entry->pirq is set to -1 when driver unload.
>>> If that is case that I thought that we just needed to make sure that
>>> when ptdev->msix->enabled is still zero then we go through the test in
>>> pt_msix_update_one. Where is the mistake?
>> If you want to use dev->msix->enabled for checking, below patch could work.
>> But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE any
>> more,
>> but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK
>>
>> zduan
>> ***********************************************************
>> diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
>> --- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
>> +++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
>> @@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
>>           pt_msix_disable(ptdev);
>>       }
>>
>> -    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
>> +    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value &
>> PCI_MSIX_MASK);
>>
>>       return 0;
>>   }
>> diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
>> --- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
>> +++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
>> @@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
>>       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>       int ret;
>>
>> -    if ( !entry->flags )
>> +    if ( !entry->flags && dev->msix->enabled )
>>           return 0;
>>
>>       if (!gvec) {
> I understand now, thanks for the explanation.
> Among the two alternatives, I think that your first change is actually
> better.
>
> However we can probably still improve it a little bit by setting
> entry->flags to 1 directly in pt_msix_disable? So that we don't confuse
> the driver reload case from the first driver initialization.
>
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index 71fa6f0..cc4e280 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -408,7 +408,7 @@ void pt_msix_disable(struct pt_dev *dev)
>           }
>           /* clear msi-x info */
>           entry->pirq = -1;
> -        entry->flags = 0;
> +        entry->flags = 1;
>       }
>   }
Test passed.
But this change set entry->flags in all entrys, dev->msix->total_entries 
count of pirqs are
mapped when driver reload, no matter how many msix entrys driver is 
initializing.

zduan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
  2013-06-28  2:33                                                               ` Zhenzhong Duan
  (?)
@ 2013-06-28 11:12                                                               ` Stefano Stabellini
  -1 siblings, 0 replies; 58+ messages in thread
From: Stefano Stabellini @ 2013-06-28 11:12 UTC (permalink / raw)
  To: Zhenzhong Duan
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel, Feng Jin, Yuval Shaia, linux-kernel, Chien Yen

On Fri, 28 Jun 2013, Zhenzhong Duan wrote:
> On 2013-06-27 19:52, Stefano Stabellini wrote:
> > On Thu, 27 Jun 2013, Zhenzhong Duan wrote:
> > > On 2013-06-27 02:08, Stefano Stabellini wrote:
> > > > On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
> > > > > On 2013-06-26 01:51, Stefano Stabellini wrote:
> > > > > > On Tue, 25 Jun 2013, DuanZhenzhong wrote:
> > > > > > > Stefano Stabellini wrote:
> > > > > > > > Trimming some of the people in CC
> > > > > > > > 
> > > > > > > > On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
> > > > > > > >      
> > > > > > > > > On 2013-06-20 22:21, Stefano Stabellini wrote:
> > > > > > > > >        
> > > > > > > > > > On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
> > > > > > > > > >          
> > > > > > > > > > > On 2013-06-05 20:50, Stefano Stabellini wrote:
> > > > > > > > > > >            
> > > > > > > > > > > > On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
> > > > > > > > > > > >              
> > > > > > > > > > > > > Stefano Stabellini wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Tue, 21 May 2013, Stefano Stabellini wrote:
> > > > > > > > > > > > >        On Tue, 21 May 2013, Konrad Rzeszutek Wilk
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >          Looking at the hypervisor code I couldn't see
> > > > > > > > > > > > > anything
> > > > > > > > > > > > > obviously
> > > > > > > > > > > > > wrong.
> > > > > > > > > > > > >              I think the culprit is
> > > > > > > > > > > > > "physdev_unmap_pirq":
> > > > > > > > > > > > > 
> > > > > > > > > > > > >         if ( is_hvm_domain(d) )
> > > > > > > > > > > > >          {
> > > > > > > > > > > > >              spin_lock(&d->event_lock);
> > > > > > > > > > > > >              gdprintk(XENLOG_WARNING,"d%d, pirq: %d is
> > > > > > > > > > > > > %x
> > > > > > > > > > > > > %s,
> > > > > > > > > > > > > irq:
> > > > > > > > > > > > > %d\n",
> > > > > > > > > > > > >                  d->domain_id, pirq,
> > > > > > > > > > > > > domain_pirq_to_emuirq(d,
> > > > > > > > > > > > > pirq),
> > > > > > > > > > > > >                  domain_pirq_to_emuirq(d, pirq) ==
> > > > > > > > > > > > > IRQ_UNBOUND ?
> > > > > > > > > > > > > "unbound" :
> > > > > > > > > > > > > "",
> > > > > > > > > > > > >                  domain_pirq_to_irq(d, pirq));
> > > > > > > > > > > > >                                                                                               if
> > > > > > > > > > > > > ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > > > > > > > > > > > >                  ret = unmap_domain_pirq_emuirq(d,
> > > > > > > > > > > > > pirq);
> > > > > > > > > > > > >              spin_unlock(&d->event_lock);
> > > > > > > > > > > > >              if ( domid == DOMID_SELF || ret )
> > > > > > > > > > > > >                  goto free_domain;
> > > > > > > > > > > > > 
> > > > > > > > > > > > > It always tells me unbound:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > > > > > > > > > > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > > > > > > > > > > > (a bit older debug code, so the 'unbound' does not
> > > > > > > > > > > > > show up
> > > > > > > > > > > > > here).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Which means that the call to unmap_domain_pirq_emuirq
> > > > > > > > > > > > > does
> > > > > > > > > > > > > not
> > > > > > > > > > > > > happen.
> > > > > > > > > > > > > The checks in unmap_domain_pirq_emuirq also look to be
> > > > > > > > > > > > > depend
> > > > > > > > > > > > > on the code being IRQ_UNBOUND.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > In other words, all of that code looks to only clear
> > > > > > > > > > > > > things
> > > > > > > > > > > > > when
> > > > > > > > > > > > > they are !IRQ_UNBOUND.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > But the other logic (IRQ_UNBOUND) looks to be missing
> > > > > > > > > > > > > a
> > > > > > > > > > > > > removal
> > > > > > > > > > > > > in the radix tree:
> > > > > > > > > > > > > 
> > > > > > > > > > > > >        if ( emuirq != IRQ_PT )
> > > > > > > > > > > > >              radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
> > > > > > > > > > > > > emuirq);
> > > > > > > > > > > > >                                                                              And
> > > > > > > > > > > > > I think that is what is causing the leak - the radix
> > > > > > > > > > > > > tree
> > > > > > > > > > > > > needs to be pruned? Or perhaps the allocate_pirq
> > > > > > > > > > > > > should
> > > > > > > > > > > > > check
> > > > > > > > > > > > > the radix tree for IRQ_UNBOUND ones and re-use them?
> > > > > > > > > > > > >            I think that you are looking in the wrong
> > > > > > > > > > > > > place.
> > > > > > > > > > > > > The issue is that QEMU doesn't call pt_msi_disable in
> > > > > > > > > > > > > pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > The code above is correct as is because it is trying
> > > > > > > > > > > > > to
> > > > > > > > > > > > > handle
> > > > > > > > > > > > > emulated
> > > > > > > > > > > > > IRQs and MSIs, not real passthrough MSIs. They latter
> > > > > > > > > > > > > are
> > > > > > > > > > > > > not
> > > > > > > > > > > > > added to
> > > > > > > > > > > > > that radix tree, see physdev_hvm_map_pirq and
> > > > > > > > > > > > > physdev_map_pirq.
> > > > > > > > > > > > >          This patch fixes the issue, I have only
> > > > > > > > > > > > > tested MSI
> > > > > > > > > > > > > (MSI-X
> > > > > > > > > > > > > completely
> > > > > > > > > > > > > untested).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > diff --git a/hw/pass-through.c b/hw/pass-through.c
> > > > > > > > > > > > > index 304c438..079e465 100644
> > > > > > > > > > > > > --- a/hw/pass-through.c
> > > > > > > > > > > > > +++ b/hw/pass-through.c
> > > > > > > > > > > > > @@ -3866,7 +3866,11 @@ static int
> > > > > > > > > > > > > pt_msgctrl_reg_write(struct
> > > > > > > > > > > > > pt_dev
> > > > > > > > > > > > > *ptdev,
> > > > > > > > > > > > >               ptdev->msi->flags |=
> > > > > > > > > > > > > PCI_MSI_FLAGS_ENABLE;
> > > > > > > > > > > > >           }
> > > > > > > > > > > > >           else
> > > > > > > > > > > > > -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
> > > > > > > > > > > > > +    {
> > > > > > > > > > > > > +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
> > > > > > > > > > > > > +            pt_msi_disable(ptdev);
> > > > > > > > > > > > > +        }
> > > > > > > > > > > > > +    }
> > > > > > > > > > > > >             /* pass through MSI_ENABLE bit when no
> > > > > > > > > > > > > MSI-INTx
> > > > > > > > > > > > > translation
> > > > > > > > > > > > > */
> > > > > > > > > > > > >           if (!ptdev->msi_trans_en) {
> > > > > > > > > > > > > @@ -4013,6 +4017,8 @@ static int
> > > > > > > > > > > > > pt_msixctrl_reg_write(struct
> > > > > > > > > > > > > pt_dev
> > > > > > > > > > > > > *ptdev,
> > > > > > > > > > > > >                   pt_disable_msi_translate(ptdev);
> > > > > > > > > > > > >               }
> > > > > > > > > > > > >               pt_msix_update(ptdev);
> > > > > > > > > > > > > +    } else if (!(*value & PCI_MSIX_ENABLE) &&
> > > > > > > > > > > > > ptdev->msix->enabled) {
> > > > > > > > > > > > > +        pt_msix_delete(ptdev);
> > > > > > > > > > > > >        Hi Stefano,
> > > > > > > > > > > > > I made a test with this patch, os reboot when driver
> > > > > > > > > > > > > reload.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > use
> > > > > > > > > > > > > pt_msix_disable
> > > > > > > > > > > > > instead of pt_msix_delete, driver could be reloaded.
> > > > > > > > > > > > > But I still see some error in qemu.log and xen
> > > > > > > > > > > > > console.
> > > > > > > > > > > > > Seems
> > > > > > > > > > > > > four
> > > > > > > > > > > > > IRQs
> > > > > > > > > > > > > are not freed
> > > > > > > > > > > > > when unmap.
> > > > > > > > > > > > > --------------first load---------------------------
> > > > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq
> > > > > > > > > > > > > =
> > > > > > > > > > > > > 103
> > > > > > > > > > > > > pt_msix_update_one: Update msix entry 0 with pirq 67
> > > > > > > > > > > > > gvec
> > > > > > > > > > > > > 0
> > > > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq
> > > > > > > > > > > > > =
> > > > > > > > > > > > > 102
> > > > > > > > > > > > > pt_msix_update_one: Update msix entry 1 with pirq 66
> > > > > > > > > > > > > gvec
> > > > > > > > > > > > > 0
> > > > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq
> > > > > > > > > > > > > =
> > > > > > > > > > > > > 101
> > > > > > > > > > > > > pt_msix_update_one: Update msix entry 2 with pirq 65
> > > > > > > > > > > > > gvec
> > > > > > > > > > > > > 0
> > > > > > > > > > > > > pt_msix_update_one: pt_msix_update_one requested pirq
> > > > > > > > > > > > > =
> > > > > > > > > > > > > 100
> > > > > > > > > > > > > pt_msix_update_one: Update msix entry 3 with pirq 64
> > > > > > > > > > > > > gvec
> > > > > > > > > > > > > 0
> > > > > > > > > > > > > ------------- first unload---------------------------
> > > > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > > > [00:04.0]
> > > > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 66, gvec 0
> > > > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 66
> > > > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > > > [00:04.0]
> > > > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 65, gvec 0
> > > > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 65
> > > > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > > > [00:04.0]
> > > > > > > > > > > > > pt_msix_disable: Unbind msix with pirq 64, gvec 0
> > > > > > > > > > > > > pt_msix_disable: Unmap msix with pirq 64
> > > > > > > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed.
> > > > > > > > > > > > > [00:04.0]
> > > > > > > > > > > > >                
> > > > > > > > > > > > Can you add some printks in Xen (the hypercall name is
> > > > > > > > > > > > PHYSDEVOP_unmap_pirq) to figure out exactly why they are
> > > > > > > > > > > > failing?
> > > > > > > > > > > >              
> > > > > > > > > > > Did some test, domain_pirq_to_emuirq(d, unmap->pirq) =
> > > > > > > > > > > IRQ_UNBOUND
> > > > > > > > > > > in
> > > > > > > > > > > physdev_unmap_pirq.
> > > > > > > > > > >            
> > > > > > > > > > That means that Linux didn't call irq_enable on the MSI-X in
> > > > > > > > > > question:
> > > > > > > > > > 
> > > > > > > > > > irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
> > > > > > > > > > 
> > > > > > > > > > EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen
> > > > > > > > > > and
> > > > > > > > > > calls
> > > > > > > > > > map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d,
> > > > > > > > > > unmap->pirq)
> > > > > > > > > > should
> > > > > > > > > > be IRQ_PT.
> > > > > > > > > > 
> > > > > > > > > > I don't know if that's a normal condition, but in any case
> > > > > > > > > > it
> > > > > > > > > > should
> > > > > > > > > > not create any problems to physdev_unmap_pirq, in fact the
> > > > > > > > > > folloing
> > > > > > > > > > check:
> > > > > > > > > > 
> > > > > > > > > >         if ( domid == DOMID_SELF || ret )
> > > > > > > > > >                 goto free_domain;
> > > > > > > > > > 
> > > > > > > > > > should fail so Xen should continue and execute
> > > > > > > > > > unmap_domain_pirq.
> > > > > > > > > > That's
> > > > > > > > > > what we want.
> > > > > > > > > >          
> > > > > > > > >    From linux side, request_irq->  request_threaded_irq->
> > > > > > > > > __setup_irq->
> > > > > > > > > irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
> > > > > > > > > If irq_enable isn't called, how does the driver receive
> > > > > > > > > interrupt,
> > > > > > > > > I
> > > > > > > > > did
> > > > > > > > > see
> > > > > > > > > four interrupts in /proc/interrupt and driver works ok.
> > > > > > > > >        
> > > > > > > > Good to know
> > > > > > > > 
> > > > > > > >      
> > > > > > > > > Could you have a look if there is something wrong in xen side
> > > > > > > > > of
> > > > > > > > > clearing
> > > > > > > > > the
> > > > > > > > > mapping?
> > > > > > > > >        
> > > > > > > > What I am saying is that the error you are getting:
> > > > > > > > 
> > > > > > > > pt_msix_disable: Unbind msix with pirq 67, gvec 0
> > > > > > > > pt_msix_disable: Unmap msix with pirq 67
> > > > > > > > pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
> > > > > > > > 
> > > > > > > > cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
> > > > > > > > IRQ_UNBOUND.
> > > > > > > > So, why are you getting this error? What is failing?
> > > > > > > > I am ready to believe the problem is in Xen but Without
> > > > > > > > understanding
> > > > > > > > why you are getting the error it's hard to find a solution.
> > > > > > > >      
> > > > > > > I found the reason, you are looking at xen-unstable, I was working
> > > > > > > with
> > > > > > > 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
> > > > > > > That patch set ret to -EINVAL initially. After remove that line,
> > > > > > > unmap
> > > > > > > succeed.
> > > > > > > But we still need below patch to let driver reload succeed
> > > > > > > everytime.
> > > > > > > Without
> > > > > > > that, 1st reload failed, 2nd succeed, 3 failed, ...
> > > > > > > 
> > > > > > > diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
> > > > > > > --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
> > > > > > > +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
> > > > > > > @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
> > > > > > >            return;
> > > > > > >        }
> > > > > > > 
> > > > > > > -    if ( offset != 3 && entry->io_mem[offset] != val )
> > > > > > > +    if ( offset != 3 && (entry->io_mem[offset] != val ||
> > > > > > > entry->pirq
> > > > > > > ==
> > > > > > > -1))
> > > > > > >            entry->flags = 1;
> > > > > > >        entry->io_mem[offset] = val;
> > > > > > Interesting. I don't think this is the proper fix though.
> > > > > > Does the appended patch change anything?
> > > > > > 
> > > > > > diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> > > > > > index 71fa6f0..cd5d9c7 100644
> > > > > > --- a/hw/pt-msi.c
> > > > > > +++ b/hw/pt-msi.c
> > > > > > @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev
> > > > > > *dev,
> > > > > > int
> > > > > > entry_nr)
> > > > > >         uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
> > > > > >         int ret;
> > > > > >     -    if ( !entry->flags )
> > > > > > +    if ( !entry->flags && ptdev->msix->enabled )
> > > > > >             return 0;
> > > > > >           if (!gvec) {
> > > > > Tested, not work.
> > > > > If you look at msix_capability_init in kernel, line 707,722,
> > > > > dev->msix->enabled is already set when pt_msix_update is called.
> > > > Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
> > > > dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
> > > > pt_msix_update.
> > > It does.
> > > line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
> > >                   this will set dev->msix->enabled first time
> > > line 722, PCI_MSIX_FLAGS_ENABLE set
> > >                   this trigger call of pt_msix_update
> > >                   then dev->msix->enabled was set a second time
> > > > I was assuming that you needed to add "|| entry->pirq == -1" because you
> > > > needed to pass the check:
> > > > 
> > > >       if ( !entry->flags )
> > > >           return 0;
> > > > 
> > > > at the beginning of pt_msix_update_one. Am I getting it right?
> > > Right, as entry->pirq is set to -1 when driver unload.
> > > > If that is case that I thought that we just needed to make sure that
> > > > when ptdev->msix->enabled is still zero then we go through the test in
> > > > pt_msix_update_one. Where is the mistake?
> > > If you want to use dev->msix->enabled for checking, below patch could
> > > work.
> > > But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE any
> > > more,
> > > but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK
> > > 
> > > zduan
> > > ***********************************************************
> > > diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
> > > --- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
> > > +++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
> > > @@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
> > >           pt_msix_disable(ptdev);
> > >       }
> > > 
> > > -    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
> > > +    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value &
> > > PCI_MSIX_MASK);
> > > 
> > >       return 0;
> > >   }
> > > diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
> > > --- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
> > > +++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
> > > @@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
> > >       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
> > >       int ret;
> > > 
> > > -    if ( !entry->flags )
> > > +    if ( !entry->flags && dev->msix->enabled )
> > >           return 0;
> > > 
> > >       if (!gvec) {
> > I understand now, thanks for the explanation.
> > Among the two alternatives, I think that your first change is actually
> > better.
> > 
> > However we can probably still improve it a little bit by setting
> > entry->flags to 1 directly in pt_msix_disable? So that we don't confuse
> > the driver reload case from the first driver initialization.
> > 
> > diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> > index 71fa6f0..cc4e280 100644
> > --- a/hw/pt-msi.c
> > +++ b/hw/pt-msi.c
> > @@ -408,7 +408,7 @@ void pt_msix_disable(struct pt_dev *dev)
> >           }
> >           /* clear msi-x info */
> >           entry->pirq = -1;
> > -        entry->flags = 0;
> > +        entry->flags = 1;
> >       }
> >   }
> Test passed.
> But this change set entry->flags in all entrys, dev->msix->total_entries count
> of pirqs are
> mapped when driver reload, no matter how many msix entrys driver is
> initializing.

OK, you convinced me, let's go with your initial solution to this
problem :-)

Can you please resend a patch for qemu-xen-traditional and qemu-xen to
xen-devel with all the changes?

Thanks!

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2013-06-28 11:12 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-08  8:18 [PATCH] xen: reuse the same pirq allocated when driver load first time Zhenzhong Duan
2013-05-10 18:53 ` Konrad Rzeszutek Wilk
2013-05-13  7:44   ` Zhenzhong Duan
2013-05-13 11:06   ` Stefano Stabellini
2013-05-13 14:07     ` Konrad Rzeszutek Wilk
2013-05-13 14:50       ` Stefano Stabellini
2013-05-13 16:17         ` Konrad Rzeszutek Wilk
2013-05-13 17:24           ` Stefano Stabellini
2013-05-13 18:20             ` Konrad Rzeszutek Wilk
2013-05-14 13:49               ` Stefano Stabellini
2013-05-14 14:20                 ` Konrad Rzeszutek Wilk
2013-05-15  9:41                   ` Stefano Stabellini
2013-05-15 14:18                     ` Zhenzhong Duan
2013-05-17  2:22                     ` Zhenzhong Duan
2013-05-20 10:24                       ` Stefano Stabellini
2013-05-20 15:24                         ` Konrad Rzeszutek Wilk
2013-05-20 17:57                         ` Konrad Rzeszutek Wilk
2013-05-20 20:38                           ` Konrad Rzeszutek Wilk
2013-05-21 10:07                             ` [Xen-devel] " David Vrabel
2013-05-21 13:40                               ` Konrad Rzeszutek Wilk
2013-05-21 16:51                                 ` Stefano Stabellini
2013-05-21 20:42                                   ` Konrad Rzeszutek Wilk
2013-05-21 21:50                                     ` Stefano Stabellini
2013-05-21 22:41                                       ` Konrad Rzeszutek Wilk
2013-05-22  9:37                                         ` Jan Beulich
2013-05-22 15:14                                           ` Konrad Rzeszutek Wilk
2013-05-22 15:25                                             ` Jan Beulich
2013-05-22 16:41                                               ` Konrad Rzeszutek Wilk
2013-05-23  6:31                                                 ` Jan Beulich
2013-05-29 17:50                                   ` Stefano Stabellini
2013-05-30 17:48                                     ` Konrad Rzeszutek Wilk
2013-06-05  5:27                                     ` Zhenzhong Duan
2013-06-05 12:50                                       ` [Xen-devel] " Stefano Stabellini
2013-06-20  2:57                                         ` Zhenzhong Duan
2013-06-20 14:21                                           ` Stefano Stabellini
2013-06-24  7:19                                             ` Zhenzhong Duan
2013-06-24  7:19                                               ` Zhenzhong Duan
2013-06-24 17:18                                               ` Stefano Stabellini
2013-06-25  5:33                                                 ` DuanZhenzhong
2013-06-25  5:33                                                   ` DuanZhenzhong
2013-06-25  7:21                                                   ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Jan Beulich
2013-06-25  7:44                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix DuanZhenzhong
2013-06-25  8:36                                                       ` Jan Beulich
2013-06-25  8:43                                                         ` DuanZhenzhong
2013-06-25 10:56                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Stefano Stabellini
2013-06-25 11:03                                                       ` Stefano Stabellini
2013-06-27  8:34                                                         ` Jan Beulich
2013-06-27 10:46                                                           ` Stefano Stabellini
2013-06-25 17:51                                                   ` [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time Stefano Stabellini
2013-06-26  4:00                                                     ` Zhenzhong Duan
2013-06-26  4:00                                                       ` Zhenzhong Duan
2013-06-26 18:08                                                       ` Stefano Stabellini
2013-06-27  4:01                                                         ` Zhenzhong Duan
2013-06-27  4:01                                                           ` Zhenzhong Duan
2013-06-27 11:52                                                           ` Stefano Stabellini
2013-06-28  2:33                                                             ` Zhenzhong Duan
2013-06-28  2:33                                                               ` Zhenzhong Duan
2013-06-28 11:12                                                               ` Stefano Stabellini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.