All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH][RFC]  pv-ops: fix shared irq device passthrough
@ 2009-10-23 11:39 Han, Weidong
  2009-10-26 20:27 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 17+ messages in thread
From: Han, Weidong @ 2009-10-23 11:39 UTC (permalink / raw)
  To: 'xen-devel@lists.xensource.com'
  Cc: 'Jeremy Fitzhardinge', 'keir.fraser@eu.citrix.com'

[-- Attachment #1: Type: text/plain, Size: 2204 bytes --]

>From 9c91f23076c555690488cbd81f889f279d4cf2fa Mon Sep 17 00:00:00 2001
From: Weidong Han <weidong.han@intel.com>
Date: Sat, 24 Oct 2009 03:18:30 +0800
Subject: [PATCH] pv-ops: fix shared irq device passthrough

In driver/xen/events.c, whether bind_pirq is shareable or not is
determined by desc->action is NULL or not. But in __setup_irq,
startup(irq) is invoked before desc->action is assigned with
new action. So desc->action in startup_irq is alwasy NULL, and
bind_pirq is always not shareable. This results in pt_irq_create_bind
failure when passthrough a device which shares irq to other devices.

This patch move desc->action before startup(irq), therefore fix
the problem.

Signed-off-by: Weidong Han <weidong.han@intel.com>
---
 kernel/irq/manage.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 07a11dc..3b85b72 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -565,6 +565,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 {
 	struct irqaction *old, **old_ptr;
 	const char *old_name = NULL;
+	int old_irq;
 	unsigned long flags;
 	int shared = 0;
 	int ret;
@@ -644,6 +645,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 		shared = 1;
 	}
 
+	old = *old_ptr;
+	old_irq = new->irq;
+	new->irq = irq;
+	*old_ptr = new;
+
 	if (!shared) {
 		irq_chip_set_defaults(desc->chip);
 
@@ -654,8 +660,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 			ret = __irq_set_trigger(desc, irq,
 					new->flags & IRQF_TRIGGER_MASK);
 
-			if (ret)
+			if (ret) {
+				new->irq = old_irq;
+				*old_ptr = old;
 				goto out_thread;
+			}
 		} else
 			compat_irq_chip_set_default_handler(desc);
 #if defined(CONFIG_IRQ_PER_CPU)
@@ -690,9 +699,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 				(int)(new->flags & IRQF_TRIGGER_MASK));
 	}
 
-	new->irq = irq;
-	*old_ptr = new;
-
 	/* Reset broken irq detection when installing new handler */
 	desc->irq_count = 0;
 	desc->irqs_unhandled = 0;
-- 
1.6.0.4

[-- Attachment #2: 0001-pv-ops-fix-shared-irq-device-passthrough.patch --]
[-- Type: application/octet-stream, Size: 2135 bytes --]

From 9c91f23076c555690488cbd81f889f279d4cf2fa Mon Sep 17 00:00:00 2001
From: Weidong Han <weidong.han@intel.com>
Date: Sat, 24 Oct 2009 03:18:30 +0800
Subject: [PATCH] pv-ops: fix shared irq device passthrough

In driver/xen/events.c, whether bind_pirq is shareable or not is
determined by desc->action is NULL or not. But in __setup_irq,
startup(irq) is invoked before desc->action is assigned with
new action. So desc->action in startup_irq is alwasy NULL, and
bind_pirq is always not shareable. This results in pt_irq_create_bind
failure when passthrough a device which shares irq to other devices.

This patch move desc->action before startup(irq), therefore fix
the problem.

Signed-off-by: Weidong Han <weidong.han@intel.com>
---
 kernel/irq/manage.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 07a11dc..3b85b72 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -565,6 +565,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 {
 	struct irqaction *old, **old_ptr;
 	const char *old_name = NULL;
+	int old_irq;
 	unsigned long flags;
 	int shared = 0;
 	int ret;
@@ -644,6 +645,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 		shared = 1;
 	}
 
+	old = *old_ptr;
+	old_irq = new->irq;
+	new->irq = irq;
+	*old_ptr = new;
+
 	if (!shared) {
 		irq_chip_set_defaults(desc->chip);
 
@@ -654,8 +660,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 			ret = __irq_set_trigger(desc, irq,
 					new->flags & IRQF_TRIGGER_MASK);
 
-			if (ret)
+			if (ret) {
+				new->irq = old_irq;
+				*old_ptr = old;
 				goto out_thread;
+			}
 		} else
 			compat_irq_chip_set_default_handler(desc);
 #if defined(CONFIG_IRQ_PER_CPU)
@@ -690,9 +699,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 				(int)(new->flags & IRQF_TRIGGER_MASK));
 	}
 
-	new->irq = irq;
-	*old_ptr = new;
-
 	/* Reset broken irq detection when installing new handler */
 	desc->irq_count = 0;
 	desc->irqs_unhandled = 0;
-- 
1.6.0.4


[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-10-23 11:39 [PATCH][RFC] pv-ops: fix shared irq device passthrough Han, Weidong
@ 2009-10-26 20:27 ` Jeremy Fitzhardinge
  2009-10-27  1:32   ` Han, Weidong
  0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-10-26 20:27 UTC (permalink / raw)
  To: Han, Weidong
  Cc: 'xen-devel@lists.xensource.com',
	'keir.fraser@eu.citrix.com'

On 10/23/09 04:39, Han, Weidong wrote:
> From 9c91f23076c555690488cbd81f889f279d4cf2fa Mon Sep 17 00:00:00 2001
> From: Weidong Han <weidong.han@intel.com>
> Date: Sat, 24 Oct 2009 03:18:30 +0800
> Subject: [PATCH] pv-ops: fix shared irq device passthrough
>
> In driver/xen/events.c, whether bind_pirq is shareable or not is
> determined by desc->action is NULL or not. But in __setup_irq,
> startup(irq) is invoked before desc->action is assigned with
> new action. So desc->action in startup_irq is alwasy NULL, and
> bind_pirq is always not shareable. This results in pt_irq_create_bind
> failure when passthrough a device which shares irq to other devices.
>
> This patch move desc->action before startup(irq), therefore fix
> the problem.
>   

Hm, not sure about this.  The logic in __setup_irq already looks pretty
tortured, and I'd like to avoid touching it unless there's definitively
a bug in there.

I think the right question is whether probe_irq() is really doing the
right test, and whether it could do something else instead?

    J

> Signed-off-by: Weidong Han <weidong.han@intel.com>
> ---
>  kernel/irq/manage.c |   14 ++++++++++----
>  1 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 07a11dc..3b85b72 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -565,6 +565,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>  {
>  	struct irqaction *old, **old_ptr;
>  	const char *old_name = NULL;
> +	int old_irq;
>  	unsigned long flags;
>  	int shared = 0;
>  	int ret;
> @@ -644,6 +645,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>  		shared = 1;
>  	}
>  
> +	old = *old_ptr;
> +	old_irq = new->irq;
> +	new->irq = irq;
> +	*old_ptr = new;
> +
>  	if (!shared) {
>  		irq_chip_set_defaults(desc->chip);
>  
> @@ -654,8 +660,11 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>  			ret = __irq_set_trigger(desc, irq,
>  					new->flags & IRQF_TRIGGER_MASK);
>  
> -			if (ret)
> +			if (ret) {
> +				new->irq = old_irq;
> +				*old_ptr = old;
>  				goto out_thread;
> +			}
>  		} else
>  			compat_irq_chip_set_default_handler(desc);
>  #if defined(CONFIG_IRQ_PER_CPU)
> @@ -690,9 +699,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>  				(int)(new->flags & IRQF_TRIGGER_MASK));
>  	}
>  
> -	new->irq = irq;
> -	*old_ptr = new;
> -
>  	/* Reset broken irq detection when installing new handler */
>  	desc->irq_count = 0;
>  	desc->irqs_unhandled = 0;
>   

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-10-26 20:27 ` Jeremy Fitzhardinge
@ 2009-10-27  1:32   ` Han, Weidong
  2009-10-29 22:56     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 17+ messages in thread
From: Han, Weidong @ 2009-10-27  1:32 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge'
  Cc: 'xen-devel@lists.xensource.com',
	'keir.fraser@eu.citrix.com'

Jeremy Fitzhardinge wrote:
> On 10/23/09 04:39, Han, Weidong wrote:
>> From 9c91f23076c555690488cbd81f889f279d4cf2fa Mon Sep 17 00:00:00
>> 2001 
>> From: Weidong Han <weidong.han@intel.com>
>> Date: Sat, 24 Oct 2009 03:18:30 +0800
>> Subject: [PATCH] pv-ops: fix shared irq device passthrough
>> 
>> In driver/xen/events.c, whether bind_pirq is shareable or not is
>> determined by desc->action is NULL or not. But in __setup_irq,
>> startup(irq) is invoked before desc->action is assigned with
>> new action. So desc->action in startup_irq is alwasy NULL, and
>> bind_pirq is always not shareable. This results in pt_irq_create_bind
>> failure when passthrough a device which shares irq to other devices.
>> 
>> This patch move desc->action before startup(irq), therefore fix
>> the problem.
>> 
> 
> Hm, not sure about this.  The logic in __setup_irq already looks
> pretty tortured, and I'd like to avoid touching it unless there's
> definitively a bug in there.
> 
> I think the right question is whether probe_irq() is really doing the
> right test, and whether it could do something else instead?

Totally agree. The better way is to change probe_irq, therefore needn't touch kernel irq stuffs. probe_irq is just check if desc->action == NULL or not. The code is there for a long long time (from c/s 26 in linux-2.6.18-xen.hg). I don't know whether it can be replaced. 

Regards,
Weidong

> 
>     J
> 
>> Signed-off-by: Weidong Han <weidong.han@intel.com>
>> ---
>>  kernel/irq/manage.c |   14 ++++++++++----
>>  1 files changed, 10 insertions(+), 4 deletions(-)
>> 
>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>> index 07a11dc..3b85b72 100644
>> --- a/kernel/irq/manage.c
>> +++ b/kernel/irq/manage.c
>> @@ -565,6 +565,7 @@ __setup_irq(unsigned int irq, struct irq_desc
>>  	*desc, struct irqaction *new)  { struct irqaction *old, **old_ptr;
>>  	const char *old_name = NULL;
>> +	int old_irq;
>>  	unsigned long flags;
>>  	int shared = 0;
>>  	int ret;
>> @@ -644,6 +645,11 @@ __setup_irq(unsigned int irq, struct irq_desc
>>  	*desc, struct irqaction *new)  		shared = 1; }
>> 
>> +	old = *old_ptr;
>> +	old_irq = new->irq;
>> +	new->irq = irq;
>> +	*old_ptr = new;
>> +
>>  	if (!shared) {
>>  		irq_chip_set_defaults(desc->chip);
>> 
>> @@ -654,8 +660,11 @@ __setup_irq(unsigned int irq, struct irq_desc
>>  			*desc, struct irqaction *new) ret = __irq_set_trigger(desc, irq,
>>  					new->flags & IRQF_TRIGGER_MASK);
>> 
>> -			if (ret)
>> +			if (ret) {
>> +				new->irq = old_irq;
>> +				*old_ptr = old;
>>  				goto out_thread;
>> +			}
>>  		} else
>>  			compat_irq_chip_set_default_handler(desc);
>>  #if defined(CONFIG_IRQ_PER_CPU)
>> @@ -690,9 +699,6 @@ __setup_irq(unsigned int irq, struct irq_desc
>>  				*desc, struct irqaction *new) (int)(new->flags &
>>  	IRQF_TRIGGER_MASK)); }
>> 
>> -	new->irq = irq;
>> -	*old_ptr = new;
>> -
>>  	/* Reset broken irq detection when installing new handler */ 
>>  	desc->irq_count = 0; desc->irqs_unhandled = 0;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-10-27  1:32   ` Han, Weidong
@ 2009-10-29 22:56     ` Jeremy Fitzhardinge
  2009-10-30  9:29       ` Han, Weidong
  0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-10-29 22:56 UTC (permalink / raw)
  To: Han, Weidong
  Cc: 'xen-devel@lists.xensource.com',
	'keir.fraser@eu.citrix.com'

On 10/26/09 18:32, Han, Weidong wrote:
>> Hm, not sure about this.  The logic in __setup_irq already looks
>> pretty tortured, and I'd like to avoid touching it unless there's
>> definitively a bug in there.
>>
>> I think the right question is whether probe_irq() is really doing the
>> right test, and whether it could do something else instead?
>>     
> Totally agree. The better way is to change probe_irq, therefore needn't touch kernel irq stuffs. probe_irq is just check if desc->action == NULL or not. The code is there for a long long time (from c/s 26 in linux-2.6.18-xen.hg). I don't know whether it can be replaced. 
>   

Could you look into it?  How many drivers do interrupt probing these
days anyway?  I think we can safely not support ISA devices.

    J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-10-29 22:56     ` Jeremy Fitzhardinge
@ 2009-10-30  9:29       ` Han, Weidong
  2009-10-30 21:34         ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 17+ messages in thread
From: Han, Weidong @ 2009-10-30  9:29 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge'
  Cc: 'xen-devel@lists.xensource.com',
	'keir.fraser@eu.citrix.com'

Jeremy Fitzhardinge wrote:
> On 10/26/09 18:32, Han, Weidong wrote:
>>> Hm, not sure about this.  The logic in __setup_irq already looks
>>> pretty tortured, and I'd like to avoid touching it unless there's
>>> definitively a bug in there. 
>>> 
>>> I think the right question is whether probe_irq() is really doing
>>> the right test, and whether it could do something else instead?
>>> 
>> Totally agree. The better way is to change probe_irq, therefore
>> needn't touch kernel irq stuffs. probe_irq is just check if
>> desc->action == NULL or not. The code is there for a long long time
>> (from c/s 26 in linux-2.6.18-xen.hg). I don't know whether it can be
>> replaced.    
>> 
> 
> Could you look into it?  How many drivers do interrupt probing these
> days anyway?  I think we can safely not support ISA devices.
> 
>     J

All devices will call probing_irq in startup_pirq, which bind pirq to event channel. But now probing_irq always returns false, then all pirqs are not shareable. In my system, a PCI NIC, an USB controller and an IDE interface device share the same IRQ 18. Because above reason, pirq 18 is set not shareable. So when I hide the PCI NIC, and assign it to guest, it fails in guest_bind_pirq, therefore the PCI NIC in guest cannot work, even impact the devices who share the IRQ 18.

If don't want to change __setup_irq code like my patch does, current probing_irq is useless (always return false). The problem is there is almost no information in desc can be used in probing_irq if desc->action is NULL. Keir, do you have any ideas?

Regards,
Weidong

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-10-30  9:29       ` Han, Weidong
@ 2009-10-30 21:34         ` Jeremy Fitzhardinge
  2009-10-31 13:50           ` Han, Weidong
       [not found]           ` <715D42877B251141A38726ABF5CABF2C055064A406@pdsmsx503.ccr.corp.intel.com>
  0 siblings, 2 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-10-30 21:34 UTC (permalink / raw)
  To: Han, Weidong
  Cc: 'xen-devel@lists.xensource.com',
	'keir.fraser@eu.citrix.com'

On 10/30/09 02:29, Han, Weidong wrote:
> All devices will call probing_irq in startup_pirq, which bind pirq to event channel. But now probing_irq always returns false, then all pirqs are not shareable. In my system, a PCI NIC, an USB controller and an IDE interface device share the same IRQ 18. Because above reason, pirq 18 is set not shareable. So when I hide the PCI NIC, and assign it to guest, it fails in guest_bind_pirq, therefore the PCI NIC in guest cannot work, even impact the devices who share the IRQ 18.
>
> If don't want to change __setup_irq code like my patch does, current probing_irq is useless (always return false). The problem is there is almost no information in desc can be used in probing_irq if desc->action is NULL. Keir, do you have any ideas?
>   

I think the intent of probing_irq is to detect irqs which are being used
to probe for an interrupt during driver initialization.  This should
only be used for things like ISA which don't have any way to explicitly
find out what irq a device is attached to.

Given that ISA devices aren't very interesting and no driver should need
to do that kind of probing under Xen, I wonder if we can't just remove
the whole test?

    J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-10-30 21:34         ` Jeremy Fitzhardinge
@ 2009-10-31 13:50           ` Han, Weidong
       [not found]           ` <715D42877B251141A38726ABF5CABF2C055064A406@pdsmsx503.ccr.corp.intel.com>
  1 sibling, 0 replies; 17+ messages in thread
From: Han, Weidong @ 2009-10-31 13:50 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge'
  Cc: 'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'

Jeremy Fitzhardinge wrote:
> On 10/30/09 02:29, Han, Weidong wrote:
>> All devices will call probing_irq in startup_pirq, which bind pirq
>> to event channel. But now probing_irq always returns false, then all
>> pirqs are not shareable. In my system, a PCI NIC, an USB controller
>> and an IDE interface device share the same IRQ 18. Because above
>> reason, pirq 18 is set not shareable. So when I hide the PCI NIC,
>> and assign it to guest, it fails in guest_bind_pirq, therefore the
>> PCI NIC in guest cannot work, even impact the devices who share the
>> IRQ 18.       
>> 
>> If don't want to change __setup_irq code like my patch does, current
>> probing_irq is useless (always return false). The problem is there
>> is almost no information in desc can be used in probing_irq if
>> desc->action is NULL. Keir, do you have any ideas?   
>> 
> 
> I think the intent of probing_irq is to detect irqs which are being
> used to probe for an interrupt during driver initialization.  This
> should only be used for things like ISA which don't have any way to
> explicitly find out what irq a device is attached to.
> 
> Given that ISA devices aren't very interesting and no driver should
> need to do that kind of probing under Xen, I wonder if we can't just
> remove the whole test?
> 
>     J

Not only ISA devices, but also PCI devices will use probing_irq. ISA devices are indeed not interesting, VT-d only assigns PCI devices. In fact, Sharing interrupt between assigned devices and host devices is not happy situation. We put much effort to make it work long time ago. If there is really no good approach, one alternate is add a limit of device assignment: don't allow assign PCI devices which share interrupt with other devices.

Regards,
Weidong

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
       [not found]           ` <715D42877B251141A38726ABF5CABF2C055064A406@pdsmsx503.ccr.corp.intel.com>
@ 2009-11-02 10:12             ` Han, Weidong
  2009-11-10  6:12               ` Han, Weidong
  0 siblings, 1 reply; 17+ messages in thread
From: Han, Weidong @ 2009-11-02 10:12 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge'
  Cc: 'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'

[-- Attachment #1: Type: text/plain, Size: 2266 bytes --]

Hi Jeremy,

Instead of changing kernel __setup_irq and use probing_irq to determine if pirq is shareable or not, I changed to set shareable flag in irq_info according to trigger mode in xen_allocate_pirq. Set level triggered interrupts shareable. This patch doesn't touch kernel IRQ code, it only changes xen related code. Do you think it is reasonable? Attached the patch.

Regards,
Weidong

Han, Weidong wrote:
> Jeremy Fitzhardinge wrote:
>> On 10/30/09 02:29, Han, Weidong wrote:
>>> All devices will call probing_irq in startup_pirq, which bind pirq
>>> to event channel. But now probing_irq always returns false, then all
>>> pirqs are not shareable. In my system, a PCI NIC, an USB controller
>>> and an IDE interface device share the same IRQ 18. Because above
>>> reason, pirq 18 is set not shareable. So when I hide the PCI NIC,
>>> and assign it to guest, it fails in guest_bind_pirq, therefore the
>>> PCI NIC in guest cannot work, even impact the devices who share the
>>> IRQ 18. 
>>> 
>>> If don't want to change __setup_irq code like my patch does, current
>>> probing_irq is useless (always return false). The problem is there
>>> is almost no information in desc can be used in probing_irq if
>>> desc->action is NULL. Keir, do you have any ideas?
>>> 
>> 
>> I think the intent of probing_irq is to detect irqs which are being
>> used to probe for an interrupt during driver initialization.  This
>> should only be used for things like ISA which don't have any way to
>> explicitly find out what irq a device is attached to.
>> 
>> Given that ISA devices aren't very interesting and no driver should
>> need to do that kind of probing under Xen, I wonder if we can't just
>> remove the whole test? 
>> 
>>     J
> 
> Not only ISA devices, but also PCI devices will use probing_irq. ISA
> devices are indeed not interesting, VT-d only assigns PCI devices. In
> fact, Sharing interrupt between assigned devices and host devices is
> not happy situation. We put much effort to make it work long time
> ago. If there is really no good approach, one alternate is add a
> limit of device assignment: don't allow assign PCI devices which
> share interrupt with other devices.      
> 
> Regards,
> Weidong


[-- Attachment #2: 0001-pv-ops-fix-shared-irq-device-passthrough-v2.patch --]
[-- Type: application/octet-stream, Size: 4118 bytes --]

From 675c1733889e87aae22151da82bfb6cde763b205 Mon Sep 17 00:00:00 2001
From: Weidong Han <weidong.han@intel.com>
Date: Tue, 3 Nov 2009 01:54:44 +0800
Subject: [PATCH] pv-ops: fix shared irq device passthrough [v2]

In driver/xen/events.c, whether bind_pirq is shareable or not is
determined by desc->action is NULL or not. But in __setup_irq,
startup(irq) is invoked before desc->action is assigned with
new action. So desc->action in startup_irq is alwasy NULL, and
bind_pirq is always not shareable. This results in pt_irq_create_bind
failure when passthrough a device which shares irq to other devices.

This patch doesn't use probing_irq to determine if pirq is shareable
or not, instead set shareable flag in irq_info according to trigger
mode in xen_allocate_pirq. Set level triggered interrupts shareable.
Thus use this flag to set bind_pirq flag accordingly.

Signed-off-by: Weidong Han <weidong.han@intel.com>
---
 arch/x86/xen/pci.c   |   15 ++++++++++++---
 drivers/xen/events.c |    7 +++++--
 include/xen/events.h |    2 +-
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index 44d91ad..6f366ae 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -44,6 +44,8 @@ static void xen_set_io_apic_routing(int irq, int trigger, int polarity)
 int xen_register_gsi(u32 gsi, int triggering, int polarity)
 {
 	int irq;
+	int shareable = 0;
+	char *name;
 
 	if (!xen_domain())
 		return -1;
@@ -51,8 +53,15 @@ int xen_register_gsi(u32 gsi, int triggering, int polarity)
 	printk(KERN_DEBUG "xen: registering gsi %u triggering %d polarity %d\n",
 	       gsi, triggering, polarity);
 
-	irq = xen_allocate_pirq(gsi, (triggering == ACPI_EDGE_SENSITIVE)
-				     ? "ioapic-edge" : "ioapic-level");
+	if (triggering == ACPI_EDGE_SENSITIVE) {
+		shareable = 0;
+		name = "ioapic-edge";
+	} else {
+		shareable = 1;
+		name = "ioapic-level";
+	}
+
+	irq = xen_allocate_pirq(gsi, shareable, name);
 
 	printk(KERN_DEBUG "xen: --> irq=%d\n", irq);
 
@@ -70,7 +79,7 @@ void __init xen_setup_pirqs(void)
 
 	if (0 == nr_ioapics) {
 		for (irq = 0; irq < NR_IRQS_LEGACY; irq++)
-			xen_allocate_pirq(irq, "xt-pic");
+			xen_allocate_pirq(irq, 0, "xt-pic");
 		return;
 	}
 
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index d3c267f..9d90f6e 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -96,6 +96,7 @@ struct irq_info
 	} u;
 };
 #define PIRQ_NEEDS_EOI	(1 << 0)
+#define PIRQ_SHAREABLE	(1 << 1)
 
 static struct irq_info *irq_info;
 
@@ -438,7 +439,8 @@ static unsigned int startup_pirq(unsigned int irq)
 
 	bind_pirq.pirq = info->u.pirq.nr;
 	/* NB. We are happy to share unless we are probing. */
-	bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE;
+	bind_pirq.flags = info->u.pirq.flags & PIRQ_SHAREABLE ?
+					BIND_PIRQ__WILL_SHARE : 0;
 	rc = HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq);
 	if (rc != 0) {
 		if (!probing_irq(irq))
@@ -543,7 +545,7 @@ static int find_irq_by_gsi(unsigned gsi)
  * event channel until the irq actually started up.  Return an
  * existing irq if we've already got one for the gsi.
  */
-int xen_allocate_pirq(unsigned gsi, char *name)
+int xen_allocate_pirq(unsigned gsi, int shareable, char *name)
 {
 	int irq;
 	struct physdev_irq irq_op;
@@ -575,6 +577,7 @@ int xen_allocate_pirq(unsigned gsi, char *name)
  	}
 
 	irq_info[irq] = mk_pirq_info(0, gsi, irq_op.vector);
+	irq_info[irq].u.pirq.flags |= shareable ? PIRQ_SHAREABLE : 0;
 out:
 	spin_unlock(&irq_mapping_update_lock);
 	return irq;
diff --git a/include/xen/events.h b/include/xen/events.h
index bb654f2..7b231f0 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -68,7 +68,7 @@ unsigned irq_from_evtchn(unsigned int evtchn);
 /* Allocate an irq for a physical interrupt, given a gsi.  "Legacy"
    GSIs are identity mapped; others are dynamically allocated as
    usual. */
-int xen_allocate_pirq(unsigned gsi, char *name);
+int xen_allocate_pirq(unsigned gsi, int shareable, char *name);
 
 /* Return vector allocated to pirq */
 int xen_vector_from_irq(unsigned pirq);
-- 
1.6.0.4


[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-11-02 10:12             ` Han, Weidong
@ 2009-11-10  6:12               ` Han, Weidong
  2009-11-10  6:30                 ` Mr. Teo En Ming (Zhang Enming)
                                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Han, Weidong @ 2009-11-10  6:12 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge'
  Cc: 'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'

[-- Attachment #1: Type: text/plain, Size: 2679 bytes --]

Jeremy,

Any comments?

Regards,
Weidong 

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Han, Weidong
Sent: 2009年11月2日 18:13
To: 'Jeremy Fitzhardinge'
Cc: 'xen-devel@lists.xensource.com'; Kay, Allen M; 'keir.fraser@eu.citrix.com'
Subject: RE: [Xen-devel] [PATCH][RFC] pv-ops: fix shared irq device passthrough

Hi Jeremy,

Instead of changing kernel __setup_irq and use probing_irq to determine if pirq is shareable or not, I changed to set shareable flag in irq_info according to trigger mode in xen_allocate_pirq. Set level triggered interrupts shareable. This patch doesn't touch kernel IRQ code, it only changes xen related code. Do you think it is reasonable? Attached the patch.

Regards,
Weidong

Han, Weidong wrote:
> Jeremy Fitzhardinge wrote:
>> On 10/30/09 02:29, Han, Weidong wrote:
>>> All devices will call probing_irq in startup_pirq, which bind pirq
>>> to event channel. But now probing_irq always returns false, then all
>>> pirqs are not shareable. In my system, a PCI NIC, an USB controller
>>> and an IDE interface device share the same IRQ 18. Because above
>>> reason, pirq 18 is set not shareable. So when I hide the PCI NIC,
>>> and assign it to guest, it fails in guest_bind_pirq, therefore the
>>> PCI NIC in guest cannot work, even impact the devices who share the
>>> IRQ 18. 
>>> 
>>> If don't want to change __setup_irq code like my patch does, current
>>> probing_irq is useless (always return false). The problem is there
>>> is almost no information in desc can be used in probing_irq if
>>> desc->action is NULL. Keir, do you have any ideas?
>>> 
>> 
>> I think the intent of probing_irq is to detect irqs which are being
>> used to probe for an interrupt during driver initialization.  This
>> should only be used for things like ISA which don't have any way to
>> explicitly find out what irq a device is attached to.
>> 
>> Given that ISA devices aren't very interesting and no driver should
>> need to do that kind of probing under Xen, I wonder if we can't just
>> remove the whole test? 
>> 
>>     J
> 
> Not only ISA devices, but also PCI devices will use probing_irq. ISA
> devices are indeed not interesting, VT-d only assigns PCI devices. In
> fact, Sharing interrupt between assigned devices and host devices is
> not happy situation. We put much effort to make it work long time
> ago. If there is really no good approach, one alternate is add a
> limit of device assignment: don't allow assign PCI devices which
> share interrupt with other devices.      
> 
> Regards,
> Weidong


[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-11-10  6:12               ` Han, Weidong
@ 2009-11-10  6:30                 ` Mr. Teo En Ming (Zhang Enming)
  2009-11-10  6:59                   ` Han, Weidong
  2009-11-10 16:30                 ` Konrad Rzeszutek Wilk
  2009-11-10 17:56                 ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 17+ messages in thread
From: Mr. Teo En Ming (Zhang Enming) @ 2009-11-10  6:30 UTC (permalink / raw)
  To: Han, Weidong
  Cc: space.time.universe, Jeremy Fitzhardinge, xen-devel, Kay,
	Allen M, keir.fraser


[-- Attachment #1.1: Type: text/plain, Size: 3749 bytes --]

Hi Weidong,

Have you seen the debug messages in my qemu-dm log after I have applied the
shared irq device passthrough patch version 2 to my pvops dom0 kernel
2.6.31.5 in the other xen-devel mailing list thread for firewire controller
passthrough?

-- 
Mr. Teo En Ming (Zhang Enming) Dip(Mechatronics) BEng(Hons)(Mechanical
Engineering)
Alma Maters:
(1) Singapore Polytechnic
(2) National University of Singapore
My Primary Blog: http://teo-en-ming-aka-zhang-enming.blogspot.com
My Secondary Blog: http://enmingteo.wordpress.com
My Youtube videos: http://www.youtube.com/user/enmingteo
Email: space.time.universe@gmail.com
Mobile Phone (Starhub Prepaid): +65-8369-2618
Street: Bedok Reservoir Road
Country: Singapore

2009/11/10 Han, Weidong <weidong.han@intel.com>

> Jeremy,
>
> Any comments?
>
> Regards,
> Weidong
>
> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com [mailto:
> xen-devel-bounces@lists.xensource.com] On Behalf Of Han, Weidong
> Sent: 2009年11月2日 18:13
> To: 'Jeremy Fitzhardinge'
> Cc: 'xen-devel@lists.xensource.com'; Kay, Allen M; '
> keir.fraser@eu.citrix.com'
> Subject: RE: [Xen-devel] [PATCH][RFC] pv-ops: fix shared irq device
> passthrough
>
> Hi Jeremy,
>
> Instead of changing kernel __setup_irq and use probing_irq to determine if
> pirq is shareable or not, I changed to set shareable flag in irq_info
> according to trigger mode in xen_allocate_pirq. Set level triggered
> interrupts shareable. This patch doesn't touch kernel IRQ code, it only
> changes xen related code. Do you think it is reasonable? Attached the patch.
>
> Regards,
> Weidong
>
> Han, Weidong wrote:
> > Jeremy Fitzhardinge wrote:
> >> On 10/30/09 02:29, Han, Weidong wrote:
> >>> All devices will call probing_irq in startup_pirq, which bind pirq
> >>> to event channel. But now probing_irq always returns false, then all
> >>> pirqs are not shareable. In my system, a PCI NIC, an USB controller
> >>> and an IDE interface device share the same IRQ 18. Because above
> >>> reason, pirq 18 is set not shareable. So when I hide the PCI NIC,
> >>> and assign it to guest, it fails in guest_bind_pirq, therefore the
> >>> PCI NIC in guest cannot work, even impact the devices who share the
> >>> IRQ 18.
> >>>
> >>> If don't want to change __setup_irq code like my patch does, current
> >>> probing_irq is useless (always return false). The problem is there
> >>> is almost no information in desc can be used in probing_irq if
> >>> desc->action is NULL. Keir, do you have any ideas?
> >>>
> >>
> >> I think the intent of probing_irq is to detect irqs which are being
> >> used to probe for an interrupt during driver initialization.  This
> >> should only be used for things like ISA which don't have any way to
> >> explicitly find out what irq a device is attached to.
> >>
> >> Given that ISA devices aren't very interesting and no driver should
> >> need to do that kind of probing under Xen, I wonder if we can't just
> >> remove the whole test?
> >>
> >>     J
> >
> > Not only ISA devices, but also PCI devices will use probing_irq. ISA
> > devices are indeed not interesting, VT-d only assigns PCI devices. In
> > fact, Sharing interrupt between assigned devices and host devices is
> > not happy situation. We put much effort to make it work long time
> > ago. If there is really no good approach, one alternate is add a
> > limit of device assignment: don't allow assign PCI devices which
> > share interrupt with other devices.
> >
> > Regards,
> > Weidong
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>

[-- Attachment #1.2: Type: text/html, Size: 5098 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-11-10  6:30                 ` Mr. Teo En Ming (Zhang Enming)
@ 2009-11-10  6:59                   ` Han, Weidong
  0 siblings, 0 replies; 17+ messages in thread
From: Han, Weidong @ 2009-11-10  6:59 UTC (permalink / raw)
  To: 'Mr. Teo En Ming (Zhang Enming)'
  Cc: 'Jeremy Fitzhardinge',
	'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'


[-- Attachment #1.1: Type: text/plain, Size: 4327 bytes --]

I read it. The irq binding failure was gone. So it takes effect.

Regards,
Weidong

________________________________
From: Mr. Teo En Ming (Zhang Enming) [mailto:space.time.universe@gmail.com]
Sent: 2009年11月10日 14:30
To: Han, Weidong
Cc: Jeremy Fitzhardinge; xen-devel@lists.xensource.com; Kay, Allen M; keir.fraser@eu.citrix.com; space.time.universe@gmail.com
Subject: Re: [Xen-devel] [PATCH][RFC] pv-ops: fix shared irq device passthrough

Hi Weidong,

Have you seen the debug messages in my qemu-dm log after I have applied the shared irq device passthrough patch version 2 to my pvops dom0 kernel 2.6.31.5 in the other xen-devel mailing list thread for firewire controller passthrough?

--
Mr. Teo En Ming (Zhang Enming) Dip(Mechatronics) BEng(Hons)(Mechanical Engineering)
Alma Maters:
(1) Singapore Polytechnic
(2) National University of Singapore
My Primary Blog: http://teo-en-ming-aka-zhang-enming.blogspot.com
My Secondary Blog: http://enmingteo.wordpress.com
My Youtube videos: http://www.youtube.com/user/enmingteo
Email: space.time.universe@gmail.com<mailto:space.time.universe@gmail.com>
Mobile Phone (Starhub Prepaid): +65-8369-2618
Street: Bedok Reservoir Road
Country: Singapore

2009/11/10 Han, Weidong <weidong.han@intel.com<mailto:weidong.han@intel.com>>
Jeremy,

Any comments?

Regards,
Weidong

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com<mailto:xen-devel-bounces@lists.xensource.com> [mailto:xen-devel-bounces@lists.xensource.com<mailto:xen-devel-bounces@lists.xensource.com>] On Behalf Of Han, Weidong
Sent: 2009年11月2日 18:13
To: 'Jeremy Fitzhardinge'
Cc: 'xen-devel@lists.xensource.com<mailto:xen-devel@lists.xensource.com>'; Kay, Allen M; 'keir.fraser@eu.citrix.com<mailto:keir.fraser@eu.citrix.com>'
Subject: RE: [Xen-devel] [PATCH][RFC] pv-ops: fix shared irq device passthrough

Hi Jeremy,

Instead of changing kernel __setup_irq and use probing_irq to determine if pirq is shareable or not, I changed to set shareable flag in irq_info according to trigger mode in xen_allocate_pirq. Set level triggered interrupts shareable. This patch doesn't touch kernel IRQ code, it only changes xen related code. Do you think it is reasonable? Attached the patch.

Regards,
Weidong

Han, Weidong wrote:
> Jeremy Fitzhardinge wrote:
>> On 10/30/09 02:29, Han, Weidong wrote:
>>> All devices will call probing_irq in startup_pirq, which bind pirq
>>> to event channel. But now probing_irq always returns false, then all
>>> pirqs are not shareable. In my system, a PCI NIC, an USB controller
>>> and an IDE interface device share the same IRQ 18. Because above
>>> reason, pirq 18 is set not shareable. So when I hide the PCI NIC,
>>> and assign it to guest, it fails in guest_bind_pirq, therefore the
>>> PCI NIC in guest cannot work, even impact the devices who share the
>>> IRQ 18.
>>>
>>> If don't want to change __setup_irq code like my patch does, current
>>> probing_irq is useless (always return false). The problem is there
>>> is almost no information in desc can be used in probing_irq if
>>> desc->action is NULL. Keir, do you have any ideas?
>>>
>>
>> I think the intent of probing_irq is to detect irqs which are being
>> used to probe for an interrupt during driver initialization.  This
>> should only be used for things like ISA which don't have any way to
>> explicitly find out what irq a device is attached to.
>>
>> Given that ISA devices aren't very interesting and no driver should
>> need to do that kind of probing under Xen, I wonder if we can't just
>> remove the whole test?
>>
>>     J
>
> Not only ISA devices, but also PCI devices will use probing_irq. ISA
> devices are indeed not interesting, VT-d only assigns PCI devices. In
> fact, Sharing interrupt between assigned devices and host devices is
> not happy situation. We put much effort to make it work long time
> ago. If there is really no good approach, one alternate is add a
> limit of device assignment: don't allow assign PCI devices which
> share interrupt with other devices.
>
> Regards,
> Weidong


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com<mailto:Xen-devel@lists.xensource.com>
http://lists.xensource.com/xen-devel






[-- Attachment #1.2: Type: text/html, Size: 6413 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-11-10  6:12               ` Han, Weidong
  2009-11-10  6:30                 ` Mr. Teo En Ming (Zhang Enming)
@ 2009-11-10 16:30                 ` Konrad Rzeszutek Wilk
  2009-11-10 17:11                   ` Kay, Allen M
  2009-11-10 18:44                   ` Jeremy Fitzhardinge
  2009-11-10 17:56                 ` Jeremy Fitzhardinge
  2 siblings, 2 replies; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2009-11-10 16:30 UTC (permalink / raw)
  To: Han, Weidong
  Cc: 'Jeremy Fitzhardinge',
	'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'

On Tue, Nov 10, 2009 at 02:12:03PM +0800, Han, Weidong wrote:
> Jeremy,
> 
> Any comments?


With this patch you can share the PCI devices on the same IRQ, correct? Will
this mean that you can assign to a guest a USB controller, while
Dom0 has controller of the PCI NIC (and assuming that both of those
share the same interrupt line)? If so, won't the Dom0 start throwing
a fit b/c there are unhandled IRQs and eventually disable the IRQ line?


Or even the guest decide that there are too many IRQs and decided to
disable the IRQ line?

> Instead of changing kernel __setup_irq and use probing_irq to determine if pirq is shareable or not, I changed to set shareable flag in irq_info according to trigger mode in xen_allocate_pirq. Set level triggered interrupts shareable. This patch doesn't touch kernel IRQ code, it only changes xen related code. Do you think it is reasonable? Attached the patch.
> > Jeremy Fitzhardinge wrote:

.. snip ..
> >>> All devices will call probing_irq in startup_pirq, which bind pirq
> >>> to event channel. But now probing_irq always returns false, then all
> >>> pirqs are not shareable. In my system, a PCI NIC, an USB controller
> >>> and an IDE interface device share the same IRQ 18. Because above
> >>> reason, pirq 18 is set not shareable. So when I hide the PCI NIC,
> >>> and assign it to guest, it fails in guest_bind_pirq, therefore the
> >>> PCI NIC in guest cannot work, even impact the devices who share the
> >>> IRQ 18. 
> >>> 
> >>> If don't want to change __setup_irq code like my patch does, current
> >>> probing_irq is useless (always return false). The problem is there
> >>> is almost no information in desc can be used in probing_irq if
> >>> desc->action is NULL. Keir, do you have any ideas?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-11-10 16:30                 ` Konrad Rzeszutek Wilk
@ 2009-11-10 17:11                   ` Kay, Allen M
  2009-11-10 18:40                     ` Konrad Rzeszutek Wilk
  2009-11-10 18:44                   ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 17+ messages in thread
From: Kay, Allen M @ 2009-11-10 17:11 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Han, Weidong
  Cc: 'Jeremy Fitzhardinge',
	'xen-devel@lists.xensource.com',
	'keir.fraser@eu.citrix.com'

> With this patch you can share the PCI devices on the same IRQ, correct? Will
> this mean that you can assign to a guest a USB controller, while
> Dom0 has controller of the PCI NIC (and assuming that both of those
> share the same interrupt line)? If so, won't the Dom0 start throwing
> a fit b/c there are unhandled IRQs and eventually disable the IRQ line?

No, interrupts are injected to both domains sharing the same IRQ.
See arch/x86/irq.c/__do_IRQ_guest() for details.

> Or even the guest decide that there are too many IRQs and decided to
> disable the IRQ line?

Note that in __do_IRQ_guest(), interrupt handling control flow for PCI passthrough case
Is the same for PV event channel case.  Guest doesn't disable the IRQ line unless
there is a malfunction.

Allen

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-11-10  6:12               ` Han, Weidong
  2009-11-10  6:30                 ` Mr. Teo En Ming (Zhang Enming)
  2009-11-10 16:30                 ` Konrad Rzeszutek Wilk
@ 2009-11-10 17:56                 ` Jeremy Fitzhardinge
  2 siblings, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-11-10 17:56 UTC (permalink / raw)
  To: Han, Weidong
  Cc: 'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'

On 11/09/09 22:12, Han, Weidong wrote:
> Jeremy,
>
> Any comments?
>   

Looks fine. Does it work OK?

J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-11-10 17:11                   ` Kay, Allen M
@ 2009-11-10 18:40                     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2009-11-10 18:40 UTC (permalink / raw)
  To: Kay, Allen M
  Cc: 'Jeremy Fitzhardinge',
	'xen-devel@lists.xensource.com',
	Han, Weidong, 'keir.fraser@eu.citrix.com'

On Tue, Nov 10, 2009 at 09:11:23AM -0800, Kay, Allen M wrote:
> > With this patch you can share the PCI devices on the same IRQ, correct? Will
> > this mean that you can assign to a guest a USB controller, while
> > Dom0 has controller of the PCI NIC (and assuming that both of those
> > share the same interrupt line)? If so, won't the Dom0 start throwing
> > a fit b/c there are unhandled IRQs and eventually disable the IRQ line?
> 
> No, interrupts are injected to both domains sharing the same IRQ.
> See arch/x86/irq.c/__do_IRQ_guest() for details.

You are looking at it from the Xen level. The patch was for Linux pv_ops.
> 
> > Or even the guest decide that there are too many IRQs and decided to
> > disable the IRQ line?
> 
> Note that in __do_IRQ_guest(), interrupt handling control flow for PCI passthrough case
> Is the same for PV event channel case.  Guest doesn't disable the IRQ line unless
> there is a malfunction.

Right, and the way Linux detects if there is a malfunction is by seeing if the
interrupt handlers don't handle the interrupt. The logic in Dom0 is that do_IRQ is
called, which then calls (depending on the type of interrupt - edge or level)
the proper top-level interrupt handler:


354 void
355 handle_level_irq(unsigned int irq, struct irq_desc *desc)
356 {

.. snip ..

which goes through all of the IRQ handlers sharing the IRQ level.
If there is one, then it will just call one of them.

377         spin_unlock(&desc->lock);
378 
379         action_ret = handle_IRQ_event(irq, action);
380         if (!noirqdebug)
381                 note_interrupt(irq, desc, action_ret);
382 

The 'handle_IRQ_event' calls the IRQ handler, for example the
ahci one:

2143 static irqreturn_t ahci_interrupt(int irq, void *dev_instance)
2144 {
.. snip ..
2156         /* sigh.  0xffffffff is a valid return from h/w */
2157         irq_stat = readl(mmio + HOST_IRQ_STAT);
2158         if (!irq_stat)
2159                 return IRQ_NONE;

And lets that the interrupt was fired off on the USB controller (which
is owned by the guest, not Dom0) and the AHCI one did not have presently
any data (the guest is booting of a SCSI controller or what not).

The 'note_interrupt' would decide that since this isn't handled:

226 void note_interrupt(unsigned int irq, struct irq_desc *desc,
227                     irqreturn_t action_ret)
228 {
229         if (unlikely(action_ret != IRQ_HANDLED)) {

it would accumulate the count of unhandled interrupts and then
eventually disable the interrupt:

256         if (unlikely(desc->irqs_unhandled > 99900)) {
257                 /*
258                  * The interrupt is stuck
259                  */
260                 __report_bad_irq(irq, desc, action_ret);
261                 /*
262                  * Now kill the IRQ
263                  */
264                 printk(KERN_EMERG "Disabling IRQ #%d\n", irq);
265                 desc->status |= IRQ_DISABLED | IRQ_SPURIOUS_DISABLED;
266                 desc->depth++;
267                 desc->chip->disable(irq);


I think this problem exists with or _without_ the proposed patch.

Except that _without_ the patch you can't even get to share an interrupt
in which case you would never go down this logic path.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFC] pv-ops: fix shared irq device passthrough
  2009-11-10 16:30                 ` Konrad Rzeszutek Wilk
  2009-11-10 17:11                   ` Kay, Allen M
@ 2009-11-10 18:44                   ` Jeremy Fitzhardinge
  2009-11-12  7:15                     ` Han, Weidong
  1 sibling, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-11-10 18:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Kay, Allen M, 'xen-devel@lists.xensource.com',
	Han, Weidong, 'keir.fraser@eu.citrix.com'

On 11/10/09 08:30, Konrad Rzeszutek Wilk wrote:
> With this patch you can share the PCI devices on the same IRQ, correct? Will
> this mean that you can assign to a guest a USB controller, while
> Dom0 has controller of the PCI NIC (and assuming that both of those
> share the same interrupt line)? If so, won't the Dom0 start throwing
> a fit b/c there are unhandled IRQs and eventually disable the IRQ line?
>
>
> Or even the guest decide that there are too many IRQs and decided to
> disable the IRQ line?
>   

It would be easy to add a dummy handler which always returns IRQ_HANDLED
for every cross-domain shared interrupt to prevent this.  Of course that
would paper over any real spurious/screaming interrupt problem, but we
could add some extra logic into the dummy handler if necessary.

I merged the patch, and had to add the new parameter to
xen_allocate_pirq() in arch/x86/pci/xen.c.  I just made it 0 which
should have no functional difference from before, but I'll leave it to
you to decide what it really should be.

    J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH][RFC]  pv-ops: fix shared irq device passthrough
  2009-11-10 18:44                   ` Jeremy Fitzhardinge
@ 2009-11-12  7:15                     ` Han, Weidong
  0 siblings, 0 replies; 17+ messages in thread
From: Han, Weidong @ 2009-11-12  7:15 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge', 'Konrad Rzeszutek Wilk'
  Cc: 'xen-devel@lists.xensource.com',
	Kay, Allen M, 'keir.fraser@eu.citrix.com'

Jeremy Fitzhardinge wrote:
> On 11/10/09 08:30, Konrad Rzeszutek Wilk wrote:
>> With this patch you can share the PCI devices on the same IRQ,
>> correct? Will this mean that you can assign to a guest a USB
>> controller, while 
>> Dom0 has controller of the PCI NIC (and assuming that both of those
>> share the same interrupt line)? If so, won't the Dom0 start throwing
>> a fit b/c there are unhandled IRQs and eventually disable the IRQ
>> line? 
>> 
>> 
>> Or even the guest decide that there are too many IRQs and decided to
>> disable the IRQ line? 
>> 

Konrad,

You listed and analysed code path of handle_level_irq in another mail, but you miss following lines in note_interrupt:

	if (time_after(jiffies, desc->last_unhandled + HZ/10))
            	desc->irqs_unhandled = 1; 
	else
            	desc->irqs_unhandled++;

I found time_after will return true when assigned the shared irq device to hvm, then it doesn't complain too many unhandled IRQs. So it's no problem when passthrough shared IRQ NIC and USB controller, but i'm not sure it can cover all cases.

Actually, it's supported in Xen with 2.6.18 dom0 for a long time. We doesn't see problem. 

Regards,
Weidong

> 
> It would be easy to add a dummy handler which always returns
> IRQ_HANDLED for every cross-domain shared interrupt to prevent this. 
> Of course that would paper over any real spurious/screaming interrupt
> problem, but we could add some extra logic into the dummy handler if
> necessary. 
> 
> I merged the patch, and had to add the new parameter to
> xen_allocate_pirq() in arch/x86/pci/xen.c.  I just made it 0 which
> should have no functional difference from before, but I'll leave it to
> you to decide what it really should be.
> 
>     J

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2009-11-12  7:15 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-23 11:39 [PATCH][RFC] pv-ops: fix shared irq device passthrough Han, Weidong
2009-10-26 20:27 ` Jeremy Fitzhardinge
2009-10-27  1:32   ` Han, Weidong
2009-10-29 22:56     ` Jeremy Fitzhardinge
2009-10-30  9:29       ` Han, Weidong
2009-10-30 21:34         ` Jeremy Fitzhardinge
2009-10-31 13:50           ` Han, Weidong
     [not found]           ` <715D42877B251141A38726ABF5CABF2C055064A406@pdsmsx503.ccr.corp.intel.com>
2009-11-02 10:12             ` Han, Weidong
2009-11-10  6:12               ` Han, Weidong
2009-11-10  6:30                 ` Mr. Teo En Ming (Zhang Enming)
2009-11-10  6:59                   ` Han, Weidong
2009-11-10 16:30                 ` Konrad Rzeszutek Wilk
2009-11-10 17:11                   ` Kay, Allen M
2009-11-10 18:40                     ` Konrad Rzeszutek Wilk
2009-11-10 18:44                   ` Jeremy Fitzhardinge
2009-11-12  7:15                     ` Han, Weidong
2009-11-10 17:56                 ` Jeremy Fitzhardinge

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.