* Interrupt to CPU routing in HVM domains - again
@ 2008-09-05  1:06 James Harper
  2008-09-05  1:21 ` John Levon
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: James Harper @ 2008-09-05  1:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, bart brooks

(Bart - I hope you don't mind me sending your email to the list)

Keir,

As per a recent discussion I modified the IRQ code in the Windows GPLPV
drivers so that only the vcpu_info[0] structure is used, instead of
vcpu_info[current_cpu] structure. As per Bart's email below though, this
has caused him to experience performance issues.

Have I understood correctly that only cpu 0 of the vcpu_info[] array is
ever used even if the interrupt actually occurs on another vcpu? Is this
true for all versions of Xen? It seems that Bart's experience is exactly
the opposite of mine - the change that fixed up the performance issues
for me caused performance issues for him...

Bart: Can you have a look through the xen-devel list archives and have a
read of a thread with a subject of "HVM windows - PCI IRQ firing on both
CPU's", around the middle of last month? Let me know if you interpret
that any differently to me...

Thanks

James
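
To make the difference concrete, here is a minimal sketch of the two lookups
being compared -- illustrative only, not the actual GPLPV source; the field
names follow Xen's public shared-info interface and the helper name is
hypothetical:

/* Illustrative only -- not the GPLPV source.  Field names follow Xen's
 * public shared-info interface; isr_vcpu_info() is a hypothetical helper. */
#include "xen.h"   /* shared_info_t, vcpu_info_t, MAX_VIRT_CPUS */

static vcpu_info_t *isr_vcpu_info(shared_info_t *shared, unsigned int current_cpu)
{
    /* 0.9.11-pre9 behaviour (what Bart's change restores): consult the
     * vcpu_info of whichever vcpu the interrupt was delivered on.        */
    /* return &shared->vcpu_info[current_cpu & (MAX_VIRT_CPUS - 1)];      */
    (void)current_cpu;

    /* Post-pre9 behaviour (James's change): always consult vcpu_info[0],
     * on the understanding that Xen asserts the platform PCI IRQ based
     * only on vcpu0's evtchn_upcall_pending flag.                        */
    return &shared->vcpu_info[0];
}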



> -----Original Message-----
> From: bart brooks [mailto:bart_brooks@hotmail.com]
> Sent: Friday, 5 September 2008 01:19
> To: James Harper
> Subject: Performance - Update GPLPV drivers -0.9.11-pre12
> Importance: High
> 
> Hi James,
> 
> 
> 
> We have tracked down the issue where performance has dropped off after
> version 0.9.11-pre9 and still exists in version 0.9.11-pre12.
> 
> Event channel interrupts for transmit are generated only on VCPU-0,
> whereas for receive they are generated on all VCPUs in a round robin
> fashion. Post 0.9.11-pre9 it is assumed that all the interrupts are
> generated on VCPU-0, so the network interrupts generated on other VCPUs
> are only processed if there is some activity going on VCPU-0 or an
> outstanding DPC. This caused packets to be processed out of order and
> retransmitted. Retransmissions happened after a timeout (200ms) with no
> activity during that time. Overall it brought down the bandwidth a lot,
> with huge gaps of no activity.
> 
> 
> 
> Instead of assuming that everything is on CPU-0, the following change was
> made in the xenpci driver, in the file evtchn.c, in the function
> EvtChn_Interrupt():
> 
> int cpu = KeGetCurrentProcessorNumber() & (MAX_VIRT_CPUS - 1);
> 
> This is the same code found in version 0.9.11-pre9.
> 
> 
> 
> After this change, we are getting numbers comparable to 0.9.11-pre9.
> 
> Bart
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05  1:06 Interrupt to CPU routing in HVM domains - again James Harper
@ 2008-09-05  1:21 ` John Levon
  2008-09-05  7:44   ` Keir Fraser
  2008-09-05  7:43 ` Keir Fraser
  2008-09-05 15:15 ` Steve Ofsthun
  2 siblings, 1 reply; 18+ messages in thread
From: John Levon @ 2008-09-05  1:21 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel, Keir Fraser, bart brooks

On Fri, Sep 05, 2008 at 11:06:06AM +1000, James Harper wrote:

> As per a recent discussion I modified the IRQ code in the Windows GPLPV
> drivers so that only the vcpu_info[0] structure is used, instead of
> vcpu_info[current_cpu] structure. As per Bart's email below though, this
> has caused him to experience performance issues.
> 
> Have I understood correctly that only cpu 0 of the vcpu_info[] array is
> ever used even if the interrupt actually occurs on another vcpu? Is this
> true for all versions of Xen? It seems that Bart's experience is exactly
> the opposite of mine - the change that fixed up the performance issues
> for me caused performance issues for him...

I was just in this code. From what I can tell it's a requirement that
all the domU's event channels be bound to VCPU0, since there's no code
in Xen itself that will redirect evtchn_pending_sel from another VCPU to
CPU0's vcpu_info, and the callback IRQ code assumes VCPU0.

regards
john

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05  1:06 Interrupt to CPU routing in HVM domains - again James Harper
  2008-09-05  1:21 ` John Levon
@ 2008-09-05  7:43 ` Keir Fraser
  2008-09-05 10:13   ` James Harper
  2008-09-05 15:15 ` Steve Ofsthun
  2 siblings, 1 reply; 18+ messages in thread
From: Keir Fraser @ 2008-09-05  7:43 UTC (permalink / raw)
  To: James Harper, xen-devel; +Cc: bart brooks

I can absolutely assure you that only vcpu0's evtchn_pending flag influences
delivery of the xen_platform pci interrupt. In xen-unstable/xen-3.3, the
relevant function to look at is
xen/arch/x86/hvm/irq.c:hvm_assert_evtchn_irq(). It has always been this way.

In any case, there is only one interrupt line (currently), so there would
hardly be a scalability benefit to spreading out event channel bindings to
different vcpu selector words and evtchn_pending flags. :-)

The only thing I can think of (and should certainly be checked!) is that
some of your event channels are erroneously getting bound to vcpu != 0. Are
you running an irq load balancer or somesuch? Obviously event channels bound
to vcpu != 0 will now never be serviced, whereas before your changes you
would probabilistically 'get lucky'.

 -- Keir

On 5/9/08 02:06, "James Harper" <james.harper@bendigoit.com.au> wrote:

> (Bart - I hope you don't mind me sending your email to the list)
> 
> Keir,
> 
> As per a recent discussion I modified the IRQ code in the Windows GPLPV
> drivers so that only the vcpu_info[0] structure is used, instead of
> vcpu_info[current_cpu] structure. As per Bart's email below though, this
> has caused him to experience performance issues.
> 
> Have I understood correctly that only cpu 0 of the vcpu_info[] array is
> ever used even if the interrupt actually occurs on another vcpu? Is this
> true for all versions of Xen? It seems that Bart's experience is exactly
> the opposite of mine - the change that fixed up the performance issues
> for me caused performance issues for him...
> 
> Bart: Can you have a look through the xen-devel list archives and have a
> read of a thread with a subject of "HVM windows - PCI IRQ firing on both
> CPU's", around the middle of last month? Let me know if you interpret
> that any differently to me...
> 
> Thanks
> 
> James
> 
> 
> 
>> -----Original Message-----
>> From: bart brooks [mailto:bart_brooks@hotmail.com]
>> Sent: Friday, 5 September 2008 01:19
>> To: James Harper
>> Subject: Performance - Update GPLPV drivers -0.9.11-pre12
>> Importance: High
>> 
>> Hi James,
>> 
>> 
>> 
>> We have tracked down the issue where performance has dropped off after
>> version 0.9.11-pre9 and still exists in version 0.9.11-pre12.
>> 
>> Event channel interrupts for transmit are generated only on VCPU-0,
>> whereas for receive they are generated on all VCPUs in a round robin
>> fashion. Post 0.9.11-pre9 it is assumed that all the interrupts are
>> generated on VCPU-0, so the network interrupts generated on other VCPUs
>> are only processed if there is some activity going on VCPU-0 or an
>> outstanding DPC. This caused packets to be processed out of order and
>> retransmitted. Retransmissions happened after a timeout (200ms) with no
>> activity during that time. Overall it brought down the bandwidth a lot,
>> with huge gaps of no activity.
>> 
>> 
>> 
>> Instead of assuming that everything is on CPU-0, the following change was
>> made in the xenpci driver, in the file evtchn.c, in the function
>> EvtChn_Interrupt():
>> 
>> int cpu = KeGetCurrentProcessorNumber() & (MAX_VIRT_CPUS - 1);
>> 
>> This is the same code found in version 0.9.11-pre9.
>> 
>> 
>> 
>> After this change, we are getting numbers comparable to 0.9.11-pre9.
>> 
>> Bart
>> 
>> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05  1:21 ` John Levon
@ 2008-09-05  7:44   ` Keir Fraser
  2008-09-05 10:25     ` James Harper
  0 siblings, 1 reply; 18+ messages in thread
From: Keir Fraser @ 2008-09-05  7:44 UTC (permalink / raw)
  To: John Levon, James Harper; +Cc: xen-devel, bart brooks

On 5/9/08 02:21, "John Levon" <levon@movementarian.org> wrote:

>> Have I understood correctly that only cpu 0 of the vcpu_info[] array is
>> ever used even if the interrupt actually occurs on another vcpu? Is this
>> true for all versions of Xen? It seems that Bart's experience is exactly
>> the opposite of mine - the change that fixed up the performance issues
>> for me caused performance issues for him...
> 
> I was just in this code. From what I can tell it's a requirement that
> all the domU's event channels be bound to VCPU0, since there's no code
> in Xen itself that will redirect evtchn_pending_sel from another VCPU to
> CPU0's vcpu_info, and the callback IRQ code assumes VCPU0.

Yes, this is very likely the problem being observed.

 -- Keir

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Interrupt to CPU routing in HVM domains - again
  2008-09-05  7:43 ` Keir Fraser
@ 2008-09-05 10:13   ` James Harper
  2008-09-05 10:32     ` Keir Fraser
  0 siblings, 1 reply; 18+ messages in thread
From: James Harper @ 2008-09-05 10:13 UTC (permalink / raw)
  To: Keir Fraser, xen-devel; +Cc: bart brooks

> The only thing I can think of (and should certainly be checked!) is that
> some of your event channels are erroneously getting bound to vcpu != 0.
> Are you running an irq load balancer or somesuch? Obviously event channels
> bound to vcpu != 0 will now never be serviced, whereas before your changes
> you would probabilistically 'get lucky'.

In allocating event channels, I'm doing EVTCHNOP_alloc_unbound and then
passing the returned event channel number to Dom0 via xenbus. Is there
anything else I need to do? Do I need a specific hypercall to bind the
event channel to cpu 0 only? I'll check the 'unmodified drivers'
source...

The thing is, my performance testing has shown good results for SMP
(vcpus=4) and UP, but Bart has reported that he only gets good
performance when he checks the vcpu_info for the cpu that the interrupt
has been triggered on.

James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Interrupt to CPU routing in HVM domains - again
  2008-09-05  7:44   ` Keir Fraser
@ 2008-09-05 10:25     ` James Harper
  2008-09-05 13:18       ` John Levon
  0 siblings, 1 reply; 18+ messages in thread
From: James Harper @ 2008-09-05 10:25 UTC (permalink / raw)
  To: Keir Fraser, John Levon; +Cc: xen-devel, bart brooks

> On 5/9/08 02:21, "John Levon" <levon@movementarian.org> wrote:
> 
> >> Have I understood correctly that only cpu 0 of the vcpu_info[] array is
> >> ever used even if the interrupt actually occurs on another vcpu? Is this
> >> true for all versions of Xen? It seems that Bart's experience is exactly
> >> the opposite of mine - the change that fixed up the performance issues
> >> for me caused performance issues for him...
> >
> > I was just in this code. From what I can tell it's a requirement that
> > all the domU's event channels be bound to VCPU0, since there's no code
> > in Xen itself that will redirect evtchn_pending_sel from another VCPU to
> > CPU0's vcpu_info, and the callback IRQ code assumes VCPU0.
> 
> Yes, this is very likely the problem being observed.

Would I need to do a specific bind_vcpu hypercall to cause an event
channel to be bound to a vcpu != 0? I'm not doing that.

I'm thinking that Bart is seeing some weird interaction somewhere that
is causing the change that should break things to somehow make things
work...

James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 10:13   ` James Harper
@ 2008-09-05 10:32     ` Keir Fraser
  0 siblings, 0 replies; 18+ messages in thread
From: Keir Fraser @ 2008-09-05 10:32 UTC (permalink / raw)
  To: James Harper, xen-devel; +Cc: bart brooks

On 5/9/08 11:13, "James Harper" <james.harper@bendigoit.com.au> wrote:

>> The only thing I can think of (and should certainly be checked!) is that
>> some of your event channels are erroneously getting bound to vcpu != 0.
>> Are you running an irq load balancer or somesuch? Obviously event channels
>> bound to vcpu != 0 will now never be serviced, whereas before your changes
>> you would probabilistically 'get lucky'.
> 
> In allocating event channels, I'm doing EVTCHNOP_alloc_unbound and then
> passing the returned event channel number to Dom0 via xenbus. Is there
> anything else I need to do? Do I need a specific hypercall to bind the
> event channel to cpu 0 only? I'll check the 'unmodified drivers'
> source...

Nope, default is binding to vcpu0, except for some VIRQs and IPIs. This even
gets reset for you across a particular port being freed and re-allocated.

> The thing is, my performance testing has shown good results for SMP
> (vcpus=4) and UP, but Bart has reported that he only gets good
> performance when he checks the vcpu_info for the cpu that the interrupt
> has been triggered on.

Well, that doesn't make much sense. He should still check port bindings (a
program 'lsevtchn' should get installed in dom0 for this purpose -- source
is tools/xcutils/lsevtchn.c) just in case. I think lsevtchn will need
extending to print the bound vcpu (it does get returned by the hypercall).
I'll check in a patch to xen-unstable for that, but it's easy to lash up
yourself.
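
For anyone wanting to lash this up by hand before such a patch lands, here
is a sketch of the check using the EVTCHNOP_status structure from Xen's
public event_channel.h. evtchn_status_hypercall() is a placeholder for
whatever privcmd/libxc plumbing you already use; it is not a real library
call:

#include <stdio.h>
#include "xen.h"
#include "event_channel.h"   /* evtchn_status_t, EVTCHNSTAT_* */

/* Placeholder for issuing EVTCHNOP_status against the target domain
 * (e.g. via libxc or the privcmd ioctl); not a real library call.        */
extern int evtchn_status_hypercall(evtchn_status_t *status);

static void dump_evtchn_bindings(domid_t domid, int max_ports)
{
    int port;

    for (port = 0; port < max_ports; port++) {
        evtchn_status_t st;

        st.dom  = domid;
        st.port = port;
        if (evtchn_status_hypercall(&st) != 0)
            break;                       /* past the end of the port space */
        if (st.status == EVTCHNSTAT_closed)
            continue;

        /* st.vcpu is the binding to check: anything other than 0 would
         * explain event channels that are never serviced.                */
        printf("port %3d: status %u, bound to vcpu %u\n",
               port, (unsigned int)st.status, (unsigned int)st.vcpu);
    }
}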

 -- Keir

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 10:25     ` James Harper
@ 2008-09-05 13:18       ` John Levon
  0 siblings, 0 replies; 18+ messages in thread
From: John Levon @ 2008-09-05 13:18 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel, Keir Fraser, bart brooks

On Fri, Sep 05, 2008 at 08:25:15PM +1000, James Harper wrote:

> > > I was just in this code. From what I can tell it's a requirement that
> > > all the domU's event channels be bound to VCPU0, since there's no code
> > > in Xen itself that will redirect evtchn_pending_sel from another VCPU to
> > > CPU0's vcpu_info, and the callback IRQ code assumes VCPU0.
> > 
> > Yes, this is very likely the problem being observed.
> 
> Would I need to do a specific bind_vcpu hypercall to cause an event
> channel to be bound to a vcpu != 0? I'm not doing that.

No, any allocation should default to VCPU0.

regards
john

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05  1:06 Interrupt to CPU routing in HVM domains - again James Harper
  2008-09-05  1:21 ` John Levon
  2008-09-05  7:43 ` Keir Fraser
@ 2008-09-05 15:15 ` Steve Ofsthun
  2008-09-05 15:25   ` John Levon
  2 siblings, 1 reply; 18+ messages in thread
From: Steve Ofsthun @ 2008-09-05 15:15 UTC (permalink / raw)
  To: James Harper; +Cc: xen-devel, Keir Fraser, bart brooks

James Harper wrote:
> (Bart - I hope you don't mind me sending your email to the list)
> 
> Keir,
> 
> As per a recent discussion I modified the IRQ code in the Windows GPLPV
> drivers so that only the vcpu_info[0] structure is used, instead of
> vcpu_info[current_cpu] structure. As per Bart's email below though, this
> has caused him to experience performance issues.
> 
> Have I understood correctly that only cpu 0 of the vcpu_info[] array is
> ever used even if the interrupt actually occurs on another vcpu? Is this
> true for all versions of Xen? It seems that Bart's experience is exactly
> the opposite of mine - the change that fixed up the performance issues
> for me caused performance issues for him...

While the event channel delivery code "binds" HVM event channel interrupts
to VCPU0, the interrupt is delivered via the emulated IOAPIC.  The guest OS
may program this "hardware" to deliver the interrupt to other VCPUs.  For
Linux, this gets done by the irqbalance code, among others.  Xen overrides
this routing for the timer 0 interrupt path in vioapic.c under the #define
IRQ0_SPECIAL_ROUTING.  We hacked our version of Xen to piggyback on this
code and force all event channel interrupts for HVM guests to avoid any
guest rerouting:

#ifdef IRQ0_SPECIAL_ROUTING
    /* Force round-robin to pick VCPU 0 */
    if ( ((irq == hvm_isa_irq_to_gsi(0)) && pit_channel0_enabled()) ||
         is_hvm_callback_irq(vioapic, irq) )
        deliver_bitmask = (uint32_t)1;
#endif

This routing override provides a significant performance boost [or rather
avoids the performance penalty] for SMP PV drivers up until the time that
VCPU0 is saturated with interrupts.  You can probably achieve the same thing
by forcing the guest OS to set its interrupt affinity to VCPU0.  Under
Linux, for example, you can disable the irqbalance code.

Steve

> 
> Bart: Can you have a look through the xen-devel list archives and have a
> read of a thread with a subject of "HVM windows - PCI IRQ firing on both
> CPU's", around the middle of last month? Let me know if you interpret
> that any differently to me...
> 
> Thanks
> 
> James
> 
> 
> 
>> -----Original Message-----
>> From: bart brooks [mailto:bart_brooks@hotmail.com]
>> Sent: Friday, 5 September 2008 01:19
>> To: James Harper
>> Subject: Performance - Update GPLPV drivers -0.9.11-pre12
>> Importance: High
>>
>> Hi James,
>>
>>
>>
>> We have tracked down the issue where performance has dropped off after
>> version 0.9.11-pre9 and still exists in version 0.9.11-pre12.
>>
>> Event channel interrupts for transmit are generated only on VCPU-0,
>> whereas for receive they are generated on all VCPUs in a round robin
>> fashion. Post 0.9.11-pre9 it is assumed that all the interrupts are
>> generated on VCPU-0, so the network interrupts generated on other VCPUs
>> are only processed if there is some activity going on VCPU-0 or an
>> outstanding DPC. This caused packets to be processed out of order and
>> retransmitted. Retransmissions happened after a timeout (200ms) with no
>> activity during that time. Overall it brought down the bandwidth a lot,
>> with huge gaps of no activity.
>>
>>
>>
>> Instead of assuming that everything is on CPU-0, the following change was
>> made in the xenpci driver, in the file evtchn.c, in the function
>> EvtChn_Interrupt():
>>
>> int cpu = KeGetCurrentProcessorNumber() & (MAX_VIRT_CPUS - 1);
>>
>> This is the same code found in version 0.9.11-pre9.
>>
>>
>>
>> After this change, we are getting numbers comparable to 0.9.11-pre9.
>>
>> Bart
>>
>>
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 15:15 ` Steve Ofsthun
@ 2008-09-05 15:25   ` John Levon
  2008-09-05 17:11     ` Steve Ofsthun
  0 siblings, 1 reply; 18+ messages in thread
From: John Levon @ 2008-09-05 15:25 UTC (permalink / raw)
  To: Steve Ofsthun; +Cc: James Harper, xen-devel, bart brooks, Keir Fraser

On Fri, Sep 05, 2008 at 11:15:43AM -0400, Steve Ofsthun wrote:

> While the event channel delivery code "binds" HVM event channel interrupts 
> to VCPU0, the interrupt is delivered via the emulated IOAPIC.  The guest OS 
> may program this "hardware" to deliver the interrupt to other VCPUs.  For 
> linux, this gets done by the irqbalance code among others.  Xen overrides 
> this routing for the timer 0 interrupt path in vioapic.c under the #define 
> IRQ0_SPECIAL_ROUTING.  We hacked our version of Xen to piggyback on this 
> code to force all event channel interrupts for HVM guests to also avoid any 
> guest rerouting:
> 
> #ifdef IRQ0_SPECIAL_ROUTING
>    /* Force round-robin to pick VCPU 0 */
>    if ( ((irq == hvm_isa_irq_to_gsi(0)) && pit_channel0_enabled()) ||
>         is_hvm_callback_irq(vioapic, irq) )
>        deliver_bitmask = (uint32_t)1;
> #endif

Yes, please - Solaris 10 PV drivers are buggy in that they use the
current VCPU's vcpu_info. I just found this bug, and it's getting fixed,
but if this makes sense anyway, it'd be good.

> This routing override provides a significant performance boost [or rather 
> avoids the performance penalty] for SMP PV drivers up until the time that 
> VCPU0 is saturated with interrupts.  You can probably achieve the same 

Of course there's no requirement that the evtchn is actually dealt with
on the same CPU, just the callback IRQ and the evtchn "ack" (clearing
evtchn_upcall_pending).
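
To illustrate that split, a minimal sketch (not the GPLPV code) of an ISR
that does only the ack on the vcpu that took the callback IRQ, with the
per-channel scan deferred. The shared_info_t/vcpu_info_t fields are from
Xen's public headers; handle_port() and the deferred_sel word are
hypothetical:

#include "xen.h"

extern void handle_port(unsigned int port);   /* placeholder */

/* ISR: runs on whichever vcpu received the callback IRQ.  Only the "ack"
 * needs to happen here: clear vcpu0's evtchn_upcall_pending and capture
 * its selector word.  (A real driver does this with atomic operations.)  */
static int callback_isr(shared_info_t *shared, unsigned long *deferred_sel)
{
    vcpu_info_t *vi = &shared->vcpu_info[0];

    if (!vi->evtchn_upcall_pending)
        return 0;                      /* not our interrupt */

    vi->evtchn_upcall_pending = 0;     /* the "ack" */
    *deferred_sel |= vi->evtchn_pending_sel;
    vi->evtchn_pending_sel = 0;
    return 1;                          /* queue deferred processing (DPC) */
}

/* Deferred work: can run later and on any CPU -- nothing requires the
 * per-channel processing to happen on the vcpu that took the IRQ.        */
static void callback_deferred(shared_info_t *shared, unsigned long *deferred_sel)
{
    unsigned long sel = *deferred_sel;
    unsigned int word, bit, bits = sizeof(unsigned long) * 8;

    *deferred_sel = 0;
    for (word = 0; word < bits; word++) {
        if (!(sel & (1UL << word)))
            continue;
        for (bit = 0; bit < bits; bit++) {
            if (shared->evtchn_pending[word] & ~shared->evtchn_mask[word]
                & (1UL << bit))
                handle_port(word * bits + bit);   /* clear bit + dispatch */
        }
    }
}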

regards
john

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 15:25   ` John Levon
@ 2008-09-05 17:11     ` Steve Ofsthun
  2008-09-05 17:28       ` John Levon
  0 siblings, 1 reply; 18+ messages in thread
From: Steve Ofsthun @ 2008-09-05 17:11 UTC (permalink / raw)
  To: John Levon; +Cc: James Harper, xen-devel, bart brooks, Keir Fraser

John Levon wrote:
> On Fri, Sep 05, 2008 at 11:15:43AM -0400, Steve Ofsthun wrote:
> 
>> While the event channel delivery code "binds" HVM event channel interrupts 
>> to VCPU0, the interrupt is delivered via the emulated IOAPIC.  The guest OS 
>> may program this "hardware" to deliver the interrupt to other VCPUs.  For 
>> linux, this gets done by the irqbalance code among others.  Xen overrides 
>> this routing for the timer 0 interrupt path in vioapic.c under the #define 
>> IRQ0_SPECIAL_ROUTING.  We hacked our version of Xen to piggyback on this 
>> code to force all event channel interrupts for HVM guests to also avoid any 
>> guest rerouting:
>>
>> #ifdef IRQ0_SPECIAL_ROUTING
>>    /* Force round-robin to pick VCPU 0 */
>>    if ( ((irq == hvm_isa_irq_to_gsi(0)) && pit_channel0_enabled()) ||
>>         is_hvm_callback_irq(vioapic, irq) )
>>        deliver_bitmask = (uint32_t)1;
>> #endif
> 
> Yes, please - Solaris 10 PV drivers are buggy in that they use the
> current VCPUs vcpu_info. I just found this bug, and it's getting fixed,
> but if this makes sense anyway, it'd be good.

I can submit a patch for this, but we feel this is something of a hack.
We'd like to provide a more general mechanism for allowing event channel
binding to "work" for HVM guests.  But to do this, we are trying to address
conflicting goals.  Either we honor the event channel binding by
circumventing the IOAPIC emulation, or we faithfully emulate the IOAPIC and
circumvent the event channel binding.

Our driver writers would like to see support for multiple callback IRQs.
Then particular event channel interrupts could be bound to particular IRQs.
This would allow PV device interrupts to be distributed intelligently.  It
would also allow net and block interrupts to be disentangled for Windows PV
drivers.

We deal pretty much exclusively with HVM guests; do SMP PV environments
selectively bind device interrupts to different VCPUs?

Steve

>> This routing override provides a significant performance boost [or rather 
>> avoids the performance penalty] for SMP PV drivers up until the time that 
>> VCPU0 is saturated with interrupts.  You can probably achieve the same 
> 
> Of course there's no requirement that the evtchn is actually dealt with
> on the same CPU, just the callback IRQ and the evtchn "ack" (clearing
> evtchn_upcall_pending).
> 
> regards
> john

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 17:11     ` Steve Ofsthun
@ 2008-09-05 17:28       ` John Levon
  2008-09-05 18:47         ` Steve Ofsthun
  2008-09-06  7:48         ` Keir Fraser
  0 siblings, 2 replies; 18+ messages in thread
From: John Levon @ 2008-09-05 17:28 UTC (permalink / raw)
  To: Steve Ofsthun; +Cc: James Harper, xen-devel, bart brooks, Keir Fraser

On Fri, Sep 05, 2008 at 01:11:41PM -0400, Steve Ofsthun wrote:

> >>#ifdef IRQ0_SPECIAL_ROUTING
> >>   /* Force round-robin to pick VCPU 0 */
> >>   if ( ((irq == hvm_isa_irq_to_gsi(0)) && pit_channel0_enabled()) ||
> >>        is_hvm_callback_irq(vioapic, irq) )
> >>       deliver_bitmask = (uint32_t)1;
> >>#endif
> >
> >Yes, please - Solaris 10 PV drivers are buggy in that they use the
> >current VCPUs vcpu_info. I just found this bug, and it's getting fixed,
> >but if this makes sense anyway, it'd be good.
> 
> I can submit a patch for this, but we feel this is something of a hack.  

Yep.

> We'd like to provide a more general mechanism for allowing event channel 
> binding to "work" for HVM guests.  But to do this, we are trying to address 
> conflicting goals.  Either we honor the event channel binding by 
> circumventing the IOAPIC emulation, or we faithfully emulate the IOAPIC and 
> circumvent the event channel binding.

Well, this doesn't really make sense anyway as is: the IRQ binding has little
to do with where the evtchns are handled (I don't think there's any
requirement that they both happen on the same CPU).

> Our driver writers would like to see support for multiple callback IRQs.  
> Then particular event channel interrupts could be bound to particular IRQs. 
> This would allow PV device interrupts to be distributed intelligently.  It 
> would also allow net and block interrupts to be disentangled for Windows PV 
> drivers.

You could do a bunch of that just by distributing them from the single
callback IRQ. But I suppose it would be nice to move to a
one-IRQ-per-evtchn model. You'd need to keep the ABI of course, so you'd
need a feature flag or something.
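
As a sketch of the "distribute from the single callback IRQ" option: the bus
driver could keep a per-port handler table, so block and net events still
reach separate handlers even though they arrive on one interrupt line. All
names here are hypothetical, not an existing API:

#include <stddef.h>

#define NR_EVTCHN_PORTS 1024     /* enough for the ports a guest uses */

typedef void (*evtchn_handler_t)(unsigned int port, void *context);

static struct {
    evtchn_handler_t handler;
    void            *context;
} port_table[NR_EVTCHN_PORTS];

/* Called by xenvbd/xennet-style child drivers at start-of-day. */
static void evtchn_register_handler(unsigned int port,
                                    evtchn_handler_t handler, void *context)
{
    port_table[port].handler = handler;
    port_table[port].context = context;
}

/* Called from the single callback IRQ's deferred processing for each port
 * found pending: block and net work is dispatched separately even though
 * it all arrived on one interrupt line.                                   */
static void evtchn_dispatch(unsigned int port)
{
    if (port < NR_EVTCHN_PORTS && port_table[port].handler != NULL)
        port_table[port].handler(port, port_table[port].context);
}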

> We deal pretty much exclusively with HVM guests, do SMP PV environments 
> selectively bind device interrupts to different VCPUs?

For true PV you can bind evtchns at will.

regards
john

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 17:28       ` John Levon
@ 2008-09-05 18:47         ` Steve Ofsthun
  2008-09-06  7:48         ` Keir Fraser
  1 sibling, 0 replies; 18+ messages in thread
From: Steve Ofsthun @ 2008-09-05 18:47 UTC (permalink / raw)
  To: John Levon; +Cc: James Harper, xen-devel, bart brooks, Keir Fraser

John Levon wrote:
> On Fri, Sep 05, 2008 at 01:11:41PM -0400, Steve Ofsthun wrote:
> 
>>>> #ifdef IRQ0_SPECIAL_ROUTING
>>>>   /* Force round-robin to pick VCPU 0 */
>>>>   if ( ((irq == hvm_isa_irq_to_gsi(0)) && pit_channel0_enabled()) ||
>>>>        is_hvm_callback_irq(vioapic, irq) )
>>>>       deliver_bitmask = (uint32_t)1;
>>>> #endif
>>> Yes, please - Solaris 10 PV drivers are buggy in that they use the
>>> current VCPUs vcpu_info. I just found this bug, and it's getting fixed,
>>> but if this makes sense anyway, it'd be good.
>> I can submit a patch for this, but we feel this is something of a hack.  
> 
> Yep.

OK, I'll throw a patch together.

>> We'd like to provide a more general mechanism for allowing event channel 
>> binding to "work" for HVM guests.  But to do this, we are trying to address 
>> conflicting goals.  Either we honor the event channel binding by 
>> circumventing the IOAPIC emulation, or we faithfully emulate the IOAPIC and 
>> circumvent the event channel binding.
> 
> Well, this doesn't really make sense anyway as is: the IRQ binding has little
> to do with where the evtchns are handled (I don't think there's any
> requirement that they both happen on the same CPU).

Yes, there is no requirement, but there is a significant latency penalty for
redirecting an event channel interrupt through an IRQ routed to a different
VCPU.  Think tens of milliseconds of delay, minimum, due to Xen scheduling
on a busy node (the current VCPU will not yield unless it is idle).  Add to
this the fact that almost any significant I/O load on an HVM Windows guest
becomes CPU bound quickly (so your scheduling priority is reduced).

>> Our driver writers would like to see support for multiple callback IRQs.  
>> Then particular event channel interrupts could be bound to particular IRQs. 
>> This would allow PV device interrupts to be distributed intelligently.  It 
>> would also allow net and block interrupts to be disentangled for Windows PV 
>> drivers.
> 
> You could do a bunch of that just by distributing them from the single
> callback IRQ. But I suppose it would be nice to move to a
> one-IRQ-per-evtchn model. You'd need to keep the ABI of course, so you'd
> need a feature flag or something.

Distributing from a single IRQ works OK for Linux, but doesn't work very
well for older versions of Windows.  For block handling you want to deliver
the real interrupts in SCSI miniport context.  The network can deal with the
interrupt redirect.  But the network easily generates the highest interrupt
rates and is sensitive to additional latency.  So you end up slowing SCSI
down with "extra" network interrupts, and slowing the network down with
increased interrupt latency.  Delivering net and block interrupts
independently would avoid these issues.  Delivering interrupts to a bus
driver and forwarding these to virtual device drivers directly is only an
option on newer versions of Windows.

>> We deal pretty much exclusively with HVM guests, do SMP PV environments 
>> selectively bind device interrupts to different VCPUs?
> 
> For true PV you can bind evtchns at will.
> 
> regards
> john

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-05 17:28       ` John Levon
  2008-09-05 18:47         ` Steve Ofsthun
@ 2008-09-06  7:48         ` Keir Fraser
  2008-09-06  7:59           ` Keir Fraser
  2008-09-06 12:21           ` James Harper
  1 sibling, 2 replies; 18+ messages in thread
From: Keir Fraser @ 2008-09-06  7:48 UTC (permalink / raw)
  To: John Levon, Steve Ofsthun; +Cc: James Harper, xen-devel, bart brooks

On 5/9/08 18:28, "John Levon" <levon@movementarian.org> wrote:

>> Our driver writers would like to see support for multiple callback IRQs.
>> Then particular event channel interrupts could be bound to particular IRQs.
>> This would allow PV device interrupts to be distributed intelligently.  It
>> would also allow net and block interrupts to be disentangled for Windows PV
>> drivers.
> 
> You could do a bunch of that just by distributing them from the single
> callback IRQ. But I suppose it would be nice to move to a
> one-IRQ-per-evtchn model. You'd need to keep the ABI of course, so you'd
> need a feature flag or something.

Yes, it should work as follows:
one-IRQ-per-evtchn and turn off the usual PV per-VCPU selector word and
evtchn_pending master flag (as we have already disabled the evtchn_mask
master flag). Then all evtchn re-routing would be handled through the
IO-APIC like all other emulated IRQs.

I'd consider a clean patch to do this (would need to be backward compatible
with old PV-on-HVM drivers). I won't take hacks of the sort Steve posted
(not that he expected me to!).

 -- Keir

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Interrupt to CPU routing in HVM domains - again
  2008-09-06  7:48         ` Keir Fraser
@ 2008-09-06  7:59           ` Keir Fraser
  2008-09-06 12:21           ` James Harper
  1 sibling, 0 replies; 18+ messages in thread
From: Keir Fraser @ 2008-09-06  7:59 UTC (permalink / raw)
  To: John Levon, Steve Ofsthun; +Cc: James Harper, xen-devel, bart brooks

On 6/9/08 08:48, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> You could do a bunch of that just by distributing them from the single
>> callback IRQ. But I suppose it would be nice to move to a
>> one-IRQ-per-evtchn model. You'd need to keep the ABI of course, so you'd
>> need a feature flag or something.
> 
> Yes, it should work as follows:
> one-IRQ-per-evtchn and turn off the usual PV per-VCPU selector word and
> evtchn_pending master flag (as we have already disabled the evtchn_mask master
> flag). Then all evtchn re-routing would be handled through the IO-APIC like
> all other emulated IRQs.

Oh, another way I like (on the Xen side of the interface, at least) is to
keep the PV evtchn VCPU binding mechanisms and per-VCPU selector word and
evtchn_pending master flag. But then do 'direct FSB injection' to specific
VCPU LAPICs. That is, each VCPU would receive its event-channel
notifications on a pre-arranged vector via a simulated message to its LAPIC.

This is certainly easier to implement on the Xen side, and I would say
neater too. However, its significant drawback is that this doesn't likely
fit very well with existing OS IRQ subsystems:
 * The OS still needs to demux interrupts and hence effectively has a nested
layer of interrupt delivery (if we want the specific event channels to be
visible as distinct interrupts to the OS).
 * There is a PV-specific way of reassigning 'IRQs' to VCPUs. The usual OS
methods of tickling the IO-APIC will not apply.
 * The OS may well not even have a way of allocating an interrupt vector or
otherwise registering interest in, and receiving CPU-specific interrupts on,
a specific interrupt vector.

Obviously this could all be worked through for Linux guests by extending our
existing pv_ops implementation a little. I think it would fit well. But I
have doubts this could work well for other OSes where we can make less
far-reaching changes (Windows for example).

 -- Keir

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Interrupt to CPU routing in HVM domains - again
  2008-09-06  7:48         ` Keir Fraser
  2008-09-06  7:59           ` Keir Fraser
@ 2008-09-06 12:21           ` James Harper
  2008-09-08 16:51             ` bart brooks
  1 sibling, 1 reply; 18+ messages in thread
From: James Harper @ 2008-09-06 12:21 UTC (permalink / raw)
  To: bart brooks, Keir Fraser; +Cc: xen-devel

Bart,

I'm pretty sure I've identified the problem. The fact that the output of
the debugging patch I sent you shows that sometimes one xennet isr is
logged and sometimes two are logged was a bit of a clue. I assume you
have two network adapters?

Please make the following change to xennet.c:

"
diff -r ea14db3ca6f2 xennet/xennet.c
--- a/xennet/xennet.c   Thu Sep 04 22:31:38 2008 +1000
+++ b/xennet/xennet.c   Sat Sep 06 22:14:36 2008 +1000
@@ -137,7 +137,7 @@ XenNet_InterruptIsr(
   else
   {
     *QueueMiniportHandleInterrupt = (BOOLEAN)!!xi->connected;
-    *InterruptRecognized = TRUE;
+    *InterruptRecognized = FALSE;
   }

   //FUNCTION_EXIT();
"

Xennet's isr was reporting that the interrupt was handled, which it was,
but xennet needs to report that it was not handled so that the next isr
in the chain is triggered. Your change to the isr in evtchn.c would have
caused spurious interrupts which had the side effect of ensuring that
the second xennet isr was triggered eventually.
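
For readers unfamiliar with NDIS 5 miniports, a hedged sketch of the ISR
shape being described: the parameter names follow the MiniportISR prototype
and the snippet above, but the structure body and logic here are simplified
and illustrative rather than the real xennet code:

#include <ndis.h>

/* Simplified, illustrative xennet_info -- the real structure differs. */
struct xennet_info {
    BOOLEAN connected;
};

/* On a shared (chained) interrupt line, returning InterruptRecognized =
 * FALSE tells NDIS to keep walking the ISR chain, so the other xennet
 * instance also gets a chance to see the interrupt, while
 * QueueMiniportHandleInterrupt still schedules this instance's DPC.      */
VOID XenNet_InterruptIsr(
    OUT PBOOLEAN InterruptRecognized,
    OUT PBOOLEAN QueueMiniportHandleInterrupt,
    IN NDIS_HANDLE MiniportAdapterContext)
{
    struct xennet_info *xi = (struct xennet_info *)MiniportAdapterContext;

    *QueueMiniportHandleInterrupt = (BOOLEAN)!!xi->connected;
    *InterruptRecognized = FALSE;   /* let the next ISR in the chain run */
}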

I hope that makes sense. I'm too tired to re-read it to make sure :)

I'm not completely sure if it will work... I don't know how NDIS will
react when we tell it that it needs to schedule a DPC even though we are
also telling it that the interrupt was not for us.

Let me know the outcome and I'll release an update if it works.

Thanks

James

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Interrupt to CPU routing in HVM domains - again
  2008-09-06 12:21           ` James Harper
@ 2008-09-08 16:51             ` bart brooks
  2008-09-09  0:41               ` James Harper
  0 siblings, 1 reply; 18+ messages in thread
From: bart brooks @ 2008-09-08 16:51 UTC (permalink / raw)
  To: James Harper, Keir Fraser; +Cc: xen-devel




Hi James,
 
We made the change and it did not help the perf issue.
 
Bart

> Subject: RE: [Xen-devel] Interrupt to CPU routing in HVM domains - again
> Date: Sat, 6 Sep 2008 22:21:47 +1000
> From: james.harper@bendigoit.com.au
> To: bart_brooks@hotmail.com; keir.fraser@eu.citrix.com
> CC: xen-devel@lists.xensource.com
> 
> Bart,
> 
> I'm pretty sure I've identified the problem. The fact that the output of
> the debugging patch I sent you shows that sometimes one xennet isr is
> logged and sometimes two are logged was a bit of a clue. I assume you
> have two network adapters?
> 
> Please make the following change to xennet.c:
> 
> "
> diff -r ea14db3ca6f2 xennet/xennet.c
> --- a/xennet/xennet.c   Thu Sep 04 22:31:38 2008 +1000
> +++ b/xennet/xennet.c   Sat Sep 06 22:14:36 2008 +1000
> @@ -137,7 +137,7 @@ XenNet_InterruptIsr(
>    else
>    {
>      *QueueMiniportHandleInterrupt = (BOOLEAN)!!xi->connected;
> -    *InterruptRecognized = TRUE;
> +    *InterruptRecognized = FALSE;
>    }
> 
>    //FUNCTION_EXIT();
> "
> 
> Xennet's isr was reporting that the interrupt was handled, which it was,
> but xennet needs to report that it was not handled so that the next isr
> in the chain is triggered. Your change to the isr in evtchn.c would have
> caused spurious interrupts which had the side effect of ensuring that
> the second xennet isr was triggered eventually.
> 
> I hope that makes sense. I'm too tired to re-read it to make sure :)
> 
> I'm not completely sure if it will work... I don't know how ndis will
> react when we tell it that it needs to schedule a dpc even though we are
> also telling it that the interrupt was not for us.
> 
> Let me know the outcome and I'll release an update if it works.
> 
> Thanks
> 
> James

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: Interrupt to CPU routing in HVM domains - again
  2008-09-08 16:51             ` bart brooks
@ 2008-09-09  0:41               ` James Harper
  0 siblings, 0 replies; 18+ messages in thread
From: James Harper @ 2008-09-09  0:41 UTC (permalink / raw)
  To: bart brooks, Keir Fraser; +Cc: xen-devel

> 
> Hi James,
> 
> We make the change and it did not help the perf issue; as result
> 

That's a shame... I was sure it was going to fix it.

Does the problem occur when you only have one network adapter?

James

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2008-09-09  0:41 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-09-05  1:06 Interrupt to CPU routing in HVM domains - again James Harper
2008-09-05  1:21 ` John Levon
2008-09-05  7:44   ` Keir Fraser
2008-09-05 10:25     ` James Harper
2008-09-05 13:18       ` John Levon
2008-09-05  7:43 ` Keir Fraser
2008-09-05 10:13   ` James Harper
2008-09-05 10:32     ` Keir Fraser
2008-09-05 15:15 ` Steve Ofsthun
2008-09-05 15:25   ` John Levon
2008-09-05 17:11     ` Steve Ofsthun
2008-09-05 17:28       ` John Levon
2008-09-05 18:47         ` Steve Ofsthun
2008-09-06  7:48         ` Keir Fraser
2008-09-06  7:59           ` Keir Fraser
2008-09-06 12:21           ` James Harper
2008-09-08 16:51             ` bart brooks
2008-09-09  0:41               ` James Harper
