* IRQ affinity enforced only after first interrupt.
@ 2012-03-26  9:06 Yevgeny Petrilin
  2012-03-26 14:28 ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Yevgeny Petrilin @ 2012-03-26  9:06 UTC (permalink / raw)
  To: linux-pci

Hello,

I'm working on an issue where a change to an IRQ's affinity only takes effect after the next interrupt, which is still delivered on the original core.
As I understand it, the decision is made in this code:

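	if (irq_can_move_pcntxt(data)) {
		/* The chip can be reprogrammed from process context: do it now. */
		ret = chip->irq_set_affinity(data, mask, false);
		switch (ret) {
		case IRQ_SET_MASK_OK:
			cpumask_copy(data->affinity, mask);
			/* fall through */
		case IRQ_SET_MASK_OK_NOCOPY:
			irq_set_thread_affinity(desc);
			ret = 0;
		}
	} else {
		/* Otherwise only record the request; it is applied later. */
		irqd_set_move_pending(data);
		irq_copy_pending(desc, mask);
	}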

This means that the "IRQD_MOVE_PCNTXT" flag is not set in irq_data->state_use_accessors for this IRQ.
I was able to set the flag using irq_modify_status(), but that is probably not the right way to do it.
That option also doesn't exist in older kernels (2.6.32).

So the question is: when the irq_desc is created, what determines whether the "IRQD_MOVE_PCNTXT" flag is set?

Thanks,
Yevgeny


* Re: IRQ affinity enforced only after first interrupt.
  2012-03-26  9:06 IRQ affinity enforced only after first interrupt Yevgeny Petrilin
@ 2012-03-26 14:28 ` Bjorn Helgaas
  2012-03-26 14:33   ` Jiang Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2012-03-26 14:28 UTC (permalink / raw)
  To: Yevgeny Petrilin; +Cc: linux-pci, Thomas Gleixner, linux-kernel

[This is not really a PCI question, so +cc Thomas, LKML.]



* Re: IRQ affinity enforced only after first interrupt.
  2012-03-26 14:28 ` Bjorn Helgaas
@ 2012-03-26 14:33   ` Jiang Liu
  2012-03-26 15:24     ` Yevgeny Petrilin
  0 siblings, 1 reply; 9+ messages in thread
From: Jiang Liu @ 2012-03-26 14:33 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yevgeny Petrilin, linux-pci, Thomas Gleixner, linux-kernel

The architecture specific code will determine whether the IRQ could be migrated
in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
systems if interrupt remapping is enabled.
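
This thread uses two flag names: IRQ_MOVE_PCNTXT and IRQD_MOVE_PCNTXT. The following is a minimal userspace sketch of how they relate, not kernel code. It assumes that chip or architecture setup code sets the IRQ_MOVE_PCNTXT setting on the descriptor, and that irq_modify_status() mirrors that setting into the IRQD_MOVE_PCNTXT bit that irq_can_move_pcntxt() tests. The model_* names are hypothetical and the bit positions are illustrative only; check include/linux/irq.h for the real values.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit positions only (assumption, not taken from this thread). */
#define IRQ_MOVE_PCNTXT   (1u << 14)  /* descriptor setting, set by chip/arch code */
#define IRQD_MOVE_PCNTXT  (1u << 15)  /* mirrored state bit inside irq_data */

struct model_irq_data {
	unsigned int state_use_accessors;   /* IRQD_* state bits */
};

struct model_irq_desc {
	unsigned int settings;              /* IRQ_* setting bits */
	struct model_irq_data irq_data;
};

/* Models irq_modify_status(): update the settings, then resync the IRQD_ bit. */
static void model_irq_modify_status(struct model_irq_desc *desc,
				    unsigned int clr, unsigned int set)
{
	assert(desc);
	desc->settings = (desc->settings & ~clr) | set;

	desc->irq_data.state_use_accessors &= ~IRQD_MOVE_PCNTXT;
	if (desc->settings & IRQ_MOVE_PCNTXT)
		desc->irq_data.state_use_accessors |= IRQD_MOVE_PCNTXT;
}

/* Models irq_can_move_pcntxt(): the test made by the affinity code. */
static bool model_irq_can_move_pcntxt(const struct model_irq_data *data)
{
	assert(data);
	return (data->state_use_accessors & IRQD_MOVE_PCNTXT) != 0;
}

/* Hypothetical setup hook: mark the IRQ only when remapping is in use. */
static void model_setup_irq(struct model_irq_desc *desc, bool irq_remapped)
{
	if (irq_remapped)
		model_irq_modify_status(desc, 0, IRQ_MOVE_PCNTXT);
}
```

In this model the driver never touches the flag; whether it ends up set depends only on what the interrupt controller setup code decided.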




* RE: IRQ affinity enforced only after first interrupt.
  2012-03-26 14:33   ` Jiang Liu
@ 2012-03-26 15:24     ` Yevgeny Petrilin
  2012-03-26 19:04       ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Yevgeny Petrilin @ 2012-03-26 15:24 UTC (permalink / raw)
  To: Jiang Liu, Bjorn Helgaas
  Cc: linux-pci, Thomas Gleixner, linux-kernel, Yael Shenhav

> 
> The architecture specific code will determine whether the IRQ could be migrated
> in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
> systems if interrupt remapping is enabled.

Actually I am encountering this issue with x86, and see different behavior with different
HW devices (NICs). On the same machine I have one device that responds immediately to affinity changes
while the other one changes the affinity only after the first interrupt.




* RE: IRQ affinity enforced only after first interrupt.
  2012-03-26 15:24     ` Yevgeny Petrilin
@ 2012-03-26 19:04       ` Thomas Gleixner
  2012-03-27  9:39         ` Yevgeny Petrilin
  2012-04-04 10:01         ` Alexander Gordeev
  0 siblings, 2 replies; 9+ messages in thread
From: Thomas Gleixner @ 2012-03-26 19:04 UTC (permalink / raw)
  To: Yevgeny Petrilin
  Cc: Jiang Liu, Bjorn Helgaas, linux-pci, linux-kernel, Yael Shenhav

On Mon, 26 Mar 2012, Yevgeny Petrilin wrote:

> > 
> > The architecture specific code will determine whether the IRQ could be migrated
> > in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
> > systems if interrupt remapping is enabled.
>
> Actually I am encountering this issue with x86, and see different
> behavior with different HW devices (NICs). On the same machine I have
> one device that responds immediately to affinity changes while the
> other one changes the affinity only after the first interrupt.

That simply depends on the underlying hardware. On certain hardware we
can change the affinity only in hard interrupt context, that means
right when an interrupt of that device is delivered.

On the other devices we can change it right away and the corresponding
interrupt chips set IRQ_MOVE_PCNTXT to indicate that.

There is nothing we can do about this. It's dictated by hardware.
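
The two cases described above can be modelled in userspace as follows. This is only a sketch under stated assumptions: `struct model_irq` and the `model_*` functions are hypothetical stand-ins for the descriptor state and the genirq paths, and plain `unsigned long` values stand in for CPU masks.

```c
#include <assert.h>
#include <stdbool.h>

struct model_irq {
	bool can_move_pcntxt;        /* chip allows moves from process context */
	bool move_pending;           /* state set by irqd_set_move_pending() */
	unsigned long affinity;      /* mask currently programmed in "hardware" */
	unsigned long pending_mask;  /* mask waiting to be programmed */
};

/* Request a new affinity from process context. */
static void model_set_affinity(struct model_irq *irq, unsigned long mask)
{
	assert(mask != 0);
	if (irq->can_move_pcntxt) {
		irq->affinity = mask;         /* reprogram right away */
	} else {
		irq->move_pending = true;     /* defer to the next interrupt */
		irq->pending_mask = mask;
	}
}

/*
 * Deliver one interrupt. Returns the mask it was delivered to, then
 * applies a deferred move the way the hard interrupt path would.
 */
static unsigned long model_handle_irq(struct model_irq *irq)
{
	unsigned long delivered_to = irq->affinity;

	if (irq->move_pending) {
		irq->affinity = irq->pending_mask;
		irq->move_pending = false;
	}
	return delivered_to;
}
```

With `can_move_pcntxt` false, the first call to `model_handle_irq()` after a change still reports the old mask, and only the second call reports the new one. That matches the behavior reported at the start of this thread.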

Thanks,

	tglx


* RE: IRQ affinity enforced only after first interrupt.
  2012-03-26 19:04       ` Thomas Gleixner
@ 2012-03-27  9:39         ` Yevgeny Petrilin
  2012-03-27 12:52           ` Thomas Gleixner
  2012-04-04 10:01         ` Alexander Gordeev
  1 sibling, 1 reply; 9+ messages in thread
From: Yevgeny Petrilin @ 2012-03-27  9:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jiang Liu, Bjorn Helgaas, linux-pci, linux-kernel, Yael Shenhav

> > >
> > > The architecture specific code will determine whether the IRQ could be migrated
> > > in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
> > > systems if interrupt remapping is enabled.
> >
> > Actually I am encountering this issue with x86, and see different
> > behavior with different HW devices (NICs). On the same machine I have
> > one device that responds immediately to affinity changes while the
> > other one changes the affinity only after the first interrupt.
> 
> That simply depends on the underlying hardware. On certain hardware we
> can change the affinity only in hard interrupt context, that means
> right when an interrupt of that device is delivered.
> 
> On the other devices we can change it right away and the corresponding
> interrupt chips set IRQ_MOVE_PCNTXT to indicate that.
> 
> There is nothing we can do about this. It's dictated by hardware.
> 

Thanks for the explanation,
Which capabilities of the HW show whether IRQ_MOVE_PCNTXT can be set or not?
Is it done by reading configuration from PCI?

Thanks,
Yevgeny


* RE: IRQ affinity enforced only after first interrupt.
  2012-03-27  9:39         ` Yevgeny Petrilin
@ 2012-03-27 12:52           ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2012-03-27 12:52 UTC (permalink / raw)
  To: Yevgeny Petrilin
  Cc: Jiang Liu, Bjorn Helgaas, linux-pci, linux-kernel, Yael Shenhav

On Tue, 27 Mar 2012, Yevgeny Petrilin wrote:

> > > >
> > > > The architecture specific code will determine whether the IRQ could be migrated
> > > > in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
> > > > systems if interrupt remapping is enabled.
> > >
> > > Actually I am encountering this issue with x86, and see different
> > > behavior with different HW devices (NICs). On the same machine I have
> > > one device that responds immediately to affinity changes while the
> > > other one changes the affinity only after the first interrupt.
> > 
> > That simply depends on the underlying hardware. On certain hardware we
> > can change the affinity only in hard interrupt context, that means
> > right when an interrupt of that device is delivered.
> > 
> > On the other devices we can change it right away and the corresponding
> > interrupt chips set IRQ_MOVE_PCNTXT to indicate that.
> > 
> > There is nothing we can do about this. It's dictated by hardware.
> > 
> 
> Thanks for the explanation,
> Which capabilities of the HW show whether IRQ_MOVE_PCNTXT can be set or not?
> Is it done by reading configuration from PCI?

It's done by reading the specs of the interrupt controllers. This is
not at the PCI (device) level. It's a property of the interrupt controller
(PIC, APIC, IOAPIC) and of additional features like interrupt remapping.

The device merely uses an interrupt; it does not know at all which
underlying interrupt controller is handling it.

The only choice a device driver has is between pin based interrupts
and Message Signaled Interrupts, when the hardware supports it. This
information is retrieved from the PCI config space.
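
As an illustration of the part that does come from PCI config space, here is a minimal userspace sketch that walks the capability list of a 256-byte config space image and looks for the MSI and MSI-X capabilities. The register offsets and capability IDs follow the PCI specification. The function names, the buffer-based interface, and the walk limit are assumptions made for this example; in the kernel a driver would use the PCI core helpers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_STATUS            0x06  /* 16-bit status register */
#define PCI_STATUS_CAP_LIST   0x10  /* status bit: capability list present */
#define PCI_CAPABILITY_LIST   0x34  /* offset of the first capability pointer */
#define PCI_CAP_ID_MSI        0x05
#define PCI_CAP_ID_MSIX       0x11
#define CFG_SPACE_SIZE        256
#define CAP_WALK_LIMIT        48    /* guard against looping lists (assumption) */

/* Return the offset of capability cap_id in cfg, or 0 if it is absent. */
static uint8_t find_capability(const uint8_t cfg[CFG_SPACE_SIZE], uint8_t cap_id)
{
	uint16_t status;
	uint8_t pos;
	int ttl = CAP_WALK_LIMIT;

	assert(cfg);
	status = (uint16_t)(cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8));
	if (!(status & PCI_STATUS_CAP_LIST))
		return 0;

	/* The low two bits of every capability pointer are reserved. */
	pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
	while (pos >= 0x40 && ttl-- > 0) {
		if (cfg[pos] == cap_id)
			return pos;
		pos = cfg[pos + 1] & 0xFC;  /* byte 1 of each entry: next pointer */
	}
	return 0;
}

/* True if the device advertises MSI or MSI-X. */
static bool device_supports_msi(const uint8_t cfg[CFG_SPACE_SIZE])
{
	return find_capability(cfg, PCI_CAP_ID_MSI) != 0 ||
	       find_capability(cfg, PCI_CAP_ID_MSIX) != 0;
}
```

Nothing in this walk says whether the interrupt can be moved in process context; it only tells the driver which kind of interrupt the device can signal.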

Thanks,

	tglx


* Re: IRQ affinity enforced only after first interrupt.
  2012-03-26 19:04       ` Thomas Gleixner
  2012-03-27  9:39         ` Yevgeny Petrilin
@ 2012-04-04 10:01         ` Alexander Gordeev
  2012-04-05  8:47           ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Gordeev @ 2012-04-04 10:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Yevgeny Petrilin, Jiang Liu, Bjorn Helgaas, linux-pci,
	linux-kernel, Yael Shenhav

On Mon, Mar 26, 2012 at 09:04:20PM +0200, Thomas Gleixner wrote:
> On Mon, 26 Mar 2012, Yevgeny Petrilin wrote:
> 
> > > 
> > > The architecture specific code will determine whether the IRQ could be migrated
> > > in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
> > > systems if interrupt remapping is enabled.
> >
> > Actually I am encountering this issue with x86, and see different
> > behavior with different HW devices (NICs). On the same machine I have
> > one device that responds immediately to affinity changes while the
> > other one changes the affinity only after the first interrupt.
> 
> That simply depends on the underlying hardware. On certain hardware we
> can change the affinity only in hard interrupt context, that means
> right when an interrupt of that device is delivered.
> 
> On the other devices we can change it right away and the corresponding
> interrupt chips set IRQ_MOVE_PCNTXT to indicate that.

Actually, even with IRQ_MOVE_PCNTXT-capable chips, a hardware handler might
still be called on a core that belongs to the old affinity after the new
affinity has been written successfully. Threaded handlers are also racy with
respect to irq affinity updates.

Is that an inconsistency, a bug, or by design?

> There is nothing we can do about this. It's dictated by hardware.

Maybe we could wait for desc->pending_mask to be cleared before returning from
irq_set_affinity()?

> Thanks,
> 
> 	tglx

-- 
Regards,
Alexander Gordeev
agordeev@redhat.com


* Re: IRQ affinity enforced only after first interrupt.
  2012-04-04 10:01         ` Alexander Gordeev
@ 2012-04-05  8:47           ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2012-04-05  8:47 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: Yevgeny Petrilin, Jiang Liu, Bjorn Helgaas, linux-pci,
	linux-kernel, Yael Shenhav

On Wed, 4 Apr 2012, Alexander Gordeev wrote:

> On Mon, Mar 26, 2012 at 09:04:20PM +0200, Thomas Gleixner wrote:
> > On Mon, 26 Mar 2012, Yevgeny Petrilin wrote:
> > 
> > > > 
> > > > The architecture specific code will determine whether the IRQ could be migrated
> > > > in process context. For example, the IRQ_MOVE_PCNTXT flag will be set on x86
> > > > systems if interrupt remapping is enabled.
> > >
> > > Actually I am encountering this issue with x86, and see different
> > > behavior with different HW devices (NICs). On the same machine I have
> > > one device that responds immediately to affinity changes while the
> > > other one changes the affinity only after the first interrupt.
> > 
> > That simply depends on the underlying hardware. On certain hardware we
> > can change the affinity only in hard interrupt context, that means
> > right when an interrupt of that device is delivered.
> > 
> > On the other devices we can change it right away and the corresponding
> > interrupt chips set IRQ_MOVE_PCNTXT to indicate that.
> 
> Actually, even with IRQ_MOVE_PCNTXT-capable chips, a hardware handler might
> still be called on a core that belongs to the old affinity after the new
> affinity has been written successfully. Threaded handlers are also racy with
> respect to irq affinity updates.
> 
> Is that an inconsistency, a bug, or by design?

Well, irq affinity updates are not designed to be immediate. There is
no point in doing so.
 
> > There is nothing we can do about this. It's dictated by hardware.
> 
> Maybe we could wait for desc->pending_mask to be cleared before returning from
> irq_set_affinity()?

If that device does not issue an interrupt for a long time,
e.g. because the interface is down, then you are stuck there forever.

What's the point of this? One interrupt on the wrong core is nothing
we need to worry about.
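
A small userspace sketch of the objection above, under the assumption that a deferred move completes only when the device raises its next interrupt. `struct deferred_move` and `model_wait_for_move()` are hypothetical names, and polling with a bound stands in for a real wait.

```c
#include <assert.h>
#include <stdbool.h>

struct deferred_move {
	bool move_pending;   /* true until the next interrupt applies the move */
};

/*
 * Poll until the deferred move has been applied.
 * irq_at_poll: poll index at which the device raises its next interrupt,
 *              or -1 if it never does (e.g. the interface is down).
 * Returns the number of polls waited, or -1 after giving up at max_polls.
 */
static int model_wait_for_move(struct deferred_move *irq, int irq_at_poll,
			       int max_polls)
{
	int poll;

	assert(max_polls >= 0);
	for (poll = 0; poll < max_polls; poll++) {
		if (poll == irq_at_poll)
			irq->move_pending = false;  /* interrupt arrived */
		if (!irq->move_pending)
			return poll;
	}
	return -1;  /* without the bound, this would wait forever */
}
```

With `irq_at_poll` set to -1 the function only returns because of the `max_polls` bound; an unbounded wait inside irq_set_affinity() would have no such exit.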

Thanks,

	tglx
