* IRQ migration on CPU offline path
@ 2011-12-14 18:34 Will Deacon
  2011-12-15  1:25 ` Eric W. Biederman
  0 siblings, 1 reply; 5+ messages in thread
From: Will Deacon @ 2011-12-14 18:34 UTC (permalink / raw)
  To: tglx; +Cc: linux-kernel

Hi Thomas,

I've been looking at the IRQ migration code on x86 (fixup_irqs) for the CPU
hotplug path in order to try and fix a bug we have on ARM with the
desc->affinity mask not getting updated. Compared to irq_set_affinity, the code
is pretty whacky (I guess because it's called via stop_machine) so I wondered if
you could help me understand a few points:

(1) Affinity notifiers - we seem to ignore these and I guess they don't expect
    to be called from this context. It could lead to the cpu_rmap stuff being
    bogus, but that's not used for anything much. Do we just depend on people
    having hotplug notifiers to deal with this?

(2) Threaded handlers - I can't see where we update irqaction->thread_flags with
    IRQTF_AFFINITY for handlers that are migrated by the scheduler when a CPU
    goes down. Is this required?

(3) On x86, we rely on the irq_chip updating the desc->affinity mask in
    ->irq_set_affinity. It seems like we could use IRQ_SET_MASK_OK{_NOCOPY} for
    this and, in the case of the ioapic, return IRQ_SET_MASK_OK_NOCOPY (removing
    a redundant copy from the usual affinity path).
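
To make (3) concrete, here is a rough, untested sketch of what I have in
mind for a chip callback (the foo_* names and the hardware poke are made
up, not a real driver):

#include <linux/irq.h>
#include <linux/cpumask.h>
#include <linux/errno.h>

/* Hypothetical helper that pokes the routing register; not a real function. */
void foo_hw_route_irq(unsigned int irq, unsigned int cpu);

/*
 * Program the hardware for a single target CPU, record what we actually
 * did in the irq_data affinity mask ourselves, and return
 * IRQ_SET_MASK_OK_NOCOPY so the core skips its own copy.
 */
static int foo_chip_set_affinity(struct irq_data *d,
				 const struct cpumask *mask_val, bool force)
{
	unsigned int cpu = cpumask_first(mask_val);	/* hardware takes one CPU */

	if (cpu >= nr_cpu_ids)
		return -EINVAL;

	foo_hw_route_irq(d->irq, cpu);

	/* Record the effective affinity (may differ from the request). */
	cpumask_copy(d->affinity, cpumask_of(cpu));

	return IRQ_SET_MASK_OK_NOCOPY;
}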

Of course, I could just be completely confused, which is why I haven't started
hacking code just yet :)

Cheers, and sorry for the barrage of questions!

Will


* Re: IRQ migration on CPU offline path
  2011-12-14 18:34 IRQ migration on CPU offline path Will Deacon
@ 2011-12-15  1:25 ` Eric W. Biederman
  2011-12-15 16:28   ` Will Deacon
  0 siblings, 1 reply; 5+ messages in thread
From: Eric W. Biederman @ 2011-12-15  1:25 UTC (permalink / raw)
  To: Will Deacon; +Cc: tglx, linux-kernel

Will Deacon <will.deacon@arm.com> writes:

> Hi Thomas,
>
> I've been looking at the IRQ migration code on x86 (fixup_irqs) for the CPU
> hotplug path in order to try and fix a bug we have on ARM with the
> desc->affinity mask not getting updated. Compared to irq_set_affinity, the code
> is pretty whacky (I guess because it's called via stop_machine) so I wondered if
> you could help me understand a few points:

There is a lot of craziness on that path because of poor hardware design:
on x86 we can't know when an irq has actually been migrated, among other
nasties.

There is also the issue, which I expect is still the case, that the
generic layer asks us to do the cpu migration and the associated irq
migrations with irqs disabled, which, at least for the poorly designed
bits of hardware, makes the entire path a best-effort beast.

> (1) Affinity notifiers - we seem to ignore these and I guess they don't expect
>     to be called from this context. It could lead to the cpu_rmap stuff being
>     bogus, but that's not used for anything much. Do we just depend on people
>     having hotplug notifiers to deal with this?

> (2) Threaded handlers - I can't see where we update irqaction->thread_flags with
>     IRQTF_AFFINITY for handlers that are migrated by the scheduler when a CPU
>     goes down. Is this required?
>
> (3) On x86, we rely on the irq_chip updating the desc->affinity mask in
>     ->irq_set_affinity. It seems like we could use IRQ_SET_MASK_OK{_NOCOPY} for
>     this and, in the case of the ioapic, return IRQ_SET_MASK_OK_NOCOPY (removing
>     a redundant copy from the usual affinity path).

Possibly.  I forget which direction this code has gone, but we have the
interesting case that frequently on x86 we can be pinned to multiple cpus
while the hardware only supports being pinned to a single cpu at a time.
So the original logic was to put in the affinity mask what we had actually
done instead of what was requested.  In that case the mask changes, so a
NOCOPY doesn't feel appropriate, but I don't fully understand that code.
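
From a (possibly wrong) reading, the core side of that contract is roughly
the following paraphrase; it is from memory, not the actual code, and
irq_set_thread_affinity is a core-internal helper:

/* Paraphrase of the core's affinity-set path (simplified, from memory). */
static int core_set_affinity_sketch(struct irq_desc *desc,
				    const struct cpumask *mask)
{
	struct irq_data *data = &desc->irq_data;
	int ret = data->chip->irq_set_affinity(data, mask, false);

	switch (ret) {
	case IRQ_SET_MASK_OK:
		/* Chip accepted the request: core records the requested mask. */
		cpumask_copy(data->affinity, mask);
		/* fall through */
	case IRQ_SET_MASK_OK_NOCOPY:
		/* Chip already wrote the mask it actually programmed. */
		irq_set_thread_affinity(desc);
		ret = 0;
	}
	return ret;
}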

> Of course, I could just be completely confused, which is why I haven't started
> hacking code just yet :)

If x86 becomes a good clean example in this corner case I would be
amazed.  Last I looked I almost marked it all as CONFIG_BROKEN because
we were trying to do the impossible.  Unfortunately people's laptops
go through this path when they suspend, and so it was more painful to
disable the hacky, racy mess than to keep living with it.

There has been an increase in the number of cases where it is possible
to actually perform the migration with irqs disabled, so on a good day
that code might even work.

Eric


* Re: IRQ migration on CPU offline path
  2011-12-15  1:25 ` Eric W. Biederman
@ 2011-12-15 16:28   ` Will Deacon
  2011-12-16  5:26     ` Eric W. Biederman
  0 siblings, 1 reply; 5+ messages in thread
From: Will Deacon @ 2011-12-15 16:28 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: tglx, linux-kernel

Hi Eric,

Cheers for the response.

On Thu, Dec 15, 2011 at 01:25:19AM +0000, Eric W. Biederman wrote:
> Will Deacon <will.deacon@arm.com> writes:
> > I've been looking at the IRQ migration code on x86 (fixup_irqs) for the CPU
> > hotplug path in order to try and fix a bug we have on ARM with the
> > desc->affinity mask not getting updated. Compared to irq_set_affinity, the code
> > is pretty whacky (I guess because it's called via stop_machine) so I wondered if
> > you could help me understand a few points:
> 
> There is a lot of craziness on that path because of poor hardware design:
> on x86 we can't know when an irq has actually been migrated, among other
> nasties.
> 
> There is also the issue, which I expect is still the case, that the
> generic layer asks us to do the cpu migration and the associated irq
> migrations with irqs disabled, which, at least for the poorly designed
> bits of hardware, makes the entire path a best-effort beast.

Argh, ok. Does this mean that other architectures should just preserve the
interface that x86 gives (for example not triggering IRQ affinity
notifiers)?

> If x86 becomes a good clean example in this corner case I would be
> amazed.  Last I looked I almost marked it all as CONFIG_BROKEN because
> we were trying to do the impossible.  Unfortunately people's laptops
> go through this path when they suspend, and so it was more painful to
> disable the hacky, racy mess than to keep living with it.
> 
> There has been an increase in the number of cases where it is possible
> to actually perform the migration with irqs disabled, so on a good day
> that code might even work.

Right, so this stuff is fairly fragile. We can probably get a reasonable
version working on ARM (with the GIC present) but I'm not sure what to do
about the notifiers I mentioned earlier and proper migration of threaded
interrupt handlers.

I'll take a look at some other archs.

Thanks,

Will


* Re: IRQ migration on CPU offline path
  2011-12-15 16:28   ` Will Deacon
@ 2011-12-16  5:26     ` Eric W. Biederman
  2011-12-16 10:51       ` Will Deacon
  0 siblings, 1 reply; 5+ messages in thread
From: Eric W. Biederman @ 2011-12-16  5:26 UTC (permalink / raw)
  To: Will Deacon; +Cc: tglx, linux-kernel

Will Deacon <will.deacon@arm.com> writes:

> Hi Eric,
>
> Cheers for the response.
>
> On Thu, Dec 15, 2011 at 01:25:19AM +0000, Eric W. Biederman wrote:
>> Will Deacon <will.deacon@arm.com> writes:
>> > I've been looking at the IRQ migration code on x86 (fixup_irqs) for the CPU
>> > hotplug path in order to try and fix a bug we have on ARM with the
>> > desc->affinity mask not getting updated. Compared to irq_set_affinity, the code
>> > is pretty whacky (I guess because it's called via stop_machine) so I wondered if
>> > you could help me understand a few points:
>> 
>> There is a lot of craziness on that path because of poor hardware design:
>> on x86 we can't know when an irq has actually been migrated, among other
>> nasties.
>> 
>> There is also the issue, which I expect is still the case, that the
>> generic layer asks us to do the cpu migration and the associated irq
>> migrations with irqs disabled, which, at least for the poorly designed
>> bits of hardware, makes the entire path a best-effort beast.
>
> Argh, ok. Does this mean that other architectures should just preserve the
> interface that x86 gives (for example not triggering IRQ affinity
> notifiers)?

Interesting.  In this case the affinity notifier is an ugly hack for
exactly one driver.  The affinity notifier is new (this January) and
buggy.  Among other things, there appears to be a clear reference count
leak on the affinity notify structure.

Honestly I don't see much to justify the existence of the affinity
notifiers, and especially their requirement that they be called in
process context.

At a practical level, since the architects of the affinity notifier
didn't choose to add notification on migration, I don't see why
you should care.

This isn't an x86 versus the rest of the world issue.  This is a
Solarflare driver versus the rest of the kernel issue.  When the
Solarflare developers care, they can fix up arm and all of the rest of
the architectures that support cpu hot-unplug.

>> If x86 becomes a good clean example in this corner case I would be
>> amazed.  Last I looked I almost marked it all as CONFIG_BROKEN because
>> we were trying to do the impossible.  Unfortunately people's laptops
>> go through this path when they suspend, and so it was more painful to
>> disable the hacky, racy mess than to keep living with it.
>> 
>> There has been an increase in the number of cases where it is possible
>> to actually perform the migration with irqs disabled, so on a good day
>> that code might even work.
>
> Right, so this stuff is fairly fragile. We can probably get a reasonable
> version working on ARM (with the GIC present) but I'm not sure what to do
> about the notifiers I mentioned earlier and proper migration of threaded
> interrupt handlers.

Yes. It looks like the irq affinity notifiers are just broken by design.

As for threaded interrupt handlers there is probably something
reasonable that can be done there.  My guess is threaded interrupt
handlers should be handled the same way any other thread is handled
during cpu hot-unplug.  And if something needs to be done I expect the
generic code can do it.

Eric


* Re: IRQ migration on CPU offline path
  2011-12-16  5:26     ` Eric W. Biederman
@ 2011-12-16 10:51       ` Will Deacon
  0 siblings, 0 replies; 5+ messages in thread
From: Will Deacon @ 2011-12-16 10:51 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: tglx, linux-kernel

On Fri, Dec 16, 2011 at 05:26:46AM +0000, Eric W. Biederman wrote:
> > Argh, ok. Does this mean that other architectures should just preserve the
> > interface that x86 gives (for example not triggering IRQ affinity
> > notifiers)?
> 
> Interesting.  In this case the affinity notifier is an ugly hack for
> exactly one driver.  The affinity notifier is new (this January) and
> buggy.  Among other things, there appears to be a clear reference count
> leak on the affinity notify structure.
> 
> Honestly I don't see much to justify the existence of the affinity
> notifiers, and especially their requirement that they be called in
> process context.

One case I could see (ok, I'm clutching slightly at straws here) is a
module that wants to control the affinity of an IRQ it owns.
irq_set_affinity is not an exported symbol, so it could use
irq_set_affinity_hint to try to stop userspace daemons from messing with
the IRQ, and use the notifiers to keep track of where it ended up.
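
Something like this hypothetical fragment is what I had in mind (all the
foo_* names are invented):

#include <linux/interrupt.h>
#include <linux/cpumask.h>
#include <linux/kernel.h>

struct foo_channel {
	unsigned int irq;
	struct irq_affinity_notify affinity_notify;
};

static void foo_affinity_notify(struct irq_affinity_notify *notify,
				const cpumask_t *mask)
{
	struct foo_channel *chan =
		container_of(notify, struct foo_channel, affinity_notify);

	/* Track where the irq actually ended up, e.g. to re-steer queues. */
	pr_info("foo: irq %u now on CPU %u\n", chan->irq, cpumask_first(mask));
}

static void foo_affinity_release(struct kref *ref)
{
	/* Nothing dynamic to free in this sketch. */
}

static int foo_setup_irq_affinity(struct foo_channel *chan, int cpu)
{
	int ret;

	/* A hint for irqbalance and friends; we can't call irq_set_affinity(). */
	ret = irq_set_affinity_hint(chan->irq, cpumask_of(cpu));
	if (ret)
		return ret;

	chan->affinity_notify.notify = foo_affinity_notify;
	chan->affinity_notify.release = foo_affinity_release;
	return irq_set_affinity_notifier(chan->irq, &chan->affinity_notify);
}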

> At a practical level, since the architects of the affinity notifier
> didn't choose to add notification on migration, I don't see why
> you should care.

Suits me :)

> This isn't an x86 versus the rest of the world issue.  This is a
> Solarflare driver versus the rest of the kernel issue.  When the
> Solarflare developers care, they can fix up arm and all of the rest of
> the architectures that support cpu hot-unplug.

Sure, I just think that whatever we do, it should be consistent across
archs, even if it's a driver that is to blame.

> As for threaded interrupt handlers there is probably something
> reasonable that can be done there.  My guess is threaded interrupt
> handlers should be handled the same way any other thread is handled
> during cpu hot-unplug.  And if something needs to be done I expect the
> generic code can do it.

My first thought was that we needed to call irq_set_thread_affinity to set
the IRQTF_AFFINITY bit in the thread_flags, but looking at
irq_thread_check_affinity, I think you're right: the scheduler will deal
with this for us when it migrates kernel threads off the dying CPU.
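
For what it's worth, my (simplified, possibly stale) reading of
irq_thread_check_affinity boils down to roughly this; note that
IRQTF_AFFINITY and desc->lock are internal to kernel/irq:

/* Paraphrase of irq_thread_check_affinity (simplified, from memory). */
static void irq_thread_check_affinity_sketch(struct irq_desc *desc,
					     struct irqaction *action)
{
	cpumask_var_t mask;

	/* Only act when irq_set_affinity flagged us via IRQTF_AFFINITY. */
	if (!test_and_clear_bit(IRQTF_AFFINITY, &action->thread_flags))
		return;

	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
		return;	/* the real code retries later */

	raw_spin_lock_irq(&desc->lock);
	cpumask_copy(mask, desc->irq_data.affinity);
	raw_spin_unlock_irq(&desc->lock);

	/* The thread re-binds itself to the irq's current affinity. */
	set_cpus_allowed_ptr(current, mask);
	free_cpumask_var(mask);
}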

So the conclusion is: ignore the IRQ affinity notifiers, update the affinity
mask in the irq_data and let the scheduler do the rest.
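
On the ARM side I'm thinking of something along these lines for the
offline path (rough, untested sketch):

/*
 * Pick a new target for an irq whose affinity includes the dying CPU,
 * ask the chip to move it, and update the affinity mask unless the
 * chip already did so (IRQ_SET_MASK_OK_NOCOPY).
 */
static void migrate_one_irq_sketch(struct irq_desc *desc)
{
	struct irq_data *d = &desc->irq_data;
	const struct cpumask *affinity = d->affinity;
	struct irq_chip *chip = d->chip;
	int ret;

	if (irqd_is_per_cpu(d) ||
	    !cpumask_test_cpu(smp_processor_id(), affinity))
		return;

	/* If nothing in the affinity mask is still online, spread out. */
	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids)
		affinity = cpu_online_mask;

	if (!chip->irq_set_affinity)
		return;

	ret = chip->irq_set_affinity(d, affinity, true);
	if (ret == IRQ_SET_MASK_OK)
		cpumask_copy(d->affinity, affinity);
}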

Thanks for the help!

Will

