* [Xenomai-core] BUG on xnintr_attach
From: Jan Kiszka @ 2009-04-22 12:26 UTC (permalink / raw)
To: xenomai-core
Hi all,
issuing rtdm_irq_request and, thus, xnintr_attach can trigger a
"I-pipe: Detected stalled topmost domain, probably caused by a bug." if
the interrupt type is MSI:
[<ffffffff80273cce>] ipipe_check_context+0xe7/0xe9
[<ffffffff8049dae9>] _spin_lock_irqsave+0x18/0x54
[<ffffffff8037dcc2>] pci_bus_read_config_dword+0x3c/0x87
[<ffffffff80387c1d>] read_msi_msg+0x61/0xe1
[<ffffffff8021c5b8>] ? assign_irq_vector+0x3e/0x49
[<ffffffff8021d7b2>] set_msi_irq_affinity+0x6d/0xc8
[<ffffffff8021fa5d>] __ipipe_set_irq_affinity+0x6c/0x77
[<ffffffff80274231>] ipipe_set_irq_affinity+0x34/0x3d
[<ffffffff8027c572>] xnintr_attach+0xaa/0x11e
Two options to fix this, but I'm currently undecided which way to go:
- harden pci_lock (drivers/pci/access.c) - didn't we apply such an
MSI-related workaround before?
- move xnarch_set_irq_affinity out of intrlock (but couldn't we face
even more pci_lock-related issues?)
Jan
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
* Re: [Xenomai-core] BUG on xnintr_attach
From: Philippe Gerum @ 2009-04-22 16:23 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai-core
On Wed, 2009-04-22 at 14:26 +0200, Jan Kiszka wrote:
> Hi all,
>
> issuing rtdm_irq_request and, thus, xnintr_attach can trigger a
> "I-pipe: Detected stalled topmost domain, probably caused by a bug." if
> the interrupt type is MSI:
>
> [<ffffffff80273cce>] ipipe_check_context+0xe7/0xe9
> [<ffffffff8049dae9>] _spin_lock_irqsave+0x18/0x54
> [<ffffffff8037dcc2>] pci_bus_read_config_dword+0x3c/0x87
> [<ffffffff80387c1d>] read_msi_msg+0x61/0xe1
> [<ffffffff8021c5b8>] ? assign_irq_vector+0x3e/0x49
> [<ffffffff8021d7b2>] set_msi_irq_affinity+0x6d/0xc8
> [<ffffffff8021fa5d>] __ipipe_set_irq_affinity+0x6c/0x77
> [<ffffffff80274231>] ipipe_set_irq_affinity+0x34/0x3d
> [<ffffffff8027c572>] xnintr_attach+0xaa/0x11e
>
> Two options to fix this, but I'm currently undecided which way to go:
> - harden pci_lock (drivers/pci/access.c) - didn't we apply such an
> MSI-related workaround before?
This did not work as expected: pathological latency spots. That said,
the vanilla code has evolved since I tried this quick hack months ago,
so it may be worth looking at this option once again.
> - move xnarch_set_irq_affinity out of intrlock (but couldn't we face
> even more pci_lock related issues?)
>
Since upstream decided to use PCI config reads even inside hot paths
when processing MSI interrupts, the only sane way would be to make the
locking used there Adeos-aware, likely by virtualizing the interrupt
mask. The way upstream generally deals with MSI is currently a problem
for us.
> Jan
>
--
Philippe.
* Re: [Xenomai-core] BUG on xnintr_attach
From: Jan Kiszka @ 2009-04-22 16:48 UTC (permalink / raw)
To: Philippe Gerum; +Cc: xenomai-core
Philippe Gerum wrote:
> On Wed, 2009-04-22 at 14:26 +0200, Jan Kiszka wrote:
>> Hi all,
>>
>> issuing rtdm_irq_request and, thus, xnintr_attach can trigger a
>> "I-pipe: Detected stalled topmost domain, probably caused by a bug." if
>> the interrupt type is MSI:
>>
>> [<ffffffff80273cce>] ipipe_check_context+0xe7/0xe9
>> [<ffffffff8049dae9>] _spin_lock_irqsave+0x18/0x54
>> [<ffffffff8037dcc2>] pci_bus_read_config_dword+0x3c/0x87
>> [<ffffffff80387c1d>] read_msi_msg+0x61/0xe1
>> [<ffffffff8021c5b8>] ? assign_irq_vector+0x3e/0x49
>> [<ffffffff8021d7b2>] set_msi_irq_affinity+0x6d/0xc8
>> [<ffffffff8021fa5d>] __ipipe_set_irq_affinity+0x6c/0x77
>> [<ffffffff80274231>] ipipe_set_irq_affinity+0x34/0x3d
>> [<ffffffff8027c572>] xnintr_attach+0xaa/0x11e
>>
>> Two options to fix this, but I'm currently undecided which way to go:
>> - harden pci_lock (drivers/pci/access.c) - didn't we apply such an
>> MSI-related workaround before?
>
> This did not work as expected: pathological latency spots. That said,
> the vanilla code has evolved since I tried this quick hack months ago,
> so it may be worth looking at this option once again.
>
>> - move xnarch_set_irq_affinity out of intrlock (but couldn't we face
>> even more pci_lock related issues?)
>>
>
> Since upstream decided to use PCI config reads even inside hot paths
> when processing MSI interrupts, the only sane way would be to make the
> locking used there Adeos-aware, likely virtualizing the interrupt mask.
> The way upstream generally deals with MSI is currently a problem for us.
Hmm, I guess this needs a closer look again. But I vaguely recall that
upstream removed the config reads, at least from the hot paths, after
complaints about performance.
However, I think we can go with approach 2 independently. So please pull
this from git://git.xenomai.org/xenomai-jki.git for-upstream
Jan
--------->
Subject: [PATCH] nucleus: Move xnarch_set_irq_affinity out of intrlock
There is no need to hold intrlock while setting the affinity of the
to-be-registered IRQ. Additionally, we run into trouble with MSI code
that can take Linux locks from within this service. So move the affinity
adjustment out of the critical section.
Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
---
ksrc/nucleus/intr.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/ksrc/nucleus/intr.c b/ksrc/nucleus/intr.c
index 870efb1..dba3b26 100644
--- a/ksrc/nucleus/intr.c
+++ b/ksrc/nucleus/intr.c
@@ -721,11 +721,12 @@ int xnintr_attach(xnintr_t *intr, void *cookie)
intr->cookie = cookie;
memset(&intr->stat, 0, sizeof(intr->stat));
- xnlock_get_irqsave(&intrlock, s);
-
#ifdef CONFIG_SMP
xnarch_set_irq_affinity(intr->irq, nkaffinity);
#endif /* CONFIG_SMP */
+
+ xnlock_get_irqsave(&intrlock, s);
+
err = xnintr_irq_attach(intr);
if (!err)
* Re: [Xenomai-core] BUG on xnintr_attach
From: Jan Kiszka @ 2009-04-23 7:20 UTC (permalink / raw)
Cc: xenomai-core
Jan Kiszka wrote:
> Philippe Gerum wrote:
>> On Wed, 2009-04-22 at 14:26 +0200, Jan Kiszka wrote:
>>> Hi all,
>>>
>>> issuing rtdm_irq_request and, thus, xnintr_attach can trigger a
>>> "I-pipe: Detected stalled topmost domain, probably caused by a bug." if
>>> the interrupt type is MSI:
>>>
>>> [<ffffffff80273cce>] ipipe_check_context+0xe7/0xe9
>>> [<ffffffff8049dae9>] _spin_lock_irqsave+0x18/0x54
>>> [<ffffffff8037dcc2>] pci_bus_read_config_dword+0x3c/0x87
>>> [<ffffffff80387c1d>] read_msi_msg+0x61/0xe1
>>> [<ffffffff8021c5b8>] ? assign_irq_vector+0x3e/0x49
>>> [<ffffffff8021d7b2>] set_msi_irq_affinity+0x6d/0xc8
>>> [<ffffffff8021fa5d>] __ipipe_set_irq_affinity+0x6c/0x77
>>> [<ffffffff80274231>] ipipe_set_irq_affinity+0x34/0x3d
>>> [<ffffffff8027c572>] xnintr_attach+0xaa/0x11e
>>>
>>> Two options to fix this, but I'm currently undecided which way to go:
>>> - harden pci_lock (drivers/pci/access.c) - didn't we apply such an
>>> MSI-related workaround before?
>> This did not work as expected: pathological latency spots. That said,
>> the vanilla code has evolved since I tried this quick hack months ago,
>> so it may be worth looking at this option once again.
>>
>>> - move xnarch_set_irq_affinity out of intrlock (but couldn't we face
>>> even more pci_lock related issues?)
>>>
>> Since upstream decided to use PCI config reads even inside hot paths
>> when processing MSI interrupts, the only sane way would be to make the
>> locking used there Adeos-aware, likely virtualizing the interrupt mask.
>> The way upstream generally deals with MSI is currently a problem for us.
>
> Hmm, guess this needs a closer look again. But I vaguely recall upstream
> had removed the config reading at least from the hot paths due to
> complaints about performance.
I went through the critical MSI code paths again. Its irqchip comes with
ack, mask/unmask and set_affinity callbacks, all of which may be used by
Xenomai. The ack is the most important one as it runs during early
dispatching, but it maps to ack_apic_edge and is thus clean. The rest is
contaminated with PCI config access.
At least pci_lock is involved here; depending on the PCI access method,
pci_config_lock is as well. And then there are other code paths that
call wake_up_all while holding pci_lock. This looks more or less
hopeless.
On the other hand, we shouldn't depend on mask/unmask or set_affinity
while in critical Xenomai/I-pipe code paths. The MSI interrupt flow is
the same as for edge-triggered interrupts (except that there is not even
an EOI). That means we do not mask deferred interrupts, and that in turn
means we should be safe once the affinity setting is fixed.
Jan