* ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
@ 2019-07-04  8:57 Lange Norbert
  2019-07-04 10:15 ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Lange Norbert @ 2019-07-04  8:57 UTC (permalink / raw)
  To: Xenomai (xenomai@xenomai.org)

Hello,

Using the rt_igb driver with the recent ipipe kernel results in a broken state
(I assume one CPU core is “stuck”).

This is a quote from Philippe (note that I tested the plain upstream revision listed below):
> This happens specifically when the igb driver enables the device at rtifconfig up only with 4.19+.
> The HIPASE clock device is fine and can be enabled manually with no issue. The spurious IRQ
> message is only a symptom, something seems wrong with this fairly old (rt_)igb code on recent kernels.

+ modprobe rtnet
+ modprobe rtpacket
+ modprobe rt_igb
[  325.791715] RTnet: registered rteth0
[  325.795328] rt_igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection
[  325.802505] rt_igb 0000:03:00.0: rteth0: (PCIe:2.5Gb/s:Width x1) 22:20:47:8d:0f:c9
[  325.810103] rt_igb 0000:03:00.0: rteth0: PBA No: FFFFFF-0FF
[  325.815696] rt_igb 0000:03:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[  325.823638] sdhci-pci 0000:00:1b.0: SDHCI controller found [8086:5aca] (rev b)

+ rtifconfig rteth0 up
[  326.066500] spurious APIC interrupt through vector ff on CPU#0, should never happen.


xenomai master
ipipe-core-4.19.56-x86-2
 (config is attached)

Regards, Norbert Lange
________________________________

This message and any attachments are solely for the use of the intended recipients. They may contain privileged and/or confidential information or other information protected from disclosure. If you are not an intended recipient, you are hereby notified that you received this email in error and that any review, dissemination, distribution or copying of this email and any attachment is strictly prohibited. If you have received this email in error, please contact the sender and delete the message and any attachment from your system.

ANDRITZ HYDRO GmbH


Rechtsform/ Legal form: Gesellschaft mit beschränkter Haftung / Corporation

Firmensitz/ Registered seat: Wien

Firmenbuchgericht/ Court of registry: Handelsgericht Wien

Firmenbuchnummer/ Company registration: FN 61833 g

DVR: 0605077

UID-Nr.: ATU14756806


Thank You
________________________________
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config-4.19.tar.xz
Type: application/octet-stream
Size: 22800 bytes
Desc: config-4.19.tar.xz
URL: <http://xenomai.org/pipermail/xenomai/attachments/20190704/d32a914d/attachment.obj>


* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-04  8:57 ipipe 4.19: spurious APIC interrupt when setting rt_igp to up Lange Norbert
@ 2019-07-04 10:15 ` Jan Kiszka
  2019-07-04 10:21   ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-04 10:15 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org)

On 04.07.19 10:57, Lange Norbert via Xenomai wrote:
> + rtifconfig rteth0 up
> [  326.066500] spurious APIC interrupt through vector ff on CPU#0, should never happen.
> 

Can you retry with https://lkml.org/lkml/2019/7/3/143 applied? It should tell us 
the real vector number.

I'll see in parallel if I can reproduce with rt_igb here.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-04 10:15 ` Jan Kiszka
@ 2019-07-04 10:21   ` Jan Kiszka
  2019-07-05  7:38     ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-04 10:21 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org)

On 04.07.19 12:15, Jan Kiszka wrote:
> Can you retry with https://lkml.org/lkml/2019/7/3/143 applied? It should tell us 
> the real vector number.
> 
> I'll see in parallel if I can reproduce with rt_igb here.

Already succeeded, with rt_e1000e in KVM. Debugging...

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-04 10:21   ` Jan Kiszka
@ 2019-07-05  7:38     ` Jan Kiszka
  2019-07-05 10:43       ` Lange Norbert
  2019-07-11 14:40       ` Philippe Gerum
  0 siblings, 2 replies; 19+ messages in thread
From: Jan Kiszka @ 2019-07-05  7:38 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org), Philippe Gerum

On 04.07.19 12:21, Jan Kiszka wrote:
> Already succeeded, with rt_e1000e in KVM. Debugging...
> 

This addresses it on x86 for me:

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6c279e065879..d503b875f086 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
 		ipipe_root_only();
 
 		raw_spin_lock_irqsave(&desc->lock, flags);
-		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
+		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
+		    !WARN_ON(irq_activate(desc))) {
 			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
 			chip->irq_startup(&desc->irq_data);
 		}

Probably upstream commit c942cee46bba (genirq: Separate activation and
startup) makes this necessary.

Philippe, I suppose this is either not essential on arm, or external
interrupts weren't tested yet, like I missed on x86. Fine to make this a
noarch patch?

Actually, we should make ipipe_enable_irq return an error, rather than
do that WARN_ON here, but that would change APIs, down to
xnintr_enable().
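
To illustrate that last point, here is a minimal user-space sketch (not I-pipe
code) of an enable path that activates the interrupt before starting it up and
hands an activation failure back to the caller instead of warning. The struct
and all mock_* names are hypothetical stand-ins for irq_desc, irq_activate()
and chip->irq_startup(); the real signatures and locking are not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for IPIPE_IRQS_NEEDS_STARTUP. */
#define NEEDS_STARTUP 0x1u

/* Hypothetical, heavily simplified model of an interrupt descriptor. */
struct mock_irq_desc {
	unsigned int istate;	/* models desc->istate */
	int activated;		/* set once resources (e.g. a vector) are reserved */
	int started;		/* set once the chip startup hook has run */
	int fail_activation;	/* test knob: force activation to fail */
};

/* Models irq_activate(): reserve what the interrupt needs before startup. */
static int mock_irq_activate(struct mock_irq_desc *desc)
{
	if (desc->fail_activation)
		return -ENOSPC;
	desc->activated = 1;
	return 0;
}

/* Models chip->irq_startup(): enable the line at the interrupt controller. */
static void mock_irq_startup(struct mock_irq_desc *desc)
{
	desc->started = 1;
}

/* Enable path that reports an activation failure instead of warning. */
static int mock_enable_irq(struct mock_irq_desc *desc)
{
	int ret;

	assert(desc != NULL);

	if (!(desc->istate & NEEDS_STARTUP))
		return 0;	/* already started, nothing to do */

	ret = mock_irq_activate(desc);
	if (ret)
		return ret;	/* NEEDS_STARTUP stays set, so a retry is possible */

	desc->istate &= ~NEEDS_STARTUP;
	mock_irq_startup(desc);
	return 0;
}
```

With such a return value, every caller up the chain would have to check and
propagate the error, which is the API change mentioned above.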

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-05  7:38     ` Jan Kiszka
@ 2019-07-05 10:43       ` Lange Norbert
  2019-07-05 11:29         ` Jan Kiszka
  2019-07-11 14:40       ` Philippe Gerum
  1 sibling, 1 reply; 19+ messages in thread
From: Lange Norbert @ 2019-07-05 10:43 UTC (permalink / raw)
  To: Jan Kiszka, Xenomai (xenomai@xenomai.org), Philippe Gerum



> -----Original Message-----
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: Friday, 5 July 2019 09:39
> To: Lange Norbert <norbert.lange@andritz.com>; Xenomai
> (xenomai@xenomai.org) <xenomai@xenomai.org>; Philippe Gerum
> <rpm@xenomai.org>
> Subject: Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
>
> On 04.07.19 12:21, Jan Kiszka wrote:
> > On 04.07.19 12:15, Jan Kiszka wrote:
> >> Can you retry with https://lkml.org/lkml/2019/7/3/143 applied? It
> >> should tell us the real vector number.
> >>
> >> I'll see in parallel if I can reproduce with rt_igb here.

Applying that patch then causes the ipipe patch to fail.
It would take me some time to clean that up.

> >
> > Already succeeded, with rt_e1000e in KVM. Debugging...
> >
>
> This addresses it on x86 for me:
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 6c279e065879..d503b875f086 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>                 ipipe_root_only();
>
>                 raw_spin_lock_irqsave(&desc->lock, flags);
> -               if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
> +               if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
> +                   !WARN_ON(irq_activate(desc))) {
>                         desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>                         chip->irq_startup(&desc->irq_data);
>                 }

The problem still persists for me with that patch. I use an NFS root (over a USB-to-Ethernet adapter, so that I can kick out the Linux igb driver);
maybe that’s related.

Norbert

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-05 10:43       ` Lange Norbert
@ 2019-07-05 11:29         ` Jan Kiszka
  2019-07-05 13:56           ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-05 11:29 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org), Philippe Gerum

On 05.07.19 12:43, Lange Norbert wrote:
>>>> Can you retry with https://lkml.org/lkml/2019/7/3/143 applied? It
>>>> should tell us the real vector number.
>>>>
>>>> I'll see in parallel if I can reproduce with rt_igb here.
> 
> Applying that patch then causes the ipipe-patch to fail.
> Would take me some time to cleanup.
> 

Yes, did this yesterday, and it requires more work. But the information from it 
is no longer essential.

>>>
>>> Already succeeded, with rt_e1000e in KVM. Debugging...
>>>
>>
>> This addresses it on x86 for me:
>>
>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> index 6c279e065879..d503b875f086 100644
>> --- a/kernel/irq/chip.c
>> +++ b/kernel/irq/chip.c
>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>>                  ipipe_root_only();
>>
>>                  raw_spin_lock_irqsave(&desc->lock, flags);
>> -               if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
>> +               if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
>> +                   !WARN_ON(irq_activate(desc))) {
>>                          desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>>                          chip->irq_startup(&desc->irq_data);
>>                  }
> 
> Problem still persists for me with that patch. I use a nfsroot (with a USB->ETH adapter so I can kick out the linux igb driver),
> Maybe that’s related.

Does reducing your machine to maxcpus=1 resolve the issue? I could imagine we
have an affinity problem on top.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-05 11:29         ` Jan Kiszka
@ 2019-07-05 13:56           ` Jan Kiszka
  2019-07-09 16:21             ` Lange Norbert
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-05 13:56 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org), Philippe Gerum

On 05.07.19 13:29, Jan Kiszka wrote:
> Does reducing your machine to maxcpus=1 resolve the issue? I could imagine we an 
> affinity problem on top.
> 

We do have an affinity problem, will try to fix it soon, but that didn't allow 
me to reproduce your issue with my patch applied.

Could you turn on CONFIG_GENERIC_IRQ_DEBUGFS and grab the content of
/sys/kernel/debug/irq? Maybe Linux considers the interrupt in question here as
"affinity managed by kernel", and then my patch is a no-op. I still need to
understand all implications of this managed mode for I-pipe.
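
As a rough aid for inspecting that directory, the following sketch scans one
per-IRQ debugfs file for a vector assignment. It assumes that the x86 output
contains a line of the form "Vector: <n>" and that the files live under
/sys/kernel/debug/irq/irqs/<irq>; both are assumptions about the debugfs
layout, and irq_debug_vector() is a hypothetical helper, not a kernel or
Xenomai API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan one debugfs IRQ description file for a "Vector:" line.
 * Returns the vector number found, 0 if no such line exists or the
 * reported vector is zero (treated here as "no vector assigned"),
 * or -1 if the file cannot be opened.
 * The "Vector:" field name is an assumption about the debugfs format.
 */
static int irq_debug_vector(const char *path)
{
	char line[256];
	int vector = 0;
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		const char *p = strstr(line, "Vector:");

		/* The space in the format skips any padding before the number. */
		if (p && sscanf(p, "Vector: %d", &vector) == 1)
			break;
	}

	fclose(f);
	return vector;
}
```

Calling it on the file of the interrupt in question, for example
irq_debug_vector("/sys/kernel/debug/irq/irqs/<irq>") with the actual number
filled in, and getting 0 back would suggest that no vector is assigned.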

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-05 13:56           ` Jan Kiszka
@ 2019-07-09 16:21             ` Lange Norbert
  2019-07-09 16:33               ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Lange Norbert @ 2019-07-09 16:21 UTC (permalink / raw)
  To: Jan Kiszka, Xenomai (xenomai@xenomai.org), Philippe Gerum

Hello,

With maxcpus=1 the spurious interrupt still occurs, and this time the system locks up completely.

I attached the debug/irq directory, captured after triggering the issue.

Some things that might be relevant:
-   the SoC would use PINCTRL_BROXTON under Linux, but this is disabled (not fixed up for Xenomai)
-   I have the regular igb driver in use, and am unbinding the network card prior to binding the rt_igb driver

Regards, Norbert

-------------- next part --------------
A non-text attachment was scrubbed...
Name: debugirq.tar.xz
Type: application/octet-stream
Size: 1808 bytes
Desc: debugirq.tar.xz
URL: <http://xenomai.org/pipermail/xenomai/attachments/20190709/8de577f3/attachment.obj>


* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-09 16:21             ` Lange Norbert
@ 2019-07-09 16:33               ` Jan Kiszka
  2019-07-09 17:54                 ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-09 16:33 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org), Philippe Gerum

On 09.07.19 18:21, Lange Norbert wrote:
> Hello,
> 
> maxcpus=1 still causes the spurious int, this time fully locking up.
> 
> I attached the debug/irq directory after the cause.
>
> Some things that might be relevant:
> -   the SOC would use PINCTRL_BROXTON under linux, but this is disabled (not fixed up for Xenomai)
> -   I have the regular igb driver in use, and am unbinding the network card prior to binding the rt_igp driver
> 

Thanks. What's the interrupt number that Xenomai is using? Should be the same
that the Linux driver is using as well.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-09 16:33               ` Jan Kiszka
@ 2019-07-09 17:54                 ` Jan Kiszka
  2019-07-10  8:44                   ` Lange Norbert
  2019-07-10 21:30                   ` Jan Kiszka
  0 siblings, 2 replies; 19+ messages in thread
From: Jan Kiszka @ 2019-07-09 17:54 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org), Philippe Gerum

On 09.07.19 18:33, Jan Kiszka wrote:
> Thanks. What's the interrupt number that Xenomai is using? Should be the same
> that the Linux driver is using as well.

Found already: Should be IRQ 130-132 for device 00:03.0. If the directory state
was like that while Xenomai was still holding those interrupts, the problem is
that there are no vectors assigned to them. Can you confirm that rt_igb was
still loaded and the interface was up?

Are those interrupts MSI or MSI-X? Can't read that from the logs.

I probably need to get some rt_igb running somewhere...

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-09 17:54                 ` Jan Kiszka
@ 2019-07-10  8:44                   ` Lange Norbert
  2019-07-10 21:30                   ` Jan Kiszka
  1 sibling, 0 replies; 19+ messages in thread
From: Lange Norbert @ 2019-07-10  8:44 UTC (permalink / raw)
  To: Jan Kiszka, Xenomai (xenomai@xenomai.org), Philippe Gerum



> -----Original Message-----
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: Dienstag, 9. Juli 2019 19:54
> To: Lange Norbert <norbert.lange@andritz.com>; Xenomai
> (xenomai@xenomai.org) <xenomai@xenomai.org>; Philippe Gerum
> <rpm@xenomai.org>
> Subject: Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
>
>
> On 09.07.19 18:33, Jan Kiszka wrote:
> > On 09.07.19 18:21, Lange Norbert wrote:
> >> Hello,
> >>
> >> maxcpus=1 still causes the spurious int, this time fully locking up.
> >>
> >> I attached the debug/irq directory after the cause.
> >>> Some things that might be relevant:
> >> -   the SOC would use PINCTRL_BROXTON under linux, but this is disabled
> (not fixed up for Xenomai)
> >> -   I have the regular igb driver in use, and am unbinding the network card
> prior to binding the rt_igp driver
> >>
> >
> > Thanks. What's the interrupt number that Xenomai is using? Should be
> > the same that the Linux driver is using as well.
>
> Found already: Should be IRQ 130-132 for device 00:03.0. If the directory
> state was like that while Xenomai was still holding those interrupts, the
> problem it that there are no vectors assigned to them. Can you confirm that
> rt_igb was still loaded and the interface was up?

Well, the bug happens when bringing up the interface.

# modprobe rt_igb
[  117.274639] rt_igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection
[  117.281800] rt_igb 0000:03:00.0: enp3s0: (PCIe:2.5Gb/s:Width x1) 22:20:47:8d:0f:c9
[  117.289397] rt_igb 0000:03:00.0: enp3s0: PBA No: FFFFFF-0FF
[  117.294997] rt_igb 0000:03:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[  117.303500] [Xenomai] HIPASE PPS on SDP0
[  117.307529] sdhci-pci 0000:00:1b.0: SDHCI controller found [8086:5aca] (rev b)

# rtifconfig enp3s0 up
[  166.503855] spurious APIC interrupt through vector ff on CPU#0, should never happen.

> Are those interrupts MSI or MSI-X? Can't read that from the logs.

MSI-X from the kernel log.

>
> I probably need to get some rt_igb running somewhere...
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate
> Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-09 17:54                 ` Jan Kiszka
  2019-07-10  8:44                   ` Lange Norbert
@ 2019-07-10 21:30                   ` Jan Kiszka
  2019-07-11 12:23                     ` Lange Norbert
  1 sibling, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-10 21:30 UTC (permalink / raw)
  To: Lange Norbert, Xenomai (xenomai@xenomai.org), Philippe Gerum

On 09.07.19 19:54, Jan Kiszka wrote:
> On 09.07.19 18:33, Jan Kiszka wrote:
>> On 09.07.19 18:21, Lange Norbert wrote:
>>> Hello,
>>>
>>> maxcpus=1 still causes the spurious int, this time fully locking up.
>>>
>>> I attached the debug/irq directory after the cause.
>>>> Some things that might be relevant:
>>> -   the SOC would use PINCTRL_BROXTON under linux, but this is disabled (not fixed up for Xenomai)
>>> -   I have the regular igb driver in use, and am unbinding the network card prior to binding the rt_igp driver
>>>
>>
>> Thanks. What's the interrupt number that Xenomai is using? Should be the same
>> that the Linux driver is using as well.
> 
> Found already: Should be IRQ 130-132 for device 00:03.0. If the directory state
> was like that while Xenomai was still holding those interrupts, the problem it
> that there are no vectors assigned to them. Can you confirm that rt_igb was
> still loaded and the interface was up?
> 
> Are those interrupts MSI or MSI-X? Can't read that from the logs.
> 
> I probably need to get some rt_igb running somewhere...
> 

Still no luck, even on a box with an igb-driven NIC (I350):

[  667.928036] rt_igb 0000:06:00.1: Intel(R) Gigabit Ethernet Network Connection
[  667.928064] rt_igb 0000:06:00.1: rteth0: (PCIe:5.0Gb/s:Width x4) 00:25:90:5d:10:19
[  667.928149] rt_igb 0000:06:00.1: rteth0: PBA No: 010A00-000
[  667.928153] rt_igb 0000:06:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
xeon-d:~ # cat /proc/xenomai/irq 
  IRQ         CPU0        ...        CPU15
   47:           0        ...           79         rteth0-TxRx-0

I'm currently using the two attached patches on top of ipipe-core-4.19.57-x86-3.

Did you cross-check if the running kernel contains the fix(es)?

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ipipe-Activate-IRQ-in-ipipe_enable_irq.patch
Type: text/x-patch
Size: 1011 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20190710/55e16f8f/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-x86-ipipe-Make-sure-IRQs-are-active-when-setting-aff.patch
Type: text/x-patch
Size: 1391 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20190710/55e16f8f/attachment-0001.bin>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-10 21:30                   ` Jan Kiszka
@ 2019-07-11 12:23                     ` Lange Norbert
  0 siblings, 0 replies; 19+ messages in thread
From: Lange Norbert @ 2019-07-11 12:23 UTC (permalink / raw)
  To: Jan Kiszka, Xenomai (xenomai@xenomai.org), Philippe Gerum

> -----Original Message-----
> From: Jan Kiszka <jan.kiszka@siemens.com>
> Sent: Mittwoch, 10. Juli 2019 23:31
> To: Lange Norbert <norbert.lange@andritz.com>; Xenomai
> (xenomai@xenomai.org) <xenomai@xenomai.org>; Philippe Gerum
> <rpm@xenomai.org>
> Subject: Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
>
>
> On 09.07.19 19:54, Jan Kiszka wrote:
> > On 09.07.19 18:33, Jan Kiszka wrote:
> >> On 09.07.19 18:21, Lange Norbert wrote:
> >>> Hello,
> >>>
> >>> maxcpus=1 still causes the spurious int, this time fully locking up.
> >>>
> >>> I attached the debug/irq directory after the cause.
> >>>> Some things that might be relevant:
> >>> -   the SOC would use PINCTRL_BROXTON under linux, but this is disabled
> (not fixed up for Xenomai)
> >>> -   I have the regular igb driver in use, and am unbinding the network
> card prior to binding the rt_igp driver
> >>>
> >>
> >> Thanks. What's the interrupt number that Xenomai is using? Should be
> >> the same that the Linux driver is using as well.
> >
> > Found already: Should be IRQ 130-132 for device 00:03.0. If the
> > directory state was like that while Xenomai was still holding those
> > interrupts, the problem it that there are no vectors assigned to them.
> > Can you confirm that rt_igb was still loaded and the interface was up?
> >
> > Are those interrupts MSI or MSI-X? Can't read that from the logs.
> >
> > I probably need to get some rt_igb running somewhere...
> >
>
> Still no luck, even on a box with a igb-driven NIC (I350):
>
> [  667.928036] rt_igb 0000:06:00.1: Intel(R) Gigabit Ethernet Network Connection
> [  667.928064] rt_igb 0000:06:00.1: rteth0: (PCIe:5.0Gb/s:Width x4) 00:25:90:5d:10:19
> [  667.928149] rt_igb 0000:06:00.1: rteth0: PBA No: 010A00-000
> [  667.928153] rt_igb 0000:06:00.1: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
> xeon-d:~ # cat /proc/xenomai/irq
>   IRQ         CPU0        ...        CPU15
>    47:           0        ...           79         rteth0-TxRx-0
>
> I'm currently using the two attached patches on top of
> ipipe-core-4.19.57-x86-3.

With those 2 patches it is now fixed on my end.
So far I used this:

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6c279e065879..d503b875f086 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
 		ipipe_root_only();
 
 		raw_spin_lock_irqsave(&desc->lock, flags);
-		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
+		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
+		    !WARN_ON(irq_activate(desc))) {
 			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
 			chip->irq_startup(&desc->irq_data);
 		}

>
> Did you cross-check if the running kernel contains the fix(es)?

Yes, the old one.
Thanks for the fix.

Norbert

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-05  7:38     ` Jan Kiszka
  2019-07-05 10:43       ` Lange Norbert
@ 2019-07-11 14:40       ` Philippe Gerum
  2019-07-11 15:09         ` Jan Kiszka
  1 sibling, 1 reply; 19+ messages in thread
From: Philippe Gerum @ 2019-07-11 14:40 UTC (permalink / raw)
  To: Jan Kiszka, Lange Norbert, Xenomai (xenomai@xenomai.org)

On 7/5/19 9:38 AM, Jan Kiszka wrote:

> This addresses it on x86 for me:
> 
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 6c279e065879..d503b875f086 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>  		ipipe_root_only();
>  
>  		raw_spin_lock_irqsave(&desc->lock, flags);
> -		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
> +		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
> +		    !WARN_ON(irq_activate(desc))) {
>  			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>  			chip->irq_startup(&desc->irq_data);
>  		}
> 
> Probably upstream commit c942cee46bba (genirq: Separate activation and
> startup) makes this necessary.
> 
> Philippe, I suppose this is either not essential on arm, or external
> interrupts weren't tested yet, like I missed on x86. Fine to make this a
> noarch patch?

No issue. I've not been working on/with the I-pipe but Dovetail instead
in the past weeks, so testing of 4.19 is still very limited on my end. I
have several full-fledged real world ARM*-based application systems to
improve this, just need to find a way to squeeze this work in.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-11 14:40       ` Philippe Gerum
@ 2019-07-11 15:09         ` Jan Kiszka
  2019-07-11 16:00           ` Philippe Gerum
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-11 15:09 UTC (permalink / raw)
  To: Philippe Gerum, Lange Norbert, Xenomai (xenomai@xenomai.org)

On 11.07.19 16:40, Philippe Gerum wrote:
> On 7/5/19 9:38 AM, Jan Kiszka wrote:
> 
>> This addresses it on x86 for me:
>>
>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> index 6c279e065879..d503b875f086 100644
>> --- a/kernel/irq/chip.c
>> +++ b/kernel/irq/chip.c
>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>>  		ipipe_root_only();
>>  
>>  		raw_spin_lock_irqsave(&desc->lock, flags);
>> -		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
>> +		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
>> +		    !WARN_ON(irq_activate(desc))) {
>>  			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>>  			chip->irq_startup(&desc->irq_data);
>>  		}
>>
>> Probably upstream commit c942cee46bba (genirq: Separate activation and
>> startup) makes this necessary.
>>
>> Philippe, I suppose this is either not essential on arm, or external
>> interrupts weren't tested yet, like I missed on x86. Fine to make this a
>> noarch patch?
> 
> No issue. I've not been working on/with the I-pipe but Dovetail instead
> in the past weeks, so testing of 4.19 is still very limited on my end. I
> have several full-fledged real world ARM*-based application systems to
> improve this, just need to find a way to squeeze this work in.
> 

One question remains, though: Should we just do the WARN_ON() thing within the
existing API or change that API to return a potential error? Variant one I have
ready, but I have no feeling for the risk that there is actually an error.

The same goes for ipipe_set_irq_affinity that will require the activation as
well but cannot return an error so far.
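
To make the two variants concrete, here is a minimal userspace sketch. It is
only a model: every mock_* name, the MOCK_* constants and the chosen error
code are assumptions made for illustration and do not exist in the kernel or
the I-pipe. mock_activate() stands in for irq_activate(), and the istate flag
stands in for IPIPE_IRQS_NEEDS_STARTUP.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: none of these names exist in the kernel. */
#define MOCK_NEEDS_STARTUP 0x1u   /* models IPIPE_IRQS_NEEDS_STARTUP */
#define MOCK_NO_VECTOR     0xffu  /* models "no vector assigned yet" */

struct mock_irq_desc {
	unsigned int istate;    /* pending-startup flag */
	unsigned int vector;    /* MOCK_NO_VECTOR until activated */
	bool activation_fails;  /* lets a test force the error path */
	bool started;           /* set once the startup step has run */
};

/* Stand-in for irq_activate(): assign a vector or fail. */
static int mock_activate(struct mock_irq_desc *desc)
{
	if (desc->activation_fails)
		return -ENOSPC;  /* arbitrary error code for the model */
	desc->vector = 0x30;     /* arbitrary valid vector */
	return 0;
}

/* Variant one: keep the void signature and only warn on failure. */
static void mock_enable_irq_void(struct mock_irq_desc *desc)
{
	if (!(desc->istate & MOCK_NEEDS_STARTUP))
		return;
	if (mock_activate(desc)) {
		fprintf(stderr, "WARNING: activation failed\n");
		return;          /* caller never learns about the failure */
	}
	desc->istate &= ~MOCK_NEEDS_STARTUP;
	desc->started = true;
}

/* Variant two: same logic, but the status reaches the caller. */
static int mock_enable_irq_int(struct mock_irq_desc *desc)
{
	int ret;

	if (!(desc->istate & MOCK_NEEDS_STARTUP))
		return 0;
	ret = mock_activate(desc);
	if (ret)
		return ret;      /* startup is skipped, flag stays set */
	desc->istate &= ~MOCK_NEEDS_STARTUP;
	desc->started = true;
	return 0;
}
```

In both variants a failed activation leaves the startup flag set and skips the
startup step. The only difference is whether the caller can react to it.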

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-11 15:09         ` Jan Kiszka
@ 2019-07-11 16:00           ` Philippe Gerum
  2019-07-11 16:34             ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Philippe Gerum @ 2019-07-11 16:00 UTC (permalink / raw)
  To: Jan Kiszka, Lange Norbert, Xenomai (xenomai@xenomai.org)

On 7/11/19 5:09 PM, Jan Kiszka wrote:
> On 11.07.19 16:40, Philippe Gerum wrote:
>> On 7/5/19 9:38 AM, Jan Kiszka wrote:
>>
>>> This addresses it on x86 for me:
>>>
>>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>>> index 6c279e065879..d503b875f086 100644
>>> --- a/kernel/irq/chip.c
>>> +++ b/kernel/irq/chip.c
>>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>>>  		ipipe_root_only();
>>>  
>>>  		raw_spin_lock_irqsave(&desc->lock, flags);
>>> -		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
>>> +		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
>>> +		    !WARN_ON(irq_activate(desc))) {
>>>  			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>>>  			chip->irq_startup(&desc->irq_data);
>>>  		}
>>>
>>> Probably upstream commit c942cee46bba (genirq: Separate activation and
>>> startup) makes this necessary.
>>>
>>> Philippe, I suppose this is either not essential on arm, or external
>>> interrupts weren't tested yet, like I missed on x86. Fine to make this a
>>> noarch patch?
>>
>> No issue. I've not been working on/with the I-pipe but Dovetail instead
>> in the past weeks, so testing of 4.19 is still very limited on my end. I
>> have several full-fledged real world ARM*-based application systems to
>> improve this, just need to find a way to squeeze this work in.
>>
> 
> One question remains, though: Should we just do the WARN_ON() thing within the
> existing API or change that API to return a potential error? Variant one I have
> ready, but I have no feeling for the risk that there is actually an error.
> 
> The same goes for ipipe_set_irq_affinity that will require the activation as
> well but cannot return an error so far.
> 

Moving from void to non-void would be backward-compatible provided we
don't tag these services as __must_check, so propagating the status
would make sense.
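
As a rough illustration of that compatibility argument, a small userspace
sketch follows. All compat_* names and the 256 limit are hypothetical and
exist only for this example. The point is that a caller written against the
former void signature keeps compiling once the service returns int, as long
as the function is not marked with the compiler's warn_unused_result
attribute (which is what __must_check expands to).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a service that used to return void. */
static int compat_enable_irq(unsigned int irq)
{
	/* Pretend that IRQ numbers at or above 256 cannot be activated. */
	return irq < 256 ? 0 : -EINVAL;
}

/* Old-style caller, written against the void API: ignores the result. */
static void compat_legacy_open(unsigned int irq)
{
	compat_enable_irq(irq);  /* still compiles, status is dropped */
}

/* Updated caller: passes the status on to its own caller. */
static int compat_updated_open(unsigned int irq)
{
	return compat_enable_irq(irq);
}
```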

-- 
Philippe.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-11 16:00           ` Philippe Gerum
@ 2019-07-11 16:34             ` Jan Kiszka
  2019-07-11 16:48               ` Philippe Gerum
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2019-07-11 16:34 UTC (permalink / raw)
  To: Philippe Gerum, Lange Norbert, Xenomai (xenomai@xenomai.org)

On 11.07.19 18:00, Philippe Gerum wrote:
> On 7/11/19 5:09 PM, Jan Kiszka wrote:
>> On 11.07.19 16:40, Philippe Gerum wrote:
>>> On 7/5/19 9:38 AM, Jan Kiszka wrote:
>>>
>>>> This addresses it on x86 for me:
>>>>
>>>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>>>> index 6c279e065879..d503b875f086 100644
>>>> --- a/kernel/irq/chip.c
>>>> +++ b/kernel/irq/chip.c
>>>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>>>>  		ipipe_root_only();
>>>>  
>>>>  		raw_spin_lock_irqsave(&desc->lock, flags);
>>>> -		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
>>>> +		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
>>>> +		    !WARN_ON(irq_activate(desc))) {
>>>>  			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>>>>  			chip->irq_startup(&desc->irq_data);
>>>>  		}
>>>>
>>>> Probably upstream commit c942cee46bba (genirq: Separate activation and
>>>> startup) makes this necessary.
>>>>
>>>> Philippe, I suppose this is either not essential on arm, or external
>>>> interrupts weren't tested yet, like I missed on x86. Fine to make this a
>>>> noarch patch?
>>>
>>> No issue. I've not been working on/with the I-pipe but Dovetail instead
>>> in the past weeks, so testing of 4.19 is still very limited on my end. I
>>> have several full-fledged real world ARM*-based application systems to
>>> improve this, just need to find a way to squeeze this work in.
>>>
>>
>> One question remains, though: Should we just do the WARN_ON() thing within the
>> existing API or change that API to return a potential error? Variant one I have
>> ready, but I have no feeling for the risk that there is actually an error.
>>
>> The same goes for ipipe_set_irq_affinity that will require the activation as
>> well but cannot return an error so far.
>>
> 
> Moving from void to non-void would be backward-compatible provided we
> don't tag these services as __must_check, so propagating the status
> would make sense.

...but it would also be risky, as we would then have no reporting of an error.
If we change the API, I would do that in a way that gives users (namely
drivers) a chance to become aware of this change.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-11 16:34             ` Jan Kiszka
@ 2019-07-11 16:48               ` Philippe Gerum
  2019-07-11 19:42                 ` Jan Kiszka
  0 siblings, 1 reply; 19+ messages in thread
From: Philippe Gerum @ 2019-07-11 16:48 UTC (permalink / raw)
  To: Jan Kiszka, Lange Norbert, Xenomai (xenomai@xenomai.org)

On 7/11/19 6:34 PM, Jan Kiszka wrote:
> On 11.07.19 18:00, Philippe Gerum wrote:
>> On 7/11/19 5:09 PM, Jan Kiszka wrote:
>>> On 11.07.19 16:40, Philippe Gerum wrote:
>>>> On 7/5/19 9:38 AM, Jan Kiszka wrote:
>>>>
>>>>> This addresses it on x86 for me:
>>>>>
>>>>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>>>>> index 6c279e065879..d503b875f086 100644
>>>>> --- a/kernel/irq/chip.c
>>>>> +++ b/kernel/irq/chip.c
>>>>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>>>>>  		ipipe_root_only();
>>>>>  
>>>>>  		raw_spin_lock_irqsave(&desc->lock, flags);
>>>>> -		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
>>>>> +		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
>>>>> +		    !WARN_ON(irq_activate(desc))) {
>>>>>  			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>>>>>  			chip->irq_startup(&desc->irq_data);
>>>>>  		}
>>>>>
>>>>> Probably upstream commit c942cee46bba (genirq: Separate activation and
>>>>> startup) makes this necessary.
>>>>>
>>>>> Philippe, I suppose this is either not essential on arm, or external
>>>>> interrupts weren't tested yet, like I missed on x86. Fine to make this a
>>>>> noarch patch?
>>>>
>>>> No issue. I've not been working on/with the I-pipe but Dovetail instead
>>>> in the past weeks, so testing of 4.19 is still very limited on my end. I
>>>> have several full-fledged real world ARM*-based application systems to
>>>> improve this, just need to find a way to squeeze this work in.
>>>>
>>>
>>> One question remains, though: Should we just do the WARN_ON() thing within the
>>> existing API or change that API to return a potential error? Variant one I have
>>> ready, but I have no feeling for the risk that there is actually an error.
>>>
>>> The same goes for ipipe_set_irq_affinity that will require the activation as
>>> well but cannot return an error so far.
>>>
>>
>> Moving from void to non-void would be backward-compatible provided we
>> don't tag these services as __must_check, so propagating the status
>> would make sense.
> 
> ...but it would also be risky as we then had no reporting of an error. If we
> change the API, I would do that in way users (namely drivers) have a chance to
> become aware of this change.
> 

Coupling error propagation and WARN_ON (e.g. in pipeline debug mode)
should not be a problem.
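
A minimal userspace sketch of what such a coupling could look like, purely as
an assumption about the shape of it: the dbg_* names, the debug switch and
the error code are invented for this example and are not I-pipe interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

static int dbg_enabled = 1;  /* hypothetical "pipeline debug" switch */
static int dbg_warn_count;   /* number of warnings printed so far */

/* Warn about a failure when debugging is on; always return the status. */
static int dbg_check(int ret, const char *what)
{
	if (ret && dbg_enabled) {
		fprintf(stderr, "WARNING: %s failed: %d\n", what, ret);
		dbg_warn_count++;
	}
	return ret;
}

/* Stand-in for the activation step; fails on request. */
static int dbg_activate(int fail)
{
	return fail ? -EBUSY : 0;
}

/* The service both warns (in debug mode) and propagates the error. */
static int dbg_enable_irq(int fail)
{
	return dbg_check(dbg_activate(fail), "activation");
}
```

With the switch off, the caller still receives the error code; only the
warning is suppressed.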

-- 
Philippe.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
  2019-07-11 16:48               ` Philippe Gerum
@ 2019-07-11 19:42                 ` Jan Kiszka
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Kiszka @ 2019-07-11 19:42 UTC (permalink / raw)
  To: Philippe Gerum, Lange Norbert, Xenomai (xenomai@xenomai.org)

On 11.07.19 18:48, Philippe Gerum wrote:
> On 7/11/19 6:34 PM, Jan Kiszka wrote:
>> On 11.07.19 18:00, Philippe Gerum wrote:
>>> On 7/11/19 5:09 PM, Jan Kiszka wrote:
>>>> On 11.07.19 16:40, Philippe Gerum wrote:
>>>>> On 7/5/19 9:38 AM, Jan Kiszka wrote:
>>>>>
>>>>>> This addresses it on x86 for me:
>>>>>>
>>>>>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>>>>>> index 6c279e065879..d503b875f086 100644
>>>>>> --- a/kernel/irq/chip.c
>>>>>> +++ b/kernel/irq/chip.c
>>>>>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
>>>>>>  		ipipe_root_only();
>>>>>>  
>>>>>>  		raw_spin_lock_irqsave(&desc->lock, flags);
>>>>>> -		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
>>>>>> +		if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
>>>>>> +		    !WARN_ON(irq_activate(desc))) {
>>>>>>  			desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
>>>>>>  			chip->irq_startup(&desc->irq_data);
>>>>>>  		}
>>>>>>
>>>>>> Probably upstream commit c942cee46bba (genirq: Separate activation and
>>>>>> startup) makes this necessary.
>>>>>>
>>>>>> Philippe, I suppose this is either not essential on arm, or external
>>>>>> interrupts weren't tested yet, like I missed on x86. Fine to make this a
>>>>>> noarch patch?
>>>>>
>>>>> No issue. I've not been working on/with the I-pipe but Dovetail instead
>>>>> in the past weeks, so testing of 4.19 is still very limited on my end. I
>>>>> have several full-fledged real world ARM*-based application systems to
>>>>> improve this, just need to find a way to squeeze this work in.
>>>>>
>>>>
>>>> One question remains, though: Should we just do the WARN_ON() thing within the
>>>> existing API or change that API to return a potential error? Variant one I have
>>>> ready, but I have no feeling for the risk that there is actually an error.
>>>>
>>>> The same goes for ipipe_set_irq_affinity that will require the activation as
>>>> well but cannot return an error so far.
>>>>
>>>
>>> Moving from void to non-void would be backward-compatible provided we
>>> don't tag these services as __must_check, so propagating the status
>>> would make sense.
>>
>> ...but it would also be risky as we then had no reporting of an error. If we
>> change the API, I would do that in way users (namely drivers) have a chance to
>> become aware of this change.
>>
> 
> Coupling error propagation and WARN_ON (e.g. in pipeline debug mode)
> should not be a problem.
> 

Evaluating the error code of ipipe_enable_irq and ipipe_set_irq_affinity in
Xenomai will mean requiring a minimal ipipe core version for 4.19. So I will
push the burden of dealing with unprepared drivers to Xenomai (WARN_ON there)
and rather enforce that ipipe update. Luckily, we have no release out yet that
supports 4.19.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-07-11 19:42 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-04  8:57 ipipe 4.19: spurious APIC interrupt when setting rt_igp to up Lange Norbert
2019-07-04 10:15 ` Jan Kiszka
2019-07-04 10:21   ` Jan Kiszka
2019-07-05  7:38     ` Jan Kiszka
2019-07-05 10:43       ` Lange Norbert
2019-07-05 11:29         ` Jan Kiszka
2019-07-05 13:56           ` Jan Kiszka
2019-07-09 16:21             ` Lange Norbert
2019-07-09 16:33               ` Jan Kiszka
2019-07-09 17:54                 ` Jan Kiszka
2019-07-10  8:44                   ` Lange Norbert
2019-07-10 21:30                   ` Jan Kiszka
2019-07-11 12:23                     ` Lange Norbert
2019-07-11 14:40       ` Philippe Gerum
2019-07-11 15:09         ` Jan Kiszka
2019-07-11 16:00           ` Philippe Gerum
2019-07-11 16:34             ` Jan Kiszka
2019-07-11 16:48               ` Philippe Gerum
2019-07-11 19:42                 ` Jan Kiszka
