* [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
@ 2013-02-15 15:15 Anders Blomdell
  2013-02-15 15:26 ` Jan Kiszka
  0 siblings, 1 reply; 11+ messages in thread
From: Anders Blomdell @ 2013-02-15 15:15 UTC (permalink / raw)
  To: Xenomai

Hi,

I have a DX79SI that dies with "kernel BUG at 
arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very 
surprising, since when running the system with an ordinary kernel there 
are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' messages each day.

The question is whether it would be possible to do something less fatal 
than 'BUG_ON(irq < 0);' in the code below:

int __ipipe_handle_irq(struct pt_regs *regs)
{
	struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
	int irq, vector = regs->orig_ax, flags = 0;
	struct pt_regs *tick_regs;

	if (likely(vector < 0)) {
		irq = __this_cpu_read(vector_irq[~vector]);
		BUG_ON(irq < 0);
	} else { /* Software-generated. */
		irq = vector;
		flags = IPIPE_IRQF_NOACK;
	}

Regards

Anders Blomdell

-- 
Anders Blomdell                  Email: anders.blomdell@control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden




* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-15 15:15 [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI Anders Blomdell
@ 2013-02-15 15:26 ` Jan Kiszka
  2013-02-15 15:34   ` Jan Kiszka
  2013-02-25 10:18   ` Anders Blomdell
  0 siblings, 2 replies; 11+ messages in thread
From: Jan Kiszka @ 2013-02-15 15:26 UTC (permalink / raw)
  To: Anders Blomdell; +Cc: Xenomai

On 2013-02-15 16:15, Anders Blomdell wrote:
> Hi,
> 
> I have a DX79SI that dies with "kernel BUG at
> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
> surprising since when running the system with an ordinary kernel thera
> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
> 
> Question is if it would be possible to do something less fatal than
> 'BUG_ON(irq < 0);' in the code below:

This remains a bug that has to be understood.

> 
> int __ipipe_handle_irq(struct pt_regs *regs)
> {
>     struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>     int irq, vector = regs->orig_ax, flags = 0;
>     struct pt_regs *tick_regs;
> 
>     if (likely(vector < 0)) {
>         irq = __this_cpu_read(vector_irq[~vector]);
>         BUG_ON(irq < 0);
>     } else { /* Software-generated. */
>         irq = vector;
>         flags = IPIPE_IRQF_NOACK;
>     }

Kernel 3.5.7 with latest I-pipe? This is the second report of this kind,
see [1] for the discussion and suggestions. If you don't have KGDB and
the like enabled, try Gilles' instrumentations.

Jan

[1] http://thread.gmane.org/gmane.linux.real-time.xenomai.users/15936

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-15 15:26 ` Jan Kiszka
@ 2013-02-15 15:34   ` Jan Kiszka
  2013-02-25 10:18   ` Anders Blomdell
  1 sibling, 0 replies; 11+ messages in thread
From: Jan Kiszka @ 2013-02-15 15:34 UTC (permalink / raw)
  To: Anders Blomdell; +Cc: Xenomai

On 2013-02-15 16:26, Jan Kiszka wrote:
> On 2013-02-15 16:15, Anders Blomdell wrote:
>> Hi,
>>
>> I have a DX79SI that dies with "kernel BUG at
>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>> surprising since when running the system with an ordinary kernel thera
>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>
>> Question is if it would be possible to do something less fatal than
>> 'BUG_ON(irq < 0);' in the code below:
> 
> This remains a bug that has to be understood.

Oh, now I really read the paragraph above: So your system generates
spurious interrupts? OK, for which vector? Are there reports on the web
related to this issue and hardware reasons? I wouldn't want to run RT
load on such a wacky box - provided it is a hardware issue.

The reason why we BUG here is that so far all cases were real I-pipe
bugs. Until we know this is required for otherwise fine hardware, I'm a
bit reluctant to add special code that handles this differently.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-15 15:26 ` Jan Kiszka
  2013-02-15 15:34   ` Jan Kiszka
@ 2013-02-25 10:18   ` Anders Blomdell
  2013-02-25 11:11     ` Jan Kiszka
  2013-02-25 12:27     ` Gilles Chanteperdrix
  1 sibling, 2 replies; 11+ messages in thread
From: Anders Blomdell @ 2013-02-25 10:18 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 2013-02-15 16:26, Jan Kiszka wrote:
> On 2013-02-15 16:15, Anders Blomdell wrote:
>> Hi,
>>
>> I have a DX79SI that dies with "kernel BUG at
>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>> surprising since when running the system with an ordinary kernel thera
>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>
>> Question is if it would be possible to do something less fatal than
>> 'BUG_ON(irq < 0);' in the code below:
>
> This remains a bug that has to be understood.
>
>>
>> int __ipipe_handle_irq(struct pt_regs *regs)
>> {
>>      struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>      int irq, vector = regs->orig_ax, flags = 0;
>>      struct pt_regs *tick_regs;
>>
>>      if (likely(vector < 0)) {
>>          irq = __this_cpu_read(vector_irq[~vector]);
>>          BUG_ON(irq < 0);
>>      } else { /* Software-generated. */
>>          irq = vector;
>>          flags = IPIPE_IRQF_NOACK;
>>      }
>
> Kernel 3.5.7 with latest I-pipe?
Yes.

> This is the second report of this kind,
> see [1] for the discussion and suggestions. If you don't have KGDB and
> that kind enabled, try Gilles' instrumentations.
After running Xenomai for five and a half days on a DX58SO motherboard, the 
system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for 
vector (irq -1)' on our logserver.

I'm planning to put in Gilles' instrumentation and change the BUG_ON to 
a WARN_ON/WARN, but what should I return after that? (My guess is 
'return 1', but waiting a week to be proved wrong would be a waste of 
time :-).

>
> Jan
>
> [1] http://thread.gmane.org/gmane.linux.real-time.xenomai.users/15936
>
Regards

Anders

-- 
Anders Blomdell                  Email: anders.blomdell@control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden




* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 10:18   ` Anders Blomdell
@ 2013-02-25 11:11     ` Jan Kiszka
  2013-02-25 11:53       ` Anders Blomdell
  2013-02-25 12:27     ` Gilles Chanteperdrix
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2013-02-25 11:11 UTC (permalink / raw)
  To: Anders Blomdell; +Cc: Xenomai

On 2013-02-25 11:18, Anders Blomdell wrote:
> On 2013-02-15 16:26, Jan Kiszka wrote:
>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>> Hi,
>>>
>>> I have a DX79SI that dies with "kernel BUG at
>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>> surprising since when running the system with an ordinary kernel thera
>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>
>>> Question is if it would be possible to do something less fatal than
>>> 'BUG_ON(irq < 0);' in the code below:
>>
>> This remains a bug that has to be understood.
>>
>>>
>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>> {
>>>      struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>      int irq, vector = regs->orig_ax, flags = 0;
>>>      struct pt_regs *tick_regs;
>>>
>>>      if (likely(vector < 0)) {
>>>          irq = __this_cpu_read(vector_irq[~vector]);
>>>          BUG_ON(irq < 0);
>>>      } else { /* Software-generated. */
>>>          irq = vector;
>>>          flags = IPIPE_IRQF_NOACK;
>>>      }
>>
>> Kernel 3.5.7 with latest I-pipe?
> Yes.
> 
>> This is the second report of this kind,
>> see [1] for the discussion and suggestions. If you don't have KGDB and
>> that kind enabled, try Gilles' instrumentations.
> After a running xenomai five and a half day on a DX58SO motherboard, the 
> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for 
> vector (irq -1)' on our logserver.
> 
> I'm planning to put in Gilles instrumentations and change the BUG_ON to 
> a WARN_ON/WARN, but what should I return after that (my guess is a 
> 'return 1', but waiting a week to be proved wrong would be a waste of 
> time :-).

Err, what was the test setup that generated the Linux error but not the
I-pipe BUG_ON? Was it a Xenomai-enabled kernel with the BUG_ON removed?

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 11:11     ` Jan Kiszka
@ 2013-02-25 11:53       ` Anders Blomdell
  0 siblings, 0 replies; 11+ messages in thread
From: Anders Blomdell @ 2013-02-25 11:53 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

On 2013-02-25 12:11, Jan Kiszka wrote:
> On 2013-02-25 11:18, Anders Blomdell wrote:
>> On 2013-02-15 16:26, Jan Kiszka wrote:
>>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>>> Hi,
>>>>
>>>> I have a DX79SI that dies with "kernel BUG at
>>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>>> surprising since when running the system with an ordinary kernel thera
>>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>>
>>>> Question is if it would be possible to do something less fatal than
>>>> 'BUG_ON(irq < 0);' in the code below:
>>>
>>> This remains a bug that has to be understood.
>>>
>>>>
>>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>>> {
>>>>       struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>>       int irq, vector = regs->orig_ax, flags = 0;
>>>>       struct pt_regs *tick_regs;
>>>>
>>>>       if (likely(vector < 0)) {
>>>>           irq = __this_cpu_read(vector_irq[~vector]);
>>>>           BUG_ON(irq < 0);
>>>>       } else { /* Software-generated. */
>>>>           irq = vector;
>>>>           flags = IPIPE_IRQF_NOACK;
>>>>       }
>>>
>>> Kernel 3.5.7 with latest I-pipe?
>> Yes.
>>
>>> This is the second report of this kind,
>>> see [1] for the discussion and suggestions. If you don't have KGDB and
>>> that kind enabled, try Gilles' instrumentations.
>> After a running xenomai five and a half day on a DX58SO motherboard, the
>> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for
>> vector (irq -1)' on our logserver.
>>
>> I'm planning to put in Gilles instrumentations and change the BUG_ON to
>> a WARN_ON/WARN, but what should I return after that (my guess is a
>> 'return 1', but waiting a week to be proved wrong would be a waste of
>> time :-).
>
> Err, what was the test setup that generated the Linux error but not the
> I-pipe BUG_ON? Was it a Xenomai-enabled kernel with the BUG_ON removed?
Strangely enough, a Xenomai-enabled kernel with BUG_ON still in place, 
so yes it's a mystery how the do_IRQ gets triggered. No realtime tasks 
active at all though!

After the do_IRQ message reached the logserver, the system was mostly 
unresponsive (i.e. no way to get the screen to light up, and the ssh daemon 
immediately terminates login sessions). Since the BUG_ON on the DX79SI 
motherboard seemed to kill the filesystem, my immediate thought is that 
this is what happened here as well (loosely based on the fact that the 
message did not reach the local /var/log/messages).

Another thing I could test is to run a stock 3.5.7 kernel and see if the 
do_IRQ message appears there as well (3.6.11 has not shown any signs of 
do_IRQ messages).

/Anders

-- 
Anders Blomdell                  Email: anders.blomdell@control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden




* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 10:18   ` Anders Blomdell
  2013-02-25 11:11     ` Jan Kiszka
@ 2013-02-25 12:27     ` Gilles Chanteperdrix
  2013-02-25 14:39       ` Anders Blomdell
  1 sibling, 1 reply; 11+ messages in thread
From: Gilles Chanteperdrix @ 2013-02-25 12:27 UTC (permalink / raw)
  To: Anders Blomdell; +Cc: Jan Kiszka, Xenomai

On 02/25/2013 11:18 AM, Anders Blomdell wrote:

> On 2013-02-15 16:26, Jan Kiszka wrote:
>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>> Hi,
>>>
>>> I have a DX79SI that dies with "kernel BUG at
>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>> surprising since when running the system with an ordinary kernel thera
>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>
>>> Question is if it would be possible to do something less fatal than
>>> 'BUG_ON(irq < 0);' in the code below:
>>
>> This remains a bug that has to be understood.
>>
>>>
>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>> {
>>>      struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>      int irq, vector = regs->orig_ax, flags = 0;
>>>      struct pt_regs *tick_regs;
>>>
>>>      if (likely(vector < 0)) {
>>>          irq = __this_cpu_read(vector_irq[~vector]);
>>>          BUG_ON(irq < 0);
>>>      } else { /* Software-generated. */
>>>          irq = vector;
>>>          flags = IPIPE_IRQF_NOACK;
>>>      }
>>
>> Kernel 3.5.7 with latest I-pipe?
> Yes.
> 
>> This is the second report of this kind,
>> see [1] for the discussion and suggestions. If you don't have KGDB and
>> that kind enabled, try Gilles' instrumentations.
> After a running xenomai five and a half day on a DX58SO motherboard, the 
> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for 
> vector (irq -1)' on our logserver.
> 
> I'm planning to put in Gilles instrumentations and change the BUG_ON to 
> a WARN_ON/WARN, but what should I return after that (my guess is a 
> 'return 1', but waiting a week to be proved wrong would be a waste of 
> time :-).


Returning 1 is incorrect:
- you should probably jump to the end of the __ipipe_handle_irq function
- if the irq is irq 7, meaning a spurious irq, Linux should handle it,
so __ipipe_dispatch_irq should be called.

The real question, however, is why the I-pipe does not have an irq for
this vector.

-- 
                                                                Gilles.



* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 12:27     ` Gilles Chanteperdrix
@ 2013-02-25 14:39       ` Anders Blomdell
  2013-02-25 14:53         ` Philippe Gerum
  2013-02-25 14:54         ` Jan Kiszka
  0 siblings, 2 replies; 11+ messages in thread
From: Anders Blomdell @ 2013-02-25 14:39 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Jan Kiszka, Xenomai

On 2013-02-25 13:27, Gilles Chanteperdrix wrote:
> On 02/25/2013 11:18 AM, Anders Blomdell wrote:
>
>> On 2013-02-15 16:26, Jan Kiszka wrote:
>>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>>> Hi,
>>>>
>>>> I have a DX79SI that dies with "kernel BUG at
>>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>>> surprising since when running the system with an ordinary kernel thera
>>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>>
>>>> Question is if it would be possible to do something less fatal than
>>>> 'BUG_ON(irq < 0);' in the code below:
>>>
>>> This remains a bug that has to be understood.
>>>
>>>>
>>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>>> {
>>>>       struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>>       int irq, vector = regs->orig_ax, flags = 0;
>>>>       struct pt_regs *tick_regs;
>>>>
>>>>       if (likely(vector < 0)) {
>>>>           irq = __this_cpu_read(vector_irq[~vector]);
>>>>           BUG_ON(irq < 0);
>>>>       } else { /* Software-generated. */
>>>>           irq = vector;
>>>>           flags = IPIPE_IRQF_NOACK;
>>>>       }
>>>
>>> Kernel 3.5.7 with latest I-pipe?
>> Yes.
>>
>>> This is the second report of this kind,
>>> see [1] for the discussion and suggestions. If you don't have KGDB and
>>> that kind enabled, try Gilles' instrumentations.
>> After a running xenomai five and a half day on a DX58SO motherboard, the
>> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for
>> vector (irq -1)' on our logserver.
>>
>> I'm planning to put in Gilles instrumentations and change the BUG_ON to
>> a WARN_ON/WARN, but what should I return after that (my guess is a
>> 'return 1', but waiting a week to be proved wrong would be a waste of
>> time :-).
>
>
> Returning 1 is incorrect:
> - you should probably jump to the end of the __ipipe_handle_irq function
> - if the irq is irq 7, meaning a spurious irq, Linux should handle it,
> so, __ipipe_dispatch_irq should be called.
OK, so you mean that I'm probably looking at two different problems:
DX58SO: a spurious interrupt (irq==7) passes through __ipipe_handle_irq 
without triggering BUG_ON, but something else breaks.
DX79SI: some (spurious?) interrupt results in irq < 0, triggering BUG_ON.

Would the following changes be what you have in mind?

	if (likely(vector < 0)) {
		irq = __this_cpu_read(vector_irq[~vector]);
		if (irq < 0) {
			WARN(irq < 0, "irq(%d) < 0", irq);
			goto out;
		}
	} else { /* Software-generated. */
		irq = vector;
		flags = IPIPE_IRQF_NOACK;
	}
	...
out:
	return 1;




Regards

Anders
-- 
Anders Blomdell                  Email: anders.blomdell@control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden




* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 14:39       ` Anders Blomdell
@ 2013-02-25 14:53         ` Philippe Gerum
  2013-02-25 14:54         ` Jan Kiszka
  1 sibling, 0 replies; 11+ messages in thread
From: Philippe Gerum @ 2013-02-25 14:53 UTC (permalink / raw)
  To: Anders Blomdell; +Cc: Jan Kiszka, Xenomai

On 02/25/2013 03:39 PM, Anders Blomdell wrote:
> On 2013-02-25 13:27, Gilles Chanteperdrix wrote:
>> On 02/25/2013 11:18 AM, Anders Blomdell wrote:
>>
>>> On 2013-02-15 16:26, Jan Kiszka wrote:
>>>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>>>> Hi,
>>>>>
>>>>> I have a DX79SI that dies with "kernel BUG at
>>>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>>>> surprising since when running the system with an ordinary kernel thera
>>>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>>>
>>>>> Question is if it would be possible to do something less fatal than
>>>>> 'BUG_ON(irq < 0);' in the code below:
>>>>
>>>> This remains a bug that has to be understood.
>>>>
>>>>>
>>>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>>>> {
>>>>>       struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>>>       int irq, vector = regs->orig_ax, flags = 0;
>>>>>       struct pt_regs *tick_regs;
>>>>>
>>>>>       if (likely(vector < 0)) {
>>>>>           irq = __this_cpu_read(vector_irq[~vector]);
>>>>>           BUG_ON(irq < 0);
>>>>>       } else { /* Software-generated. */
>>>>>           irq = vector;
>>>>>           flags = IPIPE_IRQF_NOACK;
>>>>>       }
>>>>
>>>> Kernel 3.5.7 with latest I-pipe?
>>> Yes.
>>>
>>>> This is the second report of this kind,
>>>> see [1] for the discussion and suggestions. If you don't have KGDB and
>>>> that kind enabled, try Gilles' instrumentations.
>>> After a running xenomai five and a half day on a DX58SO motherboard, the
>>> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for
>>> vector (irq -1)' on our logserver.
>>>
>>> I'm planning to put in Gilles instrumentations and change the BUG_ON to
>>> a WARN_ON/WARN, but what should I return after that (my guess is a
>>> 'return 1', but waiting a week to be proved wrong would be a waste of
>>> time :-).
>>
>>
>> Returning 1 is incorrect:
>> - you should probably jump to the end of the __ipipe_handle_irq function
>> - if the irq is irq 7, meaning a spurious irq, Linux should handle it,
>> so, __ipipe_dispatch_irq should be called.
> OK, so you mean that I'm probably lokking at two different problems
> DX58SO: a spurious interrupt (irq==7) passes through __ipipe_handle_irq
> without triggering BUG_ON, but something else breaks.
> DX79SI: some (spurious?) interrupt results in irq < 0, triggering BUG_ON
>
> Would the following changes be what you have in mind:
>
>      if (likely(vector < 0)) {
>          irq = __this_cpu_read(vector_irq[~vector]);
>              if (irq < 0) {
>                  WARN(irq < 0, "irq(%d) < 0", irq);
>                 goto out:
>           }
>      } else { /* Software-generated. */
>          irq = vector;
>          flags = IPIPE_IRQF_NOACK;
>      }
>      ...
> out:
>      return 1;
>

	...
	goto out;
	...

out:
	if (!__ipipe_root_p ||
	    test_bit(IPIPE_STALL_FLAG, &__ipipe_root_status))
		return 0;

	return 1;

Returning non-zero means: "context is regular, pass irq to linux 
normally", and therefore let do_IRQ handle it.

The interrupted context stops being regular linux-wise when the 
interrupt has preempted the real-time domain (in which case linux might 
have been preempted in the middle of nowhere by a rt activity), or the 
root domain is stalled (which means linux does NOT expect that irq to 
flow down to it, yet).

-- 
Philippe.



* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 14:39       ` Anders Blomdell
  2013-02-25 14:53         ` Philippe Gerum
@ 2013-02-25 14:54         ` Jan Kiszka
  2013-02-25 15:17           ` Anders Blomdell
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2013-02-25 14:54 UTC (permalink / raw)
  To: Anders Blomdell; +Cc: Xenomai

On 2013-02-25 15:39, Anders Blomdell wrote:
> On 2013-02-25 13:27, Gilles Chanteperdrix wrote:
>> On 02/25/2013 11:18 AM, Anders Blomdell wrote:
>>
>>> On 2013-02-15 16:26, Jan Kiszka wrote:
>>>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>>>> Hi,
>>>>>
>>>>> I have a DX79SI that dies with "kernel BUG at
>>>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>>>> surprising since when running the system with an ordinary kernel thera
>>>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>>>
>>>>> Question is if it would be possible to do something less fatal than
>>>>> 'BUG_ON(irq < 0);' in the code below:
>>>>
>>>> This remains a bug that has to be understood.
>>>>
>>>>>
>>>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>>>> {
>>>>>       struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>>>       int irq, vector = regs->orig_ax, flags = 0;
>>>>>       struct pt_regs *tick_regs;
>>>>>
>>>>>       if (likely(vector < 0)) {
>>>>>           irq = __this_cpu_read(vector_irq[~vector]);
>>>>>           BUG_ON(irq < 0);
>>>>>       } else { /* Software-generated. */
>>>>>           irq = vector;
>>>>>           flags = IPIPE_IRQF_NOACK;
>>>>>       }
>>>>
>>>> Kernel 3.5.7 with latest I-pipe?
>>> Yes.
>>>
>>>> This is the second report of this kind,
>>>> see [1] for the discussion and suggestions. If you don't have KGDB and
>>>> that kind enabled, try Gilles' instrumentations.
>>> After a running xenomai five and a half day on a DX58SO motherboard, the
>>> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for
>>> vector (irq -1)' on our logserver.
>>>
>>> I'm planning to put in Gilles instrumentations and change the BUG_ON to
>>> a WARN_ON/WARN, but what should I return after that (my guess is a
>>> 'return 1', but waiting a week to be proved wrong would be a waste of
>>> time :-).
>>
>>
>> Returning 1 is incorrect:
>> - you should probably jump to the end of the __ipipe_handle_irq function
>> - if the irq is irq 7, meaning a spurious irq, Linux should handle it,
>> so, __ipipe_dispatch_irq should be called.
> OK, so you mean that I'm probably lokking at two different problems
> DX58SO: a spurious interrupt (irq==7) passes through __ipipe_handle_irq 
> without triggering BUG_ON, but something else breaks.
> DX79SI: some (spurious?) interrupt results in irq < 0, triggering BUG_ON
> 
> Would the following changes be what you have in mind:
> 
> 	if (likely(vector < 0)) {
> 		irq = __this_cpu_read(vector_irq[~vector]);
> 	        if (irq < 0) {
> 		    	WARN(irq < 0, "irq(%d) < 0", irq);

Again, that's only instrumentation to help find the bug. According
to my reading of the code, Linux should behave incorrectly over invalid
vector_irq entries as well.

Jan

>   		  	goto out:
> 	 	}
> 	} else { /* Software-generated. */
> 		irq = vector;
> 		flags = IPIPE_IRQF_NOACK;
> 	}
> 	...
> out:
> 	return 1;
> 
> 
> 
> 
> Regards
> 
> Anders
> 

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux



* Re: [Xenomai] kernel BUG at arch/x86/kernel/ipipe.c:589! on motherboard DX79SI
  2013-02-25 14:54         ` Jan Kiszka
@ 2013-02-25 15:17           ` Anders Blomdell
  0 siblings, 0 replies; 11+ messages in thread
From: Anders Blomdell @ 2013-02-25 15:17 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 2013-02-25 15:54, Jan Kiszka wrote:
> On 2013-02-25 15:39, Anders Blomdell wrote:
>> On 2013-02-25 13:27, Gilles Chanteperdrix wrote:
>>> On 02/25/2013 11:18 AM, Anders Blomdell wrote:
>>>
>>>> On 2013-02-15 16:26, Jan Kiszka wrote:
>>>>> On 2013-02-15 16:15, Anders Blomdell wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a DX79SI that dies with "kernel BUG at
>>>>>> arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
>>>>>> surprising since when running the system with an ordinary kernel thera
>>>>>> are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.
>>>>>>
>>>>>> Question is if it would be possible to do something less fatal than
>>>>>> 'BUG_ON(irq < 0);' in the code below:
>>>>>
>>>>> This remains a bug that has to be understood.
>>>>>
>>>>>>
>>>>>> int __ipipe_handle_irq(struct pt_regs *regs)
>>>>>> {
>>>>>>        struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
>>>>>>        int irq, vector = regs->orig_ax, flags = 0;
>>>>>>        struct pt_regs *tick_regs;
>>>>>>
>>>>>>        if (likely(vector < 0)) {
>>>>>>            irq = __this_cpu_read(vector_irq[~vector]);
>>>>>>            BUG_ON(irq < 0);
>>>>>>        } else { /* Software-generated. */
>>>>>>            irq = vector;
>>>>>>            flags = IPIPE_IRQF_NOACK;
>>>>>>        }
>>>>>
>>>>> Kernel 3.5.7 with latest I-pipe?
>>>> Yes.
>>>>
>>>>> This is the second report of this kind,
>>>>> see [1] for the discussion and suggestions. If you don't have KGDB and
>>>>> that kind enabled, try Gilles' instrumentations.
>>>> After a running xenomai five and a half day on a DX58SO motherboard, the
>>>> system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for
>>>> vector (irq -1)' on our logserver.
>>>>
>>>> I'm planning to put in Gilles instrumentations and change the BUG_ON to
>>>> a WARN_ON/WARN, but what should I return after that (my guess is a
>>>> 'return 1', but waiting a week to be proved wrong would be a waste of
>>>> time :-).
>>>
>>>
>>> Returning 1 is incorrect:
>>> - you should probably jump to the end of the __ipipe_handle_irq function
>>> - if the irq is irq 7, meaning a spurious irq, Linux should handle it,
>>> so, __ipipe_dispatch_irq should be called.
>> OK, so you mean that I'm probably lokking at two different problems
>> DX58SO: a spurious interrupt (irq==7) passes through __ipipe_handle_irq
>> without triggering BUG_ON, but something else breaks.
>> DX79SI: some (spurious?) interrupt results in irq < 0, triggering BUG_ON
>>
>> Would the following changes be what you have in mind:
>>
>> 	if (likely(vector < 0)) {
>> 		irq = __this_cpu_read(vector_irq[~vector]);
>> 	        if (irq < 0) {
>> 		    	WARN(irq < 0, "irq(%d) < 0", irq);
>
> Again, that's only an instrumentation to help finding the bug.
I know (have to crawl before I can walk :-))

My understanding of the code is that if irq < 0, I should not call 
__ipipe_dispatch_irq, since its irq argument is unsigned, but perhaps 
the MAYDAY and IPIPE_STALL_FLAG stuff should still be executed (moving 
the out: label up a few lines).

I guess that I don't need to consider the special case when 
p->hrtimer_irq == -1 and irq == -1?



> According to my reading of the code, Linux should behave incorrectly
> over invalid vector_irq entries as well.
The difference on DX79SI being that Linux (3.6.11) only logs do_IRQ, 
while Xenomai (Linux 3.5.7/xenomai-2.6.2.1) gives a BUG_ON which makes 
all filesystems read-only, and after that everything more or less 
freezes :-(

>
> Jan
>
>>    		  	goto out:
>> 	 	}
>> 	} else { /* Software-generated. */
>> 		irq = vector;
>> 		flags = IPIPE_IRQF_NOACK;
>> 	}
>> 	...
>> out:
>> 	return 1;
>>
>>
>>
>>
Regards

Anders

-- 
Anders Blomdell                  Email: anders.blomdell@control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden



