* [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog
@ 2011-05-19 13:58 Philippe Gerum
  2011-05-19 18:15 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 6+ messages in thread
From: Philippe Gerum @ 2011-05-19 13:58 UTC
  To: xenomai


The NMI latency watchdog is a feature Xenomai supports when suitable
hardware is available: it triggers a stack backtrace dump, then panics,
when a real-time timer tick is late by a given amount of time. We
used it in the early days to chase pathological latencies, particularly
when debugging the original SMP port.
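
As a rough illustration of the lateness test described above, here is a
minimal sketch in C. All names are hypothetical and this is not Xenomai's
actual code; the real feature relied on the NMI firing rather than on a
software comparison like this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from Xenomai: decide whether a timer
 * tick observed at now_ns is overdue by more than threshold_ns with
 * respect to the deadline it was programmed for.
 */
static bool tick_is_late(uint64_t deadline_ns, uint64_t now_ns,
                         uint64_t threshold_ns)
{
    assert(threshold_ns > 0);          /* a zero tolerance makes no sense here */

    if (now_ns <= deadline_ns)
        return false;                  /* tick arrived on time or early */

    return now_ns - deadline_ns > threshold_ns;
}
```

In the watchdog described here, a positive answer is the point at which
the backtrace would be dumped and the kernel would panic.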

We currently have two architectures supporting that watchdog, namely x86
and blackfin. On x86, the rebasing of the mainline NMI support onto
the perf subsystem has broken our NMI hijacking badly, making it
unusable since 2.6.38.

As I was diving into our NMI support code to adapt it once again for
2.6.38 - with a vague feeling of seasickness coming on - I felt the time
may have come to question the very presence of that feature in our code base:

- the NMI watchdog predates the latency tracer. AFAIC, I stopped using the
former long ago, preferring the latter for debugging latency issues.

- the non-maskable nature of the interrupt trigger does not help us
nowadays compared to using the I-pipe tracer: the mainline NMI support
would catch hard lockups with irqs off and panic the same way, and the
tracer would help spot the issue with a much finer level of detail
in case the latency spot leaves the machine in a sane state, i.e. when
the board remains usable and allows for inspection of /proc/ipipe/trace
files.

- hijacking the mainline NMI code the way we do has always been a
massive pain on x86, prone to trigger conflicts with later kernel
releases.
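
For readers unfamiliar with how a mainline-style hard lockup check can
catch a CPU stuck with irqs off, the idea can be sketched as follows.
This is a simplified model with hypothetical names, not the kernel's
code: a periodic timer interrupt bumps a counter, and the NMI handler
reports a lockup when that counter has not moved since its previous
check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state for the sketch. */
struct lockup_state {
    uint64_t timer_irqs;      /* bumped by the periodic timer interrupt */
    uint64_t timer_irqs_seen; /* value recorded at the previous NMI check */
};

/* Would run from the (maskable) periodic timer interrupt. */
static void timer_interrupt(struct lockup_state *s)
{
    assert(s != NULL);
    s->timer_irqs++;
}

/*
 * Would run from the NMI handler. Returns true when no timer interrupt
 * was serviced since the previous check, which suggests the CPU has
 * been spinning with interrupts disabled.
 */
static bool nmi_check(struct lockup_state *s)
{
    assert(s != NULL);

    if (s->timer_irqs == s->timer_irqs_seen)
        return true;                   /* no progress: report a hard lockup */

    s->timer_irqs_seen = s->timer_irqs;
    return false;
}
```

Because the check runs in NMI context, it still fires while ordinary
interrupts are masked, which is what makes this kind of lockup visible
at all.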

For these reasons, I'm considering issuing a patch for a complete removal
of the NMI latency watchdog code in Xenomai 2.6.x, and disabling the
feature for 2.6.38 kernels and above in 2.5.x.

Comments welcome.

-- 
Philippe.





* Re: [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog
  2011-05-19 13:58 [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog Philippe Gerum
@ 2011-05-19 18:15 ` Gilles Chanteperdrix
  2011-05-19 18:36   ` Jan Kiszka
  0 siblings, 1 reply; 6+ messages in thread
From: Gilles Chanteperdrix @ 2011-05-19 18:15 UTC
  To: Philippe Gerum; +Cc: xenomai

On 05/19/2011 03:58 PM, Philippe Gerum wrote:
> For this reason, I'm considering issuing a patch for a complete removal
> of the NMI latency watchdog code in Xenomai 2.6.x, disabling the feature
> for 2.6.38 kernels and above in 2.5.x.
> 
> Comments welcome.

I am in the same situation as you: I no longer use Xeno's NMI watchdog,
so I agree to get rid of it.

-- 
                                                                Gilles.



* Re: [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog
  2011-05-19 18:15 ` Gilles Chanteperdrix
@ 2011-05-19 18:36   ` Jan Kiszka
  2011-05-19 20:29     ` Philippe Gerum
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2011-05-19 18:36 UTC
  To: Gilles Chanteperdrix; +Cc: xenomai


On 2011-05-19 20:15, Gilles Chanteperdrix wrote:
> On 05/19/2011 03:58 PM, Philippe Gerum wrote:
>> For this reason, I'm considering issuing a patch for a complete removal
>> of the NMI latency watchdog code in Xenomai 2.6.x, disabling the feature
>> for 2.6.38 kernels and above in 2.5.x.
>>
>> Comments welcome.
> 
> I am in the same case as you: I no longer use Xeno's NMI watchdog, so I
> agree to get rid of it.

Yeah. The last time we wanted to use it to get more information about a
hard hang, the CPU we used was not supported.

Philippe, did you already test the Linux watchdog to see whether it
generates proper results on artificial Xenomai lockups on a single core?

Jan




* Re: [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog
  2011-05-19 18:36   ` Jan Kiszka
@ 2011-05-19 20:29     ` Philippe Gerum
  2011-06-22 17:16       ` Gilles Chanteperdrix
  0 siblings, 1 reply; 6+ messages in thread
From: Philippe Gerum @ 2011-05-19 20:29 UTC
  To: Jan Kiszka; +Cc: xenomai

On Thu, 2011-05-19 at 20:36 +0200, Jan Kiszka wrote:
> On 2011-05-19 20:15, Gilles Chanteperdrix wrote:
> > On 05/19/2011 03:58 PM, Philippe Gerum wrote:
> >> For this reason, I'm considering issuing a patch for a complete removal
> >> of the NMI latency watchdog code in Xenomai 2.6.x, disabling the feature
> >> for 2.6.38 kernels and above in 2.5.x.
> >>
> >> Comments welcome.
> > 
> > I am in the same case as you: I no longer use Xeno's NMI watchdog, so I
> > agree to get rid of it.
> 
> Yeah. The last time we wanted to use it get more information about a
> hard hang, the CPU we used was not supported.
> 
> Philippe, did you test the Linux watchdog already, if it generate proper
> results on artificial Xenomai lockups on a single core?

This works provided we tell the pipeline to enter printk-sync mode when
the watchdog kicks in. So I'd say that we could probably do a better job
by making the pipeline core smarter wrt NMI watchdog context handling than
by asking Xenomai to duplicate the mainline code to get its own NMI handling.

> 
> Jan
> 

-- 
Philippe.





* Re: [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog
  2011-05-19 20:29     ` Philippe Gerum
@ 2011-06-22 17:16       ` Gilles Chanteperdrix
  2011-06-22 20:47         ` Philippe Gerum
  0 siblings, 1 reply; 6+ messages in thread
From: Gilles Chanteperdrix @ 2011-06-22 17:16 UTC
  To: Philippe Gerum; +Cc: Jan Kiszka, xenomai

On 05/19/2011 10:29 PM, Philippe Gerum wrote:
> On Thu, 2011-05-19 at 20:36 +0200, Jan Kiszka wrote:
>> On 2011-05-19 20:15, Gilles Chanteperdrix wrote:
>>> On 05/19/2011 03:58 PM, Philippe Gerum wrote:
>>>> For this reason, I'm considering issuing a patch for a complete removal
>>>> of the NMI latency watchdog code in Xenomai 2.6.x, disabling the feature
>>>> for 2.6.38 kernels and above in 2.5.x.
>>>>
>>>> Comments welcome.
>>>
>>> I am in the same case as you: I no longer use Xeno's NMI watchdog, so I
>>> agree to get rid of it.
>>
>> Yeah. The last time we wanted to use it get more information about a
>> hard hang, the CPU we used was not supported.
>>
>> Philippe, did you test the Linux watchdog already, if it generate proper
>> results on artificial Xenomai lockups on a single core?
> 
> This works provided we tell the pipeline to enter printk-sync mode when
> the watchdog kicks. So I'd say that we could probably do a better job in
> making the pipeline core smarter wrt NMI watchdog context handling than
> asking Xenomai to dup the mainline code for having its own NMI handling.

If nobody disagrees, I am removing this code from -head. Now.

-- 
                                                                Gilles.



* Re: [Xenomai-core] [RFC] Getting rid of the NMI latency watchdog
  2011-06-22 17:16       ` Gilles Chanteperdrix
@ 2011-06-22 20:47         ` Philippe Gerum
  0 siblings, 0 replies; 6+ messages in thread
From: Philippe Gerum @ 2011-06-22 20:47 UTC
  To: Gilles Chanteperdrix; +Cc: Jan Kiszka, xenomai

On Wed, 2011-06-22 at 19:16 +0200, Gilles Chanteperdrix wrote:
> On 05/19/2011 10:29 PM, Philippe Gerum wrote:
> > On Thu, 2011-05-19 at 20:36 +0200, Jan Kiszka wrote:
> >> On 2011-05-19 20:15, Gilles Chanteperdrix wrote:
> >>> On 05/19/2011 03:58 PM, Philippe Gerum wrote:
> >>>> For this reason, I'm considering issuing a patch for a complete removal
> >>>> of the NMI latency watchdog code in Xenomai 2.6.x, disabling the feature
> >>>> for 2.6.38 kernels and above in 2.5.x.
> >>>>
> >>>> Comments welcome.
> >>>
> >>> I am in the same case as you: I no longer use Xeno's NMI watchdog, so I
> >>> agree to get rid of it.
> >>
> >> Yeah. The last time we wanted to use it get more information about a
> >> hard hang, the CPU we used was not supported.
> >>
> >> Philippe, did you test the Linux watchdog already, if it generate proper
> >> results on artificial Xenomai lockups on a single core?
> > 
> > This works provided we tell the pipeline to enter printk-sync mode when
> > the watchdog kicks. So I'd say that we could probably do a better job in
> > making the pipeline core smarter wrt NMI watchdog context handling than
> > asking Xenomai to dup the mainline code for having its own NMI handling.
> 
> If nobody disagrees, I am removing this code from -head. Now.
> 

Ack.

-- 
Philippe.



