* at91: input overruns question
@ 2010-10-26  6:19 Aras Vaichas
  2010-10-26  6:49 ` Uwe Kleine-König
  2010-10-28  3:14 ` Aras Vaichas
  0 siblings, 2 replies; 7+ messages in thread
From: Aras Vaichas @ 2010-10-26  6:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I'm updating our production kernel to 2.6.33.7-rt29 for an at91rm9200.
This is the newest version supported by RT.

I've noticed that the age-old debug serial input overrun(s) issue has
gotten far worse and I can no longer use the cursor keys at all. i.e.
no history on serial console or vi.

As the kernel version numbers have gone up, the problem seems to have
gotten worse.

Is there anything that I can do to make it a bit more usable?

Any .config changes?

Is it worse because of the RT patches?

Please feel free to point me to any previous posts discussing the
issue, or solutions.

Thanks!

-- 
Aras Vaichas

Senior Engineer - Software | Magellan Technology Pty Ltd

65 Johnston St, Annandale NSW 2038, Australia

P | +61 2 9562 9854 | F +61 2 9518 7620 | arasv at magellan-technology.com

* at91: input overruns question
  2010-10-26  6:19 at91: input overruns question Aras Vaichas
@ 2010-10-26  6:49 ` Uwe Kleine-König
  2010-10-27  1:00   ` Aras Vaichas
  2010-10-28  3:14 ` Aras Vaichas
  1 sibling, 1 reply; 7+ messages in thread
From: Uwe Kleine-König @ 2010-10-26  6:49 UTC (permalink / raw)
  To: linux-arm-kernel

Hello Aras,

On Tue, Oct 26, 2010 at 05:19:30PM +1100, Aras Vaichas wrote:
> I'm updating our production kernel to 2.6.33.7-rt29 for an at91rm9200.
> This is the newest version supported by RT.
> 
> I've noticed that the age-old debug serial input overrun(s) issue has
> gotten far worse and I can no longer use the cursor keys at all. i.e.
> no history on serial console or vi.
> 
> As the kernel version numbers have gone up, the problem seems to have
> gotten worse.
> 
> Is there anything that I can do to make it a bit more usable?
> 
> Any .config changes?
> 
> Is it worse because of the RT patches?
> 
> Please feel free to point me to any previous posts discussing the
> issue, or solutions.
Where does the warning come from?  serial layer, at91 serial driver?
How big is the fifo in the at91 serial device?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

* at91: input overruns question
  2010-10-26  6:49 ` Uwe Kleine-König
@ 2010-10-27  1:00   ` Aras Vaichas
  0 siblings, 0 replies; 7+ messages in thread
From: Aras Vaichas @ 2010-10-27  1:00 UTC (permalink / raw)
  To: linux-arm-kernel

2010/10/26 Uwe Kleine-König <u.kleine-koenig@pengutronix.de>:
> Hello Aras,
>
> On Tue, Oct 26, 2010 at 05:19:30PM +1100, Aras Vaichas wrote:
>> I'm updating our production kernel to 2.6.33.7-rt29 for an at91rm9200.
>> This is the newest version supported by RT.
>>
>> I've noticed that the age-old debug serial input overrun(s) issue has
>> gotten far worse and I can no longer use the cursor keys at all. i.e.
>> no history on serial console or vi.
>>
>> As the kernel version numbers have gone up, the problem seems to have
>> gotten worse.
>>
>> Is there anything that I can do to make it a bit more usable?
>>
>> Any .config changes?
>>
>> Is it worse because of the RT patches?
>>
>> Please feel free to point me to any previous posts discussing the
>> issue, or solutions.
> Where does the warning come from?

The warning comes from drivers/char/n_tty.c, function n_tty_receive_overrun()


> How big is the fifo in the at91 serial device?

The DBGU (debug serial port) doesn't have one. The idea is that it's
supposed to use the PDC (Peripheral DMA Controller) instead.

There is this comment in /arch/arm/mach-at91/at91rm9200_devices.c:

static struct atmel_uart_data dbgu_data = {
    .use_dma_tx = 0,
    .use_dma_rx = 0,        /* DBGU not capable of receive DMA */

So I guess there is no DMA for the DBGU receive. Also, AFAIK, it
shares an IRQ with the RTC. But the RTC only goes off once a second.

The only reason that I've brought it up is that I've never seen it
this bad. It barely happened in 2.4.x, 2.6.19 was OK, 2.6.26 got worse
and now the DBGU is almost a lost cause in 2.6.33.7-rt29.

I can try rebuilding this kernel without the RT patches and see if
that makes a difference, but my application developers really want as
much performance as they can get at this point.

Aras

* at91: input overruns question
  2010-10-26  6:19 at91: input overruns question Aras Vaichas
  2010-10-26  6:49 ` Uwe Kleine-König
@ 2010-10-28  3:14 ` Aras Vaichas
  2010-10-31 11:19   ` Remy Bohmer
  1 sibling, 1 reply; 7+ messages in thread
From: Aras Vaichas @ 2010-10-28  3:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 October 2010 17:19, Aras Vaichas <arasv@magellan-technology.com> wrote:
> Hi,
>
> I'm updating our production kernel to 2.6.33.7-rt29 for an at91rm9200.
> This is the newest version supported by RT.
>
> I've noticed that the age-old debug serial input overrun(s) issue has
> gotten far worse and I can no longer use the cursor keys at all. i.e.
> no history on serial console or vi.
>
> As the kernel version numbers have gone up, the problem seems to have
> gotten worse.
>
> Is there anything that I can do to make it a bit more usable?
>
> Any .config changes?

Configuration settings CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS cause
the overrun(s) problems to get worse.

If I disable these two settings I still get overrun(s) but not quite
as frequent.

Can someone explain why these settings would cause the CPU to be
unable to process 2 serial characters in a row?

Something must be causing the DBGU serial receive interrupt to stall
for more than 86us in order to miss a complete 115200 baud serial
character.
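
For reference, the 86us figure follows from the frame length. A minimal
sketch of the arithmetic, assuming 8N1 framing (1 start + 8 data + 1 stop
= 10 bits per character), which the thread does not state explicitly;
char_time_ns is a name made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time on the wire for one character frame, in nanoseconds (rounded down).
 * bits_per_frame is an assumption supplied by the caller: 10 for 8N1.
 */
static uint64_t char_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0);
    return (uint64_t)bits_per_frame * 1000000000ull / baud;
}
```

With these assumptions char_time_ns(115200, 10) gives 86805 ns, i.e.
roughly 86.8us per character. A cursor key is typically sent as a
three-byte escape sequence (e.g. ESC [ A) back to back at line rate,
which would explain why cursor keys suffer before ordinary typing does.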

Aras

* at91: input overruns question
  2010-10-28  3:14 ` Aras Vaichas
@ 2010-10-31 11:19   ` Remy Bohmer
  2010-11-01 23:33     ` Aras Vaichas
  0 siblings, 1 reply; 7+ messages in thread
From: Remy Bohmer @ 2010-10-31 11:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

2010/10/28 Aras Vaichas <arasv@magellan-technology.com>:
> On 26 October 2010 17:19, Aras Vaichas <arasv@magellan-technology.com> wrote:
>> Hi,
>>
>> I'm updating our production kernel to 2.6.33.7-rt29 for an at91rm9200.
>> This is the newest version supported by RT.
>>
>> I've noticed that the age-old debug serial input overrun(s) issue has
>> gotten far worse and I can no longer use the cursor keys at all. i.e.
>> no history on serial console or vi.
>>
>> As the kernel version numbers have gone up, the problem seems to have
>> gotten worse.
>>
>> Is there anything that I can do to make it a bit more usable?
>>
>> Any .config changes?
>
> Configuration settings CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS cause
> the overrun(s) problems to get worse.

Indeed. high-res timers and no-hz have a longer interrupt handling
compared to the old fashioned ticking kernel.

> If I disable these two settings I still get overrun(s) but not quite
> as frequent.

> Can someone explain why these settings would cause the CPU to be
> unable to process 2 serial characters in a row?
> Something must be causing the DBGU serial receive interrupt to stall
> for more than 86us in order to miss a complete 115200 baud serial
> character.

The DBGU has a 1 byte FIFO, thus not a FIFO at all. As you already
mention, the time between 2 interrupts at 115200 baud must be about
86us. On rm9200 the timer interrupt handler alone can cost from about
50us to more than 100us (measured through ETM trace), combined with a
worst case interrupt latency of about 75us.
This is even worse on a HRT or NOHZ enabled kernel.
So, mathematically you can already prove that it cannot keep up with
115200 properly.
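
The numbers above can be put side by side as a rough budget. This is
only a sketch under assumptions not stated in the thread: 10-bit (8N1)
frames, handler cost and interrupt latency simply adding up, and a
character being lost whenever the total delay exceeds one frame time.
highest_safe_baud and the rate table are hypothetical names for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Common line rates, fastest first (illustrative selection). */
static const uint32_t standard_bauds[] = { 115200, 57600, 38400, 19200, 9600 };

/*
 * Return the fastest listed baud rate whose frame time is at least
 * delay_ns, i.e. a rate at which a single-byte receive register could
 * still be read before the next character completes. Returns 0 if even
 * the slowest listed rate is too fast for the given delay.
 */
static uint32_t highest_safe_baud(uint64_t delay_ns, uint32_t bits_per_frame)
{
    size_t i;

    assert(bits_per_frame > 0);
    for (i = 0; i < sizeof standard_bauds / sizeof standard_bauds[0]; i++) {
        uint64_t frame_ns =
            (uint64_t)bits_per_frame * 1000000000ull / standard_bauds[i];

        if (frame_ns >= delay_ns)
            return standard_bauds[i];
    }
    return 0;
}
```

Under these assumptions a total delay of 125us (50us + 75us) leaves
57600 baud as the fastest listed rate, and 175us (100us + 75us) leaves
38400; 115200 only fits while the delay stays below about 86.8us.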

Enabling DMA for DBGU is no option either since IIRC it would only
generate an interrupt once the DMA buffer is full, not when the first
character arrives in the buffer. Thus not very useful for a terminal.

For more info about the timer interrupt, see:
http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-03/msg01954.html
and
http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-03/pnga98YRFl9x7.png
(ETM trace screendump of the timer interrupt handler)

Kind regards,

Remy

* at91: input overruns question
  2010-10-31 11:19   ` Remy Bohmer
@ 2010-11-01 23:33     ` Aras Vaichas
  2010-11-02 14:11       ` Remy Bohmer
  0 siblings, 1 reply; 7+ messages in thread
From: Aras Vaichas @ 2010-11-01 23:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 31 October 2010 22:19, Remy Bohmer <linux@bohmer.net> wrote:
> 2010/10/28 Aras Vaichas <arasv@magellan-technology.com>:
>> On 26 October 2010 17:19, Aras Vaichas <arasv@magellan-technology.com> wrote:
<SNIP>
>>
>> Configuration settings CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS cause
>> the overrun(s) problems to get worse.
>
> Indeed. high-res timers and no-hz have a longer interrupt handling
> compared to the old fashioned ticking kernel.

<SNIP>

Thank you for that, I suspected as much, but I don't have trace
capability on my target to verify it.

Luckily the DBGU port is only used infrequently, most of the time the
connection to our product is via ethernet/usb.

I did manage to get the DBGU to work by changing the IRQ request flag
to NODELAY.

#ifndef CONFIG_RTC_DRV_AT91RM9200
    if ( machine_is_at91rm9200() && port->irq == 1 )
        retval = request_irq(port->irq, atmel_interrupt, IRQF_NODELAY,
            tty ? tty->name : "atmel_serial", port);
    else
#endif
        retval = request_irq(port->irq, atmel_interrupt, IRQF_SHARED,
            tty ? tty->name : "atmel_serial", port);

I understand the warnings and limitations of NODELAY for an RT kernel,
but this is a quick fix which allows me to continue to use the DBGU
for my development and I can remove it once I get to production
status.

There is a comment in the atmel_serial.c file which says that
"Transmit is IRQF_NODELAY safe" but I don't know if the Receive is as
well. Maybe the Receive isn't deemed safe because it is shared with
the RTC? If it is not shared, then maybe it is safe? I intend to
investigate this myself if I have any further problems.

Thanks again!

Aras

* at91: input overruns question
  2010-11-01 23:33     ` Aras Vaichas
@ 2010-11-02 14:11       ` Remy Bohmer
  0 siblings, 0 replies; 7+ messages in thread
From: Remy Bohmer @ 2010-11-02 14:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

> I did manage to get the DBGU to work by changing the IRQ request flag
> to NODELAY.
>
> #ifndef CONFIG_RTC_DRV_AT91RM9200
>     if ( machine_is_at91rm9200() && port->irq == 1 )
>         retval = request_irq(port->irq, atmel_interrupt, IRQF_NODELAY,
>             tty ? tty->name : "atmel_serial", port);
>     else
> #endif
>         retval = request_irq(port->irq, atmel_interrupt, IRQF_SHARED,
>             tty ? tty->name : "atmel_serial", port);
>
> I understand the warnings and limitations of NODELAY for an RT kernel,
> but this is a quick fix which allows me to continue to use the DBGU
> for my development and I can remove it once I get to production
> status.
>

I was looking through my own patch repo: on 2.6.33-rt I use the
attached patch (since 2.6.31, for Atmel at91sam9261), but for a
different reason than the one you are running into now! (see patch header)

Notice that on rm9200 the DBGU shares its interrupt with the system
timer (ST).
This means that everything the interrupt handler for ST does bothers the DBGU.
BUT: The ST has its interrupt handler installed with IRQF_TIMER (see
arch/arm/mach-at91/at91rm9200_time.c). IRQF_TIMER implies
IRQF_NODELAY.
Without NODELAY set, the IRQ-thread of the DBGU will only run once the
ST-interrupt has completed, causing your troubles.
The driver of the DBGU was adapted such that it runs properly in
NODELAY context. You can keep it set, even in production use.

> There is a comment in the atmel_serial.c file which says that
> "Transmit is IRQF_NODELAY safe" but I don't know if the Receive is as
> well.

Especially the receive was important to make safe for that context.
That one will drop data if it is not executed fast enough!
We do that only for receive. For transmit only there would have been
no reason to make it nodelay safe, since that would only imply a
bottleneck on throughput in case it runs from IRQ-thread. (you won't
get buffer overflows on transmit...)

> Maybe the Receive isn't deemed safe because it is shared with
> the RTC? If it is not shared, then maybe it is safe? I intend to
> investigate this myself if I have any further problems.

Only the interrupt handlers marked nodelay run in IRQ context (like ST
and DBGU now); all others run in thread context (like PMC and RTC).
All other shared interrupts, like RTC and PMC, slow down the interrupt
handling of the DBGU and ST, since the IRQ-thread must first run
before the interrupt can be enabled again at AIC level.

The best solution is to adapt all shared handlers to use the
request_threaded_irq() API and only do in the top-half context what is
absolutely necessary. (Only the ST-handler cannot be modified though)
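
The idea of doing only the minimum in the top half can be modelled
outside the kernel. The following is a plain userspace sketch, not
driver code and not the attached patch: the top half just moves the
received byte into a ring buffer, and the deferred half drains it
later. All names (rx_ring, top_half, bottom_half, RING_SIZE) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64u   /* power of two, so masking the index works */

struct rx_ring {
    uint8_t  buf[RING_SIZE];
    unsigned head;      /* advanced only by the top half */
    unsigned tail;      /* advanced only by the bottom half */
    unsigned dropped;   /* bytes lost because the ring was full */
};

/*
 * Top half: the bare minimum. Take the byte out of the (modelled)
 * one-byte receive register so the hardware can accept the next one.
 */
static void top_half(struct rx_ring *r, uint8_t holding_reg)
{
    assert(r != NULL);
    if (r->head - r->tail == RING_SIZE) {
        r->dropped++;           /* ring full: count the loss and bail out */
        return;
    }
    r->buf[r->head & (RING_SIZE - 1u)] = holding_reg;
    r->head++;
}

/*
 * Bottom half: runs later, at lower urgency, and hands everything that
 * has accumulated to the consumer. Returns the number of bytes copied.
 */
static size_t bottom_half(struct rx_ring *r, uint8_t *out, size_t max)
{
    size_t n = 0;

    assert(r != NULL && out != NULL);
    while (n < max && r->tail != r->head) {
        out[n++] = r->buf[r->tail & (RING_SIZE - 1u)];
        r->tail++;
    }
    return n;
}
```

In this model a three-byte cursor-key sequence survives as long as
top_half() runs once per character time, no matter how late
bottom_half() is scheduled, up to the ring capacity.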

Kind regards,

Remy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: make-atmel-serial-irq-nodelay.patch
Type: text/x-patch
Size: 1197 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20101102/242ded5e/attachment.bin>
