From: Steve Freyder <steve@freyder.net>
Cc: "xenomai@xenomai.org" <xenomai@xenomai.org>
Subject: Re: rt_dev_send() stalls periodic task
Date: Mon, 22 Apr 2019 15:58:36 -0500
Message-ID: <5CBE2AFC.2050105@freyder.net>
In-Reply-To: <5CBE1B46.3050804@freyder.net>

On 4/22/2019 2:51 PM, Steve Freyder via Xenomai wrote:
> On 4/22/2019 1:45 AM, Jan Kiszka wrote:
>> On 22.04.19 08:40, C Smith via Xenomai wrote:
>>> Thanks for your insight, Steve. I didn't realize rt_dev_write() doesn't
>>> actually stall until it is called many times and the 4K TX buffer gets
>>> full. (Is that right, Jan?)
>>> If that is the case, sure, I could find a way to check the TX buffer fill
>>> level to prevent my app from stalling.
>>>
>>> I rewrote the xeno_16550A driver RTSER_RTIOC_GET_STATUS ioctl to return
>>> to userspace the contents of the IIR and the IER too.
>>> I'm getting IIR = 0b 0001 0100, so the source of the latest interrupt is
>>> an RX (not surprising, as I'm doing full duplex) and there is no THRE
>>> interrupt pending.
>>> So regardless of the ultimate cause, this state will never empty the TX
>>> buffer.
>>>
>>> I think my only choice is to try something I had to do once before on a
>>> similarly misbehaving serial port: I'll rewrite the xeno_16550A interrupt
>>> handlers to redundantly check for data pending in the TX buffer whenever
>>> any interrupt like an RX interrupt happens. I do have bidirectional
>>> traffic after all, so the driver will wake up frequently and keep the TX
>>> data transmitting.
>>>
>>> Interestingly enough, the stall problem did not occur when I used the
>>> sample serial code provided by xenomai: cross-link.c. I also rewrote
>>> cross-link.c to send a 72 byte packet and receive on the same port (I
>>> installed a physical loopback device on the serial port). No stalls for
>>> 12+ hours with packets streaming at 100 Hz.
>>> The only difference in the serial configuration between that cross-link.c
>>> app and my app was:
>>> struct rtser_config :
>>>          .rx_timeout        = RTSER_DEF_TIMEOUT  // infinite, no stall for many hours in cross-link.c
>>> versus:
>>>          .rx_timeout        = 500000   // 500us, stalls within an hour in my app
>>> I don't know why an RX setting affects TX behavior. I also can't use
>>> RTSER_DEF_TIMEOUT in my application or it dies when it starts up - no
>>> clue why.  But I did try setting
>>>    .rx_timeout        = 5000000   // 5 ms, my app doesn't stall for several hours
>>> and though that did not cause the serial to stall in my app for several
>>> hours of testing, it is just open-loop finger-crossing, and not a real
>>> solution.
>>> I need the TX interrupts to fire reliably. So I think I must rewrite that
>>> interrupt handler, as above.
>>>
>>
>> I think we have a race between rt_16550_write filling the software queue
>> that the tx interrupt is supposed to write out and the latter already
>> firing, consuming that event without seeing the queue filled. I'll think
>> about a better algorithm tomorrow, one that can possibly get rid of some
>> interrupt events as well.
>>
>> Jan
>>
> Greetings again,
>
> If cross-link.c is not stalling, but the CSmith application hangs on
> startup when using similar settings to what cross-link.c is using, it
> tells me that understanding why this "hang on startup" is happening
> would be a good idea.  I know this has happened to me when I got an
> event from a UART that my code did not handle, and because I did not
> handle it, the event continued to fire over and over - a hang. I
> theorized that perhaps there's an issue with there being stale data
> or a data overrun condition that exists when the app starts up that's
> causing this hang.  In either case, it sounds as though the difference
> in settings between CSmith app and cross-link.c might be a key factor.
>
> I went back to the previous email trail, and if I interpreted it
> correctly, the overall data rate is only about 80% of 115Kbaud. This
> suggests that every time there is a write, the 4K software buffer in
> the driver should be completely empty - as should the TX FIFO. The
> only time that won't be true is when the transmit processing got
> stalled (by loss of interrupt, or whatever).
>
> I would be interested to see what happens if the CSmith app
> were to be modified to write one byte at a time, with no delay
> between rt_dev_write() calls.
>
> Finally, some searching shows that back when the original National
> Semiconductor 16550[A] UARTs were first being "cloned" by other
> vendors, National created a program called "COMTEST" that was
> designed to reveal the "misbehaviour" of those competing chips by
> doing extensive testing of the timing and other characteristics and
> how it deviated from "the real thing".  I wonder if anyone in this
> group knows where a copy of that program (or a more modern version)
> might exist?
>
> Regards,
> Steve
>
>
Apologies, I said "hangs on startup" but the original statement was
"dies on startup".  So the theory was that if that were fixed, and
the timeout were RTSER_DEF_TIMEOUT as it is in cross-link.c, this
might solve the problem.
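
To make the race Jan describes in the quoted text concrete, here is a
minimal sketch in C. It is a toy model and an assumption throughout, not
code from the xeno_16550A driver: struct uart_model, tx_irq(), rx_irq(),
model_write() and replay_race() are hypothetical names, and the hardware is
reduced to a single "THRE event pending" flag. It replays the suspected
ordering (TX interrupt fires on an empty queue, then the write fills the
queue) and shows the effect of the workaround proposed in the quoted text,
where the RX interrupt path also pushes out pending TX data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a UART with a software TX queue.
 * This is NOT the xeno_16550A driver; all names are illustrative. */
struct uart_model {
    unsigned char queue[64]; /* software TX queue filled by the writer */
    size_t head;             /* index of the next byte to transmit */
    size_t tail;             /* index of the next free slot */
    bool thre_pending;       /* a "transmitter empty" event is latched */
    size_t sent;             /* bytes handed over to the (modeled) hardware */
};

/* Number of bytes still waiting in the software queue. */
static size_t queued(const struct uart_model *u)
{
    return u->tail - u->head;
}

/* Hand every queued byte to the hardware (simplified: no FIFO depth). */
static void drain(struct uart_model *u)
{
    while (u->head < u->tail) {
        u->head++;
        u->sent++;
    }
}

/* TX interrupt: consumes the THRE event even if the queue is empty. */
static void tx_irq(struct uart_model *u)
{
    if (!u->thre_pending)
        return;              /* no event latched, nothing to do */
    u->thre_pending = false; /* event is consumed here */
    drain(u);
}

/* Writer: only queues data and relies on a later TX interrupt. */
static void model_write(struct uart_model *u, const void *buf, size_t n)
{
    assert(n <= sizeof u->queue - u->tail);
    memcpy(u->queue + u->tail, buf, n);
    u->tail += n;
}

/* RX interrupt: with kick_tx set, it also pushes out pending TX data
 * (the redundant check proposed in the quoted text). */
static void rx_irq(struct uart_model *u, bool kick_tx)
{
    if (kick_tx)
        drain(u);
}

/* Replays the suspected ordering; returns the bytes left stuck. */
size_t replay_race(bool kick_tx)
{
    struct uart_model u = {0};

    u.thre_pending = true;
    tx_irq(&u);                  /* fires first, sees an empty queue */
    model_write(&u, "hello", 5); /* queue is filled too late */
    tx_irq(&u);                  /* no event left: nothing is sent */
    rx_irq(&u, kick_tx);         /* inbound traffic arrives */
    return queued(&u);
}
```

In this model replay_race(false) leaves all 5 bytes in the queue, while
replay_race(true) leaves none. That only illustrates why the redundant
check would mask the symptom; it says nothing about whether this ordering
is what actually happens in the real driver.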


