* Serial data loss
From: gianluca @ 2020-04-07  7:30 UTC
  To: Greg Kroah-Hartman, jslaby, linux-serial, linux-kernel,
	Gianluca Renzi, dimka, linux

Hello everybody!

I hope you are keeping safe against Covid-19 a.k.a. Coronavirus!

Now my issue:

I am having serious trouble with data loss when using two internal serial 
ports on my boards, which are based on the NXP/Freescale i.MX28 SoC 
(ARM926EJ-S core, ARMv5TEJ architecture).

It runs at 454 MHz.

Kernel used: 4.9.x

When I run my test software between two serial ports connected to each 
other by a null-modem cable, it fails when the baud rates are different, 
and the data loss increases as the baud rate goes up.

I suspect overruns (I am now modifying my software to check for them as 
well), and I think they are caused by the way the ISR is called and by all 
data being passed to the UART circular buffer inside the interrupt routine.
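One way to check for overruns from user space is the Linux TIOCGICOUNT ioctl, which returns the per-port interrupt counters. The following is a minimal sketch, assuming the struct serial_icounter_struct layout from <linux/serial.h> (eleven int counters followed by nine reserved ints) and the asm-generic ioctl number as a fallback; the helper names parse_icount and read_icount are made up for illustration, and whether a given driver updates every counter should be checked against its source.

```python
import fcntl
import struct
import termios

# Assumed layout of struct serial_icounter_struct (<linux/serial.h>):
# eleven C ints, then nine reserved ints.
_FIELDS = ("cts", "dsr", "rng", "dcd", "rx", "tx",
           "frame", "overrun", "parity", "brk", "buf_overrun")
_FMT = "20i"  # 11 counters + 9 reserved words

# Fall back to the asm-generic ioctl number if this Python build lacks it.
TIOCGICOUNT = getattr(termios, "TIOCGICOUNT", 0x545D)


def parse_icount(raw: bytes) -> dict:
    """Decode the raw ioctl buffer into a name -> counter mapping."""
    values = struct.unpack(_FMT, raw)
    return dict(zip(_FIELDS, values))  # reserved words are ignored


def read_icount(fd: int) -> dict:
    """Read the counters of an already opened serial port (Linux only)."""
    buf = bytearray(struct.calcsize(_FMT))
    fcntl.ioctl(fd, TIOCGICOUNT, buf)  # the kernel fills buf in place
    return parse_icount(bytes(buf))
```

Reading the counters before and after a test run and comparing "overrun" (hardware FIFO overflow) with "buf_overrun" (tty buffer overflow) would show at which level the characters are lost.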

I am talking about the high latency between the IRQ and the service 
routine that drains the FIFO, when another IRQ is raised at the same time 
by another UART running at a different speed.

The code I was looking at is drivers/tty/serial/mxs-auart.c, __but__ all 
other serial drivers behave the same way: they read one character at a 
time from the FIFO (if there is one) and put it into the circular buffer 
so that the serial/tty layer can pass it on to the user's read routine.

Each function call has some overhead and takes time, and if another 
interrupt is raised by the same UART core but from another serial port (a 
different context), the FIFO, which the UART hardware keeps filling, 
cannot be serviced fast enough and an overrun occurs. I think this applies 
to __almost__ every serial driver, as they are written in the same way.
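The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the mxs-auart code: characters arrive at a fixed rate, the interrupt fires on the first character in an empty FIFO, the handler starts after a fixed latency, and it then removes one character at a time at a fixed cost. The function name and all parameters are hypothetical.

```python
def simulate_overruns(baud, fifo_depth, irq_latency_s, per_char_cost_s,
                      n_chars, bits_per_char=10):
    """Count characters lost by a hardware RX FIFO in a simplified model.

    bits_per_char=10 assumes 8N1 framing (start + 8 data + stop).
    """
    t_char = bits_per_char / baud
    fifo = 0              # characters currently held in the FIFO
    dropped = 0
    next_service = None   # time at which the CPU removes the next character

    for k in range(n_chars):
        now = k * t_char
        # Let the handler drain whatever it can before this arrival.
        while next_service is not None and next_service <= now:
            fifo -= 1
            next_service = next_service + per_char_cost_s if fifo else None
        if fifo == fifo_depth:
            dropped += 1                 # FIFO full: overrun
        else:
            fifo += 1
            if next_service is None:     # FIFO was empty: a new IRQ fires
                next_service = now + irq_latency_s
    return dropped


if __name__ == "__main__":
    for latency in (10e-6, 500e-6, 5e-3):
        lost = simulate_overruns(115200, 16, latency, 2e-6, 10_000)
        print(f"latency {latency * 1e6:7.0f} us -> {lost} lost")
```

In this model the per-character cost matters far less than the time the handler is kept from running at all, which is the quantity worth measuring on the real board.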

And it is __NOT__ a problem of the CPU and its speed! Using two USB 
serial converters (FTDI and Prolific PL2303 based) on each board, the 
problem does not appear at all, even after 24 hours running at more than 
115200!!!

It does work fine if I am using two different serial devices: one 
internal uart (mxs-auart) and an external uart (ttyUSB).

So I can say it is related to how the hardware manages the interrupt 
context and to the small size of the FIFO/buffer.

Are those assumptions correct?

Will a shared FIQ driver over the UART solve the issue?

Regards,
-- 
Eurek s.r.l.                          |
Electronic Engineering                | http://www.eurek.it
via Celletta 8/B, 40026 Imola, Italy  | Phone: +39-(0)542-609120
p.iva 00690621206 - c.f. 04020030377  | Fax:   +39-(0)542-609212


* Re: Serial data loss
From: Greg Kroah-Hartman @ 2020-04-07  8:24 UTC
  To: gianluca; +Cc: jslaby, linux-serial, linux-kernel, Gianluca Renzi, dimka, linux

On Tue, Apr 07, 2020 at 09:30:21AM +0200, gianluca wrote:
> I have a BIG trouble having dataloss when using two internal serial ports of
> my boards based on NXP/FreeScale iMX28 SoC ARMv5Te ARM920ej-s architecture.
> 
> It runs at 454Mhz.
> 
> Kernel used 4.9.x

That's a very old kernel, you are going to have to get support for that
from the vendor you bought it from :(

> When using my test case unit software between two serial ports connect each
> other by a null modem cable, it fails when the speed rate are different,

Of course, how would that work?

> and
> dataloss is increasing higher the speed rate.

What type of flow control are you using?

> I suppose to have overruns (now I am modifying my software to check them
> too), but I think it is due the way the ISR is called and all data are
> passed to the uart circular buffer within the interrupt routine.

Are you using flow control?

> I am talking about the high latency from the IRQ up to the service routine
> when flushing the FIFO and another IRQ is called by another uart in the same
> time at different speed.
> 
> The code I was looking is: drivers/tty/serial/mxs-auart.c __but__ all other
> serial drivers are acting in the same way: they are reading one character at
> time from the FIFO (if it exists) and put it into the circular buffer so
> serial/tty driver can pass them to the user read routine.
> 
> Each function call has some overhead and it is time-consuming, and if
> another interrupt is invoked by the same UART Core but from another serial
> port (different context) the continuos insertion done by hardware UART into
> the FIFO cannot be served fast enough to have an overrun. I think this can
> be applied __almost__ to every serial driver as they are written in the same
> way.
> 
> And it is __NOT__ an issue because of the CPU and its speed! Using two
> serial converter (FTDI and Prolific PL2303 based) on each board, the problem
> does not appear at all even after 24 hours running at more than 115200!!!

usb-serial devices are totally different and send data to the host in a
completely different way.

Your hardware might just not be able to handle really high baud rates at
a continuous stream, what baud rate were you using?
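To put rough numbers on the baud rate question, here is a small sketch of the time budget the interrupt handler has before a character is lost. It assumes 8N1 framing (10 bits per character) and a simplified model in which the RX interrupt fires when the FIFO holds trigger_level characters and the first loss occurs once fifo_depth + 1 characters have arrived. The FIFO depth and trigger level used in the loop are illustrative values, not figures taken from the i.MX28 documentation.

```python
def char_time_s(baud: int, bits_per_char: int = 10) -> float:
    """Seconds on the wire for one character (10 bits assumes 8N1)."""
    return bits_per_char / baud


def overrun_deadline_s(baud: int, fifo_depth: int, trigger_level: int,
                       bits_per_char: int = 10) -> float:
    """Time from RX interrupt to the first lost character (simplified)."""
    if not 1 <= trigger_level <= fifo_depth:
        raise ValueError("trigger_level must be within 1..fifo_depth")
    free_slots = fifo_depth - trigger_level
    return (free_slots + 1) * char_time_s(baud, bits_per_char)


if __name__ == "__main__":
    for baud in (9600, 38400, 115200):
        for depth, trig in ((1, 1), (16, 8)):   # hypothetical configurations
            us = overrun_deadline_s(baud, depth, trig) * 1e6
            print(f"{baud:>6} baud, depth {depth:>2}, trigger {trig}: "
                  f"{us:8.1f} us")
```

Without a FIFO the budget is a single character time, so any period longer than that with interrupts held off would be enough to lose data.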

And again, this is what flow control was designed for, please use it.

> It does work fine if I am using two different serial devices: one internal
> uart (mxs-auart) and an external uart (ttyUSB).

Again, different interrupt and protocols being used for the USB stuff.

thanks,

greg k-h


* Re: Serial data loss
From: gianluca @ 2020-04-07  9:01 UTC
  To: Greg Kroah-Hartman
  Cc: jslaby, linux-serial, linux-kernel, Gianluca Renzi, dimka, linux

Hello,
I am very pleased that Mr. Greg Kroah-Hartman is writing to me in person!

I appreciate it a lot, sir!

On 04/07/2020 10:24 AM, Greg Kroah-Hartman wrote:
> On Tue, Apr 07, 2020 at 09:30:21AM +0200, gianluca wrote:
>> I have a BIG trouble having dataloss when using two internal serial ports of
>> my boards based on NXP/FreeScale iMX28 SoC ARMv5Te ARM920ej-s architecture.
>>
>> It runs at 454Mhz.
>>
>> Kernel used 4.9.x
>
> That's a very old kernel, you are going to have to get support for that
> from the vendor you bought it from :(
>

We are the vendor. ;-)

Jokes aside, I can try the latest kernel, 5.6, and see how it behaves, 
but at first glance the driver looks exactly the same as in kernel 4.9.

>> When using my test case unit software between two serial ports connect each
>> other by a null modem cable, it fails when the speed rate are different,
>
> Of course, how would that work?
>

I am not a native English speaker, so I caused a misunderstanding: my 
test case is a program with two pthreads, in which the main thread works 
at a different baud rate from the other pthread. The same software runs 
on two different machines, and since each pair of corresponding ports 
uses the same baud rate, it should work.

e.g. /dev/ttyAPP1 is running at 9600 and /dev/ttyAPP2 is running at 38400

The same on the other machine. Both ports are null-modem connected:

	9600  /dev/ttyAPP1 <----> /dev/ttyAPP1 9600
	38400 /dev/ttyAPP2 <----> /dev/ttyAPP2 38400

I hope this is clear now. ;-)
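For a test like this, one simple way to count lost characters is to send a known pattern and look for gaps on the receiving side. The sketch below assumes the sender transmits an incrementing byte counter (0, 1, ..., 255, 0, ...); this pattern and the helper count_missing are hypothetical and are not taken from the actual test software.

```python
def count_missing(received: bytes, start: int = 0) -> int:
    """Count characters missing from an incrementing mod-256 byte stream.

    Limitation: a single gap of 256 or more characters aliases to a
    smaller gap and is under-counted.
    """
    expected = start & 0xFF
    missing = 0
    for b in received:
        missing += (b - expected) & 0xFF   # size of the gap before b
        expected = (b + 1) & 0xFF
    return missing
```

Each thread can run this over what it reads from its own port, so losses on the 9600 link and on the 38400 link are counted separately.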

>> and
>> dataloss is increasing higher the speed rate.
>
> What type of flow control are you using?
>

Unfortunately, no flow control, because I cannot use it. On the real 
hardware those two ports are connected to a microcontroller unit which 
does not have flow control: only RX and TX are connected (i.e. no 
RTS/CTS/DTR/DSR lines).

>> I suppose to have overruns (now I am modifying my software to check them
>> too), but I think it is due the way the ISR is called and all data are
>> passed to the uart circular buffer within the interrupt routine.
>
> Are you using flow control?
>

As above, no [ unfortunately ]


>> I am talking about the high latency from the IRQ up to the service routine
>> when flushing the FIFO and another IRQ is called by another uart in the same
>> time at different speed.
>>
>> The code I was looking is: drivers/tty/serial/mxs-auart.c __but__ all other
>> serial drivers are acting in the same way: they are reading one character at
>> time from the FIFO (if it exists) and put it into the circular buffer so
>> serial/tty driver can pass them to the user read routine.
>>
>> Each function call has some overhead and it is time-consuming, and if
>> another interrupt is invoked by the same UART Core but from another serial
>> port (different context) the continuos insertion done by hardware UART into
>> the FIFO cannot be served fast enough to have an overrun. I think this can
>> be applied __almost__ to every serial driver as they are written in the same
>> way.
>>
>> And it is __NOT__ an issue because of the CPU and its speed! Using two
>> serial converter (FTDI and Prolific PL2303 based) on each board, the problem
>> does not appear at all even after 24 hours running at more than 115200!!!
>
> usb-serial devices are totally different and send data to the host in a
> completly different way.
>
> Your hardware might just not be able to handle really high baud rates at
> a continous stream, what baud rate were you using?
>

I suppose so, but the same issue can be shown on any single-core 
processor (with a UART that has no FIFO) using two ports on the same UART 
core, running the Linux kernel at 450 MHz or less.

The IRQ latency is the same.

> And again, this is what flow control was designed for, please use it.
>

I know, and usually I use some sort of protocol that checks the 
correctness of each packet and, if the check fails, has the packet 
requested again and resent.
In this case the microcontroller board I am connected to is not built by 
us, and the software uses a custom protocol (and I do not know whether a 
transfer error can be recovered by issuing another request).
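As an illustration of the kind of packet check mentioned above, here is a minimal framing sketch. The frame layout (sequence byte, length byte, payload, 16-bit CRC) is an assumption made up for this example and has nothing to do with the custom protocol of the microcontroller board; binascii.crc_hqx is the CRC-CCITT routine from the Python standard library.

```python
import binascii
import struct


def encode_frame(seq: int, payload: bytes) -> bytes:
    """Build a frame: seq (1 byte), length (1 byte), payload, CRC-16."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    header = struct.pack(">BB", seq & 0xFF, len(payload))
    crc = binascii.crc_hqx(header + payload, 0xFFFF)
    return header + payload + struct.pack(">H", crc)


def decode_frame(frame: bytes):
    """Return (seq, payload), or None if the frame is damaged."""
    if len(frame) < 4:
        return None
    seq, length = struct.unpack(">BB", frame[:2])
    if len(frame) != 2 + length + 2:
        return None                      # a character was lost or added
    (crc,) = struct.unpack(">H", frame[-2:])
    if binascii.crc_hqx(frame[:-2], 0xFFFF) != crc:
        return None                      # corrupted contents
    return seq, frame[2:-2]
```

A receiver that gets None back would ask for the frame with that sequence number again, which only works if the other side implements the request.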

So flow control __CANNOT_BE_USED_AT_ALL__...

>> It does work fine if I am using two different serial devices: one internal
>> uart (mxs-auart) and an external uart (ttyUSB).
>
> Again, different interrupt and protocols being used for the USB stuff.
>

...and in our case it works better than the internal UART driver on 
the same board. It is a real pity...

> thanks,
>

Thanks to you, mr. greg k-h!

> greg k-h


P.S.: I am a very close friend of Andrea Arcangeli, we grew up in the 
same place, and we went to the same school here in Italy (Imola, Bologna).

We talked about you during the last Christmas holidays, when Andrea came 
to Italy from NY.

Regards,
Gianluca Renzi


* Re: Serial data loss
From: Greg Kroah-Hartman @ 2020-04-07  9:16 UTC
  To: gianluca; +Cc: jslaby, linux-serial, linux-kernel, Gianluca Renzi, dimka, linux

On Tue, Apr 07, 2020 at 11:01:08AM +0200, gianluca wrote:
> On 04/07/2020 10:24 AM, Greg Kroah-Hartman wrote:
> > On Tue, Apr 07, 2020 at 09:30:21AM +0200, gianluca wrote:
> > > I have a BIG trouble having dataloss when using two internal serial ports of
> > > my boards based on NXP/FreeScale iMX28 SoC ARMv5Te ARM920ej-s architecture.
> > > 
> > > It runs at 454Mhz.
> > > 
> > > Kernel used 4.9.x
> > 
> > That's a very old kernel, you are going to have to get support for that
> > from the vendor you bought it from :(
> > 
> 
> We are the vendor. ;-)

Good luck!  :)

> Jokes apart, I can try to use the latest kernel 5.6, and see how is going on
> them, but at the first check the driver seems exactly the same as in kernel
> 4.9.
> 
> > > When using my test case unit software between two serial ports connect each
> > > other by a null modem cable, it fails when the speed rate are different,
> > 
> > Of course, how would that work?
> > 
> 
> I am not native english speaker so I am misleading to a misunderstanding: my
> test case is a software with two pthreads which the main thread is working
> with a differnet baud rate than the other pthread. Using the same software
> in two different machines, and using the same baudrate for each
> corrispondant port it should work.
> 
> i.e. /dev/ttyAPP1 is running at 9600 and /dev/ttyAPP2 is running at 38400
> 
> The same in the other machine. Both ports are null-modem connected:
> 
> 	9600  /dev/ttyAPP1 <----> /dev/ttyAPP1 9600
> 	38400 /dev/ttyAPP2 <----> /dev/ttyAPP2 38400
> 
> I hope to be clear now. ;-)

Ok, yes, that makes more sense now, thank you.

> > > and
> > > dataloss is increasing higher the speed rate.
> > 
> > What type of flow control are you using?
> > 
> 
> Unfortunately no flow control. Because the I cannot use it. When connected
> to the real-hardware those two ports are connected to a microcontroller unit
> which does not have flow control, only RX & TX connected (i.e. no
> RTS/CTS/DTE/DCE lines)

Then you are going to have problems, that is exactly what flow control
was designed for.  To ignore that is to have problems.

Also, there is software flow control when you do not have any control
lines.  This "issue" was solved decades ago :)
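Software flow control here means XON/XOFF, which the tty layer can handle through the termios flags IXON and IXOFF. A minimal sketch follows, assuming the peer honours XON (0x11) and XOFF (0x13) and that the payload never contains those two bytes; the helper names with_xonxoff and enable_xonxoff are hypothetical, while the termios calls and constants are the standard Python ones.

```python
import copy
import termios

IFLAG, CFLAG, CC = 0, 2, 6   # indexes into the list tcgetattr() returns


def with_xonxoff(attrs):
    """Return a copy of a tcgetattr() list with XON/XOFF enabled."""
    new = copy.deepcopy(attrs)
    new[IFLAG] |= termios.IXON | termios.IXOFF   # both directions
    new[IFLAG] &= ~termios.IXANY                 # only XON restarts output
    new[CFLAG] &= ~termios.CRTSCTS               # no RTS/CTS lines wired
    new[CC][termios.VSTART] = b"\x11"            # XON  (Ctrl-Q)
    new[CC][termios.VSTOP] = b"\x13"             # XOFF (Ctrl-S)
    return new


def enable_xonxoff(fd: int) -> None:
    """Apply the settings to an already opened serial port."""
    attrs = termios.tcgetattr(fd)
    termios.tcsetattr(fd, termios.TCSANOW, with_xonxoff(attrs))
```

This only helps if the device at the other end stops sending when it receives XOFF, which a fixed third-party firmware may not do.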

> > > I am talking about the high latency from the IRQ up to the service routine
> > > when flushing the FIFO and another IRQ is called by another uart in the same
> > > time at different speed.
> > > 
> > > The code I was looking is: drivers/tty/serial/mxs-auart.c __but__ all other
> > > serial drivers are acting in the same way: they are reading one character at
> > > time from the FIFO (if it exists) and put it into the circular buffer so
> > > serial/tty driver can pass them to the user read routine.
> > > 
> > > Each function call has some overhead and it is time-consuming, and if
> > > another interrupt is invoked by the same UART Core but from another serial
> > > port (different context) the continuos insertion done by hardware UART into
> > > the FIFO cannot be served fast enough to have an overrun. I think this can
> > > be applied __almost__ to every serial driver as they are written in the same
> > > way.
> > > 
> > > And it is __NOT__ an issue because of the CPU and its speed! Using two
> > > serial converter (FTDI and Prolific PL2303 based) on each board, the problem
> > > does not appear at all even after 24 hours running at more than 115200!!!
> > 
> > usb-serial devices are totally different and send data to the host in a
> > completly different way.
> > 
> > Your hardware might just not be able to handle really high baud rates at
> > a continous stream, what baud rate were you using?
> > 
> 
> I suppose that, but the same issue can be proven with all single core (NO
> FIFO UART) processors using two ports on the same uart core, running Linux
> kernel @ 450 Mhz or less.
> 
> The irq latency it is the same.

Again, usb-serial devices do not use a uart on the host, so they have a
totally different design and code flow.

> > And again, this is what flow control was designed for, please use it.
> > 
> 
> I know and usually I am using a sort of protocol which can check correctness
> of packet, and if not, the packet has to be reasked/resent.
> In this case the microcontroller board I am connected to is not built by us,
> and the software is a custom protocol (and I do not know if an error on
> transfer can be accomplished by another request).
> 
> So the flow control __CANNOT_BE_USED_AT_ALL__...

Then that is a design mistake, please fix that.

good luck!

greg k-h
