* [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
@ 2021-11-04  8:43 Deniz Uğur
  2021-11-08  7:59 ` Philippe Gerum
  0 siblings, 1 reply; 7+ messages in thread
From: Deniz Uğur @ 2021-11-04  8:43 UTC (permalink / raw)
  To: Xenomai

Greetings Xenomai community,

For the past year I have been using EVL to develop some real-time apps utilizing fast data acquisition through SPI interfaces on an RPi 4B. I then send the collected data through the mainline network stack to an amd64 machine running Xenomai 3.1. The data I’m sending is sensitive and cannot afford packet drops; by that I mean that even with UDP I would have to wait for the next successful packet to continue. So in that case, I guess, TCP without Nagle’s algorithm should be enough.

What I want to ask is how one would achieve bare-minimum but real-time communication between two devices over the network. I can convert my other Xenomai client to EVL. Any improvement over the mainline network stack would be enough, I believe.

Thanks for your time,
Deniz.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
  2021-11-04  8:43 [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64 Deniz Uğur
@ 2021-11-08  7:59 ` Philippe Gerum
  2021-11-08  9:21   ` Deniz Uğur
  0 siblings, 1 reply; 7+ messages in thread
From: Philippe Gerum @ 2021-11-08  7:59 UTC (permalink / raw)
  To: Deniz Uğur; +Cc: xenomai


Deniz Uğur via Xenomai <xenomai@xenomai.org> writes:

> Greetings Xenomai community,
>
> For the past year I have been using EVL to develop some real-time apps
> utilizing fast data acquisition through SPI interfaces on RPi 4B. I
> then send the collected data through mainline network stack to a amd64
> machine running Xenomai 3.1. The data I’m sending is sensitive and
> cannot afford packet drops. By that I mean even if I use UDP I would
> have to wait for next successful packet to continue.

Providing a basic sliding window mechanism between UDP/packet peers may
be an option.

> So in that case,
> I guess, TCP without Nagle’s algorithm should be enough.
>

Making TCP real-time-capable would require setting limits on the time spent in recovery and retransmissions by construction; turning off packet coalescence may not be enough. With the current networking stack, this is more of a design issue than an implementation problem.
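[For reference, disabling Nagle's algorithm itself is a one-line socket option. A minimal sketch using the standard POSIX/Linux API (nothing EVL-specific here):]

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket with Nagle's algorithm disabled, so that small
 * writes (e.g. ~100-byte samples) are pushed to the wire immediately
 * instead of being coalesced. Returns the fd, or -1 on error. */
static int make_nodelay_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	int one = 1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}
```

[Note that this only removes sender-side coalescing; it does not bound retransmission or recovery time, which is the design issue discussed above.]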

> What I want to ask is how would one achieve a bare minimum but real-time communication between two devices over network. I can convert my other Xenomai client to EVL. Any improvements over mainline network stack would be enough I believe.
>

As Paul McKenney once put it, "(hard) real-time is the start of a
conversation rather than a complete requirement." What would be the
requirements in terms of expected bandwidth, worst-case latency, jitter,
and packet ordering guarantees?

-- 
Philippe.



* Re: [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
  2021-11-08  7:59 ` Philippe Gerum
@ 2021-11-08  9:21   ` Deniz Uğur
  2021-11-10  9:48     ` Philippe Gerum
  0 siblings, 1 reply; 7+ messages in thread
From: Deniz Uğur @ 2021-11-08  9:21 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

> Providing a basic sliding window mechanism between UDP/packet peers may
> be an option.

How so? Wouldn’t that introduce out-of-order packets?

> As Paul McKenney put it once, "(hard) real-time is the start of a
> conversation rather than a complete requirement.”

I agree; as with all things, any improvement over mainline should be incremental.

> What would be the
> requirements in terms of expected bandwidth, worst case latency, jitter,
> packet ordering guarantee ?

Well, my application reads encoder data and transmits it to a host computer for further calculations. Therefore, packet ordering must be in line with how the data is read. Each packet consists of ~100 bytes and I use 4 clients to send data. Worst-case latency shouldn’t be higher than 0.2-0.3 ms, because that is my sampling time for each client. I have no constraints on jitter, as I’m happy with the jitter I’m currently measuring (which is less than 5%).
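[An aside for scale: these figures imply a modest offered load. A quick worked estimate as a helper function — the ~42-byte Ethernet/IP/UDP framing overhead passed by the caller is an assumption:]

```c
/* Back-of-the-envelope offered load for the setup described in the
 * thread: 'clients' senders, one 'payload'-byte sample each per
 * 'period_s' seconds, plus 'framing' bytes of per-packet overhead. */
static double offered_mbps(int clients, int payload, int framing,
			   double period_s)
{
	double pps = clients / period_s;              /* packets per second */
	return pps * (payload + framing) * 8 / 1e6;   /* on-wire Mbit/s */
}
```

[With 4 clients, 100-byte payloads and a 250 us period this comes out around 18 Mbit/s — well within even a 100 Mbit link, so the constraint here is the latency bound, not throughput.]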

What can you recommend me with these in mind?

Thanks,
Deniz.


* Re: [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
  2021-11-08  9:21   ` Deniz Uğur
@ 2021-11-10  9:48     ` Philippe Gerum
  2021-11-10 13:00       ` Deniz Uğur
  0 siblings, 1 reply; 7+ messages in thread
From: Philippe Gerum @ 2021-11-10  9:48 UTC (permalink / raw)
  To: Deniz Uğur; +Cc: xenomai


Deniz Uğur <deniz343@gmail.com> writes:

>> Providing a basic sliding window mechanism between UDP/packet peers may
>> be an option.
>
> How so? Wouldn’t that introduce out-of-order packets?
>

The recipient could stage the packets forming the current window,
reordering them on the fly based on a sequence number in those
datagrams (assuming you trust the MAC-level checksum), until that
window is deemed complete, at which point the window could be
acknowledged to the peer and released to the application in the same
move. The main issue would be deciding when a window transmission is
complete, which depends on whether the peer is supposed to send data
continuously or not. If not, then this becomes a bit trickier.
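[A minimal sketch of that receive side, assuming 32-bit sequence numbers carried in the datagram payload and a fixed-size window — the names and sizes are illustrative, not an EVL API:]

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WIN 8  /* window size in packets, illustrative */

struct reorder_win {
	uint32_t base;            /* lowest sequence not yet released */
	uint8_t  present[WIN];    /* slot filled? */
	char     slot[WIN][100];  /* staged ~100-byte payloads */
};

/* Stage one datagram by sequence number; drop anything outside
 * the current window or oversized. */
static void win_stage(struct reorder_win *w, uint32_t seq,
		      const char *data, size_t len)
{
	if (seq - w->base >= WIN || len > sizeof(w->slot[0]))
		return; /* stale, too far ahead, or oversized: drop */
	unsigned i = seq % WIN;
	memcpy(w->slot[i], data, len);
	w->present[i] = 1;
}

/* Release the next in-order payload to the application, if staged.
 * Returns 1 and copies it out, or 0 if there is a gap at 'base'. */
static int win_release(struct reorder_win *w, char *out)
{
	unsigned i = w->base % WIN;
	if (!w->present[i])
		return 0; /* head-of-line gap: wait, or request a resend */
	memcpy(out, w->slot[i], sizeof(w->slot[i]));
	w->present[i] = 0;
	w->base++;     /* once a full window drains, ack it to the peer */
	return 1;
}
```

[Deciding when to give up on a head-of-line gap — the timeout policy — is exactly the hard part mentioned above, and is left out of this sketch.]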

>> As Paul McKenney put it once, "(hard) real-time is the start of a
>> conversation rather than a complete requirement.”
>
> I agree, as with all things any improvement over mainline should be incremental.
>
>> What would be the
>> requirements in terms of expected bandwidth, worst case latency, jitter,
>> packet ordering guarantee ?
>
> Well, my application reads encoder data and transmits it to a host
> computer for further calculations. Therefore, packet ordering must be
> inline with how the data is read. Each packet consists of ~100 bytes
> and I use 4 clients to send data. Worst case latency shouldn’t be
> higher than 0.2-0.3 ms, because that is my sampling time for each
> client. I have no constraints on jitter, as I’m happy with the jitter
> I’m currently measuring (which is less than 5%).
>
> What can you recommend me with these in mind?
>

200-300 microseconds worst-case latency is demanding; this is in the
same ballpark as the figures I obtained with RTnet/Xenomai 3.1 between
two mid-range SoCs (x86_64 and i.MX6Q) attached to a dedicated switch,
over a 100Mbit link.

I don't think the common Linux stack is an option with such a
requirement under load, especially since you would have to channel the
network traffic from your real-time task(s) to the regular stack via an
EVL proxy element (or anything more complex that would fit) before it
can reach the wire.

Going for a kernel bypass on the rpi4 by coupling DPDK and EVL would at
the very least require a DPDK-enabled GENET driver, which does not seem
to exist.

Another option would be to check whether you could work directly at
packet level for now on the rpi4, based on the EVL networking
layer. This is experimental WIP, but it is readily capable of
exchanging raw packets between peers. See [1].

[1]
https://source.denx.de/Xenomai/xenomai4/libevl/-/blob/master/tidbits/oob-net-icmp.c
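[To make the shape of that concrete: working at packet level means building Ethernet frames yourself. A hedged sketch of framing one sample — plain memory layout only; the EVL-specific socket setup is left as a comment, since that API is experimental, and the EtherType value is an assumption from the IEEE local-experimental range:]

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* On-wire layout for one sample, built by hand since the EVL packet
 * layer deals in raw frames: 14-byte Ethernet header, then a 4-byte
 * big-endian sequence number, then the ~100-byte payload. */
#define ETH_HLEN   14
#define SAMPLE_LEN 100
#define PROTO      0x88B5 /* IEEE 802 local experimental EtherType, assumed */

static size_t build_frame(uint8_t *buf, const uint8_t dst[6],
			  const uint8_t src[6], uint32_t seq,
			  const uint8_t *sample)
{
	memcpy(buf, dst, 6);
	memcpy(buf + 6, src, 6);
	buf[12] = PROTO >> 8;
	buf[13] = PROTO & 0xff;
	buf[ETH_HLEN + 0] = seq >> 24;
	buf[ETH_HLEN + 1] = seq >> 16;
	buf[ETH_HLEN + 2] = seq >> 8;
	buf[ETH_HLEN + 3] = seq;
	memcpy(buf + ETH_HLEN + 4, sample, SAMPLE_LEN);
	/* The frame would then go out through the EVL packet socket,
	 * opened along the lines of the oob-net-icmp.c tidbit referenced
	 * in [1] -- the exact socket flags are WIP, so not shown here. */
	return ETH_HLEN + 4 + SAMPLE_LEN;
}

/* Recover the sequence number from a received frame. */
static uint32_t frame_seq(const uint8_t *buf)
{
	return (uint32_t)buf[ETH_HLEN] << 24 |
	       (uint32_t)buf[ETH_HLEN + 1] << 16 |
	       (uint32_t)buf[ETH_HLEN + 2] << 8 |
	       (uint32_t)buf[ETH_HLEN + 3];
}
```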

-- 
Philippe.



* Re: [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
  2021-11-10  9:48     ` Philippe Gerum
@ 2021-11-10 13:00       ` Deniz Uğur
  2021-11-10 13:57         ` Philippe Gerum
  0 siblings, 1 reply; 7+ messages in thread
From: Deniz Uğur @ 2021-11-10 13:00 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

> 200-300 microseconds latency worst case is demanding, this is in the
> same ballpark than the figures I obtained with RTnet/Xenomai3.1 between
> two mid-range SoCs (x86_64 and i.MX6Q) attached to a dedicated switch,
> over a 100Mbit link.

I assume this is only the communication latency and nothing else. Since I also
have to read SPI, which takes 50-60 us, these measurements would be higher, I guess.

> Going for a kernel bypass on the rpi4 by coupling DPDK and EVL would at
> the very least require a DPDK-enabled GENET driver, which does not seem
> to exist.

That would have been splendid, to be honest. I didn’t know about DPDK until
now, but the premise sounds quite good.

> Another option would be to check whether you could work directly at
> packet level for now on the rpi4, based on the EVL networking
> layer. This is experimental WIP, but this is readily capable of
> exchanging raw packets between peers. See [1].

If I understand correctly, EVL’s networking layer doesn’t support TCP/UDP
at the moment and I would have to implement the sliding window myself
with raw packets. Correct?

—

To sum up, we don’t have many options for kernel bypassing on the RPi 4.

> I don't think the common linux stack is an option with such requirement
> under load, especially since you would have to channel the network
> traffic


Along with that, EVL’s net stack would not bring any improvement for this kind of load.
Considering these, 200-300 us is demanding, but it’s as high as it gets at the moment.



* Re: [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
  2021-11-10 13:00       ` Deniz Uğur
@ 2021-11-10 13:57         ` Philippe Gerum
  2021-11-10 16:05           ` Deniz Uğur
  0 siblings, 1 reply; 7+ messages in thread
From: Philippe Gerum @ 2021-11-10 13:57 UTC (permalink / raw)
  To: Deniz Uğur; +Cc: xenomai


Deniz Uğur <deniz343@gmail.com> writes:

>  200-300 microseconds latency worst case is demanding, this is in the
>  same ballpark than the figures I obtained with RTnet/Xenomai3.1 between
>  two mid-range SoCs (x86_64 and i.MX6Q) attached to a dedicated switch,
>  over a 100Mbit link.
>
> I assume this is only the communication latency and nothing else. As I have to 
> read SPI which takes 50-60 us, this measurements would be higher I
> guess.

Correct.

>
>  Going for a kernel bypass on the rpi4 by coupling DPDK and EVL would at
>  the very least require a DPDK-enabled GENET driver, which does not seem
>  to exist.
>
> That would’ve been splendid to be honest. I didn’t knew DPDK till now but
> the premise sounds quite good.
>

I'm not a DPDK expert, but there are folks on this list with significant
knowledge about DPDK who might want to discuss this.

>  Another option would be to check whether you could work directly at
>  packet level for now on the rpi4, based on the EVL networking
>  layer. This is experimental WIP, but this is readily capable of
>  exchanging raw packets between peers. See [1].
>
> If I understand correctly, EVL’s networking layer doesn’t support TCP/UDP
> at the moment and I would have to implement the sliding window myself
> with raw packets. Correct?
>

Correct. In userland, which would make things easier.

> —
>
> To sum up, we don’t have many options considering RPi 4 with kernel bypassing.
>
>  I don't think the common linux stack is an option with such requirement
>  under load, especially since you would have to channel the network
>  traffic
>
> Along with that, EVL’s net stack not creating any improvement for this
> kind of load.

Mm, not sure. Even in the case where EVL reuses the regular NIC drivers,
I can already see latency improvements here (pi2 and bbb -> x86 SoC),
because the traffic is injected directly into the driver from the TX
softirq context, bypassing the delays induced by the regular net
stack. Once the full, end-to-end oob chain between the app and the wire,
including the driver, is enabled, we should get interesting figures
(bypassing the softirq context entirely). Forward-looking statement, I
agree. Working on it.

> Considering these, 200-300 us is demanding but it’s as high as it gets at the moment.
>

-- 
Philippe.



* Re: [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64
  2021-11-10 13:57         ` Philippe Gerum
@ 2021-11-10 16:05           ` Deniz Uğur
  0 siblings, 0 replies; 7+ messages in thread
From: Deniz Uğur @ 2021-11-10 16:05 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

> Once the full, end-to-end oob chain between the app and the wire including
> the driver is enabled, we should get interesting figures (bypassing the
> softirq context entirely). Forward looking statement, I agree. Working
> on it.

Looking forward to this. If I can be of any help with this, I’d be glad to do it.

Thanks for the recommendations, by the way. I will try to make my current setup
a little bit better with what’s available to me via EVL.




end of thread, other threads:[~2021-11-10 16:05 UTC | newest]

Thread overview: 7+ messages
2021-11-04  8:43 [EVL][dovetail] Minimum requirements for network communication for RPi 4B and amd64 Deniz Uğur
2021-11-08  7:59 ` Philippe Gerum
2021-11-08  9:21   ` Deniz Uğur
2021-11-10  9:48     ` Philippe Gerum
2021-11-10 13:00       ` Deniz Uğur
2021-11-10 13:57         ` Philippe Gerum
2021-11-10 16:05           ` Deniz Uğur
