* Multicast packet reordering
@ 2022-12-02  4:45 Etienne Champetier
  2022-12-02 13:46 ` Andrew Lunn
  2022-12-02 18:34 ` Jakub Kicinski
From: Etienne Champetier @ 2022-12-02  4:45 UTC (permalink / raw)
  To: netdev

Hello all,

I'm investigating random multicast packet reordering between 2 containers
even under moderate traffic (16 multicast video streams, ~80 Mbps total) on Alma 8.

To simplify testing, I reproduced the issue using iperf2, and then reproduced it on lo.
I can reproduce multicast packet reordering on lo on Alma 8, Alma 9, and Fedora 37, but not on CentOS 7.
As Fedora 37 is running kernel 6.0.7-301.fc37.x86_64, I'm reporting it here.

Using RPS fixes the issue, but to keep it short:
- Is multicast packet reordering expected when just tuning buffer sizes?
- Does it make sense to use RPS to fix this issue, or is there anything else / better?
- In the case of 2 containers talking over veth + bridge, is it better to keep 1 queue
and set rps_cpus to all CPUs, or some more complex tuning like 1 queue per CPU + RPS on 1 CPU only?

The details:
On a Dell R7515 / AMD EPYC 7702P 64-Core Processor (128 threads) / 1 NUMA node

For each OS I'm doing 3 tests:
1) initial tuning
tuned-adm profile network-throughput
2) increase buffers
sysctl -f - <<'EOF'
net.core.netdev_max_backlog=250000
net.core.rmem_default=33554432
net.core.rmem_max=33554432
net.core.wmem_default=4194304
net.core.wmem_max=4194304
EOF
3) Enable RPS
echo ffffffff,ffffffff,ffffffff,ffffffff > /sys/class/net/lo/queues/rx-0/rps_cpus
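
For 128 CPU threads the mask is 4 x 32-bit words of all f's (one bit per CPU). For the veth + bridge
container case, the same per-queue knob should exist on each veth endpoint; a minimal sketch, assuming
a hypothetical endpoint named veth0:
# hypothetical veth endpoint name; the path mirrors the lo example above
echo ffffffff,ffffffff,ffffffff,ffffffff > /sys/class/net/veth0/queues/rx-0/rps_cpus
# read the file back to check the mask was accepted
cat /sys/class/net/veth0/queues/rx-0/rps_cpus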

I start the servers and the client:
for i in {1..10}; do
   iperf -s -u -B 239.255.255.$i%lo -i 1 &
done
iperf -c 239.255.255.1 -B 127.0.0.1 -u -i 1 -b 2G -l 1316 -P 10 --incr-dstip -t0
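
One way to spot the reordering is the UDP server-side interval reports; assuming iperf2's
"received out-of-order" wording, something like this should work to watch a single group:
# keep only the out-of-order counters from one server's output (the wording is an assumption)
iperf -s -u -B 239.255.255.1%lo -i 1 2>&1 | grep --line-buffered -i 'out-of-order'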

On Fedora 37 and Alma 8 & 9:
test 1: I get drops and reordering
test 2: no drops but reordering
test 3: clean

On CentOS 7 I don't reach 2G, but I never get reordering: I get drops at test 1,
everything is clean at test 2, and drops come back when enabling RPS in test 3.

Best
Etienne



* Re: Multicast packet reordering
  2022-12-02  4:45 Multicast packet reordering Etienne Champetier
@ 2022-12-02 13:46 ` Andrew Lunn
  2022-12-02 15:32   ` Etienne Champetier
  2022-12-02 18:34 ` Jakub Kicinski
From: Andrew Lunn @ 2022-12-02 13:46 UTC (permalink / raw)
  To: Etienne Champetier; +Cc: netdev

On Thu, Dec 01, 2022 at 11:45:53PM -0500, Etienne Champetier wrote:
> Hello all,
> 
> I'm investigating random multicast packet reordering between 2 containers
> even under moderate traffic (16 multicast video streams, ~80 Mbps total) on Alma 8.

Have you tried plain unicast UDP?

There is nothing in the UDP standard which says UDP has to arrive in
order. Your application needs to handle reordering. So your time might
be better spent optimizing your application for when it happens.

	Andrew


* Re: Multicast packet reordering
  2022-12-02 13:46 ` Andrew Lunn
@ 2022-12-02 15:32   ` Etienne Champetier
From: Etienne Champetier @ 2022-12-02 15:32 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev


On 02/12/2022 08:46, Andrew Lunn wrote:
> On Thu, Dec 01, 2022 at 11:45:53PM -0500, Etienne Champetier wrote:
>> Hello all,
>>
>> I'm investigating random multicast packet reordering between 2 containers
>> even under moderate traffic (16 multicast video streams, ~80 Mbps total) on Alma 8.
> Have you tried plain unicast UDP?

I just did, on Fedora 37, and got the same results: if I don't enable RPS I get a bit of reordering from time to time:
for i in {1..10}; do
   iperf -s -u --port $((5000+i)) -i 1 &
done
iperf -c 127.0.0.1 -u -i 1 -b 2G -l 1316 -P 10 --incr-dstport -t0

> There is nothing in the UDP standard which says UDP has to arrive in
> order. Your application needs to handle reordering. So your time might
> be better spent optimizing your application for when it happens.

I'm a big believer in fixing things where they are broken, but it's not always the easiest path (or even possible).

I'm in the video industry, working on an "appliance" that hosts multiple applications, each as a separate container.
Some applications are from our R&D, some from third parties. The default protocol that everyone supports
for passing video around is MPEG-TS over UDP multicast, and this requires a reliable network (no drops, no reordering).
A good number of those applications support RTP, which has a reorder buffer and optionally FEC,
but sadly not all of them, and having third parties implement new features can take years.

When running all applications on separate servers, reordering has never been an issue;
i.e. a physical NIC and switch seem to do a better job of keeping packets in order than a virtual interface.

I understand if we trade strict ordering for performance, but is that the case?
I'm fine with enabling RPS and calling it a day; I was mostly looking for comments on whether this is expected Linux behavior.

Etienne

> 	Andrew


* Re: Multicast packet reordering
  2022-12-02  4:45 Multicast packet reordering Etienne Champetier
  2022-12-02 13:46 ` Andrew Lunn
@ 2022-12-02 18:34 ` Jakub Kicinski
  2022-12-02 20:09   ` Etienne Champetier
From: Jakub Kicinski @ 2022-12-02 18:34 UTC (permalink / raw)
  To: Etienne Champetier; +Cc: netdev

On Thu, 1 Dec 2022 23:45:53 -0500 Etienne Champetier wrote:
> Using RPS fixes the issue, but to keep it short:
> - Is multicast packet reordering expected when just tuning buffer sizes?
> - Does it make sense to use RPS to fix this issue, or is there anything else / better?
> - In the case of 2 containers talking over veth + bridge, is it better to keep 1 queue
> and set rps_cpus to all CPUs, or some more complex tuning like 1 queue per CPU + RPS on 1 CPU only?

Yes, there are per-cpu queues in various places to help scaling;
if you don't pin the sender to one CPU and it gets moved, you can
understandably get reordering w/ UDP (both on lo and veth).
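
One quick way to check that is to pin the sender, e.g. (the CPU number is arbitrary):
# pin all iperf client threads to CPU 2 so packets are always enqueued from the same CPU
taskset -c 2 iperf -c 239.255.255.1 -B 127.0.0.1 -u -i 1 -b 2G -l 1316 -P 10 --incr-dstip -t0
If the per-cpu queue explanation holds, this should avoid the reordering even without RPS.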

As Andrew said that's considered acceptable.
Unfortunately it's one of those cases where we need to relax 
the requirements / stray from the ideal world if we want parallel
processing to not suck..


* Re: Multicast packet reordering
  2022-12-02 18:34 ` Jakub Kicinski
@ 2022-12-02 20:09   ` Etienne Champetier
  2022-12-02 21:39     ` Jakub Kicinski
From: Etienne Champetier @ 2022-12-02 20:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev

On Fri, Dec 2, 2022 at 13:34, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 1 Dec 2022 23:45:53 -0500 Etienne Champetier wrote:
> > Using RPS fixes the issue, but to keep it short:
> > - Is multicast packet reordering expected when just tuning buffer sizes?
> > - Does it make sense to use RPS to fix this issue, or is there anything else / better?
> > - In the case of 2 containers talking over veth + bridge, is it better to keep 1 queue
> > and set rps_cpus to all CPUs, or some more complex tuning like 1 queue per CPU + RPS on 1 CPU only?
>
> Yes, there are per-cpu queues in various places to help scaling;
> if you don't pin the sender to one CPU and it gets moved, you can
> understandably get reordering w/ UDP (both on lo and veth).

Is enabling RPS a workaround that will continue to work in the long term,
or does it just fix this reordering "by accident"?

And I guess pinning the sender to one CPU is also important when
sending via a real NIC, not only when moving packets internally?

> As Andrew said that's considered acceptable.
> Unfortunately it's one of those cases where we need to relax
> the requirements / stray from the ideal world if we want parallel
> processing to not suck..


* Re: Multicast packet reordering
  2022-12-02 20:09   ` Etienne Champetier
@ 2022-12-02 21:39     ` Jakub Kicinski
From: Jakub Kicinski @ 2022-12-02 21:39 UTC (permalink / raw)
  To: Etienne Champetier; +Cc: netdev

On Fri, 2 Dec 2022 15:09:13 -0500 Etienne Champetier wrote:
> > Yes, there are per-cpu queues in various places to help scaling;
> > if you don't pin the sender to one CPU and it gets moved, you can
> > understandably get reordering w/ UDP (both on lo and veth).
> 
> Is enabling RPS a workaround that will continue to work in the long term,
> or does it just fix this reordering "by accident"?

For lo and veth it should continue to work.

> And I guess pinning the sender to one CPU is also important when
> sending via a real NIC, not only when moving packets internally?

Yes, for UDP with real NICs you may want to try to pin a flow 
to a particular NIC queue.. somehow. Not sure how much of a problem
this is in practice.
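
One TX-side knob that may help there is XPS, pairing the pinned sender CPU with a single
TX queue; a rough sketch, assuming eth0, CPU 2 and tx queue 0 (all arbitrary choices):
# map CPU 2 (bitmask 0x4) to tx-0, then keep the sending process on that CPU
echo 4 > /sys/class/net/eth0/queues/tx-0/xps_cpus
taskset -c 2 iperf -c <receiver> -u -b 2G -l 1316 -t0    # <receiver> left as a placeholder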

