Linux kernel regressions
 help / color / Atom feed
* [regression] UDP recv data corruption
@ 2021-07-01 10:47 Matthias Treydte
  2021-07-01 15:39 ` David Ahern
  0 siblings, 1 reply; 11+ messages in thread
From: Matthias Treydte @ 2021-07-01 10:47 UTC (permalink / raw)
  To: stable; +Cc: netdev, regressions, davem, yoshfuji, dsahern

Hello,

we recently upgraded the Linux kernel from 5.11.21 to 5.12.12 in our  
video stream receiver appliance and noticed compression artifacts on  
video streams that were previously looking fine. We are receiving UDP  
multicast MPEG TS streams through an FFMpeg / libav layer which does  
the socket and lower level protocol handling. For affected kernels it  
spills the log with messages like

> [mpegts @ 0x7fa130000900] Packet corrupt (stream = 0, dts = 6870802195).
> [mpegts @ 0x7fa11c000900] Packet corrupt (stream = 0, dts = 6870821068).

Bisecting identified commit 18f25dc399901426dff61e676ba603ff52c666f7  
as the one introducing the problem in the mainline kernel. It was  
backported to the 5.12 series in  
450687386cd16d081b58cd7a342acff370a96078. Some random observations  
which may help to understand what's going on:

    * the problem exists in Linux 5.13
    * reverting that commit on top of 5.13 makes the problem go away
    * Linux 5.10.45 is fine
    * no relevant output in dmesg
    * can be reproduced on different hardware (Intel, AMD, different NICs, ...)
    * we do use the bonding driver on the systems (but I did not yet  
verify that this is related)
    * we do not use vxlan (mentioned in the commit message)
    * the relevant code in FFMpeg identifying packet corruption is here:
      https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/mpegts.c#L2758

And the bonding configuration:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v5.10.45

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp2s0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp2s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 80:ee:73:XX:XX:XX
Slave queue ID: 0

Slave Interface: enp3s0
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 80:ee:73:XX:XX:XX
Slave queue ID: 0


If there is anything else I can do to help tracking this down please  
let me know.


Regards,
-Matthias Treydte



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-01 10:47 [regression] UDP recv data corruption Matthias Treydte
@ 2021-07-01 15:39 ` David Ahern
  2021-07-02  0:31   ` Willem de Bruijn
  0 siblings, 1 reply; 11+ messages in thread
From: David Ahern @ 2021-07-01 15:39 UTC (permalink / raw)
  To: Matthias Treydte, stable, Paolo Abeni
  Cc: netdev, regressions, davem, yoshfuji, dsahern

[ adding Paolo, author of 18f25dc39990 ]

On 7/1/21 4:47 AM, Matthias Treydte wrote:
> Hello,
> 
> we recently upgraded the Linux kernel from 5.11.21 to 5.12.12 in our
> video stream receiver appliance and noticed compression artifacts on
> video streams that were previously looking fine. We are receiving UDP
> multicast MPEG TS streams through an FFMpeg / libav layer which does the
> socket and lower level protocol handling. For affected kernels it spills
> the log with messages like
> 
>> [mpegts @ 0x7fa130000900] Packet corrupt (stream = 0, dts = 6870802195).
>> [mpegts @ 0x7fa11c000900] Packet corrupt (stream = 0, dts = 6870821068).
> 
> Bisecting identified commit 18f25dc399901426dff61e676ba603ff52c666f7 as
> the one introducing the problem in the mainline kernel. It was
> backported to the 5.12 series in
> 450687386cd16d081b58cd7a342acff370a96078. Some random observations which
> may help to understand what's going on:
> 
>    * the problem exists in Linux 5.13
>    * reverting that commit on top of 5.13 makes the problem go away
>    * Linux 5.10.45 is fine
>    * no relevant output in dmesg
>    * can be reproduced on different hardware (Intel, AMD, different
> NICs, ...)
>    * we do use the bonding driver on the systems (but I did not yet
> verify that this is related)
>    * we do not use vxlan (mentioned in the commit message)
>    * the relevant code in FFMpeg identifying packet corruption is here:
>     
> https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/mpegts.c#L2758
> 
> And the bonding configuration:
> 
> # cat /proc/net/bonding/bond0
> Ethernet Channel Bonding Driver: v5.10.45
> 
> Bonding Mode: fault-tolerance (active-backup)
> Primary Slave: None
> Currently Active Slave: enp2s0
> MII Status: up
> MII Polling Interval (ms): 100
> Up Delay (ms): 0
> Down Delay (ms): 0
> Peer Notification Delay (ms): 0
> 
> Slave Interface: enp2s0
> MII Status: up
> Speed: 1000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: 80:ee:73:XX:XX:XX
> Slave queue ID: 0
> 
> Slave Interface: enp3s0
> MII Status: down
> Speed: Unknown
> Duplex: Unknown
> Link Failure Count: 0
> Permanent HW addr: 80:ee:73:XX:XX:XX
> Slave queue ID: 0
> 
> 
> If there is anything else I can do to help tracking this down please let
> me know.
> 
> 
> Regards,
> -Matthias Treydte
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-01 15:39 ` David Ahern
@ 2021-07-02  0:31   ` Willem de Bruijn
  2021-07-02 11:42     ` Paolo Abeni
  2021-07-02 12:36     ` Matthias Treydte
  0 siblings, 2 replies; 11+ messages in thread
From: Willem de Bruijn @ 2021-07-02  0:31 UTC (permalink / raw)
  To: David Ahern
  Cc: Matthias Treydte, stable, Paolo Abeni, netdev, regressions,
	davem, yoshfuji, dsahern

On Thu, Jul 1, 2021 at 11:39 AM David Ahern <dsahern@gmail.com> wrote:
>
> [ adding Paolo, author of 18f25dc39990 ]
>
> On 7/1/21 4:47 AM, Matthias Treydte wrote:
> > Hello,
> >
> > we recently upgraded the Linux kernel from 5.11.21 to 5.12.12 in our
> > video stream receiver appliance and noticed compression artifacts on
> > video streams that were previously looking fine. We are receiving UDP
> > multicast MPEG TS streams through an FFMpeg / libav layer which does the
> > socket and lower level protocol handling. For affected kernels it spills
> > the log with messages like
> >
> >> [mpegts @ 0x7fa130000900] Packet corrupt (stream = 0, dts = 6870802195).
> >> [mpegts @ 0x7fa11c000900] Packet corrupt (stream = 0, dts = 6870821068).
> >
> > Bisecting identified commit 18f25dc399901426dff61e676ba603ff52c666f7 as
> > the one introducing the problem in the mainline kernel. It was
> > backported to the 5.12 series in
> > 450687386cd16d081b58cd7a342acff370a96078. Some random observations which
> > may help to understand what's going on:
> >
> >    * the problem exists in Linux 5.13
> >    * reverting that commit on top of 5.13 makes the problem go away
> >    * Linux 5.10.45 is fine
> >    * no relevant output in dmesg
> >    * can be reproduced on different hardware (Intel, AMD, different
> > NICs, ...)
> >    * we do use the bonding driver on the systems (but I did not yet
> > verify that this is related)
> >    * we do not use vxlan (mentioned in the commit message)
> >    * the relevant code in FFMpeg identifying packet corruption is here:
> >
> > https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/mpegts.c#L2758
> >
> > And the bonding configuration:
> >
> > # cat /proc/net/bonding/bond0
> > Ethernet Channel Bonding Driver: v5.10.45
> >
> > Bonding Mode: fault-tolerance (active-backup)
> > Primary Slave: None
> > Currently Active Slave: enp2s0
> > MII Status: up
> > MII Polling Interval (ms): 100
> > Up Delay (ms): 0
> > Down Delay (ms): 0
> > Peer Notification Delay (ms): 0
> >
> > Slave Interface: enp2s0
> > MII Status: up
> > Speed: 1000 Mbps
> > Duplex: full
> > Link Failure Count: 0
> > Permanent HW addr: 80:ee:73:XX:XX:XX
> > Slave queue ID: 0
> >
> > Slave Interface: enp3s0
> > MII Status: down
> > Speed: Unknown
> > Duplex: Unknown
> > Link Failure Count: 0
> > Permanent HW addr: 80:ee:73:XX:XX:XX
> > Slave queue ID: 0
> >
> >
> > If there is anything else I can do to help tracking this down please let
> > me know.

That library does not enable UDP_GRO. You do not have any UDP based
tunnel devices (besides vxlan) configured, either, right?

Then no socket lookup should take place, so sk is NULL.

It is also unlikely that the device has either of NETIF_F_GRO_FRAGLIST
or NETIF_F_GRO_UDP_FWD configured. This can be checked with `ethtool
-K $DEV`, shown as "rx-gro-list" and "rx-udp-gro-forwarding",
respectively.

Then udp_gro_receive_segment is not called.

So this should just return the packet without applying any GRO.

I'm referring to this block of code in udp_gro_receive:

        if (!sk || !udp_sk(sk)->gro_receive) {
                if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
                        NAPI_GRO_CB(skb)->is_flist = sk ?
!udp_sk(sk)->gro_enabled : 1;

                if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
                    (sk && udp_sk(sk)->gro_enabled) ||
NAPI_GRO_CB(skb)->is_flist)
                        pp = call_gro_receive(udp_gro_receive_segment,
head, skb);
                return pp;
        }

I don't see what could be up.

One possible short-term workaround is to disable GRO. If this commit
is implicated, that should fix it. At some obvious possible cycle
cost.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02  0:31   ` Willem de Bruijn
@ 2021-07-02 11:42     ` Paolo Abeni
  2021-07-02 14:31       ` Matthias Treydte
  2021-07-02 12:36     ` Matthias Treydte
  1 sibling, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2021-07-02 11:42 UTC (permalink / raw)
  To: Matthias Treydte
  Cc: stable, netdev, regressions, davem, yoshfuji, dsahern,
	Willem de Bruijn, David Ahern

Hello,

On Thu, 2021-07-01 at 20:31 -0400, Willem de Bruijn wrote:
> On Thu, Jul 1, 2021 at 11:39 AM David Ahern <dsahern@gmail.com> wrote:
> > [ adding Paolo, author of 18f25dc39990 ]
> > 
> > On 7/1/21 4:47 AM, Matthias Treydte wrote:
> > > Hello,
> > > 
> > > we recently upgraded the Linux kernel from 5.11.21 to 5.12.12 in our
> > > video stream receiver appliance and noticed compression artifacts on
> > > video streams that were previously looking fine. We are receiving UDP
> > > multicast MPEG TS streams through an FFMpeg / libav layer which does the
> > > socket and lower level protocol handling. For affected kernels it spills
> > > the log with messages like
> > > 
> > > > [mpegts @ 0x7fa130000900] Packet corrupt (stream = 0, dts = 6870802195).
> > > > [mpegts @ 0x7fa11c000900] Packet corrupt (stream = 0, dts = 6870821068).
> > > 
> > > Bisecting identified commit 18f25dc399901426dff61e676ba603ff52c666f7 as
> > > the one introducing the problem in the mainline kernel. It was
> > > backported to the 5.12 series in
> > > 450687386cd16d081b58cd7a342acff370a96078. Some random observations which
> > > may help to understand what's going on:
> > > 
> > >    * the problem exists in Linux 5.13
> > >    * reverting that commit on top of 5.13 makes the problem go away
> > >    * Linux 5.10.45 is fine
> > >    * no relevant output in dmesg
> > >    * can be reproduced on different hardware (Intel, AMD, different
> > > NICs, ...)
> > >    * we do use the bonding driver on the systems (but I did not yet
> > > verify that this is related)
> > >    * we do not use vxlan (mentioned in the commit message)
> > >    * the relevant code in FFMpeg identifying packet corruption is here:
> > > 
> > > https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/mpegts.c#L2758
> > > 
> > > And the bonding configuration:
> > > 
> > > # cat /proc/net/bonding/bond0
> > > Ethernet Channel Bonding Driver: v5.10.45
> > > 
> > > Bonding Mode: fault-tolerance (active-backup)
> > > Primary Slave: None
> > > Currently Active Slave: enp2s0
> > > MII Status: up
> > > MII Polling Interval (ms): 100
> > > Up Delay (ms): 0
> > > Down Delay (ms): 0
> > > Peer Notification Delay (ms): 0
> > > 
> > > Slave Interface: enp2s0
> > > MII Status: up
> > > Speed: 1000 Mbps
> > > Duplex: full
> > > Link Failure Count: 0
> > > Permanent HW addr: 80:ee:73:XX:XX:XX
> > > Slave queue ID: 0
> > > 
> > > Slave Interface: enp3s0
> > > MII Status: down
> > > Speed: Unknown
> > > Duplex: Unknown
> > > Link Failure Count: 0
> > > Permanent HW addr: 80:ee:73:XX:XX:XX
> > > Slave queue ID: 0
> > > 
> > > 
> > > If there is anything else I can do to help tracking this down please let
> > > me know.

Thank you for the report!

According to the above I would wild guess that the GRO layer is
wrongly/unexpectly aggregating some UDP packets, but I do agree with
Willem, it looks like the code pointed by the bisecting should not do
anything evil here, but I'm likely missing some items.

Could you please:
- tell how frequent is the pkt corruption, even a rough estimate of the
frequency.

- provide the features exposed by the relevant devices: 
ethtool -k <nic name>

- check, if possibly, how exactly the pkts are corrupted. Wrong size?
bad csum? what else?

- ideally a short pcap trace comprising the problematic packets would
be great!

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02  0:31   ` Willem de Bruijn
  2021-07-02 11:42     ` Paolo Abeni
@ 2021-07-02 12:36     ` Matthias Treydte
  2021-07-02 14:06       ` Paolo Abeni
  1 sibling, 1 reply; 11+ messages in thread
From: Matthias Treydte @ 2021-07-02 12:36 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Ahern, stable, Paolo Abeni, netdev, regressions, davem,
	yoshfuji, dsahern


Quoting Willem de Bruijn <willemdebruijn.kernel@gmail.com>:

> That library does not enable UDP_GRO. You do not have any UDP based
> tunnel devices (besides vxlan) configured, either, right?

The configuration is really minimal by now, I also took the bonding
out of the equation. We have systemd configure "en*" with mDNS and DHCP
enabled and that's it. The problem remains.

I also found new hardware on my desk today (some Intel SoC), showing
exactly the same symptoms. So it's really nothing to do with the
hardware.

> It is also unlikely that the device has either of NETIF_F_GRO_FRAGLIST
> or NETIF_F_GRO_UDP_FWD configured. This can be checked with `ethtool
> -K $DEV`, shown as "rx-gro-list" and "rx-udp-gro-forwarding",
> respectively.

The full output of "ethtool -k enp5s0" from that SoC:

Features for enp5s0:
rx-checksumming: on
tx-checksumming: on
         tx-checksum-ipv4: off [fixed]
         tx-checksum-ip-generic: on
         tx-checksum-ipv6: off [fixed]
         tx-checksum-fcoe-crc: off [fixed]
         tx-checksum-sctp: on
scatter-gather: on
         tx-scatter-gather: on
         tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
         tx-tcp-segmentation: on
         tx-tcp-ecn-segmentation: off [fixed]
         tx-tcp-mangleid-segmentation: off
         tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: on
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]

That's the only NIC on this board:

# ip l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN  
mode DEFAULT group default qlen 1000
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state  
UP mode DEFAULT group default qlen 1000
     link/ether 00:30:d6:24:99:67 brd ff:ff:ff:ff:ff:ff

> One possible short-term workaround is to disable GRO.

Indeed, "ethtool -K enp5s0 gro off" fixes the problem, and calling it with
"gro on" brings it back.


And to answer Paolo's questions from his mail to the list (@Paolo: I'm  
not subscribed,
please also send to me directly so I don't miss your mail)

> Could you please:
> - tell how frequent is the pkt corruption, even a rough estimate of the
> frequency.

# journalctl --since "5min ago" | grep "Packet corrupt" | wc -l
167

So there are 167 detected failures in 5 minutes, while the system is receiving
at a moderate rate of about 900 pkts/s (according to Prometheus' node exporter
at least, but seems about right)

Next I'll try to capture some broken packets and reply in a separate mail,
I'll have to figure out a good way to do this first.


Thanks for your help,
-Matthias




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02 12:36     ` Matthias Treydte
@ 2021-07-02 14:06       ` Paolo Abeni
  2021-07-02 14:21         ` Paolo Abeni
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2021-07-02 14:06 UTC (permalink / raw)
  To: Matthias Treydte, Willem de Bruijn
  Cc: David Ahern, stable, netdev, regressions, davem, yoshfuji, dsahern

Hello,

On Fri, 2021-07-02 at 14:36 +0200, Matthias Treydte wrote:
> And to answer Paolo's questions from his mail to the list (@Paolo: I'm  
> not subscribed, please also send to me directly so I don't miss your mail)

(yup, that is what I did ?!?)

> > Could you please:
> > - tell how frequent is the pkt corruption, even a rough estimate of the
> > frequency.
> 
> # journalctl --since "5min ago" | grep "Packet corrupt" | wc -l
> 167
> 
> So there are 167 detected failures in 5 minutes, while the system is receiving
> at a moderate rate of about 900 pkts/s (according to Prometheus' node exporter
> at least, but seems about right)

Intersting. The relevant UDP GRO features are already off, and this
happens infrequently. Something is happening on a per packet basis, I
can't guess what.

It looks like you should be able to collect more info WRT the packet
corruption enabling debug logging at ffmpeg level, but I guess that
will flood the logfile.

If you have the kernel debuginfo and the 'perf' tool available, could
you please try:

perf probe -a 'udp_gro_receive sk sk->__sk_common.skc_dport'
perf probe -a 'udp_gro_receive_segment'

# neet to wait until at least a pkt corruption happens, 10 second
# should be more then enough
perf record -a -e probe:udp_gro_receive -e probe:udp_gro_receive_segment sleep 10

perf script | gzip > perf_script.gz

and share the above? I fear it could be too big for the ML, feel free
to send it directly to me.

> Next I'll try to capture some broken packets and reply in a separate mail,
> I'll have to figure out a good way to do this first.

Looks like there is corrupted packet every ~2K UDP ones. If you capture
a few thousends consecutive ones, than wireshark should probably help
finding the suspicious ones.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02 14:06       ` Paolo Abeni
@ 2021-07-02 14:21         ` Paolo Abeni
  2021-07-02 15:23           ` Matthias Treydte
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2021-07-02 14:21 UTC (permalink / raw)
  To: Matthias Treydte, Willem de Bruijn
  Cc: David Ahern, stable, netdev, regressions, davem, yoshfuji, dsahern

Hi,

On Fri, 2021-07-02 at 16:06 +0200, Paolo Abeni wrote:
> On Fri, 2021-07-02 at 14:36 +0200, Matthias Treydte wrote:
> > And to answer Paolo's questions from his mail to the list (@Paolo: I'm  
> > not subscribed, please also send to me directly so I don't miss your mail)
> 
> (yup, that is what I did ?!?)
> 
> > > Could you please:
> > > - tell how frequent is the pkt corruption, even a rough estimate of the
> > > frequency.
> > 
> > # journalctl --since "5min ago" | grep "Packet corrupt" | wc -l
> > 167
> > 
> > So there are 167 detected failures in 5 minutes, while the system is receiving
> > at a moderate rate of about 900 pkts/s (according to Prometheus' node exporter
> > at least, but seems about right)

I'm sorry for the high-frequency spamming.

Could you please try the following patch ? (only compile-tested)

I fear some packets may hang in the GRO engine for no reasons.
---
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 54e06b88af69..458c888337a5 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -526,6 +526,8 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
                if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
                    (sk && udp_sk(sk)->gro_enabled) || NAPI_GRO_CB(skb)->is_flist)
                        pp = call_gro_receive(udp_gro_receive_segment, head, skb);
+               else
+                       goto out;
                return pp;
        }


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02 11:42     ` Paolo Abeni
@ 2021-07-02 14:31       ` Matthias Treydte
  0 siblings, 0 replies; 11+ messages in thread
From: Matthias Treydte @ 2021-07-02 14:31 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: stable, netdev, regressions, davem, yoshfuji, dsahern,
	Willem de Bruijn, David Ahern


Quoting Paolo Abeni <pabeni@redhat.com>:

Finally got your mail, seems it was delayed, sorry.

> - check, if possibly, how exactly the pkts are corrupted. Wrong size?
> bad csum? what else?
>
> - ideally a short pcap trace comprising the problematic packets would
> be great!

While trying to produce something I came up with an interesting  
observation which seems to indicate that the problem lies in the  
combination of GRO and fragmented UDP packets:

Tuning the sending side (for my testing purposes also FFMpeg) to send  
UDP packets of 1316 bytes tops makes the problem go away in the  
receiver. The value must be an exact multiple of 188 (the MPEG TS  
frame size) to cause FFMpeg not to send fragmented packets at all.

Using this we were able to do the following on our normal desktop machines:

The sending side uses an command like this:

ffmpeg \
   -stream_loop -1 \
   -re -i "$video" \
   -c:v copy -c:a copy \
   -f mpegts "udp://239.12.23.0:1935"

and the receiver (in our case using Linux 5.12.14) "mpv  
udp://239.12.23.0:1935" to see the stream. For our test $video was  
just some h264 encoded MKV I had laying around. The receiver sees  
compression artifacts and the "Packet corrupt" messages in the  
console. Now there are two ways to improve this situation:

1) The receiver uses ethtool to disable GRO
2) The sender changes the URL to be "udp://239.12.23.0:1935?pkt_size=1316"

At this point I assume there are better ways to reproduce this using  
netcat or the like. But being a video guy, well, here we are. ;-)

My knowledge about the inner workings of Linux' IP stack are lacking,  
but because tcpdump "sees" packets before they are reassembled and the  
problem seems to exist only with packets that were fragmented and  
reassembled (as they are presented to libav), I have the feeling that  
a pcap file would not be that helpful with this, right?


Regards,
-Matthias


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02 14:21         ` Paolo Abeni
@ 2021-07-02 15:23           ` Matthias Treydte
  2021-07-02 15:32             ` Paolo Abeni
  0 siblings, 1 reply; 11+ messages in thread
From: Matthias Treydte @ 2021-07-02 15:23 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Willem de Bruijn, David Ahern, stable, netdev, regressions,
	davem, yoshfuji, dsahern


Quoting Paolo Abeni <pabeni@redhat.com>:

> ---
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 54e06b88af69..458c888337a5 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -526,6 +526,8 @@ struct sk_buff *udp_gro_receive(struct list_head  
> *head, struct sk_buff *skb,
>                 if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
>                     (sk && udp_sk(sk)->gro_enabled) ||  
> NAPI_GRO_CB(skb)->is_flist)
>                         pp =  
> call_gro_receive(udp_gro_receive_segment, head, skb);
> +               else
> +                       goto out;
>                 return pp;
>         }

Impressive! This patch, applied to 5.13, fixes the problem. What I  
like even more is that it again confirms my suspicion that an "if"  
without an "else" is always a code smell. :-)

With this and the reproducer in my previous mail, is there still value  
in doing the "perf" stuff?


Thanks,
-Matthias


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02 15:23           ` Matthias Treydte
@ 2021-07-02 15:32             ` Paolo Abeni
  2021-07-02 16:07               ` Matthias Treydte
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2021-07-02 15:32 UTC (permalink / raw)
  To: Matthias Treydte
  Cc: Willem de Bruijn, David Ahern, stable, netdev, regressions,
	davem, yoshfuji, dsahern

On Fri, 2021-07-02 at 17:23 +0200, Matthias Treydte wrote:
> Quoting Paolo Abeni <pabeni@redhat.com>:
> 
> > ---
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> > index 54e06b88af69..458c888337a5 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -526,6 +526,8 @@ struct sk_buff *udp_gro_receive(struct list_head  
> > *head, struct sk_buff *skb,
> >                 if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
> >                     (sk && udp_sk(sk)->gro_enabled) ||  
> > NAPI_GRO_CB(skb)->is_flist)
> >                         pp =  
> > call_gro_receive(udp_gro_receive_segment, head, skb);
> > +               else
> > +                       goto out;
> >                 return pp;
> >         }
> 
> Impressive! This patch, applied to 5.13, fixes the problem. What I  
> like even more is that it again confirms my suspicion that an "if"  
> without an "else" is always a code smell. :-)

Thank you for the quick feedback! I'll submit formally soon, after more
tests. I'll probably change the code to be something hopefully more
readable, as follow:

---
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 7a670757f37a..b3aabc886763 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -551,8 +551,10 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 
                if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
                    (sk && udp_sk(sk)->gro_enabled) || NAPI_GRO_CB(skb)->is_flist)
-                       pp = call_gro_receive(udp_gro_receive_segment, head, skb);
-               return pp;
+                       return call_gro_receive(udp_gro_receive_segment, head, skb);
+
+               /* no GRO, be sure flush the current packet */
+               goto out;
        }
---

> With this and the reproducer in my previous mail, is there still value  
> in doing the "perf" stuff?

Not needed, thank you!

Would be great instead if you could have a spin to the proposed variant
above - not stritly needed, I'm really asking for a few extra miles
here ;)

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [regression] UDP recv data corruption
  2021-07-02 15:32             ` Paolo Abeni
@ 2021-07-02 16:07               ` Matthias Treydte
  0 siblings, 0 replies; 11+ messages in thread
From: Matthias Treydte @ 2021-07-02 16:07 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Willem de Bruijn, David Ahern, stable, netdev, regressions,
	davem, yoshfuji, dsahern


Quoting Paolo Abeni <pabeni@redhat.com>:

> Would be great instead if you could have a spin to the proposed variant
> above - not stritly needed, I'm really asking for a few extra miles
> here ;)

Although this variant lacks the "else", to my surprise I have to admit  
that it works just as well. :-)


It was a pleasure and thanks for your work,
-Matthias



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-01 10:47 [regression] UDP recv data corruption Matthias Treydte
2021-07-01 15:39 ` David Ahern
2021-07-02  0:31   ` Willem de Bruijn
2021-07-02 11:42     ` Paolo Abeni
2021-07-02 14:31       ` Matthias Treydte
2021-07-02 12:36     ` Matthias Treydte
2021-07-02 14:06       ` Paolo Abeni
2021-07-02 14:21         ` Paolo Abeni
2021-07-02 15:23           ` Matthias Treydte
2021-07-02 15:32             ` Paolo Abeni
2021-07-02 16:07               ` Matthias Treydte

Linux kernel regressions

Archives are clonable:
	git clone --mirror https://lore.kernel.org/regressions/0 regressions/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 regressions regressions/ https://lore.kernel.org/regressions \
		regressions@lists.linux.dev
	public-inbox-index regressions

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/dev.linux.lists.regressions


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git