All of lore.kernel.org
 help / color / mirror / Atom feed
* UDP data corruption in v4.4
@ 2020-07-25  2:21 Dexuan Cui
  2020-07-25  5:58 ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Dexuan Cui @ 2020-07-25  2:21 UTC (permalink / raw)
  To: Al Viro, netdev, stable, David S. Miller, 'Eric Dumazet'
  Cc: 'Willy Tarreau', Greg KH, Joseph Salisbury, Michael Kelley

Hi,
The v4.4 stable kernel (currently it's v4.4.231) lacks this bugfix:
327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error")
, as a result, the v4.4 kernel can deliver corrupt data to the application
when a corrupt UDP packet is closely followed by a valid UDP packet:
when the same invocation of the recvmsg() syscall delivers the corrupt
packet's UDP payload to the application's receive buffer, it provides the
UDP payload length and the "from IP/Port" of the valid packet to the 
application -- this mismatch makes the issue worse.

Details:

For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's
include/linux/skbuff.h:3951), Linux delays the UDP checksum verification
until the application invokes the syscall recvmsg().

In the recvmsg() syscall handler, while Linux is copying the UDP payload
to the application's memory, it calculates the UDP checksum. If the
calculated checksum doesn't match the received checksum, Linux drops the
corrupt UDP packet, and then starts to process the next packet (if any),
and if the next packet is valid (i.e. the checksum is correct), Linux will
copy the valid UDP packet's payload to the application's receiver buffer.

The bug is: before Linux starts to copy the valid UDP packet, the data
structure used to track how many more bytes should be copied to the
application memory is not reset to what it was when the application just
entered the kernel by the syscall! Consequently, only a small portion or
none of the valid packet's payload is copied to the application's receive
buffer, and later when the application exits from the kernel, actually
most of the application's receive buffer contains the payload of the
corrupt packet while recvmsg() returns the length of the UDP payload of
the valid packet.

For the mainline kernel, the bug was fixed by Al Viro in the commit 
327868212381, but unluckily the bugfix is only backported to the
upstream v4.9+ kernels. I hope the bugfix can be backported to the
v4.4 stable kernel, since it's a "longterm" kernel and is still used by
some Linux distros.

It turns out backporting 327868212381 to v4.4 means that some 
Supporting patches must be backported first, so the overall changes
are pretty big...

I made the below one-line workaround patch to force the recvmsg() syscall
handler to return to the userspace when Linux detects a corrupt UDP packet,
so the application will invoke the syscall again to receive the following valid
UDP packet (note: the patch may not work well with blocking sockets, for
which typically the application doesn't expect an error of -EAGAIN. I
guess it would be safer to return -EINTR instead?):

--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1367,6 +1367,7 @@ csum_copy_err:
        /* starting over for a new packet, but check if we need to yield */
        cond_resched();
        msg->msg_flags &= ~MSG_TRUNC;
+       return -EAGAIN;
        goto try_again;
}


Eric Dumazet made an alternative that performs the csum validation earlier:

--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct
sk_buff *skb)
                }
        }

-       if (rcu_access_pointer(sk->sk_filter) &&
-           udp_lib_checksum_complete(skb))
+       if (udp_lib_checksum_complete(skb))
                goto csum_error;

        if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {

I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than
trying to backport 327868212381.

Looking forward to your comments!

Thanks,
Dexuan Cui

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UDP data corruption in v4.4
  2020-07-25  2:21 UDP data corruption in v4.4 Dexuan Cui
@ 2020-07-25  5:58 ` Greg KH
  2020-07-27 18:38   ` Dexuan Cui
  0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2020-07-25  5:58 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Al Viro, netdev, stable, David S. Miller, 'Eric Dumazet',
	'Willy Tarreau',
	Joseph Salisbury, Michael Kelley

On Sat, Jul 25, 2020 at 02:21:06AM +0000, Dexuan Cui wrote:
> Hi,
> The v4.4 stable kernel (currently it's v4.4.231) lacks this bugfix:
> 327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error")
> , as a result, the v4.4 kernel can deliver corrupt data to the application
> when a corrupt UDP packet is closely followed by a valid UDP packet:
> when the same invocation of the recvmsg() syscall delivers the corrupt
> packet's UDP payload to the application's receive buffer, it provides the
> UDP payload length and the "from IP/Port" of the valid packet to the 
> application -- this mismatch makes the issue worse.
> 
> Details:
> 
> For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's
> include/linux/skbuff.h:3951), Linux delays the UDP checksum verification
> until the application invokes the syscall recvmsg().
> 
> In the recvmsg() syscall handler, while Linux is copying the UDP payload
> to the application's memory, it calculates the UDP checksum. If the
> calculated checksum doesn't match the received checksum, Linux drops the
> corrupt UDP packet, and then starts to process the next packet (if any),
> and if the next packet is valid (i.e. the checksum is correct), Linux will
> copy the valid UDP packet's payload to the application's receiver buffer.
> 
> The bug is: before Linux starts to copy the valid UDP packet, the data
> structure used to track how many more bytes should be copied to the
> application memory is not reset to what it was when the application just
> entered the kernel by the syscall! Consequently, only a small portion or
> none of the valid packet's payload is copied to the application's receive
> buffer, and later when the application exits from the kernel, actually
> most of the application's receive buffer contains the payload of the
> corrupt packet while recvmsg() returns the length of the UDP payload of
> the valid packet.
> 
> For the mainline kernel, the bug was fixed by Al Viro in the commit 
> 327868212381, but unluckily the bugfix is only backported to the
> upstream v4.9+ kernels. I hope the bugfix can be backported to the
> v4.4 stable kernel, since it's a "longterm" kernel and is still used by
> some Linux distros.
> 
> It turns out backporting 327868212381 to v4.4 means that some 
> Supporting patches must be backported first, so the overall changes
> are pretty big...
> 
> I made the below one-line workaround patch to force the recvmsg() syscall
> handler to return to the userspace when Linux detects a corrupt UDP packet,
> so the application will invoke the syscall again to receive the following valid
> UDP packet (note: the patch may not work well with blocking sockets, for
> which typically the application doesn't expect an error of -EAGAIN. I
> guess it would be safer to return -EINTR instead?):
> 
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1367,6 +1367,7 @@ csum_copy_err:
>         /* starting over for a new packet, but check if we need to yield */
>         cond_resched();
>         msg->msg_flags &= ~MSG_TRUNC;
> +       return -EAGAIN;
>         goto try_again;
> }
> 
> 
> Eric Dumazet made an alternative that performs the csum validation earlier:
> 
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct
> sk_buff *skb)
>                 }
>         }
> 
> -       if (rcu_access_pointer(sk->sk_filter) &&
> -           udp_lib_checksum_complete(skb))
> +       if (udp_lib_checksum_complete(skb))
>                 goto csum_error;
> 
>         if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
> 
> I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than
> trying to backport 327868212381.

Does Eric's fix work with your testing?  If so, great, can you turn it
into something I can apply to the 4.4.y stable tree and send it to
stable@vger.kernel.org?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: UDP data corruption in v4.4
  2020-07-25  5:58 ` Greg KH
@ 2020-07-27 18:38   ` Dexuan Cui
  2020-07-27 18:40     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Dexuan Cui @ 2020-07-27 18:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Al Viro, netdev, stable, David S. Miller, 'Eric Dumazet',
	'Willy Tarreau',
	Joseph Salisbury, Michael Kelley

> From: Greg KH <greg@kroah.com>
> Sent: Friday, July 24, 2020 10:59 PM
> > [...]
> > Eric Dumazet made an alternative that performs the csum validation earlier:
> >
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct
> > sk_buff *skb)
> >                 }
> >         }
> >
> > -       if (rcu_access_pointer(sk->sk_filter) &&
> > -           udp_lib_checksum_complete(skb))
> > +       if (udp_lib_checksum_complete(skb))
> >                 goto csum_error;
> >
> >         if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
> >
> > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than
> > trying to backport 327868212381.
> 
> Does Eric's fix work with your testing?  

Yes, it worked in my testing overnight.

> If so, great, can you turn it
> into something I can apply to the 4.4.y stable tree and send it to
> stable@vger.kernel.org?
> 
> greg k-h

Will do shortly.

Thanks,
Dexuan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UDP data corruption in v4.4
  2020-07-27 18:38   ` Dexuan Cui
@ 2020-07-27 18:40     ` Eric Dumazet
  2020-07-27 18:57       ` Dexuan Cui
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2020-07-27 18:40 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Greg KH, Al Viro, netdev, stable, David S. Miller, Willy Tarreau,
	Joseph Salisbury, Michael Kelley

On Mon, Jul 27, 2020 at 11:38 AM Dexuan Cui <decui@microsoft.com> wrote:
>
> > From: Greg KH <greg@kroah.com>
> > Sent: Friday, July 24, 2020 10:59 PM
> > > [...]
> > > Eric Dumazet made an alternative that performs the csum validation earlier:
> > >
> > > --- a/net/ipv4/udp.c
> > > +++ b/net/ipv4/udp.c
> > > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct
> > > sk_buff *skb)
> > >                 }
> > >         }
> > >
> > > -       if (rcu_access_pointer(sk->sk_filter) &&
> > > -           udp_lib_checksum_complete(skb))
> > > +       if (udp_lib_checksum_complete(skb))
> > >                 goto csum_error;
> > >
> > >         if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
> > >
> > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than
> > > trying to backport 327868212381.
> >
> > Does Eric's fix work with your testing?
>
> Yes, it worked in my testing overnight.
>
> > If so, great, can you turn it
> > into something I can apply to the 4.4.y stable tree and send it to
> > stable@vger.kernel.org?
> >
> > greg k-h
>
> Will do shortly.
>

Just as a reminder, please also add the IPv6 part.

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index a8d74f44056a681ef9057c4c4abb34016120b44f..13713e0e5779b75de975faaeb4511bef40e097dc
100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -661,8 +661,7 @@ static int udpv6_queue_rcv_one_skb(struct sock
*sk, struct sk_buff *skb)
        }

        prefetch(&sk->sk_rmem_alloc);
-       if (rcu_access_pointer(sk->sk_filter) &&
-           udp_lib_checksum_complete(skb))
+       if (udp_lib_checksum_complete(skb))
                goto csum_error;

        if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr)))

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: UDP data corruption in v4.4
  2020-07-27 18:40     ` Eric Dumazet
@ 2020-07-27 18:57       ` Dexuan Cui
  2020-07-27 19:37         ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Dexuan Cui @ 2020-07-27 18:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Greg KH, Al Viro, netdev, stable, David S. Miller, Willy Tarreau,
	Joseph Salisbury, Michael Kelley

> From: Eric Dumazet <edumazet@google.com>
> Sent: Monday, July 27, 2020 11:40 AM
> To: Dexuan Cui <decui@microsoft.com>
> 
> On Mon, Jul 27, 2020 at 11:38 AM Dexuan Cui <decui@microsoft.com> wrote:
> >
> > > From: Greg KH <greg@kroah.com>
> > > Sent: Friday, July 24, 2020 10:59 PM
> > > > [...]
> > > > Eric Dumazet made an alternative that performs the csum validation
> earlier:
> > > >
> > > > --- a/net/ipv4/udp.c
> > > > +++ b/net/ipv4/udp.c
> > > > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct
> > > > sk_buff *skb)
> > > >                 }
> > > >         }
> > > >
> > > > -       if (rcu_access_pointer(sk->sk_filter) &&
> > > > -           udp_lib_checksum_complete(skb))
> > > > +       if (udp_lib_checksum_complete(skb))
> > > >                 goto csum_error;
> > > >
> > > >         if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
> > > >
> > > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than
> > > > trying to backport 327868212381.
> > >
> > > Does Eric's fix work with your testing?
> >
> > Yes, it worked in my testing overnight.
> >
> > > If so, great, can you turn it
> > > into something I can apply to the 4.4.y stable tree and send it to
> > > stable@vger.kernel.org?
> > >
> > > greg k-h
> >
> > Will do shortly.
> >
> 
> Just as a reminder, please also add the IPv6 part.
> 
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index
> a8d74f44056a681ef9057c4c4abb34016120b44f..13713e0e5779b75de975faae
> b4511bef40e097dc
> 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -661,8 +661,7 @@ static int udpv6_queue_rcv_one_skb(struct sock
> *sk, struct sk_buff *skb)
>         }
> 
>         prefetch(&sk->sk_rmem_alloc);
> -       if (rcu_access_pointer(sk->sk_filter) &&
> -           udp_lib_checksum_complete(skb))
> +       if (udp_lib_checksum_complete(skb))
>                 goto csum_error;
> 
>         if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr)))

Oh, yes! :-) Thank you!

Eric, I'll add your Signed-off-by and mine. Please let me know in case
this is not ok.

I'll do a little more testing with the patch and I plan to post the patch
to stable@vger.kernel.org and netdev@vger.kernel.org this afternoon,
i.e. in 3~4 hours or so. 

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: UDP data corruption in v4.4
  2020-07-27 18:57       ` Dexuan Cui
@ 2020-07-27 19:37         ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2020-07-27 19:37 UTC (permalink / raw)
  To: Dexuan Cui, Eric Dumazet
  Cc: Greg KH, Al Viro, netdev, stable, David S. Miller, Willy Tarreau,
	Joseph Salisbury, Michael Kelley



On 7/27/20 11:57 AM, Dexuan Cui wrote:
>> From: Eric Dumazet <edumazet@google.com>

> Oh, yes! :-) Thank you!
> 
> Eric, I'll add your Signed-off-by and mine. Please let me know in case
> this is not ok.

Sure, do not worry about this.

> 
> I'll do a little more testing with the patch and I plan to post the patch
> to stable@vger.kernel.org and netdev@vger.kernel.org this afternoon,
> i.e. in 3~4 hours or so. 
> 
> Thanks,
> -- Dexuan
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-07-27 19:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-25  2:21 UDP data corruption in v4.4 Dexuan Cui
2020-07-25  5:58 ` Greg KH
2020-07-27 18:38   ` Dexuan Cui
2020-07-27 18:40     ` Eric Dumazet
2020-07-27 18:57       ` Dexuan Cui
2020-07-27 19:37         ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.