All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
@ 2020-03-09 15:34 Willem de Bruijn
  2020-03-09 15:42 ` Willem de Bruijn
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-09 15:34 UTC (permalink / raw)
  To: netdev; +Cc: davem, mst, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

In one error case, tpacket_rcv drops packets after incrementing the
ring producer index.

If this happens, it does not update tp_status to TP_STATUS_USER and
thus the reader is stalled for an iteration of the ring, causing out
of order arrival.

The only such error path is when virtio_net_hdr_from_skb fails due
to encountering an unknown GSO type.

Signed-off-by: Willem de Bruijn <willemb@google.com>

---

I wonder whether it should drop packets with unknown GSO types at all.
This consistently blinds the reader to certain packets, including
recent UDP and SCTP GSO types.

The peer function virtio_net_hdr_to_skb already drops any packets with
unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
let the peer at least be aware of failure.

And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
---
 net/packet/af_packet.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 30c6879d6774..e5b0986215d2 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2274,6 +2274,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 					TP_STATUS_KERNEL, (macoff+snaplen));
 	if (!h.raw)
 		goto drop_n_account;
+
+	if (do_vnet &&
+	    virtio_net_hdr_from_skb(skb, h.raw + macoff -
+				    sizeof(struct virtio_net_hdr),
+				    vio_le(), true, 0))
+		goto drop_n_account;
+
 	if (po->tp_version <= TPACKET_V2) {
 		packet_increment_rx_head(po, &po->rx_ring);
 	/*
@@ -2286,12 +2293,6 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 			status |= TP_STATUS_LOSING;
 	}
 
-	if (do_vnet &&
-	    virtio_net_hdr_from_skb(skb, h.raw + macoff -
-				    sizeof(struct virtio_net_hdr),
-				    vio_le(), true, 0))
-		goto drop_n_account;
-
 	po->stats.stats1.tp_packets++;
 	if (copy_skb) {
 		status |= TP_STATUS_COPY;
-- 
2.25.1.481.gfbce0eb801-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-09 15:34 [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop Willem de Bruijn
@ 2020-03-09 15:42 ` Willem de Bruijn
  2020-03-09 15:50   ` Willem de Bruijn
  2020-03-10  6:43 ` Michael S. Tsirkin
  2020-03-12  6:13 ` David Miller
  2 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-09 15:42 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller, Michael S. Tsirkin

On Mon, Mar 9, 2020 at 11:34 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> From: Willem de Bruijn <willemb@google.com>
>
> In one error case, tpacket_rcv drops packets after incrementing the
> ring producer index.
>
> If this happens, it does not update tp_status to TP_STATUS_USER and
> thus the reader is stalled for an iteration of the ring, causing out
> of order arrival.
>
> The only such error path is when virtio_net_hdr_from_skb fails due
> to encountering an unknown GSO type.
>
> Signed-off-by: Willem de Bruijn <willemb@google.com>


Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.")

I forgot to add the Fixes tag, sorry. This goes back to the
introduction of GSO support for virtio_net.

The discussion then explicitly arrived at deciding to fail on an
unknown type (SKB_GSO_FCOE). Which is why I asked the question here
whether we want to revisit that choice or leave as is.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-09 15:42 ` Willem de Bruijn
@ 2020-03-09 15:50   ` Willem de Bruijn
  2020-03-10  6:46     ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-09 15:50 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller, Michael S. Tsirkin

On Mon, Mar 9, 2020 at 11:42 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Mon, Mar 9, 2020 at 11:34 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > From: Willem de Bruijn <willemb@google.com>
> >
> > In one error case, tpacket_rcv drops packets after incrementing the
> > ring producer index.
> >
> > If this happens, it does not update tp_status to TP_STATUS_USER and
> > thus the reader is stalled for an iteration of the ring, causing out
> > of order arrival.
> >
> > The only such error path is when virtio_net_hdr_from_skb fails due
> > to encountering an unknown GSO type.
> >
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
>
>
> Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.")
>
> I forgot to add the Fixes tag, sorry. This goes back to the
> introduction of GSO support for virtio_net.

The problem of blinding receivers to certain packet types goes back to
that commit.

But the specific issue of ring out of order arrival is added later,
when vnet_hdr support is extended to tpacket_rcv:

Fixes: 58d19b19cd99 ("packet: vnet_hdr support for tpacket_rcv")

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-09 15:34 [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop Willem de Bruijn
  2020-03-09 15:42 ` Willem de Bruijn
@ 2020-03-10  6:43 ` Michael S. Tsirkin
  2020-03-10 12:49   ` Willem de Bruijn
  2020-03-12  6:13 ` David Miller
  2 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-10  6:43 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, davem, Willem de Bruijn

On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> In one error case, tpacket_rcv drops packets after incrementing the
> ring producer index.
> 
> If this happens, it does not update tp_status to TP_STATUS_USER and
> thus the reader is stalled for an iteration of the ring, causing out
> of order arrival.
> 
> The only such error path is when virtio_net_hdr_from_skb fails due
> to encountering an unknown GSO type.
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> 
> ---
> 
> I wonder whether it should drop packets with unknown GSO types at all.
> This consistently blinds the reader to certain packets, including
> recent UDP and SCTP GSO types.

Ugh it looks like you have found a bug.  Consider a legacy userspace -
it was actually broken by adding USD and SCTP GSO.  I suspect the right
thing to do here is actually to split these packets up, not drop them.


> The peer function virtio_net_hdr_to_skb already drops any packets with
> unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> let the peer at least be aware of failure.
> 
> And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.

This last one is possible for sure, but for virtio_net_hdr_from_skb
we'll need more flags to know whether it's safe to pass
these types to userspace.


> ---
>  net/packet/af_packet.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 30c6879d6774..e5b0986215d2 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2274,6 +2274,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
>  					TP_STATUS_KERNEL, (macoff+snaplen));
>  	if (!h.raw)
>  		goto drop_n_account;
> +
> +	if (do_vnet &&
> +	    virtio_net_hdr_from_skb(skb, h.raw + macoff -
> +				    sizeof(struct virtio_net_hdr),
> +				    vio_le(), true, 0))
> +		goto drop_n_account;
> +
>  	if (po->tp_version <= TPACKET_V2) {
>  		packet_increment_rx_head(po, &po->rx_ring);
>  	/*
> @@ -2286,12 +2293,6 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
>  			status |= TP_STATUS_LOSING;
>  	}
>  
> -	if (do_vnet &&
> -	    virtio_net_hdr_from_skb(skb, h.raw + macoff -
> -				    sizeof(struct virtio_net_hdr),
> -				    vio_le(), true, 0))
> -		goto drop_n_account;
> -
>  	po->stats.stats1.tp_packets++;
>  	if (copy_skb) {
>  		status |= TP_STATUS_COPY;
> -- 
> 2.25.1.481.gfbce0eb801-goog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-09 15:50   ` Willem de Bruijn
@ 2020-03-10  6:46     ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-10  6:46 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Mon, Mar 09, 2020 at 11:50:23AM -0400, Willem de Bruijn wrote:
> On Mon, Mar 9, 2020 at 11:42 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Mon, Mar 9, 2020 at 11:34 AM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > From: Willem de Bruijn <willemb@google.com>
> > >
> > > In one error case, tpacket_rcv drops packets after incrementing the
> > > ring producer index.
> > >
> > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > thus the reader is stalled for an iteration of the ring, causing out
> > > of order arrival.
> > >
> > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > to encountering an unknown GSO type.
> > >
> > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> >
> >
> > Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.")
> >
> > I forgot to add the Fixes tag, sorry. This goes back to the
> > introduction of GSO support for virtio_net.
> 
> The problem of blinding receivers to certain packet types goes back to
> that commit.
> 
> But the specific issue of ring out of order arrival is added later,
> when vnet_hdr support is extended to tpacket_rcv:
> 
> Fixes: 58d19b19cd99 ("packet: vnet_hdr support for tpacket_rcv")


In fact it looks like

commit 9fd1ff5d2ac7181844735806b0a703c942365291
Author: Steffen Klassert <steffen.klassert@secunet.com>
Date:   Sat Jan 25 11:26:45 2020 +0100

    udp: Support UDP fraglist GRO/GSO.

and

commit 90017accff61ae89283ad9a51f9ac46ca01633fb
Author: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date:   Thu Jun 2 15:05:43 2016 -0300

    sctp: Add GSO support
    
both break userspace using virtio due to lack of fallback...

-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10  6:43 ` Michael S. Tsirkin
@ 2020-03-10 12:49   ` Willem de Bruijn
  2020-03-10 12:59     ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-10 12:49 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Willem de Bruijn, Network Development, David Miller

On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> >
> > In one error case, tpacket_rcv drops packets after incrementing the
> > ring producer index.
> >
> > If this happens, it does not update tp_status to TP_STATUS_USER and
> > thus the reader is stalled for an iteration of the ring, causing out
> > of order arrival.
> >
> > The only such error path is when virtio_net_hdr_from_skb fails due
> > to encountering an unknown GSO type.
> >
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> >
> > ---
> >
> > I wonder whether it should drop packets with unknown GSO types at all.
> > This consistently blinds the reader to certain packets, including
> > recent UDP and SCTP GSO types.
>
> Ugh it looks like you have found a bug.  Consider a legacy userspace -
> it was actually broken by adding USD and SCTP GSO.  I suspect the right
> thing to do here is actually to split these packets up, not drop them.

In the main virtio users, virtio_net/tun/tap, the packets will always
arrive segmented, due to these devices not advertising hardware
segmentation for these protocols.

So the issue is limited to users of tpacket_rcv, which is relatively
new. There too it is limited on egress to devices that do advertise
h/w offload. And on r/x to GRO.

The UDP GSO issue precedes the fraglist GRO patch, by the way, and
goes back to my (argh!) introduction of the feature on the egress
path.

>
> > The peer function virtio_net_hdr_to_skb already drops any packets with
> > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > let the peer at least be aware of failure.
> >
> > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
>
> This last one is possible for sure, but for virtio_net_hdr_from_skb
> we'll need more flags to know whether it's safe to pass
> these types to userspace.

Can you elaborate? Since virtio_net_hdr_to_skb users already returns
-EINVAL on unknown GSO types and its callers just drop these packets,
it looks to me that the infra is future proof wrt adding new GSO
types.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 12:49   ` Willem de Bruijn
@ 2020-03-10 12:59     ` Michael S. Tsirkin
  2020-03-10 14:16       ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-10 12:59 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > From: Willem de Bruijn <willemb@google.com>
> > >
> > > In one error case, tpacket_rcv drops packets after incrementing the
> > > ring producer index.
> > >
> > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > thus the reader is stalled for an iteration of the ring, causing out
> > > of order arrival.
> > >
> > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > to encountering an unknown GSO type.
> > >
> > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > >
> > > ---
> > >
> > > I wonder whether it should drop packets with unknown GSO types at all.
> > > This consistently blinds the reader to certain packets, including
> > > recent UDP and SCTP GSO types.
> >
> > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > thing to do here is actually to split these packets up, not drop them.
> 
> In the main virtio users, virtio_net/tun/tap, the packets will always
> arrive segmented, due to these devices not advertising hardware
> segmentation for these protocols.

Oh right. That's good then, sorry about the noise.

> So the issue is limited to users of tpacket_rcv, which is relatively
> new. There too it is limited on egress to devices that do advertise
> h/w offload. And on r/x to GRO.
> 
> The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> goes back to my (argh!) introduction of the feature on the egress
> path.
> 
> >
> > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > let the peer at least be aware of failure.
> > >
> > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> >
> > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > we'll need more flags to know whether it's safe to pass
> > these types to userspace.
> 
> Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> -EINVAL on unknown GSO types and its callers just drop these packets,
> it looks to me that the infra is future proof wrt adding new GSO
> types.

Oh I mean if we do want to add new types and want to pass them to
users, then virtio_net_hdr_from_skb will need to flag so it
knows whether that will or won't confuse userspace.

-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 12:59     ` Michael S. Tsirkin
@ 2020-03-10 14:16       ` Willem de Bruijn
  2020-03-10 14:43         ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-10 14:16 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Willem de Bruijn, Network Development, David Miller

On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > From: Willem de Bruijn <willemb@google.com>
> > > >
> > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > ring producer index.
> > > >
> > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > of order arrival.
> > > >
> > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > to encountering an unknown GSO type.
> > > >
> > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > >
> > > > ---
> > > >
> > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > This consistently blinds the reader to certain packets, including
> > > > recent UDP and SCTP GSO types.
> > >
> > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > thing to do here is actually to split these packets up, not drop them.
> >
> > In the main virtio users, virtio_net/tun/tap, the packets will always
> > arrive segmented, due to these devices not advertising hardware
> > segmentation for these protocols.
>
> Oh right. That's good then, sorry about the noise.

Not at all. Thanks for taking a look!

> > So the issue is limited to users of tpacket_rcv, which is relatively
> > new. There too it is limited on egress to devices that do advertise
> > h/w offload. And on r/x to GRO.
> >
> > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > goes back to my (argh!) introduction of the feature on the egress
> > path.
> >
> > >
> > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > let the peer at least be aware of failure.
> > > >
> > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > >
> > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > we'll need more flags to know whether it's safe to pass
> > > these types to userspace.
> >
> > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > -EINVAL on unknown GSO types and its callers just drop these packets,
> > it looks to me that the infra is future proof wrt adding new GSO
> > types.
>
> Oh I mean if we do want to add new types and want to pass them to
> users, then virtio_net_hdr_from_skb will need to flag so it
> knows whether that will or won't confuse userspace.

I'm not sure how that would work. Ignoring other tun/tap/virtio for
now, just looking at tpacket, a new variant of socket option for
PACKET_VNET_HDR, for every new GSO type?

In practice the userspace I'm aware of, and any sane implementation,
will be future proof to drop and account packets whose type it cannot
process. So I think we can just add new types.

In the worst case, arrival of these packets is under admin control with ethtool.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 14:16       ` Willem de Bruijn
@ 2020-03-10 14:43         ` Michael S. Tsirkin
  2020-03-10 15:38           ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-10 14:43 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > From: Willem de Bruijn <willemb@google.com>
> > > > >
> > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > ring producer index.
> > > > >
> > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > of order arrival.
> > > > >
> > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > to encountering an unknown GSO type.
> > > > >
> > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > >
> > > > > ---
> > > > >
> > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > This consistently blinds the reader to certain packets, including
> > > > > recent UDP and SCTP GSO types.
> > > >
> > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > thing to do here is actually to split these packets up, not drop them.
> > >
> > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > arrive segmented, due to these devices not advertising hardware
> > > segmentation for these protocols.
> >
> > Oh right. That's good then, sorry about the noise.
> 
> Not at all. Thanks for taking a look!
> 
> > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > new. There too it is limited on egress to devices that do advertise
> > > h/w offload. And on r/x to GRO.
> > >
> > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > goes back to my (argh!) introduction of the feature on the egress
> > > path.
> > >
> > > >
> > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > let the peer at least be aware of failure.
> > > > >
> > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > >
> > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > we'll need more flags to know whether it's safe to pass
> > > > these types to userspace.
> > >
> > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > it looks to me that the infra is future proof wrt adding new GSO
> > > types.
> >
> > Oh I mean if we do want to add new types and want to pass them to
> > users, then virtio_net_hdr_from_skb will need to flag so it
> > knows whether that will or won't confuse userspace.
> 
> I'm not sure how that would work. Ignoring other tun/tap/virtio for
> now, just looking at tpacket, a new variant of socket option for
> PACKET_VNET_HDR, for every new GSO type?

Maybe a single one with a bitmap of legal types?

> In practice the userspace I'm aware of, and any sane implementation,
> will be future proof to drop and account packets whose type it cannot
> process. So I think we can just add new types.

Well if packets are just dropped then userspace breaks right?
So we'll really need to split up packets when this happens.


> In the worst case, arrival of these packets is under admin control with ethtool.

It's common to enable this by default since hey offload, must be good.

-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 14:43         ` Michael S. Tsirkin
@ 2020-03-10 15:38           ` Willem de Bruijn
  2020-03-10 16:14             ` Willem de Bruijn
  2020-03-10 21:29             ` Michael S. Tsirkin
  0 siblings, 2 replies; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-10 15:38 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Willem de Bruijn, Network Development, David Miller

On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > >
> > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > ring producer index.
> > > > > >
> > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > of order arrival.
> > > > > >
> > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > to encountering an unknown GSO type.
> > > > > >
> > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > This consistently blinds the reader to certain packets, including
> > > > > > recent UDP and SCTP GSO types.
> > > > >
> > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > thing to do here is actually to split these packets up, not drop them.
> > > >
> > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > arrive segmented, due to these devices not advertising hardware
> > > > segmentation for these protocols.
> > >
> > > Oh right. That's good then, sorry about the noise.
> >
> > Not at all. Thanks for taking a look!
> >
> > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > new. There too it is limited on egress to devices that do advertise
> > > > h/w offload. And on r/x to GRO.
> > > >
> > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > goes back to my (argh!) introduction of the feature on the egress
> > > > path.
> > > >
> > > > >
> > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > let the peer at least be aware of failure.
> > > > > >
> > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > >
> > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > we'll need more flags to know whether it's safe to pass
> > > > > these types to userspace.
> > > >
> > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > types.
> > >
> > > Oh I mean if we do want to add new types and want to pass them to
> > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > knows whether that will or won't confuse userspace.
> >
> > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > now, just looking at tpacket, a new variant of socket option for
> > PACKET_VNET_HDR, for every new GSO type?
>
> Maybe a single one with a bitmap of legal types?
>
> > In practice the userspace I'm aware of, and any sane implementation,
> > will be future proof to drop and account packets whose type it cannot
> > process. So I think we can just add new types.
>
> Well if packets are just dropped then userspace breaks right?

It is an improvement over the current silent discard in the kernel.

If it can count these packets, userspace becomes notified that it
should perhaps upgrade or use ethtool to stop the kernel from
generating certain packets.

Specifically for packet sockets, it wants to receive packets as they
appear "on the wire". It does not have to drop these today even, but
can easily parse the headers.

For packet sockets at least, I don't think that we want transparent
segmentation.


> So we'll really need to split up packets when this happens.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 15:38           ` Willem de Bruijn
@ 2020-03-10 16:14             ` Willem de Bruijn
  2020-03-10 21:29             ` Michael S. Tsirkin
  1 sibling, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-10 16:14 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Michael S. Tsirkin, Network Development, David Miller

On Tue, Mar 10, 2020 at 11:38 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > >
> > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > ring producer index.
> > > > > > >
> > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > of order arrival.
> > > > > > >
> > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > to encountering an unknown GSO type.
> > > > > > >
> > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > >
> > > > > > > ---
> > > > > > >
> > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > recent UDP and SCTP GSO types.
> > > > > >
> > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > >
> > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > arrive segmented, due to these devices not advertising hardware
> > > > > segmentation for these protocols.
> > > >
> > > > Oh right. That's good then, sorry about the noise.
> > >
> > > Not at all. Thanks for taking a look!
> > >
> > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > new. There too it is limited on egress to devices that do advertise
> > > > > h/w offload. And on r/x to GRO.
> > > > >
> > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > path.
> > > > >
> > > > > >
> > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > let the peer at least be aware of failure.
> > > > > > >
> > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > >
> > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > these types to userspace.
> > > > >
> > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > types.
> > > >
> > > > Oh I mean if we do want to add new types and want to pass them to
> > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > knows whether that will or won't confuse userspace.
> > >
> > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > now, just looking at tpacket, a new variant of socket option for
> > > PACKET_VNET_HDR, for every new GSO type?
> >
> > Maybe a single one with a bitmap of legal types?
> >
> > > In practice the userspace I'm aware of, and any sane implementation,
> > > will be future proof to drop and account packets whose type it cannot
> > > process. So I think we can just add new types.
> >
> > Well if packets are just dropped then userspace breaks right?
>
> It is an improvement over the current silent discard in the kernel.
>
> If it can count these packets, userspace becomes notified that it
> should perhaps upgrade or use ethtool to stop the kernel from
> generating certain packets.
>
> Specifically for packet sockets, it wants to receive packets as they
> appear "on the wire". It does not have to drop these today even, but
> can easily parse the headers.
>
> For packet sockets at least, I don't think that we want transparent
> segmentation.

Or more succinct:

For packet sockets I think we should pass everything up to userspace
as it is. Then virtio_net_hdr_from_skb never has to fail. And this
patch is unnecessary.

But we would need at least one new VIRTIO_NET_HDR_GSO_UNKNOWN type.
And only use that on packet sockets, keeping existing EINVAL in the
(indeed unexpected) case such packets make it to virtio_net or tun/tap
ndo_start_xmit.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 15:38           ` Willem de Bruijn
  2020-03-10 16:14             ` Willem de Bruijn
@ 2020-03-10 21:29             ` Michael S. Tsirkin
  2020-03-10 21:35               ` Willem de Bruijn
  1 sibling, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-10 21:29 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Tue, Mar 10, 2020 at 11:38:16AM -0400, Willem de Bruijn wrote:
> On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > >
> > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > ring producer index.
> > > > > > >
> > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > of order arrival.
> > > > > > >
> > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > to encountering an unknown GSO type.
> > > > > > >
> > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > >
> > > > > > > ---
> > > > > > >
> > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > recent UDP and SCTP GSO types.
> > > > > >
> > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > >
> > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > arrive segmented, due to these devices not advertising hardware
> > > > > segmentation for these protocols.
> > > >
> > > > Oh right. That's good then, sorry about the noise.
> > >
> > > Not at all. Thanks for taking a look!
> > >
> > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > new. There too it is limited on egress to devices that do advertise
> > > > > h/w offload. And on r/x to GRO.
> > > > >
> > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > path.
> > > > >
> > > > > >
> > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > let the peer at least be aware of failure.
> > > > > > >
> > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > >
> > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > these types to userspace.
> > > > >
> > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > types.
> > > >
> > > > Oh I mean if we do want to add new types and want to pass them to
> > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > knows whether that will or won't confuse userspace.
> > >
> > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > now, just looking at tpacket, a new variant of socket option for
> > > PACKET_VNET_HDR, for every new GSO type?
> >
> > Maybe a single one with a bitmap of legal types?
> >
> > > In practice the userspace I'm aware of, and any sane implementation,
> > > will be future proof to drop and account packets whose type it cannot
> > > process. So I think we can just add new types.
> >
> > Well if packets are just dropped then userspace breaks right?
> 
> It is an improvement over the current silent discard in the kernel.
> 
> If it can count these packets, userspace becomes notified that it
> should perhaps upgrade or use ethtool to stop the kernel from
> generating certain packets.
> 
> Specifically for packet sockets, it wants to receive packets as they
> appear "on the wire". It does not have to drop these today even, but
> can easily parse the headers.
> 
> For packet sockets at least, I don't think that we want transparent
> segmentation.

Well it's GSO is in the way then it's no longer "on the wire", right?
Whether we split these back to individual skbs or we don't
it's individual packets that are on the wire. GSO just allows
passing them to the application in a more efficient way.

> 
> > So we'll really need to split up packets when this happens.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 21:29             ` Michael S. Tsirkin
@ 2020-03-10 21:35               ` Willem de Bruijn
  2020-03-10 21:57                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-10 21:35 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Willem de Bruijn, Network Development, David Miller

On Tue, Mar 10, 2020 at 5:30 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 10, 2020 at 11:38:16AM -0400, Willem de Bruijn wrote:
> > On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > > >
> > > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > > ring producer index.
> > > > > > > >
> > > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > > of order arrival.
> > > > > > > >
> > > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > > to encountering an unknown GSO type.
> > > > > > > >
> > > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > > >
> > > > > > > > ---
> > > > > > > >
> > > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > > recent UDP and SCTP GSO types.
> > > > > > >
> > > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > > >
> > > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > > arrive segmented, due to these devices not advertising hardware
> > > > > > segmentation for these protocols.
> > > > >
> > > > > Oh right. That's good then, sorry about the noise.
> > > >
> > > > Not at all. Thanks for taking a look!
> > > >
> > > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > > new. There too it is limited on egress to devices that do advertise
> > > > > > h/w offload. And on r/x to GRO.
> > > > > >
> > > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > > path.
> > > > > >
> > > > > > >
> > > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > > let the peer at least be aware of failure.
> > > > > > > >
> > > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > > >
> > > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > > these types to userspace.
> > > > > >
> > > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > > types.
> > > > >
> > > > > Oh I mean if we do want to add new types and want to pass them to
> > > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > > knows whether that will or won't confuse userspace.
> > > >
> > > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > > now, just looking at tpacket, a new variant of socket option for
> > > > PACKET_VNET_HDR, for every new GSO type?
> > >
> > > Maybe a single one with a bitmap of legal types?
> > >
> > > > In practice the userspace I'm aware of, and any sane implementation,
> > > > will be future proof to drop and account packets whose type it cannot
> > > > process. So I think we can just add new types.
> > >
> > > Well if packets are just dropped then userspace breaks right?
> >
> > It is an improvement over the current silent discard in the kernel.
> >
> > If it can count these packets, userspace becomes notified that it
> > should perhaps upgrade or use ethtool to stop the kernel from
> > generating certain packets.
> >
> > Specifically for packet sockets, it wants to receive packets as they
> > appear "on the wire". It does not have to drop these today even, but
> > can easily parse the headers.
> >
> > For packet sockets at least, I don't think that we want transparent
> > segmentation.
>
> Well it's GSO is in the way then it's no longer "on the wire", right?
> Whether we split these back to individual skbs or we don't
> it's individual packets that are on the wire. GSO just allows
> passing them to the application in a more efficient way.

Not entirely. With TSO enabled, packet sockets will show the TCP TSO
packets, not the individual segment on the wire.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 21:35               ` Willem de Bruijn
@ 2020-03-10 21:57                 ` Michael S. Tsirkin
  2020-03-10 23:13                   ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-10 21:57 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Tue, Mar 10, 2020 at 05:35:55PM -0400, Willem de Bruijn wrote:
> On Tue, Mar 10, 2020 at 5:30 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 10, 2020 at 11:38:16AM -0400, Willem de Bruijn wrote:
> > > On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > > > >
> > > > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > > > ring producer index.
> > > > > > > > >
> > > > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > > > of order arrival.
> > > > > > > > >
> > > > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > > > to encountering an unknown GSO type.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > > > >
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > > > recent UDP and SCTP GSO types.
> > > > > > > >
> > > > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > > > >
> > > > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > > > arrive segmented, due to these devices not advertising hardware
> > > > > > > segmentation for these protocols.
> > > > > >
> > > > > > Oh right. That's good then, sorry about the noise.
> > > > >
> > > > > Not at all. Thanks for taking a look!
> > > > >
> > > > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > > > new. There too it is limited on egress to devices that do advertise
> > > > > > > h/w offload. And on r/x to GRO.
> > > > > > >
> > > > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > > > path.
> > > > > > >
> > > > > > > >
> > > > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > > > let the peer at least be aware of failure.
> > > > > > > > >
> > > > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > > > >
> > > > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > > > these types to userspace.
> > > > > > >
> > > > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > > > types.
> > > > > >
> > > > > > Oh I mean if we do want to add new types and want to pass them to
> > > > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > > > knows whether that will or won't confuse userspace.
> > > > >
> > > > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > > > now, just looking at tpacket, a new variant of socket option for
> > > > > PACKET_VNET_HDR, for every new GSO type?
> > > >
> > > > Maybe a single one with a bitmap of legal types?
> > > >
> > > > > In practice the userspace I'm aware of, and any sane implementation,
> > > > > will be future proof to drop and account packets whose type it cannot
> > > > > process. So I think we can just add new types.
> > > >
> > > > Well if packets are just dropped then userspace breaks right?
> > >
> > > It is an improvement over the current silent discard in the kernel.
> > >
> > > If it can count these packets, userspace becomes notified that it
> > > should perhaps upgrade or use ethtool to stop the kernel from
> > > generating certain packets.
> > >
> > > Specifically for packet sockets, it wants to receive packets as they
> > > appear "on the wire". It does not have to drop these today even, but
> > > can easily parse the headers.
> > >
> > > For packet sockets at least, I don't think that we want transparent
> > > segmentation.
> >
> > Well it's GSO is in the way then it's no longer "on the wire", right?
> > Whether we split these back to individual skbs or we don't
> > it's individual packets that are on the wire. GSO just allows
> > passing them to the application in a more efficient way.
> 
> Not entirely. With TSO enabled, packet sockets will show the TCP TSO
> packets, not the individual segment on the wire.

But nothing breaks if it shows a segment on the wire while linux
processes packets in batches, right? It's just some extra info that
an app can't handle, so we hide it from the app...

-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 21:57                 ` Michael S. Tsirkin
@ 2020-03-10 23:13                   ` Willem de Bruijn
  2020-03-11  7:56                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-10 23:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Willem de Bruijn, Network Development, David Miller

On Tue, Mar 10, 2020 at 5:58 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 10, 2020 at 05:35:55PM -0400, Willem de Bruijn wrote:
> > On Tue, Mar 10, 2020 at 5:30 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 10, 2020 at 11:38:16AM -0400, Willem de Bruijn wrote:
> > > > On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > > > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > > > > >
> > > > > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > > > > ring producer index.
> > > > > > > > > >
> > > > > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > > > > of order arrival.
> > > > > > > > > >
> > > > > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > > > > to encountering an unknown GSO type.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > >
> > > > > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > > > > recent UDP and SCTP GSO types.
> > > > > > > > >
> > > > > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > > > > >
> > > > > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > > > > arrive segmented, due to these devices not advertising hardware
> > > > > > > > segmentation for these protocols.
> > > > > > >
> > > > > > > Oh right. That's good then, sorry about the noise.
> > > > > >
> > > > > > Not at all. Thanks for taking a look!
> > > > > >
> > > > > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > > > > new. There too it is limited on egress to devices that do advertise
> > > > > > > > h/w offload. And on r/x to GRO.
> > > > > > > >
> > > > > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > > > > path.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > > > > let the peer at least be aware of failure.
> > > > > > > > > >
> > > > > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > > > > >
> > > > > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > > > > these types to userspace.
> > > > > > > >
> > > > > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > > > > types.
> > > > > > >
> > > > > > > Oh I mean if we do want to add new types and want to pass them to
> > > > > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > > > > knows whether that will or won't confuse userspace.
> > > > > >
> > > > > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > > > > now, just looking at tpacket, a new variant of socket option for
> > > > > > PACKET_VNET_HDR, for every new GSO type?
> > > > >
> > > > > Maybe a single one with a bitmap of legal types?
> > > > >
> > > > > > In practice the userspace I'm aware of, and any sane implementation,
> > > > > > will be future proof to drop and account packets whose type it cannot
> > > > > > process. So I think we can just add new types.
> > > > >
> > > > > Well if packets are just dropped then userspace breaks right?
> > > >
> > > > It is an improvement over the current silent discard in the kernel.
> > > >
> > > > If it can count these packets, userspace becomes notified that it
> > > > should perhaps upgrade or use ethtool to stop the kernel from
> > > > generating certain packets.
> > > >
> > > > Specifically for packet sockets, it wants to receive packets as they
> > > > appear "on the wire". It does not have to drop these today even, but
> > > > can easily parse the headers.
> > > >
> > > > For packet sockets at least, I don't think that we want transparent
> > > > segmentation.
> > >
> > > Well it's GSO is in the way then it's no longer "on the wire", right?
> > > Whether we split these back to individual skbs or we don't
> > > it's individual packets that are on the wire. GSO just allows
> > > passing them to the application in a more efficient way.
> >
> > Not entirely. With TSO enabled, packet sockets will show the TCP TSO
> > packets, not the individual segment on the wire.
>
> But nothing breaks if it shows a segment on the wire while linux
> processes packets in batches, right? It's just some extra info that
> an app can't handle, so we hide it from the app...

I don't entirely follow. Are we on the same page here and agree that
we should just show the GSO packet to userspace?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-10 23:13                   ` Willem de Bruijn
@ 2020-03-11  7:56                     ` Michael S. Tsirkin
  2020-03-11 14:31                       ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-11  7:56 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Tue, Mar 10, 2020 at 07:13:41PM -0400, Willem de Bruijn wrote:
> On Tue, Mar 10, 2020 at 5:58 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 10, 2020 at 05:35:55PM -0400, Willem de Bruijn wrote:
> > > On Tue, Mar 10, 2020 at 5:30 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Mar 10, 2020 at 11:38:16AM -0400, Willem de Bruijn wrote:
> > > > > On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > > > > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > > > > > >
> > > > > > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > > > > > ring producer index.
> > > > > > > > > > >
> > > > > > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > > > > > of order arrival.
> > > > > > > > > > >
> > > > > > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > > > > > to encountering an unknown GSO type.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > >
> > > > > > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > > > > > recent UDP and SCTP GSO types.
> > > > > > > > > >
> > > > > > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > > > > > >
> > > > > > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > > > > > arrive segmented, due to these devices not advertising hardware
> > > > > > > > > segmentation for these protocols.
> > > > > > > >
> > > > > > > > Oh right. That's good then, sorry about the noise.
> > > > > > >
> > > > > > > Not at all. Thanks for taking a look!
> > > > > > >
> > > > > > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > > > > > new. There too it is limited on egress to devices that do advertise
> > > > > > > > > h/w offload. And on r/x to GRO.
> > > > > > > > >
> > > > > > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > > > > > path.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > > > > > let the peer at least be aware of failure.
> > > > > > > > > > >
> > > > > > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > > > > > >
> > > > > > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > > > > > these types to userspace.
> > > > > > > > >
> > > > > > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > > > > > types.
> > > > > > > >
> > > > > > > > Oh I mean if we do want to add new types and want to pass them to
> > > > > > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > > > > > knows whether that will or won't confuse userspace.
> > > > > > >
> > > > > > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > > > > > now, just looking at tpacket, a new variant of socket option for
> > > > > > > PACKET_VNET_HDR, for every new GSO type?
> > > > > >
> > > > > > Maybe a single one with a bitmap of legal types?
> > > > > >
> > > > > > > In practice the userspace I'm aware of, and any sane implementation,
> > > > > > > will be future proof to drop and account packets whose type it cannot
> > > > > > > process. So I think we can just add new types.
> > > > > >
> > > > > > Well if packets are just dropped then userspace breaks right?
> > > > >
> > > > > It is an improvement over the current silent discard in the kernel.
> > > > >
> > > > > If it can count these packets, userspace becomes notified that it
> > > > > should perhaps upgrade or use ethtool to stop the kernel from
> > > > > generating certain packets.
> > > > >
> > > > > Specifically for packet sockets, it wants to receive packets as they
> > > > > appear "on the wire". It does not have to drop these today even, but
> > > > > can easily parse the headers.
> > > > >
> > > > > For packet sockets at least, I don't think that we want transparent
> > > > > segmentation.
> > > >
> > > > Well it's GSO is in the way then it's no longer "on the wire", right?
> > > > Whether we split these back to individual skbs or we don't
> > > > it's individual packets that are on the wire. GSO just allows
> > > > passing them to the application in a more efficient way.
> > >
> > > Not entirely. With TSO enabled, packet sockets will show the TCP TSO
> > > packets, not the individual segment on the wire.
> >
> > But nothing breaks if it shows a segment on the wire while linux
> > processes packets in batches, right? It's just some extra info that
> > an app can't handle, so we hide it from the app...
> 
> I don't entirely follow. Are we on the same page here and agree that
> we should just show the GSO packet to userspace?

So we are talking about a hypothetical case where we add a GSO type,
then a hypothetical userspace that does not know about a specific GSO
type, right? I feel there must be some kind of negotiation
and kernel must detect and show individual packets to such
userspace. In fact, a similar change for the checksum has
broken old dhcp clients and some of them are broken to this
day, with ip tables rules used to work around the issue.


-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-11  7:56                     ` Michael S. Tsirkin
@ 2020-03-11 14:31                       ` Willem de Bruijn
  2020-03-11 21:25                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-11 14:31 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Network Development, David Miller

On Wed, Mar 11, 2020 at 3:56 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 10, 2020 at 07:13:41PM -0400, Willem de Bruijn wrote:
> > On Tue, Mar 10, 2020 at 5:58 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 10, 2020 at 05:35:55PM -0400, Willem de Bruijn wrote:
> > > > On Tue, Mar 10, 2020 at 5:30 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Mar 10, 2020 at 11:38:16AM -0400, Willem de Bruijn wrote:
> > > > > > On Tue, Mar 10, 2020 at 10:44 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Tue, Mar 10, 2020 at 10:16:56AM -0400, Willem de Bruijn wrote:
> > > > > > > > On Tue, Mar 10, 2020 at 8:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Mar 10, 2020 at 08:49:23AM -0400, Willem de Bruijn wrote:
> > > > > > > > > > On Tue, Mar 10, 2020 at 2:43 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Mar 09, 2020 at 11:34:35AM -0400, Willem de Bruijn wrote:
> > > > > > > > > > > > From: Willem de Bruijn <willemb@google.com>
> > > > > > > > > > > >
> > > > > > > > > > > > In one error case, tpacket_rcv drops packets after incrementing the
> > > > > > > > > > > > ring producer index.
> > > > > > > > > > > >
> > > > > > > > > > > > If this happens, it does not update tp_status to TP_STATUS_USER and
> > > > > > > > > > > > thus the reader is stalled for an iteration of the ring, causing out
> > > > > > > > > > > > of order arrival.
> > > > > > > > > > > >
> > > > > > > > > > > > The only such error path is when virtio_net_hdr_from_skb fails due
> > > > > > > > > > > > to encountering an unknown GSO type.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > >
> > > > > > > > > > > > I wonder whether it should drop packets with unknown GSO types at all.
> > > > > > > > > > > > This consistently blinds the reader to certain packets, including
> > > > > > > > > > > > recent UDP and SCTP GSO types.
> > > > > > > > > > >
> > > > > > > > > > > Ugh it looks like you have found a bug.  Consider a legacy userspace -
> > > > > > > > > > > it was actually broken by adding USD and SCTP GSO.  I suspect the right
> > > > > > > > > > > thing to do here is actually to split these packets up, not drop them.
> > > > > > > > > >
> > > > > > > > > > In the main virtio users, virtio_net/tun/tap, the packets will always
> > > > > > > > > > arrive segmented, due to these devices not advertising hardware
> > > > > > > > > > segmentation for these protocols.
> > > > > > > > >
> > > > > > > > > Oh right. That's good then, sorry about the noise.
> > > > > > > >
> > > > > > > > Not at all. Thanks for taking a look!
> > > > > > > >
> > > > > > > > > > So the issue is limited to users of tpacket_rcv, which is relatively
> > > > > > > > > > new. There too it is limited on egress to devices that do advertise
> > > > > > > > > > h/w offload. And on r/x to GRO.
> > > > > > > > > >
> > > > > > > > > > The UDP GSO issue precedes the fraglist GRO patch, by the way, and
> > > > > > > > > > goes back to my (argh!) introduction of the feature on the egress
> > > > > > > > > > path.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > The peer function virtio_net_hdr_to_skb already drops any packets with
> > > > > > > > > > > > unknown types, so it should be fine to add an SKB_GSO_UNKNOWN type and
> > > > > > > > > > > > let the peer at least be aware of failure.
> > > > > > > > > > > >
> > > > > > > > > > > > And possibly add SKB_GSO_UDP_L4 and SKB_GSO_SCTP types to virtio too.
> > > > > > > > > > >
> > > > > > > > > > > This last one is possible for sure, but for virtio_net_hdr_from_skb
> > > > > > > > > > > we'll need more flags to know whether it's safe to pass
> > > > > > > > > > > these types to userspace.
> > > > > > > > > >
> > > > > > > > > > Can you elaborate? Since virtio_net_hdr_to_skb users already returns
> > > > > > > > > > -EINVAL on unknown GSO types and its callers just drop these packets,
> > > > > > > > > > it looks to me that the infra is future proof wrt adding new GSO
> > > > > > > > > > types.
> > > > > > > > >
> > > > > > > > > Oh I mean if we do want to add new types and want to pass them to
> > > > > > > > > users, then virtio_net_hdr_from_skb will need to flag so it
> > > > > > > > > knows whether that will or won't confuse userspace.
> > > > > > > >
> > > > > > > > I'm not sure how that would work. Ignoring other tun/tap/virtio for
> > > > > > > > now, just looking at tpacket, a new variant of socket option for
> > > > > > > > PACKET_VNET_HDR, for every new GSO type?
> > > > > > >
> > > > > > > Maybe a single one with a bitmap of legal types?
> > > > > > >
> > > > > > > > In practice the userspace I'm aware of, and any sane implementation,
> > > > > > > > will be future proof to drop and account packets whose type it cannot
> > > > > > > > process. So I think we can just add new types.
> > > > > > >
> > > > > > > Well if packets are just dropped then userspace breaks right?
> > > > > >
> > > > > > It is an improvement over the current silent discard in the kernel.
> > > > > >
> > > > > > If it can count these packets, userspace becomes notified that it
> > > > > > should perhaps upgrade or use ethtool to stop the kernel from
> > > > > > generating certain packets.
> > > > > >
> > > > > > Specifically for packet sockets, it wants to receive packets as they
> > > > > > appear "on the wire". It does not have to drop these today even, but
> > > > > > can easily parse the headers.
> > > > > >
> > > > > > For packet sockets at least, I don't think that we want transparent
> > > > > > segmentation.
> > > > >
> > > > > Well it's GSO is in the way then it's no longer "on the wire", right?
> > > > > Whether we split these back to individual skbs or we don't
> > > > > it's individual packets that are on the wire. GSO just allows
> > > > > passing them to the application in a more efficient way.
> > > >
> > > > Not entirely. With TSO enabled, packet sockets will show the TCP TSO
> > > > packets, not the individual segment on the wire.
> > >
> > > But nothing breaks if it shows a segment on the wire while linux
> > > processes packets in batches, right? It's just some extra info that
> > > an app can't handle, so we hide it from the app...
> >
> > I don't entirely follow. Are we on the same page here and agree that
> > we should just show the GSO packet to userspace?
>
> So we are talking about a hypothetical case where we add a GSO type,
> then a hypothetical userspace that does not know about a specific GSO
> type, right? I feel there must be some kind of negotiation
> and kernel must detect and show individual packets to such
> userspace. In fact, a similar change for the checksum has
> broken old dhcp clients and some of them are broken to this
> day, with ip tables rules used to work around the issue.

Interesting. Had to look that up. [1] summarizes it well. The issue is
with virtio-net devices advertising tx + rx checksum offload and thus
passing packets between VMs on the same host with bogus checksum
field. And the DHCP process reading such a packet, trying to verify
the checksum, failing and dropping.

Can we separate the virtio/tun/tap and packet socket use-cases of
virtio_net_hdr?

I would expect packet sockets to behave the same with and without
po->has_vnet_hdr. Without, they already pass all GSO packets up to
userspace as is. Which is essential for debugging with tcpdump or
wirehark. I always interpreted has_vnet_hdr as just an option to
receive more metadata along, akin to PACKET_AUXDATA. Not something
that subtly changes the packet flow.

That was my intend, but I only extended it to tpacket_rcv. Reading up
on the original feature that was added for packet_rcv, it does mention
"allows GSO/checksum offload to be enabled when using raw socket
backend with virtio_net". I don't know what that raw socket back-end
with virtio-net is. Something deprecated, but possibly still in use
somewhere?

For virtio/tun/tap, as long as we do not add NETIF_F_GSO_.. options to
dev->features and dev->hw_features, we have nothing to worry about.
Eventually we probably do want to add opt-in gso support. And I think
this ethtool negotiation will suffice. We just have to add new
VIRTIO_NET_HDR_GSO_.. types and update virtio_net_hdr_{from, to}_skb
before expanding dev->features.

[1] https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/930962/comments/5

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-11 14:31                       ` Willem de Bruijn
@ 2020-03-11 21:25                         ` Michael S. Tsirkin
  2020-03-11 21:49                           ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-11 21:25 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller

On Wed, Mar 11, 2020 at 10:31:47AM -0400, Willem de Bruijn wrote:
> I would expect packet sockets to behave the same with and without
> po->has_vnet_hdr. Without, they already pass all GSO packets up to
> userspace as is. Which is essential for debugging with tcpdump or
> wirehark. I always interpreted has_vnet_hdr as just an option to
> receive more metadata along, akin to PACKET_AUXDATA. Not something
> that subtly changes the packet flow.
>

So again just making sure:

	we are talking about a hypothetical case where we add a GSO type,
	then a hypothetical userspace that does not know about a specific GSO
	type, right?

I feel if someone writes a program with packet sockets, it is
important that it keeps working, and that means keep seeing
all packets, even if someone runs it on a new kernel
with a new optimization like gso. I feel dropping packets is
worse than changing gso type.

And that in turn means userspace must opt in to
seeing new GSO type, and old userspace must see old ones.

One way to do that would be converting packets on the socket, another
would be disabling the new GSO automatically as the socket is created
unless it opts in.

> That was my intend, but I only extended it to tpacket_rcv. Reading up
> on the original feature that was added for packet_rcv, it does mention
> "allows GSO/checksum offload to be enabled when using raw socket
> backend with virtio_net". I don't know what that raw socket back-end
> with virtio-net is. Something deprecated, but possibly still in use
> somewhere?

Pretty much. E.g. I still sometimes use it with an out of tree QEMU
patch - maybe I'll try to re-post it there just so we have an upstream
way to test the interface.

-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-11 21:25                         ` Michael S. Tsirkin
@ 2020-03-11 21:49                           ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2020-03-11 21:49 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Network Development, David Miller

On Wed, Mar 11, 2020 at 5:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 11, 2020 at 10:31:47AM -0400, Willem de Bruijn wrote:
> > I would expect packet sockets to behave the same with and without
> > po->has_vnet_hdr. Without, they already pass all GSO packets up to
> > userspace as is. Which is essential for debugging with tcpdump or
> > wirehark. I always interpreted has_vnet_hdr as just an option to
> > receive more metadata along, akin to PACKET_AUXDATA. Not something
> > that subtly changes the packet flow.
> >
>
> So again just making sure:
>
>         we are talking about a hypothetical case where we add a GSO type,
>         then a hypothetical userspace that does not know about a specific GSO
>         type, right?

I suppose we're talking about two cases:

- what to do about the two gso types that have already been added, and
are now dropped
- what to do about future gso types.

As for userspace, yeah, I don't know of any pf_packet programs that
are not robust against unknown gso_type. But you may know better on
this front :)

> I feel if someone writes a program with packet sockets, it is
> important that it keeps working, and that means keep seeing
> all packets,

Agreed

> even if someone runs it on a new kernel
> with a new optimization like gso. I feel dropping packets is
> worse than changing gso type.

Agreed

> And that in turn means userspace must opt in to
> seeing new GSO type, and old userspace must see old ones.

This is where I'm inclined to disagree. But that does depend on
whether the fragile legacy applications are hypothetical or not.

> One way to do that would be converting packets on the socket,

Then packet sockets are really showing something else than what they're reading.

It may be a necessary evil for legacy applications, but I really don't
like either the basic idea or that it differs from packet sockets
without po->has_vnet_hdr.

> another
> would be disabling the new GSO automatically as the socket is created
> unless it opts in.

Unfortunately that's not possible for packet sockets that are not
bound to a specific device.

> > That was my intend, but I only extended it to tpacket_rcv. Reading up
> > on the original feature that was added for packet_rcv, it does mention
> > "allows GSO/checksum offload to be enabled when using raw socket
> > backend with virtio_net". I don't know what that raw socket back-end
> > with virtio-net is. Something deprecated, but possibly still in use
> > somewhere?
>
> Pretty much. E.g. I still sometimes use it with an out of tree QEMU
> patch - maybe I'll try to re-post it there just so we have an upstream
> way to test the interface.

If you have a pointer, that would help me understand this case, thanks.

Segmentation on-demand, if we have to do that similar to the previous
checksum fix up (thought that was opt-in, not set as default, btw),
will not be entirely trivial code-wise, but doable.

As for handling a full socket partially through a segment list, I
guess that is no different from drops any other time, so that at least
will not be a problem.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-09 15:34 [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop Willem de Bruijn
  2020-03-09 15:42 ` Willem de Bruijn
  2020-03-10  6:43 ` Michael S. Tsirkin
@ 2020-03-12  6:13 ` David Miller
  2020-03-12  6:27   ` Michael S. Tsirkin
  2 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2020-03-12  6:13 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: netdev, mst, willemb

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Mon,  9 Mar 2020 11:34:35 -0400

> From: Willem de Bruijn <willemb@google.com>
> 
> In one error case, tpacket_rcv drops packets after incrementing the
> ring producer index.
> 
> If this happens, it does not update tp_status to TP_STATUS_USER and
> thus the reader is stalled for an iteration of the ring, causing out
> of order arrival.
> 
> The only such error path is when virtio_net_hdr_from_skb fails due
> to encountering an unknown GSO type.
> 
> Signed-off-by: Willem de Bruijn <willemb@google.com>

I'm applying this, as it fixes the ring state management in this case.

The question of what we should actually be doing for unknown GSO types
is a separate discussion.

Thanks Willem.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
  2020-03-12  6:13 ` David Miller
@ 2020-03-12  6:27   ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2020-03-12  6:27 UTC (permalink / raw)
  To: David Miller; +Cc: willemdebruijn.kernel, netdev, willemb

On Wed, Mar 11, 2020 at 11:13:27PM -0700, David Miller wrote:
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Mon,  9 Mar 2020 11:34:35 -0400
> 
> > From: Willem de Bruijn <willemb@google.com>
> > 
> > In one error case, tpacket_rcv drops packets after incrementing the
> > ring producer index.
> > 
> > If this happens, it does not update tp_status to TP_STATUS_USER and
> > thus the reader is stalled for an iteration of the ring, causing out
> > of order arrival.
> > 
> > The only such error path is when virtio_net_hdr_from_skb fails due
> > to encountering an unknown GSO type.
> > 
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> 
> I'm applying this, as it fixes the ring state management in this case.
> 
> The question of what we should actually be doing for unknown GSO types
> is a separate discussion.
> 
> Thanks Willem.

Absolutely, I agree

Acked-by: Michael S. Tsirkin <mst@redhat.com>


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-03-12  6:28 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-09 15:34 [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop Willem de Bruijn
2020-03-09 15:42 ` Willem de Bruijn
2020-03-09 15:50   ` Willem de Bruijn
2020-03-10  6:46     ` Michael S. Tsirkin
2020-03-10  6:43 ` Michael S. Tsirkin
2020-03-10 12:49   ` Willem de Bruijn
2020-03-10 12:59     ` Michael S. Tsirkin
2020-03-10 14:16       ` Willem de Bruijn
2020-03-10 14:43         ` Michael S. Tsirkin
2020-03-10 15:38           ` Willem de Bruijn
2020-03-10 16:14             ` Willem de Bruijn
2020-03-10 21:29             ` Michael S. Tsirkin
2020-03-10 21:35               ` Willem de Bruijn
2020-03-10 21:57                 ` Michael S. Tsirkin
2020-03-10 23:13                   ` Willem de Bruijn
2020-03-11  7:56                     ` Michael S. Tsirkin
2020-03-11 14:31                       ` Willem de Bruijn
2020-03-11 21:25                         ` Michael S. Tsirkin
2020-03-11 21:49                           ` Willem de Bruijn
2020-03-12  6:13 ` David Miller
2020-03-12  6:27   ` Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.