Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Network Development <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>
Subject: Re: [PATCH net] net/packet: tpacket_rcv: do not increment ring index on drop
Date: Wed, 11 Mar 2020 17:49:52 -0400	[thread overview]
Message-ID: <CA+FuTSeMRBG4GyQFjtk3+SuxZX+aPGUezuhwk7yumFq1CHyu5A@mail.gmail.com> (raw)
In-Reply-To: <20200311171634-mutt-send-email-mst@kernel.org>

On Wed, Mar 11, 2020 at 5:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 11, 2020 at 10:31:47AM -0400, Willem de Bruijn wrote:
> > I would expect packet sockets to behave the same with and without
> > po->has_vnet_hdr. Without, they already pass all GSO packets up to
> > userspace as is. Which is essential for debugging with tcpdump or
> > wirehark. I always interpreted has_vnet_hdr as just an option to
> > receive more metadata along, akin to PACKET_AUXDATA. Not something
> > that subtly changes the packet flow.
> >
>
> So again just making sure:
>
>         we are talking about a hypothetical case where we add a GSO type,
>         then a hypothetical userspace that does not know about a specific GSO
>         type, right?

I suppose we're talking about two cases:

- what to do about the two gso types that have already been added, and
are now dropped
- what to do about future gso types.

As for userspace, yeah, I don't know of any pf_packet programs that
are not robust against unknown gso_type. But you may know better on
this front :)

> I feel if someone writes a program with packet sockets, it is
> important that it keeps working, and that means keep seeing
> all packets,

Agreed

> even if someone runs it on a new kernel
> with a new optimization like gso. I feel dropping packets is
> worse than changing gso type.

Agreed

> And that in turn means userspace must opt in to
> seeing new GSO type, and old userspace must see old ones.

This is where I'm inclined to disagree. But that does depend on
whether the fragile legacy applications are hypothetical or not.

> One way to do that would be converting packets on the socket,

Then packet sockets are really showing something else than what they're reading.

It may be a necessary evil for legacy applications, but I really don't
like either the basic idea or that it differs from packet sockets
without po->has_vnet_hdr.

> another
> would be disabling the new GSO automatically as the socket is created
> unless it opts in.

Unfortunately that's not possible for packet sockets that are not
bound to a specific device.

> > That was my intend, but I only extended it to tpacket_rcv. Reading up
> > on the original feature that was added for packet_rcv, it does mention
> > "allows GSO/checksum offload to be enabled when using raw socket
> > backend with virtio_net". I don't know what that raw socket back-end
> > with virtio-net is. Something deprecated, but possibly still in use
> > somewhere?
>
> Pretty much. E.g. I still sometimes use it with an out of tree QEMU
> patch - maybe I'll try to re-post it there just so we have an upstream
> way to test the interface.

If you have a pointer, that would help me understand this case, thanks.

Segmentation on-demand, if we have to do that similar to the previous
checksum fix up (thought that was opt-in, not set as default, btw),
will not be entirely trivial code-wise, but doable.

As for handling a full socket partially through a segment list, I
guess that is no different from drops any other time, so that at least
will not be a problem.