On Fri, 2021-06-25 at 10:55 +0800, Jason Wang wrote:
> On 2021/6/24 at 6:42 PM, David Woodhouse wrote:
> > On Thu, 2021-06-24 at 14:12 +0800, Jason Wang wrote:
> > > On 2021/6/24 at 12:12 AM, David Woodhouse wrote:
> > > > We *should* eventually expand this test case to attach an AF_PACKET
> > > > device to the vhost-net, instead of using a tun device as the back end.
> > > > (Although I don't really see *why* vhost is limited to AF_PACKET. Why
> > > > *can't* I attach anything else, like an AF_UNIX socket, to vhost-net?)
> > >
> > > It's just because nobody wrote the code. And we lack a real use
> > > case.
> >
> > Hm, what code?
>
> The code to support AF_UNIX.
>
> > For AF_PACKET I haven't actually spotted that there *is* any.
>
> vhost_net has had this support for more than 10 years. It's hard to say
> there's no user for that.

I wasn't saying I hadn't spotted the use case. I hadn't spotted the
*code* which is in af_packet to support vhost. But...

> > As I've been refactoring the interaction between vhost and tun/tap, and
> > fixing it up for different vhdr lengths, PI, and (just now) frowning in
> > horror at the concept that tun and vhost can have *different*
> > endiannesses, I hadn't spotted that there was anything special on the
> > packet socket.
>
> Vnet header support.

... I have no idea how I failed to spot that. OK, so AF_PACKET sockets
can *optionally* support the case where *they* provide the
virtio_net_hdr, instead of vhost doing it, or there being none. But any
other sockets would work for the "vhost does it" or the "no vhdr" case.

... and I need to fix my 'get sock_hlen from the underlying tun/tap
device' patch to *not* assume that sock_hlen is zero for a raw socket;
it needs to check the PACKET_VNET_HDR sockopt. And *that* was broken
for the VERSION_1|MRG_RXBUF case before I came along, wasn't it?
Because vhost would have assumed sock_hlen to be 12 bytes, while in
AF_PACKET it's always only 10?
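
To make the 12-versus-10 question concrete, here is a minimal userspace
sketch of the arithmetic. It is not the kernel code: the struct names
and the helpers vhost_sock_hlen(), packet_sock_hlen() and
sock_hlen_matches() are hypothetical, the structs are local mirrors of
the virtio-net header layouts, and the "AF_PACKET always supplies 10
bytes" behaviour is taken from the description above rather than
checked against any particular kernel version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the basic virtio-net header layout (illustrative). */
struct vnet_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Same header followed by num_buffers, as used with mergeable rx buffers. */
struct vnet_hdr_mrg_rxbuf {
	struct vnet_hdr hdr;
	uint16_t num_buffers;
};

_Static_assert(sizeof(struct vnet_hdr) == 10, "basic header is 10 bytes");
_Static_assert(sizeof(struct vnet_hdr_mrg_rxbuf) == 12, "mrg header is 12 bytes");

/* Feature bit numbers as defined by the virtio specification. */
#define F_MRG_RXBUF 15	/* VIRTIO_NET_F_MRG_RXBUF */
#define F_VERSION_1 32	/* VIRTIO_F_VERSION_1 */

/*
 * Hypothetical helper: the header length vhost expects the backend
 * socket to carry when vhost is not adding the header itself.
 */
static size_t vhost_sock_hlen(uint64_t features)
{
	uint64_t wide = (1ULL << F_MRG_RXBUF) | (1ULL << F_VERSION_1);

	return (features & wide) ? sizeof(struct vnet_hdr_mrg_rxbuf)
				 : sizeof(struct vnet_hdr);
}

/*
 * Hypothetical helper: the header length a packet socket supplies.
 * has_vnet_hdr stands for what the PACKET_VNET_HDR sockopt reports.
 * Assumption (from the discussion above): it is the 10-byte header or nothing.
 */
static size_t packet_sock_hlen(bool has_vnet_hdr)
{
	return has_vnet_hdr ? sizeof(struct vnet_hdr) : 0;
}

/* True when both sides agree on how many header bytes precede each packet. */
static bool sock_hlen_matches(uint64_t features, bool has_vnet_hdr)
{
	return vhost_sock_hlen(features) == packet_sock_hlen(has_vnet_hdr);
}
```

With no features negotiated, both sides agree on 10 bytes. With
VERSION_1 or MRG_RXBUF set, the sketch reports a mismatch (12 versus
10), which is the case being asked about.
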
> > For that case, sock_hlen is just zero and we
> > send/receive plain packets... or so I thought? Did I miss something?
>
> With vnet header, it can have GSO and csum offload.
>
> > As far as I was aware, that ought to have worked with any datagram
> > socket. I was pondering not just AF_UNIX but also UDP (since that's my
> > main transport for VPN data, at least in the case where I care about
> > performance).
>
> My understanding is that vhost_net was designed for accelerating the
> virtio datapath, which is mainly used for VMs (L2 traffic). So all
> kinds of TAPs (tuntap, macvtap or packet socket) are the main users.
> If you check git history, vhost could not be enabled without KVM until
> sometime last year. So I confess it could serve more general use
> cases, and we have already had some discussions. But it's hard to say
> it's worth doing, since it would become a re-invention of io_uring?

Yeah, ultimately I'm not sure that's worth exploring. As I said, I was
looking for something that works on *current* kernels. Which means no
io_uring on the underlying tun socket, and no vhost on UDP. If I want
to go and implement *both* ring protocols in userspace and make use of
each of them on the socket that they do support, I can do that. Yay! :)

If I'm going to require new kernels, then I should just work on the
"ideal" data path which doesn't really involve userspace at all. But we
should probably take that discussion to a separate thread.