From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 17 Aug 2022 02:54:33 -0400
From: "Michael S. Tsirkin"
To: Bobby Eshleman
Cc: Bobby Eshleman, Bobby Eshleman, Cong Wang, Jiang Wang,
 Stefan Hajnoczi, Stefano Garzarella, Jason Wang, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, "K. Y. Srinivasan",
 Haiyang Zhang, Stephen Hemminger, Wei Liu, Dexuan Cui,
 kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-hyperv@vger.kernel.org
Subject: Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
Message-ID: <20220817025250-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
> Hey everybody,
>
> This series introduces datagrams, packet scheduling, and sk_buff usage
> to virtio vsock.
>
> The usage of struct sk_buff benefits users by a) preparing vsock to use
> other related systems that require sk_buff, such as sockmap and qdisc,
> b) supporting basic congestion control via sock_alloc_send_skb, and c)
> reducing copying when delivering packets to TAP.
>
> The socket layer no longer forces errors to be -ENOMEM, as typically
> userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
> messages are being sent with option MSG_DONTWAIT.
>
> The datagram work is based off previous patches by Jiang Wang[1].
>
> The introduction of datagrams creates a transport layer fairness issue
> where datagrams may freely starve streams of queue access. This happens
> because, unlike streams, datagrams lack the transactions necessary for
> calculating credits and throttling.
>
> Previous proposals introduce changes to the spec to add an additional
> virtqueue pair for datagrams[1]. Although this solution works, using
> Linux's qdisc for packet scheduling leverages already existing systems,
> avoids the need to change the virtio specification, and gives additional
> capabilities. The usage of SFQ or fq_codel, for example, may solve the
> transport layer starvation problem. It is easy to imagine other use
> cases as well. For example, services of varying importance may be
> assigned different priorities, and qdisc will apply appropriate
> priority-based scheduling. By default, the system default pfifo qdisc is
> used. The qdisc may be bypassed and legacy queuing is resumed by simply
> setting the virtio-vsock%d network device to state DOWN. This technique
> still allows vsock to work with zero-configuration.

The basic question to answer then is this: with a net device qdisc
etc in the picture, how is this different from virtio net then?
Why do you still want to use vsock?

> In summary, this series introduces these major changes to vsock:
>
> - virtio vsock supports datagrams
> - virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
>   - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
>     which applies the throttling threshold sk_sndbuf.
> - The vsock socket layer supports returning errors other than -ENOMEM.
>   - This is used to return -EAGAIN when the sk_sndbuf threshold is
>     reached.
> - virtio vsock uses a net_device, through which qdisc may be used.
>   - qdisc allows scheduling policies to be applied to vsock flows.
>   - Some qdiscs, like SFQ, may allow vsock to avoid transport layer congestion. That is,
>     it may avoid datagrams from flooding out stream flows. The benefit
>     to this is that additional virtqueues are not needed for datagrams.
>   - The net_device and qdisc is bypassed by simply setting the
>     net_device state to DOWN.
>
> [1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
>
> Bobby Eshleman (5):
>   vsock: replace virtio_vsock_pkt with sk_buff
>   vsock: return errors other than -ENOMEM to socket
>   vsock: add netdev to vhost/virtio vsock
>   virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
>   virtio/vsock: add support for dgram
>
> Jiang Wang (1):
>   vsock_test: add tests for vsock dgram
>
>  drivers/vhost/vsock.c                   | 238 ++++----
>  include/linux/virtio_vsock.h            |  73 ++-
>  include/net/af_vsock.h                  |   2 +
>  include/uapi/linux/virtio_vsock.h       |   2 +
>  net/vmw_vsock/af_vsock.c                |  30 +-
>  net/vmw_vsock/hyperv_transport.c        |   2 +-
>  net/vmw_vsock/virtio_transport.c        | 237 +++++--
>  net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++-------
>  net/vmw_vsock/vmci_transport.c          |   9 +-
>  net/vmw_vsock/vsock_loopback.c          |  51 +-
>  tools/testing/vsock/util.c              | 105 ++++
>  tools/testing/vsock/util.h              |   4 +
>  tools/testing/vsock/vsock_test.c        | 195 ++++++
>  13 files changed, 1176 insertions(+), 543 deletions(-)
>
> --
> 2.35.1
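
For readers following the -EAGAIN discussion in the quoted cover letter,
here is a minimal userspace sketch of the behaviour it describes: a
non-blocking send (MSG_DONTWAIT) that fails with EAGAIN once the send
buffer or receive queue is full. It is not taken from the series. It
assumes a Linux host and uses an AF_UNIX datagram socketpair as a
stand-in, since vsock datagrams need the patches under review. The
helper names try_send and fill_until_full are hypothetical.

```python
import socket


def try_send(sock, payload):
    """Send one datagram without blocking.

    Returns True if the datagram was queued, False if the kernel
    reported EAGAIN/EWOULDBLOCK and the caller should retry later.
    """
    try:
        sock.send(payload, socket.MSG_DONTWAIT)
        return True
    except BlockingIOError:
        # Python maps EAGAIN/EWOULDBLOCK to BlockingIOError.
        return False


def fill_until_full(sock, payload, limit=100000):
    """Count how many sends succeed before the socket pushes back."""
    sent = 0
    while sent < limit and try_send(sock, payload):
        sent += 1
    return sent
```

A caller would typically create the socket pair, call
fill_until_full(a, b"x" * 1024) to see where the kernel starts pushing
back, then drain the peer with recv() before retrying. With the series
applied, the same pattern would presumably apply to an AF_VSOCK
datagram socket, but that is an assumption and is not tested here.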