From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20220526124338.36247-1-eperezma@redhat.com> <20220527065442-mutt-send-email-mst@kernel.org>
From: Jason Wang
Date: Wed, 15 Jun 2022 09:28:49 +0800
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation
To: Parav Pandit
Cc: "Michael S. Tsirkin", Eugenio Pérez, "kvm@vger.kernel.org",
 "virtualization@lists.linux-foundation.org",
 "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org",
 "martinh@xilinx.com", Stefano Garzarella, "martinpo@xilinx.com",
 "lvivier@redhat.com", "pabloc@xilinx.com", Eli Cohen, Dan Carpenter,
 Xie Yongji, Christophe JAILLET, Zhang Min, Wu Zongyong,
 "lulu@redhat.com", Zhu Lingshan, "Piotr.Uminski@intel.com", Si-Wei Liu,
 "ecree.xilinx@gmail.com", "gautam.dawar@amd.com",
 "habetsm.xilinx@gmail.com", "tanuj.kamde@amd.com", "hanand@xilinx.com",
 "dinang@xilinx.com", Longpeng
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 15, 2022 at 8:10 AM Parav Pandit wrote:
>
> > From: Jason Wang
> > Sent: Wednesday, June 1, 2022 11:54 PM
> >
> > On Thu, Jun 2, 2022 at 10:59 AM Parav Pandit wrote:
> > >
> > > > From: Jason Wang
> > > > Sent: Wednesday, June 1, 2022 10:00 PM
> > > >
> > > > On Thu, Jun 2, 2022 at 2:58 AM Parav Pandit wrote:
> > > > >
> > > > > > From: Jason Wang
> > > > > > Sent: Tuesday, May 31, 2022 10:42 PM
> > > > > >
> > > > > > Well, the ability to query the virtqueue state was proposed as
> > > > > > another feature (Eugenio, please correct me). This should be
> > > > > > sufficient for making virtio-net to be live migrated.
> > > > > >
> > > > > The device is stopped, it won't answer to this special vq
> > > > > config done here.
> > > >
> > > > This depends on the definition of the stop. Any query to the device
> > > > state should be allowed otherwise it's meaningless for us.
> > > >
> > > > > Programming all of these using cfg registers doesn't scale for
> > > > > on-chip memory and for the speed.
> > > >
> > > > Well, they are orthogonal and what I want to say is, we should first
> > > > define the semantics of stop and state of the virtqueue.
> > > >
> > > > Such a facility could be accessed by either transport specific
> > > > method or admin virtqueue, it totally depends on the hardware
> > > > architecture of the vendor.
> > > >
> > > I find it hard to believe that a vendor can implement a CVQ but not
> > > AQ and chose to expose tens of hundreds of registers.
> > > But maybe, it fits some specific hw.
> >
> > You can have a look at the ifcvf dpdk driver as an example.
> >
> Ifcvf is an example of using registers.
> It is not an answer why AQ is hard for it. :)

Well, it's an example of how vDPA is implemented. I think we agree
that for vDPA, vendors have the flexibility to implement their
preferred datapath.

> virtio spec has definition of queue now and implementing yet another
> queue shouldn't be a problem.
>
> So far no one seem to have problem with the additional queue.
> So I take it as AQ is ok.
>
> > But another thing that is unrelated to hardware architecture is the nesting
> > support. Having admin virtqueue in a nesting environment looks like an
> > overkill. Presenting a register in L1 and map it to L0's admin should be good
> > enough.
> So may be a optimized interface can be added that fits nested env.
> At this point in time real users that we heard are interested in
> non-nested use cases. Let's enable them first.

That's fine. For nesting, it's actually really easy: just adding an
interface within the existing transport should be sufficient.

>
> > >
> > > I like to learn the advantages of such method other than simplicity.
> > >
> > > We can clearly that we are shifting away from such PCI registers
> > > with SIOV, IMS and other scalable solutions.
> > > virtio drifting in reverse direction by introducing more registers
> > > as transport.
> > > I expect it to an optional transport like AQ.
> >
> > Actually, I had a proposal of using admin virtqueue as a transport, it's
> > designed to be SIOV/IMS capable. And it's not hard to extend it with the
> > state/stop support etc.
> >
> > > >
> > > > > Next would be to program hundreds of statistics of the 64 VQs
> > > > > through a giant PCI config space register in some busy polling
> > > > > scheme.
> > > >
> > > > We don't need giant config space, and this method has been
> > > > implemented by some vDPA vendors.
> > > >
> > > There are tens of 64-bit counters per VQs. These needs to programmed
> > > on destination side.
> > > Programming these via registers requires exposing them on the registers.
> > > In one of the proposals, I see them being queried via CVQ from the device.
> >
> > I didn't see a proposal like this. And I don't think querying general
> > virtio state like idx with a device specific CVQ is a good design.
> >
> My example was not for the idx. But for VQ statistics that is queried via CVQ.
>
> > >
> > > Programming them via cfg registers requires large cfg space or
> > > synchronous programming until receiving ACK from it.
> > > This means one entry at a time...
> > >
> > > Programming them via CVQ needs replicate and align cmd values etc
> > > on all device types. All duplicate and hard to maintain.
> > >
> > > >
> > > > > I can clearly see how all these are inefficient for faster LM.
> > > > > We need an efficient AQ to proceed with at minimum.
> > > >
> > > > I'm fine with admin virtqueue, but the stop and state are
> > > > orthogonal to that.
> > > > And using admin virtqueue for stop/state will be more natural if we
> > > > use admin virtqueue as a transport.
> > > Ok.
> > > We should have defined it bit earlier that all vendors can use. :(
> >
> > I agree.
>
> I remember few months back, you acked in the weekly meeting that TC
> has approved the AQ direction.
> And we are still in this circle of debating the AQ.

I think not. Just to make sure we are on the same page, the proposal
here is for vDPA, and we hope it can provide forward compatibility to
virtio. So in the context of vDPA, admin virtqueue is not a must.

Thanks