From: Anthony Liguori
Subject: Re: updated: kvm networking todo wiki
Date: Wed, 29 May 2013 08:01:03 -0500
Message-ID: <87k3miq6sw.fsf@codemonkey.ws>
In-Reply-To: <87li6yodgq.fsf@rustcorp.com.au>
References: <20130523085034.GA16142@redhat.com> <519F35B7.6010408@redhat.com> <20130524113542.GA7046@redhat.com> <8738tctrox.fsf@codemonkey.ws> <20130524140024.GA12024@redhat.com> <87li6yodgq.fsf@rustcorp.com.au>
To: Rusty Russell, "Michael S. Tsirkin"
Cc: Krishna Kumar2, lmr@redhat.com, "Xin, Xiaohui", kvm@vger.kernel.org, netdev@vger.kernel.org, Shirley Ma, virtualization@lists.linux-foundation.org, David Stevens, qemu-devel@nongnu.org, vyasevic@redhat.com, herbert@gondor.hengli.com.au, jdike@linux.intel.com, sri@linux.vnet.ibm.com

Rusty Russell writes:

> "Michael S. Tsirkin" writes:
>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>> "Michael S. Tsirkin" writes:
>>>
>>> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
>>> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
>>> >> > Hey guys,
>>> >> > I've updated the kvm networking todo wiki with current projects.
>>> >> > Will try to keep it up to date more often.
>>> >> > Original announcement below.
>>> >>
>>> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
>>> >>
>>> >> btw. I notice the virtio-net data plane was missed in the wiki. Is the
>>> >> project still being considered?
>>> >
>>> > It might have been interesting several years ago, but now that linux has
>>> > vhost-net in kernel, the only point seems to be to
>>> > speed up networking on non-linux hosts.
>>>
>>> Data plane just means having a dedicated thread for virtqueue processing
>>> that doesn't hold qemu_mutex.
>>>
>>> Of course we're going to do this in QEMU. It's a no brainer. But not
>>> as a separate device, just as an improvement to the existing userspace
>>> virtio-net.
>>>
>>> > Since non-linux does not have kvm, I doubt virtio is a bottleneck.
>>>
>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>> backend with virtio-net in QEMU being what's guest facing.
>>>
>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>> line of defense against a malicious guest while still getting the
>>> performance advantages of vhost-net (zero-copy).
>>
>> Great idea, that sounds very interesting.
>>
>> I'll add it to the wiki.
>>
>> In fact a bit of complexity in vhost was put there in the vague hope to
>> support something like this: virtio rings are not translated through
>> regular memory tables, instead, vhost gets a pointer to ring address.
>>
>> This allows qemu acting as a man in the middle,
>> verifying the descriptors but not touching the
>>
>> Anyone interested in working on such a project?
>
> It would be an interesting idea if we didn't already have the vhost
> model where we don't need the userspace bounce.

The model is very interesting for QEMU because then we can use vhost as
a backend for other types of network adapters (like vmxnet3 or even
e1000).

It also helps for things like fault tolerance where we need to be able
to control packet flow within QEMU.

Regards,

Anthony Liguori

> We already have two
> sets of host side ring code in the kernel (vhost and vringh, though
> they're being unified).
>
> All an accelerator can offer on the tx side is zero copy and direct
> update of the used ring. On rx userspace could register the buffers and
> the accelerator could fill them and update the used ring. It still
> needs to deal with merged buffers, for example.
>
> You avoid the address translation in the kernel, but I'm not convinced
> that's a key problem.
>
> Cheers,
> Rusty.
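
For readers following the "man in the middle" idea quoted above (QEMU verifying descriptors before a backend such as vhost consumes them), here is a minimal sketch of what such a check could look like. It is not QEMU or vhost code. The descriptor layout and the two flag values follow the virtio split-ring format; the helper name validate_chain, the flat guest_mem_size bound, and the decision to ignore endianness and indirect descriptors are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values from the virtio split-ring descriptor format. */
#define VRING_DESC_F_NEXT  1u  /* chain continues at .next */
#define VRING_DESC_F_WRITE 2u  /* buffer is writable by the device */

/* One entry of the descriptor table (host byte order assumed here). */
struct vring_desc {
    uint64_t addr;   /* guest-physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;
    uint16_t next;   /* index of the next descriptor if F_NEXT is set */
};

/*
 * Hypothetical helper: walk the chain starting at `head` and return the
 * number of descriptors in it, or -1 if the chain is malformed.
 * Guest memory is modelled as one flat range [0, guest_mem_size), which
 * is a simplification of a real memory map.
 */
int validate_chain(const struct vring_desc *table, uint16_t qsize,
                   uint16_t head, uint64_t guest_mem_size)
{
    uint16_t idx = head;
    int count = 0;

    assert(table != NULL);

    for (;;) {
        if (idx >= qsize)
            return -1;              /* index points outside the table */
        if (count >= qsize)
            return -1;              /* more hops than entries: a loop */

        const struct vring_desc *d = &table[idx];

        /* Overflow-safe check that [addr, addr + len) is in range. */
        if (d->addr > guest_mem_size ||
            d->len > guest_mem_size - d->addr)
            return -1;

        count++;
        if (!(d->flags & VRING_DESC_F_NEXT))
            return count;           /* end of chain */
        idx = d->next;
    }
}
```

The check only reads the descriptor table and never touches the buffers themselves, which is the property the quoted proposal relies on: validation stays in userspace while the data path stays zero-copy.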