From: Anthony Liguori
In-Reply-To: <51A496C4.1020602@os.inf.tu-dresden.de>
References: <20130527093409.GH21969@stefanha-thinkpad.redhat.com> <51A496C4.1020602@os.inf.tu-dresden.de>
Date: Tue, 28 May 2013 12:00:38 -0500
Message-ID: <87r4grca4p.fsf@codemonkey.ws>
Subject: Re: [Qemu-devel] snabbswitch integration with QEMU for userspace ethernet I/O
To: Julian Stecklina, "snabb-devel@googlegroups.com", qemu-devel@nongnu.org
Cc: mst@redhat.com

Julian Stecklina writes:

> On 05/28/2013 12:10 PM, Luke Gorrie wrote:
>> On 27 May 2013 11:34, Stefan Hajnoczi wrote:
>>
>>     vhost_net is about connecting a virtio-net speaking process to a
>>     tun-like device.  The problem you are trying to solve is connecting a
>>     virtio-net speaking process to Snabb Switch.
>>
>>
>> Yep!
>
> Since I am on a similar path as Luke, let me share another idea.
>
> What about extending qemu in a way to allow PCI device models to be
> implemented in another process?

We aren't going to support any interface that enables out of tree
devices.  This is just plugins in a different form with even more
downsides.
You cannot easily keep track of dirty info, and the guest physical
address translation to host is difficult to keep in sync (imagine the
complexity of memory hotplug).

Basically, it's easy to hack up but extremely hard to do something that
works correctly overall.

There isn't a compelling reason to implement something like this other
than avoiding getting code into QEMU.  Best to just submit your device
to QEMU for inclusion.

If you want to avoid copying in a vswitch, better to use something like
vmsplice as I outlined in another thread.

> This is not as hard as it may sound.
> qemu would open a domain socket to this process and map VM memory over
> to the other side. This can be accomplished by having file descriptors
> in qemu to VM memory (reusing -mem-path code) and passing those over the
> domain socket. The other side can then just mmap them. The socket would
> also be used for configuration and I/O by the guest on the PCI
> I/O/memory regions. You could also use this to do IRQs or use eventfds,
> whatever works better.
>
> To have a zero copy userspace switch, the switch would offer virtio-net
> devices to any qemu that wants to connect to it and implement the
> complete device logic itself. Since it has access to all guest memory,
> it can just do memcpy for packet data. Of course, this only works for
> 64-bit systems, because you need vast amounts of virtual address space.
> In my experience, doing this in userspace is _way less painful_.
>
> If you can get away with polling in the switch, the overhead of doing all
> this in userspace is zero. And as long as you can rate-limit explicit
> notifications over the socket, even that overhead should be okay.
>
> Opinions?

I don't see any compelling reason to do something like this.  It's
jumping through a tremendous number of hoops to avoid putting code that
belongs in QEMU in tree.

Regards,

Anthony Liguori

>
> Julian
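
For readers unfamiliar with the mechanism in the quoted proposal (passing file descriptors for VM memory over a domain socket and mmap-ing them on the other side), here is a minimal sketch of that one step. It is not QEMU code and does not address the objections raised in the reply (dirty tracking, keeping address translation in sync). The names `share_memory_fd`, `map_shared_memory`, `demo` and `GUEST_RAM_SIZE` are hypothetical, a temporary file stands in for guest RAM, and both ends run in one process for brevity. It assumes a Unix system and Python 3.9 or later for `socket.send_fds` and `socket.recv_fds`.

```python
import mmap
import os
import socket
import tempfile

GUEST_RAM_SIZE = 4096  # hypothetical: one page standing in for guest RAM


def share_memory_fd(sock, fd):
    """Send an open file descriptor to the peer (SCM_RIGHTS under the hood)."""
    socket.send_fds(sock, [b"ram"], [fd])


def map_shared_memory(sock, size):
    """Receive one descriptor from the peer and map it read/write."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    mem = mmap.mmap(fds[0], size)  # shared mapping by default on Unix
    os.close(fds[0])               # the mapping outlives the descriptor
    return mem


def demo():
    # One end plays the VMM, the other plays the external switch process.
    vmm_sock, switch_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as backing:
        backing.truncate(GUEST_RAM_SIZE)
        vmm_view = mmap.mmap(backing.fileno(), GUEST_RAM_SIZE)

        share_memory_fd(vmm_sock, backing.fileno())
        switch_view = map_shared_memory(switch_sock, GUEST_RAM_SIZE)

        # A write through one mapping is visible through the other, no copy.
        vmm_view[0:6] = b"packet"
        seen = bytes(switch_view[0:6])

        switch_view.close()
        vmm_view.close()
    vmm_sock.close()
    switch_sock.close()
    return seen
```

Calling `demo()` returns the bytes the "switch" side reads from the shared mapping. In a real two-process setup the socket would be a named Unix domain socket rather than a `socketpair`, and the backing file would be whatever actually backs guest memory.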