From: "Michael S. Tsirkin" <mst@redhat.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: xiaohui.xin@intel.com, netdev@vger.kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu,
	davem@davemloft.net, jdike@linux.intel.com
Subject: Re: [RFC][PATCH v3 1/3] A device for zero-copy based on KVM virtio-net.
Date: Wed, 14 Apr 2010 19:16:10 +0300
Message-ID: <20100414161610.GA10897@redhat.com>
In-Reply-To: <201004141757.54829.arnd@arndb.de>

On Wed, Apr 14, 2010 at 05:57:54PM +0200, Arnd Bergmann wrote:
> On Wednesday 14 April 2010, Michael S. Tsirkin wrote:
> > On Wed, Apr 14, 2010 at 04:55:21PM +0200, Arnd Bergmann wrote:
> > > On Friday 09 April 2010, xiaohui.xin@intel.com wrote:
> > > > From: Xin Xiaohui <xiaohui.xin@intel.com>
> >
> > > It seems that you are duplicating a lot of functionality that
> > > is already in macvtap. I've asked about this before but then
> > > didn't look at your newer versions. Can you explain the value
> > > of introducing another interface to user land?
> > 
> > Hmm, I have not noticed a lot of duplication.
> 
> The code is indeed quite distinct, but the idea of adding another
> character device to pass into vhost for direct device access is.

All backends besides tap seem to do this, btw :)

> > BTW macvtap also duplicates tun code, it might be
> > a good idea for tun to export some functionality.
> 
> Yes, that's something I plan to look into.
> 
> > > I'm still planning to add zero-copy support to macvtap,
> > > hopefully reusing parts of your code, but do you think there
> > > is value in having both?
> > 
> > If macvtap would get zero copy tx and rx, maybe not. But
> > it's not immediately obvious whether zero-copy support
> > for macvtap might work, though, especially for zero copy rx.
> > The approach with mpassthru is much simpler in that
> > it takes complete control of the device.
> 
> As far as I can tell, the most significant limitation of mpassthru
> is that there can only ever be a single guest on a physical NIC.
> 
> Given that limitation, I believe we can do the same on macvtap,
> and simply disable zero-copy RX when you want to use more than one
> guest, or both guest and host on the same NIC.
> 
> The logical next step here would be to allow VMDq and similar
> technologies to separate out the RX traffic in the hardware.
> We don't have a configuration interface for that yet, but
> since this is logically the same as macvlan, I think we should
> use the same interfaces for both, essentially treating VMDq
> as a hardware acceleration for macvlan. We can probably handle
> it in similar ways to how we handle hardware support for vlan.
> 
> At that stage, macvtap would be the logical interface for
> connecting a VMDq (hardware macvlan) device to a guest!

I won't object to that but ... code walks.

> > > > +static ssize_t mp_chr_aio_write(struct kiocb *iocb, const struct iovec *iov,
> > > > +				unsigned long count, loff_t pos)
> > > > +{
> > > > +	struct file *file = iocb->ki_filp;
> > > > +	struct mp_struct *mp = mp_get(file->private_data);
> > > > +	struct sock *sk = mp->socket.sk;
> > > > +	struct sk_buff *skb;
> > > > +	int len, err;
> > > > +	ssize_t result;
> > > 
> > > Can you explain what this function is even there for? AFAICT, vhost-net
> > > doesn't call it, the interface is incompatible with the existing
> > > tap interface, and you don't provide a read function.
> > 
> > qemu needs the ability to inject raw packets into device
> > from userspace, bypassing vhost/virtio (for live migration).
> 
> Ok, but since there is only a write callback and no read, it won't
> actually be able to do this with the current code, right?

I think it'll work as is: with vhost, qemu only ever writes to the
device and never reads from it. We'll also never need GSO etc.,
which is a large part of what tap does (and macvtap will
have to do).
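
To illustrate the write-only usage described above, here is a minimal
userspace sketch, not the actual qemu or mpassthru code. It assumes the
device accepts one bare Ethernet frame per write() with no extra header;
the patch under discussion does not confirm that framing. The helper
names build_eth_frame() and inject_frame() are hypothetical, and fd can
be any writable file descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define ETH_ADDR_LEN  6
#define ETH_HDR_LEN   14
#define ETH_MIN_FRAME 60	/* minimum Ethernet frame length, excluding FCS */

/*
 * Build an Ethernet II frame in buf (hypothetical helper).
 * Short payloads are zero-padded up to ETH_MIN_FRAME.
 * Returns the frame length, or 0 if the frame does not fit in cap bytes.
 */
static size_t build_eth_frame(uint8_t *buf, size_t cap,
			      const uint8_t dst[ETH_ADDR_LEN],
			      const uint8_t src[ETH_ADDR_LEN],
			      uint16_t ethertype,
			      const void *payload, size_t payload_len)
{
	size_t len = ETH_HDR_LEN + payload_len;

	assert(buf != NULL);
	if (len < ETH_MIN_FRAME)
		len = ETH_MIN_FRAME;
	if (len > cap)
		return 0;

	memset(buf, 0, len);			/* zero the padding */
	memcpy(buf, dst, ETH_ADDR_LEN);
	memcpy(buf + ETH_ADDR_LEN, src, ETH_ADDR_LEN);
	buf[12] = (uint8_t)(ethertype >> 8);	/* network byte order */
	buf[13] = (uint8_t)(ethertype & 0xff);
	if (payload_len)
		memcpy(buf + ETH_HDR_LEN, payload, payload_len);
	return len;
}

/*
 * Inject one frame with a single write() (assumption: one packet per write).
 * Returns 0 on success, -1 on error or short write. No read path is used.
 */
static int inject_frame(int fd, const uint8_t *frame, size_t len)
{
	ssize_t n = write(fd, frame, len);

	return (n == (ssize_t)len) ? 0 : -1;
}
```

The point of the sketch is only that the userspace side needs nothing
beyond write(): no read path and no offload negotiation.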

> Moreover, it seems weird to have a new type of interface here that
> duplicates tap/macvtap with less functionality. Coming back
> to your original comment, this means that while mpassthru is currently
> not duplicating the actual code from macvtap, it would need to do
> exactly that to get the qemu interface right!
> 
> 	Arnd

I don't think so, see above. Anyway, both can reuse tun.c :)

-- 
MST
