From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Xin, Xiaohui" <xiaohui.xin@intel.com>
Cc: Shirley Ma <mashirle@us.ibm.com>, Arnd Bergmann <arnd@arndb.de>,
	Avi Kivity <avi@redhat.com>, David Miller <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel
Date: Thu, 16 Sep 2010 12:02:01 +0200
Message-ID: <20100916100201.GI20864@redhat.com>
In-Reply-To: <F2E9EB7348B8264F86B6AB8151CE2D792B8CAA02B1@shsmsx502.ccr.corp.intel.com>

On Thu, Sep 16, 2010 at 04:18:10PM +0800, Xin, Xiaohui wrote:
> >From: Michael S. Tsirkin [mailto:mst@redhat.com]
> >Sent: Wednesday, September 15, 2010 5:59 PM
> >To: Xin, Xiaohui
> >Cc: Shirley Ma; Arnd Bergmann; Avi Kivity; David Miller; netdev@vger.kernel.org;
> >kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> >Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel
> >
> >On Wed, Sep 15, 2010 at 10:46:02AM +0800, Xin, Xiaohui wrote:
> >> >From: Michael S. Tsirkin [mailto:mst@redhat.com]
> >> >Sent: Wednesday, September 15, 2010 12:30 AM
> >> >To: Shirley Ma
> >> >Cc: Arnd Bergmann; Avi Kivity; Xin, Xiaohui; David Miller; netdev@vger.kernel.org;
> >> >kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> >> >Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel
> >> >
> >> >On Tue, Sep 14, 2010 at 09:00:25AM -0700, Shirley Ma wrote:
> >> >> On Tue, 2010-09-14 at 17:22 +0200, Michael S. Tsirkin wrote:
> >> >> > I would expect this to hurt performance significantly.
> >> >> > We could do this for asynchronous requests only to avoid the
> >> >> > slowdown.
> >> >>
> >> >> Is kiocb in sendmsg helpful here? It is not used now.
> >> >>
> >> >> Shirley
> >> >
> >> >Precisely. This is what the patch from Xin Xiaohui does.  That code
> >> >already seems to do most of what you are trying to do, right?
> >> >
> >> >The main thing missing seems to be macvtap integration, so that we can fall back
> >> >on data copy if zero copy is unavailable?
> >> >How hard would it be to basically link the mp and macvtap modules
> >> >together to get us this functionality? Anyone?
> >> >
> >> Michael,
> >> Is to support macvtap with zero-copy through mp device the functionality
> >> you mentioned above?
> >
> >I have trouble parsing the above question.  At some point Arnd suggested
> >that the mp device functionality would fit nicely as part of the macvtap
> >driver.  It seems to make sense superficially: the advantage, if it
> >worked, would be that we would get zero copy (mostly) transparently.
> >
> >Do you agree with this goal?
> >
> 
> I would say yes.

In that case, it's a blocker for upstream merge because this change
affects userspace.

> >> Shirley Ma suggested earlier moving the zero-copy functionality into the
> >> tun/tap device or the macvtap device. What do you think about that?
> >> I suspect there will be a lot of duplicated code in those three drivers
> >> unless we can extract the zero-copy code into kernel APIs and vhost APIs.
> >
> >
> >tap would be very hard at this point as it does not bind to a device.
> >macvtap might work, we mainly need to figure out a way to detect that
> >device can do zero copy so the right mode is used.  I think a first step
> >could be to simply link mp code into macvtap module, pass necessary
> >ioctls on, then move some code around as necessary.  This might get rid
> >of code duplication nicely.
> 
> I'll look into this to see how much effort it would be.
> 
> >
> >
> >> Do you think that is worth doing, and would it help the current process,
> >> which has been blocked for longer than I expected?
> >
> >I think it's nice to have.
> >
> >And if done hopefully this will get the folk working on the macvtap
> >driver to review the code, which will help find all issues faster.
> >
> >I think if you post some performance numbers,
> >this will also help get people excited and looking at the code.
> >
> 
> The performance data I posted before was compared against raw socket on vhost-net.
> But the raw socket backend has since been removed from the qemu side,
> so I can only compare with tap on vhost-net. Unfortunately, I am missing something
> and can't even bring it up. I have been blocked by this for a while.

Hey, maybe you are seeing the bug that was reported recently.
Could you run tcpdump on the tap interface in the host and on ethX in the guest
and tell me what you see?
If you see the packet in the guest but not in the host, could you try
adding printks in vhost handle_tx to see whether it gets called
and, if yes, where it fails?
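
As an illustration of the printk approach, here is a minimal user-space
sketch, not actual vhost code. The names toy_handle_tx, toy_queue and TRACE
are hypothetical stand-ins; in the kernel the trace would be a printk() or
pr_debug() call placed at the entry of the real handler and before each
early return, so the log shows whether the handler ran and which exit it took.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for printk(): prints function name and line. */
#define TRACE(...)                                          \
    do {                                                    \
        fprintf(stderr, "%s:%d: ", __func__, __LINE__);     \
        fprintf(stderr, __VA_ARGS__);                       \
        fputc('\n', stderr);                                \
    } while (0)

/* Toy result codes; the real handler has different exit paths. */
enum tx_result { TX_OK = 0, TX_NO_SOCKET, TX_NO_BUFFER, TX_SEND_FAILED };

/* Toy queue state, assumed purely for illustration. */
struct toy_queue {
    int has_socket; /* is a backend attached? */
    int pending;    /* packets waiting to be sent */
    int send_ok;    /* would the send succeed? */
};

/* One trace at entry, one before every early return. */
static enum tx_result toy_handle_tx(struct toy_queue *q)
{
    assert(q != NULL);
    TRACE("called, pending=%d", q->pending);

    if (!q->has_socket) {
        TRACE("no backend socket, bailing out");
        return TX_NO_SOCKET;
    }
    if (q->pending == 0) {
        TRACE("nothing queued");
        return TX_NO_BUFFER;
    }
    if (!q->send_ok) {
        TRACE("send failed");
        return TX_SEND_FAILED;
    }

    q->pending--;
    TRACE("sent one packet, pending=%d", q->pending);
    return TX_OK;
}
```

If no "called" line shows up at all, the handler is never being kicked;
otherwise the last trace line names the exit path that was taken.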

> >I also don't see the process as completely blocked, each review round points
> >out more issues: we aren't going back and forth changing
> >same lines again and again, are we?
> >
> >One thing that might help is to increase the frequency of updates:
> >try sending them out sooner.
> >On the other hand 10 new patches each revision is a lot:
> >if there is a part of patchset that has stabilised you can split it out,
> >post once and keep posting the changing part separately.
> >
> >I hope these suggestions help.
> 
> Thanks, Michael!
> 
> >
> >> >
> >> >--
> >> >MST

Thread overview: 47+ messages
2010-09-13 20:43 [RFC PATCH 0/1] macvtap TX zero copy between guest and host kernel Shirley Ma
2010-09-13 20:47 ` [RFC PATCH 1/2] macvtap: A new sock zero copy flag Shirley Ma
2010-09-13 20:48 ` [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel Shirley Ma
2010-09-14  3:17   ` David Miller
2010-09-14  9:12     ` Avi Kivity
2010-09-14 15:05       ` Shirley Ma
2010-09-14 15:21         ` Arnd Bergmann
2010-09-14 15:22           ` Michael S. Tsirkin
2010-09-14 16:00             ` Shirley Ma
2010-09-14 16:29               ` Michael S. Tsirkin
2010-09-14 17:02                 ` Shirley Ma
2010-09-14 18:27                   ` Michael S. Tsirkin
2010-09-14 18:49                     ` Shirley Ma
2010-09-14 19:01                       ` Michael S. Tsirkin
2010-09-14 19:20                         ` Shirley Ma
2010-09-15  5:31                           ` Michael S. Tsirkin
2010-09-14 19:36                         ` Shirley Ma
2010-09-15  5:12                           ` Michael S. Tsirkin
2010-09-15  6:21                             ` Shirley Ma
2010-09-15 10:10                               ` Michael S. Tsirkin
2010-09-15 14:52                                 ` Shirley Ma
2010-09-15 15:04                                   ` Michael S. Tsirkin
2010-09-15 15:39                                     ` Michael S. Tsirkin
2010-09-15 17:00                                       ` Shirley Ma
2010-09-15 17:30                                         ` Michael S. Tsirkin
2010-09-15 18:48                                           ` Shirley Ma
2010-09-29  3:24                                   ` Shirley Ma
2010-09-29  8:16                                     ` Michael S. Tsirkin
2010-09-29  8:28                                       ` Michael S. Tsirkin
2010-09-29 14:33                                         ` Shirley Ma
2010-09-29 14:56                                         ` Shirley Ma
2010-09-29 14:31                                       ` Shirley Ma
2010-09-29 14:37                                       ` Shirley Ma
2010-09-29 15:14                                     ` Michael S. Tsirkin
2010-09-29 15:23                                       ` Shirley Ma
2010-09-15  2:46                 ` Xin, Xiaohui
2010-09-15  9:58                   ` Michael S. Tsirkin
2010-09-16  8:18                     ` Xin, Xiaohui
2010-09-16 10:02                       ` Michael S. Tsirkin [this message]
2010-09-15  1:56           ` Xin, Xiaohui
2010-09-15  1:50         ` Xin, Xiaohui
2010-09-15  2:40           ` Shirley Ma
2010-09-15  2:55             ` Xin, Xiaohui
2010-09-15  5:27             ` Michael S. Tsirkin
2010-09-15  6:17               ` Shirley Ma
2010-09-14 12:05 ` [RFC PATCH 0/1] macvtap " Michael S. Tsirkin
2010-09-14 15:15   ` Shirley Ma
