netdev.vger.kernel.org archive mirror
* updated: kvm networking todo wiki
@ 2013-05-23  8:50 Michael S. Tsirkin
  2013-05-23 14:12 ` Lucas Meneghel Rodrigues
  2013-05-24  9:41 ` Jason Wang
  0 siblings, 2 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2013-05-23  8:50 UTC (permalink / raw)
  To: David Stevens, sri, Anthony Liguori, Rusty Russell,
	Krishna Kumar2, Shirley Ma, Xin, Xiaohui, jdike, herbert, lmr,
	akong, Jason Wang, vyasevic, sriram.narasimhan
  Cc: netdev, qemu-devel, kvm, virtualization

Hey guys,
I've updated the kvm networking todo wiki with current projects.
Will try to keep it up to date more often.
Original announcement below.

----

I've put up a wiki page with a kvm networking todo list,
mainly to avoid effort duplication, but also in the hope
to draw attention to what I think we should try addressing
in KVM:

http://www.linux-kvm.org/page/NetworkingTodo

This page could cover all networking related activity in KVM,
currently most info is related to virtio-net.

Note: if there's no developer listed for an item,
this just means I don't know of anyone actively working
on an issue at the moment, not that no one intends to.

I would appreciate it if others working on one of the items on this list
would add their names so we can communicate better.  If others like this
wiki page, please go ahead and add stuff you are working on if any.

It would be especially nice to add autotest projects:
there is just a short test matrix and a catch-all
'Cover test matrix with autotest', currently.

Currently there are some links to Red Hat bugzilla entries,
feel free to add links to other bugzillas.

Thanks!

-- 
MST

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: updated: kvm networking todo wiki
  2013-05-23  8:50 updated: kvm networking todo wiki Michael S. Tsirkin
@ 2013-05-23 14:12 ` Lucas Meneghel Rodrigues
  2013-05-24  9:41 ` Jason Wang
  1 sibling, 0 replies; 17+ messages in thread
From: Lucas Meneghel Rodrigues @ 2013-05-23 14:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Krishna Kumar2, vyasevic, Xin, Xiaohui, Rusty Russell, akong,
	kvm, sriram.narasimhan, netdev, Jason Wang, Shirley Ma,
	virtualization, David Stevens, qemu-devel, herbert,
	Anthony Liguori, jdike, sri

On 23/05/13 05:50 AM, Michael S. Tsirkin wrote:
> Hey guys,
> I've updated the kvm networking todo wiki with current projects.
> Will try to keep it up to date more often.
> Original announcement below.
>
> ----
>
> I've put up a wiki page with a kvm networking todo list,
> mainly to avoid effort duplication, but also in the hope
> to draw attention to what I think we should try addressing
> in KVM:
>
> http://www.linux-kvm.org/page/NetworkingTodo
>
> This page could cover all networking related activity in KVM,
> currently most info is related to virtio-net.
>
> Note: if there's no developer listed for an item,
> this just means I don't know of anyone actively working
> on an issue at the moment, not that no one intends to.
>
> I would appreciate it if others working on one of the items on this list
> would add their names so we can communicate better.  If others like this
> wiki page, please go ahead and add stuff you are working on if any.
>
> It would be especially nice to add autotest projects:
> there is just a short test matrix and a catch-all
> 'Cover test matrix with autotest', currently.

Ok, I'll take a look and fill it in with the currently available networking tests.

> Currently there are some links to Red Hat bugzilla entries,
> feel free to add links to other bugzillas.
>
> Thanks!
>


* Re: updated: kvm networking todo wiki
  2013-05-23  8:50 updated: kvm networking todo wiki Michael S. Tsirkin
  2013-05-23 14:12 ` Lucas Meneghel Rodrigues
@ 2013-05-24  9:41 ` Jason Wang
  2013-05-24 11:35   ` Michael S. Tsirkin
  1 sibling, 1 reply; 17+ messages in thread
From: Jason Wang @ 2013-05-24  9:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Krishna Kumar2, lmr, Xin, Xiaohui, kvm, netdev, Shirley Ma,
	virtualization, David Stevens, qemu-devel, vyasevic, herbert,
	Anthony Liguori, jdike, sri

On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
> Hey guys,
> I've updated the kvm networking todo wiki with current projects.
> Will try to keep it up to date more often.
> Original announcement below.

Thanks a lot. I've added the tasks I'm currently working on to the wiki.

By the way, I noticed the virtio-net data plane was missing from the wiki.
Is the project still being considered?
> ----
>
> I've put up a wiki page with a kvm networking todo list,
> mainly to avoid effort duplication, but also in the hope
> to draw attention to what I think we should try addressing
> in KVM:
>
> http://www.linux-kvm.org/page/NetworkingTodo
>
> This page could cover all networking related activity in KVM,
> currently most info is related to virtio-net.
>
> Note: if there's no developer listed for an item,
> this just means I don't know of anyone actively working
> on an issue at the moment, not that no one intends to.
>
> I would appreciate it if others working on one of the items on this list
> would add their names so we can communicate better.  If others like this
> wiki page, please go ahead and add stuff you are working on if any.
>
> It would be especially nice to add autotest projects:
> there is just a short test matrix and a catch-all
> 'Cover test matrix with autotest', currently.
>
> Currently there are some links to Red Hat bugzilla entries,
> feel free to add links to other bugzillas.
>
> Thanks!
>


* Re: updated: kvm networking todo wiki
  2013-05-24  9:41 ` Jason Wang
@ 2013-05-24 11:35   ` Michael S. Tsirkin
  2013-05-24 13:47     ` Anthony Liguori
  0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2013-05-24 11:35 UTC (permalink / raw)
  To: Jason Wang
  Cc: Krishna Kumar2, lmr, Xin, Xiaohui, kvm, netdev, Shirley Ma,
	virtualization, David Stevens, qemu-devel, vyasevic, herbert,
	Anthony Liguori, jdike, sri

On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
> > Hey guys,
> > I've updated the kvm networking todo wiki with current projects.
> > Will try to keep it up to date more often.
> > Original announcement below.
> 
> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
> 
> By the way, I noticed the virtio-net data plane was missing from the wiki.
> Is the project still being considered?

It might have been interesting several years ago, but now that linux has
vhost-net in kernel, the only point seems to be to
speed up networking on non-linux hosts. Since non-linux
does not have kvm, I doubt virtio is a bottleneck.
IMO yet another networking backend is a distraction,
and confusing to users.
In any case, I'd like to see virtio-blk dataplane replace
non dataplane first. We don't want two copies of
virtio-net in qemu.

> > ----
> >
> > I've put up a wiki page with a kvm networking todo list,
> > mainly to avoid effort duplication, but also in the hope
> > to draw attention to what I think we should try addressing
> > in KVM:
> >
> > http://www.linux-kvm.org/page/NetworkingTodo
> >
> > This page could cover all networking related activity in KVM,
> > currently most info is related to virtio-net.
> >
> > Note: if there's no developer listed for an item,
> > this just means I don't know of anyone actively working
> > on an issue at the moment, not that no one intends to.
> >
> > I would appreciate it if others working on one of the items on this list
> > would add their names so we can communicate better.  If others like this
> > wiki page, please go ahead and add stuff you are working on if any.
> >
> > It would be especially nice to add autotest projects:
> > there is just a short test matrix and a catch-all
> > 'Cover test matrix with autotest', currently.
> >
> > Currently there are some links to Red Hat bugzilla entries,
> > feel free to add links to other bugzillas.
> >
> > Thanks!
> >


* Re: updated: kvm networking todo wiki
  2013-05-24 11:35   ` Michael S. Tsirkin
@ 2013-05-24 13:47     ` Anthony Liguori
  2013-05-24 14:00       ` Michael S. Tsirkin
  0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2013-05-24 13:47 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: David Stevens, sri, Rusty Russell, Krishna Kumar2, Shirley Ma,
	Xin, Xiaohui, jdike, herbert, lmr, akong, vyasevic,
	sriram.narasimhan, kvm, qemu-devel, netdev, virtualization

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
>> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
>> > Hey guys,
>> > I've updated the kvm networking todo wiki with current projects.
>> > Will try to keep it up to date more often.
>> > Original announcement below.
>> 
>> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
>> 
>> By the way, I noticed the virtio-net data plane was missing from the wiki.
>> Is the project still being considered?
>
> It might have been interesting several years ago, but now that linux has
> vhost-net in kernel, the only point seems to be to
> speed up networking on non-linux hosts.

Data plane just means having a dedicated thread for virtqueue processing
that doesn't hold qemu_mutex.

Of course we're going to do this in QEMU.  It's a no brainer.  But not
as a separate device, just as an improvement to the existing userspace
virtio-net.
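A rough sketch of the threading shape being described (illustrative only, not actual QEMU code: the "virtqueue" here is a toy ring and all names are made up). The point is that the worker takes only a per-queue lock and drops even that while handling a descriptor, so a global qemu_mutex-style lock is never held during processing:

```c
#include <pthread.h>
#include <stdbool.h>

#define RING_SIZE 16

struct vq {                      /* toy stand-in for a virtqueue */
    int ring[RING_SIZE];
    int head, tail;
    bool stop;
    pthread_mutex_t lock;        /* per-queue lock, not a global mutex */
    pthread_cond_t kick;
    long processed;
};

/* Worker thread: drains the queue under its own lock, and releases
 * the lock while "handling" each descriptor. */
static void *vq_worker(void *arg)
{
    struct vq *q = arg;

    pthread_mutex_lock(&q->lock);
    while (!(q->stop && q->head == q->tail)) {
        if (q->head == q->tail) {
            pthread_cond_wait(&q->kick, &q->lock);
            continue;
        }
        int desc = q->ring[q->tail++ % RING_SIZE];
        pthread_mutex_unlock(&q->lock);   /* process outside the lock */
        q->processed += desc;             /* pretend to handle a packet */
        pthread_mutex_lock(&q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}
```

Real data-plane code additionally deals with ioeventfd kicks and guest memory-map changes; this only shows the locking structure.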

> Since non-linux does not have kvm, I doubt virtio is a bottleneck.

FWIW, I think what's more interesting is using vhost-net as a networking
backend with virtio-net in QEMU being what's guest facing.

In theory, this gives you the best of both worlds: QEMU acts as a first
line of defense against a malicious guest while still getting the
performance advantages of vhost-net (zero-copy).

> IMO yet another networking backend is a distraction,
> and confusing to users.
> In any case, I'd like to see virtio-blk dataplane replace
> non dataplane first. We don't want two copies of
> virtio-net in qemu.

100% agreed.

Regards,

Anthony Liguori

>
>> > ----
>> >
>> > I've put up a wiki page with a kvm networking todo list,
>> > mainly to avoid effort duplication, but also in the hope
>> > to draw attention to what I think we should try addressing
>> > in KVM:
>> >
>> > http://www.linux-kvm.org/page/NetworkingTodo
>> >
>> > This page could cover all networking related activity in KVM,
>> > currently most info is related to virtio-net.
>> >
>> > Note: if there's no developer listed for an item,
>> > this just means I don't know of anyone actively working
>> > on an issue at the moment, not that no one intends to.
>> >
>> > I would appreciate it if others working on one of the items on this list
>> > would add their names so we can communicate better.  If others like this
>> > wiki page, please go ahead and add stuff you are working on if any.
>> >
>> > It would be especially nice to add autotest projects:
>> > there is just a short test matrix and a catch-all
>> > 'Cover test matrix with autotest', currently.
>> >
>> > Currently there are some links to Red Hat bugzilla entries,
>> > feel free to add links to other bugzillas.
>> >
>> > Thanks!
>> >


* Re: updated: kvm networking todo wiki
  2013-05-24 13:47     ` Anthony Liguori
@ 2013-05-24 14:00       ` Michael S. Tsirkin
  2013-05-29  0:07         ` Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2013-05-24 14:00 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Krishna Kumar2, lmr, Xin, Xiaohui, Shirley Ma, kvm, netdev,
	virtualization, David Stevens, qemu-devel, vyasevic, herbert,
	jdike, sri

On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
> >> > Hey guys,
> >> > I've updated the kvm networking todo wiki with current projects.
> >> > Will try to keep it up to date more often.
> >> > Original announcement below.
> >> 
> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
> >> 
> >> By the way, I noticed the virtio-net data plane was missing from the wiki.
> >> Is the project still being considered?
> >
> > It might have been interesting several years ago, but now that linux has
> > vhost-net in kernel, the only point seems to be to
> > speed up networking on non-linux hosts.
> 
> Data plane just means having a dedicated thread for virtqueue processing
> that doesn't hold qemu_mutex.
> 
> Of course we're going to do this in QEMU.  It's a no brainer.  But not
> as a separate device, just as an improvement to the existing userspace
> virtio-net.
> 
> > Since non-linux does not have kvm, I doubt virtio is a bottleneck.
> 
> FWIW, I think what's more interesting is using vhost-net as a networking
> backend with virtio-net in QEMU being what's guest facing.
> 
> In theory, this gives you the best of both worlds: QEMU acts as a first
> line of defense against a malicious guest while still getting the
> performance advantages of vhost-net (zero-copy).

Great idea, that sounds very interesting.

I'll add it to the wiki.

In fact a bit of complexity in vhost was put there in the vague hope to
support something like this: virtio rings are not translated through
regular memory tables, instead, vhost gets a pointer to ring address.

This allows qemu to act as a man in the middle,
verifying the descriptors but not touching the data.

Anyone interested in working on such a project?
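A minimal sketch of the verification step such a middleman could perform (the structure and names are illustrative, not the real virtio or vhost layouts; guest RAM is modeled as one flat region):

```c
#include <stdint.h>
#include <stdbool.h>

struct desc {          /* simplified, hypothetical descriptor layout */
    uint64_t addr;     /* guest-physical address of the buffer */
    uint32_t len;      /* buffer length in bytes */
};

/* Reject descriptors that fall outside the guest-RAM region or
 * exceed a sane length cap, using overflow-safe arithmetic. */
static bool desc_is_safe(const struct desc *d,
                         uint64_t ram_base, uint64_t ram_size,
                         uint32_t max_len)
{
    if (d->len == 0 || d->len > max_len || d->len > ram_size)
        return false;
    if (d->addr < ram_base)
        return false;
    /* d->addr - ram_base is the offset into RAM; the buffer must end
     * at or before ram_base + ram_size.  Written this way to avoid
     * integer overflow in addr + len. */
    return d->addr - ram_base <= ram_size - d->len;
}
```

Only descriptors passing such a check would be exposed to vhost; the payload itself is never copied or inspected.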

> > IMO yet another networking backend is a distraction,
> > and confusing to users.
> > In any case, I'd like to see virtio-blk dataplane replace
> > non dataplane first. We don't want two copies of
> > virtio-net in qemu.
> 
> 100% agreed.
> 
> Regards,
> 
> Anthony Liguori
> 
> >
> >> > ----
> >> >
> >> > I've put up a wiki page with a kvm networking todo list,
> >> > mainly to avoid effort duplication, but also in the hope
> >> > to draw attention to what I think we should try addressing
> >> > in KVM:
> >> >
> >> > http://www.linux-kvm.org/page/NetworkingTodo
> >> >
> >> > This page could cover all networking related activity in KVM,
> >> > currently most info is related to virtio-net.
> >> >
> >> > Note: if there's no developer listed for an item,
> >> > this just means I don't know of anyone actively working
> >> > on an issue at the moment, not that no one intends to.
> >> >
> >> > I would appreciate it if others working on one of the items on this list
> >> > would add their names so we can communicate better.  If others like this
> >> > wiki page, please go ahead and add stuff you are working on if any.
> >> >
> >> > It would be especially nice to add autotest projects:
> >> > there is just a short test matrix and a catch-all
> >> > 'Cover test matrix with autotest', currently.
> >> >
> >> > Currently there are some links to Red Hat bugzilla entries,
> >> > feel free to add links to other bugzillas.
> >> >
> >> > Thanks!
> >> >


* Re: updated: kvm networking todo wiki
  2013-05-24 14:00       ` Michael S. Tsirkin
@ 2013-05-29  0:07         ` Rusty Russell
  2013-05-29 13:01           ` Anthony Liguori
  0 siblings, 1 reply; 17+ messages in thread
From: Rusty Russell @ 2013-05-29  0:07 UTC (permalink / raw)
  To: Michael S. Tsirkin, Anthony Liguori
  Cc: Jason Wang, David Stevens, sri, Krishna Kumar2, Shirley Ma, Xin,
	Xiaohui, jdike, herbert, lmr, akong, vyasevic, sriram.narasimhan,
	kvm, qemu-devel, netdev, virtualization

"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> 
>> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
>> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
>> >> > Hey guys,
>> >> > I've updated the kvm networking todo wiki with current projects.
>> >> > Will try to keep it up to date more often.
>> >> > Original announcement below.
>> >> 
>> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
>> >> 
>> >> By the way, I noticed the virtio-net data plane was missing from the wiki.
>> >> Is the project still being considered?
>> >
>> > It might have been interesting several years ago, but now that linux has
>> > vhost-net in kernel, the only point seems to be to
>> > speed up networking on non-linux hosts.
>> 
>> Data plane just means having a dedicated thread for virtqueue processing
>> that doesn't hold qemu_mutex.
>> 
>> Of course we're going to do this in QEMU.  It's a no brainer.  But not
>> as a separate device, just as an improvement to the existing userspace
>> virtio-net.
>> 
>> > Since non-linux does not have kvm, I doubt virtio is a bottleneck.
>> 
>> FWIW, I think what's more interesting is using vhost-net as a networking
>> backend with virtio-net in QEMU being what's guest facing.
>> 
>> In theory, this gives you the best of both worlds: QEMU acts as a first
>> line of defense against a malicious guest while still getting the
>> performance advantages of vhost-net (zero-copy).
>
> Great idea, that sounds very interesting.
>
> I'll add it to the wiki.
>
> In fact a bit of complexity in vhost was put there in the vague hope to
> support something like this: virtio rings are not translated through
> regular memory tables, instead, vhost gets a pointer to ring address.
>
> This allows qemu to act as a man in the middle,
> verifying the descriptors but not touching the data.
>
> Anyone interested in working on such a project?

It would be an interesting idea if we didn't already have the vhost
model where we don't need the userspace bounce.  We already have two
sets of host side ring code in the kernel (vhost and vringh, though
they're being unified).

All an accelerator can offer on the tx side is zero copy and direct
update of the used ring.  On rx userspace could register the buffers and
the accelerator could fill them and update the used ring.  It still
needs to deal with merged buffers, for example.

You avoid the address translation in the kernel, but I'm not convinced
that's a key problem.

Cheers,
Rusty.


* Re: updated: kvm networking todo wiki
  2013-05-29  0:07         ` Rusty Russell
@ 2013-05-29 13:01           ` Anthony Liguori
  2013-05-29 14:12             ` Michael S. Tsirkin
  2013-05-30  5:23             ` Rusty Russell
  0 siblings, 2 replies; 17+ messages in thread
From: Anthony Liguori @ 2013-05-29 13:01 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: Krishna Kumar2, lmr, Xin, Xiaohui, kvm, netdev, Shirley Ma,
	virtualization, David Stevens, qemu-devel, vyasevic, herbert,
	jdike, sri

Rusty Russell <rusty@rustcorp.com.au> writes:

> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>>> 
>>> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
>>> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
>>> >> > Hey guys,
>>> >> > I've updated the kvm networking todo wiki with current projects.
>>> >> > Will try to keep it up to date more often.
>>> >> > Original announcement below.
>>> >> 
>>> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
>>> >> 
>>> >> By the way, I noticed the virtio-net data plane was missing from the wiki.
>>> >> Is the project still being considered?
>>> >
>>> > It might have been interesting several years ago, but now that linux has
>>> > vhost-net in kernel, the only point seems to be to
>>> > speed up networking on non-linux hosts.
>>> 
>>> Data plane just means having a dedicated thread for virtqueue processing
>>> that doesn't hold qemu_mutex.
>>> 
>>> Of course we're going to do this in QEMU.  It's a no brainer.  But not
>>> as a separate device, just as an improvement to the existing userspace
>>> virtio-net.
>>> 
>>> > Since non-linux does not have kvm, I doubt virtio is a bottleneck.
>>> 
>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>> backend with virtio-net in QEMU being what's guest facing.
>>> 
>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>> line of defense against a malicious guest while still getting the
>>> performance advantages of vhost-net (zero-copy).
>>
>> Great idea, that sounds very interesting.
>>
>> I'll add it to the wiki.
>>
>> In fact a bit of complexity in vhost was put there in the vague hope to
>> support something like this: virtio rings are not translated through
>> regular memory tables, instead, vhost gets a pointer to ring address.
>>
>> This allows qemu to act as a man in the middle,
>> verifying the descriptors but not touching the data.
>>
>> Anyone interested in working on such a project?
>
> It would be an interesting idea if we didn't already have the vhost
> model where we don't need the userspace bounce.

The model is very interesting for QEMU because then we can use vhost as
a backend for other types of network adapters (like vmxnet3 or even
e1000).

It also helps for things like fault tolerance where we need to be able
to control packet flow within QEMU.

Regards,

Anthony Liguori

> We already have two
> sets of host side ring code in the kernel (vhost and vringh, though
> they're being unified).
>
> All an accelerator can offer on the tx side is zero copy and direct
> update of the used ring.  On rx userspace could register the buffers and
> the accelerator could fill them and update the used ring.  It still
> needs to deal with merged buffers, for example.
>
> You avoid the address translation in the kernel, but I'm not convinced
> that's a key problem.
>
> Cheers,
> Rusty.


* Re: updated: kvm networking todo wiki
  2013-05-29 13:01           ` Anthony Liguori
@ 2013-05-29 14:12             ` Michael S. Tsirkin
  2013-05-30  5:23             ` Rusty Russell
  1 sibling, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2013-05-29 14:12 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Krishna Kumar2, lmr, Xin, Xiaohui, Shirley Ma, kvm, netdev,
	virtualization, David Stevens, qemu-devel, vyasevic, herbert,
	jdike, sri

On Wed, May 29, 2013 at 08:01:03AM -0500, Anthony Liguori wrote:
> Rusty Russell <rusty@rustcorp.com.au> writes:
> 
> > "Michael S. Tsirkin" <mst@redhat.com> writes:
> >> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
> >>> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >>> 
> >>> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
> >>> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
> >>> >> > Hey guys,
> >>> >> > I've updated the kvm networking todo wiki with current projects.
> >>> >> > Will try to keep it up to date more often.
> >>> >> > Original announcement below.
> >>> >> 
> >>> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
> >>> >> 
> >>> >> By the way, I noticed the virtio-net data plane was missing from the wiki.
> >>> >> Is the project still being considered?
> >>> >
> >>> > It might have been interesting several years ago, but now that linux has
> >>> > vhost-net in kernel, the only point seems to be to
> >>> > speed up networking on non-linux hosts.
> >>> 
> >>> Data plane just means having a dedicated thread for virtqueue processing
> >>> that doesn't hold qemu_mutex.
> >>> 
> >>> Of course we're going to do this in QEMU.  It's a no brainer.  But not
> >>> as a separate device, just as an improvement to the existing userspace
> >>> virtio-net.
> >>> 
> >>> > Since non-linux does not have kvm, I doubt virtio is a bottleneck.
> >>> 
> >>> FWIW, I think what's more interesting is using vhost-net as a networking
> >>> backend with virtio-net in QEMU being what's guest facing.
> >>> 
> >>> In theory, this gives you the best of both worlds: QEMU acts as a first
> >>> line of defense against a malicious guest while still getting the
> >>> performance advantages of vhost-net (zero-copy).
> >>
> >> Great idea, that sounds very interesting.
> >>
> >> I'll add it to the wiki.
> >>
> >> In fact a bit of complexity in vhost was put there in the vague hope to
> >> support something like this: virtio rings are not translated through
> >> regular memory tables, instead, vhost gets a pointer to ring address.
> >>
> >> This allows qemu to act as a man in the middle,
> >> verifying the descriptors but not touching the data.
> >>
> >> Anyone interested in working on such a project?
> >
> > It would be an interesting idea if we didn't already have the vhost
> > model where we don't need the userspace bounce.
> 
> The model is very interesting for QEMU because then we can use vhost as
> a backend for other types of network adapters (like vmxnet3 or even
> e1000).
> 
> It also helps for things like fault tolerance where we need to be able
> to control packet flow within QEMU.
> 
> Regards,
> 
> Anthony Liguori

It was also floated as an alternative way to do live migration.

> > We already have two
> > sets of host side ring code in the kernel (vhost and vringh, though
> > they're being unified).
> >
> > All an accelerator can offer on the tx side is zero copy and direct
> > update of the used ring.  On rx userspace could register the buffers and
> > the accelerator could fill them and update the used ring.  It still
> > needs to deal with merged buffers, for example.
> >
> > You avoid the address translation in the kernel, but I'm not convinced
> > that's a key problem.
> >
> > Cheers,
> > Rusty.


* Re: updated: kvm networking todo wiki
  2013-05-29 13:01           ` Anthony Liguori
  2013-05-29 14:12             ` Michael S. Tsirkin
@ 2013-05-30  5:23             ` Rusty Russell
  2013-05-30  6:38               ` Stefan Hajnoczi
  2013-05-30 13:39               ` Anthony Liguori
  1 sibling, 2 replies; 17+ messages in thread
From: Rusty Russell @ 2013-05-30  5:23 UTC (permalink / raw)
  To: Anthony Liguori, Michael S. Tsirkin
  Cc: Jason Wang, herbert, kvm, qemu-devel, netdev, virtualization,
	Dmitry Fleytman

Anthony Liguori <anthony@codemonkey.ws> writes:
> Rusty Russell <rusty@rustcorp.com.au> writes:
>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>> backend with virtio-net in QEMU being what's guest facing.
>>> 
>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>> line of defense against a malicious guest while still getting the
>>> performance advantages of vhost-net (zero-copy).
>>>
>> It would be an interesting idea if we didn't already have the vhost
>> model where we don't need the userspace bounce.
>
> The model is very interesting for QEMU because then we can use vhost as
> a backend for other types of network adapters (like vmxnet3 or even
> e1000).
>
> It also helps for things like fault tolerance where we need to be able
> to control packet flow within QEMU.

(CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).

Then I'm really confused as to what this would look like.  A zero copy
sendmsg?  We should be able to implement that today.

On the receive side, what can we do better than readv?  If we need to
return to userspace to tell the guest that we've got a new packet, we
don't win on latency.  We might reduce syscall overhead with a
multi-dimensional readv to read multiple packets at once?

Confused,
Rusty.


* Re: updated: kvm networking todo wiki
  2013-05-30  5:23             ` Rusty Russell
@ 2013-05-30  6:38               ` Stefan Hajnoczi
  2013-05-30  7:18                 ` Rusty Russell
  2013-05-30 13:40                 ` Anthony Liguori
  2013-05-30 13:39               ` Anthony Liguori
  1 sibling, 2 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30  6:38 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Anthony Liguori, Michael S. Tsirkin, Jason Wang, herbert, kvm,
	qemu-devel, netdev, Linux Virtualization, Dmitry Fleytman

On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>>> backend with virtio-net in QEMU being what's guest facing.
>>>>
>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>>> line of defense against a malicious guest while still getting the
>>>> performance advantages of vhost-net (zero-copy).
>>>>
>>> It would be an interesting idea if we didn't already have the vhost
>>> model where we don't need the userspace bounce.
>>
>> The model is very interesting for QEMU because then we can use vhost as
>> a backend for other types of network adapters (like vmxnet3 or even
>> e1000).
>>
>> It also helps for things like fault tolerance where we need to be able
>> to control packet flow within QEMU.
>
> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
>
> Then I'm really confused as to what this would look like.  A zero copy
> sendmsg?  We should be able to implement that today.
>
> On the receive side, what can we do better than readv?  If we need to
> return to userspace to tell the guest that we've got a new packet, we
> don't win on latency.  We might reduce syscall overhead with a
> multi-dimensional readv to read multiple packets at once?

Sounds like recvmmsg(2).
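For reference, the batching looks roughly like this with recvmmsg(2) (available since Linux 2.6.33; plain sockets, illustrative buffer sizes):

```c
#define _GNU_SOURCE            /* for recvmmsg() */
#include <sys/socket.h>
#include <string.h>

#define BATCH   4
#define PKT_MAX 2048

/* Read up to BATCH datagrams with a single syscall; fills lens[]
 * with per-packet sizes and returns how many arrived (or -1). */
static int recv_batch(int fd, char bufs[BATCH][PKT_MAX],
                      unsigned int lens[BATCH])
{
    struct mmsghdr msgs[BATCH];
    struct iovec iovs[BATCH];

    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < BATCH; i++) {
        iovs[i].iov_base = bufs[i];
        iovs[i].iov_len  = PKT_MAX;
        msgs[i].msg_hdr.msg_iov    = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    int n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
    for (int i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;
    return n;
}
```

One syscall amortized over up to BATCH packets, which is the syscall-overhead reduction being asked for.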

Stefan


* Re: updated: kvm networking todo wiki
  2013-05-30  6:38               ` Stefan Hajnoczi
@ 2013-05-30  7:18                 ` Rusty Russell
  2013-05-30 13:40                 ` Anthony Liguori
  1 sibling, 0 replies; 17+ messages in thread
From: Rusty Russell @ 2013-05-30  7:18 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, Michael S. Tsirkin, netdev, qemu-devel,
	Linux Virtualization, herbert, Anthony Liguori

Stefan Hajnoczi <stefanha@gmail.com> writes:
> On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>> On the receive side, what can we do better than readv?  If we need to
>> return to userspace to tell the guest that we've got a new packet, we
>> don't win on latency.  We might reduce syscall overhead with a
>> multi-dimensional readv to read multiple packets at once?
>
> Sounds like recvmmsg(2).

Wow... the future is here, today!

Thanks,
Rusty.


* Re: updated: kvm networking todo wiki
  2013-05-30  5:23             ` Rusty Russell
  2013-05-30  6:38               ` Stefan Hajnoczi
@ 2013-05-30 13:39               ` Anthony Liguori
  1 sibling, 0 replies; 17+ messages in thread
From: Anthony Liguori @ 2013-05-30 13:39 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: Jason Wang, herbert, kvm, qemu-devel, netdev, virtualization,
	Dmitry Fleytman

Rusty Russell <rusty@rustcorp.com.au> writes:

> Anthony Liguori <anthony@codemonkey.ws> writes:
>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>>> backend with virtio-net in QEMU being what's guest facing.
>>>> 
>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>>> line of defense against a malicious guest while still getting the
>>>> performance advantages of vhost-net (zero-copy).
>>>>
>>> It would be an interesting idea if we didn't already have the vhost
>>> model where we don't need the userspace bounce.
>>
>> The model is very interesting for QEMU because then we can use vhost as
>> a backend for other types of network adapters (like vmxnet3 or even
>> e1000).
>>
>> It also helps for things like fault tolerance where we need to be able
>> to control packet flow within QEMU.
>
> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
>
> Then I'm really confused as to what this would look like.  A zero copy
> sendmsg?  We should be able to implement that today.

The only trouble with sendmsg would be doing batch submission and
asynchronous completion.

A thread pool could certainly be used for this I guess.

Regards,

Anthony Liguori

> On the receive side, what can we do better than readv?  If we need to
> return to userspace to tell the guest that we've got a new packet, we
> don't win on latency.  We might reduce syscall overhead with a
> multi-dimensional readv to read multiple packets at once?
>
> Confused,
> Rusty.
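The batch-submission half of the trouble named above has a transmit-side analogue to recvmmsg: sendmmsg(2) submits an array of messages in one syscall (asynchronous completion is the part neither call addresses). A minimal sketch, under the same illustrative loopback assumptions as before:

```c
/* Hedged sketch: batch transmit with sendmmsg(2).  Assumes Linux. */
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Submit two UDP datagrams with one syscall; returns how many the
 * kernel accepted.  sendmmsg() batches submission only -- completion
 * is still synchronous, which is the gap a thread pool would fill. */
static int send_batch_demo(void)
{
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    socklen_t alen = sizeof(addr);
    getsockname(rx, (struct sockaddr *)&addr, &alen);

    char *payloads[2] = { "pkt0", "pkt1" };
    struct iovec iov[2];
    struct mmsghdr msgs[2];
    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < 2; i++) {
        iov[i].iov_base = payloads[i];
        iov[i].iov_len = 4;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
        msgs[i].msg_hdr.msg_name = &addr;
        msgs[i].msg_hdr.msg_namelen = sizeof(addr);
    }
    int n = sendmmsg(tx, msgs, 2, 0);
    close(tx);
    close(rx);
    return n;
}
```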

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: updated: kvm networking todo wiki
  2013-05-30  6:38               ` Stefan Hajnoczi
  2013-05-30  7:18                 ` Rusty Russell
@ 2013-05-30 13:40                 ` Anthony Liguori
  2013-05-30 13:44                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2013-05-30 13:40 UTC (permalink / raw)
  To: Stefan Hajnoczi, Rusty Russell
  Cc: kvm, Michael S. Tsirkin, netdev, qemu-devel,
	Linux Virtualization, herbert

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>> Anthony Liguori <anthony@codemonkey.ws> writes:
>>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>>>> backend with virtio-net in QEMU being what's guest facing.
>>>>>
>>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>>>> line of defense against a malicious guest while still getting the
>>>>> performance advantages of vhost-net (zero-copy).
>>>>>
>>>> It would be an interesting idea if we didn't already have the vhost
>>>> model where we don't need the userspace bounce.
>>>
>>> The model is very interesting for QEMU because then we can use vhost as
>>> a backend for other types of network adapters (like vmxnet3 or even
>>> e1000).
>>>
>>> It also helps for things like fault tolerance where we need to be able
>>> to control packet flow within QEMU.
>>
>> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
>>
>> Then I'm really confused as to what this would look like.  A zero copy
>> sendmsg?  We should be able to implement that today.
>>
>> On the receive side, what can we do better than readv?  If we need to
>> return to userspace to tell the guest that we've got a new packet, we
>> don't win on latency.  We might reduce syscall overhead with a
>> multi-dimensional readv to read multiple packets at once?
>
> Sounds like recvmmsg(2).

Could we map this to mergeable rx buffers, though?

Regards,

Anthony Liguori

>
> Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: updated: kvm networking todo wiki
  2013-05-30 13:40                 ` Anthony Liguori
@ 2013-05-30 13:44                   ` Michael S. Tsirkin
  2013-05-30 14:41                     ` Anthony Liguori
  0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2013-05-30 13:44 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: kvm, qemu-devel, Linux Virtualization, herbert, netdev

On Thu, May 30, 2013 at 08:40:47AM -0500, Anthony Liguori wrote:
> Stefan Hajnoczi <stefanha@gmail.com> writes:
> 
> > On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> >> Anthony Liguori <anthony@codemonkey.ws> writes:
> >>> Rusty Russell <rusty@rustcorp.com.au> writes:
> >>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
> >>>>> FWIW, I think what's more interesting is using vhost-net as a networking
> >>>>> backend with virtio-net in QEMU being what's guest facing.
> >>>>>
> >>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
> >>>>> line of defense against a malicious guest while still getting the
> >>>>> performance advantages of vhost-net (zero-copy).
> >>>>>
> >>>> It would be an interesting idea if we didn't already have the vhost
> >>>> model where we don't need the userspace bounce.
> >>>
> >>> The model is very interesting for QEMU because then we can use vhost as
> >>> a backend for other types of network adapters (like vmxnet3 or even
> >>> e1000).
> >>>
> >>> It also helps for things like fault tolerance where we need to be able
> >>> to control packet flow within QEMU.
> >>
> >> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
> >>
> >> Then I'm really confused as to what this would look like.  A zero copy
> >> sendmsg?  We should be able to implement that today.
> >>
> >> On the receive side, what can we do better than readv?  If we need to
> >> return to userspace to tell the guest that we've got a new packet, we
> >> don't win on latency.  We might reduce syscall overhead with a
> >> multi-dimensional readv to read multiple packets at once?
> >
> > Sounds like recvmmsg(2).
> 
> Could we map this to mergeable rx buffers, though?
> 
> Regards,
> 
> Anthony Liguori

Yes, because we don't have to complete buffers in order.

> >
> > Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: updated: kvm networking todo wiki
  2013-05-30 13:44                   ` Michael S. Tsirkin
@ 2013-05-30 14:41                     ` Anthony Liguori
  2013-06-03  0:32                       ` Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread
From: Anthony Liguori @ 2013-05-30 14:41 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, qemu-devel, Linux Virtualization, herbert, netdev

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Thu, May 30, 2013 at 08:40:47AM -0500, Anthony Liguori wrote:
>> Stefan Hajnoczi <stefanha@gmail.com> writes:
>> 
>> > On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>> >> Anthony Liguori <anthony@codemonkey.ws> writes:
>> >>> Rusty Russell <rusty@rustcorp.com.au> writes:
>> >>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>> >>>>> FWIW, I think what's more interesting is using vhost-net as a networking
>> >>>>> backend with virtio-net in QEMU being what's guest facing.
>> >>>>>
>> >>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>> >>>>> line of defense against a malicious guest while still getting the
>> >>>>> performance advantages of vhost-net (zero-copy).
>> >>>>>
>> >>>> It would be an interesting idea if we didn't already have the vhost
>> >>>> model where we don't need the userspace bounce.
>> >>>
>> >>> The model is very interesting for QEMU because then we can use vhost as
>> >>> a backend for other types of network adapters (like vmxnet3 or even
>> >>> e1000).
>> >>>
>> >>> It also helps for things like fault tolerance where we need to be able
>> >>> to control packet flow within QEMU.
>> >>
>> >> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
>> >>
>> >> Then I'm really confused as to what this would look like.  A zero copy
>> >> sendmsg?  We should be able to implement that today.
>> >>
>> >> On the receive side, what can we do better than readv?  If we need to
>> >> return to userspace to tell the guest that we've got a new packet, we
>> >> don't win on latency.  We might reduce syscall overhead with a
>> >> multi-dimensional readv to read multiple packets at once?
>> >
>> > Sounds like recvmmsg(2).
>> 
>> Could we map this to mergeable rx buffers, though?
>> 
>> Regards,
>> 
>> Anthony Liguori
>
> Yes because we don't have to complete buffers in order.

What I meant, though, was that for GRO we don't know how large the
received packet is going to be.  Mergeable rx buffers let us allocate a
pool of data for all incoming packets instead of allocating max packet
size * max packets.

recvmmsg expects an array of msghdrs, and I presume each needs to be
given a fixed size.  So this seems incompatible with mergeable rx
buffers.

Regards,

Anthony Liguori

>
>> >
>> > Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: updated: kvm networking todo wiki
  2013-05-30 14:41                     ` Anthony Liguori
@ 2013-06-03  0:32                       ` Rusty Russell
  0 siblings, 0 replies; 17+ messages in thread
From: Rusty Russell @ 2013-06-03  0:32 UTC (permalink / raw)
  To: Anthony Liguori, Michael S. Tsirkin
  Cc: kvm, qemu-devel, Linux Virtualization, herbert, netdev

Anthony Liguori <anthony@codemonkey.ws> writes:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
>
>> On Thu, May 30, 2013 at 08:40:47AM -0500, Anthony Liguori wrote:
>>> Stefan Hajnoczi <stefanha@gmail.com> writes:
>>> 
>>> > On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
>>> >> Anthony Liguori <anthony@codemonkey.ws> writes:
>>> >>> Rusty Russell <rusty@rustcorp.com.au> writes:
>>> >>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>> >>>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>> >>>>> backend with virtio-net in QEMU being what's guest facing.
>>> >>>>>
>>> >>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>> >>>>> line of defense against a malicious guest while still getting the
>>> >>>>> performance advantages of vhost-net (zero-copy).
>>> >>>>>
>>> >>>> It would be an interesting idea if we didn't already have the vhost
>>> >>>> model where we don't need the userspace bounce.
>>> >>>
>>> >>> The model is very interesting for QEMU because then we can use vhost as
>>> >>> a backend for other types of network adapters (like vmxnet3 or even
>>> >>> e1000).
>>> >>>
>>> >>> It also helps for things like fault tolerance where we need to be able
>>> >>> to control packet flow within QEMU.
>>> >>
>>> >> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
>>> >>
>>> >> Then I'm really confused as to what this would look like.  A zero copy
>>> >> sendmsg?  We should be able to implement that today.
>>> >>
>>> >> On the receive side, what can we do better than readv?  If we need to
>>> >> return to userspace to tell the guest that we've got a new packet, we
>>> >> don't win on latency.  We might reduce syscall overhead with a
>>> >> multi-dimensional readv to read multiple packets at once?
>>> >
>>> > Sounds like recvmmsg(2).
>>> 
>>> Could we map this to mergeable rx buffers, though?
>>> 
>>> Regards,
>>> 
>>> Anthony Liguori
>>
>> Yes because we don't have to complete buffers in order.
>
> What I meant, though, was that for GRO we don't know how large the
> received packet is going to be.  Mergeable rx buffers let us allocate
> a pool of data for all incoming packets instead of allocating max
> packet size * max packets.
>
> recvmmsg expects an array of msghdrs, and I presume each needs to be
> given a fixed size.  So this seems incompatible with mergeable rx
> buffers.

Good point.  You'd need to build 64k buffers to pass to recvmmsg, then
reuse the parts it didn't touch on the next call.  This limits us to
about a 16th of what we could do with an interface which understood
buffer merging, but I don't know how much that would matter in
practice.  We'd need some benchmarks....
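The "about a 16th" estimate above can be made concrete with back-of-envelope numbers — the figures below (ring depth, buffer granularity, 64 KiB worst-case merged packet) are illustrative assumptions, not virtio constants:

```c
/* Hedged arithmetic sketch of the fixed-slot vs. mergeable gap.
 * All constants here are illustrative assumptions. */
#define RING_ENTRIES 256
#define MAX_GRO_PKT  (64UL * 1024)  /* worst-case merged packet */
#define BUF_GRAIN    4096UL         /* mergeable-buffer granularity */

/* recvmmsg-style: every slot must be sized for the worst case. */
static unsigned long fixed_slot_bytes(void)
{
    return RING_ENTRIES * MAX_GRO_PKT;   /* 16 MiB with these numbers */
}

/* Mergeable rx buffers: one small buffer per slot; a large packet
 * simply consumes several consecutive slots. */
static unsigned long mergeable_pool_bytes(void)
{
    return RING_ENTRIES * BUF_GRAIN;     /* 1 MiB with these numbers */
}
```

fixed_slot_bytes() / mergeable_pool_bytes() comes out to 16 for these assumed sizes — the factor estimated above.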

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-06-03  0:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-23  8:50 updated: kvm networking todo wiki Michael S. Tsirkin
2013-05-23 14:12 ` Lucas Meneghel Rodrigues
2013-05-24  9:41 ` Jason Wang
2013-05-24 11:35   ` Michael S. Tsirkin
2013-05-24 13:47     ` Anthony Liguori
2013-05-24 14:00       ` Michael S. Tsirkin
2013-05-29  0:07         ` Rusty Russell
2013-05-29 13:01           ` Anthony Liguori
2013-05-29 14:12             ` Michael S. Tsirkin
2013-05-30  5:23             ` Rusty Russell
2013-05-30  6:38               ` Stefan Hajnoczi
2013-05-30  7:18                 ` Rusty Russell
2013-05-30 13:40                 ` Anthony Liguori
2013-05-30 13:44                   ` Michael S. Tsirkin
2013-05-30 14:41                     ` Anthony Liguori
2013-06-03  0:32                       ` Rusty Russell
2013-05-30 13:39               ` Anthony Liguori

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).