* [Qemu-devel] Make virtio-net.c ring size configurable?
@ 2014-02-14 13:43 Luke Gorrie
From: Luke Gorrie @ 2014-02-14 13:43 UTC
  To: qemu-devel; +Cc: snabb-devel


Howdy!

Observation: virtio-net.c hard-codes the vring size to 256 buffers.

Could this reasonably be made configurable, or would that be likely to
cause a problem?

In Snabb Switch we are creating a 1:1 mapping between Virtio-net
descriptors and VMDq hardware receive descriptors. The VMDq queues support
32768 buffers and I'd like to match this on the QEMU/Virtio-net side -- or
at least come close.
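For reference, the hard-coding lives in QEMU's hw/net/virtio-net.c,
where the queues are created with a literal 256. A minimal sketch of
what I have in mind, assuming a hypothetical qdev property named
"ring_size" (not an existing option; simplified from the real code,
which also handles multiqueue and the tx=bh/timer variants):

    /* Today (simplified): ring size is the literal 256. */
    n->rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx);

    /* Configurable variant: thread a property through instead.  The
       value would still need to be a power of two and no larger than
       VIRTQUEUE_MAX_SIZE. */
    DEFINE_PROP_UINT16("ring_size", VirtIONet, ring_size, 256),
    /* ... */
    n->rx_vq = virtio_add_queue(vdev, n->ring_size, virtio_net_handle_rx);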

Cheers!
-Luke



* Re: [Qemu-devel] Make virtio-net.c ring size configurable?
From: Mario Smarduch @ 2014-02-14 19:34 UTC
  To: Luke Gorrie; +Cc: snabb-devel, qemu-devel

On 02/14/2014 05:43 AM, Luke Gorrie wrote:
> Howdy!
> 
> Observation: virtio-net.c hard-codes the vring size to 256 buffers.
> 
> Could this reasonably be made configurable, or would that be likely to
> cause a problem?
> 
> In Snabb Switch we are creating a 1:1 mapping between Virtio-net
> descriptors and VMDq hardware receive descriptors. The VMDq queues
> support 32768 buffers and I'd like to match this on the QEMU/Virtio-net
> side -- or at least come close.
> 
> Cheers!
> -Luke

For PCI the ring size seems to be hardcoded. For 'virtio-mmio', the read
that fetches the maximum queue size checks whether vring.num != 0 and
returns VIRTQUEUE_MAX_SIZE (1024). Later (early in its probe) the guest
writes the new size to VIRTIO_MMIO_QUEUE_NUM, and virtio_queue_set_num()
adjusts the vring desc, avail, etc. values accordingly. The PCI variant
doesn't support writes to VIRTIO_PCI_QUEUE_NUM.
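
In guest terms the negotiation looks roughly like this (a sketch of the
probe path, register offsets as named in the Linux virtio-mmio header;
error handling omitted):

    /* Guest asks how big the ring may be; QEMU answers 1024. */
    u32 max = readl(base + VIRTIO_MMIO_QUEUE_NUM_MAX);
    u32 num = min(max, desired_size);  /* desired_size: illustrative */
    /* Guest picks the actual size; QEMU reacts by calling
       virtio_queue_set_num() to resize the vring. */
    writel(num, base + VIRTIO_MMIO_QUEUE_NUM);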

You might be able to try something similar for PCI, adjusting the
maximum value.

- Mario


* Re: [Qemu-devel] Make virtio-net.c ring size configurable?
From: Stefan Hajnoczi @ 2014-02-24 15:20 UTC
  To: Luke Gorrie; +Cc: snabb-devel, qemu-devel

On Fri, Feb 14, 2014 at 02:43:14PM +0100, Luke Gorrie wrote:
> Observation: virtio-net.c hard-codes the vring size to 256 buffers.
> 
> Could this reasonably be made configurable, or would that be likely to
> cause a problem?
> 
> In Snabb Switch we are creating a 1:1 mapping between Virtio-net
> descriptors and VMDq hardware receive descriptors. The VMDq queues support
> 32768 buffers and I'd like to match this on the QEMU/Virtio-net side -- or
> at least come close.

In reality virtio-net can use many more buffers because it has the
VIRTIO_RING_F_INDIRECT_DESC feature.  Each descriptor can point to a
whole new descriptor table.
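
(For reference, the descriptor layout from the virtio spec; setting
VRING_DESC_F_INDIRECT in flags makes addr point at a whole separate
table of these:)

    struct vring_desc {
        __le64 addr;   /* guest-physical buffer address */
        __le32 len;    /* buffer length; for an indirect descriptor,
                          the table size, i.e. len/16 entries */
        __le16 flags;  /* VRING_DESC_F_NEXT / _WRITE / _INDIRECT */
        __le16 next;   /* next descriptor when chaining */
    };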

Do you want the 1:1 mapping to achieve best performance or just to
simplify the coding?

Since vhost_net does many Gbit/s, I doubt the ring size is a limiting
factor, although there are still periodic discussions about tweaking
the direct vs. indirect descriptor heuristic.

Stefan


* Re: [Qemu-devel] [snabb-devel] Re: Make virtio-net.c ring size configurable?
From: Luke Gorrie @ 2014-02-24 16:14 UTC
  To: snabb-devel; +Cc: qemu-devel


On 24 February 2014 16:20, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> Do you want the 1:1 mapping to achieve best performance or just to
> simplify the coding?
>

We want to keep the real-time constraints on the data plane comfortable.

The question I ask myself is: How long can I buffer packets during
processing before something is dropped?

256 buffers can be consumed in 17 microseconds on a 10G interface. That's
uncomfortably tight for me. I would like every buffer in the data path to
be dimensioned for at least 100us of traffic - ideally more like 1ms. That
gives us more flexibility for scheduling work, handling configuration
changes, etc. So I'd love to have the guest know to keep us fed with e.g.
32768 buffers at all times.
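
(Back-of-envelope, assuming minimum-size 64-byte Ethernet frames, which
occupy 84 bytes of wire time each once preamble and inter-frame gap are
counted:)

    10 Gbit/s / (84 bytes * 8 bits) ~= 14.88 Mpps
    256 buffers / 14.88 Mpps        ~= 17.2 us
    100 us of headroom              ~= 1,500 buffers
    1 ms of headroom                ~= 14,900 buffers -> hence 32768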

Our data plane is batch-oriented and deals with "breaths" of 100+ packets
at a time. So we're a bit more hungry for buffers than a data plane that's
optimized for minimum latency instead.

What do you think? Can I reliably get the buffers I want with
VIRTIO_RING_F_INDIRECT_DESC, or should I increase the vring size?

> Since vhost_net does many Gbit/s, I doubt the ring size is a limiting
> factor, although there are still periodic discussions about tweaking
> the direct vs. indirect descriptor heuristic.
>

FWIW the workloads I'm focused on are high rates of small packets, as
seen by switch/router/firewall-type devices. I've found that it's
possible to struggle with these workloads even while getting solid
performance on e.g. TCP bulk transfer with TSO. So I'm prepared for the
possibility that what works well for others may not work well for our
application.



* Re: [Qemu-devel] [snabb-devel] Re: Make virtio-net.c ring size configurable?
From: Luke Gorrie @ 2014-02-24 19:16 UTC
  To: snabb-devel; +Cc: qemu-devel


On 24 February 2014 16:20, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Fri, Feb 14, 2014 at 02:43:14PM +0100, Luke Gorrie wrote:
> > In Snabb Switch we are creating a 1:1 mapping between Virtio-net
> > descriptors and VMDq hardware receive descriptors. The VMDq queues
> > support 32768 buffers and I'd like to match this on the
> > QEMU/Virtio-net side -- or at least come close.
>

[...]


> Do you want the 1:1 mapping to achieve best performance or just to
> simplify the coding?
>

More background:

The 1:1 mapping between hardware RX descriptors and Virtio-net descriptors
is for best performance, specifically for zero-copy operation. We want the
NIC to DMA the packets directly into guest memory and that's why we need to
pre-populate the NIC descriptor lists with suitable memory obtained from
the guest via the Virtio-net avail ring.
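
In pseudo-C the pre-population loop looks something like this (names
such as vmdq_rx and guest_to_host are illustrative, not Snabb Switch's
actual API):

    /* Pull free buffers off the guest's avail ring and program the
       matching VMDq RX descriptors, so the NIC DMAs each packet
       straight into guest memory -- no copy on the host side. */
    while (last_avail != vring->avail->idx) {
        uint16_t head = vring->avail->ring[last_avail++ % vring->num];
        struct vring_desc *d = &vring->desc[head];
        vmdq_rx[head].addr = guest_to_host(d->addr);  /* 1:1 mapping */
        vmdq_rx[head].len  = d->len;
    }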



* Re: [Qemu-devel] [snabb-devel] Re: Make virtio-net.c ring size configurable?
From: Stefan Hajnoczi @ 2014-02-27 14:17 UTC
  To: Michael S. Tsirkin; +Cc: Luke Gorrie, snabb-devel, qemu-devel

On Mon, Feb 24, 2014 at 05:14:04PM +0100, Luke Gorrie wrote:
> On 24 February 2014 16:20, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> > Do you want the 1:1 mapping to achieve best performance or just to
> > simplify the coding?
> >
> 
> We want to keep the real-time constraints on the data plane comfortable.
> 
> The question I ask myself is: How long can I buffer packets during
> processing before something is dropped?
> 
> 256 buffers can be consumed in 17 microseconds on a 10G interface.

This is a good point.  The virtio-net vring is too small at 256 buffers
for workloads that want to send/receive small packets at 10 Gbit/s line
rate.  (Minimum UDP packet size is 52 bytes!)

Michael: Luke has asked to increase the virtio-net virtqueue size.
Thoughts?

Stefan


* Re: [Qemu-devel] [snabb-devel] Re: Make virtio-net.c ring size configurable?
From: Michael S. Tsirkin @ 2014-02-27 14:49 UTC
  To: Stefan Hajnoczi; +Cc: Luke Gorrie, snabb-devel, qemu-devel

On Thu, Feb 27, 2014 at 03:17:44PM +0100, Stefan Hajnoczi wrote:
> On Mon, Feb 24, 2014 at 05:14:04PM +0100, Luke Gorrie wrote:
> > On 24 February 2014 16:20, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > 
> > > Do you want the 1:1 mapping to achieve best performance or just to
> > > simplify the coding?
> > >
> > 
> > We want to keep the real-time constraints on the data plane comfortable.
> > 
> > The question I ask myself is: How long can I buffer packets during
> > processing before something is dropped?
> > 
> > 256 buffers can be consumed in 17 microseconds on a 10G interface.
> 
> This is a good point.  The virtio-net vring is too small at 256 buffers
> for workloads that want to send/receive small packets at 10 Gbit/s line
> rate.  (Minimum UDP packet size is 52 bytes!)
> 
> Michael: Luke has asked to increase the virtio-net virtqueue size.
> Thoughts?
> 
> Stefan

Heh, you want to increase the bufferbloat? Each buffer descriptor takes
up 16 bytes, so we are using order-2 allocations as it is; make the
ring any bigger and the allocation will start to fail if hotplug
happens long after boot.
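
(Roughly, per the legacy vring layout -- 16*num bytes of descriptors,
6 + 2*num bytes of avail ring, 6 + 8*num bytes of used ring, page
aligned:)

    num = 256   -> ~10 KB vring  -> order-2 (16 KB contiguous)
    num = 1024  -> ~28 KB        -> order-3
    num = 32768 -> ~850 KB       -> order-8; contiguous allocations
                   that large tend to fail once guest memory fragments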

AFAIK bare metal does not push line rate with a 1-byte payload either.

-- 
MST


* Re: [Qemu-devel] [snabb-devel] Re: Make virtio-net.c ring size configurable?
From: Luke Gorrie @ 2014-02-28  8:02 UTC
  To: snabb-devel; +Cc: Stefan Hajnoczi, qemu-devel


On 27 February 2014 15:49, Michael S. Tsirkin <mst@redhat.com> wrote:

> > Michael: Luke has asked to increase the virtio-net virtqueue size.
> > Thoughts?
> >
> > Stefan
>
> Heh you want to increase the bufferbloat?
>

I'm sensitive to this. (I have actually built a commercial
anti-bufferbloat network device for ISPs in the recent past.) I will go
to great lengths to keep latency below 1 millisecond, but beyond that
I'm more flexible.

> Each buffer descriptor takes up 16 bytes, so we are using order-2
> allocations as it is; make the ring any bigger and the allocation
> will start to fail if hotplug happens long after boot.
>

(Sorry I don't have the background to understand this issue.)


> AFAIK bare metal does not push line rate with a 1-byte payload
> either.
>

To me this feels normal in the commercial networking industry. Many
networking vendors will sell you a NIC with a software interface for
driving it at line rate from userspace: Intel, Myricom, SolarFlare,
Chelsio, Mellanox. These really work, and lots of high-end commercial
network devices are built on such simple and cheap components.

Here's one detailed performance test that Luca Deri did based on standard
Intel CPU and NIC and all packet sizes:
http://www.ntop.org/wp-content/uploads/2012/04/DNA_ip_forward_RFC2544.pdf

For my current project I need to drive 6x 10G ports' worth of network
traffic through Virtio-net to KVM guests. That's the ballpark of what
the ISPs I'm talking with require in order to use Virtio-net instead of
SR-IOV + passthrough. They really want to use Virtio-net for a variety
of reasons, and the only barrier is performance on router-like
workloads.

I'm working on Deutsche Telekom's TeraStream project [1] [2] and success
will mean that Virtio-net drives all internet traffic for national ISPs.
That would be really cool imo :-).

[1] TeraStream blurb
http://blog.ipspace.net/2013/11/deutsche-telekom-terastream-designed.html
[2] TeraStream talk http://ripe67.ripe.net/archives/video/3/



* Re: [Qemu-devel] Make virtio-net.c ring size configurable?
From: xchenum @ 2016-05-12 17:38 UTC
  To: Snabb Switch development; +Cc: qemu-devel

Luke, I might have a similar problem... I am wondering if you ended up
increasing the ring buffer size yourself.

My problem is on the TX side. When sending many small UDP packets, I
see "outgoing packets dropped" in "netstat -s" increase quickly.
Increasing the txqueuelen of the interface and the wmem sizes in sysctl
doesn't seem to help at all. The TX ring size is what I am looking at
now. My VM, however, is connected to a bridge and then to Open vSwitch,
so I might have other bottlenecks...

Thanks!

On Friday, February 14, 2014 at 8:43:14 AM UTC-5, Luke Gorrie wrote:
>
> Howdy!
>
> Observation: virtio-net.c hard-codes the vring size to 256 buffers.
>
> Could this reasonably be made configurable, or would that be likely to 
> cause a problem?
>
> In Snabb Switch we are creating a 1:1 mapping between Virtio-net 
> descriptors and VMDq hardware receive descriptors. The VMDq queues support 
> 32768 buffers and I'd like to match this on the QEMU/Virtio-net side -- or 
> at least come close.
>
> Cheers!
> -Luke

