* VSOCK benchmark and optimizations
From: Stefano Garzarella @ 2019-04-01 16:32 UTC (permalink / raw)
To: alex.bennee; +Cc: qemu devel list, netdev
Hi Alex,
I'm sending you some benchmarks and information about VSOCK, CCing qemu-devel
and linux-netdev (this info may be useful for others :))
One of the VSOCK advantages is the simple configuration: you don't need to set
up IP addresses for guest/host, and it can be used with the standard POSIX
socket API. [1]
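To make that concrete, here is a minimal AF_VSOCK client/server sketch in
Python (available on Linux since Python 3.7). The CID and port values are
just examples matching the QEMU setup later in this mail; this is an
illustration of the API, not code from the benchmark:

```python
import socket

GUEST_CID = 3   # example value; must match the guest-cid= QEMU option
PORT = 9999     # example value; VSOCK ports are a 32-bit space, separate from TCP/UDP

def make_server(port=PORT):
    """Listen on a VSOCK port (run this inside the guest)."""
    srv = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    # VMADDR_CID_ANY accepts connections from any CID, like INADDR_ANY for IP.
    srv.bind((socket.VMADDR_CID_ANY, port))
    srv.listen(1)
    return srv

def make_client(cid=GUEST_CID, port=PORT):
    """Connect from the host to the guest's CID: no IP configuration needed."""
    cli = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    cli.connect((cid, port))
    return cli
```

The address tuple is simply (CID, port), so existing socket code ports over
with almost no changes.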
I'm currently working on it, so the "optimized" values are still a work in
progress; I'll send the patches upstream (Linux) as soon as possible
(hopefully in 1 or 2 weeks).
Optimizations:
+ reduce the number of credit update packets
  - previously, the RX side sent an empty packet on every packet received,
    only to inform the TX side about the available space in the RX buffer
+ increase the RX buffer size to 64 KB (from 4 KB)
+ merge packets to fill RX buffers
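The credit-update reduction can be sketched like this (illustrative Python,
not the actual Linux virtio-vsock code; the half-buffer threshold is an
assumption for the example):

```python
LOW_WM_RATIO = 0.5  # assumed threshold: announce when half the buffer frees up

class RxCredit:
    """Receiver-side bookkeeping for credit-based flow control."""

    def __init__(self, buf_size):
        self.buf_size = buf_size
        self.fwd_cnt = 0       # total bytes consumed by the receiver
        self.last_fwd_cnt = 0  # fwd_cnt value last announced to the sender

    def on_data_consumed(self, nbytes):
        """Return True if an explicit credit-update packet should be sent."""
        self.fwd_cnt += nbytes
        # Before the optimization: send an empty credit-update packet for
        # every packet consumed. After: only announce when enough space has
        # been freed since the last update (credit also piggybacks on any
        # data packet going the other way).
        if self.fwd_cnt - self.last_fwd_cnt >= self.buf_size * LOW_WM_RATIO:
            self.last_fwd_cnt = self.fwd_cnt
            return True
        return False
```

With a 64 KB buffer, consuming 16 KB twice triggers one update instead of
two, cutting the number of empty control packets on the wire.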
As a benchmark tool, I used iperf3 [2] modified with VSOCK support:
            host -> guest [Gbps]        guest -> host [Gbps]
pkt_size    before opt.  optimized      before opt.  optimized
1K              0.5         1.6             1.4         1.4
2K              1.1         3.1             2.3         2.5
4K              2.0         5.6             4.2         4.4
8K              3.2        10.2             7.2         7.5
16K             6.4        14.2             9.4        11.3
32K             9.8        18.9             9.2        17.8
64K            13.8        22.9             8.8        25.0
128K           17.6        24.5             7.7        25.7
256K           19.0        24.8             8.1        25.6
512K           20.8        25.1             8.1        25.4
How to reproduce:
host$ modprobe vhost_vsock
host$ qemu-system-x86_64 ... -device vhost-vsock-pci,guest-cid=3
# Note: the guest CID must be >= 3
# (0 and 1 are reserved, and 2 identifies the host)
guest$ iperf3 --vsock -s
host$ iperf3 --vsock -c 3 -l ${pkt_size} # host -> guest
host$ iperf3 --vsock -c 3 -l ${pkt_size} -R # guest -> host
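The whole sweep over the table's packet sizes could be scripted like this
(a sketch: it only prints the command lines, and assumes the VSOCK-enabled
iperf3 fork [2] is installed as `iperf3` with the guest server already running):

```shell
#!/bin/sh
# Print the iperf3 invocations for every packet size in the table above.
for pkt_size in 1K 2K 4K 8K 16K 32K 64K 128K 256K 512K; do
    echo "iperf3 --vsock -c 3 -l ${pkt_size}"       # host -> guest
    echo "iperf3 --vsock -c 3 -l ${pkt_size} -R"    # guest -> host
done
```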
If you want, I can do a similar benchmark (with iperf3) using a network
card (do you have a specific configuration in mind?).
Let me know if you need more details!
Thanks,
Stefano
[1] https://wiki.qemu.org/Features/VirtioVsock
[2] https://github.com/stefano-garzarella/iperf/
* Re: VSOCK benchmark and optimizations
From: Alex Bennée @ 2019-04-02 4:19 UTC (permalink / raw)
To: Stefano Garzarella; +Cc: qemu devel list, netdev
Stefano Garzarella <sgarzare@redhat.com> writes:
> Hi Alex,
> I'm sending you some benchmarks and information about VSOCK CCing qemu-devel
> and linux-netdev (maybe this info could be useful for others :))
>
> One of the VSOCK advantages is the simple configuration: you don't need to set
> up IP addresses for guest/host, and it can be used with the standard POSIX
> socket API. [1]
>
> I'm currently working on it, so the "optimized" values are still work in
> progress and I'll send the patches upstream (Linux) as soon as possible.
> (I hope in 1 or 2 weeks)
>
> Optimizations:
> + reducing the number of credit update packets
> - RX side sent, on every packet received, an empty packet only to inform the
> TX side about the space in the RX buffer.
> + increase RX buffers size to 64 KB (from 4 KB)
> + merge packets to fill RX buffers
>
> As benchmark tool I used iperf3 [2] modified with VSOCK support:
>
> host -> guest [Gbps] guest -> host [Gbps]
> pkt_size before opt. optimized before opt. optimized
> 1K 0.5 1.6 1.4 1.4
> 2K 1.1 3.1 2.3 2.5
> 4K 2.0 5.6 4.2 4.4
> 8K 3.2 10.2 7.2 7.5
> 16K 6.4 14.2 9.4 11.3
> 32K 9.8 18.9 9.2 17.8
> 64K 13.8 22.9 8.8 25.0
> 128K 17.6 24.5 7.7 25.7
> 256K 19.0 24.8 8.1 25.6
> 512K 20.8 25.1 8.1 25.4
>
>
> How to reproduce:
>
> host$ modprobe vhost_vsock
> host$ qemu-system-x86_64 ... -device vhost-vsock-pci,guest-cid=3
> # Note: Guest CID should be >= 3
> # (0, 1 are reserved and 2 identify the host)
>
> guest$ iperf3 --vsock -s
>
> host$ iperf3 --vsock -c 3 -l ${pkt_size} # host -> guest
> host$ iperf3 --vsock -c 3 -l ${pkt_size} -R # guest -> host
>
>
> If you want, I can do a similar benchmark (with iperf3) using a networking
> card (do you have a specific configuration?).
My main interest is how it stacks up against:
--device virtio-net-pci and I guess the vhost equivalent
AIUI one of the motivators was being able to run something like NFS for
a guest FS over vsock instead of the overhead from UDP and having to
deal with the additional complication of having a working network setup.
>
> Let me know if you need more details!
>
> Thanks,
> Stefano
>
> [1] https://wiki.qemu.org/Features/VirtioVsock
> [2] https://github.com/stefano-garzarella/iperf/
--
Alex Bennée
* Re: VSOCK benchmark and optimizations
From: Stefano Garzarella @ 2019-04-02 7:37 UTC (permalink / raw)
To: Alex Bennée; +Cc: qemu devel list, netdev, Stefan Hajnoczi
On Tue, Apr 02, 2019 at 04:19:25AM +0000, Alex Bennée wrote:
>
> Stefano Garzarella <sgarzare@redhat.com> writes:
>
> > Hi Alex,
> > I'm sending you some benchmarks and information about VSOCK CCing qemu-devel
> > and linux-netdev (maybe this info could be useful for others :))
> >
> > One of the VSOCK advantages is the simple configuration: you don't need to set
> > up IP addresses for guest/host, and it can be used with the standard POSIX
> > socket API. [1]
> >
> > I'm currently working on it, so the "optimized" values are still work in
> > progress and I'll send the patches upstream (Linux) as soon as possible.
> > (I hope in 1 or 2 weeks)
> >
> > Optimizations:
> > + reducing the number of credit update packets
> > - RX side sent, on every packet received, an empty packet only to inform the
> > TX side about the space in the RX buffer.
> > + increase RX buffers size to 64 KB (from 4 KB)
> > + merge packets to fill RX buffers
> >
> > As benchmark tool I used iperf3 [2] modified with VSOCK support:
> >
> > host -> guest [Gbps] guest -> host [Gbps]
> > pkt_size before opt. optimized before opt. optimized
> > 1K 0.5 1.6 1.4 1.4
> > 2K 1.1 3.1 2.3 2.5
> > 4K 2.0 5.6 4.2 4.4
> > 8K 3.2 10.2 7.2 7.5
> > 16K 6.4 14.2 9.4 11.3
> > 32K 9.8 18.9 9.2 17.8
> > 64K 13.8 22.9 8.8 25.0
> > 128K 17.6 24.5 7.7 25.7
> > 256K 19.0 24.8 8.1 25.6
> > 512K 20.8 25.1 8.1 25.4
> >
> >
> > How to reproduce:
> >
> > host$ modprobe vhost_vsock
> > host$ qemu-system-x86_64 ... -device vhost-vsock-pci,guest-cid=3
> > # Note: Guest CID should be >= 3
> > # (0, 1 are reserved and 2 identify the host)
> >
> > guest$ iperf3 --vsock -s
> >
> > host$ iperf3 --vsock -c 3 -l ${pkt_size} # host -> guest
> > host$ iperf3 --vsock -c 3 -l ${pkt_size} -R # guest -> host
> >
> >
> > If you want, I can do a similar benchmark (with iperf3) using a networking
> > card (do you have a specific configuration?).
>
> My main interest is how it stacks up against:
>
> --device virtio-net-pci and I guess the vhost equivalent
I'll do some tests with virtio-net and vhost.
>
> AIUI one of the motivators was being able to run something like NFS for
> a guest FS over vsock instead of the overhead from UDP and having to
> deal with the additional complication of having a working network setup.
>
CCing Stefan.
I know he is working on virtio-fs, which may suit your use cases better.
He also worked on VSOCK support for NFS, but I don't think it has been merged upstream.
Thanks,
Stefano
* Re: [Qemu-devel] VSOCK benchmark and optimizations
From: Stefan Hajnoczi @ 2019-04-03 12:34 UTC (permalink / raw)
To: Stefano Garzarella; +Cc: alex.bennee, netdev, qemu devel list
On Mon, Apr 01, 2019 at 06:32:40PM +0200, Stefano Garzarella wrote:
> Hi Alex,
> I'm sending you some benchmarks and information about VSOCK CCing qemu-devel
> and linux-netdev (maybe this info could be useful for others :))
>
> One of the VSOCK advantages is the simple configuration: you don't need to set
> up IP addresses for guest/host, and it can be used with the standard POSIX
> socket API. [1]
>
> I'm currently working on it, so the "optimized" values are still work in
> progress and I'll send the patches upstream (Linux) as soon as possible.
> (I hope in 1 or 2 weeks)
>
> Optimizations:
> + reducing the number of credit update packets
> - RX side sent, on every packet received, an empty packet only to inform the
> TX side about the space in the RX buffer.
> + increase RX buffers size to 64 KB (from 4 KB)
> + merge packets to fill RX buffers
>
> As benchmark tool I used iperf3 [2] modified with VSOCK support:
>
> host -> guest [Gbps] guest -> host [Gbps]
> pkt_size before opt. optimized before opt. optimized
> 1K 0.5 1.6 1.4 1.4
This is a "large" small packet size. I think 64 bytes is a common
"small" packet size and is worth benchmarking too.
> 2K 1.1 3.1 2.3 2.5
> 4K 2.0 5.6 4.2 4.4
> 8K 3.2 10.2 7.2 7.5
> 16K 6.4 14.2 9.4 11.3
> 32K 9.8 18.9 9.2 17.8
> 64K 13.8 22.9 8.8 25.0
> 128K 17.6 24.5 7.7 25.7
> 256K 19.0 24.8 8.1 25.6
> 512K 20.8 25.1 8.1 25.4
Nice improvements!
Stefan
* Re: [Qemu-devel] VSOCK benchmark and optimizations
From: Stefan Hajnoczi @ 2019-04-03 12:36 UTC (permalink / raw)
To: Stefano Garzarella
Cc: Alex Bennée, netdev, qemu devel list, Stefan Hajnoczi
On Tue, Apr 02, 2019 at 09:37:06AM +0200, Stefano Garzarella wrote:
> On Tue, Apr 02, 2019 at 04:19:25AM +0000, Alex Bennée wrote:
> >
> > Stefano Garzarella <sgarzare@redhat.com> writes:
> >
> > > Hi Alex,
> > > I'm sending you some benchmarks and information about VSOCK CCing qemu-devel
> > > and linux-netdev (maybe this info could be useful for others :))
> > >
> > > One of the VSOCK advantages is the simple configuration: you don't need to set
> > > up IP addresses for guest/host, and it can be used with the standard POSIX
> > > socket API. [1]
> > >
> > > I'm currently working on it, so the "optimized" values are still work in
> > > progress and I'll send the patches upstream (Linux) as soon as possible.
> > > (I hope in 1 or 2 weeks)
> > >
> > > Optimizations:
> > > + reducing the number of credit update packets
> > > - RX side sent, on every packet received, an empty packet only to inform the
> > > TX side about the space in the RX buffer.
> > > + increase RX buffers size to 64 KB (from 4 KB)
> > > + merge packets to fill RX buffers
> > >
> > > As benchmark tool I used iperf3 [2] modified with VSOCK support:
> > >
> > > host -> guest [Gbps] guest -> host [Gbps]
> > > pkt_size before opt. optimized before opt. optimized
> > > 1K 0.5 1.6 1.4 1.4
> > > 2K 1.1 3.1 2.3 2.5
> > > 4K 2.0 5.6 4.2 4.4
> > > 8K 3.2 10.2 7.2 7.5
> > > 16K 6.4 14.2 9.4 11.3
> > > 32K 9.8 18.9 9.2 17.8
> > > 64K 13.8 22.9 8.8 25.0
> > > 128K 17.6 24.5 7.7 25.7
> > > 256K 19.0 24.8 8.1 25.6
> > > 512K 20.8 25.1 8.1 25.4
> > >
> > >
> > > How to reproduce:
> > >
> > > host$ modprobe vhost_vsock
> > > host$ qemu-system-x86_64 ... -device vhost-vsock-pci,guest-cid=3
> > > # Note: Guest CID should be >= 3
> > > # (0, 1 are reserved and 2 identify the host)
> > >
> > > guest$ iperf3 --vsock -s
> > >
> > > host$ iperf3 --vsock -c 3 -l ${pkt_size} # host -> guest
> > > host$ iperf3 --vsock -c 3 -l ${pkt_size} -R # guest -> host
> > >
> > >
> > > If you want, I can do a similar benchmark (with iperf3) using a networking
> > > card (do you have a specific configuration?).
> >
> > My main interest is how it stacks up against:
> >
> > --device virtio-net-pci and I guess the vhost equivalent
>
> I'll do some tests with virtio-net and vhost.
>
> >
> > AIUI one of the motivators was being able to run something like NFS for
> > a guest FS over vsock instead of the overhead from UDP and having to
> > deal with the additional complication of having a working network setup.
> >
>
> CCing Stefan.
>
> I know he is working on virtio-fs that maybe suite better with your use cases.
> He also worked on VSOCK support for NFS, but I think it is not merged upstream.
Hi Alex,
David Gilbert, Vivek Goyal, Miklos Szeredi, and I are working on
virtio-fs for host<->guest file sharing. It performs better than
virtio-9p and we're currently working on getting it upstream (first the
VIRTIO device spec, then Linux and QEMU patches).
You can read about it and try it here:
https://virtio-fs.gitlab.io/
Stefan
* Re: [Qemu-devel] VSOCK benchmark and optimizations
From: Stefano Garzarella @ 2019-04-03 15:10 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: alex.bennee, netdev, qemu devel list
On Wed, Apr 03, 2019 at 01:34:38PM +0100, Stefan Hajnoczi wrote:
> On Mon, Apr 01, 2019 at 06:32:40PM +0200, Stefano Garzarella wrote:
> > Hi Alex,
> > I'm sending you some benchmarks and information about VSOCK CCing qemu-devel
> > and linux-netdev (maybe this info could be useful for others :))
> >
> > One of the VSOCK advantages is the simple configuration: you don't need to set
> > up IP addresses for guest/host, and it can be used with the standard POSIX
> > socket API. [1]
> >
> > I'm currently working on it, so the "optimized" values are still work in
> > progress and I'll send the patches upstream (Linux) as soon as possible.
> > (I hope in 1 or 2 weeks)
> >
> > Optimizations:
> > + reducing the number of credit update packets
> > - RX side sent, on every packet received, an empty packet only to inform the
> > TX side about the space in the RX buffer.
> > + increase RX buffers size to 64 KB (from 4 KB)
> > + merge packets to fill RX buffers
> >
> > As benchmark tool I used iperf3 [2] modified with VSOCK support:
> >
> > host -> guest [Gbps] guest -> host [Gbps]
> > pkt_size before opt. optimized before opt. optimized
> > 1K 0.5 1.6 1.4 1.4
>
> This is a "large" small package size. I think 64 bytes is a common
> "small" packet size and is worth benchmarking too.
>
Okay, I'll add more small packet sizes for the benchmark.
> > 2K 1.1 3.1 2.3 2.5
> > 4K 2.0 5.6 4.2 4.4
> > 8K 3.2 10.2 7.2 7.5
> > 16K 6.4 14.2 9.4 11.3
> > 32K 9.8 18.9 9.2 17.8
> > 64K 13.8 22.9 8.8 25.0
> > 128K 17.6 24.5 7.7 25.7
> > 256K 19.0 24.8 8.1 25.6
> > 512K 20.8 25.1 8.1 25.4
>
> Nice improvements!
Thanks :)
I'm cleaning up the patches and doing step-by-step benchmarks; I hope to
send the series upstream in the next few days.
Stefano
* Re: VSOCK benchmark and optimizations
From: Stefano Garzarella @ 2019-04-04 10:47 UTC (permalink / raw)
To: Alex Bennée; +Cc: qemu devel list, netdev, Stefan Hajnoczi
On Tue, Apr 02, 2019 at 04:19:25AM +0000, Alex Bennée wrote:
>
> My main interest is how it stacks up against:
>
> --device virtio-net-pci and I guess the vhost equivalent
>
Hi Alex,
I added TCP tests on virtio-net, and also a test with TCP_NODELAY, to be
fair, since VSOCK doesn't implement anything like it (it could be an
improvement worth adding to maximize throughput).
I set the MTU to the maximum allowed (65520).
I also redid the VSOCK tests. There are some differences because I'm now
using tuned to reduce fluctuations, and I removed batching from the VSOCK
optimizations because it is not ready to be published.
            VSOCK                      TCP + virtio-net + vhost
            host -> guest [Gbps]       host -> guest [Gbps]
pkt_size    before opt.  optimized     default    TCP_NODELAY
64            0.060        0.096         0.16        0.15
256           0.22         0.36          0.32        0.57
512           0.42         0.74          1.2         1.2
1K            0.7          1.5           2.1         2.1
2K            1.5          2.9           3.5         3.4
4K            2.5          5.3           5.5         5.3
8K            3.9          8.8           8.0         7.9
16K           6.6         12.8           9.8        10.2
32K           9.9         18.1          11.8        10.7
64K          13.5         21.4          11.4        11.3
128K         17.9         23.6          11.2        11.0
256K         18.0         24.4          11.1        11.0
512K         18.4         25.3          10.1        10.7
Note: maybe I have something misconfigured, because TCP on virtio-net
doesn't exceed 11 Gbps here.
            VSOCK                      TCP + virtio-net + vhost
            guest -> host [Gbps]       guest -> host [Gbps]
pkt_size    before opt.  optimized     default    TCP_NODELAY
64            0.088        0.101         0.24        0.24
256           0.35         0.41          0.36        1.03
512           0.70         0.73          0.69        1.6
1K            1.1          1.3           1.1         3.0
2K            2.4          2.6           2.1         5.5
4K            4.3          4.5           3.8         8.8
8K            7.3          7.6           6.6        20.0
16K           9.2         11.1          12.3        29.4
32K           8.3         18.1          19.3        28.2
64K           8.3         25.4          20.6        28.7
128K          7.2         26.7          23.1        27.9
256K          7.7         24.9          28.5        29.4
512K          7.7         25.0          28.3        29.3
virtio-net is better optimized than VSOCK, but we are close :). Maybe we
will use virtio-net as a transport for VSOCK, in order to avoid duplicating
optimizations.
How to reproduce TCP tests:
host$ ip link set dev br0 mtu 65520
host$ ip link set dev tap0 mtu 65520
host$ qemu-system-x86_64 ... \
-netdev tap,id=net0,vhost=on,ifname=tap0,script=no,downscript=no \
-device virtio-net-pci,netdev=net0
guest$ ip link set dev eth0 mtu 65520
guest$ iperf3 -s
host$ iperf3 -c ${VM_IP} -N -l ${pkt_size} # host -> guest
host$ iperf3 -c ${VM_IP} -N -l ${pkt_size} -R # guest -> host
Cheers,
Stefano