* vhost-blk development
@ 2012-04-09 22:59 Michael Baysek
  2012-04-10 11:55 ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Baysek @ 2012-04-09 22:59 UTC (permalink / raw)
  To: kvm

Hi all.  I'm interested in any developments on the vhost-blk in-kernel accelerator for disk I/O.

I had seen a patchset on LKML https://lkml.org/lkml/2011/7/28/175 but that is rather old.  Are there any newer developments going on with the vhost-blk stuff?


* Re: vhost-blk development
  2012-04-09 22:59 vhost-blk development Michael Baysek
@ 2012-04-10 11:55 ` Stefan Hajnoczi
  2012-04-10 17:25   ` Michael Baysek
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-04-10 11:55 UTC (permalink / raw)
  To: Michael Baysek; +Cc: kvm

On Mon, Apr 9, 2012 at 11:59 PM, Michael Baysek <mbaysek@liquidweb.com> wrote:
> Hi all.  I'm interested in any developments on the vhost-blk in-kernel accelerator for disk I/O.
>
> I had seen a patchset on LKML https://lkml.org/lkml/2011/7/28/175 but that is rather old.  Are there any newer developments going on with the vhost-blk stuff?

Hi Michael,
I'm curious what you are looking for in vhost-blk.  Are you trying to
improve disk performance for KVM guests?

Perhaps you'd like to share your configuration, workload, and other
details so that we can discuss how to improve performance.

Stefan


* Re: vhost-blk development
  2012-04-10 11:55 ` Stefan Hajnoczi
@ 2012-04-10 17:25   ` Michael Baysek
  2012-04-11 10:19     ` Stefan Hajnoczi
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Baysek @ 2012-04-10 17:25 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm

Hi Stefan.  

Well, I'm trying to determine which I/O method currently has the least overhead and gives the best performance for both reads and writes.

I am doing my testing by putting the entire guest onto a ramdisk.  I'm working on an i5-760 with 16 GB RAM and VT-d enabled.  I am running the standard CentOS 6 kernel with the 0.12.1.2 release of qemu-kvm that comes stock on CentOS 6.  The guest is configured with 512 MB RAM and 4 CPU cores, with its /dev/vda being the ramdisk on the host.
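
For illustration, the guest is launched with something roughly like the
following -- a sketch rather than my exact command line, with /dev/ram0
standing in for the host ramdisk and the other flags trimmed down:

    qemu-kvm -m 512 -smp 4 \
        -drive file=/dev/ram0,if=virtio,format=raw \
        -net nic,model=virtio -net user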

I'm not opposed to building a custom kernel or KVM if I can get better performance reliably.  However, my initial attempts with the 3.3.1 kernel and the latest KVM gave mixed results.
  
I've been using iozone 3.98 with -O -l32 -i0 -i1 -i2 -e -+n -r4K -s250M to measure performance.

So, I was interested in vhost-blk since it seemed like a promising avenue to take a look at.  If you have any other thoughts, that would also be helpful.

-Mike




* Re: vhost-blk development
  2012-04-10 17:25   ` Michael Baysek
@ 2012-04-11 10:19     ` Stefan Hajnoczi
  2012-04-11 16:52       ` Michael Baysek
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-04-11 10:19 UTC (permalink / raw)
  To: Michael Baysek; +Cc: kvm

On Tue, Apr 10, 2012 at 6:25 PM, Michael Baysek <mbaysek@liquidweb.com> wrote:
> Well, I'm trying to determine which I/O method currently has the least overhead and gives the best performance for both reads and writes.
>
> I am doing my testing by putting the entire guest onto a ramdisk.  I'm working on an i5-760 with 16 GB RAM and VT-d enabled.  I am running the standard CentOS 6 kernel with the 0.12.1.2 release of qemu-kvm that comes stock on CentOS 6.  The guest is configured with 512 MB RAM and 4 CPU cores, with its /dev/vda being the ramdisk on the host.

Results collected on a ramdisk usually do not reflect the performance
you get with a real disk or SSD.  I suggest using the host/guest
configuration you want to deploy.

> I've been using iozone 3.98 with -O -l32 -i0 -i1 -i2 -e -+n -r4K -s250M to measure performance.

I haven't looked up the options but I think you need -I to use
O_DIRECT and bypass the guest page cache - otherwise you are not
benchmarking I/O performance but overall file system/page cache
performance.
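
For example, I believe adding -I to your existing run would give you
the direct I/O behaviour (untested on my side, so worth checking
against the iozone 3.98 man page):

    iozone -I -O -l32 -i0 -i1 -i2 -e -+n -r4K -s250M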

Stefan


* Re: vhost-blk development
  2012-04-11 10:19     ` Stefan Hajnoczi
@ 2012-04-11 16:52       ` Michael Baysek
  2012-04-12  8:30         ` Stefan Hajnoczi
  2012-04-13  5:38         ` Liu Yuan
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Baysek @ 2012-04-11 16:52 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm

In this particular case, I did intend to deploy these instances directly to 
the ramdisk.  I want to squeeze every drop of performance out of these 
instances for use cases with lots of concurrent accesses.   I thought it 
would be possible to achieve improvements an order of magnitude or more 
over SSD, but it seems not to be the case (so far).  

I am purposefully not using O_DIRECT since most workloads will not be using
it, although I did notice better performance when I did use it.  I had
already identified the page cache as a hindrance as well.

I seem to have hit some performance ceilings inside of the kvm guests that 
are much lower than that of the host they are running on.  I am seeing a 
lot more interrupts and context switches on the parent than I am in the 
guests, and I am looking for any and all ways to cut these down.  

I had read somewhere that vhost-blk might help.  However, those patches were 
posted on qemu-devel in 2010, with some activity on LKML in 2011, but not 
much since.  I feared that the reason they are still not merged might be 
bugs, incomplete implementation, or something of the sort.  

Anyhow, I thank you for your quick and timely responses.  I have spent some 
weeks investigating ways to boost performance in this use case and I am 
left with few remaining options.  I hope I have communicated clearly what I 
am trying to accomplish, and why I am inquiring specifically about vhost-blk.  

Regards,

-Mike



* Re: vhost-blk development
  2012-04-11 16:52       ` Michael Baysek
@ 2012-04-12  8:30         ` Stefan Hajnoczi
  2012-04-13  5:38         ` Liu Yuan
  1 sibling, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2012-04-12  8:30 UTC (permalink / raw)
  To: Michael Baysek; +Cc: kvm

On Wed, Apr 11, 2012 at 5:52 PM, Michael Baysek <mbaysek@liquidweb.com> wrote:
> I am purposefully not using O_DIRECT since most workloads will not be using
> it, although I did notice better performance when I did use it.  I had
> already identified the page cache as a hindrance as well.

If you do not use O_DIRECT in the benchmark then you are not
exercising pure disk I/O and therefore the benchmark is not measuring
virtio-blk performance.

> I seem to have hit some performance ceilings inside of the kvm guests that
> are much lower than that of the host they are running on.  I am seeing a
> lot more interrupts and context switches on the parent than I am in the
> guests, and I am looking for any and all ways to cut these down.

If you don't care about persistence of data, I suggest creating the
ramdisk or tmpfs inside the guest.  That way you keep all "storage"
operations inside the guest and avoid going through any storage
interface (virtio-blk, IDE, etc.).  You should be able to achieve
maximum performance with this approach, although it does require guest
configuration.
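
A rough sketch of what I mean, run inside the guest (the mount point
and tmpfs size here are arbitrary, and the guest would of course need
enough memory to hold the test files):

    mkdir -p /mnt/ramtest
    mount -t tmpfs -o size=400m tmpfs /mnt/ramtest
    cd /mnt/ramtest && iozone -O -l32 -i0 -i1 -i2 -e -+n -r4K -s250M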

Hope this helps,

Stefan


* Re: vhost-blk development
  2012-04-11 16:52       ` Michael Baysek
  2012-04-12  8:30         ` Stefan Hajnoczi
@ 2012-04-13  5:38         ` Liu Yuan
  2012-04-19 20:26           ` Michael Baysek
  1 sibling, 1 reply; 10+ messages in thread
From: Liu Yuan @ 2012-04-13  5:38 UTC (permalink / raw)
  To: Michael Baysek; +Cc: Stefan Hajnoczi, kvm

On 04/12/2012 12:52 AM, Michael Baysek wrote:

> In this particular case, I did intend to deploy these instances directly to 
> the ramdisk.  I want to squeeze every drop of performance out of these 
> instances for use cases with lots of concurrent accesses.   I thought it 
> would be possible to achieve improvements an order of magnitude or more 
> over SSD, but it seems not to be the case (so far).  


Last year I worked on virtio-blk over vhost (i.e. vhost-blk), whose
original aim was to move the virtio-blk backend into the kernel to
reduce system call overhead and shorten the code path.

I think that in your particular case (ramdisk), vhost-blk would show
its biggest performance improvement, because the largest time consumer
-- the actual disk I/O -- is taken out of the path.  The gains should
be noticeably better than my last test numbers (+15% throughput and
-10% latency), which were measured against a local disk.

Unfortunately, vhost-blk was not considered useful enough at the time;
the QEMU folks thought it better to optimize the I/O stack in QEMU
instead of setting up another code path for it.

I remember that I developed vhost-blk against a Linux 3.0 base, so I
think it would not be hard to rebase it onto the latest kernel code or
port it back to RHEL 6's modified 2.6.32 kernel.

Thanks,
Yuan


* Re: vhost-blk development
  2012-04-13  5:38         ` Liu Yuan
@ 2012-04-19 20:26           ` Michael Baysek
  2012-04-20  4:28             ` Liu Yuan
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Baysek @ 2012-04-19 20:26 UTC (permalink / raw)
  To: Liu Yuan; +Cc: kvm

Hi Yuan, 

Can you point me to the latest revision of the code and provide some 
guidance on how to test it?  I really would love to see if it helps.

Best,
-Mike



* Re: vhost-blk development
  2012-04-19 20:26           ` Michael Baysek
@ 2012-04-20  4:28             ` Liu Yuan
  2012-04-20 20:23               ` Michael Baysek
  0 siblings, 1 reply; 10+ messages in thread
From: Liu Yuan @ 2012-04-20  4:28 UTC (permalink / raw)
  To: Michael Baysek; +Cc: kvm

On 04/20/2012 04:26 AM, Michael Baysek wrote:

> Can you point me to the latest revision of the code and provide some 
> guidance on how to test it?  I really would love to see if it helps.


There is no newer revision; I didn't continue development once I saw
signs that it wouldn't be accepted.

Thanks,
Yuan


* Re: vhost-blk development
  2012-04-20  4:28             ` Liu Yuan
@ 2012-04-20 20:23               ` Michael Baysek
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Baysek @ 2012-04-20 20:23 UTC (permalink / raw)
  To: Liu Yuan; +Cc: kvm

Ok, so is this it?  

https://lkml.org/lkml/2011/7/28/175

And once it's compiled, does it intercept the virtio-blk requests
automatically?  If not, how is it enabled for a KVM instance?

Best,

Mike



