Message-ID: <552F9A4F.90504@clodo.ru>
Date: Thu, 16 Apr 2015 14:17:35 +0300
From: Konstantin Krotov
Reply-To: kkv@clodo.ru
To: Fam Zheng
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] virtio-blk and virtio-scsi performance comparison
References: <552E1EB4.3030805@clodo.ru> <20150416012735.GB21291@ad.nay.redhat.com>
In-Reply-To: <20150416012735.GB21291@ad.nay.redhat.com>

16.04.2015 04:27, Fam Zheng wrote:
> On Wed, 04/15 11:17, Konstantin Krotov wrote:
>> Hello list!
>>
>> I performed tests with fio and obtained the following results:
>>
>> *** virtio-scsi with cache=none, io=threads; the block device is an md
>> device from an mdadm raid1; random r/w, iodepth=32, from the guest
>> (Debian, kernel 3.16):
>>
>> fio fio1
>> readtest: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
>> fio-2.1.11
>> Starting 1 process
>> Jobs: 1 (f=1): [m(1)] [100.0% done] [126.2MB/125.1MB/0KB /s] [32.3K/32.3K/0 iops] [eta 00m:00s]
>> readtest: (groupid=0, jobs=1): err= 0: pid=707: Wed Apr 8 07:35:01 2015
>>   read : io=5117.4MB, bw=125830KB/s, iops=31457, runt= 41645msec
>>     slat (usec): min=4, max=343, avg=11.45, stdev=10.24
>>     clat (usec): min=104, max=16667, avg=484.09, stdev=121.96
>>      lat (usec): min=112, max=16672, avg=495.90, stdev=123.67
>>     clat percentiles (usec):
>>      |  1.00th=[  302],  5.00th=[  346], 10.00th=[  374], 20.00th=[  406],
>>      | 30.00th=[  426], 40.00th=[  446], 50.00th=[  462], 60.00th=[  482],
>>      | 70.00th=[  506], 80.00th=[  540], 90.00th=[  596], 95.00th=[  732],
>>      | 99.00th=[  948], 99.50th=[  996], 99.90th=[ 1176], 99.95th=[ 1240],
>>      | 99.99th=[ 1384]
>>     bw (KB /s): min=67392, max=135216, per=99.99%, avg=125813.01, stdev=12524.05
>>   write: io=5114.7MB, bw=125763KB/s, iops=31440, runt= 41645msec
>>     slat (usec): min=4, max=388, avg=11.85, stdev=10.47
>>     clat (usec): min=147, max=8968, avg=505.23, stdev=127.40
>>      lat (usec): min=155, max=8973, avg=517.45, stdev=128.97
>>     clat percentiles (usec):
>>      |  1.00th=[  334],  5.00th=[  370], 10.00th=[  394], 20.00th=[  426],
>>      | 30.00th=[  446], 40.00th=[  462], 50.00th=[  478], 60.00th=[  498],
>>      | 70.00th=[  524], 80.00th=[  556], 90.00th=[  628], 95.00th=[  756],
>>      | 99.00th=[  988], 99.50th=[ 1064], 99.90th=[ 1288], 99.95th=[ 1368],
>>      | 99.99th=[ 2224]
>>     bw (KB /s): min=67904, max=136384, per=99.99%, avg=125746.89, stdev=12449.56
>>     lat (usec) : 250=0.05%, 500=64.27%, 750=30.80%, 1000=4.20%
>>     lat (msec) : 2=0.67%, 4=0.01%, 10=0.01%, 20=0.01%
>>   cpu          : usr=18.03%, sys=76.42%, ctx=26617, majf=0, minf=7
>>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>>      issued    : total=r=1310044/w=1309348/d=0, short=r=0/w=0/d=0
>>      latency   : target=0, window=0, percentile=100.00%, depth=32
>>
>> Run status group 0 (all jobs):
>>    READ: io=5117.4MB, aggrb=125829KB/s, minb=125829KB/s, maxb=125829KB/s, mint=41645msec, maxt=41645msec
>>   WRITE: io=5114.7MB, aggrb=125762KB/s, minb=125762KB/s, maxb=125762KB/s, mint=41645msec, maxt=41645msec
>>
>> Disk stats (read/write):
>>   sda: ios=1302885/1302192, merge=55/0, ticks=281040/321660, in_queue=601264, util=99.29%
>>
>>
>> Same guest,
>> *** virtio-blk with cache=none, io=threads; the block device is an md
>> device from an mdadm raid1; random r/w, iodepth=32, from the guest
>> (Debian, kernel 3.16):
>>
>> fio fio1
>> readtest: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
>> fio-2.1.11
>> Starting 1 process
>> Jobs: 1 (f=1): [m(1)] [100.0% done] [123.7MB/123.3MB/0KB /s] [31.7K/31.6K/0 iops] [eta 00m:00s]
>> readtest: (groupid=0, jobs=1): err= 0: pid=810: Wed Apr 8 07:26:37 2015
>>   read : io=5117.4MB, bw=148208KB/s, iops=37051, runt= 35357msec
>>     slat (usec): min=2, max=2513, avg= 7.27, stdev=10.28
>>     clat (usec): min=104, max=10716, avg=382.30, stdev=113.38
>>      lat (usec): min=108, max=10719, avg=389.94, stdev=115.48
>>     clat percentiles (usec):
>>      |  1.00th=[  215],  5.00th=[  249], 10.00th=[  270], 20.00th=[  298],
>>      | 30.00th=[  318], 40.00th=[  338], 50.00th=[  358], 60.00th=[  386],
>>      | 70.00th=[  418], 80.00th=[  462], 90.00th=[  516], 95.00th=[  572],
>>      | 99.00th=[  756], 99.50th=[  820], 99.90th=[  996], 99.95th=[ 1176],
>>      | 99.99th=[ 2256]
>>     bw (KB /s): min=119296, max=165456, per=99.94%, avg=148124.33, stdev=11834.17
>>   write: io=5114.7MB, bw=148129KB/s, iops=37032, runt= 35357msec
>>     slat (usec): min=2, max=2851, avg= 7.55, stdev=10.53
>>     clat (usec): min=172, max=11080, avg=461.92, stdev=137.02
>>      lat (usec): min=178, max=11086, avg=469.86, stdev=138.05
>>     clat percentiles (usec):
>>      |  1.00th=[  278],  5.00th=[  318], 10.00th=[  338], 20.00th=[  366],
>>      | 30.00th=[  390], 40.00th=[  414], 50.00th=[  438], 60.00th=[  466],
>>      | 70.00th=[  494], 80.00th=[  532], 90.00th=[  604], 95.00th=[  716],
>>      | 99.00th=[  900], 99.50th=[  980], 99.90th=[ 1336], 99.95th=[ 1704],
>>      | 99.99th=[ 3408]
>>     bw (KB /s): min=119656, max=166680, per=99.93%, avg=148029.21, stdev=11824.30
>>     lat (usec) : 250=2.71%, 500=77.22%, 750=17.60%, 1000=2.21%
>>     lat (msec) : 2=0.24%, 4=0.02%, 10=0.01%, 20=0.01%
>>   cpu          : usr=27.92%, sys=55.44%, ctx=91283, majf=0, minf=7
>>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
>>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
>>      issued    : total=r=1310044/w=1309348/d=0, short=r=0/w=0/d=0
>>      latency   : target=0, window=0, percentile=100.00%, depth=32
>>
>> Run status group 0 (all jobs):
>>    READ: io=5117.4MB, aggrb=148207KB/s, minb=148207KB/s, maxb=148207KB/s, mint=35357msec, maxt=35357msec
>>   WRITE: io=5114.7MB, aggrb=148128KB/s, minb=148128KB/s, maxb=148128KB/s, mint=35357msec, maxt=35357msec
>>
>> Disk stats (read/write):
>>   vdb: ios=1302512/1301780, merge=0/0, ticks=294828/407184, in_queue=701380, util=99.51%
>>
>> In my tests virtio-scsi shows worse results than virtio-blk.
>> Host kernel 3.19-3; qemu-system-x86_64 -version:
>> QEMU emulator version 2.0.0.
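
(For reference: both runs above used the same job file, invoked as "fio fio1".
A roughly equivalent command-line invocation is sketched below; only the
options visible in the output above (randrw, 4k blocks, libaio, iodepth=32)
are certain, the rest, including the target device, are placeholders rather
than my actual job file.)

  fio --name=readtest --rw=randrw --bs=4k --ioengine=libaio --iodepth=32 \
      --direct=1 --size=10g --filename=/dev/vdb
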
>
> Hi Konstantin,
>
> Thanks for sharing your test result with us!
>
> It is not surprising that virtio-blk performs better in such a test. It has
> a much smaller command set, which results in both a simpler device model and
> probably a simpler guest driver.
>
> virtio-scsi, on the other hand, provides more features and is meant to be
> more scalable (you won't need to painfully mess with PCI bridges to attach
> 1000 disks).
>
> Anyway, we are working on improving virtio-scsi performance, although it's
> theoretically impossible to make it faster or even equally fast.
>
> Regarding your test, I think that with the current code base it generally
> performs better if you use io=native. Have you compared that?
>
> Fam

Thank you for the answer!

In my production system I want to use io=threads, because I export MD raid1
devices to guests and I need reads from the MD device to be balanced between
the raid1 legs (which happens only with io=threads).

--
WBR
Konstantin V. Krotov
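
P.S. For completeness: when running qemu-system-x86_64 directly, the switch
Fam suggests is the aio= option on the -drive definition (with libvirt it is
the io='threads'/io='native' attribute on the disk <driver> element). A
minimal sketch of the two configurations being compared, assuming a raw md
device at /dev/md0 (the path and id= values are only illustrative, not my
exact setup):

  # virtio-blk disk, cache=none; aio= selects the thread pool vs. Linux AIO
  -drive file=/dev/md0,if=none,id=blk0,format=raw,cache=none,aio=threads \
  -device virtio-blk-pci,drive=blk0

  # virtio-scsi: a controller plus a scsi-hd LUN attached to it;
  # aio=native requires cache=none (O_DIRECT), which is already in use here
  -device virtio-scsi-pci,id=scsi0 \
  -drive file=/dev/md0,if=none,id=scsi0d0,format=raw,cache=none,aio=native \
  -device scsi-hd,drive=scsi0d0,bus=scsi0.0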