Hello Group.
Let me describe my setup first.
The storage base is 6x SAS drives in RAID50 on an IBM x3650 M3 with an
LSI ServeRAID M5015 controller (FW Package Build: 12.13.0-0179).
Disk specs:
http://www.cnet.com/products/seagate-savvio-10k-4-600gb-sas-2/specs/
I've created 6 single-drive RAID0 devices from the SAS drives above. The
reason was poor performance of the controller itself at every possible
RAID level. Each virtual drive that is a member of my RAID looks like
this:
Virtual Drive: 3 (Target Id: 3)
Name                 :
RAID Level           : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                 : 557.861 GB
Sector Size          : 512
Parity Size          : 0
State                : Optimal
Strip Size           : 128 KB
Number Of Drives     : 1
Span Depth           : 1
Default Cache Policy : WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Current Cache Policy : WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy    : Enabled
Encryption Type      : None
Is VD Cached         : No
On those 6 RAID0 volumes I've created software RAID (mdadm, Debian 8,
testing): two 3-disk RAID5 arrays, which I then striped together,
resulting in a RAID50 array:
Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : active raid0 md2[1] md1[0]
      2339045376 blocks super 1.2 512k chunks

md2 : active raid5 sdg1[2] sdf1[1] sde1[0]
      1169653760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 1/5 pages [4KB], 65536KB chunk

md1 : active raid5 sdd1[2] sdc1[1] sdb1[0]
      1169653760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 1/5 pages [4KB], 65536KB chunk
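The layout above could be built with commands along these lines (a
sketch reconstructed from the mdstat output; the device names, md
numbering, and chunk sizes are taken from it, but the original commands
weren't posted):

```shell
# Two 3-disk RAID5 arrays with a 128k chunk, matching the mdstat output:
mdadm --create /dev/md1 --level=5 --chunk=128 --raid-devices=3 \
    /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md2 --level=5 --chunk=128 --raid-devices=3 \
    /dev/sde1 /dev/sdf1 /dev/sdg1

# Stripe the two RAID5 arrays together (RAID50) with a 512k chunk:
mdadm --create /dev/md0 --level=0 --chunk=512 --raid-devices=2 \
    /dev/md1 /dev/md2
```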
On that RAID, I've created an ext2 fs:

mkfs.ext2 -b 4096 -E stride=128,stripe-width=512 -vvm1 /dev/mapper/hdd-images -i 4194304
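As a sanity check (a suggestion on my side, not part of the original
setup), the stride and stripe width that mke2fs actually recorded can be
read back from the superblock:

```shell
# Print the RAID-related fields from the ext2 superblock;
# the device path matches the mkfs command above.
dumpe2fs -h /dev/mapper/hdd-images | grep -i -E 'stride|stripe'
```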
Small benchmarks of sequential read and write (20 GiB, with echo 3 >
/proc/sys/vm/drop_caches before every test):
1. Filesystem benchmark: read 380 MB/s, write 200 MB/s
2. LVM volume benchmark: read 409 MB/s (could not do a write test)
3. RAID device test: read 423 MB/s
4. When reading continuously from 4 of the SAS virtual drives with dd,
I was able to hit the bottleneck of the controller (6 Gb/s) easily.
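For reference, each of the numbers above came from a dd pass of roughly
this shape (the target paths are examples; the exact invocations weren't
posted):

```shell
# Drop the page cache so reads actually hit the disks (requires root):
sync
echo 3 > /proc/sys/vm/drop_caches

# 20 GiB sequential read straight from the md device:
dd if=/dev/md0 of=/dev/null bs=1M count=20480

# 20 GiB sequential write through the filesystem, flushed at the end:
dd if=/dev/zero of=/mnt/images/ddtest bs=1M count=20480 conv=fdatasync
```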
I've installed Windows Server 2012, and I've had very big problems
finding an optimal configuration that maximizes total throughput. The
best performance I got was with this configuration:
qemu-system-x86_64 -enable-kvm -name XXXX -S \
  -machine pc-1.1,accel=kvm,usb=off -cpu host -m 16000 \
  -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
  -uuid d0e14081-b4a0-23b5-ae39-110a686b0e55 \
  -no-user-config -nodefaults \
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/acm-server.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=localtime -no-shutdown -boot strict=on \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
  -drive file=/var/lib/libvirt/images/xxx.img,if=none,id=drive-virtio-disk0,format=raw,cache=unsafe \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
  -drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=unsafe \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \
  -drive file=/var/lib/libvirt/images-hdd/storage.img,if=none,id=drive-virtio-disk2,format=raw,cache=unsafe \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk2,id=virtio-disk2 \
  -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f5:b5:b7,bus=pci.0,addr=0x3 \
  -chardev spicevmc,id=charchannel0,name=vdagent \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
  -device usb-tablet,id=input0 \
  -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on \
  -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
  -msg timestamp=on
I was able to get 150 MB/s sequential read inside the VM. Then I
discovered something extraordinary: when I limited the vCPU count to
one instead of four, disk throughput was almost twice as high. Then I
realized why: QEMU creates more than 70 threads, and every one of them
tries to write to disk, which results in:
1. High I/O wait time.
2. Large latency.
3. Poor sequential read/write speeds.
When I limited the number of cores, I guess I limited the number of
threads as well. That's why I got better numbers.
I've tried combining the native and thread AIO settings with the
deadline scheduler. Native AIO was much worse.
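For completeness, these are the -drive fragments I mean (illustrative,
not my full command line; as I understand it, aio=native only makes
sense with an O_DIRECT caching mode such as cache=none, which also
differs from the cache=unsafe I used above):

```shell
# Thread-pool AIO (the default), host page cache bypassed:
-drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=threads

# Linux native AIO; requires cache=none (or directsync) so I/O is O_DIRECT:
-drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native
```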
The final question: is there any way to prevent QEMU from creating such
a large number of threads when the VM performs only one sequential R/W
operation?