On Tue, May 25, 2021 at 01:05:51PM -0500, Mike Christie wrote:
> Results:
> --------
> When running with the null_blk driver and vhost-scsi I can get 1.2
> million IOPs by just running a simple
>
> fio --filename=/dev/sda --direct=1 --rw=randrw --bs=4k --ioengine=libaio
> --iodepth=128 --numjobs=8 --time_based --group_reporting --name=iops
> --runtime=60 --eta-newline=1
>
> The VM has 8 vCPUs, sda has 8 virtqueues, and we can do a total of
> 1024 cmds per device. To get 1.2 million IOPs I did have to tune and
> run the virsh emulatorpin command so the vhost threads were running
> on different CPUs than the VM. If the vhost threads share CPUs with the
> VM then I get around 800K.
>
> For more realistic devices that are also CPU hogs, like iscsi, I can
> still get 1 million IOPs using 1 dm-multipath device over 8 iscsi paths
> (natively it gets 1.1 million IOPs).

There is no comparison against a baseline, but I guess it would be the
same 8 vCPU guest with single queue vhost-scsi?

> Results/TODO Note:
>
> - I ported the vdpa sim code to support multiple workers, and as-is it
> made perf much worse. If I increase vdpa_sim_blk's num queues to 4-8
> I get 700K IOPs with the fio command above. However with the multiple
> worker support it drops to 400K. The problem is the vdpa_sim lock
> and the iommu_lock. If I hack around them (e.g. comment out the locks
> and ignore data corruption or crashes) then I can get around
> 1.2M - 1.6M IOPs with 8 queues and the fio command above.
>
> So these patches could help other drivers, but it will just take more
> work to remove those types of locks. I was hoping the 2 items could be
> done independently since it helps vhost-scsi immediately.

Cool, thank you for taking a look. That's useful info for Stefano.
vDPA and vhost can be handled independently, though in the long term
hopefully they can share code.
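
For reference, the CPU-pinning tuning described above could be sketched
with virsh roughly as below. This is a hypothetical example, not the exact
commands used for the posted numbers: the domain name ("vm1") and the CPU
ranges are placeholders and depend on the host topology.

```shell
# Show the current vCPU and emulator thread placement for the guest
# (domain name "vm1" is a placeholder).
virsh vcpupin vm1
virsh emulatorpin vm1

# Pin the guest's 8 vCPUs to host CPUs 0-7, one CPU per vCPU.
for i in $(seq 0 7); do
    virsh vcpupin vm1 "$i" "$i"
done

# Pin the emulator threads (which the vhost worker threads follow via
# the domain's cpuset) to a disjoint range, host CPUs 8-15, so the
# vhost threads do not compete with the vCPUs for CPU time.
virsh emulatorpin vm1 8-15
```

The point of the disjoint ranges is the ~1.2M vs ~800K IOPs difference
quoted above: when vhost threads share CPUs with the vCPUs they steal
guest CPU time.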