* [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs
@ 2017-04-01  5:23 Jaden Liang
  2017-04-01  5:37 ` Fam Zheng
  2017-04-06 10:40 ` Stefan Hajnoczi
  0 siblings, 2 replies; 4+ messages in thread
From: Jaden Liang @ 2017-04-01  5:23 UTC (permalink / raw)
  To: qemu-devel

Hello,

I recently ran qemu with drive files accessed via libnfs and found a
performance problem, along with an idea for improving it.

I started qemu with 6 drive parameters like nfs://127.0.0.1/dir/vm-disk-x.qcow2,
pointing at a local NFS server, then used iometer in the guest machine to test
4K random read and random write IO performance. I found that as the IO depth
goes up, the IOPS hit a bottleneck. Looking into the cause, I found that the
main thread of qemu was using 100% CPU. The perf data shows that the CPU
hotspots are the send / recv calls in libnfs. From reading the source code of
libnfs and of qemu's block/nfs.c driver, libnfs only supports a single worker
thread, and the network events of the nfs driver in qemu are all registered
in the epoll of the main thread. That is why the main thread uses 100% CPU.
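
For reference, the drives were attached roughly like this (illustrative; the
exact paths and the rest of the command line are omitted):

qemu-system-x86_64 ... \
    -drive file=nfs://127.0.0.1/dir/vm-disk-1.qcow2,format=qcow2,if=virtio \
    -drive file=nfs://127.0.0.1/dir/vm-disk-2.qcow2,format=qcow2,if=virtio \
    ...
    -drive file=nfs://127.0.0.1/dir/vm-disk-6.qcow2,format=qcow2,if=virtio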

Based on the analysis above, an improvement idea came up: start a thread for
every drive when libnfs opens the drive file, then create an epoll instance
in each drive thread to handle all of that drive's network events. I finished
a demo modification of block/nfs.c and reran iometer in the guest machine,
and the performance increased a lot. Random read IOPS increased by almost
100%, and random write IOPS increased by about 68%.
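
To make the idea concrete, here is a minimal sketch of the per-drive event
loop. This is not the actual patch: it uses poll() on the single libnfs fd
where the real change uses epoll, and error handling and completion
notification back to the qemu main loop are omitted.

#include <poll.h>
#include <nfsc/libnfs.h>

/* One such thread per drive, started when block/nfs.c opens the nfs://
 * file; each thread services only its own struct nfs_context. */
static void *nfs_drive_thread(void *opaque)
{
    struct nfs_context *nfs = opaque;

    for (;;) {
        struct pollfd pfd = {
            .fd     = nfs_get_fd(nfs),
            .events = nfs_which_events(nfs), /* POLLIN and/or POLLOUT */
        };

        if (poll(&pfd, 1, -1) < 0) {
            break;              /* real code would handle EINTR, errors */
        }
        /* Drive the libnfs state machine: the send/recv work happens
         * here, in the per-drive thread, not in the qemu main loop. */
        if (nfs_service(nfs, pfd.revents) < 0) {
            break;
        }
    }
    return NULL;
}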

Test model details
VM configuration: 6 vdisks in 1 VM
Test tool and parameters: iometer with 4K random read and 4K random write
Backend physical drives: 2 SSDs; the 6 vdisks are spread across the 2 SSDs

Before modification:
IO Depth            1       2       4       8      16      32
4K randread     16659   28387   42932   46868   52108   55760
4K randwrite    12212   19456   30447   30574   35788   39015

After modification:
IO Depth            1       2       4       8      16      32
4K randread     17661   33115   57138   82016   99369  109410
4K randwrite    12669   21492   36017   51532   61475   65577

I can post a patch that meets the coding standard later. For now I would
like to get some advice about this modification. Is this a reasonable way to
improve performance on NFS shares, or is there a better approach?

Any suggestions would be great! Also, please feel free to ask questions.

-- 
Best regards,
Jaden Liang


* Re: [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs
  2017-04-01  5:23 [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs Jaden Liang
@ 2017-04-01  5:37 ` Fam Zheng
  2017-04-01  6:28   ` Jaden Liang
  2017-04-06 10:40 ` Stefan Hajnoczi
  1 sibling, 1 reply; 4+ messages in thread
From: Fam Zheng @ 2017-04-01  5:37 UTC (permalink / raw)
  To: Jaden Liang; +Cc: qemu-devel

On Sat, 04/01 13:23, Jaden Liang wrote:
> Hello,
> 
> I recently ran qemu with drive files accessed via libnfs and found a
> performance problem, along with an idea for improving it.
>
> I started qemu with 6 drive parameters like nfs://127.0.0.1/dir/vm-disk-x.qcow2,
> pointing at a local NFS server, then used iometer in the guest machine to test
> 4K random read and random write IO performance. I found that as the IO depth
> goes up, the IOPS hit a bottleneck. Looking into the cause, I found that the
> main thread of qemu was using 100% CPU. The perf data shows that the CPU
> hotspots are the send / recv calls in libnfs. From reading the source code of
> libnfs and of qemu's block/nfs.c driver, libnfs only supports a single worker
> thread, and the network events of the nfs driver in qemu are all registered
> in the epoll of the main thread. That is why the main thread uses 100% CPU.
>
> Based on the analysis above, an improvement idea came up: start a thread for
> every drive when libnfs opens the drive file, then create an epoll instance
> in each drive thread to handle all of that drive's network events. I finished
> a demo modification of block/nfs.c and reran iometer in the guest machine,
> and the performance increased a lot. Random read IOPS increased by almost
> 100%, and random write IOPS increased by about 68%.
>
> Test model details
> VM configuration: 6 vdisks in 1 VM
> Test tool and parameters: iometer with 4K random read and 4K random write
> Backend physical drives: 2 SSDs; the 6 vdisks are spread across the 2 SSDs
>
> Before modification:
> IO Depth            1       2       4       8      16      32
> 4K randread     16659   28387   42932   46868   52108   55760
> 4K randwrite    12212   19456   30447   30574   35788   39015
>
> After modification:
> IO Depth            1       2       4       8      16      32
> 4K randread     17661   33115   57138   82016   99369  109410
> 4K randwrite    12669   21492   36017   51532   61475   65577
>
> I can post a patch that meets the coding standard later. For now I would
> like to get some advice about this modification. Is this a reasonable way to
> improve performance on NFS shares, or is there a better approach?
>
> Any suggestions would be great! Also, please feel free to ask questions.

Just one comment: in block/file-posix.c (aio=threads), there is a thread pool
that does something similar, using the code in util/thread-pool.c. Maybe it's
usable for your block/nfs.c change too.
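
For illustration only, and assuming the thread-pool API that
block/file-posix.c uses (aio_get_thread_pool() plus thread_pool_submit_co()),
the shape would be roughly the following; NFSRequest and nfs_worker_fn() are
made-up placeholders, not existing block/nfs.c code:

#include "qemu/osdep.h"
#include "qemu/coroutine.h"
#include "block/block_int.h"
#include "block/thread-pool.h"

/* Placeholder request descriptor, defined only for this sketch. */
typedef struct NFSRequest {
    void *client;              /* would be block/nfs.c's NFSClient */
    uint64_t offset;
    uint64_t bytes;
} NFSRequest;

/* Hypothetical blocking worker, run in a pool thread. */
static int nfs_worker_fn(void *opaque)
{
    NFSRequest *req = opaque;

    /* A real worker would do the blocking NFS read/write for
     * req->offset / req->bytes here and return 0 or -errno. */
    (void)req;
    return 0;
}

static int coroutine_fn nfs_co_rw_threaded(BlockDriverState *bs,
                                           NFSRequest *req)
{
    ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));

    /* Runs nfs_worker_fn() in a worker thread and yields this coroutine
     * until it completes, keeping send/recv out of the main loop. */
    return thread_pool_submit_co(pool, nfs_worker_fn, req);
}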

Also a question: have you considered modifying libnfs to create more worker
threads? That way all applications using libnfs can benefit.

Fam


* Re: [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs
  2017-04-01  5:37 ` Fam Zheng
@ 2017-04-01  6:28   ` Jaden Liang
  0 siblings, 0 replies; 4+ messages in thread
From: Jaden Liang @ 2017-04-01  6:28 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-devel

2017-04-01 13:37 GMT+08:00 Fam Zheng <famz@redhat.com>:
> On Sat, 04/01 13:23, Jaden Liang wrote:
>> Hello,
>>
>> I recently ran qemu with drive files accessed via libnfs and found a
>> performance problem, along with an idea for improving it.
>>
>> I started qemu with 6 drive parameters like nfs://127.0.0.1/dir/vm-disk-x.qcow2,
>> pointing at a local NFS server, then used iometer in the guest machine to test
>> 4K random read and random write IO performance. I found that as the IO depth
>> goes up, the IOPS hit a bottleneck. Looking into the cause, I found that the
>> main thread of qemu was using 100% CPU. The perf data shows that the CPU
>> hotspots are the send / recv calls in libnfs. From reading the source code of
>> libnfs and of qemu's block/nfs.c driver, libnfs only supports a single worker
>> thread, and the network events of the nfs driver in qemu are all registered
>> in the epoll of the main thread. That is why the main thread uses 100% CPU.
>>
>> Based on the analysis above, an improvement idea came up: start a thread for
>> every drive when libnfs opens the drive file, then create an epoll instance
>> in each drive thread to handle all of that drive's network events. I finished
>> a demo modification of block/nfs.c and reran iometer in the guest machine,
>> and the performance increased a lot. Random read IOPS increased by almost
>> 100%, and random write IOPS increased by about 68%.
>>
>> Test model details
>> VM configuration: 6 vdisks in 1 VM
>> Test tool and parameters: iometer with 4K random read and 4K random write
>> Backend physical drives: 2 SSDs; the 6 vdisks are spread across the 2 SSDs
>>
>> Before modification:
>> IO Depth            1       2       4       8      16      32
>> 4K randread     16659   28387   42932   46868   52108   55760
>> 4K randwrite    12212   19456   30447   30574   35788   39015
>>
>> After modification:
>> IO Depth            1       2       4       8      16      32
>> 4K randread     17661   33115   57138   82016   99369  109410
>> 4K randwrite    12669   21492   36017   51532   61475   65577
>>
>> I can post a patch that meets the coding standard later. For now I would
>> like to get some advice about this modification. Is this a reasonable way to
>> improve performance on NFS shares, or is there a better approach?
>>
>> Any suggestions would be great! Also, please feel free to ask questions.
>
> Just one comment: in block/file-posix.c (aio=threads), there is a thread pool
> that does something similar, using the code in util/thread-pool.c. Maybe it's
> usable for your block/nfs.c change too.
>
> Also a question: have you considered modifying libnfs to create more worker
> threads? That way all applications using libnfs can benefit.
>
> Fam

Modifying libnfs is also a solution. However, when I looked into libnfs, I
found that it has a completely single-threaded design, and it would be a lot
of work to make it support a multi-threaded mode. That is why I chose to
modify qemu's block/nfs.c instead: there are already similar approaches in
the tree, such as file-posix.c.

-- 
Best regards,
Jaden Liang


* Re: [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs
  2017-04-01  5:23 [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs Jaden Liang
  2017-04-01  5:37 ` Fam Zheng
@ 2017-04-06 10:40 ` Stefan Hajnoczi
  1 sibling, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2017-04-06 10:40 UTC (permalink / raw)
  To: Jaden Liang; +Cc: qemu-devel

On Sat, Apr 01, 2017 at 01:23:46PM +0800, Jaden Liang wrote:
> Hello,
> 
> I recently ran qemu with drive files accessed via libnfs and found a
> performance problem, along with an idea for improving it.
>
> I started qemu with 6 drive parameters like nfs://127.0.0.1/dir/vm-disk-x.qcow2,
> pointing at a local NFS server, then used iometer in the guest machine to test
> 4K random read and random write IO performance. I found that as the IO depth
> goes up, the IOPS hit a bottleneck. Looking into the cause, I found that the
> main thread of qemu was using 100% CPU. The perf data shows that the CPU
> hotspots are the send / recv calls in libnfs. From reading the source code of
> libnfs and of qemu's block/nfs.c driver, libnfs only supports a single worker
> thread, and the network events of the nfs driver in qemu are all registered
> in the epoll of the main thread. That is why the main thread uses 100% CPU.
>
> Based on the analysis above, an improvement idea came up: start a thread for
> every drive when libnfs opens the drive file, then create an epoll instance
> in each drive thread to handle all of that drive's network events. I finished
> a demo modification of block/nfs.c and reran iometer in the guest machine,
> and the performance increased a lot. Random read IOPS increased by almost
> 100%, and random write IOPS increased by about 68%.
>
> Test model details
> VM configuration: 6 vdisks in 1 VM
> Test tool and parameters: iometer with 4K random read and 4K random write
> Backend physical drives: 2 SSDs; the 6 vdisks are spread across the 2 SSDs
>
> Before modification:
> IO Depth            1       2       4       8      16      32
> 4K randread     16659   28387   42932   46868   52108   55760
> 4K randwrite    12212   19456   30447   30574   35788   39015
>
> After modification:
> IO Depth            1       2       4       8      16      32
> 4K randread     17661   33115   57138   82016   99369  109410
> 4K randwrite    12669   21492   36017   51532   61475   65577
>
> I can post a patch that meets the coding standard later. For now I would
> like to get some advice about this modification. Is this a reasonable way to
> improve performance on NFS shares, or is there a better approach?
>
> Any suggestions would be great! Also, please feel free to ask questions.

Did you try using -object iothread,id=iothread1 -device
virtio-blk-pci,iothread=iothread1,... to define IOThreads for each
virtio-blk-pci device?

The block/nfs.c code already supports IOThread, so you can run multiple
threads and avoid using 100% CPU in the main loop.
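
For example (illustrative IDs and paths), one IOThread per disk would look
something like:

qemu-system-x86_64 ... \
    -object iothread,id=iothread1 \
    -drive file=nfs://127.0.0.1/dir/vm-disk-1.qcow2,format=qcow2,if=none,id=drive1 \
    -device virtio-blk-pci,drive=drive1,iothread=iothread1 \
    -object iothread,id=iothread2 \
    -drive file=nfs://127.0.0.1/dir/vm-disk-2.qcow2,format=qcow2,if=none,id=drive2 \
    -device virtio-blk-pci,drive=drive2,iothread=iothread2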

Stefan
