* [QUESTION] A performance problem for buffer write compared with 9p
@ 2019-08-15  0:30 wangyan
  2019-08-15  0:56 ` Gao Xiang
  2019-08-20  9:16 ` [Virtio-fs] " Stefan Hajnoczi
  0 siblings, 2 replies; 14+ messages in thread
From: wangyan @ 2019-08-15  0:30 UTC (permalink / raw)
To: linux-fsdevel; +Cc: virtio-fs, piaojun

Hi all,

I ran into a performance problem when testing buffered writes, compared with 9p.

Guest configuration:
	Kernel: https://github.com/rhvgoyal/linux/tree/virtio-fs-dev-5.1
	2 vCPUs
	8 GB RAM
Host configuration:
	Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
	128 GB RAM
	Linux 3.10.0
	Qemu: https://gitlab.com/virtio-fs/qemu/tree/virtio-fs-dev
	EXT4 + ramdisk for the shared folder

------------------------------------------------------------------------

For virtiofs:
	virtiofsd cmd:
		./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/share/ -o cache=always -o writeback
	mount cmd:
		mount -t virtio_fs myfs /mnt/virtiofs -o rootmode=040000,user_id=0,group_id=0

For 9p:
	mount cmd:
		mount -t 9p -o trans=virtio,version=9p2000.L,rw,dirsync,nodev,msize=1000000000,cache=fscache sharedir /mnt/virtiofs/

------------------------------------------------------------------------

Compared with 9p, the test results:

1.
Latency
	Test model:
		fio -filename=/mnt/virtiofs/test -rw=write -bs=4K -size=1G -iodepth=1 \
			-ioengine=psync -numjobs=1 -group_reporting -name=4K -time_based -runtime=30

	virtiofs: avg-lat is 6.37 usec
		4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/471.9MB/0KB /s] [0/121K/0 iops] [eta 00m:00s]
		4K: (groupid=0, jobs=1): err= 0: pid=5558: Fri Aug  9 09:21:13 2019
		  write: io=13758MB, bw=469576KB/s, iops=117394, runt= 30001msec
		    clat (usec): min=2, max=10316, avg= 5.75, stdev=81.80
		     lat (usec): min=3, max=10317, avg= 6.37, stdev=81.80

	9p: avg-lat is 3.94 usec
		4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/634.2MB/0KB /s] [0/162K/0 iops] [eta 00m:00s]
		4K: (groupid=0, jobs=1): err= 0: pid=5873: Fri Aug  9 09:53:46 2019
		  write: io=19700MB, bw=672414KB/s, iops=168103, runt= 30001msec
		    clat (usec): min=2, max=632, avg= 3.34, stdev= 3.77
		     lat (usec): min=2, max=633, avg= 3.94, stdev= 3.82

2.
Bandwidth
	Test model:
		fio -filename=/mnt/virtiofs/test -rw=write -bs=1M -size=1G -iodepth=1 \
			-ioengine=psync -numjobs=1 -group_reporting -name=1M -time_based -runtime=30

	virtiofs: bandwidth is 718961KB/s
		1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/753.8MB/0KB /s] [0/753/0 iops] [eta 00m:00s]
		1M: (groupid=0, jobs=1): err= 0: pid=5648: Fri Aug  9 09:24:36 2019
		  write: io=21064MB, bw=718961KB/s, iops=702, runt= 30001msec
		    clat (usec): min=390, max=11127, avg=1361.41, stdev=1551.50
		     lat (usec): min=432, max=11170, avg=1414.72, stdev=1553.28

	9p: bandwidth is 2305.5MB/s
		1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/2406MB/0KB /s] [0/2406/0 iops] [eta 00m:00s]
		1M: (groupid=0, jobs=1): err= 0: pid=5907: Fri Aug  9 09:55:14 2019
		  write: io=69166MB, bw=2305.5MB/s, iops=2305, runt= 30001msec
		    clat (usec): min=287, max=17678, avg=352.00, stdev=503.43
		     lat (usec): min=330, max=17721, avg=402.76, stdev=503.41

9p has lower latency and higher bandwidth than virtiofs.

------------------------------------------------------------------------

I found that the test 'if (!TestSetPageDirty(page))' in '__set_page_dirty_nobuffers' is always true: no page is ever still dirty when it is written a second time, so much time is wasted re-marking the inode dirty on every write.
The buffered write stack:
	fuse_file_write_iter
	  ->fuse_cache_write_iter
	    ->generic_file_write_iter
	      ->__generic_file_write_iter
	        ->generic_perform_write
	          ->fuse_write_end
	            ->set_page_dirty
	              ->__set_page_dirty_nobuffers

The reason 'if (!TestSetPageDirty(page))' is always true may be that the pdflush process clears each page's dirty flag in clear_page_dirty_for_io() and then calls fuse_writepages_send() to flush all pages to the host's disk. So when a page is written a second time, it is never dirty.
The pdflush stack for fuse:
	pdflush
	  ->...
	    ->do_writepages
	      ->fuse_writepages
	        ->write_cache_pages          // will clear all pages' dirty flags
	          ->clear_page_dirty_for_io  // clear the page's dirty flag
	          ->fuse_writepages_send     // write all pages to the host, but
	                                     // don't wait for the result

Why not wait for the result of writing the pages back to the host before clearing their dirty flags?

As for 9p, pdflush calls clear_page_dirty_for_io() to clear a page's dirty flag, then calls p9_client_write() to write the page to the host, waits for the result, and only then flushes the next page. In this case, a buffered write under 9p can hit the same dirty page many times before pdflush writes it back to the host.
The pdflush stack for 9p:
	pdflush
	  ->...
	    ->do_writepages
	      ->generic_writepages
	        ->write_cache_pages
	          ->clear_page_dirty_for_io  // clear the page's dirty flag
	          ->__writepage
	            ->v9fs_vfs_writepage
	              ->v9fs_vfs_writepage_locked
	                ->p9_client_write    // it waits for the result of the
	                                     // page writeback

Given these test results, is 9p's handling of page writeback more reasonable than virtiofs's?

Thanks,
Yan Wang

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-15  0:30 [QUESTION] A performance problem for buffer write compared with 9p wangyan
@ 2019-08-15  0:56 ` Gao Xiang
  2019-08-20  9:16 ` [Virtio-fs] " Stefan Hajnoczi
  1 sibling, 0 replies; 14+ messages in thread
From: Gao Xiang @ 2019-08-15  0:56 UTC (permalink / raw)
To: wangyan; +Cc: linux-fsdevel, virtio-fs, piaojun

On Thu, Aug 15, 2019 at 08:30:43AM +0800, wangyan wrote:

[]

>
> I found that the judgement statement 'if (!TestSetPageDirty(page))' always
> true in function '__set_page_dirty_nobuffers', it will waste much time
> to mark inode dirty, no one page is dirty when write it the second time.
> The buffer write stack:
>      fuse_file_write_iter
>        ->fuse_cache_write_iter
>          ->generic_file_write_iter
>            ->__generic_file_write_iter
>              ->generic_perform_write
>                ->fuse_write_end
>                  ->set_page_dirty
>                    ->__set_page_dirty_nobuffers
>
> The reason for 'if (!TestSetPageDirty(page))' always true may be the pdflush
> process will clean the page's dirty flags in clear_page_dirty_for_io(),
> and call fuse_writepages_send() to flush all pages to the disk of the host.
> So when the page is written the second time, it always not dirty.
> The pdflush stack for fuse:
>      pdflush
>        ->...
>          ->do_writepages
>            ->fuse_writepages
>              ->write_cache_pages // will clear all page's dirty flags
>                ->clear_page_dirty_for_io // clear page's dirty flags
>                ->fuse_writepages_send // write all pages to the host, but
>                                          don't wait the result
> Why not wait for getting the result of writing back pages to the host
> before cleaning all page's dirty flags?
>

From my understanding, I personally think there is nothing wrong with the
above process, based on your description.

Thanks,
Gao Xiang

> As for 9p, pdflush will call clear_page_dirty_for_io() to clean the page's
> dirty flags. Then call p9_client_write() to write the page to the host,
> waiting for the result, and then flush the next page. In this case, buffer
> write of 9p will hit the dirty page many times before it is being write
> back to the host by pdflush process.
> The pdflush stack for 9p:
>      pdflush
>        ->...
>          ->do_writepages
>            ->generic_writepages
>              ->write_cache_pages
>                ->clear_page_dirty_for_io // clear page's dirty flags
>                ->__writepage
>                  ->v9fs_vfs_writepage
>                    ->v9fs_vfs_writepage_locked
>                      ->p9_client_write // it will get the writing back
>                                           page's result
>
>
> According to the test result, is the handling method of 9p for page writing
> back more reasonable than virtiofs?
>
> Thanks,
> Yan Wang
>

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p 2019-08-15 0:30 [QUESTION] A performance problem for buffer write compared with 9p wangyan 2019-08-15 0:56 ` Gao Xiang @ 2019-08-20 9:16 ` Stefan Hajnoczi 2019-08-21 7:51 ` Miklos Szeredi 1 sibling, 1 reply; 14+ messages in thread From: Stefan Hajnoczi @ 2019-08-20 9:16 UTC (permalink / raw) To: wangyan; +Cc: linux-fsdevel, virtio-fs, mszeredi [-- Attachment #1: Type: text/plain, Size: 6514 bytes --] On Thu, Aug 15, 2019 at 08:30:43AM +0800, wangyan wrote: > Hi all, > > I met a performance problem when I tested buffer write compared with 9p. CCing Miklos, FUSE maintainer, since this is mostly a FUSE file system writeback question. > > Guest configuration: > Kernel: https://github.com/rhvgoyal/linux/tree/virtio-fs-dev-5.1 > 2vCPU > 8GB RAM > Host configuration: > Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz > 128GB RAM > Linux 3.10.0 > Qemu: https://gitlab.com/virtio-fs/qemu/tree/virtio-fs-dev > EXT4 + ramdisk for shared folder > > ------------------------------------------------------------------------ > > For virtiofs: > virtiofsd cmd: > ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/share/ -o > cache=always -o writeback > mount cmd: > mount -t virtio_fs myfs /mnt/virtiofs -o > rootmode=040000,user_id=0,group_id=0 > > For 9p: > mount cmd: > mount -t 9p -o > trans=virtio,version=9p2000.L,rw,dirsync,nodev,msize=1000000000,cache=fscache > sharedir /mnt/virtiofs/ > > ------------------------------------------------------------------------ > > Compared with 9p, the test result: > 1. 
Latency > Test model: > fio -filename=/mnt/virtiofs/test -rw=write -bs=4K -size=1G > -iodepth=1 \ > -ioengine=psync -numjobs=1 -group_reporting -name=4K -time_based > -runtime=30 > > virtiofs: avg-lat is 6.37 usec > 4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1 > fio-2.13 > Starting 1 process > Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/471.9MB/0KB /s] [0/121K/0 > iops] [eta 00m:00s] > 4K: (groupid=0, jobs=1): err= 0: pid=5558: Fri Aug 9 09:21:13 2019 > write: io=13758MB, bw=469576KB/s, iops=117394, runt= 30001msec > clat (usec): min=2, max=10316, avg= 5.75, stdev=81.80 > lat (usec): min=3, max=10317, avg= 6.37, stdev=81.80 > > 9p: avg-lat is 3.94 usec > 4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1 > fio-2.13 > Starting 1 process > Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/634.2MB/0KB /s] [0/162K/0 > iops] [eta 00m:00s] > 4K: (groupid=0, jobs=1): err= 0: pid=5873: Fri Aug 9 09:53:46 2019 > write: io=19700MB, bw=672414KB/s, iops=168103, runt= 30001msec > clat (usec): min=2, max=632, avg= 3.34, stdev= 3.77 > lat (usec): min=2, max=633, avg= 3.94, stdev= 3.82 > > > 2. 
Bandwidth > Test model: > fio -filename=/mnt/virtiofs/test -rw=write -bs=1M -size=1G > -iodepth=1 \ > -ioengine=psync -numjobs=1 -group_reporting -name=1M -time_based > -runtime=30 > > virtiofs: bandwidth is 718961KB/s > 1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1 > fio-2.13 > Starting 1 process > Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/753.8MB/0KB /s] [0/753/0 > iops] [eta 00m:00s] > 1M: (groupid=0, jobs=1): err= 0: pid=5648: Fri Aug 9 09:24:36 2019 > write: io=21064MB, bw=718961KB/s, iops=702, runt= 30001msec > clat (usec): min=390, max=11127, avg=1361.41, stdev=1551.50 > lat (usec): min=432, max=11170, avg=1414.72, stdev=1553.28 > > 9p: bandwidth is 2305.5MB/s > 1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1 > fio-2.13 > Starting 1 process > Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/2406MB/0KB /s] [0/2406/0 > iops] [eta 00m:00s] > 1M: (groupid=0, jobs=1): err= 0: pid=5907: Fri Aug 9 09:55:14 2019 > write: io=69166MB, bw=2305.5MB/s, iops=2305, runt= 30001msec > clat (usec): min=287, max=17678, avg=352.00, stdev=503.43 > lat (usec): min=330, max=17721, avg=402.76, stdev=503.41 > > 9p has a lower latency and higher bandwidth than virtiofs. > > ------------------------------------------------------------------------ > > > I found that the judgement statement 'if (!TestSetPageDirty(page))' always > true in function '__set_page_dirty_nobuffers', it will waste much time > to mark inode dirty, no one page is dirty when write it the second time. > The buffer write stack: > fuse_file_write_iter > ->fuse_cache_write_iter > ->generic_file_write_iter > ->__generic_file_write_iter > ->generic_perform_write > ->fuse_write_end > ->set_page_dirty > ->__set_page_dirty_nobuffers > > The reason for 'if (!TestSetPageDirty(page))' always true may be the pdflush > process will clean the page's dirty flags in clear_page_dirty_for_io(), > and call fuse_writepages_send() to flush all pages to the disk of the host. 
> So when the page is written the second time, it always not dirty. > The pdflush stack for fuse: > pdflush > ->... > ->do_writepages > ->fuse_writepages > ->write_cache_pages // will clear all page's dirty flags > ->clear_page_dirty_for_io // clear page's dirty flags > ->fuse_writepages_send // write all pages to the host, but > don't wait the result > Why not wait for getting the result of writing back pages to the host > before cleaning all page's dirty flags? > > As for 9p, pdflush will call clear_page_dirty_for_io() to clean the page's > dirty flags. Then call p9_client_write() to write the page to the host, > waiting for the result, and then flush the next page. In this case, buffer > write of 9p will hit the dirty page many times before it is being write > back to the host by pdflush process. > The pdflush stack for 9p: > pdflush > ->... > ->do_writepages > ->generic_writepages > ->write_cache_pages > ->clear_page_dirty_for_io // clear page's dirty flags > ->__writepage > ->v9fs_vfs_writepage > ->v9fs_vfs_writepage_locked > ->p9_client_write // it will get the writing back > page's result > > > According to the test result, is the handling method of 9p for page writing > back more reasonable than virtiofs? > > Thanks, > Yan Wang > > _______________________________________________ > Virtio-fs mailing list > Virtio-fs@redhat.com > https://www.redhat.com/mailman/listinfo/virtio-fs [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-20  9:16 ` [Virtio-fs] " Stefan Hajnoczi
@ 2019-08-21  7:51   ` Miklos Szeredi
  2019-08-21 16:05     ` Stefan Hajnoczi
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2019-08-21  7:51 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: wangyan, linux-fsdevel, virtio-fs, Miklos Szeredi

On Tue, Aug 20, 2019 at 11:16 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Thu, Aug 15, 2019 at 08:30:43AM +0800, wangyan wrote:
> > Hi all,
> >
> > I met a performance problem when I tested buffer write compared with 9p.
>
> CCing Miklos, FUSE maintainer, since this is mostly a FUSE file system
> writeback question.

This is expected.  FUSE contains lots of complexity in the buffered
write path related to preventing DoS caused by the userspace server.

This added complexity, which causes the performance issue, could be
disabled in virtio-fs, since the server lives on a different kernel
than the filesystem.

I'll do a patch.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p 2019-08-21 7:51 ` Miklos Szeredi @ 2019-08-21 16:05 ` Stefan Hajnoczi 2019-08-22 0:59 ` wangyan 0 siblings, 1 reply; 14+ messages in thread From: Stefan Hajnoczi @ 2019-08-21 16:05 UTC (permalink / raw) To: Miklos Szeredi; +Cc: wangyan, linux-fsdevel, virtio-fs, Miklos Szeredi [-- Attachment #1: Type: text/plain, Size: 865 bytes --] On Wed, Aug 21, 2019 at 09:51:20AM +0200, Miklos Szeredi wrote: > On Tue, Aug 20, 2019 at 11:16 AM Stefan Hajnoczi <stefanha@redhat.com> wrote: > > > > On Thu, Aug 15, 2019 at 08:30:43AM +0800, wangyan wrote: > > > Hi all, > > > > > > I met a performance problem when I tested buffer write compared with 9p. > > > > CCing Miklos, FUSE maintainer, since this is mostly a FUSE file system > > writeback question. > > This is expected. FUSE contains lots of complexity in the buffered > write path related to preventing DoS caused by the userspace server. > > This added complexity, which causes the performance issue, could be > disabled in virtio-fs, since the server lives on a different kernel > than the filesystem. > > I'll do a patch.. Great, thanks! Maybe wangyan can try your patch to see how the numbers compare to 9P. Stefan [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p 2019-08-21 16:05 ` Stefan Hajnoczi @ 2019-08-22 0:59 ` wangyan 2019-08-22 11:43 ` Miklos Szeredi 0 siblings, 1 reply; 14+ messages in thread From: wangyan @ 2019-08-22 0:59 UTC (permalink / raw) To: Stefan Hajnoczi, Miklos Szeredi; +Cc: linux-fsdevel, virtio-fs, Miklos Szeredi On 2019/8/22 0:05, Stefan Hajnoczi wrote: > On Wed, Aug 21, 2019 at 09:51:20AM +0200, Miklos Szeredi wrote: >> On Tue, Aug 20, 2019 at 11:16 AM Stefan Hajnoczi <stefanha@redhat.com> wrote: >>> >>> On Thu, Aug 15, 2019 at 08:30:43AM +0800, wangyan wrote: >>>> Hi all, >>>> >>>> I met a performance problem when I tested buffer write compared with 9p. >>> >>> CCing Miklos, FUSE maintainer, since this is mostly a FUSE file system >>> writeback question. >> >> This is expected. FUSE contains lots of complexity in the buffered >> write path related to preventing DoS caused by the userspace server. >> >> This added complexity, which causes the performance issue, could be >> disabled in virtio-fs, since the server lives on a different kernel >> than the filesystem. >> >> I'll do a patch.. > > Great, thanks! Maybe wangyan can try your patch to see how the numbers > compare to 9P. > > Stefan > I will test it when I get the patch, and post the compared result with 9p. Thanks, Yan Wang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-22  0:59         ` wangyan
@ 2019-08-22 11:43           ` Miklos Szeredi
  2019-08-22 12:48             ` wangyan
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2019-08-22 11:43 UTC (permalink / raw)
To: wangyan; +Cc: Stefan Hajnoczi, linux-fsdevel, virtio-fs, Miklos Szeredi

[-- Attachment #1: Type: text/plain, Size: 373 bytes --]

On Thu, Aug 22, 2019 at 2:59 AM wangyan <wangyan122@huawei.com> wrote:

> I will test it when I get the patch, and post the compared result with
> 9p.

Could you please try the attached patch?  My guess is that it should
improve the performance, perhaps by a big margin.

Further improvement is possible by eliminating page copies, but that
is less trivial.

Thanks,
Miklos

[-- Attachment #2: virtio-fs-nostrict.patch --]
[-- Type: text/x-patch, Size: 511 bytes --]

Index: linux/fs/fuse/virtio_fs.c
===================================================================
--- linux.orig/fs/fuse/virtio_fs.c	2019-08-22 13:38:31.782833564 +0200
+++ linux/fs/fuse/virtio_fs.c	2019-08-22 13:37:55.436406261 +0200
@@ -891,6 +891,9 @@ static int virtio_fs_fill_super(struct s
 	if (err < 0)
 		goto err_free_init_req;
 
+	/* No strict accounting needed for virtio-fs */
+	sb->s_bdi->capabilities = 0;
+
 	fc = fs->vqs[VQ_REQUEST].fud->fc;
 
 	/* TODO take fuse_mutex around this loop? */

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-22 11:43           ` Miklos Szeredi
@ 2019-08-22 12:48             ` wangyan
  2019-08-22 13:07               ` Miklos Szeredi
  0 siblings, 1 reply; 14+ messages in thread
From: wangyan @ 2019-08-22 12:48 UTC (permalink / raw)
To: Stefan Hajnoczi, Miklos Szeredi; +Cc: linux-fsdevel, virtio-fs, Miklos Szeredi

On 2019/8/22 19:43, Miklos Szeredi wrote:
> On Thu, Aug 22, 2019 at 2:59 AM wangyan <wangyan122@huawei.com> wrote:
>> I will test it when I get the patch, and post the compared result with
>> 9p.
>
> Could you please try the attached patch?  My guess is that it should
> improve the performance, perhaps by a big margin.
>
> Further improvement is possible by eliminating page copies, but that
> is less trivial.
>
> Thanks,
> Miklos
>
Using the same test model, the results are:

1. Latency
	virtiofs: avg-lat is 15.40 usec, bigger than before (6.64 usec).
		4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/142.4MB/0KB /s] [0/36.5K/0 iops] [eta 00m:00s]
		4K: (groupid=0, jobs=1): err= 0: pid=5528: Thu Aug 22 20:39:07 2019
		  write: io=6633.2MB, bw=226404KB/s, iops=56600, runt= 30001msec
		    clat (usec): min=2, max=40403, avg=14.77, stdev=33.71
		     lat (usec): min=3, max=40404, avg=15.40, stdev=33.74

2. Bandwidth
	virtiofs: bandwidth is 280840KB/s, lower than before (691894KB/s).
		1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [f(1)] [100.0% done] [0KB/29755KB/0KB /s] [0/29/0 iops] [eta 00m:00s]
		1M: (groupid=0, jobs=1): err= 0: pid=5550: Thu Aug 22 20:41:28 2019
		  write: io=8228.0MB, bw=280840KB/s, iops=274, runt= 30001msec
		    clat (usec): min=362, max=11038, avg=3571.33, stdev=1062.72
		     lat (usec): min=411, max=11093, avg=3628.39, stdev=1064.53

According to these results, the patch does not help; it actually makes things worse than before.

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p 2019-08-22 12:48 ` wangyan @ 2019-08-22 13:07 ` Miklos Szeredi 2019-08-22 13:17 ` wangyan 0 siblings, 1 reply; 14+ messages in thread From: Miklos Szeredi @ 2019-08-22 13:07 UTC (permalink / raw) To: wangyan; +Cc: Stefan Hajnoczi, linux-fsdevel, virtio-fs, Miklos Szeredi On Thu, Aug 22, 2019 at 2:48 PM wangyan <wangyan122@huawei.com> wrote: > > On 2019/8/22 19:43, Miklos Szeredi wrote: > > On Thu, Aug 22, 2019 at 2:59 AM wangyan <wangyan122@huawei.com> wrote: > >> I will test it when I get the patch, and post the compared result with > >> 9p. > > > > Could you please try the attached patch? My guess is that it should > > improve the performance, perhaps by a big margin. > > > > Further improvement is possible by eliminating page copies, but that > > is less trivial. > > > > Thanks, > > Miklos > > > Using the same test model. And the test result is: > 1. Latency > virtiofs: avg-lat is 15.40 usec, bigger than before(6.64 usec). > 4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1 > fio-2.13 > Starting 1 process > Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/142.4MB/0KB /s] [0/36.5K/0 > iops] [eta 00m:00s] > 4K: (groupid=0, jobs=1): err= 0: pid=5528: Thu Aug 22 20:39:07 2019 > write: io=6633.2MB, bw=226404KB/s, iops=56600, runt= 30001msec > clat (usec): min=2, max=40403, avg=14.77, stdev=33.71 > lat (usec): min=3, max=40404, avg=15.40, stdev=33.74 > > 2. Bandwidth > virtiofs: bandwidth is 280840KB/s, lower than before(691894KB/s). 
> 1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1 > fio-2.13 > Starting 1 process > Jobs: 1 (f=1): [f(1)] [100.0% done] [0KB/29755KB/0KB /s] [0/29/0 iops] > [eta 00m:00s] > 1M: (groupid=0, jobs=1): err= 0: pid=5550: Thu Aug 22 20:41:28 2019 > write: io=8228.0MB, bw=280840KB/s, iops=274, runt= 30001msec > clat (usec): min=362, max=11038, avg=3571.33, stdev=1062.72 > lat (usec): min=411, max=11093, avg=3628.39, stdev=1064.53 > > According to the result, the patch doesn't work and make it worse than > before. Is server started with "-owriteback"? Thanks, Miklos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p 2019-08-22 13:07 ` Miklos Szeredi @ 2019-08-22 13:17 ` wangyan 2019-08-22 13:29 ` Miklos Szeredi 0 siblings, 1 reply; 14+ messages in thread From: wangyan @ 2019-08-22 13:17 UTC (permalink / raw) To: Miklos Szeredi; +Cc: Stefan Hajnoczi, linux-fsdevel, virtio-fs, Miklos Szeredi On 2019/8/22 21:07, Miklos Szeredi wrote: > On Thu, Aug 22, 2019 at 2:48 PM wangyan <wangyan122@huawei.com> wrote: >> >> On 2019/8/22 19:43, Miklos Szeredi wrote: >>> On Thu, Aug 22, 2019 at 2:59 AM wangyan <wangyan122@huawei.com> wrote: >>>> I will test it when I get the patch, and post the compared result with >>>> 9p. >>> >>> Could you please try the attached patch? My guess is that it should >>> improve the performance, perhaps by a big margin. >>> >>> Further improvement is possible by eliminating page copies, but that >>> is less trivial. >>> >>> Thanks, >>> Miklos >>> >> Using the same test model. And the test result is: >> 1. Latency >> virtiofs: avg-lat is 15.40 usec, bigger than before(6.64 usec). >> 4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1 >> fio-2.13 >> Starting 1 process >> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/142.4MB/0KB /s] [0/36.5K/0 >> iops] [eta 00m:00s] >> 4K: (groupid=0, jobs=1): err= 0: pid=5528: Thu Aug 22 20:39:07 2019 >> write: io=6633.2MB, bw=226404KB/s, iops=56600, runt= 30001msec >> clat (usec): min=2, max=40403, avg=14.77, stdev=33.71 >> lat (usec): min=3, max=40404, avg=15.40, stdev=33.74 >> >> 2. Bandwidth >> virtiofs: bandwidth is 280840KB/s, lower than before(691894KB/s). 
>> 1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1 >> fio-2.13 >> Starting 1 process >> Jobs: 1 (f=1): [f(1)] [100.0% done] [0KB/29755KB/0KB /s] [0/29/0 iops] >> [eta 00m:00s] >> 1M: (groupid=0, jobs=1): err= 0: pid=5550: Thu Aug 22 20:41:28 2019 >> write: io=8228.0MB, bw=280840KB/s, iops=274, runt= 30001msec >> clat (usec): min=362, max=11038, avg=3571.33, stdev=1062.72 >> lat (usec): min=411, max=11093, avg=3628.39, stdev=1064.53 >> >> According to the result, the patch doesn't work and make it worse than >> before. > > Is server started with "-owriteback"? > > Thanks, > Miklos > > . > I used these commands: virtiofsd cmd: ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/share/ -o cache=always -o writeback mount cmd: mount -t virtio_fs myfs /mnt/virtiofs -o rootmode=040000,user_id=0,group_id=0 Thanks, Yan Wang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-22 13:17               ` wangyan
@ 2019-08-22 13:29                 ` Miklos Szeredi
  2019-08-22 14:02                   ` Miklos Szeredi
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2019-08-22 13:29 UTC (permalink / raw)
To: wangyan; +Cc: Miklos Szeredi, Stefan Hajnoczi, linux-fsdevel, virtio-fs

[-- Attachment #1: Type: text/plain, Size: 417 bytes --]

On Thu, Aug 22, 2019 at 3:18 PM wangyan <wangyan122@huawei.com> wrote:

> I used these commands:
>      virtiofsd cmd:
>          ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/share/
> -o cache=always -o writeback
>      mount cmd:
>          mount -t virtio_fs myfs /mnt/virtiofs -o
> rootmode=040000,user_id=0,group_id=0

Good.

I think I got it now, updated patch attached.

Thanks for your patience!

Miklos

[-- Attachment #2: virtio-fs-nostrict-v2.patch --]
[-- Type: text/x-patch, Size: 434 bytes --]

---
 fs/fuse/virtio_fs.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -891,6 +891,10 @@ static int virtio_fs_fill_super(struct s
 	if (err < 0)
 		goto err_free_init_req;
 
+	/* No strict accounting needed for virtio-fs */
+	sb->s_bdi->capabilities = 0;
+	bdi_set_max_ratio(sb->s_bdi, 100);
+
 	fc = fs->vqs[VQ_REQUEST].fud->fc;
 
 	/* TODO take fuse_mutex around this loop? */

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-22 13:29                 ` Miklos Szeredi
@ 2019-08-22 14:02                   ` Miklos Szeredi
  [not found]                         ` <fd7a2791-d95c-3bd9-e387-b8778a9eca83@huawei.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2019-08-22 14:02 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: wangyan, Stefan Hajnoczi, linux-fsdevel, virtio-fs

[-- Attachment #1: Type: text/plain, Size: 585 bytes --]

On Thu, Aug 22, 2019 at 3:30 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> On Thu, Aug 22, 2019 at 3:18 PM wangyan <wangyan122@huawei.com> wrote:
>
> > I used these commands:
> >      virtiofsd cmd:
> >          ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/share/
> > -o cache=always -o writeback
> >      mount cmd:
> >          mount -t virtio_fs myfs /mnt/virtiofs -o
> > rootmode=040000,user_id=0,group_id=0
>
> Good.
>
> I think I got it now, updated patch attached.
>
> Thanks for your patience!
>
> Miklos

Previous one was broken as well.  I hope this one works...

[-- Attachment #2: virtio-fs-nostrict-v3.patch --]
[-- Type: text/x-patch, Size: 451 bytes --]

---
 fs/fuse/virtio_fs.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -891,6 +891,10 @@ static int virtio_fs_fill_super(struct s
 	if (err < 0)
 		goto err_free_init_req;
 
+	/* No strict accounting needed for virtio-fs */
+	sb->s_bdi->capabilities = BDI_CAP_NO_ACCT_WB;
+	bdi_set_max_ratio(sb->s_bdi, 100);
+
 	fc = fs->vqs[VQ_REQUEST].fud->fc;
 
 	/* TODO take fuse_mutex around this loop? */

^ permalink raw reply	[flat|nested] 14+ messages in thread
[parent not found: <fd7a2791-d95c-3bd9-e387-b8778a9eca83@huawei.com>]
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  [not found] ` <fd7a2791-d95c-3bd9-e387-b8778a9eca83@huawei.com>
@ 2019-08-26 12:39 ` Miklos Szeredi
  2019-08-28  0:57   ` wangyan
  0 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2019-08-26 12:39 UTC (permalink / raw)
  To: wangyan; +Cc: Miklos Szeredi, Stefan Hajnoczi, linux-fsdevel, virtio-fs

[-- Attachment #1: Type: text/plain, Size: 884 bytes --]

On Sat, Aug 24, 2019 at 10:44 AM wangyan <wangyan122@huawei.com> wrote:

> According to the result, for "-size=1G" it may exceed the dirty pages'
> upper limit, frequently triggering pdflush for write-back. And for
> "-size=700M" it probably stayed under the dirty pages' upper limit, so no
> extra pdflush was triggered.
>
> But for 9p using "-size=1G", the latency is 3.94 usec and the bandwidth is
> 2305.5MB/s. That is better than virtiofs using "-size=1G". It seems that
> 9p is not affected by the dirty pages' upper limit.

I tried to reproduce these results, but failed to get decent
(>100MB/s) performance out of 9p. I don't have fscache set up; does
that play a part in getting high performance cached writes?

What you describe makes sense, and I have a new patch (attached), but
didn't see drastic improvement in performance of virtio-fs in my tests.

Thanks,
Miklos

[-- Attachment #2: virtio-fs-nostrict-v4.patch --]
[-- Type: text/x-patch, Size: 7456 bytes --]

---
 fs/fuse/file.c      | 111 +++++++++++++++++++++++++++++++++++++---------------
 fs/fuse/fuse_i.h    |   3 +
 fs/fuse/inode.c     |   3 +
 fs/fuse/virtio_fs.c |   4 +
 4 files changed, 90 insertions(+), 31 deletions(-)

--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -891,6 +891,10 @@ static int virtio_fs_fill_super(struct s
 	if (err < 0)
 		goto err_free_init_req;
 
+	/* No strict accounting needed for virtio-fs */
+	sb->s_bdi->capabilities = 0;
+	bdi_set_max_ratio(sb->s_bdi, 100);
+
 	fc = fs->vqs[VQ_REQUEST].fud->fc;
 
 	/* TODO take fuse_mutex around this loop? */
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -695,6 +695,9 @@ struct fuse_conn {
 	/** cache READLINK responses in page cache */
 	unsigned cache_symlinks:1;
 
+	/** use temp pages for writeback */
+	unsigned writeback_tmp:1;
+
 	/*
 	 * The following bitfields are only for optimization purposes
 	 * and hence races in setting them will not cause malfunction
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1244,6 +1244,9 @@ static int fuse_fill_super(struct super_
 	err = fuse_fill_super_common(sb, &d);
 	if (err < 0)
 		goto err_free_init_req;
+
+	get_fuse_conn_super(sb)->writeback_tmp = 1;
+
 	/*
 	 * atomic_dec_and_test() in fput() provides the necessary
 	 * memory barrier for file->private_data to be visible on all
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -363,11 +363,16 @@ static bool fuse_range_is_writeback(stru
 				    pgoff_t idx_to)
 {
 	struct fuse_inode *fi = get_fuse_inode(inode);
-	bool found;
+	struct fuse_conn *fc = get_fuse_conn(inode);
+	bool found = false;
 
-	spin_lock(&fi->lock);
-	found = fuse_find_writeback(fi, idx_from, idx_to);
-	spin_unlock(&fi->lock);
+	if (fc->writeback_tmp) {
+		spin_lock(&fi->lock);
+		found = fuse_find_writeback(fi, idx_from, idx_to);
+		spin_unlock(&fi->lock);
+	} else {
+		WARN_ON(!list_empty(&fi->writepages));
+	}
 
 	return found;
 }
@@ -1514,7 +1519,7 @@ static void fuse_writepage_free(struct f
 	int i;
 
 	for (i = 0; i < req->num_pages; i++)
-		__free_page(req->pages[i]);
+		put_page(req->pages[i]);
 	if (req->ff)
 		fuse_file_put(req->ff, false, false);
 
@@ -1527,11 +1532,19 @@ static void fuse_writepage_finish(struct
 	struct backing_dev_info *bdi = inode_to_bdi(inode);
 	int i;
 
-	list_del(&req->writepages_entry);
+	if (fc->writeback_tmp)
+		list_del(&req->writepages_entry);
+	else
+		WARN_ON(!list_empty(&req->writepages_entry));
+
 	for (i = 0; i < req->num_pages; i++) {
-		dec_wb_stat(&bdi->wb, WB_WRITEBACK);
-		dec_node_page_state(req->pages[i], NR_WRITEBACK_TEMP);
-		wb_writeout_inc(&bdi->wb);
+		if (fc->writeback_tmp) {
+			dec_wb_stat(&bdi->wb, WB_WRITEBACK);
+			dec_node_page_state(req->pages[i], NR_WRITEBACK_TEMP);
+			wb_writeout_inc(&bdi->wb);
+		} else {
+			end_page_writeback(req->pages[i]);
+		}
 	}
 	wake_up(&fi->page_waitq);
 }
@@ -1616,6 +1629,10 @@ static void fuse_writepage_end(struct fu
 		struct fuse_conn *fc = get_fuse_conn(inode);
 		struct fuse_write_in *inarg = &req->misc.write.in;
 		struct fuse_req *next = req->misc.write.next;
+
+		if (WARN_ON(!fc->writeback_tmp))
+			break;
+
 		req->misc.write.next = next->misc.write.next;
 		next->misc.write.next = NULL;
 		next->ff = fuse_file_get(req->ff);
@@ -1709,9 +1726,16 @@ static int fuse_writepage_locked(struct 
 	/* writeback always goes to bg_queue */
 	__set_bit(FR_BACKGROUND, &req->flags);
-	tmp_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
-	if (!tmp_page)
-		goto err_free;
+
+	if (fc->writeback_tmp) {
+		tmp_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+		if (!tmp_page)
+			goto err_free;
+	} else {
+		tmp_page = page;
+		get_page(tmp_page);
+	}
 
 	error = -EIO;
 	req->ff = fuse_write_file_get(fc, fi);
@@ -1720,7 +1744,8 @@ static int fuse_writepage_locked(struct 
 	fuse_write_fill(req, req->ff, page_offset(page), 0);
 
-	copy_highpage(tmp_page, page);
+	if (fc->writeback_tmp)
+		copy_highpage(tmp_page, page);
 	req->misc.write.in.write_flags |= FUSE_WRITE_CACHE;
 	req->misc.write.next = NULL;
 	req->in.argpages = 1;
@@ -1731,21 +1756,27 @@ static int fuse_writepage_locked(struct 
 	req->end = fuse_writepage_end;
 	req->inode = inode;
 
-	inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
-	inc_node_page_state(tmp_page, NR_WRITEBACK_TEMP);
+	if (fc->writeback_tmp) {
+		inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
+		inc_node_page_state(tmp_page, NR_WRITEBACK_TEMP);
+	}
 
 	spin_lock(&fi->lock);
-	list_add(&req->writepages_entry, &fi->writepages);
+	if (fc->writeback_tmp)
+		list_add(&req->writepages_entry, &fi->writepages);
+	else
+		INIT_LIST_HEAD(&req->writepages_entry);
 	list_add_tail(&req->list, &fi->queued_writes);
 	fuse_flush_writepages(inode);
 	spin_unlock(&fi->lock);
 
-	end_page_writeback(page);
+	if (fc->writeback_tmp)
+		end_page_writeback(page);
 
 	return 0;
 
 err_nofile:
-	__free_page(tmp_page);
+	put_page(tmp_page);
 err_free:
 	fuse_request_free(req);
 err:
@@ -1788,6 +1819,7 @@ static void fuse_writepages_send(struct 
 	struct fuse_req *req = data->req;
 	struct inode *inode = data->inode;
 	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct fuse_conn *fc = get_fuse_conn(inode);
 	int num_pages = req->num_pages;
 	int i;
 
@@ -1797,8 +1829,10 @@ static void fuse_writepages_send(struct 
 	fuse_flush_writepages(inode);
 	spin_unlock(&fi->lock);
 
-	for (i = 0; i < num_pages; i++)
-		end_page_writeback(data->orig_pages[i]);
+	if (fc->writeback_tmp) {
+		for (i = 0; i < num_pages; i++)
+			end_page_writeback(data->orig_pages[i]);
+	}
 }
 
 /*
@@ -1816,6 +1850,9 @@ static bool fuse_writepage_in_flight(str
 	struct fuse_req *tmp;
 	struct fuse_req *old_req;
 
+	if (WARN_ON(!fc->writeback_tmp))
+		return false;
+
 	WARN_ON(new_req->num_pages != 0);
 
 	spin_lock(&fi->lock);
@@ -1901,10 +1938,15 @@ static int fuse_writepages_fill(struct p
 		}
 	}
 
-	err = -ENOMEM;
-	tmp_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
-	if (!tmp_page)
-		goto out_unlock;
+	if (fc->writeback_tmp) {
+		err = -ENOMEM;
+		tmp_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+		if (!tmp_page)
+			goto out_unlock;
+	} else {
+		tmp_page = page;
+		get_page(tmp_page);
+	}
 
 	/*
 	 * The page must not be redirtied until the writeout is completed
@@ -1925,7 +1967,7 @@ static int fuse_writepages_fill(struct p
 		err = -ENOMEM;
 		req = fuse_request_alloc_nofs(FUSE_REQ_INLINE_PAGES);
 		if (!req) {
-			__free_page(tmp_page);
+			put_page(tmp_page);
 			goto out_unlock;
 		}
 
@@ -1938,21 +1980,28 @@ static int fuse_writepages_fill(struct p
 		req->end = fuse_writepage_end;
 		req->inode = inode;
 
-		spin_lock(&fi->lock);
-		list_add(&req->writepages_entry, &fi->writepages);
-		spin_unlock(&fi->lock);
+		if (fc->writeback_tmp) {
+			spin_lock(&fi->lock);
+			list_add(&req->writepages_entry, &fi->writepages);
+			spin_unlock(&fi->lock);
+		} else {
+			INIT_LIST_HEAD(&req->writepages_entry);
+		}
 
 		data->req = req;
 	}
 	set_page_writeback(page);
 
-	copy_highpage(tmp_page, page);
+	if (fc->writeback_tmp)
+		copy_highpage(tmp_page, page);
 	req->pages[req->num_pages] = tmp_page;
 	req->page_descs[req->num_pages].offset = 0;
 	req->page_descs[req->num_pages].length = PAGE_SIZE;
 
-	inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
-	inc_node_page_state(tmp_page, NR_WRITEBACK_TEMP);
+	if (fc->writeback_tmp) {
+		inc_wb_stat(&inode_to_bdi(inode)->wb, WB_WRITEBACK);
+		inc_node_page_state(tmp_page, NR_WRITEBACK_TEMP);
+	}
 
 	err = 0;
 	if (is_writeback && fuse_writepage_in_flight(req, page)) {

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Virtio-fs] [QUESTION] A performance problem for buffer write compared with 9p
  2019-08-26 12:39 ` Miklos Szeredi
@ 2019-08-28  0:57   ` wangyan
  0 siblings, 0 replies; 14+ messages in thread
From: wangyan @ 2019-08-28 0:57 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, Stefan Hajnoczi, linux-fsdevel, virtio-fs, piaojun

On 2019/8/26 20:39, Miklos Szeredi wrote:
> On Sat, Aug 24, 2019 at 10:44 AM wangyan <wangyan122@huawei.com> wrote:
>
>> According to the result, for "-size=1G" it may exceed the dirty pages'
>> upper limit, frequently triggering pdflush for write-back. And for
>> "-size=700M" it probably stayed under the dirty pages' upper limit, so no
>> extra pdflush was triggered.
>>
>> But for 9p using "-size=1G", the latency is 3.94 usec and the bandwidth is
>> 2305.5MB/s. That is better than virtiofs using "-size=1G". It seems that
>> 9p is not affected by the dirty pages' upper limit.
>
> I tried to reproduce these results, but failed to get decent
> (>100MB/s) performance out of 9p. I don't have fscache set up; does
> that play a part in getting high performance cached writes?

Yes, you should enable fscache. My mount command is:
  mount -t 9p -o trans=virtio,version=9p2000.L,rw,dirsync,nodev,msize=1000000000,cache=fscache sharedir /mnt/virtiofs/

Thanks,
Yan Wang

> What you describe makes sense, and I have a new patch (attached), but
> didn't see drastic improvement in performance of virtio-fs in my
> tests.
>
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread
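The dirty-limit hypothesis discussed above can be checked directly in the guest. The sketch below (not from the thread) computes the background write-back threshold from the guest's own sysctls; on the reported 8GB guest, if the result falls between 700MB and 1024MB, the `-size=1G` run would cross it while the `-size=700M` run would not, matching the observed latency difference.

```shell
# Compute the background dirty threshold from standard Linux procfs files.
ratio=$(cat /proc/sys/vm/dirty_background_ratio)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

# threshold (MB) = MemTotal * dirty_background_ratio / 100
thresh_mb=$(( mem_kb * ratio / 100 / 1024 ))
echo "background dirty threshold: ${thresh_mb} MB"
```

(When `vm.dirty_background_bytes` is set non-zero, it overrides the ratio and should be read instead.)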
end of thread, other threads: [~2019-08-28 0:57 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-15  0:30 [QUESTION] A performance problem for buffer write compared with 9p wangyan
2019-08-15  0:56 ` Gao Xiang
2019-08-20  9:16 ` [Virtio-fs] " Stefan Hajnoczi
2019-08-21  7:51 ` Miklos Szeredi
2019-08-21 16:05 ` Stefan Hajnoczi
2019-08-22  0:59 ` wangyan
2019-08-22 11:43 ` Miklos Szeredi
2019-08-22 12:48 ` wangyan
2019-08-22 13:07 ` Miklos Szeredi
2019-08-22 13:17 ` wangyan
2019-08-22 13:29 ` Miklos Szeredi
2019-08-22 14:02 ` Miklos Szeredi
     [not found] ` <fd7a2791-d95c-3bd9-e387-b8778a9eca83@huawei.com>
2019-08-26 12:39 ` Miklos Szeredi
2019-08-28  0:57 ` wangyan