From: wangyan <wangyan122@huawei.com>
To: <linux-fsdevel@vger.kernel.org>
Cc: "virtio-fs@redhat.com" <virtio-fs@redhat.com>,
	piaojun <piaojun@huawei.com>
Subject: [QUESTION] A performance problem for buffer write compared with 9p
Date: Thu, 15 Aug 2019 08:30:43 +0800	[thread overview]
Message-ID: <5abd7616-5351-761c-0c14-21d511251006@huawei.com> (raw)

Hi all,

I met a performance problem when I tested buffer write compared with 9p.

Guest configuration:
     Kernel: https://github.com/rhvgoyal/linux/tree/virtio-fs-dev-5.1
     2vCPU
     8GB RAM
Host configuration:
     Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
     128GB RAM
     Linux 3.10.0
     Qemu: https://gitlab.com/virtio-fs/qemu/tree/virtio-fs-dev
     EXT4 + ramdisk for shared folder

------------------------------------------------------------------------

For virtiofs:
virtiofsd cmd:
     ./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/share/ -o cache=always -o writeback
mount cmd:
     mount -t virtio_fs myfs /mnt/virtiofs -o rootmode=040000,user_id=0,group_id=0

For 9p:
mount cmd:
     mount -t 9p -o trans=virtio,version=9p2000.L,rw,dirsync,nodev,msize=1000000000,cache=fscache sharedir /mnt/virtiofs/

------------------------------------------------------------------------

Test results for virtiofs compared with 9p:
1. Latency
     Test model:
         fio -filename=/mnt/virtiofs/test -rw=write -bs=4K -size=1G -iodepth=1 \
             -ioengine=psync -numjobs=1 -group_reporting -name=4K -time_based -runtime=30

     virtiofs: avg-lat is 6.37 usec
         4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
         fio-2.13
         Starting 1 process
         Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/471.9MB/0KB /s] [0/121K/0 iops] [eta 00m:00s]
         4K: (groupid=0, jobs=1): err= 0: pid=5558: Fri Aug  9 09:21:13 2019
           write: io=13758MB, bw=469576KB/s, iops=117394, runt= 30001msec
             clat (usec): min=2, max=10316, avg= 5.75, stdev=81.80
              lat (usec): min=3, max=10317, avg= 6.37, stdev=81.80

     9p: avg-lat is 3.94 usec
         4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
         fio-2.13
         Starting 1 process
         Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/634.2MB/0KB /s] [0/162K/0 iops] [eta 00m:00s]
         4K: (groupid=0, jobs=1): err= 0: pid=5873: Fri Aug  9 09:53:46 2019
           write: io=19700MB, bw=672414KB/s, iops=168103, runt= 30001msec
             clat (usec): min=2, max=632, avg= 3.34, stdev= 3.77
              lat (usec): min=2, max=633, avg= 3.94, stdev= 3.82


2. Bandwidth
     Test model:
         fio -filename=/mnt/virtiofs/test -rw=write -bs=1M -size=1G -iodepth=1 \
             -ioengine=psync -numjobs=1 -group_reporting -name=1M -time_based -runtime=30

     virtiofs: bandwidth is 718961KB/s
         1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
         fio-2.13
         Starting 1 process
         Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/753.8MB/0KB /s] [0/753/0 iops] [eta 00m:00s]
         1M: (groupid=0, jobs=1): err= 0: pid=5648: Fri Aug  9 09:24:36 2019
             write: io=21064MB, bw=718961KB/s, iops=702, runt= 30001msec
              clat (usec): min=390, max=11127, avg=1361.41, stdev=1551.50
               lat (usec): min=432, max=11170, avg=1414.72, stdev=1553.28

     9p: bandwidth is 2305.5MB/s
         1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
         fio-2.13
         Starting 1 process
         Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/2406MB/0KB /s] [0/2406/0 iops] [eta 00m:00s]
         1M: (groupid=0, jobs=1): err= 0: pid=5907: Fri Aug  9 09:55:14 2019
           write: io=69166MB, bw=2305.5MB/s, iops=2305, runt= 30001msec
             clat (usec): min=287, max=17678, avg=352.00, stdev=503.43
              lat (usec): min=330, max=17721, avg=402.76, stdev=503.41

9p has a lower latency and higher bandwidth than virtiofs.

------------------------------------------------------------------------ 


I found that the condition 'if (!TestSetPageDirty(page))' in
'__set_page_dirty_nobuffers' is always true, so a lot of time is wasted
marking the inode dirty; no page is still dirty by the time it is written
a second time (see the trimmed sketch after the call stack below).
The buffer write stack:
     fuse_file_write_iter
       ->fuse_cache_write_iter
         ->generic_file_write_iter
           ->__generic_file_write_iter
             ->generic_perform_write
               ->fuse_write_end
                 ->set_page_dirty
                   ->__set_page_dirty_nobuffers
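
For reference, here is a trimmed sketch of the __set_page_dirty_nobuffers()
logic I am referring to (simplified from mm/page-writeback.c, with locking,
memcg handling and error checks trimmed, so not the exact code):

     int __set_page_dirty_nobuffers(struct page *page)
     {
             if (!TestSetPageDirty(page)) {
                     /*
                      * The page was clean: account the newly dirtied page,
                      * tag it in the page cache, and mark the inode dirty.
                      * This is the expensive path that virtiofs takes on
                      * every buffered rewrite in my test.
                      */
                     struct address_space *mapping = page_mapping(page);

                     if (!mapping)
                             return 1;
                     account_page_dirtied(page, mapping);
                     __xa_set_mark(&mapping->i_pages, page_index(page),
                                   PAGECACHE_TAG_DIRTY);
                     if (mapping->host)
                             __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
                     return 1;
             }
             /* Page was already dirty: nothing to do (the cheap path). */
             return 0;
     }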

The reason 'if (!TestSetPageDirty(page))' is always true may be that the
writeback (pdflush) process clears the page's dirty flag in
clear_page_dirty_for_io() and then calls fuse_writepages_send() to flush
all the pages to the host's disk. So when a page is written a second time,
it is never still dirty.
The pdflush stack for fuse:
     pdflush
       ->...
         ->do_writepages
           ->fuse_writepages
             ->write_cache_pages        // clears every page's dirty flag
               ->clear_page_dirty_for_io // clears the page's dirty flag
             ->fuse_writepages_send     // sends all pages to the host,
                                        // but does not wait for the result
Why not wait for the result of writing the pages back to the host before
clearing all of the pages' dirty flags? A trimmed sketch of this path follows.
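
This is roughly the fuse_writepages() flow as I read it on the
virtio-fs-dev-5.1 branch (request setup and error handling trimmed, so not
the exact code):

     static int fuse_writepages(struct address_space *mapping,
                                struct writeback_control *wbc)
     {
             struct fuse_fill_wb_data data;
             int err;

             data.inode = mapping->host;
             data.req = NULL;
             /* ... allocation of data.orig_pages trimmed ... */

             /*
              * write_cache_pages() walks the dirty pages and calls
              * fuse_writepages_fill() for each one; the dirty flag has
              * already been cleared by clear_page_dirty_for_io() inside
              * write_cache_pages(), so every page leaves this loop clean.
              */
             err = write_cache_pages(mapping, wbc, fuse_writepages_fill, &data);

             if (data.req)
                     /*
                      * Queue the collected WRITE requests to the daemon and
                      * return without waiting for the replies.  The next
                      * buffered write to one of these pages therefore finds
                      * it clean and takes the expensive path in
                      * __set_page_dirty_nobuffers() again.
                      */
                     fuse_writepages_send(&data);

             return err;
     }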

As for 9p, pdflush calls clear_page_dirty_for_io() to clear the page's
dirty flag, then calls p9_client_write() to write the page to the host,
waits for the result, and only then moves on to the next page. In this
case a 9p buffered write can hit the dirty page many times before it is
written back to the host by the pdflush process (a trimmed sketch follows
the call stack below).
The pdflush stack for 9p:
     pdflush
       ->...
         ->do_writepages
           ->generic_writepages
             ->write_cache_pages
               ->clear_page_dirty_for_io // clear page's dirty flags
               ->__writepage
                 ->v9fs_vfs_writepage
                   ->v9fs_vfs_writepage_locked
                      ->p9_client_write   // waits for the result of writing
                                          // the page back
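
And roughly the 9p side (simplified from v9fs_vfs_writepage_locked() in
fs/9p/vfs_addr.c, with the bvec/iov_iter setup and error handling trimmed,
so not the exact code):

     static int v9fs_vfs_writepage_locked(struct page *page)
     {
             struct v9fs_inode *v9inode = V9FS_I(page->mapping->host);
             struct iov_iter from;
             int err;

             /* ... build an iov_iter over this single page (trimmed) ... */

             set_page_writeback(page);

             /*
              * p9_client_write() sends a TWRITE to the host and waits for
              * the RWRITE reply before write_cache_pages() moves on to the
              * next page, so writeback proceeds one page at a time and
              * later buffered writes often still hit pages that are dirty
              * and waiting their turn.
              */
             p9_client_write(v9inode->writeback_fid, page_offset(page),
                             &from, &err);

             end_page_writeback(page);
             return err;
     }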


According to these test results, is 9p's handling of page writeback more
reasonable than that of virtiofs?

Thanks,
Yan Wang

