From: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org, dm-devel@redhat.com,
	jens.axboe@oracle.com, nauman@google.com, dpshah@google.com,
	ryov@valinux.co.jp, balbir@linux.vnet.ibm.com,
	righi.andrea@gmail.com, lizf@cn.fujitsu.com, mikew@google.com,
	fchecconi@gmail.com, paolo.valente@unimore.it,
	fernando@oss.ntt.co.jp, s-uchida@ap.jp.nec.com,
	taka@valinux.co.jp, jmoyer@redhat.com, dhaval@linux.vnet.ibm.com,
	m-ikeda@ds.jp.nec.com, agk@redhat.com, akpm@linux-foundation.org,
	peterz@infradead.org
Subject: Re: [RFC] IO scheduler based IO controller V7
Date: Fri, 31 Jul 2009 13:21:51 +0800	[thread overview]
Message-ID: <4A727F6F.9010005@cn.fujitsu.com> (raw)
In-Reply-To: <1248467274-32073-1-git-send-email-vgoyal@redhat.com>

Hi Vivek,

Here are some fio test results for normal and random reads and writes (plus direct IO) with IO Controller V7.
Tested with "fairness == 0". Performance seems to have improved compared with V6.

Mode         Normal read   |   Random read   |   Normal write   |   Random write  |  Direct read  |  Direct Write

2.6.31-rc1   71,613KiB/s       3,606KiB/s        66,250KiB/s        9,420KiB/s       51,535KiB/s     55,752KiB/s

V7           70,540KiB/s       3,551KiB/s        64,548KiB/s        9,677KiB/s       53,530KiB/s     54,145KiB/s

Performance  -1.5%             -1.5%             -2.6%              +2.7%            +3.9%           -2.9%
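
For reference, the six test modes correspond to fio jobs of roughly the following
shape (a sketch only; the block size, file size and target directory here are
illustrative assumptions, not the exact parameters used for the table above):

------------------------------------------------
# "Normal read" column: sequential buffered read
fio --name=seq-read --rw=read --bs=4k --size=1G \
    --directory=/mnt/$BLOCKDEV/fio/

# "Random read" column: random buffered read
fio --name=rand-read --rw=randread --bs=4k --size=1G \
    --directory=/mnt/$BLOCKDEV/fio/

# "Direct Write" column: sequential write with O_DIRECT
fio --name=direct-write --rw=write --direct=1 --bs=4k --size=1G \
    --directory=/mnt/$BLOCKDEV/fio/
------------------------------------------------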


Vivek Goyal wrote:
> Hi All,
> 
> Here is the V7 of the IO controller patches generated on top of 2.6.31-rc4.
> 
> For ease of patching, a consolidated patch is available here.
> 
> http://people.redhat.com/~vgoyal/io-controller/io-scheduler-based-io-controller-v7.patch
> 
> Previous versions of the patches were posted here.
> 
> (V1) http://lkml.org/lkml/2009/3/11/486
> (V2) http://lkml.org/lkml/2009/5/5/275
> (V3) http://lkml.org/lkml/2009/5/26/472
> (V4) http://lkml.org/lkml/2009/6/8/580
> (V5) http://lkml.org/lkml/2009/6/19/279
> (V6) http://lkml.org/lkml/2009/7/2/369
> 
> Changes from V6
> ===============
> - Introduced the notion of group_idling, where we idle for the next request
>   to come from the same group before we expire it. It is along the lines of
>   cfq's slice_idle and is meant to provide fairness. Switching to group
>   idling helps because we no longer have to rely on whether queue idling was
>   turned on by CFQ, which became too much of a debugging pain across
>   different workloads and kinds of storage media. The introduction of
>   group_idle should help.
> 
> - Moved some code (dynamic queue idling updates, arming the queue idling
>   timer, keeping track of average think time, etc.) back to CFQ. With group
>   idling we don't need it there any more, which reduces the amount of change.
> 
> - Enabled cfq's close cooperator functionality for groups. So far this worked
>   only in the root group; now it should work in non-root groups as well.
> 
> - Got rid of the patch that calculated disk time based on average disk rate
>   in some circumstances. It was giving bad numbers in early queue deletion
>   cases and did not seem to help much. Removed it for the time being.
>  
> - Added an experimental patch to map sync requests using bio tracking info
>   rather than task context. This applies only to noop, deadline and AS.
> 
> - Got rid of the experimental patch that idled on async queues; it did not
>   seem to help.
> 
> - Got rid of the wait_busy and wait_busy_done logic for queues and
>   implemented it for groups instead.
> 
> - Introduced oom_ioq to accommodate the recent oom_cfqq change.
> 
> - Broke up elv_init_ioq() into smaller functions; it had seven arguments and
>   looked complicated.
> 
> - Fixed a bug in blk_queue_io_group_congested(). Thanks to Munehiro Ikeda.
> 
> - Merged Gui's patch to fix the cgroup file format issue.
> 
> - Merged Gui's patch to update the per-group congestion limit when
>   q->nr_group_requests is updated.
> 
> - Fixed a bug where close cooperation would not work if we wait for all the
>   requests from the previous queue to finish.
> 
> - Fixed group deletion accounting, where deletions from the idle tree were
>   also appearing in the log.
> 
> - Got rid of busy_rt_queues infrastructure.
> 
> - Got rid of elv_ioq_request_dispatched(), a helper function that just
>   incremented a variable.
>   
> Limitations
> ===========
> 
> - This IO controller provides bandwidth control at the IO scheduler level
>   (the leaf node in a stacked hierarchy of logical devices). So there can be
>   cases (depending on configuration) where an application does not see
>   proportional BW division at a higher-level logical device.
> 
>   LWN has written an article about the issue here.
> 
> 	http://lwn.net/Articles/332839/
> 
> How to solve the issue of fairness at higher level logical devices
> ==================================================================
> (Do we really need it? That's not where the contention for resources is.)
> 
> A couple of suggestions have come forward.
> 
> - Implement IO control at the IO scheduler layer and then, with the help of
>   a daemon, adjust the weights on underlying devices dynamically, depending
>   on what kind of BW guarantees are to be achieved at higher-level logical
>   block devices.
> 
> - Also implement a higher-level IO controller along with the IO scheduler
>   based controller and let the user choose one depending on their needs.
> 
>   A higher-level controller does not know about the assumptions/policies of
>   the underlying IO scheduler, hence it has the potential to break the IO
>   scheduler's policy within a cgroup. A lower-level controller can work with
>   the IO scheduler much more closely and efficiently.
>  
> Other active IO controller developments
> =======================================
> 
> IO throttling
> -------------
> 
>   This is a max-bandwidth controller, not a proportional one. Secondly, it
>   is a second-level controller which can break the IO scheduler's
>   policy/assumptions within a cgroup.
> 
> dm-ioband
> ---------
> 
>  This is a proportional bandwidth controller implemented as a device mapper
>  driver. It is also a second-level controller which can break the IO
>  scheduler's policy/assumptions within a cgroup.
> 
> TODO
> ====
> - code cleanups, testing, bug fixing, optimizations, benchmarking etc...
> 
> Testing
> =======
> 
> I have been able to do some testing as follows. All my testing is with an
> ext3 file system on a SATA drive which supports a queue depth of 31.
> 
> Test1 (Isolation between two KVM virtual machines)
> ==================================================
> Created two KVM virtual machines. Partitioned a disk on the host into two
> partitions and gave one partition to each virtual machine. Put the two
> virtual machines in two different cgroups with weights 1000 and 500
> respectively. The virtual machines created ext3 file systems on the
> partitions exported from the host and did buffered writes. The host sees the
> writes as synchronous, and the virtual machine with the higher weight gets
> double the disk time of the virtual machine with the lower weight. Used the
> deadline scheduler in this test case.
> 
> Some more details about the configuration are in the documentation patch.
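> 
> For illustration, the cgroup setup for the two virtual machines can be
> sketched roughly as follows. This is only a sketch: it assumes the
> controller's cgroup hierarchy is already mounted at /cgroup/bfqio (as in the
> scripts later in this mail), and the vm1/vm2 group names and the $VM1_PID,
> $VM2_PID and $BLOCKDEV variables are hypothetical.
> 
> -------------------------------------------------
> mkdir /cgroup/bfqio/vm1 /cgroup/bfqio/vm2
> echo 1000 > /cgroup/bfqio/vm1/io.weight
> echo 500  > /cgroup/bfqio/vm2/io.weight
> # move each VM's qemu-kvm process into its group
> echo $VM1_PID > /cgroup/bfqio/vm1/tasks
> echo $VM2_PID > /cgroup/bfqio/vm2/tasks
> # deadline was the host-side scheduler in this test
> echo deadline > /sys/block/$BLOCKDEV/queue/scheduler
> -------------------------------------------------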
> 
> Test2 (Fairness for synchronous reads)
> ======================================
> - Two dd threads in two cgroups with weights 1000 and 500. Ran the two "dd"
>   commands in those cgroups (with the CFQ scheduler and
>   /sys/block/<device>/queue/fairness = 1).
> 
>   The higher weight dd finishes first, and at that point my script reads the
>   cgroup files io.disk_time and io.disk_sectors for both groups and displays
>   the results.
> 
>   dd if=/mnt/$BLOCKDEV/zerofile1 of=/dev/null &
>   dd if=/mnt/$BLOCKDEV/zerofile2 of=/dev/null &
> 
>   234179072 bytes (234 MB) copied, 3.9065 s, 59.9 MB/s
>   234179072 bytes (234 MB) copied, 5.19232 s, 45.1 MB/s
> 
>   group1 time=8 16 2471 group1 sectors=8 16 457840
>   group2 time=8 16 1220 group2 sectors=8 16 225736
> 
> The first two fields in the time and sectors statistics represent the major
> and minor numbers of the device. The third field represents the disk time in
> milliseconds and the number of sectors transferred, respectively.
> 
> This patchset tries to provide fairness in terms of disk time received.
> group1 got almost double the disk time of group2 (at the time the first dd
> finished). These time and sector statistics can be read from the
> io.disk_time and io.disk_sectors files in the cgroup. More about this in the
> documentation file.
> 
> Test3 (Reader Vs Buffered Writes)
> ================================
> Buffered writes can be problematic and can overwhelm readers, especially with
> noop and deadline. IO controller can provide isolation between readers and
> buffered (async) writers.
> 
> First I ran the test without the IO controller to see the severity of the
> issue. Ran a hostile writer, started a reader after 10 seconds, and then
> monitored the completion time of the reader. The reader reads a 256 MB file.
> Tested this with the noop scheduler.
> 
> sample script
> ------------
> sync
> echo 3 > /proc/sys/vm/drop_caches
> time dd if=/dev/zero of=/mnt/sdb/reader-writer-zerofile bs=4K count=2097152 \
>     conv=fdatasync &
> sleep 10
> time dd if=/mnt/sdb/256M-file of=/dev/null &
> 
> Results
> -------
> 8589934592 bytes (8.6 GB) copied, 106.045 s, 81.0 MB/s (Writer)
> 268435456 bytes (268 MB) copied, 96.5237 s, 2.8 MB/s (Reader)
> 
> Now it was time to test whether the IO controller can provide isolation
> between readers and writers with noop. I created two cgroups of weight 1000
> each, put the reader in group1 and the writer in group2, and ran the test
> again. Upon completion of the reader, my scripts read the io.disk_time and
> io.disk_sectors cgroup files to get an estimate of how much disk time each
> group got and how many sectors each group did IO for.
> 
> For more accurate accounting of disk time for buffered writes with queuing
> hardware I had to set /sys/block/<disk>/queue/iosched/fairness to "1".
> 
> sample script
> -------------
> echo $$ > /cgroup/bfqio/test2/tasks
> dd if=/dev/zero of=/mnt/$BLOCKDEV/testzerofile bs=4K count=2097152 &
> sleep 10
> echo noop > /sys/block/$BLOCKDEV/queue/scheduler
> echo  1 > /sys/block/$BLOCKDEV/queue/iosched/fairness
> echo $$ > /cgroup/bfqio/test1/tasks
> dd if=/mnt/$BLOCKDEV/256M-file of=/dev/null &
> wait $!
> # Some code for reading cgroup files upon completion of reader.
> -------------------------
> 
> Results
> =======
> 268435456 bytes (268 MB) copied, 6.65819 s, 40.3 MB/s (Reader) 
> 
> group1 time=8 16 3063	group1 sectors=8 16 524808
> group2 time=8 16 3071	group2 sectors=8 16 441752
> 
> Note that the reader now finishes in much less time, and both group1 and
> group2 got almost 3 seconds of disk time. Hence the IO controller provides
> isolation from buffered writes.
> 
> Test4 (AIO)
> ===========
> 
> AIO reads
> -----------
> Set up two fio AIO read jobs in two cgroups with weights 1000 and 500
> respectively. I am using the CFQ scheduler. Following are some lines from my
> test script.
> 
> ---------------------------------------------------------------
> echo 1000 > /cgroup/bfqio/test1/io.weight
> echo 500 > /cgroup/bfqio/test2/io.weight
> 
> fio_args="--ioengine=libaio --rw=read --size=512M --direct=1"
> echo 1 > /sys/block/$BLOCKDEV/queue/iosched/fairness
> 
> echo $$ > /cgroup/bfqio/test1/tasks
> fio $fio_args --name=test1 --directory=/mnt/$BLOCKDEV/fio1/ \
>     --output=/mnt/$BLOCKDEV/fio1/test1.log \
>     --exec_postrun="../read-and-display-group-stats.sh $maj_dev $minor_dev" &
> 
> echo $$ > /cgroup/bfqio/test2/tasks
> fio $fio_args --name=test2 --directory=/mnt/$BLOCKDEV/fio2/ \
>     --output=/mnt/$BLOCKDEV/fio2/test2.log &
> ----------------------------------------------------------------
> 
> test1 and test2 are two groups with weights 1000 and 500 respectively.
> "read-and-display-group-stats.sh" is a small script which reads the test1
> and test2 cgroup files to determine how much disk time each group got until
> the first fio job finished; a rough sketch of it follows.
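> 
> The sketch below is an illustrative reconstruction, not the actual script.
> It assumes the major and minor device numbers are passed as $1 and $2, and
> that each stat file contains "major minor value" entries as shown in the
> results.
> 
> -------------------------------------------------
> #!/bin/bash
> # usage: read-and-display-group-stats.sh <major> <minor>
> maj=$1
> min=$2
> for grp in test1 test2; do
>     t=$(grep "^$maj $min " /cgroup/bfqio/$grp/io.disk_time | awk '{print $3}')
>     s=$(grep "^$maj $min " /cgroup/bfqio/$grp/io.disk_sectors | awk '{print $3}')
>     echo "$grp statistics: time=$maj $min $t   sectors=$maj $min $s"
> done
> -------------------------------------------------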
> 
> Results
> ------
> test1 statistics: time=8 16 22403   sectors=8 16 1049640
> test2 statistics: time=8 16 11400   sectors=8 16 552864
> 
> The above shows that by the time the first fio job (higher weight) finished,
> group test1 got 22403 ms of disk time and group test2 got 11400 ms of disk
> time. Similarly, the statistics for the number of sectors transferred are
> also shown.
> 
> Note that the disk time given to group test1 is almost double that of group
> test2.
> 
> AIO writes
> ----------
> Set up two fio AIO direct write jobs in two cgroups with weights 1000 and
> 500 respectively. I am using the CFQ scheduler. Following are some lines
> from my test script.
> 
> ------------------------------------------------
> echo 1000 > /cgroup/bfqio/test1/io.weight
> echo 500 > /cgroup/bfqio/test2/io.weight
> fio_args="--ioengine=libaio --rw=write --size=512M --direct=1"
> 
> echo 1 > /sys/block/$BLOCKDEV/queue/iosched/fairness
> 
> echo $$ > /cgroup/bfqio/test1/tasks
> fio $fio_args --name=test1 --directory=/mnt/$BLOCKDEV/fio1/ \
>     --output=/mnt/$BLOCKDEV/fio1/test1.log \
>     --exec_postrun="../read-and-display-group-stats.sh $maj_dev $minor_dev" &
> 
> echo $$ > /cgroup/bfqio/test2/tasks
> fio $fio_args --name=test2 --directory=/mnt/$BLOCKDEV/fio2/ \
>     --output=/mnt/$BLOCKDEV/fio2/test2.log &
> -------------------------------------------------
> 
> test1 and test2 are two groups with weights 1000 and 500 respectively.
> "read-and-display-group-stats.sh" is a small script which reads the test1
> and test2 cgroup files to determine how much disk time each group got until
> the first fio job finished.
> 
> Following are the results.
> 
> test1 statistics: time=8 16 29085   sectors=8 16 1049656
> test2 statistics: time=8 16 14652   sectors=8 16 516728
> 
> The above shows that by the time the first fio job (higher weight) finished,
> group test1 got 29085 ms of disk time and group test2 got 14652 ms of disk
> time. Similarly, the statistics for the number of sectors transferred are
> also shown.
> 
> Note that the disk time given to group test1 is almost double that of group
> test2.
> 
> Test5 (Fairness for async writes, Buffered Write Vs Buffered Write)
> ===================================================================
> Fairness for async writes is tricky, and the biggest reason is that async
> writes are cached in higher layers (page cache), and possibly in the file
> system layer as well (btrfs, xfs etc.), and are not necessarily dispatched
> to the lower layers in a proportional manner.
> 
> For example, consider two dd threads reading /dev/zero as the input file and
> writing huge files. Very soon we will cross vm_dirty_ratio, and a dd thread
> will be forced to write out some pages to disk before more pages can be
> dirtied. But the dirty pages picked are not necessarily those of the same
> thread; writeback can very well pick the inode of the lower-priority dd
> thread and do some writeout. So effectively the higher weight dd is doing
> writeouts of the lower weight dd's pages, and we don't see service
> differentiation.
> 
> IOW, the core problem with async write fairness is that the higher weight
> thread does not throw enough IO traffic at the IO controller to keep the
> queue continuously backlogged. In my testing, there are many 0.2 to 0.8
> second intervals where the higher weight queue is empty, and in that
> duration the lower weight queue gets a lot of work done, giving the
> impression that there was no service differentiation.
> 
> In summary, from the IO controller's point of view, async write support is
> there. But because the page cache has not been designed so that a higher
> prio/weight writer can do more writeout than a lower prio/weight writer,
> getting service differentiation is hard; it is visible in some cases and not
> in others.
> 
> Do we really care that much about fairness between two writer cgroups? One
> can choose to do direct writes or sync writes if fairness for writes really
> matters to them.
> 
> Following is the only case where it is hard to ensure fairness between cgroups.
> 
> - Buffered writes Vs Buffered Writes.
> 
> So to test async writes I created two partitions on a disk and created ext3
> file systems on both partitions. Also created two cgroups, generated lots of
> write traffic in the two cgroups (50 fio threads) and watched the disk time
> statistics in the respective cgroups at 2-second intervals. Thanks to Ryo
> Tsuruta for the test case.
> 
> *****************************************************************
> sync
> echo 3 > /proc/sys/vm/drop_caches
> 
> fio_args="--size=64m --rw=write --numjobs=50 --group_reporting"
> 
> echo $$ > /cgroup/bfqio/test1/tasks
> fio $fio_args --name=test1 --directory=/mnt/sdd1/fio/ --output=/mnt/sdd1/fio/test1.log &
> 
> echo $$ > /cgroup/bfqio/test2/tasks
> fio $fio_args --name=test2 --directory=/mnt/sdd2/fio/ --output=/mnt/sdd2/fio/test2.log &
> *********************************************************************** 
> 
> And watched the disk time and sector statistics for both cgroups every 2
> seconds using a script (a rough sketch of it appears after the output
> below). Here is a snippet from the output.
> 
> test1 statistics: time=8 48 1315   sectors=8 48 55776 dq=8 48 1
> test2 statistics: time=8 48 633   sectors=8 48 14720 dq=8 48 2
> 
> test1 statistics: time=8 48 5586   sectors=8 48 339064 dq=8 48 2
> test2 statistics: time=8 48 2985   sectors=8 48 146656 dq=8 48 3
> 
> test1 statistics: time=8 48 9935   sectors=8 48 628728 dq=8 48 3
> test2 statistics: time=8 48 5265   sectors=8 48 278688 dq=8 48 4
> 
> test1 statistics: time=8 48 14156   sectors=8 48 932488 dq=8 48 6
> test2 statistics: time=8 48 7646   sectors=8 48 412704 dq=8 48 7
> 
> test1 statistics: time=8 48 18141   sectors=8 48 1231488 dq=8 48 10
> test2 statistics: time=8 48 9820   sectors=8 48 548400 dq=8 48 8
> 
> test1 statistics: time=8 48 21953   sectors=8 48 1485632 dq=8 48 13
> test2 statistics: time=8 48 12394   sectors=8 48 698288 dq=8 48 10
> 
> test1 statistics: time=8 48 25167   sectors=8 48 1705264 dq=8 48 13
> test2 statistics: time=8 48 14042   sectors=8 48 817808 dq=8 48 10
> 
> The first two fields in the time and sectors statistics represent the major
> and minor numbers of the device. The third field represents the disk time in
> milliseconds and the number of sectors transferred, respectively.
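> 
> The 2-second watcher can be sketched roughly as follows (again an
> illustrative reconstruction, not the actual script; it just reads the same
> io.disk_time and io.disk_sectors files as before, and the dq column, which
> comes from a separate per-group file, is omitted):
> 
> -------------------------------------------------
> while true; do
>     for grp in test1 test2; do
>         echo "$grp time:    $(cat /cgroup/bfqio/$grp/io.disk_time)"
>         echo "$grp sectors: $(cat /cgroup/bfqio/$grp/io.disk_sectors)"
>     done
>     sleep 2
> done
> -------------------------------------------------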
> 
> So the disk time consumed by group test1 is almost double that of test2 in
> this case.
> 
> Your feedback is welcome.
> 
> Thanks
> Vivek
> 
> 
> 

-- 
Regards
Gui Jianfeng

