* Regarding dm-ioband tests
From: Vivek Goyal @ 2009-09-01 16:50 UTC
To: Ryo Tsuruta; +Cc: linux kernel mailing list, dm-devel

Hi Ryo,

I decided to play a bit more with dm-ioband and started doing some
testing. I am running a simple test with two dd threads doing reads and
don't seem to be getting the fairness. So I thought I would ask you what
the issue is here. Is there an issue with my testing procedure?

I got one 40G SATA drive (no hardware queuing). I have created two
partitions on that disk, /dev/sdd1 and /dev/sdd2, and created two ioband
devices, ioband1 and ioband2, on partitions sdd1 and sdd2 respectively.
The weights of the ioband1 and ioband2 devices are 200 and 100
respectively.

I am assuming that this setup will create two default groups and that IO
going to partition sdd1 should get double the BW of partition sdd2.

But it looks like I am not getting that behavior. Following is the output
of the "dmsetup status" command. These snapshots have been taken every 2
seconds while IO was going on. Column 9 seems to contain how many sectors
of IO have been done on a particular ioband device and group. Looking at
the snapshots, it does not look like the ioband1 default group got double
the BW of the ioband2 default group.

Am I doing something wrong here?

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0

ioband2: 0 40355280 ioband 1 -1 96 0 11528 0 0 0
ioband1: 0 37768752 ioband 1 -1 82 0 9736 0 0 0

ioband2: 0 40355280 ioband 1 -1 748 2 93032 0 0 0
ioband1: 0 37768752 ioband 1 -1 896 0 112232 0 0 0

ioband2: 0 40355280 ioband 1 -1 1326 5 165816 0 0 0
ioband1: 0 37768752 ioband 1 -1 1816 0 228312 0 0 0

ioband2: 0 40355280 ioband 1 -1 1943 6 243712 0 0 0
ioband1: 0 37768752 ioband 1 -1 2692 0 338760 0 0 0

ioband2: 0 40355280 ioband 1 -1 2461 10 308576 0 0 0
ioband1: 0 37768752 ioband 1 -1 3618 0 455608 0 0 0

ioband2: 0 40355280 ioband 1 -1 3118 11 391352 0 0 0
ioband1: 0 37768752 ioband 1 -1 4406 0 555032 0 0 0

ioband2: 0 40355280 ioband 1 -1 3734 15 468760 0 0 0
ioband1: 0 37768752 ioband 1 -1 5273 0 664328 0 0 0

ioband2: 0 40355280 ioband 1 -1 4307 17 540784 0 0 0
ioband1: 0 37768752 ioband 1 -1 6181 0 778992 0 0 0

ioband2: 0 40355280 ioband 1 -1 4930 19 619208 0 0 0
ioband1: 0 37768752 ioband 1 -1 7028 0 885728 0 0 0

ioband2: 0 40355280 ioband 1 -1 5599 22 703280 0 0 0
ioband1: 0 37768752 ioband 1 -1 7815 0 985024 0 0 0

ioband2: 0 40355280 ioband 1 -1 6586 27 827456 0 0 0
ioband1: 0 37768752 ioband 1 -1 8327 0 1049624 0 0 0

Following are details of my test setup.
---------------------------------------
I took dm-ioband patch version 1.12.3 and applied it on 2.6.31-rc6.

Created ioband devices using the following commands.
-----------------------------------------------------
echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 none" "weight 0 :200" | dmsetup create ioband1
echo "0 $(blockdev --getsize /dev/sdd2) ioband /dev/sdd2 1 0 0 none" "weight 0 :100" | dmsetup create ioband2

mount /dev/mapper/ioband1 /mnt/sdd1
mount /dev/mapper/ioband2 /mnt/sdd2

Started two dd threads
======================
dd if=/mnt/sdd1/testzerofile1 of=/dev/null &
dd if=/mnt/sdd2/testzerofile1 of=/dev/null &

Output of dmsetup table command
===============================
ioband2: 0 40355280 ioband 8:50 1 4 192 none weight 768 :100
ioband1: 0 37768752 ioband 8:49 1 4 192 none weight 768 :200

Thanks
Vivek
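
The 2-second sampling described above can be automated with a small
helper along the following lines. This is only a sketch: it assumes the
transferred-sector count is field 9 of the unfiltered "dmsetup status"
output, as read in the message above, and that the devices are named
ioband1 and ioband2; the field position may differ across dm-ioband
versions.

#!/bin/sh
# Sketch: print the per-interval sector delta for each ioband device.
# Field 9 of "dmsetup status" is assumed to be the transferred-sector
# count (see the message above); adjust if your dm-ioband version
# reports it in a different position.
prev1=0; prev2=0
while sleep 2; do
        cur1=$(dmsetup status | awk '/^ioband1:/ {print $9}')
        cur2=$(dmsetup status | awk '/^ioband2:/ {print $9}')
        echo "ioband1: $((cur1 - prev1)) sectors   ioband2: $((cur2 - prev2)) sectors"
        prev1=$cur1; prev2=$cur2
done
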
* Re: Regarding dm-ioband tests
From: Vivek Goyal @ 2009-09-01 17:47 UTC
To: Ryo Tsuruta; +Cc: linux kernel mailing list, dm-devel

On Tue, Sep 01, 2009 at 12:50:11PM -0400, Vivek Goyal wrote:
> Hi Ryo,
>
> I decided to play a bit more with dm-ioband and started doing some
> testing. I am running a simple test with two dd threads doing reads and
> don't seem to be getting the fairness. So I thought I would ask you
> what the issue is here. Is there an issue with my testing procedure?
>
> I got one 40G SATA drive (no hardware queuing). I have created two
> partitions on that disk, /dev/sdd1 and /dev/sdd2, and created two
> ioband devices, ioband1 and ioband2, on partitions sdd1 and sdd2
> respectively. The weights of the ioband1 and ioband2 devices are 200
> and 100 respectively.
>
> I am assuming that this setup will create two default groups and that
> IO going to partition sdd1 should get double the BW of partition sdd2.
>
> But it looks like I am not getting that behavior. Following is the
> output of the "dmsetup status" command. These snapshots have been taken
> every 2 seconds while IO was going on. Column 9 seems to contain how
> many sectors of IO have been done on a particular ioband device and
> group. Looking at the snapshots, it does not look like the ioband1
> default group got double the BW of the ioband2 default group.
>
> Am I doing something wrong here?
>

I tried another variant of the test. This time I also created two
additional groups on the ioband1 device, linked these to cgroups test1
and test2, and launched two dd threads in the two cgroups on device
ioband1. There also I don't seem to be getting the right fairness
numbers for cgroups test1 and test2.

Script to create ioband devices and additional groups
-----------------------------------------------------
echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 none" "weight 0 :200" | dmsetup create ioband1
echo "0 $(blockdev --getsize /dev/sdd2) ioband /dev/sdd2 1 0 0 none" "weight 0 :100" | dmsetup create ioband2

# Some code to mount and create cgroups.
# Read group id
test1_id=`cat /cgroup/ioband/test1/blkio.id`
test2_id=`cat /cgroup/ioband/test2/blkio.id`

test1_weight=200
test2_weight=100

dmsetup message ioband1 0 type cgroup
dmsetup message ioband1 0 attach $test1_id
dmsetup message ioband1 0 attach $test2_id
dmsetup message ioband1 0 weight $test1_id:$test1_weight
dmsetup message ioband1 0 weight $test2_id:$test2_weight

mount /dev/mapper/ioband1 /mnt/sdd1
mount /dev/mapper/ioband2 /mnt/sdd2
-----------------------------------------------------------------

Following are two dd jobs
-------------------------
dd if=/mnt/sdd1/testzerofile1 of=/dev/null &
echo $! > /cgroup/ioband/test1/tasks

dd if=/mnt/sdd1/testzerofile2 of=/dev/null &
echo $! > /cgroup/ioband/test2/tasks

Following are "dmsetup status" results every 2 seconds
======================================================
ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 0 0 0 0 0 0 3 0 0 0 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 0 0 0 0 0 0 3 0 0 0 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 689 0 86336 0 0 0 3 650 3 81472 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 1725 0 217024 0 0 0 3 1270 11 158912 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 2690 0 338744 0 0 0 3 1978 15 247856 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 3762 0 474040 0 0 0 3 2583 21 323736 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 4745 0 598064 0 0 0 3 3275 27 410392 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 5737 0 723120 0 0 0 3 3985 31 499592 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 6815 0 859184 0 0 0 3 4594 37 575864 0 0 0

ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 7823 0 986288 0 0 0 3 5276 43 661360 0 0 0

"dmsetup table" output
======================
ioband2: 0 40355280 ioband 8:50 1 4 192 none weight 768 :100
ioband1: 0 37768752 ioband 8:49 1 4 192 cgroup weight 768 :200 2:200 3:100

Because I am using the "weight" policy, I thought that the test1 cgroup
with id "2" would issue double the number of requests of the test2
cgroup with id "3". But that does not seem to be happening here. Is
there an issue with my testing method?

Thanks
Vivek

> ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
> ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 96 0 11528 0 0 0
> ioband1: 0 37768752 ioband 1 -1 82 0 9736 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 748 2 93032 0 0 0
> ioband1: 0 37768752 ioband 1 -1 896 0 112232 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 1326 5 165816 0 0 0
> ioband1: 0 37768752 ioband 1 -1 1816 0 228312 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 1943 6 243712 0 0 0
> ioband1: 0 37768752 ioband 1 -1 2692 0 338760 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 2461 10 308576 0 0 0
> ioband1: 0 37768752 ioband 1 -1 3618 0 455608 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 3118 11 391352 0 0 0
> ioband1: 0 37768752 ioband 1 -1 4406 0 555032 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 3734 15 468760 0 0 0
> ioband1: 0 37768752 ioband 1 -1 5273 0 664328 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 4307 17 540784 0 0 0
> ioband1: 0 37768752 ioband 1 -1 6181 0 778992 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 4930 19 619208 0 0 0
> ioband1: 0 37768752 ioband 1 -1 7028 0 885728 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 5599 22 703280 0 0 0
> ioband1: 0 37768752 ioband 1 -1 7815 0 985024 0 0 0
>
> ioband2: 0 40355280 ioband 1 -1 6586 27 827456 0 0 0
> ioband1: 0 37768752 ioband 1 -1 8327 0 1049624 0 0 0
>
> Following are details of my test setup.
> ---------------------------------------
> I took dm-ioband patch version 1.12.3 and applied it on 2.6.31-rc6.
>
> Created ioband devices using the following commands.
> -----------------------------------------------------
> echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 none" "weight 0 :200" | dmsetup create ioband1
> echo "0 $(blockdev --getsize /dev/sdd2) ioband /dev/sdd2 1 0 0 none" "weight 0 :100" | dmsetup create ioband2
>
> mount /dev/mapper/ioband1 /mnt/sdd1
> mount /dev/mapper/ioband2 /mnt/sdd2
>
> Started two dd threads
> ======================
> dd if=/mnt/sdd1/testzerofile1 of=/dev/null &
> dd if=/mnt/sdd2/testzerofile1 of=/dev/null &
>
> Output of dmsetup table command
> ===============================
> ioband2: 0 40355280 ioband 8:50 1 4 192 none weight 768 :100
> ioband1: 0 37768752 ioband 8:49 1 4 192 none weight 768 :200
>
> Thanks
> Vivek
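
For reference, the per-cgroup steps in the message above (create the
cgroup, read its blkio.id, attach that id to the ioband device, set its
weight) can be wrapped in a small helper. This is a sketch only; as in
the script above, it assumes the ioband cgroup hierarchy is mounted at
/cgroup/ioband and that each cgroup exposes its dm-ioband group id
through blkio.id.

#!/bin/sh
# Sketch: attach a cgroup to an ioband device and assign it a weight.
# Assumes /cgroup/ioband is the mounted cgroup hierarchy and blkio.id
# holds the group id, as in the test script above.
attach_cgroup() {
        dev=$1; cg=$2; weight=$3
        mkdir -p /cgroup/ioband/$cg
        id=$(cat /cgroup/ioband/$cg/blkio.id)
        dmsetup message $dev 0 attach $id
        dmsetup message $dev 0 weight $id:$weight
}

dmsetup message ioband1 0 type cgroup
attach_cgroup ioband1 test1 200
attach_cgroup ioband1 test2 100
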
* Re: Regarding dm-ioband tests
From: Vivek Goyal @ 2009-09-03 13:11 UTC
To: Ryo Tsuruta; +Cc: linux kernel mailing list, dm-devel

On Tue, Sep 01, 2009 at 01:47:24PM -0400, Vivek Goyal wrote:
> On Tue, Sep 01, 2009 at 12:50:11PM -0400, Vivek Goyal wrote:
> > Hi Ryo,
> >
> > I decided to play a bit more with dm-ioband and started doing some
> > testing. I am running a simple test with two dd threads doing reads
> > and don't seem to be getting the fairness. So I thought I would ask
> > you what the issue is here. Is there an issue with my testing
> > procedure?
> >
> > I got one 40G SATA drive (no hardware queuing). I have created two
> > partitions on that disk, /dev/sdd1 and /dev/sdd2, and created two
> > ioband devices, ioband1 and ioband2, on partitions sdd1 and sdd2
> > respectively. The weights of the ioband1 and ioband2 devices are 200
> > and 100 respectively.
> >
> > I am assuming that this setup will create two default groups and
> > that IO going to partition sdd1 should get double the BW of
> > partition sdd2.
> >
> > But it looks like I am not getting that behavior. Following is the
> > output of the "dmsetup status" command. These snapshots have been
> > taken every 2 seconds while IO was going on. Column 9 seems to
> > contain how many sectors of IO have been done on a particular ioband
> > device and group. Looking at the snapshots, it does not look like
> > the ioband1 default group got double the BW of the ioband2 default
> > group.
> >
> > Am I doing something wrong here?
> >

Hi Ryo,

Did you get a chance to look into it? Am I doing something wrong, or is
it an issue with dm-ioband?

Thanks
Vivek

> I tried another variant of the test. This time I also created two
> additional groups on the ioband1 device, linked these to cgroups test1
> and test2, and launched two dd threads in the two cgroups on device
> ioband1. There also I don't seem to be getting the right fairness
> numbers for cgroups test1 and test2.
>
> Script to create ioband devices and additional groups
> -----------------------------------------------------
> echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 none" "weight 0 :200" | dmsetup create ioband1
> echo "0 $(blockdev --getsize /dev/sdd2) ioband /dev/sdd2 1 0 0 none" "weight 0 :100" | dmsetup create ioband2
>
> # Some code to mount and create cgroups.
> # Read group id
> test1_id=`cat /cgroup/ioband/test1/blkio.id`
> test2_id=`cat /cgroup/ioband/test2/blkio.id`
>
> test1_weight=200
> test2_weight=100
>
> dmsetup message ioband1 0 type cgroup
> dmsetup message ioband1 0 attach $test1_id
> dmsetup message ioband1 0 attach $test2_id
> dmsetup message ioband1 0 weight $test1_id:$test1_weight
> dmsetup message ioband1 0 weight $test2_id:$test2_weight
>
> mount /dev/mapper/ioband1 /mnt/sdd1
> mount /dev/mapper/ioband2 /mnt/sdd2
> -----------------------------------------------------------------
>
> Following are two dd jobs
> -------------------------
> dd if=/mnt/sdd1/testzerofile1 of=/dev/null &
> echo $! > /cgroup/ioband/test1/tasks
>
> dd if=/mnt/sdd1/testzerofile2 of=/dev/null &
> echo $!
> /cgroup/ioband/test2/tasks > > > Following are "dmsetup status" results every 2 seconds > ====================================================== > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 0 0 0 0 0 0 3 0 0 0 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 0 0 0 0 0 0 3 0 0 0 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 689 0 86336 0 0 0 3 650 3 > 81472 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 1725 0 217024 0 0 0 3 1270 > 11 158912 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 2690 0 338744 0 0 0 3 1978 > 15 247856 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 3762 0 474040 0 0 0 3 2583 > 21 323736 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 4745 0 598064 0 0 0 3 3275 > 27 410392 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 5737 0 723120 0 0 0 3 3985 > 31 499592 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 6815 0 859184 0 0 0 3 4594 > 37 575864 0 0 0 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 2 7823 0 986288 0 0 0 3 5276 > 43 661360 0 0 0 > > "dmsetup table" output > ====================== > ioband2: 0 40355280 ioband 8:50 1 4 192 none weight 768 :100 > ioband1: 0 37768752 ioband 8:49 1 4 192 cgroup weight 768 :200 2:200 3:100 > > Because I am using "weight" policy, I thought that test1 cgroup with id > "2" will issue double the number of requests of cgroup test2 with id "3". > But that does not seem to be happening here. Is there an issue with my > testing method. > > Thanks > Vivek > > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 96 0 11528 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 82 0 9736 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 748 2 93032 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 896 0 112232 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 1326 5 165816 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 1816 0 228312 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 1943 6 243712 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 2692 0 338760 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 2461 10 308576 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 3618 0 455608 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 3118 11 391352 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 4406 0 555032 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 3734 15 468760 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 5273 0 664328 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 4307 17 540784 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 6181 0 778992 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 4930 19 619208 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 7028 0 885728 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 5599 22 703280 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 7815 0 985024 0 0 0 > > > > ioband2: 0 40355280 ioband 1 -1 6586 27 827456 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 8327 0 1049624 0 0 0 > > > > Following are details of my test setup. > > --------------------------------------- > > I took dm-ioband patch version 1.12.3 and applied on 2.6.31-rc6. 
> > > > Created ioband devices using following command. > > ---------------------------------------------- > > echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 none" > > "weight 0 :200" | dmsetup create ioband1 > > echo "0 $(blockdev --getsize /dev/sdd2) ioband /dev/sdd2 1 0 0 none" > > "weight 0 :100" | dmsetup create ioband2 > > > > mount /dev/mapper/ioband1 /mnt/sdd1 > > mount /dev/mapper/ioband2 /mnt/sdd2 > > > > Started two dd threads > > ====================== > > dd if=/mnt/sdd1/testzerofile1 of=/dev/null & > > dd if=/mnt/sdd2/testzerofile1 of=/dev/null & > > > > Output of dmsetup table command > > ================================ > > ioband2: 0 40355280 ioband 8:50 1 4 192 none weight 768 :100 > > ioband1: 0 37768752 ioband 8:49 1 4 192 none weight 768 :200 > > > > Thanks > > Vivek ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
From: Ryo Tsuruta @ 2009-09-04 1:12 UTC
To: vgoyal; +Cc: linux-kernel, dm-devel

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, Sep 01, 2009 at 01:47:24PM -0400, Vivek Goyal wrote:
> > On Tue, Sep 01, 2009 at 12:50:11PM -0400, Vivek Goyal wrote:
> > > Hi Ryo,
> > >
> > > I decided to play a bit more with dm-ioband and started doing some
> > > testing. I am running a simple test with two dd threads doing
> > > reads and don't seem to be getting the fairness. So I thought I
> > > would ask you what the issue is here. Is there an issue with my
> > > testing procedure?
> > >
> > > I got one 40G SATA drive (no hardware queuing). I have created two
> > > partitions on that disk, /dev/sdd1 and /dev/sdd2, and created two
> > > ioband devices, ioband1 and ioband2, on partitions sdd1 and sdd2
> > > respectively. The weights of the ioband1 and ioband2 devices are
> > > 200 and 100 respectively.
> > >
> > > I am assuming that this setup will create two default groups and
> > > that IO going to partition sdd1 should get double the BW of
> > > partition sdd2.
> > >
> > > But it looks like I am not getting that behavior. Following is the
> > > output of the "dmsetup status" command. These snapshots have been
> > > taken every 2 seconds while IO was going on. Column 9 seems to
> > > contain how many sectors of IO have been done on a particular
> > > ioband device and group. Looking at the snapshots, it does not
> > > look like the ioband1 default group got double the BW of the
> > > ioband2 default group.
> > >
> > > Am I doing something wrong here?
> >
> Hi Ryo,
>
> Did you get a chance to look into it? Am I doing something wrong, or is
> it an issue with dm-ioband?

Sorry, I missed it. I'll look into it and report back to you.

Thanks,
Ryo Tsuruta
* dm-ioband fairness in terms of sectors seems to be killing disk (Was: Re: Regarding dm-ioband tests)
From: Vivek Goyal @ 2009-09-15 21:40 UTC
To: Ryo Tsuruta
Cc: linux-kernel, dm-devel, dhaval, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer

On Fri, Sep 04, 2009 at 10:12:22AM +0900, Ryo Tsuruta wrote:
> Hi Vivek,
>
> Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Tue, Sep 01, 2009 at 01:47:24PM -0400, Vivek Goyal wrote:
> > > On Tue, Sep 01, 2009 at 12:50:11PM -0400, Vivek Goyal wrote:
> > > > Hi Ryo,
> > > >
> > > > I decided to play a bit more with dm-ioband and started doing
> > > > some testing. I am running a simple test with two dd threads
> > > > doing reads and don't seem to be getting the fairness. So I
> > > > thought I would ask you what the issue is here. Is there an
> > > > issue with my testing procedure?
> > > >
> > > > I got one 40G SATA drive (no hardware queuing). I have created
> > > > two partitions on that disk, /dev/sdd1 and /dev/sdd2, and
> > > > created two ioband devices, ioband1 and ioband2, on partitions
> > > > sdd1 and sdd2 respectively. The weights of the ioband1 and
> > > > ioband2 devices are 200 and 100 respectively.
> > > >
> > > > I am assuming that this setup will create two default groups
> > > > and that IO going to partition sdd1 should get double the BW of
> > > > partition sdd2.
> > > >
> > > > But it looks like I am not getting that behavior. Following is
> > > > the output of the "dmsetup status" command. These snapshots have
> > > > been taken every 2 seconds while IO was going on. Column 9 seems
> > > > to contain how many sectors of IO have been done on a particular
> > > > ioband device and group. Looking at the snapshots, it does not
> > > > look like the ioband1 default group got double the BW of the
> > > > ioband2 default group.
> > > >
> > > > Am I doing something wrong here?
> > >
> >
> > Hi Ryo,
> >
> > Did you get a chance to look into it? Am I doing something wrong, or
> > is it an issue with dm-ioband?
>
> Sorry, I missed it. I'll look into it and report back to you.

Hi Ryo,

I am running a sequential reader in one group and a few random readers
and writers in a second group. Both groups have the same weight. I ran
the fio scripts for 60 seconds and then looked at the output. In this
case it looks like we just kill the throughput of the sequential reader
and of the disk (because random readers/writers take over).

I ran the test "with-dm-ioband", "without-dm-ioband" and "with io
scheduler based io controller". First I am pasting the results and at
the end I will paste my test scripts. I have cut the fio output heavily
so that we do not get lost in lots of output.

with-dm-ioband
==============
ioband1
-------
randread: (groupid=0, jobs=4): err= 0: pid=3610
  read : io=18,432KiB, bw=314KiB/s, iops=76, runt= 60076msec
    clat (usec): min=140, max=744K, avg=50866.75, stdev=61266.88

randwrite: (groupid=1, jobs=2): err= 0: pid=3614
  write: io=920KiB, bw=15KiB/s, iops=3, runt= 60098msec
    clat (usec): min=203, max=14,171K, avg=522937.86, stdev=960929.44

ioband2
-------
seqread0: (groupid=0, jobs=1): err= 0: pid=3609
  read : io=37,904KiB, bw=636KiB/s, iops=155, runt= 61026msec
    clat (usec): min=92, max=9,969K, avg=6437.89, stdev=168573.23

without dm-ioband (vanilla cfq, no grouping)
============================================
seqread0: (groupid=0, jobs=1): err= 0: pid=3969
  read : io=321MiB, bw=5,598KiB/s, iops=1,366, runt= 60104msec
    clat (usec): min=91, max=763K, avg=729.61, stdev=17402.63

randread: (groupid=0, jobs=4): err= 0: pid=3970
  read : io=15,112KiB, bw=257KiB/s, iops=62, runt= 60039msec
    clat (usec): min=124, max=1,066K, avg=63721.26, stdev=78215.17

randwrite: (groupid=1, jobs=2): err= 0: pid=3974
  write: io=680KiB, bw=11KiB/s, iops=2, runt= 60073msec
    clat (usec): min=199, max=24,646K, avg=706719.51, stdev=1774887.55

With io scheduler based io controller patches
=============================================
cgroup 1 (weight 100)
---------------------
randread: (groupid=0, jobs=4): err= 0: pid=2995
  read : io=9,484KiB, bw=161KiB/s, iops=39, runt= 60107msec
    clat (msec): min=1, max=2,167, avg=95.47, stdev=131.60

randwrite: (groupid=1, jobs=2): err= 0: pid=2999
  write: io=2,692KiB, bw=45KiB/s, iops=11, runt= 60131msec
    clat (usec): min=199, max=30,043K, avg=178710.05, stdev=1281485.75

cgroup 2 (weight 100)
---------------------
seqread0: (groupid=0, jobs=1): err= 0: pid=2993
  read : io=547MiB, bw=9,556KiB/s, iops=2,333, runt= 60043msec
    clat (usec): min=92, max=224K, avg=426.74, stdev=5734.12

Note the BW of the sequential reader in the three cases (636KiB/s,
5,598KiB/s, 9,556KiB/s). dm-ioband tries to provide fairness in terms of
number of sectors and it completely kills the disk throughput.

With the io scheduler based io controller, we see increased throughput
for the sequential reader as compared to CFQ, because now the random
readers are running in a separate group and hence the reader gets
isolation from the random readers.

Here are my fio jobs
--------------------
First fio job file
------------------
[global]
runtime=60

[randread]
rw=randread
size=2G
iodepth=20
directory=/mnt/sdd1/fio/
direct=1
numjobs=4
group_reporting

[randwrite]
rw=randwrite
size=1G
iodepth=20
directory=/mnt/sdd1/fio/
group_reporting
direct=1
numjobs=2

Second fio job file
-------------------
[global]
runtime=60
rw=read
size=4G
directory=/mnt/sdd2/fio/
direct=1

[seqread0]
numjobs=1
group_reporting

Thanks
Vivek
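
One way to drive the two fio job files above at the same time, each
confined to its own cgroup, is sketched below. The job file names
(randrw.fio, seqread.fio) and the cgroup paths are placeholders, not
taken from the original setup; the subshell moves itself into the cgroup
before exec'ing fio so that all fio worker processes inherit the group.

#!/bin/sh
# Sketch: run both fio job files concurrently in separate cgroups.
# Job file names and cgroup paths below are placeholders.
( echo $$ > /cgroup/test1/tasks; exec fio randrw.fio ) &
( echo $$ > /cgroup/test2/tasks; exec fio seqread.fio ) &
wait
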
* Re: dm-ioband fairness in terms of sectors seems to be killing disk
From: Ryo Tsuruta @ 2009-09-16 11:10 UTC
To: vgoyal
Cc: linux-kernel, dm-devel, dhaval, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> Hi Ryo,
>
> I am running a sequential reader in one group and a few random readers
> and writers in a second group. Both groups have the same weight. I ran
> the fio scripts for 60 seconds and then looked at the output. In this
> case it looks like we just kill the throughput of the sequential
> reader and of the disk (because random readers/writers take over).

Thank you for testing dm-ioband. I ran your script on my environment,
and here are the results.

             Throughput [KiB/s]
            vanilla   dm-ioband           dm-ioband
                      (io-throttle = 4)   (io-throttle = 50)
randread        312         392                 368
randwrite        11          12                  10
seqread        4341         651                1599

I ran the script on dm-ioband under two conditions: in one the
io-throttle option is set to 4, and in the other it is set to 50. When
there are in-flight IO requests in a group and their number exceeds
io-throttle, dm-ioband gives priority to that group and the group can
issue subsequent IO requests in preference to the other groups. An
io-throttle of 50 effectively cancels this mechanism, so seqread got
more bandwidth than with an io-throttle of 4.

I tried to test with 2.6.31-rc7 and io-controller v9, but unfortunately
a kernel panic happened. I'll try to test with your io-controller again
later.

> With the io scheduler based io controller, we see increased throughput
> for the sequential reader as compared to CFQ, because now the random
> readers are running in a separate group and hence the reader gets
> isolation from the random readers.

I summarized your results in a tabular format.

             Throughput [KiB/s]
            vanilla   io-controller   dm-ioband
randread        257        161             314
randwrite        11         45              15
seqread        5598       9556             631

On the result of the io-controller, the throughput of seqread was
increased but randread was decreased against vanilla. Did it perform as
you expected? Was disk time consumed equally on each group according to
the weight settings? Could you tell me your opinion on what an
io-controller should do when this kind of workload is applied?

Thanks,
Ryo Tsuruta
* Re: Regarding dm-ioband tests
From: Ryo Tsuruta @ 2009-09-04 4:02 UTC
To: vgoyal; +Cc: linux-kernel, dm-devel

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> Hi Ryo,
>
> I decided to play a bit more with dm-ioband and started doing some
> testing. I am running a simple test with two dd threads doing reads
> and don't seem to be getting the fairness. So I thought I would ask
> you what the issue is here. Is there an issue with my testing
> procedure?

Thank you for testing dm-ioband. dm-ioband is designed to start
throttling bandwidth when multiple IO requests are issued to devices
simultaneously, IOW, to start throttling when the IO load exceeds a
certain level.

Here is my test script that runs multiple dd threads on each directory.
Each directory stores 20 files of 2GB.

#!/bin/sh
tmout=60

for nr_threads in 1 4 8 12 16 20; do
        sync; echo 3 > /proc/sys/vm/drop_caches

        for i in $(seq $nr_threads); do
                dd if=/mnt1/ioband1.${i}.0 of=/dev/null &
                dd if=/mnt2/ioband2.${i}.0 of=/dev/null &
        done
        iostat -k 1 $tmout > ${nr_threads}.log
        killall -ws TERM dd
done
exit 0

Here is the result. The average throughputs of each device follow the
proportion of the weight settings once the number of threads is four or
more.

Average throughput in 60 seconds [KB/s]

             ioband1          ioband2
threads    weight 200       weight 100       total
   1     26642 (54.9%)    21925 (45.1%)      48568
   4     33974 (67.7%)    16181 (32.3%)      50156
   8     31952 (66.2%)    16297 (33.8%)      48249
  12     32062 (67.8%)    15236 (32.2%)      47299
  16     31780 (67.7%)    15165 (32.3%)      46946
  20     29955 (66.3%)    15239 (33.7%)      45195

Please try to run the above script on your environment and I would be
glad if you let me know the result.

Thanks,
Ryo Tsuruta
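
The averages in the table above can be recomputed from the iostat logs
that the script produces. A sketch, assuming the ioband devices show up
in the "iostat -k" output under the names used below (on some systems
they may instead appear as dm-N):

#!/bin/sh
# Sketch: average the kB_read/s column (field 3 of "iostat -k" device
# lines) for each ioband device over one log file, e.g. 8.log.
log=${1:-8.log}
for dev in ioband1 ioband2; do
        awk -v d="$dev" '$1 == d { sum += $3; n++ }
                END { if (n) printf "%s: %.0f KB/s\n", d, sum / n }' "$log"
done
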
* Re: Regarding dm-ioband tests
  2009-09-04  4:02 ` Ryo Tsuruta
@ 2009-09-04 23:11 ` Vivek Goyal
  -1 siblings, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-04 23:11 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: linux-kernel, dm-devel, Jens Axboe, Alasdair G Kergon, Andrew Morton,
    Nauman Rafique, Gui Jianfeng, Rik van Riel, Jeff Moyer, Balbir Singh

On Fri, Sep 04, 2009 at 01:02:28PM +0900, Ryo Tsuruta wrote:
> Hi Vivek,
>
> Vivek Goyal <vgoyal@redhat.com> wrote:
> > Hi Ryo,
> >
> > I decided to play a bit more with dm-ioband and started doing some
> > testing. I am doing a simple two dd threads doing reads and don't seem
> > to be gettting the fairness. So thought will ask you what's the issue
> > here. Is there an issue with my testing procedure.

[ CCing relevant folks on the thread, as the question of what fairness
means is becoming interesting ]

>
> Thank you for testing dm-ioband. dm-ioband is designed to start
> throttling bandwidth when multiple IO requests are issued to devices
> simultaneously, IOW, to start throttling when IO load exceeds a
> certain level.
>

What is that certain level? Secondly, what's the advantage of this?

I can see disadvantages though. So unless a group is really busy "up to
that certain level" it will not get fairness? It breaks the isolation
between groups.

> Here is my test script that runs multiple dd threads on each
> directory. Each directory stores 20 files of 2GB.
>
> #!/bin/sh
> tmout=60
>
> for nr_threads in 1 4 8 12 16 20; do
>     sync; echo 3 > /proc/sys/vm/drop_caches
>
>     for i in $(seq $nr_threads); do
>         dd if=/mnt1/ioband1.${i}.0 of=/dev/null &
>         dd if=/mnt2/ioband2.${i}.0 of=/dev/null &
>     done
>     iostat -k 1 $tmout > ${nr_threads}.log
>     killall -ws TERM dd
> done
> exit 0
>
> Here is the result. The average throughputs of each device are
> according to the proportion of the weight settings when the number of
> thread is over four.
>
> Average thoughput in 60 seconds [KB/s]
>
>                 ioband1           ioband2
> threads       weight 200        weight 100        total
>       1     26642 (54.9%)     21925 (45.1%)       48568
>       4     33974 (67.7%)     16181 (32.3%)       50156
>       8     31952 (66.2%)     16297 (33.8%)       48249
>      12     32062 (67.8%)     15236 (32.2%)       47299
>      16     31780 (67.7%)     15165 (32.3%)       46946
>      20     29955 (66.3%)     15239 (33.7%)       45195
>
> Please try to run the above script on your envirionment and I would be
> glad if you let me know the result.

I ran my simple dd test again with two ioband devices of weight 200
(ioband1) and 100 (ioband2) respectively. I launched four sequential dd
readers on ioband2 and one sequential reader on ioband1. Now if we are
providing isolation between groups, then ioband1 should get double the
bandwidth of ioband2. But that does not happen. Following is the output
of the "dmsetup table" command.

Fri Sep  4 18:02:01 EDT 2009
ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0

Fri Sep  4 18:02:08 EDT 2009
ioband2: 0 40355280 ioband 1 -1 2031 32 250280 0 0 0
ioband1: 0 37768752 ioband 1 -1 1484 0 186544 0 0 0

Fri Sep  4 18:02:12 EDT 2009
ioband2: 0 40355280 ioband 1 -1 3541 64 437192 0 0 0
ioband1: 0 37768752 ioband 1 -1 2802 0 352728 0 0 0

Fri Sep  4 18:02:16 EDT 2009
ioband2: 0 40355280 ioband 1 -1 5200 87 644144 0 0 0
ioband1: 0 37768752 ioband 1 -1 4003 0 504296 0 0 0

Fri Sep  4 18:02:20 EDT 2009
ioband2: 0 40355280 ioband 1 -1 7632 111 948232 0 0 0
ioband1: 0 37768752 ioband 1 -1 4494 0 566080 0 0 0

This seems to be breaking the isolation between the two groups.

Now if there is one bad group with lots of readers and writers at a lower
weight, it will overwhelm a higher-weight group that has only one or two
readers, or some random readers, running. If there are lots of readers
running in one group and a small-file reader then comes in a different,
higher-priority group, it will not get any fairness, and the latency of
the file read will be very high. But one would expect that groups provide
isolation and that the latency of a small-file reader does not increase
with the number of readers in a low-priority group.

I also ran your test of doing heavy IO in two groups. This time I am
running 4 dd threads in both the ioband devices. Following is the snapshot
of "dmsetup table" output.

Fri Sep  4 17:45:27 EDT 2009
ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0

Fri Sep  4 17:45:29 EDT 2009
ioband2: 0 40355280 ioband 1 -1 41 0 4184 0 0 0
ioband1: 0 37768752 ioband 1 -1 173 0 20096 0 0 0

Fri Sep  4 17:45:37 EDT 2009
ioband2: 0 40355280 ioband 1 -1 1605 23 197976 0 0 0
ioband1: 0 37768752 ioband 1 -1 4640 1 583168 0 0 0

Fri Sep  4 17:45:45 EDT 2009
ioband2: 0 40355280 ioband 1 -1 3650 47 453488 0 0 0
ioband1: 0 37768752 ioband 1 -1 8572 1 1079144 0 0 0

Fri Sep  4 17:45:51 EDT 2009
ioband2: 0 40355280 ioband 1 -1 5111 68 635696 0 0 0
ioband1: 0 37768752 ioband 1 -1 11587 1 1459544 0 0 0

Fri Sep  4 17:45:53 EDT 2009
ioband2: 0 40355280 ioband 1 -1 5698 73 709272 0 0 0
ioband1: 0 37768752 ioband 1 -1 12503 1 1575112 0 0 0

Fri Sep  4 17:45:57 EDT 2009
ioband2: 0 40355280 ioband 1 -1 6790 87 845808 0 0 0
ioband1: 0 37768752 ioband 1 -1 14395 2 1813680 0 0 0

Note, it took more than 20 seconds (since I started the threads) to reach
close to the desired fairness level. That's too long a duration.

Again, random readers or small-file readers are completely out of the
picture for any kind of fairness, or are not protected at all, with the
dm-ioband controller.

I think there are serious issues with the notion of fairness and with the
kind of isolation dm-ioband provides between groups, and this should be
looked into.

Thanks
Vivek

^ permalink raw reply [flat|nested] 80+ messages in thread
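Timestamped snapshots like the ones above can be produced with a small
sampling loop around dmsetup. The loop below is only a sketch for
reproducing the measurement; the interval and iteration count are arbitrary
choices, not the exact commands used for the mail.

#!/bin/sh
# Take timestamped snapshots of the ioband tables while the dd readers run.
interval=4
count=10
i=0
while [ $i -lt $count ]; do
        date
        dmsetup table | grep '^ioband'
        echo
        sleep $interval
        i=$((i + 1))
done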
* Re: Regarding dm-ioband tests
  2009-09-04 23:11 ` Vivek Goyal
@ 2009-09-07 11:02 ` Ryo Tsuruta
  -1 siblings, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-07 11:02 UTC (permalink / raw)
To: vgoyal
Cc: linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng,
    riel, jmoyer, balbir

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> > Thank you for testing dm-ioband. dm-ioband is designed to start
> > throttling bandwidth when multiple IO requests are issued to devices
> > simultaneously, IOW, to start throttling when IO load exceeds a
> > certain level.
> >
>
> What is that certain level? Secondly what's the advantage of this?
>
> I can see disadvantages though. So unless a group is really busy "up to
> that certain level" it will not get fairness? I breaks the isolation
> between groups.

In your test case, at least two dd threads have to run simultaneously
in the higher-weight group. The reason is that if an IO group does not
issue a certain number of IO requests, dm-ioband assumes the group is
inactive and assigns its spare bandwidth to active IO groups. This way
the whole bandwidth of the device can be used efficiently. Please run
two dd threads in the higher-weight group; it will work as you expect.

However, if you want to get fairness in a case like this, a new
bandwidth control policy which controls bandwidth accurately according
to the assigned weights can be added to dm-ioband.

> I also ran your test of doing heavy IO in two groups. This time I am
> running 4 dd threads in both the ioband devices. Following is the snapshot
> of "dmsetup table" output.
>
> Fri Sep  4 17:45:27 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
> ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0
>
> Fri Sep  4 17:45:29 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 41 0 4184 0 0 0
> ioband1: 0 37768752 ioband 1 -1 173 0 20096 0 0 0
>
> Fri Sep  4 17:45:37 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 1605 23 197976 0 0 0
> ioband1: 0 37768752 ioband 1 -1 4640 1 583168 0 0 0
>
> Fri Sep  4 17:45:45 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 3650 47 453488 0 0 0
> ioband1: 0 37768752 ioband 1 -1 8572 1 1079144 0 0 0
>
> Fri Sep  4 17:45:51 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 5111 68 635696 0 0 0
> ioband1: 0 37768752 ioband 1 -1 11587 1 1459544 0 0 0
>
> Fri Sep  4 17:45:53 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 5698 73 709272 0 0 0
> ioband1: 0 37768752 ioband 1 -1 12503 1 1575112 0 0 0
>
> Fri Sep  4 17:45:57 EDT 2009
> ioband2: 0 40355280 ioband 1 -1 6790 87 845808 0 0 0
> ioband1: 0 37768752 ioband 1 -1 14395 2 1813680 0 0 0
>
> Note, it took me more than 20 seconds (since I started the threds) to
> reach close to desired fairness level. That's too long a duration.

We prioritized reducing throughput loss over reducing this duration in
the design of dm-ioband. Of course, it is possible to make a new policy
which reduces the duration.

Thanks,
Ryo Tsuruta

> Again random readers or small file readers are compeltely out of picture for
> any kind of fairness or are not protected at all with dm-ioband
> controller.
>
> I think there are serious issues with the notion of fairness and what kind of
> isolation dm-ioband provide between groups and it should be looked into.
>
> Thanks
> Vivek

^ permalink raw reply [flat|nested] 80+ messages in thread
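A quick way to check Ryo's suggestion against the earlier single-reader
result is to keep the higher-weight group busy with more than one reader.
The sketch below is one such re-run; the file names simply follow the
naming scheme of the test script quoted above, so adjust them to whatever
test files actually exist on the two mounts.

#!/bin/sh
# Re-test with two readers in the weight-200 group (ioband1) and one reader
# in the weight-100 group (ioband2), as suggested above.
sync; echo 3 > /proc/sys/vm/drop_caches

dd if=/mnt1/ioband1.1.0 of=/dev/null &
dd if=/mnt1/ioband1.2.0 of=/dev/null &
dd if=/mnt2/ioband2.1.0 of=/dev/null &

iostat -k 1 60 > rerun.log
killall -ws TERM dd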
* Re: Regarding dm-ioband tests
  2009-09-07 11:02 ` Ryo Tsuruta
@ 2009-09-07 13:53 ` Rik van Riel
  -1 siblings, 0 replies; 80+ messages in thread
From: Rik van Riel @ 2009-09-07 13:53 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer, balbir

Ryo Tsuruta wrote:

> However, if you want to get fairness in a case like this, a new
> bandwidth control policy which controls accurately according to
> assigned weights can be added to dm-ioband.

Are you saying that dm-ioband is purposely unfair,
until a certain load level is reached?

> We regarded reducing throughput loss rather than reducing duration
> as the design of dm-ioband. Of course, it is possible to make a new
> policy which reduces duration.

... while also reducing overall system throughput
by design?

Why are you even bothering to submit this to the
linux-kernel mailing list, when there is a codebase
available that has no throughput or fairness regressions?
(Vivek's io scheduler based io controller)

--
All rights reversed.

^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-07 13:53 ` Rik van Riel
@ 2009-09-08  3:01 ` Ryo Tsuruta
  -1 siblings, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-08  3:01 UTC (permalink / raw)
To: riel
Cc: vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer, balbir

Hi Rik,

Rik van Riel <riel@redhat.com> wrote:
> Ryo Tsuruta wrote:
>
> > However, if you want to get fairness in a case like this, a new
> > bandwidth control policy which controls accurately according to
> > assigned weights can be added to dm-ioband.
>
> Are you saying that dm-ioband is purposely unfair,
> until a certain load level is reached?

It is not unfair; dm-ioband's weight policy is intentionally designed
to use bandwidth efficiently: it tries to give the spare bandwidth of
inactive groups to active groups.

> > We regarded reducing throughput loss rather than reducing duration
> > as the design of dm-ioband. Of course, it is possible to make a new
> > policy which reduces duration.
>
> ... while also reducing overall system throughput
> by design?

I think such a policy would reduce system throughput compared to the
current implementation, because fine-grained control causes more
overhead.

> Why are you even bothering to submit this to the
> linux-kernel mailing list, when there is a codebase
> available that has no throughput or fairness regressions?
> (Vivek's io scheduler based io controler)

I think there are some advantages to dm-ioband. That's why I posted
dm-ioband to the mailing list.

- dm-ioband supports not only a proportional weight policy but also a
  rate limiting policy. Besides, new policies can be added to dm-ioband
  if a user wants to control bandwidth with his or her own policy.
- The dm-ioband driver can be replaced without stopping the system by
  using device-mapper's facility. It's easy to maintain.
- dm-ioband can be used without cgroup. (I remember Vivek said it's not
  an advantage.)

Thanks,
Ryo Tsuruta

^ permalink raw reply [flat|nested] 80+ messages in thread
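The "replaced without stopping the system" point rests on the generic
device-mapper suspend/load/resume cycle rather than on anything specific to
dm-ioband. The sketch below shows what that looks like for changing a
group's weight on the fly; the device name /dev/sdX1 is a placeholder and
the table arguments merely follow the ioband target syntax used elsewhere
in this thread.

#!/bin/sh
# Swap in a new ioband table (weight changed to 300 here) without tearing
# down the device or unmounting the filesystem on top of it.
dmsetup suspend ioband1
echo "0 $(blockdev --getsize /dev/sdX1) ioband /dev/sdX1 1 0 0 none" \
     "weight 0 :300" | dmsetup load ioband1
dmsetup resume ioband1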
* Re: Regarding dm-ioband tests
  2009-09-08  3:01 ` Ryo Tsuruta
@ 2009-09-08  3:22 ` Balbir Singh
  -1 siblings, 0 replies; 80+ messages in thread
From: Balbir Singh @ 2009-09-08  3:22 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: riel, vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer

* Ryo Tsuruta <ryov@valinux.co.jp> [2009-09-08 12:01:19]:

> I think there are some advantages to dm-ioband. That's why I post
> dm-ioband to the mailing list.
>
> - dm-ioband supports not only proportional weight policy but also rate
>   limiting policy. Besides, new policies can be added to dm-ioband if
>   a user wants to control bandwidth by his or her own policy.
> - The dm-ioband driver can be replaced without stopping the system by
>   using device-mapper's facility. It's easy to maintain.
> - dm-ioband can use without cgroup. (I remember Vivek said it's not an
>   advantage.)

But don't you need page_cgroup for IO tracking?

--
Balbir

^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-08  3:22 ` Balbir Singh
@ 2009-09-08  5:05 ` Ryo Tsuruta
  -1 siblings, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-08  5:05 UTC (permalink / raw)
To: balbir
Cc: riel, vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer

Hi Balbir,

Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * Ryo Tsuruta <ryov@valinux.co.jp> [2009-09-08 12:01:19]:
>
> > I think there are some advantages to dm-ioband. That's why I post
> > dm-ioband to the mailing list.
> >
> > - dm-ioband supports not only proportional weight policy but also rate
> >   limiting policy. Besides, new policies can be added to dm-ioband if
> >   a user wants to control bandwidth by his or her own policy.
> > - The dm-ioband driver can be replaced without stopping the system by
> >   using device-mapper's facility. It's easy to maintain.
> > - dm-ioband can use without cgroup. (I remember Vivek said it's not an
> >   advantage.)
>
> But don't you need page_cgroup for IO tracking?

It is not necessary when controlling bandwidth on a per-partition basis
or on an IO thread basis, such as for the Xen blkback kernel thread.

Here are configuration examples.
http://sourceforge.net/apps/trac/ioband/wiki/dm-ioband/man/examples

Thanks,
Ryo Tsuruta

^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-08  5:05 ` Ryo Tsuruta
@ 2009-09-08 13:49 ` Vivek Goyal
  -1 siblings, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-08 13:49 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: balbir, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer

On Tue, Sep 08, 2009 at 02:05:16PM +0900, Ryo Tsuruta wrote:
> Hi Balbir,
>
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > * Ryo Tsuruta <ryov@valinux.co.jp> [2009-09-08 12:01:19]:
> >
> > > I think there are some advantages to dm-ioband. That's why I post
> > > dm-ioband to the mailing list.
> > >
> > > - dm-ioband supports not only proportional weight policy but also rate
> > >   limiting policy. Besides, new policies can be added to dm-ioband if
> > >   a user wants to control bandwidth by his or her own policy.
> > > - The dm-ioband driver can be replaced without stopping the system by
> > >   using device-mapper's facility. It's easy to maintain.
> > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an
> > >   advantage.)
> >
> > But don't you need page_cgroup for IO tracking?
>
> It is not necessary when controlling bandwidth on a per partition
> basis or on a IO thread basis like Xen blkback kernel thread.
>
> Here are configration examples.
> http://sourceforge.net/apps/trac/ioband/wiki/dm-ioband/man/examples
>

For partition-based control, where a thread or group of threads is doing
IO to a specific partition, why can't you simply create different cgroups
for each partition and move the threads into those groups?

			root
		      /   |   \
		  sda1  sda2  sda3

Above are three groups; move the threads doing IO into those groups and
the problem is solved. In fact, that's what one will do for KVM virtual
machines: move all the qemu helper threads doing IO for a virtual machine
instance into a specific group and control the IO.

Why do you have to come up with an additional, complicated grouping
mechanism?

Thanks
Vivek

^ permalink raw reply [flat|nested] 80+ messages in thread
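A minimal sketch of the per-partition grouping described above, using the
standard cgroup filesystem, might look as follows. The controller name "io"
and the io.weight file are assumptions about the io-controller patches being
discussed (the exact names depend on the patch set); the mkdir/tasks part is
plain cgroup usage, and the PID is a placeholder.

#!/bin/sh
# Create one cgroup per partition and move the IO threads into them.
# "io" and "io.weight" are assumed names; adjust to the controller in use.
mount -t cgroup -o io none /cgroup

mkdir /cgroup/sda1 /cgroup/sda2 /cgroup/sda3
echo 200 > /cgroup/sda1/io.weight
echo 100 > /cgroup/sda2/io.weight
echo 100 > /cgroup/sda3/io.weight

# Move a thread that does IO to sda1 into the sda1 group (placeholder PID).
echo 1234 > /cgroup/sda1/tasks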
* Re: Regarding dm-ioband tests
  2009-09-08 13:49 ` Vivek Goyal
@ 2009-09-09  5:17 ` Ryo Tsuruta
  -1 siblings, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-09  5:17 UTC (permalink / raw)
To: vgoyal
Cc: balbir, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> > It is not necessary when controlling bandwidth on a per partition
> > basis or on a IO thread basis like Xen blkback kernel thread.
> >
> > Here are configration examples.
> > http://sourceforge.net/apps/trac/ioband/wiki/dm-ioband/man/examples
> >
>
> For partition based control, where a thread or group of threads is doing
> IO to a specific parition, why can't you simply create different cgroups
> for each partition and move threads in those partitions.
>
>			root
>		      /   |   \
>		  sda1  sda2  sda3
>
> Above are three groups and move threads doing IO into those groups and
> problem is solved. In fact that's what one will do for KVM virtual
> machines. Move all the qemu helper threds doing IO for a virtual machine
> instance into a specific group and control the IO.
>
> Why do you have to come up with additional complicated grouping mechanism?

I don't get why you think it's complicated; your io-controller also
provides the same grouping mechanism, which assigns bandwidth per device
via the io.policy file. What's the difference? The thread grouping
mechanism is also nothing special; it is the same concept as cgroup.
These mechanisms are necessary to make use of dm-ioband on systems which
don't support cgroup, such as RHEL 5.x. As you know, dm-ioband also
supports cgroup, so the configurations you mentioned above can be applied
with dm-ioband as well. I think it's not bad to have several ways to set
things up.

Thanks,
Ryo Tsuruta

^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-09  5:17 ` Ryo Tsuruta
@ 2009-09-09 13:34 ` Vivek Goyal
  -1 siblings, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-09 13:34 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: balbir, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer

On Wed, Sep 09, 2009 at 02:17:48PM +0900, Ryo Tsuruta wrote:
> Hi Vivek,
>
> Vivek Goyal <vgoyal@redhat.com> wrote:
> > > It is not necessary when controlling bandwidth on a per partition
> > > basis or on a IO thread basis like Xen blkback kernel thread.
> > >
> > > Here are configration examples.
> > > http://sourceforge.net/apps/trac/ioband/wiki/dm-ioband/man/examples
> > >
> >
> > For partition based control, where a thread or group of threads is doing
> > IO to a specific parition, why can't you simply create different cgroups
> > for each partition and move threads in those partitions.
> >
> >			root
> >		      /   |   \
> >		  sda1  sda2  sda3
> >
> > Above are three groups and move threads doing IO into those groups and
> > problem is solved. In fact that's what one will do for KVM virtual
> > machines. Move all the qemu helper threds doing IO for a virtual machine
> > instance into a specific group and control the IO.
> >
> > Why do you have to come up with additional complicated grouping mechanism?
>
> I don't get why you think it's complicated, your io-controller also
> provides the same grouping machanism which assigns bandwidth per
> device by io.policy file. What's the difference?

I am using a purely cgroup-based interface. This makes life easier for
user space tools and libraries like libcgroup, and should also help
libvirt. Now they can treat all the resource controllers in the kernel in
a uniform way (through the cgroup interface). Within cgroups, there are
controller-specific files, and libcgroup is aware of that. So libcgroup
can be modified to take care of the special syntax of the io.policy file.
But this is not too big a deviation in the overall picture.

dm-ioband is coming up with a whole new way of configuring and managing
groups, and now these user space tools will have to be modified to take
care of this new way only for the io controller. The point is there is no
need.

You seem to be introducing this new interface because you want to use
this module with RHEL5, which does not have cgroup support. It is a
hard-to-sell argument that upstream should also introduce a new interface
just because one wants to use the module with older kernel releases which
did not have cgroup support.

Taking this new interface solves your case but will make life harder for
user space tools, libraries and applications making use of cgroups and
the various resource controllers.

> The thread grouping
> machianism is also not special, it is the same concept as cgroup.
> These mechanisms are necessary to make use of dm-ioband on the systems
> which doesn't support cgroup such as RHEL 5.x. As you know, dm-ioband
> also supports cgroup, the configurations you mentioned above can apply
> to the system by dm-ioband.

dm-ioband also supports cgroup, but there is an additional step required,
and that is passing all the cgroup ids to the various ioband devices.
This requires knowledge of all the ioband devices, how they have been
created, and usage of the dm tools. The only place where it helps a bit
is that once the configuration is done, one can move tasks between groups
arbitrarily instead of grouping them by pid, gid, etc.

So it still does not solve the issue of dm-ioband being so different from
the rest of the controllers and introducing a new interface for the
creation and management of groups.

> I think it's not bad to have several ways
> to setup.

It is not bad if there is a proper justification for a new interface and
why the existing standard mechanism does not meet that requirement. In
this case you are saying that in general the cgroup mechanism is
sufficient to take care of grouping of tasks, but it is not available in
older kernels, hence let us introduce a new interface in upstream
kernels. I think this does not work. It brings in the unnecessary
overhead of maintaining another interface upstream, and upstream does not
benefit from this interface.

Thanks
Vivek

^ permalink raw reply [flat|nested] 80+ messages in thread
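For what it's worth, the "uniform interface" argument above is about tooling
along these lines: once every controller sits behind the cgroup filesystem,
the libcgroup utilities can drive them all the same way. The sketch below
uses the real cgcreate/cgset/cgexec tools, but the controller name "io" and
the io.weight parameter are assumptions about the io-controller patches, not
something specified in the thread; the file names reuse the test files from
earlier in the thread.

#!/bin/sh
# Uniform cgroup-style management via libcgroup tools; "io" and "io.weight"
# are assumed names for the io controller and its weight knob.
cgcreate -g io:/db
cgcreate -g io:/backup
cgset -r io.weight=200 db
cgset -r io.weight=100 backup

# Launch workloads directly inside their groups.
cgexec -g io:db dd if=/mnt1/ioband1.1.0 of=/dev/null &
cgexec -g io:backup dd if=/mnt2/ioband2.1.0 of=/dev/null &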
* Re: Regarding dm-ioband tests
  2009-09-08  3:01 ` Ryo Tsuruta
@ 2009-09-08 13:42 ` Vivek Goyal
  -1 siblings, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-08 13:42 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman,
    guijianfeng, jmoyer, balbir

On Tue, Sep 08, 2009 at 12:01:19PM +0900, Ryo Tsuruta wrote:
> Hi Rik,
>
> Rik van Riel <riel@redhat.com> wrote:
> > Ryo Tsuruta wrote:
> >
> > > However, if you want to get fairness in a case like this, a new
> > > bandwidth control policy which controls accurately according to
> > > assigned weights can be added to dm-ioband.
> >
> > Are you saying that dm-ioband is purposely unfair,
> > until a certain load level is reached?
>
> Not unfair, dm-ioband(weight policy) is intentionally designed to
> use bandwidth efficiently, weight policy tries to give spare bandwidth
> of inactive groups to active groups.
>

This group is running a sequential reader. How can you call it an
inactive group?

I think the whole problem is that you have not taken idling into
account, the way CFQ does.

Your solution seems to be designed only for processes doing bulk IO over
a very long period of time. I think it limits the usefulness of the
solution severely.

> > > We regarded reducing throughput loss rather than reducing duration
> > > as the design of dm-ioband. Of course, it is possible to make a new
> > > policy which reduces duration.
> >
> > ... while also reducing overall system throughput
> > by design?
>
> I think it reduces system throughput compared to the current
> implementation, because it causes more overhead to do fine grained
> control.
>
> > Why are you even bothering to submit this to the
> > linux-kernel mailing list, when there is a codebase
> > available that has no throughput or fairness regressions?
> > (Vivek's io scheduler based io controler)
>
> I think there are some advantages to dm-ioband. That's why I post
> dm-ioband to the mailing list.
>
> - dm-ioband supports not only proportional weight policy but also rate
>   limiting policy. Besides, new policies can be added to dm-ioband if
>   a user wants to control bandwidth by his or her own policy.

I think we can easily extend the io scheduler based controller to also
support a max-rate-per-group policy. That should not be too hard. It is
only a matter of keeping track of the io rate per group and, if a group
is exceeding its rate, scheduling it out and moving on to the next group.

I can do that once the proportional weight solution is stabilized and
gets merged.

So it's not an advantage of dm-ioband.

> - The dm-ioband driver can be replaced without stopping the system by
>   using device-mapper's facility. It's easy to maintain.

We talked about this point in the past also. In the io scheduler based
controller, just move all the tasks to the root group and you get a
system not doing any io control.

By the way, why would one want to do that?

So this is also not an advantage.

> - dm-ioband can use without cgroup. (I remember Vivek said it's not an
>   advantage.)

I think this is more of a disadvantage than an advantage. We have very
well defined cgroup functionality in the kernel to group tasks. Now you
are coming up with your own method of grouping tasks, which will make
life even more confusing for users and application writers.

I don't understand what core requirement of yours is not met by the io
scheduler based io controller -- the range policy control you have
implemented recently? I don't think that removing the dm-ioband module
dynamically is a core requirement. Also, whatever you can do with the
additional grouping mechanism, you can do with cgroup as well.

So if there is any core functionality of yours which is not fulfilled by
the io scheduler based controller, please let me know. I will be happy to
look into it and try to provide that feature. But looking at the list
above, I am not convinced that any of it is a compelling argument for
dm-ioband inclusion.

Thanks
Vivek

^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-08 13:42 ` Vivek Goyal
@ 2009-09-08 16:30 ` Nauman Rafique
  -1 siblings, 0 replies; 80+ messages in thread
From: Nauman Rafique @ 2009-09-08 16:30 UTC (permalink / raw)
To: Vivek Goyal
Cc: Ryo Tsuruta, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm,
    guijianfeng, jmoyer, balbir

On Tue, Sep 8, 2009 at 6:42 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, Sep 08, 2009 at 12:01:19PM +0900, Ryo Tsuruta wrote:
>> Hi Rik,
>>
>> Rik van Riel <riel@redhat.com> wrote:
>> > Ryo Tsuruta wrote:
>> >
>> > > However, if you want to get fairness in a case like this, a new
>> > > bandwidth control policy which controls accurately according to
>> > > assigned weights can be added to dm-ioband.
>> >
>> > Are you saying that dm-ioband is purposely unfair,
>> > until a certain load level is reached?
>>
>> Not unfair, dm-ioband(weight policy) is intentionally designed to
>> use bandwidth efficiently, weight policy tries to give spare bandwidth
>> of inactive groups to active groups.
>>
>
> This group is running a sequential reader. How can you call it an inactive
> group?
>
> I think that whole problem is that like CFQ you have not taken care of
> idling into account.

I think this is probably the key deal breaker. dm-ioband has no mechanism
to anticipate or idle for a reader task. Without such a mechanism, a
proportional division scheme cannot work for tasks doing reads. Most
readers do not send down more than one IO at a time, and they do not send
another until the previous one is complete. Anticipation helps in this
case, as we would wait for the task to send down a new IO before we expire
its timeslice. Without anticipation, we would serve the one IO from the
reader and then go on to serve IOs from other tasks. When the reader
finally gets around to sending its next IO, it has to wait behind other
IOs that have been sent down in the meantime.

IO schedulers in the block layer have anticipation built into them, so a
proportional scheduling scheme at that layer does not have to repeat the
logic or data structures for anticipation. In fact, a rate limiting
mechanism like dm-ioband can potentially break the anticipation logic in
the IO schedulers by queuing up the IOs at an upper layer while the
scheduler in the block layer could have been anticipating them.

>
> Your solution seems to be designed only for processes doing bulk IO over
> a very long period of time. I think it limits the usefulness of solution
> severely.
>
>> > > We regarded reducing throughput loss rather than reducing duration
>> > > as the design of dm-ioband. Of course, it is possible to make a new
>> > > policy which reduces duration.
>> >
>> > ... while also reducing overall system throughput
>> > by design?
>>
>> I think it reduces system throughput compared to the current
>> implementation, because it causes more overhead to do fine grained
>> control.
>>
>> > Why are you even bothering to submit this to the
>> > linux-kernel mailing list, when there is a codebase
>> > available that has no throughput or fairness regressions?
>> > (Vivek's io scheduler based io controler)
>>
>> I think there are some advantages to dm-ioband. That's why I post
>> dm-ioband to the mailing list.
>>
>> - dm-ioband supports not only proportional weight policy but also rate
>>   limiting policy. Besides, new policies can be added to dm-ioband if
>>   a user wants to control bandwidth by his or her own policy.
>
> I think we can easily extent io scheduler based controller to also support
> max rate per group policy also. That should not be too hard. It is a
> matter of only keeping track of io rate per group and if a group is
> exceeding the rate, then schedule it out and move on to next group.

At Google, we have implemented a rate limiting mechanism on top of Vivek's
patches and have been testing it. But I feel the patch set maintained by
Vivek is pretty big already. Once we have those patches merged, we can
introduce more functionality.

>
> I can do that once proportional weight solution is stablized and gets
> merged.
>
> So its not an advantage of dm-ioband.
>
>> - The dm-ioband driver can be replaced without stopping the system by
>>   using device-mapper's facility. It's easy to maintain.
>
> We talked about this point in the past also. In io scheduler based
> controller, just move all the tasks to root group and you got a system
> not doing any io control.
>
> By the way why would one like to do that?
>
> So this is also not an advantage.
>
>> - dm-ioband can use without cgroup. (I remember Vivek said it's not an
>>   advantage.)
>
> I think this is more of a disadvantage than advantage. We have a very well
> defined functionality of cgroup in kernel to group the tasks. Now you are
> coming up with your own method of grouping the tasks which will make life
> even more confusing for users and application writers.
>
> I don't understand what is that core requirement of yours which is not met
> by io scheduler based io controller. range policy control you have
> implemented recently. I don't think that removing dm-ioband module
> dynamically is core requirement. Also whatever you can do with additional
> grouping mechanism, you can do with cgroup also.
>
> So if there is any of your core functionality which is not fulfilled by
> io scheduler based controller, please let me know. I will be happy to look
> into it and try to provide that feature. But looking at above list, I am
> not convinced that any of the above is a compelling argument for dm-ioband
> inclusion.
>
> Thanks
> Vivek
>

^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-08 16:30 ` Nauman Rafique @ 2009-09-08 16:47 ` Rik van Riel -1 siblings, 0 replies; 80+ messages in thread From: Rik van Riel @ 2009-09-08 16:47 UTC (permalink / raw) To: Nauman Rafique Cc: Vivek Goyal, Ryo Tsuruta, linux-kernel, dm-devel, jens.axboe, agk, akpm, guijianfeng, jmoyer, balbir Nauman Rafique wrote: > I think this is probably the key deal breaker. dm-ioband has no > mechanism to anticipate or idle for a reader task. Without such a > mechanism, a proportional division scheme cannot work for tasks doing > reads. That is a really big issue, since most reads tend to be synchronous (the application is waiting for the read), while many writes are not (the application is doing something else while the data is written). Having writes take precedence over reads will really screw over the readers, while not benefitting the writers all that much. -- All rights reversed. ^ permalink raw reply [flat|nested] 80+ messages in thread
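The synchronous-versus-asynchronous distinction is easy to demonstrate with dd alone. A small sketch, reusing the file names from this thread (the output file names here are arbitrary): the buffered write returns almost as soon as the data is in the page cache, the read has to wait for the device on every request, and forcing the write out with conv=fdatasync shows its real device-side cost.

--------------------------------------------------------------
echo 3 > /proc/sys/vm/drop_caches

# buffered write: returns quickly, data only reaches the page cache
time dd if=/dev/zero of=/mnt/sdd1/write-async-test bs=4K count=65536

# read of the same amount: the application waits for the disk
time dd if=/mnt/sdd1/testzerofile1 of=/dev/null bs=4K count=65536

# the same write forced to disk shows the cost the writer normally hides
time dd if=/dev/zero of=/mnt/sdd1/write-sync-test bs=4K count=65536 conv=fdatasync
--------------------------------------------------------------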
* Re: Regarding dm-ioband tests 2009-09-08 16:47 ` Rik van Riel @ 2009-09-08 17:54 ` Vivek Goyal -1 siblings, 0 replies; 80+ messages in thread From: Vivek Goyal @ 2009-09-08 17:54 UTC (permalink / raw) To: Rik van Riel Cc: Nauman Rafique, Ryo Tsuruta, linux-kernel, dm-devel, jens.axboe, agk, akpm, guijianfeng, jmoyer, balbir On Tue, Sep 08, 2009 at 12:47:33PM -0400, Rik van Riel wrote: > Nauman Rafique wrote: > >> I think this is probably the key deal breaker. dm-ioband has no >> mechanism to anticipate or idle for a reader task. Without such a >> mechanism, a proportional division scheme cannot work for tasks doing >> reads. > > That is a really big issue, since most reads tend to be synchronous > (the application is waiting for the read), while many writes are not > (the application is doing something else while the data is written). > > Having writes take precedence over reads will really screw over the > readers, while not benefitting the writers all that much. > I ran a test to show how readers can be starved in certain cases. I launched one reader and three writers. I ran this test twice. First without dm-ioband and then with dm-ioband. Following are a few lines from the script to launch readers and writers. ************************************************************** sync echo 3 > /proc/sys/vm/drop_caches # Launch writers on sdd2 dd if=/dev/zero of=/mnt/sdd2/writezerofile1 bs=4K count=262144 & # Launch writers on sdd1 dd if=/dev/zero of=/mnt/sdd1/writezerofile1 bs=4K count=262144 & dd if=/dev/zero of=/mnt/sdd1/writezerofile2 bs=4K count=262144 & echo "sleeping for 5 seconds" sleep 5 # launch reader on sdd1 time dd if=/mnt/sdd1/testzerofile1 of=/dev/zero & echo "launched reader $!" ********************************************************************* Without dm-ioband, the reader finished in roughly 5 seconds. 289533952 bytes (290 MB) copied, 5.16765 s, 56.0 MB/s real 0m5.300s user 0m0.098s sys 0m0.492s With dm-ioband, the reader took more than 2 minutes to finish. 289533952 bytes (290 MB) copied, 122.386 s, 2.4 MB/s real 2m2.569s user 0m0.107s sys 0m0.548s I had created ioband1 on /dev/sdd1 and ioband2 on /dev/sdd2 with weights 200 and 100 respectively. Thanks Vivek ^ permalink raw reply [flat|nested] 80+ messages in thread
* ioband: Writer starves reader even without competitors (Re: Regarding dm-ioband tests) 2009-09-08 17:54 ` Vivek Goyal @ 2009-09-15 23:37 ` Vivek Goyal -1 siblings, 0 replies; 80+ messages in thread From: Vivek Goyal @ 2009-09-15 23:37 UTC (permalink / raw) To: Ryo Tsuruta Cc: Nauman Rafique, linux-kernel, dm-devel, jens.axboe, agk, akpm, guijianfeng, jmoyer, balbir, Rik Van Riel On Tue, Sep 08, 2009 at 01:54:00PM -0400, Vivek Goyal wrote: [..] > I ran a test to show how readers can be starved in certain cases. I launched > one reader and three writers. I ran this test twice. First without dm-ioband > and then with dm-ioband. > > Following are a few lines from the script to launch readers and writers. > > ************************************************************** > sync > echo 3 > /proc/sys/vm/drop_caches > > # Launch writers on sdd2 > dd if=/dev/zero of=/mnt/sdd2/writezerofile1 bs=4K count=262144 & > > # Launch writers on sdd1 > dd if=/dev/zero of=/mnt/sdd1/writezerofile1 bs=4K count=262144 & > dd if=/dev/zero of=/mnt/sdd1/writezerofile2 bs=4K count=262144 & > > echo "sleeping for 5 seconds" > sleep 5 > > # launch reader on sdd1 > time dd if=/mnt/sdd1/testzerofile1 of=/dev/zero & > echo "launched reader $!" > ********************************************************************* > > Without dm-ioband, the reader finished in roughly 5 seconds. > > 289533952 bytes (290 MB) copied, 5.16765 s, 56.0 MB/s > real 0m5.300s > user 0m0.098s > sys 0m0.492s > > With dm-ioband, the reader took more than 2 minutes to finish. > > 289533952 bytes (290 MB) copied, 122.386 s, 2.4 MB/s > > real 2m2.569s > user 0m0.107s > sys 0m0.548s > > I had created ioband1 on /dev/sdd1 and ioband2 on /dev/sdd2 with weights > 200 and 100 respectively. Hi Ryo, I notice that within a single ioband device, a single writer starves the reader even without any competitor groups being present. I ran the following two tests with and without dm-ioband devices. Test1 ==== Try to use fio for a sequential reader job. First fio will lay out the file and do the write operation. While those writes are going on, try to do ls on that partition and observe the latency of the ls operation. with dm-ioband (ls test) ------------------------ # cd /mnt/sdd2 # time ls real 0m9.483s user 0m0.000s sys 0m0.002s without dm-ioband (ls test) --------------------------- # cd /mnt/sdd2 # time ls 256M-file1 256M-file5 2G-file1 2G-file5 writefile1 writezerofile 256M-file2 256M-file6 2G-file2 files writefile2 256M-file3 256M-file7 2G-file3 fio writefile3 256M-file4 256M-file8 2G-file4 lost+found writefile4 real 0m0.067s user 0m0.000s sys 0m0.002s Notice the time the simple "ls" operation took in the two cases. Test2 ===== Same case where fio is laying out a file; then try to read some small files on that partition at an interval of 1 second. small file read with dm-ioband ------------------------------ [root@chilli fairness-tests]# ./small-file-read.sh file # 0, plain reading it took: 0.24 seconds file # 1, plain reading it took: 13.40 seconds file # 2, plain reading it took: 6.27 seconds file # 3, plain reading it took: 13.84 seconds file # 4, plain reading it took: 5.63 seconds small file read without dm-ioband ================================== [root@chilli fairness-tests]# ./small-file-read.sh file # 0, plain reading it took: 0.04 seconds file # 1, plain reading it took: 0.03 seconds file # 2, plain reading it took: 0.04 seconds file # 3, plain reading it took: 0.03 seconds file # 4, plain reading it took: 0.03 seconds Notice how the small file read latencies have shot up. Looks like a single writer is completely starving a reader even without any IO going on in any of the other groups. setup ===== I created two ioband devices of weight 100 each on partitions /dev/sdd1 and /dev/sdd2 respectively. I am doing IO only on partition /dev/sdd2 (ioband2). Following is the fio job script. [seqread] runtime=60 rw=read size=2G directory=/mnt/sdd2/fio/ numjobs=1 group_reporting Following is the small file read script. echo 3 > /proc/sys/vm/drop_caches for ((i=0;i<5;i++)); do printf "file #%4d, plain reading it took: " $i /usr/bin/time -f "%e seconds" cat /mnt/sdd2/files/$i >/dev/null sleep 1 done Thanks Vivek ^ permalink raw reply [flat|nested] 80+ messages in thread
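The small file read script above assumes that /mnt/sdd2/files/0 through /mnt/sdd2/files/4 already exist; the thread does not show how they were created or how big they are. A plausible layout step, with the file size picked arbitrarily, could be:

--------------------------------------------------------------
mkdir -p /mnt/sdd2/files
for ((i=0;i<5;i++)); do
    dd if=/dev/zero of=/mnt/sdd2/files/$i bs=1M count=16
done
sync
--------------------------------------------------------------

The fio job itself is started with a plain "fio <jobfile>" in one shell while the read script runs in another.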
* Re: ioband: Writer starves reader even without competitors 2009-09-15 23:37 ` Vivek Goyal (?) @ 2009-09-16 12:08 ` Ryo Tsuruta -1 siblings, 0 replies; 80+ messages in thread From: Ryo Tsuruta @ 2009-09-16 12:08 UTC (permalink / raw) To: vgoyal Cc: nauman, linux-kernel, dm-devel, jens.axboe, agk, akpm, guijianfeng, jmoyer, balbir, riel Hi Vivek, Vivek Goyal <vgoyal@redhat.com> wrote: > Hi Ryo, > > I notice that within a single ioband device, a single writer starves the > reader even without any competitor groups being present. > > I ran the following two tests with and without dm-ioband devices. Thank you again for testing dm-ioband. I got similar results and am investigating now. I'll let you know if I find something. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-08 13:42 ` Vivek Goyal (?) (?) @ 2009-09-08 17:06 ` Dhaval Giani 2009-09-09 6:05 ` Ryo Tsuruta -1 siblings, 1 reply; 80+ messages in thread From: Dhaval Giani @ 2009-09-08 17:06 UTC (permalink / raw) To: Vivek Goyal Cc: Ryo Tsuruta, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an > > advantage.) > > I think this is more of a disadvantage than advantage. We have a very well > defined functionality of cgroup in kernel to group the tasks. Now you are > coming up with your own method of grouping the tasks which will make life > even more confusing for users and application writers. > I would tend to agree with this. With other resource management controllers using cgroups, having dm-ioband use something different will require a different set of userspace tools/libraries to be used. Something that will severely limit its usefulness from a programmer's perspective. thanks, -- regards, Dhaval ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-08 17:06 ` Regarding dm-ioband tests Dhaval Giani @ 2009-09-09 6:05 ` Ryo Tsuruta 0 siblings, 0 replies; 80+ messages in thread From: Ryo Tsuruta @ 2009-09-09 6:05 UTC (permalink / raw) To: dhaval Cc: vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir Hi, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an > > > advantage.) > > > > I think this is more of a disadvantage than advantage. We have a very well > > defined functionality of cgroup in kernel to group the tasks. Now you are > > coming up with your own method of grouping the tasks which will make life > > even more confusing for users and application writers. I know that cgroup is a very well defined functionality, that is why dm-ioband also supports throttling per cgroup. But how are we supposed to do throttling on the system which doesn't support cgroup? As I wrote in another mail to Vivek, I would like to make use of dm-ioband on RHEL 5.x. And I don't think that the grouping methods are complicated, just stack a new device on the existing device and assign bandwidth to it, that is the same method as other device-mapper targets, if you would like to assign bandwidth per thread, then register the thread's ID to the device and assign bandwidth to it as well. I don't think it makes users confused. > I would tend to agree with this. With other resource management > controllers using cgroups, having dm-ioband use something different will > require a different set of userspace tools/libraries to be used. > Something that will severely limit its usefulness from a programmer's > perspective. Once we create a dm-ioband device, the device can be configured through the cgroup interface. I think it will not severely limit its usefulness. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 80+ messages in thread
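Concretely, "register the ID to the device and assign bandwidth to it" maps onto dmsetup messages against an already created ioband device. A minimal sketch, using the message verbs (type/attach/weight) and the blkio.id file that appear in Vivek's cgroup example later in this thread; the weight is arbitrary, and the per-thread variant would presumably pass a task ID instead of a cgroup ID, with whatever type keyword dm-ioband defines for that mode.

--------------------------------------------------------------
# assumes an ioband1 device has already been created with dmsetup create
mount -t cgroup -o blkio hier1 /cgroup/ioband
mkdir /cgroup/ioband/test1
group_id=`cat /cgroup/ioband/test1/blkio.id`

dmsetup message ioband1 0 type cgroup           # group by cgroup
dmsetup message ioband1 0 attach $group_id      # register the group's ID
dmsetup message ioband1 0 weight $group_id:100  # assign bandwidth to it
--------------------------------------------------------------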
* Re: Regarding dm-ioband tests 2009-09-09 6:05 ` Ryo Tsuruta (?) @ 2009-09-09 10:51 ` Dhaval Giani 2009-09-10 7:58 ` Ryo Tsuruta -1 siblings, 1 reply; 80+ messages in thread From: Dhaval Giani @ 2009-09-09 10:51 UTC (permalink / raw) To: Ryo Tsuruta Cc: vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir On Wed, Sep 09, 2009 at 03:05:11PM +0900, Ryo Tsuruta wrote: > Hi, > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an > > > > advantage.) > > > > > > I think this is more of a disadvantage than advantage. We have a very well > > > defined functionality of cgroup in kernel to group the tasks. Now you are > > > coming up with your own method of grouping the tasks which will make life > > > even more confusing for users and application writers. > > I know that cgroup is a very well defined functionality, that is why > dm-ioband also supports throttling per cgroup. But how are we supposed > to do throttling on the system which doesn't support cgroup? > As I wrote in another mail to Vivek, I would like to make use of > dm-ioband on RHEL 5.x. Hi Ryo, I am not sure that upstream should really be worrying about RHEL 5.x. cgroups is a relatively mature solution and is available in most (if not all) community distros today. We really should not be looking at another grouping solution if the sole reason is that then dm-ioband can be used on RHEL 5.x. The correct solution would be to maintain a separate patch for RHEL 5.x then and not to burden the upstream kernel. > And I don't think that the grouping methods are complicated, just > stack a new device on the existing device and assign bandwidth to it, > that is the same method as other device-mapper targets, if you would > like to assign bandwidth per thread, then register the thread's ID to > the device and assign bandwidth to it as well. I don't think it makes > users confused. > > > I would tend to agree with this. With other resource management > > controllers using cgroups, having dm-ioband use something different will > > require a different set of userspace tools/libraries to be used. > > Something that will severely limit its usefulness from a programmer's > > perspective. > > Once we create a dm-ioband device, the device can be configured > through the cgroup interface. I think it will not severely limit its > usefulness. > My objection is slightly different. My objection is that there are too many interfaces to do the same thing. Which one of these is the recommended one? Which one is going to be supported? If we say that cgroups is not the preferred interface, do the application developers need to use yet another library for io control along with cpu/memory control? thanks, -- regards, Dhaval ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-09 10:51 ` Dhaval Giani @ 2009-09-10 7:58 ` Ryo Tsuruta 0 siblings, 0 replies; 80+ messages in thread From: Ryo Tsuruta @ 2009-09-10 7:58 UTC (permalink / raw) To: dhaval Cc: vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir Hi, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > I know that cgroup is a very well defined functionality, that is why > > dm-ioband also supports throttling per cgroup. But how are we supposed > > to do throttling on the system which doesn't support cgroup? > > As I wrote in another mail to Vivek, I would like to make use of > > dm-ioband on RHEL 5.x. > > Hi Ryo, > > I am not sure that upstream should really be worrying about RHEL 5.x. > cgroups is a relatively mature solution and is available in most (if not > all) community distros today. We really should not be looking at another > grouping solution if the sole reason is that then dm-ioband can be used > on RHEL 5.x. The correct solution would be to maintain a separate patch > for RHEL 5.x then and not to burden the upstream kernel. RHEL 5.x is not the sole reason for that. > > And I don't think that the grouping methods are complicated, just > > stack a new device on the existing device and assign bandwidth to it, > > that is the same method as other device-mapper targets, if you would > > like to assign bandwidth per thread, then register the thread's ID to > > the device and assign bandwidth to it as well. I don't think it makes > > users confused. > > > > > I would tend to agree with this. With other resource management > > > controllers using cgroups, having dm-ioband use something different will > > > require a different set of userspace tools/libraries to be used. > > > Something that will severely limit its usefulness from a programmer's > > > perspective. > > > > Once we create a dm-ioband device, the device can be configured > > through the cgroup interface. I think it will not severely limit its > > usefulness. > > > > My objection is slightly different. My objection is that there are too > many interfaces to do the same thing. Not too many; there are only two interfaces: device-mapper and cgroup. > Which one of these is the recommended one? I think it's up to users, whichever they like. > Which one is going to be supported? Both the device-mapper and cgroup interfaces will continue to be supported. > If we say that cgroups is not the preferred interface, do the > application developers need to use yet another library for io > control along with cpu/memory control? I don't think that cgroup is not the preferred interface, especially when dm-ioband is used together with the cpu/memory controllers. Once a dm-ioband device is created, all configuration of the device can be done through the cgroup interface like other cgroup subsystems. It does not require a different set of userspace tools/libraries to be used. I think it is natural for dm-ioband to be configurable through device-mapper's interface since dm-ioband is a device-mapper driver, and it extends the use cases rather than limiting usefulness. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-10 7:58 ` Ryo Tsuruta (?) @ 2009-09-11 9:53 ` Dhaval Giani 2009-09-15 15:12 ` Ryo Tsuruta -1 siblings, 1 reply; 80+ messages in thread From: Dhaval Giani @ 2009-09-11 9:53 UTC (permalink / raw) To: Ryo Tsuruta Cc: vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir On Thu, Sep 10, 2009 at 04:58:49PM +0900, Ryo Tsuruta wrote: > Hi, > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > I know that cgroup is a very well defined functionality, that is why > > > dm-ioband also supports throttling per cgroup. But how are we supposed > > > to do throttling on the system which doesn't support cgroup? > > > As I wrote in another mail to Vivek, I would like to make use of > > > dm-ioband on RHEL 5.x. > > > > Hi Ryo, > > > > I am not sure that upstream should really be worrying about RHEL 5.x. > > cgroups is a relatively mature solution and is available in most (if not > > all) community distros today. We really should not be looking at another > > grouping solution if the sole reason is that then dm-ioband can be used > > on RHEL 5.x. The correct solution would be to maintain a separate patch > > for RHEL 5.x then and not to burden the upstream kernel. > > RHEL 5.x is not the sole reason for that. > Could you please enumerate the other reasons for pushing in another grouping mechanism then? (Why can we not resolve them via cgroups?) Thanks, -- regards, Dhaval ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-11 9:53 ` Dhaval Giani @ 2009-09-15 15:12 ` Ryo Tsuruta 0 siblings, 0 replies; 80+ messages in thread From: Ryo Tsuruta @ 2009-09-15 15:12 UTC (permalink / raw) To: dhaval Cc: vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir Hi Dhaval, Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > > I know that cgroup is a very well defined functionality, that is why > > > > dm-ioband also supports throttling per cgroup. But how are we supposed > > > > to do throttling on the system which doesn't support cgroup? > > > > As I wrote in another mail to Vivek, I would like to make use of > > > > dm-ioband on RHEL 5.x. > > > > > > Hi Ryo, > > > > > > I am not sure that upstream should really be worrying about RHEL 5.x. > > > cgroups is a relatively mature solution and is available in most (if not > > > all) community distros today. We really should not be looking at another > > > grouping solution if the sole reason is that then dm-ioband can be used > > > on RHEL 5.x. The correct solution would be to maintain a separate patch > > > for RHEL 5.x then and not to burden the upstream kernel. > > > > RHEL 5.x is not the sole reason for that. > > > > Could you please enumerate the other reasons for pushing in another > grouping mechanism then? (Why can we not resolve them via cgroups?) I'm sorry for late reply. I'm not only pushing in the grouping mechanism by using the dmsetup command. Please understand that dm-ioband also provides cgroup interface and can be configured in the same manner like other cgroup subsystems. Why it is so bad to have multiple ways to configure? I think that it rather gains in flexibility of configurations. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-15 15:12 ` Ryo Tsuruta @ 2009-09-15 15:19 ` Balbir Singh -1 siblings, 0 replies; 80+ messages in thread From: Balbir Singh @ 2009-09-15 15:19 UTC (permalink / raw) To: Ryo Tsuruta Cc: dhaval, vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer * Ryo Tsuruta <ryov@valinux.co.jp> [2009-09-16 00:12:37]: > Hi Dhaval, > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > > > I know that cgroup is a very well defined functionality, that is why > > > > > dm-ioband also supports throttling per cgroup. But how are we supposed > > > > > to do throttling on the system which doesn't support cgroup? > > > > > As I wrote in another mail to Vivek, I would like to make use of > > > > > dm-ioband on RHEL 5.x. > > > > > > > > Hi Ryo, > > > > > > > > I am not sure that upstream should really be worrying about RHEL 5.x. > > > > cgroups is a relatively mature solution and is available in most (if not > > > > all) community distros today. We really should not be looking at another > > > > grouping solution if the sole reason is that then dm-ioband can be used > > > > on RHEL 5.x. The correct solution would be to maintain a separate patch > > > > for RHEL 5.x then and not to burden the upstream kernel. > > > > > > RHEL 5.x is not the sole reason for that. > > > > > > > Could you please enumerate the other reasons for pushing in another > > grouping mechanism then? (Why can we not resolve them via cgroups?) > > I'm sorry for late reply. > > I'm not only pushing in the grouping mechanism by using the dmsetup > command. Please understand that dm-ioband also provides cgroup > interface and can be configured in the same manner like other cgroup > subsystems. > Why it is so bad to have multiple ways to configure? I think that it > rather gains in flexibility of configurations. > The main issue I see is user confusion and distro issues. If a distro compiles cgroups and dmsetup provides both methods, what method do we recommend to end users? Also should system management tool support two configuration mechanisms for the same functionality? -- Balbir ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-15 15:19 ` Balbir Singh @ 2009-09-15 15:58 ` Rik van Riel -1 siblings, 0 replies; 80+ messages in thread From: Rik van Riel @ 2009-09-15 15:58 UTC (permalink / raw) To: balbir Cc: Ryo Tsuruta, dhaval, vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer Balbir Singh wrote: > * Ryo Tsuruta <ryov@valinux.co.jp> [2009-09-16 00:12:37]: >> Why it is so bad to have multiple ways to configure? I think that it >> rather gains in flexibility of configurations. >> > > The main issue I see is user confusion and distro issues. If a distro > compiles cgroups and dmsetup provides both methods, what method > do we recommend to end users? Also should system management tool > support two configuration mechanisms for the same functionality? It gets worse. If the distro sets up things via cgroups and the admin tries to use dmsetup - how does the configuration propagate between the two mechanisms? The sysadmin would expect that any changes made via dmsetup will become visible via the config tools (that use cgroups), too. This will quickly increase the code requirements to ridiculous proportions - or leave sysadmins confused and annoyed. Neither is a good option, IMHO. -- All rights reversed. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-15 15:19 ` Balbir Singh @ 2009-09-15 16:21 ` Ryo Tsuruta -1 siblings, 0 replies; 80+ messages in thread From: Ryo Tsuruta @ 2009-09-15 16:21 UTC (permalink / raw) To: balbir Cc: dhaval, vgoyal, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer Hi Balbir, Balbir Singh <balbir@linux.vnet.ibm.com> wrote: > * Ryo Tsuruta <ryov@valinux.co.jp> [2009-09-16 00:12:37]: > > > Hi Dhaval, > > > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > > > > I know that cgroup is a very well defined functionality, that is why > > > > > > dm-ioband also supports throttling per cgroup. But how are we supposed > > > > > > to do throttling on the system which doesn't support cgroup? > > > > > > As I wrote in another mail to Vivek, I would like to make use of > > > > > > dm-ioband on RHEL 5.x. > > > > > > > > > > Hi Ryo, > > > > > > > > > > I am not sure that upstream should really be worrying about RHEL 5.x. > > > > > cgroups is a relatively mature solution and is available in most (if not > > > > > all) community distros today. We really should not be looking at another > > > > > grouping solution if the sole reason is that then dm-ioband can be used > > > > > on RHEL 5.x. The correct solution would be to maintain a separate patch > > > > > for RHEL 5.x then and not to burden the upstream kernel. > > > > > > > > RHEL 5.x is not the sole reason for that. > > > > > > > > > > Could you please enumerate the other reasons for pushing in another > > > grouping mechanism then? (Why can we not resolve them via cgroups?) > > > > I'm sorry for late reply. > > > > I'm not only pushing in the grouping mechanism by using the dmsetup > > command. Please understand that dm-ioband also provides cgroup > > interface and can be configured in the same manner like other cgroup > > subsystems. > > Why it is so bad to have multiple ways to configure? I think that it > > rather gains in flexibility of configurations. > > > > The main issue I see is user confusion and distro issues. If a distro > compiles cgroups and dmsetup provides both methods, what method > do we recommend to end users? Also should system management tool > support two configuration mechanisms for the same functionality? I think that it is up to users which mechanism they choose to use, and the kind of users who can use the dmsetup or cgroup interface directly will not be confused in such a situation. I also think that management tools are required for end users, and if a distro supports cgroups, I recommend that the management tools configure dm-ioband by using cgroups, because dm-ioband is more usable when used with blkio-cgroup and the memory cgroup. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-09 6:05 ` Ryo Tsuruta @ 2009-09-09 13:57 ` Vivek Goyal -1 siblings, 0 replies; 80+ messages in thread From: Vivek Goyal @ 2009-09-09 13:57 UTC (permalink / raw) To: Ryo Tsuruta Cc: dhaval, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir On Wed, Sep 09, 2009 at 03:05:11PM +0900, Ryo Tsuruta wrote: > Hi, > > Dhaval Giani <dhaval@linux.vnet.ibm.com> wrote: > > > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an > > > > advantage.) > > > > > > I think this is more of a disadvantage than advantage. We have a very well > > > defined functionality of cgroup in kernel to group the tasks. Now you are > > > coming up with your own method of grouping the tasks which will make life > > > even more confusing for users and application writers. > > I know that cgroup is a very well defined functionality, that is why > dm-ioband also supports throttling per cgroup. But how are we supposed > to do throttling on the system which doesn't support cgroup? > As I wrote in another mail to Vivek, I would like to make use of > dm-ioband on RHEL 5.x. I think you need to maintain and support this module out of the kernel tree for older kernels. It does not make much sense to introduce new interfaces to support functionality in older kernels. > And I don't think that the grouping methods are complicated, just > stack a new device on the existing device and assign bandwidth to it, > that is the same method as other device-mapper targets, if you would > like to assign bandwidth per thread, then register the thread's ID to > the device and assign bandwidth to it as well. I don't think it makes > users confused. - First of all, it is more about doing things a new way and not the standard way. Moreover, upstream does not benefit from this new interface. It just stands to lose because of the maintenance overhead and the need to change user space tools to make use of this new interface. - Secondly, personally I think it is more twisted as well. Following is a small script to set up two ioband devices, ioband1 and ioband2, and two additional groups on the ioband1 device using the cgroup interface. *********************************************************************** echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 none" "weight 0 :200" | dmsetup create ioband1 echo "0 $(blockdev --getsize /dev/sdd2) ioband /dev/sdd2 1 0 0 none" "weight 0 :100" | dmsetup create ioband2 mount -t cgroup -o blkio hier1 /cgroup/ioband mkdir /cgroup/ioband/test1 /cgroup/ioband/test2 test1_id=`cat /cgroup/ioband/test1/blkio.id` test2_id=`cat /cgroup/ioband/test2/blkio.id` test1_weight=200 test2_weight=100 dmsetup message ioband1 0 type cgroup dmsetup message ioband1 0 attach $test1_id dmsetup message ioband1 0 attach $test2_id dmsetup message ioband1 0 weight $test1_id:$test1_weight dmsetup message ioband1 0 weight $test2_id:$test2_weight mount /dev/mapper/ioband1 /mnt/sdd1 mount /dev/mapper/ioband2 /mnt/sdd2 ************************************************************************* For the status of the various settings one needs to use the "dmsetup status" and "dmsetup table" commands. Look at the output of these commands with just two groups. Output for all the groups is on a single line. Think of the situation when there are 7-8 groups and how bad it will look.
#dmsetup status ioband2: 0 40355280 ioband 1 -1 105 0 834 1 0 8 ioband1: 0 37768752 ioband 1 -1 105 0 834 1 0 8 2 0 0 0 0 0 0 3 0 0 0 0 0 0 #dmsetup table ioband2: 0 40355280 ioband 8:50 1 4 192 none weight 768 :100 ioband1: 0 37768752 ioband 8:49 1 4 192 cgroup weight 768 :200 2:200 3:100 I find it so hard to interpret those numbers. Everything about a device is exported in a single line. In the cgroup based interface, things are divided nicely among different files. Also, one group shows statistics about that group only and not about all the groups present in the system. It is easier to parse and comprehend. > > > I would tend to agree with this. With other resource management > > controllers using cgroups, having dm-ioband use something different will > > require a different set of userspace tools/libraries to be used. > > Something that will severely limit its usefulness from a programmer's > > perspective. > > Once we create a dm-ioband device, the device can be configured > through the cgroup interface. I think it will not severely limit its > usefulness. To create the device in the first place you need dm tools, and libcgroup needs to learn how to make use of various dm commands. It also needs to learn how to parse the outputs of the "dmsetup table" and "dmsetup status" commands and consolidate that information. This is despite the fact that it is finally using the cgroup interface to group the tasks. But libcgroup still needs to propagate the cgroup id to individual ioband devices. Thanks Vivek ^ permalink raw reply [flat|nested] 80+ messages in thread
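For contrast, the shape of a purely cgroup-file interface from the command line is roughly the following. The file names used here are the blkio controller names that later went into mainline (blkio.weight and blkio.io_service_bytes for the proportional weight mode), not necessarily the names used by the patch set discussed in this thread, so treat this as a sketch of the "one directory per group, one file per setting" layout rather than exact syntax.

--------------------------------------------------------------
mount -t cgroup -o blkio none /cgroup/blkio
mkdir /cgroup/blkio/test1 /cgroup/blkio/test2

echo 200 > /cgroup/blkio/test1/blkio.weight     # per-group weight in its own file
echo 100 > /cgroup/blkio/test2/blkio.weight

# statistics for one group live under that group's directory only
cat /cgroup/blkio/test1/blkio.io_service_bytes
--------------------------------------------------------------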
* Re: Regarding dm-ioband tests
  2009-09-09 13:57 ` Vivek Goyal
@ 2009-09-10  3:06 ` Ryo Tsuruta
  -1 siblings, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-10 3:06 UTC (permalink / raw)
  To: vgoyal
  Cc: dhaval, riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> - Secondly, I personally find it more convoluted. Following is a small script
>   to set up two ioband devices, ioband1 and ioband2, and two additional
>   groups on the ioband1 device using the cgroup interface.

With the latest dm-ioband and blkio-cgroup, configuration can be done through
the cgroup interface: once a dm device is created, a blkio.settings file
appears under the cgroup directory. There is no need to run the
"dmsetup message" command or to care about blkio.id anymore. The following is
an example script based on yours.

***********************************************************************
echo "0 $(blockdev --getsize /dev/sdd1) ioband /dev/sdd1 1 0 0 cgroup" "weight 0 :100" | dmsetup create ioband1

mount -t cgroup -o blkio hier1 /cgroup/ioband
mkdir /cgroup/ioband/test1 /cgroup/ioband/test2

echo ioband1 200 > /cgroup/ioband/test1/blkio.settings
echo ioband1 100 > /cgroup/ioband/test2/blkio.settings

mount /dev/mapper/ioband1 /mnt/sdd1
***********************************************************************

> For the status of the various settings one needs to use the "dmsetup status"
> and "dmsetup table" commands. Look at the output of these commands with just
> two groups. The output for all the groups is on a single line. Think of the
> situation when there are 7-8 groups and how bad it will look.
>
> #dmsetup status
> ioband2: 0 40355280 ioband 1 -1 105 0 834 1 0 8
> ioband1: 0 37768752 ioband 1 -1 105 0 834 1 0 8 2 0 0 0 0 0 0 3 0 0 0 0 0 0

I'll provide a blkio.stat file to get statistics per cgroup in the next
release.

> > Once we create a dm-ioband device, the device can be configured
> > through the cgroup interface. I think it will not severly limit its
> > usefulness.
>
> To create the device in the first place you need the dm tools, and libcgroup
> needs to learn how to use the various dm commands. It also needs to learn
> how to parse the output of the "dmsetup table" and "dmsetup status" commands
> and consolidate that information.
>
> This is despite the fact that it is finally using the cgroup interface to
> group the tasks. libcgroup still needs to propagate the cgroup id to
> individual ioband devices.

We still need to use dmsetup for device creation, but it is not too much of a
pain. I think it would be better if dm-ioband were integrated into LVM; then
we could handle dm-ioband devices in almost the same manner as other LV
devices.

Thanks,
Ryo Tsuruta

^ permalink raw reply	[flat|nested] 80+ messages in thread
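A short usage sketch of this interface, assuming the blkio.settings semantics
shown in the script above and the standard cgroup "tasks" file (reading
blkio.settings back is also an assumption, and the test file name is
hypothetical), could look like:

# put a running dd into group test1 and check what was assigned to it
dd if=/mnt/sdd1/testfile of=/dev/null &     # hypothetical test file
echo $! > /cgroup/ioband/test1/tasks        # classify the dd into test1
cat /cgroup/ioband/test1/blkio.settings     # expected to show: ioband1 200
dmsetup status ioband1                      # per-group counters, as before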
* Re: Regarding dm-ioband tests 2009-09-08 13:42 ` Vivek Goyal ` (2 preceding siblings ...) (?) @ 2009-09-09 10:01 ` Ryo Tsuruta 2009-09-09 14:31 ` Vivek Goyal -1 siblings, 1 reply; 80+ messages in thread From: Ryo Tsuruta @ 2009-09-09 10:01 UTC (permalink / raw) To: vgoyal Cc: riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir Hi Vivek, Vivek Goyal <vgoyal@redhat.com> wrote: > > I think there are some advantages to dm-ioband. That's why I post > > dm-ioband to the mailing list. > > > > - dm-ioband supports not only proportional weight policy but also rate > > limiting policy. Besides, new policies can be added to dm-ioband if > > a user wants to control bandwidth by his or her own policy. > > I think we can easily extent io scheduler based controller to also support > max rate per group policy also. That should not be too hard. It is a > matter of only keeping track of io rate per group and if a group is > exceeding the rate, then schedule it out and move on to next group. > > I can do that once proportional weight solution is stablized and gets > merged. > > So its not an advantage of dm-ioband. O.K. > > - The dm-ioband driver can be replaced without stopping the system by > > using device-mapper's facility. It's easy to maintain. > > We talked about this point in the past also. In io scheduler based > controller, just move all the tasks to root group and you got a system > not doing any io control. > > By the way why would one like to do that? > > So this is also not an advantage. My point is that dm-ioband can be updated for improvements and bug-fixing without stopping the system. > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an > > advantage.) > > I think this is more of a disadvantage than advantage. We have a very well > defined functionality of cgroup in kernel to group the tasks. Now you are > coming up with your own method of grouping the tasks which will make life > even more confusing for users and application writers. > > I don't understand what is that core requirement of yours which is not met > by io scheduler based io controller. range policy control you have > implemented recently. I don't think that removing dm-ioband module > dynamically is core requirement. Also whatever you can do with additional > grouping mechanism, you can do with cgroup also. > > So if there is any of your core functionality which is not fulfilled by > io scheduler based controller, please let me know. I will be happy to look > into it and try to provide that feature. But looking at above list, I am > not convinced that any of the above is a compelling argument for dm-ioband > inclusion. As I wrote in another email, I would like to make use of dm-ioband on the system which doesn't support cgroup such as RHEL. In addition, there are devices which doesn't use standard IO schedulers, and dm-ioband can work on even such devices. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-09 10:01 ` Ryo Tsuruta @ 2009-09-09 14:31 ` Vivek Goyal 0 siblings, 0 replies; 80+ messages in thread From: Vivek Goyal @ 2009-09-09 14:31 UTC (permalink / raw) To: Ryo Tsuruta Cc: riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir On Wed, Sep 09, 2009 at 07:01:46PM +0900, Ryo Tsuruta wrote: > Hi Vivek, > > Vivek Goyal <vgoyal@redhat.com> wrote: > > > I think there are some advantages to dm-ioband. That's why I post > > > dm-ioband to the mailing list. > > > > > > - dm-ioband supports not only proportional weight policy but also rate > > > limiting policy. Besides, new policies can be added to dm-ioband if > > > a user wants to control bandwidth by his or her own policy. > > > > I think we can easily extent io scheduler based controller to also support > > max rate per group policy also. That should not be too hard. It is a > > matter of only keeping track of io rate per group and if a group is > > exceeding the rate, then schedule it out and move on to next group. > > > > I can do that once proportional weight solution is stablized and gets > > merged. > > > > So its not an advantage of dm-ioband. > > O.K. > > > > - The dm-ioband driver can be replaced without stopping the system by > > > using device-mapper's facility. It's easy to maintain. > > > > We talked about this point in the past also. In io scheduler based > > controller, just move all the tasks to root group and you got a system > > not doing any io control. > > > > By the way why would one like to do that? > > > > So this is also not an advantage. > > My point is that dm-ioband can be updated for improvements and > bug-fixing without stopping the system. > > > > - dm-ioband can use without cgroup. (I remember Vivek said it's not an > > > advantage.) > > > > I think this is more of a disadvantage than advantage. We have a very well > > defined functionality of cgroup in kernel to group the tasks. Now you are > > coming up with your own method of grouping the tasks which will make life > > even more confusing for users and application writers. > > > > I don't understand what is that core requirement of yours which is not met > > by io scheduler based io controller. range policy control you have > > implemented recently. I don't think that removing dm-ioband module > > dynamically is core requirement. Also whatever you can do with additional > > grouping mechanism, you can do with cgroup also. > > > > So if there is any of your core functionality which is not fulfilled by > > io scheduler based controller, please let me know. I will be happy to look > > into it and try to provide that feature. But looking at above list, I am > > not convinced that any of the above is a compelling argument for dm-ioband > > inclusion. > > As I wrote in another email, I would like to make use of dm-ioband on > the system which doesn't support cgroup such as RHEL. For supporting io controller mechanism in older kernels which don't have cgroup interface support, I think one needs to maintain out of the tree module. Upstream does not benefit from it. > In addition, > there are devices which doesn't use standard IO schedulers, and > dm-ioband can work on even such devices. This is a interesting use case. Few thoughts. - Can't io scheduling mechanism of these devices make use of elevator and elevator fair queuing interfaces to take advantage of io controlling mechanism. It should not be too difficult. Look at noop. 
It has just 131 lines of code and it now supports hierarchical io scheduling. This will come with request queue and its merging and plug/unplug mechanism. Is that an issue? - If not, then yes, for these corner cases, io scheduler based controller does not work as it is. Thanks Vivek ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-09 14:31 ` Vivek Goyal
@ 2009-09-10  3:45 ` Ryo Tsuruta
  2009-09-10 13:25   ` Vivek Goyal
  -1 siblings, 1 reply; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-10 3:45 UTC (permalink / raw)
  To: vgoyal
  Cc: riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> > In addition,
> > there are devices which doesn't use standard IO schedulers, and
> > dm-ioband can work on even such devices.
>
> This is a interesting use case. Few thoughts.
>
> - Can't io scheduling mechanism of these devices make use of elevator and
>   elevator fair queuing interfaces to take advantage of io controlling
>   mechanism. It should not be too difficult. Look at noop. It has
>   just 131 lines of code and it now supports hierarchical io scheduling.
>
>   This will come with request queue and its merging and plug/unplug
>   mechanism. Is that an issue?
>
> - If not, then yes, for these corner cases, io scheduler based controller
>   does not work as it is.

I have an extremely fast SSD whose device driver provides its own
make_request_fn(). The driver intercepts IO requests and all subsequent
processing is done inside it.

Thanks,
Ryo Tsuruta

^ permalink raw reply	[flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-10  3:45 ` Ryo Tsuruta
@ 2009-09-10 13:25 ` Vivek Goyal
  0 siblings, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-10 13:25 UTC (permalink / raw)
  To: Ryo Tsuruta
  Cc: riel, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir

On Thu, Sep 10, 2009 at 12:45:47PM +0900, Ryo Tsuruta wrote:
> Hi Vivek,
>
> Vivek Goyal <vgoyal@redhat.com> wrote:
> > > In addition,
> > > there are devices which doesn't use standard IO schedulers, and
> > > dm-ioband can work on even such devices.
> >
> > This is a interesting use case. Few thoughts.
> >
> > - Can't io scheduling mechanism of these devices make use of elevator and
> >   elevator fair queuing interfaces to take advantage of io controlling
> >   mechanism. It should not be too difficult. Look at noop. It has
> >   just 131 lines of code and it now supports hierarchical io scheduling.
> >
> >   This will come with request queue and its merging and plug/unplug
> >   mechanism. Is that an issue?
> >
> > - If not, then yes, for these corner cases, io scheduler based controller
> >   does not work as it is.
>
> I have an extremely fast SSD whose device driver provides its own
> make_request_fn(). The driver intercepts IO requests and all subsequent
> processing is done inside it.

IMHO, in those cases these SSD drivers need to hook into the block layer's
request queue mechanism if they want an IO controlling mechanism, instead of
us coming up with a device mapper module.

Think of it this way: if somebody needs CFQ-like task classes and prio
supported on these devices, should we also come up with another device mapper
module, "dm-cfq"?

Jens, I am wondering if similar concerns have popped up in the past for CFQ
as well? Somebody asking to support task prio and classes on devices which
don't use a standard IO scheduler?

Thanks
Vivek

^ permalink raw reply	[flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-08 3:01 ` Ryo Tsuruta @ 2009-09-08 19:24 ` Rik van Riel -1 siblings, 0 replies; 80+ messages in thread From: Rik van Riel @ 2009-09-08 19:24 UTC (permalink / raw) To: Ryo Tsuruta Cc: vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir Ryo Tsuruta wrote: > Rik van Riel <riel@redhat.com> wrote: >> Are you saying that dm-ioband is purposely unfair, >> until a certain load level is reached? > > Not unfair, dm-ioband(weight policy) is intentionally designed to > use bandwidth efficiently, weight policy tries to give spare bandwidth > of inactive groups to active groups. This sounds good, except that the lack of anticipation means that a group with just one task doing reads will be considered "inactive" in-between reads. This means writes can always get in-between two reads, sometimes multiple writes at a time, really disadvantaging a group that is doing just disk reads. This is a problem, because reads are generally more time sensitive than writes. >>> We regarded reducing throughput loss rather than reducing duration >>> as the design of dm-ioband. Of course, it is possible to make a new >>> policy which reduces duration. >> ... while also reducing overall system throughput >> by design? > > I think it reduces system throughput compared to the current > implementation, because it causes more overhead to do fine grained > control. Except that the io scheduler based io controller seems to be able to enforce fairness while not reducing throughput. Dm-ioband would have to address these issues to be a serious contender, IMHO. -- All rights reversed. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-08 19:24 ` Rik van Riel (?) @ 2009-09-09 0:09 ` Fabio Checconi 2009-09-09 2:06 ` Vivek Goyal 2009-09-09 9:24 ` Ryo Tsuruta -1 siblings, 2 replies; 80+ messages in thread From: Fabio Checconi @ 2009-09-09 0:09 UTC (permalink / raw) To: Rik van Riel Cc: Ryo Tsuruta, vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir Hi, > From: Rik van Riel <riel@redhat.com> > Date: Tue, Sep 08, 2009 03:24:08PM -0400 > > Ryo Tsuruta wrote: > >Rik van Riel <riel@redhat.com> wrote: > > >>Are you saying that dm-ioband is purposely unfair, > >>until a certain load level is reached? > > > >Not unfair, dm-ioband(weight policy) is intentionally designed to > >use bandwidth efficiently, weight policy tries to give spare bandwidth > >of inactive groups to active groups. > > This sounds good, except that the lack of anticipation > means that a group with just one task doing reads will > be considered "inactive" in-between reads. > anticipation helps in achieving fairness, but CFQ currently disables idling for nonrot+NCQ media, to avoid the resulting throughput loss on some SSDs. Are we really sure that we want to introduce anticipation everywhere, not only to improve throughput on rotational media, but to achieve fairness too? ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-09  0:09 ` Fabio Checconi
@ 2009-09-09  2:06 ` Vivek Goyal
  2009-09-09  9:24   ` Ryo Tsuruta
  1 sibling, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-09 2:06 UTC (permalink / raw)
  To: Fabio Checconi
  Cc: Rik van Riel, Ryo Tsuruta, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir

On Wed, Sep 09, 2009 at 02:09:00AM +0200, Fabio Checconi wrote:
> Hi,
>
> > From: Rik van Riel <riel@redhat.com>
> > Date: Tue, Sep 08, 2009 03:24:08PM -0400
> >
> > Ryo Tsuruta wrote:
> > >Rik van Riel <riel@redhat.com> wrote:
> >
> > >>Are you saying that dm-ioband is purposely unfair,
> > >>until a certain load level is reached?
> > >
> > >Not unfair, dm-ioband(weight policy) is intentionally designed to
> > >use bandwidth efficiently, weight policy tries to give spare bandwidth
> > >of inactive groups to active groups.
> >
> > This sounds good, except that the lack of anticipation
> > means that a group with just one task doing reads will
> > be considered "inactive" in-between reads.
> >
>
> anticipation helps in achieving fairness, but CFQ currently disables
> idling for nonrot+NCQ media, to avoid the resulting throughput loss on
> some SSDs. Are we really sure that we want to introduce anticipation
> everywhere, not only to improve throughput on rotational media, but to
> achieve fairness too?

That's a good point. Personally I think that the fairness requirements for
individual queues and for groups are a little different. CFQ in general seems
to be focusing more on latency and throughput at the cost of fairness.

With groups, we probably need to put a greater amount of emphasis on group
fairness. So a group will be a relatively slower entity (with anticipation on
and more idling), but it will also give you a greater amount of isolation. So
in practice, one will create groups carefully and they will not proliferate
like queues. This can mean overall reduced throughput on SSDs.

Having said that, group idling is tunable and one can always reduce it to
achieve a balance between fairness and throughput depending on one's needs.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-09 2:06 ` Vivek Goyal (?) @ 2009-09-09 15:41 ` Fabio Checconi 2009-09-09 17:30 ` Vivek Goyal -1 siblings, 1 reply; 80+ messages in thread From: Fabio Checconi @ 2009-09-09 15:41 UTC (permalink / raw) To: Vivek Goyal Cc: Rik van Riel, Ryo Tsuruta, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir > From: Vivek Goyal <vgoyal@redhat.com> > Date: Tue, Sep 08, 2009 10:06:20PM -0400 > > On Wed, Sep 09, 2009 at 02:09:00AM +0200, Fabio Checconi wrote: > > Hi, > > > > > From: Rik van Riel <riel@redhat.com> > > > Date: Tue, Sep 08, 2009 03:24:08PM -0400 > > > > > > Ryo Tsuruta wrote: > > > >Rik van Riel <riel@redhat.com> wrote: > > > > > > >>Are you saying that dm-ioband is purposely unfair, > > > >>until a certain load level is reached? > > > > > > > >Not unfair, dm-ioband(weight policy) is intentionally designed to > > > >use bandwidth efficiently, weight policy tries to give spare bandwidth > > > >of inactive groups to active groups. > > > > > > This sounds good, except that the lack of anticipation > > > means that a group with just one task doing reads will > > > be considered "inactive" in-between reads. > > > > > > > anticipation helps in achieving fairness, but CFQ currently disables > > idling for nonrot+NCQ media, to avoid the resulting throughput loss on > > some SSDs. Are we really sure that we want to introduce anticipation > > everywhere, not only to improve throughput on rotational media, but to > > achieve fairness too? > > That's a good point. Personally I think that fairness requirements for > individual queues and groups are little different. CFQ in general seems > to be focussing more on latency and throughput at the cost of fairness. > > With groups, we probably need to put a greater amount of emphasis on group > fairness. So group will be a relatively a slower entity (with anticiaption > on and more idling), but it will also give you a greater amount of > isolation. So in practice, one will create groups carefully and they will > not proliferate like queues. This can mean overall reduced throughput on > SSD. > Ok, I personally agree on that, but I think it's something to be documented. > Having said that, group idling is tunable and one can always reduce it to > achieve a balance between fairness vs throughput depending on his need. > This is good, however tuning will not be an easy task (at least, in my experience with BFQ it has been a problem): while for throughput usually there are tradeoffs, as soon as a queue/group idles and then timeouts, from the fairness perspective the results soon become almost random (i.e., depending on the rate of successful anticipations, but in the common case they are unpredictable)... ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests 2009-09-09 15:41 ` Fabio Checconi @ 2009-09-09 17:30 ` Vivek Goyal 0 siblings, 0 replies; 80+ messages in thread From: Vivek Goyal @ 2009-09-09 17:30 UTC (permalink / raw) To: Fabio Checconi Cc: Rik van Riel, Ryo Tsuruta, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir On Wed, Sep 09, 2009 at 05:41:26PM +0200, Fabio Checconi wrote: > > From: Vivek Goyal <vgoyal@redhat.com> > > Date: Tue, Sep 08, 2009 10:06:20PM -0400 > > > > On Wed, Sep 09, 2009 at 02:09:00AM +0200, Fabio Checconi wrote: > > > Hi, > > > > > > > From: Rik van Riel <riel@redhat.com> > > > > Date: Tue, Sep 08, 2009 03:24:08PM -0400 > > > > > > > > Ryo Tsuruta wrote: > > > > >Rik van Riel <riel@redhat.com> wrote: > > > > > > > > >>Are you saying that dm-ioband is purposely unfair, > > > > >>until a certain load level is reached? > > > > > > > > > >Not unfair, dm-ioband(weight policy) is intentionally designed to > > > > >use bandwidth efficiently, weight policy tries to give spare bandwidth > > > > >of inactive groups to active groups. > > > > > > > > This sounds good, except that the lack of anticipation > > > > means that a group with just one task doing reads will > > > > be considered "inactive" in-between reads. > > > > > > > > > > anticipation helps in achieving fairness, but CFQ currently disables > > > idling for nonrot+NCQ media, to avoid the resulting throughput loss on > > > some SSDs. Are we really sure that we want to introduce anticipation > > > everywhere, not only to improve throughput on rotational media, but to > > > achieve fairness too? > > > > That's a good point. Personally I think that fairness requirements for > > individual queues and groups are little different. CFQ in general seems > > to be focussing more on latency and throughput at the cost of fairness. > > > > With groups, we probably need to put a greater amount of emphasis on group > > fairness. So group will be a relatively a slower entity (with anticiaption > > on and more idling), but it will also give you a greater amount of > > isolation. So in practice, one will create groups carefully and they will > > not proliferate like queues. This can mean overall reduced throughput on > > SSD. > > > > Ok, I personally agree on that, but I think it's something to be documented. > Sure. I will document it in documentation file. > > > Having said that, group idling is tunable and one can always reduce it to > > achieve a balance between fairness vs throughput depending on his need. > > > > This is good, however tuning will not be an easy task (at least, in my > experience with BFQ it has been a problem): while for throughput usually > there are tradeoffs, as soon as a queue/group idles and then timeouts, > from the fairness perspective the results soon become almost random > (i.e., depending on the rate of successful anticipations, but in the > common case they are unpredictable)... I am lost in last few lines. I guess you are suggesting that static tuning is hard and dynamically adjusting idling has limitations that it might not be accurate all the time? I will explain how things are working in current set of io scheduler patches. Currently on top of queue idling, I have implemented group idling also. Queue idling is dynamic and io scheduler like CFQ keeps track of traffic pattern on the queue and disables/enables idling dynamically. So in this case fairness depends on rate of successful anticipations by the io scheduler. 
Group idling currently is static in nature and is implemented purely in the
elevator fair queuing layer. Group idling kicks in only when a group is empty
at the time of queue expiration and the underlying IO scheduler has not chosen
to enable idling on the queue. This gives us the guarantee that a group will
keep getting its fair share of the disk as long as a new request arrives in
the group within that idling period.

Implementing group idling this way ensures that it does not bog down the IO
scheduler, and within-group queue switching can still be very fast (no idling
on many of the queues by CFQ).

Now, in the case of SSDs, if group idling is really hurting somebody, I would
expect him to set it to either 1 or 0. You might get better throughput, but
then expect fairness for the group only if the group is continuously
backlogged (something the dm-ioband guys seem to be doing).

So do you think that adjusting this "group_idling" tunable is too complicated
and there are better ways to handle it in the case of SSD+NCQ?

Thanks
Vivek

^ permalink raw reply	[flat|nested] 80+ messages in thread
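To make the tuning discussion concrete, here is a sketch of what an
administrator-side policy could look like. The sysfs name and location of the
group idling knob are assumptions (modeled on CFQ's iosched tunables); only
the rotational flag below is an existing attribute.

for dev in sdd sde; do
	rot=$(cat /sys/block/$dev/queue/rotational)
	tunable=/sys/block/$dev/queue/iosched/group_idle    # assumed path/name
	if [ "$rot" = "0" ] && [ -w "$tunable" ]; then
		# SSD (+NCQ): trade some group fairness for throughput
		echo 1 > "$tunable"
	fi
done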
* Re: Regarding dm-ioband tests 2009-09-09 17:30 ` Vivek Goyal (?) @ 2009-09-09 19:01 ` Fabio Checconi -1 siblings, 0 replies; 80+ messages in thread From: Fabio Checconi @ 2009-09-09 19:01 UTC (permalink / raw) To: Vivek Goyal Cc: Rik van Riel, Ryo Tsuruta, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir > From: Vivek Goyal <vgoyal@redhat.com> > Date: Wed, Sep 09, 2009 01:30:03PM -0400 > > On Wed, Sep 09, 2009 at 05:41:26PM +0200, Fabio Checconi wrote: > > > From: Vivek Goyal <vgoyal@redhat.com> > > > Date: Tue, Sep 08, 2009 10:06:20PM -0400 > > > ... > > This is good, however tuning will not be an easy task (at least, in my > > experience with BFQ it has been a problem): while for throughput usually > > there are tradeoffs, as soon as a queue/group idles and then timeouts, > > from the fairness perspective the results soon become almost random > > (i.e., depending on the rate of successful anticipations, but in the > > common case they are unpredictable)... > > I am lost in last few lines. I guess you are suggesting that static tuning > is hard and dynamically adjusting idling has limitations that it might not > be accurate all the time? > Yes, this was the problem, at least for me. As soon as there were unsuccessful anticipations there was no graceful degradation of fairness, and bandwidth distribution became almost random. In this situation all the complexity of CFQ/BFQ/io-controller seems overkill; NCQ+SSD is or will be quite a common usage scenario triggering it. > I will explain how things are working in current set of io scheduler > patches. > > Currently on top of queue idling, I have implemented group idling also. > Queue idling is dynamic and io scheduler like CFQ keeps track of > traffic pattern on the queue and disables/enables idling dynamically. So > in this case fairness depends on rate of successful anticipations by the > io scheduler. > > Group idling currently is static in nature and purely implemented in > elevator fair queuing layer. Group idling kicks in only when a group is > empty at the time of queue expiration and underlying ioscheduler has not > chosen to enable idling on the queue. This provides us the gurantee that > group will keep on getting its fair share of disk as long as a new request > comes in the group with-in that idling period. > > Implementing group idling ensures that it does not bog down the io scheduler > and with-in group queue switching can still be very fast (no idling on many of > the queues by cfq). > > Now in case of SSD if group idling is really hurting somebody, I would > expect him to set it to either 1 or 0. You might get better throughput > but then expect fairness for the group only if the group is continuously > backlogged. (Something what dm-ioband guys seem to be doing). > > So do you think that adjusting this "group_idling" tunable is too > complicated and there are better ways to handle it in case of SSD+NCQ? > Unfortunately I am not aware of any reasonable and working method to properly handle this issue; anyway adjusting the tunable is something that needs a lot of care. ^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: Regarding dm-ioband tests
  2009-09-09  0:09 ` Fabio Checconi
@ 2009-09-09  9:24 ` Ryo Tsuruta
  1 sibling, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-09 9:24 UTC (permalink / raw)
  To: fchecconi
  Cc: riel, vgoyal, linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, jmoyer, balbir

Hi,

Fabio Checconi <fchecconi@gmail.com> wrote:
> Hi,
>
> > From: Rik van Riel <riel@redhat.com>
> > Date: Tue, Sep 08, 2009 03:24:08PM -0400
> >
> > Ryo Tsuruta wrote:
> > >Rik van Riel <riel@redhat.com> wrote:
> >
> > >>Are you saying that dm-ioband is purposely unfair,
> > >>until a certain load level is reached?
> > >
> > >Not unfair, dm-ioband(weight policy) is intentionally designed to
> > >use bandwidth efficiently, weight policy tries to give spare bandwidth
> > >of inactive groups to active groups.
> >
> > This sounds good, except that the lack of anticipation
> > means that a group with just one task doing reads will
> > be considered "inactive" in-between reads.
> >
>
> anticipation helps in achieving fairness, but CFQ currently disables
> idling for nonrot+NCQ media, to avoid the resulting throughput loss on
> some SSDs. Are we really sure that we want to introduce anticipation
> everywhere, not only to improve throughput on rotational media, but to
> achieve fairness too?

I'm also not sure if it's worth introducing anticipation everywhere.
Storage devices are becoming faster and smarter every year. In practice,
I did a benchmark on SAN storage and the noop scheduler got the best result.
However, I'll consider how IO from a single task should be taken care of.

Thanks,
Ryo Tsuruta

^ permalink raw reply	[flat|nested] 80+ messages in thread
* ioband: Limited fairness and weak isolation between groups (Was: Re: Regarding dm-ioband tests) 2009-09-07 11:02 ` Ryo Tsuruta @ 2009-09-16 4:45 ` Vivek Goyal -1 siblings, 0 replies; 80+ messages in thread From: Vivek Goyal @ 2009-09-16 4:45 UTC (permalink / raw) To: Ryo Tsuruta Cc: linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, riel, jmoyer, balbir On Mon, Sep 07, 2009 at 08:02:22PM +0900, Ryo Tsuruta wrote: > Hi Vivek, > > Vivek Goyal <vgoyal@redhat.com> wrote: > > > Thank you for testing dm-ioband. dm-ioband is designed to start > > > throttling bandwidth when multiple IO requests are issued to devices > > > simultaneously, IOW, to start throttling when IO load exceeds a > > > certain level. > > > > > > > What is that certain level? Secondly what's the advantage of this? > > > > I can see disadvantages though. So unless a group is really busy "up to > > that certain level" it will not get fairness? I breaks the isolation > > between groups. > > In your test case, at least more than one dd thread have to run > simultaneously in the higher weight group. The reason is that > if there is an IO group which does not issue a certain number of IO > requests, dm-ioband assumes the IO group is inactive and assign its > spare bandwidth to active IO groups. Then whole bandwidth of the > device can be efficiently used. Please run two dd threads in the > higher group, it will work as you expect. > > However, if you want to get fairness in a case like this, a new > bandwidth control policy which controls accurately according to > assigned weights can be added to dm-ioband. > > > I also ran your test of doing heavy IO in two groups. This time I am > > running 4 dd threads in both the ioband devices. Following is the snapshot > > of "dmsetup table" output. > > > > Fri Sep 4 17:45:27 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0 > > > > Fri Sep 4 17:45:29 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 41 0 4184 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 173 0 20096 0 0 0 > > > > Fri Sep 4 17:45:37 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 1605 23 197976 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 4640 1 583168 0 0 0 > > > > Fri Sep 4 17:45:45 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 3650 47 453488 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 8572 1 1079144 0 0 0 > > > > Fri Sep 4 17:45:51 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 5111 68 635696 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 11587 1 1459544 0 0 0 > > > > Fri Sep 4 17:45:53 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 5698 73 709272 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 12503 1 1575112 0 0 0 > > > > Fri Sep 4 17:45:57 EDT 2009 > > ioband2: 0 40355280 ioband 1 -1 6790 87 845808 0 0 0 > > ioband1: 0 37768752 ioband 1 -1 14395 2 1813680 0 0 0 > > > > Note, it took me more than 20 seconds (since I started the threds) to > > reach close to desired fairness level. That's too long a duration. > > We regarded reducing throughput loss rather than reducing duration > as the design of dm-ioband. Of course, it is possible to make a new > policy which reduces duration. Not anticipating on rotation media and letting other group do the dispatch is not only bad for fairness of random readers but it seems to be bad for overall throughput also. So letting other group dispatching thinking it will boost throughput is not necessarily right on rotational media. I ran following test. 
Created two groups of weight 100 each, put a sequential dd reader in the first group and buffered writers in the second group, let it run for 20 seconds, and observed at the end of 20 seconds how much work each group got done. I ran this test multiple times, increasing the number of writers by one each time. I tested this with dm-ioband and with the io scheduler based io controller patches.

With dm-ioband
==============
launched reader 3176
launched 1 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 159 0 1272 0 0 0
ioband1: 0 37768752 ioband 1 -1 13282 23 1673656 0 0 0
Total sectors transferred: 1674928

launched reader 3194
launched 2 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 138 0 1104 54538 54081 436304
ioband1: 0 37768752 ioband 1 -1 4247 1 535056 0 0 0
Total sectors transferred: 972464

launched reader 3203
launched 3 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 189 0 1512 44956 44572 359648
ioband1: 0 37768752 ioband 1 -1 3546 0 447128 0 0 0
Total sectors transferred: 808288

launched reader 3213
launched 4 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 83 0 664 55937 55810 447496
ioband1: 0 37768752 ioband 1 -1 2243 0 282624 0 0 0
Total sectors transferred: 730784

launched reader 3224
launched 5 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 179 0 1432 46544 46146 372352
ioband1: 0 37768752 ioband 1 -1 3348 0 422744 0 0 0
Total sectors transferred: 796528

launched reader 3236
launched 6 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 176 0 1408 44499 44115 355992
ioband1: 0 37768752 ioband 1 -1 3998 0 504504 0 0 0
Total sectors transferred: 861904

launched reader 3250
launched 7 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 451 0 3608 42267 42115 338136
ioband1: 0 37768752 ioband 1 -1 2682 0 337976 0 0 0
Total sectors transferred: 679720

With io scheduler based io controller
=====================================
launched reader 3026
launched 1 writers
waiting for 20 seconds
test1 statistics: time=8:48 8657 sectors=8:48 886112 dq=8:48 0
test2 statistics: time=8:48 7685 sectors=8:48 473384 dq=8:48 4
Total sectors transferred: 1359496

launched reader 3064
launched 2 writers
waiting for 20 seconds
test1 statistics: time=8:48 7429 sectors=8:48 856664 dq=8:48 0
test2 statistics: time=8:48 7431 sectors=8:48 376528 dq=8:48 0
Total sectors transferred: 1233192

launched reader 3094
launched 3 writers
waiting for 20 seconds
test1 statistics: time=8:48 7279 sectors=8:48 832840 dq=8:48 0
test2 statistics: time=8:48 7302 sectors=8:48 372120 dq=8:48 0
Total sectors transferred: 1204960

launched reader 3122
launched 4 writers
waiting for 20 seconds
test1 statistics: time=8:48 7291 sectors=8:48 846024 dq=8:48 0
test2 statistics: time=8:48 7314 sectors=8:48 361280 dq=8:48 0
Total sectors transferred: 1207304

launched reader 3151
launched 5 writers
waiting for 20 seconds
test1 statistics: time=8:48 7077 sectors=8:48 815184 dq=8:48 0
test2 statistics: time=8:48 7090 sectors=8:48 398472 dq=8:48 0
Total sectors transferred: 1213656

launched reader 3179
launched 6 writers
waiting for 20 seconds
test1 statistics: time=8:48 7494 sectors=8:48 873304 dq=8:48 1
test2 statistics: time=8:48 7034 sectors=8:48 316312 dq=8:48 2
Total sectors transferred: 1189616

launched reader 3209
launched 7 writers
waiting for 20 seconds
test1 statistics: time=8:48 6809 sectors=8:48 795528 dq=8:48 0
test2 statistics: time=8:48 6850 sectors=8:48 380008 dq=8:48 1
Total sectors transferred: 1175536

A few things stand out.
=======================
- With dm-ioband, as the number of writers in group 2 increased, bandwidth was given to those writes over the reads running in group 1. This had two bad effects: read throughput went down, and overall disk throughput also went down. So the reader did not get fairness, and at the same time overall throughput dropped. Hence it is probably not a good idea to skip anticipation and always let other groups dispatch on rotational media. In contrast, the io scheduler based controller seems to be steady: the reader does not suffer as the number of writers in the second group increases, and overall disk throughput also remains stable.

Following is the sample script I used for the above test.
*******************************************************************
launch_writers() {
        nr_writers=$1
        for ((j=1;j<=$nr_writers;j++)); do
                dd if=/dev/zero of=/mnt/sdd2/writefile$j bs=4K &
                # echo "launched writer $!"
        done
}

do_test () {
        nr_writers=$1
        sync
        echo 3 > /proc/sys/vm/drop_caches
        echo noop > /sys/block/sdd/queue/scheduler
        echo cfq > /sys/block/sdd/queue/scheduler
        dmsetup message ioband1 0 reset
        dmsetup message ioband2 0 reset

        #launch a sequential reader in sdd1
        dd if=/mnt/sdd1/4G-file of=/dev/null &
        echo "launched reader $!"

        launch_writers $nr_writers
        echo "launched $nr_writers writers"
        echo "waiting for 20 seconds"
        sleep 20
        dmsetup status
        killall dd > /dev/null 2>&1
}

for ((i=1;i<8;i++)); do
        do_test $i
        echo
done
*********************************************************************

^ permalink raw reply [flat|nested] 80+ messages in thread
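The script above only prints the raw "dmsetup status" lines; the "Total sectors transferred" figures in the results were computed separately. A minimal sketch of that step (not part of the original mail), assuming fields 9 and 12 of each ioband status line hold the read and write sector counts, which is consistent with the totals quoted above:

    dmsetup status | awk '
        # field 9 appears to be read sectors, field 12 write sectors
        $1 ~ /^ioband[12]:/ { total += $9 + $12 }
        END { print "Total sectors transferred: " total }'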
* ioband: Limited fairness and weak isolation between groups (Was: Re: Regarding dm-ioband tests)
@ 2009-09-16 4:45 ` Vivek Goyal
  0 siblings, 0 replies; 80+ messages in thread
From: Vivek Goyal @ 2009-09-16 4:45 UTC (permalink / raw)
To: Ryo Tsuruta
Cc: riel, guijianfeng, linux-kernel, jmoyer, dm-devel, jens.axboe, nauman, akpm, agk, balbir

On Mon, Sep 07, 2009 at 08:02:22PM +0900, Ryo Tsuruta wrote:
> Hi Vivek,
>
> Vivek Goyal <vgoyal@redhat.com> wrote:
> > > Thank you for testing dm-ioband. dm-ioband is designed to start
> > > throttling bandwidth when multiple IO requests are issued to devices
> > > simultaneously, IOW, to start throttling when IO load exceeds a
> > > certain level.
> >
> > What is that certain level? Secondly, what's the advantage of this?
> >
> > I can see disadvantages though. So unless a group is really busy "up to
> > that certain level" it will not get fairness? It breaks the isolation
> > between groups.
>
> In your test case, at least more than one dd thread has to run
> simultaneously in the higher weight group. The reason is that
> if there is an IO group which does not issue a certain number of IO
> requests, dm-ioband assumes the IO group is inactive and assigns its
> spare bandwidth to active IO groups. Then the whole bandwidth of the
> device can be efficiently used. Please run two dd threads in the
> higher group, it will work as you expect.
>
> However, if you want to get fairness in a case like this, a new
> bandwidth control policy which controls accurately according to
> assigned weights can be added to dm-ioband.
>
> > I also ran your test of doing heavy IO in two groups. This time I am
> > running 4 dd threads in both the ioband devices. Following is the snapshot
> > of "dmsetup table" output.
> >
> > Fri Sep 4 17:45:27 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 0 0 0 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 0 0 0 0 0 0
> >
> > Fri Sep 4 17:45:29 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 41 0 4184 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 173 0 20096 0 0 0
> >
> > Fri Sep 4 17:45:37 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 1605 23 197976 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 4640 1 583168 0 0 0
> >
> > Fri Sep 4 17:45:45 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 3650 47 453488 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 8572 1 1079144 0 0 0
> >
> > Fri Sep 4 17:45:51 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 5111 68 635696 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 11587 1 1459544 0 0 0
> >
> > Fri Sep 4 17:45:53 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 5698 73 709272 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 12503 1 1575112 0 0 0
> >
> > Fri Sep 4 17:45:57 EDT 2009
> > ioband2: 0 40355280 ioband 1 -1 6790 87 845808 0 0 0
> > ioband1: 0 37768752 ioband 1 -1 14395 2 1813680 0 0 0
> >
> > Note, it took me more than 20 seconds (since I started the threads) to
> > reach close to the desired fairness level. That's too long a duration.
>
> We regarded reducing throughput loss rather than reducing duration
> as the design goal of dm-ioband. Of course, it is possible to make a new
> policy which reduces duration.

Not anticipating on rotational media and letting other groups dispatch is not only bad for fairness of random readers; it also seems to be bad for overall throughput. So letting other groups dispatch on the assumption that it will boost throughput is not necessarily the right thing to do on rotational media. I ran the following test.
Created two groups of weight 100 each, put a sequential dd reader in the first group and buffered writers in the second group, let it run for 20 seconds, and observed at the end of 20 seconds how much work each group got done. I ran this test multiple times, increasing the number of writers by one each time. I tested this with dm-ioband and with the io scheduler based io controller patches.

With dm-ioband
==============
launched reader 3176
launched 1 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 159 0 1272 0 0 0
ioband1: 0 37768752 ioband 1 -1 13282 23 1673656 0 0 0
Total sectors transferred: 1674928

launched reader 3194
launched 2 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 138 0 1104 54538 54081 436304
ioband1: 0 37768752 ioband 1 -1 4247 1 535056 0 0 0
Total sectors transferred: 972464

launched reader 3203
launched 3 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 189 0 1512 44956 44572 359648
ioband1: 0 37768752 ioband 1 -1 3546 0 447128 0 0 0
Total sectors transferred: 808288

launched reader 3213
launched 4 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 83 0 664 55937 55810 447496
ioband1: 0 37768752 ioband 1 -1 2243 0 282624 0 0 0
Total sectors transferred: 730784

launched reader 3224
launched 5 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 179 0 1432 46544 46146 372352
ioband1: 0 37768752 ioband 1 -1 3348 0 422744 0 0 0
Total sectors transferred: 796528

launched reader 3236
launched 6 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 176 0 1408 44499 44115 355992
ioband1: 0 37768752 ioband 1 -1 3998 0 504504 0 0 0
Total sectors transferred: 861904

launched reader 3250
launched 7 writers
waiting for 20 seconds
ioband2: 0 40355280 ioband 1 -1 451 0 3608 42267 42115 338136
ioband1: 0 37768752 ioband 1 -1 2682 0 337976 0 0 0
Total sectors transferred: 679720

With io scheduler based io controller
=====================================
launched reader 3026
launched 1 writers
waiting for 20 seconds
test1 statistics: time=8:48 8657 sectors=8:48 886112 dq=8:48 0
test2 statistics: time=8:48 7685 sectors=8:48 473384 dq=8:48 4
Total sectors transferred: 1359496

launched reader 3064
launched 2 writers
waiting for 20 seconds
test1 statistics: time=8:48 7429 sectors=8:48 856664 dq=8:48 0
test2 statistics: time=8:48 7431 sectors=8:48 376528 dq=8:48 0
Total sectors transferred: 1233192

launched reader 3094
launched 3 writers
waiting for 20 seconds
test1 statistics: time=8:48 7279 sectors=8:48 832840 dq=8:48 0
test2 statistics: time=8:48 7302 sectors=8:48 372120 dq=8:48 0
Total sectors transferred: 1204960

launched reader 3122
launched 4 writers
waiting for 20 seconds
test1 statistics: time=8:48 7291 sectors=8:48 846024 dq=8:48 0
test2 statistics: time=8:48 7314 sectors=8:48 361280 dq=8:48 0
Total sectors transferred: 1207304

launched reader 3151
launched 5 writers
waiting for 20 seconds
test1 statistics: time=8:48 7077 sectors=8:48 815184 dq=8:48 0
test2 statistics: time=8:48 7090 sectors=8:48 398472 dq=8:48 0
Total sectors transferred: 1213656

launched reader 3179
launched 6 writers
waiting for 20 seconds
test1 statistics: time=8:48 7494 sectors=8:48 873304 dq=8:48 1
test2 statistics: time=8:48 7034 sectors=8:48 316312 dq=8:48 2
Total sectors transferred: 1189616

launched reader 3209
launched 7 writers
waiting for 20 seconds
test1 statistics: time=8:48 6809 sectors=8:48 795528 dq=8:48 0
test2 statistics: time=8:48 6850 sectors=8:48 380008 dq=8:48 1
Total sectors transferred: 1175536

A few things stand out.
=======================
- With dm-ioband, as the number of writers in group 2 increased, bandwidth was given to those writes over the reads running in group 1. This had two bad effects: read throughput went down, and overall disk throughput also went down. So the reader did not get fairness, and at the same time overall throughput dropped. Hence it is probably not a good idea to skip anticipation and always let other groups dispatch on rotational media. In contrast, the io scheduler based controller seems to be steady: the reader does not suffer as the number of writers in the second group increases, and overall disk throughput also remains stable.

Following is the sample script I used for the above test.
*******************************************************************
launch_writers() {
        nr_writers=$1
        for ((j=1;j<=$nr_writers;j++)); do
                dd if=/dev/zero of=/mnt/sdd2/writefile$j bs=4K &
                # echo "launched writer $!"
        done
}

do_test () {
        nr_writers=$1
        sync
        echo 3 > /proc/sys/vm/drop_caches
        echo noop > /sys/block/sdd/queue/scheduler
        echo cfq > /sys/block/sdd/queue/scheduler
        dmsetup message ioband1 0 reset
        dmsetup message ioband2 0 reset

        #launch a sequential reader in sdd1
        dd if=/mnt/sdd1/4G-file of=/dev/null &
        echo "launched reader $!"

        launch_writers $nr_writers
        echo "launched $nr_writers writers"
        echo "waiting for 20 seconds"
        sleep 20
        dmsetup status
        killall dd > /dev/null 2>&1
}

for ((i=1;i<8;i++)); do
        do_test $i
        echo
done
*********************************************************************

^ permalink raw reply [flat|nested] 80+ messages in thread
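Ryo's point quoted earlier in this message is that a group is treated as inactive, and its spare bandwidth given away, unless it keeps enough IO in flight. A rough illustration of his "run two dd threads in the higher group" suggestion (not from the original mails), using the same mount points as the original setup but purely hypothetical file names:

    # keep two readers in flight in the higher-weight group (ioband1) so
    # dm-ioband counts it as active and the configured weights can apply;
    # file names below are placeholders, not from the original test
    dd if=/mnt/sdd1/file-a of=/dev/null &
    dd if=/mnt/sdd1/file-b of=/dev/null &
    dd if=/mnt/sdd2/file-c of=/dev/null &
    wait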
* Re: ioband: Limited fairness and weak isolation between groups
  2009-09-16 4:45 ` Vivek Goyal
  (?)
@ 2009-09-18 7:33 ` Ryo Tsuruta
  -1 siblings, 0 replies; 80+ messages in thread
From: Ryo Tsuruta @ 2009-09-18 7:33 UTC (permalink / raw)
To: vgoyal
Cc: linux-kernel, dm-devel, jens.axboe, agk, akpm, nauman, guijianfeng, riel, jmoyer, balbir

Hi Vivek,

Vivek Goyal <vgoyal@redhat.com> wrote:
> I ran following test. Created two groups of weight 100 each and put a
> sequential dd reader in first group and put buffered writers in second
> group and let it run for 20 seconds and observed at the end of 20 seconds
> which group got how much work done. I ran this test multiple times, while
> increasing the number of writers by one each time. Did test this with
> dm-ioband and with io scheduler based io controller patches.

I did the same test on my environment (2.6.31 + dm-ioband v1.13.0) and here are the results.

The number of sectors transferred:

  writers     read    write     total
        1   800696   588600   1389296
        2   747704   430736   1178440
        3   757136   455808   1212944
        4   704888   562912   1267800
        5   788760   387672   1176432
        6   730664   495832   1226496
        7   765864   427384   1193248

I got different results from yours; the total throughput did not decrease as the number of writers increased. I've attached the outputs of the test script. Please note that the format of "dmsetup status" has been changed to be like the /sys/block/dev/stat file.

launched reader 3567
launched 1 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 85 0 680 0 100087 0 800696 0 384 0 0
ioband1: 0 112455000 ioband share1 -1 4673 0 588600 0 0 0 0 0 0 0 0

launched reader 3575
launched 2 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 197 0 1576 0 93463 0 747704 0 384 0 0
ioband1: 0 112455000 ioband share1 -1 3420 0 430736 0 0 0 0 0 0 0 0

launched reader 3584
launched 3 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 237 0 1896 0 94642 0 757136 0 384 0 0
ioband1: 0 112455000 ioband share1 -1 3614 0 455808 0 0 0 0 0 0 0 0

launched reader 3594
launched 4 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 207 0 1656 0 88111 0 704888 0 159 0 0
ioband1: 0 112455000 ioband share1 -1 4462 0 562912 0 0 0 0 0 0 0 0

launched reader 3605
launched 5 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 234 0 1872 0 98595 0 788760 0 384 0 0
ioband1: 0 112455000 ioband share1 -1 3077 0 387672 0 0 0 0 0 0 0 0

launched reader 3618
launched 6 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 215 0 1720 0 91333 0 730664 0 384 0 0
ioband1: 0 112455000 ioband share1 -1 3937 0 495832 0 0 0 0 0 0 0 0

launched reader 3631
launched 7 writers
waiting for 20 seconds
ioband2: 0 112455000 ioband share1 -1 245 0 1960 0 95733 0 765864 0 384 0 0
ioband1: 0 112455000 ioband share1 -1 3391 0 427384 0 0 0 0 0 0 0 0

Thanks,
Ryo Tsuruta

^ permalink raw reply [flat|nested] 80+ messages in thread
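Ryo notes that the "dmsetup status" output now follows the /sys/block/dev/stat layout. A short sketch (not from the original mails) for pulling the per-group sector counts out of these new-format lines, assuming the eleven counters after the "-1" follow that layout (read I/Os, read merges, read sectors, read ticks, write I/Os, write merges, write sectors, and so on):

    dmsetup status | awk '
        # under the stat-file layout assumption: $9 = read sectors, $13 = write sectors
        $4 == "ioband" { printf "%s read_sectors=%s write_sectors=%s\n", $1, $9, $13 }'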