linux-kernel.vger.kernel.org archive mirror
* cfq-iosched: two questions about the hrtimer version of CFQ
@ 2017-03-07  0:11 Hou Tao
  2017-03-08 14:24 ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Hou Tao @ 2017-03-07  0:11 UTC (permalink / raw)
  To: Jan Kara, linux-block; +Cc: Jens Axboe, linux-kernel, jmoyer, Vivek Goyal

Hi Jan and list,

When testing the hrtimer version of CFQ, we found a performance degradation
problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at
least 1 jiffie instead of 1 ns").

The following is the test process:

* filesystem and block device
	* XFS + /dev/sda mounted on /tmp/sda
* CFQ configuration
	* default configuration
* run "fio ./cfq.job"
* fio job configuration cfq.job
	[global]
	bs=4k
	ioengine=psync
	iodepth=1
	direct=1
	rw=randwrite
	time_based
	runtime=15
	cgroup_nodelete=1
	group_reporting=1

	[cfq_a]
	filename=/tmp/sda/cfq_a.dat
	size=2G
	cgroup_weight=500
	cgroup=cfq_a
	thread=1
	numjobs=2

	[cfq_b]
	new_group
	filename=/tmp/sda/cfq_b.dat
	size=2G
	rate=4m
	cgroup_weight=500
	cgroup=cfq_b
	thread=1
	numjobs=2
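
For reference, the ceiling that rate=4m places on cfq_b can be worked out
from the job file. This is a minimal sketch, assuming fio applies `rate` per
job and reads `4m` as 4 MiB/s (1024-based units); the helper name `rate_cap`
is made up for illustration.

```python
KIB = 1024


def rate_cap(rate_bytes_per_job, numjobs, block_size):
    """Aggregate bandwidth (KiB/s) and IOPS ceiling for a rate-limited group."""
    total = rate_bytes_per_job * numjobs
    return total / KIB, total / block_size


# Values taken from the [cfq_b] section: rate=4m, numjobs=2, bs=4k.
bw_kib, iops = rate_cap(4 * KIB * KIB, numjobs=2, block_size=4 * KIB)
print(f"cfq_b ceiling: {bw_kib:.0f} KiB/s, {iops:.0f} IOPS")
```

Under those assumptions the ceiling is very close to what fio reports for
cfq_b in both runs, which suggests cfq_b is limited by its rate setting
rather than by the scheduler.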

The following is the test result:
* with 0b31c10:
	* fio report
		cfq_a: bw=5312.6KB/s, iops=1328
		cfq_b: bw=8192.6KB/s, iops=2048

	* blkcg debug files
		./cfq_a/blkio.group_wait_time:8:0 12062571233
		./cfq_b/blkio.group_wait_time:8:0 155841600
		./cfq_a/blkio.io_serviced:Total 19922
		./cfq_b/blkio.io_serviced:Total 30722
		./cfq_a/blkio.time:8:0 19406083246
		./cfq_b/blkio.time:8:0 19417146869

* without 0b31c10:
	* fio report
		cfq_a: bw=21670KB/s, iops=5417
		cfq_b: bw=8191.2KB/s, iops=2047

	* blkcg debug files
		./cfq_a/blkio.group_wait_time:8:0 5798452504
		./cfq_b/blkio.group_wait_time:8:0 5131844007
		./cfq_a/blkio.io_serviced:8:0 Write 81261
		./cfq_b/blkio.io_serviced:8:0 Write 30722
		./cfq_a/blkio.time:8:0 5642608173
		./cfq_b/blkio.time:8:0 5849949812
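
As a rough cross-check, the counters above can be reduced to ratios between
the two groups. This is only a sketch: the numbers are copied from the blkcg
files listed above, the `summarize` helper and dictionary keys are made up
for illustration, and no unit is assumed for blkio.time, so only ratios are
computed.

```python
# Counters copied from the blkcg debug files listed above.
# The keys and the summarize() helper are illustrative only.
RESULTS = {
    "with 0b31c10": {
        "cfq_a": {"io_serviced": 19922, "time": 19406083246},
        "cfq_b": {"io_serviced": 30722, "time": 19417146869},
    },
    "without 0b31c10": {
        "cfq_a": {"io_serviced": 81261, "time": 5642608173},
        "cfq_b": {"io_serviced": 30722, "time": 5849949812},
    },
}


def summarize(groups):
    """Return (cfq_a/cfq_b time ratio, cfq_a/cfq_b IO ratio) for one run."""
    a, b = groups["cfq_a"], groups["cfq_b"]
    time_ratio = a["time"] / b["time"]
    io_ratio = a["io_serviced"] / b["io_serviced"]
    return time_ratio, io_ratio


for label, groups in RESULTS.items():
    time_ratio, io_ratio = summarize(groups)
    print(f"{label}: time a/b = {time_ratio:.2f}, IOs a/b = {io_ratio:.2f}")
```

In both runs the two groups are charged nearly equal blkio.time, which is
consistent with the equal cgroup_weight. What changes is the number of IOs
cfq_a completes relative to cfq_b.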

We want to know why you reverted the minimal used slice to 1 jiffy
when the slice has not been allocated. Does charging 1 ns lead to performance
regressions or something similar? If not, I think we could revert the minimal
slice to 1 ns again.

Another question is about the time comparison in the CFQ code. The non-hrtimer
version of CFQ uses time_after or time_before when possible. Why doesn't the
hrtimer version use the equivalent time_after64/time_before64? Can
ktime_get_ns() ensure there will be no wrapping problem?

Thanks very much.

Regards,

Tao


* Re: cfq-iosched: two questions about the hrtimer version of CFQ
  2017-03-07  0:11 cfq-iosched: two questions about the hrtimer version of CFQ Hou Tao
@ 2017-03-08 14:24 ` Jan Kara
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Kara @ 2017-03-08 14:24 UTC (permalink / raw)
  To: Hou Tao
  Cc: Jan Kara, linux-block, Jens Axboe, linux-kernel, jmoyer, Vivek Goyal

Hi,

On Tue 07-03-17 08:11:44, Hou Tao wrote:
> When testing the hrtimer version of CFQ, we found a performance degradation
> problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at
> least 1 jiffie instead of 1 ns").
> 
> The following is the test process:
> 
> * filesystem and block device
> 	* XFS + /dev/sda mounted on /tmp/sda
> * CFQ configuration
> 	* default configuration
> * run "fio ./cfq.job"
> * fio job configuration cfq.job
> 	[global]
> 	bs=4k
> 	ioengine=psync
> 	iodepth=1
> 	direct=1
> 	rw=randwrite
> 	time_based
> 	runtime=15
> 	cgroup_nodelete=1
> 	group_reporting=1
> 
> 	[cfq_a]
> 	filename=/tmp/sda/cfq_a.dat
> 	size=2G
> 	cgroup_weight=500
> 	cgroup=cfq_a
> 	thread=1
> 	numjobs=2
> 
> 	[cfq_b]
> 	new_group
> 	filename=/tmp/sda/cfq_b.dat
> 	size=2G
> 	rate=4m
> 	cgroup_weight=500
> 	cgroup=cfq_b
> 	thread=1
> 	numjobs=2
> 
> The following is the test result:
> * with 0b31c10:
> 	* fio report
> 		cfq_a: bw=5312.6KB/s, iops=1328
> 		cfq_b: bw=8192.6KB/s, iops=2048
> 
> 	* blkcg debug files
> 		./cfq_a/blkio.group_wait_time:8:0 12062571233
> 		./cfq_b/blkio.group_wait_time:8:0 155841600
> 		./cfq_a/blkio.io_serviced:Total 19922
> 		./cfq_b/blkio.io_serviced:Total 30722
> 		./cfq_a/blkio.time:8:0 19406083246
> 		./cfq_b/blkio.time:8:0 19417146869
> 
> * without 0b31c10:
> 	* fio report
> 		cfq_a: bw=21670KB/s, iops=5417
> 		cfq_b: bw=8191.2KB/s, iops=2047
> 
> 	* blkcg debug files
> 		./cfq_a/blkio.group_wait_time:8:0 5798452504
> 		./cfq_b/blkio.group_wait_time:8:0 5131844007
> 		./cfq_a/blkio.io_serviced:8:0 Write 81261
> 		./cfq_b/blkio.io_serviced:8:0 Write 30722
> 		./cfq_a/blkio.time:8:0 5642608173
> 		./cfq_b/blkio.time:8:0 5849949812
> 
> We want to know why you reverted the minimal used slice to 1 jiffy
> when the slice has not been allocated. Does charging 1 ns lead to performance
> regressions or something similar? If not, I think we could revert the minimal
> slice to 1 ns again.

So I reverted to accounting 1 jiffie because it was that way before my
commit 9a7f38c42c2b "cfq-iosched: Convert from jiffies to nanoseconds". I
am not aware of any particular issue caused by charging only 1 ns. However,
it certainly underestimates the time used by the cfqq for that one request
and could possibly be abused by malicious cgroups. How much should be
accounted to the cfqq when no request has completed yet is questionable, and
frankly I don't have a good answer for that.
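
To make the trade-off concrete, here is a small model of the "charge at
least X" choice named in the commit title. It is a sketch under stated
assumptions, not the kernel code: the function names, the HZ value of 250,
and the 200 us elapsed time are all made up for illustration.

```python
NSEC_PER_SEC = 1_000_000_000


def jiffy_ns(hz):
    """Length of one jiffy in nanoseconds for a given HZ."""
    return NSEC_PER_SEC // hz


def charged_ns(elapsed_ns, min_charge_ns):
    """Charge the elapsed time, but never less than the minimum charge."""
    return max(elapsed_ns, min_charge_ns)


HZ = 250            # assumption: not stated anywhere in this thread
elapsed = 200_000   # assumption: 200 us elapsed before the queue is charged

charge_1ns = charged_ns(elapsed, 1)
charge_1jiffy = charged_ns(elapsed, jiffy_ns(HZ))
print(charge_1ns, charge_1jiffy, charge_1jiffy / charge_1ns)

# With no measurable elapsed time, the floor is the whole charge.
print(charged_ns(0, 1), charged_ns(0, jiffy_ns(HZ)))
```

With these assumed numbers the 1-jiffy floor charges many times the elapsed
time, while the 1 ns floor charges almost nothing when nothing has been
measured yet. Whether this matches the reported regression depends on the
real HZ and request latency, which the thread does not state.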

> Another question is about the time comparison in the CFQ code. The
> non-hrtimer version of CFQ uses time_after or time_before when possible. Why
> doesn't the hrtimer version use the equivalent time_after64/time_before64?
> Can ktime_get_ns() ensure there will be no wrapping problem?

time_after64() and friends are for 64-bit jiffie values. CFQ now works
in nanoseconds, not jiffies, so you cannot use those functions. WRT
wrapping: 2^64 ns is ~584 years, so I'm not concerned about wrapping.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: cfq-iosched: two questions about the hrtimer version of CFQ
  2017-03-06 13:50 Hou Tao
@ 2017-03-07  1:22 ` Hou Tao
  0 siblings, 0 replies; 4+ messages in thread
From: Hou Tao @ 2017-03-07  1:22 UTC (permalink / raw)
  To: Jan Kara, linux-block; +Cc: axboe, linux-kernel, Vivek Goyal

Sorry for the resend, please refer to the later one.

On 2017/3/6 21:50, Hou Tao wrote:
> Hi Jan and list,
> 
> When testing the hrtimer version of CFQ, we found a performance degradation
> problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at
> least 1 jiffie instead of 1 ns").
> 
> The following is the test process:
> 
> * filesystem and block device
> 	* XFS + /dev/sda mounted on /tmp/sda
> * CFQ configuration
> 	* default configurations
> * fio job configuration
> 	[global]
> 	bs=4k
> 	ioengine=psync
> 	iodepth=1
> 	direct=1
> 	rw=randwrite
> 	time_based
> 	runtime=15
> 	cgroup_nodelete=1
> 	group_reporting=1
> 
> 	[cfq_a]
> 	filename=/tmp/sda/cfq_a.dat
> 	size=2G
> 	cgroup_weight=500
> 	cgroup=cfq_a
> 	thread=1
> 	numjobs=2
> 
> 	[cfq_b]
> 	new_group
> 	filename=/tmp/sda/cfq_b.dat
> 	size=2G
> 	rate=4m
> 	cgroup_weight=500
> 	cgroup=cfq_b
> 	thread=1
> 	numjobs=2
> 
> 
> The following is the test result:
> * with 0b31c10:
> 	* fio report
> 		cfq_a: bw=5312.6KB/s, iops=1328
> 		cfq_b: bw=8192.6KB/s, iops=2048
> 
> 	* blkcg debug files
> 		./cfq_a/blkio.group_wait_time:8:0 12062571233
> 		./cfq_b/blkio.group_wait_time:8:0 155841600
> 		./cfq_a/blkio.io_serviced:Total 19922
> 		./cfq_b/blkio.io_serviced:Total 30722
> 		./cfq_a/blkio.time:8:0 19406083246
> 		./cfq_b/blkio.time:8:0 19417146869
> 
> * without 0b31c10:
> 	* fio report
> 		cfq_a: bw=21670KB/s, iops=5417
> 		cfq_b: bw=8191.2KB/s, iops=2047
> 
> 	* blkcg debug files
> 		./cfq_a/blkio.group_wait_time:8:0 5798452504
> 		./cfq_b/blkio.group_wait_time:8:0 5131844007
> 		./cfq_a/blkio.io_serviced:8:0 Write 81261
> 		./cfq_b/blkio.io_serviced:8:0 Write 30722
> 		./cfq_a/blkio.time:8:0 5642608173
> 		./cfq_b/blkio.time:8:0 5849949812
> 
> We want to know why you reverted the minimal used slice to 1 jiffy
> when the slice has not been allocated. Does charging 1 ns lead to performance
> regressions or something similar? If not, I think we could revert the minimal
> slice to 1 ns again.
> 
> Another question is about the time comparison in the CFQ code. The
> non-hrtimer version of CFQ uses time_after or time_before when possible. Why
> doesn't the hrtimer version use the equivalent time_after64/time_before64?
> Can ktime_get_ns() ensure there will be no wrapping problem?
> 
> Thanks very much.
> 
> Regards,
> 
> Tao


