Hi,

I added basic io priority support to the time sliced cfq base. Right now
this is just proof of concept, the interface for setting/querying io
prio will change.

There are 8 basic io priorities now, 0 being the highest prio and 7 the
lowest. The scheduling type is best effort, in the future there will be
a realtime class as well (and hence the need to change sys_ioprio_set
etc).

If a process hasn't set its io priority explicitly, the io priority is
determined from the process nice level. CPU nice level of 0 yields io
priority 4, cpu nice -20 gives you 0, and finally cpu nice 19 will give
you an io priority of 7. Values in-between are appropriately scaled. If
a process sets its io priority explicitly, that value is used from then
on.

A test run with 7 readers at various priorities:

thread1 (read): err=0, prio=0 maxl=634msec, run=30012msec, bw=5884KiB/sec
thread2 (read): err=0, prio=1 maxl=650msec, run=30041msec, bw=5102KiB/sec
thread3 (read): err=0, prio=1 maxl=646msec, run=30057msec, bw=5062KiB/sec
thread4 (read): err=0, prio=3 maxl=687msec, run=30079msec, bw=3551KiB/sec
thread5 (read): err=0, prio=6 maxl=750msec, run=30208msec, bw=1253KiB/sec
thread6 (read): err=0, prio=3 maxl=690msec, run=30100msec, bw=3562KiB/sec
thread7 (read): err=0, prio=4 maxl=758msec, run=30181msec, bw=2631KiB/sec

Run status:
READ: io=775MiB, aggrb=26927, minl=634, maxl=758, minb=1253, maxb=5884, mint=30012msec, maxt=30208msec

Note that aggregate bandwidth stays the same as without io priorities.

Only io scheduling cares about the io priority currently; request
allocation policy, queue congestion etc don't yet.

I have attached a sample ionice.c file, so that you can do:

    # ionice -n3 some_process

which will run that process at io priority 3.

Other changes:

- Disable TCQ in the hardware/driver by default. Can be changed (as
  always) with the max_depth setting. If you do that, don't expect
  fairness or priorities to work as well.

- Import thinktime stats from AS. We use this to determine when to
  preempt a queue during its idle window.

- Kill find_best_crq setting. It was on by default before, and it would
  be a bug if it didn't work well.

- Add ability for a given process to preempt another process's slice.

- Allow idle window to slide, if there are no other potential queues we
  could service requests from.

- Various little cleanups and optimizations.

2.6.10-rc2-mm4 patch:

http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc2-mm4/cfq-time-slices-10-2.6.10-rc2-mm4.gz

--
Jens Axboe
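
For illustration, the nice-level fallback described above can be sketched
as follows. This is only a sketch under stated assumptions: the mail gives
three fixed points (nice -20 gives 0, nice 0 gives 4, nice 19 gives 7) and
says values in between are scaled, so the formula used here,
(nice + 20) / 5 with integer division, is an assumption that happens to
match those three points. The function names are hypothetical and are not
taken from the patch.

```python
from typing import Optional


def nice_to_ioprio(nice: int) -> int:
    """Hypothetical helper: map a CPU nice level (-20..19) to an io prio (0..7).

    Assumed formula, not taken from the patch: (nice + 20) // 5.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice level must be in the range -20..19")
    return (nice + 20) // 5


def effective_ioprio(nice: int, explicit_prio: Optional[int] = None) -> int:
    """An explicitly set io priority wins; otherwise derive it from nice."""
    if explicit_prio is not None:
        if not 0 <= explicit_prio <= 7:
            raise ValueError("io priority must be in the range 0..7")
        return explicit_prio
    return nice_to_ioprio(nice)


if __name__ == "__main__":
    # Show the mapping at a few nice levels, including the three from the mail.
    for nice in (-20, -10, 0, 10, 19):
        print(f"nice {nice:>3} -> io prio {nice_to_ioprio(nice)}")
```

With this assumed scaling, every nice level from -20 to 19 lands in the
0..7 range, and a process that has set its io priority explicitly (for
example via ionice -n3) keeps that value regardless of its nice level.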