* [PATCH] Time sliced cfq with basic io priorities
From: Jens Axboe @ 2004-12-13 12:50 UTC (permalink / raw)
To: Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 2541 bytes --]
Hi,
I added basic io priority support to the time sliced cfq base. Right now
this is just a proof of concept; the interface for setting/querying io
prio will change. There are 8 basic io priorities, 0 being the highest
prio and 7 the lowest. The scheduling class is best effort; in the future
there will be a realtime class as well (and hence the need to change
sys_ioprio_set etc). If a process hasn't set its io priority explicitly,
io priority is determined from the process nice level. CPU nice level of
0 yields io priority 4, cpu nice -20 gives you 0, and finally cpu nice
19 will give you an io priority of 7. Values in-between are
appropriately scaled. If a process sets its io priority explicitly, that
value is used from then on.
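The nice-to-ioprio scaling described above can be captured in a one-liner. This is only an illustration of the three anchor points given in the text; the helper name and the clamping are mine, not the actual kernel code:

```c
#include <assert.h>

/* Hypothetical helper, not the kernel's: map cpu nice -20..19 onto
 * io priority 0..7. (nice + 20) / 5 reproduces all three anchor
 * points from the text: -20 -> 0, 0 -> 4, 19 -> 7, with the values
 * in-between scaled linearly. */
int nice_to_ioprio(int nice)
{
	if (nice < -20)
		nice = -20;
	if (nice > 19)
		nice = 19;
	return (nice + 20) / 5;
}
```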
A test run with 7 readers at various priorities:
thread1 (read): err=0, prio=0 maxl=634msec, run=30012msec, bw=5884KiB/sec
thread2 (read): err=0, prio=1 maxl=650msec, run=30041msec, bw=5102KiB/sec
thread3 (read): err=0, prio=1 maxl=646msec, run=30057msec, bw=5062KiB/sec
thread4 (read): err=0, prio=3 maxl=687msec, run=30079msec, bw=3551KiB/sec
thread5 (read): err=0, prio=6 maxl=750msec, run=30208msec, bw=1253KiB/sec
thread6 (read): err=0, prio=3 maxl=690msec, run=30100msec, bw=3562KiB/sec
thread7 (read): err=0, prio=4 maxl=758msec, run=30181msec, bw=2631KiB/sec
Run status:
READ: io=775MiB, aggrb=26927, minl=634, maxl=758, minb=1253, maxb=5884, mint=30012msec, maxt=30208msec
Note that aggregate bandwidth stays the same as without io priorities.
Only io scheduling cares about the io priority currently; request
allocation policy, queue congestion, etc. don't yet.
I have attached a sample ionice.c file, so that you can do:
# ionice -n3 some_process
which will run that process at io priority 3.
Other changes:
- Disable TCQ in the hardware/driver by default. Can be changed (as
always) with the max_depth setting. If you do that, don't expect
fairness or priorities to work as well.
- Import thinktime stats from AS. We use this to determine when to
preempt a queue during its idle window.
- Kill find_best_crq setting. It was on by default before, and it would
be a bug if it didn't work well.
- Add ability for a given process to preempt another process slice.
- Allow idle window to slide, if there are no other potential queues we
could service requests from.
- Various little cleanups and optimizations.
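The thinktime import mentioned in the list above works roughly like the AS accounting: samples go into a decayed fixed-point sum so the mean tracks recent behaviour, and a queue whose mean thinktime exceeds the idle window is a candidate for preemption instead of idling. A userspace sketch, with illustrative field and function names rather than the actual cfq code:

```c
/* Sketch of AS-style thinktime accounting; names are illustrative,
 * not the real cfq/AS fields. Both counters are kept in fixed point
 * (scaled by 256) and decayed by 1/8 per sample, so the mean of a
 * steady stream converges to the sample value. */
struct thinktime {
	unsigned long total;    /* decayed, fixed-point sum of samples */
	unsigned long samples;  /* decayed, fixed-point sample count */
};

void ttime_add(struct thinktime *tt, unsigned long sample)
{
	tt->total = (7 * tt->total + 256 * sample) / 8;
	tt->samples = (7 * tt->samples + 256) / 8;
}

unsigned long ttime_mean(const struct thinktime *tt)
{
	return tt->samples ? tt->total / tt->samples : 0;
}
```

A queue would then be preempted during its idle window when `ttime_mean()` says it habitually takes longer to issue its next request than we are willing to wait.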
2.6.10-rc2-mm4 patch:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc2-mm4/cfq-time-slices-10-2.6.10-rc2-mm4.gz
--
Jens Axboe
[-- Attachment #2: ionice.c --]
[-- Type: text/plain, Size: 1154 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <getopt.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <asm/unistd.h>
extern int sys_ioprio_set(int);
extern int sys_ioprio_get(void);
#if defined(__i386__)
#define __NR_ioprio_set 295
#define __NR_ioprio_get 296
#elif defined(__ppc__)
#define __NR_ioprio_set 278
#define __NR_ioprio_get 279
#elif defined(__x86_64__)
#define __NR_ioprio_set 254
#define __NR_ioprio_get 255
#elif defined(__ia64__)
#define __NR_ioprio_set 1274
#define __NR_ioprio_get 1275
#else
#error "Unsupported arch"
#endif
_syscall1(int, ioprio_set, int, ioprio);
_syscall0(int, ioprio_get);
int main(int argc, char *argv[])
{
	int ioprio = 2, set = 0;
	int c;

	while ((c = getopt(argc, argv, "+n:")) != EOF) {
		switch (c) {
		case 'n':
			ioprio = strtol(optarg, NULL, 10);
			set = 1;
			break;
		}
	}

	if (!set) {
		/* no -n given: just report the current io priority */
		ioprio = ioprio_get();
		if (ioprio == -1)
			perror("ioprio_get");
		else
			printf("%d\n", ioprio);
	} else if (argv[optind]) {
		if (ioprio_set(ioprio) == -1) {
			perror("ioprio_set");
			return 1;
		}
		execvp(argv[optind], &argv[optind]);
		/* execvp only returns on error */
		perror("execvp");
		return 1;
	}

	return 0;
}
* Re: [PATCH] Time sliced cfq with basic io priorities
From: Jens Axboe @ 2004-12-13 13:09 UTC (permalink / raw)
To: Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 323 bytes --]
On Mon, Dec 13 2004, Jens Axboe wrote:
> 2.6.10-rc2-mm4 patch:
So 2.6.10-rc3-mm1 is out, I notice; here's a patch for that:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3-mm1/cfq-time-slices-10-2.6.10-rc3-mm1.gz
And an updated ionice.c attached, the syscall numbers changed.
--
Jens Axboe
[-- Attachment #2: ionice.c --]
[-- Type: text/plain, Size: 1154 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <getopt.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <asm/unistd.h>
extern int sys_ioprio_set(int);
extern int sys_ioprio_get(void);
#if defined(__i386__)
#define __NR_ioprio_set 294
#define __NR_ioprio_get 295
#elif defined(__ppc__)
#define __NR_ioprio_set 277
#define __NR_ioprio_get 278
#elif defined(__x86_64__)
#define __NR_ioprio_set 254
#define __NR_ioprio_get 255
#elif defined(__ia64__)
#define __NR_ioprio_set 1274
#define __NR_ioprio_get 1275
#else
#error "Unsupported arch"
#endif
_syscall1(int, ioprio_set, int, ioprio);
_syscall0(int, ioprio_get);
int main(int argc, char *argv[])
{
	int ioprio = 2, set = 0;
	int c;

	while ((c = getopt(argc, argv, "+n:")) != EOF) {
		switch (c) {
		case 'n':
			ioprio = strtol(optarg, NULL, 10);
			set = 1;
			break;
		}
	}

	if (!set) {
		/* no -n given: just report the current io priority */
		ioprio = ioprio_get();
		if (ioprio == -1)
			perror("ioprio_get");
		else
			printf("%d\n", ioprio);
	} else if (argv[optind]) {
		if (ioprio_set(ioprio) == -1) {
			perror("ioprio_set");
			return 1;
		}
		execvp(argv[optind], &argv[optind]);
		/* execvp only returns on error */
		perror("execvp");
		return 1;
	}

	return 0;
}
* Re: [PATCH] Time sliced cfq with basic io priorities
From: Jens Axboe @ 2004-12-13 17:57 UTC (permalink / raw)
To: Linux Kernel
On Mon, Dec 13 2004, Jens Axboe wrote:
> On Mon, Dec 13 2004, Jens Axboe wrote:
> > 2.6.10-rc2-mm4 patch:
>
> So 2.6.10-rc3-mm1 is out I notice, here's a patch for that:
>
> http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3-mm1/cfq-time-slices-10-2.6.10-rc3-mm1.gz
>
> And an updated ionice.c attached, the syscall numbers changed.
Posted -11 for -mm and -BK as well. Changes:
- Preemption fairness fixes
- Enable preemption
For 2.6.10-rc3-mm1:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3-mm1/cfq-time-slices-11-2.6.10-rc3-mm1.gz
For 2.6-BK:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3/cfq-time-slices-11.gz
Note that the syscall numbers are different yet again; I will
consolidate these in the next release. For now, find your sys_ioprio_set/get
numbers in include/asm-<your arch>/unistd.h and change ionice for your
arch appropriately (if in doubt, just mail me).
--
Jens Axboe
* Re: [PATCH] Time sliced cfq with basic io priorities
From: Jens Axboe @ 2004-12-14 13:37 UTC (permalink / raw)
To: Linux Kernel
Hi,
Version -12 has been uploaded. Changes:
- Small optimization to choose next request logic
- An idle queue that exited would waste time for the next process
- Request allocation changes. Should get a smooth stream for writes now,
not as bursty as before. Also simplified the may_queue/check_waiters
logic, rely more on the regular block rq allocation congestion and
don't waste sys time doing multiple wakeups.
- Fix compilation on x86_64
No io priority specific fixes, the above are all to improve the cfq time
slicing.
For 2.6.10-rc3-mm1:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3-mm1/cfq-time-slices-12-2.6.10-rc3-mm1.gz
For 2.6-BK:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3/cfq-time-slices-12.gz
--
Jens Axboe
* Re: [PATCH] Time sliced cfq with basic io priorities
From: Paul E. McKenney @ 2004-12-14 21:31 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linux Kernel
On Tue, Dec 14, 2004 at 02:37:25PM +0100, Jens Axboe wrote:
> Hi,
>
> Version -12 has been uploaded. Changes:
>
> - Small optimization to choose next request logic
>
> - An idle queue that exited would waste time for the next process
>
> - Request allocation changes. Should get a smooth stream for writes now,
> not as bursty as before. Also simplified the may_queue/check_waiters
> logic, rely more on the regular block rq allocation congestion and
> don't waste sys time doing multiple wakeups.
>
> - Fix compilation on x86_64
>
> No io priority specific fixes, the above are all to improve the cfq time
> slicing.
>
> For 2.6.10-rc3-mm1:
>
> http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3-mm1/cfq-time-slices-12-2.6.10-rc3-mm1.gz
>
> For 2.6-BK:
>
> http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3/cfq-time-slices-12.gz
OK... I confess, I am confused...
I see the comment stating that only one thread updates, hence no need
for locking. But I can't find the readers! There is a section of
code under rcu_read_lock(), but this same function updates the list
as well. If there really is only one updater, then the rcu_read_lock()
is not needed, because rcu_read_lock() is only required to protect against
concurrent deletion.
Either way, in cfq_exit_io_context(), the list_for_each_safe_rcu() should
be able to be simply list_for_each_safe(), since this is apparently the
sole updater thread, so no concurrent updates are possible.
If only one task is referencing the list at all, no need for RCU or for
any other synchronization mechanism. If multiple threads are referencing
the list, I cannot find any pure readers. If multiple threads are updating
the list, I don't see how they are excluding each other.
Any enlightenment available? I most definitely need a clue here...
Thanx, Paul
> --
> Jens Axboe
* Re: [PATCH] Time sliced cfq with basic io priorities
From: Jens Axboe @ 2004-12-15 6:36 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Linux Kernel
On Tue, Dec 14 2004, Paul E. McKenney wrote:
> On Tue, Dec 14, 2004 at 02:37:25PM +0100, Jens Axboe wrote:
> > Hi,
> >
> > Version -12 has been uploaded. Changes:
> >
> > - Small optimization to choose next request logic
> >
> > - An idle queue that exited would waste time for the next process
> >
> > - Request allocation changes. Should get a smooth stream for writes now,
> > not as bursty as before. Also simplified the may_queue/check_waiters
> > logic, rely more on the regular block rq allocation congestion and
> > don't waste sys time doing multiple wakeups.
> >
> > - Fix compilation on x86_64
> >
> > No io priority specific fixes, the above are all to improve the cfq time
> > slicing.
> >
> > For 2.6.10-rc3-mm1:
> >
> > http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3-mm1/cfq-time-slices-12-2.6.10-rc3-mm1.gz
> >
> > For 2.6-BK:
> >
> > http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.6/2.6.10-rc3/cfq-time-slices-12.gz
>
> OK... I confess, I am confused...
>
> I see the comment stating that only one thread updates, hence no need
> for locking. But I can't find the readers! There is a section of
> code under rcu_read_lock(), but this same function updates the list
> as well. If there really is only one updater, then the rcu_read_lock()
> is not needed, because rcu_read_lock() is only required to protect against
> concurrent deletion.
>
> Either way, in cfq_exit_io_context(), the list_for_each_safe_rcu() should
> be able to be simply list_for_each_safe(), since this is apparently the
> sole updater thread, so no concurrent updates are possible.
>
> If only one task is referencing the list at all, no need for RCU or for
> any other synchronization mechanism. If multiple threads are referencing
> the list, I cannot find any pure readers. If multiple threads are updating
> the list, I don't see how they are excluding each other.
>
> Any enlightenment available? I most definitely need a clue here...
No, you are about right :-)
The RCU stuff can go again, because I moved everything to happen under
the same task. The section under rcu_read_lock() is the reader; it just
later on moves the hot entry to the front as well, which does indeed mean
it's buggy if there were concurrent updaters. So that's why it's in a
state of being a little messy right now.
A note on the list itself - a task has a cfq_io_context per queue it's
doing io against, and it needs to be looked up when this process queues
io. The task sets this up itself on first io and tears it down on exit.
So only the task itself ever updates or searches this list.
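To illustrate that point, here is a minimal userspace model of a single-owner list with a move-to-front lookup (hypothetical types and names, not the kernel structures): because only the owning task ever searches or updates it, plain pointer surgery is safe and neither rcu_read_lock() nor list_for_each_safe_rcu() buys anything.

```c
#include <stddef.h>

/* Illustrative stand-in for cfq_io_context: one entry per queue
 * the task does io against. Only the owning task ever touches the
 * list, so no locking or RCU is needed, even though the lookup
 * mutates the list by moving the hot entry to the front. */
struct cic {
	int key;          /* stands in for the queue this io targets */
	struct cic *next;
};

/* search with move-to-front; only ever called by the list's owner */
struct cic *cic_lookup(struct cic **head, int key)
{
	struct cic **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if ((*pp)->key == key) {
			struct cic *c = *pp;

			*pp = c->next;    /* unlink from current spot */
			c->next = *head;  /* reinsert at the front */
			*head = c;
			return c;
		}
	}
	return NULL;
}
```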
--
Jens Axboe
* Re: [PATCH] Time sliced cfq with basic io priorities
From: Paul E. McKenney @ 2004-12-15 15:18 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linux Kernel
On Wed, Dec 15, 2004 at 07:36:28AM +0100, Jens Axboe wrote:
> On Tue, Dec 14 2004, Paul E. McKenney wrote:
> > If only one task is referencing the list at all, no need for RCU or for
> > any other synchronization mechanism. If multiple threads are referencing
> > the list, I cannot find any pure readers. If multiple threads are updating
> > the list, I don't see how they are excluding each other.
> >
> > Any enlightenment available? I most definitely need a clue here...
>
> No, you are about right :-)
>
> The RCU stuff can go again, because I moved everything to happen under
> the same task. The section under rcu_read_lock() is the reader; it just
> later on moves the hot entry to the front as well, which does indeed mean
> it's buggy if there were concurrent updaters. So that's why it's in a
> state of being a little messy right now.
>
> A note on the list itself - a task has a cfq_io_context per queue it's
> doing io against, and it needs to be looked up when this process queues
> io. The task sets this up itself on first io and tears it down on exit.
> So only the task itself ever updates or searches this list.
Whew!!! I feel much better!
Thanx, Paul