linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
@ 2003-07-26 16:09 Guillaume Chazarain
  2003-07-27  9:31 ` Mike Galbraith
  2003-07-27 10:26 ` Ingo Molnar
  0 siblings, 2 replies; 6+ messages in thread
From: Guillaume Chazarain @ 2003-07-26 16:09 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Davide Libenzi, Mike Galbraith, linux-kernel

Hi,

> - cycle accuracy (nanosec resolution) timekeeping within the scheduler. 
>   This fixes a number of audio artifacts (skipping) i've reproduced. I
>   dont think we can get away without going cycle accuracy - reading the
>   cycle counter adds some overhead, but it's acceptable. The first
>   nanosec-accuracy patch was done by Mike Galbraith - this patch is
>   different but similar in nature. I went further in also changing the
>   sleep_avg to be of nanosec resolution.

I'd like to have your opinion on this, in 2.6.0-test1 we have:

static inline void activate_task(task_t *p, runqueue_t *rq)
{
        long sleep_time = jiffies - p->last_run - 1;

I'd like to have it back to jiffies - p->last_run. See why:
suppose jiffies - p->last_run == 1, here are the extremum timescales
possible for this: (S: sleep, W: wake up)

             p->last_run     jiffies
time:      |-------------|------------->
longest:    S                         W   => sleep_time = 2
shortest:               S W               => sleep_time = 0

That is, when jiffies - p->last_run == 1, all we know is that sleep_time
is between 0 and 2.  And currently we take 0 for sleep_time in this case.
Wouldn't it be more accurate to take 1 as a better approximation for something
between 0 and 2?  This is a sensible thing to me, because it makes mplayer
move from max CPU hog to max interactive.  When I see that mplayer uses only
10% of the cpu, it makes some sense.
BTW, with your nanosecond resolution, mplayer is also max interactive.
Don't you find it strange to need a nanosecond resolution to evaluate a simple
integer in a [-5, 5] range?

> - more finegrained timeslices: there's now a timeslice 'sub unit' of 50 
>   usecs (TIMESLICE_GRANULARITY) - CPU hogs on the same priority level 
>   will roundrobin with this unit. This change is intended to make gaming
>   latencies shorter.

Strange that no one complained about the throughput.  Anyway I don't care too.

> - include scheduling latency in sleep bonus calculation. This change 
>
>   extends the sleep-average calculation to the period of time a task
>   spends on the runqueue but doesnt get scheduled yet, right after
>   wakeup. Note that tasks that were preempted (ie. not woken up) and are 
>   still on the runqueue do not get this benefit. This change closes one 
>   of the last hole in the dynamic priority estimation, it should result 
>   in interactive tasks getting more priority under heavy load. This
>   change also fixes the test-starve.c testcase from David Mosberger.

Right, it solves test-starve.c, but not irman2.c.  With sched-G4, when irman2
is launched, a kernel compile could take ages, I tried it.  After 3 hours it
was still with the first file (scripts/fixdep.c), it produced no .o file.
With the patch at the end a kernel compile takes one hour (with -j1 and -j16)
versus five minutes when nothing else runs (config: allnoconfig).

The idea in the patch is to keep a list of the tasks on the runqueue, without
the one currently running, and sorted by insertion date.  Before reinserting an
interactive task in the active array, we check that no task has waited too
long on this list.  Davide, does it implement the interactivity throttle you had
in mind?

It's very similar to EXPIRED_STARVING(), but it has the advantage of considering
all tasks, not only the expired. It seems that with irman2, tasks don't even
have the time to expire.


Regards,
Guillaume




--- linux-2.6.0-test1/include/linux/sched.h
+++ linux-2.6.0-test1-gfc/include/linux/sched.h
@@ -335,6 +335,8 @@
 
 	int prio, static_prio;
 	struct list_head run_list;
+	unsigned long activated_timestamp;
+	struct list_head activated_list;
 	prio_array_t *array;
 
 	unsigned long sleep_avg;
--- linux-2.6.0-test1/kernel/fork.c
+++ linux-2.6.0-test1-gfc/kernel/fork.c
@@ -839,6 +839,7 @@
 	p->proc_dentry = NULL;
 
 	INIT_LIST_HEAD(&p->run_list);
+	INIT_LIST_HEAD(&p->activated_list);
 
 	INIT_LIST_HEAD(&p->children);
 	INIT_LIST_HEAD(&p->sibling);
--- linux-2.6.0-test1/kernel/sched.c
+++ linux-2.6.0-test1-gfc/kernel/sched.c
@@ -74,14 +74,14 @@
 #define PRIO_BONUS_RATIO	25
 #define INTERACTIVE_DELTA	2
 #define MAX_SLEEP_AVG		(10*HZ)
-#define STARVATION_LIMIT	(10*HZ)
+#define STARVATION_LIMIT	MAX_TIMESLICE
 #define NODE_THRESHOLD		125
 
 /*
- * If a task is 'interactive' then we reinsert it in the active
- * array after it has expired its current timeslice. (it will not
- * continue to run immediately, it will still roundrobin with
- * other interactive tasks.)
+ * If a task is 'interactive' and there is no starvation, then we
+ * reinsert it in the active array after it has expired its current
+ * timeslice. (it will not continue to run immediately, it will still
+ * roundrobin with other interactive tasks.)
  *
  * This part scales the interactivity limit depending on niceness.
  *
@@ -157,11 +157,11 @@
  */
 struct runqueue {
 	spinlock_t lock;
-	unsigned long nr_running, nr_switches, expired_timestamp,
-			nr_uninterruptible;
+	unsigned long nr_running, nr_switches, nr_uninterruptible;
 	task_t *curr, *idle;
 	struct mm_struct *prev_mm;
 	prio_array_t *active, *expired, arrays[2];
+	struct list_head activated_list;
 	int prev_cpu_load[NR_CPUS];
 #ifdef CONFIG_NUMA
 	atomic_t *node_nr_running;
@@ -330,6 +330,17 @@
 	return prio;
 }
 
+static inline void activated_task(task_t *p, runqueue_t *rq)
+{
+	if (likely(p != rq->idle) &&
+	    likely(!rt_task(p)) && 
+	    likely(!p->activated_list.next)) {
+
+		list_add_tail(&p->activated_list, &rq->activated_list);
+		p->activated_timestamp = jiffies;
+	}
+}
+
 /*
  * __activate_task - move a task to the runqueue.
  */
@@ -337,6 +348,7 @@
 {
 	enqueue_task(p, rq->active);
 	nr_running_inc(rq);
+	activated_task(p, rq);
 }
 
 /*
@@ -563,6 +575,7 @@
 		p->array = current->array;
 		p->array->nr_active++;
 		nr_running_inc(rq);
+		activated_task(p, rq);
 	}
 	task_rq_unlock(rq, &flags);
 }
@@ -1155,15 +1168,34 @@
  * We place interactive tasks back into the active array, if possible.
  *
  * To guarantee that this does not starve expired tasks we ignore the
- * interactivity of a task if the first expired task had to wait more
- * than a 'reasonable' amount of time. This deadline timeout is
- * load-dependent, as the frequency of array switched decreases with
- * increasing number of running tasks:
- */
-#define EXPIRED_STARVING(rq) \
-		(STARVATION_LIMIT && ((rq)->expired_timestamp && \
-		(jiffies - (rq)->expired_timestamp >= \
-			STARVATION_LIMIT * ((rq)->nr_running) + 1)))
+ * interactivity of a task if a task had to wait more than a 'reasonable'
+ * amount of time. This deadline timeout is load-dependent, as the frequency
+ * of array switched decreases with increasing number of running tasks:
+ */
+static inline int runqueue_starvation(runqueue_t *rq)
+{
+	struct task_struct *first;
+	unsigned long delay;
+
+	if (list_empty(&rq->activated_list))
+		return 0;
+
+	first = list_entry(rq->activated_list.next, task_t, activated_list);
+	delay = jiffies - first->activated_timestamp;
+
+	if (unlikely(delay > rq->nr_running * STARVATION_LIMIT)) {
+		list_del(&first->run_list);
+		list_add(&first->run_list, first->array->queue + first->prio);
+
+		/* DEBUG */
+		printk("%d (%s) is starved (%lu ms) ", first->pid,
+		       first->comm, delay - rq->nr_running * MAX_TIMESLICE);
+		printk("punishing %d (%s)\n", current->pid, current->comm);
+		return 1;
+	}
+
+	return 0;
+}
 
 /*
  * This function gets called by the timer code, with HZ frequency.
@@ -1238,11 +1270,9 @@
 		p->time_slice = task_timeslice(p);
 		p->first_time_slice = 0;
 
-		if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
-			if (!rq->expired_timestamp)
-				rq->expired_timestamp = jiffies;
+		if (!TASK_INTERACTIVE(p) || unlikely(runqueue_starvation(rq)))
 			enqueue_task(p, rq->expired);
-		} else
+		else
 			enqueue_task(p, rq->active);
 	}
 out_unlock:
@@ -1251,6 +1281,28 @@
 	rebalance_tick(rq, 0);
 }
 
+static inline void prepare_switch(runqueue_t *rq, task_t *prev, task_t *next)
+{
+	/* prev has been preempted. */
+	if (prev->state == TASK_RUNNING ||
+	    unlikely(preempt_count() & PREEMPT_ACTIVE))
+		activated_task(prev, rq);
+
+	/* next is no more waiting for being scheduled. */
+	if (likely(next != rq->idle) &&
+	    likely(!rt_task(next))) {
+		if (likely(next->activated_list.next != NULL)) {
+			list_del(&next->activated_list);
+			next->activated_list.next = NULL;
+		} else {
+			/* DEBUG */
+			dump_stack();
+			printk("not activated: %d (%s)\n", next->pid,
+			next->comm);
+		}
+	}
+}
+
 void scheduling_functions_start_here(void) { }
 
 /*
@@ -1311,7 +1363,6 @@
 			goto pick_next_task;
 #endif
 		next = rq->idle;
-		rq->expired_timestamp = 0;
 		goto switch_tasks;
 	}
 
@@ -1323,7 +1374,6 @@
 		rq->active = rq->expired;
 		rq->expired = array;
 		array = rq->active;
-		rq->expired_timestamp = 0;
 	}
 
 	idx = sched_find_first_bit(array->bitmap);
@@ -1336,6 +1386,8 @@
 	RCU_qsctr(task_cpu(prev))++;
 
 	if (likely(prev != next)) {
+		prepare_switch(rq, prev, next);
+
 		rq->nr_switches++;
 		rq->curr = next;
 
@@ -2497,6 +2549,7 @@
 		rq = cpu_rq(i);
 		rq->active = rq->arrays;
 		rq->expired = rq->arrays + 1;
+		INIT_LIST_HEAD(&rq->activated_list);
 		spin_lock_init(&rq->lock);
 		INIT_LIST_HEAD(&rq->migration_queue);
 		atomic_set(&rq->nr_iowait, 0);





^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
  2003-07-26 16:09 [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency Guillaume Chazarain
@ 2003-07-27  9:31 ` Mike Galbraith
  2003-07-27 10:26 ` Ingo Molnar
  1 sibling, 0 replies; 6+ messages in thread
From: Mike Galbraith @ 2003-07-27  9:31 UTC (permalink / raw)
  To: Guillaume Chazarain; +Cc: Ingo Molnar, Davide Libenzi, linux-kernel

Hi,

At 06:09 PM 7/26/2003 +0200, Guillaume Chazarain wrote:

> > - include scheduling latency in sleep bonus calculation. This change
> >
> >   extends the sleep-average calculation to the period of time a task
> >   spends on the runqueue but doesnt get scheduled yet, right after
> >   wakeup. Note that tasks that were preempted (ie. not woken up) and are
> >   still on the runqueue do not get this benefit. This change closes one
> >   of the last hole in the dynamic priority estimation, it should result
> >   in interactive tasks getting more priority under heavy load. This
> >   change also fixes the test-starve.c testcase from David Mosberger.
>
>Right, it solves test-starve.c, but not irman2.c.  With sched-G4, when irman2
>is launched, a kernel compile could take ages, I tried it.  After 3 hours it
>was still with the first file (scripts/fixdep.c), it produced no .o file.
>With the patch at the end a kernel compile takes one hour (with -j1 and -j16)
>versus five minutes when nothing else runs (config: allnoconfig).
>
>The idea in the patch is to keep a list of the tasks on the runqueue, without
>the one currently running, and sorted by insertion date.  Before 
>reinserting an
>interactive task in the active array, we check that no task has waited too
>long on this list.  Davide, does it implement the interactivity throttle 
>you had
>in mind?
>
>It's very similar to EXPIRED_STARVING(), but it has the advantage of 
>considering
>all tasks, not only the expired. It seems that with irman2, tasks don't even
>have the time to expire.

True, irman2 is a very nasty little bugger.

Your method works well here.  With your patch in test1+G4, I can run irman2 
and still have a quite usable system.  It could possibly use a little 
refinement though.  If xmms happens to run out of slice at a time when you 
determine that starvation is in progress, it'll nail the innocent xmms (or 
other very light task).

I went a different route in tackling irman2 to avoid the pain of expiring X 
(glitch), but really there's no difference between X and xmms... they 
should both be run RR if you want 100% glitch free operation (and thus 
would be exempted from punishment under your method, and mine as well)

         -Mike 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
  2003-07-26 16:09 [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency Guillaume Chazarain
  2003-07-27  9:31 ` Mike Galbraith
@ 2003-07-27 10:26 ` Ingo Molnar
  1 sibling, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2003-07-27 10:26 UTC (permalink / raw)
  To: Guillaume Chazarain; +Cc: Davide Libenzi, Mike Galbraith, linux-kernel


On Sat, 26 Jul 2003, Guillaume Chazarain wrote:

> BTW, with your nanosecond resolution, mplayer is also max interactive.

good.

> Don't you find it strange to need a nanosecond resolution to evaluate a
> simple integer in a [-5, 5] range?

no, i dont find it strange, as the -5..5 integer range is only the end
result. What we really want is an accurate sleep average, the integral of
sleep times done over the last N seconds. Given that tasks can schedule at
very high frequencies these days, this needs accurate measurements. I used
to hope that a 1 msec sampling frequency would to be enough for this, but
when we fix the statistics cornercase you mention, another cornercase pops
up. And it's not _that_ hard to use the cycle counter and still have a
reasonably fast algorithm, anyway.

	Ingo


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
  2003-07-25 19:59 Ingo Molnar
  2003-07-25 22:58 ` Felipe Alfaro Solana
@ 2003-07-25 23:30 ` Con Kolivas
  1 sibling, 0 replies; 6+ messages in thread
From: Con Kolivas @ 2003-07-25 23:30 UTC (permalink / raw)
  To: linux-kernel

On Sat, 26 Jul 2003 05:59, Ingo Molnar wrote:
> my current "interactivity changes" scheduler patchset can be found at:
>
> 	redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G3
>
> (this patch is mostly orthogonal to Con's patchset, but obviously collides
> patch-wise. The patch should also cleanly apply to 2.6.0-test1-bk2.)

If the engineer is back to tune his own engine, the mechanic shall stop 
playing with it.

Con


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
  2003-07-25 19:59 Ingo Molnar
@ 2003-07-25 22:58 ` Felipe Alfaro Solana
  2003-07-25 23:30 ` Con Kolivas
  1 sibling, 0 replies; 6+ messages in thread
From: Felipe Alfaro Solana @ 2003-07-25 22:58 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML

On Fri, 2003-07-25 at 21:59, Ingo Molnar wrote:
> my current "interactivity changes" scheduler patchset can be found at:
> 
> 	redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G3
> 
> (this patch is mostly orthogonal to Con's patchset, but obviously collides
> patch-wise. The patch should also cleanly apply to 2.6.0-test1-bk2.)

For my workloads, this patch behaves just a little bit better than -G2,
but overall, X still feels jumpy, even under no load. I haven't been
able to reproduce any XMMS sound skips, due to the raising of
CHILD_PENALTY.

I've seen much better response times and less jerkiness of the system by
setting MAX_SLEEP_AVG at 1000000000 and STARVATION_LIMIT to HZ. With
these settings, under heavy load, no process starvation or slowdown can
be perceived, an opening new terminals, or launching processes is nearly
as fast as when the system is under no load. This is great, as the
vanilla scheduler is unable to handle this correctly.

Additionally, renicing X to -20 gives pretty good results for X
applications, like Evolution, but there are still some remanents of
jerkiness from time to time.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
@ 2003-07-25 19:59 Ingo Molnar
  2003-07-25 22:58 ` Felipe Alfaro Solana
  2003-07-25 23:30 ` Con Kolivas
  0 siblings, 2 replies; 6+ messages in thread
From: Ingo Molnar @ 2003-07-25 19:59 UTC (permalink / raw)
  To: linux-kernel


my current "interactivity changes" scheduler patchset can be found at:

	redhat.com/~mingo/O(1)-scheduler/sched-2.6.0-test1-G3

(this patch is mostly orthogonal to Con's patchset, but obviously collides
patch-wise. The patch should also cleanly apply to 2.6.0-test1-bk2.)

Changes:

 - cycle accuracy (nanosec resolution) timekeeping within the scheduler. 
   This fixes a number of audio artifacts (skipping) i've reproduced. I
   dont think we can get away without going cycle accuracy - reading the
   cycle counter adds some overhead, but it's acceptable. The first
   nanosec-accuracy patch was done by Mike Galbraith - this patch is
   different but similar in nature. I went further in also changing the
   sleep_avg to be of nanosec resolution.

 - more finegrained timeslices: there's now a timeslice 'sub unit' of 50 
   usecs (TIMESLICE_GRANULARITY) - CPU hogs on the same priority level 
   will roundrobin with this unit. This change is intended to make gaming
   latencies shorter.

 - include scheduling latency in sleep bonus calculation. This change 
   extends the sleep-average calculation to the period of time a task
   spends on the runqueue but doesnt get scheduled yet, right after
   wakeup. Note that tasks that were preempted (ie. not woken up) and are 
   still on the runqueue do not get this benefit. This change closes one 
   of the last hole in the dynamic priority estimation, it should result 
   in interactive tasks getting more priority under heavy load. This
   change also fixes the test-starve.c testcase from David Mosberger.

 - (some other, smaller changes.)

if you've experienced audio skipping in 2.6.0-test1 (and later) kernels
then please give this patch a go. Reports, testing feedback and comments
are welcome,

	Ingo


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2003-07-27 10:19 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-07-26 16:09 [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency Guillaume Chazarain
2003-07-27  9:31 ` Mike Galbraith
2003-07-27 10:26 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2003-07-25 19:59 Ingo Molnar
2003-07-25 22:58 ` Felipe Alfaro Solana
2003-07-25 23:30 ` Con Kolivas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).