linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Guillaume Chazarain <gfc@altern.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Davide Libenzi <davidel@xmailserver.org>,
	Mike Galbraith <efault@gmx.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency
Date: Sat, 26 Jul 2003 18:09:27 +0200	[thread overview]
Message-ID: <VRC7KGZX0573CW1TPN65C3Y312IC.3f22a7b7@monpc> (raw)

Hi,

> - cycle accuracy (nanosec resolution) timekeeping within the scheduler. 
>   This fixes a number of audio artifacts (skipping) i've reproduced. I
>   dont think we can get away without going cycle accuracy - reading the
>   cycle counter adds some overhead, but it's acceptable. The first
>   nanosec-accuracy patch was done by Mike Galbraith - this patch is
>   different but similar in nature. I went further in also changing the
>   sleep_avg to be of nanosec resolution.

I'd like to have your opinion on this, in 2.6.0-test1 we have:

static inline void activate_task(task_t *p, runqueue_t *rq)
{
        long sleep_time = jiffies - p->last_run - 1;

I'd like to have it back to jiffies - p->last_run. See why:
suppose jiffies - p->last_run == 1, here are the extremum timescales
possible for this: (S: sleep, W: wake up)

             p->last_run     jiffies
time:      |-------------|------------->
longest:    S                         W   => sleep_time = 2
shortest:               S W               => sleep_time = 0

That is, when jiffies - p->last_run == 1, all we know is that sleep_time
is between 0 and 2.  And currently we take 0 for sleep_time in this case.
Wouldn't it be more accurate to take 1 as a better approximation for something
between 0 and 2?  This is a sensible thing to me, because it makes mplayer
move from max CPU hog to max interactive.  When I see that mplayer uses only
10% of the cpu, it makes some sense.
BTW, with your nanosecond resolution, mplayer is also max interactive.
Don't you find it strange to need a nanosecond resolution to evaluate a simple
integer in a [-5, 5] range?

> - more finegrained timeslices: there's now a timeslice 'sub unit' of 50 
>   usecs (TIMESLICE_GRANULARITY) - CPU hogs on the same priority level 
>   will roundrobin with this unit. This change is intended to make gaming
>   latencies shorter.

Strange that no one complained about the throughput.  Anyway I don't care too.

> - include scheduling latency in sleep bonus calculation. This change 
>
>   extends the sleep-average calculation to the period of time a task
>   spends on the runqueue but doesnt get scheduled yet, right after
>   wakeup. Note that tasks that were preempted (ie. not woken up) and are 
>   still on the runqueue do not get this benefit. This change closes one 
>   of the last hole in the dynamic priority estimation, it should result 
>   in interactive tasks getting more priority under heavy load. This
>   change also fixes the test-starve.c testcase from David Mosberger.

Right, it solves test-starve.c, but not irman2.c.  With sched-G4, when irman2
is launched, a kernel compile could take ages, I tried it.  After 3 hours it
was still with the first file (scripts/fixdep.c), it produced no .o file.
With the patch at the end a kernel compile takes one hour (with -j1 and -j16)
versus five minutes when nothing else runs (config: allnoconfig).

The idea in the patch is to keep a list of the tasks on the runqueue, without
the one currently running, and sorted by insertion date.  Before reinserting an
interactive task in the active array, we check that no task has waited too
long on this list.  Davide, does it implement the interactivity throttle you had
in mind?

It's very similar to EXPIRED_STARVING(), but it has the advantage of considering
all tasks, not only the expired. It seems that with irman2, tasks don't even
have the time to expire.


Regards,
Guillaume




--- linux-2.6.0-test1/include/linux/sched.h
+++ linux-2.6.0-test1-gfc/include/linux/sched.h
@@ -335,6 +335,8 @@
 
 	int prio, static_prio;
 	struct list_head run_list;
+	unsigned long activated_timestamp;
+	struct list_head activated_list;
 	prio_array_t *array;
 
 	unsigned long sleep_avg;
--- linux-2.6.0-test1/kernel/fork.c
+++ linux-2.6.0-test1-gfc/kernel/fork.c
@@ -839,6 +839,7 @@
 	p->proc_dentry = NULL;
 
 	INIT_LIST_HEAD(&p->run_list);
+	INIT_LIST_HEAD(&p->activated_list);
 
 	INIT_LIST_HEAD(&p->children);
 	INIT_LIST_HEAD(&p->sibling);
--- linux-2.6.0-test1/kernel/sched.c
+++ linux-2.6.0-test1-gfc/kernel/sched.c
@@ -74,14 +74,14 @@
 #define PRIO_BONUS_RATIO	25
 #define INTERACTIVE_DELTA	2
 #define MAX_SLEEP_AVG		(10*HZ)
-#define STARVATION_LIMIT	(10*HZ)
+#define STARVATION_LIMIT	MAX_TIMESLICE
 #define NODE_THRESHOLD		125
 
 /*
- * If a task is 'interactive' then we reinsert it in the active
- * array after it has expired its current timeslice. (it will not
- * continue to run immediately, it will still roundrobin with
- * other interactive tasks.)
+ * If a task is 'interactive' and there is no starvation, then we
+ * reinsert it in the active array after it has expired its current
+ * timeslice. (it will not continue to run immediately, it will still
+ * roundrobin with other interactive tasks.)
  *
  * This part scales the interactivity limit depending on niceness.
  *
@@ -157,11 +157,11 @@
  */
 struct runqueue {
 	spinlock_t lock;
-	unsigned long nr_running, nr_switches, expired_timestamp,
-			nr_uninterruptible;
+	unsigned long nr_running, nr_switches, nr_uninterruptible;
 	task_t *curr, *idle;
 	struct mm_struct *prev_mm;
 	prio_array_t *active, *expired, arrays[2];
+	struct list_head activated_list;
 	int prev_cpu_load[NR_CPUS];
 #ifdef CONFIG_NUMA
 	atomic_t *node_nr_running;
@@ -330,6 +330,17 @@
 	return prio;
 }
 
+static inline void activated_task(task_t *p, runqueue_t *rq)
+{
+	if (likely(p != rq->idle) &&
+	    likely(!rt_task(p)) && 
+	    likely(!p->activated_list.next)) {
+
+		list_add_tail(&p->activated_list, &rq->activated_list);
+		p->activated_timestamp = jiffies;
+	}
+}
+
 /*
  * __activate_task - move a task to the runqueue.
  */
@@ -337,6 +348,7 @@
 {
 	enqueue_task(p, rq->active);
 	nr_running_inc(rq);
+	activated_task(p, rq);
 }
 
 /*
@@ -563,6 +575,7 @@
 		p->array = current->array;
 		p->array->nr_active++;
 		nr_running_inc(rq);
+		activated_task(p, rq);
 	}
 	task_rq_unlock(rq, &flags);
 }
@@ -1155,15 +1168,34 @@
  * We place interactive tasks back into the active array, if possible.
  *
  * To guarantee that this does not starve expired tasks we ignore the
- * interactivity of a task if the first expired task had to wait more
- * than a 'reasonable' amount of time. This deadline timeout is
- * load-dependent, as the frequency of array switched decreases with
- * increasing number of running tasks:
- */
-#define EXPIRED_STARVING(rq) \
-		(STARVATION_LIMIT && ((rq)->expired_timestamp && \
-		(jiffies - (rq)->expired_timestamp >= \
-			STARVATION_LIMIT * ((rq)->nr_running) + 1)))
+ * interactivity of a task if a task had to wait more than a 'reasonable'
+ * amount of time. This deadline timeout is load-dependent, as the frequency
+ * of array switched decreases with increasing number of running tasks:
+ */
+static inline int runqueue_starvation(runqueue_t *rq)
+{
+	struct task_struct *first;
+	unsigned long delay;
+
+	if (list_empty(&rq->activated_list))
+		return 0;
+
+	first = list_entry(rq->activated_list.next, task_t, activated_list);
+	delay = jiffies - first->activated_timestamp;
+
+	if (unlikely(delay > rq->nr_running * STARVATION_LIMIT)) {
+		list_del(&first->run_list);
+		list_add(&first->run_list, first->array->queue + first->prio);
+
+		/* DEBUG */
+		printk("%d (%s) is starved (%lu ms) ", first->pid,
+		       first->comm, delay - rq->nr_running * MAX_TIMESLICE);
+		printk("punishing %d (%s)\n", current->pid, current->comm);
+		return 1;
+	}
+
+	return 0;
+}
 
 /*
  * This function gets called by the timer code, with HZ frequency.
@@ -1238,11 +1270,9 @@
 		p->time_slice = task_timeslice(p);
 		p->first_time_slice = 0;
 
-		if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
-			if (!rq->expired_timestamp)
-				rq->expired_timestamp = jiffies;
+		if (!TASK_INTERACTIVE(p) || unlikely(runqueue_starvation(rq)))
 			enqueue_task(p, rq->expired);
-		} else
+		else
 			enqueue_task(p, rq->active);
 	}
 out_unlock:
@@ -1251,6 +1281,28 @@
 	rebalance_tick(rq, 0);
 }
 
+static inline void prepare_switch(runqueue_t *rq, task_t *prev, task_t *next)
+{
+	/* prev has been preempted. */
+	if (prev->state == TASK_RUNNING ||
+	    unlikely(preempt_count() & PREEMPT_ACTIVE))
+		activated_task(prev, rq);
+
+	/* next is no more waiting for being scheduled. */
+	if (likely(next != rq->idle) &&
+	    likely(!rt_task(next))) {
+		if (likely(next->activated_list.next != NULL)) {
+			list_del(&next->activated_list);
+			next->activated_list.next = NULL;
+		} else {
+			/* DEBUG */
+			dump_stack();
+			printk("not activated: %d (%s)\n", next->pid,
+			next->comm);
+		}
+	}
+}
+
 void scheduling_functions_start_here(void) { }
 
 /*
@@ -1311,7 +1363,6 @@
 			goto pick_next_task;
 #endif
 		next = rq->idle;
-		rq->expired_timestamp = 0;
 		goto switch_tasks;
 	}
 
@@ -1323,7 +1374,6 @@
 		rq->active = rq->expired;
 		rq->expired = array;
 		array = rq->active;
-		rq->expired_timestamp = 0;
 	}
 
 	idx = sched_find_first_bit(array->bitmap);
@@ -1336,6 +1386,8 @@
 	RCU_qsctr(task_cpu(prev))++;
 
 	if (likely(prev != next)) {
+		prepare_switch(rq, prev, next);
+
 		rq->nr_switches++;
 		rq->curr = next;
 
@@ -2497,6 +2549,7 @@
 		rq = cpu_rq(i);
 		rq->active = rq->arrays;
 		rq->expired = rq->arrays + 1;
+		INIT_LIST_HEAD(&rq->activated_list);
 		spin_lock_init(&rq->lock);
 		INIT_LIST_HEAD(&rq->migration_queue);
 		atomic_set(&rq->nr_iowait, 0);





             reply	other threads:[~2003-07-26 15:54 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-07-26 16:09 Guillaume Chazarain [this message]
2003-07-27  9:31 ` Mike Galbraith
2003-07-27 10:26 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2003-07-25 19:59 Ingo Molnar
2003-07-25 22:58 ` Felipe Alfaro Solana
2003-07-25 23:30 ` Con Kolivas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VRC7KGZX0573CW1TPN65C3Y312IC.3f22a7b7@monpc \
    --to=gfc@altern.org \
    --cc=davidel@xmailserver.org \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --subject='Re: [patch] sched-2.6.0-test1-G3, interactivity changes, audio latency' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).