linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched: Fix adverse effects of NFS client on interactive response
@ 2005-12-21  6:00 Peter Williams
  2005-12-21  6:09 ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2005-12-21  6:00 UTC (permalink / raw)
  To: Ingo Molnar, Trond Myklebust; +Cc: Con Kolivas, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1476 bytes --]

This patch addresses the adverse effect that the NFS client can have on 
interactive response when CPU bound tasks (such as a kernel build) 
operate on files mounted via NFS.  (NB It is emphasized that this has 
nothing to do with the effects of interactive tasks accessing NFS 
mounted files themselves.)

The problem occurs because tasks accessing NFS mounted files for data 
can undergo quite a lot of TASK_INTERRUPTIBLE sleep depending on the 
load on the server and the quality of the network connection.  This can 
result in these tasks getting quite high values for sleep_avg and 
consequently a large priority bonus.  On the system where I noticed this 
problem they were getting the full 10 bonus points and being given the 
same dynamic priority as genuine interactive tasks such as the X server 
and Rhythmbox.

The solution to this problem is to use TASK_NONINTERACTIVE to tell the 
scheduler that the TASK_INTERRUPTIBLE sleeps in the NFS client and 
SUNRPC are NOT interactive sleeps.

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>

--

  fs/nfs/inode.c       |    3 ++-
  fs/nfs/nfs4proc.c    |    2 +-
  fs/nfs/pagelist.c    |    3 ++-
  fs/nfs/write.c       |    3 ++-
  net/sunrpc/sched.c   |    2 +-
  net/sunrpc/svcsock.c |    2 +-
  6 files changed, 9 insertions(+), 6 deletions(-)

-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

[-- Attachment #2: make-nfs-sleeps-noninteractive --]
[-- Type: text/plain, Size: 3679 bytes --]

Index: GIT-warnings/fs/nfs/inode.c
===================================================================
--- GIT-warnings.orig/fs/nfs/inode.c	2005-12-21 16:22:09.000000000 +1100
+++ GIT-warnings/fs/nfs/inode.c	2005-12-21 16:22:11.000000000 +1100
@@ -937,7 +937,8 @@ static int nfs_wait_on_inode(struct inod
 
 	rpc_clnt_sigmask(clnt, &oldmask);
 	error = wait_on_bit_lock(&nfsi->flags, NFS_INO_REVALIDATING,
-					nfs_wait_schedule, TASK_INTERRUPTIBLE);
+					nfs_wait_schedule,
+					TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE);
 	rpc_clnt_sigunmask(clnt, &oldmask);
 
 	return error;
Index: GIT-warnings/fs/nfs/nfs4proc.c
===================================================================
--- GIT-warnings.orig/fs/nfs/nfs4proc.c	2005-12-21 16:22:09.000000000 +1100
+++ GIT-warnings/fs/nfs/nfs4proc.c	2005-12-21 16:22:11.000000000 +1100
@@ -2547,7 +2547,7 @@ static int nfs4_wait_clnt_recover(struct
 	rpc_clnt_sigmask(clnt, &oldset);
 	interruptible = TASK_UNINTERRUPTIBLE;
 	if (clnt->cl_intr)
-		interruptible = TASK_INTERRUPTIBLE;
+		interruptible = TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE;
 	prepare_to_wait(&clp->cl_waitq, &wait, interruptible);
 	nfs4_schedule_state_recovery(clp);
 	if (clnt->cl_intr && signalled())
Index: GIT-warnings/fs/nfs/pagelist.c
===================================================================
--- GIT-warnings.orig/fs/nfs/pagelist.c	2005-12-21 16:22:09.000000000 +1100
+++ GIT-warnings/fs/nfs/pagelist.c	2005-12-21 16:22:11.000000000 +1100
@@ -210,7 +210,8 @@ nfs_wait_on_request(struct nfs_page *req
 	 */
 	rpc_clnt_sigmask(clnt, &oldmask);
 	ret = out_of_line_wait_on_bit(&req->wb_flags, PG_BUSY,
-			nfs_wait_bit_interruptible, TASK_INTERRUPTIBLE);
+			nfs_wait_bit_interruptible,
+			TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE);
 	rpc_clnt_sigunmask(clnt, &oldmask);
 out:
 	return ret;
Index: GIT-warnings/fs/nfs/write.c
===================================================================
--- GIT-warnings.orig/fs/nfs/write.c	2005-12-21 16:22:09.000000000 +1100
+++ GIT-warnings/fs/nfs/write.c	2005-12-21 16:22:11.000000000 +1100
@@ -595,7 +595,8 @@ static int nfs_wait_on_write_congestion(
 		sigset_t oldset;
 
 		rpc_clnt_sigmask(clnt, &oldset);
-		prepare_to_wait(&nfs_write_congestion, &wait, TASK_INTERRUPTIBLE);
+		prepare_to_wait(&nfs_write_congestion, &wait,
+				TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE);
 		if (bdi_write_congested(bdi)) {
 			if (signalled())
 				ret = -ERESTARTSYS;
Index: GIT-warnings/net/sunrpc/sched.c
===================================================================
--- GIT-warnings.orig/net/sunrpc/sched.c	2005-12-21 16:22:09.000000000 +1100
+++ GIT-warnings/net/sunrpc/sched.c	2005-12-21 16:22:11.000000000 +1100
@@ -659,7 +659,7 @@ static int __rpc_execute(struct rpc_task
 		/* Note: Caller should be using rpc_clnt_sigmask() */
 		status = out_of_line_wait_on_bit(&task->tk_runstate,
 				RPC_TASK_QUEUED, rpc_wait_bit_interruptible,
-				TASK_INTERRUPTIBLE);
+				TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE);
 		if (status == -ERESTARTSYS) {
 			/*
 			 * When a sync task receives a signal, it exits with
Index: GIT-warnings/net/sunrpc/svcsock.c
===================================================================
--- GIT-warnings.orig/net/sunrpc/svcsock.c	2005-12-21 16:22:09.000000000 +1100
+++ GIT-warnings/net/sunrpc/svcsock.c	2005-12-21 16:22:11.000000000 +1100
@@ -1213,7 +1213,7 @@ svc_recv(struct svc_serv *serv, struct s
 		 * We have to be able to interrupt this wait
 		 * to bring down the daemons ...
 		 */
-		set_current_state(TASK_INTERRUPTIBLE);
+		set_current_state(TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE);
 		add_wait_queue(&rqstp->rq_wait, &wait);
 		spin_unlock_bh(&serv->sv_lock);
 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21  6:00 [PATCH] sched: Fix adverse effects of NFS client on interactive response Peter Williams
@ 2005-12-21  6:09 ` Trond Myklebust
  2005-12-21  6:32   ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-21  6:09 UTC (permalink / raw)
  To: Peter Williams; +Cc: Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Wed, 2005-12-21 at 17:00 +1100, Peter Williams wrote:
> This patch addresses the adverse effect that the NFS client can have on 
> interactive response when CPU bound tasks (such as a kernel build) 
> operate on files mounted via NFS.  (NB It is emphasized that this has 
> nothing to do with the effects of interactive tasks accessing NFS 
> mounted files themselves.)
> 
> The problem occurs because tasks accessing NFS mounted files for data 
> can undergo quite a lot of TASK_INTERRUPTIBLE sleep depending on the 
> load on the server and the quality of the network connection.  This can 
> result in these tasks getting quite high values for sleep_avg and 
> consequently a large priority bonus.  On the system where I noticed this 
> problem they were getting the full 10 bonus points and being given the 
> same dynamic priority as genuine interactive tasks such as the X server 
> and Rhythmbox.
> 
> The solution to this problem is to use TASK_NONINTERACTIVE to tell the 
> scheduler that the TASK_INTERRUPTIBLE sleeps in the NFS client and 
> SUNRPC are NOT interactive sleeps.

Sorry. That theory is just plain wrong. ALL of those cases _ARE_
interactive sleeps.

Cheers,
  Trond



* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21  6:09 ` Trond Myklebust
@ 2005-12-21  6:32   ` Peter Williams
  2005-12-21 13:21     ` Trond Myklebust
                       ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Peter Williams @ 2005-12-21  6:32 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Trond Myklebust wrote:
> On Wed, 2005-12-21 at 17:00 +1100, Peter Williams wrote:
> 
>>This patch addresses the adverse effect that the NFS client can have on 
>>interactive response when CPU bound tasks (such as a kernel build) 
>>operate on files mounted via NFS.  (NB It is emphasized that this has 
>>nothing to do with the effects of interactive tasks accessing NFS 
>>mounted files themselves.)
>>
>>The problem occurs because tasks accessing NFS mounted files for data 
>>can undergo quite a lot of TASK_INTERRUPTIBLE sleep depending on the 
>>load on the server and the quality of the network connection.  This can 
>>result in these tasks getting quite high values for sleep_avg and 
>>consequently a large priority bonus.  On the system where I noticed this 
>>problem they were getting the full 10 bonus points and being given the 
>>same dynamic priority as genuine interactive tasks such as the X server 
>>and Rhythmbox.
>>
>>The solution to this problem is to use TASK_NONINTERACTIVE to tell the 
>>scheduler that the TASK_INTERRUPTIBLE sleeps in the NFS client and 
>>SUNRPC are NOT interactive sleeps.
> 
> 
> Sorry. That theory is just plain wrong. ALL of those cases _ARE_
> interactive sleeps.

It's not a theory.  It's a result of observing a -j 16 build with the 
sources on an NFS mounted file system with top with and without the 
patches and comparing that with the same builds with the sources on a 
local file system.  Without the patches the tasks in the kernel build 
all get the same dynamic priority as the X server and other interactive 
programs when the sources are on an NFS mounted file system.  With the 
patches they generally have dynamic priorities 6 to 10 higher 
than the X server and other interactive programs.

In both cases, when the build is run on a source on a local file system 
the kernel build tasks all have dynamic priorities 6 to 10 higher than 
the X server and other interactive programs.

In all cases, the dynamic priorities of the X server and other 
interactive programs are the same.

In the testing that I have done so far the patch has not resulted in any 
genuine interactive tasks not being identified as interactive.


Peter
PS There's a difference between interruptible and interactive in that 
while all interactive sleeps will be interruptible not all interruptible 
sleeps are interactive.  Ingo introduced TASK_NONINTERACTIVE to enable 
this distinction to be made.
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21  6:32   ` Peter Williams
@ 2005-12-21 13:21     ` Trond Myklebust
  2005-12-21 13:36       ` Kyle Moffett
  2005-12-21 16:11     ` Ingo Molnar
  2006-01-02 11:01     ` Helge Hafting
  2 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-21 13:21 UTC (permalink / raw)
  To: Peter Williams; +Cc: Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Wed, 2005-12-21 at 17:32 +1100, Peter Williams wrote:

> > Sorry. That theory is just plain wrong. ALL of those cases _ARE_
> > interactive sleeps.
> 
> It's not a theory.  It's a result of observing a -j 16 build with the 
> sources on an NFS mounted file system with top with and without the 
> patches and comparing that with the same builds with the sources on a 
> local file system.  Without the patches the tasks in the kernel build 
> all get the same dynamic priority as the X server and other interactive 
> programs when the sources are on an NFS mounted file system.  With the 
> patches they generally have dynamic priorities 6 to 10 higher 
> than the X server and other interactive programs.

...and if you stick in a faster server?...

There is _NO_ fundamental difference between NFS and a local filesystem
that warrants marking one as "interactive" and the other as
"noninteractive". What you are basically saying is that all I/O should
be marked as TASK_NONINTERACTIVE.

Cheers,
  Trond



* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 13:21     ` Trond Myklebust
@ 2005-12-21 13:36       ` Kyle Moffett
  2005-12-21 13:40         ` Trond Myklebust
  2005-12-21 16:10         ` Horst von Brand
  0 siblings, 2 replies; 55+ messages in thread
From: Kyle Moffett @ 2005-12-21 13:36 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Williams, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Dec 21, 2005, at 08:21, Trond Myklebust wrote:
> ...and if you stick in a faster server?...
>
> There is _NO_ fundamental difference between NFS and a local  
> filesystem that warrants marking one as "interactive" and the other  
> as "noninteractive". What you are basically saying is that all I/O  
> should be marked as TASK_NONINTERACTIVE.

Uhh, what part of disk/NFS/filesystem access is "interactive"?  Which  
of those sleeps directly involve responding to user-interface  
events?  _That_ is the whole point of the interactivity bonus, and  
precisely why Ingo introduced TASK_NONINTERACTIVE sleeps; so that  
processes that are not being useful for interactivity could be moved  
away from TASK_UNINTERRUPTIBLE, with the end result that the X-server
could be run at priority 0 without harming interactivity, even  
during heavy *disk*, *NFS*, and *network* activity.  Admittedly, that  
may not be what some people want, but they're welcome to turn off the  
interactivity bonuses via some file in /proc (sorry, don't remember  
which at the moment).

Cheers,
Kyle Moffett

--
I have yet to see any problem, however complicated, which, when you  
looked at it in the right way, did not become still more complicated.
   -- Poul Anderson





* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 13:36       ` Kyle Moffett
@ 2005-12-21 13:40         ` Trond Myklebust
  2005-12-22  2:26           ` Peter Williams
  2005-12-21 16:10         ` Horst von Brand
  1 sibling, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-21 13:40 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Peter Williams, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Wed, 2005-12-21 at 08:36 -0500, Kyle Moffett wrote:
> On Dec 21, 2005, at 08:21, Trond Myklebust wrote:
> > ...and if you stick in a faster server?...
> >
> > There is _NO_ fundamental difference between NFS and a local  
> > filesystem that warrants marking one as "interactive" and the other  
> > as "noninteractive". What you are basically saying is that all I/O  
> > should be marked as TASK_NONINTERACTIVE.
> 
> Uhh, what part of disk/NFS/filesystem access is "interactive"?  Which  
> of those sleeps directly involve responding to user-interface  
> events?  _That_ is the whole point of the interactivity bonus, and  
> precisely why Ingo introduced TASK_NONINTERACTIVE sleeps; so that  
> processes that are not being useful for interactivity could be moved  
> > away from TASK_UNINTERRUPTIBLE, with the end result that the X-server
> > could be run at priority 0 without harming interactivity, even  
> during heavy *disk*, *NFS*, and *network* activity.  Admittedly, that  
> may not be what some people want, but they're welcome to turn off the  
> interactivity bonuses via some file in /proc (sorry, don't remember  
> which at the moment).

Then have io_schedule() automatically set that flag, and convert NFS to
use io_schedule(), or something along those lines. I don't want a bunch
of RT-specific flags littering the NFS/RPC code.

Cheers,
 Trond



* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 13:36       ` Kyle Moffett
  2005-12-21 13:40         ` Trond Myklebust
@ 2005-12-21 16:10         ` Horst von Brand
  2005-12-21 20:36           ` Kyle Moffett
  1 sibling, 1 reply; 55+ messages in thread
From: Horst von Brand @ 2005-12-21 16:10 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Trond Myklebust, Peter Williams, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Kyle Moffett <mrmacman_g4@mac.com> wrote:
> On Dec 21, 2005, at 08:21, Trond Myklebust wrote:
> > ...and if you stick in a faster server?...
> > There is _NO_ fundamental difference between NFS and a local
> > filesystem that warrants marking one as "interactive" and the other
> > as "noninteractive". What you are basically saying is that all I/O
> > should be marked as TASK_NONINTERACTIVE.

> Uhh, what part of disk/NFS/filesystem access is "interactive"?  Which
> of those sleeps directly involve responding to user-interface  events?

And if it is a user waiting for the data to display? Can't distinguish that
so easily from the compiler waiting for something to do...
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21  6:32   ` Peter Williams
  2005-12-21 13:21     ` Trond Myklebust
@ 2005-12-21 16:11     ` Ingo Molnar
  2005-12-21 22:49       ` Peter Williams
  2006-01-02 11:01     ` Helge Hafting
  2 siblings, 1 reply; 55+ messages in thread
From: Ingo Molnar @ 2005-12-21 16:11 UTC (permalink / raw)
  To: Peter Williams; +Cc: Trond Myklebust, Con Kolivas, Linux Kernel Mailing List


* Peter Williams <pwil3058@bigpond.net.au> wrote:

> It's not a theory.  It's a result of observing a -j 16 build with the 
> sources on an NFS mounted file system with top with and without the 
> patches and comparing that with the same builds with the sources on a 
> local file system. [...]

could you try the build with the scheduler queue from -mm, and set the 
shell to SCHED_BATCH first? Do you still see interactivity problems 
after that?

i'm not sure we want to override the scheduling patterns observed by the 
kernel, via TASK_NONINTERACTIVE - apart from a few obvious cases.

	Ingo


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 16:10         ` Horst von Brand
@ 2005-12-21 20:36           ` Kyle Moffett
  2005-12-21 22:59             ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Kyle Moffett @ 2005-12-21 20:36 UTC (permalink / raw)
  To: Horst von Brand
  Cc: Trond Myklebust, Peter Williams, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Dec 21, 2005, at 11:10, Horst von Brand wrote:
> Kyle Moffett <mrmacman_g4@mac.com> wrote:
>> On Dec 21, 2005, at 08:21, Trond Myklebust wrote:
>>> ...and if you stick in a faster server?...
>>> There is _NO_ fundamental difference between NFS and a local  
>>> filesystem that warrants marking one as "interactive" and the  
>>> other as "noninteractive". What you are basically saying is that  
>>> all I/O should be marked as TASK_NONINTERACTIVE.
>>
>> Uhh, what part of disk/NFS/filesystem access is "interactive"?   
>> Which of those sleeps directly involve responding to user- 
>> interface  events?
>
> And if it is a user waiting for the data to display? Can't  
> distinguish that so easily from the compiler waiting for something  
> to do...

No, but in that case the program probably _already_ has some  
interactivity bonus just from user interaction.  On the other hand,  
UI programming guidelines say that any task which might take more  
than a half-second or so should not be run in the event loop, but in  
a separate thread (either a drawing thread or similar).  In that  
case, your event loop thread is the one with the interactivity bonus,  
and the others are just data processing threads (like the compile you  
have running in the background or the webserver responding to HTTP  
requests), which the user would need to manually arbitrate between  
with nice levels.

The whole point of the interactivity bonus was that processes that  
follow the cycle <waiting-for-input> => <respond-to-input-for-less- 
than-time-quantum> => <waiting-for-input> would get a boost; things  
like dragging a window or handling mouse or keyboard events should  
happen within a small number of milliseconds, whereas background  
tasks really _don't_ care if they are delayed running their time  
quantum by 400ms, as long as they get their full quantum during each  
cycle.

Cheers,
Kyle Moffett

--
Debugging is twice as hard as writing the code in the first place.   
Therefore, if you write the code as cleverly as possible, you are, by  
definition, not smart enough to debug it.
   -- Brian Kernighan




* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 16:11     ` Ingo Molnar
@ 2005-12-21 22:49       ` Peter Williams
  0 siblings, 0 replies; 55+ messages in thread
From: Peter Williams @ 2005-12-21 22:49 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Trond Myklebust, Con Kolivas, Linux Kernel Mailing List

Ingo Molnar wrote:
> * Peter Williams <pwil3058@bigpond.net.au> wrote:
> 
> 
>>It's not a theory.  It's a result of observing a -j 16 build with the 
>>sources on an NFS mounted file system with top with and without the 
>>patches and comparing that with the same builds with the sources on a 
>>local file system. [...]
> 
> 
> could you try the build with the scheduler queue from -mm, and set the 
> shell to SCHED_BATCH first? Do you still see interactivity problems 
> after that?

There's no real point in doing such a test as running the build as 
SCHED_BATCH would obviously prevent its tasks from getting any 
interactive bonus.  So I'll concede that is a solution.

However, the problem I see with this solution is that it's pushing the 
onus onto the user and forcing them to decide/remember to run 
non-interactive tasks as SCHED_BATCH (and I see the whole point of the 
interactive responsiveness embellishments of the scheduler being to free 
the user of the need to worry about these things).  It's a marginally 
better solution than its complement i.e. marking interactive tasks as 
being such via putting them in a (hypothetical) SCHED_IA class because 
that would clearly have to be a privileged operation unlike setting 
SCHED_BATCH.

This is a case where the PAGG patches would have been useful.  With them 
a mechanism for monitoring exec()s and shifting programs to SCHED_BATCH 
based on what program they had just exec()ed would be possible, making 
SCHED_BATCH a better solution to this problem.  If PAGG were 
complemented with a kernel-to-user-space event notification mechanism 
the bulk of this could be accomplished in user space.  The new code SGI 
is proposing as an alternative to PAGG may meet these requirements?

> 
> i'm not sure we want to override the scheduling patterns observed by the 
> kernel, via TASK_NONINTERACTIVE - apart from a few obvious cases.

I thought that this was one of the obvious cases.  I.e. interruptible 
sleeps that clearly aren't interactive.

I interpreted your statement "Right now only pipe_wait() will make use 
of it, because it's a common source of not-so-interactive waits (kernel 
compilation jobs, etc.)." in the original announcement of 
TASK_NONINTERACTIVE to mean that it was a "work in progress" and would be 
used more extensively when other places for its application were identified.

BTW I don't think that it should be blindly applied to all file system 
code as I tried that and it resulted in the X server not getting any 
interactive bonus with obvious consequences :-(.  I think that use of 
TASK_NONINTERACTIVE should be done carefully and tested to make sure 
that it has no unexpected scheduling implications (and I think that this 
is such a case).  Provided the TASK_XXX flags are always treated as such, 
there should be no changes to the semantics or efficiency (after all, 
it's just an extra bit in an integer constant set at compile time) of 
any code other than the scheduler's as a result of its use.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 20:36           ` Kyle Moffett
@ 2005-12-21 22:59             ` Peter Williams
  0 siblings, 0 replies; 55+ messages in thread
From: Peter Williams @ 2005-12-21 22:59 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Horst von Brand, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Kyle Moffett wrote:
> On Dec 21, 2005, at 11:10, Horst von Brand wrote:
> 
>> Kyle Moffett <mrmacman_g4@mac.com> wrote:
>>
>>> On Dec 21, 2005, at 08:21, Trond Myklebust wrote:
>>>
>>>> ...and if you stick in a faster server?...
>>>> There is _NO_ fundamental difference between NFS and a local  
>>>> filesystem that warrants marking one as "interactive" and the  other 
>>>> as "noninteractive". What you are basically saying is that  all I/O 
>>>> should be marked as TASK_NONINTERACTIVE.
>>>
>>>
>>> Uhh, what part of disk/NFS/filesystem access is "interactive"?   
>>> Which of those sleeps directly involve responding to user- interface  
>>> events?
>>
>>
>> And if it is a user waiting for the data to display? Can't  
>> distinguish that so easily from the compiler waiting for something  to 
>> do...
> 
> 
> No, but in that case the program probably _already_ has some  
> interactivity bonus just from user interaction.

And if it doesn't then it is (by definition) not interactive. :-)

As you imply, this change is targeting those tasks whose ONLY 
interruptible sleeps are due to NFS use.

>  On the other hand,  UI 
> programming guidelines say that any task which might take more  than a 
> half-second or so should not be run in the event loop, but in  a 
> separate thread (either a drawing thread or similar).  In that  case, 
> your event loop thread is the one with the interactivity bonus,  and the 
> others are just data processing threads (like the compile you  have 
> running in the background or the webserver responding to HTTP  
> requests), which the user would need to manually arbitrate between  with 
> nice levels.
> 
> The whole point of the interactivity bonus was that processes that  
> follow the cycle <waiting-for-input> => <respond-to-input-for-less- 
> than-time-quantum> => <waiting-for-input> would get a boost; things  
> like dragging a window or handling mouse or keyboard events should  
> happen within a small number of milliseconds, whereas background  tasks 
> really _don't_ care if they are delayed running their time  quantum by 
> 400ms, as long as they get their full quantum during each  cycle.

Exactly.  It's all about latency and doesn't really affect the 
allocation of CPU resources according to niceness as that is handled via 
differential time slice allocations.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21 13:40         ` Trond Myklebust
@ 2005-12-22  2:26           ` Peter Williams
  2005-12-22 22:08             ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2005-12-22  2:26 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Trond Myklebust wrote:
> On Wed, 2005-12-21 at 08:36 -0500, Kyle Moffett wrote:
> 
>>On Dec 21, 2005, at 08:21, Trond Myklebust wrote:
>>
>>>...and if you stick in a faster server?...
>>>
>>>There is _NO_ fundamental difference between NFS and a local  
>>>filesystem that warrants marking one as "interactive" and the other  
>>>as "noninteractive". What you are basically saying is that all I/O  
>>>should be marked as TASK_NONINTERACTIVE.
>>
>>Uhh, what part of disk/NFS/filesystem access is "interactive"?  Which  
>>of those sleeps directly involve responding to user-interface  
>>events?  _That_ is the whole point of the interactivity bonus, and  
>>precisely why Ingo introduced TASK_NONINTERACTIVE sleeps; so that  
>>processes that are not being useful for interactivity could be moved  
>>away from TASK_UNINTERRUPTIBLE, with the end result that the X-server
>>could be run at priority 0 without harming interactivity, even  
>>during heavy *disk*, *NFS*, and *network* activity.  Admittedly, that  
>>may not be what some people want, but they're welcome to turn off the  
>>interactivity bonuses via some file in /proc (sorry, don't remember  
>>which at the moment).
> 
> 
> Then have io_schedule() automatically set that flag, and convert NFS to
> use io_schedule(), or something along those lines. I don't want a bunch
> of RT-specific flags littering the NFS/RPC code.

This flag isn't RT-specific.  It's used in the scheduling of SCHED_NORMAL 
tasks and has no other semantic effects.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-22  2:26           ` Peter Williams
@ 2005-12-22 22:08             ` Trond Myklebust
  2005-12-22 22:33               ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-22 22:08 UTC (permalink / raw)
  To: Peter Williams
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Thu, 2005-12-22 at 13:26 +1100, Peter Williams wrote:

> > Then have io_schedule() automatically set that flag, and convert NFS to
> > use io_schedule(), or something along those lines. I don't want a bunch
> > of RT-specific flags littering the NFS/RPC code.
> 
> This flag isn't RT-specific.  It's used in the scheduling of SCHED_NORMAL 
> tasks and has no other semantic effects.

It still has sod all business being in the NFS code. We don't touch task
scheduling in the filesystem code.

  Trond



* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-22 22:08             ` Trond Myklebust
@ 2005-12-22 22:33               ` Peter Williams
  2005-12-22 22:59                 ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2005-12-22 22:33 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Trond Myklebust wrote:
> On Thu, 2005-12-22 at 13:26 +1100, Peter Williams wrote:
> 
> 
>>>Then have io_schedule() automatically set that flag, and convert NFS to
>>>use io_schedule(), or something along those lines. I don't want a bunch
>>>of RT-specific flags littering the NFS/RPC code.
>>
>>This flag isn't RT-specific.  It's used in the scheduling of SCHED_NORMAL 
>>tasks and has no other semantic effects.
> 
> 
> It still has sod all business being in the NFS code. We don't touch task
> scheduling in the filesystem code.

How do you explain the use of the TASK_INTERRUPTIBLE flag then?

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-22 22:33               ` Peter Williams
@ 2005-12-22 22:59                 ` Trond Myklebust
  2005-12-23  0:02                   ` Kyle Moffett
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-22 22:59 UTC (permalink / raw)
  To: Peter Williams
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Fri, 2005-12-23 at 09:33 +1100, Peter Williams wrote:
> > It still has sod all business being in the NFS code. We don't touch task
> > scheduling in the filesystem code.
> 
> How do you explain the use of the TASK_INTERRUPTIBLE flag then?

Oh, please...

TASK_INTERRUPTIBLE is used to set the task to sleep. It has NOTHING to
do with scheduling.

Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-22 22:59                 ` Trond Myklebust
@ 2005-12-23  0:02                   ` Kyle Moffett
  2005-12-23  0:25                     ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Kyle Moffett @ 2005-12-23  0:02 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Williams, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Dec 22, 2005, at 17:59, Trond Myklebust wrote:
> On Fri, 2005-12-23 at 09:33 +1100, Peter Williams wrote:
>>> It still has sod all business being in the NFS code. We don't  
>>> touch task scheduling in the filesystem code.
>>
>> How do you explain the use of the TASK_INTERRUPTIBLE flag then?
>
> Oh, please...
>
> TASK_INTERRUPTIBLE is used to set the task to sleep. It has NOTHING  
> to do with scheduling.

Putting a task to sleep _is_ rescheduling it.  TASK_NONINTERACTIVE
means that you are about to reschedule and are willing to tolerate a
higher wakeup latency.  TASK_INTERRUPTIBLE means you are about to
sleep and want to be woken up with the "standard" latency.  If you
do any kind of sleep at all, both are valid, independent of what part
of the kernel you are in.  There's a reason that both are TASK_* flags.

Cheers,
Kyle Moffett

--
If you don't believe that a case based on [nothing] could potentially  
drag on in court for _years_, then you have no business playing with  
the legal system at all.
   -- Rob Landley




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23  0:02                   ` Kyle Moffett
@ 2005-12-23  0:25                     ` Trond Myklebust
  2005-12-23  3:06                       ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-23  0:25 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Peter Williams, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Thu, 2005-12-22 at 19:02 -0500, Kyle Moffett wrote:
> On Dec 22, 2005, at 17:59, Trond Myklebust wrote:
> > On Fri, 2005-12-23 at 09:33 +1100, Peter Williams wrote:
> >>> It still has sod all business being in the NFS code. We don't  
> >>> touch task scheduling in the filesystem code.
> >>
> >> How do you explain the use of the TASK_INTERRUPTIBLE flag then?
> >
> > Oh, please...
> >
> > TASK_INTERRUPTIBLE is used to set the task to sleep. It has NOTHING  
> > to do with scheduling.
> 
> Putting a task to sleep _is_ rescheduling it.  TASK_NONINTERACTIVE
> means that you are about to reschedule and are willing to tolerate a
> higher wakeup latency.  TASK_INTERRUPTIBLE means you are about to
> sleep and want to be woken up with the "standard" latency.  If you
> do any kind of sleep at all, both are valid, independent of what part
> of the kernel you are in.  There's a reason that both are TASK_* flags.

Tolerance for higher wakeup latencies is a scheduling _policy_ decision.
Please explain why the hell we should have to deal with that in
filesystem code?

As far as a filesystem is concerned, there should be 2 scheduling
states: running and sleeping. Any scheduling policy beyond that belongs
in kernel/*.

  Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23  0:25                     ` Trond Myklebust
@ 2005-12-23  3:06                       ` Peter Williams
  2005-12-23  9:39                         ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2005-12-23  3:06 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Trond Myklebust wrote:
> On Thu, 2005-12-22 at 19:02 -0500, Kyle Moffett wrote:
> 
>>On Dec 22, 2005, at 17:59, Trond Myklebust wrote:
>>
>>>On Fri, 2005-12-23 at 09:33 +1100, Peter Williams wrote:
>>>
>>>>>It still has sod all business being in the NFS code. We don't  
>>>>>touch task scheduling in the filesystem code.
>>>>
>>>>How do you explain the use of the TASK_INTERRUPTIBLE flag then?
>>>
>>>Oh, please...
>>>
>>>TASK_INTERRUPTIBLE is used to set the task to sleep. It has NOTHING  
>>>to do with scheduling.
>>
>>Putting a task to sleep _is_ rescheduling it.  TASK_NONINTERACTIVE
>>means that you are about to reschedule and are willing to tolerate a
>>higher wakeup latency.  TASK_INTERRUPTIBLE means you are about to
>>sleep and want to be woken up with the "standard" latency.  If you
>>do any kind of sleep at all, both are valid, independent of what part
>>of the kernel you are in.  There's a reason that both are TASK_* flags.
> 
> 
> Tolerance for higher wakeup latencies is a scheduling _policy_ decision.
> Please explain why the hell we should have to deal with that in
> filesystem code?

In order to make good decisions the scheduler needs good data.  I don't
think that it's unreasonable to expect subsystems to help in that regard,
especially when there is no cost involved.  The patch just turns another
bit on (at compile time) in some integer constants.  No extra space or
computing resources are required.

> 
> As far as a filesystem is concerned, there should be 2 scheduling
> states: running and sleeping. Any scheduling policy beyond that belongs
> in kernel/*.

Actually there are currently two kinds of sleep: interruptible and
uninterruptible.  This just adds a variation to one of these,
interruptible, that says: even though I'm interruptible, I'm not
interactive (i.e. I'm not waiting for human intervention via a key
press, mouse action, etc. to initiate the interrupt).  This helps the
scheduler to decide whether the task involved is an interactive one or
not, which in turn improves users' interactive experience by ensuring
snappy responses to keyboard and mouse actions even when the system is
heavily loaded.

There are probably many interruptible sleeps in the kernel that should
be marked as non-interactive, but for most of them it doesn't matter
because the duration of the sleep is so short that being mislabelled
doesn't materially affect the decision as to whether a task is interactive
or not.  However, for reasons not related to the quality or efficiency
of the code, NFS interruptible sleeps do not fall into that category as
they can be quite long due to server load or network congestion.  (N.B.
the size of delay that can be significant is quite small, i.e. much less
than the size of a normal time slice.)

An alternative to using TASK_NONINTERACTIVE to mark the non-interactive
interruptible sleeps that are significant (probably a small number)
would be to go in the other direction and treat all interruptible sleeps
as being non-interactive and then label all the ones that are
interactive as such.  Although this would result in no changes being
made to the NFS code, I'm pretty sure that this option would involve a
great deal more code changes elsewhere, as all the places where genuine
interactive sleeping occurs would have to be identified and labelled.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23  3:06                       ` Peter Williams
@ 2005-12-23  9:39                         ` Trond Myklebust
  2005-12-23 10:49                           ` Peter Williams
  2005-12-23 19:07                           ` Lee Revell
  0 siblings, 2 replies; 55+ messages in thread
From: Trond Myklebust @ 2005-12-23  9:39 UTC (permalink / raw)
  To: Peter Williams
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Fri, 2005-12-23 at 14:06 +1100, Peter Williams wrote:
> > 
> > As far as a filesystem is concerned, there should be 2 scheduling
> > states: running and sleeping. Any scheduling policy beyond that belongs
> > in kernel/*.
> 
> Actually there are currently two kinds of sleep: interruptible and 
> uninterruptible.  This just adds a variation to one of these, 
> interruptible, that says even though I'm interruptible I'm not 
> interactive (i.e. I'm not waiting for human intervention via a key 
> press, mouse action, etc. to initiate the interrupt).  This helps the 
> scheduler to decide whether the task involved is an interactive one or 
> not which in turn improves users' interactive experiences by ensuring 
> snappy responses to keyboard and mouse actions even when the system is 
> heavily loaded.

No! This is not the same thing at all.

You are asking the coder to provide a policy judgement as to whether or
not the users might care.

As far as I'm concerned, other users' MP3 player, X processes, and
keyboard response times can rot in hell whenever I'm busy writing out
data at full blast. I don't give a rats arse about user interactivity,
because my priority is to see the batch jobs complete.

However on another machine, the local administrator may have a different
opinion. That sort of difference in opinion is precisely why we do not
put this sort of policy in the filesystem code but leave it all in the
scheduler code where all the bits and pieces can (hopefully) be treated
consistently as a single policy, and where the user can be given tools
in order to tweak the policy.

TASK_NONINTERACTIVE is basically a piss-poor interface because it moves
the policy into the lower level code where the user has less control.

> There are probably many interruptible sleeps in the kernel that should
> be marked as non-interactive, but for most of them it doesn't matter
> because the duration of the sleep is so short that being mislabelled
> doesn't materially affect the decision as to whether a task is interactive
> or not.  However, for reasons not related to the quality or efficiency
> of the code, NFS interruptible sleeps do not fall into that category as
> they can be quite long due to server load or network congestion.  (N.B.
> the size of delay that can be significant is quite small, i.e. much less
> than the size of a normal time slice.)
> 
> An alternative to using TASK_NONINTERACTIVE to mark the non-interactive
> interruptible sleeps that are significant (probably a small number)
> would be to go in the other direction and treat all interruptible sleeps
> as being non-interactive and then label all the ones that are
> interactive as such.  Although this would result in no changes being
> made to the NFS code, I'm pretty sure that this option would involve a
> great deal more code changes elsewhere, as all the places where genuine
> interactive sleeping occurs would have to be identified and labelled.

That is exactly the same rotten idea, just implemented differently. You
are still asking coders to guess as to what the scheduling policy should
be instead of letting the user decide.

  Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23  9:39                         ` Trond Myklebust
@ 2005-12-23 10:49                           ` Peter Williams
  2005-12-23 12:51                             ` Trond Myklebust
  2005-12-23 19:07                           ` Lee Revell
  1 sibling, 1 reply; 55+ messages in thread
From: Peter Williams @ 2005-12-23 10:49 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Trond Myklebust wrote:
> On Fri, 2005-12-23 at 14:06 +1100, Peter Williams wrote:
> 
>>>As far as a filesystem is concerned, there should be 2 scheduling
>>>states: running and sleeping. Any scheduling policy beyond that belongs
>>>in kernel/*.
>>
>>Actually there are currently two kinds of sleep: interruptible and 
>>uninterruptible.  This just adds a variation to one of these, 
>>interruptible, that says even though I'm interruptible I'm not 
>>interactive (i.e. I'm not waiting for human intervention via a key 
>>press, mouse action, etc. to initiate the interrupt).  This helps the 
>>scheduler to decide whether the task involved is an interactive one or 
>>not which in turn improves users' interactive experiences by ensuring 
>>snappy responses to keyboard and mouse actions even when the system is 
>>heavily loaded.
> 
> 
> No! This is not the same thing at all.
> 
> You are asking the coder to provide a policy judgement as to whether or
> not the users might care.

No.  It is asking whether the NORMAL interruption of this interruptible
sleep will be caused by a human user action such as a keystroke or mouse
action.  For the NFS client the answer to that question is unequivocally
no.  It's not a matter of policy; it's a matter of fact.

> 
> As far as I'm concerned, other users' MP3 player, X processes, and
> keyboard response times can rot in hell whenever I'm busy writing out
> data at full blast. I don't give a rats arse about user interactivity,
> because my priority is to see the batch jobs complete.
> 
> However on another machine, the local administrator may have a different
> opinion. That sort of difference in opinion is precisely why we do not
> put this sort of policy

It's not policy.  It's a statement of fact about the nature of the sleep 
that is being undertaken.

> in the filesystem code but leave it all in the
> scheduler code where all the bits and pieces can (hopefully) be treated
> consistently as a single policy, and where the user can be given tools
> in order to tweak the policy.
> 
> TASK_NONINTERACTIVE is basically a piss-poor interface because it moves
> the policy into the lower level code where the user has less control.

TASK_NONINTERACTIVE is not about policy.

> 
> 
>>There are probably many interruptible sleeps in the kernel that should
>>be marked as non-interactive, but for most of them it doesn't matter
>>because the duration of the sleep is so short that being mislabelled
>>doesn't materially affect the decision as to whether a task is interactive
>>or not.  However, for reasons not related to the quality or efficiency
>>of the code, NFS interruptible sleeps do not fall into that category as
>>they can be quite long due to server load or network congestion.  (N.B.
>>the size of delay that can be significant is quite small, i.e. much less
>>than the size of a normal time slice.)
>>
>>An alternative to using TASK_NONINTERACTIVE to mark the non-interactive
>>interruptible sleeps that are significant (probably a small number)
>>would be to go in the other direction and treat all interruptible sleeps
>>as being non-interactive and then label all the ones that are
>>interactive as such.  Although this would result in no changes being
>>made to the NFS code, I'm pretty sure that this option would involve a
>>great deal more code changes elsewhere, as all the places where genuine
>>interactive sleeping occurs would have to be identified and labelled.
> 
> 
> That is exactly the same rotten idea, just implemented differently.

I thought that I said (or at least implied) that.  The difference is 
that we wouldn't be having this conversation.

> You
> are still asking coders to guess as to what the scheduling policy should
> be instead of letting the user decide.

I wish that I could make you understand that that isn't the case.
You're not being asked to make a policy decision; you're being asked to
make a statement of fact about whether the interruptible sleep is
interactive or not.  In the cases involved in this patch the answer to
this question is always "no, it's not an interactive sleep" and it can
be given at compile time with absolutely no run time overhead incurred.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 10:49                           ` Peter Williams
@ 2005-12-23 12:51                             ` Trond Myklebust
  2005-12-23 13:36                               ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-23 12:51 UTC (permalink / raw)
  To: Peter Williams
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Fri, 2005-12-23 at 21:49 +1100, Peter Williams wrote:
> No.  It is asking whether the NORMAL interruption of this interruptible
> sleep will be caused by a human user action such as a keystroke or mouse
> action.  For the NFS client the answer to that question is unequivocally
> no.  It's not a matter of policy; it's a matter of fact.

        /*
         * Tasks that have marked their sleep as noninteractive get
         * woken up without updating their sleep average. (i.e. their
         * sleep is handled in a priority-neutral manner, no priority
         * boost and no penalty.)
         */

This appears to be the only documentation for the TASK_NONINTERACTIVE
flag, and I see no mention of human user actions in that comment.  The
comment rather appears to state that this particular flag is designed
to switch between two different scheduling policies.

If the flag really is only about identifying sleeps that will involve
human user actions, then surely it would be easy to set up a short set
of guidelines in Documentation, say, that spell out exactly what the
purpose is, and when it should be used.
That should be done _before_ one starts charging round converting every
instance of TASK_INTERRUPTIBLE.

  Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 12:51                             ` Trond Myklebust
@ 2005-12-23 13:36                               ` Peter Williams
  2006-01-02 12:09                                 ` Pekka Enberg
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2005-12-23 13:36 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Kyle Moffett, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Trond Myklebust wrote:
> On Fri, 2005-12-23 at 21:49 +1100, Peter Williams wrote:
> 
>>No.  It is asking whether the NORMAL interruption of this interruptible
>>sleep will be caused by a human user action such as a keystroke or mouse
>>action.  For the NFS client the answer to that question is unequivocally
>>no.  It's not a matter of policy; it's a matter of fact.
> 
> 
>         /*
>          * Tasks that have marked their sleep as noninteractive get
>          * woken up without updating their sleep average. (i.e. their
>          * sleep is handled in a priority-neutral manner, no priority
>          * boost and no penalty.)
>          */
> 
> This appears to be the only documentation for the TASK_NONINTERACTIVE
> flag,

I guess it makes too many assumptions about the reader's prior knowledge
of the scheduler internals.  I'll try to make it clearer.

> and I see no mention of human user actions in that comment. The
> comment rather appears to states that this particular flag is designed
> to switch between two different scheduling policies.

Changes of scheduling policy only occur via calls to sched_setscheduler().

> 
> If the flag really is only about identifying sleeps that will involve
> human user actions, then surely it would be easy to set up a short set
> of guidelines in Documentation, say, that spell out exactly what the
> purpose is, and when it should be used.

Sounds reasonable.  I'll propose some changes to the scheduler 
documentation.

> That should be done _before_ one starts charging round converting every
> instance of TASK_INTERRUPTIBLE.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23  9:39                         ` Trond Myklebust
  2005-12-23 10:49                           ` Peter Williams
@ 2005-12-23 19:07                           ` Lee Revell
  2005-12-23 21:08                             ` Trond Myklebust
  1 sibling, 1 reply; 55+ messages in thread
From: Lee Revell @ 2005-12-23 19:07 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Williams, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Fri, 2005-12-23 at 10:39 +0100, Trond Myklebust wrote:
> No! This is not the same thing at all.
> 
> You are asking the coder to provide a policy judgement as to whether
> or
> not the users might care.
> 
> As far as I'm concerned, other users' MP3 player, X processes, and
> keyboard response times can rot in hell whenever I'm busy writing out
> data at full blast. I don't give a rats arse about user interactivity,
> because my priority is to see the batch jobs complete.
> 

By your logic it's also broken to use cond_resched() in filesystem code.

Lee


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 19:07                           ` Lee Revell
@ 2005-12-23 21:08                             ` Trond Myklebust
  2005-12-23 21:17                               ` Lee Revell
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-23 21:08 UTC (permalink / raw)
  To: Lee Revell
  Cc: Peter Williams, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Fri, 2005-12-23 at 14:07 -0500, Lee Revell wrote:

> By your logic it's also broken to use cond_resched() in filesystem code.

...and your point is?

Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 21:08                             ` Trond Myklebust
@ 2005-12-23 21:17                               ` Lee Revell
  2005-12-23 21:23                                 ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Lee Revell @ 2005-12-23 21:17 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Williams, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Fri, 2005-12-23 at 22:08 +0100, Trond Myklebust wrote:
> On Fri, 2005-12-23 at 14:07 -0500, Lee Revell wrote:
> 
> > By your logic it's also broken to use cond_resched() in filesystem code.
> 
> ...and your point is?

Reductio ad absurdum.  Subsystems not using cond_resched would render
Linux unusable for even trivial soft realtime applications like AV
playback and recording.

Lee


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 21:17                               ` Lee Revell
@ 2005-12-23 21:23                                 ` Trond Myklebust
  2005-12-23 22:04                                   ` Lee Revell
  0 siblings, 1 reply; 55+ messages in thread
From: Trond Myklebust @ 2005-12-23 21:23 UTC (permalink / raw)
  To: Lee Revell
  Cc: Peter Williams, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Fri, 2005-12-23 at 16:17 -0500, Lee Revell wrote:
> On Fri, 2005-12-23 at 22:08 +0100, Trond Myklebust wrote:
> > On Fri, 2005-12-23 at 14:07 -0500, Lee Revell wrote:
> > 
> > > By your logic it's also broken to use cond_resched() in filesystem code.
> > 
> > ...and your point is?
> 
> Reductio ad absurdum.  Subsystems not using cond_resched would render
> Linux unusable for even trivial soft realtime applications like AV
> playback and recording.

It may surprise you to learn that some people don't use their computers
for AV playback and recording. However absurd it may seem to you, those
people are quite happy to use 2.4.x kernels without a cond_resched
lurking in every nook and cranny.

  Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 21:23                                 ` Trond Myklebust
@ 2005-12-23 22:04                                   ` Lee Revell
  2005-12-23 22:10                                     ` Trond Myklebust
  0 siblings, 1 reply; 55+ messages in thread
From: Lee Revell @ 2005-12-23 22:04 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Peter Williams, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Fri, 2005-12-23 at 22:23 +0100, Trond Myklebust wrote:
> On Fri, 2005-12-23 at 16:17 -0500, Lee Revell wrote:
> > On Fri, 2005-12-23 at 22:08 +0100, Trond Myklebust wrote:
> > > On Fri, 2005-12-23 at 14:07 -0500, Lee Revell wrote:
> > > 
> > > > By your logic it's also broken to use cond_resched() in filesystem code.
> > > 
> > > ...and your point is?
> > 
> > Reductio ad absurdum.  Subsystems not using cond_resched would render
> > Linux unusable for even trivial soft realtime applications like AV
> > playback and recording.
> 
> It may surprise you to learn that some people don't use their computers
> for AV playback and recording. However absurd it may seem to you, those
> people are quite happy to use 2.4.x kernels without a cond_resched
> lurking in every nook and cranny.

Of course, but I think a reasonable goal for 2.6 is to maintain the
server side performance of 2.4 but also enable desktop type applications
to work well.

cond_resched is really a temporary hack to make the desktop usable until
the kernel becomes fully preemptible.

Lee


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 22:04                                   ` Lee Revell
@ 2005-12-23 22:10                                     ` Trond Myklebust
  0 siblings, 0 replies; 55+ messages in thread
From: Trond Myklebust @ 2005-12-23 22:10 UTC (permalink / raw)
  To: Lee Revell
  Cc: Peter Williams, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

On Fri, 2005-12-23 at 17:04 -0500, Lee Revell wrote:

> cond_resched is really a temporary hack to make the desktop usable until
> the kernel becomes fully preemptible.

...and my argument is that we should avoid adding yet another load of
scheduling hacks deep in unrelated code in order to satisfy yet another
minority of users. The Linux way has always been to emphasise
maintainability, and hence clean coding, over functionality.

Cheers,
  Trond


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-21  6:32   ` Peter Williams
  2005-12-21 13:21     ` Trond Myklebust
  2005-12-21 16:11     ` Ingo Molnar
@ 2006-01-02 11:01     ` Helge Hafting
  2006-01-02 23:54       ` Peter Williams
  2 siblings, 1 reply; 55+ messages in thread
From: Helge Hafting @ 2006-01-02 11:01 UTC (permalink / raw)
  To: Peter Williams
  Cc: Trond Myklebust, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

On Wed, Dec 21, 2005 at 05:32:52PM +1100, Peter Williams wrote:
> Trond Myklebust wrote:
[...]
> >
> >Sorry. That theory is just plain wrong. ALL of those case _ARE_
> >interactive sleeps.
> 
> It's not a theory.  It's a result of observing a -j 16 build with the 
> sources on an NFS mounted file system with top with and without the 
> patches and comparing that with the same builds with the sources on a 
> local file system.  Without the patches the tasks in the kernel build 
> all get the same dynamic priority as the X server and other interactive 
> programs when the sources are on an NFS mounted file system.  With the 
> patches they generally have dynamic priorities between 6 to 10 higher 
> than the X server and other interactive programs.
> 
A process waiting for NFS data loses CPU time, which is spent on running
something else.  Therefore, it gains some priority so it won't be
forever behind when it wakes up.  Same as for any other I/O wait.

Perhaps expecting a 16-way parallel make to have "no impact" is
a bit optimistic.  How about nicing the make, explicitly telling
linux that it isn't important?  Or how about giving important
tasks extra priority?

Helge Hafting

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2005-12-23 13:36                               ` Peter Williams
@ 2006-01-02 12:09                                 ` Pekka Enberg
  0 siblings, 0 replies; 55+ messages in thread
From: Pekka Enberg @ 2006-01-02 12:09 UTC (permalink / raw)
  To: Peter Williams
  Cc: Trond Myklebust, Kyle Moffett, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Hi,

Trond Myklebust wrote:
> >         /*
> >          * Tasks that have marked their sleep as noninteractive get
> >          * woken up without updating their sleep average. (i.e. their
> >          * sleep is handled in a priority-neutral manner, no priority
> >          * boost and no penalty.)
> >          */
> >
> > This appears to be the only documentation for the TASK_NONINTERACTIVE
> > flag,

On 12/23/05, Peter Williams <pwil3058@bigpond.net.au> wrote:
> I guess it makes too many assumptions about the reader's prior knowledge
> of the scheduler internals.  I'll try to make it clearer.

FWIW, Ingo invented TASK_NONINTERACTIVE to fix a problem I had with
Wine. See the following threads for further discussion:

http://marc.theaimsgroup.com/?t=111729237700002&r=1&w=2
http://marc.theaimsgroup.com/?t=111761183900001&r=1&w=2

                                 Pekka

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-02 11:01     ` Helge Hafting
@ 2006-01-02 23:54       ` Peter Williams
  2006-01-04  1:25         ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-02 23:54 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Trond Myklebust, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Helge Hafting wrote:
> On Wed, Dec 21, 2005 at 05:32:52PM +1100, Peter Williams wrote:
> 
>>Trond Myklebust wrote:
> 
> [...]
> 
>>>Sorry. That theory is just plain wrong. ALL of those case _ARE_
>>>interactive sleeps.
>>
>>It's not a theory.  It's a result of observing a -j 16 build with the 
>>sources on an NFS mounted file system with top with and without the 
>>patches and comparing that with the same builds with the sources on a 
>>local file system.  Without the patches the tasks in the kernel build 
>>all get the same dynamic priority as the X server and other interactive 
>>programs when the sources are on an NFS mounted file system.  With the 
>>patches they generally have dynamic priorities between 6 to 10 higher 
>>than the X server and other interactive programs.
>>
> 
> A process waiting for NFS data loses CPU time, which is spent on running
> something else.  Therefore, it gains some priority so it won't be
> forever behind when it wakes up.  Same as for any other I/O wait.

That's more or less independent of this issue, as the distribution of CPU
to tasks is largely determined by the time slice mechanism and the
dynamic priority is primarily about latency.  (This distinction is a
little distorted by the fact that, under some circumstances,
"interactive" tasks don't get moved to the expired list at the end of
their time slice, but this usually won't matter as genuine interactive
tasks aren't generally CPU hogs.)  In other words, the issue that you
raised is largely solved by the time tasks spend on the active queue
before moving to the expired queue rather than by the order in which they
run when on the active queue.

This problem is all about those tasks getting an inappropriate boost to 
improve their latency because they are mistakenly believed to be 
interactive.  Having had a closer think about the way the scheduler 
works I'm now of the opinion that completely ignoring sleeps labelled as 
TASK_NONINTERACTIVE may be a mistake and that it might be more 
appropriate to treat them the same as TASK_UNINTERRUPTIBLE but I'll bow 
to Ingo on this as he would have a better understanding of the issues 
involved.

> 
> Perhaps expecting a 16-way parallel make to have "no impact" is
> a bit optimistic.  How about nicing the make, explicitly telling
> linux that it isn't important?

Yes, but that shouldn't be necessary.  If I do the same build on a local 
file system everything works OK and the tasks in the build have dynamic 
priorities 8 to 10 slots higher than the X server and other interactive 
programs.

>  Or how about giving important
> tasks extra priority?

Only root can do that.  But some operating systems do just that, e.g. 
Solaris has an IA scheduling class (which all X based programs are run 
in) that takes precedence over programs in the TS class (which is the 
equivalent of Linux's SCHED_NORMAL).  I'm not sure how they handle the 
privilege issues related to stopping inappropriate programs misusing 
the IA class.  IA is really just TS with a boost, which is effectively 
the reverse of what the new SCHED_BATCH achieves.  Arguably, 
SCHED_BATCH is the superior way of doing this as it doesn't cause any 
privilege issues: shifting to SCHED_BATCH can be done by the owner of 
the task.

The main drawback to the SCHED_BATCH approach is that it (currently) 
requires the user to explicitly set it on the relevant tasks.  Its long 
term success would be greatly enhanced if programmers could be convinced 
to have their programs switch themselves to SCHED_BATCH unless they are 
genuine interactive processes.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-02 23:54       ` Peter Williams
@ 2006-01-04  1:25         ` Peter Williams
  2006-01-04  9:40           ` Marcelo Tosatti
  2006-01-04 21:51           ` Peter Williams
  0 siblings, 2 replies; 55+ messages in thread
From: Peter Williams @ 2006-01-04  1:25 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Peter Williams, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Peter Williams wrote:
> Helge Hafting wrote:
> 
>> On Wed, Dec 21, 2005 at 05:32:52PM +1100, Peter Williams wrote:
>>
>>> Trond Myklebust wrote:
>>
>>
>> [...]
>>
>>>> Sorry. That theory is just plain wrong. ALL of those cases _ARE_
>>>> interactive sleeps.
>>>
>>>
>>> It's not a theory.  It's a result of observing a -j 16 build with the 
>>> sources on an NFS mounted file system with top with and without the 
>>> patches and comparing that with the same builds with the sources on a 
>>> local file system.  Without the patches the tasks in the kernel build 
>>> all get the same dynamic priority as the X server and other 
>>> interactive programs when the sources are on an NFS mounted file 
>>> system.  With the patches they generally have dynamic priorities 
>>> between 6 and 10 higher than the X server and other interactive programs.
>>>
>>
>> A process waiting for NFS data loses CPU time, which is spent on 
>> running something else.  Therefore, it gains some priority so it won't be
>> forever behind when it wakes up.  Same as for any other I/O waiting.
> 
> 
> That's more or less independent of this issue as the distribution of CPU 
> to tasks is largely determined by the time slice mechanism and the 
> dynamic priority is primarily about latency.  (This distinction is a 
> little distorted by the fact that, under some circumstances, 
> "interactive" tasks don't get moved to the expired list at the end of 
> their time slice but this usually won't matter as genuine interactive 
> tasks aren't generally CPU hogs.)  In other words, the issue that you 
> raised is largely solved by the time tasks spend on the active queue 
> before moving to the expired queue rather than the order in which they 
> run when on the active queue.
> 
> This problem is all about those tasks getting an inappropriate boost to 
> improve their latency because they are mistakenly believed to be 
> interactive.

One of the unfortunate side effects of this is that it can affect 
scheduler fairness: if these tasks get sufficient bonus points, the 
TASK_INTERACTIVE() macro will return true for them and they will be 
rescheduled on the active queue instead of the expired queue at the end 
of their time slice (provided EXPIRED_STARVING() doesn't prevent this), 
with an obvious adverse effect on fairness.

The ideal design of the scheduler would be for the fairness mechanism 
and the interactive responsiveness mechanism to be independent but this 
is not the case due to the fact that requeueing interactive tasks on the 
expired array could add unacceptably to their latency.  As I said above 
this slight divergence from the ideal of perfect independence shouldn't 
matter as genuine interactive processes aren't very CPU intensive.

In summary, inappropriate identification of CPU intensive tasks as 
interactive has two bad effects: 1) responsiveness problems for genuine 
interactive tasks due to the extra competition at their dynamic priority 
and 2) a degradation of scheduling fairness; not just one.

For an example of the effect of inappropriate identification of CPU hogs 
as interactive tasks see the thread "[SCHED] Totally WRONG priority 
calculation with specific test-case (since 2.6.10-bk12)" in this list.

>  Having had a closer think about the way the scheduler 
> works I'm now of the opinion that completely ignoring sleeps labelled as 
> TASK_NONINTERACTIVE may be a mistake and that it might be more 
> appropriate to treat them the same as TASK_UNINTERRUPTIBLE but I'll bow 
> to Ingo on this as he would have a better understanding of the issues 
> involved.
> 
>>
>> Perhaps expecting a 16-way parallel make to have "no impact" is
>> a bit optimistic.  How about nicing the make, explicitly telling
>> linux that it isn't important?
> 
> 
> Yes, but that shouldn't be necessary.  If I do the same build on a local 
> file system everything works OK and the tasks in the build have dynamic 
> priorities 8 to 10 slots higher than the X server and other interactive 
> programs.

Further analysis indicates that this is not a complete solution as the 
tasks would still be identified as interactive and given a bonus. 
Although the change of nice value would be sufficient to stop these 
tasks competing with the genuine interactive tasks, they would probably 
still get a positive return value from TASK_INTERACTIVE() (as it's 
effectively based on the bonus acquired i.e. difference between prio and 
static_prio) and hence preferential treatment at the end of their time 
slice with a consequent degradation of scheduling fairness.

> 
>>  Or how about giving important
>> tasks extra priority?
> 
> 
> Only root can do that.  But some operating systems do just that e.g. 
> Solaris has an IA scheduling class (which all X based programs are run 
> in) that takes precedence over programs in the TS class (which is the 
> equivalent of Linux's SCHED_NORMAL).  I'm not sure how they handle the 
> privileges issues related to stopping inappropriate programs misusing 
> the IA class.  IA is really just TS with a boost which is effectively 
> just the reverse implementation of what the new SCHED_BATCH achieves. 
> Arguably, SCHED_BATCH is the superior way of doing this as it doesn't 
> cause any privilege issues as shifting to SCHED_BATCH can be done by the 
> owner of the task.
> 
> The main drawback to the SCHED_BATCH approach is that it (currently) 
> requires the user to explicitly set it on the relevant tasks.  Its long 
> term success would be greatly enhanced if programmers could be convinced 
> to have their programs switch themselves to SCHED_BATCH unless they are 
> genuine interactive processes.
> 
> Peter


-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-04  1:25         ` Peter Williams
@ 2006-01-04  9:40           ` Marcelo Tosatti
  2006-01-04 12:18             ` Con Kolivas
  2006-01-04 21:51           ` Peter Williams
  1 sibling, 1 reply; 55+ messages in thread
From: Marcelo Tosatti @ 2006-01-04  9:40 UTC (permalink / raw)
  To: Peter Williams
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 10720 bytes --]

Hi Peter,

On Wed, Jan 04, 2006 at 12:25:40PM +1100, Peter Williams wrote:
> Peter Williams wrote:
> >Helge Hafting wrote:
> >
> >>On Wed, Dec 21, 2005 at 05:32:52PM +1100, Peter Williams wrote:
> >>
> >>>Trond Myklebust wrote:
> >>
> >>
> >>[...]
> >>
> >>>>Sorry. That theory is just plain wrong. ALL of those cases _ARE_
> >>>>interactive sleeps.
> >>>
> >>>
> >>>It's not a theory.  It's a result of observing a -j 16 build with the 
> >>>sources on an NFS mounted file system with top with and without the 
> >>>patches and comparing that with the same builds with the sources on a 
> >>>local file system.  Without the patches the tasks in the kernel build 
> >>>all get the same dynamic priority as the X server and other 
> >>>interactive programs when the sources are on an NFS mounted file 
> >>>system.  With the patches they generally have dynamic priorities 
> >>>between 6 and 10 higher than the X server and other interactive programs.
> >>>
> >>
> >>A process waiting for NFS data loses CPU time, which is spent on 
> >>running something else.  Therefore, it gains some priority so it won't be
> >>forever behind when it wakes up.  Same as for any other io waiting.
> >
> >
> >That's more or less independent of this issue as the distribution of CPU 
> >to tasks is largely determined by the time slice mechanism and the 
> >dynamic priority is primarily about latency.  (This distinction is a 
> >little distorted by the fact that, under some circumstances, 
> >"interactive" tasks don't get moved to the expired list at the end of 
> >their time slice but this usually won't matter as genuine interactive 
> >tasks aren't generally CPU hogs.)  In other words, the issue that you 
> >raised is largely solved by the time tasks spend on the active queue 
> >before moving to the expired queue rather than the order in which they 
> >run when on the active queue.
> >
> >This problem is all about those tasks getting an inappropriate boost to 
> >improve their latency because they are mistakenly believed to be 
> >interactive.
> 
> One of the unfortunate side effects of this is that it can affect 
> scheduler fairness because if these tasks get sufficient bonus points 
> the TASK_INTERACTIVE() macro will return true for them and they will be 
> rescheduled on the active queue instead of the expired queue at the end 
> of the time slice (provided EXPIRED_STARVING() doesn't prevent this). 
> This will have an adverse effect on scheduling fairness.
> 
> The ideal design of the scheduler would be for the fairness mechanism 
> and the interactive responsiveness mechanism to be independent but this 
> is not the case due to the fact that requeueing interactive tasks on the 
> expired array could add unacceptably to their latency.  As I said above 
> this slight divergence from the ideal of perfect independence shouldn't 
> matter as genuine interactive processes aren't very CPU intensive.
> 
> In summary, inappropriate identification of CPU intensive tasks as 
> interactive has two bad effects: 1) responsiveness problems for genuine 
> interactive tasks due to the extra competition at their dynamic priority 
> and 2) a degradation of scheduling fairness; not just one.
> 
> For an example of the effect of inappropriate identification of CPU hogs 
> as interactive tasks see the thread "[SCHED] Totally WRONG priority 
> calculation with specific test-case (since 2.6.10-bk12)" in this list.

And another real-life example of the issue you describe above.

From marcelo.tosatti@cyclades.com Fri Dec  2 18:51:59 2005
Date: Fri, 2 Dec 2005 18:51:59 -0200
From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Ingo Molnar <mingo@elte.hu>, Nick Piggin <piggin@cyberone.com.au>
Cc: Regina Kodato <regina.kodato@cyclades.com>,
	Wanda Rosalino <wanda.rosalino@cyclades.com>,
	Edson Seabra <edson.seabra@cyclades.com>
Subject: scheduler starvation with v2.6.11 on embedded PPC appliance


We are experiencing what seems to be a scheduler starvation issue on our
application, running v2.6.11. The same load works as expected on v2.4.

We would like to know if v2.6.14 could possibly fix this problem.

Hardware is a PowerPC 8xx at 48Mhz (embedded SoC) with 128MB RAM,
handling remote access to its own 48 serial ports running at 9600bps
each (8N1, HW flow control).

Access to the ports is performed via SSH (one sshd instance for each
port), and there are two different configurations:

1) slim socket mode: Each SSH process is responsible for handling IO to
its own serial port.

2) buffering mode: where a single process handles IO on the 48 ttys,
copying data to a shared memory region and signalling the respective ssh
daemon with SIGIO once a certain amount of data is ready.

The test transfers a 78k file via each serial port (total = 48*78k =
3.7MB) from an x86 Linux box, usually taking:

78110 bytes after 81 seconds, 964 cps (~9640 bps).

Time varies from 77 sec up to 85 sec.

Problem description:

Using slim socket mode, where each SSH process handles IO to its own
port, the scheduler starves a certain number of processes, causing their
connections to timeout.

Further investigation with schedstats allowed us to notice that
"wait_ticks" is much higher using this mode.

Follows the output of "latency" and "vmstat 2" with buffering mode (low
wait_ticks, high number of context switches):

913 (cy_buffering) 25(25) 1077(1077) 843(843) 0.03 1.28
1166 (sshd) 220(220) 143(143) 1276(1276) 0.17 0.11
913 (cy_buffering) 36(11) 1078(1) 952(109) 0.10 0.01
1166 (sshd) 231(11) 191(48) 1883(607) 0.02 0.08
913 (cy_buffering) 242(206) 1131(53) 3200(2248) 0.09 0.02
1166 (sshd) 294(63) 383(192) 2523(640) 0.10 0.30
913 (cy_buffering) 440(198) 1172(41) 5637(2437) 0.08 0.02
1166 (sshd) 353(59) 574(191) 3160(637) 0.09 0.30
913 (cy_buffering) 644(204) 1199(27) 7918(2281) 0.09 0.01
1166 (sshd) 372(19) 678(104) 3771(611) 0.03 0.17
913 (cy_buffering) 644(0) 1201(2) 7978(60) 0.00 0.03
1166 (sshd) 372(0) 681(3) 4372(601) 0.00 0.00

procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
 0  0      0 159752  51200   9960    0    0     0     0   23  1171  1 11  0 88
 0  0      0 159752  51200   9960    0    0     0     0   10  1111  0  5  0 94
 1  0      0 159752  51200   9964    0    0     2     0  311  1226 35 55  0 10
 1  0      0 159752  51200   9964    0    0     0     0  934  1718 50 50  0  0
 1  0      0 159752  51200   9964    0    0     0     0  874  1519 52 48  0  0
11  0      0 159752  51200   9964    0    0     0     0  800  1358 47 53  0  0
 7  0      0 159752  51200   9964    0    0     0     0  527  1235 44 56  0  0
 1  0      0 159752  51200   9964    0    0     0     0  301  1144 47 53  0  0
 1  0      0 159752  51200   9964    0    0     0     0  363  1241 43 57  0  0
 2  0      0 159752  51200   9964    0    0     0     1  428  1194 45 55  0  0
 1  0      0 159752  51200   9964    0    0     0     0  428  1141 42 58  0  0
 1  0      0 159752  51200   9964    0    0     0     0  433  1255 44 56  0  0
 2  0      0 159752  51200   9964    0    0     0     0  444  1067 46 54  0  0
 1  0      0 159752  51200   9964    0    0     0     0  465  1071 55 45  0  0
 1  0      0 159752  51200   9964    0    0     0     0  510  1101 42 58  0  0
 1  0      0 159752  51200   9964    0    0     0     0  409  1082 47 53  0  0
 1  0      0 159752  51200   9964    0    0     0     0  401  1075 40 60  0  0
 1  0      0 159752  51200   9964    0    0     0     0  409  1081 44 56  0  0


And with slim socket mode (very high wait_ticks, low number of context
switches):

1200 (sshd) 382(0) 3891(0) 1879(30) 0.00 0.00
1216 (sshd) 479(0) 7216(0) 2387(30) 0.00 0.00
1241 (sshd) 802(0) 6869(2) 4069(31) 0.00 0.06
1276 (sshd) 499(2) 8807(42) 3204(34) 0.06 1.24
1301 (sshd) 601(2) 8319(38) 2752(32) 0.06 1.19

1200 (sshd) 388(6) 4184(293) 1909(30) 0.20 9.77
1216 (sshd) 487(8) 7516(300) 2413(26) 0.31 11.54
1241 (sshd) 866(64) 7575(706) 4427(358) 0.18 1.97
1276 (sshd) 656(157) 9824(1017) 3756(552) 0.28 1.84
1301 (sshd) 610(9) 8422(103) 2761(9) 1.00 11.44

1200 (sshd) 415(27) 7132(2948) 1982(73) 0.37 40.38
1216 (sshd) 511(24) 10537(3021) 2496(83) 0.29 36.40
1241 (sshd) 943(77) 8537(962) 4875(448) 0.17 2.15
1276 (sshd) 776(120) 10892(1068) 4336(580) 0.21 1.84
1301 (sshd) 620(10) 11034(2612) 2771(10) 1.00 261.20

procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
 5  0      0 159816  51200   9916    0    0     0     0   18   113  0  1  0 99
 0  0      0 159816  51200   9916    0    0     0     0   19   112  0  2  0 98
 0  0      0 159816  51200   9916    0    0     0     0  166   176  1  6  0 93
37  0      0 159880  51200   9916    0    0     0     0 2857  1219 46 50  0  4
38  0      0 159880  51200   9916    0    0     0     0 2662  1059 58 42  0  0
33  0      0 159880  51200   9916    0    0     0     0 1058   496 72 28  0  0
33  0      0 159880  51200   9916    0    0     0     0 1593   743 70 30  0  0
33  0      0 159880  51200   9916    0    0     0     0 1519   706 71 29  0  0
34  0      0 159880  51200   9916    0    0     0     0 1073   520 74 26  0  0
35  0      0 159880  51200   9916    0    0     0     0 1047   493 67 33  0  0
49  0      0 159880  51200   9916    0    0     0     0 1130   543 70 30  0  0
34  0      0 159880  51200   9916    0    0     0     0 1239   612 70 30  0  0
46  0      0 159880  51200   9916    0    0     0     0 1427   737 69 31  0  0
34  0      0 159880  51200   9916    0    0     0     0  835   423 73 27  0  0
36  0      0 159880  51200   9916    0    0     0     1 1036   414 69 31  0  0
37  0      0 159880  51200   9916    0    0     0     0  917   379 73 27  0  0
44  0      0 159880  51200   9916    0    0     0     0 3401  1311 65 35  0  0

Another noticeable difference on schedstat output is that slim mode
causes the scheduler to switch the active/expired queues 4 times during
the total run, while buffering mode switches the queues 38 times.

Attached you can find schedstats-buffering.txt and schedstats-slim.txt.

On v2.4.17 both modes work fine, with a high context-switch number.

We suspected that the TASK_INTERACTIVE() logic in kernel/sched.c would
be moving some processes directly to the active list, thus starving some
others. So we set the nice value of all 48 processes to "nice +19" to
disable TASK_INTERACTIVE(), and the starvation is gone.  However, with
+19 it becomes impossible to use the box interactively while the test
runs, whereas with the default "0" nice value the box stays usable.

Are there significant changes between v2.6.11 -> v2.6.14 aimed at fixing
this problem?

[-- Attachment #2: schedstats-slim.txt --]
[-- Type: text/plain, Size: 28206 bytes --]


00:00:00--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


        390          schedule()
          0(  0.00%) switched active and expired queues
        285( 73.08%) used existing active queue


        284          try_to_wake_up()
        284(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.02/0.01      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:01--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


        448          schedule()
          0(  0.00%) switched active and expired queues
        323( 72.10%) used existing active queue


        322          try_to_wake_up()
        322(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.01/0.00      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:01--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       1944          schedule()
          0(  0.00%) switched active and expired queues
       1744( 89.71%) used existing active queue


       3695          try_to_wake_up()
       3695(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.16/1.44      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:02--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       3526          schedule()
          0(  0.00%) switched active and expired queues
       3526(100.00%) used existing active queue


       9211          try_to_wake_up()
       9211(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.25/4.49      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:03--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2356          schedule()
          0(  0.00%) switched active and expired queues
       2356(100.00%) used existing active queue


       4498          try_to_wake_up()
       4498(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.25/1.90      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:03--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2541          schedule()
          0(  0.00%) switched active and expired queues
       2541(100.00%) used existing active queue


       4905          try_to_wake_up()
       4905(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.26/2.25      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:04--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2340          schedule()
          0(  0.00%) switched active and expired queues
       2340(100.00%) used existing active queue


       4520          try_to_wake_up()
       4520(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.26/2.15      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:05--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2206          schedule()
          0(  0.00%) switched active and expired queues
       2206(100.00%) used existing active queue


       4278          try_to_wake_up()
       4278(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.26/2.05      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:05--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       1968          schedule()
          0(  0.00%) switched active and expired queues
       1968(100.00%) used existing active queue


       4513          try_to_wake_up()
       4513(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.26/33.56     avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:06--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2338          schedule()
          1(  0.04%) switched active and expired queues
       2337( 99.96%) used existing active queue


       5741          try_to_wake_up()
       5741(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.26/7.76      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:06--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2598          schedule()
          0(  0.00%) switched active and expired queues
       2598(100.00%) used existing active queue


       6888          try_to_wake_up()
       6888(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.27/2.48      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:07--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2643          schedule()
          0(  0.00%) switched active and expired queues
       2643(100.00%) used existing active queue


       7118          try_to_wake_up()
       7118(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.27/1.79      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:08--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2825          schedule()
          0(  0.00%) switched active and expired queues
       2825(100.00%) used existing active queue


       7474          try_to_wake_up()
       7474(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.27/2.93      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:09--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2618          schedule()
          0(  0.00%) switched active and expired queues
       2618(100.00%) used existing active queue


       6914          try_to_wake_up()
       6914(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.27/2.41      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:10--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       3365          schedule()
          1(  0.03%) switched active and expired queues
       3364( 99.97%) used existing active queue


       8379          try_to_wake_up()
       8379(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.27/10.09     avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:10--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2920          schedule()
          1(  0.03%) switched active and expired queues
       2919( 99.97%) used existing active queue


       4522          try_to_wake_up()
       4522(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.24/3.51      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:11--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2256          schedule()
          0(  0.00%) switched active and expired queues
       2256(100.00%) used existing active queue


       3469          try_to_wake_up()
       3469(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.24/4.55      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:11--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       2280          schedule()
          1(  0.04%) switched active and expired queues
       2279( 99.96%) used existing active queue


       3480          try_to_wake_up()
       3480(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.24/3.45      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


[-- Attachment #3: schedstats-buffering.txt --]
[-- Type: text/plain, Size: 32907 bytes --]


00:00:00--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5495          schedule()
          0(  0.00%) switched active and expired queues
       5007( 91.12%) used existing active queue


       4952          try_to_wake_up()
       4952(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.00/0.04      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:01--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5617          schedule()
          0(  0.00%) switched active and expired queues
       5112( 91.01%) used existing active queue


       5056          try_to_wake_up()
       5056(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.01/0.00      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:01--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5963          schedule()
          2(  0.03%) switched active and expired queues
       5816( 97.53%) used existing active queue


       6660          try_to_wake_up()
       6660(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.06/0.11      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:02--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       6482          schedule()
          2(  0.03%) switched active and expired queues
       6480( 99.97%) used existing active queue


       9882          try_to_wake_up()
       9882(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.33      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:02--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5517          schedule()
          1(  0.02%) switched active and expired queues
       5516( 99.98%) used existing active queue


       8904          try_to_wake_up()
       8904(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.10/0.50      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:03--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       6183          schedule()
          2(  0.03%) switched active and expired queues
       6181( 99.97%) used existing active queue


      10045          try_to_wake_up()
      10045(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.10/0.55      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:03--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5366          schedule()
          1(  0.02%) switched active and expired queues
       5365( 99.98%) used existing active queue


       8628          try_to_wake_up()
       8628(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.10/0.52      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:04--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5447          schedule()
          4(  0.07%) switched active and expired queues
       5443( 99.93%) used existing active queue


       8767          try_to_wake_up()
       8767(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.47      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:04--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5391          schedule()
          1(  0.02%) switched active and expired queues
       5390( 99.98%) used existing active queue


       8719          try_to_wake_up()
       8719(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.10/0.50      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:05--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5499          schedule()
          2(  0.04%) switched active and expired queues
       5497( 99.96%) used existing active queue


       8925          try_to_wake_up()
       8925(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.45      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:05--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5419          schedule()
          4(  0.07%) switched active and expired queues
       5415( 99.93%) used existing active queue


       8747          try_to_wake_up()
       8747(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.45      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:06--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5471          schedule()
          2(  0.04%) switched active and expired queues
       5469( 99.96%) used existing active queue


       8832          try_to_wake_up()
       8832(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.43      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:06--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5449          schedule()
          2(  0.04%) switched active and expired queues
       5447( 99.96%) used existing active queue


       8765          try_to_wake_up()
       8765(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.44      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:07--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5491          schedule()
          2(  0.04%) switched active and expired queues
       5489( 99.96%) used existing active queue


       8859          try_to_wake_up()
       8859(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.42      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:07--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5449          schedule()
          4(  0.07%) switched active and expired queues
       5445( 99.93%) used existing active queue


       8761          try_to_wake_up()
       8761(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.43      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:08--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5470          schedule()
          1(  0.02%) switched active and expired queues
       5469( 99.98%) used existing active queue


       8893          try_to_wake_up()
       8893(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.10/0.44      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:08--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5421          schedule()
          3(  0.06%) switched active and expired queues
       5418( 99.94%) used existing active queue


       8740          try_to_wake_up()
       8740(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.48      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:09--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5488          schedule()
          2(  0.04%) switched active and expired queues
       5486( 99.96%) used existing active queue


       8744          try_to_wake_up()
       8744(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.09/0.41      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:09--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5745          schedule()
          3(  0.05%) switched active and expired queues
       5450( 94.87%) used existing active queue


       5672          try_to_wake_up()
       5672(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.04/0.06      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:10--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5478          schedule()
          0(  0.00%) switched active and expired queues
       4974( 90.80%) used existing active queue


       4962          try_to_wake_up()
       4962(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.00/0.02      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


00:00:10--------------------------------------------------------------
          0          sys_sched_yield()
          0(  0.00%) found (only) active queue empty on current cpu
          0(  0.00%) found (only) expired queue empty on current cpu
          0(  0.00%) found both queues empty on current cpu
          0(  0.00%) found neither queue empty on current cpu


       5478          schedule()
          0(  0.00%) switched active and expired queues
       4970( 90.73%) used existing active queue


       4963          try_to_wake_up()
       4963(100.00%) task already running, or killed
          0(  0.00%) successfully moved a task to waking cpu
          0(  0.00%) task started on previous cpu

          0(  0.00%) tried to move a task because of possible affinity
          0(  0.00%) tried to move a task to improve load balancing


          0          wake_up_forked_thread()
          0(  0.00%) successfully moved a task


          0          pull_task()
          0(  0.00%) moved when newly idle
          0(  0.00%) moved while idle
          0(  0.00%) moved while busy
          0(  0.00%) moved from active_load_balance()

          0          active_load_balance()
          0          sched_balance_exec()
          0          sched_migrate_task()

      0.00/0.00      avg runtime/latency over all cpus (ms)

          0          load_balance()
          0(  0.00%) called while idle
          0(  0.00%) called while busy
          0(  0.00%) called when newly idle

          0          sched_balance_exec() tried to push a task


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-04 12:18             ` Con Kolivas
@ 2006-01-04 10:31               ` Marcelo Tosatti
  0 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2006-01-04 10:31 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Peter Williams, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

On Wed, Jan 04, 2006 at 11:18:01PM +1100, Con Kolivas wrote:
> On Wednesday 04 January 2006 20:40, Marcelo Tosatti wrote:
> > We suspected that the TASK_INTERACTIVE() logic in kernel/sched.c would
> > be moving some processes directly to the active list, thus starving some
> > others. So we set the nice value of all 48 processes to "nice +19" to
> > disable TASK_INTERACTIVE() and the starvation is gone. However with +19
> > it becomes impossible to use the box interactively while the test runs,
> > which is the case with the default "0" nice value.
> >
> > Are there significant changes between v2.6.11 -> v2.6.14 aimed at fixing
> > this problem?
> 
> The SCHED_BATCH policy Ingo has implemented should help just such a problem.

Yeap, he sent me the patch (which I promised to test), but I still haven't.

Will do ASAP.



* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-04  9:40           ` Marcelo Tosatti
@ 2006-01-04 12:18             ` Con Kolivas
  2006-01-04 10:31               ` Marcelo Tosatti
  0 siblings, 1 reply; 55+ messages in thread
From: Con Kolivas @ 2006-01-04 12:18 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Peter Williams, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

On Wednesday 04 January 2006 20:40, Marcelo Tosatti wrote:
> We suspected that the TASK_INTERACTIVE() logic in kernel/sched.c would
> be moving some processes directly to the active list, thus starving some
> others. So we set the nice value of all 48 processes to "nice +19" to
> disable TASK_INTERACTIVE() and the starvation is gone. However with +19
> it becomes impossible to use the box interactively while the test runs,
> which is the case with the default "0" nice value.
>
> Are there significant changes between v2.6.11 -> v2.6.14 aimed at fixing
> this problem?

The SCHED_BATCH policy Ingo has implemented should help just such a problem.

Con


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-04  1:25         ` Peter Williams
  2006-01-04  9:40           ` Marcelo Tosatti
@ 2006-01-04 21:51           ` Peter Williams
  2006-01-05  6:31             ` Mike Galbraith
  1 sibling, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-04 21:51 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Trond Myklebust, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

Peter Williams wrote:
> Peter Williams wrote:
> 
>> Helge Hafting wrote:
>>
>>> On Wed, Dec 21, 2005 at 05:32:52PM +1100, Peter Williams wrote:
>>>
>>>> Trond Myklebust wrote:
>>>
>>>
>>>
>>> [...]
>>>
>>>>> Sorry. That theory is just plain wrong. ALL of those cases _ARE_
>>>>> interactive sleeps.
>>>>
>>>>
>>>>
>>>> It's not a theory.  It's a result of observing a -j 16 build with 
>>>> the sources on an NFS mounted file system with top with and without 
>>>> the patches and comparing that with the same builds with the sources 
>>>> on a local file system.  Without the patches the tasks in the kernel 
>>>> build all get the same dynamic priority as the X server and other 
>>>> interactive programs when the sources are on an NFS mounted file 
>>>> system.  With the patches they generally have dynamic priorities 
>>>> between 6 to 10 higher than the X server and other interactive 
>>>> programs.
>>>>
>>>
>>> A process waiting for NFS data loses CPU time, which is spent on
>>> running something else.  Therefore, it gains some priority so it
>>> won't be forever behind when it wakes up.  Same as for any other
>>> I/O waiting.
>>
>>
>>
>> That's more or less independent of this issue as the distribution of 
>> CPU to tasks is largely determined by the time slice mechanism and the 
>> dynamic priority is primarily about latency.  (This distinction is a 
>> little distorted by the fact that, under some circumstances, 
>> "interactive" tasks don't get moved to the expired list at the end of 
>> their time slice but this usually won't matter as genuine interactive 
>> tasks aren't generally CPU hogs.)  In other words, the issue that you 
>> raised is largely solved by the time tasks spend on the active queue 
>> before moving to the expired queue rather than the order in which they 
>> run when on the active queue.
>>
>> This problem is all about those tasks getting an inappropriate boost 
>> to improve their latency because they are mistakenly believed to be 
>> interactive.
> 
> 
> One of the unfortunate side effects of this is that it can affect 
> scheduler fairness: if these tasks get sufficient bonus points, the 
> TASK_INTERACTIVE() macro will return true for them and they will be 
> requeued on the active queue instead of the expired queue at the end 
> of their time slice (provided EXPIRED_STARVING() doesn't prevent 
> this), with an adverse effect on scheduling fairness.

I should have added here that if EXPIRED_STARVING() stops these tasks 
from being requeued on the active queue at the end of their time slice 
then it will also stop genuine interactive tasks from being requeued on 
the active queue with bad effects for interactive responsiveness.

> 
> The ideal design of the scheduler would be for the fairness mechanism 
> and the interactive responsiveness mechanism to be independent but this 
> is not the case due to the fact that requeueing interactive tasks on the 
> expired array could add unacceptably to their latency.  As I said above 
> this slight divergence from the ideal of perfect independence shouldn't 
> matter as genuine interactive processes aren't very CPU intensive.
> 
> In summary, inappropriate identification of CPU intensive tasks as 
> interactive has two bad effects: 1) responsiveness problems for genuine 
> interactive tasks due to the extra competition at their dynamic priority 
> and 2) a degradation of scheduling fairness; not just one.
> 
> For an example of the effect of inappropriate identification of CPU hogs 
> as interactive tasks see the thread "[SCHED] Totally WRONG priority 
> calculation with specific test-case (since 2.6.10-bk12)" in this list.
> 
>>  Having had a closer think about the way the scheduler works I'm now 
>> of the opinion that completely ignoring sleeps labelled as 
>> TASK_NONINTERACTIVE may be a mistake and that it might be more 
>> appropriate to treat them the same as TASK_UNINTERRUPTIBLE but I'll bow 
>> to Ingo on this as he would have a better understanding of the issues 
>> involved.

I've changed my mind again on this and now think that, rather than 
treating TASK_NONINTERACTIVE sleeps the way TASK_UNINTERRUPTIBLE sleeps 
are currently treated, TASK_UNINTERRUPTIBLE sleeps should be ignored 
just like TASK_NONINTERACTIVE sleeps currently are.

>>
>>>
>>> Perhaps expecting a 16-way parallel make to have "no impact" is
>>> a bit optimistic.  How about nicing the make, explicitly telling
>>> linux that it isn't important?
>>
>>
>>
>> Yes, but that shouldn't be necessary.  If I do the same build on a 
>> local file system everything works OK and the tasks in the build have 
>> dynamic priorities 8 to 10 slots higher than the X server and other 
>> interactive programs.
> 
> 
> Further analysis indicates that this is not a complete solution as the 
> tasks would still be identified as interactive and given a bonus. 
> Although the change of nice value would be sufficient to stop these 
> tasks competing with the genuine interactive tasks, they would probably 
> still get a positive return value from TASK_INTERACTIVE() (as it's 
> effectively based on the bonus acquired i.e. difference between prio and 
> static_prio) and hence preferential treatment at the end of their time 
> slice with a consequent degradation of scheduling fairness.
> 
>>
>>>  Or how about giving important
>>> tasks extra priority?
>>
>>
>>
>> Only root can do that.  But some operating systems do just that e.g. 
>> Solaris has an IA scheduling class (which all X based programs are run 
>> in) that takes precedence over programs in the TS class (which is the 
>> equivalent of Linux's SCHED_NORMAL).  I'm not sure how they handle the 
>> privileges issues related to stopping inappropriate programs misusing 
>> the IA class.  IA is really just TS with a boost which is effectively 
>> just the reverse implementation of what the new SCHED_BATCH achieves. 
>> Arguably, SCHED_BATCH is the superior way of doing this as it doesn't 
>> cause any privilege issues as shifting to SCHED_BATCH can be done by 
>> the owner of the task.
>>
>> The main drawback to the SCHED_BATCH approach is that it (currently) 
>> requires the user to explicitly set it on the relevant tasks.  Its 
>> long-term success would be greatly enhanced if programmers could be 
>> convinced to have their programs switch themselves to SCHED_BATCH 
>> unless they are genuine interactive processes.

I think that some of the harder to understand parts of the scheduler 
code are actually attempts to overcome the undesirable effects (such as 
those I've described) of inappropriately identifying tasks as 
interactive.  I think that it would have been better to attempt to fix 
the inappropriate identifications rather than their effects and I think 
the prudent use of TASK_NONINTERACTIVE is an important tool for 
achieving this.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce


* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-04 21:51           ` Peter Williams
@ 2006-01-05  6:31             ` Mike Galbraith
  2006-01-05 11:31               ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Mike Galbraith @ 2006-01-05  6:31 UTC (permalink / raw)
  To: Peter Williams, Helge Hafting
  Cc: Trond Myklebust, Ingo Molnar, Con Kolivas, Linux Kernel Mailing List

At 08:51 AM 1/5/2006 +1100, Peter Williams wrote:

>I think that some of the harder to understand parts of the scheduler code 
>are actually attempts to overcome the undesirable effects (such as those 
>I've described) of inappropriately identifying tasks as interactive.  I 
>think that it would have been better to attempt to fix the inappropriate 
>identifications rather than their effects and I think the prudent use of 
>TASK_NONINTERACTIVE is an important tool for achieving this.

IMHO, that's nothing but a cover for the weaknesses induced by using 
exclusively sleep time as an information source for the priority 
calculation.  While this heuristic does work pretty darn well, it's easily 
fooled (intentionally or otherwise).  The challenge is to find the right 
low cost informational component, and to stir it in at O(1).

The fundamental problem with the whole interactivity issue is that the 
kernel has no way to know if there's a human involved or not.  My 100%cpu 
GL screensaver is interactive while I'm mindlessly staring at it.

         -Mike 



* Re: [PATCH] sched: Fix adverse effects of NFS client  on interactive response
  2006-01-05  6:31             ` Mike Galbraith
@ 2006-01-05 11:31               ` Peter Williams
  2006-01-05 14:31                 ` Mike Galbraith
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-05 11:31 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Mike Galbraith wrote:
> At 08:51 AM 1/5/2006 +1100, Peter Williams wrote:
> 
>> I think that some of the harder to understand parts of the scheduler 
>> code are actually attempts to overcome the undesirable effects (such 
>> as those I've described) of inappropriately identifying tasks as 
>> interactive.  I think that it would have been better to attempt to fix 
>> the inappropriate identifications rather than their effects and I 
>> think the prudent use of TASK_NONINTERACTIVE is an important tool for 
>> achieving this.
> 
> 
> IMHO, that's nothing but a cover for the weaknesses induced by using 
> exclusively sleep time as an information source for the priority 
> calculation.  While this heuristic does work pretty darn well, it's 
> easily fooled (intentionally or otherwise).  The challenge is to find 
> the right low cost informational component, and to stir it in at O(1).

TASK_NONINTERACTIVE helps in this regard: it costs nothing in the code 
where it's used and probably decreases the cost of the scheduler code 
by enabling some processing to be skipped.  If its judicious use means 
the heuristic is only fed interactive sleep data, the heuristic's 
accuracy in identifying interactive tasks should improve.  It may also 
allow the heuristic to be simplified.

Other potential information sources for the priority calculation may 
also benefit from TASK_NONINTERACTIVE.  E.g. measuring interactive 
latency requires knowing that the task is waking from an interactive 
sleep.

> 
> The fundamental problem with the whole interactivity issue is that the 
> kernel has no way to know if there's a human involved or not.

Which is why SCHED_BATCH has promise.  The key to it becoming really 
useful will be getting authors of non-interactive programs to use it. 
The hard part will be getting them to admit that their programs are 
non-interactive and undeserving of a boost.

>  My 
> 100%cpu GL screensaver is interactive while I'm mindlessly staring at it.

I've never actually seen what bonuses the screensaver gets :-) but I 
imagine any sleeping it does follows a very regular sleep/run pattern, 
and this regularity could be measured and used to exclude it from 
bonuses.  However, some extra parameters would be needed to avoid 
depriving audio and video programs of bonuses, as they too have very 
regular sleep/run patterns.  The average sleep/run interval is one 
possibility, as audio/video programs tend to use small intervals.
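Roughly, the idea might look like this; the decaying averages and every 
constant here are invented for illustration and taken from no kernel code:

```c
#include <assert.h>
#include <math.h>

/* Sketch: track a decaying average of a task's sleep/run cycle length
 * and its jitter.  A very regular pattern with LONG cycles (screensaver-
 * like) loses the bonus; regular but SHORT cycles (audio/video players)
 * keep it.  All constants are invented. */
struct pattern {
	double avg_cycle_ms;	/* decaying mean of sleep+run cycle length */
	double jitter_ms;	/* decaying mean absolute deviation */
};

static void observe_cycle(struct pattern *p, double cycle_ms)
{
	p->jitter_ms = 0.875 * p->jitter_ms
		     + 0.125 * fabs(cycle_ms - p->avg_cycle_ms);
	p->avg_cycle_ms = 0.875 * p->avg_cycle_ms + 0.125 * cycle_ms;
}

static int deny_bonus(const struct pattern *p)
{
	/* regular (low jitter) AND long-interval: treat as non-interactive */
	return p->jitter_ms < 0.1 * p->avg_cycle_ms && p->avg_cycle_ms > 100.0;
}
```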

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client  on interactive response
  2006-01-05 11:31               ` Peter Williams
@ 2006-01-05 14:31                 ` Mike Galbraith
  2006-01-05 23:13                   ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Mike Galbraith @ 2006-01-05 14:31 UTC (permalink / raw)
  To: Peter Williams
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

At 10:31 PM 1/5/2006 +1100, Peter Williams wrote:
>Mike Galbraith wrote:
>>At 08:51 AM 1/5/2006 +1100, Peter Williams wrote:
>>
>>>I think that some of the harder to understand parts of the scheduler 
>>>code are actually attempts to overcome the undesirable effects (such as 
>>>those I've described) of inappropriately identifying tasks as 
>>>interactive.  I think that it would have been better to attempt to fix 
>>>the inappropriate identifications rather than their effects and I think 
>>>the prudent use of TASK_NONINTERACTIVE is an important tool for achieving this.
>>
>>IMHO, that's nothing but a cover for the weaknesses induced by using 
>>exclusively sleep time as an information source for the priority 
>>calculation.  While this heuristic does work pretty darn well, it's 
>>easily fooled (intentionally or otherwise).  The challenge is to find the 
>>right low cost informational component, and to stir it in at O(1).
>
>TASK_NONINTERACTIVE helps in this regard, is no cost in the code where 
>it's used and probably decreases the costs in the scheduler code by 
>enabling some processing to be skipped.  If by its judicious use the 
>heuristic is only fed interactive sleep data the heuristics accuracy in 
>identifying interactive tasks should be improved.  It may also allow the 
>heuristic to be simplified.

I disagree.  You can nip and tuck all the bits of sleep time you want, and 
it'll just shift the lumpy spots around (btdt).

         -Mike 


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client   on interactive response
  2006-01-05 14:31                 ` Mike Galbraith
@ 2006-01-05 23:13                   ` Peter Williams
  2006-01-05 23:33                     ` Con Kolivas
  2006-01-06  7:39                     ` Mike Galbraith
  0 siblings, 2 replies; 55+ messages in thread
From: Peter Williams @ 2006-01-05 23:13 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Mike Galbraith wrote:
> At 10:31 PM 1/5/2006 +1100, Peter Williams wrote:
> 
>> Mike Galbraith wrote:
>>
>>> At 08:51 AM 1/5/2006 +1100, Peter Williams wrote:
>>>
>>>> I think that some of the harder to understand parts of the scheduler 
>>>> code are actually attempts to overcome the undesirable effects (such 
>>>> as those I've described) of inappropriately identifying tasks as 
>>>> interactive.  I think that it would have been better to attempt to 
>>>> fix the inappropriate identifications rather than their effects and 
>>>> I think the prudent use of TASK_NONINTERACTIVE is an important tool 
>>>> for achieving this.
>>>
>>>
>>> IMHO, that's nothing but a cover for the weaknesses induced by using 
>>> exclusively sleep time as an information source for the priority 
>>> calculation.  While this heuristic does work pretty darn well, it's 
>>> easily fooled (intentionally or otherwise).  The challenge is to find 
>>> the right low cost informational component, and to stir it in at O(1).
>>
>>
>> TASK_NONINTERACTIVE helps in this regard, is no cost in the code where 
>> it's used and probably decreases the costs in the scheduler code by 
>> enabling some processing to be skipped.  If by its judicious use the 
>> heuristic is only fed interactive sleep data the heuristics accuracy 
>> in identifying interactive tasks should be improved.  It may also 
>> allow the heuristic to be simplified.
> 
> 
> I disagree.  You can nip and tuck all the bits of sleep time you want, 
> and it'll just shift the lumpy spots around (btdt).

Yes, but there's a lot of (understandable) reluctance to do any major 
rework of this part of the scheduler so we're stuck with nips and tucks 
for the time being.  This patch is a zero cost nip and tuck.

If the plugsched patches were included in -mm we could get wider testing 
of alternative scheduling mechanisms.  But I think it will take a lot of 
testing of the new schedulers to allay fears that they may introduce new 
problems of their own.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client   on interactive response
  2006-01-05 23:13                   ` Peter Williams
@ 2006-01-05 23:33                     ` Con Kolivas
  2006-01-06  0:02                       ` Peter Williams
  2006-01-06  7:39                     ` Mike Galbraith
  1 sibling, 1 reply; 55+ messages in thread
From: Con Kolivas @ 2006-01-05 23:33 UTC (permalink / raw)
  To: Peter Williams
  Cc: Mike Galbraith, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

On Fri, 6 Jan 2006 10:13 am, Peter Williams wrote:
> If the plugsched patches were included in -mm we could get wider testing
> of alternative scheduling mechanisms.  But I think it will take a lot of
> testing of the new schedulers to allay fears that they may introduce new
> problems of their own.

When I first generated plugsched and posted it to lkml for inclusion in 
-mm, it was blocked by both Ingo and Linus as having no chance of being 
included, and I doubt they've changed their position since then.  As 
you're well aware, this is why I gave up working on it and let you 
maintain it.  Obviously I thought it was a useful feature or I wouldn't 
have worked on it.

Con

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client   on interactive response
  2006-01-05 23:33                     ` Con Kolivas
@ 2006-01-06  0:02                       ` Peter Williams
  2006-01-06  0:08                         ` Con Kolivas
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-06  0:02 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Mike Galbraith, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

Con Kolivas wrote:
> On Fri, 6 Jan 2006 10:13 am, Peter Williams wrote:
> 
>>If the plugsched patches were included in -mm we could get wider testing
>>of alternative scheduling mechanisms.  But I think it will take a lot of
>>testing of the new schedulers to allay fears that they may introduce new
>>problems of their own.
> 
> 
> When I first generated plugsched and posted it to lkml for inclusion in -mm it 
> was blocked as having no chance of being included by both Ingo and Linus and 
> I doubt they've changed their position since then. As you're well aware this 
> is why I gave up working on it and let you maintain it since then. Obviously 
> I thought it was a useful feature or I wouldn't have worked on it.

I've put a lot of effort into reducing code duplication, shrinking the 
interface, and making it completely orthogonal to load balancing, so I'm 
hopeful (perhaps mistakenly) that this makes it more acceptable (at 
least in -mm).

My testing shows that there's no observable difference in performance 
between a stock kernel and plugsched with ingosched selected at the 
total system level (although micro benchmarking may show slight 
increases in individual operations).

Anyway, I'll just keep plugging away,
Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client   on interactive response
  2006-01-06  0:02                       ` Peter Williams
@ 2006-01-06  0:08                         ` Con Kolivas
  2006-01-06  0:40                           ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Con Kolivas @ 2006-01-06  0:08 UTC (permalink / raw)
  To: Peter Williams
  Cc: Mike Galbraith, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

On Fri, 6 Jan 2006 11:02 am, Peter Williams wrote:
> Con Kolivas wrote:
> > On Fri, 6 Jan 2006 10:13 am, Peter Williams wrote:
> >>If the plugsched patches were included in -mm we could get wider testing
> >>of alternative scheduling mechanisms.  But I think it will take a lot of
> >>testing of the new schedulers to allay fears that they may introduce new
> >>problems of their own.
> >
> > When I first generated plugsched and posted it to lkml for inclusion in
> > -mm it was blocked as having no chance of being included by both Ingo and
> > Linus and I doubt they've changed their position since then. As you're
> > well aware this is why I gave up working on it and let you maintain it
> > since then. Obviously I thought it was a useful feature or I wouldn't
> > have worked on it.
>
> I've put a lot of effort into reducing code duplication and reducing the
> size of the interface and making it completely orthogonal to load
> balancing so I'm hopeful (perhaps mistakenly) that this makes it more
> acceptable (at least in -mm).

The objection was that it would dilute the developer effort going 
towards one cpu scheduler to rule them all.  Linus' objection was 
against specialisation: he preferred one cpu scheduler that could do 
everything rather than separate cpu schedulers for NUMA, SMP, UP, 
embedded...  Each approach has its own arguments and there isn't much 
point bringing them up again.  We shall use Linux as the "steamroller to 
crack a nut" no matter what that nut is.

> My testing shows that there's no observable difference in performance
> between a stock kernel and plugsched with ingosched selected at the
> total system level (although micro benchmarking may show slight
> increases in individual operations).

I could find no difference either, but I have been told that IA64, which 
does not cope well with indirection, would probably suffer a 
demonstrable performance hit.  I do not have access to such hardware.

> Anyway, I'll just keep plugging away,

Nice pun.

Cheers,
Con

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client   on interactive response
  2006-01-06  0:08                         ` Con Kolivas
@ 2006-01-06  0:40                           ` Peter Williams
  0 siblings, 0 replies; 55+ messages in thread
From: Peter Williams @ 2006-01-06  0:40 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Mike Galbraith, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

Con Kolivas wrote:
> On Fri, 6 Jan 2006 11:02 am, Peter Williams wrote:
> 
>>Con Kolivas wrote:
>>
>>>On Fri, 6 Jan 2006 10:13 am, Peter Williams wrote:
>>>
>>>>If the plugsched patches were included in -mm we could get wider testing
>>>>of alternative scheduling mechanisms.  But I think it will take a lot of
>>>>testing of the new schedulers to allay fears that they may introduce new
>>>>problems of their own.
>>>
>>>When I first generated plugsched and posted it to lkml for inclusion in
>>>-mm it was blocked as having no chance of being included by both Ingo and
>>>Linus and I doubt they've changed their position since then. As you're
>>>well aware this is why I gave up working on it and let you maintain it
>>>since then. Obviously I thought it was a useful feature or I wouldn't
>>>have worked on it.
>>
>>I've put a lot of effort into reducing code duplication and reducing the
>>size of the interface and making it completely orthogonal to load
>>balancing so I'm hopeful (perhaps mistakenly) that this makes it more
>>acceptable (at least in -mm).
> 
> 
> The objection was to dilution of developer effort towards one cpu scheduler to 
> rule them all.

I think that I've partially addressed that objection by narrowing the 
focus of the alternative schedulers so that the dilution of effort is 
reduced.  The dichotomy between the dual-array schedulers (ingosched and 
nicksched) and the single-array schedulers (staircase and the SPA 
schedulers) is the main stumbling block to narrowing the focus further.

> Linus' objection was against specialisation - he preferred one 
> cpu scheduler that could do everything rather than unique cpu schedulers for 
> NUMA, SMP, UP, embedded...

kernbench results show that the penalties for an all-purpose scheduler 
aren't very big, so it's probably not a bad philosophy.  In spite of 
this, I think specialization is worth pursuing if it can be achieved 
with very small configurable differences to the mechanism.  If the 
configuration change can be made at boot time or on a running system 
then it's even better, e.g. your "compute" switch in staircase.

> Each approach has its own arguments and there 
> isn't much point bringing them up again. We shall use Linux as the 
> "steamroller to crack a nut" no matter what that nut is.
> 

Even if plugsched has no hope of getting into the mainline kernel, I see 
it as a useful tool for the practical evaluation of the various 
approaches.  If it could go into -mm for a while this evaluation could 
be more widespread.

In its current state it should not interfere with other scheduling 
related development such as the load balancing changes, cpusets, etc.

> 
>>My testing shows that there's no observable difference in performance
>>between a stock kernel and plugsched with ingosched selected at the
>>total system level (although micro benchmarking may show slight
>>increases in individual operations).
> 
> 
> I could find no difference either, but IA64 which does not cope with 
> indirection well would probably suffer a demonstrable performance hit I have 
> been told.

I wasn't aware of that.

> I do not have access to such hardware.

Nor do I.

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client   on interactive response
  2006-01-05 23:13                   ` Peter Williams
  2006-01-05 23:33                     ` Con Kolivas
@ 2006-01-06  7:39                     ` Mike Galbraith
  2006-01-07  1:11                       ` Peter Williams
  1 sibling, 1 reply; 55+ messages in thread
From: Mike Galbraith @ 2006-01-06  7:39 UTC (permalink / raw)
  To: Peter Williams
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 2367 bytes --]

At 10:13 AM 1/6/2006 +1100, Peter Williams wrote:
>Mike Galbraith wrote:
>>At 10:31 PM 1/5/2006 +1100, Peter Williams wrote:
>>
>>>Mike Galbraith wrote:
>>>
>>>>At 08:51 AM 1/5/2006 +1100, Peter Williams wrote:
>>>>
>>>>>I think that some of the harder to understand parts of the scheduler 
>>>>>code are actually attempts to overcome the undesirable effects (such 
>>>>>as those I've described) of inappropriately identifying tasks as 
>>>>>interactive.  I think that it would have been better to attempt to fix 
>>>>>the inappropriate identifications rather than their effects and I 
>>>>>think the prudent use of TASK_NONINTERACTIVE is an important tool for 
>>>>>achieving this.
>>>>
>>>>
>>>>IMHO, that's nothing but a cover for the weaknesses induced by using 
>>>>exclusively sleep time as an information source for the priority 
>>>>calculation.  While this heuristic does work pretty darn well, it's 
>>>>easily fooled (intentionally or otherwise).  The challenge is to find 
>>>>the right low cost informational component, and to stir it in at O(1).
>>>
>>>
>>>TASK_NONINTERACTIVE helps in this regard, is no cost in the code where 
>>>it's used and probably decreases the costs in the scheduler code by 
>>>enabling some processing to be skipped.  If by its judicious use the 
>>>heuristic is only fed interactive sleep data the heuristics accuracy in 
>>>identifying interactive tasks should be improved.  It may also allow the 
>>>heuristic to be simplified.
>>
>>I disagree.  You can nip and tuck all the bits of sleep time you want, 
>>and it'll just shift the lumpy spots around (btdt).
>
>Yes, but there's a lot of (understandable) reluctance to do any major 
>rework of this part of the scheduler so we're stuck with nips and tucks 
>for the time being.  This patch is a zero cost nip and tuck.

Color me skeptical, but nonetheless, it looks to me like the mechanism 
might need the attached.

On the subject of nip and tuck, take a look at the little proggy posted in 
thread [SCHED] wrong priority calc - SIMPLE test case.  That testcase was 
the result of Paolo Ornati looking into a real problem on his system.  I 
just 'fixed' that nanosleep() problem by judicious application of 
TASK_NONINTERACTIVE to the schedule_timeout().  Sure, it works, but it 
doesn't look like anything but a bandaid (tourniquet in this case:) to me.

         -Mike 

[-- Attachment #2: sched.c.diff --]
[-- Type: application/octet-stream, Size: 604 bytes --]

--- linux-2.6.15/kernel/sched.c.org	Fri Jan  6 08:44:09 2006
+++ linux-2.6.15/kernel/sched.c	Fri Jan  6 08:51:03 2006
@@ -1353,7 +1353,7 @@
 
 out_activate:
 #endif /* CONFIG_SMP */
-	if (old_state == TASK_UNINTERRUPTIBLE) {
+	if (old_state & TASK_UNINTERRUPTIBLE) {
 		rq->nr_uninterruptible--;
 		/*
 		 * Tasks on involuntary sleep don't earn
@@ -3010,7 +3010,7 @@
 				unlikely(signal_pending(prev))))
 			prev->state = TASK_RUNNING;
 		else {
-			if (prev->state == TASK_UNINTERRUPTIBLE)
+			if (prev->state & TASK_UNINTERRUPTIBLE)
 				rq->nr_uninterruptible++;
 			deactivate_task(prev, rq);
 		}
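The reason for switching from == to & can be demonstrated outside the 
kernel; the flag values below match 2.6.15's include/linux/sched.h:

```c
#include <assert.h>

/* Flag values from 2.6.15's include/linux/sched.h. */
#define TASK_UNINTERRUPTIBLE 2
#define TASK_NONINTERACTIVE  64

/* Once TASK_NONINTERACTIVE may be ORed into an uninterruptible sleep,
 * the old strict equality no longer matches the combined state and the
 * nr_uninterruptible accounting would silently be skipped; the bitwise
 * test still matches, which is the point of the patch. */
static int is_uninterruptible_eq(long state)
{
	return state == TASK_UNINTERRUPTIBLE;
}

static int is_uninterruptible_and(long state)
{
	return (state & TASK_UNINTERRUPTIBLE) != 0;
}
```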

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client    on interactive response
  2006-01-06  7:39                     ` Mike Galbraith
@ 2006-01-07  1:11                       ` Peter Williams
  2006-01-07  5:27                         ` Mike Galbraith
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-07  1:11 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Mike Galbraith wrote:
> At 10:13 AM 1/6/2006 +1100, Peter Williams wrote:
> 
>> Mike Galbraith wrote:
>>
>>> At 10:31 PM 1/5/2006 +1100, Peter Williams wrote:
>>>
>>>> Mike Galbraith wrote:
>>>>
>>>>> At 08:51 AM 1/5/2006 +1100, Peter Williams wrote:
>>>>>
>>>>>> I think that some of the harder to understand parts of the 
>>>>>> scheduler code are actually attempts to overcome the undesirable 
>>>>>> effects (such as those I've described) of inappropriately 
>>>>>> identifying tasks as interactive.  I think that it would have been 
>>>>>> better to attempt to fix the inappropriate identifications rather 
>>>>>> than their effects and I think the prudent use of 
>>>>>> TASK_NONINTERACTIVE is an important tool for achieving this.
>>>>>
>>>>>
>>>>>
>>>>> IMHO, that's nothing but a cover for the weaknesses induced by 
>>>>> using exclusively sleep time as an information source for the 
>>>>> priority calculation.  While this heuristic does work pretty darn 
>>>>> well, it's easily fooled (intentionally or otherwise).  The 
>>>>> challenge is to find the right low cost informational component, 
>>>>> and to stir it in at O(1).
>>>>
>>>>
>>>>
>>>> TASK_NONINTERACTIVE helps in this regard, is no cost in the code 
>>>> where it's used and probably decreases the costs in the scheduler 
>>>> code by enabling some processing to be skipped.  If by its judicious 
>>>> use the heuristic is only fed interactive sleep data the heuristics 
>>>> accuracy in identifying interactive tasks should be improved.  It 
>>>> may also allow the heuristic to be simplified.
>>>
>>>
>>> I disagree.  You can nip and tuck all the bits of sleep time you 
>>> want, and it'll just shift the lumpy spots around (btdt).
>>
>>
>> Yes, but there's a lot of (understandable) reluctance to do any major 
>> rework of this part of the scheduler so we're stuck with nips and 
>> tucks for the time being.  This patch is a zero cost nip and tuck.
> 
> 
> Color me skeptical, but nonetheless, it looks to me like the mechanism 
> might need the attached.

Is that patch complete?  (This is all I got.)

--- linux-2.6.15/kernel/sched.c.org	Fri Jan  6 08:44:09 2006
+++ linux-2.6.15/kernel/sched.c	Fri Jan  6 08:51:03 2006
@@ -1353,7 +1353,7 @@

  out_activate:
  #endif /* CONFIG_SMP */
-	if (old_state == TASK_UNINTERRUPTIBLE) {
+	if (old_state & TASK_UNINTERRUPTIBLE) {
  		rq->nr_uninterruptible--;
  		/*
  		 * Tasks on involuntary sleep don't earn
@@ -3010,7 +3010,7 @@
  				unlikely(signal_pending(prev))))
  			prev->state = TASK_RUNNING;
  		else {
-			if (prev->state == TASK_UNINTERRUPTIBLE)
+			if (prev->state & TASK_UNINTERRUPTIBLE)
  				rq->nr_uninterruptible++;
  			deactivate_task(prev, rq);
  		}

In the absence of any use of TASK_NONINTERACTIVE in conjunction with 
TASK_UNINTERRUPTIBLE it will have no effect.  Personally, I think that 
all TASK_UNINTERRUPTIBLE sleeps should be treated as non-interactive 
rather than just being heavily discounted (and that TASK_NONINTERACTIVE 
shouldn't be needed in conjunction with it), BUT I may be wrong, 
especially w.r.t. media streamers such as audio and video players and 
the mechanisms they use to sleep between cpu bursts.

> 
> On the subject of nip and tuck, take a look at the little proggy posted 
> in thread [SCHED] wrong priority calc - SIMPLE test case.  That testcase 
> was the result of Paolo Ornati looking into a real problem on his 
> system.  I just 'fixed' that nanosleep() problem by judicious 
> application of TASK_NONINTERACTIVE to the schedule_timeout().  Sure, it 
> works, but it doesn't look like anything but a bandaid (tourniquet in 
> this case:) to me.
> 
>         -Mike

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client    on interactive response
  2006-01-07  1:11                       ` Peter Williams
@ 2006-01-07  5:27                         ` Mike Galbraith
  2006-01-07  6:34                           ` Peter Williams
  2006-01-07  9:30                           ` Con Kolivas
  0 siblings, 2 replies; 55+ messages in thread
From: Mike Galbraith @ 2006-01-07  5:27 UTC (permalink / raw)
  To: Peter Williams
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

At 12:11 PM 1/7/2006 +1100, Peter Williams wrote:

>Is that patch complete?  (This is all I got.)

Yes.

>--- linux-2.6.15/kernel/sched.c.org     Fri Jan  6 08:44:09 2006
>+++ linux-2.6.15/kernel/sched.c Fri Jan  6 08:51:03 2006
>@@ -1353,7 +1353,7 @@
>
>  out_activate:
>  #endif /* CONFIG_SMP */
>-       if (old_state == TASK_UNINTERRUPTIBLE) {
>+       if (old_state & TASK_UNINTERRUPTIBLE) {
>                 rq->nr_uninterruptible--;
>                 /*
>                 * Tasks on involuntary sleep don't earn
>@@ -3010,7 +3010,7 @@
>                                 unlikely(signal_pending(prev))))
>                         prev->state = TASK_RUNNING;
>                 else {
>-                       if (prev->state == TASK_UNINTERRUPTIBLE)
>+                       if (prev->state & TASK_UNINTERRUPTIBLE)
>                                 rq->nr_uninterruptible++;
>                         deactivate_task(prev, rq);
>                 }
>
>In the absence of any use of TASK_NONINTERACTIVE in conjunction with 
>TASK_UNINTERRUPTIBLE it will have no effect.

Exactly.  It's only life insurance.

>   Personally, I think that all TASK_UNINTERRUPTIBLE sleeps should be 
> treated as non interactive rather than just be heavily discounted (and 
> that TASK_NONINTERACTIVE shouldn't be needed in conjunction with it) BUT 
> I may be wrong especially w.r.t. media streamers such as audio and video 
> players and the mechanisms they use to do sleeps between cpu bursts.

Try it, you won't like it.  When I first examined sleep_avg woes, my 
reaction was to nuke uninterruptible sleep too... boy did that ever _suck_ :)

I'm trying to think of ways to quell the nasty side of sleep_avg without 
destroying the good.  One method I've tinkered with in the past with 
encouraging results is to compute a weighted slice_avg, which is a measure 
of how long it takes you to use your slice, and scale it to match 
MAX_SLEEP_AVG for easy comparison.  A possible use thereof:  In order to be 
classified interactive, you need the sleep_avg, but that's not enough... 
you also have to have a record of sharing the cpu. When your slice_avg 
degrades enough as you burn cpu, you no longer get to loop in the active 
queue.  Being relegated to the expired array though will improve your 
slice_avg and let you regain your status.  Your priority remains, so you 
can still preempt, but you become mortal and have to share.  When there is 
a large disparity between sleep_avg and slice_avg, it can be used as a 
general purpose throttle to trigger TASK_NONINTERACTIVE flagging in 
schedule() as negative feedback for the ill behaved.  Thoughts?
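Something like the following userspace sketch, say; the scaling, the 
throttle threshold and the MAX_SLEEP_AVG stand-in value are all invented 
for illustration, not the kernel's actual numbers:

```c
#include <assert.h>

/* Sketch: slice_avg estimates what fraction of wall-clock time a task
 * spends NOT running while consuming one timeslice, scaled to the same
 * ceiling as sleep_avg so the two can be compared directly.  A task
 * with a high sleep_avg but low slice_avg is claiming interactivity it
 * hasn't backed up by actually sharing the cpu.  Constants invented. */
#define MAX_SLEEP_AVG 1000	/* stand-in ceiling, not the kernel's value */

static int slice_avg_scaled(int slice_ms, int wall_ms)
{
	if (wall_ms <= slice_ms)
		return 0;	/* burned the whole slice flat out */
	return (int)((long)(wall_ms - slice_ms) * MAX_SLEEP_AVG / wall_ms);
}

static int should_throttle(int sleep_avg, int slice_avg)
{
	/* large disparity: candidate for TASK_NONINTERACTIVE flagging
	 * in schedule() as negative feedback */
	return sleep_avg - slice_avg > MAX_SLEEP_AVG / 2;
}
```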

         -Mike 


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client     on interactive response
  2006-01-07  5:27                         ` Mike Galbraith
@ 2006-01-07  6:34                           ` Peter Williams
  2006-01-07  8:54                             ` Mike Galbraith
  2006-01-07  9:30                           ` Con Kolivas
  1 sibling, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-07  6:34 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Mike Galbraith wrote:
> At 12:11 PM 1/7/2006 +1100, Peter Williams wrote:
> 
>> Is that patch complete?  (This is all I got.)
> 
> 
> Yes.
> 
>> --- linux-2.6.15/kernel/sched.c.org     Fri Jan  6 08:44:09 2006
>> +++ linux-2.6.15/kernel/sched.c Fri Jan  6 08:51:03 2006
>> @@ -1353,7 +1353,7 @@
>>
>>  out_activate:
>>  #endif /* CONFIG_SMP */
>> -       if (old_state == TASK_UNINTERRUPTIBLE) {
>> +       if (old_state & TASK_UNINTERRUPTIBLE) {
>>                 rq->nr_uninterruptible--;
>>                 /*
>>                 * Tasks on involuntary sleep don't earn
>> @@ -3010,7 +3010,7 @@
>>                                 unlikely(signal_pending(prev))))
>>                         prev->state = TASK_RUNNING;
>>                 else {
>> -                       if (prev->state == TASK_UNINTERRUPTIBLE)
>> +                       if (prev->state & TASK_UNINTERRUPTIBLE)
>>                                 rq->nr_uninterruptible++;
>>                         deactivate_task(prev, rq);
>>                 }
>>
>> In the absence of any use of TASK_NONINTERACTIVE in conjunction with 
>> TASK_UNINTERRUPTIBLE it will have no effect.
> 
> 
> Exactly.  It's only life insurance.
> 
>>   Personally, I think that all TASK_UNINTERRUPTIBLE sleeps should be 
>> treated as non interactive rather than just be heavily discounted (and 
>> that TASK_NONINTERACTIVE shouldn't be needed in conjunction with it) 
>> BUT I may be wrong especially w.r.t. media streamers such as audio and 
>> video players and the mechanisms they use to do sleeps between cpu 
>> bursts.
> 
> 
> Try it, you won't like it.

It's on my list of things to try.

>  When I first examined sleep_avg woes, my 
> reaction was to nuke uninterruptible sleep too... boy did that ever 
> _suck_ :)

I look forward to seeing it.  :-)

> 
> I'm trying to think of ways to quell the nasty side of sleep_avg without 
> destroying the good.  One method I've tinkered with in the past with 
> encouraging results is to compute a weighted slice_avg, which is a 
> measure of how long it takes you to use your slice, and scale it to 
> match MAX_SLEEPAVG for easy comparison.  A possible use thereof:  In 
> order to be classified interactive, you need the sleep_avg, but that's 
> not enough... you also have to have a record of sharing the cpu. When 
> your slice_avg degrades enough as you burn cpu, you no longer get to 
> loop in the active queue.  Being relegated to the expired array though 
> will improve your slice_avg and let you regain your status.  Your 
> priority remains, so you can still preempt, but you become mortal and 
> have to share.  When there is a large disparity between sleep_avg and 
> slice_avg, it can be used as a general purpose throttle to trigger 
> TASK_NONINTERACTIVE flagging in schedule() as negative feedback for the 
> ill behaved.  Thoughts?

Sounds like the kind of thing that's required.  I think the deferred 
shift from active to expired is safe as long as CPU hogs can't exploit 
it and your scheme sounds like it might provide that assurance.  One 
problem this solution will experience is that when the system gets 
heavily loaded every task will have small CPU usage rates (even the CPU 
hogs) and this makes it harder to detect the CPU hogs.  One slight 
variation of your scheme would be to measure the average length of the 
CPU runs that a task does (i.e. how long it runs without voluntarily 
relinquishing the CPU) and not allow it to defer the shift to the 
expired array if this average run length is greater than some specified 
value.  This average for each task shouldn't change with 
system load.  (This is more or less saying that it's ok for a task to 
stay on the active array provided it's unlikely to delay the switch 
between the active and expired arrays for very long.)
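In outline (decay factor and threshold are made-up numbers, purely to show 
the shape of the bookkeeping):

```c
#include <assert.h>

/* Sketch: a decaying average of how long a task runs before it
 * voluntarily gives up the cpu.  Unlike a cpu-usage share, burst
 * length doesn't shrink as system load rises, so the threshold stays
 * meaningful on a busy system.  The 5 ms limit is invented. */
#define RUN_LEN_LIMIT_NS 5000000LL	/* 5 ms, hypothetical */

struct run_stats {
	long long avg_run_ns;		/* decaying average burst length */
};

static void account_run(struct run_stats *rs, long long run_ns)
{
	rs->avg_run_ns = (7 * rs->avg_run_ns + run_ns) / 8;	/* 7/8 old, 1/8 new */
}

/* a task may defer the active/expired switch only if its bursts are short */
static int may_defer_expiry(const struct run_stats *rs)
{
	return rs->avg_run_ns <= RUN_LEN_LIMIT_NS;
}
```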

My own way around the problem is to nuke the expired/active arrays and 
use a single priority array.  That gets rid of the problem of deferred 
shifting from active to expired all together.  :-)

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client     on interactive response
  2006-01-07  6:34                           ` Peter Williams
@ 2006-01-07  8:54                             ` Mike Galbraith
  2006-01-07 23:40                               ` Peter Williams
  0 siblings, 1 reply; 55+ messages in thread
From: Mike Galbraith @ 2006-01-07  8:54 UTC (permalink / raw)
  To: Peter Williams
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

At 05:34 PM 1/7/2006 +1100, Peter Williams wrote:
>Mike Galbraith wrote:
>
>>I'm trying to think of ways to quell the nasty side of sleep_avg without 
>>destroying the good.  One method I've tinkered with in the past with 
>>encouraging results is to compute a weighted slice_avg, which is a 
>>measure of how long it takes you to use your slice, and scale it to match 
>>MAX_SLEEPAVG for easy comparison.  A possible use thereof:  In order to 
>>be classified interactive, you need the sleep_avg, but that's not 
>>enough... you also have to have a record of sharing the cpu. When your 
>>slice_avg degrades enough as you burn cpu, you no longer get to loop in 
>>the active queue.  Being relegated to the expired array though will 
>>improve your slice_avg and let you regain your status.  Your priority 
>>remains, so you can still preempt, but you become mortal and have to 
>>share.  When there is a large disparity between sleep_avg and slice_avg, 
>>it can be used as a general purpose throttle to trigger 
>>TASK_NONINTERACTIVE flagging in schedule() as negative feedback for the 
>>ill behaved.  Thoughts?
>
>Sounds like the kind of thing that's required.  I think the deferred shift 
>from active to expired is safe as long as CPU hogs can't exploit it and 
>your scheme sounds like it might provide that assurance.  One problem this 
>solution will experience is that when the system gets heavily loaded every 
>task will have small CPU usage rates (even the CPU hogs) and this makes it 
>harder to detect the CPU hogs.

True.  A gaggle of more or less equally well (or not) behaving tasks will 
have their 'hogginess' diluted.   I'll have to think more about scaling 
with nr_running or maybe starting the clock at first tick of a new slice... 
that should still catch most of the guys who are burning hard without being 
preempted, or only sleeping for short intervals to keep coming right 
back to beat up poor cc1.  I think the real problem children should stick 
out enough for a proof of concept even without additional complexity.
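Roughly, in userspace Python (all names, scales and thresholds here are
illustrative, not kernel code), the slice_avg gating quoted above would
look something like:

```python
MAX_SLEEP_AVG = 1000     # assumed common scale for both averages
SLICE_MS = 100           # assumed timeslice length
INTERACTIVE_MIN = 700    # assumed classification threshold

def scale_slice_time(wall_ms):
    # Scale the wall time taken to consume one slice onto
    # [0, MAX_SLEEP_AVG] so it can be compared with sleep_avg.
    # Burning the slice back to back (wall_ms == SLICE_MS) scores 0;
    # taking 10x the slice of wall time (lots of sharing) scores the max.
    shared = max(0, wall_ms - SLICE_MS)
    return min(MAX_SLEEP_AVG, MAX_SLEEP_AVG * shared // (9 * SLICE_MS))

def classified_interactive(sleep_avg, slice_avg):
    # A high sleep_avg alone is not enough; the task must also have a
    # record of sharing the CPU.
    return sleep_avg >= INTERACTIVE_MIN and slice_avg >= INTERACTIVE_MIN

# An NFS-bound compile job: sleeps a lot (high sleep_avg) but burns its
# slice immediately whenever it runs (low slice_avg).
assert not classified_interactive(900, scale_slice_time(100))
# A text editor: sleeps a lot and takes ages to use a whole slice.
assert classified_interactive(900, scale_slice_time(5000))
```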

>   One slight variation of your scheme would be to measure the average 
> length of the CPU runs that the task does (i.e. how long it runs without 
> voluntarily relinquishing the CPU) and not allowing them to defer the 
> shift to the expired array if this average run length is greater than 
> some specified value.  The length of this average for each task shouldn't 
> change with system load.  (This is more or less saying that it's ok for a 
> task to stay on the active array provided it's unlikely to delay the 
> switch between the active and expired arrays for very long.)

Average burn time would indeed probably be a better metric, but that would 
require doing bookkeeping in the fast path.  I'd like to stick to tick time 
or even better, slice renewal time if possible to keep it down on the 'dead 
simple and dirt cheap' shelf.  After all, this kind of thing is supposed to 
accomplish absolutely nothing meaningful the vast majority of the time :)

         Thanks for the feedback,

         -Mike 


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-07  5:27                         ` Mike Galbraith
  2006-01-07  6:34                           ` Peter Williams
@ 2006-01-07  9:30                           ` Con Kolivas
  2006-01-07 10:23                             ` Mike Galbraith
  2006-01-07 23:31                             ` Peter Williams
  1 sibling, 2 replies; 55+ messages in thread
From: Con Kolivas @ 2006-01-07  9:30 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Peter Williams, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

On Saturday 07 January 2006 16:27, Mike Galbraith wrote:
> >   Personally, I think that all TASK_UNINTERRUPTIBLE sleeps should be
> > treated as non interactive rather than just be heavily discounted (and
> > that TASK_NONINTERACTIVE shouldn't be needed in conjunction with it) BUT
> > I may be wrong especially w.r.t. media streamers such as audio and video
> > players and the mechanisms they use to do sleeps between cpu bursts.
>
> Try it, you won't like it.  When I first examined sleep_avg woes, my
> reaction was to nuke uninterruptible sleep too... boy did that ever _suck_
> :)

Glad you've seen why I put the uninterruptible sleep logic in there. In 
essence this is why the NFS client interactive case is not as nice - the NFS 
code doesn't do "work on behalf of" a cpu hog with the TASK_UNINTERRUPTIBLE 
state. The uninterruptible sleep detection logic made a massive difference to 
interactivity when cpu bound tasks do disk I/O.

Cheers,
Con

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-07  9:30                           ` Con Kolivas
@ 2006-01-07 10:23                             ` Mike Galbraith
  2006-01-07 23:31                             ` Peter Williams
  1 sibling, 0 replies; 55+ messages in thread
From: Mike Galbraith @ 2006-01-07 10:23 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Peter Williams, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

At 08:30 PM 1/7/2006 +1100, Con Kolivas wrote:
>On Saturday 07 January 2006 16:27, Mike Galbraith wrote:
> > >   Personally, I think that all TASK_UNINTERRUPTIBLE sleeps should be
> > > treated as non interactive rather than just be heavily discounted (and
> > > that TASK_NONINTERACTIVE shouldn't be needed in conjunction with it) BUT
> > > I may be wrong especially w.r.t. media streamers such as audio and video
> > > players and the mechanisms they use to do sleeps between cpu bursts.
> >
> > Try it, you won't like it.  When I first examined sleep_avg woes, my
> > reaction was to nuke uninterruptible sleep too... boy did that ever _suck_
> > :)
>
>Glad you've seen why I put the uninterruptible sleep logic in there.

Yeah, if there's one thing worse than too much preemption, it's too little 
preemption.

         -Mike


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-07  9:30                           ` Con Kolivas
  2006-01-07 10:23                             ` Mike Galbraith
@ 2006-01-07 23:31                             ` Peter Williams
  2006-01-08  0:38                               ` Con Kolivas
  1 sibling, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-07 23:31 UTC (permalink / raw)
  To: Con Kolivas
  Cc: Mike Galbraith, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

Con Kolivas wrote:
> On Saturday 07 January 2006 16:27, Mike Galbraith wrote:
> 
>>>  Personally, I think that all TASK_UNINTERRUPTIBLE sleeps should be
>>>treated as non interactive rather than just be heavily discounted (and
>>>that TASK_NONINTERACTIVE shouldn't be needed in conjunction with it) BUT
>>>I may be wrong especially w.r.t. media streamers such as audio and video
>>>players and the mechanisms they use to do sleeps between cpu bursts.
>>
>>Try it, you won't like it.  When I first examined sleep_avg woes, my
>>reaction was to nuke uninterruptible sleep too... boy did that ever _suck_
>>:)
> 
> 
> Glad you've seen why I put the uninterruptible sleep logic in there. In 
> essence this is why the NFS client interactive case is not as nice - the NFS 
> code doesn't do "work on behalf of" a cpu hog with the TASK_UNINTERRUPTIBLE 
> state. The uninterruptible sleep detection logic made a massive difference to 
> interactivity when cpu bound tasks do disk I/O.

TASK_NONINTERACTIVE doesn't mean that the task is a CPU hog.  It just 
means that this sleep should be ignored as far as determining whether 
this task is interactive or not.

Also, compensation for uninterruptible sleeps should be handled by the 
"fairness" mechanism (i.e. time slices and the active/expired arrays) 
not the "interactive response" mechanism.  In other words, doing a lot 
of uninterruptible sleeps is (theoretically) neither a sign that the 
task is interactive nor a sign that it's non interactive, so 
(theoretically) such sleeps should just be ignored.  The fact that bad 
things happen when they aren't ignored needs explaining.

I see two possible reasons:

1. Audio/video streamers aren't really interactive but we want to treat 
them as such (to ensure they have low latency).  The fact that they 
aren't really interactive may mean that the sleeps they do between runs 
are uninterruptible and if we don't count uninterruptible sleep we'll 
miss them.

2. The X server isn't really a completely interactive program either. 
It handles a lot of interaction on behalf of interactive programs (which 
should involve interactive sleeps and help get it classified as 
interactive) but also does a lot of non interactive stuff (which can be 
CPU intensive and make it lose points due to CPU hoggishness) which 
probably involves uninterruptible sleep.  The combination of ignoring 
the uninterruptible sleep and the occasional high CPU usage rate could 
result in losing too much bonus with consequent poor interactive 
responsiveness.

So it would be interesting to know which programs suffered badly when 
uninterruptible sleep was ignored.  This might enable an alternative 
solution to be found.

In any case and in the meantime, perhaps the solution is to use 
TASK_NONINTERACTIVE where needed but treat 
TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE sleep the same as 
TASK_UNINTERRUPTIBLE sleep instead of ignoring it?
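As a rough sketch of that proposal (Python, with made-up flag values and
an assumed discount factor purely for illustration), the sleep credit
logic would become:

```python
# Illustrative flag values, not the kernel's actual definitions.
TASK_INTERRUPTIBLE   = 0x01
TASK_UNINTERRUPTIBLE = 0x02
TASK_NONINTERACTIVE  = 0x04

def sleep_avg_credit(state, sleep_ms):
    # How much of this sleep should count towards sleep_avg.
    if state & TASK_UNINTERRUPTIBLE:
        return sleep_ms // 10        # assumed heavy discount
    if state & TASK_NONINTERACTIVE:
        # Proposed: same discounted credit as uninterruptible sleep,
        # instead of ignoring the sleep entirely.
        return sleep_ms // 10
    return sleep_ms                  # genuine interactive sleep: full credit

# A flagged interruptible sleep earns the same credit as an
# uninterruptible one; an unflagged one still earns full credit.
assert sleep_avg_credit(TASK_INTERRUPTIBLE | TASK_NONINTERACTIVE, 200) == \
       sleep_avg_credit(TASK_UNINTERRUPTIBLE, 200)
assert sleep_avg_credit(TASK_INTERRUPTIBLE, 200) == 200
```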

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-07  8:54                             ` Mike Galbraith
@ 2006-01-07 23:40                               ` Peter Williams
  2006-01-08  5:51                                 ` Mike Galbraith
  0 siblings, 1 reply; 55+ messages in thread
From: Peter Williams @ 2006-01-07 23:40 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

Mike Galbraith wrote:
> At 05:34 PM 1/7/2006 +1100, Peter Williams wrote:
> 
>> Mike Galbraith wrote:
>>
>>> I'm trying to think of ways to quell the nasty side of sleep_avg 
>>> without destroying the good.  One method I've tinkered with in the 
>>> past with encouraging results is to compute a weighted slice_avg, 
>>> which is a measure of how long it takes you to use your slice, and 
>>> scale it to match MAX_SLEEPAVG for easy comparison.  A possible use 
>>> thereof:  In order to be classified interactive, you need the 
>>> sleep_avg, but that's not enough... you also have to have a record of 
>>> sharing the cpu. When your slice_avg degrades enough as you burn cpu, 
>>> you no longer get to loop in the active queue.  Being relegated to 
>>> the expired array though will improve your slice_avg and let you 
>>> regain your status.  Your priority remains, so you can still preempt, 
>>> but you become mortal and have to share.  When there is a large 
>>> disparity between sleep_avg and slice_avg, it can be used as a 
>>> general purpose throttle to trigger TASK_NONINTERACTIVE flagging in 
>>> schedule() as negative feedback for the ill behaved.  Thoughts?
>>
>>
>> Sounds like the kind of thing that's required.  I think the deferred 
>> shift from active to expired is safe as long as CPU hogs can't exploit 
>> it and your scheme sounds like it might provide that assurance.  One 
>> problem this solution will experience is that when the system gets 
>> heavily loaded every task will have small CPU usage rates (even the 
>> CPU hogs) and this makes it harder to detect the CPU hogs.
> 
> 
> True.  A gaggle of more or less equally well (or not) behaving tasks 
> will have their 'hogginess' diluted.   I'll have to think more about 
> scaling with nr_running or maybe starting the clock at first tick of a 
> new slice... that should still catch most of the guys who are burning 
> hard without being preempted, or only sleeping for short intervals 
> to keep coming right back to beat up poor cc1.  I think the real problem 
> children should stick out enough for a proof of concept even without 
> additional complexity.
> 
>>   One slight variation of your scheme would be to measure the average 
>> length of the CPU runs that the task does (i.e. how long it runs 
>> without voluntarily relinquishing the CPU) and not allowing them to 
>> defer the shift to the expired array if this average run length is 
>> greater than some specified value.  The length of this average for 
>> each task shouldn't change with system load.  (This is more or less 
>> saying that it's ok for a task to stay on the active array provided 
>> it's unlikely to delay the switch between the active and expired 
>> arrays for very long.)
> 
> 
> Average burn time would indeed probably be a better metric, but that 
> would require doing bookkeeping in the fast path.

Most of the infrastructure is already there and the cost of doing the 
extra bits required to get this metric would be extremely small.  The 
hardest bit would be deciding on the "limit" to be applied when deciding 
whether to let a supposed interactive task stay on the active array.

From the statistical point of view, the distribution of random time 
intervals with a given average length is such that about 99% of them 
will be less than four times the average length.  So a value of 1/4 of 
the delay in array switches that can be tolerated would be about right. 
But that still leaves the problem of what delay can be tolerated :-).
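A quick userspace check of that statistic, assuming the intervals are
exponentially distributed:

```python
import random

random.seed(42)
mean = 10.0
samples = [random.expovariate(1.0 / mean) for _ in range(100000)]
frac_below_4x = sum(s < 4 * mean for s in samples) / len(samples)
# Analytically this is 1 - e**-4, about 0.982, for exponential intervals.
assert frac_below_4x > 0.97
```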

>  I'd like to stick to 
> tick time or even better, slice renewal time if possible to keep it down 
> on the 'dead simple and dirt cheap' shelf.  After all, this kind of 
> thing is supposed to accomplish absolutely nothing meaningful the vast 
> majority of the time :)
> 

By the way, it seems you have your own scheduler versions?  If so are 
you interested in adding them to the collection in PlugSched?

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-07 23:31                             ` Peter Williams
@ 2006-01-08  0:38                               ` Con Kolivas
  0 siblings, 0 replies; 55+ messages in thread
From: Con Kolivas @ 2006-01-08  0:38 UTC (permalink / raw)
  To: Peter Williams
  Cc: Mike Galbraith, Helge Hafting, Trond Myklebust, Ingo Molnar,
	Linux Kernel Mailing List

On Sunday 08 January 2006 10:31, Peter Williams wrote:
> In any case and in the meantime, perhaps the solution is to use
> TASK_NONINTERACTIVE where needed but treat
> TASK_INTERRUPTIBLE|TASK_NONINTERACTIVE sleep the same as
> TASK_UNINTERRUPTIBLE sleep instead of ignoring it?

That's how I would tackle it.

Con

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
  2006-01-07 23:40                               ` Peter Williams
@ 2006-01-08  5:51                                 ` Mike Galbraith
  0 siblings, 0 replies; 55+ messages in thread
From: Mike Galbraith @ 2006-01-08  5:51 UTC (permalink / raw)
  To: Peter Williams
  Cc: Helge Hafting, Trond Myklebust, Ingo Molnar, Con Kolivas,
	Linux Kernel Mailing List

At 10:40 AM 1/8/2006 +1100, Peter Williams wrote:
>Mike Galbraith wrote:
>>>   One slight variation of your scheme would be to measure the average 
>>> length of the CPU runs that the task does (i.e. how long it runs 
>>> without voluntarily relinquishing the CPU) and not allowing them to 
>>> defer the shift to the expired array if this average run length is 
>>> greater than some specified value.  The length of this average for each 
>>> task shouldn't change with system load.  (This is more or less saying 
>>> that it's ok for a task to stay on the active array provided it's 
>>> unlikely to delay the switch between the active and expired arrays for 
>>> very long.)
>>
>>Average burn time would indeed probably be a better metric, but that 
>>would require doing bookkeeping in the fast path.
>
>Most of the infrastructure is already there and the cost of doing the 
>extra bits required to get this metric would be extremely small.  The 
>hardest bit would be deciding on the "limit" to be applied when deciding 
>whether to let a supposed interactive task stay on the active array.

Yeah, I noticed run_time when I started implementing my first cut.  (which 
is of course buggy)


>By the way, it seems you have your own scheduler versions?  If so are you 
>interested in adding them to the collection in PlugSched?

No, I used to do a bunch of experimentation in fairness vs interactivity, 
but they all ended up just trading one weakness for another.

         -Mike 


^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2006-01-08  5:51 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-12-21  6:00 [PATCH] sched: Fix adverse effects of NFS client on interactive response Peter Williams
2005-12-21  6:09 ` Trond Myklebust
2005-12-21  6:32   ` Peter Williams
2005-12-21 13:21     ` Trond Myklebust
2005-12-21 13:36       ` Kyle Moffett
2005-12-21 13:40         ` Trond Myklebust
2005-12-22  2:26           ` Peter Williams
2005-12-22 22:08             ` Trond Myklebust
2005-12-22 22:33               ` Peter Williams
2005-12-22 22:59                 ` Trond Myklebust
2005-12-23  0:02                   ` Kyle Moffett
2005-12-23  0:25                     ` Trond Myklebust
2005-12-23  3:06                       ` Peter Williams
2005-12-23  9:39                         ` Trond Myklebust
2005-12-23 10:49                           ` Peter Williams
2005-12-23 12:51                             ` Trond Myklebust
2005-12-23 13:36                               ` Peter Williams
2006-01-02 12:09                                 ` Pekka Enberg
2005-12-23 19:07                           ` Lee Revell
2005-12-23 21:08                             ` Trond Myklebust
2005-12-23 21:17                               ` Lee Revell
2005-12-23 21:23                                 ` Trond Myklebust
2005-12-23 22:04                                   ` Lee Revell
2005-12-23 22:10                                     ` Trond Myklebust
2005-12-21 16:10         ` Horst von Brand
2005-12-21 20:36           ` Kyle Moffett
2005-12-21 22:59             ` Peter Williams
2005-12-21 16:11     ` Ingo Molnar
2005-12-21 22:49       ` Peter Williams
2006-01-02 11:01     ` Helge Hafting
2006-01-02 23:54       ` Peter Williams
2006-01-04  1:25         ` Peter Williams
2006-01-04  9:40           ` Marcelo Tosatti
2006-01-04 12:18             ` Con Kolivas
2006-01-04 10:31               ` Marcelo Tosatti
2006-01-04 21:51           ` Peter Williams
2006-01-05  6:31             ` Mike Galbraith
2006-01-05 11:31               ` Peter Williams
2006-01-05 14:31                 ` Mike Galbraith
2006-01-05 23:13                   ` Peter Williams
2006-01-05 23:33                     ` Con Kolivas
2006-01-06  0:02                       ` Peter Williams
2006-01-06  0:08                         ` Con Kolivas
2006-01-06  0:40                           ` Peter Williams
2006-01-06  7:39                     ` Mike Galbraith
2006-01-07  1:11                       ` Peter Williams
2006-01-07  5:27                         ` Mike Galbraith
2006-01-07  6:34                           ` Peter Williams
2006-01-07  8:54                             ` Mike Galbraith
2006-01-07 23:40                               ` Peter Williams
2006-01-08  5:51                                 ` Mike Galbraith
2006-01-07  9:30                           ` Con Kolivas
2006-01-07 10:23                             ` Mike Galbraith
2006-01-07 23:31                             ` Peter Williams
2006-01-08  0:38                               ` Con Kolivas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).