linux-kernel.vger.kernel.org archive mirror
* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-19  6:06 Perez-Gonzalez, Inaky
  2003-06-19  6:07 ` Ingo Molnar
  2003-06-19 16:00 ` george anzinger
  0 siblings, 2 replies; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19  6:06 UTC (permalink / raw)
  To: 'Andrew Morton', 'george anzinger'
  Cc: 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam

[-- Attachment #1: Type: text/plain, Size: 1420 bytes --]


Hi All

I have another test case that is showing this behavior.
It is very similar to George's; however, it is a simplification
of another one using threads that a co-worker, Adam Li, found
a few days ago.

Parent (FIFO 5) forks child that sets itself to FIFO 4 and 
busy loops, then it sleeps five seconds and kills the child. 

Doing SysRq + T after a while shows the parent's call trace
to be at sys_rt_sigaction+0xd1, that is just inside the final 
copy_to_user() in signal.c:sys_rt_sigaction().

Reprioritizing events/0 to FIFO 5+ fixes the inversion. 

If I call nanosleep directly (with syscall() instead of
glibc's sleep(), so I avoid all the rt_sig calls),
I get the parent process always stuck in work_resched+0x5,
in entry.S:work_resched, just after the call to the
scheduler - however, I cannot trace what is happening
inside the scheduler.

My point here is: I am trying to trace where this program
is making use of workqueues inside of the kernel, and I
can find none. The only place where I need to look some
more is inside the timer code, but in a quick glance,
it seems it is not being used, so why is it affected by
the reprioritization of the events/0 thread? George, can
you help me here?

kernel is 2.5.67, SMP and PREEMPT with maxcpus=1; tomorrow
I will try .72 ... 

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)



[-- Attachment #2: sched-hang.c --]
[-- Type: application/octet-stream, Size: 1056 bytes --]

#include <signal.h>
#include <unistd.h>
#include <linux/unistd.h>
#include <stdlib.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>

volatile int dummy;
volatile int child_run = 1;

/* #define DPRINTF(a...) fprintf(stderr, a) */
#define DPRINTF(a...) do { } while (0)

void child_signal_handler (int signum)
{
  child_run = 0;
}

int main (void)
{
  int child;
  struct timespec tp = { 5, 0 };
  struct sched_param param;
  param.sched_priority = 5;
  sched_setscheduler (0, SCHED_FIFO, &param);
  child = fork();
  switch (child)
  {
   case 0:
    DPRINTF("Child starts\n");
    signal (SIGTERM, child_signal_handler);
    param.sched_priority = 4;
    sched_setscheduler (0, SCHED_FIFO, &param);
    for (; child_run;)
      dummy = dummy + 1;
    DPRINTF("Child dies\n");
    break;
   case -1:
    perror ("fork failed");
    abort();
    break;    
   default:
/*     sleep (5); */
    syscall (__NR_nanosleep, &tp, NULL);
    kill (child, SIGTERM);
   break;
  }
  return 0;
}

    
  

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19  6:06 O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks Perez-Gonzalez, Inaky
@ 2003-06-19  6:07 ` Ingo Molnar
  2003-06-19 16:00 ` george anzinger
  1 sibling, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2003-06-19  6:07 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Andrew Morton', 'george anzinger',
	'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org',
	Li, Adam


On Wed, 18 Jun 2003, Perez-Gonzalez, Inaky wrote:

> My point here is: I am trying to trace where this program is making use
> of workqueues inside of the kernel, and I can find none. The only place
> where I need to look some more is inside the timer code, but in a quick
> glance, it seems it is not being used, so why is it affected by the
> reprioritization of the events/0 thread? George, can you help me here?

well, printk (console input/output) can already make use of keventd.

	Ingo



* Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19  6:06 O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks Perez-Gonzalez, Inaky
  2003-06-19  6:07 ` Ingo Molnar
@ 2003-06-19 16:00 ` george anzinger
  2003-06-19 17:19   ` 'joe.korty@ccur.com'
  1 sibling, 1 reply; 22+ messages in thread
From: george anzinger @ 2003-06-19 16:00 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Andrew Morton', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam, Robert Love

Perez-Gonzalez, Inaky wrote:
> Hi All
> 
> I have another test case that is showing this behavior.
> It is very similar to George's; however, it is a simplification
> of another one using threads that a co-worker, Adam Li, found
> a few days ago.
> 
> Parent (FIFO 5) forks child that sets itself to FIFO 4 and 
> busy loops, then it sleeps five seconds and kills the child. 
> 
> Doing SysRq + T after a while shows the parent's call trace
> to be at sys_rt_sigaction+0xd1, that is just inside the final 
> copy_to_user() in signal.c:sys_rt_sigaction().
> 
> Reprioritizing events/0 to FIFO 5+ fixes the inversion. 
> 
> If I call nanosleep directly (with syscall() instead of
> glibc's sleep(), so I avoid all the rt_sig calls),
> I get the parent process always stuck in work_resched+0x5,
> in entry.S:work_resched, just after the call to the
> scheduler - however, I cannot trace what is happening
> inside the scheduler.
> 
> My point here is: I am trying to trace where this program
> is making use of workqueues inside of the kernel, and I
> can find none. The only place where I need to look some
> more is inside the timer code, but in a quick glance,
> it seems it is not being used, so why is it affected by
> the reprioritization of the events/0 thread? George, can
> you help me here?
> 

Hm!  I wonder.  Robert is working on a fix for sched_setscheduler()
where it fails to actually tell the scheduler to switch to a process 
that it just made higher priority or away from one it just lowered.

The net result is that the caller keeps running (FIFO for all in this 
case) when, in fact it should have been switched out.  Next time 
schedule() actually switches, it is all sorted out again.  Could the 
elevation of the events/0 thread cause this needed switch?

-g

> kernel is 2.5.67, SMP and PREEMPT with maxcpus=1; tomorrow
> I will try .72 ... 
> 
> Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
> (and my fault)
> 
> 

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml



* Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19 16:00 ` george anzinger
@ 2003-06-19 17:19   ` 'joe.korty@ccur.com'
  2003-06-19 17:23     ` Robert Love
  2003-06-19 17:45     ` [patch] setscheduler fix Robert Love
  0 siblings, 2 replies; 22+ messages in thread
From: 'joe.korty@ccur.com' @ 2003-06-19 17:19 UTC (permalink / raw)
  To: george anzinger
  Cc: Perez-Gonzalez, Inaky, 'Andrew Morton',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam, Robert Love

> Hm!  I wonder.  Robert is working on a fix for sched_setscheduler()
> where it fails to actually tell the scheduler to switch to a process 
> that it just made higher priority or away from one it just lowered.
> 
> The net result is that the caller keeps running (FIFO for all in this 
> case) when, in fact it should have been switched out.  Next time 
> schedule() actually switches, it is all sorted out again.  Could the 
> elevation of the events/0 thread cause this needed switch?


I posted a fix for this a month ago that was ignored.  Which is a
good thing, since now that I look at it again, I don't care for the
approach I took nor does it appear to be complete.

Joe

----------- original posting

> Date: Wed, 21 May 2003 16:40:26 -0400
> From: Joe Korty <joe.korty@ccur.com>
> To: Ingo Molnar <mingo@elte.hu>
> Cc: linux-kernel@vger.kernel.org
> Subject: [PATCH] setscheduler resched bug

setscheduler is not forcing a reschedule when needed like set_user_nice
does.  It should.

Joe


--- 2.5.69/kernel/sched.c.orig	2003-05-21 14:50:53.000000000 -0400
+++ 2.5.69/kernel/sched.c	2003-05-21 15:01:13.000000000 -0400
@@ -1716,6 +1716,7 @@
 	unsigned long flags;
 	runqueue_t *rq;
 	task_t *p;
+	int oldprio;
 
 	if (!param || pid < 0)
 		goto out_nounlock;
@@ -1778,12 +1779,20 @@
 	retval = 0;
 	p->policy = policy;
 	p->rt_priority = lp.sched_priority;
+	oldprio = p->prio;
 	if (policy != SCHED_NORMAL)
 		p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
 	else
 		p->prio = p->static_prio;
-	if (array)
+	if (array) {
 		__activate_task(p, task_rq(p));
+		/*
+		 * Reschedule if on a CPU and the priority dropped, or not on
+		 * a CPU and the priority rose above the currently running task.
+		 */
+		if ((rq->curr == p) ? (p->prio > oldprio) : (p->prio < rq->curr->prio))
+			resched_task(rq->curr);
+	}
 
 out_unlock:
 	task_rq_unlock(rq, &flags);



* Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19 17:19   ` 'joe.korty@ccur.com'
@ 2003-06-19 17:23     ` Robert Love
  2003-06-19 17:28       ` Joe Korty
  2003-06-19 17:45     ` [patch] setscheduler fix Robert Love
  1 sibling, 1 reply; 22+ messages in thread
From: Robert Love @ 2003-06-19 17:23 UTC (permalink / raw)
  To: Joe Korty
  Cc: george anzinger, Perez-Gonzalez, Inaky, 'Andrew Morton',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam

On Thu, 2003-06-19 at 10:19, 'joe.korty@ccur.com' wrote:

> I posted a fix for this a month ago that was ignored.  Which is a
> good thing, since now that I look at it again, I don't care for the
> approach I took nor does it appear to be complete.

Ah, sorry for missing it. Other than that ternary statement inside an
if ;) my patch is about the same.

Why do you think it is incomplete? It looks correct to me.

	Robert Love



* Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19 17:23     ` Robert Love
@ 2003-06-19 17:28       ` Joe Korty
  0 siblings, 0 replies; 22+ messages in thread
From: Joe Korty @ 2003-06-19 17:28 UTC (permalink / raw)
  To: Robert Love
  Cc: george anzinger, Perez-Gonzalez, Inaky, 'Andrew Morton',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam

On Thu, Jun 19, 2003 at 10:23:30AM -0700, Robert Love wrote:
> On Thu, 2003-06-19 at 10:19, 'joe.korty@ccur.com' wrote:
> 
> > I posted a fix for this a month ago that was ignored.  Which is a
> > good thing, since now that I look at it again, I don't care for the
> > approach I took nor does it appear to be complete.
> 
> Ah, sorry for missing it. Other than that ternary statement inside an
> if ;) my patch is about the same.
> 
> Why do you think it is incomplete? It looks correct to me.


It may be better to add it to __activate_task() rather than after the
single activate_task() use.  At the time I wrote the patch I did not
think to look at the five __activate_task() calls to see if they each
needed the test.  By me not looking, my patch is automatically
incorrect, even if it turns out to be correct.

Joe



* [patch] setscheduler fix
  2003-06-19 17:19   ` 'joe.korty@ccur.com'
  2003-06-19 17:23     ` Robert Love
@ 2003-06-19 17:45     ` Robert Love
  2003-06-19 18:20       ` Joe Korty
  2003-06-19 19:09       ` Ingo Molnar
  1 sibling, 2 replies; 22+ messages in thread
From: Robert Love @ 2003-06-19 17:45 UTC (permalink / raw)
  To: Joe Korty
  Cc: george anzinger, Perez-Gonzalez, Inaky, 'Andrew Morton',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam

Here is my patch. It is the same idea as Joe's. Is there a better fix?

Basically, the problem is that setscheduler() does not set need_resched
when needed. There are two basic cases where this is needed:

	- the task is running, but now it is no longer the highest
	  priority task on the rq
	- the task is not running, but now it is the highest
	  priority task on the rq

In either case, we need to set need_resched to invoke the scheduler.

Patch is against 2.5.72. Comments?

	Robert Love


setscheduler() needs to force a reschedule in some situations.

 kernel/sched.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletion(-)


diff -urN linux-2.5.72/kernel/sched.c linux/kernel/sched.c
--- linux-2.5.72/kernel/sched.c	2003-06-16 21:20:20.000000000 -0700
+++ linux/kernel/sched.c	2003-06-17 13:44:15.509894276 -0700
@@ -1691,6 +1691,7 @@
 {
 	struct sched_param lp;
 	int retval = -EINVAL;
+	int oldprio;
 	prio_array_t *array;
 	unsigned long flags;
 	runqueue_t *rq;
@@ -1757,12 +1758,24 @@
 	retval = 0;
 	p->policy = policy;
 	p->rt_priority = lp.sched_priority;
+	oldprio = p->prio;
 	if (policy != SCHED_NORMAL)
 		p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
 	else
 		p->prio = p->static_prio;
-	if (array)
+	if (array) {
 		__activate_task(p, task_rq(p));
+		/*
+		 * Reschedule if we are currently running on this runqueue and
+		 * our priority decreased, or if we are not currently running on
+		 * this runqueue and our priority is higher than the current's
+		 */
+		if (rq->curr == p) {
+			if (p->prio > oldprio)
+				resched_task(rq->curr);
+		} else if (p->prio < rq->curr->prio)
+			resched_task(rq->curr);
+	}
 
 out_unlock:
 	task_rq_unlock(rq, &flags);
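
The check that both patches add can be modeled outside the kernel. What
follows is a minimal user-space sketch, not kernel code: struct task_model
and setscheduler_needs_resched() are hypothetical names made up for
illustration. The only thing taken from the patch is the comparison itself,
under the kernel convention that a numerically lower prio means a higher
priority.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's task_t; only the field used here. */
struct task_model {
	int prio;	/* kernel-internal priority: lower value = higher priority */
};

/*
 * Returns true when the runqueue's current task should be marked for
 * rescheduling after p's priority changed from oldprio to p->prio.
 */
bool setscheduler_needs_resched(const struct task_model *p,
				const struct task_model *curr,
				int oldprio)
{
	if (curr == p)
		return p->prio > oldprio;	/* the running task was demoted */
	return p->prio < curr->prio;		/* a waiting task now outranks curr */
}
```

A running task whose prio value grows (a demotion) triggers a reschedule,
and so does a waiting task whose new prio value is lower than that of the
current task; every other combination leaves the current task alone.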




* Re: [patch] setscheduler fix
  2003-06-19 17:45     ` [patch] setscheduler fix Robert Love
@ 2003-06-19 18:20       ` Joe Korty
  2003-06-19 18:38         ` Robert Love
  2003-06-19 19:09       ` Ingo Molnar
  1 sibling, 1 reply; 22+ messages in thread
From: Joe Korty @ 2003-06-19 18:20 UTC (permalink / raw)
  To: Robert Love
  Cc: george anzinger, Perez-Gonzalez, Inaky, 'Andrew Morton',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam

> Here is my patch. It is the same idea as Joe's. Is there a better fix?
> 
> Basically, the problem is that setscheduler() does not set need_resched
> when needed. There are two basic cases where this is needed:
> 
> 	- the task is running, but now it is no longer the highest
> 	  priority task on the rq
> 	- the task is not running, but now it is the highest
> 	  priority task on the rq
> 
> In either case, we need to set need_resched to invoke the scheduler.
> 
> Patch is against 2.5.72. Comments?

Looks good to me.

migration_thread and try_to_wake_up already have a simpler version of
your test that seems to be correct for that environment, so no change
is needed there.

wake_up_forked_process in principle might need your patch, but as it
appears to be called only from boot code it is unimportant that it
have the lowest possible latency, so no change is needed there either.

Joe


* Re: [patch] setscheduler fix
  2003-06-19 18:20       ` Joe Korty
@ 2003-06-19 18:38         ` Robert Love
  0 siblings, 0 replies; 22+ messages in thread
From: Robert Love @ 2003-06-19 18:38 UTC (permalink / raw)
  To: Joe Korty
  Cc: george anzinger, Perez-Gonzalez, Inaky, 'Andrew Morton',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam

On Thu, 2003-06-19 at 11:20, Joe Korty wrote:

> Looks good to me.

Good.

> migration_thread and try_to_wake_up already have a simpler version of
> your test that seems to be correct for that environment, so no change
> is needed there.
> 
> wake_up_forked_process in principle might need your patch, but as it
> appears to be called only from boot code it is unimportant that it
> have the lowest possible latency, so no change is needed there either.

Agreed.

This is worse than just a latency issue, by the way. Imagine if a
FIFO/50 thread promotes a FIFO/40 thread to FIFO/60. The thread should
run immediately (because, at priority 60, it is the highest), but it may
not until the FIFO/50 thread completes.

	Robert Love
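
The FIFO/50, FIFO/40, FIFO/60 example can be worked through with the mapping
the patch uses, p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority. This is a
small sketch only, assuming MAX_USER_RT_PRIO is 100 (an assumption; check
include/linux/sched.h in the tree at hand). rt_to_kernel_prio() and
outranks() are illustrative names, not kernel functions.

```c
/* Assumed value; the real definition lives in include/linux/sched.h. */
#define MAX_USER_RT_PRIO 100

/* Mirrors the mapping the patch applies for real-time policies. */
int rt_to_kernel_prio(int rt_priority)
{
	return MAX_USER_RT_PRIO - 1 - rt_priority;
}

/* Nonzero when a task at rt_priority a should run ahead of one at b. */
int outranks(int a, int b)
{
	/* Lower kernel prio value means higher scheduling priority. */
	return rt_to_kernel_prio(a) < rt_to_kernel_prio(b);
}
```

Under that assumption FIFO/60 maps to prio 39 and FIFO/50 to prio 49, so
outranks(60, 50) is nonzero: the promoted thread should preempt the thread
that promoted it, which is the reschedule the unpatched code misses.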



* Re: [patch] setscheduler fix
  2003-06-19 17:45     ` [patch] setscheduler fix Robert Love
  2003-06-19 18:20       ` Joe Korty
@ 2003-06-19 19:09       ` Ingo Molnar
  1 sibling, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2003-06-19 19:09 UTC (permalink / raw)
  To: Robert Love
  Cc: Joe Korty, george anzinger, Perez-Gonzalez, Inaky,
	'Andrew Morton', 'linux-kernel@vger.kernel.org',
	Li, Adam


On 19 Jun 2003, Robert Love wrote:

> +		if (rq->curr == p) {
> +			if (p->prio > oldprio)
> +				resched_task(rq->curr);
> +		} else if (p->prio < rq->curr->prio)
> +			resched_task(rq->curr);
> +	}

looks good.

	Ingo



* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-20  2:53 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-20  2:53 UTC (permalink / raw)
  To: 'george anzinger'
  Cc: 'linux-kernel@vger.kernel.org', 'mingo@elte.hu',
	Li, Adam, 'Robert Love'

> From: george anzinger [mailto:george@mvista.com]
> Perez-Gonzalez, Inaky wrote:
> > ...
> > My point here is: I am trying to trace where this program
> > is making use of workqueues inside of the kernel, and I
> > can find none. ...<snip>...
> 
> Hm!  I wonder.  Robert is working on a fix for sched_setscheduler()
> where it fails to actually tell the scheduler to switch to a process
> that it just made higher priority or away from one it just lowered.
> 
> The net result is that the caller keeps running (FIFO for all in this
> case) when, in fact it should have been switched out.  Next time
> schedule() actually switches, it is all sorted out again.  Could the
> elevation of the events/0 thread cause this needed switch?

Maybe it was that, as with Robert's patch, the hang goes
away ... gee, weirdo. Doing a brute-force grep of who is doing
anything that could wake up the event daemon (by calling
{queue,schedule}_[delayed_]work() or flush_{workqueue,delayed_work})
shows that arch/i386/kernel/cpu/mcheck:mce_timerfunc() is 
scheduling work every MCE_RATE seconds, so that could wake up
the event daemon and cause the thing to be sorted out. However,
that's every 15 seconds ... too slow?

The VT code does too (as a callback mechanism for setting the
console and, it seems, for scrolling), but none seems to be
periodic so that they'd fix it. Others are too unclear.

Oh well ... it works, so it goes to the bin :]

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-19 19:22 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19 19:22 UTC (permalink / raw)
  To: 'Robert Love'
  Cc: 'Ingo Molnar', 'Andrew Morton',
	'george anzinger', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org',
	Li, Adam


> From: Robert Love [mailto:rml@tech9.net]
> 
> And we can prevent starvation just by running the kernel thread at
> FIFO/99, because then it will never be starved by a higher priority
> task. If the RT task being starved is also at priority 99, it will
> eventually block (as in our example, on console I/O) and let the kernel
> thread run. If the RT task being starved is lower priority, then there
> is nothing to worry about.

/me is quite uneasy with that assumption; not that it is not
correct (it should be), but I can think of cases where that does
not need to happen (for example, if you have another FIFO/99
task B queued after your user task A, which depends on kernel task K0,
and task B happens to hold the CPU long enough for A to miss
deadlines).

> I guess a real deadlock could only occur if the FIFO/99 task does not
> block on the resource the kernel thread is providing but busy loops
> waiting for it.

I can think of an OpenOffice + sched_yield() style or a brain-damaged
poll as prior art. It would screw the whole equation.

Another example, I changed NGPT to do spin+futex for spinlocks, 
but then it was changed back by someone to spinning again -- 
performance reasons [they said] - of course, Thou Shall Not 
Use NGPT + Real-Time (tm). But how many of these are left? I am
so scared of JVMs...

This is even more prone to happen when you have priority
inheritance ... been there, done that, SysRq+E was my friend. 

It gets uncomfortably close to the "not my problem if you don't
know how to set up your system" area, but I would really prefer to have
that safeguard, which I can then disable manually (by prioritizing
down) if I know where to push.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19 18:31 Perez-Gonzalez, Inaky
@ 2003-06-19 18:36 ` Robert Love
  0 siblings, 0 replies; 22+ messages in thread
From: Robert Love @ 2003-06-19 18:36 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Ingo Molnar', 'Andrew Morton',
	'george anzinger', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org',
	Li, Adam

On Thu, 2003-06-19 at 11:31, Perez-Gonzalez, Inaky wrote:


> I don't think it is ideal either, but it is the only way I see that we
> can make sure that no user thread is going to stomp on the kernel's
> toes and cause a deadlock (this is an extreme, but it can happen).

Hmm, I guess a deadlock _is_ possible but I think the issue is more of
starvation.

And we can prevent starvation just by running the kernel thread at
FIFO/99, because then it will never be starved by a higher priority
task. If the RT task being starved is also at priority 99, it will
eventually block (as in our example, on console I/O) and let the kernel
thread run. If the RT task being starved is lower priority, then there
is nothing to worry about.

I guess a real deadlock could only occur if the FIFO/99 task does not
block on the resource the kernel thread is providing but busy loops
waiting for it.

It is all a trade off, and rarely a pleasant one...

	Robert Love
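
From user space, moving a thread such as events/0 to a FIFO priority comes
down to one sched_setscheduler() call. The sketch below is an illustration
only: fifo_prio_valid() and set_fifo_prio() are made-up helper names, the
PID of the kernel thread has to be looked up separately (for example with
ps), and the call normally needs root or CAP_SYS_NICE.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Nonzero if prio lies within the system's SCHED_FIFO priority range. */
int fifo_prio_valid(int prio)
{
	int lo = sched_get_priority_min(SCHED_FIFO);
	int hi = sched_get_priority_max(SCHED_FIFO);

	return lo != -1 && hi != -1 && prio >= lo && prio <= hi;
}

/*
 * Switch the task identified by pid to SCHED_FIFO at prio.
 * Returns 0 on success or a negative errno value on failure.
 */
int set_fifo_prio(pid_t pid, int prio)
{
	struct sched_param param = { .sched_priority = prio };

	if (!fifo_prio_valid(prio))
		return -EINVAL;		/* reject before touching the scheduler */
	if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1)
		return -errno;		/* e.g. -EPERM without privileges */
	return 0;
}
```

Calling set_fifo_prio(pid, sched_get_priority_max(SCHED_FIFO)) asks for the
highest priority the system allows instead of hard-coding a number; whether
that is a good default for a kernel thread is exactly the trade-off
discussed above.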




* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-19 18:31 Perez-Gonzalez, Inaky
  2003-06-19 18:36 ` Robert Love
  0 siblings, 1 reply; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19 18:31 UTC (permalink / raw)
  To: 'Robert Love'
  Cc: 'Ingo Molnar', 'Andrew Morton',
	'george anzinger', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org',
	Li, Adam

> From: Robert Love [mailto:rml@tech9.net]
> On Wed, 2003-06-18 at 23:52, Perez-Gonzalez, Inaky wrote:
> 
> > Then some output would show on my serial console when events/0 is
> > reprioritized...
> >
> > OTOH, what do you think of Robert's idea of adding 20 levels of
> > priorities for the kernel's sole use?
> 
> That was your idea, I just said the infrastructure was in place and we
> could do it ;-)

I stand corrected (Man! I should really have my fingers typing 
after my brain thinks it).

> I am not so sure it is ideal. I hesitate to make kernel threads FIFO at
> a maximum priority, let alone an even greater one. I would really prefer
> to find a nicer solution. Anyhow, if we make events FIFO/99 that would
> also solve the problem, without dipping into extra high levels.

I don't think it is ideal either, but it is the only way I see that we
can make sure that no user thread is going to stomp on the kernel's
toes and cause a deadlock (this is an extreme, but it can happen). If
by default we ship all the kernel threads at higher priority than
anything that the user can do, we avoid this problem (of course, some
are going to be a no-no, so we default them to OTHER -20), but the
most common ones to cause trouble, like the migration thread, keventd
and some others, might benefit from this.

Then the user/sysadmin/designer, being able to tweak them
up or down at will, could fix many potential issues with this.

(eg: customer who has decided his applications are real-time,
and thus have to be made SCHED_FIFO/50 and without any warning or 
possible cause, they are deadlocking while on some other systems
they work flawlessly ... )

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)



* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19  6:52 Perez-Gonzalez, Inaky
@ 2003-06-19 17:43 ` Robert Love
  0 siblings, 0 replies; 22+ messages in thread
From: Robert Love @ 2003-06-19 17:43 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Ingo Molnar', 'Andrew Morton',
	'george anzinger', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org',
	Li, Adam

On Wed, 2003-06-18 at 23:52, Perez-Gonzalez, Inaky wrote:

> Then some output would show on my serial console when events/0 is
> reprioritized...
> 
> OTOH, what do you think of Robert's idea of adding 20 levels of
> priorities for the kernel's sole use?

That was your idea, I just said the infrastructure was in place and we
could do it ;-)

I am not so sure it is ideal. I hesitate to make kernel threads FIFO at
a maximum priority, let alone an even greater one. I would really prefer
to find a nicer solution. Anyhow, if we make events FIFO/99 that would
also solve the problem, without dipping into extra high levels.

	Robert Love



* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-19  6:52 Perez-Gonzalez, Inaky
  2003-06-19 17:43 ` Robert Love
  0 siblings, 1 reply; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19  6:52 UTC (permalink / raw)
  To: 'Ingo Molnar'
  Cc: 'Andrew Morton', 'george anzinger',
	'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org',
	Li, Adam


> From: Ingo Molnar [mailto:mingo@elte.hu]
> 
> On Wed, 18 Jun 2003, Perez-Gonzalez, Inaky wrote:
> 
> > My point here is: I am trying to trace where this program is making use
> > of workqueues inside of the kernel, and I can find none. The only place
> > where I need to look some more is inside the timer code, but in a quick
> > glance, it seems it is not being used, so why is it affected by the
> > reprioritization of the events/0 thread? George, can you help me here?
> 
> well, printk (console input/output) can already make use of keventd.

Then some output would show on my serial console when events/0 is
reprioritized...

OTOH, what do you think of Robert's idea of adding 20 levels of
priorities for the kernel's sole use?

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)



* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-19  4:38 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19  4:38 UTC (permalink / raw)
  To: 'Joe Korty'; +Cc: 'linux-kernel@vger.kernel.org'

> From: 'joe.korty@ccur.com' [mailto:joe.korty@ccur.com]
>
> On Wed, Jun 18, 2003 at 06:44:42PM -0700, Perez-Gonzalez, Inaky wrote:
> >
> > Now that we are at that, it might be wise to add a higher-than-anything
> > priority that the kernel code can use (what would be 100 for user space,
> > but off-limits), so even FIFO 99 code in user space cannot block out
> > the migration thread, keventd and friends.
> 
> I would prefer users have the ability to put one or two truly critical RT
> tasks above keventd & family.  Such tasks would have to follow certain
rules
> .. run & sleep quick .. limited or no device IO ..  most communication to
> other tasks through shared memory .. possibly others.

Agreed - see my answers to George Anzinger and Robert Love; I wasn't
precise enough about meaning "yeah, you should be able to reprioritize it
at will". My point is that user programs have a limit that they cannot
exceed, while kernel threads can use both the user's priority space and
their own higher priority space.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
  2003-06-19  1:44 Perez-Gonzalez, Inaky
  2003-06-19  1:58 ` Robert Love
  2003-06-19  2:02 ` george anzinger
@ 2003-06-19  4:34 ` 'joe.korty@ccur.com'
  2 siblings, 0 replies; 22+ messages in thread
From: 'joe.korty@ccur.com' @ 2003-06-19  4:34 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Andrew Morton', 'george anzinger',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu'

On Wed, Jun 18, 2003 at 06:44:42PM -0700, Perez-Gonzalez, Inaky wrote:
> 
> Now that we are at that, it might be wise to add a higher-than-anything
> priority that the kernel code can use (what would be 100 for user space,
> but off-limits), so even FIFO 99 code in user space cannot block out
> the migration thread, keventd and friends.


I would prefer users have the ability to put one or two truly critical RT
tasks above keventd & family.  Such tasks would have to follow certain rules
.. run & sleep quick .. limited or no device IO ..  most communication to
other tasks through shared memory .. possibly others.

There are those willing to follow whatever rules necessary & split up their
application into tasks any which way in order to get high responsiveness to a
critical but small part of their application.  If you follow the rules, you
should be allowed to put a carefully crafted task above the system daemons
(with the possible exception of the migration daemon).

Joe


* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR tasks
@ 2003-06-19  2:55 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19  2:55 UTC (permalink / raw)
  To: 'george anzinger'
  Cc: 'Andrew Morton', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu'


> From: george anzinger [mailto:george@mvista.com]
> 
> Wait a bit (or even a byte) here.  I think the proper thing to do, IF
> we want to go down this road, is to separate out the various
> subsystems and give them each their own kernel task or workqueue.

Thaaaaaaat'd be so nice ... it'd also waste a lot of resources ...
maybe; I don't know if everybody at the lkml would swallow it.

That grossly reminds me of Timesys ... btw

> Then  those who need to could adjust, for example, network code to run
> ..<snip>.. If you give any kernel thread an untouchable priority, you
> might just as well move the work back to a bottom half or even the
> interrupt code.

My fault in not being more precise: on setting kernel stuff to FIFO 99+1,
for example, I was talking about defaults; users (better, admins/
designers) should be able to then tweak it (especially on embedded systems).

My point is that when Joe Sixpack fires some common appl that happens
to use RT things, he doesn't have to understand the whole book on
realtime and stuff to be able to tweak the system so it works.

In fact, Robert's point is the best, IMHO. Basically you add some more
priority levels (20, I think he mentions) that are reserved for
system stuff (I think Solaris has something similar). Kernel threads are
the only ones that can use that priority space, in addition to the normal
space available to the user.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)
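
The reserved-band idea can be put as a range check. This is a purely
hypothetical sketch: nothing like it is claimed to exist in the kernel, the
constant names and rt_prio_allowed() are made up, and the band size of 20
is just the figure mentioned in this thread.

```c
#include <stdbool.h>

#define USER_RT_PRIO_MAX   99	/* highest priority user space may request */
#define KERNEL_RT_LEVELS   20	/* hypothetical size of the reserved band */
#define KERNEL_RT_PRIO_MAX (USER_RT_PRIO_MAX + KERNEL_RT_LEVELS)

/*
 * Returns true if a task of the given kind may use rt_priority.
 * User tasks are limited to 1..99; kernel threads may also use 100..119.
 */
bool rt_prio_allowed(int rt_priority, bool is_kernel_thread)
{
	if (rt_priority < 1)
		return false;
	if (rt_priority <= USER_RT_PRIO_MAX)
		return true;	/* shared range: open to everyone */
	return is_kernel_thread && rt_priority <= KERNEL_RT_PRIO_MAX;
}
```

With this split a user task asking for priority 100 is refused, while a
kernel thread may sit anywhere from 1 to 119, so an admin can still move it
down into the user range when a critical RT task has to run above it.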


* Re: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks
  2003-06-19  1:44 Perez-Gonzalez, Inaky
  2003-06-19  1:58 ` Robert Love
@ 2003-06-19  2:02 ` george anzinger
  2003-06-19  4:34 ` 'joe.korty@ccur.com'
  2 siblings, 0 replies; 22+ messages in thread
From: george anzinger @ 2003-06-19  2:02 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Andrew Morton', 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu'

Perez-Gonzalez, Inaky wrote:
>>From: Andrew Morton [mailto:akpm@digeo.com]
>>
>>Various things like character drivers do rely upon keventd services.  So it
>>is possible that bash is stuck waiting on keyboard input, but there is no
>>keyboard input because keventd is locked out.
>>
>>I'll take a closer look at this, see if there is a specific case which can
>>be fixed.
>>
>>Arguably, keventd should be running max-prio RT because it is a kernel
>>service, providing "process context interrupt service".
> 
> 
> While we are at it, it might be wise to add a higher-than-anything
> priority that the kernel code can use (what would be 100 for user space,
> but off-limits), so even FIFO 99 code in user space cannot block out
> the migration thread, keventd and friends.

Wait a bit (or even a byte) here.  I think the proper thing to do, IF
we want to go down this road, is to separate out the various
subsystems and give them each their own kernel task or workqueue.
Then those who need to could adjust, for example, network code to run
after real time process control and prior to print jobs, priority-wise,
that is.  Likewise, you could adjust the console access to be
higher priority than the network so that we can always talk to the
system.  If you give any kernel thread an untouchable priority, you
might just as well move the work back to a bottom half or even the
interrupt code.
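
[As a rough illustration of the kind of per-task adjustment described
above: a minimal sketch, assuming the kernel thread's PID is already
known (for example from ps) and the caller has the privilege to change
it. The helper names fifo_prio_valid() and set_fifo_prio() are
hypothetical; only the POSIX calls sched_get_priority_min(),
sched_get_priority_max() and sched_setscheduler() are real.]

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: return 1 if prio is a legal SCHED_FIFO priority
 * on this system, 0 otherwise. */
static int fifo_prio_valid(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    return lo != -1 && hi != -1 && prio >= lo && prio <= hi;
}

/* Hypothetical helper: move task `pid` (0 means the caller) to SCHED_FIFO
 * at priority `prio`. Returns 0 on success, -1 on a bad priority or if
 * sched_setscheduler() fails (e.g. EPERM without privilege). */
static int set_fifo_prio(pid_t pid, int prio)
{
    struct sched_param sp = { .sched_priority = prio };

    if (!fifo_prio_valid(prio))
        return -1;
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

[With something like this, an admin could place a subsystem's kernel
task above or below a given real-time application by calling
set_fifo_prio() with that task's PID and the chosen priority.]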

-g
> 
> 
>>IIRC, Andrea's kernel runs keventd as SCHED_FIFO.  I've tried to avoid
>>making this change for ideological reasons ;) Userspace is more important
>>than the kernel and the kernel has no damn right to be saying "oh my stuff
>>is so important that it should run before latency-critical user code".
> 
> 
> I agree with that, but the consequence is kind of ugly; not that a true
> real-time embedded process is going to be printing to the console, but
> it might be outputting to a serial line, so now it relies on keventd.
> 
> BTW, I have seen similar problems wrt the migration thread, where a
> FIFO 20 process would get stuck on CPU1, which is taken by a FIFO 40
> task, while CPU0 was running a FIFO 10 -- however, I am not that positive
> that it is a migration thread problem; I blame it more on the scheduler
> not taking priorities into account when firing the load balancer. It is
> a tricky thingie, though. Affinity helps in this case.
> 
> Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
> (and my fault)
> 
> 

-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks
  2003-06-19  1:44 Perez-Gonzalez, Inaky
@ 2003-06-19  1:58 ` Robert Love
  2003-06-19  2:02 ` george anzinger
  2003-06-19  4:34 ` 'joe.korty@ccur.com'
  2 siblings, 0 replies; 22+ messages in thread
From: Robert Love @ 2003-06-19  1:58 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky
  Cc: 'Andrew Morton', 'george anzinger',
	'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu'

On Wed, 2003-06-18 at 18:44, Perez-Gonzalez, Inaky wrote:

> While we are at it, it might be wise to add a higher-than-anything
> priority that the kernel code can use (what would be 100 for user space,
> but off-limits), so even FIFO 99 code in user space cannot block out
> the migration thread, keventd and friends.

I did this about a year ago, and it is merged into the kernel.

See MAX_USER_RT_PRIO and MAX_RT_PRIO in <linux/sched.h>.

We just need to change MAX_RT_PRIO to, say, (MAX_USER_RT_PRIO + 10).

The one kicker is if we end up changing BITMAP_SIZE, the
default sched_find_first_bit() will break and we will need to implement
a new one. I did a generic one, as well as code to detect at
compile-time which to use, but the optimized one is a lot nicer. On
32-bit machines, the bitmap ends up being 160 bits
(5 unsigned longs) so there are about 20 extra priority levels
one can add "for free."
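
[The arithmetic behind the "for free" levels can be sketched as below.
This is a sketch under assumptions, not the kernel source: it assumes
MAX_USER_RT_PRIO is 100, that 40 non-RT levels sit above the RT range,
and that the bitmap holds MAX_PRIO + 1 bits rounded up to whole longs.
NICE_LEVELS, max_prio_for() and bitmap_words() are hypothetical names
used only for illustration.]

```c
#include <stddef.h>

/* Assumed value of MAX_USER_RT_PRIO from <linux/sched.h>. */
#define MAX_USER_RT_PRIO 100
/* Hypothetical name for the 40 non-RT levels above the RT range. */
#define NICE_LEVELS 40

/* MAX_PRIO if MAX_RT_PRIO were (MAX_USER_RT_PRIO + extra_kernel_levels). */
static unsigned int max_prio_for(unsigned int extra_kernel_levels)
{
    unsigned int max_rt_prio = MAX_USER_RT_PRIO + extra_kernel_levels;

    return max_rt_prio + NICE_LEVELS;
}

/* Number of longs needed for a bitmap of (max_prio + 1) bits, where a
 * long is word_bytes bytes wide (4 on a 32-bit machine). */
static size_t bitmap_words(unsigned int max_prio, size_t word_bytes)
{
    size_t bytes = (max_prio + 1 + 7) / 8;   /* round bits up to bytes */

    return (bytes + word_bytes - 1) / word_bytes; /* round up to longs */
}
```

[Under these assumptions, bitmap_words(max_prio_for(0), 4) and
bitmap_words(max_prio_for(19), 4) both come out to 5 longs, while
max_prio_for(20) needs a sixth, which would be the point where the
optimized sched_find_first_bit() no longer fits.]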

	Robert Love


* RE: O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks
@ 2003-06-19  1:44 Perez-Gonzalez, Inaky
  2003-06-19  1:58 ` Robert Love
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-06-19  1:44 UTC (permalink / raw)
  To: 'Andrew Morton', 'george anzinger'
  Cc: 'joe.korty@ccur.com',
	'linux-kernel@vger.kernel.org', 'mingo@elte.hu'

> From: Andrew Morton [mailto:akpm@digeo.com]
> 
> Various things like character drivers do rely upon keventd services.  So it
> is possible that bash is stuck waiting on keyboard input, but there is no
> keyboard input because keventd is locked out.
> 
> I'll take a closer look at this, see if there is a specific case which can
> be fixed.
> 
> Arguably, keventd should be running max-prio RT because it is a kernel
> service, providing "process context interrupt service".

While we are at it, it might be wise to add a higher-than-anything
priority that the kernel code can use (what would be 100 for user space,
but off-limits), so even FIFO 99 code in user space cannot block out
the migration thread, keventd and friends.

> IIRC, Andrea's kernel runs keventd as SCHED_FIFO.  I've tried to avoid
> making this change for ideological reasons ;) Userspace is more important
> than the kernel and the kernel has no damn right to be saying "oh my stuff
> is so important that it should run before latency-critical user code".

I agree with that, but the consequence is kind of ugly; not that a true
real-time embedded process is going to be printing to the console, but
it might be outputting to a serial line, so now it relies on keventd.

BTW, I have seen similar problems wrt the migration thread, where a
FIFO 20 process would get stuck on CPU1, which is taken by a FIFO 40
task, while CPU0 was running a FIFO 10 -- however, I am not that positive
that it is a migration thread problem; I blame it more on the scheduler
not taking priorities into account when firing the load balancer. It is
a tricky thingie, though. Affinity helps in this case.

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)

end of thread, other threads:[~2003-06-20  2:39 UTC | newest]

Thread overview: 22+ messages
2003-06-19  6:06 O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks Perez-Gonzalez, Inaky
2003-06-19  6:07 ` Ingo Molnar
2003-06-19 16:00 ` george anzinger
2003-06-19 17:19   ` 'joe.korty@ccur.com'
2003-06-19 17:23     ` Robert Love
2003-06-19 17:28       ` Joe Korty
2003-06-19 17:45     ` [patch] setscheduler fix Robert Love
2003-06-19 18:20       ` Joe Korty
2003-06-19 18:38         ` Robert Love
2003-06-19 19:09       ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2003-06-20  2:53 O(1) scheduler seems to lock up on sched_FIFO and sched_RR ta sks Perez-Gonzalez, Inaky
2003-06-19 19:22 Perez-Gonzalez, Inaky
2003-06-19 18:31 Perez-Gonzalez, Inaky
2003-06-19 18:36 ` Robert Love
2003-06-19  6:52 Perez-Gonzalez, Inaky
2003-06-19 17:43 ` Robert Love
2003-06-19  4:38 Perez-Gonzalez, Inaky
2003-06-19  2:55 Perez-Gonzalez, Inaky
2003-06-19  1:44 Perez-Gonzalez, Inaky
2003-06-19  1:58 ` Robert Love
2003-06-19  2:02 ` george anzinger
2003-06-19  4:34 ` 'joe.korty@ccur.com'
