All of lore.kernel.org
 help / color / mirror / Atom feed
* [Linux-ia64] Re: O(1) scheduler "complex" macros
@ 2002-07-10  9:05 Erich Focht
  2002-07-10 12:34 ` Erich Focht
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Erich Focht @ 2002-07-10  9:05 UTC (permalink / raw)
  To: linux-ia64

Hi Ingo,

thanks for the quick response!

> the best solution might be to just lock the 'next' task - this needs a new
> per-task irq-safe spinlock, to avoid deadlocks. This way whenever a task
> is in the middle of a context-switch it cannot be scheduled on another
> CPU.

We tested this and it looked good. But inserting a udelay(100) like:
	...
	prepare_arch_switch(rq, next);
	udelay(100);
	prev = context_switch(prev, next);
	...
leads to a crash after 10 minutes. Again this looks like accessing an
empty page.

Does anything speak against such a test? It is there just to show up
quickly problems which we might normally get only after hours of running.

Regards,
Erich
 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
@ 2002-07-10 12:34 ` Erich Focht
  2002-07-10 18:31 ` Ingo Molnar
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Erich Focht @ 2002-07-10 12:34 UTC (permalink / raw)
  To: linux-ia64

Hi Ingo,

> there is one more detail - wait_task_inactive() needs to consider the
> ->switch_lock as well - otherwise exit() might end up freeing the
> pagetables earlier than the context-switch has truly finished. The
> udelay(100) test should trigger this race.
>
> i've fixed this and uploaded the -A8 patch:
>
>         http://redhat.com/~mingo/O(1)-scheduler/sched-2.5.25-A8
>
> does this fix the ia64 crashes? you need to define an ia64-specific

looks good! Rock solid despite udelay. Though a bit slower than before...

Thanks a lot for the help!

Erich



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
  2002-07-10 12:34 ` Erich Focht
@ 2002-07-10 18:31 ` Ingo Molnar
  2002-07-10 19:30 ` Ingo Molnar
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-07-10 18:31 UTC (permalink / raw)
  To: linux-ia64

On Tue, 9 Jul 2002, Erich Focht wrote:

> Suppose we have 
>   cpu1: idle1
>   cpu2: prev2 -> next2  (in the switch)
> 
> I don't understand how task_lock(prev2) done on cpu2 can prevent cpu1 to
> schedule prev2, which it stole after the RQ#2 lock release. It will just
> try to task_lock(idle1), which will be successfull.

you are right - the 'complex' macros also need to lock the 'next' task,
not only the 'previous' task - but to do that deadlock-free, they need to
drop the runqueue lock ...

	Ingo



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
  2002-07-10 12:34 ` Erich Focht
  2002-07-10 18:31 ` Ingo Molnar
@ 2002-07-10 19:30 ` Ingo Molnar
  2002-07-11  9:25 ` Erich Focht
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-07-10 19:30 UTC (permalink / raw)
  To: linux-ia64

the best solution might be to just lock the 'next' task - this needs a new
per-task irq-safe spinlock, to avoid deadlocks. This way whenever a task
is in the middle of a context-switch it cannot be scheduled on another
CPU.

in fact this solution simplifies things - only two per-arch macros are
needed. I've done this in my current 2.5.25 tree:

	http://redhat.com/~mingo/O(1)-scheduler/sched-2.5.25-A4

check out the sparc64 changes for the 'complex' locking scenario - it's
untested, please give it a go on ia64, does that solve your problems? x86
is tested and works just fine.

	Ingo



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
                   ` (2 preceding siblings ...)
  2002-07-10 19:30 ` Ingo Molnar
@ 2002-07-11  9:25 ` Erich Focht
  2002-07-11  9:31 ` Ingo Molnar
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Erich Focht @ 2002-07-11  9:25 UTC (permalink / raw)
  To: linux-ia64

Ingo,

the last fix survived one night of testing with included udelay(100), so
I would consider it stable. Thanks again!

The performance with the new "complex" macros is as expected worse than
with the simple ones, we see 10% in some cases, which is hurting a lot.
Would you please consider moving the location of the switch_lock from
the end of the task_struct to the hot area near the scheduler related
variables? The effect may vary depending on the cache line size but
having it behind *array or sleep_timestamp would probably save us a cache
miss here.

Regards,
Erich



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
                   ` (3 preceding siblings ...)
  2002-07-11  9:25 ` Erich Focht
@ 2002-07-11  9:31 ` Ingo Molnar
  2002-07-11  9:39 ` Ingo Molnar
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-07-11  9:31 UTC (permalink / raw)
  To: linux-ia64

On Wed, 10 Jul 2002, Erich Focht wrote:

> > the best solution might be to just lock the 'next' task - this needs a new
> > per-task irq-safe spinlock, to avoid deadlocks. This way whenever a task
> > is in the middle of a context-switch it cannot be scheduled on another
> > CPU.
> 
> We tested this and it looked good. But inserting a udelay(100) like:
> 	...
> 	prepare_arch_switch(rq, next);
> 	udelay(100);
> 	prev = context_switch(prev, next);
> 	...
> leads to a crash after 10 minutes. Again this looks like accessing an
> empty page.

there is one more detail - wait_task_inactive() needs to consider the
->switch_lock as well - otherwise exit() might end up freeing the
pagetables earlier than the context-switch has truly finished. The
udelay(100) test should trigger this race.

i've fixed this and uploaded the -A8 patch:

        http://redhat.com/~mingo/O(1)-scheduler/sched-2.5.25-A8

does this fix the ia64 crashes? you need to define an ia64-specific
task_running(rq, p) macro, which should be something like:

 #define task_running(rq, p) \
	((rq)->curr = (p)) && !spin_is_locked(&(p)->switch_lock)

a number of other places needed to be updated to use the task_running()  
macro. For load_balance() and set_cpus_allowed() it's technically not
necessery, but i've added it to make things cleaner and safer for the time
being.

the default locking is still as lightweight as it used to be.

> Does anything speak against such a test? It is there just to show up
> quickly problems which we might normally get only after hours of
> running.

the udelay() test should be fine otherwise. (as long as ia64 udelay doesnt
do anything weird.)

	Ingo



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
                   ` (4 preceding siblings ...)
  2002-07-11  9:31 ` Ingo Molnar
@ 2002-07-11  9:39 ` Ingo Molnar
  2002-07-11  9:52 ` Ingo Molnar
  2002-07-12 12:39 ` Pavel Machek
  7 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-07-11  9:39 UTC (permalink / raw)
  To: linux-ia64

>  #define task_running(rq, p) \
> 	((rq)->curr = (p)) && !spin_is_locked(&(p)->switch_lock)

one more implementational note: the above test is not 'sharp' in the sense
that on SMP it's only correct (the test has no barriers) if the runqueue
lock is held. This is true for all the critical task_running() uses in
sched.c - and the cases that use it outside the runqueue lock are
optimizations so they dont need an exact test.

	Ingo



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
                   ` (5 preceding siblings ...)
  2002-07-11  9:39 ` Ingo Molnar
@ 2002-07-11  9:52 ` Ingo Molnar
  2002-07-12 12:39 ` Pavel Machek
  7 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2002-07-11  9:52 UTC (permalink / raw)
  To: linux-ia64

the correct macro is:

  #define task_running(rq, p) \
 	(((rq)->curr = (p)) || spin_is_locked(&(p)->switch_lock))

	Ingo



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Linux-ia64] Re: O(1) scheduler "complex" macros
  2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
                   ` (6 preceding siblings ...)
  2002-07-11  9:52 ` Ingo Molnar
@ 2002-07-12 12:39 ` Pavel Machek
  7 siblings, 0 replies; 9+ messages in thread
From: Pavel Machek @ 2002-07-12 12:39 UTC (permalink / raw)
  To: linux-ia64

Hi!

> >  #define task_running(rq, p) \
> > 	((rq)->curr = (p)) && !spin_is_locked(&(p)->switch_lock)
> 
> one more implementational note: the above test is not 'sharp' in the sense
> that on SMP it's only correct (the test has no barriers) if the runqueue
> lock is held. This is true for all the critical task_running() uses in
> sched.c - and the cases that use it outside the runqueue lock are
> optimizations so they dont need an exact test.

I believe this is worth a *big fat* comment.
								Pavel
-- 
Worst form of spam? Adding advertisment signatures ala sourceforge.net.
What goes next? Inserting advertisment *into* email?


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2002-07-12 12:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-07-10  9:05 [Linux-ia64] Re: O(1) scheduler "complex" macros Erich Focht
2002-07-10 12:34 ` Erich Focht
2002-07-10 18:31 ` Ingo Molnar
2002-07-10 19:30 ` Ingo Molnar
2002-07-11  9:25 ` Erich Focht
2002-07-11  9:31 ` Ingo Molnar
2002-07-11  9:39 ` Ingo Molnar
2002-07-11  9:52 ` Ingo Molnar
2002-07-12 12:39 ` Pavel Machek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.