linux-kernel.vger.kernel.org archive mirror
* [PATCH] set_cpus_allowed() for 2.4
@ 2002-10-01 23:03 Robert Love
  2002-10-02 13:01 ` Christoph Hellwig
  2002-11-05  3:37 ` Christoph Hellwig
  0 siblings, 2 replies; 45+ messages in thread
From: Robert Love @ 2002-10-01 23:03 UTC (permalink / raw)
  To: jgarzik, hch; +Cc: linux-kernel

The following patch implements set_cpus_allowed() for stock 2.4 without
the O(1) scheduler.

The calling semantics and behavior remain the same as 2.5's method.

This is to provide a backward-compatible interface, specifically for
those interested in back-porting the new workqueue code to 2.4 -
set_cpus_allowed() seems to be the only nit preventing a straight
drop-in.

Patch is against 2.4.20-pre8 and untested but does compile.

	Robert Love

diff -urN linux-2.4.20-pre8/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.4.20-pre8/include/linux/sched.h	Mon Sep 30 17:41:22 2002
+++ linux/include/linux/sched.h	Tue Oct  1 18:35:28 2002
@@ -163,6 +164,12 @@
 extern int start_context_thread(void);
 extern int current_is_keventd(void);
 
+#if CONFIG_SMP
+extern void set_cpus_allowed(struct task_struct *p, unsigned long new_mask);
+#else
+# define set_cpus_allowed(p, new_mask) do { } while (0)
+#endif
+
 /*
  * The default fd array needs to be at least BITS_PER_LONG,
  * as this is the granularity returned by copy_fdset().
diff -urN linux-2.4.20-pre8/kernel/ksyms.c linux/kernel/ksyms.c
--- linux-2.4.20-pre8/kernel/ksyms.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/ksyms.c	Tue Oct  1 18:34:41 2002
@@ -451,6 +451,9 @@
 EXPORT_SYMBOL(interruptible_sleep_on_timeout);
 EXPORT_SYMBOL(schedule);
 EXPORT_SYMBOL(schedule_timeout);
+#if CONFIG_SMP
+EXPORT_SYMBOL(set_cpus_allowed);
+#endif
 EXPORT_SYMBOL(yield);
 EXPORT_SYMBOL(__cond_resched);
 EXPORT_SYMBOL(jiffies);
diff -urN linux-2.4.20-pre8/kernel/sched.c linux/kernel/sched.c
--- linux-2.4.20-pre8/kernel/sched.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/sched.c	Tue Oct  1 18:54:49 2002
@@ -850,6 +850,46 @@
 
 void scheduling_functions_end_here(void) { }
 
+#if CONFIG_SMP
+
+/**
+ * set_cpus_allowed() - change a given task's processor affinity
+ * @p: task to bind
+ * @new_mask: bitmask of allowed processors
+ *
+ * Upon return, the task is running on a legal processor.  Note the caller
+ * must have a valid reference to the task: it must not exit() prematurely.
+ * This call can sleep; do not hold locks on call.
+ */
+void set_cpus_allowed(struct task_struct *p, unsigned long new_mask)
+{
+	new_mask &= cpu_online_map;
+	BUG_ON(!new_mask);
+
+	p->cpus_allowed = new_mask;
+
+	/*
+	 * If the task is on a no-longer-allowed processor, we need to move
+	 * it.  If the task is not current, then set need_resched and send
+	 * its processor an IPI to reschedule.
+	 */
+	if (!(p->cpus_runnable & p->cpus_allowed)) {
+		if (p != current) {
+			p->need_resched = 1;
+			smp_send_reschedule(p->processor);
+		}
+		/*
+		 * Wait until we are on a legal processor.  If the task is
+		 * current, then we should be on a legal processor the next
+		 * time we reschedule.  Otherwise, we need to wait for the IPI.
+		 */
+		while (!(p->cpus_runnable & p->cpus_allowed))
+			schedule();
+	}
+}
+
+#endif /* CONFIG_SMP */
+
 #ifndef __alpha__
 
 /*
diff -urN linux-2.4.20-pre8/kernel/softirq.c linux/kernel/softirq.c
--- linux-2.4.20-pre8/kernel/softirq.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/softirq.c	Tue Oct  1 18:53:01 2002
@@ -368,9 +368,8 @@
 	sigfillset(&current->blocked);
 
 	/* Migrate to the right CPU */
-	current->cpus_allowed = 1UL << cpu;
-	while (smp_processor_id() != cpu)
-		schedule();
+	set_cpus_allowed(current, 1UL << cpu);
+	BUG_ON(smp_processor_id() != cpu);
 
 	sprintf(current->comm, "ksoftirqd_CPU%d", bind_cpu);
 




^ permalink raw reply	[flat|nested] 45+ messages in thread
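
The mask handling in the patch above can be tried out in user space. What
follows is a minimal sketch, not kernel code: struct model_task and
model_set_cpus_allowed() are hypothetical names, and the sketch assumes that
on stock 2.4 cpus_runnable holds 1UL << cpu while a task is running and ~0UL
otherwise. It models only the masking and the "does the task have to move"
test; the IPI and the schedule() loop are left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the task_struct fields the patch touches. */
struct model_task {
	unsigned long cpus_allowed;	/* CPUs the task may run on */
	unsigned long cpus_runnable;	/* assumed: 1UL << cpu while running, ~0UL otherwise */
};

/*
 * Mirrors the mask handling of the patch.  Returns nonzero when the task
 * sits on a CPU outside the new mask, i.e. when the real function would
 * have to trigger a reschedule and wait.
 */
static int model_set_cpus_allowed(struct model_task *p,
				  unsigned long new_mask,
				  unsigned long online_map)
{
	new_mask &= online_map;		/* drop CPUs that are not online */
	assert(new_mask != 0);		/* stands in for BUG_ON(!new_mask) */

	p->cpus_allowed = new_mask;

	return !(p->cpus_runnable & p->cpus_allowed);
}
```

For a task running on CPU 0 of a two-CPU machine, asking for mask 0x2 makes
the model report that a migration is needed; asking for 0x3 does not, and
neither does any request for a task that is not running at all.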

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-10-01 23:03 [PATCH] set_cpus_allowed() for 2.4 Robert Love
@ 2002-10-02 13:01 ` Christoph Hellwig
  2002-10-02 15:00   ` Robert Love
  2002-11-05  3:37 ` Christoph Hellwig
  1 sibling, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2002-10-02 13:01 UTC (permalink / raw)
  To: Robert Love; +Cc: jgarzik, linux-kernel

On Tue, Oct 01, 2002 at 07:03:28PM -0400, Robert Love wrote:
> The following patch implements set_cpus_allowed() for stock 2.4 without
> the O(1) scheduler.
> 
> The calling semantics and behavior remain the same as 2.5's method.
> 
> This is to provide a backward-compatible interface, specifically for
> those interested in back-porting the new workqueue code to 2.4 -
> set_cpus_allowed() seems to be the only nit preventing a straight
> drop-in.
> 
> Patch is against 2.4.20-pre8 and untested but does compile.

Patch looks good to me, and I'd really like to have it in XFS :)
BTW, now that you have the core functionality I wonder why you don't
also add the cpu affinity syscalls..


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-10-02 13:01 ` Christoph Hellwig
@ 2002-10-02 15:00   ` Robert Love
  0 siblings, 0 replies; 45+ messages in thread
From: Robert Love @ 2002-10-02 15:00 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: jgarzik, linux-kernel

On Wed, 2002-10-02 at 09:01, Christoph Hellwig wrote:

> Patch looks good to me, and I'd really like to have it in XFS :)

Good :)

> BTW, now that you have the core functionality I wonder why you don't
> also add the cpu affinity syscalls..

I already wrote them for 2.4, although I guess I should redo them for
the new set_cpus_allowed()... I have been waiting to send them to
Marcelo to ensure the 2.5 interfaces were solid and did not change.  As
we approach the feature freeze, I guess we are getting there.

	Robert Love


^ permalink raw reply	[flat|nested] 45+ messages in thread
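
The CPU affinity syscalls mentioned in this exchange are reachable from user
space today through the glibc wrappers sched_getaffinity() and
sched_setaffinity(). The sketch below uses those present-day wrappers, not
the 2.4 patches Robert refers to, whose interface may well have differed.
The helper names first_allowed_cpu() and pin_current_thread() are made up
for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Lowest-numbered CPU the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/* Restrict the calling thread to a single CPU; 0 on success, -1 on error. */
static int pin_current_thread(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling pin_current_thread(first_allowed_cpu()) is the user-space analogue of
what ksoftirqd does in the kernel with set_cpus_allowed(current, 1UL << cpu).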

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-10-01 23:03 [PATCH] set_cpus_allowed() for 2.4 Robert Love
  2002-10-02 13:01 ` Christoph Hellwig
@ 2002-11-05  3:37 ` Christoph Hellwig
  2002-11-06 15:32   ` Adrian Bunk
  2002-12-02 17:12   ` Mikael Pettersson
  1 sibling, 2 replies; 45+ messages in thread
From: Christoph Hellwig @ 2002-11-05  3:37 UTC (permalink / raw)
  To: marcelo, Robert Love; +Cc: linux-kernel

Hi Marcelo,

now that all vendors ship a backport of Ingo's O(1) scheduler, external projects
like XFS have to track those trees in addition to the mainline kernel.

Having the common new APIs available in mainline would be a very good thing for
those projects.  We already have a proper yield() in 2.4.20, but the
set_cpus_allowed() API used e.g. for kernel threads bound to CPUs is still missing.

Any chance you could apply Robert Love's patch to add it for 2.4.20-rc2?  Note
that it does not change any existing code but just adds that interface.


diff -urN linux-2.4.20-pre8/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.4.20-pre8/include/linux/sched.h	Mon Sep 30 17:41:22 2002
+++ linux/include/linux/sched.h	Tue Oct  1 18:35:28 2002
@@ -163,6 +164,12 @@
 extern int start_context_thread(void);
 extern int current_is_keventd(void);
 
+#if CONFIG_SMP
+extern void set_cpus_allowed(struct task_struct *p, unsigned long new_mask);
+#else
+# define set_cpus_allowed(p, new_mask) do { } while (0)
+#endif
+
 /*
  * The default fd array needs to be at least BITS_PER_LONG,
  * as this is the granularity returned by copy_fdset().
diff -urN linux-2.4.20-pre8/kernel/ksyms.c linux/kernel/ksyms.c
--- linux-2.4.20-pre8/kernel/ksyms.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/ksyms.c	Tue Oct  1 18:34:41 2002
@@ -451,6 +451,9 @@
 EXPORT_SYMBOL(interruptible_sleep_on_timeout);
 EXPORT_SYMBOL(schedule);
 EXPORT_SYMBOL(schedule_timeout);
+#if CONFIG_SMP
+EXPORT_SYMBOL(set_cpus_allowed);
+#endif
 EXPORT_SYMBOL(yield);
 EXPORT_SYMBOL(__cond_resched);
 EXPORT_SYMBOL(jiffies);
diff -urN linux-2.4.20-pre8/kernel/sched.c linux/kernel/sched.c
--- linux-2.4.20-pre8/kernel/sched.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/sched.c	Tue Oct  1 18:54:49 2002
@@ -850,6 +850,46 @@
 
 void scheduling_functions_end_here(void) { }
 
+#if CONFIG_SMP
+
+/**
+ * set_cpus_allowed() - change a given task's processor affinity
+ * @p: task to bind
+ * @new_mask: bitmask of allowed processors
+ *
+ * Upon return, the task is running on a legal processor.  Note the caller
+ * must have a valid reference to the task: it must not exit() prematurely.
+ * This call can sleep; do not hold locks on call.
+ */
+void set_cpus_allowed(struct task_struct *p, unsigned long new_mask)
+{
+	new_mask &= cpu_online_map;
+	BUG_ON(!new_mask);
+
+	p->cpus_allowed = new_mask;
+
+	/*
+	 * If the task is on a no-longer-allowed processor, we need to move
+	 * it.  If the task is not current, then set need_resched and send
+	 * its processor an IPI to reschedule.
+	 */
+	if (!(p->cpus_runnable & p->cpus_allowed)) {
+		if (p != current) {
+			p->need_resched = 1;
+			smp_send_reschedule(p->processor);
+		}
+		/*
+		 * Wait until we are on a legal processor.  If the task is
+		 * current, then we should be on a legal processor the next
+		 * time we reschedule.  Otherwise, we need to wait for the IPI.
+		 */
+		while (!(p->cpus_runnable & p->cpus_allowed))
+			schedule();
+	}
+}
+
+#endif /* CONFIG_SMP */
+
 #ifndef __alpha__
 
 /*

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-11-05  3:37 ` Christoph Hellwig
@ 2002-11-06 15:32   ` Adrian Bunk
  2002-11-07 21:42     ` Christoph Hellwig
  2002-12-02 17:12   ` Mikael Pettersson
  1 sibling, 1 reply; 45+ messages in thread
From: Adrian Bunk @ 2002-11-06 15:32 UTC (permalink / raw)
  To: Christoph Hellwig, Robert Love, linux-kernel

On Mon, Nov 04, 2002 at 10:37:25PM -0500, Christoph Hellwig wrote:
>...
> now that all vendors ship a backport of Ingo's O(1) scheduler external projects
>...

Your "all vendors" doesn't include Debian?

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-11-06 15:32   ` Adrian Bunk
@ 2002-11-07 21:42     ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2002-11-07 21:42 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Christoph Hellwig, Robert Love, linux-kernel

On Wed, Nov 06, 2002 at 04:32:17PM +0100, Adrian Bunk wrote:
> On Mon, Nov 04, 2002 at 10:37:25PM -0500, Christoph Hellwig wrote:
> >...
> > now that all vendors ship a backport of Ingo's O(1) scheduler external projects
> >...
> 
> Your "all vendors" doesn't include Debian?

No.  Replace "all vendors" with "all commercial vendors" or "all recently
released distributions" :)


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-11-05  3:37 ` Christoph Hellwig
  2002-11-06 15:32   ` Adrian Bunk
@ 2002-12-02 17:12   ` Mikael Pettersson
  2002-12-03  0:51     ` Christoph Hellwig
  1 sibling, 1 reply; 45+ messages in thread
From: Mikael Pettersson @ 2002-12-02 17:12 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel

On November 4, Christoph Hellwig wrote:
 > +void set_cpus_allowed(struct task_struct *p, unsigned long new_mask)
 > +{
 > +	new_mask &= cpu_online_map;
 > +	BUG_ON(!new_mask);
 > +
 > +	p->cpus_allowed = new_mask;
 > +
 > +	/*
 > +	 * If the task is on a no-longer-allowed processor, we need to move
 > +	 * it.  If the task is not current, then set need_resched and send
 > +	 * its processor an IPI to reschedule.
 > +	 */
 > +	if (!(p->cpus_runnable & p->cpus_allowed)) {
 > +		if (p != current) {
 > +			p->need_resched = 1;
 > +			smp_send_reschedule(p->processor);
 > +		}
 > +		/*
 > +		 * Wait until we are on a legal processor.  If the task is
 > +		 * current, then we should be on a legal processor the next
 > +		 * time we reschedule.  Otherwise, we need to wait for the IPI.
 > +		 */
 > +		while (!(p->cpus_runnable & p->cpus_allowed))
 > +			schedule();
 > +	}
 > +}

Is this implementation of set_cpus_allowed() Ok for all 2.4 kernels,
even if they (like RH8.0's) use a non-vanilla scheduler?

I'm asking because I need to put a set_cpus_allowed() implementation
in my performance counters driver's compat layer. If it makes any
difference, I'll only use set_cpus_allowed(p, new_mask) when p == current
or p is stopped and under ptrace() control by current.

/Mikael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03  0:51     ` Christoph Hellwig
@ 2002-12-02 17:47       ` Mikael Pettersson
  2002-12-02 19:10         ` Robert Love
  0 siblings, 1 reply; 45+ messages in thread
From: Mikael Pettersson @ 2002-12-02 17:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel

Christoph Hellwig writes:
 > On Mon, Dec 02, 2002 at 06:12:05PM +0100, Mikael Pettersson wrote:
 > > Is this implementation of set_cpus_allowed() Ok for all 2.4 kernels,
 > > even if they (like RH8.0's) use a non-vanilla scheduler?
 > 
 > No, it's for the stock scheduler.  But RH8.0 already has set_cpus_allowed().

I knew RH8.0 has set_cpus_allowed(), but I wanted to avoid having to check
for being compiled in a RH-hacked kernel. LINUX_VERSION_CODE doesn't
distinguish between standard and "with tons of vendor-specific changes" :-(

I'll use your code then on stock 2.4 kernels, and work out some kludge
for the RH case.

Thanks,

/Mikael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:47       ` Mikael Pettersson
@ 2002-12-02 19:10         ` Robert Love
  0 siblings, 0 replies; 45+ messages in thread
From: Robert Love @ 2002-12-02 19:10 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: Christoph Hellwig, linux-kernel

On Mon, 2002-12-02 at 12:47, Mikael Pettersson wrote:

> I knew RH8.0 has set_cpus_allowed(), but I wanted to avoid having to check
> for being compiled in a RH-hacked kernel. LINUX_VERSION_CODE doesn't
> distinguish between standard and "with tons of vendor-specific changes" :-(
> 
> I'll use your code then on stock 2.4 kernels, and work out some kludge
> for the RH case.

The code only works on the stock scheduler.  It is the same interface
and has the same behavior as the O(1) scheduler version, but the code is
very, very different.

If this patch is merged, you can safely use set_cpus_allowed() in either
kernel (which is the intention).  But you cannot use this routine's code
on both schedulers; it only works with the stock one.

	Robert Love


^ permalink raw reply	[flat|nested] 45+ messages in thread
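
The compat-layer problem Mikael describes comes down to the version code
carrying too little information. The sketch below shows one way to phrase the
decision in plain C; it is an illustration, not code from any driver.
need_compat_set_cpus_allowed() and its tree_has_it flag are hypothetical: the
flag stands for whatever build-time probe of the kernel headers a driver
would use, and the 2.5.0 cut-off is an assumption made for the example.

```c
#include <assert.h>

/* Same encoding as the kernel's KERNEL_VERSION() macro. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Decide whether a driver's compat layer must bring its own
 * set_cpus_allowed().  The version code alone cannot tell a vendor
 * tree from a stock one, so the caller has to pass in the result of
 * a separate probe (tree_has_it, hypothetical).
 */
static int need_compat_set_cpus_allowed(int version_code, int tree_has_it)
{
	assert(tree_has_it == 0 || tree_has_it == 1);

	if (version_code >= KERNEL_VERSION(2, 5, 0))
		return 0;	/* assumed: 2.5 and later provide it */
	return !tree_has_it;	/* 2.4: depends on the tree, not the number */
}
```

Two 2.4 kernels with an identical version code can therefore give different
answers, which is why a LINUX_VERSION_CODE check alone is not enough here.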

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:12   ` Mikael Pettersson
@ 2002-12-03  0:51     ` Christoph Hellwig
  2002-12-02 17:47       ` Mikael Pettersson
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-03  0:51 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-kernel

On Mon, Dec 02, 2002 at 06:12:05PM +0100, Mikael Pettersson wrote:
> Is this implementation of set_cpus_allowed() Ok for all 2.4 kernels,
> even if they (like RH8.0's) use a non-vanilla scheduler?

No, it's for the stock scheduler.  But RH8.0 already has set_cpus_allowed().


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-13 21:34 ` Adrian Bunk
@ 2002-12-14  4:55   ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-14  4:55 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: linux-kernel

On Fri, Dec 13, 2002 at 10:34:59PM +0100, Adrian Bunk wrote:
> On Fri, Dec 13, 2002 at 06:08:14PM -0500, Christoph Hellwig wrote:
> 
> >...
> > now that all vendors ship a backport of Ingo's O(1) scheduler, external
> >...
> 
> #include <all-vendors-except-Debian>  ;-)

OOPS, I reused the first template again.  I think I owe you a beer :)


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
@ 2002-12-13 23:08 Christoph Hellwig
  2002-12-13 21:34 ` Adrian Bunk
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-13 23:08 UTC (permalink / raw)
  To: marcelo; +Cc: rml, linux-kernel

Hi Marcelo,

now that all vendors ship a backport of Ingo's O(1) scheduler, external
projects like XFS have to track those trees in addition to the mainline kernel.

Having the common new APIs available in mainline would be a very good thing
to have for us and others.  Now that 2.4.20 already has a working yield()
the biggest missing part is set_cpus_allowed() to limit (kernel-)threads
to a specific CPU or set of CPUs.

--- linux/include/linux/sched.h~	Mon Sep 30 17:41:22 2002
+++ linux/include/linux/sched.h	Tue Oct  1 18:35:28 2002
@@ -163,6 +164,12 @@
 extern int start_context_thread(void);
 extern int current_is_keventd(void);
 
+#if CONFIG_SMP
+extern void set_cpus_allowed(struct task_struct *p, unsigned long new_mask);
+#else
+# define set_cpus_allowed(p, new_mask) do { } while (0)
+#endif
+
 /*
  * The default fd array needs to be at least BITS_PER_LONG,
  * as this is the granularity returned by copy_fdset().
--- linux/kernel/ksyms.c~	Mon Sep 30 17:41:22 2002
+++ linux/kernel/ksyms.c	Tue Oct  1 18:34:41 2002
@@ -451,6 +451,9 @@
 EXPORT_SYMBOL(interruptible_sleep_on_timeout);
 EXPORT_SYMBOL(schedule);
 EXPORT_SYMBOL(schedule_timeout);
+#if CONFIG_SMP
+EXPORT_SYMBOL(set_cpus_allowed);
+#endif
 EXPORT_SYMBOL(yield);
 EXPORT_SYMBOL(__cond_resched);
 EXPORT_SYMBOL(jiffies);
--- linux/kernel/sched.c~	Mon Sep 30 17:41:22 2002
+++ linux/kernel/sched.c	Tue Oct  1 18:54:49 2002
@@ -850,6 +850,45 @@
 
 void scheduling_functions_end_here(void) { }
 
+#if CONFIG_SMP
+/**
+ * set_cpus_allowed() - change a given task's processor affinity
+ * @p: task to bind
+ * @new_mask: bitmask of allowed processors
+ *
+ * Upon return, the task is running on a legal processor.  Note the caller
+ * must have a valid reference to the task: it must not exit() prematurely.
+ * This call can sleep; do not hold locks on call.
+ */
+void set_cpus_allowed(struct task_struct *p, unsigned long new_mask)
+{
+	new_mask &= cpu_online_map;
+	BUG_ON(!new_mask);
+
+	p->cpus_allowed = new_mask;
+
+	/*
+	 * If the task is on a no-longer-allowed processor, we need to move
+	 * it.  If the task is not current, then set need_resched and send
+	 * its processor an IPI to reschedule.
+	 */
+	if (!(p->cpus_runnable & p->cpus_allowed)) {
+		if (p != current) {
+			p->need_resched = 1;
+			smp_send_reschedule(p->processor);
+		}
+
+		/*
+		 * Wait until we are on a legal processor.  If the task is
+		 * current, then we should be on a legal processor the next
+		 * time we reschedule.  Otherwise, we need to wait for the IPI.
+		 */
+		while (!(p->cpus_runnable & p->cpus_allowed))
+			schedule();
+	}
+}
+#endif /* CONFIG_SMP */
+
 #ifndef __alpha__
 
 /*

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-13 23:08 Christoph Hellwig
@ 2002-12-13 21:34 ` Adrian Bunk
  2002-12-14  4:55   ` Christoph Hellwig
  0 siblings, 1 reply; 45+ messages in thread
From: Adrian Bunk @ 2002-12-13 21:34 UTC (permalink / raw)
  To: Christoph Hellwig, linux-kernel

On Fri, Dec 13, 2002 at 06:08:14PM -0500, Christoph Hellwig wrote:

>...
> now that all vendors ship a backport of Ingo's O(1) scheduler, external
>...

#include <all-vendors-except-Debian>  ;-)

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
@ 2002-12-09 20:19 kernel
  0 siblings, 0 replies; 45+ messages in thread
From: kernel @ 2002-12-09 20:19 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel


I can confirm that 2.5 is much, much better. On a dual Pentium Pro @ 180MHz
I had 48 lame processes encoding WAV to MP3, and X was responding as if
nothing was happening.  With 2.4, kaboom! Even the mouse slows down!

Nice work Ingo ;)

George


On Sun, 8 Dec 2002, Andrew Morton wrote:

> Yes, thanks.  Will we also be seeing the "interactivity estimator" fixes
> in 2.5?

yes, but i'd like to clarify one more thing - worst-case O(1)
interactivity is indeed very jerky (eg. the fast window moving
thing you noticed), but the normal behavior is much better than the old
scheduler's. Just try compiling the kernel with make -j4 under stock 2.4
and _everything_ in X will be jerky. With the O(1) scheduler things are
just as smooth as on an idle system - as long as your application does not
get rated CPU-intensive. [which happens too fast in the case you
described.] So we do have something in 2.5 that is visibly better in a
number of cases, and i want to preserve that - while fixing the
corner-cases discussed here. I'm working on it.

	Ingo


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-08 19:56       ` Andrew Morton
@ 2002-12-09 20:13         ` Ingo Molnar
  0 siblings, 0 replies; 45+ messages in thread
From: Ingo Molnar @ 2002-12-09 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Christoph Hellwig, Robert Love, linux-kernel


On Sun, 8 Dec 2002, Andrew Morton wrote:

> Yes, thanks.  Will we also be seeing the "interactivity estimator" fixes
> in 2.5?

yes, but i'd like to clarify one more thing - worst-case O(1)
interactivity is indeed very jerky (eg. the fast window moving
thing you noticed), but the normal behavior is much better than the old
scheduler's. Just try compiling the kernel with make -j4 under stock 2.4
and _everything_ in X will be jerky. With the O(1) scheduler things are
just as smooth as on an idle system - as long as your application does not
get rated CPU-intensive. [which happens too fast in the case you
described.] So we do have something in 2.5 that is visibly better in a
number of cases, and i want to preserve that - while fixing the
corner-cases discussed here. I'm working on it.

	Ingo


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
@ 2002-12-09  3:02 Jim Houston
  0 siblings, 0 replies; 45+ messages in thread
From: Jim Houston @ 2002-12-09  3:02 UTC (permalink / raw)
  To: linux-kernel, mingo


Hi Everyone,

I ran into a lockup with the O(1) scheduler back in August and
exchanged email with Andrea. I tried a couple of his patches.
They prevented the lockup but it was still easy to have the
X server exiled to the inactive array for seconds at a time.
This got me started working on a patch to make the scheduler
more fair.

I posted a patch archive here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103508412423719&w=2

It fixes fairness but breaks nice(2). Rik van Riel has a
patch that builds on mine and fixes this:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103651801424031&w=2

I tested the combination of these patches with linux-2.5.48 and
it seems well behaved:-)  

I found this problem with the LTP waitpid06 test.  It actually
produced a live-lock. See this mail:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103133744217082&w=2

I have been distracted by POSIX timers but I plan to get back
and finish this.

Jim Houston - Concurrent Computer Corp.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-08 13:23     ` Ingo Molnar
@ 2002-12-08 19:56       ` Andrew Morton
  2002-12-09 20:13         ` Ingo Molnar
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2002-12-08 19:56 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Martin J. Bligh, Christoph Hellwig, Robert Love, linux-kernel

Ingo Molnar wrote:
> 
> On Mon, 2 Dec 2002, Andrew Morton wrote:
> 
> > I have observed two problems with the new scheduler, both serious IMO:
> >
> > 1) Changed sched_yield() semantics.  [...]
> 
> we noticed this OpenOffice/StarOffice problem in July, while beta-testing
> RH 8.0. In July Andrea already had another yield implementation in his
> tree, which was addressing an unrelated yield()-related regression. I'd
> like to note here that StarOffice/OpenOffice sucked just as much under
> Andrea's yield() variant as the original (and 2.5) O(1) scheduler variant
> did.
> 
> So i talked to Andrea, and we agreed in a rough solution that worked
> sufficiently well for OpenOffice and the other regression as well. I
> implemented it and tested it for OpenOffice. You can see (an i suspect
> later incarnation) of that implementation in Andrea's current tree. My
> position back then was that we should not try to move the arguably broken
> 2.4 yield() implementation to 2.5.
> 
> So this is the history of O(1) yield().
> 
> fortunately, things have changed since July, since due to NPTL threading
> the architectural need for user-space yield() has decreased significantly
> (NPTL uses futexes, no yielding anywhere), so the only worry is behavioral
> compatibility with LinuxThreads (and other yield() users). I'll forward
> port the new (well, old) yield() semantics to 2.5 as well, which will be
> quite similar to the yield() implementation in Andrea's tree.
> 
> there's another (this time unique) bit implemented by Andrea, a variant of
> giving newly forked children priority in a more subtle way - i'm testing
> this change currently, to see whether it has any positive effect on
> compilation workloads.
> 
> does this clarify things?
> 

Yes, thanks.  Will we also be seeing the "interactivity estimator" fixes
in 2.5?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 19:30   ` Andrew Morton
  2002-12-02 19:50     ` Andrea Arcangeli
@ 2002-12-08 13:23     ` Ingo Molnar
  2002-12-08 19:56       ` Andrew Morton
  1 sibling, 1 reply; 45+ messages in thread
From: Ingo Molnar @ 2002-12-08 13:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Christoph Hellwig, marcelo, Robert Love, linux-kernel


On Mon, 2 Dec 2002, Andrew Morton wrote:

> I have observed two problems with the new scheduler, both serious IMO:
> 
> 1) Changed sched_yield() semantics.  [...]

we noticed this OpenOffice/StarOffice problem in July, while beta-testing
RH 8.0. In July Andrea already had another yield implementation in his
tree, which was addressing an unrelated yield()-related regression. I'd
like to note here that StarOffice/OpenOffice sucked just as much under
Andrea's yield() variant as the original (and 2.5) O(1) scheduler variant
did.

So i talked to Andrea, and we agreed on a rough solution that worked
sufficiently well for OpenOffice and the other regression as well. I
implemented it and tested it for OpenOffice. You can see (an, i suspect,
later incarnation of) that implementation in Andrea's current tree. My
position back then was that we should not try to move the arguably broken
2.4 yield() implementation to 2.5.

So this is the history of O(1) yield().

fortunately, things have changed since July, since due to NPTL threading
the architectural need for user-space yield() has decreased significantly
(NPTL uses futexes, no yielding anywhere), so the only worry is behavioral
compatibility with LinuxThreads (and other yield() users). I'll forward
port the new (well, old) yield() semantics to 2.5 as well, which will be
quite similar to the yield() implementation in Andrea's tree.

there's another (this time unique) bit implemented by Andrea, a variant of
giving newly forked children priority in a more subtle way - i'm testing
this change currently, to see whether it has any positive effect on
compilation workloads.

does this clarify things?

	Ingo


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 22:41         ` Robert Love
@ 2002-12-07 16:55           ` bill davidsen
  0 siblings, 0 replies; 45+ messages in thread
From: bill davidsen @ 2002-12-07 16:55 UTC (permalink / raw)
  To: linux-kernel

In article <1038868912.869.60.camel@phantasy>,
Robert Love  <rml@tech9.net> wrote:

| Ingo did explicitly mention he thought the O(1) scheduler was not 2.4
| material.  Whether this has changed, e.g. due to stabilization of the
| scheduler, I do not know.  But I do recall he had an opinion in the
| past.

  I have exchanged Email with him explaining why I feel it's highly
desirable on news servers, and I sent him some metrics with and without.
I had the impression he would reconsider the issue in the future. Note
that means "think about it again" rather than any implied change in his
conclusion.

  As long as patches are available I will continue to apply them, but I
certainly think the increased stability would suggest a backport to 2.4
at some time. I'm not totally sure that's now; Ingo is far better
qualified than I to evaluate the overall impact on more typical loads.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  1:14               ` Andrew Morton
  2002-12-04  1:21                 ` Andrea Arcangeli
@ 2002-12-06 18:11                 ` William Lee Irwin III
  1 sibling, 0 replies; 45+ messages in thread
From: William Lee Irwin III @ 2002-12-06 18:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andrea Arcangeli, Martin J. Bligh, Christoph Hellwig, rml, linux-kernel

On Tue, Dec 03, 2002 at 05:14:40PM -0800, Andrew Morton wrote:
> I just retested.  This is on uniprocessor.  Running `make -j1 bzImage',
> while typing into a StarOffice 5.2 document:

I just reproduced the kernel compile issue by forgetting to run a
kernel compile niced down on UP, and getting many ccache misses.


Bill

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  1:03               ` William Lee Irwin III
@ 2002-12-04  9:25                 ` William Lee Irwin III
  0 siblings, 0 replies; 45+ messages in thread
From: William Lee Irwin III @ 2002-12-04  9:25 UTC (permalink / raw)
  To: Andrea Arcangeli, Andrew Morton, Martin J. Bligh,
	Christoph Hellwig, marcelo, rml, linux-kernel

On Tue, Dec 03, 2002 at 04:30:28PM -0800, Andrew Morton wrote:
>>> load is just one or more busywaits.  It has to be a compilation.  It
>>> could be something to do with all the short-lived processes, or gcc -pipe)

On Wed, Dec 04, 2002 at 01:42:34AM +0100, Andrea Arcangeli wrote:
>> could be that we think they're very interactive or something like that.

On Tue, Dec 03, 2002 at 05:03:07PM -0800, William Lee Irwin III wrote:
> The pipe issue is observable without involving gcc or kernel compiles.
> Cooperating processes are consistently granted excessive priorities.

More specifically, the "cooperating processes monopolize the cpu"
scenario is at its worst when a shell script is used to drive the bochs
simulator by single-stepping in order to generate instruction-level
boot-time traces of the execution of custom executives.

./foo.sh | bochs is the method, where the contents of foo.sh are
trivially derivable from bochs' debugging interface (a couple of
newlines and then repeating 's' indefinitely, then killing the process
by hand when the exception is observed while tail -f'ing the trace).


Bill

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  1:21                 ` Andrea Arcangeli
@ 2002-12-04  2:14                   ` Andrew Morton
  0 siblings, 0 replies; 45+ messages in thread
From: Andrew Morton @ 2002-12-04  2:14 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Martin J. Bligh, Christoph Hellwig, rml, linux-kernel

Andrea Arcangeli wrote:
> 
> ...
> > Could be.  Removing -pipe affected it quite a bit.
> 
> you could try decreasing PARENT_PENALTY to 50. I would like to see if
> the scheduler *still* thinks they're interactive stuff then.
> 

That didn't seem to make much difference either way.


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  1:14               ` Andrew Morton
@ 2002-12-04  1:21                 ` Andrea Arcangeli
  2002-12-04  2:14                   ` Andrew Morton
  2002-12-06 18:11                 ` William Lee Irwin III
  1 sibling, 1 reply; 45+ messages in thread
From: Andrea Arcangeli @ 2002-12-04  1:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Martin J. Bligh, Christoph Hellwig, rml, linux-kernel

On Tue, Dec 03, 2002 at 05:14:40PM -0800, Andrew Morton wrote:
> Andrea Arcangeli wrote:
> > 
> > On Tue, Dec 03, 2002 at 04:30:28PM -0800, Andrew Morton wrote:
> > > load is just one or more busywaits.  It has to be a compilation.  It
> > > could be something to do with all the short-lived processes, or gcc -pipe)
> > 
> > could be that we think they're very interactive or something like that.
> 
> I just retested.  This is on uniprocessor.  Running `make -j1 bzImage',
> while typing into a StarOffice 5.2 document:
> 
> - 2.4.19-pre4: smooth
> - 2.4.20aa1: Jerky.  Sometimes it's OK, sometimes a few characters
>   lag.
> 
> Then I disabled `-pipe' in the build and restarted it:
> 
> - 2.4.19-pre4: smooth
> - 2.4.20aa1: Quite a lot more jerky.  Enough to be a bit irritating.
> 
> > ...
> > >
> > > This problem is the "changed sched_yield semantics".  It was actually
> > > tested on uniprocessor.  The difference between 2.4 and 2.4-aa is
> > > still noticeable here, but it is not a terrible problem now.
> > 
> > strange, the algorithm should be nearly the same now (modulo RT). Still,
> > I wonder if it's something else on the short-lived gcc processes side.
> 
> Could be.  Removing -pipe affected it quite a bit.


you could try decreasing PARENT_PENALTY to 50. I would like to see if
the scheduler *still* thinks they're interactive stuff then.

Andrea


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  0:42             ` Andrea Arcangeli
  2002-12-04  1:03               ` William Lee Irwin III
@ 2002-12-04  1:14               ` Andrew Morton
  2002-12-04  1:21                 ` Andrea Arcangeli
  2002-12-06 18:11                 ` William Lee Irwin III
  1 sibling, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2002-12-04  1:14 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Martin J. Bligh, Christoph Hellwig, rml, linux-kernel

Andrea Arcangeli wrote:
> 
> On Tue, Dec 03, 2002 at 04:30:28PM -0800, Andrew Morton wrote:
> > load is just one or more busywaits.  It has to be a compilation.  It
> > could be something to do with all the short-lived processes, or gcc -pipe)
> 
> could be that we think they're very interactive or something like that.

I just retested.  This is on uniprocessor.  Running `make -j1 bzImage',
while typing into a StarOffice 5.2 document:

- 2.4.19-pre4: smooth
- 2.4.20aa1: Jerky.  Sometimes it's OK, sometimes a few characters
  lag.

Then I disabled `-pipe' in the build and restarted it:

- 2.4.19-pre4: smooth
- 2.4.20aa1: Quite a lot more jerky.  Enough to be a bit irritating.

> ...
> >
> > This problem is the "changed sched_yield semantics".  It was actually
> > tested on uniprocessor.  The difference between 2.4 and 2.4-aa is
> > still noticeable here, but it is not a terrible problem now.
> 
> strange, the algorithm should be nearly the same now (modulo RT). Still,
> I wonder if it's something else on the short-lived gcc processes side.

Could be.  Removing -pipe affected it quite a bit.
 
> ...
> the right implementation would probably be to let all the other tasks
> run, so it can't waste entire timeslices if two tasks run sched_yield
> in a loop and the holder waits behind them, but that proved to be quite
> slow in practice for apps like openoffice (really, when we tested that
> algorithm there were still various bugs, but I still think letting all
> tasks run before staroffice could make progress was the major reason
> for the slowdown; think of all the gcc processes spending their
> timeslices before you can take a mutex etc...).

Yup.  yield() is a very vague thing.  So vague as to make it a bit
useless really.  Anything which depends on it will show large
changes in behaviour as system load changes.  And indeed as the
yield() implementation changes ;)


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  0:42             ` Andrea Arcangeli
@ 2002-12-04  1:03               ` William Lee Irwin III
  2002-12-04  9:25                 ` William Lee Irwin III
  2002-12-04  1:14               ` Andrew Morton
  1 sibling, 1 reply; 45+ messages in thread
From: William Lee Irwin III @ 2002-12-04  1:03 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andrew Morton, Martin J. Bligh, Christoph Hellwig, marcelo, rml,
	linux-kernel

On Tue, Dec 03, 2002 at 04:30:28PM -0800, Andrew Morton wrote:
>> load is just one or more busywaits.  It has to be a compilation.  It
>> could be something to do with all the short-lived processes, or gcc -pipe)

On Wed, Dec 04, 2002 at 01:42:34AM +0100, Andrea Arcangeli wrote:
> could be that we think they're very interactive or something like that.

The pipe issue is observable without involving gcc or kernel compiles.
Cooperating processes are consistently granted excessive priorities.


Bill


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  0:30           ` Andrew Morton
@ 2002-12-04  0:42             ` Andrea Arcangeli
  2002-12-04  1:03               ` William Lee Irwin III
  2002-12-04  1:14               ` Andrew Morton
  0 siblings, 2 replies; 45+ messages in thread
From: Andrea Arcangeli @ 2002-12-04  0:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Christoph Hellwig, marcelo, rml, linux-kernel

On Tue, Dec 03, 2002 at 04:30:28PM -0800, Andrew Morton wrote:
> load is just one or more busywaits.  It has to be a compilation.  It
> could be something to do with all the short-lived processes, or gcc -pipe)

could be that we think they're very interactive or something like that.

> 
> > ...
> > > With a `make -j1' running:
> > >
> > > - Normal O(1) behaviour in StarOffice 5.2 is 15-30 second delays between
> > >   actions.
> > >
> > > - With 2.4.20aa1, typing into a text document typically had a 2-3 character
> > >   delay.
> > >
> > > - With the standard 2.4 scheduler the delay is zero characters.
> > 
> > again, I guess that's SMP and that's quite a pain to fix it to be 100%
> > equivalent to 2.4 without hurting scalability.
> 
> This problem is the "changed sched_yield semantics".  It was actually
> tested on uniprocessor.  The difference between 2.4 and 2.4-aa is
> still noticeable here, but it is not a terrible problem now.

strange, the algorithm should be nearly the same now (modulo RT). Still,
I wonder if it's something else on the short-lived gcc processes side.

> > ..
> > 
> > Overall I don't see any showstopper with openoffice (or staroffice) on
> > my version of the o1 scheduler.
> 
> I'd agree that it's not a showstopper.  It's in the "could be improved
> a bit sometime" department.
> 
> Post-2.4, well, spinning on sched_yield() is a silly way to implement
> a graphical application and I don't believe we need to struggle to
> support such a thing.
> 
> The Open Group say
> 
>      The sched_yield() function shall force the running thread to relinquish
>      the processor until it again becomes the head of its thread list. It
>      takes no arguments.
> 
> That's a bit vague, but it does tend to imply that a yield could
> relinquish the CPU for a very long time.

the right implementation would probably be to let all the other tasks
run, so it can't waste entire timeslices if two tasks run sched_yield
in a loop and the holder waits behind them, but that proved to be quite
slow in practice for apps like openoffice (really, when we tested that
algorithm there were still various bugs, but I still think letting all
tasks run before staroffice could make progress was the major reason
for the slowdown; think of all the gcc processes spending their
timeslices before you can take a mutex etc...).

Andrea


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-04  0:06         ` Andrea Arcangeli
@ 2002-12-04  0:30           ` Andrew Morton
  2002-12-04  0:42             ` Andrea Arcangeli
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2002-12-04  0:30 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Martin J. Bligh, Christoph Hellwig, marcelo, rml, linux-kernel

Andrea Arcangeli wrote:
> 
> ...
> >
> > The difference is unlikely to be noticed by many.  (But it should be
> > _better_ than stock 2.4)
> 
> it can't be better on SMP because, due to its scalability feature, we
> completely lose track of the global SMP state and can only keep track of
> the single per-cpu queue. Was it on SMP or UP?

The problem with the "interactivity estimator" was observed on
dual CPU.  It has almost vanished in 2.4.20aa1 and I don't think
it needs any more attention.

(BTW: it is not possible to trigger this problem when the background
load is just one or more busywaits.  It has to be a compilation.  It
could be something to do with all the short-lived processes, or gcc -pipe)

> ...
> > With a `make -j1' running:
> >
> > - Normal O(1) behaviour in StarOffice 5.2 is 15-30 second delays between
> >   actions.
> >
> > - With 2.4.20aa1, typing into a text document typically had a 2-3 character
> >   delay.
> >
> > - With the standard 2.4 scheduler the delay is zero characters.
> 
> again, I guess that's SMP and that's quite a pain to fix it to be 100%
> equivalent to 2.4 without hurting scalability.

This problem is the "changed sched_yield semantics".  It was actually
tested on uniprocessor.  The difference between 2.4 and 2.4-aa is
still noticeable here, but it is not a terrible problem now.

> ..
> 
> Overall I don't see any showstopper with openoffice (or staroffice) on
> my version of the o1 scheduler.

I'd agree that it's not a showstopper.  It's in the "could be improved
a bit sometime" department.

Post-2.4, well, spinning on sched_yield() is a silly way to implement
a graphical application and I don't believe we need to struggle to
support such a thing.

The Open Group say

     The sched_yield() function shall force the running thread to relinquish
     the processor until it again becomes the head of its thread list. It
     takes no arguments.

That's a bit vague, but it does tend to imply that a yield could
relinquish the CPU for a very long time.
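
As an illustration of the pattern being discussed (not code taken from
StarOffice or from any pthread library), here is a minimal userspace sketch
of a lock that spins on sched_yield().  The names yield_lock,
yield_lock_try(), yield_lock_acquire() and yield_lock_release() are made up
for this example, and it assumes a C11 compiler with <stdatomic.h>.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical lock type for illustration: the flag is set while held. */
struct yield_lock {
	atomic_flag held;
};

/* One attempt: returns 1 if the lock was taken, 0 if it was already held. */
static int yield_lock_try(struct yield_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

/* Spin until the lock is free, giving up the CPU after each failed try. */
static void yield_lock_acquire(struct yield_lock *l)
{
	while (!yield_lock_try(l))
		sched_yield();
}

/* Drop the lock so that the next yield_lock_try() can succeed. */
static void yield_lock_release(struct yield_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

How long yield_lock_acquire() takes depends entirely on how soon the
scheduler runs the caller again after each yield, and on whether the holder
gets to run in between.  A primitive that blocks in the kernel until the
holder releases the lock would not depend on yield behaviour at all.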


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03 21:09         ` Martin J. Bligh
@ 2002-12-04  0:09           ` Andrea Arcangeli
  0 siblings, 0 replies; 45+ messages in thread
From: Andrea Arcangeli @ 2002-12-04  0:09 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Andrew Morton, Christoph Hellwig, rml, linux-kernel

On Tue, Dec 03, 2002 at 01:09:42PM -0800, Martin J. Bligh wrote:
> >>please try with my tree.
> >
> >It is greatly improved.  It is still not as smooth as the standard 2.4
> >scheduler, but I'd characterise it as "a bit jerky" rather than "makes
> >me want to punch a hole in the monitor".
> >
> >The difference is unlikely to be noticed by many.  (But it should be
> >_better_ than stock 2.4)
> 
> ...
> 
> >>can you reproduce with my tree?
> >
> >Again, hugely improved over normal O(1) behaviour, but not as responsive
> >as the stock 2.4 scheduler.
> 
> Andrea, which patches in your tree are the ones that fix this?
> If it's the big-monster one ... any chance you could split out
> the bits that actually fix it? I'd love to be able to apply your fixes
> to 2.5 and try them there ....

it's all in these patches:

andrea@dualathlon:~/remote/kernel.org/kernels/v2.4/2.4.20aa1> ls -1 *sched*
00_flush-inode-reschedule-2
00_sched-O1-aa-2.4.19rc3-5.gz
10_sched-o1-bluetooth-1
10_sched-o1-hyperthreading-3
20_apm-o1-sched-1
20_sched-o1-fixes-8
71_xfs-sched-1

I'm fixing the RT case too right now, in a few days a further fix will
be available to avoid deadlocks of some app with RT enabled.

Andrea


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03 20:49       ` Andrew Morton
  2002-12-03 21:09         ` Martin J. Bligh
@ 2002-12-04  0:06         ` Andrea Arcangeli
  2002-12-04  0:30           ` Andrew Morton
  1 sibling, 1 reply; 45+ messages in thread
From: Andrea Arcangeli @ 2002-12-04  0:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Christoph Hellwig, marcelo, rml, linux-kernel

On Tue, Dec 03, 2002 at 12:49:16PM -0800, Andrew Morton wrote:
> Andrea Arcangeli wrote:
> > 
> > On Mon, Dec 02, 2002 at 11:30:05AM -0800, Andrew Morton wrote:
> > ...
> > > I have observed two problems with the new scheduler, both serious IMO:
> > >
> > > 1) Changed sched_yield() semantics.  sched_yield() has changed
> > >    dramatically, and it can seriously impact existing applications.
> > >    A testcase (this is on 2.5.46, UP, no preempt):
> > >
> > >    make -j3 bzImage
> > >    wait 30 seconds
> > >    ^C
> > >    make clean (OK, it's all in cache)
> > >    start StarOffice 5.2
> > >    make -j3 bzImage
> > >    wait 5 seconds
> > >    now click on the SO5.2 `File' menu.
> > >
> > >    It takes ~15 seconds for the menu to appear, and >30 seconds for
> > >    it to go away.  The application is wholly unusable for the duration
> > >    of the compilation.
> > 
> > please try with my tree.
> 
> It is greatly improved.  It is still not as smooth as the standard 2.4
> scheduler, but I'd characterise it as "a bit jerky" rather than "makes
> me want to punch a hole in the monitor".

;)

Thanks for taking the time to test it btw.

> 
> The difference is unlikely to be noticed by many.  (But it should be
> _better_ than stock 2.4)

it can't be better on SMP because, due to its scalability feature, we
completely lose track of the global SMP state and can only keep track of
the single per-cpu queue. Was it on SMP or UP? I guess on SMP. On UP
sched_yield in my tree should be equivalent to stock 2.4 (modulo RT, which
is still broken in sched_yield with the o1 scheduler; I'm fixing this
these days but it doesn't matter for your test). If it was UP then it
was probably some other less aggressive dynamic priority effect and not
the sched_yield that made the difference.

> 
> > ...
> > > 2) The interactivity estimator makes inappropriate decisions.
> > >
> > >    Test case:
> > >
> > >    start a kernel compile as above
> > >    grab an xterm and waggle it about a lot.
> > >
> > >    The amount of waggling depends on the video hardware (I think).  One
> > >    of my machines (nVidia NV15) needs a huge amount of vigorous waggling.
> > >    Another machine (voodoo III) just needs a little waggle.
> > >
> > >    When you've waggled enough, the scheduler decides that the X server
> > >    is a `batch' process and schedules it alongside the background
> > >    compilation.  Everything goes silly.  The mouse cursor sticks stationary
> > >    for 0.5-1.0 seconds, then takes great leaps across the screen.  Unusable.
> > >    You have to stop using the machine for five seconds or so, wait for the
> > >    X server to flip back into `interactive' mode.
> > >
> > >    This also affects netscape 4.x mailnews.  Start the kernel compile,
> > >    then select a new (large) folder.  Netscape will consume maybe one
> > >    second CPU doing the initial processing on that folder, which is
> > >    enough for the system to decide it's a "batch" process.  The user
> > >    interface seizes up and you have to wait five seconds for it to be
> > >    treated as an "interactive" process again before you can do anything.
> > >
> > >    It also affects gdb.  Start a kernel compile, then run `gdb vmlinux'.
> > >    The initial processing which gdb does on the executable is enough for it
> > >    to be treated as a batch process and the subsequent interactive session
> > >    is comatose for several seconds.  Same deal.
> > >
> > >    This one needs fixing in 2.5.  Please.  It's very irritating.
> > 
> > can you reproduce with my tree?
> 
> Again, hugely improved over normal O(1) behaviour, but not as responsive
> as the stock 2.4 scheduler.
> 
> With a `make -j1' running:
> 
> - Normal O(1) behaviour in StarOffice 5.2 is 15-30 second delays between
>   actions.
> 
> - With 2.4.20aa1, typing into a text document typically had a 2-3 character
>   delay.
> 
> - With the standard 2.4 scheduler the delay is zero characters.

again, I guess that's SMP and that's quite a pain to fix it to be 100%
equivalent to 2.4 without hurting scalability. sched_yield in my tree is
still fully scalable, so it has no knowledge of the global SMP state;
that is left to the load-balancing code only (not sched_yield). So
userspace could spin in sched_yield for some time on one cpu until some
balancing happens, or until the other cpu expires the timeslice of its
running task and lets the lock holder run. This wouldn't happen in stock
2.4: there the holder would run on the cpu that ran sched_yield.

However, on UP my tree should be just as responsive as mainline 2.4
in terms of sched_yield.

At the moment, fixing the brokenness of sched_yield with RT tasks in the
o1 scheduler is a higher priority than trying to make sched_yield as
smart as mainline 2.4 on SMP. Using load_balance() in sched_yield may
not work well either (it's an off-by-one issue and load_balance only
kicks in from off-by-two), so it would be quite tedious work to
make sched_yield aware of the whole SMP system and to pick tasks from
other per-cpu queues if there's nothing left to run on the local cpu.
And the current sched_yield proved to work well enough on SMP too, for
quite obvious reasons (the main problem is when the lock holder lives on
the same cpu as the sched_yield caller, and that's solved now).

> 
> So StarOffice 5.2 is still a bit uncomfortable to use with 2.4.20aa1.

as said, it could be the SMP sched_yield thing (not an issue on UP),
but it could also be some less aggressive dynamic priority; that's
tunable and doesn't sound like a showstopper. My first priority is that
the heuristics in the scheduler are sane, make perfect sense and
don't fall apart in corner cases. If we dropped some minor heuristic
that catches more interactive threads in the dynamic priority, that's not
a showstopper: it can be re-added later, or it may only need a tuning
effort on some variable like
MAX_SLEEP_AVG/MAX_TIMESLICE/PRIO_BONUS_RATIO/INTERACTIVE_DELTA. For
example, just increasing PRIO_BONUS_RATIO from 25% to 50% would lead to
more interactive processes getting higher dynamic priority levels.
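
To make the PRIO_BONUS_RATIO example concrete, here is a small sketch of
the bonus arithmetic, modelled loosely on the effective_prio() calculation
in the O(1) scheduler.  It is an approximation under stated assumptions:
MAX_USER_PRIO is taken to be 40, the bonus is assumed to scale linearly
with sleep_avg, and prio_bonus() is a hypothetical helper written for this
example, not a kernel function.

```c
#define MAX_USER_PRIO 40	/* assumed width of the user priority range */

/*
 * ratio is PRIO_BONUS_RATIO in percent; sleep_avg runs from 0 (never
 * sleeps) to max_sleep_avg (sleeps almost all the time).  The result is
 * centred on zero, so CPU hogs get a penalty and sleepers get a boost.
 */
static int prio_bonus(int ratio, long sleep_avg, long max_sleep_avg)
{
	int span = MAX_USER_PRIO * ratio / 100;

	return (int)(span * sleep_avg / max_sleep_avg) - span / 2;
}
```

Under these assumptions a ratio of 25 gives a bonus between -5 and +5
priority levels, and a ratio of 50 widens that to between -10 and +10, so
a task that sleeps a lot can be lifted further above CPU-bound tasks.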

Overall I don't see any showstopper with openoffice (or staroffice) on
my version of the o1 scheduler.

Andrea


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03 20:49       ` Andrew Morton
@ 2002-12-03 21:09         ` Martin J. Bligh
  2002-12-04  0:09           ` Andrea Arcangeli
  2002-12-04  0:06         ` Andrea Arcangeli
  1 sibling, 1 reply; 45+ messages in thread
From: Martin J. Bligh @ 2002-12-03 21:09 UTC (permalink / raw)
  To: Andrew Morton, Andrea Arcangeli; +Cc: Christoph Hellwig, rml, linux-kernel

>> please try with my tree.
>
> It is greatly improved.  It is still not as smooth as the standard 2.4
> scheduler, but I'd characterise it as "a bit jerky" rather than "makes
> me want to punch a hole in the monitor".
>
> The difference is unlikely to be noticed by many.  (But it should be
> _better_ than stock 2.4)

...

>> can you reproduce with my tree?
>
> Again, hugely improved over normal O(1) behaviour, but not as responsive
> as the stock 2.4 scheduler.

Andrea, which patches in your tree are the ones that fix this?
If it's the big-monster one ... any chance you could split out
the bits that actually fix it? I'd love to be able to apply your fixes
to 2.5 and try them there ....

Thanks,

M.



* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 19:50     ` Andrea Arcangeli
@ 2002-12-03 20:49       ` Andrew Morton
  2002-12-03 21:09         ` Martin J. Bligh
  2002-12-04  0:06         ` Andrea Arcangeli
  0 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2002-12-03 20:49 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Martin J. Bligh, Christoph Hellwig, marcelo, rml, linux-kernel

Andrea Arcangeli wrote:
> 
> On Mon, Dec 02, 2002 at 11:30:05AM -0800, Andrew Morton wrote:
> ...
> > I have observed two problems with the new scheduler, both serious IMO:
> >
> > 1) Changed sched_yield() semantics.  sched_yield() has changed
> >    dramatically, and it can seriously impact existing applications.
> >    A testcase (this is on 2.5.46, UP, no preempt):
> >
> >    make -j3 bzImage
> >    wait 30 seconds
> >    ^C
> >    make clean (OK, it's all in cache)
> >    start StarOffice 5.2
> >    make -j3 bzImage
> >    wait 5 seconds
> >    now click on the SO5.2 `File' menu.
> >
> >    It takes ~15 seconds for the menu to appear, and >30 seconds for
> >    it to go away.  The application is wholly unusable for the duration
> >    of the compilation.
> 
> please try with my tree.

It is greatly improved.  It is still not as smooth as the standard 2.4
scheduler, but I'd characterise it as "a bit jerky" rather than "makes
me want to punch a hole in the monitor".

The difference is unlikely to be noticed by many.  (But it should be
_better_ than stock 2.4)

> ...
> > 2) The interactivity estimator makes inappropriate decisions.
> >
> >    Test case:
> >
> >    start a kernel compile as above
> >    grab an xterm and waggle it about a lot.
> >
> >    The amount of waggling depends on the video hardware (I think).  One
> >    of my machines (nVidia NV15) needs a huge amount of vigorous waggling.
> >    Another machine (voodoo III) just needs a little waggle.
> >
> >    When you've waggled enough, the scheduler decides that the X server
> >    is a `batch' process and schedules it alongside the background
> >    compilation.  Everything goes silly.  The mouse cursor sticks stationary
> >    for 0.5-1.0 seconds, then takes great leaps across the screen.  Unusable.
> >    You have to stop using the machine for five seconds or so, wait for the
> >    X server to flip back into `interactive' mode.
> >
> >    This also affects netscape 4.x mailnews.  Start the kernel compile,
> >    then select a new (large) folder.  Netscape will consume maybe one
> >    second CPU doing the initial processing on that folder, which is
> >    enough for the system to decide it's a "batch" process.  The user
> >    interface seizes up and you have to wait five seconds for it to be
> >    treated as an "interactive" process again before you can do anything.
> >
> >    It also affects gdb.  Start a kernel compile, then run `gdb vmlinux'.
> >    The initial processing which gdb does on the executable is enough for it
> >    to be treated as a batch process and the subsequent interactive session
> >    is comatose for several seconds.  Same deal.
> >
> >    This one needs fixing in 2.5.  Please.  It's very irritating.
> 
> can you reproduce with my tree?

Again, hugely improved over normal O(1) behaviour, but not as responsive
as the stock 2.4 scheduler.

With a `make -j1' running:

- Normal O(1) behaviour in StarOffice 5.2 is 15-30 second delays between
  actions.

- With 2.4.20aa1, typing into a text document typically had a 2-3 character
  delay.

- With the standard 2.4 scheduler the delay is zero characters.

So StarOffice 5.2 is still a bit uncomfortable to use with 2.4.20aa1.


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:50 ` Martin J. Bligh
  2002-12-02 18:50   ` Adrian Bunk
  2002-12-02 19:30   ` Andrew Morton
@ 2002-12-03  1:11   ` Christoph Hellwig
  2002-12-02 18:59     ` Robert Love
  2002-12-02 22:47     ` Alan Cox
  2 siblings, 2 replies; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-03  1:11 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: marcelo, rml, linux-kernel

On Mon, Dec 02, 2002 at 09:50:50AM -0800, Martin J. Bligh wrote:
> There was talk of merging the O(1) scheduler into 2.4 at OLS.
> If every distro has it, and 2.5 has it, and it's been around for
> this long, I think that proves it stable.
> 
> Marcelo, what are the chances of getting this merged into mainline
> in the 2.4.20 timeframe?

Ingo vetoed it.



* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:24 ` Jeff Garzik
  2002-12-02 18:57   ` Robert Love
@ 2002-12-03  0:51   ` Christoph Hellwig
  1 sibling, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-03  0:51 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: marcelo, rml, linux-kernel

On Mon, Dec 02, 2002 at 12:24:49PM -0500, Jeff Garzik wrote:
> Adding to that, it is also used for backporting Ingo's workqueue stuff, 
> which is useful and completely separate from the O(1) scheduler.

Hey, that's my next patch :)



* [PATCH] set_cpus_allowed() for 2.4
@ 2002-12-03  0:26 Christoph Hellwig
  2002-12-02 17:24 ` Jeff Garzik
  2002-12-02 17:50 ` Martin J. Bligh
  0 siblings, 2 replies; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-03  0:26 UTC (permalink / raw)
  To: marcelo, rml; +Cc: linux-kernel

now that all commercial vendors ship a backport of Ingo's O(1) scheduler,
external projects like XFS have to track those projects in addition to the
mainline kernel.

Having the common new APIs available in mainline would be a very good thing
for those projects.  We already have a proper yield() in 2.4.20, but the
set_cpus_allowed() API as used e.g. for kernelthreads bound to CPUs is
still missing.

Any chance you could apply Robert Love's patch to add it for 2.4.21?  Note
that it does not change any existing code but just adds that interface.
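
For readers who want the gist of the check the patch performs, here is a
userspace model of its mask arithmetic.  It is a sketch, not kernel code:
struct toy_task, clip_to_online() and must_migrate() are hypothetical
names, and it assumes the 2.4 convention that cpus_runnable is 1UL << cpu
while a task is running and ~0UL while it is not.

```c
#include <assert.h>

/* Hypothetical stand-in for the two task_struct fields the patch reads. */
struct toy_task {
	unsigned long cpus_allowed;	/* CPUs the task may run on */
	unsigned long cpus_runnable;	/* 1UL << cpu if running, else ~0UL */
};

/* First step of the patch: clip the requested mask to the online CPUs. */
static unsigned long clip_to_online(unsigned long new_mask,
				    unsigned long online_map)
{
	new_mask &= online_map;
	assert(new_mask != 0);	/* the patch uses BUG_ON(!new_mask) here */
	return new_mask;
}

/* Nonzero only when the task is running on a CPU outside its mask. */
static int must_migrate(const struct toy_task *p)
{
	return !(p->cpus_runnable & p->cpus_allowed);
}
```

Under that assumption a task that is not currently running never needs to
be moved, because ~0UL intersects every non-empty mask; only a task that
is executing on a now-forbidden CPU triggers the reschedule path.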


diff -urN linux-2.4.20-pre8/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.4.20-pre8/include/linux/sched.h	Mon Sep 30 17:41:22 2002
+++ linux/include/linux/sched.h	Tue Oct  1 18:35:28 2002
@@ -163,6 +164,12 @@
 extern int start_context_thread(void);
 extern int current_is_keventd(void);
 
+#if CONFIG_SMP
+extern void set_cpus_allowed(struct task_struct *p, unsigned long new_mask);
+#else
+# define set_cpus_allowed(p, new_mask) do { } while (0)
+#endif
+
 /*
  * The default fd array needs to be at least BITS_PER_LONG,
  * as this is the granularity returned by copy_fdset().
diff -urN linux-2.4.20-pre8/kernel/ksyms.c linux/kernel/ksyms.c
--- linux-2.4.20-pre8/kernel/ksyms.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/ksyms.c	Tue Oct  1 18:34:41 2002
@@ -451,6 +451,9 @@
 EXPORT_SYMBOL(interruptible_sleep_on_timeout);
 EXPORT_SYMBOL(schedule);
 EXPORT_SYMBOL(schedule_timeout);
+#if CONFIG_SMP
+EXPORT_SYMBOL(set_cpus_allowed);
+#endif
 EXPORT_SYMBOL(yield);
 EXPORT_SYMBOL(__cond_resched);
 EXPORT_SYMBOL(jiffies);
diff -urN linux-2.4.20-pre8/kernel/sched.c linux/kernel/sched.c
--- linux-2.4.20-pre8/kernel/sched.c	Mon Sep 30 17:41:22 2002
+++ linux/kernel/sched.c	Tue Oct  1 18:54:49 2002
@@ -850,6 +850,46 @@
 
 void scheduling_functions_end_here(void) { }
 
+#if CONFIG_SMP
+
+/**
+ * set_cpus_allowed() - change a given task's processor affinity
+ * @p: task to bind
+ * @new_mask: bitmask of allowed processors
+ *
+ * Upon return, the task is running on a legal processor.  Note the caller
+ * must have a valid reference to the task: it must not exit() prematurely.
+ * This call can sleep; do not hold locks on call.
+ */
+void set_cpus_allowed(struct task_struct *p, unsigned long new_mask)
+{
+	new_mask &= cpu_online_map;
+	BUG_ON(!new_mask);
+
+	p->cpus_allowed = new_mask;
+
+	/*
+	 * If the task is on a no-longer-allowed processor, we need to move
+	 * it.  If the task is not current, then set need_resched and send
+	 * its processor an IPI to reschedule.
+	 */
+	if (!(p->cpus_runnable & p->cpus_allowed)) {
+		if (p != current) {
+			p->need_resched = 1;
+			smp_send_reschedule(p->processor);
+		}
+		/*
+		 * Wait until we are on a legal processor.  If the task is
+		 * current, then we should be on a legal processor the next
+		 * time we reschedule.  Otherwise, we need to wait for the IPI.
+		 */
+		while (!(p->cpus_runnable & p->cpus_allowed))
+			schedule();
+	}
+}
+
+#endif /* CONFIG_SMP */
+
 #ifndef __alpha__
 
 /*


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03  1:11   ` Christoph Hellwig
  2002-12-02 18:59     ` Robert Love
@ 2002-12-02 22:47     ` Alan Cox
  2002-12-02 22:38       ` Christoph Hellwig
  1 sibling, 1 reply; 45+ messages in thread
From: Alan Cox @ 2002-12-02 22:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin J. Bligh, marcelo, rml, Linux Kernel Mailing List

On Tue, 2002-12-03 at 01:11, Christoph Hellwig wrote:
> On Mon, Dec 02, 2002 at 09:50:50AM -0800, Martin J. Bligh wrote:
> > There was talk of merging the O(1) scheduler into 2.4 at OLS.
> > If every distro has it, and 2.5 has it, and it's been around for
> > this long, I think that proves it stable.
> > 
> > Marcelo, what are the chances of getting this merged into mainline
> > in the 2.4.20 timeframe?
> 
> Ingo vetoed it.

I wasn't aware Ingo had a veto



* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 22:38       ` Christoph Hellwig
@ 2002-12-02 22:41         ` Robert Love
  2002-12-07 16:55           ` bill davidsen
  0 siblings, 1 reply; 45+ messages in thread
From: Robert Love @ 2002-12-02 22:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alan Cox, Martin J. Bligh, marcelo, Linux Kernel Mailing List

On Mon, 2002-12-02 at 17:38, Christoph Hellwig wrote:
> On Mon, Dec 02, 2002 at 10:47:28PM +0000, Alan Cox wrote:
> > > Ingo vetoed it.
> > 
> > I wasn't aware Ingo had a veto
> 
> It's not exactly considered nice to merge code against the intention
> of its author.  (which doesn't mean it's impossible, of course)

Ingo did explicitly mention he thought the O(1) scheduler was not 2.4
material.  Whether this has changed, e.g. due to stabilization of the
scheduler, I do not know.  But I do recall he had an opinion in the
past.

	Robert Love



* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 22:47     ` Alan Cox
@ 2002-12-02 22:38       ` Christoph Hellwig
  2002-12-02 22:41         ` Robert Love
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2002-12-02 22:38 UTC (permalink / raw)
  To: Alan Cox; +Cc: Martin J. Bligh, marcelo, rml, Linux Kernel Mailing List

On Mon, Dec 02, 2002 at 10:47:28PM +0000, Alan Cox wrote:
> > Ingo vetoed it.
> 
> I wasn't aware Ingo had a veto

It's not exactly considered nice to merge code against the intention
of its author.  (which doesn't mean it's impossible, of course)


* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 19:30   ` Andrew Morton
@ 2002-12-02 19:50     ` Andrea Arcangeli
  2002-12-03 20:49       ` Andrew Morton
  2002-12-08 13:23     ` Ingo Molnar
  1 sibling, 1 reply; 45+ messages in thread
From: Andrea Arcangeli @ 2002-12-02 19:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Martin J. Bligh, Christoph Hellwig, marcelo, rml, linux-kernel

On Mon, Dec 02, 2002 at 11:30:05AM -0800, Andrew Morton wrote:
> "Martin J. Bligh" wrote:
> > 
> > > now that all commercial vendors ship a backport of Ingo's O(1)
> > > scheduler external projects like XFS have to track those projects
> > > in addition to the mainline kernel.
> > 
> > There was talk of merging the O(1) scheduler into 2.4 at OLS.
> > If every distro has it, and 2.5 has it, and it's been around for
> > this long, I think that proves it stable.
> > 
> 
> I have observed two problems with the new scheduler, both serious IMO:
> 
> 1) Changed sched_yield() semantics.  sched_yield() has changed
>    dramatically, and it can seriously impact existing applications.
>    A testcase (this is on 2.5.46, UP, no preempt):
> 
>    make -j3 bzImage
>    wait 30 seconds
>    ^C
>    make clean	(OK, it's all in cache)
>    start StarOffice 5.2
>    make -j3 bzImage
>    wait 5 seconds
>    now click on the SO5.2 `File' menu.
> 
>    It takes ~15 seconds for the menu to appear, and >30 seconds for
>    it to go away.  The application is wholly unusable for the duration
>    of the compilation.

Please try with my tree.  Besides the sched_yield issue, the o1 scheduler
also wastes around 60% of the total CPU power on a multi-way SMP machine
in some workloads with frequent wakeups, unless it has the fixes in my
tree that allow idle CPUs to be rescheduled properly.  HZ=1000 probably
hides the problem a little in 2.5 and in the RHAS, but the problem
definitely remains.  That is by far the worst design bug in the o1
scheduler IMHO; it's almost unknown, and it's completely fixed in my tree
as far as I know from the numbers.

>    [..] Rumour has
>    it that this is happening inside the pthread library.

probably.

>    Arguably, the new sched_yield() is correct and the old one wasn't,
>    but the effects of this change make it unsuitable for a 2.4 merge.

Yes.  Just use my tree and you'll be fine in 2.4 with the o1 scheduler.
If you cut-and-paste my most recent sched-yield version to 2.5 you'll be
fine too.

> 2) The interactivity estimator makes inappropriate decisions.
> 
>    Test case:
> 
>    start a kernel compile as above
>    grab an xterm and waggle it about a lot.
> 
>    The amount of waggling depends on the video hardware (I think).  One
>    of my machines (nVidia NV15) needs a huge amount of vigorous waggling.
>    Another machine (voodoo III) just needs a little waggle.
> 
>    When you've waggled enough, the scheduler decides that the X server
>    is a `batch' process and schedules it alongside the background
>    compilation.  Everything goes silly.  The mouse cursor sticks stationary
>    for 0.5-1.0 seconds, then takes great leaps across the screen.  Unusable.
>    You have to stop using the machine for five seconds or so, wait for the
>    X server to flip back into `interactive' mode.
> 
>    This also affects netscape 4.x mailnews.  Start the kernel compile,
>    then select a new (large) folder.  Netscape will consume maybe one
>    second CPU doing the initial processing on that folder, which is
>    enough for the system to decide it's a "batch" process.  The user
>    interface seizes up and you have to wait five seconds for it to be
>    treated as an "interactive" process again before you can do anything.
> 
>    It also affects gdb.  Start a kernel compile, then run `gdb vmlinux'.
>    The initial processing which gdb does on the executable is enough for it
>    to be treated as a batch process and the subsequent interactive session
>    is comatose for several seconds.  Same deal.
> 
>    This one needs fixing in 2.5.  Please.  It's very irritating.

Can you reproduce with my tree?  (Please test on 2.4.20rc1aa1 or
2.4.20rc2aa1 + the fix I posted yesterday for a deadlock in the inode
writeback.  I'll soon upload a 2.4.20aa1 with that bugfix included,
probably tonight.)

In short, if you want to use o1 in 2.4, make 200% sure you use my tree, or
at the very least merge all my fixes, which AFAIK at the moment aren't
included in any other 2.4 or 2.5 o1 scheduler patch out there.  At least
unless you're fine with running with various slowdowns.

Andrea

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:50 ` Martin J. Bligh
  2002-12-02 18:50   ` Adrian Bunk
@ 2002-12-02 19:30   ` Andrew Morton
  2002-12-02 19:50     ` Andrea Arcangeli
  2002-12-08 13:23     ` Ingo Molnar
  2002-12-03  1:11   ` Christoph Hellwig
  2 siblings, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2002-12-02 19:30 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Christoph Hellwig, marcelo, rml, linux-kernel

"Martin J. Bligh" wrote:
> 
> > now that all commercial vendors ship a backport of Ingo's O(1)
> > scheduler external projects like XFS have to track those projects
> > in addition to the mainline kernel.
> 
> There was talk of merging the O(1) scheduler into 2.4 at OLS.
> If every distro has it, and 2.5 has it, and it's been around for
> this long, I think that proves it stable.
> 

I have observed two problems with the new scheduler, both serious IMO:

1) Changed sched_yield() semantics.  sched_yield() has changed
   dramatically, and it can seriously impact existing applications.
   A testcase (this is on 2.5.46, UP, no preempt):

   make -j3 bzImage
   wait 30 seconds
   ^C
   make clean	(OK, it's all in cache)
   start StarOffice 5.2
   make -j3 bzImage
   wait 5 seconds
   now click on the SO5.2 `File' menu.

   It takes ~15 seconds for the menu to appear, and >30 seconds for
   it to go away.  The application is wholly unusable for the duration
   of the compilation.

   This is because StarOffice is spinning on sched_yield().  Rumour has
   it that this is happening inside the pthread library.

   This will affect other things, both in-kernel and out.  This includes
   ext3, which uses yield() in its transaction batching.  ext3's fsync()
   operation performs dreadfully with the new yield() if there are
   compute-intensive things happening at the same time.  If people are
   shipping that sched_yield() implementation without having changed ext3,
   then they will receive bug reports against this.

   Arguably, the new sched_yield() is correct and the old one wasn't,
   but the effects of this change make it unsuitable for a 2.4 merge.

2) The interactivity estimator makes inappropriate decisions.

   Test case:

   start a kernel compile as above
   grab an xterm and waggle it about a lot.

   The amount of waggling depends on the video hardware (I think).  One
   of my machines (nVidia NV15) needs a huge amount of vigorous waggling.
   Another machine (voodoo III) just needs a little waggle.

   When you've waggled enough, the scheduler decides that the X server
   is a `batch' process and schedules it alongside the background
   compilation.  Everything goes silly.  The mouse cursor sticks stationary
   for 0.5-1.0 seconds, then takes great leaps across the screen.  Unusable.
   You have to stop using the machine for five seconds or so, wait for the
   X server to flip back into `interactive' mode.

   This also affects netscape 4.x mailnews.  Start the kernel compile,
   then select a new (large) folder.  Netscape will consume maybe one
   second CPU doing the initial processing on that folder, which is
   enough for the system to decide it's a "batch" process.  The user
   interface seizes up and you have to wait five seconds for it to be
   treated as an "interactive" process again before you can do anything.

   It also affects gdb.  Start a kernel compile, then run `gdb vmlinux'.
   The initial processing which gdb does on the executable is enough for it
   to be treated as a batch process and the subsequent interactive session
   is comatose for several seconds.  Same deal.

   This one needs fixing in 2.5.  Please.  It's very irritating.
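
For readers unfamiliar with the pattern in problem 1, a minimal userspace
sketch of "spinning on sched_yield()" follows.  It is not StarOffice or
pthread-library code; the helper name spin_on_yield() and the max_spins
cap are hypothetical, added only so that the loop terminates in a
demonstration.

```c
#include <sched.h>
#include <stdatomic.h>

/*
 * Hypothetical helper: poll `flag`, yielding the CPU between checks.
 * Returns the number of sched_yield() calls made, capped at max_spins.
 */
static int spin_on_yield(atomic_int *flag, int max_spins)
{
    int spins = 0;

    /* Keep giving up the CPU until another thread sets the flag. */
    while (!atomic_load(flag) && spins < max_spins) {
        sched_yield();
        spins++;
    }
    return spins;
}
```

A loop like this is only cheap if sched_yield() returns quickly.  If the
scheduler instead makes the caller wait behind every other runnable task
(an assumption about the new behaviour, not something stated in this
thread), each iteration can cost a full pass through the compile jobs,
which would be consistent with the multi-second stalls reported above.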

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 18:50   ` Adrian Bunk
@ 2002-12-02 19:12     ` Robert Love
  0 siblings, 0 replies; 45+ messages in thread
From: Robert Love @ 2002-12-02 19:12 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Martin J. Bligh, linux-kernel

On Mon, 2002-12-02 at 13:50, Adrian Bunk wrote:

> The kernel images in Debian don't have the O(1) scheduler.

At least we should be happy they have a 2.4 kernel available ;-)

*ducks*

	Robert Love


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03  1:11   ` Christoph Hellwig
@ 2002-12-02 18:59     ` Robert Love
  2002-12-02 22:47     ` Alan Cox
  1 sibling, 0 replies; 45+ messages in thread
From: Robert Love @ 2002-12-02 18:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Martin J. Bligh, marcelo, linux-kernel

On Mon, 2002-12-02 at 20:11, Christoph Hellwig wrote:

> > Marcelo, what are the chances of getting this merged into mainline
> > in the 2.4.20 timeframe?
> 
> Ingo vetoed it.

I did too.  I know the distributors (including the one I work for) want
it, but it's a big change and very much a 2.5 thing.

I would not be against tuning the 2.4 scheduler, though.  But the 
changes to architecture-dependent code mean it may not even work on one
or two architectures (e.g. cris, maybe?) and so I am against the whole
O(1) scheduler and all of that supporting code for 2.4 proper.

	Robert Love


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:24 ` Jeff Garzik
@ 2002-12-02 18:57   ` Robert Love
  2002-12-03  0:51   ` Christoph Hellwig
  1 sibling, 0 replies; 45+ messages in thread
From: Robert Love @ 2002-12-02 18:57 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Christoph Hellwig, marcelo, linux-kernel

On Mon, 2002-12-02 at 12:24, Jeff Garzik wrote:

> Adding to that, it is also used for backporting Ingo's workqueue stuff, 
> which is useful and completely separate from the O(1) scheduler.

That is why I back-ported it - hch and you mentioned it was needed for
workqueues :)

It also simplifies the processor affinity syscalls (same code I did for
2.5, in fact), which I plan on submitting to Marcelo soon.

	Robert Love
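
For context, the set_cpus_allowed() prototype in the patch takes the
permitted CPUs as an unsigned long bitmask.  A minimal sketch of how such
a mask is typically built and tested follows; cpu_to_mask() and
cpu_in_mask() are hypothetical helper names, not part of the patch or of
any kernel API.

```c
#include <limits.h>

/* Number of CPUs a single unsigned long mask can describe. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: mask with only bit `cpu` set, or 0 if out of range. */
static unsigned long cpu_to_mask(unsigned int cpu)
{
    if (cpu >= BITS_PER_ULONG)
        return 0UL;
    return 1UL << cpu;
}

/* Hypothetical helper: nonzero if `cpu` is permitted by `mask`. */
static int cpu_in_mask(unsigned long mask, unsigned int cpu)
{
    if (cpu >= BITS_PER_ULONG)
        return 0;
    return (int)((mask >> cpu) & 1UL);
}
```

Under these assumptions, a kernel thread wanting to bind itself to CPU 2
would presumably call set_cpus_allowed(current, cpu_to_mask(2)).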


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-02 17:50 ` Martin J. Bligh
@ 2002-12-02 18:50   ` Adrian Bunk
  2002-12-02 19:12     ` Robert Love
  2002-12-02 19:30   ` Andrew Morton
  2002-12-03  1:11   ` Christoph Hellwig
  2 siblings, 1 reply; 45+ messages in thread
From: Adrian Bunk @ 2002-12-02 18:50 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: linux-kernel

On Mon, Dec 02, 2002 at 09:50:50AM -0800, Martin J. Bligh wrote:

> There was talk of merging the O(1) scheduler into 2.4 at OLS.
> If every distro has it, and 2.5 has it, and it's been around for
>...

The kernel images in Debian don't have the O(1) scheduler.

> Thanks,
> 
> M.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03  0:26 Christoph Hellwig
  2002-12-02 17:24 ` Jeff Garzik
@ 2002-12-02 17:50 ` Martin J. Bligh
  2002-12-02 18:50   ` Adrian Bunk
                     ` (2 more replies)
  1 sibling, 3 replies; 45+ messages in thread
From: Martin J. Bligh @ 2002-12-02 17:50 UTC (permalink / raw)
  To: Christoph Hellwig, marcelo, rml; +Cc: linux-kernel

> now that all commercial vendors ship a backport of Ingo's O(1) 
> scheduler external projects like XFS have to track those projects 
> in addition to the mainline kernel.

There was talk of merging the O(1) scheduler into 2.4 at OLS.
If every distro has it, and 2.5 has it, and it's been around for
this long, I think that proves it stable.

Marcelo, what are the chances of getting this merged into mainline
in the 2.4.20 timeframe?

Thanks,

M.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] set_cpus_allowed() for 2.4
  2002-12-03  0:26 Christoph Hellwig
@ 2002-12-02 17:24 ` Jeff Garzik
  2002-12-02 18:57   ` Robert Love
  2002-12-03  0:51   ` Christoph Hellwig
  2002-12-02 17:50 ` Martin J. Bligh
  1 sibling, 2 replies; 45+ messages in thread
From: Jeff Garzik @ 2002-12-02 17:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: marcelo, rml, linux-kernel

Christoph Hellwig wrote:
> now that all commercial vendors ship a backport of Ingo's O(1) scheduler
> external projects like XFS have to track those projects in addition to the
> mainline kernel.
> 
> Having the common new APIs available in mainline would be a very good thing
> for those projects.  We already have a proper yield() in 2.4.20, but the
> set_cpus_allowed() API as used e.g. for kernelthreads bound to CPUs is
> still missing.
> 
> Any chance you could apply Robert Love's patch to add it for 2.4.21?  Note
> that it does not change any existing code but just adds that interface.


Adding to that, it is also used for backporting Ingo's workqueue stuff, 
which is useful and completely separate from the O(1) scheduler.

I plan on using workqueues for moving some drivers' duties to process 
context where they really belong [which in turn fixes bugs].

	Jeff





^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2002-12-13 21:34 UTC | newest]

Thread overview: 45+ messages
2002-10-01 23:03 [PATCH] set_cpus_allowed() for 2.4 Robert Love
2002-10-02 13:01 ` Christoph Hellwig
2002-10-02 15:00   ` Robert Love
2002-11-05  3:37 ` Christoph Hellwig
2002-11-06 15:32   ` Adrian Bunk
2002-11-07 21:42     ` Christoph Hellwig
2002-12-02 17:12   ` Mikael Pettersson
2002-12-03  0:51     ` Christoph Hellwig
2002-12-02 17:47       ` Mikael Pettersson
2002-12-02 19:10         ` Robert Love
2002-12-03  0:26 Christoph Hellwig
2002-12-02 17:24 ` Jeff Garzik
2002-12-02 18:57   ` Robert Love
2002-12-03  0:51   ` Christoph Hellwig
2002-12-02 17:50 ` Martin J. Bligh
2002-12-02 18:50   ` Adrian Bunk
2002-12-02 19:12     ` Robert Love
2002-12-02 19:30   ` Andrew Morton
2002-12-02 19:50     ` Andrea Arcangeli
2002-12-03 20:49       ` Andrew Morton
2002-12-03 21:09         ` Martin J. Bligh
2002-12-04  0:09           ` Andrea Arcangeli
2002-12-04  0:06         ` Andrea Arcangeli
2002-12-04  0:30           ` Andrew Morton
2002-12-04  0:42             ` Andrea Arcangeli
2002-12-04  1:03               ` William Lee Irwin III
2002-12-04  9:25                 ` William Lee Irwin III
2002-12-04  1:14               ` Andrew Morton
2002-12-04  1:21                 ` Andrea Arcangeli
2002-12-04  2:14                   ` Andrew Morton
2002-12-06 18:11                 ` William Lee Irwin III
2002-12-08 13:23     ` Ingo Molnar
2002-12-08 19:56       ` Andrew Morton
2002-12-09 20:13         ` Ingo Molnar
2002-12-03  1:11   ` Christoph Hellwig
2002-12-02 18:59     ` Robert Love
2002-12-02 22:47     ` Alan Cox
2002-12-02 22:38       ` Christoph Hellwig
2002-12-02 22:41         ` Robert Love
2002-12-07 16:55           ` bill davidsen
2002-12-09  3:02 Jim Houston
2002-12-09 20:19 kernel
2002-12-13 23:08 Christoph Hellwig
2002-12-13 21:34 ` Adrian Bunk
2002-12-14  4:55   ` Christoph Hellwig
