linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] Re: Negative scalability by removal of
       [not found] <3A06C007.99EE3746@uow.edu.au>
@ 2000-11-06 17:17 ` Alan Cox
  2000-11-06 17:48 ` Linus Torvalds
  1 sibling, 0 replies; 13+ messages in thread
From: Alan Cox @ 2000-11-06 17:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, Linus Torvalds, linux-kernel

> It's a 16-liner!  I'll cheerfully admit that this patch
> may be completely broken, but hey, it's free.  I suggest
> that _something_ has to be done for 2.2 now, because
> Apache has switched to unserialised accept().

Interesting

> The fact that the throughput is 3-4 times worse for 2, 3, 4 and 5
> server processes is completely weird.  Perhaps some strange miss
> pattern, but it doesn't do it on 2.4.  I'll dump this problem
> onto the netdev list, see if anyone has any bright ideas.

That would be consistent with the fact that thttpd is single threaded and
kicks apache for performance in 2.2 (less so 2.4!)

> 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: [PATCH] Re: Negative scalability by removal of
       [not found] <3A06C007.99EE3746@uow.edu.au>
  2000-11-06 17:17 ` [PATCH] Re: Negative scalability by removal of Alan Cox
@ 2000-11-06 17:48 ` Linus Torvalds
  2000-11-07  5:23   ` dean gaudet
  2000-11-07 12:54   ` Andrew Morton
  1 sibling, 2 replies; 13+ messages in thread
From: Linus Torvalds @ 2000-11-06 17:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, linux-kernel



On Tue, 7 Nov 2000, Andrew Morton wrote:

> Alan Cox wrote:
> > 
> > > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
> > 
> > Do we really want to retrofit wake_one to 2.2? I know I'm not terribly keen to
> > try and backport all the mechanism. I think for 2.2 using the semaphore is a
> > good approach. It's a hack to fix an old OS kernel. For 2.4 it's not needed.
> 
> It's a 16-liner!  I'll cheerfully admit that this patch
> may be completely broken, but hey, it's free.  I suggest
> that _something_ has to be done for 2.2 now, because
> Apache has switched to unserialised accept().

This is why I'd love to _not_ see silly work-arounds in apache: we
obviously _can_ fix the places where our performance sucks, but only if we
don't have other band-aids hiding the true issues.

For example, with a file-locking apache, we'd have to fix the (noticeably
harder) file locking thing to be wake-one instead, and even then we'd
never be able to do as well as something that gets the same wake-one thing
without the two extra system calls.
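
A minimal sketch of the pattern under discussion, assuming an fcntl()-style
lock on a shared lock file. The helper names (accept_lock, serialized_accept,
unserialized_accept) are hypothetical and this is not Apache's actual code; it
only shows where the two extra system calls per connection come from.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: take (F_WRLCK) or drop (F_UNLCK) a whole-file lock.
 * F_SETLKW blocks until the lock can be granted. */
static int accept_lock(int lock_fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                           /* 0 means "to end of file" */
    return fcntl(lock_fd, F_SETLKW, &fl);
}

/* Serialised accept: only the lock holder sleeps in accept(). */
static int serialized_accept(int listen_fd, int lock_fd)
{
    int conn;

    if (accept_lock(lock_fd, F_WRLCK) < 0)  /* extra system call 1 */
        return -1;
    conn = accept(listen_fd, NULL, NULL);
    accept_lock(lock_fd, F_UNLCK);          /* extra system call 2 */
    return conn;
}

/* Unserialised accept: every worker sleeps in accept() directly and
 * relies on the kernel to wake only one of them per connection. */
static int unserialized_accept(int listen_fd)
{
    return accept(listen_fd, NULL, NULL);
}
```

With the serialised variant the other workers queue up on the lock rather than
on the socket, so a thundering herd moves to the lock wakeup instead of
disappearing.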

The patch looks superficially fine to me, although it does seem to add
another cache-line to the wakeup setup - it might be worthwhile to have
the exclusive state closer. But maybe I just didn't count right.

		Linus



* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-07  5:23   ` dean gaudet
@ 2000-11-07  5:14     ` David S. Miller
  2000-11-07  9:27       ` dean gaudet
  0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2000-11-07  5:14 UTC (permalink / raw)
  To: dean-list-linux-kernel; +Cc: torvalds, andrewm, alan, linux-kernel

   Date: 	Mon, 6 Nov 2000 21:23:57 -0800 (PST)
   From: dean gaudet <dean-list-linux-kernel@arctic.org>

     apache is about correctness first, and performance second.

Which is why we say it is "incorrect" for apache to try
and work around kernel performance problems. :-)))

Later,
David S. Miller
davem@redhat.com

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-06 17:48 ` Linus Torvalds
@ 2000-11-07  5:23   ` dean gaudet
  2000-11-07  5:14     ` David S. Miller
  2000-11-07 12:54   ` Andrew Morton
  1 sibling, 1 reply; 13+ messages in thread
From: dean gaudet @ 2000-11-07  5:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Alan Cox, linux-kernel

On Mon, 6 Nov 2000, Linus Torvalds wrote:

> This is why I'd love to _not_ see silly work-arounds in apache

hey, maybe it's time for me to repeat something that i'm often quoted as
saying:

  apache is about correctness first, and performance second.

i don't think that's silly personally.  remember most websites can be
served fine off an anemic 486/33 with one ethernet port tied behind its
back while doing a three legged race with a 6502 up a hill in san
francisco during el nino.

don't let the benchmarks fool ya!  it's generally more important that a
server be able to fork perl and parse CGIs fast than it is for it to
accept an extra 1000 conns/s.

apache-1.3.15 defines SINGLE_LISTEN_UNSERIALIZED_ACCEPT on linux 2.2 and
later.  dunno when the release date will be... someone go find a security
flaw and it'll push up the release ;)  (p.s. and rbb promised to forward
the change into 2.0 and rse said he'd forward the change into mm, all of
which were based off the same code.)

-dean


* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-07  5:14     ` David S. Miller
@ 2000-11-07  9:27       ` dean gaudet
  0 siblings, 0 replies; 13+ messages in thread
From: dean gaudet @ 2000-11-07  9:27 UTC (permalink / raw)
  To: David S. Miller; +Cc: torvalds, andrewm, alan, linux-kernel

haha, ok!  :)

(well i'm sure you know the history, but for others -- that code entered
apache not specifically for linux... but specifically for handling the
many early-to-mid 90s unixes that just plain broke on multiple accept :)

-dean

On Mon, 6 Nov 2000, David S. Miller wrote:

>    Date: 	Mon, 6 Nov 2000 21:23:57 -0800 (PST)
>    From: dean gaudet <dean-list-linux-kernel@arctic.org>
> 
>      apache is about correctness first, and performance second.
> 
> Which is why we say it is "incorrect" for apache to try
> and work around kernel performance problems. :-)))
> 
> Later,
> David S. Miller
> davem@redhat.com

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-06 17:48 ` Linus Torvalds
  2000-11-07  5:23   ` dean gaudet
@ 2000-11-07 12:54   ` Andrew Morton
  2000-11-07 13:52     ` Alan Cox
  2000-11-21  2:06     ` lamont
  1 sibling, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2000-11-07 12:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Alan Cox, linux-kernel

Linus Torvalds wrote:
> 
> On Tue, 7 Nov 2000, Andrew Morton wrote:
> 
> > Alan Cox wrote:
> > >
> > > > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
> > >
> > > > Do we really want to retrofit wake_one to 2.2? I know I'm not terribly keen to
> > > > try and backport all the mechanism. I think for 2.2 using the semaphore is a
> > > > good approach. It's a hack to fix an old OS kernel. For 2.4 it's not needed.
> >
> > It's a 16-liner!  I'll cheerfully admit that this patch
> > may be completely broken, but hey, it's free.  I suggest
> > that _something_ has to be done for 2.2 now, because
> > Apache has switched to unserialised accept().
> 
> This is why I'd love to _not_ see silly work-arounds in apache: we
> obviously _can_ fix the places where our performance sucks, but only if we
> don't have other band-aids hiding the true issues.
> 
> For example, with a file-locking apache, we'd have to fix the (noticeably
> harder) file locking thing to be wake-one instead, and even then we'd
> never be able to do as well as something that gets the same wake-one thing
> without the two extra system calls.
> 
> The patch looks superficially fine to me, although it does seem to add
> another cache-line to the wakeup setup - it might be worthwhile to have
> the exclusive state closer. But maybe I just didn't count right.

Your counting's fine.  But I figured the third cacheline was OK
because we're going to need that in add_to_runqueue() a few
cycles later.

Anyway, version 2 below uses LIFO for the accept() wakeups.  This
appears to be a 5%-10% win for Apache.  The browsing loop for
exclusive tasks will now pull in cachelines 0 and 2, rather
than the previous 0 and 1.

--- linux-2.2.18-pre19/include/linux/sched.h	Sun Nov  5 11:46:54 2000
+++ linux-akpm/include/linux/sched.h	Tue Nov  7 20:20:13 2000
@@ -79,6 +79,7 @@
 #define TASK_ZOMBIE		4
 #define TASK_STOPPED		8
 #define TASK_SWAPPING		16
+#define TASK_EXCLUSIVE		32
 
 /*
  * Scheduling policies
@@ -251,6 +252,7 @@
 	struct task_struct *next_task, *prev_task;
 	struct task_struct *next_run,  *prev_run;
 
+	unsigned int task_exclusive;	/* task wants wake-one semantics in __wake_up() */
 /* task state */
 	struct linux_binfmt *binfmt;
 	int exit_code, exit_signal;
@@ -370,6 +372,7 @@
 /* counter */	DEF_PRIORITY,DEF_PRIORITY,0, \
 /* SMP */	0,0,0,-1, \
 /* schedlink */	&init_task,&init_task, &init_task, &init_task, \
+/* task_exclusive */ 0, \
 /* binfmt */	NULL, \
 /* ec,brk... */	0,0,0,0,0,0, \
 /* pid etc.. */	0,0,0,0,0, \
@@ -496,8 +499,8 @@
 						    signed long timeout));
 extern void FASTCALL(wake_up_process(struct task_struct * tsk));
 
-#define wake_up(x)			__wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE)
-#define wake_up_interruptible(x)	__wake_up((x),TASK_INTERRUPTIBLE)
+#define wake_up(x)			__wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE | TASK_EXCLUSIVE)
+#define wake_up_interruptible(x)	__wake_up((x),TASK_INTERRUPTIBLE | TASK_EXCLUSIVE)
 
 #define __set_current_state(state_value)	do { current->state = state_value; } while (0)
 #ifdef __SMP__
--- linux-2.2.18-pre19/kernel/sched.c	Sun Nov  5 11:46:54 2000
+++ linux-akpm/kernel/sched.c	Tue Nov  7 20:23:25 2000
@@ -890,8 +890,9 @@
  */
 void __wake_up(struct wait_queue **q, unsigned int mode)
 {
-	struct task_struct *p;
+	struct task_struct *p, *last_exclusive;
 	struct wait_queue *head, *next;
+	unsigned int do_exclusive;
 
         if (!q)
 		goto out;
@@ -906,10 +907,17 @@
 	if (!next)
 		goto out_unlock;
 
+	last_exclusive = NULL;
+	do_exclusive = mode & TASK_EXCLUSIVE;
 	while (next != head) {
 		p = next->task;
 		next = next->next;
 		if (p->state & mode) {
+			if (do_exclusive && p->task_exclusive) {
+				last_exclusive = p;
+				continue;
+			}
+
 			/*
 			 * We can drop the read-lock early if this
 			 * is the only/last process.
@@ -922,6 +930,8 @@
 			wake_up_process(p);
 		}
 	}
+	if (last_exclusive)
+		wake_up_process(last_exclusive);
 out_unlock:
 	read_unlock(&waitqueue_lock);
 out:
--- linux-2.2.18-pre19/net/ipv4/tcp.c	Sun Nov  5 11:46:54 2000
+++ linux-akpm/net/ipv4/tcp.c	Tue Nov  7 20:20:13 2000
@@ -1619,6 +1619,7 @@
 	struct wait_queue wait = { current, NULL };
 	struct open_request *req;
 
+	current->task_exclusive = 1;
 	add_wait_queue(sk->sleep, &wait);
 	for (;;) {
 		current->state = TASK_INTERRUPTIBLE;
@@ -1632,6 +1633,8 @@
 			break;
 	}
 	current->state = TASK_RUNNING;
+	wmb();
+	current->task_exclusive = 0;
 	remove_wait_queue(sk->sleep, &wait);
 	return req;
 }
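
The scan that version 2 adds to __wake_up() can be sketched in user space as
below. This is a simplified model, not kernel code: struct task, wake_up_sim()
and the array standing in for the wait queue are assumptions made for
illustration, and the array order does not model the real queue's insertion
order. Non-exclusive sleepers are all woken, while of the exclusive ones only
the last one seen by the scan is woken.

```c
#include <stddef.h>

#define TASK_RUNNING        0
#define TASK_INTERRUPTIBLE  1
#define TASK_EXCLUSIVE      32  /* mode bit only, never stored in state */

/* Hypothetical stand-in for the two task_struct fields the patch uses. */
struct task {
    unsigned int state;
    unsigned int task_exclusive;
};

/* Walk the queue once; return how many tasks were woken. */
static int wake_up_sim(struct task **queue, size_t n, unsigned int mode)
{
    struct task *last_exclusive = NULL;
    unsigned int do_exclusive = mode & TASK_EXCLUSIVE;
    int woken = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        struct task *p = queue[i];

        if (!(p->state & mode))
            continue;               /* not sleeping in a matching state */
        if (do_exclusive && p->task_exclusive) {
            last_exclusive = p;     /* remember it, keep scanning */
            continue;
        }
        p->state = TASK_RUNNING;    /* ordinary sleeper: always woken */
        woken++;
    }
    if (last_exclusive) {           /* wake exactly one exclusive sleeper */
        last_exclusive->state = TASK_RUNNING;
        woken++;
    }
    return woken;
}
```

The full walk is what makes the exclusive wakeup O(N) in the number of
sleepers, since the scan cannot stop at the first exclusive task it finds.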

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-07 12:54   ` Andrew Morton
@ 2000-11-07 13:52     ` Alan Cox
  2000-11-21  2:06     ` lamont
  1 sibling, 0 replies; 13+ messages in thread
From: Alan Cox @ 2000-11-07 13:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, Alan Cox, linux-kernel

> Anyway, version 2 below uses LIFO for the accept() wakeups.  This
> appears to be a 5%-10% win for Apache.  The browsing loop for
> exclusive tasks will now pull in cachelines 0 and 2, rather
> than the previous 0 and 1.

That makes it much worse for the newest CPUs, which use 64-byte lines (Athlon
and PIV)

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-07 12:54   ` Andrew Morton
  2000-11-07 13:52     ` Alan Cox
@ 2000-11-21  2:06     ` lamont
  1 sibling, 0 replies; 13+ messages in thread
From: lamont @ 2000-11-21  2:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, Alan Cox, linux-kernel


there's already the Linux Scalability Project's wake_one() patch for 2.2.9
(which applies fine to 2.2.18preX):

http://www.citi.umich.edu/projects/linux-scalability/patches/p_accept-2.2.9.diff




* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-05 20:21   ` dean gaudet
@ 2000-11-05 22:43     ` Alan Cox
  0 siblings, 0 replies; 13+ messages in thread
From: Alan Cox @ 2000-11-05 22:43 UTC (permalink / raw)
  To: dean gaudet; +Cc: Alan Cox, Linus Torvalds, linux-kernel

> oh, someone reminded me of the other reason sysvsems suck:  a cgi can grab
> the semaphore and hold it, causing a DoS.  of course folks could, and
> should use suexec/cgiwrap to avoid this.

The same cgi can killall -STOP httpd

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-04 10:54 ` [PATCH] Re: Negative scalability by removal of Alan Cox
  2000-11-04 17:22   ` Linus Torvalds
@ 2000-11-05 20:21   ` dean gaudet
  2000-11-05 22:43     ` Alan Cox
  1 sibling, 1 reply; 13+ messages in thread
From: dean gaudet @ 2000-11-05 20:21 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linus Torvalds, linux-kernel

the numbers didn't look that bad for the small numbers of concurrent
clients on 2.2... a few % slower without the serialisation.  compared to
orders of magnitude slower with large numbers of concurrent clients.

oh, someone reminded me of the other reason sysvsems suck:  a cgi can grab
the semaphore and hold it, causing a DoS.  of course folks could, and
should use suexec/cgiwrap to avoid this.

-dean

On Sat, 4 Nov 2000, Alan Cox wrote:

> > Even 2.2.x can be fixed to do the wake-one for accept(), if required. 
> 
> Do we really want to retrofit wake_one to 2.2? I know I'm not terribly keen to
> try and backport all the mechanism. I think for 2.2 using the semaphore is a
> good approach. It's a hack to fix an old OS kernel. For 2.4 it's not needed.

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-04 17:22   ` Linus Torvalds
@ 2000-11-05 16:22     ` Andrea Arcangeli
  0 siblings, 0 replies; 13+ messages in thread
From: Andrea Arcangeli @ 2000-11-05 16:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Alan Cox, linux-kernel

On Sat, Nov 04, 2000 at 09:22:58AM -0800, Linus Torvalds wrote:
> We don't need to backport the full exclusive wait queues: we could do
> the equivalent of the semaphore inside the kernel around just accept(). It
> wouldn't be a generic thing, but it would fix the specific case of
> accept().

The first wake-one patch floating around was against 2.2.x waitqueues; it's
a very simple patch and it fixes the problem. It also gives LIFO to accept,
with the downside that it needs to do an O(N) browse of the waitqueue before
doing the exclusive wakeup. 2.4.x does the wake-one task selection in O(1)
if everybody is sleeping in accept, but unfortunately it does that FIFO.

The real problem, which DaveM knows well, is that TCP in 2.2.x will end up doing
three wakeups every time the socket moves from LISTEN to ESTABLISHED state, so
it was really doing a wake-three, not a wake-one :). So the part that needs
real thought is fixing TCP, not the scheduler/waitqueue part.

Andrea

* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-04 10:54 ` [PATCH] Re: Negative scalability by removal of Alan Cox
@ 2000-11-04 17:22   ` Linus Torvalds
  2000-11-05 16:22     ` Andrea Arcangeli
  2000-11-05 20:21   ` dean gaudet
  1 sibling, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2000-11-04 17:22 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel



On Sat, 4 Nov 2000, Alan Cox wrote:
>
> > Even 2.2.x can be fixed to do the wake-one for accept(), if required. 
> 
> Do we really want to retrofit wake_one to 2.2? I know I'm not terribly keen to
> try and backport all the mechanism. I think for 2.2 using the semaphore is a
> good approach. It's a hack to fix an old OS kernel. For 2.4 it's not needed.

We don't need to backport the full exclusive wait queues: we could do
the equivalent of the semaphore inside the kernel around just accept(). It
wouldn't be a generic thing, but it would fix the specific case of
accept().

Otherwise we're going to have old binaries of apache lying around forever
that do the wrong thing..

		Linus


* Re: [PATCH] Re: Negative scalability by removal of
  2000-11-04  6:23 [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9) Linus Torvalds
@ 2000-11-04 10:54 ` Alan Cox
  2000-11-04 17:22   ` Linus Torvalds
  2000-11-05 20:21   ` dean gaudet
  0 siblings, 2 replies; 13+ messages in thread
From: Alan Cox @ 2000-11-04 10:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

> Even 2.2.x can be fixed to do the wake-one for accept(), if required. 

Do we really want to retrofit wake_one to 2.2? I know I'm not terribly keen to
try and backport all the mechanism. I think for 2.2 using the semaphore is a
good approach. It's a hack to fix an old OS kernel. For 2.4 it's not needed.

end of thread, other threads:[~2000-11-21  2:38 UTC | newest]

Thread overview: 13+ messages
     [not found] <3A06C007.99EE3746@uow.edu.au>
2000-11-06 17:17 ` [PATCH] Re: Negative scalability by removal of Alan Cox
2000-11-06 17:48 ` Linus Torvalds
2000-11-07  5:23   ` dean gaudet
2000-11-07  5:14     ` David S. Miller
2000-11-07  9:27       ` dean gaudet
2000-11-07 12:54   ` Andrew Morton
2000-11-07 13:52     ` Alan Cox
2000-11-21  2:06     ` lamont
2000-11-04  6:23 [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9) Linus Torvalds
2000-11-04 10:54 ` [PATCH] Re: Negative scalability by removal of Alan Cox
2000-11-04 17:22   ` Linus Torvalds
2000-11-05 16:22     ` Andrea Arcangeli
2000-11-05 20:21   ` dean gaudet
2000-11-05 22:43     ` Alan Cox
