linux-kernel.vger.kernel.org archive mirror
* Re: Strange load spikes on 2.4.19 kernel
       [not found] <Pine.LNX.4.33.0210130202070.17395-100000@coffee.psychology.mcmaster.ca>
@ 2002-10-13  6:34 ` Rob Mueller
  2002-10-13  7:01   ` Joseph D. Wagner
  2002-10-13 12:31   ` Marius Gedminas
  0 siblings, 2 replies; 41+ messages in thread
From: Rob Mueller @ 2002-10-13  6:34 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-kernel, Jeremy Howard


We've just discovered that this is actually happening on another one of our
machines as well. This machine uses 2.4.18 kernel, ext3 and has 2 SCSI
drives and 2 IDE drives which hold the main user mailbox data. Also it's a
P3 with a completely different motherboard rather than an Athlon, so it
doesn't seem to be hardware related in that respect.

> well, it's conceivable that if something is blocking a bunch
> of procs, it would also block your shell and ps, so not show up.
> "vmstat 1" might work better, though it's munging plenty of
> /proc files so is hardly immune to that.

But the ps/shell would only use CPU time, and there's plenty of that
available. Unless it was something everything would block on... what could
that be? A spin lock or something? Some interrupt routine? Would that even
result in other processes being counted as blocked processes?

Also, let me do a calculation, though I have no idea if this is right or
not...
a) the first item in the uptime output is 'system load average for the last
1 minute'
b) it seems to only update/recalculate every 5 seconds
c) it jumps from < 1 to 20 in 1 interval (eg 5 seconds)

This means that for it to jump from < 1 to 20 in 5 seconds, there must be on
average about 60/5 * 20 = 240 processes blocked over those 5 seconds waiting
for run time of some sort for the load to jump 20 points. Is that right?

> but it's worth asking: do you notice a hiccup other than by looking
> at the loadav?  that is, suppose the loadav is simply miscalculated...

Well yes, there definitely does seem to be a performance hit on the whole
system when the load jumps. Everything feeling significantly more
'sluggish' during the spikes is the best way I can describe it right now...

Rob


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:01   ` Joseph D. Wagner
@ 2002-10-13  7:01     ` David S. Miller
  2002-10-13  7:49       ` Joseph D. Wagner
  2002-10-13  8:59     ` Anton Blanchard
  1 sibling, 1 reply; 41+ messages in thread
From: David S. Miller @ 2002-10-13  7:01 UTC (permalink / raw)
  To: wagnerjd; +Cc: robm, hahn, linux-kernel, jhoward

   From: "Joseph D. Wagner" <wagnerjd@prodigy.net>
   Date: Sun, 13 Oct 2002 02:01:44 -0500

   I'll let you in on a dirty little secret.  The Linux file system does
   not utilize SMP.  That's right.  All file processes go through one and
   only one processor.  It has to do with the fact that the Linux kernel is
   a non-preemptive kernel.

Not true, page cache accesses (translation: read and write)
go through the page cache which is fully multi-threaded.

Allocating blocks and inodes, yes that is currently single
threaded on SMP.  But there is no fundamental reason for that,
we just haven't gotten around to threading that bit yet.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  6:34 ` Strange load spikes on 2.4.19 kernel Rob Mueller
@ 2002-10-13  7:01   ` Joseph D. Wagner
  2002-10-13  7:01     ` David S. Miller
  2002-10-13  8:59     ` Anton Blanchard
  2002-10-13 12:31   ` Marius Gedminas
  1 sibling, 2 replies; 41+ messages in thread
From: Joseph D. Wagner @ 2002-10-13  7:01 UTC (permalink / raw)
  To: 'Rob Mueller', 'Mark Hahn'
  Cc: linux-kernel, 'Jeremy Howard'

I'll let you in on a dirty little secret.  The Linux file system does
not utilize SMP.  That's right.  All file processes go through one and
only one processor.  It has to do with the fact that the Linux kernel is
a non-preemptive kernel.

Linus, in his infinite wisdom, made a "strategiery" decision that it
would be better for one process to be able to grind your machine to a
halt than to redo and rework sections of the kernel that don't allow for
preemption.

Try switching kernels to the Linux Kernel Preemption Project:
http://sourceforge.net/projects/kpreempt



^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:01     ` David S. Miller
@ 2002-10-13  7:49       ` Joseph D. Wagner
  2002-10-13  7:50         ` David S. Miller
  2002-10-16 21:00         ` Bill Davidsen
  0 siblings, 2 replies; 41+ messages in thread
From: Joseph D. Wagner @ 2002-10-13  7:49 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: robm, hahn, linux-kernel, jhoward

>> I'll let you in on a dirty little secret.  The Linux file
>> system does not utilize SMP.  That's right.  All file
>> processes go through one and only one processor.  It has
>> to do with the fact that the Linux kernel is a non-preemptive
>> kernel.

> Not true, page cache accesses (translation: read and write)
> go through the page cache which is fully multi-threaded.

> Allocating blocks and inodes, yes that is currently single
> threaded on SMP.

Now wait a minute!  Allocating blocks and inodes is an integral part of
a write.  Oh sure the actual writing of file data is SMP, but that
process is bottlenecked by single threaded allocation of blocks and
inodes.  Perhaps I could phrase what I said to be more technically
accurate by saying, "Writing makes such poor use of multi-threading on
SMP that in terms of performance it's as if it was single threaded."

> But there is no fundamental reason for that, we just haven't
> gotten around to threading that bit yet.

Oh yes there is.  What if an allocation of blocks and/or inodes is
preempted?  Another thread could attempt to allocate the same set of
blocks and/or inodes.

This isn't a problem on a uniprocessor system because only one processor
can access any data structure at any given time.

However, on SMP two kernel control paths running on different CPUs could
concurrently access the same data structure.  There's two ways to deal
with this error: 1) the lazy way and the way Linus decided to go was to
run block and inode allocation through one single thread, or 2) the
better way is to preempt the other process which would require a) a
preemptive kernel and b) synchronization (and as a programmer I can tell
you that synchronization gets messy if not thoroughly designed and
implemented).

Rather than go through all the work of rewriting the kernel to be
preemptive and significantly improving synchronization routines (a lot
of work), Linus chose to solve the problem by avoiding it, rather than
dealing with it as he should have.

If you don't believe me, prove me wrong.  Write the code.  If you ever
got your @$$ around to it, you'd see that I'm right.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:49       ` Joseph D. Wagner
@ 2002-10-13  7:50         ` David S. Miller
  2002-10-13  8:16           ` Joseph D. Wagner
  2002-10-16 21:00         ` Bill Davidsen
  1 sibling, 1 reply; 41+ messages in thread
From: David S. Miller @ 2002-10-13  7:50 UTC (permalink / raw)
  To: wagnerjd; +Cc: robm, hahn, linux-kernel, jhoward

   From: "Joseph D. Wagner" <wagnerjd@prodigy.net>
   Date: Sun, 13 Oct 2002 02:49:55 -0500

   > But there is no fundamental reason for that, we just haven't
   > gotten around to threading that bit yet.
   
   Oh yes there is.  What if an allocation of blocks and/or inodes is
   preempted?  Another thread could attempt to allocate the same set of
   blocks and/or inodes.
   
That's why we protect the allocation with SMP locking primitives
which under Linux prevent preemption.

This isn't rocket science, the IP networking is fully threaded for
example and I consider that about as hard to thread as something like
ext2/ext3 inode/block allocation.

Also, as Andi Kleen noted, it's actually filesystem dependent whether
the inode/block allocation is threaded or not.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:16           ` Joseph D. Wagner
@ 2002-10-13  8:13             ` David S. Miller
  2002-10-13  8:40               ` Joseph D. Wagner
  0 siblings, 1 reply; 41+ messages in thread
From: David S. Miller @ 2002-10-13  8:13 UTC (permalink / raw)
  To: wagnerjd; +Cc: robm, hahn, linux-kernel, jhoward

   From: "Joseph D. Wagner" <wagnerjd@prodigy.net>
   Date: Sun, 13 Oct 2002 03:16:30 -0500

   "SMP locking primitives"? Tell me what that is again?  Oh yeah!  That's
   when the kernel basically gives SMP a timeout and behaves as if there
   was only one processor.
   
   So in effect, I was right.  File processes really do use one and only
   one processor.
   
Not true.  While a block is being allocated on mounted filesystem X
on one cpu, a TCP packet can be being processed on another processor and
a block can be allocated on mounted filesystem Y on another processor.

Actually, it can even be threaded to the point where block allocations
on the same filesystem can occur in parallel as long as it is being
done for different block groups.

So in effect, you're not so right.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:50         ` David S. Miller
@ 2002-10-13  8:16           ` Joseph D. Wagner
  2002-10-13  8:13             ` David S. Miller
  0 siblings, 1 reply; 41+ messages in thread
From: Joseph D. Wagner @ 2002-10-13  8:16 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: robm, hahn, linux-kernel, jhoward

>>> But there is no fundamental reason for that, we just haven't
>>> gotten around to threading that bit yet.
   
>> Oh yes there is.  What if an allocation of blocks and/or
>> inodes is preempted?  Another thread could attempt to
>> allocate the same set of blocks and/or inodes.
   
> That's why we protect the allocation with SMP locking
> primitives which under Linux prevent preemption.

"SMP locking primitives"? Tell me what that is again?  Oh yeah!  That's
when the kernel basically gives SMP a timeout and behaves as if there
was only one processor.

So in effect, I was right.  File processes really do use one and only
one processor.

> This isn't rocket science....

I agree.  I totally agree.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:13             ` David S. Miller
@ 2002-10-13  8:40               ` Joseph D. Wagner
  2002-10-13  8:45                 ` David S. Miller
                                   ` (4 more replies)
  0 siblings, 5 replies; 41+ messages in thread
From: Joseph D. Wagner @ 2002-10-13  8:40 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: robm, hahn, linux-kernel, jhoward

> Not true.  While a block is being allocated on mounted
> filesystem X on one cpu, a TCP packet can be being
> processed on another processor and a block can be allocated
> on mounted filesystem Y on another processor.

Does anyone besides me notice that the more Dave and I argue the longer
and longer the list of extenuating circumstances gets in order for Dave
to continue to be right?

In this email, I'm not right if the data is on separate partitions.

Dave, do you realize how many people, despite advice to the contrary,
put everything on one honk'in / partition?  For all those people, I'm
right.

Dave, you're confusing the rule with the exceptions to the rule.  I'm
right as a general rule, and you're pointing out all the exceptions to
the rule to try to prove that I'm wrong.

> Actually, it can even be threaded to the point where
> block allocations on the same filesystem can occur in
> parallel as long as it is being done for different
> block groups.

Prove it.  If you can code multi-threading SMP block and inode
allocation using a non-preemptive kernel (which Linux is) ON THE SAME
PARTITION, I will eat my hard drive.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:40               ` Joseph D. Wagner
@ 2002-10-13  8:45                 ` David S. Miller
  2002-10-13  8:48                 ` Mike Galbraith
                                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 41+ messages in thread
From: David S. Miller @ 2002-10-13  8:45 UTC (permalink / raw)
  To: wagnerjd; +Cc: robm, hahn, linux-kernel, jhoward

   From: "Joseph D. Wagner" <wagnerjd@prodigy.net>
   Date: Sun, 13 Oct 2002 03:40:51 -0500

   If you can code multi-threading SMP block and inode
   allocation using a non-preemptive kernel (which Linux is) ON THE SAME
   PARTITION, I will eat my hard drive.
   
First, what you're asking me for is already occurring in the reiserfs
and xfs code in 2.5.x.

Now onto ext2/ext3 where it doesn't exactly happen now.

It can easily be done using the SMP atomic bit operations we have in
the kernel.  On many cpus (x86 is one) it would thus reduce to one
atomic instruction to allocate a block or inode anywhere on the
filesystem, no locks even needed to make it atomic.

Allocating a block/inode is just a compare and set operation after
all.  The block/inode maps in ext2/ext3 are already just plain
bitmaps suitable for sending to the SMP bit operations we have.

It's very doable and I've even discussed this with Stephen Tweedie
and others in the past.

I think I bring some credibility to the table, being that I worked on
threading the entire Linux networking stack.  You can choose to disagree. :)

Why hasn't it been done?  Because ext2/ext3 block allocation
synchronization isn't showing up high on anyone's profiles at all
since the operations are so short and quick that the lock is dropped
almost immediately after it is taken.  And it's not like people aren't
running large workloads on 16-way and higher NUMA boxes in 2.5.x.
Copying the data around and doing the I/O eats the bulk of the
computing cycles.

And if you're of the "numbers talk, bullshit walks" variety just have
a look at the Linux specweb99 submissions if you don't believe the
Linux kernel can scale quite well.  Show us something that scales
better than what we have now if you're going to say we suck.  We do
suck, just a lot less than most implementations. :-)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:40               ` Joseph D. Wagner
  2002-10-13  8:45                 ` David S. Miller
@ 2002-10-13  8:48                 ` Mike Galbraith
  2002-10-13  8:48                   ` David S. Miller
  2002-10-13  8:51                 ` William Lee Irwin III
                                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 41+ messages in thread
From: Mike Galbraith @ 2002-10-13  8:48 UTC (permalink / raw)
  To: Joseph D. Wagner, 'David S. Miller'
  Cc: robm, hahn, linux-kernel, jhoward

At 03:40 AM 10/13/2002 -0500, Joseph D. Wagner wrote:
> > Not true.  While a block is being allocated on mounted
> > filesystem X on one cpu, a TCP packet can be being
> > processed on another processor and a block can be allocated
> > on mounted filesystem Y on another processor.
>
>Does anyone besides me notice that the more Dave and I argue the longer
>and longer the list of extenuating circumstances gets in order for Dave
>to continue to be right?

Nope.  You seem to think "threaded" means there can be no critical sections.

         -Mike



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:48                 ` Mike Galbraith
@ 2002-10-13  8:48                   ` David S. Miller
  0 siblings, 0 replies; 41+ messages in thread
From: David S. Miller @ 2002-10-13  8:48 UTC (permalink / raw)
  To: efault; +Cc: wagnerjd, robm, hahn, linux-kernel, jhoward

   From: Mike Galbraith <efault@gmx.de>
   Date: Sun, 13 Oct 2002 10:48:21 +0200
   
   You seem to think "threaded" means there can be no critical sections.

And as I mention in my other email, the allocation can be
broken down into a single instruction's worth of critical section.
Which is as good as his version of "threaded" could be.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:40               ` Joseph D. Wagner
  2002-10-13  8:45                 ` David S. Miller
  2002-10-13  8:48                 ` Mike Galbraith
@ 2002-10-13  8:51                 ` William Lee Irwin III
  2002-10-13 10:20                 ` Ingo Molnar
  2002-10-13 19:42                 ` Rik van Riel
  4 siblings, 0 replies; 41+ messages in thread
From: William Lee Irwin III @ 2002-10-13  8:51 UTC (permalink / raw)
  To: Joseph D. Wagner
  Cc: 'David S. Miller', robm, hahn, linux-kernel, jhoward

On Sun, Oct 13, 2002 at 03:40:51AM -0500, Joseph D. Wagner wrote:
> Does anyone besides me notice that the more Dave and I argue the longer
> and longer the list of extenuating circumstances gets in order for Dave
> to continue to be right?
> In this email, I'm not right if the data is on separate partitions.
> Dave, do you realize how many people, despite advice to the contrary,
> put everything on one honk'in / partition?  For all those people, I'm

There are enough managers to put up with without asinine browbeating
about a feature whose design is already done and implementation
underway not being implemented, posted, and merged yet.

If this is not happening fast enough for your tastes you should write
the code yourself instead of hounding those actually doing it.

I don't have time for this.

*plonk*


Bill

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:01   ` Joseph D. Wagner
  2002-10-13  7:01     ` David S. Miller
@ 2002-10-13  8:59     ` Anton Blanchard
  2002-10-13  9:26       ` William Lee Irwin III
  1 sibling, 1 reply; 41+ messages in thread
From: Anton Blanchard @ 2002-10-13  8:59 UTC (permalink / raw)
  To: Joseph D. Wagner
  Cc: 'Rob Mueller', 'Mark Hahn',
	linux-kernel, 'Jeremy Howard'


> I'll let you in on a dirty little secret.  The Linux file system does
> not utilize SMP.  That's right.  All file processes go through one and
> only one processor.  It has to do with the fact that the Linux kernel is
> a non-preemptive kernel.

My 24 way SMP disagrees with your analysis:

http://samba.org/~anton/linux/2.5.40/dbench/

That's just ext2. dbench is a filesystem benchmark that is heavy on
inode/block allocation.

Please show us your profiles which show linux filesystems do not
utilise SMP.

Anton

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:59     ` Anton Blanchard
@ 2002-10-13  9:26       ` William Lee Irwin III
  0 siblings, 0 replies; 41+ messages in thread
From: William Lee Irwin III @ 2002-10-13  9:26 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Joseph D. Wagner, 'Rob Mueller', 'Mark Hahn',
	linux-kernel, 'Jeremy Howard'

On Sun, Oct 13, 2002 at 06:59:38PM +1000, Anton Blanchard wrote:
> My 24 way SMP disagrees with your analysis:
> http://samba.org/~anton/linux/2.5.40/dbench/
> Thats just ext2. dbench is a filesystem benchmark that is heavy on
> inode/block allocation.
> Please show us your profiles which show linux filesystems do not
> utilise SMP.

Low-level fs driver block allocation etc. does appear to be an issue in
the fs-intensive benchmarks I run. In fact, it's the only remaining
serious lock contention issue besides the dcache_lock, and that's
solved in akpm's tree. I have assurances work is being done on block
allocation and am not too concerned about it.

The rest of the trouble I see is lock contention in the page allocator
(solved in akpm's tree), stability (%$#*!), scheduler/VM/vfs/block I/O
data structure space consumption, and raw cpu cost of various
algorithms. pmd's are particularly pernicious (dmc had something for
this), followed by buffer_heads, task_structs, names_cache (mostly an
artifact of the benchmarks, but worth fixing), and inodes.

Last, but not least, when OOM does occur, the algorithm for OOM
recovery does not degrade well in the presence of many tasks. There is
also an issue with the arrival rate to out_of_memory() being O(cpus)
and the OOM killer being based on arrival rates, but not scaling its
thresholds appropriately. The former means that the OOM killer is
triggered falsely, and the latter means the box is unresponsive for so
long during OOM kill sprees that it is effectively dead.

Bill

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:40               ` Joseph D. Wagner
                                   ` (2 preceding siblings ...)
  2002-10-13  8:51                 ` William Lee Irwin III
@ 2002-10-13 10:20                 ` Ingo Molnar
  2002-10-13 19:42                 ` Rik van Riel
  4 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2002-10-13 10:20 UTC (permalink / raw)
  To: Joseph D. Wagner
  Cc: 'David S. Miller', robm, hahn, linux-kernel, jhoward


On Sun, 13 Oct 2002, Joseph D. Wagner wrote:

> Does anyone besides me notice that the more Dave and I argue the longer
> and longer the list of extenuating circumstances gets in order for Dave
> to continue to be right?

(what i have noticed is that the more you argue with David over this
issue, the more silly your arguments get. Please take further discussions
of this topic to: http://kernelnewbies.org/)

	Ingo


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  6:34 ` Strange load spikes on 2.4.19 kernel Rob Mueller
  2002-10-13  7:01   ` Joseph D. Wagner
@ 2002-10-13 12:31   ` Marius Gedminas
  1 sibling, 0 replies; 41+ messages in thread
From: Marius Gedminas @ 2002-10-13 12:31 UTC (permalink / raw)
  To: linux-kernel

On Sun, Oct 13, 2002 at 04:34:21PM +1000, Rob Mueller wrote:
> Also Let me do a calculation, though I have no idea if this is right or
> not...
> a) the first item in the uptime output is 'system load average for the last
> 1 minute'
> b) it seems to only update/recalculate every 5 seconds
> c) it jumps from < 1 to 20 in 1 interval (eg 5 seconds)
> 
> This means that for it to jump from < 1 to 20 in 5 seconds, there must be on
> average about 60/5 * 20 = 240 processes blocked over those 5 seconds waiting
> for run time of some sort for the load to jump 20 points. Is that right?

Load is an exponential average, recalculated according to this formula
(see CALC_LOAD in sched.h) every five seconds:

  load1 = load1 * exp + n * (1 - exp)

where exp = 1/exp(5sec/1min) ~= 1884/2048 ~= 0.92
      n = the number of running tasks at the moment

To jump from 0.21 to 27.65 in 5 seconds (1 update), n would have to be
343.  Wow.  (Substituting the numbers for 5 and 15 minute averages I get
n of about 362 and 352).

Can somebody check my math?

Marius Gedminas
-- 
Never trust a computer you can't repair yourself.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  8:40               ` Joseph D. Wagner
                                   ` (3 preceding siblings ...)
  2002-10-13 10:20                 ` Ingo Molnar
@ 2002-10-13 19:42                 ` Rik van Riel
  4 siblings, 0 replies; 41+ messages in thread
From: Rik van Riel @ 2002-10-13 19:42 UTC (permalink / raw)
  To: Joseph D. Wagner
  Cc: 'David S. Miller', robm, hahn, linux-kernel, jhoward

On Sun, 13 Oct 2002, Joseph D. Wagner wrote:

> Prove it.  If you can code multi-threading SMP block and inode
> allocation using a non-preemptive kernel (which Linux is) ON THE SAME
> PARTITION, I will eat my hard drive.

Try the XFS patch.

Do you prefer ketchup or mustard ?

Rik
-- 
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/		http://distro.conectiva.com/
Current spamtrap:  <a href=mailto:"october@surriel.com">october@surriel.com</a>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:49       ` Joseph D. Wagner
  2002-10-13  7:50         ` David S. Miller
@ 2002-10-16 21:00         ` Bill Davidsen
  1 sibling, 0 replies; 41+ messages in thread
From: Bill Davidsen @ 2002-10-16 21:00 UTC (permalink / raw)
  To: Joseph D. Wagner
  Cc: 'David S. Miller', robm, hahn, linux-kernel, jhoward

On Sun, 13 Oct 2002, Joseph D. Wagner wrote:

> Now wait a minute!  Allocating blocks and inodes is an integral part of
> a write.  Oh sure the actual writing of file data is SMP, but that
> process is bottlenecked by single threaded allocation of blocks and
> inodes.  Perhaps I could phrase what I said to be more technically
> accurate by saying, "Writing makes such poor use of multi-threading on
> SMP that in terms of performance it's as if it was single threaded."

People should note that the reason this hasn't been addressed to date is
that the disk is so many orders of magnitude slower than CPU that the
practical effect of the "bottleneck" is below the noise with most
hardware.

So you're not wrong, but you seem to be making it more of an issue than
the actual impact seems to justify. I bet you can measure it, or even
config a system where it matters, but in most cases it doesn't.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-12-12  0:13     ` Andrew Morton
@ 2002-12-12  0:31       ` Steven Roussey
  0 siblings, 0 replies; 41+ messages in thread
From: Steven Roussey @ 2002-12-12  0:31 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel

> Looks like all your apache instances woke up and started doing something.
> What makes you think it's a kernel or ext3 thing?

Google. ;) My search led to this particular thread, but I could not find a
final determination. I did just get a message from Rob saying that his
problem was a simultaneous wakeup as well:

http://groups.google.com/groups?q=strange+load+spikes+solved&hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=06fa01c294cc%24bcf46e10%241900a8c0%40lifebook&rnum=1

I'll move this to an apache list. Thank you for your time. I know it is
limited.

Sincerely,
Steven Roussey
http://Network54.com/ 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-12-11 23:54   ` Steven Roussey
@ 2002-12-12  0:13     ` Andrew Morton
  2002-12-12  0:31       ` Steven Roussey
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2002-12-12  0:13 UTC (permalink / raw)
  To: Steven Roussey; +Cc: robm, linux-kernel

Steven Roussey wrote:
>
> 500      32429  0.3  1.1 41948 9148 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL

Looks like all your apache instances woke up and started doing something.
What makes you think it's a kernel or ext3 thing?

If there were processes there in `D' state then it would probably
be related to filesystem activity.  But that does not appear to
be the case.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: Strange load spikes on 2.4.19 kernel
  2002-12-11 23:09 ` Andrew Morton
@ 2002-12-11 23:54   ` Steven Roussey
  2002-12-12  0:13     ` Andrew Morton
  0 siblings, 1 reply; 41+ messages in thread
From: Steven Roussey @ 2002-12-11 23:54 UTC (permalink / raw)
  To: akpm; +Cc: robm, linux-kernel

Thanks for looking at this.

> Tried mounting all filesystems `-o noatime'?

Did that a while back.

> Is there much disk write activity?  

No:
# iostat -k
Linux 2.4.18-18.7.xsmp (morpheus.network54.com)         12/11/2002

avg-cpu:  %user   %nice    %sys   %idle
          43.70    0.00   14.87   41.43

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
dev3-0            6.36        36.76        28.33    2588238    1994376

>What journalling mode are you using?

I remember just using the default. How can I tell?

# mount
/dev/hda1 on / type ext3 (rw,noatime)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
/dev/hda3 on /usr type ext3 (rw,noatime)

> The output of `ps aux' during a stall would be interesting,
> as would the `vmstat 1' ouptut.

If it helps, I recompiled Apache to have a higher limit on the number of
child servers that it can have running. I don't know why it was 256 (I
changed it to 512), unless the kernel has issues with lots of processes. But
what is 'lots'?

It is really odd. The idle % goes way up and then drops to nothing while
cpu(r) goes way high relative to normal.

This is from a mid-afternoon spike (load from 3 to 48):

#vmstat 1
...
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 6  0  0   7444  87024  13980 158236   0   0     0    16 3491  2510  19  18  63
 0  0  0   7444  80104  13876 158340   0   0     0     0 2224  1538  12   7  81
 1  0  0   7444  72860  13808 158408   0   0     0     0 2912  1759  16  11  73
 0  0  0   7444  67388  13768 158448   0   0     0     0 2025  1348  10   6  84
 0  0  0   7444  64156  13756 158460   0   0     0     0 1823  1142   8   6  86
 0  0  0   7444  62908  13756 158460   0   0     0     0 1444   515   6   6  87
 1  0  0   7444  62904  13744 158472   0   0     0     0 1073    75   1   1  98
 0  0  0   7444  62904  13672 158544   0   0     0     0  880    49   0   1  98
 0  0  0   7444  61788  13668 158548   0   0     0     0 1450   452   6   3  91
 3  0  1   7444  58904  13660 158556   0   0     0     0 3104  1386  14   8  78
 0  0  0   7444  58252  13656 158568   0   0     0    16 1481   628   6   7  87
 0  0  0   7444  54584  13648 158576   0   0     0     0 3188  2287  17   6  77
350  0  2   7444  50044  13652 158580   0   0    12     0 8759  3995  50  18  31
293 27  2   7444  41456  13644 158588   0   0     4     0 8576  3644  78  22   0
297  0  2   7444  38076  13640 158600   0   0     4    16 13163  6299  77  23   0
289  0  2   7444  36600  13616 158624   0   0     4     0 9035  4545  73  26   0
255  0  2   7444  36740  13624 158632   0   0    16     0 9827  4974  75  25   0
289  0  2   7444  36456  13604 158676   0   0    28     0 10030  4619  77  22   0
292  0  2   7444  35036  13596 158684   0   0     4     0 9064  4434  75  25   0
236  0  2   7444  32576  13656 158688   0   0    60    32 14771  7496  79  21   0
151  0  2   7444  32384  13652 158692   0   0     0     0 8670  5028  69  31   0
125  0  2   7444  31272  13652 158700   0   0     8     0 8825  4676  79  21   0
 98  0  2   7444  30340  13648 158712   0   0    12     0 9248  5197  77  23   0
 25  0  0   7444  32928  13664 158724   0   0     0   132 8649  4629  70  28   2
 58  0  2   7444  32960  13672 158744   0   0    28    60 8016  4156  62  18  19
 13  0  1   7444  34020  13668 158756   0   0     8     0 8759  4982  73  27   0
  1  0  0   7444  33696  13668 158776   0   0    20     0 8252  5977  65  18  17
  5  0  0   7444  34952  13668 158776   0   0     0     0 7625  5618  50  17  33
  4  0  0   7444  34752  13644 158800   0   0     0     0 8869  5982  70  20  10
  5  0  0   7444  39588  13656 158856   0   0    68     0 7054  5321  46  17  37
 1  0  0   7444  40472  13640 158872   0   0     0     0 6915  5282  50  20
30
 4  0  1   7444  41892  13640 158872   0   0     0     0 6286  4728  39  14
47
22  0  1   7444  41936  13612 158920   0   0    20     0 6323  4507  44  17
39
 5  0  0   7444  43292  13612 158936   0   0    12    16 7257  5086  53  18
28
 6  0  2   7444  44532  13604 158976   0   0    36     0 7306  5384  53  21
25
 3  0  0   7444  43952  13596 158988   0   0     4     0 7759  5077  50  46
4
 0  0  0   7444  45188  13684 158980   0   0     0   448 7696  5710  59  25
16 
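Incidentally, since the run queue goes from ~0 to 350 within a single sample, it might be worth capturing `vmstat 1' to a file around the clock and filtering for the onset, to correlate it with cron jobs etc. A small awk sketch (the file name and the threshold of 50 are arbitrary choices):

```shell
# Stand-in capture; in practice: vmstat 1 >> vmstat.log
printf '%s\n' \
  '  0  0  0   7444  54584  13648 158576' \
  '350  0  2   7444  50044  13652 158580' \
  '293 27  2   7444  41456  13644 158588' > vmstat.log

# Print each sample whose run queue (first column) exceeds 50;
# header lines fail the numeric test on $1 and are skipped.
awk '$1 ~ /^[0-9]+$/ && $1 > 50 { print "sample " NR ": r=" $1 }' vmstat.log
# -> sample 2: r=350
#    sample 3: r=293
```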


# ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  1416  456 ?        S    Dec10   0:06 init [3]
root         2  0.0  0.0     0    0 ?        SW   Dec10   0:00 [migration_CPU0]
root         3  0.0  0.0     0    0 ?        SW   Dec10   0:00 [migration_CPU1]
root         4  0.0  0.0     0    0 ?        SW   Dec10   0:00 [keventd]
root         5  0.0  0.0     0    0 ?        RWN  Dec10   0:01 [ksoftirqd_CPU0]
root         6  0.0  0.0     0    0 ?        RWN  Dec10   0:01 [ksoftirqd_CPU1]
root         7  0.2  0.0     0    0 ?        SW   Dec10   3:29 [kswapd]
root         8  0.0  0.0     0    0 ?        SW   Dec10   0:00 [bdflush]
root         9  0.0  0.0     0    0 ?        SW   Dec10   0:00 [kupdated]
root        10  0.0  0.0     0    0 ?        SW   Dec10   0:00 [mdrecoveryd]
root        14  0.0  0.0     0    0 ?        SW   Dec10   0:13 [kjournald]
root        92  0.0  0.0     0    0 ?        SW   Dec10   0:00 [khubd]
root       182  0.0  0.0     0    0 ?        SW   Dec10   0:00 [kjournald]
root       798  0.0  0.0  1476  528 ?        S    Dec10   0:03 syslogd -m 0
root       803  0.0  0.0  2136  492 ?        S    Dec10   0:00 klogd -2
rpc        823  0.0  0.0  1568  604 ?        S    Dec10   0:00 portmap
rpcuser    851  0.0  0.0  1600  596 ?        S    Dec10   0:00 rpc.statd
root      1002  0.0  0.1  3268 1216 ?        S    Dec10   0:00 /usr/sbin/sshd
root      1035  0.0  0.0  2316  704 ?        S    Dec10   0:00 xinetd -stayalive -reuse -pidfile /var/run/xinetd.pid
root      1076  0.0  0.2  5304 1644 ?        S    Dec10   0:03 sendmail: accepting connections
root      1107  0.0  0.0  1592  644 ?        S    Dec10   0:00 crond
xfs       1161  0.0  0.0  4564  664 ?        S    Dec10   0:00 xfs -droppriv -daemon
daemon    1197  0.0  0.0  1448  488 ?        S    Dec10   0:00 /usr/sbin/atd
named     1232  0.0  0.3 13900 2528 ?        S    Dec10   0:00 named -u named
named     1234  0.0  0.3 13900 2528 ?        S    Dec10   0:00 named -u named
named     1235  0.0  0.3 13900 2528 ?        S    Dec10   0:59 named -u named
named     1236  0.0  0.3 13900 2528 ?        S    Dec10   0:58 named -u named
named     1237  0.0  0.3 13900 2528 ?        S    Dec10   0:00 named -u named
named     1238  0.0  0.3 13900 2528 ?        S    Dec10   0:25 named -u named
root      1249  0.0  0.1  5956 1256 ?        S    Dec10   0:01 /usr/bin/perl /usr/libexec/webmin/miniserv.pl /etc/webmin/mini
root      1253  0.0  0.0  1388  368 tty1     S    Dec10   0:00 /sbin/mingetty tty1
root      1254  0.0  0.0  1388  368 tty2     S    Dec10   0:00 /sbin/mingetty tty2
root      1255  0.0  0.0  1388  368 tty3     S    Dec10   0:00 /sbin/mingetty tty3
root      1256  0.0  0.0  1388  368 tty4     S    Dec10   0:00 /sbin/mingetty tty4
root      1257  0.0  0.0  1388  368 tty5     S    Dec10   0:00 /sbin/mingetty tty5
root      1258  0.0  0.0  1388  368 tty6     S    Dec10   0:00 /sbin/mingetty tty6
root      6940  0.0  0.2  6712 1788 ?        S    Dec10   0:00 /usr/sbin/sshd
root      6944  0.0  0.1  2556 1176 pts/1    S    Dec10   0:00 -bash
root     29549  0.0  0.2  6664 2076 ?        S    13:46   0:00 /usr/sbin/sshd
root     29552  0.0  0.1  2528 1352 pts/0    S    13:46   0:00 -bash
root     29622  0.0  0.2  4940 1892 ?        S    13:47   0:00 smbd -D
root     29627  0.0  0.2  3920 1740 ?        S    13:47   0:02 nmbd -D
root     29628  0.0  0.1  3872 1500 ?        S    13:47   0:00 nmbd -D
root     29929  0.0  0.2  3360 1668 pts/0    S    13:53   0:00 ssh 10.1.1.10
root     31539  0.0  0.3  6348 2568 ?        S    14:56   0:00 sendmail: ./gB9LXsY03345 hotmail.co.uk.: user open
root     32124  0.0  0.6 41352 4648 ?        S    15:19   0:00 /usr/local/apache/bin/httpd -DSSL
500      32125  0.3  1.2 42140 9484 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32126  0.3  1.1 42092 9228 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32127  0.3  1.2 41956 9272 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32128  0.3  1.2 42056 9348 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32129  0.3  1.2 41976 9424 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32130  0.3  1.2 42064 9492 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32131  0.4  1.1 41956 9008 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32132  0.2  1.1 42352 9196 ?        S    15:19   0:04 /usr/local/apache/bin/httpd -DSSL
500      32133  0.3  1.2 42076 9396 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32134  0.1  1.1 42044 9080 ?        S    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32135  0.3  1.2 42064 9444 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32136  0.3  1.2 42084 9336 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32137  0.3  1.2 42032 9388 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32138  0.3  1.1 42036 8916 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32139  0.3  1.2 42072 9376 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32140  0.2  1.2 42020 9356 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32141  0.5  1.3 43344 10616 ?       R    15:19   0:08 /usr/local/apache/bin/httpd -DSSL
500      32142  0.3  1.1 42016 9140 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32143  0.4  1.2 42152 9580 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32144  0.1  1.2 42436 9372 ?        R    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32145  0.3  1.2 42084 9856 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32146  0.3  1.2 41960 9500 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32147  0.3  1.1 41992 9048 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32148  0.3  1.2 42092 9400 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32149  0.1  1.1 41888 8912 ?        R    15:19   0:03 /usr/local/apache/bin/httpd -DSSL
500      32150  0.3  1.1 41944 9224 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32151  0.3  1.2 42140 9352 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32152  0.3  1.2 42096 9556 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32153  0.3  1.2 42160 9528 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32154  0.4  1.2 41964 9844 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32155  0.2  1.1 42020 8860 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32156  0.3  1.2 42244 9472 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32157  0.3  1.1 42064 9048 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32158  0.3  1.1 42020 9072 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32159  0.3  1.2 42128 9748 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32160  0.3  1.2 42092 9384 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32161  0.3  1.1 42160 9244 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32162  0.3  1.2 41992 9348 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32163  0.3  1.2 42116 9416 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32164  0.3  1.1 42076 9240 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32165  0.3  1.1 41800 8980 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32166  0.4  1.1 41976 9008 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32167  0.3  1.1 41968 9236 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32168  0.3  1.1 41924 8908 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32169  0.3  1.2 42036 9368 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32170  0.3  1.2 42048 9284 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32171  0.3  1.2 41916 9580 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32172  0.3  1.1 41996 8916 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32173  0.4  1.2 42168 9328 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32174  0.3  1.2 41976 9432 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32175  0.3  1.1 41988 8912 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32176  0.1  1.1 41948 8752 ?        R    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32177  0.3  1.2 42096 9544 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32178  0.3  1.2 42028 9304 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32179  0.3  1.1 42068 9168 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32180  0.3  1.2 41936 9392 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32181  0.3  1.2 42004 9700 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32182  0.3  1.1 41976 9196 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32183  0.3  1.2 42060 9796 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32184  0.4  1.1 42036 8888 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32185  0.3  1.2 42232 9824 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32186  0.3  1.1 42068 9084 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32187  0.4  1.1 42056 8900 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32188  0.3  1.2 41932 9488 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32189  0.3  1.1 42016 9112 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32190  0.3  1.1 42000 9036 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32191  0.2  1.1 41904 8980 ?        S    15:19   0:03 /usr/local/apache/bin/httpd -DSSL
500      32192  0.3  1.1 42112 9176 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32193  0.3  1.1 42044 9184 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32194  0.3  1.1 42016 9184 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32195  0.3  1.2 41916 9428 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32196  0.2  1.2 42076 9348 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32197  0.3  1.1 41952 8984 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32198  0.3  1.2 42128 9272 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32199  0.3  1.1 42092 9048 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32200  0.3  1.2 42036 9464 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32201  0.3  1.2 42024 9312 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32202  0.3  1.1 41988 8868 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32203  0.3  1.2 41944 9364 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32204  0.3  1.2 42128 9392 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32205  0.3  1.1 41888 9248 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32206  0.3  1.2 42016 9580 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32207  0.3  1.2 42360 9344 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32208  0.2  1.1 41984 9152 ?        R    15:19   0:04 /usr/local/apache/bin/httpd -DSSL
500      32209  0.3  1.2 42016 9516 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32210  0.3  1.2 41956 9472 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32211  0.4  1.2 42012 9700 ?        S    15:19   0:08 /usr/local/apache/bin/httpd -DSSL
500      32212  0.3  1.2 42092 9600 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32213  0.3  1.2 42012 9688 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32214  0.3  1.1 42028 9008 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32215  0.3  1.1 42028 9168 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32216  0.3  1.2 42116 9304 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32217  0.3  1.1 42132 8820 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32218  0.3  1.1 42012 9256 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32219  0.3  1.1 41956 9072 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32220  0.3  1.1 41984 9220 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32221  0.3  1.1 42012 9180 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32222  0.3  1.1 42244 9244 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32223  0.3  1.1 42044 9236 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32224  0.3  1.1 41896 9140 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32225  0.4  1.1 42088 8912 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32226  0.2  1.1 42088 8792 ?        S    15:19   0:04 /usr/local/apache/bin/httpd -DSSL
500      32227  0.3  1.2 42048 9336 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32228  0.3  1.1 42140 8864 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32229  0.1  1.1 41876 8520 ?        S    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32230  0.3  1.1 41916 9204 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32231  0.3  1.2 42308 9568 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32232  0.3  1.2 42084 9316 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32233  0.3  1.2 42084 9824 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32234  0.3  1.2 42008 9340 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32235  0.3  1.2 42100 9668 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32236  0.3  1.1 41976 9024 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32237  0.3  1.1 41960 9104 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32238  0.3  1.2 42124 9344 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32239  0.3  1.2 42140 9608 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32240  0.3  1.2 41948 9360 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32241  0.4  1.2 42104 9860 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32242  0.3  1.1 42072 9224 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32243  0.3  1.1 42112 9252 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32244  0.3  1.2 42076 9380 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32245  0.3  1.1 41924 8912 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32246  0.3  1.1 42028 8700 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32247  0.3  1.1 42084 9252 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32248  0.3  1.1 41916 8864 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32249  0.4  1.2 42164 9884 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32250  0.3  1.2 42292 9424 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32251  0.3  1.1 42056 9000 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32252  0.4  1.2 42140 9704 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32253  0.3  1.2 42004 9396 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32254  0.3  1.1 41948 9008 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32255  0.3  1.1 42024 8988 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32256  0.3  1.1 42104 9244 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32257  0.3  1.2 42148 9628 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32258  0.3  1.2 42232 9328 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32259  0.3  1.2 42104 9644 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32260  0.3  1.1 42064 9224 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32261  0.3  1.1 41940 8924 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32262  0.3  1.1 41972 9224 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32263  0.3  1.2 42068 9488 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32264  0.3  1.2 41940 9396 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32265  0.3  1.1 41928 9240 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32266  0.3  1.1 42172 9228 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32267  0.3  1.1 41952 9172 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32268  0.1  1.1 41992 8756 ?        R    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32269  0.3  1.1 41980 9140 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32270  0.3  1.2 41948 9284 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32271  0.3  1.2 42064 9296 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32272  0.3  1.2 42104 9352 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32273  0.3  1.1 42040 9244 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32274  0.3  1.1 42064 9048 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32275  0.3  1.2 42112 9796 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32276  0.3  1.2 42056 9304 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32277  0.3  1.2 41920 9340 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32278  0.3  1.2 42060 9332 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32279  0.3  1.1 42188 9180 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32280  0.3  1.1 41944 9048 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32281  0.3  1.1 42024 8512 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32282  0.3  1.2 42292 9400 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32283  0.3  1.1 41948 8840 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32284  0.3  1.2 42188 9396 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32285  0.3  1.2 42116 9792 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32286  0.3  1.1 41936 9100 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32287  0.3  1.2 41964 9420 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32288  0.3  1.1 41996 9260 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32289  0.3  1.1 42024 9156 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32290  0.3  1.2 42076 9320 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32291  0.3  1.1 42048 9036 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32292  0.3  1.1 42040 9012 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32293  0.3  1.1 42100 9184 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32294  0.3  1.2 42072 9500 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32295  0.3  1.1 41952 9204 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32296  0.1  1.0 41932 8404 ?        R    15:19   0:01 /usr/local/apache/bin/httpd -DSSL
500      32297  0.3  1.1 42008 9164 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32298  0.3  1.2 42016 9312 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32299  0.3  1.2 42032 9696 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32300  0.3  1.2 41936 9268 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32301  0.3  1.2 42088 9432 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32302  0.3  1.1 41912 9020 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32303  0.3  1.1 41888 9088 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32304  0.3  1.1 42032 8984 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32305  0.3  1.2 42128 9576 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32306  0.3  1.1 42008 8840 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32307  0.3  1.2 42012 9404 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32308  0.3  1.2 42092 9372 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32309  0.4  1.2 42340 9460 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32310  0.3  1.1 41988 9264 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32311  0.3  1.1 41996 9184 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32312  0.3  1.1 42200 8808 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32313  0.3  1.1 41968 8952 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32314  0.3  1.1 42224 9108 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32315  0.3  1.2 42400 9596 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32316  0.3  1.2 42500 9708 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32317  0.3  1.1 42016 8820 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32318  0.3  1.1 42048 8876 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32319  0.3  1.2 42064 9568 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32320  0.3  1.2 41920 9544 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32321  0.4  1.1 42108 9064 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32322  0.3  2.0 48804 15572 ?       S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32323  0.3  1.2 42176 9368 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32324  0.3  1.2 42180 9460 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32325  0.3  1.1 41976 8932 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32326  0.3  1.2 42216 9420 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32327  0.3  1.1 42008 9172 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32328  0.3  1.2 42116 9436 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32329  0.3  1.2 42416 9280 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32330  0.4  1.1 42032 8800 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32331  0.3  1.1 42036 9184 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32332  0.4  1.1 41972 9260 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32333  0.2  1.2 42004 9268 ?        S    15:19   0:03 /usr/local/apache/bin/httpd -DSSL
500      32334  0.3  1.1 41988 9172 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32335  0.3  1.1 42052 8920 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32336  0.4  1.2 42084 9352 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32337  0.3  1.2 42244 9668 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32338  0.3  1.1 42036 8964 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32339  0.3  1.1 41992 8772 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32340  0.3  1.1 42124 9012 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32341  0.3  1.2 42000 9568 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32342  0.3  1.2 42032 9284 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32343  0.3  1.2 42292 9272 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32344  0.4  1.1 42280 9132 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32345  0.3  1.1 41912 9044 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32346  0.3  1.2 42040 9492 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32347  0.3  1.2 41992 9676 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32348  0.3  1.2 42056 9404 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32349  0.4  1.1 42040 9080 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32350  0.3  1.1 41964 9060 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32351  0.1  1.1 42140 9192 ?        S    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32352  0.3  1.1 42024 8868 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32353  0.3  1.1 42124 8888 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32354  0.3  1.2 41960 9668 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32355  0.3  1.1 42068 9212 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32356  0.3  1.2 42020 9292 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32357  0.3  1.1 41952 8968 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32358  0.3  1.2 42020 9340 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32359  0.3  1.2 42088 9504 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32360  0.3  1.1 42008 9056 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32361  0.2  1.1 41764 8844 ?        S    15:19   0:03 /usr/local/apache/bin/httpd -DSSL
500      32362  0.3  1.2 42240 9692 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32363  0.3  1.1 42272 9248 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32364  0.4  1.1 41888 8952 ?        R    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32365  0.3  1.1 41964 9208 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32366  0.3  1.1 42140 9248 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32367  0.3  1.2 42020 9288 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32368  0.4  1.2 42032 9768 ?        S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32369  0.3  1.1 42084 9244 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32370  0.3  1.1 41936 8880 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32371  0.1  1.1 41968 8636 ?        R    15:19   0:03 /usr/local/apache/bin/httpd -DSSL
500      32372  0.3  1.1 42044 9220 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32373  0.3  1.2 42040 9316 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32374  0.3  1.1 41896 8812 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32375  0.1  1.1 42160 8872 ?        S    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32376  0.3  1.1 41936 8552 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32377  0.4  1.3 43472 10388 ?       S    15:19   0:07 /usr/local/apache/bin/httpd -DSSL
500      32378  0.3  1.1 42012 8932 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32379  0.3  1.2 42156 9484 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32380  0.3  1.2 42204 9792 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32381  0.3  1.2 42028 9436 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32382  0.3  1.1 42048 9084 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32383  0.3  1.2 42068 9268 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32384  0.3  1.1 41924 8984 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32385  0.3  1.1 42100 9156 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32386  0.3  1.1 41944 8920 ?        R    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32387  0.3  1.2 42116 9380 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32388  0.3  1.1 41928 9172 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32389  0.3  1.1 41940 9036 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32390  0.3  1.2 42248 9820 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32391  0.3  1.1 42008 9124 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32392  0.3  1.2 41988 9328 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32393  0.3  1.1 41924 8884 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32394  0.2  1.1 42648 9140 ?        S    15:19   0:04 /usr/local/apache/bin/httpd -DSSL
500      32395  0.3  1.2 42064 9336 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32396  0.3  1.1 42120 9244 ?        R    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32397  0.3  1.2 41968 9272 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32398  0.3  1.1 41852 9252 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32399  0.3  1.2 42032 9540 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32400  0.3  1.2 42004 9392 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32401  0.3  1.2 42044 9340 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32402  0.3  1.1 41976 8972 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32403  0.1  1.1 41944 9008 ?        S    15:19   0:03 /usr/local/apache/bin/httpd -DSSL
500      32404  0.1  1.1 42064 8592 ?        S    15:19   0:02 /usr/local/apache/bin/httpd -DSSL
500      32405  0.3  1.2 42068 9384 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32406  0.3  1.1 42004 9024 ?        S    15:19   0:05 /usr/local/apache/bin/httpd -DSSL
500      32407  0.3  1.1 42016 9012 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32408  0.3  1.1 41952 8640 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32409  0.3  1.2 42124 9356 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32410  0.3  1.1 42128 9028 ?        S    15:19   0:06 /usr/local/apache/bin/httpd -DSSL
500      32411  0.3  1.1 41980 9196 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32412  0.3  1.1 42040 9196 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32413  0.3  1.1 42144 9136 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32414  0.3  1.1 42032 8940 ?        S    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32415  0.3  1.2 42016 9488 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32416  0.3  1.1 42008 9236 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32417  0.3  1.2 42200 9436 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32418  0.4  1.1 41980 8672 ?        S    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32419  0.3  1.1 42048 8900 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32420  0.3  1.1 42052 9104 ?        R    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32421  0.3  1.2 42184 9564 ?        S    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32422  0.3  1.1 42036 8972 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32423  0.3  1.2 42172 9528 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32424  0.4  1.1 42068 9176 ?        S    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32425  0.3  1.1 41988 8716 ?        S    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32426  0.3  1.1 42008 9056 ?        S    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32427  0.3  1.1 42056 9188 ?        R    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32428  0.4  1.1 41988 9000 ?        R    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32429  0.3  1.1 41948 9148 ?        R    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32430  0.3  1.2 42288 9484 ?        R    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32431  0.3  1.1 42016 8964 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32432  0.4  1.1 41940 9080 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32433  0.4  1.2 42120 9280 ?        S    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32434  0.3  1.2 42228 9332 ?        S    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32435  0.3  1.1 42088 8944 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32436  0.1  1.1 41968 8628 ?        S    15:19   0:02
/usr/local/apache/bin/httpd -DSSL
500      32437  0.3  1.1 42056 9192 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32438  0.3  1.2 42280 9988 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32439  0.3  1.2 42096 9276 ?        R    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32440  0.3  1.1 42020 9076 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32441  0.3  1.1 41880 9252 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32442  0.3  1.1 42388 9232 ?        R    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32443  0.3  1.1 42072 9244 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32444  0.4  1.1 41944 9140 ?        S    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32445  0.3  1.1 42064 9140 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32446  0.3  1.2 42116 9440 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32447  0.3  1.2 42224 9724 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32448  0.3  1.1 41948 9208 ?        S    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32449  0.0  1.0 41944 8448 ?        S    15:19   0:01
/usr/local/apache/bin/httpd -DSSL
500      32450  0.3  1.2 42084 9324 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32451  0.4  1.1 42024 8952 ?        S    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32452  0.4  1.1 42004 9108 ?        S    15:19   0:07
/usr/local/apache/bin/httpd -DSSL
500      32453  0.3  1.2 42112 9552 ?        R    15:19   0:05
/usr/local/apache/bin/httpd -DSSL
500      32454  0.3  1.1 42008 9204 ?        S    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32455  0.3  1.2 42176 9584 ?        R    15:19   0:06
/usr/local/apache/bin/httpd -DSSL
500      32536  0.3  1.2 42216 9364 ?        S    15:22   0:05
/usr/local/apache/bin/httpd -DSSL
500      32568  0.3  1.1 42016 9248 ?        S    15:23   0:04
/usr/local/apache/bin/httpd -DSSL
500      32599  0.3  1.1 42112 9220 ?        S    15:24   0:05
/usr/local/apache/bin/httpd -DSSL
500      32600  0.3  1.1 41964 8992 ?        S    15:24   0:05
/usr/local/apache/bin/httpd -DSSL
500      32601  0.3  1.1 42128 9092 ?        S    15:24   0:05
/usr/local/apache/bin/httpd -DSSL
500      32602  0.3  1.1 41984 8852 ?        S    15:24   0:04
/usr/local/apache/bin/httpd -DSSL
500      32603  0.3  1.2 42020 9596 ?        S    15:24   0:05
/usr/local/apache/bin/httpd -DSSL
500      32644  0.3  1.2 42112 9388 ?        S    15:26   0:05
/usr/local/apache/bin/httpd -DSSL
500      32645  0.3  1.1 41972 9052 ?        S    15:26   0:04
/usr/local/apache/bin/httpd -DSSL
500      32647  0.0  0.9 42152 7664 ?        S    15:26   0:00
/usr/local/apache/bin/httpd -DSSL
root       391  2.2  0.1  2364 1296 pts/1    S    15:32   0:22 top
root       392  0.0  0.2  6668 2084 ?        S    15:32   0:00
/usr/sbin/sshd
root       394  0.0  0.1  2540 1360 pts/2    S    15:32   0:00 -bash
root       440  2.3  0.0  1448  484 pts/2    S    15:32   0:22 vmstat 1
root       441  0.0  0.2  6668 2100 ?        S    15:32   0:00
/usr/sbin/sshd
root       447  0.0  0.1  2540 1364 pts/3    S    15:32   0:00 -bash
500        843  0.2  0.9 41756 7436 ?        S    15:45   0:00
/usr/local/apache/bin/httpd -DSSL
500        907  0.4  0.9 41776 7112 ?        S    15:47   0:00
/usr/local/apache/bin/httpd -DSSL
500        921  0.2  0.8 42108 6844 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        922  0.3  0.8 41764 6628 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        923  0.3  0.8 41812 6748 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        924  0.2  0.8 41728 6664 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        925  0.8  0.8 41676 6940 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        926  0.2  0.8 41676 6632 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        927  0.3  0.8 41844 6788 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        928  0.3  0.9 41896 7112 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        929  0.2  0.8 41792 6624 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        930  0.2  0.8 41772 6880 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        931  0.2  0.8 41744 6940 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        932  0.4  0.8 41756 6792 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        933  0.4  0.9 42268 7252 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        934  0.4  0.9 41792 7236 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        935  0.2  0.8 41760 6640 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        936  0.3  0.8 41836 6792 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        937  0.3  0.8 41876 6684 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        938  0.3  0.8 41720 6724 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        939  0.2  0.9 41792 6952 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        940  0.4  0.8 41900 6716 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        941  0.2  0.8 41740 6780 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        942  0.2  0.8 41772 6508 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        943  0.1  0.8 41776 6564 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        944  0.3  0.8 41788 6788 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        945  0.3  0.8 41692 6580 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        946  0.0  0.8 41820 6544 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        947  0.5  0.8 41956 6948 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        948  0.2  0.8 41752 6816 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        949  0.3  0.8 41800 6860 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        950  0.3  0.9 41788 6956 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        951  0.3  0.9 41920 7204 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        952  0.6  0.9 41752 6996 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        953  0.3  0.9 41700 7072 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        954  0.7  0.8 41740 6724 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        955  0.3  0.9 41904 6964 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        956  0.2  0.8 41764 6480 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        957  0.6  0.9 41740 7072 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        958  0.3  0.9 41800 6960 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        959  0.2  0.8 41712 6624 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        960  0.5  0.8 41688 6672 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        961  0.3  0.8 41768 6864 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        962  0.3  0.8 41800 6848 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        963  0.4  0.8 41936 6912 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        964  0.4  0.9 41740 7020 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        965  0.3  0.8 41804 6828 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        966  0.1  0.8 41904 6784 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        967  0.3  0.9 41748 6952 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        968  0.2  0.8 41944 6792 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        969  0.3  0.8 41760 6812 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        970  0.3  0.8 41776 6676 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        971  0.1  0.8 41760 6828 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        972  0.5  0.8 41768 6940 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        973  0.6  0.8 41736 6904 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        974  0.3  0.8 41720 6776 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        975  0.3  0.8 41768 6896 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        976  0.1  0.8 41764 6668 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        977  0.2  0.8 41732 6740 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        978  0.1  0.8 41788 6784 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        979  0.3  0.8 41764 6640 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        980  0.2  0.8 41768 6584 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        981  0.0  0.8 41840 6476 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        982  0.2  0.8 41696 6740 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        983  0.1  0.8 41700 6556 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        984  0.3  0.8 41752 6924 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        985  0.2  0.8 41764 6768 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        986  0.8  0.9 41820 6980 ?        S    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
500        987  0.3  0.8 41768 6644 ?        R    15:48   0:00
/usr/local/apache/bin/httpd -DSSL
root       988  0.0  0.1  4876 1376 ?        S    15:48   0:00
/usr/sbin/sendmail -i -t -fnobody@network54.com
root       989 23.0  0.1  3224 1292 pts/3    R    15:48   0:00 ps aux
root       990  0.0  0.1  4876 1376 ?        R    15:48   0:00
/usr/sbin/sendmail -i -t -fnobody@network54.com


Sincerely,
Steven Roussey
http://Network54.com/ 




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-12-11 22:54 Steven Roussey
@ 2002-12-11 23:09 ` Andrew Morton
  2002-12-11 23:54   ` Steven Roussey
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2002-12-11 23:09 UTC (permalink / raw)
  To: Steven Roussey; +Cc: robm, linux-kernel

Steven Roussey wrote:
> 
> Was there ever a solution to this issue?

No.  It wasn't clear what was going on.

>  Is it a kernel or ext3-based issue?

One of those.

> Is there a workaround?

Tried mounting all filesystems `-o noatime'?

> I've spent two months looking for a source and
> solution to this issue. It is pressing for me since all our users get locked
> out at the height of the spike. Ours is a webserver.
> 

Is there much disk write activity?  What journalling mode
are you using?

The output of `ps aux' during a stall would be interesting,
as would the `vmstat 1' output.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
@ 2002-12-11 22:54 Steven Roussey
  2002-12-11 23:09 ` Andrew Morton
  0 siblings, 1 reply; 41+ messages in thread
From: Steven Roussey @ 2002-12-11 22:54 UTC (permalink / raw)
  To: robm; +Cc: linux-kernel

Was there ever a solution to this issue?  Is it a kernel or ext3-based issue?
Is there a workaround? I've spent two months looking for a source and
solution to this issue. It is pressing for me since all our users get locked
out at the height of the spike. Ours is a webserver.

Example of load graph (by minute):

   http://www.network54.com/spikes.html

TIA

Steven Roussey



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13 11:47       ` Hugh Dickins
@ 2002-10-13 18:29         ` Andrew Morton
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Morton @ 2002-10-13 18:29 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andi Kleen, David S. Miller, wagnerjd, robm, hahn, linux-kernel, jhoward

Hugh Dickins wrote:
> 
> On 13 Oct 2002, Andi Kleen wrote:
> >
> > Still in 2.4 the VFS takes the big kernel lock unnecessarily for
> > a few VFS operations (no matter if the underlying FS needs it or not).
> > That's fixed in 2.5.
> 
> Something I was a bit surprised to notice recently: 2.5 still holds
> big kernel lock around the potentially very lengthy vmtruncate() -
> is that one still really necessary at VFS level?
> 

eww..  Truncating 1G of pagecache takes 2.5 seconds on my testbox.
Probably 4 seconds if that pagecache got there via write() (need to
crunch on buffer_heads as well).

There's a cond_resched() after every 16th page in truncate_inode_pages(),
so it won't be very visible to humans.  But a multi-second holdtime is
rather rude.

Certainly we don't need to hold it across truncate_inode_pages(), which
is where the heavy lifting happens.  Probably, we can just push it down
to vmtruncate(), around the i_op->truncate() callout.

But as ever, it's not really clear what the thing is protecting.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:24     ` Andi Kleen
  2002-10-13  7:21       ` David S. Miller
@ 2002-10-13 11:47       ` Hugh Dickins
  2002-10-13 18:29         ` Andrew Morton
  1 sibling, 1 reply; 41+ messages in thread
From: Hugh Dickins @ 2002-10-13 11:47 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, wagnerjd, robm, hahn, linux-kernel, jhoward

On 13 Oct 2002, Andi Kleen wrote:
> 
> Still in 2.4 the VFS takes the big kernel lock unnecessarily for
> a few VFS operations (no matter if the underlying FS needs it or not).
> That's fixed in 2.5.

Something I was a bit surprised to notice recently: 2.5 still holds
big kernel lock around the potentially very lengthy vmtruncate() -
is that one still really necessary at VFS level?

Hugh


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  6:14               ` Rob Mueller
@ 2002-10-13  7:27                 ` Simon Kirby
  0 siblings, 0 replies; 41+ messages in thread
From: Simon Kirby @ 2002-10-13  7:27 UTC (permalink / raw)
  To: Rob Mueller; +Cc: Andrew Morton, linux-kernel, Jeremy Howard

On Sun, Oct 13, 2002 at 04:14:18PM +1000, Rob Mueller wrote:

> > Not yet.  Yours is only the second report.  Possible report.
> > Please try ordered mode.  The below will fix journalled
> > mode, if this is indeed the source of the problem
> 
> We tried applying this patch, but no change either. Again, we've tried both
> journaled and ordered.

Hmmm.. Your vmstat output looked a lot like our mail server when we
recently tried switching it to ext3.  We also saw large load spikes, but
I did not investigate it very closely.  Have you tried mounting as ext2
to see if journalling is responsible?

If it's a mail server, does it use dotlocking (creation and deletion of
lots of small/empty files)?  I haven't had any time recently to look at
this any further, but at the time I had guessed that this had something
to do with the constant write-out...
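For readers unfamiliar with the pattern Simon describes: classic dotlocking serialises mailbox access by atomically creating and then unlinking a tiny lock file per delivery, so a busy spool generates a constant stream of create/unlink metadata operations for the journal. A hedged sketch (illustrative only, not any particular MTA's code; paths are placeholders):

```shell
# Illustrative dotlock cycle: each delivery atomically creates
# "$mbox.lock", appends to the mailbox, then unlinks the lock.
mbox=$(mktemp)            # stand-in for a real mailbox path
lock="$mbox.lock"
if ( set -C; echo $$ > "$lock" ) 2>/dev/null; then  # O_EXCL-style create
    echo "new message" >> "$mbox"
    rm -f "$lock"         # every delivery = one create + one unlink
fi
```

Under a journalling filesystem every one of those create/unlink pairs becomes a small metadata transaction, which is one reason such a workload can stress journal write-out.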

Simon-

[        Simon Kirby        ][        Network Operations        ]
[     sim@netnation.com     ][     NetNation Communications     ]
[  Opinions expressed are not necessarily those of my employer. ]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
       [not found]   ` <20021013.000127.43007739.davem@redhat.com.suse.lists.linux.kernel>
@ 2002-10-13  7:24     ` Andi Kleen
  2002-10-13  7:21       ` David S. Miller
  2002-10-13 11:47       ` Hugh Dickins
  0 siblings, 2 replies; 41+ messages in thread
From: Andi Kleen @ 2002-10-13  7:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: wagnerjd, robm, hahn, linux-kernel, jhoward

"David S. Miller" <davem@redhat.com> writes:
> 
> Allocating blocks and inodes, yes that is currently single
> threaded on SMP.  But there is no fundamental reason for that,
> we just haven't gotten around to threading that bit yet.

It depends on your file system. XFS and JFS do block and inode allocation
fully SMP multithreaded. reiserfs/ext2/ext3 do not.

Still in 2.4 the VFS takes the big kernel lock unnecessarily for
a few VFS operations (no matter if the underlying FS needs it or not).
That's fixed in 2.5.

-Andi

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-13  7:24     ` Andi Kleen
@ 2002-10-13  7:21       ` David S. Miller
  2002-10-13 11:47       ` Hugh Dickins
  1 sibling, 0 replies; 41+ messages in thread
From: David S. Miller @ 2002-10-13  7:21 UTC (permalink / raw)
  To: ak; +Cc: wagnerjd, robm, hahn, linux-kernel, jhoward

   From: Andi Kleen <ak@suse.de>
   Date: 13 Oct 2002 09:24:32 +0200
   
   It depends on your file system.
...   
   Still in 2.4 the VFS takes the big kernel lock unnecessarily
...
   That's fixed in 2.5.

All true.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  7:00             ` Andrew Morton
@ 2002-10-13  6:14               ` Rob Mueller
  2002-10-13  7:27                 ` Simon Kirby
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Mueller @ 2002-10-13  6:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Jeremy Howard


> > So you're saying that ext3 is somehow breaking the standard kernel
> > writeback code?
>
> Possibly.  Please try ordered mode.

Ok, we've now tried ordered mode, and are seeing exactly the same behaviour,
no change. No processes in a waiting state or blocked state at all, but
still big load spikes. See my other post for an alternating uptime/ps
output.

> Not yet.  Yours is only the second report.  Possible report.
> Please try ordered mode.  The below will fix journalled
> mode, if this is indeed the source of the problem

We tried applying this patch, but no change either. Again, we've tried both
journaled and ordered.

Rob


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
       [not found] <Pine.LNX.4.33.0210121605490.16179-100000@coffee.psychology.mcmaster.ca>
@ 2002-10-13  0:49 ` Rob Mueller
  0 siblings, 0 replies; 41+ messages in thread
From: Rob Mueller @ 2002-10-13  0:49 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-kernel, Jeremy Howard


> but that load of 21 is really just an artifact of a bunch
> of procs being in short-term io wait (D state in top/ps), right?
> such procs get counted in loadaverage, even though they're asleep,
> not eating cycles.

Well I tried running this:

while true; do uptime; ps -wax | grep -v ' S' | grep -v 'ps -wax'; sleep 5;
done

Which should show all procs not in the sleep state (except the ps process
itself) every 5 seconds, along with an uptime load. See output below, which
again has a load jump about half way down. It appears no extra processes are
in the 'D' state. And no extra CPU load appears either during the spike.

  7:45pm  up 1 day, 18:06,  2 users,  load average: 0.29, 0.67, 2.11
  7:45pm  up 1 day, 18:07,  2 users,  load average: 0.27, 0.66, 2.10
19784 ?        R      0:00 /usr/local/apache/bin/httpd
  7:45pm  up 1 day, 18:07,  2 users,  load average: 0.24, 0.65, 2.09
19808 ?        R      0:00 /usr/local/apache/bin/httpd
  7:45pm  up 1 day, 18:07,  2 users,  load average: 0.22, 0.64, 2.08
  7:45pm  up 1 day, 18:07,  2 users,  load average: 0.21, 0.63, 2.07
  7:45pm  up 1 day, 18:07,  2 users,  load average: 27.65, 6.46, 3.95
  7:45pm  up 1 day, 18:07,  2 users,  load average: 25.44, 6.35, 3.93
  7:45pm  up 1 day, 18:07,  2 users,  load average: 23.40, 6.25, 3.90
19668 ?        R      0:00 imapd
19669 ?        R      0:00 imapd
19742 ?        D      0:00 imapd
  7:45pm  up 1 day, 18:07,  2 users,  load average: 21.53, 6.14, 3.88
19784 ?        R      0:01 /usr/local/apache/bin/httpd
  7:45pm  up 1 day, 18:07,  2 users,  load average: 19.80, 6.04, 3.86
  7:45pm  up 1 day, 18:07,  2 users,  load average: 18.22, 5.94, 3.84


I did notice a very small load jump earlier caused by what you describe
above, which seems to be due to the journal being flushed, but notice how
small this jump is in comparison to the one above...


  7:38pm  up 1 day, 18:00,  2 users,  load average: 0.63, 1.47, 2.97
  7:39pm  up 1 day, 18:00,  2 users,  load average: 0.58, 1.45, 2.96
18564 ?        D      0:00 imapd
  7:39pm  up 1 day, 18:00,  2 users,  load average: 0.53, 1.42, 2.94
  7:39pm  up 1 day, 18:01,  2 users,  load average: 0.49, 1.40, 2.93
  7:39pm  up 1 day, 18:01,  2 users,  load average: 0.53, 1.39, 2.92
  7:39pm  up 1 day, 18:01,  2 users,  load average: 0.76, 1.41, 2.91
 1441 ?        D      1:23 qmgr -l -t fifo -u
  7:39pm  up 1 day, 18:01,  2 users,  load average: 1.26, 1.50, 2.93
   10 ?        DW    17:08 [kjournald]
 1577 ?        D      0:00 imapd
16346 ?        D      0:00 lmtpd -a
16393 ?        D      0:00 imapd
16427 ?        D      0:00 lmtpd -a
17349 ?        D      0:00 imapd
17481 ?        D      0:00 lmtpd -a
18356 ?        D      0:00 imapd
  7:39pm  up 1 day, 18:01,  2 users,  load average: 1.16, 1.48, 2.91
  465 ?        R      5:57 syslogd -m 0
  7:39pm  up 1 day, 18:01,  2 users,  load average: 1.07, 1.45, 2.89
  7:39pm  up 1 day, 18:01,  2 users,  load average: 0.98, 1.43, 2.88
  7:39pm  up 1 day, 18:01,  2 users,  load average: 0.98, 1.42, 2.87
  7:39pm  up 1 day, 18:01,  2 users,  load average: 0.90, 1.40, 2.85
  7:40pm  up 1 day, 18:01,  2 users,  load average: 0.99, 1.41, 2.85

Now I'm more confused than ever, because there don't actually appear to be
any blocked processes at all? What is going on???
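One way to probe this further is to sample /proc/loadavg together with an explicit count of tasks in uninterruptible sleep, rather than relying on `ps | grep -v ' S'` (a sketch; the `ps -eo` spelling assumes a reasonably modern procps, older ones may need `ps axo stat`):

```shell
# Print the kernel's own load figures next to an explicit count of
# tasks in uninterruptible sleep ("D").  If load spikes while the D
# count stays flat, the blocked tasks are being counted somewhere
# that ps snapshots don't catch.
for i in 1 2 3; do
    d=$(ps -eo stat= | grep -c '^D' || true)
    echo "loadavg: $(cat /proc/loadavg)  D-state tasks: ${d:-0}"
    sleep 1
done
```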

Rob Mueller


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  6:52           ` Rob Mueller
@ 2002-10-12  7:00             ` Andrew Morton
  2002-10-13  6:14               ` Rob Mueller
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2002-10-12  7:00 UTC (permalink / raw)
  To: Rob Mueller; +Cc: linux-kernel, Jeremy Howard

Rob Mueller wrote:
> 
> > It commits your changes to the journal every five seconds.  But your data
> > is then only in the journal.  It still needs to be written into your
> > files.
> > That writeback is controlled by the normal kernel 30-second writeback
> > timing.  If that writeback isn't keeping up, kjournald needs to
> > force the writeback so it can recycle that data's space in the journal.
> >
> > While that writeback is happening, everything tends to wait on it.
> 
> Doesn't bdflush let you control this?

It doesn't work if the buffer flushtimes are wrong.

> So you're saying that ext3 is somehow breaking the standard kernel writeback
> code?

Possibly.  Please try ordered mode.

> Is this something they know about

yes

> , and/or are addressing?

Not yet.  Yours is only the second report.  Possible report.
Please try ordered mode.  The below will fix journalled
mode, if this is indeed the source of the problem


--- 2.4.19-pre10/fs/buffer.c~ext3-flushtime	Wed Jun  5 21:39:14 2002
+++ 2.4.19-pre10-akpm/fs/buffer.c	Wed Jun  5 21:39:22 2002
@@ -1067,6 +1067,8 @@ static void __refile_buffer(struct buffe
 		bh->b_list = dispose;
 		if (dispose == BUF_CLEAN)
 			remove_inode_queue(bh);
+		if (dispose == BUF_DIRTY)
+			set_buffer_flushtime(bh);
 		__insert_into_lru_list(bh, dispose);
 	}
 }
--- 2.4.19-pre10/fs/jbd/transaction.c~ext3-flushtime	Wed Jun  5 21:39:18 2002
+++ 2.4.19-pre10-akpm/fs/jbd/transaction.c	Wed Jun  5 21:39:22 2002
@@ -1101,7 +1101,6 @@ int journal_dirty_metadata (handle_t *ha
 	
 	spin_lock(&journal_datalist_lock);
 	set_bit(BH_JBDDirty, &bh->b_state);
-	set_buffer_flushtime(bh);
 
 	J_ASSERT_JH(jh, jh->b_transaction != NULL);

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
       [not found] <001401c2719b$9d45c4a0$53241c43@joe>
@ 2002-10-12  6:54 ` Rob Mueller
  0 siblings, 0 replies; 41+ messages in thread
From: Rob Mueller @ 2002-10-12  6:54 UTC (permalink / raw)
  To: linux-kernel; +Cc: Joseph D. Wagner, Jeremy Howard


> 1) If you don't need to know when a file was last accessed, mount the
> ext3 file system with the -noatime option.  This disables updating of
> the "Last Accessed On:" property, which should significantly increase
> throughput.

Yes, already did that. Should have noted our fstab entry:

/dev/sda2  /   ext3    defaults,noatime 1 1

> /sbin/elvtune -r 16384 -w 8192 /dev/mount-point
> where mount-point is the partition (e.g. /dev/hda5)

Thanks, we'll give this a try. Didn't know about this one before.

> 3) If the above don't work, double the journal size.

As you noted, we seem to be doing more reads than writes, so I'd be surprised
if 192M wasn't enough...

> P.S.  You'd probably get more help from the ext3 mailing list.

I wasn't too sure that it was an I/O problem, which is why I posted here. As
the vmstat output showed, there didn't seem to be any sudden excessive I/O
occurring, but the load would jump enormously.

Maybe we should try some different journalling modes, or disable
journalling altogether, to test whether that is the actual culprit...
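For reference, trying the different modes looks something like this (device and mountpoint names here are placeholders, not this thread's machines; on 2.4 the data= mode generally cannot be changed on a live remount of the root filesystem, so a spare partition, or an fstab edit plus reboot, is the practical route):

```shell
# Placeholder device /dev/sdb1 mounted on a test mountpoint.
mount -t ext3 -o noatime,data=ordered   /dev/sdb1 /mnt/test
# mount -t ext3 -o noatime,data=writeback /dev/sdb1 /mnt/test
# To take journalling out of the picture entirely, mount the same
# filesystem as ext2 (the journal is simply ignored):
# mount -t ext2 -o noatime /dev/sdb1 /mnt/test
```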

Rob

PS. I forgot to include the uptime dump in my last post. I just wanted to
show how spikey it was. Basically there's a long falling off decay, and then
a sudden spike again... which decays off... and then spikes again...
basically repeat what you see below over and over and over in 5-10 minute
intervals...

  1:29am  up 23:50,  3 users,  load average: 0.35, 2.49, 2.73
  1:29am  up 23:51,  2 users,  load average: 0.45, 2.44, 2.71
  1:29am  up 23:51,  2 users,  load average: 0.70, 2.42, 2.71
  1:29am  up 23:51,  2 users,  load average: 0.59, 2.34, 2.68
  1:29am  up 23:51,  2 users,  load average: 0.58, 2.28, 2.65
  1:29am  up 23:51,  2 users,  load average: 0.49, 2.21, 2.62
  1:30am  up 23:51,  2 users,  load average: 0.49, 2.15, 2.60
  1:30am  up 23:52,  2 users,  load average: 21.39, 6.43, 3.98
  1:30am  up 23:52,  2 users,  load average: 18.10, 6.22, 3.94
  1:30am  up 23:52,  2 users,  load average: 15.32, 6.01, 3.89
  1:30am  up 23:52,  2 users,  load average: 13.04, 5.83, 3.86
  1:30am  up 23:52,  2 users,  load average: 11.03, 5.64, 3.81
  1:31am  up 23:52,  2 users,  load average: 9.41, 5.47, 3.78
  1:31am  up 23:53,  2 users,  load average: 7.96, 5.29, 3.74
  1:31am  up 23:53,  2 users,  load average: 6.81, 5.13, 3.70
  1:31am  up 23:53,  2 users,  load average: 5.76, 4.96, 3.66
  1:31am  up 23:53,  2 users,  load average: 4.88, 4.80, 3.62
  1:31am  up 23:53,  2 users,  load average: 4.13, 4.64, 3.58
  1:32am  up 23:53,  2 users,  load average: 3.49, 4.48, 3.54
  1:32am  up 23:54,  2 users,  load average: 2.95, 4.34, 3.51
  1:32am  up 23:54,  2 users,  load average: 2.50, 4.19, 3.47
  1:32am  up 23:54,  2 users,  load average: 2.12, 4.05, 3.43
  1:32am  up 23:54,  2 users,  load average: 1.79, 3.92, 3.39
  1:32am  up 23:54,  2 users,  load average: 1.51, 3.79, 3.36
  1:33am  up 23:54,  2 users,  load average: 1.43, 3.70, 3.33
  1:33am  up 23:55,  2 users,  load average: 1.21, 3.58, 3.30
  1:33am  up 23:55,  2 users,  load average: 1.03, 3.46, 3.26
  1:33am  up 23:55,  2 users,  load average: 1.03, 3.38, 3.23
  1:33am  up 23:55,  2 users,  load average: 0.87, 3.27, 3.20
  1:33am  up 23:55,  2 users,  load average: 0.82, 3.17, 3.17
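
Incidentally, the decay between those samples matches the kernel's
exponentially damped 1-minute average almost exactly: with roughly 10
seconds between uptime lines (my reading of the timestamps) and essentially
no new load, each sample should be the previous one times exp(-10/60). A
quick awk check against the numbers above:

```shell
# 1-minute load average decays by exp(-dt/60) per dt seconds when the
# instantaneous load is near zero; the uptime lines above are ~10 s apart.
awk 'BEGIN {
  decay = exp(-10/60)                                # ~0.8465 per sample
  n = split("21.39 18.10 15.32 13.04 11.03", s, " ") # from the spike above
  for (i = 1; i < n; i++)
    printf "%6.2f -> predicted %6.2f, observed %6.2f\n", s[i], s[i]*decay, s[i+1]
}'
```

Every predicted value lands within about 0.1 of the observed one, so the
tail of each spike is pure decay with nothing new becoming runnable.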


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  6:44         ` Andrew Morton
@ 2002-10-12  6:52           ` Rob Mueller
  2002-10-12  7:00             ` Andrew Morton
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Mueller @ 2002-10-12  6:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Jeremy Howard


> It commits your changes to the journal every five seconds.  But your data
> is then only in the journal.  It still needs to be written into your
> files.
> That writeback is controlled by the normal kernel 30-second writeback
> timing.  If that writeback isn't keeping up, kjournald needs to
> force the writeback so it can recycle that data's space in the journal.
>
> While that writeback is happening, everything tends to wait on it.

Doesn't bdflush let you control this? I noted in my first post that we'd
played with changing bdflush params as described here:

http://www-106.ibm.com/developerworks/linux/library/l-fs8.html?dwzone=linux

And set them to this:

[root@server5 hm]# cat /proc/sys/vm/bdflush
39      500     0       0       60      300     60      0       0

Shouldn't this reduce it to writing every 3 seconds? We tried lowering some
of the values even further based on the description here:

http://www.bb-zone.com/zope/bbzone/docs/slgfg/part2/cha04/sec04

So we altered the first one (nfract) to 10(%) to try to keep the dirty
buffer list small, but that didn't help either. I'd have thought that
age_buffer at 3 seconds would trigger before the 40%-dirty threshold
anyway?
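
For reference, decoding the values above (per the 2.4 vm sysctl docs, and
assuming HZ=100 on x86, so 100 jiffies = 1 second; the values are
hard-coded here from the cat output rather than read from /proc):

```shell
# bdflush fields on 2.4.x: field 1 is nfract (% of buffers dirty before
# bdflush activates), field 6 is age_buffer in jiffies (300 = 3 s at HZ=100).
echo "39 500 0 0 60 300 60 0 0" |
  awk '{ printf "nfract = %d%%  age_buffer = %.1f s\n", $1, $6 / 100 }'
```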

> It is suspected that ext3 gets the flushtime on those buffers
> wrong as well, so the writeback isn't happening right.

So you're saying that ext3 is somehow breaking the standard kernel writeback
code? Is this something they know about, and/or are addressing?

Rob



* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  6:37       ` Rob Mueller
@ 2002-10-12  6:44         ` Andrew Morton
  2002-10-12  6:52           ` Rob Mueller
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2002-10-12  6:44 UTC (permalink / raw)
  To: Rob Mueller; +Cc: linux-kernel, Jeremy Howard

Rob Mueller wrote:
> 
> > If it was this, one would expect it to happen every time you'd written
> > 0.75 * 192 Mbytes to the filesystem.  Which seems about right.
> >
> > Easy enough to test though.
> 
> Hmmm, so why wouldn't the journal be flushing more regularly (the 5 second
> commit interval it claims in dmesg), or is that something we should ask on
> the ext3 list?

It commits your changes to the journal every five seconds.  But your data
is then only in the journal.  It still needs to be written into your files.
That writeback is controlled by the normal kernel 30-second writeback
timing.  If that writeback isn't keeping up, kjournald needs to
force the writeback so it can recycle that data's space in the journal.

While that writeback is happening, everything tends to wait on it.

It is suspected that ext3 gets the flushtime on those buffers
wrong as well, so the writeback isn't happening right.

> Apart from remounting the filesystem, is there any easy way to test this
> (again, it's all one silly / partition, so I think it's a reboot every
> time to try a new mount configuration?)
> 

You'll need to reboot.


* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  3:07     ` Andrew Morton
@ 2002-10-12  6:37       ` Rob Mueller
  2002-10-12  6:44         ` Andrew Morton
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Mueller @ 2002-10-12  6:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Jeremy Howard


> If it was this, one would expect it to happen every time you'd written
> 0.75 * 192 Mbytes to the filesystem.  Which seems about right.
>
> Easy enough to test though.

Hmmm, so why wouldn't the journal be flushing more regularly (the 5 second
commit interval it claims in dmesg), or is that something we should ask on
the ext3 list? Apart from remounting the filesystem, is there any easy way
to test this? (Again, it's all one silly / partition, so I think it's a
reboot every time to try a new mount configuration?)

Thanks

Rob



* RE: Strange load spikes on 2.4.19 kernel
@ 2002-10-12  3:13 Joseph D. Wagner
  0 siblings, 0 replies; 41+ messages in thread
From: Joseph D. Wagner @ 2002-10-12  3:13 UTC (permalink / raw)
  To: 'Linux Kernel Development List'; +Cc: 'Rob Mueller'

Also, try doing a:

dumpe2fs -h <device>    (e.g. dumpe2fs -h /dev/sda2 for the root fs)

and sending us back the data.



* RE: Strange load spikes on 2.4.19 kernel
@ 2002-10-12  3:10 Joseph D. Wagner
  0 siblings, 0 replies; 41+ messages in thread
From: Joseph D. Wagner @ 2002-10-12  3:10 UTC (permalink / raw)
  To: Linux Kernel Development List

I doubt it's the journal, but remember that data=journal requires a much
larger journal than data=ordered.

I'd suggest the following:

1) If you don't need to know when a file was last accessed, mount the
ext3 file system with the noatime option (mount -o noatime).  This
disables updates to the last-accessed (atime) timestamp, which should
significantly increase throughput.

2) EXT3 is optimized for writes but it sounds like your server is used
primarily for reads.  (If I'm wrong, ignore this point.)  Try:
/sbin/elvtune -r 16384 -w 8192 /dev/device
	where device is the partition (e.g. /dev/hda5)

If this only makes things worse, it means either 1) my numbers are too
big, or 2) your system should be optimized for writes.  (BTW, the
default is -r 4096 -w 8192.)

You can also try other elvtune settings.  Once you have found elvtune
settings that give you your most satisfactory mix of latency and
throughput for your application set, you can add the calls to the
/sbin/elvtune program to the end of your /etc/rc.d/rc.local script so
that they are set again to your chosen values at every boot.

3) If the above don't work, double the journal size.
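
If suggestions 1) and 2) help, they can be made persistent across boots;
a rough sketch for rc.local (the device name and elvtune values here are
illustrative, not recommendations for this particular box):

```shell
# Candidate additions to /etc/rc.d/rc.local (illustrative only)

# 1) stop atime updates on the busy filesystem
mount -o remount,noatime /

# 2) re-apply whatever elevator settings testing settled on
/sbin/elvtune -r 16384 -w 8192 /dev/sda
```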

Hope this helped.

Joseph Wagner

P.S.  You'd probably get more help from the ext3 mailing list.

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Rob Mueller
Sent: Friday, October 11, 2002 9:26 PM
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org; Jeremy Howard
Subject: Re: Strange load spikes on 2.4.19 kernel


> > Filesystem is ext3 with one big / partition (that's a mistake
> > we won't repeat, but too late now). This should be mounted
> > with data=journal given the kernel command line above, though
> > it's a bit hard to tell from the dmesg log:
> >
>
> It's possible that the journal keeps on filling.  When that happens,
> everything has to wait for writeback into the main filesystem.
> Completion of that writeback frees up journal space and then everything
> can unblock.
>
> Suggest you try data=ordered.

We have a 192M journal, and from the dmesg log it's saying that it's got a 5
second flush interval, so I can't imagine that the journal is filling, but
we'll try it and see I guess.

What I don't understand is why the spike is so sudden, and decays so slowly.
It's Friday night now, so the load is fairly low. I set up a loop to dump
uptime information every 10 seconds and attached the result below. It's
running smoothly, then 'bam', it's hit with something big, which then slowly
decays off.

A few extra things:
1. It happens every couple of minutes or so, but not at any exact time, so
it's not a cron job or anything
2. Viewing 'top', there are no extra processes obviously running when it
happens

Rob




* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  2:25   ` Rob Mueller
@ 2002-10-12  3:07     ` Andrew Morton
  2002-10-12  6:37       ` Rob Mueller
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2002-10-12  3:07 UTC (permalink / raw)
  To: Rob Mueller; +Cc: linux-kernel, Jeremy Howard

Rob Mueller wrote:
> 
> > > Filesystem is ext3 with one big / partition (that's a mistake
> > > we won't repeat, but too late now). This should be mounted
> > > with data=journal given the kernel command line above, though
> > > it's a bit hard to tell from the dmesg log:
> > >
> >
> > It's possible that the journal keeps on filling.  When that happens,
> > everything has to wait for writeback into the main filesystem.
> > Completion of that writeback frees up journal space and then everything
> > can unblock.
> >
> > Suggest you try data=ordered.
> 
> We have a 192M journal, and from the dmesg log it's saying that it's got a 5
> second flush interval, so I can't imagine that the journal is filling, but
> we'll try it and see I guess.
> 
> What I don't understand is why the spike is so sudden, and decays so slowly.
> It's Friday night now, so the load is fairly low. I set up a loop to dump
> uptime information every 10 seconds and attached the result below. It's
> running smoothly, then 'bam', it's hit with something big, which then slowly
> decays off.
> 
> A few extra things:
> 1. It happens every couple of minutes or so, but not at any exact time, so
> it's not a cron job or anything
> 2. Viewing 'top', there are no extra processes obviously running when it
> happens
> 

If it was this, one would expect it to happen every time you'd written
0.75 * 192 Mbytes to the filesystem.  Which seems about right.

Easy enough to test though.
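
Plugging in rough numbers from the vmstat dumps (my estimate, not Andrew's:
the `bo` column hovers around ~1000 blocks/s, i.e. about 1 MB/s with 1 KB
blocks) does land in the observed 1-5 minute spike interval:

```shell
# If kjournald forces writeback when the journal is ~3/4 full, estimate the
# interval between forced flushes from the steady write rate.
awk 'BEGIN {
  threshold_mb = 0.75 * 192   # ~144 MB of journal space
  write_mb_s   = 1.0          # ~1000 blocks/s from vmstat "bo" (1 KB blocks)
  printf "forced writeback roughly every %.1f minutes\n",
         threshold_mb / write_mb_s / 60
}'
```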


* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  1:25 ` Andrew Morton
@ 2002-10-12  2:25   ` Rob Mueller
  2002-10-12  3:07     ` Andrew Morton
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Mueller @ 2002-10-12  2:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Jeremy Howard


> > Filesystem is ext3 with one big / partition (that's a mistake
> > we won't repeat, but too late now). This should be mounted
> > with data=journal given the kernel command line above, though
> > it's a bit hard to tell from the dmesg log:
> >
>
> It's possible that the journal keeps on filling.  When that happens,
> everything has to wait for writeback into the main filesystem.
> Completion of that writeback frees up journal space and then everything
> can unblock.
>
> Suggest you try data=ordered.

We have a 192M journal, and from the dmesg log it's saying that it's got a 5
second flush interval, so I can't imagine that the journal is filling, but
we'll try it and see I guess.

What I don't understand is why the spike is so sudden, and decays so slowly.
It's Friday night now, so the load is fairly low. I set up a loop to dump
uptime information every 10 seconds and attached the result below. It's
running smoothly, then 'bam', it's hit with something big, which then slowly
decays off.

A few extra things:
1. It happens every couple of minutes or so, but not at any exact time, so
it's not a cron job or anything
2. Viewing 'top', there are no extra processes obviously running when it
happens

Rob



* Re: Strange load spikes on 2.4.19 kernel
  2002-10-12  1:12 Rob Mueller
@ 2002-10-12  1:25 ` Andrew Morton
  2002-10-12  2:25   ` Rob Mueller
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2002-10-12  1:25 UTC (permalink / raw)
  To: Rob Mueller; +Cc: linux-kernel

Rob Mueller wrote:
> 
> Filesystem is ext3 with one big / partition (that's a mistake
> we won't repeat, but too late now). This should be mounted
> with data=journal given the kernel command line above, though
> it's a bit hard to tell from the dmesg log:
> 

It's possible that the journal keeps on filling.  When that happens,
everything has to wait for writeback into the main filesystem.
Completion of that writeback frees up journal space and then everything
can unblock.

Suggest you try data=ordered.


* Strange load spikes on 2.4.19 kernel
@ 2002-10-12  1:12 Rob Mueller
  2002-10-12  1:25 ` Andrew Morton
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Mueller @ 2002-10-12  1:12 UTC (permalink / raw)
  To: linux-kernel


Summary:
We've spent the last couple of days trying to diagnose some strange load
spikes we've been observing on our server. The basic symptom is that the
load will appear to be falling gradually (often down to between 0.5 and
3), and then will appear to basically *instantly* rise to a rather large
value (between 15 and 30). This occurs at intervals of around 1 to 5
minutes or so. During the spiking time, system response is significantly
slowed. Sometimes it takes up to 10 seconds just to run an 'uptime' or
'ls' command. Oddly though, once run, it appears to be much more
responsive thereafter... (no swapping is occurring; see below)

vmstat and uptime output:
To show what I mean about the load spikes, I did the following to gather
uptime and vmstat data at the same time at 5 second intervals.

vmstat -n 5 > /tmp/d1 & while true; do uptime >> /tmp/d2; sleep 5; done

These got a little out of sync after a few minutes, but it's still fairly
illustrative. I used a perl script to join these together with the time
(seconds) on the left. I've trimmed the swap columns, which didn't change,
and also the 2 longer term load average columns so it fits in 76 chars.

      procs              memory        io     system   cpu
t   r  b  w  free   buff  cache  bi    bo   in    cs  us  sy  id uptime
--|-------------------------------------------------------------|------
37| 0  0  0 66440 414628 1691196  51   983  395  963   7   8  85| 2.22,
42| 0  0  0 66860 414880 1691240  34   825  383 1023   6   9  85| 2.04,
47| 0  0  1 69736 415008 1691384  33   778  409 1285   7   9  84| 1.95,
52| 0  0  0 71596 415120 1691432   9  1021  393 1154   6   9  85| 1.80,
57| 0  0  0 66620 415972 1692476  47  4910  825 2100  11  13  76| 1.89,
02| 0  1  0 77624 416320 1694268 197  2132  761 1570  11  10  79| 1.90,
07| 0  0  0 76796 416544 1694804  99  1494  492 1209   7   9  83| 1.91,
12| 1  0  0 84736 416708 1695808 200  1178  498 1270  14  10  77| 1.76,
17| 0  0  0 86388 416792 1695824   4   972  445  994   6   7  88| 1.62,
22| 0  0  0 98276 416928 1695868   1  1322  484 1049   9  10  81| 1.49,
27| 0  1  1 94284 417164 1695564  48  1553  527 1205   9  10  81| 1.37,
32| 1  0  0 90336 417340 1695676  18  1243  497 1188   8  10  82| 17.03
37| 4  0  1 84288 417440 1695728   8   812  425 1186   6  10  84| 15.67
42| 0  0  0 89736 417648 1696504 120  1042  539 1340   9  10  81| 14.41
47| 0  1  0 85284 417764 1696692  21   852  452 1329   6  11  83| 13.34
52| 0  0  0 81272 417856 1696992  68   826  499 1552  16  12  72| 12.35
57| 0  0  0 80984 417972 1697520  22  1312  469 1223   7  10  83| 11.52
02| 0  1  2 70984 418476 1697876  38  3401  633 1530  10  11  79| 10.92
07| 0  0  0 67976 418692 1697556  13  1651  510 1444  15  11  74| 10.04
12| 1  0  0 61576 418880 1698244 132  1262  443 1040  13   8  78| 9.40,
17| 2  0  0 59852 419044 1698372  32  1249  473 1060   8   8  84| 8.65,
22| 0  0  0 58108 419268 1698584  31  1198  501 1416   8  10  82| 7.95,
27| 0  0  0 52200 419596 1698988  60  1676  708 2090  10  12  78| 7.56,
32| 0  0  0 51496 419696 1698908  25  1034  487 1280   9  10  81| 6.95,

si and so both remained at 0 the whole time, and swpd was constant at
133920 (e.g. something like this):

                               memory    swap    
 time       swpd   free   buff  cache  si  so       uptime
--------|----------------------------------------|------------------
22:54:37| 133920  66440 414628 1691196   0   0   | 2.22, 8.14, 9.14
22:54:42| 133920  66860 414880 1691240   0   0   | 2.04, 8.01, 9.09
22:54:47| 133920  69736 415008 1691384   0   0   | 1.95, 7.89, 9.05

The thing is, I can't see any reason for the load to suddenly jump from
1.37 to 17.03. Apparently there's no sudden disk activity and no sudden
CPU activity.

A few minutes later, another one of these occurs, but there's some other
odd data this time. (Again, I've removed the swap columns as there was
no swap activity at all)

      procs              memory        io     system   cpu
t   r  b  w  free   buff  cache  bi    bo   in    cs  us  sy  id uptime
--|-------------------------------------------------------------|------
17| 0  0  0 38944 425788 1708248  23   702  478 1284  11   9  80| 1.47,
22| 2  0  0 47908 426068 1708900  95  1723  594 1585   9  11  80| 1.35,
27| 0  0  0 41924 426308 1708892  66  1318  586 1625  19  11  70| 1.24,
32| 0  0  0 45108 426468 1705684  14  1099  463 1190  13   9  77| 1.30,
37| 0  0  0 45968 426536 1705760  13   864  372  651   4   7  89| 1.28,
42| 0  0  0 45776 426672 1706032  53  1003  461 1312   6  10  84| 1.17,
47| 0  0  1 44728 426824 1706116   7  1301  598 1884  10  11  79| 1.16,
52| 0  0  0 40880 427004 1706628  96  1015  496 1590   7  11  82| 1.07,
57| 1  0  0 33404 427128 1706720   8  1317  479 1309   6  11  84| 1.06,
02| 2  4  1 21756 427252 1707280  34  1514  757 2066  22  14  63| 0.98,
07| 2  0  0 14236 427612 1710672 392  2638 1231 2735  30  16  55| 18.11
12| 0  0  0 12528 426288 1706112  63  1680  563 1595  13  11  77| 17.22
17| 1  1  1 12140 424696 1704576  32  1301  567 1726  12  14  74| 15.84
22| 0  1  0 22780 425172 1704492 110  1672  796 1879  14  13  73| 14.58
27| 0 17  2 21548 425888 1704576 130   730  417 1031   4   9  87| 13.49
50| 0  1  1 18520 420628 1686628 205  3357 1790 5404   7  10  83| 12.49
55| 0  1  0 32572 420780 1686712  14  1289  584 1527   6  11  83| 14.45
00| 1  0  0 47324 420952 1686768  16  1051  493 1425   9  10  81| 18.50
05| 0  0  0 60980 421188 1687064  77  1005  611 2066   6  15  79| 23.74
10| 0  0  0 59368 421404 1687412  83  1178  600 2049  12  13  75| 22.32
15| 0  0  0 58788 421564 1687976  69  1559  612 1800  11  14  75| 20.62
20| 0  1  0 69484 421760 1688064  30  1342  499 1103  11  10  80| 19.04
25| 0  6  1 61256 421996 1688192  21  1313  530 1170   7  10  83| 17.52
30| 0 36  2 57648 421996 1688216   2    47  303  882   6   8  86| 16.12
35| 0 59  2 46020 422004 1688232   2     0  181  577   3   7  91| 14.83
40| 0 88  2 28064 422004 1688240   0     1  168  958   2   8  90| 13.64
45| 1  3  3 15968 420608 1683860 243  2803 1112 2534  31  19  50| 12.63
50| 0  0  0 36720 420812 1684072  57  1366  584 1596   8  11  80| 12.41
55| 0  1  0 54364 420960 1684360  22   971  541 1704   6  13  81| 14.78
00| 1  0  0 69724 421160 1684328  39   886  426 1371   7   8  85| 18.64
05| 0  0  0 64868 421360 1689448 158  3040  564 1279   6   9  84| 24.28
10| 0  0  0 59380 421460 1689756  53  1813  479 1492  11  13  76| 22.41
15| 0  1  0 62732 421544 1689860  16  1057  373  968   5  10  85| 20.62
20| 0  0  0 62560 421636 1689956  44   959  371 1048   6   9  85| 18.97
25| 0 20  2 59552 421836 1690344  46   950  467 1264   6   9  85| 17.45

Notice the sudden jump again, and this time there is I/O associated with
it, but not what I'd call an excessive amount. But then look 1 minute
later: all the I/O activity drops off and lots of processes sit in the
uninterruptible state, yet nothing special happens to the load before
everything starts firing up again. I thought maybe these had
got out of sync, but I double-checked, and each of these lines is
basically being dumped around the same time.
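
Since Linux counts uninterruptible (D-state) processes as well as runnable
ones in the load average, one way to see what is actually blocked during a
spike is to list D-state processes directly (a generic procps one-liner,
not something from the thread):

```shell
# List processes currently in uninterruptible sleep (state D); these are
# usually stuck in disk I/O and are counted in the load average.
ps -eo state=,pid=,comm= |
  awk '$1 ~ /^D/ { print; n++ } END { print n+0, "process(es) in D state" }'
```

Running that in a tight loop around a spike would show whether the 240-odd
blocked processes are all the same daemon waiting on the same device.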

I did one more run, this time doing a dump every 1 second to see if
there was any more detail.

 1  0  1 1393908 440380 1070184    0  1044  376  1405   9  8  83| 0.49,
 0  0  0 1393432 440384 1070264    0   760  508   771  15  5  80| 0.49,
 0  0  0 1392728 440384 1070280    0   316  292  1439   2  8  91| 0.45,
 0  0  0 1392656 440384 1070288    0   700  348   540   4  6  90| 0.45,
 0  0  0 1393772 440384 1070296    0   396  343  1294   1  9  90| 0.45,
 0  0  0 1393616 440384 1070332   24  1052  550   735   9  5  85| 0.45,
 0  0  0 1392096 440384 1070308    4  1216  543  1951   9 10  80| 0.45,
 2  0  0 1391060 440384 1070320    0   700  620  1896  15 10  75| 0.82,
 1  1  0 1390360 440404 1070380   64  1648  589  1431   7  8  85| 0.82,
 0  0  0 1390012 440432 1070580  116  1520  666  1567  14  9  77| 0.82,
 0  0  0 1389732 440432 1070592    8   852  347   623   5  3  92| 0.82,
 1  0  0 1390420 440432 1070604    0  1084  372   859   2  6  92| 0.82,
 0  0  0 1390104 440436 1070624    8  1004  493  1276   9  7  84| 17.33
 0  0  0 1390388 440436 1070628    0   656  369  1067   5  5  90| 17.33
 0  0  0 1390712 440436 1070636    4  1024  508  1226  10  8  82| 17.33
 0  0  0 1390324 440452 1070660   20  1200  434  1480   6  7  87| 17.33
 0  0  0 1389388 440452 1070672    0  1044  458  1403   9  8  83| 17.33
 0  0  0 1403856 440452 1070660    0   492  462   997  15  6  79| 15.94
 0  0  0 1403744 440476 1070736   84  1236  442  1230   1 10  89| 15.94
 2  0  1 1402648 440476 1070728    0  1200  567  1574  10  9  81| 15.94
 0  1  0 1400692 440524 1071960 1232  1216  725  1771  14 12  74| 15.94
 0  0  0 1399144 440560 1073840 1892  1252  839  2063  11 10  79| 15.94
 1  0  0 1399112 440564 1073832    4   712  404   986   1  7  92| 14.74
 0  0  0 1399888 440564 1073840   32   608  575  1015  13  6  81| 14.74
 1  0  0 1399240 440564 1073836    0   856  541  1249   6  6  88| 14.74
 0  0  0 1398340 440564 1073844    0  1368  790  1872  19 11  70| 14.74
 0  0  0 1397800 440584 1073916   68  1196  597  1440   5  8  87| 14.74
 0  0  0 1397836 440584 1073996   92   608  456   896   2  6  93| 13.56
 0  0  0 1397764 440584 1073992    0   756  422   953   0  7  93| 13.56

About 10 seconds after the spike, there's a big I/O batch. Coincidence?
Could that have something to do with it? And why was there nothing like
that in the first list?

Machine usage:
This server is being used as both an IMAP server and a web server. We're
using Apache 1.3.27 with mod_perl as the web server, and Cyrus 2.1.7 as
the IMAP server. We use Apache/mod_accel and perdition on another machine
to act as web/IMAP proxies to the appropriate backend machine (this is
just one of them). The server never comes anywhere close to being CPU
bound (usually 80% idle on both CPUs).

Specification:
Dual AMD Athlon 2000+ MP
Tyan Motherboard
4GB RAM
LSI Logic MegaRAID Express 500 Raid Controller
5 x 76 GB HDs in RAID5 config

Kernel: (lines I thought relevant from /var/log/dmesg)
Linux version 2.4.19 (root@server5.fastmail.fm) (gcc version 2.96 20000731
(Red Hat Linux 7.3 2.96-112)) #2 SMP Mon Oct 7 18:18:45 CDT 2002
megaraid: v1.18 (Release Date: Thu Oct 11 15:02:53 EDT 2001)
scsi0 : LSI Logic MegaRAID h132 254 commands 16 targs 4 chans 7 luns
kjournald starting.  Commit interval 5 seconds

Kernel command line: ro root=/dev/sda2 rootflags=data=journal noapic

Typical 'top' output:
  4:49am  up  3:13,  2 users,  load average: 10.60, 11.69, 12.73
1016 processes: 1015 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 13.1% user,  6.2% system,  0.0% nice, 80.1% idle
CPU1 states:  9.1% user,  9.5% system,  0.0% nice, 80.4% idle
Mem:  3873940K av, 2729812K used, 1144128K free,       0K shrd,  456512K buff
Swap: 1052248K av,       0K used, 1052248K free                 1209064K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  798 nobody    10   0 31356  30M 20844 S     4.5  0.8   0:01 httpd
32296 root      16   0  1572 1572   736 R     4.4  0.0   0:21 top
  601 nobody    10   0 31772  31M 19472 S     3.4  0.8   0:03 httpd
  607 nobody    10   0 31684  30M 20296 S     2.1  0.8   0:03 httpd
 1093 cyrus      9   0 14392  14M 14088 S     2.1  0.3   0:00 imapd
32477 cyrus      9   0 14356  14M 14052 S     1.9  0.3   0:00 imapd
 1007 cyrus      9   0 14256  13M 13952 S     1.7  0.3   0:00 imapd
 1097 cyrus      9   0 14216  13M 13920 S     1.7  0.3   0:00 imapd
   10 root      11   0     0    0     0 SW    1.1  0.0   1:11 kjournald
  941 nobody     9   0 30556  29M 20896 S     1.1  0.7   0:00 httpd
  608 nobody     9   0 29740  29M 20408 S     0.9  0.7   0:02 httpd
  745 nobody     9   0  5236 5236  1528 S     0.5  0.1   0:44 rated.pl
 1441 postfix    9   0  1356 1356  1000 S     0.5  0.0   0:05 qmgr
 1452 nobody     5 -10  4668 4668  1572 S <   0.5  0.1   0:27 imappool.pl
30613 cyrus     10   0  7120 7120  5228 S     0.5  0.1   0:02 saslperld.pl
 1094 cyrus      9   0  1596 1592  1296 S     0.5  0.0   0:00 imapd
    7 root      11   0     0    0     0 SW    0.3  0.0   0:44 kupdated
  808 cyrus     15   0  1364 1364   844 S     0.3  0.0   0:18 master
 1114 cyrus     11   0  1232 1228   980 S     0.3  0.0   0:00 pop3d

(this doesn't change radically, whether the load is in a low spot or
a high spot)

/proc/meminfo output:
[root@server5 flush]# cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  3966914560 2814578688 1152335872        0 468127744 1246683136
Swap: 1077501952        0 1077501952
MemTotal:      3873940 kB
MemFree:       1125328 kB
MemShared:           0 kB
Buffers:        457156 kB
Cached:        1217464 kB
SwapCached:          0 kB
Active:        1062396 kB
Inactive:      1461100 kB
HighTotal:     3006400 kB
HighFree:       937256 kB
LowTotal:       867540 kB
LowFree:        188072 kB
SwapTotal:     1052248 kB
SwapFree:      1052248 kB

Other:
All syslog files are async (- prefix)

Filesystem is ext3 with one big / partition (that's a mistake
we won't repeat, but too late now). This should be mounted
with data=journal given the kernel command line above, though
it's a bit hard to tell from the dmesg log:

EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with journal data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 120k freed
Adding Swap: 1052248k swap-space (priority -1)
EXT3 FS 2.4-0.9.17, 10 Jan 2002 on sd(8,2), internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.17, 10 Jan 2002 on sd(8,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.

We've tried enabling the write-back mode on the SCSI controller.
Unfortunately the box is in a colo, so we can only take their word
that they've enabled it in the BIOS, even though when we do:

[root@server5 flush]# scsiinfo -c /dev/sda

Data from Caching Page
----------------------
Write Cache                        0
Read Cache                         1
Prefetch units                     0
Demand Read Retention Priority     0
Demand Write Retention Priority    0
Disable Pre-fetch Transfer Length  0
Minimum Pre-fetch                  0
Maximum Pre-fetch                  0
Maximum Pre-fetch Ceiling          0

We're not convinced...

We've also tuned the bdflush parameters a few times. Based on this
article:
http://www-106.ibm.com/developerworks/linux/library/l-fs8.html?dwzone=linux

[root@server5 hm]# cat /proc/sys/vm/bdflush
39      500     0       0       60      300     60      0       0

We also tried playing around with these some more, lowering the
nfract, age_buffer and ndirty values, but none of it seemed to
make any difference.

If anyone has any clues as to what might be causing this, and what to
do about it, we'd really appreciate some information. At the moment
we're at a complete loss about where to go from here. Happy to supply
any extra info that people might think would be helpful.

Rob Mueller



end of thread, other threads:[~2002-12-12  0:24 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <Pine.LNX.4.33.0210130202070.17395-100000@coffee.psychology.mcmaster.ca>
2002-10-13  6:34 ` Strange load spikes on 2.4.19 kernel Rob Mueller
2002-10-13  7:01   ` Joseph D. Wagner
2002-10-13  7:01     ` David S. Miller
2002-10-13  7:49       ` Joseph D. Wagner
2002-10-13  7:50         ` David S. Miller
2002-10-13  8:16           ` Joseph D. Wagner
2002-10-13  8:13             ` David S. Miller
2002-10-13  8:40               ` Joseph D. Wagner
2002-10-13  8:45                 ` David S. Miller
2002-10-13  8:48                 ` Mike Galbraith
2002-10-13  8:48                   ` David S. Miller
2002-10-13  8:51                 ` William Lee Irwin III
2002-10-13 10:20                 ` Ingo Molnar
2002-10-13 19:42                 ` Rik van Riel
2002-10-16 21:00         ` Bill Davidsen
2002-10-13  8:59     ` Anton Blanchard
2002-10-13  9:26       ` William Lee Irwin III
2002-10-13 12:31   ` Marius Gedminas
2002-12-11 22:54 Steven Roussey
2002-12-11 23:09 ` Andrew Morton
2002-12-11 23:54   ` Steven Roussey
2002-12-12  0:13     ` Andrew Morton
2002-12-12  0:31       ` Steven Roussey
     [not found] <113001c27282$93955eb0$1900a8c0@lifebook.suse.lists.linux.kernel>
     [not found] ` <000001c27286$6ab6bc60$7443f4d1@joe.suse.lists.linux.kernel>
     [not found]   ` <20021013.000127.43007739.davem@redhat.com.suse.lists.linux.kernel>
2002-10-13  7:24     ` Andi Kleen
2002-10-13  7:21       ` David S. Miller
2002-10-13 11:47       ` Hugh Dickins
2002-10-13 18:29         ` Andrew Morton
     [not found] <Pine.LNX.4.33.0210121605490.16179-100000@coffee.psychology.mcmaster.ca>
2002-10-13  0:49 ` Rob Mueller
     [not found] <001401c2719b$9d45c4a0$53241c43@joe>
2002-10-12  6:54 ` Rob Mueller
  -- strict thread matches above, loose matches on Subject: below --
2002-10-12  3:13 Joseph D. Wagner
2002-10-12  3:10 Joseph D. Wagner
2002-10-12  1:12 Rob Mueller
2002-10-12  1:25 ` Andrew Morton
2002-10-12  2:25   ` Rob Mueller
2002-10-12  3:07     ` Andrew Morton
2002-10-12  6:37       ` Rob Mueller
2002-10-12  6:44         ` Andrew Morton
2002-10-12  6:52           ` Rob Mueller
2002-10-12  7:00             ` Andrew Morton
2002-10-13  6:14               ` Rob Mueller
2002-10-13  7:27                 ` Simon Kirby

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).