linux-kernel.vger.kernel.org archive mirror
* Where's all my memory going?
@ 2002-01-09 17:36 Matt Dainty
  2002-01-09 17:47 ` Alan Cox
  0 siblings, 1 reply; 14+ messages in thread
From: Matt Dainty @ 2002-01-09 17:36 UTC (permalink / raw)
  To: linux-kernel

Hi,

I've fashioned a qmail mail server using an HP NetServer with an HP NetRaid
4M & 1GB RAM, running 2.4.17 with aacraid, LVM, ext3 and highmem. The box
has 6x 9GB disks, one for system, one for qmail's queue, and the remaining
four are RAID5'd with LVM. ext3 is only on the queue disk, ext2 everywhere
else.

Before I stick the box live, I wanted to test that it doesn't fall over
under any remote kind of stress, so I've run postal to simulate lots of mail
connections.

Nothing too hard to begin with, but I'm seeing a degradation in performance
over time, using a maximum message size of 10KB, 5 simultaneous connections,
and limiting to 1500 messages per minute.

Initially the box memory situation is like this:

root@plum:~# free
             total       used       free     shared    buffers     cached
Mem:       1029524      78948     950576          0      26636      23188
-/+ buffers/cache:      29124    1000400
Swap:      2097136          0    2097136

...running postal, it seems to cope fine. Checking the queue using
qmail-qstat shows no messages being delayed for delivery; everything I chuck
at it is being delivered straight away.

However, over time, (30-45 minutes), more and more memory seems to just
disappear from the system until it looks like this, (note that swap is
hardly ever touched):

root@plum:~# free
             total       used       free     shared    buffers     cached
Mem:       1029524    1018032      11492          0      49380     245568
-/+ buffers/cache:     723084     306440
Swap:      2097136        676    2096460

...and qmail-qstat reports a few thousand queued messages. Even if I stop
the postal process, let the queue empty and start again, it never attains
the same performance as it did initially, and the queue slowly fills up again.

I haven't left it long enough to see if the box grinds itself into the
ground, but it appears to stay at pretty much the same level as above once
it gets there. CPU load stays at around 5.0 (PIII 533), but it's still
very responsive to input and launching stuff.

Looking at the processes, the biggest memory hog is a copy of dnscache that
claims to have used ~10MB, which is fine as I specified a cache of that size.
Nothing else shows any hint of excessive memory usage.

Can anyone offer any advice or a solution to this behaviour (or more tricks
or settings I can try)? I'd like the mail server to be able to handle 1500
messages a minute instead of 150! :-) Any extra info required, please let me
know; I'm not sure what else to provide atm.

Cheers

Matt
-- 
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"

* Re: Where's all my memory going?
  2002-01-09 17:36 Where's all my memory going? Matt Dainty
@ 2002-01-09 17:47 ` Alan Cox
  2002-01-09 22:36   ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2002-01-09 17:47 UTC (permalink / raw)
  To: Matt Dainty; +Cc: linux-kernel

> However, over time, (30-45 minutes), more and more memory seems to just
> disappear from the system until it looks like this, (note that swap is
> hardly ever touched):

I don't see any disappearing memory. Remember that Linux will intentionally
keep memory filled with cache pages when it is possible. The rest I can't
help with - I'm not familiar enough with qmail to know what limits it places
internally, or where it and/or the kernel might interact to cause
bottlenecks.

* Re: Where's all my memory going?
  2002-01-09 17:47 ` Alan Cox
@ 2002-01-09 22:36   ` Rik van Riel
  2002-01-10  8:45     ` Bruce Guenter
  0 siblings, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2002-01-09 22:36 UTC (permalink / raw)
  To: Alan Cox; +Cc: Matt Dainty, linux-kernel

On Wed, 9 Jan 2002, Alan Cox wrote:

> > However, over time, (30-45 minutes), more and more memory seems to just
> > disappear from the system until it looks like this, (note that swap is
> > hardly ever touched):
>
> I don't see any disappearing memory. Remember that Linux will
> intentionally keep memory filled with cache pages when it is possible.

Matt's system seems to go from 900 MB free to about
300 MB (free + cache).

I doubt qmail would eat 600 MB of RAM (it might, I
just doubt it) so I'm curious where the RAM is going.

Matt, do you see any suspiciously high numbers in
/proc/slabinfo ?

regards,

Rik
-- 
"Linux holds advantages over the single-vendor commercial OS"
    -- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/


* Re: Where's all my memory going?
  2002-01-09 22:36   ` Rik van Riel
@ 2002-01-10  8:45     ` Bruce Guenter
  2002-01-10 10:05       ` Andreas Dilger
  0 siblings, 1 reply; 14+ messages in thread
From: Bruce Guenter @ 2002-01-10  8:45 UTC (permalink / raw)
  To: linux-kernel

On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> Matt's system seems to go from 900 MB free to about
> 300 MB (free + cache).
> 
> I doubt qmail would eat 600 MB of RAM (it might, I
> just doubt it) so I'm curious where the RAM is going.

I am seeing the same symptoms, with similar use -- ext3 filesystems
running qmail.  Adding up the RSS of all the processes in use gives
about 75MB, while free shows:

             total       used       free     shared    buffers     cached
Mem:        901068     894088       6980          0     157568     113856
-/+ buffers/cache:     622664     278404
Swap:      1028152      10468    1017684

These are fairly consistent numbers.  buffers hovers around 150MB and
cached around 110MB all day.  The server is heavy on write traffic.

> Matt, do you see any suspiciously high numbers in
> /proc/slabinfo ?

What would be suspiciously high?  The four biggest numbers I see are:

inode_cache       139772 204760    480 25589 25595    1
dentry_cache      184024 326550    128 10885 10885    1
buffer_head       166620 220480     96 4487 5512    1
size-64           102388 174876     64 2964 2964    1

I can post complete details for any who wish to investigate further.  I
am not seeing a huge slowdown, but I have no real baseline to compare
against.
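
As a rough way to turn those slabinfo rows into megabytes, here is a minimal
sketch in Python (assuming the 2.4-style column layout shown above: name,
active objects, total objects, object size, active slabs, total slabs, pages
per slab, with a 4KB page size; cache names containing spaces are skipped to
keep the parsing trivial):

PAGE_SIZE = 4096

def slab_usage(path="/proc/slabinfo"):
    # Columns assumed (2.4-style): name active_objs num_objs objsize
    # active_slabs num_slabs pages_per_slab [: limit batchcount].
    usage = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 7 or not fields[1].isdigit():
                continue  # header line, or a cache name with embedded spaces
            num_slabs = int(fields[5])
            pages_per_slab = int(fields[6])
            usage.append((num_slabs * pages_per_slab * PAGE_SIZE, fields[0]))
    return sorted(usage, reverse=True)

for nbytes, name in slab_usage()[:10]:
    print("%-20s %8.1f MB" % (name, nbytes / 1048576.0))

On the figures above, inode_cache alone works out to roughly
25595 slabs * 4KB ~= 100MB, with dentry_cache adding another ~40MB or so.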
-- 
Bruce Guenter <bruceg@em.ca> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA  2E2A E96F B2DC 6999 80E8

* Re: Where's all my memory going?
  2002-01-10  8:45     ` Bruce Guenter
@ 2002-01-10 10:05       ` Andreas Dilger
  2002-01-10 11:28         ` Matt Dainty
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Andreas Dilger @ 2002-01-10 10:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Bruce Guenter, Rik van Riel

On Jan 10, 2002  02:45 -0600, Bruce Guenter wrote:
> On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > Matt's system seems to go from 900 MB free to about
> > 300 MB (free + cache).
> > 
> > I doubt qmail would eat 600 MB of RAM (it might, I
> > just doubt it) so I'm curious where the RAM is going.
> 
> I am seeing the same symptoms, with similar use -- ext3 filesystems
> running qmail.

Hmm, does qmail put each piece of email in a separate file?  That
might explain a lot about what is going on here.

> Adding up the RSS of all the processes in use gives
> about 75MB, while free shows:
> 
>              total       used       free     shared    buffers     cached
> Mem:        901068     894088       6980          0     157568     113856
> -/+ buffers/cache:     622664     278404
> Swap:      1028152      10468    1017684
> 
> These are fairly consistent numbers.  buffers hovers around 150MB and
> cached around 110MB all day.  The server is heavy on write traffic.
> 
> > Matt, do you see any suspiciously high numbers in
> > /proc/slabinfo ?
> 
> What would be suspiciously high?  The four biggest numbers I see are:
> 
> inode_cache       139772 204760    480 25589 25595    1
> dentry_cache      184024 326550    128 10885 10885    1
> buffer_head       166620 220480     96 4487 5512    1
> size-64           102388 174876     64 2964 2964    1

Well, these numbers _are_ high, but with 1GB of RAM you have to use it all
_somewhere_.  It looks like you don't have much memory pressure, because
there is lots of free space in these slabs that could probably be freed
easily.

I'm thinking that if you get _lots_ of dentry and inode items (especially
under the "postal" benchmark) you may not be able to free the negative
dentries for all of the created/deleted files in the mailspool (all of
which will have unique names).  There is a deadlock path in the VM that
has to be avoided, and as a result it makes it harder to free dentries
under certain uncommon loads.

I had a "use once" patch for negative dentries that allowed the VM to
free negative dentries easily if they are never referenced again.  It
is a bit old, but it should be pretty close to applying.  I have been
using it for months without problems (although I don't really stress
it very much in this regard).

The other question would of course be whether we are calling into
shrink_dcache_memory() enough, but that is an issue for Matt to
see by testing "postal" with and without the patch, and keeping an
eye on the slab caches.

Cheers, Andreas
======================= dcache-2.4.13-neg.diff ============================
--- linux.orig/fs/dcache.c	Thu Oct 25 01:50:30 2001
+++ linux/fs/dcache.c	Thu Oct 25 00:02:58 2001
@@ -137,7 +137,16 @@
 	/* Unreachable? Get rid of it */
 	if (list_empty(&dentry->d_hash))
 		goto kill_it;
-	list_add(&dentry->d_lru, &dentry_unused);
+	if (dentry->d_inode) {
+		list_add(&dentry->d_lru, &dentry_unused);
+	} else {
+		/* Put an unused negative inode to the end of the list.
+		 * If it is not referenced again before we need to free some
+		 * memory, it will be the first to be freed.
+		 */
+		dentry->d_vfs_flags &= ~DCACHE_REFERENCED;
+		list_add_tail(&dentry->d_lru, &dentry_unused);
+	}
 	dentry_stat.nr_unused++;
 	spin_unlock(&dcache_lock);
 	return;
@@ -306,8 +315,9 @@
 }
 
 /**
- * prune_dcache - shrink the dcache
+ * _prune_dcache - shrink the dcache
  * @count: number of entries to try and free
+ * @gfp_mask: context under which we are trying to free memory
  *
  * Shrink the dcache. This is done when we need
  * more memory, or simply when we need to unmount
@@ -318,7 +328,7 @@
  * all the dentries are in use.
  */
  
-void prune_dcache(int count)
+void _prune_dcache(int count, unsigned int gfp_mask)
 {
 	spin_lock(&dcache_lock);
 	for (;;) {
@@ -329,15 +339,32 @@
 
 		if (tmp == &dentry_unused)
 			break;
-		list_del_init(tmp);
 		dentry = list_entry(tmp, struct dentry, d_lru);
 
 		/* If the dentry was recently referenced, don't free it. */
 		if (dentry->d_vfs_flags & DCACHE_REFERENCED) {
+			list_del_init(tmp);
 			dentry->d_vfs_flags &= ~DCACHE_REFERENCED;
 			list_add(&dentry->d_lru, &dentry_unused);
 			continue;
 		}
+
+		/*
+		 * Nasty deadlock avoidance.
+		 *
+		 * ext2_new_block->getblk->GFP->shrink_dcache_memory->
+		 * prune_dcache->prune_one_dentry->dput->dentry_iput->iput->
+		 * inode->i_sb->s_op->put_inode->ext2_discard_prealloc->
+		 * ext2_free_blocks->lock_super->DEADLOCK.
+		 *
+		 * We should make sure we don't hold the superblock lock over
+		 * block allocations, but for now we will only free unused
+		 * negative dentries (which are added at the end of the list).
+		 */
+		if (dentry->d_inode && !(gfp_mask & __GFP_FS))
+			break;
+
+		list_del_init(tmp);
 		dentry_stat.nr_unused--;
 
 		/* Unused dentry with a count? */
@@ -351,6 +378,11 @@
 	spin_unlock(&dcache_lock);
 }
 
+void prune_dcache(int count)
+{
+	_prune_dcache(count, __GFP_FS);
+}
+
 /*
  * Shrink the dcache for the specified super block.
  * This allows us to unmount a device without disturbing
@@ -549,26 +581,11 @@
  */
 int shrink_dcache_memory(int priority, unsigned int gfp_mask)
 {
-	int count = 0;
-
-	/*
-	 * Nasty deadlock avoidance.
-	 *
-	 * ext2_new_block->getblk->GFP->shrink_dcache_memory->prune_dcache->
-	 * prune_one_dentry->dput->dentry_iput->iput->inode->i_sb->s_op->
-	 * put_inode->ext2_discard_prealloc->ext2_free_blocks->lock_super->
-	 * DEADLOCK.
-	 *
-	 * We should make sure we don't hold the superblock lock over
-	 * block allocations, but for now:
-	 */
-	if (!(gfp_mask & __GFP_FS))
-		return 0;
-
-	count = dentry_stat.nr_unused / priority;
+	int count = dentry_stat.nr_unused / (priority + 1);
 
-	prune_dcache(count);
+	_prune_dcache(count, gfp_mask);
 	kmem_cache_shrink(dentry_cache);
+
 	return 0;
 }
 
@@ -590,8 +607,15 @@
 	struct dentry *dentry;
 
 	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL); 
-	if (!dentry)
-		return NULL;
+	if (!dentry) {
+		/* Try to free some unused dentries from the cache, but do
+		 * not call into the filesystem to do so (avoid deadlock).
+		 */
+		_prune_dcache(16, GFP_NOFS);
+		dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
+		if (!dentry)
+			return NULL;
+	}
 
 	if (name->len > DNAME_INLINE_LEN-1) {
 		str = kmalloc(NAME_ALLOC_LEN(name->len), GFP_KERNEL);
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


* Re: Where's all my memory going?
  2002-01-10 10:05       ` Andreas Dilger
@ 2002-01-10 11:28         ` Matt Dainty
  2002-01-10 14:55         ` Matt Dainty
  2002-01-10 22:18         ` Bruce Guenter
  2 siblings, 0 replies; 14+ messages in thread
From: Matt Dainty @ 2002-01-10 11:28 UTC (permalink / raw)
  To: linux-kernel

On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> On Jan 10, 2002  02:45 -0600, Bruce Guenter wrote:
> > On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > > Matt's system seems to go from 900 MB free to about
> > > 300 MB (free + cache).
> > > 
> > > I doubt qmail would eat 600 MB of RAM (it might, I
> > > just doubt it) so I'm curious where the RAM is going.

I'm fairly sure we can eliminate qmail, as most of its processes are
short-lived and often invoked per-delivery; only a few processes stay running
for any length of time.

> > I am seeing the same symptoms, with similar use -- ext3 filesystems
> > running qmail.

Heh, it's your fault I'm using ext3 for the queue! :P

> Hmm, does qmail put each piece of email in a separate file?  That
> might explain a lot about what is going on here.

Yes, in more places than one in the best setups. The queue stores each
message in separate areas as it moves through the system, and the queue names
each message after the inode it resides on. (I probably haven't explained that
too well; I'm sure Bruce can elaborate :-). When the message is delivered
locally, using djb's Maildir mailbox format, the message will be stored
as a separate file too, most commonly under ~user/Maildir/new/.
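
To make that access pattern concrete, here is a minimal sketch of
Maildir-style delivery in Python (the filename scheme is simplified and is
not qmail's exact one; the fsync() merely stands in for forcing the message
to disk). The point is that every delivery creates a brand-new, uniquely
named file, which is exactly the kind of load that grows the dentry and
inode caches:

import os, socket, time

def maildir_deliver(maildir, message):
    # Write the message into tmp/ under a unique name, then rename() it into
    # new/.  rename() is atomic within a filesystem, so a reader never sees
    # a half-written message.  (Illustrative only, not qmail's actual code.)
    unique = "%d.%d.%s" % (int(time.time()), os.getpid(), socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, message)
        os.fsync(fd)  # roughly what syncdir / chattr +S aim to guarantee
    finally:
        os.close(fd)
    os.rename(tmp_path, new_path)
    return new_path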

I originally thought ReiserFS would be good for this, but the benchmarks
Bruce did showed that in fact ext3 is better (using data=journal, and
using the syncdir library to force synchronous behaviour on open(), etc.,
similar to chattr +S). I've also used 'noatime' to coax some more speed
out of it.

> > Adding up the RSS of all the processes in use gives
> > about 75MB, while free shows:
> > 
> >              total       used       free     shared    buffers     cached
> > Mem:        901068     894088       6980          0     157568     113856
> > -/+ buffers/cache:     622664     278404
> > Swap:      1028152      10468    1017684
> > 
> > These are fairly consistent numbers.  buffers hovers around 150MB and
> > cached around 110MB all day.  The server is heavy on write traffic.
> > 
> > > Matt, do you see any suspiciously high numbers in
> > > /proc/slabinfo ?

I'll have another run and see what happens...

> The other question would of course be whether we are calling into
> shrink_dcache_memory() enough, but that is an issue for Matt to
> see by testing "postal" with and without the patch, and keeping an
> eye on the slab caches.

I'll try this patch and see how it performs.

Cheers

Matt
-- 
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"

* Re: Where's all my memory going?
  2002-01-10 10:05       ` Andreas Dilger
  2002-01-10 11:28         ` Matt Dainty
@ 2002-01-10 14:55         ` Matt Dainty
  2002-01-10 16:17           ` David Rees
  2002-01-10 20:46           ` Andreas Dilger
  2002-01-10 22:18         ` Bruce Guenter
  2 siblings, 2 replies; 14+ messages in thread
From: Matt Dainty @ 2002-01-10 14:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andreas Dilger, Bruce Guenter, Rik van Riel

On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> On Jan 10, 2002  02:45 -0600, Bruce Guenter wrote:
> > On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > 
> > > Matt, do you see any suspiciously high numbers in
> > > /proc/slabinfo ?
> > 
> > What would be suspiciously high?  The four biggest numbers I see are:
> > 
> > inode_cache       139772 204760    480 25589 25595    1
> > dentry_cache      184024 326550    128 10885 10885    1
> > buffer_head       166620 220480     96 4487 5512    1
> > size-64           102388 174876     64 2964 2964    1

Pretty much the same as Bruce here, mostly same culprits anyway:

inode_cache        84352  90800    480 11340 11350    1 :  124   62
dentry_cache      240060 240060    128 8002 8002    1 :  252  126
buffer_head       215417 227760     96 5694 5694    1 :  252  126
size-32           209954 209954     32 1858 1858    1 :  252  126

> The other question would of course be whether we are calling into
> shrink_dcache_memory() enough, but that is an issue for Matt to
> see by testing "postal" with and without the patch, and keeping an
> eye on the slab caches.

Patch applied cleanly, and I redid the 'test'. I've attached the output
of free and /proc/slabinfo; *.1 is without the patch, *.2 is with. In both
cases postal was left to run for about 35 minutes, by which time it had
delivered around 54000 messages locally.

Overall, with the patch, the large numbers in /proc/slabinfo are *still*
large, but not as large as without the patch. Overall memory usage still
seems similar.

Matt
-- 
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"

[-- Attachment #2: before.1 --]
[-- Type: text/plain, Size: 4221 bytes --]

             total       used       free     shared    buffers     cached
Mem:       1029524     113744     915780          0      30988      32156
-/+ buffers/cache:      50600     978924
Swap:      2097136          0    2097136

slabinfo - version: 1.1 (SMP)
kmem_cache            64     64    244    4    4    1 :  252  126
devfsd_event         126    169     20    1    1    1 :  252  126
tcp_tw_bucket         40     40     96    1    1    1 :  252  126
tcp_bind_bucket      113    113     32    1    1    1 :  252  126
tcp_open_request      59     59     64    1    1    1 :  252  126
inet_peer_cache        0      0     64    0    0    1 :  252  126
ip_fib_hash          113    113     32    1    1    1 :  252  126
ip_dst_cache          96     96    160    4    4    1 :  252  126
arp_cache             30     30    128    1    1    1 :  252  126
blkdev_requests      800    800     96   20   20    1 :  252  126
journal_head         390    390     48    5    5    1 :  252  126
revoke_table         126    253     12    1    1    1 :  252  126
revoke_record          0      0     32    0    0    1 :  252  126
dnotify cache          0      0     20    0    0    1 :  252  126
file lock cache       42     42     92    1    1    1 :  252  126
fasync cache           0      0     16    0    0    1 :  252  126
uid_cache            113    113     32    1    1    1 :  252  126
skbuff_head_cache     96     96    160    4    4    1 :  252  126
sock                  18     18    832    2    2    2 :  124   62
sigqueue              29     29    132    1    1    1 :  252  126
cdev_cache           413    413     64    7    7    1 :  252  126
bdev_cache            59     59     64    1    1    1 :  252  126
mnt_cache             59     59     64    1    1    1 :  252  126
inode_cache        37352  37352    480 4669 4669    1 :  124   62
dentry_cache       47280  47280    128 1576 1576    1 :  252  126
dquot                  0      0    128    0    0    1 :  252  126
filp                 420    420    128   14   14    1 :  252  126
names_cache           21     21   4096   21   21    1 :   60   30
buffer_head        16060  17320     96  409  433    1 :  252  126
mm_struct             72     72    160    3    3    1 :  252  126
vm_area_struct       674    800     96   20   20    1 :  252  126
fs_cache             118    118     64    2    2    1 :  252  126
files_cache           63     63    416    7    7    1 :  124   62
signal_act            69     69   1312   23   23    1 :   60   30
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384             3      3  16384    3    3    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              2      3   8192    2    3    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :   60   30
size-4096             24     24   4096   24   24    1 :   60   30
size-2048(DMA)         0      0   2048    0    0    1 :   60   30
size-2048             78     78   2048   39   39    1 :   60   30
size-1024(DMA)         0      0   1024    0    0    1 :  124   62
size-1024             96     96   1024   24   24    1 :  124   62
size-512(DMA)          0      0    512    0    0    1 :  124   62
size-512             136    136    512   17   17    1 :  124   62
size-256(DMA)          0      0    256    0    0    1 :  252  126
size-256              45     45    256    3    3    1 :  252  126
size-128(DMA)          0      0    128    0    0    1 :  252  126
size-128            1170   1170    128   39   39    1 :  252  126
size-64(DMA)           0      0     64    0    0    1 :  252  126
size-64              236    236     64    4    4    1 :  252  126
size-32(DMA)           0      0     32    0    0    1 :  252  126
size-32             5424   5424     32   48   48    1 :  252  126

[-- Attachment #3: after.1 --]
[-- Type: text/plain, Size: 4222 bytes --]

             total       used       free     shared    buffers     cached
Mem:       1029524     992848      36676          0      54296     139212
-/+ buffers/cache:     799340     230184
Swap:      2097136        116    2097020
slabinfo - version: 1.1 (SMP)
kmem_cache            64     64    244    4    4    1 :  252  126
devfsd_event           0      0     20    0    0    1 :  252  126
tcp_tw_bucket        134   1520     96   18   38    1 :  252  126
tcp_bind_bucket      113    113     32    1    1    1 :  252  126
tcp_open_request      59     59     64    1    1    1 :  252  126
inet_peer_cache        0      0     64    0    0    1 :  252  126
ip_fib_hash           14    113     32    1    1    1 :  252  126
ip_dst_cache          96     96    160    4    4    1 :  252  126
arp_cache              9     30    128    1    1    1 :  252  126
blkdev_requests      640    640     96   16   16    1 :  252  126
journal_head         372    624     48    7    8    1 :  252  126
revoke_table           1    253     12    1    1    1 :  252  126
revoke_record          0      0     32    0    0    1 :  252  126
dnotify cache          0      0     20    0    0    1 :  252  126
file lock cache       42     42     92    1    1    1 :  252  126
fasync cache           0      0     16    0    0    1 :  252  126
uid_cache            113    113     32    1    1    1 :  252  126
skbuff_head_cache    110    120    160    5    5    1 :  252  126
sock                  27     27    832    3    3    2 :  124   62
sigqueue              29     29    132    1    1    1 :  252  126
cdev_cache           358    413     64    7    7    1 :  252  126
bdev_cache            40     59     64    1    1    1 :  252  126
mnt_cache             19     59     64    1    1    1 :  252  126
inode_cache        84352  90800    480 11340 11350    1 :  124   62
dentry_cache      240060 240060    128 8002 8002    1 :  252  126
dquot                  0      0    128    0    0    1 :  252  126
filp                 477    480    128   16   16    1 :  252  126
names_cache           11     11   4096   11   11    1 :   60   30
buffer_head       215417 227760     96 5694 5694    1 :  252  126
mm_struct             96     96    160    4    4    1 :  252  126
vm_area_struct       668    920     96   22   23    1 :  252  126
fs_cache              69    118     64    2    2    1 :  252  126
files_cache           69     72    416    8    8    1 :  124   62
signal_act            75     75   1312   25   25    1 :   60   30
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384             3      3  16384    3    3    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              2      2   8192    2    2    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :   60   30
size-4096             26     26   4096   26   26    1 :   60   30
size-2048(DMA)         0      0   2048    0    0    1 :   60   30
size-2048             98     98   2048   49   49    1 :   60   30
size-1024(DMA)         0      0   1024    0    0    1 :  124   62
size-1024             96     96   1024   24   24    1 :  124   62
size-512(DMA)          0      0    512    0    0    1 :  124   62
size-512             136    136    512   17   17    1 :  124   62
size-256(DMA)          0      0    256    0    0    1 :  252  126
size-256              45     45    256    3    3    1 :  252  126
size-128(DMA)          0      0    128    0    0    1 :  252  126
size-128            1104   1230    128   41   41    1 :  252  126
size-64(DMA)           0      0     64    0    0    1 :  252  126
size-64              175    236     64    4    4    1 :  252  126
size-32(DMA)           0      0     32    0    0    1 :  252  126
size-32           209954 209954     32 1858 1858    1 :  252  126

[-- Attachment #4: before.2 --]
[-- Type: text/plain, Size: 4221 bytes --]

             total       used       free     shared    buffers     cached
Mem:       1029524      37764     991760          0       1988      11008
-/+ buffers/cache:      24768    1004756
Swap:      2097136          0    2097136

slabinfo - version: 1.1 (SMP)
kmem_cache            64     64    244    4    4    1 :  252  126
devfsd_event         126    169     20    1    1    1 :  252  126
tcp_tw_bucket          0      0     96    0    0    1 :  252  126
tcp_bind_bucket      113    113     32    1    1    1 :  252  126
tcp_open_request      59     59     64    1    1    1 :  252  126
inet_peer_cache        0      0     64    0    0    1 :  252  126
ip_fib_hash          113    113     32    1    1    1 :  252  126
ip_dst_cache          24     24    160    1    1    1 :  252  126
arp_cache             30     30    128    1    1    1 :  252  126
blkdev_requests      800    800     96   20   20    1 :  252  126
journal_head          78     78     48    1    1    1 :  252  126
revoke_table         126    253     12    1    1    1 :  252  126
revoke_record          0      0     32    0    0    1 :  252  126
dnotify cache          0      0     20    0    0    1 :  252  126
file lock cache       42     42     92    1    1    1 :  252  126
fasync cache           0      0     16    0    0    1 :  252  126
uid_cache            113    113     32    1    1    1 :  252  126
skbuff_head_cache     96     96    160    4    4    1 :  252  126
sock                  18     18    832    2    2    2 :  124   62
sigqueue              29     29    132    1    1    1 :  252  126
cdev_cache           354    354     64    6    6    1 :  252  126
bdev_cache            59     59     64    1    1    1 :  252  126
mnt_cache             59     59     64    1    1    1 :  252  126
inode_cache         1160   1160    480  145  145    1 :  124   62
dentry_cache        1320   1320    128   44   44    1 :  252  126
dquot                  0      0    128    0    0    1 :  252  126
filp                 330    330    128   11   11    1 :  252  126
names_cache            7      7   4096    7    7    1 :   60   30
buffer_head         3440   3440     96   86   86    1 :  252  126
mm_struct             48     48    160    2    2    1 :  252  126
vm_area_struct       600    600     96   15   15    1 :  252  126
fs_cache              59     59     64    1    1    1 :  252  126
files_cache           54     54    416    6    6    1 :  124   62
signal_act            51     51   1312   17   17    1 :   60   30
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384             3      3  16384    3    3    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              2      3   8192    2    3    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :   60   30
size-4096             23     23   4096   23   23    1 :   60   30
size-2048(DMA)         0      0   2048    0    0    1 :   60   30
size-2048             76     76   2048   38   38    1 :   60   30
size-1024(DMA)         0      0   1024    0    0    1 :  124   62
size-1024             96     96   1024   24   24    1 :  124   62
size-512(DMA)          0      0    512    0    0    1 :  124   62
size-512             136    136    512   17   17    1 :  124   62
size-256(DMA)          0      0    256    0    0    1 :  252  126
size-256              45     45    256    3    3    1 :  252  126
size-128(DMA)          0      0    128    0    0    1 :  252  126
size-128             990    990    128   33   33    1 :  252  126
size-64(DMA)           0      0     64    0    0    1 :  252  126
size-64              177    177     64    3    3    1 :  252  126
size-32(DMA)           0      0     32    0    0    1 :  252  126
size-32              339    339     32    3    3    1 :  252  126

[-- Attachment #5: after.2 --]
[-- Type: text/plain, Size: 4221 bytes --]

             total       used       free     shared    buffers     cached
Mem:       1029524     992708      36816          0      43792     144516
-/+ buffers/cache:     804400     225124
Swap:      2097136        116    2097020

slabinfo - version: 1.1 (SMP)
kmem_cache            64     64    244    4    4    1 :  252  126
devfsd_event           0      0     20    0    0    1 :  252  126
tcp_tw_bucket        219   1520     96   20   38    1 :  252  126
tcp_bind_bucket      113    113     32    1    1    1 :  252  126
tcp_open_request      59     59     64    1    1    1 :  252  126
inet_peer_cache        0      0     64    0    0    1 :  252  126
ip_fib_hash           14    113     32    1    1    1 :  252  126
ip_dst_cache          72     72    160    3    3    1 :  252  126
arp_cache             11     30    128    1    1    1 :  252  126
blkdev_requests      640    640     96   16   16    1 :  252  126
journal_head         246    624     48    7    8    1 :  252  126
revoke_table           1    253     12    1    1    1 :  252  126
revoke_record          0      0     32    0    0    1 :  252  126
dnotify cache          0      0     20    0    0    1 :  252  126
file lock cache       17     42     92    1    1    1 :  252  126
fasync cache           0      0     16    0    0    1 :  252  126
uid_cache             10    113     32    1    1    1 :  252  126
skbuff_head_cache    111    120    160    5    5    1 :  252  126
sock                  27     27    832    3    3    2 :  124   62
sigqueue              29     29    132    1    1    1 :  252  126
cdev_cache           348    354     64    6    6    1 :  252  126
bdev_cache            40     59     64    1    1    1 :  252  126
mnt_cache             19     59     64    1    1    1 :  252  126
inode_cache        55744  62440    480 7801 7805    1 :  124   62
dentry_cache      125400 125400    128 4180 4180    1 :  252  126
dquot                  0      0    128    0    0    1 :  252  126
filp                 468    480    128   16   16    1 :  252  126
names_cache           11     11   4096   11   11    1 :   60   30
buffer_head       223430 236160     96 5904 5904    1 :  252  126
mm_struct             72     72    160    3    3    1 :  252  126
vm_area_struct       754    880     96   22   22    1 :  252  126
fs_cache             118    118     64    2    2    1 :  252  126
files_cache           72     72    416    8    8    1 :  124   62
signal_act            75     75   1312   25   25    1 :   60   30
size-131072(DMA)       0      0 131072    0    0   32 :    0    0
size-131072            0      0 131072    0    0   32 :    0    0
size-65536(DMA)        0      0  65536    0    0   16 :    0    0
size-65536             0      0  65536    0    0   16 :    0    0
size-32768(DMA)        0      0  32768    0    0    8 :    0    0
size-32768             0      0  32768    0    0    8 :    0    0
size-16384(DMA)        0      0  16384    0    0    4 :    0    0
size-16384             3      3  16384    3    3    4 :    0    0
size-8192(DMA)         0      0   8192    0    0    2 :    0    0
size-8192              2      2   8192    2    2    2 :    0    0
size-4096(DMA)         0      0   4096    0    0    1 :   60   30
size-4096             27     27   4096   27   27    1 :   60   30
size-2048(DMA)         0      0   2048    0    0    1 :   60   30
size-2048             94     94   2048   47   47    1 :   60   30
size-1024(DMA)         0      0   1024    0    0    1 :  124   62
size-1024             96     96   1024   24   24    1 :  124   62
size-512(DMA)          0      0    512    0    0    1 :  124   62
size-512             136    136    512   17   17    1 :  124   62
size-256(DMA)          0      0    256    0    0    1 :  252  126
size-256              45     45    256    3    3    1 :  252  126
size-128(DMA)          0      0    128    0    0    1 :  252  126
size-128            1104   1230    128   41   41    1 :  252  126
size-64(DMA)           0      0     64    0    0    1 :  252  126
size-64              177    236     64    4    4    1 :  252  126
size-32(DMA)           0      0     32    0    0    1 :  252  126
size-32           100005 100005     32  885  885    1 :  252  126

* Re: Where's all my memory going?
  2002-01-10 14:55         ` Matt Dainty
@ 2002-01-10 16:17           ` David Rees
  2002-01-10 20:46           ` Andreas Dilger
  1 sibling, 0 replies; 14+ messages in thread
From: David Rees @ 2002-01-10 16:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andreas Dilger, Bruce Guenter, Rik van Riel

On Thu, Jan 10, 2002 at 02:55:42PM +0000, Matt Dainty wrote:
>
> Patch applied cleanly, and I redid the 'test'. I've attached the output
> of free and /proc/slabinfo; *.1 is without the patch, *.2 is with. In both
> cases postal was left to run for about 35 minutes, by which time it had
> delivered around 54000 messages locally.
> 
> Overall, with the patch, the large numbers in /proc/slabinfo are *still*
> large, but not as large as without the patch. Overall memory usage still
> seems similar.

So the performance of the test was the same with or without the patch?

Does top or vmstat indicate any kind of difference on the system when the
benchmark is pushing 1500 msgs/min vs 150 msgs/min?

There's a kernel profiling tool somewhere that might also help if there's a
large amount of system time being used up.  (I think this is it:
http://oss.sgi.com/projects/kernprof/)

-Dave

* Re: Where's all my memory going?
  2002-01-10 14:55         ` Matt Dainty
  2002-01-10 16:17           ` David Rees
@ 2002-01-10 20:46           ` Andreas Dilger
  2002-01-10 22:24             ` Bruce Guenter
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Dilger @ 2002-01-10 20:46 UTC (permalink / raw)
  To: linux-kernel, Bruce Guenter, Rik van Riel

On Jan 10, 2002  14:55 +0000, Matt Dainty wrote:
> Pretty much the same as Bruce here, mostly same culprits anyway:
> 
> inode_cache        84352  90800    480 11340 11350    1 :  124   62
> dentry_cache      240060 240060    128 8002 8002    1 :  252  126
> buffer_head       215417 227760     96 5694 5694    1 :  252  126
> size-32           209954 209954     32 1858 1858    1 :  252  126
> 
> On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> > The other question would of course be whether we are calling into
> > shrink_dcache_memory() enough, but that is an issue for Matt to
> > see by testing "postal" with and without the patch, and keeping an
> > eye on the slab caches.
> 
> Patch applied cleanly, and I redid the 'test'. I've attached the output
> of free and /proc/slabinfo; *.1 is without the patch, *.2 is with. In both
> cases postal was left to run for about 35 minutes, by which time it had
> delivered around 54000 messages locally.

One question - what happens to the emails after they are delivered?  Are
they kept on the local filesystem?  That would be one reason why you have
so many inodes and dentries in the cache - the kernel is caching all of these
newly-accessed inodes on the assumption that they may be used again soon.

Even on my system, I have about 35000 items in the inode and dentry caches.

> Overall, with the patch, the large numbers in /proc/slabinfo are *still*
> large, but not as large as without the patch. Overall memory usage still
> seems similar.

Well, Linux will pretty much always use up all of your memory.  The real
question always boils down to how to use it most effectively.

Without patch:
>              total       used       free     shared    buffers     cached
> Mem:       1029524     992848      36676          0      54296     139212
> -/+ buffers/cache:     799340     230184
> Swap:      2097136        116    2097020
>
> inode_cache        84352  90800    480 11340 11350    1 :  124   62
> dentry_cache      240060 240060    128 8002 8002    1 :  252  126
> buffer_head       215417 227760     96 5694 5694    1 :  252  126
> size-32           209954 209954     32 1858 1858    1 :  252  126

With patch:
>              total       used       free     shared    buffers     cached
> Mem:       1029524     992708      36816          0      43792     144516
> -/+ buffers/cache:     804400     225124
> Swap:      2097136        116    2097020
> 
> inode_cache        55744  62440    480 7801 7805    1 :  124   62
> dentry_cache      125400 125400    128 4180 4180    1 :  252  126
> buffer_head       223430 236160     96 5904 5904    1 :  252  126
> size-32           100005 100005     32  885  885    1 :  252  126

Well with the patch, you have:

((11350 - 7805) + (8002 - 4180) + (1858 - 885)) * 4096 bytes = 32MB

more RAM to play with.  Granted that it is not a ton on a 1GB machine,
but it is nothing to sneeze at either.  In your case, we could still
be a lot more aggressive in removing dentries and inodes from the cache,
but under many workloads (not the artificial use-once case of such a
benchmark) that may be a net performance loss.
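
For completeness, the same arithmetic as a quick check in Python (slab pages
freed times a 4KB page):

pages_freed = (11350 - 7805) + (8002 - 4180) + (1858 - 885)
print("%d pages ~= %.1f MB" % (pages_freed, pages_freed * 4096 / 1048576.0))
# prints: 8340 pages ~= 32.6 MB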

One interesting tidbit is the number of size-32 items in use.  This means
that the filename does not fit into the 16 bytes provided inline with
the dentry.  There was another patch available which increased the size
of DNAME_INLINE_LEN so that it filled the rest of the cacheline, since
the dentries are aligned this way anyways.  Depending on how much space
that gives you, and the length of the filenames used by qmail, you could
save another (whopping) 3.5MB of space and avoid extra allocations for
each and every file.

Still, these slabs only total 18744 pages = 73 MB, so there must be
another culprit hiding elsewhere using the other 720MB of RAM.  What
does the output of Ctrl-Alt-SysRQ-M show you (either kernel is fine)?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


* Re: Where's all my memory going?
  2002-01-10 10:05       ` Andreas Dilger
  2002-01-10 11:28         ` Matt Dainty
  2002-01-10 14:55         ` Matt Dainty
@ 2002-01-10 22:18         ` Bruce Guenter
  2 siblings, 0 replies; 14+ messages in thread
From: Bruce Guenter @ 2002-01-10 22:18 UTC (permalink / raw)
  To: linux-kernel

On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> On Jan 10, 2002  02:45 -0600, Bruce Guenter wrote:
> > On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > > Matt's system seems to go from 900 MB free to about
> > > 300 MB (free + cache).
> > > 
> > > I doubt qmail would eat 600 MB of RAM (it might, I
> > > just doubt it) so I'm curious where the RAM is going.
> > 
> > I am seeing the same symptoms, with similar use -- ext3 filesystems
> > running qmail.
> 
> Hmm, does qmail put each piece of email in a separate file?  That
> might explain a lot about what is going on here.

There are actually three to five individual files used as part of the
equation.  qmail stores each message as three or four individual files
while it is in the queue (which, for local deliveries, is only very briefly).
In addition, each delivered message is saved as an individual file,
until the client picks it up (and deletes it) with POP.

> Well, these numbers _are_ high, but with 1GB of RAM you have to use it all
> _somewhere_.

Agreed.  Free RAM is wasted RAM.  However, when adding up the numbers
buffers+cache+RSS+slab,  the totals I am reading account for roughly
half of the used RAM:
	RSS	 84MB (including shared pages counted multiple times)
	slabs	 82MB
	buffers	154MB
	cache	152MB
	-------------
	total	477MB
However, free reports 895MB as used.  What am I missing?
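
As an aside, here is a minimal sketch in Python that reproduces that RSS
total by summing the per-process VmRSS lines from /proc (assuming a kernel
that exposes VmRSS in /proc/<pid>/status; as noted above it double-counts
shared pages, so if anything it overstates process memory):

import glob

def total_rss_kb():
    # Sum the VmRSS lines of every /proc/<pid>/status (value is in kB).
    total = 0
    for status in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(status) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        total += int(line.split()[1])
                        break
        except IOError:
            pass  # the process exited between the glob and the open
    return total

print("total RSS: %.1f MB" % (total_rss_kb() / 1024.0))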

> I'm thinking that if you get _lots_ of dentry and inode items (especially
> under the "postal" benchmark) you may not be able to free the negative
> dentries for all of the created/deleted files in the mailspool (all of
> which will have unique names).  There is a deadlock path in the VM that
> has to be avoided, and as a result it makes it harder to free dentries
> under certain uncommon loads.

The names in the queue are actually reused fairly frequently.  qmail
creates an initial file named for the creating PID, and then renames it
to the inode number of the file.  These inode numbers are of course
recycled, as are the filenames.

> The other question would of course be whether we are calling into
> shrink_dcache_memory() enough, but that is an issue for Matt to
> see by testing "postal" with and without the patch, and keeping an
> eye on the slab caches.

I'd love to test this as well, but this is a production server.  I'll
see if I can put one of my home systems to the task.
-- 
Bruce Guenter <bruceg@em.ca> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA  2E2A E96F B2DC 6999 80E8

* Re: Where's all my memory going?
  2002-01-10 20:46           ` Andreas Dilger
@ 2002-01-10 22:24             ` Bruce Guenter
  2002-01-10 22:36               ` Andreas Dilger
  0 siblings, 1 reply; 14+ messages in thread
From: Bruce Guenter @ 2002-01-10 22:24 UTC (permalink / raw)
  To: linux-kernel

On Thu, Jan 10, 2002 at 01:46:57PM -0700, Andreas Dilger wrote:
> One question - what happens to the emails after they are delivered?  Are
> they kept on the local filesystem?

Messages in the queue are deleted after delivery (of course).  Messages
delivered locally are stored on the local filesystem until they're
picked up by POP (typically within 15 minutes).
-- 
Bruce Guenter <bruceg@em.ca> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA  2E2A E96F B2DC 6999 80E8

* Re: Where's all my memory going?
  2002-01-10 22:24             ` Bruce Guenter
@ 2002-01-10 22:36               ` Andreas Dilger
  2002-01-14 11:40                 ` Matt Dainty
  0 siblings, 1 reply; 14+ messages in thread
From: Andreas Dilger @ 2002-01-10 22:36 UTC (permalink / raw)
  To: linux-kernel

On Jan 10, 2002  16:24 -0600, Bruce Guenter wrote:
> On Thu, Jan 10, 2002 at 01:46:57PM -0700, Andreas Dilger wrote:
> > One question - what happens to the emails after they are delivered?  Are
> > they kept on the local filesystem?
> 
> Messages in the queue are deleted after delivery (of course).  Messages
> delivered locally are stored on the local filesystem until they're
> picked up by POP (typically within 15 minutes).

Sorry, I meant for the "Postal" benchmark only.  I would hope that, in the
normal case, locally delivered emails are kept until the recipient gets them.

In any case, you also pointed out the same thing I did, namely that these
slab entries (while having some high numbers) do not account for the large
amount of used memory in the system.  Maybe SysRQ-M output can help a bit?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


* Re: Where's all my memory going?
  2002-01-10 22:36               ` Andreas Dilger
@ 2002-01-14 11:40                 ` Matt Dainty
  0 siblings, 0 replies; 14+ messages in thread
From: Matt Dainty @ 2002-01-14 11:40 UTC (permalink / raw)
  To: linux-kernel

On Thu, Jan 10, 2002 at 03:36:39PM -0700, Andreas Dilger wrote:
> 
> In any case, you also pointed out the same thing I did, namely that these
> slab entries (while having some high numbers) do not account for the large
> amount of used memory in the system.  Maybe SysRQ-M output can help a bit?

Running this on the box after it's settled down a bit (over the weekend
the usage hasn't altered), with all mail delivered and collected so the box
is currently quiet, produces the following 'free' output:

root@plum:~# free
             total       used       free     shared    buffers     cached
Mem:       1029524     965344      64180          0      45204      22936
-/+ buffers/cache:     897204     132320
Swap:      2097136        116    2097020

And SysRQ+M yields the following:

SysRq : Show Memory
Mem-info:
Free pages:       66612kB (  4424kB HighMem)
Zone:DMA freepages:  4848kB min:   128kB low:   256kB high:   384kB
Zone:Normal freepages: 57340kB min:  1020kB low:  2040kB high:  3060kB
Zone:HighMem freepages:  4424kB min:  1020kB low:  2040kB high:  3060kB
( Active: 160155, inactive: 58253, free: 16653 )
124*4kB 98*8kB 41*16kB 13*32kB 3*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB = 4848kB)
11029*4kB 1097*8kB 140*16kB 7*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB = 57340kB)
328*4kB 55*8kB 19*16kB 10*32kB 2*64kB 3*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB = 4424kB)
Swap cache: add 48, delete 24, find 13/14, race 0+0
Free swap:       2097020kB
262128 pages of RAM
32752 pages of HIGHMEM
4747 reserved pages
20447 pages shared
24 pages swap cached
0 pages in page table cache
Buffer memory:    44424kB
    CLEAN: 209114 buffers, 836405 kbyte, 188 used (last=209112), 0 locked, 0 dirty
                           ^^^^^^ Is this our magic value?
    DIRTY: 8 buffers, 32 kbyte, 0 used (last=0), 0 locked, 8 dirty

Cheers

Matt
-- 
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"

* Re: Where's all my memory going?
@ 2002-01-11 15:30 Rolf Lear
  0 siblings, 0 replies; 14+ messages in thread
From: Rolf Lear @ 2002-01-11 15:30 UTC (permalink / raw)
  To: matt; +Cc: linux-kernel


Matt Dainty <matt@bodgit-n-scarper.com> writes:
>Hi,
>
>I've fashioned a qmail mail server using an HP NetServer with an HP NetRaid
>4M & 1GB RAM, running 2.4.17 with aacraid, LVM, ext3 and highmem. The box
>has 6x 9GB disks, one for system, one for qmail's queue, and the remaining
>four are RAID5'd with LVM. ext3 is only on the queue disk, ext2 everywhere
>else.
>
....

qmail is very file intensive (which is a good thing ...), and RAID5 is very resource intensive (every small write to RAID5 involves extra reads plus a parity write on top of the data write).

It is quite conceivable that the data volume (throughput) generated by the tests is too large for the throughput of your RAID system. From your mail I understand that the mails are being delivered locally to the RAID5 disk array.

One explanation for your results is that your qmail queue is being filled (by qmail-smtpd) at the rate of the network (presumably 100Mbit, or about 10-12MiB/s). This queue is then delivered locally to the RAID5. Files in the queue do not last long (they are created and then deleted, and the cache probably never gets flushed to disk ...). Delivered e-mails fill the cache though, and the kernel at some point will begin flushing these cache entries to disk. At some point (and I am guessing this is your 35-40 minute mark) all pages in the cache are dirty (i.e. the kernel has not been able to write the cache to disk as fast as it is being filled ...). This will cause the disk to become your bottleneck.

This is based on the assumption that the RAID5 is slower than the network. In my experience, this is often the case. Good tests for this would be tools like bonnie++ or vmstat. On a saturated RAID array with a cache, it is typical to get 'vmstat 1' output which shows rapid bursts of data writes (bo's), followed by periods of inactivity. A longer interval like 'vmstat 10' will probably even out these bursts, and show an 'averaged' throughput of your disks.

It is possible that I am completely off base, but I have been battling similar problems myself recently, and discovered to my horror that RAID5 disk arrays are pathetically slow. Check your disk performance for the bottleneck.
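
To put a very rough number on that point, here is a sketch in Python (the
disk figures are illustrative assumptions, not measurements of Matt's array):
a small RAID5 write typically costs about four I/Os (read old data, read old
parity, write new data, write new parity), so sustained small-write
throughput is a fraction of what the raw spindles can do.

def raid5_small_write_mb_per_s(disks, iops_per_disk, io_size_kb):
    # Each small (sub-stripe) write costs ~4 I/Os: read old data, read old
    # parity, write new data, write new parity.
    total_iops = disks * iops_per_disk
    writes_per_s = total_iops / 4.0
    return writes_per_s * io_size_kb / 1024.0

# Assumed numbers: four spindles, ~100 random IOPS each, 10KB messages
# -> only about 1 MB/s of sustained small writes.
print("%.1f MB/s" % raid5_small_write_mb_per_s(disks=4, iops_per_disk=100, io_size_kb=10))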

Rolf
