linux-kernel.vger.kernel.org archive mirror
* negative dentries wasting ram
@ 2002-05-24  7:16 Andrea Arcangeli
  2002-05-24  8:10 ` Andreas Dilger
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24  7:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Alexander Viro

I actually noticed that after an unlink the dentry wasn't released (the
inode was released the dentry wasn't). At first I thought it was a bug,
then while reading the code I noticed this is intentional.  So after
creating some thousands of different names and deleting them, and
then creating different names again and deleting those too, the size of
the dcache keeps growing like a memory leak until the vm starts
shrinking unrelated pagecache. Only then does prune_dcache start
collecting the first negative dentries, after the dcache has grown like
crazy and the hashtable is overfull.

You can try yourself with:

	while :; do >$RANDOM; done

and then rm *, then restart, and rm again, and monitor the size of the
i/dcache via slabinfo: the inode count returns to zero after rm -r, while
the dcache only goes up.
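
A bounded variant of the loop above can also be written in C, which makes
it easier to run a fixed number of create/unlink cycles between two looks
at slabinfo. This is only a sketch: `churn_names` and its naming scheme
(prefix, pid, counter) are made up for illustration, and the files are
created in the current directory.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Create `count` uniquely named empty files in the current directory and
 * unlink each one right away.  Returns how many files were both created
 * and removed.  Every name is used exactly once, like the $RANDOM loop.
 */
int churn_names(const char *prefix, int count)
{
	char name[64];
	int i, done = 0;

	assert(prefix != NULL && count >= 0);
	for (i = 0; i < count; i++) {
		int fd;

		snprintf(name, sizeof(name), "%s.%d.%d",
			 prefix, (int)getpid(), i);
		fd = open(name, O_WRONLY | O_CREAT | O_EXCL, 0600);
		if (fd < 0)
			continue;	/* skip names we could not create */
		close(fd);
		if (unlink(name) == 0)
			done++;
	}
	return done;
}
```

Calling something like churn_names("spool", 100000) a few times and
checking slabinfo in between should show the same pattern as the shell
loop, assuming the kernel keeps the dentries after unlink.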

As far as I can see, negative dentries are not caching anything, so they
should be dropped immediately; they even slow down the lookups because
they're hashed.

I'm pretty sure I want to avoid wasting my ram with negative dentries:
after I `rm` a file I want the dentry for that file to go away too, not
only the inode. I want the ram free for useful cache.

So I did this patch that seems to work for me (it also takes care of
the case where a creat fails; there may be other corner cases like
creat-failure). The directories were already released correctly because
of d_unhash, so it was a problem only for the files (and of course the
negative dentries are all collected when the parent dir is rmdirred, but
how do you know that the parent dir will ever go away?). I can very well
imagine spools where a huge number of files is regularly created and
deleted with timestamped names, always different, and the parent dir
never goes away.

Negative dentries should be only temporary entities, for example between
the allocation of the dentry and the create of the inode, they shouldn't
be left around waiting the vm to collect them. They may be more
interesting with nfs, for example if something has been invalidated
because the server times out or the like; while we wait it's fine to
have negative dentries. But to ensure good performance negative dentries
should always be controlled and never left floating around for an
indefinite time. Since they are not cache, they should be collected
at the last dput (in fact another approach is to d_drop implicitly inside
dput if d_inode is null, right before the check for d_hash, but I didn't
take that approach because it was less obvious for filesystems like nfs
that may make more special use of negative dentries).

The patch is only lightly tested under uml. The first hunk is
obviously safe, the other a bit less so, so beware that it can corrupt
your fs.

Comments?

--- 2.4.19pre8aa4/fs/dcache.c.~1~	Fri May 24 05:57:21 2002
+++ 2.4.19pre8aa4/fs/dcache.c	Fri May 24 08:33:27 2002
@@ -812,6 +812,7 @@ out:
  
 void d_delete(struct dentry * dentry)
 {
+#ifdef DENTRY_WASTE_RAM
 	/*
 	 * Are we the only user?
 	 */
@@ -821,6 +822,7 @@ void d_delete(struct dentry * dentry)
 		return;
 	}
 	spin_unlock(&dcache_lock);
+#endif
 
 	/*
 	 * If not, just drop the dentry and let dput
--- 2.4.19pre8aa4/fs/namei.c.~1~	Fri May 24 05:57:21 2002
+++ 2.4.19pre8aa4/fs/namei.c	Fri May 24 08:35:05 2002
@@ -1055,6 +1055,10 @@ do_last:
 			mode &= ~current->fs->umask;
 		error = vfs_create(dir->d_inode, dentry, mode);
 		up(&dir->d_inode->i_sem);
+#ifndef DENTRY_WASTE_RAM
+		if (error)
+			d_drop(dentry);
+#endif
 		dput(nd->dentry);
 		nd->dentry = dentry;
 		if (error)


Andrea

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: negative dentries wasting ram
  2002-05-24  7:16 negative dentries wasting ram Andrea Arcangeli
@ 2002-05-24  8:10 ` Andreas Dilger
  2002-05-24 15:36   ` Andrea Arcangeli
  2002-05-24 14:43 ` Linus Torvalds
  2002-05-31  8:34 ` Oliver Neukum
  2 siblings, 1 reply; 30+ messages in thread
From: Andreas Dilger @ 2002-05-24  8:10 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel, Linus Torvalds, Alexander Viro

On May 24, 2002  09:16 +0200, Andrea Arcangeli wrote:
> I actually noticed that after an unlink the dentry wasn't released (the
> inode was released the dentry wasn't). At first I thought it was a bug,
> then while reading the code I noticed this is intentional.

The benefits of negative dentries are also there, although in some cases
(such as when many thousands of use-once files are deleted) the drawbacks
probably outweigh the benefits.  The benefits are that the VFS does not
need to do _very_slow_ access to the filesystem (=disk) when resolving
executables in the PATH or dynamic libraries in ld.so.

What I did to fix this problem was to put negative dentries at the end of
the dentry_unused list and not mark them referenced until the second time
they are used.  This allows these unused negative dentries to be reaped
easily, even if we are under memory pressure from a non-GFP_FS allocator
(which would otherwise not allow the dcache to be scanned because of
deadlock problems).

Otherwise, these dentries must go through the LRU twice before they
are freed (once to clear referenced flag, again to get to the end of
the list).  The old code also didn't allow pruning any dentries under
a non-GFP_FS allocator, so you couldn't free these dentries at the time
you need to free them.
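
To illustrate the list behaviour described above, here is a toy
user-space model of the unused-dentry LRU. It is only a sketch under
stated assumptions, not kernel code: `toy_dentry`, `toy_lru`, `toy_dput`
and `toy_scans_until_freed` are invented names, the list is scanned from
the tail, and `referenced` stands in for DCACHE_REFERENCED. It counts how
many entries are examined before one particular negative entry is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry on the unused list; not the kernel struct. */
struct toy_dentry {
	bool negative;		/* no inode attached */
	bool referenced;	/* models DCACHE_REFERENCED */
	struct toy_dentry *prev, *next;
};

/* Circular list: head.next is the newest entry, head.prev the oldest. */
struct toy_lru {
	struct toy_dentry head;
};

void toy_lru_init(struct toy_lru *lru)
{
	lru->head.prev = lru->head.next = &lru->head;
}

void toy_add_head(struct toy_lru *lru, struct toy_dentry *d)
{
	d->next = lru->head.next;
	d->prev = &lru->head;
	lru->head.next->prev = d;
	lru->head.next = d;
}

void toy_add_tail(struct toy_lru *lru, struct toy_dentry *d)
{
	d->prev = lru->head.prev;
	d->next = &lru->head;
	lru->head.prev->next = d;
	lru->head.prev = d;
}

void toy_del(struct toy_dentry *d)
{
	d->prev->next = d->next;
	d->next->prev = d->prev;
	d->prev = d->next = d;
}

/*
 * Last reference dropped.  The unpatched policy always adds at the head;
 * the patched policy puts negative entries at the tail, unreferenced.
 */
void toy_dput(struct toy_lru *lru, struct toy_dentry *d, bool patched)
{
	assert(lru != NULL && d != NULL);
	if (patched && d->negative) {
		d->referenced = false;
		toy_add_tail(lru, d);
	} else {
		toy_add_head(lru, d);
	}
}

/*
 * Scan from the tail until `victim` is freed and return the number of
 * entries examined.  Referenced entries lose the flag and move back to
 * the head; unreferenced ones are dropped from the list ("freed").
 */
int toy_scans_until_freed(struct toy_lru *lru, struct toy_dentry *victim)
{
	int examined = 0;

	while (lru->head.prev != &lru->head) {
		struct toy_dentry *d = lru->head.prev;

		examined++;
		toy_del(d);
		if (d->referenced) {
			d->referenced = false;
			toy_add_head(lru, d);
			continue;
		}
		if (d == victim)
			return examined;
	}
	return -1;	/* victim was not on the list */
}
```

In this model, with three unreferenced entries already on the list, a
referenced negative entry added at the head is freed on the fifth
examination, while one added at the tail with the flag cleared is freed
on the first.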

The patch I've had in my tree for a long time is below.  In my testing
at the time I wrote this patch, it didn't totally eliminate dentry
cache growth, but it did cap it at a reasonable level (i.e. when memory
pressure started up the dcache was shrunk until it held a steady size).

We might conceivably want to continue scanning the dentry_unused list
even after we hit an in-use dentry, rather than bailing out as my
patch does.  It would probably just mean doing list_for_each_prev()
and skipping in-use entries (but leaving them at the end of the list)
when __GFP_FS is not set, maybe until we have skipped up to 'count'
active dentries.  Even so, the new behaviour is still better than the old
one when GFP_FS is not set, because it used to never try to drop dentries
at all in that case.

Cheers, Andreas
====================== dcache-2.4.18-negative.diff ======================
--- linux-2.4.18.orig/fs/dcache.c	Wed Feb 27 10:31:58 2002
+++ linux-2.4.18-aed/fs/dcache.c	Fri May 24 01:52:03 2002
@@ -137,7 +137,16 @@ repeat:
 	/* Unreachable? Get rid of it */
 	if (list_empty(&dentry->d_hash))
 		goto kill_it;
-	list_add(&dentry->d_lru, &dentry_unused);
+	if (dentry->d_inode) {
+		list_add(&dentry->d_lru, &dentry_unused);
+	} else {
+		/* Put an unused negative dentry at the end of the list.  If it
+		 * is not referenced again before we need to free some memory,
+		 * it will be the first to be freed.
+		 */
+		dentry->d_vfs_flags &= ~DCACHE_REFERENCED;
+		list_add_tail(&dentry->d_lru, &dentry_unused);
+	}
 	dentry_stat.nr_unused++;
 	spin_unlock(&dcache_lock);
 	return;
@@ -306,8 +315,9 @@ static inline void prune_one_dentry(stru
 }
 
 /**
- * prune_dcache - shrink the dcache
+ * _prune_dcache - shrink the dcache
  * @count: number of entries to try and free
+ * @gfp_mask: context under which we are trying to free memory
  *
  * Shrink the dcache. This is done when we need
  * more memory, or simply when we need to unmount
@@ -318,7 +328,7 @@ static inline void prune_one_dentry(stru
  * all the dentries are in use.
  */
  
-void prune_dcache(int count)
+void _prune_dcache(int count, unsigned int gfp_mask)
 {
 	spin_lock(&dcache_lock);
 	for (;;) {
@@ -329,15 +339,40 @@ void prune_dcache(int count)
 
 		if (tmp == &dentry_unused)
 			break;
-		list_del_init(tmp);
 		dentry = list_entry(tmp, struct dentry, d_lru);
 
 		/* If the dentry was recently referenced, don't free it. */
 		if (dentry->d_vfs_flags & DCACHE_REFERENCED) {
+			list_del_init(tmp);
 			dentry->d_vfs_flags &= ~DCACHE_REFERENCED;
 			list_add(&dentry->d_lru, &dentry_unused);
 			continue;
 		}
+
+		/*
+		 * Nasty deadlock avoidance.
+		 *
+		 * ext2_new_block->getblk->GFP->shrink_dcache_memory->
+		 * prune_dcache->prune_one_dentry->dput->dentry_iput->iput->
+		 * inode->i_sb->s_op->put_inode->ext2_discard_prealloc->
+		 * ext2_free_blocks->lock_super->DEADLOCK.
+		 *
+		 * We should make sure we don't hold the superblock lock over
+		 * block allocations, but for now we will only free unused
+		 * negative dentries (which are added at the end of the list).
+		 *
+		 * It is safe to call prune_one_dentry() on a negative dentry
+		 * even with GFP_FS, because dentry_iput() is a no-op in this
+		 * case, and no chance of calling into the filesystem.
+		 *
+		 * I'm not sure if the d_release check is necessary to avoid
+		 * deadlock in d_free(), but better to be safe for now.
+		 */
+		if (((dentry->d_op && dentry->d_op->d_release) ||
+		     dentry->d_inode) && !(gfp_mask & __GFP_FS))
+			break;
+
+		list_del_init(tmp);
 		dentry_stat.nr_unused--;
 
 		/* Unused dentry with a count? */
@@ -351,6 +386,11 @@ void prune_dcache(int count)
 	spin_unlock(&dcache_lock);
 }
 
+void prune_dcache(int count)
+{
+	_prune_dcache(count, __GFP_FS);
+}
+
 /*
  * Shrink the dcache for the specified super block.
  * This allows us to unmount a device without disturbing
@@ -543,32 +583,17 @@ void shrink_dcache_parent(struct dentry 
  * too much.
  *
  * Priority:
- *   0 - very urgent: shrink everything
+ *   1 - very urgent: shrink everything
  *  ...
  *   6 - base-level: try to shrink a bit.
  */
 int shrink_dcache_memory(int priority, unsigned int gfp_mask)
 {
-	int count = 0;
-
-	/*
-	 * Nasty deadlock avoidance.
-	 *
-	 * ext2_new_block->getblk->GFP->shrink_dcache_memory->prune_dcache->
-	 * prune_one_dentry->dput->dentry_iput->iput->inode->i_sb->s_op->
-	 * put_inode->ext2_discard_prealloc->ext2_free_blocks->lock_super->
-	 * DEADLOCK.
-	 *
-	 * We should make sure we don't hold the superblock lock over
-	 * block allocations, but for now:
-	 */
-	if (!(gfp_mask & __GFP_FS))
-		return 0;
-
-	count = dentry_stat.nr_unused / priority;
+	int count = dentry_stat.nr_unused / priority;
 
-	prune_dcache(count);
+	_prune_dcache(count, gfp_mask);
 	kmem_cache_shrink(dentry_cache);
+
 	return 0;
 }
 
@@ -590,8 +615,15 @@ struct dentry * d_alloc(struct dentry * 
 	struct dentry *dentry;
 
 	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL); 
-	if (!dentry)
-		return NULL;
+	if (!dentry) {
+		/* Try to free some unused dentries from the cache, but do
+		 * not call into the filesystem to do so (avoid deadlock).
+		 */
+		_prune_dcache(16, GFP_NOFS);
+		dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
+		if (!dentry)
+			return NULL;
+	}
 
 	if (name->len > DNAME_INLINE_LEN-1) {
 		str = kmalloc(NAME_ALLOC_LEN(name->len), GFP_KERNEL);
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/



* Re: negative dentries wasting ram
  2002-05-24  7:16 negative dentries wasting ram Andrea Arcangeli
  2002-05-24  8:10 ` Andreas Dilger
@ 2002-05-24 14:43 ` Linus Torvalds
  2002-05-24 14:51   ` David S. Miller
                     ` (3 more replies)
  2002-05-31  8:34 ` Oliver Neukum
  2 siblings, 4 replies; 30+ messages in thread
From: Linus Torvalds @ 2002-05-24 14:43 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel, Alexander Viro



On Fri, 24 May 2002, Andrea Arcangeli wrote:
>
> Negative dentries should be only temporary entities, for example between
> the allocation of the dentry and the create of the inode, they shouldn't
> be left around waiting the vm to collect them.

Wrong. Negative dentries are very useful for caching negative lookups:
look at the average startup sequence of any program linked with glibc, and
depending on your setup you will notice how it tries to open a _lot_ of
files that do not exist (the "depending on your setup" comes from the fact
that it depends on things like how quickly it finds your "locale" setup
from its locale path - you may have one of the setups that puts it in the
first location glibc searches etc).

If you don't cache those negative lookups, you will do a low-level
filesystem lookup for each of those failures, which is _expensive_.
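
The effect can be sketched with a toy user-space cache. This is a model
only, under stated assumptions: `toy_dcache`, `toy_lookup` and
`toy_fs_lookup` are invented names, the "filesystem" knows a single file
called "libc.so", and the cache is a small fixed array. It counts how
often a lookup has to go down to the filesystem when failed lookups are
or are not cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_CACHE_SLOTS 16

struct toy_entry {
	char name[32];
	bool used;
	bool exists;	/* false: negative entry, name known not to exist */
};

struct toy_dcache {
	struct toy_entry slot[TOY_CACHE_SLOTS];
	bool keep_negative;	/* cache failed lookups too? */
	int fs_lookups;		/* lookups that reached the "filesystem" */
};

/* Hypothetical backing store: only "libc.so" exists. */
bool toy_fs_lookup(struct toy_dcache *dc, const char *name)
{
	dc->fs_lookups++;
	return strcmp(name, "libc.so") == 0;
}

/* Return whether `name` exists, consulting the cache first. */
bool toy_lookup(struct toy_dcache *dc, const char *name)
{
	int i, free_slot = -1;
	bool exists;

	assert(strlen(name) < sizeof(dc->slot[0].name));
	for (i = 0; i < TOY_CACHE_SLOTS; i++) {
		if (!dc->slot[i].used) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (strcmp(dc->slot[i].name, name) == 0)
			return dc->slot[i].exists;	/* cache hit */
	}

	/* Cache miss: ask the filesystem and remember the answer. */
	exists = toy_fs_lookup(dc, name);
	if ((exists || dc->keep_negative) && free_slot >= 0) {
		strcpy(dc->slot[free_slot].name, name);
		dc->slot[free_slot].exists = exists;
		dc->slot[free_slot].used = true;
	}
	return exists;
}
```

Probing the same missing name five times costs one filesystem lookup in
this model when keep_negative is set, and five when it is not.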

However, you're right that it probably doesn't help to do this after
"unlink()" - it's probably only worth doing when actually doing a
"lookup()" that fails.

		Linus




* Re: negative dentries wasting ram
  2002-05-24 14:43 ` Linus Torvalds
@ 2002-05-24 14:51   ` David S. Miller
  2002-05-24 14:53   ` Jakub Jelinek
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: David S. Miller @ 2002-05-24 14:51 UTC (permalink / raw)
  To: torvalds; +Cc: andrea, linux-kernel, viro

   From: Linus Torvalds <torvalds@transmeta.com>
   Date: Fri, 24 May 2002 07:43:32 -0700 (PDT)
   
   However, you're right that it probably doesn't help to do this after
   "unlink()" - it's probably only worth doing when actually doing a
   "lookup()" that fails.

There was some stupidity in how rm -rf * works that I remember from
some NFS hacking, but it may not be affected by this.


* Re: negative dentries wasting ram
  2002-05-24 14:43 ` Linus Torvalds
  2002-05-24 14:51   ` David S. Miller
@ 2002-05-24 14:53   ` Jakub Jelinek
  2002-05-24 20:44     ` David Schwartz
  2002-05-25 17:33     ` Florian Weimer
  2002-05-24 15:54   ` Andrea Arcangeli
  2002-05-24 16:22   ` Alexander Viro
  3 siblings, 2 replies; 30+ messages in thread
From: Jakub Jelinek @ 2002-05-24 14:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrea Arcangeli, linux-kernel, Alexander Viro

On Fri, May 24, 2002 at 07:43:32AM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
> >
> > Negative dentries should be only temporary entities, for example between
> > the allocation of the dentry and the create of the inode, they shouldn't
> > be left around waiting the vm to collect them.
> 
> Wrong. Negative dentries are very useful for caching negative lookups:
> look at the average startup sequence of any program linked with glibc, and
> depending on your setup you will notice how it tries to open a _lot_ of
> files that do not exist (the "depending on your setup" comes from the fact
> that it depends on things like how quickly it finds your "locale" setup
> from its locale path - you may have one of the setups that puts it in the
> first location glibc searches etc).

In glibc 2.3 this will be open("/usr/lib/locale/locale-archive", ), so
negative dentries won't be useful for glibc locale handling (that
doesn't mean negative dentries won't be useful for other things, including
exec?p or searching libraries if $LD_LIBRARY_PATH is used).

	Jakub


* Re: negative dentries wasting ram
  2002-05-24  8:10 ` Andreas Dilger
@ 2002-05-24 15:36   ` Andrea Arcangeli
  2002-05-24 16:12     ` Alexander Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 15:36 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds, Alexander Viro

On Fri, May 24, 2002 at 02:10:43AM -0600, Andreas Dilger wrote:
> On May 24, 2002  09:16 +0200, Andrea Arcangeli wrote:
> > I actually noticed that after an unlink the dentry wasn't released (the
> > inode was released the dentry wasn't). At first I thought it was a bug,
> > then while reading the code I noticed this is intentional.
> 
> The benefits of negative dentries are also there, although in some cases
> (such as when many thousands of use-once files are deleted) the drawbacks
> probably outweigh the benefits.  The benefits are that the VFS does not
> need to do _very_slow_ access to the filesystem (=disk) when resolving
> executables in the PATH or dynamic libraries in ld.so.

The fs access will be exactly the same, only the dentry won't be
allocated because it's just in the hash, but it has no inode and it
doesn't correspond to any on-disk dentry, we simply cannot defer the
removal of the dentry on disk, otherwise if we SYSRQ+B after some hours
those stale deleted dentries would show up at the next reboot. So I don't
see any possible difference in disk access. Furthermore, when resolving
executables in the path etc., those executables or libs will never be
negative dentries: dentries are turned negative only once they're
deleted, and of course they're finally collected when the parent dir is
rmdirred. If it was beneficial to keep them hanging around, why don't
you also turn the parent dir into a negative dentry, in case somebody
recreates the dir and all its subdirs? (if you did that, you would
notice the waste of ram even while doing a normal cp -a and rm -r of the
kernel)

> 
> What I did to fix this problem was to put negative dentries at the end of
> the dentry_unused list and not mark them referenced until the second time
> they are used.  This allows these unused negative dentries to be reaped
> easily, even if we are under memory pressure from a non-GFP_FS allocator
> (which would otherwise not allow the dcache to be scanned because of
> deadlock problems).

That certainly helps but it's still not enough: the vm will still start
shrinking the dcache only when there's the first remote sign of pagecache
shortage, so if there are gigabytes of clean pagecache such negative
dentries will keep wasting ram because prune_dcache won't be invoked
during such a workload. This isn't going to change easily in 2.4, also
because it is a sane algorithm (it's the same in 2.2 and 2.5 too; in 2.5
we may consider changing the dentry reclamation logic, but it still makes
sense to work on the pagecache first, otherwise under pagecache pollution
the dcache would be shrunk to zero immediately, basically dropping all
the caching benefits of avoiding the i_op->lookup and only adding
overhead in building the in-core lookup mechanism; so it's not that 2.4
is very bad in doing that, it's ok).

So in short those useless negative dentries will always cause useful
caching pagecache to be dropped.

> Otherwise, these dentries must go through the LRU twice before they
> are freed (once to clear referenced flag, again to get to the end of
> the list).  The old code also didn't allow pruning any dentries under
> a non-GFP_FS allocator, so you couldn't free these dentries at the time
> you need to free them.
> 
> The patch I've had in my tree for a long time is below.  In my testing
> at the time I wrote this patch, it didn't totally eliminate dentry
> cache growth, but it did cap it at a reasonable level (i.e. when memory
> pressure started up the dcache was shrunk until it held a steady size).

For an efficient vm, all useless cache that could grow to a huge level
(in particular if it's a high prio cache that is shrunk later) must be
collected as soon as possible, not lazily by relying on the vm. Low
prio caches that are shrunk immediately, like slab, are fine to
collect lazily; that is the whole point behind their design: they're
like free ram. Negative dentries in the dcache aren't like free ram.

At the very least, if they really provide some kind of benefit that I
cannot see, all negative dentries should be put in a separate list that
is shrunk by the vm constantly in front of kmem_cache_reap. That would
fix the problem too, but I think it's simpler to avoid leaving them
around stale, at least until I see some benefit from having them hanging
around.

thanks for the feedback!

Andrea


* Re: negative dentries wasting ram
  2002-05-24 14:43 ` Linus Torvalds
  2002-05-24 14:51   ` David S. Miller
  2002-05-24 14:53   ` Jakub Jelinek
@ 2002-05-24 15:54   ` Andrea Arcangeli
  2002-05-24 16:22   ` Alexander Viro
  3 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 15:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, Alexander Viro

On Fri, May 24, 2002 at 07:43:32AM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
> >
> > Negative dentries should be only temporary entities, for example between
> > the allocation of the dentry and the create of the inode, they shouldn't
> > be left around waiting the vm to collect them.
> 
> Wrong. Negative dentries are very useful for caching negative lookups:
> look at the average startup sequence of any program linked with glibc, and

yep I know it is a flood of enoent.

> depending on your setup you will notice how it tries to open a _lot_ of
> files that do not exist (the "depending on your setup" comes from the fact
> that it depends on things like how quickly it finds your "locale" setup
> from its locale path - you may have one of the setups that puts it in the
> first location glibc searches etc).
> 
> If you don't cache those negative lookups, you will do a low-level
> filesystem lookup for each of those failures, which is _expensive_.

I see the point now, so they cache the information that there's no entry :).

> However, you're right that it probably doesn't help to do this after
> "unlink()" - it's probably only worth doing when actually doing a

Agreed, they should be dropped after unlink, and also if creat fails, so
I think my patch fits perfectly into the vfs caching scheme: the
negative dentries will still be generated for the constantly failing
lookups, but not after unlink and creat failures.

> "lookup()" that fails.
> 
> 		Linus
> 


Andrea


* Re: negative dentries wasting ram
  2002-05-24 15:36   ` Andrea Arcangeli
@ 2002-05-24 16:12     ` Alexander Viro
  2002-05-24 16:21       ` Andrea Arcangeli
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 16:12 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel, Linus Torvalds



On Fri, 24 May 2002, Andrea Arcangeli wrote:

> The fs access will be exactly the same, only the dentry won't be
> allocated because it's just in the hash, but it has no inode and it
> doesn't correspond to any on-disk dentry, we simply cannot defer the

RTFS.

Lookup on a name that has hashed negative dentry does not touch fs code.
At all.



* Re: negative dentries wasting ram
  2002-05-24 16:12     ` Alexander Viro
@ 2002-05-24 16:21       ` Andrea Arcangeli
  2002-05-24 16:24         ` Alexander Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 16:21 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-kernel, Linus Torvalds

On Fri, May 24, 2002 at 12:12:16PM -0400, Alexander Viro wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
> 
> > The fs access will be exactly the same, only the dentry won't be
> > allocated because it's just in the hash, but it has no inode and it
> > doesn't correspond to any on-disk dentry, we simply cannot defer the
> 
> RTFS.
> 
> Lookup on a name that has hashed negative dentry does not touch fs code.
> At all.

Of course I was thinking mostly about the unlink procedure. I see the
point now in having the information that no dentry exists on disk with
such a name.  That's a heuristic to optimize some common cases, but
unlink and a create failure should definitely get rid of the negative
dentry: it's not a common case to delete a file and then to try to
access it, while there are common cases that want to avoid stale
dentries around for deleted files.

Andrea


* Re: negative dentries wasting ram
  2002-05-24 14:43 ` Linus Torvalds
                     ` (2 preceding siblings ...)
  2002-05-24 15:54   ` Andrea Arcangeli
@ 2002-05-24 16:22   ` Alexander Viro
  2002-05-24 16:29     ` Linus Torvalds
  3 siblings, 1 reply; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 16:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrea Arcangeli, linux-kernel



On Fri, 24 May 2002, Linus Torvalds wrote:

> However, you're right that it probably doesn't help to do this after
> "unlink()" - it's probably only worth doing when actually doing a
> "lookup()" that fails.

Depends on many things, including the amount of userland code that does
	unlink(name);
	open(name, O_CREAT|O_EXCL..., ...);
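
Spelled out as a complete user-space function, the pattern looks roughly
like this. It is a minimal sketch: `replace_file` is a made-up helper
name, and the flags and mode (O_WRONLY, 0600) are assumptions filled in
for the elided arguments above. Whether the open() can reuse the dentry
left behind by the unlink() is the point under discussion here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Replace `path` with a fresh, empty file.
 * Returns a file descriptor open for writing, or -1 with errno set.
 */
int replace_file(const char *path)
{
	assert(path != NULL);

	/* A missing file is fine; any other unlink error is not. */
	if (unlink(path) < 0 && errno != ENOENT)
		return -1;

	/* O_EXCL makes the create fail if somebody recreated the name. */
	return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```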



* Re: negative dentries wasting ram
  2002-05-24 16:21       ` Andrea Arcangeli
@ 2002-05-24 16:24         ` Alexander Viro
  0 siblings, 0 replies; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 16:24 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel, Linus Torvalds



On Fri, 24 May 2002, Andrea Arcangeli wrote:

> it's not a common case to delete a file and then to try to access it,

I'm less than sure about that.

> while there are common cases that want to avoid stale dentries around
> for deleted files.

Keep in mind that e.g. rm -rf on a tree _will_ evict the dentries in
question, so quite a few of these cases actually don't leave the stuff
behind.



* Re: negative dentries wasting ram
  2002-05-24 16:22   ` Alexander Viro
@ 2002-05-24 16:29     ` Linus Torvalds
  2002-05-24 16:39       ` Andrea Arcangeli
  2002-05-24 17:00       ` Alexander Viro
  0 siblings, 2 replies; 30+ messages in thread
From: Linus Torvalds @ 2002-05-24 16:29 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Andrea Arcangeli, linux-kernel


On Fri, 24 May 2002, Alexander Viro wrote:
>
> On Fri, 24 May 2002, Linus Torvalds wrote:
>
> > However, you're right that it probably doesn't help to do this after
> > "unlink()" - it's probably only worth doing when actually doing a
> > "lookup()" that fails.
>
> Depends on many things, including the amount of userland code that does
> 	unlink(name);
> 	open(name, O_CREAT|O_EXCL..., ...);

Note that this will have to touch the FS anyway, since the O_CREAT thing
forces a call down to the FS to actually create the file.

The only thing we save is a dentry kfree/kmalloc in this case, not a FS
downcall. And I think Andrea is right that it can waste memory for the
likely much more common case where the file just stays removed.

		Linus



* Re: negative dentries wasting ram
  2002-05-24 16:29     ` Linus Torvalds
@ 2002-05-24 16:39       ` Andrea Arcangeli
  2002-05-24 17:04         ` Alexander Viro
  2002-05-24 17:00       ` Alexander Viro
  1 sibling, 1 reply; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 16:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Alexander Viro, linux-kernel

On Fri, May 24, 2002 at 09:29:56AM -0700, Linus Torvalds wrote:
> 
> On Fri, 24 May 2002, Alexander Viro wrote:
> >
> > On Fri, 24 May 2002, Linus Torvalds wrote:
> >
> > > However, you're right that it probably doesn't help to do this after
> > > "unlink()" - it's probably only worth doing when actually doing a
> > > "lookup()" that fails.
> >
> > Depends on many things, including the amount of userland code that does
> > 	unlink(name);
> > 	open(name, O_CREAT|O_EXCL..., ...);
> 
> Note that this will have to touch the FS anyway, since the O_CREAT thing
> forces a call down to the FS to actually create the file.

yep. the only case where it could provide some in-core "caching"
positive effect is:

	unlink
	open(w/o creat)

but I don't see it as a common case.

> The only thing we save is a dentry kfree/kmalloc in this case, not a FS

agreed.

Andrea


* Re: negative dentries wasting ram
  2002-05-24 16:29     ` Linus Torvalds
  2002-05-24 16:39       ` Andrea Arcangeli
@ 2002-05-24 17:00       ` Alexander Viro
  2002-05-24 18:36         ` Mark Mielke
  1 sibling, 1 reply; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 17:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrea Arcangeli, linux-kernel



On Fri, 24 May 2002, Linus Torvalds wrote:

> 
> On Fri, 24 May 2002, Alexander Viro wrote:
> >
> > On Fri, 24 May 2002, Linus Torvalds wrote:
> >
> > > However, you're right that it probably doesn't help to do this after
> > > "unlink()" - it's probably only worth doing when actually doing a
> > > "lookup()" that fails.
> >
> > Depends on many things, including the amount of userland code that does
> > 	unlink(name);
> > 	open(name, O_CREAT|O_EXCL..., ...);
> 
> Note that this will have to touch the FS anyway, since the O_CREAT thing
> forces a call down to the FS to actually create the file.
 
> The only thing we save is a dentry kfree/kmalloc in this case, not a FS
> downcall. And I think Andrea is right that it can waste memory for the
> likely much more common case where the file just stays removed.

???
It's lookup + unlink + lookup + create vs. lookup + unlink + create.
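
The count can be reproduced with a toy state machine for a single name.
This is a sketch only, not the VFS code paths: `toy_name`,
`toy_get_dentry`, `toy_unlink` and `toy_create` are invented names, and
the model assumes that both unlink and create need a cached dentry and
that a missing one costs exactly one ->lookup() call.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_NOT_CACHED, TOY_NEGATIVE, TOY_POSITIVE };

struct toy_name {
	enum toy_state cached;	/* what the toy dcache knows */
	bool on_disk;		/* what the toy filesystem holds */
	int lookups;		/* calls into the toy ->lookup() */
};

/* Make sure a dentry is cached, asking the "filesystem" if needed. */
void toy_get_dentry(struct toy_name *n)
{
	if (n->cached == TOY_NOT_CACHED) {
		n->lookups++;
		n->cached = n->on_disk ? TOY_POSITIVE : TOY_NEGATIVE;
	}
}

/* Remove the file; either keep a negative dentry or drop it. */
void toy_unlink(struct toy_name *n, bool keep_negative)
{
	toy_get_dentry(n);
	assert(n->cached == TOY_POSITIVE);	/* toy: file must exist */
	n->on_disk = false;
	n->cached = keep_negative ? TOY_NEGATIVE : TOY_NOT_CACHED;
}

/* Create the file; this needs a (negative) dentry to work on. */
void toy_create(struct toy_name *n)
{
	toy_get_dentry(n);
	assert(n->cached == TOY_NEGATIVE);	/* toy: name must be free */
	n->on_disk = true;
	n->cached = TOY_POSITIVE;
}
```

Starting from an uncached existing file, unlink followed by create costs
one lookup in this model when the negative dentry is kept and two when it
is dropped at unlink time.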



* Re: negative dentries wasting ram
  2002-05-24 16:39       ` Andrea Arcangeli
@ 2002-05-24 17:04         ` Alexander Viro
  2002-05-24 17:06           ` Alexander Viro
  2002-05-24 17:55           ` Andrea Arcangeli
  0 siblings, 2 replies; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 17:04 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel



On Fri, 24 May 2002, Andrea Arcangeli wrote:

> > Note that this will have to touch the FS anyway, since the O_CREAT thing
> > forces a call down to the FS to actually create the file.
> 
> yep. the only case where it could provide some in-core "caching"
> positive effect is:
> 
> 	unlink
> 	open(w/o creat)
> 
> but I don't see it as a common case.

	Guys, how about tracing the damn thing and checking what actually
happens?  Or, at least, checking the prototypes and noticing that ->create()
takes (hashed) dentry as an argument, so if unlinked on had been freed we _must_
call ->lookup().




* Re: negative dentries wasting ram
  2002-05-24 17:04         ` Alexander Viro
@ 2002-05-24 17:06           ` Alexander Viro
  2002-05-24 17:55           ` Andrea Arcangeli
  1 sibling, 0 replies; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 17:06 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel



On Fri, 24 May 2002, Alexander Viro wrote:

> takes (hashed) dentry as an argument, so if unlinked on had 
                                                       ^^
                                                       one
sorry.



* Re: negative dentries wasting ram
  2002-05-24 17:04         ` Alexander Viro
  2002-05-24 17:06           ` Alexander Viro
@ 2002-05-24 17:55           ` Andrea Arcangeli
  2002-05-24 18:00             ` Alexander Viro
  2002-05-26  8:06             ` Eric W. Biederman
  1 sibling, 2 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 17:55 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, linux-kernel

On Fri, May 24, 2002 at 01:04:33PM -0400, Alexander Viro wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
> 
> > > Note that this will have to touch the FS anyway, since the O_CREAT thing
> > > forces a call down to the FS to actually create the file.
> > 
> > yep. the only case where it could provide some in-core "caching"
> > positive effect is:
> > 
> > 	unlink
> > 	open(w/o creat)
> > 
> > but I don't see it as a common case.
> 
> 	Guys, how about tracing the damn thing and checking what actually
> happens?  Or, at least, checking the prototypes and noticing that ->create()
> takes (hashed) dentry as an argument, so if unlinked on had been freed we _must_
> call ->lookup().

so why don't you also leave a negative directory floating around, so you
know if you creat a file with such a name you don't need to ->lookup the
lowlevel fs but you only need to destroy the negative directory and all
its leaves in the in-core dcache? If you did, the negative effect would
become more obvious; the d_unhash hides it except for the spooling workloads.

Avoiding a lowlevel lookup operation for an unlink/open cycle looks like a
minor optimization compared to a massive dcache "leak" under certain
common spooling workloads IMHO.

Anyways in 2.5 we could still take advantage of the negative dentries as
much as possible (also after unlink) by moving the negative dentries
into a separate list and by putting the shrinkage of this list in front
of kmem_cache_reap, so we are as efficient as possible, but we don't
risk throwing away very useful cache, for more dubious caching effects
after an unlink/create-failure that currently have the side effect of
throwing away tons of worthwhile positive pagecache (and even triggering
swap false positives) in some workloads.

Andrea


* Re: negative dentries wasting ram
  2002-05-24 17:55           ` Andrea Arcangeli
@ 2002-05-24 18:00             ` Alexander Viro
  2002-05-24 18:58               ` Andrea Arcangeli
  2002-05-26  8:06             ` Eric W. Biederman
  1 sibling, 1 reply; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 18:00 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel



On Fri, 24 May 2002, Andrea Arcangeli wrote:

> so why don't you also leave a negative directory floating around, so you
> know if you creat a file with such a name you don't need to ->lookup the
> lowlevel fs but you only need to destroy the negative directory and all
> its leaves in the in-core dcache? If you did, the negative effect would
> become more obvious; the d_unhash hides it except for the spooling workloads.
 
-ENOPARSE

> of kmem_cache_reap, so we are as efficient as possible, but we don't
> risk throwing away very useful cache, for more dubious caching effects
> after an unlink/create-failure that currently have the side effect of
> throwing away tons of worthwhile positive pagecache (and even triggering
> swap false positives) in some workloads.

I might buy that argument if we didn't also leave around _unreferenced_
inodes for minutes in the icache.  And _that_ is much stronger source of
memory pressure, so if you want to balance the thing you need to start
there.



* Re: negative dentries wasting ram
  2002-05-24 17:00       ` Alexander Viro
@ 2002-05-24 18:36         ` Mark Mielke
  0 siblings, 0 replies; 30+ messages in thread
From: Mark Mielke @ 2002-05-24 18:36 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, Andrea Arcangeli, linux-kernel

On Fri, May 24, 2002 at 01:00:14PM -0400, Alexander Viro wrote:
> > The only thing we save is a dentry kfree/kmalloc in this case, not a FS
> > downcall. And I think Andrea is right that it can waste memory for the
> > likely much more common case where the file just stays removed.
> ???
> It's lookup + unlink + lookup + create vs. lookup + unlink + create.

I would rather use kernel memory for far more useful things, such as
more room for actual dentries/inodes, or negative dentries found from
failed lookup() calls (i.e. proven useful).

The overhead of unlink()/create() probably swamps the rather minimal
gain from a saved lookup() in this not very common situation.
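
To make the two sequences quoted above concrete, here is a toy Python
model (not kernel code) that counts the filesystem downcalls for an
unlink followed by a re-create of the same name. The names `downcalls`,
`dcache` and `on_disk` are made up for illustration, and the model
assumes that a hashed dentry, positive or negative, lets the VFS skip
->lookup().

```python
def downcalls(keep_negative):
    """List the filesystem downcalls for unlink + re-create of one name.

    Toy model only: `dcache` maps a name to True (positive dentry) or
    False (negative dentry); a missing key means no dentry is hashed.
    """
    on_disk = {"spool1"}
    dcache = {}
    calls = []

    def find(name):
        # A hashed dentry, positive or negative, is answered from memory.
        if name not in dcache:
            calls.append("lookup")
            dcache[name] = name in on_disk
        return dcache[name]

    # unlink("spool1")
    assert find("spool1")
    calls.append("unlink")
    on_disk.discard("spool1")
    if keep_negative:
        dcache["spool1"] = False   # dentry stays hashed as a negative one
    else:
        del dcache["spool1"]       # dentry dropped together with the inode

    # open("spool1", O_CREAT)
    if not find("spool1"):
        calls.append("create")
        on_disk.add("spool1")
        dcache["spool1"] = True
    return calls


print(downcalls(keep_negative=True))   # ['lookup', 'unlink', 'create']
print(downcalls(keep_negative=False))  # ['lookup', 'unlink', 'lookup', 'create']
```

In this model the saving is one lookup per unlink/create cycle, and the
cost is one dentry kept for every unlinked name that never comes back.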

Just the opinion of somebody that doesn't matter... :-)
mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/



* Re: negative dentries wasting ram
  2002-05-24 18:00             ` Alexander Viro
@ 2002-05-24 18:58               ` Andrea Arcangeli
  2002-05-24 19:04                 ` Alexander Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 18:58 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, linux-kernel

On Fri, May 24, 2002 at 02:00:36PM -0400, Alexander Viro wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
> 
> > so why don't you also leave a negative directory floating around, so you
> > know if you creat a file with such a name you don't need to ->lookup the
> > lowlevel fs but you only need to destroy the negative directory and all
> > its leaves in the in-core dcache? If you did, the negative effect would
> > become more obvious; the d_unhash hides it except for the spooling workloads.
>  
> -ENOPARSE

instead of dropping the dentry for a directory after an rmdir you could
leave it there as a negative entry; it would save you a ->lookup if
somebody creat()s a file using the name of that ex-directory.

> 
> > of kmem_cache_reap, so we are as efficient as possible, but we don't
> > risk throwing away very useful cache, for more dubious caching effects
> > after an unlink/create-failure that currently have the side effect of
> > throwing away tons of worthwhile positive pagecache (and even triggering
> > swap false positives) in some workloads.
> 
> I might buy that argument if we didn't also leave around _unreferenced_
> inodes for minutes in the icache.  And _that_ is much stronger source of

I don't see it, at the last iput of an inode with i_nlink == 0 the inode
is freed immediately, not like the dcache that is left floating around as
a negative one with no useful caching effects for most workloads.

> memory pressure, so if you want to balance the thing you need to start
> there.


Andrea


* Re: negative dentries wasting ram
  2002-05-24 18:58               ` Andrea Arcangeli
@ 2002-05-24 19:04                 ` Alexander Viro
  2002-05-24 19:43                   ` Andrea Arcangeli
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 19:04 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel



On Fri, 24 May 2002, Andrea Arcangeli wrote:

> > I might buy that argument if we didn't also leave around _unreferenced_
> > inodes for minutes in the icache.  And _that_ is much stronger source of
> 
> I don't see it, at the last iput of an inode with i_nlink == 0 the inode
> is freed immediately, not like the dcache that is left floating around as
> a negative one with no useful caching effects for most workloads.

Right.  Now look at the inodes with i_nlink != 0.  And realize that they'd
already gone through the aging in dcache - if they get to the point of
final iput(), they have no references remaining.  And _after_ that they
happily stay in icache for minutes.



* Re: negative dentries wasting ram
  2002-05-24 19:04                 ` Alexander Viro
@ 2002-05-24 19:43                   ` Andrea Arcangeli
  2002-05-24 19:55                     ` Alexander Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 19:43 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, linux-kernel

On Fri, May 24, 2002 at 03:04:26PM -0400, Alexander Viro wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
> 
> > > I might buy that argument if we didn't also leave around _unreferenced_
> > > inodes for minutes in the icache.  And _that_ is much stronger source of
> > 
> > I don't see it, at the last iput of an inode with i_nlink == 0 the inode
> > is freed immediately, not like the dcache that is left floating around as
> > a negative one with no useful caching effects for most workloads.
> 
> Right.  Now look at the inodes with i_nlink != 0.  And realize that they'd
> already gone through the aging in dcache - if they get to the point of
> final iput(), they have no references remaining.  And _after_ that they
> happily stay in icache for minutes.

and they provide useful cache, they remember the i_size and everything
else that you need to read from disk the next time a lookup that ends in
such an inode happens. It's not a "this dentry doesn't exist" kind of
info after an unlink, which is very very unlikely to ever be needed.
Furthermore there cannot be a huge growth of those inodes, see
below.

It's a "I know everything about this valid inode" that has been used in
the past and that may be used in the future, so I feel it's an order of
magnitude more useful information.

And most important, if the dentry is collected for not deleted inodes it
means there's mem pressure, so the inode as well will be collected soon,
prune_icache is run right after prune_dcache. So only the very last
inodes will be left there for minutes, and they will belong to the most
hot dentries, so very likely to be required again by a later iget as
soon as the dentry is re-created. I don't see any similarity to the
unlink-dentry-negative issue.

But if you want to change the iput so that the inode is discarded at the
last iput that probably won't make much difference, but I don't see any
benefit. As said, until the last prune_icache most of the inodes are
released anyways after they become unused.  But I just don't see a
problem there, because those inodes won't stay there for minutes,
prune_icache will collect them, and if the last one stays for minutes
it's fine, the dcache aging made sure that if that was the last inode
left hanging around it is more likely to be reused next and if it's
reused we avoid a lowlevel ->read_inode. In short the part about the
inode destroy procedure looks all right to me.

Andrea


* Re: negative dentries wasting ram
  2002-05-24 19:43                   ` Andrea Arcangeli
@ 2002-05-24 19:55                     ` Alexander Viro
  2002-05-24 20:36                       ` Andrea Arcangeli
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Viro @ 2002-05-24 19:55 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linus Torvalds, linux-kernel



On Fri, 24 May 2002, Andrea Arcangeli wrote:
 
> and they provide useful cache, they remember the i_size and everything
> else that you need to read from disk the next time a lookup that ends in
> such an inode happens. It's not a "this dentry doesn't exist" kind of
> info after an unlink, which is very very unlikely to ever be needed.
> Furthermore there cannot be a huge growth of those inodes, see
> below.

That's crap, since there _IS_ such growth.  Again, they easily sit around
for 5-7 minutes without a single attempt to access them, while the system
is swapping like hell.

> It's a "I know everything about this valid inode" that has been used in
> the past and that may be used in the future, so I feel it's an order of
> magnitude more useful information.

It's "I hadn't touched that inode in quite a while, but I'll retain it
in-core almost indefinitely".

> means there's mem pressure, so the inode as well will be collected soon,
> prune_icache is run right after prune_dcache. So only the very last
> inodes will be left there for minutes, and they will belong to the most
> hot dentries, so very likely to be required again by a later iget as
> soon as the dentry is re-created. I don't see any similarity to the
> unlink-dentry-negative issue.

Again, inodes are in that state only if there is no dentry pointing to
them.  And in _that_ state (== no references from the rest of kernel)
they happily sit around for minutes.

> But if you want to change the iput so that the inode is discarded at the
> last iput that probably won't make much difference, but I don't see any
> benefit. As said, until the last prune_icache most of the inodes are
> released anyways after they become unused.  But I just don't see a
> problem there, because those inodes won't stay there for minutes,
> prune_icache will collect them, and if the last one stays for minutes
> it's fine, the dcache aging made sure that if that was the last inode
> left hanging around it is more likely to be reused next and if it's
> reused we avoid a lowlevel ->read_inode. In short the part about the
> inode destroy procedure looks all right to me.

It's always a pity when trivial testing spoils a beautiful theory, isn't it?



* Re: negative dentries wasting ram
  2002-05-24 19:55                     ` Alexander Viro
@ 2002-05-24 20:36                       ` Andrea Arcangeli
  2002-05-24 22:14                         ` Jan Harkes
  0 siblings, 1 reply; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 20:36 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, linux-kernel

On Fri, May 24, 2002 at 03:55:45PM -0400, Alexander Viro wrote:
> 
> 
> On Fri, 24 May 2002, Andrea Arcangeli wrote:
>  
> > and they provide useful cache, they remember the i_size and everything
> > else that you need to read from disk the next time a lookup that ends in
> > such an inode happens. It's not a "this dentry doesn't exist" kind of
> > info after an unlink, which is very very unlikely to ever be needed.
> > Furthermore there cannot be a huge growth of those inodes, see
> > below.
> 
> That's crap, since there _IS_ such growth.  Again, they easily sit around
> for 5-7 minutes without a single attempt to access them, while the system
> is swapping like hell.

no-way, that's because your vm is broken then, apply vm-35 and it
shouldn't really happen, if the system swaps inodes will be pruned
correctly too, an inode will never stay around for minutes while the
system is swapping. Actually you may also want to apply my last
fix for the inode highmem balance to be sure to rotate the list, maybe
that could make the difference for this case, but again, if something goes
wrong in this sense it's a prune_icache bug, not a design bug in iput.

> 
> > It's a "I know everything about this valid inode" that has been used in
> > the past and that may be used in the future, so I feel it's an order of
> > magnitude more useful information.
> 
> It's "I hadn't touched that inode in quite a while, but I'll retain it
> in-core almost indefinitely".

disagree, you can apply the same argument to the whole dcache in the
first place (not even the negative one!).

> 
> > means there's mem pressure, so the inode as well will be collected soon,
> > prune_icache is run right after prune_dcache. So only the very last
> > inodes will be left there for minutes, and they will belong to the most
> > hot dentries, so very likely to be required again by a later iget as
> > soon as the dentry is re-created. I don't see any similarity to the
> > unlink-dentry-negative issue.
> 
> Again, inodes are in that state only if there is no dentry pointing to
> them.  And in _that_ state (== no references from the rest of kernel)
> they happily sit around for minutes.

there is no difference at all from the inode's perspective whether there's
a dentry or not, nothing guarantees that the dentry will be used soon.

As said, unless your vm is broken, freeing the inode at the last iput, so
as to have it allocated only when some dentry is pointing to it, should
make hardly any difference in practice; if it makes a big difference that's
a vm problem.

> 
> > But if you want to change the iput so that the inode is discarded at the
> > last iput that probably won't make much difference, but I don't see any
> > benefit. As said, until the last prune_icache most of the inodes are
> > released anyways after they become unused.  But I just don't see a
> > problem there, because those inodes won't stay there for minutes,
> > prune_icache will collect them, and if the last one stays for minutes
> > it's fine, the dcache aging made sure that if that was the last inode
> > left hanging around it is more likely to be reused next and if it's
> > reused we avoid a lowlevel ->read_inode. In short the part about the
> > inode destroy procedure looks all right to me.
> 
> It's always a pity when trivial testing spoils a beautiful theory, isn't it?

well, try 2.4.19pre8aa3 + the inode fix I posted this morning, and then
try to spoil the theory with the trivial testing again :) I think you
won't spoil it, but I'd like to know if you can reproduce the problem
that way too.

Andrea


* Re: negative dentries wasting ram
  2002-05-24 14:53   ` Jakub Jelinek
@ 2002-05-24 20:44     ` David Schwartz
  2002-05-25 17:33     ` Florian Weimer
  1 sibling, 0 replies; 30+ messages in thread
From: David Schwartz @ 2002-05-24 20:44 UTC (permalink / raw)
  To: linux-kernel


>In glibc 2.3 this will be open("/usr/lib/locale/locale-archive", ), so
>negative dentries won't be useful for glibc locale handling (that
>doesn't mean negative dentries won't be useful for other things, including
>exec?p or searching libraries if $LD_LIBRARY_PATH is used).
>
>    Jakub

	Web servers tend to look for all kinds of things that don't exist. For 
example, if you hit "http://www.mydomain.com/foo" is there a file called 
"foo" in the root document directory? Or is "foo" a directory with an 
"index.html" file in it?

	And what if index files can be "index.html", "index.htm", or "index.cgi"? A 
single URL hit can easily involve looking for five files that don't exist 
before you find the one that does.

	Of course, some web servers have their own internal URL->file mapping 
caches. The ideal solution would be to get rid of the negative dentries we 
aren't using (much? recently?) when we want to get some more free memory.
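
	What that could look like, as a minimal sketch in Python rather than
kernel code: a cache of known-missing names that is only trimmed on demand,
oldest unused entries first. `NegativeCache`, `hit` and `shrink` are
hypothetical names, and plain LRU is an assumption; nothing in this thread
says the dcache would do exactly this.

```python
from collections import OrderedDict


class NegativeCache:
    """Toy cache of names known not to exist, evicted least recently used first."""

    def __init__(self):
        self._names = OrderedDict()

    def add(self, name):
        """Remember that `name` does not exist."""
        self._names[name] = True
        self._names.move_to_end(name)

    def hit(self, name):
        """Return True if `name` is cached as missing; a hit refreshes its age."""
        if name in self._names:
            self._names.move_to_end(name)
            return True
        return False

    def shrink(self, count):
        """Under memory pressure, drop up to `count` least recently used entries."""
        dropped = []
        while self._names and len(dropped) < count:
            name, _ = self._names.popitem(last=False)
            dropped.append(name)
        return dropped


cache = NegativeCache()
for name in ("foo", "foo/index.html", "foo/index.htm"):
    cache.add(name)
cache.hit("foo")          # "foo" was probed again, so it becomes the youngest
print(cache.shrink(2))    # ['foo/index.html', 'foo/index.htm']
print(cache.hit("foo"))   # True
```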

	DS




* Re: negative dentries wasting ram
  2002-05-24 20:36                       ` Andrea Arcangeli
@ 2002-05-24 22:14                         ` Jan Harkes
  2002-05-24 22:31                           ` Andrea Arcangeli
  0 siblings, 1 reply; 30+ messages in thread
From: Jan Harkes @ 2002-05-24 22:14 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Alexander Viro, linux-kernel

On Fri, May 24, 2002 at 10:36:30PM +0200, Andrea Arcangeli wrote:
> On Fri, May 24, 2002 at 03:55:45PM -0400, Alexander Viro wrote:
> > On Fri, 24 May 2002, Andrea Arcangeli wrote:
> >  
> > > and they provide useful cache, they remember the i_size and everything
> > > else that you need to read from disk the next time a lookup that ends in
> > > such an inode happens. It's not a "this dentry doesn't exist" kind of
> > > info after an unlink, which is very very unlikely to ever be needed.
> > > Furthermore there cannot be a huge growth of those inodes, see
> > > below.
> > 
> > > That's crap, since there _IS_ such growth.  Again, they easily sit around
> > for 5-7 minutes without a single attempt to access them, while the system
> > is swapping like hell.
> 
> no-way, that's because your vm is broken then, apply vm-35 and it
> shouldn't really happen, if the system swaps inodes will be pruned

Reading this thread I just got this incredible sense of 'deja vu'.

    http://marc.theaimsgroup.com/?l=linux-kernel&m=98709057613992&w=2
    http://marc.theaimsgroup.com/?l=linux-kernel&m=98840062922352&w=2

Actually these threads are very interesting to read.

Most interesting is the following message with a patch from you, because
the dcache and icache were pruned 'too aggressively' when the new VM was
on the verge of being introduced in 2.4.10 :) Considering that what you
are proposing now is even more aggressive than that, it is almost amusing.

    http://marc.theaimsgroup.com/?l=linux-kernel&m=100076684905307&w=2

Jan



* Re: negative dentries wasting ram
  2002-05-24 22:14                         ` Jan Harkes
@ 2002-05-24 22:31                           ` Andrea Arcangeli
  0 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2002-05-24 22:31 UTC (permalink / raw)
  To: Alexander Viro, linux-kernel; +Cc: Jan Harkes

On Fri, May 24, 2002 at 06:14:47PM -0400, Jan Harkes wrote:
> Most interesting is the following message with a patch from you, because
> the dcache and icache were pruned 'too aggressively' when the new VM was
> on the verge of being introduced in 2.4.10 :) Considering that what you
> are proposing now is even more aggressive than that, it is almost amusing.
> 
>     http://marc.theaimsgroup.com/?l=linux-kernel&m=100076684905307&w=2

that was really much too aggressive, it was getting shrunk even with
plenty of cache available. at that time we were missing the
refill_inactive list logic. if you read the patch in that email
carefully, you'll see that shrink_dcache_memory(priority,
gfp_mask) and shrink_icache_memory(priority, gfp_mask) were executed
_before_ finishing probing the pagecache levels.

before that patch it was so aggressive that the dcache/icache could be
shrunk before finishing probing the pagecache, so it would be fine for
the inactive-dentries actually :), but only for them! :)

In short what we do is:

	probe and shrink pagecache

if we probe some remote shortage of pagecache we do the next step:

	shrink dcache icache and start some pagetable walking to decrease the mapping pressure

So if the system swaps like crazy the inode cache must definitely be
shrunk very hard, if it doesn't it's a vm bug.

There is no inconsistency between what I said and the previous email, it's
just that at that time it was way too aggressive, it was shrinking the
icache/dcache way before finishing probing the pagecache-active list too
for excessive amounts of clean cache.

Andrea


* Re: negative dentries wasting ram
  2002-05-24 14:53   ` Jakub Jelinek
  2002-05-24 20:44     ` David Schwartz
@ 2002-05-25 17:33     ` Florian Weimer
  1 sibling, 0 replies; 30+ messages in thread
From: Florian Weimer @ 2002-05-25 17:33 UTC (permalink / raw)
  To: linux-kernel

Jakub Jelinek <jakub@redhat.com> writes:

> In glibc 2.3 this will be open("/usr/lib/locale/locale-archive", ), so
> negative dentries won't be useful for glibc locale handling (that
> doesn't mean negative dentries won't be useful for other things, including
> exec?p or searching libraries if $LD_LIBRARY_PATH is used).

I guess Apache's .htaccess checking benefits from negative caching,
too.

-- 
Florian Weimer 	                  Weimer@CERT.Uni-Stuttgart.DE
University of Stuttgart           http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT                          +49-711-685-5973/fax +49-711-685-5898


* Re: negative dentries wasting ram
  2002-05-24 17:55           ` Andrea Arcangeli
  2002-05-24 18:00             ` Alexander Viro
@ 2002-05-26  8:06             ` Eric W. Biederman
  1 sibling, 0 replies; 30+ messages in thread
From: Eric W. Biederman @ 2002-05-26  8:06 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Alexander Viro, Linus Torvalds, linux-kernel

Andrea Arcangeli <andrea@suse.de> writes:
> 
> Anyways in 2.5 we could still take advantage of the negative dentries as
> much as possible (also after unlink) by moving the negative dentries
> into a separate list and by putting the shrinkage of this list in front
> of kmem_cache_reap, so we are as efficient as possible, but we don't
> risk throwing away very useful cache, for more dubious caching effects
> after an unlink/create-failure that currently have the side effect of
> throwing away tons of worthwhile positive pagecache (and even triggering
> swap false positives) in some workloads.

Right, treat the new, never-referenced negative dentries as second class
citizens until someone comes along and uses them, instead of as aged useful
cache entries.  This sounds like a very good solution to this.
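
A rough sketch of that idea in Python (a toy model, not the 2.5 dcache):
negative dentries left behind by unlink sit on their own list and are
reclaimed first, and a lookup that actually uses one promotes it to the
normal LRU. The class and method names are invented for illustration.

```python
from collections import deque


class ToyDcache:
    def __init__(self):
        self.second_class = deque()  # negative dentries nobody has used yet
        self.lru = deque()           # all other dentries, oldest first

    def add(self, name):
        """A normal lookup instantiated a dentry for `name`."""
        self.lru.append(name)

    def unlink(self, name):
        """The dentry turns negative; park it where it is cheap to reclaim."""
        if name in self.lru:
            self.lru.remove(name)
        self.second_class.append(name)

    def lookup(self, name):
        """Return True on a cache hit; a hit makes the dentry the youngest."""
        for queue in (self.second_class, self.lru):
            if name in queue:
                queue.remove(name)
                self.lru.append(name)
                return True
        return False

    def shrink(self, count):
        """Free up to `count` dentries, unused negative ones before the rest."""
        dropped = []
        for queue in (self.second_class, self.lru):
            while queue and len(dropped) < count:
                dropped.append(queue.popleft())
        return dropped


d = ToyDcache()
d.add("hot")
for name in ("tmp1", "tmp2", "tmp3"):
    d.unlink(name)
d.lookup("tmp2")          # somebody probed tmp2, so it is promoted
print(d.shrink(2))        # ['tmp1', 'tmp3']
print(list(d.lru))        # ['hot', 'tmp2']
```

With this ordering a spool workload that unlinks thousands of names only
ever costs the second-class list, and the aged entries on the main LRU
are touched only once that list is empty.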

Eric


* Re: negative dentries wasting ram
  2002-05-24  7:16 negative dentries wasting ram Andrea Arcangeli
  2002-05-24  8:10 ` Andreas Dilger
  2002-05-24 14:43 ` Linus Torvalds
@ 2002-05-31  8:34 ` Oliver Neukum
  2 siblings, 0 replies; 30+ messages in thread
From: Oliver Neukum @ 2002-05-31  8:34 UTC (permalink / raw)
  To: Andrea Arcangeli, linux-kernel; +Cc: Linus Torvalds, Alexander Viro


> as far as I can see negative dentries are not caching anything, they
> should be dropped immediately, they even slow down the lookups because
> they're hashed.

They cache things like this:
open("/usr/lib/locale/de_DE+euro/LC_MEASUREMENT", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/de_DE@euro/LC_MEASUREMENT", O_RDONLY) = 3

Thus they are not really useless, they are just not usefully limited.
Thus IMHO you should look at the swap out path.

	Regards
		Oliver


