linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
@ 2006-01-16 22:34 Olaf Hering
  2006-01-16 23:23 ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Olaf Hering @ 2006-01-16 22:34 UTC (permalink / raw)
  To: linux-kernel

I get 'Busy inodes after umount' very often, even with recent kernels.
Usually it remains unnoticed for a while. A bit more info about which
superblock had the problem would be helpful.

bdevname() doesn't seem to be the best way of pretty-printing NFS mounts,
for example. Should it just print the major:minor pair?
Are there scripts or something that parse such kernel messages? Should
the extra info go somewhere else?

 fs/super.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6/fs/super.c
===================================================================
--- linux-2.6.orig/fs/super.c
+++ linux-2.6/fs/super.c
@@ -227,6 +227,7 @@ void generic_shutdown_super(struct super
 {
 	struct dentry *root = sb->s_root;
 	struct super_operations *sop = sb->s_op;
+	char b[BDEVNAME_SIZE];
 
 	if (root) {
 		sb->s_root = NULL;
@@ -247,8 +248,9 @@ void generic_shutdown_super(struct super
 
 		/* Forget any remaining inodes */
 		if (invalidate_inodes(sb)) {
-			printk("VFS: Busy inodes after unmount. "
-			   "Self-destruct in 5 seconds.  Have a nice day...\n");
+			printk("VFS: (%s) Busy inodes after unmount. "
+			   "Self-destruct in 5 seconds.  Have a nice day...\n",
+			   bdevname(sb->s_bdev, b));
 		}
 
 		unlock_kernel();
-- 
short story of a lazy sysadmin:
 alias appserv=wotan

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-16 22:34 [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super Olaf Hering
@ 2006-01-16 23:23 ` Kirill Korotaev
  2006-01-16 23:29   ` Olaf Hering
  2006-01-18 22:49   ` Jan Blunck
  0 siblings, 2 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-16 23:23 UTC (permalink / raw)
  To: Olaf Hering; +Cc: linux-kernel, Andrew Morton

Olaf, can you please check if my patch for busy inodes from -mm tree 
helps you?
Patch name is fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15/2.6.15-mm4/broken-out/fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch

I can also provide you a more sophisticated debug patch for this if 
required.

Kirill

> I get 'Busy inodes after umount' very often, even with recent kernels.
> Usually it remains unnoticed for a while. A bit more info about which
> superblock had the problem would be helpful.
> 
> bdevname() doesn't seem to be the best way of pretty-printing NFS mounts,
> for example. Should it just print the major:minor pair?
> Are there scripts or something that parse such kernel messages? Should
> the extra info go somewhere else?
> 
>  fs/super.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/fs/super.c
> ===================================================================
> --- linux-2.6.orig/fs/super.c
> +++ linux-2.6/fs/super.c
> @@ -227,6 +227,7 @@ void generic_shutdown_super(struct super
>  {
>  	struct dentry *root = sb->s_root;
>  	struct super_operations *sop = sb->s_op;
> +	char b[BDEVNAME_SIZE];
>  
>  	if (root) {
>  		sb->s_root = NULL;
> @@ -247,8 +248,9 @@ void generic_shutdown_super(struct super
>  
>  		/* Forget any remaining inodes */
>  		if (invalidate_inodes(sb)) {
> -			printk("VFS: Busy inodes after unmount. "
> -			   "Self-destruct in 5 seconds.  Have a nice day...\n");
> +			printk("VFS: (%s) Busy inodes after unmount. "
> +			   "Self-destruct in 5 seconds.  Have a nice day...\n",
> +			   bdevname(sb->s_bdev, b));
>  		}
>  
>  		unlock_kernel();

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-16 23:23 ` Kirill Korotaev
@ 2006-01-16 23:29   ` Olaf Hering
  2006-01-17  2:05     ` Andrew Morton
  2006-01-18 22:49   ` Jan Blunck
  1 sibling, 1 reply; 41+ messages in thread
From: Olaf Hering @ 2006-01-16 23:29 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: linux-kernel, Andrew Morton

 On Tue, Jan 17, Kirill Korotaev wrote:

> Olaf, can you please check if my patch for busy inodes from -mm tree 
> helps you?

I can't reproduce it at will, that's the thing. It likely happens with NFS
mounts. agruen@suse.de did some work recently. But I remember even with
these changes (for a 2.6.13), the busy inodes did not disappear.

Merging your patch into our CVS will give it more testing; I will do
that tomorrow if no one disagrees.

-- 
short story of a lazy sysadmin:
 alias appserv=wotan

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-16 23:29   ` Olaf Hering
@ 2006-01-17  2:05     ` Andrew Morton
  2006-01-17  7:03       ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2006-01-17  2:05 UTC (permalink / raw)
  To: Olaf Hering; +Cc: dev, linux-kernel

Olaf Hering <olh@suse.de> wrote:
>
>  On Tue, Jan 17, Kirill Korotaev wrote:
> 
> > Olaf, can you please check if my patch for busy inodes from -mm tree 
> > helps you?
> 
> I can't reproduce it at will, that's the thing. It likely happens with NFS
> mounts. agruen@suse.de did some work recently. But I remember even with
> these changes (for a 2.6.13), the busy inodes did not disappear.
> 
> Merging your patch into our CVS will give it more testing; I will do
> that tomorrow if no one disagrees.
> 

The patch is certainly safe and stable.  But it's so huge and complex and
ugly that I was hoping that a better fix would turn up.  The bug itself
takes quite some ingenuity to hit.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-17  2:05     ` Andrew Morton
@ 2006-01-17  7:03       ` Kirill Korotaev
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-17  7:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Olaf Hering, linux-kernel

> Olaf Hering <olh@suse.de> wrote:
>>  On Tue, Jan 17, Kirill Korotaev wrote:
>>
>>> Olaf, can you please check if my patch for busy inodes from -mm tree 
>>> helps you?
>> I can't reproduce it at will, that's the thing. It likely happens with NFS
>> mounts. agruen@suse.de did some work recently. But I remember even with
>> these changes (for a 2.6.13), the busy inodes did not disappear.
>>
>> Merging your patch into our CVS will give it more testing; I will do
>> that tomorrow if no one disagrees.
>>
> 
> The patch is certainly safe and stable.  But it's so huge and complex and
> ugly that I was hoping that a better fix would turn up.  The bug itself
> takes quite some ingenuity to hit.
We have another idea of how to reimplement it using refcounters instead of
lists. But I'm not sure when this will happen, due to lack of time :(

Kirill

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-16 23:23 ` Kirill Korotaev
  2006-01-16 23:29   ` Olaf Hering
@ 2006-01-18 22:49   ` Jan Blunck
  2006-01-18 23:10     ` Andrew Morton
  2006-01-19  9:52     ` Kirill Korotaev
  1 sibling, 2 replies; 41+ messages in thread
From: Jan Blunck @ 2006-01-18 22:49 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Olaf Hering, linux-kernel, Andrew Morton

On Tue, Jan 17, Kirill Korotaev wrote:

> Olaf, can you please check if my patch for busy inodes from -mm tree 
> helps you?
> Patch name is fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15/2.6.15-mm4/broken-out/fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch

This patch is just wrong. It is hiding bugs in file systems. The problem is
that somewhere the reference counting on the vfsmount objects is wrong: the
file system is unmounted before the last dentry is dereferenced. Either you
don't hold a reference to the proper vfsmount object at all, or you
dereference it too early. See Al Viro's patch series (search for "namei
fixes") on how to fix these issues.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-18 22:49   ` Jan Blunck
@ 2006-01-18 23:10     ` Andrew Morton
  2006-01-19 10:08       ` Kirill Korotaev
  2006-01-19  9:52     ` Kirill Korotaev
  1 sibling, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2006-01-18 23:10 UTC (permalink / raw)
  To: Jan Blunck; +Cc: dev, olh, linux-kernel

Jan Blunck <jblunck@suse.de> wrote:
>
> On Tue, Jan 17, Kirill Korotaev wrote:
> 
> > Olaf, can you please check if my patch for busy inodes from -mm tree 
> > helps you?
> > Patch name is fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
> > http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15/2.6.15-mm4/broken-out/fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
> 
> This patch is just wrong. It is hiding bugs in file systems. The problem is
> that somewhere the reference counting on the vfsmount objects is wrong: the
> file system is unmounted before the last dentry is dereferenced. Either you
> don't hold a reference to the proper vfsmount object at all, or you
> dereference it too early. See Al Viro's patch series (search for "namei
> fixes") on how to fix these issues.
> 

The only reason I've been carrying that patch is as a reminder that there's
a bug that we need to fix.  It'd be good news if that bug had been fixed by
other means.

Kirill, do you know whether the bug is still present in 2.6.16-rc1?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-18 22:49   ` Jan Blunck
  2006-01-18 23:10     ` Andrew Morton
@ 2006-01-19  9:52     ` Kirill Korotaev
  2006-01-19 10:04       ` Jan Blunck
  1 sibling, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-19  9:52 UTC (permalink / raw)
  To: Jan Blunck; +Cc: Olaf Hering, linux-kernel, Andrew Morton

>>Olaf, can you please check if my patch for busy inodes from -mm tree 
>>helps you?
>>Patch name is fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
>>http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15/2.6.15-mm4/broken-out/fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
> 
> 
> This patch is just wrong. It is hiding bugs in file systems. The problem is
> that somewhere the reference counting on the vfsmount objects is wrong: the
> file system is unmounted before the last dentry is dereferenced. Either you
> don't hold a reference to the proper vfsmount object at all, or you
> dereference it too early. See Al Viro's patch series (search for "namei
> fixes") on how to fix these issues.

This patch has nothing to do with vfsmount references and doesn't hide
anything. It just adds a synchronization barrier between do_umount() and
shrink_dcache(), since the latter can work with dentries/inodes without
holding locks.

So if you think there is something wrong with it, please be more specific.

Kirill


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-19  9:52     ` Kirill Korotaev
@ 2006-01-19 10:04       ` Jan Blunck
  2006-01-19 10:26         ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Blunck @ 2006-01-19 10:04 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Olaf Hering, linux-kernel, Andrew Morton

On Thu, Jan 19, Kirill Korotaev wrote:

> This patch has nothing to do with vfsmount references and doesn't hide
> anything. It just adds a synchronization barrier between do_umount() and
> shrink_dcache(), since the latter can work with dentries/inodes without
> holding locks.
> 
> So if you think there is something wrong with it, please be more specific.
> 

You can only unmount a file system if there are no references to the vfsmount
object anymore. Since shrink_dcache*() is called after checking the refcount of
the vfsmount while unmounting the file system, it isn't possible to hold a
reference to a dentry (and therefore call dput()) after this point in
time. Therefore your reference counting on the vfsmount is wrong, which is the
root cause of your busy-inodes problem.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-18 23:10     ` Andrew Morton
@ 2006-01-19 10:08       ` Kirill Korotaev
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-19 10:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Blunck, olh, linux-kernel

>>On Tue, Jan 17, Kirill Korotaev wrote:
>>
>>
>>>Olaf, can you please check if my patch for busy inodes from -mm tree 
>>>helps you?
>>>Patch name is fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
>>>http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15/2.6.15-mm4/broken-out/fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch
>>
>>This patch is just wrong. It is hiding bugs in file systems. The problem is
>>that somewhere the reference counting on the vfsmount objects is wrong: the
>>file system is unmounted before the last dentry is dereferenced. Either you
>>don't hold a reference to the proper vfsmount object at all, or you
>>dereference it too early. See Al Viro's patch series (search for "namei
>>fixes") on how to fix these issues.
> 
> 
> The only reason I've been carrying that patch is as a reminder that there's
> a bug that we need to fix.  It'd be good news if that bug had been fixed by
> other means.
> 
> Kirill, do you know whether the bug is still present in 2.6.16-rc1?

It exists in 2.6.15 and I see no changes in 2.6.16-rc1 except for
cosmetics :(
I checked the git tree, dput() etc.; the bug is definitely still here -
nothing changed in this area.

Sorry for bad news.

The patch can probably be remade by introducing a "notlocked_refs"
counter on the dentry. If the shrinker/umount sees such a dentry with a
non-zero refcount, it can sleep as is done in the current patch. It would be
a little bit cleaner/simpler. What do you think?
If someone suggests any brilliant/helpful idea I would be happy to
improve it.

Kirill


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-19 10:04       ` Jan Blunck
@ 2006-01-19 10:26         ` Kirill Korotaev
  2006-01-20 19:06           ` Jan Blunck
  0 siblings, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-19 10:26 UTC (permalink / raw)
  To: Jan Blunck; +Cc: Olaf Hering, linux-kernel, Andrew Morton

>>This patch has nothing to do with vfsmount references and doesn't hide
>>anything. It just adds a synchronization barrier between do_umount() and
>>shrink_dcache(), since the latter can work with dentries/inodes without
>>holding locks.
>>
>>So if you think there is something wrong with it, please be more specific.
>>
> 
> 
> You can only unmount a file system if there are no references to the vfsmount
> object anymore. Since shrink_dcache*() is called after checking the refcount of
> the vfsmount while unmounting the file system, it isn't possible to hold a
> reference to a dentry (and therefore call dput()) after this point in
> time. Therefore your reference counting on the vfsmount is wrong, which is the
> root cause of your busy-inodes problem.

You didn't take into account shrink_dcache*() on memory pressure, which
can run at any time. When it calls dput() it detaches the dentry from
the whole tree and starts to work with the inode. do_umount() can
successfully shrink the other part of the tree (since the dentry in question
is detached), complain about a busy inode (it is really being put on
another CPU, but still busy) and destroy the super block.

Another scenario, from the patch comment:

CPU 1				CPU 2
~~~~~				~~~~~
umount /dev/sda1
generic_shutdown_super          shrink_dcache_memory()
shrink_dcache_parent            dput dentry
select_parent                   prune_one_dentry()
                                 <<<< child is dead, locks are released,
                                   but parent is still referenced!!! >>>>
skip dentry->parent,
since it's d_count > 0

message: BUSY inodes after umount...
                                 <<< parent is left on dentry_unused list,
                                    referencing freed super block >>>


Kirill



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-19 10:26         ` Kirill Korotaev
@ 2006-01-20 19:06           ` Jan Blunck
  2006-01-23  8:14             ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Blunck @ 2006-01-20 19:06 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Olaf Hering, linux-kernel, Andrew Morton

On Thu, Jan 19, Kirill Korotaev wrote:

> CPU 1				CPU 2
> ~~~~~				~~~~~
> umount /dev/sda1
> generic_shutdown_super          shrink_dcache_memory()
> shrink_dcache_parent            dput dentry
> select_parent                   prune_one_dentry()
>                                 <<<< child is dead, locks are released,
>                                   but parent is still referenced!!! >>>>
> skip dentry->parent,
> since it's d_count > 0
> 
> message: BUSY inodes after umount...
>                                 <<< parent is left on dentry_unused list,
>                                    referencing freed super block >>>

I see. The problem is that dcache_lock is given up before dereferencing the
parent. But your patch seems to be wrong anyway, IMHO. I'll post patches in a
separate thread.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-20 19:06           ` Jan Blunck
@ 2006-01-23  8:14             ` Kirill Korotaev
  2006-01-30 11:54               ` Jan Blunck
  0 siblings, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-23  8:14 UTC (permalink / raw)
  To: Jan Blunck; +Cc: Olaf Hering, linux-kernel, Andrew Morton

>>CPU 1				CPU 2
>>~~~~~				~~~~~
>>umount /dev/sda1
>>generic_shutdown_super          shrink_dcache_memory()
>>shrink_dcache_parent            dput dentry
>>select_parent                   prune_one_dentry()
>>                                <<<< child is dead, locks are released,
>>                                  but parent is still referenced!!! >>>>
>>skip dentry->parent,
>>since it's d_count > 0
>>
>>message: BUSY inodes after umount...
>>                                <<< parent is left on dentry_unused list,
>>                                   referencing freed super block >>>
> 
> 
> I see. The problem is that dcache_lock is given up before dereferencing the
> parent. But your patch seems to be wrong anyway, IMHO. I'll post patches in a
> separate thread.
Jan, I still have not heard a single comment about what's wrong with
it... I would really appreciate it if you provided one.

Kirill


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-23  8:14             ` Kirill Korotaev
@ 2006-01-30 11:54               ` Jan Blunck
  2006-01-30 14:05                 ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Blunck @ 2006-01-30 11:54 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Olaf Hering, linux-kernel, Andrew Morton

On Mon, Jan 23, Kirill Korotaev wrote:

> Jan, I still have not heard a single comment about what's wrong with
> it... I would really appreciate it if you provided one.
> 

Sorry for the delay. I had to fix a totally bogus patch (mine ;).

The problem with your patch is that it hides too-early mntput()s. Think
about the following situation:

 mntput(path->mnt);   // too early mntput()
 dput(path->dentry);

Assuming that someone unmounts the file system in between these two calls,
your patch will wait for this dput() to finish before it proceeds with
unmounting the file system. I think this isn't what we want.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-30 11:54               ` Jan Blunck
@ 2006-01-30 14:05                 ` Kirill Korotaev
  2006-01-30 14:21                   ` Jan Blunck
  0 siblings, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-30 14:05 UTC (permalink / raw)
  To: Jan Blunck; +Cc: Olaf Hering, linux-kernel, Andrew Morton

Hello Jan,

>>Jan, I still have not heard a single comment about what's wrong with
>>it... I would really appreciate it if you provided one.
>>
> 
> 
> Sorry for the delay. I had to fix a totally bogus patch (mine ;).
> 
> The problem with your patch is that it hides too-early mntput()s. Think
> about the following situation:
> 
>  mntput(path->mnt);   // too early mntput()
>  dput(path->dentry);
> 
> Assuming that someone unmounts the file system in between these two calls,
> your patch will wait for this dput() to finish before it proceeds with
> unmounting the file system. I think this isn't what we want.
No, it won't wait for anything, because if umount happened between
mntput/dput, the dentry is not in the s_dshrinkers list.
If umount happens in parallel with dput() (where the shrinker operations
are), then it will behave OK - it will wait for dput() and then umount. That
was the intended behaviour!

Also, please note that such early mntput()s are bugs!!! Such
dentries can reference freed memory after the last mntput(). And I remember
some patches in 2.4.x/2.6.x which fixed this sequence everywhere.

Kirill


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-30 14:05                 ` Kirill Korotaev
@ 2006-01-30 14:21                   ` Jan Blunck
  2006-01-30 14:34                     ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Blunck @ 2006-01-30 14:21 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Olaf Hering, linux-kernel, Andrew Morton

On Mon, Jan 30, Kirill Korotaev wrote:

> >
> > mntput(path->mnt);   // too early mntput()
> > dput(path->dentry);
> >
> >Assuming that someone unmounts the file system in between these two calls,
> >your patch will wait for this dput() to finish before it proceeds with
> >unmounting the file system. I think this isn't what we want.
> No, it won't wait for anything, because if umount happened between
> mntput/dput, the dentry is not in the s_dshrinkers list.
> If umount happens in parallel with dput() (where the shrinker operations
> are), then it will behave OK - it will wait for dput() and then umount. That
> was the intended behaviour!

It should not wait.

> 
> Also, please note that such early mntput()s are bugs!!! Such
> dentries can reference freed memory after the last mntput(). And I remember
> some patches in 2.4.x/2.6.x which fixed this sequence everywhere.

That's why I'm complaining ...

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-01-30 14:21                   ` Jan Blunck
@ 2006-01-30 14:34                     ` Kirill Korotaev
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-01-30 14:34 UTC (permalink / raw)
  To: Jan Blunck; +Cc: Olaf Hering, linux-kernel, Andrew Morton

Hello Jan,

>>>mntput(path->mnt);   // too early mntput()
>>>dput(path->dentry);
>>>
>>>Assuming that someone unmounts the file system in between these two calls,
>>>your patch will wait for this dput() to finish before it proceeds with
>>>unmounting the file system. I think this isn't what we want.
>>
>>No, it won't wait for anything, because if umount happened between
>>mntput/dput, the dentry is not in the s_dshrinkers list.
>>If umount happens in parallel with dput() (where the shrinker operations
>>are), then it will behave OK - it will wait for dput() and then umount. That
>>was the intended behaviour!
> 
> 
> It should not wait.
Why?! It makes sure that dentries/inodes are gone _before_ the super block
is destroyed.

>>Also, please note that such early mntput()s are bugs!!! Such
>>dentries can reference freed memory after the last mntput(). And I remember
>>some patches in 2.4.x/2.6.x which fixed this sequence everywhere.
> 
> 
> That's why I'm complaining ...
About what?
My patch doesn't hide this bug, nor does it help with it in any way.

Kirill



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07 23:20         ` Neil Brown
@ 2006-03-09 12:03           ` Kirill Korotaev
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-03-09 12:03 UTC (permalink / raw)
  To: Neil Brown
  Cc: Kirill Korotaev, linux-kernel, Andrew Morton, Olaf Hering,
	Jan Blunck, Al Viro

[-- Attachment #1: Type: text/plain, Size: 969 bytes --]

>>>>In general your patch still does what mine does, so I will be happy if
>>>>any of this is committed mainstream. In the future, please keep the
>>>>reference to the original authors; this will also make sure that I'm on CC
>>>>if something goes wrong.
>>>
>>>
>>>Sorry: which 'original author' did I miss ?
>>
>>I mean, it is better to mention the original author
>>(http://marc.theaimsgroup.com/?l=linux-kernel&m=114123870406751&w=2) in the
>>patch description, as it makes sure that he will be on CC if this patch
>>is discussed again later. My patch fixing this race was in the -mm tree
>>for half a year already.
> 
> 
> Which patch is that?  The race still seems to be present in -mm.
I have attached my latest patch for 2.6.16-rc5.

This patch was in the -mm tree until the discussion arose again recently:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.14/2.6.14-mm2/broken-out/fix-of-dcache-race-leading-to-busy-inodes-on-umount.patch

Thanks,
Kirill

[-- Attachment #2: diff-ms-dcache-race-20060303 --]
[-- Type: text/plain, Size: 10441 bytes --]

--- linux-2.6.15.orig/fs/dcache.c	2006-01-03 06:21:10.000000000 +0300
+++ linux-2.6.15-025stab015-dcache-race/fs/dcache.c	2006-03-03 16:22:38.862706448 +0300
@@ -114,6 +114,75 @@ static inline void dentry_iput(struct de
 	}
 }
 
+struct dcache_shrinker {
+	struct list_head list;
+	struct dentry *dentry;
+};
+
+DECLARE_WAIT_QUEUE_HEAD(dcache_shrinker_wq);
+
+/* called under dcache_lock */
+static void dcache_shrinker_add(struct dcache_shrinker *ds,
+		struct dentry *parent, struct dentry *dentry)
+{
+	struct super_block *sb;
+
+	sb = parent->d_sb;
+	ds->dentry = parent;
+	list_add(&ds->list, &sb->s_dshrinkers);
+}
+
+/* called under dcache_lock */
+static void dcache_shrinker_del(struct dcache_shrinker *ds)
+{
+	if (ds == NULL || list_empty(&ds->list))
+		return;
+
+	list_del_init(&ds->list);
+	wake_up_all(&dcache_shrinker_wq);
+}
+
+/* called under dcache_lock, drops inside */
+static void dcache_shrinker_wait(struct super_block *sb)
+{
+	DECLARE_WAITQUEUE(wq, current);
+
+	__set_current_state(TASK_UNINTERRUPTIBLE);
+	add_wait_queue(&dcache_shrinker_wq, &wq);
+	spin_unlock(&dcache_lock);
+
+	schedule();
+	remove_wait_queue(&dcache_shrinker_wq, &wq);
+	__set_current_state(TASK_RUNNING);
+}
+
+void dcache_shrinker_wait_sb(struct super_block *sb)
+{
+	/* the root dentry can be held in dput_recursive */
+	spin_lock(&dcache_lock);
+	while (!list_empty(&sb->s_dshrinkers)) {
+		dcache_shrinker_wait(sb);
+		spin_lock(&dcache_lock);
+	}
+	spin_unlock(&dcache_lock);
+}
+
+/* dcache_lock protects shrinker's list */
+static void shrink_dcache_racecheck(struct dentry *parent, int *racecheck)
+{
+	struct super_block *sb;
+	struct dcache_shrinker *ds;
+
+	sb = parent->d_sb;
+	list_for_each_entry(ds, &sb->s_dshrinkers, list) {
+		/* is one of dcache shrinkers working on the dentry? */
+		if (ds->dentry == parent) {
+			*racecheck = 1;
+			break;
+		}
+	}
+}
+
 /* 
  * This is dput
  *
@@ -132,8 +201,9 @@ static inline void dentry_iput(struct de
  */
 
 /*
- * dput - release a dentry
- * @dentry: dentry to release 
+ * dput_recursive - go upward through the dentry tree and release dentries
+ * @dentry: starting dentry
+ * @ds: shrinker to be added to active list (see shrink_dcache_parent)
  *
  * Release a dentry. This will drop the usage count and if appropriate
  * call the dentry unlink method as well as removing it from the queues and
@@ -142,18 +212,15 @@ static inline void dentry_iput(struct de
  *
  * no dcache lock, please.
  */
-
-void dput(struct dentry *dentry)
+static void dput_recursive(struct dentry *dentry, struct dcache_shrinker *ds)
 {
-	if (!dentry)
-		return;
-
-repeat:
 	if (atomic_read(&dentry->d_count) == 1)
 		might_sleep();
 	if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
 		return;
+	dcache_shrinker_del(ds);
 
+repeat:
 	spin_lock(&dentry->d_lock);
 	if (atomic_read(&dentry->d_count)) {
 		spin_unlock(&dentry->d_lock);
@@ -185,6 +252,7 @@ unhash_it:
 
 kill_it: {
 		struct dentry *parent;
+		struct dcache_shrinker lds;
 
 		/* If dentry was on d_lru list
 		 * delete it from there
@@ -194,18 +262,47 @@ kill_it: {
   			dentry_stat.nr_unused--;
   		}
   		list_del(&dentry->d_u.d_child);
+		parent = dentry->d_parent;
+		dcache_shrinker_add(&lds, parent, dentry);
 		dentry_stat.nr_dentry--;	/* For d_free, below */
 		/*drops the locks, at that point nobody can reach this dentry */
 		dentry_iput(dentry);
-		parent = dentry->d_parent;
 		d_free(dentry);
-		if (dentry == parent)
+		if (unlikely(dentry == parent)) {
+			spin_lock(&dcache_lock);
+			dcache_shrinker_del(&lds);
+			spin_unlock(&dcache_lock);
 			return;
+		}
 		dentry = parent;
-		goto repeat;
+		spin_lock(&dcache_lock);
+		dcache_shrinker_del(&lds);
+		if (atomic_dec_and_test(&dentry->d_count))
+			goto repeat;
+		spin_unlock(&dcache_lock);
 	}
 }
 
+/*
+ * dput - release a dentry
+ * @dentry: dentry to release 
+ *
+ * Release a dentry. This will drop the usage count and if appropriate
+ * call the dentry unlink method as well as removing it from the queues and
+ * releasing its resources. If the parent dentries were scheduled for release
+ * they too may now get deleted.
+ *
+ * no dcache lock, please.
+ */
+
+void dput(struct dentry *dentry)
+{
+	if (!dentry)
+		return;
+
+	dput_recursive(dentry, NULL);
+}
+
 /**
  * d_invalidate - invalidate a dentry
  * @dentry: dentry to invalidate
@@ -362,19 +459,23 @@ restart:
  * removed.
  * Called with dcache_lock, drops it and then regains.
  */
-static inline void prune_one_dentry(struct dentry * dentry)
+static void prune_one_dentry(struct dentry * dentry)
 {
 	struct dentry * parent;
+	struct dcache_shrinker ds;
 
 	__d_drop(dentry);
 	list_del(&dentry->d_u.d_child);
+	parent = dentry->d_parent;
+	dcache_shrinker_add(&ds, parent, dentry);
 	dentry_stat.nr_dentry--;	/* For d_free, below */
 	dentry_iput(dentry);
 	parent = dentry->d_parent;
 	d_free(dentry);
 	if (parent != dentry)
-		dput(parent);
+		dput_recursive(parent, &ds);
 	spin_lock(&dcache_lock);
+	dcache_shrinker_del(&ds);
 }
 
 /**
@@ -557,13 +658,12 @@ positive:
  * drop the lock and return early due to latency
  * constraints.
  */
-static int select_parent(struct dentry * parent)
+static int select_parent(struct dentry * parent, int * racecheck)
 {
 	struct dentry *this_parent = parent;
 	struct list_head *next;
 	int found = 0;
 
-	spin_lock(&dcache_lock);
 repeat:
 	next = this_parent->d_subdirs.next;
 resume:
@@ -605,6 +705,9 @@ dentry->d_parent->d_name.name, dentry->d
 #endif
 			goto repeat;
 		}
+
+		if (!found && racecheck != NULL)
+			shrink_dcache_racecheck(dentry, racecheck);
 	}
 	/*
 	 * All done at this level ... ascend and resume the search.
@@ -619,7 +722,6 @@ this_parent->d_parent->d_name.name, this
 		goto resume;
 	}
 out:
-	spin_unlock(&dcache_lock);
 	return found;
 }
 
@@ -632,10 +734,66 @@ out:
  
 void shrink_dcache_parent(struct dentry * parent)
 {
-	int found;
+	int found, r;
 
-	while ((found = select_parent(parent)) != 0)
+	while (1) {
+		spin_lock(&dcache_lock);
+		found = select_parent(parent, NULL);
+		if (found)
+			goto found;
+
+		/*
+		 * try again with a dput_recursive() race check.
+		 * it returns quickly if everything was really shrunk
+		 */
+		r = 0;
+		found = select_parent(parent, &r);
+		if (found)
+			goto found;
+		if (!r)
+			break;
+
+		/* drops the lock inside */
+		dcache_shrinker_wait(parent->d_sb);
+		continue;
+
+found:
+		spin_unlock(&dcache_lock);
 		prune_dcache(found);
+	}
+	spin_unlock(&dcache_lock);
+}
+
+/*
+ * Move any unused anon dentries to the end of the unused list.
+ * called under dcache_lock
+ */
+static int select_anon(struct hlist_head *head, int *racecheck)
+{
+	struct hlist_node *lp;
+	int found = 0;
+
+	hlist_for_each(lp, head) {
+		struct dentry *this = hlist_entry(lp, struct dentry, d_hash);
+		if (!list_empty(&this->d_lru)) {
+			dentry_stat.nr_unused--;
+			list_del_init(&this->d_lru);
+		}
+
+		/* 
+		 * move only zero ref count dentries to the end 
+		 * of the unused list for prune_dcache
+		 */
+		if (!atomic_read(&this->d_count)) {
+			list_add_tail(&this->d_lru, &dentry_unused);
+			dentry_stat.nr_unused++;
+			found++;
+		}
+
+		if (!found && racecheck != NULL)
+			shrink_dcache_racecheck(this, racecheck);
+	}
+	return found;
 }
 
 /**
@@ -648,33 +806,36 @@ void shrink_dcache_parent(struct dentry 
  * done under dcache_lock.
  *
  */
-void shrink_dcache_anon(struct hlist_head *head)
+void shrink_dcache_anon(struct super_block *sb)
 {
-	struct hlist_node *lp;
-	int found;
-	do {
-		found = 0;
+	int found, r;
+
+	while (1) {
 		spin_lock(&dcache_lock);
-		hlist_for_each(lp, head) {
-			struct dentry *this = hlist_entry(lp, struct dentry, d_hash);
-			if (!list_empty(&this->d_lru)) {
-				dentry_stat.nr_unused--;
-				list_del_init(&this->d_lru);
-			}
+		found = select_anon(&sb->s_anon, NULL);
+		if (found)
+			goto found;
 
-			/* 
-			 * move only zero ref count dentries to the end 
-			 * of the unused list for prune_dcache
-			 */
-			if (!atomic_read(&this->d_count)) {
-				list_add_tail(&this->d_lru, &dentry_unused);
-				dentry_stat.nr_unused++;
-				found++;
-			}
-		}
+		/*
+		 * try again with a dput_recursive() race check.
+		 * it returns quickly if everything was really shrunk
+		 */
+		r = 0;
+		found = select_anon(&sb->s_anon, &r);
+		if (found)
+			goto found;
+		if (!r)
+			break;
+
+		/* drops the lock inside */
+		dcache_shrinker_wait(sb);
+		continue;
+
+found:
 		spin_unlock(&dcache_lock);
 		prune_dcache(found);
-	} while(found);
+	}
+	spin_unlock(&dcache_lock);
 }
 
 /*
--- linux-2.6.15.orig/fs/super.c	2006-01-03 06:21:10.000000000 +0300
+++ linux-2.6.15-025stab015-dcache-race/fs/super.c	2006-03-03 16:22:38.841709640 +0300
@@ -69,6 +69,7 @@ static struct super_block *alloc_super(v
 		INIT_LIST_HEAD(&s->s_io);
 		INIT_LIST_HEAD(&s->s_files);
 		INIT_LIST_HEAD(&s->s_instances);
+		INIT_LIST_HEAD(&s->s_dshrinkers);
 		INIT_HLIST_HEAD(&s->s_anon);
 		INIT_LIST_HEAD(&s->s_inodes);
 		init_rwsem(&s->s_umount);
@@ -231,8 +232,9 @@ void generic_shutdown_super(struct super
 	if (root) {
 		sb->s_root = NULL;
 		shrink_dcache_parent(root);
-		shrink_dcache_anon(&sb->s_anon);
+		shrink_dcache_anon(sb);
 		dput(root);
+		dcache_shrinker_wait_sb(sb);
 		fsync_super(sb);
 		lock_super(sb);
 		sb->s_flags &= ~MS_ACTIVE;
--- linux-2.6.15.orig/include/linux/dcache.h	2006-01-03 06:21:10.000000000 +0300
+++ linux-2.6.15-025stab015-dcache-race/include/linux/dcache.h	2006-03-03 16:22:38.843709336 +0300
@@ -209,7 +209,8 @@ extern struct dentry * d_alloc_anon(stru
 extern struct dentry * d_splice_alias(struct inode *, struct dentry *);
 extern void shrink_dcache_sb(struct super_block *);
 extern void shrink_dcache_parent(struct dentry *);
-extern void shrink_dcache_anon(struct hlist_head *);
+extern void shrink_dcache_anon(struct super_block *);
+extern void dcache_shrinker_wait_sb(struct super_block *sb);
 extern int d_invalidate(struct dentry *);
 
 /* only used at mount-time */
--- linux-2.6.15.orig/include/linux/fs.h	2006-01-03 06:21:10.000000000 +0300
+++ linux-2.6.15-025stab015-dcache-race/include/linux/fs.h	2006-03-03 16:22:38.821712680 +0300
@@ -803,6 +803,7 @@ struct super_block {
 	struct list_head	s_io;		/* parked for writeback */
 	struct hlist_head	s_anon;		/* anonymous dentries for (nfs) exporting */
 	struct list_head	s_files;
+	struct list_head	s_dshrinkers;	/* active dcache shrinkers */
 
 	struct block_device	*s_bdev;
 	struct list_head	s_instances;

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-08  3:05               ` Balbir Singh
@ 2006-03-08 11:01                 ` Jan Blunck
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Blunck @ 2006-03-08 11:01 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Neil Brown, Kirill Korotaev, Balbir Singh, linux-kernel,
	Andrew Morton, Olaf Hering, Kirill Korotaev, Al Viro

On Wed, Mar 08, Balbir Singh wrote:

> 
> wait_on_prunes() breaks out if sb->prunes == 0. What if shrink_dcache_parent()
> now calls select_parent(). select_parent() could still find entries 
> with d_count > 0 and skip them and shrink_dcache_memory() can still cause
> the race condition to occur.
> 
> I think pushing wait_on_prunes() to after shrink_dcache_parent() will
> most likely solve the race.
> 

This is why I used to let shrink_dcache_parent() return only after an
unsuccessful select_parent() following a wait.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-08  2:39             ` Neil Brown
@ 2006-03-08  3:05               ` Balbir Singh
  2006-03-08 11:01                 ` Jan Blunck
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2006-03-08  3:05 UTC (permalink / raw)
  To: Neil Brown
  Cc: Kirill Korotaev, Balbir Singh, linux-kernel, Andrew Morton,
	Olaf Hering, Jan Blunck, Kirill Korotaev, Al Viro

> I think that in most cases the race doesn't matter: if
> shrink_dcache_memory misses a dentry because someone else is holding a
> temporary reference, it really doesn't matter.
> Similarly most callers of shrink_dcache_parent are happy with a
> best-effort.

I agree.

> 
> I should have been more explicit that the patch was against
> 2.6.16-rc5-mm2.  This contains some dcache patches to allow nfs
> filesystem to share superblocks, and one of the patches replaces the
> calls to shrink_dcache_parent and shrink_dcache_anon with a single
> call to a new function: shrink_dcache_sb.
>

shrink_dcache_parent() has been added back to generic_shutdown_super in
-mm3 (just checked). With that being the case, I have only one concern
with your patch

wait_on_prunes() breaks out if sb->prunes == 0. What if shrink_dcache_parent()
now calls select_parent(). select_parent() could still find entries 
with d_count > 0 and skip them and shrink_dcache_memory() can still cause
the race condition to occur.

I think pushing wait_on_prunes() to after shrink_dcache_parent() will
most likely solve the race.

 
> Thanks for the feedback

You're welcome!

> 
> NeilBrown

Balbir

-- 
I'm extremely grateful that hundreds of you have taken time to read these
patches, and to detect and report errors that you've found.
Your comments have helped me improve enormously. But I must confess that
I'm also disappointed to have had absolutely no feedback so far on several of
the patches on which I worked hardest when I was preparing these patches.
Could it be that (1) you've said nothing about them because I somehow managed
to get the details perfect? Or is it that (2) you shy away and are busy, hence
you are unable to spend more than a few minutes on any particular topic?
Although I do like to think that readers like to provide feedback, I fear that
hypothesis (1) is far less likely than hypothesis (2). 

Adapted from Don Knuth's comments on feedback for his exercises


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-08  2:17           ` Balbir Singh
@ 2006-03-08  2:39             ` Neil Brown
  2006-03-08  3:05               ` Balbir Singh
  0 siblings, 1 reply; 41+ messages in thread
From: Neil Brown @ 2006-03-08  2:39 UTC (permalink / raw)
  To: balbir
  Cc: Kirill Korotaev, Balbir Singh, linux-kernel, Andrew Morton,
	Olaf Hering, Jan Blunck, Kirill Korotaev, Al Viro

On Wednesday March 8, balbir@in.ibm.com wrote:
> > So: blending the better bits of various patches together, I've come up
> > with the following.  It still needs a changelog entry, but does anyone
> > want to ACK it ???
> 
> I think this patch is much more cleaner and refined.

Thanks.

> 
> From yesterday's comments I am beginning to wonder whether it is enough to
> solve only the unmount race, or whether the fix should be more generic and
> address the race between the shrinker's call to shrink_dcache_memory() and
> shrink_dcache_parent().
> 
> From what I understand of the race, it occurs when the dput of the
> parent fails to happen: because the reference count is not 0,
> shrink_dcache_parent() skips over those dentries. The race occurs for
> the dentry and its ancestors above it.

I think that in most cases the race doesn't matter: if
shrink_dcache_memory misses a dentry because someone else is holding a
temporary reference, it really doesn't matter.
Similarly most callers of shrink_dcache_parent are happy with a
best-effort.

Unmount is a special case because it wants to 'shrink' *all* of the
dentries for the filesystem.  In that case, someone holding a
transient reference is a bad thing.  In other cases it is, at best, a
minor inconvenience. 

> 
> I was wondering if the following would work (to solve the generic race)
> 
> Add a prune_mutex to the super-block. Hold on to it in prune_one_dentry()
> until we hit a parent dentry that is a mount point (d_mounted > 0) or
> the parent has a reference count > 1 or at the end of prune_one_dentry().
> This should ensure that for each super block dentry counts are consistent.
> Also get select_parent() to hold the super block's prune_mutex, so that it
> sees a consistent view of the super block.
> 
> Oh! now that I think about it, I think your solution looks like an
> elegant way to do the same thing. The only draw back is that it solves
> only the unmount race and there are some changes in generic_shutdown_super()
> which I do not understand.
> 

> > diff ./fs/super.c~current~ ./fs/super.c
> > --- ./fs/super.c~current~	2006-03-08 10:50:37.000000000 +1100
> > +++ ./fs/super.c	2006-03-08 11:02:12.000000000 +1100
> > @@ -81,6 +81,8 @@ static struct super_block *alloc_super(v
> >  		mutex_init(&s->s_dquot.dqio_mutex);
> >  		mutex_init(&s->s_dquot.dqonoff_mutex);
> >  		init_rwsem(&s->s_dquot.dqptr_sem);
> > +		s->s_prunes = 0;
> > +		init_waitqueue_head(&s->s_wait_prunes);
> >  		init_waitqueue_head(&s->s_wait_unfrozen);
> >  		s->s_maxbytes = MAX_NON_LFS;
> >  		s->dq_op = sb_dquot_ops;
> > @@ -231,6 +233,7 @@ void generic_shutdown_super(struct super
> >  
> >  	if (root) {
> >  		sb->s_root = NULL;
> > +		wait_on_prunes(sb);
> >  		shrink_dcache_sb(sb);
> 
> Hmm... in 2.6.16-rc5, I see
> 
> shrink_dcache_parent(root);
> shrink_dcache_anon(&sb->s_anon);
> 
> without these calls, some dentries might not get moved to LRU list.
> 
> Am I missing something here?

I should have been more explicit that the patch was against
2.6.16-rc5-mm2.  This contains some dcache patches to allow nfs
filesystem to share superblocks, and one of the patches replaces the
calls to shrink_dcache_parent and shrink_dcache_anon with a single
call to a new function: shrink_dcache_sb.

Thanks for the feedback

NeilBrown


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-08  0:29         ` Neil Brown
@ 2006-03-08  2:17           ` Balbir Singh
  2006-03-08  2:39             ` Neil Brown
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2006-03-08  2:17 UTC (permalink / raw)
  To: Neil Brown
  Cc: Kirill Korotaev, Balbir Singh, linux-kernel, Andrew Morton,
	Olaf Hering, Jan Blunck, Kirill Korotaev, Al Viro

> So: we seem to have two different approaches to solving this problem.
> 
> One is to stop any other thread from calling dentry_iput while the
> umount is running generic_shutdown_super.  This cannot be done with
> the PF_MEMALLOC trick and so would require calls to prune_dcache to
> state their intentions (e.g. pass a 'struct super_block *').
> With this approach, generic_shutdown_super needs to wait for stray
> pruning to finish after marking the superblock as being unmounted, and
> before shrinking the dcache.
> 
> The other is to allow other threads to call dentry_iput at any time,
> but to keep track of them, and to wait for all to finish after
> pruning all the filesystem's dentries.  This requires the extra
> locking in prune_one_dentry to make sure we dput the parent before
> releasing dcache_lock.
> 
> Of these two, I prefer the former, because fiddling with the locking
> in prune_one_dentry is rather intrusive.
> 
> Also, the recent 
>    nfs-permit-filesystem-to-override-root-dentry-on-mount.patch 
> patch in -mm means that generic_shutdown_super does not call
> prune_dcache but has its own pruning code.  This means there is no
> longer a need to pass intentions to prune_dcache.  Prune_dcache can
> always ignore dentries with s->s_root==NULL, and shrink_dcache_sb only
> works with its own dentries.
> 
> So: blending the better bits of various patches together, I've come up
> with the following.  It still needs a changelog entry, but does anyone
> want to ACK it ???

I think this patch is much more cleaner and refined.

From yesterday's comments I am beginning to wonder whether it is enough to
solve only the unmount race, or whether the fix should be more generic and
address the race between the shrinker's call to shrink_dcache_memory() and
shrink_dcache_parent().

From what I understand of the race, it occurs when the dput of the
parent fails to happen: because the reference count is not 0,
shrink_dcache_parent() skips over those dentries. The race occurs for
the dentry and its ancestors above it.

I was wondering if the following would work (to solve the generic race)

Add a prune_mutex to the super-block. Hold on to it in prune_one_dentry()
until we hit a parent dentry that is a mount point (d_mounted > 0) or
the parent has a reference count > 1 or at the end of prune_one_dentry().
This should ensure that for each super block dentry counts are consistent.
Also get select_parent() to hold the super block's prune_mutex, so that it
sees a consistent view of the super block.

Oh! now that I think about it, I think your solution looks like an
elegant way to do the same thing. The only draw back is that it solves
only the unmount race and there are some changes in generic_shutdown_super()
which I do not understand.

Please find some comments for your patch below

> 
> Thanks,
> NeilBrown
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./fs/dcache.c            |   32 ++++++++++++++++++++++++++++++--
>  ./fs/super.c             |    3 +++
>  ./include/linux/dcache.h |    1 +
>  ./include/linux/fs.h     |    3 +++
>  4 files changed, 37 insertions(+), 2 deletions(-)
> 
> diff ./fs/dcache.c~current~ ./fs/dcache.c
> --- ./fs/dcache.c~current~	2006-03-08 10:42:56.000000000 +1100
> +++ ./fs/dcache.c	2006-03-08 11:24:11.000000000 +1100
> @@ -364,17 +364,43 @@ restart:
>   */
>  static inline void prune_one_dentry(struct dentry * dentry)
>  {
> +	struct super_block *sb = dentry->d_sb;
>  	struct dentry * parent;
>  
>  	__d_drop(dentry);
>  	list_del(&dentry->d_u.d_child);
>  	dentry_stat.nr_dentry--;	/* For d_free, below */
> +	sb->s_prunes++;
>  	dentry_iput(dentry);
>  	parent = dentry->d_parent;
>  	d_free(dentry);
>  	if (parent != dentry)
>  		dput(parent);
>  	spin_lock(&dcache_lock);
> +	sb->s_prunes--;
> +	if (waitqueue_active(&sb->s_wait_prunes))
> +		wake_up(&sb->s_wait_prunes);
> +}
> +
> +/* As prune_one_dentry can hold an inode while calling
> + * dentry_iput, generic_shutdown_super needs to wait for any
> + * pending pruning to stop before doing its own dentry
> + * pruning.
> + */
> +void wait_on_prunes(struct super_block *sb)
> +{
> +	DEFINE_WAIT(w);
> +	spin_lock(&dcache_lock);
> +	for (;;) {
> +		prepare_to_wait(&sb->s_wait_prunes, &w, TASK_UNINTERRUPTIBLE);
> +		if (sb->s_prunes == 0)
> +			break;
> +		spin_unlock(&dcache_lock);
> +		schedule();
> +		spin_lock(&dcache_lock);
> +	}
> +	spin_unlock(&dcache_lock);
> +	finish_wait(&sb->s_wait_prunes, &w);
>  }
>  
>  /**
> @@ -417,8 +443,10 @@ static void prune_dcache(int count)
>   			spin_unlock(&dentry->d_lock);
>  			continue;
>  		}
> -		/* If the dentry was recently referenced, don't free it. */
> -		if (dentry->d_flags & DCACHE_REFERENCED) {
> +		/* If the dentry was recently referenced, or if the filesystem
> +		 * is being unmounted, don't free it. */
> +		if ((dentry->d_flags & DCACHE_REFERENCED) ||
> +		    dentry->d_sb->s_root == NULL) {
>  			dentry->d_flags &= ~DCACHE_REFERENCED;
>   			list_add(&dentry->d_lru, &dentry_unused);
>   			dentry_stat.nr_unused++;
> 
> diff ./fs/super.c~current~ ./fs/super.c
> --- ./fs/super.c~current~	2006-03-08 10:50:37.000000000 +1100
> +++ ./fs/super.c	2006-03-08 11:02:12.000000000 +1100
> @@ -81,6 +81,8 @@ static struct super_block *alloc_super(v
>  		mutex_init(&s->s_dquot.dqio_mutex);
>  		mutex_init(&s->s_dquot.dqonoff_mutex);
>  		init_rwsem(&s->s_dquot.dqptr_sem);
> +		s->s_prunes = 0;
> +		init_waitqueue_head(&s->s_wait_prunes);
>  		init_waitqueue_head(&s->s_wait_unfrozen);
>  		s->s_maxbytes = MAX_NON_LFS;
>  		s->dq_op = sb_dquot_ops;
> @@ -231,6 +233,7 @@ void generic_shutdown_super(struct super
>  
>  	if (root) {
>  		sb->s_root = NULL;
> +		wait_on_prunes(sb);
>  		shrink_dcache_sb(sb);

Hmm... in 2.6.16-rc5, I see

shrink_dcache_parent(root);
shrink_dcache_anon(&sb->s_anon);

without these calls, some dentries might not get moved to LRU list.

Am I missing something here?

[snip]

Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  6:16       ` Kirill Korotaev
  2006-03-07  7:03         ` Balbir Singh
@ 2006-03-08  0:29         ` Neil Brown
  2006-03-08  2:17           ` Balbir Singh
  1 sibling, 1 reply; 41+ messages in thread
From: Neil Brown @ 2006-03-08  0:29 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Balbir Singh, linux-kernel, Andrew Morton, Olaf Hering,
	Jan Blunck, Kirill Korotaev, Al Viro

On Tuesday March 7, dev@sw.ru wrote:
> >>The code changes look big, have you looked at
> >>http://marc.theaimsgroup.com/?l=linux-kernel&m=113817279225962&w=2
> > 
> > 
> > No I haven't.  I like it.
> >  - Holding the semaphore shouldn't be a problem.
> >  - calling down_read_trylock ought to be fast
> >  - I *think* the unwanted calls to prune_dcache are always under
> >    PF_MEMALLOC - they certainly seem to be.
> No, it looks as it is not :(
> Have you noticed my comment about "count" argument to prune_dcache()?
> For example, prune_dcache() is called from shrink_dcache_parent() which 
> is called in many places and not all of them have PF_MEMALLOC or 
> s_umount semaphore for write. But prune_dcache() doesn't care for super 
> blocks etc. It simply shrinks N dentries which are found _first_.
> 
> So the condition:
> +		if ((current->flags & PF_MEMALLOC) &&
> +			!(ret = down_read_trylock(&s->s_umount))) {
> is not always true when the race occurs, as PF_MEMALLOC is not always set.
> 
> > And it is a nice small change.
> > Have you had any other feedback on this?
> here it is :)

Thanks....

So: we seem to have two different approaches to solving this problem.

One is to stop any other thread from calling dentry_iput while the
umount is running generic_shutdown_super.  This cannot be done with
the PF_MEMALLOC trick and so would require calls to prune_dcache to
state their intentions (e.g. pass a 'struct super_block *').
With this approach, generic_shutdown_super needs to wait for stray
pruning to finish after marking the superblock as being unmounted, and
before shrinking the dcache.

The other is to allow other threads to call dentry_iput at any time,
but to keep track of them, and to wait for all to finish after
pruning all the filesystem's dentries.  This requires the extra
locking in prune_one_dentry to make sure we dput the parent before
releasing dcache_lock.

Of these two, I prefer the former, because fiddling with the locking
in prune_one_dentry is rather intrusive.

Also, the recent 
   nfs-permit-filesystem-to-override-root-dentry-on-mount.patch 
patch in -mm means that generic_shutdown_super does not call
prune_dcache but has its own pruning code.  This means there is no
longer a need to pass intentions to prune_dcache.  Prune_dcache can
always ignore dentries with s->s_root==NULL, and shrink_dcache_sb only
works with its own dentries.

So: blending the better bits of various patches together, I've come up
with the following.  It still needs a changelog entry, but does anyone
want to ACK it ???

Thanks,
NeilBrown

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/dcache.c            |   32 ++++++++++++++++++++++++++++++--
 ./fs/super.c             |    3 +++
 ./include/linux/dcache.h |    1 +
 ./include/linux/fs.h     |    3 +++
 4 files changed, 37 insertions(+), 2 deletions(-)

diff ./fs/dcache.c~current~ ./fs/dcache.c
--- ./fs/dcache.c~current~	2006-03-08 10:42:56.000000000 +1100
+++ ./fs/dcache.c	2006-03-08 11:24:11.000000000 +1100
@@ -364,17 +364,43 @@ restart:
  */
 static inline void prune_one_dentry(struct dentry * dentry)
 {
+	struct super_block *sb = dentry->d_sb;
 	struct dentry * parent;
 
 	__d_drop(dentry);
 	list_del(&dentry->d_u.d_child);
 	dentry_stat.nr_dentry--;	/* For d_free, below */
+	sb->s_prunes++;
 	dentry_iput(dentry);
 	parent = dentry->d_parent;
 	d_free(dentry);
 	if (parent != dentry)
 		dput(parent);
 	spin_lock(&dcache_lock);
+	sb->s_prunes--;
+	if (waitqueue_active(&sb->s_wait_prunes))
+		wake_up(&sb->s_wait_prunes);
+}
+
+/* As prune_one_dentry can hold an inode while calling
+ * dentry_iput, generic_shutdown_super needs to wait for any
+ * pending pruning to stop before doing its own dentry
+ * pruning.
+ */
+void wait_on_prunes(struct super_block *sb)
+{
+	DEFINE_WAIT(w);
+	spin_lock(&dcache_lock);
+	for (;;) {
+		prepare_to_wait(&sb->s_wait_prunes, &w, TASK_UNINTERRUPTIBLE);
+		if (sb->s_prunes == 0)
+			break;
+		spin_unlock(&dcache_lock);
+		schedule();
+		spin_lock(&dcache_lock);
+	}
+	spin_unlock(&dcache_lock);
+	finish_wait(&sb->s_wait_prunes, &w);
 }
 
 /**
@@ -417,8 +443,10 @@ static void prune_dcache(int count)
  			spin_unlock(&dentry->d_lock);
 			continue;
 		}
-		/* If the dentry was recently referenced, don't free it. */
-		if (dentry->d_flags & DCACHE_REFERENCED) {
+		/* If the dentry was recently referenced, or if the filesystem
+		 * is being unmounted, don't free it. */
+		if ((dentry->d_flags & DCACHE_REFERENCED) ||
+		    dentry->d_sb->s_root == NULL) {
 			dentry->d_flags &= ~DCACHE_REFERENCED;
  			list_add(&dentry->d_lru, &dentry_unused);
  			dentry_stat.nr_unused++;

diff ./fs/super.c~current~ ./fs/super.c
--- ./fs/super.c~current~	2006-03-08 10:50:37.000000000 +1100
+++ ./fs/super.c	2006-03-08 11:02:12.000000000 +1100
@@ -81,6 +81,8 @@ static struct super_block *alloc_super(v
 		mutex_init(&s->s_dquot.dqio_mutex);
 		mutex_init(&s->s_dquot.dqonoff_mutex);
 		init_rwsem(&s->s_dquot.dqptr_sem);
+		s->s_prunes = 0;
+		init_waitqueue_head(&s->s_wait_prunes);
 		init_waitqueue_head(&s->s_wait_unfrozen);
 		s->s_maxbytes = MAX_NON_LFS;
 		s->dq_op = sb_dquot_ops;
@@ -231,6 +233,7 @@ void generic_shutdown_super(struct super
 
 	if (root) {
 		sb->s_root = NULL;
+		wait_on_prunes(sb);
 		shrink_dcache_sb(sb);
 		dput(root);
 		fsync_super(sb);

diff ./include/linux/dcache.h~current~ ./include/linux/dcache.h
--- ./include/linux/dcache.h~current~	2006-03-08 11:07:50.000000000 +1100
+++ ./include/linux/dcache.h	2006-03-08 11:25:14.000000000 +1100
@@ -220,6 +220,7 @@ extern void shrink_dcache_sb(struct supe
 extern void shrink_dcache_parent(struct dentry *);
 extern void shrink_dcache_anon(struct hlist_head *);
 extern int d_invalidate(struct dentry *);
+extern void wait_on_prunes(struct super_block *);
 
 /* only used at mount-time */
 extern struct dentry * d_alloc_root(struct inode *);

diff ./include/linux/fs.h~current~ ./include/linux/fs.h
--- ./include/linux/fs.h~current~	2006-03-08 11:02:23.000000000 +1100
+++ ./include/linux/fs.h	2006-03-08 11:03:02.000000000 +1100
@@ -838,6 +838,9 @@ struct super_block {
 	struct list_head	s_instances;
 	struct quota_info	s_dquot;	/* Diskquota specific options */
 
+	unsigned int		s_prunes;	/* protected by dcache_lock */
+	wait_queue_head_t	s_wait_prunes;
+
 	int			s_frozen;
 	wait_queue_head_t	s_wait_unfrozen;
 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  6:20       ` Kirill Korotaev
@ 2006-03-07 23:20         ` Neil Brown
  2006-03-09 12:03           ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Neil Brown @ 2006-03-07 23:20 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Kirill Korotaev, linux-kernel, Andrew Morton, Olaf Hering,
	Jan Blunck, Al Viro

On Tuesday March 7, dev@sw.ru wrote:
> >>In general your patch still does what mine does, so I will be happy if 
> >>any of this is commited mainstream. In future, please, keep the 
> >>reference to original authors, this will also make sure that I'm on CC 
> >>if something goes wrong.
> > 
> > 
> > Sorry: which 'original author' did I miss ?
> I mean, it is better to mention original author 
> (http://marc.theaimsgroup.com/?l=linux-kernel&m=114123870406751&w=2) in 
> patch description, as it makes sure that he will be on CC if this patch 
> will be discussed later again. My patch fixing this race was in -mm tree 
> for half a year already.

Which patch is that?  The race still seems to be present in -mm.

NeilBrown

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  7:21           ` Kirill Korotaev
@ 2006-03-07 11:05             ` Balbir Singh
  0 siblings, 0 replies; 41+ messages in thread
From: Balbir Singh @ 2006-03-07 11:05 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Neil Brown, Balbir Singh, linux-kernel, Andrew Morton,
	Olaf Hering, Jan Blunck, Kirill Korotaev, Al Viro

> >Given that background, I thought our main concern was with respect to
> >unmount. The race was between shrink_dcache_parent() (called from unmount)
> >and shrink_dcache_memory() (called from the allocator), hence the fix
> >for the race condition.
> Partial fix doesn't make much sense from my point of view.
>

IMHO, it was not a partial fix. The drop_slab() addition changed the
assumptions used by this fix.
 
> >I just noticed that 2.6.16-rc* now seems to have drop_slab() where
> >PF_MEMALLOC is not set. So, we can still race with my fix if
> >/proc/sys/vm/drop_caches is written to and unmount is done in parallel.
> >
> >A simple hack would be to set PF_MEMALLOC in drop_slab(), but I do not
> >think it is a good idea.
> Yeah, playing with PF_MEMALLOC can be not so good idea :/
> And as it doesn't help in other cases it looks unpromising...

Yes, agreed.

> 
> >>>Have you had any other feedback on this?
> >>here it is :)
> >Thanks for your detailed feedback
> Sorry, that I did it too late :/
> 

No problem

> Thanks,
> Kirill
> 

Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  7:03         ` Balbir Singh
@ 2006-03-07  7:21           ` Kirill Korotaev
  2006-03-07 11:05             ` Balbir Singh
  0 siblings, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-03-07  7:21 UTC (permalink / raw)
  To: balbir
  Cc: Neil Brown, Balbir Singh, linux-kernel, Andrew Morton,
	Olaf Hering, Jan Blunck, Kirill Korotaev, Al Viro

>>No, it looks as it is not :(
>>Have you noticed my comment about "count" argument to prune_dcache()?
>>For example, prune_dcache() is called from shrink_dcache_parent() which 
>>is called in many places and not all of them have PF_MEMALLOC or 
>>s_umount semaphore for write. But prune_dcache() doesn't care for super 
>>blocks etc. It simply shrinks N dentries which are found _first_.
>>
>>So the condition:
>>+		if ((current->flags & PF_MEMALLOC) &&
>>+			!(ret = down_read_trylock(&s->s_umount))) {
>>is not always true when the race occurs, as PF_MEMALLOC is not always set.
> 
> 
> I understand your comment about shrink_dcache_parent() being called
> from several places. prune_one_dentry() would eventually dput the parent,
> but unmount would go ahead and unmount the filesystem before the
> dput of the parent could happen.
exactly.

> Given that background, I thought our main concern was with respect to
> unmount. The race was between shrink_dcache_parent() (called from unmount)
> and shrink_dcache_memory() (called from the allocator), hence the fix
> for the race condition.
Partial fix doesn't make much sense from my point of view.

> I just noticed that 2.6.16-rc* now seems to have drop_slab() where
> PF_MEMALLOC is not set. So, we can still race with my fix if
> /proc/sys/vm/drop_caches is written to and unmount is done in parallel.
> 
> A simple hack would be to set PF_MEMALLOC in drop_slab(), but I do not
> think it is a good idea.
Yeah, playing with PF_MEMALLOC may not be such a good idea :/
And as it doesn't help in the other cases, it looks unpromising...

>>>Have you had any other feedback on this?
>>here it is :)
> Thanks for your detailed feedback
Sorry that I did it too late :/

Thanks,
Kirill




* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  6:16       ` Kirill Korotaev
@ 2006-03-07  7:03         ` Balbir Singh
  2006-03-07  7:21           ` Kirill Korotaev
  2006-03-08  0:29         ` Neil Brown
  1 sibling, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2006-03-07  7:03 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Neil Brown, Balbir Singh, linux-kernel, Andrew Morton,
	Olaf Hering, Jan Blunck, Kirill Korotaev, Al Viro

On Tue, Mar 07, 2006 at 09:16:22AM +0300, Kirill Korotaev wrote:
> >>The code changes look big, have you looked at
> >>http://marc.theaimsgroup.com/?l=linux-kernel&m=113817279225962&w=2
> >
> >
> >No I haven't.  I like it.
> > - Holding the semaphore shouldn't be a problem.
> > - calling down_read_trylock ought to be fast
> > - I *think* the unwanted calls to prune_dcache are always under
> >   PF_MEMALLOC - they certainly seem to be.
> No, it looks like it is not :(
> Have you noticed my comment about the "count" argument to prune_dcache()?
> For example, prune_dcache() is called from shrink_dcache_parent(), which
> is called in many places, and not all of them have PF_MEMALLOC set or hold
> the s_umount semaphore for write. But prune_dcache() doesn't care about
> super blocks etc. It simply shrinks the N dentries which are found _first_.
> 
> So the condition:
> +		if ((current->flags & PF_MEMALLOC) &&
> +			!(ret = down_read_trylock(&s->s_umount))) {
> is not always true when the race occurs, as PF_MEMALLOC is not always set.

I understand your comment about shrink_dcache_parent() being called
from several places. prune_one_dentry() would eventually dput the parent,
but unmount would go ahead and unmount the filesystem before the
dput of the parent could happen.

Given that background, I thought our main concern was with respect to
unmount. The race was between shrink_dcache_parent() (called from unmount)
and shrink_dcache_memory() (called from the allocator), hence the fix
for the race condition.

I just noticed that 2.6.16-rc* now seems to have drop_slab(), where
PF_MEMALLOC is not set. So, we can still race with my fix if
/proc/sys/vm/drop_caches is written to and unmount is done in parallel.

A simple hack would be to set PF_MEMALLOC in drop_slab(), but I do not
think it is a good idea.

> 
> >And it is a nice small change.
> >Have you had any other feedback on this?
> here it is :)
> 

Thanks for your detailed feedback

> Thanks,
> Kirill
> 

Regards,
Balbir


* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  2:49       ` Balbir Singh
@ 2006-03-07  6:22         ` Kirill Korotaev
  0 siblings, 0 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-03-07  6:22 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Neil Brown, linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck,
	Kirill Korotaev, Al Viro

>>>>+               /* avoid further wakeups */
>>>>+               sb->s_pending_iputs = 65000;
>>>
>>>This looks a bit ugly, what is 65000?
>>
>>Just the first big number that came to my head... probably not needed.
>>
> 
> 
> ok, I would rather use a const or a #define and hide it under a
> meaningful name, with comments. If it is not needed, then nothing like
> avoiding magic numbers.
It looks like this assignment is not needed at all if "wait_for_prunes"
is moved after dput(root), since no more dentries should exist after
that point and the wakeup cannot happen.

Thanks,
Kirill



* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  2:01     ` Neil Brown
@ 2006-03-07  6:20       ` Kirill Korotaev
  2006-03-07 23:20         ` Neil Brown
  0 siblings, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-03-07  6:20 UTC (permalink / raw)
  To: Neil Brown
  Cc: Kirill Korotaev, linux-kernel, Andrew Morton, Olaf Hering,
	Jan Blunck, Al Viro

>>In general your patch still does what mine does, so I will be happy if
>>any of this is committed mainstream. In future, please keep the
>>reference to the original authors; this will also make sure that I'm on CC
>>if something goes wrong.
> 
> 
> Sorry: which 'original author' did I miss ?
I mean, it is better to mention the original author
(http://marc.theaimsgroup.com/?l=linux-kernel&m=114123870406751&w=2) in the
patch description, as it makes sure that he will be on CC if this patch
is discussed again later. My patch fixing this race has been in the -mm tree
for half a year already.

Thanks,
Kirill




* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  1:58     ` Neil Brown
  2006-03-07  2:49       ` Balbir Singh
@ 2006-03-07  6:16       ` Kirill Korotaev
  2006-03-07  7:03         ` Balbir Singh
  2006-03-08  0:29         ` Neil Brown
  1 sibling, 2 replies; 41+ messages in thread
From: Kirill Korotaev @ 2006-03-07  6:16 UTC (permalink / raw)
  To: Neil Brown
  Cc: Balbir Singh, linux-kernel, Andrew Morton, Olaf Hering,
	Jan Blunck, Kirill Korotaev, Al Viro

>>The code changes look big, have you looked at
>>http://marc.theaimsgroup.com/?l=linux-kernel&m=113817279225962&w=2
> 
> 
> No I haven't.  I like it.
>  - Holding the semaphore shouldn't be a problem.
>  - calling down_read_trylock ought to be fast
>  - I *think* the unwanted calls to prune_dcache are always under
>    PF_MEMALLOC - they certainly seem to be.
No, it looks like it is not :(
Have you noticed my comment about the "count" argument to prune_dcache()?
For example, prune_dcache() is called from shrink_dcache_parent(), which
is called in many places, and not all of them have PF_MEMALLOC set or hold
the s_umount semaphore for write. But prune_dcache() doesn't care about
super blocks etc. It simply shrinks the N dentries which are found _first_.

So the condition:
+		if ((current->flags & PF_MEMALLOC) &&
+			!(ret = down_read_trylock(&s->s_umount))) {
is not always true when the race occurs, as PF_MEMALLOC is not always set.

> And it is a nice small change.
> Have you had any other feedback on this?
here it is :)

Thanks,
Kirill



* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-07  1:58     ` Neil Brown
@ 2006-03-07  2:49       ` Balbir Singh
  2006-03-07  6:22         ` Kirill Korotaev
  2006-03-07  6:16       ` Kirill Korotaev
  1 sibling, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2006-03-07  2:49 UTC (permalink / raw)
  To: Neil Brown
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck,
	Kirill Korotaev, Al Viro

> No I haven't.  I like it.
>  - Holding the semaphore shouldn't be a problem.
>  - calling down_read_trylock ought to be fast
>  - I *think* the unwanted calls to prune_dcache are always under
>    PF_MEMALLOC - they certainly seem to be.
>
> And it is a nice small change.
> Have you had any other feedback on this?
>
>

Thanks, I do not have any feedback on it, but I am certainly hungry for it :-)

> >
> > Some top of the head feedback below. Will try and do a detailed review later.
> >
>
> > > +               /* avoid further wakeups */
> > > +               sb->s_pending_iputs = 65000;
> >
> > This looks a bit ugly, what is 65000?
>
> Just the first big number that came to my head... probably not needed.
>

OK, I would rather use a const or a #define and hide it behind a
meaningful name, with comments. If it is not needed, then nothing beats
avoiding magic numbers altogether.

>
> NeilBrown
>


* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-06 11:56   ` Jan Blunck
@ 2006-03-07  2:15     ` Neil Brown
  0 siblings, 0 replies; 41+ messages in thread
From: Neil Brown @ 2006-03-07  2:15 UTC (permalink / raw)
  To: Jan Blunck
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Kirill Korotaev, Al Viro

On Monday March 6, jblunck@suse.de wrote:
> 
> These are two different problems which you address with this and your first
> patch. This one is to prevent busy inodes on umount, the first one was to get
> the reference counting on dentries right.

I think that solving the "busy inodes" problem is sufficient.  The
reference count on dentries isn't *wrong* as someone is actually
holding a reference.  It is just that generic_shutdown_super doesn't
expect anyone else to hold any references.  Fixing the "busy inodes"
problem means that no-one else will be holding any references, so it
becomes a non-problem.
> 
> Neil, did you actually read my patch for this one?!
> http://marc.theaimsgroup.com/?l=linux-kernel&m=114123870406751&w=2

No, I didn't :-(  I obviously didn't do enough homework.

The significant differences seem to be:
 - you test ->s_prunes inside the spinlock.  I don't bother.  Yours is
   probably safer.
 - You call wake_up every time through prune_one_dentry while I try to
   limit the calls.  As each call is a function call and a spinlock,
   maybe at least guard it with
      if(waitqueue_active()) ...


> 
> What I don't like is that you are serializing the work of shrink_dcache_*
> although they could work in parallel on different processors.

I don't see how I am serializing anything.  Multiple shrink_dcache_*
calls can still run.  The only place where extra locking is done is in
generic_shutdown_super.

But what do you think of Balbir Singh's patch?  I think it is less
intrusive and solves the problem just as well.

Thanks,
NeilBrown




* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-06 11:56   ` Kirill Korotaev
@ 2006-03-07  2:01     ` Neil Brown
  2006-03-07  6:20       ` Kirill Korotaev
  0 siblings, 1 reply; 41+ messages in thread
From: Neil Brown @ 2006-03-07  2:01 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck, Al Viro

On Monday March 6, dev@openvz.org wrote:
> 
> In general your patch still does what mine does, so I will be happy if
> any of this is committed mainstream. In future, please keep the
> reference to the original authors; this will also make sure that I'm on CC
> if something goes wrong.

Sorry: which 'original author' did I miss ?

> >  
> > +		if (dentry->d_sb->s_root == NULL &&
> > +		    (parent == NULL ||
> > +		     parent->d_sb != dentry->d_sb))
> > +			continue;
> <<<<
> - we select some dentries in select_parent() and then try to prune N
> dentries. But these can be N other dentries :/ It's not a problem with
> your code, but... adding a 'parent' arg to prune_dcache() makes me feel
> the same as with the 'found' arg: you don't actually use it the right way.
> You don't prune only its children, you see? It is better to pass an 'sb'
> arg instead.
> 

That's a valid point.  I should be passing 'sb'.

Thanks,
NeilBrown


* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-06  7:32   ` Balbir Singh
@ 2006-03-07  1:58     ` Neil Brown
  2006-03-07  2:49       ` Balbir Singh
  2006-03-07  6:16       ` Kirill Korotaev
  0 siblings, 2 replies; 41+ messages in thread
From: Neil Brown @ 2006-03-07  1:58 UTC (permalink / raw)
  To: Balbir Singh
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck,
	Kirill Korotaev, Al Viro

On Monday March 6, bsingharora@gmail.com wrote:
> > Somewhere in among the comments (thanks), I realised that I was only
> > closing half the race.  I had tried to make sure there were no stray
> > references to any dentries, but there is still the inode which is
> > being iput, which can cause problems.
> >
> > The following patch takes a totally different approach, is based on an
> > idea from Jan Kara, and is much less intrusive.
> >
> > We:
> >   - keep track of "who" is calling prune_dcache, and when a filesystem
> >     is being unmounted (s_root == NULL) we only allow the unmount thread
> >     to prune dentries.
> >   - keep track of how many dentries are in the process of having
> >     dentry_iput called on them for pruning
> >   - don't allow umount to proceed until that count hits zero
> >   - bias the count this way and that to make sure we get a wake_up at
> >     the right time
> >   - reuse 's_wait_unfrozen' to wait on the iput to complete.
> >
> > Again, I'm very keen on feedback.  This race is very hard to trigger,
> > so code review is the only real way to evaluate that patch.
> >
> > Thanks,
> > NeilBrown
> >
> 
> The code changes look big, have you looked at
> http://marc.theaimsgroup.com/?l=linux-kernel&m=113817279225962&w=2

No I haven't.  I like it.
 - Holding the semaphore shouldn't be a problem.
 - calling down_read_trylock ought to be fast
 - I *think* the unwanted calls to prune_dcache are always under
   PF_MEMALLOC - they certainly seem to be.

And it is a nice small change.
Have you had any other feedback on this?


> 
> Some top of the head feedback below. Will try and do a detailed review later.
> 

> > +               /* avoid further wakeups */
> > +               sb->s_pending_iputs = 65000;
> 
> This looks a bit ugly, what is 65000?

Just the first big number that came to my head... probably not needed.


NeilBrown


* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-06  6:09 ` Neil Brown
  2006-03-06  7:32   ` Balbir Singh
  2006-03-06 11:56   ` Jan Blunck
@ 2006-03-06 11:56   ` Kirill Korotaev
  2006-03-07  2:01     ` Neil Brown
  2 siblings, 1 reply; 41+ messages in thread
From: Kirill Korotaev @ 2006-03-06 11:56 UTC (permalink / raw)
  To: Neil Brown
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck,
	Kirill Korotaev, Al Viro

>>Comments?  Please :-?
ok :) see inline.

> Somewhere in among the comments (thanks), I realised that I was only
> closing half the race.  I had tried to make sure there were no stray
> references to any dentries, but there is still the inode which is
> being iput, which can cause problems.
> 
> The following patch takes a totally different approach, is based on an
> idea from Jan Kara, and is much less intrusive.
> 
> We:
>   - keep track of "who" is calling prune_dcache, and when a filesystem
>     is being unmounted (s_root == NULL) we only allow the unmount thread
>     to prune dentries.
>   - keep track of how many dentries are in the process of having
>     dentry_iput called on them for pruning
>   - don't allow umount to proceed until that count hits zero
>   - bias the count this way and that to make sure we get a wake_up at
>     the right time
>   - reuse 's_wait_unfrozen' to wait on the iput to complete.
> 
> Again, I'm very keen on feedback.  This race is very hard to trigger,
> so code review is the only real way to evaluate that patch.
hmm... It's not that hard. I suppose no one ever tried except us.
We have a test which does mounts/umounts and lots of other activity
that creates memory pressure. AFAIR, it takes us ~3 hours to trigger this
bug on SMP.

In general your patch still does what mine does, so I will be happy if
any of this is committed mainstream. In future, please keep the
reference to the original authors; this will also make sure that I'm on CC
if something goes wrong.

Thanks,
Kirill

> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./fs/dcache.c        |   17 +++++++++++++----
>  ./fs/super.c         |   11 +++++++++++
>  ./include/linux/fs.h |    2 ++
>  3 files changed, 26 insertions(+), 4 deletions(-)
> 
> diff ./fs/dcache.c~current~ ./fs/dcache.c
> --- ./fs/dcache.c~current~	2006-03-06 16:54:59.000000000 +1100
> +++ ./fs/dcache.c	2006-03-06 16:55:33.000000000 +1100
> @@ -366,6 +366,7 @@ static inline void prune_one_dentry(stru
>  {
>  	struct dentry * parent;
>  
> +	dentry->d_sb->s_pending_iputs ++;
>  	__d_drop(dentry);
>  	list_del(&dentry->d_u.d_child);
>  	dentry_stat.nr_dentry--;	/* For d_free, below */
> @@ -375,6 +376,9 @@ static inline void prune_one_dentry(stru
>  	if (parent != dentry)
>  		dput(parent);
>  	spin_lock(&dcache_lock);
> +	dentry->d_sb->s_pending_iputs --;
> +	if (dentry->d_sb->s_pending_iputs < 0)
> +		wake_up(&dentry->d_sb->s_wait_unfrozen);
>  }
>  
>  /**
> @@ -390,7 +394,7 @@ static inline void prune_one_dentry(stru
>   * all the dentries are in use.
>   */
>   
> -static void prune_dcache(int count)
> +static void prune_dcache(int count, struct dentry *parent)
>  {
>  	spin_lock(&dcache_lock);
>  	for (; count ; count--) {
> @@ -407,6 +411,11 @@ static void prune_dcache(int count)
>   		dentry_stat.nr_unused--;
>  		dentry = list_entry(tmp, struct dentry, d_lru);
>  
> +		if (dentry->d_sb->s_root == NULL &&
> +		    (parent == NULL ||
> +		     parent->d_sb != dentry->d_sb))
> +			continue;
<<<<
- we select some dentries in select_parent() and then try to prune N
dentries. But these can be N other dentries :/ It's not a problem with
your code, but... adding a 'parent' arg to prune_dcache() makes me feel
the same as with the 'found' arg: you don't actually use it the right way.
You don't prune only its children, you see? It is better to pass an 'sb'
arg instead.

> +
>   		spin_lock(&dentry->d_lock);
>  		/*
>  		 * We found an inuse dentry which was not removed from
> @@ -635,7 +644,7 @@ void shrink_dcache_parent(struct dentry 
>  	int found;
>  
>  	while ((found = select_parent(parent)) != 0)
> -		prune_dcache(found);
> +		prune_dcache(found, parent);
>  }
>  
>  /**
> @@ -673,7 +682,7 @@ void shrink_dcache_anon(struct hlist_hea
>  			}
>  		}
>  		spin_unlock(&dcache_lock);
> -		prune_dcache(found);
> +		prune_dcache(found, NULL);
>  	} while(found);
>  }
>  
> @@ -694,7 +703,7 @@ static int shrink_dcache_memory(int nr, 
>  	if (nr) {
>  		if (!(gfp_mask & __GFP_FS))
>  			return -1;
> -		prune_dcache(nr);
> +		prune_dcache(nr, NULL);
>  	}
>  	return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
>  }
> 
> diff ./fs/super.c~current~ ./fs/super.c
> --- ./fs/super.c~current~	2006-03-06 16:54:59.000000000 +1100
> +++ ./fs/super.c	2006-03-06 16:57:19.000000000 +1100
> @@ -230,7 +230,18 @@ void generic_shutdown_super(struct super
>  	struct super_operations *sop = sb->s_op;
>  
>  	if (root) {
> +		spin_lock(&dcache_lock);
> +		/* disable stray dputs */
>  		sb->s_root = NULL;
> +
> +		/* trigger a wake_up */
> +		sb->s_pending_iputs --;
> +		spin_unlock(&dcache_lock);
> +		wait_event(sb->s_wait_unfrozen,
> +			   sb->s_pending_iputs < 0);
> +		/* avoid further wakeups */
> +		sb->s_pending_iputs = 65000;
<<< it's ugly... :( why don't you do the wait after the shrinks and dput(root)
below?

> +
>  		shrink_dcache_parent(root);
>  		shrink_dcache_anon(&sb->s_anon);
>  		dput(root);
> 
> diff ./include/linux/fs.h~current~ ./include/linux/fs.h
> --- ./include/linux/fs.h~current~	2006-03-06 16:54:59.000000000 +1100
> +++ ./include/linux/fs.h	2006-03-06 12:49:55.000000000 +1100
> @@ -833,6 +833,8 @@ struct super_block {
>  	struct hlist_head	s_anon;		/* anonymous dentries for (nfs) exporting */
>  	struct list_head	s_files;
>  
> +	int			s_pending_iputs;
> +
>  	struct block_device	*s_bdev;
>  	struct list_head	s_instances;
>  	struct quota_info	s_dquot;	/* Diskquota specific options */
> 




* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-06  6:09 ` Neil Brown
  2006-03-06  7:32   ` Balbir Singh
@ 2006-03-06 11:56   ` Jan Blunck
  2006-03-07  2:15     ` Neil Brown
  2006-03-06 11:56   ` Kirill Korotaev
  2 siblings, 1 reply; 41+ messages in thread
From: Jan Blunck @ 2006-03-06 11:56 UTC (permalink / raw)
  To: Neil Brown
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Kirill Korotaev, Al Viro

On Mon, Mar 06, Neil Brown wrote:

> On Thursday March 2, neilb@suse.de wrote:
> > 
> > 
> > Hi,
> >  This mail relates to the thread with the same subject which can be
> >  found at
> > 
> >     http://lkml.org/lkml/2006/1/16/279
> > 
> >  I would like to propose an alternate patch for the problem.
> ....
> > 
> > Comments?  Please :-?
> 
> Somewhere in among the comments (thanks), I realised that I was only
> closing half the race.  I had tried to make sure there were no stray
> references to any dentries, but there is still the inode which is
> being iput, which can cause problems.
> 
> The following patch takes a totally different approach, is based on an
> idea from Jan Kara, and is much less intrusive.
> 
> We:
>   - keep track of "who" is calling prune_dcache, and when a filesystem
>     is being unmounted (s_root == NULL) we only allow the unmount thread
>     to prune dentries.
>   - keep track of how many dentries are in the process of having
>     dentry_iput called on them for pruning
>   - don't allow umount to proceed until that count hits zero
>   - bias the count this way and that to make sure we get a wake_up at
>     the right time
>   - reuse 's_wait_unfrozen' to wait on the iput to complete.
> 
> Again, I'm very keen on feedback.  This race is very hard to trigger,
> so code review is the only real way to evaluate that patch.
> 

Just ask Olaf. AFAIK he was the one who triggered it frequently.

These are two different problems which you address with this and your first
patch. This one is to prevent busy inodes on umount, the first one was to get
the reference counting on dentries right.

Neil, did you actually read my patch for this one?!
http://marc.theaimsgroup.com/?l=linux-kernel&m=114123870406751&w=2

> diff ./fs/super.c~current~ ./fs/super.c
> --- ./fs/super.c~current~	2006-03-06 16:54:59.000000000 +1100
> +++ ./fs/super.c	2006-03-06 16:57:19.000000000 +1100
> @@ -230,7 +230,18 @@ void generic_shutdown_super(struct super
>  	struct super_operations *sop = sb->s_op;
>  
>  	if (root) {
> +		spin_lock(&dcache_lock);
> +		/* disable stray dputs */
>  		sb->s_root = NULL;
> +
> +		/* trigger a wake_up */
> +		sb->s_pending_iputs --;
> +		spin_unlock(&dcache_lock);
> +		wait_event(sb->s_wait_unfrozen,
> +			   sb->s_pending_iputs < 0);
> +		/* avoid further wakeups */
> +		sb->s_pending_iputs = 65000;
> +
>  		shrink_dcache_parent(root);
>  		shrink_dcache_anon(&sb->s_anon);
>  		dput(root);
> 

What I don't like is that you are serializing the work of shrink_dcache_*
although they could work in parallel on different processors.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de


* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-06  6:09 ` Neil Brown
@ 2006-03-06  7:32   ` Balbir Singh
  2006-03-07  1:58     ` Neil Brown
  2006-03-06 11:56   ` Jan Blunck
  2006-03-06 11:56   ` Kirill Korotaev
  2 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2006-03-06  7:32 UTC (permalink / raw)
  To: Neil Brown
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck,
	Kirill Korotaev, Al Viro

> Somewhere in among the comments (thanks), I realised that I was only
> closing half the race.  I had tried to make sure there were no stray
> references to any dentries, but there is still the inode which is
> being iput, which can cause problems.
>
> The following patch takes a totally different approach, is based on an
> idea from Jan Kara, and is much less intrusive.
>
> We:
>   - keep track of "who" is calling prune_dcache, and when a filesystem
>     is being unmounted (s_root == NULL) we only allow the unmount thread
>     to prune dentries.
>   - keep track of how many dentries are in the process of having
>     dentry_iput called on them for pruning
>   - don't allow umount to proceed until that count hits zero
>   - bias the count this way and that to make sure we get a wake_up at
>     the right time
>   - reuse 's_wait_unfrozen' to wait on the iput to complete.
>
> Again, I'm very keen on feedback.  This race is very hard to trigger,
> so code review is the only real way to evaluate that patch.
>
> Thanks,
> NeilBrown
>

The code changes look big; have you looked at
http://marc.theaimsgroup.com/?l=linux-kernel&m=113817279225962&w=2

Some top of the head feedback below. Will try and do a detailed review later.

>
> Signed-off-by: Neil Brown <neilb@suse.de>
>
> ### Diffstat output
>  ./fs/dcache.c        |   17 +++++++++++++----
>  ./fs/super.c         |   11 +++++++++++
>  ./include/linux/fs.h |    2 ++
>  3 files changed, 26 insertions(+), 4 deletions(-)
>
> diff ./fs/dcache.c~current~ ./fs/dcache.c
> --- ./fs/dcache.c~current~      2006-03-06 16:54:59.000000000 +1100
> +++ ./fs/dcache.c       2006-03-06 16:55:33.000000000 +1100
> @@ -366,6 +366,7 @@ static inline void prune_one_dentry(stru
>  {
>         struct dentry * parent;
>
> +       dentry->d_sb->s_pending_iputs ++;
>         __d_drop(dentry);
>         list_del(&dentry->d_u.d_child);
>         dentry_stat.nr_dentry--;        /* For d_free, below */
> @@ -375,6 +376,9 @@ static inline void prune_one_dentry(stru
>         if (parent != dentry)
>                 dput(parent);
>         spin_lock(&dcache_lock);
> +       dentry->d_sb->s_pending_iputs --;
> +       if (dentry->d_sb->s_pending_iputs < 0)
> +               wake_up(&dentry->d_sb->s_wait_unfrozen);
>  }
>
>  /**
> @@ -390,7 +394,7 @@ static inline void prune_one_dentry(stru
>   * all the dentries are in use.
>   */
>
> -static void prune_dcache(int count)
> +static void prune_dcache(int count, struct dentry *parent)
>  {
>         spin_lock(&dcache_lock);
>         for (; count ; count--) {
> @@ -407,6 +411,11 @@ static void prune_dcache(int count)
>                 dentry_stat.nr_unused--;
>                 dentry = list_entry(tmp, struct dentry, d_lru);
>
> +               if (dentry->d_sb->s_root == NULL &&
> +                   (parent == NULL ||
> +                    parent->d_sb != dentry->d_sb))
> +                       continue;
> +
>                 spin_lock(&dentry->d_lock);
>                 /*
>                  * We found an inuse dentry which was not removed from
> @@ -635,7 +644,7 @@ void shrink_dcache_parent(struct dentry
>         int found;
>
>         while ((found = select_parent(parent)) != 0)
> -               prune_dcache(found);
> +               prune_dcache(found, parent);
>  }
>
>  /**
> @@ -673,7 +682,7 @@ void shrink_dcache_anon(struct hlist_hea
>                         }
>                 }
>                 spin_unlock(&dcache_lock);
> -               prune_dcache(found);
> +               prune_dcache(found, NULL);
>         } while(found);
>  }
>
> @@ -694,7 +703,7 @@ static int shrink_dcache_memory(int nr,
>         if (nr) {
>                 if (!(gfp_mask & __GFP_FS))
>                         return -1;
> -               prune_dcache(nr);
> +               prune_dcache(nr, NULL);
>         }
>         return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
>  }
>
> diff ./fs/super.c~current~ ./fs/super.c
> --- ./fs/super.c~current~       2006-03-06 16:54:59.000000000 +1100
> +++ ./fs/super.c        2006-03-06 16:57:19.000000000 +1100
> @@ -230,7 +230,18 @@ void generic_shutdown_super(struct super
>         struct super_operations *sop = sb->s_op;
>
>         if (root) {
> +               spin_lock(&dcache_lock);
> +               /* disable stray dputs */
>                 sb->s_root = NULL;
> +
> +               /* trigger a wake_up */
> +               sb->s_pending_iputs --;
> +               spin_unlock(&dcache_lock);
> +               wait_event(sb->s_wait_unfrozen,
> +                          sb->s_pending_iputs < 0);
> +               /* avoid further wakeups */
> +               sb->s_pending_iputs = 65000;

This looks a bit ugly, what is 65000?

> +
>                 shrink_dcache_parent(root);
>                 shrink_dcache_anon(&sb->s_anon);
>                 dput(root);
>
> diff ./include/linux/fs.h~current~ ./include/linux/fs.h
> --- ./include/linux/fs.h~current~       2006-03-06 16:54:59.000000000 +1100
> +++ ./include/linux/fs.h        2006-03-06 12:49:55.000000000 +1100
> @@ -833,6 +833,8 @@ struct super_block {
>         struct hlist_head       s_anon;         /* anonymous dentries for (nfs) exporting */
>         struct list_head        s_files;
>
> +       int                     s_pending_iputs;
> +
>         struct block_device     *s_bdev;
>         struct list_head        s_instances;
>         struct quota_info       s_dquot;        /* Diskquota specific options */


* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-02  6:57 Neil Brown
  2006-03-02 10:48 ` Jan Blunck
  2006-03-03 11:42 ` Jan Blunck
@ 2006-03-06  6:09 ` Neil Brown
  2006-03-06  7:32   ` Balbir Singh
                     ` (2 more replies)
  2 siblings, 3 replies; 41+ messages in thread
From: Neil Brown @ 2006-03-06  6:09 UTC (permalink / raw)
  To: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck,
	Kirill Korotaev, Al Viro

On Thursday March 2, neilb@suse.de wrote:
> 
> 
> Hi,
>  This mail relates to the thread with the same subject which can be
>  found at
> 
>     http://lkml.org/lkml/2006/1/16/279
> 
>  I would like to propose an alternate patch for the problem.
....
> 
> Comments?  Please :-?

Somewhere in among the comments (thanks), I realised that I was only
closing half the race.  I had tried to make sure there were no stray
references to any dentries, but there is still the inode which is
being iput, which can cause problems.

The following patch takes a totally different approach, is based on an
idea from Jan Kara, and is much less intrusive.

We:
  - keep track of "who" is calling prune_dcache, and when a filesystem
    is being unmounted (s_root == NULL) we only allow the unmount thread
    to prune dentries.
  - keep track of how many dentries are in the process of having
    dentry_iput called on them for pruning
  - don't allow umount to proceed until that count hits zero
  - bias the count this way and that to make sure we get a wake_up at
    the right time
  - reuse 's_wait_unfrozen' to wait on the iput to complete.

Again, I'm very keen on feedback.  This race is very hard to trigger,
so code review is the only real way to evaluate that patch.

Thanks,
NeilBrown


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/dcache.c        |   17 +++++++++++++----
 ./fs/super.c         |   11 +++++++++++
 ./include/linux/fs.h |    2 ++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff ./fs/dcache.c~current~ ./fs/dcache.c
--- ./fs/dcache.c~current~	2006-03-06 16:54:59.000000000 +1100
+++ ./fs/dcache.c	2006-03-06 16:55:33.000000000 +1100
@@ -366,6 +366,7 @@ static inline void prune_one_dentry(stru
 {
 	struct dentry * parent;
 
+	dentry->d_sb->s_pending_iputs ++;
 	__d_drop(dentry);
 	list_del(&dentry->d_u.d_child);
 	dentry_stat.nr_dentry--;	/* For d_free, below */
@@ -375,6 +376,9 @@ static inline void prune_one_dentry(stru
 	if (parent != dentry)
 		dput(parent);
 	spin_lock(&dcache_lock);
+	dentry->d_sb->s_pending_iputs --;
+	if (dentry->d_sb->s_pending_iputs < 0)
+		wake_up(&dentry->d_sb->s_wait_unfrozen);
 }
 
 /**
@@ -390,7 +394,7 @@ static inline void prune_one_dentry(stru
  * all the dentries are in use.
  */
  
-static void prune_dcache(int count)
+static void prune_dcache(int count, struct dentry *parent)
 {
 	spin_lock(&dcache_lock);
 	for (; count ; count--) {
@@ -407,6 +411,11 @@ static void prune_dcache(int count)
  		dentry_stat.nr_unused--;
 		dentry = list_entry(tmp, struct dentry, d_lru);
 
+		if (dentry->d_sb->s_root == NULL &&
+		    (parent == NULL ||
+		     parent->d_sb != dentry->d_sb))
+			continue;
+
  		spin_lock(&dentry->d_lock);
 		/*
 		 * We found an inuse dentry which was not removed from
@@ -635,7 +644,7 @@ void shrink_dcache_parent(struct dentry 
 	int found;
 
 	while ((found = select_parent(parent)) != 0)
-		prune_dcache(found);
+		prune_dcache(found, parent);
 }
 
 /**
@@ -673,7 +682,7 @@ void shrink_dcache_anon(struct hlist_hea
 			}
 		}
 		spin_unlock(&dcache_lock);
-		prune_dcache(found);
+		prune_dcache(found, NULL);
 	} while(found);
 }
 
@@ -694,7 +703,7 @@ static int shrink_dcache_memory(int nr, 
 	if (nr) {
 		if (!(gfp_mask & __GFP_FS))
 			return -1;
-		prune_dcache(nr);
+		prune_dcache(nr, NULL);
 	}
 	return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }

diff ./fs/super.c~current~ ./fs/super.c
--- ./fs/super.c~current~	2006-03-06 16:54:59.000000000 +1100
+++ ./fs/super.c	2006-03-06 16:57:19.000000000 +1100
@@ -230,7 +230,18 @@ void generic_shutdown_super(struct super
 	struct super_operations *sop = sb->s_op;
 
 	if (root) {
+		spin_lock(&dcache_lock);
+		/* disable stray dputs */
 		sb->s_root = NULL;
+
+		/* trigger a wake_up */
+		sb->s_pending_iputs --;
+		spin_unlock(&dcache_lock);
+		wait_event(sb->s_wait_unfrozen,
+			   sb->s_pending_iputs < 0);
+		/* avoid further wakeups */
+		sb->s_pending_iputs = 65000;
+
 		shrink_dcache_parent(root);
 		shrink_dcache_anon(&sb->s_anon);
 		dput(root);

diff ./include/linux/fs.h~current~ ./include/linux/fs.h
--- ./include/linux/fs.h~current~	2006-03-06 16:54:59.000000000 +1100
+++ ./include/linux/fs.h	2006-03-06 12:49:55.000000000 +1100
@@ -833,6 +833,8 @@ struct super_block {
 	struct hlist_head	s_anon;		/* anonymous dentries for (nfs) exporting */
 	struct list_head	s_files;
 
+	int			s_pending_iputs;
+
 	struct block_device	*s_bdev;
 	struct list_head	s_instances;
 	struct quota_info	s_dquot;	/* Diskquota specific options */

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-02  6:57 Neil Brown
  2006-03-02 10:48 ` Jan Blunck
@ 2006-03-03 11:42 ` Jan Blunck
  2006-03-06  6:09 ` Neil Brown
  2 siblings, 0 replies; 41+ messages in thread
From: Jan Blunck @ 2006-03-03 11:42 UTC (permalink / raw)
  To: Neil Brown
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Kirill Korotaev, Al Viro

On Thu, Mar 02, Neil Brown wrote:

>  The core problem is that:
>    prune_one_dentry can hold a reference to a dentry without any
>      lock being held, and without any other reference to the
>      filesystem (if it is being called from shrink_dcache_memory).
>      It holds this reference while calling iput on an inode.  This can
>      take an arbitrarily long time to complete, especially if NFS
>      needs to wait for some RPCs to complete or timeout.
> 
>    shrink_dcache_parent skips over dentries which have a reference,
>      such as the one held by prune_one_dentry.
> 
>    Thus umount can find that an inode is still in use (by its dentry
>    which was skipped) and will complain.  Worse, when the nfs request
>    on some inode finally completes, it might find the superblock
>    doesn't exist any more and... oops.
> 
>   My proposed solution to the problem is never to expose the reference
>   held by prune_one_dentry.  i.e. keep the spin_lock held.

This morning I wondered why I was using a list to drop the dentries' inodes
after the dput() has visited all parents.

It is not enough to fix prune_one_dentry() to hold the lock until the parent
is dereferenced since prune_one_dentry() calls __dput_locked(). And in
__dput_locked() we have the same problem again:

from __dput_locked():

                list_del(&dentry->d_u.d_child);
                dentry_stat.nr_dentry--;        /* For d_free, below */
                /*drops the locks, at that point nobody can reach this dentry*/
->              dentry_iput(dentry);
                parent = dentry->d_parent;
                d_free(dentry);
                if (dentry == parent)
                        return;
                dentry = parent;
+
+               if (atomic_read(&dentry->d_count) == 1)
+                       might_sleep();
+               if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
+                       return;
+
+               spin_lock(&dentry->d_lock);
                goto repeat;
        }
  }

Between -> and the atomic_dec_and_lock() the reference count on the parent is
wrong and no lock is held. I fixed that by using d_lru to keep track of all
dentries whose inodes still have to be dereferenced. This should happen after
all parents have been dereferenced.

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
  2006-03-02  6:57 Neil Brown
@ 2006-03-02 10:48 ` Jan Blunck
  2006-03-03 11:42 ` Jan Blunck
  2006-03-06  6:09 ` Neil Brown
  2 siblings, 0 replies; 41+ messages in thread
From: Jan Blunck @ 2006-03-02 10:48 UTC (permalink / raw)
  To: Neil Brown
  Cc: linux-kernel, Andrew Morton, Olaf Hering, Kirill Korotaev, Al Viro

On Thu, Mar 02, Neil Brown wrote:

>   This requires:
>   - Breaking dentry_iput into 2 pieces, one that happens while the
>     dcache locks are held, and one that happens unlocked.
>   - Also, dput needs a variant which can be called with the spinlocks
>     held.
>   - This also requires a suitable comment in the code.
> 
>     It is possible that the dentry_iput call in dput might need to be
>     split into the locked/unlocked portions as well.  That would
>     require collecting a list of inodes and dentries to be freed once
>     the lock is dropped, which would be ugly.
>     An alternative might be to skip the tail recursion when
>     dput_locked was called as I *think* it is just an optimisation.
> 
> 
>  The following patch addressed the first three points.
> 
> Comments?  Please :-?
> 

This looks very much like a fixed version of my patch from
http://lkml.org/lkml/2006/1/20/303. Therefore, in general I'm fine with it ;)

Comments below!

> +void dput_locked(struct dentry *dentry)
> +{
> +	if (!dentry)
> +		return;
> +	if (!atomic_dec_and_test(&dentry->d_count)) {
> + 		spin_unlock(&dentry->d_lock);
> + 		spin_unlock(&dcache_lock);
> +		return;
> +	}
> +	__dput_locked(dentry);
> +}
> +

A comment like in dentry_iput() would be fine here:
 * Called with dcache_lock and per dentry lock held, drops both.

>  static inline void prune_one_dentry(struct dentry * dentry)
>  {
>  	struct dentry * parent;
> +	struct inode * ino;
>  
>  	__d_drop(dentry);
>  	list_del(&dentry->d_u.d_child);
>  	dentry_stat.nr_dentry--;	/* For d_free, below */
> -	dentry_iput(dentry);
> +	ino = dentry_iput_locked(dentry);
>  	parent = dentry->d_parent;
> -	d_free(dentry);
>  	if (parent != dentry)
> -		dput(parent);
> +		dput_locked(parent);
> +	else {
> +		spin_unlock(&dentry->d_lock);
> +		spin_unlock(&dcache_lock);
> +	}
> +	dentry_iput_unlocked(dentry, ino);
> +	d_free(dentry);
> +
>  	spin_lock(&dcache_lock);
>  }
>  

You missed getting the parent dentry's lock before calling dput_locked().

Regards,
	Jan

-- 
Jan Blunck                                               jblunck@suse.de
SuSE LINUX AG - A Novell company
Maxfeldstr. 5                                          +49-911-74053-608
D-90409 Nürnberg                                      http://www.suse.de

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super
@ 2006-03-02  6:57 Neil Brown
  2006-03-02 10:48 ` Jan Blunck
                   ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Neil Brown @ 2006-03-02  6:57 UTC (permalink / raw)
  To: linux-kernel, Andrew Morton, Olaf Hering, Jan Blunck, Kirill Korotaev
  Cc: Al Viro



Hi,
 This mail relates to the thread with the same subject which can be
 found at

    http://lkml.org/lkml/2006/1/16/279

 I would like to propose an alternate patch for the problem.

 The core problem is that:
   prune_one_dentry can hold a reference to a dentry without any
     lock being held, and without any other reference to the
     filesystem (if it is being called from shrink_dcache_memory).
     It holds this reference while calling iput on an inode.  This can
     take an arbitrarily long time to complete, especially if NFS
     needs to wait for some RPCs to complete or timeout.

   shrink_dcache_parent skips over dentries which have a reference,
     such as the one held by prune_one_dentry.

   Thus umount can find that an inode is still in use (by its dentry
   which was skipped) and will complain.  Worse, when the nfs request
   on some inode finally completes, it might find the superblock
   doesn't exist any more and... oops.

  My proposed solution to the problem is never to expose the reference
  held by prune_one_dentry.  i.e. keep the spin_lock held.

  This requires:
  - Breaking dentry_iput into 2 pieces, one that happens while the
    dcache locks are held, and one that happens unlocked.
  - Also, dput needs a variant which can be called with the spinlocks
    held.
  - This also requires a suitable comment in the code.

    It is possible that the dentry_iput call in dput might need to be
    split into the locked/unlocked portions as well.  That would
    require collecting a list of inodes and dentries to be freed once
    the lock is dropped, which would be ugly.
    An alternative might be to skip the tail recursion when
    dput_locked was called as I *think* it is just an optimisation.


 The following patch addressed the first three points.

Comments?  Please :-?

NeilBrown


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/dcache.c |  105 ++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 81 insertions(+), 24 deletions(-)

diff ./fs/dcache.c~current~ ./fs/dcache.c
--- ./fs/dcache.c~current~	2006-03-02 17:14:24.000000000 +1100
+++ ./fs/dcache.c	2006-03-02 17:55:08.000000000 +1100
@@ -94,24 +94,36 @@ static void d_free(struct dentry *dentry
  * d_iput() operation if defined.
  * Called with dcache_lock and per dentry lock held, drops both.
  */
-static void dentry_iput(struct dentry * dentry)
+static inline struct inode *dentry_iput_locked(struct dentry *dentry)
 {
 	struct inode *inode = dentry->d_inode;
 	if (inode) {
 		dentry->d_inode = NULL;
 		list_del_init(&dentry->d_alias);
-		spin_unlock(&dentry->d_lock);
-		spin_unlock(&dcache_lock);
-		if (!inode->i_nlink)
-			fsnotify_inoderemove(inode);
-		if (dentry->d_op && dentry->d_op->d_iput)
-			dentry->d_op->d_iput(dentry, inode);
-		else
-			iput(inode);
-	} else {
-		spin_unlock(&dentry->d_lock);
-		spin_unlock(&dcache_lock);
 	}
+	return inode;
+}
+
+static inline void dentry_iput_unlocked(struct dentry *dentry,
+					struct inode *inode)
+{
+	if (!inode)
+		return;
+	if (!inode->i_nlink)
+		fsnotify_inoderemove(inode);
+	if (dentry->d_op && dentry->d_op->d_iput)
+		dentry->d_op->d_iput(dentry, inode);
+	else
+		iput(inode);
+}
+
+static void dentry_iput(struct dentry * dentry)
+{
+	struct inode *inode = dentry_iput_locked(dentry);
+
+	spin_unlock(&dentry->d_lock);
+	spin_unlock(&dcache_lock);
+	dentry_iput_unlocked(dentry, inode);
 }
 
 /* 
@@ -143,18 +155,10 @@ static void dentry_iput(struct dentry * 
  * no dcache lock, please.
  */
 
-void dput(struct dentry *dentry)
+static void __dput_locked(struct dentry *dentry)
 {
-	if (!dentry)
-		return;
 
 repeat:
-	if (atomic_read(&dentry->d_count) == 1)
-		might_sleep();
-	if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
-		return;
-
-	spin_lock(&dentry->d_lock);
 	if (atomic_read(&dentry->d_count)) {
 		spin_unlock(&dentry->d_lock);
 		spin_unlock(&dcache_lock);
@@ -202,10 +206,43 @@ kill_it: {
 		if (dentry == parent)
 			return;
 		dentry = parent;
+
+		if (atomic_read(&dentry->d_count) == 1)
+			might_sleep();
+		if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
+			return;
+
+		spin_lock(&dentry->d_lock);
 		goto repeat;
 	}
 }
 
+void dput(struct dentry *dentry)
+{
+	if (!dentry)
+		return;
+	if (atomic_read(&dentry->d_count) == 1)
+		might_sleep();
+	if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
+		return;
+
+	spin_lock(&dentry->d_lock);
+
+	__dput_locked(dentry);
+}
+
+void dput_locked(struct dentry *dentry)
+{
+	if (!dentry)
+		return;
+	if (!atomic_dec_and_test(&dentry->d_count)) {
+ 		spin_unlock(&dentry->d_lock);
+ 		spin_unlock(&dcache_lock);
+		return;
+	}
+	__dput_locked(dentry);
+}
+
 /**
  * d_invalidate - invalidate a dentry
  * @dentry: dentry to invalidate
@@ -361,19 +398,39 @@ restart:
  * This requires that the LRU list has already been
  * removed.
  * Called with dcache_lock, drops it and then regains.
+ *
+ * There was a risk of this function, called from shrink_dcache_memory,
+ * racing with shrink_dcache_parent called from generic_shutdown_super.
+ * This function was holding a reference to the parent after the child
+ * had been removed, and this wasn't protected by any spinlock.
+ * shrink_dcache_parent would think the dentry was in use, and so it would
+ * not get discarded.  This would result in a very unclean unmount.
+ * So we need to keep the spin_lock for as long as we hold a reference to
+ * a dentry.  This (hopefully) explains the two-stage
+ * dentry_iput, and the need for dput_locked.
+ * Note: the race was easiest to hit if iput was very slow, as
+ * it could be when tearing down a large address space, or waiting
+ * for pending network requests to return/timeout.
  */
 static inline void prune_one_dentry(struct dentry * dentry)
 {
 	struct dentry * parent;
+	struct inode * ino;
 
 	__d_drop(dentry);
 	list_del(&dentry->d_u.d_child);
 	dentry_stat.nr_dentry--;	/* For d_free, below */
-	dentry_iput(dentry);
+	ino = dentry_iput_locked(dentry);
 	parent = dentry->d_parent;
-	d_free(dentry);
 	if (parent != dentry)
-		dput(parent);
+		dput_locked(parent);
+	else {
+		spin_unlock(&dentry->d_lock);
+		spin_unlock(&dcache_lock);
+	}
+	dentry_iput_unlocked(dentry, ino);
+	d_free(dentry);
+
 	spin_lock(&dcache_lock);
 }
 

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2006-03-09 12:00 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-16 22:34 [PATCH] Busy inodes after unmount, be more verbose in generic_shutdown_super Olaf Hering
2006-01-16 23:23 ` Kirill Korotaev
2006-01-16 23:29   ` Olaf Hering
2006-01-17  2:05     ` Andrew Morton
2006-01-17  7:03       ` Kirill Korotaev
2006-01-18 22:49   ` Jan Blunck
2006-01-18 23:10     ` Andrew Morton
2006-01-19 10:08       ` Kirill Korotaev
2006-01-19  9:52     ` Kirill Korotaev
2006-01-19 10:04       ` Jan Blunck
2006-01-19 10:26         ` Kirill Korotaev
2006-01-20 19:06           ` Jan Blunck
2006-01-23  8:14             ` Kirill Korotaev
2006-01-30 11:54               ` Jan Blunck
2006-01-30 14:05                 ` Kirill Korotaev
2006-01-30 14:21                   ` Jan Blunck
2006-01-30 14:34                     ` Kirill Korotaev
2006-03-02  6:57 Neil Brown
2006-03-02 10:48 ` Jan Blunck
2006-03-03 11:42 ` Jan Blunck
2006-03-06  6:09 ` Neil Brown
2006-03-06  7:32   ` Balbir Singh
2006-03-07  1:58     ` Neil Brown
2006-03-07  2:49       ` Balbir Singh
2006-03-07  6:22         ` Kirill Korotaev
2006-03-07  6:16       ` Kirill Korotaev
2006-03-07  7:03         ` Balbir Singh
2006-03-07  7:21           ` Kirill Korotaev
2006-03-07 11:05             ` Balbir Singh
2006-03-08  0:29         ` Neil Brown
2006-03-08  2:17           ` Balbir Singh
2006-03-08  2:39             ` Neil Brown
2006-03-08  3:05               ` Balbir Singh
2006-03-08 11:01                 ` Jan Blunck
2006-03-06 11:56   ` Jan Blunck
2006-03-07  2:15     ` Neil Brown
2006-03-06 11:56   ` Kirill Korotaev
2006-03-07  2:01     ` Neil Brown
2006-03-07  6:20       ` Kirill Korotaev
2006-03-07 23:20         ` Neil Brown
2006-03-09 12:03           ` Kirill Korotaev
