All of lore.kernel.org
 help / color / mirror / Atom feed
* (unknown)
@ 2009-01-06 10:02 Eric Sesterhenn
  2009-01-06 17:29 ` [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Sesterhenn @ 2009-01-06 10:02 UTC (permalink / raw)
  To: linux-nfs; +Cc: bfields, hch, neilb

hi,

with todays -git i see the following lockdep warning, that wasnt there
yesterday. It occurs after booting the box.

[   83.741022] Installing knfsd (copyright (C) 1996 okir-pn4DOG8n3UYbFoVRYvo4fw@public.gmane.org).
[   84.115838] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
recovery directory
[   84.133473] NFSD: starting 90-second grace period
[  174.132715] 
[  174.132724] =============================================
[  174.133004] [ INFO: possible recursive locking detected ]
[  174.133133] 2.6.28 #85
[  174.133232] ---------------------------------------------
[  174.133428] nfsd4/3693 is trying to acquire lock:
[  174.133548]  (&type->i_mutex_dir_key#2){--..}, at: [<c01c4212>]
vfs_fsync+0x62/0xe0
[  174.133991] 
[  174.133994] but task is already holding lock:
[  174.134251]  (&type->i_mutex_dir_key#2){--..}, at: [<d1c224f7>]
nfsd4_sync_rec_dir+0x17/0x40 [nfsd]
[  174.134774] 
[  174.134777] other info that might help us debug this:
[  174.134983] 4 locks held by nfsd4/3693:
[  174.135151]  #0:  (nfsd4){--..}, at: [<c0139d4a>]
run_workqueue+0x7a/0x1e0
[  174.135539]  #1:  ((laundromat_work).work){--..}, at: [<c0139d4a>]
run_workqueue+0x7a/0x1e0
[  174.135968]  #2:  (client_mutex){--..}, at: [<d1c1e9f6>]
laundromat_main+0x26/0x2a0 [nfsd]
[  174.136021]  #3:  (&type->i_mutex_dir_key#2){--..}, at: [<d1c224f7>]
nfsd4_sync_rec_dir+0x17/0x40 [nfsd]
[  174.136021] 
[  174.136021] stack backtrace:
[  174.136021] Pid: 3693, comm: nfsd4 Not tainted 2.6.28 #85
[  174.136021] Call Trace:
[  174.136021]  [<c05abdb6>] ? printk+0x18/0x1a
[  174.136021]  [<c014f626>] __lock_acquire+0xe76/0x1110
[  174.136021]  [<c014f934>] lock_acquire+0x74/0xa0
[  174.136021]  [<c01c4212>] ? vfs_fsync+0x62/0xe0
[  174.136021]  [<c05acfac>] mutex_lock_nested+0x8c/0x2c0
[  174.136021]  [<c01c4212>] ? vfs_fsync+0x62/0xe0
[  174.136021]  [<c01c4212>] ? vfs_fsync+0x62/0xe0
[  174.136021]  [<c01c4212>] vfs_fsync+0x62/0xe0
[  174.136021]  [<d1c074be>] nfsd_sync_dir+0xe/0x10 [nfsd]
[  174.136021]  [<d1c22501>] nfsd4_sync_rec_dir+0x21/0x40 [nfsd]
[  174.136021]  [<d1c22595>] nfsd4_recdir_purge_old+0x75/0x80 [nfsd]
[  174.136021]  [<d1c1ea1e>] laundromat_main+0x4e/0x2a0 [nfsd]
[  174.136021]  [<c0139d4a>] ? run_workqueue+0x7a/0x1e0
[  174.136021]  [<c0139dac>] run_workqueue+0xdc/0x1e0
[  174.136021]  [<c0139d4a>] ? run_workqueue+0x7a/0x1e0
[  174.136021]  [<d1c1e9d0>] ? laundromat_main+0x0/0x2a0 [nfsd]
[  174.136021]  [<c013a1a7>] worker_thread+0x87/0xf0
[  174.136021]  [<c013dac0>] ? autoremove_wake_function+0x0/0x50
[  174.136021]  [<c013a120>] ? worker_thread+0x0/0xf0
[  174.136021]  [<c013d80a>] kthread+0x3a/0x70
[  174.136021]  [<c013d7d0>] ? kthread+0x0/0x70
[  174.136021]  [<c0103cf3>] kernel_thread_helper+0x7/0x14

Greetings, Eric

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory
  2009-01-06 10:02 (unknown) Eric Sesterhenn
@ 2009-01-06 17:29 ` J. Bruce Fields
  2009-01-06 17:58   ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2009-01-06 17:29 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: linux-nfs, hch, neilb

From: J. Bruce Fields <bfields@citi.umich.edu>

This mutex should be unnecessary, and as of 4c728ef583b3d "add a
vfs_fsync helper" it triggers a lockdep warning.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
---
 fs/nfsd/nfs4recover.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

On Tue, Jan 06, 2009 at 11:02:47AM +0100, Eric Sesterhenn wrote:
> hi,
> 
> with todays -git i see the following lockdep warning, that wasnt there
> yesterday. It occurs after booting the box.

Thanks.  The much-hated recovery code strikes again.  I think this
should do it?

--b.

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 74f7b67..9fb5a10 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -127,9 +127,7 @@ out_no_tfm:
 static void
 nfsd4_sync_rec_dir(void)
 {
-	mutex_lock(&rec_dir.dentry->d_inode->i_mutex);
 	nfsd_sync_dir(rec_dir.dentry);
-	mutex_unlock(&rec_dir.dentry->d_inode->i_mutex);
 }
 
 int
-- 
1.5.5.rc1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory
  2009-01-06 17:29 ` [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory J. Bruce Fields
@ 2009-01-06 17:58   ` J. Bruce Fields
  2009-01-06 19:23     ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2009-01-06 17:58 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: linux-nfs, hch, neilb

On Tue, Jan 06, 2009 at 12:29:14PM -0500, bfields wrote:
> From: J. Bruce Fields <bfields@citi.umich.edu>
> 
> This mutex should be unnecessary, and as of 4c728ef583b3d "add a
> vfs_fsync helper" it triggers a lockdep warning.
> 
> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
> ---
>  fs/nfsd/nfs4recover.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> On Tue, Jan 06, 2009 at 11:02:47AM +0100, Eric Sesterhenn wrote:
> > hi,
> > 
> > with todays -git i see the following lockdep warning, that wasnt there
> > yesterday. It occurs after booting the box.
> 
> Thanks.  The much-hated recovery code strikes again.  I think this
> should do it?

No, then we just run into a deadlocks in unlink, create, or any of the
other nfsd operations that want the parent lock to cover more than just
the sync.  So 4c728ef583b3d just doesn't work for nfsd.

--b.

> 
> --b.
> 
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 74f7b67..9fb5a10 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -127,9 +127,7 @@ out_no_tfm:
>  static void
>  nfsd4_sync_rec_dir(void)
>  {
> -	mutex_lock(&rec_dir.dentry->d_inode->i_mutex);
>  	nfsd_sync_dir(rec_dir.dentry);
> -	mutex_unlock(&rec_dir.dentry->d_inode->i_mutex);
>  }
>  
>  int
> -- 
> 1.5.5.rc1
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory
  2009-01-06 17:58   ` J. Bruce Fields
@ 2009-01-06 19:23     ` J. Bruce Fields
  2009-01-06 19:32       ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2009-01-06 19:23 UTC (permalink / raw)
  To: Eric Sesterhenn; +Cc: linux-nfs, hch, neilb

On Tue, Jan 06, 2009 at 12:58:29PM -0500, bfields wrote:
> No, then we just run into a deadlocks in unlink, create, or any of the
> other nfsd operations that want the parent lock to cover more than just
> the sync.  So 4c728ef583b3d just doesn't work for nfsd.

We could add another nfsd exception to vfs_sync() by taking the i_mutex
only in the "file != NULL" case.  Perhaps there'd be some advantage to
having nfsd's peculiarity noted in the common code; I don't have
terifically strong feelings either way.

However I'm inclined to think that at that point the special cases get
out of hand and that it would be better to keep this back in the nfsd
code itself.  The following (tested this time) seems to work.

--b.

commit 33e3950dc2eae7484e79685083c304d93013e3ec
Author: J. Bruce Fields <bfields@citi.umich.edu>
Date:   Tue Jan 6 13:37:03 2009 -0500

    nfsd: fix double-locks of directory mutex
    
    A number of nfsd operations depend on the i_mutex to cover more code
    than just the fsync, so the approach of 4c728ef583b3d8 "add a vfs_fsync
    helper" doesn't work for nfsd.  Revert the parts of those patches that
    touch nfsd, and remove the logic from vfs_nfsd that was needed only for
    the special case of nfsd.
    
    Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
    Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 44aa92a..6e50aaa 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -744,16 +744,44 @@ nfsd_close(struct file *filp)
 	fput(filp);
 }
 
+/*
+ * Sync a file
+ * As this calls fsync (not fdatasync) there is no need for a write_inode
+ * after it.
+ */
+static inline int nfsd_dosync(struct file *filp, struct dentry *dp,
+			      const struct file_operations *fop)
+{
+	struct inode *inode = dp->d_inode;
+	int (*fsync) (struct file *, struct dentry *, int);
+	int err;
+
+	err = filemap_fdatawrite(inode->i_mapping);
+	if (err == 0 && fop && (fsync = fop->fsync))
+		err = fsync(filp, dp, 0);
+	if (err == 0)
+		err = filemap_fdatawait(inode->i_mapping);
+
+	return err;
+}
+
 static int
 nfsd_sync(struct file *filp)
 {
-	return vfs_fsync(filp, filp->f_path.dentry, 0);
+        int err;
+	struct inode *inode = filp->f_path.dentry->d_inode;
+	dprintk("nfsd: sync file %s\n", filp->f_path.dentry->d_name.name);
+	mutex_lock(&inode->i_mutex);
+	err=nfsd_dosync(filp, filp->f_path.dentry, filp->f_op);
+	mutex_unlock(&inode->i_mutex);
+
+	return err;
 }
 
 int
-nfsd_sync_dir(struct dentry *dentry)
+nfsd_sync_dir(struct dentry *dp)
 {
-	return vfs_fsync(NULL, dentry, 0);
+	return nfsd_dosync(NULL, dp, dp->d_inode->i_fop);
 }
 
 /*
diff --git a/fs/sync.c b/fs/sync.c
index 0921d6d..8e0a656 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -83,10 +83,6 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync)
  *
  * Write back data and metadata for @file to disk.  If @datasync is
  * set only metadata needed to access modified file data is written.
- *
- * In case this function is called from nfsd @file may be %NULL and
- * only @dentry is set.  This can only happen when the filesystem
- * implements the export_operations API.
  */
 int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
 {
@@ -94,18 +90,8 @@ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync)
 	struct address_space *mapping;
 	int err, ret;
 
-	/*
-	 * Get mapping and operations from the file in case we have
-	 * as file, or get the default values for them in case we
-	 * don't have a struct file available.  Damn nfsd..
-	 */
-	if (file) {
-		mapping = file->f_mapping;
-		fop = file->f_op;
-	} else {
-		mapping = dentry->d_inode->i_mapping;
-		fop = dentry->d_inode->i_fop;
-	}
+	mapping = file->f_mapping;
+	fop = file->f_op;
 
 	if (!fop || !fop->fsync) {
 		ret = -EINVAL;

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory
  2009-01-06 19:23     ` J. Bruce Fields
@ 2009-01-06 19:32       ` Christoph Hellwig
  2009-01-06 20:07         ` J. Bruce Fields
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2009-01-06 19:32 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Eric Sesterhenn, linux-nfs, hch, neilb

On Tue, Jan 06, 2009 at 02:23:01PM -0500, J. Bruce Fields wrote:
> On Tue, Jan 06, 2009 at 12:58:29PM -0500, bfields wrote:
> > No, then we just run into a deadlocks in unlink, create, or any of the
> > other nfsd operations that want the parent lock to cover more than just
> > the sync.  So 4c728ef583b3d just doesn't work for nfsd.
> 
> We could add another nfsd exception to vfs_sync() by taking the i_mutex
> only in the "file != NULL" case.  Perhaps there'd be some advantage to
> having nfsd's peculiarity noted in the common code; I don't have
> terifically strong feelings either way.
> 
> However I'm inclined to think that at that point the special cases get
> out of hand and that it would be better to keep this back in the nfsd
> code itself.  The following (tested this time) seems to work.

Sorry, this is not going to do it.  i_mutex is going to move into
->fsync soon unless we kill it entirely pretty soon.  And for the
cases of stckable filesystems what you do (aswell as the <= 2.6.28 case)
is broken, as the NFS locking scheme can't apply to the lower stacked
filesystem.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory
  2009-01-06 19:32       ` Christoph Hellwig
@ 2009-01-06 20:07         ` J. Bruce Fields
  0 siblings, 0 replies; 6+ messages in thread
From: J. Bruce Fields @ 2009-01-06 20:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Eric Sesterhenn, linux-nfs, hch, neilb

On Tue, Jan 06, 2009 at 02:32:56PM -0500, Christoph Hellwig wrote:
> On Tue, Jan 06, 2009 at 02:23:01PM -0500, J. Bruce Fields wrote:
> > On Tue, Jan 06, 2009 at 12:58:29PM -0500, bfields wrote:
> > > No, then we just run into a deadlocks in unlink, create, or any of the
> > > other nfsd operations that want the parent lock to cover more than just
> > > the sync.  So 4c728ef583b3d just doesn't work for nfsd.
> > 
> > We could add another nfsd exception to vfs_sync() by taking the i_mutex
> > only in the "file != NULL" case.  Perhaps there'd be some advantage to
> > having nfsd's peculiarity noted in the common code; I don't have
> > terifically strong feelings either way.
> > 
> > However I'm inclined to think that at that point the special cases get
> > out of hand and that it would be better to keep this back in the nfsd
> > code itself.  The following (tested this time) seems to work.
> 
> Sorry, this is not going to do it.  i_mutex is going to move into
> ->fsync soon unless we kill it entirely pretty soon.  And for the
> cases of stckable filesystems what you do (aswell as the <= 2.6.28 case)
> is broken, as the NFS locking scheme can't apply to the lower stacked
> filesystem.

We need an immediate fix for the regression--nfsd (all versions) just
doesn't work at all in mainline right now.  My apologies for not having
seen that earlier.

A more complete solution is going to take more than a day or two.

--b.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2009-01-06 20:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-01-06 10:02 (unknown) Eric Sesterhenn
2009-01-06 17:29 ` [PATCH] nfsd: fix double-lock of i_mutex on nfsv4 recovery directory J. Bruce Fields
2009-01-06 17:58   ` J. Bruce Fields
2009-01-06 19:23     ` J. Bruce Fields
2009-01-06 19:32       ` Christoph Hellwig
2009-01-06 20:07         ` J. Bruce Fields

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.