linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Some reiserfs fixes
@ 2009-12-30  5:21 Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 1/4] reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion Frederic Weisbecker
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30  5:21 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Christian Kujau, Yinghai Lu,
	Alexander Beregalov, Chris Mason, Ingo Molnar

Hi,

These patches should fix most of the lock inversions you've reported,
plus some others I've encountered, except may be the one reported by
Yinghai Lu that I'm still trying to reproduce.

I just hope these fixes won't enlight more lock inversions.

Anyway, please give it a try, your reports have been really helpful.

You can merge my branch into latest upstream tree to test it:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	reiserfs/kill-bkl

Thanks!

Frederic Weisbecker (4):
  reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
  reiserfs: Warn on lock relax if taken recursively
  reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
  reiserfs: Relax reiserfs lock while freeing the journal

 fs/reiserfs/journal.c       |    3 ++-
 fs/reiserfs/lock.c          |    9 +++++++++
 fs/reiserfs/xattr.c         |   11 ++++++++---
 include/linux/reiserfs_fs.h |   13 ++++++++++---
 4 files changed, 29 insertions(+), 7 deletions(-)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
@ 2009-12-30  5:21 ` Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 2/4] reiserfs: Warn on lock relax if taken recursively Frederic Weisbecker
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30  5:21 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Christian Kujau, Yinghai Lu,
	Alexander Beregalov, Chris Mason, Ingo Molnar

i_xattr_sem depends on the reiserfs lock. But after we grab
i_xattr_sem, we may relax/relock the reiserfs lock while waiting
on a freezed filesystem, creating a dependency inversion between
the two locks.

In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's
create a reiserfs_down_read_safe() that acts like
reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing
another lock to avoid undesired dependencies induced by the
heivyweight reiserfs lock.

This fixes the following warning:

[  990.005931] =======================================================
[  990.012373] [ INFO: possible circular locking dependency detected ]
[  990.013233] 2.6.33-rc1 #1
[  990.013233] -------------------------------------------------------
[  990.013233] dbench/1891 is trying to acquire lock:
[  990.013233]  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[  990.013233]
[  990.013233] but task is already holding lock:
[  990.013233]  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]
[  990.013233] which lock already depends on the new lock.
[  990.013233]
[  990.013233]
[  990.013233] the existing dependency chain (in reverse order) is:
[  990.013233]
[  990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}:
[  990.013233]        [<ffffffff81063afc>] __lock_acquire+0xf9c/0x1560
[  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[  990.013233]        [<ffffffff814ac194>] down_write+0x44/0x80
[  990.013233]        [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
[  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
[  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
[  990.013233]
[  990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
[  990.013233]        [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
[  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[  990.013233]        [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
[  990.013233]        [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
[  990.013233]        [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[  990.013233]        [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
[  990.013233]        [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
[  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
[  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
[  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
[  990.013233]
[  990.013233] other info that might help us debug this:
[  990.013233]
[  990.013233] 2 locks held by dbench/1891:
[  990.013233]  #0:  (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff810e2678>] vfs_setxattr+0x78/0xc0
[  990.013233]  #1:  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]
[  990.013233] stack backtrace:
[  990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1
[  990.013233] Call Trace:
[  990.013233]  [<ffffffff81061639>] print_circular_bug+0xe9/0xf0
[  990.013233]  [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
[  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]  [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]  [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
[  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff81062592>] ? mark_held_locks+0x72/0xa0
[  990.013233]  [<ffffffff814ab81d>] ? __mutex_unlock_slowpath+0xbd/0x140
[  990.013233]  [<ffffffff810628ad>] ? trace_hardirqs_on_caller+0x14d/0x1a0
[  990.013233]  [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
[  990.013233]  [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
[  990.013233]  [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
[  990.013233]  [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[  990.013233]  [<ffffffff814abcb4>] ? __mutex_lock_common+0x284/0x3b0
[  990.013233]  [<ffffffff8115a6aa>] user_set+0x8a/0x90
[  990.013233]  [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[  990.013233]  [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[  990.013233]  [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[  990.013233]  [<ffffffff810e2780>] setxattr+0xc0/0x150
[  990.013233]  [<ffffffff81056018>] ? sched_clock_cpu+0xb8/0x100
[  990.013233]  [<ffffffff8105eded>] ? trace_hardirqs_off+0xd/0x10
[  990.013233]  [<ffffffff810560a3>] ? cpu_clock+0x43/0x50
[  990.013233]  [<ffffffff810c6820>] ? fget+0xb0/0x110
[  990.013233]  [<ffffffff810c6770>] ? fget+0x0/0x110
[  990.013233]  [<ffffffff81002ddc>] ? sysret_check+0x27/0x62
[  990.013233]  [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[  990.013233]  [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b

Reported-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
 fs/reiserfs/xattr.c         |    2 +-
 include/linux/reiserfs_fs.h |    8 ++++++++
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index 8891cd8..a0e2e7a 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -484,7 +484,7 @@ reiserfs_xattr_set_handle(struct reiserfs_transaction_handle *th,
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	down_write(&REISERFS_I(inode)->i_xattr_sem);
+	reiserfs_down_read_safe(&REISERFS_I(inode)->i_xattr_sem, inode->i_sb);
 
 	xahash = xattr_hash(buffer, buffer_size);
 	while (buffer_pos < buffer_size || buffer_pos == 0) {
diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h
index 4351b49..35d3f45 100644
--- a/include/linux/reiserfs_fs.h
+++ b/include/linux/reiserfs_fs.h
@@ -106,6 +106,14 @@ reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass,
 	reiserfs_write_lock(s);
 }
 
+static inline void
+reiserfs_down_read_safe(struct rw_semaphore *sem, struct super_block *s)
+{
+	reiserfs_write_unlock(s);
+	down_read(sem);
+	reiserfs_write_lock(s);
+}
+
 /*
  * When we schedule, we usually want to also release the write lock,
  * according to the previous bkl based locking scheme of reiserfs.
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] reiserfs: Warn on lock relax if taken recursively
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 1/4] reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion Frederic Weisbecker
@ 2009-12-30  5:21 ` Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 3/4] reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr Frederic Weisbecker
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30  5:21 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Christian Kujau, Yinghai Lu,
	Alexander Beregalov, Chris Mason, Ingo Molnar

When we relax the reiserfs lock to avoid creating unwanted
dependencies against others locks while grabbing these,
we want to ensure it has not been taken recursively, otherwise
the lock won't be really relaxed. Only its depth will be decreased.
The unwanted dependency would then actually happen.

To prevent from that, add a reiserfs_lock_check_recursive() call
in the places that need it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
 fs/reiserfs/lock.c          |    9 +++++++++
 include/linux/reiserfs_fs.h |    9 +++++++++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/fs/reiserfs/lock.c b/fs/reiserfs/lock.c
index ee2cfc0..b87aa2c 100644
--- a/fs/reiserfs/lock.c
+++ b/fs/reiserfs/lock.c
@@ -86,3 +86,12 @@ void reiserfs_check_lock_depth(struct super_block *sb, char *caller)
 		reiserfs_panic(sb, "%s called without kernel lock held %d",
 			       caller);
 }
+
+#ifdef CONFIG_REISERFS_CHECK
+void reiserfs_lock_check_recursive(struct super_block *sb)
+{
+	struct reiserfs_sb_info *sb_i = REISERFS_SB(sb);
+
+	WARN_ONCE((sb_i->lock_depth > 0), "Unwanted recursive reiserfs lock!\n");
+}
+#endif
diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h
index 35d3f45..793bf83 100644
--- a/include/linux/reiserfs_fs.h
+++ b/include/linux/reiserfs_fs.h
@@ -62,6 +62,12 @@ void reiserfs_write_unlock(struct super_block *s);
 int reiserfs_write_lock_once(struct super_block *s);
 void reiserfs_write_unlock_once(struct super_block *s, int lock_depth);
 
+#ifdef CONFIG_REISERFS_CHECK
+void reiserfs_lock_check_recursive(struct super_block *s);
+#else
+static inline void reiserfs_lock_check_recursive(struct super_block *s) { }
+#endif
+
 /*
  * Several mutexes depend on the write lock.
  * However sometimes we want to relax the write lock while we hold
@@ -92,6 +98,7 @@ void reiserfs_write_unlock_once(struct super_block *s, int lock_depth);
 static inline void reiserfs_mutex_lock_safe(struct mutex *m,
 			       struct super_block *s)
 {
+	reiserfs_lock_check_recursive(s);
 	reiserfs_write_unlock(s);
 	mutex_lock(m);
 	reiserfs_write_lock(s);
@@ -101,6 +108,7 @@ static inline void
 reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass,
 			       struct super_block *s)
 {
+	reiserfs_lock_check_recursive(s);
 	reiserfs_write_unlock(s);
 	mutex_lock_nested(m, subclass);
 	reiserfs_write_lock(s);
@@ -109,6 +117,7 @@ reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass,
 static inline void
 reiserfs_down_read_safe(struct rw_semaphore *sem, struct super_block *s)
 {
+	reiserfs_lock_check_recursive(s);
 	reiserfs_write_unlock(s);
 	down_read(sem);
 	reiserfs_write_lock(s);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 1/4] reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 2/4] reiserfs: Warn on lock relax if taken recursively Frederic Weisbecker
@ 2009-12-30  5:21 ` Frederic Weisbecker
  2009-12-30 20:27   ` [PATCH 3/4 v2] " Frederic Weisbecker
  2009-12-30  5:21 ` [PATCH 4/4] reiserfs: Relax reiserfs lock while freeing the journal Frederic Weisbecker
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30  5:21 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Christian Kujau, Yinghai Lu,
	Alexander Beregalov, Chris Mason, Ingo Molnar

While deleting the xattrs of an inode, we hold the reiserfs lock
and grab the inode->i_mutex of the targeted inode and the root
private xattr directory.

Later on, we may relax the reiserfs lock for various reasons, this
creates inverted dependencies.

We can remove the reiserfs lock -> i_mutex dependency by relaxing
the former before calling open_xa_dir(). This is fine because the
lookup and creation of xattr private directories done in
open_xa_dir() are covered by the targeted inode mutexes. And deeper
operations in the tree are still done under the write lock.

This fixes the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c1141d83>] open_xa_dir+0x43/0x1b0
       [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
       [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0a00>] sys_unlink+0x10/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by cp/3204:
 #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
 #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
 [<c13ff993>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105d3aa>] ? check_usage+0x6a/0x460
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c1401a2b>] mutex_lock_nested+0x5b/0x340
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
 [<c1117012>] reiserfs_lookup+0x62/0x140
 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10bd85f>] __lookup_hash+0xef/0x110
 [<c10bf21d>] lookup_one_len+0x8d/0xc0
 [<c1141e2a>] open_xa_dir+0xea/0x1b0
 [<c1141fe5>] xattr_lookup+0x15/0x160
 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
 [<c1144042>] reiserfs_get_acl+0xa2/0x360
 [<c10ca2e7>] ? new_inode+0x27/0xa0
 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
 [<c1402eb7>] ? _spin_unlock+0x27/0x40
 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
 [<c10c7cb8>] ? __d_lookup+0x108/0x190
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
 [<c10bd17a>] ? generic_permission+0x1a/0xa0
 [<c11788fe>] ? security_inode_permission+0x1e/0x20
 [<c10bea96>] vfs_mkdir+0xd6/0x180
 [<c10c0c10>] sys_mkdirat+0xc0/0xd0
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c10c0c40>] sys_mkdir+0x20/0x30
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
 fs/reiserfs/xattr.c         |    9 +++++++--
 include/linux/reiserfs_fs.h |   10 ----------
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index a0e2e7a..c320c77 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -234,17 +234,22 @@ static int reiserfs_for_each_xattr(struct inode *inode,
 	if (IS_PRIVATE(inode) || get_inode_sd_version(inode) == STAT_DATA_V1)
 		return 0;
 
+	reiserfs_write_unlock(inode->i_sb);
 	dir = open_xa_dir(inode, XATTR_REPLACE);
 	if (IS_ERR(dir)) {
 		err = PTR_ERR(dir);
+		reiserfs_write_lock(inode->i_sb);
 		goto out;
 	} else if (!dir->d_inode) {
 		err = 0;
+		reiserfs_write_lock(inode->i_sb);
 		goto out_dir;
 	}
 
-	reiserfs_mutex_lock_nested_safe(&dir->d_inode->i_mutex, I_MUTEX_XATTR,
-					inode->i_sb);
+	mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_XATTR);
+
+	reiserfs_write_lock(inode->i_sb);
+
 	buf.xadir = dir;
 	err = reiserfs_readdir_dentry(dir, &buf, fill_with_dentries, &pos);
 	while ((err == 0 || err == -ENOSPC) && buf.count) {
diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h
index 793bf83..99db5ce 100644
--- a/include/linux/reiserfs_fs.h
+++ b/include/linux/reiserfs_fs.h
@@ -105,16 +105,6 @@ static inline void reiserfs_mutex_lock_safe(struct mutex *m,
 }
 
 static inline void
-reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass,
-			       struct super_block *s)
-{
-	reiserfs_lock_check_recursive(s);
-	reiserfs_write_unlock(s);
-	mutex_lock_nested(m, subclass);
-	reiserfs_write_lock(s);
-}
-
-static inline void
 reiserfs_down_read_safe(struct rw_semaphore *sem, struct super_block *s)
 {
 	reiserfs_lock_check_recursive(s);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] reiserfs: Relax reiserfs lock while freeing the journal
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2009-12-30  5:21 ` [PATCH 3/4] reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr Frederic Weisbecker
@ 2009-12-30  5:21 ` Frederic Weisbecker
  2009-12-30  6:53 ` [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30  5:21 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Christian Kujau, Yinghai Lu,
	Alexander Beregalov, Chris Mason, Ingo Molnar

Keeping the reiserfs lock while freeing the journal on
umount path triggers a lock inversion between bdev->bd_mutex
and the reiserfs lock.

We don't need the reiserfs lock at this stage. The filesystem
is not usable anymore, and there are no more pending commits,
everything got flushed (even this operation was done in parallel
and didn't required the reiserfs lock from the current process).

This fixes the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #172
-------------------------------------------------------
umount/3904 is trying to acquire lock:
 (&bdev->bd_mutex){+.+.+.}, at: [<c10de2c2>] __blkdev_put+0x22/0x160

but task is already holding lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c1143229>] reiserfs_write_lock_once+0x29/0x50
       [<c111c485>] reiserfs_get_block+0x85/0x1620
       [<c10e1040>] do_mpage_readpage+0x1f0/0x6d0
       [<c10e1640>] mpage_readpages+0xc0/0x100
       [<c1119b89>] reiserfs_readpages+0x19/0x20
       [<c108f1ec>] __do_page_cache_readahead+0x1bc/0x260
       [<c108f2b8>] ra_submit+0x28/0x40
       [<c1087e3e>] filemap_fault+0x40e/0x420
       [<c109b5fd>] __do_fault+0x3d/0x430
       [<c109d47e>] handle_mm_fault+0x12e/0x790
       [<c1022a65>] do_page_fault+0x135/0x330
       [<c1403663>] error_code+0x6b/0x70
       [<c10ef9ca>] load_elf_binary+0x82a/0x1a10
       [<c10ba130>] search_binary_handler+0x90/0x1d0
       [<c10bb70f>] do_execve+0x1df/0x250
       [<c1001746>] sys_execve+0x46/0x70
       [<c1002fa5>] syscall_call+0x7/0xb

-> #2 (&mm->mmap_sem){++++++}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c109b1ab>] might_fault+0x8b/0xb0
       [<c11b8f52>] copy_to_user+0x32/0x70
       [<c10c3b94>] filldir64+0xa4/0xf0
       [<c1109116>] sysfs_readdir+0x116/0x210
       [<c10c3e1d>] vfs_readdir+0x8d/0xb0
       [<c10c3ea9>] sys_getdents64+0x69/0xb0
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #1 (sysfs_mutex){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c110951c>] sysfs_addrm_start+0x2c/0xb0
       [<c1109aa0>] create_dir+0x40/0x90
       [<c1109b1b>] sysfs_create_dir+0x2b/0x50
       [<c11b2352>] kobject_add_internal+0xc2/0x1b0
       [<c11b2531>] kobject_add_varg+0x31/0x50
       [<c11b25ac>] kobject_add+0x2c/0x60
       [<c1258294>] device_add+0x94/0x560
       [<c11036ea>] add_partition+0x18a/0x2a0
       [<c110418a>] rescan_partitions+0x33a/0x450
       [<c10de5bf>] __blkdev_get+0x12f/0x2d0
       [<c10de76a>] blkdev_get+0xa/0x10
       [<c11034b8>] register_disk+0x108/0x130
       [<c11a87a9>] add_disk+0xd9/0x130
       [<c12998e5>] sd_probe_async+0x105/0x1d0
       [<c10528af>] async_thread+0xcf/0x230
       [<c104bfd4>] kthread+0x74/0x80
       [<c1003aab>] kernel_thread_helper+0x7/0x3c

-> #0 (&bdev->bd_mutex){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c10de2c2>] __blkdev_put+0x22/0x160
       [<c10de40a>] blkdev_put+0xa/0x10
       [<c113ce22>] free_journal_ram+0xd2/0x130
       [<c113ea18>] do_journal_release+0x98/0x190
       [<c113eb2a>] journal_release+0xa/0x10
       [<c1128eb6>] reiserfs_put_super+0x36/0x130
       [<c10b776f>] generic_shutdown_super+0x4f/0xe0
       [<c10b7825>] kill_block_super+0x25/0x40
       [<c11255df>] reiserfs_kill_sb+0x7f/0x90
       [<c10b7f4a>] deactivate_super+0x7a/0x90
       [<c10cccd8>] mntput_no_expire+0x98/0xd0
       [<c10ccfcc>] sys_umount+0x4c/0x310
       [<c10cd2a9>] sys_oldumount+0x19/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by umount/3904:
 #0:  (&type->s_umount_key#30){+++++.}, at: [<c10b7f45>] deactivate_super+0x75/0x90
 #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40

stack backtrace:
Pid: 3904, comm: umount Not tainted 2.6.32-atom #172
Call Trace:
 [<c13ff903>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c108b66f>] ? free_pcppages_bulk+0x1f/0x250
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c10de2c2>] ? __blkdev_put+0x22/0x160
 [<c10de2c2>] ? __blkdev_put+0x22/0x160
 [<c140199b>] mutex_lock_nested+0x5b/0x340
 [<c10de2c2>] ? __blkdev_put+0x22/0x160
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c10afe12>] ? kfree+0x92/0xd0
 [<c10de2c2>] __blkdev_put+0x22/0x160
 [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
 [<c10de40a>] blkdev_put+0xa/0x10
 [<c113ce22>] free_journal_ram+0xd2/0x130
 [<c113ea18>] do_journal_release+0x98/0x190
 [<c113eb2a>] journal_release+0xa/0x10
 [<c1128eb6>] reiserfs_put_super+0x36/0x130
 [<c1050596>] ? up_write+0x16/0x30
 [<c10b776f>] generic_shutdown_super+0x4f/0xe0
 [<c10b7825>] kill_block_super+0x25/0x40
 [<c10f41e0>] ? vfs_quota_off+0x0/0x20
 [<c11255df>] reiserfs_kill_sb+0x7f/0x90
 [<c10b7f4a>] deactivate_super+0x7a/0x90
 [<c10cccd8>] mntput_no_expire+0x98/0xd0
 [<c10ccfcc>] sys_umount+0x4c/0x310
 [<c10cd2a9>] sys_oldumount+0x19/0x20
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
 fs/reiserfs/journal.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index a059879..83ac4d3 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2009,10 +2009,11 @@ static int do_journal_release(struct reiserfs_transaction_handle *th,
 		destroy_workqueue(commit_wq);
 		commit_wq = NULL;
 	}
-	reiserfs_write_lock(sb);
 
 	free_journal_ram(sb);
 
+	reiserfs_write_lock(sb);
+
 	return 0;
 }
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] Some reiserfs fixes
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2009-12-30  5:21 ` [PATCH 4/4] reiserfs: Relax reiserfs lock while freeing the journal Frederic Weisbecker
@ 2009-12-30  6:53 ` Frederic Weisbecker
  2009-12-30 20:42 ` [PATCH 0/5] reiserfs lock inversion fixes on xattr Frederic Weisbecker
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30  6:53 UTC (permalink / raw)
  To: LKML
  Cc: Christian Kujau, Yinghai Lu, Alexander Beregalov, Chris Mason,
	Ingo Molnar

On Wed, Dec 30, 2009 at 06:21:27AM +0100, Frederic Weisbecker wrote:
> Hi,
> 
> These patches should fix most of the lock inversions you've reported,
> plus some others I've encountered, except may be the one reported by
> Yinghai Lu that I'm still trying to reproduce.
> 
> I just hope these fixes won't enlight more lock inversions.
> 
> Anyway, please give it a try, your reports have been really helpful.



Actually please don't try it!

I just tested dbench with xattrs and got 5 more different lock
inversions (brought with soft lockups ;). I just fixed all of
them with 5 more patches, I just don't have pushed them yet,
waiting for a night in my bed before writing the changelogs.

Will push/post them tomorrow.

Thanks!


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/4 v2] reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
  2009-12-30  5:21 ` [PATCH 3/4] reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr Frederic Weisbecker
@ 2009-12-30 20:27   ` Frederic Weisbecker
  0 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

While deleting the xattrs of an inode, we hold the reiserfs lock
and grab the inode->i_mutex of the targeted inode and the root
private xattr directory.

Later on, we may relax the reiserfs lock for various reasons, this
creates inverted dependencies.

We can remove the reiserfs lock -> i_mutex dependency by relaxing
the former before calling open_xa_dir(). This is fine because the
lookup and creation of xattr private directories done in
open_xa_dir() are covered by the targeted inode mutexes. And deeper
operations in the tree are still done under the write lock.

This fixes the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c1141d83>] open_xa_dir+0x43/0x1b0
       [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
       [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0a00>] sys_unlink+0x10/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by cp/3204:
 #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
 #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
 [<c13ff993>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105d3aa>] ? check_usage+0x6a/0x460
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c1401a2b>] mutex_lock_nested+0x5b/0x340
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
 [<c1117012>] reiserfs_lookup+0x62/0x140
 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10bd85f>] __lookup_hash+0xef/0x110
 [<c10bf21d>] lookup_one_len+0x8d/0xc0
 [<c1141e2a>] open_xa_dir+0xea/0x1b0
 [<c1141fe5>] xattr_lookup+0x15/0x160
 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
 [<c1144042>] reiserfs_get_acl+0xa2/0x360
 [<c10ca2e7>] ? new_inode+0x27/0xa0
 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
 [<c1402eb7>] ? _spin_unlock+0x27/0x40
 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
 [<c10c7cb8>] ? __d_lookup+0x108/0x190
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
 [<c10bd17a>] ? generic_permission+0x1a/0xa0
 [<c11788fe>] ? security_inode_permission+0x1e/0x20
 [<c10bea96>] vfs_mkdir+0xd6/0x180
 [<c10c0c10>] sys_mkdirat+0xc0/0xd0
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c10c0c40>] sys_mkdir+0x20/0x30
 [<c1002ec4>] sysenter_do_call+0x12/0x32

v2: Don't drop reiserfs_mutex_lock_nested_safe() as we'll still
    need it later

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
 fs/reiserfs/xattr.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index a0e2e7a..c320c77 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -234,17 +234,22 @@ static int reiserfs_for_each_xattr(struct inode *inode,
 	if (IS_PRIVATE(inode) || get_inode_sd_version(inode) == STAT_DATA_V1)
 		return 0;
 
+	reiserfs_write_unlock(inode->i_sb);
 	dir = open_xa_dir(inode, XATTR_REPLACE);
 	if (IS_ERR(dir)) {
 		err = PTR_ERR(dir);
+		reiserfs_write_lock(inode->i_sb);
 		goto out;
 	} else if (!dir->d_inode) {
 		err = 0;
+		reiserfs_write_lock(inode->i_sb);
 		goto out_dir;
 	}
 
-	reiserfs_mutex_lock_nested_safe(&dir->d_inode->i_mutex, I_MUTEX_XATTR,
-					inode->i_sb);
+	mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_XATTR);
+
+	reiserfs_write_lock(inode->i_sb);
+
 	buf.xadir = dir;
 	err = reiserfs_readdir_dentry(dir, &buf, fill_with_dentries, &pos);
 	while ((err == 0 || err == -ENOSPC) && buf.count) {
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 0/5] reiserfs lock inversion fixes on xattr
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2009-12-30  6:53 ` [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
@ 2009-12-30 20:42 ` Frederic Weisbecker
  2009-12-31  6:32   ` Christian Kujau
  2009-12-30 20:42 ` [PATCH 1/5] reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() Frederic Weisbecker
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:42 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

Hi,

Christian Kujau's use of dbench to test reiserfs xattr has inspired me.
I tried dbench -x yesterday and got various soft-lockups and lockdep
reports.

I fixed all those I've seen in this batch and it passed through 6 hours
of stress testing (if a loop of dbench -x -t 60 20 can be called stress
testing) and things look pretty fine now.

I hope you can give it a try (and/or review) before I send the whole
to Linus.

Thanks!

It's on:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
        reiserfs/kill-bkl

How to get it:

git-clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
cd linux-2.6.git
git-pull git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git reiserfs/kill-bkl

Frederic Weisbecker (5):
  reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
  reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
  reiserfs: Fix journal mutex <-> inode mutex lock inversion
  reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
  reiserfs: Safely acquire i_mutex from xattr_rmdir

 fs/reiserfs/namei.c |    7 ++++---
 fs/reiserfs/xattr.c |   20 ++++++++++++++------
 2 files changed, 18 insertions(+), 9 deletions(-)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/5] reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2009-12-30 20:42 ` [PATCH 0/5] reiserfs lock inversion fixes on xattr Frederic Weisbecker
@ 2009-12-30 20:42 ` Frederic Weisbecker
  2009-12-30 20:42 ` [PATCH 2/5] reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() Frederic Weisbecker
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:42 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

We call xattr_lookup() from reiserfs_xattr_get(). We then hold
the reiserfs lock when we grab the i_mutex. But later, we may
relax the reiserfs lock, creating dependency inversion between
both locks.

The lookups and creation jobs ar already protected by the
inode mutex, so we can safely relax the reiserfs lock, dropping
the unwanted reiserfs lock -> i_mutex dependency, as shown
in the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c1141d83>] open_xa_dir+0x43/0x1b0
       [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
       [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0a00>] sys_unlink+0x10/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by cp/3204:
 #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
 #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
 [<c13ff993>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105d3aa>] ? check_usage+0x6a/0x460
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c1401a2b>] mutex_lock_nested+0x5b/0x340
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
 [<c1117012>] reiserfs_lookup+0x62/0x140
 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10bd85f>] __lookup_hash+0xef/0x110
 [<c10bf21d>] lookup_one_len+0x8d/0xc0
 [<c1141e2a>] open_xa_dir+0xea/0x1b0
 [<c1141fe5>] xattr_lookup+0x15/0x160
 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
 [<c1144042>] reiserfs_get_acl+0xa2/0x360
 [<c10ca2e7>] ? new_inode+0x27/0xa0
 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
 [<c1402eb7>] ? _spin_unlock+0x27/0x40
 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
 [<c10c7cb8>] ? __d_lookup+0x108/0x190
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
 [<c10bd17a>] ? generic_permission+0x1a/0xa0
 [<c11788fe>] ? security_inode_permission+0x1e/0x20
 [<c10bea96>] vfs_mkdir+0xd6/0x180
 [<c10c0c10>] sys_mkdirat+0xc0/0xd0
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c10c0c40>] sys_mkdir+0x20/0x30
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Kujau <lists@nerdbynature.de>
---
 fs/reiserfs/xattr.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index c320c77..78a3f24 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -485,11 +485,16 @@ reiserfs_xattr_set_handle(struct reiserfs_transaction_handle *th,
 	if (!buffer)
 		return lookup_and_delete_xattr(inode, name);
 
+	reiserfs_write_unlock(inode->i_sb);
 	dentry = xattr_lookup(inode, name, flags);
-	if (IS_ERR(dentry))
+	if (IS_ERR(dentry)) {
+		reiserfs_write_lock(inode->i_sb);
 		return PTR_ERR(dentry);
+	}
 
-	reiserfs_down_read_safe(&REISERFS_I(inode)->i_xattr_sem, inode->i_sb);
+	down_read(&REISERFS_I(inode)->i_xattr_sem);
+
+	reiserfs_write_lock(inode->i_sb);
 
 	xahash = xattr_hash(buffer, buffer_size);
 	while (buffer_pos < buffer_size || buffer_pos == 0) {
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/5] reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2009-12-30 20:42 ` [PATCH 1/5] reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() Frederic Weisbecker
@ 2009-12-30 20:42 ` Frederic Weisbecker
  2009-12-30 20:42 ` [PATCH 3/5] reiserfs: Fix journal mutex <-> inode mutex lock inversion Frederic Weisbecker
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:42 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

reiserfs_unlink() may or may not be called under the reiserfs
lock.
But it also takes the reiserfs lock and can then acquire it
recursively which leads to do_journal_begin_r() that fails to
relax the reiserfs lock before grabbing the journal mutex,
creating an unexpected lock inversion.

We need to ensure reiserfs_unlink() won't get the reiserfs lock
recursively using reiserfs_write_lock_once().

This fixes the following warning that precedes a lock inversion
report (reiserfs lock <-> journal mutex).

------------[ cut here ]------------
WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3a/0x50()
Hardware name: MS-7418
Unwanted recursive reiserfs lock!
Pid: 3208, comm: dbench Not tainted 2.6.32-atom #177
Call Trace:
 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
 [<c10373a7>] warn_slowpath_common+0x67/0xc0
 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
 [<c1037446>] warn_slowpath_fmt+0x26/0x30
 [<c114327a>] reiserfs_lock_check_recursive+0x3a/0x50
 [<c113c213>] do_journal_begin_r+0x83/0x360
 [<c105eb16>] ? __lock_acquire+0x1296/0x19e0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c113c670>] journal_begin+0x80/0x130
 [<c1116d5d>] reiserfs_unlink+0x7d/0x2d0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c1142a64>] xattr_unlink+0x64/0xb0
 [<c1143169>] delete_one_xattr+0x29/0x100
 [<c11427ab>] reiserfs_for_each_xattr+0x10b/0x290
 [<c1143140>] ? delete_one_xattr+0x0/0x100
 [<c1401ca9>] ? mutex_lock_nested+0x299/0x340
 [<c11429aa>] reiserfs_delete_xattrs+0x1a/0x60
 [<c11432f9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
 [<c11b0d0f>] ? _atomic_dec_and_lock+0x4f/0x70
 [<c111e980>] ? reiserfs_delete_inode+0x0/0x150
 [<c10c9c32>] generic_delete_inode+0xa2/0x170
 [<c10c9d4f>] generic_drop_inode+0x4f/0x70
 [<c10c8b07>] iput+0x47/0x50
 [<c10c0965>] do_unlinkat+0xd5/0x160
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1022ab7>] ? do_page_fault+0x187/0x330
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c1022930>] ? do_page_fault+0x0/0x330
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10c0a00>] sys_unlink+0x10/0x20
 [<c1002ec4>] sysenter_do_call+0x12/0x32
---[ end trace 2e35d71a6cc69d0c ]---

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Kujau <lists@nerdbynature.de>
---
 fs/reiserfs/namei.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/reiserfs/namei.c b/fs/reiserfs/namei.c
index e296ff7..9d4dcf0 100644
--- a/fs/reiserfs/namei.c
+++ b/fs/reiserfs/namei.c
@@ -921,6 +921,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry)
 	struct reiserfs_transaction_handle th;
 	int jbegin_count;
 	unsigned long savelink;
+	int depth;
 
 	inode = dentry->d_inode;
 
@@ -932,7 +933,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry)
 	    JOURNAL_PER_BALANCE_CNT * 2 + 2 +
 	    4 * REISERFS_QUOTA_TRANS_BLOCKS(dir->i_sb);
 
-	reiserfs_write_lock(dir->i_sb);
+	depth = reiserfs_write_lock_once(dir->i_sb);
 	retval = journal_begin(&th, dir->i_sb, jbegin_count);
 	if (retval)
 		goto out_unlink;
@@ -993,7 +994,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry)
 
 	retval = journal_end(&th, dir->i_sb, jbegin_count);
 	reiserfs_check_path(&path);
-	reiserfs_write_unlock(dir->i_sb);
+	reiserfs_write_unlock_once(dir->i_sb, depth);
 	return retval;
 
       end_unlink:
@@ -1003,7 +1004,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry)
 	if (err)
 		retval = err;
       out_unlink:
-	reiserfs_write_unlock(dir->i_sb);
+	reiserfs_write_unlock_once(dir->i_sb, depth);
 	return retval;
 }
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/5] reiserfs: Fix journal mutex <-> inode mutex lock inversion
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2009-12-30 20:42 ` [PATCH 2/5] reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() Frederic Weisbecker
@ 2009-12-30 20:42 ` Frederic Weisbecker
  2009-12-30 20:42 ` [PATCH 4/5] reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr Frederic Weisbecker
  2009-12-30 20:42 ` [PATCH 5/5] reiserfs: Safely acquire i_mutex from xattr_rmdir Frederic Weisbecker
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:42 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

We need to relax the reiserfs lock before locking the inode mutex
from xattr_unlink(), otherwise we'll face the usual bad dependencies:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #178
-------------------------------------------------------
rm/3202 is trying to acquire lock:
 (&journal->j_mutex){+.+...}, at: [<c113c234>] do_journal_begin_r+0x94/0x360

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sb->s_type->i_mutex_key#4/2){+.+...}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c1142a67>] xattr_unlink+0x57/0xb0
       [<c1143179>] delete_one_xattr+0x29/0x100
       [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c1143359>] reiserfs_write_lock+0x29/0x40
       [<c113c23c>] do_journal_begin_r+0x9c/0x360
       [<c113c680>] journal_begin+0x80/0x130
       [<c1127363>] reiserfs_remount+0x223/0x4e0
       [<c10b6dd6>] do_remount_sb+0xa6/0x140
       [<c10ce6a0>] do_mount+0x560/0x750
       [<c10ce914>] sys_mount+0x84/0xb0
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&journal->j_mutex){+.+...}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c113c234>] do_journal_begin_r+0x94/0x360
       [<c113c680>] journal_begin+0x80/0x130
       [<c1116d63>] reiserfs_unlink+0x83/0x2e0
       [<c1142a74>] xattr_unlink+0x64/0xb0
       [<c1143179>] delete_one_xattr+0x29/0x100
       [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by rm/3202:
 #0:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c114274b>] reiserfs_for_each_xattr+0x9b/0x290
 #1:  (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0

stack backtrace:
Pid: 3202, comm: rm Not tainted 2.6.32-atom #178
Call Trace:
 [<c13ff9e3>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c1142a67>] ? xattr_unlink+0x57/0xb0
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c113c234>] ? do_journal_begin_r+0x94/0x360
 [<c113c234>] ? do_journal_begin_r+0x94/0x360
 [<c1401a7b>] mutex_lock_nested+0x5b/0x340
 [<c113c234>] ? do_journal_begin_r+0x94/0x360
 [<c113c234>] do_journal_begin_r+0x94/0x360
 [<c10411b6>] ? run_timer_softirq+0x1a6/0x220
 [<c103cb00>] ? __do_softirq+0x50/0x140
 [<c113c680>] journal_begin+0x80/0x130
 [<c103cba2>] ? __do_softirq+0xf2/0x140
 [<c104f72f>] ? hrtimer_interrupt+0xdf/0x220
 [<c1116d63>] reiserfs_unlink+0x83/0x2e0
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c11b8d08>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c1142a67>] ? xattr_unlink+0x57/0xb0
 [<c1142a74>] xattr_unlink+0x64/0xb0
 [<c1143179>] delete_one_xattr+0x29/0x100
 [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
 [<c1143150>] ? delete_one_xattr+0x0/0x100
 [<c1401cb9>] ? mutex_lock_nested+0x299/0x340
 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
 [<c1143309>] ? reiserfs_write_lock_once+0x29/0x50
 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
 [<c11b0d1f>] ? _atomic_dec_and_lock+0x4f/0x70
 [<c111e990>] ? reiserfs_delete_inode+0x0/0x150
 [<c10c9c32>] generic_delete_inode+0xa2/0x170
 [<c10c9d4f>] generic_drop_inode+0x4f/0x70
 [<c10c8b07>] iput+0x47/0x50
 [<c10c0965>] do_unlinkat+0xd5/0x160
 [<c1401068>] ? mutex_unlock+0x8/0x10
 [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
 [<c10c3af0>] ? filldir64+0x0/0xf0
 [<c1002ef3>] ? sysenter_exit+0xf/0x16
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10c0b13>] sys_unlinkat+0x23/0x40
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Kujau <lists@nerdbynature.de>
---
 fs/reiserfs/xattr.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index 78a3f24..8b9631d 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -82,7 +82,8 @@ static int xattr_unlink(struct inode *dir, struct dentry *dentry)
 	BUG_ON(!mutex_is_locked(&dir->i_mutex));
 	vfs_dq_init(dir);
 
-	mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD);
+	reiserfs_mutex_lock_nested_safe(&dentry->d_inode->i_mutex,
+					I_MUTEX_CHILD, dir->i_sb);
 	error = dir->i_op->unlink(dir, dentry);
 	mutex_unlock(&dentry->d_inode->i_mutex);
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/5] reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2009-12-30 20:42 ` [PATCH 3/5] reiserfs: Fix journal mutex <-> inode mutex lock inversion Frederic Weisbecker
@ 2009-12-30 20:42 ` Frederic Weisbecker
  2009-12-30 20:42 ` [PATCH 5/5] reiserfs: Safely acquire i_mutex from xattr_rmdir Frederic Weisbecker
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:42 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

Relax the reiserfs lock before taking the inode mutex from
reiserfs_for_each_xattr() to avoid the usual bad dependencies:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #179
-------------------------------------------------------
rm/3242 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290

but task is already holding lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401aab>] mutex_lock_nested+0x5b/0x340
       [<c1143339>] reiserfs_write_lock_once+0x29/0x50
       [<c1117022>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e3a>] open_xa_dir+0xea/0x1b0
       [<c1142720>] reiserfs_for_each_xattr+0x70/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401aab>] mutex_lock_nested+0x5b/0x340
       [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

1 lock held by rm/3242:
 #0:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40

stack backtrace:
Pid: 3242, comm: rm Not tainted 2.6.32-atom #179
Call Trace:
 [<c13ffa13>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
 [<c1401098>] ? mutex_unlock+0x8/0x10
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
 [<c1401aab>] mutex_lock_nested+0x5b/0x340
 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
 [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
 [<c1143180>] ? delete_one_xattr+0x0/0x100
 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
 [<c1143339>] ? reiserfs_write_lock_once+0x29/0x50
 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
 [<c11b0d4f>] ? _atomic_dec_and_lock+0x4f/0x70
 [<c111e990>] ? reiserfs_delete_inode+0x0/0x150
 [<c10c9c32>] generic_delete_inode+0xa2/0x170
 [<c10c9d4f>] generic_drop_inode+0x4f/0x70
 [<c10c8b07>] iput+0x47/0x50
 [<c10c0965>] do_unlinkat+0xd5/0x160
 [<c1401098>] ? mutex_unlock+0x8/0x10
 [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
 [<c10c3af0>] ? filldir64+0x0/0xf0
 [<c1002ef3>] ? sysenter_exit+0xf/0x16
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10c0b13>] sys_unlinkat+0x23/0x40
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Kujau <lists@nerdbynature.de>
---
 fs/reiserfs/xattr.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index 8b9631d..bfdac66 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -289,8 +289,9 @@ static int reiserfs_for_each_xattr(struct inode *inode,
 		err = journal_begin(&th, inode->i_sb, blocks);
 		if (!err) {
 			int jerror;
-			mutex_lock_nested(&dir->d_parent->d_inode->i_mutex,
-					  I_MUTEX_XATTR);
+			reiserfs_mutex_lock_nested_safe(
+					  &dir->d_parent->d_inode->i_mutex,
+					  I_MUTEX_XATTR, inode->i_sb);
 			err = action(dir, data);
 			jerror = journal_end(&th, inode->i_sb, blocks);
 			mutex_unlock(&dir->d_parent->d_inode->i_mutex);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/5] reiserfs: Safely acquire i_mutex from xattr_rmdir
  2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2009-12-30 20:42 ` [PATCH 4/5] reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr Frederic Weisbecker
@ 2009-12-30 20:42 ` Frederic Weisbecker
  10 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2009-12-30 20:42 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Alexander Beregalov, Chris Mason,
	Ingo Molnar, Christian Kujau

Relax the reiserfs lock before taking the inode mutex from
xattr_rmdir() to avoid the usual reiserfs lock <-> inode mutex
bad dependency.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christian Kujau <lists@nerdbynature.de>
---
 fs/reiserfs/xattr.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index bfdac66..9623cfe 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -98,7 +98,8 @@ static int xattr_rmdir(struct inode *dir, struct dentry *dentry)
 	BUG_ON(!mutex_is_locked(&dir->i_mutex));
 	vfs_dq_init(dir);
 
-	mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD);
+	reiserfs_mutex_lock_nested_safe(&dentry->d_inode->i_mutex,
+					I_MUTEX_CHILD, dir->i_sb);
 	dentry_unhash(dentry);
 	error = dir->i_op->rmdir(dir, dentry);
 	if (!error)
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] reiserfs lock inversion fixes on xattr
  2009-12-30 20:42 ` [PATCH 0/5] reiserfs lock inversion fixes on xattr Frederic Weisbecker
@ 2009-12-31  6:32   ` Christian Kujau
  2010-01-01  9:29     ` Ingo Molnar
  2010-01-02  1:40     ` Frederic Weisbecker
  0 siblings, 2 replies; 17+ messages in thread
From: Christian Kujau @ 2009-12-31  6:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alexander Beregalov, Chris Mason, Ingo Molnar, reiserfs-devel

On Wed, 30 Dec 2009 at 21:42, Frederic Weisbecker wrote:
> I hope you can give it a try (and/or review) before I send the whole
> to Linus.

Without your patches, "dbench -x" was pretty much instantly causing the 
threads to lock up, forcing me to reboot the box. With LOCKDEP enabled, 
warnings were printed. After applying your patches (and a reboot) I've run 
"dbench -x" for quite a while now with different runtimes and number of
clients and it's not locking up any more. I've tested with:

CONFIG_REISERFS_FS=y
CONFIG_REISERFS_CHECK=y
CONFIG_REISERFS_PROC_INFO=y
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
# CONFIG_REISERFS_FS_SECURITY is not set

...and a few "Kernel hacking" options set. Full config and dmesg:

   http://nerdbynature.de/bits/2.6.33-rc1/reiserfs/t3/

   (although I'm not really testing -rc1 but latest mainline -git
    and your changes pulled)

So, feel free to add:

  Tested-by: Christian Kujau <lists@nerdbynature.de>

I can't help with the code review though :-\

Thanks!
Christian.
-- 
BOFH excuse #208:

Your mail is being routed through Germany ... and they're censoring us.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] reiserfs lock inversion fixes on xattr
  2009-12-31  6:32   ` Christian Kujau
@ 2010-01-01  9:29     ` Ingo Molnar
  2010-01-02  3:52       ` Frederic Weisbecker
  2010-01-02  1:40     ` Frederic Weisbecker
  1 sibling, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2010-01-01  9:29 UTC (permalink / raw)
  To: Christian Kujau
  Cc: Frederic Weisbecker, LKML, Alexander Beregalov, Chris Mason,
	reiserfs-devel


* Christian Kujau <lists@nerdbynature.de> wrote:

> On Wed, 30 Dec 2009 at 21:42, Frederic Weisbecker wrote:
> > I hope you can give it a try (and/or review) before I send the whole
> > to Linus.
> 
> Without your patches, "dbench -x" was pretty much instantly causing the 
> threads to lock up, forcing me to reboot the box. With LOCKDEP enabled, 
> warnings were printed. After applying your patches (and a reboot) I've run 
> "dbench -x" for quite a while now with different runtimes and number of
> clients and it's not locking up any more. I've tested with:
> 
> CONFIG_REISERFS_FS=y
> CONFIG_REISERFS_CHECK=y
> CONFIG_REISERFS_PROC_INFO=y
> CONFIG_REISERFS_FS_XATTR=y
> CONFIG_REISERFS_FS_POSIX_ACL=y
> # CONFIG_REISERFS_FS_SECURITY is not set
> 
> ...and a few "Kernel hacking" options set. Full config and dmesg:
> 
>    http://nerdbynature.de/bits/2.6.33-rc1/reiserfs/t3/
> 
>    (although I'm not really testing -rc1 but latest mainline -git
>     and your changes pulled)
> 
> So, feel free to add:
> 
>   Tested-by: Christian Kujau <lists@nerdbynature.de>

Frederic, i'd suggest you send this tree to Linus with Christian's tested-by 
tags added ASAP - we better have these fixes in -rc3.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] reiserfs lock inversion fixes on xattr
  2009-12-31  6:32   ` Christian Kujau
  2010-01-01  9:29     ` Ingo Molnar
@ 2010-01-02  1:40     ` Frederic Weisbecker
  1 sibling, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2010-01-02  1:40 UTC (permalink / raw)
  To: Christian Kujau
  Cc: LKML, Alexander Beregalov, Chris Mason, Ingo Molnar, reiserfs-devel

On Wed, Dec 30, 2009 at 10:32:04PM -0800, Christian Kujau wrote:
> On Wed, 30 Dec 2009 at 21:42, Frederic Weisbecker wrote:
> > I hope you can give it a try (and/or review) before I send the whole
> > to Linus.
> 
> Without your patches, "dbench -x" was pretty much instantly causing the 
> threads to lock up, forcing me to reboot the box. With LOCKDEP enabled, 
> warnings were printed. After applying your patches (and a reboot) I've run 
> "dbench -x" for quite a while now with different runtimes and number of
> clients and it's not locking up any more. I've tested with:
> 
> CONFIG_REISERFS_FS=y
> CONFIG_REISERFS_CHECK=y
> CONFIG_REISERFS_PROC_INFO=y
> CONFIG_REISERFS_FS_XATTR=y
> CONFIG_REISERFS_FS_POSIX_ACL=y
> # CONFIG_REISERFS_FS_SECURITY is not set
> 
> ...and a few "Kernel hacking" options set. Full config and dmesg:
> 
>    http://nerdbynature.de/bits/2.6.33-rc1/reiserfs/t3/
> 
>    (although I'm not really testing -rc1 but latest mainline -git
>     and your changes pulled)
> 
> So, feel free to add:
> 
>   Tested-by: Christian Kujau <lists@nerdbynature.de>


Great, I've added this to the xattr related commits,
thanks a lot!


> 
> I can't help with the code review though :-\


Don't worry, both testing and review are very precious
and appreciated!


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] reiserfs lock inversion fixes on xattr
  2010-01-01  9:29     ` Ingo Molnar
@ 2010-01-02  3:52       ` Frederic Weisbecker
  0 siblings, 0 replies; 17+ messages in thread
From: Frederic Weisbecker @ 2010-01-02  3:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christian Kujau, LKML, Alexander Beregalov, Chris Mason, reiserfs-devel

On Fri, Jan 01, 2010 at 10:29:44AM +0100, Ingo Molnar wrote:
> 
> * Christian Kujau <lists@nerdbynature.de> wrote:
> 
> > On Wed, 30 Dec 2009 at 21:42, Frederic Weisbecker wrote:
> > > I hope you can give it a try (and/or review) before I send the whole
> > > to Linus.
> > 
> > Without your patches, "dbench -x" was pretty much instantly causing the 
> > threads to lock up, forcing me to reboot the box. With LOCKDEP enabled, 
> > warnings were printed. After applying your patches (and a reboot) I've run 
> > "dbench -x" for quite a while now with different runtimes and number of
> > clients and it's not locking up any more. I've tested with:
> > 
> > CONFIG_REISERFS_FS=y
> > CONFIG_REISERFS_CHECK=y
> > CONFIG_REISERFS_PROC_INFO=y
> > CONFIG_REISERFS_FS_XATTR=y
> > CONFIG_REISERFS_FS_POSIX_ACL=y
> > # CONFIG_REISERFS_FS_SECURITY is not set
> > 
> > ...and a few "Kernel hacking" options set. Full config and dmesg:
> > 
> >    http://nerdbynature.de/bits/2.6.33-rc1/reiserfs/t3/
> > 
> >    (although I'm not really testing -rc1 but latest mainline -git
> >     and your changes pulled)
> > 
> > So, feel free to add:
> > 
> >   Tested-by: Christian Kujau <lists@nerdbynature.de>
> 
> Frederic, i'd suggest you send this tree to Linus with Christian's tested-by 
> tags added ASAP - we better have these fixes in -rc3.
> 
> Thanks,
> 
> 	Ingo


Yep, done :)


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-01-02  3:52 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-12-30  5:21 [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
2009-12-30  5:21 ` [PATCH 1/4] reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion Frederic Weisbecker
2009-12-30  5:21 ` [PATCH 2/4] reiserfs: Warn on lock relax if taken recursively Frederic Weisbecker
2009-12-30  5:21 ` [PATCH 3/4] reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr Frederic Weisbecker
2009-12-30 20:27   ` [PATCH 3/4 v2] " Frederic Weisbecker
2009-12-30  5:21 ` [PATCH 4/4] reiserfs: Relax reiserfs lock while freeing the journal Frederic Weisbecker
2009-12-30  6:53 ` [PATCH 0/4] Some reiserfs fixes Frederic Weisbecker
2009-12-30 20:42 ` [PATCH 0/5] reiserfs lock inversion fixes on xattr Frederic Weisbecker
2009-12-31  6:32   ` Christian Kujau
2010-01-01  9:29     ` Ingo Molnar
2010-01-02  3:52       ` Frederic Weisbecker
2010-01-02  1:40     ` Frederic Weisbecker
2009-12-30 20:42 ` [PATCH 1/5] reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() Frederic Weisbecker
2009-12-30 20:42 ` [PATCH 2/5] reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() Frederic Weisbecker
2009-12-30 20:42 ` [PATCH 3/5] reiserfs: Fix journal mutex <-> inode mutex lock inversion Frederic Weisbecker
2009-12-30 20:42 ` [PATCH 4/5] reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr Frederic Weisbecker
2009-12-30 20:42 ` [PATCH 5/5] reiserfs: Safely acquire i_mutex from xattr_rmdir Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).