All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions
@ 2009-08-25  2:32 Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 1/4] kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency Frederic Weisbecker
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-08-25  2:32 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Chris Mason, Roland Dreier,
	Ingo Molnar, Andi Kleen, Jeff Mahoney, Alexander Beregalov,
	Bron Gondwana, Reiserfs, Al Viro, Andrea Gelmini,
	Trenton D. Adams, Thomas Meyer, Alessio Igor Bogani,
	Marcel Hilzinger, Edward Shishkin, Laurent Riffard

Hi,

This small set fixes some lock dependency inversions found in reiserfs
xattr and mmap paths.
I guess there are still some of them that I'll have to hunt, especially one
reported by Laurent Riffard and another one introduced by the reiserfs_readdir
path optimization (though I'm not sure about the latter, I have yet to find a
way to reproduce it properly).

As usual, these patches can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
	reiserfs/kill-bkl

Thanks,
Frederic.

Frederic Weisbecker (4):
  kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock
    inversion dependency
  kill-the-bkl/reiserfs: fix recursive reiserfs lock in
    reiserfs_mkdir()
  kill-the-bkl/reiserfs: fix recursive reiserfs write lock in
    reiserfs_commit_write()
  kill-the-bkl/reiserfs: panic in case of lock imbalance

 fs/reiserfs/inode.c |   11 ++---------
 fs/reiserfs/lock.c  |    7 +++----
 fs/reiserfs/namei.c |    7 ++++---
 fs/reiserfs/xattr.c |    2 +-
 4 files changed, 10 insertions(+), 17 deletions(-)



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/4] kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency
  2009-08-25  2:32 [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Frederic Weisbecker
@ 2009-08-25  2:32 ` Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 2/4] kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir() Frederic Weisbecker
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-08-25  2:32 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Chris Mason, Roland Dreier,
	Ingo Molnar, Andi Kleen, Jeff Mahoney, Alexander Beregalov,
	Bron Gondwana, Reiserfs, Al Viro, Andrea Gelmini,
	Trenton D. Adams, Thomas Meyer, Alessio Igor Bogani,
	Marcel Hilzinger, Edward Shishkin, Laurent Riffard

reiserfs_xattr_init is called with the reiserfs write lock held, but
if the ".reiserfs_priv" entry is not created, we take the superblock
root directory inode mutex until .reiserfs_priv is created.

This creates a lock dependency inversion against other sites such as
reiserfs_file_release() which takes an inode mutex and the reiserfs
lock after.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
---
 fs/reiserfs/xattr.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index 59870a4..58aa8e7 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -1004,7 +1004,7 @@ int reiserfs_xattr_init(struct super_block *s, int mount_flags)
 		goto error;
 
 	if (!privroot->d_inode && !(mount_flags & MS_RDONLY)) {
-		mutex_lock(&s->s_root->d_inode->i_mutex);
+		reiserfs_mutex_lock_safe(&s->s_root->d_inode->i_mutex, s);
 		err = create_privroot(REISERFS_SB(s)->priv_root);
 		mutex_unlock(&s->s_root->d_inode->i_mutex);
 	}
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/4] kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir()
  2009-08-25  2:32 [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 1/4] kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency Frederic Weisbecker
@ 2009-08-25  2:32 ` Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 3/4] kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write() Frederic Weisbecker
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-08-25  2:32 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Chris Mason, Roland Dreier,
	Ingo Molnar, Andi Kleen, Jeff Mahoney, Alexander Beregalov,
	Bron Gondwana, Reiserfs, Al Viro, Andrea Gelmini,
	Trenton D. Adams, Thomas Meyer, Alessio Igor Bogani,
	Marcel Hilzinger, Edward Shishkin, Laurent Riffard

reiserfs_mkdir() acquires the reiserfs lock, assuming it has been called
from the dir inodes callbacks, without the lock held.

But it can also be called from other internal sites such as
reiserfs_xattr_init() which already holds the lock. This recursive
locking leads to further wrong assumptions. For example, later calls
to reiserfs_mutex_lock_safe() won't actually unlock the reiserfs lock
the time we acquire a given mutex, creating unexpected lock inversions.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
---
 fs/reiserfs/namei.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/reiserfs/namei.c b/fs/reiserfs/namei.c
index b3973c9..e296ff7 100644
--- a/fs/reiserfs/namei.c
+++ b/fs/reiserfs/namei.c
@@ -732,6 +732,7 @@ static int reiserfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 	struct inode *inode;
 	struct reiserfs_transaction_handle th;
 	struct reiserfs_security_handle security;
+	int lock_depth;
 	/* We need blocks for transaction + (user+group)*(quotas for new inode + update of quota for directory owner) */
 	int jbegin_count =
 	    JOURNAL_PER_BALANCE_CNT * 3 +
@@ -755,7 +756,7 @@ static int reiserfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 		return retval;
 	}
 	jbegin_count += retval;
-	reiserfs_write_lock(dir->i_sb);
+	lock_depth = reiserfs_write_lock_once(dir->i_sb);
 
 	retval = journal_begin(&th, dir->i_sb, jbegin_count);
 	if (retval) {
@@ -805,8 +806,8 @@ static int reiserfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
 	d_instantiate(dentry, inode);
 	unlock_new_inode(inode);
 	retval = journal_end(&th, dir->i_sb, jbegin_count);
-      out_failed:
-	reiserfs_write_unlock(dir->i_sb);
+out_failed:
+	reiserfs_write_unlock_once(dir->i_sb, lock_depth);
 	return retval;
 }
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/4] kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write()
  2009-08-25  2:32 [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 1/4] kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 2/4] kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir() Frederic Weisbecker
@ 2009-08-25  2:32 ` Frederic Weisbecker
  2009-08-25  2:32 ` [PATCH 4/4] kill-the-bkl/reiserfs: panic in case of lock imbalance Frederic Weisbecker
  2009-08-26 20:13 ` [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Alexander Beregalov
  4 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-08-25  2:32 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Chris Mason, Roland Dreier,
	Ingo Molnar, Andi Kleen, Jeff Mahoney, Alexander Beregalov,
	Bron Gondwana, Reiserfs, Al Viro, Andrea Gelmini,
	Trenton D. Adams, Thomas Meyer, Alessio Igor Bogani,
	Marcel Hilzinger, Edward Shishkin, Laurent Riffard

reiserfs_commit_write() is always called with the write lock held.
Thus the current calls to reiserfs_write_lock() in this function are
acquiring the lock recursively.
We can safely drop them.

This also solves further assumptions for this lock to be really
released while calling reiserfs_write_unlock().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
---
 fs/reiserfs/inode.c |   11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index 853f4f6..965c8ea 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -2795,7 +2795,6 @@ int reiserfs_commit_write(struct file *f, struct page *page,
 	 */
 	if (pos > inode->i_size) {
 		struct reiserfs_transaction_handle myth;
-		reiserfs_write_lock(inode->i_sb);
 		/* If the file have grown beyond the border where it
 		   can have a tail, unmark it as needing a tail
 		   packing */
@@ -2806,10 +2805,9 @@ int reiserfs_commit_write(struct file *f, struct page *page,
 			REISERFS_I(inode)->i_flags &= ~i_pack_on_close_mask;
 
 		ret = journal_begin(&myth, inode->i_sb, 1);
-		if (ret) {
-			reiserfs_write_unlock(inode->i_sb);
+		if (ret)
 			goto journal_error;
-		}
+
 		reiserfs_update_inode_transaction(inode);
 		inode->i_size = pos;
 		/*
@@ -2821,16 +2819,13 @@ int reiserfs_commit_write(struct file *f, struct page *page,
 		reiserfs_update_sd(&myth, inode);
 		update_sd = 1;
 		ret = journal_end(&myth, inode->i_sb, 1);
-		reiserfs_write_unlock(inode->i_sb);
 		if (ret)
 			goto journal_error;
 	}
 	if (th) {
-		reiserfs_write_lock(inode->i_sb);
 		if (!update_sd)
 			mark_inode_dirty(inode);
 		ret = reiserfs_end_persistent_transaction(th);
-		reiserfs_write_unlock(inode->i_sb);
 		if (ret)
 			goto out;
 	}
@@ -2840,11 +2835,9 @@ int reiserfs_commit_write(struct file *f, struct page *page,
 
       journal_error:
 	if (th) {
-		reiserfs_write_lock(inode->i_sb);
 		if (!update_sd)
 			reiserfs_update_sd(th, inode);
 		ret = reiserfs_end_persistent_transaction(th);
-		reiserfs_write_unlock(inode->i_sb);
 	}
 
 	return ret;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/4] kill-the-bkl/reiserfs: panic in case of lock imbalance
  2009-08-25  2:32 [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2009-08-25  2:32 ` [PATCH 3/4] kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write() Frederic Weisbecker
@ 2009-08-25  2:32 ` Frederic Weisbecker
  2009-08-26 20:13 ` [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Alexander Beregalov
  4 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-08-25  2:32 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Chris Mason, Roland Dreier,
	Ingo Molnar, Andi Kleen, Jeff Mahoney, Alexander Beregalov,
	Bron Gondwana, Reiserfs, Al Viro, Andrea Gelmini,
	Trenton D. Adams, Thomas Meyer, Alessio Igor Bogani,
	Marcel Hilzinger, Edward Shishkin, Laurent Riffard

Until now, trying to unlock the reiserfs write lock whereas the current
task doesn't hold it lead to a simple warning.
We should actually warn and panic in this case to avoid the user datas
to reach an unstable state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
---
 fs/reiserfs/lock.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/reiserfs/lock.c b/fs/reiserfs/lock.c
index cb1bba3..ee2cfc0 100644
--- a/fs/reiserfs/lock.c
+++ b/fs/reiserfs/lock.c
@@ -37,11 +37,10 @@ void reiserfs_write_unlock(struct super_block *s)
 
 	/*
 	 * Are we unlocking without even holding the lock?
-	 * Such a situation could even raise a BUG() if we don't
-	 * want the data become corrupted
+	 * Such a situation must raise a BUG() if we don't want
+	 * to corrupt the data.
 	 */
-	WARN_ONCE(sb_i->lock_owner != current,
-		  "Superblock write lock imbalance");
+	BUG_ON(sb_i->lock_owner != current);
 
 	if (--sb_i->lock_depth == -1) {
 		sb_i->lock_owner = NULL;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions
  2009-08-25  2:32 [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2009-08-25  2:32 ` [PATCH 4/4] kill-the-bkl/reiserfs: panic in case of lock imbalance Frederic Weisbecker
@ 2009-08-26 20:13 ` Alexander Beregalov
  2009-09-01 22:16   ` Frederic Weisbecker
  2009-09-14 20:37   ` Frederic Weisbecker
  4 siblings, 2 replies; 19+ messages in thread
From: Alexander Beregalov @ 2009-08-26 20:13 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

On Tue, Aug 25, 2009 at 04:32:46AM +0200, Frederic Weisbecker wrote:
> Hi,
> 
> This small set fixes some lock dependency inversions found in reiserfs
> xattr and mmap paths.
> I guess there are still some of them that I'll have to hunt, especially one
> reported by Laurent Riffard and another one introduced by the reiserfs_readdir
> path optimization (though I'm not sure about the latter, I have yet to find a
> way to reproduce it properly).
> 
> As usual, these patches can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
> 	reiserfs/kill-bkl

Hi,
possible circular locking dependency detected

Is it false positive?

REISERFS (device sda1): found reiserfs format "3.6" with standard journal
REISERFS (device sda1): using ordered data mode
REISERFS (device sda1): journal params: device sda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
REISERFS (device sda1): checking transaction log (sda1)
REISERFS debug (device sda1): journal-1153: found in header: first_unflushed_offset 6766, last_flushed_trans_id 1836992
REISERFS debug (device sda1): journal-1206: Starting replay from offset 7889824857987694, trans_id 18
REISERFS debug (device sda1): journal-1299: Setting newest_mount_id to 229
REISERFS (device sda1): Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly on device 8:1.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-rc7-00135-g9399a4c #5
-------------------------------------------------------
init.sh/599 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c1071056>] might_fault+0x46/0xa0

but task is already holding lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c10ff20e>] reiserfs_write_lock+0x1e/0x30

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c104dd18>] __lock_acquire+0xd28/0x1390
       [<c104e3ef>] lock_acquire+0x6f/0x90
       [<c133af96>] __mutex_lock_common+0x46/0x310
       [<c133b318>] mutex_lock_nested+0x38/0x40
       [<c10ff1ce>] reiserfs_write_lock_once+0x1e/0x40
       [<c10db74c>] reiserfs_get_block+0x5c/0x1440
       [<c10ae7a0>] do_mpage_readpage+0x120/0x4b0
       [<c10aec2f>] mpage_readpages+0x9f/0xe0
       [<c10d8c39>] reiserfs_readpages+0x19/0x20
       [<c10671f5>] __do_page_cache_readahead+0x195/0x210
       [<c1067291>] ra_submit+0x21/0x30
       [<c1061229>] filemap_fault+0x2e9/0x380
       [<c1072f08>] __do_fault+0x38/0x3b0
       [<c1073b5d>] handle_mm_fault+0xcd/0x550
       [<c101ad63>] do_page_fault+0xf3/0x240
       [<c133cd53>] error_code+0x63/0x68
       [<c10baa74>] padzero+0x24/0x40
       [<c10bc062>] load_elf_binary+0x632/0x1480
       [<c108aafa>] search_binary_handler+0x8a/0x270
       [<c108cb15>] do_execve+0x215/0x2a0
       [<c1001658>] sys_execve+0x28/0x60
       [<c1002d99>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

-> #0 (&mm->mmap_sem){++++++}:
       [<c104dd99>] __lock_acquire+0xda9/0x1390
       [<c104e3ef>] lock_acquire+0x6f/0x90
       [<c1071087>] might_fault+0x77/0xa0
       [<c11c1886>] copy_to_user+0x36/0x130
       [<c1093c29>] filldir64+0xa9/0xf0
       [<c10deda1>] reiserfs_readdir_dentry+0x4a1/0x7b0
       [<c10df0c7>] reiserfs_readdir+0x17/0x20
       [<c1093eb5>] vfs_readdir+0x85/0xa0
       [<c1093f34>] sys_getdents64+0x64/0xb0
       [<c1002d18>] sysenter_do_call+0x12/0x36
       [<ffffffff>] 0xffffffff

other info that might help us debug this:

2 locks held by init.sh/599:
 #0:  (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [<c1093e82>] vfs_readdir+0x52/0xa0
 #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c10ff20e>] reiserfs_write_lock+0x1e/0x30

stack backtrace:
Pid: 599, comm: init.sh Not tainted 2.6.31-rc7-00135-g9399a4c #5
Call Trace:
 [<c133a11a>] ? printk+0x18/0x1e
 [<c104bfdd>] print_circular_bug_tail+0x8d/0xd0
 [<c104dd99>] __lock_acquire+0xda9/0x1390
 [<c104e3ef>] lock_acquire+0x6f/0x90
 [<c1071056>] ? might_fault+0x46/0xa0
 [<c1071087>] might_fault+0x77/0xa0
 [<c1071056>] ? might_fault+0x46/0xa0
 [<c11c1886>] copy_to_user+0x36/0x130
 [<c1093c29>] filldir64+0xa9/0xf0
 [<c10ff20e>] ? reiserfs_write_lock+0x1e/0x30
 [<c10deda1>] reiserfs_readdir_dentry+0x4a1/0x7b0
 [<c1093b80>] ? filldir64+0x0/0xf0
 [<c104d453>] ? __lock_acquire+0x463/0x1390
 [<c104c62e>] ? trace_hardirqs_on_caller+0x7e/0x170
 [<c1093b80>] ? filldir64+0x0/0xf0
 [<c133b153>] ? __mutex_lock_common+0x203/0x310
 [<c1093b80>] ? filldir64+0x0/0xf0
 [<c10df0c7>] reiserfs_readdir+0x17/0x20
 [<c1093eb5>] vfs_readdir+0x85/0xa0
 [<c1093f34>] sys_getdents64+0x64/0xb0
 [<c1002d18>] sysenter_do_call+0x12/0x36

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions
  2009-08-26 20:13 ` [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Alexander Beregalov
@ 2009-09-01 22:16   ` Frederic Weisbecker
  2009-09-14 20:37   ` Frederic Weisbecker
  1 sibling, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-09-01 22:16 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: LKML, Reiserfs

On Thu, Aug 27, 2009 at 12:13:30AM +0400, Alexander Beregalov wrote:
> On Tue, Aug 25, 2009 at 04:32:46AM +0200, Frederic Weisbecker wrote:
> > Hi,
> > 
> > This small set fixes some lock dependency inversions found in reiserfs
> > xattr and mmap paths.
> > I guess there are still some of them that I'll have to hunt, especially one
> > reported by Laurent Riffard and another one introduced by the reiserfs_readdir
> > path optimization (though I'm not sure about the latter, I have yet to find a
> > way to reproduce it properly).
> > 
> > As usual, these patches can be found at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
> > 	reiserfs/kill-bkl
> 
> Hi,
> possible circular locking dependency detected
> 
> Is it false positive?



No, this confirms a real one, also the worst because it has been
introduced by a nice optimization (in term of result of throughput)
that I'll probably need to revert.

reiserfs_readdir() has a reiserfs_write_lock -> mm->mmap_sem
dependency.
But reiserfs_readpage() has a mm->mmap_sem -> reiserfs_write_lock
dependency.

I'll send a fix soon. Thanks a lot for this report Alexander!

Frederic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions
  2009-08-26 20:13 ` [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Alexander Beregalov
  2009-09-01 22:16   ` Frederic Weisbecker
@ 2009-09-14 20:37   ` Frederic Weisbecker
  2009-09-14 21:33     ` Alexander Beregalov
  1 sibling, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2009-09-14 20:37 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: LKML, Reiserfs

On Thu, Aug 27, 2009 at 12:13:30AM +0400, Alexander Beregalov wrote:
> Hi,
> possible circular locking dependency detected
> 
> Is it false positive?
> 
> REISERFS (device sda1): found reiserfs format "3.6" with standard journal
> REISERFS (device sda1): using ordered data mode
> REISERFS (device sda1): journal params: device sda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
> REISERFS (device sda1): checking transaction log (sda1)
> REISERFS debug (device sda1): journal-1153: found in header: first_unflushed_offset 6766, last_flushed_trans_id 1836992
> REISERFS debug (device sda1): journal-1206: Starting replay from offset 7889824857987694, trans_id 18
> REISERFS debug (device sda1): journal-1299: Setting newest_mount_id to 229
> REISERFS (device sda1): Using r5 hash to sort names
> VFS: Mounted root (reiserfs filesystem) readonly on device 8:1.
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31-rc7-00135-g9399a4c #5
> -------------------------------------------------------
> init.sh/599 is trying to acquire lock:
>  (&mm->mmap_sem){++++++}, at: [<c1071056>] might_fault+0x46/0xa0
> 
> but task is already holding lock:
>  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c10ff20e>] reiserfs_write_lock+0x1e/0x30
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
>        [<c104dd18>] __lock_acquire+0xd28/0x1390
>        [<c104e3ef>] lock_acquire+0x6f/0x90
>        [<c133af96>] __mutex_lock_common+0x46/0x310
>        [<c133b318>] mutex_lock_nested+0x38/0x40
>        [<c10ff1ce>] reiserfs_write_lock_once+0x1e/0x40
>        [<c10db74c>] reiserfs_get_block+0x5c/0x1440
>        [<c10ae7a0>] do_mpage_readpage+0x120/0x4b0
>        [<c10aec2f>] mpage_readpages+0x9f/0xe0
>        [<c10d8c39>] reiserfs_readpages+0x19/0x20
>        [<c10671f5>] __do_page_cache_readahead+0x195/0x210
>        [<c1067291>] ra_submit+0x21/0x30
>        [<c1061229>] filemap_fault+0x2e9/0x380
>        [<c1072f08>] __do_fault+0x38/0x3b0
>        [<c1073b5d>] handle_mm_fault+0xcd/0x550
>        [<c101ad63>] do_page_fault+0xf3/0x240
>        [<c133cd53>] error_code+0x63/0x68
>        [<c10baa74>] padzero+0x24/0x40
>        [<c10bc062>] load_elf_binary+0x632/0x1480
>        [<c108aafa>] search_binary_handler+0x8a/0x270
>        [<c108cb15>] do_execve+0x215/0x2a0
>        [<c1001658>] sys_execve+0x28/0x60
>        [<c1002d99>] syscall_call+0x7/0xb
>        [<ffffffff>] 0xffffffff
> 
> -> #0 (&mm->mmap_sem){++++++}:
>        [<c104dd99>] __lock_acquire+0xda9/0x1390
>        [<c104e3ef>] lock_acquire+0x6f/0x90
>        [<c1071087>] might_fault+0x77/0xa0
>        [<c11c1886>] copy_to_user+0x36/0x130
>        [<c1093c29>] filldir64+0xa9/0xf0
>        [<c10deda1>] reiserfs_readdir_dentry+0x4a1/0x7b0
>        [<c10df0c7>] reiserfs_readdir+0x17/0x20
>        [<c1093eb5>] vfs_readdir+0x85/0xa0
>        [<c1093f34>] sys_getdents64+0x64/0xb0
>        [<c1002d18>] sysenter_do_call+0x12/0x36
>        [<ffffffff>] 0xffffffff
> 
> other info that might help us debug this:
> 
> 2 locks held by init.sh/599:
>  #0:  (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [<c1093e82>] vfs_readdir+0x52/0xa0
>  #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c10ff20e>] reiserfs_write_lock+0x1e/0x30
> 
> stack backtrace:
> Pid: 599, comm: init.sh Not tainted 2.6.31-rc7-00135-g9399a4c #5
> Call Trace:
>  [<c133a11a>] ? printk+0x18/0x1e
>  [<c104bfdd>] print_circular_bug_tail+0x8d/0xd0
>  [<c104dd99>] __lock_acquire+0xda9/0x1390
>  [<c104e3ef>] lock_acquire+0x6f/0x90
>  [<c1071056>] ? might_fault+0x46/0xa0
>  [<c1071087>] might_fault+0x77/0xa0
>  [<c1071056>] ? might_fault+0x46/0xa0
>  [<c11c1886>] copy_to_user+0x36/0x130
>  [<c1093c29>] filldir64+0xa9/0xf0
>  [<c10ff20e>] ? reiserfs_write_lock+0x1e/0x30
>  [<c10deda1>] reiserfs_readdir_dentry+0x4a1/0x7b0
>  [<c1093b80>] ? filldir64+0x0/0xf0
>  [<c104d453>] ? __lock_acquire+0x463/0x1390
>  [<c104c62e>] ? trace_hardirqs_on_caller+0x7e/0x170
>  [<c1093b80>] ? filldir64+0x0/0xf0
>  [<c133b153>] ? __mutex_lock_common+0x203/0x310
>  [<c1093b80>] ? filldir64+0x0/0xf0
>  [<c10df0c7>] reiserfs_readdir+0x17/0x20
>  [<c1093eb5>] vfs_readdir+0x85/0xa0
>  [<c1093f34>] sys_getdents64+0x64/0xb0
>  [<c1002d18>] sysenter_do_call+0x12/0x36



Hi Alexander,

It should be fixed now, still in the following tree:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	reiserfs/kill-bkl


Don't hesistate to tell me if you see other problems.

Thanks a lot for your report!

Frederic.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency  inversions
  2009-09-14 20:37   ` Frederic Weisbecker
@ 2009-09-14 21:33     ` Alexander Beregalov
  2009-09-14 21:50       ` Frederic Weisbecker
  2009-09-16 20:37       ` Frederic Weisbecker
  0 siblings, 2 replies; 19+ messages in thread
From: Alexander Beregalov @ 2009-09-14 21:33 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

> Hi Alexander,
>
> It should be fixed now, still in the following tree:

Hi!
Another one, similar:
It is v2.6.31-3123-g99bc470 plus your 805031859(kill-the-bkl/reiserfs:
panic in case of lock imbalance), UP.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-03149-gdcc030a #1
-------------------------------------------------------
udevadm/716 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c107249a>] might_fault+0x4a/0xa0

but task is already holding lock:
 (sysfs_mutex){+.+.+.}, at: [<c10cb9aa>] sysfs_readdir+0x5a/0x200

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (sysfs_mutex){+.+.+.}:
       [<c104e08e>] __lock_acquire+0xd0e/0x15c0
       [<c104e9ba>] lock_acquire+0x7a/0xa0
       [<c13402c6>] __mutex_lock_common+0x46/0x310
       [<c134066a>] mutex_lock_nested+0x3a/0x50
       [<c10cbcdc>] sysfs_addrm_start+0x2c/0xa0
       [<c10cca50>] create_dir+0x40/0x90
       [<c10ccacb>] sysfs_create_dir+0x2b/0x40
       [<c11bdd6b>] kobject_add_internal+0x9b/0x250
       [<c11be01d>] kobject_add_varg+0x2d/0x50
       [<c11be09c>] kobject_add+0x2c/0x60
       [<c1230984>] device_add+0xf4/0x4a0
       [<c10c98df>] add_partition+0x13f/0x230
       [<c10c9fdb>] rescan_partitions+0x24b/0x3c0
       [<c10aeb80>] __blkdev_get+0x140/0x320
       [<c10aed6a>] blkdev_get+0xa/0x10
       [<c10c9787>] register_disk+0x127/0x140
       [<c11b7ff0>] add_disk+0x80/0x140
       [<c12473a8>] sd_probe_async+0xf8/0x1b0
       [<c1042371>] async_thread+0xd1/0x220
       [<c103c6cc>] kthread+0x6c/0x80
       [<c100355f>] kernel_thread_helper+0x7/0x18

-> #2 (&bdev->bd_mutex){+.+.+.}:
       [<c104e08e>] __lock_acquire+0xd0e/0x15c0
       [<c104e9ba>] lock_acquire+0x7a/0xa0
       [<c13402c6>] __mutex_lock_common+0x46/0x310
       [<c134066a>] mutex_lock_nested+0x3a/0x50
       [<c10aea6a>] __blkdev_get+0x2a/0x320
       [<c10aed6a>] blkdev_get+0xa/0x10
       [<c10aeed1>] open_by_devnum+0x21/0x50
       [<c10fb577>] journal_init+0x1d7/0xa10
       [<c10e8463>] reiserfs_fill_super+0x313/0xdb0
       [<c108a898>] get_sb_bdev+0x108/0x150
       [<c10e6381>] get_super_block+0x21/0x30
       [<c1089910>] vfs_kern_mount+0x40/0xa0
       [<c10899c9>] do_kern_mount+0x39/0xd0
       [<c109f319>] do_mount+0x309/0x700
       [<c109f776>] sys_mount+0x66/0xa0
       [<c15fe886>] mount_block_root+0xc4/0x245
       [<c15fea60>] mount_root+0x59/0x5f
       [<c15feb77>] prepare_namespace+0x111/0x14b
       [<c15fe23d>] kernel_init+0xca/0xd6
       [<c100355f>] kernel_thread_helper+0x7/0x18

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c104e08e>] __lock_acquire+0xd0e/0x15c0
       [<c104e9ba>] lock_acquire+0x7a/0xa0
       [<c13402c6>] __mutex_lock_common+0x46/0x310
       [<c134066a>] mutex_lock_nested+0x3a/0x50
       [<c11012d8>] reiserfs_write_lock_once+0x28/0x50
       [<c10dd7cc>] reiserfs_get_block+0x5c/0x1440
       [<c10b0660>] do_mpage_readpage+0x120/0x4b0
       [<c10b0aef>] mpage_readpages+0x9f/0xe0
       [<c10dacb9>] reiserfs_readpages+0x19/0x20
       [<c1067dc5>] __do_page_cache_readahead+0x195/0x210
       [<c1067e61>] ra_submit+0x21/0x30
       [<c1062119>] filemap_fault+0x2e9/0x380
       [<c1074388>] __do_fault+0x38/0x3b0
       [<c1074fdd>] handle_mm_fault+0xcd/0x550
       [<c101af55>] do_page_fault+0xf5/0x240
       [<c13420d3>] error_code+0x63/0x68
       [<c10bcae4>] padzero+0x24/0x40
       [<c10bde9a>] load_elf_binary+0x61a/0x1450
       [<c108c0e0>] search_binary_handler+0x90/0x270
       [<c108e152>] do_execve+0x1d2/0x240
       [<c1001658>] sys_execve+0x28/0x60
       [<c1002d59>] syscall_call+0x7/0xb

-> #0 (&mm->mmap_sem){++++++}:
       [<c104e4a5>] __lock_acquire+0x1125/0x15c0
       [<c104e9ba>] lock_acquire+0x7a/0xa0
       [<c10724cb>] might_fault+0x7b/0xa0
       [<c11c4c86>] copy_to_user+0x36/0x130
       [<c10951f9>] filldir64+0xa9/0xf0
       [<c10cba4e>] sysfs_readdir+0xfe/0x200
       [<c1095485>] vfs_readdir+0x85/0xa0
       [<c1095504>] sys_getdents64+0x64/0xb0
       [<c1002cd8>] sysenter_do_call+0x12/0x36

other info that might help us debug this:

2 locks held by udevadm/716:
 #0:  (&type->i_mutex_dir_key){+.+.+.}, at: [<c1095452>] vfs_readdir+0x52/0xa0
 #1:  (sysfs_mutex){+.+.+.}, at: [<c10cb9aa>] sysfs_readdir+0x5a/0x200

stack backtrace:
Pid: 716, comm: udevadm Tainted: G        W  2.6.31-03149-gdcc030a #1
Call Trace:
 [<c133f41a>] ? printk+0x18/0x1e
 [<c104c110>] print_circular_bug+0xc0/0xd0
 [<c104e4a5>] __lock_acquire+0x1125/0x15c0
 [<c104e9ba>] lock_acquire+0x7a/0xa0
 [<c107249a>] ? might_fault+0x4a/0xa0
 [<c10724cb>] might_fault+0x7b/0xa0
 [<c107249a>] ? might_fault+0x4a/0xa0
 [<c11c4c86>] copy_to_user+0x36/0x130
 [<c10951f9>] filldir64+0xa9/0xf0
 [<c1095150>] ? filldir64+0x0/0xf0
 [<c10cba4e>] sysfs_readdir+0xfe/0x200
 [<c1095150>] ? filldir64+0x0/0xf0
 [<c1095150>] ? filldir64+0x0/0xf0
 [<c1095485>] vfs_readdir+0x85/0xa0
 [<c1095504>] sys_getdents64+0x64/0xb0
 [<c1002cd8>] sysenter_do_call+0x12/0x36

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions
  2009-09-14 21:33     ` Alexander Beregalov
@ 2009-09-14 21:50       ` Frederic Weisbecker
  2009-09-16 20:37       ` Frederic Weisbecker
  1 sibling, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-09-14 21:50 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: LKML, Reiserfs

On Tue, Sep 15, 2009 at 01:33:42AM +0400, Alexander Beregalov wrote:
> > Hi Alexander,
> >
> > It should be fixed now, still in the following tree:
> 
> Hi!
> Another one, similar:
> It is v2.6.31-3123-g99bc470 plus your 805031859(kill-the-bkl/reiserfs:
> panic in case of lock imbalance), UP.
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31-03149-gdcc030a #1
> -------------------------------------------------------
> udevadm/716 is trying to acquire lock:
>  (&mm->mmap_sem){++++++}, at: [<c107249a>] might_fault+0x4a/0xa0



I hate this mm->mmap_sem....



> 
> but task is already holding lock:
>  (sysfs_mutex){+.+.+.}, at: [<c10cb9aa>] sysfs_readdir+0x5a/0x200
> 
> which lock already depends on the new lock.



This is really weird. This is two externals locks from reiserfs.
I guess I created this dependency somewhere, but how...

Anyway, someone reported a similar bug with this tree some months ago,
I've even met it once but could never be able to reproduce it anymore.

I'll try to find out. Thanks!


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions
  2009-09-14 21:33     ` Alexander Beregalov
  2009-09-14 21:50       ` Frederic Weisbecker
@ 2009-09-16 20:37       ` Frederic Weisbecker
  2009-09-16 23:37           ` Alexander Beregalov
  1 sibling, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2009-09-16 20:37 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: LKML, Reiserfs

On Tue, Sep 15, 2009 at 01:33:42AM +0400, Alexander Beregalov wrote:
> > Hi Alexander,
> >
> > It should be fixed now, still in the following tree:
> 
> Hi!
> Another one, similar:
> It is v2.6.31-3123-g99bc470 plus your 805031859(kill-the-bkl/reiserfs:
> panic in case of lock imbalance), UP.



Although I can't reproduce it, I think I see how that can happen.

On mount time, we have the following dependency:

reiserfs_lock -> bdev_mutex -> sysfs_mutex

which happens while calling journal_init_dev() because
we open the device there.

But also in case of mmap on a reiserfs filesystem we
may call reiserfs_readpages(), holding the reiserfs lock
while already holding mm->mmap_sem

The above dependency is then updated:

mmap_sem
     |
     |
     ------- reiserfs_lock -> bdev_mutex -> sysfs_mutex

And later, while doing a readdir() on a sysfs directory,
sysfs calls filldir, which might_fault, and then might grab
mmap_sem. filldir is called there while holding sys_mutex,
creating the new following dependency

sysfs_mutex -> mmap_sem

Hence the inversion.
It seems the deadlock can't ever happen, I even don't see
corner cases where it could happen.

But still, this dependency should disappear.

Could you please tell me if the following patch makes it shut down?

Otherwise, I may need your config. I don't know why, but I suspect
the lock_acquire(mmap_sem) in might_fault doesn't trigger needed the lockdep
check, although I have the appropriate debug config, at least it seems.
But anyway, it should also happen in my box but it doesn't...

Thanks!

The patch:

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index d23d6d7..59f7a4c 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2801,11 +2801,14 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
 		goto free_and_return;
 	}
 
+	reiserfs_write_unlock(sb);
 	if (journal_init_dev(sb, journal, j_dev_name) != 0) {
 		reiserfs_warning(sb, "sh-462",
 				 "unable to initialize jornal device");
+		reiserfs_write_lock(sb);
 		goto free_and_return;
 	}
+	reiserfs_write_lock(sb);
 
 	rs = SB_DISK_SUPER_BLOCK(sb);
 


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency  inversions
  2009-09-16 20:37       ` Frederic Weisbecker
@ 2009-09-16 23:37           ` Alexander Beregalov
  0 siblings, 0 replies; 19+ messages in thread
From: Alexander Beregalov @ 2009-09-16 23:37 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

2009/9/17 Frederic Weisbecker <fweisbec@gmail.com>:
> On Tue, Sep 15, 2009 at 01:33:42AM +0400, Alexander Beregalov wrote:
>> > Hi Alexander,
>> >
>> > It should be fixed now, still in the following tree:
>>
>> Hi!
>> Another one, similar:
>> It is v2.6.31-3123-g99bc470 plus your 805031859(kill-the-bkl/reiserfs:
>> panic in case of lock imbalance), UP.
>
>
>
> Although I can't reproduce it, I think I see how that can happen.
>
> On mount time, we have the following dependency:
>
> reiserfs_lock -> bdev_mutex -> sysfs_mutex
>
> which happens while calling journal_init_dev() because
> we open the device there.
>
> But also in case of mmap on a reiserfs filesystem we
> may call reiserfs_readpages(), holding the reiserfs lock
> while already holding mm->mmap_sem
>
> The above dependency is then updated:
>
> mmap_sem
>     |
>     |
>     ------- reiserfs_lock -> bdev_mutex -> sysfs_mutex
>
> And later, while doing a readdir() on a sysfs directory,
> sysfs calls filldir, which might_fault, and then might grab
> mmap_sem. filldir is called there while holding sys_mutex,
> creating the new following dependency
>
> sysfs_mutex -> mmap_sem
>
> Hence the inversion.
> It seems the deadlock can't ever happen, I even don't see
> corner cases where it could happen.
>
> But still, this dependency should disappear.
>
> Could you please tell me if the following patch makes it shut down?
>
> Otherwise, I may need your config. I don't know why, but I suspect
> the lock_acquire(mmap_sem) in might_fault doesn't trigger needed the lockdep
> check, although I have the appropriate debug config, at least it seems.
> But anyway, it should also happen in my box but it doesn't...
>
> Thanks!
>
> The patch:
Yes, it is working!

>
> diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
> index d23d6d7..59f7a4c 100644
> --- a/fs/reiserfs/journal.c
> +++ b/fs/reiserfs/journal.c
> @@ -2801,11 +2801,14 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
>                goto free_and_return;
>        }
>
> +       reiserfs_write_unlock(sb);
>        if (journal_init_dev(sb, journal, j_dev_name) != 0) {
>                reiserfs_warning(sb, "sh-462",
>                                 "unable to initialize jornal device");
> +               reiserfs_write_lock(sb);
>                goto free_and_return;
>        }
> +       reiserfs_write_lock(sb);
>
>        rs = SB_DISK_SUPER_BLOCK(sb);
>
>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency  inversions
@ 2009-09-16 23:37           ` Alexander Beregalov
  0 siblings, 0 replies; 19+ messages in thread
From: Alexander Beregalov @ 2009-09-16 23:37 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

2009/9/17 Frederic Weisbecker <fweisbec@gmail.com>:
> On Tue, Sep 15, 2009 at 01:33:42AM +0400, Alexander Beregalov wrote:
>> > Hi Alexander,
>> >
>> > It should be fixed now, still in the following tree:
>>
>> Hi!
>> Another one, similar:
>> It is v2.6.31-3123-g99bc470 plus your 805031859(kill-the-bkl/reiserfs:
>> panic in case of lock imbalance), UP.
>
>
>
> Although I can't reproduce it, I think I see how that can happen.
>
> On mount time, we have the following dependency:
>
> reiserfs_lock -> bdev_mutex -> sysfs_mutex
>
> which happens while calling journal_init_dev() because
> we open the device there.
>
> But also in case of mmap on a reiserfs filesystem we
> may call reiserfs_readpages(), holding the reiserfs lock
> while already holding mm->mmap_sem
>
> The above dependency is then updated:
>
> mmap_sem
>     |
>     |
>     ------- reiserfs_lock -> bdev_mutex -> sysfs_mutex
>
> And later, while doing a readdir() on a sysfs directory,
> sysfs calls filldir, which might_fault, and then might grab
> mmap_sem. filldir is called there while holding sys_mutex,
> creating the new following dependency
>
> sysfs_mutex -> mmap_sem
>
> Hence the inversion.
> It seems the deadlock can't ever happen, I even don't see
> corner cases where it could happen.
>
> But still, this dependency should disappear.
>
> Could you please tell me if the following patch makes it shut down?
>
> Otherwise, I may need your config. I don't know why, but I suspect
> the lock_acquire(mmap_sem) in might_fault doesn't trigger needed the lockdep
> check, although I have the appropriate debug config, at least it seems.
> But anyway, it should also happen in my box but it doesn't...
>
> Thanks!
>
> The patch:
Yes, it is working!

>
> diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
> index d23d6d7..59f7a4c 100644
> --- a/fs/reiserfs/journal.c
> +++ b/fs/reiserfs/journal.c
> @@ -2801,11 +2801,14 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
>                goto free_and_return;
>        }
>
> +       reiserfs_write_unlock(sb);
>        if (journal_init_dev(sb, journal, j_dev_name) != 0) {
>                reiserfs_warning(sb, "sh-462",
>                                 "unable to initialize jornal device");
> +               reiserfs_write_lock(sb);
>                goto free_and_return;
>        }
> +       reiserfs_write_lock(sb);
>
>        rs = SB_DISK_SUPER_BLOCK(sb);
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency
  2009-09-16 23:37           ` Alexander Beregalov
  (?)
@ 2009-09-17  5:06           ` Frederic Weisbecker
  2009-09-22 13:55             ` Alexander Beregalov
  -1 siblings, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2009-09-17  5:06 UTC (permalink / raw)
  To: Alexander Beregalov
  Cc: LKML, Reiserfs, Jeff Mahoney, Chris Mason, Ingo Molnar, Laurent Riffard

On Thu, Sep 17, 2009 at 03:37:22AM +0400, Alexander Beregalov wrote:
> 2009/9/17 Frederic Weisbecker <fweisbec@gmail.com>:
> > On Tue, Sep 15, 2009 at 01:33:42AM +0400, Alexander Beregalov wrote:
> >> > Hi Alexander,
> >> >
> >> > It should be fixed now, still in the following tree:
> >>
> >> Hi!
> >> Another one, similar:
> >> It is v2.6.31-3123-g99bc470 plus your 805031859(kill-the-bkl/reiserfs:
> >> panic in case of lock imbalance), UP.
> >
> >
> >
> > Although I can't reproduce it, I think I see how that can happen.
> >
> > On mount time, we have the following dependency:
> >
> > reiserfs_lock -> bdev_mutex -> sysfs_mutex
> >
> > which happens while calling journal_init_dev() because
> > we open the device there.
> >
> > But also in case of mmap on a reiserfs filesystem we
> > may call reiserfs_readpages(), holding the reiserfs lock
> > while already holding mm->mmap_sem
> >
> > The above dependency is then updated:
> >
> > mmap_sem
> >     |
> >     |
> >     ------- reiserfs_lock -> bdev_mutex -> sysfs_mutex
> >
> > And later, while doing a readdir() on a sysfs directory,
> > sysfs calls filldir, which might_fault, and then might grab
> > mmap_sem. filldir is called there while holding sys_mutex,
> > creating the new following dependency
> >
> > sysfs_mutex -> mmap_sem
> >
> > Hence the inversion.
> > It seems the deadlock can't ever happen, I even don't see
> > corner cases where it could happen.
> >
> > But still, this dependency should disappear.
> >
> > Could you please tell me if the following patch makes it shut down?
> >
> > Otherwise, I may need your config. I don't know why, but I suspect
> > the lock_acquire(mmap_sem) in might_fault doesn't trigger needed the lockdep
> > check, although I have the appropriate debug config, at least it seems.
> > But anyway, it should also happen in my box but it doesn't...
> >
> > Thanks!
> >
> > The patch:
> Yes, it is working!


Great, I've pushed the fix then, see the following patch.

Thanks a lot Alexander!

---
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Thu, 17 Sep 2009 05:31:37 +0200
Subject: [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency

Alexander Beregalov reported the following warning:

	=======================================================
	[ INFO: possible circular locking dependency detected ]
	2.6.31-03149-gdcc030a #1
	-------------------------------------------------------
	udevadm/716 is trying to acquire lock:
	 (&mm->mmap_sem){++++++}, at: [<c107249a>] might_fault+0x4a/0xa0

	but task is already holding lock:
	 (sysfs_mutex){+.+.+.}, at: [<c10cb9aa>] sysfs_readdir+0x5a/0x200

	which lock already depends on the new lock.

	the existing dependency chain (in reverse order) is:

	-> #3 (sysfs_mutex){+.+.+.}:
	       [...]

	-> #2 (&bdev->bd_mutex){+.+.+.}:
	       [...]

	-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
	       [...]

	-> #0 (&mm->mmap_sem){++++++}:
	       [...]

On reiserfs mount path, we take the reiserfs lock and while
initializing the journal, we open the device, taking the
bdev->bd_mutex. Then rescan_partition() may signal the change
to sysfs.

We have then the following dependency:

	reiserfs_lock -> bd_mutex -> sysfs_mutex

Later, while entering reiserfs_readpage() after a pagefault in an
mmaped reiserfs file, we are holding the mm->mmap_sem, and we are going
to take the reiserfs lock too.
We have then the following dependency:

	mm->mmap_sem -> reiserfs_lock

which, expanded with the previous dependency gives us:

	mm->mmap_sem -> reiserfs_lock -> bd_mutex -> sysfs_mutex

Now while entering the sysfs readdir path, we are holding the
sysfs_mutex. And when we copy a directory entry to the user buffer, we
might fault and then take the mm->mmap_sem lock. Which leads to the
circular locking dependency reported.

We can fix that by relaxing the reiserfs lock during the call to
journal_init_dev(), which is the place where we open the mounted
device.

This is fine to relax the lock here because we are in the begining of
the reiserfs mount path and there is nothing to protect at this time,
the journal is not intialized.
We just keep this lock around for paranoid reasons.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
---
 fs/reiserfs/journal.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index d23d6d7..04e3c42 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2801,11 +2801,27 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
 		goto free_and_return;
 	}
 
+	/*
+	 * We need to unlock here to avoid creating the following
+	 * dependency:
+	 * reiserfs_lock -> sysfs_mutex
+	 * Because the reiserfs mmap path creates the following dependency:
+	 * mm->mmap -> reiserfs_lock, hence we have
+	 * mm->mmap -> reiserfs_lock ->sysfs_mutex
+	 * This would ends up in a circular dependency with sysfs readdir path
+	 * which does sysfs_mutex -> mm->mmap_sem
+	 * This is fine because the reiserfs lock is useless in mount path,
+	 * at least until we call journal_begin. We keep it for paranoid
+	 * reasons.
+	 */
+	reiserfs_write_unlock(sb);
 	if (journal_init_dev(sb, journal, j_dev_name) != 0) {
+		reiserfs_write_lock(sb);
 		reiserfs_warning(sb, "sh-462",
 				 "unable to initialize jornal device");
 		goto free_and_return;
 	}
+	reiserfs_write_lock(sb);
 
 	rs = SB_DISK_SUPER_BLOCK(sb);
 
-- 
1.6.2.3




^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to  sysfs_mutex dependency
  2009-09-17  5:06           ` [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency Frederic Weisbecker
@ 2009-09-22 13:55             ` Alexander Beregalov
  2009-09-29  7:46               ` Frederic Weisbecker
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Beregalov @ 2009-09-22 13:55 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

Hi Frederic.

Another very similar warning.
(smp 2*2core)
v2.6.31-7068-g43c1266 plus 193be0ee1 kill-the-bkl/reiserfs: Fix
induced mm->mmap_sem to sysfs_mutex dependency


[ INFO: possible circular locking dependency detected ]
2.6.31-07095-g25a3912 #4
-------------------------------------------------------
udevadm/790 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0

but task is already holding lock:
 (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #5 (sysfs_mutex){+.+.+.}:
       [<c1069810>] __lock_acquire+0xc90/0x1290
       [<c1069eaa>] lock_acquire+0x9a/0xd0
       [<c13ff805>] __mutex_lock_common+0x65/0x440
       [<c13ffce0>] mutex_lock_nested+0x40/0x60
       [<c1108535>] sysfs_addrm_start+0x35/0xc0
       [<c11094d1>] create_dir+0x51/0xb0
       [<c1109566>] sysfs_create_dir+0x36/0x60
       [<c12356d7>] kobject_add_internal+0xa7/0x270
       [<c12359da>] kobject_add_varg+0x3a/0x70
       [<c1235a4e>] kobject_init_and_add+0x3e/0x60
       [<c10af62d>] sysfs_slab_add+0x7d/0x1e0
       [<c10af7ef>] sysfs_add_func+0x5f/0xa0
       [<c104d16e>] worker_thread+0x16e/0x270
       [<c1051fac>] kthread+0x7c/0x90
       [<c1003f5b>] kernel_thread_helper+0x7/0x7c

-> #4 (slub_lock){+++++.}:
       [<c1069810>] __lock_acquire+0xc90/0x1290
       [<c1069eaa>] lock_acquire+0x9a/0xd0
       [<c14000d2>] down_read+0x52/0xb0
       [<c15e2497>] slab_cpuup_callback+0x4a/0x196
       [<c140481d>] notifier_call_chain+0x4d/0x90
       [<c1058095>] __raw_notifier_call_chain+0x25/0x40
       [<c15e1c63>] cpu_up+0xdd/0x1a6
       [<c15b1403>] kernel_init+0xac/0x1a8
       [<c1003f5b>] kernel_thread_helper+0x7/0x7c

-> #3 (cpu_hotplug.lock){+.+.+.}:
       [<c1069810>] __lock_acquire+0xc90/0x1290
       [<c1069eaa>] lock_acquire+0x9a/0xd0
       [<c13ff805>] __mutex_lock_common+0x65/0x440
       [<c13ffce0>] mutex_lock_nested+0x40/0x60
       [<c15e1c24>] cpu_up+0x9e/0x1a6
       [<c15b1403>] kernel_init+0xac/0x1a8
       [<c1003f5b>] kernel_thread_helper+0x7/0x7c

-> #2 (cpu_add_remove_lock){+.+.+.}:
       [<c1069810>] __lock_acquire+0xc90/0x1290
       [<c1069eaa>] lock_acquire+0x9a/0xd0
       [<c13ff805>] __mutex_lock_common+0x65/0x440
       [<c13ffce0>] mutex_lock_nested+0x40/0x60
       [<c103ad9d>] cpu_maps_update_begin+0x1d/0x40
       [<c104d9cc>] __create_workqueue_key+0x9c/0x210
       [<c113be94>] journal_init+0x9a4/0xa60
       [<c112735f>] reiserfs_fill_super+0x35f/0xdf0
       [<c10b8618>] get_sb_bdev+0x138/0x180
       [<c1124f0d>] get_super_block+0x2d/0x50
       [<c10b72f1>] vfs_kern_mount+0x51/0xc0
       [<c10b73ed>] do_kern_mount+0x4d/0x100
       [<c10d115a>] do_mount+0x21a/0x720
       [<c10d16ec>] sys_mount+0x8c/0xe0
       [<c15b1d19>] mount_block_root+0xcf/0x26b
       [<c15b1f19>] mount_root+0x64/0x7b
       [<c15b204c>] prepare_namespace+0x11c/0x167
       [<c15b14df>] kernel_init+0x188/0x1a8
       [<c1003f5b>] kernel_thread_helper+0x7/0x7c

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c1069810>] __lock_acquire+0xc90/0x1290
       [<c1069eaa>] lock_acquire+0x9a/0xd0
       [<c13ff805>] __mutex_lock_common+0x65/0x440
       [<c13ffce0>] mutex_lock_nested+0x40/0x60
       [<c1142282>] reiserfs_write_lock_once+0x32/0x70
       [<c111c1fb>] reiserfs_get_block+0x6b/0x1510
       [<c10e5e48>] do_mpage_readpage+0x168/0x510
       [<c10e631f>] mpage_readpages+0xaf/0x100
       [<c1118ef5>] reiserfs_readpages+0x25/0x40
       [<c108bc2d>] __do_page_cache_readahead+0x1fd/0x2a0
       [<c108bcfd>] ra_submit+0x2d/0x50
       [<c1083bf6>] filemap_fault+0x436/0x470
       [<c109af24>] __do_fault+0x54/0x410
       [<c109bdd1>] handle_mm_fault+0x1c1/0x680
       [<c14043e5>] do_page_fault+0x115/0x380
       [<c1401ff0>] error_code+0x78/0x80
       [<c10f45a7>] padzero+0x37/0x50
       [<c10f4d4d>] load_elf_binary+0x63d/0x1500
       [<c10bc461>] search_binary_handler+0x1a1/0x330
       [<c10bcb48>] do_execve+0x1f8/0x270
       [<c1001783>] sys_execve+0x33/0x80
       [<c10033dc>] syscall_call+0x7/0xb

-> #0 (&mm->mmap_sem){++++++}:
       [<c1069dc8>] __lock_acquire+0x1248/0x1290
       [<c1069eaa>] lock_acquire+0x9a/0xd0
       [<c1098973>] might_fault+0xa3/0xc0
       [<c123d611>] copy_to_user+0x41/0x130
       [<c10c4f9c>] filldir64+0xcc/0x120
       [<c11081ba>] sysfs_readdir+0xfa/0x260
       [<c10c525e>] vfs_readdir+0x9e/0xc0
       [<c10c52fa>] sys_getdents64+0x7a/0xe0
       [<c100334f>] sysenter_do_call+0x12/0x36

other info that might help us debug this:

2 locks held by udevadm/790:
 #0:  (&type->i_mutex_dir_key){+.+.+.}, at: [<c10c522a>] vfs_readdir+0x6a/0xc0
 #1:  (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260

stack backtrace:
Pid: 790, comm: udevadm Not tainted 2.6.31-07095-g25a3912 #4
Call Trace:
 [<c13fdd64>] ? printk+0x23/0x37
 [<c1067738>] print_circular_bug+0xe8/0x100
 [<c1069dc8>] __lock_acquire+0x1248/0x1290
 [<c1069eaa>] lock_acquire+0x9a/0xd0
 [<c1098942>] ? might_fault+0x72/0xc0
 [<c1098942>] ? might_fault+0x72/0xc0
 [<c1098973>] might_fault+0xa3/0xc0
 [<c1098942>] ? might_fault+0x72/0xc0
 [<c123d611>] copy_to_user+0x41/0x130
 [<c10c4f9c>] filldir64+0xcc/0x120
 [<c11081ba>] sysfs_readdir+0xfa/0x260
 [<c10c4ed0>] ? filldir64+0x0/0x120
 [<c10c525e>] vfs_readdir+0x9e/0xc0
 [<c10c4ed0>] ? filldir64+0x0/0x120
 [<c10c52fa>] sys_getdents64+0x7a/0xe0
 [<c100334f>] sysenter_do_call+0x12/0x36

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency
  2009-09-22 13:55             ` Alexander Beregalov
@ 2009-09-29  7:46               ` Frederic Weisbecker
  2009-09-29 10:22                   ` Alexander Beregalov
  0 siblings, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2009-09-29  7:46 UTC (permalink / raw)
  To: Alexander Beregalov; +Cc: LKML, Reiserfs

On Tue, Sep 22, 2009 at 05:55:43PM +0400, Alexander Beregalov wrote:
> Hi Frederic.
> 
> Another very similar warning.
> (smp 2*2core)
> v2.6.31-7068-g43c1266 plus 193be0ee1 kill-the-bkl/reiserfs: Fix
> induced mm->mmap_sem to sysfs_mutex dependency
> 
> 
> [ INFO: possible circular locking dependency detected ]
> 2.6.31-07095-g25a3912 #4
> -------------------------------------------------------
> udevadm/790 is trying to acquire lock:
>  (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0
> 
> but task is already holding lock:
>  (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260
> 
> which lock already depends on the new lock.



Yeah indeed, it's about the same kind of thing.
Could you please test the following patch?

Thanks!

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 04e3c42..2f8a7e7 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2933,8 +2933,11 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
 	}
 
 	reiserfs_mounted_fs_count++;
-	if (reiserfs_mounted_fs_count <= 1)
+	if (reiserfs_mounted_fs_count <= 1) {
+		reiserfs_write_unlock(sb);
 		commit_wq = create_workqueue("reiserfs");
+		reiserfs_write_lock(sb);
+	}
 
 	INIT_DELAYED_WORK(&journal->j_work, flush_async_commits);
 	journal->j_work_sb = sb;



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to  sysfs_mutex dependency
  2009-09-29  7:46               ` Frederic Weisbecker
@ 2009-09-29 10:22                   ` Alexander Beregalov
  0 siblings, 0 replies; 19+ messages in thread
From: Alexander Beregalov @ 2009-09-29 10:22 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

2009/9/29 Frederic Weisbecker <fweisbec@gmail.com>:
> On Tue, Sep 22, 2009 at 05:55:43PM +0400, Alexander Beregalov wrote:
>> Hi Frederic.
>>
>> Another very similar warning.
>> (smp 2*2core)
>> v2.6.31-7068-g43c1266 plus 193be0ee1 kill-the-bkl/reiserfs: Fix
>> induced mm->mmap_sem to sysfs_mutex dependency
>>
>>
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.31-07095-g25a3912 #4
>> -------------------------------------------------------
>> udevadm/790 is trying to acquire lock:
>>  (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0
>>
>> but task is already holding lock:
>>  (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260
>>
>> which lock already depends on the new lock.
>
>
>
> Yeah indeed, it's about the same kind of thing.
> Could you please test the following patch?

Thanks, the warning has gone away.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to  sysfs_mutex dependency
@ 2009-09-29 10:22                   ` Alexander Beregalov
  0 siblings, 0 replies; 19+ messages in thread
From: Alexander Beregalov @ 2009-09-29 10:22 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Reiserfs

2009/9/29 Frederic Weisbecker <fweisbec@gmail.com>:
> On Tue, Sep 22, 2009 at 05:55:43PM +0400, Alexander Beregalov wrote:
>> Hi Frederic.
>>
>> Another very similar warning.
>> (smp 2*2core)
>> v2.6.31-7068-g43c1266 plus 193be0ee1 kill-the-bkl/reiserfs: Fix
>> induced mm->mmap_sem to sysfs_mutex dependency
>>
>>
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.31-07095-g25a3912 #4
>> -------------------------------------------------------
>> udevadm/790 is trying to acquire lock:
>>  (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0
>>
>> but task is already holding lock:
>>  (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260
>>
>> which lock already depends on the new lock.
>
>
>
> Yeah indeed, it's about the same kind of thing.
> Could you please test the following patch?

Thanks, the warning has gone away.
--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency
  2009-09-29 10:22                   ` Alexander Beregalov
  (?)
@ 2009-10-05 18:12                   ` Frederic Weisbecker
  -1 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2009-10-05 18:12 UTC (permalink / raw)
  To: Alexander Beregalov
  Cc: LKML, Reiserfs, Jeff Mahoney, Chris Mason, Ingo Molnar,
	Alexander Beregalov, Laurent Riffard

On Tue, Sep 29, 2009 at 02:22:42PM +0400, Alexander Beregalov wrote:
> 2009/9/29 Frederic Weisbecker <fweisbec@gmail.com>:
> > Yeah indeed, it's about the same kind of thing.
> > Could you please test the following patch?
> 
> Thanks, the warning has gone away.


Thanks a lot Alexander, your tests and reports are very precious!

I've pushed the commit below, as usual it can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	reiserfs/kill-bkl

---
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Mon, 5 Oct 2009 16:31:37 +0200
Subject: [PATCH] kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency

While creating the reiserfs workqueue during the journal
initialization, we are holding the reiserfs lock, but
create_workqueue() also holds the cpu_add_remove_lock, creating
then the following dependency:

- reiserfs lock -> cpu_add_remove_lock

But we also have the following existing dependencies:

- mm->mmap_sem -> reiserfs lock
- cpu_add_remove_lock -> cpu_hotplug.lock -> slub_lock -> sysfs_mutex

The merged dependency chain then becomes:

- mm->mmap_sem -> reiserfs lock -> cpu_add_remove_lock ->
	cpu_hotplug.lock -> slub_lock -> sysfs_mutex

But when we fill a dir entry in sysfs_readir(), we are holding the
sysfs_mutex and we also might fault while copying the directory entry
to the user, leading to the following dependency:

- sysfs_mutex -> mm->mmap_sem

The end result is then a lock inversion between sysfs_mutex and
mm->mmap_sem, as reported in the following lockdep warning:

[ INFO: possible circular locking dependency detected ]
2.6.31-07095-g25a3912 #4
-------------------------------------------------------
udevadm/790 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0

but task is already holding lock:
 (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #5 (sysfs_mutex){+.+.+.}:
      [...]

-> #4 (slub_lock){+++++.}:
      [...]

-> #3 (cpu_hotplug.lock){+.+.+.}:
      [...]

-> #2 (cpu_add_remove_lock){+.+.+.}:
      [...]

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
      [...]

-> #0 (&mm->mmap_sem){++++++}:
      [...]

This can be fixed by relaxing the reiserfs lock while creating the
workqueue.
This is fine to relax the lock here, we just keep it around to pass
through reiserfs lock checks and for paranoid reasons.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
---
 fs/reiserfs/journal.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 04e3c42..2f8a7e7 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2933,8 +2933,11 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
 	}
 
 	reiserfs_mounted_fs_count++;
-	if (reiserfs_mounted_fs_count <= 1)
+	if (reiserfs_mounted_fs_count <= 1) {
+		reiserfs_write_unlock(sb);
 		commit_wq = create_workqueue("reiserfs");
+		reiserfs_write_lock(sb);
+	}
 
 	INIT_DELAYED_WORK(&journal->j_work, flush_async_commits);
 	journal->j_work_sb = sb;
-- 
1.6.2.3




^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2009-10-05 18:13 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-25  2:32 [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Frederic Weisbecker
2009-08-25  2:32 ` [PATCH 1/4] kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency Frederic Weisbecker
2009-08-25  2:32 ` [PATCH 2/4] kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir() Frederic Weisbecker
2009-08-25  2:32 ` [PATCH 3/4] kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write() Frederic Weisbecker
2009-08-25  2:32 ` [PATCH 4/4] kill-the-bkl/reiserfs: panic in case of lock imbalance Frederic Weisbecker
2009-08-26 20:13 ` [PATCH 0/4] kill-the-bkl/reiserfs: fix some lock dependency inversions Alexander Beregalov
2009-09-01 22:16   ` Frederic Weisbecker
2009-09-14 20:37   ` Frederic Weisbecker
2009-09-14 21:33     ` Alexander Beregalov
2009-09-14 21:50       ` Frederic Weisbecker
2009-09-16 20:37       ` Frederic Weisbecker
2009-09-16 23:37         ` Alexander Beregalov
2009-09-16 23:37           ` Alexander Beregalov
2009-09-17  5:06           ` [PATCH] kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency Frederic Weisbecker
2009-09-22 13:55             ` Alexander Beregalov
2009-09-29  7:46               ` Frederic Weisbecker
2009-09-29 10:22                 ` Alexander Beregalov
2009-09-29 10:22                   ` Alexander Beregalov
2009-10-05 18:12                   ` [PATCH] kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.