All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] The many faces of the I_LOCK
@ 2007-02-21 13:09 Jörn Engel
  2007-02-21 18:29 ` David Chinner
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Jörn Engel @ 2007-02-21 13:09 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Anton Altaparmakov, David Chinner, Dave Kleikamp, Al Viro,
	Christoph Hellwig

1. Introduction

This lengthy investigation was caused by a deadlock problem in LogFS,
but uncovered a more general problem.  It affects, at the least, all
filesystems that need to read inodes in their write path.  To my
knowledge, that includes LogFS and NTFS, possibly also JFS and XFS.

Deadlock happens when two processes A and B (that may be identical) have
these two call chains:

Process A:				Process B:
inode_wait				[filesystem locking write path]
__wait_on_bit				__writeback_single_inode
out_of_line_wait_on_bit
ifind_fast
[filesystem calling iget()]
[filesystem locking write path]

Process A will wait_on_inode() and block until process B proceeds
through __sync_single_inode(), which is called from
__writeback_single_inode() in Process B.  Process B will block on the
lock of the filesystem write path, held by Process A.

After investigating several quick solutions - none of them worked - I
started asking the question of why ifind_fast() should wait for
__sync_single_inode() to finish.  Does that make sense?  So I tried to
describe what I_LOCK is used for.  If I made any mistakes in my
investigations, please correct me.


2. The usage of inode_lock and I_LOCK

Almost all modifications of inodes are protected by the inode_lock, a
global spinlock.  Some modifications, however, can block for various
reasons and require the inode_lock to get dropped temporarily.  In the
meantime, the individual inode needs to get protected somehow.  Usually
this happens through the use of I_LOCK.

But I_LOCK is not a simple mutex.  It is a Janus-faced bit in the inode
that is used for several things, including mutual exclusion and
completion notification.  Most users are open-coded, so it is not easy
to follow, but can be summarized in the table below.

In this table columns indicate events when I_LOCK is either set or
reset (or not reset but all waiters are notified anyway).  Rows
indicate code that either checks for I_LOCK and changes behaviour
depending on its state or is waiting until I_LOCK gets reset (or is
waiting even if I_LOCK is not set).

			__sync_single_inode
			|	get_new_inode[_fast]
			|	| unlock_new_inode
			|	| |	dispose_list
			|	| |	|	generic_delete_inode
			|	| |	|	|
generic_forget_inode
lock			v	v |	|	|	|
unlock/complete		v	  v	v	v	v	comment
-------------------------------------------------------------------------------
__writeback_single_inodeX	O	O	O	O	sync
write_inode_now		X	O	O	O	O	sync
clear_inode		X	O	O	O	O	sync
__mark_inode_dirty	X	O	O	O	O	lists
generic_osync_inode	?	?	?	?	?	?
get_new_inode[_fast]	O	X	O	O	O	mutex
find_inode[_fast]	O	O	X	X	X
I_FREEING
ifind[_fast]		O	  X	O	O	O	read

jfs txCommit		?	?	?	?	?	?
xfs_ichgtime[_fast]	?	?	?	?	?	?

Comments:
sync - wait for writeout to finish
lists - move inode to dirty list without racing against
__sync_single_inode
mutex - protect against two concurrent get_new_inode[_fast] creating two
inodes
I_FREEING - wait for inode to get freed, then repeat
read - don't return inode until it is read from medium

Now, the "X"s mark combinations where columns and rows are related.
"O"s mark combinations where afaiks columns and rows share no
relationship whatsoever except that both use either I_LOCK or
wake_up_inode()/wait_on_inode() or any other of the partially open-coded
variants.  Three rows have not exposed their meaning to me yet, so I'd
gladly receive some insight here.


3. Seperating out sync notification

One of the results from the investigations in 2 appears to be that one
class of users in fs/fs-writeback.c is completely unrelated to another
class of users in fs/inode.c.

In particular, __sync_single_inode(), __writeback_single_inode(),
write_inode_now(), clear_inode(), __mark_inode_dirty() and (possibly?)
generic_osync_inode() seem to only need a completion event to
synchronize with.  There is no reason why this group should share a lock
with iget() and any of its many permutations.

Now, if the group in fs/fs-writeback.c had a completion event that is
independent of anything in fs/inode.c, the deadlock scenario described
in section 1 goes away.  As a further result, ilookup5_nowait() can get
removed as its only user, NTFS, only introduced it as a workaround for
the deadlock scenario.

Potentially the checks in JFS and XFS can also get removed, but I don't
claim to understand their meaning yet.

Comments?

Jörn

-- 
Prosperity makes friends, adversity tries them.
-- Publilius Syrus
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] The many faces of the I_LOCK
  2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
@ 2007-02-21 18:29 ` David Chinner
  2007-02-21 18:47   ` Jörn Engel
  2007-02-21 18:54 ` [PATCH 1/3] Replace I_LOCK with I_SYNC for fs/fs-writeback.c uses Jörn Engel
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: David Chinner @ 2007-02-21 18:29 UTC (permalink / raw)
  To: Jörn Engel
  Cc: linux-fsdevel, Anton Altaparmakov, David Chinner, Dave Kleikamp,
	Al Viro, Christoph Hellwig

On Wed, Feb 21, 2007 at 01:09:56PM +0000, Jörn Engel wrote:
> 1. Introduction
> 
> This lengthy investigation was caused by a deadlock problem in LogFS,
> but uncovered a more general problem.  It affects, at the least, all
> filesystems that need to read inodes in their write path.  To my
> knowledge, that includes LogFS and NTFS, possibly also JFS and XFS.

I don't think XFS has any problems here - we're pretty careful about
reading inodes from disk before we lock other potentially dependent
objects in the filesystem....

> Deadlock happens when two processes A and B (that may be identical) have
> these two call chains:
> 
> Process A:				Process B:
> inode_wait				[filesystem locking write path]
> __wait_on_bit				__writeback_single_inode
> out_of_line_wait_on_bit
> ifind_fast
> [filesystem calling iget()]
> [filesystem locking write path]
> 
> Process A will wait_on_inode() and block until process B proceeds
> through __sync_single_inode(), which is called from
> __writeback_single_inode() in Process B.  Process B will block on the
> lock of the filesystem write path, held by Process A.

This is caused by your cleaner thread racing with writeback doing inode
lookup, right? You need a non-blocking inode lookup to prevent the deadlock,
I guess....

> 2. The usage of inode_lock and I_LOCK
.....
> Three rows have not exposed their meaning to me yet, so I'd
> gladly receive some insight here.

IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - if the
inode is currently I_LOCKed then we can't mark it dirty anyway so we don't
even bother trying. We do mark the XFS inode structure (*) dirty, though, so
the modification will make it to disk at some time in the future.

(*) XFS does double inode caching because the XFS transaction subsystem
requires inodes to have a different lifecycle to the linux struct inode
lifecycle.

> 3. Seperating out sync notification
> 
> One of the results from the investigations in 2 appears to be that one
> class of users in fs/fs-writeback.c is completely unrelated to another
> class of users in fs/inode.c.
> 
> In particular, __sync_single_inode(), __writeback_single_inode(),
> write_inode_now(), clear_inode(), __mark_inode_dirty() and (possibly?)
> generic_osync_inode() seem to only need a completion event to
> synchronize with.  There is no reason why this group should share a lock
> with iget() and any of its many permutations.

Seems reasonable, but I don't know all the little details in these
paths....

> Now, if the group in fs/fs-writeback.c had a completion event that is
> independent of anything in fs/inode.c, the deadlock scenario described
> in section 1 goes away.  As a further result, ilookup5_nowait() can get
> removed as its only user, NTFS, only introduced it as a workaround for
> the deadlock scenario.

Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlock
that way?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] The many faces of the I_LOCK
  2007-02-21 18:29 ` David Chinner
@ 2007-02-21 18:47   ` Jörn Engel
  2007-02-21 18:57     ` Jörn Engel
  0 siblings, 1 reply; 8+ messages in thread
From: Jörn Engel @ 2007-02-21 18:47 UTC (permalink / raw)
  To: David Chinner
  Cc: linux-fsdevel, Anton Altaparmakov, Dave Kleikamp, Al Viro,
	Christoph Hellwig

On Thu, 22 February 2007 05:29:04 +1100, David Chinner wrote:
> On Wed, Feb 21, 2007 at 01:09:56PM +0000, Jörn Engel wrote:
> > 
> > Process A:				Process B:
> > inode_wait				[filesystem locking write path]
> > __wait_on_bit				__writeback_single_inode
> > out_of_line_wait_on_bit
> > ifind_fast
> > [filesystem calling iget()]
> > [filesystem locking write path]
> 
> This is caused by your cleaner thread racing with writeback doing inode
> lookup, right? You need a non-blocking inode lookup to prevent the deadlock,
> I guess....

Almost correct.  I need an inode lookup that is not blocking on
writes.  Blocking on reads is fine - it will simply delay the cleaner
and cost some performance.

> IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - if the
> inode is currently I_LOCKed then we can't mark it dirty anyway so we don't
> even bother trying. We do mark the XFS inode structure (*) dirty, though, so
> the modification will make it to disk at some time in the future.
> 
> (*) XFS does double inode caching because the XFS transaction subsystem
> requires inodes to have a different lifecycle to the linux struct inode
> lifecycle.

Ok.  Then my current patches would break XFS, I believe.  I'll just add
a seperate fix for now and will merge that in later.

> > Now, if the group in fs/fs-writeback.c had a completion event that is
> > independent of anything in fs/inode.c, the deadlock scenario described
> > in section 1 goes away.  As a further result, ilookup5_nowait() can get
> > removed as its only user, NTFS, only introduced it as a workaround for
> > the deadlock scenario.
> 
> Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlock
> that way?

I could, but I personally consider ilookup5_nowait() to be a nasty hack.
It admittedly does what it was supposed to do, but it merely hacks
around the issue of using the same lock for two completely independent
purposes.

Also, I _want_ to block on I_LOCK in ilookup5() and friends.  If the
inode has been allocated, but not read from disk yet, the
wait_on_inode(inode) in ifind() is the only thing protecting me from
accessing garbage data.

Thinking about it, ignoring this lock can cause data loss on the
filesystem.  When validating blocks, they are compared to an unread
inode, which just looks like an empty file.  So all blocks look invalid
and will get deleted.  That race is _really_ hard to lose, but also
_really_ nasty.

Jörn

-- 
"[One] doesn't need to know [...] how to cause a headache in order
to take an aspirin."
-- Scott Culp, Manager of the Microsoft Security Response Center, 2001
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] Replace I_LOCK with I_SYNC for fs/fs-writeback.c uses.
  2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
  2007-02-21 18:29 ` David Chinner
@ 2007-02-21 18:54 ` Jörn Engel
  2007-02-21 18:55 ` [PATCH] [NTFS] Remove ilookup5_nowait and convert users to ilookup5 Jörn Engel
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2007-02-21 18:54 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Anton Altaparmakov, David Chinner, Dave Kleikamp, Al Viro,
	Christoph Hellwig

The users of I_LOCK in fs/fs-writeback appear to be independent of
all others.  On top, sharing the same bit for __sync_single_inode()
and ifind/ifind_fast caused deadlocks in both LogFS and NTFS.

Introduce a new bit and use that for fs/fs-writeback.c users.
---
 fs/fs-writeback.c         |   39 ++++++++++++++++++++++++---------------
 include/linux/fs.h        |    2 ++
 include/linux/writeback.h |    7 +++++++
 3 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a4b142a..4958fae 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -99,11 +99,11 @@ void __mark_inode_dirty(struct inode *in
 		inode->i_state |= flags;
 
 		/*
-		 * If the inode is locked, just update its dirty state. 
+		 * If the inode is being synced, just update its dirty state.
 		 * The unlocker will place the inode on the appropriate
 		 * superblock list, based upon its state.
 		 */
-		if (inode->i_state & I_LOCK)
+		if (inode->i_state & I_SYNC)
 			goto out;
 
 		/*
@@ -139,6 +139,15 @@ static int write_inode(struct inode *ino
 	return 0;
 }
 
+static void inode_sync_complete(struct inode *inode)
+{
+	/*
+	 * Prevent speculative execution through spin_unlock(&inode_lock);
+	 */
+	smp_mb();
+	wake_up_bit(&inode->i_state, __I_SYNC);
+}
+
 /*
  * Write a single inode's dirty pages and inode data out to disk.
  * If `wait' is set, wait on the writeout.
@@ -158,11 +167,11 @@ __sync_single_inode(struct inode *inode,
 	int wait = wbc->sync_mode == WB_SYNC_ALL;
 	int ret;
 
-	BUG_ON(inode->i_state & I_LOCK);
+	BUG_ON(inode->i_state & I_SYNC);
 
-	/* Set I_LOCK, reset I_DIRTY */
+	/* Set I_SYNC, reset I_DIRTY */
 	dirty = inode->i_state & I_DIRTY;
-	inode->i_state |= I_LOCK;
+	inode->i_state |= I_SYNC;
 	inode->i_state &= ~I_DIRTY;
 
 	spin_unlock(&inode_lock);
@@ -183,7 +192,7 @@ __sync_single_inode(struct inode *inode,
 	}
 
 	spin_lock(&inode_lock);
-	inode->i_state &= ~I_LOCK;
+	inode->i_state &= ~I_SYNC;
 	if (!(inode->i_state & I_FREEING)) {
 		if (!(inode->i_state & I_DIRTY) &&
 		    mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
@@ -231,7 +240,7 @@ __sync_single_inode(struct inode *inode,
 			list_move(&inode->i_list, &inode_unused);
 		}
 	}
-	wake_up_inode(inode);
+	inode_sync_complete(inode);
 	return ret;
 }
 
@@ -250,7 +259,7 @@ __writeback_single_inode(struct inode *i
 	else
 		WARN_ON(inode->i_state & I_WILL_FREE);
 
-	if ((wbc->sync_mode != WB_SYNC_ALL) && (inode->i_state & I_LOCK)) {
+	if ((wbc->sync_mode != WB_SYNC_ALL) && (inode->i_state & I_SYNC)) {
 		struct address_space *mapping = inode->i_mapping;
 		int ret;
 
@@ -269,16 +278,16 @@ __writeback_single_inode(struct inode *i
 	/*
 	 * It's a data-integrity sync.  We must wait.
 	 */
-	if (inode->i_state & I_LOCK) {
-		DEFINE_WAIT_BIT(wq, &inode->i_state, __I_LOCK);
+	if (inode->i_state & I_SYNC) {
+		DEFINE_WAIT_BIT(wq, &inode->i_state, __I_SYNC);
 
-		wqh = bit_waitqueue(&inode->i_state, __I_LOCK);
+		wqh = bit_waitqueue(&inode->i_state, __I_SYNC);
 		do {
 			spin_unlock(&inode_lock);
 			__wait_on_bit(wqh, &wq, inode_wait,
 							TASK_UNINTERRUPTIBLE);
 			spin_lock(&inode_lock);
-		} while (inode->i_state & I_LOCK);
+		} while (inode->i_state & I_SYNC);
 	}
 	return __sync_single_inode(inode, wbc);
 }
@@ -311,7 +320,7 @@ __writeback_single_inode(struct inode *i
  * The inodes to be written are parked on sb->s_io.  They are moved back onto
  * sb->s_dirty as they are selected for writing.  This way, none can be missed
  * on the writer throttling path, and we get decent balancing between many
- * throttled threads: we don't want them all piling up on __wait_on_inode.
+ * throttled threads: we don't want them all piling up on inode_sync_wait.
  */
 static void
 sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc)
@@ -583,7 +592,7 @@ int write_inode_now(struct inode *inode,
 	ret = __writeback_single_inode(inode, &wbc);
 	spin_unlock(&inode_lock);
 	if (sync)
-		wait_on_inode(inode);
+		inode_sync_wait(inode);
 	return ret;
 }
 EXPORT_SYMBOL(write_inode_now);
@@ -658,7 +667,7 @@ int generic_osync_inode(struct inode *in
 			err = err2;
 	}
 	else
-		wait_on_inode(inode);
+		inode_sync_wait(inode);
 
 	return err;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 86ec3f4..f9eb221 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1184,6 +1184,8 @@ #define I_FREEING		16
 #define I_CLEAR			32
 #define I_NEW			64
 #define I_WILL_FREE		128
+#define __I_SYNC		8
+#define I_SYNC			(1 << __I_SYNC) /* Currently being synced */
 
 #define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index fc35e6b..3abc1d1 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -77,6 +77,13 @@ static inline void wait_on_inode(struct 
 	wait_on_bit(&inode->i_state, __I_LOCK, inode_wait,
 							TASK_UNINTERRUPTIBLE);
 }
+static inline void inode_sync_wait(struct inode *inode)
+{
+	might_sleep();
+	wait_on_bit(&inode->i_state, __I_SYNC, inode_wait,
+							TASK_UNINTERRUPTIBLE);
+}
+
 
 /*
  * mm/page-writeback.c
-- 
1.4.2.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH] [NTFS] Remove ilookup5_nowait and convert users to ilookup5
  2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
  2007-02-21 18:29 ` David Chinner
  2007-02-21 18:54 ` [PATCH 1/3] Replace I_LOCK with I_SYNC for fs/fs-writeback.c uses Jörn Engel
@ 2007-02-21 18:55 ` Jörn Engel
  2007-02-21 18:56 ` [PATCH 3/3] Replace I_LOCK with I_SYNC in XFS Jörn Engel
  2007-02-22 11:40 ` [RFC] The many faces of the I_LOCK Jörn Engel
  4 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2007-02-21 18:55 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Anton Altaparmakov, David Chinner, Dave Kleikamp, Al Viro,
	Christoph Hellwig

After seperating I_LOCK and I_SYNC, the deadlock in NTFS that caused
the creation of ilookup5_nowait can no longer happen.  Remove
ilookup5_nowait, as it has become pointless.
---
 fs/inode.c         |   41 ++++-------------------------------------
 fs/ntfs/mft.c      |   12 ++----------
 include/linux/fs.h |    3 ---
 3 files changed, 6 insertions(+), 50 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 5abb097..5cfc6eb 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -731,7 +731,6 @@ EXPORT_SYMBOL(igrab);
  * @head:       the head of the list to search
  * @test:	callback used for comparisons between inodes
  * @data:	opaque data pointer to pass to @test
- * @wait:	if true wait for the inode to be unlocked, if false do not
  *
  * ifind() searches for the inode specified by @data in the inode
  * cache. This is a generalized version of ifind_fast() for file systems where
@@ -746,7 +745,7 @@ EXPORT_SYMBOL(igrab);
  */
 static struct inode *ifind(struct super_block *sb,
 		struct hlist_head *head, int (*test)(struct inode *, void *),
-		void *data, const int wait)
+		void *data)
 {
 	struct inode *inode;
 
@@ -755,8 +754,7 @@ static struct inode *ifind(struct super_
 	if (inode) {
 		__iget(inode);
 		spin_unlock(&inode_lock);
-		if (likely(wait))
-			wait_on_inode(inode);
+		wait_on_inode(inode);
 		return inode;
 	}
 	spin_unlock(&inode_lock);
@@ -796,37 +794,6 @@ static struct inode *ifind_fast(struct s
 }
 
 /**
- * ilookup5_nowait - search for an inode in the inode cache
- * @sb:		super block of file system to search
- * @hashval:	hash value (usually inode number) to search for
- * @test:	callback used for comparisons between inodes
- * @data:	opaque data pointer to pass to @test
- *
- * ilookup5() uses ifind() to search for the inode specified by @hashval and
- * @data in the inode cache. This is a generalized version of ilookup() for
- * file systems where the inode number is not sufficient for unique
- * identification of an inode.
- *
- * If the inode is in the cache, the inode is returned with an incremented
- * reference count.  Note, the inode lock is not waited upon so you have to be
- * very careful what you do with the returned inode.  You probably should be
- * using ilookup5() instead.
- *
- * Otherwise NULL is returned.
- *
- * Note, @test is called with the inode_lock held, so can't sleep.
- */
-struct inode *ilookup5_nowait(struct super_block *sb, unsigned long hashval,
-		int (*test)(struct inode *, void *), void *data)
-{
-	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
-
-	return ifind(sb, head, test, data, 0);
-}
-
-EXPORT_SYMBOL(ilookup5_nowait);
-
-/**
  * ilookup5 - search for an inode in the inode cache
  * @sb:		super block of file system to search
  * @hashval:	hash value (usually inode number) to search for
@@ -850,7 +817,7 @@ struct inode *ilookup5(struct super_bloc
 {
 	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
 
-	return ifind(sb, head, test, data, 1);
+	return ifind(sb, head, test, data);
 }
 
 EXPORT_SYMBOL(ilookup5);
@@ -907,7 +874,7 @@ struct inode *iget5_locked(struct super_
 	struct hlist_head *head = inode_hashtable + hash(sb, hashval);
 	struct inode *inode;
 
-	inode = ifind(sb, head, test, data, 1);
+	inode = ifind(sb, head, test, data);
 	if (inode)
 		return inode;
 	/*
diff --git a/fs/ntfs/mft.c b/fs/ntfs/mft.c
index 2ad5c8b..62d659b 100644
--- a/fs/ntfs/mft.c
+++ b/fs/ntfs/mft.c
@@ -956,14 +956,7 @@ bool ntfs_may_write_mft_record(ntfs_volu
 		vi = igrab(mft_vi);
 		BUG_ON(vi != mft_vi);
 	} else {
-		/*
-		 * Have to use ilookup5_nowait() since ilookup5() waits for the
-		 * inode lock which causes ntfs to deadlock when a concurrent
-		 * inode write via the inode dirty code paths and the page
-		 * dirty code path of the inode dirty code path when writing
-		 * $MFT occurs.
-		 */
-		vi = ilookup5_nowait(sb, mft_no, (test_t)ntfs_test_inode, &na);
+		vi = ilookup5(sb, mft_no, (test_t)ntfs_test_inode, &na);
 	}
 	if (vi) {
 		ntfs_debug("Base inode 0x%lx is in icache.", mft_no);
@@ -1024,8 +1017,7 @@ bool ntfs_may_write_mft_record(ntfs_volu
 		vi = igrab(mft_vi);
 		BUG_ON(vi != mft_vi);
 	} else
-		vi = ilookup5_nowait(sb, na.mft_no, (test_t)ntfs_test_inode,
-				&na);
+		vi = ilookup5(sb, na.mft_no, (test_t)ntfs_test_inode, &na);
 	if (!vi) {
 		/*
 		 * The base inode is not in icache, write this extent mft
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f9eb221..8ee5810 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1660,9 +1660,6 @@ extern int inode_needs_sync(struct inode
 extern void generic_delete_inode(struct inode *inode);
 extern void generic_drop_inode(struct inode *inode);
 
-extern struct inode *ilookup5_nowait(struct super_block *sb,
-		unsigned long hashval, int (*test)(struct inode *, void *),
-		void *data);
 extern struct inode *ilookup5(struct super_block *sb, unsigned long hashval,
 		int (*test)(struct inode *, void *), void *data);
 extern struct inode *ilookup(struct super_block *sb, unsigned long ino);
-- 
1.4.2.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] Replace I_LOCK with I_SYNC in XFS
  2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
                   ` (2 preceding siblings ...)
  2007-02-21 18:55 ` [PATCH] [NTFS] Remove ilookup5_nowait and convert users to ilookup5 Jörn Engel
@ 2007-02-21 18:56 ` Jörn Engel
  2007-02-22 11:40 ` [RFC] The many faces of the I_LOCK Jörn Engel
  4 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2007-02-21 18:56 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Anton Altaparmakov, David Chinner, Dave Kleikamp, Al Viro,
	Christoph Hellwig

This part was lost when splitting I_LOCK and I_SYNC.
---
 fs/xfs/linux-2.6/xfs_iops.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 0b5fa12..e0e06dd 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -133,7 +133,7 @@ xfs_ichgtime(
 	 */
 	SYNCHRONIZE();
 	ip->i_update_core = 1;
-	if (!(inode->i_state & I_LOCK))
+	if (!(inode->i_state & I_SYNC))
 		mark_inode_dirty_sync(inode);
 }
 
@@ -185,7 +185,7 @@ xfs_ichgtime_fast(
 	 */
 	SYNCHRONIZE();
 	ip->i_update_core = 1;
-	if (!(inode->i_state & I_LOCK))
+	if (!(inode->i_state & I_SYNC))
 		mark_inode_dirty_sync(inode);
 }
 
-- 
1.4.2.3


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC] The many faces of the I_LOCK
  2007-02-21 18:47   ` Jörn Engel
@ 2007-02-21 18:57     ` Jörn Engel
  0 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2007-02-21 18:57 UTC (permalink / raw)
  To: David Chinner
  Cc: linux-fsdevel, Anton Altaparmakov, Dave Kleikamp, Al Viro,
	Christoph Hellwig

On Wed, 21 February 2007 18:47:29 +0000, Jörn Engel wrote:
> 
> Ok.  Then my current patches would break XFS, I believe.  I'll just add
> a seperate fix for now and will merge that in later.

Patches are out.  Nothing merge-worthy yet, but a basis for discussion
and they do fix the deadlock problem in LogFS.

Jörn

-- 
The wise man seeks everything in himself; the ignorant man tries to get
everything from somebody else.
-- unknown
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] The many faces of the I_LOCK
  2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
                   ` (3 preceding siblings ...)
  2007-02-21 18:56 ` [PATCH 3/3] Replace I_LOCK with I_SYNC in XFS Jörn Engel
@ 2007-02-22 11:40 ` Jörn Engel
  4 siblings, 0 replies; 8+ messages in thread
From: Jörn Engel @ 2007-02-22 11:40 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: Anton Altaparmakov, David Chinner, Dave Kleikamp, Al Viro,
	Christoph Hellwig

On Wed, 21 February 2007 13:09:56 +0000, Jörn Engel wrote:
> 
> 3. Seperating out sync notification

Ok, that turned out to be a bad idea.  Deadlock still exists.  Now it is
harder to hit and involves:

I_SYNC waiting on LogFS write mutex
clear_inode waiting on I_SYNC with I_FREEING set
find_inode waiting for I_FREEING
LogFS write path waiting for find_inode

Sorry for the noise.

Jörn

-- 
Rules of Optimization:
Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
-- M.A. Jackson
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-02-22 11:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-21 13:09 [RFC] The many faces of the I_LOCK Jörn Engel
2007-02-21 18:29 ` David Chinner
2007-02-21 18:47   ` Jörn Engel
2007-02-21 18:57     ` Jörn Engel
2007-02-21 18:54 ` [PATCH 1/3] Replace I_LOCK with I_SYNC for fs/fs-writeback.c uses Jörn Engel
2007-02-21 18:55 ` [PATCH] [NTFS] Remove ilookup5_nowait and convert users to ilookup5 Jörn Engel
2007-02-21 18:56 ` [PATCH 3/3] Replace I_LOCK with I_SYNC in XFS Jörn Engel
2007-02-22 11:40 ` [RFC] The many faces of the I_LOCK Jörn Engel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.