* fs: break out inode LRU operations from inode_lock V2 @ 2010-10-27 23:02 Dave Chinner 2010-10-27 23:02 ` [PATCH 1/4] fs: protect inode->i_state with inode->i_lock Dave Chinner ` (4 more replies) 0 siblings, 5 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-27 23:02 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel Hi Al, The following patches break the inode LRU operations and the first half of iput_final() out from under the inode_lock. The patches should apply to your current merge-stem tree. Version 2: - make i_state checks/__iget() atomic under inode->i_lock - merge dispose_one_inode() into evict() - move inode hash removal after ->evict_inode call. - always take the inode_lru_lock() when checking whether the inode is on the lru or not. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/4] fs: protect inode->i_state with inode->i_lock 2010-10-27 23:02 fs: break out inode LRU operations from inode_lock V2 Dave Chinner @ 2010-10-27 23:02 ` Dave Chinner 2010-10-27 23:02 ` [PATCH 2/4] fs: factor inode disposal Dave Chinner ` (3 subsequent siblings) 4 siblings, 0 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-27 23:02 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel From: Dave Chinner <dchinner@redhat.com> Protect inode state transitions and validity checks with the inode->i_lock. This enables us to make inode state transitions independently of the inode_lock and is the first step to peeling away the inode_lock from the code. This requires that __iget() is done atomically with i_state checks during list traversals so that we don't race with another thread marking the inode I_FREEING between the state check and grabbing the reference. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/drop_caches.c | 9 +++-- fs/fs-writeback.c | 42 +++++++++++++++++----- fs/inode.c | 93 +++++++++++++++++++++++++++++++++++++++-------- fs/notify/inode_mark.c | 21 ++++++++--- fs/quota/dquot.c | 13 ++++--- 5 files changed, 138 insertions(+), 40 deletions(-) diff --git a/fs/drop_caches.c b/fs/drop_caches.c index 2195c21..62dd8ee 100644 --- a/fs/drop_caches.c +++ b/fs/drop_caches.c @@ -18,11 +18,14 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused) spin_lock(&inode_lock); list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { - if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) - continue; - if (inode->i_mapping->nrpages == 0) + spin_lock(&inode->i_lock); + if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) || + (inode->i_mapping->nrpages == 0)) { + spin_unlock(&inode->i_lock); continue; + } __iget(inode); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); invalidate_mapping_pages(inode->i_mapping, 0, -1); iput(toput_inode); diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index f6af81a..50de4f1 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -293,9 +293,11 @@ static void inode_wait_for_writeback(struct inode *inode) wqh = bit_waitqueue(&inode->i_state, __I_SYNC); while (inode->i_state & I_SYNC) { + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); __wait_on_bit(wqh, &wq, inode_wait, TASK_UNINTERRUPTIBLE); spin_lock(&inode_lock); + spin_lock(&inode->i_lock); } } @@ -319,6 +321,7 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc) unsigned dirty; int ret; + spin_lock(&inode->i_lock); if (!atomic_read(&inode->i_count)) WARN_ON(!(inode->i_state & (I_WILL_FREE|I_FREEING))); else @@ -334,6 +337,7 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc) * completed a full scan of b_io. */ if (wbc->sync_mode != WB_SYNC_ALL) { + spin_unlock(&inode->i_lock); requeue_io(inode); return 0; } @@ -349,6 +353,7 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc) /* Set I_SYNC, reset I_DIRTY_PAGES */ inode->i_state |= I_SYNC; inode->i_state &= ~I_DIRTY_PAGES; + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); ret = do_writepages(mapping, wbc); @@ -370,8 +375,10 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc) * write_inode() */ spin_lock(&inode_lock); + spin_lock(&inode->i_lock); dirty = inode->i_state & I_DIRTY; inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_DATASYNC); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); /* Don't write the inode if only I_DIRTY_PAGES was set */ if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) { @@ -381,6 +388,7 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc) } spin_lock(&inode_lock); + spin_lock(&inode->i_lock); inode->i_state &= ~I_SYNC; if (!(inode->i_state & I_FREEING)) { if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) { @@ -422,6 +430,7 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc) } } inode_sync_complete(inode); + spin_unlock(&inode->i_lock); return ret; } @@ -492,7 +501,9 @@ static int writeback_sb_inodes(struct super_block *sb, struct bdi_writeback *wb, * kind does not need peridic writeout yet, and for the latter * kind writeout is handled by the freer. */ + spin_lock(&inode->i_lock); if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) { + spin_unlock(&inode->i_lock); requeue_io(inode); continue; } @@ -501,10 +512,14 @@ static int writeback_sb_inodes(struct super_block *sb, struct bdi_writeback *wb, * Was this inode dirtied after sync_sb_inodes was called? * This keeps sync from extra jobs and livelock. */ - if (inode_dirtied_after(inode, wbc->wb_start)) + if (inode_dirtied_after(inode, wbc->wb_start)) { + spin_unlock(&inode->i_lock); return 1; + } __iget(inode); + spin_unlock(&inode->i_lock); + pages_skipped = wbc->pages_skipped; writeback_single_inode(inode, wbc); if (wbc->pages_skipped != pages_skipped) { @@ -681,7 +696,9 @@ static long wb_writeback(struct bdi_writeback *wb, if (!list_empty(&wb->b_more_io)) { inode = wb_inode(wb->b_more_io.prev); trace_wbc_writeback_wait(&wbc, wb->bdi); + spin_lock(&inode->i_lock); inode_wait_for_writeback(inode); + spin_unlock(&inode->i_lock); } spin_unlock(&inode_lock); } @@ -947,6 +964,7 @@ void __mark_inode_dirty(struct inode *inode, int flags) block_dump___mark_inode_dirty(inode); spin_lock(&inode_lock); + spin_lock(&inode->i_lock); if ((inode->i_state & flags) != flags) { const int was_dirty = inode->i_state & I_DIRTY; @@ -958,7 +976,7 @@ void __mark_inode_dirty(struct inode *inode, int flags) * superblock list, based upon its state. */ if (inode->i_state & I_SYNC) - goto out; + goto out_unlock_inode; /* * Only add valid (hashed) inodes to the superblock's @@ -966,11 +984,12 @@ void __mark_inode_dirty(struct inode *inode, int flags) */ if (!S_ISBLK(inode->i_mode)) { if (inode_unhashed(inode)) - goto out; + goto out_unlock_inode; } if (inode->i_state & I_FREEING) - goto out; + goto out_unlock_inode; + spin_unlock(&inode->i_lock); /* * If the inode was already on b_dirty/b_io/b_more_io, don't * reposition it (that would break b_dirty time-ordering). @@ -995,7 +1014,10 @@ void __mark_inode_dirty(struct inode *inode, int flags) inode->dirtied_when = jiffies; list_move(&inode->i_wb_list, &bdi->wb.b_dirty); } + goto out; } +out_unlock_inode: + spin_unlock(&inode->i_lock); out: spin_unlock(&inode_lock); @@ -1041,14 +1063,16 @@ static void wait_sb_inodes(struct super_block *sb) * we still have to wait for that writeout. */ list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { - struct address_space *mapping; + struct address_space *mapping = inode->i_mapping; - if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) - continue; - mapping = inode->i_mapping; - if (mapping->nrpages == 0) + spin_lock(&inode->i_lock); + if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) || + (mapping->nrpages == 0)) { + spin_unlock(&inode->i_lock); continue; + } __iget(inode); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); /* * We hold a reference to 'inode' so it couldn't have diff --git a/fs/inode.c b/fs/inode.c index a6d6068..e1959f3 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -26,6 +26,17 @@ #include <linux/posix_acl.h> /* + * inode locking rules. + * + * inode->i_lock protects: + * inode->i_state, __iget() + * + * Lock ordering: + * inode_lock + * inode->i_lock + */ + +/* * This is needed for the following functions: * - inode_has_buffers * - invalidate_bdev @@ -429,7 +440,9 @@ void end_writeback(struct inode *inode) BUG_ON(!(inode->i_state & I_FREEING)); BUG_ON(inode->i_state & I_CLEAR); inode_sync_wait(inode); + spin_lock(&inode->i_lock); inode->i_state = I_FREEING | I_CLEAR; + spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(end_writeback); @@ -498,12 +511,17 @@ void evict_inodes(struct super_block *sb) if (atomic_read(&inode->i_count)) continue; + spin_lock(&inode->i_lock); if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) { + spin_unlock(&inode->i_lock); WARN_ON(1); continue; } inode->i_state |= I_FREEING; + if (!(inode->i_state & (I_DIRTY | I_SYNC))) + percpu_counter_dec(&nr_inodes_unused); + spin_unlock(&inode->i_lock); /* * Move the inode off the IO lists and LRU once I_FREEING is @@ -511,8 +529,6 @@ void evict_inodes(struct super_block *sb) */ list_move(&inode->i_lru, &dispose); list_del_init(&inode->i_wb_list); - if (!(inode->i_state & (I_DIRTY | I_SYNC))) - percpu_counter_dec(&nr_inodes_unused); } spin_unlock(&inode_lock); @@ -537,14 +553,21 @@ int invalidate_inodes(struct super_block *sb) spin_lock(&inode_lock); list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) { - if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) + spin_lock(&inode->i_lock); + if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) { + spin_unlock(&inode->i_lock); continue; + } if (atomic_read(&inode->i_count)) { + spin_unlock(&inode->i_lock); busy = 1; continue; } inode->i_state |= I_FREEING; + if (!(inode->i_state & (I_DIRTY | I_SYNC))) + percpu_counter_dec(&nr_inodes_unused); + spin_unlock(&inode->i_lock); /* * Move the inode off the IO lists and LRU once I_FREEING is @@ -552,8 +575,6 @@ int invalidate_inodes(struct super_block *sb) */ list_move(&inode->i_lru, &dispose); list_del_init(&inode->i_wb_list); - if (!(inode->i_state & (I_DIRTY | I_SYNC))) - percpu_counter_dec(&nr_inodes_unused); } spin_unlock(&inode_lock); @@ -612,8 +633,10 @@ static void prune_icache(int nr_to_scan) * Referenced or dirty inodes are still in use. Give them * another pass through the LRU as we canot reclaim them now. */ + spin_lock(&inode->i_lock); if (atomic_read(&inode->i_count) || (inode->i_state & ~I_REFERENCED)) { + spin_unlock(&inode->i_lock); list_del_init(&inode->i_lru); percpu_counter_dec(&nr_inodes_unused); continue; @@ -621,12 +644,14 @@ static void prune_icache(int nr_to_scan) /* recently referenced inodes get one more pass */ if (inode->i_state & I_REFERENCED) { - list_move(&inode->i_lru, &inode_lru); inode->i_state &= ~I_REFERENCED; + spin_unlock(&inode->i_lock); + list_move(&inode->i_lru, &inode_lru); continue; } if (inode_has_buffers(inode) || inode->i_data.nrpages) { __iget(inode); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); if (remove_inode_buffers(inode)) reap += invalidate_mapping_pages(&inode->i_data, @@ -637,11 +662,15 @@ static void prune_icache(int nr_to_scan) if (inode != list_entry(inode_lru.next, struct inode, i_lru)) continue; /* wrong inode or list_empty */ - if (!can_unuse(inode)) + spin_lock(&inode->i_lock); + if (!can_unuse(inode)) { + spin_unlock(&inode->i_lock); continue; + } } WARN_ON(inode->i_state & I_NEW); inode->i_state |= I_FREEING; + spin_unlock(&inode->i_lock); /* * Move the inode off the IO lists and LRU once I_FREEING is @@ -708,11 +737,13 @@ repeat: continue; if (!test(inode, data)) continue; + spin_lock(&inode->i_lock); if (inode->i_state & (I_FREEING|I_WILL_FREE)) { __wait_on_freeing_inode(inode); goto repeat; } __iget(inode); + spin_unlock(&inode->i_lock); return inode; } return NULL; @@ -734,11 +765,13 @@ repeat: continue; if (inode->i_sb != sb) continue; + spin_lock(&inode->i_lock); if (inode->i_state & (I_FREEING|I_WILL_FREE)) { __wait_on_freeing_inode(inode); goto repeat; } __iget(inode); + spin_unlock(&inode->i_lock); return inode; } return NULL; @@ -803,8 +836,10 @@ struct inode *new_inode(struct super_block *sb) inode = alloc_inode(sb); if (inode) { spin_lock(&inode_lock); - __inode_sb_list_add(inode); + spin_lock(&inode->i_lock); inode->i_state = 0; + spin_unlock(&inode->i_lock); + __inode_sb_list_add(inode); spin_unlock(&inode_lock); } return inode; @@ -871,9 +906,11 @@ static struct inode *get_new_inode(struct super_block *sb, if (set(inode, data)) goto set_failed; + spin_lock(&inode->i_lock); + inode->i_state = I_NEW; + spin_unlock(&inode->i_lock); hlist_add_head(&inode->i_hash, head); __inode_sb_list_add(inode); - inode->i_state = I_NEW; spin_unlock(&inode_lock); /* Return the locked inode with I_NEW set, the @@ -917,10 +954,12 @@ static struct inode *get_new_inode_fast(struct super_block *sb, /* We released the lock, so.. */ old = find_inode_fast(sb, head, ino); if (!old) { + spin_lock(&inode->i_lock); inode->i_ino = ino; + inode->i_state = I_NEW; + spin_unlock(&inode->i_lock); hlist_add_head(&inode->i_hash, head); __inode_sb_list_add(inode); - inode->i_state = I_NEW; spin_unlock(&inode_lock); /* Return the locked inode with I_NEW set, the @@ -1005,15 +1044,19 @@ EXPORT_SYMBOL(iunique); struct inode *igrab(struct inode *inode) { spin_lock(&inode_lock); - if (!(inode->i_state & (I_FREEING|I_WILL_FREE))) + spin_lock(&inode->i_lock); + if (!(inode->i_state & (I_FREEING|I_WILL_FREE))) { __iget(inode); - else + spin_unlock(&inode->i_lock); + } else { + spin_unlock(&inode->i_lock); /* * Handle the case where s_op->clear_inode is not been * called yet, and somebody is calling igrab * while the inode is getting freed. */ inode = NULL; + } spin_unlock(&inode_lock); return inode; } @@ -1242,7 +1285,9 @@ int insert_inode_locked(struct inode *inode) ino_t ino = inode->i_ino; struct hlist_head *head = inode_hashtable + hash(sb, ino); + spin_lock(&inode->i_lock); inode->i_state |= I_NEW; + spin_unlock(&inode->i_lock); while (1) { struct hlist_node *node; struct inode *old = NULL; @@ -1252,8 +1297,11 @@ int insert_inode_locked(struct inode *inode) continue; if (old->i_sb != sb) continue; - if (old->i_state & (I_FREEING|I_WILL_FREE)) + spin_lock(&old->i_lock); + if (old->i_state & (I_FREEING|I_WILL_FREE)) { + spin_unlock(&old->i_lock); continue; + } break; } if (likely(!node)) { @@ -1262,6 +1310,7 @@ int insert_inode_locked(struct inode *inode) return 0; } __iget(old); + spin_unlock(&old->i_lock); spin_unlock(&inode_lock); wait_on_inode(old); if (unlikely(!inode_unhashed(old))) { @@ -1279,7 +1328,9 @@ int insert_inode_locked4(struct inode *inode, unsigned long hashval, struct super_block *sb = inode->i_sb; struct hlist_head *head = inode_hashtable + hash(sb, hashval); + spin_lock(&inode->i_lock); inode->i_state |= I_NEW; + spin_unlock(&inode->i_lock); while (1) { struct hlist_node *node; @@ -1291,8 +1342,11 @@ int insert_inode_locked4(struct inode *inode, unsigned long hashval, continue; if (!test(old, data)) continue; - if (old->i_state & (I_FREEING|I_WILL_FREE)) + spin_lock(&old->i_lock); + if (old->i_state & (I_FREEING|I_WILL_FREE)) { + spin_unlock(&old->i_lock); continue; + } break; } if (likely(!node)) { @@ -1301,6 +1355,7 @@ int insert_inode_locked4(struct inode *inode, unsigned long hashval, return 0; } __iget(old); + spin_unlock(&old->i_lock); spin_unlock(&inode_lock); wait_on_inode(old); if (unlikely(!inode_unhashed(old))) { @@ -1346,6 +1401,9 @@ static void iput_final(struct inode *inode) const struct super_operations *op = inode->i_sb->s_op; int drop; + spin_lock(&inode->i_lock); + WARN_ON(inode->i_state & I_NEW); + if (op && op->drop_inode) drop = op->drop_inode(inode); else @@ -1357,21 +1415,23 @@ static void iput_final(struct inode *inode) if (!(inode->i_state & (I_DIRTY|I_SYNC))) { inode_lru_list_add(inode); } + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); return; } - WARN_ON(inode->i_state & I_NEW); inode->i_state |= I_WILL_FREE; + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); write_inode_now(inode, 1); spin_lock(&inode_lock); + spin_lock(&inode->i_lock); WARN_ON(inode->i_state & I_NEW); inode->i_state &= ~I_WILL_FREE; __remove_inode_hash(inode); } - WARN_ON(inode->i_state & I_NEW); inode->i_state |= I_FREEING; + spin_unlock(&inode->i_lock); /* * Move the inode off the IO lists and LRU once I_FREEING is @@ -1592,6 +1652,7 @@ static void __wait_on_freeing_inode(struct inode *inode) DEFINE_WAIT_BIT(wait, &inode->i_state, __I_NEW); wq = bit_waitqueue(&inode->i_state, __I_NEW); prepare_to_wait(wq, &wait.wait, TASK_UNINTERRUPTIBLE); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); schedule(); finish_wait(wq, &wait.wait); diff --git a/fs/notify/inode_mark.c b/fs/notify/inode_mark.c index 21ed106..9c0fb09 100644 --- a/fs/notify/inode_mark.c +++ b/fs/notify/inode_mark.c @@ -249,8 +249,11 @@ void fsnotify_unmount_inodes(struct list_head *list) * I_WILL_FREE, or I_NEW which is fine because by that point * the inode cannot have any associated watches. */ - if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) + spin_lock(&inode->i_lock); + if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) { + spin_unlock(&inode->i_lock); continue; + } /* * If i_count is zero, the inode cannot have any watches and @@ -258,8 +261,10 @@ void fsnotify_unmount_inodes(struct list_head *list) * evict all inodes with zero i_count from icache which is * unnecessarily violent and may in fact be illegal to do. */ - if (!atomic_read(&inode->i_count)) + if (!atomic_read(&inode->i_count)) { + spin_unlock(&inode->i_lock); continue; + } need_iput_tmp = need_iput; need_iput = NULL; @@ -269,13 +274,17 @@ void fsnotify_unmount_inodes(struct list_head *list) __iget(inode); else need_iput_tmp = NULL; + spin_unlock(&inode->i_lock); /* In case the dropping of a reference would nuke next_i. */ if ((&next_i->i_sb_list != list) && - atomic_read(&next_i->i_count) && - !(next_i->i_state & (I_FREEING | I_WILL_FREE))) { - __iget(next_i); - need_iput = next_i; + atomic_read(&next_i->i_count)) { + spin_lock(&next_i->i_lock); + if (!(next_i->i_state & (I_FREEING | I_WILL_FREE))) { + __iget(next_i); + need_iput = next_i; + } + spin_unlock(&next_i->i_lock); } /* diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c index aad1316..a2d89f8 100644 --- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -898,18 +898,19 @@ static void add_dquot_ref(struct super_block *sb, int type) spin_lock(&inode_lock); list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { - if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) + spin_lock(&inode->i_lock); + if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) || + !atomic_read(&inode->i_writecount) || + !dqinit_needed(inode, type)) { + spin_unlock(&inode->i_lock); continue; + } #ifdef CONFIG_QUOTA_DEBUG if (unlikely(inode_get_rsv_space(inode) > 0)) reserved = 1; #endif - if (!atomic_read(&inode->i_writecount)) - continue; - if (!dqinit_needed(inode, type)) - continue; - __iget(inode); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); iput(old_inode); -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/4] fs: factor inode disposal 2010-10-27 23:02 fs: break out inode LRU operations from inode_lock V2 Dave Chinner 2010-10-27 23:02 ` [PATCH 1/4] fs: protect inode->i_state with inode->i_lock Dave Chinner @ 2010-10-27 23:02 ` Dave Chinner 2010-10-27 23:02 ` [PATCH 3/4] fs: Lock the inode LRU list separately Dave Chinner ` (2 subsequent siblings) 4 siblings, 0 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-27 23:02 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel From: Dave Chinner <dchinner@redhat.com> We have a couple of places that dispose of inodes. factor the disposal into evict() to isolate this code and make it simpler to peel away the inode_lock from the code. While doing this, change the logic flow in iput_final() to separate the different cases that need to be handled to make the transitions the inode goes through more obvious. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/inode.c | 89 +++++++++++++++++++++++++---------------------------------- 1 files changed, 38 insertions(+), 51 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index e1959f3..bd1688b 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -408,17 +408,6 @@ void __insert_inode_hash(struct inode *inode, unsigned long hashval) EXPORT_SYMBOL(__insert_inode_hash); /** - * __remove_inode_hash - remove an inode from the hash - * @inode: inode to unhash - * - * Remove an inode from the superblock. - */ -static void __remove_inode_hash(struct inode *inode) -{ - hlist_del_init(&inode->i_hash); -} - -/** * remove_inode_hash - remove an inode from the hash * @inode: inode to unhash * @@ -446,10 +435,26 @@ void end_writeback(struct inode *inode) } EXPORT_SYMBOL(end_writeback); +/* + * Free the inode passed in, removing it from the lists it is still connected + * to. We remove any pages still attached to the inode and wait for any IO that + * is still in progress before finally destroying the inode. + * + * An inode must already be marked I_FREEING so that we avoid the inode being + * moved back onto lists if we race with other code that manipulates the lists + * (e.g. writeback_single_inode). The caller is responsible for setting this. + */ static void evict(struct inode *inode) { const struct super_operations *op = inode->i_sb->s_op; + BUG_ON(!(inode->i_state & I_FREEING)); + + spin_lock(&inode_lock); + list_del_init(&inode->i_wb_list); + __inode_sb_list_del(inode); + spin_unlock(&inode_lock); + if (op->evict_inode) { op->evict_inode(inode); } else { @@ -461,6 +466,11 @@ static void evict(struct inode *inode) bd_forget(inode); if (S_ISCHR(inode->i_mode) && inode->i_cdev) cd_forget(inode); + + remove_inode_hash(inode); + wake_up_inode(inode); + BUG_ON(inode->i_state != (I_FREEING | I_CLEAR)); + destroy_inode(inode); } /* @@ -479,14 +489,6 @@ static void dispose_list(struct list_head *head) list_del_init(&inode->i_lru); evict(inode); - - spin_lock(&inode_lock); - __remove_inode_hash(inode); - __inode_sb_list_del(inode); - spin_unlock(&inode_lock); - - wake_up_inode(inode); - destroy_inode(inode); } } @@ -524,11 +526,10 @@ void evict_inodes(struct super_block *sb) spin_unlock(&inode->i_lock); /* - * Move the inode off the IO lists and LRU once I_FREEING is - * set so that it won't get moved back on there if it is dirty. + * Move the inode off the LRU once I_FREEING is set so that it + * won't get moved back on. */ list_move(&inode->i_lru, &dispose); - list_del_init(&inode->i_wb_list); } spin_unlock(&inode_lock); @@ -570,11 +571,10 @@ int invalidate_inodes(struct super_block *sb) spin_unlock(&inode->i_lock); /* - * Move the inode off the IO lists and LRU once I_FREEING is - * set so that it won't get moved back on there if it is dirty. + * Move the inode off the LRU once I_FREEING is + * set so that it won't get moved back on. */ list_move(&inode->i_lru, &dispose); - list_del_init(&inode->i_wb_list); } spin_unlock(&inode_lock); @@ -673,11 +673,10 @@ static void prune_icache(int nr_to_scan) spin_unlock(&inode->i_lock); /* - * Move the inode off the IO lists and LRU once I_FREEING is - * set so that it won't get moved back on there if it is dirty. + * Move the inode off the LRU once I_FREEING is + * set so that it won't get moved back on. */ list_move(&inode->i_lru, &freeable); - list_del_init(&inode->i_wb_list); percpu_counter_dec(&nr_inodes_unused); } if (current_is_kswapd()) @@ -1409,16 +1408,16 @@ static void iput_final(struct inode *inode) else drop = generic_drop_inode(inode); + if (!drop && (sb->s_flags & MS_ACTIVE)) { + inode->i_state |= I_REFERENCED; + if (!(inode->i_state & (I_DIRTY|I_SYNC))) + inode_lru_list_add(inode); + spin_unlock(&inode->i_lock); + spin_unlock(&inode_lock); + return; + } + if (!drop) { - if (sb->s_flags & MS_ACTIVE) { - inode->i_state |= I_REFERENCED; - if (!(inode->i_state & (I_DIRTY|I_SYNC))) { - inode_lru_list_add(inode); - } - spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); - return; - } inode->i_state |= I_WILL_FREE; spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); @@ -1427,26 +1426,14 @@ static void iput_final(struct inode *inode) spin_lock(&inode->i_lock); WARN_ON(inode->i_state & I_NEW); inode->i_state &= ~I_WILL_FREE; - __remove_inode_hash(inode); } inode->i_state |= I_FREEING; - spin_unlock(&inode->i_lock); - - /* - * Move the inode off the IO lists and LRU once I_FREEING is - * set so that it won't get moved back on there if it is dirty. - */ inode_lru_list_del(inode); - list_del_init(&inode->i_wb_list); - - __inode_sb_list_del(inode); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); + evict(inode); - remove_inode_hash(inode); - wake_up_inode(inode); - BUG_ON(inode->i_state != (I_FREEING | I_CLEAR)); - destroy_inode(inode); } /** -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/4] fs: Lock the inode LRU list separately 2010-10-27 23:02 fs: break out inode LRU operations from inode_lock V2 Dave Chinner 2010-10-27 23:02 ` [PATCH 1/4] fs: protect inode->i_state with inode->i_lock Dave Chinner 2010-10-27 23:02 ` [PATCH 2/4] fs: factor inode disposal Dave Chinner @ 2010-10-27 23:02 ` Dave Chinner 2010-10-27 23:02 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner 2010-10-28 13:58 ` fs: break out inode LRU operations from inode_lock V2 Christoph Hellwig 4 siblings, 0 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-27 23:02 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel From: Dave Chinner <dchinner@redhat.com> Introduce the inode_lru_lock to protect the inode_lru list. This lock is nested inside the inode->i_lock to allow the inode to be added to the LRU list in iput_final without needing to deal with lock inversions. This keeps iput_final() clean and neat. Further, where marking the inode I_FREEING and removing it from the LRU, move the LRU list manipulation within the inode->i_lock to keep the list manipulation consistent with iput_final. This also means that most of the open coded LRU list removal + unused inode accounting can now use the inode_lru_list_del() wrappers which cleans the code up further. However, this locking change means what the LRU traversal in prune_icache() inverts this lock ordering and needs to use trylock semantics on the inode->i_lock to avoid deadlocking. In these cases, if we fail to lock the inode we move it to the back of the LRU to prevent spinning on it. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/inode.c | 45 +++++++++++++++++++++++++++++---------------- 1 files changed, 29 insertions(+), 16 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index bd1688b..da741e7 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -30,10 +30,13 @@ * * inode->i_lock protects: * inode->i_state, __iget() + * inode_lru_lock protects: + * inode_lru, inode->i_lru * * Lock ordering: * inode_lock * inode->i_lock + * inode_lru_lock */ /* @@ -83,6 +86,7 @@ static unsigned int i_hash_shift __read_mostly; */ static LIST_HEAD(inode_lru); +static DEFINE_SPINLOCK(inode_lru_lock); static struct hlist_head *inode_hashtable __read_mostly; /* @@ -343,18 +347,22 @@ EXPORT_SYMBOL(ihold); static void inode_lru_list_add(struct inode *inode) { + spin_lock(&inode_lru_lock); if (list_empty(&inode->i_lru)) { list_add(&inode->i_lru, &inode_lru); percpu_counter_inc(&nr_inodes_unused); } + spin_unlock(&inode_lru_lock); } static void inode_lru_list_del(struct inode *inode) { + spin_lock(&inode_lru_lock); if (!list_empty(&inode->i_lru)) { list_del_init(&inode->i_lru); percpu_counter_dec(&nr_inodes_unused); } + spin_unlock(&inode_lru_lock); } static inline void __inode_sb_list_add(struct inode *inode) @@ -521,15 +529,10 @@ void evict_inodes(struct super_block *sb) } inode->i_state |= I_FREEING; - if (!(inode->i_state & (I_DIRTY | I_SYNC))) - percpu_counter_dec(&nr_inodes_unused); + inode_lru_list_del(inode); spin_unlock(&inode->i_lock); - /* - * Move the inode off the LRU once I_FREEING is set so that it - * won't get moved back on. - */ - list_move(&inode->i_lru, &dispose); + list_add(&inode->i_lru, &dispose); } spin_unlock(&inode_lock); @@ -566,15 +569,10 @@ int invalidate_inodes(struct super_block *sb) } inode->i_state |= I_FREEING; - if (!(inode->i_state & (I_DIRTY | I_SYNC))) - percpu_counter_dec(&nr_inodes_unused); + inode_lru_list_del(inode); spin_unlock(&inode->i_lock); - /* - * Move the inode off the LRU once I_FREEING is - * set so that it won't get moved back on. - */ - list_move(&inode->i_lru, &dispose); + list_add(&inode->i_lru, &dispose); } spin_unlock(&inode_lock); @@ -621,6 +619,7 @@ static void prune_icache(int nr_to_scan) down_read(&iprune_sem); spin_lock(&inode_lock); + spin_lock(&inode_lru_lock); for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++) { struct inode *inode; @@ -630,10 +629,19 @@ static void prune_icache(int nr_to_scan) inode = list_entry(inode_lru.prev, struct inode, i_lru); /* + * we are inverting the inode_lru_lock/inode->i_lock here, + * so use a trylock. If we fail to get the lock, just move the + * inode to the back of the list so we don't spin on it. + */ + if (!spin_trylock(&inode->i_lock)) { + list_move(&inode->i_lru, &inode_lru); + continue; + } + + /* * Referenced or dirty inodes are still in use. Give them * another pass through the LRU as we canot reclaim them now. */ - spin_lock(&inode->i_lock); if (atomic_read(&inode->i_count) || (inode->i_state & ~I_REFERENCED)) { spin_unlock(&inode->i_lock); @@ -652,17 +660,21 @@ static void prune_icache(int nr_to_scan) if (inode_has_buffers(inode) || inode->i_data.nrpages) { __iget(inode); spin_unlock(&inode->i_lock); + spin_unlock(&inode_lru_lock); spin_unlock(&inode_lock); if (remove_inode_buffers(inode)) reap += invalidate_mapping_pages(&inode->i_data, 0, -1); iput(inode); spin_lock(&inode_lock); + spin_lock(&inode_lru_lock); if (inode != list_entry(inode_lru.next, struct inode, i_lru)) continue; /* wrong inode or list_empty */ - spin_lock(&inode->i_lock); + /* avoid lock inversions with trylock */ + if (!spin_trylock(&inode->i_lock)) + continue; if (!can_unuse(inode)) { spin_unlock(&inode->i_lock); continue; @@ -683,6 +695,7 @@ static void prune_icache(int nr_to_scan) __count_vm_events(KSWAPD_INODESTEAL, reap); else __count_vm_events(PGINODESTEAL, reap); + spin_unlock(&inode_lru_lock); spin_unlock(&inode_lock); dispose_list(&freeable); -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 23:02 fs: break out inode LRU operations from inode_lock V2 Dave Chinner ` (2 preceding siblings ...) 2010-10-27 23:02 ` [PATCH 3/4] fs: Lock the inode LRU list separately Dave Chinner @ 2010-10-27 23:02 ` Dave Chinner 2010-10-28 12:00 ` Christoph Hellwig 2010-10-28 13:58 ` fs: break out inode LRU operations from inode_lock V2 Christoph Hellwig 4 siblings, 1 reply; 20+ messages in thread From: Dave Chinner @ 2010-10-27 23:02 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel From: Dave Chinner <dchinner@redhat.com> Now that inode state changes are protected by the inode->i_lock and the inode LRU manipulations by the inode_lru_lock, we can remove the inode_lock from prune_icache and the initial part of iput_final(). instead of using the inode_lock to protect the inode during iput_final, use the inode->i_lock instead. This protects the inode against new references being taken while we change the inode state to I_FREEING, as well as preventing prune_icache from grabbing the inode while we are manipulating it. Hence we no longer need the inode_lock in iput_final prior to setting I_FREEING on the inode. For prune_icache, we no longer need the inode_lock to protect the LRU list, and the inodes themselves are protected against freeing races by the inode->i_lock. Hence we can lift the inode_lock from prune_icache as well. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/inode.c | 15 +++------------ 1 files changed, 3 insertions(+), 12 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index da741e7..f64aeda 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -618,7 +618,6 @@ static void prune_icache(int nr_to_scan) unsigned long reap = 0; down_read(&iprune_sem); - spin_lock(&inode_lock); spin_lock(&inode_lru_lock); for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++) { struct inode *inode; @@ -644,8 +643,8 @@ static void prune_icache(int nr_to_scan) */ if (atomic_read(&inode->i_count) || (inode->i_state & ~I_REFERENCED)) { - spin_unlock(&inode->i_lock); list_del_init(&inode->i_lru); + spin_unlock(&inode->i_lock); percpu_counter_dec(&nr_inodes_unused); continue; } @@ -653,20 +652,18 @@ static void prune_icache(int nr_to_scan) /* recently referenced inodes get one more pass */ if (inode->i_state & I_REFERENCED) { inode->i_state &= ~I_REFERENCED; - spin_unlock(&inode->i_lock); list_move(&inode->i_lru, &inode_lru); + spin_unlock(&inode->i_lock); continue; } if (inode_has_buffers(inode) || inode->i_data.nrpages) { __iget(inode); spin_unlock(&inode->i_lock); spin_unlock(&inode_lru_lock); - spin_unlock(&inode_lock); if (remove_inode_buffers(inode)) reap += invalidate_mapping_pages(&inode->i_data, 0, -1); iput(inode); - spin_lock(&inode_lock); spin_lock(&inode_lru_lock); if (inode != list_entry(inode_lru.next, @@ -696,7 +693,6 @@ static void prune_icache(int nr_to_scan) else __count_vm_events(PGINODESTEAL, reap); spin_unlock(&inode_lru_lock); - spin_unlock(&inode_lock); dispose_list(&freeable); up_read(&iprune_sem); @@ -1413,7 +1409,6 @@ static void iput_final(struct inode *inode) const struct super_operations *op = inode->i_sb->s_op; int drop; - spin_lock(&inode->i_lock); WARN_ON(inode->i_state & I_NEW); if (op && op->drop_inode) @@ -1426,16 +1421,13 @@ static void iput_final(struct inode *inode) if (!(inode->i_state & (I_DIRTY|I_SYNC))) inode_lru_list_add(inode); spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); return; } if (!drop) { inode->i_state |= I_WILL_FREE; spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); write_inode_now(inode, 1); - spin_lock(&inode_lock); spin_lock(&inode->i_lock); WARN_ON(inode->i_state & I_NEW); inode->i_state &= ~I_WILL_FREE; @@ -1444,7 +1436,6 @@ static void iput_final(struct inode *inode) inode->i_state |= I_FREEING; inode_lru_list_del(inode); spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); evict(inode); } @@ -1463,7 +1454,7 @@ void iput(struct inode *inode) if (inode) { BUG_ON(inode->i_state & I_CLEAR); - if (atomic_dec_and_lock(&inode->i_count, &inode_lock)) + if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock)) iput_final(inode); } } -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 23:02 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner @ 2010-10-28 12:00 ` Christoph Hellwig 2010-10-28 21:41 ` Dave Chinner 0 siblings, 1 reply; 20+ messages in thread From: Christoph Hellwig @ 2010-10-28 12:00 UTC (permalink / raw) To: Dave Chinner; +Cc: viro, linux-fsdevel, linux-kernel Looks good - although at this point you could already remove inode_lock from igrab, too. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-28 12:00 ` Christoph Hellwig @ 2010-10-28 21:41 ` Dave Chinner 0 siblings, 0 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-28 21:41 UTC (permalink / raw) To: Christoph Hellwig; +Cc: viro, linux-fsdevel, linux-kernel On Thu, Oct 28, 2010 at 08:00:57AM -0400, Christoph Hellwig wrote: > Looks good - although at this point you could already remove inode_lock > from igrab, too. Yes, it probably can go at this point - after the next three patches it's pretty much the only place the lock is left, so it's not really protecting anything anymore... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: fs: break out inode LRU operations from inode_lock V2 2010-10-27 23:02 fs: break out inode LRU operations from inode_lock V2 Dave Chinner ` (3 preceding siblings ...) 2010-10-27 23:02 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner @ 2010-10-28 13:58 ` Christoph Hellwig 4 siblings, 0 replies; 20+ messages in thread From: Christoph Hellwig @ 2010-10-28 13:58 UTC (permalink / raw) To: Dave Chinner; +Cc: viro, linux-fsdevel, linux-kernel On Thu, Oct 28, 2010 at 10:02:14AM +1100, Dave Chinner wrote: > Hi Al, > > The following patches break the inode LRU operations and the first > half of iput_final() out from under the inode_lock. The patches > should apply to your current merge-stem tree. The series looks good to me and survives QA so far. ^ permalink raw reply [flat|nested] 20+ messages in thread
* fs: break out inode LRU operations from node_lock @ 2010-10-27 4:23 Dave Chinner 2010-10-27 4:23 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner 0 siblings, 1 reply; 20+ messages in thread From: Dave Chinner @ 2010-10-27 4:23 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel Hi Al, The following patches break the inode LRU operations and the first half of iput_final() out from under the inode_lock. I included the dispose_one_inode factoring patch to isolate the inode_lock from iput_final() completely. It's easy enough to drop if you don't want that right now. It passes xfstests on 1-, 2- and 8-way VMs, survives 8-way parallel create/traverse/unlink workloads with 0, 1 and 65536 byte files on XFS and ext4, and shows no problems with looping 50-client dbench runs on XFS or ext4. The patches should apply to your current merge-stem tree. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 4:23 fs: break out inode LRU operations from node_lock Dave Chinner @ 2010-10-27 4:23 ` Dave Chinner 2010-10-27 4:40 ` Al Viro 0 siblings, 1 reply; 20+ messages in thread From: Dave Chinner @ 2010-10-27 4:23 UTC (permalink / raw) To: viro; +Cc: linux-fsdevel, linux-kernel From: Dave Chinner <dchinner@redhat.com> Now that inode state changes are protected by the inode->i_lock and the inode LRU manipulations by the inode_lru_lock, we can remove the inode_lock from prune_icache and the initial part of iput_final(). instead of using the inode_lock to protect the inode during iput_final, use the inode->i_lock instead. This protects the inode against new references being taken while we change the inode state to I_FREEING, as well as preventing prune_icache from grabbing the inode while we are manipulating it. Hence we no longer need the iṉode_lock in iput_final prior to setting I_FREEING on the inode. For prune_icache, we no longer need the inode_lock to protect the LRU list, and the inodes themselves are protected against freeing races by the inode->i_lock. Hence we can lift the inode_lock from prune_icache as well. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/inode.c | 19 +++++-------------- 1 files changed, 5 insertions(+), 14 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index e04371e..b5f1585 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -634,7 +634,6 @@ static void prune_icache(int nr_to_scan) unsigned long reap = 0; down_read(&iprune_sem); - spin_lock(&inode_lock); spin_lock(&inode_lru_lock); for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++) { struct inode *inode; @@ -660,8 +659,8 @@ static void prune_icache(int nr_to_scan) */ if (atomic_read(&inode->i_count) || (inode->i_state & ~I_REFERENCED)) { - spin_unlock(&inode->i_lock); list_del_init(&inode->i_lru); + spin_unlock(&inode->i_lock); percpu_counter_dec(&nr_inodes_unused); continue; } @@ -669,20 +668,18 @@ static void prune_icache(int nr_to_scan) /* recently referenced inodes get one more pass */ if (inode->i_state & I_REFERENCED) { inode->i_state &= ~I_REFERENCED; - spin_unlock(&inode->i_lock); list_move(&inode->i_lru, &inode_lru); + spin_unlock(&inode->i_lock); continue; } if (inode_has_buffers(inode) || inode->i_data.nrpages) { __iget(inode); spin_unlock(&inode->i_lock); spin_unlock(&inode_lru_lock); - spin_unlock(&inode_lock); if (remove_inode_buffers(inode)) reap += invalidate_mapping_pages(&inode->i_data, 0, -1); iput(inode); - spin_lock(&inode_lock); spin_lock(&inode_lru_lock); if (inode != list_entry(inode_lru.next, @@ -701,8 +698,8 @@ static void prune_icache(int nr_to_scan) spin_unlock(&inode->i_lock); /* - * Move the inode off the IO lists and LRU once I_FREEING is - * set so that it won't get moved back on there if it is dirty. + * Move the inode off the LRU once I_FREEING is set so that it + * won't get moved back on there if it is dirty. */ list_move(&inode->i_lru, &freeable); percpu_counter_dec(&nr_inodes_unused); @@ -712,7 +709,6 @@ static void prune_icache(int nr_to_scan) else __count_vm_events(PGINODESTEAL, reap); spin_unlock(&inode_lru_lock); - spin_unlock(&inode_lock); dispose_list(&freeable); up_read(&iprune_sem); @@ -1429,7 +1425,6 @@ static void iput_final(struct inode *inode) const struct super_operations *op = inode->i_sb->s_op; int drop; - spin_lock(&inode->i_lock); WARN_ON(inode->i_state & I_NEW); if (op && op->drop_inode) @@ -1442,16 +1437,13 @@ static void iput_final(struct inode *inode) if (!(inode->i_state & (I_DIRTY|I_SYNC))) inode_lru_list_add(inode); spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); return; } if (!drop) { inode->i_state |= I_WILL_FREE; spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); write_inode_now(inode, 1); - spin_lock(&inode_lock); spin_lock(&inode->i_lock); WARN_ON(inode->i_state & I_NEW); inode->i_state &= ~I_WILL_FREE; @@ -1460,7 +1452,6 @@ static void iput_final(struct inode *inode) inode->i_state |= I_FREEING; inode_lru_list_del(inode); spin_unlock(&inode->i_lock); - spin_unlock(&inode_lock); dispose_one_inode(inode); } @@ -1479,7 +1470,7 @@ void iput(struct inode *inode) if (inode) { BUG_ON(inode->i_state & I_CLEAR); - if (atomic_dec_and_lock(&inode->i_count, &inode_lock)) + if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock)) iput_final(inode); } } -- 1.7.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 4:23 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner @ 2010-10-27 4:40 ` Al Viro 2010-10-27 4:47 ` Eric Dumazet 2010-10-27 9:12 ` Dave Chinner 0 siblings, 2 replies; 20+ messages in thread From: Al Viro @ 2010-10-27 4:40 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel On Wed, Oct 27, 2010 at 03:23:04PM +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@redhat.com> > > Now that inode state changes are protected by the inode->i_lock and > the inode LRU manipulations by the inode_lru_lock, we can remove the > inode_lock from prune_icache and the initial part of iput_final(). > > instead of using the inode_lock to protect the inode during > iput_final, use the inode->i_lock instead. This protects the inode > against new references being taken while we change the inode state > to I_FREEING, as well as preventing prune_icache from grabbing the > inode while we are manipulating it. Hence we no longer need the > i???ode_lock in iput_final prior to setting I_FREEING on the inode. ^^^^^^^^^^^^ ... the hell? There's more such damage elsewhere in the thread; what's going on? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 4:40 ` Al Viro @ 2010-10-27 4:47 ` Eric Dumazet 2010-10-27 9:12 ` Dave Chinner 1 sibling, 0 replies; 20+ messages in thread From: Eric Dumazet @ 2010-10-27 4:47 UTC (permalink / raw) To: Al Viro; +Cc: Dave Chinner, linux-fsdevel, linux-kernel Le mercredi 27 octobre 2010 à 05:40 +0100, Al Viro a écrit : > On Wed, Oct 27, 2010 at 03:23:04PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Now that inode state changes are protected by the inode->i_lock and > > the inode LRU manipulations by the inode_lru_lock, we can remove the > > inode_lock from prune_icache and the initial part of iput_final(). > > > > instead of using the inode_lock to protect the inode during > > iput_final, use the inode->i_lock instead. This protects the inode > > against new references being taken while we change the inode state > > to I_FREEING, as well as preventing prune_icache from grabbing the > > inode while we are manipulating it. Hence we no longer need the > > i???ode_lock in iput_final prior to setting I_FREEING on the inode. > ^^^^^^^^^^^^ > > ... the hell? There's more such damage elsewhere in the thread; what's > going on? > -- Maybe its on your side, no problem here on my copy. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache @ 2010-10-27 4:47 ` Eric Dumazet 0 siblings, 0 replies; 20+ messages in thread From: Eric Dumazet @ 2010-10-27 4:47 UTC (permalink / raw) To: Al Viro; +Cc: Dave Chinner, linux-fsdevel, linux-kernel Le mercredi 27 octobre 2010 à 05:40 +0100, Al Viro a écrit : > On Wed, Oct 27, 2010 at 03:23:04PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Now that inode state changes are protected by the inode->i_lock and > > the inode LRU manipulations by the inode_lru_lock, we can remove the > > inode_lock from prune_icache and the initial part of iput_final(). > > > > instead of using the inode_lock to protect the inode during > > iput_final, use the inode->i_lock instead. This protects the inode > > against new references being taken while we change the inode state > > to I_FREEING, as well as preventing prune_icache from grabbing the > > inode while we are manipulating it. Hence we no longer need the > > i???ode_lock in iput_final prior to setting I_FREEING on the inode. > ^^^^^^^^^^^^ > > ... the hell? There's more such damage elsewhere in the thread; what's > going on? > -- Maybe its on your side, no problem here on my copy. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 4:47 ` Eric Dumazet (?) @ 2010-10-27 5:25 ` Al Viro 2010-10-27 5:50 ` Eric Dumazet -1 siblings, 1 reply; 20+ messages in thread From: Al Viro @ 2010-10-27 5:25 UTC (permalink / raw) To: Eric Dumazet; +Cc: Dave Chinner, linux-fsdevel, linux-kernel On Wed, Oct 27, 2010 at 06:47:46AM +0200, Eric Dumazet wrote: > Le mercredi 27 octobre 2010 ?? 05:40 +0100, Al Viro a ??crit : > > On Wed, Oct 27, 2010 at 03:23:04PM +1100, Dave Chinner wrote: > > > From: Dave Chinner <dchinner@redhat.com> > > > > > > Now that inode state changes are protected by the inode->i_lock and > > > the inode LRU manipulations by the inode_lru_lock, we can remove the > > > inode_lock from prune_icache and the initial part of iput_final(). > > > > > > instead of using the inode_lock to protect the inode during > > > iput_final, use the inode->i_lock instead. This protects the inode > > > against new references being taken while we change the inode state > > > to I_FREEING, as well as preventing prune_icache from grabbing the > > > inode while we are manipulating it. Hence we no longer need the > > > i???ode_lock in iput_final prior to setting I_FREEING on the inode. > > ^^^^^^^^^^^^ > > > > ... the hell? There's more such damage elsewhere in the thread; what's > > going on? > > -- > > Maybe its on your side, no problem here on my copy. "i\xe1\xb9\x89ode_lock", i.e. 'n' turned into U+1E49, aka "latin small letter n with line below". I doubt that it's MTA braindamage. In the first patch there's - * invalidate_inodes - attempt to free all inodes on a + * nvalidate_inodes - attempt to free all inodes on a and I _really_ doubt that anything in mail system is capable of something that elaborate. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 5:25 ` Al Viro @ 2010-10-27 5:50 ` Eric Dumazet 0 siblings, 0 replies; 20+ messages in thread From: Eric Dumazet @ 2010-10-27 5:50 UTC (permalink / raw) To: Al Viro; +Cc: Dave Chinner, linux-fsdevel, linux-kernel Le mercredi 27 octobre 2010 à 06:25 +0100, Al Viro a écrit : > "i\xe1\xb9\x89ode_lock", i.e. 'n' turned into U+1E49, aka "latin small letter > n with line below". I doubt that it's MTA braindamage. > > In the first patch there's > > - * invalidate_inodes - attempt to free all inodes on a > + * nvalidate_inodes - attempt to free all inodes on a > > and I _really_ doubt that anything in mail system is capable of something > that elaborate. Again, I can not see it in my copy, I checked lkml archives too : http://lkml.org/lkml/2010/10/27/7 Mail was fine, maybe your file system is corrupted ? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache @ 2010-10-27 5:50 ` Eric Dumazet 0 siblings, 0 replies; 20+ messages in thread From: Eric Dumazet @ 2010-10-27 5:50 UTC (permalink / raw) To: Al Viro; +Cc: Dave Chinner, linux-fsdevel, linux-kernel Le mercredi 27 octobre 2010 à 06:25 +0100, Al Viro a écrit : > "i\xe1\xb9\x89ode_lock", i.e. 'n' turned into U+1E49, aka "latin small letter > n with line below". I doubt that it's MTA braindamage. > > In the first patch there's > > - * invalidate_inodes - attempt to free all inodes on a > + * nvalidate_inodes - attempt to free all inodes on a > > and I _really_ doubt that anything in mail system is capable of something > that elaborate. Again, I can not see it in my copy, I checked lkml archives too : http://lkml.org/lkml/2010/10/27/7 Mail was fine, maybe your file system is corrupted ? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 5:50 ` Eric Dumazet (?) @ 2010-10-27 6:01 ` Al Viro -1 siblings, 0 replies; 20+ messages in thread From: Al Viro @ 2010-10-27 6:01 UTC (permalink / raw) To: Eric Dumazet; +Cc: Dave Chinner, linux-fsdevel, linux-kernel On Wed, Oct 27, 2010 at 07:50:23AM +0200, Eric Dumazet wrote: > Le mercredi 27 octobre 2010 ?? 06:25 +0100, Al Viro a ??crit : > > "i\xe1\xb9\x89ode_lock", i.e. 'n' turned into U+1E49, aka "latin small letter > > n with line below". I doubt that it's MTA braindamage. > > > > In the first patch there's > > > > - * invalidate_inodes - attempt to free all inodes on a > > + * nvalidate_inodes - attempt to free all inodes on a > > > > and I _really_ doubt that anything in mail system is capable of something > > that elaborate. > > Again, I can not see it in my copy, I checked lkml archives too : > > http://lkml.org/lkml/2010/10/27/7 > > Mail was fine, maybe your file system is corrupted ? fs corruption inserting pieces like that? Then we have a serious trouble of HAL kind... It's not a result of rediff; it's plain vi /var/mail/$USER... ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 5:50 ` Eric Dumazet (?) (?) @ 2010-10-27 6:09 ` Davidlohr Bueso -1 siblings, 0 replies; 20+ messages in thread From: Davidlohr Bueso @ 2010-10-27 6:09 UTC (permalink / raw) To: Eric Dumazet; +Cc: Al Viro, Dave Chinner, linux-fsdevel, linux-kernel On Wed, 2010-10-27 at 07:50 +0200, Eric Dumazet wrote: > Le mercredi 27 octobre 2010 à 06:25 +0100, Al Viro a écrit : > > "i\xe1\xb9\x89ode_lock", i.e. 'n' turned into U+1E49, aka "latin small letter > > n with line below". I doubt that it's MTA braindamage. > > > > In the first patch there's > > > > - * invalidate_inodes - attempt to free all inodes on a > > + * nvalidate_inodes - attempt to free all inodes on a > > > > and I _really_ doubt that anything in mail system is capable of something > > that elaborate. > > Again, I can not see it in my copy, I checked lkml archives too : > Neither can I, the patch is good (from a character point of view). ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 5:50 ` Eric Dumazet ` (2 preceding siblings ...) (?) @ 2010-10-27 7:11 ` Christian Stroetmann -1 siblings, 0 replies; 20+ messages in thread From: Christian Stroetmann @ 2010-10-27 7:11 UTC (permalink / raw) To: Eric Dumazet, Al Viro, linux-fsdevel, linux-kernel On the 27.10.2010 07:50, Eric Dumazet wrote : > Le mercredi 27 octobre 2010 à 06:25 +0100, Al Viro a écrit : >> "i\xe1\xb9\x89ode_lock", i.e. 'n' turned into U+1E49, aka "latin small letter >> n with line below". I doubt that it's MTA braindamage. >> >> In the first patch there's >> >> - * invalidate_inodes - attempt to free all inodes on a >> + * nvalidate_inodes - attempt to free all inodes on a >> >> and I _really_ doubt that anything in mail system is capable of something >> that elaborate. > Again, I can not see it in my copy, I checked lkml archives too : > > http://lkml.org/lkml/2010/10/27/7 > > Mail was fine, maybe your file system is corrupted ? > I have it in patch 4/4 too. But in patch 1/4 could it be just a typo? Christian Stroetmann ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache 2010-10-27 4:40 ` Al Viro @ 2010-10-27 9:12 ` Dave Chinner 2010-10-27 9:12 ` Dave Chinner 1 sibling, 0 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-27 9:12 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, linux-kernel On Wed, Oct 27, 2010 at 05:40:38AM +0100, Al Viro wrote: > On Wed, Oct 27, 2010 at 03:23:04PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Now that inode state changes are protected by the inode->i_lock and > > the inode LRU manipulations by the inode_lru_lock, we can remove the > > inode_lock from prune_icache and the initial part of iput_final(). > > > > instead of using the inode_lock to protect the inode during > > iput_final, use the inode->i_lock instead. This protects the inode > > against new references being taken while we change the inode state > > to I_FREEING, as well as preventing prune_icache from grabbing the > > inode while we are manipulating it. Hence we no longer need the > > i???ode_lock in iput_final prior to setting I_FREEING on the inode. > ^^^^^^^^^^^^ > > ... the hell? There's more such damage elsewhere in the thread; what's > going on? vim utf-8 multibyte support that is causing these characters to be created. e.g. e<backspace>' results in é. sometimes the resultant character looks almost identical and so I didn't notice. I haven't found the magic recipe to turn this off (maxcombine=0 doesn't seem to stop it) or change the special compose sequence, so I'll keep looking. Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache @ 2010-10-27 9:12 ` Dave Chinner 0 siblings, 0 replies; 20+ messages in thread From: Dave Chinner @ 2010-10-27 9:12 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, linux-kernel On Wed, Oct 27, 2010 at 05:40:38AM +0100, Al Viro wrote: > On Wed, Oct 27, 2010 at 03:23:04PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Now that inode state changes are protected by the inode->i_lock and > > the inode LRU manipulations by the inode_lru_lock, we can remove the > > inode_lock from prune_icache and the initial part of iput_final(). > > > > instead of using the inode_lock to protect the inode during > > iput_final, use the inode->i_lock instead. This protects the inode > > against new references being taken while we change the inode state > > to I_FREEING, as well as preventing prune_icache from grabbing the > > inode while we are manipulating it. Hence we no longer need the > > i???ode_lock in iput_final prior to setting I_FREEING on the inode. > ^^^^^^^^^^^^ > > ... the hell? There's more such damage elsewhere in the thread; what's > going on? vim utf-8 multibyte support that is causing these characters to be created. e.g. e<backspace>' results in é. sometimes the resultant character looks almost identical and so I didn't notice. I haven't found the magic recipe to turn this off (maxcombine=0 doesn't seem to stop it) or change the special compose sequence, so I'll keep looking. Cheers, Dave. -- Dave Chinner david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2010-10-28 21:41 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-10-27 23:02 fs: break out inode LRU operations from inode_lock V2 Dave Chinner 2010-10-27 23:02 ` [PATCH 1/4] fs: protect inode->i_state with inode->i_lock Dave Chinner 2010-10-27 23:02 ` [PATCH 2/4] fs: factor inode disposal Dave Chinner 2010-10-27 23:02 ` [PATCH 3/4] fs: Lock the inode LRU list separately Dave Chinner 2010-10-27 23:02 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner 2010-10-28 12:00 ` Christoph Hellwig 2010-10-28 21:41 ` Dave Chinner 2010-10-28 13:58 ` fs: break out inode LRU operations from inode_lock V2 Christoph Hellwig -- strict thread matches above, loose matches on Subject: below -- 2010-10-27 4:23 fs: break out inode LRU operations from node_lock Dave Chinner 2010-10-27 4:23 ` [PATCH 4/4] fs: remove inode_lock from iput_final and prune_icache Dave Chinner 2010-10-27 4:40 ` Al Viro 2010-10-27 4:47 ` Eric Dumazet 2010-10-27 4:47 ` Eric Dumazet 2010-10-27 5:25 ` Al Viro 2010-10-27 5:50 ` Eric Dumazet 2010-10-27 5:50 ` Eric Dumazet 2010-10-27 6:01 ` Al Viro 2010-10-27 6:09 ` Davidlohr Bueso 2010-10-27 7:11 ` Christian Stroetmann 2010-10-27 9:12 ` Dave Chinner 2010-10-27 9:12 ` Dave Chinner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.