From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Piggin Subject: Re: Status of inode scaling work Date: Thu, 27 Jan 2011 11:40:57 +1100 Message-ID: References: <20110125071011.GD28803@dastard> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Dave Chinner , Al Viro , Christoph Hellwig , Linus Torvalds , linux-fsdevel To: Dave Chinner Return-path: Received: from mail-wy0-f174.google.com ([74.125.82.174]:38978 "EHLO mail-wy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754298Ab1A0Ak7 convert rfc822-to-8bit (ORCPT ); Wed, 26 Jan 2011 19:40:59 -0500 Received: by wyb28 with SMTP id 28so1560768wyb.19 for ; Wed, 26 Jan 2011 16:40:57 -0800 (PST) In-Reply-To: <20110125071011.GD28803@dastard> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Tue, Jan 25, 2011 at 6:10 PM, Dave Chinner wro= te: > On Fri, Jan 14, 2011 at 08:02:57PM +1100, Nick Piggin wrote: >> I'd like to know about the status and plans of the inode scaling wor= k. >> >> Aiming for 2.6.39 or 40 might be an idea. > > It was ready for 2.6.38.... Can it go into linux-next? Unless Al can put it in his for-next branch? >> I'd like to be able to see what patches are proposed to finish break= ing >> inode_lock into its constituent sub classes, and we can go from ther= e. > > Already done. I pointed you at the tree a while back: > > The following changes since commit c723fdab8aa728dc2bf0da6a0de8bb9c3f= 588d84: > > =A0Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2= =2E6 (2011-01-25 14:23:54 +1000) > > are available in the git repository at: > > =A0git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev.git inode= -scale > > Dave Chinner (9): > =A0 =A0 =A0fs: protect inode->i_state with inode->i_lock The places where you take i_lock to initialize i_state in newly allocated inodes should not be required, because they can't be found until they get added to lists, and adding to the list must have the rig= ht barriers or lock synchronisation to make sure traversals don't see uninitialised inodes. eg. spin_lock(&inode->i_lock); inode->i_state =3D I_NEW; hlist_add_head(&inode->i_hash, head); spin_unlock(&inode->i_lock); inode_sb_list_add(inode); spin_unlock(&inode_lock); That lightens new inode paths a little bit. "Also remove the unlock_new_inode() memory barrier optimisation required to avoid taking the inode_lock when clearing I_NEW. Simplify the code by simply taking the inode->i_lock around the state change and wakeup. Because the wakeup is no longer tricky, remove the wake_up_inode() function and open code the wakeup where necessary." Can you do that bit separately? > =A0 =A0 =A0fs: factor inode disposal - spin_unlock(&inode->i_lock); - - /* - * Move the inode off the IO lists and LRU once I_FREEING is - * set so that it won't get moved back on there if it is dirty. - */ inode_lru_list_del(inode); - list_del_init(&inode->i_wb_list); - - __inode_sb_list_del(inode); + spin_unlock(&inode->i_lock); spin_unlock(&inode_lock); + Why do you change i_lock to cover inode_lru_list_del in this patch? > =A0 =A0 =A0fs: Lock the inode LRU list separately BTW. have you found any issues with nesting locks inside i_lock? We do that extensively in dcache now, but it can be trivially made to use another lock (eg. i_dcache_lock). > =A0 =A0 =A0fs: remove inode_lock from iput_final and prune_icache > =A0 =A0 =A0fs: move i_sb_list out from under inode_lock I don't know if we should be including ../internal.h, although you coul= d argue that quota and notify is core code. > =A0 =A0 =A0fs: move i_wb_list out from under inode_lock > =A0 =A0 =A0fs: rename inode_lock to inode_hash_lock > =A0 =A0 =A0fs: Clean up documentation references to inode_lock Can these be merged with the patches as they change the locking? Some places also call it "inode lock" (eg. find_inode()) > =A0 =A0 =A0fs: pull inode->i_lock up out of writeback_single_inode I think the series looks pretty good overall. Thanks. Nick -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= " in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html