All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Dave Chinner <dchinner@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Status of inode scaling work
Date: Thu, 27 Jan 2011 11:40:57 +1100	[thread overview]
Message-ID: <AANLkTinkr78scr+22crPknHWARU0wGa7OkZLopP4VHAx@mail.gmail.com> (raw)
In-Reply-To: <20110125071011.GD28803@dastard>

On Tue, Jan 25, 2011 at 6:10 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Jan 14, 2011 at 08:02:57PM +1100, Nick Piggin wrote:
>> I'd like to know about the status and plans of the inode scaling work.
>>
>> Aiming for 2.6.39 or 40 might be an idea.
>
> It was ready for 2.6.38....

Can it go into linux-next? Unless Al can put it in his for-next branch?

>> I'd like to be able to see what patches are proposed to finish breaking
>> inode_lock into its constituent sub classes, and we can go from there.
>
> Already done. I pointed you at the tree a while back:
>
> The following changes since commit c723fdab8aa728dc2bf0da6a0de8bb9c3f588d84:
>
>  Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 (2011-01-25 14:23:54 +1000)
>
> are available in the git repository at:
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev.git inode-scale
>
> Dave Chinner (9):
>      fs: protect inode->i_state with inode->i_lock

The places where you take i_lock to initialize i_state in newly
allocated inodes should not be required, because they can't be found
until they get added to lists, and adding to the list must have the right
barriers or lock synchronisation to make sure traversals don't see
uninitialised inodes.

eg.
                         spin_lock(&inode->i_lock);
                         inode->i_state = I_NEW;
                         hlist_add_head(&inode->i_hash, head);
                         spin_unlock(&inode->i_lock);
                         inode_sb_list_add(inode);
                         spin_unlock(&inode_lock);

That lightens new inode paths a little bit.

"Also remove the unlock_new_inode() memory barrier optimisation
required to avoid taking the inode_lock when clearing I_NEW.
Simplify the code by simply taking the inode->i_lock around the
state change and wakeup. Because the wakeup is no longer tricky,
remove the wake_up_inode() function and open code the wakeup where
necessary."

Can you do that bit separately?


>      fs: factor inode disposal

-       spin_unlock(&inode->i_lock);
-
-       /*
-        * Move the inode off the IO lists and LRU once I_FREEING is
-        * set so that it won't get moved back on there if it is dirty.
-        */
        inode_lru_list_del(inode);
-       list_del_init(&inode->i_wb_list);
-
-       __inode_sb_list_del(inode);
+       spin_unlock(&inode->i_lock);
        spin_unlock(&inode_lock);
+

Why do you change i_lock to cover inode_lru_list_del in this
patch?


>      fs: Lock the inode LRU list separately

BTW. have you found any issues with nesting locks inside i_lock? We
do that extensively in dcache now, but it can be trivially made to use
another lock (eg. i_dcache_lock).


>      fs: remove inode_lock from iput_final and prune_icache
>      fs: move i_sb_list out from under inode_lock

I don't know if we should be including ../internal.h, although you could
argue that quota and notify is core code.


>      fs: move i_wb_list out from under inode_lock
>      fs: rename inode_lock to inode_hash_lock
>      fs: Clean up documentation references to inode_lock

Can these be merged with the patches as they change the locking?
Some places also call it "inode lock" (eg. find_inode())


>      fs: pull inode->i_lock up out of writeback_single_inode

I think the series looks pretty good overall. Thanks.

Nick
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2011-01-27  0:40 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-14  9:02 Status of inode scaling work Nick Piggin
2011-01-25  7:10 ` Dave Chinner
2011-01-27  0:40   ` Nick Piggin [this message]
2011-01-27  4:57     ` Dave Chinner
2011-01-27  5:11       ` Nick Piggin
2011-01-27  6:20         ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AANLkTinkr78scr+22crPknHWARU0wGa7OkZLopP4VHAx@mail.gmail.com \
    --to=npiggin@gmail.com \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.