From: Nick Piggin <npiggin@kernel.dk>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Frank Mayhar <fmayhar@google.com>, John Stultz <johnstul@us.ibm.com>
Subject: VFS scalability git tree
Date: Fri, 23 Jul 2010 05:01:00 +1000
Message-ID: <20100722190100.GA22269@amd>

I'm pleased to announce I have a git tree up of my vfs scalability work.

git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin.git
http://git.kernel.org/?p=linux/kernel/git/npiggin/linux-npiggin.git

Branch vfs-scale-working

The really interesting new item is the store-free path walk (43fe2b), which I've re-introduced. It has had a complete redesign, it has much better performance and scalability in more cases, and is actually sane code now.

What this does is allow parallel name lookups to walk down common elements without any cacheline bouncing between them. It can walk across many interesting cases such as mount points, back up '..', and negative dentries of most filesystems. It does so without requiring any atomic operations or any stores at all to shared data. This also makes it very fast in serial performance (path walking is nearly twice as fast on my Opteron).

In cases where it cannot continue the RCU walk (e.g. the dentry does not exist), it can in most cases take a reference on the farthest element it has reached so far, and then continue with a regular refcount-based path walk. My first attempt at this simply dropped everything and re-did the full refcount-based walk.

I've also been working on stress testing, bug fixing, cutting down 'XXX'es, and improving changelogs and comments.

Most filesystems are untested (it's too large a job to do comprehensive stress tests on everything), but none have known issues (except nilfs2). Ext2/3, nfs, nfsd, and ram-based filesystems seem to work well; ext4/btrfs/xfs/autofs4 have had light testing.

I've never had filesystem corruption when testing these patches (only lockups or other bugs). But standard disclaimer: they may eat your data.

Summary of a few numbers I've run:

- Google's socket teardown workload runs 3-4x faster on my 2-socket Opteron.
- Single-threaded git diff runs 20% faster on the same machine.
- 32-node Altix runs dbench on ramfs 150x faster (100MB/s up to 15GB/s).

At this point, I would be very interested in reviewing, correctness testing on different configurations, and of course benchmarking.

Thanks,
Nick
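
To illustrate the fallback described in the announcement (walk without stores for as long as possible, then take a reference on the farthest element reached and continue with a refcounted walk), here is a minimal user-space sketch in C. It is not the code in the tree: `struct node`, `find_child`, `step_lockless` and `walk` are hypothetical names, the per-node sequence counter is an assumed stand-in for whatever validation the real walk performs, and RCU-deferred freeing and concurrent writers are left out entirely. Paths are relative, without a leading slash.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a dentry. */
struct node {
    const char *name;
    struct node *child[MAX_CHILDREN];
    atomic_uint seq;      /* even: stable; odd: a writer is modifying it */
    atomic_int refcount;  /* touched only by the fallback walk and the final pin */
};

/* Look up one path component of length len among dir's children. */
static struct node *find_child(struct node *dir, const char *name, size_t len)
{
    for (int i = 0; i < MAX_CHILDREN; i++) {
        struct node *c = dir->child[i];
        if (c && strlen(c->name) == len && memcmp(c->name, name, len) == 0)
            return c;
    }
    return NULL;
}

/* One store-free step: only loads from shared data. Returns false when
 * the step cannot be trusted or the child is missing, so the caller
 * must fall back to the refcounted walk. */
static bool step_lockless(struct node *dir, const char *name, size_t len,
                          struct node **out)
{
    unsigned before = atomic_load_explicit(&dir->seq, memory_order_acquire);
    if (before & 1)
        return false;                       /* writer in progress */
    struct node *c = find_child(dir, name, len);
    atomic_thread_fence(memory_order_acquire);
    unsigned after = atomic_load_explicit(&dir->seq, memory_order_relaxed);
    if (before != after || c == NULL)
        return false;
    *out = c;
    return true;
}

/* Walk a path such as "b/c" from root. Returns the final node holding one
 * reference, or NULL if a component is missing. *fell_back reports
 * whether the refcounted slow path was needed. */
struct node *walk(struct node *root, const char *path, bool *fell_back)
{
    assert(root && path && fell_back);
    struct node *cur = root;
    *fell_back = false;

    while (*path) {
        size_t len = strcspn(path, "/");
        struct node *next;

        if (!*fell_back && step_lockless(cur, path, len, &next)) {
            cur = next;                     /* no stores so far */
        } else {
            if (!*fell_back) {
                /* Pin the farthest node reached instead of restarting. */
                atomic_fetch_add(&cur->refcount, 1);
                *fell_back = true;
            }
            next = find_child(cur, path, len);
            if (!next) {
                atomic_fetch_sub(&cur->refcount, 1);
                return NULL;
            }
            atomic_fetch_add(&next->refcount, 1);  /* take child ... */
            atomic_fetch_sub(&cur->refcount, 1);   /* ... then drop parent */
            cur = next;
        }
        path += len;
        if (*path == '/')
            path++;
    }
    if (!*fell_back)
        atomic_fetch_add(&cur->refcount, 1);  /* pin only the final result */
    return cur;
}
```

In this sketch a fully store-free walk performs a single atomic increment, on the final node, while the fallback walk bounces a reference count on every remaining component. That difference is the cacheline traffic the announcement refers to.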