linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Fengguang Wu <fengguang.wu@gmail.com>
To: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Linus Torvalds <torvalds@osdl.org>, Theodore Tso <tytso@mit.edu>,
	Suparna Bhattacharya <suparna@in.ibm.com>,
	Andrew Morton <akpm@osdl.org>, Willy Tarreau <w@1wt.eu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	git@vger.kernel.org, "J.H." <warthog9@kernel.org>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	Pavel Machek <pavel@ucw.cz>,
	kernel list <linux-kernel@vger.kernel.org>,
	webmaster@kernel.org,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: How git affects kernel.org performance
Date: Wed, 10 Jan 2007 22:07:30 +0800	[thread overview]
Message-ID: <368438013.19600@ustc.edu.cn> (raw)
Message-ID: <20070110140730.GA986@mail.ustc.edu.cn> (raw)
In-Reply-To: <1168399249.2585.6.camel@nigel.suspend2.net>

On Wed, Jan 10, 2007 at 02:20:49PM +1100, Nigel Cunningham wrote:
> Hi.
> 
> On Wed, 2007-01-10 at 09:57 +0800, Fengguang Wu wrote:
> > On Tue, Jan 09, 2007 at 08:23:32AM -0800, Linus Torvalds wrote:
> > >
> > >
> > > On Tue, 9 Jan 2007, Fengguang Wu wrote:
> > > > >
> > > > > The fastest and probably most important thing to add is some readahead
> > > > > smarts to directories --- both to the htree and non-htree cases.  If
> > > >
> > > > Here's is a quick hack to practice the directory readahead idea.
> > > > Comments are welcome, it's a freshman's work :)
> > >
> > > Well, I'd probably have done it differently, but more important is whether
> > > this actually makes a difference performance-wise. Have you benchmarked it
> > > at all?
> > 
> > Yes, a trivial test shows a marginal improvement, on a minimal debian system:
> > 
> > # find / | wc -l
> > 13641
> > 
> > # time find / > /dev/null
> > 
> > real    0m10.000s
> > user    0m0.210s
> > sys     0m4.370s
> > 
> > # time find / > /dev/null
> > 
> > real    0m9.890s
> > user    0m0.160s
> > sys     0m3.270s
> > 
> > > Doing an
> > >
> > > 	echo 3 > /proc/sys/vm/drop_caches
> > >
> > > is your friend for testing things like this, to force cold-cache
> > > behaviour..
> > 
> > Thanks, I'll work out numbers on large/concurrent dir accesses soon.
> 
> I gave it a try, and I'm afraid the results weren't pretty.
> 
> I did:
> 
> time find /usr/src | wc -l
> 
> on current git with (3 times) and without (5 times) the patch, and got
> 
> with:
> real   54.306, 54.327, 53.742s
> usr    0.324, 0.284, 0.234s
> sys    2.432, 2.484, 2.592s
> 
> without:
> real   24.413, 24.616, 24.080s
> usr    0.208, 0.316, 0.312s
> sys:   2.496, 2.440, 2.540s
> 
> Subsequent runs without dropping caches did give a significant
> improvement in both cases (1.821/.188/1.632 is one result I wrote with
> the patch applied).

Thanks, Nigel.
But I'm very sorry that the calculation in the patch was wrong.

Would you give this new patch a run?

It produced pretty numbers here:

#!/bin/zsh

ROOT=/mnt/mnt
TIMEFMT="%E clock  %S kernel  %U user  %w+%c cs  %J"

echo 3 > /proc/sys/vm/drop_caches

# 49: enable dir readahead
# 50: disable
echo ${1:-50} > /proc/sys/vm/readahead_ratio

# time find $ROOT/a > /dev/null

time find /etch > /dev/null

# time find $ROOT/a > /dev/null&
# time grep -r asdf $ROOT/b > /dev/null&
# time cp /etch/KNOPPIX_V5.0.1CD-2006-06-01-EN.iso /dev/null&

exit 0

# collected results on a SATA disk:
# ./test-parallel-dir-reada.sh 49
4.18s clock  0.08s kernel  0.04s user  418+0 cs  find $ROOT/a > /dev/null
4.09s clock  0.10s kernel  0.02s user  410+1 cs  find $ROOT/a > /dev/null

# ./test-parallel-dir-reada.sh 50
12.18s clock  0.15s kernel  0.07s user  1520+4 cs  find $ROOT/a > /dev/null
11.99s clock  0.13s kernel  0.04s user  1558+6 cs  find $ROOT/a > /dev/null


# ./test-parallel-dir-reada.sh 49
4.01s clock  0.06s kernel  0.01s user  1567+2 cs  find /etch > /dev/null
4.08s clock  0.07s kernel  0.00s user  1568+0 cs  find /etch > /dev/null

# ./test-parallel-dir-reada.sh 50
4.10s clock  0.09s kernel  0.01s user  1578+1 cs  find /etch > /dev/null
4.19s clock  0.08s kernel  0.03s user  1578+0 cs  find /etch > /dev/null


# ./test-parallel-dir-reada.sh 49
7.73s clock  0.11s kernel  0.06s user  438+2 cs  find $ROOT/a > /dev/null
18.92s clock  0.43s kernel  0.02s user  1246+13 cs  cp /etch/KNOPPIX_V5.0.1CD-2006-06-01-EN.iso /dev/null
32.91s clock  4.20s kernel  1.55s user  103564+51 cs  grep -r asdf $ROOT/b > /dev/null

8.47s clock  0.10s kernel  0.02s user  442+4 cs  find $ROOT/a > /dev/null
19.24s clock  0.53s kernel  0.03s user  1250+23 cs  cp /etch/KNOPPIX_V5.0.1CD-2006-06-01-EN.iso /dev/null
29.93s clock  4.18s kernel  1.61s user  100425+47 cs  grep -r asdf $ROOT/b > /dev/null

# ./test-parallel-dir-reada.sh 50
17.87s clock  0.57s kernel  0.02s user  1244+21 cs  cp /etch/KNOPPIX_V5.0.1CD-2006-06-01-EN.iso /dev/null
21.30s clock  0.08s kernel  0.05s user  1517+5 cs  find $ROOT/a > /dev/null
49.68s clock  3.94s kernel  1.67s user  101520+57 cs  grep -r asdf $ROOT/b > /dev/null

15.66s clock  0.51s kernel  0.00s user  1248+25 cs  cp /etch/KNOPPIX_V5.0.1CD-2006-06-01-EN.iso /dev/null
22.15s clock  0.15s kernel  0.04s user  1520+5 cs  find $ROOT/a > /dev/null
46.14s clock  4.08s kernel  1.68s user  101517+63 cs  grep -r asdf $ROOT/b > /dev/null

Thanks,
Wu
---

Subject: ext3 readdir readahead

Do readahead for ext3_readdir().

Reasons to be aggressive:
- readdir() users are likely to traverse the whole directory,
  so readahead miss is not a concern.
- most dirs are small, so slow start is not good
- the htree indexing introduces some randomness,
  which can be helped by the aggressiveness.

So we do 128K sized readaheads, at twice the speed of reads.

The following actual readahead pages are collected for a dir with
110000 entries:
	32 31 30 31 28 29 29 28 27 25 29 22 25 30 24 15 19
That means a readahead hit ratio of
	454/541 = 84%

The performance is marginally better for a minimal debian system:
	command:	find /
	baseline:	4.10s	4.19s
	patched:	4.01s	4.08s

And considerably better for 100 directories, each with 1000 8K files:
	command:	find /throwaways
	baseline:	12.18s	11.99s
	patched:	 4.18s	 4.09s

And also noticable better for parallel operations:
					baseline	patched
	find /throwaways &		21.30s 22.15s    7.73s  8.47s 
	grep -r asdf /throwaways2 &     49.68s 46.14s   32.91s 29.93s 
	cp /KNOPPIX_CD.iso /dev/null &  17.87s 15.66s   18.92s 19.24s 

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
---
 fs/ext3/dir.c           |   33 +++++++++++++++++++++++++++++++++
 fs/ext3/inode.c         |    2 +-
 include/linux/ext3_fs.h |    2 ++
 3 files changed, 36 insertions(+), 1 deletion(-)

--- linux.orig/fs/ext3/dir.c
+++ linux/fs/ext3/dir.c
@@ -94,6 +94,28 @@ int ext3_check_dir_entry (const char * f
 	return error_msg == NULL ? 1 : 0;
 }
 
+#define DIR_READAHEAD_BYTES  (128*1024)
+#define DIR_READAHEAD_PGMASK ((DIR_READAHEAD_BYTES >> PAGE_CACHE_SHIFT) - 1)
+
+static void ext3_dir_readahead(struct file * filp)
+{
+	struct inode *inode = filp->f_path.dentry->d_inode;
+	struct address_space *mapping = inode->i_sb->s_bdev->bd_inode->i_mapping;
+	int bbits = inode->i_blkbits;
+	unsigned long blk, end;
+
+	blk = filp->f_ra.prev_page << (PAGE_CACHE_SHIFT - bbits);
+	end = min(inode->i_blocks >> (bbits - 9),
+		  blk + (DIR_READAHEAD_BYTES >> bbits));
+
+	for (; blk < end; blk++) {
+		pgoff_t phy;
+		phy = generic_block_bmap(inode->i_mapping, blk, ext3_get_block)
+				>> (PAGE_CACHE_SHIFT - bbits);
+		do_page_cache_readahead(mapping, filp, phy, 1);
+	}
+}
+
 static int ext3_readdir(struct file * filp,
 			 void * dirent, filldir_t filldir)
 {
@@ -108,6 +130,17 @@ static int ext3_readdir(struct file * fi
 
 	sb = inode->i_sb;
 
+	/*
+	 * Reading-ahead at 2x the page fault rate, in hope of reducing
+	 * readahead misses caused by the partially random htree order.
+	 */
+	filp->f_ra.prev_page += 2;
+	filp->f_ra.prev_page &= ~1;
+
+	if (!(filp->f_ra.prev_page & DIR_READAHEAD_PGMASK) &&
+		filp->f_ra.prev_page < (inode->i_blocks >> (PAGE_CACHE_SHIFT-9)))
+		ext3_dir_readahead(filp);
+
 #ifdef CONFIG_EXT3_INDEX
 	if (EXT3_HAS_COMPAT_FEATURE(inode->i_sb,
 				    EXT3_FEATURE_COMPAT_DIR_INDEX) &&
--- linux.orig/fs/ext3/inode.c
+++ linux/fs/ext3/inode.c
@@ -945,7 +945,7 @@ out:
 
 #define DIO_CREDITS (EXT3_RESERVE_TRANS_BLOCKS + 32)
 
-static int ext3_get_block(struct inode *inode, sector_t iblock,
+int ext3_get_block(struct inode *inode, sector_t iblock,
 			struct buffer_head *bh_result, int create)
 {
 	handle_t *handle = journal_current_handle();
--- linux.orig/include/linux/ext3_fs.h
+++ linux/include/linux/ext3_fs.h
@@ -814,6 +814,8 @@ struct buffer_head * ext3_bread (handle_
 int ext3_get_blocks_handle(handle_t *handle, struct inode *inode,
 	sector_t iblock, unsigned long maxblocks, struct buffer_head *bh_result,
 	int create, int extend_disksize);
+extern int ext3_get_block(struct inode *inode, sector_t iblock,
+			struct buffer_head *bh_result, int create);
 
 extern void ext3_read_inode (struct inode *);
 extern int  ext3_write_inode (struct inode *, int);

  reply	other threads:[~2007-01-10 14:08 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-14 22:37 kernel.org lies about latest -mm kernel Pavel Machek
2006-12-14 23:01 ` Randy Dunlap
2006-12-14 23:38 ` Sergio Monteiro Basto
2006-12-16 17:44 ` [KORG] " Randy Dunlap
2006-12-16 17:57   ` Andrew Morton
2006-12-16 18:02     ` Randy Dunlap
2006-12-16 19:30       ` J.H.
2006-12-16 20:30         ` Russell King
2006-12-26 16:47           ` H. Peter Anvin
2006-12-16 21:21         ` Nigel Cunningham
2006-12-26 16:49           ` H. Peter Anvin
2007-01-07  3:35             ` Nigel Cunningham
2007-01-07  4:10               ` Jeff Garzik
2007-01-07  4:47                 ` Nigel Cunningham
2007-01-07  4:22               ` Jeff Garzik
2007-01-07  4:29                 ` Linus Torvalds
2007-01-07 20:11                 ` Greg KH
2007-01-07 21:30                   ` H. Peter Anvin
2007-01-07  5:17               ` H. Peter Anvin
2007-01-07  5:24                 ` How git affects kernel.org performance H. Peter Anvin
2007-01-07  5:39                   ` Linus Torvalds
2007-01-07  8:55                     ` Willy Tarreau
2007-01-07  8:58                       ` H. Peter Anvin
2007-01-07  9:03                         ` Willy Tarreau
2007-01-07 10:28                           ` Christoph Hellwig
2007-01-07 10:52                             ` Willy Tarreau
2007-01-07 18:17                             ` Linus Torvalds
2007-01-07 19:13                               ` Linus Torvalds
     [not found]                                 ` <9e4733910701071126r7931042eldfb73060792f4f41@mail.gmail.com>
2007-01-07 19:35                                   ` Linus Torvalds
2007-01-07 10:50                           ` Jan Engelhardt
2007-01-07 18:49                             ` Randy Dunlap
2007-01-07 19:07                               ` Jan Engelhardt
2007-01-07 19:28                                 ` Randy Dunlap
2007-01-07 19:37                                   ` Linus Torvalds
2007-01-07  9:15                       ` Andrew Morton
2007-01-07  9:38                         ` Rene Herman
2007-01-08  3:05                         ` Suparna Bhattacharya
2007-01-08 12:58                           ` Theodore Tso
2007-01-08 13:41                             ` Johannes Stezenbach
2007-01-08 13:56                               ` Theodore Tso
2007-01-08 13:59                                 ` Pavel Machek
2007-01-08 14:17                                   ` Theodore Tso
2007-01-08 13:43                             ` Jeff Garzik
2007-01-09  1:09                               ` Paul Jackson
2007-01-09  2:18                                 ` Jeremy Higdon
     [not found]                             ` <20070109075945.GA8799@mail.ustc.edu.cn>
2007-01-09  7:59                               ` Fengguang Wu
2007-01-09 16:23                                 ` Linus Torvalds
     [not found]                                   ` <20070110015739.GA26978@mail.ustc.edu.cn>
2007-01-10  1:57                                     ` Fengguang Wu
2007-01-10  3:20                                     ` Nigel Cunningham
     [not found]                                       ` <20070110140730.GA986@mail.ustc.edu.cn>
2007-01-10 14:07                                         ` Fengguang Wu [this message]
2007-01-12 10:54                                         ` Nigel Cunningham
2007-01-07 14:57                   ` Robert Fitzsimons
2007-01-07 19:12                     ` J.H.
2007-01-08  1:51                     ` Jakub Narebski
2007-01-07 15:06                   ` Krzysztof Halasa
2007-01-07 20:31                     ` Shawn O. Pearce
2007-01-08 14:46                       ` Nicolas Pitre
2007-01-09  4:29                 ` [KORG] Re: kernel.org lies about latest -mm kernel Nigel Cunningham
2007-01-09  5:09                   ` Adrian Bunk
2007-01-09  5:51                     ` Nigel Cunningham
2006-12-17 12:32         ` Pavel Machek
2006-12-17 13:13           ` Jeff Garzik
2006-12-17 18:23         ` Randy Dunlap
2006-12-17 22:37           ` Matti Aarnio
2006-12-18  0:42             ` J.H.
2006-12-19  6:46               ` Willy Tarreau
2006-12-19  7:39                 ` J.H.
2006-12-19 13:32                   ` Willy Tarreau
2006-12-19 14:36                   ` Dave Jones
2006-12-19 14:38                     ` Willy Tarreau
2006-12-26 16:14                     ` H. Peter Anvin
2007-01-08 20:10           ` Jean Delvare
2006-12-19  6:34         ` Willy Tarreau
2006-12-19  6:52           ` J.H.
2007-01-06 18:33             ` Randy Dunlap
2007-01-06 19:18               ` H. Peter Anvin
2007-01-06 19:35                 ` Willy Tarreau
2007-01-06 19:37                 ` Nicholas Miell
2007-01-06 20:13                   ` Andrew Morton
2007-01-06 20:18                     ` H. Peter Anvin
2007-03-19 19:27                       ` [PATCH] sysctl: vfs_cache_divisor Randy Dunlap
2007-03-19 20:36                         ` Andrew Morton
2007-03-19 20:42                           ` Randy Dunlap
2007-03-20  4:22                             ` H. Peter Anvin
2007-03-21 23:01                               ` Randy Dunlap
2007-03-21 23:11                                 ` Andrew Morton
2007-03-23  0:07                                   ` Kyle Moffett
2007-03-23 20:36                                     ` Randy Dunlap
2007-03-23 20:59                                       ` H. Peter Anvin
2007-03-24  0:45                                         ` Kyle Moffett
2007-03-24  1:17                                           ` Kyle Moffett
2007-03-20 19:53                         ` Ingo Oeser
2007-01-06 23:50                     ` [KORG] Re: kernel.org lies about latest -mm kernel H. Peter Anvin
2007-01-06 20:13                 ` Jeff Garzik
2007-01-06 20:17                   ` Andrew Morton
2007-01-06 20:20                     ` H. Peter Anvin
2007-01-06 20:36                       ` Andrew Morton
2007-01-06 19:21               ` J.H.
2007-01-07 19:52                 ` Randy Dunlap
2007-01-07 23:56                   ` H. Peter Anvin
2006-12-26 17:02           ` H. Peter Anvin
2007-01-08 19:31             ` Jean Delvare
2007-01-08 19:37               ` Willy Tarreau
2007-01-08 22:05                 ` Jean Delvare
2006-12-19 15:37         ` Tim Schmielau
2007-01-08 21:20         ` Jean Delvare
2007-01-08 21:33           ` J.H.
2007-01-09  7:01             ` Jean Delvare
2007-01-09  7:25               ` J.H.
2007-01-09 13:36                 ` Jean Delvare

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=368438013.19600@ustc.edu.cn \
    --to=fengguang.wu@gmail.com \
    --cc=akpm@osdl.org \
    --cc=git@vger.kernel.org \
    --cc=hpa@zytor.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nigel@nigel.suspend2.net \
    --cc=pavel@ucw.cz \
    --cc=randy.dunlap@oracle.com \
    --cc=suparna@in.ibm.com \
    --cc=torvalds@osdl.org \
    --cc=tytso@mit.edu \
    --cc=w@1wt.eu \
    --cc=warthog9@kernel.org \
    --cc=webmaster@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).