From: Andrew Morton <akpm@osdl.org>
To: Willy Tarreau <w@1wt.eu>
Cc: Linus Torvalds <torvalds@osdl.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	git@vger.kernel.org, nigel@nigel.suspend2.net,
	"J.H." <warthog9@kernel.org>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	Pavel Machek <pavel@ucw.cz>,
	kernel list <linux-kernel@vger.kernel.org>,
	webmaster@kernel.org,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: How git affects kernel.org performance
Date: Sun, 7 Jan 2007 01:15:42 -0800
Message-ID: <20070107011542.3496bc76.akpm@osdl.org>
In-Reply-To: <20070107085526.GR24090@1wt.eu>

On Sun, 7 Jan 2007 09:55:26 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Sat, Jan 06, 2007 at 09:39:42PM -0800, Linus Torvalds wrote:
> > 
> > 
> > On Sat, 6 Jan 2007, H. Peter Anvin wrote:
> > > 
> > > During extremely high load, it appears that what slows kernel.org down more
> > > than anything else is the time that each individual getdents() call takes.
> > > When I've looked at this I've observed times from 200 ms to almost 2 seconds!
> > > Since an unpacked *OR* unpruned git tree adds 256 directories to a cleanly
> > > packed tree, you can do the math yourself.
> > 
> > "getdents()" is totally serialized by the inode semaphore. It's one of the 
> > most expensive system calls in Linux, partly because of that, and partly 
> > because it has to call all the way down into the filesystem in a way that 
> > almost no other common system call has to (99% of all filesystem calls can 
> > be handled basically at the VFS layer with generic caches - but not 
> > getdents()).
> > 
> > So if there are concurrent readdirs on the same directory, they get 
> > serialized. If there is any file creation/deletion activity in the 
> > directory, it serializes getdents(). 
> > 
> > To make matters worse, I don't think it has any read-ahead at all when you 
> > use hashed directory entries. So if you have cold-cache case, you'll read 
> > every single block totally individually, and serialized. One block at a 
> > time (I think the non-hashed case is likely also suspect, but that's a 
> > separate issue)
> > 
> > In other words, I'm not at all surprised it hits on filldir time. 
> > Especially on ext3.
> 
> At work, we had the same problem on a file server with ext3. We use rsync
> to make backups to a local IDE disk, and we noticed that getdents() took
> about the same time as Peter reports (0.2 to 2 seconds), especially in
> maildir directories. We tried many things to fix it with no result,
> including enabling dirindexes. Finally, we made a full backup, and switched
> over to XFS and the problem totally disappeared. So it seems that the
> filesystem matters a lot here when there are lots of entries in a
> directory, and that ext3 is not suitable for usages with thousands
> of entries in directories with millions of files on disk. I'm not
> certain it would be that easy to try other filesystems on kernel.org
> though :-/
> 

Yeah, slowly-growing directories will get splattered all over the disk.

Possible short-term fixes would be to just allocate up to (say) eight
blocks when we grow a directory by one block.  Or teach the
directory-growth code to use ext3 reservations.
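
To make the eight-block idea concrete, here is a rough toy model.  It is
not ext3 code: the bump allocator, the function names and the
files_between_growths parameter are all made up for illustration.  It
only shows how many separate extents a slowly-growing directory ends up
with when unrelated file data is allocated between each growth.

```python
def count_extents(blocks):
    """Count runs of physically contiguous block numbers."""
    extents = 0
    prev = None
    for b in blocks:
        if prev is None or b != prev + 1:
            extents += 1
        prev = b
    return extents


def grow_directory(dir_blocks_needed, chunk, files_between_growths=3):
    """Toy bump allocator (hypothetical, not the ext3 allocator).

    Returns the physical block numbers used by one directory that grows
    one block at a time while other files keep consuming disk blocks.
    """
    next_free = 0
    dir_blocks = []
    reserved = []  # blocks set aside for the directory but not yet used
    for _ in range(dir_blocks_needed):
        if not reserved:
            # Grab `chunk` contiguous blocks in one go.
            reserved = list(range(next_free, next_free + chunk))
            next_free += chunk
        dir_blocks.append(reserved.pop(0))
        # Unrelated file data lands right after the current allocation point.
        next_free += files_between_growths
    return dir_blocks


if __name__ == "__main__":
    for chunk in (1, 8):
        blocks = grow_directory(64, chunk)
        print(f"chunk={chunk}: {count_extents(blocks)} extents")
```

In this model a 64-block directory grown one block at a time lands in 64
separate places, while growing it eight blocks at a time leaves 8
contiguous runs.  A real allocator is far less tidy than this, so treat
the numbers as illustrative only.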

Longer-term, people are talking about things like on-disk reservations.
But I expect directories are being forgotten about in all of that.
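
For anyone wanting to compare directories or filesystems against the
getdents() times quoted above, a minimal sketch along these lines may
help.  The helper name time_listing is made up; os.scandir() is used
because on Linux it reads the directory via getdents64().  Note that a
warm dentry/page cache will hide the problem, so the interesting numbers
come from a cold cache.

```python
import os
import time


def time_listing(path):
    """Return (entry_count, seconds) for one full pass over a directory."""
    start = time.perf_counter()
    with os.scandir(path) as it:
        count = sum(1 for _ in it)
    return count, time.perf_counter() - start


if __name__ == "__main__":
    import sys

    # Usage: python time_listing.py /some/large/directory
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    entries, seconds = time_listing(target)
    print(f"{target}: {entries} entries in {seconds * 1000:.1f} ms")
```

This times the whole listing rather than individual system calls; for
per-call figures something like strace -T on the same process is the
more direct tool.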

