linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Daniel Phillips <phillips@arcor.de>,
	Chris Friesen <cfriesen@nortelnetworks.com>,
	Andrew Morton <akpm@digeo.com>,
	"Martin J. Bligh" <mbligh@aracnet.com>,
	Oliver Neukum <oliver@neukum.name>,
	Rob Landley <landley@trommello.org>,
	linux-kernel@vger.kernel.org
Subject: Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 -  (NUMA))
Date: Mon, 7 Oct 2002 20:39:04 -0400	[thread overview]
Message-ID: <20021008003903.GA473@think.thunk.org> (raw)
In-Reply-To: <Pine.LNX.4.33.0210071229080.10168-100000@penguin.transmeta.com>

On Mon, Oct 07, 2002 at 12:35:26PM -0700, Linus Torvalds wrote:
> 
> On Mon, 7 Oct 2002, Daniel Phillips wrote:
> > 
> > Ext2 likes to spread directory inodes around the volume so that there is
> > room to keep the associated file blocks nearby.  This interacts rather
> > poorly with readahead.
> 
> Not a read-ahead problem. It interacts rather poory _full_stop_.
> 
> It means that the inode tables are spread all out, the bitmaps are
> fragmented etc, so the disk head has to move all over the disk even when
> only working with one directory tree like the kernel sources.

It depends on what you are doing.  BSD, and even XFS, uses the concept
of using cylinder groups or block groups as one of many tools to avoid
file fragmentation and to concetrate locality for files within a
directory.  The reason why FAT filesystems have file fragmentation
problems in far more worse way is because they attempt don't have the
concept of a block group, and simply always allocate from the
beginning of the filesystem.  This is effectively what would happen if
you had a single block/cylinder group in the filesystem.

> So the problem with spreading stuff out doesn't have anything to do with 
> read-ahead, and has everything to do with the basic issue of BAD LOCALITY. 
> Locality is _good_, independently of read-ahead and independently of 
> medium. 

Ironically, as I mentioned, one of the reasons behind the block group
scheme is to *increase* locality for files within a particular
directory.  As you point out quite correctly, though, it tends to
destroy locality across an entire directory tree.

Maybe the answer is that we need some way of declaring that some
directory is the root of "a directory tree".  That way, the filesystem
can keep directories underneath the directory tree close together, and
the filesystem can try to keep directory trees far apart from each
other.  

In order to do something like this, we would just need a filesystem
API extension to allow programs like tar and bitkeeper to give a hint
that a new directory tree is being established --- and ideally, it
needs to be done at mkdir time, so that the filesystem can perform
appropriate do a better job of deciding where to place the initial
root of the "directory tree".  Things would also work if you declared
some directory tree to be the root of a "directory tree" after the
directory was initially created, but the allocation hueristics
wouldn't be nearly as effective.

Linus, what do you think about defining a new flag which could be
passed as part of the mode bits to mkdir()?  If we allow the
filesystem to get some additional hints from userspace about what the
difference between /usr/src/linux (where directory spreading is a bad
idea) and /usr/home (where directory spreading is a very good idea),
it would make life for the filesystem much easier.

						- Ted


  reply	other threads:[~2002-10-08  0:34 UTC|newest]

Thread overview: 206+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-09-24  1:54 [PATCH-RFC] 4 of 4 - New problem logging macros, SCSI RAID device driver Larry Kessler
2002-09-24  2:22 ` Jeff Garzik
2002-09-26 15:52   ` Alan Cox
2002-09-26 22:55     ` [PATCH-RFC] 4 of 4 - New problem logging macros, SCSI RAIDdevice driver Larry Kessler
2002-09-26 22:58       ` Jeff Garzik
2002-09-26 23:07         ` Linus Torvalds
2002-09-27  2:27           ` Jeff Garzik
2002-09-27  4:45             ` Linus Torvalds
2002-09-28  7:46               ` Ingo Molnar
2002-09-28  9:16                 ` jw schultz
2002-09-30 14:05                   ` Denis Vlasenko
2002-09-30 10:22                     ` Tomas Szepe
2002-09-30 11:10                       ` jw schultz
2002-09-30 11:17                       ` Adrian Bunk
2002-09-30 19:48                       ` Rik van Riel
2002-09-30 20:30                         ` Christoph Hellwig
2002-09-28 15:40                 ` Kernel version [Was: Re: [PATCH-RFC] 4 of 4 - New problem logging macros, SCSI RAIDdevice driver] Horst von Brand
2002-09-29  1:31                 ` v2.6 vs v3.0 Linus Torvalds
2002-09-29  6:14                   ` james
2002-09-29  6:55                     ` Andre Hedrick
2002-09-29 12:59                     ` Gerhard Mack
2002-09-29 13:46                       ` Dr. David Alan Gilbert
2002-09-29 14:06                         ` Wakko Warner
2002-09-29 15:42                         ` Jens Axboe
2002-09-29 16:21                           ` Alan Cox
2002-09-29 16:17                             ` Jens Axboe
2002-09-30  0:39                             ` Jeff Chua
2002-09-29 16:22                           ` Dave Jones
2002-09-29 16:26                             ` Jens Axboe
2002-09-29 21:46                             ` Matthias Andree
2002-09-30  7:05                               ` Michael Clark
2002-09-30  7:22                                 ` Andrew Morton
2002-09-30 13:08                                   ` Kevin Corry
2002-09-30 13:05                                 ` Kevin Corry
2002-09-30 13:49                                   ` Michael Clark
2002-09-30 14:26                                     ` Kevin Corry
2002-09-30 13:59                                   ` Michael Clark
2002-09-30 15:50                                     ` Kevin Corry
2002-09-29 17:06                       ` Jochen Friedrich
2002-09-29 15:18                     ` Trever L. Adams
2002-09-29 15:45                       ` Jens Axboe
2002-09-29 15:59                         ` Trever L. Adams
2002-09-29 16:06                           ` Jens Axboe
2002-09-29 16:13                             ` Trever L. Adams
2002-09-30  6:54                               ` Kai Henningsen
2002-09-30 18:40                                 ` Bill Davidsen
2002-10-01 12:38                                   ` Matthias Andree
2002-10-04 19:58                                     ` Bill Davidsen
2002-09-29 17:42                     ` Linus Torvalds
2002-09-29 17:54                       ` Rik van Riel
2002-09-29 18:24                       ` Alan Cox
2002-09-30  7:56                         ` Jens Axboe
2002-09-30  9:53                           ` Andre Hedrick
2002-09-30 11:54                             ` Jens Axboe
2002-09-30 12:58                           ` Alan Cox
2002-09-30 13:05                             ` Jens Axboe
2002-10-01  2:17                               ` Andre Hedrick
2002-09-30 16:39                       ` jbradford
2002-09-30 16:47                     ` Pau Aliagas
2002-09-29  7:16                   ` jbradford
2002-09-29  8:08                     ` Jeff Garzik
2002-09-29  8:17                     ` David S. Miller
2002-09-29  9:12                     ` Jens Axboe
2002-09-29 11:19                       ` Murray J. Root
2002-09-29 15:50                         ` Jens Axboe
2002-09-30  7:01                           ` Kai Henningsen
2002-09-29 16:04                         ` Zwane Mwaikambo
2002-09-29 14:56                       ` Alan Cox
2002-09-29 15:38                         ` Jens Axboe
2002-09-29 16:30                           ` Dave Jones
2002-09-29 16:42                           ` Bjoern A. Zeeb
2002-09-29 21:16                           ` Russell King
2002-09-29 21:32                             ` Alan Cox
2002-09-29 21:49                             ` steve
2002-09-29 21:52                           ` Matthias Andree
2002-09-30  7:31                             ` Tomas Szepe
2002-09-30 15:33                           ` Jan Harkes
2002-09-30 18:13                           ` Jeff Willis
2002-09-29 17:48                         ` Linus Torvalds
2002-09-29 18:13                           ` Jaroslav Kysela
2002-09-30 19:32                       ` Bill Davidsen
2002-10-01  6:26                         ` Jens Axboe
2002-10-01  7:54                           ` Mikael Pettersson
2002-10-01  8:27                             ` Jens Axboe
2002-10-01  8:44                               ` jbradford
2002-10-01 11:31                             ` Alan Cox
2002-10-01 11:25                               ` Jens Axboe
2002-09-29 15:34                     ` Andi Kleen
2002-09-29 17:26                       ` Jochen Friedrich
2002-09-29 17:35                         ` Jeff Garzik
2002-09-30  0:00                         ` Andi Kleen
2002-10-01 19:28                         ` IPv6 stability (success story ;) Petr Baudis
2002-09-29  9:15                   ` v2.6 vs v3.0 Jens Axboe
2002-09-29 19:53                     ` james
2002-09-29 15:26                   ` Matthias Andree
2002-09-29 16:24                     ` Alan Cox
2002-09-29 22:00                       ` Matthias Andree
2002-09-30 19:02                       ` Bill Davidsen
2002-09-30 18:37                   ` Bill Davidsen
2002-10-03 15:51               ` [OT] 2.6 not 3.0 - (WAS Re: [PATCH-RFC] 4 of 4 - New problem logging macros, SCSI RAIDdevice) jbradford
2002-10-03 15:57                 ` Linus Torvalds
2002-10-03 16:16                   ` [OT] 2.6 not 3.0 - (WAS Re: [PATCH-RFC] 4 of 4 - New problem jbradford
2002-10-03 22:30                     ` Greg KH
2002-10-04  6:33                       ` jbradford
2002-10-04  6:37                         ` Greg KH
2002-10-04  7:17                           ` jbradford
2002-10-04  7:30                             ` Greg KH
2002-10-03 16:37                   ` [OT] 2.6 not 3.0 - (WAS Re: [PATCH-RFC] 4 of 4 - New problem logging macros, SCSI RAIDdevice) Alan Cox
2002-10-03 16:56                     ` Linus Torvalds
2002-10-03 17:40                       ` Alan Cox
2002-10-03 19:55                       ` jlnance
2002-10-03 16:51                   ` Dave Jones
2002-10-03 17:04                     ` Alan Cox
2002-10-03 20:43                     ` Andrew Morton
2002-10-03 22:05                       ` Dave Jones
2002-10-04  3:46                         ` Andreas Boman
2002-10-04  7:44                         ` jbradford
2002-10-03 19:51                   ` Rik van Riel
2002-10-04 22:26                   ` [OT] 2.6 not 3.0 - (NUMA) Martin J. Bligh
2002-10-04 23:13                     ` Linus Torvalds
2002-10-05  0:21                       ` Martin J. Bligh
2002-10-05  0:36                         ` Linus Torvalds
2002-10-05  1:25                           ` Michael Hohnbaum
2002-10-05 20:30                       ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA)) Rob Landley
2002-10-06  2:15                         ` Andrew Morton
2002-10-06  9:42                           ` Russell King
2002-10-06 17:06                             ` Alan Cox
2002-10-06 13:44                           ` Oliver Neukum
2002-10-06 15:19                             ` Martin J. Bligh
2002-10-06 15:14                               ` Oliver Neukum
2002-10-07  8:08                               ` Helge Hafting
2002-10-07  9:18                                 ` Oliver Neukum
2002-10-07 14:11                                   ` Jan Hudec
2002-10-07 15:01                                     ` Jesse Pollard
2002-10-07 15:34                                       ` Jan Hudec
2002-10-08  3:12                                         ` [OT] " Scott Mcdermott
2002-10-10 23:49                                           ` Mike Fedyk
2002-10-07 15:15                                   ` Martin J. Bligh
2002-10-08 13:49                                   ` Helge Hafting
2002-10-07 17:43                               ` Daniel Phillips
2002-10-07 18:31                                 ` Andrew Morton
2002-10-07 18:51                                   ` Linus Torvalds
2002-10-07 20:14                                     ` Alan Cox
2002-10-07 20:31                                       ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not3.0 " Andrew Morton
2002-10-07 20:46                                         ` Linus Torvalds
2002-10-07 20:44                                       ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 " Linus Torvalds
2002-10-07 21:16                                         ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not3.0 " Andrew Morton
2002-10-07 23:47                                           ` jw schultz
2002-10-11  0:02                                           ` Mike Fedyk
2002-10-07 18:58                                   ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 " Chris Friesen
2002-10-07 19:21                                     ` Daniel Phillips
2002-10-07 19:35                                       ` Linus Torvalds
2002-10-08  0:39                                         ` Theodore Ts'o [this message]
2002-10-08  2:59                                           ` Andrew Morton
2002-10-08 16:15                                             ` Theodore Ts'o
2002-10-08 19:39                                               ` Andrew Morton
2002-10-08 17:06                                                 ` Rob Landley
2002-10-07 19:36                                     ` Andrew Morton
2002-10-08  2:36                                       ` Simon Kirby
2002-10-08  2:47                                         ` Daniel Phillips
2002-10-08  2:50                                         ` Andrew Morton
2002-10-08  2:54                                           ` Simon Kirby
2002-10-08  3:00                                             ` Andrew Morton
2002-10-08 16:17                                               ` Theodore Ts'o
2002-10-08 12:49                                           ` jlnance
2002-10-08 17:09                                             ` Andrew Morton
2002-10-10 20:53                                               ` Thomas Zimmerman
2002-10-08 13:54                                       ` Helge Hafting
2002-10-08 15:31                                         ` Andreas Dilger
2002-10-07 19:05                                   ` Daniel Phillips
2002-10-07 19:24                                     ` Linus Torvalds
2002-10-07 20:02                                       ` Daniel Phillips
2002-10-07 20:14                                         ` Andrew Morton
2002-10-07 20:22                                           ` Daniel Phillips
2002-10-07 20:28                                         ` Linus Torvalds
2002-10-07 21:16                                           ` Daniel Phillips
2002-10-07 21:55                                             ` Linus Torvalds
2002-10-07 22:02                                               ` Daniel Phillips
2002-10-07 22:12                                                 ` Andrew Morton
2002-10-08  8:49                                                   ` Padraig Brady
2002-10-07 22:14                                             ` Charles Cazabon
2002-10-30 18:26                                   ` Lee Leahu
2002-10-06  6:33                         ` Martin J. Bligh
2002-10-07  5:28                         ` John Alvord
2002-10-07  8:39                         ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 n Giuliano Pochini
2002-10-07 13:56                         ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA)) Jesse Pollard
2002-10-07 14:03                           ` Rob Landley
2002-10-08 22:14                             ` Jesse Pollard
2002-10-08 19:11                               ` Rob Landley
2002-10-09  8:17                             ` Alexander Kellett
2002-10-07 18:22                           ` Daniel Phillips
2002-10-08  8:19                             ` Jan Hudec
2002-10-11 23:53                         ` Hans Reiser
2002-10-11 20:26                           ` Rob Landley
2002-10-12  4:14                             ` Nick LeRoy
2002-10-13 17:27                               ` Rob Landley
2002-10-12 10:03                             ` Hans Reiser
2002-10-13 17:32                               ` Rob Landley
2002-10-13 23:51                                 ` Hans Reiser
2002-10-14 16:33                                   ` Rob Landley
2002-10-14  7:10                                 ` Nikita Danilov
2002-10-21 15:36                                   ` [OT] Please don't call it 3.0!! (was Re: The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA))) Calin A. Culianu
2002-10-21 16:20                                     ` Wakko Warner
2002-10-12 11:42                             ` The reason to call it 3.0 is the desktop (was Re: [OT] 2.6 not 3.0 - (NUMA)) Matthias Andree
2002-10-12 14:56                               ` Hugh Dickins
2002-09-27 11:32       ` [PATCH-RFC] 4 of 4 - New problem logging macros, SCSI RAIDdevice driver Alan Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20021008003903.GA473@think.thunk.org \
    --to=tytso@mit.edu \
    --cc=akpm@digeo.com \
    --cc=cfriesen@nortelnetworks.com \
    --cc=landley@trommello.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbligh@aracnet.com \
    --cc=oliver@neukum.name \
    --cc=phillips@arcor.de \
    --cc=torvalds@transmeta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).