* file system technical comparisons
@ 2004-01-02 21:38 Steve Glines
  2004-01-05  9:42 ` venom
  2004-01-09 19:32 ` Stewart Smith
  0 siblings, 2 replies; 14+ messages in thread
From: Steve Glines @ 2004-01-02 21:38 UTC
  To: linux-kernel

I'm looking for a technical comparison between the major file systems. 
At a minimum I'd like to see a comparison between ext3, reiserfs, xfs 
and jfs. In the oh so perfect world I'd like to see detailed info on all 
supported file systems.

Please CC or mail me directly as I am not a subscriber to this list.

Thanks
-- 
Steve Glines

In theory, there's no difference between theory and practice, but in 
practice there is.




* Re: file system technical comparisons
  2004-01-02 21:38 file system technical comparisons Steve Glines
@ 2004-01-05  9:42 ` venom
  2004-01-05 11:04   ` Hans Reiser
  2004-01-05 17:37   ` Randy.Dunlap
  2004-01-09 19:32 ` Stewart Smith
  1 sibling, 2 replies; 14+ messages in thread
From: venom @ 2004-01-05  9:42 UTC
  To: Steve Glines; +Cc: linux-kernel


http://www.linux-mag.com/2002-10/jfs_01.html

Some of its points are debatable, but it is a good starting point.

If you read Italian, I can send you another article, published in three parts
in Linux&C (http://www.oltrelinux.com), about the journaling filesystems
available in the Linux kernel.

Best,

Luigi


* Re: file system technical comparisons
  2004-01-05  9:42 ` venom
@ 2004-01-05 11:04   ` Hans Reiser
  2004-01-05 17:08     ` venom
  2004-01-05 17:37   ` Randy.Dunlap
  1 sibling, 1 reply; 14+ messages in thread
From: Hans Reiser @ 2004-01-05 11:04 UTC
  To: venom; +Cc: Steve Glines, linux-kernel

You can read www.namesys.com for a description of reiser4, and 
www.namesys.com/benchmarks.html for some benchmarks.

There are no well done independent benchmarks unfortunately.

Of my competitors, and not considering ReiserFS (about which I am not 
objective), I would say that if you don't have really large files and 
don't have any large directories, ext3 offers the best performance.

If you have large streaming files, look at XFS.   Don't use XFS for 
files smaller than 100k, as the last time I tested against it its 
metadata updates tended to be slow, and that starts to matter at <100k 
file sizes.

JFS has never done very well in the benchmarks I run, which is why I 
tend to compare us mostly to ext3.

If you are willing to consider ReiserFS, V3 is the journaling filesystem 
that has been out the longest and receives the fewest updates (we 
are all working on V4), so it is the most stable.  I'll let others 
comment on its performance.

V4 is far higher performance than V3, but not quite fully stable yet.  
Some brave people are using it though.  Hopefully we will ship something 
stable this month.

Hans


* Re: file system technical comparisons
  2004-01-05 11:04   ` Hans Reiser
@ 2004-01-05 17:08     ` venom
  2004-01-05 17:18       ` Hans Reiser
  0 siblings, 1 reply; 14+ messages in thread
From: venom @ 2004-01-05 17:08 UTC
  To: Hans Reiser; +Cc: Steve Glines, linux-kernel


I have already taken a look at reiser4, but I have some difficulties with
the dancing tree structure.

I was definitely waiting for an extent-based FS, though, since we talked
about this more than a year and a half ago, when you described your plans
for reiser to me in an interview for Linux&C.

Now I will have to find some time to set up a test, possibly on a 64-bit CPU.

Best,

Luigi




* Re: file system technical comparisons
  2004-01-05 17:08     ` venom
@ 2004-01-05 17:18       ` Hans Reiser
  2004-01-06 11:58         ` venom
  0 siblings, 1 reply; 14+ messages in thread
From: Hans Reiser @ 2004-01-05 17:18 UTC
  To: venom; +Cc: Steve Glines, linux-kernel

venom@sns.it wrote:

>I have already taken a look at reiser4, but I have some difficulties with
>the dancing tree structure.
>
>  
>
Difficulties meaning that you don't think it is a good 
structure/algorithm, or that you don't find the docs understandable?

-- 
Hans




* Re: file system technical comparisons
  2004-01-05  9:42 ` venom
  2004-01-05 11:04   ` Hans Reiser
@ 2004-01-05 17:37   ` Randy.Dunlap
  2004-01-06 12:04     ` venom
  1 sibling, 1 reply; 14+ messages in thread
From: Randy.Dunlap @ 2004-01-05 17:37 UTC
  To: venom; +Cc: sglines, linux-kernel

A couple of years ago I did a journaling filesystems comparison
for 2.4.x, so it's quite dated.  I wouldn't pay much attention to
the performance numbers from then.

You can get some other (non-performance) comparison info by looking
at <http://developer.osdl.org/rddunlap/journal_fs/lwe-jgfs.pdf>.

--
~Randy
MOTD:  Always include version info.


* Re: file system technical comparisons
  2004-01-05 17:18       ` Hans Reiser
@ 2004-01-06 11:58         ` venom
  2004-01-06 12:07           ` Hans Reiser
  0 siblings, 1 reply; 14+ messages in thread
From: venom @ 2004-01-06 11:58 UTC
  To: Hans Reiser; +Cc: Steve Glines, linux-kernel


Meaning that there is something I am not sure I have understood.

Luigi


* Re: file system technical comparisons
  2004-01-05 17:37   ` Randy.Dunlap
@ 2004-01-06 12:04     ` venom
  2004-01-06 14:55       ` Hans Reiser
  0 siblings, 1 reply; 14+ messages in thread
From: venom @ 2004-01-06 12:04 UTC
  To: Randy.Dunlap; +Cc: sglines, linux-kernel


What would be interesting is a new comparison between ReiserFS, reiser4 and
the latest XFS. To be honest, I think ext3, with or without HTree, is obsolete,
which is obvious if you consider its origins. I will not say much about JFS:
technically it is interesting, but the benchmarks I ran more than a year ago
were not encouraging, and it was buggy when a directory contained too many
"small" files.

Luigi



* Re: file system technical comparisons
  2004-01-06 11:58         ` venom
@ 2004-01-06 12:07           ` Hans Reiser
  2004-01-06 23:48             ` venom
  0 siblings, 1 reply; 14+ messages in thread
From: Hans Reiser @ 2004-01-06 12:07 UTC
  To: venom; +Cc: Steve Glines, linux-kernel

venom@sns.it wrote:

>Meaning that there is something I am not sure I have understood.
>
>Luigi
>
>  
>
Balanced trees squish things together at every modification of the 
tree.  Dancing trees squish things together when they get low on RAM, 
which is less often.  This means that we can afford to squish tighter, 
because we do it less often.
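
The batching argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not reiser4 code: each insert is assumed to dirty one in-memory node, and the hypothetical `mem_limit` parameter stands in for the memory-pressure signal. The names `pack_stats`, `simulate_eager` and `simulate_deferred` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Counters for a toy packing policy (hypothetical, for illustration). */
struct pack_stats {
	size_t passes;		/* number of packing passes performed */
	size_t nodes_packed;	/* total nodes handled across all passes */
};

/* Eager policy: repack after every single modification. */
static struct pack_stats simulate_eager(size_t inserts)
{
	struct pack_stats s = { 0, 0 };
	size_t i;

	for (i = 0; i < inserts; i++) {
		s.passes++;
		s.nodes_packed++;
	}
	return s;
}

/*
 * Deferred policy: let dirty nodes accumulate and repack only when
 * mem_limit of them are pending, i.e. under simulated memory pressure.
 */
static struct pack_stats simulate_deferred(size_t inserts, size_t mem_limit)
{
	struct pack_stats s = { 0, 0 };
	size_t dirty = 0, i;

	assert(mem_limit > 0);
	for (i = 0; i < inserts; i++) {
		dirty++;
		if (dirty >= mem_limit) {
			s.passes++;
			s.nodes_packed += dirty;
			dirty = 0;
		}
	}
	if (dirty > 0) {	/* final flush of whatever is left */
		s.passes++;
		s.nodes_packed += dirty;
	}
	return s;
}
```

Both policies handle the same number of nodes in total, but the deferred one does so in far fewer passes, each of which sees a larger batch. The model says nothing about how tightly either policy packs or about real I/O cost.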

-- 
Hans




* Re: file system technical comparisons
  2004-01-06 12:04     ` venom
@ 2004-01-06 14:55       ` Hans Reiser
  2004-01-06 20:32         ` Theodore Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Hans Reiser @ 2004-01-06 14:55 UTC
  To: venom; +Cc: Randy.Dunlap, sglines, linux-kernel

venom@sns.it wrote:

>What would be interesting is a new comparison between ReiserFS, reiser4 and
>the latest XFS. To be honest, I think ext3, with or without HTree, is obsolete,
>which is obvious if you consider its origins. I will not say much about JFS:
>technically it is interesting, but the benchmarks I ran more than a year ago
>were not encouraging, and it was buggy when a directory contained too many
>"small" files.
>
>Luigi
>
>  
>
Actually I agree with you that JFS is architecturally much more 
interesting than ext3 (though Andrew Morton's readahead code for ext* is 
beautiful stuff).  I haven't really looked at why JFS is slow, though 
usually being slow at <100k sized files in a journaling filesystem is 
due to the journaling code.  The thing about performance is that the 
mistakes count for 4x what the things done right count for.  Chris Mason 
did a lot for V3's performance compared to the competition by writing 
nice journaling code for us.

htree has performance problems that are due to its architecture --- I 
think this is why they don't make it on by default --- it actually slows 
ext3 down substantially for average directory sizes.....  you can see 
that on our benchmarks page, or just by copying around some copies of 
the linux kernel yourself with it on and off.

-- 
Hans




* Re: file system technical comparisons
  2004-01-06 14:55       ` Hans Reiser
@ 2004-01-06 20:32         ` Theodore Ts'o
  0 siblings, 0 replies; 14+ messages in thread
From: Theodore Ts'o @ 2004-01-06 20:32 UTC
  To: Hans Reiser; +Cc: venom, Randy.Dunlap, sglines, linux-kernel

On Tue, Jan 06, 2004 at 05:55:24PM +0300, Hans Reiser wrote:
> 
> htree has performance problems that are due to its architecture --- I 
> think this is why they don't make it on by default --- it actually slows 
> ext3 down substantially for average directory sizes.....  you can see 
> that on our benchmarks page, or just by copying around some copies of 
> the linux kernel yourself with it on and off.

Actually, the reason we didn't enable it by default was more a matter
of conservatism; it's a new feature, and during the bug shakedown
phase we didn't want to impact users with some of the initial memory
leaks, NFS server incompatibilities, etc., which have all since been
fixed.

Htree has performance problems for certain workloads --- specifically,
workloads that do a readdir() followed by a stat().  This is because
readdir() returns inodes in hash value order, instead of in the order
that the files were created (which generally meant increasing inode
order).  Because accesses to the inode table now become random, this
adversely impacts certain workloads, such as the kernel tar and untar
benchmark.  

This can be easily fixed by changing the application to sort the
inodes by inode number, or by using an LD_PRELOAD library to do the
sorting in userspace.  (See attached).
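
A minimal sketch of the application-side fix described above, assuming a program that wants to stat() every entry in one directory. The names `struct entry`, `entry_cmp` and `stat_dir_in_inode_order` are illustrative only and are not part of the attached spd_readdir.c:

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* One remembered directory entry (hypothetical helper type). */
struct entry {
	ino_t ino;
	char name[256];
};

/* qsort() comparator: ascending inode number, without subtraction overflow. */
static int entry_cmp(const void *a, const void *b)
{
	const struct entry *ea = a;
	const struct entry *eb = b;

	if (ea->ino < eb->ino)
		return -1;
	return ea->ino > eb->ino;
}

/*
 * Read all entries of `path`, sort them by inode number, then stat() them
 * in that order.  Returns how many stat() calls succeeded, or -1 on error.
 */
static long stat_dir_in_inode_order(const char *path)
{
	struct entry *list = NULL, *tmp;
	size_t num = 0, max = 0, i;
	struct dirent *d;
	struct stat st;
	char full[4096];
	long ok = 0;
	DIR *dir;

	assert(path != NULL);
	dir = opendir(path);
	if (!dir)
		return -1;

	/* Pass 1: collect names and inode numbers in readdir() order. */
	while ((d = readdir(dir)) != NULL) {
		if (num >= max) {
			max += 100;
			tmp = realloc(list, max * sizeof(*list));
			if (!tmp) {
				free(list);
				closedir(dir);
				return -1;
			}
			list = tmp;
		}
		list[num].ino = d->d_ino;
		snprintf(list[num].name, sizeof(list[num].name), "%s",
			 d->d_name);
		num++;
	}
	closedir(dir);

	/* Pass 2: visit the inode table sequentially instead of randomly. */
	if (num > 1)
		qsort(list, num, sizeof(*list), entry_cmp);
	for (i = 0; i < num; i++) {
		snprintf(full, sizeof(full), "%s/%s", path, list[i].name);
		if (stat(full, &st) == 0)
			ok++;
	}
	free(list);
	return ok;
}
```

The trade-off is one extra buffer holding the whole directory listing, which is the same cost the LD_PRELOAD approach pays inside opendir().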

Whether or not this performance issue is a problem in real life is a
different story.  If you are just doing accesses in random order and
are doing lookups by name, such as in a squid cache, you won't see
this issue at all, and htree will be a huge win.  However, if like
sendmail the program is running readdir() and then stat'ing all files,
then the following LD_PRELOAD library is necessary to avoid a
performance regression caused by htree returning files in hash order.

(Note that this LD_PRELOAD library will often speed up readdir/stat
workloads on non-htree filesystems as well, since in general most
inode-based filesystems are happier when you access files in inode
order, and over time, directories tend to get disordered and are no
longer in create/inode number sorted order.)

						- Ted

[-- Attachment #2: spd_readdir.c --]
[-- Type: text/x-csrc, Size: 5855 bytes --]

/*
 * readdir accelerator
 *
 * (C) Copyright 2003 by Theodore Ts'o.
 *
 * %Begin-Header%
 * This file may be redistributed under the terms of the GNU Public
 * License.
 * %End-Header%
 * 
 */

#define ALLOC_STEPSIZE	100
#define MAX_DIRSIZE	0

#define DEBUG

#ifdef DEBUG
#define DEBUG_DIR(x)	{if (do_debug) { x; }}
#else
#define DEBUG_DIR(x)
#endif

#define _GNU_SOURCE
#define __USE_LARGEFILE64

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
#include <errno.h>
#include <dlfcn.h>

struct dirent_s {
	unsigned long long d_ino;
	long long d_off;
	unsigned short int d_reclen;
	unsigned char d_type;
	char *d_name;
};

struct dir_s {
	DIR	*dir;
	int	num;
	int	max;
	struct dirent_s *dp;
	int	pos;
	struct dirent ret_dir;
	struct dirent64 ret_dir64;
};

static int (*real_closedir)(DIR *dir) = 0;
static DIR *(*real_opendir)(const char *name) = 0;
static struct dirent *(*real_readdir)(DIR *dir) = 0;
static struct dirent64 *(*real_readdir64)(DIR *dir) = 0;
static off_t (*real_telldir)(DIR *dir) = 0;
static void (*real_seekdir)(DIR *dir, off_t offset) = 0;
static unsigned long max_dirsize = MAX_DIRSIZE;
#ifdef DEBUG
static int do_debug = 0;
#endif

static void setup_ptr()
{
	char *cp;

	real_opendir = dlsym(RTLD_NEXT, "opendir");
	real_closedir = dlsym(RTLD_NEXT, "closedir");
	real_readdir = dlsym(RTLD_NEXT, "readdir");
	real_readdir64 = dlsym(RTLD_NEXT, "readdir64");
	real_telldir = dlsym(RTLD_NEXT, "telldir");
	real_seekdir = dlsym(RTLD_NEXT, "seekdir");
	if ((cp = getenv("SPD_READDIR_MAX_SIZE")) != NULL) {
		max_dirsize = atol(cp);
	}
#ifdef DEBUG
	if (getenv("SPD_READDIR_DEBUG"))
		do_debug++;
#endif
}

static void free_cached_dir(struct dir_s *dirstruct)
{
	int i;

	if (!dirstruct->dp)
		return;

	for (i=0; i < dirstruct->num; i++) {
		free(dirstruct->dp[i].d_name);
	}
	free(dirstruct->dp);
	dirstruct->dp = 0;
}	

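/*
 * qsort() comparator: order entries by inode number, but force "." and
 * ".." to sort first.
 */
static int ino_cmp(const void *a, const void *b)
{
	const struct dirent_s *ds_a = (const struct dirent_s *) a;
	const struct dirent_s *ds_b = (const struct dirent_s *) b;
	unsigned long long i_a, i_b;

	i_a = ds_a->d_ino;
	i_b = ds_b->d_ino;

	if (ds_a->d_name[0] == '.') {
		if (ds_a->d_name[1] == 0)
			i_a = 0;
		else if ((ds_a->d_name[1] == '.') && (ds_a->d_name[2] == 0))
			i_a = 1;
	}
	if (ds_b->d_name[0] == '.') {
		if (ds_b->d_name[1] == 0)
			i_b = 0;
		else if ((ds_b->d_name[1] == '.') && (ds_b->d_name[2] == 0))
			i_b = 1;
	}

	/*
	 * Compare explicitly instead of returning (i_a - i_b): the
	 * difference of two wide unsigned values does not fit in an int.
	 */
	if (i_a < i_b)
		return -1;
	return (i_a > i_b);
}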


DIR *opendir(const char *name)
{
	DIR *dir;
	struct dir_s	*dirstruct;
	struct dirent_s *ds, *dnew;
	struct dirent64 *d;
	struct stat st;

	if (!real_opendir)
		setup_ptr();

	dir = (*real_opendir)(name);
	if (!dir)
		return NULL;

	dirstruct = malloc(sizeof(struct dir_s));
	if (!dirstruct) {
		(*real_closedir)(dir);
		errno = ENOMEM;	/* errno values are positive */
		return NULL;
	}
	dirstruct->num = 0;
	dirstruct->max = 0;
	dirstruct->dp = 0;
	dirstruct->pos = 0;
	dirstruct->dir = 0;

	/* Very large directories bypass the cache and use the real readdir. */
	if (max_dirsize && (stat(name, &st) == 0) &&
	    ((unsigned long) st.st_size > max_dirsize)) {
		DEBUG_DIR(printf("Directory size %ld, using direct readdir\n",
				 (long) st.st_size));
		dirstruct->dir = dir;
		return (DIR *) dirstruct;
	}

	while ((d = (*real_readdir64)(dir)) != NULL) {
		if (dirstruct->num >= dirstruct->max) {
			dirstruct->max += ALLOC_STEPSIZE;
			DEBUG_DIR(printf("Reallocating to size %d\n", 
					 dirstruct->max));
			/* dp is an array of struct dirent_s, so size it as such */
			dnew = realloc(dirstruct->dp,
				       dirstruct->max * sizeof(struct dirent_s));
			if (!dnew)
				goto nomem;
			dirstruct->dp = dnew;
		}
		ds = &dirstruct->dp[dirstruct->num++];
		ds->d_ino = d->d_ino;
		ds->d_off = d->d_off;
		ds->d_reclen = d->d_reclen;
		ds->d_type = d->d_type;
		if ((ds->d_name = malloc(strlen(d->d_name)+1)) == NULL) {
			dirstruct->num--;
			goto nomem;
		}
		strcpy(ds->d_name, d->d_name);
		DEBUG_DIR(printf("readdir: %lu %s\n", 
				 (unsigned long) d->d_ino, d->d_name));
	}
	(*real_closedir)(dir);
	qsort(dirstruct->dp, dirstruct->num, sizeof(struct dirent_s), ino_cmp);
	return ((DIR *) dirstruct);
nomem:
	DEBUG_DIR(printf("No memory, backing off to direct readdir\n"));
	free_cached_dir(dirstruct);
	/* Entries were already consumed above, so restart the real stream. */
	rewinddir(dir);
	dirstruct->dir = dir;
	return ((DIR *) dirstruct);
}

int closedir(DIR *dir)
{
	struct dir_s	*dirstruct = (struct dir_s *) dir;

	if (dirstruct->dir)
		(*real_closedir)(dirstruct->dir);

	free_cached_dir(dirstruct);
	free(dirstruct);
	return 0;
}

struct dirent *readdir(DIR *dir)
{
	struct dir_s	*dirstruct = (struct dir_s *) dir;
	struct dirent_s *ds;

	if (dirstruct->dir)
		return (*real_readdir)(dirstruct->dir);

	if (dirstruct->pos >= dirstruct->num)
		return NULL;

	ds = &dirstruct->dp[dirstruct->pos++];
	dirstruct->ret_dir.d_ino = ds->d_ino;
	dirstruct->ret_dir.d_off = ds->d_off;
	dirstruct->ret_dir.d_reclen = ds->d_reclen;
	dirstruct->ret_dir.d_type = ds->d_type;
	strncpy(dirstruct->ret_dir.d_name, ds->d_name,
		sizeof(dirstruct->ret_dir.d_name));

	return (&dirstruct->ret_dir);
}

struct dirent64 *readdir64(DIR *dir)
{
	struct dir_s	*dirstruct = (struct dir_s *) dir;
	struct dirent_s *ds;

	if (dirstruct->dir)
		return (*real_readdir64)(dirstruct->dir);

	if (dirstruct->pos >= dirstruct->num)
		return NULL;

	ds = &dirstruct->dp[dirstruct->pos++];
	dirstruct->ret_dir64.d_ino = ds->d_ino;
	dirstruct->ret_dir64.d_off = ds->d_off;
	dirstruct->ret_dir64.d_reclen = ds->d_reclen;
	dirstruct->ret_dir64.d_type = ds->d_type;
	strncpy(dirstruct->ret_dir64.d_name, ds->d_name,
		sizeof(dirstruct->ret_dir64.d_name));

	return (&dirstruct->ret_dir64);
}

off_t telldir(DIR *dir)
{
	struct dir_s	*dirstruct = (struct dir_s *) dir;

	if (dirstruct->dir)
		return (*real_telldir)(dirstruct->dir);

	return ((off_t) dirstruct->pos);
}

void seekdir(DIR *dir, off_t offset)
{
	struct dir_s	*dirstruct = (struct dir_s *) dir;

	if (dirstruct->dir) {
		(*real_seekdir)(dirstruct->dir, offset);
		return;
	}

	dirstruct->pos = offset;
}


* Re: file system technical comparisons
  2004-01-06 12:07           ` Hans Reiser
@ 2004-01-06 23:48             ` venom
  2004-01-07  9:13               ` Hans Reiser
  0 siblings, 1 reply; 14+ messages in thread
From: venom @ 2004-01-06 23:48 UTC
  To: Hans Reiser; +Cc: Steve Glines, linux-kernel

On Tue, 6 Jan 2004, Hans Reiser wrote:

> balanced trees squish things together at every modification of the
> tree.  Dancing trees squish things together when they get low on ram,
> which is less often.  this means that we can afford to squish tighter
> because we do it less often.

This is generally true, with some major exceptions.

A SAP server, for example, is "always" low on RAM, not because of Oracle, but
because of how the "disp+work" processes work.

Another case I am thinking of is a TIBCO server, where processes start to fork
because of a flood of incoming messages, and the DB starts writing a lot of
data (all small writes).

I am curious to run some tests in those cases.

Another thing I am thinking about is an EMC^2 LUN. If all the I/O is resolved
inside the EMC cache, could B-trees be better than dancing trees? In fact,
in this case what matters is the CPU power you are using, since you are de
facto talking only to the EMC cache.

I know these are strange scenarios, but they are the ones I am currently
working with. Since they are not typical situations, I think they matter
little right now, but in the future more people may have to deal with them.

Anyway, until I run some serious experiments, these are just
speculation.

Luigi



* Re: file system technical comparisons
  2004-01-06 23:48             ` venom
@ 2004-01-07  9:13               ` Hans Reiser
  0 siblings, 0 replies; 14+ messages in thread
From: Hans Reiser @ 2004-01-07  9:13 UTC
  To: venom; +Cc: Steve Glines, linux-kernel

venom@sns.it wrote:

>On Tue, 6 Jan 2004, Hans Reiser wrote:
>
>  
>
>>balanced trees squish things together at every modification of the
>>tree.  Dancing trees squish things together when they get low on ram,
>>which is less often.  this means that we can afford to squish tighter
>>because we do it less often.
>>    
>>
>
>This is generally true, with some major exceptions.
>
>A SAP server, for example, is "always" low on RAM, not because of Oracle, but
>because of how the "disp+work" processes work.
>
>Another case I am thinking of is a TIBCO server, where processes start to fork
>because of a flood of incoming messages, and the DB starts writing a lot of
>data (all small writes).
>
>I am curious to run some tests in those cases.
>  
>
Even if it is always low on RAM, the memory pressure signal from the VM 
still arrives less often than tree modifications do, because we squish 
in big batches.

>Another thing I am thinking about is an EMC^2 LUN. If all the I/O is resolved
>inside the EMC cache, could B-trees be better than dancing trees?
>
No, dancing trees would still fit in that cache and still be more CPU 
efficient.

> In fact,
>in this case what matters is the CPU power you are using, since you are de
>facto talking only to the EMC cache.
>
>I know these are strange scenarios, but they are the ones I am currently
>working with. Since they are not typical situations, I think they matter
>little right now, but in the future more people may have to deal with them.
>
>Anyway, until I run some serious experiments, these are just
>speculation.
>
>Luigi
>
>
>
>  
>
There are flaws in the reiser4 algorithms, but the dancing tree concept 
is a good one.  We are currently encountering, in experiments, various 
oddities that need fixing.  For instance, if your working set is just 
barely too large for RAM, we have a tendency to flush too many pages out 
of RAM and make you wait for us to do so.  This is fixable, and is being 
discussed among us now.

-- 
Hans




* Re: file system technical comparisons
  2004-01-02 21:38 file system technical comparisons Steve Glines
  2004-01-05  9:42 ` venom
@ 2004-01-09 19:32 ` Stewart Smith
  1 sibling, 0 replies; 14+ messages in thread
From: Stewart Smith @ 2004-01-09 19:32 UTC
  To: Steve Glines; +Cc: linux-kernel

http://www.flamingspork.com/honors/

I did some (theoretical) comparisons between a number of file systems in
my Honors thesis. Notable exceptions are NTFS and JFS. Some good
reasoning behind why some designs are better than others.

On Sat, 2004-01-03 at 08:38, Steve Glines wrote:
> I'm looking for a technical comparison between the major file systems. 
> At a minimum I'd like to see a comparison between ext3, reiserfs, xfs 
> and jfs. In the oh so perfect world I'd like to see detailed info on all 
> supported file systems.
> 
> Please CC or mail me directly as I am not a subscriber to this list.
> 
> Thanks
