linux-kernel.vger.kernel.org archive mirror
* Ext2 directory index, updated
@ 2001-11-04  2:28 Daniel Phillips
  2001-11-04  2:44 ` Daniel Phillips
  2001-11-04 22:09 ` Christian Laursen
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-04  2:28 UTC (permalink / raw)
  To: linux-kernel

***N.B.: still for use on test partitions only.***

This update mainly fixes a bug, a one-in-a-million occurrence on an untested 
code path.  This bug resulted in rm -rf deleting all files but one from a 
million-file directory.  I believe that's the last untested code path, and 
otherwise it's been very stable.

I didn't expect highmem to work properly, and it didn't.  It's on my to-do 
list, but for now highmem has to be off or you will oops on boot.

I elaborated the dx_show_buckets debug output to dump the full index 
tree instead of just one level.  This function now serves as a capsule 
summary of the index tree structure, and as you can see, it's simple.

I've done quite a bit more testing, including stress testing on a real 
machine and I find that everything works quite comfortably up to about 2 
million files, turning in an average time of about 50 microseconds/create and 
300 microseconds/delete (1 GHz PIII).  In the 4 million file range things go 
pear-shaped, which I believe is not due to the index patch, but to rd.  The 
runs do complete, but require exponentially more time, with cpu 98% idle and 
block throughput in the 300/second range.  I'll look into that more later.
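
For scale, the per-operation averages above can be turned into whole-run totals. This is only back-of-the-envelope arithmetic, assuming the 50 and 300 microsecond averages hold uniformly over a 2 million file run; the function name is illustrative.

```python
def run_seconds(n_files, usec_per_op):
    """Total time in seconds for n_files operations at a fixed average cost."""
    return n_files * usec_per_op / 1_000_000

create_s = run_seconds(2_000_000, 50)    # average create cost quoted above
delete_s = run_seconds(2_000_000, 300)   # average delete cost quoted above
print(f"create: {create_s:.0f} s, delete: {delete_s:.0f} s")
```

Under that assumption a full 2 million file run spends about 100 seconds creating and about 600 seconds deleting.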

I did run into some bad mm behavior on 2.4.13.  The icache seems to be too 
severely throttled, resulting in delete performance being less than it should 
be.  I also find I am rarely able to create a million file test run on uml 
(2.4.13) without oom-ing.  In my experience, such problems are not due to 
uml, but to the kernel's memory manager.  These issues may have been 
addressed in recent pre-patch kernels, but it seems there is still some 
room for improvement in mm stability.

The patch is available at:

  http://nl.linux.org/~phillips/htree/ext2.index-2.4.13

To apply:

  cd /your/source/tree
  patch -p0 <this.patch

--
Daniel



* Re: Ext2 directory index, updated
  2001-11-04  2:28 Ext2 directory index, updated Daniel Phillips
@ 2001-11-04  2:44 ` Daniel Phillips
  2001-11-04 22:09 ` Christian Laursen
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-04  2:44 UTC (permalink / raw)
  To: linux-kernel

On November 4, 2001 03:28 am, I wrote:
> The patch is available at:
> 
>   http://nl.linux.org/~phillips/htree/ext2.index-2.4.13
>

Make that:
 
    http://nl.linux.org/~phillips/htree/ext2.index-2.4.13-2

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-04  2:28 Ext2 directory index, updated Daniel Phillips
  2001-11-04  2:44 ` Daniel Phillips
@ 2001-11-04 22:09 ` Christian Laursen
  2001-11-04 22:24   ` Daniel Phillips
  2001-11-05  1:43   ` Daniel Phillips
  1 sibling, 2 replies; 19+ messages in thread
From: Christian Laursen @ 2001-11-04 22:09 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

Daniel Phillips <phillips@bonn-fries.net> writes:

> ***N.B.: still for use on test partitions only.***

It's the first time, I've tried this patch and I must say, that
the first impression is very good indeed.

I took a real world directory (my linux-kernel MH folder containing
roughly 115000 files) and did a 'du -s' on it.

Without the patch it took a little more than 20 minutes to complete.

With the patch, it took less than 20 seconds. (And that was inside uml)


However, when I accidentally killed the uml, it left me with an unclean
filesystem which fsck refuses to touch because it has unsupported features.

Even the latest version does this.

Is there a patch for fsck, that fixes this somewhere?

-- 
Best regards
    Christian Laursen


* Re: Ext2 directory index, updated
  2001-11-04 22:09 ` Christian Laursen
@ 2001-11-04 22:24   ` Daniel Phillips
  2001-11-04 22:54     ` Christian Laursen
  2001-11-04 23:01     ` Daniel Phillips
  2001-11-05  1:43   ` Daniel Phillips
  1 sibling, 2 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-04 22:24 UTC (permalink / raw)
  To: Christian Laursen; +Cc: linux-kernel, Andreas Dilger

On November 4, 2001 11:09 pm, Christian Laursen wrote:
> Daniel Phillips <phillips@bonn-fries.net> writes:
> 
> > ***N.B.: still for use on test partitions only.***
> 
> It's the first time, I've tried this patch and I must say, that
> the first impression is very good indeed.
> 
> I took a real world directory (my linux-kernel MH folder containing
> roughly 115000 files) and did a 'du -s' on it.
> 
> Without the patch it took a little more than 20 minutes to complete.
> 
> With the patch, it took less than 20 seconds. (And that was inside uml)
> 
> 
> However, when I accidentally killed the uml, it left me with an unclean
> filesystem which fsck refuses to touch because it has unsupported features.
> 
> Even the latest version does this.
> 
> Is there a patch for fsck, that fixes this somewhere?

Ted Ts'o volunteered to do that but I failed to support him with proper 
documentation so it hasn't been done yet.

However, it's very easy to get around this, just comment out the part of the 
patch that sets the incompat flag.  Then the indexed directories will 
magically turn back into normal directories the next time you write to them 
(it would be very good to give this feature a real-life test :-)

There is an easy way to turn that FEATURE_COMPAT flag back off so you can 
fsck, but I don't know it and I should.

Andreas?

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-04 22:24   ` Daniel Phillips
@ 2001-11-04 22:54     ` Christian Laursen
  2001-11-04 23:01     ` Daniel Phillips
  1 sibling, 0 replies; 19+ messages in thread
From: Christian Laursen @ 2001-11-04 22:54 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel, Andreas Dilger

Daniel Phillips <phillips@bonn-fries.net> writes:

> There is an easy way to turn that FEATURE_COMPAT flag back off so you can 
> fsck, but I don't know it and I should.

I figured it out by myself. :)

$ debugfs -w root_fs
debugfs:  feature -dir_index

-- 
Best regards
    Christian Laursen


* Re: Ext2 directory index, updated
  2001-11-04 22:24   ` Daniel Phillips
  2001-11-04 22:54     ` Christian Laursen
@ 2001-11-04 23:01     ` Daniel Phillips
  2001-11-04 23:09       ` Gábor Lénárt
  2001-11-05 22:10       ` Andreas Dilger
  1 sibling, 2 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-04 23:01 UTC (permalink / raw)
  To: Christian Laursen; +Cc: linux-kernel, Andreas Dilger

On November 4, 2001 11:24 pm, Daniel Phillips wrote:
> On November 4, 2001 11:09 pm, Christian Laursen wrote:
> > Daniel Phillips <phillips@bonn-fries.net> writes:
> > However, when I accidentally killed the uml, it left me with an unclean
> > filesystem which fsck refuses to touch because it has unsupported features.
> > 
> > Even the latest version does this.
> > 
> > Is there a patch for fsck, that fixes this somewhere?
> 
> [...]
> There is an easy way to turn that FEATURE_COMPAT flag back off so you can 
> fsck, but I don't know it and I should.

It's debugfs, details to come.

The COMPAT_FEATURE thing is a problem, we *are* supposed to be able to fsck
a volume that has indexed directories on it with old versions of fsck, and 
it's only the COMPAT_FEATURE flag that prevents this.  You tried fsck -f 
and it didn't work, right?

For using the -o index option on a non-throwaway volume, we should do this:

 void ext2_add_compat_feature (struct super_block *sb, unsigned feature)
 {
+	return;
 	if (!EXT2_HAS_COMPAT_FEATURE(sb, feature))
 	{

And afterwards you can rm -rf your test directory, though actually normal 
ext2 shouldn't see anything unusual about it.  The real reason for rm'ing the 
test directory is so that I can tweak the index format in upcoming prerelease 
versions.

I've disabled the add_compat_feature here for now, because until fsck can 
handle it, it just causes trouble.  I'll go read Andreas' writeup on the 
COMPAT flags again and see if I can come up with a more friendly 
interpretation.

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-04 23:01     ` Daniel Phillips
@ 2001-11-04 23:09       ` Gábor Lénárt
  2001-11-05 22:10       ` Andreas Dilger
  1 sibling, 0 replies; 19+ messages in thread
From: Gábor Lénárt @ 2001-11-04 23:09 UTC (permalink / raw)
  To: linux-kernel

Hmmm. Maybe some off-topic questions follow:

* Is there a patch for directory index and ext3 together?
* Will dirindex ever be a part of official kernels (e.g. 2.5.x)?

- Gabor


* Re: Ext2 directory index, updated
  2001-11-04 22:09 ` Christian Laursen
  2001-11-04 22:24   ` Daniel Phillips
@ 2001-11-05  1:43   ` Daniel Phillips
  2001-11-05  7:48     ` Ville Herva
                       ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-05  1:43 UTC (permalink / raw)
  To: Christian Laursen; +Cc: linux-kernel

On November 4, 2001 11:09 pm, Christian Laursen wrote:
> Daniel Phillips <phillips@bonn-fries.net> writes:
> 
> > ***N.B.: still for use on test partitions only.***
> 
> It's the first time, I've tried this patch and I must say, that
> the first impression is very good indeed.
> 
> I took a real world directory (my linux-kernel MH folder containing
> roughly 115000 files) and did a 'du -s' on it.
> 
> Without the patch it took a little more than 20 minutes to complete.
> 
> With the patch, it took less than 20 seconds. (And that was inside uml)

Which kernel are you using?  From 2.4.10 on ext2 has an accelerator in 
ext2_find_entry - it caches the last lookup position.  I'm wondering how that 
affects this case.
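
To illustrate why caching the last lookup position matters for a scan that stats entries in directory order, here is a toy model. It is a sketch of the idea only, not the kernel's ext2_find_entry: the class and method names are made up for illustration, and it works per entry rather than per directory block.

```python
class LinearDir:
    """Toy directory: a flat list of names searched linearly (not kernel code)."""

    def __init__(self, names):
        self.names = list(names)
        self.start = 0          # cached position of the last successful lookup

    def lookup(self, name):
        """Return (index, comparisons); index is None if the name is absent."""
        n = len(self.names)
        for step in range(n):
            i = (self.start + step) % n   # scan from the cached position, wrapping
            if self.names[i] == name:
                self.start = i            # remember where we found it
                return i, step + 1
        return None, n
```

Looking up every name of an n-entry directory in order costs about 2n comparisons in this model, against roughly n*n/2 when each search restarts from the first entry. Random-order lookups gain nothing from the cache.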

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-05  1:43   ` Daniel Phillips
@ 2001-11-05  7:48     ` Ville Herva
  2001-11-05  9:53       ` Daniel Phillips
  2001-11-05 22:59     ` Christian Laursen
  2001-11-08  7:21     ` Christian Laursen
  2 siblings, 1 reply; 19+ messages in thread
From: Ville Herva @ 2001-11-05  7:48 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Mon, Nov 05, 2001 at 02:43:28AM +0100, you [Daniel Phillips] claimed:
>
> Which kernel are you using?  From 2.4.10 on ext2 has an accelerator in 
> ext2_find_entry - it caches the last lookup position.  I'm wondering how that 
> affects this case.

Is that the same optimization Ted Ts'o implemented for ext3 around 0.9.10? I
thought it hadn't been ported to ext2...

BTW, I assume the ext2 dir index patch is roughly equivalent to FreeBSD
dirhash and the other patch resembles the FreeBSD dirperf patch?
Have you looked at them? [http://www.osnews.com/story.php?news_id=153]


-- v --

v@iki.fi


* Re: Ext2 directory index, updated
  2001-11-05  7:48     ` Ville Herva
@ 2001-11-05  9:53       ` Daniel Phillips
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-05  9:53 UTC (permalink / raw)
  To: Ville Herva; +Cc: linux-kernel

On November 5, 2001 08:48 am, Ville Herva wrote:
> On Mon, Nov 05, 2001 at 02:43:28AM +0100, you [Daniel Phillips] claimed:
> >
> > Which kernel are you using?  From 2.4.10 on ext2 has an accelerator in 
> > ext2_find_entry - it caches the last lookup position.  I'm wondering how that 
> > affects this case.
> 
> Is that the same optimization Ted Ts'o implemented for ext3 around 0.9.10? I
> thought it hadn't been ported to ext2...

Yes, Ted did it, earlier this year.

> BTW, I assume the ext2 dir index patch is roughly equivalent to FreeBSD
> dirhash and the other patch resembles the FreeBSD dirperf patch?
> Have you looked at them? [http://www.osnews.com/story.php?news_id=153]

I *think* the performance of my dir index patch is roughly in line with BSD's
dirhash patch, for common cases.  The big difference is that the BSD dirhash
is not persistent - the cache goes away when the directory is closed.  So
there are loads that can break it badly, such as accessing files in large
directories randomly over a large disk.  This forces the entire directory to
be read into cache, in the worst case, on every access.  Another bad case is
first-time access.  A million file directory is around 30 meg - it takes a
long time to read and hash all those blocks, just to open the first file.
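
The "around 30 meg" figure is consistent with the ext2 on-disk directory entry layout: an 8-byte header followed by the name, padded to a 4-byte boundary. A rough check, assuming an average name length of 20 characters (an assumption for illustration, not a number from this thread) and ignoring per-block slack:

```python
def ext2_dir_rec_len(name_len):
    """Minimum ext2 directory entry size: 8-byte header plus name, rounded up to 4."""
    return (8 + name_len + 3) & ~3

entries = 1_000_000
avg_name_len = 20                      # assumed average, for illustration only
total_bytes = entries * ext2_dir_rec_len(avg_name_len)
print(f"{ext2_dir_rec_len(avg_name_len)} bytes/entry, {total_bytes / 1e6:.0f} MB total")
```

With that assumed name length each entry takes 28 bytes, so a million entries come to about 28 MB, all of which a non-persistent hash has to read before the first lookup can be answered.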

They will have to implement a persistent index at some point.  For common
cases though, the BSD approach is good.  

I'll go into the gory details next week at ALS if people are interested.

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-04 23:01     ` Daniel Phillips
  2001-11-04 23:09       ` Gábor Lénárt
@ 2001-11-05 22:10       ` Andreas Dilger
  2001-11-06  0:38         ` Daniel Phillips
  1 sibling, 1 reply; 19+ messages in thread
From: Andreas Dilger @ 2001-11-05 22:10 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Christian Laursen, linux-kernel

On Nov 05, 2001  00:01 +0100, Daniel Phillips wrote:
> For using the -o index option on a non-throwaway volume, we should do this:
> 
>  void ext2_add_compat_feature (struct super_block *sb, unsigned feature)
>  {
> +	return;
>  	if (!EXT2_HAS_COMPAT_FEATURE(sb, feature))
>  	{
> 
> And afterwards you can rm -rf your test directory, though actually normal 
> ext2 shouldn't see anything unusual about it.  The real reason for rm'ing the 
> test directory is so that I can tweak the index format in upcoming prerelease 
> versions.

Well, e2fsck _should_ really know about the fact that there are indexed
directories in the filesystem, which is what the COMPAT flag is for.
The only current issue is that e2fsck doesn't understand this compat flag.

> I've disabled the add_compat_feature here for now, because until fsck can 
> handle it, it just causes trouble.  I'll go read Andreas' writeup on the 
> COMPAT flags again and see if I can come up with a more friendly 
> interpretation.

No, COMPAT is the friendliest.  It means old kernels can read/write this
filesystem without problems, just that e2fsck can't/won't check it.  Even
though an old fsck _probably_ won't break such a filesystem, there is no
guarantee of that, and it definitely won't validate the indexes, so a
"successful" fsck of an indexed directory doesn't mean anything until it
can understand this COMPAT flag.
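
The distinction between the three ext2 feature classes can be sketched as follows. This is a simplified model of the decision logic, not code from the kernel or e2fsprogs; the "known" masks are hypothetical and do not correspond to any particular release. Only the DIR_INDEX bit value is taken from the ext2 headers.

```python
COMPAT_DIR_INDEX = 0x0020   # EXT2_FEATURE_COMPAT_DIR_INDEX

def kernel_mount_mode(sb, known):
    """Return 'rw', 'ro', or 'refuse' for a kernel that knows only the `known` bits."""
    if sb["incompat"] & ~known["incompat"]:
        return "refuse"      # unknown INCOMPAT: do not mount at all
    if sb["ro_compat"] & ~known["ro_compat"]:
        return "ro"          # unknown RO_COMPAT: read-only mount is safe
    return "rw"              # unknown COMPAT bits are ignored by the kernel

def fsck_will_check(sb, known):
    """e2fsck refuses a filesystem with any feature bit it does not know."""
    return not any(sb[k] & ~known[k] for k in ("compat", "ro_compat", "incompat"))

sb = {"compat": COMPAT_DIR_INDEX, "ro_compat": 0, "incompat": 0}
old_tools = {"compat": 0, "ro_compat": 0, "incompat": 0}   # hypothetical masks
print(kernel_mount_mode(sb, old_tools), fsck_will_check(sb, old_tools))
```

In this model an old kernel still mounts the filesystem read-write while an old e2fsck declines to check it, which matches the behaviour reported earlier in the thread.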

That said, I agree that turning the COMPAT flag off for short term testing
is probably not fatal, but I thought we were not going to even suggest
using non-throwaway filesystems until the hash function was finalized?  In
the end, if an updated e2fsck detects the DIR_INDEX flag (and valid indexes
therein) it will turn on the COMPAT flag for us, so all will be well.  I
don't advise that we push for patch inclusion until e2fsck is done, however.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: Ext2 directory index, updated
  2001-11-05  1:43   ` Daniel Phillips
  2001-11-05  7:48     ` Ville Herva
@ 2001-11-05 22:59     ` Christian Laursen
  2001-11-05 23:13       ` Daniel Phillips
  2001-11-08  7:21     ` Christian Laursen
  2 siblings, 1 reply; 19+ messages in thread
From: Christian Laursen @ 2001-11-05 22:59 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

Daniel Phillips <phillips@bonn-fries.net> writes:

> On November 4, 2001 11:09 pm, Christian Laursen wrote:
> > Daniel Phillips <phillips@bonn-fries.net> writes:
> > 
> > > ***N.B.: still for use on test partitions only.***
> > 
> > It's the first time, I've tried this patch and I must say, that
> > the first impression is very good indeed.
> > 
> > I took a real world directory (my linux-kernel MH folder containing
> > roughly 115000 files) and did a 'du -s' on it.
> > 
> > Without the patch it took a little more than 20 minutes to complete.
> > 
> > With the patch, it took less than 20 seconds. (And that was inside uml)
> 
> Which kernel are you using?

Actually, it was on a 2.2.20 kernel.

> From 2.4.10 on ext2 has an accelerator in 
> ext2_find_entry - it caches the last lookup position.  I'm wondering how that 
> affects this case.

From the description I read a while ago, I believe it could cause a significant
speedup.

I'll have to try that out one of these days.

-- 
Best regards
    Christian Laursen


* Re: Ext2 directory index, updated
  2001-11-05 22:59     ` Christian Laursen
@ 2001-11-05 23:13       ` Daniel Phillips
  2001-11-05 23:45         ` Andreas Dilger
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Phillips @ 2001-11-05 23:13 UTC (permalink / raw)
  To: Christian Laursen; +Cc: linux-kernel

On November 5, 2001 11:59 pm, Christian Laursen wrote:
> Daniel Phillips <phillips@bonn-fries.net> writes:
> 
> > On November 4, 2001 11:09 pm, Christian Laursen wrote:
> > > Daniel Phillips <phillips@bonn-fries.net> writes:
> > > 
> > > > ***N.B.: still for use on test partitions only.***
> > > 
> > > It's the first time, I've tried this patch and I must say, that
> > > the first impression is very good indeed.
> > > 
> > > I took a real world directory (my linux-kernel MH folder containing
> > > roughly 115000 files) and did a 'du -s' on it.
> > > 
> > > Without the patch it took a little more than 20 minutes to complete.
> > > 
> > > With the patch, it took less than 20 seconds. (And that was inside uml)
> > 
> > Which kernel are you using?
> 
> Actually, it was on a 2.2.20 kernel.

Yes, it's cool you can run 2.4 uml kernels on 2.2, isn't it?  What I meant 
was, which kernel is your uml built on?

> > From 2.4.10 on ext2 has an accelerator in 
> > ext2_find_entry - it caches the last lookup position.  I'm wondering how 
> > that affects this case.
> 
> From the description I read a while ago, I believe it could cause a
> significant speedup.
> 
> I'll have to try that out one of these days.

I noticed split results with the find_entry accelerator, at least in its 
current form: faster delete, slower create.

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-05 23:13       ` Daniel Phillips
@ 2001-11-05 23:45         ` Andreas Dilger
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Dilger @ 2001-11-05 23:45 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Christian Laursen, linux-kernel

On Nov 06, 2001  00:13 +0100, Daniel Phillips wrote:
> > > From 2.4.10 on ext2 has an accelerator in 
> > > ext2_find_entry - it caches the last lookup position.  I'm wondering how 
> > > that affects this case.
> > 
> > From the description I read a while ago, I believe it could cause a
> > significant speedup.
> > 
> > I'll have to try that out one of these days.
> 
> I noticed split results with the find_entry accelerator, at least in its 
> current form: faster delete, slower create.

Well, according to reiserfs benchmarks at:
http://namesys.com/benchmarks/mongo/2.4.8_vs_2.4.9_vs_2.4.10_table.txt

the accelerator speeds up stat times (in all cases) by a factor of 3 to 5.
Create times are reduced as well (although not as much).  In fact, it also
shows delete speed as being slower, but that is hard to quantify as the
reiserfs delete speed is slower also.

It actually looks like both ext2 and reiserfs took a hit in the read
department in 2.4.10 as well.  Maybe a bad interaction with the page
cache or something?  It would also be worthwhile to go back to the
addition of directories-in-pagecache as well, because I seem to recall
posting about a hit in read performance at that time as well, and never
really heard anything about it.

The bonnie++ benchmark doesn't show any obvious trends (incomplete tables):
http://namesys.com/benchmarks/bonnie/2.4.8_2.4.9_2.4.10.txt

I'll have to go and update my bonnie benchmarks for newer kernels (last
run when testing indexed directories and dir-in-pagecache at 2.4.5).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: Ext2 directory index, updated
  2001-11-05 22:10       ` Andreas Dilger
@ 2001-11-06  0:38         ` Daniel Phillips
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Phillips @ 2001-11-06  0:38 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Christian Laursen, linux-kernel, ext2-devel

On November 5, 2001 11:10 pm, Andreas Dilger wrote:
> On Nov 05, 2001  00:01 +0100, Daniel Phillips wrote:
> > For using the -o index option on a non-throwaway volume, we should do this:
> > 
> >  void ext2_add_compat_feature (struct super_block *sb, unsigned feature)
> >  {
> > +	return;
> >  	if (!EXT2_HAS_COMPAT_FEATURE(sb, feature))
> >  	{
> > 
> > And afterwards you can rm -rf your test directory, though actually normal 
> > ext2 shouldn't see anything unusual about it.  The real reason for rm'ing the 
> > test directory is so that I can tweak the index format in upcoming prerelease 
> > versions.
> 
> Well, e2fsck _should_ really know about the fact that there are indexed
> directories in the filesystem, which is what the COMPAT flag is for.
> The only current issue is that e2fsck doesn't understand this compat flag.
> 
> > I've disabled the add_compat_feature here for now, because until fsck can 
> > handle it, it just causes trouble.  I'll go read Andreas' writeup on the 
> > COMPAT flags again and see if I can come up with a more friendly 
> > interpretation.
> 
> No, COMPAT is the friendliest.  It means old kernels can read/write this
> filesystem without problems, just that e2fsck can't/won't check it.  Even
> though an old fsck _probably_ won't break such a filesystem, there is no
> guarantee of that,

Well, it's hard to see how the fsck could hurt, since all the blocks of the 
directory look like legitimate empty blocks.  When did 

> and it definitely won't validate the indexes, so a
> "successful" fsck of an indexed directory doesn't mean anything until it
> can understand this COMPAT flag.
> 
> That said, I agree that turning the COMPAT flag off for short term testing
> is probably not fatal, but I thought we were not going to even suggest
> using non-throwaway filesystems until the hash function was finalized?

True.  Right now, I'm interested in finding out exactly how the old fscks are 
going to behave when they run into indexed directories, so I'll leave the 
COMPAT flag off for now and turn it back on when we hit the first 
format-frozen release.  The method of restoring a partition to a fsckable 
state is easy to document:

    # debugfs -w root_fs
    debugfs: feature -dir_index

Anybody who's running the patch will have access to a recent version of 
debugfs that knows about the dir_index flag.

> In
> the end, if an updated e2fsck detects the DIR_INDEX flag (and valid indexes
> therein) it will turn on the COMPAT flag for us, so all will be well.  I
> don't advise that we push for patch inclusion until e2fsck is done, however.

Yes, as long as testers heed my warning and stick to test partitions there's 
no particular danger.  There's a simple recovery procedure for anyone who 
doesn't want to bother re-mkfsing the partition.  We're in pretty good shape. 

My improved show_buckets routine is working code that could be used to get 
started on the new fsck code.  It walks the index in hash bucket order 
dumping out statistics, and, together with the checks in dx_probe, basically 
defines the index format.
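
As a rough picture of what walking an index "in hash bucket order" with consistency checks could look like, here is a toy walker. The tree layout is a hypothetical in-memory stand-in, not the patch's on-disk format, and the function is not dx_show_buckets; it only shows the shape of such a traversal.

```python
def walk_index(node, depth=0, lo=0, hi=2**32):
    """Walk a toy hash index in bucket order, printing per-leaf statistics.

    node is either {"entries": [(min_hash, child), ...]} for an interior node
    or {"names": [(hash, name), ...]} for a leaf. Returns the number of names.
    """
    if "names" in node:
        # Every name in a leaf must hash into the range its parent assigned.
        assert all(lo <= h < hi for h, _ in node["names"]), "hash outside bucket"
        print(f"{'  ' * depth}[{lo:#010x}, {hi:#010x}) {len(node['names'])} names")
        return len(node["names"])
    total = 0
    entries = node["entries"]
    for i, (min_hash, child) in enumerate(entries):
        # A bucket ends where the next one starts; the last ends at the parent's limit.
        upper = entries[i + 1][0] if i + 1 < len(entries) else hi
        assert lo <= min_hash < upper, "index entries out of order"
        total += walk_index(child, depth + 1, min_hash, upper)
    return total
```

A checker built along these lines would fail on the same conditions it prints: buckets out of order, or a name stored in a leaf whose hash range does not cover it.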

--
Daniel


* Re: Ext2 directory index, updated
  2001-11-05  1:43   ` Daniel Phillips
  2001-11-05  7:48     ` Ville Herva
  2001-11-05 22:59     ` Christian Laursen
@ 2001-11-08  7:21     ` Christian Laursen
  2 siblings, 0 replies; 19+ messages in thread
From: Christian Laursen @ 2001-11-08  7:21 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

Daniel Phillips <phillips@bonn-fries.net> writes:

> On November 4, 2001 11:09 pm, Christian Laursen wrote:
> > Daniel Phillips <phillips@bonn-fries.net> writes:
> > 
> > > ***N.B.: still for use on test partitions only.***
> > 
> > It's the first time, I've tried this patch and I must say, that
> > the first impression is very good indeed.
> > 
> > I took a real world directory (my linux-kernel MH folder containing
> > roughly 115000 files) and did a 'du -s' on it.
> >
> Which kernel are you using?  From 2.4.10 on ext2 has an accelerator in 
> ext2_find_entry - it caches the last lookup position.  I'm wondering how that 
> affects this case.

I ran the tests again and got some real numbers this time.

The accelerator should work as normal, when the filesystem is not
mounted with -o index, shouldn't it (Although it's on a kernel
with the directory index patch)?

xi@tam:~/Mail > uname -a
Linux tam 2.4.13-3um #1 Sun Nov 4 14:29:19 CET 2001 i686 unknown

xi@tam:~/Mail > mount
/dev/ubd0 on / type ext2 (rw,index)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/ubd2 on /mnt/flaf type ext2 (rw)

xi@tam:/mnt/flaf > time du -s linux-kernel/
685652  linux-kernel

real    19m14.689s
user    0m1.650s
sys     23m39.000s

xi@tam:~/Mail > time du -s linux-kernel/
686432  linux-kernel

real    1m8.363s
user    0m5.500s
sys     0m57.350s
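
The two wall-clock figures above can be compared directly. A small sketch, assuming ~/Mail lives on the index-mounted root filesystem and /mnt/flaf is the unindexed copy (as the mount output suggests); the helper name is illustrative.

```python
import re

def to_seconds(stamp):
    """Convert a `time`-style stamp such as '19m14.689s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)

plain = to_seconds("19m14.689s")    # /mnt/flaf, mounted without -o index
indexed = to_seconds("1m8.363s")    # ~/Mail on the root fs, mounted with index
print(f"speedup: {plain / indexed:.1f}x")
```

That works out to roughly a 17-fold difference in elapsed time. The two du totals differ slightly, so the directories are not byte-identical copies and the ratio should be read as approximate.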


-- 
Best regards
    Christian Laursen


* Ext2 Directory Index, updated
@ 2002-03-04 11:03 Daniel Phillips
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Phillips @ 2002-03-04 11:03 UTC (permalink / raw)
  To: linux-kernel

After some considerable time letting this patch 'age', here's an update
to 2.4.18:

   http://people.nl.linux.org/~phillips/htree/htree-2.4.18

The diff to 2.4.17 was provided by Ted Ts'o, and bug hunting/fixing by
Chris Li (who has the rather interesting email address
<chrisl@gnuchina.org>).  Chris is working on the Ext3 port as well.

Bill Irwin is providing loving care and attention to the hash function, so I 
feel it's in good hands.

Known Bug:

  - highmem doesn't work (because the kmapping code is wrong)

To Do:

  - Finalize hash function.
  - Coalesce on delete.  I have to do this sooner rather than later,
    since hash bucket fragmentation leads quickly to reduced leaf
    fullness.
  - Stable telldir cookie for NFS
  - Ext2 util updates (in progress, Ted Ts'o)
  - Miscellaneous source cleanups

Because of the unfinalized hash function, this patch is still for testing 
only.

-- 
Daniel


* Re: Ext2 directory index, updated
  2001-11-02  3:36 Ext2 directory index, updated Daniel Phillips
@ 2001-11-02  5:04 ` Andreas Dilger
  0 siblings, 0 replies; 19+ messages in thread
From: Andreas Dilger @ 2001-11-02  5:04 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Nov 02, 2001  04:36 +0100, Daniel Phillips wrote:
> I ran it through some basic tests, up to half a million files/directory, 
> without problems.  There are still a few minor warts to clean up, including 
> still not having settled on a final-final hash function, although it looks 
> likely that it's going to end up being dx_hack_hash, with a more respectable 
> name.

Out of curiosity, does the blkdev-in-pagecache change make the indexed-dir
code simpler?  You should just be able to do bread() for a directory
block and it will work, because the "buffer" is actually backed by the
page cache.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Ext2 directory index, updated
@ 2001-11-02  3:36 Daniel Phillips
  2001-11-02  5:04 ` Andreas Dilger
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Phillips @ 2001-11-02  3:36 UTC (permalink / raw)
  To: linux-kernel

Here is the htree directory index patch for ext2, updated to 2.4.13.  
***N.B.: still for use on test partitions only.***

I ran it through some basic tests, up to half a million files/directory, 
without problems.  There are still a few minor warts to clean up, including 
still not having settled on a final-final hash function, although it looks 
likely that it's going to end up being dx_hack_hash, with a more respectable 
name.

I'm not 100% sure I've handled kmap/highmem correctly, and I haven't checked 
that yet.

This patch is just a snapshot of my work-in-progress.  There will be an 
update in another day or so, and a to-do list.  There are a few extra hash 
functions in the code from various sources, including reiserfs and bitkeeper, 
which I'll remove in the next update.  Those who find this kind of thing 
interesting may find these... interesting.

The patch is available at:

  http://nl.linux.org/~phillips/htree/ext2.index-2.4.13

To apply:

  cd /your/source/tree
  patch -p0 <this.patch

--
Daniel
