* Reiser4 debugging status update
@ 2003-09-30  9:28 Hans Reiser
From: Hans Reiser @ 2003-09-30  9:28 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Reiserfs mail-list

The filesystem is getting reasonably stable. 

This weekend we hit a bug in space reservation that we can't reproduce
yet, but it probably won't be too hard to find by code inspection.
There is some thought that the assertion, not the space reservation, is
buggy.  In any case, we'll release a snapshot after it is fixed.

Our performance is generally wonderful and getting better. 

It has the following weak points:

* We allocate a "jnode" per unformatted node in the filesystem.
Traversing these jnodes consumes more CPU than performing the memcpy
from user space to kernel space when doing large writes.  I don't yet
really understand on an intuitive level why this is so, which is a
reflection on my ignorance, as it is consistent with stories I have
heard from other implementors of filesystems who found that eliminating
per-page structures was an important part of optimizing large writes.
We will fix this by creating a new structure called an extent-node that
will exist on a per-extent basis, and this will probably cure the
problem.  This will greatly simplify parts of our code for reasons I
won't go into, and it will also take us 6 weeks to do it.  I don't think
users should wait for it, and so we will ship without it.

* Our dbench performance was poor but has improved due to coding
changes; we need to test and analyze again.  More fixes may be needed;
we can't say yet.

* Our fsync performance is poor.  Frankly, we will pay attention to this
next year, after we have fully implemented the transactions API.  At
that point we will say something like "if you care about fsync
performance you should be using the transactions API and/or sponsoring
us to tune for NVRAM"; users will say back "but our legacy apps on
hardware without NVRAM matter!"; and we will grudgingly but effectively
tune for this because we care about real users too. ;-)
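
To make the first weak point above concrete, here is a rough sketch of
why per-node bookkeeping grows with write size while per-extent
bookkeeping does not.  This is not reiser4 code: the Extent class, the
4 KiB block size, and the 4-extent layout are all assumptions chosen
for illustration.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed block size in bytes (not taken from the email)


@dataclass
class Extent:
    """Hypothetical per-extent record: a run of contiguous blocks."""
    start_block: int
    length: int  # number of blocks in the run


def records_to_visit(extents):
    """Return (per-node record count, per-extent record count).

    With one record per unformatted node, a traversal touches one
    record per block.  With one record per extent, it touches one
    record per contiguous run.
    """
    per_node = sum(e.length for e in extents)
    per_extent = len(extents)
    return per_node, per_extent


# Assumed layout: a 64 MiB write that lands in 4 contiguous extents.
extents = [Extent(start_block=i * 4096, length=4096) for i in range(4)]
per_node, per_extent = records_to_visit(extents)
total_bytes = per_node * BLOCK_SIZE
print(total_bytes, per_node, per_extent)
```

Under these assumptions the write covers 16384 blocks, so a per-node
scheme walks 16384 records where a per-extent scheme walks 4.  The
sketch only counts records; it says nothing about the actual CPU cost
per record.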

Nikita recently invented and implemented a clever bit of code that keeps 
track of the highest node in the tree that spans a directory, and then 
performs repeat lookups within the same directory starting from there 
rather than the root.  This is a nice answer to those who keep asking
me whether it wouldn't be faster to have separate trees for each
directory.  Now I have a better answer for them --- nice work, Nikita.
It also has the
nice side effect of reducing spin lock contention on the root node for 
4-way SMP.
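
A toy model of the idea, for readers who want to see the shape of it.
This is not Nikita's code.  It assumes that a directory's entries
occupy one contiguous key range, and it reads "the node that spans a
directory" as the deepest node whose key range still covers that whole
range.  Node, descend, spanning_node and build are hypothetical names,
and a real implementation would also have to invalidate the remembered
node when the tree is rebalanced.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: int  # smallest key covered (inclusive)
    hi: int  # largest key covered (inclusive)
    children: list = field(default_factory=list)


def descend(start, key):
    """Walk from `start` to the leaf covering `key`.

    Returns (leaf, number of nodes visited including `start`).
    """
    node, visited = start, 1
    while node.children:
        node = next(c for c in node.children if c.lo <= key <= c.hi)
        visited += 1
    return node, visited


def spanning_node(root, lo, hi):
    """Deepest node whose key range covers all of [lo, hi]."""
    node = root
    while True:
        nxt = next((c for c in node.children
                    if c.lo <= lo and hi <= c.hi), None)
        if nxt is None:
            return node
        node = nxt


def build(lo, hi, fanout, depth):
    """Build a uniform tree over keys lo..hi (illustrative only)."""
    node = Node(lo, hi)
    if depth > 0:
        step = (hi - lo + 1) // fanout
        node.children = [
            build(lo + i * step, lo + (i + 1) * step - 1, fanout, depth - 1)
            for i in range(fanout)
        ]
    return node


root = build(0, 4095, fanout=4, depth=4)   # leaves cover 16 keys each
dir_lo, dir_hi = 260, 300                  # assumed key range of one directory
hint = spanning_node(root, dir_lo, dir_hi) # remembered after the first lookup

leaf_a, from_root = descend(root, 290)     # ordinary lookup from the root
leaf_b, from_hint = descend(hint, 290)     # repeat lookup from the hint
print(from_root, from_hint)
```

In this toy tree the lookup from the root visits 5 nodes and the repeat
lookup from the remembered node visits 2, and the second one never
touches the root at all, which is where the reduced lock contention
would come from.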

I am hoping to move my laptop to SuSE 9.0 running reiser4 sometime this 
week, and I am hoping we will ask for more outside testers to help us 
find bugs at that time.  While I have mentioned only the performance
flaws in this email, our overall performance seems to leave little doubt
that the filesystem as-is is far better than V3.  Even though it will
get much faster with another year or so of tuning, if we are already the
fastest available on Linux, we should be shipping now (assuming we find
no new bugs in the last round of internal testing).

Benchmarks can be found at www.namesys.com/benchmarks.html

As you can see in those benchmarks, in V4 tails IMPROVE performance due 
to saving IO transfer time.  This is a great improvement over V3, and 
generally speaking V4 stomps all over V3 performance.  It also scales 
better, has plugins, and improves semantics a little bit (big semantic
improvements will be in the next major release, not V4).
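
As a back-of-the-envelope illustration of the IO saving from tails, the
sketch below compares bytes transferred when the last partial block of
each file occupies a whole block versus when it is packed alongside
other data.  It is a deliberately simplified model, not a measurement:
the 4 KiB block size and the file sizes are assumptions, per-item
overhead is ignored, and packed tails are counted at their own size as
if the shared block's cost were perfectly amortized.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def bytes_transferred(file_size, pack_tail):
    """Bytes moved for one file under the simplified model.

    Without tail packing, a partial final block costs a full block.
    With tail packing, it costs only the bytes it actually holds.
    """
    full_blocks, tail = divmod(file_size, BLOCK_SIZE)
    if pack_tail or tail == 0:
        return full_blocks * BLOCK_SIZE + tail
    return (full_blocks + 1) * BLOCK_SIZE


sizes = [100, 1500, 5000, 8192]  # assumed file sizes in bytes
unpacked = sum(bytes_transferred(s, pack_tail=False) for s in sizes)
packed = sum(bytes_transferred(s, pack_tail=True) for s in sizes)
print(unpacked, packed)
```

For this made-up fileset the model moves 24576 bytes without tails and
14792 with them.  The saving is largest when most files are small
relative to the block size.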

You'll also notice that we increased the size of the fileset to be more 
fair to ext3, and we tested some ext3 configurations Andrew Morton 
suggested testing.

-- 
Hans


