* [ANNOUNCE] DualFS: File System with Meta-data and Data Separation
@ 2007-02-11  3:06 Sorin Faibish
From: Sorin Faibish @ 2007-02-11  3:06 UTC
  To: linux-fsdevel; +Cc: piernas

Introducing DualFS

File system developers have played with the idea of separating
meta-data from data in file systems for a while. The idea was
recently revived by a small group of file system enthusiasts
from Spain (from the little-known University of Murcia), and the
result is called DualFS. We believe that this separation will
bring great value to Linux file systems.

We see DualFS as a next-generation journaling file system which
has the same consistency guarantees as traditional journaling
file systems but better performance characteristics. The new file
system puts data and meta-data on different devices (usually, two
partitions on the same disk or different disks), and manages them
in very different ways:

1. DualFS has only one copy of every meta-data block. This copy is
    in the meta-data device,

2. The meta-data device is a log which is used by DualFS both to
    read and to write meta-data blocks.

3. DualFS avoids an extra copy of meta-data blocks, which allows
    DualFS to achieve higher performance than other journaling file
    systems.

4. DualFS implements performance enhancements: meta-data prefetch,
    on-line meta-data relocation and faster fsck and mkfs operations.

5. DualFS is suitable for use with terabytes and petabytes of storage.
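
Points 1-3 above can be illustrated with a toy model that counts
block writes. This is only a sketch under simplifying assumptions,
not the DualFS implementation: the class names, the in-memory lists
and the write counter are all hypothetical, and the model ignores
data blocks, the cleaner, and any coalescing a real journal performs.

```python
class JournalingMetadata:
    """Toy journaling model: each update hits the journal, then its home block."""

    def __init__(self):
        self.journal = []        # pending (block_id, contents) records
        self.home = {}           # block_id -> contents at the fixed location
        self.device_writes = 0   # block writes issued to the device

    def update(self, block_id, contents):
        self.journal.append((block_id, contents))
        self.device_writes += 1  # first copy: journal write

    def checkpoint(self):
        for block_id, contents in self.journal:
            self.home[block_id] = contents
            self.device_writes += 1  # second copy: in-place write
        self.journal.clear()

    def read(self, block_id):
        # Newest journal record wins; otherwise fall back to the home copy.
        for logged_id, contents in reversed(self.journal):
            if logged_id == block_id:
                return contents
        return self.home[block_id]


class LogMetadata:
    """Toy log model: the log holds the only copy and serves reads too."""

    def __init__(self):
        self.log = []            # append-only list of block contents
        self.index = {}          # block_id -> position of the latest copy
        self.device_writes = 0

    def update(self, block_id, contents):
        self.index[block_id] = len(self.log)
        self.log.append(contents)
        self.device_writes += 1  # the only write this block needs

    def read(self, block_id):
        return self.log[self.index[block_id]]


# Three updates to three distinct (hypothetical) meta-data blocks.
updates = [("inode-1", "a"), ("inode-2", "b"), ("dir-1", "c")]
journaling, logged = JournalingMetadata(), LogMetadata()
for block_id, contents in updates:
    journaling.update(block_id, contents)
    logged.update(block_id, contents)
journaling.checkpoint()

print(journaling.device_writes, logged.device_writes)  # 6 3
```

In this model the journaling variant issues two writes per meta-data
block and the log variant one, which is the "extra copy" that point 3
refers to. How large the gain is in practice depends on factors the
toy leaves out.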

We have carried out a number of experiments which compare DualFS
with other popular Linux file systems, namely Ext2, Ext3, XFS, JFS,
and ReiserFS. The results, for both performance and management
tasks, demonstrate the value of the new file system design: the
separation of data and meta-data increases performance by up to 97%
simply by using an additional partition on the same disk.

We have performed extensive tests using micro-benchmarks as well
as macro-benchmarks, including Postmark v1.5, SpecWeb99, and
TPCC-uva. We also measured the performance of maintenance tasks
such as mkfs and fsck. All of these tests show that DualFS
outperforms all the other file systems tested, with a performance
advantage in the range of 50-300% depending on the benchmark and
the configuration. All of this advantage is a direct result of the
separation of meta-data and data.

The project was started in 2000 by Juan Piernas Canovas, the primary
and almost sole contributor, with some small contributions by Toni
Cortes and Jose M. Garcia. The project was stopped for some time. We
restarted it last year, and after several months of updates and
tests we created a SourceForge project with the intent to share
the value of this old and yet new concept.

The DualFS code, tools and performance papers are available at:

  	<http://sourceforge.net/projects/dualfs>

The code requires patches to kernel 2.4.19 (an oldie but a goodie)
and separate fsck code.  The latest kernel we have used it with is
2.6.11, and we hope, with your help, to port it to the latest Linux
kernel.

We will present the architecture, principles and performance
characterization at LFS07 next week.

We are very interested in getting your feedback and criticism.

Sorin Faibish and Juan Piernas Canovas
--------------------------------------




* Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation
@ 2007-02-17  3:44 Adam J. Richter
From: Adam J. Richter @ 2007-02-17  3:44 UTC
  To: sfaibish; +Cc: linux-fsdevel

>The new file
>system puts data and meta-data on different devices (usually, two
>partitions on the same disk or different disks), [...]

	I can intuitively see the benefits in terms of disk arm motion.

	For example, mkfs, fsck, and chmod -R would involve seeks
entirely within the metadata partition, which might be 5% of the size
of the data partition.  If the disk arm spends half its time
accelerating and half decelerating, then seeking a distance D should
take time roughly proportional to sqrt(D), so reducing the average
seek distance by a factor of 20 when metadata operations follow each
other would reduce average disk arm motion time by a factor of almost
4.5 (sqrt(20)), although rotational delays would be unaffected.
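
	The estimate above can be checked with a few lines of Python.
This is only a sketch of the idealized model (constant acceleration
for the first half of the seek, equal deceleration for the second
half); the function name, the unit acceleration and the normalized
distances are assumptions chosen for illustration, and real drives
also have settle time and rotational delay.

```python
import math


def seek_time(distance, accel=1.0):
    """Arm travel time when accelerating for half the seek and braking for the rest.

    Half the distance is covered during acceleration:
        distance / 2 = accel * (t / 2) ** 2 / 2
    which gives t = 2 * sqrt(distance / accel), i.e. t grows like sqrt(distance).
    """
    return 2.0 * math.sqrt(distance / accel)


full_span = 1.0       # normalized seek distance across the data partition
metadata_span = 0.05  # metadata partition assumed to be 5% of that size

speedup = seek_time(full_span) / seek_time(metadata_span)
print(round(speedup, 2))  # 4.47
```

The ratio does not depend on the value chosen for accel, because it
cancels out; only the factor-of-20 reduction in distance matters.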

	It is also possible that your log structured metadata file
system might be improving locality further by separating unused blocks
from used blocks on the scale of disk tracks.  When the disk reads a
sector in a segment of a log structured file system, the disk drive
will basically cache the whole track, and the other sectors on that
track will be more likely to have valid data, thanks to the cleaner.
Of course, this has to be compared to similar benefits from the layout
policies of other file systems.

	On another subject, it would reduce system administration
overhead down to the level of other file systems if your file system
could run on a single logical device and just have a policy of
having the metadata area concentrated in a particular place on the
logical device (probably the middle), so that the file system would
automatically accommodate changing ratios of data to metadata.  This
would also make it easier to run the whole thing on an interleaved
RAID.  Retaining the ability to put metadata on a separate device
would still be useful for putting metadata in a different type of
device (such as a faster spinning disk or flash).

Adam Richter

* Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation
@ 2007-02-17  3:59 Adam J. Richter
From: Adam J. Richter @ 2007-02-17  3:59 UTC
  To: sfaibish; +Cc: linux-fsdevel

I wrote:
>	For example, mkfs, fsck, and chmod -R would involve seeks
>entirely within the metadata partition, which might be 5% the size of
>the data partition.  [...] so reducing this average seek distance
>[by a factor of 20] when
>metadata operations follow each other would reduce average disk arm
>motion time by a factor of almost 4.5, although rotational delays
>would be unaffected.

       Before anyone jumps on me for a seemingly wild claim, I want to
clarify that I understand that metadata in file systems like bsd-ffs,
ext2 and ext3 is already grouped together in cylinder groups.  It is
just the presumably less frequent seeks between metadata in different
cylinder groups that I expect would be improved in this manner.

Adam Richter


Thread overview: 42+ messages
2007-02-11  3:06 [ANNOUNCE] DualFS: File System with Meta-data and Data Separation Sorin Faibish
2007-02-14 21:10 ` sfaibish
2007-02-14 21:57   ` Jan Engelhardt
2007-02-15 18:38     ` Juan Piernas Canovas
2007-02-15 20:09       ` Jörn Engel
2007-02-15 22:59         ` Juan Piernas Canovas
2007-02-16  9:13           ` Jörn Engel
2007-02-16 11:05             ` Benny Amorsen
2007-02-16 23:47             ` Bill Davidsen
2007-02-17 15:11               ` Jörn Engel
2007-02-17 18:10                 ` Bill Davidsen
2007-02-17 18:36                   ` Jörn Engel
2007-02-17 20:47                     ` Sorin Faibish
2007-02-18  5:59                       ` Jörn Engel
2007-02-18 12:46                         ` Jörn Engel
2007-02-19 23:57                         ` Juan Piernas Canovas
2007-02-20  0:10                           ` Bron Gondwana
2007-02-20  0:30                           ` Jörn Engel
2007-02-21  4:36                             ` Juan Piernas Canovas
2007-02-21 12:37                               ` Jörn Engel
2007-02-21 18:31                                 ` Juan Piernas Canovas
2007-02-21 19:25                                   ` Jörn Engel
2007-02-22  4:30                                     ` Juan Piernas Canovas
2007-02-22 16:25                                       ` Jörn Engel
2007-02-22 19:57                                         ` Juan Piernas Canovas
2007-02-23 13:26                                           ` Jörn Engel
2007-02-24 22:35                                             ` Sorin Faibish
2007-02-25  2:41                                             ` Juan Piernas Canovas
2007-02-25 12:01                                               ` Jörn Engel
2007-02-26  3:48                                                 ` Juan Piernas Canovas
2007-02-20 20:43                           ` Bill Davidsen
2007-02-15 20:38       ` Andi Kleen
2007-02-15 19:46         ` Jan Engelhardt
2007-02-16  1:43           ` sfaibish
2007-02-15 21:09         ` Juan Piernas Canovas
2007-02-15 23:57           ` Andi Kleen
2007-02-16  4:57             ` Juan Piernas Canovas
2007-02-26 11:49   ` Yakov Lerner
2007-02-26 13:08     ` Matthias Schniedermeyer
2007-02-26 13:24     ` Sorin Faibish
2007-02-17  3:44 Adam J. Richter
2007-02-17  3:59 Adam J. Richter
