linux-kernel.vger.kernel.org archive mirror
From: Andrew Morton <andrewm@uow.edu.au>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Dan Kegel <dank@kegel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: O_DIRECT please; Sybase 12.5
Date: Fri, 06 Jul 2001 01:06:53 +1000
Message-ID: <3B44828D.C220CAE@uow.edu.au>
In-Reply-To: <3B3C4CB4.6B3D2B2F@kegel.com>, <3B3C4CB4.6B3D2B2F@kegel.com>; <20010705155350.O17051@athlon.random> <3B44797F.DD9EAC99@uow.edu.au>, <3B44797F.DD9EAC99@uow.edu.au>; from andrewm@uow.edu.au on Fri, Jul 06, 2001 at 12:28:15AM +1000 <20010705163716.R17051@athlon.random>

Andrea Arcangeli wrote:
> 
> On Fri, Jul 06, 2001 at 12:28:15AM +1000, Andrew Morton wrote:
> > ext3 journals data.  That's unique and it breaks things (or rather,
> > things break it).   It'd be trivial to support O_DIRECT in ext3's
> > writeback mode (metadata-only), but nobody uses that.
> 
> I thought everybody uses metadata-only to avoid killing data-write
> performance.

ext3 has three modes:

data=journal

	Data is journalled.  Yes, this slows things down
	significantly.

data=ordered

	The default mode and the most popular.  All data is written
	to disk prior to a commit.  Write throughput is good, and
	you don't have uninitialised data in your files after a
	crash.

data=writeback

	Metadata-only.  Better write throughput (in dbench, anyway),
	but only metadata integrity is preserved after a crash, i.e.
	fsck says the fs is fine, but files can (and almost always do)
	contain random stuff after a crash.
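
To make the difference between the three modes concrete, here is a
rough userspace sketch in C.  It is not ext3's option parser: the
enum, the table and both function names are made up for illustration.
It only encodes what is said above: ordered is the default, and only
writeback lets files hold stale blocks after a crash.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the three ext3 data modes described above.
 * These names are illustrative, not the kernel's own identifiers. */
enum data_mode { DATA_JOURNAL, DATA_ORDERED, DATA_WRITEBACK, DATA_INVALID };

struct mode_info {
    const char *option;       /* value given as data=<option> at mount time */
    bool data_journalled;     /* file data itself goes through the journal */
    bool data_before_commit;  /* file data is on disk before the commit */
};

static const struct mode_info mode_table[] = {
    [DATA_JOURNAL]   = { "journal",   true,  true  },
    [DATA_ORDERED]   = { "ordered",   false, true  },
    [DATA_WRITEBACK] = { "writeback", false, false },
};

/* Map a "data=..." option string to a mode.  A missing option yields
 * ordered, which the message names as the default. */
enum data_mode parse_data_mode(const char *opt)
{
    if (opt == NULL || *opt == '\0')
        return DATA_ORDERED;
    if (strncmp(opt, "data=", 5) != 0)
        return DATA_INVALID;
    for (int m = DATA_JOURNAL; m <= DATA_WRITEBACK; m++)
        if (strcmp(opt + 5, mode_table[m].option) == 0)
            return (enum data_mode)m;
    return DATA_INVALID;
}

/* Files can contain stale blocks after a crash only when data is not
 * forced out before the metadata commit. */
bool stale_data_possible(enum data_mode m)
{
    return m != DATA_INVALID && !mode_table[m].data_before_commit;
}
```

With this model, parse_data_mode("data=writeback") is the only valid
input for which stale_data_possible() returns true.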

Ordered data mode is really nice.  It's not magical though - for example,
if you reset the machine during a kernel build, a subsequent `make' will
fail because you have a number of .o files which have zero length.
That's the length they happened to have when the machine went down.

For ordered-data mode we need to keep track of all the buffers which
are associated with a transaction's journalled metadata and ensure that
they are written out before the transaction commits.  That is done with
a little structure which hangs off ->b_private.
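
The bookkeeping can be pictured with a toy model in C.  This is a
sketch under loud assumptions, not the ext3/JBD code: struct buffer
stands in for a buffer head, and struct ordered_link,
track_data_buffer() and commit_transaction() are invented names.  It
shows only the ordering rule: every tracked data buffer is written
before the commit is allowed to happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer head; only the fields the sketch needs. */
struct buffer {
    bool dirty;        /* not yet written to disk */
    void *b_private;   /* points at the tracking record, if any */
};

/* Hypothetical per-buffer record hung off b_private. */
struct ordered_link {
    struct buffer *bh;
    struct ordered_link *next;
};

struct transaction {
    struct ordered_link *data_list;  /* data buffers tied to this transaction */
    int data_writes;                 /* data buffers flushed so far */
    bool committed;
};

/* Attach a dirty data buffer to the running transaction. */
void track_data_buffer(struct transaction *t, struct buffer *bh,
                       struct ordered_link *link)
{
    link->bh = bh;
    link->next = t->data_list;
    t->data_list = link;
    bh->b_private = link;
}

/* Stand-in for submitting the buffer to disk and waiting for it. */
static void write_buffer(struct transaction *t, struct buffer *bh)
{
    if (bh->dirty) {
        bh->dirty = false;
        t->data_writes++;
    }
}

/* Commit: flush every tracked data buffer first, then commit. */
void commit_transaction(struct transaction *t)
{
    for (struct ordered_link *l = t->data_list; l != NULL; l = l->next)
        write_buffer(t, l->bh);

    /* Ordering invariant: nothing tracked is still dirty; detach. */
    for (struct ordered_link *l = t->data_list; l != NULL; l = l->next) {
        assert(!l->bh->dirty);
        l->bh->b_private = NULL;
    }
    t->data_list = NULL;
    t->committed = true;  /* stand-in for writing the commit record */
}
```

The caller owns the ordered_link storage here to keep the sketch free
of allocation; real code would have to allocate and free it.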

> So I thought it was ok to at first support O_DIRECT only
> for metadata journaling, doing that should be a three liner as you said
> and that is what I expected.

Yup.  metadata-only journalling is all-round much, much simpler.
