From: Matthias Andree <matthias.andree@gmx.de>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: True fsync() in Linux (on IDE)
Date: Thu, 18 Mar 2004 12:34:53 +0100
Message-ID: <20040318113453.GB6864@merlin.emma.line.org>
In-Reply-To: <20040318064757.GA1072@suse.de>

On Thu, 18 Mar 2004, Jens Axboe wrote:

> Chris and I have working real fsync() with the barrier patches. I'll
> clean it up and post a patch for vanilla 2.6.5-rc today.

This is good news.

The barrier stuff is long overdue^UI'm looking forward to this.

I'm using the term "TCQ" liberally although it may be inexact for older
(parallel) ATA generations:

All these ATA fsync() vs. write cache issues have been open for much too
long - no reproaches, but it's a pity we haven't been able to have data
consistency for databases and fast bulk writes (that need the write
cache without TCQ) in the same drive for so long. I have seen Linux
introduce TCQ for PATA early in 2.5, then drop it again. Similarly,
FreeBSD ventured into TCQ for ATA but appears to have dropped it again
as well.
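
To make the application side of this concrete, here is a minimal sketch of what a database-style writer typically does to request durability. It is an illustration only and not taken from any patch in this thread; the helper name `durable_write` is hypothetical. The comment about the drive cache restates the concern above: a successful fsync() is only as strong as the flush or barrier support underneath it.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> None:
    """Write payload to path, then ask the kernel to flush it to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # fsync() raises OSError on failure. On success the kernel has
        # pushed the data to the device; whether it has left a volatile
        # drive write cache depends on the layers below (assumption:
        # that is exactly what "true fsync()" is meant to address).
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    durable_write(os.path.join(demo_dir, "journal.bin"), b"commit record\n")
```

The caller cannot tell from the return of fsync() alone whether the drive reordered or cached the write, which is why the guarantee has to come from the kernel and the hardware.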

May I ask that the information whether a particular driver (file system,
hardware) supports write barriers be exposed in a standard way, for
instance in the Kconfig help lines?

If I recall correctly from earlier patches, the barrier stuff is 1.
command model (ATA vs. SCSI) specific and 2. driver and hardware
specific and 3. requires that the file system knows how to use this
properly.

Given that file systems have certain write ordering requirements if they
are to be recoverable after a crash, I suspect Linux has _not_ been able
to guarantee on-disk consistency at any time for years, which means
that a crash at the wrong moment can kill the file system itself if the
drive has reordered writes - only ext3 without write cache seems to
behave better in this respect (data=ordered).

I would like to have a document that shows which file system, which
chipset driver for PATA, which chipset driver for ATA, and which
low-level SCSI host adaptor driver support write barriers. We will
probably also need to check whether intermediate layers such as md and
dm-mod propagate such information.
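
The reasoning behind such a document can be sketched in a few lines: a barrier only helps if every layer between the file system and the drive honours and passes it on. The following is a hypothetical model, not kernel code; the `Layer` type, the function names and the layer labels are placeholders made up for illustration and say nothing about any real driver.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str       # placeholder label, not a real driver or file system
    kind: str       # e.g. "filesystem", "md/dm", "driver", "drive"
    barriers: bool  # does this layer honour and propagate write barriers?


def barriers_effective(stack: List[Layer]) -> bool:
    """True only if every layer in the I/O stack supports barriers."""
    return all(layer.barriers for layer in stack)


def weakest_links(stack: List[Layer]) -> List[str]:
    """Names of the layers that would silently drop the ordering guarantee."""
    return [layer.name for layer in stack if not layer.barriers]


if __name__ == "__main__":
    stack = [
        Layer("example-fs", "filesystem", True),
        Layer("example-raid", "md/dm", False),
        Layer("example-chipset", "driver", True),
    ]
    print(barriers_effective(stack), weakest_links(stack))
```

A support table of the kind proposed above would supply the `barriers` value for each real component; the point of the model is that a single non-propagating layer is enough to void the guarantee.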

Given the necessary information, I can hack together an HTML document to
provide this information; this offer has, however, not seen any response
in the past. I am not acquainted with the drivers and need
information from the kernel hackers. Without such support, such a
documentation effort is doomed.

BTW, I should very much like to be able to trace the low-level write
information that goes out to the device, possibly including the payload
- something like tcpdump for the ATA or SCSI commands that are sent to
the driver. Is such a facility available?

-- 
Matthias Andree

Encrypt your mail: my GnuPG key ID is 0x052E7D95
