linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	Edward Shishkin <edward@redhat.com>
Subject: Re: [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error
Date: Tue, 17 Jan 2012 11:36:13 +1100	[thread overview]
Message-ID: <20120117003613.GA28571@dastard> (raw)
In-Reply-To: <CA+55aFyUiLq7UZeQD=-MU5ppvEDULiBP8xV0mJqVLL6nFAi7VA@mail.gmail.com>

On Mon, Jan 16, 2012 at 10:55:55AM -0800, Linus Torvalds wrote:
> On Mon, Jan 16, 2012 at 8:01 AM, Jan Kara <jack@suse.cz> wrote:
> >
> >  Hum, let me understand this. I understand the meaning of buffer_uptodate
> > bit as "the buffer has at least as new content as what is on disk". Now
> > when storage cannot write the block under the buffer, the contents of the
> > buffer is still "at least as new as what is (was) on disk".
> 
> No.
> 
> Stop making crap up.

Jan is right, Linus. His definition of what up-to-date means for
dirty buffers is correct, especially in the case of write errors.

> If the write fails, the buffer contents have *nothing* to do with what
> is on disk.

The dirty buffer contains what is *supposed* to be on disk. If we
fail to write it, we corrupt some application's data.

> You don't know what the disk contents are.

But *we don't care* what is on disk after a write error because
there is no guarantee that after a write error we can even read the
previous data that was on disk. IOWs, the contents of the region on
disk where the write failed is -undefined- and cannot be trusted.

> So clearly the buffer cannot be up-to-date.

What we have in memory is what is *supposed* to be on disk, and the
error is telling us that the disk is failing to be made up-to-date.
IOWs, the disk is stale after a write error, not what is in memory.
So clearly the buffer contains the up-to-date version of the data
after a write error.

How the filesystem handles that error is now up to the filesystem.
For example, the filesystem can chose to allocate new blocks for the
failed write and write the valid, up-to-date in-memory data to a
different location and continue onwards without errors. From this
example, it's pretty obvious that the data in memory contains the
data that what we need to care about after a write error, not what
is on disk.

> Now, feel free to use *other* arguments for why we shouldn't clear the
> up-to-date bit, but using the disk contents as one is pure and utter
> garbage. And it is *obviously* pure and utter garbage.

For the read case you are correct, but that logic (that the disk
version is always correct) does not apply to handling write errors.
It's an important distinction....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2012-01-17  0:36 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-05 14:40 [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 1/3] fs: Convert checks for write IO errors from !buffer_uptodate to buffer_write_io_error Jan Kara
2012-01-05 14:40 ` [PATCH 2/3] fs: Do not clear uptodate flag on write IO error Jan Kara
2012-01-05 14:40 ` [PATCH 3/3] ext2: Replace tests of write IO errors using buffer_uptodate Jan Kara
2012-01-05 22:16 ` [RFC PATCH 0/3] Stop clearing uptodate flag on write IO error Andrew Morton
2012-01-15  2:19 ` Linus Torvalds
2012-01-16 16:01   ` Jan Kara
2012-01-16 18:55     ` Linus Torvalds
2012-01-16 19:06       ` Linus Torvalds
2012-01-17  0:36       ` Dave Chinner [this message]
2012-01-17  0:59         ` Linus Torvalds
2012-01-17 10:46           ` Boaz Harrosh
2012-01-23  3:04           ` Dave Chinner
2012-01-23 21:47             ` Ted Ts'o
2012-01-23 23:49               ` Linus Torvalds
2012-01-24  6:12                 ` Dave Chinner
2012-01-24  7:10                   ` Linus Torvalds
2012-01-24 12:13                     ` Jan Kara
2012-01-24  0:36               ` Dave Chinner
2012-01-26 12:17                 ` Ric Wheeler
2012-01-26 20:51                   ` Jan Kara
2012-01-26 20:58                     ` Ric Wheeler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120117003613.GA28571@dastard \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=edward@redhat.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).