linux-fsdevel.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: Kevin Wolf <kwolf@redhat.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Christoph Hellwig <hch@infradead.org>,
	Ric Wheeler <rwheeler@redhat.com>, Rik van Riel <riel@redhat.com>
Subject: Re: [LSF/MM TOPIC] I/O error handling and fsync()
Date: Wed, 11 Jan 2017 00:03:56 -0500
Message-ID: <20170111050356.ldlx73n66zjdkh6i@thunk.org>
In-Reply-To: <20170110160224.GC6179@noname.redhat.com>

A couple of thoughts.

First of all, one of the reasons this probably hasn't been
addressed for so long is that programs that really care about issues
like this tend to use Direct I/O, and don't use the page cache at all.
And perhaps this is an option open to qemu as well?
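
For readers who have not used Direct I/O, here is a minimal sketch of
a write that bypasses the page cache on Linux.  The function name
direct_write_block() and the DIO_BLOCK constant are hypothetical, and
the 4096-byte alignment is an assumption; real code should query the
device's actual requirements.  Some filesystems reject O_DIRECT with
EINVAL, so callers need a fallback path.

```c
#define _GNU_SOURCE /* must precede all #includes to expose O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_BLOCK 4096 /* assumed alignment for buffer, offset and length */

/*
 * Write one aligned block filled with `fill` to `path`, bypassing the
 * page cache.  Returns 0 on success or a negative errno value.
 */
int direct_write_block(const char *path, char fill)
{
    void *buf = NULL;
    int err = posix_memalign(&buf, DIO_BLOCK, DIO_BLOCK);
    if (err != 0)
        return -err;
    memset(buf, fill, DIO_BLOCK);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0) {
        int saved = errno;
        free(buf);
        return -saved;
    }

    int rc = 0;
    ssize_t n = pwrite(fd, buf, DIO_BLOCK, 0);
    if (n < 0)
        rc = -errno;          /* the I/O error is reported to this call */
    else if (n != DIO_BLOCK)
        rc = -EIO;            /* treat a short write as a failure here */
    else if (fsync(fd) < 0)
        rc = -errno;          /* still needed for metadata and device caches */

    if (close(fd) < 0 && rc == 0)
        rc = -errno;
    free(buf);
    return rc;
}
```

Since the data never lives only in the page cache, a failed write is
reported while the application still holds its own buffer, so it can
retry or write the data elsewhere.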

Secondly, one of the reasons we mark the page clean is that we
didn't want a failing disk to cause memory to be trapped with no way of
releasing the pages.  For example, if a user plugs in a USB
thumb drive, writes to it, and then rudely yanks it out before all of
the pages have been written back, it would be unfortunate if the dirty
pages could only be released by rebooting the system.

So an approach that might work is for fsync() to keep the pages dirty,
but only while the file descriptor is open.  This could either be
the default behavior, or something that has to be specifically
requested via fcntl(2).  That way, as soon as the process exits (at
which point it will be too late for it to do anything to save the
contents of the file) we also release the memory.  And if the process
gets OOM killed, again, the right thing happens.  But if the process
wants to take emergency measures to write the file somewhere else, it
knows that the pages won't get lost until the file gets closed.
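
As a rough illustration of the "emergency measures" an application
might take when fsync() fails, here is a sketch that rescues the data
to a second location.  It does not use the fcntl(2) request discussed
above, which does not exist; instead it assumes the application still
holds its own copy of the data in a user-space buffer.  The names
write_and_sync() and save_with_fallback() are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write `len` bytes to `path` and fsync it.  Returns 0 or a negative errno. */
static int write_and_sync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;

    int rc = 0;
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; try again */
            rc = -errno;
            break;
        }
        p += n;
        left -= (size_t)n;
    }
    if (rc == 0 && fsync(fd) < 0)
        rc = -errno;            /* writeback errors surface here */
    if (close(fd) < 0 && rc == 0)
        rc = -errno;
    return rc;
}

/*
 * Try the primary path first; if writing or syncing fails, rescue the
 * caller's buffer to the backup path.  Returns 0 if the primary
 * succeeded, 1 if the data was rescued to the backup, or a negative
 * errno if both attempts failed.
 */
int save_with_fallback(const char *primary, const char *backup,
                       const void *buf, size_t len)
{
    if (write_and_sync(primary, buf, len) == 0)
        return 0;
    int rc = write_and_sync(backup, buf, len);
    return rc == 0 ? 1 : rc;
}
```

The sketch only works because the caller kept the buffer.  An
application that relied on the page cache alone would have nothing
left to rescue once the pages were marked clean and evicted.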

(BTW, a process could guarantee this today without any kernel changes
by mmap'ing the whole file and mlock'ing the pages that it had
modified.  That way, even if there is an I/O error and the fsync
causes the pages to be marked clean, the pages wouldn't go away.
However, this is really a hack, and it would probably be easier for
the process to use Direct I/O instead.  :-)
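
A minimal sketch of the mmap/mlock hack described above, under some
simplifying assumptions: it maps and locks only the first page rather
than the whole file, it resizes the file to exactly one page (demo
only), and it uses msync() with MS_SYNC to force writeback of the
mapped page.  The function name update_locked() is hypothetical.
mlock() can fail with ENOMEM or EPERM when RLIMIT_MEMLOCK is small.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy `text` into the first page of `path` through a shared mapping
 * that is locked in memory.  Returns 0 or a negative errno value.
 */
int update_locked(const char *path, const char *text)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len = strlen(text);
    if (page <= 0 || len > (size_t)page)
        return -EINVAL;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -errno;

    int rc = 0;
    char *map = MAP_FAILED;
    if (ftruncate(fd, (off_t)page) < 0) {   /* demo only: one-page file */
        rc = -errno;
        goto out;
    }
    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        rc = -errno;
        goto out;
    }
    if (mlock(map, (size_t)page) < 0) {     /* pin the page in RAM */
        rc = -errno;
        goto out;
    }

    memcpy(map, text, len);
    if (msync(map, (size_t)page, MS_SYNC) < 0) {
        /*
         * Writeback failed.  The locked page is still resident, so a
         * real program could copy from `map` to another file here.
         */
        rc = -errno;
    }
    munlock(map, (size_t)page);
out:
    if (map != MAP_FAILED)
        munmap(map, (size_t)page);
    if (close(fd) < 0 && rc == 0)
        rc = -errno;
    return rc;
}
```

Locking before modifying is a choice made for this sketch; the point
is only that the locked page cannot be reclaimed, so its contents
remain readable through the mapping even after a failed writeback.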


Finally, if the kernel knows that an error might be one that could be
resolved by the simple expedient of waiting (for example, if a fibre
channel cable is temporarily unplugged so it can be rerouted, but the
user might plug it back in a minute or two later, or a dm-thin device
is full, but the system administrator might do something to fix it),
then in an ideal world the kernel should deal with it without requiring
any magic from userspace applications.  There might be a helper system
daemon that enacts policy (for example: we've paged the sysadmin, so it's
OK to keep the pages dirty and retry the writebacks to the dm-thin volume
after the helper daemon gives the all-clear), but we shouldn't require
all userspace applications to have magic, Linux-specific retry code.

Cheers,

					- Ted

