linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Tejun Heo <tj@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>, Jens Axboe <axboe@kernel.dk>,
	tomaz.solc@tablix.org, aaron.lu@intel.com,
	linux-kernel@vger.kernel.org, Oleg Nesterov <oleg@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Fengguang Wu <fengguang.wu@intel.com>
Subject: Re: Writeback threads and freezable
Date: Fri, 20 Dec 2013 15:00:17 +0100	[thread overview]
Message-ID: <1560958.dtxLEj3lem@vostro.rjw.lan> (raw)
In-Reply-To: <20131219162411.GD16994@htj.dyndns.org>

On Thursday, December 19, 2013 11:24:11 AM Tejun Heo wrote:
> Yo, Dave.
> 
> On Thu, Dec 19, 2013 at 03:08:21PM +1100, Dave Chinner wrote:
> > > If knowing that the underlying device has gone away somehow helps
> > > filesystem, maybe we can expose that interface and avoid flushing
> > > after hotunplug but that merely hides the possible deadlock scenario
> > > that you're concerned about.  Nothing is really solved.
> > 
> > Except that a user of the block device has been informed that it is
> > now gone and has been freed from under it. i.e. we can *immediately*
> > inform the user that their mounted filesystem is now stuffed and
> > supress all the errors that are going to occur as a result of
> > sync_filesystem() triggering IO failures all over the place and then
> > having to react to that.i
> 
> Please note that there's no real "immediacy" in that it's inherently
> racy and that the extent of the usefulness of such notification can't
> reach much further than suppressing error messages.  Even that benefit
> is kinda dubious.  Don't we want to generate errors when a device is
> removed while dirty data / IOs are pending on it?  I fail to see how
> "supressing all the errors" would be a sane thing to do.
> 
> Another thing is that I think it's actually healthier in terms of
> excercise of code paths to travel those error paths on hot unplugs
> which are relatively common than taking a different behavior on them.
> It'll inevitably lower our test coverage.
> 
> > Indeed, there is no guarantee that sync_filesystem will result in
> > the filesystem being shut down - if the filesystem is clean then
> > nothing will happen, and it won't be until the user modifies some
> > metadata that a shutdown will be triggered. That could be a long
> > time after the device has been removed....
> 
> I still fail to see that why that is a problem.  Filesystems should be
> able to handle hot unplug or IO failures at any point in a reasonable
> way, so what difference would having a notification make other than
> introducing yet another exception code path?
> 
> > I don't see that there is a difference between a warm and hot unplug
> > from a filesystem point of view - both result in the filesystem's
> > backing device being deleted and freed, and in both cases we have to
> > take the same action....
> 
> Yeah, exactly, so what'd be the point of getting separate notification
> for hot unplug events?
> 
> > > Do you mean xfs never gives up after IO failures?
> > 
> > There's this thing called a transient IO failure which we have to
> > handle. e.g multipath taking several minutes to detect a path
> > failure and fail over, whilst in the mean time IO errors are
> > reported after a 30s timeout. So some types of async metadata write
> > IO failures are simply rescheduled for a short time in the future.
> > They'll either succeed, or continual failure will eventually trigger
> > some kind of filesystem failure.
> > 
> > If it's a synchronous write or a write that we cannot tolerate even
> > transient errors on (e.g. journal writes), then we'll shut down the
> > filesystem immediately.
> 
> Sure, filesystems should (be able to) react to different types of
> errors in different ways.  We still have a long way to go to do that
> properly but that should be done through IO failures not some side
> channel one-off "hotunplug" happened call.  Again, it doesn't solve
> anything.  It just side steps one very specific case in a half-assed
> way.

Another problem is to distinguish "hotunplug" from "power failure", for
example, because it may not be entirely clear what happened to start with
until way after when we get a "these devices are gone" notification from the
platform firmware.

So even if something may be regarded as a "hotunplug" from the order of the
Universe perspective, it still may look like a regular IO error at the device
level until device_del() is called for that device (which may not happen for
quite a while).

[That said, there are situations in which we may want to wait until it is
"safe" to eject stuff, or even we may want to allow subsystems to fail
"offline" requests, like in the memory eject case.  In those cases the
hot-removal actually consists of two parts, an offline and a removal,
where devices are supposed to be not in use after offline (which may fail).
But that is a different story. :-)]

Thanks,
Rafael

      parent reply	other threads:[~2013-12-20 13:46 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-13 17:49 Writeback threads and freezable Tejun Heo
2013-12-13 18:52 ` Tejun Heo
2013-12-13 20:40   ` [PATCH] libata, freezer: avoid block device removal while system is frozen Tejun Heo
2013-12-13 22:45     ` Nigel Cunningham
2013-12-13 23:07       ` Tejun Heo
2013-12-13 23:15         ` Nigel Cunningham
2013-12-14  1:55           ` Dave Chinner
2013-12-14 20:31           ` Tejun Heo
2013-12-14 20:36             ` Tejun Heo
2013-12-14 21:21               ` Nigel Cunningham
2013-12-17  2:35                 ` Rafael J. Wysocki
2013-12-17  2:34             ` Rafael J. Wysocki
2013-12-17 12:34               ` Tejun Heo
2013-12-18  0:35                 ` Rafael J. Wysocki
2013-12-18 11:17                   ` Tejun Heo
2013-12-18 21:48                     ` Rafael J. Wysocki
2013-12-18 21:39                       ` Tejun Heo
2013-12-18 21:41                         ` Tejun Heo
2013-12-18 22:04                           ` Rafael J. Wysocki
2013-12-19 23:35                             ` [PATCH wq/for-3.14 1/2] workqueue: update max_active clamping rules Tejun Heo
2013-12-20  1:26                               ` Rafael J. Wysocki
2013-12-19 23:37                             ` [PATCH wq/for-3.14 2/2] workqueue: implement @drain for workqueue_set_max_active() Tejun Heo
2013-12-20  1:31                               ` Rafael J. Wysocki
2013-12-20 13:32                                 ` Tejun Heo
2013-12-20 13:56                                   ` Rafael J. Wysocki
2013-12-20 14:23                                     ` Tejun Heo
2013-12-16 12:12         ` [PATCH] libata, freezer: avoid block device removal while system is frozen Ming Lei
2013-12-16 12:45           ` Tejun Heo
2013-12-16 13:24             ` Ming Lei
2013-12-16 16:05               ` Tejun Heo
2013-12-17  2:38     ` Rafael J. Wysocki
2013-12-17 12:36       ` Tejun Heo
2013-12-18  0:23         ` Rafael J. Wysocki
2013-12-17 12:50     ` [PATCH v2] " Tejun Heo
2013-12-18  1:04       ` Rafael J. Wysocki
2013-12-18 11:08         ` Tejun Heo
2013-12-18 12:07       ` [PATCH v3] " Tejun Heo
2013-12-18 22:08         ` Rafael J. Wysocki
2013-12-19 17:24           ` Tejun Heo
2013-12-19 18:54         ` [PATCH v4] " Tejun Heo
2013-12-14  1:53 ` Writeback threads and freezable Dave Chinner
2013-12-14 17:30   ` Greg Kroah-Hartman
2013-12-14 20:23   ` Tejun Heo
2013-12-16  3:56     ` Dave Chinner
2013-12-16 12:51       ` Tejun Heo
2013-12-16 12:56         ` Tejun Heo
2013-12-18  0:35           ` Dave Chinner
2013-12-18 11:43             ` Tejun Heo
2013-12-18 22:14               ` Rafael J. Wysocki
2013-12-19  4:08               ` Dave Chinner
2013-12-19 16:24                 ` Tejun Heo
2013-12-20  0:51                   ` Dave Chinner
2013-12-20 14:51                     ` Tejun Heo
2013-12-20 14:00                   ` Rafael J. Wysocki [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1560958.dtxLEj3lem@vostro.rjw.lan \
    --to=rjw@rjwysocki.net \
    --cc=aaron.lu@intel.com \
    --cc=axboe@kernel.dk \
    --cc=david@fromorbit.com \
    --cc=fengguang.wu@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=tj@kernel.org \
    --cc=tomaz.solc@tablix.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).