All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>, Jens Axboe <axboe@kernel.dk>,
	tomaz.solc@tablix.org, aaron.lu@intel.com,
	linux-kernel@vger.kernel.org, Oleg Nesterov <oleg@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Fengguang Wu <fengguang.wu@intel.com>
Subject: Re: Writeback threads and freezable
Date: Sat, 14 Dec 2013 15:23:24 -0500	[thread overview]
Message-ID: <20131214202324.GA4020@htj.dyndns.org> (raw)
In-Reply-To: <20131214015343.GP31386@dastard>

Hello, Dave.

On Sat, Dec 14, 2013 at 12:53:43PM +1100, Dave Chinner wrote:
> That's the fundamental problem here - device removal asks the device
> to fsync the filesystem on top of the device that was just removed.
> The simple way to trigger this is to pull a device from underneath
> an active filesystem (e.g. user unplugs a USB device without first
> unmounting it). There are many years worth of bug reports showing
> that this attempt by the device removal code to sync the filesystem
> leads to deadlocks.

What are you suggesting?  Implementing separate warm and hot unplug
paths?  That makes no sense whatsoever.  Device hot unplug is just a
sub operation of general device unplug which should be able to succeed
whether the underlying device is failing IOs or not.

Another thing to consider is that you want to excercise your error
handling path as often as possible.  IOs could still fail even during
device warm unplugs.  If anything atop the driver can't handle that,
that's a bug and we want to fix them.  The fact that we have bug
reports going back years on the subject simply means that we've been
traditionally lousy with error handling throughout the layers - those
were the occassions where we fixed actual error handling bugs, hot
unplug or not.  Special casing hot unplug would just make existing
error handling bugs more difficult to find.

> It's simply not a valid thing to do - just how is the filesystem
> supposed to sync to a non-existent device?

By issuing all IOs and then handling the failures appropriately.
Exactly just as it would handle *ANY* io errors.  This actually is the
simplest thing to do to drain the pipe - pushing things down all the
way and fail them at single point.  What's the alternative?  Implement
quick exit path at each layer?  That's just silly and error-prone.

> I've raised this each time a user reports it over the past few years
> and never been able to convince any to fix the filesystem
> re-entrancy problem device removal causes. Syncing the filesystem

Well, it isn't a good idea.

> will require taking locks that are by IO in progress, and so can't
> make progress until the IO is completed, but that can't happen
> until the error handling completes the sync of the filesystem....

IOs in progress shouldn't ever be stalled by filesystem sync.  That's
a massive layering violation.  Where and how does that happen?

> Preventing fs/io re-entrancy from contexts where we might be holding
> locks is the same reason we have GFP_NOFS and GFP_NOIO for memory
> allocation: re-entering a filesystem or IO subsystem whenever we are
> holding locks or serialised context in the fs/io path can deadlock
> the fs/io path.
>
> IOWs, syncing the filesystem from the device delete code is just
> plain wrong. That's what needs fixing - removing the cause of
> re-entrancy, not the workqueue or writeback code...

No, the wrong thing there is having IO failures depending on
filesystem sync.  Nothing else.

> > Ideas?
> 
> Fix the device delete error handling not to re-enter the
> filesystem and IO path. It's just wrong.

Nope, bad idea.

Thanks.

-- 
tejun

  parent reply	other threads:[~2013-12-14 20:23 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-13 17:49 Writeback threads and freezable Tejun Heo
2013-12-13 18:52 ` Tejun Heo
2013-12-13 20:40   ` [PATCH] libata, freezer: avoid block device removal while system is frozen Tejun Heo
2013-12-13 22:45     ` Nigel Cunningham
2013-12-13 23:07       ` Tejun Heo
2013-12-13 23:15         ` Nigel Cunningham
2013-12-14  1:55           ` Dave Chinner
2013-12-14 20:31           ` Tejun Heo
2013-12-14 20:36             ` Tejun Heo
2013-12-14 21:21               ` Nigel Cunningham
2013-12-17  2:35                 ` Rafael J. Wysocki
2013-12-17  2:34             ` Rafael J. Wysocki
2013-12-17 12:34               ` Tejun Heo
2013-12-18  0:35                 ` Rafael J. Wysocki
2013-12-18 11:17                   ` Tejun Heo
2013-12-18 21:48                     ` Rafael J. Wysocki
2013-12-18 21:39                       ` Tejun Heo
2013-12-18 21:41                         ` Tejun Heo
2013-12-18 22:04                           ` Rafael J. Wysocki
2013-12-19 23:35                             ` [PATCH wq/for-3.14 1/2] workqueue: update max_active clamping rules Tejun Heo
2013-12-20  1:26                               ` Rafael J. Wysocki
2013-12-19 23:37                             ` [PATCH wq/for-3.14 2/2] workqueue: implement @drain for workqueue_set_max_active() Tejun Heo
2013-12-20  1:31                               ` Rafael J. Wysocki
2013-12-20 13:32                                 ` Tejun Heo
2013-12-20 13:56                                   ` Rafael J. Wysocki
2013-12-20 14:23                                     ` Tejun Heo
2013-12-16 12:12         ` [PATCH] libata, freezer: avoid block device removal while system is frozen Ming Lei
2013-12-16 12:45           ` Tejun Heo
2013-12-16 13:24             ` Ming Lei
2013-12-16 16:05               ` Tejun Heo
2013-12-17  2:38     ` Rafael J. Wysocki
2013-12-17 12:36       ` Tejun Heo
2013-12-18  0:23         ` Rafael J. Wysocki
2013-12-17 12:50     ` [PATCH v2] " Tejun Heo
2013-12-18  1:04       ` Rafael J. Wysocki
2013-12-18 11:08         ` Tejun Heo
2013-12-18 12:07       ` [PATCH v3] " Tejun Heo
2013-12-18 22:08         ` Rafael J. Wysocki
2013-12-19 17:24           ` Tejun Heo
2013-12-19 18:54         ` [PATCH v4] " Tejun Heo
2013-12-14  1:53 ` Writeback threads and freezable Dave Chinner
2013-12-14 17:30   ` Greg Kroah-Hartman
2013-12-14 20:23   ` Tejun Heo [this message]
2013-12-16  3:56     ` Dave Chinner
2013-12-16 12:51       ` Tejun Heo
2013-12-16 12:56         ` Tejun Heo
2013-12-18  0:35           ` Dave Chinner
2013-12-18 11:43             ` Tejun Heo
2013-12-18 22:14               ` Rafael J. Wysocki
2013-12-19  4:08               ` Dave Chinner
2013-12-19 16:24                 ` Tejun Heo
2013-12-20  0:51                   ` Dave Chinner
2013-12-20 14:51                     ` Tejun Heo
2013-12-20 14:00                   ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131214202324.GA4020@htj.dyndns.org \
    --to=tj@kernel.org \
    --cc=aaron.lu@intel.com \
    --cc=axboe@kernel.dk \
    --cc=david@fromorbit.com \
    --cc=fengguang.wu@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=rjw@sisk.pl \
    --cc=tomaz.solc@tablix.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.