From: Dave Chinner <david@fromorbit.com>
To: Theodore Ts'o <tytso@mit.edu>,
"Luis R. Rodriguez" <mcgrof@kernel.org>,
viro@zeniv.linux.org.uk, bart.vanassche@wdc.com,
ming.lei@redhat.com, darrick.wong@oracle.com, jikos@kernel.org,
rjw@rjwysocki.net, pavel@ucw.cz, len.brown@intel.com,
linux-fsdevel@vger.kernel.org, boris.ostrovsky@oracle.com,
jgross@suse.com, todd.e.brandt@linux.intel.com,
nborisov@suse.com, jack@suse.cz, martin.petersen@oracle.com,
ONeukum@suse.com, oleksandr@natalenko.name,
oleg.b.antonyan@gmail.com, linux-pm@vger.kernel.org,
linux-block@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC 4/5] ext4: add fs freezing support on suspend/hibernation
Date: Wed, 4 Oct 2017 18:05:23 +1100 [thread overview]
Message-ID: <20171004070523.GE3666@dastard> (raw)
In-Reply-To: <20171004014204.mztkdnm7wb7r6sl2@thunk.org>
On Tue, Oct 03, 2017 at 09:42:04PM -0400, Theodore Ts'o wrote:
> On Tue, Oct 03, 2017 at 10:13:21PM +0200, Luis R. Rodriguez wrote:
> > > After we remove add the NEEDS_RECOVERY flag, we need to make sure
> > > recovery flag is pushed out to disk before any other changes are
> > > allowed to be pushed out to disk. That's why we originally did the
> > > update synchronously.
> >
> > I see. I had to try to dig through linux-history to see why this was done, and
> > I actually could not find an exact commit which explained what you just did.
> > Thanks!
> >
> > Hrm, so on freeze we issue the same commit as well, so is a sync *really* needed
> > on thaw?
>
> So let me explain what's going on. When we do a freeze, we make sure
> all of the blocks that are written to the journal are writen to the
> final location on disk, and the journal is is truncated. (This is
> called a "journal checkpoint".) Then we clear the NEEDS RECOVERY
> feature flag and set the EXT4_VALID_FS flags in the superblock. In
> the no journal case, we flush out all metadata writes, and we set the
> EXT4_VALID_FS flag. In both cases, we call ext4_commit_super(sb, 1)
> to make sure the flags update is pushed out to disk. This puts the
> file system into a "quiscent" state; in fact, it looks like the file
> system has been unmounted, so that it becomes safe to run
> dump/restore, run fsck -n on the file system, etc.
>
> The problem on the thaw side is that we need to make sure that
> EXT4_VALID_FS flag is removed (and if journaling is enabled, the NEEDS
> RECOVERY feature flag is set) and the superblock is flushed out to the
> storage device before any other writes are persisted on the disk. In
> the case where we have journalling enabled, we could wait until the
> first journal commit to write out the superblock, since in journal
> mode all metadata updates to the disk are suppressed until the journal
> commit. We don't do that today, but it is a change we could make.
>
> However, in the no journal node we need to make sure the EXT4_VALID_FS
> flag is cleared on disk before any other metadata operations can take
> place, and calling ext4_commit_super(sb, 1) is the only real way to do
> that.
>
> > No, it was am empirical evaluation done at testing, I observed bio_submit()
> > stalls, we never get completion from the device. Even if we call thaw at the
> > very end of resume, after the devices should be up, we still end up in the same
> > situation. Given how I order freezing filesystems after freezing userspace I do
> > believe we should thaw filesystems prior unfreezing userspace, its why I placed
> > the call where it is now.
>
> So when we call ext4_commit_super(sb, 1), we issue the bio, and then
> we block waiting for the I/O to complete. That's a blocking call. Is
> the thaw context one which allows blocking?
thaw_super() does down_write(), so it *must* be able to sleep.
> If userspace is still
> frozen, does that imply that the scheduler isn't allow to run? If
> that's the case, then that's probably the problem.
>
> More generally, if the thawing code needs to allocate memory, or do
Which is does. xfs_fs_unfreeze() queues work to a workqueue, so
there's memory allocation there.
> any number of things that could potentially block, this could
> potentially be an issue.
yup, gfs2_unfreeze does even more stuff - it releases glocks which
may end up queuing work, waking other threads and freeing stuff via
call_rcu()...
Basically, before thawing filesystems the rest of the kernel
infrastructure needs to have been restarted. i.e. the order
needs to be:
freeze userspace
freeze filesystems
freeze kernel threads
freeze workqueues
thaw workqueues
thaw kernel threads
thaw filesystems
thaw userspace
and it should end up that way.
> Or if we have a network block device, or
> something else in the storage stack that needs to run a kernel thread
> context (or a workqueue, etc.) --- is the fact that userspace is
> frozen mean the scheduler is going to refuse to schedule()?
No.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2017-10-04 7:05 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-03 18:53 [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 1/5] fs: add iterate_supers_reverse() Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 2/5] fs: freeze on suspend and thaw on resume Luis R. Rodriguez
2017-10-03 20:02 ` Bart Van Assche
2017-10-03 20:23 ` Luis R. Rodriguez
2017-10-03 20:32 ` Bart Van Assche
2017-10-03 20:39 ` Luis R. Rodriguez
2017-10-03 20:06 ` Jiri Kosina
2017-10-03 20:58 ` Dave Chinner
2017-10-03 21:16 ` Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 3/5] xfs: allow fs freeze on suspend/hibernation Luis R. Rodriguez
2017-10-03 18:53 ` [RFC 4/5] ext4: add fs freezing support " Luis R. Rodriguez
2017-10-03 19:59 ` Theodore Ts'o
2017-10-03 20:13 ` Luis R. Rodriguez
2017-10-04 1:42 ` Theodore Ts'o
2017-10-04 7:05 ` Dave Chinner [this message]
2017-10-04 15:25 ` Bart Van Assche
2017-10-04 16:48 ` Theodore Ts'o
2017-10-04 22:22 ` Dave Chinner
2017-10-03 18:53 ` [RFC 5/5] pm: remove kernel thread freezing Luis R. Rodriguez
2017-10-03 18:59 ` Rafael J. Wysocki
2017-10-03 21:15 ` Rafael J. Wysocki
2017-10-04 0:47 ` Luis R. Rodriguez
2017-10-04 1:03 ` Bart Van Assche
2017-11-29 23:05 ` Luis R. Rodriguez
2017-10-04 7:18 ` Dave Chinner
2017-10-03 20:12 ` Pavel Machek
2017-10-03 20:15 ` Jiri Kosina
2017-10-03 20:21 ` Pavel Machek
2017-10-03 20:38 ` Jiri Kosina
2017-10-03 20:41 ` Rafael J. Wysocki
2017-10-03 20:57 ` Pavel Machek
2017-10-03 21:00 ` Jiri Kosina
2017-10-03 21:09 ` Shuah Khan
2017-10-03 21:18 ` Luis R. Rodriguez
2017-10-03 20:49 ` Luis R. Rodriguez
2017-10-06 12:07 ` Pavel Machek
2017-10-06 12:54 ` Theodore Ts'o
2017-10-03 20:13 ` Bart Van Assche
2017-10-03 20:17 ` Jiri Kosina
2017-10-03 20:21 ` Bart Van Assche
2017-10-03 20:24 ` Jiri Kosina
2017-10-03 20:27 ` Luis R. Rodriguez
2017-10-03 20:51 ` Jiri Kosina
2017-10-03 21:04 ` Dave Chinner
2017-10-03 21:07 ` Luis R. Rodriguez
2017-10-04 6:07 ` Hannes Reinecke
2017-10-03 19:33 ` [RFC 0/5] fs: replace kthread freezing with filesystem freeze/thaw Ming Lei
2017-10-03 20:05 ` Luis R. Rodriguez
2017-10-03 20:47 ` Matthew Wilcox
2017-10-03 20:54 ` Luis R. Rodriguez
2017-10-03 20:59 ` Bart Van Assche
2017-10-04 15:43 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171004070523.GE3666@dastard \
--to=david@fromorbit.com \
--cc=ONeukum@suse.com \
--cc=bart.vanassche@wdc.com \
--cc=boris.ostrovsky@oracle.com \
--cc=darrick.wong@oracle.com \
--cc=jack@suse.cz \
--cc=jgross@suse.com \
--cc=jikos@kernel.org \
--cc=len.brown@intel.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=mcgrof@kernel.org \
--cc=ming.lei@redhat.com \
--cc=nborisov@suse.com \
--cc=oleg.b.antonyan@gmail.com \
--cc=oleksandr@natalenko.name \
--cc=pavel@ucw.cz \
--cc=rjw@rjwysocki.net \
--cc=todd.e.brandt@linux.intel.com \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).