linux-xfs.vger.kernel.org archive mirror
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>,
	xfs <linux-xfs@vger.kernel.org>,
	linux-pm@vger.kernel.org
Subject: Re: Suspend fails when xfs is involved?
Date: Tue, 28 Mar 2017 00:30:40 +0200
Message-ID: <1707526.iy5HQXy0ZS@aspire.rjw.lan>
In-Reply-To: <20170327204607.GA4874@birch.djwong.org>

On Monday, March 27, 2017 01:46:07 PM Darrick J. Wong wrote:
> [cc linux-pm since this intersects with suspend...]
> 
> On Sat, Feb 04, 2017 at 09:31:27AM +1100, Dave Chinner wrote:
> > On Thu, Feb 02, 2017 at 05:04:01PM -0800, Darrick J. Wong wrote:
> > > Hi list,
> > > 
> > > So I've noticed that my laptop consistently fails to suspend with:
> > > 
> > > [1183625.726800] atkbd serio0: Unknown key pressed (translated set 2, code 0xd8 on isa0060/serio0).
> > > [1183625.726804] atkbd serio0: Use 'setkeycodes e058 <keycode>' to make it known.
> > > [1183625.727492] atkbd serio0: Unknown key released (translated set 2, code 0xd8 on isa0060/serio0).
> > > [1183625.727497] atkbd serio0: Use 'setkeycodes e058 <keycode>' to make it known.
> > > [1183626.203928] e1000e: enp0s25 NIC Link is Down
> > > [1183626.422720] PM: Syncing filesystems ... done.
> > > [1183626.450348] Freezing user space processes ... (elapsed 0.002 seconds) done.
> > > [1183626.452995] Freezing remaining freezable tasks ... 
> > > [1183632.657243] atkbd serio0: Unknown key pressed (translated set 2, code 0xd9 on isa0060/serio0).
> > > [1183632.657247] atkbd serio0: Use 'setkeycodes e059 <keycode>' to make it known.
> > > [1183632.657814] atkbd serio0: Unknown key released (translated set 2, code 0xd9 on isa0060/serio0).
> > > [1183632.657817] atkbd serio0: Use 'setkeycodes e059 <keycode>' to make it known.
> > > [1183646.459310] Freezing of tasks failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
> > > [1183646.459348] xfsaild/dm-1    D    0  1767      2 0x00000000
> > 
> > Yes, this can happen because suspend thinks that "sync" is
> > sufficient to quiesce a filesystem into an idle state. 
> > 
> > > [1183646.459366] Call Trace:
> > > [1183646.459386]  [<ffffffffb5a43b8d>] schedule+0x3d/0x90
> > > [1183646.459390]  [<ffffffffb5a47339>] schedule_timeout+0x239/0x420
> > > [1183646.459401]  [<ffffffffb5a450e6>] wait_for_completion+0xa6/0x120
> > > [1183646.459460]  [<ffffffffb539ba0f>] xfs_buf_submit_wait+0x7f/0x280
> > > [1183646.459466]  [<ffffffffb539bc33>] _xfs_buf_read+0x23/0x30
> > > [1183646.459470]  [<ffffffffb539bd64>] xfs_buf_read_map+0x124/0x1b0
> > > [1183646.459473]  [<ffffffffb53eb270>] xfs_trans_read_buf_map+0x110/0x370
> > > [1183646.459478]  [<ffffffffb538417e>] xfs_imap_to_bp+0x6e/0xe0
> > > [1183646.459481]  [<ffffffffb53b3883>] xfs_iflush+0xd3/0x230
> > > [1183646.459486]  [<ffffffffb53e0ab4>] xfs_inode_item_push+0xf4/0x150
> > > [1183646.459489]  [<ffffffffb53e9cdf>] xfsaild+0x2df/0x740
> > > [1183646.459500]  [<ffffffffb51101f9>] kthread+0xd9/0xf0
> > 
> > That's inode writeback when the underlying inode buffer has been
> > reclaimed before the dirty cached inode has been written. So the
> > xfsaild is doing read/modify/write cycles to write back dirty
> > inodes. i.e. you're running in active memory reclaim conditions
> > prior to suspend...
> 
> So I wrote up a patch that removes WQ_FREEZABLE from the xfs_buf workqueue,
> and since then I haven't had any problems suspending my laptop.  Last
> week at LSF I inquired about whether it was proper to be freezing IO
> helper threads as part of suspend, and was told in response "Are you
> convinced that use of WQ_FREEZABLE is even correct?"  TBH I can't see
> why you'd want to freeze IO helper workqueues at all.
> 
> So, I'm going to email that patch out as an RFC and if anyone wants to
> follow up the discussion, let's do it there.

Yes, please!

> I get it, suspend really
> should just fsfreeze, but what I really want to know is why
> XFS freezes its own threads.  They seem to go to sleep just fine
> after we're done doing all the IO we want.

That, quite frankly, is what I would expect.
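
To make the failure mode discussed in this thread concrete, here is a toy
userspace model, not kernel code. The function name and its boolean
parameter are made up for illustration. It assumes, as described elsewhere
in the thread, that the completion of xfsaild's buffer read is delivered by
a work item on the I/O helper workqueue, and that a freezable workqueue
stops running work items before xfsaild gets a chance to freeze.

```python
def tasks_refusing_to_freeze(helper_wq_freezable: bool) -> list[str]:
    """Toy model of the suspend sequence; returns tasks that never freeze."""
    # Hypothetical state: xfsaild has submitted a buffer read and sleeps
    # uninterruptibly until a work item on the helper workqueue completes it.
    read_completed = False

    # Assumption from the thread: a freezable workqueue stops running
    # work items before xfsaild gets a chance to freeze.
    helper_wq_running = not helper_wq_freezable

    if helper_wq_running:
        read_completed = True  # completion work runs, xfsaild wakes up

    # Only a woken xfsaild can get back to its main loop and freeze there.
    return [] if read_completed else ["xfsaild"]


print(tasks_refusing_to_freeze(True))   # ['xfsaild']
print(tasks_refusing_to_freeze(False))  # []
```

With the helper workqueue marked freezable, the model reports xfsaild as the
one task refusing to freeze, matching the log above. Without the flag, the
read completes and nothing is left stuck.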

> > > ISTR Dave or someone grumbling about this being some artifact of the log
> > > trying to read in some buffer or other as part of flushing the log prior
> > > to suspend, but the io completion ends up tied to a workqueue that's
> > > already been put to sleep, so xfs gets stuck forever.
> > 
> > Yup, suspend is just completely fucked, has been for more than 10
> > years. It needs to freeze filesystems so they are quiesced sanely,
> > not left to run while random parts of the kernel infrastructure they
> > rely on are shut down behind the filesystem's back.
> > 
> > > Look familiar to anyone before I try to debug this tomorrow?
> > 
> > See this as a recent starting point.
> > 
> > https://lwn.net/Articles/705269/
> 
> I wonder if they've done any work on freezing filesystems...

Not that I know of.

Thanks,
Rafael



Thread overview: 7+ messages
2017-02-03  1:04 Suspend fails when xfs is involved? Darrick J. Wong
2017-02-03  2:18 ` Carlos E. R.
2017-02-03 22:31 ` Dave Chinner
2017-03-27 20:46   ` Darrick J. Wong
2017-03-27 22:30     ` Rafael J. Wysocki [this message]
2017-03-27 23:14       ` Luis R. Rodriguez
2017-03-28 16:33         ` Rafael J. Wysocki
