Linux-XFS Archive on
From: "Darrick J. Wong" <>
Subject: [PATCH v2 00/10] xfs: deferred inode inactivation
Date: Tue, 31 Dec 2019 17:08:40 -0800
Message-ID: <157784092020.1362752.15046503361741521784.stgit@magnolia>

Hi all,

This patch series implements deferred inode inactivation.  Inactivation
is the process of updating all on-disk metadata when a file is deleted
-- freeing the data/attr/COW fork extent allocations, removing the inode
from the unlinked hash, marking the inode record itself free, and
updating the inode btrees so that they show the inode as not being in use.

Currently, all this inactivation is performed during in-core inode
reclaim, which creates two big headaches: first, this makes direct
memory reclamation /really/ slow, and second, it prohibits us from
partially freezing the filesystem for online fsck activity because scrub
can hit direct memory reclaim.  It's ok for scrub to fail with ENOMEM,
but it's not ok for scrub to deadlock memory reclaim. :)

The implementation will be familiar to those who have studied how XFS
scans for reclaimable in-core inodes -- we create a couple more inode
state flags to mark an inode as needing inactivation and being in the
middle of inactivation.  When inodes need inactivation, we set iflags,
set the RECLAIM radix tree tag, update a count of how many resources
will be freed by the pending inactivations, and schedule a deferred work
item.  The deferred work item scans the inode radix tree for inodes to
inactivate, and does all the on-disk metadata updates.  Once the inode
has been inactivated, it is left in the reclaim state and the background
reclaim worker (or direct reclaim) will get to it eventually.
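To make the state transitions above concrete, here is a rough userspace
sketch of the flow.  It is a toy model and not the code in the patches:
every name in it (toy_inode, toy_mount, the TOY_* flags and the toy_*
functions) is made up for illustration, a flat array scan stands in for
the tagged radix tree walk, and an integer stands in for the scheduled
work item.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names for this sketch; not the kernel's identifiers. */
enum {
	TOY_NEED_INACTIVE = 1u << 0,	/* metadata updates still pending */
	TOY_INACTIVATING  = 1u << 1,	/* worker is updating metadata now */
	TOY_RECLAIMABLE   = 1u << 2,	/* in-core inode may be freed */
};

struct toy_inode {
	unsigned int	flags;
	uint64_t	nblocks;	/* blocks still mapped by this inode */
};

struct toy_mount {
	uint64_t	free_blocks;	/* blocks that are free right now */
	uint64_t	pending_blocks;	/* blocks pending inactivations will free */
	int		work_scheduled;	/* stands in for a queued work item */
};

/* Queue an inode: set the flag, bump the pending count, schedule work. */
void toy_queue_inactivation(struct toy_mount *mp, struct toy_inode *ip)
{
	ip->flags |= TOY_NEED_INACTIVE;
	mp->pending_blocks += ip->nblocks;
	mp->work_scheduled = 1;
}

/* Free space as a caller would see it: pending frees count immediately. */
uint64_t toy_reported_free(const struct toy_mount *mp)
{
	return mp->free_blocks + mp->pending_blocks;
}

/*
 * Deferred worker: find inodes that need inactivation, do the "on-disk"
 * updates, and leave each one in the reclaimable state.  Returns the
 * number of inodes processed.
 */
size_t toy_inactivate_worker(struct toy_mount *mp, struct toy_inode *inodes,
			     size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_inode *ip = &inodes[i];

		if (!(ip->flags & TOY_NEED_INACTIVE))
			continue;

		ip->flags &= ~TOY_NEED_INACTIVE;
		ip->flags |= TOY_INACTIVATING;

		/* Move the blocks from "pending" to really free. */
		assert(mp->pending_blocks >= ip->nblocks);
		mp->pending_blocks -= ip->nblocks;
		mp->free_blocks += ip->nblocks;
		ip->nblocks = 0;

		ip->flags &= ~TOY_INACTIVATING;
		ip->flags |= TOY_RECLAIMABLE;
		done++;
	}

	mp->work_scheduled = 0;
	return done;
}
```

In this model, toy_reported_free() returns the same value before and
after the worker runs, which is the point of tracking the pending count:
the caller sees the space come back as soon as the inode is queued.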

Patches 1-2 refactor some of the inactivation predicates.

Patches 3-4 implement the count of blocks/quota that can be freed by
running inactivation; this is necessary to preserve the behavior where
you rm a file and the fs counters update immediately.

Patches 5-6 refactor more inode reclaim code so that we can reuse some
of it for inactivation.

Patch 8 delivers the core of the inactivation changes by altering the
inode lifetime state machine to include the new inode flags and
background workers.

Patches 9-10 make it so that if an allocation attempt hits ENOSPC it
will force inactivation to free resources and try again.
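The retry-on-ENOSPC idea can be sketched in userspace roughly as
follows.  Again this is only an illustration under my own assumptions:
toy_space and the toy_* functions are hypothetical names, and "forcing
inactivation" is reduced to moving a pending block count into the free
count.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct toy_space {
	uint64_t free_blocks;		/* blocks that are free right now */
	uint64_t pending_blocks;	/* blocks held by inodes awaiting inactivation */
};

/* Stand-in for running all pending inactivations synchronously. */
void toy_force_inactivation(struct toy_space *sp)
{
	sp->free_blocks += sp->pending_blocks;
	sp->pending_blocks = 0;
}

/* One allocation attempt: 0 on success, -ENOSPC if space is short. */
int toy_try_alloc(struct toy_space *sp, uint64_t want)
{
	if (sp->free_blocks < want)
		return -ENOSPC;
	sp->free_blocks -= want;
	return 0;
}

/*
 * Allocate, and if that fails with ENOSPC while inactivations are still
 * pending, flush them and try exactly once more.
 */
int toy_alloc_with_retry(struct toy_space *sp, uint64_t want)
{
	int error = toy_try_alloc(sp, want);

	if (error != -ENOSPC || sp->pending_blocks == 0)
		return error;

	toy_force_inactivation(sp);
	return toy_try_alloc(sp, want);
}
```

The sketch retries only once and only when something is actually
pending, so a filesystem that is genuinely full still returns ENOSPC
without looping.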

Patch 11 converts the per-fs inactivation scanner to be tracked on a
per-AG basis so that we can be more targeted in our inactivation.

Patches 12-14 teach the per-AG sick status to remember if we inactivate
inodes that themselves had unfixed sick flags set, and for scrub to
clear all those flags if it finds that the filesystem is clean.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.


kernel git tree:

Thread overview: 11+ messages
2020-01-01  1:08 Darrick J. Wong [this message]
2020-01-01  1:08 ` [PATCH 01/10] xfs: decide if inode needs inactivation Darrick J. Wong
2020-01-01  1:08 ` [PATCH 02/10] xfs: track unlinked inactive inode fs summary counters Darrick J. Wong
2020-01-01  1:08 ` [PATCH 03/10] xfs: track unlinked inactive inode quota counters Darrick J. Wong
2020-01-01  1:09 ` [PATCH 04/10] xfs: pass per-ag structure to the xfs_ici_walk execute function Darrick J. Wong
2020-01-01  1:09 ` [PATCH 05/10] xfs: pass around xfs_inode_ag_walk iget/irele helper functions Darrick J. Wong
2020-01-01  1:09 ` [PATCH 06/10] xfs: deferred inode inactivation Darrick J. Wong
2020-01-01  1:09 ` [PATCH 07/10] xfs: force inode inactivation and retry fs writes when there isn't space Darrick J. Wong
2020-01-01  1:09 ` [PATCH 08/10] xfs: force inactivation before fallocate when space is low Darrick J. Wong
2020-01-01  1:09 ` [PATCH 09/10] xfs: parallelize inode inactivation Darrick J. Wong
2020-01-01  1:09 ` [PATCH 10/10] xfs: create a polled function to force " Darrick J. Wong