* [Cluster-devel] [GFS2 PATCH 0/2 v3] Fix infinite loop in ail1 flush with jdata
From: Bob Peterson @ 2020-01-13 14:04 UTC
  To: cluster-devel.redhat.com

Hi. This patch set fixes a problem in which gfs2 can become deadlocked
while doing normal IO on jdata files. The problem is best observed by
repeatedly running xfstests generic/269 with jdata files.
The specifics of the hang are best described in the second patch.

The first patch reverts e955537e3262de8e56f070b13817f525f472fa00.
The defective patch allowed tr->tr_num_revoke to drop below zero, since
you can remove more revokes than you add. Because tr_num_revoke is
declared unsigned, the value wraps around to a very large number
instead of going negative, which triggers this assert in gfs2_trans_end:

	if (gfs2_assert_withdraw(sdp, (nbuf <= tr->tr_blocks) &&
			       (tr->tr_num_revoke <= tr->tr_revokes)))

The management of revokes is not very good since we moved them from a
private list to a global list hung off the superblock pointer, sdp.
So we will probably want to revisit this and rework how revokes are
handled. In the meantime, it is safest to just revert the patch until
we can fix it properly.

The second patch fixes an infinite loop deadlock while flushing the
ail1 list for jdata pages. The patch comments describe the problem
and circumstances fairly well.

Bob Peterson (2):
  Revert "gfs2: eliminate tr_num_revoke_rm"
  gfs2: keep a redirty list for jdata pages that are PageChecked in ail1

 fs/gfs2/incore.h |  2 ++
 fs/gfs2/log.c    | 30 +++++++++++++++++++++++++++++-
 fs/gfs2/trans.c  |  7 ++++---
 3 files changed, 35 insertions(+), 4 deletions(-)

-- 
2.24.1


