From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH 0/2 v3] Fix infinite loop in ail1 flush with jdata
Date: Mon, 13 Jan 2020 08:04:19 -0600
Message-ID: <20200113140421.867659-1-rpeterso@redhat.com>

Hi. This patch set fixes a problem in which gfs2 can become deadlocked
while doing normal IO on jdata files. The problem is best observed by
repeatedly running xfstests generic/269 with jdata files.
The specifics of the hang are best described in the second patch.

The first patch reverts e955537e3262de8e56f070b13817f525f472fa00.
The defective patch could drive tr->tr_num_revoke below zero, since you
can remove more revokes than you add. Because tr_num_revoke is declared
unsigned, the value wrapped around to a very large number instead, which
triggered this assert in gfs2_trans_end:

	if (gfs2_assert_withdraw(sdp, (nbuf <= tr->tr_blocks) &&
			       (tr->tr_num_revoke <= tr->tr_revokes)))

The management of revokes is not very good since we moved them from a
private list to a global list hung off the superblock pointer, sdp.
So we will probably want to revisit this and rework how revokes are
handled. In the meantime, it is safest to just revert the patch until
we can fix it properly.

The second patch fixes an infinite loop deadlock while flushing the
ail1 list for jdata pages. The patch comments describe the problem
and circumstances fairly well.

Bob Peterson (2):
  Revert "gfs2: eliminate tr_num_revoke_rm"
  gfs2: keep a redirty list for jdata pages that are PageChecked in ail1

 fs/gfs2/incore.h |  2 ++
 fs/gfs2/log.c    | 30 +++++++++++++++++++++++++++++-
 fs/gfs2/trans.c  |  7 ++++---
 3 files changed, 35 insertions(+), 4 deletions(-)

-- 
2.24.1


