* [Cluster-devel] [PATCH 00/15] GFS2: Withdraw corruption patches [V2]
From: Bob Peterson @ 2019-02-27 20:55 UTC
  To: cluster-devel.redhat.com

Hi,

This is a revision to the patch set I sent on 13 February 2019. These
won't make this merge window, obviously, because that's almost upon us.

This version fixes some glaring mistakes and problems of the first set.
As before, this may not be the final version, but I wanted to put it out
for review anyway.

Among changes from the original are:

1. I fixed some really stupid mistakes of the original patch set.
2. I found and fixed several additional problems not covered by the first
   patch set.
3. I broke up the patch "Force withdraw to replay journals and wait for
   it to finish" into more reasonably sized pieces. It's still complex,
   but not nearly as bad as the original.
4. I included some of the instrumentation I've used to detect file system
   corruption. It makes sense to include it in mainline, I think.
5. I still need to figure out what to do about Dave Teigland's observation
   regarding the patch "dlm: recover slot regardless of whether we still
   have a connection". The patch is omitted in this set until I figure out
   a reasonable course of action.

This version is much more stable. I've still been able to break it, given
enough pressure, but I think that's an additional bug. I'll continue to
chase it, and will post further patches, if necessary.

These patches address a bunch of problems related to journal replay
overwriting valid gfs2 metadata due to I/O errors, withdraws, and the like.
These seem to fix several metadata corruption problems I've been able
to reliably recreate lately with multi-node multi-file system recovery
tests.

Bob Peterson (15):
  gfs2: log error reform
  gfs2: Introduce concept of a pending withdraw
  gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn
  gfs2: move check_journal_clean to util.c for future use
  gfs2: Allow some glocks to be used during withdraw
  gfs2: Make secondary withdrawers wait for first withdrawer
  gfs2: Don't write log headers after file system withdraw
  gfs2: Force withdraw to replay journals and wait for it to finish
  gfs2: Add verbose option to check_journal_clean
  gfs2: Check for log write errors before telling dlm to unlock
  gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty
  gfs2: If the journal isn't live ignore log flushes
  gfs2: Issue revokes more intelligently
  gfs2: Warn when a journal replay overwrites a rgrp with buffers
  gfs2: log which portion of the journal is replayed

 fs/gfs2/aops.c       |   4 +-
 fs/gfs2/file.c       |   2 +-
 fs/gfs2/glock.c      |  39 ++++++++--
 fs/gfs2/glock.h      |   1 +
 fs/gfs2/glops.c      |  82 ++++++++++++++++++++-
 fs/gfs2/incore.h     |  14 +++-
 fs/gfs2/lock_dlm.c   |  68 +++++++++++++++++
 fs/gfs2/log.c        |  94 +++++++++++-------------
 fs/gfs2/log.h        |   1 +
 fs/gfs2/lops.c       |  29 +++++++-
 fs/gfs2/meta_io.c    |   6 +-
 fs/gfs2/ops_fstype.c |  52 ++-----------
 fs/gfs2/quota.c      |   8 +-
 fs/gfs2/recovery.c   |   3 +-
 fs/gfs2/super.c      |  30 ++++----
 fs/gfs2/super.h      |   1 +
 fs/gfs2/sys.c        |   2 +-
 fs/gfs2/util.c       | 171 ++++++++++++++++++++++++++++++++++++++++++-
 fs/gfs2/util.h       |  11 +++
 19 files changed, 477 insertions(+), 141 deletions(-)

-- 
2.20.1


