* [MEGAPATCHSET v25 1/2] xfs: online repair, part 1
@ 2023-05-26  0:00 Darrick J. Wong
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:00 UTC (permalink / raw)
  To: Dave Chinner
  Cc: xfs, linux-fsdevel, Carlos Maiolino, Chandan Babu R, Catherine Hoang

Hi everyone,

I've finished merging parent pointers into what is now part 2 of online
repair.  Part 1 hasn't changed much since the last posting at the end of
2022, aside from various reorganizations of the directory repair, dotdot
repair, and the tempfile/orphanage infrastructure to support the bits
that part 2 will want.  Zorro merged all the pending fstests changes to
support and test everything in part 1, so that part is done.

In other words, I'm formally submitting part 1 for inclusion in 6.5.

For this review, I would like people to focus on the following:

- Are the major subsystems sufficiently documented that you could figure
  out what the code does?

- Do you see any problems that are severe enough to cause long term
  support hassles? (e.g. bad API design, writing weird metadata to disk)

- Can you spot mis-interactions between the subsystems?

- What were my blind spots in devising this feature?

- Are there missing pieces that you'd like to help build?

- Can I just merge all of this?

The one thing that is /not/ in scope for this review is requests for
more refactoring of existing subsystems.

I've been running daily online repairs of every computer I own for
the last 14 months.  So far, no damage has resulted from these
operations.

Fuzz and stress testing of online repairs has been running well for a
year now.  As of this writing, online repair can fix slightly more
things than offline repair, and the fsstress+repair long soak test has
passed 200 million repairs with zero problems observed.  All issues
observed in that time have been corrected in this submission.

(For comparison, the long soak fsx test recently passed 99 billion file
operations, so online fsck has a ways to go...)

This is actually an excerpt of the xfsprogs patches -- I'm only mailing
the changes to xfs_scrub; there are substantially more bug fixes and
improvements to xfs_{db,repair,spaceman} that I've made along the way.

--D


* [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory
  2023-05-26  0:00 [MEGAPATCHSET v25 1/2] xfs: online repair, part 1 Darrick J. Wong
@ 2023-05-26  0:28 ` Darrick J. Wong
  2023-05-26  0:47   ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
                     ` (7 more replies)
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
  2 siblings, 8 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:28 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

Hi all,

In general, online repair of an indexed record set walks the filesystem
looking for records.  These records are sorted and bulk-loaded into a
new btree.  To make this happen without pinning gigabytes of metadata in
memory, this series first creates an abstraction ('xfile') of memfd
files so that kernel code can access paged memory, and then an array
abstraction ('xfarray') based on xfiles so that online repair can
create an array of new records without pinning memory.

These two data storage abstractions are critical for repair of space
metadata -- the memory used is pageable, which helps us avoid pinning
kernel memory and driving OOM problems; and they are byte-accessible
enough that we can use them like (very slow and programmatic) memory
buffers.

Later patchsets will build on this functionality to provide blob storage
and btrees.
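
For a sense of how these pieces fit together, here is a rough usage
sketch (not code from the patchset; error handling trimmed) of a repair
function staging records with the xfarray interfaces added in patch 1.
The record type is hypothetical:

struct xrep_foo_record {
	xfs_agblock_t		bno;
	xfs_extlen_t		len;
};

static int
xrep_foo_stage(
	struct xfs_mount	*mp)
{
	struct xrep_foo_record	rec;
	struct xfarray		*records;
	xfarray_idx_t		cur = XFARRAY_CURSOR_INIT;
	int			error;

	/* no particular capacity requirement, so pass 0 */
	error = xfarray_create(mp, "foo records", 0, sizeof(rec), &records);
	if (error)
		return error;

	/* ...walk the filesystem, filling @rec for each item found... */
	error = xfarray_append(records, &rec);
	if (error)
		goto out;

	/* iterate every record that was stored, in index order */
	while ((error = xfarray_load_next(records, &cur, &rec)) == 0) {
		/* feed @rec to the btree bulk loader */
	}
	if (error == -ENODATA)
		error = 0;
out:
	xfarray_destroy(records);
	return error;
}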

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=big-array
---
 fs/xfs/Kconfig         |    1 
 fs/xfs/Makefile        |    2 
 fs/xfs/scrub/trace.c   |    4 
 fs/xfs/scrub/trace.h   |  262 ++++++++++++
 fs/xfs/scrub/xfarray.c | 1084 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfarray.h |  142 ++++++
 fs/xfs/scrub/xfile.c   |  433 +++++++++++++++++++
 fs/xfs/scrub/xfile.h   |   78 +++
 8 files changed, 2005 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/scrub/xfarray.c
 create mode 100644 fs/xfs/scrub/xfarray.h
 create mode 100644 fs/xfs/scrub/xfile.c
 create mode 100644 fs/xfs/scrub/xfile.h



* [PATCHSET v25.0 0/9] xfs: support in-memory btrees
  2023-05-26  0:00 [MEGAPATCHSET v25 1/2] xfs: online repair, part 1 Darrick J. Wong
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
@ 2023-05-26  0:32 ` Darrick J. Wong
  2023-05-26  1:04   ` [PATCH 1/9] xfs: dump xfiles for debugging purposes Darrick J. Wong
                     ` (8 more replies)
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
  2 siblings, 9 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:32 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

Hi all,

Online repair of the reverse-mapping btrees presents some unique
challenges.  To construct a new reverse mapping btree, we must scan the
entire filesystem, but we cannot afford to quiesce the entire filesystem
for the potentially lengthy scan.

For rmap btrees, therefore, we relax our requirements of totally atomic
repairs.  Instead, repairs will scan all inodes, construct a new reverse
mapping dataset, format a new btree, and commit it before anyone trips
over the corruption.  This is exactly the same strategy as was used in
the quotacheck and nlink scanners.

Unfortunately, the xfarray cannot perform key-based lookups and is
therefore unsuitable for supporting live updates.  Luckily, we already
have a data structure that maintains an indexed rmap recordset -- the
existing rmap btree code!  Hence we port the existing btree and buffer
target code to be able to create a btree using the xfile we developed
earlier.
Live hooks keep the in-memory btree up to date for any resources that
have already been scanned.

This approach is not maximally memory efficient, but we can use the same
rmap code that we do everywhere else, which provides improved stability
without growing the code base even more.  Note that in-memory btree
blocks are always page sized.
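
To make the live update idea concrete, here is a purely illustrative
sketch -- the names are hypothetical and are not the hooks added by this
series.  An update hook only needs to mirror a change into the staging
btree if the scanner has already visited the resource being modified;
anything ahead of the scan cursor will be picked up when the scanner
gets there:

/* Hypothetical live-update hook; not from this patchset. */
static void
example_rmap_update_hook(
	struct example_scan_state	*ss,
	xfs_agnumber_t			agno,
	xfs_agblock_t			agbno,
	const struct example_rmap	*new_rec)
{
	/* the scanner has not reached this block yet; it will see it */
	if (agno > ss->scan_agno ||
	    (agno == ss->scan_agno && agbno >= ss->scan_agbno))
		return;

	/* already scanned: apply the change to the in-memory btree */
	example_stage_rmap_update(ss, new_rec);
}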

This patchset modifies the kernel xfs buffer cache to be capable of
using an xfile (aka a shmem file) as a backing device.  It then augments
the btree code to support creating btree cursors with buffers that come
from a buftarg other than the data device (namely an xfile-backed
buftarg).  For the userspace xfs buffer cache, we instead use a memfd or
an O_TMPFILE file as a backing device.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=in-memory-btrees

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=in-memory-btrees
---
 fs/xfs/Kconfig                     |    8 
 fs/xfs/Makefile                    |    2 
 fs/xfs/libxfs/xfs_ag.c             |    6 
 fs/xfs/libxfs/xfs_ag.h             |    4 
 fs/xfs/libxfs/xfs_btree.c          |  173 ++++++--
 fs/xfs/libxfs/xfs_btree.h          |   17 +
 fs/xfs/libxfs/xfs_btree_mem.h      |  128 ++++++
 fs/xfs/libxfs/xfs_refcount_btree.c |    4 
 fs/xfs/libxfs/xfs_rmap_btree.c     |    4 
 fs/xfs/scrub/bitmap.c              |   28 +
 fs/xfs/scrub/bitmap.h              |    3 
 fs/xfs/scrub/scrub.c               |    5 
 fs/xfs/scrub/scrub.h               |    3 
 fs/xfs/scrub/trace.c               |   12 +
 fs/xfs/scrub/trace.h               |  110 +++++
 fs/xfs/scrub/xfbtree.c             |  816 ++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfbtree.h             |   57 +++
 fs/xfs/scrub/xfile.c               |  181 ++++++++
 fs/xfs/scrub/xfile.h               |   66 +++
 fs/xfs/xfs_aops.c                  |    5 
 fs/xfs/xfs_bmap_util.c             |    8 
 fs/xfs/xfs_buf.c                   |  198 +++++++--
 fs/xfs/xfs_buf.h                   |   83 ++++
 fs/xfs/xfs_buf_xfile.c             |   97 ++++
 fs/xfs/xfs_buf_xfile.h             |   20 +
 fs/xfs/xfs_discard.c               |    8 
 fs/xfs/xfs_file.c                  |    6 
 fs/xfs/xfs_health.c                |    3 
 fs/xfs/xfs_ioctl.c                 |    3 
 fs/xfs/xfs_iomap.c                 |    4 
 fs/xfs/xfs_log.c                   |    4 
 fs/xfs/xfs_log_cil.c               |    3 
 fs/xfs/xfs_log_recover.c           |    3 
 fs/xfs/xfs_mount.h                 |    3 
 fs/xfs/xfs_super.c                 |    4 
 fs/xfs/xfs_trace.c                 |    3 
 fs/xfs/xfs_trace.h                 |   85 ++++
 fs/xfs/xfs_trans.h                 |    1 
 fs/xfs/xfs_trans_buf.c             |   42 ++
 39 files changed, 2084 insertions(+), 126 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_btree_mem.h
 create mode 100644 fs/xfs/scrub/xfbtree.c
 create mode 100644 fs/xfs/scrub/xfbtree.h
 create mode 100644 fs/xfs/xfs_buf_xfile.c
 create mode 100644 fs/xfs/xfs_buf_xfile.h



* [PATCHSET v25.0 00/25] xfs: atomic file updates
  2023-05-26  0:00 [MEGAPATCHSET v25 1/2] xfs: online repair, part 1 Darrick J. Wong
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
@ 2023-05-26  0:34 ` Darrick J. Wong
  2023-05-26  1:14   ` [PATCH 01/25] xfs: add a libxfs header file for staging new ioctls Darrick J. Wong
                     ` (24 more replies)
  2 siblings, 25 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:34 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

Hi all,

This series creates a new FIEXCHANGE_RANGE system call to exchange
ranges of bytes between two files atomically.  This new functionality
enables data storage programs to stage and commit file updates such that
reader programs will see either the old contents or the new contents in
their entirety, with no chance of torn writes.  A successful call
completion guarantees that the new contents will be seen even if the
system fails.

The ability to swap extent mappings between files in this manner is
critical to supporting online filesystem repair, which is built upon the
strategy of constructing a clean copy of a damaged structure and
committing the new structure into the metadata file atomically.

User programs will be able to update files atomically by opening an
O_TMPFILE, reflinking the source file to it, making whatever updates
they want to make, and exchanging the relevant ranges of the temp file
with the original file.  If the updates are aligned with the file block
size, a new (since v2) flag provides for exchanging only the written
areas.  Callers can arrange for the update to be rejected if the
original file has been changed.

The intent behind this new userspace functionality is to enable atomic
rewrites of arbitrary parts of individual files.  For years, application
programmers wanting to ensure the atomicity of a file update had to
write the changes to a new file in the same directory, fsync the new
file, rename the new file on top of the old filename, and then fsync the
directory.  People get it wrong all the time, and $fs hacks abound.
Here is the proposed manual page:

IOCTL-FIEXCHANGE_RANGE(2)  Linux Programmer's Manual  IOCTL-FIEXCHANGE_RANGE(2)

NAME
       ioctl_fiexchange_range  - exchange the contents of parts of two
       files

SYNOPSIS
       #include <sys/ioctl.h>
       #include <linux/fiexchange.h>

       int    ioctl(int     file2_fd,     FIEXCHANGE_RANGE,     struct
       file_xchg_range *arg);

DESCRIPTION
       Given  a  range  of bytes in a first file file1_fd and a second
       range of bytes in a second file  file2_fd,  this  ioctl(2)  ex‐
       changes the contents of the two ranges.

       Exchanges  are  atomic  with  regards to concurrent file opera‐
       tions, so no userspace-level locks need to be taken  to  obtain
       consistent  results.  Implementations must guarantee that read‐
       ers see either the old contents or the new  contents  in  their
       entirety, even if the system fails.

       The exchange parameters are conveyed in a structure of the fol‐
       lowing form:

           struct file_xchg_range {
               __s64    file1_fd;
               __s64    file1_offset;
               __s64    file2_offset;
               __s64    length;

               __u64    flags;

               __s64    file2_ino;
               __s64    file2_mtime;
               __s64    file2_ctime;
               __s32    file2_mtime_nsec;
               __s32    file2_ctime_nsec;

               __u64    pad[6];
           };

       The field pad must be zero.

       The fields file1_fd, file1_offset, and length define the  first
       range of bytes to be exchanged.

       The fields file2_fd, file2_offset, and length define the second
       range of bytes to be exchanged.

       Both files must be from the same filesystem mount.  If the  two
       file  descriptors represent the same file, the byte ranges must
       not overlap.  Most  disk-based  filesystems  require  that  the
       starts  of  both ranges must be aligned to the file block size.
       If this is the case, the ends of the ranges  must  also  be  so
       aligned unless the FILE_XCHG_RANGE_TO_EOF flag is set.

       The field flags control the behavior of the exchange operation.

           FILE_XCHG_RANGE_FILE2_FRESH
                  Check  the  freshness  of file2_fd after locking the
                  file but before exchanging the contents.   The  sup‐
                  plied  file2_ino field must match file2's inode num‐
                  ber, and the supplied file2_mtime, file2_mtime_nsec,
                  file2_ctime,  and file2_ctime_nsec fields must match
                  the modification time and change time of file2.   If
                  they do not match, EBUSY will be returned.

           FILE_XCHG_RANGE_TO_EOF
                  Ignore  the length parameter.  All bytes in file1_fd
                  from file1_offset to EOF are moved to file2_fd,  and
                  file2's  size is set to (file2_offset+(file1_length-
                  file1_offset)).  Meanwhile, all bytes in file2  from
                  file2_offset  to  EOF are moved to file1 and file1's
                  size   is   set   to    (file1_offset+(file2_length-
                  file2_offset)).   This option is not compatible with
                  FILE_XCHG_RANGE_FULL_FILES.

           FILE_XCHG_RANGE_FSYNC
                  Ensure that all modified in-core data in  both  file
                  ranges  and  all  metadata updates pertaining to the
                  exchange operation are flushed to persistent storage
                  before  the  call  returns.  Opening either file de‐
                  scriptor with O_SYNC or O_DSYNC will have  the  same
                  effect.

           FILE_XCHG_RANGE_SKIP_FILE1_HOLES
                  Skip  sub-ranges  of  file1_fd that are known not to
                  contain data.  This facility can be used  to  imple‐
                  ment  atomic scatter-gather writes of any complexity
                  for software-defined storage targets.

           FILE_XCHG_RANGE_DRY_RUN
                  Check the parameters and the feasibility of the  op‐
                  eration, but do not change anything.

           FILE_XCHG_RANGE_COMMIT
                  This      flag      is      a     combination     of
                  FILE_XCHG_RANGE_FILE2_FRESH |  FILE_XCHG_RANGE_FSYNC
                  and  can  be  used  to commit changes to file2_fd to
                  persistent storage if and  only  if  file2  has  not
                  changed.

           FILE_XCHG_RANGE_FULL_FILES
                  Require that file1_offset and file2_offset are zero,
                  and that the length field  matches  the  lengths  of
                  both  files.   If  not, EDOM will be returned.  This
                  option      is       not       compatible       with
                  FILE_XCHG_RANGE_TO_EOF.

           FILE_XCHG_RANGE_NONATOMIC
                  This  flag  relaxes the requirement that readers see
                  only the old contents or the new contents  in  their
                  entirety.   If  the system fails before all modified
                  in-core data and metadata updates are  persisted  to
                  disk,  the contents of both file ranges after recov‐
                  ery are not defined and may be a mix of both.

                  Do not use this flag unless  the  contents  of  both
                  ranges  are  known  to be identical and there are no
                  other writers.

RETURN VALUE
       On error, -1 is returned, and errno is set to indicate the  er‐
       ror.

ERRORS
       Error  codes can be one of, but are not limited to, the follow‐
       ing:

       EBADF  file1_fd is not open for reading and writing or is  open
              for  append-only  writes;  or  file2_fd  is not open for
              reading and writing or is open for append-only writes.

       EBUSY  The inode number and timestamps supplied  do  not  match
              file2_fd  and  FILE_XCHG_RANGE_FILE2_FRESH  was  set  in
              flags.

       EDOM   The ranges do not cover the entirety of both files,  and
              FILE_XCHG_RANGE_FULL_FILES was set in flags.

       EINVAL The  parameters  are  not correct for these files.  This
              error can also appear if either file  descriptor  repre‐
              sents  a device, FIFO, or socket.  Disk filesystems gen‐
              erally require the offset and  length  arguments  to  be
              aligned to the fundamental block sizes of both files.

       EIO    An I/O error occurred.

       EISDIR One of the files is a directory.

       ENOMEM The  kernel  was unable to allocate sufficient memory to
              perform the operation.

       ENOSPC There is not enough free space in the  filesystem  to  ex‐
              change the contents safely.

       EOPNOTSUPP
              The filesystem does not support exchanging bytes between
              the two files.

       EPERM  file1_fd or file2_fd are immutable.

       ETXTBSY
              One of the files is a swap file.

       EUCLEAN
              The filesystem is corrupt.

       EXDEV  file1_fd and  file2_fd  are  not  on  the  same  mounted
              filesystem.

CONFORMING TO
       This API is Linux-specific.

USE CASES
       Three use cases are imagined for this system call.

       The  first  is a filesystem defragmenter, which copies the con‐
       tents of a file into another file and wishes  to  exchange  the
       space  mappings  of  the  two files, provided that the original
       file has not changed.  The flags NONATOMIC and FILE2_FRESH  are
       recommended for this application.

       The  second is a data storage program that wants to commit non-
       contiguous updates to a file atomically.  This can be  done  by
       creating a temporary file, calling FICLONE(2) to share the con‐
       tents, and staging the updates into the temporary file.  Either
       of  the  FULL_FILES or TO_EOF flags are recommended, along with
       FSYNC.  Depending on  the  application's  locking  design,  the
       flags FILE2_FRESH or COMMIT may be applicable here.  The tempo‐
       rary file can be deleted or punched out afterwards.

       The third is a software-defined storage host (e.g. a disk juke‐
       box)  which  implements an atomic scatter-gather write command.
       Provided the exported disk's logical  block  size  matches  the
       file's  allocation  unit  size,  this can be done by creating a
       temporary file and writing the data at the appropriate offsets.
       Use  this  call  with  the SKIP_HOLES flag to exchange only the
       blocks involved in the write command.  The  use  of  the  FSYNC
       flag is recommended here.  The temporary file should be deleted
       or punched out completely before being reused to stage  another
       write.

NOTES
       Some  filesystems may limit the amount of data or the number of
       extents that can be exchanged in a single call.

SEE ALSO
       ioctl(2)

Linux                         2022-12-31     IOCTL-FIEXCHANGE_RANGE(2)

The reference implementation in XFS creates a new log incompat feature
and log intent items to track high-level progress of swapping ranges of
two files and to finish interrupted work if the system goes down.
Sample code exercising the use case mentioned above can be found in the
corresponding changes to xfs_io.
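
As a concrete illustration of the second use case from the manual page,
here is a minimal userspace sketch (this is not the xfs_io sample code;
error handling is omitted) that commits staged changes from a temporary
file back into the original file:

#include <sys/ioctl.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/fs.h>		/* FICLONE */
#include <linux/fiexchange.h>	/* proposed uapi header */

/*
 * Atomically commit the staged contents of tmp_fd into orig_fd.  The
 * caller is assumed to have already cloned the original contents with
 * ioctl(tmp_fd, FICLONE, orig_fd) and written its updates into tmp_fd.
 */
static int
commit_file_update(int orig_fd, int tmp_fd)
{
	struct file_xchg_range	xchg = {
		.file1_fd	= tmp_fd,
		.file1_offset	= 0,
		.file2_offset	= 0,
		.length		= 0,	/* ignored with TO_EOF */
		.flags		= FILE_XCHG_RANGE_TO_EOF |
				  FILE_XCHG_RANGE_FSYNC,
	};

	return ioctl(orig_fd, FIEXCHANGE_RANGE, &xchg);
}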

Note that this function is /not/ the O_DIRECT atomic file writes concept
that has also been floating around for years.  This RFC is constructed
entirely in software, which means that there are no limitations other
than the general filesystem limits.

As a side note, the original motivation behind the kernel functionality
is online repair of file-based metadata.  The atomic file swap is
implemented as an atomic inode fork swap, which means that we can
implement online reconstruction of extended attributes and directories
by building a new copy in another inode and atomically swapping the
contents.

Subsequent patchsets adapt the online filesystem repair code to use
atomic extent swapping.  This enables repair functions to construct a
clean copy of a directory, xattr information, symbolic links, realtime
bitmaps, and realtime summary information in a temporary inode.  If this
completes successfully, the new contents can be swapped atomically into
the inode being repaired.  This is essential to avoid making corruption
problems worse if the system goes down in the middle of running repair.

This patchset also ports the old XFS extent swap ioctl interface to use
the new extent swap code.

On the userspace side, this series also includes the pieces needed to
test the new functionality, and a sample implementation of atomic file
updates.

Question: Should we really bother with fsdevel bikeshedding?  Most
filesystems cannot support this functionality, so we could keep it
private to XFS for now.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-updates

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=atomic-file-updates

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=atomic-file-updates

xfsdocs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-documentation.git/log/?h=atomic-file-updates
---
 fs/read_write.c                    |    2 
 fs/remap_range.c                   |    4 
 fs/xfs/Makefile                    |    3 
 fs/xfs/libxfs/xfs_bmap.h           |    2 
 fs/xfs/libxfs/xfs_defer.c          |    7 
 fs/xfs/libxfs/xfs_defer.h          |    3 
 fs/xfs/libxfs/xfs_errortag.h       |    4 
 fs/xfs/libxfs/xfs_format.h         |   15 
 fs/xfs/libxfs/xfs_fs.h             |    2 
 fs/xfs/libxfs/xfs_fs_staging.h     |  107 +++
 fs/xfs/libxfs/xfs_log_format.h     |   83 ++
 fs/xfs/libxfs/xfs_log_recover.h    |    2 
 fs/xfs/libxfs/xfs_sb.c             |    3 
 fs/xfs/libxfs/xfs_swapext.c        | 1331 +++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_swapext.h        |  173 +++++
 fs/xfs/libxfs/xfs_symlink_remote.c |   47 +
 fs/xfs/libxfs/xfs_symlink_remote.h |    1 
 fs/xfs/libxfs/xfs_trans_space.h    |    4 
 fs/xfs/xfs_bmap_util.c             |  620 ----------------
 fs/xfs/xfs_bmap_util.h             |    3 
 fs/xfs/xfs_error.c                 |    3 
 fs/xfs/xfs_file.c                  |   88 --
 fs/xfs/xfs_file.h                  |   15 
 fs/xfs/xfs_inode.c                 |   75 ++
 fs/xfs/xfs_inode.h                 |   12 
 fs/xfs/xfs_ioctl.c                 |  133 ++--
 fs/xfs/xfs_ioctl.h                 |    4 
 fs/xfs/xfs_ioctl32.c               |   11 
 fs/xfs/xfs_iops.c                  |    1 
 fs/xfs/xfs_iops.h                  |    7 
 fs/xfs/xfs_linux.h                 |    6 
 fs/xfs/xfs_log.c                   |   47 +
 fs/xfs/xfs_log.h                   |   10 
 fs/xfs/xfs_log_priv.h              |    3 
 fs/xfs/xfs_log_recover.c           |    5 
 fs/xfs/xfs_mount.c                 |   11 
 fs/xfs/xfs_mount.h                 |    7 
 fs/xfs/xfs_rtalloc.c               |  159 ++++
 fs/xfs/xfs_rtalloc.h               |    3 
 fs/xfs/xfs_super.c                 |   19 +
 fs/xfs/xfs_swapext_item.c          |  657 +++++++++++++++++
 fs/xfs/xfs_swapext_item.h          |   56 +
 fs/xfs/xfs_symlink.c               |   49 -
 fs/xfs/xfs_trace.c                 |    2 
 fs/xfs/xfs_trace.h                 |  352 +++++++++
 fs/xfs/xfs_xattr.c                 |    6 
 fs/xfs/xfs_xchgrange.c             | 1364 ++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_xchgrange.h             |   56 +
 include/linux/fs.h                 |    1 
 49 files changed, 4696 insertions(+), 882 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_fs_staging.h
 create mode 100644 fs/xfs/libxfs/xfs_swapext.c
 create mode 100644 fs/xfs/libxfs/xfs_swapext.h
 create mode 100644 fs/xfs/xfs_file.h
 create mode 100644 fs/xfs/xfs_swapext_item.c
 create mode 100644 fs/xfs/xfs_swapext_item.h
 create mode 100644 fs/xfs/xfs_xchgrange.c
 create mode 100644 fs/xfs/xfs_xchgrange.h



* [PATCH 1/7] xfs: create a big array data structure
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
@ 2023-05-26  0:47   ` Darrick J. Wong
  2023-05-26  1:34     ` Kent Overstreet
  2023-06-22  2:55     ` Dave Chinner
  2023-05-26  0:47   ` [PATCH 2/7] xfs: enable sorting of xfile-backed arrays Darrick J. Wong
                     ` (6 subsequent siblings)
  7 siblings, 2 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:47 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Create a simple 'big array' data structure for storage of fixed-size
metadata records that will be used to reconstruct a btree index.  For
repair operations, the most important operations are append, iterate,
and sort.

Earlier implementations of the big array used linked lists and suffered
from severe problems -- pinning all records in kernel memory was not a
good idea and frequently led to OOM situations; random access was very
inefficient; and record overhead for the lists was unacceptably high at
40-60%.

Therefore, the big memory array relies on the 'xfile' abstraction, which
creates a memfd file and stores the records in page cache pages.  Since
the memfd is created in tmpfs, the memory pages can be pushed out to
disk if necessary and we have a built-in usage limit of 50% of physical
memory.
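
Usage sketch (not part of this patch): callers treat the xfile as
pageable memory by loading and storing fixed-size objects at computed
byte offsets, and any error or short I/O is reported as -ENOMEM:

static int
example_xfile_roundtrip(
	struct xfile		*xf,
	uint64_t		idx)
{
	struct xfs_rmap_irec	rec = { .rm_startblock = 1 };
	loff_t			pos = idx * sizeof(rec);
	int			error;

	/* write the record at its byte offset in the backing memfd */
	error = xfile_obj_store(xf, &rec, sizeof(rec), pos);
	if (error)
		return error;

	/* read the record back from the same byte offset */
	return xfile_obj_load(xf, &rec, sizeof(rec), pos);
}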

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/Kconfig         |    1 
 fs/xfs/Makefile        |    2 
 fs/xfs/scrub/trace.c   |    4 -
 fs/xfs/scrub/trace.h   |  123 ++++++++++++++++
 fs/xfs/scrub/xfarray.c |  370 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfarray.h |   58 ++++++++
 fs/xfs/scrub/xfile.c   |  325 ++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfile.h   |   58 ++++++++
 8 files changed, 940 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/scrub/xfarray.c
 create mode 100644 fs/xfs/scrub/xfarray.h
 create mode 100644 fs/xfs/scrub/xfile.c
 create mode 100644 fs/xfs/scrub/xfile.h


diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 52e1823241fb..152348b4dece 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -128,6 +128,7 @@ config XFS_ONLINE_SCRUB
 	bool "XFS online metadata check support"
 	default n
 	depends on XFS_FS
+	depends on TMPFS && SHMEM
 	select XFS_DRAIN_INTENTS
 	help
 	  If you say Y here you will be able to check metadata on a
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index d562d128af8e..7a5fa47a3093 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -164,6 +164,8 @@ xfs-y				+= $(addprefix scrub/, \
 				   rmap.o \
 				   scrub.o \
 				   symlink.o \
+				   xfarray.o \
+				   xfile.o \
 				   )
 
 xfs-$(CONFIG_XFS_RT)		+= scrub/rtbitmap.o
diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c
index 0a975439d2b6..46249e7b17e0 100644
--- a/fs/xfs/scrub/trace.c
+++ b/fs/xfs/scrub/trace.c
@@ -12,8 +12,10 @@
 #include "xfs_mount.h"
 #include "xfs_inode.h"
 #include "xfs_btree.h"
-#include "scrub/scrub.h"
 #include "xfs_ag.h"
+#include "scrub/scrub.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
 
 /* Figure out which block the btree cursor was pointing to. */
 static inline xfs_fsblock_t
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 7418d6c60056..c5fa000c668b 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -16,6 +16,9 @@
 #include <linux/tracepoint.h>
 #include "xfs_bit.h"
 
+struct xfile;
+struct xfarray;
+
 /*
  * ftrace's __print_symbolic requires that all enum values be wrapped in the
  * TRACE_DEFINE_ENUM macro so that the enum value can be encoded in the ftrace
@@ -725,6 +728,126 @@ TRACE_EVENT(xchk_refcount_incorrect,
 		  __entry->seen)
 )
 
+TRACE_EVENT(xfile_create,
+	TP_PROTO(struct xfs_mount *mp, struct xfile *xf),
+	TP_ARGS(mp, xf),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned long, ino)
+		__array(char, pathname, 256)
+	),
+	TP_fast_assign(
+		char		pathname[257];
+		char		*path;
+
+		__entry->dev = mp->m_super->s_dev;
+		__entry->ino = file_inode(xf->file)->i_ino;
+		memset(pathname, 0, sizeof(pathname));
+		path = file_path(xf->file, pathname, sizeof(pathname) - 1);
+		if (IS_ERR(path))
+			path = "(unknown)";
+		strncpy(__entry->pathname, path, sizeof(__entry->pathname));
+	),
+	TP_printk("dev %d:%d xfino 0x%lx path '%s'",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino,
+		  __entry->pathname)
+);
+
+TRACE_EVENT(xfile_destroy,
+	TP_PROTO(struct xfile *xf),
+	TP_ARGS(xf),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, bytes)
+		__field(loff_t, size)
+	),
+	TP_fast_assign(
+		struct xfile_stat	statbuf;
+		int			ret;
+
+		ret = xfile_stat(xf, &statbuf);
+		if (!ret) {
+			__entry->bytes = statbuf.bytes;
+			__entry->size = statbuf.size;
+		} else {
+			__entry->bytes = -1;
+			__entry->size = -1;
+		}
+		__entry->ino = file_inode(xf->file)->i_ino;
+	),
+	TP_printk("xfino 0x%lx mem_bytes 0x%llx isize 0x%llx",
+		  __entry->ino,
+		  __entry->bytes,
+		  __entry->size)
+);
+
+DECLARE_EVENT_CLASS(xfile_class,
+	TP_PROTO(struct xfile *xf, loff_t pos, unsigned long long bytecount),
+	TP_ARGS(xf, pos, bytecount),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, bytes_used)
+		__field(loff_t, pos)
+		__field(loff_t, size)
+		__field(unsigned long long, bytecount)
+	),
+	TP_fast_assign(
+		struct xfile_stat	statbuf;
+		int			ret;
+
+		ret = xfile_stat(xf, &statbuf);
+		if (!ret) {
+			__entry->bytes_used = statbuf.bytes;
+			__entry->size = statbuf.size;
+		} else {
+			__entry->bytes_used = -1;
+			__entry->size = -1;
+		}
+		__entry->ino = file_inode(xf->file)->i_ino;
+		__entry->pos = pos;
+		__entry->bytecount = bytecount;
+	),
+	TP_printk("xfino 0x%lx mem_bytes 0x%llx pos 0x%llx bytecount 0x%llx isize 0x%llx",
+		  __entry->ino,
+		  __entry->bytes_used,
+		  __entry->pos,
+		  __entry->bytecount,
+		  __entry->size)
+);
+#define DEFINE_XFILE_EVENT(name) \
+DEFINE_EVENT(xfile_class, name, \
+	TP_PROTO(struct xfile *xf, loff_t pos, unsigned long long bytecount), \
+	TP_ARGS(xf, pos, bytecount))
+DEFINE_XFILE_EVENT(xfile_pread);
+DEFINE_XFILE_EVENT(xfile_pwrite);
+DEFINE_XFILE_EVENT(xfile_seek_data);
+
+TRACE_EVENT(xfarray_create,
+	TP_PROTO(struct xfarray *xfa, unsigned long long required_capacity),
+	TP_ARGS(xfa, required_capacity),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(uint64_t, max_nr)
+		__field(size_t, obj_size)
+		__field(int, obj_size_log)
+		__field(unsigned long long, required_capacity)
+	),
+	TP_fast_assign(
+		__entry->max_nr = xfa->max_nr;
+		__entry->obj_size = xfa->obj_size;
+		__entry->obj_size_log = xfa->obj_size_log;
+		__entry->ino = file_inode(xfa->xfile->file)->i_ino;
+		__entry->required_capacity = required_capacity;
+	),
+	TP_printk("xfino 0x%lx max_nr %llu reqd_nr %llu objsz %zu objszlog %d",
+		  __entry->ino,
+		  __entry->max_nr,
+		  __entry->required_capacity,
+		  __entry->obj_size,
+		  __entry->obj_size_log)
+);
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 
diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
new file mode 100644
index 000000000000..a2dce2c37a4f
--- /dev/null
+++ b/fs/xfs/scrub/xfarray.c
@@ -0,0 +1,370 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
+#include "scrub/scrub.h"
+#include "scrub/trace.h"
+
+/*
+ * Large Arrays of Fixed-Size Records
+ * ==================================
+ *
+ * This memory array uses an xfile (which itself is a memfd "file") to store
+ * large numbers of fixed-size records in memory that can be paged out.  This
+ * puts less stress on the memory reclaim algorithms during an online repair
+ * because we don't have to pin so much memory.  However, array access is less
+ * direct than would be in a regular memory array.  Access to the array is
+ * performed via indexed load and store methods, and an append method is
+ * provided for convenience.  Array elements can be unset, which sets them to
+ * all zeroes.  Unset entries are skipped during iteration, though direct loads
+ * will return a zeroed buffer.  Callers are responsible for concurrency
+ * control.
+ */
+
+/*
+ * Pointer to scratch space.  Because we can't access the xfile data directly,
+ * we allocate a small amount of memory on the end of the xfarray structure to
+ * buffer array items when we need space to store values temporarily.
+ */
+static inline void *xfarray_scratch(struct xfarray *array)
+{
+	return (array + 1);
+}
+
+/* Compute array index given an xfile offset. */
+static xfarray_idx_t
+xfarray_idx(
+	struct xfarray	*array,
+	loff_t		pos)
+{
+	if (array->obj_size_log >= 0)
+		return (xfarray_idx_t)pos >> array->obj_size_log;
+
+	return div_u64((xfarray_idx_t)pos, array->obj_size);
+}
+
+/* Compute xfile offset of array element. */
+static inline loff_t xfarray_pos(struct xfarray *array, xfarray_idx_t idx)
+{
+	if (array->obj_size_log >= 0)
+		return idx << array->obj_size_log;
+
+	return idx * array->obj_size;
+}
+
+/*
+ * Initialize a big memory array.  Array records cannot be larger than a
+ * page, and the array cannot span more bytes than the page cache supports.
+ * If @required_capacity is nonzero, the maximum array size will be set to this
+ * quantity and the array creation will fail if the underlying storage cannot
+ * support that many records.
+ */
+int
+xfarray_create(
+	struct xfs_mount	*mp,
+	const char		*description,
+	unsigned long long	required_capacity,
+	size_t			obj_size,
+	struct xfarray		**arrayp)
+{
+	struct xfarray		*array;
+	struct xfile		*xfile;
+	int			error;
+
+	ASSERT(obj_size < PAGE_SIZE);
+
+	error = xfile_create(mp, description, 0, &xfile);
+	if (error)
+		return error;
+
+	error = -ENOMEM;
+	array = kzalloc(sizeof(struct xfarray) + obj_size, XCHK_GFP_FLAGS);
+	if (!array)
+		goto out_xfile;
+
+	array->xfile = xfile;
+	array->obj_size = obj_size;
+
+	if (is_power_of_2(obj_size))
+		array->obj_size_log = ilog2(obj_size);
+	else
+		array->obj_size_log = -1;
+
+	array->max_nr = xfarray_idx(array, MAX_LFS_FILESIZE);
+	trace_xfarray_create(array, required_capacity);
+
+	if (required_capacity > 0) {
+		if (array->max_nr < required_capacity) {
+			error = -ENOMEM;
+			goto out_xfarray;
+		}
+		array->max_nr = required_capacity;
+	}
+
+	*arrayp = array;
+	return 0;
+
+out_xfarray:
+	kfree(array);
+out_xfile:
+	xfile_destroy(xfile);
+	return error;
+}
+
+/* Destroy the array. */
+void
+xfarray_destroy(
+	struct xfarray	*array)
+{
+	xfile_destroy(array->xfile);
+	kfree(array);
+}
+
+/* Load an element from the array. */
+int
+xfarray_load(
+	struct xfarray	*array,
+	xfarray_idx_t	idx,
+	void		*ptr)
+{
+	if (idx >= array->nr)
+		return -ENODATA;
+
+	return xfile_obj_load(array->xfile, ptr, array->obj_size,
+			xfarray_pos(array, idx));
+}
+
+/* Is this array element potentially unset? */
+static inline bool
+xfarray_is_unset(
+	struct xfarray	*array,
+	loff_t		pos)
+{
+	void		*temp = xfarray_scratch(array);
+	int		error;
+
+	if (array->unset_slots == 0)
+		return false;
+
+	error = xfile_obj_load(array->xfile, temp, array->obj_size, pos);
+	if (!error && xfarray_element_is_null(array, temp))
+		return true;
+
+	return false;
+}
+
+/*
+ * Unset an array element.  If @idx is the last element in the array, the
+ * array will be truncated.  Otherwise, the entry will be zeroed.
+ */
+int
+xfarray_unset(
+	struct xfarray	*array,
+	xfarray_idx_t	idx)
+{
+	void		*temp = xfarray_scratch(array);
+	loff_t		pos = xfarray_pos(array, idx);
+	int		error;
+
+	if (idx >= array->nr)
+		return -ENODATA;
+
+	if (idx == array->nr - 1) {
+		array->nr--;
+		return 0;
+	}
+
+	if (xfarray_is_unset(array, pos))
+		return 0;
+
+	memset(temp, 0, array->obj_size);
+	error = xfile_obj_store(array->xfile, temp, array->obj_size, pos);
+	if (error)
+		return error;
+
+	array->unset_slots++;
+	return 0;
+}
+
+/*
+ * Store an element in the array.  The element must not be completely zeroed,
+ * because those are considered unset sparse elements.
+ */
+int
+xfarray_store(
+	struct xfarray	*array,
+	xfarray_idx_t	idx,
+	const void	*ptr)
+{
+	int		ret;
+
+	if (idx >= array->max_nr)
+		return -EFBIG;
+
+	ASSERT(!xfarray_element_is_null(array, ptr));
+
+	ret = xfile_obj_store(array->xfile, ptr, array->obj_size,
+			xfarray_pos(array, idx));
+	if (ret)
+		return ret;
+
+	array->nr = max(array->nr, idx + 1);
+	return 0;
+}
+
+/* Is this array element NULL? */
+bool
+xfarray_element_is_null(
+	struct xfarray	*array,
+	const void	*ptr)
+{
+	return !memchr_inv(ptr, 0, array->obj_size);
+}
+
+/*
+ * Store an element anywhere in the array that is unset.  If there are no
+ * unset slots, append the element to the array.
+ */
+int
+xfarray_store_anywhere(
+	struct xfarray	*array,
+	const void	*ptr)
+{
+	void		*temp = xfarray_scratch(array);
+	loff_t		endpos = xfarray_pos(array, array->nr);
+	loff_t		pos;
+	int		error;
+
+	/* Find an unset slot to put it in. */
+	for (pos = 0;
+	     pos < endpos && array->unset_slots > 0;
+	     pos += array->obj_size) {
+		error = xfile_obj_load(array->xfile, temp, array->obj_size,
+				pos);
+		if (error || !xfarray_element_is_null(array, temp))
+			continue;
+
+		error = xfile_obj_store(array->xfile, ptr, array->obj_size,
+				pos);
+		if (error)
+			return error;
+
+		array->unset_slots--;
+		return 0;
+	}
+
+	/* No unset slots found; attach it on the end. */
+	array->unset_slots = 0;
+	return xfarray_append(array, ptr);
+}
+
+/* Return length of array. */
+uint64_t
+xfarray_length(
+	struct xfarray	*array)
+{
+	return array->nr;
+}
+
+/*
+ * Decide which array item we're going to read as part of an _iter_get.
+ * @cur is the array index, and @pos is the file offset of that array index in
+ * the backing xfile.  Returns ENODATA if we reach the end of the records.
+ *
+ * Reading from a hole in a sparse xfile causes page instantiation, so for
+ * iterating a (possibly sparse) array we need to figure out if the cursor is
+ * pointing at a totally uninitialized hole and move the cursor up if
+ * necessary.
+ */
+static inline int
+xfarray_find_data(
+	struct xfarray	*array,
+	xfarray_idx_t	*cur,
+	loff_t		*pos)
+{
+	unsigned int	pgoff = offset_in_page(*pos);
+	loff_t		end_pos = *pos + array->obj_size - 1;
+	loff_t		new_pos;
+
+	/*
+	 * If the current array record is not adjacent to a page boundary, we
+	 * are in the middle of the page.  We do not need to move the cursor.
+	 */
+	if (pgoff != 0 && pgoff + array->obj_size - 1 < PAGE_SIZE)
+		return 0;
+
+	/*
+	 * Call SEEK_DATA on the last byte in the record we're about to read.
+	 * If the record ends at (or crosses) the end of a page then we know
+	 * that the first byte of the record is backed by pages and don't need
+	 * to query it.  If instead the record begins at the start of the page
+	 * then we know that querying the last byte is just as good as querying
+	 * the first byte, since records cannot be larger than a page.
+	 *
+	 * If the call returns the same file offset, we know this record is
+	 * backed by real pages.  We do not need to move the cursor.
+	 */
+	new_pos = xfile_seek_data(array->xfile, end_pos);
+	if (new_pos == -ENXIO)
+		return -ENODATA;
+	if (new_pos < 0)
+		return new_pos;
+	if (new_pos == end_pos)
+		return 0;
+
+	/*
+	 * Otherwise, SEEK_DATA told us how far up to move the file pointer to
+	 * find more data.  Move the array index to the first record past the
+	 * byte offset we were given.
+	 */
+	new_pos = roundup_64(new_pos, array->obj_size);
+	*cur = xfarray_idx(array, new_pos);
+	*pos = xfarray_pos(array, *cur);
+	return 0;
+}
+
+/*
+ * Starting at *idx, fetch the next non-null array entry and advance the index
+ * to set up the next _load_next call.  Returns ENODATA if we reach the end of
+ * the array.  Callers must set @*idx to XFARRAY_CURSOR_INIT before the first
+ * call to this function.
+ */
+int
+xfarray_load_next(
+	struct xfarray	*array,
+	xfarray_idx_t	*idx,
+	void		*rec)
+{
+	xfarray_idx_t	cur = *idx;
+	loff_t		pos = xfarray_pos(array, cur);
+	int		error;
+
+	do {
+		if (cur >= array->nr)
+			return -ENODATA;
+
+		/*
+		 * Ask the backing store for the location of next possible
+		 * written record, then retrieve that record.
+		 */
+		error = xfarray_find_data(array, &cur, &pos);
+		if (error)
+			return error;
+		error = xfarray_load(array, cur, rec);
+		if (error)
+			return error;
+
+		cur++;
+		pos += array->obj_size;
+	} while (xfarray_element_is_null(array, rec));
+
+	*idx = cur;
+	return 0;
+}
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
new file mode 100644
index 000000000000..4f815f2c6d89
--- /dev/null
+++ b/fs/xfs/scrub/xfarray.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SCRUB_XFARRAY_H__
+#define __XFS_SCRUB_XFARRAY_H__
+
+/* xfile array index type, along with cursor initialization */
+typedef uint64_t		xfarray_idx_t;
+#define XFARRAY_CURSOR_INIT	((__force xfarray_idx_t)0)
+
+/* Iterate each index of an xfile array. */
+#define foreach_xfarray_idx(array, idx) \
+	for ((idx) = XFARRAY_CURSOR_INIT; \
+	     (idx) < xfarray_length(array); \
+	     (idx)++)
+
+struct xfarray {
+	/* Underlying file that backs the array. */
+	struct xfile	*xfile;
+
+	/* Number of array elements. */
+	xfarray_idx_t	nr;
+
+	/* Maximum possible array size. */
+	xfarray_idx_t	max_nr;
+
+	/* Number of unset slots in the array below @nr. */
+	uint64_t	unset_slots;
+
+	/* Size of an array element. */
+	size_t		obj_size;
+
+	/* log2 of array element size, if possible. */
+	int		obj_size_log;
+};
+
+int xfarray_create(struct xfs_mount *mp, const char *descr,
+		unsigned long long required_capacity, size_t obj_size,
+		struct xfarray **arrayp);
+void xfarray_destroy(struct xfarray *array);
+int xfarray_load(struct xfarray *array, xfarray_idx_t idx, void *ptr);
+int xfarray_unset(struct xfarray *array, xfarray_idx_t idx);
+int xfarray_store(struct xfarray *array, xfarray_idx_t idx, const void *ptr);
+int xfarray_store_anywhere(struct xfarray *array, const void *ptr);
+bool xfarray_element_is_null(struct xfarray *array, const void *ptr);
+
+/* Append an element to the array. */
+static inline int xfarray_append(struct xfarray *array, const void *ptr)
+{
+	return xfarray_store(array, array->nr, ptr);
+}
+
+uint64_t xfarray_length(struct xfarray *array);
+int xfarray_load_next(struct xfarray *array, xfarray_idx_t *idx, void *rec);
+
+#endif /* __XFS_SCRUB_XFARRAY_H__ */
diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
new file mode 100644
index 000000000000..e1125a3e39eb
--- /dev/null
+++ b/fs/xfs/scrub/xfile.c
@@ -0,0 +1,325 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2018-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_format.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
+#include "scrub/scrub.h"
+#include "scrub/trace.h"
+#include <linux/shmem_fs.h>
+
+/*
+ * Swappable Temporary Memory
+ * ==========================
+ *
+ * Online checking sometimes needs to be able to stage a large amount of data
+ * in memory.  This information might not fit in the available memory and it
+ * doesn't all need to be accessible at all times.  In other words, we want an
+ * indexed data buffer to store data that can be paged out.
+ *
+ * When CONFIG_TMPFS=y, shmemfs is enough of a filesystem to meet those
+ * requirements.  Therefore, the xfile mechanism uses an unlinked shmem file to
+ * store our staging data.  This file is not installed in the file descriptor
+ * table so that user programs cannot access the data, which means that the
+ * xfile must be freed with xfile_destroy.
+ *
+ * xfiles assume that the caller will handle all required concurrency
+ * management; standard vfs locks (freezer and inode) are not taken.  Reads
+ * and writes are satisfied directly from the page cache.
+ *
+ * NOTE: The current shmemfs implementation has a quirk that in-kernel reads
+ * of a hole cause a page to be mapped into the file.  If you are going to
+ * create a sparse xfile, please be careful about reading from uninitialized
+ * parts of the file.  These pages are !Uptodate and will eventually be
+ * reclaimed if not written, but in the short term this boosts memory
+ * consumption.
+ */
+
+/*
+ * xfiles must not be exposed to userspace and require upper layers to
+ * coordinate access to the one handle returned by the constructor, so
+ * establish a separate lock class for xfiles to avoid confusing lockdep.
+ */
+static struct lock_class_key xfile_i_mutex_key;
+
+/*
+ * Create an xfile of the given size.  The description will be used in the
+ * trace output.
+ */
+int
+xfile_create(
+	struct xfs_mount	*mp,
+	const char		*description,
+	loff_t			isize,
+	struct xfile		**xfilep)
+{
+	char			*fname;
+	struct inode		*inode;
+	struct xfile		*xf;
+	int			error = -ENOMEM;
+
+	xf = kmalloc(sizeof(struct xfile), XCHK_GFP_FLAGS);
+	if (!xf)
+		return -ENOMEM;
+
+	fname = kmalloc(MAXNAMELEN, XCHK_GFP_FLAGS);
+	if (!fname)
+		goto out_xfile;
+
+	snprintf(fname, MAXNAMELEN - 1, "XFS (%s): %s", mp->m_super->s_id,
+			description);
+	fname[MAXNAMELEN - 1] = 0;
+
+	xf->file = shmem_file_setup(fname, isize, 0);
+	if (!xf->file)
+		goto out_fname;
+	if (IS_ERR(xf->file)) {
+		error = PTR_ERR(xf->file);
+		goto out_fname;
+	}
+
+	/*
+	 * We want a large sparse file that we can pread, pwrite, and seek.
+	 * xfile users are responsible for keeping the xfile hidden away from
+	 * all other callers, so we skip timestamp updates and security checks.
+	 * Make the inode only accessible by root, just in case the xfile ever
+	 * escapes.
+	 */
+	xf->file->f_mode |= FMODE_PREAD | FMODE_PWRITE | FMODE_NOCMTIME |
+			    FMODE_LSEEK;
+	xf->file->f_flags |= O_RDWR | O_LARGEFILE | O_NOATIME;
+	inode = file_inode(xf->file);
+	inode->i_flags |= S_PRIVATE | S_NOCMTIME | S_NOATIME;
+	inode->i_mode &= ~0177;
+	inode->i_uid = GLOBAL_ROOT_UID;
+	inode->i_gid = GLOBAL_ROOT_GID;
+
+	lockdep_set_class(&inode->i_rwsem, &xfile_i_mutex_key);
+
+	trace_xfile_create(mp, xf);
+
+	kfree(fname);
+	*xfilep = xf;
+	return 0;
+out_fname:
+	kfree(fname);
+out_xfile:
+	kfree(xf);
+	return error;
+}
+
+/* Close the file and release all resources. */
+void
+xfile_destroy(
+	struct xfile		*xf)
+{
+	struct inode		*inode = file_inode(xf->file);
+
+	trace_xfile_destroy(xf);
+
+	lockdep_set_class(&inode->i_rwsem, &inode->i_sb->s_type->i_mutex_key);
+	fput(xf->file);
+	kfree(xf);
+}
+
+/*
+ * Read a memory object directly from the xfile's page cache.  Unlike regular
+ * pread, we return -E2BIG and -EFBIG for reads that are too large or at too
+ * high an offset, instead of truncating the read.  Otherwise, we return
+ * bytes read or an error code, like regular pread.
+ */
+ssize_t
+xfile_pread(
+	struct xfile		*xf,
+	void			*buf,
+	size_t			count,
+	loff_t			pos)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	struct page		*page = NULL;
+	ssize_t			read = 0;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_pread(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*p, *kaddr;
+		unsigned int	len;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * In-kernel reads of a shmem file cause it to allocate a page
+		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
+		 * we can continue by zeroing the caller's buffer.
+		 */
+		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
+				__GFP_NOWARN);
+		if (IS_ERR(page)) {
+			error = PTR_ERR(page);
+			if (error != -ENOMEM)
+				break;
+
+			memset(buf, 0, len);
+			goto advance;
+		}
+
+		if (PageUptodate(page)) {
+			/*
+			 * xfile pages must never be mapped into userspace, so
+			 * we skip the dcache flush.
+			 */
+			kaddr = kmap_local_page(page);
+			p = kaddr + offset_in_page(pos);
+			memcpy(buf, p, len);
+			kunmap_local(kaddr);
+		} else {
+			memset(buf, 0, len);
+		}
+		put_page(page);
+
+advance:
+		count -= len;
+		pos += len;
+		buf += len;
+		read += len;
+	}
+	memalloc_nofs_restore(pflags);
+
+	if (read > 0)
+		return read;
+	return error;
+}
+
+/*
+ * Write a memory object directly to the xfile's page cache.  Unlike regular
+ * pwrite, we return -E2BIG and -EFBIG for writes that are too large or at too
+ * high an offset, instead of truncating the write.  Otherwise, we return
+ * bytes written or an error code, like regular pwrite.
+ */
+ssize_t
+xfile_pwrite(
+	struct xfile		*xf,
+	const void		*buf,
+	size_t			count,
+	loff_t			pos)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	struct page		*page = NULL;
+	ssize_t			written = 0;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_pwrite(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*fsdata = NULL;
+		void		*p, *kaddr;
+		unsigned int	len;
+		int		ret;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * We call write_begin directly here to avoid all the freezer
+		 * protection lock-taking that happens in the normal path.
+		 * shmem doesn't support fs freeze, but lockdep doesn't know
+		 * that and will trip over that.
+		 */
+		error = aops->write_begin(NULL, mapping, pos, len, &page,
+				&fsdata);
+		if (error)
+			break;
+
+		/*
+		 * xfile pages must never be mapped into userspace, so we skip
+		 * the dcache flush.  If the page is not uptodate, zero it
+		 * before writing data.
+		 */
+		kaddr = kmap_local_page(page);
+		if (!PageUptodate(page)) {
+			memset(kaddr, 0, PAGE_SIZE);
+			SetPageUptodate(page);
+		}
+		p = kaddr + offset_in_page(pos);
+		memcpy(p, buf, len);
+		kunmap_local(kaddr);
+
+		ret = aops->write_end(NULL, mapping, pos, len, len, page,
+				fsdata);
+		if (ret < 0) {
+			error = ret;
+			break;
+		}
+
+		written += ret;
+		if (ret != len)
+			break;
+
+		count -= ret;
+		pos += ret;
+		buf += ret;
+	}
+	memalloc_nofs_restore(pflags);
+
+	if (written > 0)
+		return written;
+	return error;
+}
+
+/* Find the next written area in the xfile data for a given offset. */
+loff_t
+xfile_seek_data(
+	struct xfile		*xf,
+	loff_t			pos)
+{
+	loff_t			ret;
+
+	ret = vfs_llseek(xf->file, pos, SEEK_DATA);
+	trace_xfile_seek_data(xf, pos, ret);
+	return ret;
+}
+
+/* Query stat information for an xfile. */
+int
+xfile_stat(
+	struct xfile		*xf,
+	struct xfile_stat	*statbuf)
+{
+	struct kstat		ks;
+	int			error;
+
+	error = vfs_getattr_nosec(&xf->file->f_path, &ks,
+			STATX_SIZE | STATX_BLOCKS, AT_STATX_DONT_SYNC);
+	if (error)
+		return error;
+
+	statbuf->size = ks.size;
+	statbuf->bytes = ks.blocks << SECTOR_SHIFT;
+	return 0;
+}
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
new file mode 100644
index 000000000000..f91c90efd04a
--- /dev/null
+++ b/fs/xfs/scrub/xfile.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2018-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SCRUB_XFILE_H__
+#define __XFS_SCRUB_XFILE_H__
+
+struct xfile {
+	struct file		*file;
+};
+
+int xfile_create(struct xfs_mount *mp, const char *description, loff_t isize,
+		struct xfile **xfilep);
+void xfile_destroy(struct xfile *xf);
+
+ssize_t xfile_pread(struct xfile *xf, void *buf, size_t count, loff_t pos);
+ssize_t xfile_pwrite(struct xfile *xf, const void *buf, size_t count,
+		loff_t pos);
+
+/*
+ * Load an object.  Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_load(struct xfile *xf, void *buf, size_t count, loff_t pos)
+{
+	ssize_t	ret = xfile_pread(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+/*
+ * Store an object.  Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t pos)
+{
+	ssize_t	ret = xfile_pwrite(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+loff_t xfile_seek_data(struct xfile *xf, loff_t pos);
+
+struct xfile_stat {
+	loff_t			size;
+	unsigned long long	bytes;
+};
+
+int xfile_stat(struct xfile *xf, struct xfile_stat *statbuf);
+
+#endif /* __XFS_SCRUB_XFILE_H__ */



* [PATCH 2/7] xfs: enable sorting of xfile-backed arrays
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
  2023-05-26  0:47   ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
@ 2023-05-26  0:47   ` Darrick J. Wong
  2023-05-26  0:47   ` [PATCH 3/7] xfs: convert xfarray insertion sort to heapsort using scratchpad memory Darrick J. Wong
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:47 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

The btree bulk loading code requires that records be provided in the
correct record sort order for the given btree type.  In general, repair
code cannot be required to collect records in order, and it is not
feasible to insert new records in the middle of an array to maintain
sort order.

Implement a sorting algorithm so that we can sort the records just prior
to bulk loading.  In principle, an xfarray could consume many gigabytes
of memory and its backing pages can be sent out to disk at any time.
This means that we cannot map the entire array into memory at once, so
we must find a way to divide the work into smaller portions (e.g. a
page) that /can/ be mapped into memory.

Quicksort seems like a reasonable fit for this purpose, since its divide
and conquer strategy keeps the recursion depth logarithmic and the
average runtime at O(n log n).
The solution presented here is a port of the glibc implementation, which
itself is derived from the median-of-three and tail call recursion
strategies outlined by Sedgewick.

Subsequent patches will optimize the implementation further by utilizing
the kernel's heapsort on directly-mapped memory whenever possible, and
improving the quicksort pivot selection algorithm to try to avoid O(n^2)
collapses.

Note: The sorting functionality gets its own patch because the basic big
array mechanisms were plenty for a single code patch.
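
To make the interface concrete for reviewers, here is a rough sketch of
how a repair function might drive the sort once it has filled the array.
The record layout and the xrep_example_* names below are illustrative
only; xfarray_sort() and XFARRAY_SORT_KILLABLE are the pieces added by
this patch:

	/* hypothetical fixed-size record staged in the xfarray */
	struct xrep_example_rec {
		uint64_t		key;
		uint64_t		data;
	};

	/* comparison callback handed to xfarray_sort() */
	static int
	xrep_example_rec_cmp(
		const void		*a,
		const void		*b)
	{
		const struct xrep_example_rec	*ra = a;
		const struct xrep_example_rec	*rb = b;

		if (ra->key > rb->key)
			return 1;
		if (ra->key < rb->key)
			return -1;
		return 0;
	}

	/* records were appended in scan order; sort before bulk loading */
	error = xfarray_sort(array, xrep_example_rec_cmp,
			XFARRAY_SORT_KILLABLE);
	if (error)
		return error;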

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/scrub/trace.h   |  114 ++++++++++
 fs/xfs/scrub/xfarray.c |  569 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfarray.h |   67 ++++++
 3 files changed, 750 insertions(+)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index c5fa000c668b..cdcb5a491b20 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -18,6 +18,7 @@
 
 struct xfile;
 struct xfarray;
+struct xfarray_sortinfo;
 
 /*
  * ftrace's __print_symbolic requires that all enum values be wrapped in the
@@ -848,6 +849,119 @@ TRACE_EVENT(xfarray_create,
 		  __entry->obj_size_log)
 );
 
+TRACE_EVENT(xfarray_isort,
+	TP_PROTO(struct xfarray_sortinfo *si, uint64_t lo, uint64_t hi),
+	TP_ARGS(si, lo, hi),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, lo)
+		__field(unsigned long long, hi)
+	),
+	TP_fast_assign(
+		__entry->ino = file_inode(si->array->xfile->file)->i_ino;
+		__entry->lo = lo;
+		__entry->hi = hi;
+	),
+	TP_printk("xfino 0x%lx lo %llu hi %llu elts %llu",
+		  __entry->ino,
+		  __entry->lo,
+		  __entry->hi,
+		  __entry->hi - __entry->lo)
+);
+
+TRACE_EVENT(xfarray_qsort,
+	TP_PROTO(struct xfarray_sortinfo *si, uint64_t lo, uint64_t hi),
+	TP_ARGS(si, lo, hi),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, lo)
+		__field(unsigned long long, hi)
+		__field(int, stack_depth)
+		__field(int, max_stack_depth)
+	),
+	TP_fast_assign(
+		__entry->ino = file_inode(si->array->xfile->file)->i_ino;
+		__entry->lo = lo;
+		__entry->hi = hi;
+		__entry->stack_depth = si->stack_depth;
+		__entry->max_stack_depth = si->max_stack_depth;
+	),
+	TP_printk("xfino 0x%lx lo %llu hi %llu elts %llu stack %d/%d",
+		  __entry->ino,
+		  __entry->lo,
+		  __entry->hi,
+		  __entry->hi - __entry->lo,
+		  __entry->stack_depth,
+		  __entry->max_stack_depth)
+);
+
+TRACE_EVENT(xfarray_sort,
+	TP_PROTO(struct xfarray_sortinfo *si, size_t bytes),
+	TP_ARGS(si, bytes),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, nr)
+		__field(size_t, obj_size)
+		__field(size_t, bytes)
+		__field(unsigned int, max_stack_depth)
+	),
+	TP_fast_assign(
+		__entry->nr = si->array->nr;
+		__entry->obj_size = si->array->obj_size;
+		__entry->ino = file_inode(si->array->xfile->file)->i_ino;
+		__entry->bytes = bytes;
+		__entry->max_stack_depth = si->max_stack_depth;
+	),
+	TP_printk("xfino 0x%lx nr %llu objsz %zu stack %u bytes %zu",
+		  __entry->ino,
+		  __entry->nr,
+		  __entry->obj_size,
+		  __entry->max_stack_depth,
+		  __entry->bytes)
+);
+
+TRACE_EVENT(xfarray_sort_stats,
+	TP_PROTO(struct xfarray_sortinfo *si, int error),
+	TP_ARGS(si, error),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+#ifdef DEBUG
+		__field(unsigned long long, loads)
+		__field(unsigned long long, stores)
+		__field(unsigned long long, compares)
+#endif
+		__field(unsigned int, max_stack_depth)
+		__field(unsigned int, max_stack_used)
+		__field(int, error)
+	),
+	TP_fast_assign(
+		__entry->ino = file_inode(si->array->xfile->file)->i_ino;
+#ifdef DEBUG
+		__entry->loads = si->loads;
+		__entry->stores = si->stores;
+		__entry->compares = si->compares;
+#endif
+		__entry->max_stack_depth = si->max_stack_depth;
+		__entry->max_stack_used = si->max_stack_used;
+		__entry->error = error;
+	),
+	TP_printk(
+#ifdef DEBUG
+		  "xfino 0x%lx loads %llu stores %llu compares %llu stack_depth %u/%u error %d",
+#else
+		  "xfino 0x%lx stack_depth %u/%u error %d",
+#endif
+		  __entry->ino,
+#ifdef DEBUG
+		  __entry->loads,
+		  __entry->stores,
+		  __entry->compares,
+#endif
+		  __entry->max_stack_used,
+		  __entry->max_stack_depth,
+		  __entry->error)
+);
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 
diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
index a2dce2c37a4f..0a957431d209 100644
--- a/fs/xfs/scrub/xfarray.c
+++ b/fs/xfs/scrub/xfarray.c
@@ -368,3 +368,572 @@ xfarray_load_next(
 	*idx = cur;
 	return 0;
 }
+
+/* Sorting functions */
+
+#ifdef DEBUG
+# define xfarray_sort_bump_loads(si)	do { (si)->loads++; } while (0)
+# define xfarray_sort_bump_stores(si)	do { (si)->stores++; } while (0)
+# define xfarray_sort_bump_compares(si)	do { (si)->compares++; } while (0)
+#else
+# define xfarray_sort_bump_loads(si)
+# define xfarray_sort_bump_stores(si)
+# define xfarray_sort_bump_compares(si)
+#endif /* DEBUG */
+
+/* Load an array element for sorting. */
+static inline int
+xfarray_sort_load(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		idx,
+	void			*ptr)
+{
+	xfarray_sort_bump_loads(si);
+	return xfarray_load(si->array, idx, ptr);
+}
+
+/* Store an array element for sorting. */
+static inline int
+xfarray_sort_store(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		idx,
+	void			*ptr)
+{
+	xfarray_sort_bump_stores(si);
+	return xfarray_store(si->array, idx, ptr);
+}
+
+/* Compare an array element for sorting. */
+static inline int
+xfarray_sort_cmp(
+	struct xfarray_sortinfo	*si,
+	const void		*a,
+	const void		*b)
+{
+	xfarray_sort_bump_compares(si);
+	return si->cmp_fn(a, b);
+}
+
+/* Return a pointer to the low index stack for quicksort partitioning. */
+static inline xfarray_idx_t *xfarray_sortinfo_lo(struct xfarray_sortinfo *si)
+{
+	return (xfarray_idx_t *)(si + 1);
+}
+
+/* Return a pointer to the high index stack for quicksort partitioning. */
+static inline xfarray_idx_t *xfarray_sortinfo_hi(struct xfarray_sortinfo *si)
+{
+	return xfarray_sortinfo_lo(si) + si->max_stack_depth;
+}
+
+/* Allocate memory to handle the sort. */
+static inline int
+xfarray_sortinfo_alloc(
+	struct xfarray		*array,
+	xfarray_cmp_fn		cmp_fn,
+	unsigned int		flags,
+	struct xfarray_sortinfo	**infop)
+{
+	struct xfarray_sortinfo	*si;
+	size_t			nr_bytes = sizeof(struct xfarray_sortinfo);
+	int			max_stack_depth;
+
+	/*
+	 * Tail-call recursion during the partitioning phase means that
+	 * quicksort will never recurse more than log2(nr) times.  We need one
+	 * extra level of stack to hold the initial parameters.
+	 */
+	max_stack_depth = ilog2(array->nr) + 1;
+
+	/* Each level of quicksort uses a lo and a hi index */
+	nr_bytes += max_stack_depth * sizeof(xfarray_idx_t) * 2;
+
+	/* One record for the pivot */
+	nr_bytes += array->obj_size;
+
+	si = kvzalloc(nr_bytes, XCHK_GFP_FLAGS);
+	if (!si)
+		return -ENOMEM;
+
+	si->array = array;
+	si->cmp_fn = cmp_fn;
+	si->flags = flags;
+	si->max_stack_depth = max_stack_depth;
+	si->max_stack_used = 1;
+
+	xfarray_sortinfo_lo(si)[0] = 0;
+	xfarray_sortinfo_hi(si)[0] = array->nr - 1;
+
+	trace_xfarray_sort(si, nr_bytes);
+	*infop = si;
+	return 0;
+}
+
+/* Should this sort be terminated by a fatal signal? */
+static inline bool
+xfarray_sort_terminated(
+	struct xfarray_sortinfo	*si,
+	int			*error)
+{
+	/*
+	 * If preemption is disabled, we need to yield to the scheduler every
+	 * few seconds so that we don't run afoul of the soft lockup watchdog
+	 * or RCU stall detector.
+	 */
+	cond_resched();
+
+	if ((si->flags & XFARRAY_SORT_KILLABLE) &&
+	    fatal_signal_pending(current)) {
+		if (*error == 0)
+			*error = -EINTR;
+		return true;
+	}
+	return false;
+}
+
+/* Do we want an insertion sort? */
+static inline bool
+xfarray_want_isort(
+	struct xfarray_sortinfo *si,
+	xfarray_idx_t		start,
+	xfarray_idx_t		end)
+{
+	/*
+	 * For array subsets smaller than 8 elements, it's slightly faster to
+	 * use insertion sort than quicksort's stack machine.
+	 */
+	return (end - start) < 8;
+}
+
+/* Return the scratch space within the sortinfo structure. */
+static inline void *xfarray_sortinfo_isort_scratch(struct xfarray_sortinfo *si)
+{
+	return xfarray_sortinfo_hi(si) + si->max_stack_depth;
+}
+
+/*
+ * Perform an insertion sort on a subset of the array.
+ * Though insertion sort is an O(n^2) algorithm, for small set sizes it's
+ * faster than quicksort's stack machine, so we let it take over for that.
+ * This ought to be replaced with something more efficient.
+ */
+STATIC int
+xfarray_isort(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		lo,
+	xfarray_idx_t		hi)
+{
+	void			*a = xfarray_sortinfo_isort_scratch(si);
+	void			*b = xfarray_scratch(si->array);
+	xfarray_idx_t		tmp;
+	xfarray_idx_t		i;
+	xfarray_idx_t		run;
+	int			error;
+
+	trace_xfarray_isort(si, lo, hi);
+
+	/*
+	 * Move the smallest element in a[lo..hi] to a[lo].  This
+	 * simplifies the loop control logic below.
+	 */
+	tmp = lo;
+	error = xfarray_sort_load(si, tmp, b);
+	if (error)
+		return error;
+	for (run = lo + 1; run <= hi; run++) {
+		/* if a[run] < a[tmp], tmp = run */
+		error = xfarray_sort_load(si, run, a);
+		if (error)
+			return error;
+		if (xfarray_sort_cmp(si, a, b) < 0) {
+			tmp = run;
+			memcpy(b, a, si->array->obj_size);
+		}
+
+		if (xfarray_sort_terminated(si, &error))
+			return error;
+	}
+
+	/*
+	 * The smallest element is a[tmp]; swap with a[lo] if tmp != lo.
+	 * Recall that a[tmp] is already in *b.
+	 */
+	if (tmp != lo) {
+		error = xfarray_sort_load(si, lo, a);
+		if (error)
+			return error;
+		error = xfarray_sort_store(si, tmp, a);
+		if (error)
+			return error;
+		error = xfarray_sort_store(si, lo, b);
+		if (error)
+			return error;
+	}
+
+	/*
+	 * Perform an insertion sort on a[lo+1..hi].  We already made sure
+	 * that the smallest value in the original range is now in a[lo],
+	 * so the inner loop should never underflow.
+	 *
+	 * For each a[lo+2..hi], make sure it's in the correct position
+	 * with respect to the elements that came before it.
+	 */
+	for (run = lo + 2; run <= hi; run++) {
+		error = xfarray_sort_load(si, run, a);
+		if (error)
+			return error;
+
+		/*
+		 * Find the correct place for a[run] by walking leftwards
+		 * towards the start of the range until a[tmp] is no longer
+		 * greater than a[run].
+		 */
+		tmp = run - 1;
+		error = xfarray_sort_load(si, tmp, b);
+		if (error)
+			return error;
+		while (xfarray_sort_cmp(si, a, b) < 0) {
+			tmp--;
+			error = xfarray_sort_load(si, tmp, b);
+			if (error)
+				return error;
+
+			if (xfarray_sort_terminated(si, &error))
+				return error;
+		}
+		tmp++;
+
+		/*
+		 * If tmp != run, then a[tmp..run-1] are all less than a[run],
+		 * so right barrel roll a[tmp..run] to get this range in
+		 * sorted order.
+		 */
+		if (tmp == run)
+			continue;
+
+		for (i = run; i >= tmp; i--) {
+			error = xfarray_sort_load(si, i - 1, b);
+			if (error)
+				return error;
+			error = xfarray_sort_store(si, i, b);
+			if (error)
+				return error;
+
+			if (xfarray_sort_terminated(si, &error))
+				return error;
+		}
+		error = xfarray_sort_store(si, tmp, a);
+		if (error)
+			return error;
+
+		if (xfarray_sort_terminated(si, &error))
+			return error;
+	}
+
+	return 0;
+}
+
+/* Return a pointer to the xfarray pivot record within the sortinfo struct. */
+static inline void *xfarray_sortinfo_pivot(struct xfarray_sortinfo *si)
+{
+	return xfarray_sortinfo_hi(si) + si->max_stack_depth;
+}
+
+/*
+ * Find a pivot value for quicksort partitioning, swap it with a[lo], and save
+ * the cached pivot record for the next step.
+ *
+ * Select the median value from a[lo], a[mid], and a[hi].  Put the median in
+ * a[lo], the lowest in a[mid], and the highest in a[hi].  Using the median of
+ * the three reduces the chances that we pick the worst case pivot value, since
+ * it's likely that our array values are nearly sorted.
+ */
+STATIC int
+xfarray_qsort_pivot(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		lo,
+	xfarray_idx_t		hi)
+{
+	void			*a = xfarray_sortinfo_pivot(si);
+	void			*b = xfarray_scratch(si->array);
+	xfarray_idx_t		mid = lo + ((hi - lo) / 2);
+	int			error;
+
+	/* if a[mid] < a[lo], swap a[mid] and a[lo]. */
+	error = xfarray_sort_load(si, mid, a);
+	if (error)
+		return error;
+	error = xfarray_sort_load(si, lo, b);
+	if (error)
+		return error;
+	if (xfarray_sort_cmp(si, a, b) < 0) {
+		error = xfarray_sort_store(si, lo, a);
+		if (error)
+			return error;
+		error = xfarray_sort_store(si, mid, b);
+		if (error)
+			return error;
+	}
+
+	/* if a[hi] < a[mid], swap a[mid] and a[hi]. */
+	error = xfarray_sort_load(si, hi, a);
+	if (error)
+		return error;
+	error = xfarray_sort_load(si, mid, b);
+	if (error)
+		return error;
+	if (xfarray_sort_cmp(si, a, b) < 0) {
+		error = xfarray_sort_store(si, mid, a);
+		if (error)
+			return error;
+		error = xfarray_sort_store(si, hi, b);
+		if (error)
+			return error;
+	} else {
+		goto move_front;
+	}
+
+	/* if a[mid] < a[lo], swap a[mid] and a[lo]. */
+	error = xfarray_sort_load(si, mid, a);
+	if (error)
+		return error;
+	error = xfarray_sort_load(si, lo, b);
+	if (error)
+		return error;
+	if (xfarray_sort_cmp(si, a, b) < 0) {
+		error = xfarray_sort_store(si, lo, a);
+		if (error)
+			return error;
+		error = xfarray_sort_store(si, mid, b);
+		if (error)
+			return error;
+	}
+
+move_front:
+	/*
+	 * Move our selected pivot to a[lo].  Recall that a == si->pivot, so
+	 * this leaves us with the pivot cached in the sortinfo structure.
+	 */
+	error = xfarray_sort_load(si, lo, b);
+	if (error)
+		return error;
+	error = xfarray_sort_load(si, mid, a);
+	if (error)
+		return error;
+	error = xfarray_sort_store(si, mid, b);
+	if (error)
+		return error;
+	return xfarray_sort_store(si, lo, a);
+}
+
+/*
+ * Set up the pointers for the next iteration.  We push onto the stack all of
+ * the unsorted values between a[lo + 1] and a[end[i]], and we tweak the
+ * current stack frame to point to the unsorted values between a[beg[i]] and
+ * a[lo] so that those values will be sorted when we pop the stack.
+ */
+static inline int
+xfarray_qsort_push(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		*si_lo,
+	xfarray_idx_t		*si_hi,
+	xfarray_idx_t		lo,
+	xfarray_idx_t		hi)
+{
+	/* Check for stack overflows */
+	if (si->stack_depth >= si->max_stack_depth - 1) {
+		ASSERT(si->stack_depth < si->max_stack_depth - 1);
+		return -EFSCORRUPTED;
+	}
+
+	si->max_stack_used = max_t(uint8_t, si->max_stack_used,
+					    si->stack_depth + 2);
+
+	si_lo[si->stack_depth + 1] = lo + 1;
+	si_hi[si->stack_depth + 1] = si_hi[si->stack_depth];
+	si_hi[si->stack_depth++] = lo - 1;
+
+	/*
+	 * Always start with the smaller of the two partitions to keep the
+	 * amount of recursion in check.
+	 */
+	if (si_hi[si->stack_depth]     - si_lo[si->stack_depth] >
+	    si_hi[si->stack_depth - 1] - si_lo[si->stack_depth - 1]) {
+		swap(si_lo[si->stack_depth], si_lo[si->stack_depth - 1]);
+		swap(si_hi[si->stack_depth], si_hi[si->stack_depth - 1]);
+	}
+
+	return 0;
+}
+
+/*
+ * Sort the array elements via quicksort.  This implementation incorporates
+ * four optimizations discussed in Sedgewick:
+ *
+ * 1. Use an explicit stack of array indices to store the next array partition
+ *    to sort.  This helps us to avoid recursion in the call stack, which is
+ *    particularly expensive in the kernel.
+ *
+ * 2. For arrays with records in arbitrary or user-controlled order, choose the
+ *    pivot element using a median-of-three decision tree.  This reduces the
+ *    probability of selecting a bad pivot value which causes worst case
+ *    behavior (i.e. partition sizes of 1).
+ *
+ * 3. The smaller of the two sub-partitions is pushed onto the stack to start
+ *    the next level of recursion, and the larger sub-partition replaces the
+ *    current stack frame.  This guarantees that we won't need more than
+ *    log2(nr) stack space.
+ *
+ * 4. Use insertion sort for small sets since insertion sort is faster
+ *    for small, mostly sorted array segments.  In the author's experience,
+ *    substituting insertion sort for arrays smaller than 8 elements yields
+ *    a ~10% reduction in runtime.
+ */
+
+/*
+ * Due to the use of signed indices, we can only support up to 2^63 records.
+ * Files can only grow to 2^63 bytes, so this is not much of a limitation.
+ */
+#define QSORT_MAX_RECS		(1ULL << 63)
+
+int
+xfarray_sort(
+	struct xfarray		*array,
+	xfarray_cmp_fn		cmp_fn,
+	unsigned int		flags)
+{
+	struct xfarray_sortinfo	*si;
+	xfarray_idx_t		*si_lo, *si_hi;
+	void			*pivot;
+	void			*scratch = xfarray_scratch(array);
+	xfarray_idx_t		lo, hi;
+	int			error = 0;
+
+	if (array->nr < 2)
+		return 0;
+	if (array->nr >= QSORT_MAX_RECS)
+		return -E2BIG;
+
+	error = xfarray_sortinfo_alloc(array, cmp_fn, flags, &si);
+	if (error)
+		return error;
+	si_lo = xfarray_sortinfo_lo(si);
+	si_hi = xfarray_sortinfo_hi(si);
+	pivot = xfarray_sortinfo_pivot(si);
+
+	while (si->stack_depth >= 0) {
+		lo = si_lo[si->stack_depth];
+		hi = si_hi[si->stack_depth];
+
+		trace_xfarray_qsort(si, lo, hi);
+
+		/* Nothing left in this partition to sort; pop stack. */
+		if (lo >= hi) {
+			si->stack_depth--;
+			continue;
+		}
+
+		/* If insertion sort can solve our problems, we're done. */
+		if (xfarray_want_isort(si, lo, hi)) {
+			error = xfarray_isort(si, lo, hi);
+			if (error)
+				goto out_free;
+			si->stack_depth--;
+			continue;
+		}
+
+		/* Pick a pivot, move it to a[lo] and stash it. */
+		error = xfarray_qsort_pivot(si, lo, hi);
+		if (error)
+			goto out_free;
+
+		/*
+		 * Rearrange a[lo..hi] such that everything smaller than the
+		 * pivot is on the left side of the range and everything larger
+		 * than the pivot is on the right side of the range.
+		 */
+		while (lo < hi) {
+			/*
+			 * Decrement hi until it finds an a[hi] less than the
+			 * pivot value.
+			 */
+			error = xfarray_sort_load(si, hi, scratch);
+			if (error)
+				goto out_free;
+			while (xfarray_sort_cmp(si, scratch, pivot) >= 0 &&
+								lo < hi) {
+				if (xfarray_sort_terminated(si, &error))
+					goto out_free;
+
+				hi--;
+				error = xfarray_sort_load(si, hi, scratch);
+				if (error)
+					goto out_free;
+			}
+
+			if (xfarray_sort_terminated(si, &error))
+				goto out_free;
+
+			/* Copy that item (a[hi]) to a[lo]. */
+			if (lo < hi) {
+				error = xfarray_sort_store(si, lo++, scratch);
+				if (error)
+					goto out_free;
+			}
+
+			/*
+			 * Increment lo until it finds an a[lo] greater than
+			 * the pivot value.
+			 */
+			error = xfarray_sort_load(si, lo, scratch);
+			if (error)
+				goto out_free;
+			while (xfarray_sort_cmp(si, scratch, pivot) <= 0 &&
+								lo < hi) {
+				if (xfarray_sort_terminated(si, &error))
+					goto out_free;
+
+				lo++;
+				error = xfarray_sort_load(si, lo, scratch);
+				if (error)
+					goto out_free;
+			}
+
+			if (xfarray_sort_terminated(si, &error))
+				goto out_free;
+
+			/* Copy that item (a[lo]) to a[hi]. */
+			if (lo < hi) {
+				error = xfarray_sort_store(si, hi--, scratch);
+				if (error)
+					goto out_free;
+			}
+
+			if (xfarray_sort_terminated(si, &error))
+				goto out_free;
+		}
+
+		/*
+		 * Put our pivot value in the correct place at a[lo].  All
+		 * values between a[beg[i]] and a[lo - 1] should be less than
+		 * the pivot; and all values between a[lo + 1] and a[end[i]-1]
+		 * should be greater than the pivot.
+		 */
+		error = xfarray_sort_store(si, lo, pivot);
+		if (error)
+			goto out_free;
+
+		/* Set up the stack frame to process the two partitions. */
+		error = xfarray_qsort_push(si, si_lo, si_hi, lo, hi);
+		if (error)
+			goto out_free;
+
+		if (xfarray_sort_terminated(si, &error))
+			goto out_free;
+	}
+
+out_free:
+	trace_xfarray_sort_stats(si, error);
+	kvfree(si);
+	return error;
+}
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
index 4f815f2c6d89..a3c12b2902bd 100644
--- a/fs/xfs/scrub/xfarray.h
+++ b/fs/xfs/scrub/xfarray.h
@@ -55,4 +55,71 @@ static inline int xfarray_append(struct xfarray *array, const void *ptr)
 uint64_t xfarray_length(struct xfarray *array);
 int xfarray_load_next(struct xfarray *array, xfarray_idx_t *idx, void *rec);
 
+/* Declarations for xfile array sort functionality. */
+
+typedef cmp_func_t xfarray_cmp_fn;
+
+struct xfarray_sortinfo {
+	struct xfarray		*array;
+
+	/* Comparison function for the sort. */
+	xfarray_cmp_fn		cmp_fn;
+
+	/* Maximum height of the partition stack. */
+	uint8_t			max_stack_depth;
+
+	/* Current height of the partition stack. */
+	int8_t			stack_depth;
+
+	/* Maximum stack depth ever used. */
+	uint8_t			max_stack_used;
+
+	/* XFARRAY_SORT_* flags; see below. */
+	unsigned int		flags;
+
+#ifdef DEBUG
+	/* Performance statistics. */
+	uint64_t		loads;
+	uint64_t		stores;
+	uint64_t		compares;
+#endif
+
+	/*
+	 * Extra bytes are allocated beyond the end of the structure to store
+	 * quicksort information.  C does not permit multiple VLAs per struct,
+	 * so we document all of this in a comment.
+	 *
+	 * Pretend that we have a typedef for array records:
+	 *
+	 * typedef char[array->obj_size]	xfarray_rec_t;
+	 *
+	 * First comes the quicksort partition stack:
+	 *
+	 * xfarray_idx_t	lo[max_stack_depth];
+	 * xfarray_idx_t	hi[max_stack_depth];
+	 *
+	 * union {
+	 *
+	 * If for a given subset we decide to use an insertion sort, we use the
+	 * scratchpad record after the xfarray and a second scratchpad record
+	 * here to compare items:
+	 *
+	 * 	xfarray_rec_t	scratch;
+	 *
+	 * Otherwise, we want to partition the records to partition the array.
+	 * We store the chosen pivot record here and use the xfarray scratchpad
+	 * to rearrange the array around the pivot:
+	 *
+	 * 	xfarray_rec_t	pivot;
+	 *
+	 * }
+	 */
+};
+
+/* Sort can be interrupted by a fatal signal. */
+#define XFARRAY_SORT_KILLABLE	(1U << 0)
+
+int xfarray_sort(struct xfarray *array, xfarray_cmp_fn cmp_fn,
+		unsigned int flags);
+
 #endif /* __XFS_SCRUB_XFARRAY_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 3/7] xfs: convert xfarray insertion sort to heapsort using scratchpad memory
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
  2023-05-26  0:47   ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
  2023-05-26  0:47   ` [PATCH 2/7] xfs: enable sorting of xfile-backed arrays Darrick J. Wong
@ 2023-05-26  0:47   ` Darrick J. Wong
  2023-05-26  0:47   ` [PATCH 4/7] xfs: teach xfile to pass back direct-map pages to caller Darrick J. Wong
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:47 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

In the previous patch, we created a very basic quicksort implementation
for xfile arrays.  While the use of an alternate sorting algorithm to
avoid quicksort recursion on very small subsets reduces the runtime
modestly, we could do better than a load- and store-heavy insertion sort,
particularly since each load and store requires a page mapping lookup in
the xfile.

For a small increase in kernel memory requirements, we could instead
bulk load the xfarray records into memory, use the kernel's existing
heapsort implementation to sort the records, and bulk store the memory
buffer back into the xfile.  On the author's computer, this reduces the
runtime by about 5% on a 500,000 element array.
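
The in-memory path boils down to roughly the body of the new
xfarray_isort() (error handling trimmed; xfile_obj_load(),
xfile_obj_store() and the kernel's sort() are the pieces being glued
together here):

	/* copy the whole subset into the sort scratchpad... */
	error = xfile_obj_load(si->array->xfile, scratch, len, lo_pos);
	if (error)
		return error;

	/* ...heapsort it in memory... */
	sort(scratch, hi - lo + 1, si->array->obj_size, si->cmp_fn, NULL);

	/* ...and write the sorted records back to the xfile */
	return xfile_obj_store(si->array->xfile, scratch, len, lo_pos);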

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/scrub/trace.h   |    5 +-
 fs/xfs/scrub/xfarray.c |  142 +++++++++---------------------------------------
 fs/xfs/scrub/xfarray.h |   12 +++-
 3 files changed, 39 insertions(+), 120 deletions(-)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index cdcb5a491b20..11c8b82a174e 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -929,6 +929,7 @@ TRACE_EVENT(xfarray_sort_stats,
 		__field(unsigned long long, loads)
 		__field(unsigned long long, stores)
 		__field(unsigned long long, compares)
+		__field(unsigned long long, heapsorts)
 #endif
 		__field(unsigned int, max_stack_depth)
 		__field(unsigned int, max_stack_used)
@@ -940,6 +941,7 @@ TRACE_EVENT(xfarray_sort_stats,
 		__entry->loads = si->loads;
 		__entry->stores = si->stores;
 		__entry->compares = si->compares;
+		__entry->heapsorts = si->heapsorts;
 #endif
 		__entry->max_stack_depth = si->max_stack_depth;
 		__entry->max_stack_used = si->max_stack_used;
@@ -947,7 +949,7 @@ TRACE_EVENT(xfarray_sort_stats,
 	),
 	TP_printk(
 #ifdef DEBUG
-		  "xfino 0x%lx loads %llu stores %llu compares %llu stack_depth %u/%u error %d",
+		  "xfino 0x%lx loads %llu stores %llu compares %llu heapsorts %llu stack_depth %u/%u error %d",
 #else
 		  "xfino 0x%lx stack_depth %u/%u error %d",
 #endif
@@ -956,6 +958,7 @@ TRACE_EVENT(xfarray_sort_stats,
 		  __entry->loads,
 		  __entry->stores,
 		  __entry->compares,
+		  __entry->heapsorts,
 #endif
 		  __entry->max_stack_used,
 		  __entry->max_stack_depth,
diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
index 0a957431d209..ea995054412c 100644
--- a/fs/xfs/scrub/xfarray.c
+++ b/fs/xfs/scrub/xfarray.c
@@ -375,10 +375,12 @@ xfarray_load_next(
 # define xfarray_sort_bump_loads(si)	do { (si)->loads++; } while (0)
 # define xfarray_sort_bump_stores(si)	do { (si)->stores++; } while (0)
 # define xfarray_sort_bump_compares(si)	do { (si)->compares++; } while (0)
+# define xfarray_sort_bump_heapsorts(si) do { (si)->heapsorts++; } while (0)
 #else
 # define xfarray_sort_bump_loads(si)
 # define xfarray_sort_bump_stores(si)
 # define xfarray_sort_bump_compares(si)
+# define xfarray_sort_bump_heapsorts(si)
 #endif /* DEBUG */
 
 /* Load an array element for sorting. */
@@ -441,15 +443,19 @@ xfarray_sortinfo_alloc(
 	/*
 	 * Tail-call recursion during the partitioning phase means that
 	 * quicksort will never recurse more than log2(nr) times.  We need one
-	 * extra level of stack to hold the initial parameters.
+	 * extra level of stack to hold the initial parameters.  In-memory
+	 * sort will always take care of the last few levels of recursion for
+	 * us, so we can reduce the stack depth by that much.
 	 */
-	max_stack_depth = ilog2(array->nr) + 1;
+	max_stack_depth = ilog2(array->nr) + 1 - (XFARRAY_ISORT_SHIFT - 1);
+	if (max_stack_depth < 1)
+		max_stack_depth = 1;
 
 	/* Each level of quicksort uses a lo and a hi index */
 	nr_bytes += max_stack_depth * sizeof(xfarray_idx_t) * 2;
 
-	/* One record for the pivot */
-	nr_bytes += array->obj_size;
+	/* Scratchpad for in-memory sort, or one record for the pivot */
+	nr_bytes += (XFARRAY_ISORT_NR * array->obj_size);
 
 	si = kvzalloc(nr_bytes, XCHK_GFP_FLAGS);
 	if (!si)
@@ -491,7 +497,7 @@ xfarray_sort_terminated(
 	return false;
 }
 
-/* Do we want an insertion sort? */
+/* Do we want an in-memory sort? */
 static inline bool
 xfarray_want_isort(
 	struct xfarray_sortinfo *si,
@@ -499,10 +505,10 @@ xfarray_want_isort(
 	xfarray_idx_t		end)
 {
 	/*
-	 * For array subsets smaller than 8 elements, it's slightly faster to
-	 * use insertion sort than quicksort's stack machine.
+	 * For array subsets that fit in the scratchpad, it's much faster to
+	 * use the kernel's heapsort than quicksort's stack machine.
 	 */
-	return (end - start) < 8;
+	return (end - start) < XFARRAY_ISORT_NR;
 }
 
 /* Return the scratch space within the sortinfo structure. */
@@ -512,10 +518,8 @@ static inline void *xfarray_sortinfo_isort_scratch(struct xfarray_sortinfo *si)
 }
 
 /*
- * Perform an insertion sort on a subset of the array.
- * Though insertion sort is an O(n^2) algorithm, for small set sizes it's
- * faster than quicksort's stack machine, so we let it take over for that.
- * This ought to be replaced with something more efficient.
+ * Sort a small number of array records using scratchpad memory.  The records
+ * need not be contiguous in the xfile's memory pages.
  */
 STATIC int
 xfarray_isort(
@@ -523,114 +527,23 @@ xfarray_isort(
 	xfarray_idx_t		lo,
 	xfarray_idx_t		hi)
 {
-	void			*a = xfarray_sortinfo_isort_scratch(si);
-	void			*b = xfarray_scratch(si->array);
-	xfarray_idx_t		tmp;
-	xfarray_idx_t		i;
-	xfarray_idx_t		run;
+	void			*scratch = xfarray_sortinfo_isort_scratch(si);
+	loff_t			lo_pos = xfarray_pos(si->array, lo);
+	loff_t			len = xfarray_pos(si->array, hi - lo + 1);
 	int			error;
 
 	trace_xfarray_isort(si, lo, hi);
 
-	/*
-	 * Move the smallest element in a[lo..hi] to a[lo].  This
-	 * simplifies the loop control logic below.
-	 */
-	tmp = lo;
-	error = xfarray_sort_load(si, tmp, b);
+	xfarray_sort_bump_loads(si);
+	error = xfile_obj_load(si->array->xfile, scratch, len, lo_pos);
 	if (error)
 		return error;
-	for (run = lo + 1; run <= hi; run++) {
-		/* if a[run] < a[tmp], tmp = run */
-		error = xfarray_sort_load(si, run, a);
-		if (error)
-			return error;
-		if (xfarray_sort_cmp(si, a, b) < 0) {
-			tmp = run;
-			memcpy(b, a, si->array->obj_size);
-		}
 
-		if (xfarray_sort_terminated(si, &error))
-			return error;
-	}
+	xfarray_sort_bump_heapsorts(si);
+	sort(scratch, hi - lo + 1, si->array->obj_size, si->cmp_fn, NULL);
 
-	/*
-	 * The smallest element is a[tmp]; swap with a[lo] if tmp != lo.
-	 * Recall that a[tmp] is already in *b.
-	 */
-	if (tmp != lo) {
-		error = xfarray_sort_load(si, lo, a);
-		if (error)
-			return error;
-		error = xfarray_sort_store(si, tmp, a);
-		if (error)
-			return error;
-		error = xfarray_sort_store(si, lo, b);
-		if (error)
-			return error;
-	}
-
-	/*
-	 * Perform an insertion sort on a[lo+1..hi].  We already made sure
-	 * that the smallest value in the original range is now in a[lo],
-	 * so the inner loop should never underflow.
-	 *
-	 * For each a[lo+2..hi], make sure it's in the correct position
-	 * with respect to the elements that came before it.
-	 */
-	for (run = lo + 2; run <= hi; run++) {
-		error = xfarray_sort_load(si, run, a);
-		if (error)
-			return error;
-
-		/*
-		 * Find the correct place for a[run] by walking leftwards
-		 * towards the start of the range until a[tmp] is no longer
-		 * greater than a[run].
-		 */
-		tmp = run - 1;
-		error = xfarray_sort_load(si, tmp, b);
-		if (error)
-			return error;
-		while (xfarray_sort_cmp(si, a, b) < 0) {
-			tmp--;
-			error = xfarray_sort_load(si, tmp, b);
-			if (error)
-				return error;
-
-			if (xfarray_sort_terminated(si, &error))
-				return error;
-		}
-		tmp++;
-
-		/*
-		 * If tmp != run, then a[tmp..run-1] are all less than a[run],
-		 * so right barrel roll a[tmp..run] to get this range in
-		 * sorted order.
-		 */
-		if (tmp == run)
-			continue;
-
-		for (i = run; i >= tmp; i--) {
-			error = xfarray_sort_load(si, i - 1, b);
-			if (error)
-				return error;
-			error = xfarray_sort_store(si, i, b);
-			if (error)
-				return error;
-
-			if (xfarray_sort_terminated(si, &error))
-				return error;
-		}
-		error = xfarray_sort_store(si, tmp, a);
-		if (error)
-			return error;
-
-		if (xfarray_sort_terminated(si, &error))
-			return error;
-	}
-
-	return 0;
+	xfarray_sort_bump_stores(si);
+	return xfile_obj_store(si->array->xfile, scratch, len, lo_pos);
 }
 
 /* Return a pointer to the xfarray pivot record within the sortinfo struct. */
@@ -784,9 +697,8 @@ xfarray_qsort_push(
  *    current stack frame.  This guarantees that we won't need more than
  *    log2(nr) stack space.
  *
- * 4. Use insertion sort for small sets since insertion sort is faster
- *    for small, mostly sorted array segments.  In the author's experience,
- *    substituting insertion sort for arrays smaller than 8 elements yields
+ * 4. For small sets, load the records into the scratchpad and run heapsort on
+ *    them because that is very fast.  In the author's experience, this yields
  *    a ~10% reduction in runtime.
  */
 
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
index a3c12b2902bd..a58ce3294ded 100644
--- a/fs/xfs/scrub/xfarray.h
+++ b/fs/xfs/scrub/xfarray.h
@@ -59,6 +59,10 @@ int xfarray_load_next(struct xfarray *array, xfarray_idx_t *idx, void *rec);
 
 typedef cmp_func_t xfarray_cmp_fn;
 
+/* Perform an in-memory heapsort for small subsets. */
+#define XFARRAY_ISORT_SHIFT		(4)
+#define XFARRAY_ISORT_NR		(1U << XFARRAY_ISORT_SHIFT)
+
 struct xfarray_sortinfo {
 	struct xfarray		*array;
 
@@ -82,6 +86,7 @@ struct xfarray_sortinfo {
 	uint64_t		loads;
 	uint64_t		stores;
 	uint64_t		compares;
+	uint64_t		heapsorts;
 #endif
 
 	/*
@@ -100,11 +105,10 @@ struct xfarray_sortinfo {
 	 *
 	 * union {
 	 *
-	 * If for a given subset we decide to use an insertion sort, we use the
-	 * scratchpad record after the xfarray and a second scratchpad record
-	 * here to compare items:
+	 * If for a given subset we decide to use an in-memory sort, we use a
+	 * block of scratchpad records here to compare items:
 	 *
-	 * 	xfarray_rec_t	scratch;
+	 * 	xfarray_rec_t	scratch[ISORT_NR];
 	 *
 	 * Otherwise, we want to partition the records to partition the array.
 	 * We store the chosen pivot record here and use the xfarray scratchpad


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 4/7] xfs: teach xfile to pass back direct-map pages to caller
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
                     ` (2 preceding siblings ...)
  2023-05-26  0:47   ` [PATCH 3/7] xfs: convert xfarray insertion sort to heapsort using scratchpad memory Darrick J. Wong
@ 2023-05-26  0:47   ` Darrick J. Wong
  2023-05-26  0:48   ` [PATCH 5/7] xfs: speed up xfarray sort by sorting xfile page contents directly Darrick J. Wong
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:47 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Certain xfile array operations (such as sorting) can be sped up quite a
bit by allowing xfile users to grab a page to bulk-read the records
contained within it.  Create helper methods to facilitate this.
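
A rough sketch of the calling pattern these helpers enable (error
handling trimmed; xf, pos, and len stand in for whatever object the
caller is working on):

	struct xfile_page	xfpage;
	void			*kaddr;
	int			error;

	/* the object must not cross a page boundary or we get -ENOTBLK */
	error = xfile_get_page(xf, pos, len, &xfpage);
	if (error)
		return error;

	kaddr = kmap_local_page(xfpage.page);
	/* ...read or rewrite the records directly in the page... */
	kunmap_local(kaddr);

	/* unlock the page and push the contents back to the xfile */
	error = xfile_put_page(xf, &xfpage);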

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/scrub/trace.h |    2 +
 fs/xfs/scrub/xfile.c |  108 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfile.h |   10 +++++
 3 files changed, 120 insertions(+)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 11c8b82a174e..cf210681d028 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -823,6 +823,8 @@ DEFINE_EVENT(xfile_class, name, \
 DEFINE_XFILE_EVENT(xfile_pread);
 DEFINE_XFILE_EVENT(xfile_pwrite);
 DEFINE_XFILE_EVENT(xfile_seek_data);
+DEFINE_XFILE_EVENT(xfile_get_page);
+DEFINE_XFILE_EVENT(xfile_put_page);
 
 TRACE_EVENT(xfarray_create,
 	TP_PROTO(struct xfarray *xfa, unsigned long long required_capacity),
diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
index e1125a3e39eb..d3e678cd4a2f 100644
--- a/fs/xfs/scrub/xfile.c
+++ b/fs/xfs/scrub/xfile.c
@@ -323,3 +323,111 @@ xfile_stat(
 	statbuf->bytes = ks.blocks << SECTOR_SHIFT;
 	return 0;
 }
+
+/*
+ * Grab the (locked) page for a memory object.  The object cannot span a page
+ * boundary.  Returns 0 (and a locked page) if successful, -ENOTBLK if we
+ * cannot grab the page, or the usual negative errno.
+ */
+int
+xfile_get_page(
+	struct xfile		*xf,
+	loff_t			pos,
+	unsigned int		len,
+	struct xfile_page	*xfpage)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	struct page		*page = NULL;
+	void			*fsdata = NULL;
+	loff_t			key = round_down(pos, PAGE_SIZE);
+	unsigned int		pflags;
+	int			error;
+
+	if (inode->i_sb->s_maxbytes - pos < len)
+		return -ENOMEM;
+	if (len > PAGE_SIZE - offset_in_page(pos))
+		return -ENOTBLK;
+
+	trace_xfile_get_page(xf, pos, len);
+
+	pflags = memalloc_nofs_save();
+
+	/*
+	 * We call write_begin directly here to avoid all the freezer
+	 * protection lock-taking that happens in the normal path.  shmem
+	 * doesn't support fs freeze, but lockdep doesn't know that and will
+	 * trip over that.
+	 */
+	error = aops->write_begin(NULL, mapping, key, PAGE_SIZE, &page,
+			&fsdata);
+	if (error)
+		goto out_pflags;
+
+	/* We got the page, so make sure we push out EOF. */
+	if (i_size_read(inode) < pos + len)
+		i_size_write(inode, pos + len);
+
+	/*
+	 * If the page isn't up to date, fill it with zeroes before we hand it
+	 * to the caller and make sure the backing store will hold on to them.
+	 */
+	if (!PageUptodate(page)) {
+		void	*kaddr;
+
+		kaddr = kmap_local_page(page);
+		memset(kaddr, 0, PAGE_SIZE);
+		kunmap_local(kaddr);
+		SetPageUptodate(page);
+	}
+
+	/*
+	 * Mark each page dirty so that the contents are written to some
+	 * backing store when we drop this buffer, and take an extra reference
+	 * to prevent the xfile page from being swapped or removed from the
+	 * page cache by reclaim if the caller unlocks the page.
+	 */
+	set_page_dirty(page);
+	get_page(page);
+
+	xfpage->page = page;
+	xfpage->fsdata = fsdata;
+	xfpage->pos = key;
+out_pflags:
+	memalloc_nofs_restore(pflags);
+	return error;
+}
+
+/*
+ * Release the (locked) page for a memory object.  Returns 0 or a negative
+ * errno.
+ */
+int
+xfile_put_page(
+	struct xfile		*xf,
+	struct xfile_page	*xfpage)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	unsigned int		pflags;
+	int			ret;
+
+	trace_xfile_put_page(xf, xfpage->pos, PAGE_SIZE);
+
+	/* Give back the reference that we took in xfile_get_page. */
+	put_page(xfpage->page);
+
+	pflags = memalloc_nofs_save();
+	ret = aops->write_end(NULL, mapping, xfpage->pos, PAGE_SIZE, PAGE_SIZE,
+			xfpage->page, xfpage->fsdata);
+	memalloc_nofs_restore(pflags);
+	memset(xfpage, 0, sizeof(struct xfile_page));
+
+	if (ret < 0)
+		return ret;
+	if (ret != PAGE_SIZE)
+		return -EIO;
+	return 0;
+}
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
index f91c90efd04a..d08a202f3882 100644
--- a/fs/xfs/scrub/xfile.h
+++ b/fs/xfs/scrub/xfile.h
@@ -6,6 +6,12 @@
 #ifndef __XFS_SCRUB_XFILE_H__
 #define __XFS_SCRUB_XFILE_H__
 
+struct xfile_page {
+	struct page		*page;
+	void			*fsdata;
+	loff_t			pos;
+};
+
 struct xfile {
 	struct file		*file;
 };
@@ -55,4 +61,8 @@ struct xfile_stat {
 
 int xfile_stat(struct xfile *xf, struct xfile_stat *statbuf);
 
+int xfile_get_page(struct xfile *xf, loff_t offset, unsigned int len,
+		struct xfile_page *xbuf);
+int xfile_put_page(struct xfile *xf, struct xfile_page *xbuf);
+
 #endif /* __XFS_SCRUB_XFILE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 5/7] xfs: speed up xfarray sort by sorting xfile page contents directly
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
                     ` (3 preceding siblings ...)
  2023-05-26  0:47   ` [PATCH 4/7] xfs: teach xfile to pass back direct-map pages to caller Darrick J. Wong
@ 2023-05-26  0:48   ` Darrick J. Wong
  2023-05-26  0:48   ` [PATCH 6/7] xfs: cache pages used for xfarray quicksort convergence Darrick J. Wong
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:48 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

If all the records in an xfarray subset live within the same memory
page, we can short-circuit even more quicksort recursion by mapping that
page into the local CPU and using the kernel's heapsort function to sort
the subset.  On the author's computer, this reduces the runtime by
another 15% on a 500,000 element array.
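
A subset qualifies only if its first and last bytes land on the same
xfile page, which is what the new xfarray_want_pagesort() helper checks;
roughly:

	lo_page = xfarray_pos(array, lo) >> PAGE_SHIFT;
	hi_page = (xfarray_pos(array, hi) + array->obj_size - 1) >> PAGE_SHIFT;
	if (lo_page == hi_page) {
		/* kmap the page and heapsort the records in place */
	}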

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/scrub/trace.h   |   20 ++++++++++
 fs/xfs/scrub/xfarray.c |   97 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfarray.h |    4 ++
 3 files changed, 121 insertions(+)


diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index cf210681d028..faefcc37fff4 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -871,6 +871,26 @@ TRACE_EVENT(xfarray_isort,
 		  __entry->hi - __entry->lo)
 );
 
+TRACE_EVENT(xfarray_pagesort,
+	TP_PROTO(struct xfarray_sortinfo *si, uint64_t lo, uint64_t hi),
+	TP_ARGS(si, lo, hi),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, lo)
+		__field(unsigned long long, hi)
+	),
+	TP_fast_assign(
+		__entry->ino = file_inode(si->array->xfile->file)->i_ino;
+		__entry->lo = lo;
+		__entry->hi = hi;
+	),
+	TP_printk("xfino 0x%lx lo %llu hi %llu elts %llu",
+		  __entry->ino,
+		  __entry->lo,
+		  __entry->hi,
+		  __entry->hi - __entry->lo)
+);
+
 TRACE_EVENT(xfarray_qsort,
 	TP_PROTO(struct xfarray_sortinfo *si, uint64_t lo, uint64_t hi),
 	TP_ARGS(si, lo, hi),
diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
index ea995054412c..df042fa016e8 100644
--- a/fs/xfs/scrub/xfarray.c
+++ b/fs/xfs/scrub/xfarray.c
@@ -546,6 +546,87 @@ xfarray_isort(
 	return xfile_obj_store(si->array->xfile, scratch, len, lo_pos);
 }
 
+/* Grab a page for sorting records. */
+static inline int
+xfarray_sort_get_page(
+	struct xfarray_sortinfo	*si,
+	loff_t			pos,
+	uint64_t		len)
+{
+	int			error;
+
+	error = xfile_get_page(si->array->xfile, pos, len, &si->xfpage);
+	if (error)
+		return error;
+
+	/*
+	 * xfile pages must never be mapped into userspace, so we skip the
+	 * dcache flush when mapping the page.
+	 */
+	si->page_kaddr = kmap_local_page(si->xfpage.page);
+	return 0;
+}
+
+/* Release a page we grabbed for sorting records. */
+static inline int
+xfarray_sort_put_page(
+	struct xfarray_sortinfo	*si)
+{
+	if (!si->page_kaddr)
+		return 0;
+
+	kunmap_local(si->page_kaddr);
+	si->page_kaddr = NULL;
+
+	return xfile_put_page(si->array->xfile, &si->xfpage);
+}
+
+/* Decide if these records are eligible for in-page sorting. */
+static inline bool
+xfarray_want_pagesort(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		lo,
+	xfarray_idx_t		hi)
+{
+	pgoff_t			lo_page;
+	pgoff_t			hi_page;
+	loff_t			end_pos;
+
+	/* We can only map one page at a time. */
+	lo_page = xfarray_pos(si->array, lo) >> PAGE_SHIFT;
+	end_pos = xfarray_pos(si->array, hi) + si->array->obj_size - 1;
+	hi_page = end_pos >> PAGE_SHIFT;
+
+	return lo_page == hi_page;
+}
+
+/* Sort a bunch of records that all live in the same memory page. */
+STATIC int
+xfarray_pagesort(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		lo,
+	xfarray_idx_t		hi)
+{
+	void			*startp;
+	loff_t			lo_pos = xfarray_pos(si->array, lo);
+	uint64_t		len = xfarray_pos(si->array, hi - lo);
+	int			error = 0;
+
+	trace_xfarray_pagesort(si, lo, hi);
+
+	xfarray_sort_bump_loads(si);
+	error = xfarray_sort_get_page(si, lo_pos, len);
+	if (error)
+		return error;
+
+	xfarray_sort_bump_heapsorts(si);
+	startp = si->page_kaddr + offset_in_page(lo_pos);
+	sort(startp, hi - lo + 1, si->array->obj_size, si->cmp_fn, NULL);
+
+	xfarray_sort_bump_stores(si);
+	return xfarray_sort_put_page(si);
+}
+
 /* Return a pointer to the xfarray pivot record within the sortinfo struct. */
 static inline void *xfarray_sortinfo_pivot(struct xfarray_sortinfo *si)
 {
@@ -700,6 +781,10 @@ xfarray_qsort_push(
  * 4. For small sets, load the records into the scratchpad and run heapsort on
  *    them because that is very fast.  In the author's experience, this yields
  *    a ~10% reduction in runtime.
+ *
+ *    If a small set is contained entirely within a single xfile memory page,
+ *    map the page directly and run heap sort directly on the xfile page
+ *    instead of using the load/store interface.  This halves the runtime.
  */
 
 /*
@@ -745,6 +830,18 @@ xfarray_sort(
 			continue;
 		}
 
+		/*
+		 * If directly mapping the page and sorting can solve our
+		 * problems, we're done.
+		 */
+		if (xfarray_want_pagesort(si, lo, hi)) {
+			error = xfarray_pagesort(si, lo, hi);
+			if (error)
+				goto out_free;
+			si->stack_depth--;
+			continue;
+		}
+
 		/* If insertion sort can solve our problems, we're done. */
 		if (xfarray_want_isort(si, lo, hi)) {
 			error = xfarray_isort(si, lo, hi);
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
index a58ce3294ded..e7e88350a80e 100644
--- a/fs/xfs/scrub/xfarray.h
+++ b/fs/xfs/scrub/xfarray.h
@@ -81,6 +81,10 @@ struct xfarray_sortinfo {
 	/* XFARRAY_SORT_* flags; see below. */
 	unsigned int		flags;
 
+	/* Cache a page here for faster access. */
+	struct xfile_page	xfpage;
+	void			*page_kaddr;
+
 #ifdef DEBUG
 	/* Performance statistics. */
 	uint64_t		loads;


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 6/7] xfs: cache pages used for xfarray quicksort convergence
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
                     ` (4 preceding siblings ...)
  2023-05-26  0:48   ` [PATCH 5/7] xfs: speed up xfarray sort by sorting xfile page contents directly Darrick J. Wong
@ 2023-05-26  0:48   ` Darrick J. Wong
  2023-05-26  0:48   ` [PATCH 7/7] xfs: improve xfarray quicksort pivot Darrick J. Wong
  2023-06-22  2:58   ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Dave Chinner
  7 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:48 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

After quicksort picks a pivot item for a particular subsort, it walks
the records in that subset from the outside in, rearranging them so that
every record less than the pivot comes before it, and every record
greater than the pivot comes after it.  This scan has a lot of locality,
so we can speed it up quite a bit by grabbing the xfile backing page and
holding onto it as long as we possibly can.  Doing so reduces the
runtime by another 5% on the author's computer.
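
The cached load is roughly the following (simplified from the new
xfarray_sort_load_cached(); records that straddle a page boundary fall
back to a normal xfile_obj_load(), and error handling is trimmed):

	/* if the cached page doesn't cover this record, trade it in */
	if (!xfile_page_cached(&si->xfpage) ||
	    xfile_page_index(&si->xfpage) != idx_pos >> PAGE_SHIFT) {
		error = xfarray_sort_put_page(si);
		if (error)
			return error;
		error = xfarray_sort_get_page(si,
				round_down(idx_pos, PAGE_SIZE), PAGE_SIZE);
		if (error)
			return error;
	}

	/* copy the record straight out of the mapped page */
	memcpy(ptr, si->page_kaddr + offset_in_page(idx_pos),
			si->array->obj_size);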

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/scrub/xfarray.c |   86 ++++++++++++++++++++++++++++++++++++++++++------
 fs/xfs/scrub/xfile.h   |   10 ++++++
 2 files changed, 86 insertions(+), 10 deletions(-)


diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
index df042fa016e8..443ef9be151b 100644
--- a/fs/xfs/scrub/xfarray.c
+++ b/fs/xfs/scrub/xfarray.c
@@ -760,6 +760,66 @@ xfarray_qsort_push(
 	return 0;
 }
 
+/*
+ * Load an element from the array into the first scratchpad and cache the page,
+ * if possible.
+ */
+static inline int
+xfarray_sort_load_cached(
+	struct xfarray_sortinfo	*si,
+	xfarray_idx_t		idx,
+	void			*ptr)
+{
+	loff_t			idx_pos = xfarray_pos(si->array, idx);
+	pgoff_t			startpage;
+	pgoff_t			endpage;
+	int			error = 0;
+
+	/*
+	 * If this load would split a page, release the cached page, if any,
+	 * and perform a traditional read.
+	 */
+	startpage = idx_pos >> PAGE_SHIFT;
+	endpage = (idx_pos + si->array->obj_size - 1) >> PAGE_SHIFT;
+	if (startpage != endpage) {
+		error = xfarray_sort_put_page(si);
+		if (error)
+			return error;
+
+		if (xfarray_sort_terminated(si, &error))
+			return error;
+
+		return xfile_obj_load(si->array->xfile, ptr,
+				si->array->obj_size, idx_pos);
+	}
+
+	/* If the cached page is not the one we want, release it. */
+	if (xfile_page_cached(&si->xfpage) &&
+	    xfile_page_index(&si->xfpage) != startpage) {
+		error = xfarray_sort_put_page(si);
+		if (error)
+			return error;
+	}
+
+	/*
+	 * If we don't have a cached page (and we know the load is contained
+	 * in a single page) then grab it.
+	 */
+	if (!xfile_page_cached(&si->xfpage)) {
+		if (xfarray_sort_terminated(si, &error))
+			return error;
+
+		error = xfarray_sort_get_page(si, startpage << PAGE_SHIFT,
+				PAGE_SIZE);
+		if (error)
+			return error;
+	}
+
+	memcpy(ptr, si->page_kaddr + offset_in_page(idx_pos),
+			si->array->obj_size);
+	return 0;
+}
+
 /*
  * Sort the array elements via quicksort.  This implementation incorporates
  * four optimizations discussed in Sedgewick:
@@ -785,6 +845,10 @@ xfarray_qsort_push(
  *    If a small set is contained entirely within a single xfile memory page,
  *    map the page directly and run heap sort directly on the xfile page
  *    instead of using the load/store interface.  This halves the runtime.
+ *
+ * 5. This optimization is specific to the implementation.  When converging lo
+ *    and hi after selecting a pivot, we will try to retain the xfile memory
+ *    page between load calls, which reduces run time by 50%.
  */
 
 /*
@@ -866,19 +930,20 @@ xfarray_sort(
 			 * Decrement hi until it finds an a[hi] less than the
 			 * pivot value.
 			 */
-			error = xfarray_sort_load(si, hi, scratch);
+			error = xfarray_sort_load_cached(si, hi, scratch);
 			if (error)
 				goto out_free;
 			while (xfarray_sort_cmp(si, scratch, pivot) >= 0 &&
 								lo < hi) {
-				if (xfarray_sort_terminated(si, &error))
-					goto out_free;
-
 				hi--;
-				error = xfarray_sort_load(si, hi, scratch);
+				error = xfarray_sort_load_cached(si, hi,
+						scratch);
 				if (error)
 					goto out_free;
 			}
+			error = xfarray_sort_put_page(si);
+			if (error)
+				goto out_free;
 
 			if (xfarray_sort_terminated(si, &error))
 				goto out_free;
@@ -894,19 +959,20 @@ xfarray_sort(
 			 * Increment lo until it finds an a[lo] greater than
 			 * the pivot value.
 			 */
-			error = xfarray_sort_load(si, lo, scratch);
+			error = xfarray_sort_load_cached(si, lo, scratch);
 			if (error)
 				goto out_free;
 			while (xfarray_sort_cmp(si, scratch, pivot) <= 0 &&
 								lo < hi) {
-				if (xfarray_sort_terminated(si, &error))
-					goto out_free;
-
 				lo++;
-				error = xfarray_sort_load(si, lo, scratch);
+				error = xfarray_sort_load_cached(si, lo,
+						scratch);
 				if (error)
 					goto out_free;
 			}
+			error = xfarray_sort_put_page(si);
+			if (error)
+				goto out_free;
 
 			if (xfarray_sort_terminated(si, &error))
 				goto out_free;
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
index d08a202f3882..1aae2cd91720 100644
--- a/fs/xfs/scrub/xfile.h
+++ b/fs/xfs/scrub/xfile.h
@@ -12,6 +12,16 @@ struct xfile_page {
 	loff_t			pos;
 };
 
+static inline bool xfile_page_cached(const struct xfile_page *xfpage)
+{
+	return xfpage->page != NULL;
+}
+
+static inline pgoff_t xfile_page_index(const struct xfile_page *xfpage)
+{
+	return xfpage->page->index;
+}
+
 struct xfile {
 	struct file		*file;
 };


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 7/7] xfs: improve xfarray quicksort pivot
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
                     ` (5 preceding siblings ...)
  2023-05-26  0:48   ` [PATCH 6/7] xfs: cache pages used for xfarray quicksort convergence Darrick J. Wong
@ 2023-05-26  0:48   ` Darrick J. Wong
  2023-06-22  2:58   ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Dave Chinner
  7 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  0:48 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Now that we have the means to do in-memory sorts of small subsets of an
xfarray, use them to improve the quicksort pivot algorithm by reading
nine records into memory and finding the median of that sample.  This
should prevent bad partitioning when a[lo] and a[hi] end up next to each
other in the final sort, which can happen when sorting for cntbt repair
when the free space is extremely fragmented (e.g. generic/176).

This doesn't speed up the average quicksort run by much, but it will
(hopefully) avoid the quadratic time collapse for which quicksort is
famous.
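
The sampling step is roughly the following (simplified; sample_idx
stands in for the pivot array that this patch builds inside the sortinfo
structure, and the unset-record case is handled in the real code):

	step = (hi - lo) / (XFARRAY_QSORT_PIVOT_NR - 1);
	for (i = 0; i < XFARRAY_QSORT_PIVOT_NR - 1; i++)
		sample_idx[i] = lo + i * step;
	sample_idx[XFARRAY_QSORT_PIVOT_NR - 1] = hi;

	/*
	 * Load the sampled records, heapsort them in memory, and use the
	 * middle record of the sorted sample as the quicksort pivot.
	 */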

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/xfs/scrub/xfarray.c |  198 ++++++++++++++++++++++++++++++++----------------
 fs/xfs/scrub/xfarray.h |   19 +++--
 2 files changed, 148 insertions(+), 69 deletions(-)


diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
index 443ef9be151b..54e9c61d36f6 100644
--- a/fs/xfs/scrub/xfarray.c
+++ b/fs/xfs/scrub/xfarray.c
@@ -428,6 +428,14 @@ static inline xfarray_idx_t *xfarray_sortinfo_hi(struct xfarray_sortinfo *si)
 	return xfarray_sortinfo_lo(si) + si->max_stack_depth;
 }
 
+/* Size of each element in the quicksort pivot array. */
+static inline size_t
+xfarray_pivot_rec_sz(
+	struct xfarray		*array)
+{
+	return round_up(array->obj_size, 8) + sizeof(xfarray_idx_t);
+}
+
 /* Allocate memory to handle the sort. */
 static inline int
 xfarray_sortinfo_alloc(
@@ -438,8 +446,16 @@ xfarray_sortinfo_alloc(
 {
 	struct xfarray_sortinfo	*si;
 	size_t			nr_bytes = sizeof(struct xfarray_sortinfo);
+	size_t			pivot_rec_sz = xfarray_pivot_rec_sz(array);
 	int			max_stack_depth;
 
+	/*
+	 * The median-of-nine pivot algorithm doesn't work if a subset has
+	 * fewer than 9 items.  Make sure the in-memory sort will always take
+	 * over for subsets where this wouldn't be the case.
+	 */
+	BUILD_BUG_ON(XFARRAY_QSORT_PIVOT_NR >= XFARRAY_ISORT_NR);
+
 	/*
 	 * Tail-call recursion during the partitioning phase means that
 	 * quicksort will never recurse more than log2(nr) times.  We need one
@@ -454,8 +470,10 @@ xfarray_sortinfo_alloc(
 	/* Each level of quicksort uses a lo and a hi index */
 	nr_bytes += max_stack_depth * sizeof(xfarray_idx_t) * 2;
 
-	/* Scratchpad for in-memory sort, or one record for the pivot */
-	nr_bytes += (XFARRAY_ISORT_NR * array->obj_size);
+	/* Scratchpad for in-memory sort, or finding the pivot */
+	nr_bytes += max_t(size_t,
+			(XFARRAY_QSORT_PIVOT_NR + 1) * pivot_rec_sz,
+			XFARRAY_ISORT_NR * array->obj_size);
 
 	si = kvzalloc(nr_bytes, XCHK_GFP_FLAGS);
 	if (!si)
@@ -633,14 +651,43 @@ static inline void *xfarray_sortinfo_pivot(struct xfarray_sortinfo *si)
 	return xfarray_sortinfo_hi(si) + si->max_stack_depth;
 }
 
+/* Return a pointer to the start of the pivot array. */
+static inline void *
+xfarray_sortinfo_pivot_array(
+	struct xfarray_sortinfo	*si)
+{
+	return xfarray_sortinfo_pivot(si) + si->array->obj_size;
+}
+
+/* The xfarray record is stored at the start of each pivot array element. */
+static inline void *
+xfarray_pivot_array_rec(
+	void			*pa,
+	size_t			pa_recsz,
+	unsigned int		pa_idx)
+{
+	return pa + (pa_recsz * pa_idx);
+}
+
+/* The xfarray index is stored at the end of each pivot array element. */
+static inline xfarray_idx_t *
+xfarray_pivot_array_idx(
+	void			*pa,
+	size_t			pa_recsz,
+	unsigned int		pa_idx)
+{
+	return xfarray_pivot_array_rec(pa, pa_recsz, pa_idx + 1) -
+			sizeof(xfarray_idx_t);
+}
+
 /*
  * Find a pivot value for quicksort partitioning, swap it with a[lo], and save
  * the cached pivot record for the next step.
  *
- * Select the median value from a[lo], a[mid], and a[hi].  Put the median in
- * a[lo], the lowest in a[mid], and the highest in a[hi].  Using the median of
- * the three reduces the chances that we pick the worst case pivot value, since
- * it's likely that our array values are nearly sorted.
+ * Load evenly-spaced records within the given range into memory, sort them,
+ * and choose the pivot from the median record.  Using multiple points will
+ * improve the quality of the pivot selection, and hopefully avoid the worst
+ * quicksort behavior, since our array values are nearly always evenly sorted.
  */
 STATIC int
 xfarray_qsort_pivot(
@@ -648,76 +695,99 @@ xfarray_qsort_pivot(
 	xfarray_idx_t		lo,
 	xfarray_idx_t		hi)
 {
-	void			*a = xfarray_sortinfo_pivot(si);
-	void			*b = xfarray_scratch(si->array);
-	xfarray_idx_t		mid = lo + ((hi - lo) / 2);
+	void			*pivot = xfarray_sortinfo_pivot(si);
+	void			*parray = xfarray_sortinfo_pivot_array(si);
+	void			*recp;
+	xfarray_idx_t		*idxp;
+	xfarray_idx_t		step = (hi - lo) / (XFARRAY_QSORT_PIVOT_NR - 1);
+	size_t			pivot_rec_sz = xfarray_pivot_rec_sz(si->array);
+	int			i, j;
 	int			error;
 
-	/* if a[mid] < a[lo], swap a[mid] and a[lo]. */
-	error = xfarray_sort_load(si, mid, a);
-	if (error)
-		return error;
-	error = xfarray_sort_load(si, lo, b);
-	if (error)
-		return error;
-	if (xfarray_sort_cmp(si, a, b) < 0) {
-		error = xfarray_sort_store(si, lo, a);
-		if (error)
-			return error;
-		error = xfarray_sort_store(si, mid, b);
-		if (error)
-			return error;
-	}
+	ASSERT(step > 0);
 
-	/* if a[hi] < a[mid], swap a[mid] and a[hi]. */
-	error = xfarray_sort_load(si, hi, a);
-	if (error)
-		return error;
-	error = xfarray_sort_load(si, mid, b);
-	if (error)
-		return error;
-	if (xfarray_sort_cmp(si, a, b) < 0) {
-		error = xfarray_sort_store(si, mid, a);
-		if (error)
-			return error;
-		error = xfarray_sort_store(si, hi, b);
-		if (error)
-			return error;
-	} else {
-		goto move_front;
+	/*
+	 * Load the xfarray indexes of the records we intend to sample into the
+	 * pivot array.
+	 */
+	idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz, 0);
+	*idxp = lo;
+	for (i = 1; i < XFARRAY_QSORT_PIVOT_NR - 1; i++) {
+		idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz, i);
+		*idxp = lo + (i * step);
 	}
+	idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz,
+			XFARRAY_QSORT_PIVOT_NR - 1);
+	*idxp = hi;
 
-	/* if a[mid] < a[lo], swap a[mid] and a[lo]. */
-	error = xfarray_sort_load(si, mid, a);
-	if (error)
-		return error;
-	error = xfarray_sort_load(si, lo, b);
-	if (error)
-		return error;
-	if (xfarray_sort_cmp(si, a, b) < 0) {
-		error = xfarray_sort_store(si, lo, a);
-		if (error)
-			return error;
-		error = xfarray_sort_store(si, mid, b);
+	/* Load the selected xfarray records into the pivot array. */
+	for (i = 0; i < XFARRAY_QSORT_PIVOT_NR; i++) {
+		xfarray_idx_t	idx;
+
+		recp = xfarray_pivot_array_rec(parray, pivot_rec_sz, i);
+		idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz, i);
+
+		/* No unset records; load directly into the array. */
+		if (likely(si->array->unset_slots == 0)) {
+			error = xfarray_sort_load(si, *idxp, recp);
+			if (error)
+				return error;
+			continue;
+		}
+
+		/*
+		 * Load non-null records into the scratchpad without changing
+		 * the xfarray_idx_t in the pivot array.
+		 */
+		idx = *idxp;
+		xfarray_sort_bump_loads(si);
+		error = xfarray_load_next(si->array, &idx, recp);
 		if (error)
 			return error;
 	}
 
-move_front:
+	xfarray_sort_bump_heapsorts(si);
+	sort(parray, XFARRAY_QSORT_PIVOT_NR, pivot_rec_sz, si->cmp_fn, NULL);
+
 	/*
-	 * Move our selected pivot to a[lo].  Recall that a == si->pivot, so
-	 * this leaves us with the pivot cached in the sortinfo structure.
+	 * We sorted the pivot array records (which includes the xfarray
+	 * indices) in xfarray record order.  The median element of the pivot
+	 * array contains the xfarray record that we will use as the pivot.
+	 * Copy that xfarray record to the designated space.
 	 */
-	error = xfarray_sort_load(si, lo, b);
-	if (error)
-		return error;
-	error = xfarray_sort_load(si, mid, a);
-	if (error)
-		return error;
-	error = xfarray_sort_store(si, mid, b);
+	recp = xfarray_pivot_array_rec(parray, pivot_rec_sz,
+			XFARRAY_QSORT_PIVOT_NR / 2);
+	memcpy(pivot, recp, si->array->obj_size);
+
+	/* If the pivot record we chose was already in a[lo] then we're done. */
+	idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz,
+			XFARRAY_QSORT_PIVOT_NR / 2);
+	if (*idxp == lo)
+		return 0;
+
+	/*
+	 * Find the cached copy of a[lo] in the pivot array so that we can swap
+	 * a[lo] and a[pivot].
+	 */
+	for (i = 0, j = -1; i < XFARRAY_QSORT_PIVOT_NR; i++) {
+		idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz, i);
+		if (*idxp == lo)
+			j = i;
+	}
+	if (j < 0) {
+		ASSERT(j >= 0);
+		return -EFSCORRUPTED;
+	}
+
+	/* Swap a[lo] and a[pivot]. */
+	error = xfarray_sort_store(si, lo, pivot);
 	if (error)
 		return error;
-	return xfarray_sort_store(si, lo, a);
+
+	recp = xfarray_pivot_array_rec(parray, pivot_rec_sz, j);
+	idxp = xfarray_pivot_array_idx(parray, pivot_rec_sz,
+			XFARRAY_QSORT_PIVOT_NR / 2);
+	return xfarray_sort_store(si, *idxp, recp);
 }
 
 /*
@@ -829,7 +899,7 @@ xfarray_sort_load_cached(
  *    particularly expensive in the kernel.
  *
  * 2. For arrays with records in arbitrary or user-controlled order, choose the
- *    pivot element using a median-of-three decision tree.  This reduces the
+ *    pivot element using a median-of-nine decision tree.  This reduces the
  *    probability of selecting a bad pivot value which causes worst case
  *    behavior (i.e. partition sizes of 1).
  *
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
index e7e88350a80e..bf5367301be2 100644
--- a/fs/xfs/scrub/xfarray.h
+++ b/fs/xfs/scrub/xfarray.h
@@ -63,6 +63,9 @@ typedef cmp_func_t xfarray_cmp_fn;
 #define XFARRAY_ISORT_SHIFT		(4)
 #define XFARRAY_ISORT_NR		(1U << XFARRAY_ISORT_SHIFT)
 
+/* Evaluate this many points to find the qsort pivot. */
+#define XFARRAY_QSORT_PIVOT_NR		(9)
+
 struct xfarray_sortinfo {
 	struct xfarray		*array;
 
@@ -92,7 +95,6 @@ struct xfarray_sortinfo {
 	uint64_t		compares;
 	uint64_t		heapsorts;
 #endif
-
 	/*
 	 * Extra bytes are allocated beyond the end of the structure to store
 	 * quicksort information.  C does not permit multiple VLAs per struct,
@@ -115,11 +117,18 @@ struct xfarray_sortinfo {
 	 * 	xfarray_rec_t	scratch[ISORT_NR];
 	 *
 	 * Otherwise, we want to partition the records to partition the array.
-	 * We store the chosen pivot record here and use the xfarray scratchpad
-	 * to rearrange the array around the pivot:
-	 *
-	 * 	xfarray_rec_t	pivot;
+	 * We store the chosen pivot record at the start of the scratchpad area
+	 * and use the rest to sample some records to estimate the median.
+	 * The format of the qsort_pivot array enables us to use the kernel
+	 * heapsort function to place the median value in the middle.
 	 *
+	 * 	struct {
+	 * 		xfarray_rec_t	pivot;
+	 * 		struct {
+	 *			xfarray_rec_t	rec;  (rounded up to 8 bytes)
+	 * 			xfarray_idx_t	idx;
+	 *		} qsort_pivot[QSORT_PIVOT_NR];
+	 * 	};
 	 * }
 	 */
 };


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 1/9] xfs: dump xfiles for debugging purposes
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
@ 2023-05-26  1:04   ` Darrick J. Wong
  2023-05-26  1:05   ` [PATCH 2/9] xfs: teach buftargs to maintain their own buffer hashtable Darrick J. Wong
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:04 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Add a debug function to dump an xfile's contents to dmesg for
debugging purposes.
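
As a usage illustration (not part of this patch), a repair helper could
dump an xfarray's backing store when a sanity check trips.  The helper
below is made up and assumes process context:

/* Hypothetical debugging helper; not part of this series. */
static inline void
xrep_dump_xfarray(
	struct xfarray		*array)
{
	int			error = xfile_dump(array->xfile);

	if (error)
		pr_warn("xfarray xfile dump failed, error %d\n", error);
}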

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/xfile.c |   98 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfile.h |    2 +
 2 files changed, 100 insertions(+)


diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
index d3e678cd4a2f..851aeb244660 100644
--- a/fs/xfs/scrub/xfile.c
+++ b/fs/xfs/scrub/xfile.c
@@ -431,3 +431,101 @@ xfile_put_page(
 		return -EIO;
 	return 0;
 }
+
+/* Dump an xfile to dmesg. */
+int
+xfile_dump(
+	struct xfile		*xf)
+{
+	struct xfile_stat	sb;
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	loff_t			holepos = 0;
+	loff_t			datapos;
+	loff_t			ret;
+	unsigned int		pflags;
+	bool			all_zeroes = true;
+	int			error = 0;
+
+	error = xfile_stat(xf, &sb);
+	if (error)
+		return error;
+
+	printk(KERN_ALERT "xfile ino 0x%lx isize 0x%llx dump:", inode->i_ino,
+			sb.size);
+	pflags = memalloc_nofs_save();
+
+	while ((ret = vfs_llseek(xf->file, holepos, SEEK_DATA)) >= 0) {
+		datapos = rounddown_64(ret, PAGE_SIZE);
+		ret = vfs_llseek(xf->file, datapos, SEEK_HOLE);
+		if (ret < 0)
+			break;
+		holepos = min_t(loff_t, sb.size, roundup_64(ret, PAGE_SIZE));
+
+		while (datapos < holepos) {
+			struct page	*page = NULL;
+			void		*p, *kaddr;
+			u64		datalen = holepos - datapos;
+			unsigned int	pagepos;
+			unsigned int	pagelen;
+
+			cond_resched();
+
+			if (fatal_signal_pending(current)) {
+				error = -EINTR;
+				goto out_pflags;
+			}
+
+			pagelen = min_t(u64, datalen, PAGE_SIZE);
+
+			page = shmem_read_mapping_page_gfp(mapping,
+					datapos >> PAGE_SHIFT, __GFP_NOWARN);
+			if (IS_ERR(page)) {
+				error = PTR_ERR(page);
+				if (error == -EIO)
+					printk(KERN_ALERT "%.8llx: poisoned",
+							datapos);
+				else if (error != -ENOMEM)
+					goto out_pflags;
+
+				goto next_pgoff;
+			}
+
+			if (!PageUptodate(page))
+				goto next_page;
+
+			kaddr = kmap_local_page(page);
+			p = kaddr;
+
+			for (pagepos = 0; pagepos < pagelen; pagepos += 16) {
+				char prefix[16];
+				unsigned int linelen;
+
+				linelen = min_t(unsigned int, pagelen, 16);
+
+				if (!memchr_inv(p + pagepos, 0, linelen))
+					continue;
+
+				snprintf(prefix, 16, "%.8llx: ",
+						datapos + pagepos);
+
+				all_zeroes = false;
+				print_hex_dump(KERN_ALERT, prefix,
+						DUMP_PREFIX_NONE, 16, 1,
+						p + pagepos, linelen, true);
+			}
+			kunmap_local(kaddr);
+next_page:
+			put_page(page);
+next_pgoff:
+			datapos += PAGE_SIZE;
+		}
+	}
+	if (all_zeroes)
+		printk(KERN_ALERT "<all zeroes>");
+	if (ret != -ENXIO)
+		error = ret;
+out_pflags:
+	memalloc_nofs_restore(pflags);
+	return error;
+}
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
index 1aae2cd91720..adf5dbdc4c21 100644
--- a/fs/xfs/scrub/xfile.h
+++ b/fs/xfs/scrub/xfile.h
@@ -75,4 +75,6 @@ int xfile_get_page(struct xfile *xf, loff_t offset, unsigned int len,
 		struct xfile_page *xbuf);
 int xfile_put_page(struct xfile *xf, struct xfile_page *xbuf);
 
+int xfile_dump(struct xfile *xf);
+
 #endif /* __XFS_SCRUB_XFILE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 2/9] xfs: teach buftargs to maintain their own buffer hashtable
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
  2023-05-26  1:04   ` [PATCH 1/9] xfs: dump xfiles for debugging purposes Darrick J. Wong
@ 2023-05-26  1:05   ` Darrick J. Wong
  2023-05-26  1:05   ` [PATCH 3/9] xfs: create buftarg helpers to abstract block_device operations Darrick J. Wong
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:05 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Currently, cached buffers are indexed by per-AG hashtables.  This works
great for the data device, but won't work for in-memory btrees.  Make it
so that buftargs can index buffers too.

We accomplish this by hoisting the rhashtable and its lock into a
separate xfs_buf_cache structure and reworking various functions to use
it.  Next, we introduce to the buftarg a new XFS_BUFTARG_SELF_CACHED
flag to indicate that the buftarg's cache is active (vs. the per-ag
cache for the regular filesystem).

Finally, make it so that each xfs_buf points to its cache if there is
one.  This is how we distinguish uncached buffers from now on.
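
A minimal sketch of the new cache lifecycle (illustrative only; the
"example" functions are made up, and later patches in this set do the
equivalent wiring for xfile-backed targets):

/* Hypothetical consumer that gives a buftarg its own buffer cache. */
static int
example_selfcached_buftarg_init(
	struct xfs_buftarg	*btp,
	struct xfs_buf_cache	*bch)
{
	int			error;

	error = xfs_buf_cache_init(bch);
	if (error)
		return error;

	/* Buffers created against this target now hash into @bch. */
	btp->bt_cache = bch;
	return 0;
}

static void
example_selfcached_buftarg_free(
	struct xfs_buftarg	*btp)
{
	xfs_buf_cache_destroy(btp->bt_cache);
	btp->bt_cache = NULL;
}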

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_ag.c |    6 +-
 fs/xfs/libxfs/xfs_ag.h |    4 -
 fs/xfs/xfs_buf.c       |  140 +++++++++++++++++++++++++++++++++---------------
 fs/xfs/xfs_buf.h       |   10 +++
 fs/xfs/xfs_mount.h     |    3 -
 5 files changed, 110 insertions(+), 53 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index b36ec110ad17..d274ec8bd237 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -264,7 +264,7 @@ xfs_free_perag(
 		xfs_defer_drain_free(&pag->pag_intents_drain);
 
 		cancel_delayed_work_sync(&pag->pag_blockgc_work);
-		xfs_buf_hash_destroy(pag);
+		xfs_buf_cache_destroy(&pag->pag_bcache);
 
 		/* drop the mount's active reference */
 		xfs_perag_rele(pag);
@@ -394,7 +394,7 @@ xfs_initialize_perag(
 		pag->pagb_tree = RB_ROOT;
 #endif /* __KERNEL__ */
 
-		error = xfs_buf_hash_init(pag);
+		error = xfs_buf_cache_init(&pag->pag_bcache);
 		if (error)
 			goto out_remove_pag;
 
@@ -434,7 +434,7 @@ xfs_initialize_perag(
 		pag = radix_tree_delete(&mp->m_perag_tree, index);
 		if (!pag)
 			break;
-		xfs_buf_hash_destroy(pag);
+		xfs_buf_cache_destroy(&pag->pag_bcache);
 		xfs_defer_drain_free(&pag->pag_intents_drain);
 		kmem_free(pag);
 	}
diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
index 616812911a23..a682ddd8fc4c 100644
--- a/fs/xfs/libxfs/xfs_ag.h
+++ b/fs/xfs/libxfs/xfs_ag.h
@@ -104,9 +104,7 @@ struct xfs_perag {
 	int		pag_ici_reclaimable;	/* reclaimable inodes */
 	unsigned long	pag_ici_reclaim_cursor;	/* reclaim restart point */
 
-	/* buffer cache index */
-	spinlock_t	pag_buf_lock;	/* lock for pag_buf_hash */
-	struct rhashtable pag_buf_hash;
+	struct xfs_buf_cache	pag_bcache;
 
 	/* background prealloc block trimming */
 	struct delayed_work	pag_blockgc_work;
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 2a1a641c2b87..dd16dfb669d8 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -499,18 +499,18 @@ static const struct rhashtable_params xfs_buf_hash_params = {
 };
 
 int
-xfs_buf_hash_init(
-	struct xfs_perag	*pag)
+xfs_buf_cache_init(
+	struct xfs_buf_cache	*bch)
 {
-	spin_lock_init(&pag->pag_buf_lock);
-	return rhashtable_init(&pag->pag_buf_hash, &xfs_buf_hash_params);
+	spin_lock_init(&bch->bc_lock);
+	return rhashtable_init(&bch->bc_hash, &xfs_buf_hash_params);
 }
 
 void
-xfs_buf_hash_destroy(
-	struct xfs_perag	*pag)
+xfs_buf_cache_destroy(
+	struct xfs_buf_cache	*bch)
 {
-	rhashtable_destroy(&pag->pag_buf_hash);
+	rhashtable_destroy(&bch->bc_hash);
 }
 
 static int
@@ -569,7 +569,7 @@ xfs_buf_find_lock(
 
 static inline int
 xfs_buf_lookup(
-	struct xfs_perag	*pag,
+	struct xfs_buf_cache	*bch,
 	struct xfs_buf_map	*map,
 	xfs_buf_flags_t		flags,
 	struct xfs_buf		**bpp)
@@ -578,7 +578,7 @@ xfs_buf_lookup(
 	int			error;
 
 	rcu_read_lock();
-	bp = rhashtable_lookup(&pag->pag_buf_hash, map, xfs_buf_hash_params);
+	bp = rhashtable_lookup(&bch->bc_hash, map, xfs_buf_hash_params);
 	if (!bp || !atomic_inc_not_zero(&bp->b_hold)) {
 		rcu_read_unlock();
 		return -ENOENT;
@@ -603,6 +603,7 @@ xfs_buf_lookup(
 static int
 xfs_buf_find_insert(
 	struct xfs_buftarg	*btp,
+	struct xfs_buf_cache	*bch,
 	struct xfs_perag	*pag,
 	struct xfs_buf_map	*cmap,
 	struct xfs_buf_map	*map,
@@ -631,18 +632,18 @@ xfs_buf_find_insert(
 			goto out_free_buf;
 	}
 
-	spin_lock(&pag->pag_buf_lock);
-	bp = rhashtable_lookup_get_insert_fast(&pag->pag_buf_hash,
+	spin_lock(&bch->bc_lock);
+	bp = rhashtable_lookup_get_insert_fast(&bch->bc_hash,
 			&new_bp->b_rhash_head, xfs_buf_hash_params);
 	if (IS_ERR(bp)) {
 		error = PTR_ERR(bp);
-		spin_unlock(&pag->pag_buf_lock);
+		spin_unlock(&bch->bc_lock);
 		goto out_free_buf;
 	}
 	if (bp) {
 		/* found an existing buffer */
 		atomic_inc(&bp->b_hold);
-		spin_unlock(&pag->pag_buf_lock);
+		spin_unlock(&bch->bc_lock);
 		error = xfs_buf_find_lock(bp, flags);
 		if (error)
 			xfs_buf_rele(bp);
@@ -653,17 +654,38 @@ xfs_buf_find_insert(
 
 	/* The new buffer keeps the perag reference until it is freed. */
 	new_bp->b_pag = pag;
-	spin_unlock(&pag->pag_buf_lock);
+	new_bp->b_cache = bch;
+	spin_unlock(&bch->bc_lock);
 	*bpp = new_bp;
 	return 0;
 
 out_free_buf:
 	xfs_buf_free(new_bp);
 out_drop_pag:
-	xfs_perag_put(pag);
+	if (pag)
+		xfs_perag_put(pag);
 	return error;
 }
 
+/* Find the buffer cache for a particular buftarg and map. */
+static inline struct xfs_buf_cache *
+xfs_buftarg_get_cache(
+	struct xfs_buftarg		*btp,
+	const struct xfs_buf_map	*map,
+	struct xfs_perag		**pagp)
+{
+	struct xfs_mount		*mp = btp->bt_mount;
+
+	if (btp->bt_cache) {
+		*pagp = NULL;
+		return btp->bt_cache;
+	}
+
+	*pagp = xfs_perag_get(mp, xfs_daddr_to_agno(mp, map->bm_bn));
+	ASSERT(*pagp != NULL);
+	return &(*pagp)->pag_bcache;
+}
+
 /*
  * Assembles a buffer covering the specified range. The code is optimised for
  * cache hits, as metadata intensive workloads will see 3 orders of magnitude
@@ -677,6 +699,7 @@ xfs_buf_get_map(
 	xfs_buf_flags_t		flags,
 	struct xfs_buf		**bpp)
 {
+	struct xfs_buf_cache	*bch;
 	struct xfs_perag	*pag;
 	struct xfs_buf		*bp = NULL;
 	struct xfs_buf_map	cmap = { .bm_bn = map[0].bm_bn };
@@ -692,10 +715,9 @@ xfs_buf_get_map(
 	if (error)
 		return error;
 
-	pag = xfs_perag_get(btp->bt_mount,
-			    xfs_daddr_to_agno(btp->bt_mount, cmap.bm_bn));
+	bch = xfs_buftarg_get_cache(btp, &cmap, &pag);
 
-	error = xfs_buf_lookup(pag, &cmap, flags, &bp);
+	error = xfs_buf_lookup(bch, &cmap, flags, &bp);
 	if (error && error != -ENOENT)
 		goto out_put_perag;
 
@@ -707,13 +729,14 @@ xfs_buf_get_map(
 			goto out_put_perag;
 
 		/* xfs_buf_find_insert() consumes the perag reference. */
-		error = xfs_buf_find_insert(btp, pag, &cmap, map, nmaps,
+		error = xfs_buf_find_insert(btp, bch, pag, &cmap, map, nmaps,
 				flags, &bp);
 		if (error)
 			return error;
 	} else {
 		XFS_STATS_INC(btp->bt_mount, xb_get_locked);
-		xfs_perag_put(pag);
+		if (pag)
+			xfs_perag_put(pag);
 	}
 
 	/* We do not hold a perag reference anymore. */
@@ -741,7 +764,8 @@ xfs_buf_get_map(
 	return 0;
 
 out_put_perag:
-	xfs_perag_put(pag);
+	if (pag)
+		xfs_perag_put(pag);
 	return error;
 }
 
@@ -995,12 +1019,13 @@ xfs_buf_rele(
 	struct xfs_buf		*bp)
 {
 	struct xfs_perag	*pag = bp->b_pag;
+	struct xfs_buf_cache	*bch = bp->b_cache;
 	bool			release;
 	bool			freebuf = false;
 
 	trace_xfs_buf_rele(bp, _RET_IP_);
 
-	if (!pag) {
+	if (!bch) {
 		ASSERT(list_empty(&bp->b_lru));
 		if (atomic_dec_and_test(&bp->b_hold)) {
 			xfs_buf_ioacct_dec(bp);
@@ -1022,7 +1047,7 @@ xfs_buf_rele(
 	 * leading to a use-after-free scenario.
 	 */
 	spin_lock(&bp->b_lock);
-	release = atomic_dec_and_lock(&bp->b_hold, &pag->pag_buf_lock);
+	release = atomic_dec_and_lock(&bp->b_hold, &bch->bc_lock);
 	if (!release) {
 		/*
 		 * Drop the in-flight state if the buffer is already on the LRU
@@ -1047,7 +1072,7 @@ xfs_buf_rele(
 			bp->b_state &= ~XFS_BSTATE_DISPOSE;
 			atomic_inc(&bp->b_hold);
 		}
-		spin_unlock(&pag->pag_buf_lock);
+		spin_unlock(&bch->bc_lock);
 	} else {
 		/*
 		 * most of the time buffers will already be removed from the
@@ -1062,10 +1087,13 @@ xfs_buf_rele(
 		}
 
 		ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));
-		rhashtable_remove_fast(&pag->pag_buf_hash, &bp->b_rhash_head,
-				       xfs_buf_hash_params);
-		spin_unlock(&pag->pag_buf_lock);
-		xfs_perag_put(pag);
+		rhashtable_remove_fast(&bch->bc_hash, &bp->b_rhash_head,
+				xfs_buf_hash_params);
+		spin_unlock(&bch->bc_lock);
+		if (pag)
+			xfs_perag_put(pag);
+		bp->b_cache = NULL;
+		bp->b_pag = NULL;
 		freebuf = true;
 	}
 
@@ -1989,24 +2017,18 @@ xfs_setsize_buftarg_early(
 	return xfs_setsize_buftarg(btp, bdev_logical_block_size(bdev));
 }
 
-struct xfs_buftarg *
-xfs_alloc_buftarg(
+static struct xfs_buftarg *
+xfs_alloc_buftarg_common(
 	struct xfs_mount	*mp,
-	struct block_device	*bdev)
+	const char		*descr)
 {
-	xfs_buftarg_t		*btp;
-	const struct dax_holder_operations *ops = NULL;
+	struct xfs_buftarg	*btp;
 
-#if defined(CONFIG_FS_DAX) && defined(CONFIG_MEMORY_FAILURE)
-	ops = &xfs_dax_holder_operations;
-#endif
 	btp = kmem_zalloc(sizeof(*btp), KM_NOFS);
+	if (!btp)
+		return NULL;
 
 	btp->bt_mount = mp;
-	btp->bt_dev =  bdev->bd_dev;
-	btp->bt_bdev = bdev;
-	btp->bt_daxdev = fs_dax_get_by_bdev(bdev, &btp->bt_dax_part_off,
-					    mp, ops);
 
 	/*
 	 * Buffer IO error rate limiting. Limit it to no more than 10 messages
@@ -2015,9 +2037,6 @@ xfs_alloc_buftarg(
 	ratelimit_state_init(&btp->bt_ioerror_rl, 30 * HZ,
 			     DEFAULT_RATELIMIT_BURST);
 
-	if (xfs_setsize_buftarg_early(btp, bdev))
-		goto error_free;
-
 	if (list_lru_init(&btp->bt_lru))
 		goto error_free;
 
@@ -2028,9 +2047,10 @@ xfs_alloc_buftarg(
 	btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
 	btp->bt_shrinker.seeks = DEFAULT_SEEKS;
 	btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
-	if (register_shrinker(&btp->bt_shrinker, "xfs-buf:%s",
+	if (register_shrinker(&btp->bt_shrinker, "xfs-%s:%s", descr,
 			      mp->m_super->s_id))
 		goto error_pcpu;
+
 	return btp;
 
 error_pcpu:
@@ -2042,6 +2062,38 @@ xfs_alloc_buftarg(
 	return NULL;
 }
 
+/* Allocate a buffer cache target for a persistent block device. */
+struct xfs_buftarg *
+xfs_alloc_buftarg(
+	struct xfs_mount	*mp,
+	struct block_device	*bdev)
+{
+	struct xfs_buftarg	*btp;
+	const struct dax_holder_operations *ops = NULL;
+
+#if defined(CONFIG_FS_DAX) && defined(CONFIG_MEMORY_FAILURE)
+	ops = &xfs_dax_holder_operations;
+#endif
+
+	btp = xfs_alloc_buftarg_common(mp, "buf");
+	if (!btp)
+		return NULL;
+
+	btp->bt_dev =  bdev->bd_dev;
+	btp->bt_bdev = bdev;
+	btp->bt_daxdev = fs_dax_get_by_bdev(bdev, &btp->bt_dax_part_off,
+					    mp, ops);
+
+	if (xfs_setsize_buftarg_early(btp, bdev))
+		goto error_free;
+
+	return btp;
+
+error_free:
+	xfs_free_buftarg(btp);
+	return NULL;
+}
+
 /*
  * Cancel a delayed write list.
  *
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 467ddb2e2f0d..d17ec9274d99 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -83,6 +83,14 @@ typedef unsigned int xfs_buf_flags_t;
 #define XFS_BSTATE_DISPOSE	 (1 << 0)	/* buffer being discarded */
 #define XFS_BSTATE_IN_FLIGHT	 (1 << 1)	/* I/O in flight */
 
+struct xfs_buf_cache {
+	spinlock_t		bc_lock;
+	struct rhashtable	bc_hash;
+};
+
+int xfs_buf_cache_init(struct xfs_buf_cache *bch);
+void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
+
 /*
  * The xfs_buftarg contains 2 notions of "sector size" -
  *
@@ -102,6 +110,7 @@ typedef struct xfs_buftarg {
 	struct dax_device	*bt_daxdev;
 	u64			bt_dax_part_off;
 	struct xfs_mount	*bt_mount;
+	struct xfs_buf_cache	*bt_cache;
 	unsigned int		bt_meta_sectorsize;
 	size_t			bt_meta_sectormask;
 	size_t			bt_logical_sectorsize;
@@ -208,6 +217,7 @@ struct xfs_buf {
 	int			b_last_error;
 
 	const struct xfs_buf_ops	*b_ops;
+	struct xfs_buf_cache	*b_cache;
 	struct rcu_head		b_rcu;
 };
 
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index fc8d4de55cd1..622cd805dc48 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -486,9 +486,6 @@ xfs_daddr_to_agbno(struct xfs_mount *mp, xfs_daddr_t d)
 	return (xfs_agblock_t) do_div(ld, mp->m_sb.sb_agblocks);
 }
 
-int xfs_buf_hash_init(struct xfs_perag *pag);
-void xfs_buf_hash_destroy(struct xfs_perag *pag);
-
 extern void	xfs_uuid_table_free(void);
 extern uint64_t xfs_default_resblks(xfs_mount_t *mp);
 extern int	xfs_mountfs(xfs_mount_t *mp);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 3/9] xfs: create buftarg helpers to abstract block_device operations
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
  2023-05-26  1:04   ` [PATCH 1/9] xfs: dump xfiles for debugging purposes Darrick J. Wong
  2023-05-26  1:05   ` [PATCH 2/9] xfs: teach buftargs to maintain their own buffer hashtable Darrick J. Wong
@ 2023-05-26  1:05   ` Darrick J. Wong
  2023-05-26  1:05   ` [PATCH 4/9] xfs: make GFP_ usage consistent when allocating buftargs Darrick J. Wong
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:05 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

In the next few patches, we're going to introduce buffer targets that
are not block devices.  Introduce block_device helpers so that the
compiler can check that we're not feeding an xfile object to something
expecting a block device.
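
To illustrate the intent (sketch only; example_discard_range() is made
up), callers that genuinely need a block device obtain it through the
helper, which may return NULL for the in-memory targets added later in
this series, instead of reaching into bt_bdev directly:

/* Hypothetical caller going through the new buftarg helpers. */
static int
example_discard_range(
	struct xfs_buftarg	*btp,
	sector_t		sector,
	sector_t		nr_sects)
{
	struct block_device	*bdev = xfs_buftarg_bdev(btp);

	if (!bdev || !bdev_max_discard_sectors(bdev))
		return -EOPNOTSUPP;

	return blkdev_issue_discard(bdev, sector, nr_sects, GFP_NOFS);
}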

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_aops.c        |    5 ++++-
 fs/xfs/xfs_bmap_util.c   |    8 ++++----
 fs/xfs/xfs_buf.h         |   37 +++++++++++++++++++++++++++++++++++--
 fs/xfs/xfs_discard.c     |    8 ++++----
 fs/xfs/xfs_file.c        |    6 +++---
 fs/xfs/xfs_ioctl.c       |    3 ++-
 fs/xfs/xfs_iomap.c       |    4 ++--
 fs/xfs/xfs_log.c         |    4 ++--
 fs/xfs/xfs_log_cil.c     |    3 ++-
 fs/xfs/xfs_log_recover.c |    3 ++-
 fs/xfs/xfs_super.c       |    4 ++--
 11 files changed, 62 insertions(+), 23 deletions(-)


diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 2ef78aa1d3f6..90f9fdbda20b 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -569,7 +569,10 @@ xfs_iomap_swapfile_activate(
 	struct file			*swap_file,
 	sector_t			*span)
 {
-	sis->bdev = xfs_inode_buftarg(XFS_I(file_inode(swap_file)))->bt_bdev;
+	struct xfs_inode		*ip = XFS_I(file_inode(swap_file));
+	struct xfs_buftarg		*btp = xfs_inode_buftarg(ip);
+
+	sis->bdev = xfs_buftarg_bdev(btp);
 	return iomap_swapfile_activate(sis, swap_file, span,
 			&xfs_read_iomap_ops);
 }
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index fbb675563208..a847dbd76537 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -62,10 +62,10 @@ xfs_zero_extent(
 	xfs_daddr_t		sector = xfs_fsb_to_db(ip, start_fsb);
 	sector_t		block = XFS_BB_TO_FSBT(mp, sector);
 
-	return blkdev_issue_zeroout(target->bt_bdev,
-		block << (mp->m_super->s_blocksize_bits - 9),
-		count_fsb << (mp->m_super->s_blocksize_bits - 9),
-		GFP_NOFS, 0);
+	return xfs_buftarg_zeroout(target,
+			block << (mp->m_super->s_blocksize_bits - 9),
+			count_fsb << (mp->m_super->s_blocksize_bits - 9),
+			GFP_NOFS, 0);
 }
 
 #ifdef CONFIG_XFS_RT
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index d17ec9274d99..dd7964bc76d7 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -378,8 +378,41 @@ extern void xfs_buftarg_wait(struct xfs_buftarg *);
 extern void xfs_buftarg_drain(struct xfs_buftarg *);
 extern int xfs_setsize_buftarg(struct xfs_buftarg *, unsigned int);
 
-#define xfs_getsize_buftarg(buftarg)	block_size((buftarg)->bt_bdev)
-#define xfs_readonly_buftarg(buftarg)	bdev_read_only((buftarg)->bt_bdev)
+static inline struct block_device *
+xfs_buftarg_bdev(struct xfs_buftarg *btp)
+{
+	return btp->bt_bdev;
+}
+
+static inline unsigned int
+xfs_getsize_buftarg(struct xfs_buftarg *btp)
+{
+	return block_size(btp->bt_bdev);
+}
+
+static inline bool
+xfs_readonly_buftarg(struct xfs_buftarg *btp)
+{
+	return bdev_read_only(btp->bt_bdev);
+}
+
+static inline int
+xfs_buftarg_flush(struct xfs_buftarg *btp)
+{
+	return blkdev_issue_flush(btp->bt_bdev);
+}
+
+static inline int
+xfs_buftarg_zeroout(
+	struct xfs_buftarg	*btp,
+	sector_t		sector,
+	sector_t		nr_sects,
+	gfp_t			gfp_mask,
+	unsigned		flags)
+{
+	return blkdev_issue_zeroout(btp->bt_bdev, sector, nr_sects, gfp_mask,
+			flags);
+}
 
 int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
 bool xfs_verify_magic(struct xfs_buf *bp, __be32 dmagic);
diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index 96f2263fe9b7..3d074d094bf4 100644
--- a/fs/xfs/xfs_discard.c
+++ b/fs/xfs/xfs_discard.c
@@ -29,7 +29,7 @@ xfs_trim_extents(
 	uint64_t		*blocks_trimmed)
 {
 	struct xfs_mount	*mp = pag->pag_mount;
-	struct block_device	*bdev = mp->m_ddev_targp->bt_bdev;
+	struct block_device	*bdev = xfs_buftarg_bdev(mp->m_ddev_targp);
 	struct xfs_btree_cur	*cur;
 	struct xfs_buf		*agbp;
 	struct xfs_agf		*agf;
@@ -150,8 +150,8 @@ xfs_ioc_trim(
 	struct fstrim_range __user	*urange)
 {
 	struct xfs_perag	*pag;
-	unsigned int		granularity =
-		bdev_discard_granularity(mp->m_ddev_targp->bt_bdev);
+	struct block_device	*bdev = xfs_buftarg_bdev(mp->m_ddev_targp);
+	unsigned int		granularity = bdev_discard_granularity(bdev);
 	struct fstrim_range	range;
 	xfs_daddr_t		start, end, minlen;
 	xfs_agnumber_t		agno;
@@ -160,7 +160,7 @@ xfs_ioc_trim(
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
-	if (!bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev))
+	if (!bdev_max_discard_sectors(bdev))
 		return -EOPNOTSUPP;
 
 	/*
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index aede746541f8..2380067aa154 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -164,9 +164,9 @@ xfs_file_fsync(
 	 * inode size in case of an extending write.
 	 */
 	if (XFS_IS_REALTIME_INODE(ip))
-		error = blkdev_issue_flush(mp->m_rtdev_targp->bt_bdev);
+		error = xfs_buftarg_flush(mp->m_rtdev_targp);
 	else if (mp->m_logdev_targp != mp->m_ddev_targp)
-		error = blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
+		error = xfs_buftarg_flush(mp->m_ddev_targp);
 
 	/*
 	 * Any inode that has dirty modifications in the log is pinned.  The
@@ -189,7 +189,7 @@ xfs_file_fsync(
 	 */
 	if (!log_flushed && !XFS_IS_REALTIME_INODE(ip) &&
 	    mp->m_logdev_targp == mp->m_ddev_targp) {
-		err2 = blkdev_issue_flush(mp->m_ddev_targp->bt_bdev);
+		err2 = xfs_buftarg_flush(mp->m_ddev_targp);
 		if (err2 && !error)
 			error = err2;
 	}
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 55bb01173cde..0667e088a289 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1762,6 +1762,7 @@ xfs_ioc_setlabel(
 	char			__user *newlabel)
 {
 	struct xfs_sb		*sbp = &mp->m_sb;
+	struct block_device	*bdev = xfs_buftarg_bdev(mp->m_ddev_targp);
 	char			label[XFSLABEL_MAX + 1];
 	size_t			len;
 	int			error;
@@ -1808,7 +1809,7 @@ xfs_ioc_setlabel(
 	error = xfs_update_secondary_sbs(mp);
 	mutex_unlock(&mp->m_growlock);
 
-	invalidate_bdev(mp->m_ddev_targp->bt_bdev);
+	invalidate_bdev(bdev);
 
 out:
 	mnt_drop_write_file(filp);
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 0ff46e3997e0..559e8e785595 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -129,7 +129,7 @@ xfs_bmbt_to_iomap(
 	if (mapping_flags & IOMAP_DAX)
 		iomap->dax_dev = target->bt_daxdev;
 	else
-		iomap->bdev = target->bt_bdev;
+		iomap->bdev = xfs_buftarg_bdev(target);
 	iomap->flags = iomap_flags;
 
 	if (xfs_ipincount(ip) &&
@@ -154,7 +154,7 @@ xfs_hole_to_iomap(
 	iomap->type = IOMAP_HOLE;
 	iomap->offset = XFS_FSB_TO_B(ip->i_mount, offset_fsb);
 	iomap->length = XFS_FSB_TO_B(ip->i_mount, end_fsb - offset_fsb);
-	iomap->bdev = target->bt_bdev;
+	iomap->bdev = xfs_buftarg_bdev(target);
 	iomap->dax_dev = target->bt_daxdev;
 }
 
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fc61cc024023..b32a8e57f576 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1938,7 +1938,7 @@ xlog_write_iclog(
 	 * writeback throttle from throttling log writes behind background
 	 * metadata writeback and causing priority inversions.
 	 */
-	bio_init(&iclog->ic_bio, log->l_targ->bt_bdev, iclog->ic_bvec,
+	bio_init(&iclog->ic_bio, xfs_buftarg_bdev(log->l_targ), iclog->ic_bvec,
 		 howmany(count, PAGE_SIZE),
 		 REQ_OP_WRITE | REQ_META | REQ_SYNC | REQ_IDLE);
 	iclog->ic_bio.bi_iter.bi_sector = log->l_logBBstart + bno;
@@ -1959,7 +1959,7 @@ xlog_write_iclog(
 		 * avoid shutdown re-entering this path and erroring out again.
 		 */
 		if (log->l_targ != log->l_mp->m_ddev_targp &&
-		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
+		    xfs_buftarg_flush(log->l_mp->m_ddev_targp)) {
 			xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
 			return;
 		}
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index eccbfb99e894..12cd2874048f 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -742,7 +742,8 @@ xlog_discard_busy_extents(
 		trace_xfs_discard_extent(mp, busyp->agno, busyp->bno,
 					 busyp->length);
 
-		error = __blkdev_issue_discard(mp->m_ddev_targp->bt_bdev,
+		error = __blkdev_issue_discard(
+				xfs_buftarg_bdev(mp->m_ddev_targp),
 				XFS_AGB_TO_DADDR(mp, busyp->agno, busyp->bno),
 				XFS_FSB_TO_BB(mp, busyp->length),
 				GFP_NOFS, &bio);
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 322eb2ee6c55..6b1f37bc3e95 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -137,7 +137,8 @@ xlog_do_io(
 	nbblks = round_up(nbblks, log->l_sectBBsize);
 	ASSERT(nbblks > 0);
 
-	error = xfs_rw_bdev(log->l_targ->bt_bdev, log->l_logBBstart + blk_no,
+	error = xfs_rw_bdev(xfs_buftarg_bdev(log->l_targ),
+			log->l_logBBstart + blk_no,
 			BBTOB(nbblks), data, op);
 	if (error && !xlog_is_shutdown(log)) {
 		xfs_alert(log->l_mp,
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 67ebb9d5ed21..f661aaaeac35 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -408,13 +408,13 @@ xfs_close_devices(
 	struct xfs_mount	*mp)
 {
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) {
-		struct block_device *logdev = mp->m_logdev_targp->bt_bdev;
+		struct block_device *logdev = xfs_buftarg_bdev(mp->m_logdev_targp);
 
 		xfs_free_buftarg(mp->m_logdev_targp);
 		xfs_blkdev_put(logdev);
 	}
 	if (mp->m_rtdev_targp) {
-		struct block_device *rtdev = mp->m_rtdev_targp->bt_bdev;
+		struct block_device *rtdev = xfs_buftarg_bdev(mp->m_rtdev_targp);
 
 		xfs_free_buftarg(mp->m_rtdev_targp);
 		xfs_blkdev_put(rtdev);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 4/9] xfs: make GFP_ usage consistent when allocating buftargs
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
                     ` (2 preceding siblings ...)
  2023-05-26  1:05   ` [PATCH 3/9] xfs: create buftarg helpers to abstract block_device operations Darrick J. Wong
@ 2023-05-26  1:05   ` Darrick J. Wong
  2023-05-26  1:05   ` [PATCH 5/9] xfs: support in-memory buffer cache targets Darrick J. Wong
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:05 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Convert kmem_zalloc to kzalloc, and make both memory allocations in
this function use GFP_NOFS.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_buf.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)


diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index dd16dfb669d8..19cefed4dca7 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1978,7 +1978,7 @@ xfs_free_buftarg(
 	invalidate_bdev(btp->bt_bdev);
 	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
 
-	kmem_free(btp);
+	kvfree(btp);
 }
 
 int
@@ -2024,7 +2024,7 @@ xfs_alloc_buftarg_common(
 {
 	struct xfs_buftarg	*btp;
 
-	btp = kmem_zalloc(sizeof(*btp), KM_NOFS);
+	btp = kzalloc(sizeof(*btp), GFP_NOFS);
 	if (!btp)
 		return NULL;
 
@@ -2040,7 +2040,7 @@ xfs_alloc_buftarg_common(
 	if (list_lru_init(&btp->bt_lru))
 		goto error_free;
 
-	if (percpu_counter_init(&btp->bt_io_count, 0, GFP_KERNEL))
+	if (percpu_counter_init(&btp->bt_io_count, 0, GFP_NOFS))
 		goto error_lru;
 
 	btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
@@ -2058,7 +2058,7 @@ xfs_alloc_buftarg_common(
 error_lru:
 	list_lru_destroy(&btp->bt_lru);
 error_free:
-	kmem_free(btp);
+	kvfree(btp);
 	return NULL;
 }
 


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 5/9] xfs: support in-memory buffer cache targets
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
                     ` (3 preceding siblings ...)
  2023-05-26  1:05   ` [PATCH 4/9] xfs: make GFP_ usage consistent when allocating buftargs Darrick J. Wong
@ 2023-05-26  1:05   ` Darrick J. Wong
  2023-05-26  1:06   ` [PATCH 6/9] xfs: consolidate btree block freeing tracepoints Darrick J. Wong
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:05 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Allow the buffer cache to target in-memory files by connecting it to
xfiles.
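
A minimal end-to-end sketch of how such a target might be used
(illustrative only; the "xexample" description string and the helper
are made up):

/* Hypothetical round trip through an xfile-backed buffer target. */
static int
example_inmem_buffer(
	struct xfs_mount	*mp)
{
	struct xfs_buftarg	*btp;
	struct xfs_buf		*bp;
	int			error;

	error = xfile_alloc_buftarg(mp, "xexample", &btp);
	if (error)
		return error;

	/* One 512b sector at daddr 0; the IO path goes through the xfile. */
	error = xfs_buf_get(btp, 0, 1, &bp);
	if (!error) {
		memset(bp->b_addr, 0, BBTOB(bp->b_length));
		bp->b_flags |= XBF_DONE;
		xfs_buf_relse(bp);
	}

	xfile_free_buftarg(btp);
	return error;
}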

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Kconfig         |    4 ++
 fs/xfs/Makefile        |    1 +
 fs/xfs/scrub/xfile.h   |   16 +++++++++
 fs/xfs/xfs_buf.c       |   44 ++++++++++++++++++++++--
 fs/xfs/xfs_buf.h       |   26 +++++++++++++-
 fs/xfs/xfs_buf_xfile.c |   89 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_buf_xfile.h |   18 ++++++++++
 7 files changed, 193 insertions(+), 5 deletions(-)
 create mode 100644 fs/xfs/xfs_buf_xfile.c
 create mode 100644 fs/xfs/xfs_buf_xfile.h


diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index acd56ebe77f9..71fd486eaca1 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -128,6 +128,9 @@ config XFS_LIVE_HOOKS
 	bool
 	select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
 
+config XFS_IN_MEMORY_FILE
+	bool
+
 config XFS_ONLINE_SCRUB
 	bool "XFS online metadata check support"
 	default n
@@ -135,6 +138,7 @@ config XFS_ONLINE_SCRUB
 	depends on TMPFS && SHMEM
 	select XFS_LIVE_HOOKS
 	select XFS_DRAIN_INTENTS
+	select XFS_IN_MEMORY_FILE
 	help
 	  If you say Y here you will be able to check metadata on a
 	  mounted XFS filesystem.  This feature is intended to reduce
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index ea90abdd9941..fc44611cf723 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -138,6 +138,7 @@ endif
 
 xfs-$(CONFIG_XFS_DRAIN_INTENTS)	+= xfs_drain.o
 xfs-$(CONFIG_XFS_LIVE_HOOKS)	+= xfs_hooks.o
+xfs-$(CONFIG_XFS_IN_MEMORY_FILE)	+= xfs_buf_xfile.o
 
 # online scrub/repair
 ifeq ($(CONFIG_XFS_ONLINE_SCRUB),y)
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
index adf5dbdc4c21..083348b4cdaf 100644
--- a/fs/xfs/scrub/xfile.h
+++ b/fs/xfs/scrub/xfile.h
@@ -6,6 +6,8 @@
 #ifndef __XFS_SCRUB_XFILE_H__
 #define __XFS_SCRUB_XFILE_H__
 
+#ifdef CONFIG_XFS_IN_MEMORY_FILE
+
 struct xfile_page {
 	struct page		*page;
 	void			*fsdata;
@@ -24,6 +26,7 @@ static inline pgoff_t xfile_page_index(const struct xfile_page *xfpage)
 
 struct xfile {
 	struct file		*file;
+	struct xfs_buf_cache	bcache;
 };
 
 int xfile_create(struct xfs_mount *mp, const char *description, loff_t isize,
@@ -76,5 +79,18 @@ int xfile_get_page(struct xfile *xf, loff_t offset, unsigned int len,
 int xfile_put_page(struct xfile *xf, struct xfile_page *xbuf);
 
 int xfile_dump(struct xfile *xf);
+#else
+static inline int
+xfile_obj_load(struct xfile *xf, void *buf, size_t count, loff_t offset)
+{
+	return -EIO;
+}
+
+static inline int
+xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t offset)
+{
+	return -EIO;
+}
+#endif /* CONFIG_XFS_IN_MEMORY_FILE */
 
 #endif /* __XFS_SCRUB_XFILE_H__ */
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 19cefed4dca7..e3f24594e575 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -21,6 +21,7 @@
 #include "xfs_errortag.h"
 #include "xfs_error.h"
 #include "xfs_ag.h"
+#include "xfs_buf_xfile.h"
 
 struct kmem_cache *xfs_buf_cache;
 
@@ -1552,6 +1553,30 @@ xfs_buf_ioapply_map(
 
 }
 
+/* Start a synchronous process-context buffer IO. */
+static inline void
+xfs_buf_start_sync_io(
+	struct xfs_buf	*bp)
+{
+	atomic_inc(&bp->b_io_remaining);
+}
+
+/* Finish a synchronous process-context buffer IO. */
+static void
+xfs_buf_end_sync_io(
+	struct xfs_buf	*bp,
+	int		error)
+{
+	if (error)
+		cmpxchg(&bp->b_io_error, 0, error);
+
+	if (!bp->b_error && xfs_buf_is_vmapped(bp) && (bp->b_flags & XBF_READ))
+		invalidate_kernel_vmap_range(bp->b_addr, xfs_buf_vmap_len(bp));
+
+	if (atomic_dec_and_test(&bp->b_io_remaining) == 1)
+		xfs_buf_ioend(bp);
+}
+
 STATIC void
 _xfs_buf_ioapply(
 	struct xfs_buf	*bp)
@@ -1609,6 +1634,15 @@ _xfs_buf_ioapply(
 	/* we only use the buffer cache for meta-data */
 	op |= REQ_META;
 
+	if (bp->b_target->bt_flags & XFS_BUFTARG_XFILE) {
+		int	error;
+
+		xfs_buf_start_sync_io(bp);
+		error = xfile_buf_ioapply(bp);
+		xfs_buf_end_sync_io(bp, error);
+		return;
+	}
+
 	/*
 	 * Walk all the vectors issuing IO on them. Set up the initial offset
 	 * into the buffer and the desired IO size before we start -
@@ -1974,9 +2008,11 @@ xfs_free_buftarg(
 	percpu_counter_destroy(&btp->bt_io_count);
 	list_lru_destroy(&btp->bt_lru);
 
-	blkdev_issue_flush(btp->bt_bdev);
-	invalidate_bdev(btp->bt_bdev);
-	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
+	if (!(btp->bt_flags & XFS_BUFTARG_XFILE)) {
+		blkdev_issue_flush(btp->bt_bdev);
+		invalidate_bdev(btp->bt_bdev);
+		fs_put_dax(btp->bt_daxdev, btp->bt_mount);
+	}
 
 	kvfree(btp);
 }
@@ -2017,7 +2053,7 @@ xfs_setsize_buftarg_early(
 	return xfs_setsize_buftarg(btp, bdev_logical_block_size(bdev));
 }
 
-static struct xfs_buftarg *
+struct xfs_buftarg *
 xfs_alloc_buftarg_common(
 	struct xfs_mount	*mp,
 	const char		*descr)
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index dd7964bc76d7..90b67a11e3c1 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -21,6 +21,7 @@ extern struct kmem_cache *xfs_buf_cache;
  *	Base types
  */
 struct xfs_buf;
+struct xfile;
 
 #define XFS_BUF_DADDR_NULL	((xfs_daddr_t) (-1LL))
 
@@ -106,11 +107,15 @@ void xfs_buf_cache_destroy(struct xfs_buf_cache *bch);
  */
 typedef struct xfs_buftarg {
 	dev_t			bt_dev;
-	struct block_device	*bt_bdev;
+	union {
+		struct block_device	*bt_bdev;
+		struct xfile		*bt_xfile;
+	};
 	struct dax_device	*bt_daxdev;
 	u64			bt_dax_part_off;
 	struct xfs_mount	*bt_mount;
 	struct xfs_buf_cache	*bt_cache;
+	unsigned int		bt_flags;
 	unsigned int		bt_meta_sectorsize;
 	size_t			bt_meta_sectormask;
 	size_t			bt_logical_sectorsize;
@@ -124,6 +129,13 @@ typedef struct xfs_buftarg {
 	struct ratelimit_state	bt_ioerror_rl;
 } xfs_buftarg_t;
 
+#ifdef CONFIG_XFS_IN_MEMORY_FILE
+/* in-memory buftarg via bt_xfile */
+# define XFS_BUFTARG_XFILE	(1U << 0)
+#else
+# define XFS_BUFTARG_XFILE	(0)
+#endif
+
 #define XB_PAGES	2
 
 struct xfs_buf_map {
@@ -371,6 +383,8 @@ xfs_buf_update_cksum(struct xfs_buf *bp, unsigned long cksum_offset)
 /*
  *	Handling of buftargs.
  */
+struct xfs_buftarg *xfs_alloc_buftarg_common(struct xfs_mount *mp,
+		const char *descr);
 struct xfs_buftarg *xfs_alloc_buftarg(struct xfs_mount *mp,
 		struct block_device *bdev);
 extern void xfs_free_buftarg(struct xfs_buftarg *);
@@ -381,24 +395,32 @@ extern int xfs_setsize_buftarg(struct xfs_buftarg *, unsigned int);
 static inline struct block_device *
 xfs_buftarg_bdev(struct xfs_buftarg *btp)
 {
+	if (btp->bt_flags & XFS_BUFTARG_XFILE)
+		return NULL;
 	return btp->bt_bdev;
 }
 
 static inline unsigned int
 xfs_getsize_buftarg(struct xfs_buftarg *btp)
 {
+	if (btp->bt_flags & XFS_BUFTARG_XFILE)
+		return SECTOR_SIZE;
 	return block_size(btp->bt_bdev);
 }
 
 static inline bool
 xfs_readonly_buftarg(struct xfs_buftarg *btp)
 {
+	if (btp->bt_flags & XFS_BUFTARG_XFILE)
+		return false;
 	return bdev_read_only(btp->bt_bdev);
 }
 
 static inline int
 xfs_buftarg_flush(struct xfs_buftarg *btp)
 {
+	if (btp->bt_flags & XFS_BUFTARG_XFILE)
+		return 0;
 	return blkdev_issue_flush(btp->bt_bdev);
 }
 
@@ -410,6 +432,8 @@ xfs_buftarg_zeroout(
 	gfp_t			gfp_mask,
 	unsigned		flags)
 {
+	if (btp->bt_flags & XFS_BUFTARG_XFILE)
+		return -EOPNOTSUPP;
 	return blkdev_issue_zeroout(btp->bt_bdev, sector, nr_sects, gfp_mask,
 			flags);
 }
diff --git a/fs/xfs/xfs_buf_xfile.c b/fs/xfs/xfs_buf_xfile.c
new file mode 100644
index 000000000000..69f1d62e0fcb
--- /dev/null
+++ b/fs/xfs/xfs_buf_xfile.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_buf.h"
+#include "xfs_buf_xfile.h"
+#include "scrub/xfile.h"
+
+/* Perform a buffer IO to an xfile.  Caller must be in process context. */
+int
+xfile_buf_ioapply(
+	struct xfs_buf		*bp)
+{
+	struct xfile		*xfile = bp->b_target->bt_xfile;
+	loff_t			pos = BBTOB(xfs_buf_daddr(bp));
+	size_t			size = BBTOB(bp->b_length);
+
+	if (bp->b_map_count > 1) {
+		/* We don't need or support multi-map buffers. */
+		ASSERT(0);
+		return -EIO;
+	}
+
+	if (bp->b_flags & XBF_WRITE)
+		return xfile_obj_store(xfile, bp->b_addr, size, pos);
+	return xfile_obj_load(xfile, bp->b_addr, size, pos);
+}
+
+/* Allocate a buffer cache target for a memory-backed file. */
+int
+xfile_alloc_buftarg(
+	struct xfs_mount	*mp,
+	const char		*descr,
+	struct xfs_buftarg	**btpp)
+{
+	struct xfs_buftarg	*btp;
+	struct xfile		*xfile;
+	int			error;
+
+	error = xfile_create(mp, descr, 0, &xfile);
+	if (error)
+		return error;
+
+	error = xfs_buf_cache_init(&xfile->bcache);
+	if (error)
+		goto out_xfile;
+
+	btp = xfs_alloc_buftarg_common(mp, descr);
+	if (!btp) {
+		error = -ENOMEM;
+		goto out_bcache;
+	}
+
+	btp->bt_xfile = xfile;
+	btp->bt_dev = (dev_t)-1U;
+	btp->bt_flags |= XFS_BUFTARG_XFILE;
+	btp->bt_cache = &xfile->bcache;
+
+	btp->bt_meta_sectorsize = SECTOR_SIZE;
+	btp->bt_meta_sectormask = SECTOR_SIZE - 1;
+	btp->bt_logical_sectorsize = SECTOR_SIZE;
+	btp->bt_logical_sectormask = SECTOR_SIZE - 1;
+
+	*btpp = btp;
+	return 0;
+
+out_bcache:
+	xfs_buf_cache_destroy(&xfile->bcache);
+out_xfile:
+	xfile_destroy(xfile);
+	return error;
+}
+
+/* Free a buffer cache target for a memory-backed file. */
+void
+xfile_free_buftarg(
+	struct xfs_buftarg	*btp)
+{
+	struct xfile		*xfile = btp->bt_xfile;
+
+	ASSERT(btp->bt_flags & XFS_BUFTARG_XFILE);
+
+	xfs_free_buftarg(btp);
+	xfs_buf_cache_destroy(&xfile->bcache);
+	xfile_destroy(xfile);
+}
diff --git a/fs/xfs/xfs_buf_xfile.h b/fs/xfs/xfs_buf_xfile.h
new file mode 100644
index 000000000000..29efaf06a676
--- /dev/null
+++ b/fs/xfs/xfs_buf_xfile.h
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_BUF_XFILE_H__
+#define __XFS_BUF_XFILE_H__
+
+#ifdef CONFIG_XFS_IN_MEMORY_FILE
+int xfile_buf_ioapply(struct xfs_buf *bp);
+int xfile_alloc_buftarg(struct xfs_mount *mp, const char *descr,
+		struct xfs_buftarg **btpp);
+void xfile_free_buftarg(struct xfs_buftarg *btp);
+#else
+# define xfile_buf_ioapply(bp)			(-EOPNOTSUPP)
+#endif /* CONFIG_XFS_IN_MEMORY_FILE */
+
+#endif /* __XFS_BUF_XFILE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 6/9] xfs: consolidate btree block freeing tracepoints
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
                     ` (4 preceding siblings ...)
  2023-05-26  1:05   ` [PATCH 5/9] xfs: support in-memory buffer cache targets Darrick J. Wong
@ 2023-05-26  1:06   ` Darrick J. Wong
  2023-05-26  1:06   ` [PATCH 7/9] xfs: consolidate btree block allocation tracepoints Darrick J. Wong
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:06 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Don't waste tracepoint segment memory on per-btree block freeing
tracepoints when we can do it from the generic btree code.
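
The consolidation pattern, sketched with made-up names (the real change
calls trace_xfs_btree_free_block() once from xfs_btree_free_block()):

/* Trace once in the generic wrapper instead of in every implementation. */
static inline int
example_free_block(
	struct example_cursor	*cur,
	struct example_buf	*bp)
{
	trace_example_free_block(cur, bp);

	return cur->ops->free_block(cur, bp);
}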

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_btree.c          |    2 ++
 fs/xfs/libxfs/xfs_refcount_btree.c |    2 --
 fs/xfs/libxfs/xfs_rmap_btree.c     |    2 --
 fs/xfs/xfs_trace.h                 |   32 ++++++++++++++++++++++++++++++--
 4 files changed, 32 insertions(+), 6 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 28ba52808688..3e966182b90a 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -414,6 +414,8 @@ xfs_btree_free_block(
 {
 	int			error;
 
+	trace_xfs_btree_free_block(cur, bp);
+
 	error = cur->bc_ops->free_block(cur, bp);
 	if (!error) {
 		xfs_trans_binval(cur->bc_tp, bp);
diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
index efe22aa1c906..978f00e9e99e 100644
--- a/fs/xfs/libxfs/xfs_refcount_btree.c
+++ b/fs/xfs/libxfs/xfs_refcount_btree.c
@@ -108,8 +108,6 @@ xfs_refcountbt_free_block(
 	xfs_fsblock_t		fsbno = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
 	int			error;
 
-	trace_xfs_refcountbt_free_block(cur->bc_mp, cur->bc_ag.pag->pag_agno,
-			XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno), 1);
 	be32_add_cpu(&agf->agf_refcount_blocks, -1);
 	xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_REFCOUNT_BLOCKS);
 	error = xfs_free_extent(cur->bc_tp, cur->bc_ag.pag,
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
index 6c81b20e97d2..0dc086bc528f 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.c
+++ b/fs/xfs/libxfs/xfs_rmap_btree.c
@@ -125,8 +125,6 @@ xfs_rmapbt_free_block(
 	int			error;
 
 	bno = xfs_daddr_to_agbno(cur->bc_mp, xfs_buf_daddr(bp));
-	trace_xfs_rmapbt_free_block(cur->bc_mp, pag->pag_agno,
-			bno, 1);
 	be32_add_cpu(&agf->agf_rmap_blocks, -1);
 	xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_RMAP_BLOCKS);
 	error = xfs_alloc_put_freelist(pag, cur->bc_tp, agbp, NULL, bno, 1);
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index e57bf37d4993..10fb261e6c17 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -2523,6 +2523,36 @@ DEFINE_EVENT(xfs_btree_cur_class, name, \
 DEFINE_BTREE_CUR_EVENT(xfs_btree_updkeys);
 DEFINE_BTREE_CUR_EVENT(xfs_btree_overlapped_query_range);
 
+TRACE_EVENT(xfs_btree_free_block,
+	TP_PROTO(struct xfs_btree_cur *cur, struct xfs_buf *bp),
+	TP_ARGS(cur, bp),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_ino_t, ino)
+		__field(xfs_btnum_t, btnum)
+		__field(xfs_agblock_t, agbno)
+	),
+	TP_fast_assign(
+		__entry->dev = cur->bc_mp->m_super->s_dev;
+		__entry->agno = xfs_daddr_to_agno(cur->bc_mp,
+							xfs_buf_daddr(bp));
+		if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
+			__entry->ino = cur->bc_ino.ip->i_ino;
+		else
+			__entry->ino = 0;
+		__entry->btnum = cur->bc_btnum;
+		__entry->agbno = xfs_daddr_to_agbno(cur->bc_mp,
+							xfs_buf_daddr(bp));
+	),
+	TP_printk("dev %d:%d btree %s agno 0x%x ino 0x%llx agbno 0x%x",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __print_symbolic(__entry->btnum, XFS_BTNUM_STRINGS),
+		  __entry->agno,
+		  __entry->ino,
+		  __entry->agbno)
+);
+
 /* deferred ops */
 struct xfs_defer_pending;
 
@@ -2877,7 +2907,6 @@ DEFINE_RMAP_DEFERRED_EVENT(xfs_rmap_defer);
 DEFINE_RMAP_DEFERRED_EVENT(xfs_rmap_deferred);
 
 DEFINE_BUSY_EVENT(xfs_rmapbt_alloc_block);
-DEFINE_BUSY_EVENT(xfs_rmapbt_free_block);
 DEFINE_RMAPBT_EVENT(xfs_rmap_update);
 DEFINE_RMAPBT_EVENT(xfs_rmap_insert);
 DEFINE_RMAPBT_EVENT(xfs_rmap_delete);
@@ -3236,7 +3265,6 @@ DEFINE_EVENT(xfs_refcount_triple_extent_class, name, \
 
 /* refcount btree tracepoints */
 DEFINE_BUSY_EVENT(xfs_refcountbt_alloc_block);
-DEFINE_BUSY_EVENT(xfs_refcountbt_free_block);
 DEFINE_AG_BTREE_LOOKUP_EVENT(xfs_refcount_lookup);
 DEFINE_REFCOUNT_EXTENT_EVENT(xfs_refcount_get);
 DEFINE_REFCOUNT_EXTENT_EVENT(xfs_refcount_update);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 7/9] xfs: consolidate btree block allocation tracepoints
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
                     ` (5 preceding siblings ...)
  2023-05-26  1:06   ` [PATCH 6/9] xfs: consolidate btree block freeing tracepoints Darrick J. Wong
@ 2023-05-26  1:06   ` Darrick J. Wong
  2023-05-26  1:06   ` [PATCH 8/9] xfs: support in-memory btrees Darrick J. Wong
  2023-05-26  1:06   ` [PATCH 9/9] xfs: connect in-memory btrees to xfiles Darrick J. Wong
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:06 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Don't waste tracepoint segment memory on per-btree block allocation
tracepoints when we can do it from the generic btree code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_btree.c          |   20 ++++++++++++---
 fs/xfs/libxfs/xfs_refcount_btree.c |    2 -
 fs/xfs/libxfs/xfs_rmap_btree.c     |    2 -
 fs/xfs/xfs_trace.h                 |   49 +++++++++++++++++++++++++++++++++++-
 4 files changed, 64 insertions(+), 9 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 3e966182b90a..fbed51b4462e 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -2693,6 +2693,20 @@ xfs_btree_rshift(
 	return error;
 }
 
+static inline int
+xfs_btree_alloc_block(
+	struct xfs_btree_cur		*cur,
+	const union xfs_btree_ptr	*hint_block,
+	union xfs_btree_ptr		*new_block,
+	int				*stat)
+{
+	int				error;
+
+	error = cur->bc_ops->alloc_block(cur, hint_block, new_block, stat);
+	trace_xfs_btree_alloc_block(cur, new_block, *stat, error);
+	return error;
+}
+
 /*
  * Split cur/level block in half.
  * Return new block number and the key to its first
@@ -2736,7 +2750,7 @@ __xfs_btree_split(
 	xfs_btree_buf_to_ptr(cur, lbp, &lptr);
 
 	/* Allocate the new block. If we can't do it, we're toast. Give up. */
-	error = cur->bc_ops->alloc_block(cur, &lptr, &rptr, stat);
+	error = xfs_btree_alloc_block(cur, &lptr, &rptr, stat);
 	if (error)
 		goto error0;
 	if (*stat == 0)
@@ -3016,7 +3030,7 @@ xfs_btree_new_iroot(
 	pp = xfs_btree_ptr_addr(cur, 1, block);
 
 	/* Allocate the new block. If we can't do it, we're toast. Give up. */
-	error = cur->bc_ops->alloc_block(cur, pp, &nptr, stat);
+	error = xfs_btree_alloc_block(cur, pp, &nptr, stat);
 	if (error)
 		goto error0;
 	if (*stat == 0)
@@ -3116,7 +3130,7 @@ xfs_btree_new_root(
 	cur->bc_ops->init_ptr_from_cur(cur, &rptr);
 
 	/* Allocate the new block. If we can't do it, we're toast. Give up. */
-	error = cur->bc_ops->alloc_block(cur, &rptr, &lptr, stat);
+	error = xfs_btree_alloc_block(cur, &rptr, &lptr, stat);
 	if (error)
 		goto error0;
 	if (*stat == 0)
diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c
index 978f00e9e99e..c5b99f1322ba 100644
--- a/fs/xfs/libxfs/xfs_refcount_btree.c
+++ b/fs/xfs/libxfs/xfs_refcount_btree.c
@@ -77,8 +77,6 @@ xfs_refcountbt_alloc_block(
 					xfs_refc_block(args.mp)));
 	if (error)
 		goto out_error;
-	trace_xfs_refcountbt_alloc_block(cur->bc_mp, cur->bc_ag.pag->pag_agno,
-			args.agbno, 1);
 	if (args.fsbno == NULLFSBLOCK) {
 		*stat = 0;
 		return 0;
diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c
index 0dc086bc528f..43ff2236f623 100644
--- a/fs/xfs/libxfs/xfs_rmap_btree.c
+++ b/fs/xfs/libxfs/xfs_rmap_btree.c
@@ -94,8 +94,6 @@ xfs_rmapbt_alloc_block(
 				       &bno, 1);
 	if (error)
 		return error;
-
-	trace_xfs_rmapbt_alloc_block(cur->bc_mp, pag->pag_agno, bno, 1);
 	if (bno == NULLAGBLOCK) {
 		*stat = 0;
 		return 0;
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 10fb261e6c17..b1a1c90d8feb 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -2523,6 +2523,53 @@ DEFINE_EVENT(xfs_btree_cur_class, name, \
 DEFINE_BTREE_CUR_EVENT(xfs_btree_updkeys);
 DEFINE_BTREE_CUR_EVENT(xfs_btree_overlapped_query_range);
 
+TRACE_EVENT(xfs_btree_alloc_block,
+	TP_PROTO(struct xfs_btree_cur *cur, union xfs_btree_ptr *ptr, int stat,
+		 int error),
+	TP_ARGS(cur, ptr, stat, error),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_ino_t, ino)
+		__field(xfs_btnum_t, btnum)
+		__field(int, error)
+		__field(xfs_agblock_t, agbno)
+	),
+	TP_fast_assign(
+		__entry->dev = cur->bc_mp->m_super->s_dev;
+		if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) {
+			__entry->agno = 0;
+			__entry->ino = cur->bc_ino.ip->i_ino;
+		} else {
+			__entry->agno = cur->bc_ag.pag->pag_agno;
+			__entry->ino = 0;
+		}
+		__entry->btnum = cur->bc_btnum;
+		__entry->error = error;
+		if (!error && stat) {
+			if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
+				xfs_fsblock_t	fsb = be64_to_cpu(ptr->l);
+
+				__entry->agno = XFS_FSB_TO_AGNO(cur->bc_mp,
+								fsb);
+				__entry->agbno = XFS_FSB_TO_AGBNO(cur->bc_mp,
+								fsb);
+			} else {
+				__entry->agbno = be32_to_cpu(ptr->s);
+			}
+		} else {
+			__entry->agbno = NULLAGBLOCK;
+		}
+	),
+	TP_printk("dev %d:%d btree %s agno 0x%x ino 0x%llx agbno 0x%x error %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __print_symbolic(__entry->btnum, XFS_BTNUM_STRINGS),
+		  __entry->agno,
+		  __entry->ino,
+		  __entry->agbno,
+		  __entry->error)
+);
+
 TRACE_EVENT(xfs_btree_free_block,
 	TP_PROTO(struct xfs_btree_cur *cur, struct xfs_buf *bp),
 	TP_ARGS(cur, bp),
@@ -2906,7 +2953,6 @@ DEFINE_EVENT(xfs_rmapbt_class, name, \
 DEFINE_RMAP_DEFERRED_EVENT(xfs_rmap_defer);
 DEFINE_RMAP_DEFERRED_EVENT(xfs_rmap_deferred);
 
-DEFINE_BUSY_EVENT(xfs_rmapbt_alloc_block);
 DEFINE_RMAPBT_EVENT(xfs_rmap_update);
 DEFINE_RMAPBT_EVENT(xfs_rmap_insert);
 DEFINE_RMAPBT_EVENT(xfs_rmap_delete);
@@ -3264,7 +3310,6 @@ DEFINE_EVENT(xfs_refcount_triple_extent_class, name, \
 	TP_ARGS(mp, agno, i1, i2, i3))
 
 /* refcount btree tracepoints */
-DEFINE_BUSY_EVENT(xfs_refcountbt_alloc_block);
 DEFINE_AG_BTREE_LOOKUP_EVENT(xfs_refcount_lookup);
 DEFINE_REFCOUNT_EXTENT_EVENT(xfs_refcount_get);
 DEFINE_REFCOUNT_EXTENT_EVENT(xfs_refcount_update);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 8/9] xfs: support in-memory btrees
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
                     ` (6 preceding siblings ...)
  2023-05-26  1:06   ` [PATCH 7/9] xfs: consolidate btree block allocation tracepoints Darrick J. Wong
@ 2023-05-26  1:06   ` Darrick J. Wong
  2023-05-26  1:06   ` [PATCH 9/9] xfs: connect in-memory btrees to xfiles Darrick J. Wong
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:06 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Adapt the generic btree cursor code so that it can create a btree whose
buffers come from a (presumably in-memory) buftarg, with a header block
that's specific to in-memory btrees.  We'll connect this to other parts
of online scrub in the next patches.

Note that in-memory btrees always have a block size matching the system
memory page size for efficiency reasons.
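
As a quick illustration of what that implies, here is a minimal
standalone sketch of the block/sector arithmetic (assuming 4 KiB pages
and 512-byte sectors; the real helpers are the XFB_SHIFT and
xfo_to_daddr() definitions added to scrub/xfile.h below):

	/*
	 * Sketch only: models the xfile block/sector conversion, not kernel code.
	 * PAGE_SHIFT == 12 is an assumption here; the kernel derives it from
	 * the system page size.
	 */
	#include <stdio.h>

	#define PAGE_SHIFT	12			/* assumed 4 KiB pages */
	#define BBSHIFT		9			/* 512-byte basic blocks */
	#define XFB_SHIFT	(PAGE_SHIFT - BBSHIFT)	/* log2(sectors per xfile block) */

	static unsigned long long xfo_to_daddr(unsigned long long xfoff)
	{
		return xfoff << XFB_SHIFT;	/* xfile block number -> sector address */
	}

	int main(void)
	{
		printf("one btree block = %llu sectors\n", xfo_to_daddr(1));
		printf("block 3 starts at daddr %llu\n", xfo_to_daddr(3));
		return 0;
	}

With 4 KiB pages, each in-memory btree block therefore spans eight
512-byte sectors, which is the value xfbtree_bbsize() reports to the
generic btree code.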

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Kconfig                |    4 
 fs/xfs/Makefile               |    1 
 fs/xfs/libxfs/xfs_btree.c     |  151 ++++++++++++++----
 fs/xfs/libxfs/xfs_btree.h     |   17 ++
 fs/xfs/libxfs/xfs_btree_mem.h |   87 ++++++++++
 fs/xfs/scrub/xfbtree.c        |  352 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfbtree.h        |   34 ++++
 fs/xfs/scrub/xfile.h          |   46 +++++
 fs/xfs/xfs_buf.c              |   10 +
 fs/xfs/xfs_buf.h              |   10 +
 fs/xfs/xfs_buf_xfile.c        |    8 +
 fs/xfs/xfs_buf_xfile.h        |    2 
 fs/xfs/xfs_health.c           |    3 
 fs/xfs/xfs_trace.c            |    3 
 fs/xfs/xfs_trace.h            |    5 -
 15 files changed, 704 insertions(+), 29 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_btree_mem.h
 create mode 100644 fs/xfs/scrub/xfbtree.c
 create mode 100644 fs/xfs/scrub/xfbtree.h


diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 71fd486eaca1..59cbafe8310d 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -131,6 +131,9 @@ config XFS_LIVE_HOOKS
 config XFS_IN_MEMORY_FILE
 	bool
 
+config XFS_BTREE_IN_XFILE
+	bool
+
 config XFS_ONLINE_SCRUB
 	bool "XFS online metadata check support"
 	default n
@@ -188,6 +191,7 @@ config XFS_ONLINE_REPAIR
 	bool "XFS online metadata repair support"
 	default n
 	depends on XFS_FS && XFS_ONLINE_SCRUB
+	select XFS_BTREE_IN_XFILE
 	help
 	  If you say Y here you will be able to repair metadata on a
 	  mounted XFS filesystem.  This feature is intended to reduce
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index fc44611cf723..8602e14354c9 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -197,6 +197,7 @@ xfs-y				+= $(addprefix scrub/, \
 				   reap.o \
 				   refcount_repair.o \
 				   repair.o \
+				   xfbtree.o \
 				   )
 
 xfs-$(CONFIG_XFS_RT)		+= $(addprefix scrub/, \
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index fbed51b4462e..dbd048bc1e8e 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -28,6 +28,9 @@
 #include "xfs_rmap_btree.h"
 #include "xfs_refcount_btree.h"
 #include "xfs_health.h"
+#include "scrub/xfile.h"
+#include "scrub/xfbtree.h"
+#include "xfs_btree_mem.h"
 
 /*
  * Btree magic numbers.
@@ -82,6 +85,9 @@ xfs_btree_check_lblock_siblings(
 	if (level >= 0) {
 		if (!xfs_btree_check_lptr(cur, sibling, level + 1))
 			return __this_address;
+	} else if (cur && (cur->bc_flags & XFS_BTREE_IN_XFILE)) {
+		if (!xfbtree_verify_xfileoff(cur, sibling))
+			return __this_address;
 	} else {
 		if (!xfs_verify_fsbno(mp, sibling))
 			return __this_address;
@@ -109,6 +115,9 @@ xfs_btree_check_sblock_siblings(
 	if (level >= 0) {
 		if (!xfs_btree_check_sptr(cur, sibling, level + 1))
 			return __this_address;
+	} else if (cur && (cur->bc_flags & XFS_BTREE_IN_XFILE)) {
+		if (!xfbtree_verify_xfileoff(cur, sibling))
+			return __this_address;
 	} else {
 		if (!xfs_verify_agbno(pag, sibling))
 			return __this_address;
@@ -151,7 +160,9 @@ __xfs_btree_check_lblock(
 	    cur->bc_ops->get_maxrecs(cur, level))
 		return __this_address;
 
-	if (bp)
+	if ((cur->bc_flags & XFS_BTREE_IN_XFILE) && bp)
+		fsb = xfbtree_buf_to_xfoff(cur, bp);
+	else if (bp)
 		fsb = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
 
 	fa = xfs_btree_check_lblock_siblings(mp, cur, level, fsb,
@@ -218,8 +229,12 @@ __xfs_btree_check_sblock(
 	    cur->bc_ops->get_maxrecs(cur, level))
 		return __this_address;
 
-	if (bp)
+	if ((cur->bc_flags & XFS_BTREE_IN_XFILE) && bp) {
+		pag = NULL;
+		agbno = xfbtree_buf_to_xfoff(cur, bp);
+	} else if (bp) {
 		agbno = xfs_daddr_to_agbno(mp, xfs_buf_daddr(bp));
+	}
 
 	fa = xfs_btree_check_sblock_siblings(pag, cur, level, agbno,
 			block->bb_u.s.bb_leftsib);
@@ -276,6 +291,8 @@ xfs_btree_check_lptr(
 {
 	if (level <= 0)
 		return false;
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return xfbtree_verify_xfileoff(cur, fsbno);
 	return xfs_verify_fsbno(cur->bc_mp, fsbno);
 }
 
@@ -288,6 +305,8 @@ xfs_btree_check_sptr(
 {
 	if (level <= 0)
 		return false;
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return xfbtree_verify_xfileoff(cur, agbno);
 	return xfs_verify_agbno(cur->bc_ag.pag, agbno);
 }
 
@@ -302,6 +321,9 @@ xfs_btree_check_ptr(
 	int				index,
 	int				level)
 {
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return xfbtree_check_ptr(cur, ptr, index, level);
+
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
 		if (xfs_btree_check_lptr(cur, be64_to_cpu((&ptr->l)[index]),
 				level))
@@ -458,11 +480,36 @@ xfs_btree_del_cursor(
 	       xfs_is_shutdown(cur->bc_mp) || error != 0);
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
 		kmem_free(cur->bc_ops);
-	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
+	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) &&
+	    !(cur->bc_flags & XFS_BTREE_IN_XFILE) && cur->bc_ag.pag)
 		xfs_perag_put(cur->bc_ag.pag);
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE) {
+		if (cur->bc_mem.pag)
+			xfs_perag_put(cur->bc_mem.pag);
+	}
 	kmem_cache_free(cur->bc_cache, cur);
 }
 
+/* Return the buffer target for this btree's buffer. */
+static inline struct xfs_buftarg *
+xfs_btree_buftarg(
+	struct xfs_btree_cur	*cur)
+{
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return xfbtree_target(cur->bc_mem.xfbtree);
+	return cur->bc_mp->m_ddev_targp;
+}
+
+/* Return the block size (in units of 512b sectors) for this btree. */
+static inline unsigned int
+xfs_btree_bbsize(
+	struct xfs_btree_cur	*cur)
+{
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return xfbtree_bbsize();
+	return cur->bc_mp->m_bsize;
+}
+
 /*
  * Duplicate the btree cursor.
  * Allocate a new one, copy the record, re-get the buffers.
@@ -500,10 +547,11 @@ xfs_btree_dup_cursor(
 		new->bc_levels[i].ra = cur->bc_levels[i].ra;
 		bp = cur->bc_levels[i].bp;
 		if (bp) {
-			error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
-						   xfs_buf_daddr(bp), mp->m_bsize,
-						   0, &bp,
-						   cur->bc_ops->buf_ops);
+			error = xfs_trans_read_buf(mp, tp,
+					xfs_btree_buftarg(cur),
+					xfs_buf_daddr(bp),
+					xfs_btree_bbsize(cur), 0, &bp,
+					cur->bc_ops->buf_ops);
 			if (xfs_metadata_is_sick(error))
 				xfs_btree_mark_sick(new);
 			if (error) {
@@ -944,6 +992,9 @@ xfs_btree_readahead_lblock(
 	xfs_fsblock_t		left = be64_to_cpu(block->bb_u.l.bb_leftsib);
 	xfs_fsblock_t		right = be64_to_cpu(block->bb_u.l.bb_rightsib);
 
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return 0;
+
 	if ((lr & XFS_BTCUR_LEFTRA) && left != NULLFSBLOCK) {
 		xfs_btree_reada_bufl(cur->bc_mp, left, 1,
 				     cur->bc_ops->buf_ops);
@@ -969,6 +1020,8 @@ xfs_btree_readahead_sblock(
 	xfs_agblock_t		left = be32_to_cpu(block->bb_u.s.bb_leftsib);
 	xfs_agblock_t		right = be32_to_cpu(block->bb_u.s.bb_rightsib);
 
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return 0;
 
 	if ((lr & XFS_BTCUR_LEFTRA) && left != NULLAGBLOCK) {
 		xfs_btree_reada_bufs(cur->bc_mp, cur->bc_ag.pag->pag_agno,
@@ -1030,6 +1083,11 @@ xfs_btree_ptr_to_daddr(
 	if (error)
 		return error;
 
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE) {
+		*daddr = xfbtree_ptr_to_daddr(cur, ptr);
+		return 0;
+	}
+
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
 		fsbno = be64_to_cpu(ptr->l);
 		*daddr = XFS_FSB_TO_DADDR(cur->bc_mp, fsbno);
@@ -1058,8 +1116,9 @@ xfs_btree_readahead_ptr(
 
 	if (xfs_btree_ptr_to_daddr(cur, ptr, &daddr))
 		return;
-	xfs_buf_readahead(cur->bc_mp->m_ddev_targp, daddr,
-			  cur->bc_mp->m_bsize * count, cur->bc_ops->buf_ops);
+	xfs_buf_readahead(xfs_btree_buftarg(cur), daddr,
+			xfs_btree_bbsize(cur) * count,
+			cur->bc_ops->buf_ops);
 }
 
 /*
@@ -1233,7 +1292,9 @@ xfs_btree_init_block_cur(
 	 * change in future, but is safe for current users of the generic btree
 	 * code.
 	 */
-	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		owner = xfbtree_owner(cur);
+	else if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
 		owner = cur->bc_ino.ip->i_ino;
 	else
 		owner = cur->bc_ag.pag->pag_agno;
@@ -1273,6 +1334,11 @@ xfs_btree_buf_to_ptr(
 	struct xfs_buf		*bp,
 	union xfs_btree_ptr	*ptr)
 {
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE) {
+		xfbtree_buf_to_ptr(cur, bp, ptr);
+		return;
+	}
+
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
 		ptr->l = cpu_to_be64(XFS_DADDR_TO_FSB(cur->bc_mp,
 					xfs_buf_daddr(bp)));
@@ -1317,15 +1383,14 @@ xfs_btree_get_buf_block(
 	struct xfs_btree_block		**block,
 	struct xfs_buf			**bpp)
 {
-	struct xfs_mount	*mp = cur->bc_mp;
-	xfs_daddr_t		d;
-	int			error;
+	xfs_daddr_t			d;
+	int				error;
 
 	error = xfs_btree_ptr_to_daddr(cur, ptr, &d);
 	if (error)
 		return error;
-	error = xfs_trans_get_buf(cur->bc_tp, mp->m_ddev_targp, d, mp->m_bsize,
-			0, bpp);
+	error = xfs_trans_get_buf(cur->bc_tp, xfs_btree_buftarg(cur), d,
+			xfs_btree_bbsize(cur), 0, bpp);
 	if (error)
 		return error;
 
@@ -1356,9 +1421,9 @@ xfs_btree_read_buf_block(
 	error = xfs_btree_ptr_to_daddr(cur, ptr, &d);
 	if (error)
 		return error;
-	error = xfs_trans_read_buf(mp, cur->bc_tp, mp->m_ddev_targp, d,
-				   mp->m_bsize, flags, bpp,
-				   cur->bc_ops->buf_ops);
+	error = xfs_trans_read_buf(mp, cur->bc_tp, xfs_btree_buftarg(cur), d,
+			xfs_btree_bbsize(cur), flags, bpp,
+			cur->bc_ops->buf_ops);
 	if (xfs_metadata_is_sick(error))
 		xfs_btree_mark_sick(cur);
 	if (error)
@@ -1798,6 +1863,37 @@ xfs_btree_decrement(
 	return error;
 }
 
+/*
+ * Check the btree block owner now that we have the context to know who the
+ * real owner is.
+ */
+static inline xfs_failaddr_t
+xfs_btree_check_block_owner(
+	struct xfs_btree_cur	*cur,
+	struct xfs_btree_block	*block)
+{
+	if (!xfs_has_crc(cur->bc_mp))
+		return NULL;
+
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return xfbtree_check_block_owner(cur, block);
+
+	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS)) {
+		if (be32_to_cpu(block->bb_u.s.bb_owner) !=
+						cur->bc_ag.pag->pag_agno)
+			return __this_address;
+		return NULL;
+	}
+
+	if (cur->bc_ino.flags & XFS_BTCUR_BMBT_INVALID_OWNER)
+		return NULL;
+
+	if (be64_to_cpu(block->bb_u.l.bb_owner) != cur->bc_ino.ip->i_ino)
+		return __this_address;
+
+	return NULL;
+}
+
 int
 xfs_btree_lookup_get_block(
 	struct xfs_btree_cur		*cur,	/* btree cursor */
@@ -1836,11 +1932,7 @@ xfs_btree_lookup_get_block(
 		return error;
 
 	/* Check the inode owner since the verifiers don't. */
-	if (xfs_has_crc(cur->bc_mp) &&
-	    !(cur->bc_ino.flags & XFS_BTCUR_BMBT_INVALID_OWNER) &&
-	    (cur->bc_flags & XFS_BTREE_LONG_PTRS) &&
-	    be64_to_cpu((*blkp)->bb_u.l.bb_owner) !=
-			cur->bc_ino.ip->i_ino)
+	if (xfs_btree_check_block_owner(cur, *blkp) != NULL)
 		goto out_bad;
 
 	/* Did we get the level we were looking for? */
@@ -4386,7 +4478,7 @@ xfs_btree_visit_block(
 {
 	struct xfs_btree_block		*block;
 	struct xfs_buf			*bp;
-	union xfs_btree_ptr		rptr;
+	union xfs_btree_ptr		rptr, bufptr;
 	int				error;
 
 	/* do right sibling readahead */
@@ -4409,15 +4501,14 @@ xfs_btree_visit_block(
 	 * return the same block without checking if the right sibling points
 	 * back to us and creates a cyclic reference in the btree.
 	 */
+	xfs_btree_buf_to_ptr(cur, bp, &bufptr);
 	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
-		if (be64_to_cpu(rptr.l) == XFS_DADDR_TO_FSB(cur->bc_mp,
-							xfs_buf_daddr(bp))) {
+		if (rptr.l == bufptr.l) {
 			xfs_btree_mark_sick(cur);
 			return -EFSCORRUPTED;
 		}
 	} else {
-		if (be32_to_cpu(rptr.s) == xfs_daddr_to_agbno(cur->bc_mp,
-							xfs_buf_daddr(bp))) {
+		if (rptr.s == bufptr.s) {
 			xfs_btree_mark_sick(cur);
 			return -EFSCORRUPTED;
 		}
@@ -4599,6 +4690,8 @@ xfs_btree_lblock_verify(
 	xfs_fsblock_t		fsb;
 	xfs_failaddr_t		fa;
 
+	ASSERT(!(bp->b_target->bt_flags & XFS_BUFTARG_XFILE));
+
 	/* numrecs verification */
 	if (be16_to_cpu(block->bb_numrecs) > max_recs)
 		return __this_address;
@@ -4654,6 +4747,8 @@ xfs_btree_sblock_verify(
 	xfs_agblock_t		agbno;
 	xfs_failaddr_t		fa;
 
+	ASSERT(!(bp->b_target->bt_flags & XFS_BUFTARG_XFILE));
+
 	/* numrecs verification */
 	if (be16_to_cpu(block->bb_numrecs) > max_recs)
 		return __this_address;
diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
index 5525d3715d57..a1e7fb0e5806 100644
--- a/fs/xfs/libxfs/xfs_btree.h
+++ b/fs/xfs/libxfs/xfs_btree.h
@@ -248,6 +248,15 @@ struct xfs_btree_cur_ino {
 #define	XFS_BTCUR_BMBT_INVALID_OWNER	(1 << 1)
 };
 
+/* In-memory btree information */
+struct xfbtree;
+
+struct xfs_btree_cur_mem {
+	struct xfbtree			*xfbtree;
+	struct xfs_buf			*head_bp;
+	struct xfs_perag		*pag;
+};
+
 struct xfs_btree_level {
 	/* buffer pointer */
 	struct xfs_buf		*bp;
@@ -287,6 +296,7 @@ struct xfs_btree_cur
 	union {
 		struct xfs_btree_cur_ag	bc_ag;
 		struct xfs_btree_cur_ino bc_ino;
+		struct xfs_btree_cur_mem bc_mem;
 	};
 
 	/* Must be at the end of the struct! */
@@ -317,6 +327,13 @@ xfs_btree_cur_sizeof(unsigned int nlevels)
  */
 #define XFS_BTREE_STAGING		(1<<5)
 
+/* btree stored in memory; not compatible with ROOT_IN_INODE */
+#ifdef CONFIG_XFS_BTREE_IN_XFILE
+# define XFS_BTREE_IN_XFILE		(1<<7)
+#else
+# define XFS_BTREE_IN_XFILE		(0)
+#endif
+
 #define	XFS_BTREE_NOERROR	0
 #define	XFS_BTREE_ERROR		1
 
diff --git a/fs/xfs/libxfs/xfs_btree_mem.h b/fs/xfs/libxfs/xfs_btree_mem.h
new file mode 100644
index 000000000000..5e3d58175596
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_btree_mem.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_BTREE_MEM_H__
+#define __XFS_BTREE_MEM_H__
+
+struct xfbtree;
+
+#ifdef CONFIG_XFS_BTREE_IN_XFILE
+unsigned int xfs_btree_mem_head_nlevels(struct xfs_buf *head_bp);
+
+struct xfs_buftarg *xfbtree_target(struct xfbtree *xfbtree);
+int xfbtree_check_ptr(struct xfs_btree_cur *cur,
+		const union xfs_btree_ptr *ptr, int index, int level);
+xfs_daddr_t xfbtree_ptr_to_daddr(struct xfs_btree_cur *cur,
+		const union xfs_btree_ptr *ptr);
+void xfbtree_buf_to_ptr(struct xfs_btree_cur *cur, struct xfs_buf *bp,
+		union xfs_btree_ptr *ptr);
+
+unsigned int xfbtree_bbsize(void);
+
+void xfbtree_set_root(struct xfs_btree_cur *cur,
+		const union xfs_btree_ptr *ptr, int inc);
+void xfbtree_init_ptr_from_cur(struct xfs_btree_cur *cur,
+		union xfs_btree_ptr *ptr);
+struct xfs_btree_cur *xfbtree_dup_cursor(struct xfs_btree_cur *cur);
+bool xfbtree_verify_xfileoff(struct xfs_btree_cur *cur,
+		unsigned long long xfoff);
+xfs_failaddr_t xfbtree_check_block_owner(struct xfs_btree_cur *cur,
+		struct xfs_btree_block *block);
+unsigned long long xfbtree_owner(struct xfs_btree_cur *cur);
+xfs_failaddr_t xfbtree_lblock_verify(struct xfs_buf *bp, unsigned int max_recs);
+xfs_failaddr_t xfbtree_sblock_verify(struct xfs_buf *bp, unsigned int max_recs);
+unsigned long long xfbtree_buf_to_xfoff(struct xfs_btree_cur *cur,
+		struct xfs_buf *bp);
+#else
+static inline unsigned int xfs_btree_mem_head_nlevels(struct xfs_buf *head_bp)
+{
+	return 0;
+}
+
+static inline struct xfs_buftarg *
+xfbtree_target(struct xfbtree *xfbtree)
+{
+	return NULL;
+}
+
+static inline int
+xfbtree_check_ptr(struct xfs_btree_cur *cur, const union xfs_btree_ptr *ptr,
+		  int index, int level)
+{
+	return 0;
+}
+
+static inline xfs_daddr_t
+xfbtree_ptr_to_daddr(struct xfs_btree_cur *cur, const union xfs_btree_ptr *ptr)
+{
+	return 0;
+}
+
+static inline void
+xfbtree_buf_to_ptr(
+	struct xfs_btree_cur	*cur,
+	struct xfs_buf		*bp,
+	union xfs_btree_ptr	*ptr)
+{
+	memset(ptr, 0xFF, sizeof(*ptr));
+}
+
+static inline unsigned int xfbtree_bbsize(void)
+{
+	return 0;
+}
+
+#define xfbtree_set_root			NULL
+#define xfbtree_init_ptr_from_cur		NULL
+#define xfbtree_dup_cursor			NULL
+#define xfbtree_verify_xfileoff(cur, xfoff)	(false)
+#define xfbtree_check_block_owner(cur, block)	NULL
+#define xfbtree_owner(cur)			(0ULL)
+#define xfbtree_buf_to_xfoff(cur, bp)		(-1)
+
+#endif /* CONFIG_XFS_BTREE_IN_XFILE */
+
+#endif /* __XFS_BTREE_MEM_H__ */
diff --git a/fs/xfs/scrub/xfbtree.c b/fs/xfs/scrub/xfbtree.c
new file mode 100644
index 000000000000..41aed95a1ee7
--- /dev/null
+++ b/fs/xfs/scrub/xfbtree.c
@@ -0,0 +1,352 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_trans.h"
+#include "xfs_btree.h"
+#include "xfs_error.h"
+#include "xfs_btree_mem.h"
+#include "xfs_ag.h"
+#include "scrub/xfile.h"
+#include "scrub/xfbtree.h"
+
+/* btree ops functions for in-memory btrees. */
+
+static xfs_failaddr_t
+xfs_btree_mem_head_verify(
+	struct xfs_buf			*bp)
+{
+	struct xfs_btree_mem_head	*mhead = bp->b_addr;
+	struct xfs_mount		*mp = bp->b_mount;
+
+	if (!xfs_verify_magic(bp, mhead->mh_magic))
+		return __this_address;
+	if (be32_to_cpu(mhead->mh_nlevels) == 0)
+		return __this_address;
+	if (!uuid_equal(&mhead->mh_uuid, &mp->m_sb.sb_meta_uuid))
+		return __this_address;
+
+	return NULL;
+}
+
+static void
+xfs_btree_mem_head_read_verify(
+	struct xfs_buf		*bp)
+{
+	xfs_failaddr_t		fa = xfs_btree_mem_head_verify(bp);
+
+	if (fa)
+		xfs_verifier_error(bp, -EFSCORRUPTED, fa);
+}
+
+static void
+xfs_btree_mem_head_write_verify(
+	struct xfs_buf		*bp)
+{
+	xfs_failaddr_t		fa = xfs_btree_mem_head_verify(bp);
+
+	if (fa)
+		xfs_verifier_error(bp, -EFSCORRUPTED, fa);
+}
+
+static const struct xfs_buf_ops xfs_btree_mem_head_buf_ops = {
+	.name			= "xfs_btree_mem_head",
+	.magic			= { cpu_to_be32(XFS_BTREE_MEM_HEAD_MAGIC),
+				    cpu_to_be32(XFS_BTREE_MEM_HEAD_MAGIC) },
+	.verify_read		= xfs_btree_mem_head_read_verify,
+	.verify_write		= xfs_btree_mem_head_write_verify,
+	.verify_struct		= xfs_btree_mem_head_verify,
+};
+
+/* Initialize the header block for an in-memory btree. */
+static inline void
+xfs_btree_mem_head_init(
+	struct xfs_buf			*head_bp,
+	unsigned long long		owner,
+	xfileoff_t			leaf_xfoff)
+{
+	struct xfs_btree_mem_head	*mhead = head_bp->b_addr;
+	struct xfs_mount		*mp = head_bp->b_mount;
+
+	mhead->mh_magic = cpu_to_be32(XFS_BTREE_MEM_HEAD_MAGIC);
+	mhead->mh_nlevels = cpu_to_be32(1);
+	mhead->mh_owner = cpu_to_be64(owner);
+	mhead->mh_root = cpu_to_be64(leaf_xfoff);
+	uuid_copy(&mhead->mh_uuid, &mp->m_sb.sb_meta_uuid);
+
+	head_bp->b_ops = &xfs_btree_mem_head_buf_ops;
+}
+
+/* Return tree height from the in-memory btree head. */
+unsigned int
+xfs_btree_mem_head_nlevels(
+	struct xfs_buf			*head_bp)
+{
+	struct xfs_btree_mem_head	*mhead = head_bp->b_addr;
+
+	return be32_to_cpu(mhead->mh_nlevels);
+}
+
+/* Extract the buftarg target for this xfile btree. */
+struct xfs_buftarg *
+xfbtree_target(struct xfbtree *xfbtree)
+{
+	return xfbtree->target;
+}
+
+/* Is this daddr (sector offset) contained within the buffer target? */
+static inline bool
+xfbtree_verify_buftarg_xfileoff(
+	struct xfs_buftarg	*btp,
+	xfileoff_t		xfoff)
+{
+	xfs_daddr_t		xfoff_daddr = xfo_to_daddr(xfoff);
+
+	return xfs_buftarg_verify_daddr(btp, xfoff_daddr);
+}
+
+/* Is this btree xfile offset contained within the xfile? */
+bool
+xfbtree_verify_xfileoff(
+	struct xfs_btree_cur	*cur,
+	unsigned long long	xfoff)
+{
+	struct xfs_buftarg	*btp = xfbtree_target(cur->bc_mem.xfbtree);
+
+	return xfbtree_verify_buftarg_xfileoff(btp, xfoff);
+}
+
+/* Check if a btree pointer is reasonable. */
+int
+xfbtree_check_ptr(
+	struct xfs_btree_cur		*cur,
+	const union xfs_btree_ptr	*ptr,
+	int				index,
+	int				level)
+{
+	xfileoff_t			bt_xfoff;
+	xfs_failaddr_t			fa = NULL;
+
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
+		bt_xfoff = be64_to_cpu(ptr->l);
+	else
+		bt_xfoff = be32_to_cpu(ptr->s);
+
+	if (!xfbtree_verify_xfileoff(cur, bt_xfoff))
+		fa = __this_address;
+
+	if (fa) {
+		xfs_err(cur->bc_mp,
+"In-memory: Corrupt btree %d flags 0x%x pointer at level %d index %d fa %pS.",
+				cur->bc_btnum, cur->bc_flags, level, index,
+				fa);
+		return -EFSCORRUPTED;
+	}
+	return 0;
+}
+
+/* Convert a btree pointer to a daddr */
+xfs_daddr_t
+xfbtree_ptr_to_daddr(
+	struct xfs_btree_cur		*cur,
+	const union xfs_btree_ptr	*ptr)
+{
+	xfileoff_t			bt_xfoff;
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
+		bt_xfoff = be64_to_cpu(ptr->l);
+	else
+		bt_xfoff = be32_to_cpu(ptr->s);
+	return xfo_to_daddr(bt_xfoff);
+}
+
+/* Set the pointer to point to this buffer. */
+void
+xfbtree_buf_to_ptr(
+	struct xfs_btree_cur	*cur,
+	struct xfs_buf		*bp,
+	union xfs_btree_ptr	*ptr)
+{
+	xfileoff_t		xfoff = xfs_daddr_to_xfo(xfs_buf_daddr(bp));
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
+		ptr->l = cpu_to_be64(xfoff);
+	else
+		ptr->s = cpu_to_be32(xfoff);
+}
+
+/* Return the in-memory btree block size, in units of 512 bytes. */
+unsigned int xfbtree_bbsize(void)
+{
+	return xfo_to_daddr(1);
+}
+
+/* Set the root of an in-memory btree. */
+void
+xfbtree_set_root(
+	struct xfs_btree_cur		*cur,
+	const union xfs_btree_ptr	*ptr,
+	int				inc)
+{
+	struct xfs_buf			*head_bp = cur->bc_mem.head_bp;
+	struct xfs_btree_mem_head	*mhead = head_bp->b_addr;
+
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
+		mhead->mh_root = ptr->l;
+	} else {
+		uint32_t		root = be32_to_cpu(ptr->s);
+
+		mhead->mh_root = cpu_to_be64(root);
+	}
+	be32_add_cpu(&mhead->mh_nlevels, inc);
+	xfs_trans_log_buf(cur->bc_tp, head_bp, 0, sizeof(*mhead) - 1);
+}
+
+/* Initialize a pointer from the in-memory btree header. */
+void
+xfbtree_init_ptr_from_cur(
+	struct xfs_btree_cur		*cur,
+	union xfs_btree_ptr		*ptr)
+{
+	struct xfs_buf			*head_bp = cur->bc_mem.head_bp;
+	struct xfs_btree_mem_head	*mhead = head_bp->b_addr;
+
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
+		ptr->l = mhead->mh_root;
+	} else {
+		uint64_t		root = be64_to_cpu(mhead->mh_root);
+
+		ptr->s = cpu_to_be32(root);
+	}
+}
+
+/* Duplicate an in-memory btree cursor. */
+struct xfs_btree_cur *
+xfbtree_dup_cursor(
+	struct xfs_btree_cur		*cur)
+{
+	struct xfs_btree_cur		*ncur;
+
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	ncur = xfs_btree_alloc_cursor(cur->bc_mp, cur->bc_tp, cur->bc_btnum,
+			cur->bc_maxlevels, cur->bc_cache);
+	ncur->bc_flags = cur->bc_flags;
+	ncur->bc_nlevels = cur->bc_nlevels;
+	ncur->bc_statoff = cur->bc_statoff;
+	ncur->bc_ops = cur->bc_ops;
+	memcpy(&ncur->bc_mem, &cur->bc_mem, sizeof(cur->bc_mem));
+
+	if (cur->bc_mem.pag)
+		ncur->bc_mem.pag = xfs_perag_hold(cur->bc_mem.pag);
+
+	return ncur;
+}
+
+/* Check the owner of an in-memory btree block. */
+xfs_failaddr_t
+xfbtree_check_block_owner(
+	struct xfs_btree_cur	*cur,
+	struct xfs_btree_block	*block)
+{
+	struct xfbtree		*xfbt = cur->bc_mem.xfbtree;
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
+		if (be64_to_cpu(block->bb_u.l.bb_owner) != xfbt->owner)
+			return __this_address;
+
+		return NULL;
+	}
+
+	if (be32_to_cpu(block->bb_u.s.bb_owner) != xfbt->owner)
+		return __this_address;
+
+	return NULL;
+}
+
+/* Return the owner of this in-memory btree. */
+unsigned long long
+xfbtree_owner(
+	struct xfs_btree_cur	*cur)
+{
+	return cur->bc_mem.xfbtree->owner;
+}
+
+/* Return the xfile offset (in blocks) of a btree buffer. */
+unsigned long long
+xfbtree_buf_to_xfoff(
+	struct xfs_btree_cur	*cur,
+	struct xfs_buf		*bp)
+{
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	return xfs_daddr_to_xfo(xfs_buf_daddr(bp));
+}
+
+/* Verify a long-format btree block. */
+xfs_failaddr_t
+xfbtree_lblock_verify(
+	struct xfs_buf		*bp,
+	unsigned int		max_recs)
+{
+	struct xfs_btree_block	*block = XFS_BUF_TO_BLOCK(bp);
+	struct xfs_buftarg	*btp = bp->b_target;
+
+	/* numrecs verification */
+	if (be16_to_cpu(block->bb_numrecs) > max_recs)
+		return __this_address;
+
+	/* sibling pointer verification */
+	if (block->bb_u.l.bb_leftsib != cpu_to_be64(NULLFSBLOCK) &&
+	    !xfbtree_verify_buftarg_xfileoff(btp,
+				be64_to_cpu(block->bb_u.l.bb_leftsib)))
+		return __this_address;
+
+	if (block->bb_u.l.bb_rightsib != cpu_to_be64(NULLFSBLOCK) &&
+	    !xfbtree_verify_buftarg_xfileoff(btp,
+				be64_to_cpu(block->bb_u.l.bb_rightsib)))
+		return __this_address;
+
+	return NULL;
+}
+
+/* Verify a short-format btree block. */
+xfs_failaddr_t
+xfbtree_sblock_verify(
+	struct xfs_buf		*bp,
+	unsigned int		max_recs)
+{
+	struct xfs_btree_block	*block = XFS_BUF_TO_BLOCK(bp);
+	struct xfs_buftarg	*btp = bp->b_target;
+
+	/* numrecs verification */
+	if (be16_to_cpu(block->bb_numrecs) > max_recs)
+		return __this_address;
+
+	/* sibling pointer verification */
+	if (block->bb_u.s.bb_leftsib != cpu_to_be32(NULLAGBLOCK) &&
+	    !xfbtree_verify_buftarg_xfileoff(btp,
+				be32_to_cpu(block->bb_u.s.bb_leftsib)))
+		return __this_address;
+
+	if (block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK) &&
+	    !xfbtree_verify_buftarg_xfileoff(btp,
+				be32_to_cpu(block->bb_u.s.bb_rightsib)))
+		return __this_address;
+
+	return NULL;
+}
diff --git a/fs/xfs/scrub/xfbtree.h b/fs/xfs/scrub/xfbtree.h
new file mode 100644
index 000000000000..e8d8c67641f8
--- /dev/null
+++ b/fs/xfs/scrub/xfbtree.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef XFS_SCRUB_XFBTREE_H__
+#define XFS_SCRUB_XFBTREE_H__
+
+#ifdef CONFIG_XFS_BTREE_IN_XFILE
+
+/* Root block for an in-memory btree. */
+struct xfs_btree_mem_head {
+	__be32				mh_magic;
+	__be32				mh_nlevels;
+	__be64				mh_owner;
+	__be64				mh_root;
+	uuid_t				mh_uuid;
+};
+
+#define XFS_BTREE_MEM_HEAD_MAGIC	0x4341544D	/* "CATM" */
+
+/* xfile-backed in-memory btrees */
+
+struct xfbtree {
+	/* buffer cache target for this in-memory btree */
+	struct xfs_buftarg		*target;
+
+	/* Owner of this btree. */
+	unsigned long long		owner;
+};
+
+#endif /* CONFIG_XFS_BTREE_IN_XFILE */
+
+#endif /* XFS_SCRUB_XFBTREE_H__ */
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
index 083348b4cdaf..c6d7851b01ca 100644
--- a/fs/xfs/scrub/xfile.h
+++ b/fs/xfs/scrub/xfile.h
@@ -79,6 +79,47 @@ int xfile_get_page(struct xfile *xf, loff_t offset, unsigned int len,
 int xfile_put_page(struct xfile *xf, struct xfile_page *xbuf);
 
 int xfile_dump(struct xfile *xf);
+
+static inline loff_t xfile_size(struct xfile *xf)
+{
+	return i_size_read(file_inode(xf->file));
+}
+
+/* file block (aka system page size) to basic block conversions. */
+typedef unsigned long long	xfileoff_t;
+#define XFB_BLOCKSIZE		(PAGE_SIZE)
+#define XFB_BSHIFT		(PAGE_SHIFT)
+#define XFB_SHIFT		(XFB_BSHIFT - BBSHIFT)
+
+static inline loff_t xfo_to_b(xfileoff_t xfoff)
+{
+	return xfoff << XFB_BSHIFT;
+}
+
+static inline xfileoff_t b_to_xfo(loff_t pos)
+{
+	return (pos + (XFB_BLOCKSIZE - 1)) >> XFB_BSHIFT;
+}
+
+static inline xfileoff_t b_to_xfot(loff_t pos)
+{
+	return pos >> XFB_BSHIFT;
+}
+
+static inline xfs_daddr_t xfo_to_daddr(xfileoff_t xfoff)
+{
+	return xfoff << XFB_SHIFT;
+}
+
+static inline xfileoff_t xfs_daddr_to_xfo(xfs_daddr_t bb)
+{
+	return (bb + (xfo_to_daddr(1) - 1)) >> XFB_SHIFT;
+}
+
+static inline xfileoff_t xfs_daddr_to_xfot(xfs_daddr_t bb)
+{
+	return bb >> XFB_SHIFT;
+}
 #else
 static inline int
 xfile_obj_load(struct xfile *xf, void *buf, size_t count, loff_t offset)
@@ -91,6 +132,11 @@ xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t offset)
 {
 	return -EIO;
 }
+
+static inline loff_t xfile_size(struct xfile *xf)
+{
+	return 0;
+}
 #endif /* CONFIG_XFS_IN_MEMORY_FILE */
 
 #endif /* __XFS_SCRUB_XFILE_H__ */
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index e3f24594e575..2d717808ef7a 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -2486,3 +2486,13 @@ xfs_verify_magic16(
 		return false;
 	return dmagic == bp->b_ops->magic16[idx];
 }
+
+/* Return the number of sectors for a buffer target. */
+xfs_daddr_t
+xfs_buftarg_nr_sectors(
+	struct xfs_buftarg	*btp)
+{
+	if (btp->bt_flags & XFS_BUFTARG_XFILE)
+		return xfile_buftarg_nr_sectors(btp);
+	return bdev_nr_sectors(btp->bt_bdev);
+}
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 90b67a11e3c1..661cd16ff64e 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -438,6 +438,16 @@ xfs_buftarg_zeroout(
 			flags);
 }
 
+xfs_daddr_t xfs_buftarg_nr_sectors(struct xfs_buftarg *btp);
+
+static inline bool
+xfs_buftarg_verify_daddr(
+	struct xfs_buftarg	*btp,
+	xfs_daddr_t		daddr)
+{
+	return daddr < xfs_buftarg_nr_sectors(btp);
+}
+
 int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
 bool xfs_verify_magic(struct xfs_buf *bp, __be32 dmagic);
 bool xfs_verify_magic16(struct xfs_buf *bp, __be16 dmagic);
diff --git a/fs/xfs/xfs_buf_xfile.c b/fs/xfs/xfs_buf_xfile.c
index 69f1d62e0fcb..61cc9b1dbed6 100644
--- a/fs/xfs/xfs_buf_xfile.c
+++ b/fs/xfs/xfs_buf_xfile.c
@@ -87,3 +87,11 @@ xfile_free_buftarg(
 	xfs_buf_cache_destroy(&xfile->bcache);
 	xfile_destroy(xfile);
 }
+
+/* Sector count for this xfile buftarg. */
+xfs_daddr_t
+xfile_buftarg_nr_sectors(
+	struct xfs_buftarg	*btp)
+{
+	return xfile_size(btp->bt_xfile) >> SECTOR_SHIFT;
+}
diff --git a/fs/xfs/xfs_buf_xfile.h b/fs/xfs/xfs_buf_xfile.h
index 29efaf06a676..c3f0bb31a31a 100644
--- a/fs/xfs/xfs_buf_xfile.h
+++ b/fs/xfs/xfs_buf_xfile.h
@@ -11,8 +11,10 @@ int xfile_buf_ioapply(struct xfs_buf *bp);
 int xfile_alloc_buftarg(struct xfs_mount *mp, const char *descr,
 		struct xfs_buftarg **btpp);
 void xfile_free_buftarg(struct xfs_buftarg *btp);
+xfs_daddr_t xfile_buftarg_nr_sectors(struct xfs_buftarg *btp);
 #else
 # define xfile_buf_ioapply(bp)			(-EOPNOTSUPP)
+# define xfile_buftarg_nr_sectors(btp)		(0)
 #endif /* CONFIG_XFS_IN_MEMORY_FILE */
 
 #endif /* __XFS_BUF_XFILE_H__ */
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 74a4620d763b..93ebf6f9807f 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -508,6 +508,9 @@ xfs_btree_mark_sick(
 {
 	unsigned int			mask;
 
+	if (cur->bc_flags & XFS_BTREE_IN_XFILE)
+		return;
+
 	switch (cur->bc_btnum) {
 	case XFS_BTNUM_BMAP:
 		xfs_bmap_mark_sick(cur->bc_ino.ip, cur->bc_ino.whichfork);
diff --git a/fs/xfs/xfs_trace.c b/fs/xfs/xfs_trace.c
index 8a5dc1538aa8..2d49310fb912 100644
--- a/fs/xfs/xfs_trace.c
+++ b/fs/xfs/xfs_trace.c
@@ -36,6 +36,9 @@
 #include "xfs_error.h"
 #include <linux/iomap.h>
 #include "xfs_iomap.h"
+#include "scrub/xfile.h"
+#include "scrub/xfbtree.h"
+#include "xfs_btree_mem.h"
 
 /*
  * We include this last to have the helpers above available for the trace
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index b1a1c90d8feb..ab9217c1c3d8 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -2537,7 +2537,10 @@ TRACE_EVENT(xfs_btree_alloc_block,
 	),
 	TP_fast_assign(
 		__entry->dev = cur->bc_mp->m_super->s_dev;
-		if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) {
+		if (cur->bc_flags & XFS_BTREE_IN_XFILE) {
+			__entry->agno = 0;
+			__entry->ino = 0;
+		} else if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) {
 			__entry->agno = 0;
 			__entry->ino = cur->bc_ino.ip->i_ino;
 		} else {


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 9/9] xfs: connect in-memory btrees to xfiles
  2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
                     ` (7 preceding siblings ...)
  2023-05-26  1:06   ` [PATCH 8/9] xfs: support in-memory btrees Darrick J. Wong
@ 2023-05-26  1:06   ` Darrick J. Wong
  8 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:06 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Add to our stubbed-out in-memory btrees the ability to connect them to
an actual in-memory backing file (aka an xfile), along with the pieces
needed to track free space in the xfile and to flush dirty xfbtree
buffers on demand.  Online repair will need all of this.
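
A minimal userspace sketch of the allocation policy implemented below,
under the assumption (as in this patch) that blocks 0 and 1 are reserved
for the btree head and the initial leaf; the kernel version uses struct
xbitmap and xfile_prealloc() instead of the toy array here:

	/*
	 * Sketch of the xfbtree block allocator: reuse a freed block from the
	 * free-space bitmap if one exists, otherwise extend the xfile by one
	 * block.  Standalone model, not the kernel implementation.
	 */
	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	#define MAX_BLOCKS	64

	struct toy_xfbtree {
		bool		free[MAX_BLOCKS];	/* stand-in for the xbitmap */
		uint64_t	used;			/* xfile blocks handed out so far */
	};

	static uint64_t toy_alloc_block(struct toy_xfbtree *xfbt)
	{
		for (uint64_t i = 0; i < xfbt->used; i++) {
			if (xfbt->free[i]) {		/* first set bit: take it */
				xfbt->free[i] = false;
				return i;
			}
		}
		return xfbt->used++;			/* nothing free: grow the file */
	}

	static void toy_free_block(struct toy_xfbtree *xfbt, uint64_t blk)
	{
		xfbt->free[blk] = true;			/* mark block for reuse */
	}

	int main(void)
	{
		struct toy_xfbtree xfbt = { .used = 2 };	/* head + initial leaf */
		uint64_t a = toy_alloc_block(&xfbt);	/* -> 2, extends the file */

		toy_free_block(&xfbt, a);
		uint64_t b = toy_alloc_block(&xfbt);	/* -> 2 again, reused */

		printf("a=%llu b=%llu used=%llu\n",
		       (unsigned long long)a, (unsigned long long)b,
		       (unsigned long long)xfbt.used);
		return 0;
	}

Freed blocks go back into the bitmap rather than shrinking the file, so
the xfile need not be truncated while a repair is still staging records.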

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_btree_mem.h |   41 ++++
 fs/xfs/scrub/bitmap.c         |   28 ++
 fs/xfs/scrub/bitmap.h         |    3 
 fs/xfs/scrub/scrub.c          |    5 
 fs/xfs/scrub/scrub.h          |    3 
 fs/xfs/scrub/trace.c          |   12 +
 fs/xfs/scrub/trace.h          |  110 ++++++++++
 fs/xfs/scrub/xfbtree.c        |  466 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfbtree.h        |   25 ++
 fs/xfs/scrub/xfile.c          |   83 +++++++
 fs/xfs/scrub/xfile.h          |    2 
 fs/xfs/xfs_trace.h            |    1 
 fs/xfs/xfs_trans.h            |    1 
 fs/xfs/xfs_trans_buf.c        |   42 ++++
 14 files changed, 820 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_btree_mem.h b/fs/xfs/libxfs/xfs_btree_mem.h
index 5e3d58175596..c82d3e6d220a 100644
--- a/fs/xfs/libxfs/xfs_btree_mem.h
+++ b/fs/xfs/libxfs/xfs_btree_mem.h
@@ -8,6 +8,26 @@
 
 struct xfbtree;
 
+struct xfbtree_config {
+	/* Buffer ops for the btree root block */
+	const struct xfs_btree_ops	*btree_ops;
+
+	/* Buffer target for the xfile backing this btree. */
+	struct xfs_buftarg		*target;
+
+	/* Owner of this btree. */
+	unsigned long long		owner;
+
+	/* Btree type number */
+	xfs_btnum_t			btnum;
+
+	/* XFBTREE_CREATE_* flags */
+	unsigned int			flags;
+};
+
+/* btree has long pointers */
+#define XFBTREE_CREATE_LONG_PTRS	(1U << 0)
+
 #ifdef CONFIG_XFS_BTREE_IN_XFILE
 unsigned int xfs_btree_mem_head_nlevels(struct xfs_buf *head_bp);
 
@@ -35,6 +55,16 @@ xfs_failaddr_t xfbtree_lblock_verify(struct xfs_buf *bp, unsigned int max_recs);
 xfs_failaddr_t xfbtree_sblock_verify(struct xfs_buf *bp, unsigned int max_recs);
 unsigned long long xfbtree_buf_to_xfoff(struct xfs_btree_cur *cur,
 		struct xfs_buf *bp);
+
+int xfbtree_get_minrecs(struct xfs_btree_cur *cur, int level);
+int xfbtree_get_maxrecs(struct xfs_btree_cur *cur, int level);
+
+int xfbtree_create(struct xfs_mount *mp, const struct xfbtree_config *cfg,
+		struct xfbtree **xfbtreep);
+int xfbtree_alloc_block(struct xfs_btree_cur *cur,
+		const union xfs_btree_ptr *start, union xfs_btree_ptr *ptr,
+		int *stat);
+int xfbtree_free_block(struct xfs_btree_cur *cur, struct xfs_buf *bp);
 #else
 static inline unsigned int xfs_btree_mem_head_nlevels(struct xfs_buf *head_bp)
 {
@@ -77,11 +107,22 @@ static inline unsigned int xfbtree_bbsize(void)
 #define xfbtree_set_root			NULL
 #define xfbtree_init_ptr_from_cur		NULL
 #define xfbtree_dup_cursor			NULL
+#define xfbtree_get_minrecs			NULL
+#define xfbtree_get_maxrecs			NULL
+#define xfbtree_alloc_block			NULL
+#define xfbtree_free_block			NULL
 #define xfbtree_verify_xfileoff(cur, xfoff)	(false)
 #define xfbtree_check_block_owner(cur, block)	NULL
 #define xfbtree_owner(cur)			(0ULL)
 #define xfbtree_buf_to_xfoff(cur, bp)		(-1)
 
+static inline int
+xfbtree_create(struct xfs_mount *mp, const struct xfbtree_config *cfg,
+		struct xfbtree **xfbtreep)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif /* CONFIG_XFS_BTREE_IN_XFILE */
 
 #endif /* __XFS_BTREE_MEM_H__ */
diff --git a/fs/xfs/scrub/bitmap.c b/fs/xfs/scrub/bitmap.c
index e0c89a9a0ca0..d74f706ff33c 100644
--- a/fs/xfs/scrub/bitmap.c
+++ b/fs/xfs/scrub/bitmap.c
@@ -379,3 +379,31 @@ xbitmap_test(
 	*len = bn->bn_start - start;
 	return false;
 }
+
+/*
+ * Find the first set bit in this bitmap, clear it, and return the index of
+ * that bit in @valp.  Returns -ENODATA if no bits were set, or the usual
+ * negative errno.
+ */
+int
+xbitmap_take_first_set(
+	struct xbitmap		*bitmap,
+	uint64_t		start,
+	uint64_t		last,
+	uint64_t		*valp)
+{
+	struct xbitmap_node	*bn;
+	uint64_t		val;
+	int			error;
+
+	bn = xbitmap_tree_iter_first(&bitmap->xb_root, start, last);
+	if (!bn)
+		return -ENODATA;
+
+	val = bn->bn_start;
+	error = xbitmap_clear(bitmap, bn->bn_start, 1);
+	if (error)
+		return error;
+	*valp = val;
+	return 0;
+}
diff --git a/fs/xfs/scrub/bitmap.h b/fs/xfs/scrub/bitmap.h
index 2518e642f4d3..8159a3c4173d 100644
--- a/fs/xfs/scrub/bitmap.h
+++ b/fs/xfs/scrub/bitmap.h
@@ -32,6 +32,9 @@ int xbitmap_walk(struct xbitmap *bitmap, xbitmap_walk_fn fn,
 bool xbitmap_empty(struct xbitmap *bitmap);
 bool xbitmap_test(struct xbitmap *bitmap, uint64_t start, uint64_t *len);
 
+int xbitmap_take_first_set(struct xbitmap *bitmap, uint64_t start,
+		uint64_t last, uint64_t *valp);
+
 /* Bitmaps, but for type-checked for xfs_agblock_t */
 
 struct xagb_bitmap {
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index cf8e78c16670..e57c8e7ad48a 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -17,6 +17,7 @@
 #include "xfs_scrub.h"
 #include "xfs_btree.h"
 #include "xfs_btree_staging.h"
+#include "xfs_buf_xfile.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
 #include "scrub/trace.h"
@@ -191,6 +192,10 @@ xchk_teardown(
 		sc->flags &= ~XCHK_HAVE_FREEZE_PROT;
 		mnt_drop_write_file(sc->file);
 	}
+	if (sc->xfile_buftarg) {
+		xfile_free_buftarg(sc->xfile_buftarg);
+		sc->xfile_buftarg = NULL;
+	}
 	if (sc->xfile) {
 		xfile_destroy(sc->xfile);
 		sc->xfile = NULL;
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index a41ba8d319b6..2f8da220c9e7 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -99,6 +99,9 @@ struct xfs_scrub {
 	/* xfile used by the scrubbers; freed at teardown. */
 	struct xfile			*xfile;
 
+	/* buffer target for the xfile; also freed at teardown. */
+	struct xfs_buftarg		*xfile_buftarg;
+
 	/* Lock flags for @ip. */
 	uint				ilock_flags;
 
diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c
index 1fe5c5a9a1ba..d3164c59b0ba 100644
--- a/fs/xfs/scrub/trace.c
+++ b/fs/xfs/scrub/trace.c
@@ -12,15 +12,18 @@
 #include "xfs_mount.h"
 #include "xfs_inode.h"
 #include "xfs_btree.h"
+#include "xfs_btree_mem.h"
 #include "xfs_ag.h"
 #include "xfs_quota_defs.h"
 #include "xfs_dir2.h"
+#include "xfs_da_format.h"
 #include "scrub/scrub.h"
 #include "scrub/xfile.h"
 #include "scrub/xfarray.h"
 #include "scrub/iscan.h"
 #include "scrub/nlinks.h"
 #include "scrub/fscounters.h"
+#include "scrub/xfbtree.h"
 
 /* Figure out which block the btree cursor was pointing to. */
 static inline xfs_fsblock_t
@@ -39,6 +42,15 @@ xchk_btree_cur_fsbno(
 	return NULLFSBLOCK;
 }
 
+#ifdef CONFIG_XFS_BTREE_IN_XFILE
+static inline unsigned long
+xfbtree_ino(
+	struct xfbtree		*xfbt)
+{
+	return file_inode(xfbt->target->bt_xfile->file)->i_ino;
+}
+#endif /* CONFIG_XFS_BTREE_IN_XFILE */
+
 /*
  * We include this last to have the helpers above available for the trace
  * event implementations.
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 4aefa0533a12..edc86a06da21 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -24,6 +24,8 @@ struct xfarray_sortinfo;
 struct xchk_iscan;
 struct xchk_nlink;
 struct xchk_fscounters;
+struct xfbtree;
+struct xfbtree_config;
 
 /*
  * ftrace's __print_symbolic requires that all enum values be wrapped in the
@@ -866,6 +868,8 @@ DEFINE_XFILE_EVENT(xfile_pwrite);
 DEFINE_XFILE_EVENT(xfile_seek_data);
 DEFINE_XFILE_EVENT(xfile_get_page);
 DEFINE_XFILE_EVENT(xfile_put_page);
+DEFINE_XFILE_EVENT(xfile_discard);
+DEFINE_XFILE_EVENT(xfile_prealloc);
 
 TRACE_EVENT(xfarray_create,
 	TP_PROTO(struct xfarray *xfa, unsigned long long required_capacity),
@@ -2023,8 +2027,114 @@ DEFINE_XREP_DQUOT_EVENT(xrep_quotacheck_dquot);
 DEFINE_SCRUB_NLINKS_DIFF_EVENT(xrep_nlinks_update_inode);
 DEFINE_SCRUB_NLINKS_DIFF_EVENT(xrep_nlinks_unfixable_inode);
 
+TRACE_EVENT(xfbtree_create,
+	TP_PROTO(struct xfs_mount *mp, const struct xfbtree_config *cfg,
+		 struct xfbtree *xfbt),
+	TP_ARGS(mp, cfg, xfbt),
+	TP_STRUCT__entry(
+		__field(xfs_btnum_t, btnum)
+		__field(unsigned int, xfbtree_flags)
+		__field(unsigned long, xfino)
+		__field(unsigned int, leaf_mxr)
+		__field(unsigned int, leaf_mnr)
+		__field(unsigned int, node_mxr)
+		__field(unsigned int, node_mnr)
+		__field(unsigned long long, owner)
+	),
+	TP_fast_assign(
+		__entry->btnum = cfg->btnum;
+		__entry->xfbtree_flags = cfg->flags;
+		__entry->xfino = xfbtree_ino(xfbt);
+		__entry->leaf_mxr = xfbt->maxrecs[0];
+		__entry->node_mxr = xfbt->maxrecs[1];
+		__entry->leaf_mnr = xfbt->minrecs[0];
+		__entry->node_mnr = xfbt->minrecs[1];
+		__entry->owner = cfg->owner;
+	),
+	TP_printk("xfino 0x%lx btnum %s owner 0x%llx leaf_mxr %u leaf_mnr %u node_mxr %u node_mnr %u",
+		  __entry->xfino,
+		  __print_symbolic(__entry->btnum, XFS_BTNUM_STRINGS),
+		  __entry->owner,
+		  __entry->leaf_mxr,
+		  __entry->leaf_mnr,
+		  __entry->node_mxr,
+		  __entry->node_mnr)
+);
+
+DECLARE_EVENT_CLASS(xfbtree_buf_class,
+	TP_PROTO(struct xfbtree *xfbt, struct xfs_buf *bp),
+	TP_ARGS(xfbt, bp),
+	TP_STRUCT__entry(
+		__field(unsigned long, xfino)
+		__field(xfs_daddr_t, bno)
+		__field(int, nblks)
+		__field(int, hold)
+		__field(int, pincount)
+		__field(unsigned, lockval)
+		__field(unsigned, flags)
+	),
+	TP_fast_assign(
+		__entry->xfino = xfbtree_ino(xfbt);
+		__entry->bno = xfs_buf_daddr(bp);
+		__entry->nblks = bp->b_length;
+		__entry->hold = atomic_read(&bp->b_hold);
+		__entry->pincount = atomic_read(&bp->b_pin_count);
+		__entry->lockval = bp->b_sema.count;
+		__entry->flags = bp->b_flags;
+	),
+	TP_printk("xfino 0x%lx daddr 0x%llx bbcount 0x%x hold %d pincount %d "
+		  "lock %d flags %s",
+		  __entry->xfino,
+		  (unsigned long long)__entry->bno,
+		  __entry->nblks,
+		  __entry->hold,
+		  __entry->pincount,
+		  __entry->lockval,
+		  __print_flags(__entry->flags, "|", XFS_BUF_FLAGS))
+)
+
+#define DEFINE_XFBTREE_BUF_EVENT(name) \
+DEFINE_EVENT(xfbtree_buf_class, name, \
+	TP_PROTO(struct xfbtree *xfbt, struct xfs_buf *bp), \
+	TP_ARGS(xfbt, bp))
+DEFINE_XFBTREE_BUF_EVENT(xfbtree_create_root_buf);
+DEFINE_XFBTREE_BUF_EVENT(xfbtree_trans_commit_buf);
+DEFINE_XFBTREE_BUF_EVENT(xfbtree_trans_cancel_buf);
+
+DECLARE_EVENT_CLASS(xfbtree_freesp_class,
+	TP_PROTO(struct xfbtree *xfbt, struct xfs_btree_cur *cur,
+		 xfs_fileoff_t fileoff),
+	TP_ARGS(xfbt, cur, fileoff),
+	TP_STRUCT__entry(
+		__field(unsigned long, xfino)
+		__field(xfs_btnum_t, btnum)
+		__field(int, nlevels)
+		__field(xfs_fileoff_t, fileoff)
+	),
+	TP_fast_assign(
+		__entry->xfino = xfbtree_ino(xfbt);
+		__entry->btnum = cur->bc_btnum;
+		__entry->nlevels = cur->bc_nlevels;
+		__entry->fileoff = fileoff;
+	),
+	TP_printk("xfino 0x%lx btree %s nlevels %d fileoff 0x%llx",
+		  __entry->xfino,
+		  __print_symbolic(__entry->btnum, XFS_BTNUM_STRINGS),
+		  __entry->nlevels,
+		  (unsigned long long)__entry->fileoff)
+)
+
+#define DEFINE_XFBTREE_FREESP_EVENT(name) \
+DEFINE_EVENT(xfbtree_freesp_class, name, \
+	TP_PROTO(struct xfbtree *xfbt, struct xfs_btree_cur *cur, \
+		 xfs_fileoff_t fileoff), \
+	TP_ARGS(xfbt, cur, fileoff))
+DEFINE_XFBTREE_FREESP_EVENT(xfbtree_alloc_block);
+DEFINE_XFBTREE_FREESP_EVENT(xfbtree_free_block);
+
 #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */
 
+
 #endif /* _TRACE_XFS_SCRUB_TRACE_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/fs/xfs/scrub/xfbtree.c b/fs/xfs/scrub/xfbtree.c
index 41aed95a1ee7..5cd03457091c 100644
--- a/fs/xfs/scrub/xfbtree.c
+++ b/fs/xfs/scrub/xfbtree.c
@@ -9,14 +9,19 @@
 #include "xfs_format.h"
 #include "xfs_log_format.h"
 #include "xfs_trans_resv.h"
+#include "xfs_bit.h"
 #include "xfs_mount.h"
 #include "xfs_trans.h"
+#include "xfs_buf_item.h"
 #include "xfs_btree.h"
 #include "xfs_error.h"
 #include "xfs_btree_mem.h"
 #include "xfs_ag.h"
+#include "scrub/scrub.h"
 #include "scrub/xfile.h"
 #include "scrub/xfbtree.h"
+#include "scrub/bitmap.h"
+#include "scrub/trace.h"
 
 /* btree ops functions for in-memory btrees. */
 
@@ -142,9 +147,18 @@ xfbtree_check_ptr(
 	else
 		bt_xfoff = be32_to_cpu(ptr->s);
 
-	if (!xfbtree_verify_xfileoff(cur, bt_xfoff))
+	if (!xfbtree_verify_xfileoff(cur, bt_xfoff)) {
 		fa = __this_address;
+		goto done;
+	}
 
+	/* Can't point to the head or anything before it */
+	if (bt_xfoff < XFBTREE_INIT_LEAF_BLOCK) {
+		fa = __this_address;
+		goto done;
+	}
+
+done:
 	if (fa) {
 		xfs_err(cur->bc_mp,
 "In-memory: Corrupt btree %d flags 0x%x pointer at level %d index %d fa %pS.",
@@ -350,3 +364,453 @@ xfbtree_sblock_verify(
 
 	return NULL;
 }
+
+/* Close the btree xfile and release all resources. */
+void
+xfbtree_destroy(
+	struct xfbtree		*xfbt)
+{
+	xbitmap_destroy(xfbt->freespace);
+	kfree(xfbt->freespace);
+	xfs_buftarg_drain(xfbt->target);
+	kfree(xfbt);
+}
+
+/* Compute the number of bytes available for records. */
+static inline unsigned int
+xfbtree_rec_bytes(
+	struct xfs_mount		*mp,
+	const struct xfbtree_config	*cfg)
+{
+	unsigned int			blocklen = xfo_to_b(1);
+
+	if (cfg->flags & XFBTREE_CREATE_LONG_PTRS) {
+		if (xfs_has_crc(mp))
+			return blocklen - XFS_BTREE_LBLOCK_CRC_LEN;
+
+		return blocklen - XFS_BTREE_LBLOCK_LEN;
+	}
+
+	if (xfs_has_crc(mp))
+		return blocklen - XFS_BTREE_SBLOCK_CRC_LEN;
+
+	return blocklen - XFS_BTREE_SBLOCK_LEN;
+}
+
+/* Initialize an empty leaf block as the btree root. */
+STATIC int
+xfbtree_init_leaf_block(
+	struct xfs_mount		*mp,
+	struct xfbtree			*xfbt,
+	const struct xfbtree_config	*cfg)
+{
+	struct xfs_buf			*bp;
+	xfs_daddr_t			daddr;
+	int				error;
+	unsigned int			bc_flags = 0;
+
+	if (cfg->flags & XFBTREE_CREATE_LONG_PTRS)
+		bc_flags |= XFS_BTREE_LONG_PTRS;
+
+	daddr = xfo_to_daddr(XFBTREE_INIT_LEAF_BLOCK);
+	error = xfs_buf_get(xfbt->target, daddr, xfbtree_bbsize(), &bp);
+	if (error)
+		return error;
+
+	trace_xfbtree_create_root_buf(xfbt, bp);
+
+	bp->b_ops = cfg->btree_ops->buf_ops;
+	xfs_btree_init_block_int(mp, bp->b_addr, daddr, cfg->btnum, 0, 0,
+			cfg->owner, bc_flags);
+	error = xfs_bwrite(bp);
+	xfs_buf_relse(bp);
+	if (error)
+		return error;
+
+	xfbt->xf_used++;
+	return 0;
+}
+
+/* Initialize the in-memory btree header block. */
+STATIC int
+xfbtree_init_head(
+	struct xfbtree		*xfbt)
+{
+	struct xfs_buf		*bp;
+	xfs_daddr_t		daddr;
+	int			error;
+
+	daddr = xfo_to_daddr(XFBTREE_HEAD_BLOCK);
+	error = xfs_buf_get(xfbt->target, daddr, xfbtree_bbsize(), &bp);
+	if (error)
+		return error;
+
+	xfs_btree_mem_head_init(bp, xfbt->owner, XFBTREE_INIT_LEAF_BLOCK);
+	error = xfs_bwrite(bp);
+	xfs_buf_relse(bp);
+	if (error)
+		return error;
+
+	xfbt->xf_used++;
+	return 0;
+}
+
+/* Create an xfile btree backing thing that can be used for in-memory btrees. */
+int
+xfbtree_create(
+	struct xfs_mount		*mp,
+	const struct xfbtree_config	*cfg,
+	struct xfbtree			**xfbtreep)
+{
+	struct xfbtree			*xfbt;
+	unsigned int			blocklen = xfbtree_rec_bytes(mp, cfg);
+	unsigned int			keyptr_len = cfg->btree_ops->key_len;
+	int				error;
+
+	/* Requires an xfile-backed buftarg. */
+	if (!(cfg->target->bt_flags & XFS_BUFTARG_XFILE)) {
+		ASSERT(cfg->target->bt_flags & XFS_BUFTARG_XFILE);
+		return -EINVAL;
+	}
+
+	xfbt = kzalloc(sizeof(struct xfbtree), XCHK_GFP_FLAGS);
+	if (!xfbt)
+		return -ENOMEM;
+
+	/* Assign our memory file and the free space bitmap. */
+	xfbt->target = cfg->target;
+	xfbt->freespace = kmalloc(sizeof(struct xbitmap), XCHK_GFP_FLAGS);
+	if (!xfbt->freespace) {
+		error = -ENOMEM;
+		goto err_buftarg;
+	}
+	xbitmap_init(xfbt->freespace);
+
+	/* Set up min/maxrecs for this btree. */
+	if (cfg->flags & XFBTREE_CREATE_LONG_PTRS)
+		keyptr_len += sizeof(__be64);
+	else
+		keyptr_len += sizeof(__be32);
+	xfbt->maxrecs[0] = blocklen / cfg->btree_ops->rec_len;
+	xfbt->maxrecs[1] = blocklen / keyptr_len;
+	xfbt->minrecs[0] = xfbt->maxrecs[0] / 2;
+	xfbt->minrecs[1] = xfbt->maxrecs[1] / 2;
+	xfbt->owner = cfg->owner;
+
+	/* Initialize the empty btree. */
+	error = xfbtree_init_leaf_block(mp, xfbt, cfg);
+	if (error)
+		goto err_freesp;
+
+	error = xfbtree_init_head(xfbt);
+	if (error)
+		goto err_freesp;
+
+	trace_xfbtree_create(mp, cfg, xfbt);
+
+	*xfbtreep = xfbt;
+	return 0;
+
+err_freesp:
+	xbitmap_destroy(xfbt->freespace);
+	kfree(xfbt->freespace);
+err_buftarg:
+	xfs_buftarg_drain(xfbt->target);
+	kfree(xfbt);
+	return error;
+}
+
+/* Read the in-memory btree head. */
+int
+xfbtree_head_read_buf(
+	struct xfbtree		*xfbt,
+	struct xfs_trans	*tp,
+	struct xfs_buf		**bpp)
+{
+	struct xfs_buftarg	*btp = xfbt->target;
+	struct xfs_mount	*mp = btp->bt_mount;
+	struct xfs_btree_mem_head *mhead;
+	struct xfs_buf		*bp;
+	xfs_daddr_t		daddr;
+	int			error;
+
+	daddr = xfo_to_daddr(XFBTREE_HEAD_BLOCK);
+	error = xfs_trans_read_buf(mp, tp, btp, daddr, xfbtree_bbsize(), 0,
+			&bp, &xfs_btree_mem_head_buf_ops);
+	if (error)
+		return error;
+
+	mhead = bp->b_addr;
+	if (be64_to_cpu(mhead->mh_owner) != xfbt->owner) {
+		xfs_verifier_error(bp, -EFSCORRUPTED, __this_address);
+		xfs_trans_brelse(tp, bp);
+		return -EFSCORRUPTED;
+	}
+
+	*bpp = bp;
+	return 0;
+}
+
+static inline struct xfile *xfbtree_xfile(struct xfbtree *xfbt)
+{
+	return xfbt->target->bt_xfile;
+}
+
+/* Allocate a block to our in-memory btree. */
+int
+xfbtree_alloc_block(
+	struct xfs_btree_cur		*cur,
+	const union xfs_btree_ptr	*start,
+	union xfs_btree_ptr		*new,
+	int				*stat)
+{
+	struct xfbtree			*xfbt = cur->bc_mem.xfbtree;
+	xfileoff_t			bt_xfoff;
+	loff_t				pos;
+	int				error;
+
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	/*
+	 * Find the first free block in the free space bitmap and take it.  If
+	 * none are found, seek to end of the file.
+	 */
+	error = xbitmap_take_first_set(xfbt->freespace, 0, -1ULL, &bt_xfoff);
+	if (error == -ENODATA) {
+		bt_xfoff = xfbt->xf_used;
+		xfbt->xf_used++;
+	} else if (error) {
+		return error;
+	}
+
+	trace_xfbtree_alloc_block(xfbt, cur, bt_xfoff);
+
+	/* Fail if the block address exceeds the maximum for short pointers. */
+	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && bt_xfoff >= INT_MAX) {
+		*stat = 0;
+		return 0;
+	}
+
+	/* Make sure we actually can write to the block before we return it. */
+	pos = xfo_to_b(bt_xfoff);
+	error = xfile_prealloc(xfbtree_xfile(xfbt), pos, xfo_to_b(1));
+	if (error)
+		return error;
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
+		new->l = cpu_to_be64(bt_xfoff);
+	else
+		new->s = cpu_to_be32(bt_xfoff);
+
+	*stat = 1;
+	return 0;
+}
+
+/* Free a block from our in-memory btree. */
+int
+xfbtree_free_block(
+	struct xfs_btree_cur	*cur,
+	struct xfs_buf		*bp)
+{
+	struct xfbtree		*xfbt = cur->bc_mem.xfbtree;
+	xfileoff_t		bt_xfoff, bt_xflen;
+
+	ASSERT(cur->bc_flags & XFS_BTREE_IN_XFILE);
+
+	bt_xfoff = xfs_daddr_to_xfot(xfs_buf_daddr(bp));
+	bt_xflen = xfs_daddr_to_xfot(bp->b_length);
+
+	trace_xfbtree_free_block(xfbt, cur, bt_xfoff);
+
+	return xbitmap_set(xfbt->freespace, bt_xfoff, bt_xflen);
+}
+
+/* Return the minimum number of records for a btree block. */
+int
+xfbtree_get_minrecs(
+	struct xfs_btree_cur	*cur,
+	int			level)
+{
+	struct xfbtree		*xfbt = cur->bc_mem.xfbtree;
+
+	return xfbt->minrecs[level != 0];
+}
+
+/* Return the maximum number of records for a btree block. */
+int
+xfbtree_get_maxrecs(
+	struct xfs_btree_cur	*cur,
+	int			level)
+{
+	struct xfbtree		*xfbt = cur->bc_mem.xfbtree;
+
+	return xfbt->maxrecs[level != 0];
+}
+
+/* If this log item is a buffer item that came from the xfbtree, return it. */
+static inline struct xfs_buf *
+xfbtree_buf_match(
+	struct xfbtree			*xfbt,
+	const struct xfs_log_item	*lip)
+{
+	const struct xfs_buf_log_item	*bli;
+	struct xfs_buf			*bp;
+
+	if (lip->li_type != XFS_LI_BUF)
+		return NULL;
+
+	bli = container_of(lip, struct xfs_buf_log_item, bli_item);
+	bp = bli->bli_buf;
+	if (bp->b_target != xfbt->target)
+		return NULL;
+
+	return bp;
+}
+
+/*
+ * Detach this (probably dirty) xfbtree buffer from the transaction by any
+ * means necessary.  Returns true if the buffer needs to be written.
+ */
+STATIC bool
+xfbtree_trans_bdetach(
+	struct xfs_trans	*tp,
+	struct xfs_buf		*bp)
+{
+	struct xfs_buf_log_item	*bli = bp->b_log_item;
+	bool			dirty;
+
+	ASSERT(bli != NULL);
+
+	dirty = bli->bli_flags & (XFS_BLI_DIRTY | XFS_BLI_ORDERED);
+
+	bli->bli_flags &= ~(XFS_BLI_DIRTY | XFS_BLI_ORDERED |
+			    XFS_BLI_LOGGED | XFS_BLI_STALE);
+	clear_bit(XFS_LI_DIRTY, &bli->bli_item.li_flags);
+
+	while (bp->b_log_item != NULL)
+		xfs_trans_bdetach(tp, bp);
+
+	return dirty;
+}
+
+/*
+ * Commit changes to the incore btree immediately by writing all dirty xfbtree
+ * buffers to the backing xfile.  This detaches all xfbtree buffers from the
+ * transaction, even on failure.  The buffer locks are dropped between the
+ * delwri queue and submit, so the caller must synchronize btree access.
+ *
+ * Normally we'd let the buffers commit with the transaction and get written to
+ * the xfile via the log, but online repair stages ephemeral btrees in memory
+ * and uses the btree_staging functions to write new btrees to disk atomically.
+ * The in-memory btree (and its backing store) are discarded at the end of the
+ * repair phase, which means that xfbtree buffers cannot commit with the rest
+ * of a transaction.
+ *
+ * In other words, online repair only needs the transaction to collect buffer
+ * pointers and to avoid buffer deadlocks, not to guarantee consistency of
+ * updates.
+ */
+int
+xfbtree_trans_commit(
+	struct xfbtree		*xfbt,
+	struct xfs_trans	*tp)
+{
+	LIST_HEAD(buffer_list);
+	struct xfs_log_item	*lip, *n;
+	bool			corrupt = false;
+	bool			tp_dirty = false;
+
+	/*
+	 * For each xfbtree buffer attached to the transaction, write the dirty
+	 * buffers to the xfile and release them.
+	 */
+	list_for_each_entry_safe(lip, n, &tp->t_items, li_trans) {
+		struct xfs_buf	*bp = xfbtree_buf_match(xfbt, lip);
+		bool		dirty;
+
+		if (!bp) {
+			if (test_bit(XFS_LI_DIRTY, &lip->li_flags))
+				tp_dirty |= true;
+			continue;
+		}
+
+		trace_xfbtree_trans_commit_buf(xfbt, bp);
+
+		dirty = xfbtree_trans_bdetach(tp, bp);
+		if (dirty && !corrupt) {
+			xfs_failaddr_t	fa = bp->b_ops->verify_struct(bp);
+
+			/*
+			 * Because this btree is ephemeral, validate the buffer
+			 * structure before delwri_submit so that we can return
+			 * corruption errors to the caller without shutting
+			 * down the filesystem.
+			 *
+			 * If the buffer fails verification, log the failure
+			 * but continue walking the transaction items so that
+			 * we remove all ephemeral btree buffers.
+			 */
+			if (fa) {
+				corrupt = true;
+				xfs_verifier_error(bp, -EFSCORRUPTED, fa);
+			} else {
+				xfs_buf_delwri_queue_here(bp, &buffer_list);
+			}
+		}
+
+		xfs_buf_relse(bp);
+	}
+
+	/*
+	 * Reset the transaction's dirty flag to reflect the dirty state of the
+	 * log items that are still attached.
+	 */
+	tp->t_flags = (tp->t_flags & ~XFS_TRANS_DIRTY) |
+			(tp_dirty ? XFS_TRANS_DIRTY : 0);
+
+	if (corrupt) {
+		xfs_buf_delwri_cancel(&buffer_list);
+		return -EFSCORRUPTED;
+	}
+
+	if (list_empty(&buffer_list))
+		return 0;
+
+	return xfs_buf_delwri_submit(&buffer_list);
+}
+
+/*
+ * Cancel changes to the incore btree by detaching all the xfbtree buffers.
+ * Changes are not written to the backing store.  This is needed for online
+ * repair btrees, which are by nature ephemeral.
+ */
+void
+xfbtree_trans_cancel(
+	struct xfbtree		*xfbt,
+	struct xfs_trans	*tp)
+{
+	struct xfs_log_item	*lip, *n;
+	bool			tp_dirty = false;
+
+	list_for_each_entry_safe(lip, n, &tp->t_items, li_trans) {
+		struct xfs_buf	*bp = xfbtree_buf_match(xfbt, lip);
+
+		if (!bp) {
+			if (test_bit(XFS_LI_DIRTY, &lip->li_flags))
+				tp_dirty |= true;
+			continue;
+		}
+
+		trace_xfbtree_trans_cancel_buf(xfbt, bp);
+
+		xfbtree_trans_bdetach(tp, bp);
+		xfs_buf_relse(bp);
+	}
+
+	/*
+	 * Reset the transaction's dirty flag to reflect the dirty state of the
+	 * log items that are still attached.
+	 */
+	tp->t_flags = (tp->t_flags & ~XFS_TRANS_DIRTY) |
+			(tp_dirty ? XFS_TRANS_DIRTY : 0);
+}
diff --git a/fs/xfs/scrub/xfbtree.h b/fs/xfs/scrub/xfbtree.h
index e8d8c67641f8..8bd4f2bee1a8 100644
--- a/fs/xfs/scrub/xfbtree.h
+++ b/fs/xfs/scrub/xfbtree.h
@@ -22,13 +22,36 @@ struct xfs_btree_mem_head {
 /* xfile-backed in-memory btrees */
 
 struct xfbtree {
-	/* buffer cache target for this in-memory btree */
+	/* buffer cache target for the xfile backing this in-memory btree */
 	struct xfs_buftarg		*target;
 
+	/* Bitmap of free space from pos to used */
+	struct xbitmap			*freespace;
+
+	/* Number of xfile blocks actually used by this xfbtree. */
+	xfileoff_t			xf_used;
+
 	/* Owner of this btree. */
 	unsigned long long		owner;
+
+	/* Minimum and maximum records per block. */
+	unsigned int			maxrecs[2];
+	unsigned int			minrecs[2];
 };
 
+/* The head of the in-memory btree is always at block 0 */
+#define XFBTREE_HEAD_BLOCK		0
+
+/* in-memory btrees are always created with an empty leaf block at block 1 */
+#define XFBTREE_INIT_LEAF_BLOCK		1
+
+int xfbtree_head_read_buf(struct xfbtree *xfbt, struct xfs_trans *tp,
+		struct xfs_buf **bpp);
+
+void xfbtree_destroy(struct xfbtree *xfbt);
+int xfbtree_trans_commit(struct xfbtree *xfbt, struct xfs_trans *tp);
+void xfbtree_trans_cancel(struct xfbtree *xfbt, struct xfs_trans *tp);
+
 #endif /* CONFIG_XFS_BTREE_IN_XFILE */
 
 #endif /* XFS_SCRUB_XFBTREE_H__ */
diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
index 851aeb244660..40801b08a2b2 100644
--- a/fs/xfs/scrub/xfile.c
+++ b/fs/xfs/scrub/xfile.c
@@ -292,6 +292,89 @@ xfile_pwrite(
 	return error;
 }
 
+/* Discard pages backing a range of the xfile. */
+void
+xfile_discard(
+	struct xfile		*xf,
+	loff_t			pos,
+	u64			count)
+{
+	trace_xfile_discard(xf, pos, count);
+	shmem_truncate_range(file_inode(xf->file), pos, pos + count - 1);
+}
+
+/* Ensure that there is storage backing the given range. */
+int
+xfile_prealloc(
+	struct xfile		*xf,
+	loff_t			pos,
+	u64			count)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	struct page		*page = NULL;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_prealloc(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*fsdata = NULL;
+		unsigned int	len;
+		int		ret;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * We call write_begin directly here to avoid all the freezer
+		 * protection lock-taking that happens in the normal path.
+		 * shmem doesn't support fs freeze, but lockdep doesn't know
+		 * that and will trip over that.
+		 */
+		error = aops->write_begin(NULL, mapping, pos, len, &page,
+				&fsdata);
+		if (error)
+			break;
+
+		/*
+		 * xfile pages must never be mapped into userspace, so we skip
+		 * the dcache flush.  If the page is not uptodate, zero it to
+		 * ensure we never go lacking for space here.
+		 */
+		if (!PageUptodate(page)) {
+			void	*kaddr = kmap_local_page(page);
+
+			memset(kaddr, 0, PAGE_SIZE);
+			SetPageUptodate(page);
+			kunmap_local(kaddr);
+		}
+
+		ret = aops->write_end(NULL, mapping, pos, len, len, page,
+				fsdata);
+		if (ret < 0) {
+			error = ret;
+			break;
+		}
+		if (ret != len) {
+			error = -EIO;
+			break;
+		}
+
+		count -= len;
+		pos += len;
+	}
+	memalloc_nofs_restore(pflags);
+
+	return error;
+}
+
 /* Find the next written area in the xfile data for a given offset. */
 loff_t
 xfile_seek_data(
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
index c6d7851b01ca..d3b52f8069f2 100644
--- a/fs/xfs/scrub/xfile.h
+++ b/fs/xfs/scrub/xfile.h
@@ -65,6 +65,8 @@ xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t pos)
 	return 0;
 }
 
+void xfile_discard(struct xfile *xf, loff_t pos, u64 count);
+int xfile_prealloc(struct xfile *xf, loff_t pos, u64 count);
 loff_t xfile_seek_data(struct xfile *xf, loff_t pos);
 
 struct xfile_stat {
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index ab9217c1c3d8..e4fd81549e00 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -637,6 +637,7 @@ DEFINE_BUF_ITEM_EVENT(xfs_trans_read_buf);
 DEFINE_BUF_ITEM_EVENT(xfs_trans_read_buf_recur);
 DEFINE_BUF_ITEM_EVENT(xfs_trans_log_buf);
 DEFINE_BUF_ITEM_EVENT(xfs_trans_brelse);
+DEFINE_BUF_ITEM_EVENT(xfs_trans_bdetach);
 DEFINE_BUF_ITEM_EVENT(xfs_trans_bjoin);
 DEFINE_BUF_ITEM_EVENT(xfs_trans_bhold);
 DEFINE_BUF_ITEM_EVENT(xfs_trans_bhold_release);
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index d32abdd1e014..83e29bd2b2fd 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -219,6 +219,7 @@ struct xfs_buf	*xfs_trans_getsb(struct xfs_trans *);
 
 void		xfs_trans_brelse(xfs_trans_t *, struct xfs_buf *);
 void		xfs_trans_bjoin(xfs_trans_t *, struct xfs_buf *);
+void		xfs_trans_bdetach(struct xfs_trans *tp, struct xfs_buf *bp);
 void		xfs_trans_bhold(xfs_trans_t *, struct xfs_buf *);
 void		xfs_trans_bhold_release(xfs_trans_t *, struct xfs_buf *);
 void		xfs_trans_binval(xfs_trans_t *, struct xfs_buf *);
diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c
index 6549e50d852c..e28ab74af4f0 100644
--- a/fs/xfs/xfs_trans_buf.c
+++ b/fs/xfs/xfs_trans_buf.c
@@ -392,6 +392,48 @@ xfs_trans_brelse(
 	xfs_buf_relse(bp);
 }
 
+/*
+ * Forcibly detach a buffer previously joined to the transaction.  The caller
+ * will retain its locked reference to the buffer after this function returns.
+ * The buffer must be completely clean and must not be held to the transaction.
+ */
+void
+xfs_trans_bdetach(
+	struct xfs_trans	*tp,
+	struct xfs_buf		*bp)
+{
+	struct xfs_buf_log_item	*bip = bp->b_log_item;
+
+	ASSERT(tp != NULL);
+	ASSERT(bp->b_transp == tp);
+	ASSERT(bip->bli_item.li_type == XFS_LI_BUF);
+	ASSERT(atomic_read(&bip->bli_refcount) > 0);
+
+	trace_xfs_trans_bdetach(bip);
+
+	/*
+	 * Erase the recursion count, since we're removing this buffer from the
+	 * transaction.
+	 */
+	bip->bli_recur = 0;
+
+	/*
+	 * The buffer must be completely clean.  Specifically, it had better
+	 * not be dirty, stale, logged, ordered, or held to the transaction.
+	 */
+	ASSERT(!test_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags));
+	ASSERT(!(bip->bli_flags & XFS_BLI_DIRTY));
+	ASSERT(!(bip->bli_flags & XFS_BLI_HOLD));
+	ASSERT(!(bip->bli_flags & XFS_BLI_LOGGED));
+	ASSERT(!(bip->bli_flags & XFS_BLI_ORDERED));
+	ASSERT(!(bip->bli_flags & XFS_BLI_STALE));
+
+	/* Unlink the log item from the transaction and drop the log item. */
+	xfs_trans_del_item(&bip->bli_item);
+	xfs_buf_item_put(bip);
+	bp->b_transp = NULL;
+}
+
 /*
  * Mark the buffer as not needing to be unlocked when the buf item's
  * iop_committing() routine is called.  The buffer must already be locked


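For illustration only (not part of the patch): the intended pairing of the
xfbtree_trans_commit/xfbtree_trans_cancel helpers in a repair function, where
the xrep_stage_records() name and the wrapper are made up:

STATIC int
xrep_commit_staging_btree(
	struct xfs_scrub	*sc,
	struct xfbtree		*xfbt)
{
	int			error;

	/* Fill the ephemeral btree; hypothetical helper. */
	error = xrep_stage_records(sc, xfbt);
	if (error) {
		/* Detach the dirty xfbtree buffers without writing them. */
		xfbtree_trans_cancel(xfbt, sc->tp);
		return error;
	}

	/* Verify the buffers and flush them to the backing xfile. */
	return xfbtree_trans_commit(xfbt, sc->tp);
}
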
^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 01/25] xfs: add a libxfs header file for staging new ioctls
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
@ 2023-05-26  1:14   ` Darrick J. Wong
  2023-05-26  1:14   ` [PATCH 02/25] xfs: introduce new file range exchange ioctl Darrick J. Wong
                     ` (23 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:14 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Create a new xfs_fs_staging.h header where we can land experimental
ioctls without committing them to any stable interfaces anywhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_fs_staging.h |   18 ++++++++++++++++++
 fs/xfs/xfs_linux.h             |    1 +
 2 files changed, 19 insertions(+)
 create mode 100644 fs/xfs/libxfs/xfs_fs_staging.h


diff --git a/fs/xfs/libxfs/xfs_fs_staging.h b/fs/xfs/libxfs/xfs_fs_staging.h
new file mode 100644
index 000000000000..bc97193dde9d
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_fs_staging.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: LGPL-2.1 */
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_FS_STAGING_H__
+#define __XFS_FS_STAGING_H__
+
+/*
+ * Experimental system calls, ioctls and data structures supporting them.
+ * Nothing in here should be considered part of a stable interface of any kind.
+ *
+ * If you add an ioctl here, please leave a comment in xfs_fs.h marking it
+ * reserved.  If you promote anything out of this file, please leave a comment
+ * explaining where it went.
+ */
+
+#endif /* __XFS_FS_STAGING_H__ */
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index b97bc12fa8b2..09f727f712fe 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -72,6 +72,7 @@ typedef __u32			xfs_nlink_t;
 #include <asm/unaligned.h>
 
 #include "xfs_fs.h"
+#include "xfs_fs_staging.h"
 #include "xfs_stats.h"
 #include "xfs_sysctl.h"
 #include "xfs_iops.h"


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 02/25] xfs: introduce new file range exchange ioctl
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
  2023-05-26  1:14   ` [PATCH 01/25] xfs: add a libxfs header file for staging new ioctls Darrick J. Wong
@ 2023-05-26  1:14   ` Darrick J. Wong
  2023-05-26  1:15   ` [PATCH 03/25] xfs: move inode lease breaking functions to xfs_inode.c Darrick J. Wong
                     ` (22 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:14 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Introduce a new ioctl to handle swapping ranges of bytes between files.
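
For illustration, a rough userspace sketch of the intended usage (not part of
the patch; the helper name, header path, and error handling are assumptions):

#include <string.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <xfs/xfs_fs_staging.h>		/* assumed install location */

/* Commit a donor (temp) file's contents into file2_fd, but only if file2
 * still matches the stat data sampled here.  Both fds need O_RDWR. */
static int commit_into(int file2_fd, int donor_fd)
{
	struct stat statbuf;
	struct xfs_exch_range fxr;

	if (fstat(file2_fd, &statbuf))
		return -1;

	memset(&fxr, 0, sizeof(fxr));
	fxr.file1_fd = donor_fd;
	fxr.flags = XFS_EXCH_RANGE_TO_EOF | XFS_EXCH_RANGE_COMMIT;
	fxr.file2_ino = statbuf.st_ino;
	fxr.file2_mtime = statbuf.st_mtim.tv_sec;
	fxr.file2_mtime_nsec = statbuf.st_mtim.tv_nsec;
	fxr.file2_ctime = statbuf.st_ctim.tv_sec;
	fxr.file2_ctime_nsec = statbuf.st_ctim.tv_nsec;

	/* file2 is the file the ioctl is issued against. */
	return ioctl(file2_fd, XFS_IOC_EXCHANGE_RANGE, &fxr);
}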

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/read_write.c                |    2 
 fs/remap_range.c               |    4 
 fs/xfs/Makefile                |    1 
 fs/xfs/libxfs/xfs_fs.h         |    1 
 fs/xfs/libxfs/xfs_fs_staging.h |   89 ++++++++++
 fs/xfs/xfs_ioctl.c             |   30 +++
 fs/xfs/xfs_xchgrange.c         |  343 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_xchgrange.h         |   18 ++
 include/linux/fs.h             |    1 
 9 files changed, 487 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/xfs_xchgrange.c
 create mode 100644 fs/xfs/xfs_xchgrange.h


diff --git a/fs/read_write.c b/fs/read_write.c
index a21ba3be7dbe..480e687a1587 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1650,6 +1650,7 @@ int generic_write_check_limits(struct file *file, loff_t pos, loff_t *count)
 
 	return 0;
 }
+EXPORT_SYMBOL(generic_write_check_limits);
 
 /* Like generic_write_checks(), but takes size of write instead of iter. */
 int generic_write_checks_count(struct kiocb *iocb, loff_t *count)
@@ -1718,3 +1719,4 @@ int generic_file_rw_checks(struct file *file_in, struct file *file_out)
 
 	return 0;
 }
+EXPORT_SYMBOL(generic_file_rw_checks);
diff --git a/fs/remap_range.c b/fs/remap_range.c
index 1331a890f2f2..ed1ee6576e03 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -98,8 +98,7 @@ static int generic_remap_checks(struct file *file_in, loff_t pos_in,
 	return 0;
 }
 
-static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
-			     bool write)
+int remap_verify_area(struct file *file, loff_t pos, loff_t len, bool write)
 {
 	if (unlikely(pos < 0 || len < 0))
 		return -EINVAL;
@@ -109,6 +108,7 @@ static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
 
 	return security_file_permission(file, write ? MAY_WRITE : MAY_READ);
 }
+EXPORT_SYMBOL(remap_verify_area);
 
 /*
  * Ensure that we don't remap a partial EOF block in the middle of something
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 56861c8f78cc..6cc3b1fe5754 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -93,6 +93,7 @@ xfs-y				+= xfs_aops.o \
 				   xfs_sysfs.o \
 				   xfs_trans.o \
 				   xfs_xattr.o \
+				   xfs_xchgrange.o \
 				   kmem.o
 
 # low-level transaction/log code
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 400cf68e551e..29857b0f87df 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -841,6 +841,7 @@ struct xfs_scrub_metadata {
 #define XFS_IOC_FSGEOMETRY	     _IOR ('X', 126, struct xfs_fsop_geom)
 #define XFS_IOC_BULKSTAT	     _IOR ('X', 127, struct xfs_bulkstat_req)
 #define XFS_IOC_INUMBERS	     _IOR ('X', 128, struct xfs_inumbers_req)
+/*	XFS_IOC_EXCHANGE_RANGE -------- staging 129	 */
 /*	XFS_IOC_GETFSUUID ---------- deprecated 140	 */
 
 
diff --git a/fs/xfs/libxfs/xfs_fs_staging.h b/fs/xfs/libxfs/xfs_fs_staging.h
index bc97193dde9d..0453e7f31af0 100644
--- a/fs/xfs/libxfs/xfs_fs_staging.h
+++ b/fs/xfs/libxfs/xfs_fs_staging.h
@@ -15,4 +15,93 @@
  * explaining where it went.
  */
 
+/*
+ * Exchange part of file1 with part of the file that this ioctl is being
+ * called against (which we'll call file2).  Filesystems must be able to
+ * restart and complete the operation even after the system goes down.
+ */
+struct xfs_exch_range {
+	__s64		file1_fd;
+	__s64		file1_offset;	/* file1 offset, bytes */
+	__s64		file2_offset;	/* file2 offset, bytes */
+	__u64		length;		/* bytes to exchange */
+
+	__u64		flags;		/* see XFS_EXCH_RANGE_* below */
+
+	/* file2 metadata for optional freshness checks */
+	__s64		file2_ino;	/* inode number */
+	__s64		file2_mtime;	/* modification time */
+	__s64		file2_ctime;	/* change time */
+	__s32		file2_mtime_nsec; /* mod time, nsec */
+	__s32		file2_ctime_nsec; /* change time, nsec */
+
+	__u64		pad[6];		/* must be zeroes */
+};
+
+/*
+ * Atomic exchange operations are not required.  This relaxes the requirement
+ * that the filesystem must be able to complete the operation after a crash.
+ */
+#define XFS_EXCH_RANGE_NONATOMIC	(1 << 0)
+
+/*
+ * Check file2's inode number, mtime, and ctime against the values
+ * provided, and return -EBUSY if there isn't an exact match.
+ */
+#define XFS_EXCH_RANGE_FILE2_FRESH	(1 << 1)
+
+/*
+ * Check that file1's length is equal to file1_offset + length, and that
+ * file2's length is equal to file2_offset + length.  Returns -EDOM if there
+ * isn't an exact match.
+ */
+#define XFS_EXCH_RANGE_FULL_FILES	(1 << 2)
+
+/*
+ * Exchange file data all the way to the ends of both files, and then exchange
+ * the file sizes.  This flag can be used to replace a file's contents with a
+ * different amount of data.  length will be ignored.
+ */
+#define XFS_EXCH_RANGE_TO_EOF		(1 << 3)
+
+/* Flush all changes in file data and file metadata to disk before returning. */
+#define XFS_EXCH_RANGE_FSYNC		(1 << 4)
+
+/* Dry run; do all the parameter verification but do not change anything. */
+#define XFS_EXCH_RANGE_DRY_RUN		(1 << 5)
+
+/*
+ * Exchange only the parts of the two files where the file allocation units
+ * mapped to file1's range have been written to.  This can accelerate
+ * scatter-gather atomic writes with a temp file if all writes are aligned to
+ * the file allocation unit.
+ */
+#define XFS_EXCH_RANGE_FILE1_WRITTEN	(1 << 6)
+
+/*
+ * Commit the contents of file1 into file2 if file2 has the same inode number,
+ * mtime, and ctime as the arguments provided to the call.  The old contents of
+ * file2 will be moved to file1.
+ *
+ * With this flag, all committed information can be retrieved even if the
+ * system crashes or is rebooted.  This includes writing through or flushing a
+ * disk cache if present.  The call blocks until the device reports that the
+ * commit is complete.
+ *
+ * This flag should not be combined with NONATOMIC.  It can be combined with
+ * FILE1_WRITTEN.
+ */
+#define XFS_EXCH_RANGE_COMMIT		(XFS_EXCH_RANGE_FILE2_FRESH | \
+					 XFS_EXCH_RANGE_FSYNC)
+
+#define XFS_EXCH_RANGE_ALL_FLAGS	(XFS_EXCH_RANGE_NONATOMIC | \
+					 XFS_EXCH_RANGE_FILE2_FRESH | \
+					 XFS_EXCH_RANGE_FULL_FILES | \
+					 XFS_EXCH_RANGE_TO_EOF | \
+					 XFS_EXCH_RANGE_FSYNC | \
+					 XFS_EXCH_RANGE_DRY_RUN | \
+					 XFS_EXCH_RANGE_FILE1_WRITTEN)
+
+#define XFS_IOC_EXCHANGE_RANGE	_IOWR('X', 129, struct xfs_exch_range)
+
 #endif /* __XFS_FS_STAGING_H__ */
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 0667e088a289..19724b3a5fdc 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -38,6 +38,7 @@
 #include "xfs_reflink.h"
 #include "xfs_ioctl.h"
 #include "xfs_xattr.h"
+#include "xfs_xchgrange.h"
 
 #include <linux/mount.h>
 #include <linux/namei.h>
@@ -1862,6 +1863,32 @@ xfs_fs_eofblocks_from_user(
 	return 0;
 }
 
+static long
+xfs_ioc_exchange_range(
+	struct file			*file2,
+	struct xfs_exch_range __user	*argp)
+{
+	struct xfs_exch_range		args;
+	struct fd			file1;
+	int				error;
+
+	if (copy_from_user(&args, argp, sizeof(args)))
+		return -EFAULT;
+
+	file1 = fdget(args.file1_fd);
+	if (!file1.file)
+		return -EBADF;
+
+	error = -EXDEV;
+	if (file1.file->f_path.mnt != file2->f_path.mnt)
+		goto fdput;
+
+	error = xfs_exch_range(file1.file, file2, &args);
+fdput:
+	fdput(file1);
+	return error;
+}
+
 /*
  * These long-unused ioctls were removed from the official ioctl API in 5.17,
  * but retain these definitions so that we can log warnings about them.
@@ -2150,6 +2177,9 @@ xfs_file_ioctl(
 		return error;
 	}
 
+	case XFS_IOC_EXCHANGE_RANGE:
+		return xfs_ioc_exchange_range(filp, arg);
+
 	default:
 		return -ENOTTY;
 	}
diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
new file mode 100644
index 000000000000..b91df426d426
--- /dev/null
+++ b/fs/xfs/xfs_xchgrange.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_inode.h"
+#include "xfs_trans.h"
+#include "xfs_xchgrange.h"
+#include <linux/fsnotify.h>
+
+/*
+ * Generic code for exchanging ranges of two files via XFS_IOC_EXCHANGE_RANGE.
+ * This part does not deal with XFS-specific data structures, and may some day
+ * be ported to the VFS.
+ *
+ * The goal is to exchange fxr.length bytes starting at fxr.file1_offset in
+ * file1 with the same number of bytes starting at fxr.file2_offset in file2.
+ * Implementations must call xfs_exch_range_prep to prepare the two files
+ * prior to taking locks; they must call xfs_exch_range_check_fresh once
+ * the inode is locked to abort the call if file2 has changed; and they must
+ * update the inode change and mod times of both files as part of the metadata
+ * update.  The timestamp updates must be done atomically as part of the data
+ * exchange operation to ensure correctness of the freshness check.
+ */
+
+/*
+ * Check that both files' metadata agree with the snapshot that we took for
+ * the range exchange request.
+ *
+ * This should be called after the filesystem has locked /all/ inode metadata
+ * against modification.
+ */
+STATIC int
+xfs_exch_range_check_fresh(
+	struct inode			*inode2,
+	const struct xfs_exch_range	*fxr)
+{
+	/* Check that file2 hasn't otherwise been modified. */
+	if ((fxr->flags & XFS_EXCH_RANGE_FILE2_FRESH) &&
+	    (fxr->file2_ino        != inode2->i_ino ||
+	     fxr->file2_ctime      != inode2->i_ctime.tv_sec  ||
+	     fxr->file2_ctime_nsec != inode2->i_ctime.tv_nsec ||
+	     fxr->file2_mtime      != inode2->i_mtime.tv_sec  ||
+	     fxr->file2_mtime_nsec != inode2->i_mtime.tv_nsec))
+		return -EBUSY;
+
+	return 0;
+}
+
+/* Performs necessary checks before doing a range exchange. */
+STATIC int
+xfs_exch_range_checks(
+	struct file		*file1,
+	struct file		*file2,
+	struct xfs_exch_range	*fxr,
+	unsigned int		blocksize)
+{
+	struct inode		*inode1 = file1->f_mapping->host;
+	struct inode		*inode2 = file2->f_mapping->host;
+	uint64_t		blkmask = blocksize - 1;
+	int64_t			test_len;
+	uint64_t		blen;
+	loff_t			size1, size2;
+	int			error;
+
+	/* Don't touch certain kinds of inodes */
+	if (IS_IMMUTABLE(inode1) || IS_IMMUTABLE(inode2))
+		return -EPERM;
+	if (IS_SWAPFILE(inode1) || IS_SWAPFILE(inode2))
+		return -ETXTBSY;
+
+	size1 = i_size_read(inode1);
+	size2 = i_size_read(inode2);
+
+	/* Ranges cannot start after EOF. */
+	if (fxr->file1_offset > size1 || fxr->file2_offset > size2)
+		return -EINVAL;
+
+	/*
+	 * If the caller asked for full files, check that the offset/length
+	 * values cover all of both files.
+	 */
+	if ((fxr->flags & XFS_EXCH_RANGE_FULL_FILES) &&
+	    (fxr->file1_offset != 0 || fxr->file2_offset != 0 ||
+	     fxr->length != size1 || fxr->length != size2))
+		return -EDOM;
+
+	/*
+	 * If the caller said to exchange to EOF, we set the length of the
+	 * request large enough to cover everything to the end of both files.
+	 */
+	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF)
+		fxr->length = max_t(int64_t, size1 - fxr->file1_offset,
+					     size2 - fxr->file2_offset);
+
+	/* The start of both ranges must be aligned to an fs block. */
+	if (!IS_ALIGNED(fxr->file1_offset, blocksize) ||
+	    !IS_ALIGNED(fxr->file2_offset, blocksize))
+		return -EINVAL;
+
+	/* Ensure offsets don't wrap. */
+	if (fxr->file1_offset + fxr->length < fxr->file1_offset ||
+	    fxr->file2_offset + fxr->length < fxr->file2_offset)
+		return -EINVAL;
+
+	/*
+	 * We require both ranges to be within EOF, unless we're exchanging
+	 * to EOF.  xfs_xchg_range_prep already checked that both
+	 * fxr->file1_offset and fxr->file2_offset are within EOF.
+	 */
+	if (!(fxr->flags & XFS_EXCH_RANGE_TO_EOF) &&
+	    (fxr->file1_offset + fxr->length > size1 ||
+	     fxr->file2_offset + fxr->length > size2))
+		return -EINVAL;
+
+	/*
+	 * Make sure we don't hit any file size limits.  If we hit any size
+	 * limits such that test_len was adjusted, we abort the whole
+	 * operation.
+	 */
+	test_len = fxr->length;
+	error = generic_write_check_limits(file2, fxr->file2_offset, &test_len);
+	if (error)
+		return error;
+	error = generic_write_check_limits(file1, fxr->file1_offset, &test_len);
+	if (error)
+		return error;
+	if (test_len != fxr->length)
+		return -EINVAL;
+
+	/*
+	 * If the user wanted us to exchange up to the infile's EOF, round up
+	 * to the next block boundary for this check.  Do the same for the
+	 * outfile.
+	 *
+	 * Otherwise, reject the range length if it's not block aligned.  We
+	 * already confirmed the starting offsets' block alignment.
+	 */
+	if (fxr->file1_offset + fxr->length == size1)
+		blen = ALIGN(size1, blocksize) - fxr->file1_offset;
+	else if (fxr->file2_offset + fxr->length == size2)
+		blen = ALIGN(size2, blocksize) - fxr->file2_offset;
+	else if (!IS_ALIGNED(fxr->length, blocksize))
+		return -EINVAL;
+	else
+		blen = fxr->length;
+
+	/* Don't allow overlapped exchanges within the same file. */
+	if (inode1 == inode2 &&
+	    fxr->file2_offset + blen > fxr->file1_offset &&
+	    fxr->file1_offset + blen > fxr->file2_offset)
+		return -EINVAL;
+
+	/* If we already failed the freshness check, we're done. */
+	error = xfs_exch_range_check_fresh(inode2, fxr);
+	if (error)
+		return error;
+
+	/*
+	 * Ensure that we don't exchange a partial EOF block into the middle of
+	 * another file.
+	 */
+	if ((fxr->length & blkmask) == 0)
+		return 0;
+
+	blen = fxr->length;
+	if (fxr->file2_offset + blen < size2)
+		blen &= ~blkmask;
+
+	if (fxr->file1_offset + blen < size1)
+		blen &= ~blkmask;
+
+	return blen == fxr->length ? 0 : -EINVAL;
+}
+
+/*
+ * Check that the two inodes are eligible for range exchanges, the ranges make
+ * sense, and then flush all dirty data.  Caller must ensure that the inodes
+ * have been locked against any other modifications.
+ */
+int
+xfs_exch_range_prep(
+	struct file		*file1,
+	struct file		*file2,
+	struct xfs_exch_range	*fxr,
+	unsigned int		blocksize)
+{
+	struct inode		*inode1 = file_inode(file1);
+	struct inode		*inode2 = file_inode(file2);
+	bool			same_inode = (inode1 == inode2);
+	int			error;
+
+	/* Check that we don't violate system file offset limits. */
+	error = xfs_exch_range_checks(file1, file2, fxr, blocksize);
+	if (error || fxr->length == 0)
+		return error;
+
+	/* Wait for the completion of any pending IOs on both files */
+	inode_dio_wait(inode1);
+	if (!same_inode)
+		inode_dio_wait(inode2);
+
+	error = filemap_write_and_wait_range(inode1->i_mapping,
+			fxr->file1_offset,
+			fxr->file1_offset + fxr->length - 1);
+	if (error)
+		return error;
+
+	error = filemap_write_and_wait_range(inode2->i_mapping,
+			fxr->file2_offset,
+			fxr->file2_offset + fxr->length - 1);
+	if (error)
+		return error;
+
+	/*
+	 * If the files or inodes involved require synchronous writes, amend
+	 * the request to force the filesystem to flush all data and metadata
+	 * to disk after the operation completes.
+	 */
+	if (((file1->f_flags | file2->f_flags) & (__O_SYNC | O_DSYNC)) ||
+	    IS_SYNC(inode1) || IS_SYNC(inode2))
+		fxr->flags |= XFS_EXCH_RANGE_FSYNC;
+
+	return 0;
+}
+
+/*
+ * Finish a range exchange operation, if it was successful.  Caller must ensure
+ * that the inodes are still locked against any other modifications.
+ */
+int
+xfs_exch_range_finish(
+	struct file		*file1,
+	struct file		*file2)
+{
+	int			error;
+
+	error = file_remove_privs(file1);
+	if (error)
+		return error;
+	if (file_inode(file1) == file_inode(file2))
+		return 0;
+
+	return file_remove_privs(file2);
+}
+
+/* Decide if it's ok to exchange the selected range of a given file. */
+STATIC int
+xfs_exch_range_verify_area(
+	struct file		*file,
+	loff_t			pos,
+	struct xfs_exch_range	*fxr)
+{
+	int64_t			len = fxr->length;
+
+	if (pos < 0)
+		return -EINVAL;
+
+	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF)
+		len = min_t(int64_t, len, i_size_read(file_inode(file)) - pos);
+	return remap_verify_area(file, pos, len, true);
+}
+
+/* Prepare for and exchange parts of two files. */
+static inline int
+__xfs_exch_range(
+	struct file		*file1,
+	struct file		*file2,
+	struct xfs_exch_range	*fxr)
+{
+	struct inode		*inode1 = file_inode(file1);
+	struct inode		*inode2 = file_inode(file2);
+	int			ret;
+
+	if ((fxr->flags & ~XFS_EXCH_RANGE_ALL_FLAGS) ||
+	    memchr_inv(&fxr->pad, 0, sizeof(fxr->pad)))
+		return -EINVAL;
+
+	if ((fxr->flags & XFS_EXCH_RANGE_FULL_FILES) &&
+	    (fxr->flags & XFS_EXCH_RANGE_TO_EOF))
+		return -EINVAL;
+
+	/*
+	 * The ioctl enforces that src and dest files are on the same mount.
+	 * However, they only need to be on the same file system.
+	 */
+	if (inode1->i_sb != inode2->i_sb)
+		return -EXDEV;
+
+	/* This only works for regular files. */
+	if (S_ISDIR(inode1->i_mode) || S_ISDIR(inode2->i_mode))
+		return -EISDIR;
+	if (!S_ISREG(inode1->i_mode) || !S_ISREG(inode2->i_mode))
+		return -EINVAL;
+
+	ret = generic_file_rw_checks(file1, file2);
+	if (ret < 0)
+		return ret;
+
+	ret = generic_file_rw_checks(file2, file1);
+	if (ret < 0)
+		return ret;
+
+	ret = xfs_exch_range_verify_area(file1, fxr->file1_offset, fxr);
+	if (ret)
+		return ret;
+
+	ret = xfs_exch_range_verify_area(file2, fxr->file2_offset, fxr);
+	if (ret)
+		return ret;
+
+	ret = -EOPNOTSUPP; /* XXX call out to xfs code */
+	if (ret)
+		return ret;
+
+	fsnotify_modify(file1);
+	if (file2 != file1)
+		fsnotify_modify(file2);
+	return 0;
+}
+
+/* Exchange parts of two files. */
+int
+xfs_exch_range(
+	struct file		*file1,
+	struct file		*file2,
+	struct xfs_exch_range	*fxr)
+{
+	int			error;
+
+	file_start_write(file2);
+	error = __xfs_exch_range(file1, file2, fxr);
+	file_end_write(file2);
+
+	return error;
+}
diff --git a/fs/xfs/xfs_xchgrange.h b/fs/xfs/xfs_xchgrange.h
new file mode 100644
index 000000000000..414fce7a159f
--- /dev/null
+++ b/fs/xfs/xfs_xchgrange.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_XCHGRANGE_H__
+#define __XFS_XCHGRANGE_H__
+
+/* Prepare generic VFS data structures for file exchanges */
+
+int xfs_exch_range_prep(struct file *file1, struct file *file2,
+		struct xfs_exch_range *fxr, unsigned int blocksize);
+int xfs_exch_range_finish(struct file *file1, struct file *file2);
+
+int xfs_exch_range(struct file *file1, struct file *file2,
+		struct xfs_exch_range *fxr);
+
+#endif /* __XFS_XCHGRANGE_H__ */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 147644b5d648..d7ee5122d40b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1884,6 +1884,7 @@ extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
 extern ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
 				       struct file *file_out, loff_t pos_out,
 				       size_t len, unsigned int flags);
+int remap_verify_area(struct file *file, loff_t pos, loff_t len, bool write);
 int __generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 				    struct file *file_out, loff_t pos_out,
 				    loff_t *len, unsigned int remap_flags,


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 03/25] xfs: move inode lease breaking functions to xfs_inode.c
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
  2023-05-26  1:14   ` [PATCH 01/25] xfs: add a libxfs header file for staging new ioctls Darrick J. Wong
  2023-05-26  1:14   ` [PATCH 02/25] xfs: introduce new file range exchange ioctl Darrick J. Wong
@ 2023-05-26  1:15   ` Darrick J. Wong
  2023-05-26  1:15   ` [PATCH 04/25] xfs: move xfs_iops.c declarations out of xfs_inode.h Darrick J. Wong
                     ` (21 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:15 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

The lease breaking functions operate at the scope of the entire VFS
inode, not subranges of a file.  Move them to xfs_inode.c since they're
already declared in xfs_inode.h.  This cleanup moves us closer to
having xfs_FOO.h declare only the symbols in xfs_FOO.c.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c  |   61 ---------------------------------------------------
 fs/xfs/xfs_inode.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_inode.h |    1 -
 3 files changed, 62 insertions(+), 62 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 2380067aa154..5ecbac510056 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -802,67 +802,6 @@ xfs_file_write_iter(
 	return xfs_file_buffered_write(iocb, from);
 }
 
-static void
-xfs_wait_dax_page(
-	struct inode		*inode)
-{
-	struct xfs_inode        *ip = XFS_I(inode);
-
-	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
-	schedule();
-	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
-}
-
-int
-xfs_break_dax_layouts(
-	struct inode		*inode,
-	bool			*retry)
-{
-	struct page		*page;
-
-	ASSERT(xfs_isilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL));
-
-	page = dax_layout_busy_page(inode->i_mapping);
-	if (!page)
-		return 0;
-
-	*retry = true;
-	return ___wait_var_event(&page->_refcount,
-			atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
-			0, 0, xfs_wait_dax_page(inode));
-}
-
-int
-xfs_break_layouts(
-	struct inode		*inode,
-	uint			*iolock,
-	enum layout_break_reason reason)
-{
-	bool			retry;
-	int			error;
-
-	ASSERT(xfs_isilocked(XFS_I(inode), XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
-
-	do {
-		retry = false;
-		switch (reason) {
-		case BREAK_UNMAP:
-			error = xfs_break_dax_layouts(inode, &retry);
-			if (error || retry)
-				break;
-			fallthrough;
-		case BREAK_WRITE:
-			error = xfs_break_leased_layouts(inode, iolock, &retry);
-			break;
-		default:
-			WARN_ON_ONCE(1);
-			error = -EINVAL;
-		}
-	} while (error == 0 && retry);
-
-	return error;
-}
-
 /* Does this file, inode, or mount want synchronous writes? */
 static inline bool xfs_file_sync_writes(struct file *filp)
 {
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 167d10c614ec..f63d0d20098c 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -38,6 +38,7 @@
 #include "xfs_ag.h"
 #include "xfs_log_priv.h"
 #include "xfs_health.h"
+#include "xfs_pnfs.h"
 
 struct kmem_cache *xfs_inode_cache;
 
@@ -3699,3 +3700,64 @@ xfs_inode_count_blocks(
 	xfs_bmap_count_leaves(ifp, rblocks);
 	*dblocks = ip->i_nblocks - *rblocks;
 }
+
+static void
+xfs_wait_dax_page(
+	struct inode		*inode)
+{
+	struct xfs_inode        *ip = XFS_I(inode);
+
+	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
+	schedule();
+	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
+}
+
+int
+xfs_break_dax_layouts(
+	struct inode		*inode,
+	bool			*retry)
+{
+	struct page		*page;
+
+	ASSERT(xfs_isilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL));
+
+	page = dax_layout_busy_page(inode->i_mapping);
+	if (!page)
+		return 0;
+
+	*retry = true;
+	return ___wait_var_event(&page->_refcount,
+			atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
+			0, 0, xfs_wait_dax_page(inode));
+}
+
+int
+xfs_break_layouts(
+	struct inode		*inode,
+	uint			*iolock,
+	enum layout_break_reason reason)
+{
+	bool			retry;
+	int			error;
+
+	ASSERT(xfs_isilocked(XFS_I(inode), XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
+
+	do {
+		retry = false;
+		switch (reason) {
+		case BREAK_UNMAP:
+			error = xfs_break_dax_layouts(inode, &retry);
+			if (error || retry)
+				break;
+			fallthrough;
+		case BREAK_WRITE:
+			error = xfs_break_leased_layouts(inode, iolock, &retry);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			error = -EINVAL;
+		}
+	} while (error == 0 && retry);
+
+	return error;
+}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index f80f4761892a..de77bc053681 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -536,7 +536,6 @@ xfs_itruncate_extents(
 	return xfs_itruncate_extents_flags(tpp, ip, whichfork, new_size, 0);
 }
 
-/* from xfs_file.c */
 int	xfs_break_dax_layouts(struct inode *inode, bool *retry);
 int	xfs_break_layouts(struct inode *inode, uint *iolock,
 		enum layout_break_reason reason);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 04/25] xfs: move xfs_iops.c declarations out of xfs_inode.h
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (2 preceding siblings ...)
  2023-05-26  1:15   ` [PATCH 03/25] xfs: move inode lease breaking functions to xfs_inode.c Darrick J. Wong
@ 2023-05-26  1:15   ` Darrick J. Wong
  2023-05-26  1:15   ` [PATCH 05/25] xfs: declare xfs_file.c symbols in xfs_file.h Darrick J. Wong
                     ` (20 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:15 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Similarly, move declarations of public symbols of xfs_iops.c from
xfs_inode.h to xfs_iops.h.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_inode.h |    5 -----
 fs/xfs/xfs_iops.h  |    4 ++++
 2 files changed, 4 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index de77bc053681..fd12509560e4 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -540,11 +540,6 @@ int	xfs_break_dax_layouts(struct inode *inode, bool *retry);
 int	xfs_break_layouts(struct inode *inode, uint *iolock,
 		enum layout_break_reason reason);
 
-/* from xfs_iops.c */
-extern void xfs_setup_inode(struct xfs_inode *ip);
-extern void xfs_setup_iops(struct xfs_inode *ip);
-extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init);
-
 /*
  * When setting up a newly allocated inode, we need to call
  * xfs_finish_inode_setup() once the inode is fully instantiated at
diff --git a/fs/xfs/xfs_iops.h b/fs/xfs/xfs_iops.h
index 7f84a0843b24..8a38c3e2ed0e 100644
--- a/fs/xfs/xfs_iops.h
+++ b/fs/xfs/xfs_iops.h
@@ -19,4 +19,8 @@ int xfs_vn_setattr_size(struct mnt_idmap *idmap,
 int xfs_inode_init_security(struct inode *inode, struct inode *dir,
 		const struct qstr *qstr);
 
+extern void xfs_setup_inode(struct xfs_inode *ip);
+extern void xfs_setup_iops(struct xfs_inode *ip);
+extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init);
+
 #endif /* __XFS_IOPS_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 05/25] xfs: declare xfs_file.c symbols in xfs_file.h
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (3 preceding siblings ...)
  2023-05-26  1:15   ` [PATCH 04/25] xfs: move xfs_iops.c declarations out of xfs_inode.h Darrick J. Wong
@ 2023-05-26  1:15   ` Darrick J. Wong
  2023-05-26  1:16   ` [PATCH 06/25] xfs: create a new helper to return a file's allocation unit Darrick J. Wong
                     ` (19 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:15 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Move the two public symbols in xfs_file.c to xfs_file.h.  We're about to
add more public symbols in that source file, so let's finally create the
header file.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c  |    1 +
 fs/xfs/xfs_file.h  |   12 ++++++++++++
 fs/xfs/xfs_ioctl.c |    1 +
 fs/xfs/xfs_iops.c  |    1 +
 fs/xfs/xfs_iops.h  |    3 ---
 5 files changed, 15 insertions(+), 3 deletions(-)
 create mode 100644 fs/xfs/xfs_file.h


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5ecbac510056..1844c22b2ccd 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -24,6 +24,7 @@
 #include "xfs_pnfs.h"
 #include "xfs_iomap.h"
 #include "xfs_reflink.h"
+#include "xfs_file.h"
 
 #include <linux/dax.h>
 #include <linux/falloc.h>
diff --git a/fs/xfs/xfs_file.h b/fs/xfs/xfs_file.h
new file mode 100644
index 000000000000..7d39e3eca56d
--- /dev/null
+++ b/fs/xfs/xfs_file.h
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2000-2005 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ */
+#ifndef __XFS_FILE_H__
+#define __XFS_FILE_H__
+
+extern const struct file_operations xfs_file_operations;
+extern const struct file_operations xfs_dir_file_operations;
+
+#endif /* __XFS_FILE_H__ */
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 19724b3a5fdc..6be87b3d56df 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -39,6 +39,7 @@
 #include "xfs_ioctl.h"
 #include "xfs_xattr.h"
 #include "xfs_xchgrange.h"
+#include "xfs_file.h"
 
 #include <linux/mount.h>
 #include <linux/namei.h>
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 24718adb3c16..250568281a38 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -25,6 +25,7 @@
 #include "xfs_error.h"
 #include "xfs_ioctl.h"
 #include "xfs_xattr.h"
+#include "xfs_file.h"
 
 #include <linux/posix_acl.h>
 #include <linux/security.h>
diff --git a/fs/xfs/xfs_iops.h b/fs/xfs/xfs_iops.h
index 8a38c3e2ed0e..3c1a2605ffd2 100644
--- a/fs/xfs/xfs_iops.h
+++ b/fs/xfs/xfs_iops.h
@@ -8,9 +8,6 @@
 
 struct xfs_inode;
 
-extern const struct file_operations xfs_file_operations;
-extern const struct file_operations xfs_dir_file_operations;
-
 extern ssize_t xfs_vn_listxattr(struct dentry *, char *data, size_t size);
 
 int xfs_vn_setattr_size(struct mnt_idmap *idmap,


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 06/25] xfs: create a new helper to return a file's allocation unit
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (4 preceding siblings ...)
  2023-05-26  1:15   ` [PATCH 05/25] xfs: declare xfs_file.c symbols in xfs_file.h Darrick J. Wong
@ 2023-05-26  1:16   ` Darrick J. Wong
  2023-05-26  1:16   ` [PATCH 07/25] xfs: refactor non-power-of-two alignment checks Darrick J. Wong
                     ` (18 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:16 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Create a new helper function to calculate the fundamental allocation
unit (i.e. the smallest unit of space we can allocate) of a file.
Things are going to get hairy with range-exchange on the realtime
device, so prepare for this now.

While we're at it, export xfs_is_falloc_aligned since the next patch
will need it.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c  |   28 ++++++++++------------------
 fs/xfs/xfs_file.h  |    3 +++
 fs/xfs/xfs_inode.c |   13 +++++++++++++
 fs/xfs/xfs_inode.h |    1 +
 4 files changed, 27 insertions(+), 18 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 1844c22b2ccd..31eca20c854a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -39,33 +39,25 @@ static const struct vm_operations_struct xfs_file_vm_ops;
  * Decide if the given file range is aligned to the size of the fundamental
  * allocation unit for the file.
  */
-static bool
+bool
 xfs_is_falloc_aligned(
 	struct xfs_inode	*ip,
 	loff_t			pos,
 	long long int		len)
 {
-	struct xfs_mount	*mp = ip->i_mount;
-	uint64_t		mask;
+	unsigned int		alloc_unit = xfs_inode_alloc_unitsize(ip);
 
-	if (XFS_IS_REALTIME_INODE(ip)) {
-		if (!is_power_of_2(mp->m_sb.sb_rextsize)) {
-			u64	rextbytes;
-			u32	mod;
+	if (XFS_IS_REALTIME_INODE(ip) && !is_power_of_2(alloc_unit)) {
+		u32	mod;
 
-			rextbytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
-			div_u64_rem(pos, rextbytes, &mod);
-			if (mod)
-				return false;
-			div_u64_rem(len, rextbytes, &mod);
-			return mod == 0;
-		}
-		mask = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize) - 1;
-	} else {
-		mask = mp->m_sb.sb_blocksize - 1;
+		div_u64_rem(pos, alloc_unit, &mod);
+		if (mod)
+			return false;
+		div_u64_rem(len, alloc_unit, &mod);
+		return mod == 0;
 	}
 
-	return !((pos | len) & mask);
+	return !((pos | len) & (alloc_unit - 1));
 }
 
 /*
diff --git a/fs/xfs/xfs_file.h b/fs/xfs/xfs_file.h
index 7d39e3eca56d..2ad91f755caf 100644
--- a/fs/xfs/xfs_file.h
+++ b/fs/xfs/xfs_file.h
@@ -9,4 +9,7 @@
 extern const struct file_operations xfs_file_operations;
 extern const struct file_operations xfs_dir_file_operations;
 
+bool xfs_is_falloc_aligned(struct xfs_inode *ip, loff_t pos,
+		long long int len);
+
 #endif /* __XFS_FILE_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index f63d0d20098c..6389df4fb30e 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3761,3 +3761,16 @@ xfs_break_layouts(
 
 	return error;
 }
+
+/* Returns the size of fundamental allocation unit for a file, in bytes. */
+unsigned int
+xfs_inode_alloc_unitsize(
+	struct xfs_inode	*ip)
+{
+	unsigned int		blocks = 1;
+
+	if (XFS_IS_REALTIME_INODE(ip))
+		blocks = ip->i_mount->m_sb.sb_rextsize;
+
+	return XFS_FSB_TO_B(ip->i_mount, blocks);
+}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index fd12509560e4..1c037455fe47 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -577,6 +577,7 @@ void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2);
 
 void xfs_inode_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip,
 		xfs_filblks_t *dblocks, xfs_filblks_t *rblocks);
+unsigned int xfs_inode_alloc_unitsize(struct xfs_inode *ip);
 
 struct xfs_dir_update_params {
 	const struct xfs_inode	*dp;


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 07/25] xfs: refactor non-power-of-two alignment checks
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (5 preceding siblings ...)
  2023-05-26  1:16   ` [PATCH 06/25] xfs: create a new helper to return a file's allocation unit Darrick J. Wong
@ 2023-05-26  1:16   ` Darrick J. Wong
  2023-05-26  1:16   ` [PATCH 08/25] xfs: parameterize all the incompat log feature helpers Darrick J. Wong
                     ` (17 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:16 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Create a helper function that can compute if a 64-bit number is an
integer multiple of a 32-bit number, where the 32-bit number is not
required to be an even power of two.  This is needed for some new code
for the realtime device, where we can set 37k allocation units and then
have to remap them.
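
For illustration (not part of the patch), plain C modulo shows the semantics
of the new do_div()-based helper and why the old mask test cannot be used once
the allocation unit is not a power of two:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t pos = 37888;		/* exactly one 37k allocation unit */
	uint32_t alloc_unit = 37888;	/* 37k is not a power of two */

	/* Divisibility check: what isaligned_64() implements. */
	printf("isaligned: %d\n", pos % alloc_unit == 0);		/* 1 */

	/* Mask test: only valid for power-of-two units; wrong here. */
	printf("mask test: %d\n", (pos & (alloc_unit - 1)) == 0);	/* 0 */

	return 0;
}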

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c  |   12 +++---------
 fs/xfs/xfs_linux.h |    5 +++++
 2 files changed, 8 insertions(+), 9 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 31eca20c854a..3f23dc4e07ae 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -47,15 +47,9 @@ xfs_is_falloc_aligned(
 {
 	unsigned int		alloc_unit = xfs_inode_alloc_unitsize(ip);
 
-	if (XFS_IS_REALTIME_INODE(ip) && !is_power_of_2(alloc_unit)) {
-		u32	mod;
-
-		div_u64_rem(pos, alloc_unit, &mod);
-		if (mod)
-			return false;
-		div_u64_rem(len, alloc_unit, &mod);
-		return mod == 0;
-	}
+	if (XFS_IS_REALTIME_INODE(ip) && !is_power_of_2(alloc_unit))
+		return isaligned_64(pos, alloc_unit) &&
+		       isaligned_64(len, alloc_unit);
 
 	return !((pos | len) & (alloc_unit - 1));
 }
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index 09f727f712fe..b8c61b48cb51 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -199,6 +199,11 @@ static inline uint64_t howmany_64(uint64_t x, uint32_t y)
 	return x;
 }
 
+static inline bool isaligned_64(uint64_t x, uint32_t y)
+{
+	return do_div(x, y) == 0;
+}
+
 int xfs_rw_bdev(struct block_device *bdev, sector_t sector, unsigned int count,
 		char *data, enum req_op op);
 


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 08/25] xfs: parameterize all the incompat log feature helpers
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (6 preceding siblings ...)
  2023-05-26  1:16   ` [PATCH 07/25] xfs: refactor non-power-of-two alignment checks Darrick J. Wong
@ 2023-05-26  1:16   ` Darrick J. Wong
  2023-05-26  1:16   ` [PATCH 09/25] xfs: create a log incompat flag for atomic extent swapping Darrick J. Wong
                     ` (16 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:16 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

We're about to define a new XFS_SB_FEAT_INCOMPAT_LOG_ bit, which means
that callers will soon require the ability to toggle on and off
different log incompat feature bits.  Parameterize the
xlog_{use,drop}_incompat_feat and xfs_sb_remove_incompat_log_features
functions so that callers can specify which feature they're trying to
use and so that we can clear individual log incompat bits as needed.
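
The caller side of the new convention then looks like the xfs_xattr.c hunk
below; roughly (sketch only):

	/* Pin the logged-xattrs feature bit while we log attr intent items. */
	xlog_use_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_XATTRS);

	/* ... log the attr intent items ... */

	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_XATTRS);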

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_format.h |    5 +++--
 fs/xfs/xfs_log.c           |   34 +++++++++++++++++++++++++---------
 fs/xfs/xfs_log.h           |    9 ++++++---
 fs/xfs/xfs_log_priv.h      |    2 +-
 fs/xfs/xfs_log_recover.c   |    3 ++-
 fs/xfs/xfs_mount.c         |   11 +++++------
 fs/xfs/xfs_mount.h         |    2 +-
 fs/xfs/xfs_xattr.c         |    6 +++---
 8 files changed, 46 insertions(+), 26 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 5ba2dae7aa2f..817adb36cb1e 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -404,9 +404,10 @@ xfs_sb_has_incompat_log_feature(
 
 static inline void
 xfs_sb_remove_incompat_log_features(
-	struct xfs_sb	*sbp)
+	struct xfs_sb	*sbp,
+	uint32_t	feature)
 {
-	sbp->sb_features_log_incompat &= ~XFS_SB_FEAT_INCOMPAT_LOG_ALL;
+	sbp->sb_features_log_incompat &= ~feature;
 }
 
 static inline void
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index b32a8e57f576..a0ef09addc84 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1082,7 +1082,7 @@ xfs_log_quiesce(
 	 * failures, though it's not fatal to have a higher log feature
 	 * protection level than the log contents actually require.
 	 */
-	if (xfs_clear_incompat_log_features(mp)) {
+	if (xfs_clear_incompat_log_features(mp, XFS_SB_FEAT_INCOMPAT_LOG_ALL)) {
 		int error;
 
 		error = xfs_sync_sb(mp, false);
@@ -1489,6 +1489,7 @@ xlog_clear_incompat(
 	struct xlog		*log)
 {
 	struct xfs_mount	*mp = log->l_mp;
+	uint32_t		incompat_mask = 0;
 
 	if (!xfs_sb_has_incompat_log_feature(&mp->m_sb,
 				XFS_SB_FEAT_INCOMPAT_LOG_ALL))
@@ -1497,11 +1498,16 @@ xlog_clear_incompat(
 	if (log->l_covered_state != XLOG_STATE_COVER_DONE2)
 		return;
 
-	if (!down_write_trylock(&log->l_incompat_users))
+	if (down_write_trylock(&log->l_incompat_xattrs))
+		incompat_mask |= XFS_SB_FEAT_INCOMPAT_LOG_XATTRS;
+
+	if (!incompat_mask)
 		return;
 
-	xfs_clear_incompat_log_features(mp);
-	up_write(&log->l_incompat_users);
+	xfs_clear_incompat_log_features(mp, incompat_mask);
+
+	if (incompat_mask & XFS_SB_FEAT_INCOMPAT_LOG_XATTRS)
+		up_write(&log->l_incompat_xattrs);
 }
 
 /*
@@ -1618,7 +1624,7 @@ xlog_alloc_log(
 	}
 	log->l_sectBBsize = 1 << log2_size;
 
-	init_rwsem(&log->l_incompat_users);
+	init_rwsem(&log->l_incompat_xattrs);
 
 	xlog_get_iclog_buffer_size(mp, log);
 
@@ -3909,15 +3915,25 @@ xfs_log_check_lsn(
  */
 void
 xlog_use_incompat_feat(
-	struct xlog		*log)
+	struct xlog		*log,
+	enum xlog_incompat_feat	what)
 {
-	down_read(&log->l_incompat_users);
+	switch (what) {
+	case XLOG_INCOMPAT_FEAT_XATTRS:
+		down_read(&log->l_incompat_xattrs);
+		break;
+	}
 }
 
 /* Notify the log that we've finished using log incompat features. */
 void
 xlog_drop_incompat_feat(
-	struct xlog		*log)
+	struct xlog		*log,
+	enum xlog_incompat_feat	what)
 {
-	up_read(&log->l_incompat_users);
+	switch (what) {
+	case XLOG_INCOMPAT_FEAT_XATTRS:
+		up_read(&log->l_incompat_xattrs);
+		break;
+	}
 }
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index 2728886c2963..d187f6445909 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -159,8 +159,11 @@ bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
 xfs_lsn_t xlog_grant_push_threshold(struct xlog *log, int need_bytes);
 bool	  xlog_force_shutdown(struct xlog *log, uint32_t shutdown_flags);
 
-void xlog_use_incompat_feat(struct xlog *log);
-void xlog_drop_incompat_feat(struct xlog *log);
-int xfs_attr_use_log_assist(struct xfs_mount *mp);
+enum xlog_incompat_feat {
+	XLOG_INCOMPAT_FEAT_XATTRS = XFS_SB_FEAT_INCOMPAT_LOG_XATTRS,
+};
+
+void xlog_use_incompat_feat(struct xlog *log, enum xlog_incompat_feat what);
+void xlog_drop_incompat_feat(struct xlog *log, enum xlog_incompat_feat what);
 
 #endif	/* __XFS_LOG_H__ */
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 1bd2963e8fbd..a13b5b6b744d 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -447,7 +447,7 @@ struct xlog {
 	uint32_t		l_iclog_roundoff;/* padding roundoff */
 
 	/* Users of log incompat features should take a read lock. */
-	struct rw_semaphore	l_incompat_users;
+	struct rw_semaphore	l_incompat_xattrs;
 };
 
 /*
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 6b1f37bc3e95..81ce08c23306 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3473,7 +3473,8 @@ xlog_recover_finish(
 	 * longer anything to protect.  We rely on the AIL push to write out the
 	 * updated superblock after everything else.
 	 */
-	if (xfs_clear_incompat_log_features(log->l_mp)) {
+	if (xfs_clear_incompat_log_features(log->l_mp,
+				XFS_SB_FEAT_INCOMPAT_LOG_ALL)) {
 		error = xfs_sync_sb(log->l_mp, false);
 		if (error < 0) {
 			xfs_alert(log->l_mp,
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 31f49211fdd6..54cd47882991 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1357,13 +1357,13 @@ xfs_add_incompat_log_feature(
  */
 bool
 xfs_clear_incompat_log_features(
-	struct xfs_mount	*mp)
+	struct xfs_mount	*mp,
+	uint32_t		features)
 {
 	bool			ret = false;
 
 	if (!xfs_has_crc(mp) ||
-	    !xfs_sb_has_incompat_log_feature(&mp->m_sb,
-				XFS_SB_FEAT_INCOMPAT_LOG_ALL) ||
+	    !xfs_sb_has_incompat_log_feature(&mp->m_sb, features) ||
 	    xfs_is_shutdown(mp))
 		return false;
 
@@ -1375,9 +1375,8 @@ xfs_clear_incompat_log_features(
 	xfs_buf_lock(mp->m_sb_bp);
 	xfs_buf_hold(mp->m_sb_bp);
 
-	if (xfs_sb_has_incompat_log_feature(&mp->m_sb,
-				XFS_SB_FEAT_INCOMPAT_LOG_ALL)) {
-		xfs_sb_remove_incompat_log_features(&mp->m_sb);
+	if (xfs_sb_has_incompat_log_feature(&mp->m_sb, features)) {
+		xfs_sb_remove_incompat_log_features(&mp->m_sb, features);
 		ret = true;
 	}
 
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 622cd805dc48..f007f19d16d7 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -546,7 +546,7 @@ struct xfs_error_cfg * xfs_error_get_cfg(struct xfs_mount *mp,
 		int error_class, int error);
 void xfs_force_summary_recalc(struct xfs_mount *mp);
 int xfs_add_incompat_log_feature(struct xfs_mount *mp, uint32_t feature);
-bool xfs_clear_incompat_log_features(struct xfs_mount *mp);
+bool xfs_clear_incompat_log_features(struct xfs_mount *mp, uint32_t feature);
 void xfs_mod_delalloc(struct xfs_mount *mp, int64_t delta);
 
 #endif	/* __XFS_MOUNT_H__ */
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 43e5c219aaed..021360bbb8fb 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -37,7 +37,7 @@ xfs_attr_grab_log_assist(
 	 * Protect ourselves from an idle log clearing the logged xattrs log
 	 * incompat feature bit.
 	 */
-	xlog_use_incompat_feat(mp->m_log);
+	xlog_use_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_XATTRS);
 
 	/*
 	 * If log-assisted xattrs are already enabled, the caller can use the
@@ -57,7 +57,7 @@ xfs_attr_grab_log_assist(
 
 	return 0;
 drop_incompat:
-	xlog_drop_incompat_feat(mp->m_log);
+	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_XATTRS);
 	return error;
 }
 
@@ -65,7 +65,7 @@ static inline void
 xfs_attr_rele_log_assist(
 	struct xfs_mount	*mp)
 {
-	xlog_drop_incompat_feat(mp->m_log);
+	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_XATTRS);
 }
 
 static inline bool


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 09/25] xfs: create a log incompat flag for atomic extent swapping
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (7 preceding siblings ...)
  2023-05-26  1:16   ` [PATCH 08/25] xfs: parameterize all the incompat log feature helpers Darrick J. Wong
@ 2023-05-26  1:16   ` Darrick J. Wong
  2023-05-26  1:17   ` [PATCH 10/25] xfs: introduce a swap-extent log intent item Darrick J. Wong
                     ` (15 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:16 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Create a log incompat flag so that we only attempt to process swap
extent log items if the filesystem supports it, and a geometry flag to
advertise support if it's present.
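
For illustration, a userspace sketch of probing the new geometry flag before
attempting an atomic extent swap (not part of the patch; the header path is an
assumption):

#include <string.h>
#include <sys/ioctl.h>
#include <xfs/xfs_fs.h>		/* assumed to carry the new flag */

static int atomic_swap_supported(int fd)
{
	struct xfs_fsop_geom geo;

	memset(&geo, 0, sizeof(geo));
	if (ioctl(fd, XFS_IOC_FSGEOMETRY, &geo) < 0)
		return 0;
	return (geo.flags & XFS_FSOP_GEOM_FLAGS_ATOMIC_SWAP) != 0;
}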

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_format.h  |    1 +
 fs/xfs/libxfs/xfs_fs.h      |    1 +
 fs/xfs/libxfs/xfs_sb.c      |    3 +++
 fs/xfs/libxfs/xfs_swapext.h |   24 ++++++++++++++++++++++++
 4 files changed, 29 insertions(+)
 create mode 100644 fs/xfs/libxfs/xfs_swapext.h


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 817adb36cb1e..1424976ec955 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -391,6 +391,7 @@ xfs_sb_has_incompat_feature(
 }
 
 #define XFS_SB_FEAT_INCOMPAT_LOG_XATTRS   (1 << 0)	/* Delayed Attributes */
+#define XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT  (1U << 31)	/* file extent swap */
 #define XFS_SB_FEAT_INCOMPAT_LOG_ALL \
 	(XFS_SB_FEAT_INCOMPAT_LOG_XATTRS)
 #define XFS_SB_FEAT_INCOMPAT_LOG_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_LOG_ALL
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 29857b0f87df..cf909dbeed86 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -239,6 +239,7 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_BIGTIME	(1 << 21) /* 64-bit nsec timestamps */
 #define XFS_FSOP_GEOM_FLAGS_INOBTCNT	(1 << 22) /* inobt btree counter */
 #define XFS_FSOP_GEOM_FLAGS_NREXT64	(1 << 23) /* large extent counters */
+#define XFS_FSOP_GEOM_FLAGS_ATOMIC_SWAP	(1U << 31) /* atomic file extent swap */
 
 /*
  * Minimum and maximum sizes need for growth checks.
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 1cfa7bf276a9..bb2d97e95933 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -25,6 +25,7 @@
 #include "xfs_da_format.h"
 #include "xfs_health.h"
 #include "xfs_ag.h"
+#include "xfs_swapext.h"
 
 /*
  * Physical superblock buffer manipulations. Shared with libxfs in userspace.
@@ -1199,6 +1200,8 @@ xfs_fs_geometry(
 	}
 	if (xfs_has_large_extent_counts(mp))
 		geo->flags |= XFS_FSOP_GEOM_FLAGS_NREXT64;
+	if (xfs_swapext_supported(mp))
+		geo->flags |= XFS_FSOP_GEOM_FLAGS_ATOMIC_SWAP;
 	geo->rtsectsize = sbp->sb_blocksize;
 	geo->dirblocksize = xfs_dir2_dirblock_bytes(sbp);
 
diff --git a/fs/xfs/libxfs/xfs_swapext.h b/fs/xfs/libxfs/xfs_swapext.h
new file mode 100644
index 000000000000..6d17657cf1f6
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_swapext.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SWAPEXT_H_
+#define __XFS_SWAPEXT_H_ 1
+
+/*
+ * Decide if this filesystem supports using log items to swap file extents and
+ * restart the operation if the system fails before the operation completes.
+ *
+ * This can be done to individual file extents by using the block mapping log
+ * intent items introduced with reflink and rmap; or to entire file ranges
+ * using swapext log intent items to track the overall progress across multiple
+ * extent mappings.  Realtime is not supported yet.
+ */
+static inline bool xfs_swapext_supported(struct xfs_mount *mp)
+{
+	return (xfs_has_reflink(mp) || xfs_has_rmapbt(mp)) &&
+	       !xfs_has_realtime(mp);
+}
+
+#endif /* __XFS_SWAPEXT_H_ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 10/25] xfs: introduce a swap-extent log intent item
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (8 preceding siblings ...)
  2023-05-26  1:16   ` [PATCH 09/25] xfs: create a log incompat flag for atomic extent swapping Darrick J. Wong
@ 2023-05-26  1:17   ` Darrick J. Wong
  2023-05-26  1:17   ` [PATCH 11/25] xfs: create deferred log items for extent swapping Darrick J. Wong
                     ` (14 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:17 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Introduce a new intent log item to handle swapping extents.
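
As a quick layout sanity check (hypothetical, written in the style of the
build-time asserts in fs/xfs/xfs_ondisk.h rather than being part of this
patch), the new log format structures below should pack without padding to
these sizes:

	XFS_CHECK_STRUCT_SIZE(struct xfs_swap_extent,		64);
	XFS_CHECK_STRUCT_SIZE(struct xfs_sxi_log_format,	80);
	XFS_CHECK_STRUCT_SIZE(struct xfs_sxd_log_format,	16);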

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Makefile                 |    1 
 fs/xfs/libxfs/xfs_log_format.h  |   51 +++++++-
 fs/xfs/libxfs/xfs_log_recover.h |    2 
 fs/xfs/xfs_log_recover.c        |    2 
 fs/xfs/xfs_super.c              |   19 +++
 fs/xfs/xfs_swapext_item.c       |  258 +++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_swapext_item.h       |   56 ++++++++
 7 files changed, 386 insertions(+), 3 deletions(-)
 create mode 100644 fs/xfs/xfs_swapext_item.c
 create mode 100644 fs/xfs/xfs_swapext_item.h


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 6cc3b1fe5754..6366c945ca7d 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -111,6 +111,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_iunlink_item.o \
 				   xfs_refcount_item.o \
 				   xfs_rmap_item.o \
+				   xfs_swapext_item.o \
 				   xfs_log_recover.o \
 				   xfs_trans_ail.o \
 				   xfs_trans_buf.o
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 367f536d9881..b105a5ef6644 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -117,8 +117,9 @@ struct xfs_unmount_log_format {
 #define XLOG_REG_TYPE_ATTRD_FORMAT	28
 #define XLOG_REG_TYPE_ATTR_NAME	29
 #define XLOG_REG_TYPE_ATTR_VALUE	30
-#define XLOG_REG_TYPE_MAX		30
-
+#define XLOG_REG_TYPE_SXI_FORMAT	31
+#define XLOG_REG_TYPE_SXD_FORMAT	32
+#define XLOG_REG_TYPE_MAX		32
 
 /*
  * Flags to log operation header
@@ -243,6 +244,8 @@ typedef struct xfs_trans_header {
 #define	XFS_LI_BUD		0x1245
 #define	XFS_LI_ATTRI		0x1246  /* attr set/remove intent*/
 #define	XFS_LI_ATTRD		0x1247  /* attr set/remove done */
+#define	XFS_LI_SXI		0x1248  /* extent swap intent */
+#define	XFS_LI_SXD		0x1249  /* extent swap done */
 
 #define XFS_LI_TYPE_DESC \
 	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
@@ -260,7 +263,9 @@ typedef struct xfs_trans_header {
 	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
 	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
 	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
-	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
+	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }, \
+	{ XFS_LI_SXI,		"XFS_LI_SXI" }, \
+	{ XFS_LI_SXD,		"XFS_LI_SXD" }
 
 /*
  * Inode Log Item Format definitions.
@@ -871,6 +876,46 @@ struct xfs_bud_log_format {
 	uint64_t		bud_bui_id;	/* id of corresponding bui */
 };
 
+/*
+ * SXI/SXD (extent swapping) log format definitions
+ */
+
+struct xfs_swap_extent {
+	uint64_t		sx_inode1;
+	uint64_t		sx_inode2;
+	uint64_t		sx_startoff1;
+	uint64_t		sx_startoff2;
+	uint64_t		sx_blockcount;
+	uint64_t		sx_flags;
+	int64_t			sx_isize1;
+	int64_t			sx_isize2;
+};
+
+#define XFS_SWAP_EXT_FLAGS		(0)
+
+#define XFS_SWAP_EXT_STRINGS
+
+/* This is the structure used to lay out an sxi log item in the log. */
+struct xfs_sxi_log_format {
+	uint16_t		sxi_type;	/* sxi log item type */
+	uint16_t		sxi_size;	/* size of this item */
+	uint32_t		__pad;		/* must be zero */
+	uint64_t		sxi_id;		/* sxi identifier */
+	struct xfs_swap_extent	sxi_extent;	/* extent to swap */
+};
+
+/*
+ * This is the structure used to lay out an sxd log item in the log; it
+ * records only the id of the corresponding sxi item.
+ */
+struct xfs_sxd_log_format {
+	uint16_t		sxd_type;	/* sxd log item type */
+	uint16_t		sxd_size;	/* size of this item */
+	uint32_t		__pad;
+	uint64_t		sxd_sxi_id;	/* id of corresponding sxi */
+};
+
 /*
  * Dquot Log format definitions.
  *
diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h
index 2420865f3007..6162c93b5d38 100644
--- a/fs/xfs/libxfs/xfs_log_recover.h
+++ b/fs/xfs/libxfs/xfs_log_recover.h
@@ -74,6 +74,8 @@ extern const struct xlog_recover_item_ops xlog_cui_item_ops;
 extern const struct xlog_recover_item_ops xlog_cud_item_ops;
 extern const struct xlog_recover_item_ops xlog_attri_item_ops;
 extern const struct xlog_recover_item_ops xlog_attrd_item_ops;
+extern const struct xlog_recover_item_ops xlog_sxi_item_ops;
+extern const struct xlog_recover_item_ops xlog_sxd_item_ops;
 
 /*
  * Macros, structures, prototypes for internal log manager use.
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 81ce08c23306..006ceff1959d 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1796,6 +1796,8 @@ static const struct xlog_recover_item_ops *xlog_recover_item_ops[] = {
 	&xlog_bud_item_ops,
 	&xlog_attri_item_ops,
 	&xlog_attrd_item_ops,
+	&xlog_sxi_item_ops,
+	&xlog_sxd_item_ops,
 };
 
 static const struct xlog_recover_item_ops *
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index ba3c49cc0a74..829662819e7f 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -43,6 +43,7 @@
 #include "xfs_iunlink_item.h"
 #include "xfs_dahash_test.h"
 #include "scrub/rcbag_btree.h"
+#include "xfs_swapext_item.h"
 
 #include <linux/magic.h>
 #include <linux/fs_context.h>
@@ -2144,8 +2145,24 @@ xfs_init_caches(void)
 	if (!xfs_iunlink_cache)
 		goto out_destroy_attri_cache;
 
+	xfs_sxd_cache = kmem_cache_create("xfs_sxd_item",
+					 sizeof(struct xfs_sxd_log_item),
+					 0, 0, NULL);
+	if (!xfs_sxd_cache)
+		goto out_destroy_iul_cache;
+
+	xfs_sxi_cache = kmem_cache_create("xfs_sxi_item",
+					 sizeof(struct xfs_sxi_log_item),
+					 0, 0, NULL);
+	if (!xfs_sxi_cache)
+		goto out_destroy_sxd_cache;
+
 	return 0;
 
+ out_destroy_sxd_cache:
+	kmem_cache_destroy(xfs_sxd_cache);
+ out_destroy_iul_cache:
+	kmem_cache_destroy(xfs_iunlink_cache);
  out_destroy_attri_cache:
 	kmem_cache_destroy(xfs_attri_cache);
  out_destroy_attrd_cache:
@@ -2202,6 +2219,8 @@ xfs_destroy_caches(void)
 	 * destroy caches.
 	 */
 	rcu_barrier();
+	kmem_cache_destroy(xfs_sxd_cache);
+	kmem_cache_destroy(xfs_sxi_cache);
 	kmem_cache_destroy(xfs_iunlink_cache);
 	kmem_cache_destroy(xfs_attri_cache);
 	kmem_cache_destroy(xfs_attrd_cache);
diff --git a/fs/xfs/xfs_swapext_item.c b/fs/xfs/xfs_swapext_item.c
new file mode 100644
index 000000000000..87d1be73bbf9
--- /dev/null
+++ b/fs/xfs/xfs_swapext_item.c
@@ -0,0 +1,258 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_shared.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_inode.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_swapext_item.h"
+#include "xfs_log.h"
+#include "xfs_bmap.h"
+#include "xfs_icache.h"
+#include "xfs_trans_space.h"
+#include "xfs_error.h"
+#include "xfs_log_priv.h"
+#include "xfs_log_recover.h"
+
+struct kmem_cache	*xfs_sxi_cache;
+struct kmem_cache	*xfs_sxd_cache;
+
+static const struct xfs_item_ops xfs_sxi_item_ops;
+
+static inline struct xfs_sxi_log_item *SXI_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_sxi_log_item, sxi_item);
+}
+
+STATIC void
+xfs_sxi_item_free(
+	struct xfs_sxi_log_item	*sxi_lip)
+{
+	kmem_free(sxi_lip->sxi_item.li_lv_shadow);
+	kmem_cache_free(xfs_sxi_cache, sxi_lip);
+}
+
+/*
+ * Freeing the SXI requires that we remove it from the AIL if it has already
+ * been placed there. However, the SXI may not yet have been placed in the AIL
+ * when called by xfs_sxi_release() from SXD processing due to the ordering of
+ * committed vs unpin operations in bulk insert operations. Hence the reference
+ * count to ensure only the last caller frees the SXI.
+ */
+STATIC void
+xfs_sxi_release(
+	struct xfs_sxi_log_item	*sxi_lip)
+{
+	ASSERT(atomic_read(&sxi_lip->sxi_refcount) > 0);
+	if (atomic_dec_and_test(&sxi_lip->sxi_refcount)) {
+		xfs_trans_ail_delete(&sxi_lip->sxi_item, SHUTDOWN_LOG_IO_ERROR);
+		xfs_sxi_item_free(sxi_lip);
+	}
+}
+
+
+STATIC void
+xfs_sxi_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	*nvecs += 1;
+	*nbytes += sizeof(struct xfs_sxi_log_format);
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the given sxi log
+ * item. We use only 1 iovec, and we point that at the sxi_log_format structure
+ * embedded in the sxi item.
+ */
+STATIC void
+xfs_sxi_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_sxi_log_item	*sxi_lip = SXI_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	sxi_lip->sxi_format.sxi_type = XFS_LI_SXI;
+	sxi_lip->sxi_format.sxi_size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_SXI_FORMAT,
+			&sxi_lip->sxi_format,
+			sizeof(struct xfs_sxi_log_format));
+}
+
+/*
+ * The unpin operation is the last place an SXI is manipulated in the log. It
+ * is either inserted in the AIL or aborted in the event of a log I/O error. In
+ * either case, the SXI transaction has been successfully committed to make it
+ * this far. Therefore, we expect whoever committed the SXI to either construct
+ * and commit the SXD or drop the SXD's reference in the event of error. Simply
+ * drop the log's SXI reference now that the log is done with it.
+ */
+STATIC void
+xfs_sxi_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+	struct xfs_sxi_log_item	*sxi_lip = SXI_ITEM(lip);
+
+	xfs_sxi_release(sxi_lip);
+}
+
+/*
+ * The SXI has been either committed or aborted if the transaction has been
+ * cancelled. If the transaction was cancelled, an SXD isn't going to be
+ * constructed and thus we free the SXI here directly.
+ */
+STATIC void
+xfs_sxi_item_release(
+	struct xfs_log_item	*lip)
+{
+	xfs_sxi_release(SXI_ITEM(lip));
+}
+
+/* Allocate and initialize an sxi item. */
+STATIC struct xfs_sxi_log_item *
+xfs_sxi_init(
+	struct xfs_mount	*mp)
+
+{
+	struct xfs_sxi_log_item	*sxi_lip;
+
+	sxi_lip = kmem_cache_zalloc(xfs_sxi_cache, GFP_KERNEL | __GFP_NOFAIL);
+
+	xfs_log_item_init(mp, &sxi_lip->sxi_item, XFS_LI_SXI, &xfs_sxi_item_ops);
+	sxi_lip->sxi_format.sxi_id = (uintptr_t)(void *)sxi_lip;
+	atomic_set(&sxi_lip->sxi_refcount, 2);
+
+	return sxi_lip;
+}
+
+static inline struct xfs_sxd_log_item *SXD_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_sxd_log_item, sxd_item);
+}
+
+/* Process a swapext update intent item that was recovered from the log. */
+STATIC int
+xfs_sxi_item_recover(
+	struct xfs_log_item	*lip,
+	struct list_head	*capture_list)
+{
+	return -EFSCORRUPTED;
+}
+
+STATIC bool
+xfs_sxi_item_match(
+	struct xfs_log_item	*lip,
+	uint64_t		intent_id)
+{
+	return SXI_ITEM(lip)->sxi_format.sxi_id == intent_id;
+}
+
+/* Relog an intent item to push the log tail forward. */
+static struct xfs_log_item *
+xfs_sxi_item_relog(
+	struct xfs_log_item	*intent,
+	struct xfs_trans	*tp)
+{
+	ASSERT(0);
+	return NULL;
+}
+
+static const struct xfs_item_ops xfs_sxi_item_ops = {
+	.flags		= XFS_ITEM_INTENT,
+	.iop_size	= xfs_sxi_item_size,
+	.iop_format	= xfs_sxi_item_format,
+	.iop_unpin	= xfs_sxi_item_unpin,
+	.iop_release	= xfs_sxi_item_release,
+	.iop_recover	= xfs_sxi_item_recover,
+	.iop_match	= xfs_sxi_item_match,
+	.iop_relog	= xfs_sxi_item_relog,
+};
+
+/*
+ * This routine is called to create an in-core extent swapext update item from
+ * the sxi format structure which was logged on disk.  It allocates an in-core
+ * sxi, copies the extents from the format structure into it, and adds the sxi
+ * to the AIL with the given LSN.
+ */
+STATIC int
+xlog_recover_sxi_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_sxi_log_item		*sxi_lip;
+	struct xfs_sxi_log_format	*sxi_formatp;
+	size_t				len;
+
+	sxi_formatp = item->ri_buf[0].i_addr;
+
+	if (sxi_formatp->__pad != 0) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+
+	len = sizeof(struct xfs_sxi_log_format);
+	if (item->ri_buf[0].i_len != len) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+
+	sxi_lip = xfs_sxi_init(mp);
+	memcpy(&sxi_lip->sxi_format, sxi_formatp, len);
+
+	xfs_trans_ail_insert(log->l_ailp, &sxi_lip->sxi_item, lsn);
+	xfs_sxi_release(sxi_lip);
+	return 0;
+}
+
+const struct xlog_recover_item_ops xlog_sxi_item_ops = {
+	.item_type		= XFS_LI_SXI,
+	.commit_pass2		= xlog_recover_sxi_commit_pass2,
+};
+
+/*
+ * This routine is called when an SXD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding SXI if it
+ * was still in the log. To do this it searches the AIL for the SXI with an id
+ * equal to that in the SXD format structure. If we find it we drop the SXD
+ * reference, which removes the SXI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_sxd_commit_pass2(
+	struct xlog			*log,
+	struct list_head		*buffer_list,
+	struct xlog_recover_item	*item,
+	xfs_lsn_t			lsn)
+{
+	struct xfs_sxd_log_format	*sxd_formatp;
+
+	sxd_formatp = item->ri_buf[0].i_addr;
+	if (item->ri_buf[0].i_len != sizeof(struct xfs_sxd_log_format)) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+
+	xlog_recover_release_intent(log, XFS_LI_SXI, sxd_formatp->sxd_sxi_id);
+	return 0;
+}
+
+const struct xlog_recover_item_ops xlog_sxd_item_ops = {
+	.item_type		= XFS_LI_SXD,
+	.commit_pass2		= xlog_recover_sxd_commit_pass2,
+};
diff --git a/fs/xfs/xfs_swapext_item.h b/fs/xfs/xfs_swapext_item.h
new file mode 100644
index 000000000000..07a06577de58
--- /dev/null
+++ b/fs/xfs/xfs_swapext_item.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef	__XFS_SWAPEXT_ITEM_H__
+#define	__XFS_SWAPEXT_ITEM_H__
+
+/*
+ * The extent swapping intent item helps us perform atomic extent swaps between
+ * two inode forks.  It does this by tracking the range of logical offsets that
+ * still need to be swapped, and relogs as progress happens.
+ *
+ * *I items should be recorded in the *first* of a series of rolled
+ * transactions, and the *D items should be recorded in the same transaction
+ * that records the associated bmbt updates.
+ *
+ * Should the system crash after the commit of the first transaction but
+ * before the commit of the final transaction in a series, log recovery will
+ * use the redo information recorded by the intent items to replay the
+ * rest of the extent swaps.
+ */
+
+/* kernel only SXI/SXD definitions */
+
+struct xfs_mount;
+struct kmem_cache;
+
+/*
+ * This is the "swapext update intent" log item.  It is used to log the fact
+ * that we are swapping extents between two files.  It is used in conjunction
+ * with the "swapext update done" log item described below.
+ *
+ * These log items follow the same rules as struct xfs_efi_log_item; see the
+ * comments about that structure (in xfs_extfree_item.h) for more details.
+ */
+struct xfs_sxi_log_item {
+	struct xfs_log_item		sxi_item;
+	atomic_t			sxi_refcount;
+	struct xfs_sxi_log_format	sxi_format;
+};
+
+/*
+ * This is the "swapext update done" log item.  It is used to log the fact that
+ * some extent swaps mentioned in an earlier sxi item have been performed.
+ */
+struct xfs_sxd_log_item {
+	struct xfs_log_item		sxd_item;
+	struct xfs_sxi_log_item		*sxd_intent_log_item;
+	struct xfs_sxd_log_format	sxd_format;
+};
+
+extern struct kmem_cache	*xfs_sxi_cache;
+extern struct kmem_cache	*xfs_sxd_cache;
+
+#endif	/* __XFS_SWAPEXT_ITEM_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 11/25] xfs: create deferred log items for extent swapping
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (9 preceding siblings ...)
  2023-05-26  1:17   ` [PATCH 10/25] xfs: introduce a swap-extent log intent item Darrick J. Wong
@ 2023-05-26  1:17   ` Darrick J. Wong
  2023-05-26  1:17   ` [PATCH 12/25] xfs: enable xlog users to toggle atomic " Darrick J. Wong
                     ` (13 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:17 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Now that we've created the skeleton of a log intent item to track and
restart extent swap operations, add the upper level logic to commit
intent items and turn them into concrete work recorded in the log.  We
use the deferred item "multihop" feature that was introduced a few
patches ago to constrain the number of active swap operations to one per
thread.
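
As a rough sketch of how an upper layer might drive the new deferred
operation (the real callers arrive with the xfs_xchgrange.c changes;
example_swap_range is a made-up name, and locking, quota attachment, log
incompat enablement and error unwinding are all elided), the flow is:
estimate the work, allocate a transaction with the estimated reservation,
then schedule the intent and commit:

	/* Illustrative only: swap the first @len blocks of two data forks. */
	static int
	example_swap_range(
		struct xfs_inode	*ip1,
		struct xfs_inode	*ip2,
		xfs_filblks_t		len)
	{
		struct xfs_swapext_req	req = {
			.ip1		= ip1,
			.ip2		= ip2,
			.startoff1	= 0,
			.startoff2	= 0,
			.blockcount	= len,
			.whichfork	= XFS_DATA_FORK,
			.req_flags	= XFS_SWAP_REQ_LOGGED,
		};
		struct xfs_trans	*tp;
		int			error;

		/* Needs the ILOCKs held; the (elided) locking helper does that. */
		error = xfs_swapext_estimate(&req);
		if (error)
			return error;

		error = xfs_trans_alloc(ip1->i_mount, &M_RES(ip1->i_mount)->tr_write,
				req.resblks, 0, 0, &tp);
		if (error)
			return error;

		xfs_trans_ijoin(tp, ip1, 0);
		xfs_trans_ijoin(tp, ip2, 0);

		/*
		 * Queue an SXI through the new defer op type.  XFS_SWAP_REQ_LOGGED
		 * assumes the SWAPEXT log incompat bit was turned on beforehand.
		 */
		xfs_swapext(tp, &req);
		return xfs_trans_commit(tp);
	}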

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Makefile                 |    1 
 fs/xfs/libxfs/xfs_bmap.h        |    2 
 fs/xfs/libxfs/xfs_defer.c       |    7 
 fs/xfs/libxfs/xfs_defer.h       |    3 
 fs/xfs/libxfs/xfs_format.h      |    6 
 fs/xfs/libxfs/xfs_log_format.h  |   31 +
 fs/xfs/libxfs/xfs_swapext.c     | 1040 +++++++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_swapext.h     |  145 +++++
 fs/xfs/libxfs/xfs_trans_space.h |    4 
 fs/xfs/xfs_swapext_item.c       |  419 +++++++++++++++-
 fs/xfs/xfs_trace.c              |    1 
 fs/xfs/xfs_trace.h              |  216 ++++++++
 fs/xfs/xfs_xchgrange.c          |   50 ++
 fs/xfs/xfs_xchgrange.h          |   10 
 14 files changed, 1919 insertions(+), 16 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_swapext.c


diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 6366c945ca7d..36baf9913b08 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -46,6 +46,7 @@ xfs-y				+= $(addprefix libxfs/, \
 				   xfs_refcount.o \
 				   xfs_refcount_btree.o \
 				   xfs_sb.o \
+				   xfs_swapext.o \
 				   xfs_symlink_remote.o \
 				   xfs_trans_inode.o \
 				   xfs_trans_resv.o \
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index e35ddc9c0412..81be2b108ade 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -156,7 +156,7 @@ static inline bool xfs_bmap_is_real_extent(const struct xfs_bmbt_irec *irec)
  * Return true if the extent is a real, allocated extent, or false if it is  a
  * delayed allocation, and unwritten extent or a hole.
  */
-static inline bool xfs_bmap_is_written_extent(struct xfs_bmbt_irec *irec)
+static inline bool xfs_bmap_is_written_extent(const struct xfs_bmbt_irec *irec)
 {
 	return xfs_bmap_is_real_extent(irec) &&
 	       irec->br_state != XFS_EXT_UNWRITTEN;
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index bcfb6a4203cd..1619b9b928db 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -26,6 +26,7 @@
 #include "xfs_da_format.h"
 #include "xfs_da_btree.h"
 #include "xfs_attr.h"
+#include "xfs_swapext.h"
 
 static struct kmem_cache	*xfs_defer_pending_cache;
 
@@ -189,6 +190,7 @@ static const struct xfs_defer_op_type *defer_op_types[] = {
 	[XFS_DEFER_OPS_TYPE_FREE]	= &xfs_extent_free_defer_type,
 	[XFS_DEFER_OPS_TYPE_AGFL_FREE]	= &xfs_agfl_free_defer_type,
 	[XFS_DEFER_OPS_TYPE_ATTR]	= &xfs_attr_defer_type,
+	[XFS_DEFER_OPS_TYPE_SWAPEXT]	= &xfs_swapext_defer_type,
 };
 
 /*
@@ -913,6 +915,10 @@ xfs_defer_init_item_caches(void)
 	error = xfs_attr_intent_init_cache();
 	if (error)
 		goto err;
+	error = xfs_swapext_intent_init_cache();
+	if (error)
+		goto err;
+
 	return 0;
 err:
 	xfs_defer_destroy_item_caches();
@@ -923,6 +929,7 @@ xfs_defer_init_item_caches(void)
 void
 xfs_defer_destroy_item_caches(void)
 {
+	xfs_swapext_intent_destroy_cache();
 	xfs_attr_intent_destroy_cache();
 	xfs_extfree_intent_destroy_cache();
 	xfs_bmap_intent_destroy_cache();
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index 114a3a4930a3..bcc48b0c75c9 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -20,6 +20,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_FREE,
 	XFS_DEFER_OPS_TYPE_AGFL_FREE,
 	XFS_DEFER_OPS_TYPE_ATTR,
+	XFS_DEFER_OPS_TYPE_SWAPEXT,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
@@ -65,7 +66,7 @@ extern const struct xfs_defer_op_type xfs_rmap_update_defer_type;
 extern const struct xfs_defer_op_type xfs_extent_free_defer_type;
 extern const struct xfs_defer_op_type xfs_agfl_free_defer_type;
 extern const struct xfs_defer_op_type xfs_attr_defer_type;
-
+extern const struct xfs_defer_op_type xfs_swapext_defer_type;
 
 /*
  * Deferred operation item relogging limits.
diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 1424976ec955..bb8bff488017 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -425,6 +425,12 @@ static inline bool xfs_sb_version_haslogxattrs(struct xfs_sb *sbp)
 		 XFS_SB_FEAT_INCOMPAT_LOG_XATTRS);
 }
 
+static inline bool xfs_sb_version_haslogswapext(struct xfs_sb *sbp)
+{
+	return xfs_sb_is_v5(sbp) && (sbp->sb_features_log_incompat &
+		 XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT);
+}
+
 static inline bool
 xfs_is_quota_inode(struct xfs_sb *sbp, xfs_ino_t ino)
 {
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index b105a5ef6644..171f72e41225 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -891,9 +891,36 @@ struct xfs_swap_extent {
 	int64_t			sx_isize2;
 };
 
-#define XFS_SWAP_EXT_FLAGS		(0)
+/* Swap extents between extended attribute forks. */
+#define XFS_SWAP_EXT_ATTR_FORK		(1ULL << 0)
 
-#define XFS_SWAP_EXT_STRINGS
+/* Set the file sizes when finished. */
+#define XFS_SWAP_EXT_SET_SIZES		(1ULL << 1)
+
+/*
+ * Swap only the extents of the two files where the file allocation units
+ * mapped to file1's range have been written to.
+ */
+#define XFS_SWAP_EXT_INO1_WRITTEN	(1ULL << 2)
+
+/* Clear the reflink flag from inode1 after the operation. */
+#define XFS_SWAP_EXT_CLEAR_INO1_REFLINK	(1ULL << 3)
+
+/* Clear the reflink flag from inode2 after the operation. */
+#define XFS_SWAP_EXT_CLEAR_INO2_REFLINK	(1ULL << 4)
+
+#define XFS_SWAP_EXT_FLAGS		(XFS_SWAP_EXT_ATTR_FORK | \
+					 XFS_SWAP_EXT_SET_SIZES | \
+					 XFS_SWAP_EXT_INO1_WRITTEN | \
+					 XFS_SWAP_EXT_CLEAR_INO1_REFLINK | \
+					 XFS_SWAP_EXT_CLEAR_INO2_REFLINK)
+
+#define XFS_SWAP_EXT_STRINGS \
+	{ XFS_SWAP_EXT_ATTR_FORK,		"ATTRFORK" }, \
+	{ XFS_SWAP_EXT_SET_SIZES,		"SETSIZES" }, \
+	{ XFS_SWAP_EXT_INO1_WRITTEN,		"INO1_WRITTEN" }, \
+	{ XFS_SWAP_EXT_CLEAR_INO1_REFLINK,	"CLEAR_INO1_REFLINK" }, \
+	{ XFS_SWAP_EXT_CLEAR_INO2_REFLINK,	"CLEAR_INO2_REFLINK" }
 
 /* This is the structure used to lay out an sxi log item in the log. */
 struct xfs_sxi_log_format {
diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c
new file mode 100644
index 000000000000..671dd8365a02
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_swapext.c
@@ -0,0 +1,1040 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_inode.h"
+#include "xfs_trans.h"
+#include "xfs_bmap.h"
+#include "xfs_icache.h"
+#include "xfs_quota.h"
+#include "xfs_swapext.h"
+#include "xfs_trace.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_trans_space.h"
+#include "xfs_error.h"
+#include "xfs_errortag.h"
+#include "xfs_health.h"
+
+struct kmem_cache	*xfs_swapext_intent_cache;
+
+/* bmbt mappings adjacent to a pair of records. */
+struct xfs_swapext_adjacent {
+	struct xfs_bmbt_irec		left1;
+	struct xfs_bmbt_irec		right1;
+	struct xfs_bmbt_irec		left2;
+	struct xfs_bmbt_irec		right2;
+};
+
+#define ADJACENT_INIT { \
+	.left1  = { .br_startblock = HOLESTARTBLOCK }, \
+	.right1 = { .br_startblock = HOLESTARTBLOCK }, \
+	.left2  = { .br_startblock = HOLESTARTBLOCK }, \
+	.right2 = { .br_startblock = HOLESTARTBLOCK }, \
+}
+
+/* Information to help us reset reflink flag / CoW fork state after a swap. */
+
+/* Previous state of the two inodes' reflink flags. */
+#define XFS_REFLINK_STATE_IP1		(1U << 0)
+#define XFS_REFLINK_STATE_IP2		(1U << 1)
+
+/*
+ * If the reflink flag is set on either inode, make sure it has an incore CoW
+ * fork, since all reflink inodes must have them.  If there's a CoW fork and it
+ * has extents in it, make sure the inodes are tagged appropriately so that
+ * speculative preallocations can be GC'd if we run low on space.
+ */
+static inline void
+xfs_swapext_ensure_cowfork(
+	struct xfs_inode	*ip)
+{
+	struct xfs_ifork	*cfork;
+
+	if (xfs_is_reflink_inode(ip))
+		xfs_ifork_init_cow(ip);
+
+	cfork = xfs_ifork_ptr(ip, XFS_COW_FORK);
+	if (!cfork)
+		return;
+	if (cfork->if_bytes > 0)
+		xfs_inode_set_cowblocks_tag(ip);
+	else
+		xfs_inode_clear_cowblocks_tag(ip);
+}
+
+/* Schedule an atomic extent swap. */
+void
+xfs_swapext_schedule(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi)
+{
+	trace_xfs_swapext_defer(tp->t_mountp, sxi);
+	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_SWAPEXT, &sxi->sxi_list);
+}
+
+/*
+ * Adjust the on-disk inode size upwards if needed so that we never map extents
+ * into the file past EOF.  This is crucial so that log recovery won't get
+ * confused by the sudden appearance of post-eof extents.
+ */
+STATIC void
+xfs_swapext_update_size(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	struct xfs_bmbt_irec	*imap,
+	xfs_fsize_t		new_isize)
+{
+	struct xfs_mount	*mp = tp->t_mountp;
+	xfs_fsize_t		len;
+
+	if (new_isize < 0)
+		return;
+
+	len = min(XFS_FSB_TO_B(mp, imap->br_startoff + imap->br_blockcount),
+		  new_isize);
+
+	if (len <= ip->i_disk_size)
+		return;
+
+	trace_xfs_swapext_update_inode_size(ip, len);
+
+	ip->i_disk_size = len;
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+}
+
+static inline bool
+sxi_has_more_swap_work(const struct xfs_swapext_intent *sxi)
+{
+	return sxi->sxi_blockcount > 0;
+}
+
+static inline bool
+sxi_has_postop_work(const struct xfs_swapext_intent *sxi)
+{
+	return sxi->sxi_flags & (XFS_SWAP_EXT_CLEAR_INO1_REFLINK |
+				 XFS_SWAP_EXT_CLEAR_INO2_REFLINK);
+}
+
+static inline void
+sxi_advance(
+	struct xfs_swapext_intent	*sxi,
+	const struct xfs_bmbt_irec	*irec)
+{
+	sxi->sxi_startoff1 += irec->br_blockcount;
+	sxi->sxi_startoff2 += irec->br_blockcount;
+	sxi->sxi_blockcount -= irec->br_blockcount;
+}
+
+/* Check all extents to make sure we can actually swap them. */
+int
+xfs_swapext_check_extents(
+	struct xfs_mount		*mp,
+	const struct xfs_swapext_req	*req)
+{
+	struct xfs_ifork		*ifp1, *ifp2;
+
+	/* No fork? */
+	ifp1 = xfs_ifork_ptr(req->ip1, req->whichfork);
+	ifp2 = xfs_ifork_ptr(req->ip2, req->whichfork);
+	if (!ifp1 || !ifp2)
+		return -EINVAL;
+
+	/* We don't know how to swap local format forks. */
+	if (ifp1->if_format == XFS_DINODE_FMT_LOCAL ||
+	    ifp2->if_format == XFS_DINODE_FMT_LOCAL)
+		return -EINVAL;
+
+	/* We don't support realtime data forks yet. */
+	if (!XFS_IS_REALTIME_INODE(req->ip1))
+		return 0;
+	if (req->whichfork == XFS_ATTR_FORK)
+		return 0;
+	return -EINVAL;
+}
+
+#ifdef CONFIG_XFS_QUOTA
+/* Log the actual updates to the quota accounting. */
+static inline void
+xfs_swapext_update_quota(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi,
+	struct xfs_bmbt_irec		*irec1,
+	struct xfs_bmbt_irec		*irec2)
+{
+	int64_t				ip1_delta = 0, ip2_delta = 0;
+	unsigned int			qflag;
+
+	qflag = XFS_IS_REALTIME_INODE(sxi->sxi_ip1) ? XFS_TRANS_DQ_RTBCOUNT :
+						      XFS_TRANS_DQ_BCOUNT;
+
+	if (xfs_bmap_is_real_extent(irec1)) {
+		ip1_delta -= irec1->br_blockcount;
+		ip2_delta += irec1->br_blockcount;
+	}
+
+	if (xfs_bmap_is_real_extent(irec2)) {
+		ip1_delta += irec2->br_blockcount;
+		ip2_delta -= irec2->br_blockcount;
+	}
+
+	xfs_trans_mod_dquot_byino(tp, sxi->sxi_ip1, qflag, ip1_delta);
+	xfs_trans_mod_dquot_byino(tp, sxi->sxi_ip2, qflag, ip2_delta);
+}
+#else
+# define xfs_swapext_update_quota(tp, sxi, irec1, irec2)	((void)0)
+#endif
+
+/* Decide if we want to skip this mapping from file1. */
+static inline bool
+xfs_swapext_can_skip_mapping(
+	struct xfs_swapext_intent	*sxi,
+	struct xfs_bmbt_irec		*irec)
+{
+	/* Do not skip this mapping if the caller did not tell us to. */
+	if (!(sxi->sxi_flags & XFS_SWAP_EXT_INO1_WRITTEN))
+		return false;
+
+	/* Do not skip mapped, written extents. */
+	if (xfs_bmap_is_written_extent(irec))
+		return false;
+
+	/*
+	 * The mapping is unwritten or a hole.  It cannot be a delalloc
+	 * reservation because we already excluded those.  It cannot be an
+	 * unwritten extent with dirty page cache because we flushed the page
+	 * cache.  We don't support realtime files yet, so we needn't (yet)
+	 * deal with them.
+	 */
+	return true;
+}
+
+/*
+ * Walk forward through the file ranges in @sxi until we find two different
+ * mappings to exchange.  If there is work to do, return the mappings;
+ * otherwise we've reached the end of the range and sxi_blockcount will be
+ * zero.
+ *
+ * If the walk skips over a pair of mappings to the same storage, save them as
+ * the left records in @adj (if provided) so that the simulation phase can
+ * avoid an extra lookup.
+ */
+static int
+xfs_swapext_find_mappings(
+	struct xfs_swapext_intent	*sxi,
+	struct xfs_bmbt_irec		*irec1,
+	struct xfs_bmbt_irec		*irec2,
+	struct xfs_swapext_adjacent	*adj)
+{
+	int				nimaps;
+	int				bmap_flags;
+	int				error;
+
+	bmap_flags = xfs_bmapi_aflag(xfs_swapext_whichfork(sxi));
+
+	for (; sxi_has_more_swap_work(sxi); sxi_advance(sxi, irec1)) {
+		/* Read extent from the first file */
+		nimaps = 1;
+		error = xfs_bmapi_read(sxi->sxi_ip1, sxi->sxi_startoff1,
+				sxi->sxi_blockcount, irec1, &nimaps,
+				bmap_flags);
+		if (error)
+			return error;
+		if (nimaps != 1 ||
+		    irec1->br_startblock == DELAYSTARTBLOCK ||
+		    irec1->br_startoff != sxi->sxi_startoff1) {
+			/*
+			 * We should never get no mapping or a delalloc extent
+			 * or something that doesn't match what we asked for,
+			 * since the caller flushed both inodes and we hold the
+			 * ILOCKs for both inodes.
+			 */
+			ASSERT(0);
+			return -EINVAL;
+		}
+
+		if (xfs_swapext_can_skip_mapping(sxi, irec1)) {
+			trace_xfs_swapext_extent1_skip(sxi->sxi_ip1, irec1);
+			continue;
+		}
+
+		/* Read extent from the second file */
+		nimaps = 1;
+		error = xfs_bmapi_read(sxi->sxi_ip2, sxi->sxi_startoff2,
+				irec1->br_blockcount, irec2, &nimaps,
+				bmap_flags);
+		if (error)
+			return error;
+		if (nimaps != 1 ||
+		    irec2->br_startblock == DELAYSTARTBLOCK ||
+		    irec2->br_startoff != sxi->sxi_startoff2) {
+			/*
+			 * We should never get no mapping or a delalloc extent
+			 * or something that doesn't match what we asked for,
+			 * since the caller flushed both inodes and we hold the
+			 * ILOCKs for both inodes.
+			 */
+			ASSERT(0);
+			return -EINVAL;
+		}
+
+		/*
+		 * We can only swap as many blocks as the smaller of the two
+		 * extent maps.
+		 */
+		irec1->br_blockcount = min(irec1->br_blockcount,
+					   irec2->br_blockcount);
+
+		trace_xfs_swapext_extent1(sxi->sxi_ip1, irec1);
+		trace_xfs_swapext_extent2(sxi->sxi_ip2, irec2);
+
+		/* We found something to swap, so return it. */
+		if (irec1->br_startblock != irec2->br_startblock)
+			return 0;
+
+		/*
+		 * Two extents mapped to the same physical block must not have
+		 * different states; that's filesystem corruption.  Move on to
+		 * the next extent if they're both holes or both the same
+		 * physical extent.
+		 */
+		if (irec1->br_state != irec2->br_state) {
+			xfs_bmap_mark_sick(sxi->sxi_ip1,
+					xfs_swapext_whichfork(sxi));
+			xfs_bmap_mark_sick(sxi->sxi_ip2,
+					xfs_swapext_whichfork(sxi));
+			return -EFSCORRUPTED;
+		}
+
+		/*
+		 * Save the mappings if we're estimating work and skipping
+		 * these identical mappings.
+		 */
+		if (adj) {
+			memcpy(&adj->left1, irec1, sizeof(*irec1));
+			memcpy(&adj->left2, irec2, sizeof(*irec2));
+		}
+	}
+
+	return 0;
+}
+
+/* Exchange these two mappings. */
+static void
+xfs_swapext_exchange_mappings(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi,
+	struct xfs_bmbt_irec		*irec1,
+	struct xfs_bmbt_irec		*irec2)
+{
+	int				whichfork = xfs_swapext_whichfork(sxi);
+
+	xfs_swapext_update_quota(tp, sxi, irec1, irec2);
+
+	/* Remove both mappings. */
+	xfs_bmap_unmap_extent(tp, sxi->sxi_ip1, whichfork, irec1);
+	xfs_bmap_unmap_extent(tp, sxi->sxi_ip2, whichfork, irec2);
+
+	/*
+	 * Re-add both mappings.  We swap the file offsets between the two maps
+	 * and add the opposite map, which has the effect of filling the
+	 * logical offsets we just unmapped, but with the physical mapping
+	 * information swapped.
+	 */
+	swap(irec1->br_startoff, irec2->br_startoff);
+	xfs_bmap_map_extent(tp, sxi->sxi_ip1, whichfork, irec2);
+	xfs_bmap_map_extent(tp, sxi->sxi_ip2, whichfork, irec1);
+
+	/* Make sure we're not mapping extents past EOF. */
+	if (whichfork == XFS_DATA_FORK) {
+		xfs_swapext_update_size(tp, sxi->sxi_ip1, irec2,
+				sxi->sxi_isize1);
+		xfs_swapext_update_size(tp, sxi->sxi_ip2, irec1,
+				sxi->sxi_isize2);
+	}
+
+	/*
+	 * Advance our cursor and exit.  The caller (either defer ops or log
+	 * recovery) will log the SXD item, and if sxi_blockcount is nonzero, it
+	 * will log a new SXI item for the remainder and call us back.
+	 */
+	sxi_advance(sxi, irec1);
+}
+
+static inline void
+xfs_swapext_clear_reflink(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip)
+{
+	trace_xfs_reflink_unset_inode_flag(ip);
+
+	ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+}
+
+/* Finish whatever work might come after a swap operation. */
+static int
+xfs_swapext_do_postop_work(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi)
+{
+	if (sxi->sxi_flags & XFS_SWAP_EXT_CLEAR_INO1_REFLINK) {
+		xfs_swapext_clear_reflink(tp, sxi->sxi_ip1);
+		sxi->sxi_flags &= ~XFS_SWAP_EXT_CLEAR_INO1_REFLINK;
+	}
+
+	if (sxi->sxi_flags & XFS_SWAP_EXT_CLEAR_INO2_REFLINK) {
+		xfs_swapext_clear_reflink(tp, sxi->sxi_ip2);
+		sxi->sxi_flags &= ~XFS_SWAP_EXT_CLEAR_INO2_REFLINK;
+	}
+
+	return 0;
+}
+
+/* Finish one extent swap, possibly log more. */
+int
+xfs_swapext_finish_one(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi)
+{
+	struct xfs_bmbt_irec		irec1, irec2;
+	int				error;
+
+	if (sxi_has_more_swap_work(sxi)) {
+		/*
+		 * If the operation state says that some range of the files
+		 * have not yet been swapped, look for extents in that range to
+		 * swap.  If we find some extents, swap them.
+		 */
+		error = xfs_swapext_find_mappings(sxi, &irec1, &irec2, NULL);
+		if (error)
+			return error;
+
+		if (sxi_has_more_swap_work(sxi))
+			xfs_swapext_exchange_mappings(tp, sxi, &irec1, &irec2);
+
+		/*
+		 * If the caller asked us to exchange the file sizes after the
+		 * swap and either we just swapped the last extents in the
+		 * range or we didn't find anything to swap, update the ondisk
+		 * file sizes.
+		 */
+		if ((sxi->sxi_flags & XFS_SWAP_EXT_SET_SIZES) &&
+		    !sxi_has_more_swap_work(sxi)) {
+			sxi->sxi_ip1->i_disk_size = sxi->sxi_isize1;
+			sxi->sxi_ip2->i_disk_size = sxi->sxi_isize2;
+
+			xfs_trans_log_inode(tp, sxi->sxi_ip1, XFS_ILOG_CORE);
+			xfs_trans_log_inode(tp, sxi->sxi_ip2, XFS_ILOG_CORE);
+		}
+	} else if (sxi_has_postop_work(sxi)) {
+		/*
+		 * Now that we're finished with the swap operation, complete
+		 * the post-op cleanup work.
+		 */
+		error = xfs_swapext_do_postop_work(tp, sxi);
+		if (error)
+			return error;
+	}
+
+	/* If we still have work to do, ask for a new transaction. */
+	if (sxi_has_more_swap_work(sxi) || sxi_has_postop_work(sxi)) {
+		trace_xfs_swapext_defer(tp->t_mountp, sxi);
+		return -EAGAIN;
+	}
+
+	/*
+	 * If we reach here, we've finished all the swapping work and the post
+	 * operation work.  The last thing we need to do before returning to
+	 * the caller is to make sure that COW forks are set up correctly.
+	 */
+	if (!(sxi->sxi_flags & XFS_SWAP_EXT_ATTR_FORK)) {
+		xfs_swapext_ensure_cowfork(sxi->sxi_ip1);
+		xfs_swapext_ensure_cowfork(sxi->sxi_ip2);
+	}
+
+	return 0;
+}
+
+/*
+ * Compute the amount of bmbt blocks we should reserve for each file.  In the
+ * worst case, each exchange will fill a hole with a new mapping, which could
+ * result in a btree split every time we add a new leaf block.
+ */
+static inline uint64_t
+xfs_swapext_bmbt_blocks(
+	struct xfs_mount		*mp,
+	const struct xfs_swapext_req	*req)
+{
+	return howmany_64(req->nr_exchanges,
+					XFS_MAX_CONTIG_BMAPS_PER_BLOCK(mp)) *
+			XFS_EXTENTADD_SPACE_RES(mp, req->whichfork);
+}
+
+static inline uint64_t
+xfs_swapext_rmapbt_blocks(
+	struct xfs_mount		*mp,
+	const struct xfs_swapext_req	*req)
+{
+	if (!xfs_has_rmapbt(mp))
+		return 0;
+	if (XFS_IS_REALTIME_INODE(req->ip1))
+		return 0;
+
+	return howmany_64(req->nr_exchanges,
+					XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) *
+			XFS_RMAPADD_SPACE_RES(mp);
+}
+
+/* Estimate the bmbt and rmapbt overhead required to exchange extents. */
+static int
+xfs_swapext_estimate_overhead(
+	struct xfs_swapext_req	*req)
+{
+	struct xfs_mount	*mp = req->ip1->i_mount;
+	xfs_filblks_t		bmbt_blocks;
+	xfs_filblks_t		rmapbt_blocks;
+	xfs_filblks_t		resblks = req->resblks;
+
+	/*
+	 * Compute the number of bmbt and rmapbt blocks we might need to handle
+	 * the estimated number of exchanges.
+	 */
+	bmbt_blocks = xfs_swapext_bmbt_blocks(mp, req);
+	rmapbt_blocks = xfs_swapext_rmapbt_blocks(mp, req);
+
+	trace_xfs_swapext_overhead(mp, bmbt_blocks, rmapbt_blocks);
+
+	/* Make sure the change in file block count doesn't overflow. */
+	if (check_add_overflow(req->ip1_bcount, bmbt_blocks, &req->ip1_bcount))
+		return -EFBIG;
+	if (check_add_overflow(req->ip2_bcount, bmbt_blocks, &req->ip2_bcount))
+		return -EFBIG;
+
+	/*
+	 * Add together the number of blocks we need to handle btree growth,
+	 * then add that to the number of blocks we need to reserve for this
+	 * transaction.
+	 */
+	if (check_add_overflow(resblks, bmbt_blocks, &resblks))
+		return -ENOSPC;
+	if (check_add_overflow(resblks, bmbt_blocks, &resblks))
+		return -ENOSPC;
+	if (check_add_overflow(resblks, rmapbt_blocks, &resblks))
+		return -ENOSPC;
+	if (check_add_overflow(resblks, rmapbt_blocks, &resblks))
+		return -ENOSPC;
+
+	/* Can't actually reserve more than UINT_MAX blocks. */
+	if (req->resblks > UINT_MAX)
+		return -ENOSPC;
+
+	req->resblks = resblks;
+	trace_xfs_swapext_final_estimate(req);
+	return 0;
+}
+
+/* Decide if we can merge two real extents. */
+static inline bool
+can_merge(
+	const struct xfs_bmbt_irec	*b1,
+	const struct xfs_bmbt_irec	*b2)
+{
+	/* Don't merge holes. */
+	if (b1->br_startblock == HOLESTARTBLOCK ||
+	    b2->br_startblock == HOLESTARTBLOCK)
+		return false;
+
+	/* Don't merge delalloc reservations either. */
+	if (!xfs_bmap_is_real_extent(b1) || !xfs_bmap_is_real_extent(b2))
+		return false;
+
+	if (b1->br_startoff   + b1->br_blockcount == b2->br_startoff &&
+	    b1->br_startblock + b1->br_blockcount == b2->br_startblock &&
+	    b1->br_state			  == b2->br_state &&
+	    b1->br_blockcount + b2->br_blockcount <= XFS_MAX_BMBT_EXTLEN)
+		return true;
+
+	return false;
+}
+
+#define CLEFT_CONTIG	0x01
+#define CRIGHT_CONTIG	0x02
+#define CHOLE		0x04
+#define CBOTH_CONTIG	(CLEFT_CONTIG | CRIGHT_CONTIG)
+
+#define NLEFT_CONTIG	0x10
+#define NRIGHT_CONTIG	0x20
+#define NHOLE		0x40
+#define NBOTH_CONTIG	(NLEFT_CONTIG | NRIGHT_CONTIG)
+
+/* Estimate the effect of a single swap on extent count. */
+static inline int
+delta_nextents_step(
+	struct xfs_mount		*mp,
+	const struct xfs_bmbt_irec	*left,
+	const struct xfs_bmbt_irec	*curr,
+	const struct xfs_bmbt_irec	*new,
+	const struct xfs_bmbt_irec	*right)
+{
+	bool				lhole, rhole, chole, nhole;
+	unsigned int			state = 0;
+	int				ret = 0;
+
+	lhole = left->br_startblock == HOLESTARTBLOCK;
+	rhole = right->br_startblock == HOLESTARTBLOCK;
+	chole = curr->br_startblock == HOLESTARTBLOCK;
+	nhole = new->br_startblock == HOLESTARTBLOCK;
+
+	if (chole)
+		state |= CHOLE;
+	if (!lhole && !chole && can_merge(left, curr))
+		state |= CLEFT_CONTIG;
+	if (!rhole && !chole && can_merge(curr, right))
+		state |= CRIGHT_CONTIG;
+	if ((state & CBOTH_CONTIG) == CBOTH_CONTIG &&
+	    left->br_blockcount + curr->br_blockcount +
+					right->br_blockcount > XFS_MAX_BMBT_EXTLEN)
+		state &= ~CRIGHT_CONTIG;
+
+	if (nhole)
+		state |= NHOLE;
+	if (!lhole && !nhole && can_merge(left, new))
+		state |= NLEFT_CONTIG;
+	if (!rhole && !nhole && can_merge(new, right))
+		state |= NRIGHT_CONTIG;
+	if ((state & NBOTH_CONTIG) == NBOTH_CONTIG &&
+	    left->br_blockcount + new->br_blockcount +
+					right->br_blockcount > XFS_MAX_BMBT_EXTLEN)
+		state &= ~NRIGHT_CONTIG;
+
+	switch (state & (CLEFT_CONTIG | CRIGHT_CONTIG | CHOLE)) {
+	case CLEFT_CONTIG | CRIGHT_CONTIG:
+		/*
+		 * left/curr/right are the same extent, so deleting curr causes
+		 * 2 new extents to be created.
+		 */
+		ret += 2;
+		break;
+	case 0:
+		/*
+		 * curr is not contiguous with any extent, so we remove curr
+		 * completely
+		 */
+		ret--;
+		break;
+	case CHOLE:
+		/* hole, do nothing */
+		break;
+	case CLEFT_CONTIG:
+	case CRIGHT_CONTIG:
+		/* trim either left or right, no change */
+		break;
+	}
+
+	switch (state & (NLEFT_CONTIG | NRIGHT_CONTIG | NHOLE)) {
+	case NLEFT_CONTIG | NRIGHT_CONTIG:
+		/*
+		 * left/curr/right will become the same extent, so adding
+		 * curr causes the deletion of right.
+		 */
+		ret--;
+		break;
+	case 0:
+		/* new is not contiguous with any extent */
+		ret++;
+		break;
+	case NHOLE:
+		/* hole, do nothing. */
+		break;
+	case NLEFT_CONTIG:
+	case NRIGHT_CONTIG:
+		/* new is absorbed into left or right, no change */
+		break;
+	}
+
+	trace_xfs_swapext_delta_nextents_step(mp, left, curr, new, right, ret,
+			state);
+	return ret;
+}
+
+/* Make sure we don't overflow the extent counters. */
+static inline int
+ensure_delta_nextents(
+	struct xfs_swapext_req	*req,
+	struct xfs_inode	*ip,
+	int64_t			delta)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, req->whichfork);
+	xfs_extnum_t		max_extents;
+	bool			large_extcount;
+
+	if (delta < 0)
+		return 0;
+
+	if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS)) {
+		if (ifp->if_nextents + delta > 10)
+			return -EFBIG;
+	}
+
+	if (req->req_flags & XFS_SWAP_REQ_NREXT64)
+		large_extcount = true;
+	else
+		large_extcount = xfs_inode_has_large_extent_counts(ip);
+
+	max_extents = xfs_iext_max_nextents(large_extcount, req->whichfork);
+	if (ifp->if_nextents + delta <= max_extents)
+		return 0;
+	if (large_extcount)
+		return -EFBIG;
+	if (!xfs_has_large_extent_counts(mp))
+		return -EFBIG;
+
+	max_extents = xfs_iext_max_nextents(true, req->whichfork);
+	if (ifp->if_nextents + delta > max_extents)
+		return -EFBIG;
+
+	req->req_flags |= XFS_SWAP_REQ_NREXT64;
+	return 0;
+}
+
+/* Find the next extent after irec. */
+static inline int
+get_next_ext(
+	struct xfs_inode		*ip,
+	int				bmap_flags,
+	const struct xfs_bmbt_irec	*irec,
+	struct xfs_bmbt_irec		*nrec)
+{
+	xfs_fileoff_t			off;
+	xfs_filblks_t			blockcount;
+	int				nimaps = 1;
+	int				error;
+
+	off = irec->br_startoff + irec->br_blockcount;
+	blockcount = XFS_MAX_FILEOFF - off;
+	error = xfs_bmapi_read(ip, off, blockcount, nrec, &nimaps, bmap_flags);
+	if (error)
+		return error;
+	if (nrec->br_startblock == DELAYSTARTBLOCK ||
+	    nrec->br_startoff != off) {
+		/*
+		 * If we don't get the extent we want, mark the mapping as a
+		 * hole so that our estimator function treats it as one.
+		 * We shouldn't get delalloc reservations.
+		 */
+		nrec->br_startblock = HOLESTARTBLOCK;
+	}
+
+	return 0;
+}
+
+int __init
+xfs_swapext_intent_init_cache(void)
+{
+	xfs_swapext_intent_cache = kmem_cache_create("xfs_swapext_intent",
+			sizeof(struct xfs_swapext_intent),
+			0, 0, NULL);
+
+	return xfs_swapext_intent_cache != NULL ? 0 : -ENOMEM;
+}
+
+void
+xfs_swapext_intent_destroy_cache(void)
+{
+	kmem_cache_destroy(xfs_swapext_intent_cache);
+	xfs_swapext_intent_cache = NULL;
+}
+
+/*
+ * Decide if we will swap the reflink flags between the two files after the
+ * swap.  The only time we want to do this is if we're exchanging all extents
+ * under EOF and the inode reflink flags have different states.
+ */
+static inline bool
+sxi_can_exchange_reflink_flags(
+	const struct xfs_swapext_req	*req,
+	unsigned int			reflink_state)
+{
+	struct xfs_mount		*mp = req->ip1->i_mount;
+
+	if (hweight32(reflink_state) != 1)
+		return false;
+	if (req->startoff1 != 0 || req->startoff2 != 0)
+		return false;
+	if (req->blockcount != XFS_B_TO_FSB(mp, req->ip1->i_disk_size))
+		return false;
+	if (req->blockcount != XFS_B_TO_FSB(mp, req->ip2->i_disk_size))
+		return false;
+	return true;
+}
+
+
+/* Allocate and initialize a new incore intent item from a request. */
+struct xfs_swapext_intent *
+xfs_swapext_init_intent(
+	const struct xfs_swapext_req	*req,
+	unsigned int			*reflink_state)
+{
+	struct xfs_swapext_intent	*sxi;
+	unsigned int			rs = 0;
+
+	sxi = kmem_cache_zalloc(xfs_swapext_intent_cache,
+			GFP_NOFS | __GFP_NOFAIL);
+	INIT_LIST_HEAD(&sxi->sxi_list);
+	sxi->sxi_ip1 = req->ip1;
+	sxi->sxi_ip2 = req->ip2;
+	sxi->sxi_startoff1 = req->startoff1;
+	sxi->sxi_startoff2 = req->startoff2;
+	sxi->sxi_blockcount = req->blockcount;
+	sxi->sxi_isize1 = sxi->sxi_isize2 = -1;
+
+	if (req->whichfork == XFS_ATTR_FORK)
+		sxi->sxi_flags |= XFS_SWAP_EXT_ATTR_FORK;
+
+	if (req->whichfork == XFS_DATA_FORK &&
+	    (req->req_flags & XFS_SWAP_REQ_SET_SIZES)) {
+		sxi->sxi_flags |= XFS_SWAP_EXT_SET_SIZES;
+		sxi->sxi_isize1 = req->ip2->i_disk_size;
+		sxi->sxi_isize2 = req->ip1->i_disk_size;
+	}
+
+	if (req->req_flags & XFS_SWAP_REQ_INO1_WRITTEN)
+		sxi->sxi_flags |= XFS_SWAP_EXT_INO1_WRITTEN;
+
+	if (req->req_flags & XFS_SWAP_REQ_LOGGED)
+		sxi->sxi_op_flags |= XFS_SWAP_EXT_OP_LOGGED;
+	if (req->req_flags & XFS_SWAP_REQ_NREXT64)
+		sxi->sxi_op_flags |= XFS_SWAP_EXT_OP_NREXT64;
+
+	if (req->whichfork == XFS_DATA_FORK) {
+		/*
+		 * Record the state of each inode's reflink flag before the
+		 * operation.
+		 */
+		if (xfs_is_reflink_inode(req->ip1))
+			rs |= XFS_REFLINK_STATE_IP1;
+		if (xfs_is_reflink_inode(req->ip2))
+			rs |= XFS_REFLINK_STATE_IP2;
+
+		/*
+		 * Figure out if we're clearing the reflink flags (which
+		 * effectively swaps them) after the operation.
+		 */
+		if (sxi_can_exchange_reflink_flags(req, rs)) {
+			if (rs & XFS_REFLINK_STATE_IP1)
+				sxi->sxi_flags |=
+						XFS_SWAP_EXT_CLEAR_INO1_REFLINK;
+			if (rs & XFS_REFLINK_STATE_IP2)
+				sxi->sxi_flags |=
+						XFS_SWAP_EXT_CLEAR_INO2_REFLINK;
+		}
+	}
+
+	if (reflink_state)
+		*reflink_state = rs;
+	return sxi;
+}
+
+/*
+ * Estimate the number of exchange operations and the number of file blocks
+ * in each file that will be affected by the exchange operation.
+ */
+int
+xfs_swapext_estimate(
+	struct xfs_swapext_req		*req)
+{
+	struct xfs_swapext_intent	*sxi;
+	struct xfs_bmbt_irec		irec1, irec2;
+	struct xfs_swapext_adjacent	adj = ADJACENT_INIT;
+	xfs_filblks_t			ip1_blocks = 0, ip2_blocks = 0;
+	int64_t				d_nexts1, d_nexts2;
+	int				bmap_flags;
+	int				error;
+
+	ASSERT(!(req->req_flags & ~XFS_SWAP_REQ_FLAGS));
+
+	bmap_flags = xfs_bmapi_aflag(req->whichfork);
+	sxi = xfs_swapext_init_intent(req, NULL);
+
+	/*
+	 * To guard against the possibility of overflowing the extent counters,
+	 * we have to estimate an upper bound on the potential increase in each
+	 * counter.  We can split the extent at each end of the range, and for
+	 * each step of the swap we can split the extent that we're working on
+	 * if the extents do not align.
+	 */
+	d_nexts1 = d_nexts2 = 3;
+
+	while (sxi_has_more_swap_work(sxi)) {
+		/*
+		 * Walk through the file ranges until we find something to
+		 * swap.  Because we're simulating the swap, pass in adj to
+		 * capture skipped mappings for correct estimation of bmbt
+		 * record merges.
+		 */
+		error = xfs_swapext_find_mappings(sxi, &irec1, &irec2, &adj);
+		if (error)
+			goto out_free;
+		if (!sxi_has_more_swap_work(sxi))
+			break;
+
+		/* Update accounting. */
+		if (xfs_bmap_is_real_extent(&irec1))
+			ip1_blocks += irec1.br_blockcount;
+		if (xfs_bmap_is_real_extent(&irec2))
+			ip2_blocks += irec2.br_blockcount;
+		req->nr_exchanges++;
+
+		/* Read the next extents from both files. */
+		error = get_next_ext(req->ip1, bmap_flags, &irec1, &adj.right1);
+		if (error)
+			goto out_free;
+
+		error = get_next_ext(req->ip2, bmap_flags, &irec2, &adj.right2);
+		if (error)
+			goto out_free;
+
+		/* Update extent count deltas. */
+		d_nexts1 += delta_nextents_step(req->ip1->i_mount,
+				&adj.left1, &irec1, &irec2, &adj.right1);
+
+		d_nexts2 += delta_nextents_step(req->ip1->i_mount,
+				&adj.left2, &irec2, &irec1, &adj.right2);
+
+		/* Now pretend we swapped the extents. */
+		if (can_merge(&adj.left2, &irec1))
+			adj.left2.br_blockcount += irec1.br_blockcount;
+		else
+			memcpy(&adj.left2, &irec1, sizeof(irec1));
+
+		if (can_merge(&adj.left1, &irec2))
+			adj.left1.br_blockcount += irec2.br_blockcount;
+		else
+			memcpy(&adj.left1, &irec2, sizeof(irec2));
+
+		sxi_advance(sxi, &irec1);
+	}
+
+	/* Account for the blocks that are being exchanged. */
+	if (XFS_IS_REALTIME_INODE(req->ip1) &&
+	    req->whichfork == XFS_DATA_FORK) {
+		req->ip1_rtbcount = ip1_blocks;
+		req->ip2_rtbcount = ip2_blocks;
+	} else {
+		req->ip1_bcount = ip1_blocks;
+		req->ip2_bcount = ip2_blocks;
+	}
+
+	/*
+	 * Make sure that both forks have enough slack left in their extent
+	 * counters that the swap operation will not overflow.
+	 */
+	trace_xfs_swapext_delta_nextents(req, d_nexts1, d_nexts2);
+	if (req->ip1 == req->ip2) {
+		error = ensure_delta_nextents(req, req->ip1,
+				d_nexts1 + d_nexts2);
+	} else {
+		error = ensure_delta_nextents(req, req->ip1, d_nexts1);
+		if (error)
+			goto out_free;
+		error = ensure_delta_nextents(req, req->ip2, d_nexts2);
+	}
+	if (error)
+		goto out_free;
+
+	trace_xfs_swapext_initial_estimate(req);
+	error = xfs_swapext_estimate_overhead(req);
+out_free:
+	kmem_cache_free(xfs_swapext_intent_cache, sxi);
+	return error;
+}
+
+static inline void
+xfs_swapext_set_reflink(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip)
+{
+	trace_xfs_reflink_set_inode_flag(ip);
+
+	ip->i_diflags2 |= XFS_DIFLAG2_REFLINK;
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+}
+
+/*
+ * If either file has shared blocks and we're swapping data forks, we must flag
+ * the other file as having shared blocks so that we get the shared-block rmap
+ * functions if we need to fix up the rmaps.
+ */
+void
+xfs_swapext_ensure_reflink(
+	struct xfs_trans		*tp,
+	const struct xfs_swapext_intent	*sxi,
+	unsigned int			reflink_state)
+{
+	if ((reflink_state & XFS_REFLINK_STATE_IP1) &&
+	    !xfs_is_reflink_inode(sxi->sxi_ip2))
+		xfs_swapext_set_reflink(tp, sxi->sxi_ip2);
+
+	if ((reflink_state & XFS_REFLINK_STATE_IP2) &&
+	    !xfs_is_reflink_inode(sxi->sxi_ip1))
+		xfs_swapext_set_reflink(tp, sxi->sxi_ip1);
+}
+
+/* Widen the extent counts of both inodes if necessary. */
+static inline void
+xfs_swapext_upgrade_extent_counts(
+	struct xfs_trans		*tp,
+	const struct xfs_swapext_intent	*sxi)
+{
+	if (!(sxi->sxi_op_flags & XFS_SWAP_EXT_OP_NREXT64))
+		return;
+
+	sxi->sxi_ip1->i_diflags2 |= XFS_DIFLAG2_NREXT64;
+	xfs_trans_log_inode(tp, sxi->sxi_ip1, XFS_ILOG_CORE);
+
+	sxi->sxi_ip2->i_diflags2 |= XFS_DIFLAG2_NREXT64;
+	xfs_trans_log_inode(tp, sxi->sxi_ip2, XFS_ILOG_CORE);
+}
+
+/*
+ * Schedule the swap of a range of extents from one inode to another.  If the
+ * atomic swap feature is enabled, the operation can be resumed even if the
+ * system goes down.  The caller must commit the transaction to start the
+ * work.
+ *
+ * The caller must ensure the inodes are joined to the transaction and
+ * ILOCKed; they will still be joined to the transaction at exit.
+ */
+void
+xfs_swapext(
+	struct xfs_trans		*tp,
+	const struct xfs_swapext_req	*req)
+{
+	struct xfs_swapext_intent	*sxi;
+	unsigned int			reflink_state;
+
+	ASSERT(xfs_isilocked(req->ip1, XFS_ILOCK_EXCL));
+	ASSERT(xfs_isilocked(req->ip2, XFS_ILOCK_EXCL));
+	ASSERT(req->whichfork != XFS_COW_FORK);
+	ASSERT(!(req->req_flags & ~XFS_SWAP_REQ_FLAGS));
+	if (req->req_flags & XFS_SWAP_REQ_SET_SIZES)
+		ASSERT(req->whichfork == XFS_DATA_FORK);
+
+	if (req->blockcount == 0)
+		return;
+
+	sxi = xfs_swapext_init_intent(req, &reflink_state);
+	xfs_swapext_schedule(tp, sxi);
+	xfs_swapext_ensure_reflink(tp, sxi, reflink_state);
+	xfs_swapext_upgrade_extent_counts(tp, sxi);
+}
diff --git a/fs/xfs/libxfs/xfs_swapext.h b/fs/xfs/libxfs/xfs_swapext.h
index 6d17657cf1f6..7aa499537fd8 100644
--- a/fs/xfs/libxfs/xfs_swapext.h
+++ b/fs/xfs/libxfs/xfs_swapext.h
@@ -21,4 +21,149 @@ static inline bool xfs_swapext_supported(struct xfs_mount *mp)
 	       !xfs_has_realtime(mp);
 }
 
+/*
+ * In-core information about an extent swap request between ranges of two
+ * inodes.
+ */
+struct xfs_swapext_intent {
+	/* List of other incore deferred work. */
+	struct list_head	sxi_list;
+
+	/* Inodes participating in the operation. */
+	struct xfs_inode	*sxi_ip1;
+	struct xfs_inode	*sxi_ip2;
+
+	/* File offset range information. */
+	xfs_fileoff_t		sxi_startoff1;
+	xfs_fileoff_t		sxi_startoff2;
+	xfs_filblks_t		sxi_blockcount;
+
+	/* Set these file sizes after the operation, unless negative. */
+	xfs_fsize_t		sxi_isize1;
+	xfs_fsize_t		sxi_isize2;
+
+	/* XFS_SWAP_EXT_* log operation flags */
+	unsigned int		sxi_flags;
+
+	/* XFS_SWAP_EXT_OP_* flags */
+	unsigned int		sxi_op_flags;
+};
+
+/* Use log intent items to track and restart the entire operation. */
+#define XFS_SWAP_EXT_OP_LOGGED	(1U << 0)
+
+/* Upgrade files to have large extent counts before proceeding. */
+#define XFS_SWAP_EXT_OP_NREXT64	(1U << 1)
+
+#define XFS_SWAP_EXT_OP_STRINGS \
+	{ XFS_SWAP_EXT_OP_LOGGED,		"LOGGED" }, \
+	{ XFS_SWAP_EXT_OP_NREXT64,		"NREXT64" }
+
+static inline int
+xfs_swapext_whichfork(const struct xfs_swapext_intent *sxi)
+{
+	if (sxi->sxi_flags & XFS_SWAP_EXT_ATTR_FORK)
+		return XFS_ATTR_FORK;
+	return XFS_DATA_FORK;
+}
+
+/* Parameters for a swapext request. */
+struct xfs_swapext_req {
+	/* Inodes participating in the operation. */
+	struct xfs_inode	*ip1;
+	struct xfs_inode	*ip2;
+
+	/* File offset range information. */
+	xfs_fileoff_t		startoff1;
+	xfs_fileoff_t		startoff2;
+	xfs_filblks_t		blockcount;
+
+	/* Data or attr fork? */
+	int			whichfork;
+
+	/* XFS_SWAP_REQ_* operation flags */
+	unsigned int		req_flags;
+
+	/*
+	 * Fields below this line are filled out by xfs_swapext_estimate;
+	 * callers should initialize this part of the struct to zero.
+	 */
+
+	/*
+	 * Data device blocks to be moved out of ip1, and free space needed to
+	 * handle the bmbt changes.
+	 */
+	xfs_filblks_t		ip1_bcount;
+
+	/*
+	 * Data device blocks to be moved out of ip2, and free space needed to
+	 * handle the bmbt changes.
+	 */
+	xfs_filblks_t		ip2_bcount;
+
+	/* rt blocks to be moved out of ip1. */
+	xfs_filblks_t		ip1_rtbcount;
+
+	/* rt blocks to be moved out of ip2. */
+	xfs_filblks_t		ip2_rtbcount;
+
+	/* Free space needed to handle the bmbt changes */
+	unsigned long long	resblks;
+
+	/* Number of extent swaps needed to complete the operation */
+	unsigned long long	nr_exchanges;
+};
+
+/* Caller has permission to use log intent items for the swapext operation. */
+#define XFS_SWAP_REQ_LOGGED		(1U << 0)
+
+/* Set the file sizes when finished. */
+#define XFS_SWAP_REQ_SET_SIZES		(1U << 1)
+
+/*
+ * Swap only the parts of the two files where the file allocation units
+ * mapped to file1's range have been written to.
+ */
+#define XFS_SWAP_REQ_INO1_WRITTEN	(1U << 2)
+
+/* Files need to be upgraded to have large extent counts. */
+#define XFS_SWAP_REQ_NREXT64		(1U << 3)
+
+#define XFS_SWAP_REQ_FLAGS		(XFS_SWAP_REQ_LOGGED | \
+					 XFS_SWAP_REQ_SET_SIZES | \
+					 XFS_SWAP_REQ_INO1_WRITTEN | \
+					 XFS_SWAP_REQ_NREXT64)
+
+#define XFS_SWAP_REQ_STRINGS \
+	{ XFS_SWAP_REQ_LOGGED,		"LOGGED" }, \
+	{ XFS_SWAP_REQ_SET_SIZES,	"SETSIZES" }, \
+	{ XFS_SWAP_REQ_INO1_WRITTEN,	"INO1_WRITTEN" }, \
+	{ XFS_SWAP_REQ_NREXT64,		"NREXT64" }
+
+unsigned int xfs_swapext_reflink_prep(const struct xfs_swapext_req *req);
+void xfs_swapext_reflink_finish(struct xfs_trans *tp,
+		const struct xfs_swapext_req *req, unsigned int reflink_state);
+
+int xfs_swapext_estimate(struct xfs_swapext_req *req);
+
+extern struct kmem_cache	*xfs_swapext_intent_cache;
+
+int __init xfs_swapext_intent_init_cache(void);
+void xfs_swapext_intent_destroy_cache(void);
+
+struct xfs_swapext_intent *xfs_swapext_init_intent(
+		const struct xfs_swapext_req *req, unsigned int *reflink_state);
+void xfs_swapext_ensure_reflink(struct xfs_trans *tp,
+		const struct xfs_swapext_intent *sxi, unsigned int reflink_state);
+
+void xfs_swapext_schedule(struct xfs_trans *tp,
+		struct xfs_swapext_intent *sxi);
+int xfs_swapext_finish_one(struct xfs_trans *tp,
+		struct xfs_swapext_intent *sxi);
+
+int xfs_swapext_check_extents(struct xfs_mount *mp,
+		const struct xfs_swapext_req *req);
+
+void xfs_swapext(struct xfs_trans *tp, const struct xfs_swapext_req *req);
+
 #endif /* __XFS_SWAPEXT_H_ */
diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h
index 87b31c69a773..9640fc232c14 100644
--- a/fs/xfs/libxfs/xfs_trans_space.h
+++ b/fs/xfs/libxfs/xfs_trans_space.h
@@ -10,6 +10,10 @@
  * Components of space reservations.
  */
 
+/* Worst case number of bmaps that can be held in a block. */
+#define XFS_MAX_CONTIG_BMAPS_PER_BLOCK(mp)    \
+		(((mp)->m_bmap_dmxr[0]) - ((mp)->m_bmap_dmnr[0]))
+
 /* Worst case number of rmaps that can be held in a block. */
 #define XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)    \
 		(((mp)->m_rmap_mxr[0]) - ((mp)->m_rmap_mnr[0]))
diff --git a/fs/xfs/xfs_swapext_item.c b/fs/xfs/xfs_swapext_item.c
index 87d1be73bbf9..e6faca45fc12 100644
--- a/fs/xfs/xfs_swapext_item.c
+++ b/fs/xfs/xfs_swapext_item.c
@@ -16,13 +16,17 @@
 #include "xfs_trans.h"
 #include "xfs_trans_priv.h"
 #include "xfs_swapext_item.h"
+#include "xfs_swapext.h"
 #include "xfs_log.h"
 #include "xfs_bmap.h"
 #include "xfs_icache.h"
+#include "xfs_bmap_btree.h"
 #include "xfs_trans_space.h"
 #include "xfs_error.h"
 #include "xfs_log_priv.h"
 #include "xfs_log_recover.h"
+#include "xfs_xchgrange.h"
+#include "xfs_trace.h"
 
 struct kmem_cache	*xfs_sxi_cache;
 struct kmem_cache	*xfs_sxd_cache;
@@ -144,13 +148,395 @@ static inline struct xfs_sxd_log_item *SXD_ITEM(struct xfs_log_item *lip)
 	return container_of(lip, struct xfs_sxd_log_item, sxd_item);
 }
 
+STATIC void
+xfs_sxd_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	*nvecs += 1;
+	*nbytes += sizeof(struct xfs_sxd_log_format);
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the given sxd log
+ * item. We use only 1 iovec, and we point that at the sxd_log_format structure
+ * embedded in the sxd item.
+ */
+STATIC void
+xfs_sxd_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_sxd_log_item	*sxd_lip = SXD_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	sxd_lip->sxd_format.sxd_type = XFS_LI_SXD;
+	sxd_lip->sxd_format.sxd_size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_SXD_FORMAT, &sxd_lip->sxd_format,
+			sizeof(struct xfs_sxd_log_format));
+}
+
+/*
+ * The SXD is either committed or aborted if the transaction is cancelled. If
+ * the transaction is cancelled, drop our reference to the SXI and free the
+ * SXD.
+ */
+STATIC void
+xfs_sxd_item_release(
+	struct xfs_log_item	*lip)
+{
+	struct xfs_sxd_log_item	*sxd_lip = SXD_ITEM(lip);
+
+	kmem_free(sxd_lip->sxd_item.li_lv_shadow);
+	xfs_sxi_release(sxd_lip->sxd_intent_log_item);
+	kmem_cache_free(xfs_sxd_cache, sxd_lip);
+}
+
+static struct xfs_log_item *
+xfs_sxd_item_intent(
+	struct xfs_log_item	*lip)
+{
+	return &SXD_ITEM(lip)->sxd_intent_log_item->sxi_item;
+}
+
+static const struct xfs_item_ops xfs_sxd_item_ops = {
+	.flags		= XFS_ITEM_RELEASE_WHEN_COMMITTED |
+			  XFS_ITEM_INTENT_DONE,
+	.iop_size	= xfs_sxd_item_size,
+	.iop_format	= xfs_sxd_item_format,
+	.iop_release	= xfs_sxd_item_release,
+	.iop_intent	= xfs_sxd_item_intent,
+};
+
+static struct xfs_sxd_log_item *
+xfs_trans_get_sxd(
+	struct xfs_trans		*tp,
+	struct xfs_sxi_log_item		*sxi_lip)
+{
+	struct xfs_sxd_log_item		*sxd_lip;
+
+	sxd_lip = kmem_cache_zalloc(xfs_sxd_cache, GFP_KERNEL | __GFP_NOFAIL);
+	xfs_log_item_init(tp->t_mountp, &sxd_lip->sxd_item, XFS_LI_SXD,
+			  &xfs_sxd_item_ops);
+	sxd_lip->sxd_intent_log_item = sxi_lip;
+	sxd_lip->sxd_format.sxd_sxi_id = sxi_lip->sxi_format.sxi_id;
+
+	xfs_trans_add_item(tp, &sxd_lip->sxd_item);
+	return sxd_lip;
+}
+
+/*
+ * Finish a swapext update and log it to the SXD. Note that the transaction is
+ * marked dirty regardless of whether the swapext update succeeds or fails to
+ * support the SXI/SXD lifecycle rules.
+ */
+static int
+xfs_swapext_finish_update(
+	struct xfs_trans		*tp,
+	struct xfs_log_item		*done,
+	struct xfs_swapext_intent	*sxi)
+{
+	int				error;
+
+	error = xfs_swapext_finish_one(tp, sxi);
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the SXI and frees the SXD
+	 * 2.) shuts down the filesystem
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	if (done)
+		set_bit(XFS_LI_DIRTY, &done->li_flags);
+
+	return error;
+}
+
+/* Log swapext updates in the intent item. */
+STATIC struct xfs_log_item *
+xfs_swapext_create_intent(
+	struct xfs_trans		*tp,
+	struct list_head		*items,
+	unsigned int			count,
+	bool				sort)
+{
+	struct xfs_sxi_log_item		*sxi_lip;
+	struct xfs_swapext_intent	*sxi;
+	struct xfs_swap_extent		*sx;
+
+	ASSERT(count == 1);
+
+	sxi = list_first_entry_or_null(items, struct xfs_swapext_intent,
+			sxi_list);
+
+	/*
+	 * We use the same defer ops control machinery to perform extent swaps
+	 * even if we aren't using the machinery to track the operation status
+	 * through log items.
+	 */
+	if (!(sxi->sxi_op_flags & XFS_SWAP_EXT_OP_LOGGED))
+		return NULL;
+
+	sxi_lip = xfs_sxi_init(tp->t_mountp);
+	xfs_trans_add_item(tp, &sxi_lip->sxi_item);
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	set_bit(XFS_LI_DIRTY, &sxi_lip->sxi_item.li_flags);
+
+	sx = &sxi_lip->sxi_format.sxi_extent;
+	sx->sx_inode1 = sxi->sxi_ip1->i_ino;
+	sx->sx_inode2 = sxi->sxi_ip2->i_ino;
+	sx->sx_startoff1 = sxi->sxi_startoff1;
+	sx->sx_startoff2 = sxi->sxi_startoff2;
+	sx->sx_blockcount = sxi->sxi_blockcount;
+	sx->sx_isize1 = sxi->sxi_isize1;
+	sx->sx_isize2 = sxi->sxi_isize2;
+	sx->sx_flags = sxi->sxi_flags;
+
+	return &sxi_lip->sxi_item;
+}
+
+STATIC struct xfs_log_item *
+xfs_swapext_create_done(
+	struct xfs_trans		*tp,
+	struct xfs_log_item		*intent,
+	unsigned int			count)
+{
+	if (intent == NULL)
+		return NULL;
+	return &xfs_trans_get_sxd(tp, SXI_ITEM(intent))->sxd_item;
+}
+
+/* Process a deferred swapext update. */
+STATIC int
+xfs_swapext_finish_item(
+	struct xfs_trans		*tp,
+	struct xfs_log_item		*done,
+	struct list_head		*item,
+	struct xfs_btree_cur		**state)
+{
+	struct xfs_swapext_intent	*sxi;
+	int				error;
+
+	sxi = container_of(item, struct xfs_swapext_intent, sxi_list);
+
+	/*
+	 * Swap one more extent between the two files.  If there's still more
+	 * work to do, we want to requeue ourselves after all other pending
+	 * deferred operations have finished.  This includes all of the dfops
+	 * that we queued directly as well as any new ones created in the
+	 * process of finishing the others.  Doing so prevents us from queuing
+	 * a large number of SXI log items in kernel memory, which in turn
+	 * prevents us from pinning the tail of the log (while logging those
+	 * new SXI items) until the first SXI items can be processed.
+	 */
+	error = xfs_swapext_finish_update(tp, done, sxi);
+	if (error == -EAGAIN)
+		return error;
+
+	kmem_cache_free(xfs_swapext_intent_cache, sxi);
+	return error;
+}
+
+/* Abort all pending SXIs. */
+STATIC void
+xfs_swapext_abort_intent(
+	struct xfs_log_item		*intent)
+{
+	xfs_sxi_release(SXI_ITEM(intent));
+}
+
+/* Cancel a deferred swapext update. */
+STATIC void
+xfs_swapext_cancel_item(
+	struct list_head		*item)
+{
+	struct xfs_swapext_intent	*sxi;
+
+	sxi = container_of(item, struct xfs_swapext_intent, sxi_list);
+	kmem_cache_free(xfs_swapext_intent_cache, sxi);
+}
+
+const struct xfs_defer_op_type xfs_swapext_defer_type = {
+	.max_items	= 1,
+	.create_intent	= xfs_swapext_create_intent,
+	.abort_intent	= xfs_swapext_abort_intent,
+	.create_done	= xfs_swapext_create_done,
+	.finish_item	= xfs_swapext_finish_item,
+	.cancel_item	= xfs_swapext_cancel_item,
+};
+
+/* Is this recovered SXI ok? */
+static inline bool
+xfs_sxi_validate(
+	struct xfs_mount		*mp,
+	struct xfs_sxi_log_item		*sxi_lip)
+{
+	struct xfs_swap_extent		*sx = &sxi_lip->sxi_format.sxi_extent;
+
+	if (!xfs_sb_version_haslogswapext(&mp->m_sb))
+		return false;
+
+	if (sxi_lip->sxi_format.__pad != 0)
+		return false;
+
+	if (sx->sx_flags & ~XFS_SWAP_EXT_FLAGS)
+		return false;
+
+	if (!xfs_verify_ino(mp, sx->sx_inode1) ||
+	    !xfs_verify_ino(mp, sx->sx_inode2))
+		return false;
+
+	if ((sx->sx_flags & XFS_SWAP_EXT_SET_SIZES) &&
+	     (sx->sx_isize1 < 0 || sx->sx_isize2 < 0))
+		return false;
+
+	if (!xfs_verify_fileext(mp, sx->sx_startoff1, sx->sx_blockcount))
+		return false;
+
+	return xfs_verify_fileext(mp, sx->sx_startoff2, sx->sx_blockcount);
+}
+
+/*
+ * Use the recovered log state to create a new request, estimate resource
+ * requirements, and create a new incore intent state.
+ */
+STATIC struct xfs_swapext_intent *
+xfs_sxi_item_recover_intent(
+	struct xfs_mount		*mp,
+	const struct xfs_swap_extent	*sx,
+	struct xfs_swapext_req		*req,
+	unsigned int			*reflink_state)
+{
+	struct xfs_inode		*ip1, *ip2;
+	int				error;
+
+	/*
+	 * Grab both inodes and set IRECOVERY to prevent trimming of post-eof
+	 * extents and freeing of unlinked inodes until we're totally done
+	 * processing files.
+	 */
+	error = xlog_recover_iget(mp, sx->sx_inode1, &ip1);
+	if (error)
+		return ERR_PTR(error);
+	error = xlog_recover_iget(mp, sx->sx_inode2, &ip2);
+	if (error)
+		goto err_rele1;
+
+	req->ip1 = ip1;
+	req->ip2 = ip2;
+	req->startoff1 = sx->sx_startoff1;
+	req->startoff2 = sx->sx_startoff2;
+	req->blockcount = sx->sx_blockcount;
+
+	if (sx->sx_flags & XFS_SWAP_EXT_ATTR_FORK)
+		req->whichfork = XFS_ATTR_FORK;
+	else
+		req->whichfork = XFS_DATA_FORK;
+
+	if (sx->sx_flags & XFS_SWAP_EXT_SET_SIZES)
+		req->req_flags |= XFS_SWAP_REQ_SET_SIZES;
+	if (sx->sx_flags & XFS_SWAP_EXT_INO1_WRITTEN)
+		req->req_flags |= XFS_SWAP_REQ_INO1_WRITTEN;
+	req->req_flags |= XFS_SWAP_REQ_LOGGED;
+
+	xfs_xchg_range_ilock(NULL, ip1, ip2);
+	error = xfs_swapext_estimate(req);
+	xfs_xchg_range_iunlock(ip1, ip2);
+	if (error)
+		goto err_rele2;
+
+	return xfs_swapext_init_intent(req, reflink_state);
+
+err_rele2:
+	xfs_irele(ip2);
+err_rele1:
+	xfs_irele(ip1);
+	return ERR_PTR(error);
+}
+
 /* Process a swapext update intent item that was recovered from the log. */
 STATIC int
 xfs_sxi_item_recover(
-	struct xfs_log_item	*lip,
-	struct list_head	*capture_list)
+	struct xfs_log_item		*lip,
+	struct list_head		*capture_list)
 {
-	return -EFSCORRUPTED;
+	struct xfs_swapext_req		req = { .req_flags = 0 };
+	struct xfs_swapext_intent	*sxi;
+	struct xfs_sxi_log_item		*sxi_lip = SXI_ITEM(lip);
+	struct xfs_mount		*mp = lip->li_log->l_mp;
+	struct xfs_swap_extent		*sx = &sxi_lip->sxi_format.sxi_extent;
+	struct xfs_sxd_log_item		*sxd_lip = NULL;
+	struct xfs_trans		*tp;
+	struct xfs_inode		*ip1, *ip2;
+	unsigned int			reflink_state;
+	int				error = 0;
+
+	if (!xfs_sxi_validate(mp, sxi_lip)) {
+		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
+				&sxi_lip->sxi_format,
+				sizeof(sxi_lip->sxi_format));
+		return -EFSCORRUPTED;
+	}
+
+	sxi = xfs_sxi_item_recover_intent(mp, sx, &req, &reflink_state);
+	if (IS_ERR(sxi))
+		return PTR_ERR(sxi);
+
+	trace_xfs_swapext_recover(mp, sxi);
+
+	ip1 = sxi->sxi_ip1;
+	ip2 = sxi->sxi_ip2;
+
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, req.resblks, 0, 0,
+			&tp);
+	if (error)
+		goto err_rele;
+
+	sxd_lip = xfs_trans_get_sxd(tp, sxi_lip);
+
+	xfs_xchg_range_ilock(tp, ip1, ip2);
+
+	xfs_swapext_ensure_reflink(tp, sxi, reflink_state);
+	error = xfs_swapext_finish_update(tp, &sxd_lip->sxd_item, sxi);
+	if (error == -EAGAIN) {
+		/*
+		 * If there's more extent swapping to be done, we have to
+		 * schedule that as a separate deferred operation to be run
+		 * after we've finished replaying all of the intents we
+		 * recovered from the log.  Transfer ownership of the sxi to
+		 * the transaction.
+		 */
+		xfs_swapext_schedule(tp, sxi);
+		error = 0;
+		sxi = NULL;
+	}
+	if (error == -EFSCORRUPTED)
+		XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, sx,
+				sizeof(*sx));
+	if (error)
+		goto err_cancel;
+
+	/*
+	 * Commit transaction, which frees the transaction and saves the inodes
+	 * for later replay activities.
+	 */
+	error = xfs_defer_ops_capture_and_commit(tp, capture_list);
+	goto err_unlock;
+
+err_cancel:
+	xfs_trans_cancel(tp);
+err_unlock:
+	xfs_xchg_range_iunlock(ip1, ip2);
+err_rele:
+	if (sxi)
+		kmem_cache_free(xfs_swapext_intent_cache, sxi);
+	xfs_irele(ip2);
+	xfs_irele(ip1);
+	return error;
 }
 
 STATIC bool
@@ -167,8 +553,21 @@ xfs_sxi_item_relog(
 	struct xfs_log_item	*intent,
 	struct xfs_trans	*tp)
 {
-	ASSERT(0);
-	return NULL;
+	struct xfs_sxd_log_item		*sxd_lip;
+	struct xfs_sxi_log_item		*sxi_lip;
+	struct xfs_swap_extent		*sx;
+
+	sx = &SXI_ITEM(intent)->sxi_format.sxi_extent;
+
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	sxd_lip = xfs_trans_get_sxd(tp, SXI_ITEM(intent));
+	set_bit(XFS_LI_DIRTY, &sxd_lip->sxd_item.li_flags);
+
+	sxi_lip = xfs_sxi_init(tp->t_mountp);
+	memcpy(&sxi_lip->sxi_format.sxi_extent, sx, sizeof(*sx));
+	xfs_trans_add_item(tp, &sxi_lip->sxi_item);
+	set_bit(XFS_LI_DIRTY, &sxi_lip->sxi_item.li_flags);
+	return &sxi_lip->sxi_item;
 }
 
 static const struct xfs_item_ops xfs_sxi_item_ops = {
@@ -202,17 +601,17 @@ xlog_recover_sxi_commit_pass2(
 
 	sxi_formatp = item->ri_buf[0].i_addr;
 
-	if (sxi_formatp->__pad != 0) {
-		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
-		return -EFSCORRUPTED;
-	}
-
 	len = sizeof(struct xfs_sxi_log_format);
 	if (item->ri_buf[0].i_len != len) {
 		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
 		return -EFSCORRUPTED;
 	}
 
+	if (sxi_formatp->__pad != 0) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, log->l_mp);
+		return -EFSCORRUPTED;
+	}
+
 	sxi_lip = xfs_sxi_init(mp);
 	memcpy(&sxi_lip->sxi_format, sxi_formatp, len);
 
diff --git a/fs/xfs/xfs_trace.c b/fs/xfs/xfs_trace.c
index c9a5d8087b63..b43b973f0e10 100644
--- a/fs/xfs/xfs_trace.c
+++ b/fs/xfs/xfs_trace.c
@@ -40,6 +40,7 @@
 #include "scrub/xfbtree.h"
 #include "xfs_btree_mem.h"
 #include "xfs_bmap.h"
+#include "xfs_swapext.h"
 
 /*
  * We include this last to have the helpers above available for the trace
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index faecf54080a8..8e9cb02ca5be 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -79,6 +79,8 @@ struct xfs_dqtrx;
 struct xfs_icwalk;
 struct xfs_perag;
 struct xfs_bmap_intent;
+struct xfs_swapext_intent;
+struct xfs_swapext_req;
 
 #define XFS_ATTR_FILTER_FLAGS \
 	{ XFS_ATTR_ROOT,	"ROOT" }, \
@@ -2181,7 +2183,7 @@ TRACE_EVENT(xfs_dir2_leafn_moveents,
 		  __entry->count)
 );
 
-#define XFS_SWAPEXT_INODES \
+#define XFS_SWAP_EXT_INODES \
 	{ 0,	"target" }, \
 	{ 1,	"temp" }
 
@@ -2216,7 +2218,7 @@ DECLARE_EVENT_CLASS(xfs_swap_extent_class,
 		  "broot size %d, forkoff 0x%x",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  __entry->ino,
-		  __print_symbolic(__entry->which, XFS_SWAPEXT_INODES),
+		  __print_symbolic(__entry->which, XFS_SWAP_EXT_INODES),
 		  __print_symbolic(__entry->format, XFS_INODE_FORMAT_STR),
 		  __entry->nex,
 		  __entry->broot_size,
@@ -3769,6 +3771,10 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_cancel_cow);
 DEFINE_INODE_IREC_EVENT(xfs_swap_extent_rmap_remap);
 DEFINE_INODE_IREC_EVENT(xfs_swap_extent_rmap_remap_piece);
 DEFINE_INODE_ERROR_EVENT(xfs_swap_extent_rmap_error);
+DEFINE_INODE_IREC_EVENT(xfs_swapext_extent1_skip);
+DEFINE_INODE_IREC_EVENT(xfs_swapext_extent1);
+DEFINE_INODE_IREC_EVENT(xfs_swapext_extent2);
+DEFINE_ITRUNC_EVENT(xfs_swapext_update_inode_size);
 
 /* fsmap traces */
 DECLARE_EVENT_CLASS(xfs_fsmap_class,
@@ -4614,6 +4620,212 @@ DEFINE_PERAG_INTENTS_EVENT(xfs_perag_wait_intents);
 
 #endif /* CONFIG_XFS_DRAIN_INTENTS */
 
+TRACE_EVENT(xfs_swapext_overhead,
+	TP_PROTO(struct xfs_mount *mp, unsigned long long bmbt_blocks,
+		 unsigned long long rmapbt_blocks),
+	TP_ARGS(mp, bmbt_blocks, rmapbt_blocks),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned long long, bmbt_blocks)
+		__field(unsigned long long, rmapbt_blocks)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->bmbt_blocks = bmbt_blocks;
+		__entry->rmapbt_blocks = rmapbt_blocks;
+	),
+	TP_printk("dev %d:%d bmbt_blocks 0x%llx rmapbt_blocks 0x%llx",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->bmbt_blocks,
+		  __entry->rmapbt_blocks)
+);
+
+DECLARE_EVENT_CLASS(xfs_swapext_estimate_class,
+	TP_PROTO(const struct xfs_swapext_req *req),
+	TP_ARGS(req),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ino1)
+		__field(xfs_ino_t, ino2)
+		__field(xfs_fileoff_t, startoff1)
+		__field(xfs_fileoff_t, startoff2)
+		__field(xfs_filblks_t, blockcount)
+		__field(int, whichfork)
+		__field(unsigned int, req_flags)
+		__field(xfs_filblks_t, ip1_bcount)
+		__field(xfs_filblks_t, ip2_bcount)
+		__field(xfs_filblks_t, ip1_rtbcount)
+		__field(xfs_filblks_t, ip2_rtbcount)
+		__field(unsigned long long, resblks)
+		__field(unsigned long long, nr_exchanges)
+	),
+	TP_fast_assign(
+		__entry->dev = req->ip1->i_mount->m_super->s_dev;
+		__entry->ino1 = req->ip1->i_ino;
+		__entry->ino2 = req->ip2->i_ino;
+		__entry->startoff1 = req->startoff1;
+		__entry->startoff2 = req->startoff2;
+		__entry->blockcount = req->blockcount;
+		__entry->whichfork = req->whichfork;
+		__entry->req_flags = req->req_flags;
+		__entry->ip1_bcount = req->ip1_bcount;
+		__entry->ip2_bcount = req->ip2_bcount;
+		__entry->ip1_rtbcount = req->ip1_rtbcount;
+		__entry->ip2_rtbcount = req->ip2_rtbcount;
+		__entry->resblks = req->resblks;
+		__entry->nr_exchanges = req->nr_exchanges;
+	),
+	TP_printk("dev %d:%d ino1 0x%llx fileoff1 0x%llx ino2 0x%llx fileoff2 0x%llx fsbcount 0x%llx flags (%s) fork %s bcount1 0x%llx rtbcount1 0x%llx bcount2 0x%llx rtbcount2 0x%llx resblks 0x%llx nr_exchanges %llu",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino1, __entry->startoff1,
+		  __entry->ino2, __entry->startoff2,
+		  __entry->blockcount,
+		  __print_flags(__entry->req_flags, "|", XFS_SWAP_REQ_STRINGS),
+		  __print_symbolic(__entry->whichfork, XFS_WHICHFORK_STRINGS),
+		  __entry->ip1_bcount,
+		  __entry->ip1_rtbcount,
+		  __entry->ip2_bcount,
+		  __entry->ip2_rtbcount,
+		  __entry->resblks,
+		  __entry->nr_exchanges)
+);
+
+#define DEFINE_SWAPEXT_ESTIMATE_EVENT(name)	\
+DEFINE_EVENT(xfs_swapext_estimate_class, name,	\
+	TP_PROTO(const struct xfs_swapext_req *req), \
+	TP_ARGS(req))
+DEFINE_SWAPEXT_ESTIMATE_EVENT(xfs_swapext_initial_estimate);
+DEFINE_SWAPEXT_ESTIMATE_EVENT(xfs_swapext_final_estimate);
+
+DECLARE_EVENT_CLASS(xfs_swapext_intent_class,
+	TP_PROTO(struct xfs_mount *mp, const struct xfs_swapext_intent *sxi),
+	TP_ARGS(mp, sxi),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ino1)
+		__field(xfs_ino_t, ino2)
+		__field(unsigned int, flags)
+		__field(unsigned int, opflags)
+		__field(xfs_fileoff_t, startoff1)
+		__field(xfs_fileoff_t, startoff2)
+		__field(xfs_filblks_t, blockcount)
+		__field(xfs_fsize_t, isize1)
+		__field(xfs_fsize_t, isize2)
+		__field(xfs_fsize_t, new_isize1)
+		__field(xfs_fsize_t, new_isize2)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->ino1 = sxi->sxi_ip1->i_ino;
+		__entry->ino2 = sxi->sxi_ip2->i_ino;
+		__entry->flags = sxi->sxi_flags;
+		__entry->opflags = sxi->sxi_op_flags;
+		__entry->startoff1 = sxi->sxi_startoff1;
+		__entry->startoff2 = sxi->sxi_startoff2;
+		__entry->blockcount = sxi->sxi_blockcount;
+		__entry->isize1 = sxi->sxi_ip1->i_disk_size;
+		__entry->isize2 = sxi->sxi_ip2->i_disk_size;
+		__entry->new_isize1 = sxi->sxi_isize1;
+		__entry->new_isize2 = sxi->sxi_isize2;
+	),
+	TP_printk("dev %d:%d ino1 0x%llx fileoff1 0x%llx ino2 0x%llx fileoff2 0x%llx fsbcount 0x%llx flags (%s) opflags (%s) isize1 0x%llx newisize1 0x%llx isize2 0x%llx newisize2 0x%llx",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino1, __entry->startoff1,
+		  __entry->ino2, __entry->startoff2,
+		  __entry->blockcount,
+		  __print_flags(__entry->flags, "|", XFS_SWAP_EXT_STRINGS),
+		  __print_flags(__entry->opflags, "|", XFS_SWAP_EXT_OP_STRINGS),
+		  __entry->isize1, __entry->new_isize1,
+		  __entry->isize2, __entry->new_isize2)
+);
+
+#define DEFINE_SWAPEXT_INTENT_EVENT(name)	\
+DEFINE_EVENT(xfs_swapext_intent_class, name,	\
+	TP_PROTO(struct xfs_mount *mp, const struct xfs_swapext_intent *sxi), \
+	TP_ARGS(mp, sxi))
+DEFINE_SWAPEXT_INTENT_EVENT(xfs_swapext_defer);
+DEFINE_SWAPEXT_INTENT_EVENT(xfs_swapext_recover);
+
+TRACE_EVENT(xfs_swapext_delta_nextents_step,
+	TP_PROTO(struct xfs_mount *mp,
+		 const struct xfs_bmbt_irec *left,
+		 const struct xfs_bmbt_irec *curr,
+		 const struct xfs_bmbt_irec *new,
+		 const struct xfs_bmbt_irec *right,
+		 int delta, unsigned int state),
+	TP_ARGS(mp, left, curr, new, right, delta, state),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_fileoff_t, loff)
+		__field(xfs_fsblock_t, lstart)
+		__field(xfs_filblks_t, lcount)
+		__field(xfs_fileoff_t, coff)
+		__field(xfs_fsblock_t, cstart)
+		__field(xfs_filblks_t, ccount)
+		__field(xfs_fileoff_t, noff)
+		__field(xfs_fsblock_t, nstart)
+		__field(xfs_filblks_t, ncount)
+		__field(xfs_fileoff_t, roff)
+		__field(xfs_fsblock_t, rstart)
+		__field(xfs_filblks_t, rcount)
+		__field(int, delta)
+		__field(unsigned int, state)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->loff = left->br_startoff;
+		__entry->lstart = left->br_startblock;
+		__entry->lcount = left->br_blockcount;
+		__entry->coff = curr->br_startoff;
+		__entry->cstart = curr->br_startblock;
+		__entry->ccount = curr->br_blockcount;
+		__entry->noff = new->br_startoff;
+		__entry->nstart = new->br_startblock;
+		__entry->ncount = new->br_blockcount;
+		__entry->roff = right->br_startoff;
+		__entry->rstart = right->br_startblock;
+		__entry->rcount = right->br_blockcount;
+		__entry->delta = delta;
+		__entry->state = state;
+	),
+	TP_printk("dev %d:%d left 0x%llx:0x%llx:0x%llx; curr 0x%llx:0x%llx:0x%llx <- new 0x%llx:0x%llx:0x%llx; right 0x%llx:0x%llx:0x%llx delta %d state 0x%x",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		__entry->loff, __entry->lstart, __entry->lcount,
+		__entry->coff, __entry->cstart, __entry->ccount,
+		__entry->noff, __entry->nstart, __entry->ncount,
+		__entry->roff, __entry->rstart, __entry->rcount,
+		__entry->delta, __entry->state)
+);
+
+TRACE_EVENT(xfs_swapext_delta_nextents,
+	TP_PROTO(const struct xfs_swapext_req *req, int64_t d_nexts1,
+		 int64_t d_nexts2),
+	TP_ARGS(req, d_nexts1, d_nexts2),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ino1)
+		__field(xfs_ino_t, ino2)
+		__field(xfs_extnum_t, nexts1)
+		__field(xfs_extnum_t, nexts2)
+		__field(int64_t, d_nexts1)
+		__field(int64_t, d_nexts2)
+	),
+	TP_fast_assign(
+		__entry->dev = req->ip1->i_mount->m_super->s_dev;
+		__entry->ino1 = req->ip1->i_ino;
+		__entry->ino2 = req->ip2->i_ino;
+		__entry->nexts1 = xfs_ifork_ptr(req->ip1, req->whichfork)->if_nextents;
+		__entry->nexts2 = xfs_ifork_ptr(req->ip2, req->whichfork)->if_nextents;
+		__entry->d_nexts1 = d_nexts1;
+		__entry->d_nexts2 = d_nexts2;
+	),
+	TP_printk("dev %d:%d ino1 0x%llx nexts %llu ino2 0x%llx nexts %llu delta1 %lld delta2 %lld",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino1, __entry->nexts1,
+		  __entry->ino2, __entry->nexts2,
+		  __entry->d_nexts1, __entry->d_nexts2)
+);
+
 #endif /* _TRACE_XFS_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
index b91df426d426..965f8bfc3f59 100644
--- a/fs/xfs/xfs_xchgrange.c
+++ b/fs/xfs/xfs_xchgrange.c
@@ -12,6 +12,7 @@
 #include "xfs_defer.h"
 #include "xfs_inode.h"
 #include "xfs_trans.h"
+#include "xfs_swapext.h"
 #include "xfs_xchgrange.h"
 #include <linux/fsnotify.h>
 
@@ -338,6 +339,55 @@ xfs_exch_range(
 	file_start_write(file2);
 	error = __xfs_exch_range(file1, file2, fxr);
 	file_end_write(file2);
+	return error;
+}
+
+/* XFS-specific parts of XFS_IOC_EXCHANGE_RANGE */
+
+/* Lock (and optionally join) two inodes for a file range exchange. */
+void
+xfs_xchg_range_ilock(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip1,
+	struct xfs_inode	*ip2)
+{
+	if (ip1 != ip2)
+		xfs_lock_two_inodes(ip1, XFS_ILOCK_EXCL,
+				    ip2, XFS_ILOCK_EXCL);
+	else
+		xfs_ilock(ip1, XFS_ILOCK_EXCL);
+	if (tp) {
+		xfs_trans_ijoin(tp, ip1, 0);
+		if (ip2 != ip1)
+			xfs_trans_ijoin(tp, ip2, 0);
+	}
+
+}
+
+/* Unlock two inodes after a file range exchange operation. */
+void
+xfs_xchg_range_iunlock(
+	struct xfs_inode	*ip1,
+	struct xfs_inode	*ip2)
+{
+	if (ip2 != ip1)
+		xfs_iunlock(ip2, XFS_ILOCK_EXCL);
+	xfs_iunlock(ip1, XFS_ILOCK_EXCL);
+}
+
+/*
+ * Estimate the resource requirements to exchange file contents between the two
+ * files.  The caller is required to hold the IOLOCK and the MMAPLOCK and to
+ * have flushed both inodes' pagecache and active direct-ios.
+ */
+int
+xfs_xchg_range_estimate(
+	struct xfs_swapext_req	*req)
+{
+	int			error;
 
+	xfs_xchg_range_ilock(NULL, req->ip1, req->ip2);
+	error = xfs_swapext_estimate(req);
+	xfs_xchg_range_iunlock(req->ip1, req->ip2);
 	return error;
 }
diff --git a/fs/xfs/xfs_xchgrange.h b/fs/xfs/xfs_xchgrange.h
index 414fce7a159f..3870e78f4807 100644
--- a/fs/xfs/xfs_xchgrange.h
+++ b/fs/xfs/xfs_xchgrange.h
@@ -15,4 +15,14 @@ int xfs_exch_range_finish(struct file *file1, struct file *file2);
 int xfs_exch_range(struct file *file1, struct file *file2,
 		struct xfs_exch_range *fxr);
 
+/* XFS-specific parts of file exchanges */
+
+struct xfs_swapext_req;
+
+void xfs_xchg_range_ilock(struct xfs_trans *tp, struct xfs_inode *ip1,
+		struct xfs_inode *ip2);
+void xfs_xchg_range_iunlock(struct xfs_inode *ip1, struct xfs_inode *ip2);
+
+int xfs_xchg_range_estimate(struct xfs_swapext_req *req);
+
 #endif /* __XFS_XCHGRANGE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 12/25] xfs: enable xlog users to toggle atomic extent swapping
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (10 preceding siblings ...)
  2023-05-26  1:17   ` [PATCH 11/25] xfs: create deferred log items for extent swapping Darrick J. Wong
@ 2023-05-26  1:17   ` Darrick J. Wong
  2023-05-26  1:17   ` [PATCH 13/25] xfs: bind the xfs-specific extent swap code to the vfs-generic file exchange code Darrick J. Wong
                     ` (12 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:17 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Plumb the necessary bits into the xlog code so that higher level callers
can enable the atomic extent swapping feature and have it clear
automatically when possible.
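
To illustrate the intended usage (a sketch only, not code from this
patch): a higher level caller that needs the swapext log-incompat bit
to stay set for the whole operation would bracket it roughly like this,
where do_logged_swapext() is a hypothetical stand-in for the swap code
added later in the series:

	/*
	 * Keep an idle log from clearing the swapext log-incompat bit
	 * while a logged operation is running.
	 */
	xlog_use_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_SWAPEXT);
	error = do_logged_swapext(mp);
	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_SWAPEXT);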

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_log.c      |   13 +++++++++++++
 fs/xfs/xfs_log.h      |    1 +
 fs/xfs/xfs_log_priv.h |    1 +
 3 files changed, 15 insertions(+)


diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index a0ef09addc84..37e85c1bb913 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1501,11 +1501,17 @@ xlog_clear_incompat(
 	if (down_write_trylock(&log->l_incompat_xattrs))
 		incompat_mask |= XFS_SB_FEAT_INCOMPAT_LOG_XATTRS;
 
+	if (down_write_trylock(&log->l_incompat_swapext))
+		incompat_mask |= XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT;
+
 	if (!incompat_mask)
 		return;
 
 	xfs_clear_incompat_log_features(mp, incompat_mask);
 
+	if (incompat_mask & XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT)
+		up_write(&log->l_incompat_swapext);
+
 	if (incompat_mask & XFS_SB_FEAT_INCOMPAT_LOG_XATTRS)
 		up_write(&log->l_incompat_xattrs);
 }
@@ -1625,6 +1631,7 @@ xlog_alloc_log(
 	log->l_sectBBsize = 1 << log2_size;
 
 	init_rwsem(&log->l_incompat_xattrs);
+	init_rwsem(&log->l_incompat_swapext);
 
 	xlog_get_iclog_buffer_size(mp, log);
 
@@ -3922,6 +3929,9 @@ xlog_use_incompat_feat(
 	case XLOG_INCOMPAT_FEAT_XATTRS:
 		down_read(&log->l_incompat_xattrs);
 		break;
+	case XLOG_INCOMPAT_FEAT_SWAPEXT:
+		down_read(&log->l_incompat_swapext);
+		break;
 	}
 }
 
@@ -3935,5 +3945,8 @@ xlog_drop_incompat_feat(
 	case XLOG_INCOMPAT_FEAT_XATTRS:
 		up_read(&log->l_incompat_xattrs);
 		break;
+	case XLOG_INCOMPAT_FEAT_SWAPEXT:
+		up_read(&log->l_incompat_swapext);
+		break;
 	}
 }
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index d187f6445909..30bdbf8ee25c 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -161,6 +161,7 @@ bool	  xlog_force_shutdown(struct xlog *log, uint32_t shutdown_flags);
 
 enum xlog_incompat_feat {
 	XLOG_INCOMPAT_FEAT_XATTRS = XFS_SB_FEAT_INCOMPAT_LOG_XATTRS,
+	XLOG_INCOMPAT_FEAT_SWAPEXT = XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT
 };
 
 void xlog_use_incompat_feat(struct xlog *log, enum xlog_incompat_feat what);
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index a13b5b6b744d..6cbee6996de5 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -448,6 +448,7 @@ struct xlog {
 
 	/* Users of log incompat features should take a read lock. */
 	struct rw_semaphore	l_incompat_xattrs;
+	struct rw_semaphore	l_incompat_swapext;
 };
 
 /*


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 13/25] xfs: bind the xfs-specific extent swap code to the vfs-generic file exchange code
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (11 preceding siblings ...)
  2023-05-26  1:17   ` [PATCH 12/25] xfs: enable xlog users to toggle atomic " Darrick J. Wong
@ 2023-05-26  1:17   ` Darrick J. Wong
  2023-05-26  1:18   ` [PATCH 14/25] xfs: add error injection to test swapext recovery Darrick J. Wong
                     ` (11 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:17 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

So far we've constructed the top half of file range exchange which
deals with VFS-level objects; and the bottom half of extent swapping,
which deals with file mappings in XFS data structures.  We still need to
glue the two pieces together, so do that now.
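
For reference, the call chain after this patch looks roughly like this
(function names are taken from this patch and the earlier ones in the
series):

	xfs_exch_range()                  /* VFS-level entry point */
	    __xfs_exch_range()
	        xfs_file_xchg_range()     /* the glue added here */
	            xfs_xchg_range_prep() /* flush, dquots, CoW cancel */
	            xfs_xchg_range()      /* transaction, quota, swap */
	                xfs_swapext()     /* XFS extent swap machinery */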

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |    1 
 fs/xfs/xfs_mount.h     |    5 -
 fs/xfs/xfs_trace.c     |    1 
 fs/xfs/xfs_trace.h     |  120 +++++++++++++
 fs/xfs/xfs_xchgrange.c |  449 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_xchgrange.h |   28 +++
 6 files changed, 602 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 4790f35341e0..2ef726036ce3 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -28,6 +28,7 @@
 #include "xfs_icache.h"
 #include "xfs_iomap.h"
 #include "xfs_reflink.h"
+#include "xfs_swapext.h"
 
 /* Kernel only BMAP related definitions and functions */
 
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index f007f19d16d7..4e29ebf767a0 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -401,6 +401,8 @@ __XFS_HAS_FEAT(nouuid, NOUUID)
 #define XFS_OPSTATE_WARNED_SHRINK	8
 /* Kernel has logged a warning about logged xattr updates being used. */
 #define XFS_OPSTATE_WARNED_LARP		9
+/* Kernel has logged a warning about extent swapping being used on this fs. */
+#define XFS_OPSTATE_WARNED_SWAPEXT	10
 
 #define __XFS_IS_OPSTATE(name, NAME) \
 static inline bool xfs_is_ ## name (struct xfs_mount *mp) \
@@ -440,7 +442,8 @@ xfs_should_warn(struct xfs_mount *mp, long nr)
 	{ (1UL << XFS_OPSTATE_BLOCKGC_ENABLED),		"blockgc" }, \
 	{ (1UL << XFS_OPSTATE_WARNED_SCRUB),		"wscrub" }, \
 	{ (1UL << XFS_OPSTATE_WARNED_SHRINK),		"wshrink" }, \
-	{ (1UL << XFS_OPSTATE_WARNED_LARP),		"wlarp" }
+	{ (1UL << XFS_OPSTATE_WARNED_LARP),		"wlarp" }, \
+	{ (1UL << XFS_OPSTATE_WARNED_SWAPEXT),		"wswapext" }
 
 /*
  * Max and min values for mount-option defined I/O
diff --git a/fs/xfs/xfs_trace.c b/fs/xfs/xfs_trace.c
index b43b973f0e10..e38814f4380c 100644
--- a/fs/xfs/xfs_trace.c
+++ b/fs/xfs/xfs_trace.c
@@ -41,6 +41,7 @@
 #include "xfs_btree_mem.h"
 #include "xfs_bmap.h"
 #include "xfs_swapext.h"
+#include "xfs_xchgrange.h"
 
 /*
  * We include this last to have the helpers above available for the trace
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 8e9cb02ca5be..c7c3227494b0 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -3771,11 +3771,131 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_cancel_cow);
 DEFINE_INODE_IREC_EVENT(xfs_swap_extent_rmap_remap);
 DEFINE_INODE_IREC_EVENT(xfs_swap_extent_rmap_remap_piece);
 DEFINE_INODE_ERROR_EVENT(xfs_swap_extent_rmap_error);
+
+/* swapext tracepoints */
+DEFINE_INODE_ERROR_EVENT(xfs_file_xchg_range_error);
 DEFINE_INODE_IREC_EVENT(xfs_swapext_extent1_skip);
 DEFINE_INODE_IREC_EVENT(xfs_swapext_extent1);
 DEFINE_INODE_IREC_EVENT(xfs_swapext_extent2);
 DEFINE_ITRUNC_EVENT(xfs_swapext_update_inode_size);
 
+#define XFS_EXCH_RANGE_FLAGS_STRS \
+	{ XFS_EXCH_RANGE_NONATOMIC,	"NONATOMIC" }, \
+	{ XFS_EXCH_RANGE_FILE2_FRESH,	"F2_FRESH" }, \
+	{ XFS_EXCH_RANGE_FULL_FILES,	"FULL" }, \
+	{ XFS_EXCH_RANGE_TO_EOF,	"TO_EOF" }, \
+	{ XFS_EXCH_RANGE_FSYNC,		"FSYNC" }, \
+	{ XFS_EXCH_RANGE_DRY_RUN,	"DRY_RUN" }, \
+	{ XFS_EXCH_RANGE_FILE1_WRITTEN,	"F1_WRITTEN" }
+
+/* file exchange-range tracepoint class */
+DECLARE_EVENT_CLASS(xfs_xchg_range_class,
+	TP_PROTO(struct xfs_inode *ip1, const struct xfs_exch_range *fxr,
+		 struct xfs_inode *ip2, unsigned int xchg_flags),
+	TP_ARGS(ip1, fxr, ip2, xchg_flags),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ip1_ino)
+		__field(loff_t, ip1_isize)
+		__field(loff_t, ip1_disize)
+		__field(xfs_ino_t, ip2_ino)
+		__field(loff_t, ip2_isize)
+		__field(loff_t, ip2_disize)
+
+		__field(loff_t, file1_offset)
+		__field(loff_t, file2_offset)
+		__field(unsigned long long, length)
+		__field(unsigned long long, vflags)
+		__field(unsigned int, xflags)
+	),
+	TP_fast_assign(
+		__entry->dev = VFS_I(ip1)->i_sb->s_dev;
+		__entry->ip1_ino = ip1->i_ino;
+		__entry->ip1_isize = VFS_I(ip1)->i_size;
+		__entry->ip1_disize = ip1->i_disk_size;
+		__entry->ip2_ino = ip2->i_ino;
+		__entry->ip2_isize = VFS_I(ip2)->i_size;
+		__entry->ip2_disize = ip2->i_disk_size;
+
+		__entry->file1_offset = fxr->file1_offset;
+		__entry->file2_offset = fxr->file2_offset;
+		__entry->length = fxr->length;
+		__entry->vflags = fxr->flags;
+		__entry->xflags = xchg_flags;
+	),
+	TP_printk("dev %d:%d vfs_flags %s xchg_flags %s bytecount 0x%llx "
+		  "ino1 0x%llx isize 0x%llx disize 0x%llx pos 0x%llx -> "
+		  "ino2 0x%llx isize 0x%llx disize 0x%llx pos 0x%llx",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		   __print_flags(__entry->vflags, "|", XFS_EXCH_RANGE_FLAGS_STRS),
+		   __print_flags(__entry->xflags, "|", XCHG_RANGE_FLAGS_STRS),
+		  __entry->length,
+		  __entry->ip1_ino,
+		  __entry->ip1_isize,
+		  __entry->ip1_disize,
+		  __entry->file1_offset,
+		  __entry->ip2_ino,
+		  __entry->ip2_isize,
+		  __entry->ip2_disize,
+		  __entry->file2_offset)
+);
+
+#define DEFINE_XCHG_RANGE_EVENT(name)	\
+DEFINE_EVENT(xfs_xchg_range_class, name,	\
+	TP_PROTO(struct xfs_inode *ip1, const struct xfs_exch_range *fxr, \
+		 struct xfs_inode *ip2, unsigned int xchg_flags), \
+	TP_ARGS(ip1, fxr, ip2, xchg_flags))
+DEFINE_XCHG_RANGE_EVENT(xfs_xchg_range_prep);
+DEFINE_XCHG_RANGE_EVENT(xfs_xchg_range_flush);
+DEFINE_XCHG_RANGE_EVENT(xfs_xchg_range);
+
+TRACE_EVENT(xfs_xchg_range_freshness,
+	TP_PROTO(struct xfs_inode *ip2, const struct xfs_exch_range *fxr),
+	TP_ARGS(ip2, fxr),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_ino_t, ip2_ino)
+		__field(long long, ip2_mtime)
+		__field(long long, ip2_ctime)
+		__field(int, ip2_mtime_nsec)
+		__field(int, ip2_ctime_nsec)
+
+		__field(xfs_ino_t, file2_ino)
+		__field(long long, file2_mtime)
+		__field(long long, file2_ctime)
+		__field(int, file2_mtime_nsec)
+		__field(int, file2_ctime_nsec)
+	),
+	TP_fast_assign(
+		__entry->dev = VFS_I(ip2)->i_sb->s_dev;
+		__entry->ip2_ino = ip2->i_ino;
+		__entry->ip2_mtime = VFS_I(ip2)->i_mtime.tv_sec;
+		__entry->ip2_ctime = VFS_I(ip2)->i_ctime.tv_sec;
+		__entry->ip2_mtime_nsec = VFS_I(ip2)->i_mtime.tv_nsec;
+		__entry->ip2_ctime_nsec = VFS_I(ip2)->i_ctime.tv_nsec;
+
+		__entry->file2_ino = fxr->file2_ino;
+		__entry->file2_mtime = fxr->file2_mtime;
+		__entry->file2_ctime = fxr->file2_ctime;
+		__entry->file2_mtime_nsec = fxr->file2_mtime_nsec;
+		__entry->file2_ctime_nsec = fxr->file2_ctime_nsec;
+	),
+	TP_printk("dev %d:%d "
+		  "ino 0x%llx mtime %lld:%d ctime %lld:%d -> "
+		  "file 0x%llx mtime %lld:%d ctime %lld:%d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ip2_ino,
+		  __entry->ip2_mtime,
+		  __entry->ip2_mtime_nsec,
+		  __entry->ip2_ctime,
+		  __entry->ip2_ctime_nsec,
+		  __entry->file2_ino,
+		  __entry->file2_mtime,
+		  __entry->file2_mtime_nsec,
+		  __entry->file2_ctime,
+		  __entry->file2_ctime_nsec)
+);
+
 /* fsmap traces */
 DECLARE_EVENT_CLASS(xfs_fsmap_class,
 	TP_PROTO(struct xfs_mount *mp, u32 keydev, xfs_agnumber_t agno,
diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
index 965f8bfc3f59..c9075e72ab51 100644
--- a/fs/xfs/xfs_xchgrange.c
+++ b/fs/xfs/xfs_xchgrange.c
@@ -12,8 +12,15 @@
 #include "xfs_defer.h"
 #include "xfs_inode.h"
 #include "xfs_trans.h"
+#include "xfs_quota.h"
+#include "xfs_bmap_util.h"
+#include "xfs_reflink.h"
+#include "xfs_trace.h"
 #include "xfs_swapext.h"
 #include "xfs_xchgrange.h"
+#include "xfs_sb.h"
+#include "xfs_icache.h"
+#include "xfs_log.h"
 #include <linux/fsnotify.h>
 
 /*
@@ -317,7 +324,7 @@ __xfs_exch_range(
 	if (ret)
 		return ret;
 
-	ret = -EOPNOTSUPP; /* XXX call out to xfs code */
+	ret = xfs_file_xchg_range(file1, file2, fxr);
 	if (ret)
 		return ret;
 
@@ -344,6 +351,78 @@ xfs_exch_range(
 
 /* XFS-specific parts of XFS_IOC_EXCHANGE_RANGE */
 
+/*
+ * Exchanging ranges as a file operation.  This is the binding between the
+ * VFS-level concepts and the XFS-specific implementation.
+ */
+int
+xfs_file_xchg_range(
+	struct file		*file1,
+	struct file		*file2,
+	struct xfs_exch_range	*fxr)
+{
+	struct inode		*inode1 = file_inode(file1);
+	struct inode		*inode2 = file_inode(file2);
+	struct xfs_inode	*ip1 = XFS_I(inode1);
+	struct xfs_inode	*ip2 = XFS_I(inode2);
+	struct xfs_mount	*mp = ip1->i_mount;
+	unsigned int		priv_flags = 0;
+	bool			use_logging = false;
+	int			error;
+
+	if (xfs_is_shutdown(mp))
+		return -EIO;
+
+	/* Update cmtime if the fd/inode don't forbid it. */
+	if (likely(!(file1->f_mode & FMODE_NOCMTIME) && !IS_NOCMTIME(inode1)))
+		priv_flags |= XFS_XCHG_RANGE_UPD_CMTIME1;
+	if (likely(!(file2->f_mode & FMODE_NOCMTIME) && !IS_NOCMTIME(inode2)))
+		priv_flags |= XFS_XCHG_RANGE_UPD_CMTIME2;
+
+	/* Lock both files against IO */
+	error = xfs_ilock2_io_mmap(ip1, ip2);
+	if (error)
+		goto out_err;
+
+	/* Prepare and then exchange file contents. */
+	error = xfs_xchg_range_prep(file1, file2, fxr);
+	if (error)
+		goto out_unlock;
+
+	/* Get permission to use log-assisted file content swaps. */
+	error = xfs_xchg_range_grab_log_assist(mp,
+			!(fxr->flags & XFS_EXCH_RANGE_NONATOMIC),
+			&use_logging);
+	if (error)
+		goto out_unlock;
+	if (use_logging)
+		priv_flags |= XFS_XCHG_RANGE_LOGGED;
+
+	error = xfs_xchg_range(ip1, ip2, fxr, priv_flags);
+	if (error)
+		goto out_drop_feat;
+
+	/*
+	 * Finish the exchange by removing special file privileges like any
+	 * other file write would do.  This may involve turning on support for
+	 * logged xattrs if either file has security capabilities, which is why
+	 * xfs_xchg_range_grab_log_assist must come before xfs_attr_grab_log_assist.
+	 */
+	error = xfs_exch_range_finish(file1, file2);
+	if (error)
+		goto out_drop_feat;
+
+out_drop_feat:
+	if (use_logging)
+		xfs_xchg_range_rele_log_assist(mp);
+out_unlock:
+	xfs_iunlock2_io_mmap(ip1, ip2);
+out_err:
+	if (error)
+		trace_xfs_file_xchg_range_error(ip2, error, _RET_IP_);
+	return error;
+}
+
 /* Lock (and optionally join) two inodes for a file range exchange. */
 void
 xfs_xchg_range_ilock(
@@ -391,3 +470,371 @@ xfs_xchg_range_estimate(
 	xfs_xchg_range_iunlock(req->ip1, req->ip2);
 	return error;
 }
+
+/* Prepare two files to have their data exchanged. */
+int
+xfs_xchg_range_prep(
+	struct file		*file1,
+	struct file		*file2,
+	struct xfs_exch_range	*fxr)
+{
+	struct xfs_inode	*ip1 = XFS_I(file_inode(file1));
+	struct xfs_inode	*ip2 = XFS_I(file_inode(file2));
+	int			error;
+
+	trace_xfs_xchg_range_prep(ip1, fxr, ip2, 0);
+
+	/* Verify both files are either real-time or non-realtime */
+	if (XFS_IS_REALTIME_INODE(ip1) != XFS_IS_REALTIME_INODE(ip2))
+		return -EINVAL;
+
+	/*
+	 * The alignment checks in the VFS helpers cannot deal with allocation
+	 * units that are not powers of 2.  This can happen with the realtime
+	 * volume if the extent size is set.  Note that alignment checks are
+	 * skipped if FULL_FILES is set.
+	 */
+	if (!(fxr->flags & XFS_EXCH_RANGE_FULL_FILES) &&
+	    !is_power_of_2(xfs_inode_alloc_unitsize(ip2)))
+		return -EOPNOTSUPP;
+
+	error = xfs_exch_range_prep(file1, file2, fxr,
+			xfs_inode_alloc_unitsize(ip2));
+	if (error || fxr->length == 0)
+		return error;
+
+	/* Attach dquots to both inodes before changing block maps. */
+	error = xfs_qm_dqattach(ip2);
+	if (error)
+		return error;
+	error = xfs_qm_dqattach(ip1);
+	if (error)
+		return error;
+
+	trace_xfs_xchg_range_flush(ip1, fxr, ip2, 0);
+
+	/* Flush the relevant ranges of both files. */
+	error = xfs_flush_unmap_range(ip2, fxr->file2_offset, fxr->length);
+	if (error)
+		return error;
+	error = xfs_flush_unmap_range(ip1, fxr->file1_offset, fxr->length);
+	if (error)
+		return error;
+
+	/*
+	 * Cancel CoW fork preallocations for the ranges of both files.  The
+	 * prep function should have flushed all the dirty data, so the only
+	 * extents remaining should be speculative.
+	 */
+	if (xfs_inode_has_cow_data(ip1)) {
+		error = xfs_reflink_cancel_cow_range(ip1, fxr->file1_offset,
+				fxr->length, true);
+		if (error)
+			return error;
+	}
+
+	if (xfs_inode_has_cow_data(ip2)) {
+		error = xfs_reflink_cancel_cow_range(ip2, fxr->file2_offset,
+				fxr->length, true);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
+#define QRETRY_IP1	(0x1)
+#define QRETRY_IP2	(0x2)
+
+/*
+ * Obtain a quota reservation to make sure we don't hit EDQUOT.  We can skip
+ * this if quota enforcement is disabled or if both inodes' dquots are the
+ * same.  The qretry structure must be initialized to zeroes before the first
+ * call to this function.
+ */
+STATIC int
+xfs_xchg_range_reserve_quota(
+	struct xfs_trans		*tp,
+	const struct xfs_swapext_req	*req,
+	unsigned int			*qretry)
+{
+	int64_t				ddelta, rdelta;
+	int				ip1_error = 0;
+	int				error;
+
+	/*
+	 * Don't bother with a quota reservation if we're not enforcing them
+	 * or the two inodes have the same dquots.
+	 */
+	if (!XFS_IS_QUOTA_ON(tp->t_mountp) || req->ip1 == req->ip2 ||
+	    (req->ip1->i_udquot == req->ip2->i_udquot &&
+	     req->ip1->i_gdquot == req->ip2->i_gdquot &&
+	     req->ip1->i_pdquot == req->ip2->i_pdquot))
+		return 0;
+
+	*qretry = 0;
+
+	/*
+	 * For each file, compute the net gain in the number of regular blocks
+	 * that will be mapped into that file and reserve that much quota.  The
+	 * quota counts must be able to absorb at least that much space.
+	 */
+	ddelta = req->ip2_bcount - req->ip1_bcount;
+	rdelta = req->ip2_rtbcount - req->ip1_rtbcount;
+	if (ddelta > 0 || rdelta > 0) {
+		error = xfs_trans_reserve_quota_nblks(tp, req->ip1,
+				ddelta > 0 ? ddelta : 0,
+				rdelta > 0 ? rdelta : 0,
+				false);
+		if (error == -EDQUOT || error == -ENOSPC) {
+			/*
+			 * Save this error and see what happens if we try to
+			 * reserve quota for ip2.  Then report both.
+			 */
+			*qretry |= QRETRY_IP1;
+			ip1_error = error;
+			error = 0;
+		}
+		if (error)
+			return error;
+	}
+	if (ddelta < 0 || rdelta < 0) {
+		error = xfs_trans_reserve_quota_nblks(tp, req->ip2,
+				ddelta < 0 ? -ddelta : 0,
+				rdelta < 0 ? -rdelta : 0,
+				false);
+		if (error == -EDQUOT || error == -ENOSPC)
+			*qretry |= QRETRY_IP2;
+		if (error)
+			return error;
+	}
+	if (ip1_error)
+		return ip1_error;
+
+	/*
+	 * For each file, forcibly reserve the gross gain in mapped blocks so
+	 * that we don't trip over any quota block reservation assertions.
+	 * We must reserve the gross gain because the quota code subtracts from
+	 * bcount the number of blocks that we unmap; it does not add that
+	 * quantity back to the quota block reservation.
+	 */
+	error = xfs_trans_reserve_quota_nblks(tp, req->ip1, req->ip1_bcount,
+			req->ip1_rtbcount, true);
+	if (error)
+		return error;
+
+	return xfs_trans_reserve_quota_nblks(tp, req->ip2, req->ip2_bcount,
+			req->ip2_rtbcount, true);
+}
+
+/*
+ * Get permission to use log-assisted atomic exchange of file extents.
+ *
+ * Callers must hold the IOLOCK and MMAPLOCK of both files.  They must not be
+ * running any transactions or hold any ILOCKS.  If @use_logging is set after a
+ * successful return, callers must call xfs_xchg_range_rele_log_assist after
+ * the exchange is completed.
+ */
+int
+xfs_xchg_range_grab_log_assist(
+	struct xfs_mount	*mp,
+	bool			force,
+	bool			*use_logging)
+{
+	int			error = 0;
+
+	/*
+	 * Protect ourselves from an idle log clearing the atomic swapext
+	 * log incompat feature bit.
+	 */
+	xlog_use_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_SWAPEXT);
+	*use_logging = true;
+
+	/*
+	 * If log-assisted swapping is already enabled, the caller can use the
+	 * log assisted swap functions with the log-incompat reference we got.
+	 */
+	if (xfs_sb_version_haslogswapext(&mp->m_sb))
+		return 0;
+
+	/*
+	 * If the caller doesn't /require/ log-assisted swapping, drop the
+	 * log-incompat feature protection and exit.  The caller cannot use
+	 * log assisted swapping.
+	 */
+	if (!force)
+		goto drop_incompat;
+
+	/*
+	 * Caller requires log-assisted swapping but the fs feature set isn't
+	 * rich enough to support it.  Bail out.
+	 */
+	if (!xfs_swapext_supported(mp)) {
+		error = -EOPNOTSUPP;
+		goto drop_incompat;
+	}
+
+	error = xfs_add_incompat_log_feature(mp,
+			XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT);
+	if (error)
+		goto drop_incompat;
+
+	xfs_warn_mount(mp, XFS_OPSTATE_WARNED_SWAPEXT,
+ "EXPERIMENTAL atomic file range swap feature in use. Use at your own risk!");
+
+	return 0;
+drop_incompat:
+	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_SWAPEXT);
+	*use_logging = false;
+	return error;
+}
+
+/* Release permission to use log-assisted extent swapping. */
+void
+xfs_xchg_range_rele_log_assist(
+	struct xfs_mount	*mp)
+{
+	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_SWAPEXT);
+}
+
+/* Exchange the contents of two files. */
+int
+xfs_xchg_range(
+	struct xfs_inode		*ip1,
+	struct xfs_inode		*ip2,
+	const struct xfs_exch_range	*fxr,
+	unsigned int			xchg_flags)
+{
+	struct xfs_mount		*mp = ip1->i_mount;
+	struct xfs_swapext_req		req = {
+		.ip1			= ip1,
+		.ip2			= ip2,
+		.whichfork		= XFS_DATA_FORK,
+		.startoff1		= XFS_B_TO_FSBT(mp, fxr->file1_offset),
+		.startoff2		= XFS_B_TO_FSBT(mp, fxr->file2_offset),
+		.blockcount		= XFS_B_TO_FSB(mp, fxr->length),
+	};
+	struct xfs_trans		*tp;
+	unsigned int			qretry;
+	bool				retried = false;
+	int				error;
+
+	trace_xfs_xchg_range(ip1, fxr, ip2, xchg_flags);
+
+	/*
+	 * This function only supports using log intent items (SXI items if
+	 * atomic exchange is required, or BUI items if not) to exchange file
+	 * data.  The legacy whole-fork swap will be ported in a later patch.
+	 */
+	if (!(xchg_flags & XFS_XCHG_RANGE_LOGGED) && !xfs_swapext_supported(mp))
+		return -EOPNOTSUPP;
+
+	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF)
+		req.req_flags |= XFS_SWAP_REQ_SET_SIZES;
+	if (fxr->flags & XFS_EXCH_RANGE_FILE1_WRITTEN)
+		req.req_flags |= XFS_SWAP_REQ_INO1_WRITTEN;
+	if (xchg_flags & XFS_XCHG_RANGE_LOGGED)
+		req.req_flags |= XFS_SWAP_REQ_LOGGED;
+
+	error = xfs_xchg_range_estimate(&req);
+	if (error)
+		return error;
+
+retry:
+	/* Allocate the transaction, lock the inodes, and join them. */
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, req.resblks, 0,
+			XFS_TRANS_RES_FDBLKS, &tp);
+	if (error)
+		return error;
+
+	xfs_xchg_range_ilock(tp, ip1, ip2);
+
+	trace_xfs_swap_extent_before(ip2, 0);
+	trace_xfs_swap_extent_before(ip1, 1);
+
+	if (fxr->flags & XFS_EXCH_RANGE_FILE2_FRESH)
+		trace_xfs_xchg_range_freshness(ip2, fxr);
+
+	/*
+	 * Now that we've excluded all other inode metadata changes by taking
+	 * the ILOCK, repeat the freshness check.
+	 */
+	error = xfs_exch_range_check_fresh(VFS_I(ip2), fxr);
+	if (error)
+		goto out_trans_cancel;
+
+	error = xfs_swapext_check_extents(mp, &req);
+	if (error)
+		goto out_trans_cancel;
+
+	/*
+	 * Reserve ourselves some quota if any of them are in enforcing mode.
+	 * In theory we only need enough to satisfy the change in the number
+	 * of blocks between the two ranges being remapped.
+	 */
+	error = xfs_xchg_range_reserve_quota(tp, &req, &qretry);
+	if ((error == -EDQUOT || error == -ENOSPC) && !retried) {
+		xfs_trans_cancel(tp);
+		xfs_xchg_range_iunlock(ip1, ip2);
+		if (qretry & QRETRY_IP1)
+			xfs_blockgc_free_quota(ip1, 0);
+		if (qretry & QRETRY_IP2)
+			xfs_blockgc_free_quota(ip2, 0);
+		retried = true;
+		goto retry;
+	}
+	if (error)
+		goto out_trans_cancel;
+
+	/* If we got this far on a dry run, all parameters are ok. */
+	if (fxr->flags & XFS_EXCH_RANGE_DRY_RUN)
+		goto out_trans_cancel;
+
+	/* Update the mtime and ctime of both files. */
+	if (xchg_flags & XFS_XCHG_RANGE_UPD_CMTIME1)
+		xfs_trans_ichgtime(tp, ip1,
+				XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+	if (xchg_flags & XFS_XCHG_RANGE_UPD_CMTIME2)
+		xfs_trans_ichgtime(tp, ip2,
+				XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+
+	xfs_swapext(tp, &req);
+
+	/*
+	 * Force the log to persist metadata updates if the caller or the
+	 * administrator requires this.  The VFS prep function already flushed
+	 * the relevant parts of the page cache.
+	 */
+	if (xfs_has_wsync(mp) || (fxr->flags & XFS_EXCH_RANGE_FSYNC))
+		xfs_trans_set_sync(tp);
+
+	error = xfs_trans_commit(tp);
+
+	trace_xfs_swap_extent_after(ip2, 0);
+	trace_xfs_swap_extent_after(ip1, 1);
+
+	if (error)
+		goto out_unlock;
+
+	/*
+	 * If the caller wanted us to exchange the contents of two complete
+	 * files of unequal length, exchange the incore sizes now.  This should
+	 * be safe because we flushed both files' page caches, moved all the
+	 * extents, and updated the ondisk sizes.
+	 */
+	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF) {
+		loff_t	temp;
+
+		temp = i_size_read(VFS_I(ip2));
+		i_size_write(VFS_I(ip2), i_size_read(VFS_I(ip1)));
+		i_size_write(VFS_I(ip1), temp);
+	}
+
+out_unlock:
+	xfs_xchg_range_iunlock(ip1, ip2);
+	return error;
+
+out_trans_cancel:
+	xfs_trans_cancel(tp);
+	goto out_unlock;
+}
diff --git a/fs/xfs/xfs_xchgrange.h b/fs/xfs/xfs_xchgrange.h
index 3870e78f4807..1f79f16e4a95 100644
--- a/fs/xfs/xfs_xchgrange.h
+++ b/fs/xfs/xfs_xchgrange.h
@@ -15,6 +15,11 @@ int xfs_exch_range_finish(struct file *file1, struct file *file2);
 int xfs_exch_range(struct file *file1, struct file *file2,
 		struct xfs_exch_range *fxr);
 
+/* Binding between the generic VFS and the XFS-specific file exchange */
+
+int xfs_file_xchg_range(struct file *file1, struct file *file2,
+		struct xfs_exch_range *fxr);
+
 /* XFS-specific parts of file exchanges */
 
 struct xfs_swapext_req;
@@ -25,4 +30,27 @@ void xfs_xchg_range_iunlock(struct xfs_inode *ip1, struct xfs_inode *ip2);
 
 int xfs_xchg_range_estimate(struct xfs_swapext_req *req);
 
+int xfs_xchg_range_grab_log_assist(struct xfs_mount *mp, bool force,
+		bool *use_logging);
+void xfs_xchg_range_rele_log_assist(struct xfs_mount *mp);
+
+/* Caller has permission to use log intent items for the exchange operation. */
+#define XFS_XCHG_RANGE_LOGGED		(1U << 0)
+
+/* Update ip1's change and mod time. */
+#define XFS_XCHG_RANGE_UPD_CMTIME1	(1U << 1)
+
+/* Update ip2's change and mod time. */
+#define XFS_XCHG_RANGE_UPD_CMTIME2	(1U << 2)
+
+#define XCHG_RANGE_FLAGS_STRS \
+	{ XFS_XCHG_RANGE_LOGGED,		"LOGGED" }, \
+	{ XFS_XCHG_RANGE_UPD_CMTIME1,		"UPD_CMTIME1" }, \
+	{ XFS_XCHG_RANGE_UPD_CMTIME2,		"UPD_CMTIME2" }
+
+int xfs_xchg_range(struct xfs_inode *ip1, struct xfs_inode *ip2,
+		const struct xfs_exch_range *fxr, unsigned int xchg_flags);
+int xfs_xchg_range_prep(struct file *file1, struct file *file2,
+		struct xfs_exch_range *fxr);
+
 #endif /* __XFS_XCHGRANGE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 14/25] xfs: add error injection to test swapext recovery
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (12 preceding siblings ...)
  2023-05-26  1:17   ` [PATCH 13/25] xfs: bind the xfs-specific extent swap code to the vfs-generic file exchange code Darrick J. Wong
@ 2023-05-26  1:18   ` Darrick J. Wong
  2023-05-26  1:18   ` [PATCH 15/25] xfs: port xfs_swap_extents_rmap to our new code Darrick J. Wong
                     ` (10 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:18 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Add an errortag so that we can test recovery of swapext log items.
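
Once this lands, the knob should appear next to the other errortags
under sysfs (assuming the usual errortag plumbing), e.g. as
/sys/fs/xfs/<dev>/errortag/swapext_finish_one; arming it makes
xfs_swapext_finish_one() fail with -EIO partway through a swap so that
recovery of the SXI intent items can be exercised.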

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_errortag.h |    4 +++-
 fs/xfs/libxfs/xfs_swapext.c  |    3 +++
 fs/xfs/xfs_error.c           |    3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h
index 01a9e86b3037..263d62a8d70f 100644
--- a/fs/xfs/libxfs/xfs_errortag.h
+++ b/fs/xfs/libxfs/xfs_errortag.h
@@ -63,7 +63,8 @@
 #define XFS_ERRTAG_ATTR_LEAF_TO_NODE			41
 #define XFS_ERRTAG_WB_DELAY_MS				42
 #define XFS_ERRTAG_WRITE_DELAY_MS			43
-#define XFS_ERRTAG_MAX					44
+#define XFS_ERRTAG_SWAPEXT_FINISH_ONE			44
+#define XFS_ERRTAG_MAX					45
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -111,5 +112,6 @@
 #define XFS_RANDOM_ATTR_LEAF_TO_NODE			1
 #define XFS_RANDOM_WB_DELAY_MS				3000
 #define XFS_RANDOM_WRITE_DELAY_MS			3000
+#define XFS_RANDOM_SWAPEXT_FINISH_ONE			1
 
 #endif /* __XFS_ERRORTAG_H_ */
diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c
index 671dd8365a02..08c5f854edcd 100644
--- a/fs/xfs/libxfs/xfs_swapext.c
+++ b/fs/xfs/libxfs/xfs_swapext.c
@@ -445,6 +445,9 @@ xfs_swapext_finish_one(
 			return error;
 	}
 
+	if (XFS_TEST_ERROR(false, tp->t_mountp, XFS_ERRTAG_SWAPEXT_FINISH_ONE))
+		return -EIO;
+
 	/* If we still have work to do, ask for a new transaction. */
 	if (sxi_has_more_swap_work(sxi) || sxi_has_postop_work(sxi)) {
 		trace_xfs_swapext_defer(tp->t_mountp, sxi);
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index b2cbbba3e15a..c3792ab41c27 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -62,6 +62,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_ATTR_LEAF_TO_NODE,
 	XFS_RANDOM_WB_DELAY_MS,
 	XFS_RANDOM_WRITE_DELAY_MS,
+	XFS_RANDOM_SWAPEXT_FINISH_ONE,
 };
 
 struct xfs_errortag_attr {
@@ -179,6 +180,7 @@ XFS_ERRORTAG_ATTR_RW(da_leaf_split,	XFS_ERRTAG_DA_LEAF_SPLIT);
 XFS_ERRORTAG_ATTR_RW(attr_leaf_to_node,	XFS_ERRTAG_ATTR_LEAF_TO_NODE);
 XFS_ERRORTAG_ATTR_RW(wb_delay_ms,	XFS_ERRTAG_WB_DELAY_MS);
 XFS_ERRORTAG_ATTR_RW(write_delay_ms,	XFS_ERRTAG_WRITE_DELAY_MS);
+XFS_ERRORTAG_ATTR_RW(swapext_finish_one, XFS_ERRTAG_SWAPEXT_FINISH_ONE);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -224,6 +226,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(attr_leaf_to_node),
 	XFS_ERRORTAG_ATTR_LIST(wb_delay_ms),
 	XFS_ERRORTAG_ATTR_LIST(write_delay_ms),
+	XFS_ERRORTAG_ATTR_LIST(swapext_finish_one),
 	NULL,
 };
 ATTRIBUTE_GROUPS(xfs_errortag);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 15/25] xfs: port xfs_swap_extents_rmap to our new code
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (13 preceding siblings ...)
  2023-05-26  1:18   ` [PATCH 14/25] xfs: add error injection to test swapext recovery Darrick J. Wong
@ 2023-05-26  1:18   ` Darrick J. Wong
  2023-05-26  1:18   ` [PATCH 16/25] xfs: consolidate all of the xfs_swap_extent_forks code Darrick J. Wong
                     ` (9 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:18 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

The inner loop of xfs_swap_extents_rmap does the same work as
xfs_swapext_finish_one, so adapt it to use that.  Doing so has the side
benefit that the older code path no longer wastes its time remapping
shared extents.

This forms the basis of the non-atomic swaprange implementation.
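
For reference, the new calling convention boils down to the following sketch
(names and types as they appear in the diff below; error handling trimmed):

	struct xfs_swapext_req	req = {
		.ip1		= tip,		/* donor file */
		.ip2		= ip,		/* target file */
		.whichfork	= XFS_DATA_FORK,
		.blockcount	= XFS_B_TO_FSB(ip->i_mount,
					       i_size_read(VFS_I(ip))),
	};

	xfs_swapext(tp, &req);			/* queue swap intents */
	error = xfs_defer_finish(&tp);		/* run them to completion */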

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |  151 +++++-------------------------------------------
 fs/xfs/xfs_trace.h     |    5 --
 2 files changed, 16 insertions(+), 140 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 2ef726036ce3..f6eaf5b1251b 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1360,138 +1360,6 @@ xfs_swap_extent_flush(
 	return 0;
 }
 
-/*
- * Move extents from one file to another, when rmap is enabled.
- */
-STATIC int
-xfs_swap_extent_rmap(
-	struct xfs_trans		**tpp,
-	struct xfs_inode		*ip,
-	struct xfs_inode		*tip)
-{
-	struct xfs_trans		*tp = *tpp;
-	struct xfs_bmbt_irec		irec;
-	struct xfs_bmbt_irec		uirec;
-	struct xfs_bmbt_irec		tirec;
-	xfs_fileoff_t			offset_fsb;
-	xfs_fileoff_t			end_fsb;
-	xfs_filblks_t			count_fsb;
-	int				error;
-	xfs_filblks_t			ilen;
-	xfs_filblks_t			rlen;
-	int				nimaps;
-	uint64_t			tip_flags2;
-
-	/*
-	 * If the source file has shared blocks, we must flag the donor
-	 * file as having shared blocks so that we get the shared-block
-	 * rmap functions when we go to fix up the rmaps.  The flags
-	 * will be switch for reals later.
-	 */
-	tip_flags2 = tip->i_diflags2;
-	if (ip->i_diflags2 & XFS_DIFLAG2_REFLINK)
-		tip->i_diflags2 |= XFS_DIFLAG2_REFLINK;
-
-	offset_fsb = 0;
-	end_fsb = XFS_B_TO_FSB(ip->i_mount, i_size_read(VFS_I(ip)));
-	count_fsb = (xfs_filblks_t)(end_fsb - offset_fsb);
-
-	while (count_fsb) {
-		/* Read extent from the donor file */
-		nimaps = 1;
-		error = xfs_bmapi_read(tip, offset_fsb, count_fsb, &tirec,
-				&nimaps, 0);
-		if (error)
-			goto out;
-		ASSERT(nimaps == 1);
-		ASSERT(tirec.br_startblock != DELAYSTARTBLOCK);
-
-		trace_xfs_swap_extent_rmap_remap(tip, &tirec);
-		ilen = tirec.br_blockcount;
-
-		/* Unmap the old blocks in the source file. */
-		while (tirec.br_blockcount) {
-			ASSERT(tp->t_highest_agno == NULLAGNUMBER);
-			trace_xfs_swap_extent_rmap_remap_piece(tip, &tirec);
-
-			/* Read extent from the source file */
-			nimaps = 1;
-			error = xfs_bmapi_read(ip, tirec.br_startoff,
-					tirec.br_blockcount, &irec,
-					&nimaps, 0);
-			if (error)
-				goto out;
-			ASSERT(nimaps == 1);
-			ASSERT(tirec.br_startoff == irec.br_startoff);
-			trace_xfs_swap_extent_rmap_remap_piece(ip, &irec);
-
-			/* Trim the extent. */
-			uirec = tirec;
-			uirec.br_blockcount = rlen = min_t(xfs_filblks_t,
-					tirec.br_blockcount,
-					irec.br_blockcount);
-			trace_xfs_swap_extent_rmap_remap_piece(tip, &uirec);
-
-			if (xfs_bmap_is_real_extent(&uirec)) {
-				error = xfs_iext_count_may_overflow(ip,
-						XFS_DATA_FORK,
-						XFS_IEXT_SWAP_RMAP_CNT);
-				if (error == -EFBIG)
-					error = xfs_iext_count_upgrade(tp, ip,
-							XFS_IEXT_SWAP_RMAP_CNT);
-				if (error)
-					goto out;
-			}
-
-			if (xfs_bmap_is_real_extent(&irec)) {
-				error = xfs_iext_count_may_overflow(tip,
-						XFS_DATA_FORK,
-						XFS_IEXT_SWAP_RMAP_CNT);
-				if (error == -EFBIG)
-					error = xfs_iext_count_upgrade(tp, ip,
-							XFS_IEXT_SWAP_RMAP_CNT);
-				if (error)
-					goto out;
-			}
-
-			/* Remove the mapping from the donor file. */
-			xfs_bmap_unmap_extent(tp, tip, XFS_DATA_FORK, &uirec);
-
-			/* Remove the mapping from the source file. */
-			xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &irec);
-
-			/* Map the donor file's blocks into the source file. */
-			xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &uirec);
-
-			/* Map the source file's blocks into the donor file. */
-			xfs_bmap_map_extent(tp, tip, XFS_DATA_FORK, &irec);
-
-			error = xfs_defer_finish(tpp);
-			tp = *tpp;
-			if (error)
-				goto out;
-
-			tirec.br_startoff += rlen;
-			if (tirec.br_startblock != HOLESTARTBLOCK &&
-			    tirec.br_startblock != DELAYSTARTBLOCK)
-				tirec.br_startblock += rlen;
-			tirec.br_blockcount -= rlen;
-		}
-
-		/* Roll on... */
-		count_fsb -= ilen;
-		offset_fsb += ilen;
-	}
-
-	tip->i_diflags2 = tip_flags2;
-	return 0;
-
-out:
-	trace_xfs_swap_extent_rmap_error(ip, error, _RET_IP_);
-	tip->i_diflags2 = tip_flags2;
-	return error;
-}
-
 /* Swap the extents of two files by swapping data forks. */
 STATIC int
 xfs_swap_extent_forks(
@@ -1775,13 +1643,24 @@ xfs_swap_extents(
 	src_log_flags = XFS_ILOG_CORE;
 	target_log_flags = XFS_ILOG_CORE;
 
-	if (xfs_has_rmapbt(mp))
-		error = xfs_swap_extent_rmap(&tp, ip, tip);
-	else
+	if (xfs_has_rmapbt(mp)) {
+		struct xfs_swapext_req	req = {
+			.ip1		= tip,
+			.ip2		= ip,
+			.whichfork	= XFS_DATA_FORK,
+			.blockcount	= XFS_B_TO_FSB(ip->i_mount,
+						       i_size_read(VFS_I(ip))),
+		};
+
+		xfs_swapext(tp, &req);
+		error = xfs_defer_finish(&tp);
+	} else
 		error = xfs_swap_extent_forks(tp, ip, tip, &src_log_flags,
 				&target_log_flags);
-	if (error)
+	if (error) {
+		trace_xfs_swap_extent_error(ip, error, _THIS_IP_);
 		goto out_trans_cancel;
+	}
 
 	/* Do we have to swap reflink flags? */
 	if ((ip->i_diflags2 & XFS_DIFLAG2_REFLINK) ^
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index c7c3227494b0..59f740863e70 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -3767,13 +3767,10 @@ DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error);
 
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cancel_cow);
 
-/* rmap swapext tracepoints */
-DEFINE_INODE_IREC_EVENT(xfs_swap_extent_rmap_remap);
-DEFINE_INODE_IREC_EVENT(xfs_swap_extent_rmap_remap_piece);
-DEFINE_INODE_ERROR_EVENT(xfs_swap_extent_rmap_error);
 
 /* swapext tracepoints */
 DEFINE_INODE_ERROR_EVENT(xfs_file_xchg_range_error);
+DEFINE_INODE_ERROR_EVENT(xfs_swap_extent_error);
 DEFINE_INODE_IREC_EVENT(xfs_swapext_extent1_skip);
 DEFINE_INODE_IREC_EVENT(xfs_swapext_extent1);
 DEFINE_INODE_IREC_EVENT(xfs_swapext_extent2);


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 16/25] xfs: consolidate all of the xfs_swap_extent_forks code
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (14 preceding siblings ...)
  2023-05-26  1:18   ` [PATCH 15/25] xfs: port xfs_swap_extents_rmap to our new code Darrick J. Wong
@ 2023-05-26  1:18   ` Darrick J. Wong
  2023-05-26  1:19   ` [PATCH 17/25] xfs: port xfs_swap_extent_forks to use xfs_swapext_req Darrick J. Wong
                     ` (8 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:18 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Now that we've moved the old swapext code to use the new log-assisted
extent swap code on rmap filesystems, start adapting the old fork-swap
implementation so that it can be called from the new ioctl interface.
Later on, this will let us reimplement the old interface on top of the
new one.

Consolidate the reflink flag swap code and the bmbt owner change
scan code in xfs_swap_extent_forks, since both interfaces are going to
need that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |  220 ++++++++++++++++++++++++------------------------
 1 file changed, 108 insertions(+), 112 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index f6eaf5b1251b..9007466e56e6 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1360,19 +1360,61 @@ xfs_swap_extent_flush(
 	return 0;
 }
 
+/*
+ * Fix up the owners of the bmbt blocks to refer to the current inode. The
+ * change owner scan attempts to order all modified buffers in the current
+ * transaction. In the event of ordered buffer failure, the offending buffer is
+ * physically logged as a fallback and the scan returns -EAGAIN. We must roll
+ * the transaction in this case to replenish the fallback log reservation and
+ * restart the scan. This process repeats until the scan completes.
+ */
+static int
+xfs_swap_change_owner(
+	struct xfs_trans	**tpp,
+	struct xfs_inode	*ip,
+	struct xfs_inode	*tmpip)
+{
+	int			error;
+	struct xfs_trans	*tp = *tpp;
+
+	do {
+		error = xfs_bmbt_change_owner(tp, ip, XFS_DATA_FORK, ip->i_ino,
+					      NULL);
+		/* success or fatal error */
+		if (error != -EAGAIN)
+			break;
+
+		error = xfs_trans_roll(tpp);
+		if (error)
+			break;
+		tp = *tpp;
+
+		/*
+		 * Redirty both inodes so they can relog and keep the log tail
+		 * moving forward.
+		 */
+		xfs_trans_ijoin(tp, ip, 0);
+		xfs_trans_ijoin(tp, tmpip, 0);
+		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+		xfs_trans_log_inode(tp, tmpip, XFS_ILOG_CORE);
+	} while (true);
+
+	return error;
+}
+
 /* Swap the extents of two files by swapping data forks. */
 STATIC int
 xfs_swap_extent_forks(
-	struct xfs_trans	*tp,
+	struct xfs_trans	**tpp,
 	struct xfs_inode	*ip,
-	struct xfs_inode	*tip,
-	int			*src_log_flags,
-	int			*target_log_flags)
+	struct xfs_inode	*tip)
 {
 	xfs_filblks_t		aforkblks = 0;
 	xfs_filblks_t		taforkblks = 0;
 	xfs_extnum_t		junk;
 	uint64_t		tmp;
+	int			src_log_flags = XFS_ILOG_CORE;
+	int			target_log_flags = XFS_ILOG_CORE;
 	int			error;
 
 	/*
@@ -1380,14 +1422,14 @@ xfs_swap_extent_forks(
 	 */
 	if (xfs_inode_has_attr_fork(ip) && ip->i_af.if_nextents > 0 &&
 	    ip->i_af.if_format != XFS_DINODE_FMT_LOCAL) {
-		error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
+		error = xfs_bmap_count_blocks(*tpp, ip, XFS_ATTR_FORK, &junk,
 				&aforkblks);
 		if (error)
 			return error;
 	}
 	if (xfs_inode_has_attr_fork(tip) && tip->i_af.if_nextents > 0 &&
 	    tip->i_af.if_format != XFS_DINODE_FMT_LOCAL) {
-		error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
+		error = xfs_bmap_count_blocks(*tpp, tip, XFS_ATTR_FORK, &junk,
 				&taforkblks);
 		if (error)
 			return error;
@@ -1402,9 +1444,9 @@ xfs_swap_extent_forks(
 	 */
 	if (xfs_has_v3inodes(ip->i_mount)) {
 		if (ip->i_df.if_format == XFS_DINODE_FMT_BTREE)
-			(*target_log_flags) |= XFS_ILOG_DOWNER;
+			target_log_flags |= XFS_ILOG_DOWNER;
 		if (tip->i_df.if_format == XFS_DINODE_FMT_BTREE)
-			(*src_log_flags) |= XFS_ILOG_DOWNER;
+			src_log_flags |= XFS_ILOG_DOWNER;
 	}
 
 	/*
@@ -1434,71 +1476,80 @@ xfs_swap_extent_forks(
 
 	switch (ip->i_df.if_format) {
 	case XFS_DINODE_FMT_EXTENTS:
-		(*src_log_flags) |= XFS_ILOG_DEXT;
+		src_log_flags |= XFS_ILOG_DEXT;
 		break;
 	case XFS_DINODE_FMT_BTREE:
 		ASSERT(!xfs_has_v3inodes(ip->i_mount) ||
-		       (*src_log_flags & XFS_ILOG_DOWNER));
-		(*src_log_flags) |= XFS_ILOG_DBROOT;
+		       (src_log_flags & XFS_ILOG_DOWNER));
+		src_log_flags |= XFS_ILOG_DBROOT;
 		break;
 	}
 
 	switch (tip->i_df.if_format) {
 	case XFS_DINODE_FMT_EXTENTS:
-		(*target_log_flags) |= XFS_ILOG_DEXT;
+		target_log_flags |= XFS_ILOG_DEXT;
 		break;
 	case XFS_DINODE_FMT_BTREE:
-		(*target_log_flags) |= XFS_ILOG_DBROOT;
+		target_log_flags |= XFS_ILOG_DBROOT;
 		ASSERT(!xfs_has_v3inodes(ip->i_mount) ||
-		       (*target_log_flags & XFS_ILOG_DOWNER));
+		       (target_log_flags & XFS_ILOG_DOWNER));
 		break;
 	}
 
+	/* Do we have to swap reflink flags? */
+	if ((ip->i_diflags2 & XFS_DIFLAG2_REFLINK) ^
+	    (tip->i_diflags2 & XFS_DIFLAG2_REFLINK)) {
+		uint64_t	f;
+
+		f = ip->i_diflags2 & XFS_DIFLAG2_REFLINK;
+		ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+		ip->i_diflags2 |= tip->i_diflags2 & XFS_DIFLAG2_REFLINK;
+		tip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+		tip->i_diflags2 |= f & XFS_DIFLAG2_REFLINK;
+	}
+
+	/* Swap the cow forks. */
+	if (xfs_has_reflink(ip->i_mount)) {
+		ASSERT(!ip->i_cowfp ||
+		       ip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
+		ASSERT(!tip->i_cowfp ||
+		       tip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
+
+		swap(ip->i_cowfp, tip->i_cowfp);
+
+		if (ip->i_cowfp && ip->i_cowfp->if_bytes)
+			xfs_inode_set_cowblocks_tag(ip);
+		else
+			xfs_inode_clear_cowblocks_tag(ip);
+		if (tip->i_cowfp && tip->i_cowfp->if_bytes)
+			xfs_inode_set_cowblocks_tag(tip);
+		else
+			xfs_inode_clear_cowblocks_tag(tip);
+	}
+
+	xfs_trans_log_inode(*tpp, ip,  src_log_flags);
+	xfs_trans_log_inode(*tpp, tip, target_log_flags);
+
+	/*
+	 * The extent forks have been swapped, but crc=1,rmapbt=0 filesystems
+	 * have inode number owner values in the bmbt blocks that still refer to
+	 * the old inode. Scan each bmbt to fix up the owner values with the
+	 * inode number of the current inode.
+	 */
+	if (src_log_flags & XFS_ILOG_DOWNER) {
+		error = xfs_swap_change_owner(tpp, ip, tip);
+		if (error)
+			return error;
+	}
+	if (target_log_flags & XFS_ILOG_DOWNER) {
+		error = xfs_swap_change_owner(tpp, tip, ip);
+		if (error)
+			return error;
+	}
+
 	return 0;
 }
 
-/*
- * Fix up the owners of the bmbt blocks to refer to the current inode. The
- * change owner scan attempts to order all modified buffers in the current
- * transaction. In the event of ordered buffer failure, the offending buffer is
- * physically logged as a fallback and the scan returns -EAGAIN. We must roll
- * the transaction in this case to replenish the fallback log reservation and
- * restart the scan. This process repeats until the scan completes.
- */
-static int
-xfs_swap_change_owner(
-	struct xfs_trans	**tpp,
-	struct xfs_inode	*ip,
-	struct xfs_inode	*tmpip)
-{
-	int			error;
-	struct xfs_trans	*tp = *tpp;
-
-	do {
-		error = xfs_bmbt_change_owner(tp, ip, XFS_DATA_FORK, ip->i_ino,
-					      NULL);
-		/* success or fatal error */
-		if (error != -EAGAIN)
-			break;
-
-		error = xfs_trans_roll(tpp);
-		if (error)
-			break;
-		tp = *tpp;
-
-		/*
-		 * Redirty both inodes so they can relog and keep the log tail
-		 * moving forward.
-		 */
-		xfs_trans_ijoin(tp, ip, 0);
-		xfs_trans_ijoin(tp, tmpip, 0);
-		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-		xfs_trans_log_inode(tp, tmpip, XFS_ILOG_CORE);
-	} while (true);
-
-	return error;
-}
-
 int
 xfs_swap_extents(
 	struct xfs_inode	*ip,	/* target inode */
@@ -1508,9 +1559,7 @@ xfs_swap_extents(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	struct xfs_bstat	*sbp = &sxp->sx_stat;
-	int			src_log_flags, target_log_flags;
 	int			error = 0;
-	uint64_t		f;
 	int			resblks = 0;
 	unsigned int		flags = 0;
 
@@ -1640,9 +1689,6 @@ xfs_swap_extents(
 	 * recovery is going to see the fork as owned by the swapped inode,
 	 * not the pre-swapped inodes.
 	 */
-	src_log_flags = XFS_ILOG_CORE;
-	target_log_flags = XFS_ILOG_CORE;
-
 	if (xfs_has_rmapbt(mp)) {
 		struct xfs_swapext_req	req = {
 			.ip1		= tip,
@@ -1655,62 +1701,12 @@ xfs_swap_extents(
 		xfs_swapext(tp, &req);
 		error = xfs_defer_finish(&tp);
 	} else
-		error = xfs_swap_extent_forks(tp, ip, tip, &src_log_flags,
-				&target_log_flags);
+		error = xfs_swap_extent_forks(&tp, ip, tip);
 	if (error) {
 		trace_xfs_swap_extent_error(ip, error, _THIS_IP_);
 		goto out_trans_cancel;
 	}
 
-	/* Do we have to swap reflink flags? */
-	if ((ip->i_diflags2 & XFS_DIFLAG2_REFLINK) ^
-	    (tip->i_diflags2 & XFS_DIFLAG2_REFLINK)) {
-		f = ip->i_diflags2 & XFS_DIFLAG2_REFLINK;
-		ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
-		ip->i_diflags2 |= tip->i_diflags2 & XFS_DIFLAG2_REFLINK;
-		tip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
-		tip->i_diflags2 |= f & XFS_DIFLAG2_REFLINK;
-	}
-
-	/* Swap the cow forks. */
-	if (xfs_has_reflink(mp)) {
-		ASSERT(!ip->i_cowfp ||
-		       ip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
-		ASSERT(!tip->i_cowfp ||
-		       tip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
-
-		swap(ip->i_cowfp, tip->i_cowfp);
-
-		if (ip->i_cowfp && ip->i_cowfp->if_bytes)
-			xfs_inode_set_cowblocks_tag(ip);
-		else
-			xfs_inode_clear_cowblocks_tag(ip);
-		if (tip->i_cowfp && tip->i_cowfp->if_bytes)
-			xfs_inode_set_cowblocks_tag(tip);
-		else
-			xfs_inode_clear_cowblocks_tag(tip);
-	}
-
-	xfs_trans_log_inode(tp, ip,  src_log_flags);
-	xfs_trans_log_inode(tp, tip, target_log_flags);
-
-	/*
-	 * The extent forks have been swapped, but crc=1,rmapbt=0 filesystems
-	 * have inode number owner values in the bmbt blocks that still refer to
-	 * the old inode. Scan each bmbt to fix up the owner values with the
-	 * inode number of the current inode.
-	 */
-	if (src_log_flags & XFS_ILOG_DOWNER) {
-		error = xfs_swap_change_owner(&tp, ip, tip);
-		if (error)
-			goto out_trans_cancel;
-	}
-	if (target_log_flags & XFS_ILOG_DOWNER) {
-		error = xfs_swap_change_owner(&tp, tip, ip);
-		if (error)
-			goto out_trans_cancel;
-	}
-
 	/*
 	 * If this is a synchronous mount, make sure that the
 	 * transaction goes to disk before returning to the user.


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 17/25] xfs: port xfs_swap_extent_forks to use xfs_swapext_req
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (15 preceding siblings ...)
  2023-05-26  1:18   ` [PATCH 16/25] xfs: consolidate all of the xfs_swap_extent_forks code Darrick J. Wong
@ 2023-05-26  1:19   ` Darrick J. Wong
  2023-05-26  1:26   ` [PATCH 18/25] xfs: allow xfs_swap_range to use older extent swap algorithms Darrick J. Wong
                     ` (7 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:19 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Port the old extent fork swapping function to take an xfs_swapext_req as
input, which aligns it with the new fiexchange interface.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |   21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 9007466e56e6..52f799b72021 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1406,9 +1406,10 @@ xfs_swap_change_owner(
 STATIC int
 xfs_swap_extent_forks(
 	struct xfs_trans	**tpp,
-	struct xfs_inode	*ip,
-	struct xfs_inode	*tip)
+	struct xfs_swapext_req	*req)
 {
+	struct xfs_inode	*ip = req->ip2;
+	struct xfs_inode	*tip = req->ip1;
 	xfs_filblks_t		aforkblks = 0;
 	xfs_filblks_t		taforkblks = 0;
 	xfs_extnum_t		junk;
@@ -1556,6 +1557,11 @@ xfs_swap_extents(
 	struct xfs_inode	*tip,	/* tmp inode */
 	struct xfs_swapext	*sxp)
 {
+	struct xfs_swapext_req	req = {
+		.ip1		= tip,
+		.ip2		= ip,
+		.whichfork	= XFS_DATA_FORK,
+	};
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	struct xfs_bstat	*sbp = &sxp->sx_stat;
@@ -1689,19 +1695,12 @@ xfs_swap_extents(
 	 * recovery is going to see the fork as owned by the swapped inode,
 	 * not the pre-swapped inodes.
 	 */
+	req.blockcount = XFS_B_TO_FSB(ip->i_mount, i_size_read(VFS_I(ip)));
 	if (xfs_has_rmapbt(mp)) {
-		struct xfs_swapext_req	req = {
-			.ip1		= tip,
-			.ip2		= ip,
-			.whichfork	= XFS_DATA_FORK,
-			.blockcount	= XFS_B_TO_FSB(ip->i_mount,
-						       i_size_read(VFS_I(ip))),
-		};
-
 		xfs_swapext(tp, &req);
 		error = xfs_defer_finish(&tp);
 	} else
-		error = xfs_swap_extent_forks(&tp, ip, tip);
+		error = xfs_swap_extent_forks(&tp, &req);
 	if (error) {
 		trace_xfs_swap_extent_error(ip, error, _THIS_IP_);
 		goto out_trans_cancel;


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 18/25] xfs: allow xfs_swap_range to use older extent swap algorithms
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (16 preceding siblings ...)
  2023-05-26  1:19   ` [PATCH 17/25] xfs: port xfs_swap_extent_forks to use xfs_swapext_req Darrick J. Wong
@ 2023-05-26  1:26   ` Darrick J. Wong
  2023-05-26  1:26   ` [PATCH 19/25] xfs: remove old swap extents implementation Darrick J. Wong
                     ` (6 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:26 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

If userspace permits non-atomic swap operations, use the older code
paths to implement the same functionality.
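
Condensed from the xfs_xchg_range() hunk below, the strategy selection this
patch introduces amounts to the following (flag and helper names as in the
diff; this is only the decision logic gathered in one place):

	if ((xchg_flags & XFS_XCHG_RANGE_LOGGED) ||
	    xfs_swapext_supported(mp))
		strategy = SWAPEXT;	/* intent-based / log-assisted swap */
	else if (xfs_xchg_use_forkswap(fxr, ip1, ip2))
		strategy = FORKSWAP;	/* legacy whole-file fork exchange */
	else
		error = -EOPNOTSUPP;	/* cannot exchange the contents */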

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |    4 +-
 fs/xfs/xfs_bmap_util.h |    4 ++
 fs/xfs/xfs_xchgrange.c |   96 +++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 92 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 52f799b72021..0795c1a64af1 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1261,7 +1261,7 @@ xfs_insert_file_space(
  * reject and log the attempt. basically we are putting the responsibility on
  * userspace to get this right.
  */
-static int
+int
 xfs_swap_extents_check_format(
 	struct xfs_inode	*ip,	/* target inode */
 	struct xfs_inode	*tip)	/* tmp inode */
@@ -1403,7 +1403,7 @@ xfs_swap_change_owner(
 }
 
 /* Swap the extents of two files by swapping data forks. */
-STATIC int
+int
 xfs_swap_extent_forks(
 	struct xfs_trans	**tpp,
 	struct xfs_swapext_req	*req)
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 6888078f5c31..39c71da08403 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -69,6 +69,10 @@ int	xfs_free_eofblocks(struct xfs_inode *ip);
 int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
 			 struct xfs_swapext *sx);
 
+struct xfs_swapext_req;
+int xfs_swap_extent_forks(struct xfs_trans **tpp, struct xfs_swapext_req *req);
+int xfs_swap_extents_check_format(struct xfs_inode *ip, struct xfs_inode *tip);
+
 xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
 
 xfs_extnum_t xfs_bmap_count_leaves(struct xfs_ifork *ifp, xfs_filblks_t *count);
diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
index c9075e72ab51..91d1ea949cf3 100644
--- a/fs/xfs/xfs_xchgrange.c
+++ b/fs/xfs/xfs_xchgrange.c
@@ -697,6 +697,33 @@ xfs_xchg_range_rele_log_assist(
 	xlog_drop_incompat_feat(mp->m_log, XLOG_INCOMPAT_FEAT_SWAPEXT);
 }
 
+/* Decide if we can use the old data fork exchange code. */
+static inline bool
+xfs_xchg_use_forkswap(
+	const struct xfs_exch_range	*fxr,
+	struct xfs_inode		*ip1,
+	struct xfs_inode		*ip2)
+{
+	if (!(fxr->flags & XFS_EXCH_RANGE_NONATOMIC))
+		return false;
+	if (!(fxr->flags & XFS_EXCH_RANGE_FULL_FILES))
+		return false;
+	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF)
+		return false;
+	if (fxr->file1_offset != 0 || fxr->file2_offset != 0)
+		return false;
+	if (fxr->length != ip1->i_disk_size)
+		return false;
+	if (fxr->length != ip2->i_disk_size)
+		return false;
+	return true;
+}
+
+enum xchg_strategy {
+	SWAPEXT		= 1,	/* xfs_swapext() */
+	FORKSWAP	= 2,	/* exchange forks */
+};
+
 /* Exchange the contents of two files. */
 int
 xfs_xchg_range(
@@ -716,19 +743,13 @@ xfs_xchg_range(
 	};
 	struct xfs_trans		*tp;
 	unsigned int			qretry;
+	unsigned int			flags = 0;
 	bool				retried = false;
+	enum xchg_strategy		strategy;
 	int				error;
 
 	trace_xfs_xchg_range(ip1, fxr, ip2, xchg_flags);
 
-	/*
-	 * This function only supports using log intent items (SXI items if
-	 * atomic exchange is required, or BUI items if not) to exchange file
-	 * data.  The legacy whole-fork swap will be ported in a later patch.
-	 */
-	if (!(xchg_flags & XFS_XCHG_RANGE_LOGGED) && !xfs_swapext_supported(mp))
-		return -EOPNOTSUPP;
-
 	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF)
 		req.req_flags |= XFS_SWAP_REQ_SET_SIZES;
 	if (fxr->flags & XFS_EXCH_RANGE_FILE1_WRITTEN)
@@ -740,10 +761,25 @@ xfs_xchg_range(
 	if (error)
 		return error;
 
+	/*
+	 * We haven't decided which exchange strategy we want to use yet, but
+	 * here we must choose if we want freed blocks during the swap to be
+	 * added to the transaction block reservation (RES_FDBLKS) or freed
+	 * into the global fdblocks.  The legacy fork swap mechanism doesn't
+	 * free any blocks, so it doesn't require it.  It is also the only
+	 * option that works for older filesystems.
+	 *
+	 * The bmap log intent items that were added with rmap and reflink can
+	 * change the bmbt shape, so the intent-based swap strategies require
+	 * us to set RES_FDBLKS.
+	 */
+	if (xfs_has_lazysbcount(mp))
+		flags |= XFS_TRANS_RES_FDBLKS;
+
 retry:
 	/* Allocate the transaction, lock the inodes, and join them. */
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, req.resblks, 0,
-			XFS_TRANS_RES_FDBLKS, &tp);
+			flags, &tp);
 	if (error)
 		return error;
 
@@ -786,6 +822,40 @@ xfs_xchg_range(
 	if (error)
 		goto out_trans_cancel;
 
+	if ((xchg_flags & XFS_XCHG_RANGE_LOGGED) || xfs_swapext_supported(mp)) {
+		/*
+		 * xfs_swapext() uses deferred bmap log intent items to swap
+		 * extents between file forks.  If the atomic log swap feature
+		 * is enabled, it will also use swapext log intent items to
+		 * restart the operation in case of failure.
+		 *
+		 * This means that we can use it if we previously obtained
+		 * permission from the log to use log-assisted atomic extent
+		 * swapping; or if the fs supports rmap or reflink and the
+		 * user said NONATOMIC.
+		 */
+		strategy = SWAPEXT;
+	} else if (xfs_xchg_use_forkswap(fxr, ip1, ip2)) {
+		/*
+		 * Exchange the file contents by using the old bmap fork
+		 * exchange code, if we're a defrag tool doing a full file
+		 * swap.
+		 */
+		strategy = FORKSWAP;
+
+		error = xfs_swap_extents_check_format(ip2, ip1);
+		if (error) {
+			xfs_notice(mp,
+		"%s: inode 0x%llx format is incompatible for exchanging.",
+					__func__, ip2->i_ino);
+			goto out_trans_cancel;
+		}
+	} else {
+		/* We cannot exchange the file contents. */
+		error = -EOPNOTSUPP;
+		goto out_trans_cancel;
+	}
+
 	/* If we got this far on a dry run, all parameters are ok. */
 	if (fxr->flags & XFS_EXCH_RANGE_DRY_RUN)
 		goto out_trans_cancel;
@@ -798,7 +868,13 @@ xfs_xchg_range(
 		xfs_trans_ichgtime(tp, ip2,
 				XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
 
-	xfs_swapext(tp, &req);
+	if (strategy == SWAPEXT) {
+		xfs_swapext(tp, &req);
+	} else {
+		error = xfs_swap_extent_forks(&tp, &req);
+		if (error)
+			goto out_trans_cancel;
+	}
 
 	/*
 	 * Force the log to persist metadata updates if the caller or the


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 19/25] xfs: remove old swap extents implementation
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (17 preceding siblings ...)
  2023-05-26  1:26   ` [PATCH 18/25] xfs: allow xfs_swap_range to use older extent swap algorithms Darrick J. Wong
@ 2023-05-26  1:26   ` Darrick J. Wong
  2023-05-26  1:27   ` [PATCH 20/25] xfs: condense extended attributes after an atomic swap Darrick J. Wong
                     ` (5 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:26 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Migrate the old XFS_IOC_SWAPEXT implementation to use our shiny new one.
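
For context, this is roughly how an xfs_fsr-style defrag tool drives the
legacy ioctl that now gets rerouted through xfs_exch_range().  A sketch only:
it assumes the xfsprogs userspace headers, that the caller has both files
open O_RDWR on the same filesystem, and that it already fetched a fresh
bulkstat of the target; the legacy_swapext() helper name is made up here.

	#include <sys/types.h>
	#include <sys/ioctl.h>
	#include <xfs/xfs.h>	/* XFS_IOC_SWAPEXT, struct xfs_swapext */

	static int legacy_swapext(int target_fd, int tmp_fd,
				  const struct xfs_bstat *target_stat,
				  off_t length)
	{
		struct xfs_swapext	sx = {
			.sx_version	= XFS_SX_VERSION,
			.sx_fdtarget	= target_fd,
			.sx_fdtmp	= tmp_fd,
			.sx_offset	= 0,	/* must cover the whole file */
			.sx_length	= length,
			.sx_stat	= *target_stat,	/* m/ctime freshness check */
		};

		/* With this patch a partial-file request comes back as EFAULT
		 * (translated from EDOM), matching the old implementation. */
		return ioctl(target_fd, XFS_IOC_SWAPEXT, &sx);
	}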

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_bmap_util.c |  491 ------------------------------------------------
 fs/xfs/xfs_bmap_util.h |    7 -
 fs/xfs/xfs_ioctl.c     |  102 +++-------
 fs/xfs/xfs_ioctl.h     |    4 
 fs/xfs/xfs_ioctl32.c   |   11 -
 fs/xfs/xfs_xchgrange.c |  299 +++++++++++++++++++++++++++++
 6 files changed, 334 insertions(+), 580 deletions(-)


diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 0795c1a64af1..eef19e07f581 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1240,494 +1240,3 @@ xfs_insert_file_space(
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
-
-/*
- * We need to check that the format of the data fork in the temporary inode is
- * valid for the target inode before doing the swap. This is not a problem with
- * attr1 because of the fixed fork offset, but attr2 has a dynamically sized
- * data fork depending on the space the attribute fork is taking so we can get
- * invalid formats on the target inode.
- *
- * E.g. target has space for 7 extents in extent format, temp inode only has
- * space for 6.  If we defragment down to 7 extents, then the tmp format is a
- * btree, but when swapped it needs to be in extent format. Hence we can't just
- * blindly swap data forks on attr2 filesystems.
- *
- * Note that we check the swap in both directions so that we don't end up with
- * a corrupt temporary inode, either.
- *
- * Note that fixing the way xfs_fsr sets up the attribute fork in the source
- * inode will prevent this situation from occurring, so all we do here is
- * reject and log the attempt. basically we are putting the responsibility on
- * userspace to get this right.
- */
-int
-xfs_swap_extents_check_format(
-	struct xfs_inode	*ip,	/* target inode */
-	struct xfs_inode	*tip)	/* tmp inode */
-{
-	struct xfs_ifork	*ifp = &ip->i_df;
-	struct xfs_ifork	*tifp = &tip->i_df;
-
-	/* User/group/project quota ids must match if quotas are enforced. */
-	if (XFS_IS_QUOTA_ON(ip->i_mount) &&
-	    (!uid_eq(VFS_I(ip)->i_uid, VFS_I(tip)->i_uid) ||
-	     !gid_eq(VFS_I(ip)->i_gid, VFS_I(tip)->i_gid) ||
-	     ip->i_projid != tip->i_projid))
-		return -EINVAL;
-
-	/* Should never get a local format */
-	if (ifp->if_format == XFS_DINODE_FMT_LOCAL ||
-	    tifp->if_format == XFS_DINODE_FMT_LOCAL)
-		return -EINVAL;
-
-	/*
-	 * if the target inode has less extents that then temporary inode then
-	 * why did userspace call us?
-	 */
-	if (ifp->if_nextents < tifp->if_nextents)
-		return -EINVAL;
-
-	/*
-	 * If we have to use the (expensive) rmap swap method, we can
-	 * handle any number of extents and any format.
-	 */
-	if (xfs_has_rmapbt(ip->i_mount))
-		return 0;
-
-	/*
-	 * if the target inode is in extent form and the temp inode is in btree
-	 * form then we will end up with the target inode in the wrong format
-	 * as we already know there are less extents in the temp inode.
-	 */
-	if (ifp->if_format == XFS_DINODE_FMT_EXTENTS &&
-	    tifp->if_format == XFS_DINODE_FMT_BTREE)
-		return -EINVAL;
-
-	/* Check temp in extent form to max in target */
-	if (tifp->if_format == XFS_DINODE_FMT_EXTENTS &&
-	    tifp->if_nextents > XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
-		return -EINVAL;
-
-	/* Check target in extent form to max in temp */
-	if (ifp->if_format == XFS_DINODE_FMT_EXTENTS &&
-	    ifp->if_nextents > XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
-		return -EINVAL;
-
-	/*
-	 * If we are in a btree format, check that the temp root block will fit
-	 * in the target and that it has enough extents to be in btree format
-	 * in the target.
-	 *
-	 * Note that we have to be careful to allow btree->extent conversions
-	 * (a common defrag case) which will occur when the temp inode is in
-	 * extent format...
-	 */
-	if (tifp->if_format == XFS_DINODE_FMT_BTREE) {
-		if (xfs_inode_has_attr_fork(ip) &&
-		    XFS_BMAP_BMDR_SPACE(tifp->if_broot) > xfs_inode_fork_boff(ip))
-			return -EINVAL;
-		if (tifp->if_nextents <= XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
-			return -EINVAL;
-	}
-
-	/* Reciprocal target->temp btree format checks */
-	if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
-		if (xfs_inode_has_attr_fork(tip) &&
-		    XFS_BMAP_BMDR_SPACE(ip->i_df.if_broot) > xfs_inode_fork_boff(tip))
-			return -EINVAL;
-		if (ifp->if_nextents <= XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int
-xfs_swap_extent_flush(
-	struct xfs_inode	*ip)
-{
-	int	error;
-
-	error = filemap_write_and_wait(VFS_I(ip)->i_mapping);
-	if (error)
-		return error;
-	truncate_pagecache_range(VFS_I(ip), 0, -1);
-
-	/* Verify O_DIRECT for ftmp */
-	if (VFS_I(ip)->i_mapping->nrpages)
-		return -EINVAL;
-	return 0;
-}
-
-/*
- * Fix up the owners of the bmbt blocks to refer to the current inode. The
- * change owner scan attempts to order all modified buffers in the current
- * transaction. In the event of ordered buffer failure, the offending buffer is
- * physically logged as a fallback and the scan returns -EAGAIN. We must roll
- * the transaction in this case to replenish the fallback log reservation and
- * restart the scan. This process repeats until the scan completes.
- */
-static int
-xfs_swap_change_owner(
-	struct xfs_trans	**tpp,
-	struct xfs_inode	*ip,
-	struct xfs_inode	*tmpip)
-{
-	int			error;
-	struct xfs_trans	*tp = *tpp;
-
-	do {
-		error = xfs_bmbt_change_owner(tp, ip, XFS_DATA_FORK, ip->i_ino,
-					      NULL);
-		/* success or fatal error */
-		if (error != -EAGAIN)
-			break;
-
-		error = xfs_trans_roll(tpp);
-		if (error)
-			break;
-		tp = *tpp;
-
-		/*
-		 * Redirty both inodes so they can relog and keep the log tail
-		 * moving forward.
-		 */
-		xfs_trans_ijoin(tp, ip, 0);
-		xfs_trans_ijoin(tp, tmpip, 0);
-		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-		xfs_trans_log_inode(tp, tmpip, XFS_ILOG_CORE);
-	} while (true);
-
-	return error;
-}
-
-/* Swap the extents of two files by swapping data forks. */
-int
-xfs_swap_extent_forks(
-	struct xfs_trans	**tpp,
-	struct xfs_swapext_req	*req)
-{
-	struct xfs_inode	*ip = req->ip2;
-	struct xfs_inode	*tip = req->ip1;
-	xfs_filblks_t		aforkblks = 0;
-	xfs_filblks_t		taforkblks = 0;
-	xfs_extnum_t		junk;
-	uint64_t		tmp;
-	int			src_log_flags = XFS_ILOG_CORE;
-	int			target_log_flags = XFS_ILOG_CORE;
-	int			error;
-
-	/*
-	 * Count the number of extended attribute blocks
-	 */
-	if (xfs_inode_has_attr_fork(ip) && ip->i_af.if_nextents > 0 &&
-	    ip->i_af.if_format != XFS_DINODE_FMT_LOCAL) {
-		error = xfs_bmap_count_blocks(*tpp, ip, XFS_ATTR_FORK, &junk,
-				&aforkblks);
-		if (error)
-			return error;
-	}
-	if (xfs_inode_has_attr_fork(tip) && tip->i_af.if_nextents > 0 &&
-	    tip->i_af.if_format != XFS_DINODE_FMT_LOCAL) {
-		error = xfs_bmap_count_blocks(*tpp, tip, XFS_ATTR_FORK, &junk,
-				&taforkblks);
-		if (error)
-			return error;
-	}
-
-	/*
-	 * Btree format (v3) inodes have the inode number stamped in the bmbt
-	 * block headers. We can't start changing the bmbt blocks until the
-	 * inode owner change is logged so recovery does the right thing in the
-	 * event of a crash. Set the owner change log flags now and leave the
-	 * bmbt scan as the last step.
-	 */
-	if (xfs_has_v3inodes(ip->i_mount)) {
-		if (ip->i_df.if_format == XFS_DINODE_FMT_BTREE)
-			target_log_flags |= XFS_ILOG_DOWNER;
-		if (tip->i_df.if_format == XFS_DINODE_FMT_BTREE)
-			src_log_flags |= XFS_ILOG_DOWNER;
-	}
-
-	/*
-	 * Swap the data forks of the inodes
-	 */
-	swap(ip->i_df, tip->i_df);
-
-	/*
-	 * Fix the on-disk inode values
-	 */
-	tmp = (uint64_t)ip->i_nblocks;
-	ip->i_nblocks = tip->i_nblocks - taforkblks + aforkblks;
-	tip->i_nblocks = tmp + taforkblks - aforkblks;
-
-	/*
-	 * The extents in the source inode could still contain speculative
-	 * preallocation beyond EOF (e.g. the file is open but not modified
-	 * while defrag is in progress). In that case, we need to copy over the
-	 * number of delalloc blocks the data fork in the source inode is
-	 * tracking beyond EOF so that when the fork is truncated away when the
-	 * temporary inode is unlinked we don't underrun the i_delayed_blks
-	 * counter on that inode.
-	 */
-	ASSERT(tip->i_delayed_blks == 0);
-	tip->i_delayed_blks = ip->i_delayed_blks;
-	ip->i_delayed_blks = 0;
-
-	switch (ip->i_df.if_format) {
-	case XFS_DINODE_FMT_EXTENTS:
-		src_log_flags |= XFS_ILOG_DEXT;
-		break;
-	case XFS_DINODE_FMT_BTREE:
-		ASSERT(!xfs_has_v3inodes(ip->i_mount) ||
-		       (src_log_flags & XFS_ILOG_DOWNER));
-		src_log_flags |= XFS_ILOG_DBROOT;
-		break;
-	}
-
-	switch (tip->i_df.if_format) {
-	case XFS_DINODE_FMT_EXTENTS:
-		target_log_flags |= XFS_ILOG_DEXT;
-		break;
-	case XFS_DINODE_FMT_BTREE:
-		target_log_flags |= XFS_ILOG_DBROOT;
-		ASSERT(!xfs_has_v3inodes(ip->i_mount) ||
-		       (target_log_flags & XFS_ILOG_DOWNER));
-		break;
-	}
-
-	/* Do we have to swap reflink flags? */
-	if ((ip->i_diflags2 & XFS_DIFLAG2_REFLINK) ^
-	    (tip->i_diflags2 & XFS_DIFLAG2_REFLINK)) {
-		uint64_t	f;
-
-		f = ip->i_diflags2 & XFS_DIFLAG2_REFLINK;
-		ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
-		ip->i_diflags2 |= tip->i_diflags2 & XFS_DIFLAG2_REFLINK;
-		tip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
-		tip->i_diflags2 |= f & XFS_DIFLAG2_REFLINK;
-	}
-
-	/* Swap the cow forks. */
-	if (xfs_has_reflink(ip->i_mount)) {
-		ASSERT(!ip->i_cowfp ||
-		       ip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
-		ASSERT(!tip->i_cowfp ||
-		       tip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
-
-		swap(ip->i_cowfp, tip->i_cowfp);
-
-		if (ip->i_cowfp && ip->i_cowfp->if_bytes)
-			xfs_inode_set_cowblocks_tag(ip);
-		else
-			xfs_inode_clear_cowblocks_tag(ip);
-		if (tip->i_cowfp && tip->i_cowfp->if_bytes)
-			xfs_inode_set_cowblocks_tag(tip);
-		else
-			xfs_inode_clear_cowblocks_tag(tip);
-	}
-
-	xfs_trans_log_inode(*tpp, ip,  src_log_flags);
-	xfs_trans_log_inode(*tpp, tip, target_log_flags);
-
-	/*
-	 * The extent forks have been swapped, but crc=1,rmapbt=0 filesystems
-	 * have inode number owner values in the bmbt blocks that still refer to
-	 * the old inode. Scan each bmbt to fix up the owner values with the
-	 * inode number of the current inode.
-	 */
-	if (src_log_flags & XFS_ILOG_DOWNER) {
-		error = xfs_swap_change_owner(tpp, ip, tip);
-		if (error)
-			return error;
-	}
-	if (target_log_flags & XFS_ILOG_DOWNER) {
-		error = xfs_swap_change_owner(tpp, tip, ip);
-		if (error)
-			return error;
-	}
-
-	return 0;
-}
-
-int
-xfs_swap_extents(
-	struct xfs_inode	*ip,	/* target inode */
-	struct xfs_inode	*tip,	/* tmp inode */
-	struct xfs_swapext	*sxp)
-{
-	struct xfs_swapext_req	req = {
-		.ip1		= tip,
-		.ip2		= ip,
-		.whichfork	= XFS_DATA_FORK,
-	};
-	struct xfs_mount	*mp = ip->i_mount;
-	struct xfs_trans	*tp;
-	struct xfs_bstat	*sbp = &sxp->sx_stat;
-	int			error = 0;
-	int			resblks = 0;
-	unsigned int		flags = 0;
-
-	/*
-	 * Lock the inodes against other IO, page faults and truncate to
-	 * begin with.  Then we can ensure the inodes are flushed and have no
-	 * page cache safely. Once we have done this we can take the ilocks and
-	 * do the rest of the checks.
-	 */
-	lock_two_nondirectories(VFS_I(ip), VFS_I(tip));
-	filemap_invalidate_lock_two(VFS_I(ip)->i_mapping,
-				    VFS_I(tip)->i_mapping);
-
-	/* Verify that both files have the same format */
-	if ((VFS_I(ip)->i_mode & S_IFMT) != (VFS_I(tip)->i_mode & S_IFMT)) {
-		error = -EINVAL;
-		goto out_unlock;
-	}
-
-	/* Verify both files are either real-time or non-realtime */
-	if (XFS_IS_REALTIME_INODE(ip) != XFS_IS_REALTIME_INODE(tip)) {
-		error = -EINVAL;
-		goto out_unlock;
-	}
-
-	error = xfs_qm_dqattach(ip);
-	if (error)
-		goto out_unlock;
-
-	error = xfs_qm_dqattach(tip);
-	if (error)
-		goto out_unlock;
-
-	error = xfs_swap_extent_flush(ip);
-	if (error)
-		goto out_unlock;
-	error = xfs_swap_extent_flush(tip);
-	if (error)
-		goto out_unlock;
-
-	if (xfs_inode_has_cow_data(tip)) {
-		error = xfs_reflink_cancel_cow_range(tip, 0, NULLFILEOFF, true);
-		if (error)
-			goto out_unlock;
-	}
-
-	/*
-	 * Extent "swapping" with rmap requires a permanent reservation and
-	 * a block reservation because it's really just a remap operation
-	 * performed with log redo items!
-	 */
-	if (xfs_has_rmapbt(mp)) {
-		int		w = XFS_DATA_FORK;
-		uint32_t	ipnext = ip->i_df.if_nextents;
-		uint32_t	tipnext	= tip->i_df.if_nextents;
-
-		/*
-		 * Conceptually this shouldn't affect the shape of either bmbt,
-		 * but since we atomically move extents one by one, we reserve
-		 * enough space to rebuild both trees.
-		 */
-		resblks = XFS_SWAP_RMAP_SPACE_RES(mp, ipnext, w);
-		resblks +=  XFS_SWAP_RMAP_SPACE_RES(mp, tipnext, w);
-
-		/*
-		 * If either inode straddles a bmapbt block allocation boundary,
-		 * the rmapbt algorithm triggers repeated allocs and frees as
-		 * extents are remapped. This can exhaust the block reservation
-		 * prematurely and cause shutdown. Return freed blocks to the
-		 * transaction reservation to counter this behavior.
-		 */
-		flags |= XFS_TRANS_RES_FDBLKS;
-	}
-	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, flags,
-				&tp);
-	if (error)
-		goto out_unlock;
-
-	/*
-	 * Lock and join the inodes to the tansaction so that transaction commit
-	 * or cancel will unlock the inodes from this point onwards.
-	 */
-	xfs_lock_two_inodes(ip, XFS_ILOCK_EXCL, tip, XFS_ILOCK_EXCL);
-	xfs_trans_ijoin(tp, ip, 0);
-	xfs_trans_ijoin(tp, tip, 0);
-
-
-	/* Verify all data are being swapped */
-	if (sxp->sx_offset != 0 ||
-	    sxp->sx_length != ip->i_disk_size ||
-	    sxp->sx_length != tip->i_disk_size) {
-		error = -EFAULT;
-		goto out_trans_cancel;
-	}
-
-	trace_xfs_swap_extent_before(ip, 0);
-	trace_xfs_swap_extent_before(tip, 1);
-
-	/* check inode formats now that data is flushed */
-	error = xfs_swap_extents_check_format(ip, tip);
-	if (error) {
-		xfs_notice(mp,
-		    "%s: inode 0x%llx format is incompatible for exchanging.",
-				__func__, ip->i_ino);
-		goto out_trans_cancel;
-	}
-
-	/*
-	 * Compare the current change & modify times with that
-	 * passed in.  If they differ, we abort this swap.
-	 * This is the mechanism used to ensure the calling
-	 * process that the file was not changed out from
-	 * under it.
-	 */
-	if ((sbp->bs_ctime.tv_sec != VFS_I(ip)->i_ctime.tv_sec) ||
-	    (sbp->bs_ctime.tv_nsec != VFS_I(ip)->i_ctime.tv_nsec) ||
-	    (sbp->bs_mtime.tv_sec != VFS_I(ip)->i_mtime.tv_sec) ||
-	    (sbp->bs_mtime.tv_nsec != VFS_I(ip)->i_mtime.tv_nsec)) {
-		error = -EBUSY;
-		goto out_trans_cancel;
-	}
-
-	/*
-	 * Note the trickiness in setting the log flags - we set the owner log
-	 * flag on the opposite inode (i.e. the inode we are setting the new
-	 * owner to be) because once we swap the forks and log that, log
-	 * recovery is going to see the fork as owned by the swapped inode,
-	 * not the pre-swapped inodes.
-	 */
-	req.blockcount = XFS_B_TO_FSB(ip->i_mount, i_size_read(VFS_I(ip)));
-	if (xfs_has_rmapbt(mp)) {
-		xfs_swapext(tp, &req);
-		error = xfs_defer_finish(&tp);
-	} else
-		error = xfs_swap_extent_forks(&tp, &req);
-	if (error) {
-		trace_xfs_swap_extent_error(ip, error, _THIS_IP_);
-		goto out_trans_cancel;
-	}
-
-	/*
-	 * If this is a synchronous mount, make sure that the
-	 * transaction goes to disk before returning to the user.
-	 */
-	if (xfs_has_wsync(mp))
-		xfs_trans_set_sync(tp);
-
-	error = xfs_trans_commit(tp);
-
-	trace_xfs_swap_extent_after(ip, 0);
-	trace_xfs_swap_extent_after(tip, 1);
-
-out_unlock_ilock:
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-	xfs_iunlock(tip, XFS_ILOCK_EXCL);
-out_unlock:
-	filemap_invalidate_unlock_two(VFS_I(ip)->i_mapping,
-				      VFS_I(tip)->i_mapping);
-	unlock_two_nondirectories(VFS_I(ip), VFS_I(tip));
-	return error;
-
-out_trans_cancel:
-	xfs_trans_cancel(tp);
-	goto out_unlock_ilock;
-}
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 39c71da08403..8eb7166aa9d4 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -66,13 +66,6 @@ int	xfs_insert_file_space(struct xfs_inode *, xfs_off_t offset,
 bool	xfs_can_free_eofblocks(struct xfs_inode *ip, bool force);
 int	xfs_free_eofblocks(struct xfs_inode *ip);
 
-int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
-			 struct xfs_swapext *sx);
-
-struct xfs_swapext_req;
-int xfs_swap_extent_forks(struct xfs_trans **tpp, struct xfs_swapext_req *req);
-int xfs_swap_extents_check_format(struct xfs_inode *ip, struct xfs_inode *tip);
-
 xfs_daddr_t xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb);
 
 xfs_extnum_t xfs_bmap_count_leaves(struct xfs_ifork *ifp, xfs_filblks_t *count);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 6be87b3d56df..84e51745e2fd 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1657,81 +1657,43 @@ xfs_ioc_scrub_metadata(
 
 int
 xfs_ioc_swapext(
-	xfs_swapext_t	*sxp)
+	struct xfs_swapext	*sxp)
 {
-	xfs_inode_t     *ip, *tip;
-	struct fd	f, tmp;
-	int		error = 0;
+	struct xfs_exch_range	fxr = { 0 };
+	struct fd		fd2, fd1;
+	int			error = 0;
 
-	/* Pull information for the target fd */
-	f = fdget((int)sxp->sx_fdtarget);
-	if (!f.file) {
-		error = -EINVAL;
-		goto out;
-	}
-
-	if (!(f.file->f_mode & FMODE_WRITE) ||
-	    !(f.file->f_mode & FMODE_READ) ||
-	    (f.file->f_flags & O_APPEND)) {
-		error = -EBADF;
-		goto out_put_file;
-	}
+	fd2 = fdget((int)sxp->sx_fdtarget);
+	if (!fd2.file)
+		return -EINVAL;
 
-	tmp = fdget((int)sxp->sx_fdtmp);
-	if (!tmp.file) {
+	fd1 = fdget((int)sxp->sx_fdtmp);
+	if (!fd1.file) {
 		error = -EINVAL;
-		goto out_put_file;
+		goto dest_fdput;
 	}
 
-	if (!(tmp.file->f_mode & FMODE_WRITE) ||
-	    !(tmp.file->f_mode & FMODE_READ) ||
-	    (tmp.file->f_flags & O_APPEND)) {
-		error = -EBADF;
-		goto out_put_tmp_file;
-	}
+	fxr.file1_fd = sxp->sx_fdtmp;
+	fxr.length = sxp->sx_length;
+	fxr.flags = XFS_EXCH_RANGE_NONATOMIC | XFS_EXCH_RANGE_FILE2_FRESH |
+		    XFS_EXCH_RANGE_FULL_FILES;
+	fxr.file2_ino = sxp->sx_stat.bs_ino;
+	fxr.file2_mtime = sxp->sx_stat.bs_mtime.tv_sec;
+	fxr.file2_ctime = sxp->sx_stat.bs_ctime.tv_sec;
+	fxr.file2_mtime_nsec = sxp->sx_stat.bs_mtime.tv_nsec;
+	fxr.file2_ctime_nsec = sxp->sx_stat.bs_ctime.tv_nsec;
 
-	if (IS_SWAPFILE(file_inode(f.file)) ||
-	    IS_SWAPFILE(file_inode(tmp.file))) {
-		error = -EINVAL;
-		goto out_put_tmp_file;
-	}
+	error = xfs_exch_range(fd1.file, fd2.file, &fxr);
 
 	/*
-	 * We need to ensure that the fds passed in point to XFS inodes
-	 * before we cast and access them as XFS structures as we have no
-	 * control over what the user passes us here.
+	 * The old implementation returned EFAULT if the swap range was not
+	 * the entirety of both files.
 	 */
-	if (f.file->f_op != &xfs_file_operations ||
-	    tmp.file->f_op != &xfs_file_operations) {
-		error = -EINVAL;
-		goto out_put_tmp_file;
-	}
-
-	ip = XFS_I(file_inode(f.file));
-	tip = XFS_I(file_inode(tmp.file));
-
-	if (ip->i_mount != tip->i_mount) {
-		error = -EINVAL;
-		goto out_put_tmp_file;
-	}
-
-	if (ip->i_ino == tip->i_ino) {
-		error = -EINVAL;
-		goto out_put_tmp_file;
-	}
-
-	if (xfs_is_shutdown(ip->i_mount)) {
-		error = -EIO;
-		goto out_put_tmp_file;
-	}
-
-	error = xfs_swap_extents(ip, tip, sxp);
-
- out_put_tmp_file:
-	fdput(tmp);
- out_put_file:
-	fdput(f);
- out:
+	if (error == -EDOM)
+		error = -EFAULT;
+	fdput(fd1);
+dest_fdput:
+	fdput(fd2);
 	return error;
 }
 
@@ -2016,14 +1978,10 @@ xfs_file_ioctl(
 	case XFS_IOC_SWAPEXT: {
 		struct xfs_swapext	sxp;
 
-		if (copy_from_user(&sxp, arg, sizeof(xfs_swapext_t)))
+		if (copy_from_user(&sxp, arg, sizeof(struct xfs_swapext)))
 			return -EFAULT;
-		error = mnt_want_write_file(filp);
-		if (error)
-			return error;
-		error = xfs_ioc_swapext(&sxp);
-		mnt_drop_write_file(filp);
-		return error;
+
+		return xfs_ioc_swapext(&sxp);
 	}
 
 	case XFS_IOC_FSCOUNTS: {
diff --git a/fs/xfs/xfs_ioctl.h b/fs/xfs/xfs_ioctl.h
index 38be600b5e1e..4e00846990f2 100644
--- a/fs/xfs/xfs_ioctl.h
+++ b/fs/xfs/xfs_ioctl.h
@@ -10,9 +10,7 @@ struct xfs_bstat;
 struct xfs_ibulk;
 struct xfs_inogrp;
 
-int
-xfs_ioc_swapext(
-	xfs_swapext_t	*sxp);
+int xfs_ioc_swapext(struct xfs_swapext *sxp);
 
 extern int
 xfs_find_handle(
diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
index ee35eea1ecce..a118d2085490 100644
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -425,7 +425,6 @@ xfs_file_compat_ioctl(
 	struct inode		*inode = file_inode(filp);
 	struct xfs_inode	*ip = XFS_I(inode);
 	void			__user *arg = compat_ptr(p);
-	int			error;
 
 	trace_xfs_file_compat_ioctl(ip);
 
@@ -435,6 +434,7 @@ xfs_file_compat_ioctl(
 		return xfs_compat_ioc_fsgeometry_v1(ip->i_mount, arg);
 	case XFS_IOC_FSGROWFSDATA_32: {
 		struct xfs_growfs_data	in;
+		int			error;
 
 		if (xfs_compat_growfs_data_copyin(&in, arg))
 			return -EFAULT;
@@ -447,6 +447,7 @@ xfs_file_compat_ioctl(
 	}
 	case XFS_IOC_FSGROWFSRT_32: {
 		struct xfs_growfs_rt	in;
+		int			error;
 
 		if (xfs_compat_growfs_rt_copyin(&in, arg))
 			return -EFAULT;
@@ -471,12 +472,8 @@ xfs_file_compat_ioctl(
 				   offsetof(struct xfs_swapext, sx_stat)) ||
 		    xfs_ioctl32_bstat_copyin(&sxp.sx_stat, &sxu->sx_stat))
 			return -EFAULT;
-		error = mnt_want_write_file(filp);
-		if (error)
-			return error;
-		error = xfs_ioc_swapext(&sxp);
-		mnt_drop_write_file(filp);
-		return error;
+
+		return xfs_ioc_swapext(&sxp);
 	}
 	case XFS_IOC_FSBULKSTAT_32:
 	case XFS_IOC_FSBULKSTAT_SINGLE_32:
diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
index 91d1ea949cf3..619cf9c0e67d 100644
--- a/fs/xfs/xfs_xchgrange.c
+++ b/fs/xfs/xfs_xchgrange.c
@@ -2,6 +2,11 @@
 /*
  * Copyright (C) 2020-2023 Oracle.  All Rights Reserved.
  * Author: Darrick J. Wong <djwong@kernel.org>
+ *
+ * The xfs_swap_extent_* functions are:
+ * Copyright (c) 2000-2006 Silicon Graphics, Inc.
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
  */
 #include "xfs.h"
 #include "xfs_shared.h"
@@ -14,6 +19,7 @@
 #include "xfs_trans.h"
 #include "xfs_quota.h"
 #include "xfs_bmap_util.h"
+#include "xfs_bmap_btree.h"
 #include "xfs_reflink.h"
 #include "xfs_trace.h"
 #include "xfs_swapext.h"
@@ -471,6 +477,299 @@ xfs_xchg_range_estimate(
 	return error;
 }
 
+/*
+ * We need to check that the format of the data fork in the temporary inode is
+ * valid for the target inode before doing the swap. This is not a problem with
+ * attr1 because of the fixed fork offset, but attr2 has a dynamically sized
+ * data fork depending on the space the attribute fork is taking so we can get
+ * invalid formats on the target inode.
+ *
+ * E.g. target has space for 7 extents in extent format, temp inode only has
+ * space for 6.  If we defragment down to 7 extents, then the tmp format is a
+ * btree, but when swapped it needs to be in extent format. Hence we can't just
+ * blindly swap data forks on attr2 filesystems.
+ *
+ * Note that we check the swap in both directions so that we don't end up with
+ * a corrupt temporary inode, either.
+ *
+ * Note that fixing the way xfs_fsr sets up the attribute fork in the source
+ * inode will prevent this situation from occurring, so all we do here is
+ * reject and log the attempt. basically we are putting the responsibility on
+ * userspace to get this right.
+ */
+STATIC int
+xfs_swap_extents_check_format(
+	struct xfs_inode	*ip,	/* target inode */
+	struct xfs_inode	*tip)	/* tmp inode */
+{
+	struct xfs_ifork	*ifp = &ip->i_df;
+	struct xfs_ifork	*tifp = &tip->i_df;
+
+	/* User/group/project quota ids must match if quotas are enforced. */
+	if (XFS_IS_QUOTA_ON(ip->i_mount) &&
+	    (!uid_eq(VFS_I(ip)->i_uid, VFS_I(tip)->i_uid) ||
+	     !gid_eq(VFS_I(ip)->i_gid, VFS_I(tip)->i_gid) ||
+	     ip->i_projid != tip->i_projid))
+		return -EINVAL;
+
+	/* Should never get a local format */
+	if (ifp->if_format == XFS_DINODE_FMT_LOCAL ||
+	    tifp->if_format == XFS_DINODE_FMT_LOCAL)
+		return -EINVAL;
+
+	/*
+	 * if the target inode has less extents that then temporary inode then
+	 * why did userspace call us?
+	 */
+	if (ifp->if_nextents < tifp->if_nextents)
+		return -EINVAL;
+
+	/*
+	 * If we have to use the (expensive) rmap swap method, we can
+	 * handle any number of extents and any format.
+	 */
+	if (xfs_has_rmapbt(ip->i_mount))
+		return 0;
+
+	/*
+	 * if the target inode is in extent form and the temp inode is in btree
+	 * form then we will end up with the target inode in the wrong format
+	 * as we already know there are less extents in the temp inode.
+	 */
+	if (ifp->if_format == XFS_DINODE_FMT_EXTENTS &&
+	    tifp->if_format == XFS_DINODE_FMT_BTREE)
+		return -EINVAL;
+
+	/* Check temp in extent form to max in target */
+	if (tifp->if_format == XFS_DINODE_FMT_EXTENTS &&
+	    tifp->if_nextents > XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
+		return -EINVAL;
+
+	/* Check target in extent form to max in temp */
+	if (ifp->if_format == XFS_DINODE_FMT_EXTENTS &&
+	    ifp->if_nextents > XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
+		return -EINVAL;
+
+	/*
+	 * If we are in a btree format, check that the temp root block will fit
+	 * in the target and that it has enough extents to be in btree format
+	 * in the target.
+	 *
+	 * Note that we have to be careful to allow btree->extent conversions
+	 * (a common defrag case) which will occur when the temp inode is in
+	 * extent format...
+	 */
+	if (tifp->if_format == XFS_DINODE_FMT_BTREE) {
+		if (xfs_inode_has_attr_fork(ip) &&
+		    XFS_BMAP_BMDR_SPACE(tifp->if_broot) > xfs_inode_fork_boff(ip))
+			return -EINVAL;
+		if (tifp->if_nextents <= XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
+			return -EINVAL;
+	}
+
+	/* Reciprocal target->temp btree format checks */
+	if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
+		if (xfs_inode_has_attr_fork(tip) &&
+		    XFS_BMAP_BMDR_SPACE(ip->i_df.if_broot) > xfs_inode_fork_boff(tip))
+			return -EINVAL;
+		if (ifp->if_nextents <= XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
+ * Fix up the owners of the bmbt blocks to refer to the current inode. The
+ * change owner scan attempts to order all modified buffers in the current
+ * transaction. In the event of ordered buffer failure, the offending buffer is
+ * physically logged as a fallback and the scan returns -EAGAIN. We must roll
+ * the transaction in this case to replenish the fallback log reservation and
+ * restart the scan. This process repeats until the scan completes.
+ */
+static int
+xfs_swap_change_owner(
+	struct xfs_trans	**tpp,
+	struct xfs_inode	*ip,
+	struct xfs_inode	*tmpip)
+{
+	int			error;
+	struct xfs_trans	*tp = *tpp;
+
+	do {
+		error = xfs_bmbt_change_owner(tp, ip, XFS_DATA_FORK, ip->i_ino,
+					      NULL);
+		/* success or fatal error */
+		if (error != -EAGAIN)
+			break;
+
+		error = xfs_trans_roll(tpp);
+		if (error)
+			break;
+		tp = *tpp;
+
+		/*
+		 * Redirty both inodes so they can relog and keep the log tail
+		 * moving forward.
+		 */
+		xfs_trans_ijoin(tp, ip, 0);
+		xfs_trans_ijoin(tp, tmpip, 0);
+		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+		xfs_trans_log_inode(tp, tmpip, XFS_ILOG_CORE);
+	} while (true);
+
+	return error;
+}
+
+/* Swap the extents of two files by swapping data forks. */
+STATIC int
+xfs_swap_extent_forks(
+	struct xfs_trans	**tpp,
+	struct xfs_swapext_req	*req)
+{
+	struct xfs_inode	*ip = req->ip2;
+	struct xfs_inode	*tip = req->ip1;
+	xfs_filblks_t		aforkblks = 0;
+	xfs_filblks_t		taforkblks = 0;
+	xfs_extnum_t		junk;
+	uint64_t		tmp;
+	int			src_log_flags = XFS_ILOG_CORE;
+	int			target_log_flags = XFS_ILOG_CORE;
+	int			error;
+
+	/*
+	 * Count the number of extended attribute blocks
+	 */
+	if (xfs_inode_has_attr_fork(ip) && ip->i_af.if_nextents > 0 &&
+	    ip->i_af.if_format != XFS_DINODE_FMT_LOCAL) {
+		error = xfs_bmap_count_blocks(*tpp, ip, XFS_ATTR_FORK, &junk,
+				&aforkblks);
+		if (error)
+			return error;
+	}
+	if (xfs_inode_has_attr_fork(tip) && tip->i_af.if_nextents > 0 &&
+	    tip->i_af.if_format != XFS_DINODE_FMT_LOCAL) {
+		error = xfs_bmap_count_blocks(*tpp, tip, XFS_ATTR_FORK, &junk,
+				&taforkblks);
+		if (error)
+			return error;
+	}
+
+	/*
+	 * Btree format (v3) inodes have the inode number stamped in the bmbt
+	 * block headers. We can't start changing the bmbt blocks until the
+	 * inode owner change is logged so recovery does the right thing in the
+	 * event of a crash. Set the owner change log flags now and leave the
+	 * bmbt scan as the last step.
+	 */
+	if (xfs_has_v3inodes(ip->i_mount)) {
+		if (ip->i_df.if_format == XFS_DINODE_FMT_BTREE)
+			target_log_flags |= XFS_ILOG_DOWNER;
+		if (tip->i_df.if_format == XFS_DINODE_FMT_BTREE)
+			src_log_flags |= XFS_ILOG_DOWNER;
+	}
+
+	/*
+	 * Swap the data forks of the inodes
+	 */
+	swap(ip->i_df, tip->i_df);
+
+	/*
+	 * Fix the on-disk inode values
+	 */
+	tmp = (uint64_t)ip->i_nblocks;
+	ip->i_nblocks = tip->i_nblocks - taforkblks + aforkblks;
+	tip->i_nblocks = tmp + taforkblks - aforkblks;
+
+	/*
+	 * The extents in the source inode could still contain speculative
+	 * preallocation beyond EOF (e.g. the file is open but not modified
+	 * while defrag is in progress). In that case, we need to copy over the
+	 * number of delalloc blocks the data fork in the source inode is
+	 * tracking beyond EOF so that when the fork is truncated away when the
+	 * temporary inode is unlinked we don't underrun the i_delayed_blks
+	 * counter on that inode.
+	 */
+	ASSERT(tip->i_delayed_blks == 0);
+	tip->i_delayed_blks = ip->i_delayed_blks;
+	ip->i_delayed_blks = 0;
+
+	switch (ip->i_df.if_format) {
+	case XFS_DINODE_FMT_EXTENTS:
+		src_log_flags |= XFS_ILOG_DEXT;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		ASSERT(!xfs_has_v3inodes(ip->i_mount) ||
+		       (src_log_flags & XFS_ILOG_DOWNER));
+		src_log_flags |= XFS_ILOG_DBROOT;
+		break;
+	}
+
+	switch (tip->i_df.if_format) {
+	case XFS_DINODE_FMT_EXTENTS:
+		target_log_flags |= XFS_ILOG_DEXT;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		target_log_flags |= XFS_ILOG_DBROOT;
+		ASSERT(!xfs_has_v3inodes(ip->i_mount) ||
+		       (target_log_flags & XFS_ILOG_DOWNER));
+		break;
+	}
+
+	/* Do we have to swap reflink flags? */
+	if ((ip->i_diflags2 & XFS_DIFLAG2_REFLINK) ^
+	    (tip->i_diflags2 & XFS_DIFLAG2_REFLINK)) {
+		uint64_t	f;
+
+		f = ip->i_diflags2 & XFS_DIFLAG2_REFLINK;
+		ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+		ip->i_diflags2 |= tip->i_diflags2 & XFS_DIFLAG2_REFLINK;
+		tip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+		tip->i_diflags2 |= f & XFS_DIFLAG2_REFLINK;
+	}
+
+	/* Swap the cow forks. */
+	if (xfs_has_reflink(ip->i_mount)) {
+		ASSERT(!ip->i_cowfp ||
+		       ip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
+		ASSERT(!tip->i_cowfp ||
+		       tip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
+
+		swap(ip->i_cowfp, tip->i_cowfp);
+
+		if (ip->i_cowfp && ip->i_cowfp->if_bytes)
+			xfs_inode_set_cowblocks_tag(ip);
+		else
+			xfs_inode_clear_cowblocks_tag(ip);
+		if (tip->i_cowfp && tip->i_cowfp->if_bytes)
+			xfs_inode_set_cowblocks_tag(tip);
+		else
+			xfs_inode_clear_cowblocks_tag(tip);
+	}
+
+	xfs_trans_log_inode(*tpp, ip,  src_log_flags);
+	xfs_trans_log_inode(*tpp, tip, target_log_flags);
+
+	/*
+	 * The extent forks have been swapped, but crc=1,rmapbt=0 filesystems
+	 * have inode number owner values in the bmbt blocks that still refer to
+	 * the old inode. Scan each bmbt to fix up the owner values with the
+	 * inode number of the current inode.
+	 */
+	if (src_log_flags & XFS_ILOG_DOWNER) {
+		error = xfs_swap_change_owner(tpp, ip, tip);
+		if (error)
+			return error;
+	}
+	if (target_log_flags & XFS_ILOG_DOWNER) {
+		error = xfs_swap_change_owner(tpp, tip, ip);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
 /* Prepare two files to have their data exchanged. */
 int
 xfs_xchg_range_prep(


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 20/25] xfs: condense extended attributes after an atomic swap
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (18 preceding siblings ...)
  2023-05-26  1:26   ` [PATCH 19/25] xfs: remove old swap extents implementation Darrick J. Wong
@ 2023-05-26  1:27   ` Darrick J. Wong
  2023-05-26  1:27   ` [PATCH 21/25] xfs: condense directories " Darrick J. Wong
                     ` (4 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Add a new swapext flag that enables us to perform post-swap processing
on file2 once we're done swapping the extent maps.  If we swapped the
extended attributes, we want to be able to convert file2's attr fork from
block to inline format.

This isn't used anywhere right now, but we need to have the basic ondisk
flags in place so that a future online xattr repair feature can create
salvaged attrs in a temporary file and swap the attr forks when ready.
If one file is in extents format and the other is inline, we will have to
promote both to extents format to perform the swap.  After the swap, we
can try to condense the fixed file's attr fork back down to inline
format if possible.
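
For illustration, a future attr repair caller might fill out its swap
request roughly as sketched below.  This is an editorial sketch, not code
from this patch: "temp_ip" and "repair_ip" are hypothetical inode
pointers and the offsets/length are elided, but the struct, field, and
flag names match xfs_swapext.h as modified here.

	struct xfs_swapext_req	req = {
		.ip1		= temp_ip,	/* tempfile with salvaged attrs */
		.ip2		= repair_ip,	/* file being repaired */
		.whichfork	= XFS_ATTR_FORK,
		.req_flags	= XFS_SWAP_REQ_CVT_INO2_SF,
		/* startoff1, startoff2, blockcount elided */
	};

Once the fork swap itself is done, the CVT_INO2_SF flag asks the post-op
work to try xfs_swapext_attr_to_sf() on file2.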

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_log_format.h |    9 +++++--
 fs/xfs/libxfs/xfs_swapext.c    |   51 +++++++++++++++++++++++++++++++++++++++-
 fs/xfs/libxfs/xfs_swapext.h    |    9 +++++--
 3 files changed, 64 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 171f72e41225..c7d02bb04f41 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -909,18 +909,23 @@ struct xfs_swap_extent {
 /* Clear the reflink flag from inode2 after the operation. */
 #define XFS_SWAP_EXT_CLEAR_INO2_REFLINK	(1ULL << 4)
 
+/* Try to convert inode2 from block to short format at the end, if possible. */
+#define XFS_SWAP_EXT_CVT_INO2_SF	(1ULL << 5)
+
 #define XFS_SWAP_EXT_FLAGS		(XFS_SWAP_EXT_ATTR_FORK | \
 					 XFS_SWAP_EXT_SET_SIZES | \
 					 XFS_SWAP_EXT_INO1_WRITTEN | \
 					 XFS_SWAP_EXT_CLEAR_INO1_REFLINK | \
-					 XFS_SWAP_EXT_CLEAR_INO2_REFLINK)
+					 XFS_SWAP_EXT_CLEAR_INO2_REFLINK | \
+					 XFS_SWAP_EXT_CVT_INO2_SF)
 
 #define XFS_SWAP_EXT_STRINGS \
 	{ XFS_SWAP_EXT_ATTR_FORK,		"ATTRFORK" }, \
 	{ XFS_SWAP_EXT_SET_SIZES,		"SETSIZES" }, \
 	{ XFS_SWAP_EXT_INO1_WRITTEN,		"INO1_WRITTEN" }, \
 	{ XFS_SWAP_EXT_CLEAR_INO1_REFLINK,	"CLEAR_INO1_REFLINK" }, \
-	{ XFS_SWAP_EXT_CLEAR_INO2_REFLINK,	"CLEAR_INO2_REFLINK" }
+	{ XFS_SWAP_EXT_CLEAR_INO2_REFLINK,	"CLEAR_INO2_REFLINK" }, \
+	{ XFS_SWAP_EXT_CVT_INO2_SF,		"CVT_INO2_SF" }
 
 /* This is the structure used to lay out an sxi log item in the log. */
 struct xfs_sxi_log_format {
diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c
index 08c5f854edcd..61e66e3d96e3 100644
--- a/fs/xfs/libxfs/xfs_swapext.c
+++ b/fs/xfs/libxfs/xfs_swapext.c
@@ -23,6 +23,10 @@
 #include "xfs_error.h"
 #include "xfs_errortag.h"
 #include "xfs_health.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_attr_leaf.h"
+#include "xfs_attr.h"
 
 struct kmem_cache	*xfs_swapext_intent_cache;
 
@@ -121,7 +125,8 @@ static inline bool
 sxi_has_postop_work(const struct xfs_swapext_intent *sxi)
 {
 	return sxi->sxi_flags & (XFS_SWAP_EXT_CLEAR_INO1_REFLINK |
-				 XFS_SWAP_EXT_CLEAR_INO2_REFLINK);
+				 XFS_SWAP_EXT_CLEAR_INO2_REFLINK |
+				 XFS_SWAP_EXT_CVT_INO2_SF);
 }
 
 static inline void
@@ -369,6 +374,36 @@ xfs_swapext_exchange_mappings(
 	sxi_advance(sxi, irec1);
 }
 
+/* Convert inode2's leaf attr fork back to shortform, if possible. */
+STATIC int
+xfs_swapext_attr_to_sf(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi)
+{
+	struct xfs_da_args	args = {
+		.dp		= sxi->sxi_ip2,
+		.geo		= tp->t_mountp->m_attr_geo,
+		.whichfork	= XFS_ATTR_FORK,
+		.trans		= tp,
+	};
+	struct xfs_buf		*bp;
+	int			forkoff;
+	int			error;
+
+	if (!xfs_attr_is_leaf(sxi->sxi_ip2))
+		return 0;
+
+	error = xfs_attr3_leaf_read(tp, sxi->sxi_ip2, 0, &bp);
+	if (error)
+		return error;
+
+	forkoff = xfs_attr_shortform_allfit(bp, sxi->sxi_ip2);
+	if (forkoff == 0)
+		return 0;
+
+	return xfs_attr3_leaf_to_shortform(bp, &args, forkoff);
+}
+
 static inline void
 xfs_swapext_clear_reflink(
 	struct xfs_trans	*tp,
@@ -386,6 +421,16 @@ xfs_swapext_do_postop_work(
 	struct xfs_trans		*tp,
 	struct xfs_swapext_intent	*sxi)
 {
+	if (sxi->sxi_flags & XFS_SWAP_EXT_CVT_INO2_SF) {
+		int			error = 0;
+
+		if (sxi->sxi_flags & XFS_SWAP_EXT_ATTR_FORK)
+			error = xfs_swapext_attr_to_sf(tp, sxi);
+		sxi->sxi_flags &= ~XFS_SWAP_EXT_CVT_INO2_SF;
+		if (error)
+			return error;
+	}
+
 	if (sxi->sxi_flags & XFS_SWAP_EXT_CLEAR_INO1_REFLINK) {
 		xfs_swapext_clear_reflink(tp, sxi->sxi_ip1);
 		sxi->sxi_flags &= ~XFS_SWAP_EXT_CLEAR_INO1_REFLINK;
@@ -813,6 +858,8 @@ xfs_swapext_init_intent(
 
 	if (req->req_flags & XFS_SWAP_REQ_INO1_WRITTEN)
 		sxi->sxi_flags |= XFS_SWAP_EXT_INO1_WRITTEN;
+	if (req->req_flags & XFS_SWAP_REQ_CVT_INO2_SF)
+		sxi->sxi_flags |= XFS_SWAP_EXT_CVT_INO2_SF;
 
 	if (req->req_flags & XFS_SWAP_REQ_LOGGED)
 		sxi->sxi_op_flags |= XFS_SWAP_EXT_OP_LOGGED;
@@ -1032,6 +1079,8 @@ xfs_swapext(
 	ASSERT(!(req->req_flags & ~XFS_SWAP_REQ_FLAGS));
 	if (req->req_flags & XFS_SWAP_REQ_SET_SIZES)
 		ASSERT(req->whichfork == XFS_DATA_FORK);
+	if (req->req_flags & XFS_SWAP_REQ_CVT_INO2_SF)
+		ASSERT(req->whichfork == XFS_ATTR_FORK);
 
 	if (req->blockcount == 0)
 		return;
diff --git a/fs/xfs/libxfs/xfs_swapext.h b/fs/xfs/libxfs/xfs_swapext.h
index 7aa499537fd8..01e02c5d277f 100644
--- a/fs/xfs/libxfs/xfs_swapext.h
+++ b/fs/xfs/libxfs/xfs_swapext.h
@@ -129,16 +129,21 @@ struct xfs_swapext_req {
 /* Files need to be upgraded to have large extent counts. */
 #define XFS_SWAP_REQ_NREXT64		(1U << 3)
 
+/* Try to convert inode2's fork to local format, if possible. */
+#define XFS_SWAP_REQ_CVT_INO2_SF	(1U << 4)
+
 #define XFS_SWAP_REQ_FLAGS		(XFS_SWAP_REQ_LOGGED | \
 					 XFS_SWAP_REQ_SET_SIZES | \
 					 XFS_SWAP_REQ_INO1_WRITTEN | \
-					 XFS_SWAP_REQ_NREXT64)
+					 XFS_SWAP_REQ_NREXT64 | \
+					 XFS_SWAP_REQ_CVT_INO2_SF)
 
 #define XFS_SWAP_REQ_STRINGS \
 	{ XFS_SWAP_REQ_LOGGED,		"LOGGED" }, \
 	{ XFS_SWAP_REQ_SET_SIZES,	"SETSIZES" }, \
 	{ XFS_SWAP_REQ_INO1_WRITTEN,	"INO1_WRITTEN" }, \
-	{ XFS_SWAP_REQ_NREXT64,		"NREXT64" }
+	{ XFS_SWAP_REQ_NREXT64,		"NREXT64" }, \
+	{ XFS_SWAP_REQ_CVT_INO2_SF,	"CVT_INO2_SF" }
 
 unsigned int xfs_swapext_reflink_prep(const struct xfs_swapext_req *req);
 void xfs_swapext_reflink_finish(struct xfs_trans *tp,


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 21/25] xfs: condense directories after an atomic swap
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (19 preceding siblings ...)
  2023-05-26  1:27   ` [PATCH 20/25] xfs: condense extended attributes after an atomic swap Darrick J. Wong
@ 2023-05-26  1:27   ` Darrick J. Wong
  2023-05-26  1:27   ` [PATCH 22/25] xfs: condense symbolic links " Darrick J. Wong
                     ` (3 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

The previous commit added a new swapext flag that enables us to perform
post-swap processing on file2 once we're done swapping the extent maps.
Now add this ability for directories.

This isn't used anywhere right now, but we need to have the basic ondisk
flags in place so that a future online directory repair feature can
create salvaged dirents in a temporary directory and swap the data forks
when ready.  If one file is in extents format and the other is inline,
we will have to promote both to extents format to perform the swap.
After the swap, we can try to condense the fixed directory down to
inline format if possible.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_swapext.c |   44 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c
index 61e66e3d96e3..dcd356d10947 100644
--- a/fs/xfs/libxfs/xfs_swapext.c
+++ b/fs/xfs/libxfs/xfs_swapext.c
@@ -27,6 +27,8 @@
 #include "xfs_da_btree.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_attr.h"
+#include "xfs_dir2_priv.h"
+#include "xfs_dir2.h"
 
 struct kmem_cache	*xfs_swapext_intent_cache;
 
@@ -404,6 +406,42 @@ xfs_swapext_attr_to_sf(
 	return xfs_attr3_leaf_to_shortform(bp, &args, forkoff);
 }
 
+/* Convert inode2's block dir fork back to shortform, if possible. */
+STATIC int
+xfs_swapext_dir_to_sf(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi)
+{
+	struct xfs_da_args	args = {
+		.dp		= sxi->sxi_ip2,
+		.geo		= tp->t_mountp->m_dir_geo,
+		.whichfork	= XFS_DATA_FORK,
+		.trans		= tp,
+	};
+	struct xfs_dir2_sf_hdr	sfh;
+	struct xfs_buf		*bp;
+	bool			isblock;
+	int			size;
+	int			error;
+
+	error = xfs_dir2_isblock(&args, &isblock);
+	if (error)
+		return error;
+
+	if (!isblock)
+		return 0;
+
+	error = xfs_dir3_block_read(tp, sxi->sxi_ip2, &bp);
+	if (error)
+		return error;
+
+	size = xfs_dir2_block_sfsize(sxi->sxi_ip2, bp->b_addr, &sfh);
+	if (size > xfs_inode_data_fork_size(sxi->sxi_ip2))
+		return 0;
+
+	return xfs_dir2_block_to_sf(&args, bp, size, &sfh);
+}
+
 static inline void
 xfs_swapext_clear_reflink(
 	struct xfs_trans	*tp,
@@ -426,6 +464,8 @@ xfs_swapext_do_postop_work(
 
 		if (sxi->sxi_flags & XFS_SWAP_EXT_ATTR_FORK)
 			error = xfs_swapext_attr_to_sf(tp, sxi);
+		else if (S_ISDIR(VFS_I(sxi->sxi_ip2)->i_mode))
+			error = xfs_swapext_dir_to_sf(tp, sxi);
 		sxi->sxi_flags &= ~XFS_SWAP_EXT_CVT_INO2_SF;
 		if (error)
 			return error;
@@ -1080,7 +1120,9 @@ xfs_swapext(
 	if (req->req_flags & XFS_SWAP_REQ_SET_SIZES)
 		ASSERT(req->whichfork == XFS_DATA_FORK);
 	if (req->req_flags & XFS_SWAP_REQ_CVT_INO2_SF)
-		ASSERT(req->whichfork == XFS_ATTR_FORK);
+		ASSERT(req->whichfork == XFS_ATTR_FORK ||
+		       (req->whichfork == XFS_DATA_FORK &&
+			S_ISDIR(VFS_I(req->ip2)->i_mode)));
 
 	if (req->blockcount == 0)
 		return;


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 22/25] xfs: condense symbolic links after an atomic swap
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (20 preceding siblings ...)
  2023-05-26  1:27   ` [PATCH 21/25] xfs: condense directories " Darrick J. Wong
@ 2023-05-26  1:27   ` Darrick J. Wong
  2023-05-26  1:28   ` [PATCH 23/25] xfs: make atomic extent swapping support realtime files Darrick J. Wong
                     ` (2 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:27 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

The previous commit added a new swapext flag that enables us to perform
post-swap processing on file2 once we're done swapping the extent maps.
Now add this ability for symlinks.

This isn't used anywhere right now, but we need to have the basic ondisk
flags in place so that a future online symlink repair feature can
salvage the remote target in a temporary link and swap the data forks
when ready.  If one file is in extents format and the other is inline,
we will have to promote both to extents format to perform the swap.
After the swap, we can try to condense the fixed symlink down to inline
format if possible.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_swapext.c        |   48 +++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_symlink_remote.c |   47 +++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_symlink_remote.h |    1 +
 fs/xfs/xfs_symlink.c               |   49 ++++--------------------------------
 4 files changed, 101 insertions(+), 44 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c
index dcd356d10947..b72d9c6ae6e2 100644
--- a/fs/xfs/libxfs/xfs_swapext.c
+++ b/fs/xfs/libxfs/xfs_swapext.c
@@ -29,6 +29,7 @@
 #include "xfs_attr.h"
 #include "xfs_dir2_priv.h"
 #include "xfs_dir2.h"
+#include "xfs_symlink_remote.h"
 
 struct kmem_cache	*xfs_swapext_intent_cache;
 
@@ -442,6 +443,48 @@ xfs_swapext_dir_to_sf(
 	return xfs_dir2_block_to_sf(&args, bp, size, &sfh);
 }
 
+/* Convert inode2's remote symlink target back to shortform, if possible. */
+STATIC int
+xfs_swapext_link_to_sf(
+	struct xfs_trans		*tp,
+	struct xfs_swapext_intent	*sxi)
+{
+	struct xfs_inode		*ip = sxi->sxi_ip2;
+	struct xfs_ifork		*ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK);
+	char				*buf;
+	int				error;
+
+	if (ifp->if_format == XFS_DINODE_FMT_LOCAL ||
+	    ip->i_disk_size > xfs_inode_data_fork_size(ip))
+		return 0;
+
+	/* Read the current symlink target into a buffer. */
+	buf = kmem_alloc(ip->i_disk_size + 1, KM_NOFS);
+	if (!buf) {
+		ASSERT(0);
+		return -ENOMEM;
+	}
+
+	error = xfs_symlink_remote_read(ip, buf);
+	if (error)
+		goto free;
+
+	/* Remove the blocks. */
+	error = xfs_symlink_remote_truncate(tp, ip);
+	if (error)
+		goto free;
+
+	/* Convert fork to local format and log our changes. */
+	xfs_idestroy_fork(ifp);
+	ifp->if_bytes = 0;
+	ifp->if_format = XFS_DINODE_FMT_LOCAL;
+	xfs_init_local_fork(ip, XFS_DATA_FORK, buf, ip->i_disk_size);
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_DDATA | XFS_ILOG_CORE);
+free:
+	kmem_free(buf);
+	return error;
+}
+
 static inline void
 xfs_swapext_clear_reflink(
 	struct xfs_trans	*tp,
@@ -466,6 +509,8 @@ xfs_swapext_do_postop_work(
 			error = xfs_swapext_attr_to_sf(tp, sxi);
 		else if (S_ISDIR(VFS_I(sxi->sxi_ip2)->i_mode))
 			error = xfs_swapext_dir_to_sf(tp, sxi);
+		else if (S_ISLNK(VFS_I(sxi->sxi_ip2)->i_mode))
+			error = xfs_swapext_link_to_sf(tp, sxi);
 		sxi->sxi_flags &= ~XFS_SWAP_EXT_CVT_INO2_SF;
 		if (error)
 			return error;
@@ -1122,7 +1167,8 @@ xfs_swapext(
 	if (req->req_flags & XFS_SWAP_REQ_CVT_INO2_SF)
 		ASSERT(req->whichfork == XFS_ATTR_FORK ||
 		       (req->whichfork == XFS_DATA_FORK &&
-			S_ISDIR(VFS_I(req->ip2)->i_mode)));
+			(S_ISDIR(VFS_I(req->ip2)->i_mode) ||
+			 S_ISLNK(VFS_I(req->ip2)->i_mode))));
 
 	if (req->blockcount == 0)
 		return;
diff --git a/fs/xfs/libxfs/xfs_symlink_remote.c b/fs/xfs/libxfs/xfs_symlink_remote.c
index 5261f15ea2ed..b48dcb893a2a 100644
--- a/fs/xfs/libxfs/xfs_symlink_remote.c
+++ b/fs/xfs/libxfs/xfs_symlink_remote.c
@@ -391,3 +391,50 @@ xfs_symlink_write_target(
 	ASSERT(pathlen == 0);
 	return 0;
 }
+
+/* Remove all the blocks from a symlink and invalidate buffers. */
+int
+xfs_symlink_remote_truncate(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip)
+{
+	struct xfs_bmbt_irec	mval[XFS_SYMLINK_MAPS];
+	struct xfs_mount	*mp = tp->t_mountp;
+	struct xfs_buf		*bp;
+	int			nmaps = XFS_SYMLINK_MAPS;
+	int			done = 0;
+	int			i;
+	int			error;
+
+	/* Read mappings and invalidate buffers. */
+	error = xfs_bmapi_read(ip, 0, XFS_MAX_FILEOFF, mval, &nmaps, 0);
+	if (error)
+		return error;
+
+	for (i = 0; i < nmaps; i++) {
+		if (!xfs_bmap_is_real_extent(&mval[i]))
+			break;
+
+		error = xfs_trans_get_buf(tp, mp->m_ddev_targp,
+				XFS_FSB_TO_DADDR(mp, mval[i].br_startblock),
+				XFS_FSB_TO_BB(mp, mval[i].br_blockcount), 0,
+				&bp);
+		if (error)
+			return error;
+
+		xfs_trans_binval(tp, bp);
+	}
+
+	/* Unmap the remote blocks. */
+	error = xfs_bunmapi(tp, ip, 0, XFS_MAX_FILEOFF, 0, nmaps, &done);
+	if (error)
+		return error;
+	if (!done) {
+		ASSERT(done);
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_SYMLINK);
+		return -EFSCORRUPTED;
+	}
+
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+	return 0;
+}
diff --git a/fs/xfs/libxfs/xfs_symlink_remote.h b/fs/xfs/libxfs/xfs_symlink_remote.h
index d81461c06b6b..05eb9c3937d9 100644
--- a/fs/xfs/libxfs/xfs_symlink_remote.h
+++ b/fs/xfs/libxfs/xfs_symlink_remote.h
@@ -23,5 +23,6 @@ int xfs_symlink_remote_read(struct xfs_inode *ip, char *link);
 int xfs_symlink_write_target(struct xfs_trans *tp, struct xfs_inode *ip,
 		const char *target_path, int pathlen, xfs_fsblock_t fs_blocks,
 		uint resblks);
+int xfs_symlink_remote_truncate(struct xfs_trans *tp, struct xfs_inode *ip);
 
 #endif /* __XFS_SYMLINK_REMOTE_H */
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 2ecaebbdb00e..49029b3fa0f8 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -249,19 +249,12 @@ xfs_symlink(
  */
 STATIC int
 xfs_inactive_symlink_rmt(
-	struct xfs_inode *ip)
+	struct xfs_inode	*ip)
 {
-	struct xfs_buf	*bp;
-	int		done;
-	int		error;
-	int		i;
-	xfs_mount_t	*mp;
-	xfs_bmbt_irec_t	mval[XFS_SYMLINK_MAPS];
-	int		nmaps;
-	int		size;
-	xfs_trans_t	*tp;
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_trans	*tp;
+	int			error;
 
-	mp = ip->i_mount;
 	ASSERT(!xfs_need_iread_extents(&ip->i_df));
 	/*
 	 * We're freeing a symlink that has some
@@ -285,44 +278,14 @@ xfs_inactive_symlink_rmt(
 	 * locked for the second transaction.  In the error paths we need it
 	 * held so the cancel won't rele it, see below.
 	 */
-	size = (int)ip->i_disk_size;
 	ip->i_disk_size = 0;
 	VFS_I(ip)->i_mode = (VFS_I(ip)->i_mode & ~S_IFMT) | S_IFREG;
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-	/*
-	 * Find the block(s) so we can inval and unmap them.
-	 */
-	done = 0;
-	nmaps = ARRAY_SIZE(mval);
-	error = xfs_bmapi_read(ip, 0, xfs_symlink_blocks(mp, size),
-				mval, &nmaps, 0);
-	if (error)
-		goto error_trans_cancel;
-	/*
-	 * Invalidate the block(s). No validation is done.
-	 */
-	for (i = 0; i < nmaps; i++) {
-		error = xfs_trans_get_buf(tp, mp->m_ddev_targp,
-				XFS_FSB_TO_DADDR(mp, mval[i].br_startblock),
-				XFS_FSB_TO_BB(mp, mval[i].br_blockcount), 0,
-				&bp);
-		if (error)
-			goto error_trans_cancel;
-		xfs_trans_binval(tp, bp);
-	}
-	/*
-	 * Unmap the dead block(s) to the dfops.
-	 */
-	error = xfs_bunmapi(tp, ip, 0, size, 0, nmaps, &done);
+
+	error = xfs_symlink_remote_truncate(tp, ip);
 	if (error)
 		goto error_trans_cancel;
-	ASSERT(done);
 
-	/*
-	 * Commit the transaction. This first logs the EFI and the inode, then
-	 * rolls and commits the transaction that frees the extents.
-	 */
-	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 	error = xfs_trans_commit(tp);
 	if (error) {
 		ASSERT(xfs_is_shutdown(mp));


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 23/25] xfs: make atomic extent swapping support realtime files
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (21 preceding siblings ...)
  2023-05-26  1:27   ` [PATCH 22/25] xfs: condense symbolic links " Darrick J. Wong
@ 2023-05-26  1:28   ` Darrick J. Wong
  2023-05-26  1:28   ` [PATCH 24/25] xfs: support non-power-of-two rtextsize with exchange-range Darrick J. Wong
  2023-05-26  1:28   ` [PATCH 25/25] xfs: enable atomic swapext feature Darrick J. Wong
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Now that bmap items support the realtime device, we can add the
necessary pieces to the atomic extent swapping code to support such
things.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_swapext.c |  169 +++++++++++++++++++++++++++++++++++++++++--
 fs/xfs/libxfs/xfs_swapext.h |    5 +
 fs/xfs/xfs_bmap_util.c      |    2 -
 fs/xfs/xfs_inode.h          |    5 +
 fs/xfs/xfs_rtalloc.c        |  159 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_rtalloc.h        |    3 +
 fs/xfs/xfs_trace.h          |   11 ++-
 fs/xfs/xfs_xchgrange.c      |   73 ++++++++++++++++++-
 fs/xfs/xfs_xchgrange.h      |    2 -
 9 files changed, 409 insertions(+), 20 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c
index b72d9c6ae6e2..69d08e32df1a 100644
--- a/fs/xfs/libxfs/xfs_swapext.c
+++ b/fs/xfs/libxfs/xfs_swapext.c
@@ -142,6 +142,108 @@ sxi_advance(
 	sxi->sxi_blockcount -= irec->br_blockcount;
 }
 
+#ifdef DEBUG
+static inline bool
+xfs_swapext_need_rt_conversion(
+	const struct xfs_swapext_req	*req)
+{
+	struct xfs_inode		*ip = req->ip2;
+	struct xfs_mount		*mp = ip->i_mount;
+
+	/* xattrs don't live on the rt device */
+	if (req->whichfork == XFS_ATTR_FORK)
+		return false;
+
+	/*
+	 * Caller got permission to use logged swapext, so log recovery will
+	 * finish the swap and not leave us with partially swapped rt extents
+	 * exposed to userspace.
+	 */
+	if (req->req_flags & XFS_SWAP_REQ_LOGGED)
+		return false;
+
+	/*
+	 * If we can't use log intent items at all, the only supported
+	 * operation is full fork swaps.
+	 */
+	if (!xfs_swapext_supported(mp))
+		return false;
+
+	/* Conversion is only needed for realtime files with big rt extents */
+	return xfs_inode_has_bigrtextents(ip);
+}
+
+static inline int
+xfs_swapext_check_rt_extents(
+	struct xfs_mount		*mp,
+	const struct xfs_swapext_req	*req)
+{
+	struct xfs_bmbt_irec		irec1, irec2;
+	xfs_fileoff_t			startoff1 = req->startoff1;
+	xfs_fileoff_t			startoff2 = req->startoff2;
+	xfs_filblks_t			blockcount = req->blockcount;
+	uint32_t			mod;
+	int				nimaps;
+	int				error;
+
+	if (!xfs_swapext_need_rt_conversion(req))
+		return 0;
+
+	while (blockcount > 0) {
+		/* Read extent from the first file */
+		nimaps = 1;
+		error = xfs_bmapi_read(req->ip1, startoff1, blockcount,
+				&irec1, &nimaps, 0);
+		if (error)
+			return error;
+		ASSERT(nimaps == 1);
+
+		/* Read extent from the second file */
+		nimaps = 1;
+		error = xfs_bmapi_read(req->ip2, startoff2,
+				irec1.br_blockcount, &irec2, &nimaps,
+				0);
+		if (error)
+			return error;
+		ASSERT(nimaps == 1);
+
+		/*
+		 * We can only swap as many blocks as the smaller of the two
+		 * extent maps.
+		 */
+		irec1.br_blockcount = min(irec1.br_blockcount,
+					  irec2.br_blockcount);
+
+		/* Both mappings must be aligned to the realtime extent size. */
+		div_u64_rem(irec1.br_startoff, mp->m_sb.sb_rextsize, &mod);
+		if (mod) {
+			ASSERT(mod == 0);
+			return -EINVAL;
+		}
+
+		div_u64_rem(irec2.br_startoff, mp->m_sb.sb_rextsize, &mod);
+		if (mod) {
+			ASSERT(mod == 0);
+			return -EINVAL;
+		}
+
+		div_u64_rem(irec1.br_blockcount, mp->m_sb.sb_rextsize, &mod);
+		if (mod) {
+			ASSERT(mod == 0);
+			return -EINVAL;
+		}
+
+		startoff1 += irec1.br_blockcount;
+		startoff2 += irec1.br_blockcount;
+		blockcount -= irec1.br_blockcount;
+	}
+
+	return 0;
+}
+#else
+# define xfs_swapext_check_rt_extents(mp, req)		(0)
+#endif
+
 /* Check all extents to make sure we can actually swap them. */
 int
 xfs_swapext_check_extents(
@@ -161,12 +263,7 @@ xfs_swapext_check_extents(
 	    ifp2->if_format == XFS_DINODE_FMT_LOCAL)
 		return -EINVAL;
 
-	/* We don't support realtime data forks yet. */
-	if (!XFS_IS_REALTIME_INODE(req->ip1))
-		return 0;
-	if (req->whichfork == XFS_ATTR_FORK)
-		return 0;
-	return -EINVAL;
+	return xfs_swapext_check_rt_extents(mp, req);
 }
 
 #ifdef CONFIG_XFS_QUOTA
@@ -207,6 +304,8 @@ xfs_swapext_can_skip_mapping(
 	struct xfs_swapext_intent	*sxi,
 	struct xfs_bmbt_irec		*irec)
 {
+	struct xfs_mount		*mp = sxi->sxi_ip1->i_mount;
+
 	/* Do not skip this mapping if the caller did not tell us to. */
 	if (!(sxi->sxi_flags & XFS_SWAP_EXT_INO1_WRITTEN))
 		return false;
@@ -219,10 +318,62 @@ xfs_swapext_can_skip_mapping(
 	 * The mapping is unwritten or a hole.  It cannot be a delalloc
 	 * reservation because we already excluded those.  It cannot be an
 	 * unwritten extent with dirty page cache because we flushed the page
-	 * cache.  We don't support realtime files yet, so we needn't (yet)
-	 * deal with them.
+	 * cache.  For files where the allocation unit is 1FSB (files on the
+	 * data dev, rt files if the extent size is 1FSB), we can safely
+	 * skip this mapping.
 	 */
-	return true;
+	if (!xfs_inode_has_bigrtextents(sxi->sxi_ip1))
+		return true;
+
+	/*
+	 * For a realtime file with a multi-fsb allocation unit, the decision
+	 * is trickier because we can only swap full allocation units.
+	 * Unwritten mappings can appear in the middle of an rtx if the rtx is
+	 * partially written, but they can also appear for preallocations.
+	 *
+	 * If the mapping is a hole, skip it entirely.  Holes should align with
+	 * rtx boundaries.
+	 */
+	if (!xfs_bmap_is_real_extent(irec))
+		return true;
+
+	/*
+	 * All mappings below this point are unwritten.
+	 *
+	 * - If the beginning is not aligned to an rtx, trim the end of the
+	 *   mapping so that it does not cross an rtx boundary, and swap it.
+	 *
+	 * - If both ends are aligned to an rtx, skip the entire mapping.
+	 */
+	if (!isaligned_64(irec->br_startoff, mp->m_sb.sb_rextsize)) {
+		xfs_fileoff_t	new_end;
+
+		new_end = roundup_64(irec->br_startoff, mp->m_sb.sb_rextsize);
+		irec->br_blockcount = new_end - irec->br_startoff;
+		return false;
+	}
+	if (isaligned_64(irec->br_blockcount, mp->m_sb.sb_rextsize))
+		return true;
+
+	/*
+	 * All mappings below this point are unwritten, start on an rtx
+	 * boundary, and do not end on an rtx boundary.
+	 *
+	 * - If the mapping is longer than one rtx, trim the end of the mapping
+	 *   down to an rtx boundary and skip it.
+	 *
+	 * - The mapping is shorter than one rtx.  Swap it.
+	 */
+	if (irec->br_blockcount > mp->m_sb.sb_rextsize) {
+		xfs_fileoff_t	new_end;
+
+		new_end = rounddown_64(irec->br_startoff + irec->br_blockcount,
+				mp->m_sb.sb_rextsize);
+		irec->br_blockcount = new_end - irec->br_startoff;
+		return true;
+	}
+
+	return false;
 }
 
 /*
diff --git a/fs/xfs/libxfs/xfs_swapext.h b/fs/xfs/libxfs/xfs_swapext.h
index 01e02c5d277f..ac13b0e4a74e 100644
--- a/fs/xfs/libxfs/xfs_swapext.h
+++ b/fs/xfs/libxfs/xfs_swapext.h
@@ -13,12 +13,11 @@
  * This can be done to individual file extents by using the block mapping log
  * intent items introduced with reflink and rmap; or to entire file ranges
  * using swapext log intent items to track the overall progress across multiple
- * extent mappings.  Realtime is not supported yet.
+ * extent mappings.
  */
 static inline bool xfs_swapext_supported(struct xfs_mount *mp)
 {
-	return (xfs_has_reflink(mp) || xfs_has_rmapbt(mp)) &&
-	       !xfs_has_realtime(mp);
+	return xfs_has_reflink(mp) || xfs_has_rmapbt(mp);
 }
 
 /*
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index eef19e07f581..9782c950f252 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -989,7 +989,7 @@ xfs_free_file_space(
 	endoffset_fsb = XFS_B_TO_FSBT(mp, offset + len);
 
 	/* We can only free complete realtime extents. */
-	if (XFS_IS_REALTIME_INODE(ip) && mp->m_sb.sb_rextsize > 1) {
+	if (xfs_inode_has_bigrtextents(ip)) {
 		startoffset_fsb = roundup_64(startoffset_fsb,
 					     mp->m_sb.sb_rextsize);
 		endoffset_fsb = rounddown_64(endoffset_fsb,
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 1c037455fe47..6c68b900d05d 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -293,6 +293,11 @@ static inline bool xfs_inode_has_large_extent_counts(struct xfs_inode *ip)
 	return ip->i_diflags2 & XFS_DIFLAG2_NREXT64;
 }
 
+static inline bool xfs_inode_has_bigrtextents(struct xfs_inode *ip)
+{
+	return XFS_IS_REALTIME_INODE(ip) && ip->i_mount->m_sb.sb_rextsize > 1;
+}
+
 /*
  * Return the buftarg used for data allocations on a given inode.
  */
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 790191316a32..f1ecc0b4c1bd 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -21,6 +21,7 @@
 #include "xfs_sb.h"
 #include "xfs_log_priv.h"
 #include "xfs_health.h"
+#include "xfs_trace.h"
 
 /*
  * Read and return the summary information for a given extent size,
@@ -1461,3 +1462,161 @@ xfs_rtpick_extent(
 	*pick = b;
 	return 0;
 }
+
+/*
+ * Decide if this is an unwritten extent that isn't aligned to a rt extent
+ * boundary.  If it is, shorten the mapping so that we're ready to convert
+ * everything up to the next rt extent to a zeroed written extent.  If not,
+ * return false.
+ */
+static inline bool
+xfs_rtfile_want_conversion(
+	struct xfs_mount	*mp,
+	struct xfs_bmbt_irec	*irec)
+{
+	xfs_fileoff_t		rext_next;
+	uint32_t		modoff, modcnt;
+
+	if (irec->br_state != XFS_EXT_UNWRITTEN)
+		return false;
+
+	div_u64_rem(irec->br_startoff, mp->m_sb.sb_rextsize, &modoff);
+	if (modoff == 0) {
+		uint64_t	rexts = div_u64_rem(irec->br_blockcount,
+						mp->m_sb.sb_rextsize, &modcnt);
+
+		if (rexts > 0) {
+			/*
+			 * Unwritten mapping starts at an rt extent boundary
+			 * and is longer than one rt extent.  Round the length
+			 * down to the nearest extent but don't select it for
+			 * conversion.
+			 */
+			irec->br_blockcount -= modcnt;
+			modcnt = 0;
+		}
+
+		/* Unwritten mapping is perfectly aligned, do not convert. */
+		if (modcnt == 0)
+			return false;
+	}
+
+	/*
+	 * Unaligned and unwritten; trim to the current rt extent and select it
+	 * for conversion.
+	 */
+	rext_next = (irec->br_startoff - modoff) + mp->m_sb.sb_rextsize;
+	xfs_trim_extent(irec, irec->br_startoff, rext_next - irec->br_startoff);
+	return true;
+}
+
+/*
+ * Find an unwritten extent in the given file range, zero it, and convert the
+ * mapping to written.  Adjust the scan cursor on the way out.
+ */
+STATIC int
+xfs_rtfile_convert_one(
+	struct xfs_inode	*ip,
+	xfs_fileoff_t		*offp,
+	xfs_fileoff_t		endoff)
+{
+	struct xfs_bmbt_irec	irec;
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_trans	*tp;
+	unsigned int		resblks;
+	int			nmap;
+	int			error;
+
+	resblks = XFS_DIOSTRAT_SPACE_RES(mp, 1);
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
+	if (error)
+		return error;
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, 0);
+
+	/*
+	 * Read the mapping.  If we find an unwritten extent that isn't aligned
+	 * to an rt extent boundary...
+	 */
+retry:
+	nmap = 1;
+	error = xfs_bmapi_read(ip, *offp, endoff - *offp, &irec, &nmap, 0);
+	if (error)
+		goto out_cancel;
+	ASSERT(nmap == 1);
+	ASSERT(irec.br_startoff == *offp);
+	if (!xfs_rtfile_want_conversion(mp, &irec)) {
+		*offp = irec.br_startoff + irec.br_blockcount;
+		if (*offp >= endoff)
+			goto out_cancel;
+		goto retry;
+	}
+
+	/*
+	 * ...make sure this partially unwritten rt extent gets converted to a
+	 * zeroed written extent that we can remap.
+	 */
+	nmap = 1;
+	error = xfs_bmapi_write(tp, ip, irec.br_startoff, irec.br_blockcount,
+			XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO, 0, &irec, &nmap);
+	if (error)
+		goto out_cancel;
+	ASSERT(nmap == 1);
+	if (irec.br_state != XFS_EXT_NORM) {
+		ASSERT(0);
+		error = -EIO;
+		goto out_cancel;
+	}
+	error = xfs_trans_commit(tp);
+	if (error)
+		goto out_unlock;
+
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	*offp = irec.br_startoff + irec.br_blockcount;
+	return 0;
+
+out_cancel:
+	xfs_trans_cancel(tp);
+out_unlock:
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	return error;
+}
+
+/*
+ * For all realtime extents backing the given range of a file, search for
+ * unwritten mappings that do not cover a full rt extent and convert them
+ * to zeroed written mappings.  The goal is to end up with one mapping per rt
+ * extent so that we can perform a remapping operation.  Callers must ensure
+ * that there are no dirty pages in the given range.
+ */
+int
+xfs_rtfile_convert_unwritten(
+	struct xfs_inode	*ip,
+	loff_t			pos,
+	uint64_t		len)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	xfs_fileoff_t		off;
+	xfs_fileoff_t		endoff;
+	int			error;
+
+	if (mp->m_sb.sb_rextsize == 1)
+		return 0;
+
+	off = rounddown_64(XFS_B_TO_FSBT(mp, pos), mp->m_sb.sb_rextsize);
+	endoff = roundup_64(XFS_B_TO_FSB(mp, pos + len), mp->m_sb.sb_rextsize);
+
+	trace_xfs_rtfile_convert_unwritten(ip, pos, len);
+
+	while (off < endoff) {
+		if (fatal_signal_pending(current))
+			return -EINTR;
+
+		error = xfs_rtfile_convert_one(ip, &off, endoff);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
index 3b2f1b499a11..e440f793dd98 100644
--- a/fs/xfs/xfs_rtalloc.h
+++ b/fs/xfs/xfs_rtalloc.h
@@ -140,6 +140,8 @@ int xfs_rtalloc_extent_is_free(struct xfs_mount *mp, struct xfs_trans *tp,
 			       xfs_rtblock_t start, xfs_extlen_t len,
 			       bool *is_free);
 int xfs_rtalloc_reinit_frextents(struct xfs_mount *mp);
+int xfs_rtfile_convert_unwritten(struct xfs_inode *ip, loff_t pos,
+		uint64_t len);
 #else
 # define xfs_rtallocate_extent(t,b,min,max,l,f,p,rb)    (ENOSYS)
 # define xfs_rtfree_extent(t,b,l)                       (ENOSYS)
@@ -164,6 +166,7 @@ xfs_rtmount_init(
 }
 # define xfs_rtmount_inodes(m)  (((mp)->m_sb.sb_rblocks == 0)? 0 : (ENOSYS))
 # define xfs_rtunmount_inodes(m)
+# define xfs_rtfile_convert_unwritten(ip, pos, len)	(0)
 #endif	/* CONFIG_XFS_RT */
 
 #endif	/* __XFS_RTALLOC_H__ */
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 59f740863e70..965a5f5b50ee 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1525,7 +1525,7 @@ DEFINE_IMAP_EVENT(xfs_iomap_alloc);
 DEFINE_IMAP_EVENT(xfs_iomap_found);
 
 DECLARE_EVENT_CLASS(xfs_simple_io_class,
-	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count),
+	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, u64 count),
 	TP_ARGS(ip, offset, count),
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
@@ -1533,7 +1533,7 @@ DECLARE_EVENT_CLASS(xfs_simple_io_class,
 		__field(loff_t, isize)
 		__field(loff_t, disize)
 		__field(loff_t, offset)
-		__field(size_t, count)
+		__field(u64, count)
 	),
 	TP_fast_assign(
 		__entry->dev = VFS_I(ip)->i_sb->s_dev;
@@ -1544,7 +1544,7 @@ DECLARE_EVENT_CLASS(xfs_simple_io_class,
 		__entry->count = count;
 	),
 	TP_printk("dev %d:%d ino 0x%llx isize 0x%llx disize 0x%llx "
-		  "pos 0x%llx bytecount 0x%zx",
+		  "pos 0x%llx bytecount 0x%llx",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  __entry->ino,
 		  __entry->isize,
@@ -1555,7 +1555,7 @@ DECLARE_EVENT_CLASS(xfs_simple_io_class,
 
 #define DEFINE_SIMPLE_IO_EVENT(name)	\
 DEFINE_EVENT(xfs_simple_io_class, name,	\
-	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, ssize_t count),	\
+	TP_PROTO(struct xfs_inode *ip, xfs_off_t offset, u64 count),	\
 	TP_ARGS(ip, offset, count))
 DEFINE_SIMPLE_IO_EVENT(xfs_delalloc_enospc);
 DEFINE_SIMPLE_IO_EVENT(xfs_unwritten_convert);
@@ -3749,6 +3749,9 @@ TRACE_EVENT(xfs_ioctl_clone,
 /* unshare tracepoints */
 DEFINE_SIMPLE_IO_EVENT(xfs_reflink_unshare);
 DEFINE_INODE_ERROR_EVENT(xfs_reflink_unshare_error);
+#ifdef CONFIG_XFS_RT
+DEFINE_SIMPLE_IO_EVENT(xfs_rtfile_convert_unwritten);
+#endif /* CONFIG_XFS_RT */
 
 /* copy on write */
 DEFINE_INODE_IREC_EVENT(xfs_reflink_trim_around_shared);
diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
index 619cf9c0e67d..1c26290b992d 100644
--- a/fs/xfs/xfs_xchgrange.c
+++ b/fs/xfs/xfs_xchgrange.c
@@ -27,6 +27,7 @@
 #include "xfs_sb.h"
 #include "xfs_icache.h"
 #include "xfs_log.h"
+#include "xfs_rtalloc.h"
 #include <linux/fsnotify.h>
 
 /*
@@ -391,7 +392,7 @@ xfs_file_xchg_range(
 		goto out_err;
 
 	/* Prepare and then exchange file contents. */
-	error = xfs_xchg_range_prep(file1, file2, fxr);
+	error = xfs_xchg_range_prep(file1, file2, fxr, priv_flags);
 	if (error)
 		goto out_unlock;
 
@@ -770,12 +771,58 @@ xfs_swap_extent_forks(
 	return 0;
 }
 
+/*
+ * There may be partially written rt extents lurking in the ranges to be
+ * swapped.  According to the rules for realtime files with big rt extents, we
+ * must guarantee that an outside observer (an IO thread, realistically) never
+ * can see multiple physical rt extents mapped to the same logical file rt
+ * extent.  The deferred bmap log intent items that we use under the hood
+ * operate on single block mappings and not rt extents, which means we must
+ * have a strategy to ensure that log recovery after a failure won't stop in
+ * the middle of an rt extent.
+ *
+ * The preferred strategy is to use deferred extent swap log intent items to
+ * track the status of the overall swap operation so that we can complete the
+ * work during crash recovery.  If that isn't possible, we fall back to
+ * requiring the selected mappings in both forks to be aligned to rt extent
+ * boundaries.  As an aside, the old fork swap routine didn't have this
+ * requirement, but at an extreme cost in flexibility (full files only, and no
+ * support if rmapbt is enabled).
+ */
+static bool
+xfs_xchg_range_need_rt_conversion(
+	struct xfs_inode		*ip,
+	unsigned int			xchg_flags)
+{
+	struct xfs_mount		*mp = ip->i_mount;
+
+	/*
+	 * Caller got permission to use logged swapext, so log recovery will
+	 * finish the swap and not leave us with partially swapped rt extents
+	 * exposed to userspace.
+	 */
+	if (xchg_flags & XFS_XCHG_RANGE_LOGGED)
+		return false;
+
+	/*
+	 * If we can't use log intent items at all, the only supported
+	 * operation is full fork swaps, so no conversions are needed.
+	 * The range requirements are enforced by the swapext code itself.
+	 */
+	if (!xfs_swapext_supported(mp))
+		return false;
+
+	/* Conversion is only needed for realtime files with big rt extents */
+	return xfs_inode_has_bigrtextents(ip);
+}
+
 /* Prepare two files to have their data exchanged. */
 int
 xfs_xchg_range_prep(
 	struct file		*file1,
 	struct file		*file2,
-	struct xfs_exch_range	*fxr)
+	struct xfs_exch_range	*fxr,
+	unsigned int		xchg_flags)
 {
 	struct xfs_inode	*ip1 = XFS_I(file_inode(file1));
 	struct xfs_inode	*ip2 = XFS_I(file_inode(file2));
@@ -839,6 +886,19 @@ xfs_xchg_range_prep(
 			return error;
 	}
 
+	/* Convert unwritten sub-extent mappings if required. */
+	if (xfs_xchg_range_need_rt_conversion(ip2, xchg_flags)) {
+		error = xfs_rtfile_convert_unwritten(ip2, fxr->file2_offset,
+				fxr->length);
+		if (error)
+			return error;
+
+		error = xfs_rtfile_convert_unwritten(ip1, fxr->file1_offset,
+				fxr->length);
+		if (error)
+			return error;
+	}
+
 	return 0;
 }
 
@@ -1056,6 +1116,15 @@ xfs_xchg_range(
 	if (xchg_flags & XFS_XCHG_RANGE_LOGGED)
 		req.req_flags |= XFS_SWAP_REQ_LOGGED;
 
+	/*
+	 * Round the request length up to the nearest fundamental unit of
+	 * allocation.  The prep function already checked that the request
+	 * offsets and length in @fxr are safe to round up.
+	 */
+	if (XFS_IS_REALTIME_INODE(ip2))
+		req.blockcount = roundup_64(req.blockcount,
+					    mp->m_sb.sb_rextsize);
+
 	error = xfs_xchg_range_estimate(&req);
 	if (error)
 		return error;
diff --git a/fs/xfs/xfs_xchgrange.h b/fs/xfs/xfs_xchgrange.h
index 1f79f16e4a95..691f020a724d 100644
--- a/fs/xfs/xfs_xchgrange.h
+++ b/fs/xfs/xfs_xchgrange.h
@@ -51,6 +51,6 @@ void xfs_xchg_range_rele_log_assist(struct xfs_mount *mp);
 int xfs_xchg_range(struct xfs_inode *ip1, struct xfs_inode *ip2,
 		const struct xfs_exch_range *fxr, unsigned int xchg_flags);
 int xfs_xchg_range_prep(struct file *file1, struct file *file2,
-		struct xfs_exch_range *fxr);
+		struct xfs_exch_range *fxr, unsigned int xchg_flags);
 
 #endif /* __XFS_XCHGRANGE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 24/25] xfs: support non-power-of-two rtextsize with exchange-range
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (22 preceding siblings ...)
  2023-05-26  1:28   ` [PATCH 23/25] xfs: make atomic extent swapping support realtime files Darrick J. Wong
@ 2023-05-26  1:28   ` Darrick J. Wong
  2023-05-26  1:28   ` [PATCH 25/25] xfs: enable atomic swapext feature Darrick J. Wong
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

The VFS exchange-range alignment checks use (fast) bitmasks to perform
block alignment checks on the exchange parameters.  Unfortunately,
bitmasks require that the alignment size be a power of two.  This isn't
true for realtime devices, so we have to copy-pasta the VFS checks using
long division for this to work properly.
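
As a quick editorial illustration (not code from this patch, and the
helper name below is made up): a mask-based check such as
(offset & (rextbytes - 1)) == 0 is only correct when rextbytes is a
power of two, so the realtime path falls back to a remainder:

	/* Hypothetical sketch of a non-power-of-two alignment check. */
	static inline bool
	aligned_to_rextsize(uint64_t offset, uint32_t rextbytes)
	{
		uint32_t	mod;

		div_u64_rem(offset, rextbytes, &mod);
		return mod == 0;
	}

The isaligned_64()/rounddown_64() helpers used in the patch below handle
this non-power-of-two case for the realtime checks.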

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_xchgrange.c |  102 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 91 insertions(+), 11 deletions(-)


diff --git a/fs/xfs/xfs_xchgrange.c b/fs/xfs/xfs_xchgrange.c
index 1c26290b992d..9595aeb599ef 100644
--- a/fs/xfs/xfs_xchgrange.c
+++ b/fs/xfs/xfs_xchgrange.c
@@ -816,6 +816,86 @@ xfs_xchg_range_need_rt_conversion(
 	return xfs_inode_has_bigrtextents(ip);
 }
 
+/*
+ * Check the alignment of an exchange request when the allocation unit size
+ * isn't a power of two.  The VFS helpers use (fast) bitmask-based alignment
+ * checks, but here we have to use slow long division.
+ */
+static int
+xfs_xchg_range_check_rtalign(
+	struct xfs_inode		*ip1,
+	struct xfs_inode		*ip2,
+	const struct xfs_exch_range	*fxr)
+{
+	struct xfs_mount		*mp = ip1->i_mount;
+	uint32_t			rextbytes;
+	uint64_t			length = fxr->length;
+	uint64_t			blen;
+	loff_t				size1, size2;
+
+	rextbytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
+	size1 = i_size_read(VFS_I(ip1));
+	size2 = i_size_read(VFS_I(ip2));
+
+	/* The start of both ranges must be aligned to a rt extent. */
+	if (!isaligned_64(fxr->file1_offset, rextbytes) ||
+	    !isaligned_64(fxr->file2_offset, rextbytes))
+		return -EINVAL;
+
+	/*
+	 * If the caller asked for full files, check that the offset/length
+	 * values cover all of both files.
+	 */
+	if ((fxr->flags & XFS_EXCH_RANGE_FULL_FILES) &&
+	    (fxr->file1_offset != 0 || fxr->file2_offset != 0 ||
+	     fxr->length != size1 || fxr->length != size2))
+		return -EDOM;
+
+	if (fxr->flags & XFS_EXCH_RANGE_TO_EOF)
+		length = max_t(int64_t, size1 - fxr->file1_offset,
+					size2 - fxr->file2_offset);
+
+	/*
+	 * If the user wanted us to exchange up to the infile's EOF, round up
+	 * to the next rt extent boundary for this check.  Do the same for the
+	 * outfile.
+	 *
+	 * Otherwise, reject the range length if it's not rt extent aligned.
+	 * We already confirmed the starting offsets' rt extent block
+	 * alignment.
+	 */
+	if (fxr->file1_offset + length == size1)
+		blen = roundup_64(size1, rextbytes) - fxr->file1_offset;
+	else if (fxr->file2_offset + length == size2)
+		blen = roundup_64(size2, rextbytes) - fxr->file2_offset;
+	else if (!isaligned_64(length, rextbytes))
+		return -EINVAL;
+	else
+		blen = length;
+
+	/* Don't allow overlapped exchanges within the same file. */
+	if (ip1 == ip2 &&
+	    fxr->file2_offset + blen > fxr->file1_offset &&
+	    fxr->file1_offset + blen > fxr->file2_offset)
+		return -EINVAL;
+
+	/*
+	 * Ensure that we don't exchange a partial EOF rt extent into the
+	 * middle of another file.
+	 */
+	if (isaligned_64(length, rextbytes))
+		return 0;
+
+	blen = length;
+	if (fxr->file2_offset + length < size2)
+		blen = rounddown_64(blen, rextbytes);
+
+	if (fxr->file1_offset + blen < size1)
+		blen = rounddown_64(blen, rextbytes);
+
+	return blen == length ? 0 : -EINVAL;
+}
+
 /* Prepare two files to have their data exchanged. */
 int
 xfs_xchg_range_prep(
@@ -826,6 +906,7 @@ xfs_xchg_range_prep(
 {
 	struct xfs_inode	*ip1 = XFS_I(file_inode(file1));
 	struct xfs_inode	*ip2 = XFS_I(file_inode(file2));
+	unsigned int		alloc_unit = xfs_inode_alloc_unitsize(ip2);
 	int			error;
 
 	trace_xfs_xchg_range_prep(ip1, fxr, ip2, 0);
@@ -834,18 +915,17 @@ xfs_xchg_range_prep(
 	if (XFS_IS_REALTIME_INODE(ip1) != XFS_IS_REALTIME_INODE(ip2))
 		return -EINVAL;
 
-	/*
-	 * The alignment checks in the VFS helpers cannot deal with allocation
-	 * units that are not powers of 2.  This can happen with the realtime
-	 * volume if the extent size is set.  Note that alignment checks are
-	 * skipped if FULL_FILES is set.
-	 */
-	if (!(fxr->flags & XFS_EXCH_RANGE_FULL_FILES) &&
-	    !is_power_of_2(xfs_inode_alloc_unitsize(ip2)))
-		return -EOPNOTSUPP;
+	/* Check non-power of two alignment issues, if necessary. */
+	if (XFS_IS_REALTIME_INODE(ip2) && !is_power_of_2(alloc_unit)) {
+		error = xfs_xchg_range_check_rtalign(ip1, ip2, fxr);
+		if (error)
+			return error;
 
-	error = xfs_exch_range_prep(file1, file2, fxr,
-			xfs_inode_alloc_unitsize(ip2));
+		/* Do the VFS checks with the regular block alignment. */
+		alloc_unit = ip1->i_mount->m_sb.sb_blocksize;
+	}
+
+	error = xfs_exch_range_prep(file1, file2, fxr, alloc_unit);
 	if (error || fxr->length == 0)
 		return error;
 


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 25/25] xfs: enable atomic swapext feature
  2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
                     ` (23 preceding siblings ...)
  2023-05-26  1:28   ` [PATCH 24/25] xfs: support non-power-of-two rtextsize with exchange-range Darrick J. Wong
@ 2023-05-26  1:28   ` Darrick J. Wong
  24 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  1:28 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, linux-fsdevel, linux-api

From: Darrick J. Wong <djwong@kernel.org>

Add the atomic swapext feature to the set of features that we will
permit.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_format.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index bb8bff488017..0c457905cce5 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -393,7 +393,8 @@ xfs_sb_has_incompat_feature(
 #define XFS_SB_FEAT_INCOMPAT_LOG_XATTRS   (1 << 0)	/* Delayed Attributes */
 #define XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT  (1U << 31)	/* file extent swap */
 #define XFS_SB_FEAT_INCOMPAT_LOG_ALL \
-	(XFS_SB_FEAT_INCOMPAT_LOG_XATTRS)
+		(XFS_SB_FEAT_INCOMPAT_LOG_XATTRS | \
+		 XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT)
 #define XFS_SB_FEAT_INCOMPAT_LOG_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_LOG_ALL
 static inline bool
 xfs_sb_has_incompat_log_feature(


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/7] xfs: create a big array data structure
  2023-05-26  0:47   ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
@ 2023-05-26  1:34     ` Kent Overstreet
  2023-05-26  3:19       ` Darrick J. Wong
  2023-06-22  2:55     ` Dave Chinner
  1 sibling, 1 reply; 54+ messages in thread
From: Kent Overstreet @ 2023-05-26  1:34 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, willy, linux-fsdevel

On Thu, May 25, 2023 at 05:47:08PM -0700, Darrick J. Wong wrote:
> +struct xfarray {
> +	/* Underlying file that backs the array. */
> +	struct xfile	*xfile;
> +
> +	/* Number of array elements. */
> +	xfarray_idx_t	nr;
> +
> +	/* Maximum possible array size. */
> +	xfarray_idx_t	max_nr;
> +
> +	/* Number of unset slots in the array below @nr. */
> +	uint64_t	unset_slots;
> +
> +	/* Size of an array element. */
> +	size_t		obj_size;
> +
> +	/* log2 of array element size, if possible. */
> +	int		obj_size_log;
> +};
> +
> +int xfarray_create(struct xfs_mount *mp, const char *descr,
> +		unsigned long long required_capacity, size_t obj_size,
> +		struct xfarray **arrayp);
> +void xfarray_destroy(struct xfarray *array);
> +int xfarray_load(struct xfarray *array, xfarray_idx_t idx, void *ptr);
> +int xfarray_unset(struct xfarray *array, xfarray_idx_t idx);
> +int xfarray_store(struct xfarray *array, xfarray_idx_t idx, const void *ptr);
> +int xfarray_store_anywhere(struct xfarray *array, const void *ptr);
> +bool xfarray_element_is_null(struct xfarray *array, const void *ptr);

Nice simple external interface... +1

Since you're storing fixed size elements, if you wanted to make it
slicker you could steal the generic-radix tree approach of using a
wrapper type to make the object size known at compile time, which lets
you constant propagate through the index -> offset calculations.
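
(A rough sketch of that idea, with hypothetical wrapper/function names -
not an interface this series provides - using xfs_rmap_irec as an
example record type:)

	struct rmap_xfarray { struct xfile *xf; };

	static inline int
	rmap_xfarray_load(struct rmap_xfarray *a, xfarray_idx_t idx,
			struct xfs_rmap_irec *rec)
	{
		/*
		 * sizeof(*rec) is a compile-time constant here, so the
		 * index -> byte offset multiply can constant-propagate.
		 */
		ssize_t	ret = xfile_pread(a->xf, rec, sizeof(*rec),
					(loff_t)idx * sizeof(*rec));

		return ret < 0 ? ret : 0;
	}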

But not worth it from a performance POV with the current implementation,
because...

> +/*
> + * Read a memory object directly from the xfile's page cache.  Unlike regular
> + * pread, we return -E2BIG and -EFBIG for reads that are too large or at too
> + * high an offset, instead of truncating the read.  Otherwise, we return
> + * bytes read or an error code, like regular pread.
> + */
> +ssize_t
> +xfile_pread(
> +	struct xfile		*xf,
> +	void			*buf,
> +	size_t			count,
> +	loff_t			pos)
> +{
> +	struct inode		*inode = file_inode(xf->file);
> +	struct address_space	*mapping = inode->i_mapping;
> +	struct page		*page = NULL;
> +	ssize_t			read = 0;
> +	unsigned int		pflags;
> +	int			error = 0;
> +
> +	if (count > MAX_RW_COUNT)
> +		return -E2BIG;
> +	if (inode->i_sb->s_maxbytes - pos < count)
> +		return -EFBIG;
> +
> +	trace_xfile_pread(xf, pos, count);
> +
> +	pflags = memalloc_nofs_save();
> +	while (count > 0) {
> +		void		*p, *kaddr;
> +		unsigned int	len;
> +
> +		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
> +
> +		/*
> +		 * In-kernel reads of a shmem file cause it to allocate a page
> +		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
> +		 * we can continue by zeroing the caller's buffer.
> +		 */
> +		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
> +				__GFP_NOWARN);
> +		if (IS_ERR(page)) {
> +			error = PTR_ERR(page);
> +			if (error != -ENOMEM)
> +				break;
> +
> +			memset(buf, 0, len);
> +			goto advance;
> +		}
> +
> +		if (PageUptodate(page)) {
> +			/*
> +			 * xfile pages must never be mapped into userspace, so
> +			 * we skip the dcache flush.
> +			 */
> +			kaddr = kmap_local_page(page);
> +			p = kaddr + offset_in_page(pos);
> +			memcpy(buf, p, len);
> +			kunmap_local(kaddr);
> +		} else {
> +			memset(buf, 0, len);
> +		}
> +		put_page(page);
> +
> +advance:
> +		count -= len;
> +		pos += len;
> +		buf += len;
> +		read += len;
> +	}
> +	memalloc_nofs_restore(pflags);
> +
> +	if (read > 0)
> +		return read;
> +	return error;
> +}

this all, and the write path, looks a bit heavy - you're calling through
shmem_read_mapping_page_gfp() on every lookup. Does it matter?

If we care about performance, we want to get it as much as possible down
to just the page cache radix tree lookup - and possibly cache the last
page returned if we care about sequential performance.

OTOH, maybe shmem_get_folio_gfp() and __filemap_get_folio() could
benefit from some early returns -
	if (likely(got_the_thing_we_want)) return folio;

Another thought... if obj_size <= PAGE_SIZE, maybe you could do what
genradix does and not have objects span pages? That would let you get
rid of the loop in read/write - but then you'd want to be doing an
interface that works in terms of pages/folios, which wouldn't be as
clean as what you've got.
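
(Sketch of the index math that buys you - hypothetical helper name, not
code from this series; records simply never straddle a page boundary:)

	static inline void
	xfarray_idx_to_page(size_t obj_size, xfarray_idx_t idx,
			pgoff_t *pgoff, size_t *off_in_page)
	{
		u32	per_page = PAGE_SIZE / obj_size;
		u32	rem;

		*pgoff = div_u64_rem(idx, per_page, &rem);
		*off_in_page = rem * obj_size;
	}

i.e. one page cache lookup and one memcpy per record, no loop.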

Just spitballing random ideas, looks good :)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/7] xfs: create a big array data structure
  2023-05-26  1:34     ` Kent Overstreet
@ 2023-05-26  3:19       ` Darrick J. Wong
  0 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-05-26  3:19 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: linux-xfs, willy, linux-fsdevel

On Thu, May 25, 2023 at 09:34:29PM -0400, Kent Overstreet wrote:
> On Thu, May 25, 2023 at 05:47:08PM -0700, Darrick J. Wong wrote:
> > +struct xfarray {
> > +	/* Underlying file that backs the array. */
> > +	struct xfile	*xfile;
> > +
> > +	/* Number of array elements. */
> > +	xfarray_idx_t	nr;
> > +
> > +	/* Maximum possible array size. */
> > +	xfarray_idx_t	max_nr;
> > +
> > +	/* Number of unset slots in the array below @nr. */
> > +	uint64_t	unset_slots;
> > +
> > +	/* Size of an array element. */
> > +	size_t		obj_size;
> > +
> > +	/* log2 of array element size, if possible. */
> > +	int		obj_size_log;
> > +};
> > +
> > +int xfarray_create(struct xfs_mount *mp, const char *descr,
> > +		unsigned long long required_capacity, size_t obj_size,
> > +		struct xfarray **arrayp);
> > +void xfarray_destroy(struct xfarray *array);
> > +int xfarray_load(struct xfarray *array, xfarray_idx_t idx, void *ptr);
> > +int xfarray_unset(struct xfarray *array, xfarray_idx_t idx);
> > +int xfarray_store(struct xfarray *array, xfarray_idx_t idx, const void *ptr);
> > +int xfarray_store_anywhere(struct xfarray *array, const void *ptr);
> > +bool xfarray_element_is_null(struct xfarray *array, const void *ptr);
> 
> Nice simple external interface... +1
> 
> Since you're storing fixed size elements, if you wanted to make it
> slicker you could steal the generic-radix tree approach of using a
> wrapper type to make the object size known at compile time, which lets
> you constant propagate through the index -> offset calculations.
> 
> But not worth it from a performance POV with the current implementation,
> because...
> 
> > +/*
> > + * Read a memory object directly from the xfile's page cache.  Unlike regular
> > + * pread, we return -E2BIG and -EFBIG for reads that are too large or at too
> > + * high an offset, instead of truncating the read.  Otherwise, we return
> > + * bytes read or an error code, like regular pread.
> > + */
> > +ssize_t
> > +xfile_pread(
> > +	struct xfile		*xf,
> > +	void			*buf,
> > +	size_t			count,
> > +	loff_t			pos)
> > +{
> > +	struct inode		*inode = file_inode(xf->file);
> > +	struct address_space	*mapping = inode->i_mapping;
> > +	struct page		*page = NULL;
> > +	ssize_t			read = 0;
> > +	unsigned int		pflags;
> > +	int			error = 0;
> > +
> > +	if (count > MAX_RW_COUNT)
> > +		return -E2BIG;
> > +	if (inode->i_sb->s_maxbytes - pos < count)
> > +		return -EFBIG;
> > +
> > +	trace_xfile_pread(xf, pos, count);
> > +
> > +	pflags = memalloc_nofs_save();
> > +	while (count > 0) {
> > +		void		*p, *kaddr;
> > +		unsigned int	len;
> > +
> > +		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
> > +
> > +		/*
> > +		 * In-kernel reads of a shmem file cause it to allocate a page
> > +		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
> > +		 * we can continue by zeroing the caller's buffer.
> > +		 */
> > +		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
> > +				__GFP_NOWARN);
> > +		if (IS_ERR(page)) {
> > +			error = PTR_ERR(page);
> > +			if (error != -ENOMEM)
> > +				break;
> > +
> > +			memset(buf, 0, len);
> > +			goto advance;
> > +		}
> > +
> > +		if (PageUptodate(page)) {
> > +			/*
> > +			 * xfile pages must never be mapped into userspace, so
> > +			 * we skip the dcache flush.
> > +			 */
> > +			kaddr = kmap_local_page(page);
> > +			p = kaddr + offset_in_page(pos);
> > +			memcpy(buf, p, len);
> > +			kunmap_local(kaddr);
> > +		} else {
> > +			memset(buf, 0, len);
> > +		}
> > +		put_page(page);
> > +
> > +advance:
> > +		count -= len;
> > +		pos += len;
> > +		buf += len;
> > +		read += len;
> > +	}
> > +	memalloc_nofs_restore(pflags);
> > +
> > +	if (read > 0)
> > +		return read;
> > +	return error;
> > +}
> 
> this all, and the write path, looks a bit heavy - you're calling through
> shmem_read_mapping_page_gfp() on every lookup. Does it matter?

Longer term I'd like to work with willy on an in-kernel mmap and/or
using large folios with the tmpfs file, but for now I only care that it
works correctly and gets merged. :)

> If we care about performance, we want to get it as much as possible down
> to just the page cache radix tree lookup - and possibly cache the last
> page returned if we care about sequential performance.

(That comes later in this megapatchset.)

> OTOH, maybe shmem_get_folio_gfp() and __filemap_get_folio() could
> benefit from some early returns -
> 	if (likely(got_the_thing_we_want)) return folio;
> 
> Another thought... if obj_size <= PAGE_SIZE, maybe you could do what
> genradix does and not have objects span pages? That would let you get
> rid of the loop in read/write - but then you'd want to be doing an
> interface that works in terms of pages/folios, which wouldn't be as
> clean as what you've got.

Yeah... the xfs dquot files store 136-byte dquot records which don't
cross fsblock boundaries.  There's a lot of math involved there, though
at least there's an incore dquot object so we're mostly not pounding on
the dquot file itself.
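
(For anyone curious what that math looks like, a rough sketch using the
same numbers -- this is not the actual dquot code: 136-byte records packed
into 4096-byte blocks means 30 records per block, with 16 dead bytes at
the end of each block.)

	/* illustrative only; blksize and obj_size are assumptions */
	static inline loff_t nospan_pos(unsigned int blksize,
					unsigned int obj_size, uint64_t idx)
	{
		unsigned int	per_blk = blksize / obj_size; /* 4096/136 = 30 */

		return (loff_t)(idx / per_blk) * blksize +
		       (idx % per_blk) * obj_size;
	}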

--D

> Just spitballing random ideas, looks good :)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/7] xfs: create a big array data structure
  2023-05-26  0:47   ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
  2023-05-26  1:34     ` Kent Overstreet
@ 2023-06-22  2:55     ` Dave Chinner
  2023-07-05 23:48       ` Darrick J. Wong
  1 sibling, 1 reply; 54+ messages in thread
From: Dave Chinner @ 2023-06-22  2:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

On Thu, May 25, 2023 at 05:47:08PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Create a simple 'big array' data structure for storage of fixed-size
> metadata records that will be used to reconstruct a btree index.  For
> repair operations, the most important operations are append, iterate,
> and sort.
....
> +/*
> + * Initialize a big memory array.  Array records cannot be larger than a
> + * page, and the array cannot span more bytes than the page cache supports.
> + * If @required_capacity is nonzero, the maximum array size will be set to this
> + * quantity and the array creation will fail if the underlying storage cannot
> + * support that many records.
> + */
> +int
> +xfarray_create(
> +	struct xfs_mount	*mp,
> +	const char		*description,
> +	unsigned long long	required_capacity,
> +	size_t			obj_size,
> +	struct xfarray		**arrayp)
> +{
> +	struct xfarray		*array;
> +	struct xfile		*xfile;
> +	int			error;
> +
> +	ASSERT(obj_size < PAGE_SIZE);
> +
> +	error = xfile_create(mp, description, 0, &xfile);
> +	if (error)
> +		return error;

The xfarray and xfile can be completely independent of anything XFS
at all by passing the full xfile "filename" that is to be used here
rather than having xfile_create prefix the description with a string
like "XFS (devname):".

.....

Otherwise this is all fine.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory
  2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
                     ` (6 preceding siblings ...)
  2023-05-26  0:48   ` [PATCH 7/7] xfs: improve xfarray quicksort pivot Darrick J. Wong
@ 2023-06-22  2:58   ` Dave Chinner
  7 siblings, 0 replies; 54+ messages in thread
From: Dave Chinner @ 2023-06-22  2:58 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

On Thu, May 25, 2023 at 05:28:55PM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> In general, online repair of an indexed record set walks the filesystem
> looking for records.  These records are sorted and bulk-loaded into a
> new btree.  To make this happen without pinning gigabytes of metadata in
> memory, first create an abstraction ('xfile') of memfd files so that
> kernel code can access paged memory, and then an array abstraction
> ('xfarray') based on xfiles so that online repair can create an array of
> new records without pinning memory.
> 
> These two data storage abstractions are critical for repair of space
> metadata -- the memory used is pageable, which helps us avoid pinning
> kernel memory and driving OOM problems; and they are byte-accessible
> enough that we can use them like (very slow and programmatic) memory
> buffers.
> 
> Later patchsets will build on this functionality to provide blob storage
> and btrees.

Apart from the need for a struct xfs_mount just for the xfile name
at creation time, this all looks OK.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Is there any specific test harness for this infrastructure, or is it
just validated by having other functions built on top of it "work
correctly"?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/7] xfs: create a big array data structure
  2023-06-22  2:55     ` Dave Chinner
@ 2023-07-05 23:48       ` Darrick J. Wong
  0 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-07-05 23:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Kent Overstreet, linux-xfs, willy, linux-fsdevel

On Thu, Jun 22, 2023 at 12:55:43PM +1000, Dave Chinner wrote:
> On Thu, May 25, 2023 at 05:47:08PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Create a simple 'big array' data structure for storage of fixed-size
> > metadata records that will be used to reconstruct a btree index.  For
> > repair operations, the most important operations are append, iterate,
> > and sort.
> ....
> > +/*
> > + * Initialize a big memory array.  Array records cannot be larger than a
> > + * page, and the array cannot span more bytes than the page cache supports.
> > + * If @required_capacity is nonzero, the maximum array size will be set to this
> > + * quantity and the array creation will fail if the underlying storage cannot
> > + * support that many records.
> > + */
> > +int
> > +xfarray_create(
> > +	struct xfs_mount	*mp,
> > +	const char		*description,
> > +	unsigned long long	required_capacity,
> > +	size_t			obj_size,
> > +	struct xfarray		**arrayp)
> > +{
> > +	struct xfarray		*array;
> > +	struct xfile		*xfile;
> > +	int			error;
> > +
> > +	ASSERT(obj_size < PAGE_SIZE);
> > +
> > +	error = xfile_create(mp, description, 0, &xfile);
> > +	if (error)
> > +		return error;
> 
> The xfarray and xfile can be completely independent of anything XFS
> at all by passing the full xfile "filename" that is to be used here
> rather than having xfile_create prefix the description with a string
> like "XFS (devname):".

Ok, I'll shift the "XFS (devname)" part into the callers for the next
round.
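
(Roughly what the caller side might look like after that change --
illustrative only, using the reworked signature and an example record
type:)

	char		descr[64];

	snprintf(descr, sizeof(descr), "XFS (%s): bmap records",
			mp->m_super->s_id);
	error = xfarray_create(descr, 0, sizeof(struct xfs_bmbt_irec), &array);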

--D

> .....
> 
> Otherwise this is all fine.
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/7] xfs: create a big array data structure
  2023-07-28  3:10   ` Matthew Wilcox
@ 2023-07-28  4:39     ` Darrick J. Wong
  0 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2023-07-28  4:39 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Kent Overstreet, Dave Chinner, linux-xfs, linux-fsdevel

On Fri, Jul 28, 2023 at 04:10:51AM +0100, Matthew Wilcox wrote:
> On Thu, Jul 27, 2023 at 03:25:35PM -0700, Darrick J. Wong wrote:
> > diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
> > index 7418d6c60056a..0b9e781840f37 100644
> > --- a/fs/xfs/scrub/trace.h
> > +++ b/fs/xfs/scrub/trace.h
> > @@ -16,6 +16,9 @@
> >  #include <linux/tracepoint.h>
> >  #include "xfs_bit.h"
> >  
> > +struct xfile;
> > +struct xfarray;
> 
> You dereference both a struct xfile and a struct xfarray.  Seems like
> you don't need these declarations?

I suppose not.

> > +/* Compute array index given an xfile offset. */
> > +static xfarray_idx_t
> > +xfarray_idx(
> > +	struct xfarray	*array,
> > +	loff_t		pos)
> > +{
> > +	if (array->obj_size_log >= 0)
> > +		return (xfarray_idx_t)pos >> array->obj_size_log;
> > +
> > +	return div_u64((xfarray_idx_t)pos, array->obj_size);
> 
> If xfarray_idx_t is smaller than an loff_t, this will truncate pos,
> which isn't what you want.

typedef uint64_t              xfarray_idx_t;

This won't be smaller than loff_t until you port Linux to 128-bit
integers in 2028.

> > +/* Compute xfile offset of array element. */
> > +static inline loff_t xfarray_pos(struct xfarray *array, xfarray_idx_t idx)
> > +{
> > +	if (array->obj_size_log >= 0)
> > +		return idx << array->obj_size_log;
> > +
> > +	return idx * array->obj_size;
> 
> Likewise, you need to promote idx to loff_t before shifting/multiplying.
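
(For concreteness, the promotion being asked for would look something like
this; as with xfarray_idx_t above, the cast is mostly defensive since both
types are 64 bits here:)

	static inline loff_t xfarray_pos(struct xfarray *array, xfarray_idx_t idx)
	{
		if (array->obj_size_log >= 0)
			return (loff_t)idx << array->obj_size_log;

		return (loff_t)idx * array->obj_size;
	}
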
> 
> > +static inline bool
> > +xfarray_is_unset(
> > +	struct xfarray	*array,
> > +	loff_t		pos)
> > +{
> > +	void		*temp = xfarray_scratch(array);
> > +	int		error;
> > +
> > +	if (array->unset_slots == 0)
> > +		return false;
> > +
> > +	error = xfile_obj_load(array->xfile, temp, array->obj_size, pos);
> > +	if (!error && xfarray_element_is_null(array, temp))
> > +		return true;
> > +
> > +	return false;
> 
> Wouldn't this be clearer as:
> 
> 	return !error && xfarray_element_is_null(array, temp);

<shrug> don't care either way.

	if (error)
		return false;
	return xfarray_element_is_null(...);

> > +int
> > +xfarray_store_anywhere(
> > +	struct xfarray	*array,
> > +	const void	*ptr)
> > +{
> > +	void		*temp = xfarray_scratch(array);
> > +	loff_t		endpos = xfarray_pos(array, array->nr);
> > +	loff_t		pos;
> > +	int		error;
> > +
> > +	/* Find an unset slot to put it in. */
> > +	for (pos = 0;
> > +	     pos < endpos && array->unset_slots > 0;
> > +	     pos += array->obj_size) {
> > +		error = xfile_obj_load(array->xfile, temp, array->obj_size,
> > +				pos);
> > +		if (error || !xfarray_element_is_null(array, temp))
> > +			continue;
> > +
> > +		error = xfile_obj_store(array->xfile, ptr, array->obj_size,
> > +				pos);
> > +		if (error)
> > +			return error;
> > +
> > +		array->unset_slots--;
> > +		return 0;
> > +	}
> 
> ... how often is this called?  This seems like it might be slow.

It's used in the refcount btree rebuilder patch, when it's trying to
stack rmaps to compute the refcount of a given extent from the number of
rmaps it's collected for that extent.

(Eventually I replace the xfarray with an indexed btree to eliminate the
linear searching, but that won't happen until the part 2 of part 1
because I decided to send only the first 51 of 209 patches.)

> > +	/*
> > +	 * Call SEEK_DATA on the last byte in the record we're about to read.
> > +	 * If the record ends at (or crosses) the end of a page then we know
> > +	 * that the first byte of the record is backed by pages and don't need
> > +	 * to query it.  If instead the record begins at the start of the page
> > +	 * then we know that querying the last byte is just as good as querying
> > +	 * the first byte, since records cannot be larger than a page.
> > +	 *
> > +	 * If the call returns the same file offset, we know this record is
> > +	 * backed by real pages.  We do not need to move the cursor.
> > +	 */
> 
> Clever.
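
(For readers unfamiliar with SEEK_DATA, a hedged userspace analogue of
that probe; the fd and offsets are hypothetical:)

	#define _GNU_SOURCE		/* for SEEK_DATA */
	#include <unistd.h>
	#include <errno.h>

	/*
	 * Is byte end_pos backed by written data, and if not, where does data
	 * resume?  Returns -1 if there is no data at or after end_pos.
	 */
	static off_t probe_data(int fd, off_t end_pos)
	{
		off_t next = lseek(fd, end_pos, SEEK_DATA);

		if (next == (off_t)-1 && errno == ENXIO)
			return -1;	/* hole runs to EOF: stop iterating */
		return next;		/* == end_pos if already backed; else
					 * first byte of the next data extent */
	}
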
> 
> > +ssize_t
> > +xfile_pread(
> > +	struct xfile		*xf,
> > +	void			*buf,
> > +	size_t			count,
> > +	loff_t			pos)
> > +{
> > +	struct inode		*inode = file_inode(xf->file);
> > +	struct address_space	*mapping = inode->i_mapping;
> > +	struct page		*page = NULL;
> > +	ssize_t			read = 0;
> > +	unsigned int		pflags;
> > +	int			error = 0;
> > +
> > +	if (count > MAX_RW_COUNT)
> > +		return -E2BIG;
> > +	if (inode->i_sb->s_maxbytes - pos < count)
> > +		return -EFBIG;
> > +
> > +	trace_xfile_pread(xf, pos, count);
> > +
> > +	pflags = memalloc_nofs_save();
> 
> Should we be calling this here, or should this be done by the caller?
> Presumably it's the current caller that can't stand reclaim starting?

Well... here's the thing -- scrub already does this by attaching a
(sometimes empty) transaction to the scrub context.  In the context of
"xfile as an xscrub infrastructure", it's unnecessary.

OTOH in the context of "xfile as something that may some day end up a
general kernel tool", I don't think we want an xfile access to recurse
into filesystems.

> > +	while (count > 0) {
> > +		void		*p, *kaddr;
> > +		unsigned int	len;
> > +
> > +		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
> > +
> > +		/*
> > +		 * In-kernel reads of a shmem file cause it to allocate a page
> > +		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
> > +		 * we can continue by zeroing the caller's buffer.
> > +		 */
> > +		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
> > +				__GFP_NOWARN);
> 
> I'm kind of hoping to transition to shmem_read_folio_gfp(), but that
> doesn't have to happen before this gets merged.

<nod> I haven't figured out if we care about large folios for xfiles
yet.  Scrub data is supposed to be ephemeral so it likely won't care,
but I can imagine longer term uses for xfiles that might actually have
an opinion.

> > +ssize_t
> > +xfile_pwrite(
> > +	struct xfile		*xf,
> > +	const void		*buf,
> > +	size_t			count,
> > +	loff_t			pos)
> > +{
> > +	struct inode		*inode = file_inode(xf->file);
> > +	struct address_space	*mapping = inode->i_mapping;
> 
> I wonder if this shouldn't be xf->file->f_mapping?

<shrug> What's the difference for a tmpfs file?

--D

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 1/7] xfs: create a big array data structure
  2023-07-27 22:25 ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
@ 2023-07-28  3:10   ` Matthew Wilcox
  2023-07-28  4:39     ` Darrick J. Wong
  0 siblings, 1 reply; 54+ messages in thread
From: Matthew Wilcox @ 2023-07-28  3:10 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Kent Overstreet, Dave Chinner, linux-xfs, linux-fsdevel

On Thu, Jul 27, 2023 at 03:25:35PM -0700, Darrick J. Wong wrote:
> diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
> index 7418d6c60056a..0b9e781840f37 100644
> --- a/fs/xfs/scrub/trace.h
> +++ b/fs/xfs/scrub/trace.h
> @@ -16,6 +16,9 @@
>  #include <linux/tracepoint.h>
>  #include "xfs_bit.h"
>  
> +struct xfile;
> +struct xfarray;

You dereference both a struct xfile and a struct xfarray.  Seems like
you don't need these declarations?

> +/* Compute array index given an xfile offset. */
> +static xfarray_idx_t
> +xfarray_idx(
> +	struct xfarray	*array,
> +	loff_t		pos)
> +{
> +	if (array->obj_size_log >= 0)
> +		return (xfarray_idx_t)pos >> array->obj_size_log;
> +
> +	return div_u64((xfarray_idx_t)pos, array->obj_size);

If xfarray_idx_t is smaller than an loff_t, this will truncate pos,
which isn't what you want.

> +/* Compute xfile offset of array element. */
> +static inline loff_t xfarray_pos(struct xfarray *array, xfarray_idx_t idx)
> +{
> +	if (array->obj_size_log >= 0)
> +		return idx << array->obj_size_log;
> +
> +	return idx * array->obj_size;

Likewise, you need to promote idx to loff_t before shifting/multiplying.

> +static inline bool
> +xfarray_is_unset(
> +	struct xfarray	*array,
> +	loff_t		pos)
> +{
> +	void		*temp = xfarray_scratch(array);
> +	int		error;
> +
> +	if (array->unset_slots == 0)
> +		return false;
> +
> +	error = xfile_obj_load(array->xfile, temp, array->obj_size, pos);
> +	if (!error && xfarray_element_is_null(array, temp))
> +		return true;
> +
> +	return false;

Wouldn't this be clearer as:

	return !error && xfarray_element_is_null(array, temp);

> +int
> +xfarray_store_anywhere(
> +	struct xfarray	*array,
> +	const void	*ptr)
> +{
> +	void		*temp = xfarray_scratch(array);
> +	loff_t		endpos = xfarray_pos(array, array->nr);
> +	loff_t		pos;
> +	int		error;
> +
> +	/* Find an unset slot to put it in. */
> +	for (pos = 0;
> +	     pos < endpos && array->unset_slots > 0;
> +	     pos += array->obj_size) {
> +		error = xfile_obj_load(array->xfile, temp, array->obj_size,
> +				pos);
> +		if (error || !xfarray_element_is_null(array, temp))
> +			continue;
> +
> +		error = xfile_obj_store(array->xfile, ptr, array->obj_size,
> +				pos);
> +		if (error)
> +			return error;
> +
> +		array->unset_slots--;
> +		return 0;
> +	}

... how often is this called?  This seems like it might be slow.

> +	/*
> +	 * Call SEEK_DATA on the last byte in the record we're about to read.
> +	 * If the record ends at (or crosses) the end of a page then we know
> +	 * that the first byte of the record is backed by pages and don't need
> +	 * to query it.  If instead the record begins at the start of the page
> +	 * then we know that querying the last byte is just as good as querying
> +	 * the first byte, since records cannot be larger than a page.
> +	 *
> +	 * If the call returns the same file offset, we know this record is
> +	 * backed by real pages.  We do not need to move the cursor.
> +	 */

Clever.

> +ssize_t
> +xfile_pread(
> +	struct xfile		*xf,
> +	void			*buf,
> +	size_t			count,
> +	loff_t			pos)
> +{
> +	struct inode		*inode = file_inode(xf->file);
> +	struct address_space	*mapping = inode->i_mapping;
> +	struct page		*page = NULL;
> +	ssize_t			read = 0;
> +	unsigned int		pflags;
> +	int			error = 0;
> +
> +	if (count > MAX_RW_COUNT)
> +		return -E2BIG;
> +	if (inode->i_sb->s_maxbytes - pos < count)
> +		return -EFBIG;
> +
> +	trace_xfile_pread(xf, pos, count);
> +
> +	pflags = memalloc_nofs_save();

Should we be calling this here, or should this be done by the caller?
Presumably it's the current caller that can't stand reclaim starting?

> +	while (count > 0) {
> +		void		*p, *kaddr;
> +		unsigned int	len;
> +
> +		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
> +
> +		/*
> +		 * In-kernel reads of a shmem file cause it to allocate a page
> +		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
> +		 * we can continue by zeroing the caller's buffer.
> +		 */
> +		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
> +				__GFP_NOWARN);

I'm kind of hoping to transition to shmem_read_folio_gfp(), but that
doesn't have to happen before this gets merged.

> +ssize_t
> +xfile_pwrite(
> +	struct xfile		*xf,
> +	const void		*buf,
> +	size_t			count,
> +	loff_t			pos)
> +{
> +	struct inode		*inode = file_inode(xf->file);
> +	struct address_space	*mapping = inode->i_mapping;

I wonder if this shouldn't be xf->file->f_mapping?


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 1/7] xfs: create a big array data structure
  2023-07-27 22:19 [PATCHSET v26.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
@ 2023-07-27 22:25 ` Darrick J. Wong
  2023-07-28  3:10   ` Matthew Wilcox
  0 siblings, 1 reply; 54+ messages in thread
From: Darrick J. Wong @ 2023-07-27 22:25 UTC (permalink / raw)
  To: djwong; +Cc: Kent Overstreet, Dave Chinner, linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Create a simple 'big array' data structure for storage of fixed-size
metadata records that will be used to reconstruct a btree index.  For
repair operations, the most important operations are append, iterate,
and sort.

Earlier implementations of the big array used linked lists and suffered
from severe problems -- pinning all records in kernel memory was not a
good idea and frequently led to OOM situations; random access was very
inefficient; and record overhead for the lists was unacceptably high at
40-60%.

Therefore, the big memory array relies on the 'xfile' abstraction, which
creates a memfd file and stores the records in page cache pages.  Since
the memfd is created in tmpfs, the memory pages can be pushed out to
disk if necessary and we have a built-in usage limit of 50% of physical
memory.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/Kconfig         |    1 
 fs/xfs/Makefile        |    2 
 fs/xfs/scrub/trace.c   |    4 -
 fs/xfs/scrub/trace.h   |  121 ++++++++++++++++
 fs/xfs/scrub/xfarray.c |  369 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfarray.h |   57 +++++++
 fs/xfs/scrub/xfile.c   |  312 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfile.h   |   57 +++++++
 8 files changed, 922 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/scrub/xfarray.c
 create mode 100644 fs/xfs/scrub/xfarray.h
 create mode 100644 fs/xfs/scrub/xfile.c
 create mode 100644 fs/xfs/scrub/xfile.h


diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 52e1823241fbc..152348b4dece2 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -128,6 +128,7 @@ config XFS_ONLINE_SCRUB
 	bool "XFS online metadata check support"
 	default n
 	depends on XFS_FS
+	depends on TMPFS && SHMEM
 	select XFS_DRAIN_INTENTS
 	help
 	  If you say Y here you will be able to check metadata on a
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index d562d128af8ec..7a5fa47a30936 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -164,6 +164,8 @@ xfs-y				+= $(addprefix scrub/, \
 				   rmap.o \
 				   scrub.o \
 				   symlink.o \
+				   xfarray.o \
+				   xfile.o \
 				   )
 
 xfs-$(CONFIG_XFS_RT)		+= scrub/rtbitmap.o
diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c
index 0a975439d2b63..46249e7b17e09 100644
--- a/fs/xfs/scrub/trace.c
+++ b/fs/xfs/scrub/trace.c
@@ -12,8 +12,10 @@
 #include "xfs_mount.h"
 #include "xfs_inode.h"
 #include "xfs_btree.h"
-#include "scrub/scrub.h"
 #include "xfs_ag.h"
+#include "scrub/scrub.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
 
 /* Figure out which block the btree cursor was pointing to. */
 static inline xfs_fsblock_t
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 7418d6c60056a..0b9e781840f37 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -16,6 +16,9 @@
 #include <linux/tracepoint.h>
 #include "xfs_bit.h"
 
+struct xfile;
+struct xfarray;
+
 /*
  * ftrace's __print_symbolic requires that all enum values be wrapped in the
  * TRACE_DEFINE_ENUM macro so that the enum value can be encoded in the ftrace
@@ -725,6 +728,124 @@ TRACE_EVENT(xchk_refcount_incorrect,
 		  __entry->seen)
 )
 
+TRACE_EVENT(xfile_create,
+	TP_PROTO(struct xfile *xf),
+	TP_ARGS(xf),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned long, ino)
+		__array(char, pathname, 256)
+	),
+	TP_fast_assign(
+		char		pathname[257];
+		char		*path;
+
+		__entry->ino = file_inode(xf->file)->i_ino;
+		memset(pathname, 0, sizeof(pathname));
+		path = file_path(xf->file, pathname, sizeof(pathname) - 1);
+		if (IS_ERR(path))
+			path = "(unknown)";
+		strncpy(__entry->pathname, path, sizeof(__entry->pathname));
+	),
+	TP_printk("xfino 0x%lx path '%s'",
+		  __entry->ino,
+		  __entry->pathname)
+);
+
+TRACE_EVENT(xfile_destroy,
+	TP_PROTO(struct xfile *xf),
+	TP_ARGS(xf),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, bytes)
+		__field(loff_t, size)
+	),
+	TP_fast_assign(
+		struct xfile_stat	statbuf;
+		int			ret;
+
+		ret = xfile_stat(xf, &statbuf);
+		if (!ret) {
+			__entry->bytes = statbuf.bytes;
+			__entry->size = statbuf.size;
+		} else {
+			__entry->bytes = -1;
+			__entry->size = -1;
+		}
+		__entry->ino = file_inode(xf->file)->i_ino;
+	),
+	TP_printk("xfino 0x%lx mem_bytes 0x%llx isize 0x%llx",
+		  __entry->ino,
+		  __entry->bytes,
+		  __entry->size)
+);
+
+DECLARE_EVENT_CLASS(xfile_class,
+	TP_PROTO(struct xfile *xf, loff_t pos, unsigned long long bytecount),
+	TP_ARGS(xf, pos, bytecount),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, bytes_used)
+		__field(loff_t, pos)
+		__field(loff_t, size)
+		__field(unsigned long long, bytecount)
+	),
+	TP_fast_assign(
+		struct xfile_stat	statbuf;
+		int			ret;
+
+		ret = xfile_stat(xf, &statbuf);
+		if (!ret) {
+			__entry->bytes_used = statbuf.bytes;
+			__entry->size = statbuf.size;
+		} else {
+			__entry->bytes_used = -1;
+			__entry->size = -1;
+		}
+		__entry->ino = file_inode(xf->file)->i_ino;
+		__entry->pos = pos;
+		__entry->bytecount = bytecount;
+	),
+	TP_printk("xfino 0x%lx mem_bytes 0x%llx pos 0x%llx bytecount 0x%llx isize 0x%llx",
+		  __entry->ino,
+		  __entry->bytes_used,
+		  __entry->pos,
+		  __entry->bytecount,
+		  __entry->size)
+);
+#define DEFINE_XFILE_EVENT(name) \
+DEFINE_EVENT(xfile_class, name, \
+	TP_PROTO(struct xfile *xf, loff_t pos, unsigned long long bytecount), \
+	TP_ARGS(xf, pos, bytecount))
+DEFINE_XFILE_EVENT(xfile_pread);
+DEFINE_XFILE_EVENT(xfile_pwrite);
+DEFINE_XFILE_EVENT(xfile_seek_data);
+
+TRACE_EVENT(xfarray_create,
+	TP_PROTO(struct xfarray *xfa, unsigned long long required_capacity),
+	TP_ARGS(xfa, required_capacity),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(uint64_t, max_nr)
+		__field(size_t, obj_size)
+		__field(int, obj_size_log)
+		__field(unsigned long long, required_capacity)
+	),
+	TP_fast_assign(
+		__entry->max_nr = xfa->max_nr;
+		__entry->obj_size = xfa->obj_size;
+		__entry->obj_size_log = xfa->obj_size_log;
+		__entry->ino = file_inode(xfa->xfile->file)->i_ino;
+		__entry->required_capacity = required_capacity;
+	),
+	TP_printk("xfino 0x%lx max_nr %llu reqd_nr %llu objsz %zu objszlog %d",
+		  __entry->ino,
+		  __entry->max_nr,
+		  __entry->required_capacity,
+		  __entry->obj_size,
+		  __entry->obj_size_log)
+);
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 
diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
new file mode 100644
index 0000000000000..ca4a4a307010f
--- /dev/null
+++ b/fs/xfs/scrub/xfarray.c
@@ -0,0 +1,369 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
+#include "scrub/scrub.h"
+#include "scrub/trace.h"
+
+/*
+ * Large Arrays of Fixed-Size Records
+ * ==================================
+ *
+ * This memory array uses an xfile (which itself is a memfd "file") to store
+ * large numbers of fixed-size records in memory that can be paged out.  This
+ * puts less stress on the memory reclaim algorithms during an online repair
+ * because we don't have to pin so much memory.  However, array access is less
+ * direct than would be in a regular memory array.  Access to the array is
+ * performed via indexed load and store methods, and an append method is
+ * provided for convenience.  Array elements can be unset, which sets them to
+ * all zeroes.  Unset entries are skipped during iteration, though direct loads
+ * will return a zeroed buffer.  Callers are responsible for concurrency
+ * control.
+ */
+
+/*
+ * Pointer to scratch space.  Because we can't access the xfile data directly,
+ * we allocate a small amount of memory on the end of the xfarray structure to
+ * buffer array items when we need space to store values temporarily.
+ */
+static inline void *xfarray_scratch(struct xfarray *array)
+{
+	return (array + 1);
+}
+
+/* Compute array index given an xfile offset. */
+static xfarray_idx_t
+xfarray_idx(
+	struct xfarray	*array,
+	loff_t		pos)
+{
+	if (array->obj_size_log >= 0)
+		return (xfarray_idx_t)pos >> array->obj_size_log;
+
+	return div_u64((xfarray_idx_t)pos, array->obj_size);
+}
+
+/* Compute xfile offset of array element. */
+static inline loff_t xfarray_pos(struct xfarray *array, xfarray_idx_t idx)
+{
+	if (array->obj_size_log >= 0)
+		return idx << array->obj_size_log;
+
+	return idx * array->obj_size;
+}
+
+/*
+ * Initialize a big memory array.  Array records cannot be larger than a
+ * page, and the array cannot span more bytes than the page cache supports.
+ * If @required_capacity is nonzero, the maximum array size will be set to this
+ * quantity and the array creation will fail if the underlying storage cannot
+ * support that many records.
+ */
+int
+xfarray_create(
+	const char		*description,
+	unsigned long long	required_capacity,
+	size_t			obj_size,
+	struct xfarray		**arrayp)
+{
+	struct xfarray		*array;
+	struct xfile		*xfile;
+	int			error;
+
+	ASSERT(obj_size < PAGE_SIZE);
+
+	error = xfile_create(description, 0, &xfile);
+	if (error)
+		return error;
+
+	error = -ENOMEM;
+	array = kzalloc(sizeof(struct xfarray) + obj_size, XCHK_GFP_FLAGS);
+	if (!array)
+		goto out_xfile;
+
+	array->xfile = xfile;
+	array->obj_size = obj_size;
+
+	if (is_power_of_2(obj_size))
+		array->obj_size_log = ilog2(obj_size);
+	else
+		array->obj_size_log = -1;
+
+	array->max_nr = xfarray_idx(array, MAX_LFS_FILESIZE);
+	trace_xfarray_create(array, required_capacity);
+
+	if (required_capacity > 0) {
+		if (array->max_nr < required_capacity) {
+			error = -ENOMEM;
+			goto out_xfarray;
+		}
+		array->max_nr = required_capacity;
+	}
+
+	*arrayp = array;
+	return 0;
+
+out_xfarray:
+	kfree(array);
+out_xfile:
+	xfile_destroy(xfile);
+	return error;
+}
+
+/* Destroy the array. */
+void
+xfarray_destroy(
+	struct xfarray	*array)
+{
+	xfile_destroy(array->xfile);
+	kfree(array);
+}
+
+/* Load an element from the array. */
+int
+xfarray_load(
+	struct xfarray	*array,
+	xfarray_idx_t	idx,
+	void		*ptr)
+{
+	if (idx >= array->nr)
+		return -ENODATA;
+
+	return xfile_obj_load(array->xfile, ptr, array->obj_size,
+			xfarray_pos(array, idx));
+}
+
+/* Is this array element potentially unset? */
+static inline bool
+xfarray_is_unset(
+	struct xfarray	*array,
+	loff_t		pos)
+{
+	void		*temp = xfarray_scratch(array);
+	int		error;
+
+	if (array->unset_slots == 0)
+		return false;
+
+	error = xfile_obj_load(array->xfile, temp, array->obj_size, pos);
+	if (!error && xfarray_element_is_null(array, temp))
+		return true;
+
+	return false;
+}
+
+/*
+ * Unset an array element.  If @idx is the last element in the array, the
+ * array will be truncated.  Otherwise, the entry will be zeroed.
+ */
+int
+xfarray_unset(
+	struct xfarray	*array,
+	xfarray_idx_t	idx)
+{
+	void		*temp = xfarray_scratch(array);
+	loff_t		pos = xfarray_pos(array, idx);
+	int		error;
+
+	if (idx >= array->nr)
+		return -ENODATA;
+
+	if (idx == array->nr - 1) {
+		array->nr--;
+		return 0;
+	}
+
+	if (xfarray_is_unset(array, pos))
+		return 0;
+
+	memset(temp, 0, array->obj_size);
+	error = xfile_obj_store(array->xfile, temp, array->obj_size, pos);
+	if (error)
+		return error;
+
+	array->unset_slots++;
+	return 0;
+}
+
+/*
+ * Store an element in the array.  The element must not be completely zeroed,
+ * because those are considered unset sparse elements.
+ */
+int
+xfarray_store(
+	struct xfarray	*array,
+	xfarray_idx_t	idx,
+	const void	*ptr)
+{
+	int		ret;
+
+	if (idx >= array->max_nr)
+		return -EFBIG;
+
+	ASSERT(!xfarray_element_is_null(array, ptr));
+
+	ret = xfile_obj_store(array->xfile, ptr, array->obj_size,
+			xfarray_pos(array, idx));
+	if (ret)
+		return ret;
+
+	array->nr = max(array->nr, idx + 1);
+	return 0;
+}
+
+/* Is this array element NULL? */
+bool
+xfarray_element_is_null(
+	struct xfarray	*array,
+	const void	*ptr)
+{
+	return !memchr_inv(ptr, 0, array->obj_size);
+}
+
+/*
+ * Store an element anywhere in the array that is unset.  If there are no
+ * unset slots, append the element to the array.
+ */
+int
+xfarray_store_anywhere(
+	struct xfarray	*array,
+	const void	*ptr)
+{
+	void		*temp = xfarray_scratch(array);
+	loff_t		endpos = xfarray_pos(array, array->nr);
+	loff_t		pos;
+	int		error;
+
+	/* Find an unset slot to put it in. */
+	for (pos = 0;
+	     pos < endpos && array->unset_slots > 0;
+	     pos += array->obj_size) {
+		error = xfile_obj_load(array->xfile, temp, array->obj_size,
+				pos);
+		if (error || !xfarray_element_is_null(array, temp))
+			continue;
+
+		error = xfile_obj_store(array->xfile, ptr, array->obj_size,
+				pos);
+		if (error)
+			return error;
+
+		array->unset_slots--;
+		return 0;
+	}
+
+	/* No unset slots found; attach it on the end. */
+	array->unset_slots = 0;
+	return xfarray_append(array, ptr);
+}
+
+/* Return length of array. */
+uint64_t
+xfarray_length(
+	struct xfarray	*array)
+{
+	return array->nr;
+}
+
+/*
+ * Decide which array item we're going to read as part of an _iter_get.
+ * @cur is the array index, and @pos is the file offset of that array index in
+ * the backing xfile.  Returns ENODATA if we reach the end of the records.
+ *
+ * Reading from a hole in a sparse xfile causes page instantiation, so for
+ * iterating a (possibly sparse) array we need to figure out if the cursor is
+ * pointing at a totally uninitialized hole and move the cursor up if
+ * necessary.
+ */
+static inline int
+xfarray_find_data(
+	struct xfarray	*array,
+	xfarray_idx_t	*cur,
+	loff_t		*pos)
+{
+	unsigned int	pgoff = offset_in_page(*pos);
+	loff_t		end_pos = *pos + array->obj_size - 1;
+	loff_t		new_pos;
+
+	/*
+	 * If the current array record is not adjacent to a page boundary, we
+	 * are in the middle of the page.  We do not need to move the cursor.
+	 */
+	if (pgoff != 0 && pgoff + array->obj_size - 1 < PAGE_SIZE)
+		return 0;
+
+	/*
+	 * Call SEEK_DATA on the last byte in the record we're about to read.
+	 * If the record ends at (or crosses) the end of a page then we know
+	 * that the first byte of the record is backed by pages and don't need
+	 * to query it.  If instead the record begins at the start of the page
+	 * then we know that querying the last byte is just as good as querying
+	 * the first byte, since records cannot be larger than a page.
+	 *
+	 * If the call returns the same file offset, we know this record is
+	 * backed by real pages.  We do not need to move the cursor.
+	 */
+	new_pos = xfile_seek_data(array->xfile, end_pos);
+	if (new_pos == -ENXIO)
+		return -ENODATA;
+	if (new_pos < 0)
+		return new_pos;
+	if (new_pos == end_pos)
+		return 0;
+
+	/*
+	 * Otherwise, SEEK_DATA told us how far up to move the file pointer to
+	 * find more data.  Move the array index to the first record past the
+	 * byte offset we were given.
+	 */
+	new_pos = roundup_64(new_pos, array->obj_size);
+	*cur = xfarray_idx(array, new_pos);
+	*pos = xfarray_pos(array, *cur);
+	return 0;
+}
+
+/*
+ * Starting at *idx, fetch the next non-null array entry and advance the index
+ * to set up the next _load_next call.  Returns ENODATA if we reach the end of
+ * the array.  Callers must set @*idx to XFARRAY_CURSOR_INIT before the first
+ * call to this function.
+ */
+int
+xfarray_load_next(
+	struct xfarray	*array,
+	xfarray_idx_t	*idx,
+	void		*rec)
+{
+	xfarray_idx_t	cur = *idx;
+	loff_t		pos = xfarray_pos(array, cur);
+	int		error;
+
+	do {
+		if (cur >= array->nr)
+			return -ENODATA;
+
+		/*
+		 * Ask the backing store for the location of next possible
+		 * written record, then retrieve that record.
+		 */
+		error = xfarray_find_data(array, &cur, &pos);
+		if (error)
+			return error;
+		error = xfarray_load(array, cur, rec);
+		if (error)
+			return error;
+
+		cur++;
+		pos += array->obj_size;
+	} while (xfarray_element_is_null(array, rec));
+
+	*idx = cur;
+	return 0;
+}
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
new file mode 100644
index 0000000000000..3ef7911b104b8
--- /dev/null
+++ b/fs/xfs/scrub/xfarray.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SCRUB_XFARRAY_H__
+#define __XFS_SCRUB_XFARRAY_H__
+
+/* xfile array index type, along with cursor initialization */
+typedef uint64_t		xfarray_idx_t;
+#define XFARRAY_CURSOR_INIT	((__force xfarray_idx_t)0)
+
+/* Iterate each index of an xfile array. */
+#define foreach_xfarray_idx(array, idx) \
+	for ((idx) = XFARRAY_CURSOR_INIT; \
+	     (idx) < xfarray_length(array); \
+	     (idx)++)
+
+struct xfarray {
+	/* Underlying file that backs the array. */
+	struct xfile	*xfile;
+
+	/* Number of array elements. */
+	xfarray_idx_t	nr;
+
+	/* Maximum possible array size. */
+	xfarray_idx_t	max_nr;
+
+	/* Number of unset slots in the array below @nr. */
+	uint64_t	unset_slots;
+
+	/* Size of an array element. */
+	size_t		obj_size;
+
+	/* log2 of array element size, if possible. */
+	int		obj_size_log;
+};
+
+int xfarray_create(const char *descr, unsigned long long required_capacity,
+		size_t obj_size, struct xfarray **arrayp);
+void xfarray_destroy(struct xfarray *array);
+int xfarray_load(struct xfarray *array, xfarray_idx_t idx, void *ptr);
+int xfarray_unset(struct xfarray *array, xfarray_idx_t idx);
+int xfarray_store(struct xfarray *array, xfarray_idx_t idx, const void *ptr);
+int xfarray_store_anywhere(struct xfarray *array, const void *ptr);
+bool xfarray_element_is_null(struct xfarray *array, const void *ptr);
+
+/* Append an element to the array. */
+static inline int xfarray_append(struct xfarray *array, const void *ptr)
+{
+	return xfarray_store(array, array->nr, ptr);
+}
+
+uint64_t xfarray_length(struct xfarray *array);
+int xfarray_load_next(struct xfarray *array, xfarray_idx_t *idx, void *rec);
+
+#endif /* __XFS_SCRUB_XFARRAY_H__ */
diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
new file mode 100644
index 0000000000000..19d512887980f
--- /dev/null
+++ b/fs/xfs/scrub/xfile.c
@@ -0,0 +1,312 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2018-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_format.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
+#include "scrub/scrub.h"
+#include "scrub/trace.h"
+#include <linux/shmem_fs.h>
+
+/*
+ * Swappable Temporary Memory
+ * ==========================
+ *
+ * Online checking sometimes needs to be able to stage a large amount of data
+ * in memory.  This information might not fit in the available memory and it
+ * doesn't all need to be accessible at all times.  In other words, we want an
+ * indexed data buffer to store data that can be paged out.
+ *
+ * When CONFIG_TMPFS=y, shmemfs is enough of a filesystem to meet those
+ * requirements.  Therefore, the xfile mechanism uses an unlinked shmem file to
+ * store our staging data.  This file is not installed in the file descriptor
+ * table so that user programs cannot access the data, which means that the
+ * xfile must be freed with xfile_destroy.
+ *
+ * xfiles assume that the caller will handle all required concurrency
+ * management; standard vfs locks (freezer and inode) are not taken.  Reads
+ * and writes are satisfied directly from the page cache.
+ *
+ * NOTE: The current shmemfs implementation has a quirk that in-kernel reads
+ * of a hole cause a page to be mapped into the file.  If you are going to
+ * create a sparse xfile, please be careful about reading from uninitialized
+ * parts of the file.  These pages are !Uptodate and will eventually be
+ * reclaimed if not written, but in the short term this boosts memory
+ * consumption.
+ */
+
+/*
+ * xfiles must not be exposed to userspace and require upper layers to
+ * coordinate access to the one handle returned by the constructor, so
+ * establish a separate lock class for xfiles to avoid confusing lockdep.
+ */
+static struct lock_class_key xfile_i_mutex_key;
+
+/*
+ * Create an xfile of the given size.  The description will be used in the
+ * trace output.
+ */
+int
+xfile_create(
+	const char		*description,
+	loff_t			isize,
+	struct xfile		**xfilep)
+{
+	struct inode		*inode;
+	struct xfile		*xf;
+	int			error = -ENOMEM;
+
+	xf = kmalloc(sizeof(struct xfile), XCHK_GFP_FLAGS);
+	if (!xf)
+		return -ENOMEM;
+
+	xf->file = shmem_file_setup(description, isize, 0);
+	if (!xf->file)
+		goto out_xfile;
+	if (IS_ERR(xf->file)) {
+		error = PTR_ERR(xf->file);
+		goto out_xfile;
+	}
+
+	/*
+	 * We want a large sparse file that we can pread, pwrite, and seek.
+	 * xfile users are responsible for keeping the xfile hidden away from
+	 * all other callers, so we skip timestamp updates and security checks.
+	 * Make the inode only accessible by root, just in case the xfile ever
+	 * escapes.
+	 */
+	xf->file->f_mode |= FMODE_PREAD | FMODE_PWRITE | FMODE_NOCMTIME |
+			    FMODE_LSEEK;
+	xf->file->f_flags |= O_RDWR | O_LARGEFILE | O_NOATIME;
+	inode = file_inode(xf->file);
+	inode->i_flags |= S_PRIVATE | S_NOCMTIME | S_NOATIME;
+	inode->i_mode &= ~0177;
+	inode->i_uid = GLOBAL_ROOT_UID;
+	inode->i_gid = GLOBAL_ROOT_GID;
+
+	lockdep_set_class(&inode->i_rwsem, &xfile_i_mutex_key);
+
+	trace_xfile_create(xf);
+
+	*xfilep = xf;
+	return 0;
+out_xfile:
+	kfree(xf);
+	return error;
+}
+
+/* Close the file and release all resources. */
+void
+xfile_destroy(
+	struct xfile		*xf)
+{
+	struct inode		*inode = file_inode(xf->file);
+
+	trace_xfile_destroy(xf);
+
+	lockdep_set_class(&inode->i_rwsem, &inode->i_sb->s_type->i_mutex_key);
+	fput(xf->file);
+	kfree(xf);
+}
+
+/*
+ * Read a memory object directly from the xfile's page cache.  Unlike regular
+ * pread, we return -E2BIG and -EFBIG for reads that are too large or at too
+ * high an offset, instead of truncating the read.  Otherwise, we return
+ * bytes read or an error code, like regular pread.
+ */
+ssize_t
+xfile_pread(
+	struct xfile		*xf,
+	void			*buf,
+	size_t			count,
+	loff_t			pos)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	struct page		*page = NULL;
+	ssize_t			read = 0;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_pread(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*p, *kaddr;
+		unsigned int	len;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * In-kernel reads of a shmem file cause it to allocate a page
+		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
+		 * we can continue by zeroing the caller's buffer.
+		 */
+		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
+				__GFP_NOWARN);
+		if (IS_ERR(page)) {
+			error = PTR_ERR(page);
+			if (error != -ENOMEM)
+				break;
+
+			memset(buf, 0, len);
+			goto advance;
+		}
+
+		if (PageUptodate(page)) {
+			/*
+			 * xfile pages must never be mapped into userspace, so
+			 * we skip the dcache flush.
+			 */
+			kaddr = kmap_local_page(page);
+			p = kaddr + offset_in_page(pos);
+			memcpy(buf, p, len);
+			kunmap_local(kaddr);
+		} else {
+			memset(buf, 0, len);
+		}
+		put_page(page);
+
+advance:
+		count -= len;
+		pos += len;
+		buf += len;
+		read += len;
+	}
+	memalloc_nofs_restore(pflags);
+
+	if (read > 0)
+		return read;
+	return error;
+}
+
+/*
+ * Write a memory object directly to the xfile's page cache.  Unlike regular
+ * pwrite, we return -E2BIG and -EFBIG for writes that are too large or at too
+ * high an offset, instead of truncating the write.  Otherwise, we return
+ * bytes written or an error code, like regular pwrite.
+ */
+ssize_t
+xfile_pwrite(
+	struct xfile		*xf,
+	const void		*buf,
+	size_t			count,
+	loff_t			pos)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	struct page		*page = NULL;
+	ssize_t			written = 0;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_pwrite(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*fsdata = NULL;
+		void		*p, *kaddr;
+		unsigned int	len;
+		int		ret;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * We call write_begin directly here to avoid all the freezer
+		 * protection lock-taking that happens in the normal path.
+		 * shmem doesn't support fs freeze, but lockdep doesn't know
+		 * that and will trip over that.
+		 */
+		error = aops->write_begin(NULL, mapping, pos, len, &page,
+				&fsdata);
+		if (error)
+			break;
+
+		/*
+		 * xfile pages must never be mapped into userspace, so we skip
+		 * the dcache flush.  If the page is not uptodate, zero it
+		 * before writing data.
+		 */
+		kaddr = kmap_local_page(page);
+		if (!PageUptodate(page)) {
+			memset(kaddr, 0, PAGE_SIZE);
+			SetPageUptodate(page);
+		}
+		p = kaddr + offset_in_page(pos);
+		memcpy(p, buf, len);
+		kunmap_local(kaddr);
+
+		ret = aops->write_end(NULL, mapping, pos, len, len, page,
+				fsdata);
+		if (ret < 0) {
+			error = ret;
+			break;
+		}
+
+		written += ret;
+		if (ret != len)
+			break;
+
+		count -= ret;
+		pos += ret;
+		buf += ret;
+	}
+	memalloc_nofs_restore(pflags);
+
+	if (written > 0)
+		return written;
+	return error;
+}
+
+/* Find the next written area in the xfile data for a given offset. */
+loff_t
+xfile_seek_data(
+	struct xfile		*xf,
+	loff_t			pos)
+{
+	loff_t			ret;
+
+	ret = vfs_llseek(xf->file, pos, SEEK_DATA);
+	trace_xfile_seek_data(xf, pos, ret);
+	return ret;
+}
+
+/* Query stat information for an xfile. */
+int
+xfile_stat(
+	struct xfile		*xf,
+	struct xfile_stat	*statbuf)
+{
+	struct kstat		ks;
+	int			error;
+
+	error = vfs_getattr_nosec(&xf->file->f_path, &ks,
+			STATX_SIZE | STATX_BLOCKS, AT_STATX_DONT_SYNC);
+	if (error)
+		return error;
+
+	statbuf->size = ks.size;
+	statbuf->bytes = ks.blocks << SECTOR_SHIFT;
+	return 0;
+}
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
new file mode 100644
index 0000000000000..9328a37fedaa3
--- /dev/null
+++ b/fs/xfs/scrub/xfile.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2018-2023 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SCRUB_XFILE_H__
+#define __XFS_SCRUB_XFILE_H__
+
+struct xfile {
+	struct file		*file;
+};
+
+int xfile_create(const char *description, loff_t isize, struct xfile **xfilep);
+void xfile_destroy(struct xfile *xf);
+
+ssize_t xfile_pread(struct xfile *xf, void *buf, size_t count, loff_t pos);
+ssize_t xfile_pwrite(struct xfile *xf, const void *buf, size_t count,
+		loff_t pos);
+
+/*
+ * Load an object.  Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_load(struct xfile *xf, void *buf, size_t count, loff_t pos)
+{
+	ssize_t	ret = xfile_pread(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+/*
+ * Store an object.  Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t pos)
+{
+	ssize_t	ret = xfile_pwrite(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+loff_t xfile_seek_data(struct xfile *xf, loff_t pos);
+
+struct xfile_stat {
+	loff_t			size;
+	unsigned long long	bytes;
+};
+
+int xfile_stat(struct xfile *xf, struct xfile_stat *statbuf);
+
+#endif /* __XFS_SCRUB_XFILE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH 1/7] xfs: create a big array data structure
  2022-12-30 22:12 [PATCHSET v24.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
@ 2022-12-30 22:12 ` Darrick J. Wong
  0 siblings, 0 replies; 54+ messages in thread
From: Darrick J. Wong @ 2022-12-30 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, willy, linux-fsdevel

From: Darrick J. Wong <djwong@kernel.org>

Create a simple 'big array' data structure for storage of fixed-size
metadata records that will be used to reconstruct a btree index.  For
repair operations, the most important operations are append, iterate,
and sort.

Earlier implementations of the big array used linked lists and suffered
from severe problems -- pinning all records in kernel memory was not a
good idea and frequently led to OOM situations; random access was very
inefficient; and record overhead for the lists was unacceptably high at
40-60%.

Therefore, the big memory array relies on the 'xfile' abstraction, which
creates a memfd file and stores the records in page cache pages.  Since
the memfd is created in tmpfs, the memory pages can be pushed out to
disk if necessary and we have a built-in usage limit of 50% of physical
memory.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/Kconfig         |    1 
 fs/xfs/Makefile        |    2 
 fs/xfs/scrub/trace.c   |    4 -
 fs/xfs/scrub/trace.h   |  123 ++++++++++++++++
 fs/xfs/scrub/xfarray.c |  370 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfarray.h |   58 ++++++++
 fs/xfs/scrub/xfile.c   |  318 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/xfile.h   |   58 ++++++++
 8 files changed, 933 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/scrub/xfarray.c
 create mode 100644 fs/xfs/scrub/xfarray.h
 create mode 100644 fs/xfs/scrub/xfile.c
 create mode 100644 fs/xfs/scrub/xfile.h


diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 05bc865142b8..6077ac04c0c3 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -101,6 +101,7 @@ config XFS_ONLINE_SCRUB
 	bool "XFS online metadata check support"
 	default n
 	depends on XFS_FS
+	depends on TMPFS && SHMEM
 	select XFS_DRAIN_INTENTS
 	help
 	  If you say Y here you will be able to check metadata on a
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 90f1f01277be..90cbba7dc550 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -162,6 +162,8 @@ xfs-y				+= $(addprefix scrub/, \
 				   rmap.o \
 				   scrub.o \
 				   symlink.o \
+				   xfarray.o \
+				   xfile.o \
 				   )
 
 xfs-$(CONFIG_XFS_RT)		+= scrub/rtbitmap.o
diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c
index b5f94676c37c..4a0385c97ea6 100644
--- a/fs/xfs/scrub/trace.c
+++ b/fs/xfs/scrub/trace.c
@@ -12,8 +12,10 @@
 #include "xfs_mount.h"
 #include "xfs_inode.h"
 #include "xfs_btree.h"
-#include "scrub/scrub.h"
 #include "xfs_ag.h"
+#include "scrub/scrub.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
 
 /* Figure out which block the btree cursor was pointing to. */
 static inline xfs_fsblock_t
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index cb33f42190df..84edfa7556ac 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -16,6 +16,9 @@
 #include <linux/tracepoint.h>
 #include "xfs_bit.h"
 
+struct xfile;
+struct xfarray;
+
 /*
  * ftrace's __print_symbolic requires that all enum values be wrapped in the
  * TRACE_DEFINE_ENUM macro so that the enum value can be encoded in the ftrace
@@ -726,6 +729,126 @@ TRACE_EVENT(xchk_refcount_incorrect,
 		  __entry->seen)
 )
 
+TRACE_EVENT(xfile_create,
+	TP_PROTO(struct xfs_mount *mp, struct xfile *xf),
+	TP_ARGS(mp, xf),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(unsigned long, ino)
+		__array(char, pathname, 256)
+	),
+	TP_fast_assign(
+		char		pathname[257];
+		char		*path;
+
+		__entry->dev = mp->m_super->s_dev;
+		__entry->ino = file_inode(xf->file)->i_ino;
+		memset(pathname, 0, sizeof(pathname));
+		path = file_path(xf->file, pathname, sizeof(pathname) - 1);
+		if (IS_ERR(path))
+			path = "(unknown)";
+		strncpy(__entry->pathname, path, sizeof(__entry->pathname));
+	),
+	TP_printk("dev %d:%d xfino 0x%lx path '%s'",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino,
+		  __entry->pathname)
+);
+
+TRACE_EVENT(xfile_destroy,
+	TP_PROTO(struct xfile *xf),
+	TP_ARGS(xf),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, bytes)
+		__field(loff_t, size)
+	),
+	TP_fast_assign(
+		struct xfile_stat	statbuf;
+		int			ret;
+
+		ret = xfile_stat(xf, &statbuf);
+		if (!ret) {
+			__entry->bytes = statbuf.bytes;
+			__entry->size = statbuf.size;
+		} else {
+			__entry->bytes = -1;
+			__entry->size = -1;
+		}
+		__entry->ino = file_inode(xf->file)->i_ino;
+	),
+	TP_printk("xfino 0x%lx mem_bytes 0x%llx isize 0x%llx",
+		  __entry->ino,
+		  __entry->bytes,
+		  __entry->size)
+);
+
+DECLARE_EVENT_CLASS(xfile_class,
+	TP_PROTO(struct xfile *xf, loff_t pos, unsigned long long bytecount),
+	TP_ARGS(xf, pos, bytecount),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(unsigned long long, bytes_used)
+		__field(loff_t, pos)
+		__field(loff_t, size)
+		__field(unsigned long long, bytecount)
+	),
+	TP_fast_assign(
+		struct xfile_stat	statbuf;
+		int			ret;
+
+		ret = xfile_stat(xf, &statbuf);
+		if (!ret) {
+			__entry->bytes_used = statbuf.bytes;
+			__entry->size = statbuf.size;
+		} else {
+			__entry->bytes_used = -1;
+			__entry->size = -1;
+		}
+		__entry->ino = file_inode(xf->file)->i_ino;
+		__entry->pos = pos;
+		__entry->bytecount = bytecount;
+	),
+	TP_printk("xfino 0x%lx mem_bytes 0x%llx pos 0x%llx bytecount 0x%llx isize 0x%llx",
+		  __entry->ino,
+		  __entry->bytes_used,
+		  __entry->pos,
+		  __entry->bytecount,
+		  __entry->size)
+);
+#define DEFINE_XFILE_EVENT(name) \
+DEFINE_EVENT(xfile_class, name, \
+	TP_PROTO(struct xfile *xf, loff_t pos, unsigned long long bytecount), \
+	TP_ARGS(xf, pos, bytecount))
+DEFINE_XFILE_EVENT(xfile_pread);
+DEFINE_XFILE_EVENT(xfile_pwrite);
+DEFINE_XFILE_EVENT(xfile_seek_data);
+
+TRACE_EVENT(xfarray_create,
+	TP_PROTO(struct xfarray *xfa, unsigned long long required_capacity),
+	TP_ARGS(xfa, required_capacity),
+	TP_STRUCT__entry(
+		__field(unsigned long, ino)
+		__field(uint64_t, max_nr)
+		__field(size_t, obj_size)
+		__field(int, obj_size_log)
+		__field(unsigned long long, required_capacity)
+	),
+	TP_fast_assign(
+		__entry->max_nr = xfa->max_nr;
+		__entry->obj_size = xfa->obj_size;
+		__entry->obj_size_log = xfa->obj_size_log;
+		__entry->ino = file_inode(xfa->xfile->file)->i_ino;
+		__entry->required_capacity = required_capacity;
+	),
+	TP_printk("xfino 0x%lx max_nr %llu reqd_nr %llu objsz %zu objszlog %d",
+		  __entry->ino,
+		  __entry->max_nr,
+		  __entry->required_capacity,
+		  __entry->obj_size,
+		  __entry->obj_size_log)
+);
+
 /* repair tracepoints */
 #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR)
 
diff --git a/fs/xfs/scrub/xfarray.c b/fs/xfs/scrub/xfarray.c
new file mode 100644
index 000000000000..8fdd7dd40193
--- /dev/null
+++ b/fs/xfs/scrub/xfarray.c
@@ -0,0 +1,370 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2022 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
+#include "scrub/scrub.h"
+#include "scrub/trace.h"
+
+/*
+ * Large Arrays of Fixed-Size Records
+ * ==================================
+ *
+ * This memory array uses an xfile (which itself is a memfd "file") to store
+ * large numbers of fixed-size records in memory that can be paged out.  This
+ * puts less stress on the memory reclaim algorithms during an online repair
+ * because we don't have to pin so much memory.  However, array access is less
+ * direct than would be in a regular memory array.  Access to the array is
+ * performed via indexed load and store methods, and an append method is
+ * provided for convenience.  Array elements can be unset, which sets them to
+ * all zeroes.  Unset entries are skipped during iteration, though direct loads
+ * will return a zeroed buffer.  Callers are responsible for concurrency
+ * control.
+ */
+
+/*
+ * Pointer to scratch space.  Because we can't access the xfile data directly,
+ * we allocate a small amount of memory on the end of the xfarray structure to
+ * buffer array items when we need space to store values temporarily.
+ */
+static inline void *xfarray_scratch(struct xfarray *array)
+{
+	return (array + 1);
+}
+
+/* Compute array index given an xfile offset. */
+static xfarray_idx_t
+xfarray_idx(
+	struct xfarray	*array,
+	loff_t		pos)
+{
+	if (array->obj_size_log >= 0)
+		return (xfarray_idx_t)pos >> array->obj_size_log;
+
+	return div_u64((xfarray_idx_t)pos, array->obj_size);
+}
+
+/* Compute xfile offset of array element. */
+static inline loff_t xfarray_pos(struct xfarray *array, xfarray_idx_t idx)
+{
+	if (array->obj_size_log >= 0)
+		return idx << array->obj_size_log;
+
+	return idx * array->obj_size;
+}
+
+/*
+ * Initialize a big memory array.  Array records cannot be larger than a
+ * page, and the array cannot span more bytes than the page cache supports.
+ * If @required_capacity is nonzero, the maximum array size will be set to this
+ * quantity and the array creation will fail if the underlying storage cannot
+ * support that many records.
+ */
+int
+xfarray_create(
+	struct xfs_mount	*mp,
+	const char		*description,
+	unsigned long long	required_capacity,
+	size_t			obj_size,
+	struct xfarray		**arrayp)
+{
+	struct xfarray		*array;
+	struct xfile		*xfile;
+	int			error;
+
+	ASSERT(obj_size < PAGE_SIZE);
+
+	error = xfile_create(mp, description, 0, &xfile);
+	if (error)
+		return error;
+
+	error = -ENOMEM;
+	array = kzalloc(sizeof(struct xfarray) + obj_size, XCHK_GFP_FLAGS);
+	if (!array)
+		goto out_xfile;
+
+	array->xfile = xfile;
+	array->obj_size = obj_size;
+
+	if (is_power_of_2(obj_size))
+		array->obj_size_log = ilog2(obj_size);
+	else
+		array->obj_size_log = -1;
+
+	array->max_nr = xfarray_idx(array, MAX_LFS_FILESIZE);
+	trace_xfarray_create(array, required_capacity);
+
+	if (required_capacity > 0) {
+		if (array->max_nr < required_capacity) {
+			error = -ENOMEM;
+			goto out_xfarray;
+		}
+		array->max_nr = required_capacity;
+	}
+
+	*arrayp = array;
+	return 0;
+
+out_xfarray:
+	kfree(array);
+out_xfile:
+	xfile_destroy(xfile);
+	return error;
+}
+
+/* Destroy the array. */
+void
+xfarray_destroy(
+	struct xfarray	*array)
+{
+	xfile_destroy(array->xfile);
+	kfree(array);
+}
+
+/* Load an element from the array. */
+int
+xfarray_load(
+	struct xfarray	*array,
+	xfarray_idx_t	idx,
+	void		*ptr)
+{
+	if (idx >= array->nr)
+		return -ENODATA;
+
+	return xfile_obj_load(array->xfile, ptr, array->obj_size,
+			xfarray_pos(array, idx));
+}
+
+/* Is this array element potentially unset? */
+static inline bool
+xfarray_is_unset(
+	struct xfarray	*array,
+	loff_t		pos)
+{
+	void		*temp = xfarray_scratch(array);
+	int		error;
+
+	if (array->unset_slots == 0)
+		return false;
+
+	error = xfile_obj_load(array->xfile, temp, array->obj_size, pos);
+	if (!error && xfarray_element_is_null(array, temp))
+		return true;
+
+	return false;
+}
+
+/*
+ * Unset an array element.  If @idx is the last element in the array, the
+ * array will be truncated.  Otherwise, the entry will be zeroed.
+ */
+int
+xfarray_unset(
+	struct xfarray	*array,
+	xfarray_idx_t	idx)
+{
+	void		*temp = xfarray_scratch(array);
+	loff_t		pos = xfarray_pos(array, idx);
+	int		error;
+
+	if (idx >= array->nr)
+		return -ENODATA;
+
+	if (idx == array->nr - 1) {
+		array->nr--;
+		return 0;
+	}
+
+	if (xfarray_is_unset(array, pos))
+		return 0;
+
+	memset(temp, 0, array->obj_size);
+	error = xfile_obj_store(array->xfile, temp, array->obj_size, pos);
+	if (error)
+		return error;
+
+	array->unset_slots++;
+	return 0;
+}
+
+/*
+ * Store an element in the array.  The element must not be completely zeroed,
+ * because those are considered unset sparse elements.
+ */
+int
+xfarray_store(
+	struct xfarray	*array,
+	xfarray_idx_t	idx,
+	const void	*ptr)
+{
+	int		ret;
+
+	if (idx >= array->max_nr)
+		return -EFBIG;
+
+	ASSERT(!xfarray_element_is_null(array, ptr));
+
+	ret = xfile_obj_store(array->xfile, ptr, array->obj_size,
+			xfarray_pos(array, idx));
+	if (ret)
+		return ret;
+
+	array->nr = max(array->nr, idx + 1);
+	return 0;
+}
+
+/* Is this array element NULL? */
+bool
+xfarray_element_is_null(
+	struct xfarray	*array,
+	const void	*ptr)
+{
+	return !memchr_inv(ptr, 0, array->obj_size);
+}
+
+/*
+ * Store an element anywhere in the array that is unset.  If there are no
+ * unset slots, append the element to the array.
+ */
+int
+xfarray_store_anywhere(
+	struct xfarray	*array,
+	const void	*ptr)
+{
+	void		*temp = xfarray_scratch(array);
+	loff_t		endpos = xfarray_pos(array, array->nr);
+	loff_t		pos;
+	int		error;
+
+	/* Find an unset slot to put it in. */
+	for (pos = 0;
+	     pos < endpos && array->unset_slots > 0;
+	     pos += array->obj_size) {
+		error = xfile_obj_load(array->xfile, temp, array->obj_size,
+				pos);
+		if (error || !xfarray_element_is_null(array, temp))
+			continue;
+
+		error = xfile_obj_store(array->xfile, ptr, array->obj_size,
+				pos);
+		if (error)
+			return error;
+
+		array->unset_slots--;
+		return 0;
+	}
+
+	/* No unset slots found; attach it on the end. */
+	array->unset_slots = 0;
+	return xfarray_append(array, ptr);
+}
+
+/* Return length of array. */
+uint64_t
+xfarray_length(
+	struct xfarray	*array)
+{
+	return array->nr;
+}
+
+/*
+ * Decide which array item we're going to read as part of a _load_next call.
+ * @cur is the array index, and @pos is the file offset of that array index in
+ * the backing xfile.  Returns ENODATA if we reach the end of the records.
+ *
+ * Reading from a hole in a sparse xfile causes page instantiation, so for
+ * iterating a (possibly sparse) array we need to figure out if the cursor is
+ * pointing at a totally uninitialized hole and move the cursor up if
+ * necessary.
+ */
+static inline int
+xfarray_find_data(
+	struct xfarray	*array,
+	xfarray_idx_t	*cur,
+	loff_t		*pos)
+{
+	unsigned int	pgoff = offset_in_page(*pos);
+	loff_t		end_pos = *pos + array->obj_size - 1;
+	loff_t		new_pos;
+
+	/*
+	 * If the current array record is not adjacent to a page boundary, we
+	 * are in the middle of the page.  We do not need to move the cursor.
+	 */
+	if (pgoff != 0 && pgoff + array->obj_size - 1 < PAGE_SIZE)
+		return 0;
+
+	/*
+	 * Call SEEK_DATA on the last byte in the record we're about to read.
+	 * If the record ends at (or crosses) the end of a page then we know
+	 * that the first byte of the record is backed by pages and don't need
+	 * to query it.  If instead the record begins at the start of the page
+	 * then we know that querying the last byte is just as good as querying
+	 * the first byte, since records cannot be larger than a page.
+	 *
+	 * If the call returns the same file offset, we know this record is
+	 * backed by real pages.  We do not need to move the cursor.
+	 */
+	new_pos = xfile_seek_data(array->xfile, end_pos);
+	if (new_pos == -ENXIO)
+		return -ENODATA;
+	if (new_pos < 0)
+		return new_pos;
+	if (new_pos == end_pos)
+		return 0;
+
+	/*
+	 * Otherwise, SEEK_DATA told us how far up to move the file pointer to
+	 * find more data.  Move the array index to the first record past the
+	 * byte offset we were given.
+	 */
+	new_pos = roundup_64(new_pos, array->obj_size);
+	*cur = xfarray_idx(array, new_pos);
+	*pos = xfarray_pos(array, *cur);
+	return 0;
+}
+
+/*
+ * Starting at *idx, fetch the next non-null array entry and advance the index
+ * to set up the next _load_next call.  Returns ENODATA if we reach the end of
+ * the array.  Callers must set @*idx to XFARRAY_CURSOR_INIT before the first
+ * call to this function.
+ */
+int
+xfarray_load_next(
+	struct xfarray	*array,
+	xfarray_idx_t	*idx,
+	void		*rec)
+{
+	xfarray_idx_t	cur = *idx;
+	loff_t		pos = xfarray_pos(array, cur);
+	int		error;
+
+	do {
+		if (cur >= array->nr)
+			return -ENODATA;
+
+		/*
+		 * Ask the backing store for the location of next possible
+		 * written record, then retrieve that record.
+		 */
+		error = xfarray_find_data(array, &cur, &pos);
+		if (error)
+			return error;
+		error = xfarray_load(array, cur, rec);
+		if (error)
+			return error;
+
+		cur++;
+		pos += array->obj_size;
+	} while (xfarray_element_is_null(array, rec));
+
+	*idx = cur;
+	return 0;
+}
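+
+/*
+ * Iteration sketch: walk every non-null record in the array, treating
+ * -ENODATA as the normal end-of-array condition.  Here "handle_rec" and
+ * "struct xrep_rec" are placeholders for the caller's own processing and
+ * record type:
+ *
+ *	struct xrep_rec	rec;
+ *	xfarray_idx_t	cur = XFARRAY_CURSOR_INIT;
+ *	int		error;
+ *
+ *	while ((error = xfarray_load_next(array, &cur, &rec)) == 0)
+ *		handle_rec(&rec);
+ *	if (error == -ENODATA)
+ *		error = 0;
+ */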
diff --git a/fs/xfs/scrub/xfarray.h b/fs/xfs/scrub/xfarray.h
new file mode 100644
index 000000000000..26e2b594f121
--- /dev/null
+++ b/fs/xfs/scrub/xfarray.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2022 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SCRUB_XFARRAY_H__
+#define __XFS_SCRUB_XFARRAY_H__
+
+/* xfile array index type, along with cursor initialization */
+typedef uint64_t		xfarray_idx_t;
+#define XFARRAY_CURSOR_INIT	((__force xfarray_idx_t)0)
+
+/* Iterate each index of an xfile array. */
+#define foreach_xfarray_idx(array, idx) \
+	for ((idx) = XFARRAY_CURSOR_INIT; \
+	     (idx) < xfarray_length(array); \
+	     (idx)++)
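+
+/*
+ * Index-based iteration sketch.  Unlike xfarray_load_next, this visits unset
+ * slots too, so a direct load can hand back an all-zeroes buffer; "struct
+ * xrep_rec" is a placeholder record type:
+ *
+ *	struct xrep_rec	rec;
+ *	xfarray_idx_t	i;
+ *
+ *	foreach_xfarray_idx(array, i) {
+ *		if (xfarray_load(array, i, &rec))
+ *			break;
+ *		if (xfarray_element_is_null(array, &rec))
+ *			continue;
+ *		...
+ *	}
+ */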
+
+struct xfarray {
+	/* Underlying file that backs the array. */
+	struct xfile	*xfile;
+
+	/* Number of array elements. */
+	xfarray_idx_t	nr;
+
+	/* Maximum possible array size. */
+	xfarray_idx_t	max_nr;
+
+	/* Number of unset slots in the array below @nr. */
+	uint64_t	unset_slots;
+
+	/* Size of an array element. */
+	size_t		obj_size;
+
+	/* log2 of array element size, if possible. */
+	int		obj_size_log;
+};
+
+int xfarray_create(struct xfs_mount *mp, const char *descr,
+		unsigned long long required_capacity, size_t obj_size,
+		struct xfarray **arrayp);
+void xfarray_destroy(struct xfarray *array);
+int xfarray_load(struct xfarray *array, xfarray_idx_t idx, void *ptr);
+int xfarray_unset(struct xfarray *array, xfarray_idx_t idx);
+int xfarray_store(struct xfarray *array, xfarray_idx_t idx, const void *ptr);
+int xfarray_store_anywhere(struct xfarray *array, const void *ptr);
+bool xfarray_element_is_null(struct xfarray *array, const void *ptr);
+
+/* Append an element to the array. */
+static inline int xfarray_append(struct xfarray *array, const void *ptr)
+{
+	return xfarray_store(array, array->nr, ptr);
+}
+
+uint64_t xfarray_length(struct xfarray *array);
+int xfarray_load_next(struct xfarray *array, xfarray_idx_t *idx, void *rec);
+
+#endif /* __XFS_SCRUB_XFARRAY_H__ */
diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
new file mode 100644
index 000000000000..43455aa78243
--- /dev/null
+++ b/fs/xfs/scrub/xfile.c
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2022 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_format.h"
+#include "scrub/xfile.h"
+#include "scrub/xfarray.h"
+#include "scrub/scrub.h"
+#include "scrub/trace.h"
+#include <linux/shmem_fs.h>
+
+/*
+ * Swappable Temporary Memory
+ * ==========================
+ *
+ * Online checking sometimes needs to be able to stage a large amount of data
+ * in memory.  This information might not fit in the available memory and it
+ * doesn't all need to be accessible at all times.  In other words, we want an
+ * indexed data buffer to store data that can be paged out.
+ *
+ * When CONFIG_TMPFS=y, shmemfs is enough of a filesystem to meet those
+ * requirements.  Therefore, the xfile mechanism uses an unlinked shmem file to
+ * store our staging data.  This file is not installed in the file descriptor
+ * table so that user programs cannot access the data, which means that the
+ * xfile must be freed with xfile_destroy.
+ *
+ * xfiles assume that the caller will handle all required concurrency
+ * management; standard vfs locks (freezer and inode) are not taken.  Reads
+ * and writes are satisfied directly from the page cache.
+ *
+ * NOTE: The current shmemfs implementation has a quirk that in-kernel reads
+ * of a hole cause a page to be mapped into the file.  If you are going to
+ * create a sparse xfile, please be careful about reading from uninitialized
+ * parts of the file.  These pages are !Uptodate and will eventually be
+ * reclaimed if not written, but in the short term this boosts memory
+ * consumption.
+ */
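+
+/*
+ * A rough usage sketch; @mp comes from the caller, and the buffer contents
+ * and byte offset are arbitrary illustrations:
+ *
+ *	struct xfile	*xf;
+ *	char		buf[64] = "hello";
+ *	int		error;
+ *
+ *	error = xfile_create(mp, "demo data", 0, &xf);
+ *	if (error)
+ *		return error;
+ *	error = xfile_obj_store(xf, buf, sizeof(buf), 512);
+ *	if (!error)
+ *		error = xfile_obj_load(xf, buf, sizeof(buf), 512);
+ *	xfile_destroy(xf);
+ */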
+
+/*
+ * xfiles must not be exposed to userspace and require upper layers to
+ * coordinate access to the one handle returned by the constructor, so
+ * establish a separate lock class for xfiles to avoid confusing lockdep.
+ */
+static struct lock_class_key xfile_i_mutex_key;
+
+/*
+ * Create an xfile of the given size.  The description will be used in the
+ * trace output.
+ */
+int
+xfile_create(
+	struct xfs_mount	*mp,
+	const char		*description,
+	loff_t			isize,
+	struct xfile		**xfilep)
+{
+	char			*fname;
+	struct xfile		*xf;
+	int			error = -ENOMEM;
+
+	xf = kmalloc(sizeof(struct xfile), XCHK_GFP_FLAGS);
+	if (!xf)
+		return -ENOMEM;
+
+	fname = kmalloc(MAXNAMELEN, XCHK_GFP_FLAGS);
+	if (!fname)
+		goto out_xfile;
+
+	snprintf(fname, MAXNAMELEN - 1, "XFS (%s): %s", mp->m_super->s_id,
+			description);
+	fname[MAXNAMELEN - 1] = 0;
+
+	xf->file = shmem_file_setup(fname, isize, 0);
+	if (!xf->file)
+		goto out_fname;
+	if (IS_ERR(xf->file)) {
+		error = PTR_ERR(xf->file);
+		goto out_fname;
+	}
+
+	/*
+	 * We want a large sparse file that we can pread, pwrite, and seek.
+	 * xfile users are responsible for keeping the xfile hidden away from
+	 * all other callers, so we skip timestamp updates and security checks.
+	 */
+	xf->file->f_mode |= FMODE_PREAD | FMODE_PWRITE | FMODE_NOCMTIME |
+			    FMODE_LSEEK;
+	xf->file->f_flags |= O_RDWR | O_LARGEFILE | O_NOATIME;
+	xf->file->f_inode->i_flags |= S_PRIVATE | S_NOCMTIME | S_NOATIME;
+
+	lockdep_set_class(&file_inode(xf->file)->i_rwsem, &xfile_i_mutex_key);
+
+	trace_xfile_create(mp, xf);
+
+	kfree(fname);
+	*xfilep = xf;
+	return 0;
+out_fname:
+	kfree(fname);
+out_xfile:
+	kfree(xf);
+	return error;
+}
+
+/* Close the file and release all resources. */
+void
+xfile_destroy(
+	struct xfile		*xf)
+{
+	struct inode		*inode = file_inode(xf->file);
+
+	trace_xfile_destroy(xf);
+
+	lockdep_set_class(&inode->i_rwsem, &inode->i_sb->s_type->i_mutex_key);
+	fput(xf->file);
+	kfree(xf);
+}
+
+/*
+ * Read a memory object directly from the xfile's page cache.  Unlike regular
+ * pread, we return -E2BIG and -EFBIG for reads that are too large or at too
+ * high an offset, instead of truncating the read.  Otherwise, we return
+ * bytes read or an error code, like regular pread.
+ */
+ssize_t
+xfile_pread(
+	struct xfile		*xf,
+	void			*buf,
+	size_t			count,
+	loff_t			pos)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	struct page		*page = NULL;
+	ssize_t			read = 0;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_pread(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*p, *kaddr;
+		unsigned int	len;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * In-kernel reads of a shmem file cause it to allocate a page
+		 * if the mapping shows a hole.  Therefore, if we hit ENOMEM
+		 * we can continue by zeroing the caller's buffer.
+		 */
+		page = shmem_read_mapping_page_gfp(mapping, pos >> PAGE_SHIFT,
+				__GFP_NOWARN);
+		if (IS_ERR(page)) {
+			error = PTR_ERR(page);
+			if (error != -ENOMEM)
+				break;
+
+			memset(buf, 0, len);
+			goto advance;
+		}
+
+		if (PageUptodate(page)) {
+			/*
+			 * xfile pages must never be mapped into userspace, so
+			 * we skip the dcache flush.
+			 */
+			kaddr = kmap_local_page(page);
+			p = kaddr + offset_in_page(pos);
+			memcpy(buf, p, len);
+			kunmap_local(kaddr);
+		} else {
+			memset(buf, 0, len);
+		}
+		put_page(page);
+
+advance:
+		count -= len;
+		pos += len;
+		buf += len;
+		read += len;
+	}
+	memalloc_nofs_restore(pflags);
+
+	if (read > 0)
+		return read;
+	return error;
+}
+
+/*
+ * Write a memory object directly to the xfile's page cache.  Unlike regular
+ * pwrite, we return -E2BIG and -EFBIG for writes that are too large or at too
+ * high an offset, instead of truncating the write.  Otherwise, we return
+ * bytes written or an error code, like regular pwrite.
+ */
+ssize_t
+xfile_pwrite(
+	struct xfile		*xf,
+	const void		*buf,
+	size_t			count,
+	loff_t			pos)
+{
+	struct inode		*inode = file_inode(xf->file);
+	struct address_space	*mapping = inode->i_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	struct page		*page = NULL;
+	ssize_t			written = 0;
+	unsigned int		pflags;
+	int			error = 0;
+
+	if (count > MAX_RW_COUNT)
+		return -E2BIG;
+	if (inode->i_sb->s_maxbytes - pos < count)
+		return -EFBIG;
+
+	trace_xfile_pwrite(xf, pos, count);
+
+	pflags = memalloc_nofs_save();
+	while (count > 0) {
+		void		*fsdata = NULL;
+		void		*p, *kaddr;
+		unsigned int	len;
+		int		ret;
+
+		len = min_t(ssize_t, count, PAGE_SIZE - offset_in_page(pos));
+
+		/*
+		 * We call write_begin directly here to avoid all the freezer
+		 * protection lock-taking that happens in the normal path.
+		 * shmem doesn't support fs freeze, but lockdep doesn't know
+		 * that and will trip over it.
+		 */
+		error = aops->write_begin(NULL, mapping, pos, len, &page,
+				&fsdata);
+		if (error)
+			break;
+
+		/*
+		 * xfile pages must never be mapped into userspace, so we skip
+		 * the dcache flush.  If the page is not uptodate, zero it
+		 * before writing data.
+		 */
+		kaddr = kmap_local_page(page);
+		if (!PageUptodate(page)) {
+			memset(kaddr, 0, PAGE_SIZE);
+			SetPageUptodate(page);
+		}
+		p = kaddr + offset_in_page(pos);
+		memcpy(p, buf, len);
+		kunmap_local(kaddr);
+
+		ret = aops->write_end(NULL, mapping, pos, len, len, page,
+				fsdata);
+		if (ret < 0) {
+			error = ret;
+			break;
+		}
+
+		written += ret;
+		if (ret != len)
+			break;
+
+		count -= ret;
+		pos += ret;
+		buf += ret;
+	}
+	memalloc_nofs_restore(pflags);
+
+	if (written > 0)
+		return written;
+	return error;
+}
+
+/* Find the next written area in the xfile data for a given offset. */
+loff_t
+xfile_seek_data(
+	struct xfile		*xf,
+	loff_t			pos)
+{
+	loff_t			ret;
+
+	ret = vfs_llseek(xf->file, pos, SEEK_DATA);
+	trace_xfile_seek_data(xf, pos, ret);
+	return ret;
+}
+
+/* Query stat information for an xfile. */
+int
+xfile_stat(
+	struct xfile		*xf,
+	struct xfile_stat	*statbuf)
+{
+	struct kstat		ks;
+	int			error;
+
+	error = vfs_getattr_nosec(&xf->file->f_path, &ks,
+			STATX_SIZE | STATX_BLOCKS, AT_STATX_DONT_SYNC);
+	if (error)
+		return error;
+
+	statbuf->size = ks.size;
+	statbuf->bytes = ks.blocks << SECTOR_SHIFT;
+	return 0;
+}
diff --git a/fs/xfs/scrub/xfile.h b/fs/xfs/scrub/xfile.h
new file mode 100644
index 000000000000..b37dba1961d8
--- /dev/null
+++ b/fs/xfs/scrub/xfile.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2022 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef __XFS_SCRUB_XFILE_H__
+#define __XFS_SCRUB_XFILE_H__
+
+struct xfile {
+	struct file		*file;
+};
+
+int xfile_create(struct xfs_mount *mp, const char *description, loff_t isize,
+		struct xfile **xfilep);
+void xfile_destroy(struct xfile *xf);
+
+ssize_t xfile_pread(struct xfile *xf, void *buf, size_t count, loff_t pos);
+ssize_t xfile_pwrite(struct xfile *xf, const void *buf, size_t count,
+		loff_t pos);
+
+/*
+ * Load an object.  Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_load(struct xfile *xf, void *buf, size_t count, loff_t pos)
+{
+	ssize_t	ret = xfile_pread(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+/*
+ * Store an object.  Since we're treating this file as "memory", any error or
+ * short IO is treated as a failure to allocate memory.
+ */
+static inline int
+xfile_obj_store(struct xfile *xf, const void *buf, size_t count, loff_t pos)
+{
+	ssize_t	ret = xfile_pwrite(xf, buf, count, pos);
+
+	if (ret < 0 || ret != count)
+		return -ENOMEM;
+	return 0;
+}
+
+loff_t xfile_seek_data(struct xfile *xf, loff_t pos);
+
+struct xfile_stat {
+	loff_t			size;
+	unsigned long long	bytes;
+};
+
+int xfile_stat(struct xfile *xf, struct xfile_stat *statbuf);
+
+#endif /* __XFS_SCRUB_XFILE_H__ */


^ permalink raw reply related	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2023-07-28  4:39 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-26  0:00 [MEGAPATCHSET v25 1/2] xfs: online repair, part 1 Darrick J. Wong
2023-05-26  0:28 ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
2023-05-26  0:47   ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
2023-05-26  1:34     ` Kent Overstreet
2023-05-26  3:19       ` Darrick J. Wong
2023-06-22  2:55     ` Dave Chinner
2023-07-05 23:48       ` Darrick J. Wong
2023-05-26  0:47   ` [PATCH 2/7] xfs: enable sorting of xfile-backed arrays Darrick J. Wong
2023-05-26  0:47   ` [PATCH 3/7] xfs: convert xfarray insertion sort to heapsort using scratchpad memory Darrick J. Wong
2023-05-26  0:47   ` [PATCH 4/7] xfs: teach xfile to pass back direct-map pages to caller Darrick J. Wong
2023-05-26  0:48   ` [PATCH 5/7] xfs: speed up xfarray sort by sorting xfile page contents directly Darrick J. Wong
2023-05-26  0:48   ` [PATCH 6/7] xfs: cache pages used for xfarray quicksort convergence Darrick J. Wong
2023-05-26  0:48   ` [PATCH 7/7] xfs: improve xfarray quicksort pivot Darrick J. Wong
2023-06-22  2:58   ` [PATCHSET v25.0 0/7] xfs: stage repair information in pageable memory Dave Chinner
2023-05-26  0:32 ` [PATCHSET v25.0 0/9] xfs: support in-memory btrees Darrick J. Wong
2023-05-26  1:04   ` [PATCH 1/9] xfs: dump xfiles for debugging purposes Darrick J. Wong
2023-05-26  1:05   ` [PATCH 2/9] xfs: teach buftargs to maintain their own buffer hashtable Darrick J. Wong
2023-05-26  1:05   ` [PATCH 3/9] xfs: create buftarg helpers to abstract block_device operations Darrick J. Wong
2023-05-26  1:05   ` [PATCH 4/9] xfs: make GFP_ usage consistent when allocating buftargs Darrick J. Wong
2023-05-26  1:05   ` [PATCH 5/9] xfs: support in-memory buffer cache targets Darrick J. Wong
2023-05-26  1:06   ` [PATCH 6/9] xfs: consolidate btree block freeing tracepoints Darrick J. Wong
2023-05-26  1:06   ` [PATCH 7/9] xfs: consolidate btree block allocation tracepoints Darrick J. Wong
2023-05-26  1:06   ` [PATCH 8/9] xfs: support in-memory btrees Darrick J. Wong
2023-05-26  1:06   ` [PATCH 9/9] xfs: connect in-memory btrees to xfiles Darrick J. Wong
2023-05-26  0:34 ` [PATCHSET v25.0 00/25] xfs: atomic file updates Darrick J. Wong
2023-05-26  1:14   ` [PATCH 01/25] xfs: add a libxfs header file for staging new ioctls Darrick J. Wong
2023-05-26  1:14   ` [PATCH 02/25] xfs: introduce new file range exchange ioctl Darrick J. Wong
2023-05-26  1:15   ` [PATCH 03/25] xfs: move inode lease breaking functions to xfs_inode.c Darrick J. Wong
2023-05-26  1:15   ` [PATCH 04/25] xfs: move xfs_iops.c declarations out of xfs_inode.h Darrick J. Wong
2023-05-26  1:15   ` [PATCH 05/25] xfs: declare xfs_file.c symbols in xfs_file.h Darrick J. Wong
2023-05-26  1:16   ` [PATCH 06/25] xfs: create a new helper to return a file's allocation unit Darrick J. Wong
2023-05-26  1:16   ` [PATCH 07/25] xfs: refactor non-power-of-two alignment checks Darrick J. Wong
2023-05-26  1:16   ` [PATCH 08/25] xfs: parameterize all the incompat log feature helpers Darrick J. Wong
2023-05-26  1:16   ` [PATCH 09/25] xfs: create a log incompat flag for atomic extent swapping Darrick J. Wong
2023-05-26  1:17   ` [PATCH 10/25] xfs: introduce a swap-extent log intent item Darrick J. Wong
2023-05-26  1:17   ` [PATCH 11/25] xfs: create deferred log items for extent swapping Darrick J. Wong
2023-05-26  1:17   ` [PATCH 12/25] xfs: enable xlog users to toggle atomic " Darrick J. Wong
2023-05-26  1:17   ` [PATCH 13/25] xfs: bind the xfs-specific extent swap code to the vfs-generic file exchange code Darrick J. Wong
2023-05-26  1:18   ` [PATCH 14/25] xfs: add error injection to test swapext recovery Darrick J. Wong
2023-05-26  1:18   ` [PATCH 15/25] xfs: port xfs_swap_extents_rmap to our new code Darrick J. Wong
2023-05-26  1:18   ` [PATCH 16/25] xfs: consolidate all of the xfs_swap_extent_forks code Darrick J. Wong
2023-05-26  1:19   ` [PATCH 17/25] xfs: port xfs_swap_extent_forks to use xfs_swapext_req Darrick J. Wong
2023-05-26  1:26   ` [PATCH 18/25] xfs: allow xfs_swap_range to use older extent swap algorithms Darrick J. Wong
2023-05-26  1:26   ` [PATCH 19/25] xfs: remove old swap extents implementation Darrick J. Wong
2023-05-26  1:27   ` [PATCH 20/25] xfs: condense extended attributes after an atomic swap Darrick J. Wong
2023-05-26  1:27   ` [PATCH 21/25] xfs: condense directories " Darrick J. Wong
2023-05-26  1:27   ` [PATCH 22/25] xfs: condense symbolic links " Darrick J. Wong
2023-05-26  1:28   ` [PATCH 23/25] xfs: make atomic extent swapping support realtime files Darrick J. Wong
2023-05-26  1:28   ` [PATCH 24/25] xfs: support non-power-of-two rtextsize with exchange-range Darrick J. Wong
2023-05-26  1:28   ` [PATCH 25/25] xfs: enable atomic swapext feature Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2023-07-27 22:19 [PATCHSET v26.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
2023-07-27 22:25 ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong
2023-07-28  3:10   ` Matthew Wilcox
2023-07-28  4:39     ` Darrick J. Wong
2022-12-30 22:12 [PATCHSET v24.0 0/7] xfs: stage repair information in pageable memory Darrick J. Wong
2022-12-30 22:12 ` [PATCH 1/7] xfs: create a big array data structure Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).