[PATCH 0/16] enabling GIT_REF_PARANOIA by default

From: Jeff King @ 2021-09-24 18:30 UTC
  To: git; +Cc: Jonathan Tan

I recently ran into a situation where dealing with a corrupted
repository was more confusing than necessary, because Git by default
ignores corrupted refs in many commands.

A while ago we introduced GIT_REF_PARANOIA, which includes broken refs
in iteration. That typically causes later operations to fail (e.g.,
during repacking, you'd prefer to barf loudly when trying to access the
missing object rather than incorrectly assume the objects from the
broken ref aren't reachable).
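As a rough illustration of that difference, here is a toy model in C. It
is a sketch only: the struct, the function name, and the "object_exists"
field are hypothetical and do not correspond to Git's real ref or object
code. It just shows how skipping a broken ref hides the problem, while
including it makes the later operation fail.

```c
#include <stddef.h>

/* Hypothetical toy model; not Git's actual data structures. */
struct toy_ref {
	const char *name;
	int object_exists; /* 0 means the ref points at a missing object */
};

/*
 * Stand-in for an operation such as repacking that must access the
 * object behind every ref the iteration hands to it.
 *
 * Returns 0 on success and stores the number of usable tips in *tips.
 * Returns -1 if it was shown a ref whose object is missing.
 */
int toy_collect_tips(const struct toy_ref *refs, size_t nr,
		     int include_broken, size_t *tips)
{
	size_t i;

	*tips = 0;
	for (i = 0; i < nr; i++) {
		if (!refs[i].object_exists) {
			if (!include_broken)
				continue; /* silently skipped: corruption goes unnoticed */
			return -1;        /* surfaced: the operation fails loudly */
		}
		(*tips)++;
	}
	return 0;
}
```

With one good ref and one broken ref, the non-paranoid call succeeds and
reports a single tip (so anything only reachable from the broken ref
looks unreachable), whereas the paranoid call returns an error.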

I think this is a better default for Git to have in general, not just
for a few select operations (we turn it on by default for pruning and
some repacks). We shouldn't see corruptions in general, and complaining
loudly when we do is the safest option. The reason we held back when the
knob was introduced was mostly out of deference to the historical
behavior.

So this series started as a patch to just flip that default, but I found
some interesting things:

 - there are a couple of tests that get confused. IMHO this is
   vindicating the idea of flipping the default, because in each case
   these tests were poorly written (either corruptions they didn't
   realize they had, or doing questionable operations on an incomplete
   set of refs)

 - the existing GIT_REF_PARANOIA is over-eager to complain about
   dangling symrefs, even though they're perfectly fine

 - as usual, there was some obvious cleanup along the way. ;)
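To make the dangling-symref point concrete, here is a small sketch in C
of how an "omit dangling symrefs" flag could sit alongside an "include
broken" flag. Everything here is a toy: the enum, the TOY_* flag names,
and the assumption that a dangling symref is otherwise treated like a
broken ref are illustrative, not a description of the real refs code.

```c
/* Hypothetical toy model; names do not match Git's internals. */
enum toy_ref_kind {
	TOY_REF_OK,             /* points at an object that exists */
	TOY_REF_BROKEN,         /* points at a missing object */
	TOY_REF_DANGLING_SYMREF /* symref whose target ref does not exist */
};

#define TOY_INCLUDE_BROKEN        (1u << 0)
#define TOY_OMIT_DANGLING_SYMREFS (1u << 1)

/* Return 1 if the iteration should show this ref to the caller. */
int toy_should_show(enum toy_ref_kind kind, unsigned flags)
{
	switch (kind) {
	case TOY_REF_OK:
		return 1;
	case TOY_REF_DANGLING_SYMREF:
		/* Not corruption: drop it if asked, even in paranoid mode. */
		if (flags & TOY_OMIT_DANGLING_SYMREFS)
			return 0;
		/* Assumption: otherwise handled like a broken ref. */
		return !!(flags & TOY_INCLUDE_BROKEN);
	case TOY_REF_BROKEN:
		return !!(flags & TOY_INCLUDE_BROKEN);
	}
	return 0;
}
```

In this model, setting both flags still surfaces a truly broken ref but
no longer complains about a symref that merely points at a ref that does
not exist.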

Even if you don't buy the argument that we should flip the default, I
think everything up through patch 11 is a worthwhile cleanup on its own.

Note that this conflicts with jt/no-abuse-alternate-odb-for-submodules,
since it is touching the innards of DO_FOR_EACH_REF_INCLUDE_BROKEN, too.
I left a note on that series about how I think that could be reconciled
(i.e., the conflict is just around how the code is written, and not
inherent to the goals).

In the end I left GIT_REF_PARANOIA as a knob, just defaulting to "1". I
think it's possibly useful as an escape hatch when dealing with a
corrupt repo. But we _could_ go all the way and basically drop
DO_FOR_EACH_REF_INCLUDE_BROKEN's do-we-have-the-object check entirely.
That would totally sever the relationship between the ref store and the
object store, which would make things conceptually a lot simpler (and
which I saw discussed in some of those earlier threads).
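For readers unfamiliar with this kind of knob, a minimal sketch in C of
an environment variable that defaults to "on" might look like the
following. The helper names are hypothetical, and the parsing is
deliberately simplified (only the literal string "0" turns it off); it
is not meant to mirror how Git actually parses boolean environment
variables.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: interpret the value of a default-on knob.
 * NULL means the variable is unset, so the default ("1") applies.
 * Simplifying assumption: only the literal string "0" disables it.
 */
int toy_paranoia_from_value(const char *value)
{
	if (!value)
		return 1;
	return strcmp(value, "0") != 0;
}

/* Read the knob from the environment; set it to "0" as an escape hatch. */
int toy_ref_paranoia(void)
{
	return toy_paranoia_from_value(getenv("GIT_REF_PARANOIA"));
}
```

Under this sketch, leaving the variable unset behaves the same as
setting it to "1", and explicitly setting it to "0" restores the old,
more lenient behavior.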

Just a breakdown of the series:

  [01/16]: t7900: clean up some more broken refs
  [02/16]: t5516: don't use HEAD ref for invalid ref-deletion tests
  [03/16]: t5600: provide detached HEAD for corruption failures
  [04/16]: t5312: drop "verbose" helper
  [05/16]: t5312: create bogus ref as necessary
  [06/16]: t5312: test non-destructive repack
  [07/16]: t5312: be more assertive about command failure

     Test cleanups. Necessary for the default flip, but I think each
     stands on its own.

  [08/16]: refs-internal.h: move DO_FOR_EACH_* flags next to each other
  [09/16]: refs-internal.h: reorganize DO_FOR_EACH_* flag documentation

     Cleanup of existing features.

  [10/16]: refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag
  [11/16]: refs: omit dangling symrefs when using GIT_REF_PARANOIA

     Fixing the current over-eager behavior of GIT_REF_PARANOIA.

  [12/16]: refs: turn on GIT_REF_PARANOIA by default

     The actual flip.

  [13/16]: repack, prune: drop GIT_REF_PARANOIA settings
  [14/16]: ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN
  [15/16]: ref-filter: drop broken-ref code entirely
  [16/16]: refs: drop "broken" flag from for_each_fullref_in()

     Some small cleanups we can do as a result.

 Documentation/git.txt         | 19 ++++++------
 builtin/branch.c              |  2 +-
 builtin/for-each-ref.c        |  2 +-
 builtin/prune.c               |  1 -
 builtin/repack.c              |  3 --
 builtin/rev-parse.c           |  4 +--
 cache.h                       |  8 -----
 environment.c                 |  1 -
 ls-refs.c                     |  2 +-
 ref-filter.c                  | 22 ++++++--------
 ref-filter.h                  |  1 -
 refs.c                        | 42 +++++++++++++-------------
 refs.h                        |  9 ++----
 refs/files-backend.c          |  5 ++++
 refs/refs-internal.h          | 56 ++++++++++++++++++++++-------------
 revision.c                    |  2 +-
 t/t1430-bad-ref-name.sh       |  2 +-
 t/t5312-prune-corruption.sh   | 48 ++++++++++++++++++++++--------
 t/t5516-fetch-push.sh         | 19 ++++++------
 t/t5600-clone-fail-cleanup.sh |  4 ++-
 t/t7900-maintenance.sh        |  6 +++-
 21 files changed, 142 insertions(+), 116 deletions(-)

-Peff
