git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Elijah Newren <newren@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)
Date: Tue, 28 Sep 2021 19:40:14 -0400	[thread overview]
Message-ID: <YVOn3hDsb5pnxR53@coredump.intra.peff.net> (raw)
In-Reply-To: <CABPp-BGuzd_TH57-1RvwJQD5r3S3ZkJcuiPnU8aWee8pgzUBEw@mail.gmail.com>

On Mon, Sep 27, 2021 at 11:46:40PM -0700, Elijah Newren wrote:

> > * en/remerge-diff (2021-08-31) 7 commits
> >  - doc/diff-options: explain the new --remerge-diff option
> >  - show, log: provide a --remerge-diff capability
> >  - tmp-objdir: new API for creating and removing primary object dirs
> >  - merge-ort: capture and print ll-merge warnings in our preferred fashion
> >  - ll-merge: add API for capturing warnings in a strbuf instead of stderr
> >  - merge-ort: add ability to record conflict messages in a file
> >  - merge-ort: mark a few more conflict messages as omittable
> >
> >  A new presentation for two-parent merge "--remerge-diff" can be
> >  used to show the difference between mechanical (and possibly
> >  conflicted) merge results and the recorded resolution.
> >
> >  Will merge to 'next'?
> 
> It has been a month that it's been cooking with no issues brought up,
> and it's been in production for nearly a year...
> 
> But just this morning I pinged peff and jrnieder if they might have
> time to respectively look at the tmp-objdir stuff (patch 5, plus its
> integration into log-tree.c in patch 7) and the ll-merge.[ch] changes
> (patch 3).  I don't know if either will have time to do it, but
> perhaps wait half a week or so to see if they'll mention they have
> time?  Otherwise, yeah, it's probably time to merge this down.

Sorry to take so long. I think this is a very exciting topic, and I
appreciate being called into to look at tmp-objdir stuff, because it can
be quite subtle.

I just left a rather long-ish mail in the thread, but the gist of it is
that I'm worried that there's some possibility of corruption there if
the diff code writes objects. I didn't do a proof-of-concept there, but
I worked one up just now. Try this:

  # make a repo
  git init repo
  cd repo
  echo base >file
  git add file
  git commit -m base

  # two diverging branches
  echo main >file
  git commit -am main
  git checkout -b side HEAD^
  echo side >file
  git commit -am side

  # we get a conflict, and we resolve
  git merge main
  echo resolved >file
  git commit -am merged

  # now imagine we had a file with a diff driver. I stuffed it
  # in here after the fact, but it could have been here all along,
  # or come as part of the merge, etc.
  echo whatever >unrelated
  echo "unrelated diff=foo" >.gitattributes
  git add .
  git commit --amend --no-edit

  # set up the diff driver not just to do a textconv, but to cache the
  # result. This will require writing out new objects for the cache
  # as part of the diff operation.
  git config diff.foo.textconv "$PWD/upcase"
  git config diff.foo.cachetextconv true
  cat >upcase <<\EOF &&
  #!/bin/sh
  tr a-z A-Z <$1
  EOF
  chmod +x upcase

  # now show the diff
  git log -1 --remerge-diff

  # and make sure the repo is still OK
  git fsck

The remerge diff will correctly show the textconv'd content (because
it's not in either parent, and hence an evil merge):

  diff --git a/unrelated b/unrelated
  new file mode 100644
  index 0000000..982793c
  --- /dev/null
  +++ b/unrelated
  @@ -0,0 +1 @@
  +WHATEVER

but then fsck shows that our cache is corrupted:

  Checking object directories: 100% (256/256), done.
  error: refs/notes/textconv/foo: invalid sha1 pointer 1e9b3ecffca73411001043fe914a7ec0e7291043
  error: refs/notes/textconv/foo: invalid reflog entry 1e9b3ecffca73411001043fe914a7ec0e7291043
  dangling tree 569b6e414203d2f50bdaafc1585eae11e28afc37

Now I'll admit the textconv-cache is a pretty seldom-used feature. And
that there _probably_ aren't a lot of other ways that the diff code
would try to write objects or references. But I think it speaks to the
kind of subtle interactions we should worry about (and certainly
corrupting the repository is a pretty bad outcome, though at least in
this case it's "just" a cache and we could blow away that ref manually).

-Peff

  parent reply	other threads:[~2021-09-28 23:40 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-28  0:52 What's cooking in git.git (Sep 2021, #08; Mon, 27) Junio C Hamano
2021-09-28  1:57 ` Ævar Arnfjörð Bjarmason
2021-09-28 20:52   ` Junio C Hamano
2021-09-28  6:46 ` Elijah Newren
2021-09-28  7:45   ` Ævar Arnfjörð Bjarmason
2021-09-28 17:25     ` Junio C Hamano
2021-09-28 21:00       ` Neeraj Singh
2021-09-28 23:34         ` Junio C Hamano
2021-09-28 23:53           ` Neeraj Singh
2021-10-07 22:01             ` Junio C Hamano
2021-10-08  6:51               ` Elijah Newren
2021-10-08 22:30                 ` Neeraj Singh
2021-10-08 23:01                 ` Junio C Hamano
2021-09-28  8:07   ` Ævar Arnfjörð Bjarmason
2021-09-28 17:27     ` Junio C Hamano
2021-09-28 13:31   ` Derrick Stolee
2021-09-28 17:33     ` Junio C Hamano
2021-09-28 20:16       ` Derrick Stolee
2021-09-28 17:16   ` Junio C Hamano
2021-09-29  6:42     ` Elijah Newren
2021-09-28 23:40   ` Jeff King [this message]
2021-09-28 23:49     ` Jeff King
2021-09-29 18:43     ` Neeraj Singh
2021-09-30  8:16       ` Jeff King
2021-10-01  7:50         ` Elijah Newren
2021-10-01 17:02           ` Junio C Hamano
2021-10-01 17:39             ` Neeraj Singh
2021-10-01 18:15               ` Elijah Newren
2021-10-01 18:12             ` Elijah Newren
2021-10-01 22:02               ` Junio C Hamano
2021-10-01 23:05                 ` Elijah Newren
2021-10-04 13:45     ` Elijah Newren
2021-09-28  8:22 ` da/difftool (was: Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)) Ævar Arnfjörð Bjarmason
2021-09-28  8:23 ` ns/batched-fsync & en/remerge-diff (was " Ævar Arnfjörð Bjarmason
2021-09-28  8:31 ` sg/test-split-index-fix " Ævar Arnfjörð Bjarmason
2021-09-28  8:35 ` hn/reftable (Re: " Ævar Arnfjörð Bjarmason
2021-09-28 12:18   ` Han-Wen Nienhuys
2021-09-30  5:06     ` Carlo Arenas
2021-09-29  8:12 ` What's cooking in git.git (Sep 2021, #08; Mon, 27) Fabian Stelzer
2021-09-30 21:26   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YVOn3hDsb5pnxR53@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).