From: Elijah Newren <newren@gmail.com>
To: "Priedhorsky, Reid" <reidpr@lanl.gov>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: bug? round-trip through fast-import/fast-export loses files
Date: Mon, 20 Mar 2023 18:57:21 -0700
Message-ID: <CABPp-BEG+vp-UcpVfcZecPBnfcuTjO6JYCo7wEU5ZrDUHBUd9g@mail.gmail.com>
In-Reply-To: <BBB169A5-0665-47C9-819B-6409A22AB699@lanl.gov>

Hi,

On Mon, Mar 20, 2023 at 11:23 AM Priedhorsky, Reid <reidpr@lanl.gov> wrote:
>
> Hello,
>
> I believe I’ve found a bug in Git. It seems that (1) round-tripping through
> fast-export/fast-import a repository (2) that contains a commit that changes
> a file to a directory (3) deletes the contents of that directory from the
> repository.
>
> Thank you for filling out a Git bug report!
> Please answer the following questions to help us understand your issue.
>
> What did you do before the bug happened? (Steps to reproduce your issue)
>
> Run this shell script:
>
> ~~~~
> #!/bin/bash
>
> set -ex
>
> mkdir -p /tmp/weirdal
> cd /tmp/weirdal
> git --version
>
> # init repo
> rm -Rf wd
> mkdir wd
> cd wd
> git init -b main
>
> # first commit - foo is a file
> touch foo
> git add -A
> git commit -m 'file'
>
> # second commit - foo is a directory
> rm foo
> mkdir foo
> touch foo/bar
> git add -A
> git commit -m 'directory'
>
> # the contents of foo are in the working dir and the repo
> git status
> ls -lR
> git ls-tree --name-only -r HEAD
>
> # import/export repository (add --full-tree to work around bug)
> git fast-export --no-data -- --all > ../export
> cat ../export
> git fast-import --force --quiet < ../export
>
> # bug: foo is still in the WD but not the repo; should still be both
> git status
> ls -lR
> git ls-tree --name-only -r HEAD
> #git fast-export --no-data -- --all | diff -u --text ../export - || true
> ~~~~
>
> What did you expect to happen? (Expected behavior)
>
> Repo should be unchanged, i.e.:
>
> + git status
> On branch main
> nothing to commit, working tree clean
>
> What happened instead? (Actual behavior)
>
> Git thinks foo/bar has been staged:
>
> + git status
> On branch main
> Changes to be committed:
> (use "git restore --staged <file>..." to unstage)
> new file: foo/bar
>
> What's different between what you expected and what actually happened?
>
> File foo/bar is staged when it should be unchanged.
>
> Anything else you want to add:
>
> This also happens in 2.38.1 built from source.
>
> The bad behavior can be worked around with “--full-tree” on fast-export, but
> the real repo where I want to do this is pretty large, so I’d prefer not to.
>
> Note the “git fast-export” output:
>
> commit refs/heads/main
> mark :2
> author Reid Priedhorsky <reidpr@lanl.gov> 1679330805 -0600
> committer Reid Priedhorsky <reidpr@lanl.gov> 1679330805 -0600
> data 10
> directory
> from :1
> M 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 foo/bar
> D foo
>
> It looks to me like the “M ... foo/bar” is being processed before “D foo”
> when it should happen in the opposite order.

Thanks for the well-written bug report, including not only a test case
but even the relevant bits of the fast-export output. I thought I had
fixed D/F issues in fast-export & fast-import before, and indeed a
search turns up both of

    253fb5f889 (fast-import: Improve robustness when D->F changes provided
    in wrong order, 2010-07-09)
    060df62422 (fast-export: Fix output order of D/F changes, 2010-07-09)

However, it looks like both of those only considered D->F (directory
becomes a file) changes, whereas you specifically have a case of F->D
(file becomes a directory).

Honestly, looking back at those two patches of mine, I think both were
rather suboptimal. A better solution that would handle both F->D and
D->F would be having fast-export sort the diff_filepairs such that it
processes the deletes before the modifies. Another improved solution
would be having fast-import sort the files given to it and handle
deletes first. Either should fix this.
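
To illustrate the ordering idea only, here is a minimal shell sketch.
It is not the proposed patch (that would live in the C code of
fast-export or fast-import), and `deletes_first` is a hypothetical
helper name, not part of Git. It assumes its input is just the
file-change lines of a single commit, one per line, and moves the
"D <path>" lines ahead of everything else while keeping the relative
order within each group.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): reorder the file-change lines of
# ONE commit so that deletions come before all other changes.
# Assumption: stdin holds only change lines such as "M <mode> <oid> <path>"
# or "D <path>", never a whole fast-export stream.
deletes_first() {
    awk '
        /^D / { print; next }        # emit deletions right away
        { rest[n++] = $0 }           # buffer every other change line
        END { for (i = 0; i < n; i++) print rest[i] }
    '
}

# The two change lines from the report, in the problematic order.
printf '%s\n' \
    'M 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 foo/bar' \
    'D foo' | deletes_first
# prints:
#   D foo
#   M 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 foo/bar
```

This should not be piped over a real export stream as is: commit
messages inside `data` blocks can contain lines that look like change
commands, so a real fix has to sort before the stream is written (or
after it is parsed), as described above.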

Might be a good task for a new contributor. Any takers? (Tagging as
#leftoverbits.)