git.vger.kernel.org archive mirror
From: Elijah Newren <newren@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Sam James via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Sam James <sam@gentoo.org>
Subject: Re: [PATCH] diff: implement config.diff.renames=copies-harder
Date: Tue, 7 Nov 2023 09:19:48 -0800
Message-ID: <CABPp-BF9iUkF+g_w7wLATFTmjfJ3f1hsBr+zXxNZEcq-XiNOWg@mail.gmail.com>
In-Reply-To: <xmqq7cmu9s29.fsf@gitster.g>

On Mon, Nov 6, 2023 at 7:10 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Fri, Nov 3, 2023 at 4:25 AM Sam James via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Sam James <sam@gentoo.org>
> >>
> >> This patch adds a config value for 'diff.renames' called 'copies-harder'
> >> which makes it so '-C -C' is in effect always passed for 'git log -p',
> >> 'git diff', etc.
> >>
> >> This allows specifying that 'git log -p', 'git diff', etc should always act
> >> as if '-C --find-copies-harder' was passed.
> >>
> >> I've found this especially useful for certain types of repository (like
> >> Gentoo's ebuild repositories) because files are often copies of a previous
> >> version.
> >
> > These must be very small repositories?  --find-copies-harder is really
> > expensive...
>
> True.  "often copies of a previous version" means that it is a
> directory that has a collection of subdirectories, one for each
> version?  In a source tree managed in a version control system,
> files are often rewritten in place from the previous version,
> so I am puzzled by that justification.
>
> It is, in the proposed log message of our commits, a bit unusual to
> see "This patch does X" and "I do Y", by the way, which made my
> reading hiccup a bit, but perhaps it is just me?

I think I read Sam's description a bit differently than you.  My
assumption was they'd have files with names like the following in the
same directory:
   gcc-13.x.build.recipe
   gcc-12.x.build.recipe
   gcc-11.x.build.recipe
   gcc-10.x.build.recipe

And that gcc-13.x.build.recipe was started as a copy of
gcc-12.x.build.recipe (which was started as a copy of
gcc-11.x.build.recipe, etc.).  They keep all versions because they
want users to be able to build and install multiple gcc versions.

I could be completely off, but that's what I was imagining from the description.

> >> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
> >> index bd5ae0c3378..d2ff3c62d41 100644
> >> --- a/Documentation/config/diff.txt
> >> +++ b/Documentation/config/diff.txt
> >> @@ -131,7 +131,8 @@ diff.renames::
> >>         Whether and how Git detects renames.  If set to "false",
> >>         rename detection is disabled. If set to "true", basic rename
> >>         detection is enabled.  If set to "copies" or "copy", Git will
> >> -       detect copies, as well.  Defaults to true.  Note that this
> >> +       detect copies, as well.  If set to "copies-harder", Git will try harder
> >> +       to detect copies.  Defaults to true.  Note that this
> >
> > "try harder to detect copies" feels like an unhelpful explanation.
>
> Yup.  "will spend extra cycles to find more copies", perhaps?

I find that marginally better; but I still don't think it answers the
user's question of why they should pick one option or the other.  The
wording for the `--find-copies-harder` option does explain when it's useful:

        For performance reasons, by default, `-C` option finds copies only
        if the original file of the copy was modified in the same
        changeset.  This flag makes the command
        inspect unmodified files as candidates for the source of
        copy.  This is a very expensive operation for large
        projects, so use it with caution.

We probably don't want to copy all three of those sentences here, but
I think we need to make sure users can find them, thus my suggestion
to reference the `--find-copies-harder` option to git-diff so that
affected users can get the info they need to choose.
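
To make the difference concrete, here is a minimal sketch of the case
described above: a new recipe file is added as a copy of an existing one,
and the original is left untouched in that commit. The throwaway
repository, the file contents, and the variable names `plain` and `harder`
are assumptions made up for illustration; only `-C`,
`--find-copies-harder`, and `--name-status` are existing git-diff options.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical layout, loosely modeled on the
# gcc-*.build.recipe example above).
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# First commit: a single recipe with enough content for similarity scoring.
seq 1 200 > gcc-12.x.build.recipe
git add .
git commit -q -m "add gcc-12 recipe"

# Second commit: the new recipe starts as a copy of the old one;
# the old recipe itself is NOT modified in this commit.
cp gcc-12.x.build.recipe gcc-13.x.build.recipe
echo "extra line" >> gcc-13.x.build.recipe
git add .
git commit -q -m "add gcc-13 recipe"

# -C alone only considers files modified in the same commit as copy
# sources, so the new file is reported as a plain addition (status A).
plain=$(git diff -C --name-status HEAD~1 HEAD)
echo "$plain"

# --find-copies-harder also inspects unmodified files, so the new file
# is reported as a copy (status C) of gcc-12.x.build.recipe.
harder=$(git diff -C --find-copies-harder --name-status HEAD~1 HEAD)
echo "$harder"
```

The `copies-harder` value proposed in this patch would make the second
behavior the default for 'git diff', 'git log -p', etc.; the sketch uses
the command-line flags only and does not assume that config value is
available.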
