All of lore.kernel.org
 help / color / mirror / Atom feed
* git-cherry doesn't detect a "copied" commit
@ 2015-10-20 16:32 Francis Moreau
  2015-10-23  6:47 ` Francis Moreau
  2015-10-23 10:57 ` Kevin Daudt
  0 siblings, 2 replies; 6+ messages in thread
From: Francis Moreau @ 2015-10-20 16:32 UTC (permalink / raw)
  To: git

Hi,

I'm seeing something odd with git-cherry: it doesn't seem to detect
that a commit has been cherry-picked from master branch.

This happens with the systemd git repository (from github) so it
should be fairly simple to reproduce.

What I did:

$ git --version
git version 2.6.0
$ git checkout -b foo v210
$ git cherry-pick -x 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
$ git branch --contains 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
master
$ git cherry master HEAD
+ fef60bf34d1b372bea1db2515a8d936386dfc523

so git-cherry tells me that the cherry-picked commit has not
equivalent in master, which is no the case.

What am I missing ?

Thanks
-- 
Francis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git-cherry doesn't detect a "copied" commit
  2015-10-20 16:32 git-cherry doesn't detect a "copied" commit Francis Moreau
@ 2015-10-23  6:47 ` Francis Moreau
  2015-10-23 10:57 ` Kevin Daudt
  1 sibling, 0 replies; 6+ messages in thread
From: Francis Moreau @ 2015-10-23  6:47 UTC (permalink / raw)
  To: git

Hi,

On Tue, Oct 20, 2015 at 6:32 PM, Francis Moreau <francis.moro@gmail.com> wrote:
> Hi,
>
> I'm seeing something odd with git-cherry: it doesn't seem to detect
> that a commit has been cherry-picked from master branch.
>
> This happens with the systemd git repository (from github) so it
> should be fairly simple to reproduce.
>
> What I did:
>
> $ git --version
> git version 2.6.0
> $ git checkout -b foo v210
> $ git cherry-pick -x 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
> $ git branch --contains 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
> master
> $ git cherry master HEAD
> + fef60bf34d1b372bea1db2515a8d936386dfc523
>
> so git-cherry tells me that the cherry-picked commit has not
> equivalent in master, which is no the case.
>
> What am I missing ?
>

Could anybody give me a hint even if I'm blind and not seeing my
stupid mistake ?

Thanks
-- 
Francis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git-cherry doesn't detect a "copied" commit
  2015-10-20 16:32 git-cherry doesn't detect a "copied" commit Francis Moreau
  2015-10-23  6:47 ` Francis Moreau
@ 2015-10-23 10:57 ` Kevin Daudt
  2015-10-23 12:24   ` Francis Moreau
  1 sibling, 1 reply; 6+ messages in thread
From: Kevin Daudt @ 2015-10-23 10:57 UTC (permalink / raw)
  To: Francis Moreau; +Cc: git

On Tue, Oct 20, 2015 at 06:32:12PM +0200, Francis Moreau wrote:
> Hi,
> 
> I'm seeing something odd with git-cherry: it doesn't seem to detect
> that a commit has been cherry-picked from master branch.
> 
> This happens with the systemd git repository (from github) so it
> should be fairly simple to reproduce.
> 
> What I did:
> 
> $ git --version
> git version 2.6.0
> $ git checkout -b foo v210
> $ git cherry-pick -x 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
> $ git branch --contains 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
> master
> $ git cherry master HEAD
> + fef60bf34d1b372bea1db2515a8d936386dfc523
> 
> so git-cherry tells me that the cherry-picked commit has not
> equivalent in master, which is no the case.
> 
> What am I missing ?
> 

Let's see:

  $ git show | git patch-id 
  50c9f9548e1fd25401ff9540c82c1d5f9723c3d5 b4c86d2965aaf0736e4ab30be1d1a08931009a08

  $ git show 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f | git patch-id
  a5cfbb542882bd9cbe192b43026354d1f2741673 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f

Git patch-id calculates the hash over the diff, and, when gives in this
case two hashes, first the patch-id and the second the commit hash.

The patch-ids are different, explaining why git cherry does not see them
as equivalent. If I take a diff of the diff, I notice something:

   diff --git a/src/udev/udevadm-settle.c b/src/udev/udevadm-settle.c
  -index 927ea2a..65fc35f 100644
  +index b026155..ea9b078 100644
   --- a/src/udev/udevadm-settle.c
   +++ b/src/udev/udevadm-settle.c
   @@ -41,42 +41,28 @@
  @@ -1094,7 +1094,7 @@
                            exit(EXIT_SUCCESS);
   @@ -102,44 +85,13 @@ static int adm_settle(struct udev *udev, int argc, char *argv[])
                    default:
  -                         assert_not_reached("Unknown argument");
  +                         assert_not_reached("Unkown argument");
                    }
   +        }

If you look at the lines with assert_not_reached, it seems there is a
difference in the context of both diffs, causing the diffs to be
different.

Hope this helps, Kevin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git-cherry doesn't detect a "copied" commit
  2015-10-23 10:57 ` Kevin Daudt
@ 2015-10-23 12:24   ` Francis Moreau
  2015-10-23 22:41     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Francis Moreau @ 2015-10-23 12:24 UTC (permalink / raw)
  To: Kevin Daudt; +Cc: git

On Fri, Oct 23, 2015 at 12:57 PM, Kevin Daudt <me@ikke.info> wrote:
> On Tue, Oct 20, 2015 at 06:32:12PM +0200, Francis Moreau wrote:
>> Hi,
>>
>> I'm seeing something odd with git-cherry: it doesn't seem to detect
>> that a commit has been cherry-picked from master branch.
>>
>> This happens with the systemd git repository (from github) so it
>> should be fairly simple to reproduce.
>>
>> What I did:
>>
>> $ git --version
>> git version 2.6.0
>> $ git checkout -b foo v210
>> $ git cherry-pick -x 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
>> $ git branch --contains 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
>> master
>> $ git cherry master HEAD
>> + fef60bf34d1b372bea1db2515a8d936386dfc523
>>
>> so git-cherry tells me that the cherry-picked commit has not
>> equivalent in master, which is no the case.
>>
>> What am I missing ?
>>
>
> Let's see:
>
>   $ git show | git patch-id
>   50c9f9548e1fd25401ff9540c82c1d5f9723c3d5 b4c86d2965aaf0736e4ab30be1d1a08931009a08
>
>   $ git show 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f | git patch-id
>   a5cfbb542882bd9cbe192b43026354d1f2741673 9ea28c55a2488e6cd4a44ac5786f12b71ad5bc9f
>
> Git patch-id calculates the hash over the diff, and, when gives in this
> case two hashes, first the patch-id and the second the commit hash.
>
> The patch-ids are different, explaining why git cherry does not see them
> as equivalent. If I take a diff of the diff, I notice something:
>
>    diff --git a/src/udev/udevadm-settle.c b/src/udev/udevadm-settle.c
>   -index 927ea2a..65fc35f 100644
>   +index b026155..ea9b078 100644
>    --- a/src/udev/udevadm-settle.c
>    +++ b/src/udev/udevadm-settle.c
>    @@ -41,42 +41,28 @@
>   @@ -1094,7 +1094,7 @@
>                             exit(EXIT_SUCCESS);
>    @@ -102,44 +85,13 @@ static int adm_settle(struct udev *udev, int argc, char *argv[])
>                     default:
>   -                         assert_not_reached("Unknown argument");
>   +                         assert_not_reached("Unkown argument");
>                     }
>    +        }
>
> If you look at the lines with assert_not_reached, it seems there is a
> difference in the context of both diffs, causing the diffs to be
> different.
>
> Hope this helps, Kevin

Thanks Kevin

I was mislead by the git-cherry manpage somehow which says:

    "git-cherry therefore
     detects when commits have been "copied" by means of git-cherry-pick(1),

which is not exactly true.

Isn't there a way to get git-cherry less "strict" ?
I mean in this case most of 90% of the diff is the same, only one line
of context is not identical...

Would it make sense to add a "--fuzz" option which would reduce the
diff context area used to generate the hash ?

Thanks.
-- 
Francis

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git-cherry doesn't detect a "copied" commit
  2015-10-23 12:24   ` Francis Moreau
@ 2015-10-23 22:41     ` Junio C Hamano
  2015-10-29 14:44       ` Francis Moreau
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2015-10-23 22:41 UTC (permalink / raw)
  To: Francis Moreau; +Cc: Kevin Daudt, git

Francis Moreau <francis.moro@gmail.com> writes:

> I was mislead by the git-cherry manpage somehow which says:
>
>     "git-cherry therefore
>      detects when commits have been "copied" by means of git-cherry-pick(1),
>
> which is not exactly true.

Yeah, I agree; the sentence is merely giving a description from
layperson's point of view, and it should have expressed it as such,
e.g. "roughly speaking, you can think of it like so", not sounding
as if it is giving a strictly correct and authoritative statement.

> Would it make sense to add a "--fuzz" option which would reduce the
> diff context area used to generate the hash ?

There could be situations where such fuzzing might be useful, but I
do not think this particular use case of yours is one of them.

I'd imagine that you had two branches A (with "Unkown") and B (with
"Unknown"), and wanted to apply changes in them to your integration
branch (let's call that 'master').  You ask cherry "what commits in
A are missing in my 'master'?" and apply them.  Next you ask cherry
"what commits in B are missing in my 'master' now?" and apply them.

Because "Unkown" and "Unknown" are not considered the "same" patches
(one is most likely an update to the other), you get conflict when
applying the second copy, and that is how you can notice that one of
them is a stale and buggy one.  If you haven't made your interim
integration result available to others after processing branch A,
you even have a chance to replace the "Unkown" one you already
applied with the corrected "Unknown" one before continuing.  Even if
you choose not to bother and skip the "Unknown" one from branch B,
at least you know that in the end result you have a typo that would
eventually need to be fixed from "Unkown" into "Unknown".

If you did a fuzzy version and ignored s/Unkown/Unknown/ typofix
between the "same" patches, you can avoid the conflict and all
patches from branch B may apply cleanly and automatically on top of
applying changes from branch A.  But depending on the order you
processed A and B, you have a 50% chance of keeping the buggy
version without even realizing.

So erring on the safe side and judging "Unkown" and "Unknown" are
different changes, inducing one extra conflict you had to look at,
is actively a good thing in this case.

One thing that helps to know while learning Git is that we try to
avoid being overly clever and outsmarting human users.  Instead, we
err on the safe side to avoid silently doing a wrong thing.

This is because a tool that automates 100% of cases with 2% chance
of producing wrong result cannot be trusted---you have to manually
inspect all 100% cases it automatically handled to make sure it did
the right thing.  We instead automate 98% of simple cases where it
is obvious what the right result is, and ask the human user for help
on the remaining 2%.

And this design principle is not limited to cherry.  The design of
our merge algorithms is the same way, for example.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git-cherry doesn't detect a "copied" commit
  2015-10-23 22:41     ` Junio C Hamano
@ 2015-10-29 14:44       ` Francis Moreau
  0 siblings, 0 replies; 6+ messages in thread
From: Francis Moreau @ 2015-10-29 14:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kevin Daudt, git

Hello Junio,

On Sat, Oct 24, 2015 at 12:41 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Francis Moreau <francis.moro@gmail.com> writes:
>
>> I was mislead by the git-cherry manpage somehow which says:
>>
>>     "git-cherry therefore
>>      detects when commits have been "copied" by means of git-cherry-pick(1),
>>
>> which is not exactly true.
>
> Yeah, I agree; the sentence is merely giving a description from
> layperson's point of view, and it should have expressed it as such,
> e.g. "roughly speaking, you can think of it like so", not sounding
> as if it is giving a strictly correct and authoritative statement.
>
>> Would it make sense to add a "--fuzz" option which would reduce the
>> diff context area used to generate the hash ?
>
> There could be situations where such fuzzing might be useful, but I
> do not think this particular use case of yours is one of them.
>
> I'd imagine that you had two branches A (with "Unkown") and B (with
> "Unknown"), and wanted to apply changes in them to your integration
> branch (let's call that 'master').  You ask cherry "what commits in
> A are missing in my 'master'?" and apply them.  Next you ask cherry
> "what commits in B are missing in my 'master' now?" and apply them.
>
> Because "Unkown" and "Unknown" are not considered the "same" patches
> (one is most likely an update to the other), you get conflict when
> applying the second copy, and that is how you can notice that one of
> them is a stale and buggy one.  If you haven't made your interim
> integration result available to others after processing branch A,
> you even have a chance to replace the "Unkown" one you already
> applied with the corrected "Unknown" one before continuing.  Even if
> you choose not to bother and skip the "Unknown" one from branch B,
> at least you know that in the end result you have a typo that would
> eventually need to be fixed from "Unkown" into "Unknown".
>
> If you did a fuzzy version and ignored s/Unkown/Unknown/ typofix
> between the "same" patches, you can avoid the conflict and all
> patches from branch B may apply cleanly and automatically on top of
> applying changes from branch A.  But depending on the order you
> processed A and B, you have a 50% chance of keeping the buggy
> version without even realizing.
>
> So erring on the safe side and judging "Unkown" and "Unknown" are
> different changes, inducing one extra conflict you had to look at,
> is actively a good thing in this case.
>

In this case, ie where code modification happens, I agree that we
should play in the safe side.

But in my case I was more using git-cherry as a tool to help me
compare my integration branch and the master one. I'm interested to
know which commits have been picked up from upstream and which ones
are specific to my branch.

And when backporting (cherry-picking) commits from upstream, it's
quite frequent that the context is slightly different.

I think in this case and not the one you describe, such 'fuzz' option
might make sense. Fuzzy match could be reported with a different tag,
'~' for example.

> One thing that helps to know while learning Git is that we try to
> avoid being overly clever and outsmarting human users.  Instead, we
> err on the safe side to avoid silently doing a wrong thing.
>
> This is because a tool that automates 100% of cases with 2% chance
> of producing wrong result cannot be trusted---you have to manually
> inspect all 100% cases it automatically handled to make sure it did
> the right thing.  We instead automate 98% of simple cases where it
> is obvious what the right result is, and ask the human user for help
> on the remaining 2%.
>
> And this design principle is not limited to cherry.  The design of
> our merge algorithms is the same way, for example.

I fully agree with this principle.

Thanks
-- 
Francis

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-10-29 14:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-20 16:32 git-cherry doesn't detect a "copied" commit Francis Moreau
2015-10-23  6:47 ` Francis Moreau
2015-10-23 10:57 ` Kevin Daudt
2015-10-23 12:24   ` Francis Moreau
2015-10-23 22:41     ` Junio C Hamano
2015-10-29 14:44       ` Francis Moreau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.