git.vger.kernel.org archive mirror
* [RFC] Is using git describe resilient enough for setting the build version of git?
From: Steve Diver @ 2010-02-21  0:48 UTC
  To: Git List; +Cc: Johannes Schindelin

Hi,

I recently ran into a problem with the displayed version number for a
build of MSysgit. After much head scratching on my part, and divine
intervention from Dscho given his intimate knowledge of the codebase,
it was resolved.

My problem was that although I had checked out the tip of devel branch, 
every time I ran make, my build version was incorrect, although the hash 
suffix corresponded to the tip of the devel branch.

$ make
GIT_VERSION = 1.6.4.msysgit.0.2049.g91809
...

I was expecting to see 1.7.0 as the version which had been merged a few 
days ago, and simply inspecting the files did not reveal where the 
discrepancy originated.

I had been concentrating on GIT-VERSION-GEN to see how the version was
generated, and then comparing two machines: one with a successful build
showing the correct version and the other not, but both showing 1.7.0 as
DEF_VER and with apparently identical repositories. Johannes then
replied to my query with a request to fetch some tags which had been
overlooked. Viewing the two repositories graphically side by side
immediately revealed that the build with the incorrect version label did
not have the recent tags. It was not the files I should have been
diffing; I should have been comparing the output of "git describe".
Problem solved after fetching the new tags.

This has been a most beneficial learning exercise for me, and I am most 
grateful, and heartened I was on the right track, but I think I also see 
a potential problem.

GIT-VERSION-GEN sets a default value DEF_VER according to the version at 
the time.

The two most recent are 1.6.6.2 at revision 82221 and 1.7.0 at e923e.

However, in the absence of a version, the script uses "git describe" to 
retrieve the latest tag, and goes on to use this to create the version 
file along with the hash suffix at the current HEAD. In my case, the 
latest tag was 1.6.4 but I was building from the latest source at 
revision 91809.
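
For illustration, the step from a "git describe" string to the version
shown by make can be sketched as below. This is a minimal Python sketch
and not the GIT-VERSION-GEN shell script itself. The function names are
hypothetical, and the input string v1.6.4.msysgit.0-2049-g91809 is an
assumption, reconstructed from the make output above on the premise that
a leading "v" is dropped and dashes become dots.

```python
import re

# Shape of "git describe" output between tags: <tag>-<count>-g<abbrev>
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<abbrev>[0-9a-f]+)$")


def describe_to_version(described):
    """Hypothetical helper: drop one leading "v", then turn dashes into dots."""
    version = described[1:] if described.startswith("v") else described
    return version.replace("-", ".")


def split_describe(described):
    """Return (tag, commits_since_tag, abbreviated_hash).

    A string without the -<count>-g<abbrev> suffix is treated as an exact tag.
    """
    match = DESCRIBE_RE.match(described)
    if match is None:
        return described, 0, ""
    return match.group("tag"), int(match.group("count")), match.group("abbrev")


if __name__ == "__main__":
    example = "v1.6.4.msysgit.0-2049-g91809"  # assumed input, see above
    print(describe_to_version(example))
    print(split_describe(example))
```

Read this way, the 1.6.4.msysgit.0 part comes entirely from the nearest
tag the repository knows about, which is why a missing tag changes the
reported version while the hash suffix stays correct.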

Reading the manual entry for "git describe"[1] there is a note saying 
that the hash suffix does not guarantee disambiguity, and given that a 
tag may be incorrect or missing, there is a chance - albeit with 
diminishing odds - that the 5 digit hash/tag combination might lead to 
some obscure problems at some point along the line.

The chance of this happening really is low, but there is a chance all 
the same. We cannot foresee all errors, but identifying, and further 
reducing the odds of some must be good. Without doing the math, a guess 
would be that the probability of a repeat 5 digit abbreviated hash 
suffix increases the longer a tagged version is in place, so never will 
be 100% safe. Relying on the build version alone is not a good test
under most circumstances, but in my case I could see that the hash was
correct while the displayed version was unexpected. The reverse case, or
one of those rare occasions of a repeated hash, would have gone
completely unnoticed.

I may be wrong, but the only scenario where I see DEF_VER being used by 
GIT-VERSION-GEN, would be when there are no tags for git describe to 
retrieve. ie "git pull --no-tags"

If my understanding is correct, DEF_VER is unique and set at the same 
time as the tagged version, so wouldn't it be desirable to cross check, 
or include this value instead of relying solely on the tag when present 
during the generation of GIT-VERSION-FILE at build time?


Steve

1. http://www.kernel.org/pub/software/scm/git/docs/git-describe.html

* Re: [RFC] Is using git describe resilient enough for setting the build version of git?
From: Avery Pennarun @ 2010-02-21  6:07 UTC
  To: Steve Diver; +Cc: Git List, Johannes Schindelin

On Sat, Feb 20, 2010 at 7:48 PM, Steve Diver <squelch@think.zenbe.com> wrote:
> Reading the manual entry for "git describe"[1] there is a note saying that
> the hash suffix does not guarantee disambiguity, and given that a tag may be
> incorrect or missing, there is a chance - albeit with diminishing odds -
> that the 5 digit hash/tag combination might lead to some obscure problems at
> some point along the line.
>
> The chance of this happening really is low, but there is a chance all the
> same. We cannot foresee all errors, but identifying, and further reducing
> the odds of some must be good. Without doing the math, a guess would be that
> the probability of a repeat 5 digit abbreviated hash suffix increases the
> longer a tagged version is in place, so never will be 100% safe.

Not really.  Note that the number *before* the hash is the number of
commits between your version and the tag.  So the only way to get an
actual undetectable overlap would be to have two commits that are both
the same number of commits on top of the given tag, *and* both start
with the same first five digits.  It's just not very likely at all.
Besides which, using the hash code feature of git-describe is most
useful for the short periods between versions, not as a long-term
thing.  After a new release comes out it's unlikely anyone will care
if the previous hash prefixes were overlapping.
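
To put a rough number on the argument above, here is a small Python
sketch. It assumes that abbreviated hashes behave like uniformly random
hex strings and that all the commits counted sit at the same distance
from the same tag; both are simplifying assumptions, and the function
names are made up for illustration.

```python
def prefix_space(hex_digits):
    """Number of distinct abbreviated hashes with this many hex digits."""
    return 16 ** hex_digits


def collision_probability(commits, hex_digits=5):
    """Chance that at least two of `commits` share the same hash prefix.

    Birthday-style estimate under the uniformity assumption stated above.
    """
    space = prefix_space(hex_digits)
    if commits > space:
        return 1.0  # more commits than prefixes: a repeat is certain
    p_all_unique = 1.0
    for already_taken in range(commits):
        p_all_unique *= (space - already_taken) / space
    return 1.0 - p_all_unique


if __name__ == "__main__":
    print(prefix_space(5))  # 16**5 = 1048576 possible 5-digit prefixes
    for n in (2, 10, 100):
        print(n, collision_probability(n))
```

With five hex digits there are 1048576 possible suffixes, so two given
commits at the same distance from a tag clash about once in a million
under these assumptions.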

> I may be wrong, but the only scenario where I see DEF_VER being used by
> GIT-VERSION-GEN, would be when there are no tags for git describe to
> retrieve. ie "git pull --no-tags"
>
> If my understanding is correct, DEF_VER is unique and set at the same time
> as the tagged version, so wouldn't it be desirable to cross check, or
> include this value instead of relying solely on the tag when present during
> the generation of GIT-VERSION-FILE at build time?

If I recall correctly, the main reason for DEF_VER is when building
git from a tarball, in which case 'git describe' wouldn't be able to
tell you anything useful.

Have fun,

Avery


* Re: [RFC] Is using git describe resilient enough for setting the build version of git?
From: Steve Diver @ 2010-02-21  8:42 UTC
  To: Avery Pennarun; +Cc: Git List, Johannes Schindelin

On 21/02/2010 06:07, Avery Pennarun wrote:
>
> On Sat, Feb 20, 2010 at 7:48 PM, Steve Diver <squelch@think.zenbe.com> wrote:
>> Reading the manual entry for "git describe"[1] there is a note saying that
>> the hash suffix does not guarantee disambiguity, and given that a tag may be
>> incorrect or missing, there is a chance - albeit with diminishing odds -
>> that the 5 digit hash/tag combination might lead to some obscure problems at
>> some point along the line.
>>
>> The chance of this happening really is low, but there is a chance all the
>> same. We cannot foresee all errors, but identifying, and further reducing
>> the odds of some must be good. Without doing the math, a guess would be that
>> the probability of a repeat 5 digit abbreviated hash suffix increases the
>> longer a tagged version is in place, so never will be 100% safe.
>
> Not really.  Note that the number *before* the hash is the number of
> commits between your version and the tag.  So the only way to get an
> actual undetectable overlap would be to have two commits that are both
> the same number of commits on top of the given tag, *and* both start
> with the same first five digits.  It's just not very likely at all.
> Besides which, using the hash code feature of git-describe is most
> useful for the short periods between versions, not as a long-term
> thing.  After a new release comes out it's unlikely anyone will care
> if the previous hash prefixes were overlapping.
>
Thanks for pointing that out, and I concede that it is most unlikely. 
Testing for a minor build revision is probably not a good idea anyway. I 
was thinking along the lines of testing for version integrity.

>> I may be wrong, but the only scenario where I see DEF_VER being used by
>> GIT-VERSION-GEN, would be when there are no tags for git describe to
>> retrieve. ie "git pull --no-tags"
>>
>> If my understanding is correct, DEF_VER is unique and set at the same time
>> as the tagged version, so wouldn't it be desirable to cross check, or
>> include this value instead of relying solely on the tag when present during
>> the generation of GIT-VERSION-FILE at build time?
>
> If I recall correctly, the main reason for DEF_VER is when building
> git from a tarball, in which case 'git describe' wouldn't be able to
> tell you anything useful.
>
I suppose my point is that relying on the tag alone via git describe
does not guarantee the correct displayed version. The actual minor build
is less important. This was the only mechanism that allowed me to
recognise something was amiss. It is the major version number that
interests me.

Take for example a client application that tests for the git version, 
avoids problems in older versions, and utilizes features from the latest 
and greatest. This could all happen at run time, and would be fairly 
resilient, except for when the version is incorrectly applied.

If the client app used the output from my 1.7.0 build which was
incorrectly labelled, it would not try to use feature x or would fail
with a prompt to install a newer version. The situation could be far
more serious if the advertised version was 1.7.0 based on a rogue tag,
and the build was 1.5.0 - extreme and unlikely, but hopefully it
illustrates my point.

What I am suggesting is that DEF_VER is not only used as a failover where
git describe does not yield anything useful, but is also used for
"checks and balances" purposes where git describe generates something
different from DEF_VER.
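
As a rough sketch of the kind of check I have in mind, written in Python
rather than in the shell of GIT-VERSION-GEN: the helper names and the
choice to compare only the first two numeric components are assumptions
of mine, not anything the script does today.

```python
import re


def leading_numbers(version, parts=2):
    """Return the first `parts` numeric components, e.g. "1.7.0" -> (1, 7)."""
    stripped = version[1:] if version.startswith("v") else version
    numbers = []
    for piece in stripped.split("."):
        if not re.fullmatch(r"[0-9]+", piece):
            break  # stop at the first non-numeric piece such as "msysgit"
        numbers.append(int(piece))
        if len(numbers) == parts:
            break
    return tuple(numbers)


def versions_agree(described, def_ver, parts=2):
    """True when the described version and DEF_VER share leading components."""
    return leading_numbers(described, parts) == leading_numbers(def_ver, parts)


if __name__ == "__main__":
    described = "1.6.4.msysgit.0.2049.g91809"  # what my build reported
    def_ver = "1.7.0"                          # DEF_VER on both machines
    if not versions_agree(described, def_ver):
        print("warning: %s does not match DEF_VER %s" % (described, def_ver))
```

A check like this would only warn, leaving the decision to the person
building, and in my case it would have pointed at the missing tags
straight away.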

Steve
