* [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions
@ 2020-05-24 11:47 Yann E. MORIN
  2020-05-25  1:55 ` Vincent Fazio
  2020-05-29 21:41 ` Thomas Petazzoni
  0 siblings, 2 replies; 6+ messages in thread
From: Yann E. MORIN @ 2020-05-24 11:47 UTC (permalink / raw)
  To: buildroot

Older versions of git store the absolute path of the submodules'
repository as stored in the super-project, e.g.:

    $ cat some-submodule/.git
    gitdir: /path/to/super-project/.git/modules/some-submodule

Obviously, this is not very reproducible.

More recent versions of git, however, store relative paths, which
de facto makes it reproducible.

Fix older versions by replacing the absolute paths with relative ones.

Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
---
 support/download/git | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/support/download/git b/support/download/git
index 075f665bbf..15d8c66e05 100755
--- a/support/download/git
+++ b/support/download/git
@@ -176,6 +176,19 @@ date="$( _git log -1 --pretty=format:%cD )"
 # There might be submodules, so fetch them.
 if [ ${recurse} -eq 1 ]; then
     _git submodule update --init --recursive
+
+    # Older versions of git will store the absolute path of the git tree
+    # in the .git of submodules, while newer versions just use relative
+    # paths. Detect and fix the older variants to use relative paths, so
+    # that the archives are reproducible across a wider range of git
+    # versions. However, we can't do that if git is too old and uses
+    # full repositories for submodules.
+    cmd='printf "%s\n" "${path}/"'
+    for module_dir in $( _git submodule --quiet foreach "'${cmd}'" ); do
+        [ -f "${module_dir}/.git" ] || continue
+        relative_dir="$( sed -r -e 's,/+,/,g; s,[^/]+/,../,g' <<<"${module_dir}" )"
+        sed -r -i -e "s:^gitdir\: $(pwd)/:gitdir\: "${relative_dir}":" "${module_dir}/.git"
+    done
 fi
 
 # Generate the archive, sort with the C locale so that it is reproducible.
-- 
2.20.1
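
To make the path rewriting in the patch concrete, here is a minimal sketch of the same two sed steps, run outside the download wrapper. The function names `to_relative_prefix` and `relativise_gitdir` are hypothetical and are not part of support/download/git; `sed -E` is used as an equivalent spelling of the patch's `sed -r`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the patch): map a submodule path to the
# chain of "../" that leads from it back to the super-project root.
to_relative_prefix() {
    # Ensure a trailing slash, squeeze runs of '/', then turn every
    # path component into '../'.
    printf '%s/\n' "${1%/}" | sed -E -e 's,/+,/,g; s,[^/]+/,../,g'
}

# Hypothetical helper: rewrite an absolute "gitdir:" line read on stdin.
#   $1: absolute path of the super-project
#   $2: path of the submodule, relative to the super-project
relativise_gitdir() {
    local root="$1" prefix
    prefix="$(to_relative_prefix "$2")"
    sed -E -e "s:^gitdir\: ${root}/:gitdir\: ${prefix}:"
}

to_relative_prefix "some-submodule"          # prints ../
to_relative_prefix "third-party//libfoo/"    # prints ../../

printf 'gitdir: /path/to/super-project/.git/modules/some-submodule\n' \
    | relativise_gitdir /path/to/super-project some-submodule
# prints gitdir: ../.git/modules/some-submodule
```

In this sketch, as in the patch, the super-project path is interpolated into the sed expression unescaped, so a path containing regular-expression metacharacters or a ':' would need extra care.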


* [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions
  2020-05-24 11:47 [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions Yann E. MORIN
@ 2020-05-25  1:55 ` Vincent Fazio
  2020-05-25 20:05   ` Yann E. MORIN
  2020-05-29 21:41 ` Thomas Petazzoni
  1 sibling, 1 reply; 6+ messages in thread
From: Vincent Fazio @ 2020-05-25  1:55 UTC (permalink / raw)
  To: buildroot

Yann,

On 5/24/20 6:47 AM, Yann E. MORIN wrote:
> Older versions of git store the absolute path of the submodules'
> repository as stored in the super-project, e.g.:
>
>      $ cat some-submodule/.git
>      gitdir: /path/to/super-project/.git/modules/some-submodule
>
> Obviously, this is not very reproducible.
>
> More recent versions of git, however, store relative paths, which
> de facto makes it reproducible.
>
> Fix older versions by replacing the absolute paths with relative ones.
>
> Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
> ---
>   support/download/git | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/support/download/git b/support/download/git
> index 075f665bbf..15d8c66e05 100755
> --- a/support/download/git
> +++ b/support/download/git
> @@ -176,6 +176,19 @@ date="$( _git log -1 --pretty=format:%cD )"
>   # There might be submodules, so fetch them.
>   if [ ${recurse} -eq 1 ]; then
>       _git submodule update --init --recursive
> +
> +    # Older versions of git will store the absolute path of the git tree
> +    # in the .git of submodules, while newer versions just use relative
> +    # paths. Detect and fix the older variants to use relative paths, so
> +    # that the archives are reproducible across a wider range of git
> +    # versions. However, we can't do that if git is too old and uses
> +    # full repositories for submodules.
> +    cmd='printf "%s\n" "${path}/"'
> +    for module_dir in $( _git submodule --quiet foreach "'${cmd}'" ); do
> +        [ -f "${module_dir}/.git" ] || continue
> +        relative_dir="$( sed -r -e 's,/+,/,g; s,[^/]+/,../,g' <<<"${module_dir}" )"
> +        sed -r -i -e "s:^gitdir\: $(pwd)/:gitdir\: "${relative_dir}":" "${module_dir}/.git"
> +    done
>   fi
>   
>   # Generate the archive, sort with the C locale so that it is reproducible.
Should we expand the `find` to ignore files named '.git' so that these 
don't get added to the tarball at all?

find . -not -type d \
    -and -not -path "./.git/*" -and -not -name ".git" >"${output}.list"

Seems like it'd be in-line with our current exclusion of the .git/ 
subfolder because that relative git reference wouldn't be valid after 
the tarball got unpacked anyway.
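
As an illustration of what the extended `find` expression would select, here is a small self-contained sketch on a throw-away tree. The function name `list_archive_files` and the demo layout are assumptions made for the example; only the `find` predicates come from the suggestion above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: list the files that would go into the archive,
# skipping everything under the top-level .git/ directory as well as
# any file named .git (the submodule gitdir indirections).
list_archive_files() {
    ( cd "$1" && find . -not -type d \
        -and -not -path "./.git/*" -and -not -name ".git" | LC_ALL=C sort )
}

# Throw-away tree that mimics a super-project with one submodule.
demo="$(mktemp -d)"
mkdir -p "${demo}/.git/modules/sub" "${demo}/sub"
touch "${demo}/.git/config" "${demo}/README" "${demo}/sub/file.c"
printf 'gitdir: ../.git/modules/sub\n' >"${demo}/sub/.git"

list_archive_files "${demo}"
# prints:
#   ./README
#   ./sub/file.c

rm -rf "${demo}"
```

Without the `-not -name ".git"` predicate, `./sub/.git` would also be listed.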

-- 
Vincent Fazio
Embedded Software Engineer - Linux
Extreme Engineering Solutions, Inc
http://www.xes-inc.com


* [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions
  2020-05-25  1:55 ` Vincent Fazio
@ 2020-05-25 20:05   ` Yann E. MORIN
  2020-05-25 23:24     ` Vincent Fazio
  0 siblings, 1 reply; 6+ messages in thread
From: Yann E. MORIN @ 2020-05-25 20:05 UTC (permalink / raw)
  To: buildroot

Vincent, All,

On 2020-05-24 20:55 -0500, Vincent Fazio spake thusly:
> On 5/24/20 6:47 AM, Yann E. MORIN wrote:
> > Older versions of git store the absolute path of the submodules'
> > repository as stored in the super-project, e.g.:
> >
> >     $ cat some-submodule/.git
> >     gitdir: /path/to/super-project/.git/modules/some-submodule
> >
> > Obviously, this is not very reproducible.
> >
> > More recent versions of git, however, store relative paths, which
> > de facto makes it reproducible.
> >
> > Fix older versions by replacing the absolute paths with relative ones.
> >
> > Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
> > ---
> >  support/download/git | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/support/download/git b/support/download/git
> > index 075f665bbf..15d8c66e05 100755
> > --- a/support/download/git
> > +++ b/support/download/git
> > @@ -176,6 +176,19 @@ date="$( _git log -1 --pretty=format:%cD )"
> >  # There might be submodules, so fetch them.
> >  if [ ${recurse} -eq 1 ]; then
> >      _git submodule update --init --recursive
> > +
> > +    # Older versions of git will store the absolute path of the git tree
> > +    # in the .git of submodules, while newer versions just use relative
> > +    # paths. Detect and fix the older variants to use relative paths, so
> > +    # that the archives are reproducible across a wider range of git
> > +    # versions. However, we can't do that if git is too old and uses
> > +    # full repositories for submodules.
> > +    cmd='printf "%s\n" "${path}/"'
> > +    for module_dir in $( _git submodule --quiet foreach "'${cmd}'" ); do
> > +        [ -f "${module_dir}/.git" ] || continue
> > +        relative_dir="$( sed -r -e 's,/+,/,g; s,[^/]+/,../,g' <<<"${module_dir}" )"
> > +        sed -r -i -e "s:^gitdir\: $(pwd)/:gitdir\: "${relative_dir}":" "${module_dir}/.git"
> > +    done
> >  fi
> >
> >  # Generate the archive, sort with the C locale so that it is reproducible.
> 
> Should we expand the `find` to ignore files named '.git' so that these don't get added to the tarball at all?
> 
> find . -not -type d \
>     -and -not -path "./.git/*" -and -not -name ".git" >"${output}.list"

We do not want to do that, because we want to reproduce the existing
tarballs. And those existing tarballs already contain the .git files.

Note however that, for people who have prehistoric git versions, git
submodules will be entire repositories of their own, i.e. the .git of
submodules is a directory with an actual repository, instead of a plain
file with a gitdir indirection. For those people, tarballs from git
archives are not reproducible, but we don't care.

But for the case that concerns us, we don't want to drop the .git files.

Yeah, that might be an oversight from back when we introduced support
for submodules, but it's now too late...

Regards,
Yann E. MORIN.

> Seems like it'd be in-line with our current exclusion of the .git/ subfolder because that relative git reference wouldn't be valid
> after the tarball got unpacked anyway.
> 
> -- 
> Vincent Fazio
> Embedded Software Engineer - Linux
> Extreme Engineering Solutions, Inc
> http://www.xes-inc.com

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'


* [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions
  2020-05-25 20:05   ` Yann E. MORIN
@ 2020-05-25 23:24     ` Vincent Fazio
  0 siblings, 0 replies; 6+ messages in thread
From: Vincent Fazio @ 2020-05-25 23:24 UTC (permalink / raw)
  To: buildroot

Yann,

On 5/25/20 3:05 PM, Yann E. MORIN wrote:
> Vincent, All,
> 
> On 2020-05-24 20:55 -0500, Vincent Fazio spake thusly:
>> On 5/24/20 6:47 AM, Yann E. MORIN wrote:
>> > Older versions of git store the absolute path of the submodules'
>> > repository as stored in the super-project, e.g.:
>> >
>> >     $ cat some-submodule/.git
>> >     gitdir: /path/to/super-project/.git/modules/some-submodule
>> >
>> > Obviously, this is not very reproducible.
>> >
>> > More recent versions of git, however, store relative paths, which
>> > de facto makes it reproducible.
>> >
>> > Fix older versions by replacing the absolute paths with relative ones.
>> >
>> > Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
>> > ---
>> >  support/download/git | 13 +++++++++++++
>> >  1 file changed, 13 insertions(+)
>> >
>> > diff --git a/support/download/git b/support/download/git
>> > index 075f665bbf..15d8c66e05 100755
>> > --- a/support/download/git
>> > +++ b/support/download/git
>> > @@ -176,6 +176,19 @@ date="$( _git log -1 --pretty=format:%cD )"
>> >  # There might be submodules, so fetch them.
>> >  if [ ${recurse} -eq 1 ]; then
>> >      _git submodule update --init --recursive
>> > +
>> > +    # Older versions of git will store the absolute path of the git tree
>> > +    # in the .git of submodules, while newer versions just use relative
>> > +    # paths. Detect and fix the older variants to use relative paths, so
>> > +    # that the archives are reproducible across a wider range of git
>> > +    # versions. However, we can't do that if git is too old and uses
>> > +    # full repositories for submodules.
>> > +    cmd='printf "%s\n" "${path}/"'
>> > +    for module_dir in $( _git submodule --quiet foreach "'${cmd}'" ); do
>> > +        [ -f "${module_dir}/.git" ] || continue
>> > +        relative_dir="$( sed -r -e 's,/+,/,g; s,[^/]+/,../,g' <<<"${module_dir}" )"
>> > +        sed -r -i -e "s:^gitdir\: $(pwd)/:gitdir\: "${relative_dir}":" "${module_dir}/.git"
>> > +    done
>> >  fi
>> >
>> >  # Generate the archive, sort with the C locale so that it is reproducible.
>>
>> Should we expand the `find` to ignore files named '.git' so that these don't get added to the tarball at all?
>>
>> find . -not -type d \
>>     -and -not -path "./.git/*" -and -not -name ".git" >"${output}.list"
> 
> We do not want to do that, because we want to reproduce the existing
> tarballs. And those existing tarballs already contain the .git files.
> 

Gotcha. In the case that this is a one-off patch and may be ported, I 
completely agree. I was thinking of this in terms of the PAX patch 
series we've been discussing via IRC.

> Note however that, for people who have prehistoric git versions, git
> submodules will be entire repositories of their own, i.e. the .git of
> submodules is a directory with an actual repository, instead of a plain
> file with a gitdir indirection. For those people, tarballs from git
> archives are not reproducible but we don't care.
> 
> But for the case that concerns us, we don't want to drop the .git files.
> 
> Yeah, that might be an oversight from back when we introduced support
> for submodules, but it's now too late...
> 

However, when we introduce the new PAX tarball, I think we should 
consider dropping all directories and files named '.git'...

Sure, it was an oversight before with the GNU tarballs, but we know it's 
a problem and I don't think we need to carry it forward. Or is there a 
detail I'm missing here as well?

> Regards,
> Yann E. MORIN.
> 
>> Seems like it'd be in-line with our current exclusion of the .git/ subfolder because that relative git reference wouldn't be valid
>> after the tarball got unpacked anyway.
>>
>> -- 
>> Vincent Fazio
>> Embedded Software Engineer - Linux
>> Extreme Engineering Solutions, Inc
>> http://www.xes-inc.com
> 

-- 
Vincent Fazio
Embedded Software Engineer - Linux
Extreme Engineering Solutions, Inc
http://www.xes-inc.com


* [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions
  2020-05-24 11:47 [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions Yann E. MORIN
  2020-05-25  1:55 ` Vincent Fazio
@ 2020-05-29 21:41 ` Thomas Petazzoni
  2020-05-30 21:08   ` Yann E. MORIN
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Petazzoni @ 2020-05-29 21:41 UTC (permalink / raw)
  To: buildroot

On Sun, 24 May 2020 13:47:18 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> +    # Older versions of git will store the absolute path of the git tree
> +    # in the .git of submodules, while newer versions just use relative
> +    # paths. Detect and fix the older variants to use relative paths, so
> +    # that the archives are reproducible across a wider range of git
> +    # versions. However, we can't do that if git is too old and uses
> +    # full repositories for submodules.

If I understand correctly, there are three "eras":

 - Really old Git versions, where full repositories are used for
   submodules, where we can't do anything.

 - Old Git versions, that stored absolute paths.

 - Recent Git versions, that store relative paths.
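
The three cases listed above can be told apart by looking at a submodule's .git entry. The following is only a sketch: the function name `submodule_git_layout` and the labels it prints are made up for the example, and it says nothing about which git versions produce which layout.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report how the .git entry of a submodule
# checkout is stored.
#   $1: path to the submodule's working tree
submodule_git_layout() {
    local dotgit="$1/.git"
    if [ -d "${dotgit}" ]; then
        # .git is a directory: the submodule is a full repository.
        echo "full-repository"
    elif [ -f "${dotgit}" ] && grep -q '^gitdir: /' "${dotgit}"; then
        # .git is a file pointing at an absolute path.
        echo "absolute-gitdir"
    elif [ -f "${dotgit}" ] && grep -q '^gitdir: ' "${dotgit}"; then
        # .git is a file pointing at a relative path.
        echo "relative-gitdir"
    else
        echo "unknown"
    fi
}

# Throw-away demo tree with one submodule of each kind.
demo="$(mktemp -d)"
mkdir -p "${demo}/really-old/.git" "${demo}/old" "${demo}/recent"
printf 'gitdir: %s/.git/modules/old\n' "${demo}" >"${demo}/old/.git"
printf 'gitdir: ../.git/modules/recent\n' >"${demo}/recent/.git"

submodule_git_layout "${demo}/really-old"   # prints full-repository
submodule_git_layout "${demo}/old"          # prints absolute-gitdir
submodule_git_layout "${demo}/recent"       # prints relative-gitdir

rm -rf "${demo}"
```

Only the middle case is what the patch rewrites; the first one is skipped by its `[ -f "${module_dir}/.git" ] || continue` test.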

Would it be possible to identify which versions we're talking about
here? I'm sure you've done that research, and I think it makes sense to
capture that, as we will certainly wonder what we mean by "older
versions", "old version", "new version".

What is new, old, or older today, will feel quite different 5 years
from now.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* [Buildroot] [PATCH] suport/download: fix git wrapper with submodules on older git versions
  2020-05-29 21:41 ` Thomas Petazzoni
@ 2020-05-30 21:08   ` Yann E. MORIN
  0 siblings, 0 replies; 6+ messages in thread
From: Yann E. MORIN @ 2020-05-30 21:08 UTC (permalink / raw)
  To: buildroot

Thomas, All,

On 2020-05-29 23:41 +0200, Thomas Petazzoni spake thusly:
> On Sun, 24 May 2020 13:47:18 +0200
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> > +    # Older versions of git will store the absolute path of the git tree
> > +    # in the .git of submodules, while newer versions just use relative
> > +    # paths. Detect and fix the older variants to use relative paths, so
> > +    # that the archives are reproducible across a wider range of git
> > +    # versions. However, we can't do that if git is too old and uses
> > +    # full repositories for submodules.
> 
> If I understand correctly, there are three "eras":
> 
>  - Really old Git versions, where full repositories are used for
>    submodules, where we can't do anything.
> 
>  - Old Git versions, that stored absolute paths.
> 
>  - Recent Git versions, that store relative paths.

Spot-on.

> Would it be possible to identify which versions we're talking about
> here? I'm sure you've done that research, and I think it makes sense to
> capture that, as we will certainly wonder what we mean by "older
> versions", "old version", "new version".
> 
> What is new, old, or older today, will feel quite different 5 years
> from now.

Sorry, you presumed too much: I haven't dug into the nitty-gritty details
of when git transitioned from one era to another...

I just stumbled on this issue while working on the conversion of the
archives generated from a git tree, which had me scratching my head for
quite some time... I should have noted the conditions back then, true,
but I forgot...

Regards,
Yann E. MORIN.

> Thomas
> -- 
> Thomas Petazzoni, CTO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'
