All of lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [PATCH 0/4] support/download: make the git backend more robust
@ 2018-04-17 16:48 Yann E. MORIN
  2018-04-17 16:48 ` [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository Yann E. MORIN
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-17 16:48 UTC (permalink / raw)
  To: buildroot

Hello All!

This series is an attempt at making our git backend more robust in case
the git cache for a package got corrupted for whatever reason. For
example, a git action may got killed, there could have been a
power-loss, or anything that could leave the git tree in an inconsistent
state.

This has been praticualarly observed in autobuilders, where random
files got removed without taking into account that we now had git trees
in there.

First and foremost, we ensure that all the git comand we run will only
ever act on the git tree we're interested in, to avoid git going up
until it finds a valid directory.

Then, we always initialise the git tree, just in case. git-init is safe
to run on already initialised trees, and it restores broken ones in
working conditions.

Third, we properly handle the case where a git tree had a sub-dir
converted to/from a submodule, and the checkout crosses the boundary of
the conversion.

Finally, we drop support for shallow clones because they are not
reliable, and only ever worked by chance so far.

Thanks a lot to Ricardo, Arnout and Tomas for their inputs during the
discussions that led to this series. :-)


Regards,
Yann E. MORIN.


The following changes since commit 3f37dd7c3b5eb25a41edc6f72ba73e5a21b07e9b

  mariadb: bump version to 10.1.32 (2018-04-17 08:56:57 +0200)


are available in the git repository at:

  git://git.buildroot.org/~ymorin/git/buildroot.git

for you to fetch changes up to 62d815cfe104e4d172ca4d51e1d56e577f737b87

  download/git: always do full-clone (2018-04-17 18:39:59 +0200)


----------------------------------------------------------------
Yann E. MORIN (4):
      download/git: ensure we always work in the expected repository
      download/git: ensure we have a sane repository
      download/git: ensure we can checkout repos with submodule conversions
      download/git: always do full-clone

 support/download/git | 66 +++++++++++++++++++++++++++-------------------------
 1 file changed, 34 insertions(+), 32 deletions(-)

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository
  2018-04-17 16:48 [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Yann E. MORIN
@ 2018-04-17 16:48 ` Yann E. MORIN
  2018-04-19 15:47   ` Ricardo Martincoski
  2018-04-19 20:38   ` Thomas Petazzoni
  2018-04-17 16:48 ` [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository Yann E. MORIN
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-17 16:48 UTC (permalink / raw)
  To: buildroot

git always look directories up until it finds a repository. In case
the git cache is broken, it may no longer be identified as a repository,
and git will look higher in the directories until it finds one.

In the default conditions, this would be Buildroot's own git tree
(because DL_DIR is a subdir of Buildroot), but in some situations may
very well be any repository the user has Buildroot in, like a
br2-external tree...

So, we force git to use our git cache and never look elsewhere, as
Suggested by Ricardo.

Use GIT_DIR, as it has been there for ages now, while --git-dir was
only introduced later (even if most distros ship an later version),
as suggested by Arnout.

Also fix the one call to git that was not using the wrapper.

Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
Cc: Arnout Vandecappelle <arnout@mind.be>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 support/download/git | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/support/download/git b/support/download/git
index 381f3ceeb3..c166ae2813 100755
--- a/support/download/git
+++ b/support/download/git
@@ -34,25 +34,28 @@ done
 
 shift $((OPTIND-1)) # Get rid of our options
 
+# We want to check if a cache of the git clone of this repo already exists.
+git_cache="${dl_dir}/git"
+
 # Caller needs to single-quote its arguments to prevent them from
 # being expanded a second time (in case there are spaces in them)
 _git() {
-    eval ${GIT} "${@}"
+    eval GIT_DIR="${git_cache}/.git" ${GIT} "${@}"
 }
 
-# We want to check if a cache of the git clone of this repo already exists.
-git_cache="${dl_dir}/git"
-
 # If the cache directory doesn't exists, init a new repo, which will be
 # fetch'ed later.
 if [ ! -d "${git_cache}" ]; then
+    # We can still go through the wrapper, because 'init' does not use
+    # the path pointed to by GIT_DIR, but really uses the directory
+    # passed as argument.
     _git init "'${git_cache}'"
 fi
 
 pushd "${git_cache}" >/dev/null
 
 # Ensure the repo has an origin (in case a previous run was killed).
-if ! git remote |grep -q -E '^origin$'; then
+if ! _git remote |grep -q -E '^origin$'; then
     _git remote add origin "'${uri}'"
 fi
 
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository
  2018-04-17 16:48 [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Yann E. MORIN
  2018-04-17 16:48 ` [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository Yann E. MORIN
@ 2018-04-17 16:48 ` Yann E. MORIN
  2018-04-19 15:50   ` Ricardo Martincoski
  2018-04-19 20:38   ` Thomas Petazzoni
  2018-04-17 16:48 ` [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions Yann E. MORIN
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-17 16:48 UTC (permalink / raw)
  To: buildroot

There are cases where a repository might be broken, e.g. when a previous
operation was killed or otherwise failed unexpectedly.

We fix that by always initialising the repository, as suggested by
Ricardo. git-init is safe on an otherwise-healthy repository:

    Running git init in an existing repository is safe. It will not
    overwrite things that are already there. [...]

Using git-init will just ensure that we have the strictly required files
to form a sane tree. Any blob that is still missing would get fetched
later on.

Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
Cc: Arnout Vandecappelle <arnout@mind.be>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 support/download/git | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/support/download/git b/support/download/git
index c166ae2813..1172310186 100755
--- a/support/download/git
+++ b/support/download/git
@@ -43,14 +43,16 @@ _git() {
     eval GIT_DIR="${git_cache}/.git" ${GIT} "${@}"
 }
 
-# If the cache directory doesn't exists, init a new repo, which will be
-# fetch'ed later.
-if [ ! -d "${git_cache}" ]; then
-    # We can still go through the wrapper, because 'init' does not use
-    # the path pointed to by GIT_DIR, but really uses the directory
-    # passed as argument.
-    _git init "'${git_cache}'"
-fi
+# Initialise a repository in the git cache. If the repository already
+# existed, this is a noop, unless the repository was broken, in which
+# case this magically restores it to working conditions. In the latter
+# case, we might be missing blobs, but that's not a problem: we'll
+# fetch what we need later anyway.
+#
+# We can still go through the wrapper, because 'init' does not use the
+# path pointed to by GIT_DIR, but really uses the directory passed as
+# argument.
+_git init "'${git_cache}'"
 
 pushd "${git_cache}" >/dev/null
 
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-17 16:48 [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Yann E. MORIN
  2018-04-17 16:48 ` [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository Yann E. MORIN
  2018-04-17 16:48 ` [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository Yann E. MORIN
@ 2018-04-17 16:48 ` Yann E. MORIN
  2018-04-18  3:13   ` Ricardo Martincoski
  2018-04-17 16:48 ` [Buildroot] [PATCH 4/4] download/git: always do full-clone Yann E. MORIN
  2018-04-18  8:40 ` [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Thomas Petazzoni
  4 siblings, 1 reply; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-17 16:48 UTC (permalink / raw)
  To: buildroot

When a repository has had a sub-dir <-> submodule conversion, checking
out a working copy from before/after the conversion can leave along a
few untracked files, expecially the .git files from submodules.

Ideally, we'd use "git submodule deinit --all" before the checkout to
the new ref, but deinit was only introduced in v1.8.3, which is still
not available in some enterprise-grade distros (RHEEL6 still has 1.7.1
AFAIK).

So, we use an alternate trick: we forcibly checkout the new ref, after
which we manually remove all .git files (not dirs!) in the new working
copy, then clean it up to remove all untracked and ignored files and
dirs, then we checkout to a pristine state.

From experimentation, this sequence looks like it is working as
expected...

However, this means we must forcibly update and initialise submodules
afterwards, so that their .git files get restored properly (and their
content checked out cleanly).

Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
Cc: Arnout Vandecappelle <arnout@mind.be>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 support/download/git | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/support/download/git b/support/download/git
index 1172310186..e71ff029cd 100755
--- a/support/download/git
+++ b/support/download/git
@@ -97,7 +97,21 @@ fi
 
 # Checkout the required changeset, so that we can update the required
 # submodules.
-_git checkout -q "'${cset}'"
+_git checkout -f -q "'${cset}'"
+
+# We would need to deregister _current_ submodules before we checkout.
+# git-clean (which is used below) does not get rid of .git files (no
+# git command, but git-submodule, ever touches .git files or dirs);
+# Using "git submodule deinit --all" would remove all the files for
+# all submodules, including the corresponding .git files. However, it
+# was only introduced with git-1.8.3, which is too recent for some
+# entreprise-grade distros. So, we fall-back to just removing .git
+# files after the checkout.
+find . -type f -name .git -exec rm {} +
+
+# Restore repository to the new, clean-checkout state.
+_git clean -ffdx
+_git checkout -- .
 
 # Get date of commit to generate a reproducible archive.
 # %cD is RFC2822, so it's fully qualified, with TZ and all.
@@ -105,7 +119,7 @@ date="$( _git log -1 --pretty=format:%cD )"
 
 # There might be submodules, so fetch them.
 if [ ${recurse} -eq 1 ]; then
-    _git submodule update --init --recursive
+    _git submodule update --init --recursive --force
 fi
 
 # Generate the archive, sort with the C locale so that it is reproducible.
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 4/4] download/git: always do full-clone
  2018-04-17 16:48 [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Yann E. MORIN
                   ` (2 preceding siblings ...)
  2018-04-17 16:48 ` [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions Yann E. MORIN
@ 2018-04-17 16:48 ` Yann E. MORIN
  2018-04-18  3:18   ` Ricardo Martincoski
  2018-04-18  8:40 ` [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Thomas Petazzoni
  4 siblings, 1 reply; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-17 16:48 UTC (permalink / raw)
  To: buildroot

We currently attempt a shallow clone, as tentative to save bandwidth and
download time.

However, now that we keep the git tree as a cache, it may happen that we
need to checkout an earlier commit, and that would not be present with a
shallow clone.

Furthermore, the shallow fetch is already really btroken, and just
happens by chance. Consider the following actions, which are basivcally
what happens today:

    mkdir git
    git init git
    cd git
    git remote add origin https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
    git fetch origin --depth 1 v4.17-rc1
    if ! git fetch origin v4.17-rc1:v4.17-rc1 ; then
        echo "warning"
    fi
    git checkout v4.17-rc1

The checkout succeeds just because of the git-fetch in the if-condition,
which is initially there to fetch the special refs from github MRs, or
gerrit reviews. That fails, but we jsut print a warning. If we were to
ever remove support for special refs, then the checkout would fail.

The whole purpose of the git cache is to actually save bandwith and
download time, but in the long run. For one-offs, people would
preferably use a wget download (e.g. with the github macro) instead of
a git clone.

We switch to always doing a full clone. It is nore correct, and pays off
in the long run...

Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
---
 support/download/git | 23 +++--------------------
 1 file changed, 3 insertions(+), 20 deletions(-)

diff --git a/support/download/git b/support/download/git
index e71ff029cd..6a4d1f937b 100755
--- a/support/download/git
+++ b/support/download/git
@@ -63,26 +63,9 @@ fi
 
 _git remote set-url origin "'${uri}'"
 
-# Try to fetch with limited depth, since it is faster than a full clone - but
-# that only works if the version is a ref (tag or branch). Before trying to do
-# a shallow clone we check if ${cset} is in the list provided by git ls-remote.
-# If not we fallback to a full fetch.
-#
-# Messages for the type of clone used are provided to ease debugging in
-# case of problems
-git_done=0
-if [ -n "$(_git ls-remote origin "'${cset}'" 2>&1)" ]; then
-    printf "Doing a shallow fetch\n"
-    if _git fetch "${@}" --depth 1 origin "'${cset}'"; then
-        git_done=1
-    else
-        printf "Shallow fetch failed, falling back to fetching all refs\n"
-    fi
-fi
-if [ ${git_done} -eq 0 ]; then
-    printf "Fetching all references\n"
-    _git fetch origin -t
-fi
+printf "Fetching all references\n"
+_git fetch origin
+_git fetch origin -t
 
 # Try to get the special refs exposed by some forges (pull-requests for
 # github, changes for gerrit...). There is no easy way to know whether
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-17 16:48 ` [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions Yann E. MORIN
@ 2018-04-18  3:13   ` Ricardo Martincoski
  2018-04-18  8:04     ` Arnout Vandecappelle
  0 siblings, 1 reply; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-18  3:13 UTC (permalink / raw)
  To: buildroot

Hello,

Really sorry I did finished the review/tests today to send by tags.

Could you double-check the --force option for git 1.7.1? I am not sure we would
have a workaround for this if it is not supported.
Also a question and some typos.

On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:

> When a repository has had a sub-dir <-> submodule conversion, checking
> out a working copy from before/after the conversion can leave along a
> few untracked files, expecially the .git files from submodules.

s/expecially/especially/
I think it's a typo, not a US/UK thing.

> 
> Ideally, we'd use "git submodule deinit --all" before the checkout to
> the new ref, but deinit was only introduced in v1.8.3, which is still
> not available in some enterprise-grade distros (RHEEL6 still has 1.7.1

s/RHEEL6/RHEL6/

> AFAIK).
[snip]
> +# We would need to deregister _current_ submodules before we checkout.
> +# git-clean (which is used below) does not get rid of .git files (no
> +# git command, but git-submodule, ever touches .git files or dirs);
> +# Using "git submodule deinit --all" would remove all the files for
> +# all submodules, including the corresponding .git files. However, it
> +# was only introduced with git-1.8.3, which is too recent for some
> +# entreprise-grade distros. So, we fall-back to just removing .git

s/entreprise/enterprise/

> +# files after the checkout.
> +find . -type f -name .git -exec rm {} +

In the unlikely case some user tries to abuse the git cache by first populating
it using the 'git multiple working trees' feature the main
dl/<package>/git/.git would be a file too. After this command all git commands
will fail.
But... well... that is expected! The user should not mess with the git cache,
it is there to be used solely by buildroot.
So we are good IMO.

> +
> +# Restore repository to the new, clean-checkout state.
> +_git clean -ffdx
> +_git checkout -- .

So it is needed to checkout the .git files again? Or did I miss something?

Long time since I developed using submodules, so I now forgot some catches.
Anyway I will play with those commands tomorrow to try to understand.

[snip]
> -    _git submodule update --init --recursive
> +    _git submodule update --init --recursive --force

It seems --force does not exist on git 1.7.1.
It is still possible I did something wrong when creating the docker image.
https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63519024

It does work for git 1.8.3.
https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63525858


Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 4/4] download/git: always do full-clone
  2018-04-17 16:48 ` [Buildroot] [PATCH 4/4] download/git: always do full-clone Yann E. MORIN
@ 2018-04-18  3:18   ` Ricardo Martincoski
  0 siblings, 0 replies; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-18  3:18 UTC (permalink / raw)
  To: buildroot

Hello,

We can ask for more opinions if it needs to be split in 2 patches or not, so I
copied the CC from patch 3.
Also some typos and nits.

On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:

> We currently attempt a shallow clone, as tentative to save bandwidth and
> download time.
> 
> However, now that we keep the git tree as a cache, it may happen that we
> need to checkout an earlier commit, and that would not be present with a
> shallow clone.
> 
> Furthermore, the shallow fetch is already really btroken, and just

s/btroken/broken/

> happens by chance. Consider the following actions, which are basivcally

s/basivcally/basically/

[snip]
> The checkout succeeds just because of the git-fetch in the if-condition,
> which is initially there to fetch the special refs from github MRs, or

Nit: mixed naming
GitHub -> PR
GitLab -> MR
Gerrit -> Change

> gerrit reviews. That fails, but we jsut print a warning. If we were to

s/jsut/just/

> ever remove support for special refs, then the checkout would fail.
> 
> The whole purpose of the git cache is to actually save bandwith and

s/bandwith/bandwidth/

> download time, but in the long run. For one-offs, people would
> preferably use a wget download (e.g. with the github macro) instead of
> a git clone.
> 
> We switch to always doing a full clone. It is nore correct, and pays off

s/nore/more/
Nit: Actually using shallow fetch is not less correct, just more complicated
because it has a lot of corner cases. But the download script becomes more
correct than before by always doing a full clone.

> in the long run...
[snip]
> +printf "Fetching all references\n"

> +_git fetch origin

The line above could even be a separate patch.
It is there to fix the fetch for all refs for git versions older than 1.9.0.
Maybe we add a comment about the git version?
And/or mention the git version + upstream patch on the commit log, as in
http://patchwork.ozlabs.org/patch/897559/ ?

> +_git fetch origin -t


Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-18  3:13   ` Ricardo Martincoski
@ 2018-04-18  8:04     ` Arnout Vandecappelle
  2018-04-19  0:59       ` Ricardo Martincoski
  2018-04-19 19:59       ` Yann E. MORIN
  0 siblings, 2 replies; 22+ messages in thread
From: Arnout Vandecappelle @ 2018-04-18  8:04 UTC (permalink / raw)
  To: buildroot



On 18-04-18 05:13, Ricardo Martincoski wrote:
> Hello,
> 
> Really sorry I did finished the review/tests today to send by tags.
> 
> Could you double-check the --force option for git 1.7.1? I am not sure we would
> have a workaround for this if it is not supported.

 The help of 1.7.1 says:

git submodule [--quiet] update [--init] [-N|--no-fetch] [--rebase] [--reference
<repository>] [--merge] [--recursive] [--] [<path>...]

No force there... Checking the code, I don't see any handling of -f or --force.

> Also a question and some typos.
> 
> On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:
[snip]
>> +# files after the checkout.
>> +find . -type f -name .git -exec rm {} +
> 
> In the unlikely case some user tries to abuse the git cache by first populating
> it using the 'git multiple working trees' feature the main
> dl/<package>/git/.git would be a file too. After this command all git commands
> will fail.
> But... well... that is expected! The user should not mess with the git cache,
> it is there to be used solely by buildroot.
> So we are good IMO.

 +1

 That said, IMO it would be better if the git cache were a bare repo, and that
we did the checkout outside of the repo. But I don't think that works well with
submodules. Submodules are annoying...

> 
>> +
>> +# Restore repository to the new, clean-checkout state.
>> +_git clean -ffdx
>> +_git checkout -- .
> 
> So it is needed to checkout the .git files again? Or did I miss something?

 I'm missing the point as well. Clearly needs a comment :-)

> 
> Long time since I developed using submodules, so I now forgot some catches.
> Anyway I will play with those commands tomorrow to try to understand.
> 
> [snip]
>> -    _git submodule update --init --recursive
>> +    _git submodule update --init --recursive --force
> 
> It seems --force does not exist on git 1.7.1.
> It is still possible I did something wrong when creating the docker image.
> https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63519024

 No, it doesn't exist.

 Note that git 1.7.1 also doesn't have the .git files in the submodules; the
submodules have .git directories, i.e. they're indistinguishable from normal
repos. I think that that is not a problem for this patch, since I think git
1.7.1 *will* kill .git directories when doing git clean --ffdx, but I'm not
sure. Note that this implies that submodules will *not* be cached...

 I'm liking the bare repos more and more :-)

 Regards,
 Arnout

> It does work for git 1.8.3.
> https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63525858
> 
> 
> Regards,
> Ricardo
> 

-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 0/4] support/download: make the git backend more robust
  2018-04-17 16:48 [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Yann E. MORIN
                   ` (3 preceding siblings ...)
  2018-04-17 16:48 ` [Buildroot] [PATCH 4/4] download/git: always do full-clone Yann E. MORIN
@ 2018-04-18  8:40 ` Thomas Petazzoni
  2018-04-18  8:52   ` Thomas Petazzoni
  4 siblings, 1 reply; 22+ messages in thread
From: Thomas Petazzoni @ 2018-04-18  8:40 UTC (permalink / raw)
  To: buildroot

Hello,

On Tue, 17 Apr 2018 18:48:19 +0200, Yann E. MORIN wrote:

> Yann E. MORIN (4):
>       download/git: ensure we always work in the expected repository
>       download/git: ensure we have a sane repository
>       download/git: ensure we can checkout repos with submodule conversions
>       download/git: always do full-clone

I have not yet tested with this patch series applied, but with the
current master, I see a difference in behavior between what happens on
my machine (recent Fedora system) and my build server (ancient Debian).

Note: on both cases, the DL_DIR/squashfs folder was entirely removed,
so we start from a situation where there is no Git cache.

On my machine, fetching squashfs works fine:

thomas at windsurf:~/projets/buildroot (master)$ make host-squashfs-extract
/usr/bin/make -j1 O=/home/thomas/projets/buildroot/output HOSTCC="/usr/bin/gcc" HOSTCXX="/usr/bin/g++" silentoldconfig
>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Downloading
Initialized empty Git repository in /home/thomas/dl/squashfs/git/.git/
Fetching all references
remote: Counting objects: 8972, done.
remote: Total 8972 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (8972/8972), 1.56 MiB | 1.66 MiB/s, done.
Resolving deltas: 100% (6544/6544), done.
From https://git.kernel.org/pub/scm/fs/squashfs/squashfs-tools
 * [new branch]      lz4        -> origin/lz4
 * [new branch]      master     -> origin/master
 * [new branch]      stable     -> origin/stable
warning: refname 'e38956b92f738518c29734399629e7cdb33072d3' is ambiguous.
Git normally never creates a ref that ends with 40 hex characters
because it will be ignored when you just specify 40-hex. These refs
may be created by mistake. For example,

  git checkout -b $br $(git rev-parse ...)

where "$br" is somehow empty and a 40-hex ref is created. Please
examine these refs and maybe delete them. Turn this message off by
running "git config advice.objectNameWarning false"
squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz: OK (sha256: bd0aa3011320b8ebee68aa406060de277bef16daf81bad5b9f70cbea6db1a779)
>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Extracting
gzip -d -c /home/thomas/dl/squashfs/squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz | tar --strip-components=1 -C /home/thomas/projets/buildroot/output/build/host-squashfs-e38956b92f738518c29734399629e7cdb33072d3   -xf -

On my build server however, it fails badly:

test at build:~/buildroot$ make host-squashfs-extract
>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Downloading
Initialized empty Git repository in /home/test/dl/squashfs/git/.git/
Fetching all references
Could not fetch special ref 'e38956b92f738518c29734399629e7cdb33072d3'; assuming it is not special.
fatal: reference is not a tree: e38956b92f738518c29734399629e7cdb33072d3
--2018-04-18 07:28:31--  http://sources.buildroot.net/squashfs/squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz
Resolving sources.buildroot.net (sources.buildroot.net)... 104.27.166.48, 104.27.167.48, 2400:cb00:2048:1::681b:a730, ...
Connecting to sources.buildroot.net (sources.buildroot.net)|104.27.166.48|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-04-18 07:28:31 ERROR 404: Not Found.

--2018-04-18 07:28:31--  http://sources.buildroot.net/squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz
Resolving sources.buildroot.net (sources.buildroot.net)... 104.27.166.48, 104.27.167.48, 2400:cb00:2048:1::681b:a630, ...
Connecting to sources.buildroot.net (sources.buildroot.net)|104.27.166.48|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-04-18 07:28:31 ERROR 404: Not Found.

make: *** [/home/test/buildroot/output/build/host-squashfs-e38956b92f738518c29734399629e7cdb33072d3/.stamp_downloaded] Error 1

My machine has Git 2.14.3, the build server has Git 1.7.10.4. I've
tested with another git-fetched package, "ubus", and I have the same
behavior.

Best regards,

Thomas Petazzoni
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 0/4] support/download: make the git backend more robust
  2018-04-18  8:40 ` [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Thomas Petazzoni
@ 2018-04-18  8:52   ` Thomas Petazzoni
  2018-04-18 13:28     ` Ricardo Martincoski
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Petazzoni @ 2018-04-18  8:52 UTC (permalink / raw)
  To: buildroot

Hello,

On Wed, 18 Apr 2018 10:40:21 +0200, Thomas Petazzoni wrote:

> I have not yet tested with this patch series applied, but with the
> current master, I see a difference in behavior between what happens on
> my machine (recent Fedora system) and my build server (ancient Debian).

No change with the patch series applied. It does a full clone, but
fails after that:

test at build:~/buildroot$ rm -rf ~/dl/squashfs/
test at build:~/buildroot$ make host-squashfs-extract
>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Downloading
Initialized empty Git repository in /home/test/dl/squashfs/git/.git/
Fetching all references 
remote: Counting objects: 8972, done.
remote: Total 8972 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (8972/8972), 1.56 MiB | 2.51 MiB/s, done.
Resolving deltas: 100% (6544/6544), done.
From https://git.kernel.org/pub/scm/fs/squashfs/squashfs-tools
 * [new branch]      lz4        -> origin/lz4
 * [new branch]      master     -> origin/master
 * [new branch]      stable     -> origin/stable
Could not fetch special ref 'e38956b92f738518c29734399629e7cdb33072d3'; assuming it is not special.
ERROR: squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz has wrong sha256 hash:
ERROR: expected: bd0aa3011320b8ebee68aa406060de277bef16daf81bad5b9f70cbea6db1a779
ERROR: got     : c7a61e3bcabb716b268f5a341055ac5ecda8b9f2b42025f82926f201ff5c8881
ERROR: Incomplete download, or man-in-the-middle (MITM) attack
--2018-04-18 08:41:14--  http://sources.buildroot.net/squashfs/squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz
Resolving sources.buildroot.net (sources.buildroot.net)... 104.27.166.48, 104.27.167.48, 2400:cb00:2048:1::681b:a730, ...
Connecting to sources.buildroot.net (sources.buildroot.net)|104.27.166.48|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-04-18 08:41:14 ERROR 404: Not Found.

--2018-04-18 08:41:14--  http://sources.buildroot.net/squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz
Resolving sources.buildroot.net (sources.buildroot.net)... 104.27.166.48, 104.27.167.48, 2400:cb00:2048:1::681b:a730, ...
Connecting to sources.buildroot.net (sources.buildroot.net)|104.27.166.48|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-04-18 08:41:14 ERROR 404: Not Found.

make: *** [/home/test/buildroot/output/build/host-squashfs-e38956b92f738518c29734399629e7cdb33072d3/.stamp_downloaded] Error 1

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 0/4] support/download: make the git backend more robust
  2018-04-18  8:52   ` Thomas Petazzoni
@ 2018-04-18 13:28     ` Ricardo Martincoski
  2018-04-18 14:43       ` Thomas Petazzoni
  0 siblings, 1 reply; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-18 13:28 UTC (permalink / raw)
  To: buildroot

Hello,

On Wed, Apr 18, 2018 at 05:52 AM, Thomas Petazzoni wrote:

> On Wed, 18 Apr 2018 10:40:21 +0200, Thomas Petazzoni wrote:
> 
>> I have not yet tested with this patch series applied, but with the
>> current master, I see a difference in behavior between what happens on
>> my machine (recent Fedora system) and my build server (ancient Debian).
> 
> No change with the patch series applied. It does a full clone, but
> fails after that:
> 
> test at build:~/buildroot$ rm -rf ~/dl/squashfs/
> test at build:~/buildroot$ make host-squashfs-extract
>>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Downloading
> Initialized empty Git repository in /home/test/dl/squashfs/git/.git/
> Fetching all references 
> remote: Counting objects: 8972, done.
> remote: Total 8972 (delta 0), reused 0 (delta 0)
> Receiving objects: 100% (8972/8972), 1.56 MiB | 2.51 MiB/s, done.
> Resolving deltas: 100% (6544/6544), done.
> From https://git.kernel.org/pub/scm/fs/squashfs/squashfs-tools
>  * [new branch]      lz4        -> origin/lz4
>  * [new branch]      master     -> origin/master
>  * [new branch]      stable     -> origin/stable
> Could not fetch special ref 'e38956b92f738518c29734399629e7cdb33072d3'; assuming it is not special.
> ERROR: squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz has wrong sha256 hash:
> ERROR: expected: bd0aa3011320b8ebee68aa406060de277bef16daf81bad5b9f70cbea6db1a779
> ERROR: got     : c7a61e3bcabb716b268f5a341055ac5ecda8b9f2b42025f82926f201ff5c8881
> ERROR: Incomplete download, or man-in-the-middle (MITM) attack

Could be the case your build server has a blacklisted tar version and you run
the commands in a clean output (actually without host-tar built)?

A quick test is to run with the series applied:
$ rm -rf ~/dl/squashfs/
$ make host-tar
$ make host-squashfs-extract
And check if the hash still mismatches.

My computer has:
git version 2.14.1
tar (GNU tar) 1.29
So in order to trigger the host-tar build I did:
$ sed -e 's,minor_max=29,minor_max=28,g' -i support/dependencies/check-host-tar.sh

In the current master with the change above, I did:
$ make clean ; make defconfig; rm -rf ~/dl/squashfs/; make host-squashfs-extract

>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Downloading
Initialized empty Git repository in /home/ricardo/dl/squashfs/git/.git/
Fetching all references
remote: Counting objects: 8972, done.
remote: Total 8972 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (8972/8972), 1.73 MiB | 15.00 KiB/s, done.
Resolving deltas: 100% (6529/6529), done.
From https://git.kernel.org/pub/scm/fs/squashfs/squashfs-tools
 * [new branch]      lz4        -> origin/lz4
 * [new branch]      master     -> origin/master
 * [new branch]      stable     -> origin/stable
warning: refname 'e38956b92f738518c29734399629e7cdb33072d3' is ambiguous.
Git normally never creates a ref that ends with 40 hex characters
because it will be ignored when you just specify 40-hex. These refs
may be created by mistake. For example,

  git checkout -b $br $(git rev-parse ...)

where "$br" is somehow empty and a 40-hex ref is created. Please
examine these refs and maybe delete them. Turn this message off by
running "git config advice.objectNameWarning false"
squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz: OK (sha256: bd0aa3011320b8ebee68aa406060de277bef16daf81bad5b9f70cbea6db1a779)
>>> host-skeleton  Extracting
>>> host-skeleton  Patching
>>> host-skeleton  Configuring
>>> host-skeleton  Building
>>> host-skeleton  Installing to host directory
lzip-1.20.tar.gz: OK (sha256: c93b81a5a7788ef5812423d311345ba5d3bd4f5ebf1f693911e3a13553c1290c)
>>> host-tar 1.29 Downloading

host-tar is a pre-extract dependency, so when using intermediate targets
(I mean -source, -extract) in a clean output, host-tar is built before source
extraction, but not necessarily before download (and tarball generation).

The most common scenario for build farms (make xx_defconfig && make) works fine
most of the time because for an old distro almost certainly at least one host
package that does not use SCM download methods is extracted before any download
with SCM methods, and that triggers the build of host-tar.

But I think it is not that uncommon to use 'make xx_defconfig && make source &&
make' in build farms so the build fails early if the download fails, either by
a missing version on the remote or by human error selecting the wrong version
on a .mk file.

And we have also the use case of new distros with too recent tar versions.

Not sure how to fix it.

Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 0/4] support/download: make the git backend more robust
  2018-04-18 13:28     ` Ricardo Martincoski
@ 2018-04-18 14:43       ` Thomas Petazzoni
  2018-04-18 21:35         ` Ricardo Martincoski
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Petazzoni @ 2018-04-18 14:43 UTC (permalink / raw)
  To: buildroot

Hello,

On Wed, 18 Apr 2018 10:28:24 -0300, Ricardo Martincoski wrote:

> Could be the case your build server has a blacklisted tar version and you run
> the commands in a clean output (actually without host-tar built)?

It is indeed the case (I have an old tar), and indeed building host-tar
first fixes the problem.

When I don't build host-tar, what happens is:

test at build:~/buildroot$ make host-squashfs-extract
>>> host-squashfs e38956b92f738518c29734399629e7cdb33072d3 Downloading  
Initialized empty Git repository in /home/test/dl/squashfs/git/.git/
Fetching all references 
remote: Counting objects: 8972, done.
remote: Total 8972 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (8972/8972), 1.56 MiB | 2.51 MiB/s, done.
Resolving deltas: 100% (6544/6544), done.
From https://git.kernel.org/pub/scm/fs/squashfs/squashfs-tools
 * [new branch]      lz4        -> origin/lz4
 * [new branch]      master     -> origin/master
 * [new branch]      stable     -> origin/stable
Could not fetch special ref 'e38956b92f738518c29734399629e7cdb33072d3'; assuming it is not special.
ERROR: squashfs-e38956b92f738518c29734399629e7cdb33072d3.tar.gz has wrong sha256 hash:
ERROR: expected: bd0aa3011320b8ebee68aa406060de277bef16daf81bad5b9f70cbea6db1a779
ERROR: got     : c7a61e3bcabb716b268f5a341055ac5ecda8b9f2b42025f82926f201ff5c8881
ERROR: Incomplete download, or man-in-the-middle (MITM) attack

So I assume it has used the system tar, which generates tar archives
whose hash doesn't match the one generated by "good" tar versions. Is
that the problem I was having ?

So, we indeed have a serious problem here. host-tar is not an extract
dependency, but a download dependency. Meh. Crap. This breaks several
things:

 - make <foo>-source on Git packages from a clean build

 - A regular build, if the first package downloaded is fetched from Git
   and no other package has been extracted before. Indeed, in such a
   case, host-tar would not yet be built/installed.

Gaaaah.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 0/4] support/download: make the git backend more robust
  2018-04-18 14:43       ` Thomas Petazzoni
@ 2018-04-18 21:35         ` Ricardo Martincoski
  0 siblings, 0 replies; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-18 21:35 UTC (permalink / raw)
  To: buildroot

Hello,

On Wed, Apr 18, 2018 at 11:43 AM, Thomas Petazzoni wrote:

> So I assume it has used the system tar, which generates tar archives
> whose hash doesn't match the one generated by "good" tar versions. Is
> that the problem I was having ?

Exactly.

Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-18  8:04     ` Arnout Vandecappelle
@ 2018-04-19  0:59       ` Ricardo Martincoski
  2018-04-19 19:59       ` Yann E. MORIN
  1 sibling, 0 replies; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-19  0:59 UTC (permalink / raw)
  To: buildroot

Hello,

On Wed, Apr 18, 2018 at 05:04 AM, Arnout Vandecappelle wrote:

> On 18-04-18 05:13, Ricardo Martincoski wrote:
>>> +    _git submodule update --init --recursive --force
>> 
>> It seems --force does not exist on git 1.7.1.
>> It is still possible I did something wrong when creating the docker image.
>> https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63519024
> 
>  No, it doesn't exist.
> 
>  Note that git 1.7.1 also doesn't have the .git files in the submodules; the
> submodules have .git directories, i.e. they're indistinguishable from normal
> repos. I think that that is not a problem for this patch, since I think git
> 1.7.1 *will* kill .git directories when doing git clean --ffdx, but I'm not
> sure. Note that this implies that submodules will *not* be cached...

And that also implies the hash will not match.

So for this patch too, let's take a step back...

In release 2018.02.1 we have 2 packages in the tree with submodules:
They... well... don't work with git 1.7.1

azure-iot-sdk-c: download hungs forever
https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63672026
I tested locally using git 1.7.1 without buildroot: for azure-iot-sdk-c git
itself hangs forever on 'git submodule update --init --recursive'.

sunxi-mali: has a hash mismatch (due to the .git dir from submodule)
https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63549538

And no one complained as far as I can remember.
So by supporting git 1.7.1 with submodules, it seems to me we will add
complexity for a combination that nobody is using.

Thoughts?


Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository
  2018-04-17 16:48 ` [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository Yann E. MORIN
@ 2018-04-19 15:47   ` Ricardo Martincoski
  2018-04-19 20:38   ` Thomas Petazzoni
  1 sibling, 0 replies; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-19 15:47 UTC (permalink / raw)
  To: buildroot

Hello,

On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:

> git always look directories up until it finds a repository. In case
> the git cache is broken, it may no longer be identified as a repository,
> and git will look higher in the directories until it finds one.
> 
> In the default conditions, this would be Buildroot's own git tree
> (because DL_DIR is a subdir of Buildroot), but in some situations may
> very well be any repository the user has Buildroot in, like a
> br2-external tree...
> 
> So, we force git to use our git cache and never look elsewhere, as
> Suggested by Ricardo.
> 
> Use GIT_DIR, as it has been there for ages now, while --git-dir was
> only introduced later (even if most distros ship an later version),
> as suggested by Arnout.
> 
> Also fix the one call to git that was not using the wrapper.
> 
> Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
> Cc: Arnout Vandecappelle <arnout@mind.be>
> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>

Acked-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
[download script bails out for a completely broken git cache, avoiding changes
 to the buildroot repo when the broken git cache is in a subdir (the default)]
Tested-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>


Tests performed:

I choose dt-utils as its version is a tag, so it does not trigger the problem
with git<1.9.0 fixed by patch 4.

1) Create an empty dir in the place of git cache and download:
 Using current master 56823d6:
  git 1.7.1: [process hungs and the buildroot repo had its origin url changed]
  git 1.8.3 and 2.11.0: [the buildroot repo had its origin url changed and also
                   the commit from dt-utils is check out in the buildroot repo]
 Current master + this patch:
  git 1.7.1 and 1.8.3 and 2.11.0: [download script bails out as expected,
                                   falling back to backup site]
2) Download with no initial git cache
 Current master + this patch:
  git 1.7.1: [works fine, fetch all refs]
  git 1.8.3 and 2.11.0: [works fine, shallow fetch]

Commands used and outputs:
https://gist.github.com/ricardo-martincoski/ea341b69a4ebeadf38e4ab02ba33adf8


Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository
  2018-04-17 16:48 ` [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository Yann E. MORIN
@ 2018-04-19 15:50   ` Ricardo Martincoski
  2018-04-19 19:45     ` Yann E. MORIN
  2018-04-19 20:38   ` Thomas Petazzoni
  1 sibling, 1 reply; 22+ messages in thread
From: Ricardo Martincoski @ 2018-04-19 15:50 UTC (permalink / raw)
  To: buildroot

Hello,

On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:

> There are cases where a repository might be broken, e.g. when a previous
> operation was killed or otherwise failed unexpectedly.
> 
> We fix that by always initialising the repository, as suggested by
> Ricardo. git-init is safe on an otherwise-healthy repository:
> 
>     Running git init in an existing repository is safe. It will not
>     overwrite things that are already there. [...]
> 
> Using git-init will just ensure that we have the strictly required files
> to form a sane tree. Any blob that is still missing would get fetched
> later on.
> 
> Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
> Cc: Arnout Vandecappelle <arnout@mind.be>
> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>

Acked-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
[a broken repo with a clean worktree is recovered at the extent that git allows]
Tested-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>


Tests performed with patches 1 and 2 applied:

1) In the same scenario tested for patch 1 (empty dl/<package>/git) instead of
   bailing out the script reinitialises and uses the git cache.

2) Using git 2.11.0, download all git packages in the tree, remove the tarball
   and regenerate it 2 times, before [1] and after [2] this patch.
[1] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20734086
[2] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20729761
In theses GitLab pipelines there are jobs marked as failures (remote server did
not respond, ...) but they are not related to patch 1 or 2. The same occur
before and after the 2 patches.

Regards,
Ricardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository
  2018-04-19 15:50   ` Ricardo Martincoski
@ 2018-04-19 19:45     ` Yann E. MORIN
  0 siblings, 0 replies; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-19 19:45 UTC (permalink / raw)
  To: buildroot

Ricardo, All,

On 2018-04-19 12:50 -0300, Ricardo Martincoski spake thusly:
> On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:
> > There are cases where a repository might be broken, e.g. when a previous
> > operation was killed or otherwise failed unexpectedly.
> > 
> > We fix that by always initialising the repository, as suggested by
> > Ricardo. git-init is safe on an otherwise-healthy repository:
> > 
> >     Running git init in an existing repository is safe. It will not
> >     overwrite things that are already there. [...]
> > 
> > Using git-init will just ensure that we have the strictly required files
> > to form a sane tree. Any blob that is still missing would get fetched
> > later on.
> > 
> > Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> > Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> > Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> > Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> > Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
> > Cc: Arnout Vandecappelle <arnout@mind.be>
> > Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> 
> Acked-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> [a broken repo with a clean worktree is recovered at the extent that git allows]
> Tested-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> 
> 
> Tests performed with patches 1 and 2 applied:
> 
> 1) In the same scenario tested for patch 1 (empty dl/<package>/git) instead of
>    bailing out the script reinitialises and uses the git cache.
> 
> 2) Using git 2.11.0, download all git packages in the tree, remove the tarball
>    and regenerate it 2 times, before [1] and after [2] this patch.
> [1] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20734086
> [2] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20729761
> In theses GitLab pipelines there are jobs marked as failures (remote server did
> not respond, ...) but they are not related to patch 1 or 2. The same occur
> before and after the 2 patches.

Thanks for the extensive testing! :-)

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-18  8:04     ` Arnout Vandecappelle
  2018-04-19  0:59       ` Ricardo Martincoski
@ 2018-04-19 19:59       ` Yann E. MORIN
  2018-04-19 23:30         ` Arnout Vandecappelle
  1 sibling, 1 reply; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-19 19:59 UTC (permalink / raw)
  To: buildroot

Arnout, Ricardo, All,

On 2018-04-18 10:04 +0200, Arnout Vandecappelle spake thusly:
> On 18-04-18 05:13, Ricardo Martincoski wrote:
> > Hello,
> > 
> > Really sorry I did finished the review/tests today to send by tags.

He, no problem at all! ;-)

> > Could you double-check the --force option for git 1.7.1? I am not sure we would
> > have a workaround for this if it is not supported.
> 
>  The help of 1.7.1 says:
> 
> git submodule [--quiet] update [--init] [-N|--no-fetch] [--rebase] [--reference
> <repository>] [--merge] [--recursive] [--] [<path>...]
> 
> No force there... Checking the code, I don't see any handling of -f or --force.

I don't have access to a pre 1.8.0 git here... :-/

> > Also a question and some typos.
> > 
> > On Tue, Apr 17, 2018 at 01:48 PM, Yann E. MORIN wrote:
> [snip]
> >> +# files after the checkout.
> >> +find . -type f -name .git -exec rm {} +
> > 
> > In the unlikely case some user tries to abuse the git cache by first populating
> > it using the 'git multiple working trees' feature the main
> > dl/<package>/git/.git would be a file too. After this command all git commands
> > will fail.
> > But... well... that is expected! The user should not mess with the git cache,
> > it is there to be used solely by buildroot.
> > So we are good IMO.
>  +1

Indeed, we should not try to support that case.

>  That said, IMO it would be better if the git cache were a bare repo, and that
> we did the checkout outside of the repo. But I don't think that works well with
> submodules. Submodules are annoying...

Yes, see below for an explanation in a few git commands what I'm trying
to work around...

> >> +# Restore repository to the new, clean-checkout state.
> >> +_git clean -ffdx
> >> +_git checkout -- .
> > 
> > So it is needed to checkout the .git files again? Or did I miss something?
>  I'm missing the point as well. Clearly needs a comment :-)

Well, the situation is pretty tricky, and I don't understand all the
conditions. But, two things:

  - there can be cases where git checkout did not properly finish in the
    past, so we'd be left with untracked files (as is the cases
    currently in some of the autobuilders), or in case the user did
    interrupt git in the middle of a checkout, or in case a CI job
    killed a runaway job...

  - we could well have removed .git files that were not indications of
    submodules, so we want torestore those.

Speaking of submodules, what this whole patch is about is to workaround
idiosyncrasies in the way git handles sub;odules.

Consider this sequence:

    $ git init meh.git; cd meh.git
    $ mkdir foo; echo BAR >foo/bar
    $ git add foo/bar; git commit -m "foo/bar: BAR"
    $ cp -a foo ../foo.git
    $ git rm -r foo; git commit -m "foo/bar: before submod"

    $ cd ../foo.git; git init .
    $ git add bar; git commit -m "bar: BAR"

    $ cd ../meh.git
    $ git submodule add ../foo.git foo
    $ git commit -m "foo: submod"

Now that we have our git trees, with a sub-dir to sub-module conversion,
let's navigate in the history:

    $ git checkout HEAD^^  # Back to first commit: foo is just a subdir
    error: The following untracked working tree files would be
    overwritten by checkout:
        foo/bar
    Please move or remove them before you switch branches.
    Aborting

    $ rm -f foo/.git
    $ git checkout HEAD^^  # Back to first commit: foo is just a subdir
    warning: unable to rmdir foo: Directory not empty
    Note: checking out 'HEAD^^'.
    [...]
    HEAD is now at f18f7ad... foo/bar: BAR

    $ ls -lA foo/
    -rw-rw-r-- 1 ymorin ymorin  4 avril 18 21:25 bar
    -rw-rw-r-- 1 ymorin ymorin 28 avril 18 21:24 .git

Meh... Extra .git file lingering around... :-/

    $ git clean -ffdx

    $ ls -lA foo/
    -rw-rw-r-- 1 ymorin ymorin  4 avril 18 21:25 bar
    -rw-rw-r-- 1 ymorin ymorin 28 avril 18 21:24 .git

Bummer, still there... Let's remove it manually, then:

    $ rm -f foo/.git

Now, let's come back to master:

    $ git checkout master
    Previous HEAD position was f18f7ad... foo/bar: BAR
    Switched to branch 'master'
    $ ls -lA foo/
    [nothing, deep-space void...]

    $ git submodule update --init
    Submodule path 'foo': checked out '1184643257e1e3f63a2f849dd4737c5055184aff'

(weird, it sorks now; I'm pretty sure I got a case where Ineeded a
--force to the sbmodule update... Sigh, I'll have to investigate
further...)

> > Long time since I developed using submodules, so I now forgot some catches.
> > Anyway I will play with those commands tomorrow to try to understand.
> > 
> > [snip]
> >> -    _git submodule update --init --recursive
> >> +    _git submodule update --init --recursive --force
> > 
> > It seems --force does not exist on git 1.7.1.
> > It is still possible I did something wrong when creating the docker image.
> > https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63519024
> 
>  No, it doesn't exist.

Grrr... :-(

On the other hand, I can't reproduce the case where I needed the
submodule update --force. So maybe it is not needed, afterall?

I'll investigate further.

>  Note that git 1.7.1 also doesn't have the .git files in the submodules; the
> submodules have .git directories, i.e. they're indistinguishable from normal
> repos. I think that that is not a problem for this patch, since I think git
> 1.7.1 *will* kill .git directories when doing git clean --ffdx, but I'm not
> sure. Note that this implies that submodules will *not* be cached...
> 
>  I'm liking the bare repos more and more :-)

But then, we can't have submodules stored in a bare repo, because we
need a workign copy to have the list of submodules, and their
versions...

Regards,
Yann E. MORIN.

>  Regards,
>  Arnout
> 
> > It does work for git 1.8.3.
> > https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/63525858
> > 
> > 
> > Regards,
> > Ricardo
> > 
> 
> -- 
> Arnout Vandecappelle                          arnout at mind be
> Senior Embedded Software Architect            +32-16-286500
> Essensium/Mind                                http://www.mind.be
> G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
> LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
> GPG fingerprint:  7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository
  2018-04-17 16:48 ` [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository Yann E. MORIN
  2018-04-19 15:47   ` Ricardo Martincoski
@ 2018-04-19 20:38   ` Thomas Petazzoni
  1 sibling, 0 replies; 22+ messages in thread
From: Thomas Petazzoni @ 2018-04-19 20:38 UTC (permalink / raw)
  To: buildroot

Hello,

On Tue, 17 Apr 2018 18:48:20 +0200, Yann E. MORIN wrote:
> git always look directories up until it finds a repository. In case
> the git cache is broken, it may no longer be identified as a repository,
> and git will look higher in the directories until it finds one.
> 
> In the default conditions, this would be Buildroot's own git tree
> (because DL_DIR is a subdir of Buildroot), but in some situations may
> very well be any repository the user has Buildroot in, like a
> br2-external tree...
> 
> So, we force git to use our git cache and never look elsewhere, as
> Suggested by Ricardo.
> 
> Use GIT_DIR, as it has been there for ages now, while --git-dir was
> only introduced later (even if most distros ship an later version),
> as suggested by Arnout.
> 
> Also fix the one call to git that was not using the wrapper.
> 
> Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
> Cc: Arnout Vandecappelle <arnout@mind.be>
> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> ---
>  support/download/git | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)

Applied to master, thanks.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository
  2018-04-17 16:48 ` [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository Yann E. MORIN
  2018-04-19 15:50   ` Ricardo Martincoski
@ 2018-04-19 20:38   ` Thomas Petazzoni
  1 sibling, 0 replies; 22+ messages in thread
From: Thomas Petazzoni @ 2018-04-19 20:38 UTC (permalink / raw)
  To: buildroot

Hello,

On Tue, 17 Apr 2018 18:48:21 +0200, Yann E. MORIN wrote:
> There are cases where a repository might be broken, e.g. when a previous
> operation was killed or otherwise failed unexpectedly.
> 
> We fix that by always initialising the repository, as suggested by
> Ricardo. git-init is safe on an otherwise-healthy repository:
> 
>     Running git init in an existing repository is safe. It will not
>     overwrite things that are already there. [...]
> 
> Using git-init will just ensure that we have the strictly required files
> to form a sane tree. Any blob that is still missing would get fetched
> later on.
> 
> Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> Reported-by: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> Cc: Ricardo Martincoski <ricardo.martincoski@gmail.com>
> Cc: Maxime Hadjinlian <maxime.hadjinlian@gmail.com>
> Cc: Arnout Vandecappelle <arnout@mind.be>
> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
> ---
>  support/download/git | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)

Applied to master, thanks.

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-19 19:59       ` Yann E. MORIN
@ 2018-04-19 23:30         ` Arnout Vandecappelle
  2018-04-20  9:25           ` Yann E. MORIN
  0 siblings, 1 reply; 22+ messages in thread
From: Arnout Vandecappelle @ 2018-04-19 23:30 UTC (permalink / raw)
  To: buildroot



On 19-04-18 21:59, Yann E. MORIN wrote:
> Arnout, Ricardo, All,
> 
> On 2018-04-18 10:04 +0200, Arnout Vandecappelle spake thusly:
[snip]
>>  Note that git 1.7.1 also doesn't have the .git files in the submodules; the
>> submodules have .git directories, i.e. they're indistinguishable from normal
>> repos. I think that that is not a problem for this patch, since I think git
>> 1.7.1 *will* kill .git directories when doing git clean --ffdx, but I'm not
>> sure. Note that this implies that submodules will *not* be cached...
>>
>>  I'm liking the bare repos more and more :-)
> 
> But then, we can't have submodules stored in a bare repo, because we
> need a workign copy to have the list of submodules, and their
> versions...

 It's *possible* to do things fully "manually", e.g. with some plumbing commands:

export GIT_DIR=/path/to/bare/repo
blob=$(git ls-tree $cset .gitmodules | awk '{print $3}')
if [ "$blob" ]; then
  tmpfile=$(git unpack-file $blob)
  etc. etc.


 Regards,
 Arnout

-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions
  2018-04-19 23:30         ` Arnout Vandecappelle
@ 2018-04-20  9:25           ` Yann E. MORIN
  0 siblings, 0 replies; 22+ messages in thread
From: Yann E. MORIN @ 2018-04-20  9:25 UTC (permalink / raw)
  To: buildroot

Arnout, All,

On 2018-04-20 01:30 +0200, Arnout Vandecappelle spake thusly:
> On 19-04-18 21:59, Yann E. MORIN wrote:
> > On 2018-04-18 10:04 +0200, Arnout Vandecappelle spake thusly:
[--SNIP--]
> >>  I'm liking the bare repos more and more :-)
> > But then, we can't have submodules stored in a bare repo, because we
> > need a workign copy to have the list of submodules, and their
> > versions...
> 
>  It's *possible* to do things fully "manually", e.g. with some plumbing commands:
> 
> export GIT_DIR=/path/to/bare/repo
> blob=$(git ls-tree $cset .gitmodules | awk '{print $3}')
> if [ "$blob" ]; then
>   tmpfile=$(git unpack-file $blob)
>   etc. etc.

I hope you were only joking there... ;-)

No, no, no... We don't want to go that route, reinventing the git
internals in a shell script... Even I can't find this appealing...

Well, that could be fun to write...
No, don't look down the deep ravine... ;-]

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-04-20  9:25 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-17 16:48 [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Yann E. MORIN
2018-04-17 16:48 ` [Buildroot] [PATCH 1/4] download/git: ensure we always work in the expected repository Yann E. MORIN
2018-04-19 15:47   ` Ricardo Martincoski
2018-04-19 20:38   ` Thomas Petazzoni
2018-04-17 16:48 ` [Buildroot] [PATCH 2/4] download/git: ensure we have a sane repository Yann E. MORIN
2018-04-19 15:50   ` Ricardo Martincoski
2018-04-19 19:45     ` Yann E. MORIN
2018-04-19 20:38   ` Thomas Petazzoni
2018-04-17 16:48 ` [Buildroot] [PATCH 3/4] download/git: ensure we can checkout repos with submodule conversions Yann E. MORIN
2018-04-18  3:13   ` Ricardo Martincoski
2018-04-18  8:04     ` Arnout Vandecappelle
2018-04-19  0:59       ` Ricardo Martincoski
2018-04-19 19:59       ` Yann E. MORIN
2018-04-19 23:30         ` Arnout Vandecappelle
2018-04-20  9:25           ` Yann E. MORIN
2018-04-17 16:48 ` [Buildroot] [PATCH 4/4] download/git: always do full-clone Yann E. MORIN
2018-04-18  3:18   ` Ricardo Martincoski
2018-04-18  8:40 ` [Buildroot] [PATCH 0/4] support/download: make the git backend more robust Thomas Petazzoni
2018-04-18  8:52   ` Thomas Petazzoni
2018-04-18 13:28     ` Ricardo Martincoski
2018-04-18 14:43       ` Thomas Petazzoni
2018-04-18 21:35         ` Ricardo Martincoski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.