* [Buildroot] [RFC PATCH 0/3] fix some corner cases for download/git v1 @ 2018-04-12 9:28 Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git Ricardo Martincoski ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-12 9:28 UTC (permalink / raw) To: buildroot Hello, This series brings 3 patches trying to cover some corner cases when downloading packages with method git. The first 2 patches are intended to fix autobuilder failures. http://autobuild.buildroot.net/?reason=dtv-scan-tables-07b18ecef17493ac0908a2e112eae3fe269da7fa This occurs only with old versions of git when a full fetch is needed: fatal: reference is not a tree: sha1 These occur when the git cache is dirty: error: The following untracked working tree files would be overwritten by checkout: error: The following untracked working tree files would be removed by checkout: A dirty cache can also cause hash mismatches for the tarball. Patch 3 should fix the case of unshallowing the git cache (when a shallow fetch downloaded a ref, and later a commit behind this ref is needed). This scenario is relevant both during the buildroot development (a revert of a bump) and when a build farm is shared by projects that use different versions of buildroot. Regards, Ricardo Ricardo Martincoski (3): download/git: fix fetch all refs for old git download/git: recover dirty cache download/git: unshallow when fetching all refs support/download/git | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) -- 2.14.1 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git 2018-04-12 9:28 [Buildroot] [RFC PATCH 0/3] fix some corner cases for download/git v1 Ricardo Martincoski @ 2018-04-12 9:28 ` Ricardo Martincoski 2018-04-12 17:42 ` Yann E. MORIN 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs Ricardo Martincoski 2 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-12 9:28 UTC (permalink / raw) To: buildroot With old versions of git, when the shallow fetch was not enough, the download fails like this: Fetching all references Could not fetch special ref 'sha1'; assuming it is not special. fatal: reference is not a tree: sha1 At git version 1.9.0, the semantics of the option -t for git fetch changed, see upstream commit [1]: < 1.9.0: "fetch tags without also fetching other references"; >=1.9.0: "fetch tags *in addition to* other stuff". Fall back to explicit refspec to support distros that use ancient versions of git. Fixes: http://autobuild.buildroot.net/results/7a88d1aa9ea155f1527d2fa141207c676af85298 [1] https://github.com/git/git/commit/c5a84e92a2fe9e8748e32341c344d7a6c0f52a50 Signed-off-by: Ricardo Martincoski <ricardo.martincoski@gmail.com> Cc: "Yann E. MORIN" <yann.morin.1998@free.fr> --- WARNING: please review carefully, especially related to the plus sign. I created a special docker image replacing git 2.11.0 with 1.8.3 [2]. Using it together to a special .gitlab-ci.yml [3], I could test in Gitlab CI the download, without previous git cache and without backup site, of all packages in the tree that use site method git in these combinations: - [4] patch above master (20c1647f) using git 1.8.3; - [5] patch above master (20c1647f) using git 2.11.0; - [6] master (20c1647f) using git 1.8.3; - [7] master (20c1647f) using git 2.11.0; - [8] 2018.02.1 using git 1.8.3; - [9] 2018.02.1 using git 2.11.0. These packages fail with 'fatal: reference is not a tree' in the master branch using git 1.8.3 but do not fail for the other 5 combinations above: aer-inject armbian-firmware boot-wrapper-aarch64 dbus-triggerd dropwatch edid-decode expedite flashbench gst1-interpipe host-mxsldr host-pseudo host-squashfs libbroadvoice libg7221 libilbc libsilk libsoundtouch libubox libuci libyuv linux-firmware linux at beaglebone_qt5_defconfig ltrace mmc-utils net-tools open-lldp psplash rtmpdump slirp squashfs ti-sgx-demos ti-sgx-km ti-sgx-um uboot at ci20_defconfig uboot at roseapplepi_defconfig ubus uclibc-ng-test uhttpd ustream-ssl x264 xdriver_xf86-video-intel xdriver_xf86-video-voodoo yavta [2] https://gitlab.com/RicardoMartincoski/buildroot/commit/d293ba50fa8cd40eb5c396b2aa234c07e2118020 [3] http://patchwork.ozlabs.org/patch/896012/ [4] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20328681 [5] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20328707 [6] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20328727 [7] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20328745 [8] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20352543 [9] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/20356539 --- support/download/git | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/support/download/git b/support/download/git index 381f3ceeb3..1b09e5c750 100755 --- a/support/download/git +++ b/support/download/git @@ -76,7 +76,7 @@ if [ -n "$(_git ls-remote origin "'${cset}'" 2>&1)" ]; then fi if [ ${git_done} -eq 0 ]; then printf "Fetching all references\n" - _git fetch origin -t + _git fetch origin -u "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" fi # Try to get the special refs exposed by some forges (pull-requests for -- 2.14.1 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git Ricardo Martincoski @ 2018-04-12 17:42 ` Yann E. MORIN 2018-04-13 18:23 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-12 17:42 UTC (permalink / raw) To: buildroot Ricardo, All, On 2018-04-12 06:28 -0300, Ricardo Martincoski spake thusly: > With old versions of git, when the shallow fetch was not enough, the > download fails like this: > Fetching all references > Could not fetch special ref 'sha1'; assuming it is not special. > fatal: reference is not a tree: sha1 > > At git version 1.9.0, the semantics of the option -t for git fetch > changed, see upstream commit [1]: > < 1.9.0: "fetch tags without also fetching other references"; > >=1.9.0: "fetch tags *in addition to* other stuff". > > Fall back to explicit refspec to support distros that use ancient > versions of git. [--SNIP--] > diff --git a/support/download/git b/support/download/git > index 381f3ceeb3..1b09e5c750 100755 > --- a/support/download/git > +++ b/support/download/git > @@ -76,7 +76,7 @@ if [ -n "$(_git ls-remote origin "'${cset}'" 2>&1)" ]; then > fi > if [ ${git_done} -eq 0 ]; then > printf "Fetching all references\n" > - _git fetch origin -t > + _git fetch origin -u "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" Why don't we just run: git fetch origin git fetch origin -t As I understand it: ancient git | new git -------------------------------------------------------------------- git fetch | fetch all refs but tags | fetches all refs but tags git fetch -t | fetches only tags | fetch all refs and tags Right? Regards, Yann E. MORIN. > fi > > # Try to get the special refs exposed by some forges (pull-requests for > -- > 2.14.1 > -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git 2018-04-12 17:42 ` Yann E. MORIN @ 2018-04-13 18:23 ` Ricardo Martincoski 2018-04-15 12:12 ` Yann E. MORIN 0 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-13 18:23 UTC (permalink / raw) To: buildroot Hello, Thank you for your review of the series. On Thu, Apr 12, 2018 at 02:42 PM, Yann E. MORIN wrote: >> - _git fetch origin -t >> + _git fetch origin -u "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" > > Why don't we just run: > > git fetch origin > git fetch origin -t One extra connection to the server ... but much more readable/maintainable! I will do. I will change this patch to Changes Requested. > As I understand it: > > ancient git | new git > -------------------------------------------------------------------- > git fetch | fetch all refs but tags | fetches all refs but tags > git fetch -t | fetches only tags | fetch all refs and tags > > Right? Yes. Nit: actually 'git fetch' does fetch some tags, but not necessarily all. Only tags in commits also reachable by branches are fetched. Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git 2018-04-13 18:23 ` Ricardo Martincoski @ 2018-04-15 12:12 ` Yann E. MORIN 2018-04-16 2:17 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-15 12:12 UTC (permalink / raw) To: buildroot Ricardo, All, On 2018-04-13 15:23 -0300, Ricardo Martincoski spake thusly: > On Thu, Apr 12, 2018 at 02:42 PM, Yann E. MORIN wrote: > >> - _git fetch origin -t > >> + _git fetch origin -u "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" > > Why don't we just run: > > git fetch origin > > git fetch origin -t > > One extra connection to the server ... > but much more readable/maintainable! > I will do. I will change this patch to Changes Requested. Thanks. I don't care much about an extra round-trip to the server, especially if that makes the code easier to read. > > As I understand it: > > ancient git | new git > > -------------------------------------------------------------------- > > git fetch | fetch all refs but tags | fetches all refs but tags > > git fetch -t | fetches only tags | fetch all refs and tags > Nit: actually 'git fetch' does fetch some tags, but not necessarily all. Only > tags in commits also reachable by branches are fetched. OK, but overall, we would cover all cases with the two git-fetch, whether we use an old git or a newer one, and that's the important info. Thanks! :-) Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git 2018-04-15 12:12 ` Yann E. MORIN @ 2018-04-16 2:17 ` Ricardo Martincoski 0 siblings, 0 replies; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-16 2:17 UTC (permalink / raw) To: buildroot Hello, > On 2018-04-13 15:23 -0300, Ricardo Martincoski spake thusly: >> On Thu, Apr 12, 2018 at 02:42 PM, Yann E. MORIN wrote: >> >> - _git fetch origin -t >> >> + _git fetch origin -u "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" >> > Why don't we just run: >> > git fetch origin >> > git fetch origin -t >> >> One extra connection to the server ... >> but much more readable/maintainable! >> I will do. I will change this patch to Changes Requested. > > Thanks. I don't care much about an extra round-trip to the server, > especially if that makes the code easier to read. I agree. And a few years from now, when we stop supporting distros with git <1.9.0 we can remove 'git fetch origin' if we wish. BTW, the order of the commands is important here. Some dumb http servers don't work well with old git versions if 'git fetch origin -t' is used first, see: https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/62895510 But in the order you propose (and I see you used in your series) it works fine: https://gitlab.com/RicardoMartincoski/buildroot/-/jobs/62897927 So we are good. > OK, but overall, we would cover all cases with the two git-fetch, > whether we use an old git or a newer one, and that's the important > info. Agreed. Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-12 9:28 [Buildroot] [RFC PATCH 0/3] fix some corner cases for download/git v1 Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git Ricardo Martincoski @ 2018-04-12 9:28 ` Ricardo Martincoski 2018-04-12 17:48 ` Yann E. MORIN 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs Ricardo Martincoski 2 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-12 9:28 UTC (permalink / raw) To: buildroot In the case a previous operation in the git cache was abruptly interrupted the git cache can be in a dirty state (removed files, untracked files). This can lead to download failures during the checkout, like this: error: The following untracked working tree files would be overwritten by checkout: and/or this: error: The following untracked working tree files would be removed by checkout: If an untracked file, directory or submodule is present after the checkout, the tarball will have additional content, leading to a hash mismatch. When performing the checkout, use -f to allow it to occur even if there are untracked files that would be overwritten or removed. Use git clean to remove any untracked files before generating the tarball. Use the second -f to ensure the repo is pristine even if previous checked out ref contained a submodule that is not present in the ref just checked out. Fixes: http://autobuild.buildroot.net/results/b418394f679ad269d587753302c540036793334c Possibly fixes: http://autobuild.buildroot.net/results/9b584a624bb08c5ab6d977b93f4b92ed4bd1d40f Signed-off-by: Ricardo Martincoski <ricardo.martincoski@gmail.com> Cc: "Yann E. MORIN" <yann.morin.1998@free.fr> --- I did not split this into 2 patches (the first one adding -f to checkout and the second one adding git reset) because only adding -f to checkout would make the git checkout command to succeed but the download would still fails when checking the hash if any untracked file was left. In this case the git cache also would still be dirty, so it is not really a fix to a dirty cache. --- support/download/git | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/support/download/git b/support/download/git index 1b09e5c750..68b5d920a1 100755 --- a/support/download/git +++ b/support/download/git @@ -92,7 +92,8 @@ fi # Checkout the required changeset, so that we can update the required # submodules. -_git checkout -q "'${cset}'" +_git checkout -f -q "'${cset}'" +_git clean -ffdx # Get date of commit to generate a reproducible archive. # %cD is RFC2822, so it's fully qualified, with TZ and all. -- 2.14.1 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache Ricardo Martincoski @ 2018-04-12 17:48 ` Yann E. MORIN 2018-04-13 18:28 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-12 17:48 UTC (permalink / raw) To: buildroot Ricardo, All, On 2018-04-12 06:28 -0300, Ricardo Martincoski spake thusly: > In the case a previous operation in the git cache was abruptly > interrupted the git cache can be in a dirty state (removed files, > untracked files). This can lead to download failures during the > checkout, like this: > error: The following untracked working tree files would be overwritten by checkout: > and/or this: > error: The following untracked working tree files would be removed by checkout: > > If an untracked file, directory or submodule is present after the > checkout, the tarball will have additional content, leading to a hash > mismatch. > > When performing the checkout, use -f to allow it to occur even if there > are untracked files that would be overwritten or removed. > > Use git clean to remove any untracked files before generating the > tarball. Use the second -f to ensure the repo is pristine even if > previous checked out ref contained a submodule that is not present in > the ref just checked out. > > Fixes: > http://autobuild.buildroot.net/results/b418394f679ad269d587753302c540036793334c > Possibly fixes: > http://autobuild.buildroot.net/results/9b584a624bb08c5ab6d977b93f4b92ed4bd1d40f I'm not very happy of those tricks... If the repos is broken, it may be in a state that repairing it is not even possible at all, in case a critical git data is missing/damaged. If the repo is broken, either: - we ditch the repository and restart frm scratch, - or we just bail out and tell the user what hapened, so they can take action (e.g. remove the broken repo manually). Regards, Yann E. MORIN. > Signed-off-by: Ricardo Martincoski <ricardo.martincoski@gmail.com> > Cc: "Yann E. MORIN" <yann.morin.1998@free.fr> > --- > I did not split this into 2 patches (the first one adding -f to checkout > and the second one adding git reset) because only adding -f to checkout > would make the git checkout command to succeed but the download would > still fails when checking the hash if any untracked file was left. In > this case the git cache also would still be dirty, so it is not really a > fix to a dirty cache. > --- > support/download/git | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/support/download/git b/support/download/git > index 1b09e5c750..68b5d920a1 100755 > --- a/support/download/git > +++ b/support/download/git > @@ -92,7 +92,8 @@ fi > > # Checkout the required changeset, so that we can update the required > # submodules. > -_git checkout -q "'${cset}'" > +_git checkout -f -q "'${cset}'" > +_git clean -ffdx > > # Get date of commit to generate a reproducible archive. > # %cD is RFC2822, so it's fully qualified, with TZ and all. > -- > 2.14.1 > -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-12 17:48 ` Yann E. MORIN @ 2018-04-13 18:28 ` Ricardo Martincoski 2018-04-15 12:02 ` Yann E. MORIN 0 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-13 18:28 UTC (permalink / raw) To: buildroot Hello, On Thu, Apr 12, 2018 at 02:48 PM, Yann E. MORIN wrote: > On 2018-04-12 06:28 -0300, Ricardo Martincoski spake thusly: >> When performing the checkout, use -f to allow it to occur even if there >> are untracked files that would be overwritten or removed. >> >> Use git clean to remove any untracked files before generating the >> tarball. Use the second -f to ensure the repo is pristine even if >> previous checked out ref contained a submodule that is not present in >> the ref just checked out. [snip] > > I'm not very happy of those tricks... If the repos is broken, it may be > in a state that repairing it is not even possible at all, in case a > critical git data is missing/damaged. I created the series thinking only on git repos that are sane but have a dirty worktree. I should have used "dirty worktree" in the commit log. Rethinking now, taking into account a broken repo, and also the fact that the default download directory is inside buildroot, this patch is a bit dangerous. With a completely broken repo in the git cache, the commands called by the download script end up executed in any repo that eventually contains that file (parent \(of parent \)\+ directory), so: $ cd buildroot $ git init dl/package/git $ cd dl/package/git $ rm .git/HEAD # simulate a completely broken repo $ git <command> The git command would be executed in the buildroot tree. A fetch to the wrong repo is bad, but not that bad. A checkout would fail. But checkout -f and git clean would succeed, potentially making the user to lose changes. I will change this patch to Changes Requested. A way to avoid this, once the script entered the directory, is to force the git-dir: $ git --git-dir=.git <command> This command would fail for a too-much-broken repo. This is important also if we test the repo with git fsck before ditch+restart, otherwise it would be possible to return OK based on the wrong repo. > > If the repo is broken, either: > > - we ditch the repository and restart frm scratch, > > - or we just bail out and tell the user what hapened, so they can take > action (e.g. remove the broken repo manually). When the repo is broken (detectable by git fsck) I agree this is the best approach. But I think a sane git repo (detectable by git fsck) that is in a dirty state (in the sense of git clean) is a different case. It can happen for example if a previous checkout operation was abruptly interrupted during a git checkout. In this case it seems to me a bit extreme (but not incorrect of course) to remove the repo and restart from scratch. For the case a file from the worktree is missing the current 'git checkout' succeeds and the problem shows up as a hash mismatch. Using 'git checkout -f' the repo is recovered. For the case a git object needed by the checkout is missing the current 'git checkout' succeeds and the problem shows up as a hash mismatch. Using 'git checkout -f' the checkout fails like this: error: invalid object The git clean makes sure, among other things, that the worktree is clean even in the case the previous checkout contained a submodule that is no longer present in the new checkout. Maybe there is another command that ensures this, but I don't know about yet. Maybe I am missing something, but as I see, for a package with submodule, if a submodule is removed from the upstream project, it would trigger a ditch+restart too. Again, I think this is a bit extreme (more than git clean -ffdx) but not wrong. In order to detect broken repos (and trigger the ditch) we could start the script by running fsck. Right? Do you have a local patch for this ditch+restart? Or should I create one? Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-13 18:28 ` Ricardo Martincoski @ 2018-04-15 12:02 ` Yann E. MORIN 2018-04-16 2:54 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-15 12:02 UTC (permalink / raw) To: buildroot Ricardo, All, On 2018-04-13 15:28 -0300, Ricardo Martincoski spake thusly: > On Thu, Apr 12, 2018 at 02:48 PM, Yann E. MORIN wrote: > > On 2018-04-12 06:28 -0300, Ricardo Martincoski spake thusly: > >> When performing the checkout, use -f to allow it to occur even if there > >> are untracked files that would be overwritten or removed. > >> > >> Use git clean to remove any untracked files before generating the > >> tarball. Use the second -f to ensure the repo is pristine even if > >> previous checked out ref contained a submodule that is not present in > >> the ref just checked out. > [snip] > > > > I'm not very happy of those tricks... If the repos is broken, it may be > > in a state that repairing it is not even possible at all, in case a > > critical git data is missing/damaged. > > I created the series thinking only on git repos that are sane but have a dirty > worktree. I should have used "dirty worktree" in the commit log. Right, these are indeed two different cases. > Rethinking now, taking into account a broken repo, and also the fact that the > default download directory is inside buildroot, this patch is a bit dangerous. > > With a completely broken repo in the git cache, the commands called by the > download script end up executed in any repo that eventually contains that file > (parent \(of parent \)\+ directory), so: > $ cd buildroot > $ git init dl/package/git > $ cd dl/package/git > $ rm .git/HEAD # simulate a completely broken repo > $ git <command> > The git command would be executed in the buildroot tree. > A fetch to the wrong repo is bad, but not that bad. > A checkout would fail. > But checkout -f and git clean would succeed, potentially making the user to lose > changes. Arg, indeed that is dangerous... > I will change this patch to Changes Requested. > > A way to avoid this, once the script entered the directory, is to force the > git-dir: > $ git --git-dir=.git <command> > This command would fail for a too-much-broken repo. We recently reverted use of -C becasue it was not supported by old git versions. What is the oldest git version to support --git-dir? It seems it was introduced in v.1.4.2, dated 2006-08-12. Is that old enough that all the distros we care about have that version or later? > This is important also if we test the repo with git fsck before ditch+restart, > otherwise it would be possible to return OK based on the wrong repo. > > > > > If the repo is broken, either: > > > > - we ditch the repository and restart frm scratch, > > > > - or we just bail out and tell the user what hapened, so they can take > > action (e.g. remove the broken repo manually). > > When the repo is broken (detectable by git fsck) I agree this is the best > approach. > > But I think a sane git repo (detectable by git fsck) that is in a dirty state > (in the sense of git clean) is a different case. > It can happen for example if a previous checkout operation was abruptly > interrupted during a git checkout. In this case it seems to me a bit extreme > (but not incorrect of course) to remove the repo and restart from scratch. > > For the case a file from the worktree is missing the current 'git checkout' > succeeds and the problem shows up as a hash mismatch. > Using 'git checkout -f' the repo is recovered. > For the case a git object needed by the checkout is missing the current 'git > checkout' succeeds and the problem shows up as a hash mismatch. > Using 'git checkout -f' the checkout fails like this: > error: invalid object > > The git clean makes sure, among other things, that the worktree is clean even in > the case the previous checkout contained a submodule that is no longer present > in the new checkout. Maybe there is another command that ensures this, but I > don't know about yet. > Maybe I am missing something, but as I see, for a package with submodule, if > a submodule is removed from the upstream project, it would trigger a > ditch+restart too. Again, I think this is a bit extreme (more than git clean > -ffdx) but not wrong. > > In order to detect broken repos (and trigger the ditch) we could start the > script by running fsck. Right? So, basically, we would do something along those lines: _git() { eval ${GIT} --git-dir=.git "${@}" } _git fsck || { rm -rf ${git_cache}; exec "${0}" "${@}"; exit 1; } _git fetch _git fetch -t _git checkout ${ref} _git clean -dX _git checkout -- . _git submodules update Of course, that would require using appropriate options to fsck to bail out. But what to do if any if the following actions fails? Should we simply exit, or should we clean up and clone again? I can see shere that could go wrong: the ref does not exist, so the first checkout fails, so we ditch the repository, clone again, and checkout again fails... My opinion, for what it's worth, is to clan only on the fsck. Any other failure should be left to the user to handle. Maybe with just a little message saying something like: If you are not sure how to solve this, remove ${cache_dir}. Thoughts? > Do you have a local patch for this ditch+restart? Or should I create one? I am working on that... Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-15 12:02 ` Yann E. MORIN @ 2018-04-16 2:54 ` Ricardo Martincoski 2018-04-16 16:01 ` Yann E. MORIN 0 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-16 2:54 UTC (permalink / raw) To: buildroot Hello, On Sun, Apr 15, 2018 at 09:02 AM, Yann E. MORIN wrote: > On 2018-04-13 15:28 -0300, Ricardo Martincoski spake thusly: >> On Thu, Apr 12, 2018 at 02:48 PM, Yann E. MORIN wrote: >> > On 2018-04-12 06:28 -0300, Ricardo Martincoski spake thusly: >> A way to avoid this, once the script entered the directory, is to force the >> git-dir: >> $ git --git-dir=.git <command> >> This command would fail for a too-much-broken repo. > > We recently reverted use of -C becasue it was not supported by old git > versions. What is the oldest git version to support --git-dir? It seems > it was introduced in v.1.4.2, dated 2006-08-12. Is that old enough that > all the distros we care about have that version or later? I think so. I added Thomas to the thread. In the past some similar decisions were taken based on the oldest supported RHEL version. RHEL 6 has EOL on November 30, 2020. It included git 1.7.1 last time I checked (one year ago!). Concerning the code, an alternative is: GIT_DIR=.git git <command> It works with fsck. Hopefully it always worked, but I did not checked the git/.git history. I think it is important to use this with fsck, otherwise for a too-much-broken repo (say, .git/HEAD is missing) git would search the parent directory, then the parent of the parent, and so on, and in the case it is a subdir of buildroot (or any other git repo) the fsck would return OK for the wrong repo. >> In order to detect broken repos (and trigger the ditch) we could start the >> script by running fsck. Right? > > So, basically, we would do something along those lines: > > _git() { > eval ${GIT} --git-dir=.git "${@}" > } > > _git fsck || { rm -rf ${git_cache}; exec "${0}" "${@}"; exit 1; } > _git fetch > _git fetch -t > _git checkout ${ref} > _git clean -dX > _git checkout -- . > _git submodules update In the overall I agree with this sequence. Below some nits: I would run fsck using GIT_DIR before the script enters {git_cache}, this way it is easy to remove the broken git cache and run init again, and the script continues without the need to call itself. GIT_DIR works with git 1.8.3, I did not tested with 1.7.1 When we have -f in checkout (I see in your WIP series that you considered to add it) do we need the '_git checkout -- .'? I was initially thinking in: first run fsck forcing the GIT_DIR and then trust the repo is sane, but your approach of adding it to _git() seems even more robust, it covers even the case someone (not buildroot) messes with the git cache in the wrong moment. Maybe it would be possible to even remove pushd/popd altogether. Of course it would cause changes to find and tar command lines again, so I don't think it's worth the risk of finding more corner cases with tool versions than the ones already solved. > > Of course, that would require using appropriate options to fsck to bail > out. Yes. Some interesting ones I listed below: --no-dangling: AFAIK they cause no harm; --no-reflogs: not sure; --full: this is for the case someone is abusing the git cache with alternates, should we care? I find out that different git versions also use different sets of errors to return non-zero code. Not so different, don't worry. Old versions return 0 for few errors (but the error: message is printed) git 1.8.3: return code 0 for a missing sha1 object pointed by a tag, printing: error: refs/tags/tag_b does not point to a valid object! git 2.14.1: return code 2 in the same case error: refs/tags/tag_b: invalid sha1 pointer 1f95d47cc18a9ed4e1eab9b71fe2009c9555448d BUT, as we always do a fetch before checkout, the fetch fixes it! So again, we are good. I don't think it needs extra code. > > But what to do if any if the following actions fails? Should we simply > exit, or should we clean up and clone again? > > I can see shere that could go wrong: the ref does not exist, so the > first checkout fails, so we ditch the repository, clone again, and > checkout again fails... > > My opinion, for what it's worth, is to clan only on the fsck. Any other > failure should be left to the user to handle. Maybe with just a little > message saying something like: > > If you are not sure how to solve this, remove ${cache_dir}. > > Thoughts? I agree, to clean only on the fsck is better. The user-friendly message is not *needed* IMO. But if you find an easy way to do that, it would be nice to have. Maybe in _git() but not in the fsck case. Maybe a trap? Not sure. Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-16 2:54 ` Ricardo Martincoski @ 2018-04-16 16:01 ` Yann E. MORIN 2018-04-16 20:56 ` Yann E. MORIN 2018-04-17 4:45 ` Ricardo Martincoski 0 siblings, 2 replies; 27+ messages in thread From: Yann E. MORIN @ 2018-04-16 16:01 UTC (permalink / raw) To: buildroot Ricardo, All, On 2018-04-15 23:54 -0300, Ricardo Martincoski spake thusly: > On Sun, Apr 15, 2018 at 09:02 AM, Yann E. MORIN wrote: [--SNIP--] > > Of course, that would require using appropriate options to fsck to bail > > out. > > Yes. Some interesting ones I listed below: > --no-dangling: AFAIK they cause no harm; > --no-reflogs: not sure; > --full: this is for the case someone is abusing the git cache with alternates, > should we care? So I played with git-fsck in quite a few setups, and it irremediably exits with a non-zero status when something is wrong. However, on my machine, it takes about 8min to fsck the Linux git tree, which is a huge amount of time (much more than it takes to do the build of said kernel). So, I am a bit reluctant at using git-fsck. I'm trying to see if we can find a faster way to detect if the git tree is sane or not. After all, we only need a sanity check, not repairing. If it is not sane, we ditch it and reclone. So, maybe just running "git status" or any other fast action should probably be enough. Thoughts? > I find out that different git versions also use different sets of errors to > return non-zero code. Not so different, don't worry. > Old versions return 0 for few errors (but the error: message is printed) > git 1.8.3: return code 0 for a missing sha1 object pointed by a tag, printing: > error: refs/tags/tag_b does not point to a valid object! > git 2.14.1: return code 2 in the same case > error: refs/tags/tag_b: invalid sha1 pointer 1f95d47cc18a9ed4e1eab9b71fe2009c9555448d > BUT, as we always do a fetch before checkout, the fetch fixes it! > So again, we are good. I don't think it needs extra code. Hmm... As long as the repo is not _broken_ and we can recover with the fetch / clean/ checkout actions, I'm OK. > > But what to do if any if the following actions fails? Should we simply > > exit, or should we clean up and clone again? > > > > I can see shere that could go wrong: the ref does not exist, so the > > first checkout fails, so we ditch the repository, clone again, and > > checkout again fails... > > > > My opinion, for what it's worth, is to clan only on the fsck. Any other > > failure should be left to the user to handle. Maybe with just a little > > message saying something like: > > > > If you are not sure how to solve this, remove ${cache_dir}. > > > > Thoughts? > > I agree, to clean only on the fsck is better. > > The user-friendly message is not *needed* IMO. > But if you find an easy way to do that, it would be nice to have. > Maybe in _git() but not in the fsck case. > Maybe a trap? Not sure. Well, a trap is pretty simple, yes. But let's leave that out for now, we already have enough complexity to handle... Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-16 16:01 ` Yann E. MORIN @ 2018-04-16 20:56 ` Yann E. MORIN [not found] ` <1ffe206e-4b1e-c233-a511-ba4c3a8cb5f0@armadeus.com> 2018-04-17 4:45 ` Ricardo Martincoski 1 sibling, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-16 20:56 UTC (permalink / raw) To: buildroot Ricardo, All, To all autobuilder runners: could you create a tarball of the failed dtv-scan-tables git tree on yuour autobuilders, from each of your instances, please? I got mine, butonly one is really broken (missing HEAD). I'd like to get a few other breakage to investigate furhter the git backend fixes and be sure they work as expected. Thanks! Otherwise, Ricardo, more comments below... On 2018-04-16 18:01 +0200, Yann E. MORIN spake thusly: > On 2018-04-15 23:54 -0300, Ricardo Martincoski spake thusly: > > On Sun, Apr 15, 2018 at 09:02 AM, Yann E. MORIN wrote: > [--SNIP--] > > > Of course, that would require using appropriate options to fsck to bail > > > out. > > > > Yes. Some interesting ones I listed below: > > --no-dangling: AFAIK they cause no harm; > > --no-reflogs: not sure; > > --full: this is for the case someone is abusing the git cache with alternates, > > should we care? > > So I played with git-fsck in quite a few setups, and it irremediably > exits with a non-zero status when something is wrong. And now I've had a case where git-fsck would barf, because of missing objects. The git tree was not salvageable with git-fsck. But then, git-fetch retrieved the missing blobs, and everything was back to being fine. So, we'd have situations where we'd detect a borked repository, so we'd re-clone it from scratch, when a simple fetch would have retrieve the missing parts... Hoowever, I'd prefer that we play safe, and download more than necessary, but be sure we end up with a sane situation. And I still think that running git-status is enough to detect a borked repository, rather than the expensive git-fsck. Thoughts? Regards, Yann E. MORIN. > However, on my machine, it takes about 8min to fsck the Linux git tree, > which is a huge amount of time (much more than it takes to do the build > of said kernel). > > So, I am a bit reluctant at using git-fsck. > > I'm trying to see if we can find a faster way to detect if the git tree > is sane or not. After all, we only need a sanity check, not repairing. > If it is not sane, we ditch it and reclone. > > So, maybe just running "git status" or any other fast action should > probably be enough. > > Thoughts? > > > I find out that different git versions also use different sets of errors to > > return non-zero code. Not so different, don't worry. > > Old versions return 0 for few errors (but the error: message is printed) > > git 1.8.3: return code 0 for a missing sha1 object pointed by a tag, printing: > > error: refs/tags/tag_b does not point to a valid object! > > git 2.14.1: return code 2 in the same case > > error: refs/tags/tag_b: invalid sha1 pointer 1f95d47cc18a9ed4e1eab9b71fe2009c9555448d > > BUT, as we always do a fetch before checkout, the fetch fixes it! > > So again, we are good. I don't think it needs extra code. > > Hmm... As long as the repo is not _broken_ and we can recover with the > fetch / clean/ checkout actions, I'm OK. > > > > But what to do if any if the following actions fails? Should we simply > > > exit, or should we clean up and clone again? > > > > > > I can see shere that could go wrong: the ref does not exist, so the > > > first checkout fails, so we ditch the repository, clone again, and > > > checkout again fails... > > > > > > My opinion, for what it's worth, is to clan only on the fsck. Any other > > > failure should be left to the user to handle. Maybe with just a little > > > message saying something like: > > > > > > If you are not sure how to solve this, remove ${cache_dir}. > > > > > > Thoughts? > > > > I agree, to clean only on the fsck is better. > > > > The user-friendly message is not *needed* IMO. > > But if you find an easy way to do that, it would be nice to have. > > Maybe in _git() but not in the fsck case. > > Maybe a trap? Not sure. > > Well, a trap is pretty simple, yes. > > But let's leave that out for now, we already have enough complexity to > handle... > > Regards, > Yann E. MORIN. > > -- > .-----------------.--------------------.------------------.--------------------. > | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | > | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | > | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | > | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | > '------------------------------^-------^------------------^--------------------' > _______________________________________________ > buildroot mailing list > buildroot at busybox.net > http://lists.busybox.net/mailman/listinfo/buildroot -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
[parent not found: <1ffe206e-4b1e-c233-a511-ba4c3a8cb5f0@armadeus.com>]
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache [not found] ` <1ffe206e-4b1e-c233-a511-ba4c3a8cb5f0@armadeus.com> @ 2018-04-17 10:42 ` Yann E. MORIN 2018-04-17 11:30 ` Thomas Petazzoni 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-17 10:42 UTC (permalink / raw) To: buildroot Julien, All, On 2018-04-17 12:05 +0200, Julien Boibessot spake thusly: > On 16/04/2018 22:56, Yann E. MORIN wrote: > > Ricardo, All,To all autobuilder runners: could you create a tarball of the failed > dtv-scan-tables git tree on yuour autobuilders, from each of your > instances, please? > > I'm not sure to understand what you reallly need, could you please clarify ? You are running an autobuilder. That autobuilder will run 'n' builds in parallel, each in a directory instance-0, instance-1... In each of those instances, there is a dl/ sub-dir. In that dl/ sub-dir, there isyet another sub-dir per-package, of which dtv-scan-tables/ Buildroot downloads dtv-scan-tables as a git clone, stored in the git/ directory of the aforementioned dtv-scan-tables/ directory. I would need that you provide all thos git trees from all the instances your autobuilder runs, i.e. something like: tar cJf dtv-scan-tables-git.tar.xz instance-*/dl/dtv-scan-tables/git/ And then send me dtv-scan-tables-git.tar.xz Please! ;-) Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-17 10:42 ` Yann E. MORIN @ 2018-04-17 11:30 ` Thomas Petazzoni 0 siblings, 0 replies; 27+ messages in thread From: Thomas Petazzoni @ 2018-04-17 11:30 UTC (permalink / raw) To: buildroot Hello, On Tue, 17 Apr 2018 12:42:13 +0200, Yann E. MORIN wrote: > Please! ;-) On my side, I looked at the squashfs failures: http://autobuild.buildroot.net/?submitter=Thomas+Petazzoni+(Free+Electrons+server)&reason=%squashfs% I looked at instance-1 and instance-2, both of which are failing, and they look the same on the filesystem. instance-2 looks like this: $ tree -a instance-2/dl/squashfs instance-2/dl/squashfs |-- git | `-- .git | |-- FETCH_HEAD | |-- HEAD | |-- branches | |-- config | |-- description | |-- hooks | | |-- applypatch-msg.sample | | |-- commit-msg.sample | | |-- post-update.sample | | |-- pre-applypatch.sample | | |-- pre-commit.sample | | |-- pre-rebase.sample | | |-- prepare-commit-msg.sample | | `-- update.sample | |-- info | | `-- exclude | |-- objects | | |-- info | | `-- pack | `-- refs | |-- heads | `-- tags `-- squashfs-3de1687d7432ea9b302c2db9521996f506c140a3.tar.gz So basically, there is a tarball of the previous squashfs version, and the Git cache doesn't contain anything useful. See http://autobuild.buildroot.net/results/4f8/4f88b0aea78a8c8be245ec26e8e53f32d8a90eb5/build-end.log for the actual build failure. Best regards, Thomas Petazzoni -- Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons) Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-16 16:01 ` Yann E. MORIN 2018-04-16 20:56 ` Yann E. MORIN @ 2018-04-17 4:45 ` Ricardo Martincoski 2018-04-17 7:04 ` Ricardo Martincoski 1 sibling, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-17 4:45 UTC (permalink / raw) To: buildroot Hello, On Mon, Apr 16, 2018 at 01:01 PM, Yann E. MORIN wrote: > On 2018-04-15 23:54 -0300, Ricardo Martincoski spake thusly: >> On Sun, Apr 15, 2018 at 09:02 AM, Yann E. MORIN wrote: > [--SNIP--] >> > Of course, that would require using appropriate options to fsck to bail >> > out. >> >> Yes. Some interesting ones I listed below: >> --no-dangling: AFAIK they cause no harm; >> --no-reflogs: not sure; >> --full: this is for the case someone is abusing the git cache with alternates, >> should we care? > > So I played with git-fsck in quite a few setups, and it irremediably > exits with a non-zero status when something is wrong. > > However, on my machine, it takes about 8min to fsck the Linux git tree, > which is a huge amount of time (much more than it takes to do the build > of said kernel). Argh. Indeed it tests too much. And it takes too much time. Sorry my bad suggestion. > So, I am a bit reluctant at using git-fsck. > > I'm trying to see if we can find a faster way to detect if the git tree > is sane or not. After all, we only need a sanity check, not repairing. > If it is not sane, we ditch it and reclone. Agreed. > So, maybe just running "git status" or any other fast action should > probably be enough. I think 'git status' tests too much too. It's more related to the worktree status than to the repo (objects) status and we don't want to ditch the repo for a dirty worktree, git checkout -f and git clean -ffdx fix it for us. Maybe an even simpler command: 'git log -1' with the right options (to avoid failing if a checkout did not occur yet) or even other command, like 'git remote' or 'git tag'. But see below... > > Thoughts? > Maybe the solution is to use a reactive approach. With your approach of using --git-dir (or the alternate GIT_DIR=) for every command, we don't need a specific command to do the sanity check. If the repo is totally broken (e.g. .git/HEAD missing) the 'git remote' fails. If some object that the ref-to-be-checked-out needs is missing, 'git checkout -f' fails. Git is smart, it only fails if there is not a checked out copy of the file that matches the hash of the missing object. If the fetch was successful but the ref we need was not download (i.e. the case you described of someone using the wrong ref), the checkout also fails. So when the checkout fails we could test if the ref we need is in the local repo. If it is not present, we assume the repo is minimally sane and bail out. If it is present (and the checkout failed) something really wrong is going on. I guess 'git log -1 ref' will return the correct results for this test. _git() { eval ${GIT} --git-dir=.git "${@}" } _git init _git remote _git fetch _git fetch -t _git checkout -f ${ref} if failed if git log -1 ref ditch and restart else bail out _git clean -dX _git submodules update Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-17 4:45 ` Ricardo Martincoski @ 2018-04-17 7:04 ` Ricardo Martincoski 2018-04-17 8:10 ` Arnout Vandecappelle 0 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-17 7:04 UTC (permalink / raw) To: buildroot Hello, On Tue, Apr 17, 2018 at 01:45 AM, Ricardo Martincoski wrote: > _git() { > eval ${GIT} --git-dir=.git "${@}" > } > _git init > _git remote > _git fetch > _git fetch -t > _git checkout -f ${ref} > if failed > if git log -1 ref > ditch and restart > else > bail out > _git clean -dX > _git submodules update But let's take a step back... A dirty worktree is not so hard to find in practice. It happens when certain git commands are abruptly interrupted. As Thomas wrote in another thread, his autobuilder instance is sometimes restarted with the script running. Therefore the abrupt interruption will eventually occur during a checkout, possibly causing a dirty worktree. But the right commands (I believe they are checkout -f, git clean -ffdx) will fix it, assuming the repo is sane. But a broken repo is usually associated either to hardware failures, especially those repos that cannot be recovered by 'git init' and/or 'git fetch', or to misuse, that from the git perspective is what the autobuilder script is currently doing by eventually removing random individual files from inside a .git dir. And the implementation of ditch+restart seems to be more complicated than it seemed to me before. So, back to your proposal for a broken repo: either: - ditch+restart - bail out I changed my mind and I think we should adopt bail out for now. It will keep the script simple, yet robust. ditch+restart can be implemented later if we want/need. Thoughts? _git() { eval ${GIT} --git-dir=.git "${@}" } _git init _git remote _git fetch _git fetch -t _git checkout -f ${ref} _git clean -dX _git submodules update A simple improvement to the script to recover few broken repo scenarios is to make git init unconditional. From its help: "Running git init in an existing repository is safe. It will not overwrite things that are already there." For example, running 'git init' in a repo without .git/HEAD fixes it. Here is a squashed, no commit log, no comments on the code, hackish preview: https://gitlab.com/RicardoMartincoski/buildroot/commit/f4e20bb62b761bb6c82a29acbd2f8d56d90e4e9c Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-17 7:04 ` Ricardo Martincoski @ 2018-04-17 8:10 ` Arnout Vandecappelle 2018-04-17 8:56 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Arnout Vandecappelle @ 2018-04-17 8:10 UTC (permalink / raw) To: buildroot On 17-04-18 09:04, Ricardo Martincoski wrote: > Hello, > > On Tue, Apr 17, 2018 at 01:45 AM, Ricardo Martincoski wrote: > >> _git() { >> eval ${GIT} --git-dir=.git "${@}" >> } >> _git init >> _git remote >> _git fetch >> _git fetch -t >> _git checkout -f ${ref} >> if failed >> if git log -1 ref >> ditch and restart >> else >> bail out >> _git clean -dX >> _git submodules update > > But let's take a step back... > > A dirty worktree is not so hard to find in practice. It happens when certain git > commands are abruptly interrupted. As Thomas wrote in another thread, his > autobuilder instance is sometimes restarted with the script running. Therefore > the abrupt interruption will eventually occur during a checkout, possibly > causing a dirty worktree. > But the right commands (I believe they are checkout -f, git clean -ffdx) will > fix it, assuming the repo is sane. > > But a broken repo is usually associated either to hardware failures, especially > those repos that cannot be recovered by 'git init' and/or 'git fetch', or to > misuse, that from the git perspective is what the autobuilder script is > currently doing by eventually removing random individual files from inside a > .git dir. I was going to say exactly the same yesterday evening but didn't have the time to think it through. All the fsck or status or whatever checks are only needed to detect a broken .git directory, and this should never happen in normal git operation. > And the implementation of ditch+restart seems to be more complicated than it > seemed to me before. > > So, back to your proposal for a broken repo: > either: > - ditch+restart > - bail out > > I changed my mind and I think we should adopt bail out for now. > It will keep the script simple, yet robust. > ditch+restart can be implemented later if we want/need. Since a broken repo supposedly should never occur, I think we simply don't need to handle that case. > Thoughts? > > > _git() { > eval ${GIT} --git-dir=.git "${@}" I do indeed think that passing git-dir explicitly is a good idea. However, it should be passed as GIT_DIR in the environment instead of --git-dir so it also works with older versions of git. I've just checked git history, it seems that GIT_DIR in the environment was supported since v0.99 (i.e. from before the first tag in the repo). > } > _git init > _git remote > _git fetch > _git fetch -t > _git checkout -f ${ref} > _git clean -dX > _git submodules update > > A simple improvement to the script to recover few broken repo scenarios is to > make git init unconditional. From its help: > "Running git init in an existing repository is safe. It will not overwrite > things that are already there." > For example, running 'git init' in a repo without .git/HEAD fixes it. That is a *great* idea. > Here is a squashed, no commit log, no comments on the code, hackish preview: > https://gitlab.com/RicardoMartincoski/buildroot/commit/f4e20bb62b761bb6c82a29acbd2f8d56d90e4e9c Looks good to me, with the following mods: * use GIT_DIR * add comment above git init explaining that it is safe * explain why -ff is needed for git clean * Removing the shallow fetch should be done in a separate patch (I'm not entirely convinced yet that it is needed). Regards, Arnout -- Arnout Vandecappelle arnout at mind be Senior Embedded Software Architect +32-16-286500 Essensium/Mind http://www.mind.be G.Geenslaan 9, 3001 Leuven, Belgium BE 872 984 063 RPR Leuven LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle GPG fingerprint: 7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-17 8:10 ` Arnout Vandecappelle @ 2018-04-17 8:56 ` Ricardo Martincoski 2018-04-17 10:36 ` Yann E. MORIN 0 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-17 8:56 UTC (permalink / raw) To: buildroot Hello, On Tue, Apr 17, 2018 at 05:10 AM, Arnout Vandecappelle wrote: > On 17-04-18 09:04, Ricardo Martincoski wrote: Thank you for your comments. >> Here is a squashed, no commit log, no comments on the code, hackish preview: >> https://gitlab.com/RicardoMartincoski/buildroot/commit/f4e20bb62b761bb6c82a29acbd2f8d56d90e4e9c > > Looks good to me, with the following mods: > > * use GIT_DIR > * add comment above git init explaining that it is safe > * explain why -ff is needed for git clean > * Removing the shallow fetch should be done in a separate patch (I'm not > entirely convinced yet that it is needed). These 2 are already covered by Yann's WIP series, with comments and proper commit logs: https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/git-robust Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-17 8:56 ` Ricardo Martincoski @ 2018-04-17 10:36 ` Yann E. MORIN 2018-04-17 11:58 ` Arnout Vandecappelle 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-17 10:36 UTC (permalink / raw) To: buildroot Ricardo, Arnout, All, On 2018-04-17 05:56 -0300, Ricardo Martincoski spake thusly: > On Tue, Apr 17, 2018 at 05:10 AM, Arnout Vandecappelle wrote: > > On 17-04-18 09:04, Ricardo Martincoski wrote: > >> Here is a squashed, no commit log, no comments on the code, hackish preview: > >> https://gitlab.com/RicardoMartincoski/buildroot/commit/f4e20bb62b761bb6c82a29acbd2f8d56d90e4e9c ACK. I'll vampirise the git-init part, then. > > Looks good to me, with the following mods: > > > > * use GIT_DIR > > * add comment above git init explaining that it is safe > > * explain why -ff is needed for git clean ACK ?3 > > * Removing the shallow fetch should be done in a separate patch (I'm not > > entirely convinced yet that it is needed). IIRC, Ricardo explained the case in a previous message: there are cases where we may miss blobs in a tree. But anyway, the current code only works by chance. Let's assume you want to use tag v4.17-rc1 from Linus' tree. The code would currently do something like (without an existing git cache) mkdir git git init git cd git git remote add origin https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git fetch origin --depth 1 v4.17-rc1 if ! git fetch origin v4.17-rc1:v4.17-rc1 ; then echo "warning" fi git checkout v4.17-rc1 The if-block is initially there to fetch special refs created by things like gerrit or github PRs. If you do the above, you get the warning, i.e. the if-condition is false, i.e. fetching the tag that way fails with exit-code 1, and this message: error: cannot update the ref 'refs/heads/v4.17-rc1': Trying to write non-commit object a79c33a10ce2764c62fb8af6cbb571752d55c1c0 to branch refs/heads/v4.17-rc1 From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux ! [new tag] v4.17-rc1 -> v4.17-rc1 (unable to update local ref) * [new tag] v4.17-rc1 -> v4.17-rc1 But since it is in a if-ciondition, we just print a warning, and continue. Then and the checkout succeeds. But! If you didn't try to fetch the special refs, the checkout would have failed: error: pathspec 'v4.17-rc1' did not match any file(s) known to git. I.e. we have code that just happens to work by chance (or rather, by side-effects and luck). So, even if the missing-blob problem could be solved another way, our shallow fetch is anyway already borked, but we were lucky so far not to fall for it... I still think that this shalow-fetch dance is too risky, and that having a sane git cache helps so much more in the long run, so much so that we should just ditch the shallow fetch support. > These 2 are already covered by Yann's WIP series, with comments and proper > commit logs: > https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/git-robust With the last two to be swapped. Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache 2018-04-17 10:36 ` Yann E. MORIN @ 2018-04-17 11:58 ` Arnout Vandecappelle 0 siblings, 0 replies; 27+ messages in thread From: Arnout Vandecappelle @ 2018-04-17 11:58 UTC (permalink / raw) To: buildroot On 17-04-18 12:36, Yann E. MORIN wrote: > Ricardo, Arnout, All, > > On 2018-04-17 05:56 -0300, Ricardo Martincoski spake thusly: >> On Tue, Apr 17, 2018 at 05:10 AM, Arnout Vandecappelle wrote: [snip] >>> * Removing the shallow fetch should be done in a separate patch (I'm not >>> entirely convinced yet that it is needed). > > IIRC, Ricardo explained the case in a previous message: there are cases > where we may miss blobs in a tree. > > But anyway, the current code only works by chance. Let's assume you want > to use tag v4.17-rc1 from Linus' tree. The code would currently do > something like (without an existing git cache) > > mkdir git > git init git > cd git > git remote add origin https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > git fetch origin --depth 1 v4.17-rc1 > if ! git fetch origin v4.17-rc1:v4.17-rc1 ; then > echo "warning" > fi > git checkout v4.17-rc1 > > The if-block is initially there to fetch special refs created by things > like gerrit or github PRs. If you do the above, you get the warning, > i.e. the if-condition is false, i.e. fetching the tag that way fails > with exit-code 1, and this message: > > error: cannot update the ref 'refs/heads/v4.17-rc1': Trying to write > non-commit object a79c33a10ce2764c62fb8af6cbb571752d55c1c0 to branch > refs/heads/v4.17-rc1 > From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux > ! [new tag] v4.17-rc1 -> v4.17-rc1 (unable to update local ref) > * [new tag] v4.17-rc1 -> v4.17-rc1 > > But since it is in a if-ciondition, we just print a warning, and > continue. > > Then and the checkout succeeds. > > But! If you didn't try to fetch the special refs, the checkout would > have failed: > > error: pathspec 'v4.17-rc1' did not match any file(s) known to git. > > I.e. we have code that just happens to work by chance (or rather, by > side-effects and luck). > > So, even if the missing-blob problem could be solved another way, our > shallow fetch is anyway already borked, but we were lucky so far not to > fall for it... Great explanation -- with just a single typo, if-ciondition :-). Would be good to add that to the commit log (the explanation, not the typo). > I still think that this shalow-fetch dance is too risky, and that having > a sane git cache helps so much more in the long run, so much so that we > should just ditch the shallow fetch support. Yeah, given the current sorry state, it's probably better to break it down completely and then build up something stable again. Regards, Arnout -- Arnout Vandecappelle arnout at mind be Senior Embedded Software Architect +32-16-286500 Essensium/Mind http://www.mind.be G.Geenslaan 9, 3001 Leuven, Belgium BE 872 984 063 RPR Leuven LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle GPG fingerprint: 7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs 2018-04-12 9:28 [Buildroot] [RFC PATCH 0/3] fix some corner cases for download/git v1 Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache Ricardo Martincoski @ 2018-04-12 9:28 ` Ricardo Martincoski 2018-04-12 11:28 ` Thomas Petazzoni 2 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-12 9:28 UTC (permalink / raw) To: buildroot With any version of git, after the git cache was once populated using a shallow fetch, a full fetch succeeds but doesn't download the commits behind the reference that was fetched with --depth 1. In this case the download fails like this: Fetching all references Could not fetch special ref 'sha1'; assuming it is not special. fatal: reference is not a tree: sha1 One scenario in buildroot development that would trigger this sequence is for example when a package is bumped on master branch, some users create the git cache at this time, then the bump is reverted. Another scenario is when giving maintenance to a product version that uses one version of buildroot using the same build farm that uses a newer version of buildroot. Signed-off-by: Ricardo Martincoski <ricardo.martincoski@gmail.com> Cc: "Yann E. MORIN" <yann.morin.1998@free.fr> --- WARNING: this patch is in early stage, really RFC. I tested it in a short period of time. I tested it by first creating the cache for dt-utils in its current version, then I changed dt-utils version to the commit immediately before it (ea1d08e26b73a4855165ed56043fa439ff9948cb). Without this patch the fetch of all refs succeeds but does not bring the desired commit and the checkout fails. --- support/download/git | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/support/download/git b/support/download/git index 68b5d920a1..8216da0a71 100755 --- a/support/download/git +++ b/support/download/git @@ -76,7 +76,15 @@ if [ -n "$(_git ls-remote origin "'${cset}'" 2>&1)" ]; then fi if [ ${git_done} -eq 0 ]; then printf "Fetching all references\n" - _git fetch origin -u "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" + # After a shallow fetch was once sucessful in this repo, the only way to + # ensure the commits behind its current depth are available is to unshallow. + # Git versions older than 2.15.0 do not have a command to know if a repo is + # shallow, so test for the actual file. + if [ -f "$(git rev-parse --git-dir)/shallow" ]; then + unshallow=--unshallow + fi + _git fetch origin -u ${unshallow} \ + "'+refs/tags/*:refs/tags/*'" "'+refs/heads/*:refs/remotes/origin/*'" fi # Try to get the special refs exposed by some forges (pull-requests for -- 2.14.1 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs Ricardo Martincoski @ 2018-04-12 11:28 ` Thomas Petazzoni 2018-04-12 17:33 ` Yann E. MORIN 0 siblings, 1 reply; 27+ messages in thread From: Thomas Petazzoni @ 2018-04-12 11:28 UTC (permalink / raw) To: buildroot Hello, Thanks a lot Ricardo for all this work on the download issues, much appreciated. One question on this specific commit. On Thu, 12 Apr 2018 06:28:55 -0300, Ricardo Martincoski wrote: > With any version of git, after the git cache was once populated using a > shallow fetch, a full fetch succeeds but doesn't download the commits > behind the reference that was fetched with --depth 1. In this case the > download fails like this: > Fetching all references > Could not fetch special ref 'sha1'; assuming it is not special. > fatal: reference is not a tree: sha1 > > One scenario in buildroot development that would trigger this sequence > is for example when a package is bumped on master branch, some users > create the git cache at this time, then the bump is reverted. > > Another scenario is when giving maintenance to a product version that > uses one version of buildroot using the same build farm that uses a > newer version of buildroot. In the light of this, wouldn't it be simpler to stop doing shallow clones at all ? A shallow clone did make a lot of sense back when we didn't cache the Git repositories. But now that we are caching the results, does it make sense to keep the complexity of the code to use shallow clones ? Note that this is really an open question, there are possibly some convincing argument for us to keep a shallow clone strategy. Best regards, Thomas -- Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons) Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs 2018-04-12 11:28 ` Thomas Petazzoni @ 2018-04-12 17:33 ` Yann E. MORIN 2018-04-13 18:32 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-12 17:33 UTC (permalink / raw) To: buildroot Thomas, Ricardo, All, On 2018-04-12 13:28 +0200, Thomas Petazzoni spake thusly: > Thanks a lot Ricardo for all this work on the download issues, much > appreciated. Indeed, thanks! > One question on this specific commit. > > On Thu, 12 Apr 2018 06:28:55 -0300, Ricardo Martincoski wrote: > > With any version of git, after the git cache was once populated using a > > shallow fetch, a full fetch succeeds but doesn't download the commits > > behind the reference that was fetched with --depth 1. In this case the > > download fails like this: > > Fetching all references > > Could not fetch special ref 'sha1'; assuming it is not special. > > fatal: reference is not a tree: sha1 > > > > One scenario in buildroot development that would trigger this sequence > > is for example when a package is bumped on master branch, some users > > create the git cache at this time, then the bump is reverted. > > > > Another scenario is when giving maintenance to a product version that > > uses one version of buildroot using the same build farm that uses a > > newer version of buildroot. > > In the light of this, wouldn't it be simpler to stop doing shallow > clones at all ? A shallow clone did make a lot of sense back when we > didn't cache the Git repositories. But now that we are caching the > results, does it make sense to keep the complexity of the code to use > shallow clones ? > > Note that this is really an open question, there are possibly some > convincing argument for us to keep a shallow clone strategy. Indeed, I don't think it makes sense anymore... So, intead of fixing it, let's just do full clones now. Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs 2018-04-12 17:33 ` Yann E. MORIN @ 2018-04-13 18:32 ` Ricardo Martincoski 2018-04-15 12:08 ` Yann E. MORIN 0 siblings, 1 reply; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-13 18:32 UTC (permalink / raw) To: buildroot Hello, On Thu, Apr 12, 2018 at 02:33 PM, Yann E. MORIN wrote: > On 2018-04-12 13:28 +0200, Thomas Petazzoni spake thusly: >> > One scenario in buildroot development that would trigger this sequence >> > is for example when a package is bumped on master branch, some users >> > create the git cache at this time, then the bump is reverted. >> > >> > Another scenario is when giving maintenance to a product version that >> > uses one version of buildroot using the same build farm that uses a >> > newer version of buildroot. >> >> In the light of this, wouldn't it be simpler to stop doing shallow >> clones at all ? A shallow clone did make a lot of sense back when we >> didn't cache the Git repositories. But now that we are caching the >> results, does it make sense to keep the complexity of the code to use >> shallow clones ? >> >> Note that this is really an open question, there are possibly some >> convincing argument for us to keep a shallow clone strategy. > > Indeed, I don't think it makes sense anymore... > > So, intead of fixing it, let's just do full clones now. I agree it is less needed now than before. The worst scenario (linux trees) is covered by: - the use of tarballs from GitHub in the defconfigs, for newcomers who sometimes just want to use a defconfig, add few packages and use the image in a new board; - in the long run, due to the git cache feature. But I think people will miss this feature. At least I will. In the case of someone always using tags from a linux repo, this person would never need to download the full repo if the shallow fetch remains. When someone wants just to try a package, for example, a debug tool, which maybe will never be used again, this person would need to do a full clone, instead of potentially do a shallow fetch. The repo for some packages can be located on slow servers (not necessarily a package in the tree, it can be on a br2-external). And the least appealing argument: I was willing to recreate http://patchwork.ozlabs.org/patch/702006/ on top of the new infra. Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs 2018-04-13 18:32 ` Ricardo Martincoski @ 2018-04-15 12:08 ` Yann E. MORIN 2018-04-16 3:04 ` Ricardo Martincoski 0 siblings, 1 reply; 27+ messages in thread From: Yann E. MORIN @ 2018-04-15 12:08 UTC (permalink / raw) To: buildroot Ricardo, All, On 2018-04-13 15:32 -0300, Ricardo Martincoski spake thusly: > On Thu, Apr 12, 2018 at 02:33 PM, Yann E. MORIN wrote: > > On 2018-04-12 13:28 +0200, Thomas Petazzoni spake thusly: > > >> > One scenario in buildroot development that would trigger this sequence > >> > is for example when a package is bumped on master branch, some users > >> > create the git cache at this time, then the bump is reverted. > >> > > >> > Another scenario is when giving maintenance to a product version that > >> > uses one version of buildroot using the same build farm that uses a > >> > newer version of buildroot. > >> > >> In the light of this, wouldn't it be simpler to stop doing shallow > >> clones at all ? A shallow clone did make a lot of sense back when we > >> didn't cache the Git repositories. But now that we are caching the > >> results, does it make sense to keep the complexity of the code to use > >> shallow clones ? > >> > >> Note that this is really an open question, there are possibly some > >> convincing argument for us to keep a shallow clone strategy. > > > > Indeed, I don't think it makes sense anymore... > > > > So, intead of fixing it, let's just do full clones now. > > I agree it is less needed now than before. OK. > The worst scenario (linux trees) is covered by: > - the use of tarballs from GitHub in the defconfigs, for newcomers who sometimes > just want to use a defconfig, add few packages and use the image in a new > board; > - in the long run, due to the git cache feature. > > But I think people will miss this feature. At least I will. Except for the linux git tree and a very few other packages we get from git, most git tree are reasonably sized, so that the overhead of doign a full clone is not much as comp[ared to the shallow clone. And since it also makes our code much simpler and more maintaineable, that's still a win for me... > In the case of someone always using tags from a linux repo, this person > would never need to download the full repo if the shallow fetch remains. > > When someone wants just to try a package, for example, a debug tool, which > maybe will never be used again, this person would need to do a full clone, > instead of potentially do a shallow fetch. > > The repo for some packages can be located on slow servers (not necessarily a > package in the tree, it can be on a br2-external). > > And the least appealing argument: I was willing to recreate > http://patchwork.ozlabs.org/patch/702006/ on top of the new infra. I'm still not convinced any more than I previously was. I.e. I still think we should go with full clone. Regards, Yann E. MORIN. -- .-----------------.--------------------.------------------.--------------------. | Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: | | +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ | | +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no | | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. | '------------------------------^-------^------------------^--------------------' ^ permalink raw reply [flat|nested] 27+ messages in thread
* [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs 2018-04-15 12:08 ` Yann E. MORIN @ 2018-04-16 3:04 ` Ricardo Martincoski 0 siblings, 0 replies; 27+ messages in thread From: Ricardo Martincoski @ 2018-04-16 3:04 UTC (permalink / raw) To: buildroot Hello, On Sun, Apr 15, 2018 at 09:08 AM, Yann E. MORIN wrote: >> The worst scenario (linux trees) is covered by: >> - the use of tarballs from GitHub in the defconfigs, for newcomers who sometimes >> just want to use a defconfig, add few packages and use the image in a new >> board; >> - in the long run, due to the git cache feature. >> >> But I think people will miss this feature. At least I will. > > Except for the linux git tree and a very few other packages we get from > git, most git tree are reasonably sized, so that the overhead of doign a > full clone is not much as comp[ared to the shallow clone. > > And since it also makes our code much simpler and more maintaineable, Much simpler code indeed. I see in your WIP series. > that's still a win for me... Unrelated question: Should we add something about the git cache to 'Migrating from older Buildroot versions'? I am not sure because it does not really need any action from the user. It only changes the behavior. For better in the long run. An alternative is to reformat that e-mail that Thomas sent with a nice summary of the feature and add it to 'Download infrastructure'. If you don't find an easy way to add a user-friendly message in the rare case the download infra cannot recover the cache (after your WIP series), the message If you are not sure how to solve this, remove the git cache. Could be added to this section of the manual. >> In the case of someone always using tags from a linux repo, this person >> would never need to download the full repo if the shallow fetch remains. >> >> When someone wants just to try a package, for example, a debug tool, which >> maybe will never be used again, this person would need to do a full clone, >> instead of potentially do a shallow fetch. >> >> The repo for some packages can be located on slow servers (not necessarily a >> package in the tree, it can be on a br2-external). >> >> And the least appealing argument: I was willing to recreate >> http://patchwork.ozlabs.org/patch/702006/ on top of the new infra. > > I'm still not convinced any more than I previously was. I.e. I still think > we should go with full clone. OK. If someone else brings a more convincing argument in the future we can always re-add it covering all its corner cases. I will change this patch to Rejected. For my use case of sha1 from Gerrit, I will pursue different paths... Maybe tuning the server side options to generate tarballs. Maybe adding support to register download methods from a br2-external. Regards, Ricardo ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2018-04-17 11:58 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-04-12 9:28 [Buildroot] [RFC PATCH 0/3] fix some corner cases for download/git v1 Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 1/3] download/git: fix fetch all refs for old git Ricardo Martincoski 2018-04-12 17:42 ` Yann E. MORIN 2018-04-13 18:23 ` Ricardo Martincoski 2018-04-15 12:12 ` Yann E. MORIN 2018-04-16 2:17 ` Ricardo Martincoski 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 2/3] download/git: recover dirty cache Ricardo Martincoski 2018-04-12 17:48 ` Yann E. MORIN 2018-04-13 18:28 ` Ricardo Martincoski 2018-04-15 12:02 ` Yann E. MORIN 2018-04-16 2:54 ` Ricardo Martincoski 2018-04-16 16:01 ` Yann E. MORIN 2018-04-16 20:56 ` Yann E. MORIN [not found] ` <1ffe206e-4b1e-c233-a511-ba4c3a8cb5f0@armadeus.com> 2018-04-17 10:42 ` Yann E. MORIN 2018-04-17 11:30 ` Thomas Petazzoni 2018-04-17 4:45 ` Ricardo Martincoski 2018-04-17 7:04 ` Ricardo Martincoski 2018-04-17 8:10 ` Arnout Vandecappelle 2018-04-17 8:56 ` Ricardo Martincoski 2018-04-17 10:36 ` Yann E. MORIN 2018-04-17 11:58 ` Arnout Vandecappelle 2018-04-12 9:28 ` [Buildroot] [RFC PATCH 3/3] download/git: unshallow when fetching all refs Ricardo Martincoski 2018-04-12 11:28 ` Thomas Petazzoni 2018-04-12 17:33 ` Yann E. MORIN 2018-04-13 18:32 ` Ricardo Martincoski 2018-04-15 12:08 ` Yann E. MORIN 2018-04-16 3:04 ` Ricardo Martincoski
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.