All of lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [RFC PATCH] download/git: ban branch references
@ 2019-06-19 15:18 John Keeping
  2019-06-19 15:34 ` John Keeping
  2019-06-20 17:27 ` [Buildroot] [RFC PATCH] download/git: ban branch references Joel Carlson
  0 siblings, 2 replies; 13+ messages in thread
From: John Keeping @ 2019-06-19 15:18 UTC (permalink / raw)
  To: buildroot

As described in the manual, using a branch name as a version is not
supported.  However, nothing enforces this so it is easy to specify a
branch name either accidentally or because new developers have not read
through the manual.

For Git it is reasonably easy to catch most violations of this rule and
fail the fetch phase.  This isn't intended to be a comprehensive filter
(it can be bypassed with, for example, FOO_VERSION=origin/master), but
should catch accidental use of a branch version and prompt switching to
an immutable reference.

Signed-off-by: John Keeping <john@metanate.com>
---
 support/download/git | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/support/download/git b/support/download/git
index 075f665bbf..3f26613e61 100755
--- a/support/download/git
+++ b/support/download/git
@@ -134,6 +134,25 @@ if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
     exit 1
 fi
 
+# Check that the specified version is not a branch. We expect a tag or
+# raw commit hash, and accept some special refs as above. Using a branch
+# is forbidden because these are mutable references.
+case "${cset}" in
+    refs/heads/*)
+        printf >&2 "Refusing to use Git branch '%s'.\n" "${cset#refs/heads/}"
+        exit 1
+        ;;
+    refs/*)
+        : pass
+        ;;
+    *)
+        if _git rev-parse --quiet --verify "refs/heads/${cset}" >/dev/null 2>&1; then
+            printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
+            exit 1
+        fi
+        ;;
+esac
+
 # The new cset we want to checkout might have different submodules, or
 # have sub-dirs converted to/from a submodule. So we would need to
 # deregister _current_ submodules before we checkout.
-- 
2.22.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-19 15:18 [Buildroot] [RFC PATCH] download/git: ban branch references John Keeping
@ 2019-06-19 15:34 ` John Keeping
  2019-06-20 16:39   ` Yann E. MORIN
  2019-06-20 17:27 ` [Buildroot] [RFC PATCH] download/git: ban branch references Joel Carlson
  1 sibling, 1 reply; 13+ messages in thread
From: John Keeping @ 2019-06-19 15:34 UTC (permalink / raw)
  To: buildroot

On Wed, 19 Jun 2019 16:18:17 +0100
John Keeping <john@metanate.com> wrote:

> As described in the manual, using a branch name as a version is not
> supported.  However, nothing enforces this so it is easy to specify a
> branch name either accidentally or because new developers have not read
> through the manual.
> 
> For Git it is reasonably easy to catch most violations of this rule and
> fail the fetch phase.  This isn't intended to be a comprehensive filter
> (it can be bypassed with, for example, FOO_VERSION=origin/master), but
> should catch accidental use of a branch version and prompt switching to
> an immutable reference.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---

Just after sending this, I realised that the patch below doesn't work
for versions specified as a SHA1.

When we have a SHA1 version, then the earlier call to:

	_git fetch origin "'${cset}:${cset}'"

creates a *branch* refs/heads/${cset} for the SHA1.  Git then prints a
warning when passing the SHA1 to rev-parse:

	Git normally never creates a ref that ends with 40 hex characters
	because it will be ignored when you just specify 40-hex. These refs
	may be created by mistake. For example,

	  git checkout -b $br $(git rev-parse ...)

	where "$br" is somehow empty and a 40-hex ref is created. Please
	examine these refs and maybe delete them. Turn this message off by
	running "git config advice.objectNameWarning false"

Maybe we need to skip that fetch if ${cset} matches [0-9a-fA-F]+ or skip
it if ${cset} doesn't contain '/' since I think all of the special refs
we're interested in there will contain at least one branch separator.

>  support/download/git | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/support/download/git b/support/download/git
> index 075f665bbf..3f26613e61 100755
> --- a/support/download/git
> +++ b/support/download/git
> @@ -134,6 +134,25 @@ if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
>      exit 1
>  fi
>  
> +# Check that the specified version is not a branch. We expect a tag or
> +# raw commit hash, and accept some special refs as above. Using a branch
> +# is forbidden because these are mutable references.
> +case "${cset}" in
> +    refs/heads/*)
> +        printf >&2 "Refusing to use Git branch '%s'.\n" "${cset#refs/heads/}"
> +        exit 1
> +        ;;
> +    refs/*)
> +        : pass
> +        ;;
> +    *)
> +        if _git rev-parse --quiet --verify "refs/heads/${cset}" >/dev/null 2>&1; then
> +            printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
> +            exit 1
> +        fi
> +        ;;
> +esac
> +
>  # The new cset we want to checkout might have different submodules, or
>  # have sub-dirs converted to/from a submodule. So we would need to
>  # deregister _current_ submodules before we checkout.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-19 15:34 ` John Keeping
@ 2019-06-20 16:39   ` Yann E. MORIN
  2019-06-21 16:36     ` John Keeping
  0 siblings, 1 reply; 13+ messages in thread
From: Yann E. MORIN @ 2019-06-20 16:39 UTC (permalink / raw)
  To: buildroot

John, All,

On 2019-06-19 16:34 +0100, John Keeping spake thusly:
> On Wed, 19 Jun 2019 16:18:17 +0100
> John Keeping <john@metanate.com> wrote:
> 
> > As described in the manual, using a branch name as a version is not
> > supported.  However, nothing enforces this so it is easy to specify a
> > branch name either accidentally or because new developers have not read
> > through the manual.
> > 
> > For Git it is reasonably easy to catch most violations of this rule and
> > fail the fetch phase.  This isn't intended to be a comprehensive filter
> > (it can be bypassed with, for example, FOO_VERSION=origin/master), but
> > should catch accidental use of a branch version and prompt switching to
> > an immutable reference.
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> 
> Just after sending this, I realised that the patch below doesn't work
> for versions specified as a SHA1.

I was going to reply to your original pathc, but since you noticed the
issue on your own :-) here is some additional feedback.

The sha1-as-branch is becuase of the so-called special refs. I already
sent a patch some time ago, but did not get around to respinning it,
which was followed by a patch to drop local branches altogether now:
and finally a patch to detect and abort when the cset was a branch:

    https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/dl-git-no-branch-3
    https://git.buildroot.org/~ymorin/git/buildroot/commit/?h=yem/dl-git-no-branch-3&id=6d82e95e6d2de17370734d9253ed11a81bb750f9
    https://git.buildroot.org/~ymorin/git/buildroot/commit/?h=yem/dl-git-no-branch-3&id=5788d8dcb46ccbb595a62568ee47bbce07563dd5
    https://git.buildroot.org/~ymorin/git/buildroot/commit/?h=yem/dl-git-no-branch-3&id=f5a962bcfe481ac3edba0d04ddc06369fa54e8de

So. if you're still interested in the topic, feel fre eto draw
inspiration from the above if you think they may be helpful.

There is some feedback on those somewhere on the list.

> When we have a SHA1 version, then the earlier call to:
> 
> 	_git fetch origin "'${cset}:${cset}'"
> 
> creates a *branch* refs/heads/${cset} for the SHA1.  Git then prints a
> warning when passing the SHA1 to rev-parse:
> 
> 	Git normally never creates a ref that ends with 40 hex characters
> 	because it will be ignored when you just specify 40-hex. These refs
> 	may be created by mistake. For example,
> 
> 	  git checkout -b $br $(git rev-parse ...)
> 
> 	where "$br" is somehow empty and a 40-hex ref is created. Please
> 	examine these refs and maybe delete them. Turn this message off by
> 	running "git config advice.objectNameWarning false"
> 
> Maybe we need to skip that fetch if ${cset} matches [0-9a-fA-F]+ or skip
> it if ${cset} doesn't contain '/' since I think all of the special refs
> we're interested in there will contain at least one branch separator.

I think it is much better to actually drop any local branch: since we do
not want to support branch by name, we don't need any local branch.

Thanks! :-)

Regards,
Yann E. MORIN.

> >  support/download/git | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/support/download/git b/support/download/git
> > index 075f665bbf..3f26613e61 100755
> > --- a/support/download/git
> > +++ b/support/download/git
> > @@ -134,6 +134,25 @@ if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
> >      exit 1
> >  fi
> >  
> > +# Check that the specified version is not a branch. We expect a tag or
> > +# raw commit hash, and accept some special refs as above. Using a branch
> > +# is forbidden because these are mutable references.
> > +case "${cset}" in
> > +    refs/heads/*)
> > +        printf >&2 "Refusing to use Git branch '%s'.\n" "${cset#refs/heads/}"
> > +        exit 1
> > +        ;;
> > +    refs/*)
> > +        : pass
> > +        ;;
> > +    *)
> > +        if _git rev-parse --quiet --verify "refs/heads/${cset}" >/dev/null 2>&1; then
> > +            printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
> > +            exit 1
> > +        fi
> > +        ;;
> > +esac
> > +
> >  # The new cset we want to checkout might have different submodules, or
> >  # have sub-dirs converted to/from a submodule. So we would need to
> >  # deregister _current_ submodules before we checkout.
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-19 15:18 [Buildroot] [RFC PATCH] download/git: ban branch references John Keeping
  2019-06-19 15:34 ` John Keeping
@ 2019-06-20 17:27 ` Joel Carlson
  2019-06-21 12:36   ` John Keeping
  1 sibling, 1 reply; 13+ messages in thread
From: Joel Carlson @ 2019-06-20 17:27 UTC (permalink / raw)
  To: buildroot

On Wed, Jun 19, 2019 at 9:19 AM John Keeping <john@metanate.com> wrote:
>
> As described in the manual, using a branch name as a version is not
> supported.  However, nothing enforces this so it is easy to specify a
> branch name either accidentally or because new developers have not read
> through the manual.

At a more general discussion level, I have to admit I like having the
ability to use a branch name even if it isn't supported. During
development I find it easier to use a branch name and just delete the
cached version to force it to re-download when I know I have new
updates.  I can switch to a tag when I have a version I actually want
tagged and "final."  Otherwise I'd have to keep tagging changes and
updating what tag to grab in the config or package file to pull in new
changes.

Granted, it isn't a huge inconvenience to switch how I do things, and
I could see it being useful to "idiot-proof" for people who don't
realize the potential problems with using branch names.  But I at
least wanted to make my opinion known that I like it the way it is.

-Joel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-20 17:27 ` [Buildroot] [RFC PATCH] download/git: ban branch references Joel Carlson
@ 2019-06-21 12:36   ` John Keeping
  0 siblings, 0 replies; 13+ messages in thread
From: John Keeping @ 2019-06-21 12:36 UTC (permalink / raw)
  To: buildroot

On Thu, 20 Jun 2019 11:27:02 -0600
Joel Carlson <JoelsonCarl@gmail.com> wrote:

> On Wed, Jun 19, 2019 at 9:19 AM John Keeping <john@metanate.com> wrote:
> >
> > As described in the manual, using a branch name as a version is not
> > supported.  However, nothing enforces this so it is easy to specify a
> > branch name either accidentally or because new developers have not read
> > through the manual.  
> 
> At a more general discussion level, I have to admit I like having the
> ability to use a branch name even if it isn't supported. During
> development I find it easier to use a branch name and just delete the
> cached version to force it to re-download when I know I have new
> updates.  I can switch to a tag when I have a version I actually want
> tagged and "final."  Otherwise I'd have to keep tagging changes and
> updating what tag to grab in the config or package file to pull in new
> changes.
> 
> Granted, it isn't a huge inconvenience to switch how I do things, and
> I could see it being useful to "idiot-proof" for people who don't
> realize the potential problems with using branch names.  But I at
> least wanted to make my opinion known that I like it the way it is.

I'm curious what advantages you see to this method compared to
_OVERRIDE_SRCDIR.  I find pointing to an external version of the source
to be more flexible and generally faster than pulling from Git every
time.


Regards,
John

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-20 16:39   ` Yann E. MORIN
@ 2019-06-21 16:36     ` John Keeping
  2019-06-22  7:47       ` Yann E. MORIN
  0 siblings, 1 reply; 13+ messages in thread
From: John Keeping @ 2019-06-21 16:36 UTC (permalink / raw)
  To: buildroot

Hi Yann,

On Thu, 20 Jun 2019 18:39:46 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> On 2019-06-19 16:34 +0100, John Keeping spake thusly:
> > On Wed, 19 Jun 2019 16:18:17 +0100
> > John Keeping <john@metanate.com> wrote:
> >   
> > > As described in the manual, using a branch name as a version is not
> > > supported.  However, nothing enforces this so it is easy to specify a
> > > branch name either accidentally or because new developers have not read
> > > through the manual.
> > > 
> > > For Git it is reasonably easy to catch most violations of this rule and
> > > fail the fetch phase.  This isn't intended to be a comprehensive filter
> > > (it can be bypassed with, for example, FOO_VERSION=origin/master), but
> > > should catch accidental use of a branch version and prompt switching to
> > > an immutable reference.
> > > 
> > > Signed-off-by: John Keeping <john@metanate.com>
> > > ---  
> > 
> > Just after sending this, I realised that the patch below doesn't work
> > for versions specified as a SHA1.  
> 
> I was going to reply to your original pathc, but since you noticed the
> issue on your own :-) here is some additional feedback.
> 
> The sha1-as-branch is becuase of the so-called special refs. I already
> sent a patch some time ago, but did not get around to respinning it,
> which was followed by a patch to drop local branches altogether now:
> and finally a patch to detect and abort when the cset was a branch:
> 
>     https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/dl-git-no-branch-3
>     https://git.buildroot.org/~ymorin/git/buildroot/commit/?h=yem/dl-git-no-branch-3&id=6d82e95e6d2de17370734d9253ed11a81bb750f9
>     https://git.buildroot.org/~ymorin/git/buildroot/commit/?h=yem/dl-git-no-branch-3&id=5788d8dcb46ccbb595a62568ee47bbce07563dd5
>     https://git.buildroot.org/~ymorin/git/buildroot/commit/?h=yem/dl-git-no-branch-3&id=f5a962bcfe481ac3edba0d04ddc06369fa54e8de
> 
> So. if you're still interested in the topic, feel fre eto draw
> inspiration from the above if you think they may be helpful.
> 
> There is some feedback on those somewhere on the list.

Thanks for these pointers, I found the related thread [1] and the
discussion there was really helpful.

What I propose is that we change our repository setup to essentially
mirror the remote (we can't clone with "--mirror" as that creates a bare
repo).  This means we change the fetch command to:

    git fetch origin --prune refs/*:refs/*

and remove the explicit fetch of $cset.

Since this fetches everything it allows the version to be the SHA-1 of a
special ref like a pull request or Gerrit changeset, which is a use case
Ricardo mentioned in the thread at [1].

We can then filter out non-tags with:

    case $(git rev-parse --symbolic-full-name "${cset}") in
        refs/tags/*
            : ok
            ;;
        refs/*
            printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
            ;;
        # Anything else is not a ref, must be a raw hash which is ok.
    esac

What do you think?


Regards,
John


[1] http://buildroot-busybox.2317881.n4.nabble.com/PATCH-0-3-download-detect-and-refuse-git-branch-by-name-td200050.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-21 16:36     ` John Keeping
@ 2019-06-22  7:47       ` Yann E. MORIN
  2019-06-24 11:30         ` John Keeping
  0 siblings, 1 reply; 13+ messages in thread
From: Yann E. MORIN @ 2019-06-22  7:47 UTC (permalink / raw)
  To: buildroot

John, All,

On 2019-06-21 17:36 +0100, John Keeping spake thusly:
> We can then filter out non-tags with:
>     case $(git rev-parse --symbolic-full-name "${cset}") in

Need for 2>/dev/null or there is a big warning printed (see below).

>         refs/tags/*
>             : ok
>             ;;
>         refs/*
>             printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
>             ;;
>         # Anything else is not a ref, must be a raw hash which is ok.

That's wrong, because branches may pre-exist from git-clones that we do
currently.

For example, I have a local git clone of linux-firmware, which has:

    $ git branch
    * 1baa34868b2c0a004dc595b20678145e3fff83e7
      44d4fca9922a252a0bd81f6307bcc072a78da54a
      d87753369b82c5f362250c197d04a1e1ef5bf698

    $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7
    warning: refname '1baa34868b2c0a004dc595b20678145e3fff83e7' is ambiguous.
    Git normally never creates a ref that ends with 40 hex characters
    because it will be ignored when you just specify 40-hex. These refs
    may be created by mistake. For example,

      git checkout -b $br $(git rev-parse ...)

    where "$br" is somehow empty and a 40-hex ref is created. Please
    examine these refs and maybe delete them. Turn this message off by
    running "git config advice.objectNameWarning false"
    refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7

    $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7 2>/dev/null
    refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7

So, 1baa34868b2c0a004dc595b20678145e3fff83e7 is the sha1 we curently use
in Buildroot, the two others we used in the past.

So, your code snippet above woud break the build. We have a single
option, really, which is to try and remove the local branch before
checking if the cset is a branch.

So, my position is that the plan stays basically the same:

  - get rid of special refs. As I already explained, we already have a
    mechanism to catter for that use-case: override-srcdir;

  - remove local branches;

  - refuse branches.

Regards,
Yann E. MORIN.

>     esac
> 
> What do you think?
> 
> 
> Regards,
> John
> 
> 
> [1] http://buildroot-busybox.2317881.n4.nabble.com/PATCH-0-3-download-detect-and-refuse-git-branch-by-name-td200050.html

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [RFC PATCH] download/git: ban branch references
  2019-06-22  7:47       ` Yann E. MORIN
@ 2019-06-24 11:30         ` John Keeping
  2019-06-24 11:32           ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote John Keeping
  0 siblings, 1 reply; 13+ messages in thread
From: John Keeping @ 2019-06-24 11:30 UTC (permalink / raw)
  To: buildroot

Hi Yann,

On Sat, 22 Jun 2019 09:47:13 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> On 2019-06-21 17:36 +0100, John Keeping spake thusly:
> > We can then filter out non-tags with:
> >     case $(git rev-parse --symbolic-full-name "${cset}") in  
> 
> Need for 2>/dev/null or there is a big warning printed (see below).

Agreed, although see below for that specific case.

> >         refs/tags/*
> >             : ok
> >             ;;
> >         refs/*
> >             printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
> >             ;;
> >         # Anything else is not a ref, must be a raw hash which is ok.  
> 
> That's wrong, because branches may pre-exist from git-clones that we do
> currently.
> 
> For example, I have a local git clone of linux-firmware, which has:
> 
>     $ git branch
>     * 1baa34868b2c0a004dc595b20678145e3fff83e7
>       44d4fca9922a252a0bd81f6307bcc072a78da54a
>       d87753369b82c5f362250c197d04a1e1ef5bf698
> 
>     $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7
>     warning: refname '1baa34868b2c0a004dc595b20678145e3fff83e7' is ambiguous.
>     Git normally never creates a ref that ends with 40 hex characters
>     because it will be ignored when you just specify 40-hex. These refs
>     may be created by mistake. For example,
> 
>       git checkout -b $br $(git rev-parse ...)
> 
>     where "$br" is somehow empty and a 40-hex ref is created. Please
>     examine these refs and maybe delete them. Turn this message off by
>     running "git config advice.objectNameWarning false"
>     refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7
> 
>     $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7 2>/dev/null
>     refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7
> 
> So, 1baa34868b2c0a004dc595b20678145e3fff83e7 is the sha1 we curently use
> in Buildroot, the two others we used in the past.
> 
> So, your code snippet above woud break the build. We have a single
> option, really, which is to try and remove the local branch before
> checking if the cset is a branch.

Note that the suggestion earlier in my message to change the fetch
mechanism avoids this problem by ensuring that these branches do not
exist.  It turns out to be slightly more complicated than just doing:

	git fetch origin --prune +refs/*:refs/*

but that can be made to work and means that we avoid the whole problem
of unexpected refs.

I'll send updated patches in reply to this message.


Regards,
John

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote
  2019-06-24 11:30         ` John Keeping
@ 2019-06-24 11:32           ` John Keeping
  2019-06-24 11:32             ` [Buildroot] [PATCH v2 2/2] download/git: ban branch references John Keeping
  2019-12-29 22:03             ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote Yann E. MORIN
  0 siblings, 2 replies; 13+ messages in thread
From: John Keeping @ 2019-06-24 11:32 UTC (permalink / raw)
  To: buildroot

Instead of fetching branches and tags then following up with another
request in case the specified version is a special branch like a pull
request, simply mirror the remote repository and fetch all of its refs.

This allows using a raw commit hash pointing to anything reachable on
the remote (for example a commit in a pull request which is not yet in a
branch), even if the server has not set allowReachableSHA1InWant.

There is one subtlety to this which is that Git will not let us update
the ref of a branch which is checked out.  Ideally we would just detach
HEAD to bypass this problem, but that doesn't work in an empty
repository when we are on an unborn branch.  Instead, we create an
unborn branch of our own pointing to a ref which is very unlikely to
exist.

Signed-off-by: John Keeping <john@metanate.com>
---
 support/download/git | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/support/download/git b/support/download/git
index 075f665bbf..02bf01bb95 100755
--- a/support/download/git
+++ b/support/download/git
@@ -111,20 +111,19 @@ fi
 
 _git remote set-url origin "'${uri}'"
 
+# We will use an explicit refspec, so we don't want any automatic
+# mapping from Git. This will fail if the key doesn't exist and we don't
+# want to delete the repository in that case so we ignore errors from
+# this command.
+_git config --unset-all remote.origin.fetch || true
+
 printf "Fetching all references\n"
-_git fetch origin
-_git fetch origin -t
-
-# Try to get the special refs exposed by some forges (pull-requests for
-# github, changes for gerrit...). There is no easy way to know whether
-# the cset the user passed us is such a special ref or a tag or a sha1
-# or whatever else. We'll eventually fail at checking out that cset,
-# below, if there is an issue anyway. Since most of the cset we're gonna
-# have to clone are not such special refs, consign the output to oblivion
-# so as not to alarm unsuspecting users, but still trace it as a warning.
-if ! _git fetch origin "'${cset}:${cset}'" >/dev/null 2>&1; then
-    printf "Could not fetch special ref '%s'; assuming it is not special.\n" "${cset}"
-fi
+# We can't update a checked-out branch and if we've just created a new
+# repo there aren't any commits to checkout a detached head, so we point
+# HEAD at a ref which really shouldn't exist, creating a temporary
+# unborn branch.
+_git symbolic-ref HEAD refs/buildroot-invalid-branch
+_git fetch --prune origin +refs/*:refs/*
 
 # Check that the changeset does exist. If it does not, re-cloning from
 # scratch won't help, so we don't want to trash the repository for a
-- 
2.22.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Buildroot] [PATCH v2 2/2] download/git: ban branch references
  2019-06-24 11:32           ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote John Keeping
@ 2019-06-24 11:32             ` John Keeping
  2019-12-29 22:12               ` Yann E. MORIN
  2019-12-29 22:03             ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote Yann E. MORIN
  1 sibling, 1 reply; 13+ messages in thread
From: John Keeping @ 2019-06-24 11:32 UTC (permalink / raw)
  To: buildroot

As described in the manual, using a branch name as a version is not
supported.  However, nothing enforces this so it is easy to specify a
branch name either accidentally or because new developers have not read
through the manual.

For Git it is reasonably easy to catch most violations of this rule and
fail the fetch phase.  We now only accept tags or raw commit hashes;
it's possible that there are other special refs which are known to be
stable and this can be extended to support those in the future if
required.

Signed-off-by: John Keeping <john@metanate.com>
---
 support/download/git | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/support/download/git b/support/download/git
index 02bf01bb95..5b5be92d15 100755
--- a/support/download/git
+++ b/support/download/git
@@ -133,6 +133,20 @@ if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
     exit 1
 fi
 
+# Check that the specified version is not a branch. We expect a tag or
+# raw commit hash, and accept some special refs as above. Using a branch
+# is forbidden because these are mutable references.
+case "$(_git rev-parse --symbolic-full-name "${cset}" 2>/dev/null)" in
+    refs/tags/*)
+        : ok
+        ;;
+    refs/*)
+        printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
+        exit 1
+        ;;
+    # Anything else is not a ref, must be a raw hash which is ok.
+esac
+
 # The new cset we want to checkout might have different submodules, or
 # have sub-dirs converted to/from a submodule. So we would need to
 # deregister _current_ submodules before we checkout.
-- 
2.22.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote
  2019-06-24 11:32           ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote John Keeping
  2019-06-24 11:32             ` [Buildroot] [PATCH v2 2/2] download/git: ban branch references John Keeping
@ 2019-12-29 22:03             ` Yann E. MORIN
  1 sibling, 0 replies; 13+ messages in thread
From: Yann E. MORIN @ 2019-12-29 22:03 UTC (permalink / raw)
  To: buildroot

John, All,

I am very sorry that I did not come back to your new iteration sooner...

On 2019-06-24 12:32 +0100, John Keeping spake thusly:
> Instead of fetching branches and tags then following up with another
> request in case the specified version is a special branch like a pull
> request, simply mirror the remote repository and fetch all of its refs.
> 
> This allows using a raw commit hash pointing to anything reachable on
> the remote (for example a commit in a pull request which is not yet in a
> branch), even if the server has not set allowReachableSHA1InWant.
> 
> There is one subtlety to this which is that Git will not let us update
> the ref of a branch which is checked out.  Ideally we would just detach
> HEAD to bypass this problem, but that doesn't work in an empty
> repository when we are on an unborn branch.  Instead, we create an
> unborn branch of our own pointing to a ref which is very unlikely to
> exist.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  support/download/git | 25 ++++++++++++-------------
>  1 file changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/support/download/git b/support/download/git
> index 075f665bbf..02bf01bb95 100755
> --- a/support/download/git
> +++ b/support/download/git
> @@ -111,20 +111,19 @@ fi
>  
>  _git remote set-url origin "'${uri}'"
>  
> +# We will use an explicit refspec, so we don't want any automatic
> +# mapping from Git. This will fail if the key doesn't exist and we don't
> +# want to delete the repository in that case so we ignore errors from
> +# this command.
> +_git config --unset-all remote.origin.fetch || true
> +
>  printf "Fetching all references\n"
> -_git fetch origin
> -_git fetch origin -t
> -
> -# Try to get the special refs exposed by some forges (pull-requests for
> -# github, changes for gerrit...). There is no easy way to know whether
> -# the cset the user passed us is such a special ref or a tag or a sha1
> -# or whatever else. We'll eventually fail at checking out that cset,
> -# below, if there is an issue anyway. Since most of the cset we're gonna
> -# have to clone are not such special refs, consign the output to oblivion
> -# so as not to alarm unsuspecting users, but still trace it as a warning.
> -if ! _git fetch origin "'${cset}:${cset}'" >/dev/null 2>&1; then

Removing that line would make me oh!-so-happy... :-)

But...

> -    printf "Could not fetch special ref '%s'; assuming it is not special.\n" "${cset}"
> -fi
> +# We can't update a checked-out branch and if we've just created a new
> +# repo there aren't any commits to checkout a detached head, so we point
> +# HEAD at a ref which really shouldn't exist, creating a temporary
> +# unborn branch.
> +_git symbolic-ref HEAD refs/buildroot-invalid-branch
> +_git fetch --prune origin +refs/*:refs/*

... this one makes me raise an eyebrow. If we --prune, that would mean
we may drop branches, and thus objects, that are not on the current
remote.

Consider that the remote of the git cache may change arbitrarily between
two runs. For example, if I have two configurations, one which uses a
linux kernel from (say) the rpi fork on github, and the other uses the
linux4sam for (also on github), then a build of the second may drop
branches of the first, and if the user has auto-gc, objects may also get
dropped too, so when I later rebuild the rpi config, I may need to fetch
objects that were removed from the git cache...

That's not so nice in the long run, because the git cache is explicitly
made to keep stuff locally as much as possible.

Did I miss something?

Regards,
Yann E. MORIN.

>  # Check that the changeset does exist. If it does not, re-cloning from
>  # scratch won't help, so we don't want to trash the repository for a
> -- 
> 2.22.0
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [PATCH v2 2/2] download/git: ban branch references
  2019-06-24 11:32             ` [Buildroot] [PATCH v2 2/2] download/git: ban branch references John Keeping
@ 2019-12-29 22:12               ` Yann E. MORIN
  2020-01-02 17:57                 ` John Keeping
  0 siblings, 1 reply; 13+ messages in thread
From: Yann E. MORIN @ 2019-12-29 22:12 UTC (permalink / raw)
  To: buildroot

John, All,

On 2019-06-24 12:32 +0100, John Keeping spake thusly:
> As described in the manual, using a branch name as a version is not
> supported.  However, nothing enforces this so it is easy to specify a
> branch name either accidentally or because new developers have not read
> through the manual.
> 
> For Git it is reasonably easy to catch most violations of this rule and
> fail the fetch phase.  We now only accept tags or raw commit hashes;
> it's possible that there are other special refs which are known to be
> stable and this can be extended to support those in the future if
> required.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  support/download/git | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/support/download/git b/support/download/git
> index 02bf01bb95..5b5be92d15 100755
> --- a/support/download/git
> +++ b/support/download/git
> @@ -133,6 +133,20 @@ if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
>      exit 1
>  fi
>  
> +# Check that the specified version is not a branch. We expect a tag or
> +# raw commit hash, and accept some special refs as above. Using a branch
> +# is forbidden because these are mutable references.
> +case "$(_git rev-parse --symbolic-full-name "${cset}" 2>/dev/null)" in
> +    refs/tags/*)
> +        : ok
> +        ;;
> +    refs/*)
> +        printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
> +        exit 1

Sorry, but as I previously explained, this breaks on _existing_ git
cached repositories. I'll repeat my previous example:

For example, I have a local git clone of linux-firmware, which has:

    $ git branch
    * 1baa34868b2c0a004dc595b20678145e3fff83e7
      44d4fca9922a252a0bd81f6307bcc072a78da54a
      d87753369b82c5f362250c197d04a1e1ef5bf698

    $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7
    warning: refname '1baa34868b2c0a004dc595b20678145e3fff83e7' is ambiguous.
    Git normally never creates a ref that ends with 40 hex characters
    because it will be ignored when you just specify 40-hex. These refs
    may be created by mistake. For example,

      git checkout -b $br $(git rev-parse ...)

    where "$br" is somehow empty and a 40-hex ref is created. Please
    examine these refs and maybe delete them. Turn this message off by
    running "git config advice.objectNameWarning false"
    refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7

    $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7 2>/dev/null
    refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7

So if we were oto use 1baa34868b2c0a004dc595b20678145e3fff83e7 (which we
did in the past), that would match the error path, which is not good.

Regards,
Yann E. MORIN.

> +        ;;
> +    # Anything else is not a ref, must be a raw hash which is ok.
> +esac
> +
>  # The new cset we want to checkout might have different submodules, or
>  # have sub-dirs converted to/from a submodule. So we would need to
>  # deregister _current_ submodules before we checkout.
> -- 
> 2.22.0
> 

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 561 099 427 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Buildroot] [PATCH v2 2/2] download/git: ban branch references
  2019-12-29 22:12               ` Yann E. MORIN
@ 2020-01-02 17:57                 ` John Keeping
  0 siblings, 0 replies; 13+ messages in thread
From: John Keeping @ 2020-01-02 17:57 UTC (permalink / raw)
  To: buildroot

Hi Yann,

On Sun, 29 Dec 2019 23:12:08 +0100
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> On 2019-06-24 12:32 +0100, John Keeping spake thusly:
> > As described in the manual, using a branch name as a version is not
> > supported.  However, nothing enforces this so it is easy to specify a
> > branch name either accidentally or because new developers have not read
> > through the manual.
> > 
> > For Git it is reasonably easy to catch most violations of this rule and
> > fail the fetch phase.  We now only accept tags or raw commit hashes;
> > it's possible that there are other special refs which are known to be
> > stable and this can be extended to support those in the future if
> > required.
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> >  support/download/git | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/support/download/git b/support/download/git
> > index 02bf01bb95..5b5be92d15 100755
> > --- a/support/download/git
> > +++ b/support/download/git
> > @@ -133,6 +133,20 @@ if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
> >      exit 1
> >  fi
> >  
> > +# Check that the specified version is not a branch. We expect a tag or
> > +# raw commit hash, and accept some special refs as above. Using a branch
> > +# is forbidden because these are mutable references.
> > +case "$(_git rev-parse --symbolic-full-name "${cset}" 2>/dev/null)" in
> > +    refs/tags/*)
> > +        : ok
> > +        ;;
> > +    refs/*)
> > +        printf >&2 "Refusing to use Git branch '%s'.\n" "${cset}"
> > +        exit 1  
> 
> Sorry, but as I previously explained, this breaks on _existing_ git
> cached repositories. I'll repeat my previous example:
> 
> For example, I have a local git clone of linux-firmware, which has:
> 
>     $ git branch
>     * 1baa34868b2c0a004dc595b20678145e3fff83e7
>       44d4fca9922a252a0bd81f6307bcc072a78da54a
>       d87753369b82c5f362250c197d04a1e1ef5bf698
> 
>     $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7
>     warning: refname '1baa34868b2c0a004dc595b20678145e3fff83e7' is ambiguous.
>     Git normally never creates a ref that ends with 40 hex characters
>     because it will be ignored when you just specify 40-hex. These refs
>     may be created by mistake. For example,
> 
>       git checkout -b $br $(git rev-parse ...)
> 
>     where "$br" is somehow empty and a 40-hex ref is created. Please
>     examine these refs and maybe delete them. Turn this message off by
>     running "git config advice.objectNameWarning false"
>     refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7
> 
>     $ git rev-parse --symbolic-full-name 1baa34868b2c0a004dc595b20678145e3fff83e7 2>/dev/null
>     refs/heads/1baa34868b2c0a004dc595b20678145e3fff83e7
> 
> So if we were oto use 1baa34868b2c0a004dc595b20678145e3fff83e7 (which we
> did in the past), that would match the error path, which is not good.

This is solved by the "--prune" in patch 1, which removes all of those
refs so we end up with only a mirror of the remote.

If we can't prune the local repository completely and need to worry
about keeping all refs when switching remotes, then this case is a
problem.  The other side of that, is that some repositories have lots of
short-lived branches and if we don't prune those, then the local repo
may end up much bigger than necessary.


John

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-01-02 17:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-19 15:18 [Buildroot] [RFC PATCH] download/git: ban branch references John Keeping
2019-06-19 15:34 ` John Keeping
2019-06-20 16:39   ` Yann E. MORIN
2019-06-21 16:36     ` John Keeping
2019-06-22  7:47       ` Yann E. MORIN
2019-06-24 11:30         ` John Keeping
2019-06-24 11:32           ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote John Keeping
2019-06-24 11:32             ` [Buildroot] [PATCH v2 2/2] download/git: ban branch references John Keeping
2019-12-29 22:12               ` Yann E. MORIN
2020-01-02 17:57                 ` John Keeping
2019-12-29 22:03             ` [Buildroot] [PATCH v2 1/2] download/git: fetch all refs from the remote Yann E. MORIN
2019-06-20 17:27 ` [Buildroot] [RFC PATCH] download/git: ban branch references Joel Carlson
2019-06-21 12:36   ` John Keeping

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.