All of lore.kernel.org
 help / color / mirror / Atom feed
* bug: git name-rev --stdin --no-undefined on detached head
@ 2021-12-22 10:05 Erik Cervin Edin
  2021-12-23 18:39 ` John Cai
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Cervin Edin @ 2021-12-22 10:05 UTC (permalink / raw)
  To: git

Hey all!

I ran into a situation that I think may be a bug
using git name-rev for detached heads.

Steps to reproduce:
Create a detached head
  git checkout --detached
  git commit --allow-empty -m foo

Expected results:
My understanding is that
  git name-rev $(git rev-list -1 HEAD)
  git rev-list -1 HEAD | git name-rev --stdin
should yield the same result.

As well as combining with other flags
like --name-only / --no-undefined

Actual results:
Where this fails as expected
  git name-rev --no-undefined $(git rev-list HEAD)
this just prints the SHA wo failing
  git rev-list -1 HEAD |  git name-rev --stdin --no-undefined

"name-only" is also affected
  git rev-list -1 HEAD |  git name-rev --stdin --name-only
returns the SHA and not the name

Tested on
git version 2.34.1.windows.1
-- 
Erik Cervin-Edin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-22 10:05 bug: git name-rev --stdin --no-undefined on detached head Erik Cervin Edin
@ 2021-12-23 18:39 ` John Cai
  2021-12-24  6:09   ` John Cai
  2021-12-24 19:42   ` Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: John Cai @ 2021-12-23 18:39 UTC (permalink / raw)
  To: Erik Cervin Edin; +Cc: git

It seems like this bug can be generalized to “git name-rev --stdin” does not work with --no-undefined nor --name-only

The --name-only case seems clear to me that we should fix it. It’s misleading to return the sha instead of “undefined” for a rev without a symbolic name, as a sha could be a symbolic name.

I think we can also make the argument that --no-undefined should also die in --stdin mode when given a rev without any symbolic names.


> On Dec 22, 2021, at 2:05 AM, Erik Cervin Edin <erik@cervined.in> wrote:
> 
> Hey all!
> 
> I ran into a situation that I think may be a bug
> using git name-rev for detached heads.
> 
> Steps to reproduce:
> Create a detached head
>  git checkout --detached
>  git commit --allow-empty -m foo
> 
> Expected results:
> My understanding is that
>  git name-rev $(git rev-list -1 HEAD)
>  git rev-list -1 HEAD | git name-rev --stdin
> should yield the same result.
> 
> As well as combining with other flags
> like --name-only / --no-undefined
> 
> Actual results:
> Where this fails as expected
>  git name-rev --no-undefined $(git rev-list HEAD)
> this just prints the SHA wo failing
>  git rev-list -1 HEAD |  git name-rev --stdin --no-undefined
> 
> "name-only" is also affected
>  git rev-list -1 HEAD |  git name-rev --stdin --name-only
> returns the SHA and not the name
> 
> Tested on
> git version 2.34.1.windows.1
> -- 
> Erik Cervin-Edin


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-23 18:39 ` John Cai
@ 2021-12-24  6:09   ` John Cai
  2021-12-24 11:44     ` Erik Cervin Edin
  2021-12-24 19:42   ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: John Cai @ 2021-12-24  6:09 UTC (permalink / raw)
  To: Erik Cervin Edin; +Cc: git


> On Dec 23, 2021, at 10:39 AM, John Cai <jcai@gitlab.com> wrote:
> 
> It seems like this bug can be generalized to “git name-rev --stdin” does not work with --no-undefined nor --name-only
> 
> The --name-only case seems clear to me that we should fix it. It’s misleading to return the sha instead of “undefined” for a rev without a symbolic name, as a sha could be a symbolic name.
> 
> I think we can also make the argument that --no-undefined should also die in --stdin mode when given a rev without any symbolic names.

While I think this would make name-rev more consistent, I’d be interested in hearing what others think about changing the behavior of this command. This would have the potential of breaking scripts that rely on the current behavior. Since I’m a bit new, I’m wondering how we generally handle these cases?

> 
> 
>> On Dec 22, 2021, at 2:05 AM, Erik Cervin Edin <erik@cervined.in> wrote:
>> 
>> Hey all!
>> 
>> I ran into a situation that I think may be a bug
>> using git name-rev for detached heads.
>> 
>> Steps to reproduce:
>> Create a detached head
>> git checkout --detached
>> git commit --allow-empty -m foo
>> 
>> Expected results:
>> My understanding is that
>> git name-rev $(git rev-list -1 HEAD)
>> git rev-list -1 HEAD | git name-rev --stdin
>> should yield the same result.
>> 
>> As well as combining with other flags
>> like --name-only / --no-undefined
>> 
>> Actual results:
>> Where this fails as expected
>> git name-rev --no-undefined $(git rev-list HEAD)
>> this just prints the SHA wo failing
>> git rev-list -1 HEAD |  git name-rev --stdin --no-undefined
>> 
>> "name-only" is also affected
>> git rev-list -1 HEAD |  git name-rev --stdin --name-only
>> returns the SHA and not the name
>> 
>> Tested on
>> git version 2.34.1.windows.1
>> -- 
>> Erik Cervin-Edin
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-24  6:09   ` John Cai
@ 2021-12-24 11:44     ` Erik Cervin Edin
  0 siblings, 0 replies; 8+ messages in thread
From: Erik Cervin Edin @ 2021-12-24 11:44 UTC (permalink / raw)
  To: John Cai; +Cc: git

> On Fri, Dec 24, 2021 at 7:09 AM John Cai <johncai86@gmail.com> wrote:
> This would have the potential of breaking scripts that rely on the current behavior.

Presumably, using --no-undefined scripts rely on presumed behavior
that doesn't exist.
Similarly, using --name-only should expect to return a symbolic name
While it's possible, I think it's rare that this would break scripts
in an unacceptable way.

> On Fri, Dec 24, 2021 at 7:09 AM John Cai <johncai86@gmail.com> wrote:
> Since I’m a bit new, I’m wondering how we generally handle these cases?

I'm also new so I can't really comment on this :)
Though I believe the git suite in general is quite adamant on
backwards compatibility

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-23 18:39 ` John Cai
  2021-12-24  6:09   ` John Cai
@ 2021-12-24 19:42   ` Junio C Hamano
  2021-12-24 20:09     ` John Cai
                       ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Junio C Hamano @ 2021-12-24 19:42 UTC (permalink / raw)
  To: John Cai; +Cc: Erik Cervin Edin, git

John Cai <jcai@gitlab.com> writes:

> It seems like this bug can be generalized to “git name-rev
> --stdin” does not work with --no-undefined nor --name-only
>
> The --name-only case seems clear to me that we should fix
> it. It’s misleading to return the sha instead of “undefined”
> for a rev without a symbolic name, as a sha could be a symbolic
> name.
>
> I think we can also make the argument that --no-undefined should
> also die in --stdin mode when given a rev without any symbolic
> names.

Hmph, the manual page documents:

    --stdin::
            Transform stdin by substituting all the 40-character SHA-1
            hexes (say $hex) with "$hex ($rev_name)".  When used with
            --name-only, substitute with "$rev_name", omitting $hex
            altogether.  Intended for the scripter's use.

It is unfortunate that the way this option works is confusingly a
bit different from what we learned to expect from the --stdin option
other subcommands like "git pack-objects --stdin" takes.  In short,
these are not equivalent:

	git name-rev [<options>] $string
        printf "%s" "$string" | xargs git name-rev [<options>]
        printf "%s" "$string" | git name-rev --stdin [<options>]

The first two are supposed to be the equivalent, but the third one
is different by design.  Its `--stdin` mode is expected to read
something like this [*]:

	$ cat sample.txt
        A revision that exists 2ae0a9cb82 is shown here,
        and its full name is 2ae0a9cb8298185a94e5998086f380a355dd8907
        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
        which probably is undescribable hexdigits.

and its designed use is to annotate its input into a more reader
friendly from with refnames where possible.  Here is what we get:

        $ git name-rev --stdin <sample.txt
        A revision that exists 2ae0a9cb82 is shown here,
        and its full name is 2ae0a9cb8298185a94e5998086f380a355dd8907 (master)
        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
        which probably is undescribable hexdigits.

I notice a few things.

 * An abbreviated commit object name is not affected;

 * A 40-digit string that cannot be described with a reference is
   left alone, without "undefined".

It might be debatable that the latter may want to be annotated with
"undefined", but as the command does not molest other noise strings
like "its" "full" name" in the input, I think the current behaviour
is preferred over appending "(undefined)" after a string we do not
recognize that happens to be 40-hex.

When used with --name-only, we see this:

	$ git name-rev --name-only --stdin <sample.txt
        A revision that exists 2ae0a9cb82 is shown here,
        and its full name is master
        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
        which probably is undescribable hexdigits.

So, as far as I can see, it is working as described.  If there is
any bug in the things I saw and shown here, it is that it is
misleading to claim that this behaviour is intended for scripter's
use.  It clearly is not scripter friendly when you want to run
"name-rev" on unbounded number of object names you have, which may
not fit on the command line, as that is not how it was designed to
be used.

Two possible things we can do to improve are

 * Fix the documentation; it is not for scripters but for annotating
   text with object names.

 * Possibly add --names-from-standard-input option that would behave
   more like "we cannot afford to stuff all object names on the
   command line, so we feed them one by one from the standard input"
   mode the "--stdin" option of other subcommands use.

I do not think the latter is so important, as it is perfectly OK to
use xargs to split the large input into multiple invocations of
name-rev.  This is unlike "pack-objects --stdin" where the command
needs to see _all_ input in a single invocation.


[Footnote]

* The sample input was produced with

        $ cat >sample.txt <<EOF
        A revision that exists $(git rev-parse --short HEAD) is shown here,
        and its full name is $(git rev-parse HEAD)
        while its tree object is $(git rev-parse HEAD:)
        which probably is undescribable hexdigits.
        EOF

if you want to try it at home ;-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-24 19:42   ` Junio C Hamano
@ 2021-12-24 20:09     ` John Cai
  2021-12-25  0:35     ` Junio C Hamano
  2021-12-31 17:16     ` Philip Oakley
  2 siblings, 0 replies; 8+ messages in thread
From: John Cai @ 2021-12-24 20:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Erik Cervin Edin, git



> On Dec 24, 2021, at 11:42 AM, Junio C Hamano <gitster@pobox.com> wrote:
> 
> John Cai <jcai@gitlab.com> writes:
> 
>> It seems like this bug can be generalized to “git name-rev
>> --stdin” does not work with --no-undefined nor --name-only
>> 
>> The --name-only case seems clear to me that we should fix
>> it. It’s misleading to return the sha instead of “undefined”
>> for a rev without a symbolic name, as a sha could be a symbolic
>> name.
>> 
>> I think we can also make the argument that --no-undefined should
>> also die in --stdin mode when given a rev without any symbolic
>> names.
> 
> Hmph, the manual page documents:
> 
>    --stdin::
>            Transform stdin by substituting all the 40-character SHA-1
>            hexes (say $hex) with "$hex ($rev_name)".  When used with
>            --name-only, substitute with "$rev_name", omitting $hex
>            altogether.  Intended for the scripter's use.
> 
> It is unfortunate that the way this option works is confusingly a
> bit different from what we learned to expect from the --stdin option
> other subcommands like "git pack-objects --stdin" takes.  In short,
> these are not equivalent:
> 
> 	git name-rev [<options>] $string
>        printf "%s" "$string" | xargs git name-rev [<options>]
>        printf "%s" "$string" | git name-rev --stdin [<options>]
> 
> The first two are supposed to be the equivalent, but the third one
> is different by design.  Its `--stdin` mode is expected to read
> something like this [*]:
> 
> 	$ cat sample.txt
>        A revision that exists 2ae0a9cb82 is shown here,
>        and its full name is 2ae0a9cb8298185a94e5998086f380a355dd8907
>        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
>        which probably is undescribable hexdigits.
> 
> and its designed use is to annotate its input into a more reader
> friendly from with refnames where possible.  

No wonder! This elucidates why I found the user experience of the --stdin 
mode a bit unexpected, as it would simply echo back the input when it couldn’t find a 
valid object or refname rather than return an error message. I think I totally missed 
the verbiage in the documentation that states its meant to **substitute** text in stdin.
Makes sense to echo back the input if nothing useful could be found.

> Here is what we get:
> 
>        $ git name-rev --stdin <sample.txt
>        A revision that exists 2ae0a9cb82 is shown here,
>        and its full name is 2ae0a9cb8298185a94e5998086f380a355dd8907 (master)
>        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
>        which probably is undescribable hexdigits.
> 
> I notice a few things.
> 
> * An abbreviated commit object name is not affected;
> 
> * A 40-digit string that cannot be described with a reference is
>   left alone, without "undefined".
> 
> It might be debatable that the latter may want to be annotated with
> "undefined", but as the command does not molest other noise strings
> like "its" "full" name" in the input, I think the current behaviour
> is preferred over appending "(undefined)" after a string we do not
> recognize that happens to be 40-hex.
> 
> When used with --name-only, we see this:
> 
> 	$ git name-rev --name-only --stdin <sample.txt
>        A revision that exists 2ae0a9cb82 is shown here,
>        and its full name is master
>        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
>        which probably is undescribable hexdigits.
> 
> So, as far as I can see, it is working as described.  If there is
> any bug in the things I saw and shown here, it is that it is
> misleading to claim that this behaviour is intended for scripter's
> use.  It clearly is not scripter friendly when you want to run
> "name-rev" on unbounded number of object names you have, which may
> not fit on the command line, as that is not how it was designed to
> be used.
> 
> Two possible things we can do to improve are
> 
> * Fix the documentation; it is not for scripters but for annotating
>   text with object names.

Makes sense to update the documentation to make it clear what --stdin mode is
meant to do. 

> 
> * Possibly add --names-from-standard-input option that would behave
>   more like "we cannot afford to stuff all object names on the
>   command line, so we feed them one by one from the standard input"
>   mode the "--stdin" option of other subcommands use.
> 
> I do not think the latter is so important, as it is perfectly OK to
> use xargs to split the large input into multiple invocations of
> name-rev.  This is unlike "pack-objects --stdin" where the command
> needs to see _all_ input in a single invocation.
> 
> 
> [Footnote]
> 
> * The sample input was produced with
> 
>        $ cat >sample.txt <<EOF
>        A revision that exists $(git rev-parse --short HEAD) is shown here,
>        and its full name is $(git rev-parse HEAD)
>        while its tree object is $(git rev-parse HEAD:)
>        which probably is undescribable hexdigits.
>        EOF
> 
> if you want to try it at home ;-)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-24 19:42   ` Junio C Hamano
  2021-12-24 20:09     ` John Cai
@ 2021-12-25  0:35     ` Junio C Hamano
  2021-12-31 17:16     ` Philip Oakley
  2 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2021-12-25  0:35 UTC (permalink / raw)
  To: John Cai; +Cc: Erik Cervin Edin, git

Junio C Hamano <gitster@pobox.com> writes:

> Two possible things we can do to improve are
>
>  * Fix the documentation; it is not for scripters but for annotating
>    text with object names.
>
>  * Possibly add --names-from-standard-input option that would behave
>    more like "we cannot afford to stuff all object names on the
>    command line, so we feed them one by one from the standard input"
>    mode the "--stdin" option of other subcommands use.
>
> I do not think the latter is so important, as it is perfectly OK to
> use xargs to split the large input into multiple invocations of
> name-rev.  This is unlike "pack-objects --stdin" where the command
> needs to see _all_ input in a single invocation.

This is primarily to teach newer developers how incompatible changes
are done in this project.  I am not suggesting that introducing such
an incompatible change and following through these steps to the end
is a good idea in this case.

Hypothetically, if name-rev were so important a command that the
xargs based workaround were unacceptable, what we would probably
do is to make the following changes over time.

 1. Introduce a new `--annotate-text` option, which is a synonym to
    the current `--stdin` option.  `--stdin` will work the same way
    as today, but using it will issue a warning() to the standard
    error to tell people to use the `--annotate-text` option
    instead.

 2. After a while, change `--stdin` to die() with a suggestion that
    people should instead pipe into xargs to invoke name-rev
    instead.

 3. After the above two steps, users and scripts of `--stdin` will
    go extinct.  Reintroduce `--stdin` but with a behaviour more in
    line with the `--stdin` option of other subcommands, i.e. take
    one arg per line from the standard input, and behave as if they
    came in the argv[] array.

The idea is to give users an early warning that is annoying enough
to encourage migration, ample time to adjust to the new world order,
and to ensure that there are no stragglers that gets hurt when the
name gets reused for an option with totally different behaviour.

It may not be a bad idea to do steps #1 and #2, though.  I do not
think xargs name-rev is bad enough to require going to the step #3,
but `--stdin` that exists and does a wrong thing is much worse than
`--stdin` that does not exist, and finishing upto step #2 is enough
to rectify that.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: bug: git name-rev --stdin --no-undefined on detached head
  2021-12-24 19:42   ` Junio C Hamano
  2021-12-24 20:09     ` John Cai
  2021-12-25  0:35     ` Junio C Hamano
@ 2021-12-31 17:16     ` Philip Oakley
  2 siblings, 0 replies; 8+ messages in thread
From: Philip Oakley @ 2021-12-31 17:16 UTC (permalink / raw)
  To: Junio C Hamano, John Cai; +Cc: Erik Cervin Edin, git

Thanks you for the example..

On 24/12/2021 19:42, Junio C Hamano wrote:
> * The sample input was produced with
>
>         $ cat >sample.txt <<EOF
>         A revision that exists $(git rev-parse --short HEAD) is shown here,
>         and its full name is $(git rev-parse HEAD)
>         while its tree object is $(git rev-parse HEAD:)
>         which probably is undescribable hexdigits.
>         EOF
>
> if you want to try it at home ;-)

The ` $(git rev-parse HEAD:) ` technique is serendipitous. 

I'd forgotten that was how to get the commit object's tree  for further
investigation, as I'd need it for my `deadhead` query [1], as used
regularly in the Git-for-Windows merging-rebase (see also [2]).

Philip
[1]
https://lore.kernel.org/git/be7ec330-2b2c-a01f-c7ca-e5e752493ee0@iee.email/
[2]
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2112101528200.90@tvgsbejvaqbjf.bet/

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-12-31 17:16 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-22 10:05 bug: git name-rev --stdin --no-undefined on detached head Erik Cervin Edin
2021-12-23 18:39 ` John Cai
2021-12-24  6:09   ` John Cai
2021-12-24 11:44     ` Erik Cervin Edin
2021-12-24 19:42   ` Junio C Hamano
2021-12-24 20:09     ` John Cai
2021-12-25  0:35     ` Junio C Hamano
2021-12-31 17:16     ` Philip Oakley

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.