git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Gwyneth Morgan <gwymor@tilde.club>
Cc: "Fangyi Zhou" <me@fangyi.io>,
	git@vger.kernel.org,
	"Birger Skogeng Pedersen" <birger.sp@gmail.com>,
	"Birger Skogeng Pedersen" <birgersp@gmail.com>,
	"Brandon Williams" <bwilliamseng@gmail.com>,
	"Brandon Williams" <bmwill@google.com>,
	"CB Bailey" <cbailey32@bloomberg.net>,
	"Christopher Díaz Riveros" <christopher.diaz.riv@gmail.com>,
	"Christopher Díaz Riveros" <chrisadr@gentoo.org>,
	"Ed Maste" <emaste@freebsd.org>,
	"Jean-Noël Avila" <jn.avila@free.fr>,
	"Jean-Noel Avila" <jean-noel.avila@scantech.fr>,
	"Jessica Clarke" <jrtc27@jrtc27.com>,
	"Jiang Xin" <worldhello.net@gmail.com>,
	"Jiang Xin" <zhiyou.jx@alibaba-inc.com>,
	"Kazuhiro Kato" <kato-k@ksysllc.co.jp>,
	"Kazuhiro Kato" <kazuhiro.kato@hotmail.co.jp>,
	"Kevin Willford" <Kevin.Willford@microsoft.com>,
	"Kevin Willford" <kewillf@microsoft.com>,
	"Peter Kaestle" <peter@piie.net>,
	"Peter Kaestle" <peter.kaestle@nokia.com>,
	"Sibi Siddharthan" <sibisiddharthan.github@gmail.com>,
	"Sibi Siddharthan" <sibisiv.siddharthan@gmail.com>,
	"Slavica Đukić" <slawica92@hotmail.com>,
	"Slavica Djukic" <slavicadj.ip2018@gmail.com>
Subject: Oddidies in the .mailmap parser & future syntax extensions
Date: Fri, 10 Sep 2021 18:48:26 +0200
Message-ID: <877dfocps2.fsf@evledraar.gmail.com>
In-Reply-To: <YTt4RymWg+TOEmUf@tilde.club>


[Changed subject]

On Fri, Sep 10 2021, Gwyneth Morgan wrote:

> On 2021-09-10 14:02:36+0100, Fangyi Zhou wrote:
>> Similar to a35b13fce0 (Update .mailmap, 2018-11-09).
>> 
>> This patch makes the output of `git shortlog -nse v2.10.0..master`
>> duplicate-free by taking/guessing the current and preferred
>> addresses for authors that appear with more than one address.
>
> The line for Jessica Clarke should probably just be
>
> Jessica Clarke <jrtc27@jrtc27.com>
>
> That works the same and doesn't put a reference to an old name.

It does work exactly the same!

More specifically, this is an unintentional bug/misfeature/looseness in
the .mailmap parser. An entry like:

    Foo <foo@example.com> Bar

is exactly equivalent to:

    Foo <foo@example.com>

I.e. we simply ignore the " Bar" part. The reason for this is that we're
internally treating nonsense input as if the line simply ended there.
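
The behavior described above can be sketched with a small Python model.
This is only an illustration under stated assumptions, not git's actual
C parser: the regex `_PAIR` and the function `parse_mailmap_line` are
hypothetical names, and the model only covers the "optional name, then
<email>" shape discussed here.

```python
import re

# Hypothetical model: an optional name followed by one <email>.
_PAIR = re.compile(r"\s*([^<>]*?)\s*<([^<>]*)>")

def parse_mailmap_line(line):
    """Return (new_name, new_email, old_name, old_email), or None if skipped."""
    if line.startswith("#"):
        return None
    first = _PAIR.match(line)
    if not first or not first.group(2):
        return None  # no usable first address: the line is skipped
    # Look for a second "name <email>" pair right after the first one.
    second = _PAIR.match(line, first.end())
    new_name = first.group(1) or None
    new_email = first.group(2)
    if second:
        old_name = second.group(1) or None
        old_email = second.group(2)
    else:
        # Trailing text without <...> is treated as if the line ended here.
        old_name = old_email = None
    return (new_name, new_email, old_name, old_email)

with_tail = parse_mailmap_line("Foo <foo@example.com> Bar")
without_tail = parse_mailmap_line("Foo <foo@example.com>")
```

In this model both calls yield the same tuple, which is the equivalence
described above: the " Bar" tail never becomes part of the mapping.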

Even having documented and tested some of this recently in 05b5ff219c2
(mailmap doc + tests: add better examples & test them, 2021-01-12) I
found this a bit surprising. I probably found out at the time, but
forgot and had to go source spelunking again.

I'd expect:

    Foo <foo@example.com> Bar

To be an alias/shorthand for:

    Foo <foo@example.com> Bar <foo@example.com>

Which is something that might be applicable / useful in some
cases.

E.g. a name might change over time from "Foo", to "Bar", to "Zar", but
just because we're at "Bar" and want to map "Foo" to "Bar", that might
not mean that we'd like to map any future name at the same address
(i.e. the future "Zar") to the same "Foo".

In practice I suspect that's more commonly what people do want to do.
Maybe we should warn about it; I did mean to hook some pedantic mode of
the parser up to git-fsck at some point.

More annoying is that this:

    New <foo@example.com> <bar@example.com>
    <foo@example.com> <zar@example.com>

Doesn't mean the same as:

    New <foo@example.com> <bar@example.com>
    New <foo@example.com> <zar@example.com>

I.e. I'd expect the name to map to the empty string, *unless* we saw an
earlier address, i.e. just as we do for the first bar -> foo line (we
map it to a name of "New", we don't map it to an empty name).
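
To make the difference between the two forms concrete, here is a minimal
Python sketch of applying such entries. It assumes the documented
gitmailmap(5) behavior that a line of the form
"<proper@email> <commit@email>" replaces only the email part; the
function `apply_mailmap` and the tuple layout are hypothetical, and the
lookup is simplified to email-only matching.

```python
def apply_mailmap(entries, name, email):
    """Map a commit's (name, email) using simplified mailmap entries.

    entries: list of (new_name, new_email, old_email) tuples, where
    new_name is None when the line gave no name.
    """
    for new_name, new_email, old_email in entries:
        if old_email.lower() == email.lower():
            # A missing new name leaves the commit's own name untouched.
            return (new_name or name, new_email or email)
    return (name, email)

entries = [
    ("New", "foo@example.com", "bar@example.com"),  # New <foo> <bar>
    (None, "foo@example.com", "zar@example.com"),   # <foo> <zar>
]

via_bar = apply_mailmap(entries, "Old", "bar@example.com")
via_zar = apply_mailmap(entries, "Old", "zar@example.com")
```

Under these assumptions `via_bar` carries the name "New", while
`via_zar` keeps the commit's original name "Old" and only gets the new
address, so the second line does not inherit "New" from the first.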

So that's some #leftoverbits, perhaps someone somewhere relies on that,
but it seems like an obvious shorthand to have. I can't imagine it being
useful to map to empty names, and much of e.g. git.git's mailmap is
repeated entries with the same name over and over again.

I suppose we could also extend it to new syntax such as:

    New <foo@example.com> <bar@example.com> <zar@example.com>

Doing that would be strictly backwards compatible, i.e. right now we
entirely ignore the 3rd E-Mail address. It does mean we also
accidentally support things like:

    New <foo@example.com> <bar@example.com> # A comment, because we ignore everything after the 2nd address

But don't tell anyone I told you that :) It is something that might
technically have inadvertently closed the door to future syntax
extensions, but we could probably do them anyway, or at worst have some
heuristic.
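
As a purely hypothetical sketch of the multi-address syntax floated
above (nothing like this exists in git), such a line could be treated
as sugar for one two-address line per old address. The function
`expand_multi_address` and its rule "three or more <...> groups means
shorthand" are assumptions made only for illustration.

```python
import re

_ADDR = re.compile(r"<[^<>]*>")

def expand_multi_address(line):
    """Desugar 'Name <new> <old1> <old2> ...' into two-address lines."""
    addrs = _ADDR.findall(line)
    if len(addrs) < 3:
        return [line]  # today's syntax: leave the line alone
    # Everything up to and including the first address is the new identity.
    prefix = line[: line.index(addrs[0])] + addrs[0]
    return [f"{prefix} {old}" for old in addrs[1:]]

expanded = expand_multi_address(
    "New <foo@example.com> <bar@example.com> <zar@example.com>"
)
commented = expand_multi_address(
    "New <foo@example.com> <bar@example.com> # A comment"
)
```

A trailing comment without angle brackets passes through unchanged
here, which is the kind of heuristic mentioned above. A comment that
itself contains something like <...> would be misread as another
address, so a real extension would need a stricter rule.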

Another useful thing might be to support:

    New <> Old <>

As an explicit mapping of the name "Old" wherever we see it to "New", or:

    New <> Old <>

To change just the name "Old" to "New" everywhere, without considering
the E-Mail address. Both of those are probably too crazy to be useful,
especially since if we supported that we'd logically also support:

    New <> <>

To assign all the commits to the name "New", but retain the address.

Thread overview: 13+ messages
2021-09-10 13:02 [PATCH] .mailmap: Update mailmap Fangyi Zhou
2021-09-10 15:22 ` Gwyneth Morgan
2021-09-10 15:31   ` Jeff King
2021-09-10 15:35     ` Sibi Siddharthan
2021-09-11  0:32     ` Junio C Hamano
2021-09-11  1:31       ` Ævar Arnfjörð Bjarmason
2021-09-11 14:52         ` Jeff King
2021-09-11 14:47       ` Jeff King
2021-09-10 16:48   ` Ævar Arnfjörð Bjarmason [this message]
2021-09-10 18:11     ` Oddidies in the .mailmap parser & future syntax extensions Junio C Hamano
2021-09-10 19:50       ` Ævar Arnfjörð Bjarmason
2021-09-10 20:20         ` Junio C Hamano
2021-09-13  4:02           ` Ævar Arnfjörð Bjarmason
