* What's in a name? Let's use a (uuid,name,email) triplet
@ 2010-03-18 13:23 Michael Witten
  2010-03-18 13:48 ` Jon Smirl
                   ` (6 more replies)
  0 siblings, 7 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 13:23 UTC (permalink / raw)
  To: git

Short Version:
-------------


Rather than use a (name,email) pair to identify people, let's use
a (uuid,name,email) triplet.

The uuid can be any piece of information that a user of git determines
to be reasonably unique across space and time and that is intended to
be used by that user virtually forever (at least within a project's
history).

For instance, the uuid could be an OSF DCE 1.1 UUID or the SHA-1 of
some easily remembered, already reasonably unique information.

This could really help keep identifications clean, and it is rather
straightforward and possibly quite efficient.


Long Version:
------------


There are 2 reasons why people contribute (pro bono) to projects:

  (0) To improve the project.
  (1) To garner recognition.

and in my experience, (0) is not as sweet without (1).

One of the great boons of distributed systems like git is that they
separate author (contributor) identities from committer identities,
thereby maintaining (some semblance of) proper attribution in an
official, structured format that is amenable to parsing by tools.

While git's use of (name,email) pairs to identify each person is
extremely practical, it turns out that it's rather `unstable';
consider the following information gleaned from a clone of the
official git repository:

    $ git shortlog -se origin/master | grep Linus
         3  Linus Torvalds <torvalds@evo.osdl.org>
       122  Linus Torvalds <torvalds@g5.osdl.org>
       235  Linus Torvalds <torvalds@linux-foundation.org>
       276  Linus Torvalds <torvalds@osdl.org>
         9  Linus Torvalds <torvalds@ppc970.osdl.org.(none)>
       439  Linus Torvalds <torvalds@ppc970.osdl.org>
         9  Linus Torvalds <torvalds@woody.linux-foundation.org>

    $ git shortlog -se origin/master | grep Junio
      3658  Junio C Hamano <gitster@pobox.com>
         2  Junio C Hamano <junio@hera.kernel.org>
         3  Junio C Hamano <junio@kernel.org>
         3  Junio C Hamano <junio@pobox.com>
         8  Junio C Hamano <junio@twinsun.com>
      4167  Junio C Hamano <junkio@cox.net>
         2  Junio C Hamano <junkio@twinsun.com>
         2  Junio Hamano <gitster@pobox.com>

or using a clone of Linus's Linux repo:

    $ git shortlog -se origin/master | grep Linus
         2  Linus Luessing <linus.luessing@web.de>
         2  Linus Lüssing <linus.luessing@web.de>
         2  Linus Nilsson <lajnold@acc.umu.se>
         2  Linus Nilsson <lajnold@gmail.com>
        32  Linus Torvalds <torvalds@evo.osdl.org>
      1522  Linus Torvalds <torvalds@g5.osdl.org>
      4174  Linus Torvalds <torvalds@linux-foundation.org>
         7  Linus Torvalds <torvalds@macmini.osdl.org>
         2  Linus Torvalds <torvalds@merom.osdl.org>
         8  Linus Torvalds <torvalds@osdl.org>
         4  Linus Torvalds <torvalds@ppc970.osdl.org.(none)>
       166  Linus Torvalds <torvalds@ppc970.osdl.org>
         1  Linus Torvalds <torvalds@quad.osdl.org>
      1606  Linus Torvalds <torvalds@woody.linux-foundation.org>
       174  Linus Torvalds <torvalds@woody.osdl.org>
         1  Linus Walleij (LD/EAB <linus.walleij@ericsson.com>
         3  Linus Walleij <linus.ml.walleij@gmail.com>
         1  Linus Walleij <linus.walleij@ericsson.com>
        81  Linus Walleij <linus.walleij@stericsson.com>
         9  Linus Walleij <triad@df.lth.se>

    $ git shortlog -se origin/master | grep Morton
       581  Andrew Morton <akpm@linux-foundation.org>
       836  Andrew Morton <akpm@osdl.org>
         1  Andrew Morton <len.brown@intel.com>

From these few examples it seems pretty clear that the most volatile
portion of the (name,email) pair is the email, which is unfortunate
because the email is the most uniquely identifying information. Are
we really reasonably certain that these two are the same person?

    Linus Walleij <linus.ml.walleij@gmail.com>
    Linus Walleij <linus.walleij@ericsson.com>
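
To see how quickly the email portion fans out, the shortlog listings
above can be grouped by name. This is only a rough sketch: the regular
expression and the helper name emails_by_name are illustrative
assumptions, and grouping on the name alone makes exactly the guess
questioned above (that equal names mean the same person).

```python
import re
from collections import defaultdict

# Matches one line of `git shortlog -se` output: count, name, <email>.
SHORTLOG_LINE = re.compile(r"^\s*(\d+)\s+(.*?)\s+<([^>]*)>\s*$")


def emails_by_name(shortlog_text):
    """Group commit counts by name, then by email, from shortlog -se text."""
    grouped = defaultdict(dict)
    for line in shortlog_text.splitlines():
        match = SHORTLOG_LINE.match(line)
        if match is None:
            continue  # skip blank or unparseable lines
        count, name, email = match.groups()
        grouped[name][email] = grouped[name].get(email, 0) + int(count)
    return dict(grouped)


# A few lines copied from the Linux listing above.
sample = """\
     3  Linus Walleij <linus.ml.walleij@gmail.com>
     1  Linus Walleij <linus.walleij@ericsson.com>
    81  Linus Walleij <linus.walleij@stericsson.com>
     9  Linus Walleij <triad@df.lth.se>
"""

for name, emails in emails_by_name(sample).items():
    print(name, len(emails), "addresses,", sum(emails.values()), "commits")
```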

Thus, I propose a more stable form of identification; rather than
using just a (name,email) pair, let's use a (uuid,name,email) triplet,
where the uuid can be any piece of information that a user of git
determines to be reasonably unique across space and time and that is
intended to be used by that user virtually forever (at least within a
project's history).

For instance, Linus is always stuck in his basement with the same
ancient computers, so he chooses to set up his few ~/.gitconfig
files with an OSF DCE 1.1 conforming UUID (generated by, say, uuidgen):

Linus Torvalds <torvalds@linux-foundation.org>

    [user]
        uuid  = 6b202ed1-e8ec-4048-84c2-ae0dd3b2df47
        name  = Linus Torvalds
        email = torvalds@linux-foundation.org

On the other hand, Junio is infatuated with the latest palmtop
computing gadgets and finds himself setting up a ~/.gitconfig file
several times each month; he doesn't want to bother remembering
some long human-hostile string, so he adopts as his uuid the
SHA-1 of some easily remembered piece of information like the
very first (name,email) pair that he used for git
(Junio C Hamano <junkio@cox.net>):

    [user]
        uuid  = 6e99d26860f0b87ef4843fa838df2a918b85d1f7
        name  = Junio C Hamano
        email = gitster@pobox.com
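
Deriving such a uuid from a remembered string takes only a few lines.
The sketch below assumes the string is hashed as UTF-8 with no trailing
newline; that choice is an assumption, so the result need not match the
value shown in the config above. The helper name identity_uuid is made
up for illustration.

```python
import hashlib


def identity_uuid(identity):
    """Return the hex SHA-1 of an identity string, encoded as UTF-8."""
    return hashlib.sha1(identity.encode("utf-8")).hexdigest()


# The "very first (name,email) pair" from the example above.
first_identity = "Junio C Hamano <junkio@cox.net>"
print(identity_uuid(first_identity))
```

Whatever encoding and whitespace rules are picked, they would have to
be the same everywhere, or the same remembered string would yield
different uuids on different machines.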

I'm sure that some optimizations could be made for certain choices like
UUID and SHA-1 strings.

Anyway, I think this could really help keep identifications clean,
and it is rather straightforward and possibly quite efficient.

Sincerely,
Michael Witten


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
@ 2010-03-18 13:48 ` Jon Smirl
  2010-03-18 14:26   ` Michael Witten
  2010-03-18 17:27 ` Linus Torvalds
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 13:48 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

You can't go back and edit the history in git so a map of the aliases
is needed.  The easy fix is a .mailmap file. However, the .mailmap
entries need a mechanism to track which entries are correct and which
have been fixed. Read this long and painful thread...
http://lkml.org/lkml/2008/7/28/134

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:48 ` Jon Smirl
@ 2010-03-18 14:26   ` Michael Witten
  0 siblings, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 14:26 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

On Thu, Mar 18, 2010 at 08:48, Jon Smirl <jonsmirl@gmail.com> wrote:
> You can't go back and edit the history in git so a map of the aliases
> is needed.  The easy fix is a .mailmap file. However, the .mailmap
> entries need a mechanism to track which entries are correct and which
> have been fixed. Read this long and painful thread...
> http://lkml.org/lkml/2008/7/28/134

The addition of a uuid would not only likely decrease future trouble
tremendously, but also allow for a much more efficient remapping of
old (name,email) pairs.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
  2010-03-18 13:48 ` Jon Smirl
@ 2010-03-18 17:27 ` Linus Torvalds
  2010-03-18 19:02   ` Jon Smirl
  2010-03-18 22:36   ` Martin Langhoff
  2010-03-18 18:42 ` Michael Witten
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 17:27 UTC (permalink / raw)
  To: Michael Witten; +Cc: git



On Thu, 18 Mar 2010, Michael Witten wrote:
>
> Short Version:
> -------------
> 
> Rather than use a (name,email) pair to identify people, let's use
> a (uuid,name,email) triplet.

Even shorter version: NO.

> Long Version:
> ------------

UUID's are some total crazy shit. It's like XML. If you think you need 
them, you're almost certainly wrong. If it's about identifying a unique 
piece of hardware, ok. If it's about identifying people, no.

How about you walk around with a bar-code tattooed to your forehead? Don't 
like the idea? Then think about having to care about a uuid in your 
projects. Same deal.

Nobody is going to associate themselves with a uuid. It's not how humans 
work. It's degrading, and it's work-for-no-gain to anybody who doesn't 
have OCD.

So in practice, the only thing that would happen is that people make up 
random uuid's and they'd be different for every single machine they have, 
because absolutely NOBODY would ever bother to try to save and move their 
uuids around.

So when you point out that emails aren't unique, or that people change 
their emails over time, please realize that the emails are _more_ stable 
than a uuid would ever be. Because an email actually has some emotional 
attachment to the person in question. Yes, they change. So do real names 
too (which change more seldom, exactly because people are way _more_ 
emotionally attached to their real names).

uuid's? I can pretty much guarantee that for me, it would be different for 
every single machine I have. Because I could just not be bothered to care.

			Linus


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
  2010-03-18 13:48 ` Jon Smirl
  2010-03-18 17:27 ` Linus Torvalds
@ 2010-03-18 18:42 ` Michael Witten
  2010-03-18 18:47   ` Matthieu Moy
                     ` (2 more replies)
  2010-03-18 22:17 ` A Large Angry SCM
                   ` (3 subsequent siblings)
  6 siblings, 3 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 18:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git



Linus: Don't skim; read.



On Thu, Mar 18, 2010 at 12:27, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> So in practice, the only thing that would happen
> is that people make up random uuid's and they'd
> be different for every single machine they have,
> because absolutely NOBODY would ever bother to
> try to save and move their uuids around.
>
> ...
>
> please realize that the emails are _more_ stable
> than a uuid would ever be. Because an email
> actually has some emotional attachment to the
> person in question.

My anticipation of your response was uncanny:

    >> For instance, the uuid could be... the SHA-1
    >> of some easily remembered, already reasonably
    >> unique information.
    >>
    >> ...
    >>
    >> ...he doesn't want to bother remembering some
    >> long human-hostile string, so he adopts as
    >> his uuid the SHA-1 of some easily remembered
    >> piece of information like the very first
    >> (name,email) pair that he used for git
    >> (Junio C Hamano <junkio@cox.net>)

So, forget the original generality and let's
define the uuid as a SHA-1 of some EASILY
REMEMBERED, already reasonably unique piece of
information such as an old (name,email) pair.

To make life easier on people, git tools could automate
that process; to Junio, his uuid is just an old,
unchanging (name,email) pair:

    $ git config --global user.name  "Junio C Hamano"
    $ git config --global user.email "gitster@pobox.com"
    $ git config --global --uuid "Junio C Hamano <junkio@cox.net>"

which produces something like:

    [user]
        name  = Junio C Hamano
        email = gitster@pobox.com
        uuid  = 6e99d26860f0b87ef4843fa838df2a918b85d1f7

In fact those three steps should probably be
further automated anyway:

    $ git config --global --init
    Full Name? Junio C Hamano
    Email? gitster@pobox.com
    UUID [Junio C Hamano <gitster@pobox.com>]? Junio C Hamano <junkio@cox.net>

Set it and forget it in a completely human way.

Could people still bungle the uuid or enter trash?
Sure, but that's essentially no different than the
current situation. This would be an improvement,
because at least some people would take advantage
of it; in fact, I bet most people would use it
properly because:

    * The information required is easily remembered
      and reproduced; it has that emotional aspect.

    * People have an emotional attachment to getting
      proper attribution for their work, and this
      helps.

Moreover, storing and using the SHA-1 uuid would be
very efficient and allow for saner .mailmap hacks.
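
As a rough illustration of why a stable key helps, compare counting
commits per (name,email) pair with counting per uuid. Everything here
is hypothetical: git commits carry no uuid field today, and the uuid
values are made-up placeholders.

```python
from collections import Counter

# Hypothetical records: (uuid, name, email). The uuid strings are
# placeholders, not real hashes.
commits = [
    ("uuid-A", "Junio C Hamano", "junkio@cox.net"),
    ("uuid-A", "Junio C Hamano", "gitster@pobox.com"),
    ("uuid-A", "Junio Hamano", "gitster@pobox.com"),
    ("uuid-B", "Linus Torvalds", "torvalds@osdl.org"),
]


def count_by_pair(records):
    """Count commits per (name, email) pair, as shortlog -se does now."""
    return Counter((name, email) for _uuid, name, email in records)


def count_by_uuid(records):
    """Count commits per uuid, ignoring how name and email drifted."""
    return Counter(uuid for uuid, _name, _email in records)


print(len(count_by_pair(commits)), "identities by (name,email)")
print(len(count_by_uuid(commits)), "identities by uuid")
```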

Sincerely,
Michael Witten


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 18:42 ` Michael Witten
@ 2010-03-18 18:47   ` Matthieu Moy
  2010-03-18 18:57     ` Michael Witten
  2010-03-18 19:12   ` Nicolas Pitre
  2010-03-18 20:44   ` tytso
  2 siblings, 1 reply; 104+ messages in thread
From: Matthieu Moy @ 2010-03-18 18:47 UTC (permalink / raw)
  To: Michael Witten; +Cc: Linus Torvalds, git

Michael Witten <mfwitten@gmail.com> writes:

> So, forget the original generality and let's
> define the uuid as a SHA-1 of some EASILY
> REMEMBERED, already reasonably unique piece of
> information such as an old (name,email) pair.

What's the added value of the "SHA-1" thing, here? A hash of a pair
(a, b) is exactly as unique as the pair itself (well, actually even a
bit less if you consider collisions).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 18:47   ` Matthieu Moy
@ 2010-03-18 18:57     ` Michael Witten
  0 siblings, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 18:57 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, git

On Thu, Mar 18, 2010 at 13:47, Matthieu Moy
<Matthieu.Moy@grenoble-inp.fr> wrote:
> What's the added value of the "SHA-1" thing, here? A hash of a pair
> (a, b) is exactly as unique as the pair itself (well, actually even a
> bit less if you consider collisions).

Your observation is correct, but I'm pushing for the SHA-1 string
because it could be efficiently parsed, stored, and used; it's
essentially an optimization (or a preparation for an optimization).

If that's not a good way to approach it, then I'd be satisfied with
just a straight (name,email) pair or any other reasonably unique
string.

On a more general note, the idea of a uuid is to distribute the
process of canonicalizing identities. Does that not make perfect
sense?


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 17:27 ` Linus Torvalds
@ 2010-03-18 19:02   ` Jon Smirl
  2010-03-18 19:07     ` Linus Torvalds
  2010-03-18 22:36   ` Martin Langhoff
  1 sibling, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 19:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Michael Witten, git

On Thu, Mar 18, 2010 at 1:27 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 18 Mar 2010, Michael Witten wrote:
>>
>> Short Version:
>> -------------
>>
>> Rather than use a (name,email) pair to identify people, let's use
>> a (uuid,name,email) triplet.
>
> Even shorter version: NO.
>
>> Long Version:
>> ------------
>
> UUID's are some total crazy shit. It's like XML. If you think you need
> them, you're almost certainly wrong. If it's about identifying a unique
> piece of hardware, ok. If it's about identifying people, no.

We could hash people's emails and then build a .mailmap equivalent thus
hiding their identity.

Several things needed to be combined to build that mailmap.
1) a lot of hand work to identify aliases and misspellings
2) work with google to translate email addresses into human names when
names were missing
3) a list of all of the email addresses that had been checked, to make
it easy to identify new ones.
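
Item 3 amounts to a set difference between the addresses seen in
history and the ones already reviewed. A minimal sketch, assuming
addresses are compared case-insensitively (an assumption) and using
made-up example.org addresses:

```python
def unchecked_addresses(seen_in_history, already_checked):
    """Return addresses that appear in history but were never reviewed."""
    checked = {addr.lower() for addr in already_checked}
    return sorted(
        addr for addr in set(seen_in_history) if addr.lower() not in checked
    )


# Hypothetical inputs, for illustration only.
seen = ["a@example.org", "B@example.org", "c@example.org", "a@example.org"]
reviewed = ["b@example.org"]
print(unchecked_addresses(seen, reviewed))
```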

The trouble with hashing it is that all of the tools that use it will
need to be rewritten.

I'd really like to see a more global database constructed that links
commits, lkml discussions and the various distribution bug databases
but apparently it is too much of a threat to developer privacy. You
can achieve the same effect with a few hours in google throwing out
bunches of false positives.  It would be cool to be looking at a
subroutine, poke a button and then see all of the human oriented
history around it instead of just the diffs.

>
> How about you walk around with a bar-code tattooed to your forehead? Don't
> like the idea? Then think about having to care about a uuid in your
> projects. Same deal.
>
> Nobody is going to associate themselves with a uuid. It's not how humans
> work. It's degrading, and it's work-for-no-gain to anybody who doesn't
> have OCD.
>
> So in practice, the only thing that would happen is that people make up
> random uuid's and they'd be different for every single machine they have,
> because absolutely NOBODY would ever bother to try to save and move their
> uuids around.
>
> So when you point out that emails aren't unique, or that people change
> their emails over time, please realize that the emails are _more_ stable
> than a uuid would ever be. Because an email actually has some emotional
> attachment to the person in question. Yes, they change. So do real names
> too (which change more seldom, exactly because people are way _more_
> emotionally attached to their real names).
>
> uuid's? I can pretty much guarantee that for me, it would be different for
> every single machine I have. Because I could just not be bothered to care.
>
>                        Linus



-- 
Jon Smirl
jonsmirl@gmail.com


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:02   ` Jon Smirl
@ 2010-03-18 19:07     ` Linus Torvalds
  2010-03-18 19:16       ` Jon Smirl
  2010-03-18 19:32       ` Michael Witten
  0 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 19:07 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Michael Witten, git



On Thu, 18 Mar 2010, Jon Smirl wrote:
> 
> We could hash people's emails and then build a .mailmap equivalent thus
> hiding their identity.

So? Why? What's the advantage?

I literally _only_ see disadvantages to the whole thing. If the uuid has 
some meaning (ie it's related to actual _real_ information), then it is 
nothing but a really inconvenient placeholder for the real information, 
and another source of new problems (like "how do we know they are in 
sync? I edit the .gitconfig file by hand all the time").

And if it doesn't have meaning, then it's just annoying and will never 
ever be attached to anything relevant long-term.

Either way, there are only downsides, no upsides. There is absolutely _no_ 
way that the uuid would ever actually encode any real meaningful 
information that isn't better represented by the name/email.

			Linus


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 18:42 ` Michael Witten
  2010-03-18 18:47   ` Matthieu Moy
@ 2010-03-18 19:12   ` Nicolas Pitre
  2010-03-18 20:44   ` tytso
  2 siblings, 0 replies; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-18 19:12 UTC (permalink / raw)
  To: Michael Witten; +Cc: Linus Torvalds, git

On Thu, 18 Mar 2010, Michael Witten wrote:

> So, forget the original generality and let's
> define the uuid as a SHA-1 of some EASILY
> REMEMBERED, already reasonably unique piece of
> information such as an old (name,email) pair.

Even with _that_, I bet many people will simply not bother.  You may as 
well just use your current name and email address.  Oh wait, Git is 
using just that already.

> To make life easier on people, git tools could automate
> that process; to Junio, his uuid is just an old,
> unchanging (name,email) pair:
> 
>     $ git config --global user.name  "Junio C Hamano"
>     $ git config --global user.email "gitster@pobox.com"
>     $ git config --global --uuid "Junio C Hamano <junkio@cox.net>"
> 
> which produces something like:
> 
>     [user]
>         name  = Junio C Hamano
>         email = gitster@pobox.com
>         uuid  = 6e99d26860f0b87ef4843fa838df2a918b85d1f7

Even then, some people _will_ manage to screw up some of their UUID 
configs.  And you'll inevitably end up in the same situation that we 
have today i.e. different identification credentials that have to be 
mapped to the same individual.

> Could people still bungle the uuid or enter trash?
> Sure, but that's essentially no different than the
> current situation.

Exactly.  So why bother?

> This would be an improvement, because at least some people would take 
> advantage of it; in fact, I bet most people would use it properly 
> because:
[...]

Most people _already_ use their name/email configuration properly.  And 
those who really care are managing a stable email address already.  So 
this is not an improvement at all but only some added complexity.

> Moreover, storing and using the SHA-1 uuid would be
> very efficient and allow for saner .mailmap hacks.

I don't buy that either.  If anything, it is way better to fix the 
current .mailmap mechanism to cater for changing email addresses.  
That's what people use to contact people anyway as I doubt you could 
send any congratulations or job offers solely by using the Git's UUID.  
So you must link back to some form of email address in the end, and 
preferably the current one, otherwise the UUID is useless.  In that case 
then why not simply use that email address in the first place?

The real solution is actually to improve the .mailmap so that any 
individual could declare, for this or that name/email pair found in 
the repository, the current email that should be displayed instead.  
Currently this applies only partially, and only to 
git-shortlog.
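
The lookup being asked for is small. The sketch below is not the real
.mailmap file format or git's implementation; the CANONICAL table and
the display_identity helper are illustrative assumptions showing only
the "recorded pair in, current pair out" behaviour.

```python
# Hypothetical mapping from a (name, email) pair found in history to
# the pair that should be displayed instead.
CANONICAL = {
    ("Junio C Hamano", "junkio@cox.net"):
        ("Junio C Hamano", "gitster@pobox.com"),
    ("Junio Hamano", "gitster@pobox.com"):
        ("Junio C Hamano", "gitster@pobox.com"),
}


def display_identity(name, email, mapping=CANONICAL):
    """Return the identity to display, falling back to the recorded one."""
    return mapping.get((name, email), (name, email))


print(display_identity("Junio C Hamano", "junkio@cox.net"))
print(display_identity("Nicolas Pitre", "nico@example.org"))
```

Pairs with no entry pass through unchanged, so history itself never
has to be rewritten.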


Nicolas


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:07     ` Linus Torvalds
@ 2010-03-18 19:16       ` Jon Smirl
  2010-03-18 19:20         ` Linus Torvalds
  2010-03-18 19:32       ` Michael Witten
  1 sibling, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 19:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Michael Witten, git

On Thu, Mar 18, 2010 at 3:07 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 18 Mar 2010, Jon Smirl wrote:
>>
>> We could hash people's emails and then build a .mailmap equivalent thus
>> hiding their identity.
>
> So? Why? What's the advantage?

I happen to think that the concept of privacy and working on an open
source project are fairly incompatible. But apparently there are
people who think otherwise.  The use would be to reconstruct that
mailmap I made, but with the email addresses replaced with SHA1 hashes
of the email. No human would use the SHA1s, they're just there to
obscure the emails.
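
Roughly, that would look like the sketch below. The normalization
(strip and lowercase before hashing) and all helper names are
assumptions, and the address is a made-up example.org one. Anyone who
already knows an address can hash it and find the entry, so this only
obscures the list; it does not make it secret.

```python
import hashlib


def hash_email(email):
    """Return the hex SHA-1 of a trimmed, lowercased email address."""
    normalized = email.strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def obscure_mapping(alias_to_name):
    """Replace each email key with its SHA-1 so the map hides addresses."""
    return {hash_email(email): name for email, name in alias_to_name.items()}


def lookup_name(hashed_map, email):
    """Look up the proper name for an email via its hash, or None."""
    return hashed_map.get(hash_email(email))


# Hypothetical alias table, for illustration only.
aliases = {"jdoe@example.org": "Jane Doe"}
hashed = obscure_mapping(aliases)
print(lookup_name(hashed, "JDoe@example.org"))
```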

>
> I literally _only_ see disadvantages to the whole thing. If the uuid has
> some meaning (ie it's related to actual _real_ information), then it is
> nothing but a really inconvenient placeholder for the real information,
> and another source of new problems (like "how do we know they are in
> sync? I edit the .gitconfig file by hand all the time").
>
> And if it doesn't have meaning, then it's just annoying and will never
> ever be attached to anything relevant long-term.
>
> Either way, there are only downsides, no upsides. There is absolutely _no_
> way that the uuid would ever actually encode any real meaningful
> information that isn't better represented by the name/email.
>
>                        Linus
>



-- 
Jon Smirl
jonsmirl@gmail.com


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:16       ` Jon Smirl
@ 2010-03-18 19:20         ` Linus Torvalds
  2010-03-18 19:37           ` Jon Smirl
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 19:20 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Michael Witten, git



On Thu, 18 Mar 2010, Jon Smirl wrote:
> 
> I happen to think that the concept of privacy and working on an open
> source project are fairly incompatible. But apparently there are
> people who think otherwise.  The use would be to reconstruct that
> mailmap I made, but with the email addresses replaced with SHA1 hashes
> of the email. No human would use the SHA1s, they're just there to
> obscure the emails.

I really see that as a bad thing, not a good thing. It's like enabling 
some crazy shit and making it official.

If you don't want to reveal your real name, use a fake address. Just don't 
expect anybody to want to work with you. 

The LAST thing we want is built-in git support for doing f*cking stupid 
things.  You can do stupid things with it on your own without us helping 
and encouraging you.

		Linus


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:07     ` Linus Torvalds
  2010-03-18 19:16       ` Jon Smirl
@ 2010-03-18 19:32       ` Michael Witten
  2010-03-18 19:40         ` Linus Torvalds
                           ` (2 more replies)
  1 sibling, 3 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 19:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, git

On Thu, Mar 18, 2010 at 14:07, Linus Torvalds
<torvalds@linux-foundation.org> wrote:

> And if it doesn't have meaning, then it's just
> annoying and will never ever be attached to
> anything relevant long-term.

You've actually just described the current name/email system.

What a uuid provides is that very property of long-term attachment; a
git user can change the name/email pair but keep the same uuid.

You see, the problem is that the name/email pair isn't really an
identifier; it's actually just info about the user's current email
account, which is very useful for everyday workflow, but pretty naive
for historical identification over long periods of time.

As previously discussed in my original email, the 'email' portion of
the name/email pair is the most volatile portion, and that's because
it's only tangentially related to identity (and it certainly has
nothing to do with long-term identity).

> There is absolutely _no_ way that the uuid would
> ever actually encode any real meaningful
> information that isn't better represented by the
> name/email.

It IS a name/email pair (if you want or if that is enforced); it's
just one that isn't as volatile.

This notion of a uuid is an attempt to adopt a BETTER MODEL for
identity: The user gets to choose a piece of information that he
himself deems a longterm identifier; it's not about what address you
currently use for email, it's solely about who you are over a long
period of time.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:20         ` Linus Torvalds
@ 2010-03-18 19:37           ` Jon Smirl
  2010-03-18 19:47             ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 19:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Michael Witten, git

On Thu, Mar 18, 2010 at 3:20 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 18 Mar 2010, Jon Smirl wrote:
>>
>> I happen to think that the concept of privacy and working on an open
>> source project are fairly incompatible. But apparently there are
>> people who think otherwise.  The use would be to reconstruct that
>> mailmap I made, but with the email addresses replaced with SHA1 hashes
>> of the email. No human would use the SHA1s, they're just there to
>> obscure the emails.
>
> I really see that as a bad thing, not a good thing. It's like enabling
> some crazy shit and making it official.
>
> If you don't want to reveal your real name, use a fake address. Just don't
> expect anybody to want to work with you.

Go ahead and commit that .mailmap I made. It really cleans up the
statistics by fixing 500 errors in people's names. Just don't point
the ensuing flame war at me, your hide is tougher.

> The LAST thing we want is built-in git support for doing f*cking stupid
> things.  You can do stupid things with it on your own without us helping
> and encouraging you.
>
>                Linus
>



-- 
Jon Smirl
jonsmirl@gmail.com


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:32       ` Michael Witten
@ 2010-03-18 19:40         ` Linus Torvalds
  2010-03-18 19:47           ` Michael Witten
  2010-03-18 19:40         ` Wincent Colaiuta
  2010-03-18 19:42         ` Martin Langhoff
  2 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 19:40 UTC (permalink / raw)
  To: Michael Witten; +Cc: Jon Smirl, git



On Thu, 18 Mar 2010, Michael Witten wrote:
> 
> What a uuid provides is that very property of long-term attachment; a
> git user can change the name/email pair but keep the same uuid.

I don't think you understand what "attachment" means.

Think about your wife, your kids, or your pet. THAT is attachment.

Random 16-letter letter-jumble? No. People will _never_ care. They'll 
simply not care. 

It's true that people _already_ don't care too much about their emails, 
and that typos and simply job changes (or annoying ISP's) will change 
them. But that would be orders of magnitude _worse_ with something like a 
uuid.

> It IS a name/email pair (if you want or if that is enforced); it's
> just one that isn't as volatile.

Don't be an idiot.

Try to think like a HUMAN. Not a computer scientist. And ponder.

It's a _social_ issue, not a "let's tattoo this uuid on everybody".

		Linus


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:32       ` Michael Witten
  2010-03-18 19:40         ` Linus Torvalds
@ 2010-03-18 19:40         ` Wincent Colaiuta
  2010-03-18 19:42         ` Martin Langhoff
  2 siblings, 0 replies; 104+ messages in thread
From: Wincent Colaiuta @ 2010-03-18 19:40 UTC (permalink / raw)
  To: Michael Witten; +Cc: Linus Torvalds, Jon Smirl, git

On 18/03/2010, at 20:32, Michael Witten wrote:

> On Thu, Mar 18, 2010 at 14:07, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> 
>> And if it doesn't have meaning, then it's just
>> annoying and will never ever be attached to
>> anything relevant long-term.
> 
> You've actually just described the current name/email system.
> 
> What a uuid provides is that very property of long-term attachment; a
> git user can change the name/email pair but keep the same uuid.
> 
> You see, the problem is that the name/email pair isn't really an
> identifier; it's actually just info about the user's current email
> account, which is very useful for everyday workflow, but pretty naive
> for historical identification over long periods of time.

This whole thing is a stupid idea.

If users can't even be bothered keeping a stable email address, what makes you think that they can be assed "doing the right thing" with respect to a meaningless UUID string?

The idea is complicated, over-engineered, brings no benefit and adds only cruft.

W


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:32       ` Michael Witten
  2010-03-18 19:40         ` Linus Torvalds
  2010-03-18 19:40         ` Wincent Colaiuta
@ 2010-03-18 19:42         ` Martin Langhoff
  2 siblings, 0 replies; 104+ messages in thread
From: Martin Langhoff @ 2010-03-18 19:42 UTC (permalink / raw)
  To: Michael Witten; +Cc: Linus Torvalds, Jon Smirl, git

On Thu, Mar 18, 2010 at 3:32 PM, Michael Witten <mfwitten@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 14:07, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>
>> And if it doesn't have meaning, then it's just
>> annoying and will never ever be attached to
>> anything relevant long-term.
>
> You've actually just described the current name/email system.

WTH are you drinking? I have been using my current name and email
accounts for many years.

They are useful for git and for some things that are even more useful
-- like addressing emails! My best CV is googling for my name / email
addresses -- it will show you my professional career. Including the
time that Linus called my patch "idiotic" :-)

So, these things are attached to something meaningful: my long term
personal identity. A git-only "uuid"? Screw that, I hack on too many
physically different machines, I am not going to be carrying around a
magic string.

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:40         ` Linus Torvalds
@ 2010-03-18 19:47           ` Michael Witten
  2010-03-18 19:52             ` Linus Torvalds
  2010-03-18 19:52             ` Wincent Colaiuta
  0 siblings, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 19:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, git

On Thu, Mar 18, 2010 at 14:40, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Random 16-letter letter-jumble? No. People will _never_ care. They'll
> simply not care.

I don't think you've read one word that I've written.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:37           ` Jon Smirl
@ 2010-03-18 19:47             ` Linus Torvalds
  2010-03-18 19:50               ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 19:47 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Michael Witten, git



On Thu, 18 Mar 2010, Jon Smirl wrote:
> 
> Go ahead and commit that .mailmap I made. It really cleans up the
> statistics by fixing 500 errors in people's names. Just don't point
> the ensuing flame war at me, your hide is tougher.

How hard is it to understand the notion of "people just don't _care_ 
enough"?

Look at CVS. Look at three _decades_ of CVS. Then look at the 
"identifiers" that thing used. 

Git is much better. Git is better for two reasons:

 - We allow/encourage people to use way more meaningful identifiers

 - Exactly _because_ what we use is meaningful to people, most people 
   bother to try.

And you don't seem to understand that whole "meaningful" part. If you 
don't have the social understanding of how people actually _work_, then 
nothing I say can explain it.

Let me try one more time: do the statistics on "committer information" vs 
"author information" on the Linux kernel repository, and count the types 
of errors that happen. I can explain the errors and why they happen, and 
it has everything to do with how _humans_work_ (*).

If you don't understand that, then there's no point in arguing.

			Linus

(*) I'll give you one answer in the next email. But before you read that 
email, try to think about it, and see if you can guess at patterns.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:47             ` Linus Torvalds
@ 2010-03-18 19:50               ` Linus Torvalds
  2010-03-18 20:01                 ` Linus Torvalds
  2010-03-18 20:31                 ` Reece Dunn
  0 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 19:50 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Michael Witten, git



On Thu, 18 Mar 2010, Linus Torvalds wrote:
> 
> (*) I'll give you one answer in the next email. But before you read that 
> email, try to think about it, and see if you can guess at patterns.

Lookie here:

  [torvalds@i5 linux]$ git log --pretty=full | grep '^Commit: ' | sort | uniq -c | sort -n | grep localdomain
      1 Commit: Jeff Garzik <jgarzik@localhost.localdomain>
      2 Commit: Dave Airlie <airlied@ppcg5.localdomain>
      3 Commit: James Bottomley <jejb@sparkweed.localdomain>
      3 Commit: James Morris <jmorris@localhost.localdomain>
      3 Commit: James Morris <jmorris@macbook.localdomain>
      4 Commit: James Bottomley <jejb@hobholes.localdomain>
     32 Commit: Thomas Graf <tgr@axs.localdomain>
    410 Commit: James Bottomley <jejb@mulgrave.localdomain>
  [torvalds@i5 linux]$ git log --pretty=full | grep '^Author: ' | sort | uniq -c | sort -n | grep localdomain
      1 Author: Alex Deucher <alex@hp.localdomain>
      1 Author: Dave Airlie <airlied@ppcg5.localdomain>
      1 Author: Eduardo Habkost <ehabkost@Rawhide-64.localdomain>
      1 Author: Grzegorz Nosek <root@localdomain.pl>
      1 Author: Izik Eidus <izike@localhost.localdomain>
      1 Author: Jeff Garzik <jgarzik@localhost.localdomain>
      2 Author: Esti Kummer <stkumer@localhost.localdomain>
      2 Author: James Bottomley <jejb@mulgrave.localdomain>
      3 Author: Dave Airlie <airlied@optimus.localdomain>
      3 Author: James Bottomley <jejb@hobholes.localdomain>
      3 Author: James Bottomley <jejb@sparkweed.localdomain>
      4 Author: Cindy H Kao <evans@localhost.localdomain>
      4 Author: Kristian Høgsberg <krh@localhost.localdomain>

See? Mistakes happen. But look at what happens to the committer 
information? Think about it. Really _think_ about it. There is absolutely 
zero _technical_ difference between the two fields. The only difference is 
that "git log" by default shows one, and not the other.

So as a human, which one do you think people care about and fix more 
quickly?

And look at the numbers once more.

			Linus
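
To make "look at the numbers once more" concrete, here is a small Python
sketch (an editorial illustration, not part of the original mail) that
totals the leading `uniq -c` counts from the two listings above. The
variable names are made up for this example; the counts are copied from
the listings.

```python
# Leading counts copied from the two `uniq -c` listings above.
# Committer field: not shown by a default "git log".
committer_counts = [1, 2, 3, 3, 3, 4, 32, 410]
# Author field: shown by a default "git log".
author_counts = [1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4]

committer_total = sum(committer_counts)
author_total = sum(author_counts)

print("localdomain committer entries:", committer_total)
print("localdomain author entries:   ", author_total)
```

Under these counts the field nobody looks at carries far more broken
entries than the one everybody sees, which appears to be the pattern the
mail asks the reader to notice.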

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:47           ` Michael Witten
@ 2010-03-18 19:52             ` Linus Torvalds
  2010-03-18 20:00               ` Michael Witten
  2010-03-18 19:52             ` Wincent Colaiuta
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 19:52 UTC (permalink / raw)
  To: Michael Witten; +Cc: Jon Smirl, git



On Thu, 18 Mar 2010, Michael Witten wrote:

> On Thu, Mar 18, 2010 at 14:40, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > Random 16-letter letter-jumble? No. People will _never_ care. They'll
> > simply not care.
> 
> I don't think you've read one word that I've written.

Oh, I read them. They make no sense.

If the uuid isn't random, but tied to the email address, then it's 
worthless. 

If you like the random 16-letter jumbles, then for christ sake JUST CHANGE 
"git log" to hash the author name for you. You'll get the uuid's. What I'm 
telling you is that NOBODY SANE WANTS TO EVER SEE THEM.

And if nobody wants them, then nobody will maintain them, and they'll be 
much _less_ useful than the emails we already have.

		Linus
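
A minimal Python sketch of the "hash the author name" idea (an editorial
illustration, not code from git). It assumes the identifier is simply the
SHA-1 of the "Name <email>" string; the function name `identity_hash` is
hypothetical. The two addresses are taken from the shortlog output quoted
at the start of the thread.

```python
import hashlib


def identity_hash(name: str, email: str) -> str:
    """Return the SHA-1 hex digest of 'Name <email>' (assumed scheme)."""
    ident = f"{name} <{email}>"
    return hashlib.sha1(ident.encode("utf-8")).hexdigest()


old = identity_hash("Linus Torvalds", "torvalds@osdl.org")
new = identity_hash("Linus Torvalds", "torvalds@linux-foundation.org")

# Truncate to 16 characters to get the "16-letter jumble" from the mail.
print(old[:16])
print(new[:16])
print(old == new)  # False: a new email address yields a new identifier
```

An identifier derived this way changes whenever the email address does,
so it is no more stable than the name/email pair it was computed from.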

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:47           ` Michael Witten
  2010-03-18 19:52             ` Linus Torvalds
@ 2010-03-18 19:52             ` Wincent Colaiuta
  1 sibling, 0 replies; 104+ messages in thread
From: Wincent Colaiuta @ 2010-03-18 19:52 UTC (permalink / raw)
  To: Michael Witten; +Cc: Linus Torvalds, Jon Smirl, git

On 18/03/2010, at 20:47, Michael Witten wrote:

> On Thu, Mar 18, 2010 at 14:40, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> Random 16-letter letter-jumble? No. People will _never_ care. They'll
>> simply not care.
> 
> I don't think you've read one word that I've written.

On the contrary I get the impression he has waded through everything you've written, and has even been patient enough to put together (now several) replies explaining exactly why it is a misguided idea.

Now it's time for you to read and actually reflect on what's been said. If you're sane and have a modicum of intelligence you'll come to the conclusion that your idea doesn't solve any problem, and in fact only adds a bunch of new ones.

W

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:52             ` Linus Torvalds
@ 2010-03-18 20:00               ` Michael Witten
  0 siblings, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 20:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Thu, Mar 18, 2010 at 14:52, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> If you like the random 16-letter jumbles, then for christ sake JUST CHANGE
> "git log" to hash the author name for you. You'll get the uuid's. What I'm
> telling you is that NOBODY SANE WANTS TO EVER SEE THEM.

No, I'm reasonably certain you didn't.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:50               ` Linus Torvalds
@ 2010-03-18 20:01                 ` Linus Torvalds
  2010-03-19 19:39                   ` Junio C Hamano
  2010-03-18 20:31                 ` Reece Dunn
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 20:01 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Michael Witten, git



On Thu, 18 Mar 2010, Linus Torvalds wrote:
> 
> So as a human, which one do you think people care about and fix more 
> quickly?

Btw, one other thing you can take away from it is that even when they 
_are_ shown, and even when they _are_ meaningful, people still don't care. 

There's absolutely tons of "(none)" emails even in the _visible_ parts, 
which is really really sad. But it does tell a lot about humans - they 
won't be noticing even _obvious_ mistakes like that.

(And yes, it does say that git should probably have errored out way more 
aggressively about badly set up host/domain names in the "guess at email 
address" code. My bad. Maybe it's still worth fixing for the future)

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 19:50               ` Linus Torvalds
  2010-03-18 20:01                 ` Linus Torvalds
@ 2010-03-18 20:31                 ` Reece Dunn
  2010-03-18 20:59                   ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Reece Dunn @ 2010-03-18 20:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, Michael Witten, git

On 18 March 2010 19:50, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, 18 Mar 2010, Linus Torvalds wrote:
>>
>> (*) I'll give you one answer in the next email. But before you read that
>> email, try to think about it, and see if you can guess at patterns.
>
> Lookie here:
>
>  [torvalds@i5 linux]$ git log --pretty=full | grep '^Commit: ' | sort | uniq -c | sort -n | grep localdomain
>      1 Commit: Jeff Garzik <jgarzik@localhost.localdomain>
>      2 Commit: Dave Airlie <airlied@ppcg5.localdomain>
>      3 Commit: James Bottomley <jejb@sparkweed.localdomain>
>      3 Commit: James Morris <jmorris@localhost.localdomain>
>      3 Commit: James Morris <jmorris@macbook.localdomain>
>      4 Commit: James Bottomley <jejb@hobholes.localdomain>
>     32 Commit: Thomas Graf <tgr@axs.localdomain>
>    410 Commit: James Bottomley <jejb@mulgrave.localdomain>
>  [torvalds@i5 linux]$ git log --pretty=full | grep '^Author: ' | sort | uniq -c | sort -n | grep localdomain
>      1 Author: Alex Deucher <alex@hp.localdomain>
>      1 Author: Dave Airlie <airlied@ppcg5.localdomain>
>      1 Author: Eduardo Habkost <ehabkost@Rawhide-64.localdomain>
>      1 Author: Grzegorz Nosek <root@localdomain.pl>
>      1 Author: Izik Eidus <izike@localhost.localdomain>
>      1 Author: Jeff Garzik <jgarzik@localhost.localdomain>
>      2 Author: Esti Kummer <stkumer@localhost.localdomain>
>      2 Author: James Bottomley <jejb@mulgrave.localdomain>
>      3 Author: Dave Airlie <airlied@optimus.localdomain>
>      3 Author: James Bottomley <jejb@hobholes.localdomain>
>      3 Author: James Bottomley <jejb@sparkweed.localdomain>
>      4 Author: Cindy H Kao <evans@localhost.localdomain>
>      4 Author: Kristian Høgsberg <krh@localhost.localdomain>
>
> See? Mistakes happen. But look at what happens to the committer
> information? Think about it. Really _think_ about it. There is absolutely
> zero _technical_ difference between the two fields. The only difference is
> that "git log" by default shows one, and not the other.
>
> So as a human, which one do you think people care about and fix more
> quickly?
>
> And look at the numbers once more.

So... going back to the original problem, we have:

  1/  people making mistakes in the commit logs for whatever reason
(e.g. re-installation or a new computer);
  2/  people changing name (e.g. getting married) or changing email
(e.g. gmail.com to googlemail.com).

The problem is that it may be beneficial to see all the changes Cindy
H Kao made for example, including the ones made
@localhost.localdomain.

Having (user, email, uuid) will not solve the problem (even if the
uuid is from a memorable string) -- consider case 1. If you forget to
set up git, uuid will be blank or some random data, so this will be
worse than the (user, email) identity. As noted, there is also the
issue that git is used in a lot of places and not all git clone
instances are running the same version (e.g. pushing to an older git
client that does not support this new data).

What would be better is having a concept of identity aliases. That is,
a user can say that (for this git project), (user1,email1) is the same
person as (user2,email2). This would allow someone who has
mis-configured their git instance to say what the (user,email) pair
should have been. It also allows people to say that they used to be
called someone and they are now called somebody.

This information should ideally be in some form of (user,email) ->
(user,email) map that is versioned and tracked by git (in a way that
is also backward compatible, which could be tricky).

It also needs to be changeable and version tracked (i.e. with history)
to allow people to undo this; for example, this system would allow me
to say that Linus' (user,email) id is actually an alias for my
(user,email) which is bad. I don't know of a decent way to prevent
this (or someone using the uuid of someone else in the original
proposal), but this approach would at least allow it to be corrected.

There will need to be the related plumbing and porcelain to access and
manipulate this data/meta-data.

Would this be a better approach? Or is there a fatal flaw I am missing
(like people being able to alias themselves as other people, for
example)?

- Reece

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 18:42 ` Michael Witten
  2010-03-18 18:47   ` Matthieu Moy
  2010-03-18 19:12   ` Nicolas Pitre
@ 2010-03-18 20:44   ` tytso
  2010-03-18 21:12     ` Michael Witten
  2 siblings, 1 reply; 104+ messages in thread
From: tytso @ 2010-03-18 20:44 UTC (permalink / raw)
  To: Michael Witten; +Cc: Linus Torvalds, git

On Thu, Mar 18, 2010 at 11:42:44AM -0700, Michael Witten wrote:
> Could people still bungle the uuid or enter trash?
> Sure, but that's essentially no different than the
> current situation. This would be an improvement,
> because at least some people would take advantage
> of it; in fact, I bet most people would use it
> properly because:
> 
>     * The information required is easily remembered
>       and reproduced; it has that emotional aspect.
> 
>     * People have an emotional attachment to getting
>       proper attribution for their work, and this
>       helps.

The problem is that people don't get emotionally attached to a UUID.
And even if the UUID is generated algorithmically, they need to
remember, gee, was my UUID generated using:

	Theodore Y. Ts'o <tytso@mit.edu>
	Theodore Tso <tytso@mit.edu>
	Theodore T'so <tytso@valinux.com>  (*) 
	Theodore Y Tso <theotso@us.ibm.com>
	Ted Tso <tytso@google.com>
	Theodore Tso <tytso@google.com>
	<etc.>

(*) The VA Linux folks screwed up where the apostrophe goes in some
press release, and the misspelling of my last name has followed me for
the last ten years since then.

More importantly, there's a lot more to someone's reputation than just
Git.  What about reviews of other people's patches on LKML?  Can you
**honestly** expect people to say,

   Hi, I'm <dd1b51a1-ce2a-41fd-ae89-f68b7f0ace85> and here are the things
   that you need to fix with your patch....

People who give thoughtful reviews of other people's code count for a
lot, and people are not going to track that sort of thing by UUID.
They are going to track it by name and e-mail address.

Or what about papers?  Can you honestly expect that it would matter
even one iota if someone put in a bibliography of a paper

R. Card (14a8da4b-0231-497b-aa66-1809cc9727f9), T. Y. Ts'o
(dd1b51a1-ce2a-41fd-ae89-f68b7f0ace85), and S. Tweedie
(9052e458-32cc-11df-93b8-0016eb0fac40), "Design and implementation of
the second extended filesystem," in Proceedings of the 1994 Amsterdam
Linux Conference, 1994.

Is that going to contribute to my identity any?   I don't think so.


Finally, if someone misses one of my commits in a git changelog, so
what?  People don't gauge impact by the number of commits.  There are
some people who have huge numbers of commits, but they are all spelling
corrections.  A developer's reputation is developed over many months
or years of contributions; of interactions over e-mail; interactions
in hallway discussions at conferences; papers which they author; etc.
It's not just about git commits.

					- Ted

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 20:31                 ` Reece Dunn
@ 2010-03-18 20:59                   ` Linus Torvalds
  0 siblings, 0 replies; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 20:59 UTC (permalink / raw)
  To: Reece Dunn; +Cc: Jon Smirl, Michael Witten, git



On Thu, 18 Mar 2010, Reece Dunn wrote:
> 
> What would be better is having a concept of identity aliases. That is,
> a user can say that (for this git project), (user1,email1) is the same
> person as (user2,email2). This would allow someone who has
> mis-configured their git instance to say what the (user,email) pair
> should have been. It also allows people to say that they used to be
> called someone and they are now called somebody.

Yeah. And that's what '.mailmap' is, really.

Does mailmap get annoying? Yes. Is it going to be incomplete? Yes. Do we 
ever even _bother_ to try to make it perfect? No.

In the kernel, for example, we tend to use it _only_ to fix up the real 
name. It's much more capable than that (ie you can use it to fix up email 
addresses too), but we literally haven't cared enough to bother. So you 
still see the "localhost" emails or the "(none)" domains - even if you use 
one of the formats that ask for a "fixed" name and email.

And git itself only fixes up names for certain commands (git blame, git 
shortlog) and with specific format specifiers (%aN and %aE).

The _default_ pretty log format printouts don't do it, for example. Should 
they? Maybe. Or maybe we should have a flag and/or config option to do so 
by default.

				Linus
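
As a rough illustration of the fix-up table described above, here is a
Python sketch (editorial, not git's own implementation) that reads a
simplified subset of the '.mailmap' format and maps a commit identity to
its canonical one. The function names and the example entries are
hypothetical; the identities are borrowed from variants quoted elsewhere
in this thread. Only the "Proper Name <commit@email>" and
"Proper Name <proper@email> <commit@email>" line forms are handled.

```python
import re

# Simplified subset of the .mailmap line forms (assumption of this sketch):
#   Proper Name <commit@email>
#   Proper Name <proper@email> <commit@email>
LINE = re.compile(
    r"^(?P<name>[^<]*?)\s*<(?P<e1>[^>]*)>(?:\s*<(?P<e2>[^>]*)>)?\s*$"
)


def parse_mailmap(text):
    """Map a lowercased commit email to a (proper name, proper email) pair."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        m = LINE.match(line)
        if not m:
            continue  # line forms outside this subset are skipped
        if m.group("e2") is None:
            proper_email = commit_email = m.group("e1")
        else:
            proper_email, commit_email = m.group("e1"), m.group("e2")
        mapping[commit_email.lower()] = (m.group("name"), proper_email)
    return mapping


def canonical(mapping, name, email):
    """Return the fixed-up (name, email); unknown identities pass through."""
    proper_name, proper_email = mapping.get(email.lower(), (name, email))
    return proper_name or name, proper_email


EXAMPLE = """\
# hypothetical entries
Theodore Y. Ts'o <tytso@mit.edu> <tytso@valinux.com>
Theodore Y. Ts'o <tytso@mit.edu>
"""

mapping = parse_mailmap(EXAMPLE)
print(canonical(mapping, "Theodore T'so", "tytso@valinux.com"))
print(canonical(mapping, "Theodore Tso", "tytso@mit.edu"))
print(canonical(mapping, "Somebody Else", "else@example.org"))
```

The table lives outside the commit objects, so a wrong entry can be
corrected later without rewriting history.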

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 20:44   ` tytso
@ 2010-03-18 21:12     ` Michael Witten
  2010-03-18 21:19       ` Martin Langhoff
  2010-03-18 21:27       ` Linus Torvalds
  0 siblings, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 21:12 UTC (permalink / raw)
  To: tytso
  Cc: Linus Torvalds, Nicolas Pitre, Martin Langhoff, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 15:44,  <tytso@mit.edu> wrote:
>   Hi, I'm <dd1b51a1-ce2a-41fd-ae89-f68b7f0ace85> and here are the things
>   that you need to fix with your patch....

Look, there is a huge misunderstanding.

This is all that I'm saying: Keep git exactly the way it is, but add
one extra piece of identifying information for each person.

That's it.

Nothing is being taken away.

You can still see/grep/access the full names and email addresses just
as before, only now there will be another piece of information on
which to filter (or ignore it if you want).

In the most general form of my proposal, the idea is to let the user
choose some piece of information that he himself deems to be uniquely
identifying over a long period of time. However, I think it would be
smart to reduce that information to a SHA-1 (at least when it's
recorded in, say, a commit).

Essentially, the goal is to distribute the task of maintaining aliases.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:12     ` Michael Witten
@ 2010-03-18 21:19       ` Martin Langhoff
  2010-03-18 21:29         ` Michael Witten
  2010-03-18 21:27       ` Linus Torvalds
  1 sibling, 1 reply; 104+ messages in thread
From: Martin Langhoff @ 2010-03-18 21:19 UTC (permalink / raw)
  To: Michael Witten
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 5:12 PM, Michael Witten <mfwitten@gmail.com> wrote:
> This is all that I'm saying: Keep git exactly the way it is, but add
> one extra piece of identifying information for each person.

What's the value? For me it'll be "Martin Langhoff". I already have that.

> Nothing is being taken away.

But something is added.

Good design is not when there's nothing more to add, it's when there's
nothing left _to remove_.

Git is what it is thanks to removing superfluous crud from its core
datamodel. Don't be surprised that there is a very strong resistance
to adding anything to that datamodel.

> Essentially, the goal is to distribute the task of maintaining aliases.

Already achieved with mailmap. No need to mess with the secret of
git's success (the tight datamodel).

cheers,



m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:12     ` Michael Witten
  2010-03-18 21:19       ` Martin Langhoff
@ 2010-03-18 21:27       ` Linus Torvalds
  2010-03-18 21:44         ` Michael Witten
  2010-03-18 23:12         ` Jon Smirl
  1 sibling, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2010-03-18 21:27 UTC (permalink / raw)
  To: Michael Witten
  Cc: tytso, Nicolas Pitre, Martin Langhoff, Wincent Colaiuta, git



On Thu, 18 Mar 2010, Michael Witten wrote:
> 
> This is all that I'm saying: Keep git exactly the way it is, but add
> one extra piece of identifying information for each person.

The thing is, you don't seem to realize that most authorship is over 
email.

Let's take some numbers from the kernel archive, for example. Here's _one_ 
trivial way to count it:

 - number of commits where author/committer email matches (presumably 
   _not_ emailed, although sometimes people commit their own patches that 
   were emailed around):

	[torvalds@i5 linux]$ git log --no-merges "--pretty=format:%h-%ae%n%h-%ce" | uniq -d | wc
	  33473   33473  959167

 - total number of commits:

	[torvalds@i5 linux]$ git rev-list --no-merges HEAD | wc
	 176415  176415 7233015

IOW, less than a fifth of the patches were done by the person who actually 
committed things. 80%+ of all changes were committed by somebody other than 
the author.

How do you think the authorship information can be transferred sanely, 
considering that the author didn't even use git in the first place? 
Really?

That's where the typos/mistakes/missing-info really happen. And it often 
starts out with incomplete information, because the person has a bad email 
setup, and the thing only has an email address to begin with, ie the 
"From:" might literally say just "tytso@mit.edu" or something (to pick an 
example from the Cc list in this discussion - when Ted sends real emails, 
they tend to have proper naming).

Sometimes we'll edit the messages to have the "From: xyz <abc>" thing at 
the top, fixing up the incomplete thing then. Typos happen there. Or the 
patch will simply come in two different ways, so there's no typo, yet 
there are two different emails that get author attribution.

The thing is, development really is about human interaction. Yes, there's 
a tool involved (git), and once the data is in the tool we won't lose it 
any more, but this is about getting the data _into_ the tool in the first 
place.

And the data you want to add simply DOES NOT EXIST. And we can't make it 
exist. The fact that even the trivial and obvious data that git _does_ ask 
for gets to be incomplete should tell you something.

			Linus
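
The arithmetic behind "less than a fifth" can be checked with a short
Python sketch (editorial, not from the original mail). The helper
`same_person_fraction` is a hypothetical stand-in for the `uniq -d`
pipeline above, which counts commits whose author email and committer
email are identical; the two totals are the ones quoted in the mail.

```python
def same_person_fraction(pairs):
    """Fraction of (author_email, committer_email) pairs that match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    same = sum(1 for author, committer in pairs if author == committer)
    return same / len(pairs)


# Toy input: one self-committed change out of four.
toy = [
    ("a@example.org", "a@example.org"),
    ("b@example.org", "a@example.org"),
    ("c@example.org", "a@example.org"),
    ("d@example.org", "a@example.org"),
]
print(same_person_fraction(toy))

# Figures quoted in the mail: matching commits vs. all non-merge commits.
same, total = 33473, 176415
self_committed = same / total
print(f"self-committed: {self_committed:.1%}")
print(f"committed by someone else: {1 - self_committed:.1%}")
```

With the quoted figures the self-committed share comes out at roughly
19%, consistent with the "80%+" claim for changes committed by someone
other than the author.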

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:19       ` Martin Langhoff
@ 2010-03-18 21:29         ` Michael Witten
  2010-03-18 21:39           ` Martin Langhoff
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-18 21:29 UTC (permalink / raw)
  To: Martin Langhoff
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 16:19, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 5:12 PM, Michael Witten <mfwitten@gmail.com> wrote:
>> This is all that I'm saying: Keep git exactly the way it is, but add
>> one extra piece of identifying information for each person.
>
> What's the value? For me it'll be "Martin Langhoff". I already have that.

Well, that's rather egotistical considering you're probably not the
only Martin Langhoff in this world. I'd advocate something like
"Martin Langhoff <martin.langhoff@gmail.com>".

At worst, things will be just like they have always been.

Most likely, all that will happen is identification entropy won't
increase nearly so rapidly and there might be other benefits such as
shortlog speed improvements.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:29         ` Michael Witten
@ 2010-03-18 21:39           ` Martin Langhoff
  2010-03-18 21:46             ` Michael Witten
  2010-03-18 21:57             ` Michael Witten
  0 siblings, 2 replies; 104+ messages in thread
From: Martin Langhoff @ 2010-03-18 21:39 UTC (permalink / raw)
  To: Michael Witten
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 5:29 PM, Michael Witten <mfwitten@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 16:19, Martin Langhoff
>> What's the value? For me it'll be "Martin Langhoff". I already have that.
>
> Well, that's rather egotistical considering you're probably not the
> only Martin Langhoff in this world. I'd advocate something like
> "Martin Langhoff <martin.langhoff@gmail.com>".

So you are saying we should change the core datamodel of git to say...
what we already can say?

> At worst, things will be just like they have always been.

No, we'll have another way to have data mismatches. There are _more_
moving parts in your model. That's what Linus is pointing out.

This is a case where an ancillary "fixup table", in the form of
mailmap, works best. Don't move the fixup table to the core of the
datamodel, it just doesn't belong there.

Here's a hint: using your "uuid" model, I'll get some commits into a
project with the wrong uuid. Because I made a typo, or changed
machines (and a random uuid got created), whatever reason. So now in
my project I appear under 2 uuids.

What should we do in that case? Use mailmap to map the stray uuid to
the "real" one?... Have we done a lot of work to get back to square 0?

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:27       ` Linus Torvalds
@ 2010-03-18 21:44         ` Michael Witten
  2010-03-18 23:12         ` Jon Smirl
  1 sibling, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 21:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: tytso, Nicolas Pitre, Martin Langhoff, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 16:27, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, 18 Mar 2010, Michael Witten wrote:
>>
>> This is all that I'm saying: Keep git exactly the way it is, but add
>> one extra piece of identifying information for each person.
>
> The thing is, you don't seem to realize that most authorship is [sent
> over email with incomplete information].

That is a really good point, and something I'll have to consider more
thoroughly.

However, I do NOT claim that my proposal will add information where
there is none, only that it will reduce the rate at which entropy
increases.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:39           ` Martin Langhoff
@ 2010-03-18 21:46             ` Michael Witten
  2010-03-18 21:55               ` Martin Langhoff
  2010-03-18 22:06               ` Reece Dunn
  2010-03-18 21:57             ` Michael Witten
  1 sibling, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 21:46 UTC (permalink / raw)
  To: Martin Langhoff
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 16:39, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
>
> Here's a hint: using your "uuid" model, I'll get some commits into a
> project with the wrong uuid. Because I made a typo, or changed
> machines (and a random uuid got created), whatever reason. So now in
> my project I appear under 2 uuids.
>
> What should we do in that case? Use mailmap to map the stray uuid to
> the "real" one?... Have we done a lot of work to get back to square 0?

Again:

>> At worst, things will be just like they have always been.
>>
>> Most likely, all that will happen is identification entropy won't
>> increase nearly so rapidly and there might be other benefits
>> such as shortlog speed improvements.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:46             ` Michael Witten
@ 2010-03-18 21:55               ` Martin Langhoff
  2010-03-18 22:02                 ` Michael Witten
  2010-03-18 22:06               ` Reece Dunn
  1 sibling, 1 reply; 104+ messages in thread
From: Martin Langhoff @ 2010-03-18 21:55 UTC (permalink / raw)
  To: Michael Witten
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 5:46 PM, Michael Witten <mfwitten@gmail.com> wrote:
>> What should we do in that case? Use mailmap to map the stray uuid to
>> the "real" one?... Have we done a lot of work to get back to square 0?
>
> Again:
>
>>> At worst, things will be just like they have always been.

Of course we all read that line. You are proposing a change that will
mean a flag day -- that is, old versions of git won't be able to read
"new" repositories (and "new" git will have to be backwards compat for
X releases...). This is major breakage.

Inflict a painful change on our userbase for... what exactly? Ah, "At
worst, things will be just like they have always been."

I don't think you understand what you've been proposing.

Is it clearer now why you get a clear "no" from all quarters? Huge
cost, no upside?



m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:39           ` Martin Langhoff
  2010-03-18 21:46             ` Michael Witten
@ 2010-03-18 21:57             ` Michael Witten
  2010-03-19 12:34               ` Paolo Bonzini
  1 sibling, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-18 21:57 UTC (permalink / raw)
  To: Martin Langhoff
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 16:39, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 5:29 PM, Michael Witten <mfwitten@gmail.com> wrote:
>> On Thu, Mar 18, 2010 at 16:19, Martin Langhoff
>>> What's the value? For me it'll be "Martin Langhoff". I already have that.
>>
>> Well, that's rather egotistical considering you're probably not the
>> only Martin Langhoff in this world. I'd advocate something like
>> "Martin Langhoff <martin.langhoff@gmail.com>".
>
> So you are saying we should change the core datamodel of git to say...
> what we already can say?

You see, Martin, you might want/need to stop using "Martin Langhoff
<martin.langhoff@gmail.com>" as your email account, but there's no
reason why you can't continue to use it for your UUID.

>> At worst, things will be just like they have always been.
>
> No, we'll have another way to have data mismatches. There are _more_
> moving parts in your model. That's what Linus is pointing out.

Mismatches in UUIDs will be the only thing worth worrying about;
fortunately, UUIDs won't change as frequently because they would be
rarely typed by git users and they are not subject to changing email
systems or changing names.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:55               ` Martin Langhoff
@ 2010-03-18 22:02                 ` Michael Witten
  2010-03-18 23:37                   ` Nicolas Pitre
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-18 22:02 UTC (permalink / raw)
  To: Martin Langhoff
  Cc: tytso, Linus Torvalds, Nicolas Pitre, Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 16:55, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 5:46 PM, Michael Witten <mfwitten@gmail.com> wrote:
>>> What should we do in that case? Use mailmap to map the stray uuid to
>>> the "real" one?... Have we done a lot of work to get back to square 0?
>>
>> Again:
>>
>>>> At worst, things will be just like they have always been.
>
> Of course we all read that line.

You missed the other line (probably gmail's fault):

Most likely, all that will happen is identification entropy won't
increase nearly so rapidly and there might be other benefits
such as shortlog speed improvements.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:46             ` Michael Witten
  2010-03-18 21:55               ` Martin Langhoff
@ 2010-03-18 22:06               ` Reece Dunn
  1 sibling, 0 replies; 104+ messages in thread
From: Reece Dunn @ 2010-03-18 22:06 UTC (permalink / raw)
  To: Michael Witten
  Cc: Martin Langhoff, tytso, Linus Torvalds, Nicolas Pitre,
	Wincent Colaiuta, git

On 18 March 2010 21:46, Michael Witten <mfwitten@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 16:39, Martin Langhoff
> <martin.langhoff@gmail.com> wrote:
>>
>> Here's a hint: using your "uuid" model, I'll get some commits into a
>> project with the wrong uuid. Because I made a typo, or changed
>> machines (and a random uuid got created), whatever reason. So now in
>> my project I appear under 2 uuids.
>>
>> What should we do in that case? Use mailmap to map the stray uuid to
>> the "real" one?... Have we done a lot of work to get back to square 0?
>
> Again:
>
>>> At worst, things will be just like they have always been.
>>>
>>> Most likely, all that will happen is identification entropy won't
>>> increase nearly so rapidly and there might be other benefits
>>> such as shortlog speed improvements.

You have 3 pieces of information that can change by adding uuid instead of 2.

Are people going to remember that they need to set a uuid when
checking things into git? Different uuids? Forgetting the key string
to generate the hash for the uuid?

The uuid is another source of permutations that will see an increase
in identity triples. It is also another thing that needs to be stored
in a commit on disk and in memory, printed out in the shortlog and
checked by people.

Even if you generate a SHA-1 hash from a memorable bit of data, the
resulting hash is not readable. It is something that could cause
collisions with partial hashes in treeish queries (does 12ab34 refer
to a commit, or to a person's uuid?). It is also meaningless to the
user: I want to find Ted Ts'o's (I hope I've got the apostrophe in the
correct place) commits - how do I know what uuid refers to his
commits? How can I find it out?

It is just adding more resistance, whereas with a well-configured
.mailmap I could use one of his known email addresses, something that
is easy to find and remember.

From what Linus and others have said, .mailmap is the way to fix name
and/or email changes. It may need more work to expose it to more
commands, but that is the simplest, cleanest and most elegant approach
to fixing the problem you specified.

What about .mailmap does not solve your problem? Is it that it does
not work for `git log`? If so, then write a patch to allow `git log`
to use that information when you specify a certain flag (or pretty
format string).
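
To illustrate what a .mailmap buys you, here is a minimal Python sketch.
It is not git's implementation: it only understands the
"Proper Name <proper@email> <commit@email>" entry form, the helper names
(parse_mailmap, canonical, shortlog) are made up for this example, and
the entries are hypothetical, built from addresses quoted in this thread.

```python
import re
from collections import Counter

# Assumption: only the "Proper Name <proper@email> <commit@email>" form is handled.
_ENTRY = re.compile(
    r"^(?P<name>[^<]+?)\s*<(?P<proper>[^>]+)>\s*<(?P<commit>[^>]+)>\s*$"
)

def parse_mailmap(text):
    """Return {commit_email (lowercased): (proper_name, proper_email)}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = _ENTRY.match(line)
        if m:
            mapping[m.group("commit").lower()] = (m.group("name"), m.group("proper"))
    return mapping

def canonical(name, email, mapping):
    """Map a commit identity to its canonical form, or return it unchanged."""
    return mapping.get(email.lower(), (name, email))

def shortlog(commits, mapping):
    """Count commits per canonical (name, email) identity."""
    counts = Counter()
    for name, email in commits:
        counts[canonical(name, email, mapping)] += 1
    return counts

# Hypothetical mailmap entries, for illustration only.
MAILMAP = """\
# map old addresses to the current one
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@osdl.org>
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@g5.osdl.org>
"""

commits = [
    ("Linus Torvalds", "torvalds@osdl.org"),
    ("Linus Torvalds", "torvalds@g5.osdl.org"),
    ("Linus Torvalds", "torvalds@linux-foundation.org"),
]
result = shortlog(commits, parse_mailmap(MAILMAP))
```

With those two entries, the three distinct commit identities collapse
into a single one, without touching any stored commit.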

NOTE: It is not just the author/committer that needs to remember/use
the uuid - it is people doing analysis on commits, curious people,
automated scripts and many others.

- Reece

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
                   ` (2 preceding siblings ...)
  2010-03-18 18:42 ` Michael Witten
@ 2010-03-18 22:17 ` A Large Angry SCM
  2010-03-19  2:47 ` Sitaram Chamarty
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 104+ messages in thread
From: A Large Angry SCM @ 2010-03-18 22:17 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

Michael Witten wrote:
> Short Version:
> -------------
> 
> 
> Rather than use a (name,email) pair to identify people, let's use
> a (uuid,name,email) triplet.
> 
> The uuid can be any piece of information that a user of git determines
> to be reasonably unique across space and time and that is intended to
> be used by that user virtually forever (at least within a project's
> history).
> 
> For instance, the uuid could be an OSF DCE 1.1 UUID or the SHA-1 of
> some easily remembered, already reasonably unique information.
> 
> This could really help keep identifications clean, and it is rather
> straightforward and possibly quite efficient.
> 
> 
> Long Version:
> ------------

[Much text deleted]

The formatting of the information in the author & committer fields is a
_social_ convention (with a little help from the tools). You can actually
use this proposed "feature" now for your own commits by appending the
UUID string to your name config setting, environment variable and/or GCOS
field, and everything will work. You can even make it a requirement
for projects that you control. But don't expect all other projects to do
so also, as they may not care.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 17:27 ` Linus Torvalds
  2010-03-18 19:02   ` Jon Smirl
@ 2010-03-18 22:36   ` Martin Langhoff
  2010-03-18 23:17     ` Nicolas Pitre
  1 sibling, 1 reply; 104+ messages in thread
From: Martin Langhoff @ 2010-03-18 22:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Michael Witten, git

On Thu, Mar 18, 2010 at 1:27 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Even shorter version: NO.

One thing we all forgot to mention here is that even if it was a good
idea (which it is not), implementing it means a flag day: changing in
the pack format, wire protocol and APIs, messing up with compatibility
with users of pre-flag-day git, and rippling out to all the GUIs,
frontends and integration scripts out there.

A veritable mess that would reverberate for years.

Any proposal that touches the core git datamodel... better implement
something that is outrageously wondrously good and impossible to do
any other way.

My guess is that people that parachute into this list and propose
datamodel changes haven't thought this aspect through.

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:27       ` Linus Torvalds
  2010-03-18 21:44         ` Michael Witten
@ 2010-03-18 23:12         ` Jon Smirl
  1 sibling, 0 replies; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 23:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Witten, tytso, Nicolas Pitre, Martin Langhoff,
	Wincent Colaiuta, git

On Thu, Mar 18, 2010 at 5:27 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 18 Mar 2010, Michael Witten wrote:
>>
>> This is all that I'm saying: Keep git exactly the way it is, but add
>> one extra piece of identifying information for each person.
>
> The thing is, you don't seem to realize that most authorship is over
> email.
>
> Let's take some numbers from the kernel archive, for example. Here's _one_
> trivial way to count it:
>
>  - number of commits where author/committer email matches (presumably
>   _not_ emailed, although sometimes people commit their own patches that
>   were emailed around):
>
>        [torvalds@i5 linux]$ git log --no-merges "--pretty=format:%h-%ae%n%h-%ce" | uniq -d | wc
>          33473   33473  959167
>
>  - total number of commits:
>
>        [torvalds@i5 linux]$ git rev-list --no-merges HEAD | wc
>         176415  176415 7233015
>
> IOW, less than a fifth of the patches were done by the person who actually
> committed things. 80%+ of all changes were committed by somebody else than
> the author.
>
> How do you think the authorship information can be transferred sanely,
> considering that the author didn't even use git in the first place?
> Really?
>
> That's where the typos/mistakes/missing-info really happens. And it often
> starts out with incomplete information, because the person has a bad email
> setup, and the thing only has an email address to begin with, ie the
> "From:" might literally say just "tytso@mit.edu" or something (to pick an
> example from the Cc list in this discussion - when Ted sends real emails,
> they tend to have proper naming).

If I recall correctly the top source of errors is variations in the
domain name of the email address. Second place was mangling of names
from non-ASCII charsets. Third place was human typos. Fourth was
inconsistency in the human name, like Ted's example.

A really simple check would be for git to say - I've never seen this
name/email combo before, are you sure it is correct before I commit
it.
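
A minimal Python sketch of such a check, assuming the identities
already present in history are available as a set of (name, email)
pairs. Both helpers are hypothetical; git has no such feature that I
know of.

```python
def is_new_identity(name, email, known):
    """True if this exact name/email combination has not been seen before."""
    return (name, email) not in known

def near_matches(name, email, known):
    """Known identities that share the name or the email, but not both.

    These are the likely candidates when the new combination is a typo
    or an address change rather than a genuinely new person.
    """
    return sorted(k for k in known if (k[0] == name) != (k[1] == email))

# Illustration only: one known identity, taken from addresses in this thread.
known = {("Linus Torvalds", "torvalds@linux-foundation.org")}

candidate = ("Linus Torvalds", "torvalds@osdl.org")
unseen = is_new_identity(*candidate, known)
hints = near_matches(*candidate, known)
```

Here the candidate is flagged as unseen, and the existing entry with the
same name is offered as a hint before committing.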

PS - I am not in favor of the UUID scheme.

>
> Sometimes we'll edit the messages to have the "From: xyz <abc>" thing at
> the top, fixing up the incomplete thing then. Typos happen there. Or the
> patch will simply come in two different ways, so there's no typo, yet
> there are two different emails that get author attribution.
>
> The thing is, development really is about human interaction. Yes, there's
> a tool involved (git), and once the data is in the tool we won't lose it
> any more, but this is about getting the data _into_ the tool in the first
> place.
>
> And the data you want to add simply DOES NOT EXIST. And we can't make it
> exist. The fact that even the trivial and obvious data that git _does_ ask
> for gets to be incomplete should tell you something.
>
>                        Linus
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 22:36   ` Martin Langhoff
@ 2010-03-18 23:17     ` Nicolas Pitre
  2010-03-18 23:26       ` Jon Smirl
  2010-03-18 23:34       ` Michael Witten
  0 siblings, 2 replies; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-18 23:17 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Linus Torvalds, Michael Witten, git

On Thu, 18 Mar 2010, Martin Langhoff wrote:

> On Thu, Mar 18, 2010 at 1:27 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > Even shorter version: NO.
> 
> One thing we all forgot to mention here is that even if it was a good
> idea (which it is not), implementing it means a flag day: changing in
> the pack format, wire protocol and APIs, messing up with compatibility
> with users of pre-flag-day git, and rippling out to all the GUIs,
> frontends and integration scripts out there.

And nobody yet mentioned what should happen when someone sends a patch 
by email.  Most commits in git.git originated from a patch sent via 
email.  Should we start pasting UUIDs in the email body?  What if the 
cut & paste was quickly done and the UUID is missing a character or two?  
Because this does happen.  And because this UUID thing is supposed to be 
a perfect identity representation then we'll need a .uuidmap to correct 
such mistakes of course.

Better improve on the existing .mailmap instead.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:17     ` Nicolas Pitre
@ 2010-03-18 23:26       ` Jon Smirl
  2010-03-18 23:34         ` Nicolas Pitre
  2010-03-18 23:34       ` Michael Witten
  1 sibling, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 23:26 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Martin Langhoff, Linus Torvalds, Michael Witten, git

On Thu, Mar 18, 2010 at 7:17 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Martin Langhoff wrote:
>
>> On Thu, Mar 18, 2010 at 1:27 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > Even shorter version: NO.
>>
>> One thing we all forgot to mention here is that even if it was a good
>> idea (which it is not), implementing it means a flag day: changing in
>> the pack format, wire protocol and APIs, messing up with compatibility
>> with users of pre-flag-day git, and rippling out to all the GUIs,
>> frontends and integration scripts out there.
>
> And nobody yet mentioned what should happen when someone sends a patch
> by email.  Most commits in git.git originated from a patch sent via
> email.  Should we start pasting UUIDs in the email body?  What if the
> cut & paste was quickly done and the UUID is missing a character or two?
> Because this does happen.  And because this UUID thing is supposed to be
> a perfect identity representation then we'll need a .uuidmap to correct
> such mistakes of course.
>
> Better improve on the existing .mailmap instead.

If anyone is interested I can send them a .mailmap that fixes a lot of
the problems in the kernel tree. It's two years old so it will need
updating.

>
>
> Nicolas



-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:26       ` Jon Smirl
@ 2010-03-18 23:34         ` Nicolas Pitre
  2010-03-18 23:41           ` Jon Smirl
  0 siblings, 1 reply; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-18 23:34 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, Linus Torvalds, Michael Witten, git

On Thu, 18 Mar 2010, Jon Smirl wrote:

> If anyone is interested I can send them a .mailmap that fixes a lot of
> the problems in the kernel tree. It's two years old so it will need
> updating.

Please just make a patch with it, and post it to lkml and CC Linus and 
Andrew Morton.  Repost a month later if no one picked it up.

I think that 'git log' should really consider the .mailmap by default.  
Otherwise what's the point?   The only time when .mailmap should not be 
considered is when using --pretty=raw or when explicitly told not to.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:17     ` Nicolas Pitre
  2010-03-18 23:26       ` Jon Smirl
@ 2010-03-18 23:34       ` Michael Witten
  1 sibling, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-18 23:34 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Martin Langhoff, Linus Torvalds, git

On Thu, Mar 18, 2010 at 18:17, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Martin Langhoff wrote:
>
>> On Thu, Mar 18, 2010 at 1:27 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > Even shorter version: NO.
>>
>> One thing we all forgot to mention here is that even if it was a good
>> idea (which it is not), implementing it means a flag day: changing in
>> the pack format, wire protocol and APIs, messing up with compatibility
>> with users of pre-flag-day git, and rippling out to all the GUIs,
>> frontends and integration scripts out there.
>
> And nobody yet mentioned what should happen when someone sends a patch
> by email.  Most commits in git.git originated from a patch sent via
> email.  Should we start pasting UUIDs in the email body?  What if the
> cut & paste was quickly done and the UUID is missing a character or two?
> Because this does happen.  And because this UUID thing is supposed to be
> a perfect identity representation then we'll need a .uuidmap to correct
> such mistakes of course.
>
> Better improve on the existing .mailmap instead.

Actually, those points were touched upon earlier (including my rebuttals).

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 22:02                 ` Michael Witten
@ 2010-03-18 23:37                   ` Nicolas Pitre
  2010-03-18 23:44                     ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-18 23:37 UTC (permalink / raw)
  To: Michael Witten
  Cc: Martin Langhoff, tytso, Linus Torvalds, Wincent Colaiuta, git

On Thu, 18 Mar 2010, Michael Witten wrote:

> You missed the other line (probably gmail's fault):
> 
> Most likely, all that will happen is identification entropy won't
> increase nearly so rapidly and there might be other benefits
> such as shortlog speed improvements.

The shortlog speed improvement is certainly not going to compensate for 
all the added human time needed to process the extra piece of 
information.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:34         ` Nicolas Pitre
@ 2010-03-18 23:41           ` Jon Smirl
  2010-03-18 23:58             ` Nicolas Pitre
  0 siblings, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-18 23:41 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Martin Langhoff, Linus Torvalds, Michael Witten, git

On Thu, Mar 18, 2010 at 7:34 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Jon Smirl wrote:
>
>> If anyone is interested I can send them a .mailmap that fixes a lot of
>> the problems in the kernel tree. It's two years old so it will need
>> updating.
>
> Please just make a patch with it, and post it to lkml and CC Linus and
> Andrew Morton.  Repost a month later if no one picked it up.

Been there, done that. 1000 message flame war ensued about privacy
concerns over people's email address in the file.

>
> I think that 'git log' should really consider the .mailmap by default.
> Otherwise what's the point?   The only time when .mailmap should not be
> considered is when using --pretty=raw or when explicitly told not to.
>
>
> Nicolas
>



-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:37                   ` Nicolas Pitre
@ 2010-03-18 23:44                     ` Michael Witten
  2010-03-19  0:03                       ` Nicolas Pitre
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-18 23:44 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

On Thu, Mar 18, 2010 at 18:37, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Michael Witten wrote:
>
>> You missed the other line (probably gmail's fault):
>>
>> Most likely, all that will happen is identification entropy won't
>> increase nearly so rapidly and there might be other benefits
>> such as shortlog speed improvements.
>
> The shortlog speed improvement is certainly not going to compensate for
> all the added human time needed to process the extra piece of
> information.

What added human time?

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:41           ` Jon Smirl
@ 2010-03-18 23:58             ` Nicolas Pitre
  2010-03-19  0:16               ` Jon Smirl
  0 siblings, 1 reply; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-18 23:58 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, Linus Torvalds, Michael Witten, git

On Thu, 18 Mar 2010, Jon Smirl wrote:

> On Thu, Mar 18, 2010 at 7:34 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Thu, 18 Mar 2010, Jon Smirl wrote:
> >
> >> If anyone is interested I can send them a .mailmap that fixes a lot of
> >> the problems in the kernel tree. It's two years old so it will need
> >> updating.
> >
> > Please just make a patch with it, and post it to lkml and CC Linus and
> > Andrew Morton.  Repost a month later if no one picked it up.
> 
> Been there, done that. 1000 message flame war ensued about privacy
> concerns over people's email address in the file.

Well, you used git itself as the data source to fix up those email 
addresses, right?  If so there is simply no privacy concerns as the data 
is already there and public.  Just don't venture adding emails that are 
not already present in the whole Git history/content at all without 
consent.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:44                     ` Michael Witten
@ 2010-03-19  0:03                       ` Nicolas Pitre
  2010-03-19  0:27                         ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-19  0:03 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

On Thu, 18 Mar 2010, Michael Witten wrote:

> On Thu, Mar 18, 2010 at 18:37, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Thu, 18 Mar 2010, Michael Witten wrote:
> >
> >> You missed the other line (probably gmail's fault):
> >>
> >> Most likely, all that will happen is identification entropy won't
> >> increase nearly so rapidly and there might be other benefits
> >> such as shortlog speed improvements.
> >
> > The shortlog speed improvement is certainly not going to compensate for
> > all the added human time needed to process the extra piece of
> > information.
> 
> What added human time?

The time that humans will have to spend on this UUID 
setup/fixing/whatnot.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 23:58             ` Nicolas Pitre
@ 2010-03-19  0:16               ` Jon Smirl
  2010-03-19  0:17                 ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-19  0:16 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Martin Langhoff, Linus Torvalds, Michael Witten, git

On Thu, Mar 18, 2010 at 7:58 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Jon Smirl wrote:
>
>> On Thu, Mar 18, 2010 at 7:34 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
>> > On Thu, 18 Mar 2010, Jon Smirl wrote:
>> >
>> >> If anyone is interested I can send them a .mailmap that fixes a lot of
>> >> the problems in the kernel tree. It's two years old so it will need
>> >> updating.
>> >
>> > Please just make a patch with it, and post it to lkml and CC Linus and
>> > Andrew Morton.  Repost a month later if no one picked it up.
>>
>> Been there, done that. 1000 message flame war ensued about privacy
>> concerns over people's email address in the file.
>
> Well, you used git itself as the data source to fix up those email
> addresses, right?  If so there is simply no privacy concerns as the data
> is already there and public.  Just don't venture adding emails that are
> not already present in the whole Git history/content at all without
> consent.

I'll send you the file and you can commit it. Please take full credit for it.
http://lkml.org/lkml/2008/7/28/134

All of the data came out of the git tree.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  0:16               ` Jon Smirl
@ 2010-03-19  0:17                 ` Linus Torvalds
  2010-03-19  0:39                   ` Jon Smirl
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-19  0:17 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Nicolas Pitre, Martin Langhoff, Michael Witten, git



On Thu, 18 Mar 2010, Jon Smirl wrote:
> 
> I'll send you the file and you can commit it. Please take full credit for it.

Umm. You do realize that what people complained about was mostly that they 
felt a lot of the entries were totally pointless.

For example, you included names whether they were mistyped or not, and 
claimed that everybody needed to always be in the mailmap if they ever 
made any commit.

So I think 99% of the flames were due to just the patch being stupid.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  0:03                       ` Nicolas Pitre
@ 2010-03-19  0:27                         ` Michael Witten
  2010-03-19  0:32                           ` Nicolas Pitre
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-19  0:27 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

On Thu, Mar 18, 2010 at 19:03, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Michael Witten wrote:
>
>> On Thu, Mar 18, 2010 at 18:37, Nicolas Pitre <nico@fluxnic.net> wrote:
>> > On Thu, 18 Mar 2010, Michael Witten wrote:
>> >
>> >> You missed the other line (probably gmail's fault):
>> >>
>> >> Most likely, all that will happen is identification entropy won't
>> >> increase nearly so rapidly and there might be other benefits
>> >> such as shortlog speed improvements.
>> >
>> > The shortlog speed improvement is certainly not going to compensate for
>> > all the added human time needed to process the extra piece of
>> > information.
>>
>> What added human time?
>
> The time that humans will have to spend on this UUID
> setup/fixing/whatnot.

Compatibility concerns aside, there is virtually no overhead. Indeed,
there would be less overhead than there is now in terms of fixing.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  0:27                         ` Michael Witten
@ 2010-03-19  0:32                           ` Nicolas Pitre
  0 siblings, 0 replies; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-19  0:32 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

On Thu, 18 Mar 2010, Michael Witten wrote:

> On Thu, Mar 18, 2010 at 19:03, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Thu, 18 Mar 2010, Michael Witten wrote:
> >
> >> On Thu, Mar 18, 2010 at 18:37, Nicolas Pitre <nico@fluxnic.net> wrote:
> >> > On Thu, 18 Mar 2010, Michael Witten wrote:
> >> >
> >> >> You missed the other line (probably gmail's fault):
> >> >>
> >> >> Most likely, all that will happen is identification entropy won't
> >> >> increase nearly so rapidly and there might be other benefits
> >> >> such as shortlog speed improvements.
> >> >
> >> > The shortlog speed improvement is certainly not going to compensate for
> >> > all the added human time needed to process the extra piece of
> >> > information.
> >>
> >> What added human time?
> >
> > The time that humans will have to spend on this UUID
> > setup/fixing/whatnot.
> 
> Compatibility concerns aside, there is virtually no overhead. Indeed,
> there would be less overhead than there is now in terms of fixing.

In a perfect world maybe.  Let's talk about it again when we get there.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  0:17                 ` Linus Torvalds
@ 2010-03-19  0:39                   ` Jon Smirl
  2010-03-19  0:50                     ` Linus Torvalds
  0 siblings, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-19  0:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Martin Langhoff, Michael Witten, git

On Thu, Mar 18, 2010 at 8:17 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 18 Mar 2010, Jon Smirl wrote:
>>
>> I'll send you the file and you can commit it. Please take full credit for it.
>
> Umm. You do realize that what people complained about was mostly that they
> felt a lot of the entries were totally pointless.
>
> For example, you included names whether they were mistyped or not, and
> claimed that everybody needed to always be in the mailmap if they ever
> made any commit.
>
> So I think 99% of the flames were due to just the patch being stupid.

I had all of the names in the list so that I could regenerate the list
and diff it against the old version to know which new names needed to
be checked. Looking back I could have eliminated the names without
errors and then added a comment to the file as to the last date all of
the names were checked.  But that is less reliable than recording
which were checked. The problem is that if you lose track of what has
been checked, you are forced to recheck everything and it takes a long
time to recheck everything.

I'll send you a copy and you can unstupify it.

>
>                Linus
>



-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  0:39                   ` Jon Smirl
@ 2010-03-19  0:50                     ` Linus Torvalds
  2010-03-19  1:12                       ` Jon Smirl
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2010-03-19  0:50 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Nicolas Pitre, Martin Langhoff, Michael Witten, git



On Thu, 18 Mar 2010, Jon Smirl wrote:
> 
> I had all of the names in the list so that I could regenerate the list
> and diff it against the old version to know which new names needed to
> be checked. Looking back I could have eliminated the names without
> errors and then added a comment to the file as to the last date all of
> the names were checked.  But that is less reliable than recording
> which were checked. The problem is that if you lose track of what has
> been checked, you are forced to recheck everything and it takes a long
> time to recheck everything.

The part you keep missing is that NOBODY CARES!

For example, I exist in the current git kernel tree with 11 different 
names for just the authorship information:

     32 Linus Torvalds torvalds@evo.osdl.org
   1522 Linus Torvalds torvalds@g5.osdl.org
   4194 Linus Torvalds torvalds@linux-foundation.org
      7 Linus Torvalds torvalds@macmini.osdl.org
      2 Linus Torvalds torvalds@merom.osdl.org
      8 Linus Torvalds torvalds@osdl.org
    166 Linus Torvalds torvalds@ppc970.osdl.org
      4 Linus Torvalds torvalds@ppc970.osdl.org.(none)
      1 Linus Torvalds torvalds@quad.osdl.org
   1606 Linus Torvalds torvalds@woody.linux-foundation.org
    174 Linus Torvalds torvalds@woody.osdl.org

(that's counts, in case you care). And then if you check signed-off lines, 
you'll find some _additional_ oddities where things just got misspelled, 
like

	Linus Torvalds <tovalds@linux-foundation.org>
	Linus Torvalds <torvalds@akpm@linux-foundation.org>

where in one case there's a missing 'r', and in the other it's some odd 
perverse incestuous relationship between me and Andrew (in reality, it's 
me doing a stupid "search-and-replace" on the emails, adding my own 
sign-off to Andrew's, and that ran into a few too many copy-paste issues)

There's a few other mistakes like that in the sign-offs.

Does anybody care? Certainly not I. There is absolutely zero reason to 
worry about it. I used to find it convenient to see what machines I had 
worked on, so I actually included that. And one of them was clearly 
mis-configured, or git did something wrong when the hostname was already 
in FQDN format. Whatever.

There is no real _value_ in making a .mailmap for each such buggy entry is 
what I'm trying to tell you. Those things are maybe used for statistics. 
On the whole, they are correct. 

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  0:50                     ` Linus Torvalds
@ 2010-03-19  1:12                       ` Jon Smirl
  2010-03-19  1:45                         ` Nicolas Pitre
  0 siblings, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-19  1:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Martin Langhoff, Michael Witten, git

On Thu, Mar 18, 2010 at 8:50 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Does anybody care? Certainly not I. There is absolutely zero reason to
> worry about it. I used to find it convenient to see what machines I had
> worked on, so I actually included that. And one of them was clearly
> mis-configured, or git did something wrong when the hostname was already
> in FQDN format. Whatever.
>
> There is no real _value_ in making a .mailmap for each such buggy entry is
> what I'm trying to tell you. Those things are maybe used for statistics.
> On the whole, they are correct.

I was trying to track how many real people were working on the kernel.
 If we don't collapse the 13 different versions of you down to one
person the numbers are way off.
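
As a rough illustration of the collapsing described above, here is a minimal
Python sketch. The alias table is hypothetical and hand-written for this
example; the counts are a subset of the ones Linus quoted earlier in the
thread.

```python
from collections import Counter

# (commit count, name, email) rows, a subset of the list quoted in the thread.
SHORTLOG = [
    (32, "Linus Torvalds", "torvalds@evo.osdl.org"),
    (1522, "Linus Torvalds", "torvalds@g5.osdl.org"),
    (4194, "Linus Torvalds", "torvalds@linux-foundation.org"),
    (4, "Linus Torvalds", "torvalds@ppc970.osdl.org.(none)"),
]

# Hypothetical alias table: each known variant points at one preferred address.
CANONICAL = {
    "torvalds@evo.osdl.org": "torvalds@linux-foundation.org",
    "torvalds@g5.osdl.org": "torvalds@linux-foundation.org",
    "torvalds@ppc970.osdl.org.(none)": "torvalds@linux-foundation.org",
}


def collapse(entries, canonical):
    """Sum commit counts per person after rewriting known address variants."""
    totals = Counter()
    for count, name, email in entries:
        # Addresses missing from the table are kept as they are.
        totals[(name, canonical.get(email, email))] += count
    return totals


if __name__ == "__main__":
    print("identities before:", len(collapse(SHORTLOG, {})))
    print("identities after: ", len(collapse(SHORTLOG, CANONICAL)))
```

Without the table each address counts as a separate person; with it the rows
fold into one identity and the commit counts add up.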

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  1:12                       ` Jon Smirl
@ 2010-03-19  1:45                         ` Nicolas Pitre
  2010-03-19  2:05                           ` Jon Smirl
  0 siblings, 1 reply; 104+ messages in thread
From: Nicolas Pitre @ 2010-03-19  1:45 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Linus Torvalds, Martin Langhoff, Michael Witten, git

On Thu, 18 Mar 2010, Jon Smirl wrote:

> On Thu, Mar 18, 2010 at 8:50 PM, Linus Torvalds
> > There is no real _value_ in making a .mailmap for each such buggy entry is
> > what I'm trying to tell you. Those things are maybe used for statistics.
> > On the whole, they are correct.
> 
> I was trying to track how many real people were working on the kernel.
>  If we don't collapse the 13 different versions of you down to one
> person the numbers are way off.

If you have a cleaned up .mailmap file which doesn't include unneeded 
entries then just submit it for inclusion.  If someone else eventually 
cares to check and update it then another patch should come forth at 
that point.  That doesn't have to be any more complicated than that.


Nicolas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  1:45                         ` Nicolas Pitre
@ 2010-03-19  2:05                           ` Jon Smirl
  0 siblings, 0 replies; 104+ messages in thread
From: Jon Smirl @ 2010-03-19  2:05 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Linus Torvalds, Martin Langhoff, Michael Witten, git

On Thu, Mar 18, 2010 at 9:45 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 18 Mar 2010, Jon Smirl wrote:
>
>> On Thu, Mar 18, 2010 at 8:50 PM, Linus Torvalds
>> > There is no real _value_ in making a .mailmap for each such buggy entry is
>> > what I'm trying to tell you. Those things are maybe used for statistics.
>> > On the whole, they are correct.
>>
>> I was trying to track how many real people were working on the kernel.
>>  If we don't collapse the 13 different versions of you down to one
>> person the numbers are way off.
>
> If you have a cleaned up .mailmap file which doesn't include unneeded
> entries then just submit it for inclusion.  If someone else eventually
> cares to check and update it then another patch should come forth at
> that point.  That doesn't have to be any more complicated than that.

I sent you a copy, feel free to do whatever you want with it.  The
academics doing statistics on Linux will love you for submitting it.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
                   ` (3 preceding siblings ...)
  2010-03-18 22:17 ` A Large Angry SCM
@ 2010-03-19  2:47 ` Sitaram Chamarty
  2010-03-19  5:17   ` Nazri Ramliy
  2010-03-19  8:41 ` Michael Haggerty
  2010-03-19 14:08 ` Jakub Narebski
  6 siblings, 1 reply; 104+ messages in thread
From: Sitaram Chamarty @ 2010-03-19  2:47 UTC (permalink / raw)
  To: git

On Thu, Mar 18, 2010 at 6:53 PM, Michael Witten <mfwitten@gmail.com> wrote:
> Short Version:
> -------------

[all snipped]

Great Gods above... 50+ emails, including many from Linus himself,
trying to respond to a non-solution to a non-problem...

slow day?

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  2:47 ` Sitaram Chamarty
@ 2010-03-19  5:17   ` Nazri Ramliy
  0 siblings, 0 replies; 104+ messages in thread
From: Nazri Ramliy @ 2010-03-19  5:17 UTC (permalink / raw)
  To: Sitaram Chamarty; +Cc: git

On Fri, Mar 19, 2010 at 10:47 AM, Sitaram Chamarty <sitaramc@gmail.com> wrote:
> On Thu, Mar 18, 2010 at 6:53 PM, Michael Witten <mfwitten@gmail.com> wrote:
>> Short Version:
>> -------------
>
> [all snipped]
>
> Great Gods above... 50+ emails, including many from Linus himself,
> trying to respond to a non-solution to a non-problem...
>
> slow day?

Nah.. just wait until someone mentions either Hitler or Nazis.

nazri.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
                   ` (4 preceding siblings ...)
  2010-03-19  2:47 ` Sitaram Chamarty
@ 2010-03-19  8:41 ` Michael Haggerty
  2010-03-19 11:39   ` Michael Witten
  2010-03-19 14:08 ` Jakub Narebski
  6 siblings, 1 reply; 104+ messages in thread
From: Michael Haggerty @ 2010-03-19  8:41 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

Michael Witten wrote:
> Rather than use a (name,email) pair to identify people, let's use
> a (uuid,name,email) triplet.
> [...]

A UUID doesn't need to be a big hex number.  All it has to be is a
"Universally Unique Identifier".  Like, oh, for example, your

                   *** EMAIL ADDRESS ***

[1].  There is even already a way to fix up mistakes or unavoidable
email address changes, namely the .mailmap file.

So if you are exercised about having a persistent identity, simply find
an email provider that is unlikely to ever give your email address to
somebody else, and use that address consistently.  Encourage other
people to do the same and to keep their .mailmap entries up to date.

(Not that it's likely to happen, but having people maintain opaque UUIDs
is even *less* likely.)

Michael

[1] The only non-UUID property of legitimate email addresses is that the
username part or even the domain name part of an email address can be
recycled.  But with a reputable email provider this shouldn't be a
problem.  For the purpose of the UUID it is not even a problem if the
email address becomes defunct, as long as it is not taken over by
somebody else.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19  8:41 ` Michael Haggerty
@ 2010-03-19 11:39   ` Michael Witten
  2010-03-19 11:45     ` david
  2010-03-19 14:08     ` Michael Haggerty
  0 siblings, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 11:39 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

On Fri, Mar 19, 2010 at 02:41, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Michael Witten wrote:
>> Rather than use a (name,email) pair to identify people, let's use
>> a (uuid,name,email) triplet.
>> [...]
>
> A UUID doesn't need to be a big hex number.  All it has to be is a
> "Universally Unique Identifier".  Like, oh, for example, your
>
>                   *** EMAIL ADDRESS ***
>
> [1].  There is even already a way to fix up mistakes or unavoidable
> email address changes, namely the .mailmap file.

*facepalm*

You've just repeated everything that I've said; go look at the rest of
the thread, where I spend plenty of time correcting the same hangups
about my choice of the word UUID and my use of hex digits.

I'm only observing that the current name/email pair conflates
an individual with his current email system and that it would be
worthwhile to ALLOW an individual to FURTHER describe himself by
including another piece of information that is solely meant as
identification within git. That piece of information could be whatever
a user deems to be uniquely identifying for himself. You could use
"Michael Haggerty <mhagger@alum.mit.edu>" as your uuid, and you could
still use it after you change the `email' config variable to something
else.

There is MUCH LESS CHANCE of such a uuid getting trashed by typos,
changing names, and changing email addresses; of course it can still
get messed up, but the rate at which something like .mailmap would
need to be updated would likely be greatly decreased and it would make
gathering statistics easier (especially for the individuals who take
advantage of such a uuid for describing themselves---and it only
requires setting one config variable to something easily remembered by
that person).

I cover all of this numerous times in numerous rebuttals; don't
contribute to a thread with more than 60 emails without having read at
least some of them. If you don't care to read so much, then perhaps
jump here:

    http://marc.info/?l=git&m=126894679711600&w=2

In the end, there is probably only one legitimate problem with my
proposal: It might break compatibility with older repo formats/tools.
I'm not sure about that.

Sincerely,
Michael Witten

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 11:39   ` Michael Witten
@ 2010-03-19 11:45     ` david
  2010-03-19 11:54       ` Mike Hommey
  2010-03-19 12:08       ` Michael Witten
  2010-03-19 14:08     ` Michael Haggerty
  1 sibling, 2 replies; 104+ messages in thread
From: david @ 2010-03-19 11:45 UTC (permalink / raw)
  To: Michael Witten; +Cc: Michael Haggerty, git

On Fri, 19 Mar 2010, Michael Witten wrote:

> On Fri, Mar 19, 2010 at 02:41, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> Michael Witten wrote:
>>> Rather than use a (name,email) pair to identify people, let's use
>>> a (uuid,name,email) triplet.
>>> [...]
>>
>> A UUID doesn't need to be a big hex number.  All it has to be is a
>> "Universally Unique Identifier".  Like, oh, for example, your
>>
>>                   *** EMAIL ADDRESS ***
>>
>> [1].  There is even already a way to fix up mistakes or unavoidable
>> email address changes, namely the .mailmap file.
>
> *facepalm*
>
> You've just repeated everything that I've said; go look at the rest of
> the thread, where I spend plenty of time correcting the same hangups
> about my choice of the word UUID and my use of hex digits.
>
> I'm only observing that the current name/email pair conflates
> an individual with his current email system and that it would be
> worthwhile to ALLOW an individual to FURTHER describe himself by
> including another piece of information that is solely meant as
> identification within git. That piece of information could be whatever
> a user deems to be uniquely identifying for himself. You could use
> "Michael Haggerty <mhagger@alum.mit.edu>" as your uuid, and you could
> still use it after you change the `email' config variable to something
> else.
>
> There is MUCH LESS CHANCE of such a uuid getting trashed by typos,
> changing names, and changing email addresses; of course it can still
> get messed up, but the rate at which something like .mailmap would
> need to be updated would likely be greatly decreased and it would make
> gathering statistics easier (especially for the individuals who take
> advantage of such a uuid for describing themselves---and it only
> requires setting one config variable to something easily remembered by
> that person).

here is where you are missing the point.

no, there is not 'much less chance' of it getting messed up.

you seem to assume that people would never need to set the UUID on 
multiple machines.

if they don't need to set it on multiple machines, then the e-mail/userid 
is going to be reliable anyway

if they do need to set it on multiple machines and can't be bothered to 
keep their e-mail consistent, why would they bother keeping this
additional thing consistent? Linus is pointing out that people don't care
now about their e-mail and name, and will care even less about some 
abstract UUID

people who care will already make their e-mail consistent.

David Lang


> I cover all of this numerous times in numerous rebuttals; don't
> contribute to a thread with more than 60 emails without having read at
> least some of them. If you don't care to read so much, then perhaps
> jump here:
>
>    http://marc.info/?l=git&m=126894679711600&w=2
>
> In the end, there is probably only one legitimate problem with my
> proposal: It might break compatibility with older repo formats/tools.
> I'm not sure about that.
>
> Sincerely,
> Michael Witten
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 11:45     ` david
@ 2010-03-19 11:54       ` Mike Hommey
  2010-03-19 12:09         ` Reece Dunn
  2010-03-19 12:09         ` Michael Witten
  2010-03-19 12:08       ` Michael Witten
  1 sibling, 2 replies; 104+ messages in thread
From: Mike Hommey @ 2010-03-19 11:54 UTC (permalink / raw)
  To: david; +Cc: git

On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@lang.hm wrote:
> here is where you are missing the point.
> 
> no, there is not 'much less chance' of it getting messed up.
> 
> you seem to assume that people would never need to set the UUID on
> multiple machines.
> 
> if they don't need to set it on multiple machines, then the
> e-mail/userid is going to be reliable anyway
> 
> if they do need to set it on multiple machines and can't be bothered
> to keep their e-mail consistent, why would they bother keeping this
> additional thing consistent? Linus is pointing out that people don't
> care now about their e-mail and name, and will care even less about
> some abstract UUID
> 
> people who care will already make their e-mail consistent.

While I don't agree with the need for that uuid thing, I'd like to
pinpoint that people who care can't necessarily make their e-mail
consistent. For example, Linus used to use an @osdl.org address, and
he now uses an @linux-foundation.org address. It's still the same Linus,
but the (name, email) pair has legitimately changed.

Mike

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 11:45     ` david
  2010-03-19 11:54       ` Mike Hommey
@ 2010-03-19 12:08       ` Michael Witten
  1 sibling, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 12:08 UTC (permalink / raw)
  To: david; +Cc: Michael Haggerty, git

On Fri, Mar 19, 2010 at 05:45,  <david@lang.hm> wrote:
> On Fri, 19 Mar 2010, Michael Witten wrote:
>
>> On Fri, Mar 19, 2010 at 02:41, Michael Haggerty <mhagger@alum.mit.edu>
>> wrote:
>>>
>>> Michael Witten wrote:
>>>>
>>>> Rather than use a (name,email) pair to identify people, let's use
>>>> a (uuid,name,email) triplet.
>>>> [...]
>>>
>>> A UUID doesn't need to be a big hex number.  All it has to be is a
>>> "Universally Unique Identifier".  Like, oh, for example, your
>>>
>>>                   *** EMAIL ADDRESS ***
>>>
>>> [1].  There is even already a way to fix up mistakes or unavoidable
>>> email address changes, namely the .mailmap file.
>>
>> *facepalm*
>>
>> You've just repeated everything that I've said; go look at the rest of
>> the thread, where I spend plenty of time correcting the same hangups
>> about my choice of the word UUID and my use of hex digits.
>>
>> I'm only observing that the current name/email pair conflates
>> an individual with his current email system and that it would be
>> worthwhile to ALLOW an individual to FURTHER describe himself by
>> including another piece of information that is solely meant as
>> identification within git. That piece of information could be whatever
>> a user deems to be uniquely identifying for himself. You could use
>> "Michael Haggerty <mhagger@alum.mit.edu>" as your uuid, and you could
>> still use it after you change the `email' config variable to something
>> else.
>>
>> There is MUCH LESS CHANCE of such a uuid getting trashed by typos,
>> changing names, and changing email addresses; of course it can still
>> get messed up, but the rate at which something like .mailmap would
>> need to be updated would likely be greatly decreased and it would make
>> gathering statistics easier (especially for the individuals who take
>> advantage of such a uuid for describing themselves---and it only
>> requires setting one config variable to something easily remembered by
>> that person).
>
> here is where you are missing the point.
>
> no, there is not 'much less chance' of it getting messed up.
>
> you seem to assume that people would never need to set the UUID on multiple
> machines.

I covered that in the first email, highlighting the importance of
using an easily remembered, already reasonably unique piece of
information (like a name/email pair) that you don't need to change.

> if they don't need to set it on multiple machines, then the e-mail/userid is
> going to be reliable anyway

The problem is that the name/email pair (as in the 'name' and 'email'
config variables) is NOT ONLY subject to typos, but it is ALSO subject
to changing email accounts and changing real life names.

If you don't use the uuid `field' that I propose, then everything
would be just like it was before. If you do use it, then you can
easily identify all of your own contributions regardless of what your
name/email du jour is.

> if they do need to set it on multiple machines and can't be bothered to keep
> their e-mail consistent, why would they bother keeping this additional thing
> consistent? Linus is pointing out that people don't care now about their
> e-mail and name, and will care even less about some abstract UUID

The user doesn't have a damn choice!

The email can't be kept consistent over time because the tools expect
it to be (and/or use it as) the actual working address used to
send/receive stuff. It's information that CONFLATES identity with whatever
tool/system you're using.

For instance, Michael Haggerty cannot reasonably use

    [user]
        name  = Michael Haggerty
        email = mhagger@MIT.EDU

because he likely no longer has that email account to use. He is
forced to change it and therefore forced to make his identity
confused.

I'm proposing ALLOWING him to say:

    [user]
        uuid  = Michael Haggerty <mhagger@MIT.EDU>
        name  = Michael Haggerty
        email = mhagger@ALUM.mit.edu

Heck, let's say he works at Red Hat as well; he might make some
commits under this config AT WORK:

    [user]
        uuid  = Michael Haggerty <mhagger@MIT.EDU>
        name  = Michael Haggerty
        email = mhagger@redhat.com

Then, he can make, say, commits to the Linux kernel repo for both work
and hobby related issues and still be recognized as the same person.
That is, he can have some commits under "Michael Haggerty
<mhagger@ALUM.mit.edu>" and other commits under "Michael Haggerty
<mhagger@redhat.com>" and still link them all together as the same
identity with just the uuid "Michael Haggerty <mhagger@MIT.EDU>".
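
The grouping this would allow can be sketched as follows. This is only an
illustration: git has no `uuid` field, and the commit records below are
hypothetical dictionaries, not anything git emits.

```python
from collections import defaultdict

# Hypothetical uuid value, taken from the config examples above.
UUID = "Michael Haggerty <mhagger@MIT.EDU>"

# Hypothetical commit records; "uuid" is the proposed optional field.
COMMITS = [
    {"id": "c1", "uuid": UUID, "name": "Michael Haggerty",
     "email": "mhagger@ALUM.mit.edu"},
    {"id": "c2", "uuid": UUID, "name": "Michael Haggerty",
     "email": "mhagger@redhat.com"},
    {"id": "c3", "uuid": None, "name": "Some Contributor",
     "email": "contributor@example.org"},
]


def identity_key(commit):
    """Use the uuid when present; otherwise fall back to (name, email)."""
    uuid = commit.get("uuid")
    return uuid if uuid else (commit["name"], commit["email"])


def group_by_identity(commits):
    """Map each identity key to the list of commit ids attributed to it."""
    groups = defaultdict(list)
    for commit in commits:
        groups[identity_key(commit)].append(commit["id"])
    return dict(groups)


if __name__ == "__main__":
    for key, ids in group_by_identity(COMMITS).items():
        print(key, ids)
```

Commits without the field behave exactly as they do today, which is the
"no worse off than before" property claimed for the proposal.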

Sincerely,
Michael Witten

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 11:54       ` Mike Hommey
@ 2010-03-19 12:09         ` Reece Dunn
  2010-03-19 12:16           ` Michael Witten
  2010-03-19 12:25           ` Jon Smirl
  2010-03-19 12:09         ` Michael Witten
  1 sibling, 2 replies; 104+ messages in thread
From: Reece Dunn @ 2010-03-19 12:09 UTC (permalink / raw)
  To: Mike Hommey; +Cc: david, git

On 19 March 2010 11:54, Mike Hommey <mh@glandium.org> wrote:
> On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@lang.hm wrote:
>> here is where you are missing the point.
>>
>> no, there is not 'much less chance' of it getting messed up.
>>
>> you seem to assume that people would never need to set the UUID on
>> multiple machines.
>>
>> if they don't need to set it on multiple machines, then the
>> e-mail/userid is going to be reliable anyway
>>
>> if they do need to set it on multiple machines and can't be bothered
>> to keep their e-mail consistent, why would they bother keeping this
>> additional thing consistent? Linus is pointing out that people don't
>> care now about their e-mail and name, and will care even less about
>> some abstract UUID
>>
>> people who care will already make their e-mail consistent.
>
> While I don't agree with the need for that uuid thing, I'd like to
> pinpoint that people who care can't necessarily make their e-mail
> consistent. For example, Linus used to use an @osdl.org address, and
> he now uses an @linux-foundation.org address. It's still the same Linus,
> but the (name, email) pair has legitimately changed.

So create an aliases list that maps one (name,email) to another that
is from the same person. There is no need for an additional item (a
uuid) to solve this problem. It also means that searching on any
(name,email) pair will find the others, so you only need to
remember/find one of the identities for the person you are interested
in finding the commits for.

AFAICS, mailmap is about correcting mistakes (primarily in the
reported name for a given email address). In this case, mailmap and
this aliases-map will work in conjunction with each other to give what
the original poster wanted. However, I haven't seen any of his replies
that answer this (or sufficiently address why mailmap does not solve
his problem).

- Reece

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 11:54       ` Mike Hommey
  2010-03-19 12:09         ` Reece Dunn
@ 2010-03-19 12:09         ` Michael Witten
  2010-03-22 12:06           ` Mark Brown
  2010-03-22 14:38           ` Michael Witten
  1 sibling, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 12:09 UTC (permalink / raw)
  To: Mike Hommey; +Cc: david, git

On Fri, Mar 19, 2010 at 05:54, Mike Hommey <mh@glandium.org> wrote:
> While I don't agree with the need for that uuid thing, I'd like to
> pinpoint that people who care can't necessarily make their e-mail
> consistent. For example, Linus used to use an @osdl.org address, and
> he now uses an @linux-foundation.org address. It's still the same Linus,
> but the (name, email) pair has legitimately changed.

Indeed.

This is because the name/email pair (as in the 'name' and 'email'
config variables) CONFLATES the idea of identity and current email
account.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:09         ` Reece Dunn
@ 2010-03-19 12:16           ` Michael Witten
  2010-03-19 12:18             ` Michael Witten
  2010-03-19 14:57             ` Reece Dunn
  2010-03-19 12:25           ` Jon Smirl
  1 sibling, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 12:16 UTC (permalink / raw)
  To: Reece Dunn; +Cc: Mike Hommey, david, git

On Fri, Mar 19, 2010 at 06:09, Reece Dunn <msclrhd@googlemail.com> wrote:
> On 19 March 2010 11:54, Mike Hommey <mh@glandium.org> wrote:
>> On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@lang.hm wrote:
>>> here is where you are missing the point.
>>>
>>> no, there is not 'much less chance' of it getting messed up.
>>>
>>> you seem to assume that people would never need to set the UUID on
>>> multiple machines.
>>>
>>> if they don't need to set it on multiple machines, then the
>>> e-mail/userid is going to be reliable anyway
>>>
>>> if they do need to set it on multiple machines and can't be bothered
>>> to keep their e-mail consistent, why would they bother keeping this
>>> additional thing consistent? Linus is pointing out that people don't
>>> care now about their e-mail and name, and will care even less about
>>> some abstract UUID
>>>
>>> people who care will already make their e-mail consistent.
>>
>> While I don't agree with the need for that uuid thing, I'd like to
>> pinpoint that people who care can't necessarily make their e-mail
>> consistent. For example, Linus used to use an @osdl.org address, and
>> he now uses an @linux-foundation.org address. It's still the same Linus,
>> but the (name, email) pair has legitimately changed.
>
> So create an aliases list that maps one (name,email) to another that
> is from the same person. There is no need for an additional item (a
> uuid) to solve this problem. It also means that searching on any
> (name,email) pair will find the others, so you only need to
> remember/find one of the identities for the person you are interested
> in finding the commits for.
>
> AFAICS, mailmap is about correcting mistakes (primarily in the
> reported name for a given email address). In this case, mailmap and
> this aliases-map will work in conjunction with each other to give what
> the original poster wanted. However, I haven't seen any of his replies
> that answer this (or sufficiently address why mailmap does not solve
> his problem).

See:

    http://marc.info/?l=git&m=126900051102958&w=2

The idea is to distribute the responsibility for maintaining a
consistent identity AND to make that responsibility EASY.

The extra uuid `field' can only suffer from typos, while the
name/email pair can suffer from typos, changing email accounts, and
changing real life names. If the uuid `field' does get bungled by a
typo or is not used, then we're no worse off than we were before.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:16           ` Michael Witten
@ 2010-03-19 12:18             ` Michael Witten
  2010-03-19 14:57             ` Reece Dunn
  1 sibling, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 12:18 UTC (permalink / raw)
  To: Reece Dunn; +Cc: Mike Hommey, david, git

On Fri, Mar 19, 2010 at 06:16, Michael Witten <mfwitten@gmail.com> wrote:
> The extra uuid `field' can only suffer from typos

I should add that because the uuid `field' would be typed pretty much
only as a config variable and then used by git tools from thenceforth,
the rate at which typos can occur is much less than for the name/email
pair.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:09         ` Reece Dunn
  2010-03-19 12:16           ` Michael Witten
@ 2010-03-19 12:25           ` Jon Smirl
  2010-03-19 12:40             ` Reece Dunn
  1 sibling, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-19 12:25 UTC (permalink / raw)
  To: Reece Dunn; +Cc: Mike Hommey, david, git

On Fri, Mar 19, 2010 at 8:09 AM, Reece Dunn <msclrhd@googlemail.com> wrote:
> On 19 March 2010 11:54, Mike Hommey <mh@glandium.org> wrote:
>> On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@lang.hm wrote:
>>> here is where you are missing the point.
>>>
>>> no, there is not 'much less chance' of it getting messed up.
>>>
>>> you seem to assume that people would never need to set the UUID on
>>> multiple machines.
>>>
>>> if they don't need to set it on multiple machines, then the
>>> e-mail/userid is going to be reliable anyway
>>>
>>> if they do need to set it on multiple machines and can't be bothered
>>> to keep their e-mail consistent, why would they bother keeping this
>>> additional thing consistent? Linus is pointing out that people don't
>>> care now about their e-mail and name, and will care even less about
>>> some abstract UUID
>>>
>>> people who care will already make their e-mail consistent.
>>
>> While I don't agree with the need for that uuid thing, I'd like to
>> pinpoint that people who care can't necessarily make their e-mail
>> consistent. For example, Linus used to use an @osdl.org address, and
>> he now uses an @linux-foundation.org address. It's still the same Linus,
>> but the (name, email) pair has legitimately changed.
>
> So create an aliases list that maps one (name,email) to another that
> is from the same person. There is no need for an additional item (a
> uuid) to solve this problem. It also means that searching on any
> (name,email) pair will find the others, so you only need to
> remember/find one of the identities for the person you are interested
> in finding the commits for.

git already supports aliases via the .mailmap file. Pick one
name/address pair that you like and then use .mailmap to map all of
the variations into the primary one. Granted, some git tools don't
process .mailmap, but it is easier to fix the tools than to create a
new ID system.

Look at the .mailmap in the current kernel tree. It fixes a few
problems. I have a much larger one that fixes most address issues.

You don't need to reimplement these aliases; they are already in git.
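
To make the .mailmap idea concrete, here is a rough Python sketch of how
such a file folds several (name,email) variations into one primary
identity. It is not git's implementation: it handles only the two most
common entry forms, the lower-casing of addresses is a simplifying
assumption, and the sample entries are hypothetical (the addresses are
the ones from the shortlog output at the top of the thread).

```python
import re

# Simplified sketch: only these two .mailmap entry forms are handled:
#   Proper Name <proper@email> <commit@email>
#   Proper Name <proper@email> Commit Name <commit@email>
ENTRY = re.compile(
    r"^(?P<name>[^<]+?)\s*<(?P<email>[^>]+)>\s*"
    r"(?P<old_name>[^<]*?)\s*<(?P<old_email>[^>]+)>\s*$"
)


def parse_mailmap(text):
    """Map (old_name or None, old_email) -> (primary_name, primary_email)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = ENTRY.match(line)
        if m:
            old_name = m.group("old_name") or None
            key = (old_name, m.group("old_email").lower())
            mapping[key] = (m.group("name"), m.group("email"))
    return mapping


def canonical(mapping, name, email):
    """Return the primary identity, or the input unchanged if unmapped."""
    key_email = email.lower()
    return (
        mapping.get((name, key_email))      # exact (name, email) entry
        or mapping.get((None, key_email))   # email-only entry
        or (name, email)
    )


# Hypothetical .mailmap content for illustration only.
SAMPLE = """
# old addresses folded into one primary identity
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@osdl.org>
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@g5.osdl.org>
"""

mailmap = parse_mailmap(SAMPLE)
print(canonical(mailmap, "Linus Torvalds", "torvalds@g5.osdl.org"))
```

Searching on any of the old addresses then resolves to the same primary
pair, which is the property being asked for, without a new field in the
commit object.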


>
> AFAICS, mailmap is about correcting mistakes (primarily in the
> reported name for a given email address). In this case, mailmap and
> this aliases-map will work in conjunction with each other to give what
> the original poster wanted. However, I haven't seen any of his replies
> that answer this (or sufficiently address why mailmap does not solve
> his problem).
>
> - Reece



-- 
Jon Smirl
jonsmirl@gmail.com


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 21:57             ` Michael Witten
@ 2010-03-19 12:34               ` Paolo Bonzini
  2010-03-19 12:43                 ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Paolo Bonzini @ 2010-03-19 12:34 UTC (permalink / raw)
  To: Michael Witten
  Cc: Martin Langhoff, tytso, Linus Torvalds, Nicolas Pitre,
	Wincent Colaiuta, git

On 03/18/2010 10:57 PM, Michael Witten wrote:
> On Thu, Mar 18, 2010 at 16:39, Martin Langhoff
> <martin.langhoff@gmail.com>  wrote:
>> On Thu, Mar 18, 2010 at 5:29 PM, Michael Witten<mfwitten@gmail.com>  wrote:
>>> On Thu, Mar 18, 2010 at 16:19, Martin Langhoff
>>>> What's the value? For me it'll be "Martin Langhoff". I already have that.
>>>
>>> Well, that's rather egotistical considering you're probably not the
>>> only Martin Langhoff in this world. I'd advocate something like
>>> "Martin Langhoff<martin.langhoff@gmail.com>".
>>
>> So you are saying we should change the core datamodel of git to say...
>> what we already can say?
>
> You see, Martin, you might want/need to stop using "Martin Langhoff
> <martin.langhoff@gmail.com>" as your email account, but there's no
> reason why you can't continue to use it for your UUID.

While a gnu.org or gmail.com address will (most likely) stay with a person
forever, hindsight is 20/20 and many people may generate their UUID from a
work email.  So, suppose I base my UUID on <pbonzini@redhat.com>: what
will guarantee that in 20 years I won't have found a new career as a
bartender, and that Red Hat won't hire someone with my same name and give
him the same email address?

Heck, some people use gmail only for their personal email, and they 
rightly cannot be bothered to create another account to solve a problem 
they don't understand and they probably do not have.

For the UUID to make sense, it would need to be what the acronym says:
universally unique.  A SHA-1 value is _not_ universally unique; it is
just the output of a one-way function, so identical input always hashes
to the identical value.  There are tons of git repos out there with a
blob hashing to e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 or
257cc5642cb1a054f08cc83f2d943e56fd3ebe99.
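
For readers wondering why those hashes are so common: a blob's id
depends only on its content, so everyone who stores the same bytes gets
the same SHA-1. A small Python sketch, assuming only the standard blob
encoding (the header "blob <size>" plus a NUL byte, followed by the
content); the helper name is made up for illustration. The first hash
above is the empty blob; the second appears to be the four bytes
"foo\n", which the second print lets you check.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """SHA-1 git assigns to a blob: sha1(b'blob <size>\\0' + content)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Identical content always gives the identical id, on every machine.
print(git_blob_sha1(b""))       # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_sha1(b"foo\n"))  # compare with the second hash quoted above
```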

I have an idea.  Start your own website uuidemail.com.  One registers 
and gets an alias for their email, something like 
8aacc35ffca0d34fccf8a750e84e3a81bdcb940b@uuidemail.com.  Then people can 
start using 
8aacc35ffca0d34fccf8a750e84e3a81bdcb940b+pbonzini--redhat.com@uuidemail.com 
as their git user.email.  I bet nobody will.

Paolo

ps: Yes, in a perfect world it would be nice for people to know that I 
am the same person independent of whether I contribute as 
bonzini@gnu.org or pbonzini@redhat.com.  But we're not in a perfect 
world, so amen.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:25           ` Jon Smirl
@ 2010-03-19 12:40             ` Reece Dunn
  0 siblings, 0 replies; 104+ messages in thread
From: Reece Dunn @ 2010-03-19 12:40 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Mike Hommey, david, git

On 19 March 2010 12:25, Jon Smirl <jonsmirl@gmail.com> wrote:
> On Fri, Mar 19, 2010 at 8:09 AM, Reece Dunn <msclrhd@googlemail.com> wrote:
>> On 19 March 2010 11:54, Mike Hommey <mh@glandium.org> wrote:
>>> On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@lang.hm wrote:
>>>> here is where you are missing the point.
>>>>
>>>> no, there is not 'much less chance' of it getting messed up.
>>>>
>>>> you seem to assume that people would never need to set the UUID on
>>>> multiple machines.
>>>>
>>>> if they don't need to set it on multiple machines, then the
>>>> e-mail/userid is going to be reliable anyway
>>>>
>>>> if they do need to set it on multiple machines and can't be bothered
>>>> to keep their e-mail consistent, why would they bother keeping this
>>>> additional thing consistent? Linus is pointing out that people don't
>>>> care now about their e-mail and name, and will care even less about
>>>> some abstract UUID
>>>>
>>>> people who care will already make their e-mail consistent.
>>>
>>> While I don't agree with the need for that uuid thing, I'd like to
>>> point out that people who care can't necessarily make their e-mail
>>> consistent. For example, Linus used to use an @osdl.org address, and
>>> he now uses an @linux-foundation.org address. It's still the same Linus,
>>> but the (name, email) pair has legitimately changed.
>>
>> So create an aliases list that maps one (name,email) to another that
>> is from the same person. There is no need for an additional item (a
>> uuid) to solve this problem. It also means that searching on any
>> (name,email) pair will find the others, so you only need to
>> remember/find one of the identities for the person you are interested
>> in finding the commits for.
>
> git already supports aliases via the .mailmap file. Pick one
> name/address pair that you like and then use .mailmap to map all of
> the variations into the primary one. Granted, some git tools don't
> process .mailmap, but it is easier to fix the tools than to create a
> new ID system.
>
> Look at the .mailmap in the current kernel tree. It fixes a few
> problems. I have a much larger one that fixes most address issues.
>
> You don't need to reimplement these aliases; they are already in git.

Indeed. I wasn't aware that mailmap catered for this as well.

- Reece


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:34               ` Paolo Bonzini
@ 2010-03-19 12:43                 ` Michael Witten
  2010-03-19 12:53                   ` Paolo Bonzini
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-19 12:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Martin Langhoff, tytso, Linus Torvalds, Nicolas Pitre,
	Wincent Colaiuta, git

On Fri, Mar 19, 2010 at 06:34, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 03/18/2010 10:57 PM, Michael Witten wrote:
>>
>> On Thu, Mar 18, 2010 at 16:39, Martin Langhoff
>> <martin.langhoff@gmail.com>  wrote:
>>>
>>> On Thu, Mar 18, 2010 at 5:29 PM, Michael Witten<mfwitten@gmail.com>
>>>  wrote:
>>>>
>>>> On Thu, Mar 18, 2010 at 16:19, Martin Langhoff
>>>>>
>>>>> What's the value? For me it'll be "Martin Langhoff". I already have
>>>>> that.
>>>>
>>>> Well, that's rather egotistical considering you're probably not the
>>>> only Martin Langhoff in this world. I'd advocate something like
>>>> "Martin Langhoff<martin.langhoff@gmail.com>".
>>>
>>> So you are saying we should change the core datamodel of git to say...
>>> what we already can say?
>>
>> You see, Martin, you might want/need to stop using "Martin Langhoff
>> <martin.langhoff@gmail.com>" as your email account, but there's no
>> reason why you can't continue to use it for your UUID.
>
> While a gnu.org or gmail.com address will (most likely) stay with a person
> forever, hindsight is 20/20 and many people may generate their UUID from a
> work email.  So, suppose I base my UUID on <pbonzini@redhat.com>: what
> will guarantee that in 20 years I won't have found a new career as a
> bartender, and that Red Hat won't hire someone with my same name and give
> him the same email address?

Firstly, the UUID need not be a name/email pair.

Secondly, you're being ridiculous; even if that ridiculous scenario
played out not-infrequently, there would still be less identity
confusion in git repos over time, because changes of real-life names
and of email accounts do happen frequently and are not ridiculous
events.

> Heck, some people use gmail only for their personal email, and they rightly
> cannot be bothered to create another account to solve a problem they don't
> understand and they probably do not have.

This doesn't make any sense. Why does anybody need to create another
account? Are you still confused about what a uuid is in this context?

> For the UUID to make sense, it would need to be what the acronym says:
> universally unique.  An SHA-1 value is _not_ universally unique, it is just
> a one-way function.  There are tons of git repos out there with a blob
> hashing to e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 or
> 257cc5642cb1a054f08cc83f2d943e56fd3ebe99.

The SHA-1 is supposed to be an optimization; it's not essential, as
I've already explained; I also get the feeling that you're being
ridiculous again. In particular, I don't see your point.

> I have an idea.  Start your own website uuidemail.com.  One registers and
> gets an alias for their email, something like
> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b@uuidemail.com.  Then people can
> start using
> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b+pbonzini--redhat.com@uuidemail.com
> as their git user.email.  I bet nobody will.

This is nonsense that betrays your misunderstanding.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:43                 ` Michael Witten
@ 2010-03-19 12:53                   ` Paolo Bonzini
  2010-03-19 13:03                     ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Paolo Bonzini @ 2010-03-19 12:53 UTC (permalink / raw)
  To: Michael Witten
  Cc: Martin Langhoff, tytso, Linus Torvalds, Nicolas Pitre,
	Wincent Colaiuta, git


>> While a gnu.org or gmail.com address will (most likely) stay with a
>> person forever, hindsight is 20/20 and many people may generate
>> their UUID from a work email.  So, suppose I base my UUID on
>> <pbonzini@redhat.com>: what will guarantee that in 20 years I won't
>> have found a new career as a bartender, and that Red Hat won't hire
>> someone with my same name and give him the same email address?
>
> Firstly, the UUID need not be a name/email pair.

That's what you most recently proposed generating it from.

> Secondly, you're being ridiculous; even if that ridiculous scenario
> played out not-infrequently

It's not a matter of frequency.  If you want a "UU" identification,
collisions must not even happen *once*.

>> I have an idea.  Start your own website uuidemail.com.  One
>> registers and gets an alias for their email, something like
>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b@uuidemail.com.  Then
>> people can start using
>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b+pbonzini--redhat.com@uuidemail.com
>> as their git user.email.  I bet nobody will.
>
> This is nonsense that betrays your misunderstanding.

Why?  What does (name, email, uuid) provide over (name, concat(uuid, 
email))?  Nothing.

But the point is, neither really provides anything over (name, email).

Paolo


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:53                   ` Paolo Bonzini
@ 2010-03-19 13:03                     ` Michael Witten
  2010-03-19 13:08                       ` Paolo Bonzini
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-19 13:03 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: git

On Fri, Mar 19, 2010 at 06:53, Paolo Bonzini <bonzini@gnu.org> wrote:
>
>>> While a gnu.org or gmail.com address will (most likely) stay with a
>>> person forever, hindsight is 20/20 and many people may generate
>>> their UUID from a work email.  So, suppose I base my UUID on
>>> <pbonzini@redhat.com>: what will guarantee that in 20 years I won't
>>> have found a new career as a bartender, and that Red Hat won't hire
>>> someone with my same name and give him the same email address?
>>
>> Firstly, the UUID need not be a name/email pair.
>
> That's what you most recently proposed generating it from.

No. Please go read.

>> Secondly, you're being ridiculous; even if that ridiculous scenario
>> played out not-infrequently
>
> It's not a matter of frequency.  If you want a "UU" identification,
> collisions must not even happen *once*.

I've got news for you. The UUIDs generated by uuidgen CAN collide:

    The new UUID can reasonably be considered unique
    among all UUIDs created on the local system, and
    among UUIDs created on other systems in the past
    and in the future.

You're creating a straw man argument; conceptually, what I propose is
better than what the current system provides because it would decrease
the rate at which identity entropy increases.
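
The two kinds of identifier being argued about here behave differently,
and a short Python sketch may help separate them. It uses the standard
`uuid` module; the seed string and the helper name are hypothetical.
A name-based (version 5) UUID is a SHA-1 of a namespace plus a name, so
it is only as unique as the seed people pick, whereas a random
(version 4) UUID can collide in principle but only with negligible
probability.

```python
import uuid


def name_based_id(seed: str) -> uuid.UUID:
    """Version 5 UUID: derived from SHA-1(namespace + seed), deterministic."""
    return uuid.uuid5(uuid.NAMESPACE_URL, seed)


# Two people (or one person, twenty years apart) choosing the same seed
# end up with the same identifier.
a = name_based_id("mailto:pbonzini@redhat.com")
b = name_based_id("mailto:pbonzini@redhat.com")
print(a == b)  # True

# Random UUIDs do not depend on any seed; a collision is possible in
# theory but astronomically unlikely.
r1, r2 = uuid.uuid4(), uuid.uuid4()
print(r1.version, r2.version)  # 4 4
```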

>>> I have an idea.  Start your own website uuidemail.com.  One
>>> registers and gets an alias for their email, something like
>>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b@uuidemail.com.  Then
>>> people can start using
>>>
>>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b+pbonzini--redhat.com@uuidemail.com
>>> as their git user.email.  I bet nobody will.
>>
>> This is nonsense that betrays your misunderstanding.
>
> Why?  What does (name, email, uuid) provide over (name, concat(uuid,
> email))?  Nothing.

Go read the thread until you understand.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 13:03                     ` Michael Witten
@ 2010-03-19 13:08                       ` Paolo Bonzini
  2010-03-19 13:13                         ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Paolo Bonzini @ 2010-03-19 13:08 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

On 03/19/2010 02:03 PM, Michael Witten wrote:

>>> Secondly, you're being ridiculous; even if that ridiculous scenario
>>> played out not-infrequently
>>
>> It's not a matter of frequency.  If you want a "UU" identification,
>> collisions must not even happen *once*.
>
> I've got news for you. The UUIDs generated by uuidgen CAN collide:
>
>      The new UUID can reasonably be considered unique
>      among all UUIDs created on the local system, and
>      among UUIDs created on other systems in the past
>      and in the future.

Please read the UUID generation algorithm.

> You're creating a straw man argument; conceptually, what I propose is
> better than what the current system provides because it would decrease
> the rate at which identity entropy increases.

Maybe you have to define entropy.  For human consumers, "Paolo Bonzini 
<pbonzini@redhat.com>" has considerably less "entropy" than 
8aacc35ffca0d34fccf8a750e84e3a81bdcb940b, as does even "Paolo Bonzini 
<bonzini@gnu.org, pbonzini@redhat.com>".  For non-human consumers, a 
good mailmap will do.

>>>> I have an idea.  Start your own website uuidemail.com.  One
>>>> registers and gets an alias for their email, something like
>>>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b@uuidemail.com.  Then
>>>> people can start using
>>>>
>>>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b+pbonzini--redhat.com@uuidemail.com
>>>> as their git user.email.  I bet nobody will.
>>>
>>> This is nonsense that betrays your misunderstanding.
>>
>> Why?  What does (name, email, uuid) provide over (name, concat(uuid,
>> email))?  Nothing.
>
> Go read the thread until you understand.

I am not alone.

Paolo


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 13:08                       ` Paolo Bonzini
@ 2010-03-19 13:13                         ` Michael Witten
  2010-03-19 13:41                           ` Wincent Colaiuta
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-19 13:13 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: git

On Fri, Mar 19, 2010 at 07:08, Paolo Bonzini <bonzini@gnu.org> wrote:
> Maybe you have to define entropy.  For human consumers, "Paolo Bonzini
> <pbonzini@redhat.com>" has considerably less "entropy" than
> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b, as does even "Paolo Bonzini
> <bonzini@gnu.org, pbonzini@redhat.com>".  For non-human consumers, a good
> mailmap will do.

As I've stated before many times, the SHA-1 is not necessary to the proposal.

Please go read.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 13:13                         ` Michael Witten
@ 2010-03-19 13:41                           ` Wincent Colaiuta
  2010-03-19 13:59                             ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Wincent Colaiuta @ 2010-03-19 13:41 UTC (permalink / raw)
  To: Michael Witten; +Cc: Paolo Bonzini, git

El 19/03/2010, a las 14:13, Michael Witten escribió:

> On Fri, Mar 19, 2010 at 07:08, Paolo Bonzini <bonzini@gnu.org> wrote:
>> Maybe you have to define entropy.  For human consumers, "Paolo Bonzini
>> <pbonzini@redhat.com>" has considerably less "entropy" than
>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b, as does even "Paolo Bonzini
>> <bonzini@gnu.org, pbonzini@redhat.com>".  For non-human consumers, a good
>> mailmap will do.
> 
> As I've stated before many times, the SHA-1 is not necessary to the proposal.
> 
> Please go read.

Stop telling people to go read your idiotic proposal. It has _already_ been read with great attention, and multiple people have shown immense patience repeatedly explaining to you why the idea is stupid. Your continued trolling is really starting to grate.

The overwhelming, sustained opposition to your idea should already be enough indication that such a proposal will _never_ be accepted into the Git codebase, so right now you're just wasting people's time.

w


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 13:41                           ` Wincent Colaiuta
@ 2010-03-19 13:59                             ` Michael Witten
  2010-03-19 14:13                               ` Martin Langhoff
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-19 13:59 UTC (permalink / raw)
  To: Wincent Colaiuta; +Cc: Paolo Bonzini, git

On Fri, Mar 19, 2010 at 07:41, Wincent Colaiuta <win@wincent.com> wrote:
> El 19/03/2010, a las 14:13, Michael Witten escribió:
>
>> On Fri, Mar 19, 2010 at 07:08, Paolo Bonzini <bonzini@gnu.org> wrote:
>>> Maybe you have to define entropy.  For human consumers, "Paolo Bonzini
>>> <pbonzini@redhat.com>" has considerably less "entropy" than
>>> 8aacc35ffca0d34fccf8a750e84e3a81bdcb940b, as does even "Paolo Bonzini
>>> <bonzini@gnu.org, pbonzini@redhat.com>".  For non-human consumers, a good
>>> mailmap will do.
>>
>> As I've stated before many times, the SHA-1 is not necessary to the proposal.
>>
>> Please go read.
>
> Stop telling people to go read your idiotic proposal. It has _already_ been read with great attention, and multiple people have shown immense patience repeatedly explaining to you why the idea is stupid. Your continued trolling is really starting to grate.

I've shown immense patience repeatedly explaining why these
'explanations' are strawmen or based on misunderstandings and bad
assumptions.

It's true that I have been receiving perfectly valid complaints. The
problem is that almost all of them have nothing to do with what I've
been saying because people see 'uuid' and a few examples with hex
digits and then erroneously construct the rest in their heads.

> The overwhelming, sustained opposition to your idea should already be enough indication that such a proposal will _never_ be accepted into the Git codebase, so right now you're just wasting people's time.

I long ago gave up the notion that it would be included in the git codebase.

Instead, I've been defending the idea, which is a simple but vast
improvement over the current system; had it been in place since the
beginning, a lot of trouble could have been reduced.

Indeed, the only thing that makes this great idea a bad idea is
COMPATIBILITY CONCERNS; that's it.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 11:39   ` Michael Witten
  2010-03-19 11:45     ` david
@ 2010-03-19 14:08     ` Michael Haggerty
  2010-03-19 17:02       ` david
  1 sibling, 1 reply; 104+ messages in thread
From: Michael Haggerty @ 2010-03-19 14:08 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

Michael Witten wrote:
> On Fri, Mar 19, 2010 at 02:41, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> Michael Witten wrote:
>>> Rather than use a (name,email) pair to identify people, let's use
>>> a (uuid,name,email) triplet.
>>> [...]
>> A UUID doesn't need to be a big hex number.  All it has to be is a
>> "Universally Unique Identifier".  Like, oh, for example, your
>>
>>                   *** EMAIL ADDRESS ***
>>
>> [1].  There is even already a way to fix up mistakes or unavoidable
>> email address changes, namely the .mailmap file.
> 
> *facepalm*
> 
> You've just repeated everything that I've said; go look at the rest of
> the thread, where I spend plenty of time correcting the same hangups
> about my choice of the word UUID and my use of hex digits.

No, my point is to use the *existing* email address as the UUID
*without* adding another field.  Nothing needs to be changed!

> [...] You could use
> "Michael Haggerty <mhagger@alum.mit.edu>" as your uuid, and you could
> still use it after you change the `email' config variable to something
> else.

Give me a break.  It's not so damn hard to keep an email address over
time.  And if it changes, I can update the .mailmap file to map my old
email address to the new one and *presto* I have a new, equally valid
UUID that I can continue to commit under.

> I cover all of this numerous times in numerous rebuttals; don't
> contribute to a thread with more than 60 emails without having read at
> least some of them.

Wrong.  I've read the whole idiotic thread.  To prove it I'll summarize
it for you: you argue the same point over and over again while ignoring
the legitimate objections of just about every other participant.

Adding a new UUID field is obviously a non-starter, so I suggested a way
to get the same (very marginal) benefit from the fields that are already
present in every git repository.

Michael


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 13:23 What's in a name? Let's use a (uuid,name,email) triplet Michael Witten
                   ` (5 preceding siblings ...)
  2010-03-19  8:41 ` Michael Haggerty
@ 2010-03-19 14:08 ` Jakub Narebski
  2010-03-19 14:33   ` Jon Smirl
  2010-03-19 14:40   ` Michael Witten
  6 siblings, 2 replies; 104+ messages in thread
From: Jakub Narebski @ 2010-03-19 14:08 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

Michael Witten <mfwitten@gmail.com> writes:

> Short Version:
> -------------
> 
> 
> Rather than use a (name,email) pair to identify people, let's use
> a (uuid,name,email) triplet.
> 
> The uuid can be any piece of information that a user of git determines
> to be reasonably unique across space and time and that is intended to
> be used by that user virtually forever (at least within a project's
> history).
> 
> For instance, the uuid could be an OSF DCE 1.1 UUID or the SHA-1 of
> some easily remembered, already reasonably unique information.

... or 'canonical-name canonical-email' pair.

> 
> This could really help keep identifications clean, and it is rather
> straightforward and possibly quite efficient.
> 
> 
> Long Version:
> ------------
[...]

> While git's use of (name,email) pairs to identify each person is
> extremely practical, it turns out that it's rather `unstable';

This is a non-solution to a non-problem.

First, user.name and user.email do not need to be the name and email
from some email account.  They might be some "canonical name" and
"canonical email".

Second, there are (I think) two main sources of 'instability' in
(name,email) pairs, namely A) misconfigured git (when fetching/pushing
using git itself), and B) a wrong name in the email etc. (when sending
patches via email, as with 80% of patches in the Linux kernel's case).

In the case of misconfigured git (case A), using a UUID wouldn't help,
and would only make it worse (you would have to configure the same UUID
on each machine).  What would help here is for git to be more strict and
perhaps forbid (some of) the autogenerated names and emails.

In the case of sending patches via email (case B), you can use an in-body
'From:' line to provide a (name,email) pair that differs from the account
used to send the email.  With a UUID you would need the same: some way to
provide the UUID in the patch (in the email).  A UUID has the disadvantage
of being required even when the (name,email) in the From: email header is
a good user ID.  So a UUID wouldn't help there either.
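
As an illustration of the in-body 'From:' convention described above,
here is a minimal Python sketch of how a patch-applying tool might
choose the author. It is not how git itself parses mail; it assumes a
plain, non-multipart message, and the function name and sample
addresses are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr


def patch_author(raw: str):
    """Pick the author of an emailed patch.

    Starts from the From: header; if the first non-blank line of the
    body is itself a 'From:' line, that identity wins instead.
    """
    msg = message_from_string(raw)
    name, addr = parseaddr(msg.get("From", ""))
    body = msg.get_payload()  # a str for simple, non-multipart mail
    for line in body.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith("From:"):
            name, addr = parseaddr(line[len("From:"):])
        break  # only the first non-blank body line is considered
    return name, addr


RAW = (
    "From: Some Maintainer <maintainer@example.org>\n"
    "Subject: [PATCH] fix thing\n"
    "\n"
    "From: Real Author <author@example.com>\n"
    "\n"
    "Fix the thing.\n"
)

print(patch_author(RAW))  # ('Real Author', 'author@example.com')
```

A uuid would need an equivalent channel in the email, which is the
extra burden the paragraph above is pointing at.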


What could help in both cases is .mailmap being used (perhaps on
demand) in more git commands.  See Documentation/mailmap.txt
or e.g. the git-shortlog(1) manpage.  It is quite an advanced tool for
correcting mistakes (it can correct not only the user name, which is
the most common usage, but also the email address).

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 13:59                             ` Michael Witten
@ 2010-03-19 14:13                               ` Martin Langhoff
  0 siblings, 0 replies; 104+ messages in thread
From: Martin Langhoff @ 2010-03-19 14:13 UTC (permalink / raw)
  To: Michael Witten; +Cc: Wincent Colaiuta, Paolo Bonzini, git

On Fri, Mar 19, 2010 at 9:59 AM, Michael Witten <mfwitten@gmail.com> wrote:
> I've shown immense patience repeatedly explaining

No, you haven't. _You_ are misunderstanding.

We have what you want: email + name, and a mapping mechanism (mailmap)
to cope with variations. It is good enough.

> Indeed, the only thing that makes this great idea a bad idea is
> COMPATIBILITY CONCERNS; that's it.

Good... at last! But don't put ALL CAPS when you are in the wrong,
mate. And wasting a lot of people's time.



m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:08 ` Jakub Narebski
@ 2010-03-19 14:33   ` Jon Smirl
  2010-03-19 14:52     ` Michael J Gruber
  2010-03-19 14:40   ` Michael Witten
  1 sibling, 1 reply; 104+ messages in thread
From: Jon Smirl @ 2010-03-19 14:33 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Michael Witten, git

On Fri, Mar 19, 2010 at 10:08 AM, Jakub Narebski <jnareb@gmail.com> wrote:
> Michael Witten <mfwitten@gmail.com> writes:
>
>> Short Version:
>> -------------
>>
>>
>> Rather than use a (name,email) pair to identify people, let's use
>> a (uuid,name,email) triplet.
>>
>> The uuid can be any piece of information that a user of git determines
>> to be reasonably unique across space and time and that is intended to
>> be used by that user virtually forever (at least within a project's
>> history).
>>
>> For instance, the uuid could be an OSF DCE 1.1 UUID or the SHA-1 of
>> some easily remembered, already reasonably unique information.
>
> ... or 'canonical-name canonical-email' pair.
>
>>
>> This could really help keep identifications clean, and it is rather
>> straightforward and possibly quite efficient.
>>
>>
>> Long Version:
>> ------------
> [...]
>
>> While git's use of (name,email) pairs to identify each person is
>> extremely practical, it turns out that it's rather `unstable';
>
> This is a non-solution to a non-problem.
>
> First, user.name and user.email do not need to be the name and email
> from some email account.  They might be some "canonical name" and
> "canonical email".
>
> Second, there are (I think) two main sources of 'instability' in
> (name,email) pairs, namely A) misconfigured git (when fetching/pushing
> using git itself), and B) a wrong name in the email etc. (when sending
> patches via email, as with 80% of patches in the Linux kernel's case).

Another top source is mangling of non-ASCII charsets when they go
through the email system. Are the git workflow tools safe for
alternative charsets? Do the email tools look at the charset header of
the email message? Check people's names in the kernel commits and
you'll find lots of examples of this type of mangling.

Or people not using UTF-8. There are files in the kernel where
people's names are in conflicting codepages. Should git try to look
for diffs that aren't UTF-8?
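
A small Python sketch of the kind of mangling meant here, under the
assumption that the damage comes from UTF-8 bytes being read as
Latin-1 somewhere along the way (other codepage mix-ups look similar).
The helper names are made up; the sample word is the "escribió" that
appears in an attribution line elsewhere in this thread.

```python
def mangle(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def unmangle(garbled: str) -> str:
    """Undo mangle(); only works if no bytes were lost on the way."""
    return garbled.encode("latin-1").decode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Cheap check a tool could run before trusting a name or a diff."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


original = "escribió"
garbled = mangle(original)
print(garbled)                        # escribiÃ³
print(unmangle(garbled) == original)  # True

# The same word stored as Latin-1 bytes is not valid UTF-8 at all.
print(is_valid_utf8("escribió".encode("latin-1")))  # False
```

Each mangled spelling of a name shows up as yet another distinct
(name,email) pair in the history.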

>
> In the case of misconfigured git (case A), using a UUID wouldn't help,
> and would only make it worse (you would have to configure the same UUID
> on each machine).  What would help here is for git to be more strict and
> perhaps forbid (some of) the autogenerated names and emails.
>
> In the case of sending patches via email (case B), you can use an in-body
> 'From:' line to provide a (name,email) pair that differs from the account
> used to send the email.  With a UUID you would need the same: some way to
> provide the UUID in the patch (in the email).  A UUID has the disadvantage
> of being required even when the (name,email) in the From: email header is
> a good user ID.  So a UUID wouldn't help there either.
>
>
> What could help in both cases is .mailmap being used (perhaps on
> demand) in more git commands.  See Documentation/mailmap.txt
> or e.g. the git-shortlog(1) manpage.  It is quite an advanced tool for
> correcting mistakes (it can correct not only the user name, which is
> the most common usage, but also the email address).
>
> --
> Jakub Narebski
> Poland
> ShadeHawk on #git



-- 
Jon Smirl
jonsmirl@gmail.com


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:08 ` Jakub Narebski
  2010-03-19 14:33   ` Jon Smirl
@ 2010-03-19 14:40   ` Michael Witten
  2010-03-19 14:56     ` Erik Faye-Lund
                       ` (2 more replies)
  1 sibling, 3 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 14:40 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Fri, Mar 19, 2010 at 08:08, Jakub Narebski <jnareb@gmail.com> wrote:
> Michael Witten <mfwitten@gmail.com> writes:
>
>> Short Version:
>> -------------
>>
>>
>> Rather than use a (name,email) pair to identify people, let's use
>> a (uuid,name,email) triplet.
>>
>> The uuid can be any piece of information that a user of git determines
>> to be reasonably unique across space and time and that is intended to
>> be used by that user virtually forever (at least within a project's
>> history).
>>
>> For instance, the uuid could be an OSF DCE 1.1 UUID or the SHA-1 of
>> some easily remembered, already reasonably unique information.
>
> ... or 'canonical-name canonical-email' pair.
>
>>
>> This could really help keep identifications clean, and it is rather
>> straightforward and possibly quite efficient.
>>
>>
>> Long Version:
>> ------------
> [...]
>
>> While git's use of (name,email) pairs to identify each person is
>> extremely practical, it turns out that it's rather `unstable';
>
> This is non-solution to non-problem.
>
> First, the user.name and user.email does not need to be name and email
> from some email account.  It might be some "canonical name" and
> "canonical email".

The vast majority of patches come in through email; the git tools
expect the user.name and user.email to reflect physical email account
information.

You would be correct if it were not for the fact that git currently
conflates identity and current email system.

> Second, there are (I think) two main sources of 'unstability' in
> (name,email) pairs, namely A) misconfigured git (when fetching/pushing
> using git itself), B) wrong name in email etc. (when sending patches
> via email, 80% of patches in Linux kernel case).
>
> In the case of misconfigured git (case A) using UUID wouldn't help,
> and only make it worse (you would have to configure the same UUID on
> each machine).  What would help here is for git to be more strict and
> perhaps forbid (some of) autogenerated names and emails.

The uuid string would be typed pretty much only during configuration;
from there, it's basically just handled by the git tools. Hence, the
uuid can indeed suffer from typos, but the name/email pair can suffer
from not only typos but also real life name changing and email account
switching.

There would still be the same problem of variations in uuid for one
person, but the problem would very likely be greatly reduced; if a
person doesn't use the uuid properly or at all, then we're in the
exact same situation we were before. Those who do use it, though, will
be much better off.

Strictness about names and emails is difficult, and keeping something
like the current .mailmap file up-to-date is a centralized process.
The uuid field would distribute the responsibility of maintaining
identity and make that responsibility easy because the user-chosen
string is easy for that user to remember and is typed only very
occasionally and under very specific circumstances.

> In the case of sending patches via email, you can use in-body 'From:'
> to provide (name,email) part that is different than account used to
> send email.

That's a good solution that I've considered, except for 2 reasons:

    * It involves many more opportunities for typos and/or the
      configuration of a non-git tool for a git-specific purpose.

    * Many if not most email services will refuse to send messages
      with forged/spoofed email addresses.

> In the case of UUID you would need the same: some way to
> provide UUID in patch (in email).

Yes, but that's automated by tools like git's format-patch. Not using
something like format-patch or some other git interface is an
'out-of-band' communication and that author has essentially chosen not
to care about his identity.

The use of the uuid field, with the git tools handling it, is just
a way for a person who does care about his identity to keep it
consistent.

> UUID has the disadvantage of being
> required also when (name,email) in From: email header is good user ID.
> So UUID wouldn't help there either.

It's not a good user id because it depends on factors other than identity.

> What could help in both cases is .mailmap being used (perhaps on
> demand) in more git commands.  See Documentation/mailmap.txt
> or e.g. git-shortlog(1) manpage.  It is quite advanced tool for
> correcting mistakes (it can correct *both* user name, which is
> most common usage, but also email address).

The disadvantage here is that it centralizes identity management and
it is more demanding because the name/email pair is quite unstable.

On the other hand, something like a uuid field would distribute that
management to the user himself and free that user from the influences
of legal name changing and email address switching.

Of course, as already stated, some people may bungle their uuid
setting. Then something like .mailmap can be used, but the format
would be simpler, the file would not grow nearly as quickly, and with
some clever encoding some statistics gathering programs could
(possibly) run more efficiently.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:33   ` Jon Smirl
@ 2010-03-19 14:52     ` Michael J Gruber
  0 siblings, 0 replies; 104+ messages in thread
From: Michael J Gruber @ 2010-03-19 14:52 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, Michael Witten, git

Jon Smirl venit, vidit, dixit 19.03.2010 15:33:
> On Fri, Mar 19, 2010 at 10:08 AM, Jakub Narebski <jnareb@gmail.com> wrote:
>> Michael Witten <mfwitten@gmail.com> writes:
>>
>>> Short Version:
>>> -------------
>>>
>>>
>>> Rather than use a (name,email) pair to identify people, let's use
>>> a (uuid,name,email) triplet.
>>>
>>> The uuid can be any piece of information that a user of git determines
>>> to be reasonably unique across space and time and that is intended to
>>> be used by that user virtually forever (at least within a project's
>>> history).
>>>
>>> For instance, the uuid could be an OSF DCE 1.1 UUID or the SHA-1 of
>>> some easily remembered, already reasonably unique information.
>>
>> ... or 'canonical-name canonical-email' pair.
>>
>>>
>>> This could really help keep identifications clean, and it is rather
>>> straightforward and possibly quite efficient.
>>>
>>>
>>> Long Version:
>>> ------------
>> [...]
>>
>>> While git's use of (name,email) pairs to identify each person is
>>> extremely practical, it turns out that it's rather `unstable';
>>
>> This is non-solution to non-problem.
>>
>> First, the user.name and user.email does not need to be name and email
>> from some email account.  It might be some "canonical name" and
>> "canonical email".
>>
>> Second, there are (I think) two main sources of 'unstability' in
>> (name,email) pairs, namely A) misconfigured git (when fetching/pushing
>> using git itself), B) wrong name in email etc. (when sending patches
>> via email, 80% of patches in Linux kernel case).
> 
> Another top source is mangling of non-ASCII charsets when they go
> through the email system. Are the git work flow tools safe for
> alternative charsets? Do the email tools look at the charset header of
> the email message? Check people's names in the kernel commits and
> you'll find lots of examples of this type of mangling.
>

Or even the quoting of quotes for nick names, appearing as 'nick',
"nick", \"nick\", nick and what not.

> Or people not using UTF-8. There are files in the kernel where
> people's names are in conflicting codepages. Should git try to look
> for diffs that aren't UTF-8?

You and others are proving a very important point here: This is really
an lkml proxy fight being taken to the git list, after the futile
mailmap-ification there.

People may disagree on the best approach in general, but this thread
clearly shows:

- The Git community is happy with mailmap for git.git.
- The Git community does not see any need for amending the mailmap
mechanism.
- How you actually use mailmap (leniently or enforcing) is a per-project
decision, just like the patch workflow, the meaning and use of s-o-b
lines, the requirement for full names and many other things.

But since the git list is hosted on kernel.org we can't really complain
about providing room for an lkml discussion ;)

Michael


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:40   ` Michael Witten
@ 2010-03-19 14:56     ` Erik Faye-Lund
  2010-03-19 15:05       ` Michael Witten
  2010-03-19 15:12     ` Reece Dunn
  2010-03-20  0:21     ` Jakub Narebski
  2 siblings, 1 reply; 104+ messages in thread
From: Erik Faye-Lund @ 2010-03-19 14:56 UTC (permalink / raw)
  To: Michael Witten; +Cc: Jakub Narebski, git

On Fri, Mar 19, 2010 at 3:40 PM, Michael Witten <mfwitten@gmail.com> wrote:
> On Fri, Mar 19, 2010 at 08:08, Jakub Narebski <jnareb@gmail.com> wrote:
>> First, the user.name and user.email does not need to be name and email
>> from some email account.  It might be some "canonical name" and
>> "canonical email".
>
> The vast majority of patches come in through email; the git tools
> expect the user.name and user.email to reflect physical email account
> information.

What git tools would that be? The only one I know of that does
anything near assuming that is git send-email, and it only uses
user.email if neither sendemail.from is configured nor --from option
is specified. And even when it does, it prompts the user so it can be
changed if called from a terminal. So I wouldn't say that it assumes
anything about the "physicalness" of user.email, it just uses it as
the most sane default unless anything else has been specified.
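
As a rough sketch of the fallback order described above (an illustration
only, not git-send-email's actual code; the function and parameter names
are made up, and the ordering of --from ahead of sendemail.from is the
usual command-line-over-config assumption):

```python
from typing import Optional


def pick_sender(cli_from: Optional[str],
                config_from: Optional[str],
                user_name: str,
                user_email: str) -> str:
    """Pick the sender address, using user.* only as a last resort."""
    if cli_from:
        # An explicit --from on the command line wins (assumed ordering).
        return cli_from
    if config_from:
        # Next comes a configured sendemail.from.
        return config_from
    # Only now fall back to user.name + user.email as the default.
    return f"{user_name} <{user_email}>"
```

The point of the sketch is only that user.email sits at the bottom of
the list, so nothing forces it to be a deliverable address.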

-- 
Erik "kusma" Faye-Lund


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:16           ` Michael Witten
  2010-03-19 12:18             ` Michael Witten
@ 2010-03-19 14:57             ` Reece Dunn
  2010-03-19 15:26               ` Michael J Gruber
  1 sibling, 1 reply; 104+ messages in thread
From: Reece Dunn @ 2010-03-19 14:57 UTC (permalink / raw)
  To: Michael Witten; +Cc: Mike Hommey, david, git

On 19 March 2010 12:16, Michael Witten <mfwitten@gmail.com> wrote:
> On Fri, Mar 19, 2010 at 06:09, Reece Dunn <msclrhd@googlemail.com> wrote:
>> On 19 March 2010 11:54, Mike Hommey <mh@glandium.org> wrote:
>>> On Fri, Mar 19, 2010 at 04:45:38AM -0700, david@lang.hm wrote:
>>>> here is where you are missing the point.
>>>>
>>>> no, there is not 'much less chance' of it getting messed up.
>>>>
>>>> you seem to assume that people would never need to set the UUID on
>>>> multiple machines.
>>>>
>>>> if they don't need to set it on multiple machines, then the
>>>> e-mail/userid is going to be reliable anyway
>>>>
>>>> if they do need to set it on multiple machines and can't be bothered
>>>> to keep their e-mail consistent, why would they bother keeping this
>>>> additional thing consistent? Linus is pointing out that people don't
>>>> care now about their e-mail and name, and will care even less about
>>>> some abstract UUID
>>>>
>>>> people who care will already make their e-mail consistent.
>>>
>>> While I don't agree with the need for that uuid thing, I'd like to
>>> pinpoint that people who care can't necessarily make their e-mail
>>> consistent. For example, Linus used to use an @osdl.org address, and
>>> he now uses an @linux-foundation.org address. It's still the same Linus,
>>> but the (name, email) pair has legitimately changed.
>>
>> So create an aliases list that maps one (name,email) to another that
>> is from the same person. There is no need for an additional item (a
>> uuid) to solve this problem. It also means that searching on any
>> (name,email) pair will find the others, so you only need to
>> remember/find one of the identities for the person you are interested
>> in finding the commits for.
>>
>> AFAICS, mailmap is about correcting mistakes (primarily in the
>> reported name for a given email address). In this case, mailmap and
>> this aliases-map will work in conjunction with each other to give what
>> the original poster wanted. However, I haven't seen any of his replies
>> that answer this (or sufficiently address why mailmap does not solve
>> his problem).
>
> See:
>
>    http://marc.info/?l=git&m=126900051102958&w=2
>
> The idea is to distribute the responsibility for maintaining a
> consistent identity AND to make that responsibility EASY.
>
> The extra uuid `field' can only suffer from typos, while the
> name/email pair can suffer from typos, changing email accounts, and
> changing real life names. If the uuid `field' does get bungled by a
> typo or is not used, then we're no worse off than we were before.

What specific problem(s) are you trying to solve?

The main issue is identifying who made what changes to a repository
(e.g. by a script, or database/statistics algorithms). The mailmap
file allows for corrections to a canonical (name,email) pair for a
specified repository.

For identifying the same person working across multiple projects,
ideally they should keep the canonical (name,email) pair consistent
across all projects, with mailmap files in the respective projects to
keep the canonical form correct.

This canonical (name,email) pair is then a unique identifier for that
person and then effectively becomes a uuid. There is no need to add an
extra uuid field that needs *more* work fixing up errors and making
consistent.

If you change email address or name, *and* care enough about it being
consistent, there is no reason why you cannot update the mailmap file
to use the new canonical (name,email) pair.

Oh, and you are expressing it wrong (if I understand you correctly)...

What you are after is a string U (the uuid) that is used to identify a
person irrespective of their name and email. At the moment
   U = (name,email)
is used to achieve that, with mailmap to normalise the variations.

What you are trying to express is:
    U <=> (name,email)
where U can be any unique string. This is different from using a
(name,email,uuid) triple to identify someone.

So, let's say that I choose U=abc to identify myself uniquely, so that:
    "abc" <=> "Reece Dunn <msclrhd@gmail.com>"
    "abc" <=> "Reece Dunn <msclrhd@googlemail.com>"
    "abc" <=> "Reece Dunn <msclrhd@hotmail.com>"
    "abc" <=> "Reece H. Dunn <msclrhd@gmail.com>"
    "abc" <=> "Reece H Dunn <msclrhd@gmail.com>"

I would still need to define all these variations when and as they
occur in a repository to fixup any typos and email address changes
that occur, so why not just pick U = "Reece H. Dunn
<msclrhd@gmail.com>" as the canonical form instead of "abc" or some
other string?
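
To make that concrete, here is a minimal sketch of the alias idea, with
the canonical (name,email) string itself acting as the identifier. This
is only an illustration of the mapping, not how mailmap is implemented;
the table and the helper name are hypothetical:

```python
# Hypothetical alias table: every known variant points at one canonical form.
CANONICAL = "Reece H. Dunn <msclrhd@gmail.com>"

ALIASES = {
    "Reece Dunn <msclrhd@gmail.com>": CANONICAL,
    "Reece Dunn <msclrhd@googlemail.com>": CANONICAL,
    "Reece Dunn <msclrhd@hotmail.com>": CANONICAL,
    "Reece H Dunn <msclrhd@gmail.com>": CANONICAL,
}


def canonical_identity(ident: str) -> str:
    """Map a recorded identity to its canonical form, if one is known."""
    # Unknown identities are returned unchanged.
    return ALIASES.get(ident, ident)
```

Replacing CANONICAL with "abc" would change nothing about the amount of
work: the same variants would still have to be listed.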

As has been said, mailmap supports name variations ("Reece Dunn",
"Reece H Dunn", "Reece H. Dunn") and email variations
(msclrhd@hotmail.com, msclrhd@gmail.com, msclrhd@googlemail.com), so
how does a string that I need to set on the git client in addition to
name and email help me define a canonical form *in the git
repository*?

So, I'll ask again: what problems are you trying to solve that cannot
be solved by mailmap?

- Reece


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:56     ` Erik Faye-Lund
@ 2010-03-19 15:05       ` Michael Witten
  2010-03-19 15:12         ` Michael Witten
  2010-03-19 15:25         ` Erik Faye-Lund
  0 siblings, 2 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 15:05 UTC (permalink / raw)
  To: kusmabite; +Cc: Jakub Narebski, git

On Fri, Mar 19, 2010 at 08:56, Erik Faye-Lund <kusmabite@googlemail.com> wrote:
> On Fri, Mar 19, 2010 at 3:40 PM, Michael Witten <mfwitten@gmail.com> wrote:
>> On Fri, Mar 19, 2010 at 08:08, Jakub Narebski <jnareb@gmail.com> wrote:
>>> First, the user.name and user.email does not need to be name and email
>>> from some email account.  It might be some "canonical name" and
>>> "canonical email".
>>
>> The vast majority of patches come in through email; the git tools
>> expect the user.name and user.email to reflect physical email account
>> information.
>
> What git tools would that be?

Anything involving emailed patches.

> The only one I know of that does
> anything near assuming that is git send-email, and it only uses
> user.email if neither sendemail.from is configured nor --from option
> is specified. And even when it does, it prompts the user so it can be
> changed if called from a terminal. So I wouldn't say that it assumes
> anything about the "physicalness" of user.email, it just uses it as
> the most sane default unless anything else has been specified.

It's useless to spoof the From field because many email services won't
send it, a point I already covered in the email you quoted.

When a patch is finally emailed, it's the From field that is used for
Author attribution.

You see? Your identity has been tied to whatever email service you
happen to use at any given time rather than to something with more
long term stability.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:40   ` Michael Witten
  2010-03-19 14:56     ` Erik Faye-Lund
@ 2010-03-19 15:12     ` Reece Dunn
  2010-03-20  0:21     ` Jakub Narebski
  2 siblings, 0 replies; 104+ messages in thread
From: Reece Dunn @ 2010-03-19 15:12 UTC (permalink / raw)
  To: Michael Witten; +Cc: Jakub Narebski, git

On 19 March 2010 14:40, Michael Witten <mfwitten@gmail.com> wrote:
> Strictness about names and emails is difficult, and keeping something
> like the current .mailmap file up-to-date is a centralized process.
> The uuid field would distribute the responsibility of maintaining
> identity and make that responsibility easy because the user-chosen
> string is easy for that user to remember and is typed only very
> occasionally and under very specific circumstances.

I don't get this - it is the other way around.

For the mailmap file, you check that file into the git repository
itself. Therefore, by implication, mailmap *is* distributed. It is
therefore kept locally and accessed locally. It also does not suffer
from configuration issues, as you don't need to re-enter it if you
change your computer.

For a uuid to work the way you intend it, there would need to be some
universal central server that would be queried to look up and resolve
the uuid so you can get consistent user identification information for
every git command by every person/script from every git repository.
This is never going to fly for all the reasons distributed VCSs were
created in the first place.

Unless by distributed you mean in the .git/config file, which is
always local and never distributed to others. However, the uuid data
in the repository will be distributed in the repositories, so how is
this any better than what git has now?

- Reece


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 15:05       ` Michael Witten
@ 2010-03-19 15:12         ` Michael Witten
  2010-03-19 15:25         ` Erik Faye-Lund
  1 sibling, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 15:12 UTC (permalink / raw)
  To: kusmabite; +Cc: Jakub Narebski, git

On Fri, Mar 19, 2010 at 09:05, Michael Witten <mfwitten@gmail.com> wrote:
>
> It's useless to spoof the From field because many email services won't
> send it, a point I already covered in the email you quoted.
>
> When a patch is finally emailed, it's the From field that is used for
> Author attribution.
>
> You see? Your identity has been tied to whatever email service you
> happen to use at any given time rather than to something with more
> long term stability.

A lot of trouble could probably be avoided if the Authorship
information could be sent as something separate from the From field. I
don't think it would be quite as powerful as having a uuid, but it
would be less invasive and probably practically as effective.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 15:05       ` Michael Witten
  2010-03-19 15:12         ` Michael Witten
@ 2010-03-19 15:25         ` Erik Faye-Lund
  1 sibling, 0 replies; 104+ messages in thread
From: Erik Faye-Lund @ 2010-03-19 15:25 UTC (permalink / raw)
  To: Michael Witten; +Cc: Jakub Narebski, git

On Fri, Mar 19, 2010 at 4:05 PM, Michael Witten <mfwitten@gmail.com> wrote:
> On Fri, Mar 19, 2010 at 08:56, Erik Faye-Lund <kusmabite@googlemail.com> wrote:
>> On Fri, Mar 19, 2010 at 3:40 PM, Michael Witten <mfwitten@gmail.com> wrote:
>>> On Fri, Mar 19, 2010 at 08:08, Jakub Narebski <jnareb@gmail.com> wrote:
>>>> First, the user.name and user.email does not need to be name and email
>>>> from some email account.  It might be some "canonical name" and
>>>> "canonical email".
>>>
>>> The vast majority of patches come in through email; the git tools
>>> expect the user.name and user.email to reflect physical email account
>>> information.
>>
>> What git tools would that be?
>
> Anything involving emailed patches.

Which are...?

>
>> The only one I know of that does
>> anything near assuming that is git send-email, and it only uses
>> user.email if neither sendemail.from is configured nor --from option
>> is specified. And even when it does, it prompts the user so it can be
>> changed if called from a terminal. So I wouldn't say that it assumes
>> anything about the "physicalness" of user.email, it just uses it as
>> the most sane default unless anything else has been specified.
>
> It's useless to spoof the From field because many email services won't
> send it, a point I already covered in the email you quoted.
>
> When a patch is finally emailed, it's the From field that is used for
> Author attribution.

The From-field isn't assumed to be a physical-address, but the
From-header is. If the From-field and the From-header are identical,
the From-field doesn't get emitted. This is the same mechanism that is
used when people forward patches from other authors, and there are no
attempts to validate the From-field, only the From-header.

So no, the author-email shouldn't need to be a physical address as far
as send-email is concerned.

-- 
Erik "kusma" Faye-Lund


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:57             ` Reece Dunn
@ 2010-03-19 15:26               ` Michael J Gruber
  2010-03-19 16:05                 ` david
  0 siblings, 1 reply; 104+ messages in thread
From: Michael J Gruber @ 2010-03-19 15:26 UTC (permalink / raw)
  To: Reece Dunn; +Cc: Michael Witten, Mike Hommey, david, git

Reece Dunn venit, vidit, dixit 19.03.2010 15:57:

[snip]
> 
> So, I'll ask again: what problems are you trying to solve that
> cannot be solved by mailmap?
> 
> - Reece

[Attention, conspiracy theories below!]

The problem seems to be that some people are interested in statistics,
so some are interested in consistent author information, but this
requires others (the authors) to maintain this information, at least on
large projects where this information cannot be kept consistent by a few
people. So, some people are looking for a way to enforce this on the
others... Of course, one could also rephrase this as "help authors
maintain their authorship information in a consistent way" ;)

Michael


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 15:26               ` Michael J Gruber
@ 2010-03-19 16:05                 ` david
  2010-03-19 17:16                   ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: david @ 2010-03-19 16:05 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Reece Dunn, Michael Witten, Mike Hommey, git

On Fri, 19 Mar 2010, Michael J Gruber wrote:

> Reece Dunn venit, vidit, dixit 19.03.2010 15:57:
>
> [snip]
>>
>> So, I'll ask again: what problems are you trying to solve that
>> cannot be solved by mailmap?
>>
>> - Reece
>
> [Attention, conspiracy theories below!]
>
> The problem seems to be that some people are interested in statistics,
> so some are interested in consistent author information, but this
> requires others (the authors) to maintain this information, at least on
> large projects where this information cannot be kept consistent by a few
> people. So, some people are looking for a way to enforce this on the
> others... Of course, one could also rephrase this as "help authors
> maintain their authorship information in a consistent way" ;)

but a UUID doesn't help you.

if you can force people to have a consistent UUID, you can force them to
have a consistent e-mail address (and submit mapping updates if it
changes)

if you can't force people to maintain a consistent e-mail, why do you
think they would maintain a consistent UUID?

David Lang


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:08     ` Michael Haggerty
@ 2010-03-19 17:02       ` david
  2010-03-19 17:06         ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: david @ 2010-03-19 17:02 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Michael Witten, git

On Fri, 19 Mar 2010, Michael Haggerty wrote:

> Michael Witten wrote:
>> On Fri, Mar 19, 2010 at 02:41, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>>> Michael Witten wrote:
>>>> Rather than use a (name,email) pair to identify people, let's use
>>>> a (uuid,name,email) triplet.
>>>> [...]
>>> A UUID doesn't need to be a big hex number.  All it has to be is a
>>> "Universally Unique Identifier".  Like, oh, for example, your
>>>
>>>                   *** EMAIL ADDRESS ***
>>>
>>> [1].  There is even already a way to fix up mistakes or unavoidable
>>> email address changes, namely the .mailmap file.
>>
>> *facepalm*
>>
>> You've just repeated everything that I've said; go look at the rest of
>> the thread, where I spend plenty of time correcting the same hangups
>> about my choice of the word UUID and my use of hex digits.
>
> No, my point is to use the *existing* email address as the UUID
> *without* adding another field.  Nothing needs to be changed!

if you are now proposing using the e-mail address, that already 
exists and is supported by the tools, it sounds like you are just 
withdrawing your proposal (other than possibly proposing that the e-mail 
field gets renamed to UUID????)

David Lang


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 17:02       ` david
@ 2010-03-19 17:06         ` Michael Witten
  2010-03-24 18:50           ` Avi Kivity
  0 siblings, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-19 17:06 UTC (permalink / raw)
  To: david; +Cc: Michael Haggerty, git

On Fri, Mar 19, 2010 at 11:02,  <david@lang.hm> wrote:
>
> if you are now proposing using the e-mail address, that already exists and
> is supported by the tools, it sounds like you are just withdrawing your
> proposal (other than possibly proposing that the e-mail field gets renamed
> to UUID????)

You're responding to a different Michael.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 16:05                 ` david
@ 2010-03-19 17:16                   ` Michael Witten
  0 siblings, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-19 17:16 UTC (permalink / raw)
  To: david; +Cc: Michael J Gruber, git

On Fri, Mar 19, 2010 at 10:05,  <david@lang.hm> wrote:
>
> if you can force people to have a consistent UUID, you can force them to
> have a consistent e-mail address (and submit mapping updates if it changes)
>
> if you can't force people to maintain a consistent e-mail, why do you think
> they would maintain a consistent UUID?

Firstly, please note that a UUID is defined in this context as any
string that the user deems for himself to be uniquely identifying of
himself; a UUID allows a user to determine his canonical
representation from the very start.

There's no forcing; there can't be. This is meant to help users manage
their own identities.

A UUID is basically only subject to change due to:

    * typos when configuring

A name/email pair (as in the user.name and user.email variables) is
subject to change due to:

    * typos when configuring
    * legal name changes
    * email account switching

Naturally, older commits and wrong UUIDs would need mappings, but
that's no different than the current situation except for the fact
that UUIDs would not change as frequently.
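
As a purely hypothetical sketch of the idea (no such field exists in
git; the helper names and the commit tuples are made up), a user could
derive the string once as the SHA-1 of something memorable, and a
statistics tool could then group on it regardless of name or email:

```python
import hashlib
from collections import Counter


def make_uuid(memorable: str) -> str:
    """Derive a stable identifier from some memorable, unique-ish text."""
    return hashlib.sha1(memorable.encode("utf-8")).hexdigest()


def count_by_uuid(commits):
    """Count commits per uuid; each commit is a (uuid, name, email) tuple."""
    return Counter(uuid for uuid, _name, _email in commits)
```

Two commits recorded under different names or addresses would still be
counted together, as long as the same string was configured both times.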

That aside, an alternative solution that is not as powerful but that
is less invasive would be to allow users to transmit authorship
information as part of the patch payload separate from the usual email
headers (or something like this). Erik Faye-Lund suggests this is
already easily done, but I'm not so sure.


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-18 20:01                 ` Linus Torvalds
@ 2010-03-19 19:39                   ` Junio C Hamano
  0 siblings, 0 replies; 104+ messages in thread
From: Junio C Hamano @ 2010-03-19 19:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, Michael Witten, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> (And yes, it does say that git should probably have errored out way more 
> aggressively about badly set up host/domain names in the "guess at email 
> address" code. My bad. Maybe it's still worth fixing for the future)

We made a small step in that direction in 49ff9a7 (commit: show
interesting ident information in summary, 2010-01-13).  I think what it
does is sufficiently loud (but not annoying).


* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 14:40   ` Michael Witten
  2010-03-19 14:56     ` Erik Faye-Lund
  2010-03-19 15:12     ` Reece Dunn
@ 2010-03-20  0:21     ` Jakub Narebski
  2 siblings, 0 replies; 104+ messages in thread
From: Jakub Narebski @ 2010-03-20  0:21 UTC (permalink / raw)
  To: Michael Witten; +Cc: git

On Fri, 19 Mar 2010, Michael Witten wrote:
> On Fri, Mar 19, 2010 at 08:08, Jakub Narebski <jnareb@gmail.com> wrote:

>> This is non-solution to non-problem.
>>
>> First, the user.name and user.email does not need to be name and email
>> from some email account.  It might be some "canonical name" and
>> "canonical email".
> 
> The vast majority of patches come in through email; the git tools
> expect the user.name and user.email to reflect physical email account
> information.
> 
> You would be correct if it were not for the fact that git currently
> conflates identity and current email system.

It is not true.  From the git-config(1) manpage, the description (meaning)
of user.name and user.email is:

  user.email::
        Your email address to be recorded in any newly created commits.
        Can be overridden by the 'GIT_AUTHOR_EMAIL', 'GIT_COMMITTER_EMAIL', and
        'EMAIL' environment variables.  See linkgit:git-commit-tree[1].

  user.name::
        Your full name to be recorded in any newly created commits.
        Can be overridden by the 'GIT_AUTHOR_NAME' and 'GIT_COMMITTER_NAME'
        environment variables.  See linkgit:git-commit-tree[1].
 
As you can see, there is nothing there about a physical email account.

It is true that git-send-email prompts for the "From" address to send
email from, with user.name + user.email as the default value...
unless either sendemail.from is set or the --from option is used.
[See also below].

>> Second, there are (I think) two main sources of 'unstability' in
>> (name,email) pairs, namely A) misconfigured git (when fetching/pushing
>> using git itself), B) wrong name in email etc. (when sending patches
>> via email, 80% of patches in Linux kernel case).
>>
>> In the case of misconfigured git (case A) using UUID wouldn't help,
>> and only make it worse (you would have to configure the same UUID on
>> each machine).  What would help here is for git to be more strict and
>> perhaps forbid (some of) autogenerated names and emails.
> 
> The uuid string would be typed pretty much only during configuration;
> from there, it's basically just handled by the git tools. Hence, the
> uuid can indeed suffer from typos, but the name/email pair can suffer
> from not only typos but also real life name changing and email account
> switching.

You do not need (in theory at least) to change user.name or user.email
after a real-life name change (like marriage or adoption) or after
switching email accounts.

[...]
>> In the case of sending patches via email, you can use in-body 'From:'
>> to provide (name,email) part that is different than account used to
>> send email.
> 
> That's a good solution that I've considered, except for 2 reasons:
> 
>     * It involves much more opportunities for typos and/or the
>       configuration of a non-git tool for a git-specific purpose.
> 
>     * Many if not most email services will refuse to send messages
>       with forged/spoofed email addresses.

Actually git-send-email automatically adds an in-body "From:" header
if the author differs from the "From:" address of the email, and git-am
automatically prefers the in-body "From:" over the sender (in-header
"From:") for authorship information.

Sender can be different from author of the patch, there is no problem
with that.
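
A minimal sketch of that preference, in Python rather than git's own
code. It is not how git-am is implemented; the function name
patch_author and the example addresses are placeholders, and only the
first non-blank body line is inspected:

```python
from email import message_from_string
from email.utils import parseaddr


def patch_author(raw_message):
    """Return (name, email) for a mailed patch, preferring an in-body From: line."""
    msg = message_from_string(raw_message)
    # Default: the sender named in the in-header "From:".
    author = parseaddr(msg.get("From", ""))
    for line in msg.get_payload().splitlines():
        if not line.strip():
            continue  # skip blank lines before the first body line
        if line.lower().startswith("from:"):
            # An in-body "From:" at the top of the body overrides the sender.
            author = parseaddr(line[len("from:"):].strip())
        break  # only the first non-blank body line is considered
    return author


# Hypothetical message: sent from one account, authored under another identity.
mail = """\
From: S Ender <sender@example.net>
Subject: [PATCH] example

From: A U Thor <author@example.com>

Commit message body.
"""

print(patch_author(mail))
```

With no in-body "From:" line the function falls back to the sender,
which matches the behaviour described above for the common case where
sender and author are the same person.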

What git could improve here is the handling of non-ASCII characters in
the name (e.g. when the commit message contains only US-ASCII letters
but user.name does not).  Perhaps this has been improved already.


P.S. Backward compatibility (older git-am) would probably require
UUID in the form of canonical name+email, and use of in-body "From:"
header to pass this UUID when sending patches.

>> In the case of UUID you would need the same: some way to
>> provide UUID in patch (in email).
> 
> Yes, but that's automated by tools like git's format-patch. Not using
> something like format-patch or some other git interface is an
> 'out-of-band' communication and that author has essentially chosen not
> to care about his identity.
> 
> The use of the uuid field and allowing git tools to handle it is just
> a way to give a person who does care about his identity to keep it
> consistent.

git-send-email *already* automatically deals with sender != author.

[...]
>> What could help in both cases is .mailmap being used (perhaps on
>> demand) in more git commands.  See Documentation/mailmap.txt
>> or e.g. git-shortlog(1) manpage.  It is quite advanced tool for
>> correcting mistakes (it can correct *both* user name, which is
>> most common usage, but also email address).
> 
> The disadvantage here is that it centralizes identity management and
> it is more demanding because the name/email pair is quite unstable.

How is an in-tree .mailmap file (in-tree like .gitignore and
.gitattributes) *centralized identity management*?  It is as
distributed as git repositories are.
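
To illustrate what such a file does, here is a simplified Python sketch
of mailmap-style canonicalization. It is an assumption-laden toy, not
git's parser: it handles only the "Proper Name <commit@email>" and
"[Proper Name] <proper@email> <commit@email>" entry forms, and the
entries shown are illustrative (built from addresses in the shortlog
output quoted at the start of the thread), not the project's real
.mailmap:

```python
import re

# One optional name, one address, and an optional second address per line.
ENTRY = re.compile(r"^\s*([^<]*?)\s*<([^>]*)>\s*(?:<([^>]*)>)?\s*$")

# Illustrative entries only.
MAILMAP = """\
# canonical identity                          address found in commits
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@osdl.org>
Linus Torvalds <torvalds@linux-foundation.org> <torvalds@g5.osdl.org>
"""


def parse_mailmap(text):
    """Map lowercased commit email -> (canonical name or None, canonical email or None)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        m = ENTRY.match(line)
        if not m:
            continue  # blank line or an entry form this sketch does not handle
        name, first, second = m.group(1) or None, m.group(2), m.group(3)
        if second is None:
            table[first.lower()] = (name, None)    # fix the name only
        else:
            table[second.lower()] = (name, first)  # fix name and/or email
    return table


def canonical(table, name, email):
    """Return the canonical (name, email) pair for a recorded identity."""
    new_name, new_email = table.get(email.lower(), (None, None))
    return (new_name or name, new_email or email)


table = parse_mailmap(MAILMAP)
print(canonical(table, "Linus Torvalds", "torvalds@osdl.org"))
```

Because the table lives in a tracked file, every clone carries the same
mapping, which is the sense in which it is as distributed as the
repository itself.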

On the other hand user.uuid is not distributed; for security reasons
config is not transferred.

[...]
> [...], and with
> some clever encoding some statistics gathering programs could
> (possibly) run more efficiently.

Well, I guess it is the statistics that dominate, not the id part.  Such
tools should simply take .mailmap into account (unless they rely on
git for that).

-- 
Jakub Narebski
Poland

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:09         ` Michael Witten
@ 2010-03-22 12:06           ` Mark Brown
  2010-03-22 14:38           ` Michael Witten
  1 sibling, 0 replies; 104+ messages in thread
From: Mark Brown @ 2010-03-22 12:06 UTC (permalink / raw)
  To: Michael Witten; +Cc: Mike Hommey, david, git

On Fri, Mar 19, 2010 at 06:09:34AM -0600, Michael Witten wrote:
> On Fri, Mar 19, 2010 at 05:54, Mike Hommey <mh@glandium.org> wrote:

> > While I don't agree with the need for that uuid thing, I'd like to
> > pinpoint that people who care can't necessarily make their e-mail
> > consistent. For example, Linus used to use an @osdl.org address, and
> > he now uses an @linux-foundation.org address. It's still the same Linus,
> > but the (name, email) pair has legitimately changed.

> Indeed.

> This is because the name/email pair (as in the 'name' and 'email'
> config variables) CONFLATES the idea of identity and current email
> account.

You're assuming they aren't conflated - for example, when people do work
both personally and for their employer they often use distinct e-mail
addresses to identify how the work was funded.

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 12:09         ` Michael Witten
  2010-03-22 12:06           ` Mark Brown
@ 2010-03-22 14:38           ` Michael Witten
  2010-03-24 19:18             ` Erik Faye-Lund
  1 sibling, 1 reply; 104+ messages in thread
From: Michael Witten @ 2010-03-22 14:38 UTC (permalink / raw)
  To: Mark Brown; +Cc: Mike Hommey, david, git

On Mon, Mar 22, 2010 at 06:06, Mark Brown <broonie@sirena.org.uk> wrote:
>
> You're assuming they aren't conflated - for example, when people do work
> both personally and for their employer they often use distinct e-mail
> addresses to identify how the work was funded.

Indeed.

The model I propose handles this case much better, as I explain here:

    http://marc.info/?l=git&m=126900051102958&w=2

Specifically:

    > if they do need to set it on multiple machines and
    > can't be bothered to keep their e-mail consistent,
    > why would they bother keeping this additional thing
    > consistent? Linus is pointing out that people don't
    > care now about their e-mail and name, and will care
    > even less about some abstract UUID
    
    The user doesn't have a damn choice!

    [These first few paragraphs aren't completely correct;
     there's an explanation below them. It's mainly just
     setting up for the important part below.]
    
    The email can't be kept consistent over time because
    the tools expect it to be and/or use the actual
    physical email used to send/receive stuff. It's
    information that CONFLATES identity with whatever
    tool/system you're using.
    
    For instance, Michael Haggerty cannot reasonably use
    
        [user]
            name  = Michael Haggerty
            email = mhagger@MIT.EDU
    
    because he likely no longer has that email account
    to use. He is forced to change it and therefore
    forced to make his identity confused.

    [The above isn't quite true; my mistake. Michael
     could actually keep "mhagger@MIT.EDU" but inform
     tools like "git send-email" to send patches from
     another email address; this way, send-email will
     emit the necessary information to carry that
     authorship identity ("mhagger@MIT.EDU") along
     with the patch.
    
     However, it's still the case that Michael Haggerty
     is essentially stuck with "mhagger@MIT.EDU" for
     his identification---a problem that my proposal
     essentially fixes, as described now:]
    
    I'm proposing ALLOWING him to say:
    
        [user]
            uuid  = Michael Haggerty <mhagger@MIT.EDU>
            name  = Michael Haggerty
            email = mhagger@ALUM.mit.edu
    
    Heck, let's say he works at Red Hat as well; he
    might make some commits under this config AT WORK:
    
        [user]
            uuid  = Michael Haggerty <mhagger@MIT.EDU>
            name  = Michael Haggerty
            email = mhagger@redhat.com
    
    Then, he can make, say, commits to the Linux kernel
    repo for both work and hobby related issues and
    still be recognized as the same person.
    
    That is, he can have some commits [publicly] under:
    
        Michael Haggerty <mhagger@ALUM.mit.edu>
    
    and other commits [publicly] under:
    
        Michael Haggerty <mhagger@redhat.com>
    
    and still link them all together as the [SAME PERSON]
    with just the uuid:
    
        Michael Haggerty <mhagger@MIT.EDU>

The idea is to help users manage their own identities more effectively.

It's clearly advantageous to be able to apply different public identities
(personal vs. work identity for instance) to different commits, tags, etc.,
but it's also advantageous to be able to link those different identities
together.

At the moment, the different identities can only be linked together by
editing and transmitting a .mailmap file to be used by git tools. My
proposal distributes this kind of work UPFRONT by having individuals
choose UPFRONT some reasonably unique identification string to use
as the link between public identities.
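
A rough Python sketch of that linking step. It is purely hypothetical:
git commits carry no "uuid" field, so the records below are made-up
dictionaries that model the proposal only, and group_by_uuid is an
illustrative name:

```python
from collections import defaultdict

# Hypothetical commit records; the "uuid" key models the proposal only.
commits = [
    {"uuid": "Michael Haggerty <mhagger@MIT.EDU>",
     "name": "Michael Haggerty", "email": "mhagger@ALUM.mit.edu"},
    {"uuid": "Michael Haggerty <mhagger@MIT.EDU>",
     "name": "Michael Haggerty", "email": "mhagger@redhat.com"},
    {"name": "A U Thor", "email": "author@example.com"},  # no uuid recorded
]


def group_by_uuid(records):
    """Group the public (name, email) identities under their linking uuid."""
    groups = defaultdict(set)
    for c in records:
        # Fall back to the (name, email) pair when no uuid was recorded.
        key = c.get("uuid") or "{} <{}>".format(c["name"], c["email"])
        groups[key].add((c["name"], c["email"]))
    return dict(groups)


for uuid, identities in group_by_uuid(commits).items():
    print(uuid, sorted(identities))
```

The two public identities end up under one key without any .mailmap
entry, while a commit that records no uuid is simply keyed by its own
(name, email) pair.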

Sincerely,
Michael Witten

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-19 17:06         ` Michael Witten
@ 2010-03-24 18:50           ` Avi Kivity
  0 siblings, 0 replies; 104+ messages in thread
From: Avi Kivity @ 2010-03-24 18:50 UTC (permalink / raw)
  To: Michael Witten; +Cc: david, Michael Haggerty, git

On 03/19/2010 07:06 PM, Michael Witten wrote:
> On Fri, Mar 19, 2010 at 11:02, <david@lang.hm> wrote:
>    
>> if you are now proposing using the e-mail address, that already exists and
>> is supported by the tools, it sounds like you are just withdrawing your
>> proposal (other than possibly proposing that the e-mail field gets renamed
>> to UUID????)
>>      
> You're responding to a different Michael.
>    

I guess he should have checked the UUID.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-22 14:38           ` Michael Witten
@ 2010-03-24 19:18             ` Erik Faye-Lund
  2010-03-24 19:23               ` Michael Witten
  0 siblings, 1 reply; 104+ messages in thread
From: Erik Faye-Lund @ 2010-03-24 19:18 UTC (permalink / raw)
  To: Michael Witten; +Cc: Mark Brown, Mike Hommey, david, git

On Mon, Mar 22, 2010 at 3:38 PM, Michael Witten <mfwitten@gmail.com> wrote:
>     However, it's still the case that Michael Haggerty
>     is essentially stuck with "mhagger@MIT.EDU" for
>     his identification---a problem that my proposal
>     essentially fixes, as described now:]
>
>    I'm proposing ALLOWING him to say:
>
>        [user]
>            uuid  = Michael Haggerty <mhagger@MIT.EDU>
>            name  = Michael Haggerty
>            email = mhagger@ALUM.mit.edu
>

...which is the exact same situation as above, where he's "stuck"
using "mhagger@MIT.EDU" for identification. I don't see how this
changes anything (except allowing to distribute an updated
contact-email... But let's face it, git-repos aren't Facebook)

-- 
Erik "kusma" Faye-Lund

* Re: What's in a name? Let's use a (uuid,name,email) triplet
  2010-03-24 19:18             ` Erik Faye-Lund
@ 2010-03-24 19:23               ` Michael Witten
  0 siblings, 0 replies; 104+ messages in thread
From: Michael Witten @ 2010-03-24 19:23 UTC (permalink / raw)
  To: kusmabite; +Cc: Mark Brown, Mike Hommey, david, git

On Wed, Mar 24, 2010 at 13:18, Erik Faye-Lund <kusmabite@googlemail.com> wrote:
> I don't see how this
> changes anything

I don't see how you can't see it.

Oh well.
