git.vger.kernel.org archive mirror
From: Kelly Dean <kellydeanch@yahoo.com>
To: git@vger.kernel.org
Subject: Does content provenance matter?
Date: Sat, 5 May 2012 13:49:16 -0700 (PDT)
Message-ID: <1336250956.54413.YahooMailClassic@web121505.mail.ne1.yahoo.com>

Suppose you make dirs B and C, copy file X into B and C, insert "foo" somewhere into B/X and the same place into C/X, and commit. Now, you copy "foo" from B/X into the same place in the original X, and commit again. Git doesn't record the information about whether "foo" was copied from B or C, and this is intentional, on the theory that just content, not provenance, is what matters.
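To illustrate the first scenario: git identifies a file's content by hashing a short header plus the bytes of the file, so two files with identical content get the same blob ID regardless of their path. The sketch below reproduces that hash in plain Python; the file contents are made-up placeholders, and the helper name `git_blob_id` is hypothetical, not part of git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    # A blob is hashed as: b"blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Placeholder contents (assumed for illustration only).
original_x = b"line one\nline two\n"
edited_x = b"line one\nfoo\nline two\n"

b_x = edited_x  # B/X after inserting "foo"
c_x = edited_x  # C/X after inserting "foo" at the same place

# Same bytes give the same blob ID: nothing in the ID says B or C.
print(git_blob_id(b_x) == git_blob_id(c_x))
# Different bytes give a different blob ID.
print(git_blob_id(original_x) == git_blob_id(b_x))
```

Once "foo" is copied into the original X, its blob ID matches both B/X and C/X equally, which is why the object store by itself cannot say which copy it came from.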
Suppose instead, you branch master to new branches B and C, insert "foo" into B/X, commit, insert "foo" into C/X, and commit. Now, you merge B back into master. Git records that master contains "foo" because B contained it rather than because C contained it, on the theory that not only content, but also provenance, matters.
Does provenance actually matter, or not? The reason git doesn't record it in the first case isn't simply that your editor didn't store that information (the editor didn't store it because it isn't customary to store it, and there's no standard way to store it). Even if the editor were to store the information (e.g. as metadata for X; details not relevant) and a patch were submitted for git to record this metadata, the git maintainers would presumably reject the patch on the basis that it violates git's design specification, which says that provenance doesn't matter. For the same reason, git intentionally doesn't distinguish renaming a file or directory from deleting it and creating a new one with the same content, as has already been thoroughly debated.
The basic question is, if provenance doesn't matter, then why does a git commit record its parent(s)? Why not omit this information, and figure it out at search time (by looking at all commits with older timestamps), the same as you're supposed to figure out renames at search time and figure out the movement of lines within/among files at search time (by looking at all files in the parent commit(s))? (If speed is an issue, then use an index, but this doesn't require putting such derivative information in the commit record.)
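For contrast with blobs, here is a rough sketch of how a commit's ID depends on its recorded parents. A commit object is text with a `tree` line, zero or more `parent` lines, `author` and `committer` lines, a blank line, and the message, hashed with a `commit <size>\0` header. The helper name `git_commit_id`, the identity string, and the parent IDs below are placeholders assumed for illustration; the tree ID is git's well-known empty tree.

```python
import hashlib


def git_commit_id(tree, parents, author, committer, message):
    """Return the SHA-1 object ID git assigns to a commit with these fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # one line per parent
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    # A commit is hashed as: b"commit <size>\0" followed by the body.
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "A U Thor <author@example.com> 1336250956 -0700"  # placeholder identity
PARENT_B = "b" * 40  # placeholder ID standing in for the tip of branch B
PARENT_C = "c" * 40  # placeholder ID standing in for the tip of branch C

via_b = git_commit_id(EMPTY_TREE, [PARENT_B], WHO, WHO, "add foo\n")
via_c = git_commit_id(EMPTY_TREE, [PARENT_C], WHO, WHO, "add foo\n")

# Identical content (same tree), different recorded parent: different commit.
print(via_b != via_c)
```

So unlike a blob, a commit with the same tree but a different parent is a different object; the parent lines are part of what is hashed, not an index derived from it.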


Thread overview: 10+ messages
2012-05-05 20:49 Kelly Dean [this message]
2012-05-07  8:23 ` Does content provenance matter? Thomas Rast
2012-05-07 21:43   ` Kelly Dean
2012-05-07 22:14     ` PJ Weisberg
2012-05-07 23:13       ` Kelly Dean
2012-05-08  0:03         ` Andrew Ardill
2012-05-08  9:23         ` Philip Oakley
2012-05-08  0:08   ` Junio C Hamano
2012-05-08  0:11     ` Junio C Hamano
2012-05-07 23:12 ` Jakub Narebski
