git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Johannes Sixt <j.sixt@viscovery.net>
Subject: Re: [PATCH v2 1/4] notes: preserve object type given by "add -C"
Date: Fri, 11 May 2012 23:12:27 -0700	[thread overview]
Message-ID: <7vzk9ehqr8.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <CACsJy8Avusvu9LJeg1L=OZ9=qW+FaqbNWfA_rrZJUY_3WqfhOg@mail.gmail.com> (Nguyen Thai Ngoc Duy's message of "Sat, 12 May 2012 12:20:00 +0700")

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> On Sat, May 12, 2012 at 4:16 AM, Junio C Hamano <gitster@pobox.com> wrote:
> ...
>> It is not automatically "converting"; as far as the notes subsystem is
>> concerned, the data you throw at it to be associated with an object the
>> note annotates has always been uninterpreted stream of bytes. If an
>> application wants to store the raw representation of a commit object
>> including the log message and its header, it has every right to expect
>> that it can use "git cat-file commit $argument_to_the_C_option" as the
>> source of that uninterpreted stream of bytes, doesn't it?
>
> Some part of git-notes expects this stream of bytes to be textual,
> human readable. In that case, "git notes add -C commit/tag"'s stuffing
> a blob with the given commit/tag content to notes tree may make sense.
> Personally I'd rather stuff the commit/tag object instead so no new
> blobs are created. The end result is the same: read_sha1_file() always
> return correct text, so does "git notes show".

No, the end result is definitely not the same.

There are two important characteristics of "uninterpreted byte stream" the
above thinking is not taking into consideration:

 (1) we do not interpret what the application stores; and
 (2) the application is *not* limited by our type system.

Suppose the application happens to want to stuff the contents it took from
a commit object, and "add -C $objecname" is a convenient way to do so.  We
have recorded it as "blob" because it is uninterpreted stream of bytes. If
you record that as a leaf note in the note tree, does that mean the note
tree now suddenly have a submodule?  Hell, no.

What if the application wanted to record the contents of a tree object
instead?  How would that affect the fan-out mechanism the note subsystem
uses to hash the 40-hexadecimal object names?  After descending the notes
tree to consume the object name to reach the leaf node, it still finds
even more level hanging below.  Not very careful "list all object names
that have notes attached in this note tree" implementation may end up
descending into the tree object, because of this patch.  Another bad
implication of such a change is that suddenly we would start including all
the subobjects in that tree object when computing the reachability of the
notes tree.

The application needs to have a way to tell what is in the data it stores
anyway, because it is not necessarily "enhancing git" kind of application
that talks about relationships between git objects.  And it may do so
either by convention (e.g. my "notes/amlog" notes tree only records a
single-line text that is a Message-Id header of the original patch e-mail
commits came from) or by having its own way to identify the application
specific data type (e.g. json, pickle, protobuf, etc.).  It is pointless
to say "Git object types can be stored natively, but there is only one
type of blob so the application needs to find a way to coax the types of
data it wants to store itself."  It needs to do so anyway, and having
native and standardized way only for git object types does not improve the
system.  It only ties our hands going forward because we need to define
what it _means_ to store non-blob types in the notes tree, and support
that forever.

So this 1/4 patch is _not_ a bugfix at all.  It breaks perfectly good
current storage semantics without no good reason.

For that matter, as long as $EDITOR is set to something appropriate for
the application specific data, there is no reason to forbid editing,
either.

The only sensible safety against "oops, I forgot that this notes tree
stores binary gunk" I can think of offhand that won't cripple sensible use
case is to sample the data to see if it is binary and ask confirmation,
similar to how "less" asks "frotz may be a binary file. See it anyway?",
and do so only when we are spewing it to the terminal.

  reply	other threads:[~2012-05-12  6:12 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-08 13:11 [PATCH] notes: do not accept non-blobs as new notes Nguyễn Thái Ngọc Duy
2012-05-08 16:03 ` Jeff King
2012-05-08 16:26   ` Junio C Hamano
2012-05-09  8:19   ` Nguyen Thai Ngoc Duy
2012-05-09 17:37     ` Jeff King
2012-05-09 17:52       ` Junio C Hamano
2012-05-09 18:43         ` Jeff King
2012-05-10 14:04 ` [PATCH 0/4] git-notes ui fixes regarding non-blobs notes Nguyễn Thái Ngọc Duy
2012-05-10 14:04   ` [PATCH 1/4] notes: preserve object type given by "add -C" Nguyễn Thái Ngọc Duy
2012-05-10 14:04   ` [PATCH 2/4] notes: "add -c" refuses to open an editor with non-blobs Nguyễn Thái Ngọc Duy
2012-05-10 15:26     ` Johannes Sixt
2012-05-11  1:11       ` Nguyen Thai Ngoc Duy
2012-05-10 14:05   ` [PATCH 3/4] notes: refuse to edit non-blobs Nguyễn Thái Ngọc Duy
2012-05-10 14:05   ` [PATCH 4/4] notes: refuse to append to non-blob notes Nguyễn Thái Ngọc Duy
2012-05-10 14:43     ` [PATCH 4/4] notes: only append a blob to a blob Nguyễn Thái Ngọc Duy
2012-05-10 15:19       ` Jeff King
2012-05-10 15:31         ` Nguyen Thai Ngoc Duy
2012-05-10 15:45           ` Jeff King
2012-05-11  3:57             ` Junio C Hamano
2012-05-10 14:29   ` [PATCH 0/4] git-notes ui fixes regarding non-blobs notes Jeff King
2012-05-11  1:25   ` [PATCH v2 0/4] non-blob notes fixes Nguyễn Thái Ngọc Duy
2012-05-11  1:25   ` [PATCH v2 1/4] notes: preserve object type given by "add -C" Nguyễn Thái Ngọc Duy
2012-05-11 21:16     ` Junio C Hamano
2012-05-12  5:20       ` Nguyen Thai Ngoc Duy
2012-05-12  6:12         ` Junio C Hamano [this message]
2012-05-12  6:58           ` Nguyen Thai Ngoc Duy
2012-05-11  1:25   ` [PATCH v2 2/4] notes: "add -c" refuses to open an editor with non-blobs Nguyễn Thái Ngọc Duy
2012-05-11  1:25   ` [PATCH v2 3/4] notes: refuse to edit non-blobs Nguyễn Thái Ngọc Duy
2012-05-11  1:25   ` [PATCH v2 4/4] notes: only allow to append a blob to a blob Nguyễn Thái Ngọc Duy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vzk9ehqr8.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=j.sixt@viscovery.net \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).