From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org,
	Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Subject: Re: [PATCH v2 1/3] diff-files --raw: handle intent-to-add files correctly
Date: Thu, 25 Jun 2020 19:52:41 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.2006251618070.54@tvgsbejvaqbjf.bet>
In-Reply-To: <xmqqy2oc8oye.fsf@gitster.c.googlers.com>

Hi Junio,

On Wed, 24 Jun 2020, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > Sure, but my intention was to synchronize the `--raw` vs the `--patch`
> > output: the latter _already_ shows the correct hash. This here patch makes
> > the hash in the former's output match the latter's.
>
> That is shooting for a wrong uniformity and breaking consistency
> among the `--raw` modes.
>
>  $ git reset --hard
>  $ echo "/* end */" >cache.h ;# taint
>  $ git diff-files --raw
>  ... this shows (M)odified with 0{40} on the postimage
>  ... 0{40} for side that is known to have contents from low-level diff
>  ... means "object name unknown; figure it out yourself if you need it"
>  $ git update-index cache.h
>  $ git diff-files --raw
>  ... of course we see nothing here.  Wait for a bit.
>  $ touch cache.h ;# smudge
>  $ git diff-files --raw
>  ... this shows (M)odified with 0{40} on the postimage
>  ... again, it says "it is stat dirty so I do not bother to compute"
>  $ git update-index --refresh
>  $ git diff-files --raw
>  ... again we see nothing.
>
> Any tools that work on "--raw" output must be already prepared to
> see 0{40} on the side that is known to have contents and must know
> to grab the contents from the working tree file if they need them,
> so showing the 0{40} for i-t-a entry (whose definition is "the user
> said in the past that the final contents of the file will be added
> later, but Git does not know what object it will be yet") cannot
> break them.  And the behaviour of giving 0{40} in such a case aligns
> well with what is already done for paths already added to the index
> when Git does not have an already-computed object name handy.

Well, don't you know, I never realized that the hash shown by `git
diff-files --raw` for modified files was all-zero while `git diff-files
-p` showed the computed one matching the current worktree version!

> > Besides, we're talking about the post-image of `diff-files`, i.e. the
> > worktree version, here. I seem to remember that the pre-image already uses
> > the all-zero hash to indicate precisely what you mentioned above.
>
> The 0{40} you see for pre-image for (A)dded paths means a completely
> different thing from the 0{40} I have been explaining in the above,
> so that is not relevant here.
>
> By definition, there is *no* contents for the pre-image side of
> (A)dded paths (that is why I stressed the "side that must have
> contents" in the above description---it is determined by the type of
> the change), but because the format requires us to place some
> hexadecimal there, we fill the space with 0{40}.
>
> When we do not know the object name for the side that is known to
> have contents without performing extra computation (including "stat
> dirty so we cannot tell without rehashing"), we also use 0{40} as a
> signal to say "we do not know the actual contents", but the consumer
> of "--raw" format is expected to know the difference between "this
> side is known to have no data and 0{40} is here as filler" and "this
> side must have contents but we see 0{40} because Git does not have
> it handy in precomputed form".
>
> The above is the same for "diff-index --raw" without "--cached";
> when we have to hash before we can give the object name (e.g. the
> path is stat-dirty), we give 0{40} and let the consumer figure it
> out if it needs to.
>
>  $ git reset --hard
>  $ touch COPYING
>  $ git diff-index --raw HEAD
>  ... we see (M)odified with 0{40} on the right hand side.
>
> When the caller asks for "--patch" or any other output format that
> actually needs contents for output, however, these low-level tools
> do read the contents, and as a side effect, they may hash to obtain
> the object name and show it [*1*].
>
> By the way, as I do not want to see you waste your time going in a
> wrong direction just to be "different", let me make it clear that as
> far as the design of low level diff plumbing is concerned, what I
> said here is final.  Please don't waste your time on arguing for
> changing the design now after 15 years.  I want to see your time
> used in a more productive way for the project.

Thank you for patiently explaining to me something I managed to miss for a
decade and a half.
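
For anyone reading this in the archive later: the two meanings of the
all-zero object name described above can be sketched as a small shell
helper for a tool that consumes `--raw` records. This is only an
illustration under stated assumptions, not code from Git or from this
patch series. The name `classify_oid` and its three output words
(`known`, `filler`, `unknown`) are made up for this example, and it
assumes the usual record layout of two modes, two object names and a
status letter, followed by a TAB and the path.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): classify one side of a `--raw`
# record.  Assumed record layout:
#   :<old mode> <new mode> <old oid> <new oid> <status>[score]<TAB><path>
#
# Usage: classify_oid pre|post "<record>"
# Prints:
#   known    the object name is a real, precomputed one
#   filler   this side has no contents by definition (pre-image of an
#            (A)dded path, post-image of a (D)eleted path)
#   unknown  this side has contents, but Git did not hash them; the
#            consumer has to read the working tree file itself
classify_oid() {
    side=$1
    line=$2
    tab=$(printf '\t')

    meta=${line%%"$tab"*}   # drop the TAB and the path(s) after it
    meta=${meta#:}          # drop the leading colon

    # Split the remaining fields on whitespace into $1..$5.
    # shellcheck disable=SC2086
    set -- $meta

    case "$side" in
    pre)  oid=$3 absent=A ;;
    post) oid=$4 absent=D ;;
    *)    echo "usage: classify_oid pre|post <record>" >&2; return 2 ;;
    esac

    # Any character other than '0' means a real object name.
    case "$oid" in
    *[!0]*) echo known; return 0 ;;
    esac

    # All zeros: the status letter decides which meaning applies.
    case "$5" in
    "$absent"*) echo filler ;;
    *)          echo unknown ;;
    esac
}
```

Fed a hand-written record for an (A)dded path whose two object names
are both all-zero, this prints `filler` for `pre` and `unknown` for
`post`, which is the distinction a consumer of the raw format is
expected to make from the type of the change alone.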

I'll send out v4 in a moment.

Ciao,
Dscho

>
> Thanks.
>
>
> [Footnote]
>
> *1* This division of labor to free "--raw" mode of anything remotely
>     unnecessary stems from the original diff plumbing design in May
>     2005 where the "--raw" mode was the only output mode, and there
>     was a separate "git-diff-helper" (look for it in the mailing
>     list archive if you want to learn more) that reads a "--raw"
>     output and transforms it into the patch form.  That "once we
>     have the raw diff, we can pipe it to post-processing and do more
>     interesting things" eventually led to the design of the diffcore
>     pipeline where we match up (A)dded and (D)eleted entries to
>     detect renames, etc.
>
>
