git.vger.kernel.org archive mirror
* [QUESTION] how to diff one blob with nothing
@ 2023-07-19  9:59 ZheNing Hu
  2023-07-19 16:15 ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: ZheNing Hu @ 2023-07-19  9:59 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Hi,

I want to diff two blobs right now, and one of them
may be empty, so I tried using
0000000000000000000000000000000000000000 or
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (empty blobID)
to test its effect, and the result I found was:

git diff 00750edc07d6415dcc07ae0351e9397b0222b7ba
0000000000000000000000000000000000000000
fatal: bad object 0000000000000000000000000000000000000000

git diff 00750edc07d6415dcc07ae0351e9397b0222b7ba
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
fatal: bad object e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

Since the "empty object" has not been created, the
git diff operation fails.

There may be some other methods [1] available to diff
two blobs by creating an empty blob, but I don't want to
create this empty blob, are there any other ways to solve
this problem?

[1]: https://stackoverflow.com/questions/14564034/creating-a-git-diff-from-nothing


* Re: [QUESTION] how to diff one blob with nothing
  2023-07-19  9:59 [QUESTION] how to diff one blob with nothing ZheNing Hu
@ 2023-07-19 16:15 ` Junio C Hamano
  2023-07-26 18:00   ` ZheNing Hu
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-07-19 16:15 UTC (permalink / raw)
  To: ZheNing Hu; +Cc: Git List

ZheNing Hu <adlternative@gmail.com> writes:

> I want to diff two blobs right now, and one of them
> may be empty, so I tried using
> 0000000000000000000000000000000000000000 or
> e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (empty blobID)
> to test its effect, and the result I found was:
>
> git diff 00750edc07d6415dcc07ae0351e9397b0222b7ba
> 0000000000000000000000000000000000000000
> fatal: bad object 0000000000000000000000000000000000000000

As the object name for an empty blob is not all-0, this is expected.
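
For reference, the name of the empty blob can be checked without writing anything: without -w, hash-object only prints the name. A minimal sketch, assuming a SHA-1 repository and that sha1sum is installed; the second command is only an illustration of how the name is derived and is not something proposed in this thread.

```shell
# Print the object name of the empty blob without storing it (no -w).
empty_blob=$(git hash-object --stdin </dev/null)
echo "$empty_blob"

# The same name derived by hand: SHA-1 over the header "blob 0" plus a NUL byte,
# followed by zero bytes of content.
by_hand=$(printf 'blob 0\0' | sha1sum | cut -d' ' -f1)
echo "$by_hand"
```

Both commands print a real hash of real (empty) content, which is why the all-zero name cannot stand in for it.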

> git diff 00750edc07d6415dcc07ae0351e9397b0222b7ba
> e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
> fatal: bad object e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
>
> Since the "empty object" has not been created, the
> git diff operation fails.

If you haven't created one, of course it would fail.  It should help
to do

    $ git hash-object -w --stdin </dev/null

before running

    $ git diff 00750edc e69de29bb

Long time ago, with 346245a1 (hard-code the empty tree object,
2008-02-13) we taught git what an empty-tree looks like, but it
seems that we never did the same for an empty blob, perhaps?

Interesting.  I am not sure if it is a good idea to teach empty_blob
to find_cached_object() and leaning negative but I haven't thought
things through on this.





* Re: [QUESTION] how to diff one blob with nothing
  2023-07-19 16:15 ` Junio C Hamano
@ 2023-07-26 18:00   ` ZheNing Hu
  2023-07-26 18:23     ` Junio C Hamano
  2023-07-27 17:13     ` Konstantin Khomoutov
  0 siblings, 2 replies; 15+ messages in thread
From: ZheNing Hu @ 2023-07-26 18:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

Junio C Hamano <gitster@pobox.com> wrote on Thu, Jul 20, 2023 at 00:15:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> > I want to diff two blobs right now, and one of them
> > may be empty, so I tried using
> > 0000000000000000000000000000000000000000 or
> > e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (empty blobID)
> > to test its effect, and the result I found was:
> >
> > git diff 00750edc07d6415dcc07ae0351e9397b0222b7ba
> > 0000000000000000000000000000000000000000
> > fatal: bad object 0000000000000000000000000000000000000000
>
> As the object name for an empty blob is not all-0, this is expected.
>
> > git diff 00750edc07d6415dcc07ae0351e9397b0222b7ba
> > e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
> > fatal: bad object e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
> >
> > Since the "empty object" has not been created, the
> > git diff operation fails.
>
> If you haven't created one, of course it would fail.  It should help
> to do
>
>     $ git hash-object -w --stdin </dev/null
>
> before running
>
>     $ git diff 00750edc e69de29bb
>

This is a viable solution, but it's a bit ugly since a read-only "diff"
requires "writing" an empty blob.

> Long time ago, with 346245a1 (hard-code the empty tree object,
> 2008-02-13) we taught git what an empty-tree looks like, but it
> seems that we never did the same for an empty blob, perhaps?
>

It would be great to have that capability.

> Interesting.  I am not sure if it is a good idea to teach empty_blob
> to find_cached_object() and leaning negative but I haven't thought
> things through on this.
>

I currently don't have time to delve into implementation details,
so I went with the approach that requires writing an empty blob, but
I'll take a closer look at it later.

>
>

Thanks,
ZheNing Hu


* Re: [QUESTION] how to diff one blob with nothing
  2023-07-26 18:00   ` ZheNing Hu
@ 2023-07-26 18:23     ` Junio C Hamano
  2023-07-27 17:46       ` Taylor Blau
  2023-07-27 17:13     ` Konstantin Khomoutov
  1 sibling, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-07-26 18:23 UTC (permalink / raw)
  To: ZheNing Hu; +Cc: Git List

ZheNing Hu <adlternative@gmail.com> writes:

>> If you haven't created one, of course it would fail.  It should help
>> to do
>>
>>     $ git hash-object -w --stdin </dev/null
>>
>> before running
>>
>>     $ git diff 00750edc e69de29bb
>>
>
> This is a viable solution, but it's a bit ugly since a read-only "diff"
> requires "writing" an empty blob.

If you do not even have an empty blob, you have no business
comparing some other blobs you have with it, do you?

If you do not have a file with a single line "hello, world\n" (that
hashes to 4b5fa63702dd96796042e92787f464e28f09f17d if written in a
blob), then you cannot do "git diff 4b5fa637" with anything and
expect it to work.  It is the same thing.

Besides, if you _know_ you want to compare a blob X to emptiness,
you are better off doing "git cat-file blob X" in the first place.
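
A minimal sketch of that idea, under the assumption that the consumer wants patch-style "+" lines: every line of X would be an addition, so prefixing the output of cat-file is enough. The scratch repository and its content are hypothetical and exist only to make the example runnable.

```shell
# Scratch repository so the example is self-contained (hypothetical content).
repo=$(mktemp -d)
git -C "$repo" init -q
X=$(printf 'hello, world\n' | git -C "$repo" hash-object -w --stdin)

# "Diff against nothing": every line of blob X becomes an added line.
added=$(git -C "$repo" cat-file blob "$X" | sed 's/^/+/')
echo "$added"
```

This only covers text content; a binary blob would need different handling.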


* Re: [QUESTION] how to diff one blob with nothing
  2023-07-26 18:00   ` ZheNing Hu
  2023-07-26 18:23     ` Junio C Hamano
@ 2023-07-27 17:13     ` Konstantin Khomoutov
  2023-07-28  3:35       ` ZheNing Hu
  1 sibling, 1 reply; 15+ messages in thread
From: Konstantin Khomoutov @ 2023-07-27 17:13 UTC (permalink / raw)
  To: Git List; +Cc: ZheNing Hu

>> If you haven't created one, of course it would fail.  It should help
>> to do
>>
>>     $ git hash-object -w --stdin </dev/null
>>
>> before running
>>
>>     $ git diff 00750edc e69de29bb
>>
>
> This is a viable solution, but it's a bit ugly since a read-only "diff"
> requires "writing" an empty blob.

You could probably just do

  git cat-file -s e69de29bb

to figure out whether a blob is empty.
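
As a sketch of such a check (the scratch repository and the variable names are assumptions made for illustration), note that cat-file -s reports the size of an object that exists in the repository, so it says whether a given blob is empty, not whether the empty blob is present:

```shell
# Scratch repository with one non-empty blob (hypothetical content).
repo=$(mktemp -d)
git -C "$repo" init -q
oid=$(printf 'hello, world\n' | git -C "$repo" hash-object -w --stdin)

# cat-file -s prints the object's size in bytes.
size=$(git -C "$repo" cat-file -s "$oid")
if [ "$size" -eq 0 ]; then
    echo "blob $oid is empty"
else
    echo "blob $oid has $size bytes"
fi
```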

What is your end goal? Do you indeed want to produce a "trivial patch" which
merely "adds" all the lines of the blob you'd like to compare to an empty one
(assuming the blob contains text)?



* Re: [QUESTION] how to diff one blob with nothing
  2023-07-26 18:23     ` Junio C Hamano
@ 2023-07-27 17:46       ` Taylor Blau
  2023-07-28  3:40         ` ZheNing Hu
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2023-07-27 17:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: ZheNing Hu, Git List

On Wed, Jul 26, 2023 at 11:23:52AM -0700, Junio C Hamano wrote:
> ZheNing Hu <adlternative@gmail.com> writes:
>
> >> If you haven't created one, of course it would fail.  It should help
> >> to do
> >>
> >>     $ git hash-object -w --stdin </dev/null
> >>
> >> before running
> >>
> >>     $ git diff 00750edc e69de29bb
> >>
> >
> > This is a viable solution, but it's a bit ugly since a read-only "diff"
> > requires "writing" an empty blob.
>
> If you do not even have an empty blob, you have no business
> comparing some other blobs you have with it, do you?
>
> If you do not have a file with a single line "hello, world\n" (that
> hashes to 4b5fa63702dd96796042e92787f464e28f09f17d if written in a
> blob), then you cannot do "git diff 4b5fa637" with anything and
> expect it to work.  It is the same thing.
>
> Besides, if you _know_ you want to compare a blob X to emptiness,
> you are better off doing "git cat-file blob X" in the first place.

Yeah, exactly. In 346245a1bb6 (hard-code the empty tree object,
2008-02-13), the rationale was partly that having the empty tree object
is useful for showing some diffs, such as for the initial commit.

But I can't think of a similar argument for the empty blob. Like Junio
said, if you're purposefully diff-ing against the empty blob, wouldn't
you simply want the entire contents anyway? If that's the case, cat-file
seems like a much more appropriate tool.

Thanks,
Taylor


* Re: [QUESTION] how to diff one blob with nothing
  2023-07-27 17:13     ` Konstantin Khomoutov
@ 2023-07-28  3:35       ` ZheNing Hu
  0 siblings, 0 replies; 15+ messages in thread
From: ZheNing Hu @ 2023-07-28  3:35 UTC (permalink / raw)
  To: Git List, ZheNing Hu

Konstantin Khomoutov <kostix@bswap.ru> wrote on Fri, Jul 28, 2023 at 01:13:
>
> >> If you haven't created one, of course it would fail.  It should help
> >> to do
> >>
> >>     $ git hash-object -w --stdin </dev/null
> >>
> >> before running
> >>
> >>     $ git diff 00750edc e69de29bb
> >>
> >
> > This is a viable solution, but it's a bit ugly since a read-only "diff"
> > requires "writing" an empty blob.
>
> You could probably just do
>
>   git cat-file -s e69de29bb
>
> to figure out whether a blob is empty.
>
> What is your end goal? Do you indeed want to produce a "trivial patch" which
> merely "adds" all the lines of the blob you'd like to compare to an empty one
> (assuming the blob contains text)?
>

Yes, I need to compute diffs between multiple versions of blobs in real time
(given only the blob IDs). Whenever a file is created or deleted, I have to
perform this comparison with the "empty blob".


* Re: [QUESTION] how to diff one blob with nothing
  2023-07-27 17:46       ` Taylor Blau
@ 2023-07-28  3:40         ` ZheNing Hu
  2023-08-03  5:16           ` ZheNing Hu
  0 siblings, 1 reply; 15+ messages in thread
From: ZheNing Hu @ 2023-07-28  3:40 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, Git List

Taylor Blau <me@ttaylorr.com> wrote on Fri, Jul 28, 2023 at 01:46:
>
> On Wed, Jul 26, 2023 at 11:23:52AM -0700, Junio C Hamano wrote:
> > ZheNing Hu <adlternative@gmail.com> writes:
> >
> > >> If you haven't created one, of course it would fail.  It should help
> > >> to do
> > >>
> > >>     $ git hash-object -w --stdin </dev/null
> > >>
> > >> before running
> > >>
> > >>     $ git diff 00750edc e69de29bb
> > >>
> > >
> > > This is a viable solution, but it's a bit ugly since a read-only "diff"
> > > requires "writing" an empty blob.
> >
> > If you do not even have an empty blob, you have no business
> > comparing some other blobs you have with it, do you?
> >
> > If you do not have a file with a single line "hello, world\n" (that
> > hashes to 4b5fa63702dd96796042e92787f464e28f09f17d if written in a
> > blob), then you cannot do "git diff 4b5fa637" with anything and
> > expect it to work.  It is the same thing.
> >
> > Besides, if you _know_ you want to compare a blob X to emptiness,
> > you are better off doing "git cat-file blob X" in the first place.
>
> Yeah, exactly. In 346245a1bb6 (hard-code the empty tree object,
> 2008-02-13), the rationale was partly that having the empty tree object
> is useful for showing some diffs, such as for the initial commit.
>
> But I can't think of a similar argument for the empty blob. Like Junio
> said, if you're purposefully diff-ing against the empty blob, wouldn't
> you simply want the entire contents anyway? If that's the case, cat-file
> seems like a much more appropriate tool.
>

Here, it is necessary to compare multiple versions of blobs while also
considering the situations of creation and deletion.

Well, what I need is the "diff" content, with lines in the diff indicating
'+' or '-' signs. This can be achieved by manually adding them, but it is
not very compatible with the original logic.

> Thanks,
> Taylor


* Re: [QUESTION] how to diff one blob with nothing
  2023-07-28  3:40         ` ZheNing Hu
@ 2023-08-03  5:16           ` ZheNing Hu
  2023-08-03 15:24             ` Taylor Blau
  0 siblings, 1 reply; 15+ messages in thread
From: ZheNing Hu @ 2023-08-03  5:16 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, Git List

ZheNing Hu <adlternative@gmail.com> wrote on Fri, Jul 28, 2023 at 11:40:
>
> Taylor Blau <me@ttaylorr.com> wrote on Fri, Jul 28, 2023 at 01:46:
> >
> > On Wed, Jul 26, 2023 at 11:23:52AM -0700, Junio C Hamano wrote:
> > > ZheNing Hu <adlternative@gmail.com> writes:
> > >
> > > >> If you haven't created one, of course it would fail.  It should help
> > > >> to do
> > > >>
> > > >>     $ git hash-object -w --stdin </dev/null
> > > >>
> > > >> before running
> > > >>
> > > >>     $ git diff 00750edc e69de29bb
> > > >>
> > > >
> > > > This is a viable solution, but it's a bit ugly since a read-only "diff"
> > > > requires "writing" an empty blob.
> > >
> > > If you do not even have an empty blob, you have no business
> > > comparing some other blobs you have with it, do you?
> > >
> > > If you do not have a file with a single line "hello, world\n" (that
> > > hashes to 4b5fa63702dd96796042e92787f464e28f09f17d if written in a
> > > blob), then you cannot do "git diff 4b5fa637" with anything and
> > > expect it to work.  It is the same thing.
> > >
> > > Besides, if you _know_ you want to compare a blob X to emptiness,
> > > you are better off doing "git cat-file blob X" in the first place.
> >
> > Yeah, exactly. In 346245a1bb6 (hard-code the empty tree object,
> > 2008-02-13), the rationale was partly that having the empty tree object
> > is useful for showing some diffs, such as for the initial commit.
> >
> > But I can't think of a similar argument for the empty blob. Like Junio
> > said, if you're purposefully diff-ing against the empty blob, wouldn't
> > you simply want the entire contents anyway? If that's the case, cat-file
> > seems like a much more appropriate tool.
> >
>
> Here, it is necessary to compare multiple versions of blobs while also
> considering the situations of creation and deletion.
>
> Well, what I need is the "diff" content, with lines in the diff indicating
> '+' or '-' signs. This can be achieved by manually adding them, but it is
> not very compatible with the original logic.
>

The native diff command itself supports comparison with an empty file.

#diff -u  /dev/null a
--- /dev/null 2023-07-25 16:47:50.270094301 +0800
+++ a 2023-08-03 13:14:16.980262362 +0800
@@ -0,0 +1 @@
+a

So I believe this feature would also be useful in git.
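
One read-only way to get that output today, sketched under the assumption that the system diff accepts "-" for standard input (POSIX diff does), is to pipe the blob into diff. The scratch repository exists only so the example runs on its own; against an existing repository only the cat-file line would be needed.

```shell
# Scratch repository holding one blob (setup only; hypothetical content).
repo=$(mktemp -d)
git -C "$repo" init -q
oid=$(printf 'a\n' | git -C "$repo" hash-object -w --stdin)

# Read-only: stream the blob into diff and compare it with /dev/null.
# diff exits with status 1 when the inputs differ, hence the "|| true".
patch=$(git -C "$repo" cat-file blob "$oid" | diff -u /dev/null - || true)
echo "$patch"
```

The file headers will name "/dev/null" and "-" rather than a path, so a consumer that cares about them would have to rewrite those two lines.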

> > Thanks,
> > Taylor


* Re: [QUESTION] how to diff one blob with nothing
  2023-08-03  5:16           ` ZheNing Hu
@ 2023-08-03 15:24             ` Taylor Blau
  2023-08-04  2:28               ` ZheNing Hu
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2023-08-03 15:24 UTC (permalink / raw)
  To: ZheNing Hu; +Cc: Junio C Hamano, Git List

On Thu, Aug 03, 2023 at 01:16:02PM +0800, ZheNing Hu wrote:
> > Here, it is necessary to compare multiple versions of blobs while also
> > considering the situations of creation and deletion.
> >
> > Well, what I need is the "diff" content, with lines in the diff indicating
> > '+' or '-' signs. This can be achieved by manually adding them, but it is
> > not very compatible with the original logic.
>
> The native diff command itself supports comparison with an empty file.
>
> #diff -u  /dev/null a
> --- /dev/null 2023-07-25 16:47:50.270094301 +0800
> +++ a 2023-08-03 13:14:16.980262362 +0800
> @@ -0,0 +1 @@
> +a
>
> So I believe this feature would also be useful in git.

Sure, you can easily diff any file against any other, including if
either one or both are empty. I think the main difference here is that
/dev/null exists on your system without additional configuration and
the empty blob does not exist in a Git repository without additional
configuration (in this case, `git hash-object -w -t blob --stdin
</dev/null`).

TBH, I don't know if /dev/null existing by default is necessarily a
solid argument in favor of having Git repositories come initialized with
the empty blob by default.

(To be clear, when I say "initialized", I mean that a Git repository
would recognize the empty blob object's hash for any value of
`the_hash_algo`, not that every repository would be prepared with a
loose object by default.)
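
If the objection is specifically to writing into the repository, one possible workaround (not something proposed in this thread) is to point GIT_OBJECT_DIRECTORY at a throwaway directory and list the real object store in GIT_ALTERNATE_OBJECT_DIRECTORIES, so that the empty blob lands outside the repository. A sketch, assuming a non-bare repository with the default .git/objects layout; the scratch paths are hypothetical.

```shell
# Scratch repository holding one blob (setup only; hypothetical content).
repo=$(mktemp -d)
git -C "$repo" init -q
oid=$(printf 'a\n' | git -C "$repo" hash-object -w --stdin)

# New objects are written to $scratch; reads fall back to the real store.
scratch=$(mktemp -d)
export GIT_OBJECT_DIRECTORY="$scratch"
export GIT_ALTERNATE_OBJECT_DIRECTORIES="$repo/.git/objects"

empty=$(git -C "$repo" hash-object -w --stdin </dev/null)
patch=$(git -C "$repo" diff "$empty" "$oid")
echo "$patch"

# Restore the environment and discard the throwaway object directory.
unset GIT_OBJECT_DIRECTORY GIT_ALTERNATE_OBJECT_DIRECTORIES
rm -rf "$scratch"
```

The repository's own object directory is left untouched, at the cost of managing a temporary directory per invocation.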

Thanks,
Taylor


* Re: [QUESTION] how to diff one blob with nothing
  2023-08-03 15:24             ` Taylor Blau
@ 2023-08-04  2:28               ` ZheNing Hu
  2023-08-04  8:28                 ` Christian Couder
  0 siblings, 1 reply; 15+ messages in thread
From: ZheNing Hu @ 2023-08-04  2:28 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, Git List

Taylor Blau <me@ttaylorr.com> wrote on Thu, Aug 3, 2023 at 23:24:
>
> On Thu, Aug 03, 2023 at 01:16:02PM +0800, ZheNing Hu wrote:
> > > Here, it is necessary to compare multiple versions of blobs while also
> > > considering the situations of creation and deletion.
> > >
> > > Well, what I need is the "diff" content, with lines in the diff indicating
> > > '+' or '-' signs. This can be achieved by manually adding them, but it is
> > > not very compatible with the original logic.
> >
> > The native diff command itself supports comparison with an empty file.
> >
> > #diff -u  /dev/null a
> > --- /dev/null 2023-07-25 16:47:50.270094301 +0800
> > +++ a 2023-08-03 13:14:16.980262362 +0800
> > @@ -0,0 +1 @@
> > +a
> >
> > So I believe this feature would also be useful in git.
>
> Sure, you can easily diff any file against any other, including if
> either one or both are empty. I think the main difference here is that
> /dev/null exists on your system without additional configuration and
> the empty blob does not exist in a Git repository without additional
> configuration (in this case, `git hash-object -w -t blob --stdin
> </dev/null`).
>
> TBH, I don't know if /dev/null existing by default is necessarily a
> solid argument in favor of having Git repositories come initialized with
> the empty blob by default.
>
> (To be clear, when I say "initialized", I mean that a Git repository
> would recognize the empty blob object's hash for any value of
> `the_hash_algo`, not that every repository would be prepared with a
> loose object by default.)
>

Actually, there is no need to support a default empty blob.
For example, with the command "git diff --no-index <file> /dev/null",
it can compare a file with /dev/null, but it can only compare <file>
and not <oid>.
Therefore, using commands like "git diff <oid> /dev/null",
"git diff --no-index <oid> /dev/null", or even "git diff <oid> --stdin"
could potentially solve this issue.
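
Until something like that exists, the existing file-based form can be reached by materializing the blob first. A sketch, assuming a temporary file outside the repository is acceptable; the scratch repository and the temporary file are hypothetical.

```shell
# Scratch repository holding one blob (setup only; hypothetical content).
repo=$(mktemp -d)
git -C "$repo" init -q
oid=$(printf 'a\n' | git -C "$repo" hash-object -w --stdin)

# Materialize the blob as a temporary file, then use the file-vs-file form.
tmp=$(mktemp)
git -C "$repo" cat-file blob "$oid" >"$tmp"
# --no-index implies --exit-code, so differences give status 1.
patch=$(git diff --no-index /dev/null "$tmp" || true)
rm -f "$tmp"
echo "$patch"
```

The output is a creation patch ("new file mode", "--- /dev/null"), with the temporary file's path in the headers.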

Thanks,
ZheNing Hu


* Re: [QUESTION] how to diff one blob with nothing
  2023-08-04  2:28               ` ZheNing Hu
@ 2023-08-04  8:28                 ` Christian Couder
  2023-08-04 18:34                   ` Taylor Blau
  0 siblings, 1 reply; 15+ messages in thread
From: Christian Couder @ 2023-08-04  8:28 UTC (permalink / raw)
  To: ZheNing Hu; +Cc: Taylor Blau, Junio C Hamano, Git List

On Fri, Aug 4, 2023 at 6:42 AM ZheNing Hu <adlternative@gmail.com> wrote:

> Actually, there is no need to support a default empty blob.
> For example, with the command "git diff --no-index <file> /dev/null",
> it can compare a file with /dev/null, but it can only compare <file>
> and not <oid>.
> Therefore, using commands like "git diff <oid> /dev/null",
> "git diff --no-index <oid> /dev/null", or even "git diff <oid> --stdin"
> could potentially solve this issue.

Maybe it would be clearer to have a new option, called for example
"--blob-vs-file", for that then. It could support both:

$ git diff --blob-vs-file <blob> <file>

and:

$ git diff --blob-vs-file <file> <blob>


* Re: [QUESTION] how to diff one blob with nothing
  2023-08-04  8:28                 ` Christian Couder
@ 2023-08-04 18:34                   ` Taylor Blau
  2023-08-04 19:00                     ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2023-08-04 18:34 UTC (permalink / raw)
  To: Christian Couder; +Cc: ZheNing Hu, Junio C Hamano, Git List

On Fri, Aug 04, 2023 at 10:28:53AM +0200, Christian Couder wrote:
> On Fri, Aug 4, 2023 at 6:42 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> > Actually, there is no need to support a default empty blob.
> > For example, with the command "git diff --no-index <file> /dev/null",
> > it can compare a file with /dev/null, but it can only compare <file>
> > and not <oid>.
> > Therefore, using commands like "git diff <oid> /dev/null",
> > "git diff --no-index <oid> /dev/null", or even "git diff <oid> --stdin"
> > could potentially solve this issue.
>
> Maybe it would be clearer to have a new option, called for example
> "--blob-vs-file", for that then. It could support both:
>
> $ git diff --blob-vs-file <blob> <file>
>
> and:
>
> $ git diff --blob-vs-file <file> <blob>

Hmm. This feels like a case of trying to teach 'git diff' to do too
much.

Thanks,
Taylor


* Re: [QUESTION] how to diff one blob with nothing
  2023-08-04 18:34                   ` Taylor Blau
@ 2023-08-04 19:00                     ` Junio C Hamano
  2023-08-05  8:27                       ` ZheNing Hu
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2023-08-04 19:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Christian Couder, ZheNing Hu, Git List

Taylor Blau <me@ttaylorr.com> writes:

> On Fri, Aug 04, 2023 at 10:28:53AM +0200, Christian Couder wrote:
>> On Fri, Aug 4, 2023 at 6:42 AM ZheNing Hu <adlternative@gmail.com> wrote:
>>
>> > Actually, there is no need to support a default empty blob.
>> > For example, with the command "git diff --no-index <file> /dev/null",
>> > it can compare a file with /dev/null, but it can only compare <file>
>> > and not <oid>.
>> > Therefore, using commands like "git diff <oid> /dev/null",
>> > "git diff --no-index <oid> /dev/null", or even "git diff <oid> --stdin"
>> > could potentially solve this issue.
>>
>> Maybe it would be clearer to have a new option, called for example
>> "--blob-vs-file", for that then. It could support both:
>>
>> $ git diff --blob-vs-file <blob> <file>
>>
>> and:
>>
>> $ git diff --blob-vs-file <file> <blob>
>
> Hmm. This feels like a case of trying to teach 'git diff' to do too
> much.

Worse yet, I do not quite get the original use case in the first
place.  What is the series of diff output that results from comparing a
random pair of blob object names going to be used for?

The reply to <ZMKtcaN7xYaTtkcI@nand.local> says that the original
use case was to express the evolution of a single path since its
creation until its removal, but the thing is, a diff with an empty
blob and a creation or a deletion event are expressed differently in
the patch output, exactly because the patch has to be able to
express "before this change, a file with zero byte content was
there" and "before this change, there was nothing at this path"
(vice versa for contents-removal vs deletion).

For that reason, I have a hard time finding any merit in the earlier
complaint that said "can be achieved by manually adding them, but it
is not very compatible with the original logic", whatever the
"original logic" refers to.  If creation needs to be recorded as
creation and not as a change from an empty and existing blob, there
has to be something that needs to be manually done to turn the
latter (which is the only thing "diff" between two blobs or even a
blob and a file can give) into the former *anyway*.  Whatever the
thing that is looping over the history/evolution of a single path
needs to have a three-arm switch for each iteration to deal with
creation, modification, and removal, and iterating over the contents
of the files and prefixing "+" or "-" on each and every line would
be the _easiest_ part of such a necessary tweak to turn "diff
between an empty contents and something else" into "creation or
deletion of a file."
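
Such a three-arm switch could be sketched as follows. The function name blob_patch, the convention that an empty string means "nothing at this path", and the scratch repository are all assumptions made for illustration; only the hunk body is produced, without the "new file" or "deleted file" headers.

```shell
# blob_patch OLD NEW: print patch-style lines for one step in a path's history.
# An empty argument stands for "no blob at this path" (convention of this sketch).
blob_patch() {
    case "${1:+o}${2:+n}" in
        n)  git -C "$repo" cat-file blob "$2" | sed 's/^/+/' ;;  # creation
        o)  git -C "$repo" cat-file blob "$1" | sed 's/^/-/' ;;  # removal
        on) git -C "$repo" diff "$1" "$2" ;;                     # modification
    esac
}

# Usage against a scratch repository (hypothetical content).
repo=$(mktemp -d)
git -C "$repo" init -q
v1=$(printf 'a\n' | git -C "$repo" hash-object -w --stdin)
v2=$(printf 'a\nb\n' | git -C "$repo" hash-object -w --stdin)

created=$(blob_patch "" "$v1")
modified=$(blob_patch "$v1" "$v2")
deleted=$(blob_patch "$v2" "")
printf '%s\n' "$created" "$modified" "$deleted"
```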



* Re: [QUESTION] how to diff one blob with nothing
  2023-08-04 19:00                     ` Junio C Hamano
@ 2023-08-05  8:27                       ` ZheNing Hu
  0 siblings, 0 replies; 15+ messages in thread
From: ZheNing Hu @ 2023-08-05  8:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, Christian Couder, Git List

Junio C Hamano <gitster@pobox.com> wrote on Sat, Aug 5, 2023 at 03:00:
>
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Fri, Aug 04, 2023 at 10:28:53AM +0200, Christian Couder wrote:
> >> On Fri, Aug 4, 2023 at 6:42 AM ZheNing Hu <adlternative@gmail.com> wrote:
> >>
> >> > Actually, there is no need to support a default empty blob.
> >> > For example, with the command "git diff --no-index <file> /dev/null",
> >> > it can compare a file with /dev/null, but it can only compare <file>
> >> > and not <oid>.
> >> > Therefore, using commands like "git diff <oid> /dev/null",
> >> > "git diff --no-index <oid> /dev/null", or even "git diff <oid> --stdin"
> >> > could potentially solve this issue.
> >>
> >> Maybe it would be clearer to have a new option, called for example
> >> "--blob-vs-file", for that then. It could support both:
> >>
> >> $ git diff --blob-vs-file <blob> <file>
> >>
> >> and:
> >>
> >> $ git diff --blob-vs-file <file> <blob>
> >
> > Hmm. This feels like a case of trying to teach 'git diff' to do too
> > much.
>
> Worse yet, I do not quite get the original use case in the first
> place.  What is the series of diff output that results from comparing a
> random pair of blob object names going to be used for?
>
> The reply to <ZMKtcaN7xYaTtkcI@nand.local> says that the original
> use case was to express the evolution of a single path since its
> creation until its removal, but the thing is, a diff with an empty
> blob and a creation or a deletion event are expressed differently in
> the patch output, exactly because the patch has to be able to
> express "before this change, a file with zero byte content was
> there" and "before this change, there was nothing at this path"
> (vice versa for contents-removal vs deletion).
>
> For that reason, I have a hard time finding any merit in the earlier
> complaint that said "can be achieved by manually adding them, but it
> is not very compatible with the original logic", whatever the
> "original logic" refers to.  If creation needs to be recorded as
> creation and not as a change from an empty and existing blob, there
> has to be something that needs to be manually done to turn the
> latter (which is the only thing "diff" between two blobs or even a
> blob and a file can give) into the former *anyway*.  Whatever the
> thing that is looping over the history/evolution of a single path
> needs to have a three-arm switch for each iteration to deal with
> creation, modification, and removal, and iterating over the contents
> of the files and prefixing "+" or "-" on each and every line would
> be the _easiest_ part of such a necessary tweak to turn "diff
> between an empty contents and something else" into "creation or
> deletion of a file."
>

Okay, let me clarify the background for using an empty blob diff.
Essentially, it is a git web diff interface that requires real-time calculation
of the diff between files across multiple versions and rendering them.
Due to some reasons, the higher-level component did not provide multiple
versions of commits but instead provided blob OIDs (Object IDs).

Therefore, I expected to generate the diff results directly using the
"git diff <oid> <oid>" command. (I only care about the patch part
in the diff and don't really care about the related information of OIDs
in the diff output.) Everything went smoothly except when a blob
is created or deleted, as there is no direct way to obtain
the diff for a blob using the "git diff <oid> <oid>" interface. Initially,
I intended to generate a patch by diffing an empty blob
 (e69de29bb2d1d6434b8b29ae775ad8c2e48c5391) with the blob ID.
However, unlike the empty tree (4b825dc642cb6eb9a060e54bf8d69288fbee4904),
the empty blob does not exist in the git repository by default.
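
The asymmetry can be observed in a freshly initialized repository. A sketch, assuming SHA-1 object names and the Git behavior described in this thread (the empty tree is hard-coded, the empty blob is not); the scratch repository is hypothetical.

```shell
# Fresh repository with no objects written to it.
repo=$(mktemp -d)
git -C "$repo" init -q

empty_tree=4b825dc642cb6eb9a060e54bf8d69288fbee4904
empty_blob=e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# cat-file -e exits 0 if the object is available, non-zero otherwise.
for oid in "$empty_tree" "$empty_blob"; do
    if git -C "$repo" cat-file -e "$oid" 2>/dev/null; then
        echo "$oid: available"
    else
        echo "$oid: missing"
    fi
done

# The empty tree even reports its type although nothing was ever written.
tree_type=$(git -C "$repo" cat-file -t "$empty_tree")
echo "$tree_type"
```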

I had to create an additional empty blob for the purpose of performing
the diff, but this goes against the design of the web-based diff interface,
which is intended to be read-only.

So I might explore options like "git diff <oid> /dev/null" or
"git diff <oid> --stdin"; they would be read-only, but neither
currently exists...

Regardless, I hope that the empty blob diff can simulate the effect of
an empty tree diff:

git diff 4b825dc642cb6eb9a060e54bf8d69288fbee4904

diff --git a/.cirrus.yml b/.cirrus.yml
new file mode 100644
index 0000000000..4860bebd32
--- /dev/null
+++ b/.cirrus.yml
@@ -0,0 +1,22 @@
+env:
+  CIRRUS_CLONE_DEPTH: 1
+
+freebsd_12_task:
+  env:
+    GIT_PROVE_OPTS: "--timer --jobs 10"
+    GIT_TEST_OPTS: "--no-chain-lint --no-bin-wrappers"
+    MAKEFLAGS: "-j4"
+    DEFAULT_TEST_TARGET: prove
+    DEVELOPER: 1
+  freebsd_instance:
+    image_family: freebsd-12-3
+    memory: 2G
+  install_script:
+    pkg install -y gettext gmake perl5
+  create_user_script:
+    - pw useradd git
+    - chown -R git:git .
+  build_script:
+    - su git -c gmake
+  test_script:
+    - su git -c 'gmake test'

Thanks,
ZheNing Hu
