git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: "Varun Naik" <vcnaik94@gmail.com>,
	git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [RFC PATCH] unpack-trees.c: handle empty deleted ita files
Date: Tue, 13 Aug 2019 13:32:56 -0700	[thread overview]
Message-ID: <xmqqr25o7qmf.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <b7f4b745-8942-6d90-dbc5-7f79f2cc323e@web.de> (=?utf-8?Q?=22R?= =?utf-8?Q?en=C3=A9?= Scharfe"'s message of "Tue, 13 Aug 2019 20:08:06 +0200")

René Scharfe <l.s.r@web.de> writes:

> Am 13.08.19 um 18:03 schrieb Varun Naik:
>> It is possible to delete a committed file from the index and then add it
>> as intent-to-add. Several variations of "reset" and "checkout" should
>> resurrect the file in the index from HEAD. "merge", "cherry-pick", and
>> "revert" should all fail with an error message. This patch provides the
>> desired behavior even when the file is empty in HEAD.
>>
>> The affected commands all compare two cache entries by calling
>> unpack-trees.c:same(). A cache entry for an ita file and a cache entry
>> for an empty file have the same oid. So, the cache entry for an empty
>> deleted ita file was previously considered the "same" as the cache entry
>> for an empty file. This fix adds a comparison of the intent-to-add bits
>> so that the two cache entries are no longer considered the "same".
>>
>> Signed-off-by: Varun Naik <vcnaik94@gmail.com>
>> ---
>> I am marking this patch as RFC because it is changing code deep in
>> unpack-trees.c, and I'm sure it will generate some controversy :)
>
> Lacking experience with intent-to-add I don't see why this would be
> controversial.

The "same()" function here is used to compare two cache entries
(either came from the in-core index, or fabricated out of one of the
tree objects as a potential merge result to replace the one in the
index), and is expected to tell if they are the same or the
different, which is used for example in a logic like "if the one in
the index and the one came from HEAD are not the same, then it is
not safe to continue the merge", "if the one came from HEAD and the
one from the other branch are the same, just keep the one from the
index (which may or may not be the same as HEAD)".

The original code considered that two entries with the same mode and
the same "contents" are the same.  As nobody sane tracks an empty
file for an extended span of history, that meant that most of the
time, intent-to-add entries, which has the normal mode bits for the
blobs (with or without the executable bit) and object name for a
zero length blob, would have been judged "different".

This change extends the logic to the case in which the contents
recorded in the tree-ishes is also a zero length blob.  We used to
say the I-T-A entry was the same as the recorded blob, but whether
the recorded one in the tree-ish is 0-length or 1-byte blob, this
patch does not want it to be declared the "same" as any I-T-A entry.

So in that sense, it makes the behaviour for I-T-A entries
consistent.  But it is a separate matter if the consistency is good;
we do not want our code to be consistently wrong ;-)

>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 50189909b8..9b7e6b01c4 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -1661,6 +1661,7 @@ static int same(const struct cache_entry *a, const struct cache_entry *b)
>>  	if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
>>  		return 0;
>>  	return a->ce_mode == b->ce_mode &&
>> +	       !ce_intent_to_add(a) == !ce_intent_to_add(b) &&
>
> Why the bangs?  This would work just as well and be slightly easier to
> read without negating both sides, wouldn't it?

Looking at the implementation of that macro, I agree.  If it were
"returns non-zero for true, zero for false, we do not guarantee that
we return the same non-zero value for true all the time", then these
bangs do make sense, but that is not the case here.

But more importantly, can both sides of the comparison be I-T-A
entries?

I offhand do not think such a situation can arise (a cache entry
with I-T-A bit on can only come from the in-core index, IIUC, and
never from a tree-ish in this codepath), but if we encouter such a
case, I would imagine that we do not want to treat an I-T-A entry to
be the same with anything else, including another I-T-A entry.

So perhaps 

+	!ce_intent_to_add(a) && !ce_intent_to_add(b) &&

i.e. "a cache entry is eligible to be same with something else only
when its I-T-A bit is unset".

I dunno.

>>  	       oideq(&a->oid, &b->oid);
>>  }
>>
>>

  reply	other threads:[~2019-08-13 20:33 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-13 16:03 [RFC PATCH] unpack-trees.c: handle empty deleted ita files Varun Naik
2019-08-13 18:08 ` René Scharfe
2019-08-13 20:32   ` Junio C Hamano [this message]
2019-08-14  4:30     ` René Scharfe
2019-08-15 16:17     ` Varun Naik
2019-08-15 16:34       ` Junio C Hamano
2019-08-19 15:35         ` Varun Naik
2019-08-19 19:54           ` Junio C Hamano
2019-08-15 16:21 ` [PATCH v2] unpack-trees.c: distinguish ita files from empty files Varun Naik
2019-08-15 16:46   ` René Scharfe
2019-08-21  3:21 ` [PATCH v3] " Varun Naik
2020-02-14 17:14   ` [PATCH v4] " Varun Naik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqr25o7qmf.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=l.s.r@web.de \
    --cc=pclouds@gmail.com \
    --cc=vcnaik94@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).