All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] read-cache.c: Ensure unmerged entries are removed
@ 2014-08-12 15:31 Jaime Soriano Pastor
  2014-08-12 15:31 ` Jaime Soriano Pastor
  2014-08-12 18:31 ` [PATCH] read-cache.c: Ensure unmerged entries are removed Junio C Hamano
  0 siblings, 2 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-12 15:31 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

A file in the index can be left as merged and unmerged at the same time
by some tools as libgit2, this causes some undesiderable behaviours in git.

I have seen, at least, these behaviours:
- git reset --hard consuming 100% CPU and never ending
- git reset --hard consuming all memory in git < 2.0
- git add/git mergetool not resolving a conflict, even if they finish
  succesfully

The state is something like this:

$ git ls-files -s
100644 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 0       conflict
100644 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 1       conflict
100644 5716ca5987cbf97d6bb54920bea6adde242d87e6 2       conflict
100644 f2e41136eac73c39554dede1fd7e67b12502d577 3       conflict

This can be caused e.g. by libgit2 doing this:
1. Merge with conflicts, without solving them
2. Force checkout

I see that this is not caused by git (I haven't been able to reproduce it
only using git) but I think that git should be able to detect this situation
and even handle it, specially to avoid the never-ending git resets.   

The proposed patch serves as protection and autoremediation for this
kind of cases.
Another option would be to detect the issue and tell the user to clean the
index with git read-tree --empty, but I think this would be more intrussive
than the patch.

Regards,
Jaime Soriano.

Jaime Soriano Pastor (1):
  read-cache.c: Ensure unmerged entries are removed

 read-cache.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

-- 
2.0.4.1.g8a38f21.dirty

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-12 15:31 [PATCH] read-cache.c: Ensure unmerged entries are removed Jaime Soriano Pastor
@ 2014-08-12 15:31 ` Jaime Soriano Pastor
  2014-08-12 18:39   ` Junio C Hamano
  2014-08-12 18:31 ` [PATCH] read-cache.c: Ensure unmerged entries are removed Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-12 15:31 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Wrong implementations of tools that modify the index can left
some files as merged and unmerged at the same time. Avoid undesiderable
behaviours by handling this situation.

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..23e46e1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -935,6 +935,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
 	int new_only = option & ADD_CACHE_NEW_ONLY;
+	int replaced = 0;
 
 	cache_tree_invalidate_path(istate->cache_tree, ce->name);
 	pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce));
@@ -943,9 +944,10 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	if (pos >= 0) {
 		if (!new_only)
 			replace_index_entry(istate, pos, ce);
-		return 0;
-	}
-	pos = -pos-1;
+		pos++;
+		replaced = 1;
+	} else
+		pos = -pos-1;
 
 	/*
 	 * Inserting a merged entry ("stage 0") into the index
@@ -959,6 +961,8 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 		}
 	}
 
+	if (replaced)
+		return 0;
 	if (!ok_to_add)
 		return -1;
 	if (!verify_path(ce->name))
-- 
2.0.4.1.g8a38f21.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-12 15:31 [PATCH] read-cache.c: Ensure unmerged entries are removed Jaime Soriano Pastor
  2014-08-12 15:31 ` Jaime Soriano Pastor
@ 2014-08-12 18:31 ` Junio C Hamano
  2014-08-13 22:10   ` Jaime Soriano Pastor
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-12 18:31 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> A file in the index can be left as merged and unmerged at the same time
> by some tools as libgit2, this causes some undesiderable behaviours in git.

Well, doesn't it mean that libgit2 is broken?  Have you filed a bug
over there yet?

Having said that, protecting ourselves from insanity left by other
people is always a good idea, provided that it can be done without
bending overly backwards.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-12 15:31 ` Jaime Soriano Pastor
@ 2014-08-12 18:39   ` Junio C Hamano
  2014-08-13 22:10     ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-12 18:39 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> Wrong implementations of tools that modify the index can left
> some files as merged and unmerged at the same time. Avoid undesiderable
> behaviours by handling this situation.

It is understandable that the way _you_ decided to "handle the
situation" is so obvious to be spelled out to _you_, but that is the
most important design decision that needs to be described here.  Do
you silently remove higher-stage entries when an entry at stage 0
exists?  Do you silently remove stage 0 entry when higher-stage
entries exist?  Do you error out without doing neither?

Silently removing these at runtime may not be something we would
want to do; after all, we do not know if the broken tool actually
wanted to have the higher stage entries, or the merged entry.

Ideally, I think we should error out and let the users figure out
how to proceed (we may of course need to make sure they have the
necessary tools to do so, e.g. "git cat-file blob 0:$path" to
resurrect the contents and "git update-index --cacheinfo" to stuff
the contents into the stages).

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-12 18:31 ` [PATCH] read-cache.c: Ensure unmerged entries are removed Junio C Hamano
@ 2014-08-13 22:10   ` Jaime Soriano Pastor
  0 siblings, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-13 22:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Aug 12, 2014 at 8:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
> > A file in the index can be left as merged and unmerged at the same time
> > by some tools as libgit2, this causes some undesiderable behaviours in git.
>
> Well, doesn't it mean that libgit2 is broken?  Have you filed a bug
> over there yet?
>

Yes, exactly, I think libgit2 is broken but I wanted to double-check
that it was still happening in their master branch, and it is. I have
reported the bug after checking it.
https://github.com/libgit2/libgit2/issues/2515

>
> Having said that, protecting ourselves from insanity left by other
> people is always a good idea, provided that it can be done without
> bending overly backwards.


Yes, I think the most important thing in this case is to protect git
against this kind of inconsistencies.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-12 18:39   ` Junio C Hamano
@ 2014-08-13 22:10     ` Jaime Soriano Pastor
  2014-08-13 23:04       ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-13 22:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Aug 12, 2014 at 8:39 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
> > Wrong implementations of tools that modify the index can left
> > some files as merged and unmerged at the same time. Avoid undesiderable
> > behaviours by handling this situation.
>
> It is understandable that the way _you_ decided to "handle the
> situation" is so obvious to be spelled out to _you_, but that is the
> most important design decision that needs to be described here.  Do
> you silently remove higher-stage entries when an entry at stage 0
> exists?  Do you silently remove stage 0 entry when higher-stage
> entries exist?  Do you error out without doing neither?
>

Sorry, I didn't explain my decission enough, and my knowledge of git
internals is not so good.
The idea of my proposal is to remove higher stage entries when, after
replacing an existing entry at stage 0, there are still entries in
higher stages.

In the problematic cases I've seen (specially git add and git reset
--hard) the final state of both, merged and unmerged files, is that
only an entry in stage 0 exists.
Also, the current implementation of git checkout -f silently removes
higher stage entries in this case.

>
> Silently removing these at runtime may not be something we would
> want to do; after all, we do not know if the broken tool actually
> wanted to have the higher stage entries, or the merged entry.
>

Yes, I have to agree on that, the user should have the final decission
about what stage entry to use, although I'm not sure if in the
previously commented cases there could be such an additional loss as
the operations that can be modified are already intended to silently
remove stage entries.

> Ideally, I think we should error out and let the users figure out
> how to proceed (we may of course need to make sure they have the
> necessary tools to do so, e.g. "git cat-file blob 0:$path" to
> resurrect the contents and "git update-index --cacheinfo" to stuff
> the contents into the stages).
>

I have also tried a couple of implementations of this patch with die()
and warning().
The implementation with die() would have a message like "There are
other staged versions for merged file", and maybe some recomendation
about how to see the blobs.
The warning implementation could return -1, what would prevent git add
to remove the higher-stage entries, but would still make git reset
--hard to clean the index as it seems that it does it anyway if it
manages to finish the call to read_index_unmerged.
Another option would be to print the deleted entries as a warning but
deleting them anyway.

Which option would be better? And what could be a good message?

BTW, I didn't know "git cat-file blob 0:$path", but I only manage to
get "Not a valid object name" fatals. How is it supposed to be used?

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-13 22:10     ` Jaime Soriano Pastor
@ 2014-08-13 23:04       ` Junio C Hamano
  2014-08-15 19:50         ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-13 23:04 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> In the problematic cases I've seen (specially git add and git reset
> --hard) the final state of both, merged and unmerged files, is that
> only an entry in stage 0 exists.
> Also, the current implementation of git checkout -f silently removes
> higher stage entries in this case.
>>
>> Silently removing these at runtime may not be something we would
>> want to do; after all, we do not know if the broken tool actually
>> wanted to have the higher stage entries, or the merged entry.
>>
>
> Yes, I have to agree on that, the user should have the final decission
> about what stage entry to use, although I'm not sure if in the
> previously commented cases there could be such an additional loss as
> the operations that can be modified are already intended to silently
> remove stage entries.
> ...
> Which option would be better? And what could be a good message?

Being a conservative, I'd rather avoid doing any magic during
read_cache() time.  "ls-files -s" for example should show the four
stages so that the "broken" state can be inspected.

Instead, I suspect that the code paths with problematic iterations
over the index entries that assume that having stage #0 entry for a
path guarantees that there will not be any higher stage entry first
need to be identified (you already said "add" and "reset" may be
problematic, there may be others, or they may be the only ones, I
dunno), and then the most sensible one, which would be different
from case to case, among various possibilities need to be chosen as
a fix to each of them:

 (1) the loop may be fixed to ignore/skip unmerged entries;
 (2) the loop may be fixed to ignore/skip the merged entry;
 (3) the loop may be fixed not to spin indefinitely on a path with
     mixed entries; or
 (4) the command should error out.

Yes, it would be more work, but I'd feel safer if the following
worked:

	$ git ls-files -s
        100644 3cc58df83752123644fef39faab2393af643b1d2 0       conflict
        100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1       conflict
        100644 3cc58df83752123644fef39faab2393af643b1d2 2       conflict
        100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3       conflict
	$ >empty
        $ git add empty
        100644 3cc58df83752123644fef39faab2393af643b1d2 0       conflict
        100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1       conflict
        100644 3cc58df83752123644fef39faab2393af643b1d2 2       conflict
        100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3       conflict
	100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       empty
	$ git cat-file blob :empty >output
        $ cmp empty output && echo OK
        OK

which would be impossible to do if we nuked the "problematic" stages
whenever we read the index, I am afraid.

> BTW, I didn't know "git cat-file blob 0:$path", but I only manage to
> get "Not a valid object name" fatals. How is it supposed to be used?

That was a typo of ":$n:$path" (where 0 <= $n <= 3).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-13 23:04       ` Junio C Hamano
@ 2014-08-15 19:50         ` Jaime Soriano Pastor
  2014-08-15 21:45           ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-15 19:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Aug 14, 2014 at 1:04 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Being a conservative, I'd rather avoid doing any magic during
> read_cache() time. "ls-files -s" for example should show the four
> stages so that the "broken" state can be inspected.
>
Well, only read_cache_unmerged() is modified in the sent patch, so no
magic is done in read_cache(), I'd also avoid changes there. Indeed
with the patch, "ls-files -s" can be used to inspect the problem
without further problems.

> Instead, I suspect that the code paths with problematic iterations
> over the index entries that assume that having stage #0 entry for a
> path guarantees that there will not be any higher stage entry first
> need to be identified (you already said "add" and "reset" may be
> problematic, there may be others, or they may be the only ones, I
> dunno), and then the most sensible one, which would be different
> from case to case, among various possibilities need to be chosen as
> a fix to each of them:
>
> (1) the loop may be fixed to ignore/skip unmerged entries;
> (2) the loop may be fixed to ignore/skip the merged entry;
> (3) the loop may be fixed not to spin indefinitely on a path with
> mixed entries; or
> (4) the command should error out.
>
git reset will clean the index anyway if the loop finishes, would it
be ok? I think that it'd be acceptable for git reset --hard to clean
the index as git checkout -f already does it even in this case.

git merge is also affected by the loop in read_cache_unmerged(), but
any of the solutions would be enough for it as only by finishing the
loop with unmerged entries it will die without commiting the cache to
the index file.

For git add probably the best option is to error out and ask the user
to check "git ls-files -s" to investigate the problem and decide what
to do.

The error message given by "git commit -a" is a bit confusing in this
case, I can take a look to this too.

I'll try to prepare a patch with these cases, and rethinking the loop
to avoid future problems there, I think that is a bit dangerous to
look for the position of a path entry (with index_name_pos) for the
next iteration.

> Yes, it would be more work, but I'd feel safer if the following
> worked:
>
> $ git ls-files -s
> 100644 3cc58df83752123644fef39faab2393af643b1d2 0 conflict
> 100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1 conflict
> 100644 3cc58df83752123644fef39faab2393af643b1d2 2 conflict
> 100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3 conflict
> $ >empty
> $ git add empty
> 100644 3cc58df83752123644fef39faab2393af643b1d2 0 conflict
> 100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1 conflict
> 100644 3cc58df83752123644fef39faab2393af643b1d2 2 conflict
> 100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3 conflict
> 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 empty
> $ git cat-file blob :empty >output
> $ cmp empty output && echo OK
> OK
>
> which would be impossible to do if we nuked the "problematic" stages
> whenever we read the index, I am afraid.
>
This works with the first patch as read_cache() is not modified, and
git add would only clean the entries for the paths passed as
arguments.

>> BTW, I didn't know "git cat-file blob 0:$path", but I only manage to
>> get "Not a valid object name" fatals. How is it supposed to be used?
>
> That was a typo of ":$n:$path" (where 0 <= $n <= 3).
Great, thanks!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-15 19:50         ` Jaime Soriano Pastor
@ 2014-08-15 21:45           ` Junio C Hamano
  2014-08-16 14:51             ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-15 21:45 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> On Thu, Aug 14, 2014 at 1:04 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Being a conservative, I'd rather avoid doing any magic during
>> read_cache() time. "ls-files -s" for example should show the four
>> stages so that the "broken" state can be inspected.
>>
> Well, only read_cache_unmerged() is modified in the sent patch, so no
> magic is done in read_cache(), I'd also avoid changes there.

Ahh, I must have overlooked that; changes being only in _unmerged()
variant makes me feel much better, and it probably would make much
of ...

>> Yes, it would be more work,...

... moot, hopefully ;-)

> git reset will clean the index anyway if the loop finishes, would it
> be ok?

Surely.

> git merge is also affected by the loop in read_cache_unmerged(), but
> any of the solutions would be enough for it as only by finishing the
> loop with unmerged entries it will die without commiting the cache to
> the index file.

Again, true.  The mergy operations want to start from a clean slate
and they call _unmerged() variant primarily to learn that there were
unmerged entries in the index, only to abort the operation in that
case, so a change to _unmerged() variant should be safe for them.

I'll take another look at your patch later, but not before the 2.1
final, and by that time you may already have sent a reroll ;-)

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-15 21:45           ` Junio C Hamano
@ 2014-08-16 14:51             ` Jaime Soriano Pastor
  2014-08-18 16:34               ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-16 14:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

I'd like to add some tests too for this, but I don't know how to
reproduce this state with git commands only, is there any way to add
entries to the index without checkings?
Or maybe it could be done by creating a "test-" command that adds the
entries to an index?

Thanks.

On Fri, Aug 15, 2014 at 11:45 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
>> On Thu, Aug 14, 2014 at 1:04 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Being a conservative, I'd rather avoid doing any magic during
>>> read_cache() time. "ls-files -s" for example should show the four
>>> stages so that the "broken" state can be inspected.
>>>
>> Well, only read_cache_unmerged() is modified in the sent patch, so no
>> magic is done in read_cache(), I'd also avoid changes there.
>
> Ahh, I must have overlooked that; changes being only in _unmerged()
> variant makes me feel much better, and it probably would make much
> of ...
>
>>> Yes, it would be more work,...
>
> ... moot, hopefully ;-)
>
>> git reset will clean the index anyway if the loop finishes, would it
>> be ok?
>
> Surely.
>
>> git merge is also affected by the loop in read_cache_unmerged(), but
>> any of the solutions would be enough for it as only by finishing the
>> loop with unmerged entries it will die without commiting the cache to
>> the index file.
>
> Again, true.  The mergy operations want to start from a clean slate
> and they call _unmerged() variant primarily to learn that there were
> unmerged entries in the index, only to abort the operation in that
> case, so a change to _unmerged() variant should be safe for them.
>
> I'll take another look at your patch later, but not before the 2.1
> final, and by that time you may already have sent a reroll ;-)
>
> Thanks.



-- 
Jaime Soriano Pastor - Software Developer

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-16 14:51             ` Jaime Soriano Pastor
@ 2014-08-18 16:34               ` Junio C Hamano
  2014-08-18 17:23                 ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-18 16:34 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> I'd like to add some tests too for this, but I don't know how to
> reproduce this state with git commands only, is there any way to add
> entries to the index without checkings?

Perhaps feeding "update-index --index-info" four input lines would work?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
  2014-08-18 16:34               ` Junio C Hamano
@ 2014-08-18 17:23                 ` Jaime Soriano Pastor
  2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-18 17:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Yes, --index-info worked for this purpouse, thanks!

https://github.com/jsoriano/git/blob/remove-unmerged-index-entry/t/t9904-unmerged-file-with-merged-entry.sh#L25

I'll try to send the patches to the mailing lists later today or tomorrow.

On Mon, Aug 18, 2014 at 6:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
>> I'd like to add some tests too for this, but I don't know how to
>> reproduce this state with git commands only, is there any way to add
>> entries to the index without checkings?
>
> Perhaps feeding "update-index --index-info" four input lines would work?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 0/4] Handling unmerged files with merged entries
  2014-08-18 17:23                 ` Jaime Soriano Pastor
@ 2014-08-20 11:25                   ` Jaime Soriano Pastor
  2014-08-20 11:26                     ` [PATCH 1/4] read_index_unmerged doesn't loop forever if merged stage exists for unmerged file Jaime Soriano Pastor
                                       ` (4 more replies)
  0 siblings, 5 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-20 11:25 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

New approach for the case of finding unmerged files with merged entries
in the index.
After some discussion the solution tries to:
- Avoid the problems with infinite loops in this case.
- Provide better information to the user in the commands affected.
- Make sure there are ways to clean the index.
- Provide also a way to specifically recover each one of the files with
  this problem.

With these patches the behaviour of these commands (for this case) change:
- git reset is able to finish, cleaning the index, but warning out the
  information about the removed stages.
- git merge is able to finish, reporting that there is a merge in progress as
  usual, it also warns about the unmerged files with merged entries.
- git add fails when this case happens, telling the user to check the state
  with 'git ls-files -s', before, it did nothing. The same with git commit -a.
- git update-index --cacheinfo can be used to select an specific staged
  version to resolve the conflict, without the need of reseting the working
  copy. It did nothing before.

Tests added for these cases. Rest of the tests remain unchanged and pass too.

Jaime Soriano Pastor (4):
  read_index_unmerged doesn't loop forever if merged stage exists for
    unmerged file
  Error out when adding a file with merged and unmerged entries
  Added tests for the case of merged and unmerged entries for the same
    file
  git update-index --cacheinfo can be used to select a stage when there
    are merged and unmerged entries

 builtin/update-index.c                     |  1 +
 cache.h                                    |  1 +
 read-cache.c                               | 36 ++++++++++--
 t/t9904-unmerged-file-with-merged-entry.sh | 94 ++++++++++++++++++++++++++++++
 4 files changed, 127 insertions(+), 5 deletions(-)
 create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh

-- 
2.0.4.4.gaf54b2b

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/4] read_index_unmerged doesn't loop forever if merged stage exists for unmerged file
  2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
@ 2014-08-20 11:26                     ` Jaime Soriano Pastor
  2014-08-20 11:26                     ` [PATCH 2/4] Error out when adding a file with merged and unmerged entries Jaime Soriano Pastor
                                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-20 11:26 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..c932b83 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1933,6 +1933,7 @@ int read_index_unmerged(struct index_state *istate)
 {
 	int i;
 	int unmerged = 0;
+	struct cache_entry *merged_ce = NULL;
 
 	read_index(istate);
 	for (i = 0; i < istate->cache_nr; i++) {
@@ -1940,9 +1941,26 @@ int read_index_unmerged(struct index_state *istate)
 		struct cache_entry *new_ce;
 		int size, len;
 
-		if (!ce_stage(ce))
+		if (!ce_stage(ce)) {
+			merged_ce = ce;
 			continue;
+		}
 		unmerged = 1;
+		if (merged_ce && ce_same_name(merged_ce, ce)) {
+			warning("Unexpected stages for merged file '%s':",
+				merged_ce->name);
+			i--;
+			while (i < istate->cache_nr &&
+				   ce_same_name(merged_ce, istate->cache[i])) {
+				ce = istate->cache[i++];
+				warning("%06o %s %d",
+				        ce->ce_mode, sha1_to_hex(ce->sha1), ce_stage(ce));
+			}
+			i--;
+			merged_ce->ce_flags = create_ce_flags(0) | CE_CONFLICTED;
+			merged_ce = NULL;
+			continue;
+		}
 		len = ce_namelen(ce);
 		size = cache_entry_size(len);
 		new_ce = xcalloc(1, size);
@@ -1953,7 +1971,6 @@ int read_index_unmerged(struct index_state *istate)
 		if (add_index_entry(istate, new_ce, 0))
 			return error("%s: cannot drop to stage #0",
 				     new_ce->name);
-		i = index_name_pos(istate, new_ce->name, len);
 	}
 	return unmerged;
 }
-- 
2.0.4.4.gaf54b2b

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/4] Error out when adding a file with merged and unmerged entries
  2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
  2014-08-20 11:26                     ` [PATCH 1/4] read_index_unmerged doesn't loop forever if merged stage exists for unmerged file Jaime Soriano Pastor
@ 2014-08-20 11:26                     ` Jaime Soriano Pastor
  2014-08-20 11:26                     ` [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file Jaime Soriano Pastor
                                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-20 11:26 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index c932b83..d549d0b 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -935,6 +935,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
 	int new_only = option & ADD_CACHE_NEW_ONLY;
+	int replaced = 0;
 
 	cache_tree_invalidate_path(istate->cache_tree, ce->name);
 	pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce));
@@ -943,9 +944,10 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	if (pos >= 0) {
 		if (!new_only)
 			replace_index_entry(istate, pos, ce);
-		return 0;
-	}
-	pos = -pos-1;
+		replaced = 1;
+		pos++;
+	} else
+		pos = -pos-1;
 
 	/*
 	 * Inserting a merged entry ("stage 0") into the index
@@ -953,12 +955,18 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	 */
 	if (pos < istate->cache_nr && ce_stage(ce) == 0) {
 		while (ce_same_name(istate->cache[pos], ce)) {
+			if (replaced)
+				die("Merged and unmerged entries found for "
+				    "'%s', check 'git ls-files -s \"%s\"'",
+				    ce->name, ce->name);
 			ok_to_add = 1;
 			if (!remove_index_entry_at(istate, pos))
 				break;
 		}
 	}
 
+	if (replaced)
+		return 0;
 	if (!ok_to_add)
 		return -1;
 	if (!verify_path(ce->name))
-- 
2.0.4.4.gaf54b2b

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file
  2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
  2014-08-20 11:26                     ` [PATCH 1/4] read_index_unmerged doesn't loop forever if merged stage exists for unmerged file Jaime Soriano Pastor
  2014-08-20 11:26                     ` [PATCH 2/4] Error out when adding a file with merged and unmerged entries Jaime Soriano Pastor
@ 2014-08-20 11:26                     ` Jaime Soriano Pastor
  2014-08-20 21:00                       ` Junio C Hamano
  2014-08-20 11:26                     ` [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries Jaime Soriano Pastor
  2014-08-20 22:19                     ` [PATCH 0/4] Handling unmerged files with merged entries Junio C Hamano
  4 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-20 11:26 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 t/t9904-unmerged-file-with-merged-entry.sh | 86 ++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)
 create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh

diff --git a/t/t9904-unmerged-file-with-merged-entry.sh b/t/t9904-unmerged-file-with-merged-entry.sh
new file mode 100755
index 0000000..945bc1c
--- /dev/null
+++ b/t/t9904-unmerged-file-with-merged-entry.sh
@@ -0,0 +1,86 @@
+#!/bin/sh
+
+test_description='Operations with unmerged files with merged entries'
+
+. ./test-lib.sh
+
+setup_repository() {
+	test_commit A conflict A
+	test_commit A conflict2 A2 branchbase
+	test_commit B conflict B
+	test_commit B conflict2 B2
+	git checkout branchbase -b branch1
+	test_commit C conflict C
+	test_commit C conflict2 C2
+	test_commit something otherfile otherfile
+}
+
+setup_stage_state() {
+	git checkout -f HEAD
+	{
+		git ls-files -s conflict conflict2
+		git merge master > /dev/null
+		git ls-files -s conflict conflict2
+	} > index
+	cat index | git update-index --index-info
+	rm index
+}
+
+test_expect_success 'setup - two branches with conflicting files' '
+	setup_repository &&
+	setup_stage_state &&
+	git ls-files -s conflict > output &&
+	test_line_count = 4 output &&
+	git ls-files -s conflict2 > output &&
+	test_line_count = 4 output &&
+	rm output
+'
+
+test_expect_success 'git commit -a' '
+	setup_stage_state &&
+	test_must_fail git commit -a
+'
+
+test_expect_success 'git add conflict' '
+	setup_stage_state &&
+	test_must_fail git add conflict
+'
+
+test_expect_success 'git rm conflict' '
+	setup_stage_state &&
+	test_must_fail git rm conflict
+'
+
+test_expect_success 'git add otherfile' '
+	setup_stage_state &&
+	>otherfile &&
+	git add otherfile
+'
+
+test_expect_success 'git rm otherfile' '
+	setup_stage_state &&
+	git rm otherfile
+'
+
+test_expect_success 'git add newfile' '
+	setup_stage_state &&
+	>newfile &&
+	git add newfile
+'
+
+test_expect_success 'git merge branch' '
+	setup_stage_state &&
+	test_must_fail git merge master
+'
+
+test_expect_success 'git reset --hard' '
+	setup_stage_state &&
+	git reset --hard &&
+	git show HEAD:conflict > expected &&
+	cat conflict > current &&
+	git show HEAD:conflict2 > expected &&
+	cat conflict2 > current &&
+	test_cmp expected current
+'
+
+test_done
-- 
2.0.4.4.gaf54b2b

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries
  2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
                                       ` (2 preceding siblings ...)
  2014-08-20 11:26                     ` [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file Jaime Soriano Pastor
@ 2014-08-20 11:26                     ` Jaime Soriano Pastor
  2014-08-20 21:08                       ` Junio C Hamano
  2014-08-20 22:19                     ` [PATCH 0/4] Handling unmerged files with merged entries Junio C Hamano
  4 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-20 11:26 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 builtin/update-index.c                     |  1 +
 cache.h                                    |  1 +
 read-cache.c                               |  3 ++-
 t/t9904-unmerged-file-with-merged-entry.sh | 14 +++++++++++---
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index ebea285..509fae7 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -239,6 +239,7 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,
 		ce->ce_flags |= CE_VALID;
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
+	option |= ADD_CACHE_OK_TO_REPLACE_MERGED;
 	if (add_cache_entry(ce, option))
 		return error("%s: cannot add to the index - missing --add option?",
 			     path);
diff --git a/cache.h b/cache.h
index c708062..ecac41c 100644
--- a/cache.h
+++ b/cache.h
@@ -474,6 +474,7 @@ extern int index_name_pos(const struct index_state *, const char *name, int name
 #define ADD_CACHE_SKIP_DFCHECK 4	/* Ok to skip DF conflict checks */
 #define ADD_CACHE_JUST_APPEND 8		/* Append only; tree.c::read_tree() */
 #define ADD_CACHE_NEW_ONLY 16		/* Do not replace existing ones */
+#define ADD_CACHE_OK_TO_REPLACE_MERGED 32 /* Ok to replace even if a merged entry exists */
 extern int add_index_entry(struct index_state *, struct cache_entry *ce, int option);
 extern void rename_index_entry_at(struct index_state *, int pos, const char *new_name);
 extern int remove_index_entry_at(struct index_state *, int pos);
diff --git a/read-cache.c b/read-cache.c
index d549d0b..c4ddefe 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -933,6 +933,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	int pos;
 	int ok_to_add = option & ADD_CACHE_OK_TO_ADD;
 	int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;
+	int ok_to_replace_merged = option & ADD_CACHE_OK_TO_REPLACE_MERGED;
 	int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;
 	int new_only = option & ADD_CACHE_NEW_ONLY;
 	int replaced = 0;
@@ -955,7 +956,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	 */
 	if (pos < istate->cache_nr && ce_stage(ce) == 0) {
 		while (ce_same_name(istate->cache[pos], ce)) {
-			if (replaced)
+			if (replaced && !ok_to_replace_merged)
 				die("Merged and unmerged entries found for "
 				    "'%s', check 'git ls-files -s \"%s\"'",
 				    ce->name, ce->name);
diff --git a/t/t9904-unmerged-file-with-merged-entry.sh b/t/t9904-unmerged-file-with-merged-entry.sh
index 945bc1c..9138821 100755
--- a/t/t9904-unmerged-file-with-merged-entry.sh
+++ b/t/t9904-unmerged-file-with-merged-entry.sh
@@ -28,11 +28,11 @@ setup_stage_state() {
 
 test_expect_success 'setup - two branches with conflicting files' '
 	setup_repository &&
-	setup_stage_state &&
+	git merge master
 	git ls-files -s conflict > output &&
-	test_line_count = 4 output &&
+	test_line_count = 3 output &&
 	git ls-files -s conflict2 > output &&
-	test_line_count = 4 output &&
+	test_line_count = 3 output &&
 	rm output
 '
 
@@ -83,4 +83,12 @@ test_expect_success 'git reset --hard' '
 	test_cmp expected current
 '
 
+test_expect_success 'git update-index --cacheinfo to select a stage to use' '
+	setup_stage_state &&
+	git cat-file blob :1:conflict > conflict &&
+	git update-index --cacheinfo 100644,`git hash-object conflict`,conflict
+	git ls-files -s conflict > output &&
+	test_line_count = 1 output
+'
+
 test_done
-- 
2.0.4.4.gaf54b2b

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file
  2014-08-20 11:26                     ` [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file Jaime Soriano Pastor
@ 2014-08-20 21:00                       ` Junio C Hamano
  2014-08-21 13:51                         ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-20 21:00 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
> ---
>  t/t9904-unmerged-file-with-merged-entry.sh | 86 ++++++++++++++++++++++++++++++

Isn't this number already used for another test?  A test on the
index probably belongs to t2XXX or t3XXX family.

>  1 file changed, 86 insertions(+)
>  create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh
>
> diff --git a/t/t9904-unmerged-file-with-merged-entry.sh b/t/t9904-unmerged-file-with-merged-entry.sh
> new file mode 100755
> index 0000000..945bc1c
> --- /dev/null
> +++ b/t/t9904-unmerged-file-with-merged-entry.sh
> @@ -0,0 +1,86 @@
> +#!/bin/sh
> +
> +test_description='Operations with unmerged files with merged entries'
> +
> +. ./test-lib.sh
> +
> +setup_repository() {
> +	test_commit A conflict A
> +	test_commit A conflict2 A2 branchbase
> +	test_commit B conflict B
> +	test_commit B conflict2 B2
> +	git checkout branchbase -b branch1
> +	test_commit C conflict C
> +	test_commit C conflict2 C2
> +	test_commit something otherfile otherfile
> +}

No error is checked here?

> +setup_stage_state() {
> +	git checkout -f HEAD
> +	{
> +		git ls-files -s conflict conflict2
> +		git merge master > /dev/null
> +		git ls-files -s conflict conflict2
> +	} > index

No error is checked here?

Style: no SP between redirection operator and its target, i.e.

	git merge master >/dev/null
	{ ... } >index

> +	cat index | git update-index --index-info

Do not cat a single file into a pipeline, i.e.

	git update-index --index-info <index


Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries
  2014-08-20 11:26                     ` [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries Jaime Soriano Pastor
@ 2014-08-20 21:08                       ` Junio C Hamano
  2014-08-21 13:57                         ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-20 21:08 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> Subject: Re: [PATCH 4/4] git update-index --cacheinfo can be used to select
>  a stage when there are merged and unmerged entries

Hmph, what does it even mean?  Shared with your [1/4] is that it is
unclear if you are stating an existing problem to be fixed or
describing the desired end result.

Also "update-index --cacheinfo" is not about "selecting" but is
about stuffing an entry to the index, so "can be used to select"
is doubly puzzling...

>   ...
> +test_expect_success 'git update-index --cacheinfo to select a stage to use' '
> +	setup_stage_state &&
> +	git cat-file blob :1:conflict > conflict &&

Style: no SP between redirection and its target.

> +	git update-index --cacheinfo 100644,`git hash-object conflict`,conflict

Style: we prefer $() over ``

> +	git ls-files -s conflict > output &&
> +	test_line_count = 1 output

Is "we have only one line" the only thing we care about?  Don't we
want to check which stage the entry is at?

> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 0/4] Handling unmerged files with merged entries
  2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
                                       ` (3 preceding siblings ...)
  2014-08-20 11:26                     ` [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries Jaime Soriano Pastor
@ 2014-08-20 22:19                     ` Junio C Hamano
  2014-08-21 13:42                       ` Jaime Soriano Pastor
  2014-08-21 18:40                       ` [PATCH 0/4] Handling unmerged files with merged entries Johannes Sixt
  4 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-20 22:19 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> New approach for the case of finding unmerged files with merged entries
> in the index.
> After some discussion the solution tries to:
> - Avoid the problems with infinite loops in this case.
> - Provide better information to the user in the commands affected.
> - Make sure there are ways to clean the index.
> - Provide also a way to specifically recover each one of the files with
>   this problem.
>
> With these patches the behaviour of these commands (for this case) change:
> - git reset is able to finish, cleaning the index, but warning out the
>   information about the removed stages.
> - git merge is able to finish, reporting that there is a merge in progress as
>   usual, it also warns about the unmerged files with merged entries.
> - git add fails when this case happens, telling the user to check the state
>   with 'git ls-files -s', before, it did nothing. The same with git commit -a.
> - git update-index --cacheinfo can be used to select an specific staged
>   version to resolve the conflict, without the need of reseting the working
>   copy. It did nothing before.
>
> Tests added for these cases. Rest of the tests remain unchanged and pass too.

Thanks.

After looking at what you did in 1/4, I started to wonder if we can
solve this in add_index_entry_with_check() in a less intrusive way.
When we call the function with a stage #0 entry, we are telling the
index that any entry in higher stage for the same path must
disappear.  Since the current implementation of the function assumes
that the index is not corrupt in this particular way to have both
merged and unmerged entries for the same path, it fails to remove
the higher stage entries.  If we fix the function, wouldn't it make
your 1/4 unnecessary?  Read-only operations such as "ls-files -s"
would not call add_index_entry() so diagnostic tools would not be
affected even with such a fix.

... which may look something like the one attached at the end.

But then it made me wonder even more.

There are other ways a piece of software can leave a corrupt index
for us to read from.  Your fix, or the simpler one I suggested for
that matter, would still assume that the index entries are in the
sorted order, and a corrupt index that does not sort its entries
correctly will cause us to behave in an undefined way.  At some
point we should draw a line and say "Your index is hopelessly
corrupt.", send it back to whatever broken software that originally
wrote such a mess and have the user use that software to fix the
corrupt index up before talking to us.

For that, we need to catch an index whose entries are not sorted and
error out, perhaps when read_index_from() iterates over the mmapped
index entries.  We can even draw that "hopelessly corrupt" line
above the breakage you are addressing and add a check to make sure
no path has both merged and unmerged entries to the same check to
make it error out.

I suspect that such a "detect and error out" may be sufficient and
also may be more robust than the approach that assumes that a
breakage is only to have both merged and unmerged entries for the
same path, the entries are still correctly sorted.


 read-cache.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..56006a3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -943,9 +943,16 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
 	if (pos >= 0) {
 		if (!new_only)
 			replace_index_entry(istate, pos, ce);
-		return 0;
+		/*
+		 * ... but protect ourselves from a corrupt index
+		 * that has an unmerged entry for the same path.
+		 */
+		if (istate->cache_nr <= pos + 1 ||
+		    !ce_same_name(ce, istate->cache[pos + 1]))
+			return 0;
+	} else {
+		pos = -pos-1;
 	}
-	pos = -pos-1;
 
 	/*
 	 * Inserting a merged entry ("stage 0") into the index





[Footnote]

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 0/4] Handling unmerged files with merged entries
  2014-08-20 22:19                     ` [PATCH 0/4] Handling unmerged files with merged entries Junio C Hamano
@ 2014-08-21 13:42                       ` Jaime Soriano Pastor
  2014-08-21 13:43                         ` [PATCH] Check order when reading index Jaime Soriano Pastor
  2014-08-21 18:40                       ` [PATCH 0/4] Handling unmerged files with merged entries Johannes Sixt
  1 sibling, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-21 13:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Good points.

On Thu, Aug 21, 2014 at 12:19 AM, Junio C Hamano <gitster@pobox.com> wrote:
> After looking at what you did in 1/4, I started to wonder if we can
> solve this in add_index_entry_with_check() in a less intrusive way.
> When we call the function with a stage #0 entry, we are telling the
> index that any entry in higher stage for the same path must
> disappear.  Since the current implementation of the function assumes
> that the index is not corrupt in this particular way to have both
> merged and unmerged entries for the same path, it fails to remove
> the higher stage entries.  If we fix the function, wouldn't it make
> your 1/4 unnecessary?  Read-only operations such as "ls-files -s"
> would not call add_index_entry() so diagnostic tools would not be
> affected even with such a fix.
>
Another thing that is done in 1/4 is to get rid of the call to
index_name_pos, that can lead to infinite loops depending on what the
previous add_index_entry call does as we have seen, and I wonder why
is it really needed, specially if we guarantee the order in the index.

> ... which may look something like the one attached at the end.
>
And it would be more in the line of my first patch.

> But then it made me wonder even more.
>
> There are other ways a piece of software can leave a corrupt index
> for us to read from.  Your fix, or the simpler one I suggested for
> that matter, would still assume that the index entries are in the
> sorted order, and a corrupt index that does not sort its entries
> correctly will cause us to behave in an undefined way.  At some
> point we should draw a line and say "Your index is hopelessly
> corrupt.", send it back to whatever broken software that originally
> wrote such a mess and have the user use that software to fix the
> corrupt index up before talking to us.
>
True.

> For that, we need to catch an index whose entries are not sorted and
> error out, perhaps when read_index_from() iterates over the mmapped
> index entries.  We can even draw that "hopelessly corrupt" line
> above the breakage you are addressing and add a check to make sure
> no path has both merged and unmerged entries to the same check to
> make it error out.
>
> I suspect that such a "detect and error out" may be sufficient and
> also may be more robust than the approach that assumes that a
> breakage is only to have both merged and unmerged entries for the
> same path, the entries are still correctly sorted.
>
Agree. I have prepared an initial patch for this to discuss, but
adding checks in read_index_from() can add a small(?) penalization to
all git operations, specially with big indexes.
And it wouldn't probably allow the user to fix the repository using
git commands (unless we only warn instead of die depending on the
thing that is broken).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] Check order when reading index
  2014-08-21 13:42                       ` Jaime Soriano Pastor
@ 2014-08-21 13:43                         ` Jaime Soriano Pastor
  2014-08-21 13:49                           ` Jaime Soriano Pastor
                                             ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-21 13:43 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..e117d3a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
+void check_next_ce(struct cache_entry *ce, struct cache_entry *next_ce) {
+	if (!ce || !next_ce)
+		return;
+	if (cache_name_compare(ce->name, ce_namelen(ce),
+						   next_ce->name, ce_namelen(next_ce)) > 1)
+		die("Unordered stage entries in index");
+	if (ce_same_name(ce, next_ce)) {
+		if (!ce_stage(ce))
+			die("Multiple stage entries for merged file '%s'",
+				ce->name);
+		if (ce_stage(ce) >= ce_stage(next_ce))
+			die("Unordered stage entries for '%s'", ce->name);
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
@@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
 		ce = create_from_disk(disk_ce, &consumed, previous_name);
 		set_index_entry(istate, i, ce);
 
+		if (i > 0)
+			check_next_ce(istate->cache[i-1], ce);
+
 		src_offset += consumed;
 	}
 	strbuf_release(&previous_name_buf);
-- 
2.0.4.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] Check order when reading index
  2014-08-21 13:43                         ` [PATCH] Check order when reading index Jaime Soriano Pastor
@ 2014-08-21 13:49                           ` Jaime Soriano Pastor
  2014-08-21 13:59                           ` Duy Nguyen
  2014-08-21 18:51                           ` Junio C Hamano
  2 siblings, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-21 13:49 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

On Thu, Aug 21, 2014 at 3:43 PM, Jaime Soriano Pastor
<jsorianopastor@gmail.com> wrote:
> +               if (!ce_stage(ce))
> +                       die("Multiple stage entries for merged file '%s'",
> +                               ce->name);

This case can be provoked by "git update-index --index-info" as shown
in the patch with the added test, maybe it should be only a warning.
And add too some variation of the patches in this thread to make the
same command able to fix the situation.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file
  2014-08-20 21:00                       ` Junio C Hamano
@ 2014-08-21 13:51                         ` Jaime Soriano Pastor
  2014-08-21 22:21                           ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-21 13:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Aug 20, 2014 at 11:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
>> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
>> ---
>>  t/t9904-unmerged-file-with-merged-entry.sh | 86 ++++++++++++++++++++++++++++++
>
> Isn't this number already used for another test?  A test on the
> index probably belongs to t2XXX or t3XXX family.
>
Umm, I though this test number was free, I just added it to the last+1
position, if I finally add a test I'll take this into account. Thanks.

>>  1 file changed, 86 insertions(+)
>>  create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh
>>
>> diff --git a/t/t9904-unmerged-file-with-merged-entry.sh b/t/t9904-unmerged-file-with-merged-entry.sh
>> new file mode 100755
>> index 0000000..945bc1c
>> --- /dev/null
>> +++ b/t/t9904-unmerged-file-with-merged-entry.sh
>> @@ -0,0 +1,86 @@
>> +#!/bin/sh
>> +
>> +test_description='Operations with unmerged files with merged entries'
>> +
>> +. ./test-lib.sh
>> +
>> +setup_repository() {
>> +     test_commit A conflict A
>> +     test_commit A conflict2 A2 branchbase
>> +     test_commit B conflict B
>> +     test_commit B conflict2 B2
>> +     git checkout branchbase -b branch1
>> +     test_commit C conflict C
>> +     test_commit C conflict2 C2
>> +     test_commit something otherfile otherfile
>> +}
>
> No error is checked here?
>
This is only a helper function for setup, not a test itself.

>> +setup_stage_state() {
>> +     git checkout -f HEAD
>> +     {
>> +             git ls-files -s conflict conflict2
>> +             git merge master > /dev/null
>> +             git ls-files -s conflict conflict2
>> +     } > index
>
> No error is checked here?
>
Same here.

> Style: no SP between redirection operator and its target, i.e.
>
>         git merge master >/dev/null
>         { ... } >index
>
>> +     cat index | git update-index --index-info
>
> Do not cat a single file into a pipeline, i.e.
>
>         git update-index --index-info <index
>
True :) Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries
  2014-08-20 21:08                       ` Junio C Hamano
@ 2014-08-21 13:57                         ` Jaime Soriano Pastor
  0 siblings, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-21 13:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Aug 20, 2014 at 11:08 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
>> Subject: Re: [PATCH 4/4] git update-index --cacheinfo can be used to select
>>  a stage when there are merged and unmerged entries
>
> Hmph, what does it even mean?  Shared with your [1/4] is that it is
> unclear if you are stating an existing problem to be fixed or
> describing the desired end result.
>
> Also "update-index --cacheinfo" is not about "selecting" but is
> about stuffing an entry to the index, so "can be used to select"
> is doubly puzzling...
>
Well, somehow I understand "update-index --cacheinfo" as a low level
version of add. I was trying to explain the desired end result, yes.

>>   ...
>> +test_expect_success 'git update-index --cacheinfo to select a stage to use' '
>> +     setup_stage_state &&
>> +     git cat-file blob :1:conflict > conflict &&
>
> Style: no SP between redirection and its target.
>
Ok.

>> +     git update-index --cacheinfo 100644,`git hash-object conflict`,conflict
>
> Style: we prefer $() over ``
>
Ok.

>> +     git ls-files -s conflict > output &&
>> +     test_line_count = 1 output
>
> Is "we have only one line" the only thing we care about?  Don't we
> want to check which stage the entry is at?
>
Yes, it'd be better.

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Check order when reading index
  2014-08-21 13:43                         ` [PATCH] Check order when reading index Jaime Soriano Pastor
  2014-08-21 13:49                           ` Jaime Soriano Pastor
@ 2014-08-21 13:59                           ` Duy Nguyen
  2014-08-21 18:51                           ` Junio C Hamano
  2 siblings, 0 replies; 51+ messages in thread
From: Duy Nguyen @ 2014-08-21 13:59 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: Git Mailing List

On Thu, Aug 21, 2014 at 8:43 PM, Jaime Soriano Pastor
<jsorianopastor@gmail.com> wrote:
> @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
>                 ce = create_from_disk(disk_ce, &consumed, previous_name);
>                 set_index_entry(istate, i, ce);
>
> +               if (i > 0)
> +                       check_next_ce(istate->cache[i-1], ce);
> +
>                 src_offset += consumed;
>         }
>         strbuf_release(&previous_name_buf);

It may be nice to save the "good index" stamp as an index extension so
we don't have to check this over and over. I'm thinking about big
indexes where compare cost might matter (I'm not so sure yet, will do
some testing when I have time).
-- 
Duy

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 0/4] Handling unmerged files with merged entries
  2014-08-20 22:19                     ` [PATCH 0/4] Handling unmerged files with merged entries Junio C Hamano
  2014-08-21 13:42                       ` Jaime Soriano Pastor
@ 2014-08-21 18:40                       ` Johannes Sixt
  2014-08-21 22:18                         ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Johannes Sixt @ 2014-08-21 18:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jaime Soriano Pastor, git

Am 21.08.2014 00:19, schrieb Junio C Hamano:
> For that, we need to catch an index whose entries are not sorted and
> error out, perhaps when read_index_from() iterates over the mmapped
> index entries.  We can even draw that "hopelessly corrupt" line
> above the breakage you are addressing and add a check to make sure
> no path has both merged and unmerged entries to the same check to
> make it error out.

Except that we can't declare an index with both merged and unmerged
entries as "hopelessly corrupt, return to sender" when it's dead easy to
generate with the git tool set:

 >x
 name=$(git hash-object -w x)
 for i in 0 1 2 3; do printf '100644 %s %d\tx\n' $name $i; done |
 git update-index --index-info

-- Hannes

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH] Check order when reading index
  2014-08-21 13:43                         ` [PATCH] Check order when reading index Jaime Soriano Pastor
  2014-08-21 13:49                           ` Jaime Soriano Pastor
  2014-08-21 13:59                           ` Duy Nguyen
@ 2014-08-21 18:51                           ` Junio C Hamano
  2014-08-24 17:57                             ` [PATCH 1/2] " Jaime Soriano Pastor
  2 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-21 18:51 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
> ---
>  read-cache.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7f5645e..e117d3a 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
>  	return ce;
>  }
>  
> +void check_next_ce(struct cache_entry *ce, struct cache_entry *next_ce) {

Have opening brace for the function on its own line, i.e.

	void check_next_ce(struct cache_entry *ce, struct cache_entry *next_ce)
	{

The function might be misnamed (see below), though.

> +	if (!ce || !next_ce)
> +		return;

Hmph, would it be either a programming error or a corrupt index
input to see a NULL in either of these variables?

> +	if (cache_name_compare(ce->name, ce_namelen(ce),
> +						   next_ce->name, ce_namelen(next_ce)) > 1)

An odd indentation that is overly deep to make it hard to read.

	if (cache_name_compare(ce->name, ce_namelen(ce),
			       next_ce->name, ce_namelen(next_ce)) > 1)

should be sufficient (replacing 7-SP before next_ce with a HT is OK
if the existing code nearby does so).

What is the significance of the return value from cache_name_compare()
that is strictly greater than 1 (as opposed to merely "is it positive?")?

Perhaps you want something that is modeled after ce_same_name() that
ignores the stage, i.e.

	int ce_name_compare(const struct cache_entry *a, const struct cache_entry *b)
	{
		return strcmp(a->ce_name, b->ce_name);
	}

without reimplementing the cache-name-compare() as

	int bad_ce_same_name(const struct cache_entry *a, const struct cache_entry *b)
	{
        	return !ce_same_name(a, b);
	}

to keep the "two names with different length could never be the
same" optimization.

	- if (0 <= ce_name_compare(ce, next)) then the names are not sorted

        - if (!stage(ce) && !name_compare(ce, next)) then the merged
          entry 'ce' is not the only entry for the path



> +		die("Unordered stage entries in index");
> +	if (ce_same_name(ce, next_ce)) {
> +		if (!ce_stage(ce))
> +			die("Multiple stage entries for merged file '%s'",
> +				ce->name);

> +		if (ce_stage(ce) >= ce_stage(next_ce))
> +			die("Unordered stage entries for '%s'", ce->name);
> +	}
> +}
> +
>  /* remember to discard_cache() before reading a different cache! */
>  int read_index_from(struct index_state *istate, const char *path)
>  {
> @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
>  		ce = create_from_disk(disk_ce, &consumed, previous_name);
>  		set_index_entry(istate, i, ce);
>  
> +		if (i > 0)
> +			check_next_ce(istate->cache[i-1], ce);

Have a SP each on both sides of binary operator "-".

Judging from the way this helper function is used, it looks like
check_with_previous_ce() is a more appropriate name.  After all, you
are not checking the next ce which you haven't even created yet ;-)


Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 0/4] Handling unmerged files with merged entries
  2014-08-21 18:40                       ` [PATCH 0/4] Handling unmerged files with merged entries Johannes Sixt
@ 2014-08-21 22:18                         ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-21 22:18 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Jaime Soriano Pastor, git

Johannes Sixt <j6t@kdbg.org> writes:

> Am 21.08.2014 00:19, schrieb Junio C Hamano:
>> For that, we need to catch an index whose entries are not sorted and
>> error out, perhaps when read_index_from() iterates over the mmapped
>> index entries.  We can even draw that "hopelessly corrupt" line
>> above the breakage you are addressing and add a check to make sure
>> no path has both merged and unmerged entries to the same check to
>> make it error out.
>
> Except that we can't declare an index with both merged and unmerged
> entries as "hopelessly corrupt, return to sender" when it's dead easy to
> generate with the git tool set:
>
>  >x
>  name=$(git hash-object -w x)
>  for i in 0 1 2 3; do printf '100644 %s %d\tx\n' $name $i; done |
>  git update-index --index-info

Because hash-object and update-index deliberately have these holes
to allow us (read: me ;-) to easily experiment new and/or unallowed
formats, I wouldn't take that as a serious objection.  It is dead
easy to corrupt your repository or lose your data by /bin/rm, too
;-)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file
  2014-08-21 13:51                         ` Jaime Soriano Pastor
@ 2014-08-21 22:21                           ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-21 22:21 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> On Wed, Aug 20, 2014 at 11:00 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>>
>>> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
>>> ---
>>>  t/t9904-unmerged-file-with-merged-entry.sh | 86 ++++++++++++++++++++++++++++++
>>
>> Isn't this number already used for another test?  A test on the
>> index probably belongs to t2XXX or t3XXX family.
>>
> Umm, I though this test number was free, I just added it to the last+1
> position, if I finally add a test I'll take this into account. Thanks.

Please check t/README for classes of features and appropriate first
digit; also do not forget that there are topics by other people in
flight and you may need to at least check with the tip of the 'pu'
branch.

Thanks.

>>>  1 file changed, 86 insertions(+)
>>>  create mode 100755 t/t9904-unmerged-file-with-merged-entry.sh
>>>
>>> diff --git a/t/t9904-unmerged-file-with-merged-entry.sh b/t/t9904-unmerged-file-with-merged-entry.sh
>>> new file mode 100755
>>> index 0000000..945bc1c
>>> --- /dev/null
>>> +++ b/t/t9904-unmerged-file-with-merged-entry.sh
>>> @@ -0,0 +1,86 @@
>>> +#!/bin/sh
>>> +
>>> +test_description='Operations with unmerged files with merged entries'
>>> +
>>> +. ./test-lib.sh
>>> +
>>> +setup_repository() {
>>> +...
>>> +}
>>
>> No error is checked here?
>>
> This is only a helper function for setup, not a test itself.

So what?  If the set-up fails, we would want

    $ sh tXXXX-my-test.sh -i

to immediately stop without going further.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/2] Check order when reading index
  2014-08-21 18:51                           ` Junio C Hamano
@ 2014-08-24 17:57                             ` Jaime Soriano Pastor
  2014-08-24 17:57                               ` [PATCH 2/2] Loop index increases monotonically when reading unmerged entries Jaime Soriano Pastor
  2014-08-25 17:21                               ` [PATCH 1/2] Check order when reading index Junio C Hamano
  0 siblings, 2 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-24 17:57 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..c1a9619 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
+void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
+{
+	int name_compare = strcmp(ce->name, next_ce->name);
+	if (0 < name_compare)
+		die("Unordered stage entries in index");
+	if (!name_compare) {
+		if (!ce_stage(ce))
+			die("Multiple stage entries for merged file '%s'",
+				ce->name);
+		if (ce_stage(ce) >= ce_stage(next_ce))
+			die("Unordered stage entries for '%s'",
+				ce->name);
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
@@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
 		ce = create_from_disk(disk_ce, &consumed, previous_name);
 		set_index_entry(istate, i, ce);
 
+		if (i > 0)
+			check_ce_order(istate->cache[i - 1], ce);
+
 		src_offset += consumed;
 	}
 	strbuf_release(&previous_name_buf);
-- 
2.0.4.1.g0b8a4f9.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/2] Loop index increases monotonically when reading unmerged entries
  2014-08-24 17:57                             ` [PATCH 1/2] " Jaime Soriano Pastor
@ 2014-08-24 17:57                               ` Jaime Soriano Pastor
  2014-08-24 18:04                                 ` Jaime Soriano Pastor
  2014-08-25 17:21                               ` [PATCH 1/2] Check order when reading index Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-24 17:57 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index c1a9619..3d70386 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1971,7 +1971,6 @@ int read_index_unmerged(struct index_state *istate)
 		if (add_index_entry(istate, new_ce, 0))
 			return error("%s: cannot drop to stage #0",
 				     new_ce->name);
-		i = index_name_pos(istate, new_ce->name, len);
 	}
 	return unmerged;
 }
-- 
2.0.4.1.g0b8a4f9.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Loop index increases monotonically when reading unmerged entries
  2014-08-24 17:57                               ` [PATCH 2/2] Loop index increases monotonically when reading unmerged entries Jaime Soriano Pastor
@ 2014-08-24 18:04                                 ` Jaime Soriano Pastor
  2014-08-25 17:09                                   ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-24 18:04 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

I think this line is dangerous, if add_cache_entry is not able to
remove higher-stages it will be looping forever, as happens in the
case of this thread.
I cannot see why it's even needed, and removing it doesn't break any test.

On Sun, Aug 24, 2014 at 7:57 PM, Jaime Soriano Pastor
<jsorianopastor@gmail.com> wrote:
> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
> ---
>  read-cache.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index c1a9619..3d70386 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1971,7 +1971,6 @@ int read_index_unmerged(struct index_state *istate)
>                 if (add_index_entry(istate, new_ce, 0))
>                         return error("%s: cannot drop to stage #0",
>                                      new_ce->name);
> -               i = index_name_pos(istate, new_ce->name, len);
>         }
>         return unmerged;
>  }
> --
> 2.0.4.1.g0b8a4f9.dirty
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] Loop index increases monotonically when reading unmerged entries
  2014-08-24 18:04                                 ` Jaime Soriano Pastor
@ 2014-08-25 17:09                                   ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-25 17:09 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> I think this line is dangerous, if add_cache_entry is not able to
> remove higher-stages it will be looping forever, as happens in the
> case of this thread.
> I cannot see why it's even needed, and removing it doesn't break any test.
>
> On Sun, Aug 24, 2014 at 7:57 PM, Jaime Soriano Pastor
> <jsorianopastor@gmail.com> wrote:
>> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
>> ---
>>  read-cache.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/read-cache.c b/read-cache.c
>> index c1a9619..3d70386 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -1971,7 +1971,6 @@ int read_index_unmerged(struct index_state *istate)
>>                 if (add_index_entry(istate, new_ce, 0))
>>                         return error("%s: cannot drop to stage #0",
>>                                      new_ce->name);
>> -               i = index_name_pos(istate, new_ce->name, len);

I think the original idea was that regardless of how many entries
with the same name were removed because of the replacement (or
addition) of "new_ce", by making "i" point at the newly added
"new_ce", we would make sure that the loop will continue from the
next entry.  The if/return expected that add_index_entry() will get
rid of all the other entries with the same name as "new_ce" has or it
will return an error.

Without the "bug" in add_index_entry(), because "new_ce" always has
the same name as "ce", the entry we found at "i" by the loop, we know
that index_name_pos() will give the same "i" we already have, so
removing this line should be a no-op.

Now, add_index_entry() in your case did not notice that it failed to
remove all other entries with the same name as "new_ce", resulting
in your "looping forever".  Among the "merged and unmerged entries
with the same name exists in the index file" class of index file
corruption, we could treat the "merged and unmerged entries with the
same name not just exists but next to each other, unmerged ones
coming immediately after merged one" case specially (i.e. declaring
that it is more likely for a broken software to leave both next to
each other than otherwise) and try to accomodate it as your original
patch did.  I am not absolutely sure if such a special case is worth
it, and with your updated "[1/2] read_index_from(): check order of
entries when reading index" we will not be doing so, which is good.

With that safety in place, the "bug" in add_index_entry() will
disappear; it is safe not to adjust "i" by calling index_name_pos()
and this patch, "[2/2] read_index_unmerged(): remove unnecessary
loop index adjustment", will be a good thing to do.

Thanks.

>>         }
>>         return unmerged;
>>  }
>> --
>> 2.0.4.1.g0b8a4f9.dirty
>>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-24 17:57                             ` [PATCH 1/2] " Jaime Soriano Pastor
  2014-08-24 17:57                               ` [PATCH 2/2] Loop index increases monotonically when reading unmerged entries Jaime Soriano Pastor
@ 2014-08-25 17:21                               ` Junio C Hamano
  2014-08-25 19:44                                 ` Jeff King
  2014-08-25 20:26                                 ` Jaime Soriano Pastor
  1 sibling, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-25 17:21 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> Subject: Re: [PATCH 1/2] Check order when reading index

Please be careful when crafting the commit title.  This single line
will be the only one that readers will have to identify the change
among hundreds of entries in "git shortlog" output when trying to
see what kind of change went into the project during the given
period.  Something like:

    read_index_from(): catch out of order entries while reading an index file

perhaps?

> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
> ---
>  read-cache.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7f5645e..c1a9619 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
>  	return ce;
>  }
>  
> +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)

Does this have to be global, i.e. not "static void ..."?

> +{
> +	int name_compare = strcmp(ce->name, next_ce->name);
> +	if (0 < name_compare)
> +		die("Unordered stage entries in index");
> +	if (!name_compare) {
> +		if (!ce_stage(ce))
> +			die("Multiple stage entries for merged file '%s'",
> +				ce->name);

OK.  If ce is at stage #0, no other entry can have the same name
regardless of the stage, and next_ce having the same name violates
that rule.

> +		if (ce_stage(ce) >= ce_stage(next_ce))
> +			die("Unordered stage entries for '%s'",
> +				ce->name);

Not quite.  We do allow multiple higher stage entries; having two or
more stage #1 entries is perfectly fine during a merge resolution,
and both ce and next_ce may be pointing at the stage #1 entries of
the same path.  Replacing the comparison with ">" is sufficient, I
think.

Thanks.

> +	}
> +}
> +
>  /* remember to discard_cache() before reading a different cache! */
>  int read_index_from(struct index_state *istate, const char *path)
>  {
> @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
>  		ce = create_from_disk(disk_ce, &consumed, previous_name);
>  		set_index_entry(istate, i, ce);
>  
> +		if (i > 0)
> +			check_ce_order(istate->cache[i - 1], ce);
> +
>  		src_offset += consumed;
>  	}
>  	strbuf_release(&previous_name_buf);

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-25 17:21                               ` [PATCH 1/2] Check order when reading index Junio C Hamano
@ 2014-08-25 19:44                                 ` Jeff King
  2014-08-25 20:52                                   ` Junio C Hamano
  2014-08-25 20:26                                 ` Jaime Soriano Pastor
  1 sibling, 1 reply; 51+ messages in thread
From: Jeff King @ 2014-08-25 19:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jaime Soriano Pastor, git

On Mon, Aug 25, 2014 at 10:21:58AM -0700, Junio C Hamano wrote:

> > +		if (ce_stage(ce) >= ce_stage(next_ce))
> > +			die("Unordered stage entries for '%s'",
> > +				ce->name);
> 
> Not quite.  We do allow multiple higher stage entries; having two or
> more stage #1 entries is perfectly fine during a merge resolution,
> and both ce and next_ce may be pointing at the stage #1 entries of
> the same path.  Replacing the comparison with ">" is sufficient, I
> think.

For my own curiosity, how do you get into this situation, and what does
it mean to have multiple stage#1 entries for the same path? What would
"git cat-file :1:path" output?

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-25 17:21                               ` [PATCH 1/2] Check order when reading index Junio C Hamano
  2014-08-25 19:44                                 ` Jeff King
@ 2014-08-25 20:26                                 ` Jaime Soriano Pastor
  1 sibling, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-25 20:26 UTC (permalink / raw)
  To: git

On Mon, Aug 25, 2014 at 7:21 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:
>
>> Subject: Re: [PATCH 1/2] Check order when reading index
>
> Please be careful when crafting the commit title.  This single line
> will be the only one that readers will have to identify the change
> among hundreds of entries in "git shortlog" output when trying to
> see what kind of change went into the project during the given
> period.  Something like:
>
>     read_index_from(): catch out of order entries while reading an index file
>
> perhaps?
>
Ok, reprashing it.

>> +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
>
> Does this have to be global, i.e. not "static void ..."?
>
Not really, changing it to static.

>> +             if (ce_stage(ce) >= ce_stage(next_ce))
>> +                     die("Unordered stage entries for '%s'",
>> +                             ce->name);
>
> Not quite.  We do allow multiple higher stage entries; having two or
> more stage #1 entries is perfectly fine during a merge resolution,
> and both ce and next_ce may be pointing at the stage #1 entries of
> the same path.  Replacing the comparison with ">" is sufficient, I
> think.
>
Ok, but like Jeff, I'm also curious about how to have multiple stage
#1 entries for the same path.

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-25 19:44                                 ` Jeff King
@ 2014-08-25 20:52                                   ` Junio C Hamano
  2014-08-26 12:08                                     ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-25 20:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Jaime Soriano Pastor, Git Mailing List

On Mon, Aug 25, 2014 at 12:44 PM, Jeff King <peff@peff.net> wrote:
> For my own curiosity, how do you get into this situation, and what does
> it mean to have multiple stage#1 entries for the same path? What would
> "git cat-file :1:path" output?

That is how we natively (read: not with the funky "virtual" stuff
merge-recursive does) express a merge with multiple merge bases.
You also should be able to read this in the way how "git merge" invokes
merge strategies (one or more bases, double-dash and then current
HEAD and the other branches). I think there are some tests in 3way
merge tests that checks what should happen when our HEAD matches
one of the stage #1 while their branch matches a different one of the
stage #1, too.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-25 20:52                                   ` Junio C Hamano
@ 2014-08-26 12:08                                     ` Jaime Soriano Pastor
  2014-08-26 12:20                                       ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-26 12:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On Mon, Aug 25, 2014 at 10:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
> On Mon, Aug 25, 2014 at 12:44 PM, Jeff King <peff@peff.net> wrote:
>> For my own curiosity, how do you get into this situation, and what does
>> it mean to have multiple stage#1 entries for the same path? What would
>> "git cat-file :1:path" output?
>
> That is how we natively (read: not with the funky "virtual" stuff
> merge-recursive does) express a merge with multiple merge bases.
> You also should be able to read this in the way how "git merge" invokes
> merge strategies (one or more bases, double-dash and then current
> HEAD and the other branches). I think there are some tests in 3way
> merge tests that checks what should happen when our HEAD matches
> one of the stage #1 while their branch matches a different one of the
> stage #1, too.

I'm a bit lost with this, conceptually it doesn't seem to be any
problem with having multiple merge bases, but I don't manage to
reproduce it.
With "natively" do you mean some internal state that is never written
into the index? If this were the case then there wouldn't be any
problem with the restriction when reading the index file.

I have also tried to reproduce it by directly calling
git-merge-recursive and the most I have got is what it seemed to be
like a conflict in the stage #1:

$ git ls-files -s
100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict
100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict
100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict

$ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b
<<<<<<< Temporary merge branch 1
G
=======
E
F
>>>>>>> Temporary merge branch 2

And after thinking a bit more about it, I don't see a way to have two
stage #1 entries for the same path with git commands only. It seems
that all entries are added through the add_index_entry_with_check()
function (except maybe the added to the cached tree extension), and
this function replaces existing entries if they have the same name and
stage.
Also, all tests pass with the patch, without allowing two entries for
the same stage.

So I'm afraid that I don't fully understand this case :/

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-26 12:08                                     ` Jaime Soriano Pastor
@ 2014-08-26 12:20                                       ` Jeff King
  2014-08-26 16:53                                         ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2014-08-26 12:20 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: Junio C Hamano, Git Mailing List

On Tue, Aug 26, 2014 at 02:08:35PM +0200, Jaime Soriano Pastor wrote:

> > That is how we natively (read: not with the funky "virtual" stuff
> > merge-recursive does) express a merge with multiple merge bases.
> > You also should be able to read this in the way how "git merge" invokes
> > merge strategies (one or more bases, double-dash and then current
> > HEAD and the other branches). I think there are some tests in 3way
> > merge tests that checks what should happen when our HEAD matches
> > one of the stage #1 while their branch matches a different one of the
> > stage #1, too.
> 
> I'm a bit lost with this, conceptually it doesn't seem to be any
> problem with having multiple merge bases, but I don't manage to
> reproduce it.
> With "natively" do you mean some internal state that is never written
> into the index? If this were the case then there wouldn't be any
> problem with the restriction when reading the index file.

FWIW, that was my question on reading Junio's response, too.

> I have also tried to reproduce it by directly calling
> git-merge-recursive and the most I have got is what it seemed to be
> like a conflict in the stage #1:
> 
> $ git ls-files -s
> 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict
> 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict
> 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict
> 
> $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b
> <<<<<<< Temporary merge branch 1
> G
> =======
> E
> F
> >>>>>>> Temporary merge branch 2

Yes, I think merge-recursive resolves the earlier merges and then feeds
the result into the main merge. And that's how you end up with the
"temporary merge branch" name instead of something useful.

It might work to have a recursive merge that causes a conflict on path
X, and then we further need to resolve that conflict. I'm not sure if we
try to represent that in the index somehow, or if merge-recursive just
bails in this case (I didn't try it).

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-26 12:20                                       ` Jeff King
@ 2014-08-26 16:53                                         ` Junio C Hamano
  2014-08-26 17:22                                           ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-26 16:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Jaime Soriano Pastor, Git Mailing List

Jeff King <peff@peff.net> writes:

>> With "natively" do you mean some internal state that is never written
>> into the index? If this were the case then there wouldn't be any
>> problem with the restriction when reading the index file.
>
> FWIW, that was my question on reading Junio's response, too.

The current code may not put two stage #1 entries for the same path
but allowing such entries was a part of the design from very early
days; iow it is a valid index file, unlike the one that has both
stage #0 and stage #1 for the same path.  It is a no-no to reject
such an index as long as we are discussing to add a new code to
tighten the sanity check on the file content.

>> I have also tried to reproduce it by directly calling
>> git-merge-recursive and the most I have got is what it seemed to be
>> like a conflict in the stage #1:
>> 
>> $ git ls-files -s
>> 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict
>> 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict
>> 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict
>> 
>> $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b
>> <<<<<<< Temporary merge branch 1
>> G
>> =======
>> E
>> F
>> >>>>>>> Temporary merge branch 2
>
> Yes, I think merge-recursive resolves the earlier merges and then feeds
> the result into the main merge. And that's how you end up with the
> "temporary merge branch" name instead of something useful.

Yes---that is what I meant by the "virtual stuff".  Unlike resolve
work by Daniel (around Sep 2005 $gmane/8088) that tried to use
multiple ancestors directly in a single merge, recursive limits
itself to repeated use of pairwise merges.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-26 16:53                                         ` Junio C Hamano
@ 2014-08-26 17:22                                           ` Jaime Soriano Pastor
  2014-08-26 17:43                                             ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-26 17:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On Tue, Aug 26, 2014 at 6:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Yes---that is what I meant by the "virtual stuff".  Unlike resolve
> work by Daniel (around Sep 2005 $gmane/8088) that tried to use
> multiple ancestors directly in a single merge, recursive limits
> itself to repeated use of pairwise merges.

Ok, I see now. Then what about checking also in check_ce_order() that
only stage #1 can be repeated?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-26 17:22                                           ` Jaime Soriano Pastor
@ 2014-08-26 17:43                                             ` Junio C Hamano
  2014-08-27 19:48                                               ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Jaime Soriano Pastor
  2014-08-27 19:52                                               ` [PATCH 1/2] Check order when reading index Jaime Soriano Pastor
  0 siblings, 2 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-26 17:43 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: Jeff King, Git Mailing List

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> On Tue, Aug 26, 2014 at 6:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Yes---that is what I meant by the "virtual stuff".  Unlike resolve
>> work by Daniel (around Sep 2005 $gmane/8088) that tried to use
>> multiple ancestors directly in a single merge, recursive limits
>> itself to repeated use of pairwise merges.
>
> Ok, I see now. Then what about checking also in check_ce_order() that
> only stage #1 can be repeated?

We could use multiple stage #3 entries to natively represent an
octopus merge in progress if we wanted to.  I do not think we want
to close the door for doing so, unless there is some compelling
reason.

Does the current codebase choke with such entries in the index file,
like you saw in your index file with both stage #0 and stage #1
entries?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file
  2014-08-26 17:43                                             ` Junio C Hamano
@ 2014-08-27 19:48                                               ` Jaime Soriano Pastor
  2014-08-27 19:48                                                 ` [PATCH 2/2] read_index_unmerged(): remove unnecessary loop index adjustment Jaime Soriano Pastor
  2014-08-29  2:16                                                 ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Eric Sunshine
  2014-08-27 19:52                                               ` [PATCH 1/2] Check order when reading index Jaime Soriano Pastor
  1 sibling, 2 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-27 19:48 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..1cdb762 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
+static void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
+{
+	int name_compare = strcmp(ce->name, next_ce->name);
+	if (0 < name_compare)
+		die("Unordered stage entries in index");
+	if (!name_compare) {
+		if (!ce_stage(ce))
+			die("Multiple stage entries for merged file '%s'",
+				ce->name);
+		if (ce_stage(ce) > ce_stage(next_ce))
+			die("Unordered stage entries for '%s'",
+				ce->name);
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
@@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
 		ce = create_from_disk(disk_ce, &consumed, previous_name);
 		set_index_entry(istate, i, ce);
 
+		if (i > 0)
+			check_ce_order(istate->cache[i - 1], ce);
+
 		src_offset += consumed;
 	}
 	strbuf_release(&previous_name_buf);
-- 
2.0.4.2.g7bc378e.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/2] read_index_unmerged(): remove unnecessary loop index adjustment
  2014-08-27 19:48                                               ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Jaime Soriano Pastor
@ 2014-08-27 19:48                                                 ` Jaime Soriano Pastor
  2014-08-29  2:16                                                 ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Eric Sunshine
  1 sibling, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-27 19:48 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 1cdb762..39fca8c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1971,7 +1971,6 @@ int read_index_unmerged(struct index_state *istate)
 		if (add_index_entry(istate, new_ce, 0))
 			return error("%s: cannot drop to stage #0",
 				     new_ce->name);
-		i = index_name_pos(istate, new_ce->name, len);
 	}
 	return unmerged;
 }
-- 
2.0.4.2.g7bc378e.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-26 17:43                                             ` Junio C Hamano
  2014-08-27 19:48                                               ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Jaime Soriano Pastor
@ 2014-08-27 19:52                                               ` Jaime Soriano Pastor
  2014-08-27 21:11                                                 ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-27 19:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On Tue, Aug 26, 2014 at 7:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Does the current codebase choke with such entries in the index file,
> like you saw in your index file with both stage #0 and stage #1
> entries?

Not sure, I couldn't reproduce an scenario with an index with multiple
entries in the same stage for the same path.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-27 19:52                                               ` [PATCH 1/2] Check order when reading index Jaime Soriano Pastor
@ 2014-08-27 21:11                                                 ` Junio C Hamano
  2014-08-27 22:13                                                   ` Jaime Soriano Pastor
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2014-08-27 21:11 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: Jeff King, Git Mailing List

Jaime Soriano Pastor <jsorianopastor@gmail.com> writes:

> On Tue, Aug 26, 2014 at 7:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Does the current codebase choke with such entries in the index file,
>> like you saw in your index file with both stage #0 and stage #1
>> entries?
>
> Not sure, I couldn't reproduce an scenario with an index with multiple
> entries in the same stage for the same path.

I think we have been discussing how to protect broken index file
left by tools other people wrote, so I wouldn't be so surprised if
our current toolset does not let you recreate certain breakages ;-)

I was asking for an answer more from what you know about the code.
For example, would read_index_unmerged() choke if the index has two
or more stage #1 (or stage #3) entries for the same path (provided
that the index is otherwise normal, i.e. no stage #0 entry for that
path, entries are sorted by pathname order and stages are in an
order that does not decrease)?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] Check order when reading index
  2014-08-27 21:11                                                 ` Junio C Hamano
@ 2014-08-27 22:13                                                   ` Jaime Soriano Pastor
  0 siblings, 0 replies; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-27 22:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Git Mailing List

On Wed, Aug 27, 2014 at 11:11 PM, Junio C Hamano <gitster@pobox.com> wrote:
> I was asking for an answer more from what you know about the code.
> For example, would read_index_unmerged() choke if the index has two
> or more stage #1 (or stage #3) entries for the same path (provided
> that the index is otherwise normal, i.e. no stage #0 entry for that
> path, entries are sorted by pathname order and stages are in an
> order that does not decrease)?

Oh, ok :) Re-reading the code a bit I think that there can be a
potential problem in the add_index_entry_with_check() function. It's
currently implemented to allow an only entry for each stage and each
path, if an entry for a path and a stage is being added, and another
one existed before, the old one is replaced, but just the first one,
so adding an entry to stage #1 in an index with multiple entries at
stage #1 would replace the first occurence, but not the rest, what
could not be expected. The user could maybe expect that all entries
are replaced, or only an specific one.
If an stage #0 entry is added and there are multiple entries for any
of the higher-stage entries there wouldn't be any problem as this
function removes all the higher-stage entries for the same path
without checking the stage. This last case is the one in
read_index_unmerged().

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file
  2014-08-27 19:48                                               ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Jaime Soriano Pastor
  2014-08-27 19:48                                                 ` [PATCH 2/2] read_index_unmerged(): remove unnecessary loop index adjustment Jaime Soriano Pastor
@ 2014-08-29  2:16                                                 ` Eric Sunshine
  2014-08-29  8:54                                                   ` [PATCH] " Jaime Soriano Pastor
  1 sibling, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2014-08-29  2:16 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: Git List

On Wed, Aug 27, 2014 at 3:48 PM, Jaime Soriano Pastor
<jsorianopastor@gmail.com> wrote:
> Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
> ---
>  read-cache.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7f5645e..1cdb762 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
>         return ce;
>  }
>
> +static void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
> +{
> +       int name_compare = strcmp(ce->name, next_ce->name);
> +       if (0 < name_compare)
> +               die("Unordered stage entries in index");
> +       if (!name_compare) {
> +               if (!ce_stage(ce))
> +                       die("Multiple stage entries for merged file '%s'",
> +                               ce->name);
> +               if (ce_stage(ce) > ce_stage(next_ce))
> +                       die("Unordered stage entries for '%s'",
> +                               ce->name);

Perhaps consider dropping capitalization from error messages [1]
(despite existing code in read-cache.c having a mix of the two
styles). See "Error Messages" in Documentation/CodingGuidelines.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/251715/focus=253209

> +       }
> +}
> +
>  /* remember to discard_cache() before reading a different cache! */
>  int read_index_from(struct index_state *istate, const char *path)
>  {
> @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path)
>                 ce = create_from_disk(disk_ce, &consumed, previous_name);
>                 set_index_entry(istate, i, ce);
>
> +               if (i > 0)
> +                       check_ce_order(istate->cache[i - 1], ce);
> +
>                 src_offset += consumed;
>         }
>         strbuf_release(&previous_name_buf);
> --
> 2.0.4.2.g7bc378e.dirty

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] read_index_from(): catch out of order entries when reading an index file
  2014-08-29  2:16                                                 ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Eric Sunshine
@ 2014-08-29  8:54                                                   ` Jaime Soriano Pastor
  2014-08-29 17:06                                                     ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jaime Soriano Pastor @ 2014-08-29  8:54 UTC (permalink / raw)
  To: git; +Cc: Jaime Soriano Pastor

Signed-off-by: Jaime Soriano Pastor <jsorianopastor@gmail.com>
---
 read-cache.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 5d3c8bd..023d6d7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1465,6 +1465,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
+static void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
+{
+	int name_compare = strcmp(ce->name, next_ce->name);
+	if (0 < name_compare)
+		die("unordered stage entries in index");
+	if (!name_compare) {
+		if (!ce_stage(ce))
+			die("multiple stage entries for merged file '%s'",
+				ce->name);
+		if (ce_stage(ce) > ce_stage(next_ce))
+			die("unordered stage entries for '%s'",
+				ce->name);
+	}
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -1526,6 +1541,9 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 		ce = create_from_disk(disk_ce, &consumed, previous_name);
 		set_index_entry(istate, i, ce);
 
+		if (i > 0)
+			check_ce_order(istate->cache[i - 1], ce);
+
 		src_offset += consumed;
 	}
 	strbuf_release(&previous_name_buf);
-- 
2.0.4.1.gca370f9.dirty

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH] read_index_from(): catch out of order entries when reading an index file
  2014-08-29  8:54                                                   ` [PATCH] " Jaime Soriano Pastor
@ 2014-08-29 17:06                                                     ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2014-08-29 17:06 UTC (permalink / raw)
  To: Jaime Soriano Pastor; +Cc: git

Thanks, both.

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2014-08-29 17:06 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-12 15:31 [PATCH] read-cache.c: Ensure unmerged entries are removed Jaime Soriano Pastor
2014-08-12 15:31 ` Jaime Soriano Pastor
2014-08-12 18:39   ` Junio C Hamano
2014-08-13 22:10     ` Jaime Soriano Pastor
2014-08-13 23:04       ` Junio C Hamano
2014-08-15 19:50         ` Jaime Soriano Pastor
2014-08-15 21:45           ` Junio C Hamano
2014-08-16 14:51             ` Jaime Soriano Pastor
2014-08-18 16:34               ` Junio C Hamano
2014-08-18 17:23                 ` Jaime Soriano Pastor
2014-08-20 11:25                   ` [PATCH 0/4] Handling unmerged files with merged entries Jaime Soriano Pastor
2014-08-20 11:26                     ` [PATCH 1/4] read_index_unmerged doesn't loop forever if merged stage exists for unmerged file Jaime Soriano Pastor
2014-08-20 11:26                     ` [PATCH 2/4] Error out when adding a file with merged and unmerged entries Jaime Soriano Pastor
2014-08-20 11:26                     ` [PATCH 3/4] Added tests for the case of merged and unmerged entries for the same file Jaime Soriano Pastor
2014-08-20 21:00                       ` Junio C Hamano
2014-08-21 13:51                         ` Jaime Soriano Pastor
2014-08-21 22:21                           ` Junio C Hamano
2014-08-20 11:26                     ` [PATCH 4/4] git update-index --cacheinfo can be used to select a stage when there are merged and unmerged entries Jaime Soriano Pastor
2014-08-20 21:08                       ` Junio C Hamano
2014-08-21 13:57                         ` Jaime Soriano Pastor
2014-08-20 22:19                     ` [PATCH 0/4] Handling unmerged files with merged entries Junio C Hamano
2014-08-21 13:42                       ` Jaime Soriano Pastor
2014-08-21 13:43                         ` [PATCH] Check order when reading index Jaime Soriano Pastor
2014-08-21 13:49                           ` Jaime Soriano Pastor
2014-08-21 13:59                           ` Duy Nguyen
2014-08-21 18:51                           ` Junio C Hamano
2014-08-24 17:57                             ` [PATCH 1/2] " Jaime Soriano Pastor
2014-08-24 17:57                               ` [PATCH 2/2] Loop index increases monotonically when reading unmerged entries Jaime Soriano Pastor
2014-08-24 18:04                                 ` Jaime Soriano Pastor
2014-08-25 17:09                                   ` Junio C Hamano
2014-08-25 17:21                               ` [PATCH 1/2] Check order when reading index Junio C Hamano
2014-08-25 19:44                                 ` Jeff King
2014-08-25 20:52                                   ` Junio C Hamano
2014-08-26 12:08                                     ` Jaime Soriano Pastor
2014-08-26 12:20                                       ` Jeff King
2014-08-26 16:53                                         ` Junio C Hamano
2014-08-26 17:22                                           ` Jaime Soriano Pastor
2014-08-26 17:43                                             ` Junio C Hamano
2014-08-27 19:48                                               ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Jaime Soriano Pastor
2014-08-27 19:48                                                 ` [PATCH 2/2] read_index_unmerged(): remove unnecessary loop index adjustment Jaime Soriano Pastor
2014-08-29  2:16                                                 ` [PATCH 1/2] read_index_from(): catch out of order entries when reading an index file Eric Sunshine
2014-08-29  8:54                                                   ` [PATCH] " Jaime Soriano Pastor
2014-08-29 17:06                                                     ` Junio C Hamano
2014-08-27 19:52                                               ` [PATCH 1/2] Check order when reading index Jaime Soriano Pastor
2014-08-27 21:11                                                 ` Junio C Hamano
2014-08-27 22:13                                                   ` Jaime Soriano Pastor
2014-08-25 20:26                                 ` Jaime Soriano Pastor
2014-08-21 18:40                       ` [PATCH 0/4] Handling unmerged files with merged entries Johannes Sixt
2014-08-21 22:18                         ` Junio C Hamano
2014-08-12 18:31 ` [PATCH] read-cache.c: Ensure unmerged entries are removed Junio C Hamano
2014-08-13 22:10   ` Jaime Soriano Pastor

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.