All of lore.kernel.org
 help / color / mirror / Atom feed
* Strange effect merging empty file
@ 2012-03-21 10:28 Ralf Nyren
  2012-03-21 10:54 ` Zbigniew Jędrzejewski-Szmek
  0 siblings, 1 reply; 25+ messages in thread
From: Ralf Nyren @ 2012-03-21 10:28 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1236 bytes --]

Hi,

I found a "strange effect" when merging from a branch containing a 
change of a previously empty file. The change is added to another empty 
file in the current branch by the merge.

I guess git sees it as a renamed file which is logical from a 
content-perspective.

Not sure what to do with this, I would not say it is a bug really...

Reproduce as follows (should be cut-n-paste friendly):

# Start with a new repository
git init
echo 'Readme file' > README
git add README
git commit -m 'Initial commit'

# Setup the branch to be merged
git checkout -b import
touch empty.txt
echo hello world > hello.txt
git add empty.txt hello.txt
git commit -m 'Import 1.0'
git tag IMPORT_1.0
echo This file is no longer empty > empty.txt
git commit -m 'Import 1.1' empty.txt
git tag IMPORT_1.1

# Setup master branch
git checkout master
mkdir static
touch static/.gitignore
git add static/.gitignore
git commit -m 'Static web content'

# Merge import 1.0 and remove the empty file
git merge IMPORT_1.0
git rm empty.txt
git commit -m 'Remove empty file' empty.txt

# Merge import 1.1 and watch empty.txt contents show up in .gitignore
git merge IMPORT_1.1
cat static/.gitignore

regards, Ralf


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3994 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-21 10:28 Strange effect merging empty file Ralf Nyren
@ 2012-03-21 10:54 ` Zbigniew Jędrzejewski-Szmek
  2012-03-21 17:14   ` Junio C Hamano
  2012-03-22 12:17   ` Randal L. Schwartz
  0 siblings, 2 replies; 25+ messages in thread
From: Zbigniew Jędrzejewski-Szmek @ 2012-03-21 10:54 UTC (permalink / raw)
  To: Ralf Nyren; +Cc: git

On 03/21/2012 11:28 AM, Ralf Nyren wrote:
> Hi,
>
> I found a "strange effect" when merging from a branch containing a
> change of a previously empty file. The change is added to another empty
> file in the current branch by the merge.
>
> I guess git sees it as a renamed file which is logical from a
> content-perspective.
>
> Not sure what to do with this, I would not say it is a bug really...
It does seem like a bug. An unrelated file is clobbered.

> Reproduce as follows (should be cut-n-paste friendly):
Here's a slightly simplified version that is actually cut-n-paste 
friendly after recent changes to always edit merge comments...

# Start with a new repository
git init
echo Readme file >README
git add README
git commit -m 'Initial commit'

# Setup the branch to be merged
git checkout -b import
touch empty.txt
echo hello world >hello.txt
git add empty.txt hello.txt
git commit -m 'Import 1.0'
echo This file is no longer empty >empty.txt
git commit -m 'Import 1.1' empty.txt

# Setup master branch
git checkout master
touch .gitignore
git add .gitignore
git commit -m 'Static web content'

# Merge import 1.0 and remove the empty file
git merge  --no-edit import^
git rm empty.txt
git commit -m 'Remove empty file'

# Merge import 1.1 and watch empty.txt contents show up in .gitignore
git merge --no-edit import
cat .gitignore


Regards,
Zbyszek

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-21 10:54 ` Zbigniew Jędrzejewski-Szmek
@ 2012-03-21 17:14   ` Junio C Hamano
  2012-03-22 12:17   ` Randal L. Schwartz
  1 sibling, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2012-03-21 17:14 UTC (permalink / raw)
  To: Zbigniew Jędrzejewski-Szmek; +Cc: Ralf Nyren, git

Zbigniew Jędrzejewski-Szmek  <zbyszek@in.waw.pl> writes:

> # Merge import 1.0 and remove the empty file
> git merge  --no-edit import^
> git rm empty.txt
> git commit -m 'Remove empty file'
>
> # Merge import 1.1 and watch empty.txt contents show up in .gitignore
> git merge --no-edit import
> cat .gitignore

Given what merge-recursive does, I am not very surprised, even though I
agree that it feels like a very surprising end user experience ;-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-21 10:54 ` Zbigniew Jędrzejewski-Szmek
  2012-03-21 17:14   ` Junio C Hamano
@ 2012-03-22 12:17   ` Randal L. Schwartz
  2012-03-22 12:39     ` Ralf Nyren
  2012-03-22 12:47     ` Zbigniew Jędrzejewski-Szmek
  1 sibling, 2 replies; 25+ messages in thread
From: Randal L. Schwartz @ 2012-03-22 12:17 UTC (permalink / raw)
  To: Zbigniew Jędrzejewski-Szmek; +Cc: Ralf Nyren, git

>>>>> "Zbigniew" == Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> writes:

Zbigniew> touch .gitignore

If you're doing this to make sure git makes an empty directory, always
put the directory name in there as a comment... that makes it (a) not an
empty file (so merge doesn't get confused) and (b) unique, so when you
change it, merge won't try to change other similar empty files.

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 12:17   ` Randal L. Schwartz
@ 2012-03-22 12:39     ` Ralf Nyren
  2012-03-22 12:47     ` Zbigniew Jędrzejewski-Szmek
  1 sibling, 0 replies; 25+ messages in thread
From: Ralf Nyren @ 2012-03-22 12:39 UTC (permalink / raw)
  To: Randal L. Schwartz; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 610 bytes --]

On 22/03/12 13:17, Randal L. Schwartz wrote:
>>>>>> "Zbigniew" == Zbigniew Jędrzejewski-Szmek<zbyszek@in.waw.pl>  writes:
>
> Zbigniew>  touch .gitignore
>
> If you're doing this to make sure git makes an empty directory, always
> put the directory name in there as a comment... that makes it (a) not an
> empty file (so merge doesn't get confused) and (b) unique, so when you
> change it, merge won't try to change other similar empty files.

Good point.

Renaming directories makes the comment out-of-date but since uniqueness 
is what matters it should not be a problem.

regards, Ralf



[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3994 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 12:17   ` Randal L. Schwartz
  2012-03-22 12:39     ` Ralf Nyren
@ 2012-03-22 12:47     ` Zbigniew Jędrzejewski-Szmek
  2012-03-22 14:01       ` Jeff King
  1 sibling, 1 reply; 25+ messages in thread
From: Zbigniew Jędrzejewski-Szmek @ 2012-03-22 12:47 UTC (permalink / raw)
  To: Randal L. Schwartz; +Cc: Ralf Nyren, git, Junio C Hamano

On 03/22/2012 01:17 PM, Randal L. Schwartz wrote:
>>>>>> "Zbigniew" == Zbigniew Jędrzejewski-Szmek<zbyszek@in.waw.pl>  writes:
>
> Zbigniew>  touch .gitignore
>
> If you're doing this to make sure git makes an empty directory, always
> put the directory name in there as a comment... that makes it (a) not an
> empty file (so merge doesn't get confused) and (b) unique, so when you
> change it, merge won't try to change other similar empty files.

Yes, this will indeed fix this particular problem. But in general, empty 
files can be used for various reasons, and it can be a pretty nasty 
surprise if they sprout random content as a result of a merge.

Maybe merge-recursive could special-case empty files?

Zbyszek

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 12:47     ` Zbigniew Jędrzejewski-Szmek
@ 2012-03-22 14:01       ` Jeff King
  2012-03-22 17:03         ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2012-03-22 14:01 UTC (permalink / raw)
  To: Zbigniew Jędrzejewski-Szmek
  Cc: Randal L. Schwartz, Ralf Nyren, git, Junio C Hamano

On Thu, Mar 22, 2012 at 01:47:04PM +0100, Zbigniew Jędrzejewski-Szmek wrote:

> Yes, this will indeed fix this particular problem. But in general,
> empty files can be used for various reasons, and it can be a pretty
> nasty surprise if they sprout random content as a result of a merge.
> 
> Maybe merge-recursive could special-case empty files?

I was thinking the same thing. And it seems this came up once before,
and the list seemed to favor special-casing merge-recursive (but not
diffcore):

  http://thread.gmane.org/gmane.comp.version-control.git/116917/focus=117082

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 14:01       ` Jeff King
@ 2012-03-22 17:03         ` Junio C Hamano
  2012-03-22 17:59           ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2012-03-22 17:03 UTC (permalink / raw)
  To: Jeff King
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

Jeff King <peff@peff.net> writes:

> I was thinking the same thing. And it seems this came up once before,
> and the list seemed to favor special-casing merge-recursive (but not
> diffcore):
>
>   http://thread.gmane.org/gmane.comp.version-control.git/116917/focus=117082

Yeah, thanks for digging up the old thread. I was looking at the patch to
merge-recursive from Dscho on that thread and I think it identified the
place that needs patching correctly. I was on a tablet, without the access
to the surrounding code outside the patch context, so I do not know if the
logic to detect the pure-rename of an empty file in the patch was correct,
or the patch still applies to the current codebase, though.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 17:03         ` Junio C Hamano
@ 2012-03-22 17:59           ` Jeff King
  2012-03-22 18:25             ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2012-03-22 17:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 10:03:02AM -0700, Junio C Hamano wrote:

> > I was thinking the same thing. And it seems this came up once before,
> > and the list seemed to favor special-casing merge-recursive (but not
> > diffcore):
> >
> >   http://thread.gmane.org/gmane.comp.version-control.git/116917/focus=117082
> 
> Yeah, thanks for digging up the old thread. I was looking at the patch to
> merge-recursive from Dscho on that thread and I think it identified the
> place that needs patching correctly. I was on a tablet, without the access
> to the surrounding code outside the patch context, so I do not know if the
> logic to detect the pure-rename of an empty file in the patch was correct,
> or the patch still applies to the current codebase, though.

It's easy to apply the patch manually, and I have written a test.
However, it seems to cause lots of other parts of t6022 to fail. I'll
try to dig up the cause.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 17:59           ` Jeff King
@ 2012-03-22 18:25             ` Jeff King
  2012-03-22 18:52               ` Jeff King
  2012-03-22 18:52               ` Strange effect merging empty file Junio C Hamano
  0 siblings, 2 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 18:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 01:59:53PM -0400, Jeff King wrote:

> > Yeah, thanks for digging up the old thread. I was looking at the patch to
> > merge-recursive from Dscho on that thread and I think it identified the
> > place that needs patching correctly. I was on a tablet, without the access
> > to the surrounding code outside the patch context, so I do not know if the
> > logic to detect the pure-rename of an empty file in the patch was correct,
> > or the patch still applies to the current codebase, though.
> 
> It's easy to apply the patch manually, and I have written a test.
> However, it seems to cause lots of other parts of t6022 to fail. I'll
> try to dig up the cause.

Found it. The diff code is very smart about doing as little work as
possible. For a raw diff (i.e., not patch), we can often get away with
not loading the blob at all, and therefore have no idea what the size
is. The inexact rename code may load it, of course, but any file which
is an exact rename will have a "0" size, also.

We can get around it by just checking for the empty-blob sha1. The patch
below should do the right thing, and passes the whole test suite.

---
diff --git a/cache.h b/cache.h
index e5e1aa4..61671b6 100644
--- a/cache.h
+++ b/cache.h
@@ -708,6 +708,8 @@ static inline void hashclr(unsigned char *hash)
 #define EMPTY_TREE_SHA1_BIN \
 	 ((const unsigned char *) EMPTY_TREE_SHA1_BIN_LITERAL)
 
+int is_empty_blob_sha1(const unsigned char *sha1);
+
 int git_mkstemp(char *path, size_t n, const char *template);
 
 int git_mkstemps(char *path, size_t n, const char *template, int suffix_len);
diff --git a/merge-recursive.c b/merge-recursive.c
index 6479a60..ed4ff16 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -502,7 +502,7 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct string_list_item *item;
 		struct rename *re;
 		struct diff_filepair *pair = diff_queued_diff.queue[i];
-		if (pair->status != 'R') {
+		if (pair->status != 'R' || is_empty_blob_sha1(pair->one->sha1)) {
 			diff_free_filepair(pair);
 			continue;
 		}
diff --git a/read-cache.c b/read-cache.c
index 274e54b..dfabad0 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -157,7 +157,7 @@ static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st)
 	return 0;
 }
 
-static int is_empty_blob_sha1(const unsigned char *sha1)
+int is_empty_blob_sha1(const unsigned char *sha1)
 {
 	static const unsigned char empty_blob_sha1[20] = {
 		0xe6,0x9d,0xe2,0x9b,0xb2,0xd1,0xd6,0x43,0x4b,0x8b,
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index 9d8584e..1104249 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -884,4 +884,20 @@ test_expect_success 'no spurious "refusing to lose untracked" message' '
 	! grep "refusing to lose untracked file" errors.txt
 '
 
+test_expect_success 'do not follow renames for empty files' '
+	git checkout -f -b empty-base &&
+	>empty1 &&
+	git add empty1 &&
+	git commit -m base &&
+	echo content >empty1 &&
+	git add empty1 &&
+	git commit -m fill &&
+	git checkout -b empty-topic HEAD^ &&
+	git mv empty1 empty2 &&
+	git commit -m rename &&
+	test_must_fail git merge empty-base &&
+	>expect &&
+	test_cmp expect empty2
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 18:25             ` Jeff King
@ 2012-03-22 18:52               ` Jeff King
  2012-03-22 18:53                 ` [PATCH 1/3] drop casts from users EMPTY_TREE_SHA1_BIN Jeff King
                                   ` (2 more replies)
  2012-03-22 18:52               ` Strange effect merging empty file Junio C Hamano
  1 sibling, 3 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 18:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 02:25:33PM -0400, Jeff King wrote:

> We can get around it by just checking for the empty-blob sha1. The patch
> below should do the right thing, and passes the whole test suite.

Here it is broken up properly into commits:

  [1/3]: drop casts from users EMPTY_TREE_SHA1_BIN
  [2/3]: make is_empty_blob_sha1 available everywhere
  [3/3]: merge-recursive: don't detect renames from empty files

The first one is just a cleanup I noticed while adding
EMPTY_BLOB_SHA1_BIN, and is not strictly related. The second one is the
refactoring bits, and the third one is the real change.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 18:25             ` Jeff King
  2012-03-22 18:52               ` Jeff King
@ 2012-03-22 18:52               ` Junio C Hamano
  2012-03-22 19:03                 ` Jeff King
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2012-03-22 18:52 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

Jeff King <peff@peff.net> writes:

> On Thu, Mar 22, 2012 at 01:59:53PM -0400, Jeff King wrote:
>
>> > Yeah, thanks for digging up the old thread. I was looking at the patch to
>> > merge-recursive from Dscho on that thread and I think it identified the
>> > place that needs patching correctly. I was on a tablet, without the access
>> > to the surrounding code outside the patch context, so I do not know if the
>> > logic to detect the pure-rename of an empty file in the patch was correct,
>> > or the patch still applies to the current codebase, though.
>> 
>> It's easy to apply the patch manually, and I have written a test.
>> However, it seems to cause lots of other parts of t6022 to fail. I'll
>> try to dig up the cause.
>
> Found it. The diff code is very smart about doing as little work as
> possible. For a raw diff (i.e., not patch), we can often get away with
> not loading the blob at all, and therefore have no idea what the size
> is. The inexact rename code may load it, of course, but any file which
> is an exact rename will have a "0" size, also.

Thanks.

The "I do not know if the logic is correct" reservation pays off ;-)

I still wonder why checking only the preimage side is sufficient, though.
Shouldn't we check both sides?

> diff --git a/merge-recursive.c b/merge-recursive.c
> index 6479a60..ed4ff16 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -502,7 +502,7 @@ static struct string_list *get_renames(struct merge_options *o,
>  		struct string_list_item *item;
>  		struct rename *re;
>  		struct diff_filepair *pair = diff_queued_diff.queue[i];
> -		if (pair->status != 'R') {
> +		if (pair->status != 'R' || is_empty_blob_sha1(pair->one->sha1)) {
>  			diff_free_filepair(pair);
>  			continue;
>  		}
> diff --git a/read-cache.c b/read-cache.c
> index 274e54b..dfabad0 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -157,7 +157,7 @@ static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st)
>  	return 0;
>  }
>  
> -static int is_empty_blob_sha1(const unsigned char *sha1)
> +int is_empty_blob_sha1(const unsigned char *sha1)
>  {
>  	static const unsigned char empty_blob_sha1[20] = {
>  		0xe6,0x9d,0xe2,0x9b,0xb2,0xd1,0xd6,0x43,0x4b,0x8b,
> diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
> index 9d8584e..1104249 100755
> --- a/t/t6022-merge-rename.sh
> +++ b/t/t6022-merge-rename.sh
> @@ -884,4 +884,20 @@ test_expect_success 'no spurious "refusing to lose untracked" message' '
>  	! grep "refusing to lose untracked file" errors.txt
>  '
>  
> +test_expect_success 'do not follow renames for empty files' '
> +	git checkout -f -b empty-base &&
> +	>empty1 &&
> +	git add empty1 &&
> +	git commit -m base &&
> +	echo content >empty1 &&
> +	git add empty1 &&
> +	git commit -m fill &&
> +	git checkout -b empty-topic HEAD^ &&
> +	git mv empty1 empty2 &&
> +	git commit -m rename &&
> +	test_must_fail git merge empty-base &&
> +	>expect &&
> +	test_cmp expect empty2
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/3] drop casts from users EMPTY_TREE_SHA1_BIN
  2012-03-22 18:52               ` Jeff King
@ 2012-03-22 18:53                 ` Jeff King
  2012-03-22 18:53                 ` [PATCH 2/3] make is_empty_blob_sha1 available everywhere Jeff King
  2012-03-22 18:53                 ` [PATCH 3/3] merge-recursive: don't detect renames from empty files Jeff King
  2 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 18:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

This macro already evaluates to the correct type, as it
casts the string literal to "unsigned char *" itself
(and callers who want the literal can use the _LITERAL
form).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/diff.c    |    2 +-
 merge-recursive.c |    2 +-
 sequencer.c       |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/diff.c b/builtin/diff.c
index 424c815..9069dc4 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -327,7 +327,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
 				add_head_to_pending(&rev);
 				if (!rev.pending.nr) {
 					struct tree *tree;
-					tree = lookup_tree((const unsigned char*)EMPTY_TREE_SHA1_BIN);
+					tree = lookup_tree(EMPTY_TREE_SHA1_BIN);
 					add_pending_object(&rev, &tree->object, "HEAD");
 				}
 				break;
diff --git a/merge-recursive.c b/merge-recursive.c
index 6479a60..318d32e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1914,7 +1914,7 @@ int merge_recursive(struct merge_options *o,
 		/* if there is no common ancestor, use an empty tree */
 		struct tree *tree;
 
-		tree = lookup_tree((const unsigned char *)EMPTY_TREE_SHA1_BIN);
+		tree = lookup_tree(EMPTY_TREE_SHA1_BIN);
 		merged_common_ancestors = make_virtual_commit(tree, "ancestor");
 	}
 
diff --git a/sequencer.c b/sequencer.c
index a37846a..4307364 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -164,7 +164,7 @@ static void write_message(struct strbuf *msgbuf, const char *filename)
 
 static struct tree *empty_tree(void)
 {
-	return lookup_tree((const unsigned char *)EMPTY_TREE_SHA1_BIN);
+	return lookup_tree(EMPTY_TREE_SHA1_BIN);
 }
 
 static int error_dirty_index(struct replay_opts *opts)
-- 
1.7.10.rc0.9.gdcbe9

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/3] make is_empty_blob_sha1 available everywhere
  2012-03-22 18:52               ` Jeff King
  2012-03-22 18:53                 ` [PATCH 1/3] drop casts from users EMPTY_TREE_SHA1_BIN Jeff King
@ 2012-03-22 18:53                 ` Jeff King
  2012-03-22 18:53                 ` [PATCH 3/3] merge-recursive: don't detect renames from empty files Jeff King
  2 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 18:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

The read-cache implementation defines this static function,
but it is a generally useful concept in git. Let's give
the empty blob the same treatment as the empty tree,
providing both hex and binary forms of the sha1.

Signed-off-by: Jeff King <peff@peff.net>
---
 cache.h      |   13 +++++++++++++
 read-cache.c |   10 ----------
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index e5e1aa4..5280574 100644
--- a/cache.h
+++ b/cache.h
@@ -708,6 +708,19 @@ static inline void hashclr(unsigned char *hash)
 #define EMPTY_TREE_SHA1_BIN \
 	 ((const unsigned char *) EMPTY_TREE_SHA1_BIN_LITERAL)
 
+#define EMPTY_BLOB_SHA1_HEX \
+	"e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
+#define EMPTY_BLOB_SHA1_BIN_LITERAL \
+	"\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b" \
+	"\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91"
+#define EMPTY_BLOB_SHA1_BIN \
+	((const unsigned char *) EMPTY_BLOB_SHA1_BIN_LITERAL)
+
+static inline int is_empty_blob_sha1(const unsigned char *sha1)
+{
+	return !hashcmp(sha1, EMPTY_BLOB_SHA1_BIN);
+}
+
 int git_mkstemp(char *path, size_t n, const char *template);
 
 int git_mkstemps(char *path, size_t n, const char *template, int suffix_len);
diff --git a/read-cache.c b/read-cache.c
index 274e54b..6c8f395 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -157,16 +157,6 @@ static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st)
 	return 0;
 }
 
-static int is_empty_blob_sha1(const unsigned char *sha1)
-{
-	static const unsigned char empty_blob_sha1[20] = {
-		0xe6,0x9d,0xe2,0x9b,0xb2,0xd1,0xd6,0x43,0x4b,0x8b,
-		0x29,0xae,0x77,0x5a,0xd8,0xc2,0xe4,0x8c,0x53,0x91
-	};
-
-	return !hashcmp(sha1, empty_blob_sha1);
-}
-
 static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 {
 	unsigned int changed = 0;
-- 
1.7.10.rc0.9.gdcbe9

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/3] merge-recursive: don't detect renames from empty files
  2012-03-22 18:52               ` Jeff King
  2012-03-22 18:53                 ` [PATCH 1/3] drop casts from users EMPTY_TREE_SHA1_BIN Jeff King
  2012-03-22 18:53                 ` [PATCH 2/3] make is_empty_blob_sha1 available everywhere Jeff King
@ 2012-03-22 18:53                 ` Jeff King
  2012-03-22 19:18                   ` Jonathan Nieder
  2 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2012-03-22 18:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

Merge-recursive detects renames so that if one side modifies
"foo" and the other side moves it to "bar", the modification
is applied to "bar". However, our rename detection is based
on content analysis, it can be wrong (i.e., two files were
not intended as a rename, but just happen to have the same
or similar content).

This is quite rare if the files actually contain content,
since two unrelated files are unlikely to have exactly the
same content.  However, empty files present a problem, in
that there is nothing to analyze. An uninteresting
placeholder file with zero bytes may or may not be related
to a placeholder file with another name.

The result is that adding content to an empty file may cause
confusion if the other side of a merge removed it; your
content may end up in another random placeholder file that
was added.

Let's err on the side of caution and not consider empty
files as renames. This will cause a modify/delete conflict
on the merge, which will let the user sort it out
themselves.

We could do the same thing for general diff rename
detection. However, the stakes are much less high there, as
we are explicitly reporting the rename to the user. It's
only the automatic nature of merge-recursive that makes the
result confusing. So there's not as much need for caution
when just showing a diff.

Signed-off-by: Jeff King <peff@peff.net>
---
 merge-recursive.c       |    2 +-
 t/t6022-merge-rename.sh |   16 ++++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 318d32e..0255f50 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -502,7 +502,7 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct string_list_item *item;
 		struct rename *re;
 		struct diff_filepair *pair = diff_queued_diff.queue[i];
-		if (pair->status != 'R') {
+		if (pair->status != 'R' || is_empty_blob_sha1(pair->one->sha1)) {
 			diff_free_filepair(pair);
 			continue;
 		}
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index 9d8584e..1104249 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -884,4 +884,20 @@ test_expect_success 'no spurious "refusing to lose untracked" message' '
 	! grep "refusing to lose untracked file" errors.txt
 '
 
+test_expect_success 'do not follow renames for empty files' '
+	git checkout -f -b empty-base &&
+	>empty1 &&
+	git add empty1 &&
+	git commit -m base &&
+	echo content >empty1 &&
+	git add empty1 &&
+	git commit -m fill &&
+	git checkout -b empty-topic HEAD^ &&
+	git mv empty1 empty2 &&
+	git commit -m rename &&
+	test_must_fail git merge empty-base &&
+	>expect &&
+	test_cmp expect empty2
+'
+
 test_done
-- 
1.7.10.rc0.9.gdcbe9

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 18:52               ` Strange effect merging empty file Junio C Hamano
@ 2012-03-22 19:03                 ` Jeff King
  2012-03-22 19:12                   ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2012-03-22 19:03 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 11:52:54AM -0700, Junio C Hamano wrote:

> I still wonder why checking only the preimage side is sufficient, though.
> Shouldn't we check both sides?
> [...]
> > +		if (pair->status != 'R' || is_empty_blob_sha1(pair->one->sha1)) {

If the source is an empty file, then the other side must be an empty
file, too, no? Otherwise it would not be a rename. It could not be
exact, for obvious reasons. And it could not be inexact, because there
is no content to analyze on the source side.

Ditto for the destination side. How could something be renamed to an
empty file, but not be empty on the source side?

That is a slight layering violation, in that we are making assumptions
about how the diffcore-rename subsystem works. In theory diffcore-rename
could learn to read rename annotations from the commit message or
something (bleh!). But then, we'd probably want to update
merge-recursive in that instance to accept those "yes, it's definitely a
rename" markers, even for an empty file.

I think it would be slightly cleaner to tell diffcore-rename "be
conservative, and don't use empty files as sources" via an option. It's
not as trivial as this one-liner, but it shouldn't be much more than the
small patch you posted in the earlier thread.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Strange effect merging empty file
  2012-03-22 19:03                 ` Jeff King
@ 2012-03-22 19:12                   ` Junio C Hamano
  2012-03-22 22:46                     ` [PATCH 0/2] merging renames of empty files Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2012-03-22 19:12 UTC (permalink / raw)
  To: Jeff King
  Cc: Zbigniew Jędrzejewski-Szmek, Randal L. Schwartz, Ralf Nyren, git

Jeff King <peff@peff.net> writes:

> That is a slight layering violation, in that we are making assumptions
> about how the diffcore-rename subsystem works.

I do not think I have to say any more than that.  The special case we want
to have is for the "empty to empty" case and nothing else, and I do not
want to see people having to remember to look at the merge-recursive code
if/when rename detection starts to treat "empty to small" as "rename with
minor modification."

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] merge-recursive: don't detect renames from empty files
  2012-03-22 18:53                 ` [PATCH 3/3] merge-recursive: don't detect renames from empty files Jeff King
@ 2012-03-22 19:18                   ` Jonathan Nieder
  2012-03-22 21:53                     ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Jonathan Nieder @ 2012-03-22 19:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

Jeff King wrote:

> We could do the same thing for general diff rename
> detection. However, the stakes are much less high there, as
> we are explicitly reporting the rename to the user. It's
> only the automatic nature of merge-recursive that makes the
> result confusing. So there's not as much need for caution
> when just showing a diff.

The stakes may be different, but doesn't the same justification apply
anyway?  If "git diff -M" chooses a random pairing to describe a
renaming of multiple empty files, that seems just as confusing as
merge-recursive making the same mistake.

If adding this check in diffcore is more complicated, doing it in
merge-recursive for now seems fine and prudent, but if we are doing it
at the merge-recursive level just to be conservative then that seems
like the wrong layer.

Thanks for a clean and pleasant patch.
Jonathan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] merge-recursive: don't detect renames from empty files
  2012-03-22 19:18                   ` Jonathan Nieder
@ 2012-03-22 21:53                     ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 21:53 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Junio C Hamano, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 02:18:51PM -0500, Jonathan Nieder wrote:

> > We could do the same thing for general diff rename
> > detection. However, the stakes are much less high there, as
> > we are explicitly reporting the rename to the user. It's
> > only the automatic nature of merge-recursive that makes the
> > result confusing. So there's not as much need for caution
> > when just showing a diff.
> 
> The stakes may be different, but doesn't the same justification apply
> anyway?  If "git diff -M" chooses a random pairing to describe a
> renaming of multiple empty files, that seems just as confusing as
> merge-recursive making the same mistake.

Maybe "stakes" was not the best word. My thinking was something like the
following. Matching empty files is a heuristic. So sometimes it will be
right, and sometimes it will be wrong. We want to make sure that the
confusion caused by being wrong is less than the goodness caused by
being right.  When we find renames for a merge, the badness in being
wrong is quite high. And the lack of goodness in failing to be right is
not all that high; you'll get a conflict which will bring the issue to
the user's attention, and they can fall back to using "git diff" to
investigate the situation.

Whereas with a regular diff, the badness of being wrong is not very
high. The user sees the diff and says "Really? Stupid git, that wasn't a
rename". And the lack of goodness in failing to be right is somewhat
worse, because there is no way to fall back and ask git "what renames
would you have found if you relaxed the heuristics a bit more?"

Of course, one could make that fallback an option. And given that we
are, by definition, talking about trivial empty files, it's not like the
rename detection somehow makes the diff a whole lot nicer. It just says
"rename X to Y" instead of "deleted Y, added X". The real value in the
rename detection is seeing the interdiff between X and Y, but it would
always be empty in this case anyway.

So I could go either way.

> If adding this check in diffcore is more complicated, doing it in
> merge-recursive for now seems fine and prudent, but if we are doing it
> at the merge-recursive level just to be conservative then that seems
> like the wrong layer.

It's not really more complicated, and based on Junio's response, I think
we want to do it there anyway. Doing it unconditionally for diff and
merge actually would make the code even simpler, then.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 0/2] merging renames of empty files
  2012-03-22 19:12                   ` Junio C Hamano
@ 2012-03-22 22:46                     ` Jeff King
  2012-03-22 22:52                       ` [PATCH 1/2] teach diffcore-rename to optionally ignore empty content Jeff King
                                         ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 22:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 12:12:06PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > That is a slight layering violation, in that we are making assumptions
> > about how the diffcore-rename subsystem works.
> 
> I do not think I have to say any more than that.  The special case we want
> to have is for the "empty to empty" case and nothing else, and I do not
> want to see people having to remember to look at the merge-recursive code
> if/when rename detection starts to treat "empty to small" as "rename with
> minor modification."

Here's a 2-patch series to replace the old 3/3 (they go on top of the
first two cleanups from the previous iteration).

  [1/2]: teach diffcore-rename to optionally ignore empty content
  [2/2]: merge-recursive: don't detect renames of empty files

Thinking on this more, it is actually a more generic problem than just
empty files. It is really a problem of having generic placeholder files
with the same content. So a fully general solution would be something
like a gitattribute for "don't do renames on this". However, in
practice, these placeholder files are empty (since any non-empty file is
likely to actually have different content). So I think just dropping the
empty files as rename candidates is a pretty good heuristic, and it's
nice and simple.

After responding to Jonathan, I'm on the fence about whether diff should
follow the same heuristic. I left the diff behavior unchanged, but a 3/2
that turns it off by default would be a trivial one-liner.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] teach diffcore-rename to optionally ignore empty content
  2012-03-22 22:46                     ` [PATCH 0/2] merging renames of empty files Jeff King
@ 2012-03-22 22:52                       ` Jeff King
  2012-03-22 22:52                       ` [PATCH 2/2] merge-recursive: don't detect renames of empty files Jeff King
  2012-03-22 23:37                       ` [PATCH 0/2] merging " Junio C Hamano
  2 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 22:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

Our rename detection is a heuristic, matching pairs of
removed and added files with similar or identical content.
It's unlikely to be wrong when there is actual content to
compare, and we already take care not to do inexact rename
detection when there is not enough content to produce good
results.

However, we always do exact rename detection, even when the
blob is tiny or empty. It's easy to get false positives with
an empty blob, simply because it is an obvious content to
use as a boilerplate (e.g., when telling git that an empty
directory is worth tracking via an empty .gitignore).

This patch lets callers specify whether or not they are
interested in using empty files as rename sources and
destinations. The default is "yes", keeping the original
behavior. It works by detecting the empty-blob sha1 for
rename sources and destinations.

One more flexible alternative would be to allow the caller
to specify a minimum size for a blob to be "interesting" for
rename detection. But that would catch small boilerplate
files, not large ones (e.g., if you had the GPL COPYING file
in many directories).

A better alternative would be to allow a "-rename"
gitattribute to allow boilerplate files to be marked as
such. I'll leave the complexity of that solution until such
time as somebody actually wants it. The complaints we've
seen so far revolve around empty files, so let's start with
the simple thing.

Signed-off-by: Jeff King <peff@peff.net>
---
From the previous discussion, we know we could get away with just
dropping empty files from the rename_src list. However, doing it for
both the src and dst lists is a little more obvious and robust. And
since some of the rename detection is O(src*dst), keeping the lists
as small as possible is a good thing.

I added command-line triggers mostly for testing and debugging, and
didn't bother to advertise them in the documentation. Obviously if we
decide that diff should just have this behavior, this patch can be
even smaller.

 diff.c            |    5 +++++
 diff.h            |    2 +-
 diffcore-rename.c |    6 ++++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 377ec1e..0b70aad 100644
--- a/diff.c
+++ b/diff.c
@@ -3136,6 +3136,7 @@ void diff_setup(struct diff_options *options)
 	options->rename_limit = -1;
 	options->dirstat_permille = diff_dirstat_permille_default;
 	options->context = 3;
+	DIFF_OPT_SET(options, RENAME_EMPTY);
 
 	options->change = diff_change;
 	options->add_remove = diff_addremove;
@@ -3506,6 +3507,10 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
 	}
 	else if (!strcmp(arg, "--no-renames"))
 		options->detect_rename = 0;
+	else if (!strcmp(arg, "--rename-empty"))
+		DIFF_OPT_SET(options, RENAME_EMPTY);
+	else if (!strcmp(arg, "--no-rename-empty"))
+		DIFF_OPT_CLR(options, RENAME_EMPTY);
 	else if (!strcmp(arg, "--relative"))
 		DIFF_OPT_SET(options, RELATIVE_NAME);
 	else if (!prefixcmp(arg, "--relative=")) {
diff --git a/diff.h b/diff.h
index cb68743..dd48eca 100644
--- a/diff.h
+++ b/diff.h
@@ -60,7 +60,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_SILENT_ON_REMOVE    (1 <<  5)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES      (1 <<  7)
-/* (1 <<  8) unused */
+#define DIFF_OPT_RENAME_EMPTY        (1 <<  8)
 /* (1 <<  9) unused */
 #define DIFF_OPT_HAS_CHANGES         (1 << 10)
 #define DIFF_OPT_QUICK               (1 << 11)
diff --git a/diffcore-rename.c b/diffcore-rename.c
index f639601..216a7a4 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -512,9 +512,15 @@ void diffcore_rename(struct diff_options *options)
 			else if (options->single_follow &&
 				 strcmp(options->single_follow, p->two->path))
 				continue; /* not interested */
+			else if (!DIFF_OPT_TST(options, RENAME_EMPTY) &&
+				 is_empty_blob_sha1(p->two->sha1))
+				continue;
 			else
 				locate_rename_dst(p->two, 1);
 		}
+		else if (!DIFF_OPT_TST(options, RENAME_EMPTY) &&
+			 is_empty_blob_sha1(p->one->sha1))
+			continue;
 		else if (!DIFF_PAIR_UNMERGED(p) && !DIFF_FILE_VALID(p->two)) {
 			/*
 			 * If the source is a broken "delete", and
-- 
1.7.10.rc0.9.gdcbe9

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] merge-recursive: don't detect renames of empty files
  2012-03-22 22:46                     ` [PATCH 0/2] merging renames of empty files Jeff King
  2012-03-22 22:52                       ` [PATCH 1/2] teach diffcore-rename to optionally ignore empty content Jeff King
@ 2012-03-22 22:52                       ` Jeff King
  2012-03-22 23:37                       ` [PATCH 0/2] merging " Junio C Hamano
  2 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2012-03-22 22:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

Merge-recursive detects renames so that if one side modifies
"foo" and the other side moves it to "bar", the modification
is applied to "bar". However, our rename detection is based
on content analysis, it can be wrong (i.e., two files were
not intended as a rename, but just happen to have the same
or similar content).

This is quite rare if the files actually contain content,
since two unrelated files are unlikely to have exactly the
same content.  However, empty files present a problem, in
that there is nothing to analyze. An uninteresting
placeholder file with zero bytes may or may not be related
to a placeholder file with another name.

The result is that adding content to an empty file may cause
confusion if the other side of a merge removed it; your
content may end up in another random placeholder file that
was added.

Let's err on the side of caution and not consider empty
files as renames. This will cause a modify/delete conflict
on the merge, which will let the user sort it out
themselves.

Signed-off-by: Jeff King <peff@peff.net>
---
 merge-recursive.c       |    1 +
 t/t6022-merge-rename.sh |   16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 318d32e..0fb1743 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -485,6 +485,7 @@ static struct string_list *get_renames(struct merge_options *o,
 	renames = xcalloc(1, sizeof(struct string_list));
 	diff_setup(&opts);
 	DIFF_OPT_SET(&opts, RECURSIVE);
+	DIFF_OPT_CLR(&opts, RENAME_EMPTY);
 	opts.detect_rename = DIFF_DETECT_RENAME;
 	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
 			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index 9d8584e..1104249 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -884,4 +884,20 @@ test_expect_success 'no spurious "refusing to lose untracked" message' '
 	! grep "refusing to lose untracked file" errors.txt
 '
 
+test_expect_success 'do not follow renames for empty files' '
+	git checkout -f -b empty-base &&
+	>empty1 &&
+	git add empty1 &&
+	git commit -m base &&
+	echo content >empty1 &&
+	git add empty1 &&
+	git commit -m fill &&
+	git checkout -b empty-topic HEAD^ &&
+	git mv empty1 empty2 &&
+	git commit -m rename &&
+	test_must_fail git merge empty-base &&
+	>expect &&
+	test_cmp expect empty2
+'
+
 test_done
-- 
1.7.10.rc0.9.gdcbe9

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] merging renames of empty files
  2012-03-22 22:46                     ` [PATCH 0/2] merging renames of empty files Jeff King
  2012-03-22 22:52                       ` [PATCH 1/2] teach diffcore-rename to optionally ignore empty content Jeff King
  2012-03-22 22:52                       ` [PATCH 2/2] merge-recursive: don't detect renames of empty files Jeff King
@ 2012-03-22 23:37                       ` Junio C Hamano
  2012-03-23  0:23                         ` Jeff King
  2 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2012-03-22 23:37 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Nieder, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

Jeff King <peff@peff.net> writes:

> Here's a 2-patch series to replace the old 3/3 (they go on top of the
> first two cleanups from the previous iteration).

Hrm. As this is probably useful for older maintenance track, I would have
preferred not to see the first one that touches sequencer.c that did not
exist in the 1.7.7.x maintenance track, as the change is purely "cosmetic"
and does not have anything to do with fixing the over-agressive merge.

>   [1/2]: teach diffcore-rename to optionally ignore empty content
>   [2/2]: merge-recursive: don't detect renames of empty files
>
> Thinking on this more, it is actually a more generic problem than just
> empty files. It is really a problem of having generic placeholder files
> with the same content. So a fully general solution would be something
> like a gitattribute for "don't do renames on this". However, in
> practice, these placeholder files are empty (since any non-empty file is
> likely to actually have different content). So I think just dropping the
> empty files as rename candidates is a pretty good heuristic, and it's
> nice and simple.

I thought that our recommendation for keeping an otherwise empty
directories around was to have .gitignore file with two entries in it,
namely:

	.*
        *

So these files will be everywhere and without being empty, no?

But I tend to prefer the simplicity of limiting this to empty files
anyway.

> After responding to Jonathan, I'm on the fence about whether diff should
> follow the same heuristic. I left the diff behavior unchanged, but a 3/2
> that turns it off by default would be a trivial one-liner.

I am also torn, but somewhat in favor of avoiding random permutations of
empty files.

I can think of only one situation somebody _could_ argue that detecting
empty-to-empty rename is halfway sensible: when there is only one empty
file that was removed, and one new empty file that was added.  Even in
such a case, the user could have done "rm this; >that", and depending on
the nature of these two files, "mv this that" may not have made _any_
sense even if they are both empty, and that is why I said "halfway"
sensible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] merging renames of empty files
  2012-03-22 23:37                       ` [PATCH 0/2] merging " Junio C Hamano
@ 2012-03-23  0:23                         ` Jeff King
  2012-03-23  4:56                           ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2012-03-23  0:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

On Thu, Mar 22, 2012 at 04:37:05PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Here's a 2-patch series to replace the old 3/3 (they go on top of the
> > first two cleanups from the previous iteration).
> 
> Hrm. As this is probably useful for older maintenance track, I would have
> preferred not to see the first one that touches sequencer.c that did not
> exist in the 1.7.7.x maintenance track, as the change is purely "cosmetic"
> and does not have anything to do with fixing the over-agressive merge.

I considered that, but assumed this was not a maint fix, but rather a
new "feature" to improve the heuristics. That being said, the first
patch is entirely unrelated.  The others don't depend on it textually or
semantically, and it does not need to be part of the series (I just
happened to notice it while in the area).

> I thought that our recommendation for keeping an otherwise empty
> directories around was to have .gitignore file with two entries in it,
> namely:
> 
> 	.*
>         *
> 
> So these files will be everywhere and without being empty, no?

I dunno. I have never recommended that, but maybe it is something people
do.

> But I tend to prefer the simplicity of limiting this to empty files
> anyway.

Yeah. We might want to do a gitattributes thing on top, but I'd much
rather have this in the meantime, as it seems like it handles the bulk
of the complaints we have actually seen on the list.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/2] merging renames of empty files
  2012-03-23  0:23                         ` Jeff King
@ 2012-03-23  4:56                           ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2012-03-23  4:56 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Nieder, Zbigniew Jędrzejewski-Szmek,
	Randal L. Schwartz, Ralf Nyren, git

Jeff King <peff@peff.net> writes:

> On Thu, Mar 22, 2012 at 04:37:05PM -0700, Junio C Hamano wrote:
>
>> Hrm. As this is probably useful for older maintenance track, I would have
>> preferred not to see the first one that touches sequencer.c that did not
>> exist in the 1.7.7.x maintenance track, as the change is purely "cosmetic"
>> and does not have anything to do with fixing the over-agressive merge.
>
> I considered that, but assumed this was not a maint fix, but rather a
> new "feature" to improve the heuristics.

Ok. Perhaps I over-reacted to the issue, which sounded like a grave
mismerge that can go unnoticed.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2012-03-23  4:56 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-21 10:28 Strange effect merging empty file Ralf Nyren
2012-03-21 10:54 ` Zbigniew Jędrzejewski-Szmek
2012-03-21 17:14   ` Junio C Hamano
2012-03-22 12:17   ` Randal L. Schwartz
2012-03-22 12:39     ` Ralf Nyren
2012-03-22 12:47     ` Zbigniew Jędrzejewski-Szmek
2012-03-22 14:01       ` Jeff King
2012-03-22 17:03         ` Junio C Hamano
2012-03-22 17:59           ` Jeff King
2012-03-22 18:25             ` Jeff King
2012-03-22 18:52               ` Jeff King
2012-03-22 18:53                 ` [PATCH 1/3] drop casts from users EMPTY_TREE_SHA1_BIN Jeff King
2012-03-22 18:53                 ` [PATCH 2/3] make is_empty_blob_sha1 available everywhere Jeff King
2012-03-22 18:53                 ` [PATCH 3/3] merge-recursive: don't detect renames from empty files Jeff King
2012-03-22 19:18                   ` Jonathan Nieder
2012-03-22 21:53                     ` Jeff King
2012-03-22 18:52               ` Strange effect merging empty file Junio C Hamano
2012-03-22 19:03                 ` Jeff King
2012-03-22 19:12                   ` Junio C Hamano
2012-03-22 22:46                     ` [PATCH 0/2] merging renames of empty files Jeff King
2012-03-22 22:52                       ` [PATCH 1/2] teach diffcore-rename to optionally ignore empty content Jeff King
2012-03-22 22:52                       ` [PATCH 2/2] merge-recursive: don't detect renames of empty files Jeff King
2012-03-22 23:37                       ` [PATCH 0/2] merging " Junio C Hamano
2012-03-23  0:23                         ` Jeff King
2012-03-23  4:56                           ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.