git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fall back to three-way merge when applying a patch.
@ 2005-10-05  0:46 Junio C Hamano
  2005-10-05  4:56 ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2005-10-05  0:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

After git-apply fails, attempt to find a base tree that the patch
cleanly applies to, and do a three-way merge using that base tree into
the current index.

When the fall-back merge fails, the working tree can be resolved the
same way as you would normally hand resolve a conflicting merge.
When making commit, use .dotest/final-commit as the log message
template.  Or you could just choose to 'git-checkout-index -f -a'
to revert the failed merge.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * I will be placing this in the proposed updates branch.
   Hopefully this would alleviate the complaints from people who
   find the "no fuzz" policy of git-apply is too strict.

   The change is helped if the patch sender uses the updated
   git-format-patch that records which tree the patch is
   supposed to cleanly apply to, but that is not strictly
   necessary; it tries to find a tree that patch applies to from
   the recent commits itself.

 git-applypatch.sh |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 61 insertions(+), 1 deletions(-)

applies-to: 9eec9ff9326032bca405a12a265918725edf4ac7
fb533c6562481dfbc4719f740aa0d85e2a45700c
diff --git a/git-applypatch.sh b/git-applypatch.sh
--- a/git-applypatch.sh
+++ b/git-applypatch.sh
@@ -105,7 +105,67 @@ git-apply --index "$PATCHFILE" || {
 	# Here if we know which revision the patch applies to,
 	# we create a temporary working tree and index, apply the
 	# patch, and attempt 3-way merge with the resulting tree.
-	exit 1
+
+	O_OBJECT=`cd "$GIT_OBJECT_DIRECTORY" && pwd`
+	rm -fr .patch-merge-*
+
+	(
+		N=10
+
+		# if the patch records the base tree...
+		sed -ne '
+			/^diff /q
+			/^applies-to: \([0-9a-f]*\)$/{
+				s//\1/p
+				q
+			}
+		' "$PATCHFILE"
+
+		# or hoping the patch is against our recent commits...
+		git-rev-list --max-count=$N HEAD
+
+		# or hoping the patch is against known tags...
+		git-ls-remote --tags .
+	) |
+	while read base junk
+	do
+		# Try it if we have it as a tree.
+		git-cat-file tree "$base" >/dev/null 2>&1 || continue
+
+		rm -fr .patch-merge-tmp-* &&
+		mkdir .patch-merge-tmp-dir || break
+		(
+			cd .patch-merge-tmp-dir &&
+			GIT_INDEX_FILE=../.patch-merge-tmp-index &&
+			GIT_OBJECT_DIRECTORY="$O_OBJECT" &&
+			export GIT_INDEX_FILE GIT_OBJECT_DIRECTORY &&
+			git-read-tree "$base" &&
+			git-apply --index &&
+			mv ../.patch-merge-tmp-index ../.patch-merge-index &&
+			echo "$base" >../.patch-merge-base
+		) <"$PATCHFILE"  2>/dev/null && break
+	done
+
+	test -f .patch-merge-index &&
+	his_tree=$(GIT_INDEX_FILE=.patch-merge-index git-write-tree) &&
+	orig_tree=$(cat .patch-merge-base) &&
+	rm -fr .patch-merge-* || exit 1
+
+	echo Falling back to patching base and 3-way merge using $orig_tree...
+
+	# This is not so wrong.  Depending on which base we picked,
+	# orig_tree may be wildly different from ours, but his_tree
+	# has the same set of wildly different changes in parts the
+	# patch did not touch, so resolve ends up cancelling them,
+	# saying that we reverted all those changes.
+
+	if git-merge-resolve $orig_tree -- HEAD $his_tree
+	then
+		echo Done.
+	else
+		echo Failed to merge in the changes.
+		exit 1
+	fi
 }
 
 if test -x "$GIT_DIR"/hooks/pre-applypatch
---
0.99.8.GIT

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-05  0:46 [PATCH] Fall back to three-way merge when applying a patch Junio C Hamano
@ 2005-10-05  4:56 ` Linus Torvalds
  2005-10-05  6:58   ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2005-10-05  4:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 4 Oct 2005, Junio C Hamano wrote:
> 
>  * I will be placing this in the proposed updates branch.
>    Hopefully this would alleviate the complaints from people who
>    find the "no fuzz" policy of git-apply is too strict.

This should definitely be enabled by a switch only. 

For example, I use git-applypatch _only_ through git-applymbox. Ie 
non-interactively. I definitely do _not_ want it to try random other trees 
unless I tell it to, especially when applying a series of 175 patches in 
one go.

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-05  4:56 ` Linus Torvalds
@ 2005-10-05  6:58   ` Junio C Hamano
  2005-10-05 14:30     ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2005-10-05  6:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> For example, I use git-applypatch _only_ through git-applymbox. Ie 
> non-interactively. I definitely do _not_ want it to try random other trees 
> unless I tell it to, especially when applying a series of 175 patches in 
> one go.

My understanding of your workflow, from reading your earlier
posts to the git list, is:

 * Read bunch of mails -- mostly concentrating on the
   description in the log message and skimming the code -- while
   saving the worthy patches into a 'to-apply' mbox.

 * Review 'to-aplly' mbox again, perhaps excluding not-so-good
   ones.  When you need to fix log messages, that is done in
   this step.

 * Run git-applymbox on the surviving messages in a batch mode,
   processing 175 patches in one go.

What is your workflow after seeing that the 47th patch among 175
patches fails to apply?  I understand you run git-applymbox
without -q, so it immediately fails and exits there.  You would
have split mails, numbered from 0001 to 0175 in the .dotest
directory, with msg, patch and info files from the mail 0047
left behind.

I presume that, if the conflict/fuzz is manageably small, you
would hand edit the file 0047 to make the diff part to apply
cleanly, and then run 'git-applymbox -c .dotest/0047'.  I
would understand you would be able to resume and process the
rest 128 patches this way.

Once you mentioned that if the conflict is too great you would
apply the patch to an old version that the patch sender based
her work on, and merge using git (this was the direct
inspiration of the change to git-applypatch).

What I am wondering is how that step fits in your workflow.  Do
you do this by the following sequence?

 * Temporarily switch to a new throwaway branch for merge from
   the known old version with 'git checkout -b test v2.6.12';

 * 'git apply --index <.dotest/patch' to apply the patch, and
   perhaps 'git commit -F .dotest/msg';

 * Switch back to the master branch with 'git checkout master';

 * Merge them with 'git resolve master test <some message>',
   which may leave conflicts, in which case you would hand
   resolve, and say 'git commit -F .dotest/msg'.  Then 'git
   branch -d test' to get rid of the throwaway branch.

 * Resume the patch application with 'git-applymbox -c
   .dotest/0048'.

Well, I just noticed that it is not even immediately obvious
that the 47th of the 175 patches failed to apply from the
git-applymbox batch output (it shows how I do not process really
many patches myself X-<).  Is there anything I can do to improve
this?  For example, after git-applypatch exits with 0 exit
status we could remove the split mail file, like this:

------------
diff --git a/git-applymbox.sh b/git-applymbox.sh
--- a/git-applymbox.sh
+++ b/git-applymbox.sh
@@ -82,7 +82,11 @@ do
     do
 	git-applypatch .dotest/msg-clean .dotest/patch .dotest/info "$signoff"
 	case "$?" in
-	0 | 2 )
+	0)
+		# Remove the cleanly applied one to reduce clutter.
+		rm -f .dotest/$i
+		;;
+	2)
 		# 2 is a special exit code from applypatch to indicate that
 	    	# the patch wasn't applied, but continue anyway 
 		;;
------------

Or it may not matter at all if your "after failure" workflow is
very different from what I've been imagining.  This is the
primarily reason why I would like to understand your workflow
better.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-05  6:58   ` Junio C Hamano
@ 2005-10-05 14:30     ` Linus Torvalds
  2005-10-06  0:03       ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2005-10-05 14:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Tue, 4 Oct 2005, Junio C Hamano wrote:

> What is your workflow after seeing that the 47th patch among 175
> patches fails to apply?  I understand you run git-applymbox
> without -q, so it immediately fails and exits there.

Correct. 

What I usually do is to just ignore the split patches, and go back to the 
mbox. Delete the ones that applied, and look at the one that didn't. Then 
I try to apply that one manually - sometimes I actually do a quick manual 
branch to a known version, apply and merge - and I think I might try to 
use the "base wiggle" patch for that. Although I would actually prefer to 
be able to try against a specific named release (not "all named tags" 
first).

Quite often, the reason it doesn't apply is that I already have that patch 
through a git merge (or because I applied that patch earlier through 
another person), in which case I don't even try to merge it, I just delete 
that email too, and just do "git-applymbox" again.

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-05 14:30     ` Linus Torvalds
@ 2005-10-06  0:03       ` Junio C Hamano
  2005-10-06  1:59         ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2005-10-06  0:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> ... Although I would actually prefer to be able to try against
> a specific named release (not "all named tags" first).

That makes sense, given that you know better than the tool can
possibly guess, especially from the performance point of view.

What's interesting is that from the correctness point of view,
it should not make that much of a difference which base tree
happens to be picked -- if the base tree is wildly different
from your current HEAD, most of that wild difference will be
carried over intact to the result of patch application for paths
and parts of the files the patch does not touch.  Three-way
merge notices that and your "changes" from the base tree win.

> Quite often, the reason it doesn't apply is that I already have that patch 
> through a git merge (or because I applied that patch earlier through 
> another person), in which case I don't even try to merge it, I just delete 
> that email too, and just do "git-applymbox" again.

Thanks.  So in short, you do not even need '-c' option, but just
trim the mbox you feed to applymbox and re-run it.

While I was experimenting with the git-am (only in 'pu' branch,
and still marked as "do not use"), I noticed that it does the
right thing on an already applied patch.  It fails to apply,
finds an appropriate base and then notices the result of the
patch application is already contained in the HEAD ;-).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06  0:03       ` Junio C Hamano
@ 2005-10-06  1:59         ` Eric W. Biederman
  2005-10-06  2:18           ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2005-10-06  1:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

Junio C Hamano <junkio@cox.net> writes:

> Linus Torvalds <torvalds@osdl.org> writes:
>
>> ... Although I would actually prefer to be able to try against
>> a specific named release (not "all named tags" first).
>
> That makes sense, given that you know better than the tool can
> possibly guess, especially from the performance point of view.
>
> What's interesting is that from the correctness point of view,
> it should not make that much of a difference which base tree
> happens to be picked -- if the base tree is wildly different
> from your current HEAD, most of that wild difference will be
> carried over intact to the result of patch application for paths
> and parts of the files the patch does not touch.  Three-way
> merge notices that and your "changes" from the base tree win.

There is another workable strategy.  Modify git-diff-xxx to report
the sha1 of the tree, or the sha1's of the files the patch applies to.
And make that information available to git-apply.  I don't how often
it would help but it has the possibility of letting the tools know
exactly what the patch applied against.

A rational for why this would work is that is that a kernel repository
is large,  and unless objects/alternatives can be made to work across
servers there will be people who can use git for development but can't
host a repository that can be pulled from.   Especially for a small
patch.

Using sha1's for context before merging should be able to give you a
whole lot more machine verifiable context with just a little bit of work. 

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06  1:59         ` Eric W. Biederman
@ 2005-10-06  2:18           ` Linus Torvalds
  2005-10-06  4:17             ` Junio C Hamano
                               ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Linus Torvalds @ 2005-10-06  2:18 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Junio C Hamano, git



On Wed, 5 Oct 2005, Eric W. Biederman wrote:
>
> There is another workable strategy.  Modify git-diff-xxx to report
> the sha1 of the tree, or the sha1's of the files the patch applies to.

Not workable.

It fundamentally only works for the first patch in a series. After that, 
the rest will be based off a version that the recipient simply doesn't 
have (well, if it's the _tree_ SHA1, and the recipient has no other 
patches, then they'll match, but that case is uninteresting, since it's 
the trivial case that you always get right by just applying the things in 
the first place).

So you'd have to make it the base for the patch _series_, at a minimum.

And even that likely doesn't work very often. Any time you have a private 
merge or some other patch in your tree, the recipient wouldn't be able to 
parse it.

In practice, I've found that most often it's very trivially obvious _why_ 
a patch doesn't apply (I remember the "other patch" that happened to the 
same file, or I just do a "git-whatchanged -p filename"), and it can be 
useful to allow the person applying the patch to say "ok, try to apply it 
against version xyz, and do a three-way merge".

But it really tends to be fairly rare.

Now obviously, some of that may be kernel-specific - a large part of the 
lack of patch conflicts is that we over the years have actively tried to 
set up the kernel sources so that people seldom step on each other (one 
example is how we have all modules contain their own "init_module()" thing 
and sortign it out in the linker - because the old init/main.c approach 
was _very_ painful since everybody wanted to add lines to the same file).

So it may turn out that other projects might have different wishes for how 
something like this would work. But I doubt it works very well to rely on 
commit-level SHA1's - it's more likely to work if you try the "last few 
tagged releases etc", since patches that don't apply are often against the 
previous release (and if they don't apply there either, then it's probably 
not worth fighting over anyway).

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06  2:18           ` Linus Torvalds
@ 2005-10-06  4:17             ` Junio C Hamano
  2005-10-06  5:25             ` Eric W. Biederman
  2005-10-06  7:33             ` [PATCH] Fall back to three-way merge when applying a patch Junio C Hamano
  2 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2005-10-06  4:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Eric W. Biederman, git

Linus Torvalds <torvalds@osdl.org> writes:

> ... But I doubt it works very well to rely on 
> commit-level SHA1's - it's more likely to work if you try the "last few 
> tagged releases etc", since patches that don't apply are often against the 
> previous release (and if they don't apply there either, then it's probably 
> not worth fighting over anyway).

That is the heuristics git-applypatch in the proposed updates
branch tries to attempt.  Recorded tree if exists (which is not
very interesting), last few commits from the current branch, and
then tagged releases.

Ideally, if we had the path following git-rev-list you talked
about, we could first ask git-apply the set of paths the patch
touches, and instead of trying every commits from the HEAD, try
only commits that touch one (or more) of the given paths.

Another possible git-rev-list enhancement that might be useful
is to pop commits not based on time but based on the depth from
branch heads.  Then we could:

    git-rev-list --depth-order --max-count=$N --all \
    	$(git-apply --show-files $patch | sed -e 's/^[^ ]* [^ ]* //p')

to obtain list of the last few commits that touch the paths
involved, in the order that is closer-to-tip first.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06  2:18           ` Linus Torvalds
  2005-10-06  4:17             ` Junio C Hamano
@ 2005-10-06  5:25             ` Eric W. Biederman
  2005-10-06 14:35               ` Linus Torvalds
  2005-10-06  7:33             ` [PATCH] Fall back to three-way merge when applying a patch Junio C Hamano
  2 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2005-10-06  5:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> writes:

> On Wed, 5 Oct 2005, Eric W. Biederman wrote:
>>
>> There is another workable strategy.  Modify git-diff-xxx to report
>> the sha1 of the tree, or the sha1's of the files the patch applies to.
>
> Not workable.
>
> It fundamentally only works for the first patch in a series. After that, 
> the rest will be based off a version that the recipient simply doesn't 
> have (well, if it's the _tree_ SHA1, and the recipient has no other 
> patches, then they'll match, but that case is uninteresting, since it's 
> the trivial case that you always get right by just applying the things in 
> the first place).

But what about the file sha1?  The intuition is that you can
easily provide a lot more context in diffs and that the extra
context might be useful in merging patches.  The part that gives
me how is when I look at if from a diff/patch perspective and
not from a git perspective.

> So you'd have to make it the base for the patch _series_, at a minimum.
>
> And even that likely doesn't work very often. Any time you have a private 
> merge or some other patch in your tree, the recipient wouldn't be able to 
> parse it.

True although in many cases people rework their diffs before sending
them out to avoid situations like the above.

> In practice, I've found that most often it's very trivially obvious _why_ 
> a patch doesn't apply (I remember the "other patch" that happened to the 
> same file, or I just do a "git-whatchanged -p filename"), and it can be 
> useful to allow the person applying the patch to say "ok, try to apply it 
> against version xyz, and do a three-way merge".

Ah.  I had missed that git-whatchanged can be given a filename that is
nice.  One of the better kept secrets of git.  That makes my whole
question worthwhile :)

> But it really tends to be fairly rare.

I don't know how rare it is.  I have had a couple of patches in the last
week that I could successfully auto-merge between branches but the
diffs themselves would actually conflict.

> Now obviously, some of that may be kernel-specific - a large part of the 
> lack of patch conflicts is that we over the years have actively tried to 
> set up the kernel sources so that people seldom step on each other (one 
> example is how we have all modules contain their own "init_module()" thing 
> and sortign it out in the linker - because the old init/main.c approach 
> was _very_ painful since everybody wanted to add lines to the same file).
>
> So it may turn out that other projects might have different wishes for how 
> something like this would work. But I doubt it works very well to rely on 
> commit-level SHA1's - it's more likely to work if you try the "last few 
> tagged releases etc", since patches that don't apply are often against the 
> previous release (and if they don't apply there either, then it's probably 
> not worth fighting over anyway).

Agreed, commit level sha1 are not very interesting.

But if you happen to have at least the file level sha1 you can
actually know if the patch was against what you think it is against.

At one level you could make conflicts more common by dropping any patch
whose source file had the wrong sha1.  At another level you could
test for the sha1 of the source file in the repository and know quickly
if there is something the patch applies cleanly too.

Walking through the list of commits and finding enough information to
merge the patch might still be prohibitive, but at least it is something
that could be automated, and the test to see if it interesting is
cheap.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06  2:18           ` Linus Torvalds
  2005-10-06  4:17             ` Junio C Hamano
  2005-10-06  5:25             ` Eric W. Biederman
@ 2005-10-06  7:33             ` Junio C Hamano
  2 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2005-10-06  7:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Eric W. Biederman, git

Linus Torvalds <torvalds@osdl.org> writes:

> ... But I doubt it works very well to rely on 
> commit-level SHA1's - it's more likely to work if you try the "last few 
> tagged releases etc", since patches that don't apply are often against the 
> previous release (and if they don't apply there either, then it's probably 
> not worth fighting over anyway).

Yes.  That is the current heuristics git-applypatch in the
proposed updates branch tries to attempt.  Recorded tree if
exists (which is not very interesting), last few commits, and
then tagged releases.

Ideally, if we had the path following git-rev-list you talked
about, we could first ask git-apply the set of paths the patch
touches, and instead of trying every commits from the HEAD, try
only commits that touch one (or more) of the given paths.

Another possible git-rev-list enhancement that might be useful
is to pop commits not based on time but based on the depth from
branch heads.  Then we could:

    git-rev-list --depth-order --max-count=$N --all \
    	$(git-apply --show-files $patch | sed -e 's/^[^ ]* [^ ]* //p')

to obtain list of last few commits that touch the paths
involved, in the order that is closer-to-tip first.  This would
give us set of more useful commits than the current behaviour.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06  5:25             ` Eric W. Biederman
@ 2005-10-06 14:35               ` Linus Torvalds
  2005-10-06 14:52                 ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2005-10-06 14:35 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Junio C Hamano, git



On Wed, 5 Oct 2005, Eric W. Biederman wrote:
>
> Ah.  I had missed that git-whatchanged can be given a filename that is
> nice.  One of the better kept secrets of git.  That makes my whole
> question worthwhile :)

That's definitely not a secret - it was part of the whole point of 
git-whatchanged. It's the native git version of "annotate", and I 
personally find it much more useful.

It's not even just a filename. You can do

	git-whatchanged -p drivers/scsi/ include/scsi

to limit the set to those two subdirectories. IOW, you can give 
git-whatchanged an arbitrary list of individual pathnames or directory 
names.

> But if you happen to have at least the file level sha1 you can
> actually know if the patch was against what you think it is against.

Yes, a file-level SHA1 may be useful. On the other hand, I suspect that by 
that time (since you have to search for the version anyway) you might as 
well have the "just try to apply the patch" approach. It's basically the 
same search space.

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06 14:35               ` Linus Torvalds
@ 2005-10-06 14:52                 ` Eric W. Biederman
  2005-10-06 14:59                   ` Linus Torvalds
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2005-10-06 14:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> writes:

> On Wed, 5 Oct 2005, Eric W. Biederman wrote:
>>
>> Ah.  I had missed that git-whatchanged can be given a filename that is
>> nice.  One of the better kept secrets of git.  That makes my whole
>> question worthwhile :)
>
> That's definitely not a secret - it was part of the whole point of 
> git-whatchanged. It's the native git version of "annotate", and I 
> personally find it much more useful.
>
> It's not even just a filename. You can do
>
> 	git-whatchanged -p drivers/scsi/ include/scsi
>
> to limit the set to those two subdirectories. IOW, you can give 
> git-whatchanged an arbitrary list of individual pathnames or directory 
> names.

Which probably means it's time for me to generate a patch to the
git-whatchanged documentation.

>> But if you happen to have at least the file level sha1 you can
>> actually know if the patch was against what you think it is against.
>
> Yes, a file-level SHA1 may be useful. On the other hand, I suspect that by 
> that time (since you have to search for the version anyway) you might as 
> well have the "just try to apply the patch" approach. It's basically the 
> same search space.

Maybe.

After thinking about it I don't think you need to look through the
history to use it for a merge3 operation.   As I recall merge3 only
looks at the base and the two derived versions of the file.  If we
have the sha1 of the original in the git repository I think all we
need to compute is the diff between that sha1 the current version
file.  And then apply the merge3 algorithm to combine the two sets of
changes.


Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06 14:52                 ` Eric W. Biederman
@ 2005-10-06 14:59                   ` Linus Torvalds
  2005-10-06 17:07                     ` Eric W. Biederman
  2005-10-07  2:33                     ` [PATCH] Show original and resulting blob object info in diff output Junio C Hamano
  0 siblings, 2 replies; 17+ messages in thread
From: Linus Torvalds @ 2005-10-06 14:59 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Junio C Hamano, git



On Thu, 6 Oct 2005, Eric W. Biederman wrote:
> 
> After thinking about it I don't think you need to look through the
> history to use it for a merge3 operation.   As I recall merge3 only
> looks at the base and the two derived versions of the file.  If we
> have the sha1 of the original in the git repository I think all we
> need to compute is the diff between that sha1 the current version
> file.  And then apply the merge3 algorithm to combine the two sets of
> changes.

Ahh, that I can definitely agree with. In fact, it makes perfect sense.

However, it assumes that everybody is a git user, which isn't actually 
true. Also, I'm wondering whether the advantages outweigh the 
disadvantages: it would make the diff uglier. We'd have to add that SHA1 
there somewhere (either on the "diff" line itself, or as anothe rextended 
git line like the "rename from/to" lines - a "original <sha1>" line).

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fall back to three-way merge when applying a patch.
  2005-10-06 14:59                   ` Linus Torvalds
@ 2005-10-06 17:07                     ` Eric W. Biederman
  2005-10-07  2:33                     ` [PATCH] Show original and resulting blob object info in diff output Junio C Hamano
  1 sibling, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2005-10-06 17:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 6 Oct 2005, Eric W. Biederman wrote:
>> 
>> After thinking about it I don't think you need to look through the
>> history to use it for a merge3 operation.   As I recall merge3 only
>> looks at the base and the two derived versions of the file.  If we
>> have the sha1 of the original in the git repository I think all we
>> need to compute is the diff between that sha1 the current version
>> file.  And then apply the merge3 algorithm to combine the two sets of
>> changes.
>
> Ahh, that I can definitely agree with. In fact, it makes perfect sense.
>
> However, it assumes that everybody is a git user, which isn't actually 
> true. 

Agreed.  The question is the subset of everyone large enough to
make it a useful technique.  Given that a standalone diff utility can
be taught how to generate the extra information, and a standalone
patch could use it to verify you are at least patching the version
of the file the patch was intended for I suspect the subset of
everyone is large enough to be interesting.

> Also, I'm wondering whether the advantages outweigh the 
> disadvantages: it would make the diff uglier. We'd have to add that SHA1 
> there somewhere (either on the "diff" line itself, or as anothe rextended 
> git line like the "rename from/to" lines - a "original <sha1>" line).

I don't think an extra line in the header is going to be much of a problem.
Just more header noise.  

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] Show original and resulting blob object info in diff output.
  2005-10-06 14:59                   ` Linus Torvalds
  2005-10-06 17:07                     ` Eric W. Biederman
@ 2005-10-07  2:33                     ` Junio C Hamano
  2005-10-07  4:47                       ` Linus Torvalds
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2005-10-07  2:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Eric W. Biederman, git

This adds more cruft to diff --git header to record the blob SHA1 and
the mode the patch/diff is intended to be applied against, to help the
receiving end fall back on a three-way merge.  The new header looks
like this:

    diff --git a/apply.c b/apply.c
    old index 7be50413538868412a87c847f8fa184cadd0fa2a 100644
    new index 83660822fcbb9edac523634999e30d65c2790cae 100644
    --- a/apply.c
    +++ b/apply.c
    @@ -14,6 +14,7 @@
     //    files that are being modified, but doesn't apply the patch
     //  --stat does just a diffstat, and doesn't actually apply
    +//  --show-index-info shows the old and new index info for...
    ...

There is a counterpart option --show-index-info to git-apply command
to summarize this:

    - 7be5041... 100644	apply.c
    + 8366082... 100644	apply.c
    - ec2a161... 100644	cache.h
    + 514adb8... 100644	cache.h
    - ...

Upon receiving such a patch, if the patch did not apply cleanly to the
target tree, the recipient can try to find the matching old objects in
her object database and create a temporary tree, apply the patch to
that temporary tree, and attempt a 3-way merge between the patched
temporary tree and the target tree using the original temporary tree
as the common ancestor.

The patch lifts the code to compute the hash for an on-filesystem
object from update-index.c and makes it available to the diff output
routine.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

    Linus Torvalds <torvalds@osdl.org> writes:

    > On Thu, 6 Oct 2005, Eric W. Biederman wrote:
    >> 
    >> After thinking about it I don't think you need to look through the
    >> history to use it for a merge3 operation.   As I recall merge3 only
    >> looks at the base and the two derived versions of the file.  If we
    >> have the sha1 of the original in the git repository I think all we
    >> need to compute is the diff between that sha1 the current version
    >> file.  And then apply the merge3 algorithm to combine the two sets of
    >> changes.
    >
    > Ahh, that I can definitely agree with. In fact, it makes perfect sense.

    And this is a sample implementation to do so.  Will not
    graduate "pu" until the receiving end gets ready and unless
    people agree that the additional cruft is not too
    distracting.

 apply.c                        |   64 +++++++++++++++++++++++++++++++++-
 cache.h                        |    1 +
 diff.c                         |   76 +++++++++++++++++++++++++++++-----------
 sha1_file.c                    |   39 +++++++++++++++++++++
 t/diff-lib.sh                  |   10 ++++-
 t/t4000-diff-format.sh         |    6 +++
 t/t4001-diff-rename.sh         |    3 +-
 t/t4004-diff-rename-symlink.sh |    3 +-
 update-index.c                 |   33 ++---------------
 9 files changed, 179 insertions(+), 56 deletions(-)

applies-to: 8cf0bb652ffde36e57bca5d6f33287305a61bb74
2c48a811a36ee6e12aeb044b3d1ff459b8d958c2
diff --git a/apply.c b/apply.c
old index 7be50413538868412a87c847f8fa184cadd0fa2a 100644
new index ed2eac2c67dd003c8a0eaa7325e1543d1472777d 100644
--- a/apply.c
+++ b/apply.c
@@ -14,6 +14,7 @@
 //    files that are being modified, but doesn't apply the patch
 //  --stat does just a diffstat, and doesn't actually apply
 //  --show-files shows the directory changes
+//  --show-index-info shows the old and new index info for paths if available.
 //
 static int check_index = 0;
 static int write_index = 0;
@@ -22,8 +23,9 @@ static int summary = 0;
 static int check = 0;
 static int apply = 1;
 static int show_files = 0;
+static int show_index_info = 0;
 static const char apply_usage[] =
-"git-apply [--stat] [--summary] [--check] [--index] [--apply] [--show-files] <patch>...";
+"git-apply [--stat] [--summary] [--check] [--index] [--apply] [--show-files] [--show-index-info] <patch>...";
 
 /*
  * For "diff-stat" like behaviour, we keep track of the biggest change
@@ -56,6 +58,8 @@ struct patch {
 	struct fragment *fragments;
 	char *result;
 	unsigned long resultsize;
+	unsigned char old_sha1[20];
+	unsigned char new_sha1[20];
 	struct patch *next;
 };
 
@@ -334,6 +338,24 @@ static int gitdiff_dissimilarity(const c
 	return 0;
 }
 
+static int gitdiff_old_index(const char *line, struct patch *patch)
+{
+	if (get_sha1_hex(line, patch->old_sha1))
+		memcpy(patch->old_sha1, null_sha1, 20);
+	else
+		gitdiff_oldmode(line + 41, patch);
+	return 0;
+}
+
+static int gitdiff_new_index(const char *line, struct patch *patch)
+{
+	if (get_sha1_hex(line, patch->new_sha1))
+		memcpy(patch->new_sha1, null_sha1, 20);
+	else
+		gitdiff_newmode(line + 41, patch);
+	return 0;
+}
+
 /*
  * This is normal for a diff that doesn't change anything: we'll fall through
  * into the next diff. Tell the parser to break out.
@@ -438,6 +460,8 @@ static int parse_git_header(char *line, 
 			{ "rename to ", gitdiff_renamedst },
 			{ "similarity index ", gitdiff_similarity },
 			{ "dissimilarity index ", gitdiff_dissimilarity },
+			{ "old index ", gitdiff_old_index },
+			{ "new index ", gitdiff_new_index },
 			{ "", gitdiff_unrecognized },
 		};
 		int i;
@@ -1136,6 +1160,36 @@ static void show_file_list(struct patch 
 	}
 }
 
+static inline int is_null_sha1(const unsigned char *sha1)
+{
+	return !memcmp(sha1, null_sha1, 20);
+}
+
+static void show_index_list(struct patch *list)
+{
+	struct patch *patch;
+
+	for (patch = list; patch; patch = patch->next) {
+		if ( (!patch->is_delete && is_null_sha1(patch->new_sha1)) ||
+		     (!patch->is_new && is_null_sha1(patch->old_sha1)) )
+			die("patch does not record sha1 information");
+	}
+
+	for (patch = list; patch; patch = patch->next) {
+		if (!is_null_sha1(patch->old_sha1))
+			printf("- %s %06o	%s\n",
+			       sha1_to_hex(patch->old_sha1),
+			       patch->old_mode,
+			       patch->old_name);
+
+		if (!is_null_sha1(patch->new_sha1))
+			printf("+ %s %06o	%s\n",
+			       sha1_to_hex(patch->new_sha1),
+			       patch->new_mode,
+			       patch->new_name);
+	}
+}
+
 static void stat_patch_list(struct patch *patch)
 {
 	int files, adds, dels;
@@ -1476,6 +1530,9 @@ static int apply_patch(int fd)
 	if (show_files)
 		show_file_list(list);
 
+	if (show_index_info)
+		show_index_list(list);
+
 	if (diffstat)
 		stat_patch_list(list);
 
@@ -1534,6 +1591,11 @@ int main(int argc, char **argv)
 			show_files = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--show-index-info")) {
+			apply = 0;
+			show_index_info = 1;
+			continue;
+		}
 		fd = open(arg, O_RDONLY);
 		if (fd < 0)
 			usage(apply_usage);
diff --git a/cache.h b/cache.h
old index ec2a1610b2fd6edec6c95847d4377f9c0241b738 100644
new index 514adb8f8ed621d98175bea0d701576de13eb620 100644
--- a/cache.h
+++ b/cache.h
@@ -165,6 +165,7 @@ extern int ce_match_stat(struct cache_en
 extern int ce_modified(struct cache_entry *ce, struct stat *st);
 extern int ce_path_match(const struct cache_entry *ce, const char **pathspec);
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, const char *type);
+extern int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object);
 extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
 
 struct cache_file {
diff --git a/diff.c b/diff.c
old index 7d06b035ae8b6a53f10a7f94251a24d911baa509 100644
new index b602e25ccb2b4cf56dbfb5f2797651b1e44d4585 100644
--- a/diff.c
+++ b/diff.c
@@ -596,15 +596,31 @@ static void run_external_diff(const char
 	remove_tempfile();
 }
 
+static void diff_fill_sha1_info(struct diff_filespec *one)
+{
+	if (DIFF_FILE_VALID(one)) {
+		if (!one->sha1_valid) {
+			struct stat st;
+			if (stat(one->path, &st) < 0)
+				die("stat %s", one->path);
+			if (index_path(one->sha1, one->path, &st, 0))
+				die("cannot hash %s\n", one->path);
+		}
+	}
+	else
+		memset(one->sha1, 0, 20);
+}
+
 static void run_diff(struct diff_filepair *p)
 {
 	const char *pgm = external_diff();
-	char msg_[PATH_MAX*2+200], *xfrm_msg;
+	char msg[PATH_MAX*2+300], *xfrm_msg;
 	struct diff_filespec *one;
 	struct diff_filespec *two;
 	const char *name;
 	const char *other;
 	int complete_rewrite = 0;
+	int len;
 
 	if (DIFF_PAIR_UNMERGED(p)) {
 		/* unmerged */
@@ -616,38 +632,58 @@ static void run_diff(struct diff_filepai
 	name = p->one->path;
 	other = (strcmp(name, p->two->path) ? p->two->path : NULL);
 	one = p->one; two = p->two;
+
+	diff_fill_sha1_info(one);
+	diff_fill_sha1_info(two);
+
+	len = 0;
+	if (memcmp(one->sha1, two->sha1, 20)) {
+		if (memcmp(one->sha1, null_sha1, 20))
+			len += snprintf(msg + len, sizeof(msg) - len,
+					"old index %s %06o\n",
+					sha1_to_hex(one->sha1),
+					one->mode);
+		if (memcmp(two->sha1, null_sha1, 20))
+			len += snprintf(msg + len, sizeof(msg) - len,
+					"new index %s %06o\n",
+					sha1_to_hex(two->sha1),
+					two->mode);
+	}
+
 	switch (p->status) {
 	case DIFF_STATUS_COPIED:
-		sprintf(msg_,
-			"similarity index %d%%\n"
-			"copy from %s\n"
-			"copy to %s",
-			(int)(0.5 + p->score * 100.0/MAX_SCORE),
-			name, other);
-		xfrm_msg = msg_;
+		len += snprintf(msg + len, sizeof(msg) - len,
+				"similarity index %d%%\n"
+				"copy from %s\n"
+				"copy to %s\n",
+				(int)(0.5 + p->score * 100.0/MAX_SCORE),
+				name, other);
 		break;
 	case DIFF_STATUS_RENAMED:
-		sprintf(msg_,
-			"similarity index %d%%\n"
-			"rename from %s\n"
-			"rename to %s",
-			(int)(0.5 + p->score * 100.0/MAX_SCORE),
-			name, other);
-		xfrm_msg = msg_;
+		len += snprintf(msg + len, sizeof(msg) - len,
+				"similarity index %d%%\n"
+				"rename from %s\n"
+				"rename to %s\n",
+				(int)(0.5 + p->score * 100.0/MAX_SCORE),
+				name, other);
 		break;
 	case DIFF_STATUS_MODIFIED:
 		if (p->score) {
-			sprintf(msg_,
-				"dissimilarity index %d%%",
-				(int)(0.5 + p->score * 100.0/MAX_SCORE));
-			xfrm_msg = msg_;
+			len += snprintf(msg + len, sizeof(msg) - len,
+					"dissimilarity index %d%%\n",
+					(int)(0.5 + p->score *
+					      100.0/MAX_SCORE));
 			complete_rewrite = 1;
 			break;
 		}
 		/* fallthru */
 	default:
-		xfrm_msg = NULL;
+		/* nothing */
+		;
 	}
+	if (len)
+		msg[--len] = 0;
+	xfrm_msg = len ? msg : NULL;
 
 	if (!pgm &&
 	    DIFF_FILE_VALID(one) && DIFF_FILE_VALID(two) &&
diff --git a/sha1_file.c b/sha1_file.c
old index 895c1fab6fc00b8131e44851f33b79b1d2310b12 100644
new index 287f618827d9b2fd472c06add42ead65de291822 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1545,3 +1545,42 @@ int index_fd(unsigned char *sha1, int fd
 		munmap(buf, size);
 	return ret;
 }
+
+int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object)
+{
+	int fd;
+	char *target;
+
+	switch (st->st_mode & S_IFMT) {
+	case S_IFREG:
+		fd = open(path, O_RDONLY);
+		if (fd < 0)
+			return error("open(\"%s\"): %s", path,
+				     strerror(errno));
+		if (index_fd(sha1, fd, st, write_object, NULL) < 0)
+			return error("%s: failed to insert into database",
+				     path);
+		break;
+	case S_IFLNK:
+		target = xmalloc(st->st_size+1);
+		if (readlink(path, target, st->st_size+1) != st->st_size) {
+			char *errstr = strerror(errno);
+			free(target);
+			return error("readlink(\"%s\"): %s", path,
+			             errstr);
+		}
+		if (!write_object) {
+			unsigned char hdr[50];
+			int hdrlen;
+			write_sha1_file_prepare(target, st->st_size, "blob",
+						sha1, hdr, &hdrlen);
+		} else if (write_sha1_file(target, st->st_size, "blob", sha1))
+			return error("%s: failed to insert into database",
+				     path);
+		free(target);
+		break;
+	default:
+		return error("%s: unsupported file type", path);
+	}
+	return 0;
+}
diff --git a/t/diff-lib.sh b/t/diff-lib.sh
old index a912f435aa67d7b2cde09f0b8c9855442c6c2377 100755
new index a4095db2ed3b57d5e042c5ca49dd912c1c534cf8 100755
--- a/t/diff-lib.sh
+++ b/t/diff-lib.sh
@@ -29,7 +29,13 @@ compare_diff_raw_z () {
 compare_diff_patch () {
     # When heuristics are improved, the score numbers would change.
     # Ignore them while comparing.
-    sed -e '/^[dis]*imilarity index [0-9]*%$/d' <"$1" >.tmp-1
-    sed -e '/^[dis]*imilarity index [0-9]*%$/d' <"$2" >.tmp-2
+    sed -e '
+	/^[dis]*imilarity index [0-9]*%$/d
+	/^[no][el][dw] index [0-9a-f]* [0-7]*$/d
+    ' <"$1" >.tmp-1
+    sed -e '
+	/^[dis]*imilarity index [0-9]*%$/d
+	/^[no][el][dw] index [0-9a-f]* [0-7]*$/d
+    ' <"$2" >.tmp-2
     diff -u .tmp-1 .tmp-2 && rm -f .tmp-1 .tmp-2
 }
diff --git a/t/t4000-diff-format.sh b/t/t4000-diff-format.sh
old index f3b6330a9b4af0e68d9e402ab7d82e600c939ccc 100755
new index fa387cc24b8e5e6c92df9d1d504b805c92b94e36 100755
--- a/t/t4000-diff-format.sh
+++ b/t/t4000-diff-format.sh
@@ -7,6 +7,7 @@ test_description='Test built-in diff out
 
 '
 . ./test-lib.sh
+. ../diff-lib.sh
 
 echo >path0 'Line 1
 Line 2
@@ -29,6 +30,8 @@ cat >expected <<\EOF
 diff --git a/path0 b/path0
 old mode 100644
 new mode 100755
+old index 85d7fd2fbbe9564fe6ecc5e72416adb36e62d658 100644
+new index 6ad36e52f0002937ed2de6a1c15d8a0ae5df056a 100755
 --- a/path0
 +++ b/path0
 @@ -1,3 +1,3 @@
@@ -38,6 +41,7 @@ new mode 100755
 +Line 3
 diff --git a/path1 b/path1
 deleted file mode 100755
+old index 85d7fd2fbbe9564fe6ecc5e72416adb36e62d658 100755
 --- a/path1
 +++ /dev/null
 @@ -1,3 +0,0 @@
@@ -48,6 +52,6 @@ EOF
 
 test_expect_success \
     'validate git-diff-files -p output.' \
-    'cmp -s current expected'
+    'compare_diff_patch current expected'
 
 test_done
diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
old index be474856824f8e6659151590ed5acff38187e97d 100755
new index 2e3c20d6b9468bf413e97d422e7dbe13ac4238cd 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -7,6 +7,7 @@ test_description='Test rename detection 
 
 '
 . ./test-lib.sh
+. ../diff-lib.sh
 
 echo >path0 'Line 1
 Line 2
@@ -61,6 +62,6 @@ EOF
 
 test_expect_success \
     'validate the output.' \
-    'diff -I "similarity.*" >/dev/null current expected'
+    'compare_diff_patch current expected'
 
 test_done
diff --git a/t/t4004-diff-rename-symlink.sh b/t/t4004-diff-rename-symlink.sh
old index f59614ae255b33f450a784200716c9fd63b0a054 100755
new index a23aaa0a9471c68b233480cf34c7115d1f40e154 100755
--- a/t/t4004-diff-rename-symlink.sh
+++ b/t/t4004-diff-rename-symlink.sh
@@ -10,6 +10,7 @@ copy of symbolic links, but should not p
 by an edit for them.
 '
 . ./test-lib.sh
+. ../diff-lib.sh
 
 test_expect_success \
     'prepare reference tree' \
@@ -61,6 +62,6 @@ EOF
 
 test_expect_success \
     'validate diff output' \
-    'diff -u current expected'
+    'compare_diff_patch current expected'
 
 test_done
diff --git a/update-index.c b/update-index.c
old index b825a11d2f6d8a53f5e42e4f157ba036a13edb59 100644
new index 9dc518877b67bedced29ac91ecd8fafbb2a60135 100644
--- a/update-index.c
+++ b/update-index.c
@@ -37,8 +37,6 @@ static int add_file_to_cache(const char 
 	int size, namelen, option, status;
 	struct cache_entry *ce;
 	struct stat st;
-	int fd;
-	char *target;
 
 	status = lstat(path, &st);
 	if (status < 0 || S_ISDIR(st.st_mode)) {
@@ -77,34 +75,9 @@ static int add_file_to_cache(const char 
 	fill_stat_cache_info(ce, &st);
 	ce->ce_mode = create_ce_mode(st.st_mode);
 	ce->ce_flags = htons(namelen);
-	switch (st.st_mode & S_IFMT) {
-	case S_IFREG:
-		fd = open(path, O_RDONLY);
-		if (fd < 0)
-			return error("open(\"%s\"): %s", path, strerror(errno));
-		if (index_fd(ce->sha1, fd, &st, !info_only, NULL) < 0)
-			return error("%s: failed to insert into database", path);
-		break;
-	case S_IFLNK:
-		target = xmalloc(st.st_size+1);
-		if (readlink(path, target, st.st_size+1) != st.st_size) {
-			char *errstr = strerror(errno);
-			free(target);
-			return error("readlink(\"%s\"): %s", path,
-			             errstr);
-		}
-		if (info_only) {
-			unsigned char hdr[50];
-			int hdrlen;
-			write_sha1_file_prepare(target, st.st_size, "blob",
-						ce->sha1, hdr, &hdrlen);
-		} else if (write_sha1_file(target, st.st_size, "blob", ce->sha1))
-			return error("%s: failed to insert into database", path);
-		free(target);
-		break;
-	default:
-		return error("%s: unsupported file type", path);
-	}
+
+	if (index_path(ce->sha1, path, &st, !info_only))
+		return -1;
 	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
 	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
 	if (add_cache_entry(ce, option))
---
0.99.8.GIT

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Show original and resulting blob object info in diff output.
  2005-10-07  2:33                     ` [PATCH] Show original and resulting blob object info in diff output Junio C Hamano
@ 2005-10-07  4:47                       ` Linus Torvalds
  2005-10-07  5:16                         ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2005-10-07  4:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric W. Biederman, git



On Thu, 6 Oct 2005, Junio C Hamano wrote:
> 
>     diff --git a/apply.c b/apply.c
>     old index 7be50413538868412a87c847f8fa184cadd0fa2a 100644
>     new index 83660822fcbb9edac523634999e30d65c2790cae 100644

Hmm.. It strikes me that if we don't have the SHA1 object, then the SHA1 
is largely useless except as a verification of the thing (ie we can still 
_calculate_ the blob sha of whatever the current file is).

And if we _do_ have the SHA1, then we don't actually need all of it.

So the above is just very ugly, but it would be equally useful if it just 
had a partial SHA1 - still plenty good to identify the thing.

IO, why not make it shorter ans prettier and just make it be

	index 7be50413538868412a87..83660822fcbb9edac523

which is basically the first 20 hex digits of each SHA1.

Still plenty good to look up the object if we have it, and _certainly_ 
good enough as a quick verification. And much less "imposing".

The thing is, I actually tend to look at diffs when I send stuff out or 
receive them, and your one is just very ugly:

> applies-to: 8cf0bb652ffde36e57bca5d6f33287305a61bb74
> 2c48a811a36ee6e12aeb044b3d1ff459b8d958c2
> diff --git a/apply.c b/apply.c
> old index 7be50413538868412a87c847f8fa184cadd0fa2a 100644
> new index ed2eac2c67dd003c8a0eaa7325e1543d1472777d 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -14,6 +14,7 @@

That's a _lot_ of very "solid" unreadable stuff. It makes my eyes water. 
I'd really like to minimize it.

If you want the "applies-to" etc, then _all_ of it could be just crammed 
onto just one line (just cut each SHA1 down to 10 hex characters: 40 bits 
is still _way_ enough to indicate a unique version of a particular file if 
we have it, even if it's "just" a quarter of the full name. It's not 
like this is meant to be cryptographic. In fact, I bet 20 bits would be 
plenty).

		Linus

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Show original and resulting blob object info in diff output.
  2005-10-07  4:47                       ` Linus Torvalds
@ 2005-10-07  5:16                         ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2005-10-07  5:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> IOW, why not make it shorter ans prettier and just make it be
>
> 	index 7be50413538868412a87..83660822fcbb9edac523
>
> which is basically the first 20 hex digits of each SHA1.

Makes sense.

> If you want the "applies-to" etc,...

applies-to: is for the whole tree, and I think it can go if we
do this per-blob signature thing.

The first SHA1 after applies-to: is coming from diff-tree,
noting the commit object _after_ the change.  I could filter it
while running format-patch, because it is useless from patch
application point of view.

> .... In fact, I bet 20 bits would be plenty).

So the updated proposal would be:

    diff --git a/apply.c b/apply.c
    index 7be5041..8366082 100644
    --- a/apply.c
    +++ b/apply.c
    @@ -14,6 +14,7 @@
    ...
    diff --git a/foo.sh b/bar.sh
    old mode 100644
    new mode 100755
    index 7be5041..8366082
    similarity index 86%
    rename from foo.sh
    rename to bar.sh
    --- a/foo.sh
    +++ b/bar.sh
    @@ -14,6 +14,7 @@
    ...

where the "index" line shows abbreviated SHA1 for pre- and post-
image blob, with an optional mode bit string only if there is no
mode change; otherwise we would have old/new mode line anyway.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2005-10-07  5:16 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-10-05  0:46 [PATCH] Fall back to three-way merge when applying a patch Junio C Hamano
2005-10-05  4:56 ` Linus Torvalds
2005-10-05  6:58   ` Junio C Hamano
2005-10-05 14:30     ` Linus Torvalds
2005-10-06  0:03       ` Junio C Hamano
2005-10-06  1:59         ` Eric W. Biederman
2005-10-06  2:18           ` Linus Torvalds
2005-10-06  4:17             ` Junio C Hamano
2005-10-06  5:25             ` Eric W. Biederman
2005-10-06 14:35               ` Linus Torvalds
2005-10-06 14:52                 ` Eric W. Biederman
2005-10-06 14:59                   ` Linus Torvalds
2005-10-06 17:07                     ` Eric W. Biederman
2005-10-07  2:33                     ` [PATCH] Show original and resulting blob object info in diff output Junio C Hamano
2005-10-07  4:47                       ` Linus Torvalds
2005-10-07  5:16                         ` Junio C Hamano
2005-10-06  7:33             ` [PATCH] Fall back to three-way merge when applying a patch Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).