git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Random patches from my tree
@ 2008-10-27 20:05 Linus Torvalds
  2008-10-27 20:06 ` [PATCH 1/2] Add file delete/create info when we overflow rename_limit Linus Torvalds
  2008-10-27 20:37 ` [PATCH 0/2] Random patches from my tree Linus Torvalds
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2008-10-27 20:05 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano


Ok, here's two random patches that I have in my git tree that may or may 
not be interesting.

The first one is something I did a long time ago because I was irritated 
by the "skipping inexact rename detection" message not really telling me 
_why_ it did so - just "too many files". So it just adds the information 
about number of deleted and created files.

The second one was something I whipped up due to the discussion about 
"which branch is a commit from". It adds a "--source" flag to the log 
command family, which then makes the revision walker save off the name of 
the commit in the '->util' field, and makes 'show_decorations()' show it.

So the second one allows things like

	git log --pretty=oneline --source --all

and it gives a somewhat useful view of which branch some commit came in 
on, by just showing _which_ of the references was the one that reached 
a particular commit first.

Of course, with commits that are reached multiple ways the whole notion of 
"which one reached us" is ambiguous, and it just picks one rather than 
building up any list of them. The first name we reach somethign through 
is the one that a commit sticks with. That's generally what you want.

I guess we should have a "--pretty=format" thing for it too, and maybe you 
want to support it for 'git rev-list' as well, but that's for others to 
decide - it people think it's useful.

			Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] Add file delete/create info when we overflow rename_limit
  2008-10-27 20:05 [PATCH 0/2] Random patches from my tree Linus Torvalds
@ 2008-10-27 20:06 ` Linus Torvalds
  2008-10-27 20:07   ` [PATCH 2/2] Add a 'source' decorator for commits Linus Torvalds
  2008-10-27 20:37 ` [PATCH 0/2] Random patches from my tree Linus Torvalds
  1 sibling, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2008-10-27 20:06 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano


When we refuse to do rename detection due to having too many files
created or deleted, let the user know the numbers.  That way there is a
reasonable starting point for setting the diff.renamelimit option.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 diffcore-rename.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 1b2ebb4..168a95b 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -493,7 +493,7 @@ void diffcore_rename(struct diff_options *options)
 	if ((num_create > rename_limit && num_src > rename_limit) ||
 	    (num_create * num_src > rename_limit * rename_limit)) {
 		if (options->warn_on_too_large_rename)
-			warning("too many files, skipping inexact rename detection");
+			warning("too many files (created: %d deleted: %d), skipping inexact rename detection", num_create, num_src);
 		goto cleanup;
 	}
 
-- 
1.6.0.3.517.g759a.dirty

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-27 20:06 ` [PATCH 1/2] Add file delete/create info when we overflow rename_limit Linus Torvalds
@ 2008-10-27 20:07   ` Linus Torvalds
  2008-10-28  5:45     ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2008-10-27 20:07 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano


We already support decorating commits by tags or branches that point to
them, but especially when we are looking at multiple branches together,
we sometimes want to see _how_ we reached a particular commit.

We can abuse the '->util' field in the commit to keep track of that as
we walk the commit lists, and get a reasonably useful view into which
branch or tag first reaches that commit.

Of course, if the commit is reachable through multiple sources (which is
common), our particular choice of "first" reachable is entirely random
and depends on the particular path we happened to follow.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 builtin-log.c      |    2 ++
 builtin-rev-list.c |    2 +-
 log-tree.c         |    8 +++++---
 log-tree.h         |    2 +-
 revision.c         |    4 ++++
 revision.h         |    1 +
 6 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index a0944f7..176cbce 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -56,6 +56,8 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
 		if (!strcmp(arg, "--decorate")) {
 			load_ref_decorations();
 			decorate = 1;
+		} else if (!strcmp(arg, "--source")) {
+			rev->show_source = 1;
 		} else
 			die("unrecognized argument: %s", arg);
 	}
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index 06cdeb7..857742a 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -100,7 +100,7 @@ static void show_commit(struct commit *commit)
 			children = children->next;
 		}
 	}
-	show_decorations(commit);
+	show_decorations(&revs, commit);
 	if (revs.commit_format == CMIT_FMT_ONELINE)
 		putchar(' ');
 	else
diff --git a/log-tree.c b/log-tree.c
index cec3c06..cf7947b 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -52,11 +52,13 @@ static void show_parents(struct commit *commit, int abbrev)
 	}
 }
 
-void show_decorations(struct commit *commit)
+void show_decorations(struct rev_info *opt, struct commit *commit)
 {
 	const char *prefix;
 	struct name_decoration *decoration;
 
+	if (opt->show_source && commit->util)
+		printf(" %s", (char *) commit->util);
 	decoration = lookup_decoration(&name_decoration, &commit->object);
 	if (!decoration)
 		return;
@@ -279,7 +281,7 @@ void show_log(struct rev_info *opt)
 		fputs(diff_unique_abbrev(commit->object.sha1, abbrev_commit), stdout);
 		if (opt->print_parents)
 			show_parents(commit, abbrev_commit);
-		show_decorations(commit);
+		show_decorations(opt, commit);
 		if (opt->graph && !graph_is_commit_finished(opt->graph)) {
 			putchar('\n');
 			graph_show_remainder(opt->graph);
@@ -352,7 +354,7 @@ void show_log(struct rev_info *opt)
 			printf(" (from %s)",
 			       diff_unique_abbrev(parent->object.sha1,
 						  abbrev_commit));
-		show_decorations(commit);
+		show_decorations(opt, commit);
 		printf("%s", diff_get_color_opt(&opt->diffopt, DIFF_RESET));
 		if (opt->commit_format == CMIT_FMT_ONELINE) {
 			putchar(' ');
diff --git a/log-tree.h b/log-tree.h
index 3c8127b..f2a9008 100644
--- a/log-tree.h
+++ b/log-tree.h
@@ -12,7 +12,7 @@ int log_tree_diff_flush(struct rev_info *);
 int log_tree_commit(struct rev_info *, struct commit *);
 int log_tree_opt_parse(struct rev_info *, const char **, int);
 void show_log(struct rev_info *opt);
-void show_decorations(struct commit *commit);
+void show_decorations(struct rev_info *opt, struct commit *commit);
 void log_write_email_headers(struct rev_info *opt, const char *name,
 			     const char **subject_p,
 			     const char **extra_headers_p,
diff --git a/revision.c b/revision.c
index 2f646de..d45f05a 100644
--- a/revision.c
+++ b/revision.c
@@ -199,6 +199,8 @@ static struct commit *handle_commit(struct rev_info *revs, struct object *object
 			mark_parents_uninteresting(commit);
 			revs->limited = 1;
 		}
+		if (revs->show_source && !commit->util)
+			commit->util = (void *) name;
 		return commit;
 	}
 
@@ -484,6 +486,8 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 
 		if (parse_commit(p) < 0)
 			return -1;
+		if (revs->show_source && !p->util)
+			p->util = commit->util;
 		p->object.flags |= left_flag;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= SEEN;
diff --git a/revision.h b/revision.h
index 2fdb2dd..51a4863 100644
--- a/revision.h
+++ b/revision.h
@@ -53,6 +53,7 @@ struct rev_info {
 			left_right:1,
 			rewrite_parents:1,
 			print_parents:1,
+			show_source:1,
 			reverse:1,
 			reverse_output_stage:1,
 			cherry_pick:1,
-- 
1.6.0.3.517.g759a.dirty

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Random patches from my tree
  2008-10-27 20:05 [PATCH 0/2] Random patches from my tree Linus Torvalds
  2008-10-27 20:06 ` [PATCH 1/2] Add file delete/create info when we overflow rename_limit Linus Torvalds
@ 2008-10-27 20:37 ` Linus Torvalds
  1 sibling, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2008-10-27 20:37 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano


On Mon, 27 Oct 2008, Linus Torvalds wrote:
> 
> So the second one allows things like
> 
> 	git log --pretty=oneline --source --all

A better example may have --abbrev-commit, at which point my current git 
tree looks like this for me:

	52be8b3... refs/heads/master Add a 'source' decorator for commits
	affba33... refs/heads/master Add file delete/create info when we overflow rename
	6806f78... refs/remotes/origin/pu Merge branch 'ar/mksnpath' into pu
	058412d... refs/remotes/origin/pu Fix potentially dangerous uses of mkpath and g
	356af64... refs/remotes/origin/pu Merge branch 'ar/maint-mksnpath' into HEAD
	9fa03c1... refs/remotes/origin/pu Fix potentially dangerous uses of mkpath and g
	94cc355... refs/remotes/origin/pu Fix mkpath abuse in dwim_ref and dwim_log of s
	108bebe... refs/remotes/origin/pu Add mksnpath which allows you to specify the o
	6af1fc2... refs/remotes/origin/man Autogenerated manpages for v1.6.0.3-523-g304d
	61b1229... refs/remotes/origin/html Autogenerated HTML docs for v1.6.0.3-523-g30
	f3ce133... refs/remotes/origin/pu Merge branch 'jc/send-pack-tell-me-more' into 
	2b37e3a... refs/remotes/origin/pu Merge branch 'jk/renamelimit' into pu
	..

ie you can see the the branch that the commits are on.

An example of the ambiguous nature would be 

	304d058... refs/remotes/origin/HEAD Merge branch 'maint'

ie because the HEAD and 'master' branches of refs/remotes/origin are the 
same, it's ambiguous which one to take, and git will have picked one at 
random. And obviously when you hit a point where two branches meet again, 
it really will depend on which branch we happened to be parsing at the 
time (which generally is "which branch had a commit date that was further 
back in history").

Another known deficiency: "gitk --source" doesn't work (neither does "gitk 
--decorate" for that matter). gitk gets very upset indeed if you try, and 
sucks up CPU cycles in some infinite loop.


		Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-27 20:07   ` [PATCH 2/2] Add a 'source' decorator for commits Linus Torvalds
@ 2008-10-28  5:45     ` Jeff King
  2008-10-28 13:11       ` Pierre Habouzit
  2008-10-28 15:17       ` Linus Torvalds
  0 siblings, 2 replies; 14+ messages in thread
From: Jeff King @ 2008-10-28  5:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

On Mon, Oct 27, 2008 at 01:07:10PM -0700, Linus Torvalds wrote:

> We already support decorating commits by tags or branches that point to
> them, but especially when we are looking at multiple branches together,
> we sometimes want to see _how_ we reached a particular commit.

I think this is a cool feature, but I have a few questions/complaints:

  - Does it make sense to have this _in addition_ to --decorate (since
    for any commit with a --decorate field, it would likely be the same
    as --source)? Should it be a different type of decorate instead,
    like --decorate=source or --decorate=branch?

  - Should this be triggered by the "%d" --pretty=format specifier? This
    two-liner:

diff --git a/pretty.c b/pretty.c
index f6ff312..bdaad19 100644
--- a/pretty.c
+++ b/pretty.c
@@ -487,6 +487,8 @@ static void format_decoration(struct strbuf *sb, const struct commit *commit)
 	const char *prefix = " (";
 
 	load_ref_decorations();
+	if (commit->util)
+		printf("%s", (char *)commit->util);
 	d = lookup_decoration(&name_decoration, &commit->object);
 	while (d) {
 		strbuf_addstr(sb, prefix);

    works, but:

      - it doesn't check revs->show_source, so is it possible that
        commit->util is being used for something else?

      - using '%d' automatically turns on --decorate, so you end up with
        both the --source and --decorate values. More sensible semantics
        would be "%d turns on --decorate, unless you have done
        --decorate=<explicit format>".

        Alternatively, this should just be "%b" or "%S".

  - If you don't specify --all, you just get "HEAD" for everything.
    Which makes sense when you consider the implementation, but I think
    is probably a bit confusing for users.

> Of course, if the commit is reachable through multiple sources (which is
> common), our particular choice of "first" reachable is entirely random
> and depends on the particular path we happened to follow.

Hmm. It would be nice to keep even a simple counter to get a "distance"
from the ref and choose the one with the smallest distance (I think we
can get away without the complex rules that git-describe uses, since we
are not interested in describing the full commit, but rather finding a
"likely" branch).

However, that would require making multiple passes over the commit
graph, which might impact the startup speed.

-Peff

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28  5:45     ` Jeff King
@ 2008-10-28 13:11       ` Pierre Habouzit
  2008-10-28 13:19         ` Pierre Habouzit
  2008-10-28 19:46         ` Jeff King
  2008-10-28 15:17       ` Linus Torvalds
  1 sibling, 2 replies; 14+ messages in thread
From: Pierre Habouzit @ 2008-10-28 13:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 2870 bytes --]

On Tue, Oct 28, 2008 at 05:45:40AM +0000, Jeff King wrote:
> On Mon, Oct 27, 2008 at 01:07:10PM -0700, Linus Torvalds wrote:
> > Of course, if the commit is reachable through multiple sources (which is
> > common), our particular choice of "first" reachable is entirely random
> > and depends on the particular path we happened to follow.
> 
> Hmm. It would be nice to keep even a simple counter to get a "distance"
> from the ref and choose the one with the smallest distance (I think we
> can get away without the complex rules that git-describe uses, since we
> are not interested in describing the full commit, but rather finding a
> "likely" branch).
> 
> However, that would require making multiple passes over the commit
> graph, which might impact the startup speed.

Actually I tried to do that (and you meant name-rev --contains rather
than describe actually ;p), and I stopped because it's too slow. I
believe the proper way to do that is to help git-log knowing which are
the short (topic) branches, and to crawl incrementally using a
date-based hack. This would basically work a bit like this. Let's
imaging you want to crawl "next" in git and know which topics from pu
are in it. You would say e.g.:

git-log --topics=*/* next (as pretty much every */* is a topic branch
for git.git).


Then one has to know which are the heads of every topic branches first,
then crawl next the usual way, except that when you arrive to a point
that is a topic branch head, you don't look that way. You remember the
date of that point, and continue to crawl "next" a bit further so that
you can start annotating the topic's commits you've stumbled upon. And
you do that, you look at jd/topic (as in John Doe topic branch) and mark
all the commits as being from jd/topic, until you either go back to some
commit from next, or your current commit date is younger than your
current "next" crawling front. In the former case, you're done with that
topic, in the latter you must continue to preprocess "next" a bit more.

That should allow incremental output, FSVO incremental (in the git.git
kind of integration, you have buckets of topic branches merges, and
basically we would have output being spit bucket per bucket I believe).


If you do that, you don't really need to keep distance scores, this
should, I believe, yield decent enough results, while being incremental,
which should yield almost instantaneous output.

Point is, I've no clue how to do that with our crawling primitives right
now, to be fair, I didn't looked at it because I'm lazy and don't have
enough topic branches to work with in my projects, so it's not an itch
yet ;)

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 13:11       ` Pierre Habouzit
@ 2008-10-28 13:19         ` Pierre Habouzit
  2008-10-28 19:46         ` Jeff King
  1 sibling, 0 replies; 14+ messages in thread
From: Pierre Habouzit @ 2008-10-28 13:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 3122 bytes --]

On Tue, Oct 28, 2008 at 01:11:16PM +0000, Pierre Habouzit wrote:
> On Tue, Oct 28, 2008 at 05:45:40AM +0000, Jeff King wrote:
> > On Mon, Oct 27, 2008 at 01:07:10PM -0700, Linus Torvalds wrote:
> > > Of course, if the commit is reachable through multiple sources (which is
> > > common), our particular choice of "first" reachable is entirely random
> > > and depends on the particular path we happened to follow.
> > 
> > Hmm. It would be nice to keep even a simple counter to get a "distance"
> > from the ref and choose the one with the smallest distance (I think we
> > can get away without the complex rules that git-describe uses, since we
> > are not interested in describing the full commit, but rather finding a
> > "likely" branch).
> > 
> > However, that would require making multiple passes over the commit
> > graph, which might impact the startup speed.
> 
> Actually I tried to do that (and you meant name-rev --contains rather
> than describe actually ;p), and I stopped because it's too slow. I
> believe the proper way to do that is to help git-log knowing which are
> the short (topic) branches, and to crawl incrementally using a
> date-based hack. This would basically work a bit like this. Let's
> imaging you want to crawl "next" in git and know which topics from pu
> are in it. You would say e.g.:
> 
> git-log --topics=*/* next (as pretty much every */* is a topic branch
> for git.git).
> 
> 
> Then one has to know which are the heads of every topic branches first,
> then crawl next the usual way, except that when you arrive to a point
> that is a topic branch head, you don't look that way. You remember the
> date of that point, and continue to crawl "next" a bit further so that
> you can start annotating the topic's commits you've stumbled upon. And
> you do that, you look at jd/topic (as in John Doe topic branch) and mark
> all the commits as being from jd/topic, until you either go back to some
> commit from next, or your current commit date is younger than your
> current "next" crawling front. In the former case, you're done with that
> topic, in the latter you must continue to preprocess "next" a bit more.

My description is clumsy, as we probably sometimes want to crawl topics
that aren't merged either like in git-log --topics=*/* --all.  But we
just have to mark the current "front" of the walk, and have it sorted
between things that are part of topic heads and the thing that are not.
We don't even need to decorate the things that are _not_ part of the
topics at all I'd say, because that's where you definitely need distance
based algorithms and multiple passes to yield correct results.

Whereas for the specific case of topic branches, which is what people
want (see Ingo's mails on the subject a week or so ago), the algorithm I
propose is, I guess, mostly O(n + m) in the number of walked commits (n)
and references (m). IOW fast.


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28  5:45     ` Jeff King
  2008-10-28 13:11       ` Pierre Habouzit
@ 2008-10-28 15:17       ` Linus Torvalds
  2008-10-28 17:21         ` Linus Torvalds
  2008-10-28 19:29         ` Jeff King
  1 sibling, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2008-10-28 15:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Junio C Hamano



On Tue, 28 Oct 2008, Jeff King wrote:
> 
>   - Does it make sense to have this _in addition_ to --decorate (since
>     for any commit with a --decorate field, it would likely be the same
>     as --source)? Should it be a different type of decorate instead,
>     like --decorate=source or --decorate=branch?

I think they are different. People who want --source generally have other 
issues than people who want --decorate, and the two do actually work 
together.

In particular, think about things like "gitk", which currently can't do 
_either_, but that could easily support both. Even to the point where gitk 
might want to add both flags on its own, just to always get the branch and 
the decorate output.

And no, they are _not_ the same. They are vehemently not the same when you 
use abything but '--all', and even with --all they are different because 
decorate has no problems with multiple decorations on one commit, while 
source is very much a single thing per commit.

And --source really has to be just a single data field, because anything 
else will almost inevitably be too expensive to be worth it.


>   - Should this be triggered by the "%d" --pretty=format specifier? This
>     two-liner:
> 
> diff --git a/pretty.c b/pretty.c
> index f6ff312..bdaad19 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -487,6 +487,8 @@ static void format_decoration(struct strbuf *sb, const struct commit *commit)
>  	const char *prefix = " (";
>  
>  	load_ref_decorations();
> +	if (commit->util)
> +		printf("%s", (char *)commit->util);
>  	d = lookup_decoration(&name_decoration, &commit->object);
>  	while (d) {
>  		strbuf_addstr(sb, prefix);
> 
>     works, but:
> 
>       - it doesn't check revs->show_source, so is it possible that
>         commit->util is being used for something else?

Indeed. You should always do the show_source check. There are different 
things that use 'util', and while I don't think any of them will ever use 
the format string, it's still a good idea to just make it consistent.

However:

>       - using '%d' automatically turns on --decorate, so you end up with
>         both the --source and --decorate values. More sensible semantics
>         would be "%d turns on --decorate, unless you have done
>         --decorate=<explicit format>".

As mentioned above, I think this is a non-starter. I don't think 
"decorate" and "source" really have anything to do with each other, except 
that they get printed out in similar ways and in the same function for the 
default printout.

And quite frankly, even that was partly just a "minimal diff" thing, 
although I do think that they both are "decorations", it's just that they 
are very _different_ decorations.

>         Alternatively, this should just be "%b" or "%S".

So yeah, I'd expect a new format specifier.

>   - If you don't specify --all, you just get "HEAD" for everything.
>     Which makes sense when you consider the implementation, but I think
>     is probably a bit confusing for users.

I don't think it's at all confusing, for two reasons:

 - you'd never ever use it manually unless you give multiple branches. Why 
   would you spend time to type out '--source' unless it was because you 
   needed to?

 - for scripting, you want things consistent, even if the 'consistency' is 
   purely about always showing HEAD when that's the only source.

For the second case, imagine having gitk always add "--source" and 
"--decorate" to the command line (the same way it always adds --parents 
and --pretty=raw etc). gitk doesn't want to care how many branches you 
give it as arguments, or whether you use --tags or --all. But it would 
want to always parse things the same way.

No?

> Hmm. It would be nice to keep even a simple counter to get a "distance"
> from the ref and choose the one with the smallest distance

We don't have the space. The other fields on "struct commit" are already 
used (indegree is used for topo sorting etc), and while we could make the 
pointer itself point to a more complex structure rather than the name (one 
that contains counts and possibly multiple names), that would now mean 
that we'd have to make another allocation for each commit. 

And that's very much against the whole point of the 'source' decoration. 
It was designed to be basically zero-cost. 

I could imagine doing it as not a single string: you could make it be a 
pointer to a list of (alphabetically sorted) strings, and then you don't 
have to make an allocation for each commit, you'd only need to do 
something like

	void add_source(struct commit *commit, struct strin_list *list)
	{
		struct string_list *old = commit->util;

		if (!old) {
			commit->util = list;
			return;
		}
		if (old == list)
			return;
		.. do a union sort of 'old'/'list' ..
	}

and I think it would be a stable algorithm (ie we'd share all normal 
cases, and only have to allocate new lists in the relatively rare cases of 
graphs joining), and that would be acceptable.

But the "counter" thing would not work. Not because it's expensive to 
count (it's not - you just increment the counter every time you go to a 
parent, and then if the parent already has a ->util entry, you replace it 
if the new one has a smaller count), but because it's just expensive to do 
another allocation for each commit.

(Of course, "expense" is relative. Maybe another allocation is ok, since 
it would only trigger with --source.)

		Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 15:17       ` Linus Torvalds
@ 2008-10-28 17:21         ` Linus Torvalds
  2008-10-28 19:29         ` Jeff King
  1 sibling, 0 replies; 14+ messages in thread
From: Linus Torvalds @ 2008-10-28 17:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Junio C Hamano



On Tue, 28 Oct 2008, Linus Torvalds wrote:
> 
> I could imagine doing it as not a single string: you could make it be a 
> pointer to a list of (alphabetically sorted) strings, and then you don't 
> have to make an allocation for each commit, you'd only need to do 
> something like
> 
> 	void add_source(struct commit *commit, struct strin_list *list)

Actually, no. That would be wrong.

Why? Becuase we might be printing out the commit before we see it for the 
second time, so if we were to print out anything but the "first reached 
data", we'd now have really nasty _unreliable_ data that would actually 
change depending on whether we also do things like --topo-sort and/or do 
commit limiting.

So suddenly --source would have to do the full tree just to make sure that 
it's reliably giving the same information, and that makes it much less 
useful.

In contrast, the thing I sent out is not only really simple and has 
basically zero peformance impact, but it's actually "more reliable" in 
that what it gives you is meaningful and clear. It might pick one 
particular name over another in random ways that depend on internal 
choices and the exact order that you gave your arguments in, but it 
doesn't even _try_ to claim anything else.

The source "name" is unambiguous only if there is a single source, and it 
doesn't really even try to claim anything else - for other cases, it will 
give answers that "make sense" but they won't necessarily be the _whole_ 
truth. But it won't ever be really misleading either, and it will never 
cause any slowdowns.

		Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 15:17       ` Linus Torvalds
  2008-10-28 17:21         ` Linus Torvalds
@ 2008-10-28 19:29         ` Jeff King
  1 sibling, 0 replies; 14+ messages in thread
From: Jeff King @ 2008-10-28 19:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

On Tue, Oct 28, 2008 at 08:17:02AM -0700, Linus Torvalds wrote:

> >   - Does it make sense to have this _in addition_ to --decorate (since
> >     for any commit with a --decorate field, it would likely be the same
> >     as --source)? Should it be a different type of decorate instead,
> >     like --decorate=source or --decorate=branch?
> 
> I think they are different. People who want --source generally have other 
> issues than people who want --decorate, and the two do actually work 
> together.

Sleeping on this and thinking about it some more, I think you are right
here, and all of the other complaints I had just go away.

I was thinking of it as "decorate commits with the likely branches they
were made on." But that's not what this is at all (though it happens to
come up with similar answers!). It's really about "show which ref, of
the refs which were requested to be shown, we started at to reach this
commit." Which is perhaps more limited, but obvoiusly is much faster to
compute.

And then the output of "git log --source HEAD" makes perfect sense, and
it makes sense not to worry about finding the "closest" ref. It is
really about annotating the traversal that you asked for.

So now my only complaint is the lack of documentation and tests. ;)

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 13:11       ` Pierre Habouzit
  2008-10-28 13:19         ` Pierre Habouzit
@ 2008-10-28 19:46         ` Jeff King
  2008-10-28 19:53           ` Jeff King
  2008-10-28 20:09           ` Pierre Habouzit
  1 sibling, 2 replies; 14+ messages in thread
From: Jeff King @ 2008-10-28 19:46 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

On Tue, Oct 28, 2008 at 02:11:16PM +0100, Pierre Habouzit wrote:

> Actually I tried to do that (and you meant name-rev --contains rather
> than describe actually ;p), and I stopped because it's too slow. I

I think we are both wrong, since it's "git describe --contains". ;)
But yes, that was what I meant.

> believe the proper way to do that is to help git-log knowing which are
> the short (topic) branches, and to crawl incrementally using a
> date-based hack. This would basically work a bit like this. Let's
> imaging you want to crawl "next" in git and know which topics from pu
> are in it. You would say e.g.:

Hmm. Why a date-based hack to see what's on the topic branch? Why not
just give an option to walk the graph twice, giving name-rev style
annotations, and just let it be slow. People will mostly look at it by
specifying just their topic branches anyway. IOW:

  git log ^origin/master my/topic1 my/topic2 my/topic3

and by virtue of the fact that you are vastly limiting the size of the
tree, it won't actually end up too slow. So you haven't said so much
"these are my topic branches" as "I am just not interested in things
that are already upstream."

Or maybe I'm misunderstanding something here:

> Then one has to know which are the heads of every topic branches first,
> then crawl next the usual way, except that when you arrive to a point
> that is a topic branch head, you don't look that way. You remember the
> date of that point, and continue to crawl "next" a bit further so that
> you can start annotating the topic's commits you've stumbled upon. And
> you do that, you look at jd/topic (as in John Doe topic branch) and mark
> all the commits as being from jd/topic, until you either go back to some
> commit from next, or your current commit date is younger than your
> current "next" crawling front. In the former case, you're done with that
> topic, in the latter you must continue to preprocess "next" a bit more.

When you say "heads of topic branches" do you mean we actually have the
topic branches? Or do you mean you want to crawl next, pulling the names
of the topic branches from the merge messages?

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 19:46         ` Jeff King
@ 2008-10-28 19:53           ` Jeff King
  2008-10-28 20:09           ` Pierre Habouzit
  1 sibling, 0 replies; 14+ messages in thread
From: Jeff King @ 2008-10-28 19:53 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

On Tue, Oct 28, 2008 at 03:46:43PM -0400, Jeff King wrote:

> 
> Hmm. Why a date-based hack to see what's on the topic branch? Why not
> just give an option to walk the graph twice, giving name-rev style
> annotations, and just let it be slow. People will mostly look at it by
> specifying just their topic branches anyway. IOW:
> 
>   git log ^origin/master my/topic1 my/topic2 my/topic3

And obviously you could split the "these are the commits to annotate"
specification from "these are the commits to show". But I actually think
in practice most users would want those to be the same.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 19:46         ` Jeff King
  2008-10-28 19:53           ` Jeff King
@ 2008-10-28 20:09           ` Pierre Habouzit
  2008-10-28 20:27             ` Jeff King
  1 sibling, 1 reply; 14+ messages in thread
From: Pierre Habouzit @ 2008-10-28 20:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 1452 bytes --]

On Tue, Oct 28, 2008 at 07:46:43PM +0000, Jeff King wrote:
> On Tue, Oct 28, 2008 at 02:11:16PM +0100, Pierre Habouzit wrote:
> > believe the proper way to do that is to help git-log knowing which are
> > the short (topic) branches, and to crawl incrementally using a
> > date-based hack. This would basically work a bit like this. Let's
> > imaging you want to crawl "next" in git and know which topics from pu
> > are in it. You would say e.g.:
> 
> Hmm. Why a date-based hack to see what's on the topic branch? Why not
> just give an option to walk the graph twice,

it serves the purpose to not walk the graph twice actually, but indeed
twice is not _that_ bad.
 
> giving name-rev style annotations, and just let it be slow. People
> will mostly look at it by specifying just their topic branches anyway.
> IOW:
> 
>   git log ^origin/master my/topic1 my/topic2 my/topic3
> 
> and by virtue of the fact that you are vastly limiting the size of the
> tree, it won't actually end up too slow. So you haven't said so much
> "these are my topic branches" as "I am just not interested in things
> that are already upstream."

Well, I was just thinking quickly during jetlag-induced insomnia. I
don't really care about the issue that much actually.
-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] Add a 'source' decorator for commits
  2008-10-28 20:09           ` Pierre Habouzit
@ 2008-10-28 20:27             ` Jeff King
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2008-10-28 20:27 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Linus Torvalds, Git Mailing List, Junio C Hamano

On Tue, Oct 28, 2008 at 09:09:02PM +0100, Pierre Habouzit wrote:

> > Hmm. Why a date-based hack to see what's on the topic branch? Why not
> > just give an option to walk the graph twice,
> 
> it serves the purpose to not walk the graph twice actually, but indeed
> twice is not _that_ bad.

Sure, and that is reasonable. But I think the real goal is "give this
information in a not-painfully slow manner". So if we can do it by
walking a smaller graph twice, I think that is OK, too.

> Well, I was just thinking quickly during jetlag-induced insomnia. I
> don't really care about the issue that much actually.

Heh. I have to admit I don't care enough to work on this personally,
either. I just _thought_ Linus was working on it, but I think what he is
doing is subtly different, and our discussion has diverged to something
that, if it ever came to pass, would be a separate feature.

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2008-10-28 20:28 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-27 20:05 [PATCH 0/2] Random patches from my tree Linus Torvalds
2008-10-27 20:06 ` [PATCH 1/2] Add file delete/create info when we overflow rename_limit Linus Torvalds
2008-10-27 20:07   ` [PATCH 2/2] Add a 'source' decorator for commits Linus Torvalds
2008-10-28  5:45     ` Jeff King
2008-10-28 13:11       ` Pierre Habouzit
2008-10-28 13:19         ` Pierre Habouzit
2008-10-28 19:46         ` Jeff King
2008-10-28 19:53           ` Jeff King
2008-10-28 20:09           ` Pierre Habouzit
2008-10-28 20:27             ` Jeff King
2008-10-28 15:17       ` Linus Torvalds
2008-10-28 17:21         ` Linus Torvalds
2008-10-28 19:29         ` Jeff King
2008-10-27 20:37 ` [PATCH 0/2] Random patches from my tree Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).