From: Kirill Smelkov <kirr@navytux.spb.ru>
To: Junio C Hamano <gitster@pobox.com>
Cc: kirr@mns.spb.ru, git@vger.kernel.org
Subject: Re: [PATCH v2 14/19] tree-diff: rework diff_tree interface to be sha1 based
Date: Tue, 25 Mar 2014 13:22:15 +0400
Message-ID: <20140325092215.GB3777@mini.zxlink>
In-Reply-To: <xmqqa9cfp9d5.fsf@gitster.dls.corp.google.com>

On Mon, Mar 24, 2014 at 02:36:22PM -0700, Junio C Hamano wrote:
> Kirill Smelkov <kirr@mns.spb.ru> writes:
> 
> > The downside is that try_to_follow_renames(), if active, will cause
> > re-reading of 2 initial trees, which was negligible based on my timings,
> 
> That would depend on how often the codepath triggered in your test
> case, but is totally understandable.  It fires only when the path we
> have been following disappears from the parent, and the processing
> of try-to-follow itself is very compute-intensive (it needs to run
> find-copies-harder logic) that will end up reading many subtrees of
> the two initial trees; two more reading of tree objects will be
> dwarfed by the actual processing.

I agree and thanks for the explanation.

> > and which is outweighed cogently by the upsides.
> 
> > Changes since v1:
> >
> >  - don't need to touch diff.h, as diff_tree() became static.
> 
> Nice.  I wonder if it is an option to let the function keep its name
> diff_tree() without renaming it to __diff_tree_whatever(), though.

As I see it, Git has a convention for functions operating on trees: they
either accept `struct tree_desc *` and are named plainly, or accept sha1s
and are named with a _sha1 suffix. From this point of view, a new
diff_tree() that accepts sha1s but keeps its old name would be confusing.

Besides, in the end we'll have two functions, each with a high-level
wrapper and a lower-level worker:

    - diff_tree_sha1(), and
    - diff_tree_paths().

So it's not only about this particular case.  Both do some simple
preparation, call the worker, and perform some cleanup.

So the question is how to name the worker?

In Linux they use the "__" prefix. We could also use some other prefix or
suffix, e.g. "_bh" (for bottom-half), "_worker", "_low", "_raw", etc.
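
To illustrate the wrapper/worker split itself, here is a minimal sketch.
It is a toy, not Git code: `struct toy_tree`, toy_load(), toy_diff_id()
and toy_diff_worker() are hypothetical names, integer ids stand in for
sha1s, and a sorted list of paths stands in for a tree. The "_worker"
suffix is used only because it is one of the options listed above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed tree: sorted, NULL-terminated paths. */
struct toy_tree {
	const char **paths;
};

/* Hypothetical loader: maps an object id to a heap-allocated toy tree. */
static struct toy_tree *toy_load(int id)
{
	static const char *tree_a[] = { "Makefile", "diff.c", "tree-diff.c", NULL };
	static const char *tree_b[] = { "Makefile", "tree-diff.c", "tree-walk.c", NULL };
	struct toy_tree *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->paths = (id == 1) ? tree_a : tree_b;
	return t;
}

/*
 * Worker: walks both path lists in lockstep and counts the paths
 * present on only one side.  It does no loading and no cleanup.
 */
static int toy_diff_worker(const struct toy_tree *t1, const struct toy_tree *t2)
{
	const char **p1 = t1->paths, **p2 = t2->paths;
	int changed = 0;

	while (*p1 || *p2) {
		/* An exhausted side sorts after everything on the other side. */
		int cmp = !*p1 ? 1 : !*p2 ? -1 : strcmp(*p1, *p2);

		if (cmp == 0) {          /* same path on both sides */
			p1++;
			p2++;
		} else if (cmp < 0) {    /* path only in t1 */
			changed++;
			p1++;
		} else {                 /* path only in t2 */
			changed++;
			p2++;
		}
	}
	return changed;
}

/* Wrapper: simple preparation, call the worker, perform cleanup. */
int toy_diff_id(int old_id, int new_id)
{
	struct toy_tree *t1 = toy_load(old_id);
	struct toy_tree *t2 = toy_load(new_id);
	int ret = -1;

	if (t1 && t2)
		ret = toy_diff_worker(t1, t2);
	free(t2);
	free(t1);
	return ret;
}
```

With the two sample trees, toy_diff_id(1, 2) counts two paths that exist
on only one side ("diff.c" and "tree-walk.c"), and comparing a tree with
itself gives zero.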


To me, personally, the cleanest is the "__" prefix, but maybe I'm too used
to Linux. I'm open to another naming scheme, as long as it is
consistent.

What are the downsides of the "__" prefix, by the way?


> >  tree-diff.c | 60 ++++++++++++++++++++++++++++--------------------------------
> >  1 file changed, 28 insertions(+), 32 deletions(-)
> >
> > diff --git a/tree-diff.c b/tree-diff.c
> > index b99622c..f90acf5 100644
> > --- a/tree-diff.c
> > +++ b/tree-diff.c
> > @@ -137,12 +137,17 @@ static void skip_uninteresting(struct tree_desc *t, struct strbuf *base,
> >  	}
> >  }
> >  
> > -static int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
> > -		     const char *base_str, struct diff_options *opt)
> > +static int __diff_tree_sha1(const unsigned char *old, const unsigned char *new,
> > +			    const char *base_str, struct diff_options *opt)
> >  {
> > +	struct tree_desc t1, t2;
> > +	void *t1tree, *t2tree;
> >  	struct strbuf base;
> >  	int baselen = strlen(base_str);
> >  
> > +	t1tree = fill_tree_descriptor(&t1, old);
> > +	t2tree = fill_tree_descriptor(&t2, new);
> > +
> >  	/* Enable recursion indefinitely */
> >  	opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);
> >  
> > @@ -155,39 +160,41 @@ static int diff_tree(struct tree_desc *t1, struct tree_desc *t2,
> >  		if (diff_can_quit_early(opt))
> >  			break;
> >  		if (opt->pathspec.nr) {
> > -			skip_uninteresting(t1, &base, opt);
> > -			skip_uninteresting(t2, &base, opt);
> > +			skip_uninteresting(&t1, &base, opt);
> > +			skip_uninteresting(&t2, &base, opt);
> >  		}
> > -		if (!t1->size && !t2->size)
> > +		if (!t1.size && !t2.size)
> >  			break;
> >  
> > -		cmp = tree_entry_pathcmp(t1, t2);
> > +		cmp = tree_entry_pathcmp(&t1, &t2);
> >  
> >  		/* t1 = t2 */
> >  		if (cmp == 0) {
> >  			if (DIFF_OPT_TST(opt, FIND_COPIES_HARDER) ||
> > -			    hashcmp(t1->entry.sha1, t2->entry.sha1) ||
> > -			    (t1->entry.mode != t2->entry.mode))
> > -				show_path(&base, opt, t1, t2);
> > +			    hashcmp(t1.entry.sha1, t2.entry.sha1) ||
> > +			    (t1.entry.mode != t2.entry.mode))
> > +				show_path(&base, opt, &t1, &t2);
> >  
> > -			update_tree_entry(t1);
> > -			update_tree_entry(t2);
> > +			update_tree_entry(&t1);
> > +			update_tree_entry(&t2);
> >  		}
> >  
> >  		/* t1 < t2 */
> >  		else if (cmp < 0) {
> > -			show_path(&base, opt, t1, /*t2=*/NULL);
> > -			update_tree_entry(t1);
> > +			show_path(&base, opt, &t1, /*t2=*/NULL);
> > +			update_tree_entry(&t1);
> >  		}
> >  
> >  		/* t1 > t2 */
> >  		else {
> > -			show_path(&base, opt, /*t1=*/NULL, t2);
> > -			update_tree_entry(t2);
> > +			show_path(&base, opt, /*t1=*/NULL, &t2);
> > +			update_tree_entry(&t2);
> >  		}
> >  	}
> >  
> >  	strbuf_release(&base);
> > +	free(t2tree);
> > +	free(t1tree);
> >  	return 0;
> >  }
> >  
> > @@ -202,7 +209,7 @@ static inline int diff_might_be_rename(void)
> >  		!DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
> >  }
> >  
> > -static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, const char *base, struct diff_options *opt)
> > +static void try_to_follow_renames(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
> >  {
> >  	struct diff_options diff_opts;
> >  	struct diff_queue_struct *q = &diff_queued_diff;
> > @@ -240,7 +247,7 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
> >  	diff_opts.break_opt = opt->break_opt;
> >  	diff_opts.rename_score = opt->rename_score;
> >  	diff_setup_done(&diff_opts);
> > -	diff_tree(t1, t2, base, &diff_opts);
> > +	__diff_tree_sha1(old, new, base, &diff_opts);
> >  	diffcore_std(&diff_opts);
> >  	free_pathspec(&diff_opts.pathspec);
> >  
> > @@ -301,23 +308,12 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
> >  
> >  int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
> >  {
> > -	void *tree1, *tree2;
> > -	struct tree_desc t1, t2;
> > -	unsigned long size1, size2;
> >  	int retval;
> >  
> > -	tree1 = fill_tree_descriptor(&t1, old);
> > -	tree2 = fill_tree_descriptor(&t2, new);
> > -	size1 = t1.size;
> > -	size2 = t2.size;
> > -	retval = diff_tree(&t1, &t2, base, opt);
> > -	if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename()) {
> > -		init_tree_desc(&t1, tree1, size1);
> > -		init_tree_desc(&t2, tree2, size2);
> > -		try_to_follow_renames(&t1, &t2, base, opt);
> > -	}
> > -	free(tree1);
> > -	free(tree2);
> > +	retval = __diff_tree_sha1(old, new, base, opt);
> > +	if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename())
> > +		try_to_follow_renames(old, new, base, opt);
> > +
> >  	return retval;
> >  }
