All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] cache_tree_find(): remove redundant checks
@ 2014-03-04  8:31 Michael Haggerty
  2014-03-04  9:40 ` David Kastrup
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Michael Haggerty @ 2014-03-04  8:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Kastrup, git, Michael Haggerty

The beginning of the loop ensures that slash can never be NULL.  So
don't keep checking whether it is NULL later in the loop.

Furthermore, there is no need for an early

    return it;

from the loop if slash points at the end of the string, because that
is exactly what will happen when the while condition fails at the
start of the next iteration.

Helped-by: David Kastrup <dak@gnu.org>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
I incorporated David's suggestion, and then realized that yet another
check was superfluous, so I removed that one too.

 cache-tree.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 0bbec43..19252c3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -551,25 +551,22 @@ static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *pat
 	if (!it)
 		return NULL;
 	while (*path) {
-		const char *slash;
 		struct cache_tree_sub *sub;
+		const char *slash = strchr(path, '/');
 
-		slash = strchr(path, '/');
 		if (!slash)
 			slash = path + strlen(path);
-		/* between path and slash is the name of the
-		 * subtree to look for.
+		/*
+		 * Between path and slash is the name of the subtree
+		 * to look for.
 		 */
 		sub = find_subtree(it, path, slash - path, 0);
 		if (!sub)
 			return NULL;
 		it = sub->cache_tree;
-		if (slash)
-			while (*slash && *slash == '/')
-				slash++;
-		if (!slash || !*slash)
-			return it; /* prefix ended with slashes */
 		path = slash;
+		while (*path == '/')
+			path++;
 	}
 	return it;
 }
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04  8:31 [PATCH v2] cache_tree_find(): remove redundant checks Michael Haggerty
@ 2014-03-04  9:40 ` David Kastrup
  2014-03-04 10:22   ` Michael Haggerty
  2014-03-04 21:05 ` Junio C Hamano
  2014-03-04 21:11 ` [microproject idea] Junio C Hamano
  2 siblings, 1 reply; 11+ messages in thread
From: David Kastrup @ 2014-03-04  9:40 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> The beginning of the loop ensures that slash can never be NULL.  So
> don't keep checking whether it is NULL later in the loop.
>
> Furthermore, there is no need for an early
>
>     return it;
>
> from the loop if slash points at the end of the string, because that
> is exactly what will happen when the while condition fails at the
> start of the next iteration.

Hm.  Another suggestion.  You have

		const char *slash = strchr(path, '/');
 		if (!slash)
 			slash = path + strlen(path);
[...]
		sub = find_subtree(it, path, slash - path, 0);
[...]
		path = slash;
		while (*path == '/')
			path++;
	}

At the price of introducing another variable, this could be

		const char *slash = strchr(path, '/');
 		size_t len = slash ? slash - path : strlen(path);
[...]
		sub = find_subtree(it, path, len, 0);
[...]
                if (!slash)
			break;
		for (path = slash; *path == '/';)
			path++;
	}

This introduces another variable and another condition.  The advantage
is that "slash" indeed points at a slash or is NULL, so the variable
names correspond better to what happens.  Alternatively, it might make
sense to rename "slash" into "end" or "endpart" or whatever.  Since
I can't think of a pretty name, I lean towards preferring the latter
version as it reads nicer.  I prefer code to read like children's books
rather than mystery novels.

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04  9:40 ` David Kastrup
@ 2014-03-04 10:22   ` Michael Haggerty
  2014-03-04 10:34     ` David Kastrup
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Haggerty @ 2014-03-04 10:22 UTC (permalink / raw)
  To: David Kastrup; +Cc: Junio C Hamano, git

On 03/04/2014 10:40 AM, David Kastrup wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> The beginning of the loop ensures that slash can never be NULL.  So
>> don't keep checking whether it is NULL later in the loop.
>>
>> Furthermore, there is no need for an early
>>
>>     return it;
>>
>> from the loop if slash points at the end of the string, because that
>> is exactly what will happen when the while condition fails at the
>> start of the next iteration.
> 
> Hm.  Another suggestion.  You have
> 
> 		const char *slash = strchr(path, '/');
>  		if (!slash)
>  			slash = path + strlen(path);
> [...]
> 		sub = find_subtree(it, path, slash - path, 0);
> [...]
> 		path = slash;
> 		while (*path == '/')
> 			path++;
> 	}
> 
> At the price of introducing another variable, this could be
> 
> 		const char *slash = strchr(path, '/');
>  		size_t len = slash ? slash - path : strlen(path);
> [...]
> 		sub = find_subtree(it, path, len, 0);
> [...]
>                 if (!slash)
> 			break;
> 		for (path = slash; *path == '/';)
> 			path++;
> 	}
> 
> This introduces another variable and another condition.  The advantage
> is that "slash" indeed points at a slash or is NULL, so the variable
> names correspond better to what happens.  Alternatively, it might make
> sense to rename "slash" into "end" or "endpart" or whatever.  Since
> I can't think of a pretty name, I lean towards preferring the latter
> version as it reads nicer.  I prefer code to read like children's books
> rather than mystery novels.

I think we're reaching the point of diminishing returns here.  I can't
muster a strong feeling either way about your suggestion to add a "len"
variable.

BTW, I purposely didn't use a "for" loop at the end (even though I
usually like them) because I wanted to keep it prominent that path is
being updated to the value of slash.  Putting that assignment in a for
loop makes it easy to overlook because it puts "path" in the spot that
usually holds an inconsequential iteration variable.

YMMV.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04 10:22   ` Michael Haggerty
@ 2014-03-04 10:34     ` David Kastrup
  0 siblings, 0 replies; 11+ messages in thread
From: David Kastrup @ 2014-03-04 10:34 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> BTW, I purposely didn't use a "for" loop at the end (even though I
> usually like them) because I wanted to keep it prominent that path is
> being updated to the value of slash.  Putting that assignment in a for
> loop makes it easy to overlook because it puts "path" in the spot that
> usually holds an inconsequential iteration variable.

Reasonable.

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04  8:31 [PATCH v2] cache_tree_find(): remove redundant checks Michael Haggerty
  2014-03-04  9:40 ` David Kastrup
@ 2014-03-04 21:05 ` Junio C Hamano
  2014-03-04 22:24   ` Michael Haggerty
  2014-03-05  4:38   ` David Kastrup
  2014-03-04 21:11 ` [microproject idea] Junio C Hamano
  2 siblings, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2014-03-04 21:05 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Kastrup, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

>  	while (*path) {
> -		const char *slash;
>  		struct cache_tree_sub *sub;
> +		const char *slash = strchr(path, '/');
>  
> -		slash = strchr(path, '/');
>  		if (!slash)
>  			slash = path + strlen(path);

Isn't the above a strchrnul()?

Combining a freestanding decl with intializer assignment to lose one
line is sort of cheating on the line count, but replacing the three
lines with a single strchrnul() would be a real code reduction ;-)

> -		/* between path and slash is the name of the
> -		 * subtree to look for.
> +		/*
> +		 * Between path and slash is the name of the subtree
> +		 * to look for.
>  		 */
>  		sub = find_subtree(it, path, slash - path, 0);
>  		if (!sub)
>  			return NULL;
>  		it = sub->cache_tree;
> -		if (slash)
> -			while (*slash && *slash == '/')
> -				slash++;
> -		if (!slash || !*slash)
> -			return it; /* prefix ended with slashes */
>  		path = slash;
> +		while (*path == '/')
> +			path++;
>  	}
>  	return it;
>  }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [microproject idea]
  2014-03-04  8:31 [PATCH v2] cache_tree_find(): remove redundant checks Michael Haggerty
  2014-03-04  9:40 ` David Kastrup
  2014-03-04 21:05 ` Junio C Hamano
@ 2014-03-04 21:11 ` Junio C Hamano
  2 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2014-03-04 21:11 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Find places where we scan a string twice unnecessarily, once with
strchr() and then with strlen(), e.g.

	const char *colon = strchr(name, ':');
	int namelen = colon ? colon - name : strlen(name);

and rewrite such a pattern using strchrnul() as appropriate.

The above example can become

	const char *colon = strchrnul(name, ':');
        int namelen = colon - name;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04 21:05 ` Junio C Hamano
@ 2014-03-04 22:24   ` Michael Haggerty
  2014-03-04 23:18     ` Junio C Hamano
  2014-03-05  4:38   ` David Kastrup
  1 sibling, 1 reply; 11+ messages in thread
From: Michael Haggerty @ 2014-03-04 22:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Kastrup, git

On 03/04/2014 10:05 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>>  	while (*path) {
>> -		const char *slash;
>>  		struct cache_tree_sub *sub;
>> +		const char *slash = strchr(path, '/');
>>  
>> -		slash = strchr(path, '/');
>>  		if (!slash)
>>  			slash = path + strlen(path);
> 
> Isn't the above a strchrnul()?

Oh, cool, I never realized that this GNU extension was blessed for use
in Git.  Will change.

> Combining a freestanding decl with intializer assignment to lose one
> line is sort of cheating on the line count, but replacing the three
> lines with a single strchrnul() would be a real code reduction ;-)

I suppose you are just teasing me, but for the record I consider line
count only a secondary metric.  The reason for combining initialization
with declaration is to reduce by one the number of times that the reader
has to think about that variable when analyzing the code.

And as I am writing this I realize that after converting to the use of
strchrnul(), sub can also be initialized at its declaration.

I really wish we could mix declarations with statements because I think
it is a big help to readability.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04 22:24   ` Michael Haggerty
@ 2014-03-04 23:18     ` Junio C Hamano
  2014-03-05 17:57       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2014-03-04 23:18 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Kastrup, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

>> Isn't the above a strchrnul()?
>
> Oh, cool, I never realized that this GNU extension was blessed for use
> in Git.  Will change.

We do have our own fallbacks for non-glibc platforms, so it should
be safe.

>> Combining a freestanding decl with intializer assignment to lose one
>> line is sort of cheating on the line count, but replacing the three
>> lines with a single strchrnul() would be a real code reduction ;-)
>
> I suppose you are just teasing me, but for the record I consider line
> count only a secondary metric.  The reason for combining initialization
> with declaration is to reduce by one the number of times that the reader
> has to think about that variable when analyzing the code.
> ...
> I really wish we could mix declarations with statements because I think
> it is a big help to readability.

Unfortunately, I think we are in violent disagreement.

A variable declaration block with initializations on only some but
not all variables is extremely annoying.  If none of the variable
declaration has initialization (or initialization to trivial values
that do not depend on the logic flow), and the first statement is
separated from the decl block, then I do not have to read the decl
part when reading the code/logic *at all* (the compiler will find
missing variables, variables declared as a wrong type, etc.).

In other words, a trivial initialization at the beginning of the
block, if the logic flow only sometimes makes assignment to the
variable, is perfectly fine, e.g.

	const char *string = NULL;

	if (...) {
        	string = ...
	}                

But I would wish people stop doing this:

	const char *string = strchrnul(name, ':');

	... the real logic of the block that uses string follows ...

and instead say

	const char *string;

	string = strchrnul(name, ':');
	... the real logic of the block that uses string follows ...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04 21:05 ` Junio C Hamano
  2014-03-04 22:24   ` Michael Haggerty
@ 2014-03-05  4:38   ` David Kastrup
  2014-03-05 18:40     ` Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: David Kastrup @ 2014-03-05  4:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

Junio C Hamano <gitster@pobox.com> writes:

> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>>  	while (*path) {
>> -		const char *slash;
>>  		struct cache_tree_sub *sub;
>> +		const char *slash = strchr(path, '/');
>>  
>> -		slash = strchr(path, '/');
>>  		if (!slash)
>>  			slash = path + strlen(path);
>
> Isn't the above a strchrnul()?

Yes.  I realized that previously, but since it's a GNU extension rather
than part of the C standards, I discarded that idea.  Calling

    git grep strchrnul

shows, however, that it _is_ used plentifully already.

That would, indeed, favor the current proposal but with strchnul.

Still worth thinking about whether there is no better name than slash
for something that indicated the end of the current path name segment.

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-04 23:18     ` Junio C Hamano
@ 2014-03-05 17:57       ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2014-03-05 17:57 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Kastrup, git

Junio C Hamano <gitster@pobox.com> writes:

> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> I really wish we could mix declarations with statements because I think
>> it is a big help to readability.
> ...
> Unfortunately, I think we are in violent disagreement.

After re-reading the above, I realize that my statement may have
sounded a lot stronger than I intended it to.  If our codebase
allowed decl-after-stmt, that would change the equation and a
different style might help readability somewhat.

If decl-after-stmt were allowed, the group of lines that declare
variables at the beginning before the real logic begins do not even
have to be there, and "if some variables have initialization that
involve program logic that need to be read carefully, the
declaration at the beginning no longer can be coasted over as
boilerplate" complaint disappears.  The entire block can become the
logic, declaring variables as necessary at the point they are
required, without having to have a separate decl at the beginning.

Note that I am not advocating to allow decl-after-stmt; I do not
think the imagined readability benefit is worth it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] cache_tree_find(): remove redundant checks
  2014-03-05  4:38   ` David Kastrup
@ 2014-03-05 18:40     ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2014-03-05 18:40 UTC (permalink / raw)
  To: David Kastrup; +Cc: Michael Haggerty, git

David Kastrup <dak@gnu.org> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>
>>>  	while (*path) {
>>> -		const char *slash;
>>>  		struct cache_tree_sub *sub;
>>> +		const char *slash = strchr(path, '/');
>>>  
>>> -		slash = strchr(path, '/');
>>>  		if (!slash)
>>>  			slash = path + strlen(path);
>>
>> Isn't the above a strchrnul()?
>
> Yes.  I realized that previously, but since it's a GNU extension rather
> than part of the C standards, I discarded that idea.  Calling
>
>     git grep strchrnul
>
> shows, however, that it _is_ used plentifully already.

Yes, we have a fallback definition in compat-util, I think.

> Still worth thinking about whether there is no better name than slash
> for something that indicated the end of the current path name segment.

end_of_path_component?  Sounds a bit too long ;-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-03-05 18:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-04  8:31 [PATCH v2] cache_tree_find(): remove redundant checks Michael Haggerty
2014-03-04  9:40 ` David Kastrup
2014-03-04 10:22   ` Michael Haggerty
2014-03-04 10:34     ` David Kastrup
2014-03-04 21:05 ` Junio C Hamano
2014-03-04 22:24   ` Michael Haggerty
2014-03-04 23:18     ` Junio C Hamano
2014-03-05 17:57       ` Junio C Hamano
2014-03-05  4:38   ` David Kastrup
2014-03-05 18:40     ` Junio C Hamano
2014-03-04 21:11 ` [microproject idea] Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.