All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>,
	git@vger.kernel.org
Subject: Re: [PATCH v5 2/4] read-cache: add strcmp_offset function
Date: Thu, 6 Apr 2017 11:58:10 -0400	[thread overview]
Message-ID: <3f88c428-6d04-6ef5-f7bc-952daad2703a@jeffhostetler.com> (raw)
In-Reply-To: <20170406141912.14536-1-szeder.dev@gmail.com>



On 4/6/2017 10:19 AM, SZEDER Gábor wrote:
>> Add strcmp_offset() function to also return the offset of the
>> first change.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>  cache.h      |  1 +
>>  read-cache.c | 29 +++++++++++++++++++++++++++++
>>  2 files changed, 30 insertions(+)
>>
>> diff --git a/cache.h b/cache.h
>> index 80b6372..4d82490 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -574,6 +574,7 @@ extern int write_locked_index(struct index_state *, struct lock_file *lock, unsi
>>  extern int discard_index(struct index_state *);
>>  extern int unmerged_index(const struct index_state *);
>>  extern int verify_path(const char *path);
>> +extern int strcmp_offset(const char *s1_in, const char *s2_in, int *first_change);
>>  extern int index_dir_exists(struct index_state *istate, const char *name, int namelen);
>>  extern void adjust_dirname_case(struct index_state *istate, char *name);
>>  extern struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase);
>> diff --git a/read-cache.c b/read-cache.c
>> index 9054369..b3fc77d 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>
> read-cache.c is probably not the best place for such a general string
> utility function, though I'm not sure where its most apropriate place
> would be.

Yeah, I looked and didn't see any place so I left it here
near it's first usage for now.

>
>> @@ -887,6 +887,35 @@ static int has_file_name(struct index_state *istate,
>>  	return retval;
>>  }
>>
>> +
>> +/*
>> + * Like strcmp(), but also return the offset of the first change.
>
> This comment doesn't tell us what will happen with the offset
> parameter if it is NULL or if the two strings are equal.  I don't
> really care about the safety check for NULL: a callsite not interested
> in the offset should just call strcmp() instead.  I'm fine either way.
> However, setting it to 0 when the strings are equal doesn't seem quite
> right, does it.

Good catch. I'll add a null check.

When the strings are equal, I'm not sure any one value is
better than another.  I could leave it undefined or set it
to the length.  That might be more useful since we have it on
hand and it might save the caller a strlen() later.

>
>> + */
>> +int strcmp_offset(const char *s1_in, const char *s2_in, int *first_change)
>> +{
>> +	const unsigned char *s1 = (const unsigned char *)s1_in;
>> +	const unsigned char *s2 = (const unsigned char *)s2_in;
>> +	int diff = 0;
>> +	int k;
>> +
>> +	*first_change = 0;
>> +	for (k=0; s1[k]; k++)
>> +		if ((diff = (s1[k] - s2[k])))
>> +			goto found_it;
>> +	if (!s2[k])
>> +		return 0;
>> +	diff = -1;
>> +
>> +found_it:
>> +	*first_change = k;
>> +	if (diff > 0)
>> +		return 1;
>> +	else if (diff < 0)
>> +		return -1;
>> +	else
>> +		return 0;
>> +}
>
> The implementation seems to me a bit long-winded, with more
> conditional statements than necessary.  How about something like this
> instead?  Much shorter, no goto and only half the number of
> conditionals to reason about, leaves *first_change untouched when the
> two strings are equal, and deals with it being NULL.
>
> int strcmp_offset(const char *s1, const char *s2, int *first_change)
> {
> 	int k;
>
> 	for (k = 0; s1[k] == s2[k]; k++)
> 		if (s1[k] == '\0')
> 			return 0;
>
> 	if (first_change)
> 		*first_change = k;
> 	return ((unsigned char *)s1)[k] - ((unsigned char *)s2)[k];
> }
>

Let me give that a try.  Thanks!
Jeff


  reply	other threads:[~2017-04-06 15:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-05 17:38 [PATCH v5 0/4] read-cache: speed up add_index_entry git
2017-04-05 17:38 ` [PATCH v5 1/4] p0004-read-tree: perf test to time read-tree git
2017-04-06 20:20   ` René Scharfe
2017-04-06 20:41     ` Jeff Hostetler
2017-04-05 17:38 ` [PATCH v5 2/4] read-cache: add strcmp_offset function git
2017-04-06 14:19   ` SZEDER Gábor
2017-04-06 15:58     ` Jeff Hostetler [this message]
2017-04-05 17:38 ` [PATCH v5 3/4] test-strcmp-offset: created test for strcmp_offset git
2017-04-05 22:47   ` SZEDER Gábor
2017-04-06  8:21     ` Johannes Schindelin
2017-04-06 14:25       ` Jeff Hostetler
2017-04-06 20:20   ` René Scharfe
2017-04-06 20:42     ` Jeff Hostetler
2017-04-05 17:38 ` [PATCH v5 4/4] read-cache: speed up add_index_entry during checkout git
2017-04-05 22:54   ` SZEDER Gábor
2017-04-06 14:05     ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3f88c428-6d04-6ef5-f7bc-952daad2703a@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.