From: Phillip Wood <phillip.wood123@gmail.com>
To: Felipe Contreras <felipe.contreras@gmail.com>,
	ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Christian Couder <christian.couder@gmail.com>,
	Hariom Verma <hariom18599@gmail.com>,
	Karthik Nayak <karthik.188@gmail.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>, Jeff King <peff@peff.net>,
	ZheNing Hu <adlternative@gmail.com>
Subject: Re: [PATCH 1/2] [GSOC] ref-filter: add %(raw) atom
Date: Sat, 29 May 2021 14:23:05 +0100
Message-ID: <13c63e79-27fd-58d5-9a4c-6b58c40ef4b8@gmail.com>
In-Reply-To: <60afca827a28f_265302085b@natae.notmuch>

On 27/05/2021 17:36, Felipe Contreras wrote:
> ZheNing Hu via GitGitGadget wrote:
> [...]
>> +static int memcasecmp(const void *vs1, const void *vs2, size_t n)
> 
> Why void *? We can declare as char *.

If you look at how this function is used you'll see
	int (*cmp_fn)(const void *, const void *, size_t);
	cmp_fn = s->sort_flags & REF_SORTING_ICASE
			? memcasecmp : memcmp;

So the signature must match memcmp(). Both arms of the conditional
expression need to have compatible pointer types, and calling a function
through a pointer of a different type is undefined behavior.
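
A minimal sketch of that constraint, using only standard C. The names
`cmp_fn_t`, `memcasecmp_sketch` and `pick_cmp` are illustrative and are
not taken from the patch. Because the helper has the same parameter and
return types as memcmp(), both arms of the conditional have the same
function-pointer type and the call through the pointer is well defined.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same parameter and return types as memcmp(). */
typedef int (*cmp_fn_t)(const void *, const void *, size_t);

/* Illustrative helper (not the patch's code): case-insensitive memcmp(). */
static int memcasecmp_sketch(const void *vs1, const void *vs2, size_t n)
{
	const unsigned char *s1 = vs1;
	const unsigned char *s2 = vs2;
	size_t i;

	for (i = 0; i < n; i++) {
		int diff = tolower(s1[i]) - tolower(s2[i]);
		if (diff)
			return diff;
	}
	return 0;
}

/* Both arms have type cmp_fn_t, so the conditional expression is valid. */
static cmp_fn_t pick_cmp(int icase)
{
	return icase ? memcasecmp_sketch : memcmp;
}
```

Had the helper been declared with `const char *` parameters, the two
arms would have incompatible types and the compiler would be expected to
complain about the conditional.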

>> +{
>> +	size_t i;
>> +	const char *s1 = (const char *)vs1;
>> +	const char *s2 = (const char *)vs2;
> 
> Then we avoid this extra step.
> 
>> +	for (i = 0; i < n; i++) {
>> +		unsigned char u1 = s1[i];
>> +		unsigned char u2 = s2[i];
> 
> There's no need for two entirely new variables...
> 
>> +		int U1 = toupper (u1);
>> +		int U2 = toupper (u2);
> 
> You can do toupper(s1[i]) directly (BTW, there's an extra space: `foo(x)`,
> not `foo (x)`).
> 
> While we are at it, why keep an extra index from s1, when s1 is never
> used again?
> 
> We can simply advance both s1 and s2:
> 
>    s1++, s2++
> 
>> +		int diff = (UCHAR_MAX <= INT_MAX ? U1 - U2
>> +			: U1 < U2 ? -1 : U2 < U1);
> 
> I don't understand what this is supposed to achieve. Both U1 and U2 are
> integers, pretty low integers actually.
> 
> If we get rid of that complexity we don't even need U1 or U2, just do:
> 
>    diff = toupper(u1) - toupper(u2);
> 
>> +		if (diff)
>> +			return diff;
>> +	}
>> +	return 0;
>> +}
> 
> All we have to do is define the end point, and then we don't need i:
> 
> 	static int memcasecmp(const char *s1, const char *s2, size_t n)
> 	{
> 		const char *end = s1 + n;
> 		for (; s1 < end; s1++, s2++) {
> 			int diff = tolower(*s1) - tolower(*s2);
> 			if (diff)
> 				return diff;
> 		}
> 		return 0;
> 	}
> 
> (and I personally prefer lower to upper)

We should be using tolower(), as that is what POSIX specifies for
strcasecmp() [1], which we are trying to emulate, and there are cases [2] where
	(tolower(c1) == tolower(c2)) != (toupper(c1) == toupper(c2))
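
A related effect is easy to reproduce even in plain ASCII, where it
changes ordering rather than equality: '_' (95) sorts after 'Z' (90) but
before 'a' (97), so the two foldings order '_' and a letter differently.
A small sketch, assuming an ASCII-compatible "C" locale; `fold_cmp` is a
hypothetical helper, not part of the patch:

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: compare n bytes after passing each one through
 * `fold` (tolower or toupper). Bytes are read as unsigned char because
 * the <ctype.h> functions require a value representable as unsigned
 * char (or EOF).
 */
static int fold_cmp(const void *vs1, const void *vs2, size_t n,
		    int (*fold)(int))
{
	const unsigned char *s1 = vs1;
	const unsigned char *s2 = vs2;
	size_t i;

	for (i = 0; i < n; i++) {
		int diff = fold(s1[i]) - fold(s2[i]);
		if (diff)
			return diff;
	}
	return 0;
}
```

Under that assumption, fold_cmp("z", "_", 1, tolower) is positive while
fold_cmp("z", "_", 1, toupper) is negative, so the choice of folding is
visible in the sort order.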

Best Wishes

Phillip

[1] https://pubs.opengroup.org/onlinepubs/9699919799/
[2] https://en.wikipedia.org/wiki/Dotted_and_dotless_I#In_computing

> Check the following resource for a detailed explanation of why my
> modified version is considered good taste:
> 
> https://github.com/felipec/linked-list-good-taste
> 
>>   static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
>>   {
>>   	struct atom_value *va, *vb;
>> @@ -2304,6 +2382,7 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
>>   	int cmp_detached_head = 0;
>>   	cmp_type cmp_type = used_atom[s->atom].type;
>>   	struct strbuf err = STRBUF_INIT;
>> +	size_t slen = 0;
>>   
>>   	if (get_ref_atom_value(a, s->atom, &va, &err))
>>   		die("%s", err.buf);
>> @@ -2317,10 +2396,32 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
>>   	} else if (s->sort_flags & REF_SORTING_VERSION) {
>>   		cmp = versioncmp(va->s, vb->s);
>>   	} else if (cmp_type == FIELD_STR) {
>> -		int (*cmp_fn)(const char *, const char *);
>> -		cmp_fn = s->sort_flags & REF_SORTING_ICASE
>> -			? strcasecmp : strcmp;
>> -		cmp = cmp_fn(va->s, vb->s);
>> +		if (va->s_size == ATOM_VALUE_S_SIZE_INIT &&
>> +		    vb->s_size == ATOM_VALUE_S_SIZE_INIT) {
>> +			int (*cmp_fn)(const char *, const char *);
>> +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
>> +				? strcasecmp : strcmp;
>> +			cmp = cmp_fn(va->s, vb->s);
>> +		} else {
>> +			int (*cmp_fn)(const void *, const void *, size_t);
>> +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
>> +				? memcasecmp : memcmp;
>> +
>> +			if (va->s_size != ATOM_VALUE_S_SIZE_INIT &&
>> +			    vb->s_size != ATOM_VALUE_S_SIZE_INIT) {
>> +				cmp = cmp_fn(va->s, vb->s, va->s_size > vb->s_size ?
>> +				       vb->s_size : va->s_size);
>> +			} else if (va->s_size == ATOM_VALUE_S_SIZE_INIT) {
>> +				slen = strlen(va->s);
>> +				cmp = cmp_fn(va->s, vb->s, slen > vb->s_size ?
>> +					     vb->s_size : slen);
>> +			} else {
>> +				slen = strlen(vb->s);
>> +				cmp = cmp_fn(va->s, vb->s, slen > va->s_size ?
>> +					     va->s_size : slen);
>> +			}
>> +			cmp = cmp ? cmp : va->s_size - vb->s_size;
>> +		}
> 
> This hurts my eyes. I think the complexity of this chunk warrants a
> separate function. Then the logic would be easier to see.
> 
> Cheers.
> 
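
One possible shape for such a helper, as a sketch only. `compare_sized`,
`cmp_fn_t` and the parameter names are hypothetical, and the sketch
assumes both lengths are already known (a caller would substitute
strlen() for a value that has no explicit size). It compares the common
prefix and breaks ties on length.

```c
#include <stddef.h>
#include <string.h>

/* Same parameter and return types as memcmp(). */
typedef int (*cmp_fn_t)(const void *, const void *, size_t);

/*
 * Hypothetical helper: compare two buffers of known length.
 * The common prefix is compared with `cmp`; if the prefixes are
 * equal, the shorter buffer sorts first.
 */
static int compare_sized(const void *a, size_t alen,
			 const void *b, size_t blen, cmp_fn_t cmp)
{
	size_t n = alen < blen ? alen : blen;
	int ret = cmp(a, b, n);

	if (ret)
		return ret;
	if (alen == blen)
		return 0;
	return alen < blen ? -1 : 1;
}
```

Returning -1, 0 or 1 for the length tie-break avoids subtracting two
sizes, which could wrap around if the sizes are unsigned.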

