All of lore.kernel.org
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>, Eric Sunshine <sunshine@sunshineco.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v2] strbuf: add and use strbuf_insertstr()
Date: Tue, 11 Feb 2020 17:18:02 +0100	[thread overview]
Message-ID: <a5622196-1978-2fe5-144f-99edc5516dd6@web.de> (raw)
In-Reply-To: <20200210234427.GA632160@coredump.intra.peff.net>

Am 11.02.20 um 00:44 schrieb Jeff King:
> On Sun, Feb 09, 2020 at 12:36:22PM -0500, Eric Sunshine wrote:
>
>> On Sun, Feb 9, 2020 at 8:45 AM René Scharfe <l.s.r@web.de> wrote:
>>> Add a function for inserting a C string into a strbuf.  Use it
>>> throughout the source to get rid of magic string length constants and
>>> explicit strlen() calls.
>>>
>>> Like strbuf_addstr(), implement it as an inline function to avoid the
>>> implicit strlen() calls to cause runtime overhead.
>>>
>>> Signed-off-by: René Scharfe <l.s.r@web.de>
>>> ---
>>> diff --git a/mailinfo.c b/mailinfo.c
>>> @@ -570,7 +570,7 @@ static int check_header(struct mailinfo *mi,
>>>                 len = strlen("Content-Type: ");
>>>                 strbuf_add(&sb, line->buf + len, line->len - len);
>>>                 decode_header(mi, &sb);
>>> -               strbuf_insert(&sb, 0, "Content-Type: ", len);
>>> +               strbuf_insertstr(&sb, 0, "Content-Type: ");
>>>                 handle_content_type(mi, &sb);
>>
>> Meh. We've already computed the length of "Content-Type: " a few lines
>> earlier, so taking advantage of that value when inserting the string
>> literal is perfectly sensible. Thus, I'm not convinced that this
>> change is an improvement.
>
> I had a similar thought. I kind of wonder if all of these "len" bits
> (and their repeated strings) could go away if cmp_header() just used
> skip_iprefix() under the hood and had a pointer out-parameter.
>
> Something like the (largely untested) patch below. That would also make
> it easy to support arbitrary amounts of whitespace after the header,
> which I think are allowed by the standard (whereas now we'd parse
> "Content-type:    text/plain" as "    text/plain", which is silly).
>
> Worth doing?

Sure.

> ---
> diff --git a/mailinfo.c b/mailinfo.c
> index b395adbdf2..bbb5b429f8 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -346,11 +346,16 @@ static const char *header[MAX_HDR_PARSED] = {
>  	"From","Subject","Date",
>  };
>
> -static inline int cmp_header(const struct strbuf *line, const char *hdr)
> +static inline int cmp_header(const struct strbuf *line, const char *hdr,
> +			     const char **outval)
>  {
> -	int len = strlen(hdr);
> -	return !strncasecmp(line->buf, hdr, len) && line->len > len &&
> -			line->buf[len] == ':' && isspace(line->buf[len + 1]);
> +	const char *val;
> +	if (!skip_iprefix(line->buf, hdr, &val) ||
> +	    *val++ != ':' ||
> +	    !isspace(*val++))
> +		return 0;
> +	*outval = val;
> +	return 1;
>  }

And you could rename it to skip_header() to fix the issue that its name
starts with cmp but its return value is the inverse of a cmp-style
function.

And it could take a char pointer instead of a strbuf, to reduce its
dependencies and make it more widely useful -- but that might also be
a case of YAGNI.

>
>  static int is_format_patch_separator(const char *line, int len)
> @@ -547,17 +552,17 @@ static int check_header(struct mailinfo *mi,
>  			const struct strbuf *line,
>  			struct strbuf *hdr_data[], int overwrite)
>  {
> -	int i, ret = 0, len;
> +	int i, ret = 0;
>  	struct strbuf sb = STRBUF_INIT;
> +	const char *val;
>
>  	/* search for the interesting parts */
>  	for (i = 0; header[i]; i++) {
> -		int len = strlen(header[i]);
> -		if ((!hdr_data[i] || overwrite) && cmp_header(line, header[i])) {
> +		if ((!hdr_data[i] || overwrite) && cmp_header(line, header[i], &val)) {
>  			/* Unwrap inline B and Q encoding, and optionally
>  			 * normalize the meta information to utf8.
>  			 */
> -			strbuf_add(&sb, line->buf + len + 2, line->len - len - 2);
> +			strbuf_addstr(&sb, val);

That assumes the header value never contains NULs.  Valid?

>  			decode_header(mi, &sb);
>  			handle_header(&hdr_data[i], &sb);
>  			ret = 1;
> @@ -566,26 +571,22 @@ static int check_header(struct mailinfo *mi,
>  	}
>
>  	/* Content stuff */
> -	if (cmp_header(line, "Content-Type")) {
> -		len = strlen("Content-Type: ");
> -		strbuf_add(&sb, line->buf + len, line->len - len);
> +	if (cmp_header(line, "Content-Type", &val)) {
> +		strbuf_addstr(&sb, val);
>  		decode_header(mi, &sb);
> -		strbuf_insert(&sb, 0, "Content-Type: ", len);
>  		handle_content_type(mi, &sb);
>  		ret = 1;
>  		goto check_header_out;
>  	}
> -	if (cmp_header(line, "Content-Transfer-Encoding")) {
> -		len = strlen("Content-Transfer-Encoding: ");
> -		strbuf_add(&sb, line->buf + len, line->len - len);
> +	if (cmp_header(line, "Content-Transfer-Encoding", &val)) {
> +		strbuf_addstr(&sb, val);
>  		decode_header(mi, &sb);
>  		handle_content_transfer_encoding(mi, &sb);
>  		ret = 1;
>  		goto check_header_out;
>  	}
> -	if (cmp_header(line, "Message-Id")) {
> -		len = strlen("Message-Id: ");
> -		strbuf_add(&sb, line->buf + len, line->len - len);
> +	if (cmp_header(line, "Message-Id", &val)) {
> +		strbuf_addstr(&sb, val);
>  		decode_header(mi, &sb);
>  		if (mi->add_message_id)
>  			mi->message_id = strbuf_detach(&sb, NULL);

The repeated sequence cmp_header()+strbuf_add{,str}()+decode_header()
makes me itchy.

> @@ -607,8 +608,9 @@ static int is_inbody_header(const struct mailinfo *mi,
>  			    const struct strbuf *line)
>  {
>  	int i;
> +	const char *val;
>  	for (i = 0; header[i]; i++)
> -		if (!mi->s_hdr_data[i] && cmp_header(line, header[i]))
> +		if (!mi->s_hdr_data[i] && cmp_header(line, header[i], &val))
>  			return 1;
>  	return 0;
>  }
>


  parent reply	other threads:[~2020-02-11 16:18 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-08 19:56 [PATCH] strbuf: add and use strbuf_insertstr() René Scharfe
2020-02-08 23:08 ` Taylor Blau
2020-02-09 10:23   ` René Scharfe
2020-02-09  0:53 ` Eric Sunshine
2020-02-09 10:23   ` René Scharfe
2020-02-09 13:44 ` [PATCH v2] " René Scharfe
2020-02-09 17:36   ` Eric Sunshine
2020-02-09 18:28     ` René Scharfe
2020-02-09 21:09       ` Eric Sunshine
2020-02-09 23:10       ` Taylor Blau
2020-02-10 23:44     ` Jeff King
2020-02-11 16:17       ` Junio C Hamano
2020-02-11 17:16         ` [PATCH 0/4] some more mailinfo cleanups Jeff King
2020-02-11 17:18           ` [PATCH 1/4] mailinfo: treat header values as C strings Jeff King
2020-02-11 17:26             ` Eric Sunshine
2020-02-11 17:19           ` [PATCH 2/4] mailinfo: simplify parsing of header values Jeff King
2020-02-11 17:19           ` [PATCH 3/4] mailinfo: be more liberal with header whitespace Jeff King
2020-02-11 17:20           ` [PATCH 4/4] mailinfo: factor out some repeated header handling Jeff King
2020-02-11 16:18       ` René Scharfe [this message]
2020-02-11 17:13         ` [PATCH v2] strbuf: add and use strbuf_insertstr() Jeff King
2020-02-10  7:15   ` [PATCH 2/1] mailinfo: don't insert header prefix for handle_content_type() René Scharfe
2020-02-10 17:27     ` Junio C Hamano
2020-02-10 19:55     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a5622196-1978-2fe5-144f-99edc5516dd6@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.