All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
To: William Duclot <william.duclot@ensimag.grenoble-inp.fr>
Cc: git@vger.kernel.org, simon.rabourg@ensimag.grenoble-inp.fr,
	francois.beutin@ensimag.grenoble-inp.fr,
	antoine.queru@ensimag.grenoble-inp.fr, mhagger@alum.mit.edu
Subject: Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Date: Mon, 30 May 2016 14:52:41 +0200	[thread overview]
Message-ID: <vpqlh2remhy.fsf@anie.imag.fr> (raw)
In-Reply-To: <20160530103642.7213-3-william.duclot@ensimag.grenoble-inp.fr> (William Duclot's message of "Mon, 30 May 2016 12:36:42 +0200")

William Duclot <william.duclot@ensimag.grenoble-inp.fr> writes:

> multiple threads. Those limitations prevent strbuf to be used in

prevent strbuf from being used ...

> API ENHANCEMENT
> ---------------
>
> All functions of the API can still be reliably called without
> knowledge of the initialization (normal/preallocated/fixed) with the
> exception that strbuf_grow() may die() if the string try to overflow a

s/try/tries/

> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
>  
>  void strbuf_init(struct strbuf *sb, size_t hint)
>  {
> +	sb->flags = 0;
>  	sb->alloc = sb->len = 0;
>  	sb->buf = strbuf_slopbuf;
>  	if (hint)
>  		strbuf_grow(sb, hint);
>  }

If you set flags = 0 here, existing callers will have all flags off,
including OWNS_MEMORY.

I *think* this is OK, as sb->buf is currently pointing to
strbuf_slopbuf, which the the strbuf doesn't own. But that is too subtle
to go without an explanatory comment IMHO.

Also, doesn't this make the "new_buf" case useless in strbuf_grow?

With your patch, the code looks like:

void strbuf_grow(struct strbuf *sb, size_t extra)
{
	int new_buf = !sb->alloc;
...
	if (sb->flags & STRBUF_OWNS_MEMORY) {
		if (new_buf) // <---------------------------------------- (1)
			sb->buf = NULL;
		ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
	} else {
		/*
		 * The strbuf doesn't own the buffer: to avoid to realloc it,
		 * the strbuf needs to use a new buffer without freeing the old
		 */
		if (sb->len + extra + 1 > sb->alloc) {
			size_t new_alloc = MAX(sb->len + extra + 1, alloc_nr(sb->alloc));
			char *buf = xmalloc(new_alloc);
			memcpy(buf, sb->buf, sb->alloc);
			sb->buf = buf;
			sb->alloc = new_alloc;
			sb->flags |= STRBUF_OWNS_MEMORY;
		}
	}

	if (new_buf) // <---------------------------------------- (2)
		sb->buf[0] = '\0';
}

I think (1) is now dead code, since sb->alloc == 0 implies that
STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just
memcpy-ed the existing '\0' into the buffer.

> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
> +			      size_t path_buf_len, size_t alloc_len)
> +{
> +	if (!path_buf)
> +		die("you try to use a NULL buffer to initialize a strbuf");
> +
> +	strbuf_init(sb, 0);
> +	strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
> +	sb->flags &= ~STRBUF_OWNS_MEMORY;
> +	sb->flags &= ~STRBUF_FIXED_MEMORY;
> +}

strbuf_wrap_preallocated seem very close to strbuf_attach. I'd rather
see a symmetric code sharing like

void strbuf_attach_internal(struct strbuf *sb, ..., unsigned int flags)

and then strbuf_attach() and strbuf_wrap_preallocated() become
straightforward wrappers around it.

This would avoid setting and then unsetting STRBUF_OWNS_MEMORY (the
performance impact is probably small, but it looks weird to set the flag
and then unset it right away).

After your patch, there are differences between
strbuf_wrap_preallocated() which I think are inconsistencies:

* strbuf_attach() does not check for NULL buffer, but it doesn't accept
  them either if I read correctly. It would make sense to add the check
  to strbuf_attach(), but it's weird to have the performance-critical
  oriented function do the expensive stuff that the
  non-performance-critical one doesn't.

* strbuf_attach() calls strbuf_release(), which allows reusing an
  existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which
  would override silently any previous content. I think strbuf_attach()
  does the right thing here.

  (And I'm probably the one who misguided you to do this)

  In any case, you probably want to include calls to strbuf_attach() and
  strbuf_wrap_*() functions on existing non-empty strbufs.

> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
> +		       size_t path_buf_len, size_t alloc_len)
> +{
> +	strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
> +	sb->flags |= STRBUF_FIXED_MEMORY;
> +}

And this could become a 3rd caller of strbuf_attach_internal().

> @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>  	if (unsigned_add_overflows(extra, 1) ||
>  	    unsigned_add_overflows(sb->len, extra + 1))
>  		die("you want to use way too much memory");
> -	if (new_buf)
> -		sb->buf = NULL;
> -	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> +	if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > sb->alloc)
> +		die("you try to make a string overflow the buffer of a fixed strbuf");
> +
> +	/*
> +	 * ALLOC_GROW may do a realloc() if needed, so we must not use it on
> +	 * a buffer the strbuf doesn't own
> +	 */
> +	if (sb->flags & STRBUF_OWNS_MEMORY) {
> +		if (new_buf)
> +			sb->buf = NULL;
> +		ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> +	} else {
> +		/*
> +		 * The strbuf doesn't own the buffer: to avoid to realloc it,
> +		 * the strbuf needs to use a new buffer without freeing the old
> +		 */
> +		if (sb->len + extra + 1 > sb->alloc) {
> +			size_t new_alloc = MAX(sb->len + extra + 1, alloc_nr(sb->alloc));
> +			char *buf = xmalloc(new_alloc);
> +			memcpy(buf, sb->buf, sb->alloc);

I think you want to memcpy only sb->len + 1 bytes. Here, you're
memcpy-ing the allocated-but-not-initialized part of the array.

xmemdupz can probably simplify the code too (either you include the '\0'
in what memcpy copies, or you let xmemdupz add it).

> +/**
> + * Allow the caller to give a pre-allocated piece of memory for the strbuf
> + * to use. It is possible then to strbuf_grow() the string past the size of the
> + * pre-allocated buffer: a new buffer will be allocated. The pre-allocated

To make it clearer: "a new buffer will then be allocated"?

> +/**
> + * Allow the caller to give a pre-allocated piece of memory for the strbuf
> + * to use and indicate that the strbuf must use exclusively this buffer,
> + * never realloc() it or allocate a new one. It means that the string can
> + * be manipulated but cannot overflow the pre-allocated buffer. The
> + * pre-allocated buffer will never be freed.
> + */

Perhaps say explicitly that although the allocated buffer has a fixed
size, the string itself can grow as long as it does not overflow the
buffer?

> @@ -91,6 +116,8 @@ extern void strbuf_release(struct strbuf *);
>   * Detach the string from the strbuf and returns it; you now own the
>   * storage the string occupies and it is your responsibility from then on
>   * to release it with `free(3)` when you are done with it.
> + * Must allocate a copy of the buffer in case of a preallocated/fixed buffer.
> + * Performance-critical operations have to be aware of this.

Better than just warn about performance, you can give the alternative.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

  parent reply	other threads:[~2016-05-30 12:52 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-30 10:36 [PATCH 0/2] strbuf: improve API William Duclot
2016-05-30 10:36 ` [PATCH 1/2] strbuf: add tests William Duclot
2016-05-30 11:26   ` Johannes Schindelin
2016-05-30 13:42     ` Simon Rabourg
2016-05-30 11:56   ` Matthieu Moy
2016-05-31  2:04   ` Michael Haggerty
2016-05-31  9:48     ` Simon Rabourg
2016-05-30 10:36 ` [PATCH 2/2] strbuf: allow to use preallocated memory William Duclot
2016-05-30 12:13   ` Johannes Schindelin
2016-05-30 13:20     ` William Duclot
2016-05-31  6:21       ` Johannes Schindelin
2016-05-31  3:05     ` Michael Haggerty
2016-05-31  6:41       ` Johannes Schindelin
2016-05-31  8:25         ` Michael Haggerty
2016-05-30 12:52   ` Matthieu Moy [this message]
2016-05-30 14:15     ` William Duclot
2016-05-30 14:34       ` Matthieu Moy
2016-05-30 15:16         ` William Duclot
2016-05-31  4:05     ` Michael Haggerty
2016-05-31 15:59       ` William Duclot
2016-06-03 14:04       ` William Duclot
2016-05-30 21:56   ` Mike Hommey
2016-05-30 22:46     ` William Duclot
2016-05-30 22:50       ` Mike Hommey
2016-05-31  6:34   ` Junio C Hamano
2016-05-31 15:45     ` William
2016-05-31 15:54       ` Matthieu Moy
2016-05-31 16:08         ` William Duclot
2016-05-30 11:32 ` [PATCH 0/2] strbuf: improve API Remi Galan Alfonso
2016-06-01  7:42   ` Jeff King
2016-06-01 19:50     ` David Turner
2016-06-01 20:09       ` Jeff King
2016-06-01 20:22         ` David Turner
2016-06-01 21:07     ` Jeff King
2016-06-02 11:11       ` Michael Haggerty
2016-06-02 12:58         ` Matthieu Moy
2016-06-02 14:22           ` William Duclot
2016-06-24 17:20         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=vpqlh2remhy.fsf@anie.imag.fr \
    --to=matthieu.moy@grenoble-inp.fr \
    --cc=antoine.queru@ensimag.grenoble-inp.fr \
    --cc=francois.beutin@ensimag.grenoble-inp.fr \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    --cc=simon.rabourg@ensimag.grenoble-inp.fr \
    --cc=william.duclot@ensimag.grenoble-inp.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.