All of lore.kernel.org
 help / color / mirror / Atom feed
From: William Duclot <william.duclot@ensimag.grenoble-inp.fr>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git@vger.kernel.org,
	simon rabourg <simon.rabourg@ensimag.grenoble-inp.fr>,
	francois beutin <francois.beutin@ensimag.grenoble-inp.fr>,
	antoine queru <antoine.queru@ensimag.grenoble-inp.fr>,
	matthieu moy <matthieu.moy@grenoble-inp.fr>,
	mhagger@alum.mit.edu
Subject: Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Date: Mon, 30 May 2016 15:20:53 +0200 (CEST)	[thread overview]
Message-ID: <953965621.202433.1464614453377.JavaMail.zimbra@ensimag.grenoble-inp.fr> (raw)
In-Reply-To: <alpine.DEB.2.20.1605301326530.4449@virtualbox>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> On Mon, 30 May 2016, William Duclot wrote:
> 
>> It is unfortunate that it is currently impossible to use a strbuf
>> without doing a memory allocation. So code like
>> 
>> void f()
>> {
>>     char path[PATH_MAX];
>>     ...
>> }
>> 
>> typically gets turned into either
>> 
>> void f()
>> {
>>     struct strbuf path;
>>     strbuf_add(&path, ...); <-- does a malloc
>>     ...
>>     strbuf_release(&path);  <-- does a free
>> }
>> 
>> which costs extra memory allocations, or
>> 
>> void f()
>> {
>>     static struct strbuf path;
>>     strbuf_add(&path, ...);
>>     ...
>>     strbuf_setlen(&path, 0);
>> }
>> 
>> which, by using a static variable, avoids most of the malloc/free
>> overhead, but makes the function unsafe to use recursively or from
>> multiple threads. Those limitations prevent strbuf to be used in
>> performance-critical operations.
> 
> This description is nice and verbose, but maybe something like this would
> introduce the subject in a quicker manner?
> 
> 	When working e.g. with file paths or with dates, strbuf's
> 	malloc()/free() dance of strbufs can be easily avoided: as
> 	a sensible initial buffer size is already known, it can be
> 	allocated on the heap.

strbuf already allow to indicate a sensible initial buffer size thanks
to strbuf_init() second parameter. The main perk of pre-allocation
is to use stack-allocated memory, and not heap-allocated :)
Unless I misunderstood your message?

>> diff --git a/strbuf.c b/strbuf.c
>> index 1ba600b..527b986 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -1,6 +1,14 @@
>>  #include "cache.h"
>>  #include "refs.h"
>>  #include "utf8.h"
>> +#include <sys/param.h>
> 
> Why?

For the MAX macro. It may be a teeny tiny overkill

>> +/**
>> + * Flags
>> + * --------------
>> + */
>> +#define STRBUF_OWNS_MEMORY 1
>> +#define STRBUF_FIXED_MEMORY (1 << 1)
> 
> From reading the commit message, I expected STRBUF_OWNS_MEMORY.
> STRBUF_FIXED_MEMORY still needs to be explained.

Yes, that seems right

>> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
>>  
>>  void strbuf_init(struct strbuf *sb, size_t hint)
>>  {
>> +	sb->flags = 0;
>>  	sb->alloc = sb->len = 0;
>>  	sb->buf = strbuf_slopbuf;
>>  	if (hint)
>>  		strbuf_grow(sb, hint);
>>  }
>>  
>> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
>> +			      size_t path_buf_len, size_t alloc_len)
>> +{
>> +	if (!path_buf)
>> +		die("you try to use a NULL buffer to initialize a strbuf");
>> +
>> +	strbuf_init(sb, 0);
>> +	strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
>> +	sb->flags &= ~STRBUF_OWNS_MEMORY;
>> +	sb->flags &= ~STRBUF_FIXED_MEMORY;
> 
> Shorter: sb->flags &= ~(STRBUF_OWNS_MEMORY | STRBUF_FIXED_MEMORY);

Okay with me

>> +}
>> +
>> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
>> +		       size_t path_buf_len, size_t alloc_len)
>> +{
>> +	strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
>> +	sb->flags |= STRBUF_FIXED_MEMORY;
>> +}
> 
> Rather than letting strbuf_wrap_preallocated() set sb->flags &=
> ~FIXED_MEMORY only to revert that decision right away, a static function
> could be called by both strbuf_wrap_preallocated() and
> strbuf_wrap_fixed().

Makes sense

>>  void strbuf_release(struct strbuf *sb)
>>  {
>>  	if (sb->alloc) {
>> -		free(sb->buf);
>> +		if (sb->flags & STRBUF_OWNS_MEMORY)
>> +			free(sb->buf);
>>  		strbuf_init(sb, 0);
>>  	}
> 
> Should we not reset the flags here, too?

Well, strbuf_init() reset the flags. The only way to have !sb->alloc
is that strbuf has been initialized and never used (even alloc_grow(0)
set sb->alloc=1), so sb==STRBUF_INIT, so the flags don't have to be reset 

>> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>>  {
>>  	char *res;
>>  	strbuf_grow(sb, 0);
>> -	res = sb->buf;
>> +	if (sb->flags & STRBUF_OWNS_MEMORY)
>> +		res = sb->buf;
>> +	else
>> +		res = xmemdupz(sb->buf, sb->alloc - 1);
> 
> This looks like a usage to be avoided: if we plan to detach the buffer,
> anyway, there is no good reason to allocate it on the heap first. I would
> at least issue a warning here.

strbuf_detach() guarantees to return heap-allocated memory, that the caller
can use however he want and that he'll have to free. If the strbuf doesn't
own the memory, it cannot return the buf attribute directly because:
- The memory belong to someone else (so the caller can't use it however
he want)
- The caller can't have the responsibility to free (because the memory
belong to someone else)
- The memory may not even be heap-allocated

>> @@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t
>> len, size_t alloc)
>>  	sb->buf   = buf;
>>  	sb->len   = len;
>>  	sb->alloc = alloc;
>> +	sb->flags |= STRBUF_OWNS_MEMORY;
>> +	sb->flags &= ~STRBUF_FIXED_MEMORY;
>>  	strbuf_grow(sb, 0);
>>  	sb->buf[sb->len] = '\0';
>>  }
>> @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>>  	if (unsigned_add_overflows(extra, 1) ||
>>  	    unsigned_add_overflows(sb->len, extra + 1))
>>  		die("you want to use way too much memory");
>> -	if (new_buf)
>> -		sb->buf = NULL;
>> -	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>> +	if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > sb->alloc)
>> +		die("you try to make a string overflow the buffer of a fixed strbuf");
> 
> We try to avoid running over 80 columns/row. This message could be
> more to the point: cannot grow fixed string

What is fixed is the buffer, not the string. I'll shrink that under 80 columns
   
>>  extern char strbuf_slopbuf[];
>> -#define STRBUF_INIT  { 0, 0, strbuf_slopbuf }
>> +#define STRBUF_INIT  { 0, 0, 0, strbuf_slopbuf }
> 
> If I am not mistaken, to preserve the existing behavior the initial flags
> should be 1 (own memory).

strbuf_slopbuf is a buffer that doesn't belong to any strbuf (because it's
shared between all just-initialized strbufs). If STRBUF_OWNS_MEMORY was
set, strbuf_slopbuf could be freed (which is impossible because it is
shared AND even more because it is stack-allocated) 

> BTW this demonstrates that it may not be a good idea to declare the
> "flags" field globally but then make the actual flags private.

I'm not sure what you mean here?

> Also: similar use cases in Git used :1 flags (see e.g. the "configured"
> field in credential.h).

I think that keeping an obscure `flags` attribute may be better, as they
should only be useful for internal operations and the user shouldn't mess
with it. Keeping it a `private` attribute, in a way

  reply	other threads:[~2016-05-30 13:13 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-30 10:36 [PATCH 0/2] strbuf: improve API William Duclot
2016-05-30 10:36 ` [PATCH 1/2] strbuf: add tests William Duclot
2016-05-30 11:26   ` Johannes Schindelin
2016-05-30 13:42     ` Simon Rabourg
2016-05-30 11:56   ` Matthieu Moy
2016-05-31  2:04   ` Michael Haggerty
2016-05-31  9:48     ` Simon Rabourg
2016-05-30 10:36 ` [PATCH 2/2] strbuf: allow to use preallocated memory William Duclot
2016-05-30 12:13   ` Johannes Schindelin
2016-05-30 13:20     ` William Duclot [this message]
2016-05-31  6:21       ` Johannes Schindelin
2016-05-31  3:05     ` Michael Haggerty
2016-05-31  6:41       ` Johannes Schindelin
2016-05-31  8:25         ` Michael Haggerty
2016-05-30 12:52   ` Matthieu Moy
2016-05-30 14:15     ` William Duclot
2016-05-30 14:34       ` Matthieu Moy
2016-05-30 15:16         ` William Duclot
2016-05-31  4:05     ` Michael Haggerty
2016-05-31 15:59       ` William Duclot
2016-06-03 14:04       ` William Duclot
2016-05-30 21:56   ` Mike Hommey
2016-05-30 22:46     ` William Duclot
2016-05-30 22:50       ` Mike Hommey
2016-05-31  6:34   ` Junio C Hamano
2016-05-31 15:45     ` William
2016-05-31 15:54       ` Matthieu Moy
2016-05-31 16:08         ` William Duclot
2016-05-30 11:32 ` [PATCH 0/2] strbuf: improve API Remi Galan Alfonso
2016-06-01  7:42   ` Jeff King
2016-06-01 19:50     ` David Turner
2016-06-01 20:09       ` Jeff King
2016-06-01 20:22         ` David Turner
2016-06-01 21:07     ` Jeff King
2016-06-02 11:11       ` Michael Haggerty
2016-06-02 12:58         ` Matthieu Moy
2016-06-02 14:22           ` William Duclot
2016-06-24 17:20         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=953965621.202433.1464614453377.JavaMail.zimbra@ensimag.grenoble-inp.fr \
    --to=william.duclot@ensimag.grenoble-inp.fr \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=antoine.queru@ensimag.grenoble-inp.fr \
    --cc=francois.beutin@ensimag.grenoble-inp.fr \
    --cc=git@vger.kernel.org \
    --cc=matthieu.moy@grenoble-inp.fr \
    --cc=mhagger@alum.mit.edu \
    --cc=simon.rabourg@ensimag.grenoble-inp.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.