All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: William Duclot <william.duclot@ensimag.grenoble-inp.fr>,
	git@vger.kernel.org, simon.rabourg@ensimag.grenoble-inp.fr,
	francois.beutin@ensimag.grenoble-inp.fr,
	antoine.queru@ensimag.grenoble-inp.fr,
	matthieu.moy@grenoble-inp.fr
Subject: Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Date: Tue, 31 May 2016 10:25:29 +0200	[thread overview]
Message-ID: <574D4A79.1030405@alum.mit.edu> (raw)
In-Reply-To: <alpine.DEB.2.20.1605310822490.4449@virtualbox>

On 05/31/2016 08:41 AM, Johannes Schindelin wrote:
> Hi Michael,
> 
> On Tue, 31 May 2016, Michael Haggerty wrote:
> 
>> On 05/30/2016 02:13 PM, Johannes Schindelin wrote:
>>> [...]
>>>> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>>>>  {
>>>>  	char *res;
>>>>  	strbuf_grow(sb, 0);
>>>> -	res = sb->buf;
>>>> +	if (sb->flags & STRBUF_OWNS_MEMORY)
>>>> +		res = sb->buf;
>>>> +	else
>>>> +		res = xmemdupz(sb->buf, sb->alloc - 1);
>>>
>>> This looks like a usage to be avoided: if we plan to detach the buffer,
>>> anyway, there is no good reason to allocate it on the heap first. I would
>>> at least issue a warning here.
>>
>> I think this last case should be changed to
>>
>>     res = xmemdupz(sb->buf, sb->len);
>>
>> Johannes, if this change is made then I think that there is a reasonable
>> use case for calling `strbuf_detach()` on a strbuf that wraps a
>> stack-allocated string, so I don't think that a warning is needed.
>>
>> I think this change makes sense. After all, once a caller detaches a
>> string, it is probably not planning on growing/shrinking it anymore, so
>> any more space than that would probably be wasted. In fact, since the
>> caller has no way to ask how much extra space the detached string has
>> allocated, it is almost guaranteed that the space would be wasted.
> 
> Ah, I did not think of that. (We could of course introduce
> strbuf_realloc_snugly() or some such, but I agree: why?) So the best
> counter-example to my objection might be something like this:
> 
> -- snip --
> Subject: git_pathdup(): avoid realloc()
> 
> When generating Git paths, we already know the maximum length:
> MAX_PATH. Let's avoid realloc()-caused fragmentation by using a
> fixed buffer.
> 
> As a side effect, we avoid overallocating the end result, too:
> previously, strbuf_detach() would not realloc() to avoid wasting
> the (sb->alloc - sb->len) bytes, while it malloc()s the precisely
> needed amount of memory when detaching fixed buffers.
> 
> diff --git a/path.c b/path.c
> index 2511d8a..64fd3ee 100644
> --- a/path.c
> +++ b/path.c
> @@ -426,7 +426,8 @@ const char *git_path(const char *fmt, ...)
>  
>  char *git_pathdup(const char *fmt, ...)
>  {
> -	struct strbuf path = STRBUF_INIT;
> +	char buffer[MAX_PATH];
> +	struct strbuf path = STRBUF_WRAP_FIXED(buffer);
>  	va_list args;
>  	va_start(args, fmt);
>  	do_git_path(&path, fmt, args);
> -- snap --

I like the idea, but apparently MAX_PATH is not a hard limit that OSs
must respect (e.g., see [1]). The existing implementation can generate
paths longer than MAX_PATH whereas your replacement cannot.

This would be easy to solve by using STRBUF_WRAP_MOVABLE() instead of
STRBUF_WRAP_FIXED() in your patch. We would thereby accept the
possibility that the strbuf might need to be expanded for really long
paths. However, if that happens, then strbuf_detach() is likely to
return the string in an overallocated memory area.

So if there are callers who *really* care about not having overallocated
memory, we might want a strbuf_detach_snugly().

Your (only half-serious) idea for strbuf_realloc_snugly() would have the
disadvantage that it couldn't work with a fixed strbuf, whereas
strbuf_detach_snugly() could. It would be sad for callers to have to
worry about the distinction.

Michael

[1] http://insanecoding.blogspot.de/2007/11/pathmax-simply-isnt.html

  reply	other threads:[~2016-05-31  8:26 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-30 10:36 [PATCH 0/2] strbuf: improve API William Duclot
2016-05-30 10:36 ` [PATCH 1/2] strbuf: add tests William Duclot
2016-05-30 11:26   ` Johannes Schindelin
2016-05-30 13:42     ` Simon Rabourg
2016-05-30 11:56   ` Matthieu Moy
2016-05-31  2:04   ` Michael Haggerty
2016-05-31  9:48     ` Simon Rabourg
2016-05-30 10:36 ` [PATCH 2/2] strbuf: allow to use preallocated memory William Duclot
2016-05-30 12:13   ` Johannes Schindelin
2016-05-30 13:20     ` William Duclot
2016-05-31  6:21       ` Johannes Schindelin
2016-05-31  3:05     ` Michael Haggerty
2016-05-31  6:41       ` Johannes Schindelin
2016-05-31  8:25         ` Michael Haggerty [this message]
2016-05-30 12:52   ` Matthieu Moy
2016-05-30 14:15     ` William Duclot
2016-05-30 14:34       ` Matthieu Moy
2016-05-30 15:16         ` William Duclot
2016-05-31  4:05     ` Michael Haggerty
2016-05-31 15:59       ` William Duclot
2016-06-03 14:04       ` William Duclot
2016-05-30 21:56   ` Mike Hommey
2016-05-30 22:46     ` William Duclot
2016-05-30 22:50       ` Mike Hommey
2016-05-31  6:34   ` Junio C Hamano
2016-05-31 15:45     ` William
2016-05-31 15:54       ` Matthieu Moy
2016-05-31 16:08         ` William Duclot
2016-05-30 11:32 ` [PATCH 0/2] strbuf: improve API Remi Galan Alfonso
2016-06-01  7:42   ` Jeff King
2016-06-01 19:50     ` David Turner
2016-06-01 20:09       ` Jeff King
2016-06-01 20:22         ` David Turner
2016-06-01 21:07     ` Jeff King
2016-06-02 11:11       ` Michael Haggerty
2016-06-02 12:58         ` Matthieu Moy
2016-06-02 14:22           ` William Duclot
2016-06-24 17:20         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=574D4A79.1030405@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=antoine.queru@ensimag.grenoble-inp.fr \
    --cc=francois.beutin@ensimag.grenoble-inp.fr \
    --cc=git@vger.kernel.org \
    --cc=matthieu.moy@grenoble-inp.fr \
    --cc=simon.rabourg@ensimag.grenoble-inp.fr \
    --cc=william.duclot@ensimag.grenoble-inp.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.