* [PATCH] bisect: save heap memory. allocate only the required amount @ 2014-08-24 14:17 Arjun Sreedharan 2014-08-24 15:10 ` Stefan Beller ` (3 more replies) 0 siblings, 4 replies; 32+ messages in thread From: Arjun Sreedharan @ 2014-08-24 14:17 UTC (permalink / raw) To: git Cc: Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc Duy Find and allocate the required amount instead of allocating extra 100 bytes Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> --- bisect.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/bisect.c b/bisect.c index d6e851d..c96aab0 100644 --- a/bisect.c +++ b/bisect.c @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n } qsort(array, cnt, sizeof(*array), compare_commit_dist); for (p = list, i = 0; i < cnt; i++) { - struct name_decoration *r = xmalloc(sizeof(*r) + 100); + char name[100]; + sprintf(name, "dist=%d", array[i].distance); + int name_len = strlen(name); + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); struct object *obj = &(array[i].commit->object); - sprintf(r->name, "dist=%d", array[i].distance); + memcpy(r->name, name, name_len + 1); r->next = add_decoration(&name_decoration, obj, r); p->item = array[i].commit; p = p->next; -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 14:17 [PATCH] bisect: save heap memory. allocate only the required amount Arjun Sreedharan @ 2014-08-24 15:10 ` Stefan Beller 2014-08-24 23:39 ` Junio C Hamano 2014-08-24 15:32 ` Ramsay Jones ` (2 subsequent siblings) 3 siblings, 1 reply; 32+ messages in thread From: Stefan Beller @ 2014-08-24 15:10 UTC (permalink / raw) To: Arjun Sreedharan, git Cc: Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc Duy On 24.08.2014 16:17, Arjun Sreedharan wrote: > Find and allocate the required amount instead of > allocating extra 100 bytes > > Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> > --- > bisect.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/bisect.c b/bisect.c > index d6e851d..c96aab0 100644 > --- a/bisect.c > +++ b/bisect.c > @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n > } > qsort(array, cnt, sizeof(*array), compare_commit_dist); > for (p = list, i = 0; i < cnt; i++) { > - struct name_decoration *r = xmalloc(sizeof(*r) + 100); > + char name[100]; Would it make sense to convert the 'name' into a git strbuf? Please have a look at Documentation/technical/api-strbuf.txt > + sprintf(name, "dist=%d", array[i].distance); > + int name_len = strlen(name); > + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); > struct object *obj = &(array[i].commit->object); > > - sprintf(r->name, "dist=%d", array[i].distance); > + memcpy(r->name, name, name_len + 1); > r->next = add_decoration(&name_decoration, obj, r); > p->item = array[i].commit; > p = p->next; > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 15:10 ` Stefan Beller @ 2014-08-24 23:39 ` Junio C Hamano 2014-08-25 13:07 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2014-08-24 23:39 UTC (permalink / raw) To: Stefan Beller Cc: Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On Sun, Aug 24, 2014 at 8:10 AM, Stefan Beller <stefanbeller@gmail.com> wrote: >> for (p = list, i = 0; i < cnt; i++) { >> - struct name_decoration *r = xmalloc(sizeof(*r) + 100); >> + char name[100]; > > Would it make sense to convert the 'name' into a git strbuf? > Please have a look at Documentation/technical/api-strbuf.txt Why not go one step further and format it twice, once only to measure the necessary size to allocate, allocate and then format into it for real? Then you do not need to print into a temporary strbuf, copy the result and free the strbuf, no? BUT. The string will always be "dist=" followed by decimal representation of a count that fits in "int" anyway, so I actually think use of strbuf is way overkill (and formatting it twice also is); the patch as posted should be just fine. Or am I missing some uses that require larger/unbounded buffer outside the context of the patch? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 23:39 ` Junio C Hamano @ 2014-08-25 13:07 ` Jeff King 2014-08-25 21:36 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-25 13:07 UTC (permalink / raw) To: Junio C Hamano Cc: Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On Sun, Aug 24, 2014 at 04:39:37PM -0700, Junio C Hamano wrote: > On Sun, Aug 24, 2014 at 8:10 AM, Stefan Beller <stefanbeller@gmail.com> wrote: > >> for (p = list, i = 0; i < cnt; i++) { > >> - struct name_decoration *r = xmalloc(sizeof(*r) + 100); > >> + char name[100]; > > > > Would it make sense to convert the 'name' into a git strbuf? > > Please have a look at Documentation/technical/api-strbuf.txt > > Why not go one step further and format it twice, once only > to measure the necessary size to allocate, allocate and > then format into it for real? Then you do not need to print > into a temporary strbuf, copy the result and free the strbuf, > no? > > BUT. > > The string will always be "dist=" followed by decimal representation of > a count that fits in "int" anyway, so I actually think use of strbuf is way > overkill (and formatting it twice also is); the patch as posted should be > just fine. I think you are right, and the patch is the right direction (assuming we want to do this; I question whether there are enough elements in the list for us to care about the size, and if there are, we are probably better off storing the int and formatting the strings on the fly). I wonder if there is a way we could get rid of the magic "100" here, though. Its meaning is "enough to hold 'dist=' and any integer". But you have to read carefully to see that this call to sprintf is not a buffer overflow. A strbuf is one way to get rid of it, though it is awkward because we then have to copy the result into a flex-array structure. It would be nice if there was some way to abstract the idea of formatting a buffer directly into a flex-array. That would involve the double-format you mention, but we could use it in lots of places to make the code nicer. Maybe like: void *fmt_flex_array(size_t base, const char *fmt, ...) { va_list ap; size_t flex; unsigned char *ret; va_start(ap, fmt); flex = vsnprintf(NULL, 0, fmt, ap); va_end(ap); ret = xmalloc(base + flex + 1); va_start(ap, fmt); /* Eek, see below */ vsnprintf(ret + flex, flex + 1, fmt, ap); return ret; } and you'd call it like: struct name_decoration *r = fmt_flex_array(sizeof(*r), "dist=%d", x); Except that I don't think we are guaranteed that offsetof(mystruct, flex_member) is equal to sizeof(mystruct). If FLEX_ARRAY is 0, it should be, but some platforms use FLEX_ARRAY=1. So you'd have to pass in the offset like: struct name_decoration *r = fmt_flex_array(sizeof(*r), offsetof(*r, name), "dist=%d", x); which is a little less nice. You could make it nicer with a macro, but we don't assume variadic macros. <sigh> -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 13:07 ` Jeff King @ 2014-08-25 21:36 ` Junio C Hamano 2014-08-26 11:03 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2014-08-25 21:36 UTC (permalink / raw) To: Jeff King Cc: Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy Jeff King <peff@peff.net> writes: >> The string will always be "dist=" followed by decimal representation of >> a count that fits in "int" anyway, so I actually think use of strbuf is way >> overkill (and formatting it twice also is); the patch as posted should be >> just fine. > > I think you are right, and the patch is the right direction (assuming we > want to do this; I question whether there are enough elements in the > list for us to care about the size, and if there are, we are probably > better off storing the int and formatting the strings on the fly). ;-) > It would be nice if there was some way to abstract the idea of > formatting a buffer directly into a flex-array. That would involve the > double-format you mention, but we could use it in lots of places to make > the code nicer.... > ... > struct name_decoration *r = fmt_flex_array(sizeof(*r), > offsetof(*r, name), > "dist=%d", x); > > which is a little less nice. You could make it nicer with a macro, but > we don't assume variadic macros. <sigh> At first I thought "Yuck. A helper only to format into the flex member that holds a string?", and I tried to change my mind, but I couldn't quite convince myself. At least not yet. Among the flex arrays we use, some are arrays of bools, some others are arrays of object names, and there are many arrays of even more esoteric types. Only a small number of them are "We want a struct with a constant string, and we do not want to do two allocations and to pay an extra dereference cost by using 'const char *'". For them, by the time we allocate a struct, by definition we should have sufficient information to compute how large to make that structure and a printf-format plus its args would be the preferred form of that "sufficient information", I would think. The name "fmt_flex_array()", which stresses too much on the "formatting" side without implying that it is the way to allocate the thing, may be horrible, and I agree with you that without variadic macros the end result may not read very well. Unless we have great many number of places we can use the helper to make the code to create these objects look nicer, I am afraid that the pros-and-cons may not be very favourable. Thanks for an interesting tangent. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 21:36 ` Junio C Hamano @ 2014-08-26 11:03 ` Jeff King 2014-08-26 11:57 ` Ramsay Jones 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 11:03 UTC (permalink / raw) To: Junio C Hamano Cc: Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On Mon, Aug 25, 2014 at 02:36:02PM -0700, Junio C Hamano wrote: > > I think you are right, and the patch is the right direction (assuming we > > want to do this; I question whether there are enough elements in the > > list for us to care about the size, and if there are, we are probably > > better off storing the int and formatting the strings on the fly). > > ;-) Having now dug into this much further, the answers to those questions are: 1. Yes, we might actually have quite a lot of these, if you are bisecting a large range of commits. However, the code only runs when you are doing --bisect-all or skipping a commit, so probably nobody actually cares that much. 2. It would be nicer to hold just an "int", but we are hooking into the log-tree decorate machinery here, which expects a string. We'd need some kind of decorate-like callback machinery from log-tree to do this right. It's probably not worth the effort. > Among the flex arrays we use, some are arrays of bools, some others > are arrays of object names, and there are many arrays of even more > esoteric types. Only a small number of them are "We want a struct > with a constant string, and we do not want to do two allocations and > to pay an extra dereference cost by using 'const char *'". Yeah, I was working under the assumption that most of them were holding a string. I just did a quick skim of the grep results for FLEX_ARRAY. Of the 36 instances, 26 hold strings. 9 hold something else entirely, and 1 holds a char buffer that does not need to be NUL-terminated (so it _could_ be handled by a similar helper, but using "%s" would be wrong). So it's definitely the majority, but certainly not all. I decided to look into this a little further, and my results are below. The tl;dr is that no, we probably shouldn't do it. So you can stop reading if you don't find this interesting. :) > For them, by the time we allocate a struct, by definition we should > have sufficient information to compute how large to make that > structure and a printf-format plus its args would be the preferred > form of that "sufficient information", I would think. I was tempted to also suggest a pure-string form (i.e., just take a string, run strlen on it, and use that as the final value). That would make the variadic macro problem go away. But besides name_decoration, there are other cases that really do want formatting. For instance, alloc_ref basically wants to do ("%s%s", prefix, name). > The name "fmt_flex_array()", which stresses too much on the > "formatting" side without implying that it is the way to allocate > the thing, may be horrible, and I agree with you that without > variadic macros the end result may not read very well. Unless we > have great many number of places we can use the helper to make the > code to create these objects look nicer, I am afraid that the > pros-and-cons may not be very favourable. Yeah, reading my suggestion again, it should clearly be alloc_flex_struct or something. Here's a fully-converted sample spot, where I think there's a slight benefit: diff --git a/remote.c b/remote.c index 3d6c86a..ba32d40 100644 --- a/remote.c +++ b/remote.c @@ -928,14 +928,30 @@ int remote_find_tracking(struct remote *remote, struct refspec *refspec) return query_refspecs(remote->fetch, remote->fetch_refspec_nr, refspec); } +static void *alloc_flex_struct(size_t base, size_t offset, const char *fmt, ...) +{ + va_list ap; + size_t extra; + char *ret; + + va_start(ap, fmt); + extra = vsnprintf(NULL, 0, fmt, ap); + extra++; /* for NUL terminator */ + va_end(ap); + + ret = xcalloc(1, base + extra); + va_start(ap, fmt); + vsnprintf(ret + offset, extra, fmt, ap); + va_end(ap); + + return ret; +} + static struct ref *alloc_ref_with_prefix(const char *prefix, size_t prefixlen, const char *name) { - size_t len = strlen(name); - struct ref *ref = xcalloc(1, sizeof(struct ref) + prefixlen + len + 1); - memcpy(ref->name, prefix, prefixlen); - memcpy(ref->name + prefixlen, name, len); - return ref; + return alloc_flex_struct(sizeof(struct ref), offsetof(struct ref, name), + "%.*s%s", prefixlen, prefix, name); } struct ref *alloc_ref(const char *name) Obviously the helper is much longer than the code it is replacing, but it would be used in multiple spots. The main thing I like here is that we are dropping the manual length computations, which are easy to get wrong (it's easy to forget a +1 for a NUL terminator, etc). The offsetof is a little ugly. And the fact that we have a pre-computed length for prefixlen makes the format string a little ugly. Here's a another example: diff --git a/attr.c b/attr.c index 734222d..100c423 100644 --- a/attr.c +++ b/attr.c @@ -89,8 +89,8 @@ static struct git_attr *git_attr_internal(const char *name, int len) if (invalid_attr_name(name, len)) return NULL; - a = xmalloc(sizeof(*a) + len + 1); - memcpy(a->name, name, len); + a = alloc_flex_array(sizeof(*a), offsetof(struct git_attr, name), + "%.*s", len, name); a->name[len] = 0; a->h = hval; a->next = git_attr_hash[pos]; I think this is strictly worse for reading. The original computation was pretty easy in the first place, so we are not getting much benefit there. And again we have the precomputed "len" passed into the function, so we have to use the less readable "%.*s". And note that offsetof() requires us to pass a real typename instead of just "*a", as sizeof() allows (I suspect passing "a->name - a" would work on many systems, but I also suspect that is a gross violation of the C standard when "a" has not yet been initialized). So given that the benefit ranges from "a little" to "negative" in these two examples, I'm inclined to give up this line of inquiry. -Peff ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-26 11:03 ` Jeff King @ 2014-08-26 11:57 ` Ramsay Jones 2014-08-26 12:14 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Ramsay Jones @ 2014-08-26 11:57 UTC (permalink / raw) To: Jeff King, Junio C Hamano Cc: Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On 26/08/14 12:03, Jeff King wrote: [snip] > > Yeah, reading my suggestion again, it should clearly be > alloc_flex_struct or something. > > Here's a fully-converted sample spot, where I think there's a slight > benefit: > > diff --git a/remote.c b/remote.c > index 3d6c86a..ba32d40 100644 > --- a/remote.c > +++ b/remote.c > @@ -928,14 +928,30 @@ int remote_find_tracking(struct remote *remote, struct refspec *refspec) > return query_refspecs(remote->fetch, remote->fetch_refspec_nr, refspec); > } > > +static void *alloc_flex_struct(size_t base, size_t offset, const char *fmt, ...) > +{ > + va_list ap; > + size_t extra; > + char *ret; > + > + va_start(ap, fmt); > + extra = vsnprintf(NULL, 0, fmt, ap); > + extra++; /* for NUL terminator */ > + va_end(ap); > + > + ret = xcalloc(1, base + extra); > + va_start(ap, fmt); > + vsnprintf(ret + offset, extra, fmt, ap); What is the relationship between 'base' and 'offset'? Let me assume that base is always, depending on your compiler, either equal to offset or offset+1. Yes? (I'm assuming base is always the sizeof(struct whatever)). Do you need both base and offset? > + va_end(ap); > + > + return ret; > +} > + > static struct ref *alloc_ref_with_prefix(const char *prefix, size_t prefixlen, > const char *name) > { > - size_t len = strlen(name); > - struct ref *ref = xcalloc(1, sizeof(struct ref) + prefixlen + len + 1); > - memcpy(ref->name, prefix, prefixlen); > - memcpy(ref->name + prefixlen, name, len); > - return ref; > + return alloc_flex_struct(sizeof(struct ref), offsetof(struct ref, name), > + "%.*s%s", prefixlen, prefix, name); > } > > struct ref *alloc_ref(const char *name) > > Obviously the helper is much longer than the code it is replacing, but > it would be used in multiple spots. The main thing I like here is that > we are dropping the manual length computations, which are easy to get > wrong (it's easy to forget a +1 for a NUL terminator, etc). > > The offsetof is a little ugly. And the fact that we have a pre-computed > length for prefixlen makes the format string a little ugly. > > Here's a another example: > > diff --git a/attr.c b/attr.c > index 734222d..100c423 100644 > --- a/attr.c > +++ b/attr.c > @@ -89,8 +89,8 @@ static struct git_attr *git_attr_internal(const char *name, int len) > if (invalid_attr_name(name, len)) > return NULL; > > - a = xmalloc(sizeof(*a) + len + 1); > - memcpy(a->name, name, len); > + a = alloc_flex_array(sizeof(*a), offsetof(struct git_attr, name), > + "%.*s", len, name); > a->name[len] = 0; > a->h = hval; > a->next = git_attr_hash[pos]; > > I think this is strictly worse for reading. The original computation was > pretty easy in the first place, so we are not getting much benefit > there. And again we have the precomputed "len" passed into the function, > so we have to use the less readable "%.*s". And note that offsetof() > requires us to pass a real typename instead of just "*a", as sizeof() > allows (I suspect passing "a->name - a" would work on many systems, but > I also suspect that is a gross violation of the C standard when "a" has > not yet been initialized). > > So given that the benefit ranges from "a little" to "negative" in these > two examples, I'm inclined to give up this line of inquiry. Indeed. :-D ATB, Ramsay Jones ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-26 11:57 ` Ramsay Jones @ 2014-08-26 12:14 ` Jeff King 2014-08-26 12:37 ` Ramsay Jones 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 12:14 UTC (permalink / raw) To: Ramsay Jones Cc: Junio C Hamano, Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On Tue, Aug 26, 2014 at 12:57:21PM +0100, Ramsay Jones wrote: > > + ret = xcalloc(1, base + extra); > > + va_start(ap, fmt); > > + vsnprintf(ret + offset, extra, fmt, ap); > > What is the relationship between 'base' and 'offset'? > > Let me assume that base is always, depending on your compiler, either > equal to offset or offset+1. Yes? (I'm assuming base is always the > sizeof(struct whatever)). Do you need both base and offset? It's much more complicated than that. Take "struct name_decoration", for instance, which looks like this: struct name_decoration { struct name_decoration *next; int type; char name[FLEX_ARRAY]; }; On my 64-bit system using gcc, sizeof() returns 16; it has to pad the whole thing to 64-bit alignment in case I put two of them in an array. But offsetof(name) is 12, since the array of char does not need the same alignment; it can go right after the type and make use of the padding bits. As a side note, that means that the original "char name[1]" (before it became FLEX_ARRAY) was not any less efficient on 64-bit machines (the 1-byte went into the padding, and sizeof() was the same). It did matter on 32-bit systems, though where it bumped the empty struct size from 12 to 16. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-26 12:14 ` Jeff King @ 2014-08-26 12:37 ` Ramsay Jones 2014-08-26 12:43 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Ramsay Jones @ 2014-08-26 12:37 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On 26/08/14 13:14, Jeff King wrote: > On Tue, Aug 26, 2014 at 12:57:21PM +0100, Ramsay Jones wrote: > >>> + ret = xcalloc(1, base + extra); >>> + va_start(ap, fmt); >>> + vsnprintf(ret + offset, extra, fmt, ap); >> >> What is the relationship between 'base' and 'offset'? >> >> Let me assume that base is always, depending on your compiler, either >> equal to offset or offset+1. Yes? (I'm assuming base is always the >> sizeof(struct whatever)). Do you need both base and offset? > > It's much more complicated than that. Take "struct name_decoration", for > instance, which looks like this: > > struct name_decoration { > struct name_decoration *next; > int type; > char name[FLEX_ARRAY]; > }; > > On my 64-bit system using gcc, sizeof() returns 16; it has to pad the > whole thing to 64-bit alignment in case I put two of them in an array. > But offsetof(name) is 12, since the array of char does not need the same > alignment; it can go right after the type and make use of the padding > bits. Hmm, interesting. I must re-read the standard. I was convinced that the standard *requires* any alignment padding to come *before* the name field. (how would you put a, non-trivial, variable data structure into an array?) ATB, Ramsay Jones ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-26 12:37 ` Ramsay Jones @ 2014-08-26 12:43 ` Jeff King 2014-08-26 12:59 ` Ramsay Jones 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 12:43 UTC (permalink / raw) To: Ramsay Jones Cc: Junio C Hamano, Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On Tue, Aug 26, 2014 at 01:37:44PM +0100, Ramsay Jones wrote: > > On my 64-bit system using gcc, sizeof() returns 16; it has to pad the > > whole thing to 64-bit alignment in case I put two of them in an array. > > But offsetof(name) is 12, since the array of char does not need the same > > alignment; it can go right after the type and make use of the padding > > bits. > > Hmm, interesting. I must re-read the standard. I was convinced that the > standard *requires* any alignment padding to come *before* the name field. > (how would you put a, non-trivial, variable data structure into an array?) I think you don't. How would you compute a[1] if a[0] has a variable size? You can put a flex-array structure into an array, but then each element has the flex member as zero-size (and you should not access it, of course). -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-26 12:43 ` Jeff King @ 2014-08-26 12:59 ` Ramsay Jones 0 siblings, 0 replies; 32+ messages in thread From: Ramsay Jones @ 2014-08-26 12:59 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Stefan Beller, Arjun Sreedharan, Git Mailing List, Christian Couder, Nguyễn Thái Ngọc Duy On 26/08/14 13:43, Jeff King wrote: > On Tue, Aug 26, 2014 at 01:37:44PM +0100, Ramsay Jones wrote: > >>> On my 64-bit system using gcc, sizeof() returns 16; it has to pad the >>> whole thing to 64-bit alignment in case I put two of them in an array. >>> But offsetof(name) is 12, since the array of char does not need the same >>> alignment; it can go right after the type and make use of the padding >>> bits. >> >> Hmm, interesting. I must re-read the standard. I was convinced that the >> standard *requires* any alignment padding to come *before* the name field. >> (how would you put a, non-trivial, variable data structure into an array?) > > I think you don't. How would you compute a[1] if a[0] has a variable > size? > > You can put a flex-array structure into an array, but then each element > has the flex member as zero-size (and you should not access it, of > course). Exactly. ;-) ATB, Ramsay Jones ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 14:17 [PATCH] bisect: save heap memory. allocate only the required amount Arjun Sreedharan 2014-08-24 15:10 ` Stefan Beller @ 2014-08-24 15:32 ` Ramsay Jones 2014-08-24 21:55 ` Arjun Sreedharan 2014-08-25 13:35 ` Jeff King 2014-08-25 21:14 ` [PATCH] bisect: save heap memory. allocate only the required amount Junio C Hamano 3 siblings, 1 reply; 32+ messages in thread From: Ramsay Jones @ 2014-08-24 15:32 UTC (permalink / raw) To: Arjun Sreedharan, git Cc: Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc Duy On 24/08/14 15:17, Arjun Sreedharan wrote: > Find and allocate the required amount instead of > allocating extra 100 bytes > > Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> > --- > bisect.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/bisect.c b/bisect.c > index d6e851d..c96aab0 100644 > --- a/bisect.c > +++ b/bisect.c > @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n > } > qsort(array, cnt, sizeof(*array), compare_commit_dist); > for (p = list, i = 0; i < cnt; i++) { > - struct name_decoration *r = xmalloc(sizeof(*r) + 100); > + char name[100]; > + sprintf(name, "dist=%d", array[i].distance); > + int name_len = strlen(name); > + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); > struct object *obj = &(array[i].commit->object); declaration(s) after statement. > > - sprintf(r->name, "dist=%d", array[i].distance); > + memcpy(r->name, name, name_len + 1); > r->next = add_decoration(&name_decoration, obj, r); > p->item = array[i].commit; > p = p->next; > HTH ATB, Ramsay Jones ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 15:32 ` Ramsay Jones @ 2014-08-24 21:55 ` Arjun Sreedharan 0 siblings, 0 replies; 32+ messages in thread From: Arjun Sreedharan @ 2014-08-24 21:55 UTC (permalink / raw) To: git Cc: Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc Duy, Stefan Beller, Ramsay Jones find and allocate the required amount instead of allocating extra 100 bytes Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> --- bisect.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/bisect.c b/bisect.c index d6e851d..a52631e 100644 --- a/bisect.c +++ b/bisect.c @@ -215,12 +215,16 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n } qsort(array, cnt, sizeof(*array), compare_commit_dist); for (p = list, i = 0; i < cnt; i++) { - struct name_decoration *r = xmalloc(sizeof(*r) + 100); + struct strbuf name = STRBUF_INIT; + struct name_decoration *r; struct object *obj = &(array[i].commit->object); - sprintf(r->name, "dist=%d", array[i].distance); + strbuf_addf(&name, "dist=%d", array[i].distance); + r = xmalloc(sizeof(*r) + name.len); + memcpy(r->name, name.buf, name.len + 1); r->next = add_decoration(&name_decoration, obj, r); p->item = array[i].commit; + strbuf_release(&name); p = p->next; } if (p) -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 14:17 [PATCH] bisect: save heap memory. allocate only the required amount Arjun Sreedharan 2014-08-24 15:10 ` Stefan Beller 2014-08-24 15:32 ` Ramsay Jones @ 2014-08-25 13:35 ` Jeff King 2014-08-25 14:06 ` Christian Couder 2014-08-25 21:14 ` [PATCH] bisect: save heap memory. allocate only the required amount Junio C Hamano 3 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-25 13:35 UTC (permalink / raw) To: Arjun Sreedharan Cc: git, Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc Duy On Sun, Aug 24, 2014 at 07:47:24PM +0530, Arjun Sreedharan wrote: > diff --git a/bisect.c b/bisect.c > index d6e851d..c96aab0 100644 > --- a/bisect.c > +++ b/bisect.c > @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n > } > qsort(array, cnt, sizeof(*array), compare_commit_dist); > for (p = list, i = 0; i < cnt; i++) { > - struct name_decoration *r = xmalloc(sizeof(*r) + 100); > + char name[100]; > + sprintf(name, "dist=%d", array[i].distance); > + int name_len = strlen(name); > + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); This allocation should be name_len + 1 for the NUL-terminator, no? It looks like add_name_decoration in log-tree already handles half of what you are adding here. Can we just make that available globally (it is manipulating the already-global "struct decoration name_decoration")? I also notice that we do not set r->type at all, meaning the decoration lookup code in log-tree will access uninitialized memory (worse, it will use it as a pointer offset into the color list; I got a segfault when I tried to run "git rev-list --bisect-all v1.8.0..v1.9.0"). I think we need this: diff --git a/bisect.c b/bisect.c index d6e851d..e2a7682 100644 --- a/bisect.c +++ b/bisect.c @@ -219,6 +219,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n struct object *obj = &(array[i].commit->object); sprintf(r->name, "dist=%d", array[i].distance); + r->type = 0; r->next = add_decoration(&name_decoration, obj, r); p->item = array[i].commit; p = p->next; at a minimum. It looks like this was a regression caused by eb3005e (commit.h: add 'type' to struct name_decoration, 2010-06-19). Which makes me wonder if anybody actually _uses_ --bisect-all (which AFAICT is the only way to trigger the problem), but since it's public, I guess we should keep it. I think the sane thing here is to stop advertising name_decoration as a global, and make all callers use add_name_decoration. That makes it easier for callers like this one, and would have caught the regression caused be eb3005e (the compiler would have noticed that we were not passing a type parameter to the function). -Peff ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 13:35 ` Jeff King @ 2014-08-25 14:06 ` Christian Couder 2014-08-25 15:00 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Christian Couder @ 2014-08-25 14:06 UTC (permalink / raw) To: Jeff King Cc: Arjun Sreedharan, git, Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc On Mon, Aug 25, 2014 at 3:35 PM, Jeff King <peff@peff.net> wrote: > On Sun, Aug 24, 2014 at 07:47:24PM +0530, Arjun Sreedharan wrote: > >> diff --git a/bisect.c b/bisect.c >> index d6e851d..c96aab0 100644 >> --- a/bisect.c >> +++ b/bisect.c >> @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n >> } >> qsort(array, cnt, sizeof(*array), compare_commit_dist); >> for (p = list, i = 0; i < cnt; i++) { >> - struct name_decoration *r = xmalloc(sizeof(*r) + 100); >> + char name[100]; >> + sprintf(name, "dist=%d", array[i].distance); >> + int name_len = strlen(name); >> + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); > > This allocation should be name_len + 1 for the NUL-terminator, no? I wondered about that too, but as struct name_decoration is defined like this: struct name_decoration { struct name_decoration *next; int type; char name[1]; }; the .name field of this struct already has one char, so the allocation above should be ok. > It looks like add_name_decoration in log-tree already handles half of > what you are adding here. Can we just make that available globally (it > is manipulating the already-global "struct decoration name_decoration")? Yeah, it looks like it should be better. Note that add_name_decoration() does: int nlen = strlen(name); struct name_decoration *res = xmalloc(sizeof(struct name_decoration) + nlen); so it also relies on the fact that .name contains one char. > I also notice that we do not set r->type at all, meaning the decoration > lookup code in log-tree will access uninitialized memory (worse, it will > use it as a pointer offset into the color list; I got a segfault when I > tried to run "git rev-list --bisect-all v1.8.0..v1.9.0"). > > I think we need this: > > diff --git a/bisect.c b/bisect.c > index d6e851d..e2a7682 100644 > --- a/bisect.c > +++ b/bisect.c > @@ -219,6 +219,7 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n > struct object *obj = &(array[i].commit->object); > > sprintf(r->name, "dist=%d", array[i].distance); > + r->type = 0; > r->next = add_decoration(&name_decoration, obj, r); > p->item = array[i].commit; > p = p->next; > > at a minimum. Yeah if we don't use add_name_decoration() we would need that. Thanks for noticing. > It looks like this was a regression caused by eb3005e (commit.h: add > 'type' to struct name_decoration, 2010-06-19). Which makes me wonder if > anybody actually _uses_ --bisect-all (which AFAICT is the only way to > trigger the problem), but since it's public, I guess we should keep it. Yeah, we should probably keep it. > I think the sane thing here is to stop advertising name_decoration as a > global, and make all callers use add_name_decoration. That makes it > easier for callers like this one, and would have caught the regression > caused be eb3005e (the compiler would have noticed that we were not > passing a type parameter to the function). I agree. Thanks, Christian. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 14:06 ` Christian Couder @ 2014-08-25 15:00 ` Jeff King 2014-08-25 18:26 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-25 15:00 UTC (permalink / raw) To: Christian Couder Cc: Arjun Sreedharan, git, Christian Couder, Junio C Hamano, Nguyễn Thái Ngọc On Mon, Aug 25, 2014 at 04:06:52PM +0200, Christian Couder wrote: > > This allocation should be name_len + 1 for the NUL-terminator, no? > > I wondered about that too, but as struct name_decoration is defined like this: > > struct name_decoration { > struct name_decoration *next; > int type; > char name[1]; > }; > > the .name field of this struct already has one char, so the allocation > above should be ok. Yeah, you're right. I would argue it should just be FLEX_ARRAY for consistency with other spots, though (in which case add_name_decoration needs to be updated with a +1). Running "git grep '^ char [^ ]*\[[01]]' -- '*.[ch]'" shows that this is one of only two spots that don't use FLEX_ARRAY (and the other has a comment explaining why not). -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 15:00 ` Jeff King @ 2014-08-25 18:26 ` Junio C Hamano 2014-08-25 19:35 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2014-08-25 18:26 UTC (permalink / raw) To: Jeff King Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc Jeff King <peff@peff.net> writes: > On Mon, Aug 25, 2014 at 04:06:52PM +0200, Christian Couder wrote: > >> > This allocation should be name_len + 1 for the NUL-terminator, no? >> >> I wondered about that too, but as struct name_decoration is defined like this: >> >> struct name_decoration { >> struct name_decoration *next; >> int type; >> char name[1]; >> }; >> >> the .name field of this struct already has one char, so the allocation >> above should be ok. > > Yeah, you're right. I would argue it should just be FLEX_ARRAY for > consistency with other spots, though (in which case add_name_decoration > needs to be updated with a +1). > > Running "git grep '^ char [^ ]*\[[01]]' -- '*.[ch]'" shows that this > is one of only two spots that don't use FLEX_ARRAY (and the other has a > comment explaining why not). Good digging, and I agree that it should use the FLEX_ARRAY for consistency. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 18:26 ` Junio C Hamano @ 2014-08-25 19:35 ` Jeff King 2014-08-25 20:27 ` Arjun Sreedharan 2014-08-25 21:11 ` Junio C Hamano 0 siblings, 2 replies; 32+ messages in thread From: Jeff King @ 2014-08-25 19:35 UTC (permalink / raw) To: Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote: > Good digging, and I agree that it should use the FLEX_ARRAY for > consistency. I can produce a patch, but I did not want to steal Arjun's thunder. Arjun, did my proposal make sense? Do you want to try implementing that? -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 19:35 ` Jeff King @ 2014-08-25 20:27 ` Arjun Sreedharan 2014-08-25 21:11 ` Junio C Hamano 1 sibling, 0 replies; 32+ messages in thread From: Arjun Sreedharan @ 2014-08-25 20:27 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Christian Couder, git, Christian Couder, Nguyễn Thái Ngọc On 26 August 2014 01:05, Jeff King <peff@peff.net> wrote: > On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote: > >> Good digging, and I agree that it should use the FLEX_ARRAY for >> consistency. > > I can produce a patch, but I did not want to steal Arjun's thunder. Please feel free to do so. > > Arjun, did my proposal make sense? Do you want to try implementing that? > > -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 19:35 ` Jeff King 2014-08-25 20:27 ` Arjun Sreedharan @ 2014-08-25 21:11 ` Junio C Hamano 2014-08-26 10:20 ` [PATCH 0/3] name_decoration cleanups Jeff King 1 sibling, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2014-08-25 21:11 UTC (permalink / raw) To: Jeff King Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc Jeff King <peff@peff.net> writes: > On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote: > >> Good digging, and I agree that it should use the FLEX_ARRAY for >> consistency. > > I can produce a patch, but I did not want to steal Arjun's thunder. Hmph, would it have to overlap? I think we can queue Arjun's patch with +1 fix and FLEX_ARRAY thing separately, and they can go in in any order, no? > Arjun, did my proposal make sense? Do you want to try implementing that? ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/3] name_decoration cleanups 2014-08-25 21:11 ` Junio C Hamano @ 2014-08-26 10:20 ` Jeff King 2014-08-26 10:23 ` [PATCH 1/3] log-tree: make add_name_decoration a public function Jeff King ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Jeff King @ 2014-08-26 10:20 UTC (permalink / raw) To: Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc On Mon, Aug 25, 2014 at 02:11:09PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > On Mon, Aug 25, 2014 at 11:26:39AM -0700, Junio C Hamano wrote: > > > >> Good digging, and I agree that it should use the FLEX_ARRAY for > >> consistency. > > > > I can produce a patch, but I did not want to steal Arjun's thunder. > > Hmph, would it have to overlap? I think we can queue Arjun's patch > with +1 fix and FLEX_ARRAY thing separately, and they can go in in > any order, no? I more meant my suggestion to use add_name_decoration consistently. That fixes the r->type thing _and_ replaces Arjun's patch. Fixing FLEX_ARRAY on top is just gravy. :) Here's the patch series I was thinking of: [1/3]: log-tree: make add_name_decoration a public function [2/3]: log-tree: make name_decoration hash static [3/3]: log-tree: use FLEX_ARRAY in name_decoration I almost added a 4/3 to convert "name_decoration" to "commit_decoration", since that is how it is used (and name_decoration is somewhat vague). But we actually do annotate other non-commit objects that refs point to, as well. I'm not sure there is a way to _show_ them currently, but I'd just as soon leave it as-is. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/3] log-tree: make add_name_decoration a public function 2014-08-26 10:20 ` [PATCH 0/3] name_decoration cleanups Jeff King @ 2014-08-26 10:23 ` Jeff King 2014-08-26 11:48 ` Ramsay Jones 2014-08-26 10:23 ` [PATCH 2/3] log-tree: make name_decoration hash static Jeff King 2014-08-26 10:24 ` [PATCH 3/3] log-tree: use FLEX_ARRAY in name_decoration Jeff King 2 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 10:23 UTC (permalink / raw) To: Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc The log-tree code keeps a "struct decoration" hash to show text decorations for each commit during log traversals. It makes this available to other files by providing global access to the hash. This can result in other code adding entries that do not conform to what log-tree expects. For example, the bisect code adds its own "dist" decorations to be shown. Originally the bisect code was correct, but when the name_decoration code grew a new field in eb3005e (commit.h: add 'type' to struct name_decoration, 2010-06-19), the bisect code was not updated. As a result, the log-tree code can access uninitialized memory and even segfault. We can fix this by making name_decoration's adding function public. If all callers use it, then any changes to structi initialization only need to happen in one place (and because the members come in as parameters, the compiler can notice a caller who does not supply enough information). As a bonus, this also means that the decoration hashes created by the bisect code will use less memory (previously we over-allocated space for the distance integer, but not we format it into a temporary buffer and copy it to the final flex-array). Signed-off-by: Jeff King <peff@peff.net> --- bisect.c | 7 ++++--- commit.h | 12 ++++++++++++ log-tree.c | 12 +----------- 3 files changed, 17 insertions(+), 14 deletions(-) diff --git a/bisect.c b/bisect.c index d6e851d..df09cbc 100644 --- a/bisect.c +++ b/bisect.c @@ -215,11 +215,12 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n } qsort(array, cnt, sizeof(*array), compare_commit_dist); for (p = list, i = 0; i < cnt; i++) { - struct name_decoration *r = xmalloc(sizeof(*r) + 100); + char buf[100]; /* enough for dist=%d */ struct object *obj = &(array[i].commit->object); - sprintf(r->name, "dist=%d", array[i].distance); - r->next = add_decoration(&name_decoration, obj, r); + snprintf(buf, sizeof(buf), "dist=%d", array[i].distance); + add_name_decoration(DECORATION_NONE, buf, obj); + p->item = array[i].commit; p = p->next; } diff --git a/commit.h b/commit.h index a8cbf52..4902f97 100644 --- a/commit.h +++ b/commit.h @@ -33,6 +33,18 @@ struct name_decoration { char name[1]; }; +enum decoration_type { + DECORATION_NONE = 0, + DECORATION_REF_LOCAL, + DECORATION_REF_REMOTE, + DECORATION_REF_TAG, + DECORATION_REF_STASH, + DECORATION_REF_HEAD, + DECORATION_GRAFTED, +}; + +void add_name_decoration(enum decoration_type type, const char *name, struct object *obj); + struct commit *lookup_commit(const unsigned char *sha1); struct commit *lookup_commit_reference(const unsigned char *sha1); struct commit *lookup_commit_reference_gently(const unsigned char *sha1, diff --git a/log-tree.c b/log-tree.c index 0c53dc1..a821258 100644 --- a/log-tree.c +++ b/log-tree.c @@ -14,16 +14,6 @@ struct decoration name_decoration = { "object names" }; -enum decoration_type { - DECORATION_NONE = 0, - DECORATION_REF_LOCAL, - DECORATION_REF_REMOTE, - DECORATION_REF_TAG, - DECORATION_REF_STASH, - DECORATION_REF_HEAD, - DECORATION_GRAFTED, -}; - static char decoration_colors[][COLOR_MAXLEN] = { GIT_COLOR_RESET, GIT_COLOR_BOLD_GREEN, /* REF_LOCAL */ @@ -84,7 +74,7 @@ int parse_decorate_color_config(const char *var, const int ofs, const char *valu #define decorate_get_color_opt(o, ix) \ decorate_get_color((o)->use_color, ix) -static void add_name_decoration(enum decoration_type type, const char *name, struct object *obj) +void add_name_decoration(enum decoration_type type, const char *name, struct object *obj) { int nlen = strlen(name); struct name_decoration *res = xmalloc(sizeof(struct name_decoration) + nlen); -- 2.1.0.346.ga0367b9 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] log-tree: make add_name_decoration a public function 2014-08-26 10:23 ` [PATCH 1/3] log-tree: make add_name_decoration a public function Jeff King @ 2014-08-26 11:48 ` Ramsay Jones 2014-08-26 12:17 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Ramsay Jones @ 2014-08-26 11:48 UTC (permalink / raw) To: Jeff King, Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc On 26/08/14 11:23, Jeff King wrote: > The log-tree code keeps a "struct decoration" hash to show > text decorations for each commit during log traversals. It > makes this available to other files by providing global > access to the hash. This can result in other code adding > entries that do not conform to what log-tree expects. > > For example, the bisect code adds its own "dist" > decorations to be shown. Originally the bisect code was > correct, but when the name_decoration code grew a new field > in eb3005e (commit.h: add 'type' to struct name_decoration, > 2010-06-19), the bisect code was not updated. As a result, > the log-tree code can access uninitialized memory and even > segfault. > > We can fix this by making name_decoration's adding function > public. If all callers use it, then any changes to structi s/structi/struct/ > initialization only need to happen in one place (and because > the members come in as parameters, the compiler can notice a > caller who does not supply enough information). > > As a bonus, this also means that the decoration hashes > created by the bisect code will use less memory (previously > we over-allocated space for the distance integer, but not we s/not/now/ > format it into a temporary buffer and copy it to the final > flex-array). > > Signed-off-by: Jeff King <peff@peff.net> ATB, Ramsay Jones ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] log-tree: make add_name_decoration a public function 2014-08-26 11:48 ` Ramsay Jones @ 2014-08-26 12:17 ` Jeff King 0 siblings, 0 replies; 32+ messages in thread From: Jeff King @ 2014-08-26 12:17 UTC (permalink / raw) To: Ramsay Jones Cc: Junio C Hamano, Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc On Tue, Aug 26, 2014 at 12:48:24PM +0100, Ramsay Jones wrote: > > We can fix this by making name_decoration's adding function > > public. If all callers use it, then any changes to structi > > s/structi/struct/ I blame vi finger-cruft. > > initialization only need to happen in one place (and because > > the members come in as parameters, the compiler can notice a > > caller who does not supply enough information). > > > > As a bonus, this also means that the decoration hashes > > created by the bisect code will use less memory (previously > > we over-allocated space for the distance integer, but not we > > s/not/now/ Er, I blame vi again. Yeah, that's my story and I'm sticking to it. Thanks for proofreading. :) -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/3] log-tree: make name_decoration hash static 2014-08-26 10:20 ` [PATCH 0/3] name_decoration cleanups Jeff King 2014-08-26 10:23 ` [PATCH 1/3] log-tree: make add_name_decoration a public function Jeff King @ 2014-08-26 10:23 ` Jeff King 2014-08-26 17:40 ` Junio C Hamano 2014-08-26 10:24 ` [PATCH 3/3] log-tree: use FLEX_ARRAY in name_decoration Jeff King 2 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 10:23 UTC (permalink / raw) To: Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc In the previous commit, we made add_name_decoration global so that adders would not have to access the hash directly. We now make the hash itself static so that callers _have_ to add through our function, making sure that all additions go through a single point. To do this, we have to add one more accessor function: a way to lookup entries in the hash. Since the only caller doesn't actually look at the returned value, but rather only asks whether there is a decoration or not, we could provide only a boolean "has_name_decoration". That would allow us to make "struct name_decoration" local to log-tree, as well. However, it's unlikely to cause any maintainability harm making the actual data public, and this interface is more flexible if we need to look at decorations from other parts of the code in the future. Signed-off-by: Jeff King <peff@peff.net> --- commit.h | 2 +- log-tree.c | 11 ++++++++--- revision.c | 2 +- 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/commit.h b/commit.h index 4902f97..263b49e 100644 --- a/commit.h +++ b/commit.h @@ -26,7 +26,6 @@ extern int save_commit_buffer; extern const char *commit_type; /* While we can decorate any object with a name, it's only used for commits.. */ -extern struct decoration name_decoration; struct name_decoration { struct name_decoration *next; int type; @@ -44,6 +43,7 @@ enum decoration_type { }; void add_name_decoration(enum decoration_type type, const char *name, struct object *obj); +const struct name_decoration *get_name_decoration(const struct object *obj); struct commit *lookup_commit(const unsigned char *sha1); struct commit *lookup_commit_reference(const unsigned char *sha1); diff --git a/log-tree.c b/log-tree.c index a821258..7cbc4ee 100644 --- a/log-tree.c +++ b/log-tree.c @@ -12,7 +12,7 @@ #include "sequencer.h" #include "line-log.h" -struct decoration name_decoration = { "object names" }; +static struct decoration name_decoration = { "object names" }; static char decoration_colors[][COLOR_MAXLEN] = { GIT_COLOR_RESET, @@ -83,6 +83,11 @@ void add_name_decoration(enum decoration_type type, const char *name, struct obj res->next = add_decoration(&name_decoration, obj, res); } +const struct name_decoration *get_name_decoration(const struct object *obj) +{ + return lookup_decoration(&name_decoration, obj); +} + static int add_ref_decoration(const char *refname, const unsigned char *sha1, int flags, void *cb_data) { struct object *obj; @@ -177,13 +182,13 @@ void format_decorations(struct strbuf *sb, int use_color) { const char *prefix; - struct name_decoration *decoration; + const struct name_decoration *decoration; const char *color_commit = diff_get_color(use_color, DIFF_COMMIT); const char *color_reset = decorate_get_color(use_color, DECORATION_NONE); - decoration = lookup_decoration(&name_decoration, &commit->object); + decoration = get_name_decoration(&commit->object); if (!decoration) return; prefix = " ("; diff --git a/revision.c b/revision.c index 2571ada..5aff2b4 100644 --- a/revision.c +++ b/revision.c @@ -473,7 +473,7 @@ static int rev_compare_tree(struct rev_info *revs, * If we are simplifying by decoration, then the commit * is worth showing if it has a tag pointing at it. */ - if (lookup_decoration(&name_decoration, &commit->object)) + if (get_name_decoration(&commit->object)) return REV_TREE_DIFFERENT; /* * A commit that is not pointed by a tag is uninteresting -- 2.1.0.346.ga0367b9 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] log-tree: make name_decoration hash static 2014-08-26 10:23 ` [PATCH 2/3] log-tree: make name_decoration hash static Jeff King @ 2014-08-26 17:40 ` Junio C Hamano 2014-08-26 17:43 ` Jeff King 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2014-08-26 17:40 UTC (permalink / raw) To: Jeff King Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc Jeff King <peff@peff.net> writes: > In the previous commit, we made add_name_decoration global > so that adders would not have to access the hash directly. > We now make the hash itself static so that callers _have_ to > add through our function, making sure that all additions go > through a single point. To do this, we have to add one more > accessor function: a way to lookup entries in the hash. > > Since the only caller doesn't actually look at the returned > value, but rather only asks whether there is a decoration or > not, we could provide only a boolean "has_name_decoration". > That would allow us to make "struct name_decoration" local > to log-tree, as well. > > However, it's unlikely to cause any maintainability harm > making the actual data public, and this interface is more > flexible if we need to look at decorations from other parts > of the code in the future. I didn't think we want only-bool version, but it is nice to see it explained well. I may have called it lookup_name_decoration() to match, though, if I were doing this patch ;-) Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] log-tree: make name_decoration hash static 2014-08-26 17:40 ` Junio C Hamano @ 2014-08-26 17:43 ` Jeff King 2014-08-26 19:25 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 17:43 UTC (permalink / raw) To: Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc On Tue, Aug 26, 2014 at 10:40:10AM -0700, Junio C Hamano wrote: > I may have called it lookup_name_decoration() to match, though, if I > were doing this patch ;-) Hmph. I called it "get" because that was the opposite of "add" to me, and I was matching "add_name_decoration". Of course, in the regular decoration code, the add function is also "add" and its opposite is "lookup". So mine is gratuitously different. I do not mind if you adjust while applying. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/3] log-tree: make name_decoration hash static 2014-08-26 17:43 ` Jeff King @ 2014-08-26 19:25 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2014-08-26 19:25 UTC (permalink / raw) To: Jeff King Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc Jeff King <peff@peff.net> writes: > On Tue, Aug 26, 2014 at 10:40:10AM -0700, Junio C Hamano wrote: > >> I may have called it lookup_name_decoration() to match, though, if I >> were doing this patch ;-) > > Hmph. I called it "get" because that was the opposite of "add" to me, > and I was matching "add_name_decoration". Of course, in the regular > decoration code, the add function is also "add" and its opposite is > "lookup". So mine is gratuitously different. I do not mind if you adjust > while applying. I do not care too deeply either way, either ;-) I just thought that sharing the verb with the underlying function being wrapped would be more consistent. I wish we used lookup vs get more consistently, though. One should mean "give us if we already have one otherwise fail" while the other should mean "give us one, or create one if there isn't yet". Unfortunately lookup_commit() and remote_get() both do auto-vivify X-<. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 3/3] log-tree: use FLEX_ARRAY in name_decoration 2014-08-26 10:20 ` [PATCH 0/3] name_decoration cleanups Jeff King 2014-08-26 10:23 ` [PATCH 1/3] log-tree: make add_name_decoration a public function Jeff King 2014-08-26 10:23 ` [PATCH 2/3] log-tree: make name_decoration hash static Jeff King @ 2014-08-26 10:24 ` Jeff King 2014-08-27 0:30 ` Eric Sunshine 2 siblings, 1 reply; 32+ messages in thread From: Jeff King @ 2014-08-26 10:24 UTC (permalink / raw) To: Junio C Hamano Cc: Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc We are already using the flex-array technique; let's annotate it with our usual FLEX_ARRAY macro. Besides being more readable, this is slightly more efficient on compilers that understand flex-arrays. Note that we need to bump the allocation in add_name_decoration, which did not explicitly add one byte for the NUL terminator of the string we putting into the flex-array (it did not need to before, because the struct itself was over-allocated by one byte). Signed-off-by: Jeff King <peff@peff.net> --- This could come first in the series, but doing it last means we only have to update one spot. :) commit.h | 2 +- log-tree.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/commit.h b/commit.h index 263b49e..1516fc9 100644 --- a/commit.h +++ b/commit.h @@ -29,7 +29,7 @@ extern const char *commit_type; struct name_decoration { struct name_decoration *next; int type; - char name[1]; + char name[FLEX_ARRAY]; }; enum decoration_type { diff --git a/log-tree.c b/log-tree.c index 7cbc4ee..fb60018 100644 --- a/log-tree.c +++ b/log-tree.c @@ -77,7 +77,7 @@ int parse_decorate_color_config(const char *var, const int ofs, const char *valu void add_name_decoration(enum decoration_type type, const char *name, struct object *obj) { int nlen = strlen(name); - struct name_decoration *res = xmalloc(sizeof(struct name_decoration) + nlen); + struct name_decoration *res = xmalloc(sizeof(*res) + nlen + 1); memcpy(res->name, name, nlen + 1); res->type = type; res->next = add_decoration(&name_decoration, obj, res); -- 2.1.0.346.ga0367b9 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 3/3] log-tree: use FLEX_ARRAY in name_decoration 2014-08-26 10:24 ` [PATCH 3/3] log-tree: use FLEX_ARRAY in name_decoration Jeff King @ 2014-08-27 0:30 ` Eric Sunshine 0 siblings, 0 replies; 32+ messages in thread From: Eric Sunshine @ 2014-08-27 0:30 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Christian Couder, Arjun Sreedharan, git, Christian Couder, Nguyễn Thái Ngọc On Tue, Aug 26, 2014 at 6:24 AM, Jeff King <peff@peff.net> wrote: > We are already using the flex-array technique; let's > annotate it with our usual FLEX_ARRAY macro. Besides being > more readable, this is slightly more efficient on compilers > that understand flex-arrays. > > Note that we need to bump the allocation in add_name_decoration, > which did not explicitly add one byte for the NUL terminator > of the string we putting into the flex-array (it did not s/we/we are/ > need to before, because the struct itself was over-allocated > by one byte). > > Signed-off-by: Jeff King <peff@peff.net> > --- > This could come first in the series, but doing it last means we only > have to update one spot. :) > > commit.h | 2 +- > log-tree.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/commit.h b/commit.h > index 263b49e..1516fc9 100644 > --- a/commit.h > +++ b/commit.h > @@ -29,7 +29,7 @@ extern const char *commit_type; > struct name_decoration { > struct name_decoration *next; > int type; > - char name[1]; > + char name[FLEX_ARRAY]; > }; > > enum decoration_type { > diff --git a/log-tree.c b/log-tree.c > index 7cbc4ee..fb60018 100644 > --- a/log-tree.c > +++ b/log-tree.c > @@ -77,7 +77,7 @@ int parse_decorate_color_config(const char *var, const int ofs, const char *valu > void add_name_decoration(enum decoration_type type, const char *name, struct object *obj) > { > int nlen = strlen(name); > - struct name_decoration *res = xmalloc(sizeof(struct name_decoration) + nlen); > + struct name_decoration *res = xmalloc(sizeof(*res) + nlen + 1); > memcpy(res->name, name, nlen + 1); > res->type = type; > res->next = add_decoration(&name_decoration, obj, res); > -- > 2.1.0.346.ga0367b9 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-24 14:17 [PATCH] bisect: save heap memory. allocate only the required amount Arjun Sreedharan ` (2 preceding siblings ...) 2014-08-25 13:35 ` Jeff King @ 2014-08-25 21:14 ` Junio C Hamano 2014-08-26 7:30 ` Arjun Sreedharan 3 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2014-08-25 21:14 UTC (permalink / raw) To: Arjun Sreedharan Cc: git, Christian Couder, Nguyễn Thái Ngọc Duy Arjun Sreedharan <arjun024@gmail.com> writes: > Find and allocate the required amount instead of allocating extra > 100 bytes > > Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> > --- Interesting. How much memory do we typically waste (in other words, how big did "cnt" grew in your repository where you noticed this)? > bisect.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/bisect.c b/bisect.c > index d6e851d..c96aab0 100644 > --- a/bisect.c > +++ b/bisect.c > @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n > } > qsort(array, cnt, sizeof(*array), compare_commit_dist); > for (p = list, i = 0; i < cnt; i++) { > - struct name_decoration *r = xmalloc(sizeof(*r) + 100); > + char name[100]; > + sprintf(name, "dist=%d", array[i].distance); > + int name_len = strlen(name); Decl-after-stmt. You do not have to run a separate strlen() for this. The return value from sprintf() should tell you how much you need to allocate, I think. > + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); > struct object *obj = &(array[i].commit->object); > > - sprintf(r->name, "dist=%d", array[i].distance); > + memcpy(r->name, name, name_len + 1); > r->next = add_decoration(&name_decoration, obj, r); > p->item = array[i].commit; > p = p->next; ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] bisect: save heap memory. allocate only the required amount 2014-08-25 21:14 ` [PATCH] bisect: save heap memory. allocate only the required amount Junio C Hamano @ 2014-08-26 7:30 ` Arjun Sreedharan 0 siblings, 0 replies; 32+ messages in thread From: Arjun Sreedharan @ 2014-08-26 7:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Christian Couder, Nguyễn Thái Ngọc On 26 August 2014 02:44, Junio C Hamano <gitster@pobox.com> wrote: > Arjun Sreedharan <arjun024@gmail.com> writes: > >> Find and allocate the required amount instead of allocating extra >> 100 bytes >> >> Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> >> --- > > Interesting. How much memory do we typically waste (in other words, > how big did "cnt" grew in your repository where you noticed this)? > I did not try to make an observation of that. I was thinking we're anyway better off not using a magic number to allot memory on the heap for each item in the commit list. >> bisect.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/bisect.c b/bisect.c >> index d6e851d..c96aab0 100644 >> --- a/bisect.c >> +++ b/bisect.c >> @@ -215,10 +215,13 @@ static struct commit_list *best_bisection_sorted(struct commit_list *list, int n >> } >> qsort(array, cnt, sizeof(*array), compare_commit_dist); >> for (p = list, i = 0; i < cnt; i++) { >> - struct name_decoration *r = xmalloc(sizeof(*r) + 100); >> + char name[100]; >> + sprintf(name, "dist=%d", array[i].distance); >> + int name_len = strlen(name); > > Decl-after-stmt. > > You do not have to run a separate strlen() for this. The return > value from sprintf() should tell you how much you need to allocate, > I think. > Yes. yes. >> + struct name_decoration *r = xmalloc(sizeof(*r) + name_len); >> struct object *obj = &(array[i].commit->object); >> >> - sprintf(r->name, "dist=%d", array[i].distance); >> + memcpy(r->name, name, name_len + 1); >> r->next = add_decoration(&name_decoration, obj, r); >> p->item = array[i].commit; >> p = p->next; ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2014-08-27 0:30 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-08-24 14:17 [PATCH] bisect: save heap memory. allocate only the required amount Arjun Sreedharan 2014-08-24 15:10 ` Stefan Beller 2014-08-24 23:39 ` Junio C Hamano 2014-08-25 13:07 ` Jeff King 2014-08-25 21:36 ` Junio C Hamano 2014-08-26 11:03 ` Jeff King 2014-08-26 11:57 ` Ramsay Jones 2014-08-26 12:14 ` Jeff King 2014-08-26 12:37 ` Ramsay Jones 2014-08-26 12:43 ` Jeff King 2014-08-26 12:59 ` Ramsay Jones 2014-08-24 15:32 ` Ramsay Jones 2014-08-24 21:55 ` Arjun Sreedharan 2014-08-25 13:35 ` Jeff King 2014-08-25 14:06 ` Christian Couder 2014-08-25 15:00 ` Jeff King 2014-08-25 18:26 ` Junio C Hamano 2014-08-25 19:35 ` Jeff King 2014-08-25 20:27 ` Arjun Sreedharan 2014-08-25 21:11 ` Junio C Hamano 2014-08-26 10:20 ` [PATCH 0/3] name_decoration cleanups Jeff King 2014-08-26 10:23 ` [PATCH 1/3] log-tree: make add_name_decoration a public function Jeff King 2014-08-26 11:48 ` Ramsay Jones 2014-08-26 12:17 ` Jeff King 2014-08-26 10:23 ` [PATCH 2/3] log-tree: make name_decoration hash static Jeff King 2014-08-26 17:40 ` Junio C Hamano 2014-08-26 17:43 ` Jeff King 2014-08-26 19:25 ` Junio C Hamano 2014-08-26 10:24 ` [PATCH 3/3] log-tree: use FLEX_ARRAY in name_decoration Jeff King 2014-08-27 0:30 ` Eric Sunshine 2014-08-25 21:14 ` [PATCH] bisect: save heap memory. allocate only the required amount Junio C Hamano 2014-08-26 7:30 ` Arjun Sreedharan
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.