* [PATCH] reftable: avoid undefined behaviour breaking t0032 @ 2022-04-15 7:02 Carlo Marcelo Arenas Belón 2022-04-15 7:10 ` Junio C Hamano 2022-04-15 8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón 0 siblings, 2 replies; 16+ messages in thread From: Carlo Marcelo Arenas Belón @ 2022-04-15 7:02 UTC (permalink / raw) To: git; +Cc: hanwen, Carlo Marcelo Arenas Belón At least in glibc based systems, memset with a NULL first parameter will cause a runtime exception. Avoid doing so by adding a conditional to check for NULL in all three identically looking functions that were affected. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> --- Bug was introduced with the original code in 1214aa841bc (reftable: add blocksource, an abstraction for random access reads, 2021-10-07), so not to be considered a regression for this release. reftable/blocksource.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/reftable/blocksource.c b/reftable/blocksource.c index 0044eecd9aa..984bf07fc17 100644 --- a/reftable/blocksource.c +++ b/reftable/blocksource.c @@ -15,7 +15,8 @@ license that can be found in the LICENSE file or at static void strbuf_return_block(void *b, struct reftable_block *dest) { - memset(dest->data, 0xff, dest->len); + if (dest->data) + memset(dest->data, 0xff, dest->len); reftable_free(dest->data); } @@ -56,7 +57,8 @@ void block_source_from_strbuf(struct reftable_block_source *bs, static void malloc_return_block(void *b, struct reftable_block *dest) { - memset(dest->data, 0xff, dest->len); + if (dest->data) + memset(dest->data, 0xff, dest->len); reftable_free(dest->data); } @@ -85,7 +87,8 @@ static uint64_t file_size(void *b) static void file_return_block(void *b, struct reftable_block *dest) { - memset(dest->data, 0xff, dest->len); + if (dest->data) + memset(dest->data, 0xff, dest->len); reftable_free(dest->data); } -- 2.36.0.rc2.283.gbef64175c85 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] reftable: avoid undefined behaviour breaking t0032 2022-04-15 7:02 [PATCH] reftable: avoid undefined behaviour breaking t0032 Carlo Marcelo Arenas Belón @ 2022-04-15 7:10 ` Junio C Hamano 2022-04-15 8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón 1 sibling, 0 replies; 16+ messages in thread From: Junio C Hamano @ 2022-04-15 7:10 UTC (permalink / raw) To: Carlo Marcelo Arenas Belón; +Cc: git, hanwen Carlo Marcelo Arenas Belón <carenas@gmail.com> writes: > At least in glibc based systems, memset with a NULL first parameter > will cause a runtime exception. I take it to mean that the code assumes that it is OK to pass NULL as long as length is 0 (i.e. filling the range of memory whose size is 0 with the specified byte can happen safely no matter what the starting address of that range is, as size==0 by definition should mean a no-op). That would mean we have a rule on how members of dest must be set: .data is allowed to be NULL only when .len is 0. If so, I wonder if we want to guard with dest->len instead, i.e. if (dest->len) memset(dest->data, 0xff, dest->len); With the form in this patch, i.e. > - memset(dest->data, 0xff, dest->len); > + if (dest->data) > + memset(dest->data, 0xff, dest->len); we will fail to catch a bogus caller that violates the rule above that we have on <data, len>. But if we guard with dest->len, then a violator of <data, len> rule will be caught by memset(). Thanks. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2] reftable: avoid undefined behaviour breaking t0032 2022-04-15 7:02 [PATCH] reftable: avoid undefined behaviour breaking t0032 Carlo Marcelo Arenas Belón 2022-04-15 7:10 ` Junio C Hamano @ 2022-04-15 8:30 ` Carlo Marcelo Arenas Belón 2022-04-15 10:21 ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason 2022-04-25 10:18 ` [PATCH v2] reftable: avoid undefined behaviour breaking t0032 Han-Wen Nienhuys 1 sibling, 2 replies; 16+ messages in thread From: Carlo Marcelo Arenas Belón @ 2022-04-15 8:30 UTC (permalink / raw) To: git; +Cc: hanwen, Carlo Marcelo Arenas Belón 1214aa841bc (reftable: add blocksource, an abstraction for random access reads, 2021-10-07), makes the assumption that it is ok to free a reftable_block pointing to NULL if the size is also set to 0, but implements that using a memset call that at least in glibc based system will trigger a runtime exception if called with a NULL pointer as its first parameter. Avoid doing so by adding a conditional to check for the size in all three identically looking functions that were affected, and therefore, still allow memset to help catch callers that might incorrectly pass a NULL pointer with a non zero size, but avoiding the exception for the valid cases. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> --- Changes since v1: - Improved logic as suggested by Junio - Hopefully also improved commit message reftable/blocksource.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/reftable/blocksource.c b/reftable/blocksource.c index 0044eecd9aa..db1b7dc966f 100644 --- a/reftable/blocksource.c +++ b/reftable/blocksource.c @@ -15,7 +15,8 @@ license that can be found in the LICENSE file or at static void strbuf_return_block(void *b, struct reftable_block *dest) { - memset(dest->data, 0xff, dest->len); + if (dest->len) + memset(dest->data, 0xff, dest->len); reftable_free(dest->data); } @@ -56,7 +57,8 @@ void block_source_from_strbuf(struct reftable_block_source *bs, static void malloc_return_block(void *b, struct reftable_block *dest) { - memset(dest->data, 0xff, dest->len); + if (dest->len) + memset(dest->data, 0xff, dest->len); reftable_free(dest->data); } @@ -85,7 +87,8 @@ static uint64_t file_size(void *b) static void file_return_block(void *b, struct reftable_block *dest) { - memset(dest->data, 0xff, dest->len); + if (dest->len) + memset(dest->data, 0xff, dest->len); reftable_free(dest->data); } -- 2.36.0.rc2.283.gbef64175c85 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug 2022-04-15 8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón @ 2022-04-15 10:21 ` Ævar Arnfjörð Bjarmason 2022-04-15 10:21 ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason 2022-04-15 10:21 ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason 2022-04-25 10:18 ` [PATCH v2] reftable: avoid undefined behaviour breaking t0032 Han-Wen Nienhuys 1 sibling, 2 replies; 16+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 UTC (permalink / raw) To: git Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Ævar Arnfjörð Bjarmason On Fri, Apr 15 2022, Carlo Marcelo Arenas Belón wrote: > 1214aa841bc (reftable: add blocksource, an abstraction for random > access reads, 2021-10-07), makes the assumption that it is ok to > free a reftable_block pointing to NULL if the size is also set to > 0, but implements that using a memset call that at least in glibc > based system will trigger a runtime exception if called with a > NULL pointer as its first parameter. FWIW I've been carrying 1/2 here for a while in my local tree, i.e. reftable/* has various abstractions and indirections that aren't really needed. In this case we can just get rid of that & free them, so the memset()s you fixed can just be removed. The 2/2 is then another memset() issue I spotted when looking at this again, -fanalyzer notes the bug related to it. Ævar Arnfjörð Bjarmason (2): reftable: remove the "return_block" abstraction reftable: don't memset() a NULL from failed malloc() reftable/block.c | 4 +--- reftable/blocksource.c | 28 +--------------------------- reftable/publicbasics.c | 2 ++ reftable/reftable-blocksource.h | 2 -- 4 files changed, 4 insertions(+), 32 deletions(-) -- 2.36.0.rc2.863.gfc2c14e3b91 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC PATCH 1/2] reftable: remove the "return_block" abstraction 2022-04-15 10:21 ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 ` Ævar Arnfjörð Bjarmason 2022-04-15 13:37 ` René Scharfe 2022-04-25 9:57 ` Han-Wen Nienhuys 2022-04-15 10:21 ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason 1 sibling, 2 replies; 16+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 UTC (permalink / raw) To: git Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Ævar Arnfjörð Bjarmason This abstraction added in 1214aa841bc (reftable: add blocksource, an abstraction for random access reads, 2021-10-07) has the caller provide a "blockp->data", so there's not point in having the vtable have a custom free() function. In addition this had what looked like a poor man's SANITIZE=address doing a memset() to 0xff just before the data was free'd. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- reftable/block.c | 4 +--- reftable/blocksource.c | 28 +--------------------------- reftable/reftable-blocksource.h | 2 -- 3 files changed, 2 insertions(+), 32 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 34d4d073692..bb17cc32372 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -442,9 +442,7 @@ void block_writer_release(struct block_writer *bw) void reftable_block_done(struct reftable_block *blockp) { - struct reftable_block_source source = blockp->source; - if (blockp && source.ops) - source.ops->return_block(source.arg, blockp); + FREE_AND_NULL(blockp->data); blockp->data = NULL; blockp->len = 0; blockp->source.ops = NULL; diff --git a/reftable/blocksource.c b/reftable/blocksource.c index 2605371c28d..d9e47cc316b 100644 --- a/reftable/blocksource.c +++ b/reftable/blocksource.c @@ -13,12 +13,6 @@ license that can be found in the LICENSE file or at #include "reftable-blocksource.h" #include "reftable-error.h" -static void strbuf_return_block(void *b, struct reftable_block *dest) -{ - memset(dest->data, 0xff, dest->len); - reftable_free(dest->data); -} - static void strbuf_close(void *b) { } @@ -42,7 +36,6 @@ static uint64_t strbuf_size(void *b) static struct reftable_block_source_vtable strbuf_vtable = { .size = &strbuf_size, .read_block = &strbuf_read_block, - .return_block = &strbuf_return_block, .close = &strbuf_close, }; @@ -54,19 +47,7 @@ void block_source_from_strbuf(struct reftable_block_source *bs, bs->arg = buf; } -static void malloc_return_block(void *b, struct reftable_block *dest) -{ - memset(dest->data, 0xff, dest->len); - reftable_free(dest->data); -} - -static struct reftable_block_source_vtable malloc_vtable = { - .return_block = &malloc_return_block, -}; - -static struct reftable_block_source malloc_block_source_instance = { - .ops = &malloc_vtable, -}; +static struct reftable_block_source malloc_block_source_instance = { 0 }; struct reftable_block_source malloc_block_source(void) { @@ -83,12 +64,6 @@ static uint64_t file_size(void *b) return ((struct file_block_source *)b)->size; } -static void file_return_block(void *b, struct reftable_block *dest) -{ - memset(dest->data, 0xff, dest->len); - reftable_free(dest->data); -} - static void file_close(void *b) { int fd = ((struct file_block_source *)b)->fd; @@ -115,7 +90,6 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off, static struct reftable_block_source_vtable file_vtable = { .size = &file_size, .read_block = &file_read_block, - .return_block = &file_return_block, .close = &file_close, }; diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h index 5aa3990a573..7b7cb280b73 100644 --- a/reftable/reftable-blocksource.h +++ b/reftable/reftable-blocksource.h @@ -35,8 +35,6 @@ struct reftable_block_source_vtable { beyond the end of the block */ int (*read_block)(void *source, struct reftable_block *dest, uint64_t off, uint32_t size); - /* mark the block as read; may return the data back to malloc */ - void (*return_block)(void *source, struct reftable_block *blockp); /* release all resources associated with the block source */ void (*close)(void *source); -- 2.36.0.rc2.863.gfc2c14e3b91 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 1/2] reftable: remove the "return_block" abstraction 2022-04-15 10:21 ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason @ 2022-04-15 13:37 ` René Scharfe 2022-04-25 9:57 ` Han-Wen Nienhuys 1 sibling, 0 replies; 16+ messages in thread From: René Scharfe @ 2022-04-15 13:37 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, git Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: > This abstraction added in 1214aa841bc (reftable: add blocksource, an > abstraction for random access reads, 2021-10-07) has the caller > provide a "blockp->data", so there's not point in having the vtable > have a custom free() function. > > In addition this had what looked like a poor man's SANITIZE=address > doing a memset() to 0xff just before the data was free'd. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > reftable/block.c | 4 +--- > reftable/blocksource.c | 28 +--------------------------- > reftable/reftable-blocksource.h | 2 -- > 3 files changed, 2 insertions(+), 32 deletions(-) > > diff --git a/reftable/block.c b/reftable/block.c > index 34d4d073692..bb17cc32372 100644 > --- a/reftable/block.c > +++ b/reftable/block.c > @@ -442,9 +442,7 @@ void block_writer_release(struct block_writer *bw) > > void reftable_block_done(struct reftable_block *blockp) > { > - struct reftable_block_source source = blockp->source; > - if (blockp && source.ops) > - source.ops->return_block(source.arg, blockp); I don't understand this interface. Why is source.arg passed to the function? Is this perhaps done in preparation for some kind of backend that is planned to be added later which makes use of it? I suspect we'd better postpone cleanups until the full functionality is integrated. > + FREE_AND_NULL(blockp->data); If you do that... > blockp->data = NULL; ... then this line becomes redundant. > blockp->len = 0; > blockp->source.ops = NULL; > diff --git a/reftable/blocksource.c b/reftable/blocksource.c > index 2605371c28d..d9e47cc316b 100644 > --- a/reftable/blocksource.c > +++ b/reftable/blocksource.c > @@ -13,12 +13,6 @@ license that can be found in the LICENSE file or at > #include "reftable-blocksource.h" > #include "reftable-error.h" > > -static void strbuf_return_block(void *b, struct reftable_block *dest) > -{ > - memset(dest->data, 0xff, dest->len); > - reftable_free(dest->data); > -} > - > static void strbuf_close(void *b) > { > } > @@ -42,7 +36,6 @@ static uint64_t strbuf_size(void *b) > static struct reftable_block_source_vtable strbuf_vtable = { > .size = &strbuf_size, > .read_block = &strbuf_read_block, > - .return_block = &strbuf_return_block, > .close = &strbuf_close, > }; > > @@ -54,19 +47,7 @@ void block_source_from_strbuf(struct reftable_block_source *bs, > bs->arg = buf; > } > > -static void malloc_return_block(void *b, struct reftable_block *dest) > -{ > - memset(dest->data, 0xff, dest->len); > - reftable_free(dest->data); > -} > - > -static struct reftable_block_source_vtable malloc_vtable = { > - .return_block = &malloc_return_block, > -}; > - > -static struct reftable_block_source malloc_block_source_instance = { > - .ops = &malloc_vtable, > -}; > +static struct reftable_block_source malloc_block_source_instance = { 0 }; > > struct reftable_block_source malloc_block_source(void) > { > @@ -83,12 +64,6 @@ static uint64_t file_size(void *b) > return ((struct file_block_source *)b)->size; > } > > -static void file_return_block(void *b, struct reftable_block *dest) > -{ > - memset(dest->data, 0xff, dest->len); > - reftable_free(dest->data); > -} > - > static void file_close(void *b) > { > int fd = ((struct file_block_source *)b)->fd; > @@ -115,7 +90,6 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off, > static struct reftable_block_source_vtable file_vtable = { > .size = &file_size, > .read_block = &file_read_block, > - .return_block = &file_return_block, > .close = &file_close, > }; > > diff --git a/reftable/reftable-blocksource.h b/reftable/reftable-blocksource.h > index 5aa3990a573..7b7cb280b73 100644 > --- a/reftable/reftable-blocksource.h > +++ b/reftable/reftable-blocksource.h > @@ -35,8 +35,6 @@ struct reftable_block_source_vtable { > beyond the end of the block */ > int (*read_block)(void *source, struct reftable_block *dest, > uint64_t off, uint32_t size); > - /* mark the block as read; may return the data back to malloc */ > - void (*return_block)(void *source, struct reftable_block *blockp); > > /* release all resources associated with the block source */ > void (*close)(void *source); ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 1/2] reftable: remove the "return_block" abstraction 2022-04-15 10:21 ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason 2022-04-15 13:37 ` René Scharfe @ 2022-04-25 9:57 ` Han-Wen Nienhuys 2022-04-25 17:30 ` Junio C Hamano 1 sibling, 1 reply; 16+ messages in thread From: Han-Wen Nienhuys @ 2022-04-25 9:57 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: git, Junio C Hamano, Carlo Marcelo Arenas Belón On Fri, Apr 15, 2022 at 12:21 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > This abstraction added in 1214aa841bc (reftable: add blocksource, an > abstraction for random access reads, 2021-10-07) has the caller > provide a "blockp->data", so there's not point in having the vtable > have a custom free() function. > > In addition this had what looked like a poor man's SANITIZE=address > doing a memset() to 0xff just before the data was free'd. > void reftable_block_done(struct reftable_block *blockp) > { > - struct reftable_block_source source = blockp->source; > - if (blockp && source.ops) > - source.ops->return_block(source.arg, blockp); > + FREE_AND_NULL(blockp->data); My thinking here is that we could mmap the reftable file to do reads. In that case, discarding the block would imply decreasing a refcount somewhere, rather than deallocating memory. -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 1/2] reftable: remove the "return_block" abstraction 2022-04-25 9:57 ` Han-Wen Nienhuys @ 2022-04-25 17:30 ` Junio C Hamano 0 siblings, 0 replies; 16+ messages in thread From: Junio C Hamano @ 2022-04-25 17:30 UTC (permalink / raw) To: Han-Wen Nienhuys Cc: Ævar Arnfjörð Bjarmason, git, Carlo Marcelo Arenas Belón Han-Wen Nienhuys <hanwen@google.com> writes: > On Fri, Apr 15, 2022 at 12:21 PM Ævar Arnfjörð Bjarmason > <avarab@gmail.com> wrote: >> >> This abstraction added in 1214aa841bc (reftable: add blocksource, an >> abstraction for random access reads, 2021-10-07) has the caller >> provide a "blockp->data", so there's not point in having the vtable >> have a custom free() function. >> >> In addition this had what looked like a poor man's SANITIZE=address >> doing a memset() to 0xff just before the data was free'd. > >> void reftable_block_done(struct reftable_block *blockp) >> { >> - struct reftable_block_source source = blockp->source; >> - if (blockp && source.ops) >> - source.ops->return_block(source.arg, blockp); >> + FREE_AND_NULL(blockp->data); > > > My thinking here is that we could mmap the reftable file to do reads. > In that case, discarding the block would imply decreasing a refcount > somewhere, rather than deallocating memory. Sounds like a plan. As a solution to the memset() thing, ripping out this abstraction layer is indeed not just overkill but also doing too much of "while we are at it". Let's take what we've queued on cm/reftable-0-length-memset and merge it down. Thanks. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 10:21 ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason 2022-04-15 10:21 ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 ` Ævar Arnfjörð Bjarmason 2022-04-15 13:37 ` René Scharfe 1 sibling, 1 reply; 16+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-15 10:21 UTC (permalink / raw) To: git Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Ævar Arnfjörð Bjarmason Return NULL instead of possibly feeding a NULL to memset() in reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: reftable/publicbasics.c: In function ‘reftable_calloc’: reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] 43 | memset(p, 0, sz); | ^~~~~~~~~~~~~~~~ [...] This bug has been with us ever since this code was added in ef8a6c62687 (reftable: utility functions, 2021-10-07). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- reftable/publicbasics.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c index 0ad7d5c0ff2..a18167f5ab7 100644 --- a/reftable/publicbasics.c +++ b/reftable/publicbasics.c @@ -40,6 +40,8 @@ void reftable_free(void *p) void *reftable_calloc(size_t sz) { void *p = reftable_malloc(sz); + if (!p) + return NULL; memset(p, 0, sz); return p; } -- 2.36.0.rc2.863.gfc2c14e3b91 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 10:21 ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason @ 2022-04-15 13:37 ` René Scharfe 2022-04-15 13:53 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 16+ messages in thread From: René Scharfe @ 2022-04-15 13:37 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, git Cc: Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: > Return NULL instead of possibly feeding a NULL to memset() in > reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: > > reftable/publicbasics.c: In function ‘reftable_calloc’: > reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] > 43 | memset(p, 0, sz); > | ^~~~~~~~~~~~~~~~ > [...] > > This bug has been with us ever since this code was added in > ef8a6c62687 (reftable: utility functions, 2021-10-07). Not sure it's a bug, or limited to this specific location. AFAICS it's more a general lack of handling of allocation failures in the reftable code. Perhaps it was a conscious decision to let the system deal with them via segfaults? When the code is used by Git eventually it should use xmalloc and xrealloc instead of calling malloc(3) and realloc(3) directly, to handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. This will calm down the analyzer and extend the safety to the rest of the reftable code, not just this memset call. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > reftable/publicbasics.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c > index 0ad7d5c0ff2..a18167f5ab7 100644 > --- a/reftable/publicbasics.c > +++ b/reftable/publicbasics.c > @@ -40,6 +40,8 @@ void reftable_free(void *p) > void *reftable_calloc(size_t sz) > { > void *p = reftable_malloc(sz); > + if (!p) > + return NULL; > memset(p, 0, sz); This function is calloc(3) reimplemented, except it can't make use of pre-zero'd blocks of memory and doesn't multiply element size and count for the caller while checking for overflow, making it slower and harder to use securely. :-/ > return p; > } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 13:37 ` René Scharfe @ 2022-04-15 13:53 ` Ævar Arnfjörð Bjarmason 2022-04-15 14:30 ` Phillip Wood 0 siblings, 1 reply; 16+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-15 13:53 UTC (permalink / raw) To: René Scharfe Cc: git, Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys On Fri, Apr 15 2022, René Scharfe wrote: > Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: >> Return NULL instead of possibly feeding a NULL to memset() in >> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: >> >> reftable/publicbasics.c: In function ‘reftable_calloc’: >> reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] >> 43 | memset(p, 0, sz); >> | ^~~~~~~~~~~~~~~~ >> [...] >> >> This bug has been with us ever since this code was added in >> ef8a6c62687 (reftable: utility functions, 2021-10-07). > > Not sure it's a bug, or limited to this specific location. AFAICS it's > more a general lack of handling of allocation failures in the reftable > code. Perhaps it was a conscious decision to let the system deal with > them via segfaults? I think it's just buggy, it tries to deal with malloc returning NULL in other contexts. > When the code is used by Git eventually it should use xmalloc and > xrealloc instead of calling malloc(3) and realloc(3) directly, to > handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. > This will calm down the analyzer and extend the safety to the rest of > the reftable code, not just this memset call. Yeah, its whole custom malloc handling should go away. >> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> >> --- >> reftable/publicbasics.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c >> index 0ad7d5c0ff2..a18167f5ab7 100644 >> --- a/reftable/publicbasics.c >> +++ b/reftable/publicbasics.c >> @@ -40,6 +40,8 @@ void reftable_free(void *p) >> void *reftable_calloc(size_t sz) >> { >> void *p = reftable_malloc(sz); >> + if (!p) >> + return NULL; >> memset(p, 0, sz); > > This function is calloc(3) reimplemented, except it can't make use of > pre-zero'd blocks of memory and doesn't multiply element size and count > for the caller while checking for overflow, making it slower and harder > to use securely. :-/ *nod*, this is really just re-arranging the deck chairs. Maybe Han-Wen had something in mind, but I really don't see the point of having it use malloc wrappers at all, as opposed to just having the linker point it to the right "malloc". So if it needs to be used outside of git.git it would just need the trivial xmalloc() etc. wrappers. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 13:53 ` Ævar Arnfjörð Bjarmason @ 2022-04-15 14:30 ` Phillip Wood 2022-04-15 15:20 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 16+ messages in thread From: Phillip Wood @ 2022-04-15 14:30 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, René Scharfe Cc: git, Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys Hi Ævar and René On 15/04/2022 14:53, Ævar Arnfjörð Bjarmason wrote: > > On Fri, Apr 15 2022, René Scharfe wrote: > >> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: >>> Return NULL instead of possibly feeding a NULL to memset() in >>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: >>> >>> reftable/publicbasics.c: In function ‘reftable_calloc’: >>> reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] >>> 43 | memset(p, 0, sz); >>> | ^~~~~~~~~~~~~~~~ >>> [...] >>> >>> This bug has been with us ever since this code was added in >>> ef8a6c62687 (reftable: utility functions, 2021-10-07). >> >> Not sure it's a bug, or limited to this specific location. AFAICS it's >> more a general lack of handling of allocation failures in the reftable >> code. Perhaps it was a conscious decision to let the system deal with >> them via segfaults? > > I think it's just buggy, it tries to deal with malloc returning NULL in > other contexts. Does it? I just quickly scanned the output of git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master and I didn't see a single NULL check >> When the code is used by Git eventually it should use xmalloc and >> xrealloc instead of calling malloc(3) and realloc(3) directly, to >> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. >> This will calm down the analyzer and extend the safety to the rest of >> the reftable code, not just this memset call. > > Yeah, its whole custom malloc handling should go away. Isn't it there so the same code can be used by libgit2 and other things that let the caller specify a custom allocator? Best Wishes Phillip ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 14:30 ` Phillip Wood @ 2022-04-15 15:20 ` Ævar Arnfjörð Bjarmason 2022-04-15 16:23 ` Junio C Hamano 0 siblings, 1 reply; 16+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-15 15:20 UTC (permalink / raw) To: Phillip Wood Cc: René Scharfe, git, Junio C Hamano, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Edward Thomson, Patrick Steinhardt On Fri, Apr 15 2022, Phillip Wood wrote: > Hi Ævar and René > > On 15/04/2022 14:53, Ævar Arnfjörð Bjarmason wrote: >> On Fri, Apr 15 2022, René Scharfe wrote: >> >>> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: >>>> Return NULL instead of possibly feeding a NULL to memset() in >>>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: >>>> >>>> reftable/publicbasics.c: In function ‘reftable_calloc’: >>>> reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] >>>> 43 | memset(p, 0, sz); >>>> | ^~~~~~~~~~~~~~~~ >>>> [...] >>>> >>>> This bug has been with us ever since this code was added in >>>> ef8a6c62687 (reftable: utility functions, 2021-10-07). >>> >>> Not sure it's a bug, or limited to this specific location. AFAICS it's >>> more a general lack of handling of allocation failures in the reftable >>> code. Perhaps it was a conscious decision to let the system deal with >>> them via segfaults? >> I think it's just buggy, it tries to deal with malloc returning NULL >> in >> other contexts. > > Does it? I just quickly scanned the output of > > git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master > > and I didn't see a single NULL check I think you're right, I retrieved that information from wetware. Looking again I think I was confusing it with the if (x) free(x) fixes in 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16). >>> When the code is used by Git eventually it should use xmalloc and >>> xrealloc instead of calling malloc(3) and realloc(3) directly, to >>> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. >>> This will calm down the analyzer and extend the safety to the rest of >>> the reftable code, not just this memset call. >> Yeah, its whole custom malloc handling should go away. > > Isn't it there so the same code can be used by libgit2 and other > things that let the caller specify a custom allocator? I don't think so, but I may be wrong. I think it's a case of over-generalization where we'd be better off with something simpler, i.e. just using our normal allocation functions. That still allows for taking this code and plugging it into any custom allocator. If you simply want to compile some code such as this to use another allocator you can easily do that via the linker, as e.g. git.git's build process allows you to do with nedmalloc. So presumably if libgit2 would want to use this they'd just do that via their build process. I.e. they'd have "malloc" point to a symbol that would resolve to a shimmy layer that would dispatch to functions using this: https://github.com/libgit2/libgit2/blob/main/src/util/alloc.c The reason you'd use these sorts of pluggable malloc functions is, I think, a few: 1. You are a library like libgit2, as opposed to libreftable itself, so you want to provide this sort of "set the alloc pointers to this" now. But in this case we either always want malloc/free (git.git) or the "parent" can provide these wrappers (libgit2). 2. You want to support switching dynamically (or not-globally), or have N instances with different malloc, as e.g. the PCREv2 API does: cbe81e653fa (grep/pcre2: move back to thread-only PCREv2 structures, 2021-02-18). For this code I think that's thoroughly in YAGNI territory. 3. Your distribution model can't assume the user can adjust this via linking, e.g. C source that's intended to be #included as-is (although there you could use defines...) etc. But maybe I'm missing something. But a really good reason not to do this and just rely on the link-time overriding is: A. Things look the same, and benefit from more general fixes (although to be fair the x*alloc() wrappers would work either way..) B. You don't get subtle bugs where you forget some_custom_malloc() and then use a generic free(). Having looked just now reftable/ has at least one such submarine-segfault waiting to happen. C. If you don't actually need the hyper-customize pluggable model of say PCREv2 where the library is trying to support having two patterns in memory managed by different allocators, or other such advanced use-cases it's just extra complexity. Some of this was discussed for xdiff/* in this thread: https://lore.kernel.org/git/220216.86leybszht.gmgdl@evledraar.gmail.com/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 15:20 ` Ævar Arnfjörð Bjarmason @ 2022-04-15 16:23 ` Junio C Hamano 2022-04-25 10:30 ` Han-Wen Nienhuys 0 siblings, 1 reply; 16+ messages in thread From: Junio C Hamano @ 2022-04-15 16:23 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Phillip Wood, René Scharfe, git, Carlo Marcelo Arenas Belón, Han-Wen Nienhuys, Edward Thomson, Patrick Steinhardt Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> Does it? I just quickly scanned the output of >> >> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master >> >> and I didn't see a single NULL check > > I think you're right, I retrieved that information from wetware. Looking > again I think I was confusing it with the if (x) free(x) fixes in > 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16). True. Initially I hoped that these RFC changes gives us a better solution that comes from stepping back and rethinking the issues around the original "why are we calling memset() with NULL?", and if it were only "well, in _return_block() functions, we clear the block before calling _free()---that shouldn't be necessary, if the calling custom malloc-free pair cares, they can do the clearing, and plain vanilla free() certainly doesn't, so let's stop calling memset()", it certainly would have fallen into that category, but everything these RFC patches do beyond that seems "oh, it does not look important to me, so let's rip it out to simplify", without knowing enough to say if that is sensible. But ... >> Isn't it there so the same code can be used by libgit2 and other >> things that let the caller specify a custom allocator? > > I don't think so, but I may be wrong. I think it's a case of > over-generalization where we'd be better off with something simpler, > i.e. just using our normal allocation functions. ... many points that was raised on the reftable code in this thread may deserve response to explain the rationale behind the current code and design, as nobody can improve, without breaking the intended way to be used, without knowing what it is. Thanks for marking the patches with RFC. I think the "patch" is a bit too dense a form to point out the issues in the existing code, but the discussion that follows by quoting the points in the original code and suggested changes is a good way to think about the code before these RFC patches. Ideally, it would have been much better if these points were raised during the review before the code was accepted to the code base, but it is better to have one now, rather than never ;-) THanks. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() 2022-04-15 16:23 ` Junio C Hamano @ 2022-04-25 10:30 ` Han-Wen Nienhuys 0 siblings, 0 replies; 16+ messages in thread From: Han-Wen Nienhuys @ 2022-04-25 10:30 UTC (permalink / raw) To: Junio C Hamano Cc: Ævar Arnfjörð Bjarmason, Phillip Wood, René Scharfe, git, Carlo Marcelo Arenas Belón, Edward Thomson, Patrick Steinhardt On Fri, Apr 15, 2022 at 6:23 PM Junio C Hamano <gitster@pobox.com> wrote: > >> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master > >> > >> and I didn't see a single NULL check > > > > I think you're right, I retrieved that information from wetware. Looking > > again I think I was confusing it with the if (x) free(x) fixes in > > 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16). > > True. Initially I hoped that these RFC changes gives us a better > solution that comes from stepping back and rethinking the issues > around the original "why are we calling memset() with NULL?", and memset with NULL is an oversight. The malloc routines are pluggable so the code could be reused for libgit2. However, as use within Git itself is still not imminent, they could just as well be removed as they are just a premature generalization right now. > if it were only "well, in _return_block() functions, we clear the > block before calling _free()---that shouldn't be necessary, if the > calling custom malloc-free pair cares, they can do the clearing, and > plain vanilla free() certainly doesn't, so let's stop calling The memset() calls on free() (eg. in are there to tease out use-after-free bugs in the unittests, but they should probably be removed from file_return_block() as that is production code. -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] reftable: avoid undefined behaviour breaking t0032 2022-04-15 8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón 2022-04-15 10:21 ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason @ 2022-04-25 10:18 ` Han-Wen Nienhuys 1 sibling, 0 replies; 16+ messages in thread From: Han-Wen Nienhuys @ 2022-04-25 10:18 UTC (permalink / raw) To: Carlo Marcelo Arenas Belón; +Cc: git On Fri, Apr 15, 2022 at 10:34 AM Carlo Marcelo Arenas Belón <carenas@gmail.com> wrote: > 1214aa841bc (reftable: add blocksource, an abstraction for random > access reads, 2021-10-07), makes the assumption that it is ok to > free a reftable_block pointing to NULL if the size is also set to >.. either patch (data or len) LGTM -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2022-04-25 17:30 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-04-15 7:02 [PATCH] reftable: avoid undefined behaviour breaking t0032 Carlo Marcelo Arenas Belón 2022-04-15 7:10 ` Junio C Hamano 2022-04-15 8:30 ` [PATCH v2] " Carlo Marcelo Arenas Belón 2022-04-15 10:21 ` [RFC PATCH 0/2] reftable: remove poor man's SANITIZE=address, fix a memset() bug Ævar Arnfjörð Bjarmason 2022-04-15 10:21 ` [RFC PATCH 1/2] reftable: remove the "return_block" abstraction Ævar Arnfjörð Bjarmason 2022-04-15 13:37 ` René Scharfe 2022-04-25 9:57 ` Han-Wen Nienhuys 2022-04-25 17:30 ` Junio C Hamano 2022-04-15 10:21 ` [RFC PATCH 2/2] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason 2022-04-15 13:37 ` René Scharfe 2022-04-15 13:53 ` Ævar Arnfjörð Bjarmason 2022-04-15 14:30 ` Phillip Wood 2022-04-15 15:20 ` Ævar Arnfjörð Bjarmason 2022-04-15 16:23 ` Junio C Hamano 2022-04-25 10:30 ` Han-Wen Nienhuys 2022-04-25 10:18 ` [PATCH v2] reftable: avoid undefined behaviour breaking t0032 Han-Wen Nienhuys
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).