All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Extend and add a little more generalization to the mem_pool API
@ 2020-08-14  3:02 Elijah Newren via GitGitGadget
  2020-08-14  3:02 ` [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  3:02 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Elijah Newren

In my new merge algorithm, I made use of the mem_pool API in a few
places...but I also needed to add a few more functions. More importantly,
though, I couldn't reuse the existing API as-is since I want to be able to
discard the memory (after each recursive merge) and then continue adding to
the pool, something the existing API was not amenable to. Instead of writing
another competing API, since there is only one other user, I just modified
the existing callsites to use the more generic calling structure that I
needed.

Unfortunately, Matheus' parallel-checkout RFC series (not yet in next or
seen as far as I can tell) adds a few more mem_pool callers, so this series
conflicts with his (semantically). I can rebuild mine on top of his, or,
since his is longer and would probably advance more slowly, it may make
sense to have his series be based on this one. If so, I'm happy to help him
update his to depend on this series. Let me know preferences.

Elijah Newren (3):
  mem-pool: add convenience functions for xstrdup and xstrndup
  mem-pool: use more standard initialization and finalization
  mem-pool: use consistent pool variable name

 fast-import.c | 12 ++-------
 mem-pool.c    | 75 ++++++++++++++++++++++++++++++++-------------------
 mem-pool.h    | 14 +++++++---
 read-cache.c  | 21 +++++++++------
 split-index.c |  6 +++--
 5 files changed, 76 insertions(+), 52 deletions(-)


base-commit: 7814e8a05a59c0cf5fb186661d1551c75d1299b5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-830%2Fnewren%2Fmem_pool_api-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-830/newren/mem_pool_api-v1
Pull-Request: https://github.com/git/git/pull/830
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup
  2020-08-14  3:02 [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
@ 2020-08-14  3:02 ` Elijah Newren via GitGitGadget
  2020-08-14  4:42   ` Eric Sunshine
  2020-08-14  3:02 ` [PATCH 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  3:02 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

fast-import had a special mem_pool_xstrdup() convenience function that I
want to be able to use from the new merge algorithm I am writing.  Move
it from fast-import to mem-pool, and also add a mem_pool_xstrndup()
while at it that I also want to use.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 fast-import.c | 12 ++----------
 mem-pool.c    | 23 +++++++++++++++++++++++
 mem-pool.h    |  6 ++++++
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index ce47794db6..dd5b563950 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -526,14 +526,6 @@ static unsigned int hc_str(const char *s, size_t len)
 	return r;
 }
 
-static char *pool_strdup(const char *s)
-{
-	size_t len = strlen(s) + 1;
-	char *r = mem_pool_alloc(&fi_mem_pool, len);
-	memcpy(r, s, len);
-	return r;
-}
-
 static void insert_mark(struct mark_set *s, uintmax_t idnum, struct object_entry *oe)
 {
 	while ((idnum >> s->shift) >= 1024) {
@@ -615,7 +607,7 @@ static struct branch *new_branch(const char *name)
 		die("Branch name doesn't conform to GIT standards: %s", name);
 
 	b = mem_pool_calloc(&fi_mem_pool, 1, sizeof(struct branch));
-	b->name = pool_strdup(name);
+	b->name = mem_pool_xstrdup(&fi_mem_pool, name);
 	b->table_next_branch = branch_table[hc];
 	b->branch_tree.versions[0].mode = S_IFDIR;
 	b->branch_tree.versions[1].mode = S_IFDIR;
@@ -2806,7 +2798,7 @@ static void parse_new_tag(const char *arg)
 
 	t = mem_pool_alloc(&fi_mem_pool, sizeof(struct tag));
 	memset(t, 0, sizeof(struct tag));
-	t->name = pool_strdup(arg);
+	t->name = mem_pool_xstrdup(&fi_mem_pool, arg);
 	if (last_tag)
 		last_tag->next_tag = t;
 	else
diff --git a/mem-pool.c b/mem-pool.c
index a2841a4a9a..3a8c54d9df 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -102,6 +102,29 @@ void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
 	return r;
 }
 
+char *mem_pool_xstrdup(struct mem_pool *pool, const char *str)
+{
+	size_t len = strlen(str) + 1;
+	char *ret = mem_pool_alloc(pool, len);
+
+	if (!ret)
+		die("Out of memory, mem_pool_xstrdup failed");
+
+	return memcpy(ret, str, len);
+}
+
+char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
+{
+	size_t minlen = strnlen(str, len);
+	char *ret = mem_pool_alloc(pool, minlen+1);
+
+	if (!ret)
+		die("Out of memory, mem_pool_xstrndup failed");
+
+	ret[minlen] = '\0';
+	return memcpy(ret, str, minlen);
+}
+
 int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
 {
 	struct mp_block *p;
diff --git a/mem-pool.h b/mem-pool.h
index 999d3c3a52..fcaa2d462b 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -41,6 +41,12 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len);
  */
 void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size);
 
+/*
+ * Allocate memory from the memory pool and copy str into it.
+ */
+char *mem_pool_xstrdup(struct mem_pool *pool, const char *str);
+char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len);
+
 /*
  * Move the memory associated with the 'src' pool to the 'dst' pool. The 'src'
  * pool will be empty and not contain any memory. It still needs to be free'd
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] mem-pool: use more standard initialization and finalization
  2020-08-14  3:02 [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
  2020-08-14  3:02 ` [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
@ 2020-08-14  3:02 ` Elijah Newren via GitGitGadget
  2020-08-14  4:38   ` Junio C Hamano
  2020-08-14  3:02 ` [PATCH 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  3:02 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

A typical memory type, such as strbuf, hashmap, or string_list can be
stored on the stack or embedded within another structure.  mem_pool
cannot be, because of how mem_pool_init() and mem_pool_discard() are
written.  mem_pool_init() does essentially the following (simplified
for purposes of explanation here):

    void mem_pool_init(struct mem_pool **pool...)
    {
        *pool = xcalloc(1, sizeof(*pool));

It seems weird to require that mem_pools can only be accessed through a
pointer.  It also seems slightly dangerous: unlike strbuf_release() or
strbuf_reset() or string_list_clear(), all of which put the data
structure into a state where it can be re-used after the call,
mem_pool_discard(pool) will leave pool pointing at free'd memory.
read-cache (and split-index) are the only current users of mem_pools,
and they haven't fallen into a use-after-free mistake here, but it seems
likely to be problematic for future users especially since several of
the current callers of mem_pool_init() will only call it when the
mem_pool* is not already allocated (i.e. is NULL).

This type of mechanism also prevents finding synchronization
points where one can free existing memory and then resume more
operations.  It would be natural at such points to run something like
    mem_pool_discard(pool...);
and, if necessary,
    mem_pool_init(&pool...);
and then carry on continuing to use the pool.  However, this fails badly
if several objects had a copy of the value of pool from before these
commands; in such a case, those objects won't get the updated value of
pool that mem_pool_init() overwrites pool with and they'll all instead
be reading and writing from free'd memory.

Modify mem_pool_init()/mem_pool_discard() to behave more like
   strbuf_init()/strbuf_release()
or
   string_list_init()/string_list_clear()
In particular: (1) make mem_pool_init() just take a mem_pool* and have
it only worry about allocating struct mp_blocks, not the struct mem_pool
itself, (2) make mem_pool_discard() free the memory that the pool was
responsible for, but leave it in a state where it can be used to
allocate more memory afterward (without the need to call mem_pool_init()
again).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 mem-pool.c    | 20 +++++++-------------
 mem-pool.h    |  4 ++--
 read-cache.c  | 21 +++++++++++++--------
 split-index.c |  6 ++++--
 4 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/mem-pool.c b/mem-pool.c
index 3a8c54d9df..b7d789823e 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -33,21 +33,14 @@ static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t b
 	return p;
 }
 
-void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size)
+void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size)
 {
-	struct mem_pool *pool;
-
-	if (*mem_pool)
-		return;
-
-	pool = xcalloc(1, sizeof(*pool));
-
-	pool->block_alloc = BLOCK_GROWTH_SIZE;
+	mem_pool->mp_block = NULL;
+	mem_pool->pool_alloc = 0;
+	mem_pool->block_alloc = BLOCK_GROWTH_SIZE;
 
 	if (initial_size > 0)
-		mem_pool_alloc_block(pool, initial_size, NULL);
-
-	*mem_pool = pool;
+		mem_pool_alloc_block(mem_pool, initial_size, NULL);
 }
 
 void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
@@ -66,7 +59,8 @@ void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
 		free(block_to_free);
 	}
 
-	free(mem_pool);
+	mem_pool->mp_block = NULL;
+	mem_pool->pool_alloc = 0;
 }
 
 void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
diff --git a/mem-pool.h b/mem-pool.h
index fcaa2d462b..30b7a8c03b 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -24,10 +24,10 @@ struct mem_pool {
 /*
  * Initialize mem_pool with specified initial size.
  */
-void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size);
+void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size);
 
 /*
- * Discard a memory pool and free all the memory it is responsible for.
+ * Discard all the memory the memory pool is responsible for.
  */
 void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory);
 
diff --git a/read-cache.c b/read-cache.c
index 8ed1c29b54..fa291cdbee 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -89,8 +89,10 @@ static struct mem_pool *find_mem_pool(struct index_state *istate)
 	else
 		pool_ptr = &istate->ce_mem_pool;
 
-	if (!*pool_ptr)
-		mem_pool_init(pool_ptr, 0);
+	if (!*pool_ptr) {
+		*pool_ptr = xmalloc(sizeof(**pool_ptr));
+		mem_pool_init(*pool_ptr, 0);
+	}
 
 	return *pool_ptr;
 }
@@ -2006,11 +2008,12 @@ static unsigned long load_all_cache_entries(struct index_state *istate,
 {
 	unsigned long consumed;
 
+	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
 	if (istate->version == 4) {
-		mem_pool_init(&istate->ce_mem_pool,
+		mem_pool_init(istate->ce_mem_pool,
 				estimate_cache_size_from_compressed(istate->cache_nr));
 	} else {
-		mem_pool_init(&istate->ce_mem_pool,
+		mem_pool_init(istate->ce_mem_pool,
 				estimate_cache_size(mmap_size, istate->cache_nr));
 	}
 
@@ -2070,7 +2073,8 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 	if (istate->name_hash_initialized)
 		BUG("the name hash isn't thread safe");
 
-	mem_pool_init(&istate->ce_mem_pool, 0);
+	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
+	mem_pool_init(istate->ce_mem_pool, 0);
 
 	/* ensure we have no more threads than we have blocks to process */
 	if (nr_threads > ieot->nr)
@@ -2097,11 +2101,12 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 		nr = 0;
 		for (j = p->ieot_start; j < p->ieot_start + p->ieot_blocks; j++)
 			nr += p->ieot->entries[j].nr;
+		istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
 		if (istate->version == 4) {
-			mem_pool_init(&p->ce_mem_pool,
+			mem_pool_init(p->ce_mem_pool,
 				estimate_cache_size_from_compressed(nr));
 		} else {
-			mem_pool_init(&p->ce_mem_pool,
+			mem_pool_init(p->ce_mem_pool,
 				estimate_cache_size(mmap_size, nr));
 		}
 
@@ -2358,7 +2363,7 @@ int discard_index(struct index_state *istate)
 
 	if (istate->ce_mem_pool) {
 		mem_pool_discard(istate->ce_mem_pool, should_validate_cache_entries());
-		istate->ce_mem_pool = NULL;
+		FREE_AND_NULL(istate->ce_mem_pool);
 	}
 
 	return 0;
diff --git a/split-index.c b/split-index.c
index e6154e4ea9..c0e8ad670d 100644
--- a/split-index.c
+++ b/split-index.c
@@ -79,8 +79,10 @@ void move_cache_to_base_index(struct index_state *istate)
 	if (si->base &&
 		si->base->ce_mem_pool) {
 
-		if (!istate->ce_mem_pool)
-			mem_pool_init(&istate->ce_mem_pool, 0);
+		if (!istate->ce_mem_pool) {
+			istate->ce_mem_pool = xmalloc(sizeof(struct mem_pool));
+			mem_pool_init(istate->ce_mem_pool, 0);
+		}
 
 		mem_pool_combine(istate->ce_mem_pool, istate->split_index->base->ce_mem_pool);
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] mem-pool: use consistent pool variable name
  2020-08-14  3:02 [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
  2020-08-14  3:02 ` [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
  2020-08-14  3:02 ` [PATCH 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
@ 2020-08-14  3:02 ` Elijah Newren via GitGitGadget
  2020-08-14  3:51 ` [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Matheus Tavares Bernardino
  2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
  4 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  3:02 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

About half the function declarations in mem-pool.h used 'struct mem_pool
*pool', while the other half used 'struct mem_pool *mem_pool'.  Make the
code a bit more consistent by just using 'pool' in preference to
'mem_pool' everywhere.

No behavioral changes included; this is just a mechanical rename (though
a line or two was rewrapped as well).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 mem-pool.c | 50 ++++++++++++++++++++++++++------------------------
 mem-pool.h |  6 +++---
 2 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/mem-pool.c b/mem-pool.c
index b7d789823e..08039ed069 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -12,11 +12,13 @@
  * `insert_after`. If `insert_after` is NULL, then insert block at the
  * head of the linked list.
  */
-static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t block_alloc, struct mp_block *insert_after)
+static struct mp_block *mem_pool_alloc_block(struct mem_pool *pool,
+					     size_t block_alloc,
+					     struct mp_block *insert_after)
 {
 	struct mp_block *p;
 
-	mem_pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
+	pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
 	p = xmalloc(st_add(sizeof(struct mp_block), block_alloc));
 
 	p->next_free = (char *)p->space;
@@ -26,28 +28,28 @@ static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t b
 		p->next_block = insert_after->next_block;
 		insert_after->next_block = p;
 	} else {
-		p->next_block = mem_pool->mp_block;
-		mem_pool->mp_block = p;
+		p->next_block = pool->mp_block;
+		pool->mp_block = p;
 	}
 
 	return p;
 }
 
-void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size)
+void mem_pool_init(struct mem_pool *pool, size_t initial_size)
 {
-	mem_pool->mp_block = NULL;
-	mem_pool->pool_alloc = 0;
-	mem_pool->block_alloc = BLOCK_GROWTH_SIZE;
+	pool->mp_block = NULL;
+	pool->pool_alloc = 0;
+	pool->block_alloc = BLOCK_GROWTH_SIZE;
 
 	if (initial_size > 0)
-		mem_pool_alloc_block(mem_pool, initial_size, NULL);
+		mem_pool_alloc_block(pool, initial_size, NULL);
 }
 
-void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
+void mem_pool_discard(struct mem_pool *pool, int invalidate_memory)
 {
 	struct mp_block *block, *block_to_free;
 
-	block = mem_pool->mp_block;
+	block = pool->mp_block;
 	while (block)
 	{
 		block_to_free = block;
@@ -59,11 +61,11 @@ void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
 		free(block_to_free);
 	}
 
-	mem_pool->mp_block = NULL;
-	mem_pool->pool_alloc = 0;
+	pool->mp_block = NULL;
+	pool->pool_alloc = 0;
 }
 
-void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
+void *mem_pool_alloc(struct mem_pool *pool, size_t len)
 {
 	struct mp_block *p = NULL;
 	void *r;
@@ -72,15 +74,15 @@ void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 	if (len & (sizeof(uintmax_t) - 1))
 		len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));
 
-	if (mem_pool->mp_block &&
-	    mem_pool->mp_block->end - mem_pool->mp_block->next_free >= len)
-		p = mem_pool->mp_block;
+	if (pool->mp_block &&
+	    pool->mp_block->end - pool->mp_block->next_free >= len)
+		p = pool->mp_block;
 
 	if (!p) {
-		if (len >= (mem_pool->block_alloc / 2))
-			return mem_pool_alloc_block(mem_pool, len, mem_pool->mp_block);
+		if (len >= (pool->block_alloc / 2))
+			return mem_pool_alloc_block(pool, len, pool->mp_block);
 
-		p = mem_pool_alloc_block(mem_pool, mem_pool->block_alloc, NULL);
+		p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
 	}
 
 	r = p->next_free;
@@ -88,10 +90,10 @@ void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 	return r;
 }
 
-void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
+void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size)
 {
 	size_t len = st_mult(count, size);
-	void *r = mem_pool_alloc(mem_pool, len);
+	void *r = mem_pool_alloc(pool, len);
 	memset(r, 0, len);
 	return r;
 }
@@ -119,12 +121,12 @@ char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
 	return memcpy(ret, str, minlen);
 }
 
-int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
+int mem_pool_contains(struct mem_pool *pool, void *mem)
 {
 	struct mp_block *p;
 
 	/* Check if memory is allocated in a block */
-	for (p = mem_pool->mp_block; p; p = p->next_block)
+	for (p = pool->mp_block; p; p = p->next_block)
 		if ((mem >= ((void *)p->space)) &&
 		    (mem < ((void *)p->end)))
 			return 1;
diff --git a/mem-pool.h b/mem-pool.h
index 30b7a8c03b..022b3097e9 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -24,12 +24,12 @@ struct mem_pool {
 /*
  * Initialize mem_pool with specified initial size.
  */
-void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size);
+void mem_pool_init(struct mem_pool *pool, size_t initial_size);
 
 /*
  * Discard all the memory the memory pool is responsible for.
  */
-void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory);
+void mem_pool_discard(struct mem_pool *pool, int invalidate_memory);
 
 /*
  * Alloc memory from the mem_pool.
@@ -58,6 +58,6 @@ void mem_pool_combine(struct mem_pool *dst, struct mem_pool *src);
  * Check if a memory pointed at by 'mem' is part of the range of
  * memory managed by the specified mem_pool.
  */
-int mem_pool_contains(struct mem_pool *mem_pool, void *mem);
+int mem_pool_contains(struct mem_pool *pool, void *mem);
 
 #endif
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/3] Extend and add a little more generalization to the mem_pool API
  2020-08-14  3:02 [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-08-14  3:02 ` [PATCH 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
@ 2020-08-14  3:51 ` Matheus Tavares Bernardino
  2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
  4 siblings, 0 replies; 17+ messages in thread
From: Matheus Tavares Bernardino @ 2020-08-14  3:51 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren

Hi, Elijah

On Fri, Aug 14, 2020 at 12:02 AM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Unfortunately, Matheus' parallel-checkout RFC series (not yet in next or
> seen as far as I can tell) adds a few more mem_pool callers, so this series
> conflicts with his (semantically). I can rebuild mine on top of his, or,
> since his is longer and would probably advance more slowly, it may make
> sense to have his series be based on this one. If so, I'm happy to help him
> update his to depend on this series. Let me know preferences.

Thanks for the heads up. Yeah, let me rebuild my series on top of
yours. From a quick look at your patches, I think it should be quite
straightforward to rebase my changes.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] mem-pool: use more standard initialization and finalization
  2020-08-14  3:02 ` [PATCH 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
@ 2020-08-14  4:38   ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2020-08-14  4:38 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Matheus Tavares, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> written.  mem_pool_init() does essentially the following (simplified
> for purposes of explanation here):
>
>     void mem_pool_init(struct mem_pool **pool...)
>     {
>         *pool = xcalloc(1, sizeof(*pool));
>
> It seems weird to require that mem_pools can only be accessed through a
> pointer.

Yup, if the _init() were to also allocate, I would expect it to be
more like

	struct mem_pool *mem_pool_create(...)
	{
		struct mem_pool *pool = xcalloc(1, sizeof(*pool));
		...
		return pool;
	}

It also is OK to let the caller pass uninitialized region of memory,
which is how we usually arrange _init() to work.  It seems that that
is the approach this patch takes.

> -void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size)
> +void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size)
>  {
> -	struct mem_pool *pool;
> -
> -	if (*mem_pool)
> -		return;
> -
> -	pool = xcalloc(1, sizeof(*pool));
> -
> -	pool->block_alloc = BLOCK_GROWTH_SIZE;
> +	mem_pool->mp_block = NULL;
> +	mem_pool->pool_alloc = 0;
> +	mem_pool->block_alloc = BLOCK_GROWTH_SIZE;
>  
>  	if (initial_size > 0)
> -		mem_pool_alloc_block(pool, initial_size, NULL);
> -
> -	*mem_pool = pool;
> +		mem_pool_alloc_block(mem_pool, initial_size, NULL);

It used to be that this function both knew and took control of all
the bits in *pool memory by using xcalloc().  Any field the function
assigned to of course got explicitly the value the function wanted
it to have, and all other fields were left to 0.

It may happen to be still the case (i.e. the assignments we see in
this function cover all the fields defined), but don't we need some
provision to make sure it will hold to be true in the future?

Starting it with "memset(pool, 0, sizeof(*pool)" would be one way.

You'd standardize to s/mem_pool/pool/ in [3/3]; shouldn't this be
written with pool to begin with, instead of reintroducing mem_pool
that is of different type from the original?

> -	if (!*pool_ptr)
> -		mem_pool_init(pool_ptr, 0);
> +	if (!*pool_ptr) {
> +		*pool_ptr = xmalloc(sizeof(**pool_ptr));
> +		mem_pool_init(*pool_ptr, 0);

This one gives an uninitialized chunk of memory to the _init(); an
example of the caller that the earlier comment may matter.

> +	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
>  	if (istate->version == 4) {
> -		mem_pool_init(&istate->ce_mem_pool,
> +		mem_pool_init(istate->ce_mem_pool,
>  				estimate_cache_size_from_compressed(istate->cache_nr));
>  	} else {
> -		mem_pool_init(&istate->ce_mem_pool,
> +		mem_pool_init(istate->ce_mem_pool,
>  				estimate_cache_size(mmap_size, istate->cache_nr));
>  	}

Likewise.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup
  2020-08-14  3:02 ` [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
@ 2020-08-14  4:42   ` Eric Sunshine
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Sunshine @ 2020-08-14  4:42 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: Git List, Matheus Tavares, Elijah Newren

On Thu, Aug 13, 2020 at 11:02 PM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> fast-import had a special mem_pool_xstrdup() convenience function that I
> want to be able to use from the new merge algorithm I am writing.  Move
> it from fast-import to mem-pool, and also add a mem_pool_xstrndup()
> while at it that I also want to use.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> diff --git a/mem-pool.c b/mem-pool.c
> @@ -102,6 +102,29 @@ void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
> +char *mem_pool_xstrdup(struct mem_pool *pool, const char *str)
> +{
> +       size_t len = strlen(str) + 1;
> +       char *ret = mem_pool_alloc(pool, len);
> +
> +       if (!ret)
> +               die("Out of memory, mem_pool_xstrdup failed");

Nit: Not worth a re-roll, but I was wondering if it would make sense
to allow these to be localized (and follow current convention of not
capitalizing):

    die(_("mem_pool_xstrdup: out of memory"));

> +       return memcpy(ret, str, len);
> +}
> +
> +char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
> +{
> +       size_t minlen = strnlen(str, len);
> +       char *ret = mem_pool_alloc(pool, minlen+1);
> +
> +       if (!ret)
> +               die("Out of memory, mem_pool_xstrndup failed");

Ditto.

> +
> +       ret[minlen] = '\0';
> +       return memcpy(ret, str, minlen);
> +}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 0/3] Extend and add a little more generalization to the mem_pool API
  2020-08-14  3:02 [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-08-14  3:51 ` [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Matheus Tavares Bernardino
@ 2020-08-14  6:00 ` Elijah Newren via GitGitGadget
  2020-08-14  6:00   ` [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
                     ` (3 more replies)
  4 siblings, 4 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  6:00 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Eric Sunshine, Elijah Newren

In my new merge algorithm, I made use of the mem_pool API in a few
places...but I also needed to add a few more functions and also needed to
make the API a bit more general.

Changes since v1:

 * Made the error message tweaks suggested by Eric
 * Although mem_pool_init() intialized all members manually, add a memset()
   to 0 for future-proofing, as suggested by Junio.
 * Use 'pool' instead of 'mem_pool' in patch 2 so that patch 3 doesn't have
   to change as much (as suggested by Junio).

Also, Matheus said he'd rebase his parallel-checkout RFC series (next yet in
next or seen) on top of mine to avoid the semantic conflict.

Elijah Newren (3):
  mem-pool: add convenience functions for xstrdup and xstrndup
  mem-pool: use more standard initialization and finalization
  mem-pool: use consistent pool variable name

 fast-import.c | 12 ++-------
 mem-pool.c    | 74 ++++++++++++++++++++++++++++++++-------------------
 mem-pool.h    | 14 +++++++---
 read-cache.c  | 21 +++++++++------
 split-index.c |  6 +++--
 5 files changed, 75 insertions(+), 52 deletions(-)


base-commit: 7814e8a05a59c0cf5fb186661d1551c75d1299b5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-830%2Fnewren%2Fmem_pool_api-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-830/newren/mem_pool_api-v2
Pull-Request: https://github.com/git/git/pull/830

Range-diff vs v1:

 1:  eae8b2923f ! 1:  6d679c5b46 mem-pool: add convenience functions for xstrdup and xstrndup
     @@ mem-pool.c: void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_
      +	char *ret = mem_pool_alloc(pool, len);
      +
      +	if (!ret)
     -+		die("Out of memory, mem_pool_xstrdup failed");
     ++		die(_("mem_pool_xstrdup: out of memory"));
      +
      +	return memcpy(ret, str, len);
      +}
     @@ mem-pool.c: void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_
      +	char *ret = mem_pool_alloc(pool, minlen+1);
      +
      +	if (!ret)
     -+		die("Out of memory, mem_pool_xstrndup failed");
     ++		die(_("mem_pool_xstrndup: out of memory"));
      +
      +	ret[minlen] = '\0';
      +	return memcpy(ret, str, minlen);
 2:  f13a52055c ! 2:  e04ba96b22 mem-pool: use more standard initialization and finalization
     @@ mem-pool.c: static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_po
       }
       
      -void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size)
     -+void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size)
     ++void mem_pool_init(struct mem_pool *pool, size_t initial_size)
       {
      -	struct mem_pool *pool;
      -
     @@ mem-pool.c: static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_po
      -
      -	pool = xcalloc(1, sizeof(*pool));
      -
     --	pool->block_alloc = BLOCK_GROWTH_SIZE;
     -+	mem_pool->mp_block = NULL;
     -+	mem_pool->pool_alloc = 0;
     -+	mem_pool->block_alloc = BLOCK_GROWTH_SIZE;
     ++	memset(pool, 0, sizeof(*pool));
     + 	pool->block_alloc = BLOCK_GROWTH_SIZE;
       
       	if (initial_size > 0)
     --		mem_pool_alloc_block(pool, initial_size, NULL);
     + 		mem_pool_alloc_block(pool, initial_size, NULL);
      -
      -	*mem_pool = pool;
     -+		mem_pool_alloc_block(mem_pool, initial_size, NULL);
       }
       
       void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
     @@ mem-pool.h: struct mem_pool {
        * Initialize mem_pool with specified initial size.
        */
      -void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size);
     -+void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size);
     ++void mem_pool_init(struct mem_pool *pool, size_t initial_size);
       
       /*
      - * Discard a memory pool and free all the memory it is responsible for.
 3:  62c2479fe6 ! 3:  616402c64e mem-pool: use consistent pool variable name
     @@ mem-pool.c: static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_po
       	}
       
       	return p;
     - }
     - 
     --void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size)
     -+void mem_pool_init(struct mem_pool *pool, size_t initial_size)
     - {
     --	mem_pool->mp_block = NULL;
     --	mem_pool->pool_alloc = 0;
     --	mem_pool->block_alloc = BLOCK_GROWTH_SIZE;
     -+	pool->mp_block = NULL;
     -+	pool->pool_alloc = 0;
     -+	pool->block_alloc = BLOCK_GROWTH_SIZE;
     - 
     - 	if (initial_size > 0)
     --		mem_pool_alloc_block(mem_pool, initial_size, NULL);
     -+		mem_pool_alloc_block(pool, initial_size, NULL);
     +@@ mem-pool.c: void mem_pool_init(struct mem_pool *pool, size_t initial_size)
     + 		mem_pool_alloc_block(pool, initial_size, NULL);
       }
       
      -void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
     @@ mem-pool.c: char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size
       			return 1;
      
       ## mem-pool.h ##
     -@@ mem-pool.h: struct mem_pool {
     - /*
     -  * Initialize mem_pool with specified initial size.
     -  */
     --void mem_pool_init(struct mem_pool *mem_pool, size_t initial_size);
     -+void mem_pool_init(struct mem_pool *pool, size_t initial_size);
     - 
     +@@ mem-pool.h: void mem_pool_init(struct mem_pool *pool, size_t initial_size);
       /*
        * Discard all the memory the memory pool is responsible for.
        */

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup
  2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
@ 2020-08-14  6:00   ` Elijah Newren via GitGitGadget
  2020-08-14  8:21     ` René Scharfe
  2020-08-14  6:00   ` [PATCH v2 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  6:00 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Eric Sunshine, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

fast-import had a special mem_pool_xstrdup() convenience function that I
want to be able to use from the new merge algorithm I am writing.  Move
it from fast-import to mem-pool, and also add a mem_pool_xstrndup()
while at it that I also want to use.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 fast-import.c | 12 ++----------
 mem-pool.c    | 23 +++++++++++++++++++++++
 mem-pool.h    |  6 ++++++
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index ce47794db6..dd5b563950 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -526,14 +526,6 @@ static unsigned int hc_str(const char *s, size_t len)
 	return r;
 }
 
-static char *pool_strdup(const char *s)
-{
-	size_t len = strlen(s) + 1;
-	char *r = mem_pool_alloc(&fi_mem_pool, len);
-	memcpy(r, s, len);
-	return r;
-}
-
 static void insert_mark(struct mark_set *s, uintmax_t idnum, struct object_entry *oe)
 {
 	while ((idnum >> s->shift) >= 1024) {
@@ -615,7 +607,7 @@ static struct branch *new_branch(const char *name)
 		die("Branch name doesn't conform to GIT standards: %s", name);
 
 	b = mem_pool_calloc(&fi_mem_pool, 1, sizeof(struct branch));
-	b->name = pool_strdup(name);
+	b->name = mem_pool_xstrdup(&fi_mem_pool, name);
 	b->table_next_branch = branch_table[hc];
 	b->branch_tree.versions[0].mode = S_IFDIR;
 	b->branch_tree.versions[1].mode = S_IFDIR;
@@ -2806,7 +2798,7 @@ static void parse_new_tag(const char *arg)
 
 	t = mem_pool_alloc(&fi_mem_pool, sizeof(struct tag));
 	memset(t, 0, sizeof(struct tag));
-	t->name = pool_strdup(arg);
+	t->name = mem_pool_xstrdup(&fi_mem_pool, arg);
 	if (last_tag)
 		last_tag->next_tag = t;
 	else
diff --git a/mem-pool.c b/mem-pool.c
index a2841a4a9a..33fda1c411 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -102,6 +102,29 @@ void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
 	return r;
 }
 
+char *mem_pool_xstrdup(struct mem_pool *pool, const char *str)
+{
+	size_t len = strlen(str) + 1;
+	char *ret = mem_pool_alloc(pool, len);
+
+	if (!ret)
+		die(_("mem_pool_xstrdup: out of memory"));
+
+	return memcpy(ret, str, len);
+}
+
+char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
+{
+	size_t minlen = strnlen(str, len);
+	char *ret = mem_pool_alloc(pool, minlen+1);
+
+	if (!ret)
+		die(_("mem_pool_xstrndup: out of memory"));
+
+	ret[minlen] = '\0';
+	return memcpy(ret, str, minlen);
+}
+
 int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
 {
 	struct mp_block *p;
diff --git a/mem-pool.h b/mem-pool.h
index 999d3c3a52..fcaa2d462b 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -41,6 +41,12 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len);
  */
 void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size);
 
+/*
+ * Allocate memory from the memory pool and copy str into it.
+ */
+char *mem_pool_xstrdup(struct mem_pool *pool, const char *str);
+char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len);
+
 /*
  * Move the memory associated with the 'src' pool to the 'dst' pool. The 'src'
  * pool will be empty and not contain any memory. It still needs to be free'd
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/3] mem-pool: use more standard initialization and finalization
  2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
  2020-08-14  6:00   ` [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
@ 2020-08-14  6:00   ` Elijah Newren via GitGitGadget
  2020-08-14  6:00   ` [PATCH v2 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
  2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  6:00 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Eric Sunshine, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

A typical memory type, such as strbuf, hashmap, or string_list can be
stored on the stack or embedded within another structure.  mem_pool
cannot be, because of how mem_pool_init() and mem_pool_discard() are
written.  mem_pool_init() does essentially the following (simplified
for purposes of explanation here):

    void mem_pool_init(struct mem_pool **pool...)
    {
        *pool = xcalloc(1, sizeof(*pool));

It seems weird to require that mem_pools can only be accessed through a
pointer.  It also seems slightly dangerous: unlike strbuf_release() or
strbuf_reset() or string_list_clear(), all of which put the data
structure into a state where it can be re-used after the call,
mem_pool_discard(pool) will leave pool pointing at free'd memory.
read-cache (and split-index) are the only current users of mem_pools,
and they haven't fallen into a use-after-free mistake here, but it seems
likely to be problematic for future users especially since several of
the current callers of mem_pool_init() will only call it when the
mem_pool* is not already allocated (i.e. is NULL).

This type of mechanism also prevents finding synchronization
points where one can free existing memory and then resume more
operations.  It would be natural at such points to run something like
    mem_pool_discard(pool...);
and, if necessary,
    mem_pool_init(&pool...);
and then carry on continuing to use the pool.  However, this fails badly
if several objects had a copy of the value of pool from before these
commands; in such a case, those objects won't get the updated value of
pool that mem_pool_init() overwrites pool with and they'll all instead
be reading and writing from free'd memory.

Modify mem_pool_init()/mem_pool_discard() to behave more like
   strbuf_init()/strbuf_release()
or
   string_list_init()/string_list_clear()
In particular: (1) make mem_pool_init() just take a mem_pool* and have
it only worry about allocating struct mp_blocks, not the struct mem_pool
itself, (2) make mem_pool_discard() free the memory that the pool was
responsible for, but leave it in a state where it can be used to
allocate more memory afterward (without the need to call mem_pool_init()
again).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 mem-pool.c    | 15 ++++-----------
 mem-pool.h    |  4 ++--
 read-cache.c  | 21 +++++++++++++--------
 split-index.c |  6 ++++--
 4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/mem-pool.c b/mem-pool.c
index 33fda1c411..305bcd3542 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -33,21 +33,13 @@ static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t b
 	return p;
 }
 
-void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size)
+void mem_pool_init(struct mem_pool *pool, size_t initial_size)
 {
-	struct mem_pool *pool;
-
-	if (*mem_pool)
-		return;
-
-	pool = xcalloc(1, sizeof(*pool));
-
+	memset(pool, 0, sizeof(*pool));
 	pool->block_alloc = BLOCK_GROWTH_SIZE;
 
 	if (initial_size > 0)
 		mem_pool_alloc_block(pool, initial_size, NULL);
-
-	*mem_pool = pool;
 }
 
 void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
@@ -66,7 +58,8 @@ void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
 		free(block_to_free);
 	}
 
-	free(mem_pool);
+	mem_pool->mp_block = NULL;
+	mem_pool->pool_alloc = 0;
 }
 
 void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
diff --git a/mem-pool.h b/mem-pool.h
index fcaa2d462b..a55ee4bc38 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -24,10 +24,10 @@ struct mem_pool {
 /*
  * Initialize mem_pool with specified initial size.
  */
-void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size);
+void mem_pool_init(struct mem_pool *pool, size_t initial_size);
 
 /*
- * Discard a memory pool and free all the memory it is responsible for.
+ * Discard all the memory the memory pool is responsible for.
  */
 void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory);
 
diff --git a/read-cache.c b/read-cache.c
index 8ed1c29b54..fa291cdbee 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -89,8 +89,10 @@ static struct mem_pool *find_mem_pool(struct index_state *istate)
 	else
 		pool_ptr = &istate->ce_mem_pool;
 
-	if (!*pool_ptr)
-		mem_pool_init(pool_ptr, 0);
+	if (!*pool_ptr) {
+		*pool_ptr = xmalloc(sizeof(**pool_ptr));
+		mem_pool_init(*pool_ptr, 0);
+	}
 
 	return *pool_ptr;
 }
@@ -2006,11 +2008,12 @@ static unsigned long load_all_cache_entries(struct index_state *istate,
 {
 	unsigned long consumed;
 
+	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
 	if (istate->version == 4) {
-		mem_pool_init(&istate->ce_mem_pool,
+		mem_pool_init(istate->ce_mem_pool,
 				estimate_cache_size_from_compressed(istate->cache_nr));
 	} else {
-		mem_pool_init(&istate->ce_mem_pool,
+		mem_pool_init(istate->ce_mem_pool,
 				estimate_cache_size(mmap_size, istate->cache_nr));
 	}
 
@@ -2070,7 +2073,8 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 	if (istate->name_hash_initialized)
 		BUG("the name hash isn't thread safe");
 
-	mem_pool_init(&istate->ce_mem_pool, 0);
+	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
+	mem_pool_init(istate->ce_mem_pool, 0);
 
 	/* ensure we have no more threads than we have blocks to process */
 	if (nr_threads > ieot->nr)
@@ -2097,11 +2101,12 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 		nr = 0;
 		for (j = p->ieot_start; j < p->ieot_start + p->ieot_blocks; j++)
 			nr += p->ieot->entries[j].nr;
+		istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
 		if (istate->version == 4) {
-			mem_pool_init(&p->ce_mem_pool,
+			mem_pool_init(p->ce_mem_pool,
 				estimate_cache_size_from_compressed(nr));
 		} else {
-			mem_pool_init(&p->ce_mem_pool,
+			mem_pool_init(p->ce_mem_pool,
 				estimate_cache_size(mmap_size, nr));
 		}
 
@@ -2358,7 +2363,7 @@ int discard_index(struct index_state *istate)
 
 	if (istate->ce_mem_pool) {
 		mem_pool_discard(istate->ce_mem_pool, should_validate_cache_entries());
-		istate->ce_mem_pool = NULL;
+		FREE_AND_NULL(istate->ce_mem_pool);
 	}
 
 	return 0;
diff --git a/split-index.c b/split-index.c
index e6154e4ea9..c0e8ad670d 100644
--- a/split-index.c
+++ b/split-index.c
@@ -79,8 +79,10 @@ void move_cache_to_base_index(struct index_state *istate)
 	if (si->base &&
 		si->base->ce_mem_pool) {
 
-		if (!istate->ce_mem_pool)
-			mem_pool_init(&istate->ce_mem_pool, 0);
+		if (!istate->ce_mem_pool) {
+			istate->ce_mem_pool = xmalloc(sizeof(struct mem_pool));
+			mem_pool_init(istate->ce_mem_pool, 0);
+		}
 
 		mem_pool_combine(istate->ce_mem_pool, istate->split_index->base->ce_mem_pool);
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/3] mem-pool: use consistent pool variable name
  2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
  2020-08-14  6:00   ` [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
  2020-08-14  6:00   ` [PATCH v2 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
@ 2020-08-14  6:00   ` Elijah Newren via GitGitGadget
  2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-14  6:00 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Eric Sunshine, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

About half the function declarations in mem-pool.h used 'struct mem_pool
*pool', while the other half used 'struct mem_pool *mem_pool'.  Make the
code a bit more consistent by just using 'pool' in preference to
'mem_pool' everywhere.

No behavioral changes included; this is just a mechanical rename (though
a line or two was rewrapped as well).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 mem-pool.c | 40 +++++++++++++++++++++-------------------
 mem-pool.h |  4 ++--
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/mem-pool.c b/mem-pool.c
index 305bcd3542..593605fce2 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -12,11 +12,13 @@
  * `insert_after`. If `insert_after` is NULL, then insert block at the
  * head of the linked list.
  */
-static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t block_alloc, struct mp_block *insert_after)
+static struct mp_block *mem_pool_alloc_block(struct mem_pool *pool,
+					     size_t block_alloc,
+					     struct mp_block *insert_after)
 {
 	struct mp_block *p;
 
-	mem_pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
+	pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
 	p = xmalloc(st_add(sizeof(struct mp_block), block_alloc));
 
 	p->next_free = (char *)p->space;
@@ -26,8 +28,8 @@ static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t b
 		p->next_block = insert_after->next_block;
 		insert_after->next_block = p;
 	} else {
-		p->next_block = mem_pool->mp_block;
-		mem_pool->mp_block = p;
+		p->next_block = pool->mp_block;
+		pool->mp_block = p;
 	}
 
 	return p;
@@ -42,11 +44,11 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size)
 		mem_pool_alloc_block(pool, initial_size, NULL);
 }
 
-void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
+void mem_pool_discard(struct mem_pool *pool, int invalidate_memory)
 {
 	struct mp_block *block, *block_to_free;
 
-	block = mem_pool->mp_block;
+	block = pool->mp_block;
 	while (block)
 	{
 		block_to_free = block;
@@ -58,11 +60,11 @@ void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
 		free(block_to_free);
 	}
 
-	mem_pool->mp_block = NULL;
-	mem_pool->pool_alloc = 0;
+	pool->mp_block = NULL;
+	pool->pool_alloc = 0;
 }
 
-void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
+void *mem_pool_alloc(struct mem_pool *pool, size_t len)
 {
 	struct mp_block *p = NULL;
 	void *r;
@@ -71,15 +73,15 @@ void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 	if (len & (sizeof(uintmax_t) - 1))
 		len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));
 
-	if (mem_pool->mp_block &&
-	    mem_pool->mp_block->end - mem_pool->mp_block->next_free >= len)
-		p = mem_pool->mp_block;
+	if (pool->mp_block &&
+	    pool->mp_block->end - pool->mp_block->next_free >= len)
+		p = pool->mp_block;
 
 	if (!p) {
-		if (len >= (mem_pool->block_alloc / 2))
-			return mem_pool_alloc_block(mem_pool, len, mem_pool->mp_block);
+		if (len >= (pool->block_alloc / 2))
+			return mem_pool_alloc_block(pool, len, pool->mp_block);
 
-		p = mem_pool_alloc_block(mem_pool, mem_pool->block_alloc, NULL);
+		p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
 	}
 
 	r = p->next_free;
@@ -87,10 +89,10 @@ void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 	return r;
 }
 
-void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
+void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size)
 {
 	size_t len = st_mult(count, size);
-	void *r = mem_pool_alloc(mem_pool, len);
+	void *r = mem_pool_alloc(pool, len);
 	memset(r, 0, len);
 	return r;
 }
@@ -118,12 +120,12 @@ char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
 	return memcpy(ret, str, minlen);
 }
 
-int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
+int mem_pool_contains(struct mem_pool *pool, void *mem)
 {
 	struct mp_block *p;
 
 	/* Check if memory is allocated in a block */
-	for (p = mem_pool->mp_block; p; p = p->next_block)
+	for (p = pool->mp_block; p; p = p->next_block)
 		if ((mem >= ((void *)p->space)) &&
 		    (mem < ((void *)p->end)))
 			return 1;
diff --git a/mem-pool.h b/mem-pool.h
index a55ee4bc38..022b3097e9 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -29,7 +29,7 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size);
 /*
  * Discard all the memory the memory pool is responsible for.
  */
-void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory);
+void mem_pool_discard(struct mem_pool *pool, int invalidate_memory);
 
 /*
  * Alloc memory from the mem_pool.
@@ -58,6 +58,6 @@ void mem_pool_combine(struct mem_pool *dst, struct mem_pool *src);
  * Check if a memory pointed at by 'mem' is part of the range of
  * memory managed by the specified mem_pool.
  */
-int mem_pool_contains(struct mem_pool *mem_pool, void *mem);
+int mem_pool_contains(struct mem_pool *pool, void *mem);
 
 #endif
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup
  2020-08-14  6:00   ` [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
@ 2020-08-14  8:21     ` René Scharfe
  0 siblings, 0 replies; 17+ messages in thread
From: René Scharfe @ 2020-08-14  8:21 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Matheus Tavares, Eric Sunshine, Elijah Newren

Am 14.08.20 um 08:00 schrieb Elijah Newren via GitGitGadget:
> From: Elijah Newren <newren@gmail.com>
>
> fast-import had a special mem_pool_xstrdup() convenience function that I
> want to be able to use from the new merge algorithm I am writing.  Move
> it from fast-import to mem-pool, and also add a mem_pool_xstrndup()
> while at it that I also want to use.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  fast-import.c | 12 ++----------
>  mem-pool.c    | 23 +++++++++++++++++++++++
>  mem-pool.h    |  6 ++++++
>  3 files changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/fast-import.c b/fast-import.c
> index ce47794db6..dd5b563950 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -526,14 +526,6 @@ static unsigned int hc_str(const char *s, size_t len)
>  	return r;
>  }
>
> -static char *pool_strdup(const char *s)
> -{
> -	size_t len = strlen(s) + 1;
> -	char *r = mem_pool_alloc(&fi_mem_pool, len);
> -	memcpy(r, s, len);
> -	return r;
> -}

Note: No "x" in the name and it doesn't handle mem_pool_alloc()
returning NULL.

> -
>  static void insert_mark(struct mark_set *s, uintmax_t idnum, struct object_entry *oe)
>  {
>  	while ((idnum >> s->shift) >= 1024) {
> @@ -615,7 +607,7 @@ static struct branch *new_branch(const char *name)
>  		die("Branch name doesn't conform to GIT standards: %s", name);
>
>  	b = mem_pool_calloc(&fi_mem_pool, 1, sizeof(struct branch));
> -	b->name = pool_strdup(name);
> +	b->name = mem_pool_xstrdup(&fi_mem_pool, name);
>  	b->table_next_branch = branch_table[hc];
>  	b->branch_tree.versions[0].mode = S_IFDIR;
>  	b->branch_tree.versions[1].mode = S_IFDIR;
> @@ -2806,7 +2798,7 @@ static void parse_new_tag(const char *arg)
>
>  	t = mem_pool_alloc(&fi_mem_pool, sizeof(struct tag));
>  	memset(t, 0, sizeof(struct tag));
> -	t->name = pool_strdup(arg);
> +	t->name = mem_pool_xstrdup(&fi_mem_pool, arg);
>  	if (last_tag)
>  		last_tag->next_tag = t;
>  	else
> diff --git a/mem-pool.c b/mem-pool.c
> index a2841a4a9a..33fda1c411 100644
> --- a/mem-pool.c
> +++ b/mem-pool.c
> @@ -102,6 +102,29 @@ void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
>  	return r;
>  }
>
> +char *mem_pool_xstrdup(struct mem_pool *pool, const char *str)
> +{
> +	size_t len = strlen(str) + 1;
> +	char *ret = mem_pool_alloc(pool, len);
> +
> +	if (!ret)
> +		die(_("mem_pool_xstrdup: out of memory"));

Can mem_pool_alloc() actually return NULL?  It will rather die because
it uses xmalloc(), right?  So that check is unnecessary.

And since "mem_pool_" already implies that these functions won't return
if an allocation fails, no extra "x" is needed in their name.

> +
> +	return memcpy(ret, str, len);
> +}
> +
> +char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
> +{
> +	size_t minlen = strnlen(str, len);

Hmm, this would be our first caller of strnlen().  wrapper.c::xstrndup()
uses memchr() instead.  It was added in 2008, and strnlen() is in
POSIX.1-2008, so back then it made sense.  Perhaps there are still
systems out there without one?

> +	char *ret = mem_pool_alloc(pool, minlen+1);
> +
> +	if (!ret)
> +		die(_("mem_pool_xstrndup: out of memory"));

The same comments as on mem_pool_xstrdup() apply here.

> +
> +	ret[minlen] = '\0';
> +	return memcpy(ret, str, minlen);
> +}
> +
>  int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
>  {
>  	struct mp_block *p;
> diff --git a/mem-pool.h b/mem-pool.h
> index 999d3c3a52..fcaa2d462b 100644
> --- a/mem-pool.h
> +++ b/mem-pool.h
> @@ -41,6 +41,12 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len);
>   */
>  void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size);
>
> +/*
> + * Allocate memory from the memory pool and copy str into it.
> + */
> +char *mem_pool_xstrdup(struct mem_pool *pool, const char *str);
> +char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len);
> +
>  /*
>   * Move the memory associated with the 'src' pool to the 'dst' pool. The 'src'
>   * pool will be empty and not contain any memory. It still needs to be free'd
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API
  2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-08-14  6:00   ` [PATCH v2 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
@ 2020-08-15 17:37   ` Elijah Newren via GitGitGadget
  2020-08-15 17:37     ` [PATCH v3 1/3] mem-pool: add convenience functions for strdup and strndup Elijah Newren via GitGitGadget
                       ` (3 more replies)
  3 siblings, 4 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-15 17:37 UTC (permalink / raw)
  To: git; +Cc: Matheus Tavares, Eric Sunshine, René Scharfe, Elijah Newren

In my new merge algorithm, I made use of the mem_pool API in a few
places...but I also needed to add a few more functions and also needed to
make the API a bit more general.

Changes since v2:

 * Remove 'x' from mem_pool_xstr[n]?dup() names and remove check for NULL
   since mem_pool_alloc() already handles that (via xmalloc), as suggested
   by René
 * Rewrite mem_pool_strndup() to not rely on strnlen(), since that may be
   too new from some systems (comes from POSIX 2008 ). Also pointed out by
   René

Elijah Newren (3):
  mem-pool: add convenience functions for strdup and strndup
  mem-pool: use more standard initialization and finalization
  mem-pool: use consistent pool variable name

 fast-import.c | 12 ++-------
 mem-pool.c    | 69 ++++++++++++++++++++++++++++++---------------------
 mem-pool.h    | 14 ++++++++---
 read-cache.c  | 21 ++++++++++------
 split-index.c |  6 +++--
 5 files changed, 70 insertions(+), 52 deletions(-)


base-commit: 7814e8a05a59c0cf5fb186661d1551c75d1299b5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-830%2Fnewren%2Fmem_pool_api-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-830/newren/mem_pool_api-v3
Pull-Request: https://github.com/git/git/pull/830

Range-diff vs v2:

 1:  6d679c5b46 ! 1:  e7f6bf8a8c mem-pool: add convenience functions for xstrdup and xstrndup
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    mem-pool: add convenience functions for xstrdup and xstrndup
     +    mem-pool: add convenience functions for strdup and strndup
      
     -    fast-import had a special mem_pool_xstrdup() convenience function that I
     +    fast-import had a special mem_pool_strdup() convenience function that I
          want to be able to use from the new merge algorithm I am writing.  Move
     -    it from fast-import to mem-pool, and also add a mem_pool_xstrndup()
     +    it from fast-import to mem-pool, and also add a mem_pool_strndup()
          while at it that I also want to use.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
     @@ fast-import.c: static struct branch *new_branch(const char *name)
       
       	b = mem_pool_calloc(&fi_mem_pool, 1, sizeof(struct branch));
      -	b->name = pool_strdup(name);
     -+	b->name = mem_pool_xstrdup(&fi_mem_pool, name);
     ++	b->name = mem_pool_strdup(&fi_mem_pool, name);
       	b->table_next_branch = branch_table[hc];
       	b->branch_tree.versions[0].mode = S_IFDIR;
       	b->branch_tree.versions[1].mode = S_IFDIR;
     @@ fast-import.c: static void parse_new_tag(const char *arg)
       	t = mem_pool_alloc(&fi_mem_pool, sizeof(struct tag));
       	memset(t, 0, sizeof(struct tag));
      -	t->name = pool_strdup(arg);
     -+	t->name = mem_pool_xstrdup(&fi_mem_pool, arg);
     ++	t->name = mem_pool_strdup(&fi_mem_pool, arg);
       	if (last_tag)
       		last_tag->next_tag = t;
       	else
     @@ mem-pool.c: void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_
       	return r;
       }
       
     -+char *mem_pool_xstrdup(struct mem_pool *pool, const char *str)
     ++char *mem_pool_strdup(struct mem_pool *pool, const char *str)
      +{
      +	size_t len = strlen(str) + 1;
      +	char *ret = mem_pool_alloc(pool, len);
      +
     -+	if (!ret)
     -+		die(_("mem_pool_xstrdup: out of memory"));
     -+
      +	return memcpy(ret, str, len);
      +}
      +
     -+char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
     ++char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len)
      +{
     -+	size_t minlen = strnlen(str, len);
     -+	char *ret = mem_pool_alloc(pool, minlen+1);
     -+
     -+	if (!ret)
     -+		die(_("mem_pool_xstrndup: out of memory"));
     ++	char *p = memchr(str, '\0', len);
     ++	size_t actual_len = (p ? p - str : len);
     ++	char *ret = mem_pool_alloc(pool, actual_len+1);
      +
     -+	ret[minlen] = '\0';
     -+	return memcpy(ret, str, minlen);
     ++	ret[actual_len] = '\0';
     ++	return memcpy(ret, str, actual_len);
      +}
      +
       int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
     @@ mem-pool.h: void *mem_pool_alloc(struct mem_pool *pool, size_t len);
      +/*
      + * Allocate memory from the memory pool and copy str into it.
      + */
     -+char *mem_pool_xstrdup(struct mem_pool *pool, const char *str);
     -+char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len);
     ++char *mem_pool_strdup(struct mem_pool *pool, const char *str);
     ++char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len);
      +
       /*
        * Move the memory associated with the 'src' pool to the 'dst' pool. The 'src'
 2:  e04ba96b22 = 2:  65f334f5cf mem-pool: use more standard initialization and finalization
 3:  616402c64e ! 3:  09976779c3 mem-pool: use consistent pool variable name
     @@ mem-pool.c: void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
       	memset(r, 0, len);
       	return r;
       }
     -@@ mem-pool.c: char *mem_pool_xstrndup(struct mem_pool *pool, const char *str, size_t len)
     - 	return memcpy(ret, str, minlen);
     +@@ mem-pool.c: char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len)
     + 	return memcpy(ret, str, actual_len);
       }
       
      -int mem_pool_contains(struct mem_pool *mem_pool, void *mem)

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/3] mem-pool: add convenience functions for strdup and strndup
  2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
@ 2020-08-15 17:37     ` Elijah Newren via GitGitGadget
  2020-08-15 17:37     ` [PATCH v3 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-15 17:37 UTC (permalink / raw)
  To: git
  Cc: Matheus Tavares, Eric Sunshine, René Scharfe, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

fast-import had a special mem_pool_strdup() convenience function that I
want to be able to use from the new merge algorithm I am writing.  Move
it from fast-import to mem-pool, and also add a mem_pool_strndup()
while at it that I also want to use.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 fast-import.c | 12 ++----------
 mem-pool.c    | 18 ++++++++++++++++++
 mem-pool.h    |  6 ++++++
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index ce47794db6..c45fe7474c 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -526,14 +526,6 @@ static unsigned int hc_str(const char *s, size_t len)
 	return r;
 }
 
-static char *pool_strdup(const char *s)
-{
-	size_t len = strlen(s) + 1;
-	char *r = mem_pool_alloc(&fi_mem_pool, len);
-	memcpy(r, s, len);
-	return r;
-}
-
 static void insert_mark(struct mark_set *s, uintmax_t idnum, struct object_entry *oe)
 {
 	while ((idnum >> s->shift) >= 1024) {
@@ -615,7 +607,7 @@ static struct branch *new_branch(const char *name)
 		die("Branch name doesn't conform to GIT standards: %s", name);
 
 	b = mem_pool_calloc(&fi_mem_pool, 1, sizeof(struct branch));
-	b->name = pool_strdup(name);
+	b->name = mem_pool_strdup(&fi_mem_pool, name);
 	b->table_next_branch = branch_table[hc];
 	b->branch_tree.versions[0].mode = S_IFDIR;
 	b->branch_tree.versions[1].mode = S_IFDIR;
@@ -2806,7 +2798,7 @@ static void parse_new_tag(const char *arg)
 
 	t = mem_pool_alloc(&fi_mem_pool, sizeof(struct tag));
 	memset(t, 0, sizeof(struct tag));
-	t->name = pool_strdup(arg);
+	t->name = mem_pool_strdup(&fi_mem_pool, arg);
 	if (last_tag)
 		last_tag->next_tag = t;
 	else
diff --git a/mem-pool.c b/mem-pool.c
index a2841a4a9a..020b51e0c5 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -102,6 +102,24 @@ void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
 	return r;
 }
 
+char *mem_pool_strdup(struct mem_pool *pool, const char *str)
+{
+	size_t len = strlen(str) + 1;
+	char *ret = mem_pool_alloc(pool, len);
+
+	return memcpy(ret, str, len);
+}
+
+char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len)
+{
+	char *p = memchr(str, '\0', len);
+	size_t actual_len = (p ? p - str : len);
+	char *ret = mem_pool_alloc(pool, actual_len+1);
+
+	ret[actual_len] = '\0';
+	return memcpy(ret, str, actual_len);
+}
+
 int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
 {
 	struct mp_block *p;
diff --git a/mem-pool.h b/mem-pool.h
index 999d3c3a52..ca062c9070 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -41,6 +41,12 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len);
  */
 void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size);
 
+/*
+ * Allocate memory from the memory pool and copy str into it.
+ */
+char *mem_pool_strdup(struct mem_pool *pool, const char *str);
+char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len);
+
 /*
  * Move the memory associated with the 'src' pool to the 'dst' pool. The 'src'
  * pool will be empty and not contain any memory. It still needs to be free'd
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/3] mem-pool: use more standard initialization and finalization
  2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
  2020-08-15 17:37     ` [PATCH v3 1/3] mem-pool: add convenience functions for strdup and strndup Elijah Newren via GitGitGadget
@ 2020-08-15 17:37     ` Elijah Newren via GitGitGadget
  2020-08-15 17:37     ` [PATCH v3 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
  2020-08-18 19:27     ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Junio C Hamano
  3 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-15 17:37 UTC (permalink / raw)
  To: git
  Cc: Matheus Tavares, Eric Sunshine, René Scharfe, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

A typical memory type, such as strbuf, hashmap, or string_list can be
stored on the stack or embedded within another structure.  mem_pool
cannot be, because of how mem_pool_init() and mem_pool_discard() are
written.  mem_pool_init() does essentially the following (simplified
for purposes of explanation here):

    void mem_pool_init(struct mem_pool **pool...)
    {
        *pool = xcalloc(1, sizeof(*pool));

It seems weird to require that mem_pools can only be accessed through a
pointer.  It also seems slightly dangerous: unlike strbuf_release() or
strbuf_reset() or string_list_clear(), all of which put the data
structure into a state where it can be re-used after the call,
mem_pool_discard(pool) will leave pool pointing at free'd memory.
read-cache (and split-index) are the only current users of mem_pools,
and they haven't fallen into a use-after-free mistake here, but it seems
likely to be problematic for future users especially since several of
the current callers of mem_pool_init() will only call it when the
mem_pool* is not already allocated (i.e. is NULL).

This type of mechanism also prevents finding synchronization
points where one can free existing memory and then resume more
operations.  It would be natural at such points to run something like
    mem_pool_discard(pool...);
and, if necessary,
    mem_pool_init(&pool...);
and then carry on continuing to use the pool.  However, this fails badly
if several objects had a copy of the value of pool from before these
commands; in such a case, those objects won't get the updated value of
pool that mem_pool_init() overwrites pool with and they'll all instead
be reading and writing from free'd memory.

Modify mem_pool_init()/mem_pool_discard() to behave more like
   strbuf_init()/strbuf_release()
or
   string_list_init()/string_list_clear()
In particular: (1) make mem_pool_init() just take a mem_pool* and have
it only worry about allocating struct mp_blocks, not the struct mem_pool
itself, (2) make mem_pool_discard() free the memory that the pool was
responsible for, but leave it in a state where it can be used to
allocate more memory afterward (without the need to call mem_pool_init()
again).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 mem-pool.c    | 15 ++++-----------
 mem-pool.h    |  4 ++--
 read-cache.c  | 21 +++++++++++++--------
 split-index.c |  6 ++++--
 4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/mem-pool.c b/mem-pool.c
index 020b51e0c5..7659919ab2 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -33,21 +33,13 @@ static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t b
 	return p;
 }
 
-void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size)
+void mem_pool_init(struct mem_pool *pool, size_t initial_size)
 {
-	struct mem_pool *pool;
-
-	if (*mem_pool)
-		return;
-
-	pool = xcalloc(1, sizeof(*pool));
-
+	memset(pool, 0, sizeof(*pool));
 	pool->block_alloc = BLOCK_GROWTH_SIZE;
 
 	if (initial_size > 0)
 		mem_pool_alloc_block(pool, initial_size, NULL);
-
-	*mem_pool = pool;
 }
 
 void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
@@ -66,7 +58,8 @@ void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
 		free(block_to_free);
 	}
 
-	free(mem_pool);
+	mem_pool->mp_block = NULL;
+	mem_pool->pool_alloc = 0;
 }
 
 void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
diff --git a/mem-pool.h b/mem-pool.h
index ca062c9070..870161ab44 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -24,10 +24,10 @@ struct mem_pool {
 /*
  * Initialize mem_pool with specified initial size.
  */
-void mem_pool_init(struct mem_pool **mem_pool, size_t initial_size);
+void mem_pool_init(struct mem_pool *pool, size_t initial_size);
 
 /*
- * Discard a memory pool and free all the memory it is responsible for.
+ * Discard all the memory the memory pool is responsible for.
  */
 void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory);
 
diff --git a/read-cache.c b/read-cache.c
index 8ed1c29b54..fa291cdbee 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -89,8 +89,10 @@ static struct mem_pool *find_mem_pool(struct index_state *istate)
 	else
 		pool_ptr = &istate->ce_mem_pool;
 
-	if (!*pool_ptr)
-		mem_pool_init(pool_ptr, 0);
+	if (!*pool_ptr) {
+		*pool_ptr = xmalloc(sizeof(**pool_ptr));
+		mem_pool_init(*pool_ptr, 0);
+	}
 
 	return *pool_ptr;
 }
@@ -2006,11 +2008,12 @@ static unsigned long load_all_cache_entries(struct index_state *istate,
 {
 	unsigned long consumed;
 
+	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
 	if (istate->version == 4) {
-		mem_pool_init(&istate->ce_mem_pool,
+		mem_pool_init(istate->ce_mem_pool,
 				estimate_cache_size_from_compressed(istate->cache_nr));
 	} else {
-		mem_pool_init(&istate->ce_mem_pool,
+		mem_pool_init(istate->ce_mem_pool,
 				estimate_cache_size(mmap_size, istate->cache_nr));
 	}
 
@@ -2070,7 +2073,8 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 	if (istate->name_hash_initialized)
 		BUG("the name hash isn't thread safe");
 
-	mem_pool_init(&istate->ce_mem_pool, 0);
+	istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
+	mem_pool_init(istate->ce_mem_pool, 0);
 
 	/* ensure we have no more threads than we have blocks to process */
 	if (nr_threads > ieot->nr)
@@ -2097,11 +2101,12 @@ static unsigned long load_cache_entries_threaded(struct index_state *istate, con
 		nr = 0;
 		for (j = p->ieot_start; j < p->ieot_start + p->ieot_blocks; j++)
 			nr += p->ieot->entries[j].nr;
+		istate->ce_mem_pool = xmalloc(sizeof(*istate->ce_mem_pool));
 		if (istate->version == 4) {
-			mem_pool_init(&p->ce_mem_pool,
+			mem_pool_init(p->ce_mem_pool,
 				estimate_cache_size_from_compressed(nr));
 		} else {
-			mem_pool_init(&p->ce_mem_pool,
+			mem_pool_init(p->ce_mem_pool,
 				estimate_cache_size(mmap_size, nr));
 		}
 
@@ -2358,7 +2363,7 @@ int discard_index(struct index_state *istate)
 
 	if (istate->ce_mem_pool) {
 		mem_pool_discard(istate->ce_mem_pool, should_validate_cache_entries());
-		istate->ce_mem_pool = NULL;
+		FREE_AND_NULL(istate->ce_mem_pool);
 	}
 
 	return 0;
diff --git a/split-index.c b/split-index.c
index e6154e4ea9..c0e8ad670d 100644
--- a/split-index.c
+++ b/split-index.c
@@ -79,8 +79,10 @@ void move_cache_to_base_index(struct index_state *istate)
 	if (si->base &&
 		si->base->ce_mem_pool) {
 
-		if (!istate->ce_mem_pool)
-			mem_pool_init(&istate->ce_mem_pool, 0);
+		if (!istate->ce_mem_pool) {
+			istate->ce_mem_pool = xmalloc(sizeof(struct mem_pool));
+			mem_pool_init(istate->ce_mem_pool, 0);
+		}
 
 		mem_pool_combine(istate->ce_mem_pool, istate->split_index->base->ce_mem_pool);
 	}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 3/3] mem-pool: use consistent pool variable name
  2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
  2020-08-15 17:37     ` [PATCH v3 1/3] mem-pool: add convenience functions for strdup and strndup Elijah Newren via GitGitGadget
  2020-08-15 17:37     ` [PATCH v3 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
@ 2020-08-15 17:37     ` Elijah Newren via GitGitGadget
  2020-08-18 19:27     ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Junio C Hamano
  3 siblings, 0 replies; 17+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-08-15 17:37 UTC (permalink / raw)
  To: git
  Cc: Matheus Tavares, Eric Sunshine, René Scharfe, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

About half the function declarations in mem-pool.h used 'struct mem_pool
*pool', while the other half used 'struct mem_pool *mem_pool'.  Make the
code a bit more consistent by just using 'pool' in preference to
'mem_pool' everywhere.

No behavioral changes included; this is just a mechanical rename (though
a line or two was rewrapped as well).

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 mem-pool.c | 40 +++++++++++++++++++++-------------------
 mem-pool.h |  4 ++--
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/mem-pool.c b/mem-pool.c
index 7659919ab2..8401761dda 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -12,11 +12,13 @@
  * `insert_after`. If `insert_after` is NULL, then insert block at the
  * head of the linked list.
  */
-static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t block_alloc, struct mp_block *insert_after)
+static struct mp_block *mem_pool_alloc_block(struct mem_pool *pool,
+					     size_t block_alloc,
+					     struct mp_block *insert_after)
 {
 	struct mp_block *p;
 
-	mem_pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
+	pool->pool_alloc += sizeof(struct mp_block) + block_alloc;
 	p = xmalloc(st_add(sizeof(struct mp_block), block_alloc));
 
 	p->next_free = (char *)p->space;
@@ -26,8 +28,8 @@ static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t b
 		p->next_block = insert_after->next_block;
 		insert_after->next_block = p;
 	} else {
-		p->next_block = mem_pool->mp_block;
-		mem_pool->mp_block = p;
+		p->next_block = pool->mp_block;
+		pool->mp_block = p;
 	}
 
 	return p;
@@ -42,11 +44,11 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size)
 		mem_pool_alloc_block(pool, initial_size, NULL);
 }
 
-void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
+void mem_pool_discard(struct mem_pool *pool, int invalidate_memory)
 {
 	struct mp_block *block, *block_to_free;
 
-	block = mem_pool->mp_block;
+	block = pool->mp_block;
 	while (block)
 	{
 		block_to_free = block;
@@ -58,11 +60,11 @@ void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory)
 		free(block_to_free);
 	}
 
-	mem_pool->mp_block = NULL;
-	mem_pool->pool_alloc = 0;
+	pool->mp_block = NULL;
+	pool->pool_alloc = 0;
 }
 
-void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
+void *mem_pool_alloc(struct mem_pool *pool, size_t len)
 {
 	struct mp_block *p = NULL;
 	void *r;
@@ -71,15 +73,15 @@ void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 	if (len & (sizeof(uintmax_t) - 1))
 		len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));
 
-	if (mem_pool->mp_block &&
-	    mem_pool->mp_block->end - mem_pool->mp_block->next_free >= len)
-		p = mem_pool->mp_block;
+	if (pool->mp_block &&
+	    pool->mp_block->end - pool->mp_block->next_free >= len)
+		p = pool->mp_block;
 
 	if (!p) {
-		if (len >= (mem_pool->block_alloc / 2))
-			return mem_pool_alloc_block(mem_pool, len, mem_pool->mp_block);
+		if (len >= (pool->block_alloc / 2))
+			return mem_pool_alloc_block(pool, len, pool->mp_block);
 
-		p = mem_pool_alloc_block(mem_pool, mem_pool->block_alloc, NULL);
+		p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
 	}
 
 	r = p->next_free;
@@ -87,10 +89,10 @@ void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
 	return r;
 }
 
-void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
+void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size)
 {
 	size_t len = st_mult(count, size);
-	void *r = mem_pool_alloc(mem_pool, len);
+	void *r = mem_pool_alloc(pool, len);
 	memset(r, 0, len);
 	return r;
 }
@@ -113,12 +115,12 @@ char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len)
 	return memcpy(ret, str, actual_len);
 }
 
-int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
+int mem_pool_contains(struct mem_pool *pool, void *mem)
 {
 	struct mp_block *p;
 
 	/* Check if memory is allocated in a block */
-	for (p = mem_pool->mp_block; p; p = p->next_block)
+	for (p = pool->mp_block; p; p = p->next_block)
 		if ((mem >= ((void *)p->space)) &&
 		    (mem < ((void *)p->end)))
 			return 1;
diff --git a/mem-pool.h b/mem-pool.h
index 870161ab44..fe7507f022 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -29,7 +29,7 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size);
 /*
  * Discard all the memory the memory pool is responsible for.
  */
-void mem_pool_discard(struct mem_pool *mem_pool, int invalidate_memory);
+void mem_pool_discard(struct mem_pool *pool, int invalidate_memory);
 
 /*
  * Alloc memory from the mem_pool.
@@ -58,6 +58,6 @@ void mem_pool_combine(struct mem_pool *dst, struct mem_pool *src);
  * Check if a memory pointed at by 'mem' is part of the range of
  * memory managed by the specified mem_pool.
  */
-int mem_pool_contains(struct mem_pool *mem_pool, void *mem);
+int mem_pool_contains(struct mem_pool *pool, void *mem);
 
 #endif
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API
  2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-08-15 17:37     ` [PATCH v3 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
@ 2020-08-18 19:27     ` Junio C Hamano
  3 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2020-08-18 19:27 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Matheus Tavares, Eric Sunshine, René Scharfe, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> In my new merge algorithm, I made use of the mem_pool API in a few
> places...but I also needed to add a few more functions and also needed to
> make the API a bit more general.
>
> Changes since v2:
>
>  * Remove 'x' from mem_pool_xstr[n]?dup() names and remove check for NULL
>    since mem_pool_alloc() already handles that (via xmalloc), as suggested
>    by René
>  * Rewrite mem_pool_strndup() to not rely on strnlen(), since that may be
>    too new from some systems (comes from POSIX 2008 ). Also pointed out by
>    René

Thanks; will queue.


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-08-18 19:27 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-14  3:02 [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
2020-08-14  3:02 ` [PATCH 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
2020-08-14  4:42   ` Eric Sunshine
2020-08-14  3:02 ` [PATCH 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
2020-08-14  4:38   ` Junio C Hamano
2020-08-14  3:02 ` [PATCH 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
2020-08-14  3:51 ` [PATCH 0/3] Extend and add a little more generalization to the mem_pool API Matheus Tavares Bernardino
2020-08-14  6:00 ` [PATCH v2 " Elijah Newren via GitGitGadget
2020-08-14  6:00   ` [PATCH v2 1/3] mem-pool: add convenience functions for xstrdup and xstrndup Elijah Newren via GitGitGadget
2020-08-14  8:21     ` René Scharfe
2020-08-14  6:00   ` [PATCH v2 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
2020-08-14  6:00   ` [PATCH v2 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
2020-08-15 17:37   ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Elijah Newren via GitGitGadget
2020-08-15 17:37     ` [PATCH v3 1/3] mem-pool: add convenience functions for strdup and strndup Elijah Newren via GitGitGadget
2020-08-15 17:37     ` [PATCH v3 2/3] mem-pool: use more standard initialization and finalization Elijah Newren via GitGitGadget
2020-08-15 17:37     ` [PATCH v3 3/3] mem-pool: use consistent pool variable name Elijah Newren via GitGitGadget
2020-08-18 19:27     ` [PATCH v3 0/3] Extend and add a little more generalization to the mem_pool API Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.