All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/18] hardening allocations against integer overflow
@ 2016-02-15 21:45 Jeff King
  2016-02-15 21:49 ` [PATCH 01/18] add helpers for detecting size_t overflow Jeff King
                   ` (19 more replies)
  0 siblings, 20 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:45 UTC (permalink / raw)
  To: git

About 6 months or so ago, I did an audit of git's code base for uses of
strcpy and sprintf that could overflow, fixing any bugs and cleaning up
any suspect spots to make further audits simpler.  This is a
continuation of that work, for size computations which can overflow and
cause us to allocate a too-small buffer. E.g., something like:

  char *concat(const char *a, const char *b)
  {
	unsigned len_a = strlen(a);
	unsigned len_b = strlen(b);
	char *ret = xmalloc(len_a + len_b);
	memcpy(ret, a, len_a);
	memcpy(ret, b, len_b);
  }

will behave badly if the sum of "a" and "b" overflows "unsigned". There
are other variants, too (we are also truncating the return value from
strlen, and we'd frequently use "int" here, so the lengths can actually
be negative!). It also varies based on platform. If the sites use size_t
instead of int, then 64-bit systems are typically hard to trigger in
practice (just because you'd need petabytes to store "a" and "b" in the
first place).

The only bug I have actually confirmed in practice here is fixed by
patch 2 (which is why it's at the front). There's another one in
path_name(), but that function is already dropped by the nearby
jk/lose-name-path topic.

The rest are cleanups of spots which _might_ be buggy, but I didn't dig
too far to find out. As with the earlier audit, I tried to refactor
using helpers that make the code clearer and less error-prone. So maybe
they're fixing bugs or not, but they certainly make it easier to audit
the result for problems.

  [01/18]: add helpers for detecting size_t overflow
  [02/18]: tree-diff: catch integer overflow in combine_diff_path allocation
  [03/18]: harden REALLOC_ARRAY and xcalloc against size_t overflow
  [04/18]: add helpers for allocating flex-array structs
  [05/18]: convert trivial cases to ALLOC_ARRAY
  [06/18]: use xmallocz to avoid size arithmetic
  [07/18]: convert trivial cases to FLEX_ARRAY macros
  [08/18]: use st_add and st_mult for allocation size computation
  [09/18]: write_untracked_extension: use FLEX_ALLOC helper
  [10/18]: fast-import: simplify allocation in start_packfile
  [11/18]: fetch-pack: simplify add_sought_entry
  [12/18]: test-path-utils: fix normalize_path_copy output buffer size
  [13/18]: sequencer: simplify memory allocation of get_message
  [14/18]: git-compat-util: drop mempcpy compat code
  [15/18]: transport_anonymize_url: use xstrfmt
  [16/18]: diff_populate_gitlink: use a strbuf
  [17/18]: convert ewah/bitmap code to use xmalloc
  [18/18]: ewah: convert to REALLOC_ARRAY, etc

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH 01/18] add helpers for detecting size_t overflow
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
@ 2016-02-15 21:49 ` Jeff King
  2016-02-15 21:49 ` [PATCH 02/18] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:49 UTC (permalink / raw)
  To: git

Performing computations on size_t variables that we feed to
xmalloc and friends can be dangerous, as an integer overflow
can cause us to allocate a much smaller chunk than we
realized.

We already have unsigned_add_overflows(), but let's add
unsigned_mult_overflows() to that. Furthermore, rather than
have each site manually check and die on overflow, we can
provide some helpers that will:

  - promote the arguments to size_t, so that we know we are
    doing our computation in the same size of integer that
    will ultimately be fed to xmalloc

  - check and die on overflow

  - return the result so that computations can be done in
    the parameter list of xmalloc.

These functions are a lot uglier to use than normal
arithmetic operators (you have to do "st_add(foo, bar)"
instead of "foo + bar"). To at least limit the damage, we
also provide multi-valued versions. So rather than:

  st_add(st_add(a, b), st_add(c, d));

you can write:

  st_add4(a, b, c, d);

This isn't nearly as elegant as a varargs function, but it's
a lot harder to get it wrong. You don't have to remember to
add a sentinel value at the end, and the compiler will
complain if you get the number of arguments wrong. This
patch adds only the numbered variants required to convert
the current code base; we can easily add more later if
needed.

Signed-off-by: Jeff King <peff@peff.net>
---
The st_* names aren't amazing, but I think they mostly work. Suggestions
welcome, but please keep bikeshedding to a minimum. :)

I almost went with checked_add(), checked_mult(), etc. But these
inherently promote their arguments to size_t, so:

  int x = checked_add(a, b);

is buggy; the user _should_ be reminded of the result type in each call.

These could also build on compiler intrinsics for extra speed. I'm happy
to do that as a follow-up, but I doubt it really matters in practice.
We're about to call malloc() in all cases, so an extra integer
computation is almost certainly irrelevant. So I went for simplicity to
start with.

 git-compat-util.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 693a336..0c65033 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -96,6 +96,14 @@
 #define unsigned_add_overflows(a, b) \
     ((b) > maximum_unsigned_value_of_type(a) - (a))
 
+/*
+ * Returns true if the multiplication of "a" and "b" will
+ * overflow. The types of "a" and "b" must match and must be unsigned.
+ * Note that this macro evaluates "a" twice!
+ */
+#define unsigned_mult_overflows(a, b) \
+    ((a) && (b) > maximum_unsigned_value_of_type(a) / (a))
+
 #ifdef __GNUC__
 #define TYPEOF(x) (__typeof__(x))
 #else
@@ -713,6 +721,32 @@ extern void release_pack_memory(size_t);
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
+static inline size_t st_add(size_t a, size_t b)
+{
+	if (unsigned_add_overflows(a, b))
+		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a + b;
+}
+#define st_add3(a,b,c)   st_add((a),st_add((b),(c)))
+#define st_add4(a,b,c,d) st_add((a),st_add3((b),(c),(d)))
+
+static inline size_t st_mult(size_t a, size_t b)
+{
+	if (unsigned_mult_overflows(a, b))
+		die("size_t overflow: %"PRIuMAX" * %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a * b;
+}
+
+static inline size_t st_sub(size_t a, size_t b)
+{
+	if (a < b)
+		die("size_t underflow: %"PRIuMAX" - %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a - b;
+}
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size)      (alloca(size))
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 02/18] tree-diff: catch integer overflow in combine_diff_path allocation
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
  2016-02-15 21:49 ` [PATCH 01/18] add helpers for detecting size_t overflow Jeff King
@ 2016-02-15 21:49 ` Jeff King
  2016-02-15 21:50 ` [PATCH 03/18] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:49 UTC (permalink / raw)
  To: git

A combine_diff_path struct has two "flex" members allocated
alongside the struct: a string to hold the pathname, and an
array of parent pointers. We use an "int" to compute this,
meaning we may easily overflow it if the pathname is
extremely long.

We can fix this by using size_t, and checking for overflow
with the st_add helper.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.h      | 4 ++--
 tree-diff.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/diff.h b/diff.h
index 70b2d70..beafbbd 100644
--- a/diff.h
+++ b/diff.h
@@ -222,8 +222,8 @@ struct combine_diff_path {
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
-	(sizeof(struct combine_diff_path) + \
-	 sizeof(struct combine_diff_parent) * (n) + (l) + 1)
+	st_add4(sizeof(struct combine_diff_path), (l), 1, \
+		st_mult(sizeof(struct combine_diff_parent), (n)))
 
 extern void show_combined_diff(struct combine_diff_path *elem, int num_parent,
 			      int dense, struct rev_info *);
diff --git a/tree-diff.c b/tree-diff.c
index 290a1da..4dda9a1 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -124,8 +124,8 @@ static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
 	unsigned mode, const unsigned char *sha1)
 {
 	struct combine_diff_path *p;
-	int len = base->len + pathlen;
-	int alloclen = combine_diff_path_size(nparent, len);
+	size_t len = st_add(base->len, pathlen);
+	size_t alloclen = combine_diff_path_size(nparent, len);
 
 	/* if last->next is !NULL - it is a pre-allocated memory, we can reuse */
 	p = last->next;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 03/18] harden REALLOC_ARRAY and xcalloc against size_t overflow
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
  2016-02-15 21:49 ` [PATCH 01/18] add helpers for detecting size_t overflow Jeff King
  2016-02-15 21:49 ` [PATCH 02/18] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
@ 2016-02-15 21:50 ` Jeff King
  2016-02-15 21:50 ` [PATCH 04/18] add helpers for allocating flex-array structs Jeff King
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:50 UTC (permalink / raw)
  To: git

REALLOC_ARRAY inherently involves a multiplication which can
overflow size_t, resulting in a much smaller buffer than we
think we've allocated. We can easily harden it by using
st_mult() to check for overflow.  Likewise, we can add
ALLOC_ARRAY to do the same thing for xmalloc calls.

xcalloc() should already be fine, because it takes the two
factors separately, assuming the system calloc actually
checks for overflow. However, before we even hit the system
calloc(), we do our memory_limit_check, which involves a
multiplication. Let's check for overflow ourselves so that
this limit cannot be bypassed.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 3 ++-
 wrapper.c         | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 0c65033..55c073d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -779,7 +779,8 @@ extern int odb_pack_keep(char *name, size_t namesz, const unsigned char *sha1);
 extern char *xgetcwd(void);
 extern FILE *fopen_for_writing(const char *path);
 
-#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x)))
+#define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
+#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))
 
 static inline char *xstrdup_or_null(const char *str)
 {
diff --git a/wrapper.c b/wrapper.c
index 29a45d2..9afc1a0 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -152,6 +152,9 @@ void *xcalloc(size_t nmemb, size_t size)
 {
 	void *ret;
 
+	if (unsigned_mult_overflows(nmemb, size))
+		die("data too large to fit into virtual memory space");
+
 	memory_limit_check(size * nmemb, 0);
 	ret = calloc(nmemb, size);
 	if (!ret && (!nmemb || !size))
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 04/18] add helpers for allocating flex-array structs
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (2 preceding siblings ...)
  2016-02-15 21:50 ` [PATCH 03/18] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
@ 2016-02-15 21:50 ` Jeff King
  2016-02-16  1:47   ` Eric Sunshine
  2016-02-15 21:51 ` [PATCH 05/18] convert trivial cases to ALLOC_ARRAY Jeff King
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:50 UTC (permalink / raw)
  To: git

Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.

This patch adds a few helpers to turn simple cases of into a
one-liner that properly checks for overflow. See the
embedded documentation for details.

Ideally we could provide a more flexible version that could
handle multiple strings, like:

  FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);

But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 55c073d..8f23801 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -782,6 +782,68 @@ extern FILE *fopen_for_writing(const char *path);
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))
 
+/*
+ * These functions help you allocate structs with flex arrays, and copy
+ * the data directly into the array. For example, if you had:
+ *
+ *   struct foo {
+ *     int bar;
+ *     char name[FLEX_ARRAY];
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_MEM(f, name, src, len);
+ *
+ * to allocate a "foo" with the contents of "src" in the "name" field.
+ * The resulting struct is automatically zero'd, and the flex-array field
+ * is NUL-terminated (whether the incoming src buffer was or not).
+ *
+ * The FLEXPTR_* variants operate on structs that don't use flex-arrays,
+ * but do want to store a pointer to some extra data in the same allocated
+ * block. For example, if you have:
+ *
+ *   struct foo {
+ *     char *name;
+ *     int bar;
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_STR(f, name, src);
+ *
+ * and "name" will point to a block of memory after the struct, which will be
+ * freed along with the struct (but the pointer can be repoined anywhere).
+ *
+ * The *_STR variants accept a string parameter rather than a ptr/len
+ * combination.
+ *
+ * Note that these macros will evaluate the first parameter multiple
+ * times, and it must be assignable as an lvalue.
+ */
+#define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
+	(x) = NULL; /* silence -Wuninitialized for offset calculation */ \
+	(x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char *)(x), (buf), (len)); \
+} while (0)
+#define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
+	(x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
+	(x)->ptrname = (void *)((x)+1); \
+} while(0)
+#define FLEX_ALLOC_STR(x, flexname, str) \
+	FLEX_ALLOC_MEM((x), flexname, (str), strlen(str))
+#define FLEXPTR_ALLOC_STR(x, ptrname, str) \
+	FLEXPTR_ALLOC_MEM((x), ptrname, (str), strlen(str))
+
+static inline void *xalloc_flex(size_t base_len, size_t offset,
+				const void *src, size_t src_len)
+{
+	unsigned char *ret = xcalloc(1, st_add3(base_len, src_len, 1));
+	memcpy(ret + offset, src, src_len);
+	return ret;
+}
+
 static inline char *xstrdup_or_null(const char *str)
 {
 	return str ? xstrdup(str) : NULL;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 05/18] convert trivial cases to ALLOC_ARRAY
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (3 preceding siblings ...)
  2016-02-15 21:50 ` [PATCH 04/18] add helpers for allocating flex-array structs Jeff King
@ 2016-02-15 21:51 ` Jeff King
  2016-02-16  4:22   ` Eric Sunshine
  2016-02-15 21:52 ` [PATCH 06/18] use xmallocz to avoid size arithmetic Jeff King
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:51 UTC (permalink / raw)
  To: git

Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
---
 alias.c                  |  2 +-
 attr.c                   |  2 +-
 bisect.c                 |  4 ++--
 builtin/blame.c          |  3 ++-
 builtin/clean.c          |  2 +-
 builtin/fast-export.c    |  2 +-
 builtin/grep.c           |  3 ++-
 builtin/index-pack.c     |  4 ++--
 builtin/merge-base.c     |  2 +-
 builtin/mv.c             |  3 ++-
 builtin/pack-objects.c   |  7 ++++---
 builtin/pack-redundant.c |  2 +-
 builtin/receive-pack.c   |  7 +++----
 builtin/remote-ext.c     |  2 +-
 column.c                 |  2 +-
 combine-diff.c           |  4 ++--
 commit.c                 |  2 +-
 compat/mingw.c           |  6 +++---
 daemon.c                 |  2 +-
 diffcore-order.c         |  4 ++--
 dir.c                    |  6 +++---
 exec_cmd.c               |  2 +-
 fast-import.c            |  5 +++--
 fsck.c                   |  3 ++-
 git.c                    |  2 +-
 graph.c                  | 10 ++++------
 khash.h                  |  2 +-
 levenshtein.c            |  8 +++++---
 line-log.c               | 10 +++++-----
 notes.c                  |  2 +-
 pack-check.c             |  2 +-
 pack-revindex.c          | 12 ++++++++----
 pathspec.c               |  5 +++--
 remote-curl.c            |  6 ++++--
 run-command.c            |  2 +-
 sha1_file.c              |  4 ++--
 shallow.c                |  6 +++---
 show-index.c             |  3 ++-
 transport.c              |  2 +-
 xdiff-interface.c        |  2 +-
 40 files changed, 86 insertions(+), 73 deletions(-)

diff --git a/alias.c b/alias.c
index a11229d..3b90397 100644
--- a/alias.c
+++ b/alias.c
@@ -23,7 +23,7 @@ int split_cmdline(char *cmdline, const char ***argv)
 	int src, dst, count = 0, size = 16;
 	char quoted = 0;
 
-	*argv = xmalloc(sizeof(**argv) * size);
+	ALLOC_ARRAY(*argv, size);
 
 	/* split alias_string */
 	(*argv)[count++] = cmdline;
diff --git a/attr.c b/attr.c
index 086c08d..c83ec49 100644
--- a/attr.c
+++ b/attr.c
@@ -799,7 +799,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 			++count;
 	}
 	*num = count;
-	*check = xmalloc(sizeof(**check) * count);
+	ALLOC_ARRAY(*check, count);
 	j = 0;
 	for (i = 0; i < attr_nr; i++) {
 		const char *value = check_all_attr[i].value;
diff --git a/bisect.c b/bisect.c
index 06ec54e..7996c29 100644
--- a/bisect.c
+++ b/bisect.c
@@ -708,10 +708,10 @@ static struct commit *get_commit_reference(const unsigned char *sha1)
 
 static struct commit **get_bad_and_good_commits(int *rev_nr)
 {
-	int len = 1 + good_revs.nr;
-	struct commit **rev = xmalloc(len * sizeof(*rev));
+	struct commit **rev;
 	int i, n = 0;
 
+	ALLOC_ARRAY(rev, 1 + good_revs.nr);
 	rev[n++] = get_commit_reference(current_bad_oid->hash);
 	for (i = 0; i < good_revs.nr; i++)
 		rev[n++] = get_commit_reference(good_revs.sha1[i]);
diff --git a/builtin/blame.c b/builtin/blame.c
index 55bf5fa..b4ed462 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2059,7 +2059,8 @@ static int prepare_lines(struct scoreboard *sb)
 	for (p = buf; p < end; p = get_next_line(p, end))
 		num++;
 
-	sb->lineno = lineno = xmalloc(sizeof(*sb->lineno) * (num + 1));
+	ALLOC_ARRAY(sb->lineno, num + 1);
+	lineno = sb->lineno;
 
 	for (p = buf; p < end; p = get_next_line(p, end))
 		*lineno++ = p - buf;
diff --git a/builtin/clean.c b/builtin/clean.c
index 7b08237..8229f7e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -543,7 +543,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 	int eof = 0;
 	int i;
 
-	chosen = xmalloc(sizeof(int) * stuff->nr);
+	ALLOC_ARRAY(chosen, stuff->nr);
 	/* set chosen as uninitialized */
 	for (i = 0; i < stuff->nr; i++)
 		chosen[i] = -1;
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 2471297..8164b58 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1021,7 +1021,7 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 		const char **refspecs_str;
 		int i;
 
-		refspecs_str = xmalloc(sizeof(*refspecs_str) * refspecs_list.nr);
+		ALLOC_ARRAY(refspecs_str, refspecs_list.nr);
 		for (i = 0; i < refspecs_list.nr; i++)
 			refspecs_str[i] = refspecs_list.items[i].string;
 
diff --git a/builtin/grep.c b/builtin/grep.c
index 8c516a9..fa9c9ef 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -365,9 +365,10 @@ static void append_path(struct grep_opt *opt, const void *data, size_t len)
 static void run_pager(struct grep_opt *opt, const char *prefix)
 {
 	struct string_list *path_list = opt->output_priv;
-	const char **argv = xmalloc(sizeof(const char *) * (path_list->nr + 1));
+	const char **argv;
 	int i, status;
 
+	ALLOC_ARRAY(argv, path_list->nr + 1);
 	for (i = 0; i < path_list->nr; i++)
 		argv[i] = path_list->items[i].string;
 	argv[path_list->nr] = NULL;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 6a01509..a60bcfa 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1346,7 +1346,7 @@ static void fix_unresolved_deltas(struct sha1file *f)
 	 * before deltas depending on them, a good heuristic is to start
 	 * resolving deltas in the same order as their position in the pack.
 	 */
-	sorted_by_pos = xmalloc(nr_ref_deltas * sizeof(*sorted_by_pos));
+	ALLOC_ARRAY(sorted_by_pos, nr_ref_deltas);
 	for (i = 0; i < nr_ref_deltas; i++)
 		sorted_by_pos[i] = &ref_deltas[i];
 	qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
@@ -1759,7 +1759,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (show_stat)
 		show_pack_info(stat_only);
 
-	idx_objects = xmalloc((nr_objects) * sizeof(struct pack_idx_entry *));
+	ALLOC_ARRAY(idx_objects, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
 	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_sha1);
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index a891162..c0d1822 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -252,7 +252,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 	if (argc < 2)
 		usage_with_options(merge_base_usage, options);
 
-	rev = xmalloc(argc * sizeof(*rev));
+	ALLOC_ARRAY(rev, argc);
 	while (argc-- > 0)
 		rev[rev_nr++] = get_commit_reference(*argv++);
 	return show_merge_base(rev, rev_nr, show_all);
diff --git a/builtin/mv.c b/builtin/mv.c
index d1d4316..9a9813a 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -24,7 +24,8 @@ static const char **internal_copy_pathspec(const char *prefix,
 					   int count, unsigned flags)
 {
 	int i;
-	const char **result = xmalloc((count + 1) * sizeof(const char *));
+	const char **result;
+	ALLOC_ARRAY(result, count + 1);
 	memcpy(result, pathspec, count * sizeof(const char *));
 	result[count] = NULL;
 	for (i = 0; i < count; i++) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4dae5b1..b4f1fa6 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -624,7 +624,7 @@ static struct object_entry **compute_write_order(void)
 {
 	unsigned int i, wo_end, last_untagged;
 
-	struct object_entry **wo = xmalloc(to_pack.nr_objects * sizeof(*wo));
+	struct object_entry **wo;
 	struct object_entry *objects = to_pack.objects;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
@@ -657,6 +657,7 @@ static struct object_entry **compute_write_order(void)
 	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
+	ALLOC_ARRAY(wo, to_pack.nr_objects);
 	for (i = wo_end = 0; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			break;
@@ -769,7 +770,7 @@ static void write_pack_file(void)
 
 	if (progress > pack_to_stdout)
 		progress_state = start_progress(_("Writing objects"), nr_result);
-	written_list = xmalloc(to_pack.nr_objects * sizeof(*written_list));
+	ALLOC_ARRAY(written_list, to_pack.nr_objects);
 	write_order = compute_write_order();
 
 	do {
@@ -2129,7 +2130,7 @@ static void prepare_pack(int window, int depth)
 	if (!to_pack.nr_objects || !window || !depth)
 		return;
 
-	delta_list = xmalloc(to_pack.nr_objects * sizeof(*delta_list));
+	ALLOC_ARRAY(delta_list, to_pack.nr_objects);
 	nr_deltas = n = 0;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index d0532f6..72c8158 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -53,7 +53,7 @@ static inline struct llist_item *llist_item_get(void)
 		free_nodes = free_nodes->next;
 	} else {
 		int i = 1;
-		new = xmalloc(sizeof(struct llist_item) * BLKSIZE);
+		ALLOC_ARRAY(new, BLKSIZE);
 		for (; i < BLKSIZE; i++)
 			llist_item_put(&new[i]);
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f2d6761..ad2fdaa 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1044,7 +1044,7 @@ static void run_update_post_hook(struct command *commands)
 	if (!argc || !hook)
 		return;
 
-	argv = xmalloc(sizeof(*argv) * (2 + argc));
+	ALLOC_ARRAY(argv, 2 + argc);
 	argv[0] = hook;
 
 	for (argc = 1, cmd = commands; cmd; cmd = cmd->next) {
@@ -1597,8 +1597,7 @@ static void prepare_shallow_update(struct command *commands,
 {
 	int i, j, k, bitmap_size = (si->ref->nr + 31) / 32;
 
-	si->used_shallow = xmalloc(sizeof(*si->used_shallow) *
-				   si->shallow->nr);
+	ALLOC_ARRAY(si->used_shallow, si->shallow->nr);
 	assign_shallow_commits_to_refs(si, si->used_shallow, NULL);
 
 	si->need_reachability_test =
@@ -1664,7 +1663,7 @@ static void update_shallow_info(struct command *commands,
 		return;
 	}
 
-	ref_status = xmalloc(sizeof(*ref_status) * ref->nr);
+	ALLOC_ARRAY(ref_status, ref->nr);
 	assign_shallow_commits_to_refs(si, NULL, ref_status);
 	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (is_null_sha1(cmd->new_sha1))
diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c
index e3cd25d..681894a 100644
--- a/builtin/remote-ext.c
+++ b/builtin/remote-ext.c
@@ -133,7 +133,7 @@ static const char **parse_argv(const char *arg, const char *service)
 			temparray[arguments++] = expanded;
 	}
 
-	ret = xmalloc((arguments + 1) * sizeof(char *));
+	ALLOC_ARRAY(ret, arguments + 1);
 	for (i = 0; i < arguments; i++)
 		ret[i] = temparray[i];
 	ret[arguments] = NULL;
diff --git a/column.c b/column.c
index 786abe6..f9fda68 100644
--- a/column.c
+++ b/column.c
@@ -164,7 +164,7 @@ static void display_table(const struct string_list *list,
 	data.colopts = colopts;
 	data.opts = *opts;
 
-	data.len = xmalloc(sizeof(*data.len) * list->nr);
+	ALLOC_ARRAY(data.len, list->nr);
 	for (i = 0; i < list->nr; i++)
 		data.len[i] = item_length(colopts, list->items[i].string);
 
diff --git a/combine-diff.c b/combine-diff.c
index 5571304..a698016 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1372,7 +1372,7 @@ static struct combine_diff_path *find_paths_multitree(
 	struct combine_diff_path paths_head;
 	struct strbuf base;
 
-	parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0]));
+	ALLOC_ARRAY(parents_sha1, nparent);
 	for (i = 0; i < nparent; i++)
 		parents_sha1[i] = parents->sha1[i];
 
@@ -1483,7 +1483,7 @@ void diff_tree_combined(const unsigned char *sha1,
 	if (opt->orderfile && num_paths) {
 		struct obj_order *o;
 
-		o = xmalloc(sizeof(*o) * num_paths);
+		ALLOC_ARRAY(o, num_paths);
 		for (i = 0, p = paths; p; p = p->next, i++)
 			o[i].obj = p;
 		order_objects(opt->orderfile, path_path, o, num_paths);
diff --git a/commit.c b/commit.c
index 40388d7..31cd91f 100644
--- a/commit.c
+++ b/commit.c
@@ -903,7 +903,7 @@ static int remove_redundant(struct commit **array, int cnt)
 
 	work = xcalloc(cnt, sizeof(*work));
 	redundant = xcalloc(cnt, 1);
-	filled_index = xmalloc(sizeof(*filled_index) * (cnt - 1));
+	ALLOC_ARRAY(filled_index, cnt - 1);
 
 	for (i = 0; i < cnt; i++)
 		parse_commit(array[i]);
diff --git a/compat/mingw.c b/compat/mingw.c
index 77a51d3..0eabe68 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -854,7 +854,7 @@ static char **get_path_split(void)
 	if (!n)
 		return NULL;
 
-	path = xmalloc((n+1)*sizeof(char *));
+	ALLOC_ARRAY(path, n+1);
 	p = envpath;
 	i = 0;
 	do {
@@ -939,7 +939,7 @@ static wchar_t *make_environment_block(char **deltaenv)
 		i++;
 
 	/* copy the environment, leaving space for changes */
-	tmpenv = xmalloc((size + i) * sizeof(char*));
+	ALLOC_ARRAY(tmpenv, size + i);
 	memcpy(tmpenv, environ, size * sizeof(char*));
 
 	/* merge supplied environment changes into the temporary environment */
@@ -1129,7 +1129,7 @@ static int try_shell_exec(const char *cmd, char *const *argv)
 		int argc = 0;
 		const char **argv2;
 		while (argv[argc]) argc++;
-		argv2 = xmalloc(sizeof(*argv) * (argc+1));
+		ALLOC_ARRAY(argv2, argc + 1);
 		argv2[0] = (char *)cmd;	/* full path to the script file */
 		memcpy(&argv2[1], &argv[1], sizeof(*argv) * argc);
 		pid = mingw_spawnv(prog, argv2, 1);
diff --git a/daemon.c b/daemon.c
index 46b411c..0f48b34 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1374,7 +1374,7 @@ int main(int argc, char **argv)
 		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
 
 	/* prepare argv for serving-processes */
-	cld_argv = xmalloc(sizeof (char *) * (argc + 2));
+	ALLOC_ARRAY(cld_argv, argc + 2);
 	cld_argv[0] = argv[0];	/* git-daemon */
 	cld_argv[1] = "--serve";
 	for (i = 1; i < argc; ++i)
diff --git a/diffcore-order.c b/diffcore-order.c
index 97dd3d0..69d41f7 100644
--- a/diffcore-order.c
+++ b/diffcore-order.c
@@ -52,7 +52,7 @@ static void prepare_order(const char *orderfile)
 		}
 		if (pass == 0) {
 			order_cnt = cnt;
-			order = xmalloc(sizeof(*order) * cnt);
+			ALLOC_ARRAY(order, cnt);
 		}
 	}
 }
@@ -120,7 +120,7 @@ void diffcore_order(const char *orderfile)
 	if (!q->nr)
 		return;
 
-	o = xmalloc(sizeof(*o) * q->nr);
+	ALLOC_ARRAY(o, q->nr);
 	for (i = 0; i < q->nr; i++)
 		o[i].obj = q->queue[i];
 	order_objects(orderfile, pair_pathtwo, o, q->nr);
diff --git a/dir.c b/dir.c
index f0b6d0a..66c93c1 100644
--- a/dir.c
+++ b/dir.c
@@ -2484,14 +2484,14 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	ud.untracked_alloc = value;
 	ud.untracked_nr	   = value;
 	if (ud.untracked_nr)
-		ud.untracked = xmalloc(sizeof(*ud.untracked) * ud.untracked_nr);
+		ALLOC_ARRAY(ud.untracked, ud.untracked_nr);
 	data = next;
 
 	next = data;
 	ud.dirs_alloc = ud.dirs_nr = decode_varint(&next);
 	if (next > end)
 		return -1;
-	ud.dirs = xmalloc(sizeof(*ud.dirs) * ud.dirs_nr);
+	ALLOC_ARRAY(ud.dirs, ud.dirs_nr);
 	data = next;
 
 	len = strlen((const char *)data);
@@ -2611,7 +2611,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
 	rd.data	      = next;
 	rd.end	      = end;
 	rd.index      = 0;
-	rd.ucd        = xmalloc(sizeof(*rd.ucd) * len);
+	ALLOC_ARRAY(rd.ucd, len);
 
 	if (read_one_dir(&uc->root, &rd) || rd.index != len)
 		goto done;
diff --git a/exec_cmd.c b/exec_cmd.c
index e85f0fd..17b7918 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -114,7 +114,7 @@ const char **prepare_git_cmd(const char **argv)
 
 	for (argc = 0; argv[argc]; argc++)
 		; /* just counting */
-	nargv = xmalloc(sizeof(*nargv) * (argc + 2));
+	ALLOC_ARRAY(nargv, argc + 2);
 
 	nargv[0] = "git";
 	for (argc = 0; argv[argc]; argc++)
diff --git a/fast-import.c b/fast-import.c
index bf01b34..a6467cb 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -814,7 +814,8 @@ static struct tree_entry *new_tree_entry(void)
 	if (!avail_tree_entry) {
 		unsigned int n = tree_entry_alloc;
 		total_allocd += n * sizeof(struct tree_entry);
-		avail_tree_entry = e = xmalloc(n * sizeof(struct tree_entry));
+		ALLOC_ARRAY(e, n);
+		avail_tree_entry = e;
 		while (n-- > 1) {
 			*((void**)e) = e + 1;
 			e++;
@@ -898,7 +899,7 @@ static const char *create_index(void)
 	struct object_entry_pool *o;
 
 	/* Build the table of object IDs. */
-	idx = xmalloc(object_count * sizeof(*idx));
+	ALLOC_ARRAY(idx, object_count);
 	c = idx;
 	for (o = blocks; o; o = o->next_pool)
 		for (e = o->next_free; e-- != o->entries;)
diff --git a/fsck.c b/fsck.c
index c637f66..ca4c685 100644
--- a/fsck.c
+++ b/fsck.c
@@ -199,7 +199,8 @@ void fsck_set_msg_type(struct fsck_options *options,
 
 	if (!options->msg_type) {
 		int i;
-		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		int *msg_type;
+		ALLOC_ARRAY(msg_type, FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
 			msg_type[i] = fsck_msg_type(i, options);
 		options->msg_type = msg_type;
diff --git a/git.c b/git.c
index da278c3..0bd92a7 100644
--- a/git.c
+++ b/git.c
@@ -249,7 +249,7 @@ static int handle_alias(int *argcp, const char ***argv)
 			restore_env(1);
 
 			/* build alias_argv */
-			alias_argv = xmalloc(sizeof(*alias_argv) * (argc + 1));
+			ALLOC_ARRAY(alias_argv, argc + 1);
 			alias_argv[0] = alias_string + 1;
 			for (i = 1; i < argc; ++i)
 				alias_argv[i] = (*argv)[i];
diff --git a/graph.c b/graph.c
index c25a09a..1350bdd 100644
--- a/graph.c
+++ b/graph.c
@@ -234,12 +234,10 @@ struct git_graph *graph_init(struct rev_info *opt)
 	 * We'll automatically grow columns later if we need more room.
 	 */
 	graph->column_capacity = 30;
-	graph->columns = xmalloc(sizeof(struct column) *
-				 graph->column_capacity);
-	graph->new_columns = xmalloc(sizeof(struct column) *
-				     graph->column_capacity);
-	graph->mapping = xmalloc(sizeof(int) * 2 * graph->column_capacity);
-	graph->new_mapping = xmalloc(sizeof(int) * 2 * graph->column_capacity);
+	ALLOC_ARRAY(graph->columns, graph->column_capacity);
+	ALLOC_ARRAY(graph->new_columns, graph->column_capacity);
+	ALLOC_ARRAY(graph->mapping, 2 * graph->column_capacity);
+	ALLOC_ARRAY(graph->new_mapping, 2 * graph->column_capacity);
 
 	/*
 	 * The diff output prefix callback, with this we can make
diff --git a/khash.h b/khash.h
index 376475a..c0da40d 100644
--- a/khash.h
+++ b/khash.h
@@ -117,7 +117,7 @@ static const double __ac_HASH_UPPER = 0.77;
 			if (new_n_buckets < 4) new_n_buckets = 4;					\
 			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0;	/* requested size is too small */ \
 			else { /* hash table size to be changed (shrink or expand); rehash */ \
-				new_flags = (khint32_t*)xmalloc(__ac_fsize(new_n_buckets) * sizeof(khint32_t));	\
+				ALLOC_ARRAY(new_flags, __ac_fsize(new_n_buckets)); \
 				if (!new_flags) return -1;								\
 				memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
 				if (h->n_buckets < new_n_buckets) {	/* expand */		\
diff --git a/levenshtein.c b/levenshtein.c
index fc28159..d263269 100644
--- a/levenshtein.c
+++ b/levenshtein.c
@@ -42,11 +42,13 @@ int levenshtein(const char *string1, const char *string2,
 		int w, int s, int a, int d)
 {
 	int len1 = strlen(string1), len2 = strlen(string2);
-	int *row0 = xmalloc(sizeof(int) * (len2 + 1));
-	int *row1 = xmalloc(sizeof(int) * (len2 + 1));
-	int *row2 = xmalloc(sizeof(int) * (len2 + 1));
+	int *row0, *row1, *row2;
 	int i, j;
 
+	ALLOC_ARRAY(row0, len2 + 1);
+	ALLOC_ARRAY(row1, len2 + 1);
+	ALLOC_ARRAY(row2, len2 + 1);
+
 	for (j = 0; j <= len2; j++)
 		row1[j] = j * a;
 	for (i = 0; i < len1; i++) {
diff --git a/line-log.c b/line-log.c
index af6e2f7..2f781b4 100644
--- a/line-log.c
+++ b/line-log.c
@@ -521,7 +521,7 @@ static void fill_line_ends(struct diff_filespec *spec, long *lines,
 	if (diff_populate_filespec(spec, 0))
 		die("Cannot read blob %s", sha1_to_hex(spec->sha1));
 
-	ends = xmalloc(size * sizeof(*ends));
+	ALLOC_ARRAY(ends, size);
 	ends[cur++] = 0;
 	data = spec->data;
 	while (num < spec->size) {
@@ -753,7 +753,7 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
 			count++;
 			r = r->next;
 		}
-		paths = xmalloc((count+1)*sizeof(char *));
+		ALLOC_ARRAY(paths, count + 1);
 		r = range;
 		for (i = 0; i < count; i++) {
 			paths[i] = xstrdup(r->path);
@@ -1146,9 +1146,9 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
 	if (nparents > 1 && rev->first_parent_only)
 		nparents = 1;
 
-	diffqueues = xmalloc(nparents * sizeof(*diffqueues));
-	cand = xmalloc(nparents * sizeof(*cand));
-	parents = xmalloc(nparents * sizeof(*parents));
+	ALLOC_ARRAY(diffqueues, nparents);
+	ALLOC_ARRAY(cand, nparents);
+	ALLOC_ARRAY(parents, nparents);
 
 	p = commit->parents;
 	for (i = 0; i < nparents; i++) {
diff --git a/notes.c b/notes.c
index c1e5035..88cf474 100644
--- a/notes.c
+++ b/notes.c
@@ -1035,7 +1035,7 @@ struct notes_tree **load_notes_trees(struct string_list *refs, int flags)
 	struct string_list_item *item;
 	int counter = 0;
 	struct notes_tree **trees;
-	trees = xmalloc((refs->nr+1) * sizeof(struct notes_tree *));
+	ALLOC_ARRAY(trees, refs->nr + 1);
 	for_each_string_list_item(item, refs) {
 		struct notes_tree *t = xcalloc(1, sizeof(struct notes_tree));
 		init_notes(t, item->string, combine_notes_ignore, flags);
diff --git a/pack-check.c b/pack-check.c
index 433bd86..1da89a4 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -89,7 +89,7 @@ static int verify_packfile(struct packed_git *p,
 	 * we do not do scan-streaming check on the pack file.
 	 */
 	nr_objects = p->num_objects;
-	entries = xmalloc((nr_objects + 1) * sizeof(*entries));
+	ALLOC_ARRAY(entries, nr_objects + 1);
 	entries[nr_objects].offset = pack_sig_ofs;
 	/* first sort entries by pack offset, since unpacking them is more efficient that way */
 	for (i = 0; i < nr_objects; i++) {
diff --git a/pack-revindex.c b/pack-revindex.c
index 155a8a3..96d51c3 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -44,10 +44,14 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
 	 * keep track of them with alias pointers, always sorting from "from"
 	 * to "to".
 	 */
-	struct revindex_entry *tmp = xmalloc(n * sizeof(*tmp));
-	struct revindex_entry *from = entries, *to = tmp;
+	struct revindex_entry *tmp, *from, *to;
 	int bits;
-	unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));
+	unsigned *pos;
+
+	ALLOC_ARRAY(pos, BUCKETS);
+	ALLOC_ARRAY(tmp, n);
+	from = entries;
+	to = tmp;
 
 	/*
 	 * If (max >> bits) is zero, then we know that the radix digit we are
@@ -121,7 +125,7 @@ static void create_pack_revindex(struct packed_git *p)
 	unsigned i;
 	const char *index = p->index_data;
 
-	p->revindex = xmalloc(sizeof(*p->revindex) * (num_ent + 1));
+	ALLOC_ARRAY(p->revindex, num_ent + 1);
 	index += 4 * 256;
 
 	if (p->index_version > 1) {
diff --git a/pathspec.c b/pathspec.c
index 9304ee3..c9e9b6c 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -406,7 +406,8 @@ void parse_pathspec(struct pathspec *pathspec,
 		n++;
 
 	pathspec->nr = n;
-	pathspec->items = item = xmalloc(sizeof(*item) * n);
+	ALLOC_ARRAY(pathspec->items, n);
+	item = pathspec->items;
 	pathspec->_raw = argv;
 	prefixlen = prefix ? strlen(prefix) : 0;
 
@@ -483,7 +484,7 @@ const char **get_pathspec(const char *prefix, const char **pathspec)
 void copy_pathspec(struct pathspec *dst, const struct pathspec *src)
 {
 	*dst = *src;
-	dst->items = xmalloc(sizeof(struct pathspec_item) * dst->nr);
+	ALLOC_ARRAY(dst->items, dst->nr);
 	memcpy(dst->items, src->items,
 	       sizeof(struct pathspec_item) * dst->nr);
 }
diff --git a/remote-curl.c b/remote-curl.c
index c704857..49d8c3b 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -696,9 +696,10 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 static int fetch_dumb(int nr_heads, struct ref **to_fetch)
 {
 	struct walker *walker;
-	char **targets = xmalloc(nr_heads * sizeof(char*));
+	char **targets;
 	int ret, i;
 
+	ALLOC_ARRAY(targets, nr_heads);
 	if (options.depth)
 		die("dumb http transport does not support --depth");
 	for (i = 0; i < nr_heads; i++)
@@ -845,9 +846,10 @@ static void parse_fetch(struct strbuf *buf)
 
 static int push_dav(int nr_spec, char **specs)
 {
-	const char **argv = xmalloc((10 + nr_spec) * sizeof(char*));
+	const char **argv;
 	int argc = 0, i;
 
+	ALLOC_ARRAY(argv, 10 + nr_spec);
 	argv[argc++] = "http-push";
 	argv[argc++] = "--helper-status";
 	if (options.dry_run)
diff --git a/run-command.c b/run-command.c
index cdf0184..a8a293a 100644
--- a/run-command.c
+++ b/run-command.c
@@ -168,7 +168,7 @@ static const char **prepare_shell_cmd(const char **argv)
 	for (argc = 0; argv[argc]; argc++)
 		; /* just counting */
 	/* +1 for NULL, +3 for "sh -c" plus extra $0 */
-	nargv = xmalloc(sizeof(*nargv) * (argc + 1 + 3));
+	ALLOC_ARRAY(nargv, argc + 1 + 3);
 
 	if (argc < 1)
 		die("BUG: shell command is empty");
diff --git a/sha1_file.c b/sha1_file.c
index aab1872..2f1c6d3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1942,7 +1942,7 @@ static enum object_type packed_to_object_type(struct packed_git *p,
 		/* Push the object we're going to leave behind */
 		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
 			poi_stack_alloc = alloc_nr(poi_stack_nr);
-			poi_stack = xmalloc(sizeof(off_t)*poi_stack_alloc);
+			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
 			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
 		} else {
 			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
@@ -2308,7 +2308,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 		if (delta_stack_nr >= delta_stack_alloc
 		    && delta_stack == small_delta_stack) {
 			delta_stack_alloc = alloc_nr(delta_stack_nr);
-			delta_stack = xmalloc(sizeof(*delta_stack)*delta_stack_alloc);
+			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
 			memcpy(delta_stack, small_delta_stack,
 			       sizeof(*delta_stack)*delta_stack_nr);
 		} else {
diff --git a/shallow.c b/shallow.c
index 60f1505..71163bf 100644
--- a/shallow.c
+++ b/shallow.c
@@ -315,8 +315,8 @@ void prepare_shallow_info(struct shallow_info *info, struct sha1_array *sa)
 	info->shallow = sa;
 	if (!sa)
 		return;
-	info->ours = xmalloc(sizeof(*info->ours) * sa->nr);
-	info->theirs = xmalloc(sizeof(*info->theirs) * sa->nr);
+	ALLOC_ARRAY(info->ours, sa->nr);
+	ALLOC_ARRAY(info->theirs, sa->nr);
 	for (i = 0; i < sa->nr; i++) {
 		if (has_sha1_file(sa->sha1[i])) {
 			struct commit_graft *graft;
@@ -487,7 +487,7 @@ void assign_shallow_commits_to_refs(struct shallow_info *info,
 	struct paint_info pi;
 
 	trace_printf_key(&trace_shallow, "shallow: assign_shallow_commits_to_refs\n");
-	shallow = xmalloc(sizeof(*shallow) * (info->nr_ours + info->nr_theirs));
+	ALLOC_ARRAY(shallow, info->nr_ours + info->nr_theirs);
 	for (i = 0; i < info->nr_ours; i++)
 		shallow[nr_shallow++] = info->ours[i];
 	for (i = 0; i < info->nr_theirs; i++)
diff --git a/show-index.c b/show-index.c
index d9e4903..acf8d54 100644
--- a/show-index.c
+++ b/show-index.c
@@ -50,7 +50,8 @@ int main(int argc, char **argv)
 			unsigned char sha1[20];
 			uint32_t crc;
 			uint32_t off;
-		} *entries = xmalloc(nr * sizeof(entries[0]));
+		} *entries;
+		ALLOC_ARRAY(entries, nr);
 		for (i = 0; i < nr; i++)
 			if (fread(entries[i].sha1, 20, 1, stdin) != 1)
 				die("unable to read sha1 %u/%u", i, nr);
diff --git a/transport.c b/transport.c
index 9ae7184..5c63295 100644
--- a/transport.c
+++ b/transport.c
@@ -1307,7 +1307,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 		 * This condition shouldn't be met in a non-deepening fetch
 		 * (see builtin/fetch.c:quickfetch()).
 		 */
-		heads = xmalloc(nr_refs * sizeof(*heads));
+		ALLOC_ARRAY(heads, nr_refs);
 		for (rm = refs; rm; rm = rm->next)
 			heads[nr_heads++] = rm;
 	}
diff --git a/xdiff-interface.c b/xdiff-interface.c
index cb67c1c..54236f2 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -265,7 +265,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
 	for (i = 0, regs->nr = 1; value[i]; i++)
 		if (value[i] == '\n')
 			regs->nr++;
-	regs->array = xmalloc(regs->nr * sizeof(struct ff_reg));
+	ALLOC_ARRAY(regs->array, regs->nr);
 	for (i = 0; i < regs->nr; i++) {
 		struct ff_reg *reg = regs->array + i;
 		const char *ep = strchr(value, '\n'), *expression;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 06/18] use xmallocz to avoid size arithmetic
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (4 preceding siblings ...)
  2016-02-15 21:51 ` [PATCH 05/18] convert trivial cases to ALLOC_ARRAY Jeff King
@ 2016-02-15 21:52 ` Jeff King
  2016-02-15 21:52 ` [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros Jeff King
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:52 UTC (permalink / raw)
  To: git

We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/check-ref-format.c | 2 +-
 builtin/merge-tree.c       | 2 +-
 builtin/worktree.c         | 2 +-
 column.c                   | 3 +--
 combine-diff.c             | 4 +---
 config.c                   | 4 +---
 dir.c                      | 2 +-
 entry.c                    | 2 +-
 grep.c                     | 3 +--
 imap-send.c                | 5 ++---
 ll-merge.c                 | 2 +-
 progress.c                 | 2 +-
 refs.c                     | 2 +-
 setup.c                    | 5 ++---
 strbuf.c                   | 2 +-
 15 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/builtin/check-ref-format.c b/builtin/check-ref-format.c
index fd915d5..eac4994 100644
--- a/builtin/check-ref-format.c
+++ b/builtin/check-ref-format.c
@@ -20,7 +20,7 @@ static const char builtin_check_ref_format_usage[] =
  */
 static char *collapse_slashes(const char *refname)
 {
-	char *ret = xmalloc(strlen(refname) + 1);
+	char *ret = xmallocz(strlen(refname));
 	char ch;
 	char prev = '/';
 	char *cp = ret;
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index d4f0cbd..ca57004 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -174,7 +174,7 @@ static struct merge_list *create_entry(unsigned stage, unsigned mode, const unsi
 
 static char *traverse_path(const struct traverse_info *info, const struct name_entry *n)
 {
-	char *path = xmalloc(traverse_path_len(info, n) + 1);
+	char *path = xmallocz(traverse_path_len(info, n));
 	return make_traverse_path(path, info, n);
 }
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 475b958..0a45710 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -52,7 +52,7 @@ static int prune_worktree(const char *id, struct strbuf *reason)
 		return 1;
 	}
 	len = st.st_size;
-	path = xmalloc(len + 1);
+	path = xmallocz(len);
 	read_in_full(fd, path, len);
 	close(fd);
 	while (len && (path[len - 1] == '\n' || path[len - 1] == '\r'))
diff --git a/column.c b/column.c
index f9fda68..d55ead1 100644
--- a/column.c
+++ b/column.c
@@ -173,9 +173,8 @@ static void display_table(const struct string_list *list,
 	if (colopts & COL_DENSE)
 		shrink_columns(&data);
 
-	empty_cell = xmalloc(initial_width + 1);
+	empty_cell = xmallocz(initial_width);
 	memset(empty_cell, ' ', initial_width);
-	empty_cell[initial_width] = '\0';
 	for (y = 0; y < data.rows; y++) {
 		for (x = 0; x < data.cols; x++)
 			if (display_cell(&data, initial_width, empty_cell, x, y))
diff --git a/combine-diff.c b/combine-diff.c
index a698016..890c415 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1043,7 +1043,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 				elem->mode = canon_mode(S_IFLNK);
 
 			result_size = len;
-			result = xmalloc(len + 1);
+			result = xmallocz(len);
 
 			done = read_in_full(fd, result, len);
 			if (done < 0)
@@ -1051,8 +1051,6 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 			else if (done < len)
 				die("early EOF '%s'", elem->path);
 
-			result[len] = 0;
-
 			/* If not a fake symlink, apply filters, e.g. autocrlf */
 			if (is_file) {
 				struct strbuf buf = STRBUF_INIT;
diff --git a/config.c b/config.c
index b95ac3a..e7b589a 100644
--- a/config.c
+++ b/config.c
@@ -1902,7 +1902,7 @@ static int git_config_parse_key_1(const char *key, char **store_key, int *basele
 	 * Validate the key and while at it, lower case it for matching.
 	 */
 	if (store_key)
-		*store_key = xmalloc(strlen(key) + 1);
+		*store_key = xmallocz(strlen(key));
 
 	dot = 0;
 	for (i = 0; key[i]; i++) {
@@ -1926,8 +1926,6 @@ static int git_config_parse_key_1(const char *key, char **store_key, int *basele
 		if (store_key)
 			(*store_key)[i] = c;
 	}
-	if (store_key)
-		(*store_key)[i] = 0;
 
 	return 0;
 
diff --git a/dir.c b/dir.c
index 66c93c1..6c627d5 100644
--- a/dir.c
+++ b/dir.c
@@ -711,7 +711,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 			close(fd);
 			return 0;
 		}
-		buf = xmalloc(size+1);
+		buf = xmallocz(size);
 		if (read_in_full(fd, buf, size) != size) {
 			free(buf);
 			close(fd);
diff --git a/entry.c b/entry.c
index 582c400..a410957 100644
--- a/entry.c
+++ b/entry.c
@@ -6,7 +6,7 @@
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
 {
-	char *buf = xmalloc(path_len + 1);
+	char *buf = xmallocz(path_len);
 	int len = 0;
 
 	while (len < path_len) {
diff --git a/grep.c b/grep.c
index 7b2b96a..528b652 100644
--- a/grep.c
+++ b/grep.c
@@ -1741,7 +1741,7 @@ static int grep_source_load_file(struct grep_source *gs)
 	i = open(filename, O_RDONLY);
 	if (i < 0)
 		goto err_ret;
-	data = xmalloc(size + 1);
+	data = xmallocz(size);
 	if (st.st_size != read_in_full(i, data, size)) {
 		error(_("'%s': short read %s"), filename, strerror(errno));
 		close(i);
@@ -1749,7 +1749,6 @@ static int grep_source_load_file(struct grep_source *gs)
 		return -1;
 	}
 	close(i);
-	data[size] = 0;
 
 	gs->buf = data;
 	gs->size = size;
diff --git a/imap-send.c b/imap-send.c
index 4d3b773..2c52027 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -892,12 +892,11 @@ static char *cram(const char *challenge_64, const char *user, const char *pass)
 	response = xstrfmt("%s %s", user, hex);
 	resp_len = strlen(response) + 1;
 
-	response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1);
+	response_64 = xmallocz(ENCODED_SIZE(resp_len));
 	encoded_len = EVP_EncodeBlock((unsigned char *)response_64,
 				      (unsigned char *)response, resp_len);
 	if (encoded_len < 0)
 		die("EVP_EncodeBlock error");
-	response_64[encoded_len] = '\0';
 	return (char *)response_64;
 }
 
@@ -1188,7 +1187,7 @@ static void lf_to_crlf(struct strbuf *msg)
 		j++;
 	}
 
-	new = xmalloc(j + 1);
+	new = xmallocz(j);
 
 	/*
 	 * Second pass: write the new string.  Note that this loop is
diff --git a/ll-merge.c b/ll-merge.c
index 0338630..ff4a43a 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -205,7 +205,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	if (fstat(fd, &st))
 		goto close_bad;
 	result->size = st.st_size;
-	result->ptr = xmalloc(result->size + 1);
+	result->ptr = xmallocz(result->size);
 	if (read_in_full(fd, result->ptr, result->size) != result->size) {
 		free(result->ptr);
 		result->ptr = NULL;
diff --git a/progress.c b/progress.c
index 353bd37..76a88c5 100644
--- a/progress.c
+++ b/progress.c
@@ -247,7 +247,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		size_t len = strlen(msg) + 5;
 		struct throughput *tp = progress->throughput;
 
-		bufp = (len < sizeof(buf)) ? buf : xmalloc(len + 1);
+		bufp = (len < sizeof(buf)) ? buf : xmallocz(len);
 		if (tp) {
 			unsigned int rate = !tp->avg_misecs ? 0 :
 					tp->avg_bytes / tp->avg_misecs;
diff --git a/refs.c b/refs.c
index e2d34b2..1d9e2a7 100644
--- a/refs.c
+++ b/refs.c
@@ -124,7 +124,7 @@ int refname_is_safe(const char *refname)
 		char *buf;
 		int result;
 
-		buf = xmalloc(strlen(refname) + 1);
+		buf = xmallocz(strlen(refname));
 		/*
 		 * Does the refname try to escape refs/?
 		 * For example: refs/foo/../bar is safe but refs/foo/../../bar
diff --git a/setup.c b/setup.c
index 0deb022..ab21086 100644
--- a/setup.c
+++ b/setup.c
@@ -88,7 +88,7 @@ char *prefix_path_gently(const char *prefix, int len,
 	const char *orig = path;
 	char *sanitized;
 	if (is_absolute_path(orig)) {
-		sanitized = xmalloc(strlen(path) + 1);
+		sanitized = xmallocz(strlen(path));
 		if (remaining_prefix)
 			*remaining_prefix = 0;
 		if (normalize_path_copy_len(sanitized, path, remaining_prefix)) {
@@ -488,14 +488,13 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
 		error_code = READ_GITFILE_ERR_OPEN_FAILED;
 		goto cleanup_return;
 	}
-	buf = xmalloc(st.st_size + 1);
+	buf = xmallocz(st.st_size);
 	len = read_in_full(fd, buf, st.st_size);
 	close(fd);
 	if (len != st.st_size) {
 		error_code = READ_GITFILE_ERR_READ_FAILED;
 		goto cleanup_return;
 	}
-	buf[len] = '\0';
 	if (!starts_with(buf, "gitdir: ")) {
 		error_code = READ_GITFILE_ERR_INVALID_FORMAT;
 		goto cleanup_return;
diff --git a/strbuf.c b/strbuf.c
index bab316d..f60e2ee 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -718,7 +718,7 @@ char *xstrdup_tolower(const char *string)
 	size_t len, i;
 
 	len = strlen(string);
-	result = xmalloc(len + 1);
+	result = xmallocz(len);
 	for (i = 0; i < len; i++)
 		result[i] = tolower(string[i]);
 	result[i] = '\0';
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (5 preceding siblings ...)
  2016-02-15 21:52 ` [PATCH 06/18] use xmallocz to avoid size arithmetic Jeff King
@ 2016-02-15 21:52 ` Jeff King
  2016-02-16  2:17   ` Eric Sunshine
  2016-02-15 21:53 ` [PATCH 08/18] use st_add and st_mult for allocation size computation Jeff King
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:52 UTC (permalink / raw)
  To: git

Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.

Signed-off-by: Jeff King <peff@peff.net>
---
 attr.c               |  4 +---
 builtin/blame.c      |  4 +---
 builtin/help.c       |  9 +++------
 builtin/mktree.c     |  9 +++++----
 builtin/reflog.c     |  7 ++-----
 cache-tree.c         |  4 +---
 combine-diff.c       |  4 +---
 diff.c               |  7 ++-----
 dir.c                | 16 +++-------------
 hashmap.c            |  3 +--
 help.c               |  6 ++----
 log-tree.c           |  5 ++---
 name-hash.c          |  3 +--
 ref-filter.c         |  6 ++----
 refs.c               |  6 ++----
 refs/files-backend.c | 19 +++++--------------
 remote.c             |  5 +----
 17 files changed, 35 insertions(+), 82 deletions(-)

diff --git a/attr.c b/attr.c
index c83ec49..6537a43 100644
--- a/attr.c
+++ b/attr.c
@@ -93,9 +93,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	if (invalid_attr_name(name, len))
 		return NULL;
 
-	a = xmalloc(sizeof(*a) + len + 1);
-	memcpy(a->name, name, len);
-	a->name[len] = 0;
+	FLEX_ALLOC_MEM(a, name, name, len);
 	a->h = hval;
 	a->next = git_attr_hash[pos];
 	a->attr_nr = attr_nr++;
diff --git a/builtin/blame.c b/builtin/blame.c
index b4ed462..e982fb8 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -466,13 +466,11 @@ static void queue_blames(struct scoreboard *sb, struct origin *porigin,
 static struct origin *make_origin(struct commit *commit, const char *path)
 {
 	struct origin *o;
-	size_t pathlen = strlen(path) + 1;
-	o = xcalloc(1, sizeof(*o) + pathlen);
+	FLEX_ALLOC_STR(o, path, path);
 	o->commit = commit;
 	o->refcnt = 1;
 	o->next = commit->util;
 	commit->util = o;
-	memcpy(o->path, path, pathlen); /* includes NUL */
 	return o;
 }
 
diff --git a/builtin/help.c b/builtin/help.c
index 1cd0c1e..3c55ce4 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -171,12 +171,10 @@ static void exec_man_cmd(const char *cmd, const char *page)
 static void add_man_viewer(const char *name)
 {
 	struct man_viewer_list **p = &man_viewer_list;
-	size_t len = strlen(name);
 
 	while (*p)
 		p = &((*p)->next);
-	*p = xcalloc(1, (sizeof(**p) + len + 1));
-	memcpy((*p)->name, name, len); /* NUL-terminated by xcalloc */
+	FLEX_ALLOC_STR(*p, name, name);
 }
 
 static int supported_man_viewer(const char *name, size_t len)
@@ -190,9 +188,8 @@ static void do_add_man_viewer_info(const char *name,
 				   size_t len,
 				   const char *value)
 {
-	struct man_viewer_info_list *new = xcalloc(1, sizeof(*new) + len + 1);
-
-	memcpy(new->name, name, len); /* NUL-terminated by xcalloc */
+	struct man_viewer_info_list *new;
+	FLEX_ALLOC_MEM(new, name, name, len);
 	new->info = xstrdup(value);
 	new->next = man_viewer_info_list;
 	man_viewer_info_list = new;
diff --git a/builtin/mktree.c b/builtin/mktree.c
index a237caa..4282b62 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -19,16 +19,17 @@ static int alloc, used;
 static void append_to_tree(unsigned mode, unsigned char *sha1, char *path)
 {
 	struct treeent *ent;
-	int len = strlen(path);
+	size_t len = strlen(path);
 	if (strchr(path, '/'))
 		die("path %s contains slash", path);
 
-	ALLOC_GROW(entries, used + 1, alloc);
-	ent = entries[used++] = xmalloc(sizeof(**entries) + len + 1);
+	FLEX_ALLOC_MEM(ent, name, path, len);
 	ent->mode = mode;
 	ent->len = len;
 	hashcpy(ent->sha1, sha1);
-	memcpy(ent->name, path, len+1);
+
+	ALLOC_GROW(entries, used + 1, alloc);
+	entries[used++] = ent;
 }
 
 static int ent_compare(const void *a_, const void *b_)
diff --git a/builtin/reflog.c b/builtin/reflog.c
index f39960e..7c1990f 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -382,11 +382,9 @@ static int collect_reflog(const char *ref, const struct object_id *oid, int unus
 {
 	struct collected_reflog *e;
 	struct collect_reflog_cb *cb = cb_data;
-	size_t namelen = strlen(ref);
 
-	e = xmalloc(sizeof(*e) + namelen + 1);
+	FLEX_ALLOC_STR(e, reflog, ref);
 	hashcpy(e->sha1, oid->hash);
-	memcpy(e->reflog, ref, namelen + 1);
 	ALLOC_GROW(cb->e, cb->nr + 1, cb->alloc);
 	cb->e[cb->nr++] = e;
 	return 0;
@@ -412,8 +410,7 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		    !memcmp(ent->pattern, pattern, len))
 			return ent;
 
-	ent = xcalloc(1, (sizeof(*ent) + len));
-	memcpy(ent->pattern, pattern, len);
+	FLEX_ALLOC_MEM(ent, pattern, pattern, len);
 	ent->len = len;
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
diff --git a/cache-tree.c b/cache-tree.c
index 20ee7b5..3ebf9c3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -79,11 +79,9 @@ static struct cache_tree_sub *find_subtree(struct cache_tree *it,
 	ALLOC_GROW(it->down, it->subtree_nr + 1, it->subtree_alloc);
 	it->subtree_nr++;
 
-	down = xmalloc(sizeof(*down) + pathlen + 1);
+	FLEX_ALLOC_MEM(down, name, path, pathlen);
 	down->cache_tree = NULL;
 	down->namelen = pathlen;
-	memcpy(down->name, path, pathlen);
-	down->name[pathlen] = 0;
 
 	if (pos < it->subtree_nr)
 		memmove(it->down + pos + 1,
diff --git a/combine-diff.c b/combine-diff.c
index 890c415..be09a2b 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -319,7 +319,7 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	if (line[len-1] == '\n')
 		len--;
 
-	lline = xmalloc(sizeof(*lline) + len + 1);
+	FLEX_ALLOC_MEM(lline, line, line, len);
 	lline->len = len;
 	lline->next = NULL;
 	lline->prev = sline->plost.lost_tail;
@@ -330,8 +330,6 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	sline->plost.lost_tail = lline;
 	sline->plost.len++;
 	lline->parent_map = this_mask;
-	memcpy(lline->line, line, len);
-	lline->line[len] = 0;
 }
 
 struct combine_diff_state {
diff --git a/diff.c b/diff.c
index 2136b69..27d14a7 100644
--- a/diff.c
+++ b/diff.c
@@ -2607,12 +2607,9 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 
 struct diff_filespec *alloc_filespec(const char *path)
 {
-	int namelen = strlen(path);
-	struct diff_filespec *spec = xmalloc(sizeof(*spec) + namelen + 1);
+	struct diff_filespec *spec;
 
-	memset(spec, 0, sizeof(*spec));
-	spec->path = (char *)(spec + 1);
-	memcpy(spec->path, path, namelen+1);
+	FLEXPTR_ALLOC_STR(spec, path, path);
 	spec->count = 1;
 	spec->is_binary = -1;
 	return spec;
diff --git a/dir.c b/dir.c
index 6c627d5..f06ebb7 100644
--- a/dir.c
+++ b/dir.c
@@ -503,12 +503,7 @@ void add_exclude(const char *string, const char *base,
 
 	parse_exclude_pattern(&string, &patternlen, &flags, &nowildcardlen);
 	if (flags & EXC_FLAG_MUSTBEDIR) {
-		char *s;
-		x = xmalloc(sizeof(*x) + patternlen + 1);
-		s = (char *)(x+1);
-		memcpy(s, string, patternlen);
-		s[patternlen] = '\0';
-		x->pattern = s;
+		FLEXPTR_ALLOC_MEM(x, pattern, string, patternlen);
 	} else {
 		x = xmalloc(sizeof(*x));
 		x->pattern = string;
@@ -625,10 +620,7 @@ static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
 	}
 
 	uc->dir_created++;
-	d = xmalloc(sizeof(*d) + len + 1);
-	memset(d, 0, sizeof(*d));
-	memcpy(d->name, name, len);
-	d->name[len] = '\0';
+	FLEX_ALLOC_MEM(d, name, name, len);
 
 	ALLOC_GROW(dir->dirs, dir->dirs_nr + 1, dir->dirs_alloc);
 	memmove(dir->dirs + first + 1, dir->dirs + first,
@@ -1167,10 +1159,8 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
 {
 	struct dir_entry *ent;
 
-	ent = xmalloc(sizeof(*ent) + len + 1);
+	FLEX_ALLOC_MEM(ent, name, pathname, len);
 	ent->len = len;
-	memcpy(ent->name, pathname, len);
-	ent->name[len] = 0;
 	return ent;
 }
 
diff --git a/hashmap.c b/hashmap.c
index f693839..b10b642 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -256,10 +256,9 @@ const void *memintern(const void *data, size_t len)
 	e = hashmap_get(&map, &key, data);
 	if (!e) {
 		/* not found: create it */
-		e = xmallocz(sizeof(struct pool_entry) + len);
+		FLEX_ALLOC_MEM(e, data, data, len);
 		hashmap_entry_init(e, key.ent.hash);
 		e->len = len;
-		memcpy(e->data, data, len);
 		hashmap_add(&map, e);
 	}
 	return e->data;
diff --git a/help.c b/help.c
index d996b34..19328ea 100644
--- a/help.c
+++ b/help.c
@@ -11,11 +11,9 @@
 
 void add_cmdname(struct cmdnames *cmds, const char *name, int len)
 {
-	struct cmdname *ent = xmalloc(sizeof(*ent) + len + 1);
-
+	struct cmdname *ent;
+	FLEX_ALLOC_MEM(ent, name, name, len);
 	ent->len = len;
-	memcpy(ent->name, name, len);
-	ent->name[len] = 0;
 
 	ALLOC_GROW(cmds->names, cmds->cnt + 1, cmds->alloc);
 	cmds->names[cmds->cnt++] = ent;
diff --git a/log-tree.c b/log-tree.c
index f70a30e..60f9839 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -77,9 +77,8 @@ int parse_decorate_color_config(const char *var, const char *slot_name, const ch
 
 void add_name_decoration(enum decoration_type type, const char *name, struct object *obj)
 {
-	int nlen = strlen(name);
-	struct name_decoration *res = xmalloc(sizeof(*res) + nlen + 1);
-	memcpy(res->name, name, nlen + 1);
+	struct name_decoration *res;
+	FLEX_ALLOC_STR(res, name, name);
 	res->type = type;
 	res->next = add_decoration(&name_decoration, obj, res);
 }
diff --git a/name-hash.c b/name-hash.c
index 332ba95..6d9f23e 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -55,10 +55,9 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 	dir = find_dir_entry(istate, ce->name, namelen);
 	if (!dir) {
 		/* not found, create it and add to hash table */
-		dir = xcalloc(1, sizeof(struct dir_entry) + namelen + 1);
+		FLEX_ALLOC_MEM(dir, name, ce->name, namelen);
 		hashmap_entry_init(dir, memihash(ce->name, namelen));
 		dir->namelen = namelen;
-		strncpy(dir->name, ce->name, namelen);
 		hashmap_add(&istate->dir_hash, dir);
 
 		/* recursively add missing parent directories */
diff --git a/ref-filter.c b/ref-filter.c
index f097176..9ccfc51 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1255,10 +1255,8 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 						 const unsigned char *objectname,
 						 int flag)
 {
-	size_t len = strlen(refname);
-	struct ref_array_item *ref = xcalloc(1, sizeof(struct ref_array_item) + len + 1);
-	memcpy(ref->refname, refname, len);
-	ref->refname[len] = '\0';
+	struct ref_array_item *ref;
+	FLEX_ALLOC_STR(ref, refname, refname);
 	hashcpy(ref->objectname, objectname);
 	ref->flag = flag;
 
diff --git a/refs.c b/refs.c
index 1d9e2a7..2d86445 100644
--- a/refs.c
+++ b/refs.c
@@ -761,10 +761,8 @@ void ref_transaction_free(struct ref_transaction *transaction)
 static struct ref_update *add_update(struct ref_transaction *transaction,
 				     const char *refname)
 {
-	size_t len = strlen(refname) + 1;
-	struct ref_update *update = xcalloc(1, sizeof(*update) + len);
-
-	memcpy((char *)update->refname, refname, len); /* includes NUL */
+	struct ref_update *update;
+	FLEX_ALLOC_STR(update, refname, refname);
 	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
 	transaction->updates[transaction->nr++] = update;
 	return update;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b569762..81f68f8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -199,17 +199,14 @@ static struct ref_entry *create_ref_entry(const char *refname,
 					  const unsigned char *sha1, int flag,
 					  int check_name)
 {
-	int len;
 	struct ref_entry *ref;
 
 	if (check_name &&
 	    check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
 		die("Reference has invalid format: '%s'", refname);
-	len = strlen(refname) + 1;
-	ref = xmalloc(sizeof(struct ref_entry) + len);
+	FLEX_ALLOC_STR(ref, name, refname);
 	hashcpy(ref->u.value.oid.hash, sha1);
 	oidclr(&ref->u.value.peeled);
-	memcpy(ref->name, refname, len);
 	ref->flag = flag;
 	return ref;
 }
@@ -268,9 +265,7 @@ static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
 					  int incomplete)
 {
 	struct ref_entry *direntry;
-	direntry = xcalloc(1, sizeof(struct ref_entry) + len + 1);
-	memcpy(direntry->name, dirname, len);
-	direntry->name[len] = '\0';
+	FLEX_ALLOC_MEM(direntry, name, dirname, len);
 	direntry->u.subdir.ref_cache = ref_cache;
 	direntry->flag = REF_DIR | (incomplete ? REF_INCOMPLETE : 0);
 	return direntry;
@@ -939,13 +934,10 @@ static void clear_loose_ref_cache(struct ref_cache *refs)
  */
 static struct ref_cache *create_ref_cache(const char *submodule)
 {
-	int len;
 	struct ref_cache *refs;
 	if (!submodule)
 		submodule = "";
-	len = strlen(submodule) + 1;
-	refs = xcalloc(1, sizeof(struct ref_cache) + len);
-	memcpy(refs->name, submodule, len);
+	FLEX_ALLOC_STR(refs, name, submodule);
 	refs->next = submodule_ref_caches;
 	submodule_ref_caches = refs;
 	return refs;
@@ -2197,10 +2189,9 @@ static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data)
 
 	/* Schedule the loose reference for pruning if requested. */
 	if ((cb->flags & PACK_REFS_PRUNE)) {
-		int namelen = strlen(entry->name) + 1;
-		struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen);
+		struct ref_to_prune *n;
+		FLEX_ALLOC_STR(n, name, entry->name);
 		hashcpy(n->sha1, entry->u.value.oid.hash);
-		memcpy(n->name, entry->name, namelen); /* includes NUL */
 		n->next = cb->ref_to_prune;
 		cb->ref_to_prune = n;
 	}
diff --git a/remote.c b/remote.c
index 02e698a..25a960f 100644
--- a/remote.c
+++ b/remote.c
@@ -2136,16 +2136,13 @@ static int one_local_ref(const char *refname, const struct object_id *oid,
 {
 	struct ref ***local_tail = cb_data;
 	struct ref *ref;
-	int len;
 
 	/* we already know it starts with refs/ to get here */
 	if (check_refname_format(refname + 5, 0))
 		return 0;
 
-	len = strlen(refname) + 1;
-	ref = xcalloc(1, sizeof(*ref) + len);
+	ref = alloc_ref(refname);
 	oidcpy(&ref->new_oid, oid);
-	memcpy(ref->name, refname, len);
 	**local_tail = ref;
 	*local_tail = &ref->next;
 	return 0;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 08/18] use st_add and st_mult for allocation size computation
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (6 preceding siblings ...)
  2016-02-15 21:52 ` [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros Jeff King
@ 2016-02-15 21:53 ` Jeff King
  2016-02-16  5:47   ` Eric Sunshine
  2016-02-15 21:53 ` [PATCH 09/18] write_untracked_extension: use FLEX_ALLOC helper Jeff King
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:53 UTC (permalink / raw)
  To: git

If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c              |  4 ++--
 builtin/apply.c        |  2 +-
 builtin/clean.c        |  2 +-
 builtin/fetch.c        |  2 +-
 builtin/index-pack.c   |  4 ++--
 builtin/merge.c        |  2 +-
 builtin/mv.c           |  4 ++--
 builtin/receive-pack.c |  2 +-
 combine-diff.c         | 14 +++++++-------
 commit.c               |  2 +-
 compat/mingw.c         |  4 ++--
 compat/qsort.c         |  2 +-
 compat/setenv.c        |  2 +-
 compat/win32/syslog.c  |  4 ++--
 diffcore-delta.c       |  6 ++++--
 diffcore-rename.c      |  2 +-
 dir.c                  |  4 ++--
 fast-import.c          |  2 +-
 refs.c                 |  2 +-
 remote.c               |  8 ++++----
 revision.c             |  2 +-
 sha1_file.c            | 20 +++++++++++---------
 sha1_name.c            |  5 ++---
 shallow.c              |  2 +-
 submodule.c            |  6 +++---
 25 files changed, 56 insertions(+), 53 deletions(-)

diff --git a/archive.c b/archive.c
index 0687afa..5d735ae 100644
--- a/archive.c
+++ b/archive.c
@@ -171,8 +171,8 @@ static void queue_directory(const unsigned char *sha1,
 		unsigned mode, int stage, struct archiver_context *c)
 {
 	struct directory *d;
-	size_t len = base->len + 1 + strlen(filename) + 1;
-	d = xmalloc(sizeof(*d) + len);
+	size_t len = st_add4(base->len, 1, strlen(filename), 1);
+	d = xmalloc(st_add(sizeof(*d), len));
 	d->up	   = c->bottom;
 	d->baselen = base->len;
 	d->mode	   = mode;
diff --git a/builtin/apply.c b/builtin/apply.c
index d61ac65..42c610e 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -2632,7 +2632,7 @@ static void update_image(struct image *img,
 	insert_count = postimage->len;
 
 	/* Adjust the contents */
-	result = xmalloc(img->len + insert_count - remove_count + 1);
+	result = xmalloc(st_add3(st_sub(img->len, remove_count), insert_count, 1));
 	memcpy(result, img->buf, applied_at);
 	memcpy(result + applied_at, postimage->buf, postimage->len);
 	memcpy(result + applied_at + postimage->len,
diff --git a/builtin/clean.c b/builtin/clean.c
index 8229f7e..0371010 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -615,7 +615,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 				nr += chosen[i];
 		}
 
-		result = xcalloc(nr + 1, sizeof(int));
+		result = xcalloc(st_add(nr, 1), sizeof(int));
 		for (i = 0; i < stuff->nr && j < nr; i++) {
 			if (chosen[i])
 				result[j++] = i;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8e74213..373a89d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1110,7 +1110,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv)
 	if (argc > 0) {
 		int j = 0;
 		int i;
-		refs = xcalloc(argc + 1, sizeof(const char *));
+		refs = xcalloc(st_add(argc, 1), sizeof(const char *));
 		for (i = 0; i < argc; i++) {
 			if (!strcmp(argv[i], "tag")) {
 				i++;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a60bcfa..193908a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1744,9 +1744,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
-	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+	objects = xcalloc(st_add(nr_objects, 1), sizeof(struct object_entry));
 	if (show_stat)
-		obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
+		obj_stat = xcalloc(st_add(nr_objects, 1), sizeof(struct object_stat));
 	ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
 	parse_pack_objects(pack_sha1);
 	resolve_deltas();
diff --git a/builtin/merge.c b/builtin/merge.c
index b98a348..101ffef 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -939,7 +939,7 @@ static int setup_with_upstream(const char ***argv)
 	if (!branch->merge_nr)
 		die(_("No default upstream defined for the current branch."));
 
-	args = xcalloc(branch->merge_nr + 1, sizeof(char *));
+	args = xcalloc(st_add(branch->merge_nr, 1), sizeof(char *));
 	for (i = 0; i < branch->merge_nr; i++) {
 		if (!branch->merge[i]->dst)
 			die(_("No remote-tracking branch for %s from %s"),
diff --git a/builtin/mv.c b/builtin/mv.c
index 9a9813a..aeae855 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -48,9 +48,9 @@ static const char **internal_copy_pathspec(const char *prefix,
 
 static const char *add_slash(const char *path)
 {
-	int len = strlen(path);
+	size_t len = strlen(path);
 	if (path[len - 1] != '/') {
-		char *with_slash = xmalloc(len + 2);
+		char *with_slash = xmalloc(st_add(len, 2));
 		memcpy(with_slash, path, len);
 		with_slash[len++] = '/';
 		with_slash[len] = 0;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index ad2fdaa..9b00515 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1378,7 +1378,7 @@ static struct command **queue_command(struct command **tail,
 
 	refname = line + 82;
 	reflen = linelen - 82;
-	cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
+	cmd = xcalloc(1, st_add3(sizeof(struct command), reflen, 1));
 	hashcpy(cmd->old_sha1, old_sha1);
 	hashcpy(cmd->new_sha1, new_sha1);
 	memcpy(cmd->ref_name, refname, reflen);
diff --git a/combine-diff.c b/combine-diff.c
index be09a2b..0e1d4b0 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -189,11 +189,11 @@ static struct lline *coalesce_lines(struct lline *base, int *lenbase,
 	 *   - Else if we have NEW, insert newend lline into base and
 	 *   consume newend
 	 */
-	lcs = xcalloc(origbaselen + 1, sizeof(int*));
-	directions = xcalloc(origbaselen + 1, sizeof(enum coalesce_direction*));
+	lcs = xcalloc(st_add(origbaselen, 1), sizeof(int*));
+	directions = xcalloc(st_add(origbaselen, 1), sizeof(enum coalesce_direction*));
 	for (i = 0; i < origbaselen + 1; i++) {
-		lcs[i] = xcalloc(lennew + 1, sizeof(int));
-		directions[i] = xcalloc(lennew + 1, sizeof(enum coalesce_direction));
+		lcs[i] = xcalloc(st_add(lennew, 1), sizeof(int));
+		directions[i] = xcalloc(st_add(lennew, 1), sizeof(enum coalesce_direction));
 		directions[i][0] = BASE;
 	}
 	for (j = 1; j < lennew + 1; j++)
@@ -1111,7 +1111,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 	if (result_size && result[result_size-1] != '\n')
 		cnt++; /* incomplete line */
 
-	sline = xcalloc(cnt+2, sizeof(*sline));
+	sline = xcalloc(st_add(cnt, 2), sizeof(*sline));
 	sline[0].bol = result;
 	for (lno = 0, cp = result; cp < result + result_size; cp++) {
 		if (*cp == '\n') {
@@ -1130,7 +1130,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 	/* Even p_lno[cnt+1] is valid -- that is for the end line number
 	 * for deletion hunk at the end.
 	 */
-	sline[0].p_lno = xcalloc((cnt+2) * num_parent, sizeof(unsigned long));
+	sline[0].p_lno = xcalloc(st_mult(st_add(cnt, 2), num_parent), sizeof(unsigned long));
 	for (lno = 0; lno <= cnt; lno++)
 		sline[lno+1].p_lno = sline[lno].p_lno + num_parent;
 
@@ -1262,7 +1262,7 @@ static struct diff_filepair *combined_pair(struct combine_diff_path *p,
 	struct diff_filespec *pool;
 
 	pair = xmalloc(sizeof(*pair));
-	pool = xcalloc(num_parent + 1, sizeof(struct diff_filespec));
+	pool = xcalloc(st_add(num_parent, 1), sizeof(struct diff_filespec));
 	pair->one = pool + 1;
 	pair->two = pool;
 
diff --git a/commit.c b/commit.c
index 31cd91f..3f4f371 100644
--- a/commit.c
+++ b/commit.c
@@ -147,7 +147,7 @@ struct commit_graft *read_graft_line(char *buf, int len)
 	if ((len + 1) % entry_size)
 		goto bad_graft_data;
 	i = (len + 1) / entry_size - 1;
-	graft = xmalloc(sizeof(*graft) + GIT_SHA1_RAWSZ * i);
+	graft = xmalloc(st_add(sizeof(*graft), st_mult(GIT_SHA1_RAWSZ, i)));
 	graft->nr_parent = i;
 	if (get_oid_hex(buf, &graft->oid))
 		goto bad_graft_data;
diff --git a/compat/mingw.c b/compat/mingw.c
index 0eabe68..60f431b 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -771,7 +771,7 @@ static const char *quote_arg(const char *arg)
 		return arg;
 
 	/* insert \ where necessary */
-	d = q = xmalloc(len+n+3);
+	d = q = xmalloc(st_add3(len, n, 3));
 	*d++ = '"';
 	while (*arg) {
 		if (*arg == '"')
@@ -1030,7 +1030,7 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
 			free(quoted);
 	}
 
-	wargs = xmalloc((2 * args.len + 1) * sizeof(wchar_t));
+	wargs = xmalloc_array(st_add(st_mult(2, args.len), 1), sizeof(wchar_t));
 	xutftowcs(wargs, args.buf, 2 * args.len + 1);
 	strbuf_release(&args);
 
diff --git a/compat/qsort.c b/compat/qsort.c
index 9574d53..7d071af 100644
--- a/compat/qsort.c
+++ b/compat/qsort.c
@@ -47,7 +47,7 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
 void git_qsort(void *b, size_t n, size_t s,
 	       int (*cmp)(const void *, const void *))
 {
-	const size_t size = n * s;
+	const size_t size = st_mult(n, s);
 	char buf[1024];
 
 	if (size < sizeof(buf)) {
diff --git a/compat/setenv.c b/compat/setenv.c
index fc1439a..7849f25 100644
--- a/compat/setenv.c
+++ b/compat/setenv.c
@@ -18,7 +18,7 @@ int gitsetenv(const char *name, const char *value, int replace)
 
 	namelen = strlen(name);
 	valuelen = strlen(value);
-	envstr = malloc((namelen + valuelen + 2));
+	envstr = malloc(st_add3(namelen, valuelen, 2));
 	if (!envstr) {
 		errno = ENOMEM;
 		return -1;
diff --git a/compat/win32/syslog.c b/compat/win32/syslog.c
index d015e43..b905aea 100644
--- a/compat/win32/syslog.c
+++ b/compat/win32/syslog.c
@@ -32,7 +32,7 @@ void syslog(int priority, const char *fmt, ...)
 		return;
 	}
 
-	str = malloc(str_len + 1);
+	str = malloc(st_add(str_len, 1));
 	if (!str) {
 		warning("malloc failed: '%s'", strerror(errno));
 		return;
@@ -43,7 +43,7 @@ void syslog(int priority, const char *fmt, ...)
 	va_end(ap);
 
 	while ((pos = strstr(str, "%1")) != NULL) {
-		str = realloc(str, ++str_len + 1);
+		str = realloc(str, st_add(++str_len, 1));
 		if (!str) {
 			warning("realloc failed: '%s'", strerror(errno));
 			return;
diff --git a/diffcore-delta.c b/diffcore-delta.c
index 7cf431d..4159748 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -53,7 +53,8 @@ static struct spanhash_top *spanhash_rehash(struct spanhash_top *orig)
 	int osz = 1 << orig->alloc_log2;
 	int sz = osz << 1;
 
-	new = xmalloc(sizeof(*orig) + sizeof(struct spanhash) * sz);
+	new = xmalloc(st_add(sizeof(*orig),
+			     st_mult(sizeof(struct spanhash), sz)));
 	new->alloc_log2 = orig->alloc_log2 + 1;
 	new->free = INITIAL_FREE(new->alloc_log2);
 	memset(new->data, 0, sizeof(struct spanhash) * sz);
@@ -130,7 +131,8 @@ static struct spanhash_top *hash_chars(struct diff_filespec *one)
 	int is_text = !diff_filespec_is_binary(one);
 
 	i = INITIAL_HASH_SIZE;
-	hash = xmalloc(sizeof(*hash) + sizeof(struct spanhash) * (1<<i));
+	hash = xmalloc(st_add(sizeof(*hash),
+			      st_mult(sizeof(struct spanhash), 1<<i)));
 	hash->alloc_log2 = i;
 	hash->free = INITIAL_FREE(i);
 	memset(hash->data, 0, sizeof(struct spanhash) * (1<<i));
diff --git a/diffcore-rename.c b/diffcore-rename.c
index af1fe08..3b3c1ed 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -537,7 +537,7 @@ void diffcore_rename(struct diff_options *options)
 				rename_dst_nr * rename_src_nr, 50, 1);
 	}
 
-	mx = xcalloc(num_create * NUM_CANDIDATE_PER_DST, sizeof(*mx));
+	mx = xcalloc(st_mult(num_create, NUM_CANDIDATE_PER_DST), sizeof(*mx));
 	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
 		struct diff_filespec *two = rename_dst[i].two;
 		struct diff_score *m;
diff --git a/dir.c b/dir.c
index f06ebb7..2c91541 100644
--- a/dir.c
+++ b/dir.c
@@ -689,7 +689,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 			return 0;
 		}
 		if (buf[size-1] != '\n') {
-			buf = xrealloc(buf, size+1);
+			buf = xrealloc(buf, st_add(size, 1));
 			buf[size++] = '\n';
 		}
 	} else {
@@ -2488,7 +2488,7 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	next = data + len + 1;
 	if (next > rd->end)
 		return -1;
-	*untracked_ = untracked = xmalloc(sizeof(*untracked) + len);
+	*untracked_ = untracked = xmalloc(st_add(sizeof(*untracked), len));
 	memcpy(untracked, &ud, sizeof(ud));
 	memcpy(untracked->name, data, len + 1);
 	data = next;
diff --git a/fast-import.c b/fast-import.c
index a6467cb..3053bb8 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -622,7 +622,7 @@ static void *pool_alloc(size_t len)
 			return xmalloc(len);
 		}
 		total_allocd += sizeof(struct mem_pool) + mem_pool_alloc;
-		p = xmalloc(sizeof(struct mem_pool) + mem_pool_alloc);
+		p = xmalloc(st_add(sizeof(struct mem_pool), mem_pool_alloc));
 		p->next_pool = mem_pool;
 		p->next_free = (char *) p->space;
 		p->end = p->next_free + mem_pool_alloc;
diff --git a/refs.c b/refs.c
index 2d86445..b0e6ece 100644
--- a/refs.c
+++ b/refs.c
@@ -906,7 +906,7 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 			/* -2 for strlen("%.*s") - strlen("%s"); +1 for NUL */
 			total_len += strlen(ref_rev_parse_rules[nr_rules]) - 2 + 1;
 
-		scanf_fmts = xmalloc(nr_rules * sizeof(char *) + total_len);
+		scanf_fmts = xmalloc(st_add(st_mult(nr_rules, sizeof(char *)), total_len));
 
 		offset = 0;
 		for (i = 0; i < nr_rules; i++) {
diff --git a/remote.c b/remote.c
index 25a960f..21e4ec3 100644
--- a/remote.c
+++ b/remote.c
@@ -931,7 +931,7 @@ static struct ref *alloc_ref_with_prefix(const char *prefix, size_t prefixlen,
 		const char *name)
 {
 	size_t len = strlen(name);
-	struct ref *ref = xcalloc(1, sizeof(struct ref) + prefixlen + len + 1);
+	struct ref *ref = xcalloc(1, st_add4(sizeof(*ref), prefixlen, len, 1));
 	memcpy(ref->name, prefix, prefixlen);
 	memcpy(ref->name + prefixlen, name, len);
 	return ref;
@@ -948,9 +948,9 @@ struct ref *copy_ref(const struct ref *ref)
 	size_t len;
 	if (!ref)
 		return NULL;
-	len = strlen(ref->name);
-	cpy = xmalloc(sizeof(struct ref) + len + 1);
-	memcpy(cpy, ref, sizeof(struct ref) + len + 1);
+	len = st_add3(sizeof(struct ref), strlen(ref->name), 1);
+	cpy = xmalloc(len);
+	memcpy(cpy, ref, len);
 	cpy->next = NULL;
 	cpy->symref = xstrdup_or_null(ref->symref);
 	cpy->remote_status = xstrdup_or_null(ref->remote_status);
diff --git a/revision.c b/revision.c
index f24ead5..d4ace60 100644
--- a/revision.c
+++ b/revision.c
@@ -540,7 +540,7 @@ struct treesame_state {
 static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit)
 {
 	unsigned n = commit_list_count(commit->parents);
-	struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+	struct treesame_state *st = xcalloc(1, st_add(sizeof(*st), n));
 	st->nparents = n;
 	add_decoration(&revs->treesame, &commit->object, st);
 	return st;
diff --git a/sha1_file.c b/sha1_file.c
index 2f1c6d3..0251700 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -253,7 +253,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 {
 	struct alternate_object_database *ent;
 	struct alternate_object_database *alt;
-	int pfxlen, entlen;
+	size_t pfxlen, entlen;
 	struct strbuf pathbuf = STRBUF_INIT;
 
 	if (!is_absolute_path(entry) && relative_base) {
@@ -273,8 +273,8 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 	while (pfxlen && pathbuf.buf[pfxlen-1] == '/')
 		pfxlen -= 1;
 
-	entlen = pfxlen + 43; /* '/' + 2 hex + '/' + 38 hex + NUL */
-	ent = xmalloc(sizeof(*ent) + entlen);
+	entlen = st_add(pfxlen, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */
+	ent = xmalloc(st_add(sizeof(*ent), entlen));
 	memcpy(ent->base, pathbuf.buf, pfxlen);
 	strbuf_release(&pathbuf);
 
@@ -1134,7 +1134,7 @@ unsigned char *use_pack(struct packed_git *p,
 
 static struct packed_git *alloc_packed_git(int extra)
 {
-	struct packed_git *p = xmalloc(sizeof(*p) + extra);
+	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
 	memset(p, 0, sizeof(*p));
 	p->pack_fd = -1;
 	return p;
@@ -1168,7 +1168,7 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	 * ".pack" is long enough to hold any suffix we're adding (and
 	 * the use xsnprintf double-checks that)
 	 */
-	alloc = path_len + strlen(".pack") + 1;
+	alloc = st_add3(path_len, strlen(".pack"), 1);
 	p = alloc_packed_git(alloc);
 	memcpy(p->pack_name, path, path_len);
 
@@ -1196,7 +1196,7 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 {
 	const char *path = sha1_pack_name(sha1);
-	int alloc = strlen(path) + 1;
+	size_t alloc = st_add(strlen(path), 1);
 	struct packed_git *p = alloc_packed_git(alloc);
 
 	memcpy(p->pack_name, path, alloc); /* includes NUL */
@@ -1413,10 +1413,12 @@ static void mark_bad_packed_object(struct packed_git *p,
 {
 	unsigned i;
 	for (i = 0; i < p->num_bad_objects; i++)
-		if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
 			return;
-	p->bad_object_sha1 = xrealloc(p->bad_object_sha1, 20 * (p->num_bad_objects + 1));
-	hashcpy(p->bad_object_sha1 + 20 * p->num_bad_objects, sha1);
+	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
+				      st_mult(GIT_SHA1_RAWSZ,
+					      st_add(p->num_bad_objects, 1)));
+	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
 	p->num_bad_objects++;
 }
 
diff --git a/sha1_name.c b/sha1_name.c
index 89918ca..d14346c 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -87,9 +87,8 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa
 		 * object databases including our own.
 		 */
 		const char *objdir = get_object_directory();
-		int objdir_len = strlen(objdir);
-		int entlen = objdir_len + 43;
-		fakeent = xmalloc(sizeof(*fakeent) + entlen);
+		size_t objdir_len = strlen(objdir);
+		fakeent = xmalloc(st_add3(sizeof(*fakeent), objdir_len, 43));
 		memcpy(fakeent->base, objdir, objdir_len);
 		fakeent->name = fakeent->base + objdir_len + 1;
 		fakeent->name[-1] = '/';
diff --git a/shallow.c b/shallow.c
index 71163bf..4d554ca 100644
--- a/shallow.c
+++ b/shallow.c
@@ -389,7 +389,7 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 	unsigned int i, nr;
 	struct commit_list *head = NULL;
 	int bitmap_nr = (info->nr_bits + 31) / 32;
-	int bitmap_size = bitmap_nr * sizeof(uint32_t);
+	size_t bitmap_size = st_mult(bitmap_nr, sizeof(uint32_t));
 	uint32_t *tmp = xmalloc(bitmap_size); /* to be freed before return */
 	uint32_t *bitmap = paint_alloc(info);
 	struct commit *c = lookup_commit_reference_gently(sha1, 1);
diff --git a/submodule.c b/submodule.c
index b83939c..ac61c65 100644
--- a/submodule.c
+++ b/submodule.c
@@ -123,7 +123,7 @@ static int add_submodule_odb(const char *path)
 	struct strbuf objects_directory = STRBUF_INIT;
 	struct alternate_object_database *alt_odb;
 	int ret = 0;
-	int alloc;
+	size_t alloc;
 
 	strbuf_git_path_submodule(&objects_directory, path, "objects/");
 	if (!is_directory(objects_directory.buf)) {
@@ -138,8 +138,8 @@ static int add_submodule_odb(const char *path)
 					objects_directory.len))
 			goto done;
 
-	alloc = objects_directory.len + 42; /* for "12/345..." sha1 */
-	alt_odb = xmalloc(sizeof(*alt_odb) + alloc);
+	alloc = st_add(objects_directory.len, 42); /* for "12/345..." sha1 */
+	alt_odb = xmalloc(st_add(sizeof(*alt_odb), alloc));
 	alt_odb->next = alt_odb_list;
 	xsnprintf(alt_odb->base, alloc, "%s", objects_directory.buf);
 	alt_odb->name = alt_odb->base + objects_directory.len;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 09/18] write_untracked_extension: use FLEX_ALLOC helper
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (7 preceding siblings ...)
  2016-02-15 21:53 ` [PATCH 08/18] use st_add and st_mult for allocation size computation Jeff King
@ 2016-02-15 21:53 ` Jeff King
  2016-02-15 21:54 ` [PATCH 10/18] fast-import: simplify allocation in start_packfile Jeff King
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:53 UTC (permalink / raw)
  To: git

We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.

Note that there's one subtlety here, though.  We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.

As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.

Signed-off-by: Jeff King <peff@peff.net>
---
 dir.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 2c91541..a4a9d9f 100644
--- a/dir.c
+++ b/dir.c
@@ -2360,16 +2360,15 @@ void write_untracked_extension(struct strbuf *out, struct untracked_cache *untra
 	struct ondisk_untracked_cache *ouc;
 	struct write_data wd;
 	unsigned char varbuf[16];
-	int len = 0, varint_len;
-	if (untracked->exclude_per_dir)
-		len = strlen(untracked->exclude_per_dir);
-	ouc = xmalloc(sizeof(*ouc) + len + 1);
+	int varint_len;
+	size_t len = strlen(untracked->exclude_per_dir);
+
+	FLEX_ALLOC_MEM(ouc, exclude_per_dir, untracked->exclude_per_dir, len);
 	stat_data_to_disk(&ouc->info_exclude_stat, &untracked->ss_info_exclude.stat);
 	stat_data_to_disk(&ouc->excludes_file_stat, &untracked->ss_excludes_file.stat);
 	hashcpy(ouc->info_exclude_sha1, untracked->ss_info_exclude.sha1);
 	hashcpy(ouc->excludes_file_sha1, untracked->ss_excludes_file.sha1);
 	ouc->dir_flags = htonl(untracked->dir_flags);
-	memcpy(ouc->exclude_per_dir, untracked->exclude_per_dir, len + 1);
 
 	varint_len = encode_varint(untracked->ident.len, varbuf);
 	strbuf_add(out, varbuf, varint_len);
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 10/18] fast-import: simplify allocation in start_packfile
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (8 preceding siblings ...)
  2016-02-15 21:53 ` [PATCH 09/18] write_untracked_extension: use FLEX_ALLOC helper Jeff King
@ 2016-02-15 21:54 ` Jeff King
  2016-02-15 21:54 ` [PATCH 11/18] fetch-pack: simplify add_sought_entry Jeff King
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:54 UTC (permalink / raw)
  To: git

This function allocates a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needs to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).

This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.

Signed-off-by: Jeff King <peff@peff.net>
---
 fast-import.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 3053bb8..9fc7093 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -865,15 +865,12 @@ static void start_packfile(void)
 {
 	static char tmp_file[PATH_MAX];
 	struct packed_git *p;
-	int namelen;
 	struct pack_header hdr;
 	int pack_fd;
 
 	pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
 			      "pack/tmp_pack_XXXXXX");
-	namelen = strlen(tmp_file) + 2;
-	p = xcalloc(1, sizeof(*p) + namelen);
-	xsnprintf(p->pack_name, namelen, "%s", tmp_file);
+	FLEX_ALLOC_STR(p, pack_name, tmp_file);
 	p->pack_fd = pack_fd;
 	p->do_not_close = 1;
 	pack_file = sha1fd(pack_fd, p->pack_name);
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 11/18] fetch-pack: simplify add_sought_entry
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (9 preceding siblings ...)
  2016-02-15 21:54 ` [PATCH 10/18] fast-import: simplify allocation in start_packfile Jeff King
@ 2016-02-15 21:54 ` Jeff King
  2016-02-15 21:55 ` [PATCH 12/18] test-path-utils: fix normalize_path_copy output buffer size Jeff King
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:54 UTC (permalink / raw)
  To: git

We have two variants of this function, one that takes a
string and one that takes a ptr/len combo. But we only call
the latter with the length of a NUL-terminated string, so
our first simplification is to drop it in favor of the
string variant.

Since we know we have a string, we can also replace the
manual memory computation with a call to alloc_ref().

Furthermore, we can rely on get_oid_hex() to complain if it
hits the end of the string. That means we can simplify the
check for "<sha1> <ref>" versus just "<ref>". Rather than
manage the ptr/len pair, we can just bump the start of our
string forward. The original code over-allocated based on
the original "namelen" (which wasn't _wrong_, but was simply
wasteful and confusing).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/fetch-pack.c | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9b2a514..79a611f 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -10,33 +10,24 @@ static const char fetch_pack_usage[] =
 "[--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] "
 "[--no-progress] [--diag-url] [-v] [<host>:]<directory> [<refs>...]";
 
-static void add_sought_entry_mem(struct ref ***sought, int *nr, int *alloc,
-				 const char *name, int namelen)
+static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
+			     const char *name)
 {
-	struct ref *ref = xcalloc(1, sizeof(*ref) + namelen + 1);
+	struct ref *ref;
 	struct object_id oid;
-	const int chunksz = GIT_SHA1_HEXSZ + 1;
 
-	if (namelen > chunksz && name[chunksz - 1] == ' ' &&
-		!get_oid_hex(name, &oid)) {
-		oidcpy(&ref->old_oid, &oid);
-		name += chunksz;
-		namelen -= chunksz;
-	}
+	if (!get_oid_hex(name, &oid) && name[GIT_SHA1_HEXSZ] == ' ')
+		name += GIT_SHA1_HEXSZ + 1;
+	else
+		oidclr(&oid);
 
-	memcpy(ref->name, name, namelen);
-	ref->name[namelen] = '\0';
+	ref = alloc_ref(name);
+	oidcpy(&ref->old_oid, &oid);
 	(*nr)++;
 	ALLOC_GROW(*sought, *nr, *alloc);
 	(*sought)[*nr - 1] = ref;
 }
 
-static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
-			     const char *string)
-{
-	add_sought_entry_mem(sought, nr, alloc, string, strlen(string));
-}
-
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, ret;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 12/18] test-path-utils: fix normalize_path_copy output buffer size
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (10 preceding siblings ...)
  2016-02-15 21:54 ` [PATCH 11/18] fetch-pack: simplify add_sought_entry Jeff King
@ 2016-02-15 21:55 ` Jeff King
  2016-02-15 21:56 ` [PATCH 13/18] sequencer: simplify memory allocation of get_message Jeff King
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:55 UTC (permalink / raw)
  To: git

The normalize_path_copy function needs an output buffer that
is at least as long as its input (it may shrink the path,
but never expand it). However, this test program feeds it
static PATH_MAX-sized buffers, which have no relation to the
input size.

In the normalize_ceiling_entry case, we do at least check
the size against PATH_MAX and die(), but that case is even
more convoluted. We normalize into a fixed-size buffer, free
the original, and then replace it with a strdup'd copy of
the result. But normalize_path_copy explicitly allows
normalizing in-place, so we can simply do that.

Signed-off-by: Jeff King <peff@peff.net>
---
 test-path-utils.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/test-path-utils.c b/test-path-utils.c
index c3adcd8..0c15f18 100644
--- a/test-path-utils.c
+++ b/test-path-utils.c
@@ -8,21 +8,14 @@
  */
 static int normalize_ceiling_entry(struct string_list_item *item, void *unused)
 {
-	const char *ceil = item->string;
-	int len = strlen(ceil);
-	char buf[PATH_MAX+1];
+	char *ceil = item->string;
 
-	if (len == 0)
+	if (!*ceil)
 		die("Empty path is not supported");
-	if (len > PATH_MAX)
-		die("Path \"%s\" is too long", ceil);
 	if (!is_absolute_path(ceil))
 		die("Path \"%s\" is not absolute", ceil);
-	if (normalize_path_copy(buf, ceil) < 0)
+	if (normalize_path_copy(ceil, ceil) < 0)
 		die("Path \"%s\" could not be normalized", ceil);
-	len = strlen(buf);
-	free(item->string);
-	item->string = xstrdup(buf);
 	return 1;
 }
 
@@ -166,7 +159,7 @@ static struct test_data dirname_data[] = {
 int main(int argc, char **argv)
 {
 	if (argc == 3 && !strcmp(argv[1], "normalize_path_copy")) {
-		char *buf = xmalloc(PATH_MAX + 1);
+		char *buf = xmallocz(strlen(argv[2]));
 		int rv = normalize_path_copy(buf, argv[2]);
 		if (rv)
 			buf = "++failed++";
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 13/18] sequencer: simplify memory allocation of get_message
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (11 preceding siblings ...)
  2016-02-15 21:55 ` [PATCH 12/18] test-path-utils: fix normalize_path_copy output buffer size Jeff King
@ 2016-02-15 21:56 ` Jeff King
  2016-02-16  6:05   ` Eric Sunshine
  2016-02-15 21:56 ` [PATCH 14/18] git-compat-util: drop mempcpy compat code Jeff King
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:56 UTC (permalink / raw)
  To: git

For a commit with has "1234abcd" and subject "foo", this
function produces a struct with three strings:

 1. "foo"

 2. "1234abcd... foo"

 3. "parent of 1234abcd... foo"

It takes advantage of the fact that these strings are
subsets of each other, and allocates only _one_ string, with
pointers into the various parts. Unfortunately, this makes
the string allocation complicated and hard to follow.

Since we keep only one of these in memory at a time, we can
afford to simply allocate three strings. This lets us build
on tools like xstrfmt and avoid manual computation.

While we're here, we can also drop the ad-hoc
reimplementation of get_git_commit_encoding(), and simply
call that function.

Signed-off-by: Jeff King <peff@peff.net>
---
 sequencer.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 8048786..e66f2fe 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -124,42 +124,33 @@ static const char *action_name(const struct replay_opts *opts)
 
 struct commit_message {
 	char *parent_label;
-	const char *label;
-	const char *subject;
+	char *label;
+	char *subject;
 	const char *message;
 };
 
 static int get_message(struct commit *commit, struct commit_message *out)
 {
 	const char *abbrev, *subject;
-	int abbrev_len, subject_len;
-	char *q;
-
-	if (!git_commit_encoding)
-		git_commit_encoding = "UTF-8";
+	int subject_len;
 
-	out->message = logmsg_reencode(commit, NULL, git_commit_encoding);
+	out->message = logmsg_reencode(commit, NULL, get_commit_output_encoding());
 	abbrev = find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV);
-	abbrev_len = strlen(abbrev);
 
 	subject_len = find_commit_subject(out->message, &subject);
 
-	out->parent_label = xmalloc(strlen("parent of ") + abbrev_len +
-			      strlen("... ") + subject_len + 1);
-	q = out->parent_label;
-	q = mempcpy(q, "parent of ", strlen("parent of "));
-	out->label = q;
-	q = mempcpy(q, abbrev, abbrev_len);
-	q = mempcpy(q, "... ", strlen("... "));
-	out->subject = q;
-	q = mempcpy(q, subject, subject_len);
-	*q = '\0';
+	out->subject = xmemdupz(subject, subject_len);
+	out->label = xstrfmt("%s... %s", abbrev, out->subject);
+	out->parent_label = xstrfmt("parent of %s", out->label);
+
 	return 0;
 }
 
 static void free_message(struct commit *commit, struct commit_message *msg)
 {
 	free(msg->parent_label);
+	free(msg->label);
+	free(msg->subject);
 	unuse_commit_buffer(commit, msg->message);
 }
 
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 14/18] git-compat-util: drop mempcpy compat code
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (12 preceding siblings ...)
  2016-02-15 21:56 ` [PATCH 13/18] sequencer: simplify memory allocation of get_message Jeff King
@ 2016-02-15 21:56 ` Jeff King
  2016-02-16  6:05   ` Eric Sunshine
  2016-02-15 21:56 ` [PATCH 15/18] transport_anonymize_url: use xstrfmt Jeff King
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:56 UTC (permalink / raw)
  To: git

There are no callers of this left, as the last one was
dropped in the previous patch. And there are no likely to be
new ones, as the function has been around since 2010 without
gaining any new callers.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 8f23801..6892e72 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -681,7 +681,6 @@ extern int git_vsnprintf(char *str, size_t maxsize,
 #ifdef __GLIBC_PREREQ
 #if __GLIBC_PREREQ(2, 1)
 #define HAVE_STRCHRNUL
-#define HAVE_MEMPCPY
 #endif
 #endif
 
@@ -695,14 +694,6 @@ static inline char *gitstrchrnul(const char *s, int c)
 }
 #endif
 
-#ifndef HAVE_MEMPCPY
-#define mempcpy gitmempcpy
-static inline void *gitmempcpy(void *dest, const void *src, size_t n)
-{
-	return (char *)memcpy(dest, src, n) + n;
-}
-#endif
-
 #ifdef NO_INET_PTON
 int inet_pton(int af, const char *src, void *dst);
 #endif
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 15/18] transport_anonymize_url: use xstrfmt
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (13 preceding siblings ...)
  2016-02-15 21:56 ` [PATCH 14/18] git-compat-util: drop mempcpy compat code Jeff King
@ 2016-02-15 21:56 ` Jeff King
  2016-02-15 21:56 ` [PATCH 16/18] diff_populate_gitlink: use a strbuf Jeff King
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:56 UTC (permalink / raw)
  To: git

This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/transport.c b/transport.c
index 5c63295..d53e4aa 100644
--- a/transport.c
+++ b/transport.c
@@ -1351,7 +1351,7 @@ int transport_disconnect(struct transport *transport)
  */
 char *transport_anonymize_url(const char *url)
 {
-	char *anon_url, *scheme_prefix, *anon_part;
+	char *scheme_prefix, *anon_part;
 	size_t anon_len, prefix_len = 0;
 
 	anon_part = strchr(url, '@');
@@ -1385,10 +1385,8 @@ char *transport_anonymize_url(const char *url)
 			goto literal_copy;
 		prefix_len = scheme_prefix - url + 3;
 	}
-	anon_url = xcalloc(1, 1 + prefix_len + anon_len);
-	memcpy(anon_url, url, prefix_len);
-	memcpy(anon_url + prefix_len, anon_part, anon_len);
-	return anon_url;
+	return xstrfmt("%.*s%.*s", (int)prefix_len, url,
+		       (int)anon_len, anon_part);
 literal_copy:
 	return xstrdup(url);
 }
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 16/18] diff_populate_gitlink: use a strbuf
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (14 preceding siblings ...)
  2016-02-15 21:56 ` [PATCH 15/18] transport_anonymize_url: use xstrfmt Jeff King
@ 2016-02-15 21:56 ` Jeff King
  2016-02-15 21:57 ` [PATCH 17/18] convert ewah/bitmap code to use xmalloc Jeff King
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:56 UTC (permalink / raw)
  To: git

We allocate 100 bytes to hold the "Submodule commit ..."
text. This is enough, but it's not immediately obvious that
this is the case, and we have to repeat the magic 100 twice.

We could get away with xstrfmt here, but we want to know the
size, as well, so let's use a real strbuf. And while we're
here, we can clean up the logic around size_only. It
currently sets and clears the "data" field pointlessly, and
leaves the "should_free" flag on even after we have cleared
the data.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 27d14a7..a70ec6e 100644
--- a/diff.c
+++ b/diff.c
@@ -2704,21 +2704,21 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
 
 static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 {
-	int len;
-	char *data = xmalloc(100), *dirty = "";
+	struct strbuf buf = STRBUF_INIT;
+	char *dirty = "";
 
 	/* Are we looking at the work tree? */
 	if (s->dirty_submodule)
 		dirty = "-dirty";
 
-	len = snprintf(data, 100,
-		       "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
-	s->data = data;
-	s->size = len;
-	s->should_free = 1;
+	strbuf_addf(&buf, "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
+	s->size = buf.len;
 	if (size_only) {
 		s->data = NULL;
-		free(data);
+		strbuf_release(&buf);
+	} else {
+		s->data = strbuf_detach(&buf, NULL);
+		s->should_free = 1;
 	}
 	return 0;
 }
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 17/18] convert ewah/bitmap code to use xmalloc
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (15 preceding siblings ...)
  2016-02-15 21:56 ` [PATCH 16/18] diff_populate_gitlink: use a strbuf Jeff King
@ 2016-02-15 21:57 ` Jeff King
  2016-02-15 21:57 ` [PATCH 18/18] ewah: convert to REALLOC_ARRAY, etc Jeff King
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:57 UTC (permalink / raw)
  To: git

This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.

We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.

Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.

Signed-off-by: Jeff King <peff@peff.net>
---
 ewah/bitmap.c      | 12 ++++++------
 ewah/ewah_bitmap.c |  9 +++------
 ewah/ewah_io.c     | 10 ++--------
 ewah/ewok.h        | 10 ----------
 4 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 47ad674..c88daa0 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -25,8 +25,8 @@
 
 struct bitmap *bitmap_new(void)
 {
-	struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
-	bitmap->words = ewah_calloc(32, sizeof(eword_t));
+	struct bitmap *bitmap = xmalloc(sizeof(struct bitmap));
+	bitmap->words = xcalloc(32, sizeof(eword_t));
 	bitmap->word_alloc = 32;
 	return bitmap;
 }
@@ -38,8 +38,8 @@ void bitmap_set(struct bitmap *self, size_t pos)
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
 		self->word_alloc = block * 2;
-		self->words = ewah_realloc(self->words,
-			self->word_alloc * sizeof(eword_t));
+		self->words = xrealloc(self->words,
+				      self->word_alloc * sizeof(eword_t));
 
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
@@ -102,7 +102,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
 	while (ewah_iterator_next(&blowup, &it)) {
 		if (i >= bitmap->word_alloc) {
 			bitmap->word_alloc *= 1.5;
-			bitmap->words = ewah_realloc(
+			bitmap->words = xrealloc(
 				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
 		}
 
@@ -134,7 +134,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
-		self->words = ewah_realloc(self->words,
+		self->words = xrealloc(self->words,
 			self->word_alloc * sizeof(eword_t));
 		memset(self->words + original_size, 0x0,
 			(self->word_alloc - original_size) * sizeof(eword_t));
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index b522437..fcd465e 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,7 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 		return;
 
 	self->alloc_size = new_size;
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
@@ -282,11 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 {
 	struct ewah_bitmap *self;
 
-	self = ewah_malloc(sizeof(struct ewah_bitmap));
-	if (self == NULL)
-		return NULL;
-
-	self->buffer = ewah_malloc(32 * sizeof(eword_t));
+	self = xmalloc(sizeof(struct ewah_bitmap));
+	self->buffer = xmalloc(32 * sizeof(eword_t));
 	self->alloc_size = 32;
 
 	ewah_clear(self);
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 43481b9..4acff97 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,12 +134,9 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 	self->buffer_size = self->alloc_size = get_be32(ptr);
 	ptr += sizeof(uint32_t);
 
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 
-	if (!self->buffer)
-		return -1;
-
 	/*
 	 * Copy the raw data for the bitmap as a whole chunk;
 	 * if we're in a little-endian platform, we'll perform
@@ -180,12 +177,9 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
 		return -1;
 
 	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 
-	if (!self->buffer)
-		return -1;
-
 	/** 64 bit x N -- compressed words */
 	buffer = self->buffer;
 	words_left = self->buffer_size;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 6e2c5e1..269a1a8 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -20,16 +20,6 @@
 #ifndef __EWOK_BITMAP_H__
 #define __EWOK_BITMAP_H__
 
-#ifndef ewah_malloc
-#	define ewah_malloc xmalloc
-#endif
-#ifndef ewah_realloc
-#	define ewah_realloc xrealloc
-#endif
-#ifndef ewah_calloc
-#	define ewah_calloc xcalloc
-#endif
-
 struct strbuf;
 typedef uint64_t eword_t;
 #define BITS_IN_EWORD (sizeof(eword_t) * 8)
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 18/18] ewah: convert to REALLOC_ARRAY, etc
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (16 preceding siblings ...)
  2016-02-15 21:57 ` [PATCH 17/18] convert ewah/bitmap code to use xmalloc Jeff King
@ 2016-02-15 21:57 ` Jeff King
  2016-02-15 22:02 ` [PATCH 0/18] hardening allocations against integer overflow Jeff King
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 21:57 UTC (permalink / raw)
  To: git

Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.

Signed-off-by: Jeff King <peff@peff.net>
---
 ewah/bitmap.c      | 16 ++++------------
 ewah/ewah_bitmap.c |  5 ++---
 ewah/ewah_io.c     |  6 ++----
 3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index c88daa0..7103cee 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -17,7 +17,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
-#include "git-compat-util.h"
+#include "cache.h"
 #include "ewok.h"
 
 #define EWAH_MASK(x) ((eword_t)1 << (x % BITS_IN_EWORD))
@@ -38,9 +38,7 @@ void bitmap_set(struct bitmap *self, size_t pos)
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
 		self->word_alloc = block * 2;
-		self->words = xrealloc(self->words,
-				      self->word_alloc * sizeof(eword_t));
-
+		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
 	}
@@ -100,12 +98,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
 	ewah_iterator_init(&it, ewah);
 
 	while (ewah_iterator_next(&blowup, &it)) {
-		if (i >= bitmap->word_alloc) {
-			bitmap->word_alloc *= 1.5;
-			bitmap->words = xrealloc(
-				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
-		}
-
+		ALLOC_GROW(bitmap->words, i + 1, bitmap->word_alloc);
 		bitmap->words[i++] = blowup;
 	}
 
@@ -134,8 +127,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
-		self->words = xrealloc(self->words,
-			self->word_alloc * sizeof(eword_t));
+		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + original_size, 0x0,
 			(self->word_alloc - original_size) * sizeof(eword_t));
 	}
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index fcd465e..2dc9c82 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,8 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 		return;
 
 	self->alloc_size = new_size;
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
 
@@ -283,8 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 	struct ewah_bitmap *self;
 
 	self = xmalloc(sizeof(struct ewah_bitmap));
-	self->buffer = xmalloc(32 * sizeof(eword_t));
 	self->alloc_size = 32;
+	ALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	ewah_clear(self);
 	return self;
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 4acff97..61f6a43 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,8 +134,7 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 	self->buffer_size = self->alloc_size = get_be32(ptr);
 	ptr += sizeof(uint32_t);
 
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	/*
 	 * Copy the raw data for the bitmap as a whole chunk;
@@ -177,8 +176,7 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
 		return -1;
 
 	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	/** 64 bit x N -- compressed words */
 	buffer = self->buffer;
-- 
2.7.1.572.gf718037

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 0/18] hardening allocations against integer overflow
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (17 preceding siblings ...)
  2016-02-15 21:57 ` [PATCH 18/18] ewah: convert to REALLOC_ARRAY, etc Jeff King
@ 2016-02-15 22:02 ` Jeff King
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
  19 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-15 22:02 UTC (permalink / raw)
  To: git

On Mon, Feb 15, 2016 at 04:45:16PM -0500, Jeff King wrote:

> The only bug I have actually confirmed in practice here is fixed by
> patch 2 (which is why it's at the front). There's another one in
> path_name(), but that function is already dropped by the nearby
> jk/lose-name-path topic.
> 
> The rest are cleanups of spots which _might_ be buggy, but I didn't dig
> too far to find out. As with the earlier audit, I tried to refactor
> using helpers that make the code clearer and less error-prone. So maybe
> they're fixing bugs or not, but they certainly make it easier to audit
> the result for problems.

After this, looking for /[cm]alloc.*\+/ is pretty clean. I _didn't_ fix
any sites in xdiff, but those are already protected by dcd1742 (xdiff:
reject files larger than ~1GB, 2015-09-24).

That's not necessarily complete coverage, though, as you can always
screw up the computation outside of the xmalloc call, and pass in the
truncated version. E.g.:

  int alloc = a + b;
  char *foo = xmalloc(alloc);

However, this is only a big problem if you then copy "a" and "b"
separately. If you use "alloc" later as the limit, like:

  xsnprintf(foo, alloc, "%s%s", a, b);

that's only a minor bug (we notice the overflow and complain, rather
than smashing the heap).

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 04/18] add helpers for allocating flex-array structs
  2016-02-15 21:50 ` [PATCH 04/18] add helpers for allocating flex-array structs Jeff King
@ 2016-02-16  1:47   ` Eric Sunshine
  2016-02-16  2:52     ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  1:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 4:50 PM, Jeff King <peff@peff.net> wrote:
> Allocating a struct with a flex array is pretty simple in
> practice: you over-allocate the struct, then copy some data
> into the over-allocation. But it can be a slight pain to
> make sure you're allocating and copying the right amounts.
>
> This patch adds a few helpers to turn simple cases of into a

Grammo: "cases of into"

> one-liner that properly checks for overflow. See the
> embedded documentation for details.
> [...]
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/git-compat-util.h b/git-compat-util.h
> @@ -782,6 +782,68 @@ extern FILE *fopen_for_writing(const char *path);
> + *   struct foo *f;
> + *   FLEX_ALLOC_STR(f, name, src);
> + *
> + * and "name" will point to a block of memory after the struct, which will be
> + * freed along with the struct (but the pointer can be repoined anywhere).

"repoined"?

> + * The *_STR variants accept a string parameter rather than a ptr/len
> + * combination.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-15 21:52 ` [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros Jeff King
@ 2016-02-16  2:17   ` Eric Sunshine
  2016-02-16  3:15     ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  2:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 4:52 PM, Jeff King <peff@peff.net> wrote:
> Using FLEX_ARRAY macros reduces the amount of manual
> computation size we have to do. It also ensures we don't
> overflow size_t, and it makes sure we write the same number
> of bytes that we allocated.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> @@ -412,8 +410,7 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
>                     !memcmp(ent->pattern, pattern, len))
>                         return ent;
>
> -       ent = xcalloc(1, (sizeof(*ent) + len));
> -       memcpy(ent->pattern, pattern, len);
> +       FLEX_ALLOC_MEM(ent, pattern, pattern, len);

Does the incoming 'len' already account for the NUL terminator, or was
the original code underallocating?

Answering my own question: Looking at reflog_expire_config() and
parse_config_key(), I gather that 'len' already accounts for the NUL,
thus the new code is overallocating (which should not be a problem).

>         ent->len = len;
>         *reflog_expire_cfg_tail = ent;
>         reflog_expire_cfg_tail = &(ent->next);
> diff --git a/hashmap.c b/hashmap.c
> @@ -256,10 +256,9 @@ const void *memintern(const void *data, size_t len)
>         e = hashmap_get(&map, &key, data);
>         if (!e) {
>                 /* not found: create it */
> -               e = xmallocz(sizeof(struct pool_entry) + len);
> +               FLEX_ALLOC_MEM(e, data, data, len);

Ditto. I guess the new code is overallocating (which should be okay).

>                 hashmap_entry_init(e, key.ent.hash);
>                 e->len = len;
> -               memcpy(e->data, data, len);
>                 hashmap_add(&map, e);
>         }
>         return e->data;

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 04/18] add helpers for allocating flex-array structs
  2016-02-16  1:47   ` Eric Sunshine
@ 2016-02-16  2:52     ` Jeff King
  0 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-16  2:52 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 08:47:30PM -0500, Eric Sunshine wrote:

> > This patch adds a few helpers to turn simple cases of into a
> 
> Grammo: "cases of into"

Oops. Cases of "flex-array struct allocation into...".

> > + * and "name" will point to a block of memory after the struct, which will be
> > + * freed along with the struct (but the pointer can be repoined anywhere).
> 
> "repoined"?

Repointed.

Fixed patch below.

-- >8 --
Subject: [PATCH] add helpers for allocating flex-array structs

Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.

This patch adds a few helpers to turn simple cases of
flex-array struct allocation into a one-liner that properly
checks for overflow. See the embedded documentation for
details.

Ideally we could provide a more flexible version that could
handle multiple strings, like:

  FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);

But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 55c073d..e71e615 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -782,6 +782,68 @@ extern FILE *fopen_for_writing(const char *path);
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))
 
+/*
+ * These functions help you allocate structs with flex arrays, and copy
+ * the data directly into the array. For example, if you had:
+ *
+ *   struct foo {
+ *     int bar;
+ *     char name[FLEX_ARRAY];
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_MEM(f, name, src, len);
+ *
+ * to allocate a "foo" with the contents of "src" in the "name" field.
+ * The resulting struct is automatically zero'd, and the flex-array field
+ * is NUL-terminated (whether the incoming src buffer was or not).
+ *
+ * The FLEXPTR_* variants operate on structs that don't use flex-arrays,
+ * but do want to store a pointer to some extra data in the same allocated
+ * block. For example, if you have:
+ *
+ *   struct foo {
+ *     char *name;
+ *     int bar;
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_STR(f, name, src);
+ *
+ * and "name" will point to a block of memory after the struct, which will be
+ * freed along with the struct (but the pointer can be repointed anywhere).
+ *
+ * The *_STR variants accept a string parameter rather than a ptr/len
+ * combination.
+ *
+ * Note that these macros will evaluate the first parameter multiple
+ * times, and it must be assignable as an lvalue.
+ */
+#define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
+	(x) = NULL; /* silence -Wuninitialized for offset calculation */ \
+	(x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char *)(x), (buf), (len)); \
+} while (0)
+#define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
+	(x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
+	(x)->ptrname = (void *)((x)+1); \
+} while(0)
+#define FLEX_ALLOC_STR(x, flexname, str) \
+	FLEX_ALLOC_MEM((x), flexname, (str), strlen(str))
+#define FLEXPTR_ALLOC_STR(x, ptrname, str) \
+	FLEXPTR_ALLOC_MEM((x), ptrname, (str), strlen(str))
+
+static inline void *xalloc_flex(size_t base_len, size_t offset,
+				const void *src, size_t src_len)
+{
+	unsigned char *ret = xcalloc(1, st_add3(base_len, src_len, 1));
+	memcpy(ret + offset, src, src_len);
+	return ret;
+}
+
 static inline char *xstrdup_or_null(const char *str)
 {
 	return str ? xstrdup(str) : NULL;
-- 
2.7.1.574.gccd43a9

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-16  2:17   ` Eric Sunshine
@ 2016-02-16  3:15     ` Jeff King
  2016-02-16  3:26       ` Jeff King
  2016-02-16  4:10       ` Eric Sunshine
  0 siblings, 2 replies; 93+ messages in thread
From: Jeff King @ 2016-02-16  3:15 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 09:17:05PM -0500, Eric Sunshine wrote:

> > ---
> > diff --git a/builtin/reflog.c b/builtin/reflog.c
> > @@ -412,8 +410,7 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
> >                     !memcmp(ent->pattern, pattern, len))
> >                         return ent;
> >
> > -       ent = xcalloc(1, (sizeof(*ent) + len));
> > -       memcpy(ent->pattern, pattern, len);
> > +       FLEX_ALLOC_MEM(ent, pattern, pattern, len);
> 
> Does the incoming 'len' already account for the NUL terminator, or was
> the original code underallocating?
> 
> Answering my own question: Looking at reflog_expire_config() and
> parse_config_key(), I gather that 'len' already accounts for the NUL,
> thus the new code is overallocating (which should not be a problem).

Actually, I think the original underallocates. If we have
gc.foobar.reflogExpire, then "pattern" will poitn to "foobar" and "len"
will be 6. Meaning we allocate without a trailing NUL.

That _should_ be OK, because the struct has a "len" field, and readers
can be careful not to go past it. And indeed, in the loop above, we
check the length and use memcmp().

But later, in set_reflog_expiry_param(), we walk through the list and
hand ent->pattern directly to wildmatch, which assumes a NUL-terminated
string. In practice, it probably works out 7 out of 8 times, because
malloc will align the struct, and we're on a zeroed page, so unless the
string is exactly 8 characters, we'll get some extra NULs afterwards.
But I could demonstrate it by doing:

  gdb --args git -c gc.foobar12.reflogexpire=never reflog expire --all

and breaking on wildmatch, which yields:

  Breakpoint 1, wildmatch (pattern=0x85eb70 "foobar12Q", text=0x85e4d4
	"refs/heads/master", flags=0, wo=0x0)

So this is in fact fixing a bug. I can't say I'm terribly surprised
nobody noticed it, as per-ref reflog expiration is pretty obscure.

I hope this increases confidence in my patch series. Even though I
didn't _know_ there was a bug here, I did know that malloc computations
are a potential source of errors. And the FLEX_ALLOC helpers are
designed to remove that work and have a simple interface. We don't know
whether the caller will want a NUL afterwards or not, but we err on the
side of over-allocating by a byte (just as we over-allocate
read_sha1_file() output by a byte), because safety is better than
squeezing out a single byte.

> > diff --git a/hashmap.c b/hashmap.c
> > @@ -256,10 +256,9 @@ const void *memintern(const void *data, size_t len)
> >         e = hashmap_get(&map, &key, data);
> >         if (!e) {
> >                 /* not found: create it */
> > -               e = xmallocz(sizeof(struct pool_entry) + len);
> > +               FLEX_ALLOC_MEM(e, data, data, len);
> 
> Ditto. I guess the new code is overallocating (which should be okay).

It is, but so was the original (it used xmallocz to get an extra NUL).

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-16  3:15     ` Jeff King
@ 2016-02-16  3:26       ` Jeff King
  2016-02-16  3:36         ` Jeff King
  2016-02-16  4:10       ` Eric Sunshine
  1 sibling, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-16  3:26 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 10:15:54PM -0500, Jeff King wrote:

> > Answering my own question: Looking at reflog_expire_config() and
> > parse_config_key(), I gather that 'len' already accounts for the NUL,
> > thus the new code is overallocating (which should not be a problem).
> 
> Actually, I think the original underallocates. If we have
> gc.foobar.reflogExpire, then "pattern" will poitn to "foobar" and "len"
> will be 6. Meaning we allocate without a trailing NUL.
> 
> That _should_ be OK, because the struct has a "len" field, and readers
> can be careful not to go past it. And indeed, in the loop above, we
> check the length and use memcmp().
> 
> But later, in set_reflog_expiry_param(), we walk through the list and
> hand ent->pattern directly to wildmatch, which assumes a NUL-terminated
> string. In practice, it probably works out 7 out of 8 times, because
> malloc will align the struct, and we're on a zeroed page, so unless the
> string is exactly 8 characters, we'll get some extra NULs afterwards.
> But I could demonstrate it by doing:
> 
>   gdb --args git -c gc.foobar12.reflogexpire=never reflog expire --all
> 
> and breaking on wildmatch, which yields:
> 
>   Breakpoint 1, wildmatch (pattern=0x85eb70 "foobar12Q", text=0x85e4d4
> 	"refs/heads/master", flags=0, wo=0x0)
> 
> So this is in fact fixing a bug. I can't say I'm terribly surprised
> nobody noticed it, as per-ref reflog expiration is pretty obscure.

We could do this on top of my series (I can also factor out the fix
separately to go at the beginning if we don't want to hold the bugfix
hostage).

-- >8 --
Subject: [PATCH] reflog_expire_cfg: drop misleading "len" parameter

You can tweak the reflog expiration for a particular subset
of refs by configuring gc.foo.reflogexpire. We keep a linked
list of reflog_expire_cfg structs, each of which holds the
pattern and a "len" field for the length of the pattern.

However, we feed the pattern directly to wildmatch(), which
means that it must be a NUL-terminated string. Before the
recent conversion to FLEX_ALLOC_MEM, we got this wrong, and
could feed extra garbage to wildmatch(). That's now fixed,
but the "len" parameter is simply misleading. The pattern is
a string, and we don't need to record its length.

To get rid of it, we do need to tweak the "do we have it
already?" search in find_cfg_ent(), but we can do so without
having a recorded length by just using strncmp.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/reflog.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index 7c1990f..2d46b64 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -394,7 +394,6 @@ static struct reflog_expire_cfg {
 	struct reflog_expire_cfg *next;
 	unsigned long expire_total;
 	unsigned long expire_unreachable;
-	size_t len;
 	char pattern[FLEX_ARRAY];
 } *reflog_expire_cfg, **reflog_expire_cfg_tail;
 
@@ -406,12 +405,11 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		reflog_expire_cfg_tail = &reflog_expire_cfg;
 
 	for (ent = reflog_expire_cfg; ent; ent = ent->next)
-		if (ent->len == len &&
-		    !memcmp(ent->pattern, pattern, len))
+		if (!strncmp(ent->pattern, pattern, len) &&
+		    ent->pattern[len] == '\0')
 			return ent;
 
 	FLEX_ALLOC_MEM(ent, pattern, pattern, len);
-	ent->len = len;
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
 	return ent;
-- 
2.7.1.574.gccd43a9

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-16  3:26       ` Jeff King
@ 2016-02-16  3:36         ` Jeff King
  2016-02-16  4:18           ` Eric Sunshine
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-16  3:36 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 10:26:26PM -0500, Jeff King wrote:

> We could do this on top of my series (I can also factor out the fix
> separately to go at the beginning if we don't want to hold the bugfix
> hostage).
> 
> -- >8 --
> Subject: [PATCH] reflog_expire_cfg: drop misleading "len" parameter

Here it is as a separate fix. Applying my series on top would need a
minor and obvious tweak. I'll hold a re-roll for more comments, but will
otherwise plan to stick this at the front of the series.

-- >8 --
Subject: [PATCH] reflog_expire_cfg: NUL-terminate pattern field

You can tweak the reflog expiration for a particular subset
of refs by configuring gc.foo.reflogexpire. We keep a linked
list of reflog_expire_cfg structs, each of which holds the
pattern and a "len" field for the length of the pattern. The
pattern itself is _not_ NUL-terminated.

However, we feed the pattern directly to wildmatch(), which
expects a NUL-terminated string, meaning it may keep reading
random junk after our struct.

We can fix this by allocating an extra byte for the NUL
(which is already zero because we use xcalloc). Let's also
drop the misleading "len" field, which is no longer
necessary. The existing use of "len" can be converted to use
strncmp().

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/reflog.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index f39960e..9980731 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -396,7 +396,6 @@ static struct reflog_expire_cfg {
 	struct reflog_expire_cfg *next;
 	unsigned long expire_total;
 	unsigned long expire_unreachable;
-	size_t len;
 	char pattern[FLEX_ARRAY];
 } *reflog_expire_cfg, **reflog_expire_cfg_tail;
 
@@ -408,13 +407,12 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		reflog_expire_cfg_tail = &reflog_expire_cfg;
 
 	for (ent = reflog_expire_cfg; ent; ent = ent->next)
-		if (ent->len == len &&
-		    !memcmp(ent->pattern, pattern, len))
+		if (!strncmp(ent->pattern, pattern, len) &&
+		    ent->pattern[len] == '\0')
 			return ent;
 
-	ent = xcalloc(1, (sizeof(*ent) + len));
+	ent = xcalloc(1, sizeof(*ent) + len + 1);
 	memcpy(ent->pattern, pattern, len);
-	ent->len = len;
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
 	return ent;
-- 
2.7.1.574.gccd43a9

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-16  3:15     ` Jeff King
  2016-02-16  3:26       ` Jeff King
@ 2016-02-16  4:10       ` Eric Sunshine
  1 sibling, 0 replies; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  4:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 10:15 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Feb 15, 2016 at 09:17:05PM -0500, Eric Sunshine wrote:
>> > -       ent = xcalloc(1, (sizeof(*ent) + len));
>> > -       memcpy(ent->pattern, pattern, len);
>> > +       FLEX_ALLOC_MEM(ent, pattern, pattern, len);
>>
>> Does the incoming 'len' already account for the NUL terminator, or was
>> the original code underallocating?
>>
>> Answering my own question: Looking at reflog_expire_config() and
>> parse_config_key(), I gather that 'len' already accounts for the NUL,
>> thus the new code is overallocating (which should not be a problem).
>
> Actually, I think the original underallocates. If we have
> gc.foobar.reflogExpire, then "pattern" will poitn to "foobar" and "len"
> will be 6. Meaning we allocate without a trailing NUL.

Ugh, yeah, I misread the code.

> That _should_ be OK, because the struct has a "len" field, and readers
> can be careful not to go past it. And indeed, in the loop above, we
> check the length and use memcmp().
>
> But later, in set_reflog_expiry_param(), we walk through the list and
> hand ent->pattern directly to wildmatch, which assumes a NUL-terminated
> string. In practice, it probably works out 7 out of 8 times, because
> malloc will align the struct, and we're on a zeroed page, so unless the
> string is exactly 8 characters, we'll get some extra NULs afterwards.
> But I could demonstrate it by doing:
>
>   gdb --args git -c gc.foobar12.reflogexpire=never reflog expire --all
>
> and breaking on wildmatch, which yields:
>
>   Breakpoint 1, wildmatch (pattern=0x85eb70 "foobar12Q", text=0x85e4d4
>         "refs/heads/master", flags=0, wo=0x0)

Nice confirmation.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-16  3:36         ` Jeff King
@ 2016-02-16  4:18           ` Eric Sunshine
  2016-02-16  4:22             ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  4:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 10:36 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Feb 15, 2016 at 10:26:26PM -0500, Jeff King wrote:
>> We could do this on top of my series (I can also factor out the fix
>> separately to go at the beginning if we don't want to hold the bugfix
>> hostage).
>>
>> -- >8 --
>> Subject: [PATCH] reflog_expire_cfg: drop misleading "len" parameter
>
> Here it is as a separate fix. Applying my series on top would need a
> minor and obvious tweak. I'll hold a re-roll for more comments, but will
> otherwise plan to stick this at the front of the series.

Yep, I prefer this version of the patch too, as it makes it explicit
that a bug is being fixed rather than it happening "by accident" via
the FLEX_ALLOC_MEM conversion, which is easily overlooked.

> -- >8 --
> Subject: [PATCH] reflog_expire_cfg: NUL-terminate pattern field
>
> You can tweak the reflog expiration for a particular subset
> of refs by configuring gc.foo.reflogexpire. We keep a linked
> list of reflog_expire_cfg structs, each of which holds the
> pattern and a "len" field for the length of the pattern. The
> pattern itself is _not_ NUL-terminated.
>
> However, we feed the pattern directly to wildmatch(), which
> expects a NUL-terminated string, meaning it may keep reading
> random junk after our struct.
>
> We can fix this by allocating an extra byte for the NUL
> (which is already zero because we use xcalloc). Let's also
> drop the misleading "len" field, which is no longer
> necessary. The existing use of "len" can be converted to use
> strncmp().
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> @@ -408,13 +407,12 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
>                 reflog_expire_cfg_tail = &reflog_expire_cfg;
>
>         for (ent = reflog_expire_cfg; ent; ent = ent->next)
> -               if (ent->len == len &&
> -                   !memcmp(ent->pattern, pattern, len))
> +               if (!strncmp(ent->pattern, pattern, len) &&
> +                   ent->pattern[len] == '\0')

If 'ent->pattern' is shorter than 'pattern' then the strncmp() will
fail, thus it will short-circuit before ent->pattern[len] has a chance
to access beyond the end of memory allocated for 'ent->pattern'. Okay,
makes sense.

>                         return ent;
>
> -       ent = xcalloc(1, (sizeof(*ent) + len));
> +       ent = xcalloc(1, sizeof(*ent) + len + 1);
>         memcpy(ent->pattern, pattern, len);
> -       ent->len = len;
>         *reflog_expire_cfg_tail = ent;
>         reflog_expire_cfg_tail = &(ent->next);
>         return ent;
> --
> 2.7.1.574.gccd43a9

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 05/18] convert trivial cases to ALLOC_ARRAY
  2016-02-15 21:51 ` [PATCH 05/18] convert trivial cases to ALLOC_ARRAY Jeff King
@ 2016-02-16  4:22   ` Eric Sunshine
  2016-02-16  4:23     ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  4:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 4:51 PM, Jeff King <peff@peff.net> wrote:
> Each of these cases can be converted to use ALLOC_ARRAY or
> REALLOC_ARRAY, which has two advantages:
>
>   1. It automatically checks the array-size multiplication
>      for overflow.
>
>   2. It always uses sizeof(*array) for the element-size,
>      so that it can never go out of sync with the declared
>      type of the array.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/compat/mingw.c b/compat/mingw.c
> index 77a51d3..0eabe68 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -854,7 +854,7 @@ static char **get_path_split(void)
>         if (!n)
>                 return NULL;
>
> -       path = xmalloc((n+1)*sizeof(char *));
> +       ALLOC_ARRAY(path, n+1);

Elsewhere in this patch, you've reformatted "x+c" as "x + c"; perhaps
do so here, as well.

>         p = envpath;
>         i = 0;
>         do {

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros
  2016-02-16  4:18           ` Eric Sunshine
@ 2016-02-16  4:22             ` Jeff King
  0 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-16  4:22 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 11:18:56PM -0500, Eric Sunshine wrote:

> > diff --git a/builtin/reflog.c b/builtin/reflog.c
> > @@ -408,13 +407,12 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
> >                 reflog_expire_cfg_tail = &reflog_expire_cfg;
> >
> >         for (ent = reflog_expire_cfg; ent; ent = ent->next)
> > -               if (ent->len == len &&
> > -                   !memcmp(ent->pattern, pattern, len))
> > +               if (!strncmp(ent->pattern, pattern, len) &&
> > +                   ent->pattern[len] == '\0')
> 
> If 'ent->pattern' is shorter than 'pattern' then the strncmp() will
> fail, thus it will short-circuit before ent->pattern[len] has a chance
> to access beyond the end of memory allocated for 'ent->pattern'. Okay,
> makes sense.

Yeah. It took me a minute to convince myself that this was correct. If
you have a shorter or more clear way of writing it, I'm open to it. The
best I could come up with is running an extra "strlen" and otherwise
keeping the loop as it is; the performance on that is not as good, but
if performance is a concern, maybe something besides a linear search
would be in order. :)

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 05/18] convert trivial cases to ALLOC_ARRAY
  2016-02-16  4:22   ` Eric Sunshine
@ 2016-02-16  4:23     ` Jeff King
  2016-02-16  4:32       ` Eric Sunshine
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-16  4:23 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 11:22:12PM -0500, Eric Sunshine wrote:

> On Mon, Feb 15, 2016 at 4:51 PM, Jeff King <peff@peff.net> wrote:
> > Each of these cases can be converted to use ALLOC_ARRAY or
> > REALLOC_ARRAY, which has two advantages:
> >
> >   1. It automatically checks the array-size multiplication
> >      for overflow.
> >
> >   2. It always uses sizeof(*array) for the element-size,
> >      so that it can never go out of sync with the declared
> >      type of the array.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
> > ---
> > diff --git a/compat/mingw.c b/compat/mingw.c
> > index 77a51d3..0eabe68 100644
> > --- a/compat/mingw.c
> > +++ b/compat/mingw.c
> > @@ -854,7 +854,7 @@ static char **get_path_split(void)
> >         if (!n)
> >                 return NULL;
> >
> > -       path = xmalloc((n+1)*sizeof(char *));
> > +       ALLOC_ARRAY(path, n+1);
> 
> Elsewhere in this patch, you've reformatted "x+c" as "x + c"; perhaps
> do so here, as well.

Will do. I noticed while going over this before sending it out that it
may also be technically possible for "n+1" to overflow here (and I think
in a few other places in this patch). I don't know how paranoid we want
to be.

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 05/18] convert trivial cases to ALLOC_ARRAY
  2016-02-16  4:23     ` Jeff King
@ 2016-02-16  4:32       ` Eric Sunshine
  2016-02-16  5:46         ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  4:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 11:23 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Feb 15, 2016 at 11:22:12PM -0500, Eric Sunshine wrote:
>> On Mon, Feb 15, 2016 at 4:51 PM, Jeff King <peff@peff.net> wrote:
>> > -       path = xmalloc((n+1)*sizeof(char *));
>> > +       ALLOC_ARRAY(path, n+1);
>>
>> Elsewhere in this patch, you've reformatted "x+c" as "x + c"; perhaps
>> do so here, as well.
>
> Will do. I noticed while going over this before sending it out that it
> may also be technically possible for "n+1" to overflow here (and I think
> in a few other places in this patch). I don't know how paranoid we want
> to be.

Yes, I also noticed those and considered mentioning it. There was also
some multiplication which might be of concern.

    ALLOC_ARRAY(graph->mapping, 2 * graph->column_capacity);

It would be easy enough to manually call st_add() and st_mult() for
those cases, but I haven't examined them closely enough to determine
how likely they would be to overflow, nor do I know if the resulting
noisiness of code is desirable.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 05/18] convert trivial cases to ALLOC_ARRAY
  2016-02-16  4:32       ` Eric Sunshine
@ 2016-02-16  5:46         ` Jeff King
  0 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-16  5:46 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Mon, Feb 15, 2016 at 11:32:25PM -0500, Eric Sunshine wrote:

> On Mon, Feb 15, 2016 at 11:23 PM, Jeff King <peff@peff.net> wrote:
> > On Mon, Feb 15, 2016 at 11:22:12PM -0500, Eric Sunshine wrote:
> >> On Mon, Feb 15, 2016 at 4:51 PM, Jeff King <peff@peff.net> wrote:
> >> > -       path = xmalloc((n+1)*sizeof(char *));
> >> > +       ALLOC_ARRAY(path, n+1);
> >>
> >> Elsewhere in this patch, you've reformatted "x+c" as "x + c"; perhaps
> >> do so here, as well.
> >
> > Will do. I noticed while going over this before sending it out that it
> > may also be technically possible for "n+1" to overflow here (and I think
> > in a few other places in this patch). I don't know how paranoid we want
> > to be.
> 
> Yes, I also noticed those and considered mentioning it. There was also
> some multiplication which might be of concern.
> 
>     ALLOC_ARRAY(graph->mapping, 2 * graph->column_capacity);
> 
> It would be easy enough to manually call st_add() and st_mult() for
> those cases, but I haven't examined them closely enough to determine
> how likely they would be to overflow, nor do I know if the resulting
> noisiness of code is desirable.

Yeah, I'm quite sure that one is safe (we set column_capacity to a fixed
integer immediately beforehand). And many of the "+" ones are likely
safe, too.  If "n" is close to wrapping, then allocating "n" structs
will probably fail beforehand (though not always, if you have a ton of
RAM and "n" is a signed int).

But part of the point of this series is that we shouldn't have to wonder
if things are safe. They should just be obviously so, and we should err
on the side of caution. So I think it probably _is_ worth sprinkling
st_add() calls in those places. I'll take a look for the re-roll.

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 08/18] use st_add and st_mult for allocation size computation
  2016-02-15 21:53 ` [PATCH 08/18] use st_add and st_mult for allocation size computation Jeff King
@ 2016-02-16  5:47   ` Eric Sunshine
  0 siblings, 0 replies; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  5:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 4:53 PM, Jeff King <peff@peff.net> wrote:
> If our size computation overflows size_t, we may allocate a
> much smaller buffer than we expected and overflow it. It's
> probably impossible to trigger an overflow in most of these
> sites in practice, but it is easy enough convert their
> additions and multiplications into overflow-checking
> variants. This may be fixing real bugs, and it makes
> auditing the code easier.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/builtin/apply.c b/builtin/apply.c
> @@ -2632,7 +2632,7 @@ static void update_image(struct image *img,
> -       result = xmalloc(img->len + insert_count - remove_count + 1);
> +       result = xmalloc(st_add3(st_sub(img->len, remove_count), insert_count, 1));

Phew, what a mouthful, and not easy to read compared to the original.
Fortunately, the remainder of the changes in this patch are
straightforward and often simple.

> diff --git a/sha1_name.c b/sha1_name.c
> @@ -87,9 +87,8 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa
>                 const char *objdir = get_object_directory();
> -               int objdir_len = strlen(objdir);
> -               int entlen = objdir_len + 43;
> -               fakeent = xmalloc(sizeof(*fakeent) + entlen);
> +               size_t objdir_len = strlen(objdir);
> +               fakeent = xmalloc(st_add3(sizeof(*fakeent), objdir_len, 43));
>                 memcpy(fakeent->base, objdir, objdir_len);
>                 fakeent->name = fakeent->base + objdir_len + 1;

If we've gotten this far without die()ing due to overflow in st_add3()
when invoking xmalloc(), then we know that this fakeent->name
computation won't overflow. Okay.

>                 fakeent->name[-1] = '/';

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 13/18] sequencer: simplify memory allocation of get_message
  2016-02-15 21:56 ` [PATCH 13/18] sequencer: simplify memory allocation of get_message Jeff King
@ 2016-02-16  6:05   ` Eric Sunshine
  0 siblings, 0 replies; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  6:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 4:56 PM, Jeff King <peff@peff.net> wrote:
> For a commit with has "1234abcd" and subject "foo", this

Did you mean s/with has/which has ID/ or something?

> function produces a struct with three strings:
>
>  1. "foo"
>
>  2. "1234abcd... foo"
>
>  3. "parent of 1234abcd... foo"
>
> It takes advantage of the fact that these strings are
> subsets of each other, and allocates only _one_ string, with
> pointers into the various parts. Unfortunately, this makes
> the string allocation complicated and hard to follow.
>
> Since we keep only one of these in memory at a time, we can
> afford to simply allocate three strings. This lets us build
> on tools like xstrfmt and avoid manual computation.
>
> While we're here, we can also drop the ad-hoc
> reimplementation of get_git_commit_encoding(), and simply
> call that function.
>
> Signed-off-by: Jeff King <peff@peff.net>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 14/18] git-compat-util: drop mempcpy compat code
  2016-02-15 21:56 ` [PATCH 14/18] git-compat-util: drop mempcpy compat code Jeff King
@ 2016-02-16  6:05   ` Eric Sunshine
  0 siblings, 0 replies; 93+ messages in thread
From: Eric Sunshine @ 2016-02-16  6:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Mon, Feb 15, 2016 at 4:56 PM, Jeff King <peff@peff.net> wrote:
> There are no callers of this left, as the last one was
> dropped in the previous patch. And there are no likely to be

s/no/not

> new ones, as the function has been around since 2010 without
> gaining any new callers.
>
> Signed-off-by: Jeff King <peff@peff.net>

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v2 0/21] hardening allocations against integer overflow
  2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
                   ` (18 preceding siblings ...)
  2016-02-15 22:02 ` [PATCH 0/18] hardening allocations against integer overflow Jeff King
@ 2016-02-19 11:19 ` Jeff King
  2016-02-19 11:21   ` [PATCH 01/21] reflog_expire_cfg: NUL-terminate pattern field Jeff King
                     ` (21 more replies)
  19 siblings, 22 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:19 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Here's a re-roll of jk/tighten-alloc series from:

  http://thread.gmane.org/gmane.comp.version-control.git/286253

This one fixes all of the minor typo/gramm-o problems from the first
round. As Eric noted, the change to reflog_expire_cfg is an actual
bug-fix. Rather than silently fixing it, I've bumped the fix out to its
own commit at the front of the series.

I also took a look at the raw computation in the ALLOC_ARRAY and
REALLOC_ARRAY lines, as well as the ones in ALLOC_GROW. In theory
something like this is dangerous:

  ALLOC_GROW(foo, nr_foo + 1, alloc_foo);
  foo[nr_foo++] = whatever;

if we overflow nr_foo, which is quite often an "int". In practice, I
think we're OK, for two reasons:

  1. Overflowing a signed "int" here is going to make it go negative
     (technically, it invokes undefined behavior, but let's be blindly
     pragmatic for a minute and assume twos-complement wrapping). On a
     system with a 64-bit size_t, that will try to allocate an enormous
     amount of memory and fail. On a 32 bit system, it will be only
     about 2GB.  But...

  2. We're talking about overflowing 2^31 counters here. And the counter
     is multiplied by the size of each object we're storing in the
     array. So even if we assume that foo is "char *", we know we've
     allocated close to 2GB already. On a 32-bit system, the subsequent
     2GB allocation is pretty much guaranteed to fail.

     On a 64-bit system, I suspect it's possible to convince some of
     these counters to wrap (e.g., storing an array of ints, we're
     talking about only 8GB; that's a lot, but plenty of machines,
     especially servers, can allocate that).

So I have a feeling we're mostly OK there, but the reasoning is
certainly hand-wavy and I'd like to do better. Just switching to:

  ALLOC_GROW(foo, st_add(nr_foo, 1), alloc_foo);
  foo[nr_foo++] = whatever;

doesn't quite cut it. We might succeed in the allocation, and it stays
big, which is good. But if nr_foo is an int, and we wrap to negative
values, we'll start writing to memory before "foo", corrupting the heap.

So I really think we need to look at each site (and there are a lot of
them) and start using size_t more consistently for these. Or
alternatively, have an int-sized version of st_add and use that, though
it's probably just as much work to convert it to a size_t, which IMHO is
more correct. I really wanted to make a type-agnostic version of
st_add(), but I don't think it's possible to do so portably. My best
attempts needed either typeof() or compiler intrinsics.

So I've punted on that for this series, because I'm not convinced there
are active problems, and it's quite a lot of work (and the patches will
be quite disruptive).

While pondering this, I also looked at what happens if an incoming
packfile claims to have 2^32 objects in its header. In index-pack we
actually read this into a signed "int". Which is kind of bad, but in
practice means we run into the "whoops, I can't allocate (size_t)-1
memory" problem and die. We could change this to a uint32_t (which is
what the actual incoming format supports), but I have a feeling that
makes things worse (if we actually manage to process that many objects,
we then start doing some other computations based on the number of
objects, all using ints; so at least as it is now, we bail early).

While peeking at some of these sites, though, I did realize that many of
the ones that became "ALLOC_ARRAY(foo + 1)" were doing so to make a
NULL-terminated argv list. So there are two new patches in this
iteration to switch them to argv_array (one to catch the mundane cases,
and one for a unique snowflake).

  [01/21]: reflog_expire_cfg: NUL-terminate pattern field
  [02/21]: add helpers for detecting size_t overflow
  [03/21]: tree-diff: catch integer overflow in combine_diff_path allocation
  [04/21]: harden REALLOC_ARRAY and xcalloc against size_t overflow
  [05/21]: add helpers for allocating flex-array structs
  [06/21]: convert manual allocations to argv_array
  [07/21]: convert trivial cases to ALLOC_ARRAY
  [08/21]: use xmallocz to avoid size arithmetic
  [09/21]: convert trivial cases to FLEX_ARRAY macros
  [10/21]: use st_add and st_mult for allocation size computation
  [11/21]: prepare_{git,shell}_cmd: use argv_array
  [12/21]: write_untracked_extension: use FLEX_ALLOC helper
  [13/21]: fast-import: simplify allocation in start_packfile
  [14/21]: fetch-pack: simplify add_sought_entry
  [15/21]: test-path-utils: fix normalize_path_copy output buffer size
  [16/21]: sequencer: simplify memory allocation of get_message
  [17/21]: git-compat-util: drop mempcpy compat code
  [18/21]: transport_anonymize_url: use xstrfmt
  [19/21]: diff_populate_gitlink: use a strbuf
  [20/21]: convert ewah/bitmap code to use xmalloc
  [21/21]: ewah: convert to REALLOC_ARRAY, etc

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH 01/21] reflog_expire_cfg: NUL-terminate pattern field
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
@ 2016-02-19 11:21   ` Jeff King
  2016-02-19 11:21   ` [PATCH 02/21] add helpers for detecting size_t overflow Jeff King
                     ` (20 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:21 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

You can tweak the reflog expiration for a particular subset
of refs by configuring gc.foo.reflogexpire. We keep a linked
list of reflog_expire_cfg structs, each of which holds the
pattern and a "len" field for the length of the pattern. The
pattern itself is _not_ NUL-terminated.

However, we feed the pattern directly to wildmatch(), which
expects a NUL-terminated string, meaning it may keep reading
random junk after our struct.

We can fix this by allocating an extra byte for the NUL
(which is already zero because we use xcalloc). Let's also
drop the misleading "len" field, which is no longer
necessary. The existing use of "len" can be converted to use
strncmp().

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/reflog.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index f39960e..9980731 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -396,7 +396,6 @@ static struct reflog_expire_cfg {
 	struct reflog_expire_cfg *next;
 	unsigned long expire_total;
 	unsigned long expire_unreachable;
-	size_t len;
 	char pattern[FLEX_ARRAY];
 } *reflog_expire_cfg, **reflog_expire_cfg_tail;
 
@@ -408,13 +407,12 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		reflog_expire_cfg_tail = &reflog_expire_cfg;
 
 	for (ent = reflog_expire_cfg; ent; ent = ent->next)
-		if (ent->len == len &&
-		    !memcmp(ent->pattern, pattern, len))
+		if (!strncmp(ent->pattern, pattern, len) &&
+		    ent->pattern[len] == '\0')
 			return ent;
 
-	ent = xcalloc(1, (sizeof(*ent) + len));
+	ent = xcalloc(1, sizeof(*ent) + len + 1);
 	memcpy(ent->pattern, pattern, len);
-	ent->len = len;
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
 	return ent;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 02/21] add helpers for detecting size_t overflow
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
  2016-02-19 11:21   ` [PATCH 01/21] reflog_expire_cfg: NUL-terminate pattern field Jeff King
@ 2016-02-19 11:21   ` Jeff King
  2016-02-19 11:21   ` [PATCH 03/21] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
                     ` (19 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:21 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Performing computations on size_t variables that we feed to
xmalloc and friends can be dangerous, as an integer overflow
can cause us to allocate a much smaller chunk than we
realized.

We already have unsigned_add_overflows(), but let's add
unsigned_mult_overflows() to that. Furthermore, rather than
have each site manually check and die on overflow, we can
provide some helpers that will:

  - promote the arguments to size_t, so that we know we are
    doing our computation in the same size of integer that
    will ultimately be fed to xmalloc

  - check and die on overflow

  - return the result so that computations can be done in
    the parameter list of xmalloc.

These functions are a lot uglier to use than normal
arithmetic operators (you have to do "st_add(foo, bar)"
instead of "foo + bar"). To at least limit the damage, we
also provide multi-valued versions. So rather than:

  st_add(st_add(a, b), st_add(c, d));

you can write:

  st_add4(a, b, c, d);

This isn't nearly as elegant as a varargs function, but it's
a lot harder to get it wrong. You don't have to remember to
add a sentinel value at the end, and the compiler will
complain if you get the number of arguments wrong. This
patch adds only the numbered variants required to convert
the current code base; we can easily add more later if
needed.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 693a336..0c65033 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -96,6 +96,14 @@
 #define unsigned_add_overflows(a, b) \
     ((b) > maximum_unsigned_value_of_type(a) - (a))
 
+/*
+ * Returns true if the multiplication of "a" and "b" will
+ * overflow. The types of "a" and "b" must match and must be unsigned.
+ * Note that this macro evaluates "a" twice!
+ */
+#define unsigned_mult_overflows(a, b) \
+    ((a) && (b) > maximum_unsigned_value_of_type(a) / (a))
+
 #ifdef __GNUC__
 #define TYPEOF(x) (__typeof__(x))
 #else
@@ -713,6 +721,32 @@ extern void release_pack_memory(size_t);
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
+static inline size_t st_add(size_t a, size_t b)
+{
+	if (unsigned_add_overflows(a, b))
+		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a + b;
+}
+#define st_add3(a,b,c)   st_add((a),st_add((b),(c)))
+#define st_add4(a,b,c,d) st_add((a),st_add3((b),(c),(d)))
+
+static inline size_t st_mult(size_t a, size_t b)
+{
+	if (unsigned_mult_overflows(a, b))
+		die("size_t overflow: %"PRIuMAX" * %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a * b;
+}
+
+static inline size_t st_sub(size_t a, size_t b)
+{
+	if (a < b)
+		die("size_t underflow: %"PRIuMAX" - %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a - b;
+}
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size)      (alloca(size))
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 03/21] tree-diff: catch integer overflow in combine_diff_path allocation
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
  2016-02-19 11:21   ` [PATCH 01/21] reflog_expire_cfg: NUL-terminate pattern field Jeff King
  2016-02-19 11:21   ` [PATCH 02/21] add helpers for detecting size_t overflow Jeff King
@ 2016-02-19 11:21   ` Jeff King
  2016-02-19 11:22   ` [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
                     ` (18 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:21 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

A combine_diff_path struct has two "flex" members allocated
alongside the struct: a string to hold the pathname, and an
array of parent pointers. We use an "int" to compute this,
meaning we may easily overflow it if the pathname is
extremely long.

We can fix this by using size_t, and checking for overflow
with the st_add helper.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.h      | 4 ++--
 tree-diff.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/diff.h b/diff.h
index 70b2d70..beafbbd 100644
--- a/diff.h
+++ b/diff.h
@@ -222,8 +222,8 @@ struct combine_diff_path {
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
-	(sizeof(struct combine_diff_path) + \
-	 sizeof(struct combine_diff_parent) * (n) + (l) + 1)
+	st_add4(sizeof(struct combine_diff_path), (l), 1, \
+		st_mult(sizeof(struct combine_diff_parent), (n)))
 
 extern void show_combined_diff(struct combine_diff_path *elem, int num_parent,
 			      int dense, struct rev_info *);
diff --git a/tree-diff.c b/tree-diff.c
index 290a1da..4dda9a1 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -124,8 +124,8 @@ static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
 	unsigned mode, const unsigned char *sha1)
 {
 	struct combine_diff_path *p;
-	int len = base->len + pathlen;
-	int alloclen = combine_diff_path_size(nparent, len);
+	size_t len = st_add(base->len, pathlen);
+	size_t alloclen = combine_diff_path_size(nparent, len);
 
 	/* if last->next is !NULL - it is a pre-allocated memory, we can reuse */
 	p = last->next;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (2 preceding siblings ...)
  2016-02-19 11:21   ` [PATCH 03/21] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
@ 2016-02-19 11:22   ` Jeff King
  2016-02-20 21:32     ` René Scharfe
  2016-02-19 11:22   ` [PATCH 05/21] add helpers for allocating flex-array structs Jeff King
                     ` (17 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:22 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

REALLOC_ARRAY inherently involves a multiplication which can
overflow size_t, resulting in a much smaller buffer than we
think we've allocated. We can easily harden it by using
st_mult() to check for overflow.  Likewise, we can add
ALLOC_ARRAY to do the same thing for xmalloc calls.

xcalloc() should already be fine, because it takes the two
factors separately, assuming the system calloc actually
checks for overflow. However, before we even hit the system
calloc(), we do our memory_limit_check, which involves a
multiplication. Let's check for overflow ourselves so that
this limit cannot be bypassed.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 3 ++-
 wrapper.c         | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 0c65033..55c073d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -779,7 +779,8 @@ extern int odb_pack_keep(char *name, size_t namesz, const unsigned char *sha1);
 extern char *xgetcwd(void);
 extern FILE *fopen_for_writing(const char *path);
 
-#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x)))
+#define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
+#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))
 
 static inline char *xstrdup_or_null(const char *str)
 {
diff --git a/wrapper.c b/wrapper.c
index 29a45d2..9afc1a0 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -152,6 +152,9 @@ void *xcalloc(size_t nmemb, size_t size)
 {
 	void *ret;
 
+	if (unsigned_mult_overflows(nmemb, size))
+		die("data too large to fit into virtual memory space");
+
 	memory_limit_check(size * nmemb, 0);
 	ret = calloc(nmemb, size);
 	if (!ret && (!nmemb || !size))
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 05/21] add helpers for allocating flex-array structs
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (3 preceding siblings ...)
  2016-02-19 11:22   ` [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
@ 2016-02-19 11:22   ` Jeff King
  2016-02-19 11:23   ` [PATCH 06/21] convert manual allocations to argv_array Jeff King
                     ` (16 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:22 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.

This patch adds a few helpers to turn simple cases of
flex-array struct allocation into a one-liner that properly
checks for overflow. See the embedded documentation for
details.

Ideally we could provide a more flexible version that could
handle multiple strings, like:

  FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);

But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 55c073d..e71e615 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -782,6 +782,68 @@ extern FILE *fopen_for_writing(const char *path);
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))
 
+/*
+ * These functions help you allocate structs with flex arrays, and copy
+ * the data directly into the array. For example, if you had:
+ *
+ *   struct foo {
+ *     int bar;
+ *     char name[FLEX_ARRAY];
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_MEM(f, name, src, len);
+ *
+ * to allocate a "foo" with the contents of "src" in the "name" field.
+ * The resulting struct is automatically zero'd, and the flex-array field
+ * is NUL-terminated (whether the incoming src buffer was or not).
+ *
+ * The FLEXPTR_* variants operate on structs that don't use flex-arrays,
+ * but do want to store a pointer to some extra data in the same allocated
+ * block. For example, if you have:
+ *
+ *   struct foo {
+ *     char *name;
+ *     int bar;
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_STR(f, name, src);
+ *
+ * and "name" will point to a block of memory after the struct, which will be
+ * freed along with the struct (but the pointer can be repointed anywhere).
+ *
+ * The *_STR variants accept a string parameter rather than a ptr/len
+ * combination.
+ *
+ * Note that these macros will evaluate the first parameter multiple
+ * times, and it must be assignable as an lvalue.
+ */
+#define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
+	(x) = NULL; /* silence -Wuninitialized for offset calculation */ \
+	(x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char *)(x), (buf), (len)); \
+} while (0)
+#define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
+	(x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
+	(x)->ptrname = (void *)((x)+1); \
+} while(0)
+#define FLEX_ALLOC_STR(x, flexname, str) \
+	FLEX_ALLOC_MEM((x), flexname, (str), strlen(str))
+#define FLEXPTR_ALLOC_STR(x, ptrname, str) \
+	FLEXPTR_ALLOC_MEM((x), ptrname, (str), strlen(str))
+
+static inline void *xalloc_flex(size_t base_len, size_t offset,
+				const void *src, size_t src_len)
+{
+	unsigned char *ret = xcalloc(1, st_add3(base_len, src_len, 1));
+	memcpy(ret + offset, src, src_len);
+	return ret;
+}
+
 static inline char *xstrdup_or_null(const char *str)
 {
 	return str ? xstrdup(str) : NULL;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 06/21] convert manual allocations to argv_array
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (4 preceding siblings ...)
  2016-02-19 11:22   ` [PATCH 05/21] add helpers for allocating flex-array structs Jeff King
@ 2016-02-19 11:23   ` Jeff King
  2016-02-20  8:07     ` Eric Sunshine
  2016-02-19 11:23   ` [PATCH 07/21] convert trivial cases to ALLOC_ARRAY Jeff King
                     ` (15 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:23 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:

  1. We no longer have to manually compute the correct final
     array size (so it's one less thing we can screw up).

  2. In many cases we had to make a separate pass to count,
     then allocate, then fill in the array. Now we can do it
     in one pass, making the code shorter and easier to
     follow.

  3. argv_array handles memory ownership for us, making it
     more obvious when things should be free()d and and when
     not.

Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/grep.c         | 10 +++++-----
 builtin/receive-pack.c | 12 +++---------
 builtin/remote-ext.c   | 26 +++++---------------------
 daemon.c               | 12 +++++-------
 git.c                  | 14 +++++---------
 line-log.c             | 26 ++++++++++----------------
 remote-curl.c          | 23 +++++++++++------------
 7 files changed, 44 insertions(+), 79 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 8c516a9..aa7435f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -365,17 +365,17 @@ static void append_path(struct grep_opt *opt, const void *data, size_t len)
 static void run_pager(struct grep_opt *opt, const char *prefix)
 {
 	struct string_list *path_list = opt->output_priv;
-	const char **argv = xmalloc(sizeof(const char *) * (path_list->nr + 1));
+	struct child_process child = CHILD_PROCESS_INIT;
 	int i, status;
 
 	for (i = 0; i < path_list->nr; i++)
-		argv[i] = path_list->items[i].string;
-	argv[path_list->nr] = NULL;
+		argv_array_push(&child.args, path_list->items[i].string);
+	child.dir = prefix;
+	child.use_shell = 1;
 
-	status = run_command_v_opt_cd_env(argv, RUN_USING_SHELL, prefix, NULL);
+	status = run_command(&child);
 	if (status)
 		exit(status);
-	free(argv);
 }
 
 static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int cached)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f2d6761..932afab 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1031,7 +1031,6 @@ static void run_update_post_hook(struct command *commands)
 {
 	struct command *cmd;
 	int argc;
-	const char **argv;
 	struct child_process proc = CHILD_PROCESS_INIT;
 	const char *hook;
 
@@ -1044,21 +1043,16 @@ static void run_update_post_hook(struct command *commands)
 	if (!argc || !hook)
 		return;
 
-	argv = xmalloc(sizeof(*argv) * (2 + argc));
-	argv[0] = hook;
-
-	for (argc = 1, cmd = commands; cmd; cmd = cmd->next) {
+	argv_array_push(&proc.args, hook);
+	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (cmd->error_string || cmd->did_not_exist)
 			continue;
-		argv[argc] = xstrdup(cmd->ref_name);
-		argc++;
+		argv_array_push(&proc.args, cmd->ref_name);
 	}
-	argv[argc] = NULL;
 
 	proc.no_stdin = 1;
 	proc.stdout_to_stderr = 1;
 	proc.err = use_sideband ? -1 : 0;
-	proc.argv = argv;
 
 	if (!start_command(&proc)) {
 		if (use_sideband)
diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c
index e3cd25d..7457c74 100644
--- a/builtin/remote-ext.c
+++ b/builtin/remote-ext.c
@@ -114,30 +114,14 @@ static char *strip_escapes(const char *str, const char *service,
 	}
 }
 
-/* Should be enough... */
-#define MAXARGUMENTS 256
-
-static const char **parse_argv(const char *arg, const char *service)
+static void parse_argv(struct argv_array *out, const char *arg, const char *service)
 {
-	int arguments = 0;
-	int i;
-	const char **ret;
-	char *temparray[MAXARGUMENTS + 1];
-
 	while (*arg) {
-		char *expanded;
-		if (arguments == MAXARGUMENTS)
-			die("remote-ext command has too many arguments");
-		expanded = strip_escapes(arg, service, &arg);
+		char *expanded = strip_escapes(arg, service, &arg);
 		if (expanded)
-			temparray[arguments++] = expanded;
+			argv_array_push(out, expanded);
+		free(expanded);
 	}
-
-	ret = xmalloc((arguments + 1) * sizeof(char *));
-	for (i = 0; i < arguments; i++)
-		ret[i] = temparray[i];
-	ret[arguments] = NULL;
-	return ret;
 }
 
 static void send_git_request(int stdin_fd, const char *serv, const char *repo,
@@ -158,7 +142,7 @@ static int run_child(const char *arg, const char *service)
 	child.in = -1;
 	child.out = -1;
 	child.err = 0;
-	child.argv = parse_argv(arg, service);
+	parse_argv(&child.args, arg, service);
 
 	if (start_command(&child) < 0)
 		die("Can't run specified command");
diff --git a/daemon.c b/daemon.c
index 46b411c..3f57adb 100644
--- a/daemon.c
+++ b/daemon.c
@@ -808,7 +808,7 @@ static void check_dead_children(void)
 			cradle = &blanket->next;
 }
 
-static char **cld_argv;
+struct argv_array cld_argv = ARGV_ARRAY_INIT;
 static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
@@ -842,7 +842,7 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
 #endif
 	}
 
-	cld.argv = (const char **)cld_argv;
+	cld.argv = cld_argv.argv;
 	cld.in = incoming;
 	cld.out = dup(incoming);
 
@@ -1374,12 +1374,10 @@ int main(int argc, char **argv)
 		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
 
 	/* prepare argv for serving-processes */
-	cld_argv = xmalloc(sizeof (char *) * (argc + 2));
-	cld_argv[0] = argv[0];	/* git-daemon */
-	cld_argv[1] = "--serve";
+	argv_array_push(&cld_argv, argv[0]); /* git-daemon */
+	argv_array_push(&cld_argv, "--serve");
 	for (i = 1; i < argc; ++i)
-		cld_argv[i+1] = argv[i];
-	cld_argv[argc+1] = NULL;
+		argv_array_push(&cld_argv, argv[i]);
 
 	return serve(&listen_addr, listen_port, cred);
 }
diff --git a/git.c b/git.c
index 6c64c94..087ad31 100644
--- a/git.c
+++ b/git.c
@@ -247,20 +247,16 @@ static int handle_alias(int *argcp, const char ***argv)
 	alias_string = alias_lookup(alias_command);
 	if (alias_string) {
 		if (alias_string[0] == '!') {
-			const char **alias_argv;
-			int argc = *argcp, i;
+			struct child_process child = CHILD_PROCESS_INIT;
 
 			commit_pager_choice();
 			restore_env(1);
 
-			/* build alias_argv */
-			alias_argv = xmalloc(sizeof(*alias_argv) * (argc + 1));
-			alias_argv[0] = alias_string + 1;
-			for (i = 1; i < argc; ++i)
-				alias_argv[i] = (*argv)[i];
-			alias_argv[argc] = NULL;
+			child.use_shell = 1;
+			argv_array_push(&child.args, alias_string + 1);
+			argv_array_pushv(&child.args, (*argv) + 1);
 
-			ret = run_command_v_opt(alias_argv, RUN_USING_SHELL);
+			ret = run_command(&child);
 			if (ret >= 0)   /* normal exit */
 				exit(ret);
 
diff --git a/line-log.c b/line-log.c
index af6e2f7..5942a6f 100644
--- a/line-log.c
+++ b/line-log.c
@@ -14,6 +14,7 @@
 #include "graph.h"
 #include "userdiff.h"
 #include "line-log.h"
+#include "argv-array.h"
 
 static void range_set_grow(struct range_set *rs, size_t extra)
 {
@@ -746,23 +747,16 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
 	add_line_range(rev, commit, range);
 
 	if (!rev->diffopt.detect_rename) {
-		int i, count = 0;
-		struct line_log_data *r = range;
-		const char **paths;
-		while (r) {
-			count++;
-			r = r->next;
-		}
-		paths = xmalloc((count+1)*sizeof(char *));
-		r = range;
-		for (i = 0; i < count; i++) {
-			paths[i] = xstrdup(r->path);
-			r = r->next;
-		}
-		paths[count] = NULL;
+		struct line_log_data *r;
+		struct argv_array paths = ARGV_ARRAY_INIT;
+
+		for (r = range; r; r = r->next)
+			argv_array_push(&paths, r->path);
 		parse_pathspec(&rev->diffopt.pathspec, 0,
-			       PATHSPEC_PREFER_FULL, "", paths);
-		free(paths);
+			       PATHSPEC_PREFER_FULL, "", paths.argv);
+		/* argv strings are now owned by pathspec */
+		paths.argc = 0;
+		argv_array_clear(&paths);
 	}
 }
 
diff --git a/remote-curl.c b/remote-curl.c
index c704857..42deeec 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -845,23 +845,22 @@ static void parse_fetch(struct strbuf *buf)
 
 static int push_dav(int nr_spec, char **specs)
 {
-	const char **argv = xmalloc((10 + nr_spec) * sizeof(char*));
-	int argc = 0, i;
+	struct child_process child = CHILD_PROCESS_INIT;
+	size_t i;
 
-	argv[argc++] = "http-push";
-	argv[argc++] = "--helper-status";
+	child.git_cmd = 1;
+	argv_array_push(&child.args, "http-push");
+	argv_array_push(&child.args, "--helper-status");
 	if (options.dry_run)
-		argv[argc++] = "--dry-run";
+		argv_array_push(&child.args, "--dry-run");
 	if (options.verbosity > 1)
-		argv[argc++] = "--verbose";
-	argv[argc++] = url.buf;
+		argv_array_push(&child.args, "--verbose");
+	argv_array_push(&child.args, url.buf);
 	for (i = 0; i < nr_spec; i++)
-		argv[argc++] = specs[i];
-	argv[argc++] = NULL;
+		argv_array_push(&child.args, specs[i]);
 
-	if (run_command_v_opt(argv, RUN_GIT_CMD))
-		die("git-%s failed", argv[0]);
-	free(argv);
+	if (run_command(&child))
+		die("git-http-push failed");
 	return 0;
 }
 
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 07/21] convert trivial cases to ALLOC_ARRAY
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (5 preceding siblings ...)
  2016-02-19 11:23   ` [PATCH 06/21] convert manual allocations to argv_array Jeff King
@ 2016-02-19 11:23   ` Jeff King
  2016-02-19 11:23   ` [PATCH 08/21] use xmallocz to avoid size arithmetic Jeff King
                     ` (14 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:23 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
---
 alias.c                  |  2 +-
 attr.c                   |  2 +-
 bisect.c                 |  4 ++--
 builtin/blame.c          |  3 ++-
 builtin/clean.c          |  2 +-
 builtin/fast-export.c    |  2 +-
 builtin/index-pack.c     |  4 ++--
 builtin/merge-base.c     |  2 +-
 builtin/mv.c             |  3 ++-
 builtin/pack-objects.c   |  7 ++++---
 builtin/pack-redundant.c |  2 +-
 builtin/receive-pack.c   |  5 ++---
 column.c                 |  2 +-
 combine-diff.c           |  4 ++--
 commit.c                 |  2 +-
 compat/mingw.c           |  6 +++---
 diffcore-order.c         |  4 ++--
 dir.c                    |  6 +++---
 fast-import.c            |  5 +++--
 fsck.c                   |  3 ++-
 graph.c                  | 10 ++++------
 khash.h                  |  2 +-
 levenshtein.c            |  8 +++++---
 line-log.c               |  8 ++++----
 notes.c                  |  2 +-
 pack-check.c             |  2 +-
 pack-revindex.c          | 12 ++++++++----
 pathspec.c               |  5 +++--
 remote-curl.c            |  3 ++-
 sha1_file.c              |  4 ++--
 shallow.c                |  6 +++---
 show-index.c             |  3 ++-
 transport.c              |  2 +-
 xdiff-interface.c        |  2 +-
 34 files changed, 75 insertions(+), 64 deletions(-)

diff --git a/alias.c b/alias.c
index a11229d..3b90397 100644
--- a/alias.c
+++ b/alias.c
@@ -23,7 +23,7 @@ int split_cmdline(char *cmdline, const char ***argv)
 	int src, dst, count = 0, size = 16;
 	char quoted = 0;
 
-	*argv = xmalloc(sizeof(**argv) * size);
+	ALLOC_ARRAY(*argv, size);
 
 	/* split alias_string */
 	(*argv)[count++] = cmdline;
diff --git a/attr.c b/attr.c
index 086c08d..c83ec49 100644
--- a/attr.c
+++ b/attr.c
@@ -799,7 +799,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 			++count;
 	}
 	*num = count;
-	*check = xmalloc(sizeof(**check) * count);
+	ALLOC_ARRAY(*check, count);
 	j = 0;
 	for (i = 0; i < attr_nr; i++) {
 		const char *value = check_all_attr[i].value;
diff --git a/bisect.c b/bisect.c
index 06ec54e..7996c29 100644
--- a/bisect.c
+++ b/bisect.c
@@ -708,10 +708,10 @@ static struct commit *get_commit_reference(const unsigned char *sha1)
 
 static struct commit **get_bad_and_good_commits(int *rev_nr)
 {
-	int len = 1 + good_revs.nr;
-	struct commit **rev = xmalloc(len * sizeof(*rev));
+	struct commit **rev;
 	int i, n = 0;
 
+	ALLOC_ARRAY(rev, 1 + good_revs.nr);
 	rev[n++] = get_commit_reference(current_bad_oid->hash);
 	for (i = 0; i < good_revs.nr; i++)
 		rev[n++] = get_commit_reference(good_revs.sha1[i]);
diff --git a/builtin/blame.c b/builtin/blame.c
index 55bf5fa..b4ed462 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2059,7 +2059,8 @@ static int prepare_lines(struct scoreboard *sb)
 	for (p = buf; p < end; p = get_next_line(p, end))
 		num++;
 
-	sb->lineno = lineno = xmalloc(sizeof(*sb->lineno) * (num + 1));
+	ALLOC_ARRAY(sb->lineno, num + 1);
+	lineno = sb->lineno;
 
 	for (p = buf; p < end; p = get_next_line(p, end))
 		*lineno++ = p - buf;
diff --git a/builtin/clean.c b/builtin/clean.c
index 7b08237..8229f7e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -543,7 +543,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 	int eof = 0;
 	int i;
 
-	chosen = xmalloc(sizeof(int) * stuff->nr);
+	ALLOC_ARRAY(chosen, stuff->nr);
 	/* set chosen as uninitialized */
 	for (i = 0; i < stuff->nr; i++)
 		chosen[i] = -1;
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 2471297..8164b58 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1021,7 +1021,7 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 		const char **refspecs_str;
 		int i;
 
-		refspecs_str = xmalloc(sizeof(*refspecs_str) * refspecs_list.nr);
+		ALLOC_ARRAY(refspecs_str, refspecs_list.nr);
 		for (i = 0; i < refspecs_list.nr; i++)
 			refspecs_str[i] = refspecs_list.items[i].string;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 6a01509..a60bcfa 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1346,7 +1346,7 @@ static void fix_unresolved_deltas(struct sha1file *f)
 	 * before deltas depending on them, a good heuristic is to start
 	 * resolving deltas in the same order as their position in the pack.
 	 */
-	sorted_by_pos = xmalloc(nr_ref_deltas * sizeof(*sorted_by_pos));
+	ALLOC_ARRAY(sorted_by_pos, nr_ref_deltas);
 	for (i = 0; i < nr_ref_deltas; i++)
 		sorted_by_pos[i] = &ref_deltas[i];
 	qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
@@ -1759,7 +1759,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (show_stat)
 		show_pack_info(stat_only);
 
-	idx_objects = xmalloc((nr_objects) * sizeof(struct pack_idx_entry *));
+	ALLOC_ARRAY(idx_objects, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
 	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_sha1);
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index a891162..c0d1822 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -252,7 +252,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 	if (argc < 2)
 		usage_with_options(merge_base_usage, options);
 
-	rev = xmalloc(argc * sizeof(*rev));
+	ALLOC_ARRAY(rev, argc);
 	while (argc-- > 0)
 		rev[rev_nr++] = get_commit_reference(*argv++);
 	return show_merge_base(rev, rev_nr, show_all);
diff --git a/builtin/mv.c b/builtin/mv.c
index d1d4316..9a9813a 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -24,7 +24,8 @@ static const char **internal_copy_pathspec(const char *prefix,
 					   int count, unsigned flags)
 {
 	int i;
-	const char **result = xmalloc((count + 1) * sizeof(const char *));
+	const char **result;
+	ALLOC_ARRAY(result, count + 1);
 	memcpy(result, pathspec, count * sizeof(const char *));
 	result[count] = NULL;
 	for (i = 0; i < count; i++) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4dae5b1..b4f1fa6 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -624,7 +624,7 @@ static struct object_entry **compute_write_order(void)
 {
 	unsigned int i, wo_end, last_untagged;
 
-	struct object_entry **wo = xmalloc(to_pack.nr_objects * sizeof(*wo));
+	struct object_entry **wo;
 	struct object_entry *objects = to_pack.objects;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
@@ -657,6 +657,7 @@ static struct object_entry **compute_write_order(void)
 	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
+	ALLOC_ARRAY(wo, to_pack.nr_objects);
 	for (i = wo_end = 0; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			break;
@@ -769,7 +770,7 @@ static void write_pack_file(void)
 
 	if (progress > pack_to_stdout)
 		progress_state = start_progress(_("Writing objects"), nr_result);
-	written_list = xmalloc(to_pack.nr_objects * sizeof(*written_list));
+	ALLOC_ARRAY(written_list, to_pack.nr_objects);
 	write_order = compute_write_order();
 
 	do {
@@ -2129,7 +2130,7 @@ static void prepare_pack(int window, int depth)
 	if (!to_pack.nr_objects || !window || !depth)
 		return;
 
-	delta_list = xmalloc(to_pack.nr_objects * sizeof(*delta_list));
+	ALLOC_ARRAY(delta_list, to_pack.nr_objects);
 	nr_deltas = n = 0;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index d0532f6..72c8158 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -53,7 +53,7 @@ static inline struct llist_item *llist_item_get(void)
 		free_nodes = free_nodes->next;
 	} else {
 		int i = 1;
-		new = xmalloc(sizeof(struct llist_item) * BLKSIZE);
+		ALLOC_ARRAY(new, BLKSIZE);
 		for (; i < BLKSIZE; i++)
 			llist_item_put(&new[i]);
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 932afab..3dc3868 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1591,8 +1591,7 @@ static void prepare_shallow_update(struct command *commands,
 {
 	int i, j, k, bitmap_size = (si->ref->nr + 31) / 32;
 
-	si->used_shallow = xmalloc(sizeof(*si->used_shallow) *
-				   si->shallow->nr);
+	ALLOC_ARRAY(si->used_shallow, si->shallow->nr);
 	assign_shallow_commits_to_refs(si, si->used_shallow, NULL);
 
 	si->need_reachability_test =
@@ -1658,7 +1657,7 @@ static void update_shallow_info(struct command *commands,
 		return;
 	}
 
-	ref_status = xmalloc(sizeof(*ref_status) * ref->nr);
+	ALLOC_ARRAY(ref_status, ref->nr);
 	assign_shallow_commits_to_refs(si, NULL, ref_status);
 	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (is_null_sha1(cmd->new_sha1))
diff --git a/column.c b/column.c
index 786abe6..f9fda68 100644
--- a/column.c
+++ b/column.c
@@ -164,7 +164,7 @@ static void display_table(const struct string_list *list,
 	data.colopts = colopts;
 	data.opts = *opts;
 
-	data.len = xmalloc(sizeof(*data.len) * list->nr);
+	ALLOC_ARRAY(data.len, list->nr);
 	for (i = 0; i < list->nr; i++)
 		data.len[i] = item_length(colopts, list->items[i].string);
 
diff --git a/combine-diff.c b/combine-diff.c
index 5571304..a698016 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1372,7 +1372,7 @@ static struct combine_diff_path *find_paths_multitree(
 	struct combine_diff_path paths_head;
 	struct strbuf base;
 
-	parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0]));
+	ALLOC_ARRAY(parents_sha1, nparent);
 	for (i = 0; i < nparent; i++)
 		parents_sha1[i] = parents->sha1[i];
 
@@ -1483,7 +1483,7 @@ void diff_tree_combined(const unsigned char *sha1,
 	if (opt->orderfile && num_paths) {
 		struct obj_order *o;
 
-		o = xmalloc(sizeof(*o) * num_paths);
+		ALLOC_ARRAY(o, num_paths);
 		for (i = 0, p = paths; p; p = p->next, i++)
 			o[i].obj = p;
 		order_objects(opt->orderfile, path_path, o, num_paths);
diff --git a/commit.c b/commit.c
index 40388d7..31cd91f 100644
--- a/commit.c
+++ b/commit.c
@@ -903,7 +903,7 @@ static int remove_redundant(struct commit **array, int cnt)
 
 	work = xcalloc(cnt, sizeof(*work));
 	redundant = xcalloc(cnt, 1);
-	filled_index = xmalloc(sizeof(*filled_index) * (cnt - 1));
+	ALLOC_ARRAY(filled_index, cnt - 1);
 
 	for (i = 0; i < cnt; i++)
 		parse_commit(array[i]);
diff --git a/compat/mingw.c b/compat/mingw.c
index fbe69b8..7803f30 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -893,7 +893,7 @@ static char **get_path_split(void)
 	if (!n)
 		return NULL;
 
-	path = xmalloc((n+1)*sizeof(char *));
+	ALLOC_ARRAY(path, n + 1);
 	p = envpath;
 	i = 0;
 	do {
@@ -978,7 +978,7 @@ static wchar_t *make_environment_block(char **deltaenv)
 		i++;
 
 	/* copy the environment, leaving space for changes */
-	tmpenv = xmalloc((size + i) * sizeof(char*));
+	ALLOC_ARRAY(tmpenv, size + i);
 	memcpy(tmpenv, environ, size * sizeof(char*));
 
 	/* merge supplied environment changes into the temporary environment */
@@ -1168,7 +1168,7 @@ static int try_shell_exec(const char *cmd, char *const *argv)
 		int argc = 0;
 		const char **argv2;
 		while (argv[argc]) argc++;
-		argv2 = xmalloc(sizeof(*argv) * (argc+1));
+		ALLOC_ARRAY(argv2, argc + 1);
 		argv2[0] = (char *)cmd;	/* full path to the script file */
 		memcpy(&argv2[1], &argv[1], sizeof(*argv) * argc);
 		pid = mingw_spawnv(prog, argv2, 1);
diff --git a/diffcore-order.c b/diffcore-order.c
index 97dd3d0..69d41f7 100644
--- a/diffcore-order.c
+++ b/diffcore-order.c
@@ -52,7 +52,7 @@ static void prepare_order(const char *orderfile)
 		}
 		if (pass == 0) {
 			order_cnt = cnt;
-			order = xmalloc(sizeof(*order) * cnt);
+			ALLOC_ARRAY(order, cnt);
 		}
 	}
 }
@@ -120,7 +120,7 @@ void diffcore_order(const char *orderfile)
 	if (!q->nr)
 		return;
 
-	o = xmalloc(sizeof(*o) * q->nr);
+	ALLOC_ARRAY(o, q->nr);
 	for (i = 0; i < q->nr; i++)
 		o[i].obj = q->queue[i];
 	order_objects(orderfile, pair_pathtwo, o, q->nr);
diff --git a/dir.c b/dir.c
index f0b6d0a..66c93c1 100644
--- a/dir.c
+++ b/dir.c
@@ -2484,14 +2484,14 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	ud.untracked_alloc = value;
 	ud.untracked_nr	   = value;
 	if (ud.untracked_nr)
-		ud.untracked = xmalloc(sizeof(*ud.untracked) * ud.untracked_nr);
+		ALLOC_ARRAY(ud.untracked, ud.untracked_nr);
 	data = next;
 
 	next = data;
 	ud.dirs_alloc = ud.dirs_nr = decode_varint(&next);
 	if (next > end)
 		return -1;
-	ud.dirs = xmalloc(sizeof(*ud.dirs) * ud.dirs_nr);
+	ALLOC_ARRAY(ud.dirs, ud.dirs_nr);
 	data = next;
 
 	len = strlen((const char *)data);
@@ -2611,7 +2611,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
 	rd.data	      = next;
 	rd.end	      = end;
 	rd.index      = 0;
-	rd.ucd        = xmalloc(sizeof(*rd.ucd) * len);
+	ALLOC_ARRAY(rd.ucd, len);
 
 	if (read_one_dir(&uc->root, &rd) || rd.index != len)
 		goto done;
diff --git a/fast-import.c b/fast-import.c
index bf01b34..a6467cb 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -814,7 +814,8 @@ static struct tree_entry *new_tree_entry(void)
 	if (!avail_tree_entry) {
 		unsigned int n = tree_entry_alloc;
 		total_allocd += n * sizeof(struct tree_entry);
-		avail_tree_entry = e = xmalloc(n * sizeof(struct tree_entry));
+		ALLOC_ARRAY(e, n);
+		avail_tree_entry = e;
 		while (n-- > 1) {
 			*((void**)e) = e + 1;
 			e++;
@@ -898,7 +899,7 @@ static const char *create_index(void)
 	struct object_entry_pool *o;
 
 	/* Build the table of object IDs. */
-	idx = xmalloc(object_count * sizeof(*idx));
+	ALLOC_ARRAY(idx, object_count);
 	c = idx;
 	for (o = blocks; o; o = o->next_pool)
 		for (e = o->next_free; e-- != o->entries;)
diff --git a/fsck.c b/fsck.c
index c637f66..ca4c685 100644
--- a/fsck.c
+++ b/fsck.c
@@ -199,7 +199,8 @@ void fsck_set_msg_type(struct fsck_options *options,
 
 	if (!options->msg_type) {
 		int i;
-		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		int *msg_type;
+		ALLOC_ARRAY(msg_type, FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
 			msg_type[i] = fsck_msg_type(i, options);
 		options->msg_type = msg_type;
diff --git a/graph.c b/graph.c
index c25a09a..1350bdd 100644
--- a/graph.c
+++ b/graph.c
@@ -234,12 +234,10 @@ struct git_graph *graph_init(struct rev_info *opt)
 	 * We'll automatically grow columns later if we need more room.
 	 */
 	graph->column_capacity = 30;
-	graph->columns = xmalloc(sizeof(struct column) *
-				 graph->column_capacity);
-	graph->new_columns = xmalloc(sizeof(struct column) *
-				     graph->column_capacity);
-	graph->mapping = xmalloc(sizeof(int) * 2 * graph->column_capacity);
-	graph->new_mapping = xmalloc(sizeof(int) * 2 * graph->column_capacity);
+	ALLOC_ARRAY(graph->columns, graph->column_capacity);
+	ALLOC_ARRAY(graph->new_columns, graph->column_capacity);
+	ALLOC_ARRAY(graph->mapping, 2 * graph->column_capacity);
+	ALLOC_ARRAY(graph->new_mapping, 2 * graph->column_capacity);
 
 	/*
 	 * The diff output prefix callback, with this we can make
diff --git a/khash.h b/khash.h
index 376475a..c0da40d 100644
--- a/khash.h
+++ b/khash.h
@@ -117,7 +117,7 @@ static const double __ac_HASH_UPPER = 0.77;
 			if (new_n_buckets < 4) new_n_buckets = 4;					\
 			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0;	/* requested size is too small */ \
 			else { /* hash table size to be changed (shrink or expand); rehash */ \
-				new_flags = (khint32_t*)xmalloc(__ac_fsize(new_n_buckets) * sizeof(khint32_t));	\
+				ALLOC_ARRAY(new_flags, __ac_fsize(new_n_buckets)); \
 				if (!new_flags) return -1;								\
 				memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
 				if (h->n_buckets < new_n_buckets) {	/* expand */		\
diff --git a/levenshtein.c b/levenshtein.c
index fc28159..d263269 100644
--- a/levenshtein.c
+++ b/levenshtein.c
@@ -42,11 +42,13 @@ int levenshtein(const char *string1, const char *string2,
 		int w, int s, int a, int d)
 {
 	int len1 = strlen(string1), len2 = strlen(string2);
-	int *row0 = xmalloc(sizeof(int) * (len2 + 1));
-	int *row1 = xmalloc(sizeof(int) * (len2 + 1));
-	int *row2 = xmalloc(sizeof(int) * (len2 + 1));
+	int *row0, *row1, *row2;
 	int i, j;
 
+	ALLOC_ARRAY(row0, len2 + 1);
+	ALLOC_ARRAY(row1, len2 + 1);
+	ALLOC_ARRAY(row2, len2 + 1);
+
 	for (j = 0; j <= len2; j++)
 		row1[j] = j * a;
 	for (i = 0; i < len1; i++) {
diff --git a/line-log.c b/line-log.c
index 5942a6f..fa095b9 100644
--- a/line-log.c
+++ b/line-log.c
@@ -522,7 +522,7 @@ static void fill_line_ends(struct diff_filespec *spec, long *lines,
 	if (diff_populate_filespec(spec, 0))
 		die("Cannot read blob %s", sha1_to_hex(spec->sha1));
 
-	ends = xmalloc(size * sizeof(*ends));
+	ALLOC_ARRAY(ends, size);
 	ends[cur++] = 0;
 	data = spec->data;
 	while (num < spec->size) {
@@ -1140,9 +1140,9 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
 	if (nparents > 1 && rev->first_parent_only)
 		nparents = 1;
 
-	diffqueues = xmalloc(nparents * sizeof(*diffqueues));
-	cand = xmalloc(nparents * sizeof(*cand));
-	parents = xmalloc(nparents * sizeof(*parents));
+	ALLOC_ARRAY(diffqueues, nparents);
+	ALLOC_ARRAY(cand, nparents);
+	ALLOC_ARRAY(parents, nparents);
 
 	p = commit->parents;
 	for (i = 0; i < nparents; i++) {
diff --git a/notes.c b/notes.c
index c1e5035..88cf474 100644
--- a/notes.c
+++ b/notes.c
@@ -1035,7 +1035,7 @@ struct notes_tree **load_notes_trees(struct string_list *refs, int flags)
 	struct string_list_item *item;
 	int counter = 0;
 	struct notes_tree **trees;
-	trees = xmalloc((refs->nr+1) * sizeof(struct notes_tree *));
+	ALLOC_ARRAY(trees, refs->nr + 1);
 	for_each_string_list_item(item, refs) {
 		struct notes_tree *t = xcalloc(1, sizeof(struct notes_tree));
 		init_notes(t, item->string, combine_notes_ignore, flags);
diff --git a/pack-check.c b/pack-check.c
index 433bd86..1da89a4 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -89,7 +89,7 @@ static int verify_packfile(struct packed_git *p,
 	 * we do not do scan-streaming check on the pack file.
 	 */
 	nr_objects = p->num_objects;
-	entries = xmalloc((nr_objects + 1) * sizeof(*entries));
+	ALLOC_ARRAY(entries, nr_objects + 1);
 	entries[nr_objects].offset = pack_sig_ofs;
 	/* first sort entries by pack offset, since unpacking them is more efficient that way */
 	for (i = 0; i < nr_objects; i++) {
diff --git a/pack-revindex.c b/pack-revindex.c
index 155a8a3..96d51c3 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -44,10 +44,14 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
 	 * keep track of them with alias pointers, always sorting from "from"
 	 * to "to".
 	 */
-	struct revindex_entry *tmp = xmalloc(n * sizeof(*tmp));
-	struct revindex_entry *from = entries, *to = tmp;
+	struct revindex_entry *tmp, *from, *to;
 	int bits;
-	unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));
+	unsigned *pos;
+
+	ALLOC_ARRAY(pos, BUCKETS);
+	ALLOC_ARRAY(tmp, n);
+	from = entries;
+	to = tmp;
 
 	/*
 	 * If (max >> bits) is zero, then we know that the radix digit we are
@@ -121,7 +125,7 @@ static void create_pack_revindex(struct packed_git *p)
 	unsigned i;
 	const char *index = p->index_data;
 
-	p->revindex = xmalloc(sizeof(*p->revindex) * (num_ent + 1));
+	ALLOC_ARRAY(p->revindex, num_ent + 1);
 	index += 4 * 256;
 
 	if (p->index_version > 1) {
diff --git a/pathspec.c b/pathspec.c
index 9304ee3..c9e9b6c 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -406,7 +406,8 @@ void parse_pathspec(struct pathspec *pathspec,
 		n++;
 
 	pathspec->nr = n;
-	pathspec->items = item = xmalloc(sizeof(*item) * n);
+	ALLOC_ARRAY(pathspec->items, n);
+	item = pathspec->items;
 	pathspec->_raw = argv;
 	prefixlen = prefix ? strlen(prefix) : 0;
 
@@ -483,7 +484,7 @@ const char **get_pathspec(const char *prefix, const char **pathspec)
 void copy_pathspec(struct pathspec *dst, const struct pathspec *src)
 {
 	*dst = *src;
-	dst->items = xmalloc(sizeof(struct pathspec_item) * dst->nr);
+	ALLOC_ARRAY(dst->items, dst->nr);
 	memcpy(dst->items, src->items,
 	       sizeof(struct pathspec_item) * dst->nr);
 }
diff --git a/remote-curl.c b/remote-curl.c
index 42deeec..5259b7e 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -696,9 +696,10 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 static int fetch_dumb(int nr_heads, struct ref **to_fetch)
 {
 	struct walker *walker;
-	char **targets = xmalloc(nr_heads * sizeof(char*));
+	char **targets;
 	int ret, i;
 
+	ALLOC_ARRAY(targets, nr_heads);
 	if (options.depth)
 		die("dumb http transport does not support --depth");
 	for (i = 0; i < nr_heads; i++)
diff --git a/sha1_file.c b/sha1_file.c
index aab1872..2f1c6d3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1942,7 +1942,7 @@ static enum object_type packed_to_object_type(struct packed_git *p,
 		/* Push the object we're going to leave behind */
 		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
 			poi_stack_alloc = alloc_nr(poi_stack_nr);
-			poi_stack = xmalloc(sizeof(off_t)*poi_stack_alloc);
+			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
 			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
 		} else {
 			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
@@ -2308,7 +2308,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 		if (delta_stack_nr >= delta_stack_alloc
 		    && delta_stack == small_delta_stack) {
 			delta_stack_alloc = alloc_nr(delta_stack_nr);
-			delta_stack = xmalloc(sizeof(*delta_stack)*delta_stack_alloc);
+			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
 			memcpy(delta_stack, small_delta_stack,
 			       sizeof(*delta_stack)*delta_stack_nr);
 		} else {
diff --git a/shallow.c b/shallow.c
index 60f1505..71163bf 100644
--- a/shallow.c
+++ b/shallow.c
@@ -315,8 +315,8 @@ void prepare_shallow_info(struct shallow_info *info, struct sha1_array *sa)
 	info->shallow = sa;
 	if (!sa)
 		return;
-	info->ours = xmalloc(sizeof(*info->ours) * sa->nr);
-	info->theirs = xmalloc(sizeof(*info->theirs) * sa->nr);
+	ALLOC_ARRAY(info->ours, sa->nr);
+	ALLOC_ARRAY(info->theirs, sa->nr);
 	for (i = 0; i < sa->nr; i++) {
 		if (has_sha1_file(sa->sha1[i])) {
 			struct commit_graft *graft;
@@ -487,7 +487,7 @@ void assign_shallow_commits_to_refs(struct shallow_info *info,
 	struct paint_info pi;
 
 	trace_printf_key(&trace_shallow, "shallow: assign_shallow_commits_to_refs\n");
-	shallow = xmalloc(sizeof(*shallow) * (info->nr_ours + info->nr_theirs));
+	ALLOC_ARRAY(shallow, info->nr_ours + info->nr_theirs);
 	for (i = 0; i < info->nr_ours; i++)
 		shallow[nr_shallow++] = info->ours[i];
 	for (i = 0; i < info->nr_theirs; i++)
diff --git a/show-index.c b/show-index.c
index d9e4903..acf8d54 100644
--- a/show-index.c
+++ b/show-index.c
@@ -50,7 +50,8 @@ int main(int argc, char **argv)
 			unsigned char sha1[20];
 			uint32_t crc;
 			uint32_t off;
-		} *entries = xmalloc(nr * sizeof(entries[0]));
+		} *entries;
+		ALLOC_ARRAY(entries, nr);
 		for (i = 0; i < nr; i++)
 			if (fread(entries[i].sha1, 20, 1, stdin) != 1)
 				die("unable to read sha1 %u/%u", i, nr);
diff --git a/transport.c b/transport.c
index c92f8ae..3b4e644 100644
--- a/transport.c
+++ b/transport.c
@@ -977,7 +977,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 		 * This condition shouldn't be met in a non-deepening fetch
 		 * (see builtin/fetch.c:quickfetch()).
 		 */
-		heads = xmalloc(nr_refs * sizeof(*heads));
+		ALLOC_ARRAY(heads, nr_refs);
 		for (rm = refs; rm; rm = rm->next)
 			heads[nr_heads++] = rm;
 	}
diff --git a/xdiff-interface.c b/xdiff-interface.c
index cb67c1c..54236f2 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -265,7 +265,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
 	for (i = 0, regs->nr = 1; value[i]; i++)
 		if (value[i] == '\n')
 			regs->nr++;
-	regs->array = xmalloc(regs->nr * sizeof(struct ff_reg));
+	ALLOC_ARRAY(regs->array, regs->nr);
 	for (i = 0; i < regs->nr; i++) {
 		struct ff_reg *reg = regs->array + i;
 		const char *ep = strchr(value, '\n'), *expression;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 08/21] use xmallocz to avoid size arithmetic
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (6 preceding siblings ...)
  2016-02-19 11:23   ` [PATCH 07/21] convert trivial cases to ALLOC_ARRAY Jeff King
@ 2016-02-19 11:23   ` Jeff King
  2016-02-19 11:23   ` [PATCH 09/21] convert trivial cases to FLEX_ARRAY macros Jeff King
                     ` (13 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:23 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/check-ref-format.c | 2 +-
 builtin/merge-tree.c       | 2 +-
 builtin/worktree.c         | 2 +-
 column.c                   | 3 +--
 combine-diff.c             | 4 +---
 config.c                   | 4 +---
 dir.c                      | 2 +-
 entry.c                    | 2 +-
 grep.c                     | 3 +--
 imap-send.c                | 5 ++---
 ll-merge.c                 | 2 +-
 progress.c                 | 2 +-
 refs.c                     | 2 +-
 setup.c                    | 5 ++---
 strbuf.c                   | 2 +-
 15 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/builtin/check-ref-format.c b/builtin/check-ref-format.c
index fd915d5..eac4994 100644
--- a/builtin/check-ref-format.c
+++ b/builtin/check-ref-format.c
@@ -20,7 +20,7 @@ static const char builtin_check_ref_format_usage[] =
  */
 static char *collapse_slashes(const char *refname)
 {
-	char *ret = xmalloc(strlen(refname) + 1);
+	char *ret = xmallocz(strlen(refname));
 	char ch;
 	char prev = '/';
 	char *cp = ret;
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index d4f0cbd..ca57004 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -174,7 +174,7 @@ static struct merge_list *create_entry(unsigned stage, unsigned mode, const unsi
 
 static char *traverse_path(const struct traverse_info *info, const struct name_entry *n)
 {
-	char *path = xmalloc(traverse_path_len(info, n) + 1);
+	char *path = xmallocz(traverse_path_len(info, n));
 	return make_traverse_path(path, info, n);
 }
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 475b958..0a45710 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -52,7 +52,7 @@ static int prune_worktree(const char *id, struct strbuf *reason)
 		return 1;
 	}
 	len = st.st_size;
-	path = xmalloc(len + 1);
+	path = xmallocz(len);
 	read_in_full(fd, path, len);
 	close(fd);
 	while (len && (path[len - 1] == '\n' || path[len - 1] == '\r'))
diff --git a/column.c b/column.c
index f9fda68..d55ead1 100644
--- a/column.c
+++ b/column.c
@@ -173,9 +173,8 @@ static void display_table(const struct string_list *list,
 	if (colopts & COL_DENSE)
 		shrink_columns(&data);
 
-	empty_cell = xmalloc(initial_width + 1);
+	empty_cell = xmallocz(initial_width);
 	memset(empty_cell, ' ', initial_width);
-	empty_cell[initial_width] = '\0';
 	for (y = 0; y < data.rows; y++) {
 		for (x = 0; x < data.cols; x++)
 			if (display_cell(&data, initial_width, empty_cell, x, y))
diff --git a/combine-diff.c b/combine-diff.c
index a698016..890c415 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1043,7 +1043,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 				elem->mode = canon_mode(S_IFLNK);
 
 			result_size = len;
-			result = xmalloc(len + 1);
+			result = xmallocz(len);
 
 			done = read_in_full(fd, result, len);
 			if (done < 0)
@@ -1051,8 +1051,6 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 			else if (done < len)
 				die("early EOF '%s'", elem->path);
 
-			result[len] = 0;
-
 			/* If not a fake symlink, apply filters, e.g. autocrlf */
 			if (is_file) {
 				struct strbuf buf = STRBUF_INIT;
diff --git a/config.c b/config.c
index b95ac3a..e7b589a 100644
--- a/config.c
+++ b/config.c
@@ -1902,7 +1902,7 @@ static int git_config_parse_key_1(const char *key, char **store_key, int *basele
 	 * Validate the key and while at it, lower case it for matching.
 	 */
 	if (store_key)
-		*store_key = xmalloc(strlen(key) + 1);
+		*store_key = xmallocz(strlen(key));
 
 	dot = 0;
 	for (i = 0; key[i]; i++) {
@@ -1926,8 +1926,6 @@ static int git_config_parse_key_1(const char *key, char **store_key, int *basele
 		if (store_key)
 			(*store_key)[i] = c;
 	}
-	if (store_key)
-		(*store_key)[i] = 0;
 
 	return 0;
 
diff --git a/dir.c b/dir.c
index 66c93c1..6c627d5 100644
--- a/dir.c
+++ b/dir.c
@@ -711,7 +711,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 			close(fd);
 			return 0;
 		}
-		buf = xmalloc(size+1);
+		buf = xmallocz(size);
 		if (read_in_full(fd, buf, size) != size) {
 			free(buf);
 			close(fd);
diff --git a/entry.c b/entry.c
index 582c400..a410957 100644
--- a/entry.c
+++ b/entry.c
@@ -6,7 +6,7 @@
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
 {
-	char *buf = xmalloc(path_len + 1);
+	char *buf = xmallocz(path_len);
 	int len = 0;
 
 	while (len < path_len) {
diff --git a/grep.c b/grep.c
index 7b2b96a..528b652 100644
--- a/grep.c
+++ b/grep.c
@@ -1741,7 +1741,7 @@ static int grep_source_load_file(struct grep_source *gs)
 	i = open(filename, O_RDONLY);
 	if (i < 0)
 		goto err_ret;
-	data = xmalloc(size + 1);
+	data = xmallocz(size);
 	if (st.st_size != read_in_full(i, data, size)) {
 		error(_("'%s': short read %s"), filename, strerror(errno));
 		close(i);
@@ -1749,7 +1749,6 @@ static int grep_source_load_file(struct grep_source *gs)
 		return -1;
 	}
 	close(i);
-	data[size] = 0;
 
 	gs->buf = data;
 	gs->size = size;
diff --git a/imap-send.c b/imap-send.c
index 4d3b773..2c52027 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -892,12 +892,11 @@ static char *cram(const char *challenge_64, const char *user, const char *pass)
 	response = xstrfmt("%s %s", user, hex);
 	resp_len = strlen(response) + 1;
 
-	response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1);
+	response_64 = xmallocz(ENCODED_SIZE(resp_len));
 	encoded_len = EVP_EncodeBlock((unsigned char *)response_64,
 				      (unsigned char *)response, resp_len);
 	if (encoded_len < 0)
 		die("EVP_EncodeBlock error");
-	response_64[encoded_len] = '\0';
 	return (char *)response_64;
 }
 
@@ -1188,7 +1187,7 @@ static void lf_to_crlf(struct strbuf *msg)
 		j++;
 	}
 
-	new = xmalloc(j + 1);
+	new = xmallocz(j);
 
 	/*
 	 * Second pass: write the new string.  Note that this loop is
diff --git a/ll-merge.c b/ll-merge.c
index 0338630..ff4a43a 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -205,7 +205,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	if (fstat(fd, &st))
 		goto close_bad;
 	result->size = st.st_size;
-	result->ptr = xmalloc(result->size + 1);
+	result->ptr = xmallocz(result->size);
 	if (read_in_full(fd, result->ptr, result->size) != result->size) {
 		free(result->ptr);
 		result->ptr = NULL;
diff --git a/progress.c b/progress.c
index 353bd37..76a88c5 100644
--- a/progress.c
+++ b/progress.c
@@ -247,7 +247,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		size_t len = strlen(msg) + 5;
 		struct throughput *tp = progress->throughput;
 
-		bufp = (len < sizeof(buf)) ? buf : xmalloc(len + 1);
+		bufp = (len < sizeof(buf)) ? buf : xmallocz(len);
 		if (tp) {
 			unsigned int rate = !tp->avg_misecs ? 0 :
 					tp->avg_bytes / tp->avg_misecs;
diff --git a/refs.c b/refs.c
index e2d34b2..1d9e2a7 100644
--- a/refs.c
+++ b/refs.c
@@ -124,7 +124,7 @@ int refname_is_safe(const char *refname)
 		char *buf;
 		int result;
 
-		buf = xmalloc(strlen(refname) + 1);
+		buf = xmallocz(strlen(refname));
 		/*
 		 * Does the refname try to escape refs/?
 		 * For example: refs/foo/../bar is safe but refs/foo/../../bar
diff --git a/setup.c b/setup.c
index 0deb022..ab21086 100644
--- a/setup.c
+++ b/setup.c
@@ -88,7 +88,7 @@ char *prefix_path_gently(const char *prefix, int len,
 	const char *orig = path;
 	char *sanitized;
 	if (is_absolute_path(orig)) {
-		sanitized = xmalloc(strlen(path) + 1);
+		sanitized = xmallocz(strlen(path));
 		if (remaining_prefix)
 			*remaining_prefix = 0;
 		if (normalize_path_copy_len(sanitized, path, remaining_prefix)) {
@@ -488,14 +488,13 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
 		error_code = READ_GITFILE_ERR_OPEN_FAILED;
 		goto cleanup_return;
 	}
-	buf = xmalloc(st.st_size + 1);
+	buf = xmallocz(st.st_size);
 	len = read_in_full(fd, buf, st.st_size);
 	close(fd);
 	if (len != st.st_size) {
 		error_code = READ_GITFILE_ERR_READ_FAILED;
 		goto cleanup_return;
 	}
-	buf[len] = '\0';
 	if (!starts_with(buf, "gitdir: ")) {
 		error_code = READ_GITFILE_ERR_INVALID_FORMAT;
 		goto cleanup_return;
diff --git a/strbuf.c b/strbuf.c
index bab316d..f60e2ee 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -718,7 +718,7 @@ char *xstrdup_tolower(const char *string)
 	size_t len, i;
 
 	len = strlen(string);
-	result = xmalloc(len + 1);
+	result = xmallocz(len);
 	for (i = 0; i < len; i++)
 		result[i] = tolower(string[i]);
 	result[i] = '\0';
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 09/21] convert trivial cases to FLEX_ARRAY macros
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (7 preceding siblings ...)
  2016-02-19 11:23   ` [PATCH 08/21] use xmallocz to avoid size arithmetic Jeff King
@ 2016-02-19 11:23   ` Jeff King
  2016-02-19 11:23   ` [PATCH 10/21] use st_add and st_mult for allocation size computation Jeff King
                     ` (12 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:23 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.

Signed-off-by: Jeff King <peff@peff.net>
---
 attr.c               |  4 +---
 builtin/blame.c      |  4 +---
 builtin/help.c       |  9 +++------
 builtin/mktree.c     |  9 +++++----
 builtin/reflog.c     |  7 ++-----
 cache-tree.c         |  4 +---
 combine-diff.c       |  4 +---
 diff.c               |  7 ++-----
 dir.c                | 16 +++-------------
 hashmap.c            |  3 +--
 help.c               |  6 ++----
 log-tree.c           |  5 ++---
 name-hash.c          |  3 +--
 ref-filter.c         |  6 ++----
 refs.c               |  6 ++----
 refs/files-backend.c | 19 +++++--------------
 remote.c             |  5 +----
 17 files changed, 35 insertions(+), 82 deletions(-)

diff --git a/attr.c b/attr.c
index c83ec49..6537a43 100644
--- a/attr.c
+++ b/attr.c
@@ -93,9 +93,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	if (invalid_attr_name(name, len))
 		return NULL;
 
-	a = xmalloc(sizeof(*a) + len + 1);
-	memcpy(a->name, name, len);
-	a->name[len] = 0;
+	FLEX_ALLOC_MEM(a, name, name, len);
 	a->h = hval;
 	a->next = git_attr_hash[pos];
 	a->attr_nr = attr_nr++;
diff --git a/builtin/blame.c b/builtin/blame.c
index b4ed462..e982fb8 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -466,13 +466,11 @@ static void queue_blames(struct scoreboard *sb, struct origin *porigin,
 static struct origin *make_origin(struct commit *commit, const char *path)
 {
 	struct origin *o;
-	size_t pathlen = strlen(path) + 1;
-	o = xcalloc(1, sizeof(*o) + pathlen);
+	FLEX_ALLOC_STR(o, path, path);
 	o->commit = commit;
 	o->refcnt = 1;
 	o->next = commit->util;
 	commit->util = o;
-	memcpy(o->path, path, pathlen); /* includes NUL */
 	return o;
 }
 
diff --git a/builtin/help.c b/builtin/help.c
index 1cd0c1e..3c55ce4 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -171,12 +171,10 @@ static void exec_man_cmd(const char *cmd, const char *page)
 static void add_man_viewer(const char *name)
 {
 	struct man_viewer_list **p = &man_viewer_list;
-	size_t len = strlen(name);
 
 	while (*p)
 		p = &((*p)->next);
-	*p = xcalloc(1, (sizeof(**p) + len + 1));
-	memcpy((*p)->name, name, len); /* NUL-terminated by xcalloc */
+	FLEX_ALLOC_STR(*p, name, name);
 }
 
 static int supported_man_viewer(const char *name, size_t len)
@@ -190,9 +188,8 @@ static void do_add_man_viewer_info(const char *name,
 				   size_t len,
 				   const char *value)
 {
-	struct man_viewer_info_list *new = xcalloc(1, sizeof(*new) + len + 1);
-
-	memcpy(new->name, name, len); /* NUL-terminated by xcalloc */
+	struct man_viewer_info_list *new;
+	FLEX_ALLOC_MEM(new, name, name, len);
 	new->info = xstrdup(value);
 	new->next = man_viewer_info_list;
 	man_viewer_info_list = new;
diff --git a/builtin/mktree.c b/builtin/mktree.c
index a237caa..4282b62 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -19,16 +19,17 @@ static int alloc, used;
 static void append_to_tree(unsigned mode, unsigned char *sha1, char *path)
 {
 	struct treeent *ent;
-	int len = strlen(path);
+	size_t len = strlen(path);
 	if (strchr(path, '/'))
 		die("path %s contains slash", path);
 
-	ALLOC_GROW(entries, used + 1, alloc);
-	ent = entries[used++] = xmalloc(sizeof(**entries) + len + 1);
+	FLEX_ALLOC_MEM(ent, name, path, len);
 	ent->mode = mode;
 	ent->len = len;
 	hashcpy(ent->sha1, sha1);
-	memcpy(ent->name, path, len+1);
+
+	ALLOC_GROW(entries, used + 1, alloc);
+	entries[used++] = ent;
 }
 
 static int ent_compare(const void *a_, const void *b_)
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 9980731..2d46b64 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -382,11 +382,9 @@ static int collect_reflog(const char *ref, const struct object_id *oid, int unus
 {
 	struct collected_reflog *e;
 	struct collect_reflog_cb *cb = cb_data;
-	size_t namelen = strlen(ref);
 
-	e = xmalloc(sizeof(*e) + namelen + 1);
+	FLEX_ALLOC_STR(e, reflog, ref);
 	hashcpy(e->sha1, oid->hash);
-	memcpy(e->reflog, ref, namelen + 1);
 	ALLOC_GROW(cb->e, cb->nr + 1, cb->alloc);
 	cb->e[cb->nr++] = e;
 	return 0;
@@ -411,8 +409,7 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		    ent->pattern[len] == '\0')
 			return ent;
 
-	ent = xcalloc(1, sizeof(*ent) + len + 1);
-	memcpy(ent->pattern, pattern, len);
+	FLEX_ALLOC_MEM(ent, pattern, pattern, len);
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
 	return ent;
diff --git a/cache-tree.c b/cache-tree.c
index 20ee7b5..3ebf9c3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -79,11 +79,9 @@ static struct cache_tree_sub *find_subtree(struct cache_tree *it,
 	ALLOC_GROW(it->down, it->subtree_nr + 1, it->subtree_alloc);
 	it->subtree_nr++;
 
-	down = xmalloc(sizeof(*down) + pathlen + 1);
+	FLEX_ALLOC_MEM(down, name, path, pathlen);
 	down->cache_tree = NULL;
 	down->namelen = pathlen;
-	memcpy(down->name, path, pathlen);
-	down->name[pathlen] = 0;
 
 	if (pos < it->subtree_nr)
 		memmove(it->down + pos + 1,
diff --git a/combine-diff.c b/combine-diff.c
index 890c415..be09a2b 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -319,7 +319,7 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	if (line[len-1] == '\n')
 		len--;
 
-	lline = xmalloc(sizeof(*lline) + len + 1);
+	FLEX_ALLOC_MEM(lline, line, line, len);
 	lline->len = len;
 	lline->next = NULL;
 	lline->prev = sline->plost.lost_tail;
@@ -330,8 +330,6 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	sline->plost.lost_tail = lline;
 	sline->plost.len++;
 	lline->parent_map = this_mask;
-	memcpy(lline->line, line, len);
-	lline->line[len] = 0;
 }
 
 struct combine_diff_state {
diff --git a/diff.c b/diff.c
index 2136b69..27d14a7 100644
--- a/diff.c
+++ b/diff.c
@@ -2607,12 +2607,9 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 
 struct diff_filespec *alloc_filespec(const char *path)
 {
-	int namelen = strlen(path);
-	struct diff_filespec *spec = xmalloc(sizeof(*spec) + namelen + 1);
+	struct diff_filespec *spec;
 
-	memset(spec, 0, sizeof(*spec));
-	spec->path = (char *)(spec + 1);
-	memcpy(spec->path, path, namelen+1);
+	FLEXPTR_ALLOC_STR(spec, path, path);
 	spec->count = 1;
 	spec->is_binary = -1;
 	return spec;
diff --git a/dir.c b/dir.c
index 6c627d5..f06ebb7 100644
--- a/dir.c
+++ b/dir.c
@@ -503,12 +503,7 @@ void add_exclude(const char *string, const char *base,
 
 	parse_exclude_pattern(&string, &patternlen, &flags, &nowildcardlen);
 	if (flags & EXC_FLAG_MUSTBEDIR) {
-		char *s;
-		x = xmalloc(sizeof(*x) + patternlen + 1);
-		s = (char *)(x+1);
-		memcpy(s, string, patternlen);
-		s[patternlen] = '\0';
-		x->pattern = s;
+		FLEXPTR_ALLOC_MEM(x, pattern, string, patternlen);
 	} else {
 		x = xmalloc(sizeof(*x));
 		x->pattern = string;
@@ -625,10 +620,7 @@ static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
 	}
 
 	uc->dir_created++;
-	d = xmalloc(sizeof(*d) + len + 1);
-	memset(d, 0, sizeof(*d));
-	memcpy(d->name, name, len);
-	d->name[len] = '\0';
+	FLEX_ALLOC_MEM(d, name, name, len);
 
 	ALLOC_GROW(dir->dirs, dir->dirs_nr + 1, dir->dirs_alloc);
 	memmove(dir->dirs + first + 1, dir->dirs + first,
@@ -1167,10 +1159,8 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
 {
 	struct dir_entry *ent;
 
-	ent = xmalloc(sizeof(*ent) + len + 1);
+	FLEX_ALLOC_MEM(ent, name, pathname, len);
 	ent->len = len;
-	memcpy(ent->name, pathname, len);
-	ent->name[len] = 0;
 	return ent;
 }
 
diff --git a/hashmap.c b/hashmap.c
index f693839..b10b642 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -256,10 +256,9 @@ const void *memintern(const void *data, size_t len)
 	e = hashmap_get(&map, &key, data);
 	if (!e) {
 		/* not found: create it */
-		e = xmallocz(sizeof(struct pool_entry) + len);
+		FLEX_ALLOC_MEM(e, data, data, len);
 		hashmap_entry_init(e, key.ent.hash);
 		e->len = len;
-		memcpy(e->data, data, len);
 		hashmap_add(&map, e);
 	}
 	return e->data;
diff --git a/help.c b/help.c
index d996b34..19328ea 100644
--- a/help.c
+++ b/help.c
@@ -11,11 +11,9 @@
 
 void add_cmdname(struct cmdnames *cmds, const char *name, int len)
 {
-	struct cmdname *ent = xmalloc(sizeof(*ent) + len + 1);
-
+	struct cmdname *ent;
+	FLEX_ALLOC_MEM(ent, name, name, len);
 	ent->len = len;
-	memcpy(ent->name, name, len);
-	ent->name[len] = 0;
 
 	ALLOC_GROW(cmds->names, cmds->cnt + 1, cmds->alloc);
 	cmds->names[cmds->cnt++] = ent;
diff --git a/log-tree.c b/log-tree.c
index f70a30e..60f9839 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -77,9 +77,8 @@ int parse_decorate_color_config(const char *var, const char *slot_name, const ch
 
 void add_name_decoration(enum decoration_type type, const char *name, struct object *obj)
 {
-	int nlen = strlen(name);
-	struct name_decoration *res = xmalloc(sizeof(*res) + nlen + 1);
-	memcpy(res->name, name, nlen + 1);
+	struct name_decoration *res;
+	FLEX_ALLOC_STR(res, name, name);
 	res->type = type;
 	res->next = add_decoration(&name_decoration, obj, res);
 }
diff --git a/name-hash.c b/name-hash.c
index 332ba95..6d9f23e 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -55,10 +55,9 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 	dir = find_dir_entry(istate, ce->name, namelen);
 	if (!dir) {
 		/* not found, create it and add to hash table */
-		dir = xcalloc(1, sizeof(struct dir_entry) + namelen + 1);
+		FLEX_ALLOC_MEM(dir, name, ce->name, namelen);
 		hashmap_entry_init(dir, memihash(ce->name, namelen));
 		dir->namelen = namelen;
-		strncpy(dir->name, ce->name, namelen);
 		hashmap_add(&istate->dir_hash, dir);
 
 		/* recursively add missing parent directories */
diff --git a/ref-filter.c b/ref-filter.c
index f097176..9ccfc51 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1255,10 +1255,8 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 						 const unsigned char *objectname,
 						 int flag)
 {
-	size_t len = strlen(refname);
-	struct ref_array_item *ref = xcalloc(1, sizeof(struct ref_array_item) + len + 1);
-	memcpy(ref->refname, refname, len);
-	ref->refname[len] = '\0';
+	struct ref_array_item *ref;
+	FLEX_ALLOC_STR(ref, refname, refname);
 	hashcpy(ref->objectname, objectname);
 	ref->flag = flag;
 
diff --git a/refs.c b/refs.c
index 1d9e2a7..2d86445 100644
--- a/refs.c
+++ b/refs.c
@@ -761,10 +761,8 @@ void ref_transaction_free(struct ref_transaction *transaction)
 static struct ref_update *add_update(struct ref_transaction *transaction,
 				     const char *refname)
 {
-	size_t len = strlen(refname) + 1;
-	struct ref_update *update = xcalloc(1, sizeof(*update) + len);
-
-	memcpy((char *)update->refname, refname, len); /* includes NUL */
+	struct ref_update *update;
+	FLEX_ALLOC_STR(update, refname, refname);
 	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
 	transaction->updates[transaction->nr++] = update;
 	return update;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b569762..81f68f8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -199,17 +199,14 @@ static struct ref_entry *create_ref_entry(const char *refname,
 					  const unsigned char *sha1, int flag,
 					  int check_name)
 {
-	int len;
 	struct ref_entry *ref;
 
 	if (check_name &&
 	    check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
 		die("Reference has invalid format: '%s'", refname);
-	len = strlen(refname) + 1;
-	ref = xmalloc(sizeof(struct ref_entry) + len);
+	FLEX_ALLOC_STR(ref, name, refname);
 	hashcpy(ref->u.value.oid.hash, sha1);
 	oidclr(&ref->u.value.peeled);
-	memcpy(ref->name, refname, len);
 	ref->flag = flag;
 	return ref;
 }
@@ -268,9 +265,7 @@ static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
 					  int incomplete)
 {
 	struct ref_entry *direntry;
-	direntry = xcalloc(1, sizeof(struct ref_entry) + len + 1);
-	memcpy(direntry->name, dirname, len);
-	direntry->name[len] = '\0';
+	FLEX_ALLOC_MEM(direntry, name, dirname, len);
 	direntry->u.subdir.ref_cache = ref_cache;
 	direntry->flag = REF_DIR | (incomplete ? REF_INCOMPLETE : 0);
 	return direntry;
@@ -939,13 +934,10 @@ static void clear_loose_ref_cache(struct ref_cache *refs)
  */
 static struct ref_cache *create_ref_cache(const char *submodule)
 {
-	int len;
 	struct ref_cache *refs;
 	if (!submodule)
 		submodule = "";
-	len = strlen(submodule) + 1;
-	refs = xcalloc(1, sizeof(struct ref_cache) + len);
-	memcpy(refs->name, submodule, len);
+	FLEX_ALLOC_STR(refs, name, submodule);
 	refs->next = submodule_ref_caches;
 	submodule_ref_caches = refs;
 	return refs;
@@ -2197,10 +2189,9 @@ static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data)
 
 	/* Schedule the loose reference for pruning if requested. */
 	if ((cb->flags & PACK_REFS_PRUNE)) {
-		int namelen = strlen(entry->name) + 1;
-		struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen);
+		struct ref_to_prune *n;
+		FLEX_ALLOC_STR(n, name, entry->name);
 		hashcpy(n->sha1, entry->u.value.oid.hash);
-		memcpy(n->name, entry->name, namelen); /* includes NUL */
 		n->next = cb->ref_to_prune;
 		cb->ref_to_prune = n;
 	}
diff --git a/remote.c b/remote.c
index 02e698a..25a960f 100644
--- a/remote.c
+++ b/remote.c
@@ -2136,16 +2136,13 @@ static int one_local_ref(const char *refname, const struct object_id *oid,
 {
 	struct ref ***local_tail = cb_data;
 	struct ref *ref;
-	int len;
 
 	/* we already know it starts with refs/ to get here */
 	if (check_refname_format(refname + 5, 0))
 		return 0;
 
-	len = strlen(refname) + 1;
-	ref = xcalloc(1, sizeof(*ref) + len);
+	ref = alloc_ref(refname);
 	oidcpy(&ref->new_oid, oid);
-	memcpy(ref->name, refname, len);
 	**local_tail = ref;
 	*local_tail = &ref->next;
 	return 0;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 10/21] use st_add and st_mult for allocation size computation
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (8 preceding siblings ...)
  2016-02-19 11:23   ` [PATCH 09/21] convert trivial cases to FLEX_ARRAY macros Jeff King
@ 2016-02-19 11:23   ` Jeff King
  2016-02-19 11:24   ` [PATCH 11/21] prepare_{git,shell}_cmd: use argv_array Jeff King
                     ` (11 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:23 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c              |  4 ++--
 builtin/apply.c        |  2 +-
 builtin/clean.c        |  2 +-
 builtin/fetch.c        |  2 +-
 builtin/index-pack.c   |  4 ++--
 builtin/merge.c        |  2 +-
 builtin/mv.c           |  4 ++--
 builtin/receive-pack.c |  2 +-
 combine-diff.c         | 14 +++++++-------
 commit.c               |  2 +-
 compat/mingw.c         |  4 ++--
 compat/qsort.c         |  2 +-
 compat/setenv.c        |  2 +-
 compat/win32/syslog.c  |  4 ++--
 diffcore-delta.c       |  6 ++++--
 diffcore-rename.c      |  2 +-
 dir.c                  |  4 ++--
 fast-import.c          |  2 +-
 refs.c                 |  2 +-
 remote.c               |  8 ++++----
 revision.c             |  2 +-
 sha1_file.c            | 20 +++++++++++---------
 sha1_name.c            |  5 ++---
 shallow.c              |  2 +-
 submodule.c            |  6 +++---
 25 files changed, 56 insertions(+), 53 deletions(-)

diff --git a/archive.c b/archive.c
index 0687afa..5d735ae 100644
--- a/archive.c
+++ b/archive.c
@@ -171,8 +171,8 @@ static void queue_directory(const unsigned char *sha1,
 		unsigned mode, int stage, struct archiver_context *c)
 {
 	struct directory *d;
-	size_t len = base->len + 1 + strlen(filename) + 1;
-	d = xmalloc(sizeof(*d) + len);
+	size_t len = st_add4(base->len, 1, strlen(filename), 1);
+	d = xmalloc(st_add(sizeof(*d), len));
 	d->up	   = c->bottom;
 	d->baselen = base->len;
 	d->mode	   = mode;
diff --git a/builtin/apply.c b/builtin/apply.c
index d61ac65..42c610e 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -2632,7 +2632,7 @@ static void update_image(struct image *img,
 	insert_count = postimage->len;
 
 	/* Adjust the contents */
-	result = xmalloc(img->len + insert_count - remove_count + 1);
+	result = xmalloc(st_add3(st_sub(img->len, remove_count), insert_count, 1));
 	memcpy(result, img->buf, applied_at);
 	memcpy(result + applied_at, postimage->buf, postimage->len);
 	memcpy(result + applied_at + postimage->len,
diff --git a/builtin/clean.c b/builtin/clean.c
index 8229f7e..0371010 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -615,7 +615,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 				nr += chosen[i];
 		}
 
-		result = xcalloc(nr + 1, sizeof(int));
+		result = xcalloc(st_add(nr, 1), sizeof(int));
 		for (i = 0; i < stuff->nr && j < nr; i++) {
 			if (chosen[i])
 				result[j++] = i;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8e74213..373a89d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1110,7 +1110,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv)
 	if (argc > 0) {
 		int j = 0;
 		int i;
-		refs = xcalloc(argc + 1, sizeof(const char *));
+		refs = xcalloc(st_add(argc, 1), sizeof(const char *));
 		for (i = 0; i < argc; i++) {
 			if (!strcmp(argv[i], "tag")) {
 				i++;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a60bcfa..193908a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1744,9 +1744,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
-	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+	objects = xcalloc(st_add(nr_objects, 1), sizeof(struct object_entry));
 	if (show_stat)
-		obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
+		obj_stat = xcalloc(st_add(nr_objects, 1), sizeof(struct object_stat));
 	ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
 	parse_pack_objects(pack_sha1);
 	resolve_deltas();
diff --git a/builtin/merge.c b/builtin/merge.c
index b98a348..101ffef 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -939,7 +939,7 @@ static int setup_with_upstream(const char ***argv)
 	if (!branch->merge_nr)
 		die(_("No default upstream defined for the current branch."));
 
-	args = xcalloc(branch->merge_nr + 1, sizeof(char *));
+	args = xcalloc(st_add(branch->merge_nr, 1), sizeof(char *));
 	for (i = 0; i < branch->merge_nr; i++) {
 		if (!branch->merge[i]->dst)
 			die(_("No remote-tracking branch for %s from %s"),
diff --git a/builtin/mv.c b/builtin/mv.c
index 9a9813a..aeae855 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -48,9 +48,9 @@ static const char **internal_copy_pathspec(const char *prefix,
 
 static const char *add_slash(const char *path)
 {
-	int len = strlen(path);
+	size_t len = strlen(path);
 	if (path[len - 1] != '/') {
-		char *with_slash = xmalloc(len + 2);
+		char *with_slash = xmalloc(st_add(len, 2));
 		memcpy(with_slash, path, len);
 		with_slash[len++] = '/';
 		with_slash[len] = 0;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3dc3868..c8e32b2 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1372,7 +1372,7 @@ static struct command **queue_command(struct command **tail,
 
 	refname = line + 82;
 	reflen = linelen - 82;
-	cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
+	cmd = xcalloc(1, st_add3(sizeof(struct command), reflen, 1));
 	hashcpy(cmd->old_sha1, old_sha1);
 	hashcpy(cmd->new_sha1, new_sha1);
 	memcpy(cmd->ref_name, refname, reflen);
diff --git a/combine-diff.c b/combine-diff.c
index be09a2b..0e1d4b0 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -189,11 +189,11 @@ static struct lline *coalesce_lines(struct lline *base, int *lenbase,
 	 *   - Else if we have NEW, insert newend lline into base and
 	 *   consume newend
 	 */
-	lcs = xcalloc(origbaselen + 1, sizeof(int*));
-	directions = xcalloc(origbaselen + 1, sizeof(enum coalesce_direction*));
+	lcs = xcalloc(st_add(origbaselen, 1), sizeof(int*));
+	directions = xcalloc(st_add(origbaselen, 1), sizeof(enum coalesce_direction*));
 	for (i = 0; i < origbaselen + 1; i++) {
-		lcs[i] = xcalloc(lennew + 1, sizeof(int));
-		directions[i] = xcalloc(lennew + 1, sizeof(enum coalesce_direction));
+		lcs[i] = xcalloc(st_add(lennew, 1), sizeof(int));
+		directions[i] = xcalloc(st_add(lennew, 1), sizeof(enum coalesce_direction));
 		directions[i][0] = BASE;
 	}
 	for (j = 1; j < lennew + 1; j++)
@@ -1111,7 +1111,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 	if (result_size && result[result_size-1] != '\n')
 		cnt++; /* incomplete line */
 
-	sline = xcalloc(cnt+2, sizeof(*sline));
+	sline = xcalloc(st_add(cnt, 2), sizeof(*sline));
 	sline[0].bol = result;
 	for (lno = 0, cp = result; cp < result + result_size; cp++) {
 		if (*cp == '\n') {
@@ -1130,7 +1130,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 	/* Even p_lno[cnt+1] is valid -- that is for the end line number
 	 * for deletion hunk at the end.
 	 */
-	sline[0].p_lno = xcalloc((cnt+2) * num_parent, sizeof(unsigned long));
+	sline[0].p_lno = xcalloc(st_mult(st_add(cnt, 2), num_parent), sizeof(unsigned long));
 	for (lno = 0; lno <= cnt; lno++)
 		sline[lno+1].p_lno = sline[lno].p_lno + num_parent;
 
@@ -1262,7 +1262,7 @@ static struct diff_filepair *combined_pair(struct combine_diff_path *p,
 	struct diff_filespec *pool;
 
 	pair = xmalloc(sizeof(*pair));
-	pool = xcalloc(num_parent + 1, sizeof(struct diff_filespec));
+	pool = xcalloc(st_add(num_parent, 1), sizeof(struct diff_filespec));
 	pair->one = pool + 1;
 	pair->two = pool;
 
diff --git a/commit.c b/commit.c
index 31cd91f..3f4f371 100644
--- a/commit.c
+++ b/commit.c
@@ -147,7 +147,7 @@ struct commit_graft *read_graft_line(char *buf, int len)
 	if ((len + 1) % entry_size)
 		goto bad_graft_data;
 	i = (len + 1) / entry_size - 1;
-	graft = xmalloc(sizeof(*graft) + GIT_SHA1_RAWSZ * i);
+	graft = xmalloc(st_add(sizeof(*graft), st_mult(GIT_SHA1_RAWSZ, i)));
 	graft->nr_parent = i;
 	if (get_oid_hex(buf, &graft->oid))
 		goto bad_graft_data;
diff --git a/compat/mingw.c b/compat/mingw.c
index 7803f30..cfedcf9 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -810,7 +810,7 @@ static const char *quote_arg(const char *arg)
 		return arg;
 
 	/* insert \ where necessary */
-	d = q = xmalloc(len+n+3);
+	d = q = xmalloc(st_add3(len, n, 3));
 	*d++ = '"';
 	while (*arg) {
 		if (*arg == '"')
@@ -1069,7 +1069,7 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
 			free(quoted);
 	}
 
-	wargs = xmalloc((2 * args.len + 1) * sizeof(wchar_t));
+	wargs = xmalloc_array(st_add(st_mult(2, args.len), 1), sizeof(wchar_t));
 	xutftowcs(wargs, args.buf, 2 * args.len + 1);
 	strbuf_release(&args);
 
diff --git a/compat/qsort.c b/compat/qsort.c
index 9574d53..7d071af 100644
--- a/compat/qsort.c
+++ b/compat/qsort.c
@@ -47,7 +47,7 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
 void git_qsort(void *b, size_t n, size_t s,
 	       int (*cmp)(const void *, const void *))
 {
-	const size_t size = n * s;
+	const size_t size = st_mult(n, s);
 	char buf[1024];
 
 	if (size < sizeof(buf)) {
diff --git a/compat/setenv.c b/compat/setenv.c
index fc1439a..7849f25 100644
--- a/compat/setenv.c
+++ b/compat/setenv.c
@@ -18,7 +18,7 @@ int gitsetenv(const char *name, const char *value, int replace)
 
 	namelen = strlen(name);
 	valuelen = strlen(value);
-	envstr = malloc((namelen + valuelen + 2));
+	envstr = malloc(st_add3(namelen, valuelen, 2));
 	if (!envstr) {
 		errno = ENOMEM;
 		return -1;
diff --git a/compat/win32/syslog.c b/compat/win32/syslog.c
index d015e43..b905aea 100644
--- a/compat/win32/syslog.c
+++ b/compat/win32/syslog.c
@@ -32,7 +32,7 @@ void syslog(int priority, const char *fmt, ...)
 		return;
 	}
 
-	str = malloc(str_len + 1);
+	str = malloc(st_add(str_len, 1));
 	if (!str) {
 		warning("malloc failed: '%s'", strerror(errno));
 		return;
@@ -43,7 +43,7 @@ void syslog(int priority, const char *fmt, ...)
 	va_end(ap);
 
 	while ((pos = strstr(str, "%1")) != NULL) {
-		str = realloc(str, ++str_len + 1);
+		str = realloc(str, st_add(++str_len, 1));
 		if (!str) {
 			warning("realloc failed: '%s'", strerror(errno));
 			return;
diff --git a/diffcore-delta.c b/diffcore-delta.c
index 7cf431d..4159748 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -53,7 +53,8 @@ static struct spanhash_top *spanhash_rehash(struct spanhash_top *orig)
 	int osz = 1 << orig->alloc_log2;
 	int sz = osz << 1;
 
-	new = xmalloc(sizeof(*orig) + sizeof(struct spanhash) * sz);
+	new = xmalloc(st_add(sizeof(*orig),
+			     st_mult(sizeof(struct spanhash), sz)));
 	new->alloc_log2 = orig->alloc_log2 + 1;
 	new->free = INITIAL_FREE(new->alloc_log2);
 	memset(new->data, 0, sizeof(struct spanhash) * sz);
@@ -130,7 +131,8 @@ static struct spanhash_top *hash_chars(struct diff_filespec *one)
 	int is_text = !diff_filespec_is_binary(one);
 
 	i = INITIAL_HASH_SIZE;
-	hash = xmalloc(sizeof(*hash) + sizeof(struct spanhash) * (1<<i));
+	hash = xmalloc(st_add(sizeof(*hash),
+			      st_mult(sizeof(struct spanhash), 1<<i)));
 	hash->alloc_log2 = i;
 	hash->free = INITIAL_FREE(i);
 	memset(hash->data, 0, sizeof(struct spanhash) * (1<<i));
diff --git a/diffcore-rename.c b/diffcore-rename.c
index af1fe08..3b3c1ed 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -537,7 +537,7 @@ void diffcore_rename(struct diff_options *options)
 				rename_dst_nr * rename_src_nr, 50, 1);
 	}
 
-	mx = xcalloc(num_create * NUM_CANDIDATE_PER_DST, sizeof(*mx));
+	mx = xcalloc(st_mult(num_create, NUM_CANDIDATE_PER_DST), sizeof(*mx));
 	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
 		struct diff_filespec *two = rename_dst[i].two;
 		struct diff_score *m;
diff --git a/dir.c b/dir.c
index f06ebb7..2c91541 100644
--- a/dir.c
+++ b/dir.c
@@ -689,7 +689,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 			return 0;
 		}
 		if (buf[size-1] != '\n') {
-			buf = xrealloc(buf, size+1);
+			buf = xrealloc(buf, st_add(size, 1));
 			buf[size++] = '\n';
 		}
 	} else {
@@ -2488,7 +2488,7 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	next = data + len + 1;
 	if (next > rd->end)
 		return -1;
-	*untracked_ = untracked = xmalloc(sizeof(*untracked) + len);
+	*untracked_ = untracked = xmalloc(st_add(sizeof(*untracked), len));
 	memcpy(untracked, &ud, sizeof(ud));
 	memcpy(untracked->name, data, len + 1);
 	data = next;
diff --git a/fast-import.c b/fast-import.c
index a6467cb..3053bb8 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -622,7 +622,7 @@ static void *pool_alloc(size_t len)
 			return xmalloc(len);
 		}
 		total_allocd += sizeof(struct mem_pool) + mem_pool_alloc;
-		p = xmalloc(sizeof(struct mem_pool) + mem_pool_alloc);
+		p = xmalloc(st_add(sizeof(struct mem_pool), mem_pool_alloc));
 		p->next_pool = mem_pool;
 		p->next_free = (char *) p->space;
 		p->end = p->next_free + mem_pool_alloc;
diff --git a/refs.c b/refs.c
index 2d86445..b0e6ece 100644
--- a/refs.c
+++ b/refs.c
@@ -906,7 +906,7 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 			/* -2 for strlen("%.*s") - strlen("%s"); +1 for NUL */
 			total_len += strlen(ref_rev_parse_rules[nr_rules]) - 2 + 1;
 
-		scanf_fmts = xmalloc(nr_rules * sizeof(char *) + total_len);
+		scanf_fmts = xmalloc(st_add(st_mult(nr_rules, sizeof(char *)), total_len));
 
 		offset = 0;
 		for (i = 0; i < nr_rules; i++) {
diff --git a/remote.c b/remote.c
index 25a960f..21e4ec3 100644
--- a/remote.c
+++ b/remote.c
@@ -931,7 +931,7 @@ static struct ref *alloc_ref_with_prefix(const char *prefix, size_t prefixlen,
 		const char *name)
 {
 	size_t len = strlen(name);
-	struct ref *ref = xcalloc(1, sizeof(struct ref) + prefixlen + len + 1);
+	struct ref *ref = xcalloc(1, st_add4(sizeof(*ref), prefixlen, len, 1));
 	memcpy(ref->name, prefix, prefixlen);
 	memcpy(ref->name + prefixlen, name, len);
 	return ref;
@@ -948,9 +948,9 @@ struct ref *copy_ref(const struct ref *ref)
 	size_t len;
 	if (!ref)
 		return NULL;
-	len = strlen(ref->name);
-	cpy = xmalloc(sizeof(struct ref) + len + 1);
-	memcpy(cpy, ref, sizeof(struct ref) + len + 1);
+	len = st_add3(sizeof(struct ref), strlen(ref->name), 1);
+	cpy = xmalloc(len);
+	memcpy(cpy, ref, len);
 	cpy->next = NULL;
 	cpy->symref = xstrdup_or_null(ref->symref);
 	cpy->remote_status = xstrdup_or_null(ref->remote_status);
diff --git a/revision.c b/revision.c
index f24ead5..d4ace60 100644
--- a/revision.c
+++ b/revision.c
@@ -540,7 +540,7 @@ struct treesame_state {
 static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit)
 {
 	unsigned n = commit_list_count(commit->parents);
-	struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+	struct treesame_state *st = xcalloc(1, st_add(sizeof(*st), n));
 	st->nparents = n;
 	add_decoration(&revs->treesame, &commit->object, st);
 	return st;
diff --git a/sha1_file.c b/sha1_file.c
index 2f1c6d3..0251700 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -253,7 +253,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 {
 	struct alternate_object_database *ent;
 	struct alternate_object_database *alt;
-	int pfxlen, entlen;
+	size_t pfxlen, entlen;
 	struct strbuf pathbuf = STRBUF_INIT;
 
 	if (!is_absolute_path(entry) && relative_base) {
@@ -273,8 +273,8 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 	while (pfxlen && pathbuf.buf[pfxlen-1] == '/')
 		pfxlen -= 1;
 
-	entlen = pfxlen + 43; /* '/' + 2 hex + '/' + 38 hex + NUL */
-	ent = xmalloc(sizeof(*ent) + entlen);
+	entlen = st_add(pfxlen, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */
+	ent = xmalloc(st_add(sizeof(*ent), entlen));
 	memcpy(ent->base, pathbuf.buf, pfxlen);
 	strbuf_release(&pathbuf);
 
@@ -1134,7 +1134,7 @@ unsigned char *use_pack(struct packed_git *p,
 
 static struct packed_git *alloc_packed_git(int extra)
 {
-	struct packed_git *p = xmalloc(sizeof(*p) + extra);
+	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
 	memset(p, 0, sizeof(*p));
 	p->pack_fd = -1;
 	return p;
@@ -1168,7 +1168,7 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	 * ".pack" is long enough to hold any suffix we're adding (and
 	 * the use xsnprintf double-checks that)
 	 */
-	alloc = path_len + strlen(".pack") + 1;
+	alloc = st_add3(path_len, strlen(".pack"), 1);
 	p = alloc_packed_git(alloc);
 	memcpy(p->pack_name, path, path_len);
 
@@ -1196,7 +1196,7 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 {
 	const char *path = sha1_pack_name(sha1);
-	int alloc = strlen(path) + 1;
+	size_t alloc = st_add(strlen(path), 1);
 	struct packed_git *p = alloc_packed_git(alloc);
 
 	memcpy(p->pack_name, path, alloc); /* includes NUL */
@@ -1413,10 +1413,12 @@ static void mark_bad_packed_object(struct packed_git *p,
 {
 	unsigned i;
 	for (i = 0; i < p->num_bad_objects; i++)
-		if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
 			return;
-	p->bad_object_sha1 = xrealloc(p->bad_object_sha1, 20 * (p->num_bad_objects + 1));
-	hashcpy(p->bad_object_sha1 + 20 * p->num_bad_objects, sha1);
+	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
+				      st_mult(GIT_SHA1_RAWSZ,
+					      st_add(p->num_bad_objects, 1)));
+	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
 	p->num_bad_objects++;
 }
 
diff --git a/sha1_name.c b/sha1_name.c
index 89918ca..d14346c 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -87,9 +87,8 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa
 		 * object databases including our own.
 		 */
 		const char *objdir = get_object_directory();
-		int objdir_len = strlen(objdir);
-		int entlen = objdir_len + 43;
-		fakeent = xmalloc(sizeof(*fakeent) + entlen);
+		size_t objdir_len = strlen(objdir);
+		fakeent = xmalloc(st_add3(sizeof(*fakeent), objdir_len, 43));
 		memcpy(fakeent->base, objdir, objdir_len);
 		fakeent->name = fakeent->base + objdir_len + 1;
 		fakeent->name[-1] = '/';
diff --git a/shallow.c b/shallow.c
index 71163bf..4d554ca 100644
--- a/shallow.c
+++ b/shallow.c
@@ -389,7 +389,7 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 	unsigned int i, nr;
 	struct commit_list *head = NULL;
 	int bitmap_nr = (info->nr_bits + 31) / 32;
-	int bitmap_size = bitmap_nr * sizeof(uint32_t);
+	size_t bitmap_size = st_mult(bitmap_nr, sizeof(uint32_t));
 	uint32_t *tmp = xmalloc(bitmap_size); /* to be freed before return */
 	uint32_t *bitmap = paint_alloc(info);
 	struct commit *c = lookup_commit_reference_gently(sha1, 1);
diff --git a/submodule.c b/submodule.c
index b83939c..ac61c65 100644
--- a/submodule.c
+++ b/submodule.c
@@ -123,7 +123,7 @@ static int add_submodule_odb(const char *path)
 	struct strbuf objects_directory = STRBUF_INIT;
 	struct alternate_object_database *alt_odb;
 	int ret = 0;
-	int alloc;
+	size_t alloc;
 
 	strbuf_git_path_submodule(&objects_directory, path, "objects/");
 	if (!is_directory(objects_directory.buf)) {
@@ -138,8 +138,8 @@ static int add_submodule_odb(const char *path)
 					objects_directory.len))
 			goto done;
 
-	alloc = objects_directory.len + 42; /* for "12/345..." sha1 */
-	alt_odb = xmalloc(sizeof(*alt_odb) + alloc);
+	alloc = st_add(objects_directory.len, 42); /* for "12/345..." sha1 */
+	alt_odb = xmalloc(st_add(sizeof(*alt_odb), alloc));
 	alt_odb->next = alt_odb_list;
 	xsnprintf(alt_odb->base, alloc, "%s", objects_directory.buf);
 	alt_odb->name = alt_odb->base + objects_directory.len;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 11/21] prepare_{git,shell}_cmd: use argv_array
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (9 preceding siblings ...)
  2016-02-19 11:23   ` [PATCH 10/21] use st_add and st_mult for allocation size computation Jeff King
@ 2016-02-19 11:24   ` Jeff King
  2016-02-19 11:24   ` [PATCH 12/21] write_untracked_extension: use FLEX_ALLOC helper Jeff King
                     ` (10 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:24 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

These functions transform an existing argv into one suitable
for exec-ing or spawning via git or a shell. We can use an
argv_array in each to avoid dealing with manual counting and
allocation.

This also makes the memory allocation more clear and fixes
some leaks. In prepare_shell_cmd, we would sometimes
allocate a new string with "$@" in it and sometimes not,
meaning the caller could not correctly free it. On the
non-Windows side, we are in a child process which will
exec() or exit() immediately, so the leak isn't a big deal.
On Windows, though, we use spawn() from the parent process,
and leak a string for each shell command we run. On top of
that, the Windows code did not free the allocated argv array
at all (but does for the prepare_git_cmd case!).

By switching both of these functions to write into an
argv_array, we can consistently free the result as
appropriate.

Signed-off-by: Jeff King <peff@peff.net>
---
Note that I had to touch the Windows run-command code here, but I don't
actually have a platform to test it on.

 exec_cmd.c    | 28 +++++++++++-----------------
 exec_cmd.h    |  4 +++-
 run-command.c | 60 +++++++++++++++++++++++++----------------------------------
 3 files changed, 39 insertions(+), 53 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index e85f0fd..cf442a9 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "exec_cmd.h"
 #include "quote.h"
+#include "argv-array.h"
 #define MAX_ARGS	32
 
 static const char *argv_exec_path;
@@ -107,32 +108,25 @@ void setup_path(void)
 	strbuf_release(&new_path);
 }
 
-const char **prepare_git_cmd(const char **argv)
+const char **prepare_git_cmd(struct argv_array *out, const char **argv)
 {
-	int argc;
-	const char **nargv;
-
-	for (argc = 0; argv[argc]; argc++)
-		; /* just counting */
-	nargv = xmalloc(sizeof(*nargv) * (argc + 2));
-
-	nargv[0] = "git";
-	for (argc = 0; argv[argc]; argc++)
-		nargv[argc + 1] = argv[argc];
-	nargv[argc + 1] = NULL;
-	return nargv;
+	argv_array_push(out, "git");
+	argv_array_pushv(out, argv);
+	return out->argv;
 }
 
 int execv_git_cmd(const char **argv) {
-	const char **nargv = prepare_git_cmd(argv);
-	trace_argv_printf(nargv, "trace: exec:");
+	struct argv_array nargv = ARGV_ARRAY_INIT;
+
+	prepare_git_cmd(&nargv, argv);
+	trace_argv_printf(nargv.argv, "trace: exec:");
 
 	/* execvp() can only ever return if it fails */
-	sane_execvp("git", (char **)nargv);
+	sane_execvp("git", (char **)nargv.argv);
 
 	trace_printf("trace: exec failed: %s\n", strerror(errno));
 
-	free(nargv);
+	argv_array_clear(&nargv);
 	return -1;
 }
 
diff --git a/exec_cmd.h b/exec_cmd.h
index 93b0c02..1f6b433 100644
--- a/exec_cmd.h
+++ b/exec_cmd.h
@@ -1,11 +1,13 @@
 #ifndef GIT_EXEC_CMD_H
 #define GIT_EXEC_CMD_H
 
+struct argv_array;
+
 extern void git_set_argv_exec_path(const char *exec_path);
 extern const char *git_extract_argv0_path(const char *path);
 extern const char *git_exec_path(void);
 extern void setup_path(void);
-extern const char **prepare_git_cmd(const char **argv);
+extern const char **prepare_git_cmd(struct argv_array *out, const char **argv);
 extern int execv_git_cmd(const char **argv); /* NULL terminated */
 LAST_ARG_MUST_BE_NULL
 extern int execl_git_cmd(const char *cmd, ...);
diff --git a/run-command.c b/run-command.c
index cdf0184..019f6d1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -160,50 +160,41 @@ int sane_execvp(const char *file, char * const argv[])
 	return -1;
 }
 
-static const char **prepare_shell_cmd(const char **argv)
+static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 {
-	int argc, nargc = 0;
-	const char **nargv;
-
-	for (argc = 0; argv[argc]; argc++)
-		; /* just counting */
-	/* +1 for NULL, +3 for "sh -c" plus extra $0 */
-	nargv = xmalloc(sizeof(*nargv) * (argc + 1 + 3));
-
-	if (argc < 1)
+	if (!argv[0])
 		die("BUG: shell command is empty");
 
 	if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0])) {
 #ifndef GIT_WINDOWS_NATIVE
-		nargv[nargc++] = SHELL_PATH;
+		argv_array_push(out, SHELL_PATH);
 #else
-		nargv[nargc++] = "sh";
+		argv_array_push(out, "sh");
 #endif
-		nargv[nargc++] = "-c";
-
-		if (argc < 2)
-			nargv[nargc++] = argv[0];
-		else {
-			struct strbuf arg0 = STRBUF_INIT;
-			strbuf_addf(&arg0, "%s \"$@\"", argv[0]);
-			nargv[nargc++] = strbuf_detach(&arg0, NULL);
-		}
-	}
+		argv_array_push(out, "-c");
 
-	for (argc = 0; argv[argc]; argc++)
-		nargv[nargc++] = argv[argc];
-	nargv[nargc] = NULL;
+		/*
+		 * If we have no extra arguments, we do not even need to
+		 * bother with the "$@" magic.
+		 */
+		if (!argv[1])
+			argv_array_push(out, argv[0]);
+		else
+			argv_array_pushf(out, "%s \"$@\"", argv[0]);
+	}
 
-	return nargv;
+	argv_array_pushv(out, argv);
+	return out->argv;
 }
 
 #ifndef GIT_WINDOWS_NATIVE
 static int execv_shell_cmd(const char **argv)
 {
-	const char **nargv = prepare_shell_cmd(argv);
-	trace_argv_printf(nargv, "trace: exec:");
-	sane_execvp(nargv[0], (char **)nargv);
-	free(nargv);
+	struct argv_array nargv = ARGV_ARRAY_INIT;
+	prepare_shell_cmd(&nargv, argv);
+	trace_argv_printf(nargv.argv, "trace: exec:");
+	sane_execvp(nargv.argv[0], (char **)nargv.argv);
+	argv_array_clear(&nargv);
 	return -1;
 }
 #endif
@@ -457,6 +448,7 @@ fail_pipe:
 {
 	int fhin = 0, fhout = 1, fherr = 2;
 	const char **sargv = cmd->argv;
+	struct argv_array nargv = ARGV_ARRAY_INIT;
 
 	if (cmd->no_stdin)
 		fhin = open("/dev/null", O_RDWR);
@@ -482,9 +474,9 @@ fail_pipe:
 		fhout = dup(cmd->out);
 
 	if (cmd->git_cmd)
-		cmd->argv = prepare_git_cmd(cmd->argv);
+		cmd->argv = prepare_git_cmd(&nargv, cmd->argv);
 	else if (cmd->use_shell)
-		cmd->argv = prepare_shell_cmd(cmd->argv);
+		cmd->argv = prepare_shell_cmd(&nargv, cmd->argv);
 
 	cmd->pid = mingw_spawnvpe(cmd->argv[0], cmd->argv, (char**) cmd->env,
 			cmd->dir, fhin, fhout, fherr);
@@ -494,9 +486,7 @@ fail_pipe:
 	if (cmd->clean_on_exit && cmd->pid >= 0)
 		mark_child_for_cleanup(cmd->pid);
 
-	if (cmd->git_cmd)
-		free(cmd->argv);
-
+	argv_array_clear(&nargv);
 	cmd->argv = sargv;
 	if (fhin != 0)
 		close(fhin);
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 12/21] write_untracked_extension: use FLEX_ALLOC helper
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (10 preceding siblings ...)
  2016-02-19 11:24   ` [PATCH 11/21] prepare_{git,shell}_cmd: use argv_array Jeff King
@ 2016-02-19 11:24   ` Jeff King
  2016-02-19 11:24   ` [PATCH 13/21] fast-import: simplify allocation in start_packfile Jeff King
                     ` (9 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:24 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.

Note that there's one subtlety here, though.  We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.

As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.

Signed-off-by: Jeff King <peff@peff.net>
---
 dir.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 2c91541..a4a9d9f 100644
--- a/dir.c
+++ b/dir.c
@@ -2360,16 +2360,15 @@ void write_untracked_extension(struct strbuf *out, struct untracked_cache *untra
 	struct ondisk_untracked_cache *ouc;
 	struct write_data wd;
 	unsigned char varbuf[16];
-	int len = 0, varint_len;
-	if (untracked->exclude_per_dir)
-		len = strlen(untracked->exclude_per_dir);
-	ouc = xmalloc(sizeof(*ouc) + len + 1);
+	int varint_len;
+	size_t len = strlen(untracked->exclude_per_dir);
+
+	FLEX_ALLOC_MEM(ouc, exclude_per_dir, untracked->exclude_per_dir, len);
 	stat_data_to_disk(&ouc->info_exclude_stat, &untracked->ss_info_exclude.stat);
 	stat_data_to_disk(&ouc->excludes_file_stat, &untracked->ss_excludes_file.stat);
 	hashcpy(ouc->info_exclude_sha1, untracked->ss_info_exclude.sha1);
 	hashcpy(ouc->excludes_file_sha1, untracked->ss_excludes_file.sha1);
 	ouc->dir_flags = htonl(untracked->dir_flags);
-	memcpy(ouc->exclude_per_dir, untracked->exclude_per_dir, len + 1);
 
 	varint_len = encode_varint(untracked->ident.len, varbuf);
 	strbuf_add(out, varbuf, varint_len);
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 13/21] fast-import: simplify allocation in start_packfile
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (11 preceding siblings ...)
  2016-02-19 11:24   ` [PATCH 12/21] write_untracked_extension: use FLEX_ALLOC helper Jeff King
@ 2016-02-19 11:24   ` Jeff King
  2016-02-19 17:48     ` Junio C Hamano
  2016-02-19 11:24   ` [PATCH 14/21] fetch-pack: simplify add_sought_entry Jeff King
                     ` (8 subsequent siblings)
  21 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:24 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

This function allocate a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needed to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).

This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.

Signed-off-by: Jeff King <peff@peff.net>
---
 fast-import.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 3053bb8..9fc7093 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -865,15 +865,12 @@ static void start_packfile(void)
 {
 	static char tmp_file[PATH_MAX];
 	struct packed_git *p;
-	int namelen;
 	struct pack_header hdr;
 	int pack_fd;
 
 	pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
 			      "pack/tmp_pack_XXXXXX");
-	namelen = strlen(tmp_file) + 2;
-	p = xcalloc(1, sizeof(*p) + namelen);
-	xsnprintf(p->pack_name, namelen, "%s", tmp_file);
+	FLEX_ALLOC_STR(p, pack_name, tmp_file);
 	p->pack_fd = pack_fd;
 	p->do_not_close = 1;
 	pack_file = sha1fd(pack_fd, p->pack_name);
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 14/21] fetch-pack: simplify add_sought_entry
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (12 preceding siblings ...)
  2016-02-19 11:24   ` [PATCH 13/21] fast-import: simplify allocation in start_packfile Jeff King
@ 2016-02-19 11:24   ` Jeff King
  2016-02-19 11:24   ` [PATCH 15/21] test-path-utils: fix normalize_path_copy output buffer size Jeff King
                     ` (7 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:24 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We have two variants of this function, one that takes a
string and one that takes a ptr/len combo. But we only call
the latter with the length of a NUL-terminated string, so
our first simplification is to drop it in favor of the
string variant.

Since we know we have a string, we can also replace the
manual memory computation with a call to alloc_ref().

Furthermore, we can rely on get_oid_hex() to complain if it
hits the end of the string. That means we can simplify the
check for "<sha1> <ref>" versus just "<ref>". Rather than
manage the ptr/len pair, we can just bump the start of our
string forward. The original code over-allocated based on
the original "namelen" (which wasn't _wrong_, but was simply
wasteful and confusing).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/fetch-pack.c | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9b2a514..79a611f 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -10,33 +10,24 @@ static const char fetch_pack_usage[] =
 "[--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] "
 "[--no-progress] [--diag-url] [-v] [<host>:]<directory> [<refs>...]";
 
-static void add_sought_entry_mem(struct ref ***sought, int *nr, int *alloc,
-				 const char *name, int namelen)
+static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
+			     const char *name)
 {
-	struct ref *ref = xcalloc(1, sizeof(*ref) + namelen + 1);
+	struct ref *ref;
 	struct object_id oid;
-	const int chunksz = GIT_SHA1_HEXSZ + 1;
 
-	if (namelen > chunksz && name[chunksz - 1] == ' ' &&
-		!get_oid_hex(name, &oid)) {
-		oidcpy(&ref->old_oid, &oid);
-		name += chunksz;
-		namelen -= chunksz;
-	}
+	if (!get_oid_hex(name, &oid) && name[GIT_SHA1_HEXSZ] == ' ')
+		name += GIT_SHA1_HEXSZ + 1;
+	else
+		oidclr(&oid);
 
-	memcpy(ref->name, name, namelen);
-	ref->name[namelen] = '\0';
+	ref = alloc_ref(name);
+	oidcpy(&ref->old_oid, &oid);
 	(*nr)++;
 	ALLOC_GROW(*sought, *nr, *alloc);
 	(*sought)[*nr - 1] = ref;
 }
 
-static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
-			     const char *string)
-{
-	add_sought_entry_mem(sought, nr, alloc, string, strlen(string));
-}
-
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, ret;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 15/21] test-path-utils: fix normalize_path_copy output buffer size
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (13 preceding siblings ...)
  2016-02-19 11:24   ` [PATCH 14/21] fetch-pack: simplify add_sought_entry Jeff King
@ 2016-02-19 11:24   ` Jeff King
  2016-02-19 11:25   ` [PATCH 16/21] sequencer: simplify memory allocation of get_message Jeff King
                     ` (6 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:24 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

The normalize_path_copy function needs an output buffer that
is at least as long as its input (it may shrink the path,
but never expand it). However, this test program feeds it
static PATH_MAX-sized buffers, which have no relation to the
input size.

In the normalize_ceiling_entry case, we do at least check
the size against PATH_MAX and die(), but that case is even
more convoluted. We normalize into a fixed-size buffer, free
the original, and then replace it with a strdup'd copy of
the result. But normalize_path_copy explicitly allows
normalizing in-place, so we can simply do that.

Signed-off-by: Jeff King <peff@peff.net>
---
 test-path-utils.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/test-path-utils.c b/test-path-utils.c
index 6232dfe..ba805b3 100644
--- a/test-path-utils.c
+++ b/test-path-utils.c
@@ -8,21 +8,14 @@
  */
 static int normalize_ceiling_entry(struct string_list_item *item, void *unused)
 {
-	const char *ceil = item->string;
-	int len = strlen(ceil);
-	char buf[PATH_MAX+1];
+	char *ceil = item->string;
 
-	if (len == 0)
+	if (!*ceil)
 		die("Empty path is not supported");
-	if (len > PATH_MAX)
-		die("Path \"%s\" is too long", ceil);
 	if (!is_absolute_path(ceil))
 		die("Path \"%s\" is not absolute", ceil);
-	if (normalize_path_copy(buf, ceil) < 0)
+	if (normalize_path_copy(ceil, ceil) < 0)
 		die("Path \"%s\" could not be normalized", ceil);
-	len = strlen(buf);
-	free(item->string);
-	item->string = xstrdup(buf);
 	return 1;
 }
 
@@ -166,7 +159,7 @@ static struct test_data dirname_data[] = {
 int main(int argc, char **argv)
 {
 	if (argc == 3 && !strcmp(argv[1], "normalize_path_copy")) {
-		char *buf = xmalloc(PATH_MAX + 1);
+		char *buf = xmallocz(strlen(argv[2]));
 		int rv = normalize_path_copy(buf, argv[2]);
 		if (rv)
 			buf = "++failed++";
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 16/21] sequencer: simplify memory allocation of get_message
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (14 preceding siblings ...)
  2016-02-19 11:24   ` [PATCH 15/21] test-path-utils: fix normalize_path_copy output buffer size Jeff King
@ 2016-02-19 11:25   ` Jeff King
  2016-02-19 11:25   ` [PATCH 17/21] git-compat-util: drop mempcpy compat code Jeff King
                     ` (5 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:25 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

For a commit with sha1 "1234abcd" and subject "foo", this
function produces a struct with three strings:

 1. "foo"

 2. "1234abcd... foo"

 3. "parent of 1234abcd... foo"

It takes advantage of the fact that these strings are
subsets of each other, and allocates only _one_ string, with
pointers into the various parts. Unfortunately, this makes
the string allocation complicated and hard to follow.

Since we keep only one of these in memory at a time, we can
afford to simply allocate three strings. This lets us build
on tools like xstrfmt and avoid manual computation.

While we're here, we can also drop the ad-hoc
reimplementation of get_git_commit_encoding(), and simply
call that function.

Signed-off-by: Jeff King <peff@peff.net>
---
 sequencer.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 8048786..e66f2fe 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -124,42 +124,33 @@ static const char *action_name(const struct replay_opts *opts)
 
 struct commit_message {
 	char *parent_label;
-	const char *label;
-	const char *subject;
+	char *label;
+	char *subject;
 	const char *message;
 };
 
 static int get_message(struct commit *commit, struct commit_message *out)
 {
 	const char *abbrev, *subject;
-	int abbrev_len, subject_len;
-	char *q;
-
-	if (!git_commit_encoding)
-		git_commit_encoding = "UTF-8";
+	int subject_len;
 
-	out->message = logmsg_reencode(commit, NULL, git_commit_encoding);
+	out->message = logmsg_reencode(commit, NULL, get_commit_output_encoding());
 	abbrev = find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV);
-	abbrev_len = strlen(abbrev);
 
 	subject_len = find_commit_subject(out->message, &subject);
 
-	out->parent_label = xmalloc(strlen("parent of ") + abbrev_len +
-			      strlen("... ") + subject_len + 1);
-	q = out->parent_label;
-	q = mempcpy(q, "parent of ", strlen("parent of "));
-	out->label = q;
-	q = mempcpy(q, abbrev, abbrev_len);
-	q = mempcpy(q, "... ", strlen("... "));
-	out->subject = q;
-	q = mempcpy(q, subject, subject_len);
-	*q = '\0';
+	out->subject = xmemdupz(subject, subject_len);
+	out->label = xstrfmt("%s... %s", abbrev, out->subject);
+	out->parent_label = xstrfmt("parent of %s", out->label);
+
 	return 0;
 }
 
 static void free_message(struct commit *commit, struct commit_message *msg)
 {
 	free(msg->parent_label);
+	free(msg->label);
+	free(msg->subject);
 	unuse_commit_buffer(commit, msg->message);
 }
 
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 17/21] git-compat-util: drop mempcpy compat code
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (15 preceding siblings ...)
  2016-02-19 11:25   ` [PATCH 16/21] sequencer: simplify memory allocation of get_message Jeff King
@ 2016-02-19 11:25   ` Jeff King
  2016-02-19 11:25   ` [PATCH 18/21] transport_anonymize_url: use xstrfmt Jeff King
                     ` (4 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:25 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

There are no callers of this left, as the last one was
dropped in the previous patch. And there are not likely to
be new ones, as the function has been around since 2010
without gaining any new callers.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index e71e615..8f0e76b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -681,7 +681,6 @@ extern int git_vsnprintf(char *str, size_t maxsize,
 #ifdef __GLIBC_PREREQ
 #if __GLIBC_PREREQ(2, 1)
 #define HAVE_STRCHRNUL
-#define HAVE_MEMPCPY
 #endif
 #endif
 
@@ -695,14 +694,6 @@ static inline char *gitstrchrnul(const char *s, int c)
 }
 #endif
 
-#ifndef HAVE_MEMPCPY
-#define mempcpy gitmempcpy
-static inline void *gitmempcpy(void *dest, const void *src, size_t n)
-{
-	return (char *)memcpy(dest, src, n) + n;
-}
-#endif
-
 #ifdef NO_INET_PTON
 int inet_pton(int af, const char *src, void *dst);
 #endif
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 18/21] transport_anonymize_url: use xstrfmt
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (16 preceding siblings ...)
  2016-02-19 11:25   ` [PATCH 17/21] git-compat-util: drop mempcpy compat code Jeff King
@ 2016-02-19 11:25   ` Jeff King
  2016-02-19 11:25   ` [PATCH 19/21] diff_populate_gitlink: use a strbuf Jeff King
                     ` (3 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:25 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/transport.c b/transport.c
index 3b4e644..ac857a9 100644
--- a/transport.c
+++ b/transport.c
@@ -1021,7 +1021,7 @@ int transport_disconnect(struct transport *transport)
  */
 char *transport_anonymize_url(const char *url)
 {
-	char *anon_url, *scheme_prefix, *anon_part;
+	char *scheme_prefix, *anon_part;
 	size_t anon_len, prefix_len = 0;
 
 	anon_part = strchr(url, '@');
@@ -1055,10 +1055,8 @@ char *transport_anonymize_url(const char *url)
 			goto literal_copy;
 		prefix_len = scheme_prefix - url + 3;
 	}
-	anon_url = xcalloc(1, 1 + prefix_len + anon_len);
-	memcpy(anon_url, url, prefix_len);
-	memcpy(anon_url + prefix_len, anon_part, anon_len);
-	return anon_url;
+	return xstrfmt("%.*s%.*s", (int)prefix_len, url,
+		       (int)anon_len, anon_part);
 literal_copy:
 	return xstrdup(url);
 }
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 19/21] diff_populate_gitlink: use a strbuf
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (17 preceding siblings ...)
  2016-02-19 11:25   ` [PATCH 18/21] transport_anonymize_url: use xstrfmt Jeff King
@ 2016-02-19 11:25   ` Jeff King
  2016-02-19 11:25   ` [PATCH 20/21] convert ewah/bitmap code to use xmalloc Jeff King
                     ` (2 subsequent siblings)
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:25 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We allocate 100 bytes to hold the "Submodule commit ..."
text. This is enough, but it's not immediately obvious that
this is the case, and we have to repeat the magic 100 twice.

We could get away with xstrfmt here, but we want to know the
size, as well, so let's use a real strbuf. And while we're
here, we can clean up the logic around size_only. It
currently sets and clears the "data" field pointlessly, and
leaves the "should_free" flag on even after we have cleared
the data.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 27d14a7..a70ec6e 100644
--- a/diff.c
+++ b/diff.c
@@ -2704,21 +2704,21 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
 
 static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 {
-	int len;
-	char *data = xmalloc(100), *dirty = "";
+	struct strbuf buf = STRBUF_INIT;
+	char *dirty = "";
 
 	/* Are we looking at the work tree? */
 	if (s->dirty_submodule)
 		dirty = "-dirty";
 
-	len = snprintf(data, 100,
-		       "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
-	s->data = data;
-	s->size = len;
-	s->should_free = 1;
+	strbuf_addf(&buf, "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
+	s->size = buf.len;
 	if (size_only) {
 		s->data = NULL;
-		free(data);
+		strbuf_release(&buf);
+	} else {
+		s->data = strbuf_detach(&buf, NULL);
+		s->should_free = 1;
 	}
 	return 0;
 }
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 20/21] convert ewah/bitmap code to use xmalloc
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (18 preceding siblings ...)
  2016-02-19 11:25   ` [PATCH 19/21] diff_populate_gitlink: use a strbuf Jeff King
@ 2016-02-19 11:25   ` Jeff King
  2016-02-19 11:25   ` [PATCH 21/21] ewah: convert to REALLOC_ARRAY, etc Jeff King
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:25 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.

We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.

Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.

Signed-off-by: Jeff King <peff@peff.net>
---
 ewah/bitmap.c      | 12 ++++++------
 ewah/ewah_bitmap.c |  9 +++------
 ewah/ewah_io.c     | 10 ++--------
 ewah/ewok.h        | 10 ----------
 4 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 47ad674..c88daa0 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -25,8 +25,8 @@
 
 struct bitmap *bitmap_new(void)
 {
-	struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
-	bitmap->words = ewah_calloc(32, sizeof(eword_t));
+	struct bitmap *bitmap = xmalloc(sizeof(struct bitmap));
+	bitmap->words = xcalloc(32, sizeof(eword_t));
 	bitmap->word_alloc = 32;
 	return bitmap;
 }
@@ -38,8 +38,8 @@ void bitmap_set(struct bitmap *self, size_t pos)
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
 		self->word_alloc = block * 2;
-		self->words = ewah_realloc(self->words,
-			self->word_alloc * sizeof(eword_t));
+		self->words = xrealloc(self->words,
+				      self->word_alloc * sizeof(eword_t));
 
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
@@ -102,7 +102,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
 	while (ewah_iterator_next(&blowup, &it)) {
 		if (i >= bitmap->word_alloc) {
 			bitmap->word_alloc *= 1.5;
-			bitmap->words = ewah_realloc(
+			bitmap->words = xrealloc(
 				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
 		}
 
@@ -134,7 +134,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
-		self->words = ewah_realloc(self->words,
+		self->words = xrealloc(self->words,
 			self->word_alloc * sizeof(eword_t));
 		memset(self->words + original_size, 0x0,
 			(self->word_alloc - original_size) * sizeof(eword_t));
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index b522437..fcd465e 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,7 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 		return;
 
 	self->alloc_size = new_size;
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
@@ -282,11 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 {
 	struct ewah_bitmap *self;
 
-	self = ewah_malloc(sizeof(struct ewah_bitmap));
-	if (self == NULL)
-		return NULL;
-
-	self->buffer = ewah_malloc(32 * sizeof(eword_t));
+	self = xmalloc(sizeof(struct ewah_bitmap));
+	self->buffer = xmalloc(32 * sizeof(eword_t));
 	self->alloc_size = 32;
 
 	ewah_clear(self);
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 43481b9..4acff97 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,12 +134,9 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 	self->buffer_size = self->alloc_size = get_be32(ptr);
 	ptr += sizeof(uint32_t);
 
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 
-	if (!self->buffer)
-		return -1;
-
 	/*
 	 * Copy the raw data for the bitmap as a whole chunk;
 	 * if we're in a little-endian platform, we'll perform
@@ -180,12 +177,9 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
 		return -1;
 
 	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 
-	if (!self->buffer)
-		return -1;
-
 	/** 64 bit x N -- compressed words */
 	buffer = self->buffer;
 	words_left = self->buffer_size;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 6e2c5e1..269a1a8 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -20,16 +20,6 @@
 #ifndef __EWOK_BITMAP_H__
 #define __EWOK_BITMAP_H__
 
-#ifndef ewah_malloc
-#	define ewah_malloc xmalloc
-#endif
-#ifndef ewah_realloc
-#	define ewah_realloc xrealloc
-#endif
-#ifndef ewah_calloc
-#	define ewah_calloc xcalloc
-#endif
-
 struct strbuf;
 typedef uint64_t eword_t;
 #define BITS_IN_EWORD (sizeof(eword_t) * 8)
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH 21/21] ewah: convert to REALLOC_ARRAY, etc
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (19 preceding siblings ...)
  2016-02-19 11:25   ` [PATCH 20/21] convert ewah/bitmap code to use xmalloc Jeff King
@ 2016-02-19 11:25   ` Jeff King
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
  21 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 11:25 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.

Signed-off-by: Jeff King <peff@peff.net>
---
 ewah/bitmap.c      | 16 ++++------------
 ewah/ewah_bitmap.c |  5 ++---
 ewah/ewah_io.c     |  6 ++----
 3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index c88daa0..7103cee 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -17,7 +17,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
-#include "git-compat-util.h"
+#include "cache.h"
 #include "ewok.h"
 
 #define EWAH_MASK(x) ((eword_t)1 << (x % BITS_IN_EWORD))
@@ -38,9 +38,7 @@ void bitmap_set(struct bitmap *self, size_t pos)
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
 		self->word_alloc = block * 2;
-		self->words = xrealloc(self->words,
-				      self->word_alloc * sizeof(eword_t));
-
+		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
 	}
@@ -100,12 +98,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
 	ewah_iterator_init(&it, ewah);
 
 	while (ewah_iterator_next(&blowup, &it)) {
-		if (i >= bitmap->word_alloc) {
-			bitmap->word_alloc *= 1.5;
-			bitmap->words = xrealloc(
-				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
-		}
-
+		ALLOC_GROW(bitmap->words, i + 1, bitmap->word_alloc);
 		bitmap->words[i++] = blowup;
 	}
 
@@ -134,8 +127,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
-		self->words = xrealloc(self->words,
-			self->word_alloc * sizeof(eword_t));
+		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + original_size, 0x0,
 			(self->word_alloc - original_size) * sizeof(eword_t));
 	}
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index fcd465e..2dc9c82 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,8 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 		return;
 
 	self->alloc_size = new_size;
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
 
@@ -283,8 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 	struct ewah_bitmap *self;
 
 	self = xmalloc(sizeof(struct ewah_bitmap));
-	self->buffer = xmalloc(32 * sizeof(eword_t));
 	self->alloc_size = 32;
+	ALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	ewah_clear(self);
 	return self;
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 4acff97..61f6a43 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,8 +134,7 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 	self->buffer_size = self->alloc_size = get_be32(ptr);
 	ptr += sizeof(uint32_t);
 
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	/*
 	 * Copy the raw data for the bitmap as a whole chunk;
@@ -177,8 +176,7 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
 		return -1;
 
 	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	/** 64 bit x N -- compressed words */
 	buffer = self->buffer;
-- 
2.7.1.577.gfed91b8

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 13/21] fast-import: simplify allocation in start_packfile
  2016-02-19 11:24   ` [PATCH 13/21] fast-import: simplify allocation in start_packfile Jeff King
@ 2016-02-19 17:48     ` Junio C Hamano
  2016-02-19 19:12       ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Junio C Hamano @ 2016-02-19 17:48 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Eric Sunshine

Jeff King <peff@peff.net> writes:

> This function allocate a packed_git flex-array, and adds a

s/allocate/&s/, right?

> mysterious 2 bytes to the length of the pack_name field. One
> is for the trailing NUL, but the other has no purpose. This
> is probably cargo-culted from add_packed_git, which gets the
> ".idx" path and needed to allocate enough space to hold the
> matching ".pack" (though since 48bcc1c, we calculate the
> size there differently).
>
> This site, however, is using the raw path of a tempfile, and
> does not need the extra byte. We can just replace the
> allocation with FLEX_ALLOC_STR, which handles the allocation
> and the NUL for us.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  fast-import.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fast-import.c b/fast-import.c
> index 3053bb8..9fc7093 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -865,15 +865,12 @@ static void start_packfile(void)
>  {
>  	static char tmp_file[PATH_MAX];
>  	struct packed_git *p;
> -	int namelen;
>  	struct pack_header hdr;
>  	int pack_fd;
>  
>  	pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
>  			      "pack/tmp_pack_XXXXXX");
> -	namelen = strlen(tmp_file) + 2;
> -	p = xcalloc(1, sizeof(*p) + namelen);
> -	xsnprintf(p->pack_name, namelen, "%s", tmp_file);
> +	FLEX_ALLOC_STR(p, pack_name, tmp_file);
>  	p->pack_fd = pack_fd;
>  	p->do_not_close = 1;
>  	pack_file = sha1fd(pack_fd, p->pack_name);

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 13/21] fast-import: simplify allocation in start_packfile
  2016-02-19 17:48     ` Junio C Hamano
@ 2016-02-19 19:12       ` Jeff King
  0 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-19 19:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Eric Sunshine

On Fri, Feb 19, 2016 at 09:48:54AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > This function allocate a packed_git flex-array, and adds a
> 
> s/allocate/&s/, right?

Yes, thanks.

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-19 11:23   ` [PATCH 06/21] convert manual allocations to argv_array Jeff King
@ 2016-02-20  8:07     ` Eric Sunshine
  2016-02-20  8:10       ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-20  8:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Junio C Hamano

On Fri, Feb 19, 2016 at 6:23 AM, Jeff King <peff@peff.net> wrote:
> There are many manual argv allocations that predate the
> argv_array API. Switching to that API brings a few
> advantages:
>
>   1. We no longer have to manually compute the correct final
>      array size (so it's one less thing we can screw up).
>
>   2. In many cases we had to make a separate pass to count,
>      then allocate, then fill in the array. Now we can do it
>      in one pass, making the code shorter and easier to
>      follow.
>
>   3. argv_array handles memory ownership for us, making it
>      more obvious when things should be free()d and and when
>      not.
>
> Most of these cases are pretty straightforward. In some, we
> switch from "run_command_v" to "run_command" which lets us
> directly use the argv_array embedded in "struct
> child_process".
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/line-log.c b/line-log.c
> @@ -746,23 +747,16 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
>         if (!rev->diffopt.detect_rename) {
> -               int i, count = 0;
> -               struct line_log_data *r = range;
> -               const char **paths;
> -               while (r) {
> -                       count++;
> -                       r = r->next;
> -               }
> -               paths = xmalloc((count+1)*sizeof(char *));
> -               r = range;
> -               for (i = 0; i < count; i++) {
> -                       paths[i] = xstrdup(r->path);
> -                       r = r->next;
> -               }
> -               paths[count] = NULL;
> +               struct line_log_data *r;
> +               struct argv_array paths = ARGV_ARRAY_INIT;
> +
> +               for (r = range; r; r = r->next)
> +                       argv_array_push(&paths, r->path);
>                 parse_pathspec(&rev->diffopt.pathspec, 0,
> -                              PATHSPEC_PREFER_FULL, "", paths);
> -               free(paths);
> +                              PATHSPEC_PREFER_FULL, "", paths.argv);
> +               /* argv strings are now owned by pathspec */
> +               paths.argc = 0;
> +               argv_array_clear(&paths);

This overly intimate knowledge of the internal implementation of
argv_array_clear() is rather ugly.

>         }
>  }

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-20  8:07     ` Eric Sunshine
@ 2016-02-20  8:10       ` Jeff King
  2016-02-20  8:29         ` Eric Sunshine
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-20  8:10 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Junio C Hamano

On Sat, Feb 20, 2016 at 03:07:00AM -0500, Eric Sunshine wrote:

> > diff --git a/line-log.c b/line-log.c
> > @@ -746,23 +747,16 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
> >         if (!rev->diffopt.detect_rename) {
> > -               int i, count = 0;
> > -               struct line_log_data *r = range;
> > -               const char **paths;
> > -               while (r) {
> > -                       count++;
> > -                       r = r->next;
> > -               }
> > -               paths = xmalloc((count+1)*sizeof(char *));
> > -               r = range;
> > -               for (i = 0; i < count; i++) {
> > -                       paths[i] = xstrdup(r->path);
> > -                       r = r->next;
> > -               }
> > -               paths[count] = NULL;
> > +               struct line_log_data *r;
> > +               struct argv_array paths = ARGV_ARRAY_INIT;
> > +
> > +               for (r = range; r; r = r->next)
> > +                       argv_array_push(&paths, r->path);
> >                 parse_pathspec(&rev->diffopt.pathspec, 0,
> > -                              PATHSPEC_PREFER_FULL, "", paths);
> > -               free(paths);
> > +                              PATHSPEC_PREFER_FULL, "", paths.argv);
> > +               /* argv strings are now owned by pathspec */
> > +               paths.argc = 0;
> > +               argv_array_clear(&paths);
> 
> This overly intimate knowledge of the internal implementation of
> argv_array_clear() is rather ugly.

Yep, I agree. Suggestions?

We can just leak the array of "char *". This function is called only
once per program invocation, and that's unlikely to change.

I guess we can make an argv_array_detach_strings() function. Or maybe
even just argv_array_detach() would be less gross, and then this
function could manually free the array but not the strings themselves.

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-20  8:10       ` Jeff King
@ 2016-02-20  8:29         ` Eric Sunshine
  2016-02-20  8:34           ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-20  8:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Junio C Hamano

On Sat, Feb 20, 2016 at 3:10 AM, Jeff King <peff@peff.net> wrote:
> On Sat, Feb 20, 2016 at 03:07:00AM -0500, Eric Sunshine wrote:
>> > +               /* argv strings are now owned by pathspec */
>> > +               paths.argc = 0;
>> > +               argv_array_clear(&paths);
>>
>> This overly intimate knowledge of the internal implementation of
>> argv_array_clear() is rather ugly.
>
> Yep, I agree. Suggestions?
>
> We can just leak the array of "char *". This function is called only
> once per program invocation, and that's unlikely to change.
>
> I guess we can make an argv_array_detach_strings() function. Or maybe
> even just argv_array_detach() would be less gross, and then this
> function could manually free the array but not the strings themselves.

The latter is what I was thinking, and I agree that
argv_array_detach() would be less gross than
argv_array_detach_strings(), however, it also feels a bit wrong since
this sort of ownership transfer is kind of out of scope for
argv_array.

I wonder if a simple "dup'ing" string_list would be more suitable for
this case. You'd have to append the NULL item manually with
string_list_append_nodup(), and string_list_clear() would then be the
correct way to dispose of the list without intimate knowledge of its
implementation and no need for an API extension.

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-20  8:29         ` Eric Sunshine
@ 2016-02-20  8:34           ` Jeff King
  2016-02-20  8:39             ` Eric Sunshine
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-20  8:34 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Junio C Hamano

On Sat, Feb 20, 2016 at 03:29:29AM -0500, Eric Sunshine wrote:

> On Sat, Feb 20, 2016 at 3:10 AM, Jeff King <peff@peff.net> wrote:
> > On Sat, Feb 20, 2016 at 03:07:00AM -0500, Eric Sunshine wrote:
> >> > +               /* argv strings are now owned by pathspec */
> >> > +               paths.argc = 0;
> >> > +               argv_array_clear(&paths);
> >>
> >> This overly intimate knowledge of the internal implementation of
> >> argv_array_clear() is rather ugly.
> >
> > Yep, I agree. Suggestions?
> >
> > We can just leak the array of "char *". This function is called only
> > once per program invocation, and that's unlikely to change.
> >
> > I guess we can make an argv_array_detach_strings() function. Or maybe
> > even just argv_array_detach() would be less gross, and then this
> > function could manually free the array but not the strings themselves.
> 
> The latter is what I was thinking, and I agree that
> argv_array_detach() would be less gross than
> argv_array_detach_strings(), however, it also feels a bit wrong since
> this sort of ownership transfer is kind of out of scope for
> argv_array.
> 
> I wonder if a simple "dup'ing" string_list would be more suitable for
> this case. You'd have to append the NULL item manually with
> string_list_append_nodup(), and string_list_clear() would then be the
> correct way to dispose of the list without intimate knowledge of its
> implementation and no need for an API extension.

A string_list doesn't just store pointers; it's a struct with a util
field. So you can't pass it to things expecting a "const char **".

I think argv_array_detach() is the least-bad thing here. It matches
strbuf_detach() to say "you now own the storage" (as opposed to just
peeking at argv.argv, which we should do only in a read-only way).

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-20  8:34           ` Jeff King
@ 2016-02-20  8:39             ` Eric Sunshine
  2016-02-20  8:57               ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: Eric Sunshine @ 2016-02-20  8:39 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Junio C Hamano

On Sat, Feb 20, 2016 at 3:34 AM, Jeff King <peff@peff.net> wrote:
> On Sat, Feb 20, 2016 at 03:29:29AM -0500, Eric Sunshine wrote:
>> On Sat, Feb 20, 2016 at 3:10 AM, Jeff King <peff@peff.net> wrote:
>> > On Sat, Feb 20, 2016 at 03:07:00AM -0500, Eric Sunshine wrote:
>> >> > +               /* argv strings are now owned by pathspec */
>> >> > +               paths.argc = 0;
>> >> > +               argv_array_clear(&paths);
>> >>
>> >> This overly intimate knowledge of the internal implementation of
>> >> argv_array_clear() is rather ugly.
>> >
>> > Yep, I agree. Suggestions?
>> > [...]
>> >
>> > I guess we can make an argv_array_detach_strings() function. Or maybe
>> > even just argv_array_detach() would be less gross, and then this
>> > function could manually free the array but not the strings themselves.
>>
>> [...]
>> I wonder if a simple "dup'ing" string_list would be more suitable for
>> this case. You'd have to append the NULL item manually with
>> string_list_append_nodup(), and string_list_clear() would then be the
>> correct way to dispose of the list without intimate knowledge of its
>> implementation and no need for an API extension.
>
> A string_list doesn't just store pointers; it's a struct with a util
> field. So you can't pass it to things expecting a "const char **".

Yep, I knew that but wasn't thinking straight.

> I think argv_array_detach() is the least-bad thing here. It matches
> strbuf_detach() to say "you now own the storage" (as opposed to just
> peeking at argv.argv, which we should do only in a read-only way).

I also had made the strbuf_detach() analogy in my response but deleted
it before sending; I do think it's a reasonable API template to mirror
via new argv_array_detach().

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-20  8:39             ` Eric Sunshine
@ 2016-02-20  8:57               ` Jeff King
  2016-02-20  9:04                 ` Eric Sunshine
  0 siblings, 1 reply; 93+ messages in thread
From: Jeff King @ 2016-02-20  8:57 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Junio C Hamano

On Sat, Feb 20, 2016 at 03:39:36AM -0500, Eric Sunshine wrote:

> I also had made the strbuf_detach() analogy in my response but deleted
> it before sending; I do think it's a reasonable API template to mirror
> via new argv_array_detach().

That would look like this, which I think is not too bad (on top of my
series for now; I'd do the API function as a separate patch at the
beginning and then use it immediately).

diff --git a/argv-array.c b/argv-array.c
index eaed477..5d370fa 100644
--- a/argv-array.c
+++ b/argv-array.c
@@ -74,3 +74,14 @@ void argv_array_clear(struct argv_array *array)
 	}
 	argv_array_init(array);
 }
+
+const char **argv_array_detach(struct argv_array *array)
+{
+	if (array->argv == empty_argv)
+		return xcalloc(1, sizeof(const char *));
+	else {
+		const char **ret = array->argv;
+		argv_array_init(array);
+		return ret;
+	}
+}
diff --git a/argv-array.h b/argv-array.h
index a2fa0aa..29056e4 100644
--- a/argv-array.h
+++ b/argv-array.h
@@ -20,5 +20,6 @@ void argv_array_pushl(struct argv_array *, ...);
 void argv_array_pushv(struct argv_array *, const char **);
 void argv_array_pop(struct argv_array *);
 void argv_array_clear(struct argv_array *);
+const char **argv_array_detach(struct argv_array *);
 
 #endif /* ARGV_ARRAY_H */
diff --git a/line-log.c b/line-log.c
index fa095b9..bbe31ed 100644
--- a/line-log.c
+++ b/line-log.c
@@ -748,15 +748,17 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
 
 	if (!rev->diffopt.detect_rename) {
 		struct line_log_data *r;
-		struct argv_array paths = ARGV_ARRAY_INIT;
+		struct argv_array array = ARGV_ARRAY_INIT;
+		const char **paths;
 
 		for (r = range; r; r = r->next)
-			argv_array_push(&paths, r->path);
+			argv_array_push(&array, r->path);
+		paths = argv_array_detach(&array);
+
 		parse_pathspec(&rev->diffopt.pathspec, 0,
-			       PATHSPEC_PREFER_FULL, "", paths.argv);
-		/* argv strings are now owned by pathspec */
-		paths.argc = 0;
-		argv_array_clear(&paths);
+			       PATHSPEC_PREFER_FULL, "", paths);
+		/* strings are now owned by pathspec */
+		free(paths);
 	}
 }
 

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH 06/21] convert manual allocations to argv_array
  2016-02-20  8:57               ` Jeff King
@ 2016-02-20  9:04                 ` Eric Sunshine
  0 siblings, 0 replies; 93+ messages in thread
From: Eric Sunshine @ 2016-02-20  9:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Junio C Hamano

On Sat, Feb 20, 2016 at 3:57 AM, Jeff King <peff@peff.net> wrote:
> On Sat, Feb 20, 2016 at 03:39:36AM -0500, Eric Sunshine wrote:
>> I also had made the strbuf_detach() analogy in my response but deleted
>> it before sending; I do think it's a reasonable API template to mirror
>> via new argv_array_detach().
>
> That would look like this, which I think is not too bad (on top of my
> series for now; I'd do the API function as a separate patch at the
> beginning and then use it immediately).

Looks reasonable.

> diff --git a/argv-array.c b/argv-array.c
> index eaed477..5d370fa 100644
> --- a/argv-array.c
> +++ b/argv-array.c
> @@ -74,3 +74,14 @@ void argv_array_clear(struct argv_array *array)
>         }
>         argv_array_init(array);
>  }
> +
> +const char **argv_array_detach(struct argv_array *array)
> +{
> +       if (array->argv == empty_argv)
> +               return xcalloc(1, sizeof(const char *));
> +       else {
> +               const char **ret = array->argv;
> +               argv_array_init(array);
> +               return ret;
> +       }
> +}
> diff --git a/argv-array.h b/argv-array.h
> index a2fa0aa..29056e4 100644
> --- a/argv-array.h
> +++ b/argv-array.h
> @@ -20,5 +20,6 @@ void argv_array_pushl(struct argv_array *, ...);
>  void argv_array_pushv(struct argv_array *, const char **);
>  void argv_array_pop(struct argv_array *);
>  void argv_array_clear(struct argv_array *);
> +const char **argv_array_detach(struct argv_array *);
>
>  #endif /* ARGV_ARRAY_H */
> diff --git a/line-log.c b/line-log.c
> index fa095b9..bbe31ed 100644
> --- a/line-log.c
> +++ b/line-log.c
> @@ -748,15 +748,17 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
>
>         if (!rev->diffopt.detect_rename) {
>                 struct line_log_data *r;
> -               struct argv_array paths = ARGV_ARRAY_INIT;
> +               struct argv_array array = ARGV_ARRAY_INIT;
> +               const char **paths;
>
>                 for (r = range; r; r = r->next)
> -                       argv_array_push(&paths, r->path);
> +                       argv_array_push(&array, r->path);
> +               paths = argv_array_detach(&array);
> +
>                 parse_pathspec(&rev->diffopt.pathspec, 0,
> -                              PATHSPEC_PREFER_FULL, "", paths.argv);
> -               /* argv strings are now owned by pathspec */
> -               paths.argc = 0;
> -               argv_array_clear(&paths);
> +                              PATHSPEC_PREFER_FULL, "", paths);
> +               /* strings are now owned by pathspec */
> +               free(paths);
>         }
>  }

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow
  2016-02-19 11:22   ` [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
@ 2016-02-20 21:32     ` René Scharfe
  2016-02-21 23:30       ` Jeff King
  0 siblings, 1 reply; 93+ messages in thread
From: René Scharfe @ 2016-02-20 21:32 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Eric Sunshine, Junio C Hamano

Am 19.02.2016 um 12:22 schrieb Jeff King:
> REALLOC_ARRAY inherently involves a multiplication which can
> overflow size_t, resulting in a much smaller buffer than we
> think we've allocated. We can easily harden it by using
> st_mult() to check for overflow.  Likewise, we can add
> ALLOC_ARRAY to do the same thing for xmalloc calls.

Good idea!

> xcalloc() should already be fine, because it takes the two
> factors separately, assuming the system calloc actually
> checks for overflow. However, before we even hit the system
> calloc(), we do our memory_limit_check, which involves a
> multiplication. Let's check for overflow ourselves so that
> this limit cannot be bypassed.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   git-compat-util.h | 3 ++-
>   wrapper.c         | 3 +++
>   2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 0c65033..55c073d 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -779,7 +779,8 @@ extern int odb_pack_keep(char *name, size_t namesz, const unsigned char *sha1);
>   extern char *xgetcwd(void);
>   extern FILE *fopen_for_writing(const char *path);
>
> -#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x)))
> +#define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
> +#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))

st_mult(x, y) calls unsigned_mult_overflows(x, y), which divides by x. 
This division can be done at compile time if x is a constant.  This can 
be guaranteed for all users of the two macros above by reversing the 
arguments of st_mult(), so that sizeof comes first.  Probably not a big 
win, but why not do it if it's that easy?

Or perhaps a macro like this could help here and in other places which 
use st_mult with sizeof:

   #define SIZEOF_MULT(x, n) st_mult(sizeof(x), (n))

(I'd call it ARRAY_SIZE, but that name is already taken. :)

René

^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow
  2016-02-20 21:32     ` René Scharfe
@ 2016-02-21 23:30       ` Jeff King
  0 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-21 23:30 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Eric Sunshine, Junio C Hamano

On Sat, Feb 20, 2016 at 10:32:00PM +0100, René Scharfe wrote:

> >-#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x)))
> >+#define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x))))
> >+#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), sizeof(*(x))))
> 
> st_mult(x, y) calls unsigned_mult_overflows(x, y), which divides by x. This
> division can be done at compile time if x is a constant.  This can be
> guaranteed for all users of the two macros above by reversing the arguments
> of st_mult(), so that sizeof comes first.  Probably not a big win, but why
> not do it if it's that easy?

I doubt it's even measurable, but as you say, it's easy enough to do, so
why not.

If we really care about optimizing, I suspect that something like:

diff --git a/git-compat-util.h b/git-compat-util.h
index 5cbdd2e..1d53328 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -662,10 +662,12 @@ static inline size_t st_add(size_t a, size_t b)
 
 static inline size_t st_mult(size_t a, size_t b)
 {
-	if (unsigned_mult_overflows(a, b))
+	size_t ret;
+
+	if (__builtin_mul_overflow(a, b, &ret))
 		die("size_t overflow: %"PRIuMAX" * %"PRIuMAX,
 		    (uintmax_t)a, (uintmax_t)b);
-	return a * b;
+	return ret;
 }
 
 static inline size_t st_sub(size_t a, size_t b)


would do a lot more. But it needs #ifdefs for compilers besides gcc and
clang.

> Or perhaps a macro like this could help here and in other places which use
> st_mult with sizeof:
> 
>   #define SIZEOF_MULT(x, n) st_mult(sizeof(x), (n))
> 
> (I'd call it ARRAY_SIZE, but that name is already taken. :)

I don't think we need that; we're really only checking allocations,
which means ALLOC_ARRAY() and friends cover all the uses of sizeof (we
might still use st_mult() for _another_ part of the computation, but it
won't usually be a sizeof then).

We also may do follow-up multiplications, like:

  ALLOC_ARRAY(foo, nr);
  memset(foo, 0, nr * sizeof(*foo));

but I didn't bother doing overflow checks for those. We know that they
should be fine if the original allocation was successful (though of
course, just using xcalloc here would be better still).

-Peff

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 0/22] hardening allocations against integer overflow
  2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
                     ` (20 preceding siblings ...)
  2016-02-19 11:25   ` [PATCH 21/21] ewah: convert to REALLOC_ARRAY, etc Jeff King
@ 2016-02-22 22:41   ` Jeff King
  2016-02-22 22:43     ` [PATCH v3 01/22] reflog_expire_cfg: NUL-terminate pattern field Jeff King
                       ` (22 more replies)
  21 siblings, 23 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:41 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

On Fri, Feb 19, 2016 at 06:19:41AM -0500, Jeff King wrote:

> Here's a re-roll of jk/tighten-alloc series from:
> 
>   http://thread.gmane.org/gmane.comp.version-control.git/286253

And here's v3. The changes this time (and philosophical rationalization
of changes I didn't make) are pretty small:

  - flip the order of arguments to st_mult() in ALLOC_ARRAY, et al, to
    get a probably-irrelevant-but-so-easy-why-not optimization

  - mark new global var in daemon.c as static

  - add argv_array_detach (new patch 6) to avoid gross memory management
    when converting line-log to use argv_array in the subsequent patch

So I hope this one is ready for 'next'.  Thanks René, Ramsay, and Eric
for reviewing.

  [01/22]: reflog_expire_cfg: NUL-terminate pattern field
  [02/22]: add helpers for detecting size_t overflow
  [03/22]: tree-diff: catch integer overflow in combine_diff_path allocation
  [04/22]: harden REALLOC_ARRAY and xcalloc against size_t overflow
  [05/22]: add helpers for allocating flex-array structs
  [06/22]: argv-array: add detach function
  [07/22]: convert manual allocations to argv_array
  [08/22]: convert trivial cases to ALLOC_ARRAY
  [09/22]: use xmallocz to avoid size arithmetic
  [10/22]: convert trivial cases to FLEX_ARRAY macros
  [11/22]: use st_add and st_mult for allocation size computation
  [12/22]: prepare_{git,shell}_cmd: use argv_array
  [13/22]: write_untracked_extension: use FLEX_ALLOC helper
  [14/22]: fast-import: simplify allocation in start_packfile
  [15/22]: fetch-pack: simplify add_sought_entry
  [16/22]: test-path-utils: fix normalize_path_copy output buffer size
  [17/22]: sequencer: simplify memory allocation of get_message
  [18/22]: git-compat-util: drop mempcpy compat code
  [19/22]: transport_anonymize_url: use xstrfmt
  [20/22]: diff_populate_gitlink: use a strbuf
  [21/22]: convert ewah/bitmap code to use xmalloc
  [22/22]: ewah: convert to REALLOC_ARRAY, etc

-Peff

^ permalink raw reply	[flat|nested] 93+ messages in thread

* [PATCH v3 01/22] reflog_expire_cfg: NUL-terminate pattern field
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
@ 2016-02-22 22:43     ` Jeff King
  2016-02-22 22:43     ` [PATCH v3 02/22] add helpers for detecting size_t overflow Jeff King
                       ` (21 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:43 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

You can tweak the reflog expiration for a particular subset
of refs by configuring gc.foo.reflogexpire. We keep a linked
list of reflog_expire_cfg structs, each of which holds the
pattern and a "len" field for the length of the pattern. The
pattern itself is _not_ NUL-terminated.

However, we feed the pattern directly to wildmatch(), which
expects a NUL-terminated string, meaning it may keep reading
random junk after our struct.

We can fix this by allocating an extra byte for the NUL
(which is already zero because we use xcalloc). Let's also
drop the misleading "len" field, which is no longer
necessary. The existing use of "len" can be converted to use
strncmp().

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/reflog.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index f39960e..9980731 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -396,7 +396,6 @@ static struct reflog_expire_cfg {
 	struct reflog_expire_cfg *next;
 	unsigned long expire_total;
 	unsigned long expire_unreachable;
-	size_t len;
 	char pattern[FLEX_ARRAY];
 } *reflog_expire_cfg, **reflog_expire_cfg_tail;
 
@@ -408,13 +407,12 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		reflog_expire_cfg_tail = &reflog_expire_cfg;
 
 	for (ent = reflog_expire_cfg; ent; ent = ent->next)
-		if (ent->len == len &&
-		    !memcmp(ent->pattern, pattern, len))
+		if (!strncmp(ent->pattern, pattern, len) &&
+		    ent->pattern[len] == '\0')
 			return ent;
 
-	ent = xcalloc(1, (sizeof(*ent) + len));
+	ent = xcalloc(1, sizeof(*ent) + len + 1);
 	memcpy(ent->pattern, pattern, len);
-	ent->len = len;
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
 	return ent;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 02/22] add helpers for detecting size_t overflow
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
  2016-02-22 22:43     ` [PATCH v3 01/22] reflog_expire_cfg: NUL-terminate pattern field Jeff King
@ 2016-02-22 22:43     ` Jeff King
  2016-02-22 22:43     ` [PATCH v3 03/22] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
                       ` (20 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:43 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Performing computations on size_t variables that we feed to
xmalloc and friends can be dangerous, as an integer overflow
can cause us to allocate a much smaller chunk than we
realized.

We already have unsigned_add_overflows(), but let's add
unsigned_mult_overflows() to that. Furthermore, rather than
have each site manually check and die on overflow, we can
provide some helpers that will:

  - promote the arguments to size_t, so that we know we are
    doing our computation in the same size of integer that
    will ultimately be fed to xmalloc

  - check and die on overflow

  - return the result so that computations can be done in
    the parameter list of xmalloc.

These functions are a lot uglier to use than normal
arithmetic operators (you have to do "st_add(foo, bar)"
instead of "foo + bar"). To at least limit the damage, we
also provide multi-valued versions. So rather than:

  st_add(st_add(a, b), st_add(c, d));

you can write:

  st_add4(a, b, c, d);

This isn't nearly as elegant as a varargs function, but it's
a lot harder to get it wrong. You don't have to remember to
add a sentinel value at the end, and the compiler will
complain if you get the number of arguments wrong. This
patch adds only the numbered variants required to convert
the current code base; we can easily add more later if
needed.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 693a336..0c65033 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -96,6 +96,14 @@
 #define unsigned_add_overflows(a, b) \
     ((b) > maximum_unsigned_value_of_type(a) - (a))
 
+/*
+ * Returns true if the multiplication of "a" and "b" will
+ * overflow. The types of "a" and "b" must match and must be unsigned.
+ * Note that this macro evaluates "a" twice!
+ */
+#define unsigned_mult_overflows(a, b) \
+    ((a) && (b) > maximum_unsigned_value_of_type(a) / (a))
+
 #ifdef __GNUC__
 #define TYPEOF(x) (__typeof__(x))
 #else
@@ -713,6 +721,32 @@ extern void release_pack_memory(size_t);
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
+static inline size_t st_add(size_t a, size_t b)
+{
+	if (unsigned_add_overflows(a, b))
+		die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a + b;
+}
+#define st_add3(a,b,c)   st_add((a),st_add((b),(c)))
+#define st_add4(a,b,c,d) st_add((a),st_add3((b),(c),(d)))
+
+static inline size_t st_mult(size_t a, size_t b)
+{
+	if (unsigned_mult_overflows(a, b))
+		die("size_t overflow: %"PRIuMAX" * %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a * b;
+}
+
+static inline size_t st_sub(size_t a, size_t b)
+{
+	if (a < b)
+		die("size_t underflow: %"PRIuMAX" - %"PRIuMAX,
+		    (uintmax_t)a, (uintmax_t)b);
+	return a - b;
+}
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size)      (alloca(size))
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 03/22] tree-diff: catch integer overflow in combine_diff_path allocation
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
  2016-02-22 22:43     ` [PATCH v3 01/22] reflog_expire_cfg: NUL-terminate pattern field Jeff King
  2016-02-22 22:43     ` [PATCH v3 02/22] add helpers for detecting size_t overflow Jeff King
@ 2016-02-22 22:43     ` Jeff King
  2016-02-22 22:43     ` [PATCH v3 04/22] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
                       ` (19 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:43 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

A combine_diff_path struct has two "flex" members allocated
alongside the struct: a string to hold the pathname, and an
array of parent pointers. We use an "int" to compute this,
meaning we may easily overflow it if the pathname is
extremely long.

We can fix this by using size_t, and checking for overflow
with the st_add helper.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.h      | 4 ++--
 tree-diff.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/diff.h b/diff.h
index 70b2d70..beafbbd 100644
--- a/diff.h
+++ b/diff.h
@@ -222,8 +222,8 @@ struct combine_diff_path {
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
-	(sizeof(struct combine_diff_path) + \
-	 sizeof(struct combine_diff_parent) * (n) + (l) + 1)
+	st_add4(sizeof(struct combine_diff_path), (l), 1, \
+		st_mult(sizeof(struct combine_diff_parent), (n)))
 
 extern void show_combined_diff(struct combine_diff_path *elem, int num_parent,
 			      int dense, struct rev_info *);
diff --git a/tree-diff.c b/tree-diff.c
index 290a1da..4dda9a1 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -124,8 +124,8 @@ static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
 	unsigned mode, const unsigned char *sha1)
 {
 	struct combine_diff_path *p;
-	int len = base->len + pathlen;
-	int alloclen = combine_diff_path_size(nparent, len);
+	size_t len = st_add(base->len, pathlen);
+	size_t alloclen = combine_diff_path_size(nparent, len);
 
 	/* if last->next is !NULL - it is a pre-allocated memory, we can reuse */
 	p = last->next;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 04/22] harden REALLOC_ARRAY and xcalloc against size_t overflow
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (2 preceding siblings ...)
  2016-02-22 22:43     ` [PATCH v3 03/22] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
@ 2016-02-22 22:43     ` Jeff King
  2016-02-22 22:43     ` [PATCH v3 05/22] add helpers for allocating flex-array structs Jeff King
                       ` (18 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:43 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

REALLOC_ARRAY inherently involves a multiplication which can
overflow size_t, resulting in a much smaller buffer than we
think we've allocated. We can easily harden it by using
st_mult() to check for overflow.  Likewise, we can add
ALLOC_ARRAY to do the same thing for xmalloc calls.

xcalloc() should already be fine, because it takes the two
factors separately, assuming the system calloc actually
checks for overflow. However, before we even hit the system
calloc(), we do our memory_limit_check, which involves a
multiplication. Let's check for overflow ourselves so that
this limit cannot be bypassed.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 3 ++-
 wrapper.c         | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 0c65033..81f2347 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -779,7 +779,8 @@ extern int odb_pack_keep(char *name, size_t namesz, const unsigned char *sha1);
 extern char *xgetcwd(void);
 extern FILE *fopen_for_writing(const char *path);
 
-#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x)))
+#define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
+#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult(sizeof(*(x)), (alloc)))
 
 static inline char *xstrdup_or_null(const char *str)
 {
diff --git a/wrapper.c b/wrapper.c
index 29a45d2..9afc1a0 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -152,6 +152,9 @@ void *xcalloc(size_t nmemb, size_t size)
 {
 	void *ret;
 
+	if (unsigned_mult_overflows(nmemb, size))
+		die("data too large to fit into virtual memory space");
+
 	memory_limit_check(size * nmemb, 0);
 	ret = calloc(nmemb, size);
 	if (!ret && (!nmemb || !size))
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 05/22] add helpers for allocating flex-array structs
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (3 preceding siblings ...)
  2016-02-22 22:43     ` [PATCH v3 04/22] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
@ 2016-02-22 22:43     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 06/22] argv-array: add detach function Jeff King
                       ` (17 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:43 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.

This patch adds a few helpers to turn simple cases of
flex-array struct allocation into a one-liner that properly
checks for overflow. See the embedded documentation for
details.

Ideally we could provide a more flexible version that could
handle multiple strings, like:

  FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);

But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 81f2347..57ff9fb 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -782,6 +782,68 @@ extern FILE *fopen_for_writing(const char *path);
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult(sizeof(*(x)), (alloc)))
 
+/*
+ * These functions help you allocate structs with flex arrays, and copy
+ * the data directly into the array. For example, if you had:
+ *
+ *   struct foo {
+ *     int bar;
+ *     char name[FLEX_ARRAY];
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_MEM(f, name, src, len);
+ *
+ * to allocate a "foo" with the contents of "src" in the "name" field.
+ * The resulting struct is automatically zero'd, and the flex-array field
+ * is NUL-terminated (whether the incoming src buffer was or not).
+ *
+ * The FLEXPTR_* variants operate on structs that don't use flex-arrays,
+ * but do want to store a pointer to some extra data in the same allocated
+ * block. For example, if you have:
+ *
+ *   struct foo {
+ *     char *name;
+ *     int bar;
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_STR(f, name, src);
+ *
+ * and "name" will point to a block of memory after the struct, which will be
+ * freed along with the struct (but the pointer can be repointed anywhere).
+ *
+ * The *_STR variants accept a string parameter rather than a ptr/len
+ * combination.
+ *
+ * Note that these macros will evaluate the first parameter multiple
+ * times, and it must be assignable as an lvalue.
+ */
+#define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
+	(x) = NULL; /* silence -Wuninitialized for offset calculation */ \
+	(x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char *)(x), (buf), (len)); \
+} while (0)
+#define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
+	(x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
+	(x)->ptrname = (void *)((x)+1); \
+} while(0)
+#define FLEX_ALLOC_STR(x, flexname, str) \
+	FLEX_ALLOC_MEM((x), flexname, (str), strlen(str))
+#define FLEXPTR_ALLOC_STR(x, ptrname, str) \
+	FLEXPTR_ALLOC_MEM((x), ptrname, (str), strlen(str))
+
+static inline void *xalloc_flex(size_t base_len, size_t offset,
+				const void *src, size_t src_len)
+{
+	unsigned char *ret = xcalloc(1, st_add3(base_len, src_len, 1));
+	memcpy(ret + offset, src, src_len);
+	return ret;
+}
+
 static inline char *xstrdup_or_null(const char *str)
 {
 	return str ? xstrdup(str) : NULL;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 06/22] argv-array: add detach function
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (4 preceding siblings ...)
  2016-02-22 22:43     ` [PATCH v3 05/22] add helpers for allocating flex-array structs Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 07/22] convert manual allocations to argv_array Jeff King
                       ` (16 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

The usual pattern for an argv array is to initialize it,
push in some strings, and then clear it when done. Very
occasionally, though, we must do other exotic things with
the memory, like freeing the list but keeping the strings.
Let's provide a detach function so that callers can make use
of our API to build up the array, and then take ownership of
it.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/technical/api-argv-array.txt |  7 +++++++
 argv-array.c                               | 11 +++++++++++
 argv-array.h                               |  1 +
 3 files changed, 19 insertions(+)

diff --git a/Documentation/technical/api-argv-array.txt b/Documentation/technical/api-argv-array.txt
index 8076172..cfc0630 100644
--- a/Documentation/technical/api-argv-array.txt
+++ b/Documentation/technical/api-argv-array.txt
@@ -56,3 +56,10 @@ Functions
 `argv_array_clear`::
 	Free all memory associated with the array and return it to the
 	initial, empty state.
+
+`argv_array_detach`::
+	Disconnect the `argv` member from the `argv_array` struct and
+	return it. The caller is responsible for freeing the memory used
+	by the array, and by the strings it references. After detaching,
+	the `argv_array` is in a reinitialized state and can be pushed
+	into again.
diff --git a/argv-array.c b/argv-array.c
index eaed477..5d370fa 100644
--- a/argv-array.c
+++ b/argv-array.c
@@ -74,3 +74,14 @@ void argv_array_clear(struct argv_array *array)
 	}
 	argv_array_init(array);
 }
+
+const char **argv_array_detach(struct argv_array *array)
+{
+	if (array->argv == empty_argv)
+		return xcalloc(1, sizeof(const char *));
+	else {
+		const char **ret = array->argv;
+		argv_array_init(array);
+		return ret;
+	}
+}
diff --git a/argv-array.h b/argv-array.h
index a2fa0aa..29056e4 100644
--- a/argv-array.h
+++ b/argv-array.h
@@ -20,5 +20,6 @@ void argv_array_pushl(struct argv_array *, ...);
 void argv_array_pushv(struct argv_array *, const char **);
 void argv_array_pop(struct argv_array *);
 void argv_array_clear(struct argv_array *);
+const char **argv_array_detach(struct argv_array *);
 
 #endif /* ARGV_ARRAY_H */
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 07/22] convert manual allocations to argv_array
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (5 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 06/22] argv-array: add detach function Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 08/22] convert trivial cases to ALLOC_ARRAY Jeff King
                       ` (15 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:

  1. We no longer have to manually compute the correct final
     array size (so it's one less thing we can screw up).

  2. In many cases we had to make a separate pass to count,
     then allocate, then fill in the array. Now we can do it
     in one pass, making the code shorter and easier to
     follow.

  3. argv_array handles memory ownership for us, making it
     more obvious when things should be free()d and and when
     not.

Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/grep.c         | 10 +++++-----
 builtin/receive-pack.c | 12 +++---------
 builtin/remote-ext.c   | 26 +++++---------------------
 daemon.c               | 12 +++++-------
 git.c                  | 14 +++++---------
 line-log.c             | 22 +++++++++-------------
 remote-curl.c          | 23 +++++++++++------------
 7 files changed, 43 insertions(+), 76 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 8c516a9..aa7435f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -365,17 +365,17 @@ static void append_path(struct grep_opt *opt, const void *data, size_t len)
 static void run_pager(struct grep_opt *opt, const char *prefix)
 {
 	struct string_list *path_list = opt->output_priv;
-	const char **argv = xmalloc(sizeof(const char *) * (path_list->nr + 1));
+	struct child_process child = CHILD_PROCESS_INIT;
 	int i, status;
 
 	for (i = 0; i < path_list->nr; i++)
-		argv[i] = path_list->items[i].string;
-	argv[path_list->nr] = NULL;
+		argv_array_push(&child.args, path_list->items[i].string);
+	child.dir = prefix;
+	child.use_shell = 1;
 
-	status = run_command_v_opt_cd_env(argv, RUN_USING_SHELL, prefix, NULL);
+	status = run_command(&child);
 	if (status)
 		exit(status);
-	free(argv);
 }
 
 static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, int cached)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f2d6761..932afab 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1031,7 +1031,6 @@ static void run_update_post_hook(struct command *commands)
 {
 	struct command *cmd;
 	int argc;
-	const char **argv;
 	struct child_process proc = CHILD_PROCESS_INIT;
 	const char *hook;
 
@@ -1044,21 +1043,16 @@ static void run_update_post_hook(struct command *commands)
 	if (!argc || !hook)
 		return;
 
-	argv = xmalloc(sizeof(*argv) * (2 + argc));
-	argv[0] = hook;
-
-	for (argc = 1, cmd = commands; cmd; cmd = cmd->next) {
+	argv_array_push(&proc.args, hook);
+	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (cmd->error_string || cmd->did_not_exist)
 			continue;
-		argv[argc] = xstrdup(cmd->ref_name);
-		argc++;
+		argv_array_push(&proc.args, cmd->ref_name);
 	}
-	argv[argc] = NULL;
 
 	proc.no_stdin = 1;
 	proc.stdout_to_stderr = 1;
 	proc.err = use_sideband ? -1 : 0;
-	proc.argv = argv;
 
 	if (!start_command(&proc)) {
 		if (use_sideband)
diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c
index e3cd25d..7457c74 100644
--- a/builtin/remote-ext.c
+++ b/builtin/remote-ext.c
@@ -114,30 +114,14 @@ static char *strip_escapes(const char *str, const char *service,
 	}
 }
 
-/* Should be enough... */
-#define MAXARGUMENTS 256
-
-static const char **parse_argv(const char *arg, const char *service)
+static void parse_argv(struct argv_array *out, const char *arg, const char *service)
 {
-	int arguments = 0;
-	int i;
-	const char **ret;
-	char *temparray[MAXARGUMENTS + 1];
-
 	while (*arg) {
-		char *expanded;
-		if (arguments == MAXARGUMENTS)
-			die("remote-ext command has too many arguments");
-		expanded = strip_escapes(arg, service, &arg);
+		char *expanded = strip_escapes(arg, service, &arg);
 		if (expanded)
-			temparray[arguments++] = expanded;
+			argv_array_push(out, expanded);
+		free(expanded);
 	}
-
-	ret = xmalloc((arguments + 1) * sizeof(char *));
-	for (i = 0; i < arguments; i++)
-		ret[i] = temparray[i];
-	ret[arguments] = NULL;
-	return ret;
 }
 
 static void send_git_request(int stdin_fd, const char *serv, const char *repo,
@@ -158,7 +142,7 @@ static int run_child(const char *arg, const char *service)
 	child.in = -1;
 	child.out = -1;
 	child.err = 0;
-	child.argv = parse_argv(arg, service);
+	parse_argv(&child.args, arg, service);
 
 	if (start_command(&child) < 0)
 		die("Can't run specified command");
diff --git a/daemon.c b/daemon.c
index 46b411c..8d45c33 100644
--- a/daemon.c
+++ b/daemon.c
@@ -808,7 +808,7 @@ static void check_dead_children(void)
 			cradle = &blanket->next;
 }
 
-static char **cld_argv;
+static struct argv_array cld_argv = ARGV_ARRAY_INIT;
 static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
@@ -842,7 +842,7 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
 #endif
 	}
 
-	cld.argv = (const char **)cld_argv;
+	cld.argv = cld_argv.argv;
 	cld.in = incoming;
 	cld.out = dup(incoming);
 
@@ -1374,12 +1374,10 @@ int main(int argc, char **argv)
 		write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
 
 	/* prepare argv for serving-processes */
-	cld_argv = xmalloc(sizeof (char *) * (argc + 2));
-	cld_argv[0] = argv[0];	/* git-daemon */
-	cld_argv[1] = "--serve";
+	argv_array_push(&cld_argv, argv[0]); /* git-daemon */
+	argv_array_push(&cld_argv, "--serve");
 	for (i = 1; i < argc; ++i)
-		cld_argv[i+1] = argv[i];
-	cld_argv[argc+1] = NULL;
+		argv_array_push(&cld_argv, argv[i]);
 
 	return serve(&listen_addr, listen_port, cred);
 }
diff --git a/git.c b/git.c
index 6c64c94..087ad31 100644
--- a/git.c
+++ b/git.c
@@ -247,20 +247,16 @@ static int handle_alias(int *argcp, const char ***argv)
 	alias_string = alias_lookup(alias_command);
 	if (alias_string) {
 		if (alias_string[0] == '!') {
-			const char **alias_argv;
-			int argc = *argcp, i;
+			struct child_process child = CHILD_PROCESS_INIT;
 
 			commit_pager_choice();
 			restore_env(1);
 
-			/* build alias_argv */
-			alias_argv = xmalloc(sizeof(*alias_argv) * (argc + 1));
-			alias_argv[0] = alias_string + 1;
-			for (i = 1; i < argc; ++i)
-				alias_argv[i] = (*argv)[i];
-			alias_argv[argc] = NULL;
+			child.use_shell = 1;
+			argv_array_push(&child.args, alias_string + 1);
+			argv_array_pushv(&child.args, (*argv) + 1);
 
-			ret = run_command_v_opt(alias_argv, RUN_USING_SHELL);
+			ret = run_command(&child);
 			if (ret >= 0)   /* normal exit */
 				exit(ret);
 
diff --git a/line-log.c b/line-log.c
index af6e2f7..5877986 100644
--- a/line-log.c
+++ b/line-log.c
@@ -14,6 +14,7 @@
 #include "graph.h"
 #include "userdiff.h"
 #include "line-log.h"
+#include "argv-array.h"
 
 static void range_set_grow(struct range_set *rs, size_t extra)
 {
@@ -746,22 +747,17 @@ void line_log_init(struct rev_info *rev, const char *prefix, struct string_list
 	add_line_range(rev, commit, range);
 
 	if (!rev->diffopt.detect_rename) {
-		int i, count = 0;
-		struct line_log_data *r = range;
+		struct line_log_data *r;
+		struct argv_array array = ARGV_ARRAY_INIT;
 		const char **paths;
-		while (r) {
-			count++;
-			r = r->next;
-		}
-		paths = xmalloc((count+1)*sizeof(char *));
-		r = range;
-		for (i = 0; i < count; i++) {
-			paths[i] = xstrdup(r->path);
-			r = r->next;
-		}
-		paths[count] = NULL;
+
+		for (r = range; r; r = r->next)
+			argv_array_push(&array, r->path);
+		paths = argv_array_detach(&array);
+
 		parse_pathspec(&rev->diffopt.pathspec, 0,
 			       PATHSPEC_PREFER_FULL, "", paths);
+		/* strings are now owned by pathspec */
 		free(paths);
 	}
 }
diff --git a/remote-curl.c b/remote-curl.c
index c704857..42deeec 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -845,23 +845,22 @@ static void parse_fetch(struct strbuf *buf)
 
 static int push_dav(int nr_spec, char **specs)
 {
-	const char **argv = xmalloc((10 + nr_spec) * sizeof(char*));
-	int argc = 0, i;
+	struct child_process child = CHILD_PROCESS_INIT;
+	size_t i;
 
-	argv[argc++] = "http-push";
-	argv[argc++] = "--helper-status";
+	child.git_cmd = 1;
+	argv_array_push(&child.args, "http-push");
+	argv_array_push(&child.args, "--helper-status");
 	if (options.dry_run)
-		argv[argc++] = "--dry-run";
+		argv_array_push(&child.args, "--dry-run");
 	if (options.verbosity > 1)
-		argv[argc++] = "--verbose";
-	argv[argc++] = url.buf;
+		argv_array_push(&child.args, "--verbose");
+	argv_array_push(&child.args, url.buf);
 	for (i = 0; i < nr_spec; i++)
-		argv[argc++] = specs[i];
-	argv[argc++] = NULL;
+		argv_array_push(&child.args, specs[i]);
 
-	if (run_command_v_opt(argv, RUN_GIT_CMD))
-		die("git-%s failed", argv[0]);
-	free(argv);
+	if (run_command(&child))
+		die("git-http-push failed");
 	return 0;
 }
 
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 08/22] convert trivial cases to ALLOC_ARRAY
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (6 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 07/22] convert manual allocations to argv_array Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 09/22] use xmallocz to avoid size arithmetic Jeff King
                       ` (14 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
---
 alias.c                  |  2 +-
 attr.c                   |  2 +-
 bisect.c                 |  4 ++--
 builtin/blame.c          |  3 ++-
 builtin/clean.c          |  2 +-
 builtin/fast-export.c    |  2 +-
 builtin/index-pack.c     |  4 ++--
 builtin/merge-base.c     |  2 +-
 builtin/mv.c             |  3 ++-
 builtin/pack-objects.c   |  7 ++++---
 builtin/pack-redundant.c |  2 +-
 builtin/receive-pack.c   |  5 ++---
 column.c                 |  2 +-
 combine-diff.c           |  4 ++--
 commit.c                 |  2 +-
 compat/mingw.c           |  6 +++---
 diffcore-order.c         |  4 ++--
 dir.c                    |  6 +++---
 fast-import.c            |  5 +++--
 fsck.c                   |  3 ++-
 graph.c                  | 10 ++++------
 khash.h                  |  2 +-
 levenshtein.c            |  8 +++++---
 line-log.c               |  8 ++++----
 notes.c                  |  2 +-
 pack-check.c             |  2 +-
 pack-revindex.c          | 12 ++++++++----
 pathspec.c               |  5 +++--
 remote-curl.c            |  3 ++-
 sha1_file.c              |  4 ++--
 shallow.c                |  6 +++---
 show-index.c             |  3 ++-
 transport.c              |  2 +-
 xdiff-interface.c        |  2 +-
 34 files changed, 75 insertions(+), 64 deletions(-)

diff --git a/alias.c b/alias.c
index a11229d..3b90397 100644
--- a/alias.c
+++ b/alias.c
@@ -23,7 +23,7 @@ int split_cmdline(char *cmdline, const char ***argv)
 	int src, dst, count = 0, size = 16;
 	char quoted = 0;
 
-	*argv = xmalloc(sizeof(**argv) * size);
+	ALLOC_ARRAY(*argv, size);
 
 	/* split alias_string */
 	(*argv)[count++] = cmdline;
diff --git a/attr.c b/attr.c
index 086c08d..c83ec49 100644
--- a/attr.c
+++ b/attr.c
@@ -799,7 +799,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 			++count;
 	}
 	*num = count;
-	*check = xmalloc(sizeof(**check) * count);
+	ALLOC_ARRAY(*check, count);
 	j = 0;
 	for (i = 0; i < attr_nr; i++) {
 		const char *value = check_all_attr[i].value;
diff --git a/bisect.c b/bisect.c
index 06ec54e..7996c29 100644
--- a/bisect.c
+++ b/bisect.c
@@ -708,10 +708,10 @@ static struct commit *get_commit_reference(const unsigned char *sha1)
 
 static struct commit **get_bad_and_good_commits(int *rev_nr)
 {
-	int len = 1 + good_revs.nr;
-	struct commit **rev = xmalloc(len * sizeof(*rev));
+	struct commit **rev;
 	int i, n = 0;
 
+	ALLOC_ARRAY(rev, 1 + good_revs.nr);
 	rev[n++] = get_commit_reference(current_bad_oid->hash);
 	for (i = 0; i < good_revs.nr; i++)
 		rev[n++] = get_commit_reference(good_revs.sha1[i]);
diff --git a/builtin/blame.c b/builtin/blame.c
index 55bf5fa..b4ed462 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2059,7 +2059,8 @@ static int prepare_lines(struct scoreboard *sb)
 	for (p = buf; p < end; p = get_next_line(p, end))
 		num++;
 
-	sb->lineno = lineno = xmalloc(sizeof(*sb->lineno) * (num + 1));
+	ALLOC_ARRAY(sb->lineno, num + 1);
+	lineno = sb->lineno;
 
 	for (p = buf; p < end; p = get_next_line(p, end))
 		*lineno++ = p - buf;
diff --git a/builtin/clean.c b/builtin/clean.c
index 7b08237..8229f7e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -543,7 +543,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 	int eof = 0;
 	int i;
 
-	chosen = xmalloc(sizeof(int) * stuff->nr);
+	ALLOC_ARRAY(chosen, stuff->nr);
 	/* set chosen as uninitialized */
 	for (i = 0; i < stuff->nr; i++)
 		chosen[i] = -1;
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 2471297..8164b58 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1021,7 +1021,7 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 		const char **refspecs_str;
 		int i;
 
-		refspecs_str = xmalloc(sizeof(*refspecs_str) * refspecs_list.nr);
+		ALLOC_ARRAY(refspecs_str, refspecs_list.nr);
 		for (i = 0; i < refspecs_list.nr; i++)
 			refspecs_str[i] = refspecs_list.items[i].string;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 6a01509..a60bcfa 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1346,7 +1346,7 @@ static void fix_unresolved_deltas(struct sha1file *f)
 	 * before deltas depending on them, a good heuristic is to start
 	 * resolving deltas in the same order as their position in the pack.
 	 */
-	sorted_by_pos = xmalloc(nr_ref_deltas * sizeof(*sorted_by_pos));
+	ALLOC_ARRAY(sorted_by_pos, nr_ref_deltas);
 	for (i = 0; i < nr_ref_deltas; i++)
 		sorted_by_pos[i] = &ref_deltas[i];
 	qsort(sorted_by_pos, nr_ref_deltas, sizeof(*sorted_by_pos), delta_pos_compare);
@@ -1759,7 +1759,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (show_stat)
 		show_pack_info(stat_only);
 
-	idx_objects = xmalloc((nr_objects) * sizeof(struct pack_idx_entry *));
+	ALLOC_ARRAY(idx_objects, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
 	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_sha1);
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index a891162..c0d1822 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -252,7 +252,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix)
 	if (argc < 2)
 		usage_with_options(merge_base_usage, options);
 
-	rev = xmalloc(argc * sizeof(*rev));
+	ALLOC_ARRAY(rev, argc);
 	while (argc-- > 0)
 		rev[rev_nr++] = get_commit_reference(*argv++);
 	return show_merge_base(rev, rev_nr, show_all);
diff --git a/builtin/mv.c b/builtin/mv.c
index d1d4316..9a9813a 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -24,7 +24,8 @@ static const char **internal_copy_pathspec(const char *prefix,
 					   int count, unsigned flags)
 {
 	int i;
-	const char **result = xmalloc((count + 1) * sizeof(const char *));
+	const char **result;
+	ALLOC_ARRAY(result, count + 1);
 	memcpy(result, pathspec, count * sizeof(const char *));
 	result[count] = NULL;
 	for (i = 0; i < count; i++) {
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4dae5b1..b4f1fa6 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -624,7 +624,7 @@ static struct object_entry **compute_write_order(void)
 {
 	unsigned int i, wo_end, last_untagged;
 
-	struct object_entry **wo = xmalloc(to_pack.nr_objects * sizeof(*wo));
+	struct object_entry **wo;
 	struct object_entry *objects = to_pack.objects;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
@@ -657,6 +657,7 @@ static struct object_entry **compute_write_order(void)
 	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
+	ALLOC_ARRAY(wo, to_pack.nr_objects);
 	for (i = wo_end = 0; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			break;
@@ -769,7 +770,7 @@ static void write_pack_file(void)
 
 	if (progress > pack_to_stdout)
 		progress_state = start_progress(_("Writing objects"), nr_result);
-	written_list = xmalloc(to_pack.nr_objects * sizeof(*written_list));
+	ALLOC_ARRAY(written_list, to_pack.nr_objects);
 	write_order = compute_write_order();
 
 	do {
@@ -2129,7 +2130,7 @@ static void prepare_pack(int window, int depth)
 	if (!to_pack.nr_objects || !window || !depth)
 		return;
 
-	delta_list = xmalloc(to_pack.nr_objects * sizeof(*delta_list));
+	ALLOC_ARRAY(delta_list, to_pack.nr_objects);
 	nr_deltas = n = 0;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
index d0532f6..72c8158 100644
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@ -53,7 +53,7 @@ static inline struct llist_item *llist_item_get(void)
 		free_nodes = free_nodes->next;
 	} else {
 		int i = 1;
-		new = xmalloc(sizeof(struct llist_item) * BLKSIZE);
+		ALLOC_ARRAY(new, BLKSIZE);
 		for (; i < BLKSIZE; i++)
 			llist_item_put(&new[i]);
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 932afab..3dc3868 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1591,8 +1591,7 @@ static void prepare_shallow_update(struct command *commands,
 {
 	int i, j, k, bitmap_size = (si->ref->nr + 31) / 32;
 
-	si->used_shallow = xmalloc(sizeof(*si->used_shallow) *
-				   si->shallow->nr);
+	ALLOC_ARRAY(si->used_shallow, si->shallow->nr);
 	assign_shallow_commits_to_refs(si, si->used_shallow, NULL);
 
 	si->need_reachability_test =
@@ -1658,7 +1657,7 @@ static void update_shallow_info(struct command *commands,
 		return;
 	}
 
-	ref_status = xmalloc(sizeof(*ref_status) * ref->nr);
+	ALLOC_ARRAY(ref_status, ref->nr);
 	assign_shallow_commits_to_refs(si, NULL, ref_status);
 	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (is_null_sha1(cmd->new_sha1))
diff --git a/column.c b/column.c
index 786abe6..f9fda68 100644
--- a/column.c
+++ b/column.c
@@ -164,7 +164,7 @@ static void display_table(const struct string_list *list,
 	data.colopts = colopts;
 	data.opts = *opts;
 
-	data.len = xmalloc(sizeof(*data.len) * list->nr);
+	ALLOC_ARRAY(data.len, list->nr);
 	for (i = 0; i < list->nr; i++)
 		data.len[i] = item_length(colopts, list->items[i].string);
 
diff --git a/combine-diff.c b/combine-diff.c
index 5571304..a698016 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1372,7 +1372,7 @@ static struct combine_diff_path *find_paths_multitree(
 	struct combine_diff_path paths_head;
 	struct strbuf base;
 
-	parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0]));
+	ALLOC_ARRAY(parents_sha1, nparent);
 	for (i = 0; i < nparent; i++)
 		parents_sha1[i] = parents->sha1[i];
 
@@ -1483,7 +1483,7 @@ void diff_tree_combined(const unsigned char *sha1,
 	if (opt->orderfile && num_paths) {
 		struct obj_order *o;
 
-		o = xmalloc(sizeof(*o) * num_paths);
+		ALLOC_ARRAY(o, num_paths);
 		for (i = 0, p = paths; p; p = p->next, i++)
 			o[i].obj = p;
 		order_objects(opt->orderfile, path_path, o, num_paths);
diff --git a/commit.c b/commit.c
index 40388d7..31cd91f 100644
--- a/commit.c
+++ b/commit.c
@@ -903,7 +903,7 @@ static int remove_redundant(struct commit **array, int cnt)
 
 	work = xcalloc(cnt, sizeof(*work));
 	redundant = xcalloc(cnt, 1);
-	filled_index = xmalloc(sizeof(*filled_index) * (cnt - 1));
+	ALLOC_ARRAY(filled_index, cnt - 1);
 
 	for (i = 0; i < cnt; i++)
 		parse_commit(array[i]);
diff --git a/compat/mingw.c b/compat/mingw.c
index fbe69b8..7803f30 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -893,7 +893,7 @@ static char **get_path_split(void)
 	if (!n)
 		return NULL;
 
-	path = xmalloc((n+1)*sizeof(char *));
+	ALLOC_ARRAY(path, n + 1);
 	p = envpath;
 	i = 0;
 	do {
@@ -978,7 +978,7 @@ static wchar_t *make_environment_block(char **deltaenv)
 		i++;
 
 	/* copy the environment, leaving space for changes */
-	tmpenv = xmalloc((size + i) * sizeof(char*));
+	ALLOC_ARRAY(tmpenv, size + i);
 	memcpy(tmpenv, environ, size * sizeof(char*));
 
 	/* merge supplied environment changes into the temporary environment */
@@ -1168,7 +1168,7 @@ static int try_shell_exec(const char *cmd, char *const *argv)
 		int argc = 0;
 		const char **argv2;
 		while (argv[argc]) argc++;
-		argv2 = xmalloc(sizeof(*argv) * (argc+1));
+		ALLOC_ARRAY(argv2, argc + 1);
 		argv2[0] = (char *)cmd;	/* full path to the script file */
 		memcpy(&argv2[1], &argv[1], sizeof(*argv) * argc);
 		pid = mingw_spawnv(prog, argv2, 1);
diff --git a/diffcore-order.c b/diffcore-order.c
index 97dd3d0..69d41f7 100644
--- a/diffcore-order.c
+++ b/diffcore-order.c
@@ -52,7 +52,7 @@ static void prepare_order(const char *orderfile)
 		}
 		if (pass == 0) {
 			order_cnt = cnt;
-			order = xmalloc(sizeof(*order) * cnt);
+			ALLOC_ARRAY(order, cnt);
 		}
 	}
 }
@@ -120,7 +120,7 @@ void diffcore_order(const char *orderfile)
 	if (!q->nr)
 		return;
 
-	o = xmalloc(sizeof(*o) * q->nr);
+	ALLOC_ARRAY(o, q->nr);
 	for (i = 0; i < q->nr; i++)
 		o[i].obj = q->queue[i];
 	order_objects(orderfile, pair_pathtwo, o, q->nr);
diff --git a/dir.c b/dir.c
index f0b6d0a..66c93c1 100644
--- a/dir.c
+++ b/dir.c
@@ -2484,14 +2484,14 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	ud.untracked_alloc = value;
 	ud.untracked_nr	   = value;
 	if (ud.untracked_nr)
-		ud.untracked = xmalloc(sizeof(*ud.untracked) * ud.untracked_nr);
+		ALLOC_ARRAY(ud.untracked, ud.untracked_nr);
 	data = next;
 
 	next = data;
 	ud.dirs_alloc = ud.dirs_nr = decode_varint(&next);
 	if (next > end)
 		return -1;
-	ud.dirs = xmalloc(sizeof(*ud.dirs) * ud.dirs_nr);
+	ALLOC_ARRAY(ud.dirs, ud.dirs_nr);
 	data = next;
 
 	len = strlen((const char *)data);
@@ -2611,7 +2611,7 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
 	rd.data	      = next;
 	rd.end	      = end;
 	rd.index      = 0;
-	rd.ucd        = xmalloc(sizeof(*rd.ucd) * len);
+	ALLOC_ARRAY(rd.ucd, len);
 
 	if (read_one_dir(&uc->root, &rd) || rd.index != len)
 		goto done;
diff --git a/fast-import.c b/fast-import.c
index bf01b34..a6467cb 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -814,7 +814,8 @@ static struct tree_entry *new_tree_entry(void)
 	if (!avail_tree_entry) {
 		unsigned int n = tree_entry_alloc;
 		total_allocd += n * sizeof(struct tree_entry);
-		avail_tree_entry = e = xmalloc(n * sizeof(struct tree_entry));
+		ALLOC_ARRAY(e, n);
+		avail_tree_entry = e;
 		while (n-- > 1) {
 			*((void**)e) = e + 1;
 			e++;
@@ -898,7 +899,7 @@ static const char *create_index(void)
 	struct object_entry_pool *o;
 
 	/* Build the table of object IDs. */
-	idx = xmalloc(object_count * sizeof(*idx));
+	ALLOC_ARRAY(idx, object_count);
 	c = idx;
 	for (o = blocks; o; o = o->next_pool)
 		for (e = o->next_free; e-- != o->entries;)
diff --git a/fsck.c b/fsck.c
index c637f66..ca4c685 100644
--- a/fsck.c
+++ b/fsck.c
@@ -199,7 +199,8 @@ void fsck_set_msg_type(struct fsck_options *options,
 
 	if (!options->msg_type) {
 		int i;
-		int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX);
+		int *msg_type;
+		ALLOC_ARRAY(msg_type, FSCK_MSG_MAX);
 		for (i = 0; i < FSCK_MSG_MAX; i++)
 			msg_type[i] = fsck_msg_type(i, options);
 		options->msg_type = msg_type;
diff --git a/graph.c b/graph.c
index c25a09a..1350bdd 100644
--- a/graph.c
+++ b/graph.c
@@ -234,12 +234,10 @@ struct git_graph *graph_init(struct rev_info *opt)
 	 * We'll automatically grow columns later if we need more room.
 	 */
 	graph->column_capacity = 30;
-	graph->columns = xmalloc(sizeof(struct column) *
-				 graph->column_capacity);
-	graph->new_columns = xmalloc(sizeof(struct column) *
-				     graph->column_capacity);
-	graph->mapping = xmalloc(sizeof(int) * 2 * graph->column_capacity);
-	graph->new_mapping = xmalloc(sizeof(int) * 2 * graph->column_capacity);
+	ALLOC_ARRAY(graph->columns, graph->column_capacity);
+	ALLOC_ARRAY(graph->new_columns, graph->column_capacity);
+	ALLOC_ARRAY(graph->mapping, 2 * graph->column_capacity);
+	ALLOC_ARRAY(graph->new_mapping, 2 * graph->column_capacity);
 
 	/*
 	 * The diff output prefix callback, with this we can make
diff --git a/khash.h b/khash.h
index 376475a..c0da40d 100644
--- a/khash.h
+++ b/khash.h
@@ -117,7 +117,7 @@ static const double __ac_HASH_UPPER = 0.77;
 			if (new_n_buckets < 4) new_n_buckets = 4;					\
 			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0;	/* requested size is too small */ \
 			else { /* hash table size to be changed (shrink or expand); rehash */ \
-				new_flags = (khint32_t*)xmalloc(__ac_fsize(new_n_buckets) * sizeof(khint32_t));	\
+				ALLOC_ARRAY(new_flags, __ac_fsize(new_n_buckets)); \
 				if (!new_flags) return -1;								\
 				memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
 				if (h->n_buckets < new_n_buckets) {	/* expand */		\
diff --git a/levenshtein.c b/levenshtein.c
index fc28159..d263269 100644
--- a/levenshtein.c
+++ b/levenshtein.c
@@ -42,11 +42,13 @@ int levenshtein(const char *string1, const char *string2,
 		int w, int s, int a, int d)
 {
 	int len1 = strlen(string1), len2 = strlen(string2);
-	int *row0 = xmalloc(sizeof(int) * (len2 + 1));
-	int *row1 = xmalloc(sizeof(int) * (len2 + 1));
-	int *row2 = xmalloc(sizeof(int) * (len2 + 1));
+	int *row0, *row1, *row2;
 	int i, j;
 
+	ALLOC_ARRAY(row0, len2 + 1);
+	ALLOC_ARRAY(row1, len2 + 1);
+	ALLOC_ARRAY(row2, len2 + 1);
+
 	for (j = 0; j <= len2; j++)
 		row1[j] = j * a;
 	for (i = 0; i < len1; i++) {
diff --git a/line-log.c b/line-log.c
index 5877986..bbe31ed 100644
--- a/line-log.c
+++ b/line-log.c
@@ -522,7 +522,7 @@ static void fill_line_ends(struct diff_filespec *spec, long *lines,
 	if (diff_populate_filespec(spec, 0))
 		die("Cannot read blob %s", sha1_to_hex(spec->sha1));
 
-	ends = xmalloc(size * sizeof(*ends));
+	ALLOC_ARRAY(ends, size);
 	ends[cur++] = 0;
 	data = spec->data;
 	while (num < spec->size) {
@@ -1142,9 +1142,9 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
 	if (nparents > 1 && rev->first_parent_only)
 		nparents = 1;
 
-	diffqueues = xmalloc(nparents * sizeof(*diffqueues));
-	cand = xmalloc(nparents * sizeof(*cand));
-	parents = xmalloc(nparents * sizeof(*parents));
+	ALLOC_ARRAY(diffqueues, nparents);
+	ALLOC_ARRAY(cand, nparents);
+	ALLOC_ARRAY(parents, nparents);
 
 	p = commit->parents;
 	for (i = 0; i < nparents; i++) {
diff --git a/notes.c b/notes.c
index c1e5035..88cf474 100644
--- a/notes.c
+++ b/notes.c
@@ -1035,7 +1035,7 @@ struct notes_tree **load_notes_trees(struct string_list *refs, int flags)
 	struct string_list_item *item;
 	int counter = 0;
 	struct notes_tree **trees;
-	trees = xmalloc((refs->nr+1) * sizeof(struct notes_tree *));
+	ALLOC_ARRAY(trees, refs->nr + 1);
 	for_each_string_list_item(item, refs) {
 		struct notes_tree *t = xcalloc(1, sizeof(struct notes_tree));
 		init_notes(t, item->string, combine_notes_ignore, flags);
diff --git a/pack-check.c b/pack-check.c
index 433bd86..1da89a4 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -89,7 +89,7 @@ static int verify_packfile(struct packed_git *p,
 	 * we do not do scan-streaming check on the pack file.
 	 */
 	nr_objects = p->num_objects;
-	entries = xmalloc((nr_objects + 1) * sizeof(*entries));
+	ALLOC_ARRAY(entries, nr_objects + 1);
 	entries[nr_objects].offset = pack_sig_ofs;
 	/* first sort entries by pack offset, since unpacking them is more efficient that way */
 	for (i = 0; i < nr_objects; i++) {
diff --git a/pack-revindex.c b/pack-revindex.c
index 155a8a3..96d51c3 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -44,10 +44,14 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
 	 * keep track of them with alias pointers, always sorting from "from"
 	 * to "to".
 	 */
-	struct revindex_entry *tmp = xmalloc(n * sizeof(*tmp));
-	struct revindex_entry *from = entries, *to = tmp;
+	struct revindex_entry *tmp, *from, *to;
 	int bits;
-	unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));
+	unsigned *pos;
+
+	ALLOC_ARRAY(pos, BUCKETS);
+	ALLOC_ARRAY(tmp, n);
+	from = entries;
+	to = tmp;
 
 	/*
 	 * If (max >> bits) is zero, then we know that the radix digit we are
@@ -121,7 +125,7 @@ static void create_pack_revindex(struct packed_git *p)
 	unsigned i;
 	const char *index = p->index_data;
 
-	p->revindex = xmalloc(sizeof(*p->revindex) * (num_ent + 1));
+	ALLOC_ARRAY(p->revindex, num_ent + 1);
 	index += 4 * 256;
 
 	if (p->index_version > 1) {
diff --git a/pathspec.c b/pathspec.c
index 9304ee3..c9e9b6c 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -406,7 +406,8 @@ void parse_pathspec(struct pathspec *pathspec,
 		n++;
 
 	pathspec->nr = n;
-	pathspec->items = item = xmalloc(sizeof(*item) * n);
+	ALLOC_ARRAY(pathspec->items, n);
+	item = pathspec->items;
 	pathspec->_raw = argv;
 	prefixlen = prefix ? strlen(prefix) : 0;
 
@@ -483,7 +484,7 @@ const char **get_pathspec(const char *prefix, const char **pathspec)
 void copy_pathspec(struct pathspec *dst, const struct pathspec *src)
 {
 	*dst = *src;
-	dst->items = xmalloc(sizeof(struct pathspec_item) * dst->nr);
+	ALLOC_ARRAY(dst->items, dst->nr);
 	memcpy(dst->items, src->items,
 	       sizeof(struct pathspec_item) * dst->nr);
 }
diff --git a/remote-curl.c b/remote-curl.c
index 42deeec..5259b7e 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -696,9 +696,10 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 static int fetch_dumb(int nr_heads, struct ref **to_fetch)
 {
 	struct walker *walker;
-	char **targets = xmalloc(nr_heads * sizeof(char*));
+	char **targets;
 	int ret, i;
 
+	ALLOC_ARRAY(targets, nr_heads);
 	if (options.depth)
 		die("dumb http transport does not support --depth");
 	for (i = 0; i < nr_heads; i++)
diff --git a/sha1_file.c b/sha1_file.c
index aab1872..2f1c6d3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1942,7 +1942,7 @@ static enum object_type packed_to_object_type(struct packed_git *p,
 		/* Push the object we're going to leave behind */
 		if (poi_stack_nr >= poi_stack_alloc && poi_stack == small_poi_stack) {
 			poi_stack_alloc = alloc_nr(poi_stack_nr);
-			poi_stack = xmalloc(sizeof(off_t)*poi_stack_alloc);
+			ALLOC_ARRAY(poi_stack, poi_stack_alloc);
 			memcpy(poi_stack, small_poi_stack, sizeof(off_t)*poi_stack_nr);
 		} else {
 			ALLOC_GROW(poi_stack, poi_stack_nr+1, poi_stack_alloc);
@@ -2308,7 +2308,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 		if (delta_stack_nr >= delta_stack_alloc
 		    && delta_stack == small_delta_stack) {
 			delta_stack_alloc = alloc_nr(delta_stack_nr);
-			delta_stack = xmalloc(sizeof(*delta_stack)*delta_stack_alloc);
+			ALLOC_ARRAY(delta_stack, delta_stack_alloc);
 			memcpy(delta_stack, small_delta_stack,
 			       sizeof(*delta_stack)*delta_stack_nr);
 		} else {
diff --git a/shallow.c b/shallow.c
index 60f1505..71163bf 100644
--- a/shallow.c
+++ b/shallow.c
@@ -315,8 +315,8 @@ void prepare_shallow_info(struct shallow_info *info, struct sha1_array *sa)
 	info->shallow = sa;
 	if (!sa)
 		return;
-	info->ours = xmalloc(sizeof(*info->ours) * sa->nr);
-	info->theirs = xmalloc(sizeof(*info->theirs) * sa->nr);
+	ALLOC_ARRAY(info->ours, sa->nr);
+	ALLOC_ARRAY(info->theirs, sa->nr);
 	for (i = 0; i < sa->nr; i++) {
 		if (has_sha1_file(sa->sha1[i])) {
 			struct commit_graft *graft;
@@ -487,7 +487,7 @@ void assign_shallow_commits_to_refs(struct shallow_info *info,
 	struct paint_info pi;
 
 	trace_printf_key(&trace_shallow, "shallow: assign_shallow_commits_to_refs\n");
-	shallow = xmalloc(sizeof(*shallow) * (info->nr_ours + info->nr_theirs));
+	ALLOC_ARRAY(shallow, info->nr_ours + info->nr_theirs);
 	for (i = 0; i < info->nr_ours; i++)
 		shallow[nr_shallow++] = info->ours[i];
 	for (i = 0; i < info->nr_theirs; i++)
diff --git a/show-index.c b/show-index.c
index d9e4903..acf8d54 100644
--- a/show-index.c
+++ b/show-index.c
@@ -50,7 +50,8 @@ int main(int argc, char **argv)
 			unsigned char sha1[20];
 			uint32_t crc;
 			uint32_t off;
-		} *entries = xmalloc(nr * sizeof(entries[0]));
+		} *entries;
+		ALLOC_ARRAY(entries, nr);
 		for (i = 0; i < nr; i++)
 			if (fread(entries[i].sha1, 20, 1, stdin) != 1)
 				die("unable to read sha1 %u/%u", i, nr);
diff --git a/transport.c b/transport.c
index c92f8ae..3b4e644 100644
--- a/transport.c
+++ b/transport.c
@@ -977,7 +977,7 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 		 * This condition shouldn't be met in a non-deepening fetch
 		 * (see builtin/fetch.c:quickfetch()).
 		 */
-		heads = xmalloc(nr_refs * sizeof(*heads));
+		ALLOC_ARRAY(heads, nr_refs);
 		for (rm = refs; rm; rm = rm->next)
 			heads[nr_heads++] = rm;
 	}
diff --git a/xdiff-interface.c b/xdiff-interface.c
index cb67c1c..54236f2 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -265,7 +265,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
 	for (i = 0, regs->nr = 1; value[i]; i++)
 		if (value[i] == '\n')
 			regs->nr++;
-	regs->array = xmalloc(regs->nr * sizeof(struct ff_reg));
+	ALLOC_ARRAY(regs->array, regs->nr);
 	for (i = 0; i < regs->nr; i++) {
 		struct ff_reg *reg = regs->array + i;
 		const char *ep = strchr(value, '\n'), *expression;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 09/22] use xmallocz to avoid size arithmetic
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (7 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 08/22] convert trivial cases to ALLOC_ARRAY Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 10/22] convert trivial cases to FLEX_ARRAY macros Jeff King
                       ` (13 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/check-ref-format.c | 2 +-
 builtin/merge-tree.c       | 2 +-
 builtin/worktree.c         | 2 +-
 column.c                   | 3 +--
 combine-diff.c             | 4 +---
 config.c                   | 4 +---
 dir.c                      | 2 +-
 entry.c                    | 2 +-
 grep.c                     | 3 +--
 imap-send.c                | 5 ++---
 ll-merge.c                 | 2 +-
 progress.c                 | 2 +-
 refs.c                     | 2 +-
 setup.c                    | 5 ++---
 strbuf.c                   | 2 +-
 15 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/builtin/check-ref-format.c b/builtin/check-ref-format.c
index fd915d5..eac4994 100644
--- a/builtin/check-ref-format.c
+++ b/builtin/check-ref-format.c
@@ -20,7 +20,7 @@ static const char builtin_check_ref_format_usage[] =
  */
 static char *collapse_slashes(const char *refname)
 {
-	char *ret = xmalloc(strlen(refname) + 1);
+	char *ret = xmallocz(strlen(refname));
 	char ch;
 	char prev = '/';
 	char *cp = ret;
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index d4f0cbd..ca57004 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -174,7 +174,7 @@ static struct merge_list *create_entry(unsigned stage, unsigned mode, const unsi
 
 static char *traverse_path(const struct traverse_info *info, const struct name_entry *n)
 {
-	char *path = xmalloc(traverse_path_len(info, n) + 1);
+	char *path = xmallocz(traverse_path_len(info, n));
 	return make_traverse_path(path, info, n);
 }
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 475b958..0a45710 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -52,7 +52,7 @@ static int prune_worktree(const char *id, struct strbuf *reason)
 		return 1;
 	}
 	len = st.st_size;
-	path = xmalloc(len + 1);
+	path = xmallocz(len);
 	read_in_full(fd, path, len);
 	close(fd);
 	while (len && (path[len - 1] == '\n' || path[len - 1] == '\r'))
diff --git a/column.c b/column.c
index f9fda68..d55ead1 100644
--- a/column.c
+++ b/column.c
@@ -173,9 +173,8 @@ static void display_table(const struct string_list *list,
 	if (colopts & COL_DENSE)
 		shrink_columns(&data);
 
-	empty_cell = xmalloc(initial_width + 1);
+	empty_cell = xmallocz(initial_width);
 	memset(empty_cell, ' ', initial_width);
-	empty_cell[initial_width] = '\0';
 	for (y = 0; y < data.rows; y++) {
 		for (x = 0; x < data.cols; x++)
 			if (display_cell(&data, initial_width, empty_cell, x, y))
diff --git a/combine-diff.c b/combine-diff.c
index a698016..890c415 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1043,7 +1043,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 				elem->mode = canon_mode(S_IFLNK);
 
 			result_size = len;
-			result = xmalloc(len + 1);
+			result = xmallocz(len);
 
 			done = read_in_full(fd, result, len);
 			if (done < 0)
@@ -1051,8 +1051,6 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 			else if (done < len)
 				die("early EOF '%s'", elem->path);
 
-			result[len] = 0;
-
 			/* If not a fake symlink, apply filters, e.g. autocrlf */
 			if (is_file) {
 				struct strbuf buf = STRBUF_INIT;
diff --git a/config.c b/config.c
index b95ac3a..e7b589a 100644
--- a/config.c
+++ b/config.c
@@ -1902,7 +1902,7 @@ static int git_config_parse_key_1(const char *key, char **store_key, int *basele
 	 * Validate the key and while at it, lower case it for matching.
 	 */
 	if (store_key)
-		*store_key = xmalloc(strlen(key) + 1);
+		*store_key = xmallocz(strlen(key));
 
 	dot = 0;
 	for (i = 0; key[i]; i++) {
@@ -1926,8 +1926,6 @@ static int git_config_parse_key_1(const char *key, char **store_key, int *basele
 		if (store_key)
 			(*store_key)[i] = c;
 	}
-	if (store_key)
-		(*store_key)[i] = 0;
 
 	return 0;
 
diff --git a/dir.c b/dir.c
index 66c93c1..6c627d5 100644
--- a/dir.c
+++ b/dir.c
@@ -711,7 +711,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 			close(fd);
 			return 0;
 		}
-		buf = xmalloc(size+1);
+		buf = xmallocz(size);
 		if (read_in_full(fd, buf, size) != size) {
 			free(buf);
 			close(fd);
diff --git a/entry.c b/entry.c
index 582c400..a410957 100644
--- a/entry.c
+++ b/entry.c
@@ -6,7 +6,7 @@
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
 {
-	char *buf = xmalloc(path_len + 1);
+	char *buf = xmallocz(path_len);
 	int len = 0;
 
 	while (len < path_len) {
diff --git a/grep.c b/grep.c
index 7b2b96a..528b652 100644
--- a/grep.c
+++ b/grep.c
@@ -1741,7 +1741,7 @@ static int grep_source_load_file(struct grep_source *gs)
 	i = open(filename, O_RDONLY);
 	if (i < 0)
 		goto err_ret;
-	data = xmalloc(size + 1);
+	data = xmallocz(size);
 	if (st.st_size != read_in_full(i, data, size)) {
 		error(_("'%s': short read %s"), filename, strerror(errno));
 		close(i);
@@ -1749,7 +1749,6 @@ static int grep_source_load_file(struct grep_source *gs)
 		return -1;
 	}
 	close(i);
-	data[size] = 0;
 
 	gs->buf = data;
 	gs->size = size;
diff --git a/imap-send.c b/imap-send.c
index 4d3b773..2c52027 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -892,12 +892,11 @@ static char *cram(const char *challenge_64, const char *user, const char *pass)
 	response = xstrfmt("%s %s", user, hex);
 	resp_len = strlen(response) + 1;
 
-	response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1);
+	response_64 = xmallocz(ENCODED_SIZE(resp_len));
 	encoded_len = EVP_EncodeBlock((unsigned char *)response_64,
 				      (unsigned char *)response, resp_len);
 	if (encoded_len < 0)
 		die("EVP_EncodeBlock error");
-	response_64[encoded_len] = '\0';
 	return (char *)response_64;
 }
 
@@ -1188,7 +1187,7 @@ static void lf_to_crlf(struct strbuf *msg)
 		j++;
 	}
 
-	new = xmalloc(j + 1);
+	new = xmallocz(j);
 
 	/*
 	 * Second pass: write the new string.  Note that this loop is
diff --git a/ll-merge.c b/ll-merge.c
index 0338630..ff4a43a 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -205,7 +205,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	if (fstat(fd, &st))
 		goto close_bad;
 	result->size = st.st_size;
-	result->ptr = xmalloc(result->size + 1);
+	result->ptr = xmallocz(result->size);
 	if (read_in_full(fd, result->ptr, result->size) != result->size) {
 		free(result->ptr);
 		result->ptr = NULL;
diff --git a/progress.c b/progress.c
index 353bd37..76a88c5 100644
--- a/progress.c
+++ b/progress.c
@@ -247,7 +247,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 		size_t len = strlen(msg) + 5;
 		struct throughput *tp = progress->throughput;
 
-		bufp = (len < sizeof(buf)) ? buf : xmalloc(len + 1);
+		bufp = (len < sizeof(buf)) ? buf : xmallocz(len);
 		if (tp) {
 			unsigned int rate = !tp->avg_misecs ? 0 :
 					tp->avg_bytes / tp->avg_misecs;
diff --git a/refs.c b/refs.c
index e2d34b2..1d9e2a7 100644
--- a/refs.c
+++ b/refs.c
@@ -124,7 +124,7 @@ int refname_is_safe(const char *refname)
 		char *buf;
 		int result;
 
-		buf = xmalloc(strlen(refname) + 1);
+		buf = xmallocz(strlen(refname));
 		/*
 		 * Does the refname try to escape refs/?
 		 * For example: refs/foo/../bar is safe but refs/foo/../../bar
diff --git a/setup.c b/setup.c
index 0deb022..ab21086 100644
--- a/setup.c
+++ b/setup.c
@@ -88,7 +88,7 @@ char *prefix_path_gently(const char *prefix, int len,
 	const char *orig = path;
 	char *sanitized;
 	if (is_absolute_path(orig)) {
-		sanitized = xmalloc(strlen(path) + 1);
+		sanitized = xmallocz(strlen(path));
 		if (remaining_prefix)
 			*remaining_prefix = 0;
 		if (normalize_path_copy_len(sanitized, path, remaining_prefix)) {
@@ -488,14 +488,13 @@ const char *read_gitfile_gently(const char *path, int *return_error_code)
 		error_code = READ_GITFILE_ERR_OPEN_FAILED;
 		goto cleanup_return;
 	}
-	buf = xmalloc(st.st_size + 1);
+	buf = xmallocz(st.st_size);
 	len = read_in_full(fd, buf, st.st_size);
 	close(fd);
 	if (len != st.st_size) {
 		error_code = READ_GITFILE_ERR_READ_FAILED;
 		goto cleanup_return;
 	}
-	buf[len] = '\0';
 	if (!starts_with(buf, "gitdir: ")) {
 		error_code = READ_GITFILE_ERR_INVALID_FORMAT;
 		goto cleanup_return;
diff --git a/strbuf.c b/strbuf.c
index bab316d..f60e2ee 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -718,7 +718,7 @@ char *xstrdup_tolower(const char *string)
 	size_t len, i;
 
 	len = strlen(string);
-	result = xmalloc(len + 1);
+	result = xmallocz(len);
 	for (i = 0; i < len; i++)
 		result[i] = tolower(string[i]);
 	result[i] = '\0';
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 10/22] convert trivial cases to FLEX_ARRAY macros
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (8 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 09/22] use xmallocz to avoid size arithmetic Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 11/22] use st_add and st_mult for allocation size computation Jeff King
                       ` (12 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.

Signed-off-by: Jeff King <peff@peff.net>
---
 attr.c               |  4 +---
 builtin/blame.c      |  4 +---
 builtin/help.c       |  9 +++------
 builtin/mktree.c     |  9 +++++----
 builtin/reflog.c     |  7 ++-----
 cache-tree.c         |  4 +---
 combine-diff.c       |  4 +---
 diff.c               |  7 ++-----
 dir.c                | 16 +++-------------
 hashmap.c            |  3 +--
 help.c               |  6 ++----
 log-tree.c           |  5 ++---
 name-hash.c          |  3 +--
 ref-filter.c         |  6 ++----
 refs.c               |  6 ++----
 refs/files-backend.c | 19 +++++--------------
 remote.c             |  5 +----
 17 files changed, 35 insertions(+), 82 deletions(-)

diff --git a/attr.c b/attr.c
index c83ec49..6537a43 100644
--- a/attr.c
+++ b/attr.c
@@ -93,9 +93,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	if (invalid_attr_name(name, len))
 		return NULL;
 
-	a = xmalloc(sizeof(*a) + len + 1);
-	memcpy(a->name, name, len);
-	a->name[len] = 0;
+	FLEX_ALLOC_MEM(a, name, name, len);
 	a->h = hval;
 	a->next = git_attr_hash[pos];
 	a->attr_nr = attr_nr++;
diff --git a/builtin/blame.c b/builtin/blame.c
index b4ed462..e982fb8 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -466,13 +466,11 @@ static void queue_blames(struct scoreboard *sb, struct origin *porigin,
 static struct origin *make_origin(struct commit *commit, const char *path)
 {
 	struct origin *o;
-	size_t pathlen = strlen(path) + 1;
-	o = xcalloc(1, sizeof(*o) + pathlen);
+	FLEX_ALLOC_STR(o, path, path);
 	o->commit = commit;
 	o->refcnt = 1;
 	o->next = commit->util;
 	commit->util = o;
-	memcpy(o->path, path, pathlen); /* includes NUL */
 	return o;
 }
 
diff --git a/builtin/help.c b/builtin/help.c
index 1cd0c1e..3c55ce4 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -171,12 +171,10 @@ static void exec_man_cmd(const char *cmd, const char *page)
 static void add_man_viewer(const char *name)
 {
 	struct man_viewer_list **p = &man_viewer_list;
-	size_t len = strlen(name);
 
 	while (*p)
 		p = &((*p)->next);
-	*p = xcalloc(1, (sizeof(**p) + len + 1));
-	memcpy((*p)->name, name, len); /* NUL-terminated by xcalloc */
+	FLEX_ALLOC_STR(*p, name, name);
 }
 
 static int supported_man_viewer(const char *name, size_t len)
@@ -190,9 +188,8 @@ static void do_add_man_viewer_info(const char *name,
 				   size_t len,
 				   const char *value)
 {
-	struct man_viewer_info_list *new = xcalloc(1, sizeof(*new) + len + 1);
-
-	memcpy(new->name, name, len); /* NUL-terminated by xcalloc */
+	struct man_viewer_info_list *new;
+	FLEX_ALLOC_MEM(new, name, name, len);
 	new->info = xstrdup(value);
 	new->next = man_viewer_info_list;
 	man_viewer_info_list = new;
diff --git a/builtin/mktree.c b/builtin/mktree.c
index a237caa..4282b62 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -19,16 +19,17 @@ static int alloc, used;
 static void append_to_tree(unsigned mode, unsigned char *sha1, char *path)
 {
 	struct treeent *ent;
-	int len = strlen(path);
+	size_t len = strlen(path);
 	if (strchr(path, '/'))
 		die("path %s contains slash", path);
 
-	ALLOC_GROW(entries, used + 1, alloc);
-	ent = entries[used++] = xmalloc(sizeof(**entries) + len + 1);
+	FLEX_ALLOC_MEM(ent, name, path, len);
 	ent->mode = mode;
 	ent->len = len;
 	hashcpy(ent->sha1, sha1);
-	memcpy(ent->name, path, len+1);
+
+	ALLOC_GROW(entries, used + 1, alloc);
+	entries[used++] = ent;
 }
 
 static int ent_compare(const void *a_, const void *b_)
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 9980731..2d46b64 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -382,11 +382,9 @@ static int collect_reflog(const char *ref, const struct object_id *oid, int unus
 {
 	struct collected_reflog *e;
 	struct collect_reflog_cb *cb = cb_data;
-	size_t namelen = strlen(ref);
 
-	e = xmalloc(sizeof(*e) + namelen + 1);
+	FLEX_ALLOC_STR(e, reflog, ref);
 	hashcpy(e->sha1, oid->hash);
-	memcpy(e->reflog, ref, namelen + 1);
 	ALLOC_GROW(cb->e, cb->nr + 1, cb->alloc);
 	cb->e[cb->nr++] = e;
 	return 0;
@@ -411,8 +409,7 @@ static struct reflog_expire_cfg *find_cfg_ent(const char *pattern, size_t len)
 		    ent->pattern[len] == '\0')
 			return ent;
 
-	ent = xcalloc(1, sizeof(*ent) + len + 1);
-	memcpy(ent->pattern, pattern, len);
+	FLEX_ALLOC_MEM(ent, pattern, pattern, len);
 	*reflog_expire_cfg_tail = ent;
 	reflog_expire_cfg_tail = &(ent->next);
 	return ent;
diff --git a/cache-tree.c b/cache-tree.c
index 20ee7b5..3ebf9c3 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -79,11 +79,9 @@ static struct cache_tree_sub *find_subtree(struct cache_tree *it,
 	ALLOC_GROW(it->down, it->subtree_nr + 1, it->subtree_alloc);
 	it->subtree_nr++;
 
-	down = xmalloc(sizeof(*down) + pathlen + 1);
+	FLEX_ALLOC_MEM(down, name, path, pathlen);
 	down->cache_tree = NULL;
 	down->namelen = pathlen;
-	memcpy(down->name, path, pathlen);
-	down->name[pathlen] = 0;
 
 	if (pos < it->subtree_nr)
 		memmove(it->down + pos + 1,
diff --git a/combine-diff.c b/combine-diff.c
index 890c415..be09a2b 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -319,7 +319,7 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	if (line[len-1] == '\n')
 		len--;
 
-	lline = xmalloc(sizeof(*lline) + len + 1);
+	FLEX_ALLOC_MEM(lline, line, line, len);
 	lline->len = len;
 	lline->next = NULL;
 	lline->prev = sline->plost.lost_tail;
@@ -330,8 +330,6 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	sline->plost.lost_tail = lline;
 	sline->plost.len++;
 	lline->parent_map = this_mask;
-	memcpy(lline->line, line, len);
-	lline->line[len] = 0;
 }
 
 struct combine_diff_state {
diff --git a/diff.c b/diff.c
index 2136b69..27d14a7 100644
--- a/diff.c
+++ b/diff.c
@@ -2607,12 +2607,9 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 
 struct diff_filespec *alloc_filespec(const char *path)
 {
-	int namelen = strlen(path);
-	struct diff_filespec *spec = xmalloc(sizeof(*spec) + namelen + 1);
+	struct diff_filespec *spec;
 
-	memset(spec, 0, sizeof(*spec));
-	spec->path = (char *)(spec + 1);
-	memcpy(spec->path, path, namelen+1);
+	FLEXPTR_ALLOC_STR(spec, path, path);
 	spec->count = 1;
 	spec->is_binary = -1;
 	return spec;
diff --git a/dir.c b/dir.c
index 6c627d5..f06ebb7 100644
--- a/dir.c
+++ b/dir.c
@@ -503,12 +503,7 @@ void add_exclude(const char *string, const char *base,
 
 	parse_exclude_pattern(&string, &patternlen, &flags, &nowildcardlen);
 	if (flags & EXC_FLAG_MUSTBEDIR) {
-		char *s;
-		x = xmalloc(sizeof(*x) + patternlen + 1);
-		s = (char *)(x+1);
-		memcpy(s, string, patternlen);
-		s[patternlen] = '\0';
-		x->pattern = s;
+		FLEXPTR_ALLOC_MEM(x, pattern, string, patternlen);
 	} else {
 		x = xmalloc(sizeof(*x));
 		x->pattern = string;
@@ -625,10 +620,7 @@ static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
 	}
 
 	uc->dir_created++;
-	d = xmalloc(sizeof(*d) + len + 1);
-	memset(d, 0, sizeof(*d));
-	memcpy(d->name, name, len);
-	d->name[len] = '\0';
+	FLEX_ALLOC_MEM(d, name, name, len);
 
 	ALLOC_GROW(dir->dirs, dir->dirs_nr + 1, dir->dirs_alloc);
 	memmove(dir->dirs + first + 1, dir->dirs + first,
@@ -1167,10 +1159,8 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
 {
 	struct dir_entry *ent;
 
-	ent = xmalloc(sizeof(*ent) + len + 1);
+	FLEX_ALLOC_MEM(ent, name, pathname, len);
 	ent->len = len;
-	memcpy(ent->name, pathname, len);
-	ent->name[len] = 0;
 	return ent;
 }
 
diff --git a/hashmap.c b/hashmap.c
index f693839..b10b642 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -256,10 +256,9 @@ const void *memintern(const void *data, size_t len)
 	e = hashmap_get(&map, &key, data);
 	if (!e) {
 		/* not found: create it */
-		e = xmallocz(sizeof(struct pool_entry) + len);
+		FLEX_ALLOC_MEM(e, data, data, len);
 		hashmap_entry_init(e, key.ent.hash);
 		e->len = len;
-		memcpy(e->data, data, len);
 		hashmap_add(&map, e);
 	}
 	return e->data;
diff --git a/help.c b/help.c
index d996b34..19328ea 100644
--- a/help.c
+++ b/help.c
@@ -11,11 +11,9 @@
 
 void add_cmdname(struct cmdnames *cmds, const char *name, int len)
 {
-	struct cmdname *ent = xmalloc(sizeof(*ent) + len + 1);
-
+	struct cmdname *ent;
+	FLEX_ALLOC_MEM(ent, name, name, len);
 	ent->len = len;
-	memcpy(ent->name, name, len);
-	ent->name[len] = 0;
 
 	ALLOC_GROW(cmds->names, cmds->cnt + 1, cmds->alloc);
 	cmds->names[cmds->cnt++] = ent;
diff --git a/log-tree.c b/log-tree.c
index f70a30e..60f9839 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -77,9 +77,8 @@ int parse_decorate_color_config(const char *var, const char *slot_name, const ch
 
 void add_name_decoration(enum decoration_type type, const char *name, struct object *obj)
 {
-	int nlen = strlen(name);
-	struct name_decoration *res = xmalloc(sizeof(*res) + nlen + 1);
-	memcpy(res->name, name, nlen + 1);
+	struct name_decoration *res;
+	FLEX_ALLOC_STR(res, name, name);
 	res->type = type;
 	res->next = add_decoration(&name_decoration, obj, res);
 }
diff --git a/name-hash.c b/name-hash.c
index 332ba95..6d9f23e 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -55,10 +55,9 @@ static struct dir_entry *hash_dir_entry(struct index_state *istate,
 	dir = find_dir_entry(istate, ce->name, namelen);
 	if (!dir) {
 		/* not found, create it and add to hash table */
-		dir = xcalloc(1, sizeof(struct dir_entry) + namelen + 1);
+		FLEX_ALLOC_MEM(dir, name, ce->name, namelen);
 		hashmap_entry_init(dir, memihash(ce->name, namelen));
 		dir->namelen = namelen;
-		strncpy(dir->name, ce->name, namelen);
 		hashmap_add(&istate->dir_hash, dir);
 
 		/* recursively add missing parent directories */
diff --git a/ref-filter.c b/ref-filter.c
index f097176..9ccfc51 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1255,10 +1255,8 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 						 const unsigned char *objectname,
 						 int flag)
 {
-	size_t len = strlen(refname);
-	struct ref_array_item *ref = xcalloc(1, sizeof(struct ref_array_item) + len + 1);
-	memcpy(ref->refname, refname, len);
-	ref->refname[len] = '\0';
+	struct ref_array_item *ref;
+	FLEX_ALLOC_STR(ref, refname, refname);
 	hashcpy(ref->objectname, objectname);
 	ref->flag = flag;
 
diff --git a/refs.c b/refs.c
index 1d9e2a7..2d86445 100644
--- a/refs.c
+++ b/refs.c
@@ -761,10 +761,8 @@ void ref_transaction_free(struct ref_transaction *transaction)
 static struct ref_update *add_update(struct ref_transaction *transaction,
 				     const char *refname)
 {
-	size_t len = strlen(refname) + 1;
-	struct ref_update *update = xcalloc(1, sizeof(*update) + len);
-
-	memcpy((char *)update->refname, refname, len); /* includes NUL */
+	struct ref_update *update;
+	FLEX_ALLOC_STR(update, refname, refname);
 	ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
 	transaction->updates[transaction->nr++] = update;
 	return update;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index b569762..81f68f8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -199,17 +199,14 @@ static struct ref_entry *create_ref_entry(const char *refname,
 					  const unsigned char *sha1, int flag,
 					  int check_name)
 {
-	int len;
 	struct ref_entry *ref;
 
 	if (check_name &&
 	    check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
 		die("Reference has invalid format: '%s'", refname);
-	len = strlen(refname) + 1;
-	ref = xmalloc(sizeof(struct ref_entry) + len);
+	FLEX_ALLOC_STR(ref, name, refname);
 	hashcpy(ref->u.value.oid.hash, sha1);
 	oidclr(&ref->u.value.peeled);
-	memcpy(ref->name, refname, len);
 	ref->flag = flag;
 	return ref;
 }
@@ -268,9 +265,7 @@ static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
 					  int incomplete)
 {
 	struct ref_entry *direntry;
-	direntry = xcalloc(1, sizeof(struct ref_entry) + len + 1);
-	memcpy(direntry->name, dirname, len);
-	direntry->name[len] = '\0';
+	FLEX_ALLOC_MEM(direntry, name, dirname, len);
 	direntry->u.subdir.ref_cache = ref_cache;
 	direntry->flag = REF_DIR | (incomplete ? REF_INCOMPLETE : 0);
 	return direntry;
@@ -939,13 +934,10 @@ static void clear_loose_ref_cache(struct ref_cache *refs)
  */
 static struct ref_cache *create_ref_cache(const char *submodule)
 {
-	int len;
 	struct ref_cache *refs;
 	if (!submodule)
 		submodule = "";
-	len = strlen(submodule) + 1;
-	refs = xcalloc(1, sizeof(struct ref_cache) + len);
-	memcpy(refs->name, submodule, len);
+	FLEX_ALLOC_STR(refs, name, submodule);
 	refs->next = submodule_ref_caches;
 	submodule_ref_caches = refs;
 	return refs;
@@ -2197,10 +2189,9 @@ static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data)
 
 	/* Schedule the loose reference for pruning if requested. */
 	if ((cb->flags & PACK_REFS_PRUNE)) {
-		int namelen = strlen(entry->name) + 1;
-		struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen);
+		struct ref_to_prune *n;
+		FLEX_ALLOC_STR(n, name, entry->name);
 		hashcpy(n->sha1, entry->u.value.oid.hash);
-		memcpy(n->name, entry->name, namelen); /* includes NUL */
 		n->next = cb->ref_to_prune;
 		cb->ref_to_prune = n;
 	}
diff --git a/remote.c b/remote.c
index 02e698a..25a960f 100644
--- a/remote.c
+++ b/remote.c
@@ -2136,16 +2136,13 @@ static int one_local_ref(const char *refname, const struct object_id *oid,
 {
 	struct ref ***local_tail = cb_data;
 	struct ref *ref;
-	int len;
 
 	/* we already know it starts with refs/ to get here */
 	if (check_refname_format(refname + 5, 0))
 		return 0;
 
-	len = strlen(refname) + 1;
-	ref = xcalloc(1, sizeof(*ref) + len);
+	ref = alloc_ref(refname);
 	oidcpy(&ref->new_oid, oid);
-	memcpy(ref->name, refname, len);
 	**local_tail = ref;
 	*local_tail = &ref->next;
 	return 0;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 11/22] use st_add and st_mult for allocation size computation
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (9 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 10/22] convert trivial cases to FLEX_ARRAY macros Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 12/22] prepare_{git,shell}_cmd: use argv_array Jeff King
                       ` (11 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
---
 archive.c              |  4 ++--
 builtin/apply.c        |  2 +-
 builtin/clean.c        |  2 +-
 builtin/fetch.c        |  2 +-
 builtin/index-pack.c   |  4 ++--
 builtin/merge.c        |  2 +-
 builtin/mv.c           |  4 ++--
 builtin/receive-pack.c |  2 +-
 combine-diff.c         | 14 +++++++-------
 commit.c               |  2 +-
 compat/mingw.c         |  4 ++--
 compat/qsort.c         |  2 +-
 compat/setenv.c        |  2 +-
 compat/win32/syslog.c  |  4 ++--
 diffcore-delta.c       |  6 ++++--
 diffcore-rename.c      |  2 +-
 dir.c                  |  4 ++--
 fast-import.c          |  2 +-
 refs.c                 |  2 +-
 remote.c               |  8 ++++----
 revision.c             |  2 +-
 sha1_file.c            | 20 +++++++++++---------
 sha1_name.c            |  5 ++---
 shallow.c              |  2 +-
 submodule.c            |  6 +++---
 25 files changed, 56 insertions(+), 53 deletions(-)

diff --git a/archive.c b/archive.c
index 0687afa..5d735ae 100644
--- a/archive.c
+++ b/archive.c
@@ -171,8 +171,8 @@ static void queue_directory(const unsigned char *sha1,
 		unsigned mode, int stage, struct archiver_context *c)
 {
 	struct directory *d;
-	size_t len = base->len + 1 + strlen(filename) + 1;
-	d = xmalloc(sizeof(*d) + len);
+	size_t len = st_add4(base->len, 1, strlen(filename), 1);
+	d = xmalloc(st_add(sizeof(*d), len));
 	d->up	   = c->bottom;
 	d->baselen = base->len;
 	d->mode	   = mode;
diff --git a/builtin/apply.c b/builtin/apply.c
index d61ac65..42c610e 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -2632,7 +2632,7 @@ static void update_image(struct image *img,
 	insert_count = postimage->len;
 
 	/* Adjust the contents */
-	result = xmalloc(img->len + insert_count - remove_count + 1);
+	result = xmalloc(st_add3(st_sub(img->len, remove_count), insert_count, 1));
 	memcpy(result, img->buf, applied_at);
 	memcpy(result + applied_at, postimage->buf, postimage->len);
 	memcpy(result + applied_at + postimage->len,
diff --git a/builtin/clean.c b/builtin/clean.c
index 8229f7e..0371010 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -615,7 +615,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 				nr += chosen[i];
 		}
 
-		result = xcalloc(nr + 1, sizeof(int));
+		result = xcalloc(st_add(nr, 1), sizeof(int));
 		for (i = 0; i < stuff->nr && j < nr; i++) {
 			if (chosen[i])
 				result[j++] = i;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8e74213..373a89d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1110,7 +1110,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv)
 	if (argc > 0) {
 		int j = 0;
 		int i;
-		refs = xcalloc(argc + 1, sizeof(const char *));
+		refs = xcalloc(st_add(argc, 1), sizeof(const char *));
 		for (i = 0; i < argc; i++) {
 			if (!strcmp(argv[i], "tag")) {
 				i++;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a60bcfa..193908a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1744,9 +1744,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
-	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+	objects = xcalloc(st_add(nr_objects, 1), sizeof(struct object_entry));
 	if (show_stat)
-		obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
+		obj_stat = xcalloc(st_add(nr_objects, 1), sizeof(struct object_stat));
 	ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
 	parse_pack_objects(pack_sha1);
 	resolve_deltas();
diff --git a/builtin/merge.c b/builtin/merge.c
index b98a348..101ffef 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -939,7 +939,7 @@ static int setup_with_upstream(const char ***argv)
 	if (!branch->merge_nr)
 		die(_("No default upstream defined for the current branch."));
 
-	args = xcalloc(branch->merge_nr + 1, sizeof(char *));
+	args = xcalloc(st_add(branch->merge_nr, 1), sizeof(char *));
 	for (i = 0; i < branch->merge_nr; i++) {
 		if (!branch->merge[i]->dst)
 			die(_("No remote-tracking branch for %s from %s"),
diff --git a/builtin/mv.c b/builtin/mv.c
index 9a9813a..aeae855 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -48,9 +48,9 @@ static const char **internal_copy_pathspec(const char *prefix,
 
 static const char *add_slash(const char *path)
 {
-	int len = strlen(path);
+	size_t len = strlen(path);
 	if (path[len - 1] != '/') {
-		char *with_slash = xmalloc(len + 2);
+		char *with_slash = xmalloc(st_add(len, 2));
 		memcpy(with_slash, path, len);
 		with_slash[len++] = '/';
 		with_slash[len] = 0;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 3dc3868..c8e32b2 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1372,7 +1372,7 @@ static struct command **queue_command(struct command **tail,
 
 	refname = line + 82;
 	reflen = linelen - 82;
-	cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
+	cmd = xcalloc(1, st_add3(sizeof(struct command), reflen, 1));
 	hashcpy(cmd->old_sha1, old_sha1);
 	hashcpy(cmd->new_sha1, new_sha1);
 	memcpy(cmd->ref_name, refname, reflen);
diff --git a/combine-diff.c b/combine-diff.c
index be09a2b..0e1d4b0 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -189,11 +189,11 @@ static struct lline *coalesce_lines(struct lline *base, int *lenbase,
 	 *   - Else if we have NEW, insert newend lline into base and
 	 *   consume newend
 	 */
-	lcs = xcalloc(origbaselen + 1, sizeof(int*));
-	directions = xcalloc(origbaselen + 1, sizeof(enum coalesce_direction*));
+	lcs = xcalloc(st_add(origbaselen, 1), sizeof(int*));
+	directions = xcalloc(st_add(origbaselen, 1), sizeof(enum coalesce_direction*));
 	for (i = 0; i < origbaselen + 1; i++) {
-		lcs[i] = xcalloc(lennew + 1, sizeof(int));
-		directions[i] = xcalloc(lennew + 1, sizeof(enum coalesce_direction));
+		lcs[i] = xcalloc(st_add(lennew, 1), sizeof(int));
+		directions[i] = xcalloc(st_add(lennew, 1), sizeof(enum coalesce_direction));
 		directions[i][0] = BASE;
 	}
 	for (j = 1; j < lennew + 1; j++)
@@ -1111,7 +1111,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 	if (result_size && result[result_size-1] != '\n')
 		cnt++; /* incomplete line */
 
-	sline = xcalloc(cnt+2, sizeof(*sline));
+	sline = xcalloc(st_add(cnt, 2), sizeof(*sline));
 	sline[0].bol = result;
 	for (lno = 0, cp = result; cp < result + result_size; cp++) {
 		if (*cp == '\n') {
@@ -1130,7 +1130,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 	/* Even p_lno[cnt+1] is valid -- that is for the end line number
 	 * for deletion hunk at the end.
 	 */
-	sline[0].p_lno = xcalloc((cnt+2) * num_parent, sizeof(unsigned long));
+	sline[0].p_lno = xcalloc(st_mult(st_add(cnt, 2), num_parent), sizeof(unsigned long));
 	for (lno = 0; lno <= cnt; lno++)
 		sline[lno+1].p_lno = sline[lno].p_lno + num_parent;
 
@@ -1262,7 +1262,7 @@ static struct diff_filepair *combined_pair(struct combine_diff_path *p,
 	struct diff_filespec *pool;
 
 	pair = xmalloc(sizeof(*pair));
-	pool = xcalloc(num_parent + 1, sizeof(struct diff_filespec));
+	pool = xcalloc(st_add(num_parent, 1), sizeof(struct diff_filespec));
 	pair->one = pool + 1;
 	pair->two = pool;
 
diff --git a/commit.c b/commit.c
index 31cd91f..3f4f371 100644
--- a/commit.c
+++ b/commit.c
@@ -147,7 +147,7 @@ struct commit_graft *read_graft_line(char *buf, int len)
 	if ((len + 1) % entry_size)
 		goto bad_graft_data;
 	i = (len + 1) / entry_size - 1;
-	graft = xmalloc(sizeof(*graft) + GIT_SHA1_RAWSZ * i);
+	graft = xmalloc(st_add(sizeof(*graft), st_mult(GIT_SHA1_RAWSZ, i)));
 	graft->nr_parent = i;
 	if (get_oid_hex(buf, &graft->oid))
 		goto bad_graft_data;
diff --git a/compat/mingw.c b/compat/mingw.c
index 7803f30..cfedcf9 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -810,7 +810,7 @@ static const char *quote_arg(const char *arg)
 		return arg;
 
 	/* insert \ where necessary */
-	d = q = xmalloc(len+n+3);
+	d = q = xmalloc(st_add3(len, n, 3));
 	*d++ = '"';
 	while (*arg) {
 		if (*arg == '"')
@@ -1069,7 +1069,7 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
 			free(quoted);
 	}
 
-	wargs = xmalloc((2 * args.len + 1) * sizeof(wchar_t));
+	wargs = xmalloc_array(st_add(st_mult(2, args.len), 1), sizeof(wchar_t));
 	xutftowcs(wargs, args.buf, 2 * args.len + 1);
 	strbuf_release(&args);
 
diff --git a/compat/qsort.c b/compat/qsort.c
index 9574d53..7d071af 100644
--- a/compat/qsort.c
+++ b/compat/qsort.c
@@ -47,7 +47,7 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
 void git_qsort(void *b, size_t n, size_t s,
 	       int (*cmp)(const void *, const void *))
 {
-	const size_t size = n * s;
+	const size_t size = st_mult(n, s);
 	char buf[1024];
 
 	if (size < sizeof(buf)) {
diff --git a/compat/setenv.c b/compat/setenv.c
index fc1439a..7849f25 100644
--- a/compat/setenv.c
+++ b/compat/setenv.c
@@ -18,7 +18,7 @@ int gitsetenv(const char *name, const char *value, int replace)
 
 	namelen = strlen(name);
 	valuelen = strlen(value);
-	envstr = malloc((namelen + valuelen + 2));
+	envstr = malloc(st_add3(namelen, valuelen, 2));
 	if (!envstr) {
 		errno = ENOMEM;
 		return -1;
diff --git a/compat/win32/syslog.c b/compat/win32/syslog.c
index d015e43..b905aea 100644
--- a/compat/win32/syslog.c
+++ b/compat/win32/syslog.c
@@ -32,7 +32,7 @@ void syslog(int priority, const char *fmt, ...)
 		return;
 	}
 
-	str = malloc(str_len + 1);
+	str = malloc(st_add(str_len, 1));
 	if (!str) {
 		warning("malloc failed: '%s'", strerror(errno));
 		return;
@@ -43,7 +43,7 @@ void syslog(int priority, const char *fmt, ...)
 	va_end(ap);
 
 	while ((pos = strstr(str, "%1")) != NULL) {
-		str = realloc(str, ++str_len + 1);
+		str = realloc(str, st_add(++str_len, 1));
 		if (!str) {
 			warning("realloc failed: '%s'", strerror(errno));
 			return;
diff --git a/diffcore-delta.c b/diffcore-delta.c
index 7cf431d..4159748 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -53,7 +53,8 @@ static struct spanhash_top *spanhash_rehash(struct spanhash_top *orig)
 	int osz = 1 << orig->alloc_log2;
 	int sz = osz << 1;
 
-	new = xmalloc(sizeof(*orig) + sizeof(struct spanhash) * sz);
+	new = xmalloc(st_add(sizeof(*orig),
+			     st_mult(sizeof(struct spanhash), sz)));
 	new->alloc_log2 = orig->alloc_log2 + 1;
 	new->free = INITIAL_FREE(new->alloc_log2);
 	memset(new->data, 0, sizeof(struct spanhash) * sz);
@@ -130,7 +131,8 @@ static struct spanhash_top *hash_chars(struct diff_filespec *one)
 	int is_text = !diff_filespec_is_binary(one);
 
 	i = INITIAL_HASH_SIZE;
-	hash = xmalloc(sizeof(*hash) + sizeof(struct spanhash) * (1<<i));
+	hash = xmalloc(st_add(sizeof(*hash),
+			      st_mult(sizeof(struct spanhash), 1<<i)));
 	hash->alloc_log2 = i;
 	hash->free = INITIAL_FREE(i);
 	memset(hash->data, 0, sizeof(struct spanhash) * (1<<i));
diff --git a/diffcore-rename.c b/diffcore-rename.c
index af1fe08..3b3c1ed 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -537,7 +537,7 @@ void diffcore_rename(struct diff_options *options)
 				rename_dst_nr * rename_src_nr, 50, 1);
 	}
 
-	mx = xcalloc(num_create * NUM_CANDIDATE_PER_DST, sizeof(*mx));
+	mx = xcalloc(st_mult(num_create, NUM_CANDIDATE_PER_DST), sizeof(*mx));
 	for (dst_cnt = i = 0; i < rename_dst_nr; i++) {
 		struct diff_filespec *two = rename_dst[i].two;
 		struct diff_score *m;
diff --git a/dir.c b/dir.c
index f06ebb7..2c91541 100644
--- a/dir.c
+++ b/dir.c
@@ -689,7 +689,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
 			return 0;
 		}
 		if (buf[size-1] != '\n') {
-			buf = xrealloc(buf, size+1);
+			buf = xrealloc(buf, st_add(size, 1));
 			buf[size++] = '\n';
 		}
 	} else {
@@ -2488,7 +2488,7 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	next = data + len + 1;
 	if (next > rd->end)
 		return -1;
-	*untracked_ = untracked = xmalloc(sizeof(*untracked) + len);
+	*untracked_ = untracked = xmalloc(st_add(sizeof(*untracked), len));
 	memcpy(untracked, &ud, sizeof(ud));
 	memcpy(untracked->name, data, len + 1);
 	data = next;
diff --git a/fast-import.c b/fast-import.c
index a6467cb..3053bb8 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -622,7 +622,7 @@ static void *pool_alloc(size_t len)
 			return xmalloc(len);
 		}
 		total_allocd += sizeof(struct mem_pool) + mem_pool_alloc;
-		p = xmalloc(sizeof(struct mem_pool) + mem_pool_alloc);
+		p = xmalloc(st_add(sizeof(struct mem_pool), mem_pool_alloc));
 		p->next_pool = mem_pool;
 		p->next_free = (char *) p->space;
 		p->end = p->next_free + mem_pool_alloc;
diff --git a/refs.c b/refs.c
index 2d86445..b0e6ece 100644
--- a/refs.c
+++ b/refs.c
@@ -906,7 +906,7 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 			/* -2 for strlen("%.*s") - strlen("%s"); +1 for NUL */
 			total_len += strlen(ref_rev_parse_rules[nr_rules]) - 2 + 1;
 
-		scanf_fmts = xmalloc(nr_rules * sizeof(char *) + total_len);
+		scanf_fmts = xmalloc(st_add(st_mult(nr_rules, sizeof(char *)), total_len));
 
 		offset = 0;
 		for (i = 0; i < nr_rules; i++) {
diff --git a/remote.c b/remote.c
index 25a960f..21e4ec3 100644
--- a/remote.c
+++ b/remote.c
@@ -931,7 +931,7 @@ static struct ref *alloc_ref_with_prefix(const char *prefix, size_t prefixlen,
 		const char *name)
 {
 	size_t len = strlen(name);
-	struct ref *ref = xcalloc(1, sizeof(struct ref) + prefixlen + len + 1);
+	struct ref *ref = xcalloc(1, st_add4(sizeof(*ref), prefixlen, len, 1));
 	memcpy(ref->name, prefix, prefixlen);
 	memcpy(ref->name + prefixlen, name, len);
 	return ref;
@@ -948,9 +948,9 @@ struct ref *copy_ref(const struct ref *ref)
 	size_t len;
 	if (!ref)
 		return NULL;
-	len = strlen(ref->name);
-	cpy = xmalloc(sizeof(struct ref) + len + 1);
-	memcpy(cpy, ref, sizeof(struct ref) + len + 1);
+	len = st_add3(sizeof(struct ref), strlen(ref->name), 1);
+	cpy = xmalloc(len);
+	memcpy(cpy, ref, len);
 	cpy->next = NULL;
 	cpy->symref = xstrdup_or_null(ref->symref);
 	cpy->remote_status = xstrdup_or_null(ref->remote_status);
diff --git a/revision.c b/revision.c
index f24ead5..d4ace60 100644
--- a/revision.c
+++ b/revision.c
@@ -540,7 +540,7 @@ struct treesame_state {
 static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit)
 {
 	unsigned n = commit_list_count(commit->parents);
-	struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+	struct treesame_state *st = xcalloc(1, st_add(sizeof(*st), n));
 	st->nparents = n;
 	add_decoration(&revs->treesame, &commit->object, st);
 	return st;
diff --git a/sha1_file.c b/sha1_file.c
index 2f1c6d3..0251700 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -253,7 +253,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 {
 	struct alternate_object_database *ent;
 	struct alternate_object_database *alt;
-	int pfxlen, entlen;
+	size_t pfxlen, entlen;
 	struct strbuf pathbuf = STRBUF_INIT;
 
 	if (!is_absolute_path(entry) && relative_base) {
@@ -273,8 +273,8 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base,
 	while (pfxlen && pathbuf.buf[pfxlen-1] == '/')
 		pfxlen -= 1;
 
-	entlen = pfxlen + 43; /* '/' + 2 hex + '/' + 38 hex + NUL */
-	ent = xmalloc(sizeof(*ent) + entlen);
+	entlen = st_add(pfxlen, 43); /* '/' + 2 hex + '/' + 38 hex + NUL */
+	ent = xmalloc(st_add(sizeof(*ent), entlen));
 	memcpy(ent->base, pathbuf.buf, pfxlen);
 	strbuf_release(&pathbuf);
 
@@ -1134,7 +1134,7 @@ unsigned char *use_pack(struct packed_git *p,
 
 static struct packed_git *alloc_packed_git(int extra)
 {
-	struct packed_git *p = xmalloc(sizeof(*p) + extra);
+	struct packed_git *p = xmalloc(st_add(sizeof(*p), extra));
 	memset(p, 0, sizeof(*p));
 	p->pack_fd = -1;
 	return p;
@@ -1168,7 +1168,7 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	 * ".pack" is long enough to hold any suffix we're adding (and
 	 * the use xsnprintf double-checks that)
 	 */
-	alloc = path_len + strlen(".pack") + 1;
+	alloc = st_add3(path_len, strlen(".pack"), 1);
 	p = alloc_packed_git(alloc);
 	memcpy(p->pack_name, path, path_len);
 
@@ -1196,7 +1196,7 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
 {
 	const char *path = sha1_pack_name(sha1);
-	int alloc = strlen(path) + 1;
+	size_t alloc = st_add(strlen(path), 1);
 	struct packed_git *p = alloc_packed_git(alloc);
 
 	memcpy(p->pack_name, path, alloc); /* includes NUL */
@@ -1413,10 +1413,12 @@ static void mark_bad_packed_object(struct packed_git *p,
 {
 	unsigned i;
 	for (i = 0; i < p->num_bad_objects; i++)
-		if (!hashcmp(sha1, p->bad_object_sha1 + 20 * i))
+		if (!hashcmp(sha1, p->bad_object_sha1 + GIT_SHA1_RAWSZ * i))
 			return;
-	p->bad_object_sha1 = xrealloc(p->bad_object_sha1, 20 * (p->num_bad_objects + 1));
-	hashcpy(p->bad_object_sha1 + 20 * p->num_bad_objects, sha1);
+	p->bad_object_sha1 = xrealloc(p->bad_object_sha1,
+				      st_mult(GIT_SHA1_RAWSZ,
+					      st_add(p->num_bad_objects, 1)));
+	hashcpy(p->bad_object_sha1 + GIT_SHA1_RAWSZ * p->num_bad_objects, sha1);
 	p->num_bad_objects++;
 }
 
diff --git a/sha1_name.c b/sha1_name.c
index 89918ca..d14346c 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -87,9 +87,8 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa
 		 * object databases including our own.
 		 */
 		const char *objdir = get_object_directory();
-		int objdir_len = strlen(objdir);
-		int entlen = objdir_len + 43;
-		fakeent = xmalloc(sizeof(*fakeent) + entlen);
+		size_t objdir_len = strlen(objdir);
+		fakeent = xmalloc(st_add3(sizeof(*fakeent), objdir_len, 43));
 		memcpy(fakeent->base, objdir, objdir_len);
 		fakeent->name = fakeent->base + objdir_len + 1;
 		fakeent->name[-1] = '/';
diff --git a/shallow.c b/shallow.c
index 71163bf..4d554ca 100644
--- a/shallow.c
+++ b/shallow.c
@@ -389,7 +389,7 @@ static void paint_down(struct paint_info *info, const unsigned char *sha1,
 	unsigned int i, nr;
 	struct commit_list *head = NULL;
 	int bitmap_nr = (info->nr_bits + 31) / 32;
-	int bitmap_size = bitmap_nr * sizeof(uint32_t);
+	size_t bitmap_size = st_mult(bitmap_nr, sizeof(uint32_t));
 	uint32_t *tmp = xmalloc(bitmap_size); /* to be freed before return */
 	uint32_t *bitmap = paint_alloc(info);
 	struct commit *c = lookup_commit_reference_gently(sha1, 1);
diff --git a/submodule.c b/submodule.c
index b83939c..ac61c65 100644
--- a/submodule.c
+++ b/submodule.c
@@ -123,7 +123,7 @@ static int add_submodule_odb(const char *path)
 	struct strbuf objects_directory = STRBUF_INIT;
 	struct alternate_object_database *alt_odb;
 	int ret = 0;
-	int alloc;
+	size_t alloc;
 
 	strbuf_git_path_submodule(&objects_directory, path, "objects/");
 	if (!is_directory(objects_directory.buf)) {
@@ -138,8 +138,8 @@ static int add_submodule_odb(const char *path)
 					objects_directory.len))
 			goto done;
 
-	alloc = objects_directory.len + 42; /* for "12/345..." sha1 */
-	alt_odb = xmalloc(sizeof(*alt_odb) + alloc);
+	alloc = st_add(objects_directory.len, 42); /* for "12/345..." sha1 */
+	alt_odb = xmalloc(st_add(sizeof(*alt_odb), alloc));
 	alt_odb->next = alt_odb_list;
 	xsnprintf(alt_odb->base, alloc, "%s", objects_directory.buf);
 	alt_odb->name = alt_odb->base + objects_directory.len;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 12/22] prepare_{git,shell}_cmd: use argv_array
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (10 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 11/22] use st_add and st_mult for allocation size computation Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 13/22] write_untracked_extension: use FLEX_ALLOC helper Jeff King
                       ` (10 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

These functions transform an existing argv into one suitable
for exec-ing or spawning via git or a shell. We can use an
argv_array in each to avoid dealing with manual counting and
allocation.

This also makes the memory allocation more clear and fixes
some leaks. In prepare_shell_cmd, we would sometimes
allocate a new string with "$@" in it and sometimes not,
meaning the caller could not correctly free it. On the
non-Windows side, we are in a child process which will
exec() or exit() immediately, so the leak isn't a big deal.
On Windows, though, we use spawn() from the parent process,
and leak a string for each shell command we run. On top of
that, the Windows code did not free the allocated argv array
at all (but does for the prepare_git_cmd case!).

By switching both of these functions to write into an
argv_array, we can consistently free the result as
appropriate.

Signed-off-by: Jeff King <peff@peff.net>
---
 exec_cmd.c    | 28 +++++++++++-----------------
 exec_cmd.h    |  4 +++-
 run-command.c | 60 +++++++++++++++++++++++++----------------------------------
 3 files changed, 39 insertions(+), 53 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index e85f0fd..cf442a9 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "exec_cmd.h"
 #include "quote.h"
+#include "argv-array.h"
 #define MAX_ARGS	32
 
 static const char *argv_exec_path;
@@ -107,32 +108,25 @@ void setup_path(void)
 	strbuf_release(&new_path);
 }
 
-const char **prepare_git_cmd(const char **argv)
+const char **prepare_git_cmd(struct argv_array *out, const char **argv)
 {
-	int argc;
-	const char **nargv;
-
-	for (argc = 0; argv[argc]; argc++)
-		; /* just counting */
-	nargv = xmalloc(sizeof(*nargv) * (argc + 2));
-
-	nargv[0] = "git";
-	for (argc = 0; argv[argc]; argc++)
-		nargv[argc + 1] = argv[argc];
-	nargv[argc + 1] = NULL;
-	return nargv;
+	argv_array_push(out, "git");
+	argv_array_pushv(out, argv);
+	return out->argv;
 }
 
 int execv_git_cmd(const char **argv) {
-	const char **nargv = prepare_git_cmd(argv);
-	trace_argv_printf(nargv, "trace: exec:");
+	struct argv_array nargv = ARGV_ARRAY_INIT;
+
+	prepare_git_cmd(&nargv, argv);
+	trace_argv_printf(nargv.argv, "trace: exec:");
 
 	/* execvp() can only ever return if it fails */
-	sane_execvp("git", (char **)nargv);
+	sane_execvp("git", (char **)nargv.argv);
 
 	trace_printf("trace: exec failed: %s\n", strerror(errno));
 
-	free(nargv);
+	argv_array_clear(&nargv);
 	return -1;
 }
 
diff --git a/exec_cmd.h b/exec_cmd.h
index 93b0c02..1f6b433 100644
--- a/exec_cmd.h
+++ b/exec_cmd.h
@@ -1,11 +1,13 @@
 #ifndef GIT_EXEC_CMD_H
 #define GIT_EXEC_CMD_H
 
+struct argv_array;
+
 extern void git_set_argv_exec_path(const char *exec_path);
 extern const char *git_extract_argv0_path(const char *path);
 extern const char *git_exec_path(void);
 extern void setup_path(void);
-extern const char **prepare_git_cmd(const char **argv);
+extern const char **prepare_git_cmd(struct argv_array *out, const char **argv);
 extern int execv_git_cmd(const char **argv); /* NULL terminated */
 LAST_ARG_MUST_BE_NULL
 extern int execl_git_cmd(const char *cmd, ...);
diff --git a/run-command.c b/run-command.c
index cdf0184..019f6d1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -160,50 +160,41 @@ int sane_execvp(const char *file, char * const argv[])
 	return -1;
 }
 
-static const char **prepare_shell_cmd(const char **argv)
+static const char **prepare_shell_cmd(struct argv_array *out, const char **argv)
 {
-	int argc, nargc = 0;
-	const char **nargv;
-
-	for (argc = 0; argv[argc]; argc++)
-		; /* just counting */
-	/* +1 for NULL, +3 for "sh -c" plus extra $0 */
-	nargv = xmalloc(sizeof(*nargv) * (argc + 1 + 3));
-
-	if (argc < 1)
+	if (!argv[0])
 		die("BUG: shell command is empty");
 
 	if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0])) {
 #ifndef GIT_WINDOWS_NATIVE
-		nargv[nargc++] = SHELL_PATH;
+		argv_array_push(out, SHELL_PATH);
 #else
-		nargv[nargc++] = "sh";
+		argv_array_push(out, "sh");
 #endif
-		nargv[nargc++] = "-c";
-
-		if (argc < 2)
-			nargv[nargc++] = argv[0];
-		else {
-			struct strbuf arg0 = STRBUF_INIT;
-			strbuf_addf(&arg0, "%s \"$@\"", argv[0]);
-			nargv[nargc++] = strbuf_detach(&arg0, NULL);
-		}
-	}
+		argv_array_push(out, "-c");
 
-	for (argc = 0; argv[argc]; argc++)
-		nargv[nargc++] = argv[argc];
-	nargv[nargc] = NULL;
+		/*
+		 * If we have no extra arguments, we do not even need to
+		 * bother with the "$@" magic.
+		 */
+		if (!argv[1])
+			argv_array_push(out, argv[0]);
+		else
+			argv_array_pushf(out, "%s \"$@\"", argv[0]);
+	}
 
-	return nargv;
+	argv_array_pushv(out, argv);
+	return out->argv;
 }
 
 #ifndef GIT_WINDOWS_NATIVE
 static int execv_shell_cmd(const char **argv)
 {
-	const char **nargv = prepare_shell_cmd(argv);
-	trace_argv_printf(nargv, "trace: exec:");
-	sane_execvp(nargv[0], (char **)nargv);
-	free(nargv);
+	struct argv_array nargv = ARGV_ARRAY_INIT;
+	prepare_shell_cmd(&nargv, argv);
+	trace_argv_printf(nargv.argv, "trace: exec:");
+	sane_execvp(nargv.argv[0], (char **)nargv.argv);
+	argv_array_clear(&nargv);
 	return -1;
 }
 #endif
@@ -457,6 +448,7 @@ fail_pipe:
 {
 	int fhin = 0, fhout = 1, fherr = 2;
 	const char **sargv = cmd->argv;
+	struct argv_array nargv = ARGV_ARRAY_INIT;
 
 	if (cmd->no_stdin)
 		fhin = open("/dev/null", O_RDWR);
@@ -482,9 +474,9 @@ fail_pipe:
 		fhout = dup(cmd->out);
 
 	if (cmd->git_cmd)
-		cmd->argv = prepare_git_cmd(cmd->argv);
+		cmd->argv = prepare_git_cmd(&nargv, cmd->argv);
 	else if (cmd->use_shell)
-		cmd->argv = prepare_shell_cmd(cmd->argv);
+		cmd->argv = prepare_shell_cmd(&nargv, cmd->argv);
 
 	cmd->pid = mingw_spawnvpe(cmd->argv[0], cmd->argv, (char**) cmd->env,
 			cmd->dir, fhin, fhout, fherr);
@@ -494,9 +486,7 @@ fail_pipe:
 	if (cmd->clean_on_exit && cmd->pid >= 0)
 		mark_child_for_cleanup(cmd->pid);
 
-	if (cmd->git_cmd)
-		free(cmd->argv);
-
+	argv_array_clear(&nargv);
 	cmd->argv = sargv;
 	if (fhin != 0)
 		close(fhin);
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 13/22] write_untracked_extension: use FLEX_ALLOC helper
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (11 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 12/22] prepare_{git,shell}_cmd: use argv_array Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 14/22] fast-import: simplify allocation in start_packfile Jeff King
                       ` (9 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.

Note that there's one subtlety here, though.  We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.

As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.

Signed-off-by: Jeff King <peff@peff.net>
---
 dir.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 2c91541..a4a9d9f 100644
--- a/dir.c
+++ b/dir.c
@@ -2360,16 +2360,15 @@ void write_untracked_extension(struct strbuf *out, struct untracked_cache *untra
 	struct ondisk_untracked_cache *ouc;
 	struct write_data wd;
 	unsigned char varbuf[16];
-	int len = 0, varint_len;
-	if (untracked->exclude_per_dir)
-		len = strlen(untracked->exclude_per_dir);
-	ouc = xmalloc(sizeof(*ouc) + len + 1);
+	int varint_len;
+	size_t len = strlen(untracked->exclude_per_dir);
+
+	FLEX_ALLOC_MEM(ouc, exclude_per_dir, untracked->exclude_per_dir, len);
 	stat_data_to_disk(&ouc->info_exclude_stat, &untracked->ss_info_exclude.stat);
 	stat_data_to_disk(&ouc->excludes_file_stat, &untracked->ss_excludes_file.stat);
 	hashcpy(ouc->info_exclude_sha1, untracked->ss_info_exclude.sha1);
 	hashcpy(ouc->excludes_file_sha1, untracked->ss_excludes_file.sha1);
 	ouc->dir_flags = htonl(untracked->dir_flags);
-	memcpy(ouc->exclude_per_dir, untracked->exclude_per_dir, len + 1);
 
 	varint_len = encode_varint(untracked->ident.len, varbuf);
 	strbuf_add(out, varbuf, varint_len);
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 14/22] fast-import: simplify allocation in start_packfile
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (12 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 13/22] write_untracked_extension: use FLEX_ALLOC helper Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 15/22] fetch-pack: simplify add_sought_entry Jeff King
                       ` (8 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

This function allocate a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needed to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).

This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.

Signed-off-by: Jeff King <peff@peff.net>
---
 fast-import.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 3053bb8..9fc7093 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -865,15 +865,12 @@ static void start_packfile(void)
 {
 	static char tmp_file[PATH_MAX];
 	struct packed_git *p;
-	int namelen;
 	struct pack_header hdr;
 	int pack_fd;
 
 	pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
 			      "pack/tmp_pack_XXXXXX");
-	namelen = strlen(tmp_file) + 2;
-	p = xcalloc(1, sizeof(*p) + namelen);
-	xsnprintf(p->pack_name, namelen, "%s", tmp_file);
+	FLEX_ALLOC_STR(p, pack_name, tmp_file);
 	p->pack_fd = pack_fd;
 	p->do_not_close = 1;
 	pack_file = sha1fd(pack_fd, p->pack_name);
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 15/22] fetch-pack: simplify add_sought_entry
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (13 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 14/22] fast-import: simplify allocation in start_packfile Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 16/22] test-path-utils: fix normalize_path_copy output buffer size Jeff King
                       ` (7 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We have two variants of this function, one that takes a
string and one that takes a ptr/len combo. But we only call
the latter with the length of a NUL-terminated string, so
our first simplification is to drop it in favor of the
string variant.

Since we know we have a string, we can also replace the
manual memory computation with a call to alloc_ref().

Furthermore, we can rely on get_oid_hex() to complain if it
hits the end of the string. That means we can simplify the
check for "<sha1> <ref>" versus just "<ref>". Rather than
manage the ptr/len pair, we can just bump the start of our
string forward. The original code over-allocated based on
the original "namelen" (which wasn't _wrong_, but was simply
wasteful and confusing).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/fetch-pack.c | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9b2a514..79a611f 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -10,33 +10,24 @@ static const char fetch_pack_usage[] =
 "[--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] "
 "[--no-progress] [--diag-url] [-v] [<host>:]<directory> [<refs>...]";
 
-static void add_sought_entry_mem(struct ref ***sought, int *nr, int *alloc,
-				 const char *name, int namelen)
+static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
+			     const char *name)
 {
-	struct ref *ref = xcalloc(1, sizeof(*ref) + namelen + 1);
+	struct ref *ref;
 	struct object_id oid;
-	const int chunksz = GIT_SHA1_HEXSZ + 1;
 
-	if (namelen > chunksz && name[chunksz - 1] == ' ' &&
-		!get_oid_hex(name, &oid)) {
-		oidcpy(&ref->old_oid, &oid);
-		name += chunksz;
-		namelen -= chunksz;
-	}
+	if (!get_oid_hex(name, &oid) && name[GIT_SHA1_HEXSZ] == ' ')
+		name += GIT_SHA1_HEXSZ + 1;
+	else
+		oidclr(&oid);
 
-	memcpy(ref->name, name, namelen);
-	ref->name[namelen] = '\0';
+	ref = alloc_ref(name);
+	oidcpy(&ref->old_oid, &oid);
 	(*nr)++;
 	ALLOC_GROW(*sought, *nr, *alloc);
 	(*sought)[*nr - 1] = ref;
 }
 
-static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
-			     const char *string)
-{
-	add_sought_entry_mem(sought, nr, alloc, string, strlen(string));
-}
-
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, ret;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 16/22] test-path-utils: fix normalize_path_copy output buffer size
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (14 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 15/22] fetch-pack: simplify add_sought_entry Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:44     ` [PATCH v3 17/22] sequencer: simplify memory allocation of get_message Jeff King
                       ` (6 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

The normalize_path_copy function needs an output buffer that
is at least as long as its input (it may shrink the path,
but never expand it). However, this test program feeds it
static PATH_MAX-sized buffers, which have no relation to the
input size.

In the normalize_ceiling_entry case, we do at least check
the size against PATH_MAX and die(), but that case is even
more convoluted. We normalize into a fixed-size buffer, free
the original, and then replace it with a strdup'd copy of
the result. But normalize_path_copy explicitly allows
normalizing in-place, so we can simply do that.

Signed-off-by: Jeff King <peff@peff.net>
---
 test-path-utils.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/test-path-utils.c b/test-path-utils.c
index 6232dfe..ba805b3 100644
--- a/test-path-utils.c
+++ b/test-path-utils.c
@@ -8,21 +8,14 @@
  */
 static int normalize_ceiling_entry(struct string_list_item *item, void *unused)
 {
-	const char *ceil = item->string;
-	int len = strlen(ceil);
-	char buf[PATH_MAX+1];
+	char *ceil = item->string;
 
-	if (len == 0)
+	if (!*ceil)
 		die("Empty path is not supported");
-	if (len > PATH_MAX)
-		die("Path \"%s\" is too long", ceil);
 	if (!is_absolute_path(ceil))
 		die("Path \"%s\" is not absolute", ceil);
-	if (normalize_path_copy(buf, ceil) < 0)
+	if (normalize_path_copy(ceil, ceil) < 0)
 		die("Path \"%s\" could not be normalized", ceil);
-	len = strlen(buf);
-	free(item->string);
-	item->string = xstrdup(buf);
 	return 1;
 }
 
@@ -166,7 +159,7 @@ static struct test_data dirname_data[] = {
 int main(int argc, char **argv)
 {
 	if (argc == 3 && !strcmp(argv[1], "normalize_path_copy")) {
-		char *buf = xmalloc(PATH_MAX + 1);
+		char *buf = xmallocz(strlen(argv[2]));
 		int rv = normalize_path_copy(buf, argv[2]);
 		if (rv)
 			buf = "++failed++";
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 17/22] sequencer: simplify memory allocation of get_message
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (15 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 16/22] test-path-utils: fix normalize_path_copy output buffer size Jeff King
@ 2016-02-22 22:44     ` Jeff King
  2016-02-22 22:45     ` [PATCH v3 18/22] git-compat-util: drop mempcpy compat code Jeff King
                       ` (5 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:44 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

For a commit with sha1 "1234abcd" and subject "foo", this
function produces a struct with three strings:

 1. "foo"

 2. "1234abcd... foo"

 3. "parent of 1234abcd... foo"

It takes advantage of the fact that these strings are
subsets of each other, and allocates only _one_ string, with
pointers into the various parts. Unfortunately, this makes
the string allocation complicated and hard to follow.

Since we keep only one of these in memory at a time, we can
afford to simply allocate three strings. This lets us build
on tools like xstrfmt and avoid manual computation.

While we're here, we can also drop the ad-hoc
reimplementation of get_git_commit_encoding(), and simply
call that function.

Signed-off-by: Jeff King <peff@peff.net>
---
 sequencer.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 8048786..e66f2fe 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -124,42 +124,33 @@ static const char *action_name(const struct replay_opts *opts)
 
 struct commit_message {
 	char *parent_label;
-	const char *label;
-	const char *subject;
+	char *label;
+	char *subject;
 	const char *message;
 };
 
 static int get_message(struct commit *commit, struct commit_message *out)
 {
 	const char *abbrev, *subject;
-	int abbrev_len, subject_len;
-	char *q;
-
-	if (!git_commit_encoding)
-		git_commit_encoding = "UTF-8";
+	int subject_len;
 
-	out->message = logmsg_reencode(commit, NULL, git_commit_encoding);
+	out->message = logmsg_reencode(commit, NULL, get_commit_output_encoding());
 	abbrev = find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV);
-	abbrev_len = strlen(abbrev);
 
 	subject_len = find_commit_subject(out->message, &subject);
 
-	out->parent_label = xmalloc(strlen("parent of ") + abbrev_len +
-			      strlen("... ") + subject_len + 1);
-	q = out->parent_label;
-	q = mempcpy(q, "parent of ", strlen("parent of "));
-	out->label = q;
-	q = mempcpy(q, abbrev, abbrev_len);
-	q = mempcpy(q, "... ", strlen("... "));
-	out->subject = q;
-	q = mempcpy(q, subject, subject_len);
-	*q = '\0';
+	out->subject = xmemdupz(subject, subject_len);
+	out->label = xstrfmt("%s... %s", abbrev, out->subject);
+	out->parent_label = xstrfmt("parent of %s", out->label);
+
 	return 0;
 }
 
 static void free_message(struct commit *commit, struct commit_message *msg)
 {
 	free(msg->parent_label);
+	free(msg->label);
+	free(msg->subject);
 	unuse_commit_buffer(commit, msg->message);
 }
 
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 18/22] git-compat-util: drop mempcpy compat code
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (16 preceding siblings ...)
  2016-02-22 22:44     ` [PATCH v3 17/22] sequencer: simplify memory allocation of get_message Jeff King
@ 2016-02-22 22:45     ` Jeff King
  2016-02-22 22:45     ` [PATCH v3 19/22] transport_anonymize_url: use xstrfmt Jeff King
                       ` (4 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:45 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

There are no callers of this left, as the last one was
dropped in the previous patch. And there are not likely to
be new ones, as the function has been around since 2010
without gaining any new callers.

Signed-off-by: Jeff King <peff@peff.net>
---
 git-compat-util.h | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 57ff9fb..1459f9b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -681,7 +681,6 @@ extern int git_vsnprintf(char *str, size_t maxsize,
 #ifdef __GLIBC_PREREQ
 #if __GLIBC_PREREQ(2, 1)
 #define HAVE_STRCHRNUL
-#define HAVE_MEMPCPY
 #endif
 #endif
 
@@ -695,14 +694,6 @@ static inline char *gitstrchrnul(const char *s, int c)
 }
 #endif
 
-#ifndef HAVE_MEMPCPY
-#define mempcpy gitmempcpy
-static inline void *gitmempcpy(void *dest, const void *src, size_t n)
-{
-	return (char *)memcpy(dest, src, n) + n;
-}
-#endif
-
 #ifdef NO_INET_PTON
 int inet_pton(int af, const char *src, void *dst);
 #endif
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 19/22] transport_anonymize_url: use xstrfmt
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (17 preceding siblings ...)
  2016-02-22 22:45     ` [PATCH v3 18/22] git-compat-util: drop mempcpy compat code Jeff King
@ 2016-02-22 22:45     ` Jeff King
  2016-02-22 22:45     ` [PATCH v3 20/22] diff_populate_gitlink: use a strbuf Jeff King
                       ` (3 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:45 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/transport.c b/transport.c
index 3b4e644..ac857a9 100644
--- a/transport.c
+++ b/transport.c
@@ -1021,7 +1021,7 @@ int transport_disconnect(struct transport *transport)
  */
 char *transport_anonymize_url(const char *url)
 {
-	char *anon_url, *scheme_prefix, *anon_part;
+	char *scheme_prefix, *anon_part;
 	size_t anon_len, prefix_len = 0;
 
 	anon_part = strchr(url, '@');
@@ -1055,10 +1055,8 @@ char *transport_anonymize_url(const char *url)
 			goto literal_copy;
 		prefix_len = scheme_prefix - url + 3;
 	}
-	anon_url = xcalloc(1, 1 + prefix_len + anon_len);
-	memcpy(anon_url, url, prefix_len);
-	memcpy(anon_url + prefix_len, anon_part, anon_len);
-	return anon_url;
+	return xstrfmt("%.*s%.*s", (int)prefix_len, url,
+		       (int)anon_len, anon_part);
 literal_copy:
 	return xstrdup(url);
 }
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 20/22] diff_populate_gitlink: use a strbuf
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (18 preceding siblings ...)
  2016-02-22 22:45     ` [PATCH v3 19/22] transport_anonymize_url: use xstrfmt Jeff King
@ 2016-02-22 22:45     ` Jeff King
  2016-02-22 22:45     ` [PATCH v3 21/22] convert ewah/bitmap code to use xmalloc Jeff King
                       ` (2 subsequent siblings)
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:45 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

We allocate 100 bytes to hold the "Submodule commit ..."
text. This is enough, but it's not immediately obvious that
this is the case, and we have to repeat the magic 100 twice.

We could get away with xstrfmt here, but we want to know the
size, as well, so let's use a real strbuf. And while we're
here, we can clean up the logic around size_only. It
currently sets and clears the "data" field pointlessly, and
leaves the "should_free" flag on even after we have cleared
the data.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 27d14a7..a70ec6e 100644
--- a/diff.c
+++ b/diff.c
@@ -2704,21 +2704,21 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int
 
 static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 {
-	int len;
-	char *data = xmalloc(100), *dirty = "";
+	struct strbuf buf = STRBUF_INIT;
+	char *dirty = "";
 
 	/* Are we looking at the work tree? */
 	if (s->dirty_submodule)
 		dirty = "-dirty";
 
-	len = snprintf(data, 100,
-		       "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
-	s->data = data;
-	s->size = len;
-	s->should_free = 1;
+	strbuf_addf(&buf, "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
+	s->size = buf.len;
 	if (size_only) {
 		s->data = NULL;
-		free(data);
+		strbuf_release(&buf);
+	} else {
+		s->data = strbuf_detach(&buf, NULL);
+		s->should_free = 1;
 	}
 	return 0;
 }
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 21/22] convert ewah/bitmap code to use xmalloc
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (19 preceding siblings ...)
  2016-02-22 22:45     ` [PATCH v3 20/22] diff_populate_gitlink: use a strbuf Jeff King
@ 2016-02-22 22:45     ` Jeff King
  2016-02-22 22:45     ` [PATCH v3 22/22] ewah: convert to REALLOC_ARRAY, etc Jeff King
  2016-02-22 23:08     ` [PATCH v3 0/22] hardening allocations against integer overflow Junio C Hamano
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:45 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.

We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.

Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.

Signed-off-by: Jeff King <peff@peff.net>
---
 ewah/bitmap.c      | 12 ++++++------
 ewah/ewah_bitmap.c |  9 +++------
 ewah/ewah_io.c     | 10 ++--------
 ewah/ewok.h        | 10 ----------
 4 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 47ad674..c88daa0 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -25,8 +25,8 @@
 
 struct bitmap *bitmap_new(void)
 {
-	struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
-	bitmap->words = ewah_calloc(32, sizeof(eword_t));
+	struct bitmap *bitmap = xmalloc(sizeof(struct bitmap));
+	bitmap->words = xcalloc(32, sizeof(eword_t));
 	bitmap->word_alloc = 32;
 	return bitmap;
 }
@@ -38,8 +38,8 @@ void bitmap_set(struct bitmap *self, size_t pos)
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
 		self->word_alloc = block * 2;
-		self->words = ewah_realloc(self->words,
-			self->word_alloc * sizeof(eword_t));
+		self->words = xrealloc(self->words,
+				      self->word_alloc * sizeof(eword_t));
 
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
@@ -102,7 +102,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
 	while (ewah_iterator_next(&blowup, &it)) {
 		if (i >= bitmap->word_alloc) {
 			bitmap->word_alloc *= 1.5;
-			bitmap->words = ewah_realloc(
+			bitmap->words = xrealloc(
 				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
 		}
 
@@ -134,7 +134,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
-		self->words = ewah_realloc(self->words,
+		self->words = xrealloc(self->words,
 			self->word_alloc * sizeof(eword_t));
 		memset(self->words + original_size, 0x0,
 			(self->word_alloc - original_size) * sizeof(eword_t));
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index b522437..fcd465e 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,7 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 		return;
 
 	self->alloc_size = new_size;
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
@@ -282,11 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 {
 	struct ewah_bitmap *self;
 
-	self = ewah_malloc(sizeof(struct ewah_bitmap));
-	if (self == NULL)
-		return NULL;
-
-	self->buffer = ewah_malloc(32 * sizeof(eword_t));
+	self = xmalloc(sizeof(struct ewah_bitmap));
+	self->buffer = xmalloc(32 * sizeof(eword_t));
 	self->alloc_size = 32;
 
 	ewah_clear(self);
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 43481b9..4acff97 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,12 +134,9 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 	self->buffer_size = self->alloc_size = get_be32(ptr);
 	ptr += sizeof(uint32_t);
 
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 
-	if (!self->buffer)
-		return -1;
-
 	/*
 	 * Copy the raw data for the bitmap as a whole chunk;
 	 * if we're in a little-endian platform, we'll perform
@@ -180,12 +177,9 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
 		return -1;
 
 	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-	self->buffer = ewah_realloc(self->buffer,
+	self->buffer = xrealloc(self->buffer,
 		self->alloc_size * sizeof(eword_t));
 
-	if (!self->buffer)
-		return -1;
-
 	/** 64 bit x N -- compressed words */
 	buffer = self->buffer;
 	words_left = self->buffer_size;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 6e2c5e1..269a1a8 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -20,16 +20,6 @@
 #ifndef __EWOK_BITMAP_H__
 #define __EWOK_BITMAP_H__
 
-#ifndef ewah_malloc
-#	define ewah_malloc xmalloc
-#endif
-#ifndef ewah_realloc
-#	define ewah_realloc xrealloc
-#endif
-#ifndef ewah_calloc
-#	define ewah_calloc xcalloc
-#endif
-
 struct strbuf;
 typedef uint64_t eword_t;
 #define BITS_IN_EWORD (sizeof(eword_t) * 8)
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* [PATCH v3 22/22] ewah: convert to REALLOC_ARRAY, etc
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (20 preceding siblings ...)
  2016-02-22 22:45     ` [PATCH v3 21/22] convert ewah/bitmap code to use xmalloc Jeff King
@ 2016-02-22 22:45     ` Jeff King
  2016-02-22 23:08     ` [PATCH v3 0/22] hardening allocations against integer overflow Junio C Hamano
  22 siblings, 0 replies; 93+ messages in thread
From: Jeff King @ 2016-02-22 22:45 UTC (permalink / raw)
  To: git; +Cc: Eric Sunshine, Junio C Hamano

Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.

Signed-off-by: Jeff King <peff@peff.net>
---
 ewah/bitmap.c      | 16 ++++------------
 ewah/ewah_bitmap.c |  5 ++---
 ewah/ewah_io.c     |  6 ++----
 3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index c88daa0..7103cee 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -17,7 +17,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
-#include "git-compat-util.h"
+#include "cache.h"
 #include "ewok.h"
 
 #define EWAH_MASK(x) ((eword_t)1 << (x % BITS_IN_EWORD))
@@ -38,9 +38,7 @@ void bitmap_set(struct bitmap *self, size_t pos)
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
 		self->word_alloc = block * 2;
-		self->words = xrealloc(self->words,
-				      self->word_alloc * sizeof(eword_t));
-
+		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
 	}
@@ -100,12 +98,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
 	ewah_iterator_init(&it, ewah);
 
 	while (ewah_iterator_next(&blowup, &it)) {
-		if (i >= bitmap->word_alloc) {
-			bitmap->word_alloc *= 1.5;
-			bitmap->words = xrealloc(
-				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
-		}
-
+		ALLOC_GROW(bitmap->words, i + 1, bitmap->word_alloc);
 		bitmap->words[i++] = blowup;
 	}
 
@@ -134,8 +127,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
 
 	if (self->word_alloc < other_final) {
 		self->word_alloc = other_final;
-		self->words = xrealloc(self->words,
-			self->word_alloc * sizeof(eword_t));
+		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + original_size, 0x0,
 			(self->word_alloc - original_size) * sizeof(eword_t));
 	}
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index fcd465e..2dc9c82 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,8 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 		return;
 
 	self->alloc_size = new_size;
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
 
@@ -283,8 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 	struct ewah_bitmap *self;
 
 	self = xmalloc(sizeof(struct ewah_bitmap));
-	self->buffer = xmalloc(32 * sizeof(eword_t));
 	self->alloc_size = 32;
+	ALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	ewah_clear(self);
 	return self;
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 4acff97..61f6a43 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,8 +134,7 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 	self->buffer_size = self->alloc_size = get_be32(ptr);
 	ptr += sizeof(uint32_t);
 
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	/*
 	 * Copy the raw data for the bitmap as a whole chunk;
@@ -177,8 +176,7 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
 		return -1;
 
 	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-	self->buffer = xrealloc(self->buffer,
-		self->alloc_size * sizeof(eword_t));
+	REALLOC_ARRAY(self->buffer, self->alloc_size);
 
 	/** 64 bit x N -- compressed words */
 	buffer = self->buffer;
-- 
2.7.2.645.g4e1306c

^ permalink raw reply related	[flat|nested] 93+ messages in thread

* Re: [PATCH v3 0/22] hardening allocations against integer overflow
  2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
                       ` (21 preceding siblings ...)
  2016-02-22 22:45     ` [PATCH v3 22/22] ewah: convert to REALLOC_ARRAY, etc Jeff King
@ 2016-02-22 23:08     ` Junio C Hamano
  22 siblings, 0 replies; 93+ messages in thread
From: Junio C Hamano @ 2016-02-22 23:08 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Eric Sunshine

Jeff King <peff@peff.net> writes:

> On Fri, Feb 19, 2016 at 06:19:41AM -0500, Jeff King wrote:
>
>> Here's a re-roll of jk/tighten-alloc series from:
>> 
>>   http://thread.gmane.org/gmane.comp.version-control.git/286253
>
> And here's v3. The changes this time (and philosophical rationalization
> of changes I didn't make) are pretty small:
>
>   - flip the order of arguments to st_mult() in ALLOC_ARRAY, et al, to
>     get a probably-irrelevant-but-so-easy-why-not optimization
>
>   - mark new global var in daemon.c as static
>
>   - add argv_array_detach (new patch 6) to avoid gross memory management
>     when converting line-log to use argv_array in the subsequent patch
>
> So I hope this one is ready for 'next'.  Thanks René, Ramsay, and Eric
> for reviewing.
>
>   [01/22]: reflog_expire_cfg: NUL-terminate pattern field
>   [02/22]: add helpers for detecting size_t overflow
>   [03/22]: tree-diff: catch integer overflow in combine_diff_path allocation
>   [04/22]: harden REALLOC_ARRAY and xcalloc against size_t overflow
>   [05/22]: add helpers for allocating flex-array structs
>   [06/22]: argv-array: add detach function
>   [07/22]: convert manual allocations to argv_array
>   [08/22]: convert trivial cases to ALLOC_ARRAY
>   [09/22]: use xmallocz to avoid size arithmetic
>   [10/22]: convert trivial cases to FLEX_ARRAY macros
>   [11/22]: use st_add and st_mult for allocation size computation
>   [12/22]: prepare_{git,shell}_cmd: use argv_array
>   [13/22]: write_untracked_extension: use FLEX_ALLOC helper
>   [14/22]: fast-import: simplify allocation in start_packfile
>   [15/22]: fetch-pack: simplify add_sought_entry
>   [16/22]: test-path-utils: fix normalize_path_copy output buffer size
>   [17/22]: sequencer: simplify memory allocation of get_message
>   [18/22]: git-compat-util: drop mempcpy compat code
>   [19/22]: transport_anonymize_url: use xstrfmt
>   [20/22]: diff_populate_gitlink: use a strbuf
>   [21/22]: convert ewah/bitmap code to use xmalloc
>   [22/22]: ewah: convert to REALLOC_ARRAY, etc

I admit that I only looked at tbdiff output, but what I saw looked
very sensible.

Thanks.

^ permalink raw reply	[flat|nested] 93+ messages in thread

end of thread, other threads:[~2016-02-22 23:08 UTC | newest]

Thread overview: 93+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-15 21:45 [PATCH 0/18] hardening allocations against integer overflow Jeff King
2016-02-15 21:49 ` [PATCH 01/18] add helpers for detecting size_t overflow Jeff King
2016-02-15 21:49 ` [PATCH 02/18] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
2016-02-15 21:50 ` [PATCH 03/18] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
2016-02-15 21:50 ` [PATCH 04/18] add helpers for allocating flex-array structs Jeff King
2016-02-16  1:47   ` Eric Sunshine
2016-02-16  2:52     ` Jeff King
2016-02-15 21:51 ` [PATCH 05/18] convert trivial cases to ALLOC_ARRAY Jeff King
2016-02-16  4:22   ` Eric Sunshine
2016-02-16  4:23     ` Jeff King
2016-02-16  4:32       ` Eric Sunshine
2016-02-16  5:46         ` Jeff King
2016-02-15 21:52 ` [PATCH 06/18] use xmallocz to avoid size arithmetic Jeff King
2016-02-15 21:52 ` [PATCH 07/18] convert trivial cases to FLEX_ARRAY macros Jeff King
2016-02-16  2:17   ` Eric Sunshine
2016-02-16  3:15     ` Jeff King
2016-02-16  3:26       ` Jeff King
2016-02-16  3:36         ` Jeff King
2016-02-16  4:18           ` Eric Sunshine
2016-02-16  4:22             ` Jeff King
2016-02-16  4:10       ` Eric Sunshine
2016-02-15 21:53 ` [PATCH 08/18] use st_add and st_mult for allocation size computation Jeff King
2016-02-16  5:47   ` Eric Sunshine
2016-02-15 21:53 ` [PATCH 09/18] write_untracked_extension: use FLEX_ALLOC helper Jeff King
2016-02-15 21:54 ` [PATCH 10/18] fast-import: simplify allocation in start_packfile Jeff King
2016-02-15 21:54 ` [PATCH 11/18] fetch-pack: simplify add_sought_entry Jeff King
2016-02-15 21:55 ` [PATCH 12/18] test-path-utils: fix normalize_path_copy output buffer size Jeff King
2016-02-15 21:56 ` [PATCH 13/18] sequencer: simplify memory allocation of get_message Jeff King
2016-02-16  6:05   ` Eric Sunshine
2016-02-15 21:56 ` [PATCH 14/18] git-compat-util: drop mempcpy compat code Jeff King
2016-02-16  6:05   ` Eric Sunshine
2016-02-15 21:56 ` [PATCH 15/18] transport_anonymize_url: use xstrfmt Jeff King
2016-02-15 21:56 ` [PATCH 16/18] diff_populate_gitlink: use a strbuf Jeff King
2016-02-15 21:57 ` [PATCH 17/18] convert ewah/bitmap code to use xmalloc Jeff King
2016-02-15 21:57 ` [PATCH 18/18] ewah: convert to REALLOC_ARRAY, etc Jeff King
2016-02-15 22:02 ` [PATCH 0/18] hardening allocations against integer overflow Jeff King
2016-02-19 11:19 ` [PATCH v2 0/21] " Jeff King
2016-02-19 11:21   ` [PATCH 01/21] reflog_expire_cfg: NUL-terminate pattern field Jeff King
2016-02-19 11:21   ` [PATCH 02/21] add helpers for detecting size_t overflow Jeff King
2016-02-19 11:21   ` [PATCH 03/21] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
2016-02-19 11:22   ` [PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
2016-02-20 21:32     ` René Scharfe
2016-02-21 23:30       ` Jeff King
2016-02-19 11:22   ` [PATCH 05/21] add helpers for allocating flex-array structs Jeff King
2016-02-19 11:23   ` [PATCH 06/21] convert manual allocations to argv_array Jeff King
2016-02-20  8:07     ` Eric Sunshine
2016-02-20  8:10       ` Jeff King
2016-02-20  8:29         ` Eric Sunshine
2016-02-20  8:34           ` Jeff King
2016-02-20  8:39             ` Eric Sunshine
2016-02-20  8:57               ` Jeff King
2016-02-20  9:04                 ` Eric Sunshine
2016-02-19 11:23   ` [PATCH 07/21] convert trivial cases to ALLOC_ARRAY Jeff King
2016-02-19 11:23   ` [PATCH 08/21] use xmallocz to avoid size arithmetic Jeff King
2016-02-19 11:23   ` [PATCH 09/21] convert trivial cases to FLEX_ARRAY macros Jeff King
2016-02-19 11:23   ` [PATCH 10/21] use st_add and st_mult for allocation size computation Jeff King
2016-02-19 11:24   ` [PATCH 11/21] prepare_{git,shell}_cmd: use argv_array Jeff King
2016-02-19 11:24   ` [PATCH 12/21] write_untracked_extension: use FLEX_ALLOC helper Jeff King
2016-02-19 11:24   ` [PATCH 13/21] fast-import: simplify allocation in start_packfile Jeff King
2016-02-19 17:48     ` Junio C Hamano
2016-02-19 19:12       ` Jeff King
2016-02-19 11:24   ` [PATCH 14/21] fetch-pack: simplify add_sought_entry Jeff King
2016-02-19 11:24   ` [PATCH 15/21] test-path-utils: fix normalize_path_copy output buffer size Jeff King
2016-02-19 11:25   ` [PATCH 16/21] sequencer: simplify memory allocation of get_message Jeff King
2016-02-19 11:25   ` [PATCH 17/21] git-compat-util: drop mempcpy compat code Jeff King
2016-02-19 11:25   ` [PATCH 18/21] transport_anonymize_url: use xstrfmt Jeff King
2016-02-19 11:25   ` [PATCH 19/21] diff_populate_gitlink: use a strbuf Jeff King
2016-02-19 11:25   ` [PATCH 20/21] convert ewah/bitmap code to use xmalloc Jeff King
2016-02-19 11:25   ` [PATCH 21/21] ewah: convert to REALLOC_ARRAY, etc Jeff King
2016-02-22 22:41   ` [PATCH v3 0/22] hardening allocations against integer overflow Jeff King
2016-02-22 22:43     ` [PATCH v3 01/22] reflog_expire_cfg: NUL-terminate pattern field Jeff King
2016-02-22 22:43     ` [PATCH v3 02/22] add helpers for detecting size_t overflow Jeff King
2016-02-22 22:43     ` [PATCH v3 03/22] tree-diff: catch integer overflow in combine_diff_path allocation Jeff King
2016-02-22 22:43     ` [PATCH v3 04/22] harden REALLOC_ARRAY and xcalloc against size_t overflow Jeff King
2016-02-22 22:43     ` [PATCH v3 05/22] add helpers for allocating flex-array structs Jeff King
2016-02-22 22:44     ` [PATCH v3 06/22] argv-array: add detach function Jeff King
2016-02-22 22:44     ` [PATCH v3 07/22] convert manual allocations to argv_array Jeff King
2016-02-22 22:44     ` [PATCH v3 08/22] convert trivial cases to ALLOC_ARRAY Jeff King
2016-02-22 22:44     ` [PATCH v3 09/22] use xmallocz to avoid size arithmetic Jeff King
2016-02-22 22:44     ` [PATCH v3 10/22] convert trivial cases to FLEX_ARRAY macros Jeff King
2016-02-22 22:44     ` [PATCH v3 11/22] use st_add and st_mult for allocation size computation Jeff King
2016-02-22 22:44     ` [PATCH v3 12/22] prepare_{git,shell}_cmd: use argv_array Jeff King
2016-02-22 22:44     ` [PATCH v3 13/22] write_untracked_extension: use FLEX_ALLOC helper Jeff King
2016-02-22 22:44     ` [PATCH v3 14/22] fast-import: simplify allocation in start_packfile Jeff King
2016-02-22 22:44     ` [PATCH v3 15/22] fetch-pack: simplify add_sought_entry Jeff King
2016-02-22 22:44     ` [PATCH v3 16/22] test-path-utils: fix normalize_path_copy output buffer size Jeff King
2016-02-22 22:44     ` [PATCH v3 17/22] sequencer: simplify memory allocation of get_message Jeff King
2016-02-22 22:45     ` [PATCH v3 18/22] git-compat-util: drop mempcpy compat code Jeff King
2016-02-22 22:45     ` [PATCH v3 19/22] transport_anonymize_url: use xstrfmt Jeff King
2016-02-22 22:45     ` [PATCH v3 20/22] diff_populate_gitlink: use a strbuf Jeff King
2016-02-22 22:45     ` [PATCH v3 21/22] convert ewah/bitmap code to use xmalloc Jeff King
2016-02-22 22:45     ` [PATCH v3 22/22] ewah: convert to REALLOC_ARRAY, etc Jeff King
2016-02-22 23:08     ` [PATCH v3 0/22] hardening allocations against integer overflow Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.