All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode
@ 2022-06-03 18:37 Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
                   ` (15 more replies)
  0 siblings, 16 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

The -fanalyzer mode in GCC v12 is much better about reporting real
issues. This RFC series currently conflicts with "next", but I thought
sending it for comment sooner than later made sense.

With this series applied you can run:

    make DEVOPTS=analyzer

Which will turn all of these into warnings even under -Werror, we can
also:

    make DEVOPTS="analyzer no-suppress-analyzer"

This works on both GCC 12 and GCC 11, but the former will yield better
results. We need to quiet some issues on GCC 11. GCC 10 ships with
"-fanalyzer", but it has so many false positives that we'll $(error)
out if you try to run it with DEVOPTS=fanalyzer.

If you're CC'd on this series you were either involved in some
discussion about -fanalyzer on the ML, or one of the commits mentioned
in this series is yours. Comments on individual sets of patches below:

Ævar Arnfjörð Bjarmason (15):
  remote.c: don't dereference NULL in freeing loop
  pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path
  reftable: don't memset() a NULL from failed malloc()

These are all real bugs, in rough order of more to less severe.

  diff-lib.c: don't dereference NULL in oneway_diff()
  refs/packed-backend.c: add a BUG() if iter is NULL
  ref-filter.c: BUG() out on show_ref() with NULL refname

These may or may not be real bugs, I'm pretty sure they are...

  strbuf.c: placate -fanalyzer in strbuf_grow()

The first of cases where -fanalyzer isn't "wrong", but it is in the
sense that the error it points out is a hard constraint that we
(hopefully) ensure elsewhere in various APIs.

  strbuf.c: use st_add3(), not unsigned_add_overflows()

A while-we're-at-it for code spotted in the last commit.

  add-patch: assert parse_diff() expectations with BUG()
  reftable: don't have reader_get_block() confuse -fanalyzer
  blame.c: clarify the state of "final_commit" for -fanalyzer
  pack.h: wrap write_*file*() functions
  pack-write API: pass down "verify" not arbitrary flags

These are all cases where -fanalyzer flags "genuine" issues, but on
closer inspection it's because we're passing the equivalent of two
variables we know go hand-in-hand, but probably no compiler can be
smart enough to spot that.

E.g. the first one here is a case where "git add -p" would segfault if
fed diffs that aren't in the format "git diff" would emit, but since
we control both sides it doesn't happen in practice.

Regardless, sprinkling assertions and BUG() guards in that code seems
prudent.

  config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer

Add the new DEVOPTS mode and quiet some issues that need -Wno-error=*.

  config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro

Quiet some outstanding issues, the main reason these aren't in commits
like the above is because I didn't get to them yet, and may not any
time soon. They may be genuine bugs, false alarms etc.

Even if we don't get to these I think having a CI mode with -fanalyzer
and a whitelist of current issues would be an improment, since we'd
error on *new* isssues.

This series doesn't add such a CI mode yet, due to conflicts with CI
work in "next"..

Ævar Arnfjörð Bjarmason (15):
  remote.c: don't dereference NULL in freeing loop
  pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path
  reftable: don't memset() a NULL from failed malloc()
  diff-lib.c: don't dereference NULL in oneway_diff()
  refs/packed-backend.c: add a BUG() if iter is NULL
  ref-filter.c: BUG() out on show_ref() with NULL refname
  strbuf.c: placate -fanalyzer in strbuf_grow()
  strbuf.c: use st_add3(), not unsigned_add_overflows()
  add-patch: assert parse_diff() expectations with BUG()
  reftable: don't have reader_get_block() confuse -fanalyzer
  blame.c: clarify the state of "final_commit" for -fanalyzer
  pack.h: wrap write_*file*() functions
  pack-write API: pass down "verify" not arbitrary flags
  config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer
  config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro

 Makefile                | 14 +++++++++
 add-patch.c             |  7 ++++-
 blame.c                 |  2 +-
 builtin/fast-import.c   |  2 +-
 builtin/index-pack.c    | 35 +++++++++-------------
 builtin/name-rev.c      |  1 +
 builtin/pack-objects.c  |  9 ++----
 builtin/pull.c          | 14 +++++----
 config.mak.dev          | 65 +++++++++++++++++++++++++++++++++++++++++
 diff-lib.c              |  3 ++
 dir.c                   |  1 +
 git-compat-util.h       | 16 ++++++++++
 gpg-interface.c         |  2 ++
 graph.c                 |  2 ++
 line-log.c              |  2 ++
 midx.c                  |  2 +-
 pack-write.c            | 38 +++++++++++-------------
 pack.h                  | 24 +++++++++------
 ref-filter.c            |  4 ++-
 refs/packed-backend.c   |  2 ++
 reftable/publicbasics.c |  2 ++
 reftable/reader.c       | 11 +++----
 remote.c                |  8 ++---
 strbuf.c                | 10 ++++---
 unpack-trees.c          |  1 +
 utf8.c                  |  1 +
 26 files changed, 195 insertions(+), 83 deletions(-)

-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 21:07   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 02/15] pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path Ævar Arnfjörð Bjarmason
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Fix a bug in fd3cb0501e1 (remote: move static variables into
per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
after having NULL'd out remote->pushurl. itself.

While we're at it let's get rid of the redundant braces per the
CodingGuidelines, which also serves to show in the diff context that
we were doing a FREE_AND_NULL(remote->pushurl) afterwards too, let's
keep that one.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 remote.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/remote.c b/remote.c
index 930fdc9c2f6..61240562df1 100644
--- a/remote.c
+++ b/remote.c
@@ -144,14 +144,10 @@ static void remote_clear(struct remote *remote)
 	free((char *)remote->name);
 	free((char *)remote->foreign_vcs);
 
-	for (i = 0; i < remote->url_nr; i++) {
+	for (i = 0; i < remote->url_nr; i++)
 		free((char *)remote->url[i]);
-	}
-	FREE_AND_NULL(remote->pushurl);
-
-	for (i = 0; i < remote->pushurl_nr; i++) {
+	for (i = 0; i < remote->pushurl_nr; i++)
 		free((char *)remote->pushurl[i]);
-	}
 	FREE_AND_NULL(remote->pushurl);
 	free((char *)remote->receivepack);
 	free((char *)remote->uploadpack);
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 02/15] pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 21:27   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Fix an issue with feeding NULL to strcmp() noted by GCC's -fanalyzer,
this fixes a bug in 1678b81ecce (pull: teach git pull about --rebase,
2015-06-18).

In cmd_pull() we could go through the function without initializing
the "repo" argument (the -fanalyzer output shows how exactly), we'd
then call get_rebase_fork_point (), which would in turn call
get_tracking_branch() with that "NULL" repo argument.

Let's avoid this potential issue by returning NULL in this case, which
will have get_rebase_fork_point() return -1 in turn.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/pull.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/builtin/pull.c b/builtin/pull.c
index 01155ba67b2..ed8df004028 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -756,14 +756,16 @@ static const char *get_tracking_branch(const char *remote, const char *refspec)
 		starts_with(spec_src, "remotes/"))
 		spec_src = "";
 
-	if (*spec_src) {
-		if (!strcmp(remote, "."))
-			merge_branch = mkpath("refs/heads/%s", spec_src);
-		else
-			merge_branch = mkpath("refs/remotes/%s/%s", remote, spec_src);
-	} else
+	if ((*spec_src && !remote) || !*spec_src) {
 		merge_branch = NULL;
+		goto cleanup;
+	}
 
+	if (!strcmp(remote, "."))
+		merge_branch = mkpath("refs/heads/%s", spec_src);
+	else
+		merge_branch = mkpath("refs/remotes/%s/%s", remote, spec_src);
+cleanup:
 	refspec_item_clear(&spec);
 	return merge_branch;
 }
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 02/15] pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 22:22   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 04/15] diff-lib.c: don't dereference NULL in oneway_diff() Ævar Arnfjörð Bjarmason
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Return NULL instead of possibly feeding a NULL to memset() in
reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:

	reftable/publicbasics.c: In function ‘reftable_calloc’:
	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
	   43 |         memset(p, 0, sz);
	      |         ^~~~~~~~~~~~~~~~
	[...]

This bug has been with us ever since this code was added in
ef8a6c62687 (reftable: utility functions, 2021-10-07).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 reftable/publicbasics.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
index 0ad7d5c0ff2..a18167f5ab7 100644
--- a/reftable/publicbasics.c
+++ b/reftable/publicbasics.c
@@ -40,6 +40,8 @@ void reftable_free(void *p)
 void *reftable_calloc(size_t sz)
 {
 	void *p = reftable_malloc(sz);
+	if (!p)
+		return NULL;
 	memset(p, 0, sz);
 	return p;
 }
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 04/15] diff-lib.c: don't dereference NULL in oneway_diff()
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (2 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 22:48   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 05/15] refs/packed-backend.c: add a BUG() if iter is NULL Ævar Arnfjörð Bjarmason
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Fix a control flow issue dating back to d1f2d7e8ca6 (Make
run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19)
where we'd assume "tree" must be non-NULL if idx was NULL. As
-fanalyzer shows we'd end up dereferencing "tree" in that case in
ce_path_match():

dir.h:541:41: warning: dereference of NULL ‘ce’ [CWE-476] [-Wanalyzer-null-dereference]
  541 |                               S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode));
      |                                         ^
  ‘oneway_diff’: events 1-2
    |
    |diff-lib.c:493:12:
    |  493 | static int oneway_diff(const struct cache_entry * const *src,
    |      |            ^~~~~~~~~~~
    |      |            |
    |      |            (1) entry to ‘oneway_diff’
    |......
    |  506 |         if (tree == o->df_conflict_entry)
    |      |            ~
    |      |            |
    |      |            (2) following ‘true’ branch...
    |
  ‘oneway_diff’: event 3
    |
    |  507 |                 tree = NULL;
    |      |                      ^
    |      |                      |
    |      |                      (3) ...to here
    |
  ‘oneway_diff’: events 4-8
    |
    |  507 |                 tree = NULL;
    |      |                      ^
    |      |                      |
    |      |                      (4) ‘tree’ is NULL
    |  508 |
    |  509 |         if (ce_path_match(revs->diffopt.repo->index,
    |      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |             |
    |      |             (5) following ‘false’ branch (when ‘idx’ is NULL)...
    |      |             (6) ...to here
    |      |             (7) ‘tree’ is NULL
    |      |             (8) calling ‘ce_path_match’ from ‘oneway_diff’
    |  510 |                           idx ? idx : tree,
    |      |                           ~~~~~~~~~~~~~~~~~
    |  511 |                           &revs->prune_data, NULL)) {
    |      |                           ~~~~~~~~~~~~~~~~~~~~~~~~
    |
    +--> ‘ce_path_match’: event 9
           |
           |dir.h:535:19:
           |  535 | static inline int ce_path_match(struct index_state *istate,
           |      |                   ^~~~~~~~~~~~~
           |      |                   |
           |      |                   (9) entry to ‘ce_path_match’
           |
         ‘ce_path_match’: event 10
           |
           |  541 |                               S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode));
           |      |                                         ^
           |      |                                         |
           |      |                                         (10) dereference of NULL ‘ce

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 diff-lib.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/diff-lib.c b/diff-lib.c
index ca085a03efc..8373ad7e3ea 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -506,6 +506,9 @@ static int oneway_diff(const struct cache_entry * const *src,
 	if (tree == o->df_conflict_entry)
 		tree = NULL;
 
+	if (!idx && !tree)
+		return 0;
+
 	if (ce_path_match(revs->diffopt.repo->index,
 			  idx ? idx : tree,
 			  &revs->prune_data, NULL)) {
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 05/15] refs/packed-backend.c: add a BUG() if iter is NULL
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (3 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 04/15] diff-lib.c: don't dereference NULL in oneway_diff() Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 23:14   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 06/15] ref-filter.c: BUG() out on show_ref() with NULL refname Ævar Arnfjörð Bjarmason
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Adjust code added in 2775d8724d7 (packed_ref_store: implement
reference transactions, 2017-09-08) to BUG() out in a case there GCC
v12's -fanalyzer flagged that the "iter->oid" seen in the context was
reachable where iter was NULL.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 refs/packed-backend.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 97b68377673..65991bbcaf5 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1226,6 +1226,8 @@ static int write_with_updates(struct packed_ref_store *refs,
 			struct object_id peeled;
 			int peel_error = ref_iterator_peel(iter, &peeled);
 
+			if (!iter)
+				BUG("must have iter if cmp < 0");
 			if (write_packed_entry(out, iter->refname,
 					       iter->oid,
 					       peel_error ? NULL : &peeled))
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 06/15] ref-filter.c: BUG() out on show_ref() with NULL refname
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (4 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 05/15] refs/packed-backend.c: add a BUG() if iter is NULL Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-04 18:07   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow() Ævar Arnfjörð Bjarmason
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Adjust code originally added in 5339bdad96a (ref-filter: introduce
remote_ref_atom_parser(), 2016-02-17) to BUG() out rather than
potentially segfault if we get a NULL refname here.

As noted by GCC v12's -fanalyzer that will happen if this follows the
"refname = NULL" branch added in cc72385fe35 (for-each-ref: let
upstream/push optionally report the remote name, 2017-10-05).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 ref-filter.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index 2413f889f48..91aa8e89268 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1644,7 +1644,9 @@ static void fill_remote_ref_details(struct used_atom *atom, const char *refname,
 				    struct branch *branch, const char **s)
 {
 	int num_ours, num_theirs;
-	if (atom->u.remote_ref.option == RR_REF)
+	if (atom->u.remote_ref.option == RR_REF && !refname)
+		BUG("must get refname with [...]remote_ref.option == RR_REF");
+	else if (atom->u.remote_ref.option == RR_REF)
 		*s = show_ref(&atom->u.remote_ref.refname, refname);
 	else if (atom->u.remote_ref.option == RR_TRACK) {
 		if (stat_tracking_info(branch, &num_ours, &num_theirs,
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow()
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (5 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 06/15] ref-filter.c: BUG() out on show_ref() with NULL refname Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-04 12:24   ` René Scharfe
  2022-06-04 12:46   ` Phillip Wood
  2022-06-03 18:37 ` [RFC PATCH 08/15] strbuf.c: use st_add3(), not unsigned_add_overflows() Ævar Arnfjörð Bjarmason
                   ` (8 subsequent siblings)
  15 siblings, 2 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change the strbuf_grow() function so that GCC v12's -fanalyze doesn't
yell at us about sb->buf[0] dereferencing NULL, this also makes this
code easier to follow.

This BUG() should be unreachable since the state of our "sb->buf" and
"sb->alloc" goes hand-in-hand, but -fanalyzer isn't smart enough to
know that, and adding the BUG() also makes it clearer to human readers
that that's what happens here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 strbuf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index dd9eb85527a..61c4630aeeb 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -97,6 +97,8 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
 	if (new_buf)
 		sb->buf = NULL;
 	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
+	if (new_buf && !sb->buf)
+		BUG("for a new buffer ALLOC_GROW() should always do work!");
 	if (new_buf)
 		sb->buf[0] = '\0';
 }
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 08/15] strbuf.c: use st_add3(), not unsigned_add_overflows()
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (6 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow() Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-04 21:27   ` René Scharfe
  2022-06-03 18:37 ` [RFC PATCH 09/15] add-patch: assert parse_diff() expectations with BUG() Ævar Arnfjörð Bjarmason
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change the strbuf_grow() function to use st_add3() instead of doing
its own unsigned_add_overflows() checks.  The overflow checking here
was originally added in b449f4cfc97 (Rework strbuf API and semantics.,
2007-09-06) and adjusted in 1368f65002b (compat: helper for detecting
unsigned overflow, 2010-10-10). Instead we compute a "sz" with
st_add3().

That was done at a time when the underlying xrealloc() in
git-compat-util.h didn't use st_mult() yet, that has been the case
since the later e7792a74bcf (harden REALLOC_ARRAY and xcalloc against
size_t overflow, 2016-02-22).

The only behavior change here should be the very obscure edge case
that we'd previously die() in cases where we strictly didn't need to,
as we'd check both "extra + 1" and "sb->len + extra + 1" for
overflow. If we overflowed only on the latter but were doing the
former we'd needlessly die() die. I don't think that difference
mattered, but it's noted here for completeness.

While we're at it add an inline comment about why we're adding 1 to
the values, that's also explained in the API documentation in
strbuf.h, but since we're using that magic constant here...

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 strbuf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 61c4630aeeb..f0a74d2bfb1 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -91,12 +91,12 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc)
 void strbuf_grow(struct strbuf *sb, size_t extra)
 {
 	int new_buf = !sb->alloc;
-	if (unsigned_add_overflows(extra, 1) ||
-	    unsigned_add_overflows(sb->len, extra + 1))
-		die("you want to use way too much memory");
+	const size_t sz_buf = new_buf ? 0 : sb->len;
+	const size_t sz = st_add3(sz_buf, extra, 1 /* for \0 */);
+
 	if (new_buf)
 		sb->buf = NULL;
-	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
+	ALLOC_GROW(sb->buf, sz, sb->alloc);
 	if (new_buf && !sb->buf)
 		BUG("for a new buffer ALLOC_GROW() should always do work!");
 	if (new_buf)
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 09/15] add-patch: assert parse_diff() expectations with BUG()
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (7 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 08/15] strbuf.c: use st_add3(), not unsigned_add_overflows() Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-04 13:04   ` Phillip Wood
  2022-06-03 18:37 ` [RFC PATCH 10/15] reftable: don't have reader_get_block() confuse -fanalyzer Ævar Arnfjörð Bjarmason
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Assert that this code added in [1], [2] and other related commits
expects that once we see a "diff " line we should have a non-NULL
"file_diff" and "hunk".

In practice this would have always been the case, as we are parsing
our own "diff" output, but e.g. GCC v12's -fanalyzer doesn't know
that, and will alert us that in the "else if" and below in this
function we could be dereferencing NULL if we were processing anything
except our expected input.

1. f6aa7ecc343 (built-in add -i: start implementing the `patch`
   functionality in C, 2019-12-13)
2. 80399aec5ab (built-in add -p: support multi-file diffs, 2019-12-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 add-patch.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/add-patch.c b/add-patch.c
index 55d719f7845..087bf317b07 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -478,11 +478,16 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 	while (p != pend) {
 		char *eol = memchr(p, '\n', pend - p);
 		const char *deleted = NULL, *mode_change = NULL;
+		const char *const diff_l = "diff ";
+		int is_diff_line = starts_with(p, diff_l);
 
 		if (!eol)
 			eol = pend;
 
-		if (starts_with(p, "diff ")) {
+		if (!is_diff_line && (!file_diff || !hunk))
+			BUG("expected '%s' line to follow a '%s' line", p, diff_l);
+
+		if (is_diff_line) {
 			complete_file(marker, hunk);
 			ALLOC_GROW_BY(s->file_diff, s->file_diff_nr, 1,
 				   file_diff_alloc);
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 10/15] reftable: don't have reader_get_block() confuse -fanalyzer
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (8 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 09/15] add-patch: assert parse_diff() expectations with BUG() Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 11/15] blame.c: clarify the state of "final_commit" for -fanalyzer Ævar Arnfjörð Bjarmason
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change the control flow in reftable code add in 46bc0e731a7 (reftable:
read reftable files, 2021-10-07) to work around a false positive
spotted by GCC's -fanalyzer option[1]. The code was added in
46bc0e731a7 (reftable: read reftable files, 2021-10-07).

Usually we'd just leave such false positives alone, but having looked
at it the control flow was also odd and confusing to humans, so let's
change it.

What -fanalyzer complained about was:

	|......
	|  294 |         if (next_off >= r_size)
	|      |            ~
	|      |            |
	|      |            (24) following ‘false’ branch (when ‘next_off < r_size’)...
	|......
	|  294 |         if (next_off >= r_size)
	|      |            ~
	|      |            |
	|      |            (24) following ‘false’ branch (when ‘next_off < r_size’)...
	|......
	|  297 |         err = reader_get_block(r, &block, next_off, guess_block_size, r->size);
	|      |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	|      |               |
	|      |               (25) ...to here
	|      |               (26) calling ‘reader_get_block’ from ‘reader_init_block_reader’
	|
	+--> ‘reader_get_block’: events 27-29
	       |
	       |   59 | static int reader_get_block(struct reftable_reader *r,
	       |      |            ^~~~~~~~~~~~~~~~
	       |      |            |
	       |      |            (27) entry to ‘reader_get_block’
	       |......
	       |   63 |         if (off >= r_size)
	       |      |            ~
	       |      |            |
	       |      |            (28) following ‘true’ branch (when ‘off >= r_size’)...
	       |   64 |                 return 0;
	       |      |                        ~
	       |      |                        |
	       |      |                        (29) ...to here
	       [...]
	       |  275 |         *typ = data[0];
	       |      |                ~~~~~~~
	       |      |                    |
	       |      |                    (37) ...to here
	       |      |                    (38) dereference of NULL ‘data’

I.e. it thought that we could take the "return 0" in
reader_get_block() due to "(off >= r->size)", which followed the
identical "(next_off >= r_size)" check in reader_init_block_reader()
just before reader_get_block() was called.

But whatever GCC's -fanalyzer thinks it's confusing that we're
checking this twice, and making it look as though these parameters
might change within the reader_init_block_reader() function, but they
won't. So let's just do the check once early, and use "const" types to
assert that they'll be constant throughout.

1. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105285

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 reftable/reader.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/reftable/reader.c b/reftable/reader.c
index 54b4025105c..e1649929b30 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -58,9 +58,9 @@ reader_offsets_for(struct reftable_reader *r, uint8_t typ)
 
 static int reader_get_block(struct reftable_reader *r,
 			    struct reftable_block *dest, uint64_t off,
-			    uint32_t sz)
+			    uint32_t sz, uint64_t r_size)
 {
-	if (off >= r->size)
+	if (off >= r_size)
 		return 0;
 
 	if (off + sz > r->size) {
@@ -281,6 +281,7 @@ static int32_t extract_block_size(uint8_t *data, uint8_t *typ, uint64_t off,
 int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
 			     uint64_t next_off, uint8_t want_typ)
 {
+	const uint64_t r_size = r->size;
 	int32_t guess_block_size = r->block_size ? r->block_size :
 							 DEFAULT_BLOCK_SIZE;
 	struct reftable_block block = { NULL };
@@ -289,10 +290,10 @@ int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
 	uint32_t header_off = next_off ? 0 : header_size(r->version);
 	int32_t block_size = 0;
 
-	if (next_off >= r->size)
+	if (next_off >= r_size)
 		return 1;
 
-	err = reader_get_block(r, &block, next_off, guess_block_size);
+	err = reader_get_block(r, &block, next_off, guess_block_size, r_size);
 	if (err < 0)
 		goto done;
 
@@ -309,7 +310,7 @@ int reader_init_block_reader(struct reftable_reader *r, struct block_reader *br,
 
 	if (block_size > guess_block_size) {
 		reftable_block_done(&block);
-		err = reader_get_block(r, &block, next_off, block_size);
+		err = reader_get_block(r, &block, next_off, block_size, r_size);
 		if (err < 0) {
 			goto done;
 		}
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 11/15] blame.c: clarify the state of "final_commit" for -fanalyzer
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (9 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 10/15] reftable: don't have reader_get_block() confuse -fanalyzer Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 12/15] pack.h: wrap write_*file*() functions Ævar Arnfjörð Bjarmason
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change code added in d0d0ef1f67c (blame: create scoreboard setup
function, 2017-05-24) so that GCC v12's -fanalyzer doesn't think that
we can have a "NULL" final_commit in the modified branch.

This happens because the analyzer gives up in the
prepare_revision_walk() function, and thinks that the "sb->reverse &&
sb->revs->first_parent_only" condition we already checked a few lines
above can have a different result at this point.

That isn't the case, but what we really mean here is "if we previously
set up the final commit [because that was true]", so let's do that
instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 blame.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blame.c b/blame.c
index da1052ac94b..f6b1865ba65 100644
--- a/blame.c
+++ b/blame.c
@@ -2816,7 +2816,7 @@ void setup_scoreboard(struct blame_scoreboard *sb,
 	if (prepare_revision_walk(sb->revs))
 		die(_("revision walk setup failed"));
 
-	if (sb->reverse && sb->revs->first_parent_only) {
+	if (final_commit) {
 		struct commit *c = final_commit;
 
 		sb->revs->children.name = "children";
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 12/15] pack.h: wrap write_*file*() functions
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (10 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 11/15] blame.c: clarify the state of "final_commit" for -fanalyzer Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 13/15] pack-write API: pass down "verify" not arbitrary flags Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Whitespace-wrap some overly long write_*file*() lines in preparation
for changing their signatures in a subsequent commit. I named the
"opts" argument to "write_idx_file()" while I was at it for
consistency.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 pack.h | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/pack.h b/pack.h
index b22bfc4a18e..802c047efe6 100644
--- a/pack.h
+++ b/pack.h
@@ -81,7 +81,10 @@ struct progress;
 /* Note, the data argument could be NULL if object type is blob */
 typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned long, void*, int*);
 
-const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, const unsigned char *sha1);
+const char *write_idx_file(const char *index_name,
+			   struct pack_idx_entry **objects, int nr_objects,
+			   const struct pack_idx_option *opts,
+			   const unsigned char *sha1);
 int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 int verify_pack_index(struct packed_git *);
 int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t);
@@ -93,8 +96,14 @@ struct ref;
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
 
-const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
-const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+const char *write_rev_file(const char *rev_name,
+			   struct pack_idx_entry **objects,
+			   uint32_t nr_objects,
+			   const unsigned char *hash, unsigned flags);
+const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order,
+				 uint32_t nr_objects,
+				 const unsigned char *hash,
+				 unsigned flags);
 
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 13/15] pack-write API: pass down "verify" not arbitrary flags
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (11 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 12/15] pack.h: wrap write_*file*() functions Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 14/15] config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Change the pack-write API to accept a boolean "verify" parameter
instead of passing down the "struct pack_idx_option" flags directly.

This simplifies the code for both humans and machines, e.g. GCC's
-fanalyzer would correctly note that there were potential paths
through this code where we'd deference NULL, but in reality we
wouldn't hit them because certain flags would always go hand-in-hand.

Let's instead separate the underlying API from the total set of flags,
and stop passing the flags down to functions that don't need the full
set of flags, they'll just get the specific state they need.

See e37d0b8730b (builtin/index-pack.c: write reverse indexes,
2021-01-25) for the initial implementation, and [1] for the initial
WIP version of this commit on the ML.

1. https://lore.kernel.org/git/87v93bidhn.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/fast-import.c  |  2 +-
 builtin/index-pack.c   | 35 ++++++++++++++---------------------
 builtin/pack-objects.c |  9 +++------
 midx.c                 |  2 +-
 pack-write.c           | 38 +++++++++++++++++---------------------
 pack.h                 | 15 ++++++---------
 6 files changed, 42 insertions(+), 59 deletions(-)

diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 28d3193c380..7cd2a3b8203 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -787,7 +787,7 @@ static const char *create_index(void)
 		die("internal consistency error creating the index");
 
 	tmpfile = write_idx_file(NULL, idx, object_count, &pack_idx_opts,
-				 pack_data->hash);
+				 pack_data->hash, 0);
 	free(idx);
 	return tmpfile;
 }
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3e385b48002..6691c5836e4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1589,12 +1589,9 @@ static int git_index_pack_config(const char *k, const char *v, void *cb)
 		}
 		return 0;
 	}
-	if (!strcmp(k, "pack.writereverseindex")) {
-		if (git_config_bool(k, v))
-			opts->flags |= WRITE_REV;
-		else
-			opts->flags &= ~WRITE_REV;
-	}
+	if (!strcmp(k, "pack.writereverseindex"))
+		opts->write_rev = git_config_bool(k, v);
+
 	return git_default_config(k, v, cb);
 }
 
@@ -1713,7 +1710,7 @@ static void show_pack_info(int stat_only)
 
 int cmd_index_pack(int argc, const char **argv, const char *prefix)
 {
-	int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index;
+	int i, fix_thin_pack = 0, verify = 0, stat_only = 0;
 	const char *curr_index;
 	const char *curr_rev_index = NULL;
 	const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL;
@@ -1747,10 +1744,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (prefix && chdir(prefix))
 		die(_("Cannot come back to cwd"));
 
-	if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0))
-		rev_index = 1;
-	else
-		rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV));
+	opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX,
+				       opts.write_rev);
 
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
@@ -1835,9 +1830,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 					die(_("unknown hash algorithm '%s'"), arg);
 				repo_set_hash_algo(the_repository, hash_algo);
 			} else if (!strcmp(arg, "--rev-index")) {
-				rev_index = 1;
+				opts.write_rev = 1;
 			} else if (!strcmp(arg, "--no-rev-index")) {
-				rev_index = 0;
+				opts.write_rev = 0;
 			} else
 				usage(index_pack_usage);
 			continue;
@@ -1859,9 +1854,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (!index_name && pack_name)
 		index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf);
 
-	opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY);
-	if (rev_index) {
-		opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV;
+	if (opts.write_rev) {
 		if (index_name)
 			rev_index_name = derive_filename(index_name,
 							 "idx", "rev",
@@ -1872,10 +1865,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		if (!index_name)
 			die(_("--verify with no packfile name given"));
 		read_idx_option(&opts, index_name);
-		opts.flags |= WRITE_IDX_VERIFY | WRITE_IDX_STRICT;
 	}
 	if (strict)
-		opts.flags |= WRITE_IDX_STRICT;
+		opts.write_idx_strict = 1;
 
 	if (HAVE_THREADS && !nr_threads) {
 		nr_threads = online_cpus();
@@ -1919,11 +1911,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	ALLOC_ARRAY(idx_objects, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
-	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash);
-	if (rev_index)
+	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts,
+				    pack_hash, verify);
+	if (opts.write_rev)
 		curr_rev_index = write_rev_file(rev_index_name, idx_objects,
 						nr_objects, pack_hash,
-						opts.flags);
+						verify);
 	free(idx_objects);
 
 	if (!verify)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 014dcd4bc98..6aeaad8560f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3160,10 +3160,7 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		return 0;
 	}
 	if (!strcmp(k, "pack.writereverseindex")) {
-		if (git_config_bool(k, v))
-			pack_idx_opts.flags |= WRITE_REV;
-		else
-			pack_idx_opts.flags &= ~WRITE_REV;
+		pack_idx_opts.write_rev = git_config_bool(k, v);
 		return 0;
 	}
 	if (!strcmp(k, "uploadpack.blobpackfileuri")) {
@@ -4007,8 +4004,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 
 	reset_pack_idx_option(&pack_idx_opts);
 	git_config(git_pack_config, NULL);
-	if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0))
-		pack_idx_opts.flags |= WRITE_REV;
+	pack_idx_opts.write_rev = git_env_bool(GIT_TEST_WRITE_REV_INDEX,
+					       pack_idx_opts.write_rev);
 
 	progress = isatty(2);
 	argc = parse_options(argc, argv, prefix, pack_objects_options,
diff --git a/midx.c b/midx.c
index 3db0e47735f..f0bb56b2c64 100644
--- a/midx.c
+++ b/midx.c
@@ -905,7 +905,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
 	strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash));
 
 	tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr,
-					midx_hash, WRITE_REV);
+					midx_hash, 0);
 
 	if (finalize_object_file(tmp_file, buf.buf))
 		die(_("cannot store reverse index file"));
diff --git a/pack-write.c b/pack-write.c
index 51812cb1299..7973aa75579 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -44,7 +44,7 @@ static int need_large_offset(off_t offset, const struct pack_idx_option *opts)
  */
 const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects,
 			   int nr_objects, const struct pack_idx_option *opts,
-			   const unsigned char *sha1)
+			   const unsigned char *sha1, int verify)
 {
 	struct hashfile *f;
 	struct pack_idx_entry **sorted_by_sha, **list, **last;
@@ -65,7 +65,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 	else
 		sorted_by_sha = list = last = NULL;
 
-	if (opts->flags & WRITE_IDX_VERIFY) {
+	if (verify) {
 		assert(index_name);
 		f = hashfd_check(index_name);
 	} else {
@@ -117,7 +117,7 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 		if (index_version < 2)
 			hashwrite_be32(f, obj->offset);
 		hashwrite(f, obj->oid.hash, the_hash_algo->rawsz);
-		if ((opts->flags & WRITE_IDX_STRICT) &&
+		if (opts->write_idx_strict &&
 		    (i && oideq(&list[-2]->oid, &obj->oid)))
 			die("The same object %s appears twice in the pack",
 			    oid_to_hex(&obj->oid));
@@ -159,9 +159,10 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 	}
 
 	hashwrite(f, sha1, the_hash_algo->rawsz);
+
 	finalize_hashfile(f, NULL, FSYNC_COMPONENT_PACK_METADATA,
 			  CSUM_HASH_IN_STREAM | CSUM_CLOSE |
-			  ((opts->flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC));
+			  (verify ? 0 : CSUM_FSYNC));
 	return index_name;
 }
 
@@ -216,22 +217,19 @@ const char *write_rev_file(const char *rev_name,
 			   struct pack_idx_entry **objects,
 			   uint32_t nr_objects,
 			   const unsigned char *hash,
-			   unsigned flags)
+			   int verify)
 {
 	uint32_t *pack_order;
 	uint32_t i;
 	const char *ret;
 
-	if (!(flags & WRITE_REV) && !(flags & WRITE_REV_VERIFY))
-		return NULL;
-
 	ALLOC_ARRAY(pack_order, nr_objects);
 	for (i = 0; i < nr_objects; i++)
 		pack_order[i] = i;
 	QSORT_S(pack_order, nr_objects, pack_order_cmp, objects);
 
 	ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash,
-				   flags);
+				   verify);
 
 	free(pack_order);
 
@@ -242,15 +240,12 @@ const char *write_rev_file_order(const char *rev_name,
 				 uint32_t *pack_order,
 				 uint32_t nr_objects,
 				 const unsigned char *hash,
-				 unsigned flags)
+				 int verify)
 {
 	struct hashfile *f;
 	int fd;
 
-	if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY))
-		die(_("cannot both write and verify reverse index"));
-
-	if (flags & WRITE_REV) {
+	if (!verify) {
 		if (!rev_name) {
 			struct strbuf tmp_file = STRBUF_INIT;
 			fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX");
@@ -260,7 +255,7 @@ const char *write_rev_file_order(const char *rev_name,
 			fd = xopen(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600);
 		}
 		f = hashfd(fd, rev_name);
-	} else if (flags & WRITE_REV_VERIFY) {
+	} else {
 		struct stat statbuf;
 		if (stat(rev_name, &statbuf)) {
 			if (errno == ENOENT) {
@@ -270,8 +265,7 @@ const char *write_rev_file_order(const char *rev_name,
 				die_errno(_("could not stat: %s"), rev_name);
 		}
 		f = hashfd_check(rev_name);
-	} else
-		return NULL;
+	}
 
 	write_rev_header(f);
 
@@ -283,7 +277,8 @@ const char *write_rev_file_order(const char *rev_name,
 
 	finalize_hashfile(f, NULL, FSYNC_COMPONENT_PACK_METADATA,
 			  CSUM_HASH_IN_STREAM | CSUM_CLOSE |
-			  ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC));
+			  (verify ? 0 : CSUM_FSYNC));
+
 
 	return rev_name;
 }
@@ -494,12 +489,13 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 		die_errno("unable to make temporary pack file readable");
 
 	*idx_tmp_name = (char *)write_idx_file(NULL, written_list, nr_written,
-					       pack_idx_opts, hash);
+					       pack_idx_opts, hash, 0);
 	if (adjust_shared_perm(*idx_tmp_name))
 		die_errno("unable to make temporary index file readable");
 
-	rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash,
-				      pack_idx_opts->flags);
+	if (pack_idx_opts->write_rev)
+		rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash,
+					      0);
 
 	rename_tmp_packfile(name_buffer, pack_tmp_name, "pack");
 	if (rev_tmp_name)
diff --git a/pack.h b/pack.h
index 802c047efe6..ccc4c129b0c 100644
--- a/pack.h
+++ b/pack.h
@@ -38,12 +38,8 @@ struct pack_header {
 #define PACK_IDX_SIGNATURE 0xff744f63	/* "\377tOc" */
 
 struct pack_idx_option {
-	unsigned flags;
-	/* flag bits */
-#define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */
-#define WRITE_IDX_STRICT 02
-#define WRITE_REV 04
-#define WRITE_REV_VERIFY 010
+	unsigned int write_idx_strict:1,
+		     write_rev:1;
 
 	uint32_t version;
 	uint32_t off32_limit;
@@ -84,7 +80,8 @@ typedef int (*verify_fn)(const struct object_id *, enum object_type, unsigned lo
 const char *write_idx_file(const char *index_name,
 			   struct pack_idx_entry **objects, int nr_objects,
 			   const struct pack_idx_option *opts,
-			   const unsigned char *sha1);
+			   const unsigned char *sha1,
+			   int verify);
 int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 int verify_pack_index(struct packed_git *);
 int verify_pack(struct repository *, struct packed_git *, verify_fn fn, struct progress *, uint32_t);
@@ -99,11 +96,11 @@ void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_
 const char *write_rev_file(const char *rev_name,
 			   struct pack_idx_entry **objects,
 			   uint32_t nr_objects,
-			   const unsigned char *hash, unsigned flags);
+			   const unsigned char *hash, int verify);
 const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order,
 				 uint32_t nr_objects,
 				 const unsigned char *hash,
-				 unsigned flags);
+				 int verify);
 
 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 14/15] config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (12 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 13/15] pack-write API: pass down "verify" not arbitrary flags Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-03 18:37 ` [RFC PATCH 15/15] config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro Ævar Arnfjörð Bjarmason
  2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
  15 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add an option to compile with GCC's -fanalyzer, which as noted in [1]
and [2] has become much more useful in the recently released GCC v12
series.

Here we're quieting a few outstanding -fanalyzer issues that require
us to use -Wno-error=* on an entire file:

 * range-diff.c, because it involves loop variables and would be
   painfully verbose to instrument with the ASSERT_FOR_FANALYZER() macro
   introduced in the subsequent commit.

 * http-fetch.c and fsmonitor-settings.c, because those aren't issues
   where we're referencing NULL, and therefore we can't quiet it with an
   assert().

For non-GCC compilers I considered wrapping the DEVOPTS logic in:

	ifneq ($(filter gcc10,$(COMPILER_FEATURES)),)
	endif

Which would make it OK to specify DEVOPTS=analyzer under other
compilers, or on older GCC. But then we'd silently ignore the option
on those. Let's instead trust the compiler to error out if it doesn't
support -fanalyzer.

There are various limitations and bugs in the analyzer engine, e.g. I
filed [3] for a false positive in builtin/merge-file.c before GCC v12
was released, which was subsequently fixed in GCC 12 trunk in [4], but
many other issues remain.

1. https://developers.redhat.com/articles/2022/04/12/state-static-analysis-gcc-12-compiler
2. https://gcc.gnu.org/gcc-12/changes.html
3. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105264
4. https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a358e4b60815b41e27f3508014ceb592f86b9b45

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile       | 14 ++++++++++++
 config.mak.dev | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)

diff --git a/Makefile b/Makefile
index 18ca6744a50..129d55f5992 100644
--- a/Makefile
+++ b/Makefile
@@ -507,6 +507,20 @@ include shared.mak
 #    no-pedantic:
 #
 #        Disable -pedantic compilation.
+#
+#    analyzer:
+#
+#        Compile with GCC's -fanalyzer, this analysis is much more
+#        expensive than other GCC warnings.
+#
+#        The set of analysis flags is curated based on known issues
+#        and compiler version. Known issues are made into non-fatal
+#        warnings (even "no-error" isn't set).
+#
+#    no-suppress-analyzer:
+#
+#        When using "analyzer" disable the suppression of known
+#        -fanalyzer issues.
 
 GIT-VERSION-FILE: FORCE
 	@$(SHELL_PATH) ./GIT-VERSION-GEN
diff --git a/config.mak.dev b/config.mak.dev
index c3104f400b2..d6f5be92297 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -72,3 +72,64 @@ DEVELOPER_CFLAGS += -Wno-error=dangling-pointer
 endif
 
 GIT_TEST_PERL_FATAL_WARNINGS = YesPlease
+
+# GCC's -fanalyzer mode
+ifeq ($(filter analyzer,$(DEVOPTS)),analyzer)
+
+ifeq ($(filter gcc1,$(COMPILER_FEATURES)),)
+$(error you must be using a new-ish version of GCC for DEVOPTS=analyzer, your \
+$(CC) is not GCC at all!)
+endif
+
+DEVELOPER_CFLAGS += -fanalyzer
+
+## -fanalyzer exists exists as of gcc10, but versions older than gcc12
+## have a lot of false positives.
+ifeq ($(filter gcc12,$(COMPILER_FEATURES)),)
+DEVELOPER_CFLAGS += -Wno-analyzer-double-free
+DEVELOPER_CFLAGS += -Wno-analyzer-free-of-non-heap
+endif
+
+## Helper templates to whitelist existing issues
+define fn_disable_analyzer_tmpl
+$(1).sp $(1).s $(1).o: EXTRA_CPPFLAGS += $(2)
+
+endef
+
+define fn_disable_analyzer
+$(foreach f,$(2),$(call fn_disable_analyzer_tmpl,$(f),$(1)))
+endef
+
+## -Wno-error=analyzer-null-dereference
+$(eval $(call fn_disable_analyzer, \
+	-Wno-error=analyzer-null-dereference, \
+	range-diff \
+))
+## -Wno-error=analyzer-malloc-leak
+$(eval $(call fn_disable_analyzer, \
+	-Wno-error=analyzer-malloc-leak, \
+	fsmonitor-settings \
+))
+## per-GCC version annotations
+### -Wno-error=analyzer-use-of-uninitialized-value: gcc >= 12
+ifneq ($(filter gcc12,$(COMPILER_FEATURES)),)
+$(eval $(call fn_disable_analyzer, \
+	-Wno-error=analyzer-use-of-uninitialized-value, \
+	http-fetch \
+))
+else # gcc < 12
+### -Wno-error=analyzer-null-dereference: gcc == 11
+ifneq ($(filter gcc11,$(COMPILER_FEATURES)),)
+$(eval $(call fn_disable_analyzer, \
+	-Wno-error=analyzer-null-dereference, \
+	merge \
+	xdiff/xemit \
+	reftable/reader \
+))
+else
+$(error Your GCC version is too old for -fanalyze, or you are using \
+gcc10 which has it, but has too many false positives!)
+endif
+endif # gcc < 12
+
+endif
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [RFC PATCH 15/15] config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (13 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 14/15] config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer Ævar Arnfjörð Bjarmason
@ 2022-06-03 18:37 ` Ævar Arnfjörð Bjarmason
  2022-06-04 13:12   ` Phillip Wood
  2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
  15 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-03 18:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Phillip Wood,
	Glen Choo, Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau, Ævar Arnfjörð Bjarmason

Add an ASSERT_FOR_FANALYZER() macro to quiet those -fanalyzer issues
that haven't been diagnosed yet to either change the code involved, or
to add an assert() or BUG() to it to placate GCC v12's -fanalyzer
mode.

As in the preceding commit we could also use -Wno-error=* here, which
was the initial implementation in config.mak.dev, i.e. something like:

	## -Wno-error=analyzer-null-dereference
	$(eval $(call fn_disable_analyzer, \
		-Wno-error=analyzer-null-dereference, \
		dir \
		gpg-interface \
		graph \
		range-diff \
		utf8 \
	))
	## -Wno-error=analyzer-null-dereference: pre-gcc12
	ifeq ($(filter gcc12,$(COMPILER_FEATURES)),)
	$(eval $(call fn_disable_analyzer, \
		-Wno-error=analyzer-null-dereference, \
		merge \
		builtin/name-rev \
	))
	endif

But any such approach will unfortunately quiet *all* in the relevant
files, including any new ones. Instead we want to quiet specific
issues involved with ASSERT_FOR_FANALYZER().

A drawback of this overall approach is that this only works under
e.g. CFLAGS=-O0, but not CFLAGS=-O3. The compiler takes the liberty to
re-arrange our code to get around these assert()s, and other new
issues will also crop up. All of this is expected (-fanalyzer is
subject to optimization passes, like most other options), but let's
focus on -O0 for now.

Commentary on specific cases:

 * builtin/name-rev.c: Since 45a14f578e1 (Revert "name-rev: release
   unused name strings", 2022-04-22) GCC v12's -fanalyzer has complained
   about this code, per René's analysis in [1] it's incorrect to do
   so. So let's add an ASSERT_FOR_FANALYZER() to placate it.

 * graph.c: In 0f0f389f120 (graph: tidy up display of left-skewed
   merges, 2019-10-15) a previous "assert" was removed, and another
   unconditional deference of the return value of
   first_interesting_parent() was added without checking it. Since it
   can return NULL let's add ASSERT_FOR_FANALYZER()'s here to note for
   -fanalyzer that we won't be dereferencing these in practice.

1. https://lore.kernel.org/git/dece627d-ccf8-2375-0c50-c59637e561d3@web.de/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/name-rev.c |  1 +
 config.mak.dev     |  4 ++++
 dir.c              |  1 +
 git-compat-util.h  | 16 ++++++++++++++++
 gpg-interface.c    |  2 ++
 graph.c            |  2 ++
 line-log.c         |  2 ++
 unpack-trees.c     |  1 +
 utf8.c             |  1 +
 9 files changed, 30 insertions(+)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 02ea9d16330..419159961b9 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -223,6 +223,7 @@ static void name_rev(struct commit *start_commit,
 			if (commit_is_before_cutoff(parent))
 				continue;
 
+			ASSERT_FOR_FANALYZER(name);
 			if (parent_number > 1) {
 				generation = 0;
 				distance = name->distance + MERGE_TRAVERSAL_WEIGHT;
diff --git a/config.mak.dev b/config.mak.dev
index d6f5be92297..1d47fc04163 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -83,6 +83,10 @@ endif
 
 DEVELOPER_CFLAGS += -fanalyzer
 
+ifeq ($(filter no-suppress-analyzer,$(DEVOPTS)),)
+DEVELOPER_CFLAGS += -DSUPPRESS_FSANITIZER
+endif
+
 ## -fanalyzer exists exists as of gcc10, but versions older than gcc12
 ## have a lot of false positives.
 ifeq ($(filter gcc12,$(COMPILER_FEATURES)),)
diff --git a/dir.c b/dir.c
index 3633c0852ee..c80b584b4a7 100644
--- a/dir.c
+++ b/dir.c
@@ -3781,6 +3781,7 @@ static void invalidate_one_directory(struct untracked_cache *uc,
 				     struct untracked_cache_dir *ucd)
 {
 	uc->dir_invalidated++;
+	ASSERT_FOR_FANALYZER(ucd);
 	ucd->valid = 0;
 	ucd->untracked_nr = 0;
 }
diff --git a/git-compat-util.h b/git-compat-util.h
index 96293b6c43a..a553884c048 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -984,6 +984,21 @@ static inline unsigned long cast_size_t_to_ulong(size_t a)
 	return (unsigned long)a;
 }
 
+/**
+ * Transitory helper macro to quiet currently known GCC -fsanitizer
+ * issues.
+ *
+ * We lie to it and say that we're confident that the given "expr" to
+ * ASSERT_FOR_FANALYZER() cannot be NULL (or 0). Those
+ * grandfathered-in issues should be fixed, but at least we're
+ * stemming the tide.
+ */
+#ifdef SUPPRESS_FSANITIZER
+#define ASSERT_FOR_FANALYZER(expr) assert((expr))
+#else
+#define ASSERT_FOR_FANALYZER(expr) do {} while (0)
+#endif
+
 #ifdef HAVE_ALLOCA_H
 # include <alloca.h>
 # define xalloca(size)      (alloca(size))
@@ -1037,6 +1052,7 @@ int xstrncmpz(const char *s, const char *t, size_t len);
 	BUILD_ASSERT_OR_ZERO(sizeof(*(dst)) == sizeof(*(src))))
 static inline void copy_array(void *dst, const void *src, size_t n, size_t size)
 {
+	ASSERT_FOR_FANALYZER(dst);
 	if (n)
 		memcpy(dst, src, st_mult(size, n));
 }
diff --git a/gpg-interface.c b/gpg-interface.c
index 280f1fa1a58..9cba3b86e45 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -242,6 +242,7 @@ static void parse_gpg_output(struct signature_check *sigc)
 					next = strchrnul(line, ' ');
 					replace_cstring(&sigc->key, line, next);
 					/* Do we have signer information? */
+					ASSERT_FOR_FANALYZER(next);
 					if (*next && (sigcheck_gpg_status[i].flags & GPG_STATUS_UID)) {
 						line = next + 1;
 						next = strchrnul(line, '\n');
@@ -283,6 +284,7 @@ static void parse_gpg_output(struct signature_check *sigc)
 					 */
 					limit = strchrnul(line, '\n');
 					for (j = 9; j > 0; j--) {
+						ASSERT_FOR_FANALYZER(next);
 						if (!*next || limit <= next)
 							break;
 						line = next + 1;
diff --git a/graph.c b/graph.c
index 568b6e7cd41..39f7e95d4ab 100644
--- a/graph.c
+++ b/graph.c
@@ -1105,6 +1105,7 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
 			seen_this = 1;
 
 			for (j = 0; j < graph->num_parents; j++) {
+				ASSERT_FOR_FANALYZER(parents);
 				par_column = graph_find_new_column_by_commit(graph, parents->item);
 				assert(par_column >= 0);
 
@@ -1138,6 +1139,7 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
 			}
 		}
 
+		ASSERT_FOR_FANALYZER(first_parent);
 		if (col_commit == first_parent->item)
 			parent_col = col;
 	}
diff --git a/line-log.c b/line-log.c
index 51d93310a4d..1295f46deaf 100644
--- a/line-log.c
+++ b/line-log.c
@@ -154,6 +154,7 @@ static void range_set_union(struct range_set *out,
 	while (i < a->nr || j < b->nr) {
 		struct range *new_range;
 		if (i < a->nr && j < b->nr) {
+			ASSERT_FOR_FANALYZER(rb);
 			if (ra[i].start < rb[j].start)
 				new_range = &ra[i++];
 			else if (ra[i].start > rb[j].start)
@@ -166,6 +167,7 @@ static void range_set_union(struct range_set *out,
 			new_range = &ra[i++];
 		else                       /* a exhausted */
 			new_range = &rb[j++];
+		ASSERT_FOR_FANALYZER(new_range);
 		if (new_range->start == new_range->end)
 			; /* empty range */
 		else if (!out->nr || out->ranges[out->nr-1].end < new_range->start) {
diff --git a/unpack-trees.c b/unpack-trees.c
index a1d0ff3a4d3..b6e10b05a00 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2195,6 +2195,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	char *pathbuf;
 	int cnt = 0;
 
+	ASSERT_FOR_FANALYZER(ce);
 	if (S_ISGITLINK(ce->ce_mode)) {
 		struct object_id oid;
 		int sub_head = resolve_gitlink_ref(ce->name, "HEAD", &oid);
diff --git a/utf8.c b/utf8.c
index de4ce5c0e68..3ffc0a97f3b 100644
--- a/utf8.c
+++ b/utf8.c
@@ -130,6 +130,7 @@ static ucs_char_t pick_one_utf8_char(const char **start, size_t *remainder_p)
 	 */
 	remainder = (remainder_p ? *remainder_p : 999);
 
+	ASSERT_FOR_FANALYZER(remainder < 1 || s);
 	if (remainder < 1) {
 		goto invalid;
 	} else if (*s < 0x80) {
-- 
2.36.1.1124.g577fa9c2ebd


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop
  2022-06-03 18:37 ` [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
@ 2022-06-03 21:07   ` René Scharfe
  2022-06-03 21:28     ` Junio C Hamano
                       ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-03 21:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Fix a bug in fd3cb0501e1 (remote: move static variables into
> per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
> after having NULL'd out remote->pushurl. itself.
>
> While we're at it let's get rid of the redundant braces per the
> CodingGuidelines, which also serves to show in the diff context that
> we were doing a FREE_AND_NULL(remote->pushurl) afterwards too, let's
> keep that one.

The extended context is helping, but the brace removal makes this change
harder to read.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  remote.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/remote.c b/remote.c
> index 930fdc9c2f6..61240562df1 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -144,14 +144,10 @@ static void remote_clear(struct remote *remote)
>  	free((char *)remote->name);
>  	free((char *)remote->foreign_vcs);
>
> -	for (i = 0; i < remote->url_nr; i++) {
> +	for (i = 0; i < remote->url_nr; i++)
>  		free((char *)remote->url[i]);
> -	}
> -	FREE_AND_NULL(remote->pushurl);

So you remove the redundant release of the pushurl array, but the url
array, which should be freed here after its elements are released, still
leaks.  The duplicate FREE_AND_NULL perhaps resulted from a copy&paste
error.

> -
> -	for (i = 0; i < remote->pushurl_nr; i++) {
> +	for (i = 0; i < remote->pushurl_nr; i++)
>  		free((char *)remote->pushurl[i]);
> -	}
>  	FREE_AND_NULL(remote->pushurl);

Why set pushurl to NULL after release?  This results in an invalid state
unless pushurl_nr und pushurl_alloc are reset to zero.  Same goes for
the url array above -- either a simple free(3) call suffices or url_nr
and url_alloc need to be cleared as well.

>  	free((char *)remote->receivepack);
>  	free((char *)remote->uploadpack);

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 02/15] pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path
  2022-06-03 18:37 ` [RFC PATCH 02/15] pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path Ævar Arnfjörð Bjarmason
@ 2022-06-03 21:27   ` René Scharfe
  0 siblings, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-03 21:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Fix an issue with feeding NULL to strcmp() noted by GCC's -fanalyzer,
> this fixes a bug in 1678b81ecce (pull: teach git pull about --rebase,
> 2015-06-18).
>
> In cmd_pull() we could go through the function without initializing
> the "repo" argument (the -fanalyzer output shows how exactly), we'd
> then call get_rebase_fork_point (), which would in turn call
> get_tracking_branch() with that "NULL" repo argument.
>
> Let's avoid this potential issue by returning NULL in this case, which
> will have get_rebase_fork_point() return -1 in turn.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/pull.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/pull.c b/builtin/pull.c
> index 01155ba67b2..ed8df004028 100644
> --- a/builtin/pull.c
> +++ b/builtin/pull.c
> @@ -756,14 +756,16 @@ static const char *get_tracking_branch(const char *remote, const char *refspec)
>  		starts_with(spec_src, "remotes/"))
>  		spec_src = "";
>
> -	if (*spec_src) {
> -		if (!strcmp(remote, "."))
> -			merge_branch = mkpath("refs/heads/%s", spec_src);
> -		else
> -			merge_branch = mkpath("refs/remotes/%s/%s", remote, spec_src);
> -	} else
> +	if ((*spec_src && !remote) || !*spec_src) {

This is equivalent to:

+	if (!*spec_src || !remote) {

It's easier to read without the redundant term.

The patch would be even easier to understand as a one-liner that just adds
the missing check, i.e.:

-	if (*spec_src) {
+	if (*spec_src && remote) {

>  		merge_branch = NULL;
> +		goto cleanup;
> +	}
>
> +	if (!strcmp(remote, "."))
> +		merge_branch = mkpath("refs/heads/%s", spec_src);
> +	else
> +		merge_branch = mkpath("refs/remotes/%s/%s", remote, spec_src);
> +cleanup:
>  	refspec_item_clear(&spec);
>  	return merge_branch;
>  }

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop
  2022-06-03 21:07   ` René Scharfe
@ 2022-06-03 21:28     ` Junio C Hamano
  2022-06-03 22:32     ` Glen Choo
  2022-06-04 12:51     ` Phillip Wood
  2 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2022-06-03 21:28 UTC (permalink / raw)
  To: René Scharfe
  Cc: Ævar Arnfjörð Bjarmason, git, Jinoh Kang,
	Phillip Wood, Glen Choo, Paul Tan, Han-Wen Nienhuys,
	Karthik Nayak, Jeff Smith, Taylor Blau

René Scharfe <l.s.r@web.de> writes:

>> -	for (i = 0; i < remote->pushurl_nr; i++) {
>> +	for (i = 0; i < remote->pushurl_nr; i++)
>>  		free((char *)remote->pushurl[i]);
>> -	}
>>  	FREE_AND_NULL(remote->pushurl);
>
> Why set pushurl to NULL after release?  This results in an invalid state
> unless pushurl_nr und pushurl_alloc are reset to zero.  Same goes for
> the url array above -- either a simple free(3) call suffices or url_nr
> and url_alloc need to be cleared as well.

We probably should give a huge warning next to FREE_AND_NULL() about
this.  It also is an effective way to hide an existing bug under the
rug.  diff_options.pathspec might be freed prematurely which may be
noticed by a use-after-free if left to use free(), but FREE_AND_NULL()
will mislead the use-after-free caller into thinkig that "ah there is
no pathspec to be used" and produce nonsense result without crashing.

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-03 18:37 ` [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
@ 2022-06-03 22:22   ` René Scharfe
  2022-06-04  0:54     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 51+ messages in thread
From: René Scharfe @ 2022-06-03 22:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Return NULL instead of possibly feeding a NULL to memset() in
> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:
>
> 	reftable/publicbasics.c: In function ‘reftable_calloc’:
> 	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
> 	   43 |         memset(p, 0, sz);
> 	      |         ^~~~~~~~~~~~~~~~
> 	[...]
>
> This bug has been with us ever since this code was added in
> ef8a6c62687 (reftable: utility functions, 2021-10-07).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  reftable/publicbasics.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
> index 0ad7d5c0ff2..a18167f5ab7 100644
> --- a/reftable/publicbasics.c
> +++ b/reftable/publicbasics.c
> @@ -40,6 +40,8 @@ void reftable_free(void *p)
>  void *reftable_calloc(size_t sz)
>  {
>  	void *p = reftable_malloc(sz);
> +	if (!p)
> +		return NULL;
>  	memset(p, 0, sz);
>  	return p;
>  }

We discussed this before, in
https://lore.kernel.org/git/RFC-patch-2.2-364d1194a95-20220415T101740Z-avarab@gmail.com/

If this code was actually used by Git and still not handling allocation
failures then I'd propose something like the below instead.

Next I'd probably try to convert reftable_calloc() calls to a variant
that takes size and count separately -- like calloc(3) does -- to avoid
unchecked multiplication.

--- >8 ---
Subject: [PATCH] reftable: remove reftable_set_alloc()

Use our native allocation functions for the reftable code to provide
consistent handling of allocation failures.  Remove the ability to
change the allocator, as we don't need it anymore after this change.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 reftable/basics.h          |  7 -------
 reftable/publicbasics.c    | 43 --------------------------------------
 reftable/reftable-malloc.h | 18 ----------------
 reftable/system.h          |  5 +++++
 4 files changed, 5 insertions(+), 68 deletions(-)
 delete mode 100644 reftable/reftable-malloc.h

diff --git a/reftable/basics.h b/reftable/basics.h
index 096b36862b..3386ff6484 100644
--- a/reftable/basics.h
+++ b/reftable/basics.h
@@ -46,13 +46,6 @@ int names_equal(char **a, char **b);
 /* returns the array size of a NULL-terminated array of strings. */
 int names_length(char **names);

-/* Allocation routines; they invoke the functions set through
- * reftable_set_alloc() */
-void *reftable_malloc(size_t sz);
-void *reftable_realloc(void *p, size_t sz);
-void reftable_free(void *p);
-void *reftable_calloc(size_t sz);
-
 /* Find the longest shared prefix size of `a` and `b` */
 struct strbuf;
 int common_prefix_size(struct strbuf *a, struct strbuf *b);
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
index 0ad7d5c0ff..4dd4204ce6 100644
--- a/reftable/publicbasics.c
+++ b/reftable/publicbasics.c
@@ -6,52 +6,9 @@ license that can be found in the LICENSE file or at
 https://developers.google.com/open-source/licenses/bsd
 */

-#include "reftable-malloc.h"
-
 #include "basics.h"
 #include "system.h"

-static void *(*reftable_malloc_ptr)(size_t sz);
-static void *(*reftable_realloc_ptr)(void *, size_t);
-static void (*reftable_free_ptr)(void *);
-
-void *reftable_malloc(size_t sz)
-{
-	if (reftable_malloc_ptr)
-		return (*reftable_malloc_ptr)(sz);
-	return malloc(sz);
-}
-
-void *reftable_realloc(void *p, size_t sz)
-{
-	if (reftable_realloc_ptr)
-		return (*reftable_realloc_ptr)(p, sz);
-	return realloc(p, sz);
-}
-
-void reftable_free(void *p)
-{
-	if (reftable_free_ptr)
-		reftable_free_ptr(p);
-	else
-		free(p);
-}
-
-void *reftable_calloc(size_t sz)
-{
-	void *p = reftable_malloc(sz);
-	memset(p, 0, sz);
-	return p;
-}
-
-void reftable_set_alloc(void *(*malloc)(size_t),
-			void *(*realloc)(void *, size_t), void (*free)(void *))
-{
-	reftable_malloc_ptr = malloc;
-	reftable_realloc_ptr = realloc;
-	reftable_free_ptr = free;
-}
-
 int hash_size(uint32_t id)
 {
 	switch (id) {
diff --git a/reftable/reftable-malloc.h b/reftable/reftable-malloc.h
deleted file mode 100644
index 5f2185f1f3..0000000000
--- a/reftable/reftable-malloc.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*
-Copyright 2020 Google LLC
-
-Use of this source code is governed by a BSD-style
-license that can be found in the LICENSE file or at
-https://developers.google.com/open-source/licenses/bsd
-*/
-
-#ifndef REFTABLE_H
-#define REFTABLE_H
-
-#include <stddef.h>
-
-/* Overrides the functions to use for memory management. */
-void reftable_set_alloc(void *(*malloc)(size_t),
-			void *(*realloc)(void *, size_t), void (*free)(void *));
-
-#endif
diff --git a/reftable/system.h b/reftable/system.h
index 18f9207dfe..d7c7c20115 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -16,6 +16,11 @@ license that can be found in the LICENSE file or at
 #include "hash.h" /* hash ID, sizes.*/
 #include "dir.h" /* remove_dir_recursively, for tests.*/

+#define reftable_malloc(x) xmalloc(x)
+#define reftable_realloc(x, y) xrealloc((x), (y))
+#define reftable_free(x) free(x)
+#define reftable_calloc(x) xcalloc(1, (x))
+
 int hash_size(uint32_t id);

 #endif
--
2.35.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop
  2022-06-03 21:07   ` René Scharfe
  2022-06-03 21:28     ` Junio C Hamano
@ 2022-06-03 22:32     ` Glen Choo
  2022-06-04 12:51     ` Phillip Wood
  2 siblings, 0 replies; 51+ messages in thread
From: Glen Choo @ 2022-06-03 22:32 UTC (permalink / raw)
  To: René Scharfe, Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

René Scharfe <l.s.r@web.de> writes:

> Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  remote.c | 8 ++------
>>  1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/remote.c b/remote.c
>> index 930fdc9c2f6..61240562df1 100644
>> --- a/remote.c
>> +++ b/remote.c
>> @@ -144,14 +144,10 @@ static void remote_clear(struct remote *remote)
>>  	free((char *)remote->name);
>>  	free((char *)remote->foreign_vcs);
>>
>> -	for (i = 0; i < remote->url_nr; i++) {
>> +	for (i = 0; i < remote->url_nr; i++)
>>  		free((char *)remote->url[i]);
>> -	}
>> -	FREE_AND_NULL(remote->pushurl);
>
> So you remove the redundant release of the pushurl array, but the url
> array, which should be freed here after its elements are released, still
> leaks.  The duplicate FREE_AND_NULL perhaps resulted from a copy&paste
> error.

This seems like the most likely explanation (even I don't recall how I
ended up introducing this bug..). I probably meant to change pushurl ->
url, but forgot somehow.

>> -
>> -	for (i = 0; i < remote->pushurl_nr; i++) {
>> +	for (i = 0; i < remote->pushurl_nr; i++)
>>  		free((char *)remote->pushurl[i]);
>> -	}
>>  	FREE_AND_NULL(remote->pushurl);
>
> Why set pushurl to NULL after release?  This results in an invalid state
> unless pushurl_nr und pushurl_alloc are reset to zero.  Same goes for
> the url array above -- either a simple free(3) call suffices or url_nr
> and url_alloc need to be cleared as well.

True, thanks for catching that. We'd probably never notice it because we
never reuse the clear-ed `struct remote` (since we only ever call this
when we clean up `struct repository`), but it's better to do the 'right
thing'.

I didn't consider that when I chose to use FREE_AND_NULL() instead of
free(), and there wasn't any particular reason why I chose one over the
other.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 04/15] diff-lib.c: don't dereference NULL in oneway_diff()
  2022-06-03 18:37 ` [RFC PATCH 04/15] diff-lib.c: don't dereference NULL in oneway_diff() Ævar Arnfjörð Bjarmason
@ 2022-06-03 22:48   ` René Scharfe
  0 siblings, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-03 22:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Fix a control flow issue dating back to d1f2d7e8ca6 (Make
> run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19)
> where we'd assume "tree" must be non-NULL if idx was NULL. As
> -fanalyzer shows we'd end up dereferencing "tree" in that case in
> ce_path_match():
>
> dir.h:541:41: warning: dereference of NULL ‘ce’ [CWE-476] [-Wanalyzer-null-dereference]
>   541 |                               S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode));
>       |                                         ^
>   ‘oneway_diff’: events 1-2
>     |
>     |diff-lib.c:493:12:
>     |  493 | static int oneway_diff(const struct cache_entry * const *src,
>     |      |            ^~~~~~~~~~~
>     |      |            |
>     |      |            (1) entry to ‘oneway_diff’
>     |......
>     |  506 |         if (tree == o->df_conflict_entry)
>     |      |            ~
>     |      |            |
>     |      |            (2) following ‘true’ branch...
>     |
>   ‘oneway_diff’: event 3
>     |
>     |  507 |                 tree = NULL;
>     |      |                      ^
>     |      |                      |
>     |      |                      (3) ...to here
>     |
>   ‘oneway_diff’: events 4-8
>     |
>     |  507 |                 tree = NULL;
>     |      |                      ^
>     |      |                      |
>     |      |                      (4) ‘tree’ is NULL
>     |  508 |
>     |  509 |         if (ce_path_match(revs->diffopt.repo->index,
>     |      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |             |
>     |      |             (5) following ‘false’ branch (when ‘idx’ is NULL)...
>     |      |             (6) ...to here
>     |      |             (7) ‘tree’ is NULL
>     |      |             (8) calling ‘ce_path_match’ from ‘oneway_diff’
>     |  510 |                           idx ? idx : tree,
>     |      |                           ~~~~~~~~~~~~~~~~~
>     |  511 |                           &revs->prune_data, NULL)) {
>     |      |                           ~~~~~~~~~~~~~~~~~~~~~~~~
>     |
>     +--> ‘ce_path_match’: event 9
>            |
>            |dir.h:535:19:
>            |  535 | static inline int ce_path_match(struct index_state *istate,
>            |      |                   ^~~~~~~~~~~~~
>            |      |                   |
>            |      |                   (9) entry to ‘ce_path_match’
>            |
>          ‘ce_path_match’: event 10
>            |
>            |  541 |                               S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode));
>            |      |                                         ^
>            |      |                                         |
>            |      |                                         (10) dereference of NULL ‘ce
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  diff-lib.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/diff-lib.c b/diff-lib.c
> index ca085a03efc..8373ad7e3ea 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -506,6 +506,9 @@ static int oneway_diff(const struct cache_entry * const *src,
>  	if (tree == o->df_conflict_entry)
>  		tree = NULL;

So here we have a D/F conflict in a oneway diff, i.e. a single tree.
That means if we discard the thing from the tree, then we still have
the conflicting thing from the index.  Meaning idx and tree cannot both
be NULL in that D/F conflict scenario.  Right?

>
> +	if (!idx && !tree)
> +		return 0;

That calms down the confused compiler, but would it perhaps be better to
BUG out at this point?  Or is there a valid state with both idx and tree
being NULL?

> +
>  	if (ce_path_match(revs->diffopt.repo->index,
>  			  idx ? idx : tree,>  			  &revs->prune_data, NULL)) {

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 05/15] refs/packed-backend.c: add a BUG() if iter is NULL
  2022-06-03 18:37 ` [RFC PATCH 05/15] refs/packed-backend.c: add a BUG() if iter is NULL Ævar Arnfjörð Bjarmason
@ 2022-06-03 23:14   ` René Scharfe
  0 siblings, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-03 23:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Adjust code added in 2775d8724d7 (packed_ref_store: implement
> reference transactions, 2017-09-08) to BUG() out in a case there GCC
> v12's -fanalyzer flagged that the "iter->oid" seen in the context was
> reachable where iter was NULL.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  refs/packed-backend.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 97b68377673..65991bbcaf5 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1226,6 +1226,8 @@ static int write_with_updates(struct packed_ref_store *refs,
>  			struct object_id peeled;
>  			int peel_error = ref_iterator_peel(iter, &peeled);

Some peel backends dereference iter, e.g. packed_ref_iterator_peel() from
the same file.

>
> +			if (!iter)
> +				BUG("must have iter if cmp < 0");

So if iter can be NULL at this point, then it must be checked before the
ref_iterator_peel() call above, no?

*Can* it be NULL, though?  The code in question is a bit complicated and
it's late around here, but I cannot see how cmp < 0 and !iter can both
be true at this place in the code.  An optimizing compiler will probably
optimize out the added check and string.  Mine does (clang 13); you can
check if yours does the same using:

  $ make refs/packed-backend.o && grep "must have" refs/packed-backend.o

>  			if (write_packed_entry(out, iter->refname,
>  					       iter->oid,
>  					       peel_error ? NULL : &peeled))

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-03 22:22   ` René Scharfe
@ 2022-06-04  0:54     ` Ævar Arnfjörð Bjarmason
  2022-06-04 12:24       ` René Scharfe
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-04  0:54 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


On Sat, Jun 04 2022, René Scharfe wrote:

> Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
>> Return NULL instead of possibly feeding a NULL to memset() in
>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer:
>>
>> 	reftable/publicbasics.c: In function ‘reftable_calloc’:
>> 	reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument]
>> 	   43 |         memset(p, 0, sz);
>> 	      |         ^~~~~~~~~~~~~~~~
>> 	[...]
>>
>> This bug has been with us ever since this code was added in
>> ef8a6c62687 (reftable: utility functions, 2021-10-07).
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  reftable/publicbasics.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c
>> index 0ad7d5c0ff2..a18167f5ab7 100644
>> --- a/reftable/publicbasics.c
>> +++ b/reftable/publicbasics.c
>> @@ -40,6 +40,8 @@ void reftable_free(void *p)
>>  void *reftable_calloc(size_t sz)
>>  {
>>  	void *p = reftable_malloc(sz);
>> +	if (!p)
>> +		return NULL;
>>  	memset(p, 0, sz);
>>  	return p;
>>  }
>
> We discussed this before, in
> https://lore.kernel.org/git/RFC-patch-2.2-364d1194a95-20220415T101740Z-avarab@gmail.com/
>
> If this code was actually used by Git and still not handling allocation
> failures then I'd propose something like the below instead.
>
> Next I'd probably try to convert reftable_calloc() calls to a variant
> that takes size and count separately -- like calloc(3) does -- to avoid
> unchecked multiplication.
>
> --- >8 ---
> Subject: [PATCH] reftable: remove reftable_set_alloc()

I think this is a much better direction than my more narrow fix, and
would be happy to see it queued up.

To your comment here & some others (e.g. FREE_AND_NULL()): I was really
trying to focus on narrowly addressing these -fanalyzer issues without
digressing into the larger topics "what is this code *really* doing, and
does it make sense?". It was pretty unavoidable in 13/15 though.

Which isn't to say that I shouldn't fix some of it, e.g. your
s/return/BUG()/ suggestion, but I think it's best to view these patches
with an eye towards us already having these issues, and in most cases
making -fanalyzer happy is a small cost.

And by doing so and getting a "clean build" we'll be able to turn it on
in CI, and thus notice when we run into new -fanalyzer issues.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-04  0:54     ` Ævar Arnfjörð Bjarmason
@ 2022-06-04 12:24       ` René Scharfe
  2022-06-04 16:23         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 51+ messages in thread
From: René Scharfe @ 2022-06-04 12:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau

Am 04.06.22 um 02:54 schrieb Ævar Arnfjörð Bjarmason:
>
> To your comment here & some others (e.g. FREE_AND_NULL()): I was really
> trying to focus on narrowly addressing these -fanalyzer issues without
> digressing into the larger topics "what is this code *really* doing, and
> does it make sense?". It was pretty unavoidable in 13/15 though.
>
> Which isn't to say that I shouldn't fix some of it, e.g. your
> s/return/BUG()/ suggestion, but I think it's best to view these patches
> with an eye towards us already having these issues, and in most cases
> making -fanalyzer happy is a small cost.
>
> And by doing so and getting a "clean build" we'll be able to turn it on
> in CI, and thus notice when we run into new -fanalyzer issues.

Future analyzer reports are likely of the same quality as the current
ones.  If the goal is to shush them then we should just not use the
analyzer.  If reports contain a helpful signal, e.g. pointing to a real
bug or to overly complicated code, then we better address these issues.

We can think about automating the analyzer once we have a certain number
of commits with improvements that would not have been made without it.

René

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow()
  2022-06-03 18:37 ` [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow() Ævar Arnfjörð Bjarmason
@ 2022-06-04 12:24   ` René Scharfe
  2022-06-04 12:46   ` Phillip Wood
  1 sibling, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-04 12:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Change the strbuf_grow() function so that GCC v12's -fanalyze doesn't
> yell at us about sb->buf[0] dereferencing NULL, this also makes this
> code easier to follow.
>
> This BUG() should be unreachable since the state of our "sb->buf" and
> "sb->alloc" goes hand-in-hand, but -fanalyzer isn't smart enough to
> know that, and adding the BUG() also makes it clearer to human readers
> that that's what happens here.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  strbuf.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/strbuf.c b/strbuf.c
> index dd9eb85527a..61c4630aeeb 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -97,6 +97,8 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>  	if (new_buf)
>  		sb->buf = NULL;
>  	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> +	if (new_buf && !sb->buf)
> +		BUG("for a new buffer ALLOC_GROW() should always do work!");

new_buf is !sb->alloc.  ALLOC_GROW sets buf if sb->len + extra + 1 is
bigger than sb->alloc, which is always true because the variables in the
sum are unsigned, so we end up with at least 1 > 0.  That's easy enough
to see; I wonder why the compiler doesn't agree.  Am I missing
something?

Does setting the attribute returns_nonnull for xrealloc help?
Specifying it explicitly makes sense to me -- expecting the compiler to
infer it automatically across compilation units is probably a bit too
much to ask for, or at least needlessly expensive.

>  	if (new_buf)
>  		sb->buf[0] = '\0';
>  }

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow()
  2022-06-03 18:37 ` [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow() Ævar Arnfjörð Bjarmason
  2022-06-04 12:24   ` René Scharfe
@ 2022-06-04 12:46   ` Phillip Wood
  2022-06-04 16:21     ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 51+ messages in thread
From: Phillip Wood @ 2022-06-04 12:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau

Hi Ævar

[This is an old address that I only have webmail access to, please use phillip.wood@dunelm.org.uk when cc'ing me]

> On 03 June 2022 at 19:37 Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> 
> Change the strbuf_grow() function so that GCC v12's -fanalyze doesn't
> yell at us about sb->buf[0] dereferencing NULL, this also makes this
> code easier to follow.
> 
> This BUG() should be unreachable since the state of our "sb->buf" and
> "sb->alloc" goes hand-in-hand, but -fanalyzer isn't smart enough to
> know that, and adding the BUG() also makes it clearer to human readers
> that that's what happens here.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  strbuf.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/strbuf.c b/strbuf.c
> index dd9eb85527a..61c4630aeeb 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -97,6 +97,8 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>  	if (new_buf)
>  		sb->buf = NULL;
>  	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> +	if (new_buf && !sb->buf)
> +		BUG("for a new buffer ALLOC_GROW() should always do work!");

This is a bit ugly, have you tried adding
__attribute__((malloc (free), returns_nonnull))
to xmalloc() and xrealloc() ?

Best Wishes

Phillip

>  	if (new_buf)
>  		sb->buf[0] = '\0';
>  }
> -- 
> 2.36.1.1124.g577fa9c2ebd
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop
  2022-06-03 21:07   ` René Scharfe
  2022-06-03 21:28     ` Junio C Hamano
  2022-06-03 22:32     ` Glen Choo
@ 2022-06-04 12:51     ` Phillip Wood
  2022-06-04 16:20       ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 51+ messages in thread
From: Phillip Wood @ 2022-06-04 12:51 UTC (permalink / raw)
  To: René Scharfe, Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau


> On 03 June 2022 at 22:07 René Scharfe <l.s.r@web.de> wrote:
> 
> 
> Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> > Fix a bug in fd3cb0501e1 (remote: move static variables into
> > per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
> > after having NULL'd out remote->pushurl. itself.
> >
> > While we're at it let's get rid of the redundant braces per the
> > CodingGuidelines, which also serves to show in the diff context that
> > we were doing a FREE_AND_NULL(remote->pushurl) afterwards too, let's
> > keep that one.
> 
> The extended context is helping, but the brace removal makes this change
> harder to read.

Indeed, a small style fix in a larger change is one thing but here at least 80% of the changed lines are unrelated to the bugfix. I'm afraid my heart has started to sink when I see the phrase "while we're at it" in a commit message.

> [...]
> >  	FREE_AND_NULL(remote->pushurl);
> 
> Why set pushurl to NULL after release?  This results in an invalid state
> unless pushurl_nr und pushurl_alloc are reset to zero.  Same goes for
> the url array above -- either a simple free(3) call suffices or url_nr
> and url_alloc need to be cleared as well.

That's a good point

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 09/15] add-patch: assert parse_diff() expectations with BUG()
  2022-06-03 18:37 ` [RFC PATCH 09/15] add-patch: assert parse_diff() expectations with BUG() Ævar Arnfjörð Bjarmason
@ 2022-06-04 13:04   ` Phillip Wood
  0 siblings, 0 replies; 51+ messages in thread
From: Phillip Wood @ 2022-06-04 13:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau

Hi Ævar

> On 03 June 2022 at 19:37 Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> 
> Assert that this code added in [1], [2] and other related commits
> expects that once we see a "diff " line we should have a non-NULL
> "file_diff" and "hunk".
> 
> In practice this would have always been the case, as we are parsing
> our own "diff" output, but e.g. GCC v12's -fanalyzer doesn't know
> that, and will alert us that in the "else if" and below in this
> function we could be dereferencing NULL if we were processing anything
> except our expected input.

If we're only doing this to keep -fanalyzer quiet then would it be better to use the macro you introduce at the end of this series instead?

Best Wishes

Phillip

> 1. f6aa7ecc343 (built-in add -i: start implementing the `patch`
>    functionality in C, 2019-12-13)
> 2. 80399aec5ab (built-in add -p: support multi-file diffs, 2019-12-13)
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  add-patch.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/add-patch.c b/add-patch.c
> index 55d719f7845..087bf317b07 100644
> --- a/add-patch.c
> +++ b/add-patch.c
> @@ -478,11 +478,16 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
>  	while (p != pend) {
>  		char *eol = memchr(p, '\n', pend - p);
>  		const char *deleted = NULL, *mode_change = NULL;
> +		const char *const diff_l = "diff ";
> +		int is_diff_line = starts_with(p, diff_l);
>  
>  		if (!eol)
>  			eol = pend;
>  
> -		if (starts_with(p, "diff ")) {
> +		if (!is_diff_line && (!file_diff || !hunk))
> +			BUG("expected '%s' line to follow a '%s' line", p, diff_l);
> +
> +		if (is_diff_line) {
>  			complete_file(marker, hunk);
>  			ALLOC_GROW_BY(s->file_diff, s->file_diff_nr, 1,
>  				   file_diff_alloc);
> -- 
> 2.36.1.1124.g577fa9c2ebd
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 15/15] config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro
  2022-06-03 18:37 ` [RFC PATCH 15/15] config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro Ævar Arnfjörð Bjarmason
@ 2022-06-04 13:12   ` Phillip Wood
  0 siblings, 0 replies; 51+ messages in thread
From: Phillip Wood @ 2022-06-04 13:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, René Scharfe, Jinoh Kang, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


> On 03 June 2022 at 19:37 Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> 
> 
> Add an ASSERT_FOR_FANALYZER() macro to quiet those -fanalyzer issues

If the purpose of the macro is to silence -fanalyzer then it would perhaps be better to name it after that purpose rather than the way that it is implemented. SILENCE_FANALYZER_WARNING() or something like that.

> that haven't been diagnosed yet to either change the code involved, or
> to add an assert() or BUG() to it to placate GCC v12's -fanalyzer
> mode.
> 
> As in the preceding commit we could also use -Wno-error=* here, which
> was the initial implementation in config.mak.dev, i.e. something like:
> 
> 	## -Wno-error=analyzer-null-dereference
> 	$(eval $(call fn_disable_analyzer, \
> 		-Wno-error=analyzer-null-dereference, \
> 		dir \
> 		gpg-interface \
> 		graph \
> 		range-diff \
> 		utf8 \
> 	))
> 	## -Wno-error=analyzer-null-dereference: pre-gcc12
> 	ifeq ($(filter gcc12,$(COMPILER_FEATURES)),)
> 	$(eval $(call fn_disable_analyzer, \
> 		-Wno-error=analyzer-null-dereference, \
> 		merge \
> 		builtin/name-rev \
> 	))
> 	endif
> 
> But any such approach will unfortunately quiet *all* in the relevant
> files, including any new ones. Instead we want to quiet specific
> issues involved with ASSERT_FOR_FANALYZER().

I had a hard time understanding this, maybe

But any such approach will unfortunately miss any new warnings as it silences them for the whole file.

Best Wishes

Phillip

> 
> A drawback of this overall approach is that this only works under
> e.g. CFLAGS=-O0, but not CFLAGS=-O3. The compiler takes the liberty to
> re-arrange our code to get around these assert()s, and other new
> issues will also crop up. All of this is expected (-fanalyzer is
> subject to optimization passes, like most other options), but let's
> focus on -O0 for now.
> 
> Commentary on specific cases:
> 
>  * builtin/name-rev.c: Since 45a14f578e1 (Revert "name-rev: release
>    unused name strings", 2022-04-22) GCC v12's -fanalyzer has complained
>    about this code, per René's analysis in [1] it's incorrect to do
>    so. So let's add an ASSERT_FOR_FANALYZER() to placate it.
> 
>  * graph.c: In 0f0f389f120 (graph: tidy up display of left-skewed
>    merges, 2019-10-15) a previous "assert" was removed, and another
>    unconditional deference of the return value of
>    first_interesting_parent() was added without checking it. Since it
>    can return NULL let's add ASSERT_FOR_FANALYZER()'s here to note for
>    -fanalyzer that we won't be dereferencing these in practice.
> 
> 1. https://lore.kernel.org/git/dece627d-ccf8-2375-0c50-c59637e561d3@web.de/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/name-rev.c |  1 +
>  config.mak.dev     |  4 ++++
>  dir.c              |  1 +
>  git-compat-util.h  | 16 ++++++++++++++++
>  gpg-interface.c    |  2 ++
>  graph.c            |  2 ++
>  line-log.c         |  2 ++
>  unpack-trees.c     |  1 +
>  utf8.c             |  1 +
>  9 files changed, 30 insertions(+)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index 02ea9d16330..419159961b9 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -223,6 +223,7 @@ static void name_rev(struct commit *start_commit,
>  			if (commit_is_before_cutoff(parent))
>  				continue;
>  
> +			ASSERT_FOR_FANALYZER(name);
>  			if (parent_number > 1) {
>  				generation = 0;
>  				distance = name->distance + MERGE_TRAVERSAL_WEIGHT;
> diff --git a/config.mak.dev b/config.mak.dev
> index d6f5be92297..1d47fc04163 100644
> --- a/config.mak.dev
> +++ b/config.mak.dev
> @@ -83,6 +83,10 @@ endif
>  
>  DEVELOPER_CFLAGS += -fanalyzer
>  
> +ifeq ($(filter no-suppress-analyzer,$(DEVOPTS)),)
> +DEVELOPER_CFLAGS += -DSUPPRESS_FSANITIZER
> +endif
> +
>  ## -fanalyzer exists exists as of gcc10, but versions older than gcc12
>  ## have a lot of false positives.
>  ifeq ($(filter gcc12,$(COMPILER_FEATURES)),)
> diff --git a/dir.c b/dir.c
> index 3633c0852ee..c80b584b4a7 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -3781,6 +3781,7 @@ static void invalidate_one_directory(struct untracked_cache *uc,
>  				     struct untracked_cache_dir *ucd)
>  {
>  	uc->dir_invalidated++;
> +	ASSERT_FOR_FANALYZER(ucd);
>  	ucd->valid = 0;
>  	ucd->untracked_nr = 0;
>  }
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 96293b6c43a..a553884c048 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -984,6 +984,21 @@ static inline unsigned long cast_size_t_to_ulong(size_t a)
>  	return (unsigned long)a;
>  }
>  
> +/**
> + * Transitory helper macro to quiet currently known GCC -fsanitizer
> + * issues.
> + *
> + * We lie to it and say that we're confident that the given "expr" to
> + * ASSERT_FOR_FANALYZER() cannot be NULL (or 0). Those
> + * grandfathered-in issues should be fixed, but at least we're
> + * stemming the tide.
> + */
> +#ifdef SUPPRESS_FSANITIZER
> +#define ASSERT_FOR_FANALYZER(expr) assert((expr))
> +#else
> +#define ASSERT_FOR_FANALYZER(expr) do {} while (0)
> +#endif
> +
>  #ifdef HAVE_ALLOCA_H
>  # include <alloca.h>
>  # define xalloca(size)      (alloca(size))
> @@ -1037,6 +1052,7 @@ int xstrncmpz(const char *s, const char *t, size_t len);
>  	BUILD_ASSERT_OR_ZERO(sizeof(*(dst)) == sizeof(*(src))))
>  static inline void copy_array(void *dst, const void *src, size_t n, size_t size)
>  {
> +	ASSERT_FOR_FANALYZER(dst);
>  	if (n)
>  		memcpy(dst, src, st_mult(size, n));
>  }
> diff --git a/gpg-interface.c b/gpg-interface.c
> index 280f1fa1a58..9cba3b86e45 100644
> --- a/gpg-interface.c
> +++ b/gpg-interface.c
> @@ -242,6 +242,7 @@ static void parse_gpg_output(struct signature_check *sigc)
>  					next = strchrnul(line, ' ');
>  					replace_cstring(&sigc->key, line, next);
>  					/* Do we have signer information? */
> +					ASSERT_FOR_FANALYZER(next);
>  					if (*next && (sigcheck_gpg_status[i].flags & GPG_STATUS_UID)) {
>  						line = next + 1;
>  						next = strchrnul(line, '\n');
> @@ -283,6 +284,7 @@ static void parse_gpg_output(struct signature_check *sigc)
>  					 */
>  					limit = strchrnul(line, '\n');
>  					for (j = 9; j > 0; j--) {
> +						ASSERT_FOR_FANALYZER(next);
>  						if (!*next || limit <= next)
>  							break;
>  						line = next + 1;
> diff --git a/graph.c b/graph.c
> index 568b6e7cd41..39f7e95d4ab 100644
> --- a/graph.c
> +++ b/graph.c
> @@ -1105,6 +1105,7 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
>  			seen_this = 1;
>  
>  			for (j = 0; j < graph->num_parents; j++) {
> +				ASSERT_FOR_FANALYZER(parents);
>  				par_column = graph_find_new_column_by_commit(graph, parents->item);
>  				assert(par_column >= 0);
>  
> @@ -1138,6 +1139,7 @@ static void graph_output_post_merge_line(struct git_graph *graph, struct graph_l
>  			}
>  		}
>  
> +		ASSERT_FOR_FANALYZER(first_parent);
>  		if (col_commit == first_parent->item)
>  			parent_col = col;
>  	}
> diff --git a/line-log.c b/line-log.c
> index 51d93310a4d..1295f46deaf 100644
> --- a/line-log.c
> +++ b/line-log.c
> @@ -154,6 +154,7 @@ static void range_set_union(struct range_set *out,
>  	while (i < a->nr || j < b->nr) {
>  		struct range *new_range;
>  		if (i < a->nr && j < b->nr) {
> +			ASSERT_FOR_FANALYZER(rb);
>  			if (ra[i].start < rb[j].start)
>  				new_range = &ra[i++];
>  			else if (ra[i].start > rb[j].start)
> @@ -166,6 +167,7 @@ static void range_set_union(struct range_set *out,
>  			new_range = &ra[i++];
>  		else                       /* a exhausted */
>  			new_range = &rb[j++];
> +		ASSERT_FOR_FANALYZER(new_range);
>  		if (new_range->start == new_range->end)
>  			; /* empty range */
>  		else if (!out->nr || out->ranges[out->nr-1].end < new_range->start) {
> diff --git a/unpack-trees.c b/unpack-trees.c
> index a1d0ff3a4d3..b6e10b05a00 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -2195,6 +2195,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
>  	char *pathbuf;
>  	int cnt = 0;
>  
> +	ASSERT_FOR_FANALYZER(ce);
>  	if (S_ISGITLINK(ce->ce_mode)) {
>  		struct object_id oid;
>  		int sub_head = resolve_gitlink_ref(ce->name, "HEAD", &oid);
> diff --git a/utf8.c b/utf8.c
> index de4ce5c0e68..3ffc0a97f3b 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -130,6 +130,7 @@ static ucs_char_t pick_one_utf8_char(const char **start, size_t *remainder_p)
>  	 */
>  	remainder = (remainder_p ? *remainder_p : 999);
>  
> +	ASSERT_FOR_FANALYZER(remainder < 1 || s);
>  	if (remainder < 1) {
>  		goto invalid;
>  	} else if (*s < 0x80) {
> -- 
> 2.36.1.1124.g577fa9c2ebd
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop
  2022-06-04 12:51     ` Phillip Wood
@ 2022-06-04 16:20       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-04 16:20 UTC (permalink / raw)
  To: Phillip Wood
  Cc: René Scharfe, git, Junio C Hamano, Jinoh Kang, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


On Sat, Jun 04 2022, Phillip Wood wrote:

>> On 03 June 2022 at 22:07 René Scharfe <l.s.r@web.de> wrote:
>> 
>> 
>> Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
>> > Fix a bug in fd3cb0501e1 (remote: move static variables into
>> > per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
>> > after having NULL'd out remote->pushurl. itself.
>> >
>> > While we're at it let's get rid of the redundant braces per the
>> > CodingGuidelines, which also serves to show in the diff context that
>> > we were doing a FREE_AND_NULL(remote->pushurl) afterwards too, let's
>> > keep that one.
>> 
>> The extended context is helping, but the brace removal makes this change
>> harder to read.
>
> Indeed, a small style fix in a larger change is one thing but here at
> least 80% of the changed lines are unrelated to the bugfix. I'm afraid
> my heart has started to sink when I see the phrase "while we're at it"
> in a commit message.

I'm happy to omit it, I thought it was helpful to force the context to
be shown, will change that.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow()
  2022-06-04 12:46   ` Phillip Wood
@ 2022-06-04 16:21     ` Ævar Arnfjörð Bjarmason
  2022-06-04 20:37       ` René Scharfe
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-04 16:21 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Junio C Hamano, René Scharfe, Jinoh Kang, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


On Sat, Jun 04 2022, Phillip Wood wrote:

> Hi Ævar
>
> [This is an old address that I only have webmail access to, please use
> phillip.wood@dunelm.org.uk when cc'ing me]

Hrm, I just grabbed it from an old commit of yours I found. Consider
sending in a patch to update .mailmap :)

>> On 03 June 2022 at 19:37 Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> 
>> 
>> Change the strbuf_grow() function so that GCC v12's -fanalyze doesn't
>> yell at us about sb->buf[0] dereferencing NULL, this also makes this
>> code easier to follow.
>> 
>> This BUG() should be unreachable since the state of our "sb->buf" and
>> "sb->alloc" goes hand-in-hand, but -fanalyzer isn't smart enough to
>> know that, and adding the BUG() also makes it clearer to human readers
>> that that's what happens here.
>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  strbuf.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/strbuf.c b/strbuf.c
>> index dd9eb85527a..61c4630aeeb 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -97,6 +97,8 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>>  	if (new_buf)
>>  		sb->buf = NULL;
>>  	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>> +	if (new_buf && !sb->buf)
>> +		BUG("for a new buffer ALLOC_GROW() should always do work!");
>
> This is a bit ugly, have you tried adding
> __attribute__((malloc (free), returns_nonnull))
> to xmalloc() and xrealloc() ?
>

Will try to experiment with that, perhaps GCC can be massaged to grok
this somehow.

I do vaguely remember (but couldn't track down where it was) that we
have some config for coverity for this function, due to it also having
trouble with it.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-04 12:24       ` René Scharfe
@ 2022-06-04 16:23         ` Ævar Arnfjörð Bjarmason
  2022-06-04 20:31           ` René Scharfe
  2022-06-06 16:53           ` Junio C Hamano
  0 siblings, 2 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-04 16:23 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


On Sat, Jun 04 2022, René Scharfe wrote:

> Am 04.06.22 um 02:54 schrieb Ævar Arnfjörð Bjarmason:
>>
>> To your comment here & some others (e.g. FREE_AND_NULL()): I was really
>> trying to focus on narrowly addressing these -fanalyzer issues without
>> digressing into the larger topics "what is this code *really* doing, and
>> does it make sense?". It was pretty unavoidable in 13/15 though.
>>
>> Which isn't to say that I shouldn't fix some of it, e.g. your
>> s/return/BUG()/ suggestion, but I think it's best to view these patches
>> with an eye towards us already having these issues, and in most cases
>> making -fanalyzer happy is a small cost.
>>
>> And by doing so and getting a "clean build" we'll be able to turn it on
>> in CI, and thus notice when we run into new -fanalyzer issues.
>
> Future analyzer reports are likely of the same quality as the current
> ones.  If the goal is to shush them then we should just not use the
> analyzer.  If reports contain a helpful signal, e.g. pointing to a real
> bug or to overly complicated code, then we better address these issues.
>
> We can think about automating the analyzer once we have a certain number
> of commits with improvements that would not have been made without it.

We might decide not to go with -fanalyzer in CI or whatever, but I
really think that your line of reasoning here is just the wrong way to
evaluate the cost/benefit of -fanalyzer, a new warning or whatever.

There's ~15 commits in this series addressing things -fanalyzer brought
up, and it would be ~20 if the remaining issues I punted on were
addressed.

The question shouldn't be whether those things in particular were worth
the effort, but whether the added safety of getting the new diagnostic
going forward is worth the one-time cost.

Some of these commits are fixing issues going back to 2007-ish, $(git
log --no-merges --oneline -- '*.[ch]' | wc -l) is ~25k lines. And
looking at it like that 20/25K isn't that bad of a ratio :)

FWIW I spotted a couple of bugs in my own unsubmitted code from running
all of it through -fanalyzer, and that POV is also worth thinking about,
i.e. it's not just about improving git's current code, or even commits
that might land in git.git in the future.

But also to provide a development aid so that when we're writing patches
we spot issues earlier, even if they're ones we might spot before we
send the patch, or in review before it gets applied.

It's also a much faster way of spotting certain issues, if you take into
account that we've already been spotting some of these with the likes of
SANITIZE=address, valgrind runs, or coverity.

I find the warning output from -fanalyzer to be *really useful*. It's
scarily verbose at first, but it's basically doing most of the work for
you in terms of exhaustively describing how the control flow got to a
given location. With e.g. SANITIZE=address and valgrind (to the extent
that they overlap) you might get a stacktrace or two, but you generally
have to chase all that down yourself.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 06/15] ref-filter.c: BUG() out on show_ref() with NULL refname
  2022-06-03 18:37 ` [RFC PATCH 06/15] ref-filter.c: BUG() out on show_ref() with NULL refname Ævar Arnfjörð Bjarmason
@ 2022-06-04 18:07   ` René Scharfe
  0 siblings, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-04 18:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Adjust code originally added in 5339bdad96a (ref-filter: introduce
> remote_ref_atom_parser(), 2016-02-17) to BUG() out rather than
> potentially segfault if we get a NULL refname here.
>
> As noted by GCC v12's -fanalyzer that will happen if this follows the
> "refname = NULL" branch added in cc72385fe35 (for-each-ref: let
> upstream/push optionally report the remote name, 2017-10-05).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  ref-filter.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 2413f889f48..91aa8e89268 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1644,7 +1644,9 @@ static void fill_remote_ref_details(struct used_atom *atom, const char *refname,
>  				    struct branch *branch, const char **s)
>  {
>  	int num_ours, num_theirs;
> -	if (atom->u.remote_ref.option == RR_REF)
> +	if (atom->u.remote_ref.option == RR_REF && !refname)
> +		BUG("must get refname with [...]remote_ref.option == RR_REF");
> +	else if (atom->u.remote_ref.option == RR_REF)

The related code not shown here looks tricky.  IIUC refname cannot be
NULL if option is RR_REF, because remote_ref_atom_parser() only sets
.push_remote for other option values and only if that's set we can get a
NULL refname passed to fill_remote_ref_details().  Right?

Do we need to have two different ways to indicate remote-ness?  Having
just one like after the patch below should reduce the risk of confusion.

>  		*s = show_ref(&atom->u.remote_ref.refname, refname);
>  	else if (atom->u.remote_ref.option == RR_TRACK) {
>  		if (stat_tracking_info(branch, &num_ours, &num_theirs,


diff --git a/ref-filter.c b/ref-filter.c
index 2413f889f4..260257b92c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -183,7 +183,7 @@ static struct used_atom {
 				RR_REF, RR_TRACK, RR_TRACKSHORT, RR_REMOTE_NAME, RR_REMOTE_REF
 			} option;
 			struct refname_atom refname;
-			unsigned int nobracket : 1, push : 1, push_remote : 1;
+			unsigned int nobracket : 1, push : 1;
 		} remote_ref;
 		struct {
 			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, C_LINES,
@@ -295,10 +295,8 @@ static int remote_ref_atom_parser(struct ref_format *format, struct used_atom *a
 			atom->u.remote_ref.nobracket = 1;
 		else if (!strcmp(s, "remotename")) {
 			atom->u.remote_ref.option = RR_REMOTE_NAME;
-			atom->u.remote_ref.push_remote = 1;
 		} else if (!strcmp(s, "remoteref")) {
 			atom->u.remote_ref.option = RR_REMOTE_REF;
-			atom->u.remote_ref.push_remote = 1;
 		} else {
 			atom->u.remote_ref.option = RR_REF;
 			if (refname_atom_parser_internal(&atom->u.remote_ref.refname,
@@ -1891,7 +1889,8 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				continue;
 			branch = branch_get(branch_name);

-			if (atom->u.remote_ref.push_remote)
+			if (atom->u.remote_ref.option == RR_REMOTE_NAME ||
+			    atom->u.remote_ref.option == RR_REMOTE_REF)
 				refname = NULL;
 			else {
 				refname = branch_get_push(branch, NULL);

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-04 16:23         ` Ævar Arnfjörð Bjarmason
@ 2022-06-04 20:31           ` René Scharfe
  2022-06-06 16:53           ` Junio C Hamano
  1 sibling, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-04 20:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau

Am 04.06.22 um 18:23 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 04 2022, René Scharfe wrote:
>
>> Am 04.06.22 um 02:54 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> To your comment here & some others (e.g. FREE_AND_NULL()): I was really
>>> trying to focus on narrowly addressing these -fanalyzer issues without
>>> digressing into the larger topics "what is this code *really* doing, and
>>> does it make sense?". It was pretty unavoidable in 13/15 though.
>>>
>>> Which isn't to say that I shouldn't fix some of it, e.g. your
>>> s/return/BUG()/ suggestion, but I think it's best to view these patches
>>> with an eye towards us already having these issues, and in most cases
>>> making -fanalyzer happy is a small cost.
>>>
>>> And by doing so and getting a "clean build" we'll be able to turn it on
>>> in CI, and thus notice when we run into new -fanalyzer issues.
>>
>> Future analyzer reports are likely of the same quality as the current
>> ones.  If the goal is to shush them then we should just not use the
>> analyzer.  If reports contain a helpful signal, e.g. pointing to a real
>> bug or to overly complicated code, then we better address these issues.
>>
>> We can think about automating the analyzer once we have a certain number
>> of commits with improvements that would not have been made without it.
>
> We might decide not to go with -fanalyzer in CI or whatever, but I
> really think that your line of reasoning here is just the wrong way to
> evaluate the cost/benefit of -fanalyzer, a new warning or whatever.
>
> There's ~15 commits in this series addressing things -fanalyzer brought
> up, and it would be ~20 if the remaining issues I punted on were
> addressed.
>
> The question shouldn't be whether those things in particular were worth
> the effort, but whether the added safety of getting the new diagnostic
> going forward is worth the one-time cost.

Is it pointing out real issues?  Yes, the ones from patches 1 and 2 look
legit.  Let's get them fixed and attribute -fanalyzer for finding them!
They are good selling points.

Is it giving us false positives?  Patches 4, 5 and 6 look like that to
me.  Perhaps we can still use this opportunity to simplify the code for
both human and machine, or at least add helpful asserts.

Patch 3 is a bit surprising to me.  It's a real issue, but the whole
reftable code never bothers to check for allocation errors, so I would
have expected the analyzer to report lots more places.

Is it worth automating the thing?  Perhaps, but we better work through
the reports we already have first before getting new ones.

> FWIW I spotted a couple of bugs in my own unsubmitted code from running
> all of it through -fanalyzer, and that POV is also worth thinking about,
> i.e. it's not just about improving git's current code, or even commits
> that might land in git.git in the future.
>
> But also to provide a development aid so that when we're writing patches
> we spot issues earlier, even if they're ones we might spot before we
> send the patch, or in review before it gets applied.
>
> It's also a much faster way of spotting certain issues, if you take into
> account that we've already been spotting some of these with the likes of
> SANITIZE=address, valgrind runs, or coverity.
>
> I find the warning output from -fanalyzer to be *really useful*. It's
> scarily verbose at first, but it's basically doing most of the work for
> you in terms of exhaustively describing how the control flow got to a
> given location. With e.g. SANITIZE=address and valgrind (to the extent
> that they overlap) you might get a stacktrace or two, but you generally
> have to chase all that down yourself.

Sounds good, makes me want to try it.  And I actually can, to a certain
degree: GCC 13 seems to get confused when compiling compat/ code on my
machine (macOS).  Not sure what's going on.  Compiling e.g. strbuf.c
works, though, and gives me the same false positive. :-|

René

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow()
  2022-06-04 16:21     ` Ævar Arnfjörð Bjarmason
@ 2022-06-04 20:37       ` René Scharfe
  2022-06-05 10:20         ` Phillip Wood
  0 siblings, 1 reply; 51+ messages in thread
From: René Scharfe @ 2022-06-04 20:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Phillip Wood
  Cc: git, Junio C Hamano, Jinoh Kang, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 04.06.22 um 18:21 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 04 2022, Phillip Wood wrote:
>
>> Hi Ævar
>>
>> [This is an old address that I only have webmail access to, please use
>> phillip.wood@dunelm.org.uk when cc'ing me]
>
> Hrm, I just grabbed it from an old commit of yours I found. Consider
> sending in a patch to update .mailmap :)
>
>>> On 03 June 2022 at 19:37 Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>>
>>>
>>> Change the strbuf_grow() function so that GCC v12's -fanalyze doesn't
>>> yell at us about sb->buf[0] dereferencing NULL, this also makes this
>>> code easier to follow.
>>>
>>> This BUG() should be unreachable since the state of our "sb->buf" and
>>> "sb->alloc" goes hand-in-hand, but -fanalyzer isn't smart enough to
>>> know that, and adding the BUG() also makes it clearer to human readers
>>> that that's what happens here.
>>>
>>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>> ---
>>>  strbuf.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/strbuf.c b/strbuf.c
>>> index dd9eb85527a..61c4630aeeb 100644
>>> --- a/strbuf.c
>>> +++ b/strbuf.c
>>> @@ -97,6 +97,8 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>>>  	if (new_buf)
>>>  		sb->buf = NULL;
>>>  	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>>> +	if (new_buf && !sb->buf)
>>> +		BUG("for a new buffer ALLOC_GROW() should always do work!");
>>
>> This is a bit ugly, have you tried adding
>> __attribute__((malloc (free), returns_nonnull))
>> to xmalloc() and xrealloc() ?
>>
>
> Will try to experiment with that, perhaps GCC can be massaged to grok
> this somehow.
>
> I do vaguely remember (but couldn't track down where it was) that we
> have some config for coverity for this function, due to it also having
> trouble with it.

Good idea, but this attribute doesn't seem to help here.

The following helps, but I don't know why it would be needed -- if alloc
is 0 then any strbuf_grow() call will give a nr of at least 1 (for the
NUL character):


diff --git a/cache.h b/cache.h
index 595582becc..fe10789959 100644
--- a/cache.h
+++ b/cache.h
@@ -707,7 +707,7 @@ int daemonize(void);
  */
 #define ALLOC_GROW(x, nr, alloc) \
 	do { \
-		if ((nr) > alloc) { \
+		if (!alloc || (nr) > alloc) { \
 			if (alloc_nr(alloc) < (nr)) \
 				alloc = (nr); \
 			else \

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 08/15] strbuf.c: use st_add3(), not unsigned_add_overflows()
  2022-06-03 18:37 ` [RFC PATCH 08/15] strbuf.c: use st_add3(), not unsigned_add_overflows() Ævar Arnfjörð Bjarmason
@ 2022-06-04 21:27   ` René Scharfe
  0 siblings, 0 replies; 51+ messages in thread
From: René Scharfe @ 2022-06-04 21:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jinoh Kang, Phillip Wood, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

Am 03.06.22 um 20:37 schrieb Ævar Arnfjörð Bjarmason:
> Change the strbuf_grow() function to use st_add3() instead of doing
> its own unsigned_add_overflows() checks.  The overflow checking here
> was originally added in b449f4cfc97 (Rework strbuf API and semantics.,
> 2007-09-06) and adjusted in 1368f65002b (compat: helper for detecting
> unsigned overflow, 2010-10-10). Instead we compute a "sz" with
> st_add3().

Good idea.

> That was done at a time when the underlying xrealloc() in
> git-compat-util.h didn't use st_mult() yet, that has been the case
> since the later e7792a74bcf (harden REALLOC_ARRAY and xcalloc against
> size_t overflow, 2016-02-22).

How is that relevant?

> The only behavior change here should be the very obscure edge case
> that we'd previously die() in cases where we strictly didn't need to,
> as we'd check both "extra + 1" and "sb->len + extra + 1" for
> overflow. If we overflowed only on the latter but were doing the
> former we'd needlessly die() die.

st_add3 does the same checks.  What do you mean with "doing the former"?

> I don't think that difference
> mattered, but it's noted here for completeness.
>
> While we're at it add an inline comment about why we're adding 1 to
> the values, that's also explained in the API documentation in
> strbuf.h, but since we're using that magic constant here...
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  strbuf.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index 61c4630aeeb..f0a74d2bfb1 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -91,12 +91,12 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc)
>  void strbuf_grow(struct strbuf *sb, size_t extra)
>  {
>  	int new_buf = !sb->alloc;
> -	if (unsigned_add_overflows(extra, 1) ||
> -	    unsigned_add_overflows(sb->len, extra + 1))
> -		die("you want to use way too much memory");
> +	const size_t sz_buf = new_buf ? 0 : sb->len;

This is a change in behavior.  The one you meant above?  It ignores the
len value if alloc is zero.  Any len value other than zero is invalid in
that case.  Adding code to paper over this invalid state seems a waste.
And if a brave caller fiddled with len before calling strbuf_grow(), the
resulting allocation will now be shorter.  Hopefully nobody does that,
but it's not totally impossible.

We could report the issue instead with something like the following:

	if (new_buf && sb->len)
		BUG("unallocated strbuf has invalid len: %"PRIuMAX, (uintmax_t)sb->len);

I'd rather keep the old behavior.  But whatever we decide to do, I think
the st_add3 change deserves its own commit with no further change in
behavior.

> +	const size_t sz = st_add3(sz_buf, extra, 1 /* for \0 */);

I don't mind the added comment.

> +
>  	if (new_buf)
>  		sb->buf = NULL;
> -	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> +	ALLOC_GROW(sb->buf, sz, sb->alloc);
>  	if (new_buf && !sb->buf)
>  		BUG("for a new buffer ALLOC_GROW() should always do work!");
>  	if (new_buf)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow()
  2022-06-04 20:37       ` René Scharfe
@ 2022-06-05 10:20         ` Phillip Wood
  0 siblings, 0 replies; 51+ messages in thread
From: Phillip Wood @ 2022-06-05 10:20 UTC (permalink / raw)
  To: René Scharfe, Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jinoh Kang, Glen Choo, Paul Tan,
	Han-Wen Nienhuys, Karthik Nayak, Jeff Smith, Taylor Blau

On 04/06/2022 21:37, René Scharfe wrote:
> Am 04.06.22 um 18:21 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sat, Jun 04 2022, Phillip Wood wrote:
>>
>>> Hi Ævar
>>>
>>> [This is an old address that I only have webmail access to, please use
>>> phillip.wood@dunelm.org.uk when cc'ing me]
>>
>> Hrm, I just grabbed it from an old commit of yours I found. Consider
>> sending in a patch to update .mailmap :)
>>
>>>> On 03 June 2022 at 19:37 Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>>>
>>>>
>>>> Change the strbuf_grow() function so that GCC v12's -fanalyze doesn't
>>>> yell at us about sb->buf[0] dereferencing NULL, this also makes this
>>>> code easier to follow.
>>>>
>>>> This BUG() should be unreachable since the state of our "sb->buf" and
>>>> "sb->alloc" goes hand-in-hand, but -fanalyzer isn't smart enough to
>>>> know that, and adding the BUG() also makes it clearer to human readers
>>>> that that's what happens here.
>>>>
>>>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>>> ---
>>>>   strbuf.c | 2 ++
>>>>   1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/strbuf.c b/strbuf.c
>>>> index dd9eb85527a..61c4630aeeb 100644
>>>> --- a/strbuf.c
>>>> +++ b/strbuf.c
>>>> @@ -97,6 +97,8 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>>>>   	if (new_buf)
>>>>   		sb->buf = NULL;
>>>>   	ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>>>> +	if (new_buf && !sb->buf)
>>>> +		BUG("for a new buffer ALLOC_GROW() should always do work!");
>>>
>>> This is a bit ugly, have you tried adding
>>> __attribute__((malloc (free), returns_nonnull))
>>> to xmalloc() and xrealloc() ?
>>>
>>
>> Will try to experiment with that, perhaps GCC can be massaged to grok
>> this somehow.
>>
>> I do vaguely remember (but couldn't track down where it was) that we
>> have some config for coverity for this function, due to it also having
>> trouble with it.
> 
> Good idea, but this attribute doesn't seem to help here.

Indeed, I was tricked by the misleading BUG message, the analyzer is not 
complaining that xrealloc() may return NULL, it is taking a path where 
it does not reallocate the buffer. I've copied the full output at the 
end of the message if anyone wants to see the whole thing but the 
relevant part is

            |cache.h:727:20:
            |  727 |                 if ((nr) > alloc) { \
            |      |                    ^
            |      |                    |
            |      |                    (10) following ‘false’ branch...
strbuf.c:99:9: note: in expansion of macro ‘ALLOC_GROW’
            |   99 |         ALLOC_GROW(sb->buf, sb->len + extra + 1, 
sb->alloc);
            |      |         ^~~~~~~~~~

> The following helps, but I don't know why it would be needed -- if alloc
> is 0 then any strbuf_grow() call will give a nr of at least 1 (for the
> NUL character):

Yes it seems to be ignoring the result of our overflow checks and 
assuming that sb->len + extra + 1 can overflow to zero. Having seen the 
number of false positives I'm inclined to think we should take the bug 
fixes and punt the rest of these patches. Maybe after a couple more gcc 
releases the false positive rate will be more manageable. (I tired 
running gcc 11 with -fanalyzer a few months ago and gave up looking at 
the output before I found a single real bug so it has certainly improved 
in gcc 12)

Best Wishes

Phillip

strbuf.c: In function ‘strbuf_grow’:
strbuf.c:101:28: warning: dereference of NULL ‘0’ [CWE-476] 
[-Wanalyzer-null-dereference]
   101 |                 sb->buf[0] = '\0';
       |                 ~~~~~~~~~~~^~~~~~
   ‘strbuf_normalize_path’: events 1-2
     |
     | 1156 | int strbuf_normalize_path(struct strbuf *src)
     |      |     ^~~~~~~~~~~~~~~~~~~~~
     |      |     |
     |      |     (1) entry to ‘strbuf_normalize_path’
     |......
     | 1160 |         strbuf_grow(&dst, src->len);
     |      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~
     |      |         |
     |      |         (2) calling ‘strbuf_grow’ from ‘strbuf_normalize_path’
     |
     +--> ‘strbuf_grow’: events 3-4
            |
            |   91 | void strbuf_grow(struct strbuf *sb, size_t extra)
            |      |      ^~~~~~~~~~~
            |      |      |
            |      |      (3) entry to ‘strbuf_grow’
            |......
            |   94 |         if (unsigned_add_overflows(extra, 1) ||
            |      |            ~
            |      |            |
            |      |            (4) following ‘false’ branch (when 
‘extra != 18446744073709551615’)...
            |
          ‘strbuf_grow’: event 5
            |
            |   95 |             unsigned_add_overflows(sb->len, extra + 1))
git-compat-util.h:128:7: note: in definition of macro 
‘unsigned_add_overflows’
            |  128 |     ((b) > maximum_unsigned_value_of_type(a) - (a))
            |      |       ^
            |
          ‘strbuf_grow’: events 6-9
            |
            |strbuf.c:94:46:
            |   94 |         if (unsigned_add_overflows(extra, 1) ||
            |......
            |   97 |         if (new_buf)
            |      |         ~~ ~
            |      |         |  |
            |      |         |  (8) following ‘true’ branch...
            |      |         (7) ...to here
            |   98 |                 sb->buf = NULL;
            |      |                 ~~
            |      |                 |
            |      |                 (9) ...to here
            |
          ‘strbuf_grow’: event 10
            |
            |cache.h:727:20:
            |  727 |                 if ((nr) > alloc) { \
            |      |                    ^
            |      |                    |
            |      |                    (10) following ‘false’ branch...
strbuf.c:99:9: note: in expansion of macro ‘ALLOC_GROW’
            |   99 |         ALLOC_GROW(sb->buf, sb->len + extra + 1, 
sb->alloc);
            |      |         ^~~~~~~~~~
            |
          ‘strbuf_grow’: event 11
            |
            |cache.h:726:12:
            |  726 |         do { \
            |      |            ^
            |      |            |
            |      |            (11) ...to here
strbuf.c:99:9: note: in expansion of macro ‘ALLOC_GROW’
            |   99 |         ALLOC_GROW(sb->buf, sb->len + extra + 1, 
sb->alloc);
            |      |         ^~~~~~~~~~
            |
          ‘strbuf_grow’: events 12-15
            |
            |  100 |         if (new_buf)
            |      |            ^
            |      |            |
            |      |            (12) following ‘true’ branch...
            |  101 |                 sb->buf[0] = '\0';
            |      |                 ~~~~~~~~~~~~~~~~~
            |      |                 | |        |
            |      |                 | |        (15) dereference of NULL 
‘*sb.buf’
            |      |                 | (14) ‘dst.buf’ is NULL
            |      |                 (13) ...to here


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-04 16:23         ` Ævar Arnfjörð Bjarmason
  2022-06-04 20:31           ` René Scharfe
@ 2022-06-06 16:53           ` Junio C Hamano
  2022-06-06 17:38             ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2022-06-06 16:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: René Scharfe, git, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> Future analyzer reports are likely of the same quality as the current
>> ones.  If the goal is to shush them then we should just not use the
>> analyzer.  If reports contain a helpful signal, e.g. pointing to a real
>> bug or to overly complicated code, then we better address these issues.
>>
>> We can think about automating the analyzer once we have a certain number
>> of commits with improvements that would not have been made without it.
>
> We might decide not to go with -fanalyzer in CI or whatever, but I
> really think that your line of reasoning here is just the wrong way to
> evaluate the cost/benefit of -fanalyzer, a new warning or whatever.

To me, your priorities look much less in line with this project's
than what Rt Réne gave us above.

> The question shouldn't be whether those things in particular were worth
> the effort, but whether the added safety of getting the new diagnostic
> going forward is worth the one-time cost.

Workarounds for false positives are hardly one-time cost.  They have
to stay with us until the -fanalyzer gets corrected, somebody needs
to remember to occasionally check if that happened, and revert the
workaround to bring the code into their more natural form.  What has
been good and readable for human programmers for a long time does
not need to be touched just to work around a false positive bug in a
new tool.

> I find the warning output from -fanalyzer to be *really useful*.

I do not mind if you add -fanalyzer during your build via your own
config.mak file, and you would help them improve the analyzer by
reporting false positive bugs while finding possible bugs in the
code we have (like you did in a few patches in this series) and the
code you are writing.  You are capable enough to keep your own set
of patches to work around their false positive bugs locally.

But if you have to send in 15 patches with more workaround changes
than real fix, then it is premature for us to bear the cost to have
workaround for the tool.

There are folks who use our codebase as a suitably-sized guinea pig
to improve their tool, and we should not make it harder for them to
do so, but our priority is to improve the product of this project.

Come to think of it, adding unnecessary workarounds is a hostile act
to those who are trying to improve -fanalyzer, I guess, too.  They
may want to fix problems in their tool, but workarounds hide them.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-06 16:53           ` Junio C Hamano
@ 2022-06-06 17:38             ` Ævar Arnfjörð Bjarmason
  2022-06-06 17:44               ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-06 17:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: René Scharfe, git, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


On Mon, Jun 06 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> [...]
>> The question shouldn't be whether those things in particular were worth
>> the effort, but whether the added safety of getting the new diagnostic
>> going forward is worth the one-time cost.
>
> Workarounds for false positives are hardly one-time cost.  They have
> to stay with us until the -fanalyzer gets corrected, somebody needs
> to remember to occasionally check if that happened, and revert the
> workaround to bring the code into their more natural form.  What has
> been good and readable for human programmers for a long time does
> not need to be touched just to work around a false positive bug in a
> new tool.

Yes, but I think in this case most of these are either 100% legitimate
issues, or things where we'd like to e.g. add a BUG() assertion anyway,
e.g. in the diff parsing case.

>> I find the warning output from -fanalyzer to be *really useful*.
>
> I do not mind if you add -fanalyzer during your build via your own
> config.mak file, and you would help them improve the analyzer by
> reporting false positive bugs while finding possible bugs in the
> code we have (like you did in a few patches in this series) and the
> code you are writing.  You are capable enough to keep your own set
> of patches to work around their false positive bugs locally.

Of course.

> But if you have to send in 15 patches with more workaround changes
> than real fix, then it is premature for us to bear the cost to have
> workaround for the tool.

The idea here was to send an RFC showing what it would take to get it
into a state where it would be more useful to others.

I.e. I've found it useful to run with it in my own builds and see if
anything crops up that's not on the whitelist, I was aiming to give that
to others with an easily tweakable knob.

> There are folks who use our codebase as a suitably-sized guinea pig
> to improve their tool, and we should not make it harder for them to
> do so, but our priority is to improve the product of this project.

Those people can still use CFLAGS=-fanalyzer

> Come to think of it, adding unnecessary workarounds is a hostile act
> to those who are trying to improve -fanalyzer, I guess, too.  They
> may want to fix problems in their tool, but workarounds hide them.

I'm not proposing that we do anything we wouldn't otherwise do to
appease -fanalyzer, but that we:

 1. Fix things it alerts on because we'd find it worthwhile anyway
 2. For the rest, have a macro to appease it.

That macro being similar to e.g. UNLEAK() in that we'd opt-in enable it
if you enabled certain flags, but otherwise we'd ignore it.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-06 17:38             ` Ævar Arnfjörð Bjarmason
@ 2022-06-06 17:44               ` Junio C Hamano
  2022-06-06 17:46                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 51+ messages in thread
From: Junio C Hamano @ 2022-06-06 17:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: René Scharfe, git, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> Yes, but I think in this case most of these are either 100% legitimate
> issues, or things where we'd like to e.g. add a BUG() assertion anyway,
> e.g. in the diff parsing case.

Where does that "BUG() assertion anyway" comes from?  What are we
protecting against?

I do not think such a BUG() added only for placating -fanalyzer
falls into "legitimate" category at all.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()
  2022-06-06 17:44               ` Junio C Hamano
@ 2022-06-06 17:46                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-06 17:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: René Scharfe, git, Jinoh Kang, Phillip Wood, Glen Choo,
	Paul Tan, Han-Wen Nienhuys, Karthik Nayak, Jeff Smith,
	Taylor Blau


On Mon, Jun 06 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> Yes, but I think in this case most of these are either 100% legitimate
>> issues, or things where we'd like to e.g. add a BUG() assertion anyway,
>> e.g. in the diff parsing case.
>
> Where does that "BUG() assertion anyway" comes from?  What are we
> protecting against?
>
> I do not think such a BUG() added only for placating -fanalyzer
> falls into "legitimate" category at all.

No, but e.g. in 09/15[1] we *would* have a segfault or NULL dereference if
that function was fed input that it wasn't expecting, which is output
from another git command (or the same output created in-core).

In that case we can reason about it being impossible in the current
state, but annotating that with the equivalent of a BUG("unreachable")
seems prudent, i.e. it helps human readers to understand the code,
whatever -fanalyzer says.

Ditto the pack-related code in [2], where it's noting an aspect I'd
noted in an earlier code review (before I'd ever used
-fanalyzer). I.e. the combination of certain flags could land us in a
NULL dereference, even if we can carefully reason that we won't combine
the two having the code be less easily misunderstood seems better.

1. https://lore.kernel.org/git/RFC-patch-09.15-de0f7722608-20220603T183608Z-avarab@gmail.com/
2. https://lore.kernel.org/git/RFC-patch-13.15-63eeb66185a-20220603T183608Z-avarab@gmail.com/

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue
  2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
                   ` (14 preceding siblings ...)
  2022-06-03 18:37 ` [RFC PATCH 15/15] config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro Ævar Arnfjörð Bjarmason
@ 2022-06-07 15:50 ` Ævar Arnfjörð Bjarmason
  2022-06-07 15:50   ` [PATCH 1/3] remote.c: remove braces from one-statement "for"-loops Ævar Arnfjörð Bjarmason
                     ` (3 more replies)
  15 siblings, 4 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-07 15:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Glen Choo, Phillip Wood,
	Ævar Arnfjörð Bjarmason

This series is spun off from the recent RFC -fanalyzer series[1] and
fixes a clear bug in remote_state_clear().

While doing so remove the underlying source of the landmines in the
API and use free() rather than FREE_AND_NULL() for structs that aren't
being re-used.

Then have the API free() the structure itself, rather than have
*_new() allocate it, but the caller being responsible for calling both
remote_state_clear() and free().

1. https://lore.kernel.org/git/RFC-cover-00.15-00000000000-20220603T183608Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (3):
  remote.c: remove braces from one-statement "for"-loops
  remote.c: don't dereference NULL in freeing loop
  remote API: don't buggily FREE_AND_NULL(), free() instead

 remote.c     | 23 ++++++++++-------------
 remote.h     | 10 +++++++++-
 repository.c |  2 +-
 3 files changed, 20 insertions(+), 15 deletions(-)

Range-diff:
 1:  b3a678d934a !  1:  1879ed2826e remote.c: don't dereference NULL in freeing loop
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    remote.c: don't dereference NULL in freeing loop
    +    remote.c: remove braces from one-statement "for"-loops
     
    -    Fix a bug in fd3cb0501e1 (remote: move static variables into
    -    per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
    -    after having NULL'd out remote->pushurl. itself.
    -
    -    While we're at it let's get rid of the redundant braces per the
    -    CodingGuidelines, which also serves to show in the diff context that
    -    we were doing a FREE_AND_NULL(remote->pushurl) afterwards too, let's
    -    keep that one.
    +    Remove braces that don't follow the CodingGuidelines from code added
    +    in fd3cb0501e1 (remote: move static variables into per-repository
    +    struct, 2021-11-17). A subsequent commit will edit code adjacent to
    +    this.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ remote.c: static void remote_clear(struct remote *remote)
     +	for (i = 0; i < remote->url_nr; i++)
      		free((char *)remote->url[i]);
     -	}
    --	FREE_AND_NULL(remote->pushurl);
    --
    + 	FREE_AND_NULL(remote->pushurl);
    + 
     -	for (i = 0; i < remote->pushurl_nr; i++) {
     +	for (i = 0; i < remote->pushurl_nr; i++)
      		free((char *)remote->pushurl[i]);
    @@ remote.c: static void remote_clear(struct remote *remote)
      	FREE_AND_NULL(remote->pushurl);
      	free((char *)remote->receivepack);
      	free((char *)remote->uploadpack);
    +@@ remote.c: void remote_state_clear(struct remote_state *remote_state)
    + {
    + 	int i;
    + 
    +-	for (i = 0; i < remote_state->remotes_nr; i++) {
    ++	for (i = 0; i < remote_state->remotes_nr; i++)
    + 		remote_clear(remote_state->remotes[i]);
    +-	}
    + 	FREE_AND_NULL(remote_state->remotes);
    + 	remote_state->remotes_alloc = 0;
    + 	remote_state->remotes_nr = 0;
 2:  4a055969ea5 <  -:  ----------- pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path
 3:  0b570d112fc <  -:  ----------- reftable: don't memset() a NULL from failed malloc()
 4:  3a287c19d7e <  -:  ----------- diff-lib.c: don't dereference NULL in oneway_diff()
 5:  46e0c307941 <  -:  ----------- refs/packed-backend.c: add a BUG() if iter is NULL
 6:  2d04035d7aa <  -:  ----------- ref-filter.c: BUG() out on show_ref() with NULL refname
 7:  cf1a5f3ed0f <  -:  ----------- strbuf.c: placate -fanalyzer in strbuf_grow()
 8:  2c4b7832144 <  -:  ----------- strbuf.c: use st_add3(), not unsigned_add_overflows()
 9:  de0f7722608 <  -:  ----------- add-patch: assert parse_diff() expectations with BUG()
10:  b50558d3b24 <  -:  ----------- reftable: don't have reader_get_block() confuse -fanalyzer
11:  66518467e1d <  -:  ----------- blame.c: clarify the state of "final_commit" for -fanalyzer
12:  9f0f515ed3a <  -:  ----------- pack.h: wrap write_*file*() functions
13:  63eeb66185a <  -:  ----------- pack-write API: pass down "verify" not arbitrary flags
14:  9cf550688d4 <  -:  ----------- config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer
15:  16bd2270b4c <  -:  ----------- config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro
 -:  ----------- >  2:  0e258c230f6 remote.c: don't dereference NULL in freeing loop
 -:  ----------- >  3:  062fb3f454e remote API: don't buggily FREE_AND_NULL(), free() instead
-- 
2.36.1.1178.g0c3594a0ba5


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/3] remote.c: remove braces from one-statement "for"-loops
  2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
@ 2022-06-07 15:50   ` Ævar Arnfjörð Bjarmason
  2022-06-07 15:50   ` [PATCH 2/3] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-07 15:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Glen Choo, Phillip Wood,
	Ævar Arnfjörð Bjarmason

Remove braces that don't follow the CodingGuidelines from code added
in fd3cb0501e1 (remote: move static variables into per-repository
struct, 2021-11-17). A subsequent commit will edit code adjacent to
this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 remote.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/remote.c b/remote.c
index e98148ac227..3e75db7bb4f 100644
--- a/remote.c
+++ b/remote.c
@@ -145,14 +145,12 @@ static void remote_clear(struct remote *remote)
 	free((char *)remote->name);
 	free((char *)remote->foreign_vcs);
 
-	for (i = 0; i < remote->url_nr; i++) {
+	for (i = 0; i < remote->url_nr; i++)
 		free((char *)remote->url[i]);
-	}
 	FREE_AND_NULL(remote->pushurl);
 
-	for (i = 0; i < remote->pushurl_nr; i++) {
+	for (i = 0; i < remote->pushurl_nr; i++)
 		free((char *)remote->pushurl[i]);
-	}
 	FREE_AND_NULL(remote->pushurl);
 	free((char *)remote->receivepack);
 	free((char *)remote->uploadpack);
@@ -2720,9 +2718,8 @@ void remote_state_clear(struct remote_state *remote_state)
 {
 	int i;
 
-	for (i = 0; i < remote_state->remotes_nr; i++) {
+	for (i = 0; i < remote_state->remotes_nr; i++)
 		remote_clear(remote_state->remotes[i]);
-	}
 	FREE_AND_NULL(remote_state->remotes);
 	remote_state->remotes_alloc = 0;
 	remote_state->remotes_nr = 0;
-- 
2.36.1.1178.g0c3594a0ba5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/3] remote.c: don't dereference NULL in freeing loop
  2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
  2022-06-07 15:50   ` [PATCH 1/3] remote.c: remove braces from one-statement "for"-loops Ævar Arnfjörð Bjarmason
@ 2022-06-07 15:50   ` Ævar Arnfjörð Bjarmason
  2022-06-07 17:23     ` Junio C Hamano
  2022-06-07 15:50   ` [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead Ævar Arnfjörð Bjarmason
  2022-06-07 17:32   ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Junio C Hamano
  3 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-07 15:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Glen Choo, Phillip Wood,
	Ævar Arnfjörð Bjarmason

Fix a bug in fd3cb0501e1 (remote: move static variables into
per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
after having NULL'd out remote->pushurl. itself. We free
"remote->pushurl" in the next "for"-loop, so doing this appears to
have been a copy/paste error.

Before this change GCC 12's -fanalyzer would correctly note that we'd
dereference NULL in this case, this change fixes that:

	remote.c: In function ‘remote_clear’:
	remote.c:153:17: error: dereference of NULL ‘*remote.pushurl’ [CWE-476] [-Werror=analyzer-null-dereference]
	  153 |                 free((char *)remote->pushurl[i]);
	      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	      [...]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 remote.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote.c b/remote.c
index 3e75db7bb4f..0b243b090d9 100644
--- a/remote.c
+++ b/remote.c
@@ -147,7 +147,7 @@ static void remote_clear(struct remote *remote)
 
 	for (i = 0; i < remote->url_nr; i++)
 		free((char *)remote->url[i]);
-	FREE_AND_NULL(remote->pushurl);
+	FREE_AND_NULL(remote->url);
 
 	for (i = 0; i < remote->pushurl_nr; i++)
 		free((char *)remote->pushurl[i]);
-- 
2.36.1.1178.g0c3594a0ba5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead
  2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
  2022-06-07 15:50   ` [PATCH 1/3] remote.c: remove braces from one-statement "for"-loops Ævar Arnfjörð Bjarmason
  2022-06-07 15:50   ` [PATCH 2/3] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
@ 2022-06-07 15:50   ` Ævar Arnfjörð Bjarmason
  2022-06-07 17:02     ` Glen Choo
  2022-06-07 17:29     ` Junio C Hamano
  2022-06-07 17:32   ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Junio C Hamano
  3 siblings, 2 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-07 15:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Glen Choo, Phillip Wood,
	Ævar Arnfjörð Bjarmason

Change the buggy "remote_clear()" function to stop pretending to to be
able to zero out a "struct remote". Setting "url" and "pushurl" to
NULL results in an invalid state unless the corresponding "url_nr" and
"pushurl_nr" are also set to zero.

In this case however we do not use the "struct remote", so the
FREE_AND_NULL() pattern added in fd3cb0501e1 (remote: move static
variables into per-repository struct, 2021-11-17) can be replaced with
free().

The API was also odd in that remote_state_new() would xmalloc() for us,
but the user had to free() it themselves, let's instead change the
behavior to have the destructor free() what we malloc() in the
constructer.

In this case this appears to have been done for consistency with
repo_clear(), let's instead have repo_clear() handle the NULL-ing of
its "remote_state", and not attempt to reset the structure in remote.c

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 remote.c     | 14 +++++++-------
 remote.h     | 10 +++++++++-
 repository.c |  2 +-
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/remote.c b/remote.c
index 0b243b090d9..c6ce04dacb7 100644
--- a/remote.c
+++ b/remote.c
@@ -147,15 +147,15 @@ static void remote_clear(struct remote *remote)
 
 	for (i = 0; i < remote->url_nr; i++)
 		free((char *)remote->url[i]);
-	FREE_AND_NULL(remote->url);
+	free(remote->url);
 
 	for (i = 0; i < remote->pushurl_nr; i++)
 		free((char *)remote->pushurl[i]);
-	FREE_AND_NULL(remote->pushurl);
+	free(remote->pushurl);
 	free((char *)remote->receivepack);
 	free((char *)remote->uploadpack);
-	FREE_AND_NULL(remote->http_proxy);
-	FREE_AND_NULL(remote->http_proxy_authmethod);
+	free(remote->http_proxy);
+	free(remote->http_proxy_authmethod);
 }
 
 static void add_merge(struct branch *branch, const char *name)
@@ -2720,12 +2720,12 @@ void remote_state_clear(struct remote_state *remote_state)
 
 	for (i = 0; i < remote_state->remotes_nr; i++)
 		remote_clear(remote_state->remotes[i]);
-	FREE_AND_NULL(remote_state->remotes);
-	remote_state->remotes_alloc = 0;
-	remote_state->remotes_nr = 0;
+	free(remote_state->remotes);
 
 	hashmap_clear_and_free(&remote_state->remotes_hash, struct remote, ent);
 	hashmap_clear_and_free(&remote_state->branches_hash, struct remote, ent);
+
+	free(remote_state);
 }
 
 /*
diff --git a/remote.h b/remote.h
index dd4402436f1..d91b2b29373 100644
--- a/remote.h
+++ b/remote.h
@@ -54,9 +54,17 @@ struct remote_state {
 	int initialized;
 };
 
-void remote_state_clear(struct remote_state *remote_state);
+/**
+ * xmalloc() a "struct remote_state" and initialize it. The resulting
+ * data should be free'd with remote_state_clear().
+ */
 struct remote_state *remote_state_new(void);
 
+/**
+ * free() the structure returned by remote_state_new().
+ */
+void remote_state_clear(struct remote_state *remote_state);
+
 struct remote {
 	struct hashmap_entry ent;
 
diff --git a/repository.c b/repository.c
index 5d166b692c8..0a6df6937e4 100644
--- a/repository.c
+++ b/repository.c
@@ -292,7 +292,7 @@ void repo_clear(struct repository *repo)
 
 	if (repo->remote_state) {
 		remote_state_clear(repo->remote_state);
-		FREE_AND_NULL(repo->remote_state);
+		repo->remote_state = NULL;
 	}
 
 	repo_clear_path_cache(&repo->cached_paths);
-- 
2.36.1.1178.g0c3594a0ba5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead
  2022-06-07 15:50   ` [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead Ævar Arnfjörð Bjarmason
@ 2022-06-07 17:02     ` Glen Choo
  2022-06-07 18:09       ` Junio C Hamano
  2022-06-07 17:29     ` Junio C Hamano
  1 sibling, 1 reply; 51+ messages in thread
From: Glen Choo @ 2022-06-07 17:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Phillip Wood, Ævar Arnfjörð Bjarmason


Thanks for sending out this series :) I feel a little guilty leaving
behind this much junk. Coincidentally, I was already planning on
revisiting my previous work to clean up debt/mistakes, so this was a
good lesson in what to look out for.

The previous patches look good to me, and I have no comments on those.
I have a style question on this patch (more for my own learning than a
suggestion on this patch), but it also looks good to me.

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> In this case however we do not use the "struct remote", so the
> FREE_AND_NULL() pattern added in fd3cb0501e1 (remote: move static
> variables into per-repository struct, 2021-11-17) can be replaced with
> free().

Did you mean *reuse* the "struct remote"?

> The API was also odd in that remote_state_new() would xmalloc() for us,
> but the user had to free() it themselves, let's instead change the
> behavior to have the destructor free() what we malloc() in the
> constructer.
>
> In this case this appears to have been done for consistency with
> repo_clear(), let's instead have repo_clear() handle the NULL-ing of
> its "remote_state", and not attempt to reset the structure in remote.c

Yes, this was specifically for consistency reasons. We'll have to read
on to see whether or not this patch remains consistent with
repo_clear()...

>  static void add_merge(struct branch *branch, const char *name)
> @@ -2720,12 +2720,12 @@ void remote_state_clear(struct remote_state *remote_state)
>  
>  	for (i = 0; i < remote_state->remotes_nr; i++)
>  		remote_clear(remote_state->remotes[i]);
> -	FREE_AND_NULL(remote_state->remotes);
> -	remote_state->remotes_alloc = 0;
> -	remote_state->remotes_nr = 0;
> +	free(remote_state->remotes);
>  
>  	hashmap_clear_and_free(&remote_state->remotes_hash, struct remote, ent);
>  	hashmap_clear_and_free(&remote_state->branches_hash, struct remote, ent);
> +
> +	free(remote_state);
>  }

Now that remote_state_clear() free()-s the "struct remote_state", should
we rename it to something like "remote_state_free()"? There's some
precedent for this in the other repo_clear() 'destructors'...

> diff --git a/remote.h b/remote.h
> index dd4402436f1..d91b2b29373 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -54,9 +54,17 @@ struct remote_state {
>  	int initialized;
>  };
>  
> -void remote_state_clear(struct remote_state *remote_state);
> +/**
> + * xmalloc() a "struct remote_state" and initialize it. The resulting
> + * data should be free'd with remote_state_clear().
> + */
>  struct remote_state *remote_state_new(void);
>  
> +/**
> + * free() the structure returned by remote_state_new().
> + */
> +void remote_state_clear(struct remote_state *remote_state);
> +
>  struct remote {
>  	struct hashmap_entry ent;
>  
> diff --git a/repository.c b/repository.c
> index 5d166b692c8..0a6df6937e4 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -292,7 +292,7 @@ void repo_clear(struct repository *repo)
>  
>  	if (repo->remote_state) {
>  		remote_state_clear(repo->remote_state);
> -		FREE_AND_NULL(repo->remote_state);
> +		repo->remote_state = NULL;
>  	}
>  
>  	repo_clear_path_cache(&repo->cached_paths);

I suppose the question of whether or not to free() in the 'destructor'
depends on whether we expect the struct to be reusable? I don't expect
that "struct remote_state" needs to be reused, so free()-ing it is ok to
me.

The API is not _that_ odd though ;) As you noted, my initial use of
FREE_AND_NULL() is for consistency reasons with the rest of
repo_clear(), which looks like this:

	if (repo->config) {
		git_configset_clear(repo->config);
		FREE_AND_NULL(repo->config);
	}

	if (repo->submodule_cache) {
		submodule_cache_free(repo->submodule_cache);
		repo->submodule_cache = NULL;
	}

	if (repo->index) {
		discard_index(repo->index);
		if (repo->index != &the_index)
			FREE_AND_NULL(repo->index);
	}

	if (repo->promisor_remote_config) {
		promisor_remote_clear(repo->promisor_remote_config);
		FREE_AND_NULL(repo->promisor_remote_config);
	}

	if (repo->remote_state) {
		remote_state_clear(repo->remote_state);
 -	FREE_AND_NULL(repo->remote_state);
 +	repo->remote_state = NULL;
	}

promisor_remote_clear(), discard_index(), and git_configset_clear()
don't free() the struct, so it makes sense for them to use
FREE_AND_NULL(). AFAICT, these structs are meant to be reused, so it
makes sense that we "clear" it without freeing the struct pointer
itself.

On the other hand, submodule_cache_free() _does_ free() the struct, and
so we just use "= NULL". I noticed that this uses the verb "free", and
not "clear".

So now that remote_state_clear() *does* free() the struct, it is
perfectly fine to use "= NULL" here as well, though it uses the verb
"clear".

I'm not sure if we have a style around clear/free. Feel free to ignore
if there isn't one.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/3] remote.c: don't dereference NULL in freeing loop
  2022-06-07 15:50   ` [PATCH 2/3] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
@ 2022-06-07 17:23     ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2022-06-07 17:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Glen Choo, Phillip Wood

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix a bug in fd3cb0501e1 (remote: move static variables into
> per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
> after having NULL'd out remote->pushurl. itself. We free

Wow, that's a bad one.  Why didn't anybody notice this at runtime, I
have to wonder.

>  	for (i = 0; i < remote->url_nr; i++)
>  		free((char *)remote->url[i]);
> -	FREE_AND_NULL(remote->pushurl);
> +	FREE_AND_NULL(remote->url);
>  
>  	for (i = 0; i < remote->pushurl_nr; i++)
>  		free((char *)remote->pushurl[i]);
>  	FREE_AND_NULL(remote->pushurl);

Thanks.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead
  2022-06-07 15:50   ` [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead Ævar Arnfjörð Bjarmason
  2022-06-07 17:02     ` Glen Choo
@ 2022-06-07 17:29     ` Junio C Hamano
  1 sibling, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2022-06-07 17:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Glen Choo, Phillip Wood

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Change the buggy "remote_clear()" function to stop pretending to to be

"buggy" -> ""; "to to" -> "to".

Over-using FREE_AND_NULL() is indeed irritating, but as long as we
are not reusing the struct whose member are getting freed, there is
no "bug" per-se, no?

> able to zero out a "struct remote". Setting "url" and "pushurl" to
> NULL results in an invalid state unless the corresponding "url_nr" and
> "pushurl_nr" are also set to zero.
>
> In this case however we do not use the "struct remote", so the

"use" -> "reuse", with comma around "however" on both sides, probably.

> FREE_AND_NULL() pattern added in fd3cb0501e1 (remote: move static
> variables into per-repository struct, 2021-11-17) can be replaced with
> free().

Yay.

> The API was also odd in that remote_state_new() would xmalloc() for us,
> but the user had to free() it themselves, let's instead change the
> behavior to have the destructor free() what we malloc() in the
> constructer.

I am not sure if remote_clear() that does not free the remote goes
well with remote_state_clear() that does free remote_state.  As long
as it is clear whose responsibility to release resources is, we can
go either way, but helper functions with similar names having quite
different resource maintainance semantics looks like a potential
source of confusion to me.

> In this case this appears to have been done for consistency with
> repo_clear(), let's instead have repo_clear() handle the NULL-ing of
> its "remote_state", and not attempt to reset the structure in remote.c
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  remote.c     | 14 +++++++-------
>  remote.h     | 10 +++++++++-
>  repository.c |  2 +-
>  3 files changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/remote.c b/remote.c
> index 0b243b090d9..c6ce04dacb7 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -147,15 +147,15 @@ static void remote_clear(struct remote *remote)
>  
>  	for (i = 0; i < remote->url_nr; i++)
>  		free((char *)remote->url[i]);
> -	FREE_AND_NULL(remote->url);
> +	free(remote->url);
>  
>  	for (i = 0; i < remote->pushurl_nr; i++)
>  		free((char *)remote->pushurl[i]);
> -	FREE_AND_NULL(remote->pushurl);
> +	free(remote->pushurl);
>  	free((char *)remote->receivepack);
>  	free((char *)remote->uploadpack);
> -	FREE_AND_NULL(remote->http_proxy);
> -	FREE_AND_NULL(remote->http_proxy_authmethod);
> +	free(remote->http_proxy);
> +	free(remote->http_proxy_authmethod);
>  }
>  
>  static void add_merge(struct branch *branch, const char *name)
> @@ -2720,12 +2720,12 @@ void remote_state_clear(struct remote_state *remote_state)
>  
>  	for (i = 0; i < remote_state->remotes_nr; i++)
>  		remote_clear(remote_state->remotes[i]);
> -	FREE_AND_NULL(remote_state->remotes);
> -	remote_state->remotes_alloc = 0;
> -	remote_state->remotes_nr = 0;
> +	free(remote_state->remotes);
>  
>  	hashmap_clear_and_free(&remote_state->remotes_hash, struct remote, ent);
>  	hashmap_clear_and_free(&remote_state->branches_hash, struct remote, ent);
> +
> +	free(remote_state);
>  }
>  
>  /*
> diff --git a/remote.h b/remote.h
> index dd4402436f1..d91b2b29373 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -54,9 +54,17 @@ struct remote_state {
>  	int initialized;
>  };
>  
> -void remote_state_clear(struct remote_state *remote_state);
> +/**
> + * xmalloc() a "struct remote_state" and initialize it. The resulting
> + * data should be free'd with remote_state_clear().
> + */
>  struct remote_state *remote_state_new(void);
>  
> +/**
> + * free() the structure returned by remote_state_new().
> + */
> +void remote_state_clear(struct remote_state *remote_state);
> +
>  struct remote {
>  	struct hashmap_entry ent;
>  
> diff --git a/repository.c b/repository.c
> index 5d166b692c8..0a6df6937e4 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -292,7 +292,7 @@ void repo_clear(struct repository *repo)
>  
>  	if (repo->remote_state) {
>  		remote_state_clear(repo->remote_state);
> -		FREE_AND_NULL(repo->remote_state);
> +		repo->remote_state = NULL;
>  	}
>  
>  	repo_clear_path_cache(&repo->cached_paths);

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue
  2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2022-06-07 15:50   ` [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead Ævar Arnfjörð Bjarmason
@ 2022-06-07 17:32   ` Junio C Hamano
  3 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2022-06-07 17:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Glen Choo, Phillip Wood

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> This series is spun off from the recent RFC -fanalyzer series[1] and
> fixes a clear bug in remote_state_clear().

[1/3] is a pretty-much "Meh" (read: if it is already in the code, it
is not worth a patch noise to go and fix it, even though we would
avoid newly introducing the construct in new code), and some part of
[3/3] I am not so sure about (read: it is a mixed bag of changes),
but [2/3] is a bug that I find surprising that nobody noticed at
runtime.

Thanks.  Will queue.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead
  2022-06-07 17:02     ` Glen Choo
@ 2022-06-07 18:09       ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2022-06-07 18:09 UTC (permalink / raw)
  To: Glen Choo; +Cc: Ævar Arnfjörð Bjarmason, git, Phillip Wood

Glen Choo <chooglen@google.com> writes:

> I suppose the question of whether or not to free() in the 'destructor'
> depends on whether we expect the struct to be reusable? I don't expect
> that "struct remote_state" needs to be reused, so free()-ing it is ok to
> me.
>
> The API is not _that_ odd though ;) As you noted, my initial use of
> FREE_AND_NULL() is for consistency reasons with the rest of
> repo_clear(), which looks like this:
>
> 	if (repo->config) {
> 		git_configset_clear(repo->config);
> 		FREE_AND_NULL(repo->config);

So git_configset_clear() does clear but does not free.

> 	}
>
> 	if (repo->submodule_cache) {
> 		submodule_cache_free(repo->submodule_cache);

submodule_cache_free() does (probably clear and) free.

> 		repo->submodule_cache = NULL;
> 	}
>
> 	if (repo->index) {
> 		discard_index(repo->index);

discard_index() does not free.

> 		if (repo->index != &the_index)
> 			FREE_AND_NULL(repo->index);
> 	}
>
> 	if (repo->promisor_remote_config) {
> 		promisor_remote_clear(repo->promisor_remote_config);

promisor_remote_clear() does not free.

> 		FREE_AND_NULL(repo->promisor_remote_config);
> 	}
>
> 	if (repo->remote_state) {
> 		remote_state_clear(repo->remote_state);
>  -	FREE_AND_NULL(repo->remote_state);
>  +	repo->remote_state = NULL;
> 	}
>
> promisor_remote_clear(), discard_index(), and git_configset_clear()
> don't free() the struct, so it makes sense for them to use
> FREE_AND_NULL(). AFAICT, these structs are meant to be reused, so it
> makes sense that we "clear" it without freeing the struct pointer
> itself.
>
> On the other hand, submodule_cache_free() _does_ free() the struct, and
> so we just use "= NULL". I noticed that this uses the verb "free", and
> not "clear".
>
> So now that remote_state_clear() *does* free() the struct, it is
> perfectly fine to use "= NULL" here as well, though it uses the verb
> "clear".
>
> I'm not sure if we have a style around clear/free. Feel free to ignore
> if there isn't one.

It does bother me.  Changing _clear() that did not free the
container resource to free it, without changing the name to free or
release or whatever, smells like leaving a source of confusion for
future developers.



^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2022-06-07 19:28 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-03 18:37 [RFC PATCH 00/15] Fix GCC -fanalyzer warnings & add -fanalyzer DEVOPTS mode Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 01/15] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
2022-06-03 21:07   ` René Scharfe
2022-06-03 21:28     ` Junio C Hamano
2022-06-03 22:32     ` Glen Choo
2022-06-04 12:51     ` Phillip Wood
2022-06-04 16:20       ` Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 02/15] pull.c: don't feed NULL to strcmp() on get_rebase_fork_point() path Ævar Arnfjörð Bjarmason
2022-06-03 21:27   ` René Scharfe
2022-06-03 18:37 ` [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc() Ævar Arnfjörð Bjarmason
2022-06-03 22:22   ` René Scharfe
2022-06-04  0:54     ` Ævar Arnfjörð Bjarmason
2022-06-04 12:24       ` René Scharfe
2022-06-04 16:23         ` Ævar Arnfjörð Bjarmason
2022-06-04 20:31           ` René Scharfe
2022-06-06 16:53           ` Junio C Hamano
2022-06-06 17:38             ` Ævar Arnfjörð Bjarmason
2022-06-06 17:44               ` Junio C Hamano
2022-06-06 17:46                 ` Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 04/15] diff-lib.c: don't dereference NULL in oneway_diff() Ævar Arnfjörð Bjarmason
2022-06-03 22:48   ` René Scharfe
2022-06-03 18:37 ` [RFC PATCH 05/15] refs/packed-backend.c: add a BUG() if iter is NULL Ævar Arnfjörð Bjarmason
2022-06-03 23:14   ` René Scharfe
2022-06-03 18:37 ` [RFC PATCH 06/15] ref-filter.c: BUG() out on show_ref() with NULL refname Ævar Arnfjörð Bjarmason
2022-06-04 18:07   ` René Scharfe
2022-06-03 18:37 ` [RFC PATCH 07/15] strbuf.c: placate -fanalyzer in strbuf_grow() Ævar Arnfjörð Bjarmason
2022-06-04 12:24   ` René Scharfe
2022-06-04 12:46   ` Phillip Wood
2022-06-04 16:21     ` Ævar Arnfjörð Bjarmason
2022-06-04 20:37       ` René Scharfe
2022-06-05 10:20         ` Phillip Wood
2022-06-03 18:37 ` [RFC PATCH 08/15] strbuf.c: use st_add3(), not unsigned_add_overflows() Ævar Arnfjörð Bjarmason
2022-06-04 21:27   ` René Scharfe
2022-06-03 18:37 ` [RFC PATCH 09/15] add-patch: assert parse_diff() expectations with BUG() Ævar Arnfjörð Bjarmason
2022-06-04 13:04   ` Phillip Wood
2022-06-03 18:37 ` [RFC PATCH 10/15] reftable: don't have reader_get_block() confuse -fanalyzer Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 11/15] blame.c: clarify the state of "final_commit" for -fanalyzer Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 12/15] pack.h: wrap write_*file*() functions Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 13/15] pack-write API: pass down "verify" not arbitrary flags Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 14/15] config.mak.dev: add a DEVOPTS=analyzer mode to use GCC's -fanalyzer Ævar Arnfjörð Bjarmason
2022-06-03 18:37 ` [RFC PATCH 15/15] config.mak.dev: add and use ASSERT_FOR_FANALYZER() macro Ævar Arnfjörð Bjarmason
2022-06-04 13:12   ` Phillip Wood
2022-06-07 15:50 ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Ævar Arnfjörð Bjarmason
2022-06-07 15:50   ` [PATCH 1/3] remote.c: remove braces from one-statement "for"-loops Ævar Arnfjörð Bjarmason
2022-06-07 15:50   ` [PATCH 2/3] remote.c: don't dereference NULL in freeing loop Ævar Arnfjörð Bjarmason
2022-06-07 17:23     ` Junio C Hamano
2022-06-07 15:50   ` [PATCH 3/3] remote API: don't buggily FREE_AND_NULL(), free() instead Ævar Arnfjörð Bjarmason
2022-06-07 17:02     ` Glen Choo
2022-06-07 18:09       ` Junio C Hamano
2022-06-07 17:29     ` Junio C Hamano
2022-06-07 17:32   ` [PATCH 0/3] remote API: fix -fanalyzer-spotted freeing issue Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.