All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] strvec: use size_t to store nr and alloc
@ 2021-09-11 15:01 Jeff King
  2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2021-09-11 15:01 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Derrick Stolee

We converted argv_array (which later became strvec) to use size_t in
819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
order to avoid the possibility of integer overflow. But later, commit
d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
converted these back to ints!

Those two commits were part of the same patch series. I'm pretty sure
what happened is that they were originally written in the opposite order
and then cleaned up and re-ordered during an interactive rebase. And
when resolving the inevitable conflict, I mistakenly took the "rename"
patch completely, accidentally dropping the type change.

We can correct it now; better late than never.

Signed-off-by: Jeff King <peff@peff.net>
---
This was posted previously in the midst of another thread, but I don't
think was picked up. There was some positive reaction, but one "do we
really need this?" to which I responded in detail:

  https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/

I don't really think any of that needs to go into the commit message,
but if that's a hold-up, I can try to summarize it (though I think
referring to the commit which _already_ did this and was accidentally
reverted would be sufficient).

 strvec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/strvec.h b/strvec.h
index fdcad75b45..6b3cbd6758 100644
--- a/strvec.h
+++ b/strvec.h
@@ -29,8 +29,8 @@ extern const char *empty_strvec[];
  */
 struct strvec {
 	const char **v;
-	int nr;
-	int alloc;
+	size_t nr;
+	size_t alloc;
 };
 
 #define STRVEC_INIT { empty_strvec, 0, 0 }
-- 
2.33.0.811.g40f7f3a89c

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] strvec: use size_t to store nr and alloc
  2021-09-11 15:01 [PATCH] strvec: use size_t to store nr and alloc Jeff King
@ 2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
  2021-09-11 22:48   ` Philip Oakley
  2021-09-12 21:58   ` Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-11 16:13 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Derrick Stolee


On Sat, Sep 11 2021, Jeff King wrote:

> We converted argv_array (which later became strvec) to use size_t in
> 819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
> order to avoid the possibility of integer overflow. But later, commit
> d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
> converted these back to ints!
>
> Those two commits were part of the same patch series. I'm pretty sure
> what happened is that they were originally written in the opposite order
> and then cleaned up and re-ordered during an interactive rebase. And
> when resolving the inevitable conflict, I mistakenly took the "rename"
> patch completely, accidentally dropping the type change.
>
> We can correct it now; better late than never.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This was posted previously in the midst of another thread, but I don't
> think was picked up. There was some positive reaction, but one "do we
> really need this?" to which I responded in detail:
>
>   https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/
>
> I don't really think any of that needs to go into the commit message,
> but if that's a hold-up, I can try to summarize it (though I think
> referring to the commit which _already_ did this and was accidentally
> reverted would be sufficient).

Thanks, I have a WIP version of this outstanding starting with this
patch that I was planning to submit sometime, but I'm happy to have you
pursue it, especially with the ~100 outstanding patches I have in
master..seen.

It does feel somewhere between iffy and a landmine waiting to be stepped
on to only convert the member itself, and not any of the corresponding
"int" variables that track it to "size_t".

If you do the change I suggested in
https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
find that there's at least one first-order reference to this that now
uses "int" that if converted to "size_t" will result in a wrap-around
error, we're lucky that one has a test failure.

I can tell you what that bug is, but maybe it's better if you find it
yourself :) I.e. I found *that* one, but I'm not sure I found them
all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
errors & the code context (note: pointer, obviously broken, but makes
the compiler yell).

That particular bug will be caught by the compiler as it involves a >= 0
comparison against unsigned, but we may not not have that everywhere...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] strvec: use size_t to store nr and alloc
  2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
@ 2021-09-11 22:48   ` Philip Oakley
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
  2021-09-12 22:00     ` [PATCH] " Jeff King
  2021-09-12 21:58   ` Jeff King
  1 sibling, 2 replies; 23+ messages in thread
From: Philip Oakley @ 2021-09-11 22:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff King
  Cc: git, Junio C Hamano, Derrick Stolee

On 11/09/2021 17:13, Ævar Arnfjörð Bjarmason wrote:
> On Sat, Sep 11 2021, Jeff King wrote:
>
>> We converted argv_array (which later became strvec) to use size_t in
>> 819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
>> order to avoid the possibility of integer overflow. But later, commit
>> d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
>> converted these back to ints!
>>
>> Those two commits were part of the same patch series. I'm pretty sure
>> what happened is that they were originally written in the opposite order
>> and then cleaned up and re-ordered during an interactive rebase. And
>> when resolving the inevitable conflict, I mistakenly took the "rename"
>> patch completely, accidentally dropping the type change.
>>
>> We can correct it now; better late than never.
>>
>> Signed-off-by: Jeff King <peff@peff.net>
>> ---
>> This was posted previously in the midst of another thread, but I don't
>> think was picked up. There was some positive reaction, but one "do we
>> really need this?" to which I responded in detail:
>>
>>   https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/
>>
>> I don't really think any of that needs to go into the commit message,
>> but if that's a hold-up, I can try to summarize it (though I think
>> referring to the commit which _already_ did this and was accidentally
>> reverted would be sufficient).
> Thanks, I have a WIP version of this outstanding starting with this
> patch that I was planning to submit sometime, but I'm happy to have you
> pursue it, especially with the ~100 outstanding patches I have in
> master..seen.
>
> It does feel somewhere between iffy and a landmine waiting to be stepped
> on to only convert the member itself, and not any of the corresponding
> "int" variables that track it to "size_t".
>
> If you do the change I suggested in
> https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
> find that there's at least one first-order reference to this that now
> uses "int" that if converted to "size_t" will result in a wrap-around
> error, we're lucky that one has a test failure.
>
> I can tell you what that bug is, but maybe it's better if you find it
> yourself :) I.e. I found *that* one, but I'm not sure I found them
> all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
> errors & the code context (note: pointer, obviously broken, but makes
> the compiler yell).
>
> That particular bug will be caught by the compiler as it involves a >= 0
> comparison against unsigned, but we may not not have that everywhere...

I'm particularly interested in the int -> size_t change problem as part
of the wider 4GB limitations for the LLP64 systems [0] such as the
RaspPi, git-lfs (on windows [1]), and Git-for-Windows[2]. It is a big
problem.


Philip


[0]
http://nickdesaulniers.github.io/blog/2016/05/30/data-models-and-word-size/
[1] https://github.com/git-lfs/git-lfs/issues/2434  Git on Windows
client corrupts files > 4Gb
[2] https://github.com/git-for-windows/git/pull/2179  [DRAFT] for
testing : Fix 4Gb limit for large files on Git for Windows

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 0/7] strvec: use size_t to store nr and alloc
  2021-09-11 22:48   ` Philip Oakley
@ 2021-09-12  0:15     ` Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
                         ` (8 more replies)
  2021-09-12 22:00     ` [PATCH] " Jeff King
  1 sibling, 9 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

This is a proposed v2 of Jeff King's one-patch change to change
strvec's nr/alloc from "int" to "size_t". As noted below I think it's
worthwhile to not only change that in the struct, but also in code
that directly references the "nr" member.

On Sat, Sep 11 2021, Philip Oakley wrote:

> On 11/09/2021 17:13, Ævar Arnfjörð Bjarmason wrote:
>> On Sat, Sep 11 2021, Jeff King wrote:
>>
>>> We converted argv_array (which later became strvec) to use size_t in
>>> 819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
>>> order to avoid the possibility of integer overflow. But later, commit
>>> d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
>>> converted these back to ints!
>>>
>>> Those two commits were part of the same patch series. I'm pretty sure
>>> what happened is that they were originally written in the opposite order
>>> and then cleaned up and re-ordered during an interactive rebase. And
>>> when resolving the inevitable conflict, I mistakenly took the "rename"
>>> patch completely, accidentally dropping the type change.
>>>
>>> We can correct it now; better late than never.
>>>
>>> Signed-off-by: Jeff King <peff@peff.net>
>>> ---
>>> This was posted previously in the midst of another thread, but I don't
>>> think was picked up. There was some positive reaction, but one "do we
>>> really need this?" to which I responded in detail:
>>>
>>>   https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/
>>>
>>> I don't really think any of that needs to go into the commit message,
>>> but if that's a hold-up, I can try to summarize it (though I think
>>> referring to the commit which _already_ did this and was accidentally
>>> reverted would be sufficient).
>> Thanks, I have a WIP version of this outstanding starting with this
>> patch that I was planning to submit sometime, but I'm happy to have you
>> pursue it, especially with the ~100 outstanding patches I have in
>> master..seen.
>>
>> It does feel somewhere between iffy and a landmine waiting to be stepped
>> on to only convert the member itself, and not any of the corresponding
>> "int" variables that track it to "size_t".
>>
>> If you do the change I suggested in
>> https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
>> find that there's at least one first-order reference to this that now
>> uses "int" that if converted to "size_t" will result in a wrap-around
>> error, we're lucky that one has a test failure.
>>
>> I can tell you what that bug is, but maybe it's better if you find it
>> yourself :) I.e. I found *that* one, but I'm not sure I found them
>> all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
>> errors & the code context (note: pointer, obviously broken, but makes
>> the compiler yell).
>>
>> That particular bug will be caught by the compiler as it involves a >= 0
>> comparison against unsigned, but we may not not have that everywhere...
>
> I'm particularly interested in the int -> size_t change problem as part
> of the wider 4GB limitations for the LLP64 systems [0] such as the
> RaspPi, git-lfs (on windows [1]), and Git-for-Windows[2]. It is a big
> problem.

Okey, fine, no fun excercise for the reader then ;)

This is what I'd been sitting on locally since that recent thread, I
polished it up a bit since Jeff King posted his version.

The potential overflow bug I mentioned is in rebase.c. See
5/7. "Potential" because it's not a bug now, but that code
intentionally considers a strvec, and then iterates it from nr-1 to 0,
and if it reaches 0 intentionally counts down one more to -1 to
indicate that it's visited all elements.

We then check that with i >= 0, except of course if it becomes
unsigned that doesn't become -1, but rather it wraps around.

The rest of this is all changes to have that s/int/size_t/ radiate
outwards, i.e. when we assign that value to a variable somewhere its
now a "size_t" instead of an "int" etc.

> [0]
> http://nickdesaulniers.github.io/blog/2016/05/30/data-models-and-word-size/
> [1] https://github.com/git-lfs/git-lfs/issues/2434  Git on Windows
> client corrupts files > 4Gb
> [2] https://github.com/git-for-windows/git/pull/2179  [DRAFT] for
> testing : Fix 4Gb limit for large files on Git for Windows

Jeff King (1):
  strvec: use size_t to store nr and alloc

Ævar Arnfjörð Bjarmason (6):
  remote-curl: pass "struct strvec *" instead of int/char ** pair
  pack-objects: pass "struct strvec *" instead of int/char ** pair
  sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
  upload-pack.c: pass "struct strvec *" instead of int/char ** pair
  rebase: don't have loop over "struct strvec" depend on signed "nr"
  strvec API users: change some "int" tracking "nr" to "size_t"

 builtin/pack-objects.c |  6 +++---
 builtin/rebase.c       | 26 ++++++++++++--------------
 connect.c              |  8 ++++----
 fetch-pack.c           |  4 ++--
 ls-refs.c              |  2 +-
 remote-curl.c          | 23 +++++++++++------------
 sequencer.c            |  8 ++++----
 sequencer.h            |  4 ++--
 serve.c                |  2 +-
 shallow.c              |  5 +++--
 shallow.h              |  6 ++++--
 strvec.h               |  4 ++--
 submodule.c            |  2 +-
 upload-pack.c          |  7 +++----
 14 files changed, 53 insertions(+), 54 deletions(-)

Range-diff against v1:
-:  ----------- > 1:  2ef48d734e8 remote-curl: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 2:  7f59a58ed97 pack-objects: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 3:  c35cfb9c9c5 sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 4:  2e0b82d4316 upload-pack.c: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 5:  be85a0565ef rebase: don't have loop over "struct strvec" depend on signed "nr"
1:  498f5ed80dc ! 6:  ba17290852c strvec: use size_t to store nr and alloc
    @@ Commit message
         We can correct it now; better late than never.
     
         Signed-off-by: Jeff King <peff@peff.net>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## strvec.h ##
     @@ strvec.h: extern const char *empty_strvec[];
-:  ----------- > 7:  2edd9708888 strvec API users: change some "int" tracking "nr" to "size_t"
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  0:36         ` Carlo Arenas
  2021-09-12  0:15       ` [PATCH v2 2/7] pack-objects: " Ævar Arnfjörð Bjarmason
                         ` (7 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

We'll be changing the "int nr" in the "struct strvec" to "int size_t"
soon, i.e. from signed to unsigned. It will then make sense to use the
"size_t" type for things that mirror that "nr" member.

This sets up that change in remote-curl.c, we'll change the "int i"
here to a "size_t i" in a later commit (let's also split and "err" up
to make that change smaller).

In this case the push() function can just pass the "struct strvec" it
has down to push_dav() and push_git(), in addition make use of
strvec_pushv() in push_dav() instead of looping over the the specs
ourselves. So we can get rid of much of the need for tracking the
"nr", which in the case of push_dav() was already a "size_t".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 remote-curl.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 598cff7cde6..1f177e86f11 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1266,10 +1266,9 @@ static void parse_fetch(struct strbuf *buf)
 	strbuf_reset(buf);
 }
 
-static int push_dav(int nr_spec, const char **specs)
+static int push_dav(struct strvec *specs)
 {
 	struct child_process child = CHILD_PROCESS_INIT;
-	size_t i;
 
 	child.git_cmd = 1;
 	strvec_push(&child.args, "http-push");
@@ -1279,18 +1278,18 @@ static int push_dav(int nr_spec, const char **specs)
 	if (options.verbosity > 1)
 		strvec_push(&child.args, "--verbose");
 	strvec_push(&child.args, url.buf);
-	for (i = 0; i < nr_spec; i++)
-		strvec_push(&child.args, specs[i]);
+	strvec_pushv(&child.args, specs->v);
 
 	if (run_command(&child))
 		die(_("git-http-push failed"));
 	return 0;
 }
 
-static int push_git(struct discovery *heads, int nr_spec, const char **specs)
+static int push_git(struct discovery *heads, struct strvec *specs)
 {
 	struct rpc_state rpc;
-	int i, err;
+	int i;
+	int err;
 	struct strvec args;
 	struct string_list_item *cas_option;
 	struct strbuf preamble = STRBUF_INIT;
@@ -1326,8 +1325,8 @@ static int push_git(struct discovery *heads, int nr_spec, const char **specs)
 		strvec_push(&args, "--force-if-includes");
 
 	strvec_push(&args, "--stdin");
-	for (i = 0; i < nr_spec; i++)
-		packet_buf_write(&preamble, "%s\n", specs[i]);
+	for (i = 0; i < specs->nr; i++)
+		packet_buf_write(&preamble, "%s\n", specs->v[i]);
 	packet_buf_flush(&preamble);
 
 	memset(&rpc, 0, sizeof(rpc));
@@ -1342,15 +1341,15 @@ static int push_git(struct discovery *heads, int nr_spec, const char **specs)
 	return err;
 }
 
-static int push(int nr_spec, const char **specs)
+static int push(struct strvec *specs)
 {
 	struct discovery *heads = discover_refs("git-receive-pack", 1);
 	int ret;
 
 	if (heads->proto_git)
-		ret = push_git(heads, nr_spec, specs);
+		ret = push_git(heads, specs);
 	else
-		ret = push_dav(nr_spec, specs);
+		ret = push_dav(specs);
 	free_discovery(heads);
 	return ret;
 }
@@ -1374,7 +1373,7 @@ static void parse_push(struct strbuf *buf)
 			break;
 	} while (1);
 
-	ret = push(specs.nr, specs.v);
+	ret = push(&specs);
 	printf("\n");
 	fflush(stdout);
 
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/7] pack-objects: pass "struct strvec *" instead of int/char ** pair
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 3/7] sequencer.[ch]: " Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

As in the preceding commit, prepare for the "nr" member of "struct
strvec" changing from an "int" to a "size_t".

In this case we end up passing the "nr" to setup_revisions(), which
expects to take an argc/argv pair passed to main(), so we can't change
its type.

But we can defer the conversion until that point. Let's continue to
pass the "struct strvec *" down into instead of "unwrapping" it in
cases where the only source of the "int/char **" pair is a "struct
strvec".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/pack-objects.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ec8503563a6..1902c0b6776 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3701,7 +3701,7 @@ static void mark_bitmap_preferred_tips(void)
 	}
 }
 
-static void get_object_list(int ac, const char **av)
+static void get_object_list(struct strvec *rp)
 {
 	struct rev_info revs;
 	struct setup_revision_opt s_r_opt = {
@@ -3713,7 +3713,7 @@ static void get_object_list(int ac, const char **av)
 
 	repo_init_revisions(the_repository, &revs, NULL);
 	save_commit_buffer = 0;
-	setup_revisions(ac, av, &revs, &s_r_opt);
+	setup_revisions(rp->nr, rp->v, &revs, &s_r_opt);
 
 	/* make sure shallows are read */
 	is_repository_shallow(the_repository);
@@ -4138,7 +4138,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	} else if (!use_internal_rev_list)
 		read_object_list_from_stdin();
 	else {
-		get_object_list(rp.nr, rp.v);
+		get_object_list(&rp);
 		strvec_clear(&rp);
 	}
 	cleanup_preferred_base();
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/7] sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 2/7] pack-objects: " Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 4/7] upload-pack.c: " Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

As in the preceding commit, prepare for the "nr" member of "struct
strvec" changing from an "int" to a "size_t". These are the same sorts
of changes to pass a "struct strvec *" further down instead of passing
args->nr and args->v.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/rebase.c | 3 +--
 sequencer.c      | 6 +++---
 sequencer.h      | 4 ++--
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index eb01f4d790b..27669880918 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -363,8 +363,7 @@ static int do_interactive_rebase(struct rebase_options *opts, unsigned flags)
 			     oid_to_hex(&opts->restrict_revision->object.oid));
 
 	ret = sequencer_make_script(the_repository, &todo_list.buf,
-				    make_script_args.nr, make_script_args.v,
-				    flags);
+				    &make_script_args, flags);
 
 	if (ret)
 		error(_("could not generate todo list"));
diff --git a/sequencer.c b/sequencer.c
index 68316636921..a4ba434f173 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5355,8 +5355,8 @@ static int make_script_with_merges(struct pretty_print_context *pp,
 	return 0;
 }
 
-int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
-			  const char **argv, unsigned flags)
+int sequencer_make_script(struct repository *r, struct strbuf *out,
+			  struct strvec *args, unsigned flags)
 {
 	char *format = NULL;
 	struct pretty_print_context pp = {0};
@@ -5390,7 +5390,7 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 	pp.fmt = revs.commit_format;
 	pp.output_encoding = get_log_output_encoding();
 
-	if (setup_revisions(argc, argv, &revs, NULL) > 1)
+	if (setup_revisions(args->nr, args->v, &revs, NULL) > 1)
 		return error(_("make_script: unhandled options"));
 
 	if (prepare_revision_walk(&revs) < 0)
diff --git a/sequencer.h b/sequencer.h
index 2088344cb37..36f67164f86 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -158,8 +158,8 @@ int sequencer_remove_state(struct replay_opts *opts);
 #define TODO_LIST_REAPPLY_CHERRY_PICKS (1U << 7)
 #define TODO_LIST_WARN_SKIPPED_CHERRY_PICKS (1U << 8)
 
-int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
-			  const char **argv, unsigned flags);
+int sequencer_make_script(struct repository *r, struct strbuf *out,
+			  struct strvec *args, unsigned flags);
 
 void todo_list_add_exec_commands(struct todo_list *todo_list,
 				 struct string_list *commands);
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/7] upload-pack.c: pass "struct strvec *" instead of int/char ** pair
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-09-12  0:15       ` [PATCH v2 3/7] sequencer.[ch]: " Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr" Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

As in the preceding commit, prepare for the "nr" member of "struct
strvec" changing from an "int" to a "size_t". These are the same sorts
of changes to pass a "struct strvec *" further down instead of passing
args->nr and args->v.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 shallow.c     | 5 +++--
 shallow.h     | 6 ++++--
 upload-pack.c | 7 +++----
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/shallow.c b/shallow.c
index 9ed18eb8849..6928db36669 100644
--- a/shallow.c
+++ b/shallow.c
@@ -15,6 +15,7 @@
 #include "list-objects.h"
 #include "commit-reach.h"
 #include "shallow.h"
+#include "strvec.h"
 
 void set_alternate_shallow_file(struct repository *r, const char *path, int override)
 {
@@ -196,7 +197,7 @@ static void show_commit(struct commit *commit, void *data)
  * are marked with shallow_flag. The list of border/shallow commits
  * are also returned.
  */
-struct commit_list *get_shallow_commits_by_rev_list(int ac, const char **av,
+struct commit_list *get_shallow_commits_by_rev_list(struct strvec *args,
 						    int shallow_flag,
 						    int not_shallow_flag)
 {
@@ -215,7 +216,7 @@ struct commit_list *get_shallow_commits_by_rev_list(int ac, const char **av,
 
 	repo_init_revisions(the_repository, &revs, NULL);
 	save_commit_buffer = 0;
-	setup_revisions(ac, av, &revs, NULL);
+	setup_revisions(args->nr, args->v, &revs, NULL);
 
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
diff --git a/shallow.h b/shallow.h
index 5b4a96dcd69..206405ec8ca 100644
--- a/shallow.h
+++ b/shallow.h
@@ -5,6 +5,7 @@
 #include "object.h"
 #include "repository.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 void set_alternate_shallow_file(struct repository *r, const char *path, int override);
 int register_shallow(struct repository *r, const struct object_id *oid);
@@ -32,8 +33,9 @@ void rollback_shallow_file(struct repository *r, struct shallow_lock *lk);
 
 struct commit_list *get_shallow_commits(struct object_array *heads,
 					int depth, int shallow_flag, int not_shallow_flag);
-struct commit_list *get_shallow_commits_by_rev_list(
-		int ac, const char **av, int shallow_flag, int not_shallow_flag);
+struct commit_list *get_shallow_commits_by_rev_list(struct strvec *args,
+						    int shallow_flag,
+						    int not_shallow_flag);
 int write_shallow_commits(struct strbuf *out, int use_pack_protocol,
 			  const struct oid_array *extra);
 
diff --git a/upload-pack.c b/upload-pack.c
index 6ce07231d3d..5928973bcc3 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -859,13 +859,12 @@ static void deepen(struct upload_pack_data *data, int depth)
 }
 
 static void deepen_by_rev_list(struct upload_pack_data *data,
-			       int ac,
-			       const char **av)
+			       struct strvec *args)
 {
 	struct commit_list *result;
 
 	disable_commit_graph(the_repository);
-	result = get_shallow_commits_by_rev_list(ac, av, SHALLOW, NOT_SHALLOW);
+	result = get_shallow_commits_by_rev_list(args, SHALLOW, NOT_SHALLOW);
 	send_shallow(data, result);
 	free_commit_list(result);
 	send_unshallow(data);
@@ -900,7 +899,7 @@ static int send_shallow_list(struct upload_pack_data *data)
 			struct object *o = data->want_obj.objects[i].item;
 			strvec_push(&av, oid_to_hex(&o->oid));
 		}
-		deepen_by_rev_list(data, av.nr, av.v);
+		deepen_by_rev_list(data, &av);
 		strvec_clear(&av);
 		ret = 1;
 	} else {
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr"
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                         ` (3 preceding siblings ...)
  2021-09-12  0:15       ` [PATCH v2 4/7] upload-pack.c: " Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  2:57         ` Eric Sunshine
  2021-09-12  0:15       ` [PATCH v2 6/7] strvec: use size_t to store nr and alloc Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

As in the preceding commit, prepare for the "nr" member of "struct
strvec" changing from an "int" to a "size_t".

Let's change this code added in f57696802c3 (rebase: really just
passthru the `git am` options, 2018-11-14) so that it won't cause a
bug if the "i" variable here is updated to be a "size_t" instead of an
"int".

The reason it would be buggy is because this for-loop relied on
"counting down" from nr-1 to 0, we'll then decrement the variable one
last time, so a value of -1 indicates that we've visited all
elements. Since we're looking for a needle in the haystack having
looked at all the hay means that the needle isn't there.

Let's replace this with simpler code that loops overthe "struct
strvec" and checks the current needle is "-q", if it isn't and we're
doing a merge we die, otherwise we do a "REBASE_APPLY".

We've been able to simplify this code since 8295ed690bf (rebase: make
the backend configurable via config setting, 2020-02-15) when we
stopped using this for anything except this one check, so let's do
that and move the check into the loop itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/rebase.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 27669880918..dcd897fda9c 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1750,16 +1750,14 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 
 	if (options.git_am_opts.nr || options.type == REBASE_APPLY) {
 		/* all am options except -q are compatible only with --apply */
-		for (i = options.git_am_opts.nr - 1; i >= 0; i--)
-			if (strcmp(options.git_am_opts.v[i], "-q"))
-				break;
-
-		if (i >= 0) {
-			if (is_merge(&options))
-				die(_("cannot combine apply options with "
-				      "merge options"));
-			else
+		for (i = 0; i < options.git_am_opts.nr; i++) {
+			if (strcmp(options.git_am_opts.v[i], "-q")) {
+				if (is_merge(&options))
+					die(_("cannot combine apply options with "
+					      "merge options"));
 				options.type = REBASE_APPLY;
+				break;
+			}
 		}
 	}
 
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 6/7] strvec: use size_t to store nr and alloc
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                         ` (4 preceding siblings ...)
  2021-09-12  0:15       ` [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr" Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  0:15       ` [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t" Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

From: Jeff King <peff@peff.net>

We converted argv_array (which later became strvec) to use size_t in
819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
order to avoid the possibility of integer overflow. But later, commit
d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
converted these back to ints!

Those two commits were part of the same patch series. I'm pretty sure
what happened is that they were originally written in the opposite order
and then cleaned up and re-ordered during an interactive rebase. And
when resolving the inevitable conflict, I mistakenly took the "rename"
patch completely, accidentally dropping the type change.

We can correct it now; better late than never.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 strvec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/strvec.h b/strvec.h
index fdcad75b45b..6b3cbd67589 100644
--- a/strvec.h
+++ b/strvec.h
@@ -29,8 +29,8 @@ extern const char *empty_strvec[];
  */
 struct strvec {
 	const char **v;
-	int nr;
-	int alloc;
+	size_t nr;
+	size_t alloc;
 };
 
 #define STRVEC_INIT { empty_strvec, 0, 0 }
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t"
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                         ` (5 preceding siblings ...)
  2021-09-12  0:15       ` [PATCH v2 6/7] strvec: use size_t to store nr and alloc Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:15       ` Ævar Arnfjörð Bjarmason
  2021-09-12  3:00         ` Eric Sunshine
  2021-09-12 22:19       ` [PATCH v2 0/7] strvec: use size_t to store nr and alloc Jeff King
  2021-09-13 10:47       ` Philip Oakley
  8 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-12  0:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Philip Oakley,
	Ævar Arnfjörð Bjarmason

Now that the strvec.h "int nr" has been changed to "size_t nr" in the
preceding commit change various users of the API so that their local
tracking (usually for-loop iteration) of the number of items in the
array uses "size_t" as well.

These were found by changing the "nr" member to a pointer temporarily,
and manually looking at the codepaths that the compiler complained
about.

As argued in <YTIBnT8Ue1HZXs82@coredump.intra.peff.net>[1] these
changes are not strictly necessary as a follow-up, but let's do them
anyway so we don't need to wonder about the "size_t" v.s. "int"
inconsistency going forward.

As noted in <87v93i8svd.fsf@evledraar.gmail.com> in that thread we
have various things that interact with the "int argc" passed from
main() (e.g. setup_revisions()) and "struct strvec". Those things
could consistently use "int" before, but will now use some mixture of
"int" and "size_t". That's OK, but is the reason we're not converting
all API users to use "size_t" here for their own copies of "nr", or
when they pass that "nr" to other functions.

1. https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/
2. https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/rebase.c | 7 ++++---
 connect.c        | 8 ++++----
 fetch-pack.c     | 4 ++--
 ls-refs.c        | 2 +-
 remote-curl.c    | 2 +-
 sequencer.c      | 2 +-
 serve.c          | 2 +-
 submodule.c      | 2 +-
 8 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index dcd897fda9c..b96faaa7fea 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1424,7 +1424,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 			 N_("apply all changes, even those already present upstream")),
 		OPT_END(),
 	};
-	int i;
+	size_t i;
+	unsigned int j;
 
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_rebase_usage,
@@ -1654,8 +1655,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	for (i = 0; i < exec.nr; i++)
-		if (check_exec_cmd(exec.items[i].string))
+	for (j = 0; j < exec.nr; j++)
+		if (check_exec_cmd(exec.items[j].string))
 			exit(1);
 
 	if (!(options.flags & REBASE_NO_QUIET))
diff --git a/connect.c b/connect.c
index aff13a270e6..a54d4aa4d90 100644
--- a/connect.c
+++ b/connect.c
@@ -68,7 +68,7 @@ static NORETURN void die_initial_contact(int unexpected)
 /* Checks if the server supports the capability 'c' */
 int server_supports_v2(const char *c, int die_on_error)
 {
-	int i;
+	size_t i;
 
 	for (i = 0; i < server_capabilities_v2.nr; i++) {
 		const char *out;
@@ -85,7 +85,7 @@ int server_supports_v2(const char *c, int die_on_error)
 
 int server_feature_v2(const char *c, const char **v)
 {
-	int i;
+	size_t i;
 
 	for (i = 0; i < server_capabilities_v2.nr; i++) {
 		const char *out;
@@ -101,7 +101,7 @@ int server_feature_v2(const char *c, const char **v)
 int server_supports_feature(const char *c, const char *feature,
 			    int die_on_error)
 {
-	int i;
+	size_t i;
 
 	for (i = 0; i < server_capabilities_v2.nr; i++) {
 		const char *out;
@@ -479,7 +479,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     const struct string_list *server_options,
 			     int stateless_rpc)
 {
-	int i;
+	size_t i;
 	const char *hash_name;
 	struct strvec *ref_prefixes = transport_options ?
 		&transport_options->ref_prefixes : NULL;
diff --git a/fetch-pack.c b/fetch-pack.c
index 0bf7ed7e477..d20f9a046f7 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -945,7 +945,7 @@ static int get_pack(struct fetch_pack_args *args,
 	}
 
 	if (index_pack_args) {
-		int i;
+		size_t i;
 
 		for (i = 0; i < cmd.args.nr; i++)
 			strvec_push(index_pack_args, cmd.args.v[i]);
@@ -1674,7 +1674,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	}
 
 	for (i = 0; i < packfile_uris.nr; i++) {
-		int j;
+		size_t j;
 		struct child_process cmd = CHILD_PROCESS_INIT;
 		char packname[GIT_MAX_HEXSZ + 1];
 		const char *uri = packfile_uris.items[i].string +
diff --git a/ls-refs.c b/ls-refs.c
index 84021416ca5..9622bc46072 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -46,7 +46,7 @@ static void ensure_config_read(void)
  */
 static int ref_match(const struct strvec *prefixes, const char *refname)
 {
-	int i;
+	size_t i;
 
 	if (!prefixes->nr)
 		return 1; /* no restriction */
diff --git a/remote-curl.c b/remote-curl.c
index 1f177e86f11..682643bb842 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1288,7 +1288,7 @@ static int push_dav(struct strvec *specs)
 static int push_git(struct discovery *heads, struct strvec *specs)
 {
 	struct rpc_state rpc;
-	int i;
+	size_t i;
 	int err;
 	struct strvec args;
 	struct string_list_item *cas_option;
diff --git a/sequencer.c b/sequencer.c
index a4ba434f173..ca3ee2e4167 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -919,7 +919,7 @@ static char *get_author(const char *message)
 
 static const char *author_date_from_env_array(const struct strvec *env)
 {
-	int i;
+	size_t i;
 	const char *date;
 
 	for (i = 0; i < env->nr; i++)
diff --git a/serve.c b/serve.c
index f11c0e07c45..0271bf946cc 100644
--- a/serve.c
+++ b/serve.c
@@ -159,7 +159,7 @@ static int is_command(const char *key, struct protocol_capability **command)
 int has_capability(const struct strvec *keys, const char *capability,
 		   const char **value)
 {
-	int i;
+	size_t i;
 	for (i = 0; i < keys->nr; i++) {
 		const char *out;
 		if (skip_prefix(keys->v[i], capability, &out) &&
diff --git a/submodule.c b/submodule.c
index 8e611fe1dbf..38ee56782c7 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1606,7 +1606,7 @@ int fetch_populated_submodules(struct repository *r,
 			       int default_option,
 			       int quiet, int max_parallel_jobs)
 {
-	int i;
+	size_t i;
 	struct submodule_parallel_fetch spf = SPF_INIT;
 
 	spf.r = r;
-- 
2.33.0.998.ga4d44345d43


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair
  2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
@ 2021-09-12  0:36         ` Carlo Arenas
  2021-09-13  3:56           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 23+ messages in thread
From: Carlo Arenas @ 2021-09-12  0:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley

On Sat, Sep 11, 2021 at 5:16 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> We'll be changing the "int nr" in the "struct strvec" to "int size_t"

Did you mean "size_t nr"?

> -static int push_git(struct discovery *heads, int nr_spec, const char **specs)
> +static int push_git(struct discovery *heads, struct strvec *specs)
>  {
>         struct rpc_state rpc;
> -       int i, err;
> +       int i;

you are already splitting them, why not then change 'i' also to be a size_t?

Carlo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr"
  2021-09-12  0:15       ` [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr" Ævar Arnfjörð Bjarmason
@ 2021-09-12  2:57         ` Eric Sunshine
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Sunshine @ 2021-09-12  2:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King, Philip Oakley

On Sat, Sep 11, 2021 at 8:16 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> As in the preceding commit, prepare for the "nr" member of "struct
> strvec" changing from an "int" to a "size_t".
>
> Let's change this code added in f57696802c3 (rebase: really just
> passthru the `git am` options, 2018-11-14) so that it won't cause a
> bug if the "i" variable here is updated to be a "size_t" instead of an
> "int".
>
> The reason it would be buggy is because this for-loop relied on
> "counting down" from nr-1 to 0, we'll then decrement the variable one
> last time, so a value of -1 indicates that we've visited all
> elements. Since we're looking for a needle in the haystack having
> looked at all the hay means that the needle isn't there.

s/haystack/&,/

> Let's replace this with simpler code that loops overthe "struct
> strvec" and checks the current needle is "-q", if it isn't and we're
> doing a merge we die, otherwise we do a "REBASE_APPLY".

s/overthe/over the/

Nit: comma-splice[1] at `"-q",`; replace comma with semicolon or period.

[1]: https://lore.kernel.org/git/CAPx1GvfFPWvJsj+uJV7RZrv1rgEpio=pk6rKF2UrjHebVY=LPA@mail.gmail.com/

> We've been able to simplify this code since 8295ed690bf (rebase: make
> the backend configurable via config setting, 2020-02-15) when we
> stopped using this for anything except this one check, so let's do
> that and move the check into the loop itself.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t"
  2021-09-12  0:15       ` [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t" Ævar Arnfjörð Bjarmason
@ 2021-09-12  3:00         ` Eric Sunshine
  0 siblings, 0 replies; 23+ messages in thread
From: Eric Sunshine @ 2021-09-12  3:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King, Philip Oakley

On Sat, Sep 11, 2021 at 8:16 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Now that the strvec.h "int nr" has been changed to "size_t nr" in the
> preceding commit change various users of the API so that their local
> tracking (usually for-loop iteration) of the number of items in the
> array uses "size_t" as well.

s/change/changed/

> These were found by changing the "nr" member to a pointer temporarily,
> and manually looking at the codepaths that the compiler complained
> about.
>
> As argued in <YTIBnT8Ue1HZXs82@coredump.intra.peff.net>[1] these
> changes are not strictly necessary as a follow-up, but let's do them
> anyway so we don't need to wonder about the "size_t" v.s. "int"
> inconsistency going forward.
>
> As noted in <87v93i8svd.fsf@evledraar.gmail.com> in that thread we

s/>/>[2]/

> have various things that interact with the "int argc" passed from
> main() (e.g. setup_revisions()) and "struct strvec". Those things
> could consistently use "int" before, but will now use some mixture of
> "int" and "size_t". That's OK, but is the reason we're not converting
> all API users to use "size_t" here for their own copies of "nr", or
> when they pass that "nr" to other functions.
>
> 1. https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/
> 2. https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] strvec: use size_t to store nr and alloc
  2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
  2021-09-11 22:48   ` Philip Oakley
@ 2021-09-12 21:58   ` Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Jeff King @ 2021-09-12 21:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Derrick Stolee

On Sat, Sep 11, 2021 at 06:13:18PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I don't really think any of that needs to go into the commit message,
> > but if that's a hold-up, I can try to summarize it (though I think
> > referring to the commit which _already_ did this and was accidentally
> > reverted would be sufficient).
> 
> Thanks, I have a WIP version of this outstanding starting with this
> patch that I was planning to submit sometime, but I'm happy to have you
> pursue it, especially with the ~100 outstanding patches I have in
> master..seen.
> 
> It does feel somewhere between iffy and a landmine waiting to be stepped
> on to only convert the member itself, and not any of the corresponding
> "int" variables that track it to "size_t".

I don't think it's a landmine. If those other places are using "int" and
have trouble with a too-big strvec after the patch, then they were
generally already buggy before, too. I have poked around before for
cases where half-converting some code from size_t to int can possibly
make things worse, but it's pretty rare.

Meanwhile, strvec using an int has obvious and easy-to-trigger overflow
problems. Try this:

  yes '000aagent' | GIT_PROTOCOL=version=2 ./git-upload-pack .

where we will happily allocate an arbitrarily-large strvec. If you have
the patience and the RAM, this will overflow the alloc field. Luckily we
catch it before under-allocating the array. Because our ALLOC_GROW()
growth pattern is fixed, we'll always end up with a negative 32-bit int,
which gets cast to a very large size_t, and an st_mult() invocation
complains.

Much scarier to me is that string_list seems to be using "unsigned int"
for its counter, so it never goes negative! Try this:

  yes | git pack-objects --stdin-packs foo

which reads into a string_list. It takes even more RAM to overflow
because string_list is so horribly inefficient. But here once again
we're saved by a quirk of ALLOC_GROW() dating back to 4234a76167 (Extend
--pretty=oneline to cover the first paragraph,, 2007-06-11).

alloc_nr() will overflow to a small number when it hits around 2^32 / 3.
Before that patch, we'd then realloc a too-small buffer and have a heap
overflow. After that patch (I think in order to catch growth by more
than 1 element), we notice the case that alloc_nr() didn't give us a big
enough size, and extend it exactly to what was asked for. So no heap
overflow, though we do enter a quadratic growth loop. :-/

So I don't think we have any heap overflows here, which is good. But we
are protected by a fair bit of luck, and none of this would get nearly
as close to danger if we just used a sensible type for our allocation
sizes.

I do think we should separately fix the v2 protocol not to just allocate
arbitrary-sized strvecs on behalf of the user. I took a stab at that
this weekend and have a series I'm pretty happy with, but of course it
conflicts with your ab/serve-cleanup (and in fact, I ended up doing some
of those same cleanups, like not passing "keys" around). I'll see if I
can re-build it on top.

> If you do the change I suggested in
> https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
> find that there's at least one first-order reference to this that now
> uses "int" that if converted to "size_t" will result in a wrap-around
> error, we're lucky that one has a test failure.
> 
> I can tell you what that bug is, but maybe it's better if you find it
> yourself :) I.e. I found *that* one, but I'm not sure I found them
> all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
> errors & the code context (note: pointer, obviously broken, but makes
> the compiler yell).

Yes, having converted s/int/size_t/ for other structures before, you
have to be very careful of signedness. A safer conversion is ssize_t
(which similarly avoids overflow problems on LP64 systems).

I'm sure there are other overflow bugs lurking around use of strvecs, if
you really push them hard. But I care a lot less about something like
"oops, I accidentally overwrite list[0] instead of list[2^32-1]". Those
are bugs that normal people will never see, because they aren't doing
stupid or malicious things. I care a lot more about our allocating
functions being vulnerable to heap overflows from malicious attackers.

So I was really hoping to do the size_t fix separately. It's hard for it
to screw anything up, and it addresses the most vulnerable and important
use.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] strvec: use size_t to store nr and alloc
  2021-09-11 22:48   ` Philip Oakley
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
@ 2021-09-12 22:00     ` Jeff King
  2021-09-13 11:42       ` Philip Oakley
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2021-09-12 22:00 UTC (permalink / raw)
  To: Philip Oakley
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Derrick Stolee

On Sat, Sep 11, 2021 at 11:48:38PM +0100, Philip Oakley wrote:

> I'm particularly interested in the int -> size_t change problem as part
> of the wider 4GB limitations for the LLP64 systems [0] such as the
> RaspPi, git-lfs (on windows [1]), and Git-for-Windows[2]. It is a big
> problem.

Note that a lot of the Windows LLP64 problems are really a separate
issue. They come from a misuse of "unsigned long" as "gee, this should
be big enough for anything". Most of that is due to its use for object
sizes, which of course infected a whole bunch of other code.

Which isn't to say it's not important. But my main goal here was making
sure we use size_t for growth allocations to avoid integer overflow
leading to under-allocation (and thus heap overflow).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/7] strvec: use size_t to store nr and alloc
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                         ` (6 preceding siblings ...)
  2021-09-12  0:15       ` [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t" Ævar Arnfjörð Bjarmason
@ 2021-09-12 22:19       ` Jeff King
  2021-09-13  5:38         ` Junio C Hamano
  2021-09-13 10:47       ` Philip Oakley
  8 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2021-09-12 22:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Philip Oakley

On Sun, Sep 12, 2021 at 02:15:48AM +0200, Ævar Arnfjörð Bjarmason wrote:

> This is what I'd been sitting on locally since that recent thread, I
> polished it up a bit since Jeff King posted his version.
> 
> The potential overflow bug I mentioned is in rebase.c. See
> 5/7. "Potential" because it's not a bug now, but that code
> intentionally considers a strvec, and then iterates it from nr-1 to 0,
> and if it reaches 0 intentionally counts down one more to -1 to
> indicate that it's visited all elements.
> 
> We then check that with i >= 0, except of course if it becomes
> unsigned that doesn't become -1, but rather it wraps around.

You can also just use ssize_t, or you can compare against SIZE_MAX to
catch the wraparound (there's some prior art in sort_revindex()). That
said, I don't mind rewriting loops to count up rather than down. It
usually makes them easier to follow (and in your patch 5, I do not see
any reason we would need to count down rather than up; we do not even
care where we find "-q", only that we found it.

> The rest of this is all changes to have that s/int/size_t/ radiate
> outwards, i.e. when we assign that value to a variable somewhere its
> now a "size_t" instead of an "int" etc.

I'm a little "meh" on some of these, for a few reasons:

 - anything calling into setup_revisions() eventually is just kicking
   the can anyway. And these are generally not buggy in the first place,
   since they're bounded argv creations.

 - passing a strvec instead of the broken-down pair is a less flexible
   interface. It's one thing if the callee benefits from seeing the
   strvec (say, because they may push more items onto it). But I think
   with strbufs, we have a general guideline that if a function _can_
   take the bare pointer, then it should. (Sorry, I don't have a
   succinct reference to CodingGuidelines or anything like that; I feel
   like this is wisdom we came up with on the list in the early days of
   strbufs).

 - if we are going to pass a strvec, it should almost certainly be
   const, to make it clear how we intend to use it.

So if we we wanted to try to reduce the int/size_t conversions here (and
I don't mind doing it, but am not altogether sure it is a good use of
time, because the rabbit hole runs deep), I think we ought to be
switching to size_t everywhere-ish along whole call chains. Or possibly
providing a checked size_to_int() which will safely catch and abort.
These cases are largely stupid things that real people would never come
across. The real goal is making sure we don't get hit with a memory
safety bug (under-allocation, converting a big size_t to a negative int,
etc).

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair
  2021-09-12  0:36         ` Carlo Arenas
@ 2021-09-13  3:56           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-13  3:56 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: git, Junio C Hamano, Jeff King, Philip Oakley


On Sat, Sep 11 2021, Carlo Arenas wrote:

> On Sat, Sep 11, 2021 at 5:16 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> We'll be changing the "int nr" in the "struct strvec" to "int size_t"
>
> Did you mean "size_t nr"?

Yes, will fix.

>> -static int push_git(struct discovery *heads, int nr_spec, const char **specs)
>> +static int push_git(struct discovery *heads, struct strvec *specs)
>>  {
>>         struct rpc_state rpc;
>> -       int i, err;
>> +       int i;
>
> you are already splitting them, why not then change 'i' also to be a size_t?

The conversion to size_t comes later in this series, this is setting
that diff up to be smaller. I.e. we'll then to s/int i/size_t /g.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/7] strvec: use size_t to store nr and alloc
  2021-09-12 22:19       ` [PATCH v2 0/7] strvec: use size_t to store nr and alloc Jeff King
@ 2021-09-13  5:38         ` Junio C Hamano
  2021-09-13 12:29           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2021-09-13  5:38 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, git, Philip Oakley

Jeff King <peff@peff.net> writes:

> I'm a little "meh" on some of these, for a few reasons:
>
>  - anything calling into setup_revisions() eventually is just kicking
>    the can anyway. And these are generally not buggy in the first place,
>    since they're bounded argv creations.
>
>  - passing a strvec instead of the broken-down pair is a less flexible
>    interface. It's one thing if the callee benefits from seeing the
>    strvec (say, because they may push more items onto it). But I think
>    with strbufs, we have a general guideline that if a function _can_
>    take the bare pointer, then it should. (Sorry, I don't have a
>    succinct reference to CodingGuidelines or anything like that; I feel
>    like this is wisdom we came up with on the list in the early days of
>    strbufs).
>
>  - if we are going to pass a strvec, it should almost certainly be
>    const, to make it clear how we intend to use it.

All true.

> These cases are largely stupid things that real people would never come
> across. The real goal is making sure we don't get hit with a memory
> safety bug (under-allocation, converting a big size_t to a negative int,
> etc).

Yes.  

Ævar, I do not mean any disrespect to you, but I have to say that
topics like this one are starting to wear my concentration and
patience down really thin and making me really slow down.

Perhaps I am biased by not yet having seen what you eventually want
to build on top, and because of that I do not understood why and how
these "clean ups" are so valuable to have right now (as opposed to
just letting the sleeping dog lie), which you would of course have a
much better chance to know than I do.

But with that "bias", many of the recent patches from you look more
like pointless churn, mixed with fixes to theoretical problems, than
clean-ups with real benefits.

What makes it worse is that there are occasional real gems among
these "meh" patches, which means I have to read all of them anyway
to sift wheat from chaff X-<.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/7] strvec: use size_t to store nr and alloc
  2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
                         ` (7 preceding siblings ...)
  2021-09-12 22:19       ` [PATCH v2 0/7] strvec: use size_t to store nr and alloc Jeff King
@ 2021-09-13 10:47       ` Philip Oakley
  8 siblings, 0 replies; 23+ messages in thread
From: Philip Oakley @ 2021-09-13 10:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git; +Cc: Junio C Hamano, Jeff King

On 12/09/2021 01:15, Ævar Arnfjörð Bjarmason wrote:
> This is a proposed v2 of Jeff King's one-patch change to change
> strvec's nr/alloc from "int" to "size_t". As noted below I think it's
> worthwhile to not only change that in the struct, but also in code
> that directly references the "nr" member.
>
> On Sat, Sep 11 2021, Philip Oakley wrote:
>
>> On 11/09/2021 17:13, Ævar Arnfjörð Bjarmason wrote:
>>> On Sat, Sep 11 2021, Jeff King wrote:
>>>
>>>> We converted argv_array (which later became strvec) to use size_t in
>>>> 819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
>>>> order to avoid the possibility of integer overflow. But later, commit
>>>> d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
>>>> converted these back to ints!
>>>>
>>>> Those two commits were part of the same patch series. I'm pretty sure
>>>> what happened is that they were originally written in the opposite order
>>>> and then cleaned up and re-ordered during an interactive rebase. And
>>>> when resolving the inevitable conflict, I mistakenly took the "rename"
>>>> patch completely, accidentally dropping the type change.
>>>>
>>>> We can correct it now; better late than never.
>>>>
>>>> Signed-off-by: Jeff King <peff@peff.net>
>>>> ---
>>>> This was posted previously in the midst of another thread, but I don't
>>>> think was picked up. There was some positive reaction, but one "do we
>>>> really need this?" to which I responded in detail:
>>>>
>>>>   https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/
>>>>
>>>> I don't really think any of that needs to go into the commit message,
>>>> but if that's a hold-up, I can try to summarize it (though I think
>>>> referring to the commit which _already_ did this and was accidentally
>>>> reverted would be sufficient).
>>> Thanks, I have a WIP version of this outstanding starting with this
>>> patch that I was planning to submit sometime, but I'm happy to have you
>>> pursue it, especially with the ~100 outstanding patches I have in
>>> master..seen.
>>>
>>> It does feel somewhere between iffy and a landmine waiting to be stepped
>>> on to only convert the member itself, and not any of the corresponding
>>> "int" variables that track it to "size_t".
>>>
>>> If you do the change I suggested in
>>> https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
>>> find that there's at least one first-order reference to this that now
>>> uses "int" that if converted to "size_t" will result in a wrap-around
>>> error, we're lucky that one has a test failure.
>>>
>>> I can tell you what that bug is, but maybe it's better if you find it
>>> yourself :) I.e. I found *that* one, but I'm not sure I found them
>>> all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
>>> errors & the code context (note: pointer, obviously broken, but makes
>>> the compiler yell).
>>>
>>> That particular bug will be caught by the compiler as it involves a >= 0
>>> comparison against unsigned, but we may not not have that everywhere...
>> I'm particularly interested in the int -> size_t change problem as part
>> of the wider 4GB limitations for the LLP64 systems [0] such as the
>> RaspPi, git-lfs (on windows [1]), and Git-for-Windows[2]. It is a big
>> problem.
> Okey, fine, no fun excercise for the reader then ;)

There's a lot of weeds in there ;-)  In some ways it feels like the
SHA1->SHA256 transition, but without the consensus, as the 'shifting
foundations' problem only affect a subgroup of a subgroup (large Windows
files and repositories).
>
> This is what I'd been sitting on locally since that recent thread, I
> polished it up a bit since Jeff King posted his version.
>
> The potential overflow bug I mentioned is in rebase.c. See
> 5/7. "Potential" because it's not a bug now, but that code
> intentionally considers a strvec, and then iterates it from nr-1 to 0,
> and if it reaches 0 intentionally counts down one more to -1 to
> indicate that it's visited all elements.

It's these tidbits about how the problems surface, their detection and
resolution that are really useful. Along with general awareness raising.
At least here the issue is reasonably tightly focussed, and even then,
testing is hard.
>
> We then check that with i >= 0, except of course if it becomes
> unsigned that doesn't become -1, but rather it wraps around.
>
> The rest of this is all changes to have that s/int/size_t/ radiate
> outwards, i.e. when we assign that value to a variable somewhere its
> now a "size_t" instead of an "int" etc.

In the LLP64 case, I'm somewhat concerned about the possible pushback of
a wide spread s/int/size_t/ on the codebase's look & feel.
 (aside) I don't think there is even a `1S` to match the  the `1L` and
`1U` shorthands used in various places.

None of that is part of the series, but the patches are beneficial to
the codes portability.

>
>> [0]
>> http://nickdesaulniers.github.io/blog/2016/05/30/data-models-and-word-size/
>> [1] https://github.com/git-lfs/git-lfs/issues/2434  Git on Windows
>> client corrupts files > 4Gb
>> [2] https://github.com/git-for-windows/git/pull/2179  [DRAFT] for
>> testing : Fix 4Gb limit for large files on Git for Windows
> Jeff King (1):
>   strvec: use size_t to store nr and alloc
>
> Ævar Arnfjörð Bjarmason (6):
>   remote-curl: pass "struct strvec *" instead of int/char ** pair
>   pack-objects: pass "struct strvec *" instead of int/char ** pair
>   sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
>   upload-pack.c: pass "struct strvec *" instead of int/char ** pair
>   rebase: don't have loop over "struct strvec" depend on signed "nr"
>   strvec API users: change some "int" tracking "nr" to "size_t"
>
>  builtin/pack-objects.c |  6 +++---
>  builtin/rebase.c       | 26 ++++++++++++--------------
>  connect.c              |  8 ++++----
>  fetch-pack.c           |  4 ++--
>  ls-refs.c              |  2 +-
>  remote-curl.c          | 23 +++++++++++------------
>  sequencer.c            |  8 ++++----
>  sequencer.h            |  4 ++--
>  serve.c                |  2 +-
>  shallow.c              |  5 +++--
>  shallow.h              |  6 ++++--
>  strvec.h               |  4 ++--
>  submodule.c            |  2 +-
>  upload-pack.c          |  7 +++----
>  14 files changed, 53 insertions(+), 54 deletions(-)
>
> Range-diff against v1:
> -:  ----------- > 1:  2ef48d734e8 remote-curl: pass "struct strvec *" instead of int/char ** pair
> -:  ----------- > 2:  7f59a58ed97 pack-objects: pass "struct strvec *" instead of int/char ** pair
> -:  ----------- > 3:  c35cfb9c9c5 sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
> -:  ----------- > 4:  2e0b82d4316 upload-pack.c: pass "struct strvec *" instead of int/char ** pair
> -:  ----------- > 5:  be85a0565ef rebase: don't have loop over "struct strvec" depend on signed "nr"
> 1:  498f5ed80dc ! 6:  ba17290852c strvec: use size_t to store nr and alloc
>     @@ Commit message
>          We can correct it now; better late than never.
>      
>          Signed-off-by: Jeff King <peff@peff.net>
>     +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>      
>       ## strvec.h ##
>      @@ strvec.h: extern const char *empty_strvec[];
> -:  ----------- > 7:  2edd9708888 strvec API users: change some "int" tracking "nr" to "size_t"


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] strvec: use size_t to store nr and alloc
  2021-09-12 22:00     ` [PATCH] " Jeff King
@ 2021-09-13 11:42       ` Philip Oakley
  0 siblings, 0 replies; 23+ messages in thread
From: Philip Oakley @ 2021-09-13 11:42 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Derrick Stolee

On 12/09/2021 23:00, Jeff King wrote:
> On Sat, Sep 11, 2021 at 11:48:38PM +0100, Philip Oakley wrote:
>
>> I'm particularly interested in the int -> size_t change problem as part
>> of the wider 4GB limitations for the LLP64 systems [0] such as the
>> RaspPi, git-lfs (on windows [1]), and Git-for-Windows[2]. It is a big
>> problem.
> Note that a lot of the Windows LLP64 problems are really a separate
> issue. They come from a misuse of "unsigned long" as "gee, this should
> be big enough for anything". Most of that is due to its use for object
> sizes, which of course infected a whole bunch of other code.

I don't see it (root cause) as independent though. This is a nicely
separated issue (effect), which helps.

> Which isn't to say it's not important. But my main goal here was making
> sure we use size_t for growth allocations to avoid integer overflow
> leading to under-allocation (and thus heap overflow).

It's also been helpful in highlighting some of the wider issues and
approaches.

Thank you

Philip


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/7] strvec: use size_t to store nr and alloc
  2021-09-13  5:38         ` Junio C Hamano
@ 2021-09-13 12:29           ` Ævar Arnfjörð Bjarmason
  2021-09-13 17:20             ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-13 12:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git, Philip Oakley


On Sun, Sep 12 2021, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
>
>> I'm a little "meh" on some of these, for a few reasons:
>>
>>  - anything calling into setup_revisions() eventually is just kicking
>>    the can anyway. And these are generally not buggy in the first place,
>>    since they're bounded argv creations.
>>
>>  - passing a strvec instead of the broken-down pair is a less flexible
>>    interface. It's one thing if the callee benefits from seeing the
>>    strvec (say, because they may push more items onto it). But I think
>>    with strbufs, we have a general guideline that if a function _can_
>>    take the bare pointer, then it should. (Sorry, I don't have a
>>    succinct reference to CodingGuidelines or anything like that; I feel
>>    like this is wisdom we came up with on the list in the early days of
>>    strbufs).
>>
>>  - if we are going to pass a strvec, it should almost certainly be
>>    const, to make it clear how we intend to use it.
>
> All true.
>
>> These cases are largely stupid things that real people would never come
>> across. The real goal is making sure we don't get hit with a memory
>> safety bug (under-allocation, converting a big size_t to a negative int,
>> etc).
>
> Yes.  
>
> Ævar, I do not mean any disrespect to you, but I have to say that
> topics like this one are starting to wear my concentration and
> patience down really thin and making me really slow down.

I'm sorry. I'll try to lay off on the patch firehose until the delta
I've got in master..seen is way down from what it is now.

> Perhaps I am biased by not yet having seen what you eventually want
> to build on top, and because of that I do not understood why and how
> these "clean ups" are so valuable to have right now (as opposed to
> just letting the sleeping dog lie), which you would of course have a
> much better chance to know than I do.

I could go into that, but it's probably best to leave it at: Yeah for
these seemingly going nowhere small changes I'm generally taking them
somewhere interesting.

But figuring out how to split that is still a hard problem. E.g. I've
had one series (the fsck error message improvements) that's been stalled
for ~3 months due to size/including the "interesting" part, but recently
relatively small increments of prep code changes seem to get a lot of
review traction.

As for this strvec.h s/int/size_t/ topic. I'm not taking that anywhere,
Jeff suggested it and came up with the patch, I figured more helpful
than "if we change s/int/size_t/g for x, shouldn't we change that for y
which whe assign x to?" would be patches I had to do that, which I'd
come up with after Jeff suggested this direction in response to another
topic.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/7] strvec: use size_t to store nr and alloc
  2021-09-13 12:29           ` Ævar Arnfjörð Bjarmason
@ 2021-09-13 17:20             ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2021-09-13 17:20 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Junio C Hamano, git, Philip Oakley

On Mon, Sep 13, 2021 at 02:29:01PM +0200, Ævar Arnfjörð Bjarmason wrote:

> As for this strvec.h s/int/size_t/ topic. I'm not taking that anywhere,
> Jeff suggested it and came up with the patch, I figured more helpful
> than "if we change s/int/size_t/g for x, shouldn't we change that for y
> which whe assign x to?" would be patches I had to do that, which I'd
> come up with after Jeff suggested this direction in response to another
> topic.

I'm not inherently opposed to further int/size_t cleanups. But the
trouble is that my single patch stands on its own as an improvement to a
real issue, and does not (as far as I know) have any functional
downsides (either known or even hypothetical, aside from the obvious
mismatch that some callers will still use "int").

But doing wide-spread int/size_t conversion has less obvious immediate
benefit, is much easier to get wrong, and may introduce further
complications (e.g., differences of opinion in whether we should be
passing strvecs around more, or just using size_t in more places).

So I don't mind a series in that direction (though I don't necessarily
think it is the best use of time), but I'd prefer not to see my original
patch tied up in it.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-09-13 17:20 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-11 15:01 [PATCH] strvec: use size_t to store nr and alloc Jeff King
2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
2021-09-11 22:48   ` Philip Oakley
2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
2021-09-12  0:36         ` Carlo Arenas
2021-09-13  3:56           ` Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 2/7] pack-objects: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 3/7] sequencer.[ch]: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 4/7] upload-pack.c: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr" Ævar Arnfjörð Bjarmason
2021-09-12  2:57         ` Eric Sunshine
2021-09-12  0:15       ` [PATCH v2 6/7] strvec: use size_t to store nr and alloc Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t" Ævar Arnfjörð Bjarmason
2021-09-12  3:00         ` Eric Sunshine
2021-09-12 22:19       ` [PATCH v2 0/7] strvec: use size_t to store nr and alloc Jeff King
2021-09-13  5:38         ` Junio C Hamano
2021-09-13 12:29           ` Ævar Arnfjörð Bjarmason
2021-09-13 17:20             ` Jeff King
2021-09-13 10:47       ` Philip Oakley
2021-09-12 22:00     ` [PATCH] " Jeff King
2021-09-13 11:42       ` Philip Oakley
2021-09-12 21:58   ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.