All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH] strvec: use size_t to store nr and alloc
Date: Sun, 12 Sep 2021 17:58:32 -0400	[thread overview]
Message-ID: <YT54CNYgtGcqexwq@coredump.intra.peff.net> (raw)
In-Reply-To: <87o88z82pc.fsf@evledraar.gmail.com>

On Sat, Sep 11, 2021 at 06:13:18PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I don't really think any of that needs to go into the commit message,
> > but if that's a hold-up, I can try to summarize it (though I think
> > referring to the commit which _already_ did this and was accidentally
> > reverted would be sufficient).
> 
> Thanks, I have a WIP version of this outstanding starting with this
> patch that I was planning to submit sometime, but I'm happy to have you
> pursue it, especially with the ~100 outstanding patches I have in
> master..seen.
> 
> It does feel somewhere between iffy and a landmine waiting to be stepped
> on to only convert the member itself, and not any of the corresponding
> "int" variables that track it to "size_t".

I don't think it's a landmine. If those other places are using "int" and
have trouble with a too-big strvec after the patch, then they were
generally already buggy before, too. I have poked around before for
cases where half-converting some code from size_t to int can possibly
make things worse, but it's pretty rare.

Meanwhile, strvec using an int has obvious and easy-to-trigger overflow
problems. Try this:

  yes '000aagent' | GIT_PROTOCOL=version=2 ./git-upload-pack .

where we will happily allocate an arbitrarily-large strvec. If you have
the patience and the RAM, this will overflow the alloc field. Luckily we
catch it before under-allocating the array. Because our ALLOC_GROW()
growth pattern is fixed, we'll always end up with a negative 32-bit int,
which gets cast to a very large size_t, and an st_mult() invocation
complains.

Much scarier to me is that string_list seems to be using "unsigned int"
for its counter, so it never goes negative! Try this:

  yes | git pack-objects --stdin-packs foo

which reads into a string_list. It takes even more RAM to overflow
because string_list is so horribly inefficient. But here once again
we're saved by a quirk of ALLOC_GROW() dating back to 4234a76167 (Extend
--pretty=oneline to cover the first paragraph,, 2007-06-11).

alloc_nr() will overflow to a small number when it hits around 2^32 / 3.
Before that patch, we'd then realloc a too-small buffer and have a heap
overflow. After that patch (I think in order to catch growth by more
than 1 element), we notice the case that alloc_nr() didn't give us a big
enough size, and extend it exactly to what was asked for. So no heap
overflow, though we do enter a quadratic growth loop. :-/

So I don't think we have any heap overflows here, which is good. But we
are protected by a fair bit of luck, and none of this would get nearly
as close to danger if we just used a sensible type for our allocation
sizes.

I do think we should separately fix the v2 protocol not to just allocate
arbitrary-sized strvecs on behalf of the user. I took a stab at that
this weekend and have a series I'm pretty happy with, but of course it
conflicts with your ab/serve-cleanup (and in fact, I ended up doing some
of those same cleanups, like not passing "keys" around). I'll see if I
can re-build it on top.

> If you do the change I suggested in
> https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ you'll
> find that there's at least one first-order reference to this that now
> uses "int" that if converted to "size_t" will result in a wrap-around
> error, we're lucky that one has a test failure.
> 
> I can tell you what that bug is, but maybe it's better if you find it
> yourself :) I.e. I found *that* one, but I'm not sure I found them
> all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
> errors & the code context (note: pointer, obviously broken, but makes
> the compiler yell).

Yes, having converted s/int/size_t/ for other structures before, you
have to be very careful of signedness. A safer conversion is ssize_t
(which similarly avoids overflow problems on LP64 systems).

I'm sure there are other overflow bugs lurking around use of strvecs, if
you really push them hard. But I care a lot less about something like
"oops, I accidentally overwrite list[0] instead of list[2^32-1]". Those
are bugs that normal people will never see, because they aren't doing
stupid or malicious things. I care a lot more about our allocating
functions being vulnerable to heap overflows from malicious attackers.

So I was really hoping to do the size_t fix separately. It's hard for it
to screw anything up, and it addresses the most vulnerable and important
use.

-Peff

      parent reply	other threads:[~2021-09-12 21:58 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-11 15:01 [PATCH] strvec: use size_t to store nr and alloc Jeff King
2021-09-11 16:13 ` Ævar Arnfjörð Bjarmason
2021-09-11 22:48   ` Philip Oakley
2021-09-12  0:15     ` [PATCH v2 0/7] " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 1/7] remote-curl: pass "struct strvec *" instead of int/char ** pair Ævar Arnfjörð Bjarmason
2021-09-12  0:36         ` Carlo Arenas
2021-09-13  3:56           ` Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 2/7] pack-objects: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 3/7] sequencer.[ch]: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 4/7] upload-pack.c: " Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 5/7] rebase: don't have loop over "struct strvec" depend on signed "nr" Ævar Arnfjörð Bjarmason
2021-09-12  2:57         ` Eric Sunshine
2021-09-12  0:15       ` [PATCH v2 6/7] strvec: use size_t to store nr and alloc Ævar Arnfjörð Bjarmason
2021-09-12  0:15       ` [PATCH v2 7/7] strvec API users: change some "int" tracking "nr" to "size_t" Ævar Arnfjörð Bjarmason
2021-09-12  3:00         ` Eric Sunshine
2021-09-12 22:19       ` [PATCH v2 0/7] strvec: use size_t to store nr and alloc Jeff King
2021-09-13  5:38         ` Junio C Hamano
2021-09-13 12:29           ` Ævar Arnfjörð Bjarmason
2021-09-13 17:20             ` Jeff King
2021-09-13 10:47       ` Philip Oakley
2021-09-12 22:00     ` [PATCH] " Jeff King
2021-09-13 11:42       ` Philip Oakley
2021-09-12 21:58   ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YT54CNYgtGcqexwq@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.