From: "Martin Ågren" <martin.agren@gmail.com>
To: Git Mailing List <git@vger.kernel.org>
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: What's cooking in git.git (Jan 2019, #01; Mon, 7)
Date: Wed, 9 Jan 2019 22:06:08 +0100
Message-ID: <CAN0heSqLUWpwRdeUvYj2KnDX-QxSOnWOdKWz77RjHKJ3AFUGEQ@mail.gmail.com>
In-Reply-To: <CAN0heSoRYYS3-UAamE9nibhORPoD+_TRHu5-ZTeYxYMS4BAnrA@mail.gmail.com>

On Wed, 9 Jan 2019 at 08:37, Martin Ågren <martin.agren@gmail.com> wrote:
>
> On Tue, 8 Jan 2019 at 00:34, Junio C Hamano <gitster@pobox.com> wrote:
> > * bc/sha-256 (2018-11-14) 12 commits
> >  - hash: add an SHA-256 implementation using OpenSSL
> >  - sha256: add an SHA-256 implementation using libgcrypt
> >  - Add a base implementation of SHA-256 support
> >  - commit-graph: convert to using the_hash_algo
> >  - t/helper: add a test helper to compute hash speed
> >  - sha1-file: add a constant for hash block size
> >  - t: make the sha1 test-tool helper generic
> >  - t: add basic tests for our SHA-1 implementation
> >  - cache: make hashcmp and hasheq work with larger hashes
> >  - hex: introduce functions to print arbitrary hashes
> >  - sha1-file: provide functions to look up hash algorithms
> >  - sha1-file: rename algorithm to "sha1"
> >
> >  Add sha-256 hash and plug it through the code to allow building Git
> >  with the "NewHash".
>
> AddressSanitizer barks at current pu (855f98be272f19d16564e) for a
> handful of tests.
>
> One example is t5702-protocol-v2.sh. [...]
>
> ==1691823==ERROR: AddressSanitizer: heap-buffer-overflow on address
> 0x6040000004f2 at pc 0x0000004ea0fd bp 0x7ffc53082590 sp
> 0x7ffc53081d40
> READ of size 32 at 0x6040000004f2 thread T0
>     #0 0x4ea0fc in __asan_memcpy
> llvm/projects/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cc:23
>     #1 0x8603ec in oidset_insert oidset.c
>     #2 0x86c977 in add_promisor_object packfile.c:2129:4
>     #3 0x86c07a in for_each_object_in_pack packfile.c:2070:7
>     #4 0x86c535 in for_each_packed_object packfile.c:2095:7
>     #5 0x86c651 in is_promisor_object packfile.c:2151:4

> 0x6040000004f2 is located 0 bytes to the right of 34-byte region
> [0x6040000004d0,0x6040000004f2)
> allocated by thread T0 here:
>     #0 0x4eb4cf in malloc
> llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146
>     #1 0x9fa1db in do_xmalloc wrapper.c:60:8
>     #2 0x9fa2fd in do_xmallocz wrapper.c:100:8
>     #3 0x9fa2fd in xmallocz_gently wrapper.c:113
>     #4 0x86a877 in unpack_compressed_entry packfile.c:1588:11
>     #5 0x86a02e in unpack_entry packfile.c:1737:11
>     #6 0x867431 in cache_or_unpack_entry packfile.c:1439:10
>     #7 0x867431 in packed_object_info packfile.c:1506
>     #8 0x96b7be in oid_object_info_extended sha1-file.c:1394:10
>     #9 0x96d7d0 in read_object sha1-file.c:1434:6
>     #10 0x96d7d0 in read_object_file_extended sha1-file.c:1476
>     #11 0x85cf40 in repo_read_object_file ./object-store.h:174:9
>     #12 0x85cf40 in parse_object object.c:273
>     #13 0x86c752 in add_promisor_object packfile.c:2108:23
>     #14 0x86c07a in for_each_object_in_pack packfile.c:2070:7
>     #15 0x86c535 in for_each_packed_object packfile.c:2095:7
>     #16 0x86c651 in is_promisor_object packfile.c:2151:4

I found some more time to look into this.

It seems we have a buffer with raw data and we set up a `struct
object_id *` pointing into it, at a (supposed) OID value. Then
`update_tree_entry_internal()` verifies that the buffer contains
sufficiently many bytes, i.e., at least `the_hash_algo->rawsz` (=20).
We then immediately call `oidset_insert()`, which copies an entire
struct, i.e., `sizeof(struct object_id)` (=32) bytes. That is 12 bytes
more than what is known to be safe. For this particular input data, we
read outside allocated memory.
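
To make the arithmetic above concrete, here is a minimal sketch. It is
not Git code: `demo_oid`, `demo_overread()` and the `DEMO_*` constants
are hypothetical stand-ins for `struct object_id`, the 20-byte SHA-1
`rawsz` and the 32-byte maximum, and the buffer sizes in the note below
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SHA1_RAWSZ 20 /* stand-in for the_hash_algo->rawsz with SHA-1 */
#define DEMO_MAX_RAWSZ  32 /* stand-in for the largest supported hash size */

/* Hypothetical stand-in for struct object_id: sized for the largest hash. */
struct demo_oid {
	unsigned char hash[DEMO_MAX_RAWSZ];
};

/*
 * Return how many bytes a copy of `copy_len` bytes starting at
 * `oid_offset` would read past the end of a `buf_len`-byte buffer.
 * Returns 0 if the copy stays inside the buffer.
 */
static size_t demo_overread(size_t buf_len, size_t oid_offset, size_t copy_len)
{
	size_t available;

	assert(oid_offset <= buf_len);
	available = buf_len - oid_offset;
	return copy_len > available ? copy_len - available : 0;
}
```

With a hypothetical 33-byte buffer whose OID starts at offset 13,
`demo_overread(33, 13, DEMO_SHA1_RAWSZ)` is 0, so the check passes,
whereas `demo_overread(33, 13, sizeof(struct demo_oid))` is 12, which
is the unchecked tail that a whole-struct copy reads.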

I can think of three possible approaches:

* Allocate with a margin (GIT_MAX_RAWSZ - the_hash_algo->rawsz) where
  "necessary" (TM). Maybe not so maintainable.

* Teach `oidset_insert()` (i.e., khash) to only copy
  `the_hash_algo->rawsz` bytes. Maybe not so good for performance.

* Ignore.

I wonder which of these is the least awful, or if there are other ideas.
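
As a rough illustration of the "only copy `rawsz` bytes" idea, the
sketch below copies a bounded number of bytes out of the raw buffer
into a full-size struct and zeroes the rest, so that any later
whole-struct copy reads only memory we own. This is an assumption
about how such a copy could look, not how khash or `oidset_insert()`
works; `sketch_oid` and `sketch_oid_from_raw()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_RAWSZ 32 /* stand-in for the largest supported hash size */

/* Hypothetical stand-in for struct object_id. */
struct sketch_oid {
	unsigned char hash[SKETCH_MAX_RAWSZ];
};

/*
 * Copy exactly `rawsz` bytes from `src` (the amount the caller has
 * verified to be present) and zero the remainder of the struct, so
 * that whole-struct copies and comparisons stay well defined.
 */
static void sketch_oid_from_raw(struct sketch_oid *dst,
				const unsigned char *src, size_t rawsz)
{
	assert(rawsz <= SKETCH_MAX_RAWSZ);
	memcpy(dst->hash, src, rawsz);
	memset(dst->hash + rawsz, 0, SKETCH_MAX_RAWSZ - rawsz);
}
```

The cost is an extra copy and a size that is only known at run time,
which is presumably where the performance worry comes from.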

Martin
