From: Jeff King <peff@peff.net>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [RFC PATCH] unpack-trees: watch for out-of-range index position
Date: Fri, 10 Jan 2020 01:37:41 -0500
Message-ID: <20200110063741.GA409153@coredump.intra.peff.net>
In-Reply-To: <20200109224641.GF181522@google.com>

On Thu, Jan 09, 2020 at 02:46:41PM -0800, Emily Shaffer wrote:

> > Perhaps. The integrity check only protects against an index that was
> > modified after the fact, not one that was generated by a buggy Git. I'm
> > not sure we know how the index that led to this patch got into this
> > state (though it sounds like Emily has a copy and could check the hash
> > on it), but the other cache-tree segfault I found recently was with an index
> > with an intact integrity hash.
> 
> Yeah, I can do that, although I'm not sure how. The index itself is very
> small - it only contains one file and one tree extension - so I'll go
> ahead and paste some poking and prodding, and if it's not what you
> wanted then please let me know what else to run.

I was thinking you would run something like:

  size=$(stat --format=%s "$file")
  actual=$(head -c $(($size-20)) "$file" | sha1sum | awk '{print $1}')
  expect=$(xxd -s -20 -g 20 -c 20 "$file" | awk '{print $2}')
  if test "$actual" = "$expect"; then
          echo "OK ($actual)"
  else
          echo "FAIL ($actual != $expect)"
  fi

to manually check the sha1. But...

>   $ g fsck --cache
>   Checking object directories: 100% (256/256), done.
>   Checking objects: 100% (20/20), done.
>   broken link from  commit 153a9a100eae7fdba5989ce39a5dd1782075517f
>                 to  commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
>   broken link from  commit 7d6bb91e31d18eadfaf855a9fb7ad6ba81b8b6d9
>                 to  commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
>   missing commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
>   missing commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
>   dangling commit 5e2c635433bc46b13061b276e481f63b1f6642c8

...fsck would have reported a problem there, since we explicitly kept
the checksum verification in fsck in a33fc72fe9 (read-cache:
force_verify_index_checksum, 2017-04-14).

And just to be double-sure, I used this:

>   $ hexdump -C .git/index
>   00000000  44 49 52 43 00 00 00 02  00 00 00 01 5d 89 5e 22  |DIRC........].^"|
>   00000010  23 bf a3 c4 5d 89 5e 22  23 bf a3 c4 00 00 fe 02  |#...].^"#.......|
>   00000020  02 c8 f5 83 00 00 81 a4  00 06 c1 dc 00 01 5f 53  |.............._S|
>   00000030  00 00 06 b3 78 88 a4 f4  22 34 7d ad b0 c4 73 0f  |....x..."4}...s.|
>   00000040  c5 bc f6 ea 1d 2d f0 3a  00 09 52 45 41 44 4d 45  |.....-.:..README|
>   00000050  2e 6d 64 00 54 52 45 45  00 00 00 3a 00 31 37 20  |.md.TREE...:.17 |
>   00000060  31 0a da 7f 67 25 40 7d  4e ce 9f d3 72 ce 4c e8  |1...g%@}N...r.L.|
>   00000070  40 6d 5d ad e9 79 67 69  74 6c 69 6e 74 00 34 20  |@m]..ygitlint.4 |
>   00000080  30 0a 93 63 25 17 69 e6  d6 92 78 97 55 4b 0f 8b  |0..c%.i...x.UK..|
>   00000090  ff a0 e8 2d 6d 71 32 d1  69 fc f2 38 42 f8 5a 6e  |...-mq2.i..8B.Zn|
>   000000a0  05 35 d6 94 41 c0 9f c7  ba 43                    |.5..A....C|
>   000000aa

to reconstruct the file and check its sha1, and indeed it is fine.
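
A minimal sketch of how such a reconstruction could be done, not
necessarily the exact commands used here. It assumes the dump is in
canonical "hexdump -C" layout with no "*" squeezed lines, and that xxd,
sha1sum, and a head/tail that accept -c are available. The function
names and file names are hypothetical.

```shell
#!/bin/sh

# Rebuild a binary file from "hexdump -C" output read on stdin.
# Leading "> " mail quoting is tolerated. Assumes no "*" lines
# (hexdump prints those when it squeezes repeated rows).
hexdump_to_bin() {
	sed -e 's/^[> ]*//' \
	    -e 's/|.*$//' \
	    -e 's/^[0-9a-fA-F]*//' |
	xxd -r -p
}

# Succeed if the trailing 20 bytes of the file named by $1 equal
# the SHA-1 of everything that precedes them.
check_index_sha1() {
	file=$1
	size=$(wc -c <"$file")
	actual=$(head -c $((size - 20)) "$file" | sha1sum | awk '{print $1}')
	expect=$(tail -c 20 "$file" | xxd -p)
	test "$actual" = "$expect"
}

# Example use, with hypothetical file names:
#   hexdump_to_bin <index.hexdump >index.rebuilt
#   check_index_sha1 index.rebuilt && echo OK || echo FAIL
```

For the dump quoted above the file is 0xaa bytes long, so the stored
checksum is the last 20 bytes, at offsets 0x96 through 0xa9.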

So this bogus index was probably actually created by Git, not an
after-the-fact byte corruption.

-Peff
