All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>, git@vger.kernel.org
Subject: Re: [PATCH 0/7] PREVIEW: Introduce DC_AND_OPENSSL_SHA1 make flag
Date: Sun, 26 Mar 2017 02:18:26 -0400	[thread overview]
Message-ID: <20170326061826.yx6nh3k2ps6uyyz6@sigill.intra.peff.net> (raw)
In-Reply-To: <xmqq7f3d6ev1.fsf@gitster.mtv.corp.google.com>

On Fri, Mar 24, 2017 at 11:37:54PM -0700, Junio C Hamano wrote:

> The hash that names a packfile is constructed by sorting all the
> names of the objects contained in the packfile and running SHA-1
> hash over it.  I think this MUST be hashed with collision-attack
> detection.  A malicious site can feed you a packfile that contains
> objects the site crafts so that the sorted object names would result
> in a collision-attack, ending up with one pack that contains a sets
> of objects different from another pack that happens to have the same
> packname, causing Git to say "Ah, this new pack must have the same
> set of objects as the pack we already have" and discard it,
> resulting in lost objects and a corrupt repository with missing
> objects.

I don't think this case really matters for collision detection. What's
important is what Git does when it receives a brand-new packfile that
would overwrite an existing one. It _should_ keep the old one, under the
usual "existing data wins" rule.

It should be easy to test, though:

  $ git init tmp && cd tmp
  $ git commit --allow-empty -m foo
  $ git gc

  $ touch -d yesterday .git/objects/pack/*
  $ ls -l .git/objects/pack
  -r--r--r-- 1 peff peff 1128 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.idx
  -r--r--r-- 1 peff peff  153 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.pack

  $ git index-pack --stdin <.git/objects/pack/*.pack
  pack	7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b

  $ ls -l .git/objects/pack
  -r--r--r-- 1 peff peff 1128 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.idx
  -r--r--r-- 1 peff peff  153 Mar 25 02:10 pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.pack

Looks like the timestamps were retained. <phew> And if we use strace, we
can see what happens:

  $ strace git index-pac k--stdin <.git/objects/pack/*.pack
  link(".git/objects/pack/tmp_pack_YSrdWU", ".git/objects/pack/pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.pack") = -1 EEXIST (File exists)
  unlink(".git/objects/pack/tmp_pack_YSrdWU") = 0
  link(".git/objects/pack/tmp_idx_O94NNU", ".git/objects/pack/pack-7e9d64ac27adc9ce1b12774dd287ff9bd8a9345b.idx") = -1 EEXIST (File exists)
  unlink(".git/objects/pack/tmp_idx_O94NNU") = 0

This is due to the link()/EEXIST handling in finalize_object_file. It
has a FIXME for a collision check, so we could actually detect at that
point whether we have a real collision, or if the other side just
happened to send us the same pack.

I wouldn't be surprised if the dumb-http walker is not so careful,
though (but the solution is to make it careful, not to worry about
a weak hash algorithm).

The rest of your email all made sense to me.

-Peff

  parent reply	other threads:[~2017-03-26  6:25 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-24 23:24 [PATCH 0/7] PREVIEW: Introduce DC_AND_OPENSSL_SHA1 make flag Johannes Schindelin
2017-03-24 23:24 ` [PATCH 1/7] sha1dc: safeguard against outside definitions of BIGENDIAN Johannes Schindelin
2017-03-24 23:24 ` [PATCH 2/7] Makefile: optionally compile with both SHA1DC and SHA1_OPENSSL Johannes Schindelin
2017-03-25 19:51   ` Ævar Arnfjörð Bjarmason
2017-03-30 16:16   ` Junio C Hamano
2017-03-30 16:47     ` Junio C Hamano
2017-04-18 11:28     ` Johannes Schindelin
2017-03-24 23:24 ` [PATCH 3/7] config: add the core.enablesha1dc setting Johannes Schindelin
2017-03-24 23:25 ` [PATCH 4/7] t0013: do not skip the entire file wholesale without DC_SHA1 Johannes Schindelin
2017-03-24 23:25 ` [PATCH 5/7] t0013: test DC_AND_OPENSSL_SHA1, too Johannes Schindelin
2017-03-24 23:28 ` [PATCH 6/7] mingw: enable DC_AND_OPENSSL_SHA1 by default Johannes Schindelin
2017-03-24 23:28 ` [PATCH 7/7] p0013: new test to compare SHA1DC vs OpenSSL Johannes Schindelin
2017-03-25  6:37 ` [PATCH 0/7] PREVIEW: Introduce DC_AND_OPENSSL_SHA1 make flag Junio C Hamano
2017-03-25 16:58   ` Junio C Hamano
2017-03-26  6:18   ` Jeff King [this message]
2017-03-26 23:16     ` Junio C Hamano
2017-03-27  1:11       ` Jeff King
2017-03-27  6:07         ` Junio C Hamano
2017-03-27  7:09           ` Jeff King
2017-03-27 17:15             ` Junio C Hamano
2017-03-29 20:02   ` Johannes Schindelin
2017-03-30  0:31     ` Junio C Hamano
2017-04-18 11:30       ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170326061826.yx6nh3k2ps6uyyz6@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.