From: Junio C Hamano <junkio@cox.net>
To: git@vger.kernel.org
Subject: Re: [PATCH] Fix packname hash generation.
Date: Wed, 12 Oct 2005 19:46:28 -0700
Message-ID: <7v3bn6877f.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: 7vslv6b86l.fsf_-_@assigned-by-dhcp.cox.net

Junio C Hamano <junkio@cox.net> writes:

> This changes the generation of hash packfiles have in their names, from
> "hash of object names as fed to us" to "hash of object names in the
> resulting pack, in the order they appear in the index file".  The new
> "git-index-pack" command is taught to output the computed hash value
> to its standard output.

In case it was not obvious, this is not a backward-incompatible
change.  Your existing packs will remain valid after this change.

What those 40-byte hashes were buying us was that we did not
have to worry about name clashes.  We could have said "these two
packs have the same name so they must have the same set of
objects", but there is no tool that relies on this fact.  We
could not even say "these two packs have different names so the
set of objects contained by them must be different" -- the
resulting pack name depended on the order of objects fed to
git-pack-objects, even if you fed the same set of objects.
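
To make the ordering point concrete, here is a minimal sketch.  It
is an illustration only: hash_as_fed and hash_sorted are made-up
helper names, sha1sum from coreutils is assumed to be available, and
the sketch hashes hex text one name per line, which is not claimed
to be byte-for-byte what git-pack-objects feeds to SHA1.

```shell
#!/bin/sh
# Illustration only: hash a list of object names in the order given,
# and again after sorting.  Both helpers are hypothetical, not git
# commands.

hash_as_fed () {
    printf '%s\n' "$@" | sha1sum | cut -d' ' -f1
}

hash_sorted () {
    printf '%s\n' "$@" | LC_ALL=C sort | sha1sum | cut -d' ' -f1
}

# Two made-up object names.
a=1111111111111111111111111111111111111111
b=2222222222222222222222222222222222222222

# Same set, different feed order: the "as fed" hashes differ ...
test "$(hash_as_fed $a $b)" != "$(hash_as_fed $b $a)" &&
echo "as fed: order changes the hash"

# ... while hashing in sorted order yields one hash per set.
test "$(hash_sorted $a $b)" = "$(hash_sorted $b $a)" &&
echo "sorted: order does not matter"
```

With a sorted order, equal names can at least be expected to mean
equal sets of objects, which the old scheme could not offer.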

The really core part never cared about how packfiles and their
indices are named.  The only restrictions were that they live
immediately under .git/objects/pack/, have .pack and .idx suffixes
respectively, and that their basenames match each other.
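
The pairing restriction above can be sketched as a small check.
This is a hedged example, not something git ships:
check_pack_pairs is a hypothetical helper that takes the pack
directory as its only argument and looks only at the .pack and .idx
suffixes.

```shell
#!/bin/sh
# Sketch: report any .pack without a matching .idx, and vice versa.
# check_pack_pairs is a hypothetical helper, not part of git.

check_pack_pairs () {
    dir=$1
    status=0
    for f in "$dir"/*.pack "$dir"/*.idx
    do
        # An unmatched glob stays literal; skip it.
        test -f "$f" || continue
        case "$f" in
        *.pack) other=${f%.pack}.idx ;;
        *.idx)  other=${f%.idx}.pack ;;
        esac
        test -f "$other" || {
            echo >&2 "No counterpart for $f"
            status=1
        }
    done
    return $status
}
```

It could be run as check_pack_pairs "$GIT_OBJECT_DIRECTORY/pack";
the return status is non-zero when any file lacks its counterpart.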

The commit walkers (anything that links with fetch.c) impose
another limitation: their basenames must be "pack-" followed by
40 hexadecimal digits.  But they do not check if the name
is consistent with the set of objects in the pack (checking it
was computationally infeasible for huge packs in the previous
hashing mechanism -- you would have to feed all permutations of
objects contained in the pack to SHA1 hash and see if any
produces the same hash as the pack name).  We _could_ now do
this additional check if we wanted to (the same goes for the
really core part in sha1_file.c::check_packed_git_idx()).
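
As a rough illustration of the name shape the commit walkers expect,
the following sketch tests a file name against "pack-" plus 40
hexadecimal digits plus ".pack".  is_walker_pack_name is a
hypothetical helper, and the lowercase-only hex check is an
assumption of this sketch, not a statement about what fetch.c
accepts.  It checks the shape of the name only, not whether the name
agrees with the pack's contents.

```shell
#!/bin/sh
# Sketch: does a file name look like pack-<40 hex digits>.pack?
# is_walker_pack_name is a hypothetical helper, not part of git.

is_walker_pack_name () {
    case "$1" in
    pack-*.pack) ;;
    *) return 1 ;;
    esac
    hex=${1#pack-}
    hex=${hex%.pack}
    # Exactly 40 characters ...
    test ${#hex} -eq 40 || return 1
    # ... none of which falls outside lowercase hex (assumption).
    case "$hex" in
    *[!0-9a-f]*) return 1 ;;
    esac
    return 0
}

for name in "$@"
do
    is_walker_pack_name "$name" ||
    echo >&2 "Not a walker-style pack name: $name"
done
```

Run with a list of basenames as arguments, it complains about any
that a commit walker would not recognize by name.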

In short, it does not matter if your existing packs are named
using the old hashing mechanism.  They will continue to be
valid.

But if you really care about consistency, here is an easy way to
rename your existing packs to the names the new hashing scheme
would produce.

#!/bin/sh

: ${GIT_DIR=.git}
: ${GIT_OBJECT_DIRECTORY="${GIT_DIR}/objects"}

O="$GIT_OBJECT_DIRECTORY"
P="$GIT_OBJECT_DIRECTORY/pack"
for existing in `cd "$GIT_OBJECT_DIRECTORY" &&
		 find pack -name '*.pack' -print`
do
    idx=`expr "$existing" : '\(.*\)\.pack$'`.idx &&
    test -f "$O/$idx" || {
        echo >&2 "Missing idx $idx?"
        continue
    }
    new=`git-index-pack -o tmp-idx "$O/$existing"` || {
        echo >&2 "Corrupt pack $existing?"
        continue
    }
    # index generated for an existing pack should match.
    cmp "$O/$idx" tmp-idx || {
        echo >&2 "Corrupt idx $idx?"
        continue
    }
    if test "pack/pack-$new.pack" = "$existing"
    then
        echo >&2 "Already converted $existing."
        continue
    fi
    if test -f "$P/pack-$new.pack" || test -f "$P/pack-$new.idx"
    then
        echo >&2 "Name clash! $new"
        continue
    fi
    mv "$O/$existing" "$P/pack-$new.pack" &&
    mv "$O/$idx" "$P/pack-$new.idx" || {
        echo >&2 "Cannot move $existing to $new"
        continue
    }
    echo >&2 "Renamed $existing -> $new"
done
# Remove the scratch index left behind by git-index-pack.
rm -f tmp-idx
