git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, Drew Northup <drew.northup@maine.edu>,
	Jakub Narebski <jnareb@gmail.com>,
	Heiko Voigt <hvoigt@hvoigt.net>,
	Johan Herland <johan@herland.net>,
	Julian Phillips <julian@quantumfyre.co.uk>
Subject: Re: [PATCH v2 08/51] is_dup_ref(): extract function from sort_ref_array()
Date: Mon, 12 Dec 2011 12:44:24 +0100	[thread overview]
Message-ID: <4EE5E918.8090104@alum.mit.edu> (raw)
In-Reply-To: <20111212083345.GA16106@sigill.intra.peff.net>

On 12/12/2011 09:33 AM, Jeff King wrote:
> On Mon, Dec 12, 2011 at 06:38:15AM +0100, mhagger@alum.mit.edu wrote:
> 
>> +/*
>> + * Emit a warning and return true iff ref1 and ref2 have the same name
>> + * and the same sha1.  Die if they have the same name but different
>> + * sha1s.
>> + */
>> +static int is_dup_ref(const struct ref_entry *ref1, const struct ref_entry *ref2)
>> +{
>> +	if (!strcmp(ref1->name, ref2->name)) {
>> +		/* Duplicate name; make sure that the SHA1s match: */
>> +		if (hashcmp(ref1->sha1, ref2->sha1))
>> +			die("Duplicated ref, and SHA1s don't match: %s",
>> +			    ref1->name);
>> +		warning("Duplicated ref: %s", ref1->name);
>> +		return 1;
>> +	} else {
>> +		return 0;
>> +	}
>> +}
> 
> As a user, I'm not sure what this message means. If I understand the
> context right, this happens when you have duplicate entries in your
> packed-refs file?

It could be caused by dups in any of the three ref caches: packed refs,
loose refs, or extra refs.

Duplicates in the packed-refs file could be caused by any garden-variety
data corruption external to git.  Or they could be caused by a bug in
git that caused it to write duplicates.  But the code that writes packed
refs iterates over the refs right after they are sorted, and sorting
eliminates duplicates, so this would be an unlikely bug the way the code
is currently organized.  I suppose they could also be caused by the
simultaneous writing of the packed-refs file by another process that
doesn't respect git's lock file; perhaps an ill-timed rsync or something.

Normally there should be no duplicate loose refs, because their names
are taken directly from the filesystem and duplicate filenames are not
allowed.  (Though I don't know if readdir() guarantees an atomic
snapshot of a directory; if not, then a dup could perhaps be created by
having another process add and remove files while git is reading a loose
ref directory.)

Extra refs are created locally by other git code that I am not familiar
with, so duplicate extra refs could only be created by a bug in git.

So in summary, duplicates could be caused by a git bug, by a corrupt
packed-refs file, or conceivably by a race condition with another
process that is mutating the packed-refs file or the loose reference
directories.

> This would indicate a bug in git, so should this perhaps say "BUG:" in
> front, or maybe give some more context so that a user understands it is
> not their error, or even a random error on this run, but that git has
> accidentally corrupted the packed-refs file (and their best bet is
> probably to report the bug to us).
> 
> This is obviously not an issue introduced by your patch, as you are
> just moving these error messages around. But maybe while the topic is
> fresh in your mind it is a good time to improve it. I dunno. AFAICT
> nobody has ever actually hit this message, so maybe it doesn't matter.
> :)

The first of Google search results for the fatal error message only
shows one instance where the error was observed [1].  This was a case of
cloning using the rsync protocol, which I believe is now deprecated
(probably for exactly this reason).

I think that the gold-plated way to improve the error message would be
to pass the error back to the caller, who would have more context to
decide the most likely cause of the duplicate and give an appropriate
warning or error message.  But I think we can afford to wait until more
error reports appear before bothering with this.

Michael

[1] http://kerneltrap.org/mailarchive/git/2008/12/17/4438764

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2011-12-12 11:44 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-12  5:38 [PATCH v2 00/51] ref-api-C and ref-api-D re-roll mhagger
2011-12-12  5:38 ` [PATCH v2 01/51] struct ref_entry: document name member mhagger
2011-12-12  5:38 ` [PATCH v2 02/51] refs: rename "refname" variables mhagger
2011-12-13  0:37   ` Junio C Hamano
2011-12-12  5:38 ` [PATCH v2 03/51] refs: rename parameters result -> sha1 mhagger
2011-12-12  5:38 ` [PATCH v2 04/51] clear_ref_array(): rename from free_ref_array() mhagger
2011-12-12  5:38 ` [PATCH v2 05/51] is_refname_available(): remove the "quiet" argument mhagger
2011-12-12  5:38 ` [PATCH v2 06/51] parse_ref_line(): add docstring mhagger
2011-12-12  5:38 ` [PATCH v2 07/51] add_ref(): " mhagger
2011-12-12  5:38 ` [PATCH v2 08/51] is_dup_ref(): extract function from sort_ref_array() mhagger
2011-12-12  8:33   ` Jeff King
2011-12-12 11:44     ` Michael Haggerty [this message]
2011-12-12 17:14       ` Junio C Hamano
2011-12-12 22:33   ` Junio C Hamano
2011-12-13  4:35     ` Michael Haggerty
2011-12-13  5:00       ` Michael Haggerty
2011-12-12  5:38 ` [PATCH v2 09/51] refs: change signatures of get_packed_refs() and get_loose_refs() mhagger
2011-12-12  5:38 ` [PATCH v2 10/51] get_ref_dir(): change signature mhagger
2011-12-12  5:38 ` [PATCH v2 11/51] resolve_gitlink_ref(): improve docstring mhagger
2011-12-12  5:38 ` [PATCH v2 12/51] Pass a (ref_cache *) to the resolve_gitlink_*() helper functions mhagger
2011-12-12  5:38 ` [PATCH v2 13/51] resolve_gitlink_ref_recursive(): change to work with struct ref_cache mhagger
2011-12-12  5:38 ` [PATCH v2 14/51] repack_without_ref(): remove temporary mhagger
2011-12-12  5:38 ` [PATCH v2 15/51] create_ref_entry(): extract function from add_ref() mhagger
2011-12-12  5:38 ` [PATCH v2 16/51] add_ref(): take a (struct ref_entry *) parameter mhagger
2011-12-12  5:38 ` [PATCH v2 17/51] do_for_each_ref(): correctly terminate while processesing extra_refs mhagger
2011-12-12 22:41   ` Junio C Hamano
2011-12-12  5:38 ` [PATCH v2 18/51] do_for_each_ref_in_array(): new function mhagger
2011-12-12  5:38 ` [PATCH v2 19/51] do_for_each_ref_in_arrays(): " mhagger
2011-12-12  5:38 ` [PATCH v2 20/51] repack_without_ref(): reimplement using do_for_each_ref_in_array() mhagger
2011-12-12 22:44   ` Junio C Hamano
2011-12-12  5:38 ` [PATCH v2 21/51] names_conflict(): new function, extracted from is_refname_available() mhagger
2011-12-12  5:38 ` [PATCH v2 22/51] names_conflict(): simplify implementation mhagger
2011-12-12  5:38 ` [PATCH v2 23/51] is_refname_available(): reimplement using do_for_each_ref_in_array() mhagger
2011-12-12  5:38 ` [PATCH v2 24/51] refs.c: reorder definitions more logically mhagger
2011-12-12  5:38 ` [PATCH v2 25/51] free_ref_entry(): new function mhagger
2011-12-12  5:38 ` [PATCH v2 26/51] check_refname_component(): return 0 for zero-length components mhagger
2011-12-12  5:38 ` [PATCH v2 27/51] struct ref_entry: nest the value part in a union mhagger
2011-12-12  5:38 ` [PATCH v2 28/51] refs.c: rename ref_array -> ref_dir mhagger
2011-12-13  0:45   ` Junio C Hamano
2011-12-13  5:43     ` Michael Haggerty
2011-12-13  6:37       ` Junio C Hamano
2011-12-13 19:12         ` Michael Haggerty
2011-12-13 19:17           ` Junio C Hamano
2011-12-13 22:13           ` Michael Haggerty
2011-12-13 23:24             ` Junio C Hamano
2011-12-14  0:19               ` Junio C Hamano
2011-12-14  2:33                 ` Jeff King
2011-12-15  8:19                   ` Michael Haggerty
2011-12-15  8:37                     ` Jeff King
2012-01-17 15:07               ` Michael Haggerty
2012-02-10 14:51                 ` Michael Haggerty
2012-02-10 20:44                   ` Jeff King
2012-02-10 21:17                     ` Junio C Hamano
2012-02-11  6:33                       ` Michael Haggerty
2011-12-12  5:38 ` [PATCH v2 29/51] refs: store references hierarchically mhagger
2011-12-12  5:38 ` [PATCH v2 30/51] sort_ref_dir(): do not sort if already sorted mhagger
2011-12-12 23:26   ` Junio C Hamano
2011-12-12  5:38 ` [PATCH v2 31/51] refs: sort ref_dirs lazily mhagger
2011-12-12  5:38 ` [PATCH v2 32/51] do_for_each_ref(): only iterate over the subtree that was requested mhagger
2011-12-12  5:38 ` [PATCH v2 33/51] get_ref_dir(): keep track of the current ref_dir mhagger
2011-12-12  5:38 ` [PATCH v2 34/51] refs: wrap top-level ref_dirs in ref_entries mhagger
2011-12-12  5:38 ` [PATCH v2 35/51] get_packed_refs(): return (ref_entry *) instead of (ref_dir *) mhagger
2011-12-12  5:38 ` [PATCH v2 36/51] get_loose_refs(): " mhagger
2011-12-12  5:38 ` [PATCH v2 37/51] is_refname_available(): take " mhagger
2011-12-12  5:38 ` [PATCH v2 38/51] find_ref(): " mhagger
2011-12-12  5:38 ` [PATCH v2 39/51] read_packed_refs(): " mhagger
2011-12-12  5:38 ` [PATCH v2 40/51] add_ref(): " mhagger
2011-12-12  5:38 ` [PATCH v2 41/51] find_containing_direntry(): use " mhagger
2011-12-12  5:38 ` [PATCH v2 42/51] search_ref_dir(): take " mhagger
2011-12-12  5:38 ` [PATCH v2 43/51] add_entry(): " mhagger
2011-12-12  5:38 ` [PATCH v2 44/51] do_for_each_ref_in_dir*(): " mhagger
2011-12-12  5:38 ` [PATCH v2 45/51] sort_ref_dir(): " mhagger
2011-12-12  5:38 ` [PATCH v2 46/51] struct ref_dir: store a reference to the enclosing ref_cache mhagger
2011-12-12  5:38 ` [PATCH v2 47/51] read_loose_refs(): take a (ref_entry *) as argument mhagger
2011-12-12  5:38 ` [PATCH v2 48/51] refs: read loose references lazily mhagger
2011-12-12  5:38 ` [PATCH v2 49/51] is_refname_available(): query only possibly-conflicting references mhagger
2011-12-12  5:38 ` [PATCH v2 50/51] read_packed_refs(): keep track of the directory being worked in mhagger
2011-12-12  5:38 ` [PATCH v2 51/51] repack_without_ref(): call clear_packed_ref_cache() mhagger
2011-12-12  8:24 ` [PATCH v2 00/51] ref-api-C and ref-api-D re-roll Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EE5E918.8090104@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=drew.northup@maine.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=hvoigt@hvoigt.net \
    --cc=jnareb@gmail.com \
    --cc=johan@herland.net \
    --cc=julian@quantumfyre.co.uk \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).