All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>,
	Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, christian.couder@gmail.com
Subject: Re: [RFC PATCH 2/4] fsck: support refs pointing to lazy objects
Date: Fri, 28 Jul 2017 09:29:31 -0400	[thread overview]
Message-ID: <dc3c12fb-38e1-7c62-0d51-cca2f9883927@gmail.com> (raw)
In-Reply-To: <20170727165058.4eda505e@twelve2.svl.corp.google.com>



On 7/27/2017 7:50 PM, Jonathan Tan wrote:
> On Thu, 27 Jul 2017 11:59:47 -0700
> Junio C Hamano <gitster@pobox.com> wrote:
> 
>>> @@ -438,6 +438,14 @@ static int fsck_handle_ref(const char *refname, const struct object_id *oid,
>>>   
>>>   	obj = parse_object(oid);
>>>   	if (!obj) {
>>> +		if (repository_format_lazy_object) {
>>> +			/*
>>> +			 * Increment default_refs anyway, because this is a
>>> +			 * valid ref.
>>> +			 */
>>> +			default_refs++;
>>> +			return 0;
>>> +		}
>>>   		error("%s: invalid sha1 pointer %s", refname, oid_to_hex(oid));
>>>   		errors_found |= ERROR_REACHABLE;
>>
>> At this point, do we know (or can we tell) if this is a missing
>> object or a file exists as a loose object but is corrupt?  If we
>> could, it would be nice to do this only for the former to avoid
>> sweeping a real corruption that is unrelated to the lazy fetch under
>> the rug.
> 
> Before all this is run, there is a check over all loose and packed
> objects and I've verified that this check reports failure in
> corrupt-object situations (see test below).
> 
> It is true that parse_object() cannot report the difference, but since
> fsck has already verified non-corruptness, I don't think it needs to
> know the difference at this point.
> 
>>> +test_expect_success 'fsck fails on lazy object pointed to by ref' '
>>> +	rm -rf repo &&
>>> +	test_create_repo repo &&
>>> +	test_commit -C repo 1 &&
>>> +
>>> +	A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
>>> +
>>> +	# Reference $A only from ref, and delete it
>>> +	git -C repo branch mybranch "$A" &&
>>> +	delete_object repo "$A" &&
>>> +
>>> +	test_must_fail git -C repo fsck
>>> +'
>>
>> And a new test that uses a helper different from delete_object
>> (perhaps call it corrupt_object?) can be used to make sure that we
>> complain in that case here.
> 
> I agree that object corruption can cause this specific part of the
> production code to falsely work. But I think that this specific part of
> the code can and should rely on object corruption being checked
> elsewhere. (I usually don't like to assume that other components work
> and will continue to work, but in this case, I think that fsck checking
> for object corruption is very foundational and should be relied upon.)
> 
> But if we think that defense "in depth" is a good idea, I have no
> problem adding such tests (like the one below).
> 

It's good to know that object corruption is already being checked 
elsewhere in fsck.  I agree that you should be able to rely that as long 
as it is guaranteed to run.  I'd rather see a single location in the 
code that has the logic to detect corrupted objects rather than two 
copies (or worse, two different versions).

Are there also tests for validating the object corruption detection 
code?  If not, adding some (like below) would be great.


> ---
> delete_object () {
> 	rm $1/.git/objects/$(echo $2 | cut -c1-2)/$(echo $2 | cut -c3-40)
> }
> 
> corrupt_object () {
> 	chmod a+w $1/.git/objects/$(echo $2 | cut -c1-2)/$(echo $2 | cut -c3-40) &&
> 	echo CORRUPT >$1/.git/objects/$(echo $2 | cut -c1-2)/$(echo $2 | cut -c3-40)
> }
> 
> setup_object_in_reflog () {
> 	rm -rf repo &&
> 	test_create_repo repo &&
> 	test_commit -C repo 1 &&
> 
> 	A=$(git -C repo commit-tree -m a HEAD^{tree}) &&
> 	B=$(git -C repo commit-tree -m b HEAD^{tree}) &&
> 
> 	# Reference $A only from reflog
> 	git -C repo branch mybranch "$A" &&
> 	git -C repo branch -f mybranch "$B"
> }
> 
> test_expect_success 'lazy object in reflog' '
> 	setup_object_in_reflog &&
> 	delete_object repo "$A" &&
> 	test_must_fail git -C repo fsck &&
> 	git -C repo config core.repositoryformatversion 1 &&
> 	git -C repo config extensions.lazyobject "arbitrary string" &&
> 	git -C repo fsck
> '
> 
> test_expect_success 'corrupt loose object in reflog' '
> 	setup_object_in_reflog &&
> 	corrupt_object repo "$A" &&
> 	test_must_fail git -C repo fsck &&
> 	git -C repo config core.repositoryformatversion 1 &&
> 	git -C repo config extensions.lazyobject "arbitrary string" &&
> 	test_must_fail git -C repo fsck
> '
> 
> test_expect_success 'missing packed object in reflog' '
> 	setup_object_in_reflog &&
> 	git -C repo repack -a &&
> 	delete_object repo "$A" &&
> 	chmod a+w repo/.git/objects/pack/*.pack &&
> 	echo CORRUPT >"$(echo repo/.git/objects/pack/*.pack)" &&
> 	test_must_fail git -C repo fsck &&
> 	git -C repo config core.repositoryformatversion 1 &&
> 	git -C repo config extensions.lazyobject "arbitrary string" &&
> 	test_must_fail git -C repo fsck
> '
> 

  reply	other threads:[~2017-07-28 13:29 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-26 23:29 [RFC PATCH 0/4] Some patches for fsck for missing objects Jonathan Tan
2017-07-26 23:29 ` [RFC PATCH 1/4] environment, fsck: introduce lazyobject extension Jonathan Tan
2017-07-27 18:55   ` Junio C Hamano
2017-07-28 13:20     ` Ben Peart
2017-07-28 23:50     ` Jonathan Tan
2017-07-29  0:21       ` Junio C Hamano
2017-07-26 23:30 ` [RFC PATCH 2/4] fsck: support refs pointing to lazy objects Jonathan Tan
2017-07-27 18:59   ` Junio C Hamano
2017-07-27 23:50     ` Jonathan Tan
2017-07-28 13:29       ` Ben Peart [this message]
2017-07-28 20:08         ` [PATCH] tests: ensure fsck fails on corrupt packfiles Jonathan Tan
2017-07-26 23:30 ` [RFC PATCH 3/4] fsck: support referenced lazy objects Jonathan Tan
2017-07-27 19:17   ` Junio C Hamano
2017-07-27 23:50     ` Jonathan Tan
2017-07-29 16:04   ` Junio C Hamano
2017-07-26 23:30 ` [RFC PATCH 4/4] fsck: support lazy objects as CLI argument Jonathan Tan
2017-07-26 23:42 ` [RFC PATCH 0/4] Some patches for fsck for missing objects brian m. carlson
2017-07-27  0:24   ` Stefan Beller
2017-07-27 17:25   ` Jonathan Tan
2017-07-28 13:40     ` Ben Peart
2017-07-31 21:02 ` [PATCH v2 0/5] Fsck for lazy objects, and (now) actual invocation of loader Jonathan Tan
2017-07-31 21:21   ` Junio C Hamano
2017-07-31 23:05     ` Jonathan Tan
2017-08-01 17:11       ` Junio C Hamano
2017-08-01 17:45         ` Jonathan Nieder
2017-08-01 20:15           ` Junio C Hamano
2017-08-02  0:19         ` Jonathan Tan
2017-08-02 16:20           ` Junio C Hamano
2017-08-02 17:38             ` Jonathan Nieder
2017-08-02 20:51               ` Junio C Hamano
2017-08-02 22:13                 ` Jonathan Nieder
2017-08-03 19:08                 ` Jonathan Tan
2017-08-08 17:13   ` Ben Peart
2017-07-31 21:02 ` [PATCH v2 1/5] environment, fsck: introduce lazyobject extension Jonathan Tan
2017-07-31 21:02 ` [PATCH v2 2/5] fsck: support refs pointing to lazy objects Jonathan Tan
2017-07-31 21:02 ` [PATCH v2 3/5] fsck: support referenced " Jonathan Tan
2017-07-31 21:02 ` [PATCH v2 4/5] fsck: support lazy objects as CLI argument Jonathan Tan
2017-07-31 21:02 ` [PATCH v2 5/5] sha1_file: support loading lazy objects Jonathan Tan
2017-07-31 21:29   ` Junio C Hamano
2017-08-08 20:20   ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc3c12fb-38e1-7c62-0d51-cca2f9883927@gmail.com \
    --to=peartben@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.