git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Thomas Rast <trast@student.ethz.ch>
Cc: "Stefan Zager" <szager@google.com>,
	git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH] sha1_file: remove recursion in packed_object_info
Date: Wed, 20 Mar 2013 09:47:10 -0700	[thread overview]
Message-ID: <7vtxo6f27l.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <c5fc1d2040544965ad3cc09e7b82b6013f06b7fa.1363729774.git.trast@student.ethz.ch> (Thomas Rast's message of "Tue, 19 Mar 2013 23:08:13 +0100")

Thomas Rast <trast@student.ethz.ch> writes:

> So here's a nonrecursive version.  Dijkstra is probably turning over
> in his grave as we speak.
>
> I *think* I actually got it right.

You seem to have lost the "if we cannot get delta base, this object
is BAD" check where you measure the size of a deltified object,
which would correspond to this check:

> -static int packed_delta_info(struct packed_git *p,
> -			     struct pack_window **w_curs,
> -			     off_t curpos,
> -			     enum object_type type,
> -			     off_t obj_offset,
> -			     unsigned long *sizep)
> -{
> -	off_t base_offset;
> -
> -	base_offset = get_delta_base(p, w_curs, &curpos, type, obj_offset);
> -	if (!base_offset)
> -		return OBJ_BAD;

The following comment is also lost but...

> -	/* We choose to only get the type of the base object and
> -	 * ignore potentially corrupt pack file that expects the delta
> -	 * based on a base with a wrong size.  This saves tons of
> -	 * inflate() calls.
> -	 */
> -	if (sizep) {
> -		*sizep = get_size_from_delta(p, w_curs, curpos);
> -		if (*sizep == 0)
> -			type = OBJ_BAD;

... is this check correct?  There is an equivalent check at the
beginning of the new packed_object_info() to error out a deltified
result.  Why is an object whose size is 0 bad?

This comes from 3d77d8774fc1 (make packed_object_info() resilient to
pack corruptions, 2008-10-29), and I tend to trust Nico more than I
do myself. I must be missing something obvious, but it appears to me
that the only thing that keeps us from triggering a false positive
is that we do not even attempt to deltify anything smaller than 50
bytes, and create_delta() refuses to create a delta to produce an
empty data.  But a hand-crafted packfile could certainly record such
a delta, no?

> The task of the two functions is not all that hard to describe without
> any recursion, however.  It proceeds in three steps:
>
> - determine the representation type and size, based on the outermost
>   object (delta or not)
>
> - follow through the delta chain, if any
>
> - determine the object type from what is found at the end of the delta
>   chain

The stack/recursion is used _only_ for error recovery, no?  If we do
not care about retrying with a different copy of an object we find
in the delta chain, we can just update obj_offset with base_offset
and keep digging.  It almost makes me wonder if a logical follow-up
to this patch may be to do so, and rewrite the error recovery
codepath to just mark the bad copy and jump back to the very top,
retrying everything from scratch.  Eventually we would run out
bad copies of the problematic object and would report an error, or
find a good copy and return the type.

  reply	other threads:[~2013-03-20 16:47 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-15 22:42 regression in multi-threaded git-pack-index Stefan Zager
2013-03-16 11:41 ` Jeff King
2013-03-16 12:38   ` Duy Nguyen
2013-03-19  8:17   ` Thomas Rast
2013-03-19  9:30     ` Jeff King
2013-03-19  9:59       ` Jeff King
2013-03-19 10:08         ` Jeff King
2013-03-19 10:24           ` Jeff King
2013-03-19 10:29             ` Thomas Rast
2013-03-19 10:33               ` Jeff King
2013-03-19 10:45                 ` Thomas Rast
2013-03-19 10:47                   ` Jeff King
2013-03-19 10:58             ` [PATCH] index-pack: always zero-initialize object_entry list Jeff King
2013-03-19 15:33               ` Thomas Rast
2013-03-19 15:43                 ` Jeff King
2013-03-19 15:52                   ` Jeff King
2013-03-19 16:17                     ` [PATCH v2] " Jeff King
2013-03-19 16:27                       ` Thomas Rast
2013-03-19 17:13                         ` Junio C Hamano
2013-03-20 19:12                       ` Eric Sunshine
2013-03-20 19:13                         ` Jeff King
2013-03-20 19:14                           ` Eric Sunshine
2013-03-19 12:35     ` regression in multi-threaded git-pack-index Duy Nguyen
2013-03-19 13:01       ` [PATCH] index-pack: protect deepest_delta in multithread code Nguyễn Thái Ngọc Duy
2013-03-19 13:25         ` Jeff King
2013-03-19 13:50         ` Thomas Rast
2013-03-19 14:07           ` Duy Nguyen
2013-03-19 14:16             ` [PATCH v2] index-pack: guard nr_resolved_deltas reads by lock Thomas Rast
2013-03-19 15:53               ` Junio C Hamano
2013-03-19 15:41 ` regression in multi-threaded git-pack-index Thomas Rast
2013-03-19 15:45   ` Thomas Rast
2013-03-19 16:11     ` Thomas Rast
2013-03-19 17:58       ` Junio C Hamano
2013-03-19 22:08         ` [PATCH] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-20 16:47           ` Junio C Hamano [this message]
2013-03-25  9:27             ` thomas
2013-03-25 18:07               ` [PATCH v2 0/3] Recursion-free unpack_entry and packed_object_info Thomas Rast
2013-03-25 18:07                 ` [PATCH v2 1/3] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-25 18:07                 ` [PATCH v2 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry Thomas Rast
2013-03-25 23:15                   ` Junio C Hamano
2013-03-26 11:09                     ` thomas
2013-03-25 18:07                 ` [PATCH v2 3/3] sha1_file: remove recursion in unpack_entry Thomas Rast
2013-03-25 23:19                   ` Junio C Hamano
2013-03-26  3:37                 ` [PATCH v2 0/3] Recursion-free unpack_entry and packed_object_info Nicolas Pitre
2013-03-25 18:17               ` [PATCH] sha1_file: remove recursion in packed_object_info Junio C Hamano
2013-03-27 20:03               ` [PATCH v3 0/3] Recursion-free unpack_entry and packed_object_info Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 1/3] sha1_file: remove recursion in packed_object_info Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry Thomas Rast
2013-03-27 20:03                 ` [PATCH v3 3/3] sha1_file: remove recursion in unpack_entry Thomas Rast
2013-03-27 20:29                   ` Junio C Hamano
2013-03-20  1:17       ` regression in multi-threaded git-pack-index Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vtxo6f27l.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=szager@google.com \
    --cc=trast@student.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).