All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, dstolee@microsoft.com, gitster@pobox.com,
	jrnieder@gmail.com
Subject: Re: [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes
Date: Mon, 25 Jan 2021 15:03:07 -0500	[thread overview]
Message-ID: <YA8j+3nq8nI/hcIQ@nand.local> (raw)
In-Reply-To: <YAtlZ49mTfTGg11/@coredump.intra.peff.net>

On Fri, Jan 22, 2021 at 06:53:11PM -0500, Jeff King wrote:
> On Wed, Jan 13, 2021 at 05:28:15PM -0500, Taylor Blau wrote:
>
> >  OPTIONS
> > @@ -33,7 +34,14 @@ OPTIONS
> >  	file is constructed from the name of packed archive
> >  	file by replacing .pack with .idx (and the program
> >  	fails if the name of packed archive does not end
> > -	with .pack).
> > +	with .pack). Incompatible with `--rev-index`.
>
> I wondered which option was incompatible, but couldn't see from the
> context. It is "index-pack -o", which kind of makes sense. We can derive
> "foo.rev" from "foo.idx", but normally "-o" does not do any deriving.
>
> So I was all set to say "OK, we can live without it", but...
>
> > @@ -1824,7 +1851,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
> >  	if (from_stdin && hash_algo)
> >  		die(_("--object-format cannot be used with --stdin"));
> >  	if (!index_name && pack_name)
> > -		index_name = derive_filename(pack_name, "idx", &index_name_buf);
> > +		index_name = derive_filename(pack_name, ".pack", "idx", &index_name_buf);
> > +
> > +	opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY);
> > +	if (rev_index) {
> > +		opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV;
> > +		if (index_name)
> > +			rev_index_name = derive_filename(index_name,
> > +							 ".idx", "rev",
> > +							 &rev_index_name_buf);
> > +	}
>
> ...here we do end up deriving ".rev" from ".idx" anyway. So I guess we
> probably could support "-o".  I also wonder what happens with "git
> index-pack -o foo.idx" when pack.writeReverseIndex is set. It looks like
> it would just work because of this block. But then shouldn't
> "--rev-index" work, too? And indeed, there is a test for that at the end
> of the patch! So is the documentation just wrong?

Hah! The documentation is just plain wrong. It's been a while, but I
have a vague recollection of writing this documentation before changing
the implementation of index-pack to allow this. Clearly, I forgot to go
back to update the broken documentation.

Hilariously, there is even a test in t5325 that demonstrates this
working! 'git index-pack --rev-index -o other.idx' writes both
'other.idx' and 'other.rev'. That was easy :-).

> I admit to finding the use of opts.flags versus the rev_index option a
> bit confusing. It seems like they are doing roughly the same thing, but
> influenced by different sources. It seems like we should be able to have
> a single local variable (then that goes on to set opts.flags for any
> sub-functions we call). Or maybe two, if we need to distinguish config
> versus command-line, but then they should have clear names
> (rev_index_config and rev_index_cmdline or something).

Yeah, I know. It's because we already pass a pointer to a struct
pack_idx_option to git_index_pack_config(), so in effect the 'flags' on
that struct *is* rev_index_config.

It's a little ugly, I agree, but I'm skeptical that the effort to clean
it up is worth it, mostly because the pack_idx_option struct probably
shouldn't be part of the index-pack builtin in the first place.

> As an aside, looking at derive_filename(), it seems a bit weird that one
> argument has a dot in the suffix and the other does not. I guess you are
> following the convention from write_special_file(), which omits it in
> the newly-added suffix. But it is slightly awkward to omit it for the
> old suffix in derive_filename(), because we want to strip_suffix() it
> all at once.

> Probably not that big a deal, but if anybody feels strongly, then
> derive_filename() could do:
>
>   if (!strip_suffix(pack_name, strip, &len) ||
>       !len || pack_name[len] != '.')
> 	die("does not end in .%s", strip);

That does make the callers look nicer, but it needs an extra two things:

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index ef2874a8e6..c758f3b8e9 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1441,11 +1441,10 @@ static const char *derive_filename(const char *pack_name, const char *strip,
 {
        size_t len;
        if (!strip_suffix(pack_name, strip, &len) || !len ||
-           pack_name[len] != '.')
+           pack_name[len - 1] != '.')
                die(_("packfile name '%s' does not end with '.%s'"),
                    pack_name, strip);
        strbuf_add(buf, pack_name, len);
-       strbuf_addch(buf, '.');
        strbuf_addstr(buf, suffix);
        return buf->buf;
 }

And then it does what you are looking for. I'll pull that change out
with your Suggested-by as a preparatory commit right before this one.

> > @@ -1578,6 +1591,12 @@ static int git_index_pack_config(const char *k, const char *v, void *cb)
> >  		}
> >  		return 0;
> >  	}
> > +	if (!strcmp(k, "pack.writereverseindex")) {
> > +		if (git_config_bool(k, v))
> > +			opts->flags |= WRITE_REV;
> > +		else
> > +			opts->flags &= ~WRITE_REV;
> > +	}
> >  	return git_default_config(k, v, cb);
> >  }
>
> IMHO we'll eventually want to turn this feature on by default. In which
> case we'll have to update every caller which is checking the config
> manually. Should we hide this in a function that looks up the config,
> and sets the default? Or alternatively, I guess, they could all use some
> shared initializer for "flags".

Note that there are only two such callers, so I'm not sure the effort to
extract this would be worth it.

> > +	# Intentionally corrupt the reverse index.
> > +	chmod u+w $rev &&
> > +	printf "xxxx" | dd of=$rev bs=1 count=4 conv=notrunc &&
> > +
> > +	test_must_fail git index-pack --rev-index --verify \
> > +		$packdir/pack-$pack.pack 2>err &&
> > +	grep "validation error" err
> > +'
>
> This isn't that subtle of a corruption, because we are corrupting the
> first 4 bytes, which is the magic signature. Maybe something further in
> the actual data would be interesting instead of or in addition?
>
> I dunno. There are a lot of edge cases around corruption (likewise, we
> might care how the normal reading code-path perceives a signature
> corruption like this). I'm not sure it's all that interesting to test
> all of them.

Agreed, I think what is here (even though it's not a severe corruption)
would be sufficient to make me feel good about our error handling in
general.

Thanks,
Taylor

  reply	other threads:[~2021-01-25 20:04 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 18:19 [PATCH 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-08 18:19 ` [PATCH 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-08 18:20 ` [PATCH 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-08 18:20 ` [PATCH 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-08 18:20 ` [PATCH 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 17:11   ` Ævar Arnfjörð Bjarmason
2021-01-12 18:40     ` Taylor Blau
2021-01-08 18:20 ` [PATCH 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-12 16:49   ` Derrick Stolee
2021-01-12 17:34     ` Taylor Blau
2021-01-12 17:18   ` Ævar Arnfjörð Bjarmason
2021-01-12 17:39     ` Derrick Stolee
2021-01-12 18:17       ` Taylor Blau
2021-01-08 18:20 ` [PATCH 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-13 22:28 ` [PATCH v2 0/8] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-13 22:28   ` [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-14  7:22     ` Junio C Hamano
2021-01-14 12:07       ` Derrick Stolee
2021-01-14 19:57         ` Jeff King
2021-01-14 18:28       ` Taylor Blau
2021-01-14  7:26     ` Junio C Hamano
2021-01-14 18:13       ` Taylor Blau
2021-01-14 20:57         ` Junio C Hamano
2021-01-22 22:54     ` Jeff King
2021-01-25 17:44       ` Taylor Blau
2021-01-25 18:27         ` Jeff King
2021-01-25 19:04         ` Junio C Hamano
2021-01-25 19:23           ` Taylor Blau
2021-01-13 22:28   ` [PATCH v2 2/8] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-22 23:24     ` Jeff King
2021-01-25 19:15       ` Taylor Blau
2021-01-26 21:43         ` Jeff King
2021-01-13 22:28   ` [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-22 23:53     ` Jeff King
2021-01-25 20:03       ` Taylor Blau [this message]
2021-01-13 22:28   ` [PATCH v2 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-22 23:57     ` Jeff King
2021-01-23  0:08       ` Jeff King
2021-01-25 20:21         ` Taylor Blau
2021-01-25 20:50           ` Jeff King
2021-01-13 22:28   ` [PATCH v2 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-13 22:28   ` [PATCH v2 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28   ` [PATCH v2 7/8] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-13 22:28   ` [PATCH v2 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-25 23:37 ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Taylor Blau
2021-01-25 23:37   ` [PATCH v3 01/10] packfile: prepare for the existence of '*.rev' files Taylor Blau
2021-01-29  0:27     ` Jeff King
2021-01-29  1:14       ` Taylor Blau
2021-01-30  8:39         ` Jeff King
2021-01-25 23:37   ` [PATCH v3 02/10] pack-write.c: prepare to write 'pack-*.rev' files Taylor Blau
2021-01-25 23:37   ` [PATCH v3 03/10] builtin/index-pack.c: allow stripping arbitrary extensions Taylor Blau
2021-01-29  0:28     ` Jeff King
2021-01-29  1:15       ` Taylor Blau
2021-01-25 23:37   ` [PATCH v3 04/10] builtin/index-pack.c: write reverse indexes Taylor Blau
2021-01-25 23:37   ` [PATCH v3 05/10] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Taylor Blau
2021-01-25 23:37   ` [PATCH v3 06/10] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Taylor Blau
2021-01-29  0:30     ` Jeff King
2021-01-29  1:17       ` Taylor Blau
2021-01-30  8:41         ` Jeff King
2021-01-25 23:37   ` [PATCH v3 07/10] t: prepare for GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29  0:45     ` Jeff King
2021-01-29  1:09       ` Eric Sunshine
2021-01-29  1:21       ` Taylor Blau
2021-01-30  8:43         ` Jeff King
2021-01-29  2:42       ` Junio C Hamano
2021-01-25 23:37   ` [PATCH v3 08/10] t: support GIT_TEST_WRITE_REV_INDEX Taylor Blau
2021-01-29  0:47     ` Jeff King
2021-01-25 23:37   ` [PATCH v3 09/10] pack-revindex: ensure that on-disk reverse indexes are given precedence Taylor Blau
2021-01-29  0:53     ` Jeff King
2021-01-29  1:25       ` Taylor Blau
2021-01-30  8:46         ` Jeff King
2021-01-25 23:37   ` [PATCH v3 10/10] t5325: check both on-disk and in-memory reverse index Taylor Blau
2021-01-29  1:04     ` Jeff King
2021-01-29  1:05     ` Jeff King
2021-01-29  1:32       ` Taylor Blau
2021-01-30  8:47         ` Jeff King
2021-01-26  2:36   ` [PATCH v3 00/10] pack-revindex: introduce on-disk '.rev' format Junio C Hamano
2021-01-26  2:49     ` Taylor Blau
2021-01-29  1:06   ` Jeff King
2021-01-29  1:34     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YA8j+3nq8nI/hcIQ@nand.local \
    --to=me@ttaylorr.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.