All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "René Scharfe" <l.s.r@web.de>, "SZEDER Gábor" <szeder.dev@gmail.com>
Subject: [PATCH v2 05/13] midx: check both pack and index names for containment
Date: Fri, 5 Apr 2019 14:06:04 -0400	[thread overview]
Message-ID: <20190405180604.GE32243@sigill.intra.peff.net> (raw)
In-Reply-To: <20190405180306.GA21113@sigill.intra.peff.net>

A midx file (and the struct we parse from it) contains a list of all of
the covered packfiles, mentioned by their ".idx" names (e.g.,
"pack-1234.idx", etc). And thus calls to midx_contains_pack() expect
callers to provide the idx name.

This works for most of the calls, but the one in open_packed_git_1()
tries to feed a packed_git->pack_name, which is the ".pack" name,
meaning we'll never find a match (even if the pack is covered by the
midx).

We can fix this by converting the ".pack" to ".idx" in the caller.
However, that requires allocating a new string. Instead, let's make
midx_contains_pack() a bit friendlier, and allow it take _either_ the
.pack or .idx variant.

All cleverness in the matching code is credited to René. Bugs are mine.

There's no test here, because while this does fix _a_ bug, it's masked
by another bug in that same caller. That will be covered (with a test)
in the next patch.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
---
I was tempted to suggest that the midx struct just store the base name
without ".idx" at all, but having callers pass that is no less tricky
than passing ".idx" (they still have to allocate a new string).

 midx.c | 36 ++++++++++++++++++++++++++++++++++--
 midx.h |  2 +-
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/midx.c b/midx.c
index 8a505fd423..0ceca1938f 100644
--- a/midx.c
+++ b/midx.c
@@ -307,7 +307,39 @@ int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct mu
 	return nth_midxed_pack_entry(m, e, pos);
 }
 
-int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
+/* Match "foo.idx" against either "foo.pack" _or_ "foo.idx". */
+static int cmp_idx_or_pack_name(const char *idx_or_pack_name,
+				const char *idx_name)
+{
+	/* Skip past any initial matching prefix. */
+	while (*idx_name && *idx_name == *idx_or_pack_name) {
+		idx_name++;
+		idx_or_pack_name++;
+	}
+
+	/*
+	 * If we didn't match completely, we may have matched "pack-1234." and
+	 * be left with "idx" and "pack" respectively, which is also OK. We do
+	 * not have to check for "idx" and "idx", because that would have been
+	 * a complete match (and in that case these strcmps will be false, but
+	 * we'll correctly return 0 from the final strcmp() below.
+	 *
+	 * Technically this matches "fooidx" and "foopack", but we'd never have
+	 * such names in the first place.
+	 */
+	if (!strcmp(idx_name, "idx") && !strcmp(idx_or_pack_name, "pack"))
+		return 0;
+
+	/*
+	 * This not only checks for a complete match, but also orders based on
+	 * the first non-identical character, which means our ordering will
+	 * match a raw strcmp(). That makes it OK to use this to binary search
+	 * a naively-sorted list.
+	 */
+	return strcmp(idx_or_pack_name, idx_name);
+}
+
+int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name)
 {
 	uint32_t first = 0, last = m->num_packs;
 
@@ -317,7 +349,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_name)
 		int cmp;
 
 		current = m->pack_names[mid];
-		cmp = strcmp(idx_name, current);
+		cmp = cmp_idx_or_pack_name(idx_or_pack_name, current);
 		if (!cmp)
 			return 1;
 		if (cmp > 0) {
diff --git a/midx.h b/midx.h
index 774f652530..26dd042d63 100644
--- a/midx.h
+++ b/midx.h
@@ -43,7 +43,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n);
 int fill_midx_entry(const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
-int midx_contains_pack(struct multi_pack_index *m, const char *idx_name);
+int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
 int write_midx_file(const char *object_dir);
-- 
2.21.0.729.g7d31bf3764


  parent reply	other threads:[~2019-04-05 18:06 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-04 23:21 [PATCH 0/12] a rabbit hole of update-server-info fixes Jeff King
2019-04-04 23:22 ` [PATCH 01/12] t5319: fix bogus cat-file argument Jeff King
2019-04-05  0:44   ` Ramsay Jones
2019-04-05  1:41     ` Jeff King
2019-04-05  1:46       ` Jeff King
2019-04-04 23:22 ` [PATCH 02/12] t5319: drop useless --buffer from cat-file Jeff King
2019-04-04 23:22 ` [PATCH 03/12] packfile: factor out .pack to .idx name conversion Jeff King
2019-04-04 23:25 ` [PATCH 04/12] packfile: check midx coverage with .idx rather than .pack Jeff King
2019-04-05  8:05   ` René Scharfe
2019-04-05 13:21     ` Jeff King
2019-04-05 13:30       ` Jeff King
2019-04-05 12:01   ` SZEDER Gábor
2019-04-05 13:40     ` Jeff King
2019-04-04 23:27 ` [PATCH 05/12] http: simplify parsing of remote objects/info/packs Jeff King
2019-04-05 10:41   ` René Scharfe
2019-04-05 18:11     ` Jeff King
2019-04-05 20:17       ` René Scharfe
2019-04-04 23:27 ` [PATCH 06/12] server-info: fix blind pointer arithmetic Jeff King
2019-04-04 23:28 ` [PATCH 07/12] server-info: simplify cleanup in parse_pack_def() Jeff King
2019-04-04 23:28 ` [PATCH 08/12] server-info: use strbuf to read old info/packs file Jeff King
2019-04-04 23:29 ` [PATCH 09/12] server-info: drop nr_alloc struct member Jeff King
2019-04-04 23:30 ` [PATCH 10/12] packfile.h: drop extern from function declarations Jeff King
2019-04-04 23:30 ` [PATCH 11/12] server-info: drop objdirlen pointer arithmetic Jeff King
2019-04-04 23:31 ` [PATCH 12/12] update_info_refs(): drop unused force parameter Jeff King
2019-04-05  5:46 ` [PATCH 0/12] a rabbit hole of update-server-info fixes Junio C Hamano
2019-04-05 18:03 ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King
2019-04-05 18:03   ` [PATCH v2 01/13] packfile.h: drop extern from function declarations Jeff King
2019-04-05 19:19     ` Ramsay Jones
2019-04-05 19:25       ` Jeff King
2019-04-08  5:13         ` Junio C Hamano
2019-04-08 20:32           ` Jeff King
2019-04-09 15:08         ` Junio C Hamano
2019-04-09 15:15           ` Jeff King
2019-04-05 18:04   ` [PATCH v2 02/13] pack-revindex: open index if necessary Jeff King
2019-04-05 18:04   ` [PATCH v2 03/13] t5319: fix bogus cat-file argument Jeff King
2019-04-05 18:05   ` [PATCH v2 04/13] t5319: drop useless --buffer from cat-file Jeff King
2019-04-05 18:06   ` Jeff King [this message]
2019-04-05 20:18     ` [PATCH v2 05/13] midx: check both pack and index names for containment René Scharfe
2019-04-05 18:06   ` [PATCH v2 06/13] packfile: fix pack basename computation Jeff King
2019-04-05 18:12   ` [PATCH v2 07/13] http: simplify parsing of remote objects/info/packs Jeff King
2019-04-05 18:13   ` [PATCH v2 08/13] server-info: fix blind pointer arithmetic Jeff King
2019-04-05 18:13   ` [PATCH v2 09/13] server-info: simplify cleanup in parse_pack_def() Jeff King
2019-04-05 18:16     ` Jeff King
2019-04-05 18:13   ` [PATCH v2 10/13] server-info: use strbuf to read old info/packs file Jeff King
2019-04-05 18:14   ` [PATCH v2 11/13] server-info: drop nr_alloc struct member Jeff King
2019-04-05 18:14   ` [PATCH v2 12/13] server-info: drop objdirlen pointer arithmetic Jeff King
2019-04-05 18:14   ` [PATCH v2 13/13] update_info_refs(): drop unused force parameter Jeff King
2019-04-05 18:19   ` [PATCH v2 0/13] a rabbit hole of update-server-info (and midx!) fixes Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190405180604.GE32243@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=l.s.r@web.de \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.