git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Couder <christian.couder@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	John Cai <johncai86@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Taylor Blau <me@ttaylorr.com>, Derrick Stolee <stolee@gmail.com>,
	Patrick Steinhardt <ps@pks.im>,
	Christian Couder <christian.couder@gmail.com>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: [PATCH 2/9] pack-objects: add `--print-filtered` to print omitted objects
Date: Wed, 14 Jun 2023 21:25:34 +0200	[thread overview]
Message-ID: <20230614192541.1599256-3-christian.couder@gmail.com> (raw)
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>

When using the `--filter=<filter-spec>` option, `git pack-objects` will
omit some objects from the resulting packfile(s) it produces. It could
be useful to know about these omitted objects though.

For example, we might want to write these objects into a separate
packfile by piping them into another `git pack-object` process.
Or we might want to check if these objects are available from a
promisor remote.

Anyway, this patch implements a simple way to let us know about these
objects by simply printing their oid, one per line, on stdout when the
new `--print-filtered` flag is passed.

As `--print-filtered` doesn't make sense without `--filter`, it is
disallowed to use the former without the latter.

Using `--stdout` is likely to make the `--print-filtered` output
difficult to find or parse, so we also disallow using these two options
together.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-pack-objects.txt     | 10 ++++++
 builtin/pack-objects.c                 | 47 ++++++++++++++++++++++++--
 t/t5317-pack-objects-filter-objects.sh | 27 +++++++++++++++
 3 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 583270a85f..6469080029 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -305,6 +305,16 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
 --no-filter::
 	Turns off any previous `--filter=` argument.
 
+--print-filtered::
+	Requires `--filter=`. Prints on stdout, one per line, the
+	object IDs of the objects that are filtered out from the
+	resulting packfile by the filter. This is incompatible with
+	`--stdout`. As <SHA-1> hashes are already written to stdout
+	based on the resulting pack contents (see the `base-name`
+	argument), a line containing only six `-` characters is
+	written after those <SHA-1> hashes, before the filtered object
+	IDs.
+
 --missing=<missing-action>::
 	A debug option to help with future "partial clone" development.
 	This option specifies how missing objects are handled.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index af007868c1..c8e2b6b859 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -266,6 +266,12 @@ static struct oidmap configured_exclusions;
 
 static struct oidset excluded_by_config;
 
+/*
+ * Objects omitted by filter
+ */
+static int print_filtered_out;
+static struct oidset *omitted_by_filter;
+
 /*
  * stats
  */
@@ -4065,11 +4071,18 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
 		die(_("revision walk setup failed"));
 	mark_edges_uninteresting(revs, show_edge, sparse);
 
+	if (print_filtered_out) {
+		omitted_by_filter = xmalloc(sizeof(*omitted_by_filter));
+		oidset_init(omitted_by_filter, 0);
+	}
+
 	if (!fn_show_object)
 		fn_show_object = show_object;
-	traverse_commit_list(revs,
-			     show_commit, fn_show_object,
-			     NULL);
+	traverse_commit_list_filtered(revs,
+				      show_commit,
+				      fn_show_object,
+				      NULL,
+				      omitted_by_filter);
 
 	if (unpack_unreachable_expiration) {
 		revs->ignore_missing_links = 1;
@@ -4165,6 +4178,23 @@ static int option_parse_cruft_expiration(const struct option *opt,
 	return 0;
 }
 
+static void print_omitted_by_filter(void)
+{
+	struct oidset_iter iter;
+	const struct object_id *oid;
+
+	fprintf_ln(stdout, "%s", "------");
+	fprintf_ln(stderr, "%s", _("Printing objects omitted by filter"));
+
+	oidset_iter_init(omitted_by_filter, &iter);
+
+	while ((oid = oidset_iter_next(&iter)))
+		fprintf_ln(stdout, "%s", oid_to_hex(oid));
+
+	oidset_clear(omitted_by_filter);
+	FREE_AND_NULL(omitted_by_filter);
+}
+
 int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
 	int use_internal_rev_list = 0;
@@ -4278,6 +4308,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		OPT_STRING_LIST(0, "uri-protocol", &uri_protocols,
 				N_("protocol"),
 				N_("exclude any configured uploadpack.blobpackfileuri with this protocol")),
+		OPT_BOOL(0, "print-filtered", &print_filtered_out,
+			 N_("print filtered out objects to stdout")),
 		OPT_END(),
 	};
 
@@ -4394,6 +4426,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));
 
+	if (print_filtered_out && !filter_options.choice)
+		die(_("cannot use --print-filtered without --filter"));
+
+	if (print_filtered_out && pack_to_stdout)
+		die(_("cannot use --print-filtered with --stdout"));
+
 	if (cruft) {
 		if (use_internal_rev_list)
 			die(_("cannot use internal rev list with --cruft"));
@@ -4509,6 +4547,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			   written, written_delta, reused, reused_delta,
 			   reuse_packfile_objects);
 
+	if (omitted_by_filter)
+		print_omitted_by_filter();
+
 cleanup:
 	list_objects_filter_release(&filter_options);
 	strvec_clear(&rp);
diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
index b26d476c64..ec3a03d90a 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -438,6 +438,33 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 	test_cmp expected observed
 '
 
+# Test pack-objects with --print-filtered option
+
+test_expect_success 'pack-objects fails w/ both --print-filtered and --stdout' '
+	test_must_fail git -C r1 pack-objects --revs --stdout \
+		--filter=blob:none --print-filtered >filter.out <<-EOF
+	HEAD
+	EOF
+'
+
+test_expect_success 'pack-objects w/ --print-filtered and a pack name' '
+	git -C r1 pack-objects --revs --filter=blob:none \
+		--print-filtered filtered-pack >filter.out <<-EOF &&
+	HEAD
+	EOF
+
+	# Check that the second line contains "------"
+	head -n 2 filter.out | tail -n 1 >actual &&
+	echo "------" >expected &&
+	test_cmp expected actual &&
+
+	# Remove the first two lines and check there are all the blobs
+	tail -n +3 filter.out | sort >actual &&
+	git -C r1 cat-file --batch-check --batch-all-objects | grep blob |
+		sed -e "s/ blob.*//" | sort >expected &&
+	test_cmp expected actual
+'
+
 # Delete some loose objects and use pack-objects, but WITHOUT any filtering.
 # This models previously omitted objects that we did not receive.
 
-- 
2.41.0.37.gae45d9845e


  parent reply	other threads:[~2023-06-14 19:26 UTC|newest]

Thread overview: 161+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-14 19:25 [PATCH 0/9] Repack objects into separate packfiles based on a filter Christian Couder
2023-06-14 19:25 ` [PATCH 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-06-21 10:49   ` Taylor Blau
2023-07-05  6:16     ` Christian Couder
2023-06-14 19:25 ` Christian Couder [this message]
2023-06-15 22:50   ` [PATCH 2/9] pack-objects: add `--print-filtered` to print omitted objects Junio C Hamano
2023-06-21 10:52     ` Taylor Blau
2023-06-21 11:11       ` Christian Couder
2023-06-21 11:54         ` Taylor Blau
2023-06-14 19:25 ` [PATCH 3/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-06-15 23:32   ` Junio C Hamano
2023-06-21 10:40     ` Christian Couder
2023-06-21 10:54     ` Taylor Blau
2023-06-14 19:25 ` [PATCH 4/9] repack: refactor piping an oid to a command Christian Couder
2023-06-15 23:46   ` Junio C Hamano
2023-06-21 10:55     ` Taylor Blau
2023-06-21 10:56     ` Christian Couder
2023-06-14 19:25 ` [PATCH 5/9] repack: refactor finishing pack-objects command Christian Couder
2023-06-16  0:13   ` Junio C Hamano
2023-06-21 11:06     ` Taylor Blau
2023-06-21 11:19       ` Christian Couder
2023-06-21 11:05   ` Taylor Blau
2023-06-14 19:25 ` [PATCH 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-06-16  0:43   ` Junio C Hamano
2023-06-21 11:20     ` Taylor Blau
2023-06-21 15:04       ` Christian Couder
2023-06-22 11:05         ` Taylor Blau
2023-06-21 14:40     ` Christian Couder
2023-06-21 16:53       ` Junio C Hamano
2023-06-22  8:39         ` Christian Couder
2023-06-22 18:32           ` Junio C Hamano
2023-06-21 11:17   ` Taylor Blau
2023-07-05  7:18     ` Christian Couder
2023-06-14 19:25 ` [PATCH 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-06-14 19:25 ` [PATCH 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-06-16  2:21   ` Junio C Hamano
2023-06-21 11:49   ` Taylor Blau
2023-06-21 12:08     ` Christian Couder
2023-06-21 12:25       ` Taylor Blau
2023-06-21 16:44         ` Junio C Hamano
2023-07-05  6:19     ` Christian Couder
2023-06-14 19:25 ` [PATCH 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-06-16  2:54   ` Junio C Hamano
2023-06-14 21:36 ` [PATCH 0/9] Repack objects into separate packfiles based on a filter Junio C Hamano
2023-06-16  3:08   ` Junio C Hamano
2023-07-05  6:08 ` [PATCH v2 0/8] " Christian Couder
2023-07-05  6:08   ` [PATCH v2 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-07-05  6:08   ` [PATCH v2 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-07-05  6:08   ` [PATCH v2 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-07-05  6:08   ` [PATCH v2 4/8] repack: refactor finding pack prefix Christian Couder
2023-07-05  6:08   ` [PATCH v2 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-07-05 17:53     ` Junio C Hamano
2023-07-24  9:01       ` Christian Couder
2023-07-24 18:28         ` Junio C Hamano
2023-07-25 15:22           ` Christian Couder
2023-07-25 17:25             ` Junio C Hamano
2023-07-25 23:08               ` Junio C Hamano
2023-08-08  8:45                 ` Christian Couder
2023-08-09 20:38                   ` Taylor Blau
2023-08-09 22:50                   ` Junio C Hamano
2023-08-09 23:38                     ` Junio C Hamano
2023-08-10  0:10                       ` Jeff King
2023-07-05 18:12     ` Junio C Hamano
2023-07-24  9:02       ` Christian Couder
2023-07-05  6:08   ` [PATCH v2 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-07-05  6:08   ` [PATCH v2 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-07-05 18:26     ` Junio C Hamano
2023-07-24  9:00       ` Christian Couder
2023-07-24 18:18         ` Junio C Hamano
2023-07-25 13:41           ` Robert Coup
2023-07-25 16:50             ` Junio C Hamano
2023-07-25 15:45           ` Christian Couder
2023-07-05  6:08   ` [PATCH v2 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-07-24  8:59   ` [PATCH v3 0/8] Repack objects into separate packfiles based on a filter Christian Couder
2023-07-24  8:59     ` [PATCH v3 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-07-25 22:38       ` Taylor Blau
2023-07-25 23:51         ` Junio C Hamano
2023-07-24  8:59     ` [PATCH v3 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-07-25 22:44       ` Taylor Blau
2023-08-08  8:28         ` Christian Couder
2023-07-24  8:59     ` [PATCH v3 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-07-25 22:45       ` Taylor Blau
2023-07-24  8:59     ` [PATCH v3 4/8] repack: refactor finding pack prefix Christian Couder
2023-07-25 22:47       ` Taylor Blau
2023-08-08  8:29         ` Christian Couder
2023-07-24  8:59     ` [PATCH v3 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-07-25 23:04       ` Taylor Blau
2023-08-08  8:34         ` Christian Couder
2023-08-09 21:12           ` Taylor Blau
2023-07-24  8:59     ` [PATCH v3 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-07-25 23:07       ` Taylor Blau
2023-08-08  8:38         ` Christian Couder
2023-08-09 21:15           ` Taylor Blau
2023-07-24  8:59     ` [PATCH v3 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-07-24  8:59     ` [PATCH v3 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-07-25 23:10     ` [PATCH v3 0/8] Repack objects into separate packfiles based on a filter Taylor Blau
2023-08-08  8:26     ` [PATCH v4 " Christian Couder
2023-08-08  8:26       ` [PATCH v4 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-08-08  8:26       ` [PATCH v4 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-08-09 21:18         ` Taylor Blau
2023-08-08  8:26       ` [PATCH v4 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-08-08  8:26       ` [PATCH v4 4/8] repack: refactor finding pack prefix Christian Couder
2023-08-09 21:20         ` Taylor Blau
2023-08-08  8:26       ` [PATCH v4 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-08-09 21:40         ` Taylor Blau
2023-08-08  8:26       ` [PATCH v4 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-08-08  8:26       ` [PATCH v4 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-08-08  8:26       ` [PATCH v4 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-08-09 21:45       ` [PATCH v4 0/8] Repack objects into separate packfiles based on a filter Taylor Blau
2023-08-09 21:57         ` Junio C Hamano
2023-08-12  0:12         ` Christian Couder
2023-08-12  0:00       ` [PATCH v5 " Christian Couder
2023-08-12  0:00         ` [PATCH v5 1/8] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-08-12  0:00         ` [PATCH v5 2/8] t/helper: add 'find-pack' test-tool Christian Couder
2023-08-12  0:00         ` [PATCH v5 3/8] repack: refactor finishing pack-objects command Christian Couder
2023-08-12  0:00         ` [PATCH v5 4/8] repack: refactor finding pack prefix Christian Couder
2023-08-12  0:00         ` [PATCH v5 5/8] repack: add `--filter=<filter-spec>` option Christian Couder
2023-08-12  0:00         ` [PATCH v5 6/8] gc: add `gc.repackFilter` config option Christian Couder
2023-08-12  0:00         ` [PATCH v5 7/8] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-08-12  0:00         ` [PATCH v5 8/8] gc: add `gc.repackFilterTo` config option Christian Couder
2023-08-15  0:51         ` [PATCH v5 0/8] Repack objects into separate packfiles based on a filter Junio C Hamano
2023-08-15 21:43           ` Taylor Blau
2023-08-15 22:32             ` Junio C Hamano
2023-08-15 23:09               ` Taylor Blau
2023-08-15 23:18                 ` Junio C Hamano
2023-08-16  0:38                   ` Taylor Blau
2023-08-16 17:16                     ` Junio C Hamano
2023-09-11 15:20                 ` Christian Couder
2023-09-11 15:06         ` [PATCH v6 0/9] " Christian Couder
2023-09-11 15:06           ` [PATCH v6 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-09-11 15:06           ` [PATCH v6 2/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-09-11 15:06           ` [PATCH v6 3/9] repack: refactor finishing pack-objects command Christian Couder
2023-09-11 15:06           ` [PATCH v6 4/9] repack: refactor finding pack prefix Christian Couder
2023-09-11 15:06           ` [PATCH v6 5/9] pack-bitmap-write: rebuild using new bitmap when remapping Christian Couder
2023-09-11 15:06           ` [PATCH v6 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-09-11 15:06           ` [PATCH v6 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-09-11 15:06           ` [PATCH v6 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-09-11 15:06           ` [PATCH v6 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-09-25 15:25           ` [PATCH v7 0/9] Repack objects into separate packfiles based on a filter Christian Couder
2023-09-25 15:25             ` [PATCH v7 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-09-25 15:25             ` [PATCH v7 2/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-09-25 15:25             ` [PATCH v7 3/9] repack: refactor finishing pack-objects command Christian Couder
2023-09-25 15:25             ` [PATCH v7 4/9] repack: refactor finding pack prefix Christian Couder
2023-09-25 15:25             ` [PATCH v7 5/9] pack-bitmap-write: rebuild using new bitmap when remapping Christian Couder
2023-09-25 15:25             ` [PATCH v7 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-09-25 15:25             ` [PATCH v7 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-09-25 15:25             ` [PATCH v7 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-09-25 15:25             ` [PATCH v7 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-09-25 19:14             ` [PATCH v7 0/9] Repack objects into separate packfiles based on a filter Junio C Hamano
2023-09-25 22:41               ` Taylor Blau
2023-10-02 16:54             ` [PATCH v8 " Christian Couder
2023-10-02 16:54               ` [PATCH v8 1/9] pack-objects: allow `--filter` without `--stdout` Christian Couder
2023-10-02 16:54               ` [PATCH v8 2/9] t/helper: add 'find-pack' test-tool Christian Couder
2023-10-02 16:54               ` [PATCH v8 3/9] repack: refactor finishing pack-objects command Christian Couder
2023-10-02 16:54               ` [PATCH v8 4/9] repack: refactor finding pack prefix Christian Couder
2023-10-02 16:55               ` [PATCH v8 5/9] pack-bitmap-write: rebuild using new bitmap when remapping Christian Couder
2023-10-02 16:55               ` [PATCH v8 6/9] repack: add `--filter=<filter-spec>` option Christian Couder
2023-10-02 16:55               ` [PATCH v8 7/9] gc: add `gc.repackFilter` config option Christian Couder
2023-10-02 16:55               ` [PATCH v8 8/9] repack: implement `--filter-to` for storing filtered out objects Christian Couder
2023-10-02 16:55               ` [PATCH v8 9/9] gc: add `gc.repackFilterTo` config option Christian Couder
2023-10-02 20:14               ` [PATCH v8 0/9] Repack objects into separate packfiles based on a filter Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230614192541.1599256-3-christian.couder@gmail.com \
    --to=christian.couder@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=ps@pks.im \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).