git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 6/9] oid-array: provide a for-loop iterator
Date: Fri, 4 Dec 2020 13:53:23 -0500	[thread overview]
Message-ID: <X8qFo+GJJTbaPV58@coredump.intra.peff.net> (raw)
In-Reply-To: <X8qEg/KiAQDugPC0@coredump.intra.peff.net>

We provide oid_array_for_each_unique() for iterating over the
de-duplicated items in an array. But it's awkward to use for two
reasons:

  1. It uses a callback, which means marshaling arguments into a struct
     and passing it to the callback with a void parameter.

  2. The callback doesn't know the numeric index of the oid we're
     looking at. This is useful for things like progress meters.

Iterating with a for-loop is much more natural for some cases, but the
caller has to do the de-duping itself. However, we can provide a small
helper to make this easier (see the docstring in the header for an
example use).

The caller does have to remember to sort the array first. We could add
an assertion into the helper that array->sorted is set, but I didn't
want to complicate what is otherwise a pretty fast code path.

I also considered adding a full iterator type with init/next/end
functions (similar to what we have for hashmaps). But it ended up making
the callers much harder to read. This version keeps us close to a basic
for-loop.

Yet another option would be adding an option to sort the array and
compact out the duplicates. This would mean iterating over the array an
extra time, though that's probably not a big deal (we did just do an
O(n log n) sort). But we'd still have to write a for-loop to iterate, so
it doesn't really make anything easier for the caller.

No new test, since we'll convert the callback iterator (which is covered
by t0064, among other callers) to use the new code.

Signed-off-by: Jeff King <peff@peff.net>
---
 oid-array.c |  7 ++-----
 oid-array.h | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/oid-array.c b/oid-array.c
index 29f718d835..8e1bcedc0c 100644
--- a/oid-array.c
+++ b/oid-array.c
@@ -67,11 +67,8 @@ int oid_array_for_each_unique(struct oid_array *array,
 
 	oid_array_sort(array);
 
-	for (i = 0; i < array->nr; i++) {
-		int ret;
-		if (i > 0 && oideq(array->oid + i, array->oid + i - 1))
-			continue;
-		ret = fn(array->oid + i, data);
+	for (i = 0; i < array->nr; i = oid_array_next_unique(array, i)) {
+		int ret = fn(array->oid + i, data);
 		if (ret)
 			return ret;
 	}
diff --git a/oid-array.h b/oid-array.h
index 6a22c0ac94..5d86ea5a30 100644
--- a/oid-array.h
+++ b/oid-array.h
@@ -1,6 +1,8 @@
 #ifndef OID_ARRAY_H
 #define OID_ARRAY_H
 
+#include "hash.h"
+
 /**
  * The API provides storage and manipulation of sets of object identifiers.
  * The emphasis is on storage and processing efficiency, making them suitable
@@ -111,4 +113,24 @@ void oid_array_filter(struct oid_array *array,
  */
 void oid_array_sort(struct oid_array *array);
 
+/**
+ * Find the next unique oid in the array after position "cur". You
+ * can use this to iterate over unique elements, like:
+ *
+ *   size_t i;
+ *   oid_array_sort(array);
+ *   for (i = 0; i < array->nr; i = oid_array_next_unique(array, i))
+ *	printf("%s", oid_to_hex(array->oids[i]);
+ *
+ * Non-unique iteration can just increment with "i++" to visit each element.
+ */
+static inline size_t oid_array_next_unique(struct oid_array *array, size_t cur)
+{
+	do {
+		cur++;
+	} while (cur < array->nr &&
+		 oideq(array->oid + cur, array->oid + cur - 1));
+	return cur;
+}
+
 #endif /* OID_ARRAY_H */
-- 
2.29.2.896.g080220a959


  parent reply	other threads:[~2020-12-04 18:54 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04 18:48 [PATCH 0/9] misc commit-graph and oid-array cleanups Jeff King
2020-12-04 18:48 ` [PATCH 1/9] oid-array.h: drop sha1 mention from header guard Jeff King
2020-12-04 18:49 ` [PATCH 2/9] t0064: drop sha1 mention from filename Jeff King
2020-12-04 18:50 ` [PATCH 3/9] t0064: make duplicate tests more robust Jeff King
2020-12-04 18:51 ` [PATCH 4/9] cache.h: move hash/oid functions to hash.h Jeff King
2020-12-04 18:52 ` [PATCH 5/9] oid-array: make sort function public Jeff King
2020-12-04 18:53 ` Jeff King [this message]
2020-12-04 19:05   ` [PATCH 6/9] oid-array: provide a for-loop iterator Taylor Blau
2020-12-04 19:11     ` Taylor Blau
2020-12-04 19:52       ` Jeff King
2020-12-04 19:51     ` Jeff King
2020-12-04 19:18   ` Eric Sunshine
2020-12-04 20:44     ` Jeff King
2020-12-04 20:57       ` Eric Sunshine
2020-12-04 21:54     ` Junio C Hamano
2020-12-07 19:05       ` Jeff King
2020-12-04 18:56 ` [PATCH 7/9] commit-graph: drop count_distinct_commits() function Jeff King
2020-12-04 20:06   ` Derrick Stolee
2020-12-04 20:42     ` Jeff King
2020-12-04 20:47       ` Derrick Stolee
2020-12-04 20:50         ` Jeff King
2020-12-04 21:01           ` Derrick Stolee
2020-12-05  2:26   ` Ævar Arnfjörð Bjarmason
2020-12-07 19:01     ` Jeff King
2020-12-04 18:57 ` [PATCH 8/9] commit-graph: replace packed_oid_list with oid_array Jeff King
2020-12-04 19:14   ` Taylor Blau
2020-12-04 18:58 ` [PATCH 9/9] commit-graph: use size_t for array allocation and indexing Jeff King
2020-12-04 19:15 ` [PATCH 0/9] misc commit-graph and oid-array cleanups Taylor Blau
2020-12-04 20:08   ` Derrick Stolee
2020-12-07 19:10 ` [PATCH v2 " Jeff King
2020-12-07 19:10   ` [PATCH v2 1/9] oid-array.h: drop sha1 mention from header guard Jeff King
2020-12-07 19:10   ` [PATCH v2 2/9] t0064: drop sha1 mention from filename Jeff King
2020-12-07 19:10   ` [PATCH v2 3/9] t0064: make duplicate tests more robust Jeff King
2020-12-07 19:10   ` [PATCH v2 4/9] cache.h: move hash/oid functions to hash.h Jeff King
2020-12-07 19:10   ` [PATCH v2 5/9] oid-array: make sort function public Jeff King
2020-12-07 19:11   ` [PATCH v2 6/9] oid-array: provide a for-loop iterator Jeff King
2020-12-07 19:11   ` [PATCH v2 7/9] commit-graph: drop count_distinct_commits() function Jeff King
2020-12-07 19:11   ` [PATCH v2 8/9] commit-graph: replace packed_oid_list with oid_array Jeff King
2020-12-07 19:11   ` [PATCH v2 9/9] commit-graph: use size_t for array allocation and indexing Jeff King
2020-12-07 19:26   ` [PATCH v2 0/9] misc commit-graph and oid-array cleanups Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X8qFo+GJJTbaPV58@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).