All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: newren@gmaill.com, peff@peff.net, me@ttaylorr.com,
	jrnieder@gmail.com, Derrick Stolee <dstolee@microsoft.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 06/10] sparse-checkout: use oidset to prevent repeat blobs
Date: Thu, 07 May 2020 13:17:38 +0000	[thread overview]
Message-ID: <fcf948bda7aebcc5f88c17f5b308b2ce0cc285f5.1588857462.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.627.git.1588857462.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

As we parse the in-tree config files that store the sparse.dir values
used to create an in-tree sparse-checkout definition, we can easily
avoid parsing the same file multiple times by using an oidset on those
blobs. We only parse if the oid is new to the oidset.

This is unlikely to have a major performance benefit right now, but will
be extremely important when we introduce the sparse.inherit options to
link multiple files in a directed graph. This oidset will prevent
infinite loops when cycles exist in that digraph, or exponential blowups
even in the case of a directed acyclic graph.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sparse-checkout.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/sparse-checkout.c b/sparse-checkout.c
index 6c58fda9722..d01f4d7b525 100644
--- a/sparse-checkout.c
+++ b/sparse-checkout.c
@@ -9,6 +9,7 @@
 #include "string-list.h"
 #include "unpack-trees.h"
 #include "object-store.h"
+#include "oidset.h"
 
 char *get_sparse_checkout_filename(void)
 {
@@ -77,9 +78,12 @@ int load_in_tree_pattern_list(struct repository *r,
 			      struct string_list *sl,
 			      struct pattern_list *pl)
 {
+	int result = 0;
 	struct index_state *istate = r->index;
 	struct string_list_item *item;
 	struct strbuf path = STRBUF_INIT;
+	struct oidset set;
+	oidset_init(&set, 16);
 
 	pl->use_cone_patterns = 1;
 
@@ -96,24 +100,34 @@ int load_in_tree_pattern_list(struct repository *r,
 		 * Use -1 return to ensure populate_from_existing_patterns()
 		 * skips the sparse-checkout updates.
 		 */
-		if (pos < 0)
-			return -1;
+		if (pos < 0) {
+			result = -1;
+			goto cleanup;
+		}
 
 		oid = &istate->cache[pos]->oid;
+
+		if (oidset_contains(&set, oid))
+			continue;
+
+		oidset_insert(&set, oid);
+
 		type = oid_object_info(r, oid, NULL);
 
 		if (type != OBJ_BLOB) {
 			warning(_("expected a file at '%s'; not updating sparse-checkout"),
 				oid_to_hex(oid));
-			return 1;
+			result = 1;
+			goto cleanup;
 		}
 
 		load_in_tree_from_blob(pl, oid);
 	}
 
+cleanup:
 	strbuf_release(&path);
-
-	return 0;
+	oidset_clear(&set);
+	return result;
 }
 
 int populate_sparse_checkout_patterns(struct pattern_list *pl)
-- 
gitgitgadget


  parent reply	other threads:[~2020-05-07 13:18 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-07 13:17 [PATCH 00/10] [RFC] In-tree sparse-checkout definitions Derrick Stolee via GitGitGadget
2020-05-07 13:17 ` [PATCH 01/10] unpack-trees: avoid array out-of-bounds error Derrick Stolee via GitGitGadget
2020-05-07 22:27   ` Junio C Hamano
2020-05-08 12:19     ` Derrick Stolee
2020-05-08 15:09       ` Junio C Hamano
2020-05-20 16:32     ` Elijah Newren
2020-05-07 13:17 ` [PATCH 02/10] sparse-checkout: move code from builtin Derrick Stolee via GitGitGadget
2020-05-07 13:17 ` [PATCH 03/10] sparse-checkout: move code from unpack-trees.c Derrick Stolee via GitGitGadget
2020-05-07 13:17 ` [PATCH 04/10] sparse-checkout: allow in-tree definitions Derrick Stolee via GitGitGadget
2020-05-07 22:58   ` Junio C Hamano
2020-05-08 15:40     ` Derrick Stolee
2020-05-20 17:52       ` Elijah Newren
2020-06-17 23:07         ` Elijah Newren
2020-06-18  8:18           ` Son Luong Ngoc
2020-05-07 13:17 ` [PATCH 05/10] sparse-checkout: automatically update in-tree definition Derrick Stolee via GitGitGadget
2020-05-20 16:28   ` Elijah Newren
2020-05-07 13:17 ` Derrick Stolee via GitGitGadget [this message]
2020-05-20 16:40   ` [PATCH 06/10] sparse-checkout: use oidset to prevent repeat blobs Elijah Newren
2020-05-21  3:49     ` Elijah Newren
2020-05-21 17:54       ` Derrick Stolee
2020-05-07 13:17 ` [PATCH 07/10] sparse-checkout: define in-tree dependencies Derrick Stolee via GitGitGadget
2020-05-20 18:10   ` Elijah Newren
2020-05-30 17:26     ` Elijah Newren
2020-05-07 13:17 ` [PATCH 08/10] Makefile: skip git-gui if dir is missing Derrick Stolee via GitGitGadget
2020-05-07 13:17 ` [PATCH 09/10] Makefile: disable GETTEXT when 'po' " Derrick Stolee via GitGitGadget
2020-05-07 13:17 ` [PATCH 10/10] .sparse: add in-tree sparse-checkout for Git Derrick Stolee via GitGitGadget
2020-05-20 17:38 ` [PATCH 00/10] [RFC] In-tree sparse-checkout definitions Elijah Newren
2020-06-17 23:14 ` Elijah Newren
2020-06-18  1:42   ` Derrick Stolee
2020-06-18  1:59     ` Elijah Newren
2020-06-18  3:01       ` Derrick Stolee
2020-06-18  5:03         ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fcf948bda7aebcc5f88c17f5b308b2ce0cc285f5.1588857462.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmaill.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.