git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/8] Add delta islands support
@ 2018-08-09 15:55 Christian Couder
  2018-08-09 15:55 ` [PATCH v3 1/8] packfile: make get_delta_base() non static Christian Couder
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

This patch series is upstreaming work made by GitHub and available in:

https://github.com/peff/git/commits/jk/delta-islands

The above work has been already described in the following article:

https://githubengineering.com/counting-objects/

The above branch contains only one patch. In this patch series the
patch has been split into 5 patches (1/8, 2/8, 4/8, 5/8 and 6/8) with
their own commit message, and on top of that 3 new patches (3/8, 7/8
and 8/8) have been added. The new patches implement things that were
requested after the previous iterations.

I kept Peff as the author of the original 5 patches and took the
liberty to add his Signed-off-by to them.

As explained in details in the Counting Object article referenced
above, the goal of the delta island mechanism is for a hosting
provider to make it possible to have the "forks" of a repository share
as much storage as possible while preventing object packs to contain
deltas between different forks.

If deltas between different forks are not prevented, when users clone
or fetch a fork, preparing the pack that should be sent to them can be
very costly CPU wise, as objects from a different fork should not be
sent, which means that a lot of deltas might need to be computed
again (instead of reusing existing deltas).


The following changes have been made since the previous iteration:

* suggested by Duy: move the code computing the write order for a
  layer to a new separate function called compute_layer_order() in
  builtin/pack-objects.c in new patch 3/8

  I think that indeed this makes the following patch (4/8) shorter and
  easier to understand as well as the overall result nicer.

* suggested by Duy and Peff: rework the way the 'tree_depth' field is
  moved from 'struct object_entry' to 'struct packing_data' in
  pack-object.h in patch 7/8

* suggested by Duy and Peff: move field 'layer' from
  'struct object_entry' to 'struct packing_data' in pack-object.h in
  new patch 8/8

This patch series is also available on GitHub in:

https://github.com/chriscool/git/commits/delta-islands

The previous versions are available there:

V2: https://github.com/chriscool/git/commits/delta-islands19
V1: https://github.com/chriscool/git/commits/delta-islands6

V2: https://public-inbox.org/git/20180805172525.15278-1-chriscool@tuxfamily.org/
V1: https://public-inbox.org/git/20180722054836.28935-1-chriscool@tuxfamily.org/


Christian Couder (3):
  pack-objects: refactor code into compute_layer_order()
  pack-objects: move tree_depth into 'struct packing_data'
  pack-objects: move 'layer' into 'struct packing_data'

Jeff King (5):
  packfile: make get_delta_base() non static
  Add delta-islands.{c,h}
  pack-objects: add delta-islands support
  repack: add delta-islands support
  t: add t5319-delta-islands.sh

 Documentation/config.txt           |  19 ++
 Documentation/git-pack-objects.txt |  97 ++++++
 Documentation/git-repack.txt       |   5 +
 Makefile                           |   1 +
 builtin/pack-objects.c             | 137 +++++---
 builtin/repack.c                   |   9 +
 delta-islands.c                    | 506 +++++++++++++++++++++++++++++
 delta-islands.h                    |  11 +
 pack-objects.c                     |  12 +
 pack-objects.h                     |  39 +++
 packfile.c                         |  10 +-
 packfile.h                         |   7 +
 t/t5319-delta-islands.sh           | 143 ++++++++
 13 files changed, 948 insertions(+), 48 deletions(-)
 create mode 100644 delta-islands.c
 create mode 100644 delta-islands.h
 create mode 100755 t/t5319-delta-islands.sh

-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3 1/8] packfile: make get_delta_base() non static
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 15:55 ` [PATCH v3 2/8] Add delta-islands.{c,h} Christian Couder
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

From: Jeff King <peff@peff.net>

As get_delta_base() will be used outside 'packfile.c' in
a following commit, let's make it non static and let's
declare it in 'packfile.h'.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 packfile.c | 10 +++++-----
 packfile.h |  7 +++++++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/packfile.c b/packfile.c
index 6974903e58..04d387f312 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1037,11 +1037,11 @@ const struct packed_git *has_packed_and_bad(const unsigned char *sha1)
 	return NULL;
 }
 
-static off_t get_delta_base(struct packed_git *p,
-				    struct pack_window **w_curs,
-				    off_t *curpos,
-				    enum object_type type,
-				    off_t delta_obj_offset)
+off_t get_delta_base(struct packed_git *p,
+		     struct pack_window **w_curs,
+		     off_t *curpos,
+		     enum object_type type,
+		     off_t delta_obj_offset)
 {
 	unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
 	off_t base_offset;
diff --git a/packfile.h b/packfile.h
index cc7eaffe1b..1265fd9b06 100644
--- a/packfile.h
+++ b/packfile.h
@@ -126,6 +126,13 @@ extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsig
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
+/*
+ * Return the offset of the object that is the delta base of the object at curpos.
+ */
+extern off_t get_delta_base(struct packed_git *p, struct pack_window **w_curs,
+			    off_t *curpos, enum object_type type,
+			    off_t delta_obj_offset);
+
 extern void release_pack_memory(size_t);
 
 /* global flag to enable extra checks when accessing packed objects */
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 2/8] Add delta-islands.{c,h}
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
  2018-08-09 15:55 ` [PATCH v3 1/8] packfile: make get_delta_base() non static Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-11  9:04   ` SZEDER Gábor
  2018-08-09 15:55 ` [PATCH v3 3/8] pack-objects: refactor code into compute_layer_order() Christian Couder
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

From: Jeff King <peff@peff.net>

Hosting providers that allow users to "fork" existing
repos want those forks to share as much disk space as
possible.

Alternates are an existing solution to keep all the
objects from all the forks into a unique central repo,
but this can have some drawbacks. Especially when
packing the central repo, deltas will be created
between objects from different forks.

This can make cloning or fetching a fork much slower
and much more CPU intensive as Git might have to
compute new deltas for many objects to avoid sending
objects from a different fork.

Because the inefficiency primarily arises when an
object is delitified against another object that does
not exist in the same fork, we partition objects into
sets that appear in the same fork, and define
"delta islands". When finding delta base, we do not
allow an object outside the same island to be
considered as its base.

So "delta islands" is a way to store objects from
different forks in the same repo and packfile without
having deltas between objects from different forks.

This patch implements the delta islands mechanism in
"delta-islands.{c,h}", but does not yet make use of it.

A few new fields are added in 'struct object_entry'
in "pack-objects.h" though.

The documentation will follow in a patch that actually
uses delta islands in "builtin/pack-objects.c".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Makefile        |   1 +
 delta-islands.c | 497 ++++++++++++++++++++++++++++++++++++++++++++++++
 delta-islands.h |  11 ++
 pack-objects.h  |   4 +
 4 files changed, 513 insertions(+)
 create mode 100644 delta-islands.c
 create mode 100644 delta-islands.h

diff --git a/Makefile b/Makefile
index bc4fc8eeab..e7994888e8 100644
--- a/Makefile
+++ b/Makefile
@@ -841,6 +841,7 @@ LIB_OBJS += csum-file.o
 LIB_OBJS += ctype.o
 LIB_OBJS += date.o
 LIB_OBJS += decorate.o
+LIB_OBJS += delta-islands.o
 LIB_OBJS += diffcore-break.o
 LIB_OBJS += diffcore-delta.o
 LIB_OBJS += diffcore-order.o
diff --git a/delta-islands.c b/delta-islands.c
new file mode 100644
index 0000000000..448ddcbbe4
--- /dev/null
+++ b/delta-islands.c
@@ -0,0 +1,497 @@
+#include "cache.h"
+#include "attr.h"
+#include "object.h"
+#include "blob.h"
+#include "commit.h"
+#include "tag.h"
+#include "tree.h"
+#include "delta.h"
+#include "pack.h"
+#include "tree-walk.h"
+#include "diff.h"
+#include "revision.h"
+#include "list-objects.h"
+#include "progress.h"
+#include "refs.h"
+#include "khash.h"
+#include "pack-bitmap.h"
+#include "pack-objects.h"
+#include "delta-islands.h"
+#include "sha1-array.h"
+#include "config.h"
+
+KHASH_INIT(str, const char *, void *, 1, kh_str_hash_func, kh_str_hash_equal)
+
+static khash_sha1 *island_marks;
+static unsigned island_counter;
+static unsigned island_counter_core;
+
+static kh_str_t *remote_islands;
+
+struct remote_island {
+	uint64_t hash;
+	struct oid_array oids;
+};
+
+struct island_bitmap {
+	uint32_t refcount;
+	uint32_t bits[];
+};
+
+static uint32_t island_bitmap_size;
+
+/*
+ * Allocate a new bitmap; if "old" is not NULL, the new bitmap will be a copy
+ * of "old". Otherwise, the new bitmap is empty.
+ */
+static struct island_bitmap *island_bitmap_new(const struct island_bitmap *old)
+{
+	size_t size = sizeof(struct island_bitmap) + (island_bitmap_size * 4);
+	struct island_bitmap *b = xcalloc(1, size);
+
+	if (old)
+		memcpy(b, old, size);
+
+	b->refcount = 1;
+	return b;
+}
+
+static void island_bitmap_or(struct island_bitmap *a, const struct island_bitmap *b)
+{
+	uint32_t i;
+
+	for (i = 0; i < island_bitmap_size; ++i)
+		a->bits[i] |= b->bits[i];
+}
+
+static int island_bitmap_is_subset(struct island_bitmap *self,
+		struct island_bitmap *super)
+{
+	uint32_t i;
+
+	if (self == super)
+		return 1;
+
+	for (i = 0; i < island_bitmap_size; ++i) {
+		if ((self->bits[i] & super->bits[i]) != self->bits[i])
+			return 0;
+	}
+
+	return 1;
+}
+
+#define ISLAND_BITMAP_BLOCK(x) (x / 32)
+#define ISLAND_BITMAP_MASK(x) (1 << (x % 32))
+
+static void island_bitmap_set(struct island_bitmap *self, uint32_t i)
+{
+	self->bits[ISLAND_BITMAP_BLOCK(i)] |= ISLAND_BITMAP_MASK(i);
+}
+
+static int island_bitmap_get(struct island_bitmap *self, uint32_t i)
+{
+	return (self->bits[ISLAND_BITMAP_BLOCK(i)] & ISLAND_BITMAP_MASK(i)) != 0;
+}
+
+int in_same_island(const struct object_id *trg_oid, const struct object_id *src_oid)
+{
+	khiter_t trg_pos, src_pos;
+
+	/* If we aren't using islands, assume everything goes together. */
+	if (!island_marks)
+		return 1;
+
+	/*
+	 * If we don't have a bitmap for the target, we can delta it
+	 * against anything -- it's not an important object
+	 */
+	trg_pos = kh_get_sha1(island_marks, trg_oid->hash);
+	if (trg_pos >= kh_end(island_marks))
+		return 1;
+
+	/*
+	 * if the source (our delta base) doesn't have a bitmap,
+	 * we don't want to base any deltas on it!
+	 */
+	src_pos = kh_get_sha1(island_marks, src_oid->hash);
+	if (src_pos >= kh_end(island_marks))
+		return 0;
+
+	return island_bitmap_is_subset(kh_value(island_marks, trg_pos),
+				kh_value(island_marks, src_pos));
+}
+
+int island_delta_cmp(const struct object_id *a, const struct object_id *b)
+{
+	khiter_t a_pos, b_pos;
+	struct island_bitmap *a_bitmap = NULL, *b_bitmap = NULL;
+
+	if (!island_marks)
+		return 0;
+
+	a_pos = kh_get_sha1(island_marks, a->hash);
+	if (a_pos < kh_end(island_marks))
+		a_bitmap = kh_value(island_marks, a_pos);
+
+	b_pos = kh_get_sha1(island_marks, b->hash);
+	if (b_pos < kh_end(island_marks))
+		b_bitmap = kh_value(island_marks, b_pos);
+
+	if (a_bitmap) {
+		if (!b_bitmap || !island_bitmap_is_subset(a_bitmap, b_bitmap))
+			return -1;
+	}
+	if (b_bitmap) {
+		if (!a_bitmap || !island_bitmap_is_subset(b_bitmap, a_bitmap))
+			return 1;
+	}
+
+	return 0;
+}
+
+static struct island_bitmap *create_or_get_island_marks(struct object *obj)
+{
+	khiter_t pos;
+	int hash_ret;
+
+	pos = kh_put_sha1(island_marks, obj->oid.hash, &hash_ret);
+	if (hash_ret)
+		kh_value(island_marks, pos) = island_bitmap_new(NULL);
+
+	return kh_value(island_marks, pos);
+}
+
+static void set_island_marks(struct object *obj, struct island_bitmap *marks)
+{
+	struct island_bitmap *b;
+	khiter_t pos;
+	int hash_ret;
+
+	pos = kh_put_sha1(island_marks, obj->oid.hash, &hash_ret);
+	if (hash_ret) {
+		/*
+		 * We don't have one yet; make a copy-on-write of the
+		 * parent.
+		 */
+		marks->refcount++;
+		kh_value(island_marks, pos) = marks;
+		return;
+	}
+
+	/*
+	 * We do have it. Make sure we split any copy-on-write before
+	 * updating.
+	 */
+	b = kh_value(island_marks, pos);
+	if (b->refcount > 1) {
+		b->refcount--;
+		b = kh_value(island_marks, pos) = island_bitmap_new(b);
+	}
+	island_bitmap_or(b, marks);
+}
+
+static void mark_remote_island_1(struct remote_island *rl, int is_core_island)
+{
+	uint32_t i;
+
+	for (i = 0; i < rl->oids.nr; ++i) {
+		struct island_bitmap *marks;
+		struct object *obj = parse_object(the_repository, &rl->oids.oid[i]);
+
+		if (!obj)
+			continue;
+
+		marks = create_or_get_island_marks(obj);
+		island_bitmap_set(marks, island_counter);
+
+		if (is_core_island && obj->type == OBJ_COMMIT)
+			obj->flags |= NEEDS_BITMAP;
+
+		/* If it was a tag, also make sure we hit the underlying object. */
+		while (obj && obj->type == OBJ_TAG) {
+			obj = ((struct tag *)obj)->tagged;
+			if (obj) {
+				parse_object(the_repository, &obj->oid);
+				marks = create_or_get_island_marks(obj);
+				island_bitmap_set(marks, island_counter);
+			}
+		}
+	}
+
+	if (is_core_island)
+		island_counter_core = island_counter;
+
+	island_counter++;
+}
+
+static int cmp_tree_depth(const void *va, const void *vb)
+{
+	struct object_entry *a = *(struct object_entry **)va;
+	struct object_entry *b = *(struct object_entry **)vb;
+	return a->tree_depth - b->tree_depth;
+}
+
+void resolve_tree_islands(int progress, struct packing_data *to_pack)
+{
+	struct progress *progress_state = NULL;
+	struct object_entry **todo;
+	int nr = 0;
+	int i;
+
+	if (!island_marks)
+		return;
+
+	/*
+	 * We process only trees, as commits and tags have already been handled
+	 * (and passed their marks on to root trees, as well. We must make sure
+	 * to process them in descending tree-depth order so that marks
+	 * propagate down the tree properly, even if a sub-tree is found in
+	 * multiple parent trees.
+	 */
+	ALLOC_ARRAY(todo, to_pack->nr_objects);
+	for (i = 0; i < to_pack->nr_objects; i++) {
+		if (oe_type(&to_pack->objects[i]) == OBJ_TREE)
+			todo[nr++] = &to_pack->objects[i];
+	}
+	QSORT(todo, nr, cmp_tree_depth);
+
+	if (progress)
+		progress_state = start_progress(_("Propagating island marks"), nr);
+
+	for (i = 0; i < nr; i++) {
+		struct object_entry *ent = todo[i];
+		struct island_bitmap *root_marks;
+		struct tree *tree;
+		struct tree_desc desc;
+		struct name_entry entry;
+		khiter_t pos;
+
+		pos = kh_get_sha1(island_marks, ent->idx.oid.hash);
+		if (pos >= kh_end(island_marks))
+			continue;
+
+		root_marks = kh_value(island_marks, pos);
+
+		tree = lookup_tree(the_repository, &ent->idx.oid);
+		if (!tree || parse_tree(tree) < 0)
+			die(_("bad tree object %s"), oid_to_hex(&ent->idx.oid));
+
+		init_tree_desc(&desc, tree->buffer, tree->size);
+		while (tree_entry(&desc, &entry)) {
+			struct object *obj;
+
+			if (S_ISGITLINK(entry.mode))
+				continue;
+
+			obj = lookup_object(the_repository, entry.oid->hash);
+			if (!obj)
+				continue;
+
+			set_island_marks(obj, root_marks);
+		}
+
+		free_tree_buffer(tree);
+
+		display_progress(progress_state, i+1);
+	}
+
+	stop_progress(&progress_state);
+	free(todo);
+}
+
+static regex_t *island_regexes;
+static unsigned int island_regexes_alloc, island_regexes_nr;
+static const char *core_island_name;
+
+static int island_config_callback(const char *k, const char *v, void *cb)
+{
+	if (!strcmp(k, "pack.island")) {
+		struct strbuf re = STRBUF_INIT;
+
+		if (!v)
+			return config_error_nonbool(k);
+
+		if (island_regexes_nr >= island_regexes_alloc) {
+			island_regexes_alloc = (island_regexes_alloc + 8) * 2;
+			island_regexes = xrealloc(island_regexes,
+					island_regexes_alloc * sizeof(regex_t));
+		}
+
+		if (*v != '^')
+			strbuf_addch(&re, '^');
+		strbuf_addstr(&re, v);
+
+		if (regcomp(&island_regexes[island_regexes_nr], re.buf, REG_EXTENDED))
+			die(_("failed to load island regex for '%s': %s"), k, re.buf);
+
+		strbuf_release(&re);
+		island_regexes_nr++;
+		return 0;
+	}
+
+	if (!strcmp(k, "pack.islandcore"))
+		return git_config_string(&core_island_name, k, v);
+
+	return 0;
+}
+
+static void add_ref_to_island(const char *island_name, const struct object_id *oid)
+{
+	uint64_t sha_core;
+	struct remote_island *rl = NULL;
+
+	int hash_ret;
+	khiter_t pos = kh_put_str(remote_islands, island_name, &hash_ret);
+
+	if (hash_ret) {
+		kh_key(remote_islands, pos) = xstrdup(island_name);
+		kh_value(remote_islands, pos) = xcalloc(1, sizeof(struct remote_island));
+	}
+
+	rl = kh_value(remote_islands, pos);
+	oid_array_append(&rl->oids, oid);
+
+	memcpy(&sha_core, oid->hash, sizeof(uint64_t));
+	rl->hash += sha_core;
+}
+
+static int find_island_for_ref(const char *refname, const struct object_id *oid,
+			       int flags, void *data)
+{
+	/*
+	 * We should advertise 'ARRAY_SIZE(matches) - 2' as the max,
+	 * so we can diagnose below a config with more capture groups
+	 * than we support.
+	 */
+	regmatch_t matches[16];
+	int i, m;
+	struct strbuf island_name = STRBUF_INIT;
+
+	/* walk backwards to get last-one-wins ordering */
+	for (i = island_regexes_nr - 1; i >= 0; i--) {
+		if (!regexec(&island_regexes[i], refname,
+			     ARRAY_SIZE(matches), matches, 0))
+			break;
+	}
+
+	if (i < 0)
+		return 0;
+
+	if (matches[ARRAY_SIZE(matches) - 1].rm_so != -1)
+		warning(_("island regex from config has "
+			  "too many capture groups (max=%d)"),
+			(int)ARRAY_SIZE(matches) - 2);
+
+	for (m = 1; m < ARRAY_SIZE(matches); m++) {
+		regmatch_t *match = &matches[m];
+
+		if (match->rm_so == -1)
+			continue;
+
+		if (island_name.len)
+			strbuf_addch(&island_name, '-');
+
+		strbuf_add(&island_name, refname + match->rm_so, match->rm_eo - match->rm_so);
+	}
+
+	add_ref_to_island(island_name.buf, oid);
+	strbuf_release(&island_name);
+	return 0;
+}
+
+static struct remote_island *get_core_island(void)
+{
+	if (core_island_name) {
+		khiter_t pos = kh_get_str(remote_islands, core_island_name);
+		if (pos < kh_end(remote_islands))
+			return kh_value(remote_islands, pos);
+	}
+
+	return NULL;
+}
+
+static void deduplicate_islands(void)
+{
+	struct remote_island *island, *core = NULL, **list;
+	unsigned int island_count, dst, src, ref, i = 0;
+
+	island_count = kh_size(remote_islands);
+	list = xmalloc(island_count * sizeof(struct remote_island *));
+
+	kh_foreach_value(remote_islands, island, {
+		list[i++] = island;
+	});
+
+	for (ref = 0; ref + 1 < island_count; ref++) {
+		for (src = ref + 1, dst = src; src < island_count; src++) {
+			if (list[ref]->hash == list[src]->hash)
+				continue;
+
+			if (src != dst)
+				list[dst] = list[src];
+
+			dst++;
+		}
+		island_count = dst;
+	}
+
+	island_bitmap_size = (island_count / 32) + 1;
+	core = get_core_island();
+
+	for (i = 0; i < island_count; ++i) {
+		mark_remote_island_1(list[i], core && list[i]->hash == core->hash);
+	}
+
+	free(list);
+}
+
+void load_delta_islands(void)
+{
+	island_marks = kh_init_sha1();
+	remote_islands = kh_init_str();
+
+	git_config(island_config_callback, NULL);
+	for_each_ref(find_island_for_ref, NULL);
+	deduplicate_islands();
+
+	fprintf(stderr, _("Marked %d islands, done.\n"), island_counter);
+}
+
+void propagate_island_marks(struct commit *commit)
+{
+	khiter_t pos = kh_get_sha1(island_marks, commit->object.oid.hash);
+
+	if (pos < kh_end(island_marks)) {
+		struct commit_list *p;
+		struct island_bitmap *root_marks = kh_value(island_marks, pos);
+
+		parse_commit(commit);
+		set_island_marks(&get_commit_tree(commit)->object, root_marks);
+		for (p = commit->parents; p; p = p->next)
+			set_island_marks(&p->item->object, root_marks);
+	}
+}
+
+int compute_pack_layers(struct packing_data *to_pack)
+{
+	uint32_t i;
+
+	if (!core_island_name || !island_marks)
+		return 1;
+
+	for (i = 0; i < to_pack->nr_objects; ++i) {
+		struct object_entry *entry = &to_pack->objects[i];
+		khiter_t pos = kh_get_sha1(island_marks, entry->idx.oid.hash);
+
+		entry->layer = 1;
+
+		if (pos < kh_end(island_marks)) {
+			struct island_bitmap *bitmap = kh_value(island_marks, pos);
+
+			if (island_bitmap_get(bitmap, island_counter_core))
+				entry->layer = 0;
+		}
+	}
+
+	return 2;
+}
diff --git a/delta-islands.h b/delta-islands.h
new file mode 100644
index 0000000000..f9725730f4
--- /dev/null
+++ b/delta-islands.h
@@ -0,0 +1,11 @@
+#ifndef DELTA_ISLANDS_H
+#define DELTA_ISLANDS_H
+
+int island_delta_cmp(const struct object_id *a, const struct object_id *b);
+int in_same_island(const struct object_id *, const struct object_id *);
+void resolve_tree_islands(int progress, struct packing_data *to_pack);
+void load_delta_islands(void);
+void propagate_island_marks(struct commit *commit);
+int compute_pack_layers(struct packing_data *to_pack);
+
+#endif /* DELTA_ISLANDS_H */
diff --git a/pack-objects.h b/pack-objects.h
index edf74dabdd..8eecd67991 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -100,6 +100,10 @@ struct object_entry {
 	unsigned type_:TYPE_BITS;
 	unsigned no_try_delta:1;
 	unsigned in_pack_type:TYPE_BITS; /* could be delta */
+
+	unsigned int tree_depth; /* should be repositioned for packing? */
+	unsigned char layer;
+
 	unsigned preferred_base:1; /*
 				    * we do not pack this, but is available
 				    * to be used as the base object to delta
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 3/8] pack-objects: refactor code into compute_layer_order()
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
  2018-08-09 15:55 ` [PATCH v3 1/8] packfile: make get_delta_base() non static Christian Couder
  2018-08-09 15:55 ` [PATCH v3 2/8] Add delta-islands.{c,h} Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 15:55 ` [PATCH v3 4/8] pack-objects: add delta-islands support Christian Couder
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

In a following commit, as we will use delta islands, we will
have to compute the write order for different layers, not just
for one.

Let's prepare for that by refactoring the code that will be
used to compute the write order for a given layer into a new
compute_layer_order() function.

This will make it easier to see and understand what the
following changes are doing.

Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/pack-objects.c | 90 +++++++++++++++++++++++-------------------
 1 file changed, 50 insertions(+), 40 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4391504a91..99b6329399 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -667,48 +667,15 @@ static void add_family_to_write_order(struct object_entry **wo,
 	add_descendants_to_write_order(wo, endp, root);
 }
 
-static struct object_entry **compute_write_order(void)
+static void compute_layer_order(struct object_entry **wo, unsigned int *wo_end)
 {
-	unsigned int i, wo_end, last_untagged;
-
-	struct object_entry **wo;
+	unsigned int i, last_untagged;
 	struct object_entry *objects = to_pack.objects;
 
 	for (i = 0; i < to_pack.nr_objects; i++) {
-		objects[i].tagged = 0;
-		objects[i].filled = 0;
-		SET_DELTA_CHILD(&objects[i], NULL);
-		SET_DELTA_SIBLING(&objects[i], NULL);
-	}
-
-	/*
-	 * Fully connect delta_child/delta_sibling network.
-	 * Make sure delta_sibling is sorted in the original
-	 * recency order.
-	 */
-	for (i = to_pack.nr_objects; i > 0;) {
-		struct object_entry *e = &objects[--i];
-		if (!DELTA(e))
-			continue;
-		/* Mark me as the first child */
-		e->delta_sibling_idx = DELTA(e)->delta_child_idx;
-		SET_DELTA_CHILD(DELTA(e), e);
-	}
-
-	/*
-	 * Mark objects that are at the tip of tags.
-	 */
-	for_each_tag_ref(mark_tagged, NULL);
-
-	/*
-	 * Give the objects in the original recency order until
-	 * we see a tagged tip.
-	 */
-	ALLOC_ARRAY(wo, to_pack.nr_objects);
-	for (i = wo_end = 0; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			break;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_to_write_order(wo, wo_end, &objects[i]);
 	}
 	last_untagged = i;
 
@@ -717,7 +684,7 @@ static struct object_entry **compute_write_order(void)
 	 */
 	for (; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
-			add_to_write_order(wo, &wo_end, &objects[i]);
+			add_to_write_order(wo, wo_end, &objects[i]);
 	}
 
 	/*
@@ -727,7 +694,7 @@ static struct object_entry **compute_write_order(void)
 		if (oe_type(&objects[i]) != OBJ_COMMIT &&
 		    oe_type(&objects[i]) != OBJ_TAG)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_to_write_order(wo, wo_end, &objects[i]);
 	}
 
 	/*
@@ -736,7 +703,7 @@ static struct object_entry **compute_write_order(void)
 	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (oe_type(&objects[i]) != OBJ_TREE)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_to_write_order(wo, wo_end, &objects[i]);
 	}
 
 	/*
@@ -744,8 +711,51 @@ static struct object_entry **compute_write_order(void)
 	 */
 	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (!objects[i].filled)
-			add_family_to_write_order(wo, &wo_end, &objects[i]);
+			add_family_to_write_order(wo, wo_end, &objects[i]);
 	}
+}
+
+static struct object_entry **compute_write_order(void)
+{
+	unsigned int i, wo_end;
+
+	struct object_entry **wo;
+	struct object_entry *objects = to_pack.objects;
+
+	for (i = 0; i < to_pack.nr_objects; i++) {
+		objects[i].tagged = 0;
+		objects[i].filled = 0;
+		SET_DELTA_CHILD(&objects[i], NULL);
+		SET_DELTA_SIBLING(&objects[i], NULL);
+	}
+
+	/*
+	 * Fully connect delta_child/delta_sibling network.
+	 * Make sure delta_sibling is sorted in the original
+	 * recency order.
+	 */
+	for (i = to_pack.nr_objects; i > 0;) {
+		struct object_entry *e = &objects[--i];
+		if (!DELTA(e))
+			continue;
+		/* Mark me as the first child */
+		e->delta_sibling_idx = DELTA(e)->delta_child_idx;
+		SET_DELTA_CHILD(DELTA(e), e);
+	}
+
+	/*
+	 * Mark objects that are at the tip of tags.
+	 */
+	for_each_tag_ref(mark_tagged, NULL);
+
+	/*
+	 * Give the objects in the original recency order until
+	 * we see a tagged tip.
+	 */
+	ALLOC_ARRAY(wo, to_pack.nr_objects);
+	wo_end = 0;
+
+	compute_layer_order(wo, &wo_end);
 
 	if (wo_end != to_pack.nr_objects)
 		die("ordered %u objects, expected %"PRIu32, wo_end, to_pack.nr_objects);
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 4/8] pack-objects: add delta-islands support
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
                   ` (2 preceding siblings ...)
  2018-08-09 15:55 ` [PATCH v3 3/8] pack-objects: refactor code into compute_layer_order() Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 15:55 ` [PATCH v3 5/8] repack: " Christian Couder
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

From: Jeff King <peff@peff.net>

Implement support for delta islands in git pack-objects
and document how delta islands work in
"Documentation/git-pack-objects.txt" and Documentation/config.txt.

This allows users to setup delta islands in their config and
get the benefit of less disk usage while cloning and fetching
is still quite fast and not much more CPU intensive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt           | 15 +++++
 Documentation/git-pack-objects.txt | 97 ++++++++++++++++++++++++++++++
 builtin/pack-objects.c             | 57 +++++++++++++++---
 3 files changed, 161 insertions(+), 8 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 63365dcf3d..27bb77f9e7 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2585,6 +2585,21 @@ Note that changing the compression level will not automatically recompress
 all existing objects. You can force recompression by passing the -F option
 to linkgit:git-repack[1].
 
+pack.island::
+	An extended regular expression configuring a set of delta
+	islands. See "DELTA ISLANDS" in linkgit:git-pack-objects[1]
+	for details.
+
+pack.islandCore::
+	Specify an island name which gets to have its objects be
+	packed first. This creates a kind of pseudo-pack at the front
+	of one pack, so that the objects from the specified island are
+	hopefully faster to copy into any pack that should be served
+	to a user requesting these objects. In practice this means
+	that the island specified should likely correspond to what is
+	the most commonly cloned in the repo. See also "DELTA ISLANDS"
+	in linkgit:git-pack-objects[1].
+
 pack.deltaCacheSize::
 	The maximum memory in bytes used for caching deltas in
 	linkgit:git-pack-objects[1] before writing them out to a pack.
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index d95b472d16..40c825c381 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -289,6 +289,103 @@ Unexpected missing object will raise an error.
 --unpack-unreachable::
 	Keep unreachable objects in loose form. This implies `--revs`.
 
+--delta-islands::
+	Restrict delta matches based on "islands". See DELTA ISLANDS
+	below.
+
+
+DELTA ISLANDS
+-------------
+
+When possible, `pack-objects` tries to reuse existing on-disk deltas to
+avoid having to search for new ones on the fly. This is an important
+optimization for serving fetches, because it means the server can avoid
+inflating most objects at all and just send the bytes directly from
+disk. This optimization can't work when an object is stored as a delta
+against a base which the receiver does not have (and which we are not
+already sending). In that case the server "breaks" the delta and has to
+find a new one, which has a high CPU cost. Therefore it's important for
+performance that the set of objects in on-disk delta relationships match
+what a client would fetch.
+
+In a normal repository, this tends to work automatically. The objects
+are mostly reachable from the branches and tags, and that's what clients
+fetch. Any deltas we find on the server are likely to be between objects
+the client has or will have.
+
+But in some repository setups, you may have several related but separate
+groups of ref tips, with clients tending to fetch those groups
+independently. For example, imagine that you are hosting several "forks"
+of a repository in a single shared object store, and letting clients
+view them as separate repositories through `GIT_NAMESPACE` or separate
+repos using the alternates mechanism. A naive repack may find that the
+optimal delta for an object is against a base that is only found in
+another fork. But when a client fetches, they will not have the base
+object, and we'll have to find a new delta on the fly.
+
+A similar situation may exist if you have many refs outside of
+`refs/heads/` and `refs/tags/` that point to related objects (e.g.,
+`refs/pull` or `refs/changes` used by some hosting providers). By
+default, clients fetch only heads and tags, and deltas against objects
+found only in those other groups cannot be sent as-is.
+
+Delta islands solve this problem by allowing you to group your refs into
+distinct "islands". Pack-objects computes which objects are reachable
+from which islands, and refuses to make a delta from an object `A`
+against a base which is not present in all of `A`'s islands. This
+results in slightly larger packs (because we miss some delta
+opportunities), but guarantees that a fetch of one island will not have
+to recompute deltas on the fly due to crossing island boundaries.
+
+When repacking with delta islands the delta window tends to get
+clogged with candidates that are forbidden by the config. Repacking
+with a big --window helps (and doesn't take as long as it otherwise
+might because we can reject some object pairs based on islands before
+doing any computation on the content).
+
+Islands are configured via the `pack.island` option, which can be
+specified multiple times. Each value is a left-anchored regular
+expressions matching refnames. For example:
+
+-------------------------------------------
+[pack]
+island = refs/heads/
+island = refs/tags/
+-------------------------------------------
+
+puts heads and tags into an island (whose name is the empty string; see
+below for more on naming). Any refs which do not match those regular
+expressions (e.g., `refs/pull/123`) is not in any island. Any object
+which is reachable only from `refs/pull/` (but not heads or tags) is
+therefore not a candidate to be used as a base for `refs/heads/`.
+
+Refs are grouped into islands based on their "names", and two regexes
+that produce the same name are considered to be in the same
+island. The names are computed from the regexes by concatenating any
+capture groups from the regex, with a '-' dash in between. (And if
+there are no capture groups, then the name is the empty string, as in
+the above example.) This allows you to create arbitrary numbers of
+islands. Only up to 14 such capture groups are supported though.
+
+For example, imagine you store the refs for each fork in
+`refs/virtual/ID`, where `ID` is a numeric identifier. You might then
+configure:
+
+-------------------------------------------
+[pack]
+island = refs/virtual/([0-9]+)/heads/
+island = refs/virtual/([0-9]+)/tags/
+island = refs/virtual/([0-9]+)/(pull)/
+-------------------------------------------
+
+That puts the heads and tags for each fork in their own island (named
+"1234" or similar), and the pull refs for each go into their own
+"1234-pull".
+
+Note that we pick a single island for each regex to go into, using "last
+one wins" ordering (which allows repo-specific config to take precedence
+over user-wide config, and so forth).
+
 SEE ALSO
 --------
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 99b6329399..30ef48dc43 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -24,6 +24,7 @@
 #include "streaming.h"
 #include "thread-utils.h"
 #include "pack-bitmap.h"
+#include "delta-islands.h"
 #include "reachable.h"
 #include "sha1-array.h"
 #include "argv-array.h"
@@ -59,6 +60,7 @@ static struct packing_data to_pack;
 
 static struct pack_idx_entry **written_list;
 static uint32_t nr_result, nr_written, nr_seen;
+static uint32_t write_layer;
 
 static int non_empty;
 static int reuse_delta = 1, reuse_object = 1;
@@ -93,6 +95,8 @@ static uint16_t write_bitmap_options;
 
 static int exclude_promisor_objects;
 
+static int use_delta_islands;
+
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 static unsigned long cache_max_small_delta_size = 1000;
@@ -607,7 +611,7 @@ static inline void add_to_write_order(struct object_entry **wo,
 			       unsigned int *endp,
 			       struct object_entry *e)
 {
-	if (e->filled)
+	if (e->filled || e->layer != write_layer)
 		return;
 	wo[(*endp)++] = e;
 	e->filled = 1;
@@ -710,13 +714,14 @@ static void compute_layer_order(struct object_entry **wo, unsigned int *wo_end)
 	 * Finally all the rest in really tight order
 	 */
 	for (i = last_untagged; i < to_pack.nr_objects; i++) {
-		if (!objects[i].filled)
+		if (!objects[i].filled && objects[i].layer == write_layer)
 			add_family_to_write_order(wo, wo_end, &objects[i]);
 	}
 }
 
 static struct object_entry **compute_write_order(void)
 {
+	uint32_t max_layers = 1;
 	unsigned int i, wo_end;
 
 	struct object_entry **wo;
@@ -748,14 +753,14 @@ static struct object_entry **compute_write_order(void)
 	 */
 	for_each_tag_ref(mark_tagged, NULL);
 
-	/*
-	 * Give the objects in the original recency order until
-	 * we see a tagged tip.
-	 */
+	if (use_delta_islands)
+		max_layers = compute_pack_layers(&to_pack);
+
 	ALLOC_ARRAY(wo, to_pack.nr_objects);
 	wo_end = 0;
 
-	compute_layer_order(wo, &wo_end);
+	for (; write_layer < max_layers; ++write_layer)
+		compute_layer_order(wo, &wo_end);
 
 	if (wo_end != to_pack.nr_objects)
 		die("ordered %u objects, expected %"PRIu32, wo_end, to_pack.nr_objects);
@@ -1514,7 +1519,8 @@ static void check_object(struct object_entry *entry)
 			break;
 		}
 
-		if (base_ref && (base_entry = packlist_find(&to_pack, base_ref, NULL))) {
+		if (base_ref && (base_entry = packlist_find(&to_pack, base_ref, NULL)) &&
+		    in_same_island(&entry->idx.oid, &base_entry->idx.oid)) {
 			/*
 			 * If base_ref was set above that means we wish to
 			 * reuse delta data, and we even found that base
@@ -1830,6 +1836,11 @@ static int type_size_sort(const void *_a, const void *_b)
 		return -1;
 	if (a->preferred_base < b->preferred_base)
 		return 1;
+	if (use_delta_islands) {
+		int island_cmp = island_delta_cmp(&a->idx.oid, &b->idx.oid);
+		if (island_cmp)
+			return island_cmp;
+	}
 	if (a_size > b_size)
 		return -1;
 	if (a_size < b_size)
@@ -1978,6 +1989,9 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 	if (trg_size < src_size / 32)
 		return 0;
 
+	if (!in_same_island(&trg->entry->idx.oid, &src->entry->idx.oid))
+		return 0;
+
 	/* Load data if not already done */
 	if (!trg->data) {
 		read_lock();
@@ -2516,6 +2530,9 @@ static void prepare_pack(int window, int depth)
 	uint32_t i, nr_deltas;
 	unsigned n;
 
+	if (use_delta_islands)
+		resolve_tree_islands(progress, &to_pack);
+
 	get_object_details();
 
 	/*
@@ -2679,6 +2696,9 @@ static void show_commit(struct commit *commit, void *data)
 
 	if (write_bitmap_index)
 		index_commit_for_bitmap(commit);
+
+	if (use_delta_islands)
+		propagate_island_marks(commit);
 }
 
 static void show_object(struct object *obj, const char *name, void *data)
@@ -2686,6 +2706,19 @@ static void show_object(struct object *obj, const char *name, void *data)
 	add_preferred_base_object(name);
 	add_object_entry(&obj->oid, obj->type, name, 0);
 	obj->flags |= OBJECT_ADDED;
+
+	if (use_delta_islands) {
+		const char *p;
+		unsigned depth = 0;
+		struct object_entry *ent;
+
+		for (p = strchr(name, '/'); p; p = strchr(p + 1, '/'))
+			depth++;
+
+		ent = packlist_find(&to_pack, obj->oid.hash, NULL);
+		if (ent && depth > ent->tree_depth)
+			ent->tree_depth = depth;
+	}
 }
 
 static void show_object__ma_allow_any(struct object *obj, const char *name, void *data)
@@ -3013,6 +3046,9 @@ static void get_object_list(int ac, const char **av)
 	if (use_bitmap_index && !get_object_list_from_bitmap(&revs))
 		return;
 
+	if (use_delta_islands)
+		load_delta_islands();
+
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	mark_edges_uninteresting(&revs, show_edge);
@@ -3192,6 +3228,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		  option_parse_missing_action },
 		OPT_BOOL(0, "exclude-promisor-objects", &exclude_promisor_objects,
 			 N_("do not pack objects in promisor packfiles")),
+		OPT_BOOL(0, "delta-islands", &use_delta_islands,
+			 N_("respect islands during delta compression")),
 		OPT_END(),
 	};
 
@@ -3318,6 +3356,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (pack_to_stdout || !rev_list_all)
 		write_bitmap_index = 0;
 
+	if (use_delta_islands)
+		argv_array_push(&rp, "--topo-order");
+
 	if (progress && all_progress_implied)
 		progress = 2;
 
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 5/8] repack: add delta-islands support
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
                   ` (3 preceding siblings ...)
  2018-08-09 15:55 ` [PATCH v3 4/8] pack-objects: add delta-islands support Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 15:55 ` [PATCH v3 6/8] t: add t5319-delta-islands.sh Christian Couder
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

From: Jeff King <peff@peff.net>

Implement simple support for --delta-islands option and
repack.useDeltaIslands config variable in git repack.

This allows users to setup delta islands in their config and
get the benefit of less disk usage while cloning and fetching
is still quite fast and not much more CPU intensive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config.txt     | 4 ++++
 Documentation/git-repack.txt | 5 +++++
 builtin/repack.c             | 9 +++++++++
 3 files changed, 18 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 27bb77f9e7..2bd31078b2 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3145,6 +3145,10 @@ repack.packKeptObjects::
 	index is being written (either via `--write-bitmap-index` or
 	`repack.writeBitmaps`).
 
+repack.useDeltaIslands::
+	If set to true, makes `git repack` act as if `--delta-islands`
+	was passed. Defaults to `false`.
+
 repack.writeBitmaps::
 	When true, git will write a bitmap index when packing all
 	objects to disk (e.g., when `git repack -a` is run).  This
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index d90e7907f4..a8b2d4722f 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -155,6 +155,11 @@ depth is 4095.
 	being removed. In addition, any unreachable loose objects will
 	be packed (and their loose counterparts removed).
 
+-i::
+--delta-islands::
+	Pass the `--delta-islands` option to `git-pack-objects`, see
+	linkgit:git-pack-objects[1].
+
 Configuration
 -------------
 
diff --git a/builtin/repack.c b/builtin/repack.c
index 6c636e159e..5ab9ee69e4 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -12,6 +12,7 @@
 static int delta_base_offset = 1;
 static int pack_kept_objects = -1;
 static int write_bitmaps;
+static int use_delta_islands;
 static char *packdir, *packtmp;
 
 static const char *const git_repack_usage[] = {
@@ -40,6 +41,10 @@ static int repack_config(const char *var, const char *value, void *cb)
 		write_bitmaps = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "repack.usedeltaislands")) {
+		use_delta_islands = git_config_bool(var, value);
+		return 0;
+	}
 	return git_default_config(var, value, cb);
 }
 
@@ -194,6 +199,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("pass --local to git-pack-objects")),
 		OPT_BOOL('b', "write-bitmap-index", &write_bitmaps,
 				N_("write bitmap index")),
+		OPT_BOOL('i', "delta-islands", &use_delta_islands,
+				N_("pass --delta-islands to git-pack-objects")),
 		OPT_STRING(0, "unpack-unreachable", &unpack_unreachable, N_("approxidate"),
 				N_("with -A, do not loosen objects older than this")),
 		OPT_BOOL('k', "keep-unreachable", &keep_unreachable,
@@ -267,6 +274,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		argv_array_pushf(&cmd.args, "--no-reuse-object");
 	if (write_bitmaps)
 		argv_array_push(&cmd.args, "--write-bitmap-index");
+	if (use_delta_islands)
+		argv_array_push(&cmd.args, "--delta-islands");
 
 	if (pack_everything & ALL_INTO_ONE) {
 		get_non_kept_pack_filenames(&existing_packs, &keep_pack_list);
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 6/8] t: add t5319-delta-islands.sh
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
                   ` (4 preceding siblings ...)
  2018-08-09 15:55 ` [PATCH v3 5/8] repack: " Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 15:55 ` [PATCH v3 7/8] pack-objects: move tree_depth into 'struct packing_data' Christian Couder
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

From: Jeff King <peff@peff.net>

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 t/t5319-delta-islands.sh | 143 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 143 insertions(+)
 create mode 100755 t/t5319-delta-islands.sh

diff --git a/t/t5319-delta-islands.sh b/t/t5319-delta-islands.sh
new file mode 100755
index 0000000000..fea92a5777
--- /dev/null
+++ b/t/t5319-delta-islands.sh
@@ -0,0 +1,143 @@
+#!/bin/sh
+
+test_description='exercise delta islands'
+. ./test-lib.sh
+
+# returns true iff $1 is a delta based on $2
+is_delta_base () {
+	delta_base=$(echo "$1" | git cat-file --batch-check='%(deltabase)') &&
+	echo >&2 "$1 has base $delta_base" &&
+	test "$delta_base" = "$2"
+}
+
+# generate a commit on branch $1 with a single file, "file", whose
+# content is mostly based on the seed $2, but with a unique bit
+# of content $3 appended. This should allow us to see whether
+# blobs of different refs delta against each other.
+commit() {
+	blob=$({ test-tool genrandom "$2" 10240 && echo "$3"; } |
+	       git hash-object -w --stdin) &&
+	tree=$(printf '100644 blob %s\tfile\n' "$blob" | git mktree) &&
+	commit=$(echo "$2-$3" | git commit-tree "$tree" ${4:+-p "$4"}) &&
+	git update-ref "refs/heads/$1" "$commit" &&
+	eval "$1"'=$(git rev-parse $1:file)' &&
+	eval "echo >&2 $1=\$$1"
+}
+
+test_expect_success 'setup commits' '
+	commit one seed 1 &&
+	commit two seed 12
+'
+
+# Note: This is heavily dependent on the "prefer larger objects as base"
+# heuristic.
+test_expect_success 'vanilla repack deltas one against two' '
+	git repack -adf &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'island repack with no island definition is vanilla' '
+	git repack -adfi &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'island repack with no matches is vanilla' '
+	git -c "pack.island=refs/foo" repack -adfi &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'separate islands disallows delta' '
+	git -c "pack.island=refs/heads/(.*)" repack -adfi &&
+	! is_delta_base $one $two &&
+	! is_delta_base $two $one
+'
+
+test_expect_success 'same island allows delta' '
+	git -c "pack.island=refs/heads" repack -adfi &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'coalesce same-named islands' '
+	git \
+		-c "pack.island=refs/(.*)/one" \
+		-c "pack.island=refs/(.*)/two" \
+		repack -adfi &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'island restrictions drop reused deltas' '
+	git repack -adfi &&
+	is_delta_base $one $two &&
+	git -c "pack.island=refs/heads/(.*)" repack -adi &&
+	! is_delta_base $one $two &&
+	! is_delta_base $two $one
+'
+
+test_expect_success 'island regexes are left-anchored' '
+	git -c "pack.island=heads/(.*)" repack -adfi &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'island regexes follow last-one-wins scheme' '
+	git \
+		-c "pack.island=refs/heads/(.*)" \
+		-c "pack.island=refs/heads/" \
+		repack -adfi &&
+	is_delta_base $one $two
+'
+
+test_expect_success 'setup shared history' '
+	commit root shared root &&
+	commit one shared 1 root &&
+	commit two shared 12-long root
+'
+
+# We know that $two will be preferred as a base from $one,
+# because we can transform it with a pure deletion.
+#
+# We also expect $root as a delta against $two by the "longest is base" rule.
+test_expect_success 'vanilla delta goes between branches' '
+	git repack -adf &&
+	is_delta_base $one $two &&
+	is_delta_base $root $two
+'
+
+# Here we should allow $one to base itself on $root; even though
+# they are in different islands, the objects in $root are in a superset
+# of islands compared to those in $one.
+#
+# Similarly, $two can delta against $root by our rules. And unlike $one,
+# in which we are just allowing it, the island rules actually put $root
+# as a possible base for $two, which it would not otherwise be (due to the size
+# sorting).
+test_expect_success 'deltas allowed against superset islands' '
+	git -c "pack.island=refs/heads/(.*)" repack -adfi &&
+	is_delta_base $one $root &&
+	is_delta_base $two $root
+'
+
+# We are going to test the packfile order here, so we again have to make some
+# assumptions. We assume that "$root", as part of our core "one", must come
+# before "$two". This should be guaranteed by the island code. However, for
+# this test to fail without islands, we are also assuming that it would not
+# otherwise do so. This is true by the current write order, which will put
+# commits (and their contents) before their parents.
+test_expect_success 'island core places core objects first' '
+	cat >expect <<-EOF &&
+	$root
+	$two
+	EOF
+	git -c "pack.island=refs/heads/(.*)" \
+	    -c "pack.islandcore=one" \
+	    repack -adfi &&
+	git verify-pack -v .git/objects/pack/*.pack |
+	cut -d" " -f1 |
+	egrep "$root|$two" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'unmatched island core is not fatal' '
+	git -c "pack.islandcore=one" repack -adfi
+'
+
+test_done
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 7/8] pack-objects: move tree_depth into 'struct packing_data'
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
                   ` (5 preceding siblings ...)
  2018-08-09 15:55 ` [PATCH v3 6/8] t: add t5319-delta-islands.sh Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 15:55 ` [PATCH v3 8/8] pack-objects: move 'layer' " Christian Couder
  2018-08-09 19:34 ` [PATCH v3 0/8] Add delta islands support Christian Couder
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

This reduces the size of 'struct object_entry' and therefore
makes packing objects more efficient.

This also renames cmp_tree_depth() into tree_depth_compare(),
as it is more modern to have the name of the compare functions
end with "compare".

Helped-by: Jeff King <peff@peff.net>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/pack-objects.c |  4 ++--
 delta-islands.c        | 27 ++++++++++++++++++---------
 pack-objects.c         |  6 ++++++
 pack-objects.h         | 21 ++++++++++++++++++++-
 4 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 30ef48dc43..fd3e514b31 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2716,8 +2716,8 @@ static void show_object(struct object *obj, const char *name, void *data)
 			depth++;
 
 		ent = packlist_find(&to_pack, obj->oid.hash, NULL);
-		if (ent && depth > ent->tree_depth)
-			ent->tree_depth = depth;
+		if (ent && depth > oe_tree_depth(&to_pack, ent))
+			oe_set_tree_depth(&to_pack, ent, depth);
 	}
 }
 
diff --git a/delta-islands.c b/delta-islands.c
index 448ddcbbe4..3c8fe60801 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -224,17 +224,23 @@ static void mark_remote_island_1(struct remote_island *rl, int is_core_island)
 	island_counter++;
 }
 
-static int cmp_tree_depth(const void *va, const void *vb)
+struct tree_islands_todo {
+	struct object_entry *entry;
+	unsigned int depth;
+};
+
+static int tree_depth_compare(const void *a, const void *b)
 {
-	struct object_entry *a = *(struct object_entry **)va;
-	struct object_entry *b = *(struct object_entry **)vb;
-	return a->tree_depth - b->tree_depth;
+	const struct tree_islands_todo *todo_a = a;
+	const struct tree_islands_todo *todo_b = b;
+
+	return todo_a->depth - todo_b->depth;
 }
 
 void resolve_tree_islands(int progress, struct packing_data *to_pack)
 {
 	struct progress *progress_state = NULL;
-	struct object_entry **todo;
+	struct tree_islands_todo *todo;
 	int nr = 0;
 	int i;
 
@@ -250,16 +256,19 @@ void resolve_tree_islands(int progress, struct packing_data *to_pack)
 	 */
 	ALLOC_ARRAY(todo, to_pack->nr_objects);
 	for (i = 0; i < to_pack->nr_objects; i++) {
-		if (oe_type(&to_pack->objects[i]) == OBJ_TREE)
-			todo[nr++] = &to_pack->objects[i];
+		if (oe_type(&to_pack->objects[i]) == OBJ_TREE) {
+			todo[nr].entry = &to_pack->objects[i];
+			todo[nr].depth = oe_tree_depth(to_pack, &to_pack->objects[i]);
+			nr++;
+		}
 	}
-	QSORT(todo, nr, cmp_tree_depth);
+	QSORT(todo, nr, tree_depth_compare);
 
 	if (progress)
 		progress_state = start_progress(_("Propagating island marks"), nr);
 
 	for (i = 0; i < nr; i++) {
-		struct object_entry *ent = todo[i];
+		struct object_entry *ent = todo[i].entry;
 		struct island_bitmap *root_marks;
 		struct tree *tree;
 		struct tree_desc desc;
diff --git a/pack-objects.c b/pack-objects.c
index 92708522e7..30314572e6 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -160,6 +160,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata,
 
 		if (!pdata->in_pack_by_idx)
 			REALLOC_ARRAY(pdata->in_pack, pdata->nr_alloc);
+
+		if (pdata->tree_depth)
+			REALLOC_ARRAY(pdata->tree_depth, pdata->nr_alloc);
 	}
 
 	new_entry = pdata->objects + pdata->nr_objects++;
@@ -175,5 +178,8 @@ struct object_entry *packlist_alloc(struct packing_data *pdata,
 	if (pdata->in_pack)
 		pdata->in_pack[pdata->nr_objects - 1] = NULL;
 
+	if (pdata->tree_depth)
+		pdata->tree_depth[pdata->nr_objects - 1] = 0;
+
 	return new_entry;
 }
diff --git a/pack-objects.h b/pack-objects.h
index 8eecd67991..3cb5527eeb 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -101,7 +101,6 @@ struct object_entry {
 	unsigned no_try_delta:1;
 	unsigned in_pack_type:TYPE_BITS; /* could be delta */
 
-	unsigned int tree_depth; /* should be repositioned for packing? */
 	unsigned char layer;
 
 	unsigned preferred_base:1; /*
@@ -145,6 +144,9 @@ struct packing_data {
 	struct packed_git **in_pack;
 
 	uintmax_t oe_size_limit;
+
+	/* delta islands */
+	unsigned int *tree_depth;
 };
 
 void prepare_packing_data(struct packing_data *pdata);
@@ -350,4 +352,21 @@ static inline void oe_set_delta_size(struct packing_data *pack,
 		    "where delta size is the same as entry size");
 }
 
+static inline unsigned int oe_tree_depth(struct packing_data *pack,
+					 struct object_entry *e)
+{
+	if (!pack->tree_depth)
+		return 0;
+	return pack->tree_depth[e - pack->objects];
+}
+
+static inline void oe_set_tree_depth(struct packing_data *pack,
+				     struct object_entry *e,
+				     unsigned int tree_depth)
+{
+	if (!pack->tree_depth)
+		ALLOC_ARRAY(pack->tree_depth, pack->nr_objects);
+	pack->tree_depth[e - pack->objects] = tree_depth;
+}
+
 #endif
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v3 8/8] pack-objects: move 'layer' into 'struct packing_data'
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
                   ` (6 preceding siblings ...)
  2018-08-09 15:55 ` [PATCH v3 7/8] pack-objects: move tree_depth into 'struct packing_data' Christian Couder
@ 2018-08-09 15:55 ` Christian Couder
  2018-08-09 19:34 ` [PATCH v3 0/8] Add delta islands support Christian Couder
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

This reduces the size of 'struct object_entry' from 88 bytes
to 80 and therefore makes packing objects more efficient.

For example on a Linux repo with 12M objects,
`git pack-objects --all` needs extra 96MB memory even if the
layer feature is not used.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/pack-objects.c |  4 ++--
 delta-islands.c        |  4 ++--
 pack-objects.c         |  6 ++++++
 pack-objects.h         | 20 ++++++++++++++++++--
 4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fd3e514b31..d5d91eeed5 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -611,7 +611,7 @@ static inline void add_to_write_order(struct object_entry **wo,
 			       unsigned int *endp,
 			       struct object_entry *e)
 {
-	if (e->filled || e->layer != write_layer)
+	if (e->filled || oe_layer(&to_pack, e) != write_layer)
 		return;
 	wo[(*endp)++] = e;
 	e->filled = 1;
@@ -714,7 +714,7 @@ static void compute_layer_order(struct object_entry **wo, unsigned int *wo_end)
 	 * Finally all the rest in really tight order
 	 */
 	for (i = last_untagged; i < to_pack.nr_objects; i++) {
-		if (!objects[i].filled && objects[i].layer == write_layer)
+		if (!objects[i].filled && oe_layer(&to_pack, &objects[i]) == write_layer)
 			add_family_to_write_order(wo, wo_end, &objects[i]);
 	}
 }
diff --git a/delta-islands.c b/delta-islands.c
index 3c8fe60801..92137f2eca 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -492,13 +492,13 @@ int compute_pack_layers(struct packing_data *to_pack)
 		struct object_entry *entry = &to_pack->objects[i];
 		khiter_t pos = kh_get_sha1(island_marks, entry->idx.oid.hash);
 
-		entry->layer = 1;
+		oe_set_layer(to_pack, entry, 1);
 
 		if (pos < kh_end(island_marks)) {
 			struct island_bitmap *bitmap = kh_value(island_marks, pos);
 
 			if (island_bitmap_get(bitmap, island_counter_core))
-				entry->layer = 0;
+				oe_set_layer(to_pack, entry, 0);
 		}
 	}
 
diff --git a/pack-objects.c b/pack-objects.c
index 30314572e6..98389460c2 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -163,6 +163,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata,
 
 		if (pdata->tree_depth)
 			REALLOC_ARRAY(pdata->tree_depth, pdata->nr_alloc);
+
+		if (pdata->layer)
+			REALLOC_ARRAY(pdata->layer, pdata->nr_alloc);
 	}
 
 	new_entry = pdata->objects + pdata->nr_objects++;
@@ -181,5 +184,8 @@ struct object_entry *packlist_alloc(struct packing_data *pdata,
 	if (pdata->tree_depth)
 		pdata->tree_depth[pdata->nr_objects - 1] = 0;
 
+	if (pdata->layer)
+		pdata->layer[pdata->nr_objects - 1] = 0;
+
 	return new_entry;
 }
diff --git a/pack-objects.h b/pack-objects.h
index 3cb5527eeb..ad3c208764 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -101,8 +101,6 @@ struct object_entry {
 	unsigned no_try_delta:1;
 	unsigned in_pack_type:TYPE_BITS; /* could be delta */
 
-	unsigned char layer;
-
 	unsigned preferred_base:1; /*
 				    * we do not pack this, but is available
 				    * to be used as the base object to delta
@@ -147,6 +145,7 @@ struct packing_data {
 
 	/* delta islands */
 	unsigned int *tree_depth;
+	unsigned char *layer;
 };
 
 void prepare_packing_data(struct packing_data *pdata);
@@ -369,4 +368,21 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
 	pack->tree_depth[e - pack->objects] = tree_depth;
 }
 
+static inline unsigned char oe_layer(struct packing_data *pack,
+				     struct object_entry *e)
+{
+	if (!pack->layer)
+		return 0;
+	return pack->layer[e - pack->objects];
+}
+
+static inline void oe_set_layer(struct packing_data *pack,
+				struct object_entry *e,
+				unsigned char layer)
+{
+	if (!pack->layer)
+		ALLOC_ARRAY(pack->layer, pack->nr_objects);
+	pack->layer[e - pack->objects] = layer;
+}
+
 #endif
-- 
2.18.0.555.g17f9c4abba


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 0/8] Add delta islands support
  2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
                   ` (7 preceding siblings ...)
  2018-08-09 15:55 ` [PATCH v3 8/8] pack-objects: move 'layer' " Christian Couder
@ 2018-08-09 19:34 ` Christian Couder
  8 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-09 19:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

On Thu, Aug 9, 2018 at 5:55 PM, Christian Couder
<christian.couder@gmail.com> wrote:

> The following changes have been made since the previous iteration:
>
> * suggested by Duy: move the code computing the write order for a
>   layer to a new separate function called compute_layer_order() in
>   builtin/pack-objects.c in new patch 3/8
>
>   I think that indeed this makes the following patch (4/8) shorter and
>   easier to understand as well as the overall result nicer.
>
> * suggested by Duy and Peff: rework the way the 'tree_depth' field is
>   moved from 'struct object_entry' to 'struct packing_data' in
>   pack-object.h in patch 7/8
>
> * suggested by Duy and Peff: move field 'layer' from
>   'struct object_entry' to 'struct packing_data' in pack-object.h in
>   new patch 8/8

I forgot to tell about the following change that is also part of the v3:

* suggested by Junio and Peff: increase the 'regmatch_t matches[]'
array from 8 to 16, advertise 14 group captures max, and add a warning
if the final element gets filled in in delta-islands.c in patch 2/8

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/8] Add delta-islands.{c,h}
  2018-08-09 15:55 ` [PATCH v3 2/8] Add delta-islands.{c,h} Christian Couder
@ 2018-08-11  9:04   ` SZEDER Gábor
  2018-08-11 10:32     ` Christian Couder
  0 siblings, 1 reply; 14+ messages in thread
From: SZEDER Gábor @ 2018-08-11  9:04 UTC (permalink / raw)
  To: Christian Couder
  Cc: SZEDER Gábor, git, Junio C Hamano, Jeff King, Duy Nguyen,
	Johannes Schindelin, Stefan Beller, Christian Couder

> diff --git a/delta-islands.c b/delta-islands.c
> new file mode 100644
> index 0000000000..448ddcbbe4
> --- /dev/null
> +++ b/delta-islands.c

> +static void deduplicate_islands(void)
> +{
> +	struct remote_island *island, *core = NULL, **list;
> +	unsigned int island_count, dst, src, ref, i = 0;
> +
> +	island_count = kh_size(remote_islands);
> +	list = xmalloc(island_count * sizeof(struct remote_island *));

Please use ALLOC_ARRAY here.

> +
> +	kh_foreach_value(remote_islands, island, {
> +		list[i++] = island;
> +	});
> +
> +	for (ref = 0; ref + 1 < island_count; ref++) {
> +		for (src = ref + 1, dst = src; src < island_count; src++) {
> +			if (list[ref]->hash == list[src]->hash)
> +				continue;
> +
> +			if (src != dst)
> +				list[dst] = list[src];
> +
> +			dst++;
> +		}
> +		island_count = dst;
> +	}
> +
> +	island_bitmap_size = (island_count / 32) + 1;
> +	core = get_core_island();
> +
> +	for (i = 0; i < island_count; ++i) {
> +		mark_remote_island_1(list[i], core && list[i]->hash == core->hash);
> +	}
> +
> +	free(list);
> +}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/8] Add delta-islands.{c,h}
  2018-08-11  9:04   ` SZEDER Gábor
@ 2018-08-11 10:32     ` Christian Couder
  2018-08-11 14:12       ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Couder @ 2018-08-11 10:32 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: git, Junio C Hamano, Jeff King, Duy Nguyen, Johannes Schindelin,
	Stefan Beller, Christian Couder

On Sat, Aug 11, 2018 at 11:04 AM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
>> diff --git a/delta-islands.c b/delta-islands.c
>> new file mode 100644
>> index 0000000000..448ddcbbe4
>> --- /dev/null
>> +++ b/delta-islands.c
>
>> +static void deduplicate_islands(void)
>> +{
>> +     struct remote_island *island, *core = NULL, **list;
>> +     unsigned int island_count, dst, src, ref, i = 0;
>> +
>> +     island_count = kh_size(remote_islands);
>> +     list = xmalloc(island_count * sizeof(struct remote_island *));
>
> Please use ALLOC_ARRAY here.

Ok, I have made the following changes in the branch I will send next.

diff --git a/delta-islands.c b/delta-islands.c
index 92137f2eca..22e4360810 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -322,8 +322,7 @@ static int island_config_callback(const char *k,
const char *v, void *cb)

                if (island_regexes_nr >= island_regexes_alloc) {
                        island_regexes_alloc = (island_regexes_alloc + 8) * 2;
-                       island_regexes = xrealloc(island_regexes,
-                                       island_regexes_alloc * sizeof(regex_t));
+                       REALLOC_ARRAY(island_regexes, island_regexes_alloc);
                }

                if (*v != '^')
@@ -425,7 +424,7 @@ static void deduplicate_islands(void)
        unsigned int island_count, dst, src, ref, i = 0;

        island_count = kh_size(remote_islands);
-       list = xmalloc(island_count * sizeof(struct remote_island *));
+       ALLOC_ARRAY(list, island_count);

        kh_foreach_value(remote_islands, island, {
                list[i++] = island;

Thanks!

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/8] Add delta-islands.{c,h}
  2018-08-11 10:32     ` Christian Couder
@ 2018-08-11 14:12       ` Jeff King
  2018-08-12  4:32         ` Christian Couder
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2018-08-11 14:12 UTC (permalink / raw)
  To: Christian Couder
  Cc: SZEDER Gábor, git, Junio C Hamano, Duy Nguyen,
	Johannes Schindelin, Stefan Beller, Christian Couder

On Sat, Aug 11, 2018 at 12:32:32PM +0200, Christian Couder wrote:

> Ok, I have made the following changes in the branch I will send next.
> 
> diff --git a/delta-islands.c b/delta-islands.c
> index 92137f2eca..22e4360810 100644
> --- a/delta-islands.c
> +++ b/delta-islands.c
> @@ -322,8 +322,7 @@ static int island_config_callback(const char *k,
> const char *v, void *cb)
> 
>                 if (island_regexes_nr >= island_regexes_alloc) {
>                         island_regexes_alloc = (island_regexes_alloc + 8) * 2;
> -                       island_regexes = xrealloc(island_regexes,
> -                                       island_regexes_alloc * sizeof(regex_t));
> +                       REALLOC_ARRAY(island_regexes, island_regexes_alloc);
>                 }

I think this whole block could actually be ALLOC_GROW().

-Peff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3 2/8] Add delta-islands.{c,h}
  2018-08-11 14:12       ` Jeff King
@ 2018-08-12  4:32         ` Christian Couder
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Couder @ 2018-08-12  4:32 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, git, Junio C Hamano, Duy Nguyen,
	Johannes Schindelin, Stefan Beller, Christian Couder

On Sat, Aug 11, 2018 at 4:12 PM, Jeff King <peff@peff.net> wrote:
> On Sat, Aug 11, 2018 at 12:32:32PM +0200, Christian Couder wrote:
>
>> Ok, I have made the following changes in the branch I will send next.
>>
>> diff --git a/delta-islands.c b/delta-islands.c
>> index 92137f2eca..22e4360810 100644
>> --- a/delta-islands.c
>> +++ b/delta-islands.c
>> @@ -322,8 +322,7 @@ static int island_config_callback(const char *k,
>> const char *v, void *cb)
>>
>>                 if (island_regexes_nr >= island_regexes_alloc) {
>>                         island_regexes_alloc = (island_regexes_alloc + 8) * 2;
>> -                       island_regexes = xrealloc(island_regexes,
>> -                                       island_regexes_alloc * sizeof(regex_t));
>> +                       REALLOC_ARRAY(island_regexes, island_regexes_alloc);
>>                 }
>
> I think this whole block could actually be ALLOC_GROW().

Yeah, thanks! The whole block will be replaced with the following in
the next reroll:

ALLOC_GROW(island_regexes, island_regexes_nr + 1, island_regexes_alloc);

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-08-12  4:32 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-09 15:55 [PATCH v3 0/8] Add delta islands support Christian Couder
2018-08-09 15:55 ` [PATCH v3 1/8] packfile: make get_delta_base() non static Christian Couder
2018-08-09 15:55 ` [PATCH v3 2/8] Add delta-islands.{c,h} Christian Couder
2018-08-11  9:04   ` SZEDER Gábor
2018-08-11 10:32     ` Christian Couder
2018-08-11 14:12       ` Jeff King
2018-08-12  4:32         ` Christian Couder
2018-08-09 15:55 ` [PATCH v3 3/8] pack-objects: refactor code into compute_layer_order() Christian Couder
2018-08-09 15:55 ` [PATCH v3 4/8] pack-objects: add delta-islands support Christian Couder
2018-08-09 15:55 ` [PATCH v3 5/8] repack: " Christian Couder
2018-08-09 15:55 ` [PATCH v3 6/8] t: add t5319-delta-islands.sh Christian Couder
2018-08-09 15:55 ` [PATCH v3 7/8] pack-objects: move tree_depth into 'struct packing_data' Christian Couder
2018-08-09 15:55 ` [PATCH v3 8/8] pack-objects: move 'layer' " Christian Couder
2018-08-09 19:34 ` [PATCH v3 0/8] Add delta islands support Christian Couder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).