All of lore.kernel.org
 help / color / mirror / Atom feed
* low memory system to clone larger repo
@ 2015-01-08 16:10 matthew sporleder
  2015-02-09 10:40 ` Duy Nguyen
  2015-02-09 12:32 ` Duy Nguyen
  0 siblings, 2 replies; 12+ messages in thread
From: matthew sporleder @ 2015-01-08 16:10 UTC (permalink / raw)
  To: git

I am attempting to clone this repo: https://github.com/jsonn/src/

and have been successful on some lower memory systems, but i'm
interested in continuing to push down the limit.

I am getting more success running clone via https:// than git:// or
ssh (which is confusing to me) and the smallest system that works is a
raspberry pi with 256 RAM + 256 swap.

I seem to run out of memory consistently around 16% into Resolving
deltas phase but I don't notice an RSS jump so that's another
confusing spot.

My config is below and I'd appreciate any more suggestions of getting
that down to working on a 128MB box (or smaller).

---

I appreciate any suggestions,
Matt

p.s. shallow clones work fine on very small systems


[pack]
        windowMemory = 1m
        packSizeLimit = 1m
        deltaCacheSize = 1m
        deltaCacheLimit = 10
        packSizeLimit = 1m
        threads = 1
[core]
        packedGitWindowSize = 1m
        packedGitLimit = 1m
        deltaBaseCacheLimit = 1m
        compression = 0
        loosecompression = 0
        bigFileThreshold = 10m
[http]
        sslVerify = false
[transfer]
        unpackLimit = 10

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: low memory system to clone larger repo
  2015-01-08 16:10 low memory system to clone larger repo matthew sporleder
@ 2015-02-09 10:40 ` Duy Nguyen
  2015-02-09 11:20   ` Matt Sporleder
  2015-02-09 12:32 ` Duy Nguyen
  1 sibling, 1 reply; 12+ messages in thread
From: Duy Nguyen @ 2015-02-09 10:40 UTC (permalink / raw)
  To: matthew sporleder; +Cc: Git Mailing List

On Thu, Jan 8, 2015 at 11:10 PM, matthew sporleder <msporleder@gmail.com> wrote:
> I am attempting to clone this repo: https://github.com/jsonn/src/
>
> and have been successful on some lower memory systems, but i'm
> interested in continuing to push down the limit.
>
> I am getting more success running clone via https:// than git:// or
> ssh (which is confusing to me) and the smallest system that works is a
> raspberry pi with 256 RAM + 256 swap.
>
> I seem to run out of memory consistently around 16% into Resolving
> deltas phase but I don't notice an RSS jump so that's another
> confusing spot.

Sorry for a really late reply. The command that's running when you run
out of memory is index-pack. I guess it's verifying the delta chain. I
think it needs enough memory for two uncompressed objects (or files)
in a delta chain. I haven't finished cloning this repo yet so I don't
know what these delta chains look like.

What does it say when it runs out of memory? I'm thinking maybe we
could force a core dump, then look at how memory is used.

What if you "git init", then do "git fetch https://..." manually?
There's an optimization for git-clone that may make index-pack use a
bit more memory (and push it over the edge)

> My config is below and I'd appreciate any more suggestions of getting
> that down to working on a 128MB box (or smaller).

I suppose it's ~/.gitconfig or /etc/gitconfig, it's not added after
the clone is complete, correct? Sounds interesting, let me profile its
memory usage..

> [pack]
>         windowMemory = 1m
>         packSizeLimit = 1m
>         deltaCacheSize = 1m
>         deltaCacheLimit = 10
>         packSizeLimit = 1m

I think many of these only affect the server side. If you clone from
github, then they are useless. You may want to provide your own server
side with these settings and see if things change. Also play with
pack.depth (affecting server side)

>         threads = 1
-- 
Duy

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: low memory system to clone larger repo
  2015-02-09 10:40 ` Duy Nguyen
@ 2015-02-09 11:20   ` Matt Sporleder
  0 siblings, 0 replies; 12+ messages in thread
From: Matt Sporleder @ 2015-02-09 11:20 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

A more "final" version of the tuning exercise I did is here:

http://mail-index.netbsd.org/tech-repository/2015/01/08/msg000520.html

I did try some of these setting on the server and it made the repo much much larger so I guess I am looking for ways to just reduce client memory usage/the best balance of disk, memory, and bandwidth. 

If there is a way to turn off some memory hogging speed ups I m interested in trying them. 

I will try to get you the output later since I have since started working on the server side.

Let me know if you want to try some server side stuff and I can give you a git:// or http:// off list. 

Thanks for looking, 
Matt


> On Feb 9, 2015, at 5:40 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> 
>> On Thu, Jan 8, 2015 at 11:10 PM, matthew sporleder <msporleder@gmail.com> wrote:
>> I am attempting to clone this repo: https://github.com/jsonn/src/
>> 
>> and have been successful on some lower memory systems, but i'm
>> interested in continuing to push down the limit.
>> 
>> I am getting more success running clone via https:// than git:// or
>> ssh (which is confusing to me) and the smallest system that works is a
>> raspberry pi with 256 RAM + 256 swap.
>> 
>> I seem to run out of memory consistently around 16% into Resolving
>> deltas phase but I don't notice an RSS jump so that's another
>> confusing spot.
> 
> Sorry for a really late reply. The command that's running when you run
> out of memory is index-pack. I guess it's verifying the delta chain. I
> think it needs enough memory for two uncompressed objects (or files)
> in a delta chain. I haven't finished cloning this repo yet so I don't
> know what these delta chains look like.
> 
> What does it say when it runs out of memory? I'm thinking maybe we
> could force a core dump, then look at how memory is used.
> 
> What if you "git init", then do "git fetch https://..." manually?
> There's an optimization for git-clone that may make index-pack use a
> bit more memory (and push it over the edge)
> 
>> My config is below and I'd appreciate any more suggestions of getting
>> that down to working on a 128MB box (or smaller).
> 
> I suppose it's ~/.gitconfig or /etc/gitconfig, it's not added after
> the clone is complete, correct? Sounds interesting, let me profile its
> memory usage..
> 
>> [pack]
>>        windowMemory = 1m
>>        packSizeLimit = 1m
>>        deltaCacheSize = 1m
>>        deltaCacheLimit = 10
>>        packSizeLimit = 1m
> 
> I think many of these only affect the server side. If you clone from
> github, then they are useless. You may want to provide your own server
> side with these settings and see if things change. Also play with
> pack.depth (affecting server side)
> 
>>        threads = 1
> -- 
> Duy

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: low memory system to clone larger repo
  2015-01-08 16:10 low memory system to clone larger repo matthew sporleder
  2015-02-09 10:40 ` Duy Nguyen
@ 2015-02-09 12:32 ` Duy Nguyen
  2015-02-09 13:18   ` [PATCH] index-pack: reduce memory footprint a bit Nguyễn Thái Ngọc Duy
  2015-02-10  3:56   ` low memory system to clone larger repo matthew sporleder
  1 sibling, 2 replies; 12+ messages in thread
From: Duy Nguyen @ 2015-02-09 12:32 UTC (permalink / raw)
  To: matthew sporleder; +Cc: Git Mailing List

On Thu, Jan 8, 2015 at 11:10 PM, matthew sporleder <msporleder@gmail.com> wrote:
> I am attempting to clone this repo: https://github.com/jsonn/src/

This repo has 3.4M objects. Basic book keeping would cost 200MB (in
practice it'll be higher because I'm assuming no deltas in my
calculation). On my 64-bit system, it already uses 400+ MB at the
beginning of delta resolving phase, and is about 500MB during. 32-bit
systems cost less but I doubt we could keep it within 256 MB limit. I
think you just need more powerful machines for a repo this size.

Also, they have some large files (udivmodti4_test.c 16MB, MD5SUMS
6MB..) These giant files could make index-pack use more memory
especially if they are deltified. If you repack the repo with
core.bigFileThreshold about 1-2MB, then clone, you may get a better
memory consumption, but at the cost of bigger packs.
-- 
Duy

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] index-pack: reduce memory footprint a bit
  2015-02-09 12:32 ` Duy Nguyen
@ 2015-02-09 13:18   ` Nguyễn Thái Ngọc Duy
  2015-02-09 19:27     ` Junio C Hamano
  2015-02-10  3:56   ` low memory system to clone larger repo matthew sporleder
  1 sibling, 1 reply; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-02-09 13:18 UTC (permalink / raw)
  To: git; +Cc: msporleder, Nguyễn Thái Ngọc Duy

For each object in the input pack, we need one struct object_entry. On
x86-64, this struct is 64 bytes long. Although:

 - The 8 bytes for delta_depth and base_object_no are only useful when
   show_stat is set. And it's never set unless someone is debugging.

 - The three fields hdr_size, type and real_type take 4 bytes each
   even though they never use more than 4 bits.

By moving delta_depth and base_object_no out of struct object_entry
and make the other 3 fields one byte long instead of 4, we shrink 25%
of this struct.

On a 3.4M object repo that's about 53MB. The saving is less impressive
compared to index-pack total memory use (about 400MB before delta
resolving, so the saving is just 13%)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 I'm not sure if this patch is worth pursuing. It makes the code a
 little bit harder to read. I was just wondering how much memory could
 be saved..

 We could maybe save some more by splitting union delta_base with the
 assumption that pack-objects would utilize delta-ofs-offset as much
 as possible, which makes the delta_base.sha1[] a waste most of the
 time.
 
 This repo has 2803447 deltas, and because it's a clone case, all
 delta should be ofs-delta, which means we waste about 32MB. But
 shrinking this could get ugly.

 builtin/index-pack.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4632117..479ec5e 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -18,9 +18,12 @@ static const char index_pack_usage[] =
 struct object_entry {
 	struct pack_idx_entry idx;
 	unsigned long size;
-	unsigned int hdr_size;
-	enum object_type type;
-	enum object_type real_type;
+	unsigned char hdr_size;
+	char type;
+	char real_type;
+};
+
+struct object_entry_extra {
 	unsigned delta_depth;
 	int base_object_no;
 };
@@ -64,6 +67,7 @@ struct delta_entry {
 };
 
 static struct object_entry *objects;
+static struct object_entry_extra *objects_extra;
 static struct delta_entry *deltas;
 static struct thread_local nothread_data;
 static int nr_objects;
@@ -873,13 +877,15 @@ static void resolve_delta(struct object_entry *delta_obj,
 	void *base_data, *delta_data;
 
 	if (show_stat) {
-		delta_obj->delta_depth = base->obj->delta_depth + 1;
+		int i = delta_obj - objects;
+		int j = base->obj - objects;
+		objects_extra[i].delta_depth = objects_extra[j].delta_depth + 1;
 		deepest_delta_lock();
-		if (deepest_delta < delta_obj->delta_depth)
-			deepest_delta = delta_obj->delta_depth;
+		if (deepest_delta < objects_extra[i].delta_depth)
+			deepest_delta = objects_extra[i].delta_depth;
 		deepest_delta_unlock();
+		objects_extra[i].base_object_no = j;
 	}
-	delta_obj->base_object_no = base->obj - objects;
 	delta_data = get_data_from_pack(delta_obj);
 	base_data = get_base_data(base);
 	result->obj = delta_obj;
@@ -902,7 +908,7 @@ static void resolve_delta(struct object_entry *delta_obj,
  * "want"; if so, swap in "set" and return true. Otherwise, leave it untouched
  * and return false.
  */
-static int compare_and_swap_type(enum object_type *type,
+static int compare_and_swap_type(char *type,
 				 enum object_type want,
 				 enum object_type set)
 {
@@ -1499,7 +1505,7 @@ static void show_pack_info(int stat_only)
 		struct object_entry *obj = &objects[i];
 
 		if (is_delta_type(obj->type))
-			chain_histogram[obj->delta_depth - 1]++;
+			chain_histogram[objects_extra[i].delta_depth - 1]++;
 		if (stat_only)
 			continue;
 		printf("%s %-6s %lu %lu %"PRIuMAX,
@@ -1508,8 +1514,8 @@ static void show_pack_info(int stat_only)
 		       (unsigned long)(obj[1].idx.offset - obj->idx.offset),
 		       (uintmax_t)obj->idx.offset);
 		if (is_delta_type(obj->type)) {
-			struct object_entry *bobj = &objects[obj->base_object_no];
-			printf(" %u %s", obj->delta_depth, sha1_to_hex(bobj->idx.sha1));
+			struct object_entry *bobj = &objects[objects_extra[i].base_object_no];
+			printf(" %u %s", objects_extra[i].delta_depth, sha1_to_hex(bobj->idx.sha1));
 		}
 		putchar('\n');
 	}
@@ -1672,6 +1678,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+	if (show_stat)
+		objects_extra = xcalloc(nr_objects + 1, sizeof(struct object_entry_extra));
 	deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
 	parse_pack_objects(pack_sha1);
 	resolve_deltas();
-- 
2.3.0.rc1.137.g477eb31

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: reduce memory footprint a bit
  2015-02-09 13:18   ` [PATCH] index-pack: reduce memory footprint a bit Nguyễn Thái Ngọc Duy
@ 2015-02-09 19:27     ` Junio C Hamano
  2015-02-10  9:30       ` Duy Nguyen
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2015-02-09 19:27 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, msporleder

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> For each object in the input pack, we need one struct object_entry. On
> x86-64, this struct is 64 bytes long. Although:
>
>  - The 8 bytes for delta_depth and base_object_no are only useful when
>    show_stat is set. And it's never set unless someone is debugging.
>
>  - The three fields hdr_size, type and real_type take 4 bytes each
>    even though they never use more than 4 bits.
>
> By moving delta_depth and base_object_no out of struct object_entry
> and make the other 3 fields one byte long instead of 4, we shrink 25%
> of this struct.
>
> On a 3.4M object repo that's about 53MB. The saving is less impressive
> compared to index-pack total memory use (about 400MB before delta
> resolving, so the saving is just 13%)
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  I'm not sure if this patch is worth pursuing. It makes the code a
>  little bit harder to read. I was just wondering how much memory could
>  be saved..

I would say 13% is already impressive ;-).

I do not find the result all that harder to read.  I however think
that the change would make it a lot harder to maintain, especially
because the name "object-entry-extra" does not have any direct link
to "show-stat" to hint us that this must be allocated when show-stat
is in use and must never be looked at when show-stat is not in use.

Also it makes me wonder if the compilers are smart enough to notice
that the codepaths that access objects_extra[] are OK because they
are all inside "if (show_stat)".

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: low memory system to clone larger repo
  2015-02-09 12:32 ` Duy Nguyen
  2015-02-09 13:18   ` [PATCH] index-pack: reduce memory footprint a bit Nguyễn Thái Ngọc Duy
@ 2015-02-10  3:56   ` matthew sporleder
  1 sibling, 0 replies; 12+ messages in thread
From: matthew sporleder @ 2015-02-10  3:56 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

Below is the output from my index-pack/clone over git://

This is with the recent memory patch applied but testing some less crazy tuning.
--

The corruption of the signal number shows up in google from other
people so I guess it's a lingering bug.

*
git-tests $ git clone git://github.com/jsonn/src
Cloning into 'src'...
remote: Counting objects: 3497569, done.
remote: Compressing objects: 100% (640647/640647), done.
error: index-pack died of signal 9497569), 990.62 MiB | 8.35 MiB/s
fatal: index-pack failed






On Mon, Feb 9, 2015 at 6:32 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Thu, Jan 8, 2015 at 11:10 PM, matthew sporleder <msporleder@gmail.com> wrote:
>> I am attempting to clone this repo: https://github.com/jsonn/src/
>
> This repo has 3.4M objects. Basic book keeping would cost 200MB (in
> practice it'll be higher because I'm assuming no deltas in my
> calculation). On my 64-bit system, it already uses 400+ MB at the
> beginning of delta resolving phase, and is about 500MB during. 32-bit
> systems cost less but I doubt we could keep it within 256 MB limit. I
> think you just need more powerful machines for a repo this size.
>
> Also, they have some large files (udivmodti4_test.c 16MB, MD5SUMS
> 6MB..) These giant files could make index-pack use more memory
> especially if they are deltified. If you repack the repo with
> core.bigFileThreshold about 1-2MB, then clone, you may get a better
> memory consumption, but at the cost of bigger packs.
> --
> Duy

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: reduce memory footprint a bit
  2015-02-09 19:27     ` Junio C Hamano
@ 2015-02-10  9:30       ` Duy Nguyen
  2015-02-10 12:08         ` matthew sporleder
  0 siblings, 1 reply; 12+ messages in thread
From: Duy Nguyen @ 2015-02-10  9:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, msporleder

On Mon, Feb 09, 2015 at 11:27:21AM -0800, Junio C Hamano wrote:
> > On a 3.4M object repo that's about 53MB. The saving is less impressive
> > compared to index-pack total memory use (about 400MB before delta
> > resolving, so the saving is just 13%)
> >
> > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> > ---
> >  I'm not sure if this patch is worth pursuing. It makes the code a
> >  little bit harder to read. I was just wondering how much memory could
> >  be saved..

(text reordered)

> I do not find the result all that harder to read.  I however think
> that the change would make it a lot harder to maintain, especially
> because the name "object-entry-extra" does not have any direct link
> to "show-stat" to hint us that this must be allocated when show-stat
> is in use and must never be looked at when show-stat is not in use.

Noted. To be fixed.

> I would say 13% is already impressive ;-).

The second patch makes the total saving 119MB, close to 30% (again on
x86-64, 32-bit platform number may be different). If we only compare
with the size of objects[] and deltas[], the saving percentage is 37%
(only for clone case) for this repo. Now it looks impressive to me :-D

The patch is larger than the previous one, but not really complex. And
the final index-pack.c is not hard to read either, probably becase we
already handle ofs-delta and ref-delta separately.

-- 8< --
Subject: [PATCH 2/2] index-pack: kill union delta_base to save memory

Once we know the number of objects in the input pack, we allocate an
array of nr_objects of struct delta_entry. On x86-64, this struct is
32 bytes long. The union delta_base, which is part of struct
delta_entry, provides enough space to store either ofs-delta (8 bytes)
or ref-delta (20 bytes).

Notice that with "recent" Git versions, ofs-delta objects are
preferred over ref-delta objects and ref-delta objects have no reason
to be present in a clone pack. So in clone case we waste
(20-8) * nr_objects bytes because of this union. That's about 38MB out
of 100MB for deltas[] with 3.4M objects, or 38%. deltas[] would be
around 62MB without the waste.

This patch attempts to eliminate that. deltas[] array is split into
two: one for ofs-delta and one for ref-delta. Many functions are also
duplicated because of this split. With this patch, ofs_delta_entry[]
array takes 38MB. ref_deltas[] should remain unallocated in clone case
(0 bytes). This array grows as we see ref-delta. We save more than
half in clone case, or 25% of total book keeping.

The saving is more than the calculation above because padding is
removed by __attribute__((packed)) on ofs_delta_entry. This attribute
should be ok to use, as we used to have it in our code base for some
time. The last use was removed because it may lead to incorrect
behavior when the struct is not packed, which is not the case in
index-pack.

A note about ofs_deltas allocation. We could use ref_deltas memory
allocation strategy for ofs_deltas. But that probably just adds more
overhead on top. ofs-deltas are generally the majority (1/2 to 2/3) in
any pack. Incremental realloc may lead to too many memcpy. And if we
preallocate, say 1/2 or 2/3 of nr_objects initially, the growth rate
of ALLOC_GROW() could make this array larger than nr_objects, wasting
more memory.

Brought-up-by: Matthew Sporleder <msporleder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/index-pack.c | 260 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 160 insertions(+), 100 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 07b2c0c..27e3c8b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -28,11 +28,6 @@ struct object_stat {
 	int base_object_no;
 };
 
-union delta_base {
-	unsigned char sha1[20];
-	off_t offset;
-};
-
 struct base_data {
 	struct base_data *base;
 	struct base_data *child;
@@ -52,26 +47,28 @@ struct thread_local {
 	int pack_fd;
 };
 
-/*
- * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
- * to memcmp() only the first 20 bytes.
- */
-#define UNION_BASE_SZ	20
-
 #define FLAG_LINK (1u<<20)
 #define FLAG_CHECKED (1u<<21)
 
-struct delta_entry {
-	union delta_base base;
+struct ofs_delta_entry {
+	off_t offset;
+	int obj_no;
+} __attribute__((packed));
+
+struct ref_delta_entry {
+	unsigned char sha1[20];
 	int obj_no;
 };
 
 static struct object_entry *objects;
 static struct object_stat *obj_stat;
-static struct delta_entry *deltas;
+static struct ofs_delta_entry *ofs_deltas;
+static struct ref_delta_entry *ref_deltas;
 static struct thread_local nothread_data;
 static int nr_objects;
-static int nr_deltas;
+static int nr_ofs_deltas;
+static int nr_ref_deltas;
+static int ref_deltas_alloc;
 static int nr_resolved_deltas;
 static int nr_threads;
 
@@ -480,7 +477,8 @@ static void *unpack_entry_data(unsigned long offset, unsigned long size,
 }
 
 static void *unpack_raw_entry(struct object_entry *obj,
-			      union delta_base *delta_base,
+			      off_t *ofs_offset,
+			      unsigned char *ref_sha1,
 			      unsigned char *sha1)
 {
 	unsigned char *p;
@@ -509,11 +507,10 @@ static void *unpack_raw_entry(struct object_entry *obj,
 
 	switch (obj->type) {
 	case OBJ_REF_DELTA:
-		hashcpy(delta_base->sha1, fill(20));
+		hashcpy(ref_sha1, fill(20));
 		use(20);
 		break;
 	case OBJ_OFS_DELTA:
-		memset(delta_base, 0, sizeof(*delta_base));
 		p = fill(1);
 		c = *p;
 		use(1);
@@ -527,8 +524,8 @@ static void *unpack_raw_entry(struct object_entry *obj,
 			use(1);
 			base_offset = (base_offset << 7) + (c & 127);
 		}
-		delta_base->offset = obj->idx.offset - base_offset;
-		if (delta_base->offset <= 0 || delta_base->offset >= obj->idx.offset)
+		*ofs_offset = obj->idx.offset - base_offset;
+		if (*ofs_offset <= 0 || *ofs_offset >= obj->idx.offset)
 			bad_object(obj->idx.offset, _("delta base offset is out of bound"));
 		break;
 	case OBJ_COMMIT:
@@ -612,55 +609,108 @@ static void *get_data_from_pack(struct object_entry *obj)
 	return unpack_data(obj, NULL, NULL);
 }
 
-static int compare_delta_bases(const union delta_base *base1,
-			       const union delta_base *base2,
-			       enum object_type type1,
-			       enum object_type type2)
+static int compare_ofs_delta_bases(off_t offset1, off_t offset2,
+				   enum object_type type1,
+				   enum object_type type2)
+{
+	int cmp = type1 - type2;
+	if (cmp)
+		return cmp;
+	return offset1 - offset2;
+}
+
+static int find_ofs_delta(const off_t offset, enum object_type type)
+{
+	int first = 0, last = nr_ofs_deltas;
+
+	while (first < last) {
+		int next = (first + last) / 2;
+		struct ofs_delta_entry *delta = &ofs_deltas[next];
+		int cmp;
+
+		cmp = compare_ofs_delta_bases(offset, delta->offset,
+					      type, objects[delta->obj_no].type);
+		if (!cmp)
+			return next;
+		if (cmp < 0) {
+			last = next;
+			continue;
+		}
+		first = next+1;
+	}
+	return -first-1;
+}
+
+static void find_ofs_delta_children(off_t offset,
+				    int *first_index, int *last_index,
+				    enum object_type type)
+{
+	int first = find_ofs_delta(offset, type);
+	int last = first;
+	int end = nr_ofs_deltas - 1;
+
+	if (first < 0) {
+		*first_index = 0;
+		*last_index = -1;
+		return;
+	}
+	while (first > 0 && ofs_deltas[first - 1].offset == offset)
+		--first;
+	while (last < end && ofs_deltas[last + 1].offset == offset)
+		++last;
+	*first_index = first;
+	*last_index = last;
+}
+
+static int compare_ref_delta_bases(const unsigned char *sha1,
+				   const unsigned char *sha2,
+				   enum object_type type1,
+				   enum object_type type2)
 {
 	int cmp = type1 - type2;
 	if (cmp)
 		return cmp;
-	return memcmp(base1, base2, UNION_BASE_SZ);
+	return hashcmp(sha1, sha2);
 }
 
-static int find_delta(const union delta_base *base, enum object_type type)
+static int find_ref_delta(const unsigned char *sha1, enum object_type type)
 {
-	int first = 0, last = nr_deltas;
-
-        while (first < last) {
-                int next = (first + last) / 2;
-                struct delta_entry *delta = &deltas[next];
-                int cmp;
-
-		cmp = compare_delta_bases(base, &delta->base,
-					  type, objects[delta->obj_no].type);
-                if (!cmp)
-                        return next;
-                if (cmp < 0) {
-                        last = next;
-                        continue;
-                }
-                first = next+1;
-        }
-        return -first-1;
+	int first = 0, last = nr_ref_deltas;
+
+	while (first < last) {
+		int next = (first + last) / 2;
+		struct ref_delta_entry *delta = &ref_deltas[next];
+		int cmp;
+
+		cmp = compare_ref_delta_bases(sha1, delta->sha1,
+					      type, objects[delta->obj_no].type);
+		if (!cmp)
+			return next;
+		if (cmp < 0) {
+			last = next;
+			continue;
+		}
+		first = next+1;
+	}
+	return -first-1;
 }
 
-static void find_delta_children(const union delta_base *base,
-				int *first_index, int *last_index,
-				enum object_type type)
+static void find_ref_delta_children(const unsigned char *sha1,
+				    int *first_index, int *last_index,
+				    enum object_type type)
 {
-	int first = find_delta(base, type);
+	int first = find_ref_delta(sha1, type);
 	int last = first;
-	int end = nr_deltas - 1;
+	int end = nr_ref_deltas - 1;
 
 	if (first < 0) {
 		*first_index = 0;
 		*last_index = -1;
 		return;
 	}
-	while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
+	while (first > 0 && !hashcmp(ref_deltas[first - 1].sha1, sha1))
 		--first;
-	while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
+	while (last < end && !hashcmp(ref_deltas[last + 1].sha1, sha1))
 		++last;
 	*first_index = first;
 	*last_index = last;
@@ -927,16 +977,13 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 						  struct base_data *prev_base)
 {
 	if (base->ref_last == -1 && base->ofs_last == -1) {
-		union delta_base base_spec;
-
-		hashcpy(base_spec.sha1, base->obj->idx.sha1);
-		find_delta_children(&base_spec,
-				    &base->ref_first, &base->ref_last, OBJ_REF_DELTA);
+		find_ref_delta_children(base->obj->idx.sha1,
+					&base->ref_first, &base->ref_last,
+					OBJ_REF_DELTA);
 
-		memset(&base_spec, 0, sizeof(base_spec));
-		base_spec.offset = base->obj->idx.offset;
-		find_delta_children(&base_spec,
-				    &base->ofs_first, &base->ofs_last, OBJ_OFS_DELTA);
+		find_ofs_delta_children(base->obj->idx.offset,
+					&base->ofs_first, &base->ofs_last,
+					OBJ_OFS_DELTA);
 
 		if (base->ref_last == -1 && base->ofs_last == -1) {
 			free(base->data);
@@ -947,7 +994,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 	}
 
 	if (base->ref_first <= base->ref_last) {
-		struct object_entry *child = objects + deltas[base->ref_first].obj_no;
+		struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no;
 		struct base_data *result = alloc_base_data();
 
 		if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA,
@@ -963,7 +1010,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 	}
 
 	if (base->ofs_first <= base->ofs_last) {
-		struct object_entry *child = objects + deltas[base->ofs_first].obj_no;
+		struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no;
 		struct base_data *result = alloc_base_data();
 
 		assert(child->real_type == OBJ_OFS_DELTA);
@@ -999,15 +1046,20 @@ static void find_unresolved_deltas(struct base_data *base)
 	}
 }
 
-static int compare_delta_entry(const void *a, const void *b)
+static int compare_ofs_delta_entry(const void *a, const void *b)
+{
+	const struct ofs_delta_entry *delta_a = a;
+	const struct ofs_delta_entry *delta_b = b;
+
+	return delta_a->offset - delta_b->offset;
+}
+
+static int compare_ref_delta_entry(const void *a, const void *b)
 {
-	const struct delta_entry *delta_a = a;
-	const struct delta_entry *delta_b = b;
+	const struct ref_delta_entry *delta_a = a;
+	const struct ref_delta_entry *delta_b = b;
 
-	/* group by type (ref vs ofs) and then by value (sha-1 or offset) */
-	return compare_delta_bases(&delta_a->base, &delta_b->base,
-				   objects[delta_a->obj_no].type,
-				   objects[delta_b->obj_no].type);
+	return hashcmp(delta_a->sha1, delta_b->sha1);
 }
 
 static void resolve_base(struct object_entry *obj)
@@ -1053,7 +1105,8 @@ static void *threaded_second_pass(void *data)
 static void parse_pack_objects(unsigned char *sha1)
 {
 	int i, nr_delays = 0;
-	struct delta_entry *delta = deltas;
+	struct ofs_delta_entry *ofs_delta = ofs_deltas;
+	unsigned char ref_delta_sha1[20];
 	struct stat st;
 
 	if (verbose)
@@ -1062,12 +1115,18 @@ static void parse_pack_objects(unsigned char *sha1)
 				nr_objects);
 	for (i = 0; i < nr_objects; i++) {
 		struct object_entry *obj = &objects[i];
-		void *data = unpack_raw_entry(obj, &delta->base, obj->idx.sha1);
+		void *data = unpack_raw_entry(obj, &ofs_delta->offset,
+					      ref_delta_sha1, obj->idx.sha1);
 		obj->real_type = obj->type;
-		if (is_delta_type(obj->type)) {
-			nr_deltas++;
-			delta->obj_no = i;
-			delta++;
+		if (obj->type == OBJ_OFS_DELTA) {
+			nr_ofs_deltas++;
+			ofs_delta->obj_no = i;
+			ofs_delta++;
+		} else if (obj->type == OBJ_REF_DELTA) {
+			ALLOC_GROW(ref_deltas, nr_ref_deltas + 1, ref_deltas_alloc);
+			hashcpy(ref_deltas[nr_ref_deltas].sha1, ref_delta_sha1);
+			ref_deltas[nr_ref_deltas].obj_no = i;
+			nr_ref_deltas++;
 		} else if (!data) {
 			/* large blobs, check later */
 			obj->real_type = OBJ_BAD;
@@ -1118,15 +1177,18 @@ static void resolve_deltas(void)
 {
 	int i;
 
-	if (!nr_deltas)
+	if (!nr_ofs_deltas && !nr_ref_deltas)
 		return;
 
 	/* Sort deltas by base SHA1/offset for fast searching */
-	qsort(deltas, nr_deltas, sizeof(struct delta_entry),
-	      compare_delta_entry);
+	qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry),
+	      compare_ofs_delta_entry);
+	qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry),
+	      compare_ref_delta_entry);
 
 	if (verbose)
-		progress = start_progress(_("Resolving deltas"), nr_deltas);
+		progress = start_progress(_("Resolving deltas"),
+					  nr_ref_deltas + nr_ofs_deltas);
 
 #ifndef NO_PTHREADS
 	nr_dispatched = 0;
@@ -1164,7 +1226,7 @@ static void resolve_deltas(void)
 static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved);
 static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned char *pack_sha1)
 {
-	if (nr_deltas == nr_resolved_deltas) {
+	if (nr_ref_deltas + nr_ofs_deltas == nr_resolved_deltas) {
 		stop_progress(&progress);
 		/* Flush remaining pack final 20-byte SHA1. */
 		flush();
@@ -1175,7 +1237,7 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
 		struct sha1file *f;
 		unsigned char read_sha1[20], tail_sha1[20];
 		struct strbuf msg = STRBUF_INIT;
-		int nr_unresolved = nr_deltas - nr_resolved_deltas;
+		int nr_unresolved = nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas;
 		int nr_objects_initial = nr_objects;
 		if (nr_unresolved <= 0)
 			die(_("confusion beyond insanity"));
@@ -1197,11 +1259,11 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
 			die(_("Unexpected tail checksum for %s "
 			      "(disk corruption?)"), curr_pack);
 	}
-	if (nr_deltas != nr_resolved_deltas)
+	if (nr_ofs_deltas + nr_ref_deltas != nr_resolved_deltas)
 		die(Q_("pack has %d unresolved delta",
 		       "pack has %d unresolved deltas",
-		       nr_deltas - nr_resolved_deltas),
-		    nr_deltas - nr_resolved_deltas);
+		       nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas),
+		    nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas);
 }
 
 static int write_compressed(struct sha1file *f, void *in, unsigned int size)
@@ -1261,14 +1323,14 @@ static struct object_entry *append_obj_to_pack(struct sha1file *f,
 
 static int delta_pos_compare(const void *_a, const void *_b)
 {
-	struct delta_entry *a = *(struct delta_entry **)_a;
-	struct delta_entry *b = *(struct delta_entry **)_b;
+	struct ref_delta_entry *a = *(struct ref_delta_entry **)_a;
+	struct ref_delta_entry *b = *(struct ref_delta_entry **)_b;
 	return a->obj_no - b->obj_no;
 }
 
 static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
 {
-	struct delta_entry **sorted_by_pos;
+	struct ref_delta_entry **sorted_by_pos;
 	int i, n = 0;
 
 	/*
@@ -1282,28 +1344,25 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
 	 * resolving deltas in the same order as their position in the pack.
 	 */
 	sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
-	for (i = 0; i < nr_deltas; i++) {
-		if (objects[deltas[i].obj_no].real_type != OBJ_REF_DELTA)
-			continue;
-		sorted_by_pos[n++] = &deltas[i];
-	}
+	for (i = 0; i < nr_ref_deltas; i++)
+		sorted_by_pos[n++] = &ref_deltas[i];
 	qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
 
 	for (i = 0; i < n; i++) {
-		struct delta_entry *d = sorted_by_pos[i];
+		struct ref_delta_entry *d = sorted_by_pos[i];
 		enum object_type type;
 		struct base_data *base_obj = alloc_base_data();
 
 		if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
 			continue;
-		base_obj->data = read_sha1_file(d->base.sha1, &type, &base_obj->size);
+		base_obj->data = read_sha1_file(d->sha1, &type, &base_obj->size);
 		if (!base_obj->data)
 			continue;
 
-		if (check_sha1_signature(d->base.sha1, base_obj->data,
+		if (check_sha1_signature(d->sha1, base_obj->data,
 				base_obj->size, typename(type)))
-			die(_("local object %s is corrupt"), sha1_to_hex(d->base.sha1));
-		base_obj->obj = append_obj_to_pack(f, d->base.sha1,
+			die(_("local object %s is corrupt"), sha1_to_hex(d->sha1));
+		base_obj->obj = append_obj_to_pack(f, d->sha1,
 					base_obj->data, base_obj->size, type);
 		find_unresolved_deltas(base_obj);
 		display_progress(progress, nr_resolved_deltas);
@@ -1495,7 +1554,7 @@ static void read_idx_option(struct pack_idx_option *opts, const char *pack_name)
 
 static void show_pack_info(int stat_only)
 {
-	int i, baseobjects = nr_objects - nr_deltas;
+	int i, baseobjects = nr_objects - nr_ref_deltas - nr_ofs_deltas;
 	unsigned long *chain_histogram = NULL;
 
 	if (deepest_delta)
@@ -1680,11 +1739,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
 	if (show_stat)
 		obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
-	deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
+	ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
 	parse_pack_objects(pack_sha1);
 	resolve_deltas();
 	conclude_pack(fix_thin_pack, curr_pack, pack_sha1);
-	free(deltas);
+	free(ofs_deltas);
+	free(ref_deltas);
 	if (strict)
 		foreign_nr = check_objects();
 
-- 
2.2.0.513.g477eb31

-- 8< --

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: reduce memory footprint a bit
  2015-02-10  9:30       ` Duy Nguyen
@ 2015-02-10 12:08         ` matthew sporleder
  2015-02-10 18:49           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: matthew sporleder @ 2015-02-10 12:08 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

I'm having trouble getting this new patch to apply.  Are you working
on a branch that I can track?

On Tue, Feb 10, 2015 at 3:30 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Mon, Feb 09, 2015 at 11:27:21AM -0800, Junio C Hamano wrote:
>> > On a 3.4M object repo that's about 53MB. The saving is less impressive
>> > compared to index-pack total memory use (about 400MB before delta
>> > resolving, so the saving is just 13%)
>> >
>> > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> > ---
>> >  I'm not sure if this patch is worth pursuing. It makes the code a
>> >  little bit harder to read. I was just wondering how much memory could
>> >  be saved..
>
> (text reordered)
>
>> I do not find the result all that harder to read.  I however think
>> that the change would make it a lot harder to maintain, especially
>> because the name "object-entry-extra" does not have any direct link
>> to "show-stat" to hint us that this must be allocated when show-stat
>> is in use and must never be looked at when show-stat is not in use.
>
> Noted. To be fixed.
>
>> I would say 13% is already impressive ;-).
>
> The second patch makes the total saving 119MB, close to 30% (again on
> x86-64, 32-bit platform number may be different). If we only compare
> with the size of objects[] and deltas[], the saving percentage is 37%
> (only for clone case) for this repo. Now it looks impressive to me :-D
>
> The patch is larger than the previous one, but not really complex. And
> the final index-pack.c is not hard to read either, probably becase we
> already handle ofs-delta and ref-delta separately.
>
> -- 8< --
> Subject: [PATCH 2/2] index-pack: kill union delta_base to save memory
>
> Once we know the number of objects in the input pack, we allocate an
> array of nr_objects of struct delta_entry. On x86-64, this struct is
> 32 bytes long. The union delta_base, which is part of struct
> delta_entry, provides enough space to store either ofs-delta (8 bytes)
> or ref-delta (20 bytes).
>
> Notice that with "recent" Git versions, ofs-delta objects are
> preferred over ref-delta objects and ref-delta objects have no reason
> to be present in a clone pack. So in clone case we waste
> (20-8) * nr_objects bytes because of this union. That's about 38MB out
> of 100MB for deltas[] with 3.4M objects, or 38%. deltas[] would be
> around 62MB without the waste.
>
> This patch attempts to eliminate that. deltas[] array is split into
> two: one for ofs-delta and one for ref-delta. Many functions are also
> duplicated because of this split. With this patch, ofs_delta_entry[]
> array takes 38MB. ref_deltas[] should remain unallocated in clone case
> (0 bytes). This array grows as we see ref-delta. We save more than
> half in clone case, or 25% of total book keeping.
>
> The saving is more than the calculation above because padding is
> removed by __attribute__((packed)) on ofs_delta_entry. This attribute
> should be ok to use, as we used to have it in our code base for some
> time. The last use was removed because it may lead to incorrect
> behavior when the struct is not packed, which is not the case in
> index-pack.
>
> A note about ofs_deltas allocation. We could use ref_deltas memory
> allocation strategy for ofs_deltas. But that probably just adds more
> overhead on top. ofs-deltas are generally the majority (1/2 to 2/3) in
> any pack. Incremental realloc may lead to too many memcpy. And if we
> preallocate, say 1/2 or 2/3 of nr_objects initially, the growth rate
> of ALLOC_GROW() could make this array larger than nr_objects, wasting
> more memory.
>
> Brought-up-by: Matthew Sporleder <msporleder@gmail.com>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  builtin/index-pack.c | 260 +++++++++++++++++++++++++++++++--------------------
>  1 file changed, 160 insertions(+), 100 deletions(-)
>
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index 07b2c0c..27e3c8b 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -28,11 +28,6 @@ struct object_stat {
>         int base_object_no;
>  };
>
> -union delta_base {
> -       unsigned char sha1[20];
> -       off_t offset;
> -};
> -
>  struct base_data {
>         struct base_data *base;
>         struct base_data *child;
> @@ -52,26 +47,28 @@ struct thread_local {
>         int pack_fd;
>  };
>
> -/*
> - * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
> - * to memcmp() only the first 20 bytes.
> - */
> -#define UNION_BASE_SZ  20
> -
>  #define FLAG_LINK (1u<<20)
>  #define FLAG_CHECKED (1u<<21)
>
> -struct delta_entry {
> -       union delta_base base;
> +struct ofs_delta_entry {
> +       off_t offset;
> +       int obj_no;
> +} __attribute__((packed));
> +
> +struct ref_delta_entry {
> +       unsigned char sha1[20];
>         int obj_no;
>  };
>
>  static struct object_entry *objects;
>  static struct object_stat *obj_stat;
> -static struct delta_entry *deltas;
> +static struct ofs_delta_entry *ofs_deltas;
> +static struct ref_delta_entry *ref_deltas;
>  static struct thread_local nothread_data;
>  static int nr_objects;
> -static int nr_deltas;
> +static int nr_ofs_deltas;
> +static int nr_ref_deltas;
> +static int ref_deltas_alloc;
>  static int nr_resolved_deltas;
>  static int nr_threads;
>
> @@ -480,7 +477,8 @@ static void *unpack_entry_data(unsigned long offset, unsigned long size,
>  }
>
>  static void *unpack_raw_entry(struct object_entry *obj,
> -                             union delta_base *delta_base,
> +                             off_t *ofs_offset,
> +                             unsigned char *ref_sha1,
>                               unsigned char *sha1)
>  {
>         unsigned char *p;
> @@ -509,11 +507,10 @@ static void *unpack_raw_entry(struct object_entry *obj,
>
>         switch (obj->type) {
>         case OBJ_REF_DELTA:
> -               hashcpy(delta_base->sha1, fill(20));
> +               hashcpy(ref_sha1, fill(20));
>                 use(20);
>                 break;
>         case OBJ_OFS_DELTA:
> -               memset(delta_base, 0, sizeof(*delta_base));
>                 p = fill(1);
>                 c = *p;
>                 use(1);
> @@ -527,8 +524,8 @@ static void *unpack_raw_entry(struct object_entry *obj,
>                         use(1);
>                         base_offset = (base_offset << 7) + (c & 127);
>                 }
> -               delta_base->offset = obj->idx.offset - base_offset;
> -               if (delta_base->offset <= 0 || delta_base->offset >= obj->idx.offset)
> +               *ofs_offset = obj->idx.offset - base_offset;
> +               if (*ofs_offset <= 0 || *ofs_offset >= obj->idx.offset)
>                         bad_object(obj->idx.offset, _("delta base offset is out of bound"));
>                 break;
>         case OBJ_COMMIT:
> @@ -612,55 +609,108 @@ static void *get_data_from_pack(struct object_entry *obj)
>         return unpack_data(obj, NULL, NULL);
>  }
>
> -static int compare_delta_bases(const union delta_base *base1,
> -                              const union delta_base *base2,
> -                              enum object_type type1,
> -                              enum object_type type2)
> +static int compare_ofs_delta_bases(off_t offset1, off_t offset2,
> +                                  enum object_type type1,
> +                                  enum object_type type2)
> +{
> +       int cmp = type1 - type2;
> +       if (cmp)
> +               return cmp;
> +       return offset1 - offset2;
> +}
> +
> +static int find_ofs_delta(const off_t offset, enum object_type type)
> +{
> +       int first = 0, last = nr_ofs_deltas;
> +
> +       while (first < last) {
> +               int next = (first + last) / 2;
> +               struct ofs_delta_entry *delta = &ofs_deltas[next];
> +               int cmp;
> +
> +               cmp = compare_ofs_delta_bases(offset, delta->offset,
> +                                             type, objects[delta->obj_no].type);
> +               if (!cmp)
> +                       return next;
> +               if (cmp < 0) {
> +                       last = next;
> +                       continue;
> +               }
> +               first = next+1;
> +       }
> +       return -first-1;
> +}
> +
> +static void find_ofs_delta_children(off_t offset,
> +                                   int *first_index, int *last_index,
> +                                   enum object_type type)
> +{
> +       int first = find_ofs_delta(offset, type);
> +       int last = first;
> +       int end = nr_ofs_deltas - 1;
> +
> +       if (first < 0) {
> +               *first_index = 0;
> +               *last_index = -1;
> +               return;
> +       }
> +       while (first > 0 && ofs_deltas[first - 1].offset == offset)
> +               --first;
> +       while (last < end && ofs_deltas[last + 1].offset == offset)
> +               ++last;
> +       *first_index = first;
> +       *last_index = last;
> +}
> +
> +static int compare_ref_delta_bases(const unsigned char *sha1,
> +                                  const unsigned char *sha2,
> +                                  enum object_type type1,
> +                                  enum object_type type2)
>  {
>         int cmp = type1 - type2;
>         if (cmp)
>                 return cmp;
> -       return memcmp(base1, base2, UNION_BASE_SZ);
> +       return hashcmp(sha1, sha2);
>  }
>
> -static int find_delta(const union delta_base *base, enum object_type type)
> +static int find_ref_delta(const unsigned char *sha1, enum object_type type)
>  {
> -       int first = 0, last = nr_deltas;
> -
> -        while (first < last) {
> -                int next = (first + last) / 2;
> -                struct delta_entry *delta = &deltas[next];
> -                int cmp;
> -
> -               cmp = compare_delta_bases(base, &delta->base,
> -                                         type, objects[delta->obj_no].type);
> -                if (!cmp)
> -                        return next;
> -                if (cmp < 0) {
> -                        last = next;
> -                        continue;
> -                }
> -                first = next+1;
> -        }
> -        return -first-1;
> +       int first = 0, last = nr_ref_deltas;
> +
> +       while (first < last) {
> +               int next = (first + last) / 2;
> +               struct ref_delta_entry *delta = &ref_deltas[next];
> +               int cmp;
> +
> +               cmp = compare_ref_delta_bases(sha1, delta->sha1,
> +                                             type, objects[delta->obj_no].type);
> +               if (!cmp)
> +                       return next;
> +               if (cmp < 0) {
> +                       last = next;
> +                       continue;
> +               }
> +               first = next+1;
> +       }
> +       return -first-1;
>  }
>
> -static void find_delta_children(const union delta_base *base,
> -                               int *first_index, int *last_index,
> -                               enum object_type type)
> +static void find_ref_delta_children(const unsigned char *sha1,
> +                                   int *first_index, int *last_index,
> +                                   enum object_type type)
>  {
> -       int first = find_delta(base, type);
> +       int first = find_ref_delta(sha1, type);
>         int last = first;
> -       int end = nr_deltas - 1;
> +       int end = nr_ref_deltas - 1;
>
>         if (first < 0) {
>                 *first_index = 0;
>                 *last_index = -1;
>                 return;
>         }
> -       while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
> +       while (first > 0 && !hashcmp(ref_deltas[first - 1].sha1, sha1))
>                 --first;
> -       while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
> +       while (last < end && !hashcmp(ref_deltas[last + 1].sha1, sha1))
>                 ++last;
>         *first_index = first;
>         *last_index = last;
> @@ -927,16 +977,13 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
>                                                   struct base_data *prev_base)
>  {
>         if (base->ref_last == -1 && base->ofs_last == -1) {
> -               union delta_base base_spec;
> -
> -               hashcpy(base_spec.sha1, base->obj->idx.sha1);
> -               find_delta_children(&base_spec,
> -                                   &base->ref_first, &base->ref_last, OBJ_REF_DELTA);
> +               find_ref_delta_children(base->obj->idx.sha1,
> +                                       &base->ref_first, &base->ref_last,
> +                                       OBJ_REF_DELTA);
>
> -               memset(&base_spec, 0, sizeof(base_spec));
> -               base_spec.offset = base->obj->idx.offset;
> -               find_delta_children(&base_spec,
> -                                   &base->ofs_first, &base->ofs_last, OBJ_OFS_DELTA);
> +               find_ofs_delta_children(base->obj->idx.offset,
> +                                       &base->ofs_first, &base->ofs_last,
> +                                       OBJ_OFS_DELTA);
>
>                 if (base->ref_last == -1 && base->ofs_last == -1) {
>                         free(base->data);
> @@ -947,7 +994,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
>         }
>
>         if (base->ref_first <= base->ref_last) {
> -               struct object_entry *child = objects + deltas[base->ref_first].obj_no;
> +               struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no;
>                 struct base_data *result = alloc_base_data();
>
>                 if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA,
> @@ -963,7 +1010,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
>         }
>
>         if (base->ofs_first <= base->ofs_last) {
> -               struct object_entry *child = objects + deltas[base->ofs_first].obj_no;
> +               struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no;
>                 struct base_data *result = alloc_base_data();
>
>                 assert(child->real_type == OBJ_OFS_DELTA);
> @@ -999,15 +1046,20 @@ static void find_unresolved_deltas(struct base_data *base)
>         }
>  }
>
> -static int compare_delta_entry(const void *a, const void *b)
> +static int compare_ofs_delta_entry(const void *a, const void *b)
> +{
> +       const struct ofs_delta_entry *delta_a = a;
> +       const struct ofs_delta_entry *delta_b = b;
> +
> +       return delta_a->offset - delta_b->offset;
> +}
> +
> +static int compare_ref_delta_entry(const void *a, const void *b)
>  {
> -       const struct delta_entry *delta_a = a;
> -       const struct delta_entry *delta_b = b;
> +       const struct ref_delta_entry *delta_a = a;
> +       const struct ref_delta_entry *delta_b = b;
>
> -       /* group by type (ref vs ofs) and then by value (sha-1 or offset) */
> -       return compare_delta_bases(&delta_a->base, &delta_b->base,
> -                                  objects[delta_a->obj_no].type,
> -                                  objects[delta_b->obj_no].type);
> +       return hashcmp(delta_a->sha1, delta_b->sha1);
>  }
>
>  static void resolve_base(struct object_entry *obj)
> @@ -1053,7 +1105,8 @@ static void *threaded_second_pass(void *data)
>  static void parse_pack_objects(unsigned char *sha1)
>  {
>         int i, nr_delays = 0;
> -       struct delta_entry *delta = deltas;
> +       struct ofs_delta_entry *ofs_delta = ofs_deltas;
> +       unsigned char ref_delta_sha1[20];
>         struct stat st;
>
>         if (verbose)
> @@ -1062,12 +1115,18 @@ static void parse_pack_objects(unsigned char *sha1)
>                                 nr_objects);
>         for (i = 0; i < nr_objects; i++) {
>                 struct object_entry *obj = &objects[i];
> -               void *data = unpack_raw_entry(obj, &delta->base, obj->idx.sha1);
> +               void *data = unpack_raw_entry(obj, &ofs_delta->offset,
> +                                             ref_delta_sha1, obj->idx.sha1);
>                 obj->real_type = obj->type;
> -               if (is_delta_type(obj->type)) {
> -                       nr_deltas++;
> -                       delta->obj_no = i;
> -                       delta++;
> +               if (obj->type == OBJ_OFS_DELTA) {
> +                       nr_ofs_deltas++;
> +                       ofs_delta->obj_no = i;
> +                       ofs_delta++;
> +               } else if (obj->type == OBJ_REF_DELTA) {
> +                       ALLOC_GROW(ref_deltas, nr_ref_deltas + 1, ref_deltas_alloc);
> +                       hashcpy(ref_deltas[nr_ref_deltas].sha1, ref_delta_sha1);
> +                       ref_deltas[nr_ref_deltas].obj_no = i;
> +                       nr_ref_deltas++;
>                 } else if (!data) {
>                         /* large blobs, check later */
>                         obj->real_type = OBJ_BAD;
> @@ -1118,15 +1177,18 @@ static void resolve_deltas(void)
>  {
>         int i;
>
> -       if (!nr_deltas)
> +       if (!nr_ofs_deltas && !nr_ref_deltas)
>                 return;
>
>         /* Sort deltas by base SHA1/offset for fast searching */
> -       qsort(deltas, nr_deltas, sizeof(struct delta_entry),
> -             compare_delta_entry);
> +       qsort(ofs_deltas, nr_ofs_deltas, sizeof(struct ofs_delta_entry),
> +             compare_ofs_delta_entry);
> +       qsort(ref_deltas, nr_ref_deltas, sizeof(struct ref_delta_entry),
> +             compare_ref_delta_entry);
>
>         if (verbose)
> -               progress = start_progress(_("Resolving deltas"), nr_deltas);
> +               progress = start_progress(_("Resolving deltas"),
> +                                         nr_ref_deltas + nr_ofs_deltas);
>
>  #ifndef NO_PTHREADS
>         nr_dispatched = 0;
> @@ -1164,7 +1226,7 @@ static void resolve_deltas(void)
>  static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved);
>  static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned char *pack_sha1)
>  {
> -       if (nr_deltas == nr_resolved_deltas) {
> +       if (nr_ref_deltas + nr_ofs_deltas == nr_resolved_deltas) {
>                 stop_progress(&progress);
>                 /* Flush remaining pack final 20-byte SHA1. */
>                 flush();
> @@ -1175,7 +1237,7 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
>                 struct sha1file *f;
>                 unsigned char read_sha1[20], tail_sha1[20];
>                 struct strbuf msg = STRBUF_INIT;
> -               int nr_unresolved = nr_deltas - nr_resolved_deltas;
> +               int nr_unresolved = nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas;
>                 int nr_objects_initial = nr_objects;
>                 if (nr_unresolved <= 0)
>                         die(_("confusion beyond insanity"));
> @@ -1197,11 +1259,11 @@ static void conclude_pack(int fix_thin_pack, const char *curr_pack, unsigned cha
>                         die(_("Unexpected tail checksum for %s "
>                               "(disk corruption?)"), curr_pack);
>         }
> -       if (nr_deltas != nr_resolved_deltas)
> +       if (nr_ofs_deltas + nr_ref_deltas != nr_resolved_deltas)
>                 die(Q_("pack has %d unresolved delta",
>                        "pack has %d unresolved deltas",
> -                      nr_deltas - nr_resolved_deltas),
> -                   nr_deltas - nr_resolved_deltas);
> +                      nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas),
> +                   nr_ofs_deltas + nr_ref_deltas - nr_resolved_deltas);
>  }
>
>  static int write_compressed(struct sha1file *f, void *in, unsigned int size)
> @@ -1261,14 +1323,14 @@ static struct object_entry *append_obj_to_pack(struct sha1file *f,
>
>  static int delta_pos_compare(const void *_a, const void *_b)
>  {
> -       struct delta_entry *a = *(struct delta_entry **)_a;
> -       struct delta_entry *b = *(struct delta_entry **)_b;
> +       struct ref_delta_entry *a = *(struct ref_delta_entry **)_a;
> +       struct ref_delta_entry *b = *(struct ref_delta_entry **)_b;
>         return a->obj_no - b->obj_no;
>  }
>
>  static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
>  {
> -       struct delta_entry **sorted_by_pos;
> +       struct ref_delta_entry **sorted_by_pos;
>         int i, n = 0;
>
>         /*
> @@ -1282,28 +1344,25 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
>          * resolving deltas in the same order as their position in the pack.
>          */
>         sorted_by_pos = xmalloc(nr_unresolved * sizeof(*sorted_by_pos));
> -       for (i = 0; i < nr_deltas; i++) {
> -               if (objects[deltas[i].obj_no].real_type != OBJ_REF_DELTA)
> -                       continue;
> -               sorted_by_pos[n++] = &deltas[i];
> -       }
> +       for (i = 0; i < nr_ref_deltas; i++)
> +               sorted_by_pos[n++] = &ref_deltas[i];
>         qsort(sorted_by_pos, n, sizeof(*sorted_by_pos), delta_pos_compare);
>
>         for (i = 0; i < n; i++) {
> -               struct delta_entry *d = sorted_by_pos[i];
> +               struct ref_delta_entry *d = sorted_by_pos[i];
>                 enum object_type type;
>                 struct base_data *base_obj = alloc_base_data();
>
>                 if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
>                         continue;
> -               base_obj->data = read_sha1_file(d->base.sha1, &type, &base_obj->size);
> +               base_obj->data = read_sha1_file(d->sha1, &type, &base_obj->size);
>                 if (!base_obj->data)
>                         continue;
>
> -               if (check_sha1_signature(d->base.sha1, base_obj->data,
> +               if (check_sha1_signature(d->sha1, base_obj->data,
>                                 base_obj->size, typename(type)))
> -                       die(_("local object %s is corrupt"), sha1_to_hex(d->base.sha1));
> -               base_obj->obj = append_obj_to_pack(f, d->base.sha1,
> +                       die(_("local object %s is corrupt"), sha1_to_hex(d->sha1));
> +               base_obj->obj = append_obj_to_pack(f, d->sha1,
>                                         base_obj->data, base_obj->size, type);
>                 find_unresolved_deltas(base_obj);
>                 display_progress(progress, nr_resolved_deltas);
> @@ -1495,7 +1554,7 @@ static void read_idx_option(struct pack_idx_option *opts, const char *pack_name)
>
>  static void show_pack_info(int stat_only)
>  {
> -       int i, baseobjects = nr_objects - nr_deltas;
> +       int i, baseobjects = nr_objects - nr_ref_deltas - nr_ofs_deltas;
>         unsigned long *chain_histogram = NULL;
>
>         if (deepest_delta)
> @@ -1680,11 +1739,12 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
>         objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
>         if (show_stat)
>                 obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
> -       deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
> +       ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
>         parse_pack_objects(pack_sha1);
>         resolve_deltas();
>         conclude_pack(fix_thin_pack, curr_pack, pack_sha1);
> -       free(deltas);
> +       free(ofs_deltas);
> +       free(ref_deltas);
>         if (strict)
>                 foreign_nr = check_objects();
>
> --
> 2.2.0.513.g477eb31
>
> -- 8< --

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: reduce memory footprint a bit
  2015-02-10 12:08         ` matthew sporleder
@ 2015-02-10 18:49           ` Junio C Hamano
  2015-02-11 13:01             ` matthew sporleder
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2015-02-10 18:49 UTC (permalink / raw)
  To: matthew sporleder; +Cc: Duy Nguyen, Git Mailing List

matthew sporleder <msporleder@gmail.com> writes:

> I'm having trouble getting this new patch to apply.

Apply the first one, replace all object_entry_extra with
object_stat, replace all objects_extra with obj_stat and amend the
first one.  Then apply this one.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: reduce memory footprint a bit
  2015-02-10 18:49           ` Junio C Hamano
@ 2015-02-11 13:01             ` matthew sporleder
  2015-02-11 13:10               ` Duy Nguyen
  0 siblings, 1 reply; 12+ messages in thread
From: matthew sporleder @ 2015-02-11 13:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Git Mailing List

On Tue, Feb 10, 2015 at 12:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
> matthew sporleder <msporleder@gmail.com> writes:
>
>> I'm having trouble getting this new patch to apply.
>
> Apply the first one, replace all object_entry_extra with
> object_stat, replace all objects_extra with obj_stat and amend the
> first one.  Then apply this one.

I got this to work and had a good experience but got this from an arm user:

Cloning into 'NetBSD-src-git'...
remote: Counting objects: 3484984, done.
remote: Compressing objects: 100% (636083/636083), done.
error: index-pack died of signal 10
fatal: index-pack failed
      125.84 real         0.13 user         0.49 sys

Core was generated by `git'.
Program terminated with signal SIGBUS, Bus error.
#0  0x00045f88 in cmd_index_pack ()
(gdb) bt
#0  0x00045f88 in cmd_index_pack ()
#1  0x00014058 in handle_builtin ()
#2  0x00129358 in main ()


I will wait for the "official" patch and ask if my friend can compile with -g.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: reduce memory footprint a bit
  2015-02-11 13:01             ` matthew sporleder
@ 2015-02-11 13:10               ` Duy Nguyen
  0 siblings, 0 replies; 12+ messages in thread
From: Duy Nguyen @ 2015-02-11 13:10 UTC (permalink / raw)
  To: matthew sporleder; +Cc: Junio C Hamano, Git Mailing List

On Wed, Feb 11, 2015 at 8:01 PM, matthew sporleder <msporleder@gmail.com> wrote:
> On Tue, Feb 10, 2015 at 12:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> matthew sporleder <msporleder@gmail.com> writes:
>>
>>> I'm having trouble getting this new patch to apply.
>>
>> Apply the first one, replace all object_entry_extra with
>> object_stat, replace all objects_extra with obj_stat and amend the
>> first one.  Then apply this one.
>
> I got this to work and had a good experience but got this from an arm user:
>
> Cloning into 'NetBSD-src-git'...
> remote: Counting objects: 3484984, done.
> remote: Compressing objects: 100% (636083/636083), done.
> error: index-pack died of signal 10
> fatal: index-pack failed
>       125.84 real         0.13 user         0.49 sys
>
> Core was generated by `git'.
> Program terminated with signal SIGBUS, Bus error.

It might be the effect of __attribute__((packed)). Maybe you could try
again without that in builtin/index-pack.c. Also could you run gdb and
do

p sizeof(*ofs_deltas)

? No need to actually run it from gdb.
-- 
Duy

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-02-11 13:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-08 16:10 low memory system to clone larger repo matthew sporleder
2015-02-09 10:40 ` Duy Nguyen
2015-02-09 11:20   ` Matt Sporleder
2015-02-09 12:32 ` Duy Nguyen
2015-02-09 13:18   ` [PATCH] index-pack: reduce memory footprint a bit Nguyễn Thái Ngọc Duy
2015-02-09 19:27     ` Junio C Hamano
2015-02-10  9:30       ` Duy Nguyen
2015-02-10 12:08         ` matthew sporleder
2015-02-10 18:49           ` Junio C Hamano
2015-02-11 13:01             ` matthew sporleder
2015-02-11 13:10               ` Duy Nguyen
2015-02-10  3:56   ` low memory system to clone larger repo matthew sporleder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.