* [PULL REQUEST] initial pack v4 support
@ 2013-09-09 19:52 Nicolas Pitre
  2013-09-09 22:28 ` Junio C Hamano
  2013-09-10 21:21 ` Junio C Hamano
  0 siblings, 2 replies; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-09 19:52 UTC
  To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git


Junio, would you please pull the following into pu:

	git://git.linaro.org/people/nico/git

This is the pack v4 work to date which is somewhat getting usable.  It 
is time it gets more exposure, and possibly some more people's attention 
who would like to work on the missing parts as I need to scale down my 
own involvement.

I've included the latest patches from Nguyn Thái Ngc Duy (sorry for not 
handling your name properly) as well.  There is no test suite for this 
yet, but the added code doesn't seem to regress the existing tests.

Nguyễn Thái Ngọc Duy (31):
      Document pack v4 format
      pack v4: allocate dicts from the beginning
      pack v4: stop using static/global variables in packv4-create.c
      pack v4: move packv4-create.c to libgit.a
      pack v4: split pv4_create_dict() out of load_dict()
      pack v4: add pv4_free_dict()
      index-pack: add more comments on some big functions
      index-pack: split out varint decoding code
      index-pack: do not allocate buffer for unpacking deltas in the first pass
      index-pack: split inflate/digest code out of unpack_entry_data
      index-pack: parse v4 header and dictionaries
      index-pack: make sure all objects are registered in v4's SHA-1 table
      index-pack: parse v4 commit format
      index-pack: parse v4 tree format
      index-pack: move delta base queuing code to unpack_raw_entry
      index-pack: record all delta bases in v4 (tree and ref-delta)
      index-pack: skip looking for ofs-deltas in v4 as they are not allowed
      index-pack: resolve v4 one-base trees
      pack v4: add version argument to write_pack_header
      pack_write: tighten valid object type check in encode_in_pack_object_header
      pack-write.c: add pv4_encode_object_header
      pack-objects: add --version to specify written pack version
      list-objects.c: add show_tree_entry callback to traverse_commit_list
      pack-objects: do not cache delta for v4 trees
      pack-objects: exclude commits out of delta objects in v4
      pack-objects: create pack v4 tables
      pack-objects: prepare SHA-1 table in v4
      pack-objects: support writing pack v4
      pack v4: support "end-of-pack" indicator in index-pack and pack-objects
      index-pack: use nr_objects_final as sha1_table size
      index-pack: support completing thin packs v4

Nicolas Pitre (41):
      pack v4: initial pack dictionary structure and code
      export packed_object_info()
      pack v4: scan tree objects
      pack v4: add tree entry mode support to dictionary entries
      pack v4: add commit object parsing
      pack v4: split the object list and dictionary creation
      pack v4: move to struct pack_idx_entry and get rid of our own struct idx_entry
      pack v4: basic SHA1 reference encoding
      introduce get_sha1_lowhex()
      pack v4: commit object encoding
      pack v4: tree object encoding
      pack v4: dictionary table output
      pack v4: creation code
      pack v4: object headers
      pack v4: object data copy
      pack v4: object writing
      pack v4: tree object delta encoding
      pack v4: load delta candidate for encoding tree objects
      packv4-create: optimize delta encoding
      pack v4: honor pack.compression config option
      pack v4: relax commit parsing a bit
      packv4-create: don't transcode tree objects with zero-padded file modes
      pack index v3
      packv4-create: normalize pack name to properly generate the pack index file name
      packv4-create: add progress display
      pack v4: initial pack index v3 support on the read side
      pack v4: object header decode
      pack v4: code to obtain a SHA1 from a sha1ref
      pack v4: code to load and prepare a pack dictionary table for use
      pack v4: code to retrieve an ident entry
      pack v4: code to recreate a canonical commit object
      sha1_file.c: make use of decode_varint()
      pack v4: parse delta base reference
      pack v4: we can read commit objects now
      pack v4: code to retrieve a path component
      pack v4: decode tree objects
      pack v4: we can read tree objects now
      packv4-create: add a command line argument to limit tree copy sequences
      pack v4: allow canonical commit and tree objects
      packv4-parse.c: get rid of snprintf()
      packv4-parse.c: allow tree entry copying from a canonical tree object

 Documentation/technical/pack-format.txt | 138 ++++-
 Makefile                                |   5 +
 builtin/index-pack.c                    | 781 ++++++++++++++++++++++----
 builtin/pack-objects.c                  | 230 +++++++-
 builtin/rev-list.c                      |   4 +-
 bulk-checkin.c                          |   2 +-
 cache.h                                 |  13 +
 hex.c                                   |  11 +
 list-objects.c                          |   9 +-
 list-objects.h                          |   3 +-
 pack-check.c                            |   4 +-
 pack-revindex.c                         |   7 +-
 pack-write.c                            |  57 +-
 pack.h                                  |   6 +-
 packv4-create.c                         | 685 ++++++++++++++++++++++
 packv4-create.h                         |  39 ++
 packv4-parse.c                          | 576 +++++++++++++++++++
 packv4-parse.h                          |  18 +
 sha1_file.c                             | 116 +++-
 test-packv4.c                           | 476 ++++++++++++++++
 upload-pack.c                           |   2 +-
 21 files changed, 3007 insertions(+), 175 deletions(-)



* Re: [PULL REQUEST] initial pack v4 support
  2013-09-09 19:52 [PULL REQUEST] initial pack v4 support Nicolas Pitre
@ 2013-09-09 22:28 ` Junio C Hamano
  2013-09-10 21:21 ` Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-09-09 22:28 UTC
  To: Nicolas Pitre; +Cc: Nguyễn Thái Ngọc Duy, git

I am very excited, but I already am deep into today's integration
cycle, so I'd have to do this either tonight or early tomorrow.

Thanks.


* Re: [PULL REQUEST] initial pack v4 support
  2013-09-09 19:52 [PULL REQUEST] initial pack v4 support Nicolas Pitre
  2013-09-09 22:28 ` Junio C Hamano
@ 2013-09-10 21:21 ` Junio C Hamano
  2013-09-10 21:32   ` Nicolas Pitre
                     ` (2 more replies)
  1 sibling, 3 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-09-10 21:21 UTC
  To: Nicolas Pitre; +Cc: Nguyễn Thái Ngọc Duy, git

Nicolas Pitre <nico@fluxnic.net> writes:

> Junio, would you please pull the following into pu:
>
> 	git://git.linaro.org/people/nico/git
>
> This is the pack v4 work to date which is somewhat getting usable.  It 
> is time it gets more exposure, and possibly some more people's attention 
> who would like to work on the missing parts as I need to scale down my 
> own involvement.

Thanks.  Parked on 'pu'.

>       packv4-parse.c: allow tree entry copying from a canonical tree object

This one needed a small fix-up to make it compile.

I do not particularly like reusing that "size" variable, but it
seemed to be dead at that point, so...

 packv4-parse.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/packv4-parse.c b/packv4-parse.c
index f96acc1..3f95ed4 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -365,13 +365,14 @@ static int copy_canonical_tree_entries(struct packed_git *p, off_t offset,
 		update_tree_entry(&desc);
 	end = desc.buffer;
 
-	if (end - from > *sizep) {
+	size = (const char *)end - (const char *)from;
+	if (size > *sizep) {
 		free(data);
 		return -1;
 	}
-	memcpy(*dstp, from, end - from);
-	*dstp += end - from;
-	*sizep -= end - from;
+	memcpy(*dstp, from, size);
+	*dstp += size;
+	*sizep -= size;
 	free(data);
 	return 0;
 }
-- 
1.8.4-468-g1185e84


* Re: [PULL REQUEST] initial pack v4 support
  2013-09-10 21:21 ` Junio C Hamano
@ 2013-09-10 21:32   ` Nicolas Pitre
  2013-09-10 21:52     ` Junio C Hamano
  2013-09-10 22:31   ` Nicolas Pitre
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-10 21:32 UTC
  To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git

On Tue, 10 Sep 2013, Junio C Hamano wrote:

> Nicolas Pitre <nico@fluxnic.net> writes:
> 
> > Junio, would you please pull the following into pu:
> >
> > 	git://git.linaro.org/people/nico/git
> >
> > This is the pack v4 work to date which is somewhat getting usable.  It 
> > is time it gets more exposure, and possibly some more people's attention 
> > who would like to work on the missing parts as I need to scale down my 
> > own involvement.
> 
> Thanks.  Parked on 'pu'.

Good.

> >       packv4-parse.c: allow tree entry copying from a canonical tree object
> 
> This one needed a small fix-up to make it compile.
> 
> I do not particularly like reusing that "size" variable, but it
> seemed to be dead at that point, so...

Feel free to fold this in the original commit.

I'm curious... what compiler are you using?  My gcc version is just 
happy to do arithmetic on void pointers.


Nicolas


* Re: [PULL REQUEST] initial pack v4 support
  2013-09-10 21:32   ` Nicolas Pitre
@ 2013-09-10 21:52     ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-09-10 21:52 UTC
  To: Nicolas Pitre; +Cc: Nguyễn Thái Ngọc Duy, git

Nicolas Pitre <nico@fluxnic.net> writes:

> On Tue, 10 Sep 2013, Junio C Hamano wrote:
>
>> Nicolas Pitre <nico@fluxnic.net> writes:
>> 
>> > Junio, would you please pull the following into pu:
>> >
>> > 	git://git.linaro.org/people/nico/git
>> >
>> > This is the pack v4 work to date which is somewhat getting usable.  It 
>> > is time it gets more exposure, and possibly some more people's attention 
>> > who would like to work on the missing parts as I need to scale down my 
>> > own involvement.
>> 
>> Thanks.  Parked on 'pu'.
>
> Good.
>
>> >       packv4-parse.c: allow tree entry copying from a canonical tree object
>> 
>> This one needed a small fix-up to make it compile.
>> 
>> I do not particularly like reusing that "size" variable, but it
>> seemed to be dead at that point, so...
>
> Feel free to fold this in the original commit.
>
> I'm curious... what compiler are you using?  My gcc version is just 
> happy to do arithmetic on void pointers.

I have -Werror -Wpointer-arith -Woverflow -Wno-pointer-to-int-cast
defined in my private build script (the todo branch is checked out
as Meta/ subdirectory of git.git, and "Meta/Make --pedantic" is how
I build things).
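
For readers unfamiliar with the warning: arithmetic on void pointers is
a GNU C extension (gcc treats the pointed-to size as 1), which is why a
default gcc build accepts it and -Wpointer-arith combined with -Werror
rejects it.  A minimal sketch of the portable form follows; byte_span()
is a hypothetical helper used only for illustration and is not part of
packv4-parse.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not from the series: distance in bytes between
 * two pointers into the same buffer.  Converting to const char * first
 * keeps the subtraction within ISO C, so it compiles cleanly under
 * -Wpointer-arith -Werror.
 */
static size_t byte_span(const void *from, const void *end)
{
	const char *f = from;
	const char *e = end;

	assert(e >= f);
	return (size_t)(e - f);
}
```

The fix-up above does the same thing inline with casts; declaring the
two variables as const char * from the start would avoid the casts
altogether.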


* Re: [PULL REQUEST] initial pack v4 support
  2013-09-10 21:21 ` Junio C Hamano
  2013-09-10 21:32   ` Nicolas Pitre
@ 2013-09-10 22:31   ` Nicolas Pitre
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-10 22:31 UTC
  To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git

On Tue, 10 Sep 2013, Junio C Hamano wrote:

> >       packv4-parse.c: allow tree entry copying from a canonical tree object
> 
> This one needed a small fix-up to make it compile.
> 
> I do not particularly like reusing that "size" variable, but it
> seemed to be dead at that point, so...
> 
>  packv4-parse.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/packv4-parse.c b/packv4-parse.c
> index f96acc1..3f95ed4 100644
> --- a/packv4-parse.c
> +++ b/packv4-parse.c
> @@ -365,13 +365,14 @@ static int copy_canonical_tree_entries(struct packed_git *p, off_t offset,
>  		update_tree_entry(&desc);
>  	end = desc.buffer;
>  
> -	if (end - from > *sizep) {
> +	size = (const char *)end - (const char *)from;
> +	if (size > *sizep) {

BTW, a simpler fix might simply involve declaring those 2 variables as 
const char * up front which would remove the need for any cast.


Nicolas


* [PATCH 00/21] np/pack-v4 updates
  2013-09-10 21:21 ` Junio C Hamano
  2013-09-10 21:32   ` Nicolas Pitre
  2013-09-10 22:31   ` Nicolas Pitre
@ 2013-09-11  6:06   ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 01/21] fixup! pack-objects: prepare SHA-1 table in v4 Nguyễn Thái Ngọc Duy
                       ` (22 more replies)
  2 siblings, 23 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy

This contains fixups for some of my patches and some of Nico's, and adds
v4 support to unpack-objects because the test suite needs it. With these,
when pack v4 generation is forced unconditionally, the remaining failing
tests are:

 - t5300-pack-object: ofs-delta tests fail (not surprising).
   core.packsizelimit also fails. Kinda expected, but not my top
   priority.

 - t5302-pack-index: mainly to test .idx v2, expected

 - t5303-pack-corruption-resilience: if I force generating .idx v2
   with .pack v4, I could get to 1/3 of it. Need a deeper look.

So the v4 code is in pretty good shape in terms of correctness and is
nearly feature complete. Brave souls should try it out.

Nguyễn Thái Ngọc Duy (21):
  fixup! pack-objects: prepare SHA-1 table in v4
  fixup! pack-objects: support writing pack v4
  fixup! pack v4: support "end-of-pack" indicator in index-pack and pack-objects
  fixup! index-pack: parse v4 header and dictionaries
  fixup! index-pack: record all delta bases in v4 (tree and ref-delta)
  pack v4: lift dict size check in load_dict()
  pack v4: move pv4 objhdr parsing code to packv4-parse.c
  pack-objects: respect compression level in v4
  pack-objects: recognize v4 as pack source
  pack v4: add a note that streaming does not support OBJ_PV4_*
  unpack-objects: report missing object name
  unpack-objects: recognize end-of-pack in v4 thin pack
  unpack-objects: read v4 dictionaries
  unpack-objects: decode v4 object header
  unpack-objects: decode v4 ref-delta
  unpack-objects: decode v4 commits
  unpack-objects: allow to save processed bytes to a buffer
  unpack-objects: decode v4 trees
  index-pack, pack-objects: allow creating .idx v2 with .pack v4
  show-index: acknowledge that it does not read .idx v3
  t1050, t5500: replace the use of "show-index|wc -l" with verify-pack

 builtin/index-pack.c     |  19 ++-
 builtin/pack-objects.c   |  60 +++++--
 builtin/unpack-objects.c | 395 ++++++++++++++++++++++++++++++++++++++++++++---
 packv4-create.c          |  17 +-
 packv4-create.h          |   6 +-
 packv4-parse.c           |  16 +-
 packv4-parse.h           |   7 +
 sha1_file.c              |   9 +-
 show-index.c             |   4 +-
 streaming.c              |   2 +-
 t/t1050-large.sh         |   9 +-
 t/t5500-fetch-pack.sh    |   4 +-
 test-packv4.c            |   9 +-
 13 files changed, 480 insertions(+), 77 deletions(-)

-- 
1.8.2.82.gc24b958


* [PATCH 01/21] fixup! pack-objects: prepare SHA-1 table in v4
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 02/21] fixup! pack-objects: support writing pack v4 Nguyễn Thái Ngọc Duy
                       ` (21 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 remove debugging code

 builtin/pack-objects.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1efb728..945b817 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -816,7 +816,6 @@ static void prepare_sha1_table(uint32_t start, struct object_entry **write_order
 		struct object_entry *e = write_order[i];
 		if (e->idx.offset > 0) {
 			v4.all_objs[v4.all_objs_nr++] = e->idx;
-			fprintf(stderr, "%s in\n", sha1_to_hex(e->idx.sha1));
 			e->idx.offset = 0;
 		}
 	}
-- 
1.8.2.82.gc24b958


* [PATCH 02/21] fixup! pack-objects: support writing pack v4
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 01/21] fixup! pack-objects: prepare SHA-1 table in v4 Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 03/21] fixup! pack v4: support "end-of-pack" indicator in index-pack and pack-objects Nguyễn Thái Ngọc Duy
                       ` (20 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 by setting usable_delta to zero, I disable tree delta in
 pack-objects. Some test cases spotted this.

 builtin/pack-objects.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 945b817..b60b1a0 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -256,7 +256,12 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
 	struct git_istream *st = NULL;
 	char *result = "OK";
 
-	if (!usable_delta) {
+	if (!usable_delta ||
+	    /*
+	     * Force loading canonical tree. In future we may want to
+	     * read v4 trees directly instead.
+	     */
+	    (pack_version == 4 && entry->type == OBJ_TREE)) {
 		if (entry->type == OBJ_BLOB &&
 		    entry->size > big_file_threshold &&
 		    (st = open_istream(entry->idx.sha1, &type, &size, NULL)) != NULL)
@@ -518,9 +523,6 @@ static unsigned long write_object(struct sha1file *f,
 	else
 		usable_delta = 0;	/* base could end up in another pack */
 
-	if (pack_version == 4 && entry->type == OBJ_TREE)
-		usable_delta = 0;
-
 	if (!reuse_object)
 		to_reuse = 0;	/* explicit */
 	else if (!entry->in_pack)
-- 
1.8.2.82.gc24b958


* [PATCH 03/21] fixup! pack v4: support "end-of-pack" indicator in index-pack and pack-objects
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 01/21] fixup! pack-objects: prepare SHA-1 table in v4 Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 02/21] fixup! pack-objects: support writing pack v4 Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 04/21] fixup! index-pack: parse v4 header and dictionaries Nguyễn Thái Ngọc Duy
                       ` (19 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 nr_objects contains a lot more than the number of objects to be
 written.

 builtin/pack-objects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b60b1a0..39d1e08 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -872,7 +872,7 @@ static void write_pack_file(void)
 			 * Pack v4 thin pack is terminated by a "type
 			 * 0, size 0" in variable length encoding
 			 */
-			if (pack_version == 4 && nr_written < nr_objects)
+			if (pack_version == 4 && nr_written < v4.all_objs_nr)
 				sha1write(f, &type_zero, 1);
 			sha1close(f, sha1, CSUM_CLOSE);
 		} else if (nr_written == nr_remaining) {
-- 
1.8.2.82.gc24b958
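
To illustrate why a single zero byte can serve as the terminator:
assuming the v4 object header is the varint encoding of
(size << 4) | type, which is what the decoding side of this series
implies, "type 0, size 0" encodes to exactly one 0x00 byte.  The sketch
below uses a local encoder modeled on encode_varint() in git's
varint.c; sketch_encode_varint() and sketch_encode_pv4_header() are
illustrative names and not functions from the series.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in modeled on encode_varint() in git's varint.c. */
static int sketch_encode_varint(uintmax_t value, unsigned char *buf)
{
	unsigned char varint[16];
	unsigned pos = sizeof(varint) - 1;

	/* Build the bytes back to front; the last byte has no MSB set. */
	varint[pos] = value & 127;
	while (value >>= 7)
		varint[--pos] = 128 | (--value & 127);
	memcpy(buf, varint + pos, sizeof(varint) - pos);
	return sizeof(varint) - pos;
}

/* Assumed layout: the low 4 bits carry the type, the rest the size. */
static int sketch_encode_pv4_header(unsigned type, uintmax_t size,
				    unsigned char *buf)
{
	return sketch_encode_varint((size << 4) | (type & 0xf), buf);
}
```

With this layout, sketch_encode_pv4_header(0, 0, buf) writes one byte
with value 0, matching the one-byte sha1write() of type_zero in the
hunk above.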


* [PATCH 04/21] fixup! index-pack: parse v4 header and dictionaries
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (2 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 03/21] fixup! pack v4: support "end-of-pack" indicator in index-pack and pack-objects Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 05/21] fixup! index-pack: record all delta bases in v4 (tree and ref-delta) Nguyễn Thái Ngọc Duy
                       ` (18 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 empty pack case

 builtin/index-pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 8a6e2a3..89bc708 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1474,7 +1474,8 @@ static void parse_dictionaries(void)
 		return;
 
 	sha1_table = xmalloc(20 * nr_objects_final);
-	hashcpy(sha1_table, fill_and_use(20));
+	if (nr_objects_final)
+		hashcpy(sha1_table, fill_and_use(20));
 	for (i = 1; i < nr_objects_final; i++) {
 		unsigned char *p = sha1_table + i * 20;
 		hashcpy(p, fill_and_use(20));
-- 
1.8.2.82.gc24b958


* [PATCH 05/21] fixup! index-pack: record all delta bases in v4 (tree and ref-delta)
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (3 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 04/21] fixup! index-pack: parse v4 header and dictionaries Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 06/21] pack v4: lift dict size check in load_dict() Nguyễn Thái Ngọc Duy
                       ` (17 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 compiler warning fix

 builtin/index-pack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 89bc708..1895adf 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -776,7 +776,7 @@ static void *unpack_raw_entry(struct object_entry *obj,
 	case OBJ_OFS_DELTA:
 		if (packv4)
 			die(_("pack version 4 does not support ofs-delta type (offset %lu)"),
-			    obj->idx.offset);
+			    (unsigned long)obj->idx.offset);
 		offset = obj->idx.offset - read_varint();
 		if (offset <= 0 || offset >= obj->idx.offset)
 			bad_object(obj->idx.offset,
-- 
1.8.2.82.gc24b958


* [PATCH 06/21] pack v4: lift dict size check in load_dict()
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (4 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 05/21] fixup! index-pack: record all delta bases in v4 (tree and ref-delta) Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 07/21] pack v4: move pv4 objhdr parsing code to packv4-parse.c Nguyễn Thái Ngọc Duy
                       ` (16 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy

A pack with no trees (or an empty pack) could have zero-sized name
dictionary.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 packv4-parse.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/packv4-parse.c b/packv4-parse.c
index f96acc1..80ad6fc 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -87,10 +87,6 @@ static struct packv4_dict *load_dict(struct packed_git *p, off_t *offset)
 	src = use_pack(p, &w_curs, curpos, &avail);
 	cp = src;
 	dict_size = decode_varint(&cp);
-	if (dict_size < 3) {
-		error("bad dict size");
-		return NULL;
-	}
 	curpos += cp - src;
 
 	data = xmallocz(dict_size);
-- 
1.8.2.82.gc24b958


* [PATCH 07/21] pack v4: move pv4 objhdr parsing code to packv4-parse.c
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (5 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 06/21] pack v4: lift dict size check in load_dict() Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 08/21] pack-objects: respect compression level in v4 Nguyễn Thái Ngọc Duy
                       ` (15 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 packv4-parse.c | 12 ++++++++++++
 packv4-parse.h |  5 +++++
 sha1_file.c    |  9 ++-------
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/packv4-parse.c b/packv4-parse.c
index 80ad6fc..7a43635 100644
--- a/packv4-parse.c
+++ b/packv4-parse.c
@@ -570,3 +570,15 @@ void *pv4_get_tree(struct packed_git *p, struct pack_window **w_curs,
 	}
 	return dst;
 }
+
+unsigned long pv4_unpack_object_header_buffer(const unsigned char *base,
+					      unsigned long len,
+					      enum object_type *type,
+					      unsigned long *sizep)
+{
+	const unsigned char *cp = base;
+	uintmax_t val = decode_varint(&cp);
+	*type = val & 0xf;
+	*sizep = val >> 4;
+	return cp - base;
+}
diff --git a/packv4-parse.h b/packv4-parse.h
index e6719f6..52f52f5 100644
--- a/packv4-parse.h
+++ b/packv4-parse.h
@@ -10,6 +10,11 @@ struct packv4_dict {
 struct packv4_dict *pv4_create_dict(const unsigned char *data, int dict_size);
 void pv4_free_dict(struct packv4_dict *dict);
 
+unsigned long pv4_unpack_object_header_buffer(const unsigned char *base,
+					      unsigned long len,
+					      enum object_type *type,
+					      unsigned long *sizep);
+
 void *pv4_get_commit(struct packed_git *p, struct pack_window **w_curs,
 		     off_t offset, unsigned long size);
 void *pv4_get_tree(struct packed_git *p, struct pack_window **w_curs,
diff --git a/sha1_file.c b/sha1_file.c
index 1528e28..038e22e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1736,13 +1736,8 @@ int unpack_object_header(struct packed_git *p,
 	base = use_pack(p, w_curs, *curpos, &left);
 	if (p->version < 4) {
 		used = unpack_object_header_buffer(base, left, &type, sizep);
-	} else {
-		const unsigned char *cp = base;
-		uintmax_t val = decode_varint(&cp);
-		used = cp - base;
-		type = val & 0xf;
-		*sizep = val >> 4;
-	}
+	} else
+		used = pv4_unpack_object_header_buffer(base, left, &type, sizep);
 	if (!used) {
 		type = OBJ_BAD;
 	} else
-- 
1.8.2.82.gc24b958
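
As a standalone illustration of the header split performed by
pv4_unpack_object_header_buffer(): the header is one varint whose low 4
bits are the object type and whose remaining bits are the size.  The
sketch below carries its own decoder, modeled on decode_varint() in
git's varint.c with the overflow checks left out for brevity;
sketch_decode_varint() and sketch_unpack_header() are illustrative
names only.  Like the function in the patch, it does not check the
input against a length bound.

```c
#include <stdint.h>

/* Local stand-in modeled on decode_varint() in git's varint.c
 * (overflow checks omitted for brevity). */
static uintmax_t sketch_decode_varint(const unsigned char **bufp)
{
	const unsigned char *buf = *bufp;
	unsigned char c = *buf++;
	uintmax_t val = c & 127;

	/* A set MSB means another byte follows. */
	while (c & 128) {
		val += 1;
		c = *buf++;
		val = (val << 7) + (c & 127);
	}
	*bufp = buf;
	return val;
}

/* Mirrors the type/size split in pv4_unpack_object_header_buffer(). */
static unsigned long sketch_unpack_header(const unsigned char *base,
					  unsigned *type,
					  unsigned long *sizep)
{
	const unsigned char *cp = base;
	uintmax_t val = sketch_decode_varint(&cp);

	*type = val & 0xf;	/* low nibble: object type */
	*sizep = val >> 4;	/* remaining bits: object size */
	return cp - base;	/* number of header bytes consumed */
}
```

For example, the single byte 0x23 decodes to type 3 with size 2, and
the two bytes 0x80 0x12 decode to type 2 with size 9.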


* [PATCH 08/21] pack-objects: respect compression level in v4
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (6 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 07/21] pack v4: move pv4 objhdr parsing code to packv4-parse.c Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 09/21] pack-objects: recognize v4 as pack source Nguyễn Thái Ngọc Duy
                       ` (14 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/pack-objects.c |  5 +++--
 packv4-create.c        | 17 ++++++++++-------
 packv4-create.h        |  6 ++++--
 test-packv4.c          |  9 +++++----
 4 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 39d1e08..63c9b9e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -295,7 +295,8 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
 		datalen = size;
 	else if (pack_version == 4 && entry->type == OBJ_COMMIT) {
 		datalen = size;
-		result = pv4_encode_commit(&v4, buf, &datalen);
+		result = pv4_encode_commit(&v4, buf, &datalen,
+					   pack_compression_level);
 		if (result) {
 			free(buf);
 			buf = result;
@@ -857,7 +858,7 @@ static void write_pack_file(void)
 		if (!offset)
 			die_errno("unable to write pack header");
 		if (pack_version == 4)
-			offset += packv4_write_tables(f, &v4);
+			offset += packv4_write_tables(f, &v4, pack_compression_level);
 		nr_written = 0;
 		for (; i < nr_objects; i++) {
 			struct object_entry *e = write_order[i];
diff --git a/packv4-create.c b/packv4-create.c
index 83a6336..3acd10f 100644
--- a/packv4-create.c
+++ b/packv4-create.c
@@ -18,8 +18,6 @@
 #include "packv4-create.h"
 
 
-int pack_compression_seen;
-int pack_compression_level = Z_DEFAULT_COMPRESSION;
 int min_tree_copy = 1;
 
 struct data_entry {
@@ -285,7 +283,8 @@ int encode_sha1ref(const struct packv4_tables *v4,
  * regenerated and produce the same hash.
  */
 void *pv4_encode_commit(const struct packv4_tables *v4,
-			void *buffer, unsigned long *sizep)
+			void *buffer, unsigned long *sizep,
+			int pack_compression_level)
 {
 	unsigned long size = *sizep;
 	char *in, *tail, *end;
@@ -611,7 +610,8 @@ void *pv4_encode_tree(const struct packv4_tables *v4,
 	return buffer;
 }
 
-static unsigned long write_dict_table(struct sha1file *f, struct dict_table *t)
+static unsigned long write_dict_table(struct sha1file *f, struct dict_table *t,
+				      int pack_compression_level)
 {
 	unsigned char buffer[1024];
 	unsigned hdrlen;
@@ -661,7 +661,8 @@ static unsigned long write_dict_table(struct sha1file *f, struct dict_table *t)
 }
 
 unsigned long packv4_write_tables(struct sha1file *f,
-				  const struct packv4_tables *v4)
+				  const struct packv4_tables *v4,
+				  int pack_compression_level)
 {
 	unsigned nr_objects = v4->all_objs_nr;
 	struct pack_idx_entry *objs = v4->all_objs;
@@ -676,10 +677,12 @@ unsigned long packv4_write_tables(struct sha1file *f,
 	written = 20 * nr_objects;
 
 	/* Then the commit dictionary table */
-	written += write_dict_table(f, commit_ident_table);
+	written += write_dict_table(f, commit_ident_table,
+				    pack_compression_level);
 
 	/* Followed by the path component dictionary table */
-	written += write_dict_table(f, tree_path_table);
+	written += write_dict_table(f, tree_path_table,
+				    pack_compression_level);
 
 	return written;
 }
diff --git a/packv4-create.h b/packv4-create.h
index ba4929a..4ac4d71 100644
--- a/packv4-create.h
+++ b/packv4-create.h
@@ -25,9 +25,11 @@ void sort_dict_entries_by_hits(struct dict_table *t);
 int encode_sha1ref(const struct packv4_tables *v4,
 		   const unsigned char *sha1, unsigned char *buf);
 unsigned long packv4_write_tables(struct sha1file *f,
-				  const struct packv4_tables *v4);
+				  const struct packv4_tables *v4,
+				  int pack_compression_level);
 void *pv4_encode_commit(const struct packv4_tables *v4,
-			void *buffer, unsigned long *sizep);
+			void *buffer, unsigned long *sizep,
+			int pack_compression_level);
 void *pv4_encode_tree(const struct packv4_tables *v4,
 		      void *_buffer, unsigned long *sizep,
 		      void *delta, unsigned long delta_size,
diff --git a/test-packv4.c b/test-packv4.c
index 3b0d7a2..b50422a 100644
--- a/test-packv4.c
+++ b/test-packv4.c
@@ -5,8 +5,8 @@
 #include "varint.h"
 #include "packv4-create.h"
 
-extern int pack_compression_seen;
-extern int pack_compression_level;
+static int pack_compression_seen;
+static int pack_compression_level = Z_DEFAULT_COMPRESSION;
 extern int min_tree_copy;
 
 static struct pack_idx_entry *get_packed_object_list(struct packed_git *p)
@@ -291,7 +291,8 @@ static off_t packv4_write_object(struct packv4_tables *v4,
 
 	switch (type) {
 	case OBJ_COMMIT:
-		result = pv4_encode_commit(v4, src, &buf_size);
+		result = pv4_encode_commit(v4, src, &buf_size,
+					   pack_compression_level);
 		break;
 	case OBJ_TREE:
 		if (packed_type != OBJ_TREE) {
@@ -407,7 +408,7 @@ void process_one_pack(struct packv4_tables *v4, char *src_pack, char *dst_pack)
 	if (!f)
 		die("unable to open destination pack");
 	written += packv4_write_header(f, nr_objects);
-	written += packv4_write_tables(f, v4);
+	written += packv4_write_tables(f, v4, pack_compression_level);
 
 	/* Let's write objects out, updating the object index list in place */
 	progress_state = start_progress("Writing objects", nr_objects);
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 09/21] pack-objects: recognize v4 as pack source
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (7 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 08/21] pack-objects: respect compression level in v4 Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 10/21] pack v4: add a note that streaming does not support OBJ_PV4_* Nguyễn Thái Ngọc Duy
                       ` (13 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/pack-objects.c | 28 +++++++++++++++++++++++++---
 packv4-parse.h         |  2 ++
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 63c9b9e..ac25973 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -19,6 +19,8 @@
 #include "streaming.h"
 #include "thread-utils.h"
 #include "packv4-create.h"
+#include "packv4-parse.h"
+#include "varint.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -1397,9 +1399,14 @@ static void check_object(struct object_entry *entry)
 		 * We want in_pack_type even if we do not reuse delta
 		 * since non-delta representations could still be reused.
 		 */
-		used = unpack_object_header_buffer(buf, avail,
-						   &entry->in_pack_type,
-						   &entry->size);
+		if (p->version < 4)
+			used = unpack_object_header_buffer(buf, avail,
+							   &entry->in_pack_type,
+							   &entry->size);
+		else
+			used = pv4_unpack_object_header_buffer(buf, avail,
+							       &entry->in_pack_type,
+							       &entry->size);
 		if (used == 0)
 			goto give_up;
 
@@ -1417,7 +1424,22 @@ static void check_object(struct object_entry *entry)
 				goto give_up;
 			unuse_pack(&w_curs);
 			return;
+		case OBJ_PV4_COMMIT:
+		case OBJ_PV4_TREE:
+			entry->type = entry->in_pack_type - 8;
+			entry->in_pack_header_size = used;
+			unuse_pack(&w_curs);
+			return;
 		case OBJ_REF_DELTA:
+			if (p->version == 4) {
+				const unsigned char *sha1, *cp;
+				cp = buf + used;
+				sha1 = get_sha1ref(p, &cp);
+				entry->in_pack_header_size = cp - buf;
+				if (reuse_delta && !entry->preferred_base)
+					base_ref = sha1;
+				break;
+			}
 			if (reuse_delta && !entry->preferred_base)
 				base_ref = use_pack(p, &w_curs,
 						entry->in_pack_offset + used, NULL);
diff --git a/packv4-parse.h b/packv4-parse.h
index 52f52f5..d674a3f 100644
--- a/packv4-parse.h
+++ b/packv4-parse.h
@@ -14,6 +14,8 @@ unsigned long pv4_unpack_object_header_buffer(const unsigned char *base,
 					      unsigned long len,
 					      enum object_type *type,
 					      unsigned long *sizep);
+const unsigned char *get_sha1ref(struct packed_git *p,
+				 const unsigned char **bufp);
 
 void *pv4_get_commit(struct packed_git *p, struct pack_window **w_curs,
 		     off_t offset, unsigned long size);
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 10/21] pack v4: add a note that streaming does not support OBJ_PV4_*
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (8 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 09/21] pack-objects: recognize v4 as pack source Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 11/21] unpack-objects: report missing object name Nguyễn Thái Ngọc Duy
                       ` (12 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 streaming.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/streaming.c b/streaming.c
index debe904..c7edebb 100644
--- a/streaming.c
+++ b/streaming.c
@@ -437,7 +437,7 @@ static open_method_decl(pack_non_delta)
 	unuse_pack(&window);
 	switch (in_pack_type) {
 	default:
-		return -1; /* we do not do deltas for now */
+		return -1; /* we do not do deltas nor pv4 types for now */
 	case OBJ_COMMIT:
 	case OBJ_TREE:
 	case OBJ_BLOB:
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 11/21] unpack-objects: report missing object name
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (9 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 10/21] pack v4: add a note that streaming does not support OBJ_PV4_* Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 12/21] unpack-objects: recognize end-of-pack in v4 thin pack Nguyễn Thái Ngọc Duy
                       ` (11 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 2217d7b..6d0a65c 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -193,7 +193,7 @@ static int check_object(struct object *obj, int type, void *data)
 		unsigned long size;
 		int type = sha1_object_info(obj->sha1, &size);
 		if (type != obj->type || type <= 0)
-			die("object of unexpected type");
+			die("object %s of unexpected type", sha1_to_hex(obj->sha1));
 		obj->flags |= FLAG_WRITTEN;
 		return 0;
 	}
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 12/21] unpack-objects: recognize end-of-pack in v4 thin pack
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (10 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 11/21] unpack-objects: report missing object name Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 13/21] unpack-objects: read v4 dictionaries Nguyễn Thái Ngọc Duy
                       ` (10 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6d0a65c..c9eb31d 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+static int packv4;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -421,7 +422,7 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 	free(base);
 }
 
-static void unpack_one(unsigned nr)
+static int unpack_one(unsigned nr)
 {
 	unsigned shift;
 	unsigned char *pack;
@@ -431,6 +432,10 @@ static void unpack_one(unsigned nr)
 	obj_list[nr].offset = consumed_bytes;
 
 	pack = fill(1);
+	if (packv4 && *(char*)fill(1) == 0) {
+		use(1);
+		return -1;
+	}
 	c = *pack;
 	use(1);
 	type = (c >> 4) & 7;
@@ -450,18 +455,19 @@ static void unpack_one(unsigned nr)
 	case OBJ_BLOB:
 	case OBJ_TAG:
 		unpack_non_delta_entry(type, size, nr);
-		return;
+		break;
 	case OBJ_REF_DELTA:
 	case OBJ_OFS_DELTA:
 		unpack_delta_entry(type, size, nr);
-		return;
+		break;
 	default:
 		error("bad object type %d", type);
 		has_errors = 1;
 		if (recover)
-			return;
+			break;
 		exit(1);
 	}
+	return 0;
 }
 
 static void unpack_all(void)
@@ -477,13 +483,15 @@ static void unpack_all(void)
 	if (!pack_version_ok(hdr->hdr_version))
 		die("unknown pack file version %"PRIu32,
 			ntohl(hdr->hdr_version));
+	packv4 = ntohl(hdr->hdr_version) == 4;
 	use(sizeof(struct pack_header));
 
 	if (!quiet)
 		progress = start_progress("Unpacking objects", nr_objects);
 	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
 	for (i = 0; i < nr_objects; i++) {
-		unpack_one(i);
+		if (unpack_one(i))
+			break;
 		display_progress(progress, i + 1);
 	}
 	stop_progress(&progress);
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 13/21] unpack-objects: read v4 dictionaries
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (11 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 12/21] unpack-objects: recognize end-of-pack in v4 thin pack Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 14/21] unpack-objects: decode v4 object header Nguyễn Thái Ngọc Duy
                       ` (9 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index c9eb31d..1a3c30e 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -10,6 +10,7 @@
 #include "tree-walk.h"
 #include "progress.h"
 #include "decorate.h"
+#include "packv4-parse.h"
 #include "fsck.h"
 
 static int dry_run, quiet, recover, has_errors, strict;
@@ -20,7 +21,10 @@ static unsigned char buffer[4096];
 static unsigned int offset, len;
 static off_t consumed_bytes;
 static git_SHA_CTX ctx;
+
 static int packv4;
+static unsigned char *sha1_table;
+static struct packv4_dict *name_dict, *path_dict;
 
 /*
  * When running under --strict mode, objects whose reachability are
@@ -89,6 +93,28 @@ static void use(int bytes)
 	consumed_bytes += bytes;
 }
 
+static inline void *fill_and_use(int bytes)
+{
+	void *p = fill(bytes);
+	use(bytes);
+	return p;
+}
+
+static uintmax_t read_varint(void)
+{
+	unsigned char c = *(char*)fill_and_use(1);
+	uintmax_t val = c & 127;
+	while (c & 128) {
+		val += 1;
+		if (!val || MSB(val, 7))
+			die("offset overflow in read_varint at %lu",
+			    (unsigned long)consumed_bytes);
+		c = *(char*)fill_and_use(1);
+		val = (val << 7) + (c & 127);
+	}
+	return val;
+}
+
 static void *get_data(unsigned long size)
 {
 	git_zstream stream;
@@ -470,6 +496,20 @@ static int unpack_one(unsigned nr)
 	return 0;
 }
 
+static struct packv4_dict *read_dict(void)
+{
+	unsigned long size;
+	unsigned char *data;
+	struct packv4_dict *dict;
+
+	size = read_varint();
+	data = get_data(size);
+	dict = pv4_create_dict(data, size);
+	if (!dict)
+		die("unable to parse dictionary");
+	return dict;
+}
+
 static void unpack_all(void)
 {
 	int i;
@@ -486,6 +526,16 @@ static void unpack_all(void)
 	packv4 = ntohl(hdr->hdr_version) == 4;
 	use(sizeof(struct pack_header));
 
+	if (packv4) {
+		sha1_table = xmalloc(20 * nr_objects);
+		for (i = 0; i < nr_objects; i++) {
+			unsigned char *p = sha1_table + i * 20;
+			hashcpy(p, fill_and_use(20));
+		}
+		name_dict = read_dict();
+		path_dict = read_dict();
+	}
+
 	if (!quiet)
 		progress = start_progress("Unpacking objects", nr_objects);
 	obj_list = xcalloc(nr_objects, sizeof(*obj_list));
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread
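
For readers following the read_varint() helper added in patch 13: it appears to use the same "add one per continuation byte" varint scheme git already uses for ofs-delta offsets. The sketch below restates that loop over an in-memory buffer instead of the fill()/use() stream. The name decode_varint_buf() and its return convention are assumptions made for illustration and are not part of the series.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): decode one varint from
 * buf[0..len). Returns the number of bytes consumed, or 0 on a
 * truncated or overflowing input.
 */
static size_t decode_varint_buf(const unsigned char *buf, size_t len,
				uintmax_t *out)
{
	size_t i = 0;
	unsigned char c;
	uintmax_t val;

	if (!len)
		return 0;
	c = buf[i++];
	val = c & 127;
	while (c & 128) {
		/* Each continuation byte adds one before shifting. */
		val += 1;
		/* Same intent as the MSB(val, 7) check: top 7 bits must be clear. */
		if (!val || (val >> (sizeof(val) * 8 - 7)))
			return 0;
		if (i == len)
			return 0;
		c = buf[i++];
		val = (val << 7) + (c & 127);
	}
	*out = val;
	return i;
}
```

Because of the "+1" step, the two-byte sequence 0x80 0x00 decodes to 128 rather than 0, so no value has two encodings of different length.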

* [PATCH 14/21] unpack-objects: decode v4 object header
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (12 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 13/21] unpack-objects: read v4 dictionaries Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 15/21] unpack-objects: decode v4 ref-delta Nguyễn Thái Ngọc Duy
                       ` (8 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 1a3c30e..a906a98 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -448,32 +448,38 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 	free(base);
 }
 
-static int unpack_one(unsigned nr)
+static void read_typesize_v2(enum object_type *type, unsigned long *size)
 {
+	unsigned char c = *(char*)fill_and_use(1);
 	unsigned shift;
-	unsigned char *pack;
-	unsigned long size, c;
+
+	*type = (c >> 4) & 7;
+	*size = (c & 15);
+	shift = 4;
+	while (c & 128) {
+		c = *(char*)fill_and_use(1);
+		*size += (c & 0x7f) << shift;
+		shift += 7;
+	}
+}
+
+static int unpack_one(unsigned nr)
+{
+	unsigned long size;
 	enum object_type type;
 
 	obj_list[nr].offset = consumed_bytes;
 
-	pack = fill(1);
 	if (packv4 && *(char*)fill(1) == 0) {
 		use(1);
 		return -1;
 	}
-	c = *pack;
-	use(1);
-	type = (c >> 4) & 7;
-	size = (c & 15);
-	shift = 4;
-	while (c & 0x80) {
-		pack = fill(1);
-		c = *pack;
-		use(1);
-		size += (c & 0x7f) << shift;
-		shift += 7;
-	}
+	if (packv4) {
+		uintmax_t val = read_varint();
+		type = val & 15;
+		size = val >> 4;
+	} else
+		read_typesize_v2(&type, &size);
 
 	switch (type) {
 	case OBJ_COMMIT:
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread
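
The header change in patch 14 can be summarised as follows: a v4 header is a single varint whose low 4 bits are the type and whose remaining bits are the size, while the v2 header keeps 3 type bits in the first byte and spreads the size over the continuation bytes. The sketch below shows both side by side. The struct and function names are assumptions for illustration only, and the v2 decoder reads from a buffer rather than the fill()/use() stream.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for a decoded header (not from the patch). */
struct obj_header {
	unsigned type;
	uintmax_t size;
};

/* v4: split an already-decoded varint into type (4 bits) and size. */
static struct obj_header split_v4_header(uintmax_t val)
{
	struct obj_header h;

	h.type = (unsigned)(val & 15);
	h.size = val >> 4;
	return h;
}

/*
 * v2: type is bits 4-6 of the first byte, size starts with the low
 * 4 bits and continues 7 bits at a time, least significant first.
 * Returns bytes consumed, or 0 on truncated or oversized input.
 */
static size_t decode_v2_header(const unsigned char *buf, size_t len,
			       struct obj_header *h)
{
	size_t i = 0;
	unsigned shift = 4;
	unsigned char c;

	if (!len)
		return 0;
	c = buf[i++];
	h->type = (c >> 4) & 7;
	h->size = c & 15;
	while (c & 128) {
		if (i == len || shift >= sizeof(h->size) * 8)
			return 0;
		c = buf[i++];
		h->size += (uintmax_t)(c & 0x7f) << shift;
		shift += 7;
	}
	return i;
}
```

The extra type bit is what leaves room for the OBJ_PV4_* values; patch 09 maps them back with "in_pack_type - 8", which suggests they sit 8 above the corresponding canonical types.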

* [PATCH 15/21] unpack-objects: decode v4 ref-delta
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (13 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 14/21] unpack-objects: decode v4 object header Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 16/21] unpack-objects: decode v4 commits Nguyễn Thái Ngọc Duy
                       ` (7 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index a906a98..f8442f4 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -23,6 +23,7 @@ static off_t consumed_bytes;
 static git_SHA_CTX ctx;
 
 static int packv4;
+static unsigned nr_objects;
 static unsigned char *sha1_table;
 static struct packv4_dict *name_dict, *path_dict;
 
@@ -115,6 +116,21 @@ static uintmax_t read_varint(void)
 	return val;
 }
 
+static const unsigned char *read_sha1ref(void)
+{
+	unsigned int index = read_varint();
+	if (!index) {
+		static unsigned char sha1[20];
+		hashcpy(sha1, fill_and_use(20));
+		return sha1;
+	}
+	index--;
+	if (index >= nr_objects)
+		die("bad index in read_sha1ref at %lu",
+		    (unsigned long)consumed_bytes);
+	return sha1_table + index * 20;
+}
+
 static void *get_data(unsigned long size)
 {
 	git_zstream stream;
@@ -185,7 +201,6 @@ struct obj_info {
 #define FLAG_WRITTEN (1u<<21)
 
 static struct obj_info *obj_list;
-static unsigned nr_objects;
 
 /*
  * Called only from check_object() after it verified this object
@@ -361,8 +376,12 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 	unsigned char base_sha1[20];
 
 	if (type == OBJ_REF_DELTA) {
-		hashcpy(base_sha1, fill(20));
-		use(20);
+		if (packv4)
+			hashcpy(base_sha1, read_sha1ref());
+		else {
+			hashcpy(base_sha1, fill(20));
+			use(20);
+		}
 		delta_data = get_data(delta_size);
 		if (dry_run || !delta_data) {
 			free(delta_data);
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread
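
The SHA-1 reference encoding that read_sha1ref() in patch 15 decodes works like this: a varint index of 0 means a literal 20-byte SHA-1 follows inline, and any other value is a 1-based index into the pack's SHA-1 table. A minimal buffer-based sketch follows. The names get_varint() and resolve_sha1ref() and the NULL-on-error convention are assumptions for illustration; the patch itself reads from the input stream and calls die() instead.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_LEN 20

/* Hypothetical helper: decode one varint at *bufp, advancing it. */
static int get_varint(const unsigned char **bufp, const unsigned char *end,
		      uintmax_t *out)
{
	const unsigned char *p = *bufp;
	unsigned char c;
	uintmax_t val;

	if (p == end)
		return -1;
	c = *p++;
	val = c & 127;
	while (c & 128) {
		val += 1;
		if (!val || (val >> (sizeof(val) * 8 - 7)) || p == end)
			return -1;
		c = *p++;
		val = (val << 7) + (c & 127);
	}
	*bufp = p;
	*out = val;
	return 0;
}

/*
 * Hypothetical helper: resolve a SHA-1 reference. Returns a pointer
 * into either the input buffer (literal) or the table (indexed), or
 * NULL if the input is truncated or the index is out of range.
 */
static const unsigned char *resolve_sha1ref(const unsigned char *table,
					    size_t nr_objects,
					    const unsigned char **bufp,
					    const unsigned char *end)
{
	uintmax_t index;
	const unsigned char *sha1;

	if (get_varint(bufp, end, &index))
		return NULL;
	if (!index) {
		/* Index 0: the 20 raw bytes follow inline. */
		if ((size_t)(end - *bufp) < HASH_LEN)
			return NULL;
		sha1 = *bufp;
		*bufp += HASH_LEN;
		return sha1;
	}
	index--;	/* table indices are stored 1-based */
	if (index >= nr_objects)
		return NULL;
	return table + index * HASH_LEN;
}
```

Unlike read_sha1ref(), this sketch keeps the index in a uintmax_t until after the range check, so an oversized varint cannot be truncated into a valid index.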

* [PATCH 16/21] unpack-objects: decode v4 commits
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (14 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 15/21] unpack-objects: decode v4 ref-delta Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 17/21] unpack-objects: allow to save processed bytes to a buffer Nguyễn Thái Ngọc Duy
                       ` (6 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index f8442f4..6fc72c1 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -131,6 +131,15 @@ static const unsigned char *read_sha1ref(void)
 	return sha1_table + index * 20;
 }
 
+static const unsigned char *read_dictref(struct packv4_dict *dict)
+{
+	unsigned int index = read_varint();
+	if (index >= dict->nb_entries)
+		die("bad index in read_dictref at %lu",
+		    (unsigned long)consumed_bytes);
+	return dict->data + dict->offsets[index];
+}
+
 static void *get_data(unsigned long size)
 {
 	git_zstream stream;
@@ -467,6 +476,54 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 	free(base);
 }
 
+static void unpack_commit_v4(unsigned long size, unsigned long nr)
+{
+	unsigned int nb_parents;
+	const unsigned char *committer, *author, *ident;
+	unsigned long author_time, committer_time;
+	int16_t committer_tz, author_tz;
+	struct strbuf dst;
+	char *remaining;
+
+	strbuf_init(&dst, size);
+
+	strbuf_addf(&dst, "tree %s\n", sha1_to_hex(read_sha1ref()));
+	nb_parents = read_varint();
+	while (nb_parents--)
+		strbuf_addf(&dst, "parent %s\n", sha1_to_hex(read_sha1ref()));
+
+	committer_time = read_varint();
+	ident = read_dictref(name_dict);
+	committer_tz = (ident[0] << 8) | ident[1];
+	committer = ident + 2;
+
+	author_time = read_varint();
+	ident = read_dictref(name_dict);
+	author_tz = (ident[0] << 8) | ident[1];
+	author = ident + 2;
+
+	if (author_time & 1)
+		author_time = committer_time + (author_time >> 1);
+	else
+		author_time = committer_time - (author_time >> 1);
+
+	strbuf_addf(&dst,
+		    "author %s %lu %+05d\n"
+		    "committer %s %lu %+05d\n",
+		    author, author_time, author_tz,
+		    committer, committer_time, committer_tz);
+
+	if (dst.len > size)
+		die("bad commit");
+
+	remaining = get_data(size - dst.len);
+	strbuf_add(&dst, remaining, size - dst.len);
+	if (!dry_run)
+		write_object(nr, OBJ_COMMIT, dst.buf, dst.len);
+	else
+		strbuf_release(&dst);
+}
+
 static void read_typesize_v2(enum object_type *type, unsigned long *size)
 {
 	unsigned char c = *(char*)fill_and_use(1);
@@ -511,6 +568,9 @@ static int unpack_one(unsigned nr)
 	case OBJ_OFS_DELTA:
 		unpack_delta_entry(type, size, nr);
 		break;
+	case OBJ_PV4_COMMIT:
+		unpack_commit_v4(size, nr);
+		break;
 	default:
 		error("bad object type %d", type);
 		has_errors = 1;
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread
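
Two details of unpack_commit_v4() in patch 16 are easy to miss when reading the diff. The author time is stored relative to the committer time, with the low bit of the varint selecting the sign, and each ident dictionary entry starts with a 2-byte big-endian timezone in front of the name. The sketch below isolates both computations. The function names are assumptions for illustration and do not appear in the series.

```c
#include <stdint.h>

/*
 * Hypothetical helper: recover the author time from the committer
 * time and the encoded delta. Low bit set means "author is later".
 */
static unsigned long decode_author_time(unsigned long committer_time,
					unsigned long encoded)
{
	if (encoded & 1)
		return committer_time + (encoded >> 1);
	return committer_time - (encoded >> 1);
}

/*
 * Hypothetical helper: read the 2-byte big-endian timezone prefix of
 * an ident dictionary entry. The conversion to int16_t is assumed to
 * wrap values above 0x7fff to negative, as it does on the usual
 * two's complement platforms.
 */
static int16_t decode_ident_tz(const unsigned char *ident)
{
	return (int16_t)((ident[0] << 8) | ident[1]);
}
```

The timezone is the same decimal value that appears in the commit text, so +0700 is stored as 700 (0x02bc) and -0120 as -120 (0xff88), which is why the patch can print it directly with "%+05d".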

* [PATCH 17/21] unpack-objects: allow to save processed bytes to a buffer
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (15 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 16/21] unpack-objects: decode v4 commits Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 18/21] unpack-objects: decode v4 trees Nguyễn Thái Ngọc Duy
                       ` (5 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 6fc72c1..044a087 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -54,6 +54,9 @@ static void add_object_buffer(struct object *object, char *buffer, unsigned long
 		die("object %s tried to add buffer twice!", sha1_to_hex(object->sha1));
 }
 
+static struct strbuf back_buffer = STRBUF_INIT;
+static int save_to_back_buffer;
+
 /*
  * Make sure at least "min" bytes are available in the buffer, and
  * return the pointer to the buffer.
@@ -66,6 +69,8 @@ static void *fill(int min)
 		die("cannot fill %d bytes", min);
 	if (offset) {
 		git_SHA1_Update(&ctx, buffer, offset);
+		if (save_to_back_buffer)
+			strbuf_add(&back_buffer, buffer, offset);
 		memmove(buffer, buffer + offset, len);
 		offset = 0;
 	}
@@ -81,6 +86,18 @@ static void *fill(int min)
 	return buffer;
 }
 
+static void copy_back_buffer(int set)
+{
+	if (offset) {
+		git_SHA1_Update(&ctx, buffer, offset);
+		if (save_to_back_buffer)
+			strbuf_add(&back_buffer, buffer, offset);
+		memmove(buffer, buffer + offset, len);
+		offset = 0;
+	}
+	save_to_back_buffer = set;
+}
+
 static void use(int bytes)
 {
 	if (bytes > len)
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 18/21] unpack-objects: decode v4 trees
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (16 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 17/21] unpack-objects: allow to save processed bytes to a buffer Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11  6:06     ` [PATCH 19/21] index-pack, pack-objects: allow creating .idx v2 with .pack v4 Nguyễn Thái Ngọc Duy
                       ` (4 subsequent siblings)
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/unpack-objects.c | 191 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 189 insertions(+), 2 deletions(-)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 044a087..9fd5640 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -12,6 +12,7 @@
 #include "decorate.h"
 #include "packv4-parse.h"
 #include "fsck.h"
+#include "varint.h"
 
 static int dry_run, quiet, recover, has_errors, strict;
 static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict] < pack-file";
@@ -148,6 +149,27 @@ static const unsigned char *read_sha1ref(void)
 	return sha1_table + index * 20;
 }
 
+static void check_against_sha1table(const unsigned char *sha1)
+{
+	const unsigned char *found;
+	if (!packv4)
+		return;
+
+	found = bsearch(sha1, sha1_table, nr_objects, 20,
+			(int (*)(const void *, const void *))hashcmp);
+	if (!found)
+		die(_("object %s not found in SHA-1 table"),
+		    sha1_to_hex(sha1));
+}
+
+static const unsigned char *read_sha1table_ref(void)
+{
+	const unsigned char *sha1 = read_sha1ref();
+	if (sha1 < sha1_table || sha1 >= sha1_table + nr_objects * 20)
+		check_against_sha1table(sha1);
+	return sha1;
+}
+
 static const unsigned char *read_dictref(struct packv4_dict *dict)
 {
 	unsigned int index = read_varint();
@@ -327,6 +349,84 @@ static void write_object(unsigned nr, enum object_type type,
 	}
 }
 
+static void resolve_tree_v4(unsigned long nr_obj,
+			    const void *tree,
+			    unsigned long tree_len,
+			    const unsigned char *base_sha1,
+			    const void *base,
+			    unsigned long base_size)
+{
+	int nr;
+	struct strbuf sb = STRBUF_INIT;
+	const unsigned char *p = tree;
+	const unsigned char *end = p + tree_len;
+
+	nr = decode_varint(&p);
+	while (nr > 0 && p < end) {
+		unsigned int copy_start_or_path = decode_varint(&p);
+		if (copy_start_or_path & 1) { /* copy_start */
+			struct tree_desc desc;
+			struct name_entry entry;
+			unsigned int copy_count = decode_varint(&p);
+			unsigned int copy_start = copy_start_or_path >> 1;
+			if (!base_sha1)
+				die("we are not supposed to copy from another tree!");
+			if (copy_count & 1) { /* first delta */
+				unsigned int id = decode_varint(&p);
+				const unsigned char *last_base;
+				if (!id) {
+					last_base = p;
+					p += 20;
+				} else
+					last_base = sha1_table + (id - 1) * 20;
+				if (hashcmp(last_base, base_sha1))
+					die("bad base tree in resolve_tree_v4");
+			}
+
+			copy_count >>= 1;
+			nr -= copy_count;
+
+			init_tree_desc(&desc, base, base_size);
+			while (tree_entry(&desc, &entry)) {
+				if (copy_start)
+					copy_start--;
+				else if (copy_count) {
+					strbuf_addf(&sb, "%o %s%c",
+						    entry.mode, entry.path, '\0');
+					strbuf_add(&sb, entry.sha1, 20);
+					copy_count--;
+				} else
+					break;
+			}
+		} else {	/* path */
+			unsigned int path_idx = copy_start_or_path >> 1;
+			const unsigned char *path;
+			unsigned mode;
+			unsigned int id;
+			const unsigned char *entry_sha1;
+
+			id = decode_varint(&p);
+			if (!id) {
+				entry_sha1 = p;
+				p += 20;
+			} else
+				entry_sha1 = sha1_table + (id - 1) * 20;
+			nr--;
+
+			path = path_dict->data + path_dict->offsets[path_idx];
+			mode = (path[0] << 8) | path[1];
+			strbuf_addf(&sb, "%o %s%c", mode, path+2, '\0');
+			strbuf_add(&sb, entry_sha1, 20);
+		}
+	}
+	if (nr != 0 || p != end)
+		die(_("bad delta tree"));
+	if (!dry_run)
+		write_object(nr_obj, OBJ_TREE, sb.buf, sb.len);
+	else
+		strbuf_release(&sb);
+}
+
 static void resolve_delta(unsigned nr, enum object_type type,
 			  void *base, unsigned long base_size,
 			  void *delta, unsigned long delta_size)
@@ -358,8 +458,13 @@ static void added_object(unsigned nr, enum object_type type,
 		    info->base_offset == obj_list[nr].offset) {
 			*p = info->next;
 			p = &delta_list;
-			resolve_delta(info->nr, type, data, size,
-				      info->delta, info->size);
+			if (type == OBJ_TREE && packv4)
+				resolve_tree_v4(info->nr, info->delta,
+						info->size, info->base_sha1,
+						data, size);
+			else
+				resolve_delta(info->nr, type, data, size,
+					      info->delta, info->size);
 			free(info);
 			continue;
 		}
@@ -493,6 +598,85 @@ static void unpack_delta_entry(enum object_type type, unsigned long delta_size,
 	free(base);
 }
 
+static int resolve_tree_against_held(unsigned nr, const unsigned char *base,
+				     void *delta_data, unsigned long delta_size)
+{
+	struct object *obj;
+	struct obj_buffer *obj_buffer;
+	obj = lookup_object(base);
+	if (!obj || obj->type != OBJ_TREE)
+		return 0;
+	obj_buffer = lookup_object_buffer(obj);
+	if (!obj_buffer)
+		return 0;
+	resolve_tree_v4(nr, delta_data, delta_size,
+			base, obj_buffer->buffer, obj_buffer->size);
+	return 1;
+}
+
+static void unpack_tree_v4(unsigned long size, unsigned long nr_obj)
+{
+	unsigned int nr;
+	const unsigned char *last_base = NULL;
+
+	copy_back_buffer(1);
+	strbuf_reset(&back_buffer);
+	nr = read_varint();
+	while (nr) {
+		unsigned int copy_start_or_path = read_varint();
+		if (copy_start_or_path & 1) { /* copy_start */
+			unsigned int copy_count = read_varint();
+			if (copy_count & 1) { /* first delta */
+				const unsigned char *old_base = last_base;
+				last_base = read_sha1table_ref();
+				if (old_base && hashcmp(last_base, old_base))
+					die("multi-base trees are not supported");
+			} else if (!last_base)
+				die("missing delta base unpack_tree_v4 at %lu",
+				    (unsigned long)consumed_bytes);
+			copy_count >>= 1;
+			if (!copy_count || copy_count > nr)
+				die("bad copy count index in unpack_tree_v4 at %lu",
+				    (unsigned long)consumed_bytes);
+			nr -= copy_count;
+		} else {	/* path */
+			unsigned int path_idx = copy_start_or_path >> 1;
+			if (path_idx >= path_dict->nb_entries)
+				die("bad path index in unpack_tree_v4 at %lu",
+				    (unsigned long)consumed_bytes);
+			read_sha1ref();
+			nr--;
+		}
+	}
+	copy_back_buffer(0);
+
+	if (last_base) {
+		if (has_sha1_file(last_base)) {
+			enum object_type type;
+			unsigned long base_size;
+			void *base = read_sha1_file(last_base, &type, &base_size);
+			if (type != OBJ_TREE) {
+				die("base tree %s is not a tree", sha1_to_hex(last_base));
+				last_base = NULL;
+			}
+			resolve_tree_v4(nr_obj, back_buffer.buf, back_buffer.len,
+					last_base, base, base_size);
+			free(base);
+		} else if (resolve_tree_against_held(nr_obj, last_base,
+						     back_buffer.buf, back_buffer.len))
+			   ; /* resolved */
+		else {
+			unsigned long delta_size = back_buffer.len;
+			char *delta = strbuf_detach(&back_buffer, NULL);
+			/* cannot resolve yet --- queue it */
+			hashcpy(obj_list[nr_obj].sha1, null_sha1);
+			add_delta_to_list(nr_obj, last_base, 0, delta, delta_size);
+		}
+	} else
+		resolve_tree_v4(nr_obj, back_buffer.buf, back_buffer.len, NULL, NULL, 0);
+	strbuf_release(&back_buffer);
+}
+
 static void unpack_commit_v4(unsigned long size, unsigned long nr)
 {
 	unsigned int nb_parents;
@@ -588,6 +772,9 @@ static int unpack_one(unsigned nr)
 	case OBJ_PV4_COMMIT:
 		unpack_commit_v4(size, nr);
 		break;
+	case OBJ_PV4_TREE:
+		unpack_tree_v4(size, nr);
+		break;
 	default:
 		error("bad object type %d", type);
 		has_errors = 1;
-- 
1.8.2.82.gc24b958

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 19/21] index-pack, pack-objects: allow creating .idx v2 with .pack v4
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (17 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 18/21] unpack-objects: decode v4 trees Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11 15:48       ` Nicolas Pitre
  2013-09-11  6:06     ` [PATCH 20/21] show-index: acknowledge that it does not read .idx v3 Nguyễn Thái Ngọc Duy
                       ` (3 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy

While .idx v3 is recommended because it's smaller, there is no reason
why .idx v2 can't be used with .pack v4. Enable it, at least for the test
suite, as some tests need this kind of information from show-index
and show-index does not support .idx v3.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/index-pack.c   | 14 ++++++++++----
 builtin/pack-objects.c | 14 ++++++++++----
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 1895adf..4607dc6 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -89,7 +89,7 @@ static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
 static int packv4;
-
+static int idx_version_set;
 static struct progress *progress;
 
 /* We always read in 4kB chunks. */
@@ -1892,8 +1892,9 @@ static int git_index_pack_config(const char *k, const char *v, void *cb)
 
 	if (!strcmp(k, "pack.indexversion")) {
 		opts->version = git_config_int(k, v);
-		if (opts->version > 2)
+		if (opts->version > 3)
 			die(_("bad pack.indexversion=%"PRIu32), opts->version);
+		idx_version_set = 1;
 		return 0;
 	}
 	if (!strcmp(k, "pack.threads")) {
@@ -2107,12 +2108,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 			} else if (!prefixcmp(arg, "--index-version=")) {
 				char *c;
 				opts.version = strtoul(arg + 16, &c, 10);
-				if (opts.version > 2)
+				if (opts.version > 3)
 					die(_("bad %s"), arg);
 				if (*c == ',')
 					opts.off32_limit = strtoul(c+1, &c, 0);
 				if (*c || opts.off32_limit & 0x80000000)
 					die(_("bad %s"), arg);
+				idx_version_set = 1;
 			} else
 				usage(index_pack_usage);
 			continue;
@@ -2151,6 +2153,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		if (!index_name)
 			die(_("--verify with no packfile name given"));
 		read_idx_option(&opts, index_name);
+		idx_version_set = 1;
 		opts.flags |= WRITE_IDX_VERIFY | WRITE_IDX_STRICT;
 	}
 	if (strict)
@@ -2167,6 +2170,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
+	if (!packv4 && opts.version >= 3)
+		die(_("pack idx version %d does not work with pack version %d"),
+		    opts.version, 4);
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
 	deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
 	parse_dictionaries();
@@ -2180,7 +2186,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (show_stat)
 		show_pack_info(stat_only);
 
-	if (packv4)
+	if (packv4 && !idx_version_set)
 		opts.version = 3;
 
 	idx_objects = xmalloc((nr_objects) * sizeof(struct pack_idx_entry *));
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ac25973..f604fa5 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -66,7 +66,7 @@ static uint32_t nr_objects, nr_alloc, nr_result, nr_written;
 
 static struct packv4_tables v4;
 
-static int non_empty;
+static int non_empty, idx_version_set;
 static int reuse_delta = 1, reuse_object = 1;
 static int keep_unreachable, unpack_unreachable, include_tag;
 static unsigned long unpack_unreachable_expiration;
@@ -2205,7 +2205,8 @@ static void prepare_pack(int window, int depth)
 		sort_dict_entries_by_hits(v4.commit_ident_table);
 		sort_dict_entries_by_hits(v4.tree_path_table);
 		v4.all_objs = xmalloc(nr_objects * sizeof(*v4.all_objs));
-		pack_idx_opts.version = 3;
+		if (!idx_version_set)
+			pack_idx_opts.version = 3;
 		allow_ofs_delta = 0;
 	}
 
@@ -2319,9 +2320,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 	}
 	if (!strcmp(k, "pack.indexversion")) {
 		pack_idx_opts.version = git_config_int(k, v);
-		if (pack_idx_opts.version > 2)
+		if (pack_idx_opts.version > 3)
 			die("bad pack.indexversion=%"PRIu32,
 			    pack_idx_opts.version);
+		idx_version_set = 1;
 		return 0;
 	}
 	return git_default_config(k, v, cb);
@@ -2604,12 +2606,13 @@ static int option_parse_index_version(const struct option *opt,
 	char *c;
 	const char *val = arg;
 	pack_idx_opts.version = strtoul(val, &c, 10);
-	if (pack_idx_opts.version > 2)
+	if (pack_idx_opts.version > 3)
 		die(_("unsupported index version %s"), val);
 	if (*c == ',' && c[1])
 		pack_idx_opts.off32_limit = strtoul(c+1, &c, 0);
 	if (*c || pack_idx_opts.off32_limit & 0x80000000)
 		die(_("bad index version '%s'"), val);
+	idx_version_set = 1;
 	return 0;
 }
 
@@ -2739,6 +2742,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		usage_with_options(pack_usage, pack_objects_options);
 	if (pack_version != 2 && pack_version != 4)
 		die(_("pack version %d is not supported"), pack_version);
+	if (pack_version < 4 && pack_idx_opts.version >= 3)
+		die(_("pack idx version %d cannot be used with pack version %d"),
+		    pack_idx_opts.version, pack_version);
 
 	rp_av[rp_ac++] = "pack-objects";
 	if (thin) {
-- 
1.8.2.82.gc24b958
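
The rules this patch introduces can be condensed into one function: an explicit request wins, pack v4 otherwise defaults to .idx v3, and .idx v3 is refused for packs older than v4. This is only a sketch; the function name is hypothetical, the fallback to v2 for older packs is an assumption about the existing default, and the patch itself spreads these checks over option parsing, config parsing and the command entry points.

```c
#include <stdbool.h>

/*
 * Returns the .idx version to write, or -1 for a combination that
 * would be rejected. `requested` is only consulted when
 * `explicitly_set` is true (the role of idx_version_set in the patch).
 */
static int choose_idx_version(int pack_version, int requested,
			      bool explicitly_set)
{
	/* Assumed defaults: v3 for pack v4, v2 otherwise. */
	int version = explicitly_set ? requested
				     : (pack_version >= 4 ? 3 : 2);

	if (version > 3)
		return -1; /* "bad pack.indexversion" / "bad --index-version" */
	if (pack_version < 4 && version >= 3)
		return -1; /* .idx v3 needs at least pack v4 */
	return version;
}
```

With this shape, a test script that needs show-index can force v2 on a v4 pack, while nothing can pair a v3 index with a v2 pack.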


* [PATCH 20/21] show-index: acknowledge that it does not read .idx v3
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (18 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 19/21] index-pack, pack-objects: allow creating .idx v2 with .pack v4 Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11 16:19       ` Nicolas Pitre
  2013-09-11  6:06     ` [PATCH 21/21] t1050, t5500: replace the use of "show-index|wc -l" with verify-pack Nguyễn Thái Ngọc Duy
                       ` (2 subsequent siblings)
  22 siblings, 1 reply; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy

show-index takes .idx from stdin while v3 requires the .pack. It's
used for testing purposes only. Let those test scripts force .idx v2
with index-pack.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 show-index.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/show-index.c b/show-index.c
index 5a9eed7..2028e02 100644
--- a/show-index.c
+++ b/show-index.c
@@ -19,8 +19,10 @@ int main(int argc, char **argv)
 		die("unable to read header");
 	if (top_index[0] == htonl(PACK_IDX_SIGNATURE)) {
 		version = ntohl(top_index[1]);
-		if (version < 2 || version > 2)
+		if (version < 2 || version > 3)
 			die("unknown index version");
+		if (version == 3)
+			die("show-index does not support .idx v3, convert to v2 instead");
 		if (fread(top_index, 256 * 4, 1, stdin) != 1)
 			die("unable to read index");
 	} else {
-- 
1.8.2.82.gc24b958
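
The header check in this hunk can be illustrated in isolation. The sketch below classifies the first eight bytes of an .idx file; the enum and function names are hypothetical, and the byte-wise big-endian decoding stands in for the ntohl() calls used by show-index. The signature constant is the value of PACK_IDX_SIGNATURE ("\377tOc").

```c
#include <stdint.h>

#define IDX_SIGNATURE 0xff744f63u /* "\377tOc" */

/* Hypothetical verdicts; show-index itself just calls die(). */
enum idx_verdict {
	IDX_LEGACY_V1,      /* no signature: original index layout */
	IDX_V2,             /* readable by show-index */
	IDX_V3_UNSUPPORTED, /* recognized, but needs the .pack as well */
	IDX_UNKNOWN         /* signature present, version not known */
};

/* Decode a 32-bit big-endian integer without relying on host byte order. */
static uint32_t be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static enum idx_verdict classify_idx_header(const unsigned char hdr[8])
{
	if (be32(hdr) != IDX_SIGNATURE)
		return IDX_LEGACY_V1;
	switch (be32(hdr + 4)) {
	case 2:
		return IDX_V2;
	case 3:
		return IDX_V3_UNSUPPORTED;
	default:
		return IDX_UNKNOWN;
	}
}
```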


* [PATCH 21/21] t1050, t5500: replace the use of "show-index|wc -l" with verify-pack
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (19 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 20/21] show-index: acknowledge that it does not read .idx v3 Nguyễn Thái Ngọc Duy
@ 2013-09-11  6:06     ` Nguyễn Thái Ngọc Duy
  2013-09-11 14:21     ` [PATCH 00/21] np/pack-v4 updates Duy Nguyen
  2013-09-11 16:24     ` Nicolas Pitre
  22 siblings, 0 replies; 36+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-09-11  6:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t1050-large.sh      | 9 +++++----
 t/t5500-fetch-pack.sh | 4 ++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/t/t1050-large.sh b/t/t1050-large.sh
index fd10528..829030b 100755
--- a/t/t1050-large.sh
+++ b/t/t1050-large.sh
@@ -32,7 +32,7 @@ test_expect_success 'add a large file or two' '
 	done &&
 	test -z "$bad" &&
 	test $count = 1 &&
-	cnt=$(git show-index <"$idx" | wc -l) &&
+	cnt=$(git verify-pack -v "${idx%.idx}.pack" | grep "^[0-9a-f]\{40\}" | wc -l) &&
 	test $cnt = 2 &&
 	for l in .git/objects/??/??????????????????????????????????????
 	do
@@ -93,11 +93,12 @@ test_expect_success 'packsize limit' '
 		) |
 		sort >expect &&
 
-		for pi in .git/objects/pack/pack-*.idx
+		for pi in .git/objects/pack/pack-*.pack
 		do
-			git show-index <"$pi"
+			git verify-pack -v "$pi"
 		done |
-		sed -e "s/^[0-9]* \([0-9a-f]*\) .*/\1/" |
+		grep "^[0-9a-f]\{40\}" |
+		sed -e "s/^\([0-9a-f]\{40\}\) .*/\1/" |
 		sort >actual &&
 
 		test_cmp expect actual
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index fd2598e..f99cd14 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -60,8 +60,8 @@ pull_to_client () {
 			git unpack-objects <$p &&
 			git fsck --full &&
 
-			idx=`echo pack-*.idx` &&
-			pack_count=`git show-index <$idx | wc -l` &&
+			pack=`echo pack-*.pack` &&
+			pack_count=`git verify-pack -v $pack | grep "^[0-9a-f]\{40\}" | wc -l` &&
 			test $pack_count = $count &&
 			rm -f pack-*
 		)
-- 
1.8.2.82.gc24b958


* Re: [PATCH 00/21] np/pack-v4 updates
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (20 preceding siblings ...)
  2013-09-11  6:06     ` [PATCH 21/21] t1050, t5500: replace the use of "show-index|wc -l" with verify-pack Nguyễn Thái Ngọc Duy
@ 2013-09-11 14:21     ` Duy Nguyen
  2013-09-11 16:25       ` Nicolas Pitre
  2013-09-11 16:24     ` Nicolas Pitre
  22 siblings, 1 reply; 36+ messages in thread
From: Duy Nguyen @ 2013-09-11 14:21 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Nguyễn Thái Ngọc Duy, Git Mailing List

Nico, if you have time you may want to look into this. The resulting v4
pack from pack-objects on git.git for me is 35MB (one branch) while
packv4-create produces 30MB (v2 is 40MB). I don't know why there is
such a big difference in size. I compared: the ident dict is identical.
The tree dict is a bit different (some entries that have the same hits
are ordered differently). Delta chains do not differ much. Many groups of
entries in the pack are displaced though. I guess I turned a wrong knob or
something in pack-objects in the v4 code.
--
Duy


* Re: [PATCH 19/21] index-pack, pack-objects: allow creating .idx v2 with .pack v4
  2013-09-11  6:06     ` [PATCH 19/21] index-pack, pack-objects: allow creating .idx v2 with .pack v4 Nguyễn Thái Ngọc Duy
@ 2013-09-11 15:48       ` Nicolas Pitre
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-11 15:48 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1492 bytes --]

On Wed, 11 Sep 2013, Nguyễn Thái Ngọc Duy wrote:

> While .idx v3 is recommended because it's smaller, there is no reason
> why .idx v2 can't be used with .pack v4. Enable it, at least for the test
> suite, as some tests need this kind of information from show-index
> and show-index does not support .idx v3.

FYI, I've added that ability to show-index in my tree.  The output does 
not include the actual object SHA1 though.

[...]
> @@ -2167,6 +2170,9 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
>  
>  	curr_pack = open_pack_file(pack_name);
>  	parse_pack_header();
> +	if (!packv4 && opts.version >= 3)
> +		die(_("pack idx version %d does not work with pack version %d"),
> +		    opts.version, 4);

I don't think this is what you really meant here.  I've amended this 
patch with:

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4607dc6..f071ed9 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -2171,8 +2171,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
 	if (!packv4 && opts.version >= 3)
-		die(_("pack idx version %d does not work with pack version %d"),
-		    opts.version, 4);
+		die(_("pack idx version %d requires at least pack version 4"),
+		    opts.version);
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
 	deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
 	parse_dictionaries();


Nicolas


* Re: [PATCH 20/21] show-index: acknowledge that it does not read .idx v3
  2013-09-11  6:06     ` [PATCH 20/21] show-index: acknowledge that it does not read .idx v3 Nguyễn Thái Ngọc Duy
@ 2013-09-11 16:19       ` Nicolas Pitre
  0 siblings, 0 replies; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-11 16:19 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano

[-- Attachment #1: Type: TEXT/PLAIN, Size: 678 bytes --]

On Wed, 11 Sep 2013, Nguyễn Thái Ngọc Duy wrote:

> show-index takes .idx from stdin while v3 requires the .pack. It's
> used for testing purposes only. Let those test scripts force .idx v2
> with index-pack.

Since I have a patch adding (partial) index v3 support to show-index in 
my tree, I've dropped this patch.

Many tests use show-index only to manually count the number of objects 
and the added support in show-index is good enough for that.  I've 
therefore reduced your next patch only to those tests where the actual 
list of object names is expected.

Whether or not we'd wish to get rid of show-index completely at some 
point is a separate matter.


Nicolas


* Re: [PATCH 00/21] np/pack-v4 updates
  2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
                       ` (21 preceding siblings ...)
  2013-09-11 14:21     ` [PATCH 00/21] np/pack-v4 updates Duy Nguyen
@ 2013-09-11 16:24     ` Nicolas Pitre
  22 siblings, 0 replies; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-11 16:24 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy, Junio C Hamano; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 538 bytes --]

On Wed, 11 Sep 2013, Nguyễn Thái Ngọc Duy wrote:

> This contains fixups for some of my patches, some of Nico's, adds v4
> support to unpack-objects because the test suite needs it. With these,
> when force generating pack v4 unconditionally, the remaining failed
> tests are:
[...]

@junio: I've folded those patches into my branch, along with the better 
fix for the compilation issue you found.  So you may simply replace the 
branch you have in pu with mine directly if you wish.

git://git.linaro.org/people/nico/git


Nicolas


* Re: [PATCH 00/21] np/pack-v4 updates
  2013-09-11 14:21     ` [PATCH 00/21] np/pack-v4 updates Duy Nguyen
@ 2013-09-11 16:25       ` Nicolas Pitre
  2013-09-12  3:38         ` Duy Nguyen
  0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-11 16:25 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

On Wed, 11 Sep 2013, Duy Nguyen wrote:

> Nico, if you have time you may want to look into this. The result v4
> pack from pack-objects on git.git for me is 35MB (one branch) while
> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
> such a big difference in size. I compared. Ident dict is identical.
> Tree dict is a bit different (some that have same hits are ordered
> differently). Delta chains do not differ much. Many groups of entries
> in the pack are displaced though. I guess I turned a wrong knob or
> something in pack-objects in v4 code..

Will try to have a closer look.

Thanks for your dedication so far.


Nicolas


* Re: [PATCH 00/21] np/pack-v4 updates
  2013-09-11 16:25       ` Nicolas Pitre
@ 2013-09-12  3:38         ` Duy Nguyen
  2013-09-12 16:20           ` Nicolas Pitre
  0 siblings, 1 reply; 36+ messages in thread
From: Duy Nguyen @ 2013-09-12  3:38 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Git Mailing List

On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Wed, 11 Sep 2013, Duy Nguyen wrote:
>
>> Nico, if you have time you may want to look into this. The result v4
>> pack from pack-objects on git.git for me is 35MB (one branch) while
>> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
>> such a big difference in size. I compared. Ident dict is identical.
>> Tree dict is a bit different (some that have same hits are ordered
>> differently). Delta chains do not differ much. Many groups of entries
>> in the pack are displaced though. I guess I turned a wrong knob or
>> something in pack-objects in v4 code..
>
> Will try to have a closer look.

Problem found. I encoded some trees as ref-delta instead of pv4-tree
:( Something like this brings the size back to packv4-create output

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f604fa5..3d9ab0e 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
 			 * deltify other objects against, in order to avoid
 			 * circular deltas.
 			 */
-			entry->type = entry->in_pack_type;
+			if (pack_version < 4)
+				entry->type = entry->in_pack_type;
 			entry->delta = base_entry;
 			entry->delta_size = entry->size;
 			entry->delta_sibling = base_entry->delta_child;
-- 
Duy


* Re: [PATCH 00/21] np/pack-v4 updates
  2013-09-12  3:38         ` Duy Nguyen
@ 2013-09-12 16:20           ` Nicolas Pitre
  2013-09-13  1:11             ` Duy Nguyen
  0 siblings, 1 reply; 36+ messages in thread
From: Nicolas Pitre @ 2013-09-12 16:20 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

On Thu, 12 Sep 2013, Duy Nguyen wrote:

> On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Wed, 11 Sep 2013, Duy Nguyen wrote:
> >
> >> Nico, if you have time you may want to look into this. The result v4
> >> pack from pack-objects on git.git for me is 35MB (one branch) while
> >> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
> >> such a big difference in size. I compared. Ident dict is identical.
> >> Tree dict is a bit different (some that have same hits are ordered
> >> differently). Delta chains do not differ much. Many groups of entries
> >> in the pack are displaced though. I guess I turned a wrong knob or
> >> something in pack-objects in v4 code..
> >
> > Will try to have a closer look.
> 
> Problem found. I encoded some trees as ref-delta instead of pv4-tree
> :( Something like this brings the size back to packv4-create output
> 
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index f604fa5..3d9ab0e 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
>   * deltify other objects against, in order to avoid
>   * circular deltas.
>   */
> - entry->type = entry->in_pack_type;
> + if (pack_version < 4)
> + entry->type = entry->in_pack_type;
>   entry->delta = base_entry;
>   entry->delta_size = entry->size;
>   entry->delta_sibling = base_entry->delta_child;

Hmmm... I've folded this fix into your patch touching this area.

This code is becoming rather subtle and messy though.  We'll have to 
find a way to better abstract things.  Especially since object data 
reuse will work only for blobs and tags with packv4.  Commits and trees 
will need adjustments to their indices.


Nicolas


* Re: [PATCH 00/21] np/pack-v4 updates
  2013-09-12 16:20           ` Nicolas Pitre
@ 2013-09-13  1:11             ` Duy Nguyen
  0 siblings, 0 replies; 36+ messages in thread
From: Duy Nguyen @ 2013-09-13  1:11 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Git Mailing List

On Thu, Sep 12, 2013 at 11:20 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Thu, 12 Sep 2013, Duy Nguyen wrote:
>
>> On Wed, Sep 11, 2013 at 11:25 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
>> > On Wed, 11 Sep 2013, Duy Nguyen wrote:
>> >
>> >> Nico, if you have time you may want to look into this. The result v4
>> >> pack from pack-objects on git.git for me is 35MB (one branch) while
>> >> packv4-create produces 30MB (v2 is 40MB). I don't know why there is
>> >> such a big difference in size. I compared. Ident dict is identical.
>> >> Tree dict is a bit different (some that have same hits are ordered
>> >> differently). Delta chains do not differ much. Many groups of entries
>> >> in the pack are displaced though. I guess I turned a wrong knob or
>> >> something in pack-objects in v4 code..
>> >
>> > Will try to have a closer look.
>>
>> Problem found. I encoded some trees as ref-delta instead of pv4-tree
>> :( Something like this brings the size back to packv4-create output
>>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index f604fa5..3d9ab0e 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -1490,7 +1490,8 @@ static void check_object(struct object_entry *entry)
>>   * deltify other objects against, in order to avoid
>>   * circular deltas.
>>   */
>> - entry->type = entry->in_pack_type;
>> + if (pack_version < 4)
>> + entry->type = entry->in_pack_type;
>>   entry->delta = base_entry;
>>   entry->delta_size = entry->size;
>>   entry->delta_sibling = base_entry->delta_child;
>
> Hmmm... I've folded this fix into your patch touching this area.

You may want to tighten the condition a bit, to "pack_version < 4 ||
entry->type != OBJ_TREE". I think skipping the assignment for every
object in v4 turns off the reuse code path for blobs and tags.
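
As a sketch, the suggested condition can be written as a small predicate. The enum below is a local stand-in that mirrors git's object type numbering, and the function name is hypothetical; in check_object() the test would simply guard the assignment to entry->type.

```c
/* Local stand-in for git's object types; only OBJ_TREE matters here. */
enum obj_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3, OBJ_TAG = 4 };

/*
 * Returns 1 when the in-pack delta type may be kept so that the
 * existing delta is reused, 0 when the object has to be re-encoded.
 * Only trees in a v4 pack are excluded, as proposed above.
 */
static int keep_in_pack_delta_type(int pack_version, enum obj_type type)
{
	return pack_version < 4 || type != OBJ_TREE;
}
```

Commits still pass this test in v4, so whether they need the same exclusion as trees is left open by this sketch.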

> This code is becoming rather subtle and messy though.  We'll have to
> find a way to better abstract things.

Yep. Not sure how that should be done though. Maybe we can revisit it
when pack-objects learns to skip the compatibility layer when reading v4
commits and trees.

> Especially since object data
> reuse will work only for blobs and tags with packv4.  Commits and trees
> will need adjustments to their indices.
-- 
Duy


end of thread, other threads:[~2013-09-13  1:11 UTC | newest]

Thread overview: 36+ messages
2013-09-09 19:52 [PULL REQUEST] initial pack v4 support Nicolas Pitre
2013-09-09 22:28 ` Junio C Hamano
2013-09-10 21:21 ` Junio C Hamano
2013-09-10 21:32   ` Nicolas Pitre
2013-09-10 21:52     ` Junio C Hamano
2013-09-10 22:31   ` Nicolas Pitre
2013-09-11  6:06   ` [PATCH 00/21] np/pack-v4 updates Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 01/21] fixup! pack-objects: prepare SHA-1 table in v4 Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 02/21] fixup! pack-objects: support writing pack v4 Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 03/21] fixup! pack v4: support "end-of-pack" indicator in index-pack and pack-objects Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 04/21] fixup! index-pack: parse v4 header and dictionaries Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 05/21] fixup! index-pack: record all delta bases in v4 (tree and ref-delta) Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 06/21] pack v4: lift dict size check in load_dict() Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 07/21] pack v4: move pv4 objhdr parsing code to packv4-parse.c Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 08/21] pack-objects: respect compression level in v4 Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 09/21] pack-objects: recognize v4 as pack source Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 10/21] pack v4: add a note that streaming does not support OBJ_PV4_* Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 11/21] unpack-objects: report missing object name Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 12/21] unpack-objects: recognize end-of-pack in v4 thin pack Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 13/21] unpack-objects: read v4 dictionaries Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 14/21] unpack-objects: decode v4 object header Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 15/21] unpack-objects: decode v4 ref-delta Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 16/21] unpack-objects: decode v4 commits Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 17/21] unpack-objects: allow to save processed bytes to a buffer Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 18/21] unpack-objects: decode v4 trees Nguyễn Thái Ngọc Duy
2013-09-11  6:06     ` [PATCH 19/21] index-pack, pack-objects: allow creating .idx v2 with .pack v4 Nguyễn Thái Ngọc Duy
2013-09-11 15:48       ` Nicolas Pitre
2013-09-11  6:06     ` [PATCH 20/21] show-index: acknowledge that it does not read .idx v3 Nguyễn Thái Ngọc Duy
2013-09-11 16:19       ` Nicolas Pitre
2013-09-11  6:06     ` [PATCH 21/21] t1050, t5500: replace the use of "show-index|wc -l" with verify-pack Nguyễn Thái Ngọc Duy
2013-09-11 14:21     ` [PATCH 00/21] np/pack-v4 updates Duy Nguyen
2013-09-11 16:25       ` Nicolas Pitre
2013-09-12  3:38         ` Duy Nguyen
2013-09-12 16:20           ` Nicolas Pitre
2013-09-13  1:11             ` Duy Nguyen
2013-09-11 16:24     ` Nicolas Pitre
