* [PATCH 0/9] Prefix-compress on-disk index entries
@ 2012-04-03 22:53 Junio C Hamano
From: Junio C Hamano @ 2012-04-03 22:53 UTC
  To: git

This is still rough, but with this patch I am getting:

    $ ls -l .git/index*
    -rw-r----- 1 jch eng 25586488 2012-04-03 15:27 .git/index
    -rw-r----- 1 jch eng 14654328 2012-04-03 15:38 .git/index-4

in a clone of the WebKit repository that has 183175 paths.

With a hot cache and no local modifications:

    $ time sh -c 'GIT_INDEX_FILE=.git/index-4 git diff'
    real  0m0.469s
    user  0m0.130s
    sys   0m0.330s

    $ time sh -c 'git diff'
    real  0m0.677s
    user  0m0.290s
    sys   0m0.370s

which measures the time needed to read the index into the in-core
structure and to compare the cached stat information with what lstat(2)
returns.

The updated format is not documented yet, as I did not intend (and am
still not committed) to declare a change along these lines the official
"v4" format; I was merely curious to see how much improvement we can get
from a trivial approach like this.

The saving of the on-disk index size comes from two factors:

 - Not padding the on-disk index entries to 8-byte boundary;

 - Not storing the full pathname for each entry in the on-disk format.
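
To illustrate the first factor, here is a minimal sketch of the size arithmetic, assuming the older formats round each entry up to a multiple of eight bytes while keeping at least one NUL after the name. The function names and the `fixed` parameter (the size of the fixed-width part of an entry) are hypothetical and are not taken from the patches.

```c
#include <stddef.h>

/*
 * Size of one on-disk entry when padded: the name is followed by
 * 1 to 8 NUL bytes so that the total is a multiple of 8.
 * `fixed` is the assumed size of the fixed-width fields before the name.
 */
static size_t padded_entry_size(size_t fixed, size_t name_len)
{
	return (fixed + name_len + 8) & ~(size_t)7;
}

/* Size of the same entry when only a single terminating NUL is kept. */
static size_t unpadded_entry_size(size_t fixed, size_t name_len)
{
	return fixed + name_len + 1;
}
```

Under these assumptions the padding costs between 0 and 7 extra bytes per entry, so dropping it alone would account for only a small part of the reduction shown above.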

Because the entries are sorted by path, adjacent entries in the index tend
to share leading path components, so it makes sense to store only the
differences in later entries.  In the v4 on-disk format of the index,
each on-disk cache entry stores the number of bytes to be stripped from
the end of the previous name, and the bytes to append to the result, to
come up with its name.

The "to-remove" count is encoded in the varint format used in the
packfiles, and the "bytes-to-append" is a simple NUL-terminated string.
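
The scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the series: the helper names (`put_varint`, `get_varint`, `compress_name`, `expand_name`) are hypothetical, the varint layout is modeled on the offset encoding used in packfiles, and the decoder does no overflow checking.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of leading bytes shared by two NUL-terminated names. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t n = 0;
	while (a[n] && a[n] == b[n])
		n++;
	return n;
}

/* Encode `value` as a varint into `out`; returns the bytes written. */
static size_t put_varint(uintmax_t value, unsigned char *out)
{
	unsigned char tmp[16];
	size_t pos = sizeof(tmp) - 1;

	tmp[pos] = value & 127;
	while (value >>= 7)
		tmp[--pos] = 128 | (--value & 127);
	memcpy(out, tmp + pos, sizeof(tmp) - pos);
	return sizeof(tmp) - pos;
}

/* Decode a varint at *bufp and advance *bufp past it (no overflow check). */
static uintmax_t get_varint(const unsigned char **bufp)
{
	const unsigned char *buf = *bufp;
	unsigned char c = *buf++;
	uintmax_t val = c & 127;

	while (c & 128) {
		c = *buf++;
		val = ((val + 1) << 7) + (c & 127);
	}
	*bufp = buf;
	return val;
}

/*
 * Write `name` relative to `prev`: a varint "to-remove" count followed
 * by the NUL-terminated bytes to append.  The caller must supply room
 * for strlen(name) + 1 bytes plus the varint.  Returns bytes written.
 */
static size_t compress_name(const char *prev, const char *name,
			    unsigned char *out)
{
	size_t common = common_prefix_len(prev, name);
	size_t to_remove = strlen(prev) - common;
	size_t suffix_len = strlen(name) - common;
	size_t n = put_varint(to_remove, out);

	memcpy(out + n, name + common, suffix_len + 1); /* copies the NUL */
	return n + suffix_len + 1;
}

/*
 * `name` holds the previous entry's name on entry and the current
 * entry's name on return.  Returns the bytes consumed from `in`, or 0
 * if the input is inconsistent or does not fit in `name_size`.
 */
static size_t expand_name(char *name, size_t name_size,
			  const unsigned char *in)
{
	const unsigned char *p = in;
	size_t prev_len = strlen(name);
	uintmax_t to_remove = get_varint(&p);
	size_t suffix_len = strlen((const char *)p);
	size_t keep;

	if (to_remove > prev_len)
		return 0;
	keep = prev_len - (size_t)to_remove;
	if (keep + suffix_len + 1 > name_size)
		return 0;
	memcpy(name + keep, p, suffix_len + 1);
	return (size_t)(p - in) + suffix_len + 1;
}
```

For example, with "read-cache.c" as the previous name, "read-tree.c" would be stored as a to-remove count of 7 followed by "tree.c" and a NUL, which is 8 bytes instead of the 12 needed for the full name and its NUL.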

Junio C Hamano (9):
  varint: make it available outside the context of pack
  cache.h: hide on-disk index details
  read-cache.c: allow unaligned mapping of the index file
  read-cache.c: make create_from_disk() report number of bytes it consumed
  read-cache.c: report the header version we do not understand
  read-cache.c: move code to copy ondisk to incore cache to a helper function
  read-cache.c: move code to copy incore to ondisk cache to a helper function
  read-cache.c: read prefix-compressed names in index on-disk version v4
  read-cache.c: write index v4 format

 Makefile               |    2 +
 builtin/update-index.c |    2 +
 cache.h                |   52 +---------
 config.c               |   11 ++
 environment.c          |    1 +
 read-cache.c           |  259 ++++++++++++++++++++++++++++++++++++++++--------
 varint.c               |   29 ++++++
 varint.h               |    9 ++
 8 files changed, 275 insertions(+), 90 deletions(-)
 create mode 100644 varint.c
 create mode 100644 varint.h

-- 
1.7.10.rc4.54.g1d5dd3


Thread overview: 27+ messages
2012-04-03 22:53 [PATCH 0/9] Prefix-compress on-disk index entries Junio C Hamano
2012-04-03 22:53 ` [PATCH 1/9] varint: make it available outside the context of pack Junio C Hamano
2012-04-03 22:53 ` [PATCH 2/9] cache.h: hide on-disk index details Junio C Hamano
2012-04-03 22:53 ` [PATCH 3/9] read-cache.c: allow unaligned mapping of the index file Junio C Hamano
2012-04-03 22:53 ` [PATCH 4/9] read-cache.c: make create_from_disk() report number of bytes it consumed Junio C Hamano
2012-04-03 22:53 ` [PATCH 5/9] read-cache.c: report the header version we do not understand Junio C Hamano
2012-04-03 22:53 ` [PATCH 6/9] read-cache.c: move code to copy ondisk to incore cache to a helper function Junio C Hamano
2012-04-03 22:53 ` [PATCH 7/9] read-cache.c: move code to copy incore to ondisk cache to a helper function Junio C Hamano
2012-04-03 22:53 ` [PATCH 8/9] read-cache.c: read prefix-compressed names in index on-disk version v4 Junio C Hamano
2012-04-03 22:53 ` [PATCH 9/9] read-cache.c: write index v4 format Junio C Hamano
2012-04-04  1:44 ` [PATCH 0/9] Prefix-compress on-disk index entries David Barr
2012-04-04 15:33   ` Junio C Hamano
2012-04-04 16:57     ` Junio C Hamano
2012-04-04 16:58       ` [PATCH 2/2] update-index: upgrade/downgrade on-disk index version Junio C Hamano
2012-04-04 12:34 ` [PATCH 0/9] Prefix-compress on-disk index entries Nguyen Thai Ngoc Duy
2012-04-04 18:44   ` Junio C Hamano
2012-04-06  8:41     ` David Barr
2012-05-02  1:58       ` Nguyen Thai Ngoc Duy
2012-05-02  4:26         ` David Barr
2012-04-27 22:58 ` [PATCH 1/2] unpack-trees: preserve the index file version of original Junio C Hamano
2012-04-27 23:02   ` [PATCH 2/2] index-v4: document the entry format Junio C Hamano
2012-04-30 17:20     ` Thomas Rast
2012-05-01  4:00       ` Junio C Hamano
2012-05-01 21:43         ` Thomas Rast
2012-05-02 15:12         ` Shawn Pearce
2012-05-02 17:04           ` Junio C Hamano
2012-05-02 17:13             ` Shawn Pearce
