All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shawn Pearce <spearce@spearce.org>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: elton sky <eltonsky9404@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Thomas Rast <trast@inf.ethz.ch>
Subject: Re: GSoC - Designing a faster index format
Date: Mon, 2 Apr 2012 10:27:47 -0400	[thread overview]
Message-ID: <CAJo=hJuVZiik6J0nhO4jpzWYenerRQoREHLMmJoFY8W0bZR+5A@mail.gmail.com> (raw)
In-Reply-To: <20120402123146.GA24813@do>

On Mon, Apr 2, 2012 at 08:31, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> -- 8< --
> GIT index format
> ================
>
> This format replaces the old "DIRC" format. Compared to the old
> format, which is essentially a sorted list of pathnames, this one:
>
>  - is tree-based
>  - use crc32 as checksum
>  - only verify integrity on parts that git accesses, instead of whole
>   file
>  - append changes to the end
>  - allow index versioning
>
> Updates can be made directly to the index by appending to the end. The
> index traversed by locating the root tree block from the trailer. When
> a path is updated, all related tree blocks are updated and appended to
> the end, then a new trailer (with generation increased by one) is
> written to conclude the index.
>
> The index size will increase continuously. At some point, we will need
> to repack it. Let assume a tree block is 64k on average and a path
> generally consists of 3 path components.  That means an entry update
> adds 192k and we can do about 80 updates before index reaches 16M (in
> addition to initial index size).

Only 3 path components? Java sources can easily have 8-10 with a long
Maven and Java package implied prefix. This will increase the
frequency of rewrites of the index file.

> At 16M or when trailer generation hits a limit (the limit can be
> configurable), we rewrite the index to reduce its size. Some heavy
> operations can also be used to rewrite index, such as checkout or
> reset.
>
> The index integrity is verified by crc32. One crc32 covers header and
> trailer. Each block has its own crc32. When the index is found
> corrupt, we could try to roll back to latest good version by looking
> for trailers from bottom up. Even when the index is not corrupt, users
> can still look back this way for older index versions.

How do you deal with a partially written append to the index file?
E.g. if a prior update crashes or the filesystem doesn't write
everything out before power failure, you need to find the last good
trailer block in the file.

> = The git index file has the following format
>
>   - A 8-byte header consisting of
>
>     4-byte signature:
>       The signature is { 'T', 'R', 'E', 'E' }
>
>     4-byte version number:
>       The current supported versions are 1.

Why not DIRC version 4?

>   - A number of blocks of variable size
>
>      1-byte block type
>
>      3-byte content size in byte
>
>      block content

So you are limiting the size of a canonical tree now? Currently there
is no limit on the size a tree. But here the entire index structure
plus set of names must be under 16 MiB. Granted no project probably
hits that limit, but you are painting us into a corner with an upper
limit here that doesn't look like it will be easy to increase.

>      4-byte crc32 of all above
>
>   - A 18-byte trailer consisting of
>
>      4-byte trailer signature:
>        The signature is { 'R', 'O', 'O', 'T' }
>
>      2-byte generation:
>         The first trailer is 0, the second 1 and so on.
>
>      4-byte root block offset
>
>      4-byte extension table offset:
>        Zero means no extension
>
>      4-byte checksum:
>        CRC32 of the header and the trailer (excluding this field)

See above my question about how to find the last good trailer if the
last append attempt was incomplete.

> == Tree block
>
>  A tree block contains a (maybe invalid) tree object and extra
>  information of its companion in working directory. Tree block has
>  block type 'T'.
>
>  Tree block content is basically the list of non-recursive entries in
>  specified path, with all attributes we store in the index now. There
>  are a few changes though to intergrate cache-tree and allow
>  bsearch() on mmap'd block.
>
>  A tree block content consists of
>
>  - 4-byte tree object size
>
>  - 20-byte SHA-1 of the cached tree object
>
>  - a list attributes corresponding to tree object's item, in the same
>    order.  These attributes are the same as in DIRC entry format
>    except that entry name is removed, and a tree block offset is
>    added in case the item is a directory.
>
>    32-bit ctime seconds, the last time a file's metadata changed
>      this is stat(2) data
>
>    32-bit ctime nanosecond fractions
>      this is stat(2) data
>
>    32-bit mtime seconds, the last time a file's data changed
>      this is stat(2) data
>
>    32-bit mtime nanosecond fractions
>      this is stat(2) data
>
>    32-bit dev
>      this is stat(2) data
>
>    32-bit ino
>      this is stat(2) data
>
>    32-bit mode, split into (high to low bits)
>
>      4-bit object type
>        valid values in binary are 1000 (regular file), 1010 (symbolic link)
>        and 1110 (gitlink)
>
>      3-bit unused
>
>      9-bit unix permission. Only 0755 and 0644 are valid for regular files.
>      Symbolic links and gitlinks have value 0 in this field.
>
>    32-bit uid
>      this is stat(2) data
>
>    32-bit gid
>      this is stat(2) data
>
>    32-bit file size
>      This is the on-disk size from stat(2), truncated to 32-bit.
>
>    160-bit SHA-1 for the represented object if blobs or the offset
>      to another tree block if trees
>
>    A 32-bit 'flags' field split into (high to low bits)
>
>      1-bit assume-valid flag
>
>      1-bit extended flag (must be zero in version 2)
>
>      2-bit stage (during merge)
>
>      12-bit name length if the length is less than 0xFFF; otherwise 0xFFF
>      is stored in this field.
>
>      1-bit skip-worktree flag (used by sparse checkout)
>
>      1-bit intent-to-add flag (used by "git add -N")
>
>      14-bit unused, must be zero
>
>    A 16-bit offset, relative to the beginning of this block, to the
>      pathname of this entry. FIXME: make it 32-bit, relative to the
>      beginning of the file, so that we can reuse pathnames from other
>      (old) blocks?

16 bit offset doesn't work well in a block that can be as large as 2^24.

If you reuse a path name list at the start of the file, how do you
handle new names?

  reply	other threads:[~2012-04-02 14:28 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-20 23:10 GSoC - Designing a faster index format elton sky
2012-03-21  1:18 ` Nguyen Thai Ngoc Duy
2012-03-21 11:25 ` Thomas Rast
2012-03-21 12:01   ` elton sky
2012-03-22 20:32     ` elton sky
2012-03-23  0:46       ` Jakub Narebski
2012-03-23  1:30       ` Nguyen Thai Ngoc Duy
2012-03-23 10:27         ` elton sky
2012-03-23 11:24           ` Nguyen Thai Ngoc Duy
     [not found]             ` <CAKTdtZmLOzAgG0uCDcVr+O41XPX-XnoVZjsZWPN-BLjq2oG-7A@mail.gmail.com>
2012-03-24  8:58               ` Nguyen Thai Ngoc Duy
     [not found]                 ` <CAKTdtZkpjVaBSkcieojKj+V7WztT3UDzjGfXyghY=S8mq+X9zw@mail.gmail.com>
     [not found]                   ` <CACsJy8D85thmK_5jLC7MxJtsitLr=zphKiw2miwPu7Exf7ty=Q@mail.gmail.com>
2012-03-26 12:36                     ` elton sky
2012-03-26 12:41                       ` elton sky
2012-03-26 14:28                       ` Thomas Rast
2012-03-26 15:25                         ` Nguyen Thai Ngoc Duy
2012-03-26 16:08                           ` Shawn Pearce
2012-03-27  2:49                             ` elton sky
2012-03-27  3:34                               ` David Barr
2012-03-27  6:33                                 ` Nguyen Thai Ngoc Duy
2012-03-29  9:45                                   ` Jeff King
2012-03-27  6:31                             ` Nguyen Thai Ngoc Duy
2012-03-26 16:19                         ` Nguyen Thai Ngoc Duy
2012-03-27  3:20                           ` elton sky
2012-03-27  6:43                             ` Nguyen Thai Ngoc Duy
2012-04-02 11:50                               ` elton sky
2012-04-02 12:31                                 ` Nguyen Thai Ngoc Duy
2012-04-02 14:27                                   ` Shawn Pearce [this message]
2012-04-02 15:12                                     ` Nguyen Thai Ngoc Duy
2012-04-04  8:26                                   ` elton sky
2012-04-04 12:20                                     ` Nguyen Thai Ngoc Duy
2012-04-04 16:22                                       ` elton sky
2012-04-06  3:13                                         ` elton sky
2012-04-06  3:15                                           ` elton sky
2012-04-07  8:29                                             ` elton sky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJo=hJuVZiik6J0nhO4jpzWYenerRQoREHLMmJoFY8W0bZR+5A@mail.gmail.com' \
    --to=spearce@spearce.org \
    --cc=eltonsky9404@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=trast@inf.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.