From: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
To: Thomas Rast <trast@student.ethz.ch>
Cc: elton sky <eltonsky9404@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: GSoC - Designing a faster index format
Date: Mon, 26 Mar 2012 22:25:59 +0700
Message-ID: <CACsJy8CsdZpQUQ7ydM1fOpSomm6+LyACCR83ccncVtUk+HbLKA@mail.gmail.com>
In-Reply-To: <87iphrjv23.fsf@thomas.inf.ethz.ch>

On Mon, Mar 26, 2012 at 9:28 PM, Thomas Rast <trast@student.ethz.ch> wrote:
> elton sky <eltonsky9404@gmail.com> writes:
>
>> On Mon, Mar 26, 2012 at 12:06 PM, Nguyen Thai Ngoc Duy
>> <pclouds@gmail.com> wrote:
>>> (I think this should be on git@vger as there are many experienced devs there)
>>>
>>> On Sun, Mar 25, 2012 at 11:13 AM, elton sky <eltonsky9404@gmail.com> wrote:
>>>> About the new format:
>>>>
>>>> The index is a single file. Entries in the index are still stored
>>>> sequentially, as in the old format. The difference is that they are
>>>> grouped into blocks. A block contains many entries, ordered by name.
>>>> Blocks are also ordered by the name of their first entry. Each block
>>>> contains a sha1 for the entries in it.
>>>
>>> If I remove an entry in the first block, because blocks are of fixed
>>> size, you would need to shift all entries up by one, thus update all
>>> blocks?
>>
>> We need some GC here. I am not moving all blocks. Rather, I would
>> consider merging or recycling the block. In a simple case, if a block
>> becomes empty, I'll change the new-block offset in the header to point
>> to this block, and make this block point to the original new-block
>> offset. In this way, I keep a list of empty blocks I can reuse.
> [...]
>
> Doesn't that venture into database land?
>
> If we go that far, wouldn't it be better to use a proper database
> library?  All other things being equal, writing such complex code from
> scratch is probably not a good idea.

If there's a library that fits our needs (including static linking),
yes. I think we've come close to the sqlite file format [1]. But
sqlite comes with an SQL engine, transactional updates... that we don't
need. Another obvious source of inspiration is file systems, but I
dare not go that way.
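
To make the quoted empty-block recycling scheme concrete, here is a
minimal in-memory sketch of it. This is only an illustration under
stated assumptions, not code from git or from any proposal in this
thread: the names BlockFile, allocate, release and next_new, and the
4096-byte block size, are all hypothetical. The header keeps a single
"next new block" offset; releasing a block makes the header point at it
and makes the block remember the previous offset, so empty blocks form
a chain that ends at the end of the file.

```python
BLOCK_SIZE = 4096  # hypothetical fixed block size, in bytes


class BlockFile:
    """In-memory model of a file of fixed-size blocks with a free list."""

    def __init__(self):
        self.end = 0         # offset just past the last block ever allocated
        self.next_new = 0    # header field: where the next block will go
        self.free_link = {}  # offset of an empty block -> next offset in chain

    def allocate(self):
        """Return the offset for a new block, reusing an empty one if any."""
        offset = self.next_new
        if offset in self.free_link:
            # Recycled block: the header now points at whatever it linked to.
            self.next_new = self.free_link.pop(offset)
        else:
            # Chain exhausted, so offset is the end of file: grow by one block.
            self.end += BLOCK_SIZE
            self.next_new = self.end
        return offset

    def release(self, offset):
        """Mark a block as empty by pushing it on the front of the chain."""
        # The sketch does not guard against releasing the same block twice.
        self.free_link[offset] = self.next_new
        self.next_new = offset
```

Allocating three blocks, releasing the first two and allocating again
hands the released offsets back before the file grows. Note that this
covers only the "simple case" of fully empty blocks; merging partially
filled blocks is not modelled.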

[1] http://www.sqlite.org/fileformat2.html
-- 
Duy