git.vger.kernel.org archive mirror
* [GSoC] Designing a faster index format - Progress Report week 8
@ 2012-06-11 20:53 Thomas Gummerer
  2012-06-15 13:39 ` Thomas Rast
From: Thomas Gummerer @ 2012-06-11 20:53 UTC
  To: git; +Cc: trast, gitster, mhagger, pclouds


== Work done in the previous 7 weeks ==

- Definition of a tentative index file v5 format [1]. This differs
  from the proposal in that the directory entries and file entries
  can be bisected, which makes a binary search possible. The exact
  bits for each section were also defined. To further compress the
  index, along with prefix compression, the stat data is hashed: it
  is only ever compared, and the plain values are never used.
  Thanks to Michael Haggerty, Nguyen Thai Ngoc Duy, Thomas Rast
  and Robin Rosenberg for feedback.
- Prototype of a converter from the index format v2/v3 to the index
  format v5. [2] The converter reads the index from a git repository
  and can output parts of the index (header, index entries as in
  git ls-files --debug, cache tree as in test-dump-cache-tree, or
  the reuc data). Then it writes the v5 index file format to
  .git/index-v5. Thanks to Michael Haggerty for the code review.
- Prototype of a reader for the new index file format. [3] The
  reader's main purpose is to show the algorithm used to read the
  index sorted lexicographically by full name, which is required
  by the current in-memory format. Big thanks to Michael Haggerty
  for reviewing this code and giving me advice on refactoring.
- Read the new index format and translate it to the current
  in-memory format. This doesn't include reading any of the current
  extensions, which are now part of the main index. The code is
  again on github. [4] Thanks to Thomas Rast for reviewing the
  first steps.
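
Two ideas from the first item above, hashing the stat data because it
is only ever compared, and keeping entries in an order that can be
bisected, can be illustrated with a minimal Python sketch. This is not
taken from the prototypes in [2] or [3]. The function names, the choice
of CRC32 as the hash, the set of stat fields and their 64-bit packing
are all assumptions made for illustration only; the actual layout is
whatever the format description in [1] says.

```python
import bisect
import struct
import zlib


def stat_digest(ctime, mtime, dev, ino, uid, gid, size):
    """Return one 32-bit checksum for a set of stat fields.

    Hypothetical helper: the field list, the big-endian 64-bit packing
    and the use of CRC32 are assumptions, not the v5 definition.
    """
    packed = struct.pack(">7Q", ctime, mtime, dev, ino, uid, gid, size)
    return zlib.crc32(packed) & 0xFFFFFFFF


def find_entry(sorted_names, name):
    """Binary-search a sorted list of names; return the index or -1."""
    i = bisect.bisect_left(sorted_names, name)
    if i < len(sorted_names) and sorted_names[i] == name:
        return i
    return -1
```

A file would then count as unchanged when the digest computed from a
fresh stat() equals the stored one, so the individual stat values never
need to be kept. A lookup by name costs O(log n) comparisons instead of
a linear scan over all entries.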

== Work done in the last week ==

- Read the cache-tree data from the new on-disk format. Like the
  reading algorithm, this code can (and will) still be optimized.
- Started implementing the API, but it's still in the very early
  stages.

== Outlook for the next week ==

- Continue implementing the API as discussed on [5].

[1] https://github.com/tgummerer/git/wiki/Index-file-format-v5
[2] https://github.com/tgummerer/git/blob/pythonprototype/git-convert-index.py
[3] https://github.com/tgummerer/git/blob/pythonprototype/git-read-index-v5.py
[4] https://github.com/tgummerer/git/tree/index-v5
[5] http://thread.gmane.org/gmane.comp.version-control.git/198283/focus=198474


* Re: [GSoC] Designing a faster index format - Progress Report week 8
  2012-06-11 20:53 [GSoC] Designing a faster index format - Progress Report week 8 Thomas Gummerer
@ 2012-06-15 13:39 ` Thomas Rast
From: Thomas Rast @ 2012-06-15 13:39 UTC
  To: Thomas Gummerer; +Cc: git, gitster, mhagger, pclouds

Thomas Gummerer <t.gummerer@gmail.com> writes:

> == Outlook for the next week ==
>
> - Continue implementing the API as discussed on [5].
>
> [5] http://thread.gmane.org/gmane.comp.version-control.git/198283/focus=198474

Sorry for being rather slow this week!  I still intend to do another
review of the github code RSN, however I felt I should point out some
more IRC conclusions to the list.

I talked Thomas out of going forward with the API as the next step, and
into working towards a writer side for the current format instead.

This should have some benefits:

* The code becomes more easily testable, and hopefully able to run the
  test suite.

* Spelling the writer in code should shake down any unforeseen
  deficiencies of the format.

  Until update-index or similar learn to use the partial writing
  facilities, we should have a little extra tool here that allows poking
  at a single entry for testing.

* The API will need to have an updating and a writing side, and what
  that should look like will become clearer.

That does mean that the API-demo in git-ls-files (suggested by Duy) will
likely have to wait until after midterms.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
