All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: David Turner <dturner@twopensource.com>,
	Git Mailing List <git@vger.kernel.org>,
	David Turner <dturner@twitter.com>
Subject: Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
Date: Tue, 08 Jul 2014 10:05:24 -0700	[thread overview]
Message-ID: <xmqqwqbnaii3.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <CACsJy8C20oFdATHKTLK=9U3_kHu1QsuS4i74RPgQn0aTwVCC8w@mail.gmail.com> (Duy Nguyen's message of "Tue, 8 Jul 2014 17:32:20 +0700")

Duy Nguyen <pclouds@gmail.com> writes:

> I wonder if we need to update_main_cache_tree() so many times in this
> function. If I read the code correctly, all roads must lead to
> update_main_cache_tree(0) in prepare_to_commit().

I think prepare-to-commit is too late; it does not want to know if
the index it was given to create a tree out of is the one that the
user will keep using after this invocation of "git commit" or just a
temporary one used for partial commit.  The cache-tree rebuilt there
is what is recorded with commit_tree_extended() in cmd_commit(), but
if you are doing a partial commit, these generic code paths are
given a temporary index file on the filesystem to work on, and
cache-tree in that will not help user's later operation.

For three main uses of "git commit", prepare_index() does these:

 - "git commit -a" and "git commit -i $paths" update the index with
   the new contents from the working tree, fully builds cache-tree
   in-core to write out the tree objects, and writes the index file
   to the filesystem.  Because this index is the one used after this
   invocation of "git commit" returns, we have a fully populated
   cache-tree after this happens.  This code path is perfect and
   there is no need to change.

 - "git commit" commits the contents of the index as-is, so
   technically there is no reason for it to update the index on the
   filesystem at all, but it refreshes the cached stat data to help
   the "status" part of the command, and if it finds that stat data
   was stale, updates the index on the filesystem because it is
   wasteful not to do so.  As we would be spending I/O cycles to
   update the index file in that case anyway, we also rebuild the
   cache-tree and include that in the updated index.

   When the cached stat data was already up-to-date, however, we do
   not update the index on the filesystem, so the series under
   discussion will change the trade-off by doing one more I/O to
   write out a new index to the filesystem only to update cache-tree.

 - "git commit $paths" updates the "real" index with given $paths
   and writes it out to the filesystem first.  This is the index the
   user will use after "git commit" finishes; traditionally our
   trade-off was "populate cache-tree only when we do not have to
   spend any cycle only to do so (i.e. when we are writing trees
   anyway, or when we read from a tree), and let it degrade as paths
   are added, removed and modified" and we avoided populating
   cache-tree in this codepath.  The series under discussion will
   change the trade-off here, too.

   After it updates this "real" index, it builds another temporary
   index that represents the tree state to be committed (starting
   from HEAD and updates with the given $paths), but that will be
   discarded and we do not want to spend any extra cycle to do
   anything only to make its later use more efficient (like writing
   updated cache-tree to it).

> If we find out that
> we have incomplete cache-tree before that call, we could write the
> index one more time after that call,

and that will make an extra I/O only to write out cache-tree to the
temporary index that we will discard immediately after for a partial
commit.

  reply	other threads:[~2014-07-08 17:05 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-06  4:06 [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout David Turner
2014-07-06  4:06 ` [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
2014-07-07 19:27   ` Junio C Hamano
2014-07-06  4:06 ` [PATCH v4 3/4] cache-tree: subdirectory tests David Turner
2014-07-06  8:10   ` Eric Sunshine
2014-07-07 19:15   ` Junio C Hamano
2014-07-06  4:06 ` [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit David Turner
2014-07-07 20:03   ` Junio C Hamano
2014-07-08  0:26     ` Junio C Hamano
2014-07-08 10:32       ` Duy Nguyen
2014-07-08 17:05         ` Junio C Hamano [this message]
2014-07-09  1:58           ` Duy Nguyen
2014-07-08 18:32         ` Junio C Hamano
2014-07-08 19:15         ` Junio C Hamano
2014-07-07 18:58 ` [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqwqbnaii3.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=dturner@twitter.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.