From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: pclouds@gmail.com, Elijah Newren <newren@gmail.com>
Subject: [RFC PATCH 00/15] Sparse clones
Date: Sat,  4 Sep 2010 18:13:52 -0600
Message-ID: <1283645647-1891-1-git-send-email-newren@gmail.com>

This patch series implements some basics for sparse clones, which I
define as clones where not all blob, tree, or commit objects are
downloaded.  The idea is to support sparseness both in the span of
files/directories and in the depth of history, though so far I've only
put effort into the span of paths.

This series is built on pu, because it requires
en/object-list-with-pathspec.

What works:
  * all operations on non-sparse clones (full testsuite passes)
  * clone
  * read-tree
  * ls-files
  * cat-file
  * ls-tree
  * checkout
  * diff
  * status
  * log
  * add (except for not giving errors for paths outside the sparse limits)
  * commit
What doesn't work yet:
  * Probably everything not tested in the new t572*.sh tests  :-)
Notable examples of things missing from the t572*.sh tests:
  * fetch
  * push
  * merge
  * rebase
  * thin packs (need to modify pack-objects to only delta against
    objects within the sparse limits)
  * densify command (to make a sparse repository non-sparse)
  * "missing" commits (see README file in PATCH1)

Cursory comparison with Nguyễn Thái Ngọc Duy's subtree clone (he's probably
made progress since his last submission, so this may be outdated):
  * His series supports fetch, mine doesn't (yet).
  * His series supports push,  mine doesn't (yet).
  * His series supports merge, mine doesn't (yet).
  * His handling of subtree requests over clone/fetch via capabilities
    is probably the right way; I'm pretty sure my adding of sparse
    limits as extra arguments to upload-pack would break backward
    compatibility and be bad.
  * He supports just one selected subtree (though he mentioned he's
    working on extending that); I support an arbitrary number of
    subtrees or files.
  * He modifies the index format (bumping to header version 4); I
    don't.  Perhaps that's necessary for merge handling, which I
    haven't implemented, but at an early glance I don't think it is.
  * While there are some similarities in the low-level details of how
    we've modified git to avoid missing objects, there are many
    differences as well.  I'm hoping to provoke some good discussion.

Elijah Newren (15):

  P1- README-sparse-clone: Add a basic writeup of my ideas for sparse clones

Just a big old write-up.  Not everything in it is implemented yet, but it
gives you the high-level picture.

  P2- Add tests for client handling in a sparse repository

Tests!  Yaay!

  P3- Read sparse limiting args from $GIT_DIR/sparse-limit

When a sparse clone is created, limiting paths will be stored.

  P4- When unpacking in a sparse repository, avoid traversing missing
    trees/blobs
  P5- read_tree_recursive: Avoid missing blobs and trees in a sparse
    repository
  P6- Automatically reuse sparse limiting arguments in revision walking
  P7- cache_tree_update(): Capability to handle tree entries missing from
    index
  P8- cache_tree_update(): Require relevant tree to be passed

Avoiding missing trees/blobs.  
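
To make the idea behind P4-P8 concrete, here is a minimal sketch of a
tree walk that never touches objects outside the sparse limits.  It is
not the code from these patches: the in-memory object store, the
(kind, name, oid) entry layout, and the names within_limits() and
walk_sparse() are all hypothetical and exist only for illustration.

```python
from typing import Dict, Iterator, List, Tuple, Union

# Hypothetical model: a tree is a list of (kind, name, oid) entries and a
# blob is bytes. Objects that were never downloaded are absent from the store.
Entry = Tuple[str, str, str]
Store = Dict[str, Union[List[Entry], bytes]]


def within_limits(path: str, limits: List[str], is_tree: bool) -> bool:
    """True if path lies inside a limit, or is a directory leading to one."""
    if not limits:
        return True  # no limits recorded: treat as a non-sparse repository
    for limit in limits:
        limit = limit.rstrip("/")
        if path == limit or path.startswith(limit + "/"):
            return True
        if is_tree and limit.startswith(path + "/"):
            return True
    return False


def walk_sparse(store: Store, tree_oid: str, limits: List[str],
                prefix: str = "") -> Iterator[Tuple[str, bytes]]:
    """Yield (path, contents) for blobs inside the sparse limits only."""
    for kind, name, oid in store[tree_oid]:
        path = prefix + name
        if not within_limits(path, limits, kind == "tree"):
            # Outside the limits: the object may be missing, so skip the
            # entry without ever looking it up in the store.
            continue
        if kind == "tree":
            yield from walk_sparse(store, oid, limits, path + "/")
        else:
            yield path, store[oid]


# The "builtin" tree and the "Makefile" blob are deliberately missing.
store: Store = {
    "root": [("tree", "Documentation", "t1"),
             ("tree", "builtin", "t2"),
             ("blob", "Makefile", "b1")],
    "t1": [("blob", "git.txt", "b2")],
    "b2": b"doc",
}
found = list(walk_sparse(store, "root", ["Documentation"]))
```

The point of the sketch is that the limit check happens before any
lookup, so an entry outside the limits is skipped by name alone and its
absence from the store never matters.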

  P9- Add tests for communication dealing with sparse repositories

Tests for clone/fetch/push/etc.  Just clone so far.

  P10- sparse-repo: Provide a function to record sparse limiting arguments

Can't just read from $GIT_DIR/sparse-limit; gotta write to it too.
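
As a rough illustration of that read/write pair, the sketch below
records limiting arguments and reads them back.  Only the file name
sparse-limit comes from this series; the one-argument-per-line format
and the function names record_sparse_limits() and read_sparse_limits()
are assumptions made for the example, not what sparse-repo.c does.

```python
import tempfile
from pathlib import Path
from typing import List

SPARSE_LIMIT_FILE = "sparse-limit"  # name from the series; format is assumed


def record_sparse_limits(git_dir: str, args: List[str]) -> None:
    """Write one limiting argument per line (assumed format)."""
    text = "".join(arg + "\n" for arg in args)
    Path(git_dir, SPARSE_LIMIT_FILE).write_text(text, encoding="utf-8")


def read_sparse_limits(git_dir: str) -> List[str]:
    """Return the recorded arguments, or [] when the file is absent."""
    path = Path(git_dir, SPARSE_LIMIT_FILE)
    if not path.exists():
        return []  # no file: treat the repository as non-sparse
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line for line in lines if line]


# Round trip in a throwaway directory standing in for $GIT_DIR.
with tempfile.TemporaryDirectory() as git_dir:
    before = read_sparse_limits(git_dir)
    record_sparse_limits(git_dir, ["Documentation", "t/lib"])
    after = read_sparse_limits(git_dir)
```

Treating a missing file as "no limits" keeps non-sparse repositories on
their existing code paths in this sketch.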

  P11- builtin-clone: Accept paths for sparse clone
  P12- Pass extra (rev-list) args on, at least in some cases
  P13- upload-pack: Handle extra rev-list arguments being passed
  P14- EVIL COMMIT: Include all commits
  P15- clone: Ensure sparse limiting arguments are used in subsequent
    operations

I like the changes to how clone accepts additional rev-list arguments
to limit what is downloaded, but I'm not too happy with how these
patches pass those rev-list arguments on to upload-pack.  So don't
bother looking too closely at these.


 Makefile                                   |    2 +
 README-sparse-clone                        |  284 ++++++++++++++++++++++++++++
 builtin/archive.c                          |    2 +-
 builtin/checkout.c                         |    2 +-
 builtin/clone.c                            |   39 +++-
 builtin/commit.c                           |   15 +-
 builtin/fetch-pack.c                       |    3 +-
 builtin/merge.c                            |   19 +-
 builtin/revert.c                           |    7 +-
 builtin/send-pack.c                        |    3 +-
 builtin/write-tree.c                       |    6 +-
 cache-tree.c                               |   92 +++++++++-
 cache-tree.h                               |    4 +-
 cache.h                                    |    5 +-
 connect.c                                  |    9 +-
 diff.h                                     |    1 -
 environment.c                              |    2 +
 merge-recursive.c                          |    6 +-
 merge-recursive.h                          |    2 +-
 revision.c                                 |   21 ++-
 revision.h                                 |    3 +-
 setup.c                                    |    2 +
 sparse-repo.c                              |   84 ++++++++
 sparse-repo.h                              |    4 +
 t/sparse-lib.sh                            |   38 ++++
 t/t5601-clone.sh                           |   14 --
 t/t5720-sparse-repository-basics.sh        |  130 +++++++++++++
 t/t5721-sparse-repository-communication.sh |  106 +++++++++++
 test-dump-cache-tree.c                     |    3 +-
 transport-helper.c                         |    5 +-
 transport.c                                |   13 +-
 transport.h                                |    9 +-
 tree-diff.c                                |    4 +-
 tree-walk.c                                |   48 ++++-
 tree-walk.h                                |    3 +
 tree.c                                     |    5 +
 upload-pack.c                              |   45 +++--
 37 files changed, 952 insertions(+), 88 deletions(-)
 create mode 100644 README-sparse-clone
 create mode 100644 sparse-repo.c
 create mode 100644 sparse-repo.h
 create mode 100644 t/sparse-lib.sh
 create mode 100755 t/t5720-sparse-repository-basics.sh
 create mode 100755 t/t5721-sparse-repository-communication.sh

-- 
1.7.2.3.541.g94cc33
