git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: git <git@vger.kernel.org>
Subject: Re: [RFC PATCH 05/15] read_tree_recursive: Avoid missing blobs and trees in a sparse repository
Date: Sat, 4 Sep 2010 22:31:33 -0600	[thread overview]
Message-ID: <AANLkTimH_iMvvOadoCx_DJybDpzS3eNe92-6ywzdgUUk@mail.gmail.com> (raw)
In-Reply-To: <AANLkTimsqDbtXArL_EQc+ZCqOd=tYWJ1ZXZpa7T5qUiH@mail.gmail.com>

On Sat, Sep 4, 2010 at 9:16 PM, Elijah Newren <newren@gmail.com> wrote:
> On Sat, Sep 4, 2010 at 8:00 PM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
>> On Sun, Sep 5, 2010 at 10:13 AM, Elijah Newren <newren@gmail.com> wrote:
>>> @@ -119,6 +119,11 @@ int read_tree_recursive(struct tree *tree,
>>>                default:
>>>                        return -1;
>>>                }
>>> +
>>> +               if (git_sparse_pathspecs &&
>>> +                   sha1_object_info(entry.sha1, NULL) <= 0)
>>> +                       continue;
>>> +
>>
>> I suppose this is temporary and the final solution would have stricter
>> checks (i.e. only ignore if those entries are outside sparse zone)?
>> This opens a door for broken repo.
>
> Yes, good catch.  Looks like I somehow missed that one, but I agree,
> there should be an "&& !tree_entry_interesting(...)" in there.

Sorry, now I remember why that isn't in there and can't be.  I did
have it there at one point. However, base & baselen in
read_tree_recursive do not necessarily correspond to the relative path
from the toplevel of the repository, though the sparse limit pathspecs
always will.  For example, running
  git ls-tree master:Documentation/technical
or, equivalently (for current git.git)
  git ls-tree da0ae7c59bb0df4c13554fd840e1a563cde659ea
then base will be "" for paths under Documentation/technical rather
than "Documenation/technical/".  And there's really no way of
determining the "real base" either in order to apply matching to the
sparse limit pathspecs (well...I guess you could do an exhaustive walk
of all history each time, so long as the given sha1sum only appears as
one particular directory, but that's unrealistic and slow and leaves
open what to do when multiple directories at different paths happen to
have the same sha1sum at some point(s) in their history).  Note that
this affects cat-file -p as well, since it calls ls-tree and thus this
code.

I really don't see how one can change this.  However, if it's any
consolation, sha1_object_info() will print out a warning message
whenever it's asked for a sha1sum that cannot be found in the object
store.  For example, in a sparse clone:

$ git ls-tree -rt master
040000 tree f98bf35e9a746ebbd5a706591fe1ea4942942bad    sub
040000 tree 436913a91c5648a6ed8fa23719fbd6e3052e2597    sub/a
error: unable to find 436913a91c5648a6ed8fa23719fbd6e3052e2597
040000 tree 07753f428765ac1afe2020b24e40785869bd4a85    sub/b
100644 blob d95f3ad14dee633a758d2e331151e950dd13e4ed    sub/b/file
040000 tree 07753f428765ac1afe2020b24e40785869bd4a85    sub/bcopy
100644 blob d95f3ad14dee633a758d2e331151e950dd13e4ed    sub/bcopy/file
100644 blob 4b7b65e07a8641bcd14375ebddf5c8a7fc002a30    sub/file

It can't traverse into sub/a because it doesn't have the necessary
information.  I figured the warning message was useful to the end user
as a reminder that they are in a sparse clone. (Such warning messages
don't affect cat-file -p, because its callbacks don't return
READ_TREE_RECURSIVE, making sure it doesn't trigger this part of the
code.)

  reply	other threads:[~2010-09-05  4:31 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-05  0:13 [RFC PATCH 00/15] Sparse clones Elijah Newren
2010-09-05  0:13 ` [RFC PATCH 01/15] README-sparse-clone: Add a basic writeup of my ideas for sparse clones Elijah Newren
2010-09-05  3:01   ` Nguyen Thai Ngoc Duy
2010-09-05  3:13     ` Elijah Newren
2010-09-06  3:14       ` Nguyen Thai Ngoc Duy
2010-09-05  0:13 ` [RFC PATCH 02/15] Add tests for client handling in a sparse repository Elijah Newren
2010-09-05  0:13 ` [RFC PATCH 03/15] Read sparse limiting args from $GIT_DIR/sparse-limit Elijah Newren
2010-09-05  0:13 ` [RFC PATCH 04/15] When unpacking in a sparse repository, avoid traversing missing trees/blobs Elijah Newren
2010-09-05  0:13 ` [RFC PATCH 05/15] read_tree_recursive: Avoid missing blobs and trees in a sparse repository Elijah Newren
2010-09-05  2:00   ` Nguyen Thai Ngoc Duy
2010-09-05  3:16     ` Elijah Newren
2010-09-05  4:31       ` Elijah Newren [this message]
2010-09-05  0:13 ` [RFC PATCH 06/15] Automatically reuse sparse limiting arguments in revision walking Elijah Newren
2010-09-05  1:58   ` Nguyen Thai Ngoc Duy
2010-09-05  4:50     ` Elijah Newren
2010-09-05  7:12       ` Nguyen Thai Ngoc Duy
2010-09-05  0:13 ` [RFC PATCH 07/15] cache_tree_update(): Capability to handle tree entries missing from index Elijah Newren
2010-09-05  7:54   ` Nguyen Thai Ngoc Duy
2010-09-05 21:09     ` Elijah Newren
2010-09-06  4:42       ` Elijah Newren
2010-09-06  5:02         ` Nguyen Thai Ngoc Duy
2010-09-06  4:47   ` [PATCH 0/4] en/object-list-with-pathspec update Nguyễn Thái Ngọc Duy
2010-09-06  4:47   ` [PATCH 1/4] Add testcases showing how pathspecs are ignored with rev-list --objects Nguyễn Thái Ngọc Duy
2010-09-06  4:47   ` [PATCH 2/4] tree-walk: copy tree_entry_interesting() as is from tree-diff.c Nguyễn Thái Ngọc Duy
2010-09-06 15:22     ` Elijah Newren
2010-09-06 22:09       ` Nguyen Thai Ngoc Duy
2010-09-06  4:47   ` [PATCH 3/4] tree-walk: actually move tree_entry_interesting() to tree-walk.c Nguyễn Thái Ngọc Duy
2010-09-06 15:31     ` Elijah Newren
2010-09-06 22:20       ` Nguyen Thai Ngoc Duy
2010-09-06 23:53         ` Junio C Hamano
2010-09-06  4:47   ` [PATCH 4/4] Make rev-list --objects work together with pathspecs Nguyễn Thái Ngọc Duy
2010-09-07  1:28   ` [RFC PATCH 07/15] cache_tree_update(): Capability to handle tree entries missing from index Nguyen Thai Ngoc Duy
2010-09-07  3:06     ` Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 08/15] cache_tree_update(): Require relevant tree to be passed Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 09/15] Add tests for communication dealing with sparse repositories Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 10/15] sparse-repo: Provide a function to record sparse limiting arguments Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 11/15] builtin-clone: Accept paths for sparse clone Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 12/15] Pass extra (rev-list) args on, at least in some cases Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 13/15] upload-pack: Handle extra rev-list arguments being passed Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 14/15] EVIL COMMIT: Include all commits Elijah Newren
2010-09-05  0:14 ` [RFC PATCH 15/15] clone: Ensure sparse limiting arguments are used in subsequent operations Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AANLkTimH_iMvvOadoCx_DJybDpzS3eNe92-6ywzdgUUk@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).