From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
newren@gmaill.com, Jeff King <peff@peff.net>,
Taylor Blau <me@ttaylorr.com>,
Jonathan Nieder <jrnieder@gmail.com>,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 04/10] sparse-checkout: allow in-tree definitions
Date: Wed, 20 May 2020 10:52:41 -0700
Message-ID: <CABPp-BH5p1VPXfMOyN_0SLnsFKkRU9R-ZpiAe4k5r=ZUbHeibQ@mail.gmail.com>
In-Reply-To: <6d354901-9361-d8d1-539d-3b6c3edb2d9f@gmail.com>
On Fri, May 8, 2020 at 8:42 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 5/7/2020 6:58 PM, Junio C Hamano wrote:
> > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> >> One of the difficulties of using the sparse-checkout feature is not
> >> knowing which directories are absolutely needed for working in a portion
> >> of the repository. Some of this can be documented in README files or
> >> included in a bootstrapping tool along with the repository. This is done
> >> in an ad-hoc way by every project that wants to use it.
> >>
> >> Let's make this process easier for users by creating a way to define a
> >> useful sparse-checkout definition inside the Git tree data. This has
> >> several benefits. In particular, the data is available to anyone who has
> >> a copy of the repository without needing a different data source.
> >> Second, the needs of the repository can change over time and Git can
> >> present a way to automatically update the working directory as these
> >> sparse-checkout definitions change over time.
> >
> > And two lines of development can merge them together?
> >
> > Any time a new "feature" pops up that would eventually affect how
> > "git clone" and "git checkout" work based on untrusted user data, we
> > need to make sure there is no negative security implications.
> >
> > If it only boils down to "we have files that can record list of
> > leading directory names and without offering extra 'flexibility'", I
> > guess there aren't all that much that a malicious sparse definition
> > can do and we would be safe, though.
>
> Yes. I hope that we can be extremely careful with this feature.
> The RFC status of this series implicitly includes the question
> "Should we do this at all?" I think the benefits outweigh the
> risks, but we can minimize those risks with very careful design
> and implementation.
>
> >> To use this feature, add the "--in-tree" option when setting or adding
> >> directories to the sparse-checkout definition. For example:
> >>
> >> $ git sparse-checkout set --in-tree .sparse/base
> >> $ git sparse-checkout add --in-tree .sparse/extra
> >>
> >> These commands add values to the multi-valued config setting
> >> "sparse.inTree". When updating the sparse-checkout definition, these
> >> values describe paths in the repository to find the sparse-checkout
> >> data. After the commands listed earlier, we expect to see the following
> >> in .git/config.worktree:
> >>
> >> [sparse]
> >> intree = .sparse/base
> >> intree = .sparse/extra
> >
> > What does this say in human words? "These two tracked files specify
> > which paths should be in the working tree"? Spelling it out here
> > would help readers of this commit.
>
> You got it. Sounds good.
>
> >> When applying the sparse-checkout definitions from this config, the
> >> blobs at HEAD:.sparse/base and HEAD:.sparse/extra are loaded.
> >
> > OK, so end-user edit to the working tree copy or what is added to
> > the index does not count and only the committed version gets used.
> >
> > That makes it simple---I was wondering how we would operate when
> > merging a branch with different contents in the .sparse/* files
> > until the conflicts are resolved.
>
> It's worth testing this case so we can be sure what happens.

During a merge, rebase, or checkout -m, what happens if .sparse/extra
has the following working tree content:

    [sparse]
    dir = D
    dir = X
    <<<<<<< HEAD
    dir = Y
    ||||||| MERGE_BASE
    =======
    inherit = .sparse/tools
    >>>>>>> MERGE_HEAD
    inherit = .sparse/base

and, of course, three different entries in the index?

Also, do we use the version of the --in-tree file from the latest
commit, from the index, or from the working tree? (This is a question
not only for merge and rebase, but also for checkout with dirty changes
and even checkout -m.) Which one "wins"?
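
For anyone who wants to poke at this locally, here is a rough sketch
(not part of the series) that builds a throwaway repository, provokes a
conflict like the one above in .sparse/extra, and prints each candidate
version: the HEAD blob, the three index stages, and the working tree
file. The branch name "side", the file contents, the placeholder
identity, and the temporary directory are all made up for illustration.
Only stock git commands are used, nothing from the proposed --in-tree
code.

```shell
#!/bin/sh
# Sketch only: reproduce a conflicted .sparse/extra in a scratch repository
# and show the versions a sparse-checkout implementation could read from.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"
git config merge.conflictStyle diff3     # include the merge base in markers

mkdir .sparse
printf '[sparse]\n\tdir = D\n\tdir = X\n\tinherit = .sparse/base\n' >.sparse/extra
git add .sparse/extra
git commit -q -m "base definition"
main_branch=$(git symbolic-ref --short HEAD)

# "side" (hypothetical branch name) adds an inherit line.
git checkout -q -b side
printf '[sparse]\n\tdir = D\n\tdir = X\n\tinherit = .sparse/tools\n\tinherit = .sparse/base\n' >.sparse/extra
git commit -q -a -m "side: inherit tools"

# The original branch adds a directory at the same spot.
git checkout -q "$main_branch"
printf '[sparse]\n\tdir = D\n\tdir = X\n\tdir = Y\n\tinherit = .sparse/base\n' >.sparse/extra
git commit -q -a -m "main: add Y"

# The merge is expected to stop with a conflict, so ignore its exit status.
git merge --no-edit side >/dev/null 2>&1 || true

echo "== index entries =="
git ls-files --stage -- .sparse/extra
stage_count=$(git ls-files --stage -- .sparse/extra | wc -l | tr -d ' ')

echo "== HEAD (latest commit) =="
git show HEAD:.sparse/extra
echo "== stage 1 (merge base) =="
git show :1:.sparse/extra
echo "== stage 2 (ours) =="
git show :2:.sparse/extra
echo "== stage 3 (theirs) =="
git show :3:.sparse/extra
echo "== working tree =="
cat .sparse/extra
```

While the conflict is unresolved, HEAD and stage 2 hold the same
content, stage 3 holds the other side, and only the working tree file
carries conflict markers, so the three possible sources really do
disagree at that point.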

And what if the user updates and commits an ill-formed version of the
file -- is the result equivalent to an empty cone with just the
toplevel directory, to a complete checkout of everything, or to
something else?
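
One data point, assuming the in-tree file keeps the config-like syntax
shown above: git's existing config parser rejects a file that still
contains conflict markers rather than skipping the bad lines. The
sketch below only runs "git config --file" on two scratch files (their
names and contents are invented for illustration); it says nothing
about what the series itself does with such a blob.

```shell
#!/bin/sh
# Sketch only: feed a well-formed and a conflicted definition file to git's
# existing config parser. File names and contents are illustrative.
tmp=$(mktemp -d)

printf '[sparse]\n\tdir = D\n\tdir = X\n\tinherit = .sparse/base\n' >"$tmp/good"
printf '[sparse]\n\tdir = D\n<<<<<<< HEAD\n\tdir = Y\n=======\n\tdir = Z\n>>>>>>> MERGE_HEAD\n' >"$tmp/bad"

# Well-formed file: every sparse.dir value is listed, one per line.
good_dirs=$(git config --file "$tmp/good" --get-all sparse.dir)
good_status=$?

# Conflicted file: the parser is expected to fail on the first marker line.
bad_dirs=$(git config --file "$tmp/bad" --get-all sparse.dir 2>"$tmp/bad.err")
bad_status=$?

printf 'good: status=%s dirs=%s\n' "$good_status" "$(echo $good_dirs)"
printf 'bad:  status=%s\n' "$bad_status"
cat "$tmp/bad.err"
```

Whatever policy is chosen for an ill-formed committed file, the caller
would have to handle this parse failure explicitly; falling back to
"empty cone" or "full checkout" is a design decision, not something the
parser provides.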