git.vger.kernel.org archive mirror
* Config spec for git
@ 2021-11-17  9:30 Wallace, Brooke T (US 349D-Affiliate)
  2021-11-17 12:32 ` Ævar Arnfjörð Bjarmason
  2021-11-18  0:07 ` Johannes Schindelin
  0 siblings, 2 replies; 5+ messages in thread
From: Wallace, Brooke T (US 349D-Affiliate) @ 2021-11-17  9:30 UTC
  To: git

Has anyone considered adding a config spec feature to Git or does Git already have some way to support the same features?

I've been using Git for a while now for small projects but taking on a new larger project I've come to realize that Git does not have config specs and so seems to be missing an important feature for managing large projects.

We use configuration specs to select directories from a common code base (repo) and map them into different baselines to create multiple product builds with different feature sets. We used this feature in VCSs such as ClearCase and Perforce. Ultimately this allows us to manage the repo in one directory structure and create product builds with a different one. For example the repo has multiple directories for different products/targets, but a baseline, the workspace, has only one target directory always with the same name mapped to the same location. Obviously the corresponding directories in the repo have different names.

Git supports the notion of submodules, but I see no way to map a submodule directory to a different name, remove unwanted subdirs of a submodule, or map a submodule over a subdirectory of the primary repo. Config specs also allow you to specify a specific branch or version that you want to map to your workspace independent of other directories, branches and versions.

I suppose it may be possible to achieve the same result by treating the primary repo as the configspec. But I feel like there are some features config specs support that I do not have using submodules, but might need down the road.

I can see that omitting, obscuring, or overwriting parts of a repo would not play well with the commit id. So I imagine there could be some real complications trying to add support for the notion of a flexible config spec.

Appreciate any comments/feedback
-Brooke


* Re: Config spec for git
  2021-11-17  9:30 Config spec for git Wallace, Brooke T (US 349D-Affiliate)
@ 2021-11-17 12:32 ` Ævar Arnfjörð Bjarmason
  2021-11-17 15:38   ` Philip Oakley
  2021-11-18  0:07 ` Johannes Schindelin
  1 sibling, 1 reply; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-17 12:32 UTC
  To: Wallace, Brooke T (US 349D-Affiliate); +Cc: git


On Wed, Nov 17 2021, Wallace, Brooke T (US 349D-Affiliate) wrote:

> Has anyone considered adding a config spec feature to Git or does Git already have some way to support the same features?
>
> I've been using Git for a while now for small projects but taking on a
> new larger project I've come to realize that Git does not have config
> specs and so seems to be missing an important feature for managing
> large projects.
>
> We use configuration specs to select directories from a common code
> base (repo) and map them into different baselines to create multiple
> product builds with different feature sets. We used this feature in
> VCSs such as ClearCase and Perforce. Ultimately this allows us to
> manage the repo in one directory structure and create product builds
> with a different one. For example the repo has multiple directories
> for different products/targets, but a baseline, the workspace, has
> only one target directory always with the same name mapped to the same
> location. Obviously the corresponding directories in the repo have
> different names.
>
> Git supports the notion of submodules, but I see no way to map a
> submodule directory to a different name, remove unwanted subdirs of a
> submodule, or map a submodule over a subdirectory of the primary
> repo. Config specs also allow you to specify a specific branch or
> version that you want to map to your workspace independent of other
> directories, branches and versions.
>
> I suppose it may be possible to achieve the same result by treating
> the primary repo as the configspec. But I feel like there are some
> features config specs support that I do not have using submodules, but
> might need down the road.
>
> I can see that omitting, obscuring, or overwriting parts of a repo
> would not play well with the commit id. So I imagine there could be
> some real complications trying to add support for the notion of a
> flexible config spec.
>
> Appreciate any comments/feedback

I understand all the terms involved in your E-Mail except "config spec",
so on the first couple of readings I was thoroughly confused.

I gather from some Google searching that you may be referring to
ClearCase SCM jargon:
https://en.wikipedia.org/wiki/Rational_ClearCase#The_configuration_specification
&
https://www.ibm.com/docs/en/rational-clearcase/8.0.0?topic=views-how-config-spec-works

From your description it seems like you're talking about some
combination of the work-in-progress "sparse checkout" feature, and a
feature to compose arbitrary subdirectories and overlays of existing
repositories.

As far as I know nobody's working on the latter, although I suppose some
clever combination of submodules and sparse checkouts might make it
possible.

All of that's really a shot in the dark, I think I'm probably not the
only one who'd benefit from a description of what you'd expect a "config
spec" to do for you that doesn't assume pre-existing knowledge of the
term.

More generally, it's a very common initial migration strategy between
SCMs, and from X SCM -> Git in particular, to first consider how you
could map existing behavior 1:1 to Git.

Those sorts of migrations are generally much more painful in the longer
term than considering how you'd map the software or assets you have to
Git if you were starting out today, which may be something to think
about.


* Re: Config spec for git
  2021-11-17 12:32 ` Ævar Arnfjörð Bjarmason
@ 2021-11-17 15:38   ` Philip Oakley
  2021-11-22 18:19     ` Martin von Zweigbergk
  0 siblings, 1 reply; 5+ messages in thread
From: Philip Oakley @ 2021-11-17 15:38 UTC
  To: Ævar Arnfjörð Bjarmason, Wallace,
	Brooke T (US 349D-Affiliate)
  Cc: git

On 17/11/2021 12:32, Ævar Arnfjörð Bjarmason wrote:
> On Wed, Nov 17 2021, Wallace, Brooke T (US 349D-Affiliate) wrote:
>
>> Has anyone considered adding a config spec feature to Git or does Git already have some way to support the same features?
>>
>> I've been using Git for a while now for small projects but taking on a
>> new larger project I've come to realize that Git does not have config
>> specs and so seems to be missing an important feature for managing
>> large projects.
>>
>> We use configuration specs to select directories from a common code
>> base (repo) and map them into different baselines to create multiple
>> product builds with different feature sets. We used this feature in
>> VCSs such as ClearCase and Perforce. Ultimately this allows us to
>> manage the repo in one directory structure and create product builds
>> with a different one. For example the repo has multiple directories
>> for different products/targets, but a baseline, the workspace, has
>> only one target directory always with the same name mapped to the same
>> location. Obviously the corresponding directories in the repo have
>> different names.
>>
>> Git supports the notion of submodules, but I see no way to map a
>> submodule directory to a different name, remove unwanted subdirs of a
>> submodule, or map a submodule over a subdirectory of the primary
>> repo. Config specs also allow you to specify a specific branch or
>> version that you want to map to your workspace independent of other
>> directories, branches and versions.
>>
>> I suppose it may be possible to achieve the same result by treating
>> the primary repo as the configspec. But I feel like there are some
>> features config specs support that I do not have using submodules, but
>> might need down the road.
>>
>> I can see that omitting, obscuring, or overwriting parts of a repo
>> would not play well with the commit id. So I imagine there could be
>> some real complications trying to add support for the notion of a
>> flexible config spec.
>>
>> Appreciate any comments/feedback
> I understand all the terms involved in your E-Mail except "config spec",
> so on the first couple of readings I was thoroughly confused.
>
> I gather from some Google searching that you may be referring to
> ClearCase SCM jargon:
> https://en.wikipedia.org/wiki/Rational_ClearCase#The_configuration_specification
> &
> https://www.ibm.com/docs/en/rational-clearcase/8.0.0?topic=views-how-config-spec-works
>
> From your description it seems like you're talking about some
> combination of the work-in-progress "sparse checkout" feature, and a
> feature to compose arbitrary subdirectories and overlays of existing
> repositories.
>
> As far as I know nobody's working on the latter, although I suppose some
> clever combination of submodules and sparse checkouts might make it
> possible.
>
> All of that's really a shot in the dark, I think I'm probably not the
> only one who'd benefit from a description of what you'd expect a "config
> spec" to do for you that doesn't assume pre-existing knowledge of the
> term.
>
> More generally, it's a very common initial migration strategy between
> SCMs, and from X SCM -> Git in particular, to first consider how you
> could map existing behavior 1:1 to Git.
>
> Those sorts of migrations are generally much more painful in the longer
> term than considering how you'd map the software or assets you have to
> Git if you were starting out today, which may be something to think
> about.
Also intrigued, I tried "mapping clearcase config spec to Git" in my
search engine to see what it came up with.

There's a YouTube webinar  "ClearCase to Git - November 2016"
https://www.youtube.com/watch?v=z2odE0CKxCQ  which looked like it may
help clarify issues.
Then there are a few StackOverflow Q&As that may help with terminology.
https://stackoverflow.com/questions/763099/flexible-vs-static-branching-git-vs-clearcase-accurev
https://stackoverflow.com/questions/28280685/toward-an-ideal-workflow-with-clearcase-and-git

It feels like your config specs are like feature branches, but that what
is missing (from Git, relative to the config spec) is a merge strategy
that can define which particular files/folders are merged at the one
time, rather than the current 'all files' being merged. This desire has
come up a few times when large (corporate?)  projects need to merge
large independent feature branches that will need different specialists
to handle different groups of files (i.e. partial merges, e.g. [1]), but
that hasn't been implemented (yet) as it would need someone to think it
through and work on it.
--
Philip
[1]
https://lore.kernel.org/git/BY5PR19MB3400EB9AD87DFE612AFD5CC390810@BY5PR19MB3400.namprd19.prod.outlook.com/

Collaborative conflict resolution feature request



* Re: Config spec for git
  2021-11-17  9:30 Config spec for git Wallace, Brooke T (US 349D-Affiliate)
  2021-11-17 12:32 ` Ævar Arnfjörð Bjarmason
@ 2021-11-18  0:07 ` Johannes Schindelin
  1 sibling, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2021-11-18  0:07 UTC
  To: Wallace, Brooke T (US 349D-Affiliate); +Cc: git

Hi Brooke,

On Wed, 17 Nov 2021, Wallace, Brooke T (US 349D-Affiliate) wrote:

> Has anyone considered adding a config spec feature to Git or does Git
> already have some way to support the same features?

Since config specs are clearly not (yet) a Git feature, it would make
sense to begin the discussion with a description of the concept of config
specs, as the majority of the readers on the Git mailing list will be
unfamiliar with them.

> I've been using Git for a while now for small projects but taking on a
> new larger project I've come to realize that Git does not have config
> specs and so seems to be missing an important feature for managing large
> projects.
>
> We use configuration specs to select directories from a common code base
> (repo) and map them into different baselines to create multiple product
> builds with different feature sets. We used this feature in VCSs such as
> ClearCase and Perforce. Ultimately this allows us to manage the repo in
> one directory structure and create product builds with a different one.
> For example the repo has multiple directories for different
> products/targets, but a baseline, the workspace, has only one target
> directory always with the same name mapped to the same location.
> Obviously the corresponding directories in the repo have different
> names.

So from what I gather after reading this, I suspect that you have a main
branch with a full tree, and you want to have a way to check out only
parts of the tree.

This concept has been brought up before, in
https://lore.kernel.org/git/pull.627.git.1588857462.gitgitgadget@gmail.com/:
proposing a way to define what parts of the tree should be checked out in
a sparse checkout.

However, it looks as if your config specs also allow you to map the
directories in the Git revisions to different locations and maybe even
names?

Such a concept has not come up on the Git mailing list.

There _have_ been ideas floating around in Git for Windows, mainly to
allow for checking out revisions that rely on file names that are illegal
on Windows (such as file names containing backslashes, or reserved names
such as `aux.c`).

Nothing came of those ideas, though, mainly because nobody snatched up the
baton to work on a concrete patch to implement this.

I should point out, though, that the concept of a sparse checkout is
independent from the concept of mapping file/directory names in the Git
revision to different ones in the Git worktree.
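
A minimal sketch of the sparse-checkout half of this, assuming Git 2.25
or newer and a scratch repository whose directory names (`product-a`,
`product-b`) are purely hypothetical. It only selects which directories
appear in the worktree; it does not rename or relocate anything:

```shell
# Scratch repository with two hypothetical product directories.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q &&
mkdir product-a product-b &&
echo a >product-a/main.c &&
echo b >product-b/main.c &&
git add . &&
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m "two products" &&

# Restrict the worktree to product-a; product-b remains in the
# committed tree but is no longer checked out.
git sparse-checkout init --cone &&
git sparse-checkout set product-a &&
ls
```

The commit itself is unchanged by this: `git ls-tree HEAD` still lists
both directories, which is why the commit id is unaffected.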

> Git supports the notion of submodules, but I see no way to map a
> submodule directory to a different name, remove unwanted subdirs of a
> submodule, or map a submodule over a subdirectory of the primary repo.
> Config specs also allow you to specify a specific branch or version that
> you want to map to your workspace independent of other directories,
> branches and versions.

The idea of letting directories in the same Git worktree originate from
_different_ revisions is very, very foreign to the fundamental Git concept
of what constitutes a commit. A commit is very much a snapshot of the
entire tree. And when you make a new commit, it is again very much a
snapshot of the entire tree, based on a single parent commit.

So I doubt that you will be able to come up with a workable design to let
Git replicate this functionality.
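
To make the snapshot model above concrete, here is a small sketch on a
throwaway repository (the directory and file names are assumptions made
up for the example). It shows that a commit object names exactly one
top-level tree, which in turn covers every directory:

```shell
# Throwaway repository with two hypothetical directories.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q &&
mkdir docs src &&
echo d >docs/readme.txt &&
echo s >src/main.c &&
git add . &&
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m "initial snapshot" &&

# The commit object records a single "tree" line...
git cat-file -p HEAD &&
# ...and that one tree lists every top-level directory of the snapshot.
git ls-tree --name-only HEAD
```

There is no per-directory revision recorded anywhere in the commit, so
"docs/ from one version, src/ from another" has no native representation.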

> I suppose it may be possible to achieve the same result by treating the
> primary repo as the configspec. But I feel like there are some features
> config specs support that i do not have using submodules, but might need
> down the road.

I agree that submodules are unlikely to give you what you want.

> I can see that omitting, obscuring, or overwriting parts of a repo would
> not play well with the commit id. So I imagine there could be some real
> complications trying to add support for the notion of a flexible config
> spec.

Indeed.

The only way I can see that you can _somehow_ combine parts of multiple
revisions into one worktree is by transforming those parts into a single
commit, quite possibly by scripting the transformation.

For example, if you wanted to map, say, `Documentation/technical/` of the
tag `v2.34.0` to `tech-specs/` and `compat/poll/` of the tag `v2.30.0` to
`poll-emulation/` in a clone of https://github.com/git/git, you could use
something like this to create a new branch:

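(
	# Use a scratch index so the real index and worktree stay untouched.
	GIT_INDEX_FILE=.tmp-index &&
	export GIT_INDEX_FILE &&
	# A stale scratch index from an earlier run would make read-tree fail.
	rm -f "$GIT_INDEX_FILE" &&
	git read-tree --prefix=tech-specs/ v2.34.0:Documentation/technical &&
	git read-tree --prefix=poll-emulation/ v2.30.0:compat/poll &&
	tree=$(git write-tree) &&
	# Without -m, commit-tree reads the commit message from stdin.
	commit=$(git commit-tree "$tree" -p v2.34.0^0 -p v2.30.0^0 \
		-m "Map tech-specs/ and poll-emulation/") &&
	git branch my-generated-branch "$commit"
)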

This would give you a full Git branch that could be checked out and has
the mapping.

You would have to play similar tricks if you wanted to transport committed
changes from that branch back to the originating commit histories.
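
One possible shape for such a trick, sketched on a scratch repository
rather than on git.git: take the diff of a commit made in the mapped
layout and re-root it with `git apply -p<n> --directory=<root>`. The
branch names (`mapped`, `backport`) and the file `spec.txt` are
hypothetical, and this is only one way to do it, not an established
workflow:

```shell
# Scratch repository; every name below is made up for the example.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q &&
git config user.name Example &&
git config user.email example@example.com &&
mkdir -p Documentation/technical &&
echo v1 >Documentation/technical/spec.txt &&
git add . &&
git commit -q -m "originating history" &&
origin=$(git rev-parse HEAD) &&

# Forward mapping: Documentation/technical -> tech-specs/ on a new branch.
(
	GIT_INDEX_FILE=.tmp-index &&
	export GIT_INDEX_FILE &&
	git read-tree --prefix=tech-specs/ "$origin":Documentation/technical &&
	tree=$(git write-tree) &&
	commit=$(git commit-tree "$tree" -p "$origin" -m "mapped view") &&
	git branch mapped "$commit" &&
	rm -f .tmp-index
) &&

# Commit a change in the mapped layout.
git checkout -q mapped &&
echo v2 >tech-specs/spec.txt &&
git commit -q -am "update spec in mapped layout" &&

# Reverse mapping: -p2 strips "a/tech-specs/" from the patch paths and
# --directory prepends the original location before applying.
git checkout -q -b backport "$origin" &&
git diff mapped~1 mapped -- tech-specs/ |
	git apply --index -p2 --directory=Documentation/technical &&
git commit -q -m "update spec (carried back from mapped layout)" &&
cat Documentation/technical/spec.txt
```

Each mapped directory needs its own diff/apply pass with its own prefix,
which is part of why this does not scale into a pleasant workflow.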

So yes, it is _somewhat_ possible to replicate what you can do with config
specs, it is just unlikely to ever offer a good user experience.

Ciao,
Johannes


* Re: Config spec for git
  2021-11-17 15:38   ` Philip Oakley
@ 2021-11-22 18:19     ` Martin von Zweigbergk
  0 siblings, 0 replies; 5+ messages in thread
From: Martin von Zweigbergk @ 2021-11-22 18:19 UTC
  To: Philip Oakley
  Cc: Ævar Arnfjörð Bjarmason, Wallace,
	Brooke T (US 349D-Affiliate),
	git

On Wed, Nov 17, 2021 at 8:38 AM Philip Oakley <philipoakley@iee.email> wrote:
>
> It feels like your config specs are like feature branches, but that what
> is missing (from Git, relative to the config spec) is a merge strategy
> that can define which particular files/folders are merged at the one
> time, rather than the current 'all files' being merged. This desire has
> come up a few times when large (corporate?)  projects need to merge
> large independent feature branches that will need different specialists
> to handle different groups of files (i.e. partial merges, e.g. [1]), but
> that hasn't been implemented (yet) as it would need someone to think it
> through and work on it.
> --
> Philip
> [1]
> https://lore.kernel.org/git/BY5PR19MB3400EB9AD87DFE612AFD5CC390810@BY5PR19MB3400.namprd19.prod.outlook.com/
>
> Collaborative conflict resolution feature request
>

FWIW, I've spent a lot of time thinking about this and implementing it
in https://github.com/martinvonz/jj. Being able to represent a
conflicted file state has a lot of benefits, of which collaborative
conflict resolution is among the smaller ones (allowing automatic
rebase every time a commit is rewritten is much more useful, IMO). I
need to document the design, but an old version of it (from before I
rewrote the README to target users) is at [1]. The design can of
course be copied to Git.

[1] https://github.com/martinvonz/jj/blob/2879d817dd1021f8dc2ea5e42000c1d5d50e4fc7/README.md#commits-can-contain-conflicts
