* [RFC] Define "precious" attribute and support it in `git clean`
@ 2023-10-10 12:37 Sebastian Thiel
  2023-10-10 13:38 ` Kristoffer Haugsbakk
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-10 12:37 UTC (permalink / raw)
  To: git; +Cc: Josh Triplett

[Note: I'm collaborating with Josh Triplett (CCed) on the design.]

I'd like to propose adding a new standard gitattribute "precious".  I've
included proposed documentation at the end of this mail, and I'm happy to write
the code.  I wanted to get feedback on the concept first.

What's a 'precious' file?

"Precious" files are files that are specific to a user or local configuration
and thus not tracked by Git.  As such, a user spent time to create or generate
them, and to tune them to fit their needs.  They are typically also ignored by
`git` due to `.gitignore` configuration, preventing them from being tracked by
accident.

This proposal suggests making them known to Git using git-attributes so that
`git clean` can be taught to treat them with care.

Example: A Linux Kernel .config file

Users can mark the `.config` file as 'precious' using `.gitattributes`:

    /.config precious

When checking which ignored files `git clean -nx` would remove, we would see
the following.

    Would remove precious .config
    Would remove scripts/basic/.fixdep.cmd
    Would remove scripts/basic/fixdep
    Would remove scripts/kconfig/.conf.cmd


This highlights precious files by calling them out, but doesn't change the
behaviour of existing flags.  Instead, the new flag `-p` is added which lets
`git clean` spare precious files.

Thus `git clean -np` would print:

    Would remove scripts/basic/.fixdep.cmd
    Would remove scripts/basic/fixdep
    Would remove scripts/kconfig/.conf.cmd

The precious file is not part of the set of files to be removed anymore.

`git clean -[n|f] -xp` will fail with an error indicating that `-x` and `-p`
are mutually exclusive.  The hope is that people can replace some of their
usage of `-x` with `-p` to preserve precious files, while continuing to use
`-x` if they want a completely clean working directory.
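
For comparison, here is a minimal sketch of how the effect of the proposed
`-p` can be approximated with options `git clean` already has, by combining
`-x` with an explicit `-e` pattern.  The repository layout and file names are
made up for illustration, and the pattern has to be repeated on every
invocation, which is what the attribute is meant to avoid.

```shell
set -e
# Throwaway repository so that nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

# Commit the ignore rules so that -x does not list .gitignore itself.
printf '.config\n*.o\n' >.gitignore
git add .gitignore
git commit -q -m 'add ignore rules'

# One hand-tuned file and one build product, both ignored.
echo 'CONFIG_FOO=y' >.config
echo 'object code' >fixdep.o

# Dry run: -x drops the standard ignore rules, -e keeps .config off the list.
out=$(git clean -n -x -e .config)
echo "$out"
```

The dry run should list only `fixdep.o`; `.config` is spared because the `-e`
pattern still applies under `-x`.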

Additional Benefits

`git clean -fdp` can now be used to restore the user's directory to a pristine
post-clone state while keeping all files and directories the project or user
identifies as precious.  There is less fear of accidentally deleting files
which are required for local development or otherwise represent a time
investment.

Example: A precious IDE configuration directory.

To keep IDE configuration, one can also mark entire directories - the following
could go into a user-specific gitattributes file denoted by the
`core.attributesFile` configuration.

    /.idea/** precious

With this attributes file in place, `git clean -ndx` would produce the
following output...

    Would remove .DS_Store
    Would remove precious .idea/

...while `git clean -ndp` would look like this:

    Would remove .DS_Store

Here's a patch showing what the documentation could look like.  Happy to write
the corresponding code.

---
diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index 5e1a3d5148..5b2eab6573 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -60,6 +60,10 @@ OPTIONS
 	Use the given exclude pattern in addition to the standard ignore rules
 	(see linkgit:gitignore[5]).
 
+-p::
+	Remove ignored files as well (like `-x`), but preserve "precious"
+	files (see linkgit:gitattributes[5]).
+
 -x::
 	Don't use the standard ignore rules (see linkgit:gitignore[5]), but
 	still use the ignore rules given with `-e` options from the command
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 6deb89a296..f68aadc3c2 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -1248,6 +1248,20 @@ If this attribute is not set or has an invalid value, the value of the
 (See linkgit:git-config[1]).
 
 
+Preserving precious files
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+`precious`
+^^^^^^^^^^
+
+A file marked as `precious` will be preserved when running linkgit:git-clean[1]
+with the `-p` option. Use this attribute for files such as a Linux kernel
+`.config` file, which are not tracked by git because they contain user-specific
+or build-specific configuration, but which contain valuable information that a
+user spent time and effort to create.
+
+
+
 USING MACRO ATTRIBUTES
 ----------------------
 

What do you think?

Thanks for your feedback,
Sebastian

^ permalink raw reply related [flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 12:37 [RFC] Define "precious" attribute and support it in `git clean` Sebastian Thiel
@ 2023-10-10 13:38 ` Kristoffer Haugsbakk
  2023-10-10 14:10   ` Josh Triplett
  2023-10-10 17:02 ` Junio C Hamano
  2023-10-11 21:41 ` Kristoffer Haugsbakk
  2 siblings, 1 reply; 28+ messages in thread
From: Kristoffer Haugsbakk @ 2023-10-10 13:38 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: Josh Triplett, git

Hi Sebastian

On Tue, Oct 10, 2023, at 14:37, Sebastian Thiel wrote:
> This highlights precious files by calling them out, but doesn't change the
> behaviour of existing flags.  Instead, the new flag `-p` is added which lets
> `git clean` spare precious files.

Why can't `clean` preserve precious files by default? And then delete them
as well with something like `--no-keep-precious`? Is there some backwards
compatibility concern?


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 13:38 ` Kristoffer Haugsbakk
@ 2023-10-10 14:10   ` Josh Triplett
  2023-10-10 17:07     ` Junio C Hamano
  2023-10-10 19:10     ` Kristoffer Haugsbakk
  0 siblings, 2 replies; 28+ messages in thread
From: Josh Triplett @ 2023-10-10 14:10 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Sebastian Thiel, git

On Tue, Oct 10, 2023 at 03:38:51PM +0200, Kristoffer Haugsbakk wrote:
> Hi Sebastian
> 
> On Tue, Oct 10, 2023, at 14:37, Sebastian Thiel wrote:
> > This highlights precious files by calling them out, but doesn't change the
> > behaviour of existing flags.  Instead, the new flag `-p` is added which lets
> > `git clean` spare precious files.
> 
> Why can't `clean` preserve precious files by default? And then delete them
> as well with something like `--no-keep-precious`? Is there some backwards
> compatibility concern?

While I'd love for it to default to that and require an extra option to
clean away precious files, I'd expect that that would break people's
workflows and finger memory. If someone expects `git clean -x -d -f` to
clean away everything, including `.config`, and then it leaves some
files in place, that seems likely to cause problems. (Leaving aside that
it might break scripted workflows.)

It seems safer to keep the existing behavior for existing options, and
add a new option for "remove everything except precious files".


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 12:37 [RFC] Define "precious" attribute and support it in `git clean` Sebastian Thiel
  2023-10-10 13:38 ` Kristoffer Haugsbakk
@ 2023-10-10 17:02 ` Junio C Hamano
  2023-10-11 10:06   ` Richard Kerry
                     ` (2 more replies)
  2023-10-11 21:41 ` Kristoffer Haugsbakk
  2 siblings, 3 replies; 28+ messages in thread
From: Junio C Hamano @ 2023-10-10 17:02 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: git, Josh Triplett

Sebastian Thiel <sebastian.thiel@icloud.com> writes:

> I'd like to propose adding a new standard gitattribute "precious".

;-).

Over the years, I have many times seen scenarios that would have been
helped if we had not just "tracked? ignored? unignored?" but also
the fourth kind [*].  The word "ignored" (or "excluded") has always
meant "not tracked, not to be tracked, and expendable" to Git, and
"ignored but unexpendable" class was missing.  I even used the term
"precious" myself in those discussions.  At the concept level, I
support the effort 100%, but as always, the devil will be in the
details.

Scenarios that people wished for "precious" traditionally have been

 * You are working on 'master'.  You have in your .gitignore or
   .git/info/exclude a line to ignore path A, and have random
   scribbles in a throw-away file there.  There is another branch
   'seen', where they added some tracked contents at path A/B.  You
   do "git checkout seen" and your file A that is an expendable file,
   because it is listed as ignored in .git/info/exclude, is removed
   to make room for creating A/B.

 * Similar situation, but this time, 'seen' branch added a tracked
   contents at path A.  Again, "git checkout seen" will discard the
   expendable file A and replace it with tracked contents.

 * Instead of "git checkout", you decide to merge the branch 'seen'
   to the checkout of 'master', where you have an ignored path A.
   Because merging 'seen' would need to bring the tracked contents
   of either A/B (in the first scenario above) or A (in the second
   scenario), your "expendable" A will be removed to make room.
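
The second scenario can be reproduced with a short script.  This is only a
sketch in a throwaway repository; the file contents and the use of `git add
-f` to track the ignored path on 'seen' are choices made for the
illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

echo base >base
git add base
git commit -q -m base
start=$(git symbolic-ref --short HEAD)   # whatever the initial branch is called

# Locally mark path A as ignored, i.e. expendable.
mkdir -p .git/info
echo A >>.git/info/exclude

# On 'seen', A is tracked (forced past the local ignore rule).
git checkout -q -b seen
echo tracked >A
git add -f A
git commit -q -m 'track A'

# Back on the starting branch, A is an ignored scratch file.
git checkout -q "$start"
echo scribbles >A

# Switching branches replaces the ignored file without complaint.
git checkout -q seen
content=$(cat A)
echo "$content"
```

After the final checkout the scribbles are gone and `A` holds the tracked
contents from 'seen'.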

In previous discussions, nobody was disturbed that "git clean" was
unaware of the "precious" class, but if we were to have the
"precious" class in addition to "ignored" aka "expendable", I would
not be opposed to teaching "git clean" about it, too.

There was an early and rough design draft there in

https://lore.kernel.org/git/7vipsnar23.fsf@alter.siamese.dyndns.org/

which probably is worth a read, too.

Even though I referred to the precious _attribute_ in some of these
discussions, between the attribute mechanism and the ignore
mechanism, I am actually leaning toward suggesting to extend the
exclude/ignore mechanism to introduce the "precious" class.  That
way, we can avoid possible snafu arising from marking a path in
.gitignore as ignored, and in .gitattributes as precious, and have to
figure out how these two settings are to work together.

In any case, the "precious" paths are expected to be a small minority
of what people never want to "git add" or "git commit", so coming up
with a special syntax to be used in .gitignore, even if that special
syntax is ugly and cumbersome to type, would be perfectly OK.


[Reference]

 * https://lore.kernel.org/git/7viptp9jos.fsf@alter.siamese.dyndns.org/
 * https://lore.kernel.org/git/xmqqva534vnb.fsf@gitster-ct.c.googlers.com/


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 14:10   ` Josh Triplett
@ 2023-10-10 17:07     ` Junio C Hamano
  2023-10-12  8:47       ` Josh Triplett
  2023-10-10 19:10     ` Kristoffer Haugsbakk
  1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2023-10-10 17:07 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Kristoffer Haugsbakk, Sebastian Thiel, git

Josh Triplett <josh@joshtriplett.org> writes:

> While I'd love for it to default to that and require an extra option to
> clean away precious files, I'd expect that that would break people's
> workflows and finger memory. If someone expects `git clean -x -d -f` to
> clean away everything, including `.config`, and then it leaves some
> files in place, that seems likely to cause problems. (Leaving aside that
> it might break scripted workflows.)

I thought the point of introducing the new "precious" class of
paths, in addition to the current "tracked", "ignored, untracked,
and expendable", "not ignored and untracked", is so that people can
do "git clean -x -d -f" and expect the ".config" that is marked as
"precious" to stay.  Before their Git learned the precious class, if
they marked ".config" as "ignored, untracked, and expendable", then
such an invocation of "clean" would have removed it, but if they add
it to the new "precious" class, their expectation ought to be that
precious ones are not removed, no?  Otherwise I am not quite sure
what the point of adding such a new protection is.


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 14:10   ` Josh Triplett
  2023-10-10 17:07     ` Junio C Hamano
@ 2023-10-10 19:10     ` Kristoffer Haugsbakk
  2023-10-12  9:04       ` Josh Triplett
  1 sibling, 1 reply; 28+ messages in thread
From: Kristoffer Haugsbakk @ 2023-10-10 19:10 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Sebastian Thiel, git

 Hi Josh

On Tue, Oct 10, 2023, at 16:10, Josh Triplett wrote:
> > [snip]
>
> While I'd love for it to default to that and require an extra option to
> clean away precious files, I'd expect that that would break people's
> workflows and finger memory. If someone expects `git clean -x -d -f` to
> clean away everything, including `.config`, and then it leaves some
> files in place, that seems likely to cause problems. (Leaving aside that
> it might break scripted workflows.)
>
> It seems safer to keep the existing behavior for existing options, and
> add a new option for "remove everything except precious files".

What's a scenario where it breaks? I'm guessing:

1. Someone clones a project
2. That project has precious files marked via `.gitattributes`
3. They later do a `clean`
4. The precious files are left alone even though they expected them to be
   deleted; they don't check what `clean` did (it deletes everything
   untracked (they expect) so nothing to check)
5. This hurts them somehow

It seems that the only files that should be deleted with expediency are
secrets. But then why or how would:

1. The project mark such files as precious
2. The user introduces these files (they are precious hence they were not
   part of the clone)
3. They are never deleted

This sounds unlikely to me. And if it was some kind of malignant vector
then all would be vulnerable to it (not just legacy scripts/legacy hands).

What am I missing?

-- 
Kristoffer


* RE: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 17:02 ` Junio C Hamano
@ 2023-10-11 10:06   ` Richard Kerry
  2023-10-11 22:40     ` Jeff King
  2023-10-11 23:35     ` Junio C Hamano
  2023-10-12 10:55   ` Sebastian Thiel
  2023-10-14  5:59   ` Josh Triplett
  2 siblings, 2 replies; 28+ messages in thread
From: Richard Kerry @ 2023-10-11 10:06 UTC (permalink / raw)
  To: git


> > I'd like to propose adding a new standard gitattribute "precious".
> 
> ;-).

The version of CVS that I used to use, CVSNT, was a lot more careful about the user's files than Git is inclined to be.
If CVSNT, while doing an Update, came across a non-tracked file that was in the way of something that it wanted to write, then the Update would be aborted showing a list of any files that were "in the way".  The user could then rename/delete them or redo the Update with a "force" parameter to indicate that such items could be overwritten.
Git has tended to take an approach of "if it's important it'll be tracked by Git - anything else can be trashed with impunity."  Over the years people have been caught out by this and lost work.  It may well be that in a Linux development world anything other than tracked source files can be summarily deleted, but in a wider world, like Windows, or environments that are not software development, or that need special files lying around, this is not always an entirely reasonable approach.

> Over the years, I've seen many times scenarios that would have been helped
> if we had not just "tracked? ignored? unignored?" but also the fourth kind
> [*].  The word "ignored" (or "excluded") has always meant "not tracked, not
> to be tracked, and expendable" to Git, and "ignored but unexpendable" class
> was missing.  I even used the term "precious" myself in those discussions.  At
> the concept level, I support the effort 100%, but as always, the devil will be in
> the details.
> 
> Scenarios that people wished for "precious" traditionally have been
> 
>  * You are working on 'master'.  You have in your .gitignore or
>    .git/info/exclude a line to ignore path A, and have random
>    scribbles in a throw-away file there.  There is another branch
>    'seen', where they added some tracked contents at path A/B.  You
>    do "git checkout seen" and your file A that is an expendable file,
>    because it is listed as ignored in .git/info/exclude, is removed
>    to make room for creating A/B.

So checkout aborts, saying "A is in the way".

>  * Similar situation, but this time, 'seen' branch added a tracked
>    contents at path A.  Again, "git checkout seen" will discard the
>    expendable file A and replace it with tracked contents.

So checkout aborts, saying "A is in the way".

>  * Instead of "git checkout", you decide to merge the branch 'seen'
>    to the checkout of 'master', where you have an ignored path A.
>    Because merging 'seen' would need to bring the tracked contents
>    of either A/B (in the first scenario above) or A (in the second
>    scenario), your "expendable" A will be removed to make room.

So merge aborts, saying "A is in the way".  It is entirely conventional to have merge conflicts that the user needs to resolve.  This is just another kind of conflict.

> In previous discussions, nobody was disturbed that "git clean" was unaware
> of the "precious" class, but if we were to have the "precious" class in addition
> to "ignored" aka "expendable", I would not oppose to teach "git clean" about
> it, too.

Indeed, if something is explicitly precious then nothing should summarily delete it.

I know this goes against some stated design decisions of early Git, but in the CVSNT world *all* files were considered precious and would always cause an update to be aborted if there were any inclination to replace them.

An option might be to state, in config, whether a project, or everything, should be managed on the basis of "all untracked files are precious" or "files may be explicitly marked precious", or, as now, "nothing is precious".

Regards,
Richard.

PS.  I think I've caught all places where my fingers typed "previous" when my brain meant "precious" - apologies if I've missed any.




* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 12:37 [RFC] Define "precious" attribute and support it in `git clean` Sebastian Thiel
  2023-10-10 13:38 ` Kristoffer Haugsbakk
  2023-10-10 17:02 ` Junio C Hamano
@ 2023-10-11 21:41 ` Kristoffer Haugsbakk
  2 siblings, 0 replies; 28+ messages in thread
From: Kristoffer Haugsbakk @ 2023-10-11 21:41 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: Josh Triplett, git

On Tue, Oct 10, 2023, at 14:37, Sebastian Thiel wrote:
> [Note: I'm collaborating with Josh Triplett (CCed) on the design.]
>
> I'd like to propose adding a new standard gitattribute "precious".  I've
> included proposed documentation at the end of this mail, and I'm happy to write
> the code.  I wanted to get feedback on the concept first.
>
> What's a 'precious' file?
>
> "Precious" files are files that are specific to a user or local configuration
> and thus not tracked by Git.  As such, a user spent time to create or generate
> them, and to tune them to fit their needs.  They are typically also ignored by
> `git` due to `.gitignore` configuration, preventing them to be tracked by
> accident.

How do people deal with these precious files today?

You could track these precious files somewhere else. Maybe a Git directory
which is a sibling of `.git`.

    .git-local

Files that are useless to the project but important to the individual.

It would get its own “excludes” file.

    .gitignore-local

Now the normal repository (`.git`) needs to ignore these things.

    # Cast a wide net: could want other siblings
    # Use another pattern if you have something like `.git-blame-ignore-revs`
    printf '.git-*\n'       >> .git/info/exclude
    printf '.gitignore-*\n' >> .git/info/exclude

You need to pass in two arguments to `git` every time you want to use
`.git-local`.

    alias gitl='git --git-dir=.git-local -c core.excludesFile=.gitignore-local'

Git Local should ignore everything by default. You should check with
`.gitignore` to make sure that it ignores the files that Git Local does
_not_ ignore.

    printf '*\n'                 >> .gitignore-local
    printf '!.gitignore-local\n' >> .gitignore-local
    printf '!.idea/**\n'         >> .gitignore-local

Now you can backup your local files.

    gitl add .gitignore-local
    gitl add .idea
    gitl commit -m'Update local'

But you can also version control them by providing real (intentional)
messages.

(Maybe `.idea/` is just an XML soup and thus hard to make a VCS narrative
around; I don't know yet.)

`git clean` won't help you. But an alias can.

Or if writing a shell-oneliner alias is too hard for you, I mean me.

    #!/bin/sh
    # git-klean
    git --git-dir=.git-local -c core.excludesFile=.gitignore-local add --all
    git --git-dir=.git-local -c core.excludesFile=.gitignore-local commit -mUpdate
    git clean -e .gitignore-local -e .git-local "$@"

The two `-e` switches protect the Git Local things from being wiped by
`-xd` (tested with `--dry-run`).

§ Sibling repositories

At first I thought that `git clean` could use a `pre-clean` hook. But
that's not very satisfying.

Maybe Git could be told about its siblings via a multi-valued
configuration variable.

    sibling=local

Then it expects there to exist `.git-<sibling name>` repository next to
it (`.git/../.git-<sibling name>`).

Then the rule becomes:

Only do destructive operations if all of the working trees[1] of the
sibling repositories are clean (cannot override with `--force`).

Additionally one could say that directories that are Git repositories
should be ignored by `git clean`, always.[2] Or siblings which
match the glob:

    .git-*

... Or if you want something longer:

    .git-sibling-*

(And also ignore `.gitignore-*` and maybe more things)

Then the regular Git repository might still blow away your precious
files. But they will be backed up by the siblings.

Or you put these other Git repositories outside of `.git/..`. Sidestepping
the issue at the cost of some path confusion (for yourself). Maybe at:

    /home/user/git-siblings/repository1/local

† 1: Worktrees not considered.
† 2: And what would that break? People who make Git repositories in their
   working trees and then delete them? (Well they can still use `rm -r`.)

⧫ ⧫

§ Worktrees

But seriously: worktrees probably make this not work.

-- 
Kristoffer


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-11 10:06   ` Richard Kerry
@ 2023-10-11 22:40     ` Jeff King
  2023-10-11 23:35     ` Junio C Hamano
  1 sibling, 0 replies; 28+ messages in thread
From: Jeff King @ 2023-10-11 22:40 UTC (permalink / raw)
  To: Richard Kerry; +Cc: git

On Wed, Oct 11, 2023 at 10:06:25AM +0000, Richard Kerry wrote:

> The version of CVS that I used to use, CVSNT, was a lot more careful
> about the user's files than Git is inclined to be.
> If CVSNT, while doing an Update, came across a non-tracked file that
> was in the way of something that it wanted to write, then the Update
> would be aborted showing a list of any files that were "in the way".
> The user could then rename/delete them or redo the Update with a
> "force" parameter to indicate that such items could be overwritten.
> Git has tended to take an approach of "if it's important it'll be
> tracked by Git - anything else can be trashed with impunity.".  Over
> the years people have been caught out by this and lost work.  It may
> well be that in a Linux development world anything other than tracked
> source files can be summarily deleted, but in a wider world, like
> Windows, or environments that are not software development, or that
> need special files lying around, this is not always an entirely
> reasonable approach.

I'm not sure if you are just skipping the details of ".gitignore" here,
but to be clear, blowing away untracked files is _not_ Git's default
behavior.

For example:

  [sample repo with established history]
  $ git init
  $ echo content >base
  $ git add base
  $ git commit -m base

  [one branch touches some-file]
  $ git checkout -b side-branch
  $ echo whatever >some-file
  $ git add some-file
  $ git commit -m 'add some-file'

  [but back on master/main, it is untracked]
  $ git checkout main
  $ echo precious >some-file

  [an operation that tries to overwrite the untracked file will fail]
  $ git checkout side-branch
  error: The following untracked working tree files would be overwritten by checkout:
	some-file
  Please move or remove them before you switch branches.
  Aborting

  [providing --force will obliterate it]
  $ git checkout --force side-branch
  Switched to branch 'side-branch'

The issue that people sometimes find with Git is when the user has
explicitly listed a file in ".gitignore", Git takes that to mean it
should never be tracked _and_ it is not precious. But people sometimes
want a way to say "this should never be tracked, but keep treating it as
precious in the usual way".
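
For a single invocation, `git checkout` has a documented
`--no-overwrite-ignore` option that asks for this treatment.  The following is
a sketch only, reusing the file and branch names from the example above in a
throwaway repository; the `result` and `kept_content` variables exist just to
make the outcome visible.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.invalid"

echo content >base
git add base
git commit -q -m base
start=$(git symbolic-ref --short HEAD)

# some-file is ignored locally but tracked on side-branch.
mkdir -p .git/info
echo some-file >>.git/info/exclude
git checkout -q -b side-branch
echo whatever >some-file
git add -f some-file
git commit -q -m 'add some-file'

# Back on the starting branch it is an ignored, untracked file.
git checkout -q "$start"
echo precious >some-file

# With --no-overwrite-ignore the checkout should refuse instead of clobbering it.
if git checkout -q --no-overwrite-ignore side-branch 2>/dev/null; then
    result=overwritten
else
    result=kept
fi
kept_content=$(cat some-file)
echo "$result: $kept_content"
```

This has to be requested on every checkout, so it does not replace a
per-path way of marking files as precious.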

From the description above it might sound like Git's current behavior is
conflating two orthogonal things, but if you switched the default
behavior of .gitignore'd files to treat them as precious, you will find
lots of cases that are annoying. E.g., if a file is generated by some
parts of history and tracked in others, you'd have to use --force to
move between them to overwrite the generated version.

-Peff


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-11 10:06   ` Richard Kerry
  2023-10-11 22:40     ` Jeff King
@ 2023-10-11 23:35     ` Junio C Hamano
  1 sibling, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2023-10-11 23:35 UTC (permalink / raw)
  To: Richard Kerry; +Cc: git

Richard Kerry <richard.kerry@eviden.com> writes:

> An option might be to state, in config, whether a project, or
> everything, should be managed on the basis of "all untracked files
> are precious" or "files may be explicitly marked precious", or, as
> now, "nothing is precious".

I do not think there is any need to have a separate "all or none"
option.  We do not have to make things more complicated than
necessary.

If all untracked files are precious, a user should be able to say so
with an entry that matches all paths "*" to mark them precious, and
nothing more needs to be done.  By default nothing is ignored and
nothing is precious, until you start marking paths with .gitignore
entries.


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 17:07     ` Junio C Hamano
@ 2023-10-12  8:47       ` Josh Triplett
  0 siblings, 0 replies; 28+ messages in thread
From: Josh Triplett @ 2023-10-12  8:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kristoffer Haugsbakk, Sebastian Thiel, git

On Tue, Oct 10, 2023 at 10:07:08AM -0700, Junio C Hamano wrote:
> Josh Triplett <josh@joshtriplett.org> writes:
> 
> > While I'd love for it to default to that and require an extra option to
> > clean away precious files, I'd expect that that would break people's
> > workflows and finger memory. If someone expects `git clean -x -d -f` to
> > clean away everything, including `.config`, and then it leaves some
> > files in place, that seems likely to cause problems. (Leaving aside that
> > it might break scripted workflows.)
> 
> I thought the point of introducing the new "precious" class of
> paths, in addition to the current "tracked", "ignored, untracked,
> and expendable", "not ignored and untracked", is so that people can
> do "git clean -x -d -f" and expect the ".config" that is marked as
> "precious" to stay.  Before their Git learned the precious class, if
> they marked ".config" as "ignored, untracked, and expendable", then
> such an invocation of "clean" would have removed it, but if they add
> it to the new "precious" class, their expectation ought to be that
> precious ones are not removed, no?  Otherwise I am not quite sure
> what the point of adding such a new protection is.

I'd expect a lot of projects to move things *from* the current "ignored"
state to "precious", once "precious" exists. Linux `.config`, for
instance.

That said, I do agree that the ideal behavior is for clean to preserve
precious files by default, and require an extra option to remove
precious files. If you think that doesn't have backwards-compatibility
considerations, then it certainly seems much easier to jump directly to
that behavior.

- Josh Triplett


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 19:10     ` Kristoffer Haugsbakk
@ 2023-10-12  9:04       ` Josh Triplett
  0 siblings, 0 replies; 28+ messages in thread
From: Josh Triplett @ 2023-10-12  9:04 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Sebastian Thiel, git

On Tue, Oct 10, 2023 at 09:10:34PM +0200, Kristoffer Haugsbakk wrote:
>  Hi Josh
> 
> On Tue, Oct 10, 2023, at 16:10, Josh Triplett wrote:
> > > [snip]
> >
> > While I'd love for it to default to that and require an extra option to
> > clean away precious files, I'd expect that that would break people's
> > workflows and finger memory. If someone expects `git clean -x -d -f` to
> > clean away everything, including `.config`, and then it leaves some
> > files in place, that seems likely to cause problems. (Leaving aside that
> > it might break scripted workflows.)
> >
> > It seems safer to keep the existing behavior for existing options, and
> > add a new option for "remove everything except precious files".
> 
> What's a scenario where it breaks? I'm guessing:
> 
> 1. Someone clones a project
> 2. That project has precious files marked via `.gitattributes`
> 3. They later do a `clean`
> 4. The precious files are left alone even though they expected them to be
>    deleted; they don't check what `clean` did (it deletes everything
>    untracked (they expect) so nothing to check)
> 5. This hurts them somehow

The scenario I had in mind was:

- Project has ignored files; git doesn't have a concept of "precious"
- Users expect that `git clean -x -d -f` deletes everything that isn't
  part of the latest commit.
- Git introduces the concept of "precious"
- Project adopts "precious" and marks some of its ignored files as
  "precious" instead
- Users' finger-macros around `git clean` stop cleaning up files they
  expected to be cleaned.

That said, given Junio's response I'm no longer concerned about this
scenario.


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 17:02 ` Junio C Hamano
  2023-10-11 10:06   ` Richard Kerry
@ 2023-10-12 10:55   ` Sebastian Thiel
  2023-10-12 16:58     ` Junio C Hamano
  2023-10-14  5:59   ` Josh Triplett
  2 siblings, 1 reply; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-12 10:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Josh Triplett, Kristoffer Haugsbakk

I liked the idea to see precious files as a sub-class of ignored files, and
investigated possibilities on how to achieve that while keeping the overall
effort low and removing any potential for backwards-incompatibility as well.

Currently, `.gitignore` files only contain one pattern per line, which
optionally may be prefixed with `!` to negate it. This can be escaped with `\!`
- and that's it.

Parsing patterns that way makes for simple parsing without a need for
quoting.

### What about a `$` syntax in `.gitignore` files?

I looked into adding a new prefix, `$` to indicate the following path is
precious or… valuable. It can be escaped with `\$` just like `\!`. 

Doing so has the advantage that older `git` versions simply take the
declaration as literal and would now exclude `$.config`, for example, whereas
newer `git` versions will consider them precious.
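
As a rough illustration of the old-versus-new reading described above, here is
a minimal Python sketch. It is not Git's parser; the function name
`parse_line` and the `understands_precious` switch are hypothetical, and
pattern matching itself is left out.

```python
def parse_line(line, understands_precious):
    """Model one .gitignore line as (kind, pattern).

    Illustrative only; this is not Git's parser. `understands_precious`
    is a hypothetical switch between "older" and "newer" behaviour.
    """
    if line.startswith("\\!") or line.startswith("\\$"):
        # Escaped prefix: the pattern names a literal leading '!' or '$'.
        return ("ignore", line[1:])
    if line.startswith("!"):
        return ("negate", line[1:])
    if understands_precious and line.startswith("$"):
        return ("precious", line[1:])
    # Older behaviour: '$' is an ordinary character in the pattern.
    return ("ignore", line)


print(parse_line("$.config", understands_precious=False))
print(parse_line("$.config", understands_precious=True))
```

With the switch off, the line is read as a plain ignore pattern for a file
literally named `$.config`; with it on, `.config` is reported as precious.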

There is some potential for accidentally excluding files that previously
were untracked with older versions of git, but I'd think chances are low.

#### Example: Linux kernel

`.config` is ignored via `.gitignore`:

    .*

*Unfortunately*, users can't just add a local `.git/info/exclude` file with
`$.config` in it and expect `.config` to be considered precious as the pattern
search order will search this last as it's part of the exclude-globals. The
same is true for per-user git-ignore files. This means that any git would
have the `.*` pattern match before the `$.config` pattern, and stop right there
concluding that it's expendable, instead of precious. This is how users can
expect `.gitignore` files to work, and this is how `!negations` work as well -
the negation has to come after the actual exclusion to be effective.
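
The precedence problem can be sketched in Python as follows. This is a
simplified model under assumptions: sources are passed highest precedence
first, the first source with any matching pattern decides, `fnmatchcase`
stands in for Git's own matcher, and the names `last_match` and `classify`
are made up for the example.

```python
from fnmatch import fnmatchcase


def last_match(path, lines):
    """Return the kind of the last matching line in one source, or None."""
    result = None
    for line in lines:
        if line.startswith("$"):
            kind, pattern = "precious", line[1:]
        else:
            kind, pattern = "expendable", line
        if fnmatchcase(path, pattern):
            result = kind
    return result


def classify(path, sources):
    """Walk sources from highest to lowest precedence; first hit decides."""
    for lines in sources:
        result = last_match(path, lines)
        if result is not None:
            return result
    return "untracked"


repo_gitignore = [".*"]        # shipped by the project
info_exclude = ["$.config"]    # the user's local .git/info/exclude

print(classify(".config", [repo_gitignore, info_exclude]))
```

In this model the project's `.*` matches first, so the user's `$.config` in
the lower-precedence file is never consulted and `.config` stays expendable.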

Thus, to make this work, projects that ship the `.gitignore` files would *have
to add patterns* that make certain files precious.

Alternatively, users have to specify gitignore-overrides on the command-line,
but not all commands (if any except for `git check-ignore`) support that.

In the case of `git clean` one can already pass `--exclude=pattern`, but if
that's needed one doesn't need syntax for precious files in the first place.

**This makes the $precious syntax useful only for projects who chose to opt in,
but makes overrides for users near impossible**.

Such opted-in projects would produce `.gitignore` files like these:

    .*
    $.config

Note that due to the way ignore patterns are searched, the following would
consider `.config` trackable, not precious:

    .*
    $.config
    !.config

It's up to the maintainer of the repository to configure their .gitignore files
correctly, so nothing new either.
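
The "last match wins" rule within a single file can be sketched like this in
Python. It is only a model of the proposal: `classify` is a hypothetical
helper and `fnmatchcase` is a stand-in for Git's pattern matching.

```python
from fnmatch import fnmatchcase


def classify(path, lines):
    """Apply the lines of one .gitignore in order; the last match wins."""
    state = "trackable"
    for line in lines:
        if line.startswith("!"):
            kind, pattern = "trackable", line[1:]
        elif line.startswith("$"):
            kind, pattern = "precious", line[1:]
        else:
            kind, pattern = "expendable", line
        if fnmatchcase(path, pattern):
            state = kind
    return state


print(classify(".config", [".*", "$.config"]))
print(classify(".config", [".*", "$.config", "!.config"]))
```

The first call reports `.config` as precious; in the second, the trailing
`!.config` is the last match, so the file ends up trackable again.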

#### Benefits

* simple implementation, fast
* backwards compatible

#### Disadvantages

* cannot easily be overridden by the user as part of their local settings.
* needs repository-buy-in to be useful
* $file could clash with the file '$file' and cause older git to ignore it

### What about a `precious` attribute?

The search of `.gitattributes` works differently which makes it possible for
users to set attributes on any file or folder easily using their local files.
Using attributes has the added benefit of being extensible as one can start out
with:

```gitattributes
.config precious
```

and optionally continue with…

```gitattributes
.config precious=input
kernel precious=output
```

…to further classify kinds of precious files, probably for their personal use.
Please note that currently pathspecs can't be used to filter by attribute
for files that are ignored and untracked, or I couldn't figure out how.
That even makes sense as it wasn't considered a use-case yet.


#### Benefits

* backwards compatible
* easily extendable with 'tags' or sub-classes of precious files using the
  assignment syntax.
* overridable with user's local files

#### Disadvantages

* any 'exclude' query now also needs a .gitattribute query to support precious
  files (and it's not easy to optimize unless there is a flag to turn precious
  file support on or off)
* `precious` might already be in use as an attribute by some repos, and would
  now gain a possibly different meaning in `git` as well.

### Conclusion

Weighing advantages and disadvantages of both approaches makes me prefer the
`.gitignore` extension. The `.gitattributes` version of it *could* also be
implemented on top of it at a later date. However, it should be gated behind a
configuration flag so that users who need it for local overrides can opt in.
Only they would then pay the cost of the feature, which for most repositories
won't be an issue in the first place.

All this seems a bit too good to be true, and I hope you can show where
it wouldn't work or which dangers or possible issues haven't been
mentioned yet.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-12 10:55   ` Sebastian Thiel
@ 2023-10-12 16:58     ` Junio C Hamano
  2023-10-13  9:09       ` Sebastian Thiel
                         ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Junio C Hamano @ 2023-10-12 16:58 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: git, Josh Triplett, Kristoffer Haugsbakk

Sebastian Thiel <sebastian.thiel@icloud.com> writes:

> ### What about a `$` syntax in `.gitignore` files?
>
> I looked into adding a new prefix, `$` to indicate the following path is
> precious or… valuable. It can be escaped with `\$` just like `\!`. 

I have been regretting that I did not make the quoting syntax
obviously extensible in f87f9497 (git-ls-files: --exclude mechanism
updates., 2005-07-24), which technically was a breaking change (as a
relative pathname that began with '!' was not special, but after
the change, it became necessary to '\'-quote it).  A relative
pathname that begins with '$' would be now broken the same way, but
hopefully the fallout would be minor.  I presume you picked '$'
exactly because of this reason?

I do not think it will be the end of the world if we don't do so,
but it would be really really nice if we at least explored a way (or
two) to make a big enough hole in the syntax to not just add
"precious", but leave room to later add other traits, without having
to worry about breaking the backward compatibility again.  A
simplest and suboptimal way may be to declare that a path that
begins with '$' now needs '\'-quoting (just like your proposal),
reserve '$$' as the precious prefix, and '$' followed by any other
byte reserved for future use, but there may be better ideas.
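
A minimal Python sketch of that reservation scheme, purely as an
illustration: the function name `parse_dollar_prefix` and the returned kind
labels are hypothetical, and nothing here reflects an actual implementation.

```python
def parse_dollar_prefix(line):
    """Model one ignore line under the '$$' scheme as (kind, rest)."""
    if line.startswith("\\$") or line.startswith("\\!"):
        # '\'-quoted: the pathname really begins with '$' or '!'.
        return ("ignore", line[1:])
    if line.startswith("!"):
        return ("negate", line[1:])
    if line.startswith("$$"):
        return ("precious", line[2:])
    if line.startswith("$"):
        # '$' followed by any other byte is kept free for later traits.
        return ("reserved", line)
    return ("ignore", line)


for line in ["$$.config", "$x.config", "\\$literal", "*.o"]:
    print(line, "->", parse_dollar_prefix(line))
```

Only `$$` carries meaning today in this model; every other `$`-prefixed line
is reported as reserved rather than silently treated as a pattern.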

> *Unfortunately*, users can't just add a local `.git/info/exclude` file with
> `$.config` in it and expect `.config` to be considered precious as the pattern
> search order will search this last as it's part of the exclude-globals.

That is nothing new and is the same for ignored files.  The lower
precedence files do not override higher precedence files.

> Thus, to make this work, projects that ship the `.gitignore` files would *have
> to add patterns* that make certain files precious.

Not really.  They do not have to do anything if they are content
with the current Git ecosystem.  And users who have precious stuff
can mark them in the .git/info/exclude, no?  The only case that is
problematic is when the project says 'foo' is ignored and expendable
but the user thinks otherwise.  So to make this work, projects that
ship the ".gitignore" files have to avoid adding patterns to ignore
things that it may reasonably be expected for its users to mark
precious.

> Such opted-in projects would produce `.gitignore` files like these:
>
>     .*
>     $.config

I would understand if you ignored "*~" or "*.o", but why ignore ".*"?

Thanks.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-12 16:58     ` Junio C Hamano
@ 2023-10-13  9:09       ` Sebastian Thiel
  2023-10-13 16:39         ` Junio C Hamano
  2023-10-13 10:06       ` Phillip Wood
  2023-10-13 11:25       ` Oswald Buddenhagen
  2 siblings, 1 reply; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-13  9:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Josh Triplett, Kristoffer Haugsbakk

On 12 Oct 2023, at 18:58, Junio C Hamano wrote:

> I presume you picked '$' exactly because of this reason?

Yes, and because I thought '$' seems a great fit to represent value.

> I do not think it will be the end of the world if we don't do so,
> but it would be really really nice if we at least explored a way (or
> two) to make a big enough hole in the syntax to not just add
> "precious", but leave room to later add other traits, without having
> to worry about breaking the backward compatibility again.  A
> simplest and suboptimal way may be to declare that a path that
> begins with '$' now needs '\'-quoting (just like your proposal),
> reserve '$$' as the precious prefix, and '$' followed by any other
> byte reserved for future use, but there may be better ideas.

Even though I'd love to go with the unextensible option assuming it would last
another 15 years, I can see the appeal of making it extensible from the start.

In a world where '$' is a prefix, I'd also think that it's now possible to
specify exclusion using '$!path' for completeness, if '$$path' marks 'path'
precious.

But if there is now a prefix, I feel that it might as well be chosen so that it
is easier to remember and/or less likely to cause conflicts. I think it must
have been that reason for pathspecs to choose ':' as their prefix, and it seems
to be an equally good choice here.

This would give us the following, taking the Linux kernel as example:

    .*
    !this-file-is-hidden-and-tracked
    :!new-syntax-for-negation-for-completeness
    \!an-ignored-file-with-leading-!
    \:an-ignored-file-with-leading-:-which-is-technically-breaking
    :$.config
    :x-invalid-as-:-needs-either-!-or-$-to-follow-it

Now ':$path' would make any path precious, which is `:$.config` in the example
above.
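
To make the list above concrete, here is a small Python sketch of how such
lines might be classified. It is an assumption-laden model of this proposal
only; `parse_colon_prefix` and its kind labels are invented for the example.

```python
def parse_colon_prefix(line):
    """Model one ignore line under the ':' prefix scheme as (kind, rest)."""
    if line.startswith("\\!") or line.startswith("\\:"):
        # Escaped: an ignored file whose name starts with '!' or ':'.
        return ("ignore", line[1:])
    if line.startswith(":!"):
        return ("negate", line[2:])      # new spelling of negation
    if line.startswith(":$"):
        return ("precious", line[2:])
    if line.startswith(":"):
        raise ValueError("':' must be followed by '!' or '$': " + line)
    if line.startswith("!"):
        return ("negate", line[1:])      # existing spelling of negation
    return ("ignore", line)


for line in [".*", "!tracked", ":!also-tracked", "\\:odd-name", ":$.config"]:
    print(line, "->", parse_colon_prefix(line))
```

A line such as `:x-invalid` raises `ValueError` in this sketch, matching the
last entry of the list.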

How does that 'feel'? Is the similarity to pathspecs without being pathspecs
an anti-feature maybe?

>> Thus, to make this work, projects that ship the `.gitignore` files would *have
>> to add patterns* that make certain files precious.
>
> Not really.  They do not have to do anything if they are content
> with the current Git ecosystem.  And users who have precious stuff
> can mark them in the .git/info/exclude, no?

Yes, but only if they control all the ignore patterns in their global files. If
the repository decides to exclude a file they deem precious, now it won't be
precious anymore as their ':$make-this-precious' pattern is seen sequentially
after the pattern in the repository.

For instance, tooling-specific ignores are typically fully controlled by the
user, like '/.idea/', which could now easily be made precious with ':$/.idea/'.

But as the Linux kernel repository ships with a '.gitignore' file that includes
the '.*' pattern, users won't be able to 'get ahead' of that pattern with their
':$.config' specification.

> The only case that is
> problematic is when the project says 'foo' is ignored and expendable
> but the user thinks otherwise.  So to make this work, projects that
> ship the ".gitignore" files have to avoid adding patterns to ignore
> things that it may reasonably be expected for its users to mark
> precious.

Yes, I think my paragraph above is exactly that but with examples to practice
the new syntax-proposal.

>
>> Such opted-in projects would produce `.gitignore` files like these:
>>
>>     .*
>>     $.config
>
> I would understand if you ignored "*~" or "*.o", but why ignore ".*"?

I don't have an answer; the example is from the Linux kernel repository and was
added in 1e65174a33784 [1].

I am definitely getting excited about the progress the syntax is making :),
thanks for proposing it!

[ Reference ]

1. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e65174a33784


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-12 16:58     ` Junio C Hamano
  2023-10-13  9:09       ` Sebastian Thiel
@ 2023-10-13 10:06       ` Phillip Wood
  2023-10-14 16:10         ` Junio C Hamano
  2023-10-13 11:25       ` Oswald Buddenhagen
  2 siblings, 1 reply; 28+ messages in thread
From: Phillip Wood @ 2023-10-13 10:06 UTC (permalink / raw)
  To: Junio C Hamano, Sebastian Thiel; +Cc: git, Josh Triplett, Kristoffer Haugsbakk

On 12/10/2023 17:58, Junio C Hamano wrote:
> Sebastian Thiel <sebastian.thiel@icloud.com> writes:

Sebastian - thanks for raising this again, it would be really good to 
get a solution for handling "ignored but not expendable" files

> I have been regretting that I did not make the quoting syntax
> obviously extensible in f87f9497 (git-ls-files: --exclude mechanism
> updates., 2005-07-24), which technically was a breaking change (as a
> relative pathname that began with '!' was not special, but after
> the change, it became necessary to '\'-quote it).  A relative
> pathname that begins with '$' would be now broken the same way, but
> hopefully the fallout would be minor.  I presume you picked '$'
> exactly because of this reason?
> 
> I do not think it will be the end of the world if we don't do so,
> but it would be really really nice if we at least explored a way (or
> two) to make a big enough hole in the syntax to not just add
> "precious", but leave room to later add other traits, without having
> to worry about breaking the backward compatibility again.  A
> simplest and suboptimal way may be to declare that a path that
> begins with '$' now needs '\'-quoting (just like your proposal),
> reserve '$$' as the precious prefix, and '$' followed by any other
> byte reserved for future use, but there may be better ideas.

One thought I had was that we could abuse the comment syntax to annotate 
paths something like

#(keep)
/my-precious-file

would prevent /my-precious-file from being deleted by git clean (and 
hopefully unpack-trees()[1]). It means that older versions of git would 
treat the file as ignored. If we ever want more than one annotation per 
path we could separate them with commas

#(keep,something-else)
/my-file

Strictly speaking it is a backward incompatible change but I doubt there 
are many people using comments like that. I also wondered about some 
kind of suffix on the file

/my-precious-file #(keep)

but that means that older versions of git would not ignore the file.
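
A rough Python sketch of the line-before annotation variant, under the
assumption that an annotation attaches to the next pattern line and is then
dropped. The name `parse_annotated` and the regular expression are
hypothetical; nothing like this exists in git today.

```python
import re

# '#(' ... ')' on a line of its own; a plain '# comment' does not match.
ANNOTATION = re.compile(r"^#\(([^)]*)\)\s*$")


def parse_annotated(lines):
    """Return (pattern, annotations) pairs for the given ignore lines."""
    entries = []
    pending = []
    for line in lines:
        match = ANNOTATION.match(line)
        if match:
            pending = [t.strip() for t in match.group(1).split(",") if t.strip()]
        elif line.startswith("#") or not line.strip():
            continue  # ordinary comment or blank line
        else:
            entries.append((line, pending))
            pending = []  # an annotation covers one pattern only
    return entries


sample = ["#(keep)", "/my-precious-file", "# plain comment", "*.o",
          "#(keep,something-else)", "/my-file"]
for pattern, annotations in parse_annotated(sample):
    print(pattern, annotations)
```

An older parser would skip every `#` line, so it would see the same three
patterns and simply treat all of them as ignored.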

Best Wishes

Phillip

[1] Of the cases listed in [2] it is "git checkout" and friends 
overwriting ignored files that I worry about more. At least "git clean" 
defaults to a dry-run and has an interactive mode to select what gets 
deleted.

[2] https://lore.kernel.org/git/xmqqttqytnqb.fsf@gitster.g


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-12 16:58     ` Junio C Hamano
  2023-10-13  9:09       ` Sebastian Thiel
  2023-10-13 10:06       ` Phillip Wood
@ 2023-10-13 11:25       ` Oswald Buddenhagen
  2 siblings, 0 replies; 28+ messages in thread
From: Oswald Buddenhagen @ 2023-10-13 11:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Sebastian Thiel, git, Josh Triplett, Kristoffer Haugsbakk

On Thu, Oct 12, 2023 at 09:58:19AM -0700, Junio C Hamano wrote:
>I do not think it will be the end of the world if we don't do so,
>but it would be really really nice if we at least explored a way (or
>two) to make a big enough hole in the syntax to not just add
>"precious", but leave room to later add other traits, without having
>to worry about breaking the backward compatibility again.
>
that would invariably make the syntax more verbose, for dubious gain.

that the extension we're deliberating now (again) was coming (in some 
form) was clear for quite a while, while i'm not aware of anything else 
that would semantically fit gitignore (*). "other traits" sounds awfully 
like scope creep, and would most likely fit gitattributes better.

anyway, such a hypothetical "breaking" change wouldn't have much impact, 
because versioned files aren't affected by gitignore. and for the 
misclassification to be actually harmful, the user would have to be 
unable to notice or correct it.

(*) this got me thinking about things that would fit, and i came up with 
a modification of the proposal: one might want to specify just *how* 
precious a file is (which i guess would translate to how many times the 
extra override option would have to be passed to git-clean). (**)

i guess a suitable syntax for that would be

   2>.config

note that even though using the dollar sign to denote "precious" is kind 
of intuitive, i'm not using it for two reasons: a) it's not "crazy" 
enough to use it at not quite the beginning of a file name (note that 
traditionally it isn't even special on windows), and b) the visual 
separation of the prefix isn't as good as with the "arrow-like" 
character.

(**) actually, one would probably want proper type tagging (e.g., config 
files vs. autotools-generated files (which do not belong into a repo, 
but do into a tar-ball)). that really does sound a lot like 
gitattributes, only that the files aren't versioned.

regards

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-13  9:09       ` Sebastian Thiel
@ 2023-10-13 16:39         ` Junio C Hamano
  2023-10-14  7:30           ` Sebastian Thiel
  0 siblings, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2023-10-13 16:39 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: git, Josh Triplett, Kristoffer Haugsbakk

Sebastian Thiel <sebastian.thiel@icloud.com> writes:

> But if there is now a prefix, I feel that it might as well be chosen so that it
> is easier to remember and/or less likely to cause conflicts.

Another criterion is that it is not very often used in real
pathnames, of course, and '!' and '$' are good ones.

Come to think of it, we might be able to retrofit '!' without too
much damage.  Something like "!unignored" is now a deprecated but
still supported way to say "!!unignored", "!*precious" is new, and
"\!anything" is a pathname that begins with '!'.
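
Sketched in Python, with the caveat that this only models the idea and the
name `parse_bang_retrofit` is made up:

```python
def parse_bang_retrofit(line):
    """Model one ignore line under the retrofitted '!' scheme."""
    if line.startswith("\\!"):
        return ("ignore", line[1:])      # pathname that begins with '!'
    if line.startswith("!!"):
        return ("negate", line[2:])      # new canonical negation
    if line.startswith("!*"):
        return ("precious", line[2:])    # new precious marker
    if line.startswith("!"):
        return ("negate", line[1:])      # deprecated but still supported
    return ("ignore", line)


for line in ["!unignored", "!!unignored", "!*precious", "\\!anything"]:
    print(line, "->", parse_bang_retrofit(line))
```

Note that under this reading an existing line such as `!*.o` would parse as
precious `.o`; the sketch does not attempt to disambiguate such lines.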

> Yes, I think my paragraph above is exactly that but with examples to practice
> the new syntax-proposal.

OK.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-10 17:02 ` Junio C Hamano
  2023-10-11 10:06   ` Richard Kerry
  2023-10-12 10:55   ` Sebastian Thiel
@ 2023-10-14  5:59   ` Josh Triplett
  2023-10-14 17:41     ` Junio C Hamano
  2023-10-15  6:44     ` Elijah Newren
  2 siblings, 2 replies; 28+ messages in thread
From: Josh Triplett @ 2023-10-14  5:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Sebastian Thiel, git

On Tue, Oct 10, 2023 at 10:02:20AM -0700, Junio C Hamano wrote:
> Sebastian Thiel <sebastian.thiel@icloud.com> writes:
> 
> > I'd like to propose adding a new standard gitattribute "precious".
> 
> ;-).
> 
> Over the years, I've seen many times scenarios that would have been
> helped if we had not just "tracked? ignored? unignored?" but also
> the fourth kind [*].  The word "ignored" (or "excluded") has always
> meant "not tracked, not to be tracked, and expendable" to Git, and
> "ignored but unexpendable" class was missing.  I even used the term
> "precious" myself in those discussions.  At the concept level, I
> support the effort 100%, but as always, the devil will be in the
> details.

"I've already wanted this for years" is, honestly, the best response we
could *possibly* have hoped for.

> Scenarios that people wished for "precious" traditionally have been
> 
>  * You are working on 'master'.  You have in your .gitignore or
>    .git/info/exclude a line to ignore path A, and have random
>    scribbles in a throw-away file there.  There is another branch
>    'seen', where they added some tracked contents at path A/B.  You
>    do "git checkout seen" and your file A that is an expendable file,
>    because it is listed as ignored in .git/info/exclude, is removed
>    to make room for creating A/B.

Ouch, I hadn't even thought about the issue of branch-switching
overwriting a file like that, but that's another great reason to have
"precious". (I've been thinking about "precious" as primarily to protect
files like `.config`, where they'd be unlikely to be checked in on any
branch because they have an established purpose in the project. Though,
of course, people *do* sometimes check in `.config` files in
special-purpose branches that aren't meant for upstreaming.)

>  * Similar situation, but this time, 'seen' branch added a tracked
>    contents at path A.  Again, "git checkout seen" will discard the
>    expendable file A and replace it with tracked contents.
> 
>  * Instead of "git checkout", you decide to merge the branch 'seen'
>    to the checkout of 'master', where you have an ignored path A.
>    Because merging 'seen' would need to bring the tracked contents
>    of either A/B (in the first scenario above) or A (in the second
>    scenario), your "expendable" A will be removed to make room.

+1

> In previous discussions, nobody was disturbed that "git clean" was
> unaware of the "precious" class, but if we were to have the
> "precious" class in addition to "ignored" aka "expendable", I would
> not oppose to teach "git clean" about it, too.
> 
> There was an early and rough design draft there in
> 
> https://lore.kernel.org/git/7vipsnar23.fsf@alter.siamese.dyndns.org/
> 
> which probably is worth a read, too.
> 
> Even though I referred to the precious _attribute_ in some of these
> discussions, between the attribute mechanism and the ignore
> mechanism, I am actually leaning toward suggesting to extend the
> exclude/ignore mechanism to introduce the "precious" class.  That
> way, we can avoid possible snafu arising from marking a path in
> .gitignore as ignored, and in .gitattributes as precious, and have to
> figure out how these two settings are to work together.

Sounds reasonable.

> In any case, the "precious" paths are expected to be small minority
> of what people never want to "git add" or "git commit", so coming up
> with a special syntax to be used in .gitignore, even if that special
> syntax is ugly and cumbersome to type, would be perfectly OK.

[Following up both to this and to Sebastian's response.]

One potentially important question: should the behavior of old git be to
treat precious files as ignored, or as not-ignored? If the syntax were
something like

$.config

then old git would treat the file as not-ignored. If the syntax were
something like

$precious
.config

then old git would treat the file as ignored.

Seems like it would be obtrusive if `git status` in old git showed the
file, and `git add .` in old git added it.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-13 16:39         ` Junio C Hamano
@ 2023-10-14  7:30           ` Sebastian Thiel
  0 siblings, 0 replies; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-14  7:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Josh Triplett, Kristoffer Haugsbakk

On 13 Oct 2023, at 18:39, Junio C Hamano wrote:

> Come to think of it, we might be able to retrofit '!' without too
> much damage.  Something like "!unignored" is now a deprecated but
> still supported way to say "!!unignored", "!*precious" is new, and
> "\!anything" is a pathname that begins with '!'.

I don't know anything about statistics, and I don't know which of the proposed
syntaxes thus far has the lowest probability of accidental breakage, possibly
in combination with the best possible usability.

However, I do like even more the idea to retro-fit `!` instead of having an
entirely new prefix, it seems more intuitive to me.

An apparent disadvantage would be that using `!` prefix with
backwards-compatibility will make any additional future modifier more
breaking. For instance `!*` is potentially ignoring an additional file
in old git, and another `!-` modifier is having the same effect.

Chances for this are probably low though, but if in doubt it would be possible
to check certain patterns against all files of the top-3.5TB of
GitHub repositories.

Using `!*` to signal precious files also seems like a less likely
path prefix than `!$` would be, but then again, it's just a guess
which most definitely doesn't have much bearing.

I personally also like this more than using special comments as 'modifier',
even though doing so would probably have the lowest probability for
accidentally ignoring files in old git.

Maybe it's time to choose one of the options with the possibility to validate
it for accidental exclusion of files against the top 3.5TB of
GitHub repositories to be more sure of it?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-13 10:06       ` Phillip Wood
@ 2023-10-14 16:10         ` Junio C Hamano
  0 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2023-10-14 16:10 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Sebastian Thiel, git, Josh Triplett, Kristoffer Haugsbakk

Phillip Wood <phillip.wood123@gmail.com> writes:

> One thought I had was that we could abuse the comment syntax to
> annotate paths something like
>
> #(keep)
> /my-precious-file
>
> would prevent /my-precious-file from being deleted by git clean (and
> hopefully unpack-trees()[1]). It means that older versions of git
> would treat the file as ignored. If we ever want more than one
> annotation per path we could separate them with commas
>
> #(keep,something-else)
> /my-file
>
> Strictly speaking it is a backward incompatible change but I doubt
> there are many people using comments like that.

;-)

If "#(" feels a bit too generic, that part can be bikeshed.

I might find some example use cases why we shouldn't later, but
offhand, the idea of (ab)using the comment is a very good idea.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-14  5:59   ` Josh Triplett
@ 2023-10-14 17:41     ` Junio C Hamano
  2023-10-15  6:44     ` Elijah Newren
  1 sibling, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2023-10-14 17:41 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Sebastian Thiel, git

Josh Triplett <josh@joshtriplett.org> writes:

> On Tue, Oct 10, 2023 at 10:02:20AM -0700, Junio C Hamano wrote:
>> Sebastian Thiel <sebastian.thiel@icloud.com> writes:
>> 
>> > I'd like to propose adding a new standard gitattribute "precious".
>> 
>> ;-).
>> 
>> Over the years, I've seen many times scenarios that would have been
>> helped if we had not just "tracked? ignored? unignored?" but also
>> the fourth kind [*].  The word "ignored" (or "excluded") has always
>> meant "not tracked, not to be tracked, and expendable" to Git, and
>> "ignored but unexpendable" class was missing.  I even used the term
>> "precious" myself in those discussions.  At the concept level, I
>> support the effort 100%, but as always, the devil will be in the
>> details.
>
> "I've already wanted this for years" is, honestly, the best response we
> could *possibly* have hoped for.

Yeah, but that is not what I gave here.

It is something I saw people want from time to time over the years;
I am not at all talking about my desire, or lack thereof, to add it
to the system ;-)

>> In previous discussions, nobody was disturbed that "git clean" was
>> unaware of the "precious" class, but if we were to have the
>> "precious" class in addition to "ignored" aka "expendable", I would
>> not oppose to teach "git clean" about it, too.
>> 
>> There was an early and rough design draft there in
>> 
>> https://lore.kernel.org/git/7vipsnar23.fsf@alter.siamese.dyndns.org/
>> 
>> which probably is worth a read, too.

The project can say something like

    # force older git to ignore
    .config
    # older git unignores "$.config" without touching ".config"
    # but newer git applies the "last one wins" rule as usual
    # to mark ".config" as precious.
    !$.config

if our syntax were to retrofit '!' prefix, and even more simply

    #:(precious)
    .config

if we adopt Phillip's "comment abuse" idea, where older Git will
treat it as saying ".config" is not to be added but is expendable,
while newer Git will treat it as saying ".config" is not to be added
and not to be clobbered.
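
The '!' retrofit example above can be walked through with a small Python
model. This is a sketch under assumptions: `!$` is taken as the precious
marker as in the example, `new_git` is a hypothetical switch, and
`fnmatchcase` stands in for Git's matcher.

```python
from fnmatch import fnmatchcase


def classify(path, lines, new_git):
    """Apply the ignore lines in order; the last matching line wins."""
    state = "untracked"
    for line in lines:
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if new_git and line.startswith("!$"):
            kind, pattern = "precious", line[2:]
        elif line.startswith("!"):
            kind, pattern = "untracked", line[1:]
        else:
            kind, pattern = "ignored", line
        if fnmatchcase(path, pattern):
            state = kind
    return state


lines = [".config", "!$.config"]
print(classify(".config", lines, new_git=False))
print(classify(".config", lines, new_git=True))
```

With `new_git=False` the second line only un-ignores a file literally named
`$.config`, so `.config` stays ignored; with `new_git=True` the second line
matches `.config` last and marks it precious.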

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-14  5:59   ` Josh Triplett
  2023-10-14 17:41     ` Junio C Hamano
@ 2023-10-15  6:44     ` Elijah Newren
  2023-10-15  7:33       ` Sebastian Thiel
  1 sibling, 1 reply; 28+ messages in thread
From: Elijah Newren @ 2023-10-15  6:44 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Junio C Hamano, Sebastian Thiel, git

Hi,

On Fri, Oct 13, 2023 at 11:00 PM Josh Triplett <josh@joshtriplett.org> wrote:
>
> On Tue, Oct 10, 2023 at 10:02:20AM -0700, Junio C Hamano wrote:
> > Sebastian Thiel <sebastian.thiel@icloud.com> writes:
> >
> > > I'd like to propose adding a new standard gitattribute "precious".
> >
> > ;-).
> >
> > Over the years, I've seen many times scenarios that would have been
> > helped if we had not just "tracked? ignored? unignored?" but also
> > the fourth kind [*].  The word "ignored" (or "excluded") has always
> > meant "not tracked, not to be tracked, and expendable" to Git, and
> > "ignored but unexpendable" class was missing.  I even used the term
> > "precious" myself in those discussions.  At the concept level, I
> > support the effort 100%, but as always, the devil will be in the
> > details.
>
> "I've already wanted this for years" is, honestly, the best response we
> could *possibly* have hoped for.
>
> > Scenarios that people wished for "precious" traditionally have been
> >
> >  * You are working on 'master'.  You have in your .gitignore or
> >    .git/info/exclude a line to ignore path A, and have random
> >    scribbles in a throw-away file there.  There is another branch
> >    'seen', where they added some tracked contents at path A/B.  You
> >    do "git checkout seen" and your file A that is an expendable file,
> >    because it is listed as ignored in .git/info/exclude, is removed
> >    to make room for creating A/B.
>
> Ouch, I hadn't even thought about the issue of branch-switching
> overwriting a file like that, but that's another great reason to have
> "precious". (I've been thinking about "precious" as primarily to protect
> files like `.config`, where they'd be unlikely to be checked in on any
> branch because they have an established purpose in the project. Though,
> of course, people *do* sometimes check in `.config` files in
> special-purpose branches that aren't meant for upstreaming.)

If we're going to implement precious files, I think we should take a
step back and figure out what parts of the system are affected.  It's
way more than branch switching and `git clean`.  Some notes (including
some useful implementation pointers):

A) You will probably learn a lot and get a leg up on the
implementation by grepping for "preserve_ignored"; lots of the
plumbing has already been created related to this.

B) checkout has a --no-overwrite-ignore, which for checkout
operations, essentially turns all ignored files into precious files.
B1) The code behind --no-overwrite-ignore will probably be helpful in
your implementation of precious files
B2) What happens to this --no-overwrite-ignore option?
Deprecate/remove it after adding precious files, since precious files
now serve that purpose?  Keep the flag anyway?  What happens to the
docs around the flag?
B3) If we keep --no-overwrite-ignore, do we also need to add a
--overwrite-precious option to allow those to be similarly tweaked?

C) merge has a --no-overwrite-ignore option, which for supported merge
operations (which are sadly very few), essentially turns all ignored
files into precious files.
C1) Same comments as B1-B3 but for merges
C2) Sadly, merge's --no-overwrite-ignore is almost garbage.
builtin/merge.c will only pass this option to the "fast-forward" merge
backend, causing any other type of merge to overwrite ignored files
despite any such flag.
C3) most merge backends don't have logic to handle
--no-overwrite-ignore even if they were passed it, and would need
explicit support added.
C4) merge-ort would only essentially need a one-liner; it basically
has the code in place and a comment but the flag was never plumbed
through
C5) it'd be a herculean effort to support this with merge-recursive,
and a sisyphean effort to attempt to maintain.  Deprecating and
removing merge-recursive is probably a better option
C6) merge-resolve and merge-octopus could probably be handled
automatically by ensuring `git read-tree` gained support for it.
C7) there'd be no way to ensure user-written merge algorithms would
support it, but that's kind of a general problem with user-written
anythings

D) git ls-files would need to have a way to query for precious files,
much as it can currently be used to query for ignored files (or
tracked files, or conflicted files, or skip-worktree files, or...).
Backward compatibility questions arise about whether precious files
should appear in `git ls-files -o` output.

E) git status has an --ignored option, with multiple flags for
controlling it.  We'll likely need more flags to be able to pick out
precious files.

F) We'd probably need to look through several other commands and look
at what they need for special handling.  e.g., am, stash, reset.  I
suspect stash will be a particularly sore point, as its unfortunate
design of implementing shell in C code and attempting to decompose the
command in terms of other high-level commands is basically a leaky
abstraction that is very likely to be susceptible to edge and corner
cases here.  (In fact, I think I may have left some of the issues for
untracked/ignored files in stash broken when I was fixing such
problems for other commands.)

G) Documentation.  Commands like `git reset --hard`, `git checkout
-f`, and `read-tree --reset` are documented to nuke untracked files
specifically because we expect most commands to preserve untracked
files.  These would need to mention that "precious" files are also
nuked (or, if we don't nuke them, why precious-and-ignored files are
more precious than untracked files).

H) Design of `reset --hard`.  As per
https://lore.kernel.org/git/xmqqr1e2ejs9.fsf@gitster.g/, `git reset
--hard` is a little funny and we have thought about changing it.  Will
the addition of "precious" objects provide more impetus to do so, and
should a migration story be part of such a new feature?

I) Although git's stated behavior was to nuke ignored files and
protect untracked files, just a couple years ago we found _many_ bugs
where Git didn't do that.  A series was pushed to fix most of those[1]
(incidentally, the same series that introduced the preserve_ignored
flag I pointed you to earlier), but it left a few things
commented/broken.  The cover letter and some emails in the thread also
discussed in more detail some of the ramifications around a "precious"
setting.  It may be worth reading to catch other things to cover and
think about.

J) To implement this feature, you're going to have to touch dir.c.
Good luck with that.  (Seriously, good luck.  The more people that
touch it that aren't me, the less I'll be pinged/queried about that
monstrosity.)


> > In any case, the "precious" paths are expected to be small minority
> > of what people never want to "git add" or "git commit", so coming up
> > with a special syntax to be used in .gitignore, even if that special
> > syntax is ugly and cumbersome to type, would be perfectly OK.
>
> [Following up both to this and to Sebastian's response.]
>
> One potentially important question: should the behavior of old git be to
> treat precious files as ignored, or as not-ignored? If the syntax were
> something like
>
> $.config
>
> then old git would treat the file as not-ignored. If the syntax were
> something like
>
> $precious
> .config
>
> then old git would treat the file as ignored.
>
> Seems like it would be obtrusive if `git status` in old git showed the
> file, and `git add .` in old git added it.

A very good set of questions, along similar lines as the question
about `git ls-files -o` handling.
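
To make the compatibility question above concrete, here is a rough sketch of how a parser that knows nothing about precious files would read the two candidate syntaxes. It is a deliberately simplified model and an assumption on my part: it matches basenames with Python's `fnmatch` rather than git's real wildmatch rules, and it ignores negation, directory patterns, and anchoring. The helper name `old_git_ignores` is made up for illustration.

```python
from fnmatch import fnmatchcase


def old_git_ignores(ignore_text: str, basename: str) -> bool:
    """Simplified model of an ignore parser that predates "precious".

    Every non-empty, non-comment line is treated as a plain glob, so a
    leading "$" is just an ordinary character in the pattern.
    """
    for line in ignore_text.splitlines():
        pattern = line.strip()
        if not pattern or pattern.startswith("#"):
            continue
        if fnmatchcase(basename, pattern):
            return True
    return False


# Prefix syntax: the only pattern is the literal name "$.config",
# so ".config" is NOT ignored by the old parser.
prefix_syntax = "$.config\n"

# Marker-line syntax: "$precious" is a harmless pattern of its own and
# ".config" is an ordinary ignore pattern, so ".config" IS ignored.
marker_syntax = "$precious\n.config\n"

print(old_git_ignores(prefix_syntax, ".config"))
print(old_git_ignores(marker_syntax, ".config"))
```

Under this model the prefix form makes the file show up as untracked in old versions, while the marker-line form keeps it ignored (and therefore expendable) there.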


Anyway, I'm a bit worried after digging up and dumping all these
concerns on you, that it'll sound like I'm trying to bury the feature
and discourage folks.  In the past I have been against this at times,
but mostly because it looked like lots of work, I didn't want to touch
dir.c anymore, and I was worried we'd add a bunch more edge & corner
cases to the code (when we already had plenty with our more limited
number of file types).  In a way, the preserve_ignored stuff kind of
made this a lot more reasonable for us to switch to.  But I do still
think it's a fair amount of work, and I am kind of worried about
potential new edge and corner cases.


Hope that helps,
Elijah

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-15  6:44     ` Elijah Newren
@ 2023-10-15  7:33       ` Sebastian Thiel
  2023-10-15 16:31         ` Junio C Hamano
  0 siblings, 1 reply; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-15  7:33 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Josh Triplett, Junio C Hamano, git

Thanks so much, Elijah, for your eye-opening response. Until now I had been
both naive and ignorant about the complexity of the matter, and had never
asked why it wasn't tackled earlier, given that it had come up before.

Since so many areas of git are affected by precious files, it seems that
rolling it out with everything working is unrealistic, and I wonder whether
it would even have to sit behind a feature toggle at first.

A particularly interesting question brought up here also was the question
of what's more important: untracked files, or precious files? Are they
effectively treated the same, or is there a difference?

In any case, it seems easiest to set the desired syntax for such a
feature and/or validate it, and then devise a plan for how it could all
come together.


On 15 Oct 2023, at 8:44, Elijah Newren wrote:

> Hi,
>
> On Fri, Oct 13, 2023 at 11:00 PM Josh Triplett <josh@joshtriplett.org> wrote:
>>
>> On Tue, Oct 10, 2023 at 10:02:20AM -0700, Junio C Hamano wrote:
>>> Sebastian Thiel <sebastian.thiel@icloud.com> writes:
>>>
>>>> I'd like to propose adding a new standard gitattribute "precious".
>>>
>>> ;-).
>>>
>>> Over the years, I've seen many times scenarios that would have been
>>> helped if we had not just "tracked? ignored? unignored?" but also
>>> the fourth kind [*].  The word "ignored" (or "excluded") has always
>>> meant "not tracked, not to be tracked, and expendable" to Git, and
>>> "ignored but unexpendable" class was missing.  I even used the term
>>> "precious" myself in those discussions.  At the concept level, I
>>> support the effort 100%, but as always, the devil will be in the
>>> details.
>>
>> "I've already wanted this for years" is, honestly, the best response we
>> could *possibly* have hoped for.
>>
>>> Scenarios that people wished for "precious" traditionally have been
>>>
>>>  * You are working on 'master'.  You have in your .gitignore or
>>>    .git/info/exclude a line to ignore path A, and have random
>>>    scribbles in a throw-away file there.  There is another branch
>>>    'seen', where they added some tracked contents at path A/B.  You
>>>    do "git checkout seen" and your file A that is an expendable file,
>>>    because it is listed as ignored in .git/info/exclude, is removed
>>>    to make room for creating A/B.
>>
>> Ouch, I hadn't even thought about the issue of branch-switching
>> overwriting a file like that, but that's another great reason to have
>> "precious". (I've been thinking about "precious" as primarily to protect
>> files like `.config`, where they'd be unlikely to be checked in on any
>> branch because they have an established purpose in the project. Though,
>> of course, people *do* sometimes check in `.config` files in
>> special-purpose branches that aren't meant for upstreaming.)
>
> If we're going to implement precious files, I think we should take a
> step back and figure out what parts of the system are affected.  It's
> way more than branch switching and `git clean`.  Some notes (including
> some useful implementation pointers):
>
> A) You will probably learn a lot and get a leg up on the
> implementation by grepping for "preserve_ignored"; lots of the
> plumbing has already been created related to this.
>
> B) checkout has a --no-overwrite-ignore, which for checkout
> operations, essentially turns all ignored files into precious files.
> B1) The code behind --no-overwrite-ignore will probably be helpful in
> your implementation of precious files
> B2) What happens to this --no-overwrite-ignore option?
> Deprecate/remove it after adding precious files, since precious files
> now serve that purpose?  Keep the flag anyway?  What happens to the
> docs around the flag?
> B3) If we keep --no-overwrite-ignore, do we also need to add a
> --overwrite-precious option to allow those to be similarly tweaked?
>
> C) merge has a --no-overwrite-ignore option, which for supported merge
> operations (which are sadly very few), essentially turns all ignored
> files into precious files.
> C1) Same comments as B1-B3 but for merges
> C2) Sadly, merge's --no-overwrite-ignore is almost garbage.
> builtin/merge.c will only pass this option to the "fast-forward" merge
> backend, causing any other type of merge to overwrite ignored files
> despite any such flag.
> C3) most merge backends don't have logic to handle
> --no-overwrite-ignore even if they were passed it, and would need
> explicit support added.
> C4) merge-ort would only essentially need a one-liner; it basically
> has the code in place and a comment but the flag was never plumbed
> through
> C5) it'd be a herculean effort to support this with merge-recursive,
> and a sisyphean effort to attempt to maintain.  Deprecating and
> removing merge-recursive is probably a better option
> C6) merge-resolve and merge-octopus could probably be handled
> automatically by ensuring `git read-tree` gained support for it.
> C7) there'd be no way to ensure user-written merge algorithms would
> support it, but that's kind of a general problem with user-written
> anythings
>
> D) git ls-files would need to have a way to query for precious files,
> much as it can currently be used to query for ignored files (or
> tracked files, or conflicted files, or skip-worktree files, or...).
> Backward compatibility questions arise about whether precious files
> should appear in `git ls-files -o` output.
>
> E) git status has an --ignored option, with multiple flags for
> controlling it.  We'll likely need more flags to be able to pick out
> precious files.
>
> F) We'd probably need to look through several other commands and look
> at what they need for special handling.  e.g., am, stash, reset.  I
> suspect stash will be a particularly sore point, as its unfortunate
> design of implementing shell in C code and attempting to decompose the
> command in terms of other high-level commands is basically a leaky
> abstraction that is very likely to be susceptible to edge and corner
> cases here.  (In fact, I think I may have left some of the issues for
> untracked/ignored files in stash broken when I was fixing such
> problems for other commands.)
>
> G) Documentation.  Commands like `git reset --hard`, `git checkout
> -f`, and `read-tree --reset` are documented to nuke untracked files
> specifically because we expect most commands to preserve untracked
> files.  These would need to mention that "precious" files are also
> nuked (or, if we don't nuke them, why precious-and-ignored files are
> more precious than untracked files).
>
> H) Design of `reset --hard`.  As per
> https://lore.kernel.org/git/xmqqr1e2ejs9.fsf@gitster.g/, `git reset
> --hard` is a little funny and we have thought about changing it.  Will
> the addition of "precious" objects provide more impetus to do so, and
> should a migration story be part of such a new feature?
>
> I) Although git's stated behavior was to nuke ignored files and
> protect untracked files, just a couple years ago we found _many_ bugs
> where Git didn't do that.  A series was pushed to fix most of those[1]
> (incidentally, the same series that introduced the preserve_ignored
> flag I pointed you to earlier), but it left a few things
> commented/broken.  The cover letter and some emails in the thread also
> discussed in more detail some of the ramifications around a "precious"
> setting.  It may be worth reading to catch other things to cover and
> think about.
>
> J) To implement this feature, you're going to have to touch dir.c.
> Good luck with that.  (Seriously, good luck.  The more people that
> touch it that aren't me, the less I'll be pinged/queried about that
> monstrosity.)
>
>
>>> In any case, the "precious" paths are expected to be small minority
>>> of what people never want to "git add" or "git commit", so coming up
>>> with a special syntax to be used in .gitignore, even if that special
>>> syntax is ugly and cumbersome to type, would be perfectly OK.
>>
>> [Following up both to this and to Sebastian's response.]
>>
>> One potentially important question: should the behavior of old git be to
>> treat precious files as ignored, or as not-ignored? If the syntax were
>> something like
>>
>> $.config
>>
>> then old git would treat the file as not-ignored. If the syntax were
>> something like
>>
>> $precious
>> .config
>>
>> then old git would treat the file as ignored.
>>
>> Seems like it would be obtrusive if `git status` in old git showed the
>> file, and `git add .` in old git added it.
>
> A very good set of questions, along similar lines as the question
> about `git ls-files -o` handling.
>
>
> Anyway, I'm a bit worried after digging up and dumping all these
> concerns on you, that it'll sound like I'm trying to bury the feature
> and discourage folks.  In the past I have been against this at times,
> but mostly because it looked like lots of work, I didn't want to touch
> dir.c anymore, and I was worried we'd add a bunch more edge & corner
> cases to the code (when we already had plenty with our more limited
> number of file types).  In a way, the preserve_ignored stuff kind of
> made this a lot more reasonable for us to switch to.  But I do still
> think it's a fair amount of work, and I am kind of worried about
> potential new edge and corner cases.
>
>
> Hope that helps,
> Elijah

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-15  7:33       ` Sebastian Thiel
@ 2023-10-15 16:31         ` Junio C Hamano
  2023-10-16  6:02           ` Sebastian Thiel
  2023-10-23  7:15           ` Sebastian Thiel
  0 siblings, 2 replies; 28+ messages in thread
From: Junio C Hamano @ 2023-10-15 16:31 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: Elijah Newren, Josh Triplett, git

Sebastian Thiel <sebastian.thiel@icloud.com> writes:

> A particularly interesting question brought up here also was the question
> of what's more important: untracked files, or precious files? Are they
> effectively treated the same, or is there a difference?

Think of it this way.  There are two orthogonal axes.

 (1) Are you a candidate to be tracked, even though you are not
     tracked right now?

 (2) Should you be kept and make an operation fail that wants to
     remove you to make room?

For untracked files, both are "Yes".  As we already saw in the long
discussion, precious files are "not to be added and not to be
clobbered", so you'd answer "No" and "Yes" [*].

In other words, both are equally protected from getting clobbered.

    Side note: for completeness, for ignored files, the answers are
    "No", and "No".  The introduction of "precious" class makes a
    combination "No-Yes" that hasn't been possible so far.
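
The two axes can be written down as a small lookup table. This is only a sketch of the classification described above, not anything taken from git's code; the names `Traits`, `CLASSES`, `offered_for_add`, and `may_clobber` are invented for illustration.

```python
from typing import NamedTuple


class Traits(NamedTuple):
    trackable: bool  # axis (1): is the path a candidate to be tracked?
    protected: bool  # axis (2): must an operation fail rather than remove it?


# One row per class of path that is not currently tracked.
CLASSES = {
    "untracked": Traits(trackable=True, protected=True),    # Yes / Yes
    "ignored": Traits(trackable=False, protected=False),    # No  / No
    "precious": Traits(trackable=False, protected=True),    # No  / Yes (new)
}


def offered_for_add(kind: str) -> bool:
    """Would the path be picked up as a candidate for tracking?"""
    return CLASSES[kind].trackable


def may_clobber(kind: str) -> bool:
    """May an operation remove the path to make room for something else?"""
    return not CLASSES[kind].protected


for kind in CLASSES:
    print(kind, offered_for_add(kind), may_clobber(kind))
```

Seen this way, "precious" agrees with "ignored" on the first axis and with "untracked" on the second.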

Elijah, thanks for doing a very good job of creating a catalog of
kludges we accumulated over the years for the lack of proper support
for the precious paths.  I think they should be kept for backward
compatibility, but for new users they should not have to learn any
of them once we have the support for precious paths.


* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-15 16:31         ` Junio C Hamano
@ 2023-10-16  6:02           ` Sebastian Thiel
  2023-10-23  7:15           ` Sebastian Thiel
  1 sibling, 0 replies; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-16  6:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren, Josh Triplett, git

Thanks a lot, that makes perfect sense!

Thanks to Elijah we may also have discovered why the idea of precious files
didn't get implemented last time it came up: it's too much work to make
all portions of the code aware.

I don't know if this time will be different as I can only offer to implement
the syntax adjustment, whatever that might be (possibly after validating
the candidate against a corpus of repositories), along with the update
to `git clean` so it leaves precious files alone by default and a new flag
to also remove precious files.

Maybe that already is something worth having, but I can also imagine
that ideally there is a plan for retrofitting other portions of git as
well along with the resources to actually do it.
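
As a rough sketch of the `git clean` part of that offer: the selection logic could look something like the following. The function and parameter names are hypothetical, the flag for removing precious files has not been named anywhere in this thread, and the example paths are the ones from the original proposal. `include_ignored` stands in for the existing `-x` behaviour.

```python
def clean_candidates(paths, include_ignored=False, include_precious=False):
    """Return the paths a clean operation would remove, in sorted order.

    `paths` maps each path to "untracked", "ignored", or "precious".
    Precious paths are spared unless explicitly requested.
    """
    selected = []
    for path, kind in sorted(paths.items()):
        if kind == "untracked":
            selected.append(path)
        elif kind == "ignored" and include_ignored:
            selected.append(path)
        elif kind == "precious" and include_precious:
            selected.append(path)
    return selected


worktree = {
    ".config": "precious",
    "scripts/basic/fixdep": "ignored",
    "notes.txt": "untracked",
}

print(clean_candidates(worktree))
print(clean_candidates(worktree, include_ignored=True))
print(clean_candidates(worktree, include_ignored=True, include_precious=True))
```

Only the last call, with the extra opt-in, would touch `.config`.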

On 15 Oct 2023, at 18:31, Junio C Hamano wrote:

> Sebastian Thiel <sebastian.thiel@icloud.com> writes:
>
>> A particularly interesting question brought up here also was the question
>> of what's more important: untracked files, or precious files? Are they
>> effectively treated the same, or is there a difference?
>
> Think of it this way.  There are two orthogonal axes.
>
>  (1) Are you a candidate to be tracked, even though you are not
>      tracked right now?
>
>  (2) Should you be kept and make an operation fail that wants to
>      remove you to make room?
>
> For untracked files, both are "Yes".  As we already saw in the long
> discussion, precious files are "not to be added and not to be
> clobbered", so you'd answer "No" and "Yes" [*].
>
> In other words, both are equally protected from getting clobbered.
>
>     Side note: for completeness, for ignored files, the answers are
>     "No", and "No".  The introduction of "precious" class makes a
>     combination "No-Yes" that hasn't been possible so far.
>
> Elijah, thanks for doing a very good job of creating a catalog of
> kludges we accumulated over the years for the lack of proper support
> for the precious paths.  I think they should be kept for backward
> compatibility, but for new users they should not have to learn any
> of them once we have the support for precious paths.

* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-15 16:31         ` Junio C Hamano
  2023-10-16  6:02           ` Sebastian Thiel
@ 2023-10-23  7:15           ` Sebastian Thiel
  2023-10-29  6:44             ` Elijah Newren
  1 sibling, 1 reply; 28+ messages in thread
From: Sebastian Thiel @ 2023-10-23  7:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren, Josh Triplett, git

On 16 Oct 2023, at 8:02, Sebastian Thiel wrote:

> I don't know if this time will be different as I can only offer to implement
> the syntax adjustment, whatever that might be (possibly after validating
> the candidate against a corpus of repositories), along with the update
> to `git clean` so it leaves precious files alone by default and a new flag
> to also remove precious files.

I am happy to announce that I can now contribute this feature in full once
you give the go-ahead. This would mean that the entirety of `git` would become
aware of precious files over time.

To my mind, and probably out of ignorance, it seems that once the syntax is
decided on it's possible for the implementation to start. From there I could
use Elijah's analysis to know which parts of git to make aware of precious files
in addition to `git clean`.

I am definitely looking forward to hearing from you :).



* Re: [RFC] Define "precious" attribute and support it in `git clean`
  2023-10-23  7:15           ` Sebastian Thiel
@ 2023-10-29  6:44             ` Elijah Newren
  0 siblings, 0 replies; 28+ messages in thread
From: Elijah Newren @ 2023-10-29  6:44 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: Junio C Hamano, Josh Triplett, git

Hi Sebastian,

On Mon, Oct 23, 2023 at 12:15 AM Sebastian Thiel
<sebastian.thiel@icloud.com> wrote:
>
> On 16 Oct 2023, at 8:02, Sebastian Thiel wrote:
>
> > I don't know if this time will be different as I can only offer to implement
> > the syntax adjustment, whatever that might be (possibly after validating
> > the candidate against a corpus of repositories), along with the update
> > to `git clean` so it leaves precious files alone by default and a new flag
> > to also remove precious files.
>
> I am happy to announce that I can now contribute this feature in full once
> you give the go-ahead. This would mean that the entirety of `git` would become
> aware of precious files over time.
>
> To my mind, and probably out of ignorance, it seems that once the syntax is
> decided on it's possible for the implementation to start. From there I could
> use Elijah's analysis to know which parts of git to make aware of precious files
> in addition to `git clean`.
>
> I am definitely looking forward to hearing from you :).

So, we typically don't pre-approve patches/features.  Junio described
this recently at [1].

However, starting things out with an RFC, as you've done, is certainly
a good first step to gauge whether folks think a feature is useful.

Occasionally, when the feature is bigger or touches lots of areas of
the code, people will even write up a design document, and first get a
review on the document, which then streamlines later reviews since we
have some of the high-level aspects agreed to.  Some examples:
  * Documentation/technical/hash-function-transition.txt
  * Documentation/technical/sparse-checkout.txt
  * Documentation/technical/sparse-index.txt
Each of these is at some stage between "these are ideas we think
are good and our plans to get there" and "most of this document has
since been implemented".  There are others in that directory too,
though not everything in that directory is a planning document; some
of the files are simply documentation of what already exists.

Anyway, creating a similar planning document and covering the various
cases I mentioned would likely be a very useful next step here.  I did
note that multiple ideas have been presented in this thread about the
syntax for specifying precious files, and it'd be good to nail one
down.  It would also be nice to see proposed answers to the several
cases I brought up (some of which Junio answered, others of which I
also have potential answers for so I could potentially help you craft
this document, and a few others that someone else would need to fill
in).  Sometimes we also want to cover pros/cons of the approaches we
have decided upon, in part because others may come along later and if
they discover a new pro or con that we haven't thought of, then we may
need to rethink the plan.

Hope that helps,
Elijah

[1] https://lore.kernel.org/git/xmqq8r9ommyt.fsf@gitster.g/

end of thread, other threads:[~2023-10-29  6:44 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-10 12:37 [RFC] Define "precious" attribute and support it in `git clean` Sebastian Thiel
2023-10-10 13:38 ` Kristoffer Haugsbakk
2023-10-10 14:10   ` Josh Triplett
2023-10-10 17:07     ` Junio C Hamano
2023-10-12  8:47       ` Josh Triplett
2023-10-10 19:10     ` Kristoffer Haugsbakk
2023-10-12  9:04       ` Josh Triplett
2023-10-10 17:02 ` Junio C Hamano
2023-10-11 10:06   ` Richard Kerry
2023-10-11 22:40     ` Jeff King
2023-10-11 23:35     ` Junio C Hamano
2023-10-12 10:55   ` Sebastian Thiel
2023-10-12 16:58     ` Junio C Hamano
2023-10-13  9:09       ` Sebastian Thiel
2023-10-13 16:39         ` Junio C Hamano
2023-10-14  7:30           ` Sebastian Thiel
2023-10-13 10:06       ` Phillip Wood
2023-10-14 16:10         ` Junio C Hamano
2023-10-13 11:25       ` Oswald Buddenhagen
2023-10-14  5:59   ` Josh Triplett
2023-10-14 17:41     ` Junio C Hamano
2023-10-15  6:44     ` Elijah Newren
2023-10-15  7:33       ` Sebastian Thiel
2023-10-15 16:31         ` Junio C Hamano
2023-10-16  6:02           ` Sebastian Thiel
2023-10-23  7:15           ` Sebastian Thiel
2023-10-29  6:44             ` Elijah Newren
2023-10-11 21:41 ` Kristoffer Haugsbakk
