* What should be the CRLF policy when win + Linux?
@ 2010-05-05 10:01 mat
  2010-05-05 13:27 ` Ramkumar Ramachandra
                   ` (2 more replies)
  0 siblings, 3 replies; 82+ messages in thread
From: mat @ 2010-05-05 10:01 UTC (permalink / raw)
  To: git

Hi

I have two git projects:
-one (A) with linux people only
-one (B) with someone using windows

As we had "end of line" problems with the person using windows (B), I used:

git config --global core.autocrlf true

Following the advice from:
http://help.github.com/dealing-with-lineendings/

So everything is now fine with project B, but I now have some problems with
project (A): I wanted to copy the whole project to another dir, and
now it is complaining about the change, with the warning:

CRLF will be replaced by LF in .../A.

So I don't know exactly what I should do... Should I change all the CRLFs
in project A (but then other people will also have problems), or can I
switch the config depending on whether I'm using project A or B? It is
not so clear in my mind and I would appreciate any advice!!

Thanks a lot

Matthieu Stigler

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-05 10:01 What should be the CRLF policy when win + Linux? mat
@ 2010-05-05 13:27 ` Ramkumar Ramachandra
  2010-05-06  9:27   ` mat
  2010-05-06  2:35 ` hasen j
  2010-05-07  7:15 ` What should be the CRLF policy when win + Linux? Gelonida
  2 siblings, 1 reply; 82+ messages in thread
From: Ramkumar Ramachandra @ 2010-05-05 13:27 UTC (permalink / raw)
  To: mat; +Cc: git

Hi,

On Wed, May 5, 2010 at 12:01 PM, mat <matthieu.stigler@gmail.com> wrote:
> So I don't know exactly what I should do...Should I change all the CRLF from
> project A, but people will have also problems, or can I switch the config,
> once I'm using project A and B? It is not so clear in my mind and I would
> appreciate any advice!!

I'm not sure what you should be doing because I've never worked with
Windows, but the following information might be useful: Yes, you can
have project-specific config quite easily.

In the command
> git config --global core.autocrlf true
just drop `--global` and the setting becomes repository-specific.
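
For illustration, a minimal sketch of the two scopes, run in a scratch
directory. The throwaway HOME and the repository names projA and projB
are assumptions made for the example, so that the real ~/.gitconfig is
never touched:

```shell
set -e
scratch=$(mktemp -d)
export HOME="$scratch/home"        # hypothetical throwaway home directory
export GIT_CONFIG_NOSYSTEM=1       # ignore any system-wide gitconfig
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
git init -q "$scratch/projA"       # stands in for the Linux-only project
git init -q "$scratch/projB"       # stands in for the project with a Windows user

# Without --global the setting is written to projB/.git/config only
git -C "$scratch/projB" config core.autocrlf true

# What each repository sees
b_value=$(git -C "$scratch/projB" config --get core.autocrlf)
a_value=$(git -C "$scratch/projA" config --get core.autocrlf || echo unset)
echo "projB: $b_value, projA: $a_value"
```

The setting made inside projB does not leak into projA, which is the
behaviour asked for in the original question.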

-- Ram

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-05 10:01 What should be the CRLF policy when win + Linux? mat
  2010-05-05 13:27 ` Ramkumar Ramachandra
@ 2010-05-06  2:35 ` hasen j
  2010-05-06  7:29   ` Wilbert van Dolleweerd
  2010-05-07  7:15 ` What should be the CRLF policy when win + Linux? Gelonida
  2 siblings, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-06  2:35 UTC (permalink / raw)
  To: git

On 5 May 2010 04:01, mat <matthieu.stigler@gmail.com> wrote:
>
> Hi
>
> I have two git projects:
> -one (A) with linux people only
> -one (B) with someone using windows
>
> As we had "end of line" problems with the person using windows (B), I used:
>
> git config --global core.autocrlf true
>
> Following advices from:
> http://help.github.com/dealing-with-lineendings/
>
> So everything now if fine with project B, but now some problems using project (A): I wanted to copy the whole project file to another dir, and now it is complaining about the change, signaling warning:
>
> CRLF will be replaced by LF in .../A.
>
> So I don't know exactly what I should do...Should I change all the CRLF from project A, but people will have also problems, or can I switch the config, once I'm using project A and B? It is not so clear in my mind and I would appreciate any advice!!
>
> Thanks a lot
>
> Matthieu Stigler

I personally find that autocrlf causes more confusion than it solves problems.

I've yet to see a text editor on windows that can't handle \n line
endings. (Notepad doesn't count)

Just keep the project with \n line endings, disable autocrlf, and make
sure that people are aware of this.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06  2:35 ` hasen j
@ 2010-05-06  7:29   ` Wilbert van Dolleweerd
  2010-05-06 15:34     ` hasen j
  0 siblings, 1 reply; 82+ messages in thread
From: Wilbert van Dolleweerd @ 2010-05-06  7:29 UTC (permalink / raw)
  To: git

> I personally find that autocrlf causes more confusion than it solves problems.
>
> I've yet to see a text editor on windows that can't handle \n line
> endings. (Notepad doesn't count)
>
> Just keep the project with \n line endings, disable autocrlf, and make
> sure that people are aware of this.

Editors may handle it gracefully but older Windows programs will have problems.

For instance, Visual Studio 6 will barf on Visual Basic projectfiles
with non-windows line-style endings. (And please don't ask why I know
this....)

-- 
Kind regards,

Wilbert van Dolleweerd
Blog: http://walkingthestack.blogspot.com/
Twitter: http://www.twitter.com/wvandolleweerd

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-05 13:27 ` Ramkumar Ramachandra
@ 2010-05-06  9:27   ` mat
  2010-05-06 10:03     ` Erik Faye-Lund
  0 siblings, 1 reply; 82+ messages in thread
From: mat @ 2010-05-06  9:27 UTC (permalink / raw)
  To: Ramkumar Ramachandra; +Cc: git, hasan.aljudy

Thanks for your answer!!

I think what you suggest, Ramkumar, is indeed what I need, great! The
suggestion from hasan to disable autocrlf was not doable, as the
Windows guy had the problem that even after a clean clone, git was
signaling changes (see: http://help.github.com/dealing-with-lineendings/)

So I just did:

 git config --global --unset core.autocrlf

and then set it for this specific project:

 git config core.autocrlf true

Hope this is what you meant?

Thanks a lot!!

Matthieu

Ramkumar Ramachandra wrote:
> Hi,
>
> On Wed, May 5, 2010 at 12:01 PM, mat <matthieu.stigler@gmail.com> wrote:
>   
>> So I don't know exactly what I should do...Should I change all the CRLF from
>> project A, but people will have also problems, or can I switch the config,
>> once I'm using project A and B? It is not so clear in my mind and I would
>> appreciate any advice!!
>>     
>
> I'm not sure what you should be doing because I've never worked with
> Windows, but the following information might be useful: Yes, you can
> have project-specific config quite easily.
>
> In the command
>   
>> git config --global core.autocrlf true
>>     
> just drop `--global` and the setting becomes repository-specific.
>
> -- Ram
>   

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06  9:27   ` mat
@ 2010-05-06 10:03     ` Erik Faye-Lund
  0 siblings, 0 replies; 82+ messages in thread
From: Erik Faye-Lund @ 2010-05-06 10:03 UTC (permalink / raw)
  To: mat; +Cc: Ramkumar Ramachandra, git, hasan.aljudy

On Thu, May 6, 2010 at 11:27 AM, mat <matthieu.stigler@gmail.com> wrote:
> Thanks for your answer!!
>
> I think what you suggest Ramkumar is indeed what I need, great! The
> suggestion from hasan to keep with those settings was not doable as the
> windows guy had the problem of that after even a clean cloning, git was
> signaling changes (see: http://help.github.com/dealing-with-lineendings/)
>

This is a symptom of someone having checked files with CRLF into the
repo with core.autocrlf disabled, while the Windows guy has
core.autocrlf enabled.

I don't quite agree with Hasen about checking out LF on Windows,
though. There are just too many tools that get slightly confused (as
well as some getting REALLY confused) by this in my experience. It's
sometimes the best trade-off, but quite often not IMO.

What I'd do is set core.autocrlf to "input" on non-Windows
machines, and "true" on Windows machines. This makes sure that no
machine will check in CRLF. If there are already files checked in with
CRLF (as seems to be the case with your repo), the Windows people will
be annoyed. So you'd need to make sure that the repo contains only
LFs, and you have basically two options:
1) Just call dos2unix on all files and commit the changes. This will
still cause problems for the Windows users if they need to check out
commits older than the dos2unix one.
2) Use git filter-branch to rewrite the history to pretend no one ever
made the mistake of committing CRLFs. This will make trouble for
anyone who's working on a branch. But it's a one-time issue (unless
someone manages to commit CRLF-files again, that is).
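
A rough sketch of option 1 in a scratch repository. The file name
notes.txt and the commit identity are made up for the example, and tr
stands in for dos2unix (which may not be installed); tr would also
remove stray CRs in the middle of a line, so treat this as an
illustration rather than a drop-in replacement:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config core.autocrlf false               # reproduce the mistake: CRLF is committed as-is
printf 'one\r\ntwo\r\n' > notes.txt          # hypothetical file with CRLF endings
git add notes.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "add file with CRLF"

# Option 1: strip the CRs and commit the result
tr -d '\r' < notes.txt > notes.tmp && mv notes.tmp notes.txt
git add notes.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "normalize line endings to LF"

# From here on: convert CRLF to LF on commit, leave the working tree alone
git config core.autocrlf input

# Count lines in the committed blob that still carry a CR
cr_lines=$(git show HEAD:notes.txt | tr '\r' Q | grep -c Q || true)
echo "lines with CR in HEAD: $cr_lines"
```

Commits older than the normalizing one still contain CRLF, which is the
drawback mentioned under option 1.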

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06  7:29   ` Wilbert van Dolleweerd
@ 2010-05-06 15:34     ` hasen j
  2010-05-06 17:15       ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-06 15:34 UTC (permalink / raw)
  To: Wilbert van Dolleweerd; +Cc: git

On 6 May 2010 01:29, Wilbert van Dolleweerd <wilbert@arentheym.com> wrote:
>> I personally find that autocrlf causes more confusion than it solves problems.
>>
>> I've yet to see a text editor on windows that can't handle \n line
>> endings. (Notepad doesn't count)
>>
>> Just keep the project with \n line endings, disable autocrlf, and make
>> sure that people are aware of this.
>
> Editors may handle it gracefully but older Windows programs will have problems.
>
> For instance, Visual Studio 6 will barf on Visual Basic projectfiles
> with non-windows line-style endings. (And please don't ask why I know
> this....)
>

Well, this is the exception that proves the rule then :)

Anyway, If it's a VB project, might as well just keep the files with
CRLF endings then.

I don't know all linux editors, but I've yet to see one that can't
handle CRLF endings.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 15:34     ` hasen j
@ 2010-05-06 17:15       ` Linus Torvalds
  2010-05-06 17:26         ` Erik Faye-Lund
  2010-05-06 20:00         ` hasen j
  0 siblings, 2 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-06 17:15 UTC (permalink / raw)
  To: hasen j; +Cc: Wilbert van Dolleweerd, git



On Thu, 6 May 2010, hasen j wrote:
> 
> I don't know all linux editors, but I've yet to see one that can't
> handle CRLF endings.

A _lot_ of UNIX editors will handle CRLF endings, but if you change a 
file, they often write the result back with _mixed_ endings. Some will 
also show the CR as '^M' or some other garbage at the end.

A number of tools will also end up confused, including very fundamental 
things like "grep". Try this:

	echo -e "Hello\015" > f
	grep 'Hello$' f

and notice how the grep does _not_ find the Hello at the end of the line, 
because grep sees another random character there (this might be 
unportable, I could easily imagine some versions of grep finding it).

So I would strongly suggest against CRLF on UNIX. It really doesn't work 
very well, even if some tools will handle it to some limited degree.
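
The same experiment as a small sketch that also shows the line
matching once the CR is stripped. printf is used here instead of
"echo -e" because echo's handling of -e differs between shells; the
file names are arbitrary, and as noted above the first result may
depend on the grep in use:

```shell
cd "$(mktemp -d)"
printf 'Hello\r\n' > f

# With the CR still in place
if grep -q 'Hello$' f; then with_cr=match; else with_cr=no-match; fi

# Strip the CR and try again
tr -d '\r' < f > f.lf
if grep -q 'Hello$' f.lf; then without_cr=match; else without_cr=no-match; fi

echo "with CR: $with_cr, without CR: $without_cr"
```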

In short: having 'core.autocrlf' set will likely make it much more 
pleasant to work across different platforms. 

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 17:15       ` Linus Torvalds
@ 2010-05-06 17:26         ` Erik Faye-Lund
  2010-05-06 20:00         ` hasen j
  1 sibling, 0 replies; 82+ messages in thread
From: Erik Faye-Lund @ 2010-05-06 17:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: hasen j, Wilbert van Dolleweerd, git

On Thu, May 6, 2010 at 7:15 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 6 May 2010, hasen j wrote:
>>
>> I don't know all linux editors, but I've yet to see one that can't
>> handle CRLF endings.
>
> A _lot_ of UNIX editors will handle CRLF endings, but if you change a
> file, they often write the result back with _mixed_ endings.

Just for completeness: The inverse is also the case on Windows; a lot
of editors will handle LF endings, but a handful of them will
gladly insert CRLFs under certain circumstances. Microsoft Visual
Studio is one of these.

So yeah, neither CRLF nor LF everywhere is generally a good idea.

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 17:15       ` Linus Torvalds
  2010-05-06 17:26         ` Erik Faye-Lund
@ 2010-05-06 20:00         ` hasen j
  2010-05-06 20:23           ` Linus Torvalds
  2010-05-06 20:40           ` Erik Faye-Lund
  1 sibling, 2 replies; 82+ messages in thread
From: hasen j @ 2010-05-06 20:00 UTC (permalink / raw)
  To: Linus Torvalds, git, Erik Faye-Lund

On 6 May 2010 11:15, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 6 May 2010, hasen j wrote:
>>
>> I don't know all linux editors, but I've yet to see one that can't
>> handle CRLF endings.
>
> A _lot_ of UNIX editors will handle CRLF endings, but if you change a
> file, they often write the result back with _mixed_ endings. Some will
> also show the CR as '^M' or some other garbage at the end.
>
> A number of tools will also end up confused, including very fundamental
> things like "grep". Try this:
>
>        echo -e "Hello\015" > f
>        grep 'Hello$' f
>
> and notice how the grep does _not_ find the Hello at the end of the line,
> because grep sees another random character there (this might be
> unportable, I could easily imagine some versions of grep finding it).
>
> So I would strongly suggest against CRLF on UNIX. It really doesn't work
> very well, even if some tools will handle it to some limited degree.
>
> In short: having 'core.autocrlf' set will likely make it much more
> pleasant to work across different platforms.
>
>                        Linus
>

When I'm on windows, I prefer LF (unless the project already uses
CRLF, or it's outside my control).

VB is very windowsy; I *really* doubt most VB developers use (or even
know) grep, so I don't think it's a problem if a VB project
standardizes line endings to be CRLF.

My problem with autocrlf is that, well, it converts line endings in
the working directory to CRLF, even though I don't always want it to.
(most of the time, I don't).

The other problem is, git will get confused if you set autocrlf *after
the fact*; i.e. you already cloned and have the files checked out,
maybe even made some commits.

Overall, I ran into many awkward situations with autocrlf (and I can't
remember them now), but if you google you can find some of the issues
people are having.

The whole problem would go away if there was no crlf, and that's not
impossible: any decent text editor can read/write files with Unix line
endings.

I wasn't aware that Visual Studio doesn't have an easy way to have it
write LF endings by default; I'm sure there are addons to make that
easier. Plus most open source projects are not usually setup with VS
as the development environment anyway, so it's really not a big
problem.

So yeah, I think LF everywhere is the better way to go most of the time.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 20:00         ` hasen j
@ 2010-05-06 20:23           ` Linus Torvalds
  2010-05-06 20:40           ` Erik Faye-Lund
  1 sibling, 0 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-06 20:23 UTC (permalink / raw)
  To: hasen j; +Cc: git, Erik Faye-Lund



On Thu, 6 May 2010, hasen j wrote:
> 
> My problem with autocrlf is that, well, it converts line endings in
> the working directory to CRLF, even though I don't always want it to.
> (most of the time, I don't).

You can just set it to 'input' if you want to. It's not just on/off, you 
can also say "I want to check out with no conversion (ie "just LF"), but 
convert CRLF to LF on input".

Btw, one thing to keep in mind with autocrlf is the "auto" part: it tries 
to do a good job noticing when something is text vs binary, but it _is_ a 
heuristic. I think it's a pretty good one, but if you do set autocrlf 
(whether to "true" or to "input"), at least think about attributes ("man 
gitattributes")
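
A small sketch of combining the "input" setting with explicit
attributes. The patterns and file names are hypothetical. The attribute
is spelled "text" here, which is how Git 1.7.2 and later name it; at
the time of this message it was called "crlf":

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config core.autocrlf input      # commit LF, check out without conversion

# Tell Git explicitly what is text and what is not,
# instead of relying on the content heuristic
cat > .gitattributes <<'EOF'
*.c    text
*.png  -text
EOF

# check-attr works on path names; the files do not have to exist
c_attr=$(git check-attr text -- main.c)
png_attr=$(git check-attr text -- logo.png)
echo "$c_attr"
echo "$png_attr"
```

With the attribute set or unset explicitly, the text/binary guess is no
longer involved for those paths.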

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 20:00         ` hasen j
  2010-05-06 20:23           ` Linus Torvalds
@ 2010-05-06 20:40           ` Erik Faye-Lund
  2010-05-06 22:14             ` hasen j
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
  1 sibling, 2 replies; 82+ messages in thread
From: Erik Faye-Lund @ 2010-05-06 20:40 UTC (permalink / raw)
  To: hasen j; +Cc: Linus Torvalds, git

On Thu, May 6, 2010 at 10:00 PM, hasen j <hasan.aljudy@gmail.com> wrote:
> On 6 May 2010 11:15, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>>
>>
>> On Thu, 6 May 2010, hasen j wrote:
>>>
>>> I don't know all linux editors, but I've yet to see one that can't
>>> handle CRLF endings.
>>
>> A _lot_ of UNIX editors will handle CRLF endings, but if you change a
>> file, they often write the result back with _mixed_ endings. Some will
>> also show the CR as '^M' or some other garbage at the end.
>>
>> A number of tools will also end up confused, including very fundamental
>> things like "grep". Try this:
>>
>>        echo -e "Hello\015" > f
>>        grep 'Hello$' f
>>
>> and notice how the grep does _not_ find the Hello at the end of the line,
>> because grep sees another random character there (this might be
>> unportable, I could easily imagine some versions of grep finding it).
>>
>> So I would strongly suggest against CRLF on UNIX. It really doesn't work
>> very well, even if some tools will handle it to some limited degree.
>>
>> In short: having 'core.autocrlf' set will likely make it much more
>> pleasant to work across different platforms.
>>
>>                        Linus
>>
>
> When I'm on windows, I prefer LF (unless the project already uses
> CRLF, or it's outside my control).
>

"When I'm on windows" leads me to believe Windows is not your primary
operating system. If not, please excuse me.

> My problem with autocrlf is that, well, it converts line endings in
> the working directory to CRLF, even though I don't always want it to.
> (most of the time, I don't).
>

There's gitattributes for that.

> The other problem is, git will get confused if you set autocrlf *after
> the fact*; i.e. you already cloned and have the files checked out,
> maybe even made some commits.
>

core.autocrlf being on by default in Git for Windows greatly reduces
the risk of this. I wish core.autocrlf was set to "input" by default
on other platforms, though.

> Overall, I ran into many awkward situations with autocrlf (and I can't
> remember them now), but if you google you can find some of the issues
> people are having.
>
> The whole problem would go away if there was no crlf, and that's not
> impossible: any decent text editor can read/write files with Unix line
> endings.
>

That's probably one of the things that make a text editor decent in
your book, but this opinion might not be shared by everyone. Perhaps
not being primarily a Windows user somehow biases your opinion here?

> I wasn't aware that Visual Studio doesn't have an easy way to have it
> write LF endings by default; I'm sure there are addons to make that
> easier. Plus most open source projects are not usually setup with VS
> as the development environment anyway, so it's really not a big
> problem.

The problem with Visual Studio isn't that it doesn't write LFs
normally... the problem is that when you paste text, it retains the
newline style from the source you copied from. But it is not the only
tool with such issues, so playing the "VS is the problem"-card doesn't
stick IMO.

Even if it did, open source isn't the only model for developing
software. And again... even if it were, working well together with
Visual Studio would be very beneficial for quite a few
projects. Visual Studio is probably the most used code editor among
Windows developers (by a good margin too, I suspect), so ignoring it
would just be sticking your head in the sand - or worse, asking for
fewer contributions from Windows users (which can often be a problem in
the first place).

So no, I strongly doubt LF everywhere is the better way ;)

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 20:40           ` Erik Faye-Lund
@ 2010-05-06 22:14             ` hasen j
  2010-05-06 23:25               ` Erik Faye-Lund
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
  1 sibling, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-06 22:14 UTC (permalink / raw)
  To: kusmabite; +Cc: Linus Torvalds, git

>>
>> When I'm on windows, I prefer LF (unless the project already uses
>> CRLF, or it's outside my control).
>>
>
> "When I'm on windows" leads me to believe Windows is not your primary
> operating system. If not, please excuse me.

It used to be; I only moved to linux about a year ago, but I use
windows at work, and I started using git when I was on windows.

> Open source isn't the only model for developing software.

But it's probably the most common scenario where people run into line
ending issues.

If the project is a VS project, then it's probably not multi-platform,
plus everyone at the company would be using windows anyway, so there's
no line-ending issue.

> And again... even if it were, working well together with
> Visual Studio would be very beneficial for quite a few
> projects. Visual Studio is probably the most used code editor among
> Windows developers (by a good margin too, I suspect), so ignoring it
> would just be sticking your head in the sand - or worse, asking for
> fewer contributions from Windows users (which can often be a problem in
> the first place).

The problem can be avoided with a little bit of education. VS is not a
multiplatform IDE anyway.
Sure, it can't work with LF endings as well as notepad++, but it's not
git's responsibility to try to fix that.

I just don't think it's a big enough issue to be built into git.

IMHO it's much better to work around the problem (if and when it
arises) by using clean and smudge filters in .gitattributes, than
having it built in and enabled by default in the msysgit installer.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-06 20:40           ` Erik Faye-Lund
  2010-05-06 22:14             ` hasen j
@ 2010-05-06 22:27             ` Eyvind Bernhardsen
  2010-05-06 22:27               ` [PATCH/RFC 1/3] Add "auto-eol" attribute and "core.eolStyle" config variable Eyvind Bernhardsen
                                 ` (5 more replies)
  1 sibling, 6 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-06 22:27 UTC (permalink / raw)
  To: git; +Cc: hasan.aljudy, kusmabite, torvalds, prohaska, gitster

This discussion couldn't be more timely, as I've recently acquired a
desperate need to solve CRLF problems at $dayjob.  This patch series
introduces a new way of turning on autocrlf normalization by splitting
the configuration into two:

- An attribute called "auto-eol" is set in the repository to turn on
  normalization of line endings.  Since attributes are content, the
  setting is copied when the repository is cloned and can be changed in
  an existing repository (with a few caveats).  Setting this attribute
  is equivalent to setting "core.autocrlf" to "input" or "true".

- A configuration variable called "core.eolStyle" determines which type
  of line endings are used when checking files out to the working
  directory.
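
The names "auto-eol" and "core.eolStyle" are the ones proposed in this
series. As a rough illustration of the same split between an attribute
that travels with the content and a per-user config variable, the
sketch below uses the "text" attribute and "core.eol", which are what
released versions of Git (1.7.2 and later) provide. It is not the code
from these patches, and the file name and commit identity are made up:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config core.autocrlf false          # core.eol is ignored when core.autocrlf is true or input

# Repository side: the attribute is committed and cloned with the content
echo '* text=auto' > .gitattributes
printf 'alpha\nbeta\n' > readme.txt     # hypothetical file with LF endings
git add .gitattributes readme.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m initial

# User side: choose the line endings used in the working directory
git config core.eol crlf
rm readme.txt
git checkout -- readme.txt

# Count lines carrying a CR in the working tree and in the committed blob
worktree_cr=$(tr '\r' Q < readme.txt | grep -c Q || true)
repo_cr=$(git show HEAD:readme.txt | tr '\r' Q | grep -c Q || true)
echo "CR lines in working tree: $worktree_cr, in repository: $repo_cr"
```

The repository keeps LF while the working directory follows the user's
preference, which is the division of responsibilities described in the
two points above.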

How does this solve the current problems with core.autocrlf?  First,
let's enumerate them:


1. Setting core.autocrlf in your global or system configuration is a
pain since git will get confused whenever you work in a repository which
contains CRLF line endings.  If you have to work in both repositories
with normalization and repositories with mixed line endings, you have no
choice but to set core.autocrlf in each repository individually.

2. Setting core.autocrlf in an individual repository would be okay
except that naive users will do it after they have already cloned:
unless core.autocrlf is set globally, the clone will have the wrong line
endings, and the user needs to know how to refresh it manually (rm -rf *
&& git checkout -f).

3. Once somebody does it, _everyone_ has to do it: if someone checks in
a file with CRLFs, that file will cause trouble for everyone who has
autocrlf set.  That someone can be a Linux user who just copied a file
from Windows and didn't think to convert the line endings (BT, DT).

4. Once a repository contains CRLFs autocrlf can never sanely be
enabled; the CRLFs can be normalized in a commit, but there's no way to
say "all commits after this one are normalized, those that came before
were not".

5. On the other hand, setting core.autocrlf means that git no longer
stores your files in their pristine, natural state; if you _know_ that
your repository will never be used by anyone whose EOL preference
differs from your own, it seems wasteful and dangerous to normalize
those line endings.
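
To see whether a repository is already affected by points 3 and 4, one
way is to list the line endings Git has recorded. This sketch assumes a
Git recent enough to have "git ls-files --eol" (2.8 or later, so well
after this message); the file names are made up:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config core.autocrlf false          # nothing is normalized on "git add"
printf 'unix\n' > lf.txt                # hypothetical file with LF endings
printf 'dos\r\n' > crlf.txt             # hypothetical file with CRLF endings
git add lf.txt crlf.txt

# i/ is the line ending stored in the index, w/ the one in the working tree
git ls-files --eol

# Number of tracked files stored with CRLF
crlf_in_index=$(git ls-files --eol | grep -c '^i/crlf' || true)
echo "files stored with CRLF: $crlf_in_index"
```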


I used an attribute to enable line-ending conversion because it seems to
be a good idea to have line ending normalization be a property of the
content rather than the repository's or user's configuration.  "If
anybody wants to clone my repository, they'd better be prepared to
normalize their EOLs".

Which EOLs the user wants to use obviously can't be a part of the
content, so that part is still a configuration variable.

"core.autocrlf" is still available and allows someone working on, say,
git.git from Windows to have CRLFs in their working directory without
requiring any changes to the repository.

For backwards compatibility, "core.autocrlf" overrides "auto-eol" if it
is set, and "core.eolStyle" can be set to "false" to disable conversion
even when "auto-eol" is set. 

For my own part, I'll be implementing this change company-wide shortly.
We have an existing repository with a large body of code that contains a
heady mix of CRLF and LF files, but our newly introduced build system
requires everything to be normalized to CRLF (don't ask).  There's no
sane way of handling this using autocrlf; all developers would have to
know when to set core.autocrlf and remember to set it on every clone, or
even on every checkout.

I hope someone else will find it useful.

Eyvind Bernhardsen (3):
  Add "auto-eol" attribute and "core.eolStyle" config variable
  Add tests for per-repository eol normalization
  Add per-repository eol normalization

 Documentation/config.txt        |   11 ++-
 Documentation/gitattributes.txt |   92 +++++++++++++++++---
 Makefile                        |    3 +
 cache.h                         |   19 ++++
 config.c                        |   16 +++-
 convert.c                       |   48 ++++++++---
 environment.c                   |    1 +
 t/t0025-auto-eol.sh             |  180 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 339 insertions(+), 31 deletions(-)
 create mode 100755 t/t0025-auto-eol.sh

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH/RFC 1/3] Add "auto-eol" attribute and "core.eolStyle" config variable
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
@ 2010-05-06 22:27               ` Eyvind Bernhardsen
  2010-05-06 22:27               ` [PATCH/RFC 2/3] Add tests for per-repository eol normalization Eyvind Bernhardsen
                                 ` (4 subsequent siblings)
  5 siblings, 0 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-06 22:27 UTC (permalink / raw)
  To: git; +Cc: hasan.aljudy, kusmabite, torvalds, prohaska, gitster

Introduce a new attribute called "auto-eol" and a config variable,
"core.eolStyle", which will enable line ending normalisation using the
autocrlf mechanism.

The intent is to enable autocrlf in an alternative way, splitting the
existing "core.autocrlf" config variable into two:

- a per-repository "line endings should be normalised in this
  repository" setting, activated by setting the auto-eol attribute
  (usually on all files in the repository)

- a config variable, "core.eolStyle" which lets the user decide which
  line endings are preferred in the working directory

Possible values for "core.eolStyle" are:

- "lf", meaning that LF line endings are preferred
- "crlf", meaning that CRLF line endings are preferred
- "native" (the default), crlf or lf according to platform
- "false", which disables end-of-line conversion even when auto-eol is
  set

"core.autocrlf" will override auto-eol when set to anything but "false".

Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com>
---
 Makefile      |    3 +++
 cache.h       |   19 +++++++++++++++++++
 config.c      |   16 +++++++++++++++-
 environment.c |    1 +
 4 files changed, 38 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 910f471..419532e 100644
--- a/Makefile
+++ b/Makefile
@@ -224,6 +224,8 @@ all::
 #
 # Define CHECK_HEADER_DEPENDENCIES to check for problems in the hard-coded
 # dependency rules.
+#
+# Define NATIVE_CRLF if your platform uses CRLF for line endings.
 
 GIT-VERSION-FILE: FORCE
 	@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -989,6 +991,7 @@ ifeq ($(uname_S),Windows)
 	NO_CURL = YesPlease
 	NO_PYTHON = YesPlease
 	BLK_SHA1 = YesPlease
+	NATIVE_CRLF = YesPlease
 
 	CC = compat/vcbuild/scripts/clink.pl
 	AR = compat/vcbuild/scripts/lib.pl
diff --git a/cache.h b/cache.h
index 5eb0573..690511e 100644
--- a/cache.h
+++ b/cache.h
@@ -561,6 +561,25 @@ enum safe_crlf {
 
 extern enum safe_crlf safe_crlf;
 
+enum auto_crlf {
+	AUTO_CRLF_FALSE = 0,
+	AUTO_CRLF_TRUE = 1,
+	AUTO_CRLF_INPUT = -1,
+};
+
+enum eol_style {
+	EOL_STYLE_FALSE = AUTO_CRLF_FALSE,
+	EOL_STYLE_CRLF = AUTO_CRLF_TRUE,
+	EOL_STYLE_LF = AUTO_CRLF_INPUT,
+#ifdef NATIVE_CRLF
+	EOL_STYLE_NATIVE = EOL_STYLE_CRLF,
+#else
+	EOL_STYLE_NATIVE = EOL_STYLE_LF,
+#endif
+};
+
+extern enum eol_style eol_style;
+
 enum branch_track {
 	BRANCH_TRACK_UNSPECIFIED = -1,
 	BRANCH_TRACK_NEVER = 0,
diff --git a/config.c b/config.c
index 6963fbe..8a11052 100644
--- a/config.c
+++ b/config.c
@@ -461,7 +461,7 @@ static int git_default_core_config(const char *var, const char *value)
 
 	if (!strcmp(var, "core.autocrlf")) {
 		if (value && !strcasecmp(value, "input")) {
-			auto_crlf = -1;
+			auto_crlf = AUTO_CRLF_INPUT;
 			return 0;
 		}
 		auto_crlf = git_config_bool(var, value);
@@ -477,6 +477,20 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.eolstyle")) {
+		if (value && !strcasecmp(value, "lf"))
+			eol_style = EOL_STYLE_LF;
+		else if (value && !strcasecmp(value, "crlf"))
+			eol_style = EOL_STYLE_CRLF;
+		else if (value && !strcasecmp(value, "native"))
+			eol_style = EOL_STYLE_NATIVE;
+		else if (! git_config_bool(var, value))
+			eol_style = EOL_STYLE_FALSE;
+		else
+			return error("Malformed value for %s", var);
+		return 0;
+	}
+
 	if (!strcmp(var, "core.notesref")) {
 		notes_ref_name = xstrdup(value);
 		return 0;
diff --git a/environment.c b/environment.c
index 876c5e5..05cd1d5 100644
--- a/environment.c
+++ b/environment.c
@@ -40,6 +40,7 @@ const char *editor_program;
 const char *excludes_file;
 int auto_crlf = 0;	/* 1: both ways, -1: only when adding git objects */
 int read_replace_refs = 1;
+enum eol_style eol_style = EOL_STYLE_NATIVE;
 enum safe_crlf safe_crlf = SAFE_CRLF_WARN;
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;
 enum branch_track git_branch_track = BRANCH_TRACK_REMOTE;
-- 
1.7.1.3.gb95c9

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH/RFC 2/3] Add tests for per-repository eol normalization
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
  2010-05-06 22:27               ` [PATCH/RFC 1/3] Add "auto-eol" attribute and "core.eolStyle" config variable Eyvind Bernhardsen
@ 2010-05-06 22:27               ` Eyvind Bernhardsen
  2010-05-06 22:27               ` [PATCH/RFC 3/3] Add " Eyvind Bernhardsen
                                 ` (3 subsequent siblings)
  5 siblings, 0 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-06 22:27 UTC (permalink / raw)
  To: git; +Cc: hasan.aljudy, kusmabite, torvalds, prohaska, gitster


Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com>
---
 t/t0025-auto-eol.sh |  180 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 180 insertions(+), 0 deletions(-)
 create mode 100755 t/t0025-auto-eol.sh

diff --git a/t/t0025-auto-eol.sh b/t/t0025-auto-eol.sh
new file mode 100755
index 0000000..5acee2d
--- /dev/null
+++ b/t/t0025-auto-eol.sh
@@ -0,0 +1,180 @@
+#!/bin/sh
+
+test_description='CRLF conversion'
+
+. ./test-lib.sh
+
+has_cr() {
+	tr '\015' Q <"$1" | grep Q >/dev/null
+}
+
+test_expect_success setup '
+
+	git config core.autocrlf false &&
+
+	for w in Hello world how are you; do echo $w; done >one &&
+	for w in I am very very fine thank you; do echo ${w}Q; done | q_to_cr >two &&
+	git add . &&
+
+	git commit -m initial &&
+
+	one=`git rev-parse HEAD:one` &&
+	two=`git rev-parse HEAD:two` &&
+
+	for w in Some extra lines here; do echo $w; done >>one &&
+	git diff >patch.file &&
+	patched=`git hash-object --stdin <one` &&
+	git read-tree --reset -u HEAD &&
+
+	echo happy.
+'
+
+test_expect_success 'default settings cause no changes' '
+
+	rm -f .gitattributes tmp one two &&
+	git read-tree --reset -u HEAD &&
+
+	if has_cr one || ! has_cr two
+	then
+		echo "Eh? $f"
+		false
+	fi &&
+	onediff=`git diff one` &&
+	twodiff=`git diff two` &&
+	test -z "$onediff" -a -z "$twodiff"
+'
+
+test_expect_success 'no auto-eol, explicit eolstyle=native causes no changes' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.eolstyle native &&
+	git read-tree --reset -u HEAD &&
+
+	if has_cr one || ! has_cr two
+	then
+		echo "Eh? $f"
+		false
+	fi &&
+	onediff=`git diff one` &&
+	twodiff=`git diff two` &&
+	test -z "$onediff" -a -z "$twodiff"
+'
+
+test_expect_failure 'auto-eol=true, eolStyle=crlf <=> autocrlf=true' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.autocrlf false &&
+	git config core.eolstyle crlf &&
+	echo "* auto-eol" > .gitattributes &&
+	git read-tree --reset -u HEAD &&
+	unset missing_cr &&
+
+	for f in one two
+	do
+		if ! has_cr "$f"
+		then
+			echo "Eh? $f"
+			missing_cr=1
+			break
+		fi
+	done &&
+	test -z "$missing_cr"
+'
+
+test_expect_failure 'auto-eol=true, eolStyle=lf <=> autocrlf=input' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.autocrlf false &&
+	git config core.eolstyle lf &&
+	echo "* auto-eol" > .gitattributes &&
+	git read-tree --reset -u HEAD &&
+
+	if has_cr one || ! has_cr two
+	then
+		echo "Eh? $f"
+		false
+	fi &&
+	onediff=`git diff one` &&
+	twodiff=`git diff two` &&
+	test -z "$onediff" -a -n "$twodiff"
+'
+
+test_expect_success 'auto-eol=true, eolStyle=false <=> autocrlf=false' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.autocrlf false &&
+	git config core.eolstyle false &&
+	echo "* auto-eol" > .gitattributes &&
+	git read-tree --reset -u HEAD &&
+
+	if has_cr one || ! has_cr two
+	then
+		echo "Eh? $f"
+		false
+	fi &&
+	onediff=`git diff one` &&
+	twodiff=`git diff two` &&
+	test -z "$onediff" -a -z "$twodiff"
+'
+
+test_expect_success 'autocrlf=true overrides auto-eol=true, eolStyle=lf' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.autocrlf true &&
+	git config core.eolstyle lf &&
+	echo "* auto-eol" > .gitattributes &&
+	git read-tree --reset -u HEAD &&
+	unset missing_cr &&
+
+	for f in one two
+	do
+		if ! has_cr "$f"
+		then
+			echo "Eh? $f"
+			missing_cr=1
+			break
+		fi
+	done &&
+	test -z "$missing_cr"
+'
+
+test_expect_success 'autocrlf=input overrides auto-eol=true, eolStyle=crlf' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.autocrlf input &&
+	git config core.eolstyle crlf &&
+	echo "* auto-eol" > .gitattributes &&
+	git read-tree --reset -u HEAD &&
+
+	if has_cr one || ! has_cr two
+	then
+		echo "Eh? $f"
+		false
+	fi &&
+	onediff=`git diff one` &&
+	twodiff=`git diff two` &&
+	test -z "$onediff" -a -n "$twodiff"
+'
+
+test_expect_success 'autocrlf=true overrides auto-eol=true, eolStyle=false' '
+
+	rm -f .gitattributes tmp one two &&
+	git config core.autocrlf true &&
+	git config core.eolstyle false &&
+	echo "* auto-eol" > .gitattributes &&
+	git read-tree --reset -u HEAD &&
+	unset missing_cr &&
+
+	for f in one two
+	do
+		if ! has_cr "$f"
+		then
+			echo "Eh? $f"
+			missing_cr=1
+			break
+		fi
+	done &&
+	test -z "$missing_cr"
+'
+
+test_done
-- 
1.7.1.3.gb95c9

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH/RFC 3/3] Add per-repository eol normalization
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
  2010-05-06 22:27               ` [PATCH/RFC 1/3] Add "auto-eol" attribute and "core.eolStyle" config variable Eyvind Bernhardsen
  2010-05-06 22:27               ` [PATCH/RFC 2/3] Add tests for per-repository eol normalization Eyvind Bernhardsen
@ 2010-05-06 22:27               ` Eyvind Bernhardsen
  2010-05-06 23:38               ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Avery Pennarun
                                 ` (2 subsequent siblings)
  5 siblings, 0 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-06 22:27 UTC (permalink / raw)
  To: git; +Cc: hasan.aljudy, kusmabite, torvalds, prohaska, gitster

Implement an alternative end-of-line conversion setting which uses a new
attribute, "auto-eol", and a new config variable, "core.eolStyle", to
enable end-of-line conversion.

The auto-eol attribute enables automatic line ending detection and
conversion for files on which it is set.  Since attributes are under
version control, this setting is copied when the repository is cloned.
It can also be changed over the history of a repository, with some
caveats.

The core.eolStyle variable is used to decide if LF or CRLF line endings
are preferred in the working directory.  It is only used when auto-eol
is set, and defaults to the platform-native line ending.

"core.autocrlf" overrides auto-eol when set to anything but "false".

Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com>
---
 Documentation/config.txt        |   11 ++++-
 Documentation/gitattributes.txt |   92 +++++++++++++++++++++++++++++++++------
 convert.c                       |   48 ++++++++++++++------
 t/t0025-auto-eol.sh             |    4 +-
 4 files changed, 123 insertions(+), 32 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 92f851e..7bbf8a0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -207,9 +207,16 @@ core.autocrlf::
 	the file's `crlf` attribute, or if `crlf` is unspecified,
 	based on the file's contents.  See linkgit:gitattributes[5].
 
+core.eolStyle::
+	Sets the line ending type to use for text files in the working
+	directory when the `auto-eol` property is set.  Alternatives are
+	'lf', 'crlf', 'native' and 'false'.  'native', the default, uses
+	the platform's native line ending.  'false' disables `auto-eol`
+	line ending conversion.  See linkgit:gitattributes[5].
+
 core.safecrlf::
-	If true, makes git check if converting `CRLF` as controlled by
-	`core.autocrlf` is reversible.  Git will verify if a command
+	If true, makes git check if converting `CRLF` is reversible when
+	end-of-line conversion is active.  Git will verify if a command
 	modifies a file in the work tree either directly or indirectly.
 	For example, committing a file followed by checking out the
 	same file should yield the original file in the work tree.  If
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index d892e64..1c52ae9 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -92,6 +92,46 @@ such as 'git checkout' and 'git merge' run.  They also affect how
 git stores the contents you prepare in the working tree in the
 repository upon 'git add' and 'git commit'.
 
+`auto-eol`
+^^^^^^^^^^
+
+This attribute enables automatic end-of-line conversion (see below).
+When `auto-eol` is used, it should in most cases be set for all files in
+the repository.
+
+Set::
+
+	Setting the `auto-eol` attribute turns on automatic
+	conversion of line endings.  When `auto-eol` is set,
+	line endings are converted to LF on checkin, and if
+	`core.eolStyle` is set to "crlf", line endings are
+	also converted to CRLF on checkout.
+
+Unset::
+
+	No line-ending conversion is performed.
+
+NOTE: When committing a change that sets this attribute in an existing
+repository, line endings should be normalized as part of the same
+commit.  From a clean working directory:
+
+-------------------------------------------------
+$ echo "* auto-eol" >.gitattributes
+$ rm .git/index     # Remove the index to force git to
+$ git reset         # re-scan the working directory
+$ git status        # Show files that will be normalized
+$ git add -u
+$ git add .gitattributes
+$ git commit -m "Introduce end-of-line normalization"
+-------------------------------------------------
+
+If any files that should not be normalized show up in 'git status',
+unset their `crlf` attribute in `.gitattributes` before 'git add -u'.
+
+`core.autocrlf` overrides `auto-eol` if set to "true" or "input".
+Setting `core.eolStyle` to "false" prevents line ending conversion even
+when `auto-eol` is set.
+
 `crlf`
 ^^^^^^
 
@@ -100,7 +140,7 @@ This attribute controls the line-ending convention.
 Set::
 
 	Setting the `crlf` attribute on a path is meant to mark
-	the path as a "text" file.  'core.autocrlf' conversion
+	the path as a "text" file.  End-of-line conversion
 	takes place without guessing the content type by
 	inspection.
 
@@ -111,8 +151,8 @@ Unset::
 
 Unspecified::
 
-	Unspecified `crlf` attribute tells git to apply the
-	`core.autocrlf` conversion when the file content looks
+	Unspecified `crlf` attribute tells git to apply
+	end-of-line conversion when the file content looks
 	like text.
 
 Set to string value "input"::
@@ -125,20 +165,44 @@ Any other value set to `crlf` attribute is ignored and git acts
 as if the attribute is left unspecified.
 
 
-The `core.autocrlf` conversion
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-If the configuration variable `core.autocrlf` is false, no
-conversion is done.
-
-When `core.autocrlf` is true, it means that the platform wants
-CRLF line endings for files in the working tree, and you want to
-convert them back to the normal LF line endings when checking
-in to the repository.
+End-of-line conversion
+^^^^^^^^^^^^^^^^^^^^^^
 
-When `core.autocrlf` is set to "input", line endings are
-converted to LF upon checkin, but there is no conversion done
-upon checkout.
+While git normally leaves file contents alone, it can be configured to
+normalize line endings to LF in the repository and, optionally, to
+convert them to CRLF when files are checked out.  Binary files are
+detected automatically and will not be modified; this detection can be
+overridden with the `crlf` attribute.
+
+NOTE: This conversion requires the repository to be free of text files
+containing CRLFs.  When it is enabled on an existing repository, the
+index should be rebuilt to find any such files, and these files should
+either have their `crlf` attribute set to false ("-crlf"), or they
+should be checked in to the repository in normalized form.
+
+End-of-line conversion is controlled by the configuration variables
+`core.eolStyle` and `core.autocrlf` and the attributes `auto-eol` and
+`crlf`.
+
+When a repository is shared between users on platforms with different
+end-of-line conventions, using the `auto-eol` mechanism is probably the
+best choice.  A developer on a minority platform sharing a repository
+with a large group of users on an LF-native platform would want to set
+`core.autocrlf` instead.
+
+End-of-line conversion is enabled as follows:
+
+- If the attribute `auto-eol` is not set and the configuration variable
+  `core.autocrlf` is false, no conversion is done.
+
+- When the `auto-eol` attribute is set, or `core.autocrlf` is true or
+  "input", line endings are normalized as files are checked in to the
+  repository.
+
+- When the `auto-eol` attribute is set and `core.eolStyle` is "crlf", or
+  `core.autocrlf` is true, line endings in the repository are normalized
+  and will be converted to CRLF when files are checked out to the
+  working tree.
 
 If `core.safecrlf` is set to "true" or "warn", git verifies if
 the conversion is reversible for the current setting of
diff --git a/convert.c b/convert.c
index 4f8fcb7..f0f59e3 100644
--- a/convert.c
+++ b/convert.c
@@ -90,12 +90,13 @@ static int is_binary(unsigned long size, struct text_stat *stats)
 }
 
 static void check_safe_crlf(const char *path, int action,
-                            struct text_stat *stats, enum safe_crlf checksafe)
+			    struct text_stat *stats, enum safe_crlf checksafe,
+			    int eol_conversion)
 {
 	if (!checksafe)
 		return;
 
-	if (action == CRLF_INPUT || auto_crlf <= 0) {
+	if (action == CRLF_INPUT || eol_conversion <= 0) {
 		/*
 		 * CRLFs would not be restored by checkout:
 		 * check if we'd remove CRLFs
@@ -106,7 +107,7 @@ static void check_safe_crlf(const char *path, int action,
 			else /* i.e. SAFE_CRLF_FAIL */
 				die("CRLF would be replaced by LF in %s.", path);
 		}
-	} else if (auto_crlf > 0) {
+	} else if (eol_conversion > 0) {
 		/*
 		 * CRLFs would be added by checkout:
 		 * check if we have "naked" LFs
@@ -121,12 +122,13 @@ static void check_safe_crlf(const char *path, int action,
 }
 
 static int crlf_to_git(const char *path, const char *src, size_t len,
-                       struct strbuf *buf, int action, enum safe_crlf checksafe)
+		       struct strbuf *buf, int action, enum safe_crlf checksafe,
+		       int eol_conversion)
 {
 	struct text_stat stats;
 	char *dst;
 
-	if ((action == CRLF_BINARY) || !auto_crlf || !len)
+	if ((action == CRLF_BINARY) || !eol_conversion || !len)
 		return 0;
 
 	gather_stats(src, len, &stats);
@@ -147,7 +149,7 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 			return 0;
 	}
 
-	check_safe_crlf(path, action, &stats, checksafe);
+	check_safe_crlf(path, action, &stats, checksafe, eol_conversion);
 
 	/* Optimization: No CR? Nothing to convert, regardless. */
 	if (!stats.cr)
@@ -180,13 +182,13 @@ static int crlf_to_git(const char *path, const char *src, size_t len,
 }
 
 static int crlf_to_worktree(const char *path, const char *src, size_t len,
-                            struct strbuf *buf, int action)
+			    struct strbuf *buf, int action, int eol_conversion)
 {
 	char *to_free = NULL;
 	struct text_stat stats;
 
 	if ((action == CRLF_BINARY) || (action == CRLF_INPUT) ||
-	    auto_crlf <= 0)
+	    eol_conversion <= 0)
 		return 0;
 
 	if (!len)
@@ -377,17 +379,31 @@ static void setup_convert_check(struct git_attr_check *check)
 	static struct git_attr *attr_crlf;
 	static struct git_attr *attr_ident;
 	static struct git_attr *attr_filter;
+	static struct git_attr *attr_auto_eol;
 
 	if (!attr_crlf) {
 		attr_crlf = git_attr("crlf");
 		attr_ident = git_attr("ident");
 		attr_filter = git_attr("filter");
+		attr_auto_eol = git_attr("auto-eol");
 		user_convert_tail = &user_convert;
 		git_config(read_convert_config, NULL);
 	}
 	check[0].attr = attr_crlf;
 	check[1].attr = attr_ident;
 	check[2].attr = attr_filter;
+	check[3].attr = attr_auto_eol;
+}
+
+static int choose_eol_conversion(int auto_eol)
+{
+	if (auto_crlf)
+		return auto_crlf;
+
+	if (auto_eol)
+		return eol_style;
+
+	return 0;
 }
 
 static int count_ident(const char *cp, unsigned long size)
@@ -571,9 +587,9 @@ static int git_path_check_ident(const char *path, struct git_attr_check *check)
 int convert_to_git(const char *path, const char *src, size_t len,
                    struct strbuf *dst, enum safe_crlf checksafe)
 {
-	struct git_attr_check check[3];
+	struct git_attr_check check[4];
 	int crlf = CRLF_GUESS;
-	int ident = 0, ret = 0;
+	int ident = 0, ret = 0, auto_eol = 0;
 	const char *filter = NULL;
 
 	setup_convert_check(check);
@@ -584,6 +600,7 @@ int convert_to_git(const char *path, const char *src, size_t len,
 		drv = git_path_check_convert(path, check + 2);
 		if (drv && drv->clean)
 			filter = drv->clean;
+		auto_eol = git_path_check_ident(path, check + 3);
 	}
 
 	ret |= apply_filter(path, src, len, dst, filter);
@@ -591,7 +608,8 @@ int convert_to_git(const char *path, const char *src, size_t len,
 		src = dst->buf;
 		len = dst->len;
 	}
-	ret |= crlf_to_git(path, src, len, dst, crlf, checksafe);
+	ret |= crlf_to_git(path, src, len, dst, crlf, checksafe,
+		choose_eol_conversion(auto_eol));
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
@@ -601,9 +619,9 @@ int convert_to_git(const char *path, const char *src, size_t len,
 
 int convert_to_working_tree(const char *path, const char *src, size_t len, struct strbuf *dst)
 {
-	struct git_attr_check check[3];
+	struct git_attr_check check[4];
 	int crlf = CRLF_GUESS;
-	int ident = 0, ret = 0;
+	int ident = 0, ret = 0, auto_eol = 0;
 	const char *filter = NULL;
 
 	setup_convert_check(check);
@@ -614,6 +632,7 @@ int convert_to_working_tree(const char *path, const char *src, size_t len, struc
 		drv = git_path_check_convert(path, check + 2);
 		if (drv && drv->smudge)
 			filter = drv->smudge;
+		auto_eol = git_path_check_ident(path, check + 3);
 	}
 
 	ret |= ident_to_worktree(path, src, len, dst, ident);
@@ -621,7 +640,8 @@ int convert_to_working_tree(const char *path, const char *src, size_t len, struc
 		src = dst->buf;
 		len = dst->len;
 	}
-	ret |= crlf_to_worktree(path, src, len, dst, crlf);
+	ret |= crlf_to_worktree(path, src, len, dst, crlf,
+		choose_eol_conversion(auto_eol));
 	if (ret) {
 		src = dst->buf;
 		len = dst->len;
diff --git a/t/t0025-auto-eol.sh b/t/t0025-auto-eol.sh
index 5acee2d..5195885 100755
--- a/t/t0025-auto-eol.sh
+++ b/t/t0025-auto-eol.sh
@@ -60,7 +60,7 @@ test_expect_success 'no auto-eol, explicit eolstyle=native causes no changes' '
 	test -z "$onediff" -a -z "$twodiff"
 '
 
-test_expect_failure 'auto-eol=true, eolStyle=crlf <=> autocrlf=true' '
+test_expect_success 'auto-eol=true, eolStyle=crlf <=> autocrlf=true' '
 
 	rm -f .gitattributes tmp one two &&
 	git config core.autocrlf false &&
@@ -81,7 +81,7 @@ test_expect_failure 'auto-eol=true, eolStyle=crlf <=> autocrlf=true' '
 	test -z "$missing_cr"
 '
 
-test_expect_failure 'auto-eol=true, eolStyle=lf <=> autocrlf=input' '
+test_expect_success 'auto-eol=true, eolStyle=lf <=> autocrlf=input' '
 
 	rm -f .gitattributes tmp one two &&
 	git config core.autocrlf false &&
-- 
1.7.1.3.gb95c9

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 22:14             ` hasen j
@ 2010-05-06 23:25               ` Erik Faye-Lund
  2010-05-18 15:13                 ` Anthony W. Youngman
  0 siblings, 1 reply; 82+ messages in thread
From: Erik Faye-Lund @ 2010-05-06 23:25 UTC (permalink / raw)
  To: hasen j; +Cc: Linus Torvalds, git

On Fri, May 7, 2010 at 12:14 AM, hasen j <hasan.aljudy@gmail.com> wrote:
>>>
>>> When I'm on windows, I prefer LF (unless the project already uses
>>> CRLF, or it's outside my control).
>>>
>>
>> "When I'm on windows" leads me to believe Windows is not your primary
>> operating system. If not, please excuse me.
>
> I used to be, I only moved to linux about a year ago, but I use
> windows at work, and I started using git when I was on windows.
>

OK, I'm sorry for assuming some Windows-ignorance.

>> Open source isn't the only model for developing software.
>
> But it's probably the most common scenario where people run into line
> ending issues.
>

Closed source does not imply a single operating system, and you get
these issues whenever a project targets systems with different newline
styles. In my day job I develop closed source, multi-platform software
using git. So it's certainly not MY most common scenario.

And even if it were, so what? When did we start only caring for the
most common case?

> If the project is a VS project, then it's probably not multi-platform,
> plus everyone at the company would be using windows anyway, so there's
> no line-ending issue.
>

Using VS on Windows does not exclude other platforms either. Either
one can maintain multiple build-systems for Windows and Unix-y
systems, or one can use a system like CMake that automates the job.

A typical case where you pretty much have to build using Visual Studio
is when you develop a C++ library whose Windows users use Visual
Studio (due to C++'s symbol mangling you have to use the same
compiler). This is not an entirely uncommon situation for open source
software.

>> And again... even if it were, working well together with
>> visual studio support would be very beneficial for quite a bit of
>> projects. Visual Studio is probably the most used code-editor among
>> Windows-developers (with a good margin too, I suspect), so ignoring it
>> is would just be sticking your head in the sand - or worse, asking for
>> less contributions from Windows-users (which can often be a problem in
>> the first place).
>
> The problem can be avoided with a little bit of education. VS is not a
> multiplatform IDE anyway
> Sure, it can't work with LF endings as well as notepad++, but it's not
> git's responsibility to try to fix that.

Again, using VS on Windows does not exclude other platforms. I'm not
sure what you mean by "a little bit of education" here, though.

CRLF is Windows' native newline style. If git can't check out to that,
it'll look like a much less attractive solution than the competition
to anybody who targets Windows. If it weren't for core.autocrlf, I
would never have switched myself.

>
> I just don't think it's a big enough issue to be built into git.
>
> IMHO it's much better to work around the problem (if and when it
> arises) by using clean and smudge filters in .gitattributes, than
> having it built in and enabled by default in the msysgit installer.
>

But it IS built in. And it's very unlikely that this feature will ever
be removed. So what's the problem with using it?

And it's a very common thing to want to do, so why make everybody who
does have to jump through hoops just because YOU don't need it?

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
                                 ` (2 preceding siblings ...)
  2010-05-06 22:27               ` [PATCH/RFC 3/3] Add " Eyvind Bernhardsen
@ 2010-05-06 23:38               ` Avery Pennarun
  2010-05-06 23:54                 ` Avery Pennarun
  2010-05-07  8:45               ` Erik Faye-Lund
  2010-05-07 16:33               ` Junio C Hamano
  5 siblings, 1 reply; 82+ messages in thread
From: Avery Pennarun @ 2010-05-06 23:38 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: git, hasan.aljudy, kusmabite, torvalds, prohaska, gitster

On Thu, May 6, 2010 at 6:27 PM, Eyvind Bernhardsen
<eyvind.bernhardsen@gmail.com> wrote:
> - An attribute called "auto-eol" is set in the repository to turn on
>  normalization of line endings.  Since attributes are content, the
>  setting is copied when the repository is cloned and can be changed in
>  an existing repository (with a few caveats).  Setting this attribute
>  is equivalent to setting "core.autocrlf" to "input" or "true".
>
> - A configuration variable called "core.eolStyle" determines which type
>  of line endings are used when checking files out to the working
>  directory.

I definitely like this.  The existing core.autocrlf setting does cause
a lot of confusion for precisely the reason you stated: people often
forget to set it until *after* they've checked out the repo, at which
time all the files are already checked out wrong and total confusion
ensues.

Being able to globally set my preferred eol style in one place, but
only have it take effect on projects (and individual files in that
project) that we already know have eol constraints, would be
wonderful.

Of course this new feature would be in addition to the existing
core.autocrlf setting, not replacing it.
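
As a rough illustration of how the two mechanisms combine, here is a
shell sketch of the precedence that choose_eol_conversion() in patch 3/3
appears to implement.  The shell form, the argument names and the NATIVE
parameter (a stand-in for the NATIVE_CRLF build flag) are assumptions
made for this example and are not part of the series:

```shell
# choose_eol_conversion AUTOCRLF AUTO_EOL EOLSTYLE [NATIVE]
#   AUTOCRLF: true | input | false        (core.autocrlf)
#   AUTO_EOL: set | unset                 (auto-eol attribute of the path)
#   EOLSTYLE: lf | crlf | native | false  (core.eolStyle)
#   NATIVE:   lf | crlf  (hypothetical stand-in for NATIVE_CRLF, default lf)
# Prints one of:
#   crlf  normalize to LF on checkin, convert to CRLF on checkout
#   lf    normalize to LF on checkin only
#   none  no conversion
choose_eol_conversion() {
	autocrlf=$1
	auto_eol=$2
	eolstyle=$3
	native=${4:-lf}

	# core.autocrlf wins whenever it is anything but "false".
	case "$autocrlf" in
	true)
		echo crlf
		return ;;
	input)
		echo lf
		return ;;
	esac

	# Without the attribute, core.eolStyle is never consulted.
	if test "$auto_eol" != set
	then
		echo none
		return
	fi

	case "$eolstyle" in
	native)
		echo "$native" ;;
	crlf|lf)
		echo "$eolstyle" ;;
	*)
		echo none ;;
	esac
}
```

For example, "choose_eol_conversion false set crlf" corresponds to the
'auto-eol=true, eolStyle=crlf' case in t0025, and
"choose_eol_conversion true set lf" to the 'autocrlf=true overrides'
case.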

This would definitely help our Windows users at work.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-06 23:38               ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Avery Pennarun
@ 2010-05-06 23:54                 ` Avery Pennarun
  0 siblings, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-06 23:54 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: git, hasan.aljudy, kusmabite, torvalds, prohaska, gitster

On Thu, May 6, 2010 at 7:38 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> I definitely like this.  The existing core.autocrlf setting does cause
> a lot of confusion for precisely the reason you stated: people often
> forget to set it until *after* they've checked out the repo, at which
> time all the files are already checked out wrong and total confusion
> ensues.

Oh, just to clarify the rationale a bit more:

Whether a developer wants autocrlf or not actually is
project-dependent, not user-dependent or "all Windows users want
autocrlf."  For example, if I'm running Cygwin and I check out a copy
of the git source code to build with Cygwin gcc, I definitely don't
want autocrlf.  (Actually, almost always, for C source code I don't
want autocrlf, or I want autocrlf=input.)

If I'm checking out a copy of our Delphi project on Windows, though, I
need autocrlf or the IDE goes bananas.  And our team would be happy to
put the right magic incantation in a .gitattributes file in our Delphi
project if it would make this work out automatically.

Setting core.autocrlf on one of our Windows developers' systems can't
cover both of those cases automatically, whereas the settings Eyvind
has proposed would solve our problem.
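
Assuming the series were applied as posted (the auto-eol attribute and
core.eolStyle exist only in these patches, not in any released git),
the setup might look roughly like this:

```shell
# In the Delphi project, committed once so that every clone picks it up:
echo "* auto-eol" >.gitattributes
git add .gitattributes

# Once per Windows/Cygwin machine, to ask for CRLF in the working tree
# of any project that sets auto-eol:
git config --global core.eolStyle crlf
```

A checkout of the git source code, which carries no auto-eol attribute,
would then stay untouched on the same machine.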

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: What should be the CRLF policy when win + Linux?
  2010-05-05 10:01 What should be the CRLF policy when win + Linux? mat
  2010-05-05 13:27 ` Ramkumar Ramachandra
  2010-05-06  2:35 ` hasen j
@ 2010-05-07  7:15 ` Gelonida
  2 siblings, 0 replies; 82+ messages in thread
From: Gelonida @ 2010-05-07  7:15 UTC (permalink / raw)
  To: git

I'm not convinced that a single policy is a good solution; it really
depends on your project.


What we do:
.bat files with windows line endings
.cmd .vbs files with windows line endings
.sh files with unix line endings
 source files (.c .h .py .pl) with unix line endings
.txt files with unix line endings

The rest is left untouched.
You might add a pre-commit hook to verify this.
So far we didn't bother to automate it, but I must admit that we had
occasional rare hiccups.

bye


N


mat wrote:
> Hi
> 
> I have two git projects:
> -one (A) with linux people only
> -one (B) with someone using windows
> 
> As we had "end of line" problems with the person using windows (B), I used:
> 
> git config --global core.autocrlf true
> 
> Following advices from:
> http://help.github.com/dealing-with-lineendings/
> 
> So everything now if fine with project B, but now some problems using
> project (A): I wanted to copy the whole project file to another dir, and
> now it is complaining about the change, signaling warning:
> 
> CRLF will be replaced by LF in .../A.
> 
> So I don't know exactly what I should do...Should I change all the CRLF
> from project A, but people will have also problems, or can I switch the
> config, once I'm using project A and B? It is not so clear in my mind
> and I would appreciate any advice!!
> 
> Thanks a lot
> 
> Matthieu Stigler
> 
> 
> 

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
                                 ` (3 preceding siblings ...)
  2010-05-06 23:38               ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Avery Pennarun
@ 2010-05-07  8:45               ` Erik Faye-Lund
  2010-05-07 16:33               ` Junio C Hamano
  5 siblings, 0 replies; 82+ messages in thread
From: Erik Faye-Lund @ 2010-05-07  8:45 UTC (permalink / raw)
  To: Eyvind Bernhardsen; +Cc: git, hasan.aljudy, torvalds, prohaska, gitster

On Fri, May 7, 2010 at 12:27 AM, Eyvind Bernhardsen
<eyvind.bernhardsen@gmail.com> wrote:
> This discussion couldn't be more timely, as I've recently acquired a
> desperate need to solve CRLF problems at $dayjob.  This patch series
> introduces a new way of turning on autocrlf normalization by splitting
> the configuration into two:
>
> - An attribute called "auto-eol" is set in the repository to turn on
>  normalization of line endings.  Since attributes are content, the
>  setting is copied when the repository is cloned and can be changed in
>  an existing repository (with a few caveats).  Setting this attribute
>  is equivalent to setting "core.autocrlf" to "input" or "true".
>
> - A configuration variable called "core.eolStyle" determines which type
>  of line endings are used when checking files out to the working
>  directory.
>

Beautiful! This approach addresses most (all?) issues I've had with
core.autocrlf in a very elegant way IMO! :)

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
                                 ` (4 preceding siblings ...)
  2010-05-07  8:45               ` Erik Faye-Lund
@ 2010-05-07 16:33               ` Junio C Hamano
  2010-05-07 16:57                 ` Avery Pennarun
                                   ` (3 more replies)
  5 siblings, 4 replies; 82+ messages in thread
From: Junio C Hamano @ 2010-05-07 16:33 UTC (permalink / raw)
  To: Eyvind Bernhardsen; +Cc: git, hasan.aljudy, kusmabite, torvalds, prohaska

Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com> writes:

> - An attribute called "auto-eol" is set in the repository to turn on
>   normalization of line endings.  Since attributes are content, the
>   setting is copied when the repository is cloned and can be changed in
>   an existing repository (with a few caveats).  Setting this attribute
>   is equivalent to setting "core.autocrlf" to "input" or "true".

In what way is this attribute different from existing "crlf" attribute?

It feels as if this series is fixing shortcomings of the combination of
core.autocrlf configuration and crlf attribute while trying very hard to
keep their shortcomings when the user doesn't say so.  What is the
downside of making the existing "core.autocrlf" + "crlf" combination do
what your patch wanted to do without retaining this "keep the existing
shortcomings for backward compatibility"?

> 1. Setting core.autocrlf in your global or system configuration is a
> pain

This is a wrong thing to do to begin with, and not worth discussing.  You
know and your readers know that line ending convention in the repository
data (i.e. blobs) is under project control while line ending convention in
the working tree is end user preference.

> 2. Setting core.autocrlf in an individual repository would be okay
> except that naive users will do it after they have already cloned:
> unless core.autocrlf is set globally, the clone will have the wrong line
> endings, and the user needs to know how to refresh it manually (rm -rf *
> && git checkout -f).

This may be a worthy goal.  But if an "auto-eol" attribute "fixes" this,
perhaps "crlf" attribute can be taught to fix it the same way, no?

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 16:33               ` Junio C Hamano
@ 2010-05-07 16:57                 ` Avery Pennarun
  2010-05-07 17:10                 ` Linus Torvalds
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 16:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eyvind Bernhardsen, git, hasan.aljudy, kusmabite, torvalds, prohaska

On Fri, May 7, 2010 at 12:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com> writes:
>> - An attribute called "auto-eol" is set in the repository to turn on
>>   normalization of line endings.  Since attributes are content, the
>>   setting is copied when the repository is cloned and can be changed in
>>   an existing repository (with a few caveats).  Setting this attribute
>>   is equivalent to setting "core.autocrlf" to "input" or "true".
>
> In what way is this attribute different from existing "crlf" attribute?

Mostly that it relates to the new core.eolStyle config option instead
of core.autocrlf.  Arguably you could use the same gitattribute to set
both config options, but I don't know how you'd make that respond in a
sane backwards-compatible fashion.

> It feels as if this series is fixing shortcomings of the combination of
> core.autocrlf configuration and crlf attribute while trying very hard to
> keep their shortcomings when the user doesn't say so.  What is the
> downside of making the existing "core.autocrlf" + "crlf" combination do
> what your patch wanted to do without retaining this "keep the existing
> shortcomings for backward compatibility"?

Is this even possible?  If core.autocrlf is set, then files all over
the place start getting crlf conversion, even if no attributes are set
at all.  If core.eolStyle is set, only files with the auto-eol
attribute set appropriately will experience any conversion.

Maybe the options aren't named ideally.  "core.eolStyle" might better
be named "core.nativeEol" - it tells git what the native EOL style is
on your computer / in this repository, but it doesn't tell git to *do*
anything with this information.  The problem with core.autocrlf is
that it mixes two concepts: identifying your native EOL style, and
telling git to do stuff.  The existing gitattribute can then tell git
*not* to do stuff, but almost no projects have a .gitattributes file
that does this.

>> 1. Setting core.autocrlf in your global or system configuration is a
>> pain
>
> This is a wrong thing to do to begin with, and not worth discussing.

Ha, doesn't msysgit do this by default?  It did at one point, anyway.
I use cygwin git (which doesn't because it thinks it's Unix) so I
don't know.

If this was ever the default behaviour, then it's at least not
*obviously* wrong.

The end result is that nobody really likes the current autocrlf
behaviour, though, so I'd agree that it *ends up* being wrong.  Just
as setting it on a per-checkout basis also ends up being wrong,
because it's so easy to forget.

> You
> know and your readers know that line ending convention in the repository
> data (i.e. blobs) is under project control while line ending convention in
> the working tree is end user preference.

Yes.  But the current system doesn't make it very easy to state your preference.

>> 2. Setting core.autocrlf in an individual repository would be okay
>> except that naive users will do it after they have already cloned:
>> unless core.autocrlf is set globally, the clone will have the wrong line
>> endings, and the user needs to know how to refresh it manually (rm -rf *
>> && git checkout -f).
>
> This may be a worthy goal.  But if a "auto-eol" attribute "fixes" this,
> perhaps "crlf" attribute can be taught to fix it the same way, no?

It fixes it by making the global setting actually do what people want.
I'm not sure the existing config option can be made to work like
that.

Again, maybe it would make sense to combine a single attribute but
have two config options (and people can eventually just stop using
core.autocrlf altogether).  I suspect it might subtly break some
existing projects, though.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 16:33               ` Junio C Hamano
  2010-05-07 16:57                 ` Avery Pennarun
@ 2010-05-07 17:10                 ` Linus Torvalds
  2010-05-07 19:02                   ` Linus Torvalds
                                     ` (2 more replies)
  2010-05-07 19:41                 ` Finn Arne Gangstad
  2010-05-07 20:11                 ` Eyvind Bernhardsen
  3 siblings, 3 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 17:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eyvind Bernhardsen, git, hasan.aljudy, kusmabite, prohaska



On Fri, 7 May 2010, Junio C Hamano wrote:

> Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com> writes:
> 
> > - An attribute called "auto-eol" is set in the repository to turn on
> >   normalization of line endings.  Since attributes are content, the
> >   setting is copied when the repository is cloned and can be changed in
> >   an existing repository (with a few caveats).  Setting this attribute
> >   is equivalent to setting "core.autocrlf" to "input" or "true".
> 
> In what way is this attribute different from existing "crlf" attribute?

The existing crlf attribute is a no-op _unless_ core.autocrlf is set, 
isn't it?

The whole point of Eyvind's series is to be able to set crlf attributes 
without having to set the config option - because he wants to make sure 
that a new clone always gets the proper crlf handling without users 
having to do anything extra.

And I do have to say that it makes sense.

I also do think that maybe we could just change the existing crlf 
attribute to work even without 'core.autocrlf'. 

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 17:10                 ` Linus Torvalds
@ 2010-05-07 19:02                   ` Linus Torvalds
  2010-05-07 19:11                     ` Avery Pennarun
  2010-05-07 19:31                     ` Nicolas Pitre
  2010-05-07 19:06                   ` Junio C Hamano
  2010-05-07 19:25                   ` Eyvind Bernhardsen
  2 siblings, 2 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 19:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eyvind Bernhardsen, git, hasan.aljudy, kusmabite, prohaska



On Fri, 7 May 2010, Linus Torvalds wrote:
> 
> I also do think that maybe we could just change the existing crlf 
> attribute to work even without 'core.autocrlf'. 

Btw, another option might be to start searching ".gitconfig", but only 
allow a certain "safe subset" of config options in that. Things that can 
really be about the project itself, and not per-user or per-repository.

And parse it before ~/.gitconfig and .git/config, so that people can 
always override it.

I dunno. Looking at the config options, there really aren't a lot of them 
that make sense on a project scale. There's a few, though. Things like

	core.autocrlf
	i18n.commitEncoding

and possibly others..
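
To make the precedence idea concrete, here is a minimal Python sketch of
an in-tree config that is filtered to a "safe subset" and read first, so
that ~/.gitconfig and .git/config can always override it.  Everything in
it is hypothetical: git has no such file handling, and the names
SAFE_PROJECT_KEYS and effective_config are invented for illustration.

```python
# Hypothetical sketch only: git does not read an in-tree .gitconfig,
# and none of these names exist in git itself.

# Options that arguably describe the project rather than the user.
# Config section and variable names are case-insensitive, so compare
# them in lower case.
SAFE_PROJECT_KEYS = {"core.autocrlf", "i18n.commitencoding"}


def effective_config(project, user_global, repo_local):
    """Merge three config layers; later layers override earlier ones."""
    merged = {}

    # The in-tree file is read first and filtered to the safe subset,
    # so a cloned project cannot set anything dangerous.
    for key, value in project.items():
        if key.lower() in SAFE_PROJECT_KEYS:
            merged[key.lower()] = value

    # ~/.gitconfig and then .git/config are read afterwards, so the
    # user can always override what the project asked for.
    for layer in (user_global, repo_local):
        for key, value in layer.items():
            merged[key.lower()] = value

    return merged
```

With a project layer of {"core.autocrlf": "true"} and a .git/config
layer of {"core.autocrlf": "input"}, the merged value is "input"; any
project key outside the safe subset is simply dropped.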

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 17:10                 ` Linus Torvalds
  2010-05-07 19:02                   ` Linus Torvalds
@ 2010-05-07 19:06                   ` Junio C Hamano
  2010-05-07 19:25                   ` Eyvind Bernhardsen
  2 siblings, 0 replies; 82+ messages in thread
From: Junio C Hamano @ 2010-05-07 19:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Eyvind Bernhardsen, git, hasan.aljudy, kusmabite, prohaska

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Fri, 7 May 2010, Junio C Hamano wrote:
>
>> Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com> writes:
>> 
>> > - An attribute called "auto-eol" is set in the repository to turn on
>> >   normalization of line endings.  Since attributes are content, the
>> >   setting is copied when the repository is cloned and can be changed in
>> >   an existing repository (with a few caveats).  Setting this attribute
>> >   is equivalent to setting "core.autocrlf" to "input" or "true".
>> 
>> In what way is this attribute different from existing "crlf" attribute?
>
> The existing crlf attribute is a no-op _unless_ core.autocrlf is set, 
> isn't it?
>
> The whole point of Eyvind's series is to be able to set crlf attributes 
> without having to set the config option - because he wants to make sure 
> that a new clone always gets the proper crlf handling without users 
> having to do anything extra.
>
> And I do have to say that it makes sense.
>
> I also do think that maybe we could just change the existing crlf 
> attribute to work even without 'core.autocrlf'. 

Yes, that is exactly what I was alluding to.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:02                   ` Linus Torvalds
@ 2010-05-07 19:11                     ` Avery Pennarun
  2010-05-07 19:16                       ` Linus Torvalds
  2010-05-07 19:23                       ` Eyvind Bernhardsen
  2010-05-07 19:31                     ` Nicolas Pitre
  1 sibling, 2 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 19:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 3:02 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Btw, another option might be to start searching ".gitconfig", but only
> allow a certain "safe subset" of config options in that. Things that can
> really be about the project itself, and not per-user or per-repository.
> [...]
> Things like
>
>        core.autocrlf
>        i18n.commitEncoding

Unfortunately this option wouldn't be as flexible as Eyvind's current proposal.

What his method allows is to mark some files in a project as "these
should be the native EOL style" and others as "these should be left
alone."  Then each person can set a (usually global) config option
that states what the native EOL style should be.  Like core.autocrlf,
only it wouldn't affect projects without crlf attributes (like git.git
or linux.git) where CRLF translation is pretty much always wrong.
(And if one person disagrees that it's always wrong, well, he can
always set core.autocrlf for himself.)

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:11                     ` Avery Pennarun
@ 2010-05-07 19:16                       ` Linus Torvalds
  2010-05-07 19:35                         ` Avery Pennarun
  2010-05-07 19:23                       ` Eyvind Bernhardsen
  1 sibling, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 19:16 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Avery Pennarun wrote:
> 
> Unfortunately this option wouldn't be as flexible as Eyvind's current proposal.

Oh, absolutely it is.

> What his method allows is to mark some files in a project as "these
> should be the native EOL style" and others as "these should be left
> alone."

But that's what a .gitconfig would too. We _already_ have that 
.gitattribute thing to then distinguish particular pathname rules. It's 
just that currently .git/config is needed to _enable_ it.

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:11                     ` Avery Pennarun
  2010-05-07 19:16                       ` Linus Torvalds
@ 2010-05-07 19:23                       ` Eyvind Bernhardsen
  1 sibling, 0 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-07 19:23 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Linus Torvalds, Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska

On 7. mai 2010, at 21.11, Avery Pennarun wrote:

> On Fri, May 7, 2010 at 3:02 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> Btw, another option might be to start searching ".gitconfig", but only
>> allow a certain "safe subset" of config options in that. Things that can
>> really be about the project itself, and not per-user or per-repository.
>> [...]
>> Things like
>> 
>>        core.autocrlf
>>        i18n.commitEncoding
> 
> Unfortunately this option wouldn't be as flexible as Eyvind's current proposal.

Thanks for the support!

My objection to this idea is more practical: I suspect that parsing .gitconfig from the repository would be a lot more work than my simple hack :)
-- 
Eyvind

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 17:10                 ` Linus Torvalds
  2010-05-07 19:02                   ` Linus Torvalds
  2010-05-07 19:06                   ` Junio C Hamano
@ 2010-05-07 19:25                   ` Eyvind Bernhardsen
  2 siblings, 0 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-07 19:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska

On 7. mai 2010, at 19.10, Linus Torvalds wrote:

> I also do think that maybe we could just change the existing crlf 
> attribute to work even without 'core.autocrlf'. 

Ah, of course.  Thanks for the clarification!  I didn't understand what Junio meant (and was composing a long email which may or may not have had a bitter tone); now I'm preparing a new patch series instead.
-- 
Eyvind

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:02                   ` Linus Torvalds
  2010-05-07 19:11                     ` Avery Pennarun
@ 2010-05-07 19:31                     ` Nicolas Pitre
  2010-05-07 19:36                       ` Avery Pennarun
  2010-05-07 19:40                       ` Linus Torvalds
  1 sibling, 2 replies; 82+ messages in thread
From: Nicolas Pitre @ 2010-05-07 19:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, 7 May 2010, Linus Torvalds wrote:

> Btw, another option might be to start searching ".gitconfig", but only 
> allow a certain "safe subset" of config options in that. Things that can 
> really be about the project itself, and not per-user or per-repository.
> 
> And parse it before ~/.gitconfig and .git/config, so that people can 
> always override it.
> 
> I dunno. Looking at the config options, there really aren't a lot of them 
> that make sense on a project scale. There's a few, though. Things like
> 
> 	core.autocrlf
> 	i18n.commitEncoding
> 
> and possibly others..

Given that only a subset of gitconfig could make sense to have 
distributed, I think the file should be named .gitparams to make the 
distinction clear.


Nicolas

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:16                       ` Linus Torvalds
@ 2010-05-07 19:35                         ` Avery Pennarun
  2010-05-07 19:45                           ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 19:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 3:16 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> Unfortunately this option wouldn't be as flexible as Eyvind's current proposal.
>
> Oh, absolutely it is.
>
>> What his method allows is to mark some files in a project as "these
>> should be the native EOL style" and others as "these should be left
>> alone."
>
> But that's what a .gitconfig would too. We _already_ have that
> .gitattribute thing to then distinguish particular pathname rules. It's
> just that currently .git/config is needed to _enable_ it.

Hmm, I don't think we're saying the same thing.  There are two
separate settings here:

1) Whether a project has files that should be EOL-converted
automatically (we seem to all agree that this is set in
.gitattributes, whichever attribute is used).

2) Whether a particular person wants those particular files to be
EOL-converted, and what to convert them to.

The existing semantics of core.autocrlf just don't let you express #2
in a useful way.  If I set --global core.autocrlf, it turns it on for
*all* projects, not just ones with the .gitattribute set.  If a
project has a .gitconfig inside that sets core.autocrlf, then it's
really just redundant with #1.  If I set .git/config on a particular
project, it works, but it's far too easy to forget (and there seems to
be no way to set this per-project at clone time, and setting it
*after* cloning causes git's index to get confused).

Eyvind's proposal is deceptively simple because it simply makes it
much less error prone for users to express something that's already
*technically* possible, but in practice, is very very frequently done
wrong.

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:31                     ` Nicolas Pitre
@ 2010-05-07 19:36                       ` Avery Pennarun
  2010-05-07 20:29                         ` Nicolas Pitre
  2010-05-07 19:40                       ` Linus Torvalds
  1 sibling, 1 reply; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 19:36 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linus Torvalds, Junio C Hamano, Eyvind Bernhardsen, git,
	hasan.aljudy, kusmabite, prohaska

On Fri, May 7, 2010 at 3:31 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Fri, 7 May 2010, Linus Torvalds wrote:
>> Btw, another option might be to start searching ".gitconfig", but only
>> allow a certain "safe subset" of config options in that. Things that can
>> really be about the project itself, and not per-user or per-repository.
>>
>> And parse it before ~/.gitconfig and .git/config, so that people can
>> always override it.
>>
>> I dunno. Looking at the config options, there really aren't a lot of them
>> that make sense on a project scale. There's a few, though. Things like
>>
>>       core.autocrlf
>>       i18n.commitEncoding
>>
>> and possibly others..
>
> Given that only a subset of gitconfig could make sense to have
> distributed, I think the file should be named .gitparams to make the
> distinction clear.

Since the options it *does* have are exactly the same as .git/config,
however, naming it .gitconfig makes sense.  I'd say just print a
warning when reading options that are going to be ignored for security
reasons (or because they're not known at all, or whatever).

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:31                     ` Nicolas Pitre
  2010-05-07 19:36                       ` Avery Pennarun
@ 2010-05-07 19:40                       ` Linus Torvalds
  2010-05-07 20:32                         ` Nicolas Pitre
  1 sibling, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 19:40 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Nicolas Pitre wrote:
> 
> Given that only a subset of gitconfig could make sense to have 
> distributed, I think the file should be named .gitparams to make the 
> distinction clear.

I went through the options listed in "man gitconfig", and quite frankly, I 
didn't find any new ones. I didn't grep the source, and I'm sure they're 
not all documented, but if it really is just two options, I doubt it's 
worth it at all.

Hopefully nobody sane uses any non-utf8 encoding for commit messages 
anyway (but what do I know - I have no idea about Asian usage, where it 
may make more sense than in US/Western Europe). So i18n.commitEncoding is 
not likely to be a big deal.

And just making the crlf attribute work regardless of core.autocrlf sounds 
like it wouldn't be a bad idea. Just _maybe_ we could actually make an 
_explicit_ "core.autocrlf = off/false" actually disable any .gitattribute 
crlf settings, but I'm not sure even that is a good idea.

So I'd suggest relegating "core.autocrlf" to just files that are _not_ 
covered by some explicit .gitattribute setting. After all, that just more 
solidly puts the "auto" in autocrlf.
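
As a rough illustration of that suggestion, the following Python sketch
compares the behaviour as it is described in this thread (the crlf
attribute does nothing unless core.autocrlf is set) with the suggested
one (an explicit attribute always decides, and core.autocrlf only covers
files with no attribute).  The function names and the three-valued
crlf_attr parameter are assumptions made for the example; this is not
how git's convert code is actually structured.

```python
# Illustrative model only; names and structure are invented here.
#
# crlf_attr:       True  -> "crlf" attribute set on the path
#                  False -> "-crlf" (explicitly unset)
#                  None  -> no crlf attribute for the path
# autocrlf:        True if core.autocrlf is "true" or "input"
# looks_like_text: result of content auto-detection


def converts_as_described(crlf_attr, autocrlf, looks_like_text):
    """Behaviour as described in the thread: config gates everything."""
    if not autocrlf:
        return False          # attribute is a no-op without the config
    if crlf_attr is not None:
        return crlf_attr      # attribute overrides auto-detection
    return looks_like_text


def converts_as_suggested(crlf_attr, autocrlf, looks_like_text):
    """Suggested change: explicit attributes work without the config."""
    if crlf_attr is not None:
        return crlf_attr      # project's attribute decides on its own
    # core.autocrlf is relegated to paths with no explicit attribute.
    return autocrlf and looks_like_text
```

The only inputs on which the two functions disagree are those with
crlf_attr set to True and autocrlf False, which is exactly the
fresh-clone case the series is trying to fix.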

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 16:33               ` Junio C Hamano
  2010-05-07 16:57                 ` Avery Pennarun
  2010-05-07 17:10                 ` Linus Torvalds
@ 2010-05-07 19:41                 ` Finn Arne Gangstad
  2010-05-07 20:06                   ` Avery Pennarun
  2010-05-07 20:11                 ` Eyvind Bernhardsen
  3 siblings, 1 reply; 82+ messages in thread
From: Finn Arne Gangstad @ 2010-05-07 19:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Eyvind Bernhardsen, git, hasan.aljudy, kusmabite, torvalds, prohaska

On Fri, May 07, 2010 at 09:33:49AM -0700, Junio C Hamano wrote:
> Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com> writes:
> 
> > - An attribute called "auto-eol" is set in the repository to turn on
> >   normalization of line endings.  Since attributes are content, the
> >   setting is copied when the repository is cloned and can be changed in
> >   an existing repository (with a few caveats).  Setting this attribute
> >   is equivalent to setting "core.autocrlf" to "input" or "true".
> 
> In what way is this attribute different from existing "crlf" attribute?

The crlf attribute says whether to enable autocrlf functionality for a
file, but that is not what is really wanted. auto-eol instead says how
line endings should be stored in the repository. Also, auto-eol will
only affect files auto-detected as text (or forced to be treated as
text by the crlf attribute) it seems.

> This may be a worthy goal.  But if an "auto-eol" attribute "fixes"
> this, perhaps "crlf" attribute can be taught to fix it the same way,
> no?

Maybe it is sufficient to add a new value to "crlf" that means:

- If the file is autodetected as text:
  - Convert to LF only on commit, and
  - Convert to your preferred EOL style on checkout.

I don't think autocrlf is a good place to specify preferred EOL
style, it is too dangerous to set autocrlf to true by default, but it should
not be dangerous to say that your preferred EOL style is CRLF.

- Finn Arne

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:35                         ` Avery Pennarun
@ 2010-05-07 19:45                           ` Linus Torvalds
  2010-05-07 19:58                             ` Avery Pennarun
  0 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 19:45 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Avery Pennarun wrote:
> 
> 1) Whether a project has files that should be EOL-converted
> automatically (we seem to all agree that this is set in
> .gitattributes, whichever attribute is used).
> 
> 2) Whether a particular person wants those particular files to be
> EOL-converted, and what to convert them to.

So? If we were to have a .gitconfig file, then both of those things would 
just work. It's no different from Eyvind's patch, except the exact details 
on syntax (and which file to set) would differ slightly.

So it's a syntactic difference, nothing more.

That said, I don't think the extra .gitconfig is even worth it, the same 
way I do _not_ think Eyvind's extra .gitattributes things are worth it. We 
already have perfectly good .gitattributes, and the only real issue is 
that they just don't take effect in some situations where people would 
_want_ them to take effect.

So just a small semantic change to how .gitattributes crlf works would 
likely make everybody happy.

The only downside is that it _is_ a semantic change. It really would 
change existing git behavior. Now, I think most people would consider the 
change in behavior to be a clear improvement, but hey...

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:45                           ` Linus Torvalds
@ 2010-05-07 19:58                             ` Avery Pennarun
  2010-05-07 20:06                               ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 19:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 3:45 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> 1) Whether a project has files that should be EOL-converted
>> automatically (we seem to all agree that this is set in
>> .gitattributes, whichever attribute is used).
>>
>> 2) Whether a particular person wants those particular files to be
>> EOL-converted, and what to convert them to.
>
> So? If we were to have a .gitconfig file, then both of those things would
> just work.

No!  The whole point is that each user *does* still want to be able to
decide how to convert the files tagged by the crlf gitattribute (or a
new attribute, I don't care).  Setting this in a .gitconfig file
inside the project is pointless; I need it in my *personal* config.
msysgit users want to set it globally to CRLF by default, Linux or
cygwin users probably want to set it to LF by default.

So #1 is useful to have in the repo, #2 is not.

I am a real live example of this.  For our Delphi projects at work, I
want to check it out with LF on my Linux machine (so I can
patch/diff/merge/grep/edit/etc easily), and CRLF on my Windows machine
(so that the Delphi IDE doesn't get confused).  Other projects I want
to have pure LF on both Linux and Windows, so setting
core.autocrlf=true globally will break things.

Eyvind's proposal (or a similar proposal where his new attribute is
just the crlf attribute) will get me and all my co-workers the
wonderful correct behaviour *by default*; the current behaviour, or an
in-repo .gitconfig, will not.  The key feature is the new
core.eolStyle option, not whether or not we add a new attribute.

> That said, I don't think the extra .gitconfig is even worth it, the same
> way I do _not_ think Eyvind's extra .gitattributes things are worth it.

Do you even use any CRLF projects?  If not, then presumably none of
the options will seem worth it. :)

But the current behaviour really doesn't work for people who need CRLF
conversion, and an in-repo .gitconfig file won't help them.
core.eolStyle + a change to crlf attribute semantics will.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:58                             ` Avery Pennarun
@ 2010-05-07 20:06                               ` Linus Torvalds
  2010-05-07 20:17                                 ` Linus Torvalds
  2010-05-07 20:58                                 ` Avery Pennarun
  0 siblings, 2 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 20:06 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Avery Pennarun wrote:
> 
> No!  The whole point is that each user *does* still want to be able to
> decide how to convert the files tagged by the crlf gitattribute (or a
> new attribute, I don't care).

Avery, you really don't _get_ it, do you?

If you want to set how the autocrlf conversion would be done, JUST DO IT. 
The .gitconfig file would be overridden by your personal settings.

So what you'd have is

 .gitconfig: core.autocrlf=true	# to enable .gitattributes

but then any .git/config setting (to "input", say) would still override 
that repository setting.

End result: exactly what you're talking about. With _simpler_ syntax than 
the one Eyvind had.

Now, the thing is, we can go for even simpler syntax still, by just making 
that ".gitconfig: core.autocrlf=true" entirely unnecessary. 

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:41                 ` Finn Arne Gangstad
@ 2010-05-07 20:06                   ` Avery Pennarun
  0 siblings, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 20:06 UTC (permalink / raw)
  To: Finn Arne Gangstad
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	torvalds, prohaska

On Fri, May 7, 2010 at 3:41 PM, Finn Arne Gangstad <finnag@pvv.org> wrote:
> Maybe it is sufficient to add a new value to "crlf" that means:
>
> - If the file is autodetected as text:
>  - Convert to LF only on commit, and
>  - Convert to your preferred EOL style on checkout.
>
> I don't think autocrlf is a good place to specify preferred EOL
> style, it is too dangerous to set autocrlf to true by default, but it should
> not be dangerous to say that your preferred EOL style is CRLF.

Assuming it's updated to reuse the existing crlf attribute instead of
adding a new one, that seems to be exactly what this patch series is
about.  "Your preferred EOL style" is the newly introduced
core.eolStyle config option.  So... good idea :)

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 16:33               ` Junio C Hamano
                                   ` (2 preceding siblings ...)
  2010-05-07 19:41                 ` Finn Arne Gangstad
@ 2010-05-07 20:11                 ` Eyvind Bernhardsen
  3 siblings, 0 replies; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-07 20:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, hasan.aljudy, kusmabite, torvalds, prohaska

On 7. mai 2010, at 18.33, Junio C Hamano wrote:

> Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com> writes:
> 
>> - An attribute called "auto-eol" is set in the repository to turn on
>>  normalization of line endings.  Since attributes are content, the
>>  setting is copied when the repository is cloned and can be changed in
>>  an existing repository (with a few caveats).  Setting this attribute
>>  is equivalent to setting "core.autocrlf" to "input" or "true".
> 
> In what way is this attribute different from existing "crlf" attribute?

Avery and Linus have covered this quite well, but I think I can use "crlf" instead of inventing a new attribute.  New patch series to come.

> It feels as if this series is fixing shortcomings of the combination of
> core.autocrlf configuration and crlf attribute while trying very hard to
> keep their shortcomings when the user doesn't say so.  What is the
> downside of making the existing "core.autocrlf" + "crlf" combination do
> what your patch wanted to do without retaining this "keep the existing
> shortcomings for backward compatibility"?

I think keeping the existing shortcomings is partly necessary because I don't want to break any existing repositories by changing the meaning of "core.autocrlf=input" and "core.autocrlf=true".

I also like "core.eolStyle" because I want a config setting that explicitly says "crlf" or "lf" rather than forcing the user to remember what "true" and "input" mean.  The new series will keep core.eolStyle.

I would like to have a boolean "core.autocrlf" that uses "core.eolStyle" instead of implying anything about line endings in the working directory, but I'm not sure if that is possible without breaking anybody's setup.

>> 1. Setting core.autocrlf in your global or system configuration is a
>> pain
> 
> This is a wrong thing to do to begin with, and not worth discussing.  You
> know and your readers know that line ending convention in the repository
> data (i.e. blobs) is under project control while line ending convention in
> the working tree is end user preference.

I think it's worth mentioning because git doesn't currently enforce line ending normalization on a per-project basis, which is what I'm trying to rectify.  Also, the default setting in msysgit is "core.autocrlf=true", but I guess you disagree with that default :)

>> 2. Setting core.autocrlf in an individual repository would be okay
>> except that naive users will do it after they have already cloned:
>> unless core.autocrlf is set globally, the clone will have the wrong line
>> endings, and the user needs to know how to refresh it manually (rm -rf *
>> && git checkout -f).
> 
> This may be a worthy goal.  But if an "auto-eol" attribute "fixes" this,
> perhaps "crlf" attribute can be taught to fix it the same way, no?

Yes.  And it shall!
-- 
Eyvind

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 20:06                               ` Linus Torvalds
@ 2010-05-07 20:17                                 ` Linus Torvalds
  2010-05-07 20:42                                   ` Eyvind Bernhardsen
  2010-05-07 20:58                                 ` Avery Pennarun
  1 sibling, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 20:17 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Linus Torvalds wrote:
> 
> Now, the thing is, we can go for even simpler syntax still, by just making 
> that ".gitconfig: core.autocrlf=true" entirely unnecessary. 

Exact semantics I'd suggest for 'core.autocrlf':

    Setting		path in .gitattributes	path _not_ in .gitattributes
    =======		======================	===========================
 - not set at all	attribute value		no crlf
 - "off"/"false"	no crlf			no crlf
 - "on"			attribute value		autocrlf	
 - "input"		attribute "input"	autocrlf "input"

Which is different from what we do now for the "not set at all" case, 
in that it still takes the .gitattributes value for those cases if a path 
matches.

We could add a few core.autocrlf entries, like "force" (to force output to 
be CRLF even on a platform where it isn't the default).
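
As a sketch only, the table above can be transcribed into a small Python
lookup.  The function name proposed_behaviour and the string labels are
illustrative and not part of git; the cell texts are copied from the table
rather than interpreted, and the suggested "force" value is left out.

```python
# Rows of the proposed table, keyed by the core.autocrlf setting.
# None stands for "not set at all".  Each value is the pair
# (path in .gitattributes, path _not_ in .gitattributes).
PROPOSED = {
    None:    ("attribute value", "no crlf"),
    "false": ("no crlf", "no crlf"),
    "true":  ("attribute value", "autocrlf"),
    "input": ('attribute "input"', 'autocrlf "input"'),
}

# Spellings used in the table that mean the same as true/false.
ALIASES = {"on": "true", "off": "false"}


def proposed_behaviour(autocrlf, path_in_gitattributes):
    """Return the table cell for one setting and one kind of path."""
    setting = ALIASES.get(autocrlf, autocrlf)
    with_attr, without_attr = PROPOSED[setting]
    return with_attr if path_in_gitattributes else without_attr
```

The only row that differs from the behaviour described as current is the
first one: proposed_behaviour(None, True) gives "attribute value".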

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:36                       ` Avery Pennarun
@ 2010-05-07 20:29                         ` Nicolas Pitre
  2010-05-07 21:00                           ` Avery Pennarun
  0 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2010-05-07 20:29 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Linus Torvalds, Junio C Hamano, Eyvind Bernhardsen, git,
	hasan.aljudy, kusmabite, prohaska

On Fri, 7 May 2010, Avery Pennarun wrote:

> On Fri, May 7, 2010 at 3:31 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Fri, 7 May 2010, Linus Torvalds wrote:
> >> Btw, another option might be to start searching ".gitconfig", but only
> >> allow a certain "safe subset" of config options in that. Things that can
> >> really be about the project itself, and not per-user or per-repository.
> >>
> >> And parse it before ~/.gitconfig and .git/config, so that people can
> >> always override it.
> >>
> >> I dunno. Looking at the config options, there really aren't a lot of them
> >> that make sense on a project scale. There's a few, though. Things like
> >>
> >>       core.autocrlf
> >>       i18n.commitEncoding
> >>
> >> and possibly others..
> >
> > Given that only a subset of gitconfig could make sense to have
> > distributed, I think the file should be named .gitparams to make the
> > distinction clear.
> 
> Since the options it *does* have are exactly the same as .git/config,
> however, naming it .gitconfig makes sense.

Well, I disagree.

> I'd say just print a
> warning when reading options that are going to be ignored for security
> reasons (or because they're not known at all, or whatever).

Or just make it .gitparams (or anything you wish) which is not the same 
as gitconfig. This way it is less likely to get bogus bug reports for 
options that aren't supported.


Nicolas

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 19:40                       ` Linus Torvalds
@ 2010-05-07 20:32                         ` Nicolas Pitre
  0 siblings, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2010-05-07 20:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, 7 May 2010, Linus Torvalds wrote:

> On Fri, 7 May 2010, Nicolas Pitre wrote:
> > 
> > Given that only a subset of gitconfig could make sense to have 
> > distributed, I think the file should be named .gitparams to make the 
> > distinction clear.
> 
> I went through the options listed in "man gitconfig", and quite frankly, I 
> didn't find any new ones. I didn't grep the source, and I'm sure they're 
> not all documented, but if it really is just two options, I doubt it's 
> worth it at all.

I don't dispute that.

I was merely pointing out that naming such a file .gitconfig is a bad 
idea if it doesn't duplicate the entire .git/config functionality.


Nicolas

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 20:17                                 ` Linus Torvalds
@ 2010-05-07 20:42                                   ` Eyvind Bernhardsen
  2010-05-07 20:57                                     ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-07 20:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska

On 7. mai 2010, at 22.17, Linus Torvalds wrote:

> 
> 
> On Fri, 7 May 2010, Linus Torvalds wrote:
>> 
>> Now, the thing is, we can go for even simpler syntax still, by just making 
>> that ".gitconfig: core.autocrlf=true" entirely unnecessary. 
> 
> Exact semantics I'd suggest for 'core.autocrlf':
> 
>    Setting		path in .gitattributes	path _not_ in .gitattributes
>    =======		======================	===========================
> - not set at all	attribute value		no crlf
> - "off"/"false"	no crlf			no crlf
> - "on"			attribute value		autocrlf	
> - "input"		attribute "input"	autocrlf "input"
> 
> Which is different from what we do now for the "not set at all" case, 
> in that it still takes the .gitattributes value for those cases if a path 
> matches.
> 
> We could add a few core.autocrlf entries, like "force" (to force output to 
> be CRLF even on a platform where it isn't the default).

How can you say that this is simpler than my syntax?  I have an attribute that means "line endings should be normalised" and a configuration variable that decides what line endings should be used in the working directory for normalised files.  If you like CRLFs you set it to "crlf", if you like LFs you set it to "lf".

I'll replace "auto-eol" with something like "crlf=auto" because I actually think that's pretty neat, but I won't pretend that "true" and "input" are sane ways to indicate if you prefer CRLF or LF line endings in your working directory.
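
To make the split concrete, here is a rough Python sketch of the two conversions, assuming the proposed "core.eolStyle" value arrives as the string "lf" or "crlf".  The function names are made up for illustration, and binary detection is ignored entirely.

```python
def to_repository(data: bytes) -> bytes:
    """Normalise line endings to LF, as would happen on commit."""
    return data.replace(b"\r\n", b"\n")


def to_working_tree(data: bytes, eol_style: str) -> bytes:
    """Convert content to the preferred line ending style on checkout."""
    if eol_style == "lf":
        return data
    if eol_style == "crlf":
        # Normalise first so an existing CRLF does not become CR CR LF.
        return to_repository(data).replace(b"\n", b"\r\n")
    raise ValueError(f"unknown eol style: {eol_style!r}")
```

The attribute only decides whether these functions are applied to a path at all; the config variable only picks the second argument.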
-- 
Eyvind

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 20:42                                   ` Eyvind Bernhardsen
@ 2010-05-07 20:57                                     ` Linus Torvalds
  2010-05-07 21:17                                       ` Eyvind Bernhardsen
  0 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 20:57 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: Avery Pennarun, Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska



On Fri, 7 May 2010, Eyvind Bernhardsen wrote:
> 
> How can you say that this is simpler than my syntax?

Because your syntax adds totally new attributes, so now you can't even 
take an existing .gitattributes and make it do something sane - instead 
you have to write totally new rules.

My suggestion just makes any existing usage do the "what you'd expect".

THAT is simpler.

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 20:06                               ` Linus Torvalds
  2010-05-07 20:17                                 ` Linus Torvalds
@ 2010-05-07 20:58                                 ` Avery Pennarun
  1 sibling, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 20:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Eyvind Bernhardsen, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 4:06 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> No!  The whole point is that each user *does* still want to be able to
>> decide how to convert the files tagged by the crlf gitattribute (or a
>> new attribute, I don't care).
>
> Avery, you really don't _get_ it, do you?

I was going to say that I do get it, but I guess I didn't.  You're
right, your proposal is functionally equivalent.  Feel free to stop
reading the rest of this post :)

For the benefit of those who might have misunderstood as I did, the
reason they're equivalent is that "core.eolStyle = LF" is the same as
saying "never do EOL conversion" since an unconverted file is
implicitly LF.  And there is already a way to say "never do EOL
conversion," which is to set core.autocrlf=False.

By adding core.autocrlf=True to an in-project .gitconfig file, we can
fix a mistake in the original definition of the crlf attribute, ie.,
it should be able to force CRLF conversion even when a user hasn't set
core.autocrlf explicitly.  But that new ability doesn't take away a
person's ability to override it globally because .git/config and
~/.gitconfig take precedence.  Notably, this solution doesn't break
any backward compatibility.

Linus's second proposed option would be to slightly change the way the
crlf attribute works, by making core.autocrlf a tri-state variable
instead of just true/false.  "Undefined" would mean "use the crlf
attribute" where currently it means (rather unhelpfully) "always use
LF even if .gitattributes says otherwise."  However, this would be a
backward-incompatible change.  Arguably, not one that anyone would
care about.  (For the record, none of my co-workers would care.  The
current behaviour is sufficiently unhelpful that we have to use
core.autocrlf=True anyway, so .gitattributes crlf hasn't been useful.)

Now, arguably, the current semantics, and even Linus's proposed
improved semantics, are still pretty hard to explain.  "This file
should always be unchanged" and "this file should always use native
line endings" and "this is my native line ending style" is very simple
and straightforward.  But I'm sure others would argue the opposite,
and it's just a matter of preference.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 20:29                         ` Nicolas Pitre
@ 2010-05-07 21:00                           ` Avery Pennarun
  2010-05-07 21:12                             ` Nicolas Pitre
  0 siblings, 1 reply; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 21:00 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linus Torvalds, Junio C Hamano, Eyvind Bernhardsen, git,
	hasan.aljudy, kusmabite, prohaska

On Fri, May 7, 2010 at 4:29 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> Since the options it *does* have are exactly the same as .git/config,
>> however, naming it .gitconfig makes sense.
>
> Well, I disagree.
>
>> I'd say just print a
>> warning when reading options that are going to be ignored for security
>> reasons (or because they're not known at all, or whatever).
>
> Or just make it .gitparams (or anything you wish) which is not the same
> as gitconfig. This way it is less likely to get bogus bug reports for
> options that aren't supported.

It has exactly the same syntax as ~/.gitconfig, and the options it
does support can all be carried over literally to ~/.gitconfig.
Calling it something else would imply that it deserves its own man
page, which would need to repeat all the options that are already
documented for ~/.gitconfig.

I'd say something that's syntactically identical, and in some cases
actually interchangeable, should have the same name.  Using a
different name could actually be *misleading*.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:00                           ` Avery Pennarun
@ 2010-05-07 21:12                             ` Nicolas Pitre
  2010-05-07 21:26                               ` Avery Pennarun
  0 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2010-05-07 21:12 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Linus Torvalds, Junio C Hamano, Eyvind Bernhardsen, git,
	hasan.aljudy, kusmabite, prohaska

On Fri, 7 May 2010, Avery Pennarun wrote:

> On Fri, May 7, 2010 at 4:29 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Fri, 7 May 2010, Avery Pennarun wrote:
> >> Since the options it *does* have are exactly the same as .git/config,
> >> however, naming it .gitconfig makes sense.
> >
> > Well, I disagree.
> >
> >> I'd say just print a
> >> warning when reading options that are going to be ignored for security
> >> reasons (or because they're not known at all, or whatever).
> >
> > Or just make it .gitparams (or anything you wish) which is not the same
> > as gitconfig. This way it is less likely to get bogus bug reports for
> > options that aren't supported.
> 
> It has exactly the same syntax as ~/.gitconfig, and the options it
> does support can all be carried over literally to ~/.gitconfig.

Absolutely not.

Most options for ~/.gitconfig simply make no sense in a distributed 
.gitconfig file.

> Calling it something else would imply that it deserves its own man
> page, which would need to repeat all the options that are already
> documented for ~/.gitconfig.

No because most of those options don't and can't apply to a distributed 
option file.

> I'd say something that's syntactically identical, and in some cases
> actually interchangeable, should have the same name.

Indeed.  But this is not the case here.


Nicolas

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 20:57                                     ` Linus Torvalds
@ 2010-05-07 21:17                                       ` Eyvind Bernhardsen
  2010-05-07 21:23                                         ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-07 21:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska

On 7. mai 2010, at 22.57, Linus Torvalds wrote:

> On Fri, 7 May 2010, Eyvind Bernhardsen wrote:
>> 
>> How can you say that this is simpler than my syntax?
> 
> Because your syntax adds totally new attributes, so now you can't even 
> take an existing .gitattributes and make it do something sane - instead 
> you have to write totally new rules.

I don't understand.  All you have to do is add "* auto-eol=true" to your .gitattributes, and line endings will be normalized exactly as if you'd set "core.autocrlf".  Why would you have to write totally new rules?  Which rules?

> My suggestion just makes any existing usage do the "what you'd expect".
> 
> THAT is simpler.

Well, sort of, but "simple for someone who already knows how core.autocrlf works" isn't what I'm aiming for :)
-- 
Eyvind

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:17                                       ` Eyvind Bernhardsen
@ 2010-05-07 21:23                                         ` Linus Torvalds
  2010-05-07 21:30                                           ` Avery Pennarun
  0 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 21:23 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: Avery Pennarun, Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska



On Fri, 7 May 2010, Eyvind Bernhardsen wrote:
> 
> I don't understand.  All you have to do is add "* auto-eol=true" to your 
> .gitattributes, and line endings will be normalized exactly as if you'd 
> set "core.autocrlf".  Why would you have to write totally new rules?  
> Which rules?

I think "* auto-eol=true" is just crazy. We would _never_ want to do that. 
Any project that does that should be shot in the head.

So encouraging that as a format is just silly and stupid.

In contrast, the slight change in semantics (with no new config options 
_or_ attributes) that I suggest should just make everybody happy - because 
it takes care of the real life situation that people are in.

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:12                             ` Nicolas Pitre
@ 2010-05-07 21:26                               ` Avery Pennarun
  2010-05-07 22:09                                 ` A Large Angry SCM
  0 siblings, 1 reply; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 21:26 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Linus Torvalds, Junio C Hamano, Eyvind Bernhardsen, git,
	hasan.aljudy, kusmabite, prohaska

On Fri, May 7, 2010 at 5:12 PM, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> It has exactly the same syntax as ~/.gitconfig, and the options it
>> does support can all be carried over literally to ~/.gitconfig.
>
> Absolutely not.
>
> Most options for ~/.gitconfig simply make no sense in a distributed
> .gitconfig file.

No, that's the converse of what I said.

Try this in your head:

    cp .gitconfig .git/config

Perfectly valid.  Copying the other way might (or might not) result in
invalid options in .gitconfig, which probably ought to be warned
about.  But the syntax is obviously identical.

>> Calling it something else would imply that it deserves its own man
>> page, which would need to repeat all the options that are already
>> documented for ~/.gitconfig.
>
> No because most of those options don't and can't apply to a distributed
> option file.

But the ones that *do* apply all have the same meanings.

>> I'd say something that's syntactically identical, and in some cases
>> actually interchangeable, should have the same name.
>
> Indeed.  But this is not the case here.

Hmm, how to name the file is mostly a matter of opinion, but this last
bit is just factual ;)  They're syntactically identical.  And in some
cases, they're interchangeable.  I don't see how one could argue
otherwise.

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:23                                         ` Linus Torvalds
@ 2010-05-07 21:30                                           ` Avery Pennarun
  2010-05-07 21:37                                             ` Eyvind Bernhardsen
  2010-05-07 21:54                                             ` Linus Torvalds
  0 siblings, 2 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 21:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 5:23 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Eyvind Bernhardsen wrote:
>> I don't understand.  All you have to do is add "* auto-eol=true" to your
>> .gitattributes, and line endings will be normalized exactly as if you'd
>> set "core.autocrlf".  Why would you have to write totally new rules?
>> Which rules?
>
> I think "* auto-eol=true" is just crazy. We would _never_ want to do that.
> Any project that does that should be shot in the head.

In the interests of further making myself look like an idiot:

Just to clarify, is it crazy because that line would convert all
files, even binary ones, where core.autocrlf auto-detects whether
files are binary or text?

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:30                                           ` Avery Pennarun
@ 2010-05-07 21:37                                             ` Eyvind Bernhardsen
  2010-05-07 21:58                                               ` Linus Torvalds
  2010-05-07 21:54                                             ` Linus Torvalds
  1 sibling, 1 reply; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-07 21:37 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Linus Torvalds, Eyvind Bernhardsen, Junio C Hamano, git,
	hasan.aljudy, kusmabite, prohaska

On 7. mai 2010, at 23.30, Avery Pennarun <apenwarr@gmail.com> wrote:

> On Fri, May 7, 2010 at 5:23 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Fri, 7 May 2010, Eyvind Bernhardsen wrote:
>>> I don't understand.  All you have to do is add "* auto-eol=true"  
>>> to your
>>> .gitattributes, and line endings will be normalized exactly as if  
>>> you'd
>>> set "core.autocrlf".  Why would you have to write totally new rules?
>>> Which rules?
>>
>> I think "* auto-eol=true" is just crazy. We would _never_ want to  
>> do that.
>> Any project that does that should be shot in the head.
>
> In the interests of further making myself look like an idiot:
>
> Just to clarify, is it crazy because that line would convert all
> files, even binary ones, where core.autocrlf auto-detects whether
> files are binary or text?

Just to clarify a bit more, that is _not_ what it would do.  The  
"crlf" attribute is still respected, of course.

Also, I meant to write "* crlf=auto", not "* auto-eol=true", if that  
makes it any less crazy.
-- 
Eyvind

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:30                                           ` Avery Pennarun
  2010-05-07 21:37                                             ` Eyvind Bernhardsen
@ 2010-05-07 21:54                                             ` Linus Torvalds
  2010-05-07 22:14                                               ` Linus Torvalds
                                                                 ` (2 more replies)
  1 sibling, 3 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 21:54 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Avery Pennarun wrote:
>
> > I think "* auto-eol=true" is just crazy. We would _never_ want to do that.
> > Any project that does that should be shot in the head.
>
> Just to clarify, is it crazy because that line would convert all
> files, even binary ones, where core.autocrlf auto-detects whether
> files are binary or text?

No, presumably 'auto-eol' does the same auto-detection. Otherwise the name 
wouldn't make sense.

I just think that it's crazy because

 (a) you should try to avoid doing things like that in the first place. For 
     something like an attribute file, you should just list the files you 
     want to convert. That's the _point_ of an attribute. So it's much 
     nicer if you instead actually are explicit about it, ie

	*.[ch] crlf
	*.txt crlf
	*.jpg -crlf

     should be the _primary_ way you do it, since the autocrlf thing is a 
     bit dangerous in theory.

 (b) But let's say that you want to do it anyway (because you're lazy 
     and because autocrlf works pretty damn well in practice), isn't that 
     a really ugly and crazy thing to add _another_ attribute name for 
     that?

     IOW, if you really want to say "do automatic crlf for this set of 
     paths", the natural syntax for that would be

	* crlf=auto

     No? Not some totally new attribute name.

And in the end, you always do want to have a config variable for the 
actual type of conversion. And like it or not, we already do end up having 
this mix-up between .gitattributes and git "core.autocrlf" config entry, 
so my suggested rule was kind of a "minimally invasive" suggestion to just 
turn that mixing of attributes and config entries into something more 
practically useful.
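
A small Python sketch of how the explicit listing in (a) resolves for a
given path.  It is a simplification: it matches only the file name with
fnmatch, which is not git's real pattern matcher, and it assumes that a
later matching line overrides an earlier one.  RULES and crlf_attribute
are names invented for the example.

```python
import fnmatch
import posixpath

# The three example lines from (a): True means "crlf", False means "-crlf".
RULES = [
    ("*.[ch]", True),
    ("*.txt", True),
    ("*.jpg", False),
]


def crlf_attribute(path, rules=RULES):
    """Return True (crlf), False (-crlf) or None (no rule matches)."""
    name = posixpath.basename(path)
    result = None
    for pattern, value in rules:
        if fnmatch.fnmatchcase(name, pattern):
            result = value  # a later matching rule replaces an earlier one
    return result
```

A path that returns None here is exactly the "path _not_ in
.gitattributes" case, where the config entry decides what happens.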

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:37                                             ` Eyvind Bernhardsen
@ 2010-05-07 21:58                                               ` Linus Torvalds
  0 siblings, 0 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 21:58 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	hasan.aljudy, kusmabite, prohaska



On Fri, 7 May 2010, Eyvind Bernhardsen wrote:
> 
> Also, I meant to write "* crlf=auto", not "* auto-eol=true", if that makes it
> any less crazy.

Oh, yes. See my other email. "* crlf=auto" is at least sensible, although 
somewhat scary. At least with core.autocrlf=true, the user has to have 
consciously set it. It was the "whole new attribute name" that I thought 
pushed it from "slightly scary" to "crazy".

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:26                               ` Avery Pennarun
@ 2010-05-07 22:09                                 ` A Large Angry SCM
  2010-05-07 22:10                                   ` Avery Pennarun
  0 siblings, 1 reply; 82+ messages in thread
From: A Large Angry SCM @ 2010-05-07 22:09 UTC (permalink / raw)
  To: Avery Pennarun, Eyvind Bernhardsen
  Cc: Nicolas Pitre, Linus Torvalds, Junio C Hamano, git, hasan.aljudy,
	kusmabite, prohaska

Avery Pennarun wrote:
[...]
>     cp .gitconfig .git/config
> 
> Perfectly valid.  Copying the other way might (or might not) result in
> invalid options in .gitconfig, which probably ought to be warned
> about.  But the syntax is obviously identical.
[...]

Which one takes precedence? I *MUST* be able to override a distributed 
.gitconfig/.gitparams/.gitparameters file.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 22:09                                 ` A Large Angry SCM
@ 2010-05-07 22:10                                   ` Avery Pennarun
  0 siblings, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 22:10 UTC (permalink / raw)
  To: gitzilla
  Cc: Eyvind Bernhardsen, Nicolas Pitre, Linus Torvalds,
	Junio C Hamano, git, hasan.aljudy, kusmabite, prohaska

On Fri, May 7, 2010 at 6:09 PM, A Large Angry SCM <gitzilla@gmail.com> wrote:
> Avery Pennarun wrote:
>>    cp .gitconfig .git/config
>>
>> Perfectly valid.  Copying the other way might (or might not) result in
>> invalid options in .gitconfig, which probably ought to be warned
>> about.  But the syntax is obviously identical.
>
> Which one takes precedence? I *MUST* be able to override a distributed
> .gitconfig/.gitparams/.gitparameters file.

Yes, absolutely.  As Linus said, the in-project file is lower priority
than your .git/config and ~/.gitconfig files.
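
A rough Python sketch of that precedence, assuming each file has already
been parsed into a dict with lower-case keys.  The SAFE_IN_PROJECT set
and the resolve function are hypothetical; the two options in the set are
just the ones mentioned earlier in the thread, not a settled list.

```python
# Options a distributed in-project file would be allowed to set
# (illustrative subset only).
SAFE_IN_PROJECT = {"core.autocrlf", "i18n.commitencoding"}


def resolve(key, in_project, user_global, repo_local):
    """Look up one option; later layers override earlier ones."""
    key = key.lower()
    value = None
    # The in-project file is read first and only for whitelisted keys.
    if key in SAFE_IN_PROJECT and key in in_project:
        value = in_project[key]
    # ~/.gitconfig, then .git/config: either one overrides the project.
    for layer in (user_global, repo_local):
        if key in layer:
            value = layer[key]
    return value
```

With this ordering a project can suggest core.autocrlf=true, and anyone
who disagrees sets it to false in ~/.gitconfig or .git/config.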

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:54                                             ` Linus Torvalds
@ 2010-05-07 22:14                                               ` Linus Torvalds
  2010-05-07 22:34                                                 ` Avery Pennarun
  2010-05-07 22:19                                               ` Avery Pennarun
  2010-05-08 20:49                                               ` Dmitry Potapov
  2 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 22:14 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Linus Torvalds wrote:
> 
>      IOW, if you really want to say "do automatic crlf for this set of 
>      paths", the natural syntax for that would be
> 
> 	* crlf=auto

Btw, since we're discussing this, I do think that our current "crlf=input" 
syntax for .gitattributes is pretty dubious. 

I don't really see why it should be a path-dependent thing on whether you 
do crlf conversion on just input or on checkout too.  It smells odd. It 
makes more sense to me to have a global policy for what the output/input 
conversion should be, and then the path rules are just about whether that 
conversion gets done or not.

And like it or not, we called that global rule "autocrlf", and then mixed 
it up with the decision on whether we should do conversion at all. I do 
think that that was a mistake too, and that we could try to fix it, but I 
also think that's a fairly independent issue.

So we _could_ introduce a new "core.crlf" config option that talks purely 
about what kind of conversion gets done - not about _whether_ it gets 
done. So you could do

	[core]
		crlf=input

and it would imply that crlf conversion is only done on input, but it 
would differ from "autocrlf=input" in that it would _not_ imply that any 
paths not matched by gitattributes crlf rules would be automatically 
converted.

[ And in the above model, "core.autocrlf = input" would just be a 
  shorthand for saying "core.autocrlf=true" + "core.crlf=input" ]

So I think we could improve the config file syntax a bit.

But I think that's really a separate issue from the .gitattributes file, 
and whether the "crlf" attribute means anything in the _absence_ of any 
config file rules about crlf.
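
For illustration, the "core.crlf" split described above could be modelled
like this in Python.  It assumes the config is a plain dict; the expand
function and the keys of its result are invented names, and what the
conversion should default to when "core.crlf" is missing is left open
(None).

```python
def expand(config):
    """Split the settings into "whether" and "what kind" of conversion.

    auto_convert_unmatched: convert paths that no crlf rule matches?
    conversion:             the kind of conversion, e.g. "input".
    """
    autocrlf = config.get("core.autocrlf")
    if autocrlf == "input":
        # Shorthand for core.autocrlf=true plus core.crlf=input.
        return {"auto_convert_unmatched": True, "conversion": "input"}
    return {
        "auto_convert_unmatched": autocrlf == "true",
        "conversion": config.get("core.crlf"),
    }
```

The point of the split is the last case: "core.crlf=input" on its own
picks the kind of conversion without turning it on for unmatched paths.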

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:54                                             ` Linus Torvalds
  2010-05-07 22:14                                               ` Linus Torvalds
@ 2010-05-07 22:19                                               ` Avery Pennarun
  2010-05-08 20:49                                               ` Dmitry Potapov
  2 siblings, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 22:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 5:54 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> > I think "* auto-eol=true" is just crazy. We would _never_ want to do that.
>> > Any project that does that should be shot in the head.
>>
>> Just to clarify, is it crazy because that line would convert all
>> files, even binary ones, where core.autocrlf auto-detects whether
>> files are binary or text?
>
> No, presumably 'auto-eol' does the same auto-detection. Otherwise the name
> wouldn't make sense.
> [...]
> Eyvind Bernhardsen wrote:
>> Also, I meant to write "* crlf=auto", not "* auto-eol=true", if that makes it
>> any less crazy.
>
> Oh, yes. See my other email. "* crlf=auto" is at least sensible, although
> somewhat scary. At least with core.autocrlf=true, the user has to had
> [...]
>  (b) But let's say that you want to do it anyway (because you're lazy
>     and because autocrlf works pretty damn well in practice), isn't that
>     a really ugly and crazy thing to add _another_ attribute name for
>     that?
>
>     IOW, if you really want to say "do automatic crlf for this set of
>     paths", the natural syntax for that would be
>
>        * crlf=auto

Oh, good grief, I'm just getting more and more confused.

So just to keep all of this straight, I think there are still two
proposals under consideration here:

a) add an in-project .gitconfig, in which case the above crlf=auto is
exactly equivalent to "crlf attribute missing" (which is different
from "crlf unset", hee hee, are we having fun yet?) since the crlf
attribute is ignored unless core.autocrlf=true, and missing means to
use the core.autocrlf setting;

OR

b) change the semantics of the crlf attribute, in which case crlf=auto
is a new mode that means "use autocrlf on this file even if
core.autocrlf is unset or unspecified".

Right?  So in case (a), the new crlf=auto option is unneeded.  Though
it does seem as if we're trending toward case (b).

Thanks,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 22:14                                               ` Linus Torvalds
@ 2010-05-07 22:34                                                 ` Avery Pennarun
  2010-05-07 22:54                                                   ` hasen j
  2010-05-07 23:18                                                   ` Linus Torvalds
  0 siblings, 2 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-07 22:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 6:14 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Linus Torvalds wrote:
>>      IOW, if you really want to say "do automatic crlf for this set of
>>      paths", the natural syntax for that would be
>>
>>       * crlf=auto
>
> Btw, since we're discussing this, I do think that our current "crlf=input"
> syntax for .gitattributes is pretty dubious.
>
> I don't really see why it should be a path-dependent thing on whether you
> do crlf conversion on just input or on checkout too.

Me neither.  However, in the name of sanity, it sure would be great to
have the global configuration options exactly parallel the per-project
and per-file configuration options.  From that point of view, 'input'
exists just to keep things nice and symmetrical.  And considering how
complicated this discussion already is (compared to what a simple
concept CRLF conversion is), that's probably worth something in
itself.

Part of the confusion comes from the way the options are currently
declared.  set vs. unset vs. unspecified vs. "input" vs. "auto" for an
option named "crlf" is just very, very, unfriendly.  None of the words
*mean* anything.

Maybe we should rethink this from the top.  Imagine that we currently
have no crlf options whatsoever.  What *should* it look like?  I
suggest the following:

Config:
   core.eolOverride = lf / crlf / auto / binary / input
   core.eolDefault = lf / crlf / auto / binary / input

Attribute:
   eol = lf / crlf / auto / binary / input

If eolOverride is not "auto" or unspecified, we ignore eolDefault or
any attributes.

If the attribute is not "auto" or unspecified, we ignore eolDefault.

For all entries, unspecified is equivalent to "auto".
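
The three rules above can be sketched in a few lines of Python. This is
only an illustration of the proposed lookup order, not anything git
implements: the function name resolve_eol and its parameters are
hypothetical, and a value of None stands for "unspecified".

```python
# Hypothetical sketch of the proposed lookup order; none of these names exist in git.
AUTO = "auto"


def resolve_eol(override=None, attribute=None, default=None):
    """Return the effective eol mode for one path."""
    # Unspecified (None) is equivalent to "auto" at every level.
    override = override or AUTO
    attribute = attribute or AUTO
    default = default or AUTO

    if override != AUTO:
        return override   # eolOverride beats the attribute and eolDefault
    if attribute != AUTO:
        return attribute  # the path's eol attribute beats eolDefault
    return default        # otherwise fall back to eolDefault (possibly "auto")
```

With this ordering, resolve_eol(override="lf", attribute="crlf") picks
"lf", while resolve_eol(attribute="crlf", default="lf") picks "crlf".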

Of course the eol attribute could be named "crlf", but that might not
increase the sanity as much as we would like.

And "input" means "auto, but strip CR when committing."  Or maybe the
problem is that it doesn't belong here at all: maybe it should be an
entirely separate attribute that takes effect whenever the eol
attribute/config resolves to "auto."

Or maybe I'm just not thinking about it the right way?

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 22:34                                                 ` Avery Pennarun
@ 2010-05-07 22:54                                                   ` hasen j
  2010-05-07 23:18                                                   ` Linus Torvalds
  1 sibling, 0 replies; 82+ messages in thread
From: hasen j @ 2010-05-07 22:54 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Linus Torvalds, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska

> Part of the confusion comes from the way the options are currently
> declared.  set vs. unset vs. unspecified vs. "input" vs. "auto" for an
> option named "crlf" is just very, very, unfriendly.  None of the words
> *mean* anything.
>
> Maybe we should rethink this from the top.  Imagine that we currently
> have no crlf options whatsoever.  What *should* it look like?  I
> suggest the following:
>
> Config:
>   core.eolOverride = lf / crlf / auto / binary / input
>   core.eolDefault = lf / crlf / auto / binary / input
>
> Attribute:
>   eol = lf / crlf / auto / binary / input
>
> If eolOverride is not "auto" or unspecified, we ignore eolDefault or
> any attributes.
>
> If the attribute is not "auto" or unspecified, we ignore eolDefault.
>
> For all entries, unspecified is equivalent to "auto".
>
> Of course the eol attribute could be named "crlf", but that might not
> increase the sanity as much as we would like.
>
> And "input" means "auto, but strip CR when committing."  Or maybe the
> problem is that it doesn't belong here at all: maybe it should be an
> entirely separate attribute that takes effect whenever the eol
> attribute/config resolves to "auto."
>
> Or maybe I'm just not thinking about it the right way?
>
> Avery
>

If we forget everything git has now, I would suggest the following:

- eol-normalization is per repository, per filetype (fnmatch filter)
- in a file separate from .git/config, such as .git/eol
- when you clone, you get this file

You specify the 'standard' eol type for each file type in this project:

    *.c lf
    *.python lf
    *.vb crlf
    *.sln crlf
    etc (something like that)

committing and checking-out always normalize line endings; *always*

add (and commit) can take an option to keep eol as-is (i.e.
--no-eol-normalization or --keep-eol or --raw-eol)

In this model:

1- Anyone who clones gets the repository eol settings
2- No one can possibly commit in a different eol style unless he
explicitly says he wants to.
3- Naturally, eol-normalization doesn't apply to binary files
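
A rough Python sketch of this model, under stated assumptions: the rule
table stands in for the proposed .giteol file, the names RULES, eol_for
and normalize are hypothetical, and a path that matches no rule is
treated as binary and left alone. Lone CR characters are not handled.

```python
from fnmatch import fnmatch

# Hypothetical rules, as they might appear in the proposed .giteol file.
RULES = [("*.c", "lf"), ("*.python", "lf"), ("*.vb", "crlf"), ("*.sln", "crlf")]


def eol_for(path, rules=RULES):
    """Return "lf" or "crlf" for the first matching pattern, else None."""
    name = path.rsplit("/", 1)[-1]  # match on the file name only
    for pattern, eol in rules:
        if fnmatch(name, pattern):
            return eol
    return None


def normalize(data: bytes, eol) -> bytes:
    """Rewrite every line ending in data to the given style."""
    if eol is None:
        return data  # no rule: assumed binary, never touched
    lf = data.replace(b"\r\n", b"\n")  # collapse CRLF so mixed files become uniform
    return lf if eol == "lf" else lf.replace(b"\n", b"\r\n")
```

A file saved with mixed endings comes out uniform either way, e.g.
normalize(b"a\r\nb\n", eol_for("proj/app.sln")) gives b"a\r\nb\r\n".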

#2 is important, it's needed so you won't have someone making bad
commits because he has a setting somewhere in his global config to
always ignore eol normalization.
on the other hand, one can alias 'add --raw-eol' to something like
'eviladd', so he can do 'git eviladd file.c', which is fine because
it's explicit.

This would get rid of issues where an editor (such as VS) saves a file
with mixed line endings: we don't care because we normalize them.

This would also make it more transparent to windows users: they don't
even have to think about eol issues; they can't make bad commits
"by-accident". (provided the repo maintainer has set the eol filters
properly).

I have no idea what happens (or should happen) if the origin repo
maintainer updates the .git/eol file. Maybe it should be .giteol
instead of .git/eol

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 22:34                                                 ` Avery Pennarun
  2010-05-07 22:54                                                   ` hasen j
@ 2010-05-07 23:18                                                   ` Linus Torvalds
  2010-05-07 23:47                                                     ` hasen j
  2010-05-08  0:31                                                     ` Avery Pennarun
  1 sibling, 2 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 23:18 UTC (permalink / raw)
  To: Avery Pennarun
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska



On Fri, 7 May 2010, Avery Pennarun wrote:
> 
> Maybe we should rethink this from the top.  Imagine that we currently
> have no crlf options whatsoever.  What *should* it look like?  I
> suggest the following:
> 
> Config:
>    core.eolOverride = lf / crlf / auto / binary / input
>    core.eolDefault = lf / crlf / auto / binary / input

Ugh. Hell no. What an ugly format. What does that crazy "override vs 
default" even _mean_?

So no.

Plus the above is confused anyway. The only reason to ever support 'lf' is 
if you're a total moron of a SCM, and you save files you know are text in 
CRLF format internally. That's just f*cking stupid.

So the above is just crazy talk.

The options that make sense are:

 - disabling all "text" issues, and considering everything to be pure 
   binary. This is the "I know I'm sane and unix" option, or the "doing 
   any conversion is always wrong" option.

   We'd call this "binary" or "off" or "false".

 - if you recognize a text-file, and consider it text and different from 
   binary, at a _minimum_ it needs what we call "input". Anything else is 
   crazy-talk. We don't save the same text-file in different formats, and 
   we know that CRLF (or CR) is just a stupid format for text.

   So there are zero options for the input side. If we don't do CRLF -> LF 
   conversion on input, it's worthless even _talking_ about text vs binary.

 - For output, there are exactly three choices: "do nothing" (aka just 
   "input", aka "LF"), output in native format (CRLF on Windows, LF on 
   UNIX), or "force CRLF" regardless of any defaults (and the last 
   probably doesn't make sense in practice, but is good for test-suites, 
   so that you can get CRLF output even on sane platforms).

So I think the _only_ sane choices are basically

	core.crlf=[off|input|on|force]

where you may obviously have aliases (ie "off", "false" and "binary" could 
all mean the same thing, and you could alias "input" to "lf" and "force" 
to "crlf").
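
A minimal Python sketch of those four modes, as an illustration only
and not git's actual conversion code. The function names, the alias
table and the native_crlf parameter are assumptions made for this
example, and text/binary auto-detection is left out entirely.

```python
import os

# Aliases suggested above: "false"/"binary" for "off", "lf" for "input",
# "crlf" for "force".
ALIASES = {"false": "off", "binary": "off", "lf": "input", "crlf": "force"}


def to_repo(data: bytes, mode: str) -> bytes:
    """Input side: every text mode stores LF; only "off" skips conversion."""
    mode = ALIASES.get(mode, mode)
    if mode == "off":
        return data
    return data.replace(b"\r\n", b"\n")


def to_worktree(data: bytes, mode: str,
                native_crlf: bool = (os.linesep == "\r\n")) -> bytes:
    """Output side: do nothing, use the native ending, or force CRLF."""
    mode = ALIASES.get(mode, mode)
    if mode == "force" or (mode == "on" and native_crlf):
        # Collapse to LF first so an existing CRLF is not turned into CRCRLF.
        return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return data  # "off", "input", or "on" on an LF platform
```

In this sketch "on" and "force" differ only on platforms whose native
line ending is LF.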

And the above is basically what we have. Except that for historical 
reasons (ie we didn't even _have_ any attributes) it got mixed up with 
"do we want to do this automatically", so "autocrlf=on" actually ends up 
being "yes, do automatic detection" _and_ what I'd call "core.crlf=force" 
above.

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 23:18                                                   ` Linus Torvalds
@ 2010-05-07 23:47                                                     ` hasen j
  2010-05-07 23:50                                                       ` Linus Torvalds
  2010-05-08  0:31                                                     ` Avery Pennarun
  1 sibling, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-07 23:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska

>
> The only reason to ever support 'lf' is
> if you're a total moron of a SCM, and you save files you know are text in
> CRLF format internally. That's just f*cking stupid.
>

What if:

- The entire history of the file is stored in CRLF
- It's a windows-only file where the official "tool" that reads it
barfs on LF line endings.
- Third party tools also expect (or at least, handle) CRLF line endings.

Even if you end up deciding to store it with LF line endings
internally, it should still be *always* checked out with CRLF endings.

And no, just because I want certain files to be checked out with CRLF
endings, doesn't mean that I want all files to be checked out that
way. This is one of the areas where git's crlf handling is lacking
right now.

Also, git-diff should ignore eol differences by default, unless
explicitly asked not to (currently it's the other way around).

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 23:47                                                     ` hasen j
@ 2010-05-07 23:50                                                       ` Linus Torvalds
  2010-05-08  0:19                                                         ` hasen j
  0 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-07 23:50 UTC (permalink / raw)
  To: hasen j
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska



On Fri, 7 May 2010, hasen j wrote:

> >
> > The only reason to ever support 'lf' is
> > if you're a total moron of a SCM, and you save files you know are text in
> > CRLF format internally. That's just f*cking stupid.
> >
> 
> What if:
> 
> - The entire history of the file is stored in CRLF
> - It's a windows-only file where the official "tool" that reads it
> barfs on LF line endings.
> - Third party tools also expect (or at least, handle) CRLF line endings.

Umm. Then it's not text, is it? What you are describing is a binary file 
that happens to look like text with CRLF.

If it's _text_, then you import it as such, and set crlf=true so that it 
gets checked out with crlf.

		Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 23:50                                                       ` Linus Torvalds
@ 2010-05-08  0:19                                                         ` hasen j
  2010-05-08  0:33                                                           ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-08  0:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska

On 7 May 2010 17:50, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Fri, 7 May 2010, hasen j wrote:
>
>> >
>> > The only reason to ever support 'lf' is
>> > if you're a total moron of a SCM, and you save files you know are text in
>> > CRLF format internally. That's just f*cking stupid.
>> >
>>
>> What if:
>>
>> - The entire history of the file is stored in CRLF
>> - It's a windows-only file where the official "tool" that reads it
>> barfs on LF line endings.
>> - Third party tools also expect (or at least, handle) CRLF line endings.
>
> Umm. Then it's not text, is it? What you are describing is a binary file
> that happens to look like text with CRLF.

That depends on your definition of text.

Storing it with LF internally is ok, as long as we can have it
*always* be checked out as crlf.

> If it's _text_, then you import it as such, and set crlf=true so that it
> gets checked out with crlf.

It should be the repository maintainer's responsibility to tell git to
always checkout that file with crlf.

Why?

Because it's part of the project. I never set crlf=true on windows,
but if some files just *have* to have crlf, then I wouldn't mind
having them that way.
This doesn't mean I should have to pollute all my files with crlf just
to please visual studio, or whatever tool requires the crlf endings.

Other developers (especially those new to git) shouldn't have to worry
about crlf issues: when they clone, git would automatically convert
some files to crlf on checkout, regardless of whether or not they set
crlf=true.

git currently has it backward: putting the onus on each individual
contributor to set autocrlf=true.

This doesn't make any sense.

If someone did want everything to be crlf, sure, they can set crlf=true.

But there's another potential problem: what if some files just *can't*
have crlf? Say some build (or whatever) tool barfs on crlf files, and
the user sets crlf=true because that's his preferred eol style, but
the project has one of those lf-only files? In this case, we'd want
that file to be always checked out with LF, even if crlf=true is set.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 23:18                                                   ` Linus Torvalds
  2010-05-07 23:47                                                     ` hasen j
@ 2010-05-08  0:31                                                     ` Avery Pennarun
  1 sibling, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-08  0:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eyvind Bernhardsen, Junio C Hamano, git, hasan.aljudy, kusmabite,
	prohaska

On Fri, May 7, 2010 at 7:18 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, 7 May 2010, Avery Pennarun wrote:
>> Maybe we should rethink this from the top.  Imagine that we currently
>> have no crlf options whatsoever.  What *should* it look like?  I
>> suggest the following:
>>
>> Config:
>>    core.eolOverride = lf / crlf / auto / binary / input
>>    core.eolDefault = lf / crlf / auto / binary / input
>
> Ugh. Hell no. What an ugly format. What does that crazy "override vs
> default" even _mean_?

That's easy:

 - if "override" is set, it overrides any attribute setting.
 - if "default" is set, we use it when there's no attribute or override setting.

We can argue about whether having two config options is strictly
necessary from a formal truth table point of view, and you'll probably
win the argument because it all makes my head spin.  My argument is
simpler: if it makes my head spin, it probably makes other people's
heads spin.  The way I described is simple enough for anyone to
understand.

> Plus the above is confused anyway. The only reason to ever support 'lf' is
> if you're a total moron of a SCM, and you save files you know are text in
> CRLF format internally. That's just f*cking stupid.

What I meant by "lf" is just what we currently mean by "crlf=false".
It's more clear for the average person to say "eol=lf" than
"crlf=false", because "crlf=false" doesn't say what you *do* want, it
only says what you *don't* want.

Clearly any repo storing some other weird line ending, then converting
it to LF, is not what we want here.

>  - disabling all "text" issues, and considering everything to be pure
>   binary. This is the "I know I'm sane and unix" option, or the "doing
>   any conversion is always wrong" option.
>
>   We'd call this "binary" or "off" or "false".

Sure, that's what I called "binary" above.

>  - if you recognize a text-file, and consider it text and different from
>   binary, at a _minimum_ it needs what we call "input". Anything else is
>   crazy-talk. We don't save the same text-file in different formats, and
>   we know that CRLF (or CR) is just a stupid format for text.
>
>   So there are zero options for the input side. If we don't do CRLF -> LF
>   conversion on input, it's worthless even _talking_ about text vs binary.

That sounds good to me.  So this was a mistake in the original
implementation of autocrlf; let's just correct it, and make all text
modes do input conversion.

Note that, in prior threads on this topic, there was some objection to
doing crlf=anything by default because it wastes CPU in the common
case that people are running on Unix and aren't doing screwy things
with line endings.  Defaulting to crlf=input would require us to waste
CPU here.  Is that ok?

>  - For output, there are exactly three choices: "do nothing" (aka just
>   "input", aka "LF"), output in native format (CRLF on Windows, LF on
>   UNIX), or "force CRLF" regardless of any defaults (and the last
>   probably doesn't make sense in practice, but is good for test-suites,
>   so that you can get CRLF output even on sane platforms).
>
> So I think the _only_ sane choices are basically
>
>        core.crlf=[off|input|on|force]

One nice thing about my suggestion is that it completely avoids the
concept of a "native CRLF format."  Because nowadays, that's just not
very useful.  On Unix sometimes I need crlf files; on Windows
sometimes I need lf files.  Yes, we can still implement that in terms
of "native" terminology, but it seems to be a roundabout way of stating
what I want.

> And the above is basically what we have. Except that for historical
> reasons (ie we didn't even _have_ any attributes) it got mixed up with
> "do we want to do this automatically", so "autocrlf=on" actually ends up
> being "yes, do automatic detection" _and_ what I'd call "core.crlf=force"
> above.

Functionally, yes, we have this already.  Your new proposal is
essentially to make crlf=auto (= unspecified) actually always
include crlf=input behaviour, which sounds good to me, but may be
backwards incompatible in some important way.  (I wouldn't think
anybody would want the non-fixing-stuff behaviour.  But I wonder what
it would do to git-svn... maybe it could just check everything in as
if it were crlf=binary, if it doesn't already.)

My suggestion doesn't much change this functionality, but attempts to
straighten out the terminology so normal humans can understand what
will happen.  Not sure if that's worth it, given that we'll probably
have to support the old attribute names forever anyhow, and adding a
second set of words might confuse normal humans all the more.  But I
would much rather teach people to use it using my terminology than
crlf=true/false/binary terminology.  What does "crlf=binary" mean?

Have fun,

Avery

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  0:19                                                         ` hasen j
@ 2010-05-08  0:33                                                           ` Linus Torvalds
  2010-05-08  1:39                                                             ` hasen j
  0 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-08  0:33 UTC (permalink / raw)
  To: hasen j
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska



On Fri, 7 May 2010, hasen j wrote:

> On 7 May 2010 17:50, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >> What if:
> >>
> >> - The entire history of the file is stored in CRLF
> >> - It's a windows-only file where the official "tool" that reads it
> >> barfs on LF line endings.
> >> - Third party tools also expect (or at least, handle) CRLF line endings.
> >
> > Umm. Then it's not text, is it? What you are describing is a binary file
> > that happens to look like text with CRLF.
> 
> That depends on your definition of text.

Well, my definition of text is "does it make sense to do any end-of-line 
conversions". That's the only definition that makes sense for an SCM, at 
least in the current context. If doing conversions on the line endings is 
wrong, then it's not text.

And your whole premise was that conversions were always wrong. So the way 
you put it, that's not a text-file, it's a binary file.

> Storing it with LF internally is ok, as long as we can have it
> *always* be checked out as crlf.

.. and that's what I suggested "core.crlf=on" would mean.

However, if you think that it needs to be CRLF on _all_ platforms, even 
platforms where CRLF is _wrong_ for a text-file, then see above: in that 
case it's not a text-file at all as far as the SCM is concerned.

In that case it's just a binary file, and CRLF is _not_ "end of text 
line", it's part of the definition of the format for that binary file.

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  0:33                                                           ` Linus Torvalds
@ 2010-05-08  1:39                                                             ` hasen j
  2010-05-08  1:49                                                               ` Linus Torvalds
  0 siblings, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-08  1:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska

On 7 May 2010 18:33, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Fri, 7 May 2010, hasen j wrote:
>
>> On 7 May 2010 17:50, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>> >> What if:
>> >>
>> >> - The entire history of the file is stored in CRLF
>> >> - It's a windows-only file where the official "tool" that reads it
>> >> barfs on LF line endings.
>> >> - Third party tools also expect (or at least, handle) CRLF line endings.
>> >
>> > Umm. Then it's not text, is it? What you are describing is a binary file
>> > that happens to look like text with CRLF.
>>
>> That depends on your definition of text.
>
> Well, my definition of text is "does it make sense to do any end-of-line
> conversions". That's the only definition that makes sense for an SCM, at
> least in the current context. If doing conversions on the line endings is
> wrong, then it's not text.
>
> And your whole premise was that conversions were always wrong. So the way
> you put it, that's not a text-file, it's a binary file.
>
>> Storing it with LF internally is ok, as long as we can have it
>> *always* be checked out as crlf.
>
> .. and that's what I suggested "core.crlf=on" would mean.
>
> However, if you think that it needs to be CRLF on _all_ platforms, even
> platforms where CRLF is _wrong_ for a text-file, then see above: in that
> case it's not a text-file at all as far as the SCM is concerned.
>
> In that case it's just a binary file, and CRLF is _not_ "end of text
> line", it's part of the definition of the format for that binary file.
>
>                        Linus
>

(sorry about the previous message, forgot to make it reply all)

What does the platform care? This doesn't make any sense. Files that
need CRLF are not Unix files to begin with (e.g. sln).

My whole argument is based on a simple premise: LF -> CRLF doesn't
make sense because all windows editors can handle LF endings, and
because it just causes a lot of confusion.

Until Erik brought up the case where a multi-platform project uses
different build systems on each platform.

I don't know if .sln is one of these formats where the tools will
vomit if it's not crlf, but let's just assume so.

- *.sln is not a Unix file, so it's perfectly ok (maybe even
desirable) to check it out with crlf.
- it's an exception; git doesn't have to convert _all_ files to crlf;
just the .sln ones.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  1:39                                                             ` hasen j
@ 2010-05-08  1:49                                                               ` Linus Torvalds
  2010-05-08  2:49                                                                 ` hasen j
  2010-05-08  3:34                                                                 ` Avery Pennarun
  0 siblings, 2 replies; 82+ messages in thread
From: Linus Torvalds @ 2010-05-08  1:49 UTC (permalink / raw)
  To: hasen j
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska



On Fri, 7 May 2010, hasen j wrote:
> > However, if you think that it needs to be CRLF on _all_ platforms, even
> > platforms where CRLF is _wrong_ for a text-file, then see above: in that
> > case it's not a text-file at all as far as the SCM is concerned.
> >
> > In that case it's just a binary file, and CRLF is _not_ "end of text
> > line", it's part of the definition of the format for that binary file.
> 
> What does the platform care? This doesn't make any sense. Files that
> need CRLF are not Unix files to begin with (e.g. sln).

Don't be silly.

The whole AND ONLY point of CRLF translation is that line-endings are 
different on different platforms.

So when you say "What does the platform care?", that is a totally idiotic 
and utterly stupid thing to ask.

And since you ask it, I can only assume that you don't understand anything 
about the whole CRLF discussion, that you don't care about cross-platform 
repositories, and that as a result you should NEVER EVER actually use any 
of the git crlf conversion code.

It's that simple. You seem to totally miss the whole point of the whole 
feature in the first place.

			Linus

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  1:49                                                               ` Linus Torvalds
@ 2010-05-08  2:49                                                                 ` hasen j
  2010-05-08  3:31                                                                   ` Robert Buck
  2010-05-08  3:34                                                                 ` Avery Pennarun
  1 sibling, 1 reply; 82+ messages in thread
From: hasen j @ 2010-05-08  2:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	kusmabite, prohaska

On 7 May 2010 19:49, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Fri, 7 May 2010, hasen j wrote:
>> > However, if you think that it needs to be CRLF on _all_ platforms, even
>> > platforms where CRLF is _wrong_ for a text-file, then see above: in that
>> > case it's not a text-file at all as far as the SCM is concerned.
>> >
>> > In that case it's just a binary file, and CRLF is _not_ "end of text
>> > line", it's part of the definition of the format for that binary file.
>>
>> What does the platform care? This doesn't make any sense. Files that
>> need CRLF are not Unix files to begin with (e.g. sln).
>
> Don't be silly.
>
> The whole AND ONLY point of CRLF translation is that line-endings are
> different on different platforms.
>
> So when you say "What does the platform care?", that is a totally idiotic
> and utterly stupid thing to ask.
>
> And since you ask it, I can only assume that you don't understand anything
> about the whole CRLF discussion, that you don't care about cross-platform
> repositories, and that as a result you should NEVER EVER actually use any
> of the git crlf conversion code.
>
> It's that simple. You seem to totally miss the whole point of the whole
> feature in the first place.
>
>                        Linus
>

I worked on several projects on windows where ALL my files were LF;
the platform didn't give a shit and everything worked great.

I don't suppose you use the CRLF feature yourself, not to mention
doing any windows development (ever?).

The way git handles crlf is just confusing; in fact it's so confusing
that it's often better to just turn it off. I'm not the only person
who thinks that. It's specifically confusing because git thinks "if
you're on windows then ALL your files should be CRLF", which is
clearly what you think.

The platform is not windows, it's the development tools. Most
development tools don't actually mind if the line endings are LF only,
and since CRLF conversions in git cause endless confusion, it's better
to turn it off most of the time, unless you're dealing with a retarded
tool that thinks CRLF is the only line ending and fails to read files
with LF endings.

When that happens, it's most likely the case that these files are
platform-dependent anyway, and so converting them back and forth
between LF and CRLF is just a waste of time.

The whole idea behind my suggestion is to minimize confusion.


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  2:49                                                                 ` hasen j
@ 2010-05-08  3:31                                                                   ` Robert Buck
  2010-05-08  3:45                                                                     ` Avery Pennarun
  0 siblings, 1 reply; 82+ messages in thread
From: Robert Buck @ 2010-05-08  3:31 UTC (permalink / raw)
  To: git
  Cc: Linus Torvalds, Avery Pennarun, Eyvind Bernhardsen,
	Junio C Hamano, kusmabite, prohaska

On Fri, May 7, 2010 at 10:49 PM, hasen j <hasan.aljudy@gmail.com> wrote:
> On 7 May 2010 19:49, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>>
>>
>> Don't be silly.
>>
>> The whole AND ONLY point of CRLF translation is that line-endings are
>> different on different platforms.
>>
>>                        Linus

Actually, Linus, that depends. And while you will recognize this, let
me state the obvious, that there are cases where for certain text
files the platform does not matter, that for all platforms they MUST
normalize to one setting. For instance there are cases where text
files MUST be LF ended on ALL platforms. Have you considered XML to be
one such example? The W3 XML spec states:

   ... [XML processors] MUST behave as if it normalized all line
breaks in external parsed entities (including the document entity) on
input, before parsing, by translating both the two-character sequence
#xD #xA and any #xD that is not followed by #xA to a single #xA
character.

So here is an example of a text file that by convention MUST be
LF-based, yes, even on Windows. And for the record, solution (sln)
files have been an XML format for seven years now. So in any one
workspace it is entirely reasonable that there may be some text files
that MUST have LF, while for other files they SHOULD have CR/LF. There
are also cases where some text files MUST have CR/LF (some scripting
languages barf on Windows otherwise).
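
To make the quoted rule concrete, here is a minimal Python sketch of the
line-break normalization it describes. It is only an illustration under
stated assumptions: the function name is made up for this example, and it
operates on raw bytes rather than on a parsed entity as a real XML
processor would.

```python
def normalize_xml_line_breaks(data: bytes) -> bytes:
    """Translate CRLF pairs and lone CRs to a single LF (hypothetical helper)."""
    # Handle the two-character sequence first so each CRLF yields one LF,
    # then convert any CR that was not followed by LF.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    sample = b"<a>\r\n<b/>\r</a>\n"
    print(normalize_xml_line_breaks(sample))
```

Note that the rule describes what a processor does on input; the sketch
says nothing about which line ending the file must have on disk.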

[snip ...]

> The way git handles crlf is just confusing; in fact it's so confusing
> that it's often better to just turn it off. I'm not the only person
> who thinks that. It's specifically confusing because git thinks "if
> you're on windows then ALL your files should be CRLF", which is
> clearly what you think.

Hasen makes a good point here. It is simply this, the LF issue does
not boil down to a single boolean switch. People who think of the
LF/CRLF issue as a boolean switch are not dealing with all the facts.
There's a lot of grey, not simply black and white.

Commercial systems, decent ones that is, have had this right for years
(12+ years as I recall). We wouldn't be asking Git to do the right
thing if we weren't sold on Git already. Git is otherwise fantastic
(with using it on Windows being the apparent exception, hence this
conversation).

[snip ...]

> When that happens, it's most likely the case that these files are
> platform-dependent anyway, and so converting them back and forth
> between LF and CRLF is just a waste of time.

I disagree on this one actually, this comment is not spot on. Again,
it depends. I'd generally say,

* perform conversions, or no conversions as the case may be, on the
obvious file types
* when conversions occur, normalize internally to only one convention
* otherwise perform no conversions

> The whole idea behind my suggestion is to minimize confusion.

Confusion, yes. The Git documentation is very confusing on this
point... Linus and Junio may want to lift a page from the Perforce
book ;)

I would hope that people do agree there is a problem here, that Git
SHOULD have a good answer to the issue of line feeds. I am no expert
on Git, and I will not pretend to be, but at Iron Mountain we are
looking at adopting Git, and this is one of two questions that I have.
Having worked with complete pleasure for years with Perforce,
line-feeds had NEVER been an issue, but the documentation about
line-feed support in Git seems a bit "odd". Mind you, as much as I
love Perforce, I also love Git, perhaps more (except for Git on
Windows). But I digress, so back to the point...

By the way, Linus and Junio, have you read this yet:

*   http://kb.perforce.com/?article=063

It would seem to me there are some text files that by convention MUST
have LF regardless of the platform, and there are examples of text
files that MAY have CRLF depending upon the platform.

So long as an SCM has a provision to permit, whether by prescription
and/or by convention, various line-feed types, files will naturally
fall into one of the following three categories:

* normalization to LF on input, preserving otherwise; e.g. XML
* automatic conversions to platform line feeds for files otherwise
considered ordinary text
* no conversions for everything else, treated as binary

Classic examples of files that MUST have conversions to platform
line-feeds are scripts (but not all types of scripts mind you) that
otherwise would not parse properly. I'm sure we've all seen cases of
this, especially when copying files from one system type to another
over a mount. XML-based build environments are particularly
troublesome in this regard (e.g. Ant).

- Bob


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  1:49                                                               ` Linus Torvalds
  2010-05-08  2:49                                                                 ` hasen j
@ 2010-05-08  3:34                                                                 ` Avery Pennarun
  1 sibling, 0 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-08  3:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: hasen j, Eyvind Bernhardsen, Junio C Hamano, git, kusmabite, prohaska

On Fri, May 7, 2010 at 9:49 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> So when you say "What does the platform care?", that is a totally idiotic
> and utterly stupid thing to ask.
>
> And since you ask it, I can only assume that you don't understand anything
> about the whole CRLF discussion, that you don't care about cross-platform
> repositories, and that as a result you should NEVER EVER actually use any
> of the git crlf conversion code.

I guess there's your use case for being able to turn off crlf=input, then. :)

Hasen: you and Linus don't seem to be communicating clearly, but it
looks to me like Linus's proposed changes would work fine for your use
case.  What you want is for the repository maintainer to be able to
control whether a file is checked out with crlf or not; this is
possible with *either* a per-project .gitconfig or a crlf=true
attribute that works when core.autocrlf is unspecified, which are
Linus's two suggested options.  If you really, truly want your crlf
characters not to be messed with, then set crlf=false, which means
"binary." [1].

[1] Which reminds me of my opinion about it being too hard to tell
what you're specifying given the current set of config options. But
'man gitattributes' makes at least this point clear.

Have fun,

Avery


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  3:31                                                                   ` Robert Buck
@ 2010-05-08  3:45                                                                     ` Avery Pennarun
  2010-05-08 10:36                                                                       ` hasen j
  2010-05-08 11:36                                                                       ` Robert Buck
  0 siblings, 2 replies; 82+ messages in thread
From: Avery Pennarun @ 2010-05-08  3:45 UTC (permalink / raw)
  To: Robert Buck
  Cc: git, Linus Torvalds, Eyvind Bernhardsen, Junio C Hamano,
	kusmabite, prohaska

On Fri, May 7, 2010 at 11:31 PM, Robert Buck <buck.robert.j@gmail.com> wrote:
> Actually, Linus, that depends. And while you will recognize this, let
> me state the obvious, that there are cases where for certain text
> files the platform does not matter, that for all platforms they MUST
> normalize to one setting. For instance there are cases where text
> files MUST be LF ended on ALL platforms. Have you considered XML to be
> one such example? The W3 XML spec states:
>
>   ... [XML processors] MUST behave as if it normalized all line
> breaks in external parsed entities (including the document entity) on
> input, before parsing, by translating both the two-character sequence
> #xD #xA and any #xD that is not followed by #xA to a single #xA
> character.

Erm, this seems to be a counterexample to your point.  It says very
clearly that the files can use either LF or CRLF line endings, and
will be parsed correctly either way, or your parser is broken.  So
pretty much any CRLF conversion rule (or none at all) will work with
such files.

Hasen wrote:
>> The way git handles crlf is just confusing; in fact it's so confusing
>> that it's often better to just turn it off.

True.  This discussion is about fixing that, though, so it seems
unnecessary to make that point.

> Hasen makes a good point here. It is simply this, the LF issue does
> not boil down to a single boolean switch. People who think of the
> LF/CRLF issue as a boolean switch are not dealing with all the facts.
> There's a lot of grey, not simply black and white.

How on earth is anyone suggesting that it's a simple boolean switch?
Linus posted an 8-cell truth table earlier, and he hadn't even
included all the cases.

> I'd generally say,
>
> * perform conversions, or no conversions as the case may be, on the
> obvious file types
> * when conversions occur, normalize internally to only one convention
> * otherwise perform no conversions

Unfortunately those steps aren't clear enough to be helpful.  "as the
case may be" and "obvious file types" are definitely not obvious, or
we wouldn't be here.

> Confusion, yes. The Git documentation is very confusing on this
> point... Linus and Junio may want to lift a page from the Perforce
> book ;)

I've learned that git people never learn from anyone's book.  svn has
also had this problem solved pretty much forever, and would be easy to
copy.  For better or for worse, it all has to be hashed out from
scratch or it won't happen.

> It would seem to me there are some text files that by convention MUST
> have LF regardless of the platform, and there are examples of text
> files that MAY have CRLF depending upon the platform.

Well... obviously.  The former case is crlf=false; the latter is
crlf=true.  To bring up my point again about the confusing
configuration options, you might think that "crlf=true" means "always
CRLF", but in fact that's not the case.  In fact it works the way you
want.

Have fun,

Avery


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  3:45                                                                     ` Avery Pennarun
@ 2010-05-08 10:36                                                                       ` hasen j
  2010-05-08 11:36                                                                       ` Robert Buck
  1 sibling, 0 replies; 82+ messages in thread
From: hasen j @ 2010-05-08 10:36 UTC (permalink / raw)
  To: Avery Pennarun, Robert Buck, git, Linus Torvalds, Eyvind Bernhardsen

>
> It's that simple. You seem to totally miss the whole point of the whole
> feature in the first place.
>
>                        Linus

Sure, I won't deny, it always baffled me why it's built into git.

The only good reason I could think of is avoiding scenarios where someone
saves a file with different line endings and then all merging hell
would break loose because all lines are changed. Although
theoretically I think that can be avoided if the merge algorithm
normalized line endings before the merge (but really, I don't know
anything about merging).
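
A rough Python sketch of the idea of normalizing line endings before
comparing two versions. This is not git's merge code; the function name
and the use of difflib are assumptions chosen only to show why a file
resaved with CRLF looks entirely changed unless it is normalized first.

```python
import difflib


def count_changed_lines(old: bytes, new: bytes, normalize: bool = False) -> int:
    """Count lines that differ between two versions (hypothetical helper)."""
    if normalize:
        # Collapse CRLF to LF on both sides before comparing.
        old = old.replace(b"\r\n", b"\n")
        new = new.replace(b"\r\n", b"\n")
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += max(i2 - i1, j2 - j1)
    return changed


if __name__ == "__main__":
    lf_version = b"a\nb\nc\n"
    crlf_version = b"a\r\nb\r\nc\r\n"
    print(count_changed_lines(lf_version, crlf_version))
    print(count_changed_lines(lf_version, crlf_version, normalize=True))
```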

Under this assumption, the point of autocrlf is that windows users
should commit with LF endings even if they use CRLF in the working
directory (e.g. some stupid text editor resaves files with crlf).

If that's not the reason, then why the hell does git care about
converting line ending styles?

If the only reason is "LF is not a new line in Windows", then I'll go
back to my previous opinion that autocrlf is useless most of the time
and shouldn't be builtin; use smudge/clean filters instead if you
really need crlf files.



>>   ... [XML processors] MUST behave as if it normalized all line
>> breaks in external parsed entities (including the document entity) on
>> input, before parsing, by translating both the two-character sequence
>> #xD #xA and any #xD that is not followed by #xA to a single #xA
>> character.
>
> Erm, this seems to be a counterexample to your point.  It says very
> clearly that the files can use either LF or CRLF line endings, and
> will be parsed correctly either way, or your parser is broken.  So
> pretty much any CRLF conversion rule (or none at all) will work with
> such files.

Agreed. This is an example where all line endings are valid on all platforms.

>
> Hasen wrote:
>>> The way git handles crlf is just confusing; in fact it's so confusing
>>> that it's often better to just turn it off.
>
> True.  This discussion is about fixing that, though, so it seems
> unnecessary to make that point.

It is necessary. It's broken because the assumptions it's built on are wrong.

>> Hasen makes a good point here. It is simply this, the LF issue does
>> not boil down to a single boolean switch. People who think of the
>> LF/CRLF issue as a boolean switch are not dealing with all the facts.
>> There's a lot of grey, not simply black and white.
>
> How on earth is anyone suggesting that it's a simple boolean switch?
> Linus posted an 8-cell truth table earlier, and he hadn't even
> included all the cases.

That's cool and all, but we need to simplify it; not make it more
confusing. The name autocrlf is confusing all by itself: what does it
mean? is it a two way conversion or a one way conversion? Where the
hell did "input" come from? I always have to pull up the man pages.

I'd rather be able to say:

- My overall preference is 'lf'
- For this repo, this file here is always 'lf' (takes precedence over
the above preference)
- And this other file here is always 'crlf' (ditto)

This model makes way more sense for me as a user and for the project.


>> Confusion, yes. The Git documentation is very confusing on this
>> point... Linus and Junio may want to lift a page from the Perforce
>> book ;)
>
> I've learned that git people never learn from anyone's book.  svn has
> also had this problem solved pretty much forever, and would be easy to
> copy.  For better or for worse, it all has to be hashed out from
> scratch or it won't happen.

No, I actually think git got source control right exactly because it
didn't bother copying other existing systems. Other systems'
solutions don't necessarily fit with git's model.


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08  3:45                                                                     ` Avery Pennarun
  2010-05-08 10:36                                                                       ` hasen j
@ 2010-05-08 11:36                                                                       ` Robert Buck
  1 sibling, 0 replies; 82+ messages in thread
From: Robert Buck @ 2010-05-08 11:36 UTC (permalink / raw)
  To: git
  Cc: Linus Torvalds, Eyvind Bernhardsen, Junio C Hamano, kusmabite,
	prohaska, Avery Pennarun

[...]

>> character.
>
> Erm, this seems to be a counterexample to your point.  It says very
> clearly that the files can use either LF or CRLF line endings, and
> will be parsed correctly either way, or your parser is broken.  So
> pretty much any CRLF conversion rule (or none at all) will work with
> such files.

Perhaps I was not clear, or you did not understand my point.

Read "...by translating... to #xA", XSLT output to a file therefore
MUST be LF by definition for it to be canonical form. This is an
example of a TEXT file that MUST by definition of the spec be LF based
on all platforms. Looking at the "auto" code that exists in Git, it
does not appear to support this very obvious standard, whereby for
this "file-type" it should always be checked out of source control
with LF regardless of how it came in. This is equivalent to the Git
"input" setting I believe (?), but on a file-type basis. Yes, Git
apparently does not have the notion of file-types, does it (e.g. *.xml
maps to text)?

The point I am really trying to make clear is that there are multiple
dimensions to this problem, and not making that succinct will result
in a botched attempt. We need to carefully distinguish file-types from
other switches that control whether or not to perform automatic
conversions. The two dimensions are eol-style and file-type.

THE SWITCHES

So for the switches, here is what would be meaningful to me, short, sweet:

core.autocrlf  :: true false
core.eolstyle  :: local share lf crlf

If autocrlf is false, then what comes out is exactly what goes in.

EOL-STYLE

The eolstyle property only applies to text files (discussed later):

- "local" means normalize "text" files to LF when read in, and convert
to the platform preferred setting when materializing workspaces.
- "share" means accept anything, but when writing files to a workspace
normalize to LF (XML, XSLT, some scripting languages ...)
- "lf" means always to accept anything though and convert to LF, output LF
- "crlf" means to accept anything and convert to CRLF on output
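
The proposed switches can be sketched as a checkout-side conversion in
Python. This is only one reading of the proposal: core.eolstyle is the
setting suggested in this message, not an existing git option, the
function and parameter names are invented for the example, and "share"
and "lf" are treated identically on output because the description above
does not distinguish them there.

```python
import os


def convert_for_worktree(stored: bytes, eolstyle: str, autocrlf: bool = True,
                         platform_eol: bytes = os.linesep.encode("ascii")) -> bytes:
    """Return the bytes to write to the working tree (hypothetical helper)."""
    if not autocrlf:
        # "What comes out is exactly what goes in."
        return stored
    # Accept anything: collapse CRLF to LF first so mixed input is handled.
    lf_text = stored.replace(b"\r\n", b"\n")
    if eolstyle == "local":
        return lf_text.replace(b"\n", platform_eol)
    if eolstyle in ("share", "lf"):
        return lf_text
    if eolstyle == "crlf":
        return lf_text.replace(b"\n", b"\r\n")
    raise ValueError(f"unknown eolstyle: {eolstyle!r}")


if __name__ == "__main__":
    mixed = b"one\r\ntwo\n"
    for style in ("local", "share", "lf", "crlf"):
        print(style, convert_for_worktree(mixed, style))
```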

FILE-TYPES

Linus alluded to file-types above, and to being explicit about them. That's
great, I agree. Let me provide examples:

By extension:
    http://www.perforce.com/perforce/doc.current/manuals/cmdref/o.ftypes.html

By pathnames or extensions:
    http://www.perforce.com/perforce/doc.current/manuals/cmdref/typemap.html

Don't beat me up for referencing other systems, please. But as people
move to Git from other systems there will be some level of
expectation, so understanding those perspectives and expectations so
you are prepared to provide a meaningful answer would help.

AUTO/TEXT-DETECTION

So the above explicit definitions get you most of the way, but what
about "auto"? This is a question at the heart of convert.c, the
gather_stats function that classifies among other things whether or
not an input is text or binary.

While gather_stats is a good start, it naively is US-centric; it most
assuredly does not address UTF-8 and ISO-8859-1, both of which are
VERY easy to identify, but are not presently handled by this
algorithm. I wrote a simple stat gatherer for the MATLAB kernel years
ago that classified the character-set of arbitrary input text to one
of about a half-dozen common character-sets, so what about adding in a
lightweight checker for at least UTF-8 and ISO-8859-1? I could provide
such a thing back to this community if people wish.

To have a little more in the gather_stats code to handle a couple more
cases would go a long way and would be easy to add, and does not
necessarily depend upon file-type support. It would simply broaden what
it means to be a text file.
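
As a hedged illustration of what such a lightweight checker might look
like, here is a Python sketch. It is not the MATLAB code mentioned above
and not what gather_stats does; both function names and both heuristics
(reject NUL bytes; for ISO-8859-1, also reject the 0x80-0x9F control
range) are assumptions made for this example.

```python
def looks_like_utf8_text(data: bytes) -> bool:
    """Heuristic: no NUL bytes and the whole buffer decodes as UTF-8."""
    if b"\0" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def looks_like_latin1_text(data: bytes) -> bool:
    """Heuristic: no NUL bytes and no bytes in the C1 control range."""
    # Every byte sequence decodes as ISO-8859-1, so decoding proves nothing;
    # bytes 0x80-0x9F are control codes there and are rare in real text.
    return b"\0" not in data and not any(0x80 <= byte <= 0x9F for byte in data)


if __name__ == "__main__":
    print(looks_like_utf8_text("héllo\n".encode("utf-8")))
    print(looks_like_latin1_text("héllo\n".encode("latin-1")))
```

A checker that only sees the first part of a file would also need to
tolerate a multi-byte UTF-8 sequence cut off at the end of the buffer,
which this sketch does not attempt.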


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-07 21:54                                             ` Linus Torvalds
  2010-05-07 22:14                                               ` Linus Torvalds
  2010-05-07 22:19                                               ` Avery Pennarun
@ 2010-05-08 20:49                                               ` Dmitry Potapov
  2010-05-08 21:54                                                 ` Linus Torvalds
  2 siblings, 1 reply; 82+ messages in thread
From: Dmitry Potapov @ 2010-05-08 20:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	hasan.aljudy, kusmabite, prohaska

On Fri, May 07, 2010 at 02:54:40PM -0700, Linus Torvalds wrote:
> 
>  (a) you should try to avoid do things like that in the first place. For 
>      something like an attribute file, you should just list the files you 
>      want to convert. That's the _point_ of an attribute. So it's much 
>      nicer if you instead actually are explicit about it, ie
> 
> 	*.[ch] crlf
> 	*.txt crlf
> 	*.jpg -crlf
> 
>      should be the _primary_ way you do it, since the autocrlf thing is a 
>      bit dangerous in theory.
> 
>  (b) But let's say that you want to do it anyway (because you're lazy 
>      and because autocrlf works pretty damn well in practice), isn't that 
>      a really ugly and crazy thing to add _another_ attribute name for 
>      that?
> 
>      IOW, if you really want to say "do automatic crlf for this set of 
>      paths", the natural syntax for that would be
> 
> 	* crlf=auto
> 
>      No? Not some totally new attribute name.

I like your proposal and it makes perfect sense to me, but I am not new
to git and core.autocrlf. I have observed that many people who were new
to Git often got confused by the meaning of the crlf attribute. In essence,
at first, they thought that it means what you would probably describe as
crlf=force. Thus, seeing something like this:

    *.sln -crlf

baffled them, because sln files have CRLF as ending. So, it was very
counter-intuitive for them. Of course, you can explain that Git stores
text files with LF internally, and why it is the sane thing to do, and
why sln files are not exactly text files (at least, non-text in sense
of eol-conversion), etc... but I believe that all that discussion and
explanation could be easily avoided by renaming 'crlf' as 'eol'.  Now,
if you look at this:

      *.sln -eol
      *.jpg -eol
      *.txt eol
      *.[ch] eol

it is clear that .sln and .jpg files are stored "as is", while Git does
the end-of-line conversion for other files in accordance with user's
preference. Why should users bother at all how Git stores text files
internally? They do not need to know that Git stores text files with LF
internally. They just want to checkout those files with the right ending
for their platform.
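
A small Python sketch of how attribute lines like the ones above can be
read: a bare name sets the attribute, a leading "-" unsets it,
"name=value" assigns a value, and a later matching line overrides an
earlier one. The 'eol' attribute here is the name proposed in this
message, the function names are made up, and fnmatch on the file's
basename is only an approximation of git's own pattern matching.

```python
import posixpath
from fnmatch import fnmatchcase


def parse_attr_lines(text):
    """Parse 'pattern attr ...' lines into (pattern, {attr: state}) rules."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        attrs = {}
        for token in fields[1:]:
            if token.startswith("-"):
                attrs[token[1:]] = False          # unset
            elif "=" in token:
                name, value = token.split("=", 1)
                attrs[name] = value               # set to a value
            else:
                attrs[token] = True               # set
        rules.append((fields[0], attrs))
    return rules


def lookup(rules, path, name):
    """Return the attribute state for path; None stands for unspecified."""
    result = None
    for pattern, attrs in rules:
        if name in attrs and fnmatchcase(posixpath.basename(path), pattern):
            result = attrs[name]                  # later lines win
    return result


if __name__ == "__main__":
    rules = parse_attr_lines("*.sln -eol\n*.jpg -eol\n*.txt eol\n*.[ch] eol\n")
    for path in ("proj/app.sln", "src/main.c", "README"):
        print(path, lookup(rules, path, "eol"))
```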

So, perhaps, 'eol' would be a better name than 'crlf' for new Git users.



Dmitry


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08 20:49                                               ` Dmitry Potapov
@ 2010-05-08 21:54                                                 ` Linus Torvalds
  2010-05-08 23:42                                                   ` Dmitry Potapov
  0 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2010-05-08 21:54 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	hasan.aljudy, kusmabite, prohaska



On Sun, 9 May 2010, Dmitry Potapov wrote:
>
> explanation could be easily avoided by renaming 'crlf' as 'eol'.

What the heck is wrong with people?

> Now, if you look at this:
> 
>       *.sln -eol
>       *.jpg -eol
>       *.txt eol
>       *.[ch] eol

Right. Look at it. It's totally incomprehensible. It's _worse_ than "crlf" 
as a name.

What the f*ck does "jpg" have to do with "eol"? Nothing.

You could talk about "binary" vs "text", and it would make sense, but your 
argument that "eol" is somehow better than "crlf" is just insane.

So I could certainly see

	*.jpg binary
	*.txt text

making sense. But "eol" is certainly no better than "crlf". 

In the end, crlf is what we have. We're not getting rid of it, so if 
somebody were to actually rename it, that would just mean that there are 
_two_ different ways to say the same thing. And quite frankly, I think 
that's worse than what we have now, so I don't think it's worth it.

		Linus


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08 21:54                                                 ` Linus Torvalds
@ 2010-05-08 23:42                                                   ` Dmitry Potapov
  2010-05-09  7:49                                                     ` Eyvind Bernhardsen
  0 siblings, 1 reply; 82+ messages in thread
From: Dmitry Potapov @ 2010-05-08 23:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Avery Pennarun, Eyvind Bernhardsen, Junio C Hamano, git,
	hasan.aljudy, kusmabite, prohaska

On Sat, May 08, 2010 at 02:54:35PM -0700, Linus Torvalds wrote:
> 
> 
> On Sun, 9 May 2010, Dmitry Potapov wrote:
> >
> > explanation could be easily avoided by renaming 'crlf' as 'eol'.
> 
> What the heck is wrong with people?
> 
> > Now, if you look at this:
> > 
> >       *.sln -eol
> >       *.jpg -eol
> >       *.txt eol
> >       *.[ch] eol
> 
> Right. Look at it. It's totally incomprehensible. It's _worse_ than "crlf" 
> as a name.
> 
> What the f*ck does "jpg" have to do with "eol"? Nothing.

Right, nothing, in other words, no eol conversion... and "-eol" seems to
express this idea well. So, I don't see why it is worse than "crlf"...

Personally, I do not care whether it is "crlf", or "eol", but a lot of
people that I know were confused by crlf, because they thought that it
means that this file is stored with crlf, while this attribute actually
means that file needs eol conversion.

> 
> You could talk about "binary" vs "text", and it would make sense, but your 
> argument that "eol" is somehow better than "crlf" is just insane.
> 
> So I could certainly see
> 
> 	*.jpg binary
> 	*.txt text
> 
> making sense. But "eol" is certainly no better than "crlf". 

What about .sln files? They are xml files with CRLF ending. Does it mean
they are binary? Based on how it is stored, it is certainly binary, but
when it comes to "diff" or even "merge" you may want to think about them
as text, and, in general, people tend to think about them as text files.

Another example is shell scripts. You really want them to be LF even on
Windows. So, is it a binary file too?

So, this approach is not so intuitive as it may appear if you consider
only .jpg and .txt.

> 
> In the end, crlf is what we have. We're not getting rid of it, so if 
> somebody were to actually rename it, that would just mean that there are 
> _two_ different ways to say the same thing. And quite frankly, I think 
> that's worse than what we have now, so I don't think it's worth it.

I was not sure myself that the idea of renaming was worth it... I do
think that "eol" is a better name than "crlf", but not by a big margin,
and as you said, crlf is what we have now... so be it...


Dmitry


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-08 23:42                                                   ` Dmitry Potapov
@ 2010-05-09  7:49                                                     ` Eyvind Bernhardsen
  2010-05-09 10:35                                                       ` Robert Buck
  0 siblings, 1 reply; 82+ messages in thread
From: Eyvind Bernhardsen @ 2010-05-09  7:49 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Linus Torvalds, Avery Pennarun, Junio C Hamano, git,
	hasan.aljudy, kusmabite, prohaska

On 9. mai 2010, at 01.42, Dmitry Potapov wrote:

> On Sat, May 08, 2010 at 02:54:35PM -0700, Linus Torvalds wrote:

[...]

>> You could talk about "binary" vs "text", and it would make sense, but your 
>> argument that "eol" is somehow better than "crlf" is just insane.
>> 
>> So I could certainly see
>> 
>> 	*.jpg binary
>> 	*.txt text
>> 
>> making sense. But "eol" is certainly no better than "crlf". 
> 
> What about .sln files? They are xml files with CRLF ending. Does it mean
> they are binary? Based on how it is stored, it is certainly binary, but
> when it comes to "diff" or even "merge" you may want to think about them
> as text, and, in general, people tend to think about them as text files.
> 
> Another example is shell scripts. You really want them to be LF even on
> Windows. So, is it a binary file too?

I think "binary" and "text" are the wrong things to talk about in this case.

If we were to follow Avery's suggestion that we look at what we would have implemented had autocrlf not already existed, it would be better to call "crlf" something like "eolconv".  You're not saying that a file is text or binary as such, rather that "I want eol conversion for this file" or "I don't want eol conversion for this file".

Flagging a file as "-eolconv" because it should always have LFs or always CRLFs seems logical to me.  "eolconv=auto" also makes sense.

[...]

>> In the end, crlf is what we have. We're not getting rid of it, so if 
>> somebody were to actually rename it, that would just mean that there are 
>> _two_ different ways to say the same thing. And quite frankly, I think 
>> that's worse than what we have now, so I don't think it's worth it.
> 
> I was not sure myself that the idea of renaming worth it... While I do
> think that "eol" is a better name than "crlf", but not by big margin,
> and as you said crlf is what we have now... so be it...

Renaming "crlf" might not be worth it, but thinking about what it should look like definitely is worth it.  Since I already have a patch series that changes this area, I'd like for it to be future proof.

I think the idea that we're stuck with "crlf" (or any bad ui design) for ever and ever is depressing, and I reject it.  It would be easy to create a new attribute with a better name that is the same setting under the hood, and deprecate "crlf".  The old attribute would still work in existing repositories (indefinitely, if needs be), but new users wouldn't have to be confused by its poor name.

I'm not saying I want to replace "crlf" right now!  I'm just saying that it makes sense to think about how we would want to replace it, and try not to introduce any new change that will make it harder to do the right thing later.
-- 
Eyvind


* Re: [PATCH/RFC 0/3] Per-repository end-of-line normalization
  2010-05-09  7:49                                                     ` Eyvind Bernhardsen
@ 2010-05-09 10:35                                                       ` Robert Buck
  0 siblings, 0 replies; 82+ messages in thread
From: Robert Buck @ 2010-05-09 10:35 UTC (permalink / raw)
  To: Eyvind Bernhardsen
  Cc: Dmitry Potapov, Linus Torvalds, Avery Pennarun, Junio C Hamano,
	git, hasan.aljudy, kusmabite, prohaska

On Sun, May 9, 2010 at 3:49 AM, Eyvind Bernhardsen
<eyvind.bernhardsen@gmail.com> wrote:
> On 9. mai 2010, at 01.42, Dmitry Potapov wrote:
>
>> On Sat, May 08, 2010 at 02:54:35PM -0700, Linus Torvalds wrote:
>
> [...]
>
>>> You could talk about "binary" vs "text", and it would make sense, but your
>>> argument that "eol" is somehow better than "crlf" is just insane.
>>>
>>> So I could certainly see
>>>
>>>      *.jpg binary
>>>      *.txt text
>>>
>>> making sense. But "eol" is certainly no better than "crlf".

Linus - Perhaps I missed this, but where would you have this typemap exist?
I like this sort of prescriptive approach; out of the box users would
get a bunch of reasonable defaults, but they could customize it by
adding/changing them.


* Re: What should be the CRLF policy when win + Linux?
  2010-05-06 23:25               ` Erik Faye-Lund
@ 2010-05-18 15:13                 ` Anthony W. Youngman
  0 siblings, 0 replies; 82+ messages in thread
From: Anthony W. Youngman @ 2010-05-18 15:13 UTC (permalink / raw)
  To: git

In message 
<o2v40aa078e1005061625md5fede79h660a22227c4f22d1@mail.gmail.com>, Erik 
Faye-Lund <kusmabite@googlemail.com> writes
>Closed source does not imply a single operating system, and you get
>these issues whenever you have a project with target systems with
>different newline styles. In my day job I develop closed source,
>multi-platform software, using git. So it's certainly not MY most
>common scenario.

And there are a lot more line endings out there than just lf or crlf.

Okay, the two I'm about to quote have, I believe, gone the way of the 
dinosaur, but wasn't the Mac just cr? And what is *still* my favourite 
system, Prime (a Multics derivative too), used a "packed lf", so your 
line ending could be either lf or lfnull depending on the line length. 
Text was always stored on disk as a whole number of words, a word being 
16 bits, so if your text was an even number of characters, the ending 
was lfnull to pad it to the next word boundary.
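
The padding rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this message, not code taken from any Prime system: the function name `pack_line` is hypothetical, and it assumes one byte per character and a 16-bit (two-byte) word.

```python
def pack_line(text: bytes) -> bytes:
    """Terminate a line with LF, then pad with NUL to a 16-bit word boundary."""
    record = text + b"\n"
    # An even-length text plus one LF byte is odd, so it needs one NUL pad byte.
    if len(record) % 2:
        record += b"\x00"
    return record
```

Under these assumptions an even-length text such as `b"ab"` ends in LF plus NUL, while an odd-length text such as `b"abc"` ends in a bare LF, so every stored line occupies a whole number of words.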

Cheers,
Wol
-- 
Anthony W. Youngman - anthony@thewolery.demon.co.uk



Thread overview: 82+ messages
2010-05-05 10:01 What should be the CRLF policy when win + Linux? mat
2010-05-05 13:27 ` Ramkumar Ramachandra
2010-05-06  9:27   ` mat
2010-05-06 10:03     ` Erik Faye-Lund
2010-05-06  2:35 ` hasen j
2010-05-06  7:29   ` Wilbert van Dolleweerd
2010-05-06 15:34     ` hasen j
2010-05-06 17:15       ` Linus Torvalds
2010-05-06 17:26         ` Erik Faye-Lund
2010-05-06 20:00         ` hasen j
2010-05-06 20:23           ` Linus Torvalds
2010-05-06 20:40           ` Erik Faye-Lund
2010-05-06 22:14             ` hasen j
2010-05-06 23:25               ` Erik Faye-Lund
2010-05-18 15:13                 ` Anthony W. Youngman
2010-05-06 22:27             ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Eyvind Bernhardsen
2010-05-06 22:27               ` [PATCH/RFC 1/3] Add "auto-eol" attribute and "core.eolStyle" config variable Eyvind Bernhardsen
2010-05-06 22:27               ` [PATCH/RFC 2/3] Add tests for per-repository eol normalization Eyvind Bernhardsen
2010-05-06 22:27               ` [PATCH/RFC 3/3] Add " Eyvind Bernhardsen
2010-05-06 23:38               ` [PATCH/RFC 0/3] Per-repository end-of-line normalization Avery Pennarun
2010-05-06 23:54                 ` Avery Pennarun
2010-05-07  8:45               ` Erik Faye-Lund
2010-05-07 16:33               ` Junio C Hamano
2010-05-07 16:57                 ` Avery Pennarun
2010-05-07 17:10                 ` Linus Torvalds
2010-05-07 19:02                   ` Linus Torvalds
2010-05-07 19:11                     ` Avery Pennarun
2010-05-07 19:16                       ` Linus Torvalds
2010-05-07 19:35                         ` Avery Pennarun
2010-05-07 19:45                           ` Linus Torvalds
2010-05-07 19:58                             ` Avery Pennarun
2010-05-07 20:06                               ` Linus Torvalds
2010-05-07 20:17                                 ` Linus Torvalds
2010-05-07 20:42                                   ` Eyvind Bernhardsen
2010-05-07 20:57                                     ` Linus Torvalds
2010-05-07 21:17                                       ` Eyvind Bernhardsen
2010-05-07 21:23                                         ` Linus Torvalds
2010-05-07 21:30                                           ` Avery Pennarun
2010-05-07 21:37                                             ` Eyvind Bernhardsen
2010-05-07 21:58                                               ` Linus Torvalds
2010-05-07 21:54                                             ` Linus Torvalds
2010-05-07 22:14                                               ` Linus Torvalds
2010-05-07 22:34                                                 ` Avery Pennarun
2010-05-07 22:54                                                   ` hasen j
2010-05-07 23:18                                                   ` Linus Torvalds
2010-05-07 23:47                                                     ` hasen j
2010-05-07 23:50                                                       ` Linus Torvalds
2010-05-08  0:19                                                         ` hasen j
2010-05-08  0:33                                                           ` Linus Torvalds
2010-05-08  1:39                                                             ` hasen j
2010-05-08  1:49                                                               ` Linus Torvalds
2010-05-08  2:49                                                                 ` hasen j
2010-05-08  3:31                                                                   ` Robert Buck
2010-05-08  3:45                                                                     ` Avery Pennarun
2010-05-08 10:36                                                                       ` hasen j
2010-05-08 11:36                                                                       ` Robert Buck
2010-05-08  3:34                                                                 ` Avery Pennarun
2010-05-08  0:31                                                     ` Avery Pennarun
2010-05-07 22:19                                               ` Avery Pennarun
2010-05-08 20:49                                               ` Dmitry Potapov
2010-05-08 21:54                                                 ` Linus Torvalds
2010-05-08 23:42                                                   ` Dmitry Potapov
2010-05-09  7:49                                                     ` Eyvind Bernhardsen
2010-05-09 10:35                                                       ` Robert Buck
2010-05-07 20:58                                 ` Avery Pennarun
2010-05-07 19:23                       ` Eyvind Bernhardsen
2010-05-07 19:31                     ` Nicolas Pitre
2010-05-07 19:36                       ` Avery Pennarun
2010-05-07 20:29                         ` Nicolas Pitre
2010-05-07 21:00                           ` Avery Pennarun
2010-05-07 21:12                             ` Nicolas Pitre
2010-05-07 21:26                               ` Avery Pennarun
2010-05-07 22:09                                 ` A Large Angry SCM
2010-05-07 22:10                                   ` Avery Pennarun
2010-05-07 19:40                       ` Linus Torvalds
2010-05-07 20:32                         ` Nicolas Pitre
2010-05-07 19:06                   ` Junio C Hamano
2010-05-07 19:25                   ` Eyvind Bernhardsen
2010-05-07 19:41                 ` Finn Arne Gangstad
2010-05-07 20:06                   ` Avery Pennarun
2010-05-07 20:11                 ` Eyvind Bernhardsen
2010-05-07  7:15 ` What should be the CRLF policy when win + Linux? Gelonida
