* Is the "text" attribute meant *only* to specify end-of-line normalization behavior, or does it have broader implications?
From: Chris Harris @ 2012-03-30  2:19 UTC
  To: git

The gitattributes documentation (e.g. from "git help attributes")
gives the impression that the "text" attribute's sole effect is to
control whether or not end-of-line normalization takes place. I wanted
to check whether I'm indeed supposed to take such a narrow
interpretation as a git user, or whether setting "text" or "-text"
might have a broader meaning (or whether it will have a broader
meaning in the future).

In the rest of this message, let me outline a case of current interest
where the distinction seems to matter:

I'm starting a new repository for a Windows-only project where I don't
think I want git to do any end-of-line normalization on my text files.
(I'm totally happy to have CRLFs both in the repo and in all the
working copies.) Unless you think that end-of-line normalization is
always vital, let's try to presume I've made the right choice about
this.

Now as far as I can tell from the gitattributes documentation, one
perfectly legitimate way to accomplish this (and to override any
core.autocrlf settings on other teammates' machines in the process) is
to add a .gitattributes file to my repo containing the single line

    * -text

If all "-text" does is disable end-of-line normalization, then this
setting should be no big deal. For example, it shouldn't
fundamentally alter how any of the source control operations (diffs,
merges, etc.) work on my text files, or generate more merge conflicts
than otherwise. (Recall that I'm starting a fresh repo, so I don't
have to worry that the repo might already have some normalized
linefeeds.)

But I'm not completely sure if it's reasonable to expect there to be
no side-effects. I haven't yet discovered any side-effects from git
itself, but I have at least discovered one in the Git Extensions
project, which takes a broader interpretation of "-text". Some of you
might not care directly about Git Extensions, but perhaps you can
still help me figure out whether I'm making a misguided use of
"-text", or whether this is perhaps an area where Git Extensions is
doing the wrong thing:

* Here's the normal behavior (without "* -text"): Git Extensions has a
widget that lets you explore a given commit. In that widget, if you
click on the name of a text file, then the contents of that file show
up in an adjacent pane. In contrast, if you click on the name of a
binary file, then the adjacent pane simply says, e.g., "Binary file:
foo.jpg".
* Here's the behavior with "* -text" in my .gitattributes: Now, no
matter what file I click on in the above widget, the pane says, e.g.,
"Binary file: foo.txt". I can no longer see the contents of any text
files, which is annoying.
* I peeked at the Git Extensions code, and it is making the following
inference: The path to foo.txt is tagged as "-text", therefore the
file is "binary", therefore it doesn't make sense to display its
contents to the user, and therefore I'll just display "Binary file:
foo.txt" instead. But this is an unfortunate consequence. When I
originally set "-text" I didn't mean to convey "this is a binary
file", merely "don't mess with newlines."


* Re: Is the "text" attribute meant *only* to specify end-of-line normalization behavior, or does it have broader implications?
From: Johannes Sixt @ 2012-03-30  6:42 UTC
  To: Chris Harris; +Cc: git

Am 3/30/2012 4:19, schrieb Chris Harris:
> I'm starting a new repository for a Windows-only project where I don't
> think I want git to do any end-of-line normalization on my text files.
> (I'm totally happy to have CRLFs both in the repo and in all the
> working copies.)

The question is rather: Are you happy if someone commits a file that does
*not* have CRLF, but only LF?

Because if you don't care, you are better off setting no attributes and no
core.autocrlf and no core.eol at all. Git will then take the file
unmodified. If someone's editor changes the eol style of a file, it will
be noticed because the diff will show that the entire file has changed.
Your teammates had better have enough discipline not to ignore such a
hint that something's gone awry, of course.
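
A whole-file diff of that kind usually means the line endings flipped.
As a minimal sketch of how one might check that directly, the Python
helper below classifies the line endings in a file's raw bytes. The name
eol_style is hypothetical and not part of git, and lone CR characters
are deliberately ignored.

```python
def eol_style(data: bytes) -> str:
    """Classify line endings in raw file content.

    Returns "crlf", "lf", "mixed", or "none" (no line breaks at all).
    Hypothetical helper for illustration; not part of git.
    """
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get the bare LFs.
    bare_lf = data.count(b"\n") - crlf
    if crlf and bare_lf:
        return "mixed"
    if crlf:
        return "crlf"
    if bare_lf:
        return "lf"
    return "none"
```

Comparing the result for the old and new versions of a file would show
whether an editor silently rewrote the eol style.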

> Unless you think that end-of-line normalization is
> always vital, let's try to presume I've made the right choice about
> this.

It's your code, you are to judge what is best for you. IOW, I don't think
that eol normalization is "always vital", and you are right. :-)

(I didn't answer the question in the subject of your message, and I can't;
I don't use the text attribute nor eol normalization, even though I work
on Windows quite a lot.)

-- Hannes


* Re: Is the "text" attribute meant *only* to specify end-of-line normalization behavior, or does it have broader implications?
From: Jeff King @ 2012-03-30  7:25 UTC
  To: Johannes Sixt; +Cc: Chris Harris, git

On Fri, Mar 30, 2012 at 08:42:04AM +0200, Johannes Sixt wrote:

> Am 3/30/2012 4:19, schrieb Chris Harris:
> > I'm starting a new repository for a Windows-only project where I don't
> > think I want git to do any end-of-line normalization on my text files.
> > (I'm totally happy to have CRLFs both in the repo and in all the
> > working copies.)
> 
> The question is rather: Are you happy if someone commits a file that does
> *not* have CRLF, but only LF?
> 
> Because if you don't care, you are better off setting no attributes and no
> core.autocrlf and no core.eol at all. Git will then take the file
> unmodified. If someone's editor changes the eol style of a file, it will
> be noticed because the diff will show that the entire file has changed.
> Your teammates had better have enough discipline not to ignore such a
> hint that something's gone awry, of course.

I think it may be slightly more complex than that. He may be OK with
"git does nothing" and assuming everybody's editor does the sane thing.
But he may _not_ be OK with a stray core.autocrlf setting in a project
member's git config normalizing all line endings whenever they touch a
file. Setting "-text" prevents the latter.
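
To make that interaction concrete, here is a simplified Python model of
the rules as the gitattributes documentation describes them. It is a
sketch under assumptions, not git's actual code: the function name and
the returned strings are invented for illustration, the attribute states
use the vocabulary printed by "git check-attr", and the "eol" and legacy
"crlf" attributes and core.eol are ignored.

```python
def normalization_mode(text_attr: str, autocrlf: str = "false") -> str:
    """Simplified model of when end-of-line normalization applies to a path.

    text_attr: "set", "unset", "auto", or "unspecified"
    autocrlf:  the core.autocrlf value: "true", "input", or "false"
    Hypothetical helper; the return strings are illustrative labels.
    """
    if text_attr == "unset":
        # "-text": never convert, whatever core.autocrlf says.
        return "never"
    if text_attr == "set":
        return "always"
    if text_attr == "auto":
        # Only content that git itself detects as text is converted.
        return "content-detected"
    if text_attr == "unspecified":
        # With no attribute, core.autocrlf decides.
        return "content-detected" if autocrlf in ("true", "input") else "never"
    raise ValueError(f"unknown text attribute state: {text_attr!r}")
```

In this model, a path with "-text" comes out as "never" even when a
teammate has core.autocrlf=true, which is the protection at issue here.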

> (I didn't answer the question in the subject of your message, and I can't;
> I don't use the text attribute nor eol normalization, even though I work
> on Windows quite a lot.)

I don't use them either.

However, I find the behavior of "Git Extensions" to be questionable. I
can see the rationale for thinking that "-text" means more than just
handling line-endings, but I think "-diff" is probably a better choice
for seeing if something is binary (or even checking the "binary" macro).
Those are what git uses itself.

Perhaps it was a mistake to call it "text", as it invites this sort of
confusion.

-Peff

PS I think one could potentially work around the whole issue by setting
   "-crlf", which git treats equivalently to "-text" these days (and
   hopefully isn't also checked by Git Extensions).


* Re: Is the "text" attribute meant *only* to specify end-of-line normalization behavior, or does it have broader implications?
From: Chris Harris @ 2012-03-30 17:49 UTC
  To: Jeff King; +Cc: Johannes Sixt, git

On Fri, Mar 30, 2012 at 12:25 AM, Jeff King <peff@peff.net> wrote:
> On Fri, Mar 30, 2012 at 08:42:04AM +0200, Johannes Sixt wrote:
>
>> Am 3/30/2012 4:19, schrieb Chris Harris:
>> > I'm starting a new repository for a Windows-only project where I don't
>> > think I want git to do any end-of-line normalization on my text files.
>> > (I'm totally happy to have CRLFs both in the repo and in all the
>> > working copies.)
>>
>> The question is rather: Are you happy if someone commits a file that does
>> *not* have CRLF, but only LF?
>>
>> Because if you don't care, you are better off setting no attributes and no
>> core.autocrlf and no core.eol at all. Git will then take the file
>> unmodified. If someone's editor changes the eol style of a file, it will
>> be noticed because the diff will show that the entire file has changed.
>> Your teammates had better have enough discipline not to ignore such a
>> hint that something's gone awry, of course.
>
> I think it may be slightly more complex than that. He may be OK with
> "git does nothing" and assuming everybody's editor does the sane thing.
> But he may _not_ be OK with a stray core.autocrlf setting in a project
> member's git config normalizing all line endings whenever they touch a
> file. Setting "-text" prevents the latter.

Yes, avoiding stray core.autocrlf settings was indeed one of my main
motivations. Johannes is right that ideally a teammate is going to
notice if all of a sudden a whole file has changed. But I also
believe that, if there's an easy way to prevent people from
accidentally doing the wrong thing when they're tired/hurried/whatever
and it has no bad side-effects, why not enable it?

The problem is slightly intensified for users of msysgit on Windows;
the msysgit installer guides one toward picking core.autocrlf=true as
your system default. Making sure that every last teammate disables
autocrlf seems potentially error-prone.

> However, I find the behavior of "Git Extensions" to be questionable. I
> can see the rationale for thinking that "-text" means more than just
> handling line-endings, but I think "-diff" is probably a better choice
> for seeing if something is binary (or even checking the "binary" macro).
> Those are what git uses itself.
>
> Perhaps it was a mistake to call it "text", as it invites this sort of
> confusion.

Ok, thanks. That's helpful.

A related point of confusion: I've noticed that, if you start with a
question along the lines of "how can I explicitly tell git that a file
is binary", then the web currently gives a slightly confusing array of
answers. For example:
* The Pro Git Book (http://progit.org/book/ch7-2.html) tells you to
use either "binary" or "-crlf -diff"
* http://www.bluishcoder.co.nz/2007/09/git-binary-files-and-cherry-picking.html
tells you to use "-crlf -diff -merge"
* http://www.dont-panic.cc/capi/2009/02/16/how-to-force-git-to-consider-a-file-as-binary/
tells you to use "-crlf"
* "man gitattributes" has helpful info, but it's scattered across
different sections. In the section "Marking files as binary", it says
"The simplest way to mark a file as binary is to unset the diff
attribute in the .gitattributes file". (Note: This implies that there
are other ways you might also want to consider.) Under "Performing a
three-way merge" you also learn that "-merge" is "suitable for binary
files that do not have a well-defined merge semantics". You learn
about the "binary" attribute only under the section "Defining Macro
Attributes", which says that it means "-text -diff", but not in what
cases you might want to use it. The section describing "text"/"-text"
does not contain the word "binary" at all, so you have to infer
whether it's a helpful setting for binary files.

It makes me wonder if the documentation could be improved a little on
this count, though I don't yet feel solid enough in my understanding
to propose a particular patch.

> PS I think one could potentially work around the whole issue by setting
>   "-crlf", which git treats equivalently to "-text" these days (and
>   hopefully isn't also checked by Git Extensions).

Yes, that sounds like a plausible way to go.


* Re: Is the "text" attribute meant *only* to specify end-of-line normalization behavior, or does it have broader implications?
From: Junio C Hamano @ 2012-03-30 18:22 UTC
  To: Chris Harris; +Cc: Jeff King, Johannes Sixt, git

Chris Harris <ryguasu@gmail.com> writes:

> A related point of confusion: I've noticed that, if you start with a
> question along the lines of "how can I explicitly tell git that a file
> is binary", then the web currently gives a slightly confusing array of
> answers. For example:
>...
> It makes me wonder if the documentation could be improved a little on
> this count, though I don't yet feel solid enough in my understanding
> to propose a particular patch.

Reading just your analysis of the gitattributes documentation, it is
clear that text/-text does not have much to do with binary-ness, and that
the documentation also gives the authoritative advice: use "binary".

We do not have direct control over third-party sites that give incorrect
information, so people need to bug them as they find mistakes.  I think
ProGit actively accepts patches; I do not know about others.

Thanks.


* Re: Is the "text" attribute meant *only* to specify end-of-line normalization behavior, or does it have broader implications?
From: Jeff King @ 2012-03-30 21:30 UTC
  To: Chris Harris; +Cc: Johannes Sixt, git

On Fri, Mar 30, 2012 at 10:49:42AM -0700, Chris Harris wrote:

> > However, I find the behavior of "Git Extensions" to be questionable. I
> > can see the rationale for thinking that "-text" means more than just
> > handling line-endings, but I think "-diff" is probably a better choice
> > for seeing if something is binary (or even checking the "binary" macro).
> > Those are what git uses itself.
> [...]
> A related point of confusion: I've noticed that, if you start with a
> question along the lines of "how can I explicitly tell git that a file
> is binary", then the web currently gives a slightly confusing array of
> answers. For example:
> * The Pro Git Book (http://progit.org/book/ch7-2.html) tells you to
> use either "binary" or "-crlf -diff"

I think setting "binary" is the most sane thing. Ultimately, I think
what it comes down to is this: git provides a bunch of per-operation
attributes for treating a file as binary for a particular operation. It
also provides a "binary" macro to conveniently cover all of the
operations.

Git Extensions cares about binary-ness for a _new_ operation, which is
showing the file at all (that is what I got from your original email, at
least; I have never used Git Extensions myself). The equivalent in git
would be perhaps for "git show HEAD:file" to either print a text file,
or to say "This is a binary file". But since git itself does not care
about binary-ness for that operation (we just always show the file), we
have not defined an operation-specific attribute.

So what is something like Git Extensions to do? It can introduce a new
attribute, but of course nobody is likely to be using it. It can depend
on "binary", except that some people will manually spell out "-crlf
-diff" instead of saying "binary". Or it can piggy-back on "-text" or
"-diff", which can be subtly wrong in cases where the file is not
binary, but you want to disable those operations (i.e., your case).

Of those, just checking "binary" seems like the least wrong thing to me.
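
A tool that wanted to follow that approach could ask git for the
attributes and look only at "binary". The Python sketch below assumes
the tool has already captured the output of something like
"git check-attr binary text -- <paths>" as a string. The function names
are hypothetical, and SAMPLE is written by hand to match the
"path: attribute: info" format that check-attr prints.

```python
def parse_check_attr(output: str) -> dict:
    """Parse `git check-attr` lines of the form 'path: attribute: info'.

    Returns {path: {attribute: info}}. Hypothetical helper, not a git API.
    """
    attrs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split from the right so a path containing ": " stays intact.
        path, name, info = line.rsplit(": ", 2)
        attrs.setdefault(path, {})[name] = info
    return attrs


def marked_binary(path_attrs: dict) -> bool:
    """Treat a path as binary only if the "binary" macro is set on it."""
    return path_attrs.get("binary") == "set"


# Hand-written sample: foo.txt only has "-text"; foo.jpg has "binary".
SAMPLE = (
    "foo.txt: binary: unspecified\n"
    "foo.txt: text: unset\n"
    "foo.jpg: binary: set\n"
    "foo.jpg: text: unset\n"
)
```

With this check, a path that is merely "-text" is still shown as text.
The cost, as noted above, is that paths where someone spelled out
"-crlf -diff" by hand are not recognized as binary.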

> * "man gitattributes" has helpful info, but it's scattered across
> different sections. In the section "Marking files as binary", it says
> "The simplest way to mark a file as binary is to unset the diff
> attribute in the .gitattributes file".

Note that "Marking files as binary" is actually a subsection in
"Generating diff text". We could probably do a better job of mentioning
the "binary" macro there, though.

> (Note: This implies that there are other ways you might also want to
> consider.)

Yes. You can also use a custom diff driver (e.g., "diff=foo"), and then
tell the diff driver that the file should be considered binary (by
setting diff.foo.binary in your config).

> Under "Performing a three-way merge" you also learn that "-merge" is
> "suitable for binary files that do not have a well-defined merge
> semantics".

Arguably the "binary" macro should imply "-merge". And as with -diff,
the documentation should probably reference the section on the "binary"
macro.

> You learn about the "binary" attribute only under the section
> "Defining Macro Attributes", which says that it means "-text -diff",
> but not in what cases you might want to use it. The section describing
> "text"/"-text" does not contain the word "binary" at all, so you have
> to infer whether it's a helpful setting for binary files.

I think it is the case that binary files should imply "-text", but
"-text" does not necessarily imply binary files. But like the other
spots, it should probably say "hey, if you are dealing with a binary
file, you might want to just set the binary attribute macro".

-Peff
