All of lore.kernel.org
 help / color / mirror / Atom feed
* git-cvsserver and binary files
@ 2007-02-22 15:04 Andy Parkins
  2007-02-22 16:06 ` [PATCH] Added "-kb" to all the entries lines sent to the client Andy Parkins
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-02-22 15:04 UTC (permalink / raw)
  To: git

Hello,

A colleague of mine is using TortoiseCVS to access git-cvserver on my system 
to do development on Windows with a repository stored on my machine.

It's all gone fairly well, nice and easy to use.  However, we've found that 
images stored in the repository are being corrupted on checkout to the 
Windows machine.  Even though they are binary files, they're having the line 
endings rewritten (in a really strange way as well).

I don't think git itself is at fault; the source files are all retaining their 
original unix line endings, and the images are uncorrupted when I check them 
out.  I've done a bit of research though and found that CVS marks binary 
files in the repository, not in the client, so I assume that git-cvsserver is 
at fault.  It's telling the client side that the image files are text.

To be as close to git as possible, I reckon that git-cvsserver should be 
telling the remote that /every/ file is binary and leave the line endings 
alone (although that would perhaps annoy some windows users).  I'd much 
prefer it if what was checked out was what was checked in.  I'm at a bit of a 
loss as to how to do it though.  I don't know enough about the CVS protocol 
to know which part of git-cvsserver is telling the client the file type.

Should I even try to make all binary checkouts via CVS or should I try and use 
the auocrlf work that's going on now?




Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] Added "-kb" to all the entries lines sent to the client
  2007-02-22 15:04 git-cvsserver and binary files Andy Parkins
@ 2007-02-22 16:06 ` Andy Parkins
  2007-02-22 20:37   ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-02-22 16:06 UTC (permalink / raw)
  To: git

git doesn't distinguish between binary and text files - so force the
client to do the same by sending everything as a binary.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
DON'T APPLY TO REPOSITORY

Turns out the CVS protocol isn't as hard as I thought.

http://soc.if.usp.br/doc/cvs/html-cvsclient/cvsclient_5.html#SEC6

Was all I needed.  I've changed every entries line to send "-kb" as one of
options.  I believe this will make all files into binaries as far as CVS
clients are concerned.

I am certain this is too heavy handed for most users.  I submit this patch
only to help other poor souls who might have the same problem in the
future.  (Hello poor soul).

Perhaps when the whole .gitattributes system has settled down that could be
used to conditionally set -kb

 git-cvsserver.perl |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index f6ddf34..e8d74ae 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -374,7 +374,7 @@ sub req_add
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        print "/$filepart/0///\n";
+        print "/$filepart/0//-kb/\n";
 
         $addcount++;
     }
@@ -455,7 +455,7 @@ sub req_remove
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        print "/$filepart/-1.$wrev///\n";
+        print "/$filepart/-1.$wrev//-kb/\n";
 
         $rmcount++;
     }
@@ -726,7 +726,7 @@ sub req_co
        print $state->{CVSROOT} . "/$module/" . ( defined ( $git->{dir} ) and 
$git->{dir} ne "./" ? $git->{dir} . "/" : "" ) . "$git->{name}\n";
 
         # this is an "entries" line
-        print "/$git->{name}/1.$git->{revision}///\n";
+        print "/$git->{name}/1.$git->{revision}//-kb/\n";
         # permissions
         print "u=$git->{mode},g=$git->{mode},o=$git->{mode}\n";
 
@@ -917,8 +917,8 @@ sub req_update
 		print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
 		# this is an "entries" line
-		$log->debug("/$filepart/1.$meta->{revision}///");
-		print "/$filepart/1.$meta->{revision}///\n";
+		$log->debug("/$filepart/1.$meta->{revision}//-kb/");
+		print "/$filepart/1.$meta->{revision}//-kb/\n";
 
 		# permissions
 		$log->debug("SEND : u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}");
@@ -961,8 +961,8 @@ sub req_update
                     print "Update-existing $dirpart\n";
                     
$log->debug($state->{CVSROOT} . "/$state->{module}/$filename");
                     print 
$state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    $log->debug("/$filepart/1.$meta->{revision}///");
-                    print "/$filepart/1.$meta->{revision}///\n";
+                    $log->debug("/$filepart/1.$meta->{revision}//-kb/");
+                    print "/$filepart/1.$meta->{revision}//-kb/\n";
                 }
             }
             elsif ( $return == 1 )
@@ -975,7 +975,7 @@ sub req_update
                 {
                     print "Update-existing $dirpart\n";
                     print 
$state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    print "/$filepart/1.$meta->{revision}/+//\n";
+                    print "/$filepart/1.$meta->{revision}/+/-kb/\n";
                 }
             }
             else
@@ -1207,7 +1207,7 @@ sub req_ci
         } else {
             print "Checked-in $dirpart\n";
             print "$filename\n";
-            print "/$filepart/1.$meta->{revision}///\n";
+            print "/$filepart/1.$meta->{revision}//-kb/\n";
         }
     }
 
-- 
1.5.0.1.51.g5a369

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] Added "-kb" to all the entries lines sent to the client
  2007-02-22 16:06 ` [PATCH] Added "-kb" to all the entries lines sent to the client Andy Parkins
@ 2007-02-22 20:37   ` Junio C Hamano
  2007-02-27 13:45     ` [PATCH] cvsserver: Make always-binary mode a config file option Andy Parkins
  2007-02-27 13:46     ` Andy Parkins
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2007-02-22 20:37 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> Perhaps when the whole .gitattributes system has settled down that could be
> used to conditionally set -kb

Indeed.  And this patch is a great help by identifying where to
insert necessary '-kb'.

If the patch had a stub implementation to determine if -kb is
needed (and the stub always says "yes" for your purpose, or "no"
for backward compatibility for non-text files), it could be
applied to the mainline for wider testing.

sub path_is_binary {
        my ($filename) = @_;

	# If you do not mind LF lineendings and care more about
        # binary files, return true from here; otherwise you may
        # want to change this to return false.
        # We'll ask the much talked about .gitattributes
        # mechanism when it's ready.
        return 1;
}

sub req_add {
        ...
        $kb = path_is_binary($filename) ? "-kb" : "";
        print "Checked-in $dirpart\n";
        print "$filename\n";
        print "/$filepart/0//$kb/\n";

        $addcount++;
}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-22 20:37   ` Junio C Hamano
@ 2007-02-27 13:45     ` Andy Parkins
  2007-02-28 11:36       ` Martin Langhoff
  2007-02-27 13:46     ` Andy Parkins
  1 sibling, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-02-27 13:45 UTC (permalink / raw)
  To: git

The config option gitcvs.allbinary may be set to force all entries to
get the -kb flag.

In the future the gitattributes system will probably be a more
appropriate way of doing this, but that will easily slot in as the
entries lines sent to the CVS client now have their kopts set via the
function kopts_from_path().

In the interim it might be better to not just have a all-or-nothing
approach, but rather detect based on file extension (or file contents?).
That would slot in easily here as well.  However, I personally prefer
everything to be binary-safe, so I just switch the switch.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
How about this instead?

I've added one new function - kopts_from_path() - which is passed the
path of the file having an "Entries" line generated.  It can do what it
likes to convert that into the options component.  I've just used a
config option to decide whether everything is binary or nothing is binary.

Default is to be the same as previous behaviour - you have to specifically
ask for the binary flag.

 git-cvsserver.perl |   47 ++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index 56f21b6..15b9443 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -374,7 +374,8 @@ sub req_add
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        print "/$filepart/0//-kb/\n";
+        my $kopts = kopts_from_path($filepart);
+        print "/$filepart/0//$kopts/\n";
 
         $addcount++;
     }
@@ -455,7 +456,8 @@ sub req_remove
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        print "/$filepart/-1.$wrev//-kb/\n";
+        my $kopts = kopts_from_path($filepart);
+        print "/$filepart/-1.$wrev//$kopts/\n";
 
         $rmcount++;
     }
@@ -726,7 +728,8 @@ sub req_co
        print $state->{CVSROOT} . "/$module/" . ( defined ( $git->{dir} ) and 
$git->{dir} ne "./" ? $git->{dir} . "/" : "" ) . "$git->{name}\n";
 
         # this is an "entries" line
-        print "/$git->{name}/1.$git->{revision}//-kb/\n";
+        my $kopts = kopts_from_path($git->{name});
+        print "/$git->{name}/1.$git->{revision}//$kopts/\n";
         # permissions
         print "u=$git->{mode},g=$git->{mode},o=$git->{mode}\n";
 
@@ -917,8 +920,9 @@ sub req_update
 		print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
 		# this is an "entries" line
-		$log->debug("/$filepart/1.$meta->{revision}//-kb/");
-		print "/$filepart/1.$meta->{revision}//-kb/\n";
+		my $kopts = kopts_from_path($filepart);
+		$log->debug("/$filepart/1.$meta->{revision}//$kopts/");
+		print "/$filepart/1.$meta->{revision}//$kopts/\n";
 
 		# permissions
 		$log->debug("SEND : u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}");
@@ -961,8 +965,9 @@ sub req_update
                     print "Update-existing $dirpart\n";
                     
$log->debug($state->{CVSROOT} . "/$state->{module}/$filename");
                     print 
$state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    $log->debug("/$filepart/1.$meta->{revision}//-kb/");
-                    print "/$filepart/1.$meta->{revision}//-kb/\n";
+                    my $kopts = kopts_from_path($filepart);
+                    $log->debug("/$filepart/1.$meta->{revision}//$kopts/");
+                    print "/$filepart/1.$meta->{revision}//$kopts/\n";
                 }
             }
             elsif ( $return == 1 )
@@ -975,7 +980,8 @@ sub req_update
                 {
                     print "Update-existing $dirpart\n";
                     print 
$state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    print "/$filepart/1.$meta->{revision}/+/-kb/\n";
+                    my $kopts = kopts_from_path($filepart);
+                    print "/$filepart/1.$meta->{revision}/+/$kopts/\n";
                 }
             }
             else
@@ -1206,7 +1212,8 @@ sub req_ci
         } else {
             print "Checked-in $dirpart\n";
             print "$filename\n";
-            print "/$filepart/1.$meta->{revision}//-kb/\n";
+            my $kopts = kopts_from_path($filepart);
+            print "/$filepart/1.$meta->{revision}//$kopts/\n";
         }
     }
 
@@ -1887,6 +1894,28 @@ sub filecleanup
     return $filename;
 }
 
+# Given a path, this function returns a string containing the kopts
+# that should go into that path's Entries line.  For example, a binary
+# file should get -kb.
+sub kopts_from_path
+{
+	my ($path) = @_;
+
+	# Once it exists, the git attributes system should be used to look up
+	# what attributes apply to this path.
+
+	# Until then, take the setting from the config file
+    unless ( defined ( $cfg->{gitcvs}{allbinary} ) and $cfg->{gitcvs}
{allbinary} =~ /^\s*(1|true|yes)\s*$/i )
+    {
+		# Return "" to give no special treatment to any path
+		return "";
+    } else {
+		# Alternatively, to have all files treated as if they are binary (which
+		# is more like git itself), always return the "-kb" option
+		return "-kb";
+    }
+}
+
 package GITCVS::log;
 
 ####
-- 
1.5.0.2.778.gdcb06

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-22 20:37   ` Junio C Hamano
  2007-02-27 13:45     ` [PATCH] cvsserver: Make always-binary mode a config file option Andy Parkins
@ 2007-02-27 13:46     ` Andy Parkins
  2007-02-27 23:56       ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-02-27 13:46 UTC (permalink / raw)
  To: git

The config option gitcvs.allbinary may be set to force all entries to
get the -kb flag.

In the future the gitattributes system will probably be a more
appropriate way of doing this, but that will easily slot in as the
entries lines sent to the CVS client now have their kopts set via the
function kopts_from_path().

In the interim it might be better to not just have a all-or-nothing
approach, but rather detect based on file extension (or file contents?).
That would slot in easily here as well.  However, I personally prefer
everything to be binary-safe, so I just switch the switch.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
---
This time with word wrap turned off.  Sorry Junio.


 git-cvsserver.perl |   47 ++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index 56f21b6..15b9443 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -374,7 +374,8 @@ sub req_add
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        print "/$filepart/0//-kb/\n";
+        my $kopts = kopts_from_path($filepart);
+        print "/$filepart/0//$kopts/\n";
 
         $addcount++;
     }
@@ -455,7 +456,8 @@ sub req_remove
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        print "/$filepart/-1.$wrev//-kb/\n";
+        my $kopts = kopts_from_path($filepart);
+        print "/$filepart/-1.$wrev//$kopts/\n";
 
         $rmcount++;
     }
@@ -726,7 +728,8 @@ sub req_co
        print $state->{CVSROOT} . "/$module/" . ( defined ( $git->{dir} ) and $git->{dir} ne "./" ? $git->{dir} . "/" : "" ) . "$git->{name}\n";
 
         # this is an "entries" line
-        print "/$git->{name}/1.$git->{revision}//-kb/\n";
+        my $kopts = kopts_from_path($git->{name});
+        print "/$git->{name}/1.$git->{revision}//$kopts/\n";
         # permissions
         print "u=$git->{mode},g=$git->{mode},o=$git->{mode}\n";
 
@@ -917,8 +920,9 @@ sub req_update
 		print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
 		# this is an "entries" line
-		$log->debug("/$filepart/1.$meta->{revision}//-kb/");
-		print "/$filepart/1.$meta->{revision}//-kb/\n";
+		my $kopts = kopts_from_path($filepart);
+		$log->debug("/$filepart/1.$meta->{revision}//$kopts/");
+		print "/$filepart/1.$meta->{revision}//$kopts/\n";
 
 		# permissions
 		$log->debug("SEND : u=$meta->{mode},g=$meta->{mode},o=$meta->{mode}");
@@ -961,8 +965,9 @@ sub req_update
                     print "Update-existing $dirpart\n";
                     $log->debug($state->{CVSROOT} . "/$state->{module}/$filename");
                     print $state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    $log->debug("/$filepart/1.$meta->{revision}//-kb/");
-                    print "/$filepart/1.$meta->{revision}//-kb/\n";
+                    my $kopts = kopts_from_path($filepart);
+                    $log->debug("/$filepart/1.$meta->{revision}//$kopts/");
+                    print "/$filepart/1.$meta->{revision}//$kopts/\n";
                 }
             }
             elsif ( $return == 1 )
@@ -975,7 +980,8 @@ sub req_update
                 {
                     print "Update-existing $dirpart\n";
                     print $state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    print "/$filepart/1.$meta->{revision}/+/-kb/\n";
+                    my $kopts = kopts_from_path($filepart);
+                    print "/$filepart/1.$meta->{revision}/+/$kopts/\n";
                 }
             }
             else
@@ -1206,7 +1212,8 @@ sub req_ci
         } else {
             print "Checked-in $dirpart\n";
             print "$filename\n";
-            print "/$filepart/1.$meta->{revision}//-kb/\n";
+            my $kopts = kopts_from_path($filepart);
+            print "/$filepart/1.$meta->{revision}//$kopts/\n";
         }
     }
 
@@ -1887,6 +1894,28 @@ sub filecleanup
     return $filename;
 }
 
+# Given a path, this function returns a string containing the kopts
+# that should go into that path's Entries line.  For example, a binary
+# file should get -kb.
+sub kopts_from_path
+{
+	my ($path) = @_;
+
+	# Once it exists, the git attributes system should be used to look up
+	# what attributes apply to this path.
+
+	# Until then, take the setting from the config file
+    unless ( defined ( $cfg->{gitcvs}{allbinary} ) and $cfg->{gitcvs}{allbinary} =~ /^\s*(1|true|yes)\s*$/i )
+    {
+		# Return "" to give no special treatment to any path
+		return "";
+    } else {
+		# Alternatively, to have all files treated as if they are binary (which
+		# is more like git itself), always return the "-kb" option
+		return "-kb";
+    }
+}
+
 package GITCVS::log;
 
 ####
-- 
1.5.0.2.778.gdcb06

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-27 13:46     ` Andy Parkins
@ 2007-02-27 23:56       ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2007-02-27 23:56 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

Andy Parkins <andyparkins@gmail.com> writes:

> +    unless ( defined ( $cfg->{gitcvs}{allbinary} ) and $cfg->{gitcvs}{allbinary} =~ /^\s*(1|true|yes)\s*$/i )
> +    {
> +		# Return "" to give no special treatment to any path
> +		return "";
> +    } else {
> +		# Alternatively, to have all files treated as if they are binary (which
> +		# is more like git itself), always return the "-kb" option
> +		return "-kb";
> +    }
> +}

This is not your fault as you copied existing code to check
boolean, but I am unhappy every time I see "git-config -l"
forces such an eyesore on us X-<.

Anyway, will apply.  Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-27 13:45     ` [PATCH] cvsserver: Make always-binary mode a config file option Andy Parkins
@ 2007-02-28 11:36       ` Martin Langhoff
  2007-02-28 13:01         ` Andy Parkins
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Langhoff @ 2007-02-28 11:36 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

On 2/28/07, Andy Parkins <andyparkins@gmail.com> wrote:
> The config option gitcvs.allbinary may be set to force all entries to
> get the -kb flag.

Hi Andy,

I am a bit confused here... why would we want to do this? We don't
want to support keyword expansion, and that happens at the server end
anyway. CVS clients are pretty dumb and will just write out the file
we give them. When we are going to diff, they send the file to the
server, and the server runs the diff. Etc. No smarts at the client end
that care about -k options.

Or are you trying to control newline mangling on Windows/Mac clients?
I'm not even sure where that mangling happens. If that's the case, a
repo-wide toggle is useless, you really want the per-file-type rules
you mention.

dazed and confused...



martin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-28 11:36       ` Martin Langhoff
@ 2007-02-28 13:01         ` Andy Parkins
  2007-02-28 23:40           ` Martin Langhoff
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-02-28 13:01 UTC (permalink / raw)
  To: git; +Cc: Martin Langhoff

On Wednesday 2007 February 28 11:36, Martin Langhoff wrote:

> I am a bit confused here... why would we want to do this? We don't
> want to support keyword expansion, and that happens at the server end

Well, at the moment you don't, but that might happen in the future with 
certain settings in .gitattributes.  It might be nice to have a switch to 
disable that available.  However, that's not why I added this.

> anyway. CVS clients are pretty dumb and will just write out the file
> we give them. When we are going to diff, they send the file to the
> server, and the server runs the diff. Etc. No smarts at the client end
> that care about -k options.

I think your supposition that no client cares about the -k options is wrong. 
CVS clients don't just write the file untouched.  CVS stores all text files 
with UNIX line endings, the client then uses the text/binary flag to decide 
if line-ending conversion should happen.  Checking out to unix would of 
course not require any conversion so the file would be untouched.  Not so for 
Windows clients.  That was what prompted me to make this patch.  I keep some 
binary images in the git repository, when they were checked out with 
TortoiseCVS they were having all the 0x0a (LF) bytes changed to CR LF, and 
thus mangling the images.  

> Or are you trying to control newline mangling on Windows/Mac clients?
> I'm not even sure where that mangling happens. If that's the case, a

It seems to be happening in the client.  In fact it's a given that it happens 
in the client as only the client knows what the local line ending rules are.

> repo-wide toggle is useless, you really want the per-file-type rules
> you mention.

"Useless" is a strong word; especially as I'm finding it very useful :-). 

Yes, a per-file option would be better - but git has no support for that at 
present, so I need a work around.  Most windows editors will cope fine with 
UNIX line endings, so not converting them doesn't cause a problem.  On the 
other hand, mangling every binary file /does/ matter.  So, for me, the better 
default is to not mangle anything (i.e. tell CVS that everything is binary 
and should be preserved as is).


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-28 13:01         ` Andy Parkins
@ 2007-02-28 23:40           ` Martin Langhoff
  2007-03-01  8:40             ` Andy Parkins
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Langhoff @ 2007-02-28 23:40 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

On 3/1/07, Andy Parkins <andyparkins@gmail.com> wrote:
> > Or are you trying to control newline mangling on Windows/Mac clients?
> > I'm not even sure where that mangling happens. If that's the case, a
>
> It seems to be happening in the client.  In fact it's a given that it happens
> in the client as only the client knows what the local line ending rules are.

Ok - that's an interesting bit of info. I wasn't sure.

> > repo-wide toggle is useless, you really want the per-file-type rules
> > you mention.
>
> "Useless" is a strong word; especially as I'm finding it very useful :-).

I guess what I mean is that the common case on windows is going to be
for users who want their binary files un-corrupted, and their text
files newline-converted.

In fact, I think we should have it set to default to binary -- which
does the best job of preserving data. And override that based on file
extension (configurable) or  check the "head" of the file cheaply for
binary-ness before we update the file on the client side.

I agree with the idea of doing something smart with -kb flags, but
this is not a good move. For the common case among Windows TortoiseCVS
users, the switch proposed introduces the ability to switch between
one broken mode to another broken mode.

(And in terms of temporary workarounds, TortoiseCVS has a switch in
itself to disable newline conversion.)

cheers,


martin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-02-28 23:40           ` Martin Langhoff
@ 2007-03-01  8:40             ` Andy Parkins
  2007-03-01  9:13               ` Martin Langhoff
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-03-01  8:40 UTC (permalink / raw)
  To: git; +Cc: Martin Langhoff

On Wednesday 2007 February 28 23:40, Martin Langhoff wrote:

> I guess what I mean is that the common case on windows is going to be
> for users who want their binary files un-corrupted, and their text
> files newline-converted.

Given that windows editors/tools generally cope quite well with UNIX line 
endings, the common case could well be that no one cares that the new line 
conversion isn't happening (it's certainly the case for me).

> In fact, I think we should have it set to default to binary -- which

Me too.  However, I didn't want to change existing behaviour.  That die has 
been cast unfortunately.

> does the best job of preserving data. And override that based on file
> extension (configurable) or  check the "head" of the file cheaply for
> binary-ness before we update the file on the client side.

You're right in theory, but I really don't like the idea of auto-detection; 
things like that /always/ go wrong and when they do there isn't a way to undo 
what it's done.  Douglas Adams had it right when he said:

"The major difference between a thing that might go wrong and a thing that 
cannot possibly go wrong is that when a thing that cannot possibly go wrong 
goes wrong it usually turns out to be impossible to get at or repair."

> I agree with the idea of doing something smart with -kb flags, but
> this is not a good move. For the common case among Windows TortoiseCVS
> users, the switch proposed introduces the ability to switch between
> one broken mode to another broken mode.

I don't understand.  What is broken about it?  As far as I can tell the -kb 
flag doesn't do anything smart, it /disables/ the smartness and tells the 
client that the file is binary.  As you say - both modes are broken, so 
supplying a switch isn't crazy because one broken mode might suit better than 
another.  I agree that neither is ideal, but until the "ideal" is 
implemented, it's better to have the option than not have it.

I think I'm missing something that you are worrying about and that I haven't 
noticed.  Does -kb do something that I'm not aware of?  Is there a more 
correct way of telling the client that a file is binary?

> (And in terms of temporary workarounds, TortoiseCVS has a switch in
> itself to disable newline conversion.)

I'd prefer if the configuration is kept on the server side as much as 
possible.  Disabling newline conversion in the client is exactly the same 
solution as putting -kb everywhere in the server, except in the server it 
only needs doing once; and when a better method is eventually found the 
clients can continue without having to change anything.



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01  8:40             ` Andy Parkins
@ 2007-03-01  9:13               ` Martin Langhoff
  2007-03-01  9:41                 ` Andy Parkins
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Langhoff @ 2007-03-01  9:13 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git

On 3/1/07, Andy Parkins <andyparkins@gmail.com> wrote:
> On Wednesday 2007 February 28 23:40, Martin Langhoff wrote:
>
> > I guess what I mean is that the common case on windows is going to be
> > for users who want their binary files un-corrupted, and their text
> > files newline-converted.
>
> Given that windows editors/tools generally cope quite well with UNIX line
> endings, the common case could well be that no one cares that the new line
> conversion isn't happening (it's certainly the case for me).

Well... the guys working on the MinGW port of git have railroaded us
with the opposite position. If it's going to work in Windows, you
better accept it has to handle newline conversion correctly or it's
broken...

...

> > I agree with the idea of doing something smart with -kb flags, but
> > this is not a good move. For the common case among Windows TortoiseCVS
> > users, the switch proposed introduces the ability to switch between
> > one broken mode to another broken mode.
>
> I don't understand.  What is broken about it?

The interesting case to fix this for is a mixed binary and text repo
with windows users wanting newline conversion (if they really don't,
they can tell TortoiseCVS as much). For that scenario, current
behaviour is broken (binaries get mangled), and setting -kb on
everything breaks newline conversion.

And that scenario can only be addressed sending appropriate flags for
each file, not with a blanket switch.

>  As you say - both modes are broken, so
> supplying a switch isn't crazy because one broken mode might suit better than
> another.

Except that it's not that hard to do something better -- I was hoping
to prep a patch today, but things got frantic at the office.

> I think I'm missing something that you are worrying about and that I haven't
> noticed.  Does -kb do something that I'm not aware of?  Is there a more
> correct way of telling the client that a file is binary?

Just that I think it's easy enough to implement something that sets
-k mode on file extension, which is actually _much_ better and makes
the blanket setting pointless. And perhaps we can get binary
autodetection working well but accepting overrides.

cheers,


martin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01  9:13               ` Martin Langhoff
@ 2007-03-01  9:41                 ` Andy Parkins
  2007-03-01  9:46                   ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Parkins @ 2007-03-01  9:41 UTC (permalink / raw)
  To: git; +Cc: Martin Langhoff

On Thursday 2007 March 01 09:13, Martin Langhoff wrote:

> Well... the guys working on the MinGW port of git have railroaded us
> with the opposite position. If it's going to work in Windows, you
> better accept it has to handle newline conversion correctly or it's
> broken...

I do accept that, but as you said - it's broken whether you pick all binary or 
all text.  I'm not advocating that my solution is final, but it is better to 
not convert text than to mangle binary.

> The interesting case to fix this for is a mixed binary and text repo
> with windows users wanting newline conversion (if they really don't,
> they can tell TortoiseCVS as much). For that scenario, current
> behaviour is broken (binaries get mangled), and setting -kb on
> everything breaks newline conversion.
>
> And that scenario can only be addressed sending appropriate flags for
> each file, not with a blanket switch.

Erm, yes, I know that.  But who is going to set that switch?  This isn't real 
CVS where the repository records that information.  At the moment git does 
not know whether any given file is binary or text.

> Except that it's not that hard to do something better -- I was hoping
> to prep a patch today, but things got frantic at the office.

I think it is hard; as git doesn't supply the needed information.  I suppose 
it could be stored in the SQLlite table, but that is seriously borked.  That 
table is a necessary evil, not a place to start storing any old flag.

> Just that I think it's easy enough to implement something that sets
> -k mode on file extension, which is actually _much_ better and makes
> the blanket setting pointless. And perhaps we can get binary
> autodetection working well but accepting overrides.

As I said, I don't like autodetection.  It /will/ get it wrong.  As for using 
file extension - well that's wrong too if it's coded into the source.

In a previous life, I used to develop a unix program on a windows computer; 
the UNIX directory was mounted as a samba share and built and run via an ssh 
connection.  How is git-cvsserver meant to know about a crazy situation like 
that and "auto-detect" the right thing?  It can't.


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01  9:41                 ` Andy Parkins
@ 2007-03-01  9:46                   ` Junio C Hamano
  2007-03-01 10:04                     ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2007-03-01  9:46 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git, Martin Langhoff

Andy Parkins <andyparkins@gmail.com> writes:

> Erm, yes, I know that.  But who is going to set that switch?
> This isn't real CVS where the repository records that
> information.  At the moment git does not know whether any
> given file is binary or text.

What do you think I have been hacking around pathattr stuff
today for ;-)? 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01  9:46                   ` Junio C Hamano
@ 2007-03-01 10:04                     ` Junio C Hamano
  2007-03-01 11:03                       ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2007-03-01 10:04 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git, Martin Langhoff

Junio C Hamano <junkio@cox.net> writes:

> Andy Parkins <andyparkins@gmail.com> writes:
>
>> Erm, yes, I know that.  But who is going to set that switch?
>> This isn't real CVS where the repository records that
>> information.  At the moment git does not know whether any
>> given file is binary or text.
>
> What do you think I have been hacking around pathattr stuff
> today for ;-)? 

A bit more clarification.  Although I won't be hacking on it
anymore since it is almost my bedtime,...

I think the 4 patch series I sent out tonight was fun but
slightly misdesigned.  I should have made the classification of
paths and actions on them separate sections.  That is, instead
of saying:

	[pathattr "a/v"]
        	path = "*.mpg"
                path = "*.mp3"
                path = "*.jpg"
		conv_i = none
                conv_o = none
	[pathattr "text"]
        	path = "*" ; all the rest
                conv_i = crlf
                conv_o = crlf

we should make two kinds of configuration.  Classification of
paths is property of the project and does not depend on where
the user uses the paths from the project:

	[pathattr "a/v"]
        	path = "*.mpg"
                path = "*.mp3"
                path = "*.jpg"
	[pathattr "text"]
        	path = "*" ; all the rest

while how they are handled can be platform dependent.  On UNIX,
you might have this in $HOME/.gitconfig:

	[handler "a/v"]
        	pretty = "cmd xine %s" ; nb. there is no 'cmd' yet...
	[handler "text"]
		pretty = "pipe fmt -"

while on another system, you might have:

	[handler "a/v"]
        	pretty = "cmd mediaplayer %s"
	[handler "text"]
                conv_i = crlf
                conv_o = crlf

One thing to note is that [pathattr "kind"] may need to behave
like existing .gitignore in that they are read from directories,
but [handler "kind"] does not have to.

We probably need to expose pathattr_lookup(path) to scripts, and
cvsserver could use that interface to query.  The interface may
look like this:

	$ git pathattr --lookup porn.mpg
        a/v
        $ git config --get handler.a/v
	cmd xine %s

and kopts_from_path would look like:

	sub kopts_from_path {
        	my ($path) = shift;
                my $kind = `git pathattr $path`;
		my $conv_o = `git config --get "handler.$kind"`

                if ($conv_o eq 'crlf') {
			return '-kb';
		}
		return '';
	}

Hmm?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01 10:04                     ` Junio C Hamano
@ 2007-03-01 11:03                       ` Junio C Hamano
  2007-03-01 11:40                         ` Andy Parkins
  2007-03-01 11:44                         ` Karl Hasselström
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2007-03-01 11:03 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git, Martin Langhoff

Junio C Hamano <junkio@cox.net> writes:

> we should make two kinds of configuration.  Classification of
> paths is property of the project and does not depend on where
> the user uses the paths from the project:
> ...
> while how they are handled can be platform dependent.  On UNIX,
> you might have this in $HOME/.gitconfig:
>
> 	[handler "a/v"]
>         	pretty = "cmd xine %s" ; nb. there is no 'cmd' yet...
> 	[handler "text"]
> 		pretty = "pipe fmt -"
>
> while on another system, you might have:
>
> 	[handler "a/v"]
>         	pretty = "cmd mediaplayer %s"
> 	[handler "text"]
>                 conv_i = crlf
>                 conv_o = crlf

Once we separate out the handler section, we can introduce more
useful variables in it (not the "pretty", whose sole purpose was
to have fun and serve as a demonstration).  Obvious ones are:

	; takes temporary files old, mine, his and writes the
        ; result in another
	merge = "cmd external-merge-command %s %s %s %s"

	; takes two temporary files
        diff = "cmd external-diff-command %s %s"

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01 11:03                       ` Junio C Hamano
@ 2007-03-01 11:40                         ` Andy Parkins
  2007-03-01 11:44                         ` Karl Hasselström
  1 sibling, 0 replies; 17+ messages in thread
From: Andy Parkins @ 2007-03-01 11:40 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Martin Langhoff

On Thursday 2007 March 01 11:03, Junio C Hamano wrote:

> Once we separate out the handler section, we can introduce more
> useful variables in it (not the "pretty", whose sole purpose was
> to have fun and serve as a demonstration).  Obvious ones are:
>
> 	; takes temporary files old, mine, his and writes the
>         ; result in another
> 	merge = "cmd external-merge-command %s %s %s %s"
>
> 	; takes two temporary files
>         diff = "cmd external-diff-command %s %s"

I was part way through assembling a document when I read this; I hadn't 
thought of the external merge option.  I'll add that and send the document 
soon.


Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] cvsserver: Make always-binary mode a config file option
  2007-03-01 11:03                       ` Junio C Hamano
  2007-03-01 11:40                         ` Andy Parkins
@ 2007-03-01 11:44                         ` Karl Hasselström
  1 sibling, 0 replies; 17+ messages in thread
From: Karl Hasselström @ 2007-03-01 11:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Andy Parkins, git, Martin Langhoff

On 2007-03-01 03:03:11 -0800, Junio C Hamano wrote:

>       ; takes temporary files old, mine, his and writes the
>         ; result in another
>       merge = "cmd external-merge-command %s %s %s %s"

Better make this "%1 %2 %3 %4", "%old %mine %his %result", or
something along those lines, so as to not force a particular argument
order on the external script.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-03-01 11:44 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-22 15:04 git-cvsserver and binary files Andy Parkins
2007-02-22 16:06 ` [PATCH] Added "-kb" to all the entries lines sent to the client Andy Parkins
2007-02-22 20:37   ` Junio C Hamano
2007-02-27 13:45     ` [PATCH] cvsserver: Make always-binary mode a config file option Andy Parkins
2007-02-28 11:36       ` Martin Langhoff
2007-02-28 13:01         ` Andy Parkins
2007-02-28 23:40           ` Martin Langhoff
2007-03-01  8:40             ` Andy Parkins
2007-03-01  9:13               ` Martin Langhoff
2007-03-01  9:41                 ` Andy Parkins
2007-03-01  9:46                   ` Junio C Hamano
2007-03-01 10:04                     ` Junio C Hamano
2007-03-01 11:03                       ` Junio C Hamano
2007-03-01 11:40                         ` Andy Parkins
2007-03-01 11:44                         ` Karl Hasselström
2007-02-27 13:46     ` Andy Parkins
2007-02-27 23:56       ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.