All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] Git config file reader in Perl (WIP)
@ 2007-01-15  0:44 Jakub Narebski
  2007-01-15  7:08 ` Eric Wong
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-15  0:44 UTC (permalink / raw)
  To: git

To make gitweb faster I thought about adding to it, or to Git.pm,
simple nonvalidation config file reader. Nonvalidating means that
it would accept some input which git-repo-config considers invalid.

Some of the trouble is caused because of coner cases like this 
example

[section "sub ; # sect \" ion\\"] ; "]
	key = a " b ; " ; " c ; "

which is valid, but strange.

I'm not proficent in Perl, so help is appreciated.

-- >8 --
#!/usr/bin/perl

use strict;
use warnings;

use Text::Balanced qw(extract_delimited);


sub read_config {
	my $configfile = shift;
	my $section = shift;
	my %config;

	open my $fd, $configfile
		or die "Cannot open $configfile: $!";

	my $sectfull;
	while (my $line = <$fd>) {
		chomp $line;

		if ($line =~ m/^\s*\[\s*([^][:space:]]*)\s*\](.*)$/) {
			# section without subsection

			my $sect = lc($1);

			$sectfull = $sect;

		} elsif ($line =~ m/\s*\[([^][:space:]]*)\s"((?:\\.|[^"])*)"\](.*)$/) {
			# section with subsection

			my $sect = lc($1);
			my $subsect = $2;
			$subsect =~ s/\\(.)/$1/g;

			$sectfull = "$sect.$subsect";

		} elsif ($line =~ m/\s*(\w+)\s*=\s*(.*?)\s*$/) {
			# variable assignment

			my $key = lc($1);
			my $rhs = $2;

			my $value = '';
			my ($next, $remainder, $prefix) = qw();
		DELIM: {
				do {
					($next, $remainder, $prefix) =
						extract_delimited($rhs, '"', qr/(?:\\.|[^"])*/);

					if ($prefix =~ s/\s*[;#].*$//) {
						# comment in unquoted part
						$value .= $prefix;
						last DELIM;
					} else {
						$value .= $prefix if $prefix;
						if ($next && $next =~ s/^"(.*)"$/$1/) {
							$value .= $next;
						}
					}

					$rhs = $remainder;
				} while ($rhs && $next);
			} # DELIM:

			if ($remainder) {
				$remainder =~ s/\s*[;#].*$//;
				$value .= $remainder;
			}

			$value =~ s/\\(.)/$1/g;

			if (exists $config{"$sectfull.$key"}) {
				push @{$config{"$sectfull.$key"}}, $value;
			} else {
				$config{"$sectfull.$key"} = [ $value ];
			}

		} elsif ($line =~ m/^\s*(\w+)\s*(:?[;#].*)?$/) {
			# boolean variable without value

			my $key = lc($1);

			if (exists $config{"$sectfull.$key"}) {
				push @{$config{"$sectfull.$key"}}, undef;
			} else {
				$config{"$sectfull.$key"} = [ undef ];
			}
		} # end if
	}

	close $fd
		or die "Cannot close $configfile: $!";

	return wantarray ? %config : \%config;
}

# --------------------------------------------------------------------------

my %config;

%config = read_config("~/git/.git/config");
%config = read_config("/tmp/jnareb/gitconfig");

foreach my $ckey (sort keys %config) {
	foreach my $cvalue (@{$config{$ckey}}) {
		if (defined $cvalue) {
			print "$ckey=$cvalue\n";
		} else {
			print "$ckey\n";
		}
	}
}

__END__

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15  0:44 [RFC] Git config file reader in Perl (WIP) Jakub Narebski
@ 2007-01-15  7:08 ` Eric Wong
  2007-01-15  9:03   ` Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Wong @ 2007-01-15  7:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> wrote:
> To make gitweb faster I thought about adding to it, or to Git.pm,
> simple nonvalidation config file reader. Nonvalidating means that
> it would accept some input which git-repo-config considers invalid.

How about something like git-for-each-ref that dumps the entire output
of a config file into an eval()-able string?  That way we don't have to
deal with corner-cases and subtle differences between C and Perl
implementations.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15  7:08 ` Eric Wong
@ 2007-01-15  9:03   ` Jakub Narebski
  2007-01-15  9:56     ` Eric Wong
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-15  9:03 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

Eric Wong wrote:
> Jakub Narebski <jnareb@gmail.com> wrote:
>> To make gitweb faster I thought about adding to it, or to Git.pm,
>> simple nonvalidation config file reader. Nonvalidating means that
>> it would accept some input which git-repo-config considers invalid.
> 
> How about something like git-for-each-ref that dumps the entire output
> of a config file into an eval()-able string?  That way we don't have to
> deal with corner-cases and subtle differences between C and Perl
> implementations.

The idea is (at least for gitweb) to avoid cost of fork. And I think
if the format gets documented properly, there should be no differences
in config file parsing.

Please remember also that is first draft of git config file reader
in perl; an alpha version.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15  9:03   ` Jakub Narebski
@ 2007-01-15  9:56     ` Eric Wong
  2007-01-15 10:01       ` Shawn O. Pearce
  2007-01-15 10:32       ` Jakub Narebski
  0 siblings, 2 replies; 54+ messages in thread
From: Eric Wong @ 2007-01-15  9:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> wrote:
> Eric Wong wrote:
> > Jakub Narebski <jnareb@gmail.com> wrote:
> >> To make gitweb faster I thought about adding to it, or to Git.pm,
> >> simple nonvalidation config file reader. Nonvalidating means that
> >> it would accept some input which git-repo-config considers invalid.
> > 
> > How about something like git-for-each-ref that dumps the entire output
> > of a config file into an eval()-able string?  That way we don't have to
> > deal with corner-cases and subtle differences between C and Perl
> > implementations.
> 
> The idea is (at least for gitweb) to avoid cost of fork. And I think
> if the format gets documented properly, there should be no differences
> in config file parsing.

If the Perl output is redirected to a file (say .git/config.perl) and
only regenerated when .git/config changes, `do(".git/config.perl")' will
likely be faster since all the parsing will be done by Perl itself.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15  9:56     ` Eric Wong
@ 2007-01-15 10:01       ` Shawn O. Pearce
  2007-01-15 10:32       ` Jakub Narebski
  1 sibling, 0 replies; 54+ messages in thread
From: Shawn O. Pearce @ 2007-01-15 10:01 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jakub Narebski, git

Eric Wong <normalperson@yhbt.net> wrote:
> If the Perl output is redirected to a file (say .git/config.perl) and
> only regenerated when .git/config changes, `do(".git/config.perl")' will
> likely be faster since all the parsing will be done by Perl itself.

So long as its automatic in gitweb.cgi and based on the stat
attributes of .git/config, OK.  But my database background tells
me two copies of the same thing is fishy...

-- 
Shawn.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15  9:56     ` Eric Wong
  2007-01-15 10:01       ` Shawn O. Pearce
@ 2007-01-15 10:32       ` Jakub Narebski
  2007-01-15 11:26         ` Eric Wong
  1 sibling, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-15 10:32 UTC (permalink / raw)
  To: Eric Wong; +Cc: git

Eric Wong wrote:
> Jakub Narebski <jnareb@gmail.com> wrote:
>> Eric Wong wrote:
>>> Jakub Narebski <jnareb@gmail.com> wrote:
>>>> To make gitweb faster I thought about adding to it, or to Git.pm,
>>>> simple nonvalidation config file reader. Nonvalidating means that
>>>> it would accept some input which git-repo-config considers invalid.
>>> 
>>> How about something like git-for-each-ref that dumps the entire output
>>> of a config file into an eval()-able string?  That way we don't have to
>>> deal with corner-cases and subtle differences between C and Perl
>>> implementations.
>> 
>> The idea is (at least for gitweb) to avoid cost of fork. And I think
>> if the format gets documented properly, there should be no differences
>> in config file parsing.
> 
> If the Perl output is redirected to a file (say .git/config.perl) and
> only regenerated when .git/config changes, `do(".git/config.perl")' will
> likely be faster since all the parsing will be done by Perl itself.

Would you write "git repo-config --perl", then? ;-)

Besides, I'd rather avoid the need for /tmp/gitweb, and I think usually
gitweb do not have (and should not have) write access to repository.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 10:32       ` Jakub Narebski
@ 2007-01-15 11:26         ` Eric Wong
  2007-01-15 12:15           ` Johannes Schindelin
  0 siblings, 1 reply; 54+ messages in thread
From: Eric Wong @ 2007-01-15 11:26 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> wrote:
> Eric Wong wrote:
> > Jakub Narebski <jnareb@gmail.com> wrote:
> >> Eric Wong wrote:
> >>> Jakub Narebski <jnareb@gmail.com> wrote:
> >>>> To make gitweb faster I thought about adding to it, or to Git.pm,
> >>>> simple nonvalidation config file reader. Nonvalidating means that
> >>>> it would accept some input which git-repo-config considers invalid.
> >>> 
> >>> How about something like git-for-each-ref that dumps the entire output
> >>> of a config file into an eval()-able string?  That way we don't have to
> >>> deal with corner-cases and subtle differences between C and Perl
> >>> implementations.
> >> 
> >> The idea is (at least for gitweb) to avoid cost of fork. And I think
> >> if the format gets documented properly, there should be no differences
> >> in config file parsing.
> > 
> > If the Perl output is redirected to a file (say .git/config.perl) and
> > only regenerated when .git/config changes, `do(".git/config.perl")' will
> > likely be faster since all the parsing will be done by Perl itself.
> 
> Would you write "git repo-config --perl", then? ;-)

The below patch should be a start (only tested on my fairly standard
.git/config).  A --python option should be easy, too :)

> Besides, I'd rather avoid the need for /tmp/gitweb, and I think usually
> gitweb do not have (and should not have) write access to repository.

Good point.  Having to maintain a .git/config.perl in the repository
would be a pain from an administrative standpoint; but on the other hand
.git/config is not often regenerated.

I don't think giving gitweb write access to a repo is a good idea;
either.  Perhaps it would be updated via hook like the HTTP stuff.
IMHO, there is nothing wrong with gitweb writing to /tmp; however.

diff --git a/builtin-repo-config.c b/builtin-repo-config.c
index 9063311..a9ef358 100644
--- a/builtin-repo-config.c
+++ b/builtin-repo-config.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "quote.h"
 
 static const char git_config_set_usage[] =
 "git-repo-config [ --global ] [ --bool | --int ] [--get | --get-all | --get-regexp | --replace-all | --add | --unset | --unset-all] name [value [value_regex]] | --rename-section old_name new_name | --list";
@@ -13,6 +14,7 @@ static int do_all;
 static int do_not_match;
 static int seen;
 static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
+static char *last_key;
 
 static int show_all_config(const char *key_, const char *value_)
 {
@@ -23,6 +25,30 @@ static int show_all_config(const char *key_, const char *value_)
 	return 0;
 }
 
+static int show_perl_config(const char *key_, const char *value_)
+{
+	if (last_key) {
+		if (strcmp(last_key, key_)) {
+			free(last_key);
+			last_key = xstrdup(key_);
+			fputs("\t],\n\t", stdout);
+			perl_quote_print(stdout, key_);
+			fputs(" => [\n", stdout);
+		}
+	} else {
+		last_key = xstrdup(key_);
+		fputc('\t', stdout);
+		perl_quote_print(stdout, key_);
+		fputs(" => [\n", stdout);
+	}
+	if (value_) {
+		fputs("\t\t", stdout);
+		perl_quote_print(stdout, value_);
+		fputs(",\n", stdout);
+	}
+	return 0;
+}
+
 static int show_config(const char* key_, const char* value_)
 {
 	char value[256];
@@ -138,6 +164,17 @@ int cmd_repo_config(int argc, const char **argv, const char *prefix)
 			type = T_BOOL;
 		else if (!strcmp(argv[1], "--list") || !strcmp(argv[1], "-l"))
 			return git_config(show_all_config);
+		else if (!strcmp(argv[1], "--perl")) {
+			int rv;
+			puts("\%git_config = (");
+			rv = git_config(show_perl_config);
+			if (last_key) {
+				puts("\t]\n);\n");
+				free(last_key);
+				last_key = NULL;
+			}
+			return rv;
+		}
 		else if (!strcmp(argv[1], "--global")) {
 			char *home = getenv("HOME");
 			if (home) {

-- 
Eric Wong

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 11:26         ` Eric Wong
@ 2007-01-15 12:15           ` Johannes Schindelin
  2007-01-15 15:34             ` Nikolai Weibull
  2007-01-16  9:51             ` Eric Wong
  0 siblings, 2 replies; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-15 12:15 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jakub Narebski, git

Hi,

On Mon, 15 Jan 2007, Eric Wong wrote:

> > Would you write "git repo-config --perl", then? ;-)
> 
> The below patch should be a start (only tested on my fairly standard 
> .git/config).  A --python option should be easy, too :)

A bit shorter (and gets the booleans right, plus being even easier 
towards --python extension):

---

 builtin-repo-config.c |   19 +++++++++++++++++--
 1 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/builtin-repo-config.c b/builtin-repo-config.c
index 9063311..8ebf436 100644
--- a/builtin-repo-config.c
+++ b/builtin-repo-config.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "quote.h"
 
 static const char git_config_set_usage[] =
 "git-repo-config [ --global ] [ --bool | --int ] [--get | --get-all | --get-regexp | --replace-all | --add | --unset | --unset-all] name [value [value_regex]] | --rename-section old_name new_name | --list";
@@ -12,11 +13,18 @@ static int use_key_regexp;
 static int do_all;
 static int do_not_match;
 static int seen;
+static const char *perl_prefix = NULL;
 static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
 
 static int show_all_config(const char *key_, const char *value_)
 {
-	if (value_)
+	if (perl_prefix) {
+		printf("%s", perl_prefix);
+		perl_quote_print(stdout, key_);
+		printf(" => ");
+		perl_quote_print(stdout, value_ ? value_ : "true");
+		perl_prefix = ",\n\t";
+	} else if (value_)
 		printf("%s=%s\n", key_, value_);
 	else
 		printf("%s\n", key_);
@@ -138,7 +146,14 @@ int cmd_repo_config(int argc, const char **argv, const char *prefix)
 			type = T_BOOL;
 		else if (!strcmp(argv[1], "--list") || !strcmp(argv[1], "-l"))
 			return git_config(show_all_config);
-		else if (!strcmp(argv[1], "--global")) {
+		else if (!strcmp(argv[1], "--perl")) {
+			int ret;
+			perl_prefix = "\n\t";
+			printf("%%git_config = (");
+			ret = git_config(show_all_config);
+			printf("\n);\n");
+			return ret;
+		} else if (!strcmp(argv[1], "--global")) {
 			char *home = getenv("HOME");
 			if (home) {
 				char *user_config = xstrdup(mkpath("%s/.gitconfig", home));

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 12:15           ` Johannes Schindelin
@ 2007-01-15 15:34             ` Nikolai Weibull
  2007-01-15 15:44               ` Johannes Schindelin
                                 ` (2 more replies)
  2007-01-16  9:51             ` Eric Wong
  1 sibling, 3 replies; 54+ messages in thread
From: Nikolai Weibull @ 2007-01-15 15:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Eric Wong, Jakub Narebski, git

On 1/15/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> On Mon, 15 Jan 2007, Eric Wong wrote:

> > > Would you write "git repo-config --perl", then? ;-)

> > The below patch should be a start (only tested on my fairly standard
> > .git/config).  A --python option should be easy, too :)

> A bit shorter (and gets the booleans right, plus being even easier
> towards --python extension):

If we're going down this slippery slope, why not just give up and add
a --xml switch instead?  Readable by all and a lot more flexible than
--perl, --python, --ruby, --tcl, --sh, --c++, --fortran, --lisp,
--html, --that-next-silver-bullet-language-that-hasnt-been-invented-yet-but-will-need-its-own-switch-once-it-has-been.

That said, parsing the config file as-is can't be so difficult that we
need to export it to separate files with a different syntax, now can
it?

  nikolai

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 15:34             ` Nikolai Weibull
@ 2007-01-15 15:44               ` Johannes Schindelin
  2007-01-15 16:22                 ` Nikolai Weibull
  2007-01-15 16:00               ` Jakub Narebski
  2007-01-16 10:45               ` Junio C Hamano
  2 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-15 15:44 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Eric Wong, Jakub Narebski, git

Hi,

On Mon, 15 Jan 2007, Nikolai Weibull wrote:

> On 1/15/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> 
> > On Mon, 15 Jan 2007, Eric Wong wrote:
> 
> > > > Would you write "git repo-config --perl", then? ;-)
> 
> > > The below patch should be a start (only tested on my fairly standard
> > > .git/config).  A --python option should be easy, too :)
> 
> > A bit shorter (and gets the booleans right, plus being even easier
> > towards --python extension):
> 
> If we're going down this slippery slope, why not just give up and add
> a --xml switch instead?

AFAIR this switch was meant to _enhance_ performance.

> That said, parsing the config file as-is can't be so difficult that we 
> need to export it to separate files with a different syntax, now can it?

The point is having one parser to rule them all, and avoid having 
different parsers, all with their own set of shortcomings.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 15:34             ` Nikolai Weibull
  2007-01-15 15:44               ` Johannes Schindelin
@ 2007-01-15 16:00               ` Jakub Narebski
  2007-01-16 10:45               ` Junio C Hamano
  2 siblings, 0 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-15 16:00 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Johannes Schindelin, Eric Wong, git

Nikolai Weibull wrote:
> On 1/15/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>> On Mon, 15 Jan 2007, Eric Wong wrote:
> 
>>>> Would you write "git repo-config --perl", then? ;-)
> 
>>> The below patch should be a start (only tested on my fairly standard
>>> .git/config).  A --python option should be easy, too :)
> 
>> A bit shorter (and gets the booleans right, plus being even easier
>> towards --python extension):
> 
> If we're going down this slippery slope, why not just give up and add
> a --xml switch instead?  Readable by all and a lot more flexible than
> --perl, --python, --ruby, --tcl, --sh, --c++, --fortran, --lisp,
> --html, --that-next-silver-bullet-language [...].
> 
> That said, parsing the config file as-is can't be so difficult that we
> need to export it to separate files with a different syntax, now can
> it?

Parsing the config file is not _that_ difficult (the first post in this 
thread had config reader in Perl), but it is not that easy: case 
(in)setiviness, quoting, escaping, comments, removing leading and 
trailing whitespace when not quoted...

P.S. I'd rather have an additional implementation (in Perl) conforming 
to yet to be written git ini-like config file specs, to find places 
where canonic parser, git-repo-config, doesn't conform to the specs.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 15:44               ` Johannes Schindelin
@ 2007-01-15 16:22                 ` Nikolai Weibull
  0 siblings, 0 replies; 54+ messages in thread
From: Nikolai Weibull @ 2007-01-15 16:22 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Eric Wong, Jakub Narebski, git

On 1/15/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> On Mon, 15 Jan 2007, Nikolai Weibull wrote:

> > On 1/15/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> > > On Mon, 15 Jan 2007, Eric Wong wrote:

> > > > > Would you write "git repo-config --perl", then? ;-)

> > > > The below patch should be a start (only tested on my fairly standard
> > > > .git/config).  A --python option should be easy, too :)

> > > A bit shorter (and gets the booleans right, plus being even easier
> > > towards --python extension):

> > If we're going down this slippery slope, why not just give up and add
> > a --xml switch instead?

> AFAIR this switch was meant to _enhance_ performance.

As far as I can tell, comparing fork() vs. reading a dump with eval
vs. XML isn't meaningful - parsing a 20-line XML file can hardly be
much more (if it even is more) expensive than evaling a file of the
same length.

> > That said, parsing the config file as-is can't be so difficult that we
> > need to export it to separate files with a different syntax, now can it?

> The point is having one parser to rule them all, and avoid having
> different parsers, all with their own set of shortcomings.

So then you must agree that having one export format makes a lot of
sense, for the same reasons.  Not that I think that an export format
makes sense in the first place.

  nikolai

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 12:15           ` Johannes Schindelin
  2007-01-15 15:34             ` Nikolai Weibull
@ 2007-01-16  9:51             ` Eric Wong
  2007-01-16 10:47               ` Johannes Schindelin
  1 sibling, 1 reply; 54+ messages in thread
From: Eric Wong @ 2007-01-16  9:51 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
> 
> On Mon, 15 Jan 2007, Eric Wong wrote:
> 
> > > Would you write "git repo-config --perl", then? ;-)
> > 
> > The below patch should be a start (only tested on my fairly standard 
> > .git/config).  A --python option should be easy, too :)
> 
> A bit shorter (and gets the booleans right, plus being even easier 
> towards --python extension):

Your version doesn't get arrays right, however.

Here's a Perl/Python/Ruby version below.  It should be extendable for
other languages; feedback and additions appreciated:

Note that usage has been changed to --dump=(perl|python|ruby)

I may add key_suffix to lang_dump just to be consistent with pairings,
but array_start seems to handle all cases of it and it would be
redundant...

--- a/builtin-repo-config.c
+++ b/builtin-repo-config.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "quote.h"
 
 static const char git_config_set_usage[] =
 "git-repo-config [ --global ] [ --bool | --int ] [--get | --get-all | --get-regexp | --replace-all | --add | --unset | --unset-all] name [value [value_regex]] | --rename-section old_name new_name | --list";
@@ -14,6 +15,90 @@ static int do_not_match;
 static int seen;
 static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
 
+struct lang_dump {
+	const char *name;
+	const char *decl_start;
+	const char *decl_end;
+	const char *key_prefix;
+	const char *array_start;
+	const char *array_end;
+	const char *val_prefix;
+	const char *val_suffix;
+	const char *true_val; /* should already be quoted, if needed */
+	void (*quote_key_fn)(FILE *, const char*);
+	void (*quote_val_fn)(FILE *, const char*);
+};
+static char *last_key;
+static struct lang_dump *lang;
+static struct lang_dump lang_dump_defs[] = {
+	{ "perl",
+		"\%git_config = (\n", ");\n",
+		"\t",
+		" => [\n", "\t],\n",
+		"\t\t", ",\n",
+		"'true'",
+		perl_quote_print, perl_quote_print },
+	{ "python",
+		"git_config = {\n", "}\n",
+		"    ",
+		" : [\n", "    ],\n",
+		"        ", ",\n",
+		"True",
+		python_quote_print, python_quote_print },
+	{ "ruby", /* Ruby is very Perl-like */
+		"git_config = {\n", "}\n",
+		"  ",
+		" => [\n", "  ],\n",
+		"    ", ",\n",
+		"true",
+		perl_quote_print, perl_quote_print },
+};
+
+static int show_lang_config(const char *key_, const char *value_)
+{
+	if (last_key) {
+		if (strcmp(last_key, key_)) {
+			free(last_key);
+			fputs(lang->array_end, stdout);
+			goto new_key;
+		}
+	} else {
+new_key:
+		last_key = xstrdup(key_);
+		fputs(lang->key_prefix, stdout);
+		lang->quote_key_fn(stdout, key_);
+		fputs(lang->array_start, stdout);
+	}
+	fputs(lang->val_prefix, stdout);
+	if (value_)
+		lang->quote_val_fn(stdout, value_);
+	else
+		fputs(lang->true_val, stdout);
+	fputs(lang->val_suffix, stdout);
+	return 0;
+}
+
+static int show_lang_config_all(const char *lang_name)
+{
+	int i, rv;
+	for (i = ARRAY_SIZE(lang_dump_defs); --i >= 0; ) {
+		if (strcmp(lang_name, lang_dump_defs[i].name))
+			continue;
+		lang = lang_dump_defs + i;
+		fputs(lang->decl_start, stdout);
+		rv = git_config(show_lang_config);
+		if (last_key) {
+			free(last_key);
+			last_key = NULL;
+			fputs(lang->array_end, stdout);
+			fputs(lang->decl_end, stdout);
+		}
+		return rv;
+	}
+	fputs("Dumping config to '%s' is not yet supported", stderr);
+	return -1;
+}
+
 static int show_all_config(const char *key_, const char *value_)
 {
 	if (value_)
@@ -138,6 +223,8 @@ int cmd_repo_config(int argc, const char **argv, const char *prefix)
 			type = T_BOOL;
 		else if (!strcmp(argv[1], "--list") || !strcmp(argv[1], "-l"))
 			return git_config(show_all_config);
+		else if (!strncmp(argv[1], "--dump=", 7))
+			return show_lang_config_all(argv[1] + 7);
 		else if (!strcmp(argv[1], "--global")) {
 			char *home = getenv("HOME");
 			if (home) {
-- 
Eric Wong

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-15 15:34             ` Nikolai Weibull
  2007-01-15 15:44               ` Johannes Schindelin
  2007-01-15 16:00               ` Jakub Narebski
@ 2007-01-16 10:45               ` Junio C Hamano
  2007-01-16 11:12                 ` Johannes Schindelin
  2 siblings, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2007-01-16 10:45 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Johannes Schindelin, Eric Wong, Jakub Narebski, git

"Nikolai Weibull" <now@bitwi.se> writes:

> If we're going down this slippery slope, why not just give up and add
> a --xml switch instead?  Readable by all...

Perhaps all except humans.

At least YAML, please...

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16  9:51             ` Eric Wong
@ 2007-01-16 10:47               ` Johannes Schindelin
  2007-01-16 19:53                 ` Eric Wong
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-16 10:47 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jakub Narebski, git

Hi,

On Tue, 16 Jan 2007, Eric Wong wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > 
> > On Mon, 15 Jan 2007, Eric Wong wrote:
> > 
> > > > Would you write "git repo-config --perl", then? ;-)
> > > 
> > > The below patch should be a start (only tested on my fairly standard 
> > > .git/config).  A --python option should be easy, too :)
> > 
> > A bit shorter (and gets the booleans right, plus being even easier 
> > towards --python extension):
> 
> Your version doesn't get arrays right, however.

That's right.

I'd like that code to be simpler, though. Way simpler.

> --- a/builtin-repo-config.c
> +++ b/builtin-repo-config.c
> @@ -1,5 +1,6 @@
>  #include "builtin.h"
>  #include "cache.h"
> +#include "quote.h"
>  
>  static const char git_config_set_usage[] =
>  "git-repo-config [ --global ] [ --bool | --int ] [--get | --get-all | --get-regexp | --replace-all | --add | --unset | --unset-all] name [value [value_regex]] | --rename-section old_name new_name | --list";
> @@ -14,6 +15,90 @@ static int do_not_match;
>  static int seen;
>  static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
>  
> +struct lang_dump {
> +	const char *name;
> +	const char *decl_start;
> +	const char *decl_end;
> +	const char *key_prefix;
> +	const char *array_start;
> +	const char *array_end;
> +	const char *val_prefix;
> +	const char *val_suffix;
> +	const char *true_val; /* should already be quoted, if needed */
> +	void (*quote_key_fn)(FILE *, const char*);
> +	void (*quote_val_fn)(FILE *, const char*);
> +};
> +static char *last_key;
> +static struct lang_dump *lang;
> +static struct lang_dump lang_dump_defs[] = {
> +	{ "perl",
> +		"\%git_config = (\n", ");\n",
0> +		"\t",
> +		" => [\n", "\t],\n",
> +		"\t\t", ",\n",
> +		"'true'",
> +		perl_quote_print, perl_quote_print },

The two quote members seem to be the same for _all_ three languages.

> +	{ "python",
> +		"git_config = {\n", "}\n",
> +		"    ",

I don't understand why you do not consolidate that into using tabs for 
_all_ backends?


> +static int show_lang_config(const char *key_, const char *value_)
> +{
> +	if (last_key) {
> +		if (strcmp(last_key, key_)) {
> +			free(last_key);
> +			fputs(lang->array_end, stdout);
> +			goto new_key;
> +		}
> +	} else {
> +new_key:
> +		last_key = xstrdup(key_);
> +		fputs(lang->key_prefix, stdout);
> +		lang->quote_key_fn(stdout, key_);
> +		fputs(lang->array_start, stdout);
> +	}

So this makes _all_ config vars arrays? It is consistent, yes... but it is 
also ugly, no?

> +static int show_lang_config_all(const char *lang_name)
> +{
> +	int i, rv;
> +	for (i = ARRAY_SIZE(lang_dump_defs); --i >= 0; ) {
> +		if (strcmp(lang_name, lang_dump_defs[i].name))
> +			continue;
> +		lang = lang_dump_defs + i;

IMHO this would be much easier to read using a path_list:

	struct path_list_item *item = path_list_lookup(lang_name, &langs);

	if (item == NULL)
		return -1;

	lang = item->util;

> +		fputs(lang->decl_start, stdout);
> +		rv = git_config(show_lang_config);
> +		if (last_key) {
> +			free(last_key);
> +			last_key = NULL;
> +			fputs(lang->array_end, stdout);
> +			fputs(lang->decl_end, stdout);

If the config is empty, no decl_end is printed, right?

> +		}
> +		return rv;
> +	}
> +	fputs("Dumping config to '%s' is not yet supported", stderr);
> +	return -1;
> +}

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 10:45               ` Junio C Hamano
@ 2007-01-16 11:12                 ` Johannes Schindelin
  2007-01-16 14:14                   ` Jakub Narebski
  2007-01-16 19:09                   ` Eric Wong
  0 siblings, 2 replies; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-16 11:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nikolai Weibull, Eric Wong, Jakub Narebski, git

Hi,

On Tue, 16 Jan 2007, Junio C Hamano wrote:

> "Nikolai Weibull" <now@bitwi.se> writes:
> 
> > If we're going down this slippery slope, why not just give up and add
> > a --xml switch instead?  Readable by all...
> 
> Perhaps all except humans.
> 
> At least YAML, please...

I am _strongly_ opposed to all that rubbish. _If_ we want to use 
repo-config to preformat the config variables, we should either

1) just use "git repo-config -l" and STFU, or
2) introduce something like "--dump" which Eric implemented.

Everything else is just _complicating_ matters, and for _what_? _Nothing_ 
at all. If we use repo-config for that task, it should cater for parsing 
by _script languages_, not _users_.

I work with XML everyday. It has its uses. But this here problem is _not_ 
one of them. How silly would that be: we parse an easy-to-read format, 
munge the easy-to-handle internal data format into another "easy-to-read" 
format which is then parsed by a script language into an easy-to-handle 
internal data format? No. NO.

Ciao,
Dscho

P.S.: The more I think about it, we should just use the output of 
"repo-config -l".

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 11:12                 ` Johannes Schindelin
@ 2007-01-16 14:14                   ` Jakub Narebski
  2007-01-16 22:17                     ` Nikolai Weibull
  2007-01-16 19:09                   ` Eric Wong
  1 sibling, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-16 14:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Nikolai Weibull, Eric Wong, git

Johannes Schindelin wrote:

> On Tue, 16 Jan 2007, Junio C Hamano wrote:
> 
>> "Nikolai Weibull" <now@bitwi.se> writes:
>> 
>>> If we're going down this slippery slope, why not just give up and add
>>> a --xml switch instead?  Readable by all...
>> 
>> Perhaps all except humans.
>> 
>> At least YAML, please...
> 
> I am _strongly_ opposed to all that rubbish. _If_ we want to use 
> repo-config to preformat the config variables, we should either
> 
> 1) just use "git repo-config -l" and STFU, or

> P.S.: The more I think about it, we should just use the output of 
> "repo-config -l".

It wouldn't work. Subsection and value are (almost) free form, and
they can contain '=' in them.

But I agree that XML is serious overkill...

> 2) introduce something like "--dump" which Eric implemented.

It would be probably best to introduce --dump which would output
config file _as if_ it was written by git-repo-config (probably
without comments).

All values would be in separate lines, there would be only one
section header per each section, values would be quoted if needed
(if they contain comment delimiter, '=' or whitespace).

E.g.

  [section "subsection"] key=a " b; " c; " d; "

would get rewritten as

  [section "subsection"]
  	key = "a  b;  c"

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 11:12                 ` Johannes Schindelin
  2007-01-16 14:14                   ` Jakub Narebski
@ 2007-01-16 19:09                   ` Eric Wong
  1 sibling, 0 replies; 54+ messages in thread
From: Eric Wong @ 2007-01-16 19:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Nikolai Weibull, Jakub Narebski, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
> 
> On Tue, 16 Jan 2007, Junio C Hamano wrote:
> 
> > "Nikolai Weibull" <now@bitwi.se> writes:
> > 
> > > If we're going down this slippery slope, why not just give up and add
> > > a --xml switch instead?  Readable by all...
> > 
> > Perhaps all except humans.
> > 
> > At least YAML, please...
> 
> I am _strongly_ opposed to all that rubbish. _If_ we want to use 
> repo-config to preformat the config variables, we should either
> 
> 1) just use "git repo-config -l" and STFU, or
> 2) introduce something like "--dump" which Eric implemented.
> 
> Everything else is just _complicating_ matters, and for _what_? _Nothing_ 
> at all. If we use repo-config for that task, it should cater for parsing 
> by _script languages_, not _users_.
> 
> I work with XML everyday. It has its uses. But this here problem is _not_ 
> one of them. How silly would that be: we parse an easy-to-read format, 
> munge the easy-to-handle internal data format into another "easy-to-read" 
> format which is then parsed by a script language into an easy-to-handle 
> internal data format? No. NO.

I agree with these statements.  I actually had YAML in my code
originally, but ripped it out because it would be another round of
parsing for any language using a YAML parser.

I like YAML, but no, this isn't the place for it IMHO.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 10:47               ` Johannes Schindelin
@ 2007-01-16 19:53                 ` Eric Wong
  0 siblings, 0 replies; 54+ messages in thread
From: Eric Wong @ 2007-01-16 19:53 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> I'd like that code to be simpler, though. Way simpler.

I've tried, but I'm not sure how much farther I can go.

> > +	{ "perl",
> > +		"\%git_config = (\n", ");\n",
> 0> +		"\t",
> > +		" => [\n", "\t],\n",
> > +		"\t\t", ",\n",
> > +		"'true'",
> > +		perl_quote_print, perl_quote_print },
> 
> The two quote members seem to be the same for _all_ three languages.

Yes, I was trying to imagine a corner case where quoting
for keys could be different than values.

I know Ruby can use :symbols but those don't support '.' and '-' as
far as I know, Perl doesn't require quoting for keys if they match
^-?\w+.  I guess just using quoted strings as keys is fine enough.
The patch below uses the same quote operator for both keys and values.

> > +	{ "python",
> > +		"git_config = {\n", "}\n",
> > +		"    ",
> 
> I don't understand why you do not consolidate that into using tabs for 
> _all_ backends?

I wanted to make things look familiar to people using those languages.

Most Ruby programmers I've seen use 2-space indents.  I myself have given
into using them when I write Ruby.

Python doesn't seem to care about indentation for data structures, but I
think Python programmers prefer spaces for indentation.  I'm not very
experienced with Python, however.

Tabs are my own personal preference for Perl; but indentation is very
inconsistent in Perl code I've looked at :/.  perlstyle(1) actually
recommends 4 space indents...

On the other hand, we are writing for interpreters and not humans.  So
maybe just using tabs is good enough (some formatting makes debugging
easier, so I'm not putting everything on one line :).

> > +static int show_lang_config(const char *key_, const char *value_)
> > +{
> > +	if (last_key) {
> > +		if (strcmp(last_key, key_)) {
> > +			free(last_key);
> > +			fputs(lang->array_end, stdout);
> > +			goto new_key;
> > +		}
> > +	} else {
> > +new_key:
> > +		last_key = xstrdup(key_);
> > +		fputs(lang->key_prefix, stdout);
> > +		lang->quote_key_fn(stdout, key_);
> > +		fputs(lang->array_start, stdout);
> > +	}
> 
> So this makes _all_ config vars arrays? It is consistent, yes... but it is 
> also ugly, no?

Somewhat ugly, yes, but I think returning everything as an array would
make things easier for code using the data structures.  They could
always just reference the first element if they didn't want the array
instead of having to find the type with ref() or .kind_of?

> > +static int show_lang_config_all(const char *lang_name)
> > +{
> > +	int i, rv;
> > +	for (i = ARRAY_SIZE(lang_dump_defs); --i >= 0; ) {
> > +		if (strcmp(lang_name, lang_dump_defs[i].name))
> > +			continue;
> > +		lang = lang_dump_defs + i;
> 
> IMHO this would be much easier to read using a path_list:
> 
> 	struct path_list_item *item = path_list_lookup(lang_name, &langs);
> 
> 	if (item == NULL)
> 		return -1;
> 
> 	lang = item->util;

Ah, I didn't know about path_list_lookup().  Now that I know about it, I
don't think it's worth it to create the extra data structures around
it.  We can just as easily switch to bsearch(3) when we add more
languages.

> > +		fputs(lang->decl_start, stdout);
> > +		rv = git_config(show_lang_config);
> > +		if (last_key) {
> > +			free(last_key);
> > +			last_key = NULL;
> > +			fputs(lang->array_end, stdout);
> > +			fputs(lang->decl_end, stdout);
> 
> If the config is empty, no decl_end is printed, right?

Good catch.  Thanks.

> > +		}
> > +		return rv;
> > +	}
> > +	fputs("Dumping config to '%s' is not yet supported", stderr);
> > +	return -1;
> > +}

--- a/builtin-repo-config.c
+++ b/builtin-repo-config.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "quote.h"
 
 static const char git_config_set_usage[] =
 "git-repo-config [ --global ] [ --bool | --int ] [--get | --get-all | --get-regexp | --replace-all | --add | --unset | --unset-all] name [value [value_regex]] | --rename-section old_name new_name | --list";
@@ -14,6 +15,89 @@ static int do_not_match;
 static int seen;
 static enum { T_RAW, T_INT, T_BOOL } type = T_RAW;
 
+struct lang_dump {
+	const char *name;
+	const char *decl_start;
+	const char *decl_end;
+	const char *key_prefix;
+	const char *array_start;
+	const char *array_end;
+	const char *val_prefix;
+	const char *val_suffix;
+	const char *true_val; /* should already be quoted, if needed */
+	void (*quote_fn)(FILE *, const char*);
+};
+static char *last_key;
+static struct lang_dump *lang;
+static struct lang_dump lang_dump_defs[] = {
+	{ "perl",
+		"\%git_config = (\n", ");\n",
+		"\t",
+		" => [\n", "\t],\n",
+		"\t\t", ",\n",
+		"'true'",
+		perl_quote_print },
+	{ "python",
+		"git_config = {\n", "}\n",
+		"    ",
+		" : [\n", "    ],\n",
+		"        ", ",\n",
+		"True",
+		python_quote_print },
+	{ "ruby", /* Ruby is very Perl-like */
+		"git_config = {\n", "}\n",
+		"  ",
+		" => [\n", "  ],\n",
+		"    ", ",\n",
+		"true",
+		perl_quote_print },
+};
+
+static int show_lang_config(const char *key_, const char *value_)
+{
+	if (last_key) {
+		if (strcmp(last_key, key_)) {
+			free(last_key);
+			fputs(lang->array_end, stdout);
+			goto new_key;
+		}
+	} else {
+new_key:
+		last_key = xstrdup(key_);
+		fputs(lang->key_prefix, stdout);
+		lang->quote_fn(stdout, key_);
+		fputs(lang->array_start, stdout);
+	}
+	fputs(lang->val_prefix, stdout);
+	if (value_)
+		lang->quote_fn(stdout, value_);
+	else
+		fputs(lang->true_val, stdout);
+	fputs(lang->val_suffix, stdout);
+	return 0;
+}
+
+static int show_lang_config_all(const char *lang_name)
+{
+	int i, rv;
+	for (i = ARRAY_SIZE(lang_dump_defs); --i >= 0; ) {
+		if (strcmp(lang_name, lang_dump_defs[i].name))
+			continue;
+		lang = lang_dump_defs + i;
+		fputs(lang->decl_start, stdout);
+		rv = git_config(show_lang_config);
+		if (last_key) {
+			free(last_key);
+			last_key = NULL;
+			fputs(lang->array_end, stdout);
+		}
+		fputs(lang->decl_end, stdout);
+		return rv;
+	}
+	fputs("Dumping config to '%s' is not yet supported", stderr);
+	return -1;
+}
+
 static int show_all_config(const char *key_, const char *value_)
 {
 	if (value_)
@@ -138,6 +222,8 @@ int cmd_repo_config(int argc, const char **argv, const char *prefix)
 			type = T_BOOL;
 		else if (!strcmp(argv[1], "--list") || !strcmp(argv[1], "-l"))
 			return git_config(show_all_config);
+		else if (!strncmp(argv[1], "--dump=", 7))
+			return show_lang_config_all(argv[1] + 7);
 		else if (!strcmp(argv[1], "--global")) {
 			char *home = getenv("HOME");
 			if (home) {
-- 
Eric Wong

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 14:14                   ` Jakub Narebski
@ 2007-01-16 22:17                     ` Nikolai Weibull
  2007-01-16 22:37                       ` Jakub Narebski
  2007-01-16 22:42                       ` [RFC] Git config file reader in Perl (WIP) Johannes Schindelin
  0 siblings, 2 replies; 54+ messages in thread
From: Nikolai Weibull @ 2007-01-16 22:17 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Johannes Schindelin, Junio C Hamano, Eric Wong, git

On 1/16/07, Jakub Narebski <jnareb@gmail.com> wrote:
> Johannes Schindelin wrote:
>
> > On Tue, 16 Jan 2007, Junio C Hamano wrote:
> >
> >> "Nikolai Weibull" <now@bitwi.se> writes:
> >>
> >>> If we're going down this slippery slope, why not just give up and add
> >>> a --xml switch instead?  Readable by all...
> >>
> >> Perhaps all except humans.
> >>
> >> At least YAML, please...

> > P.S.: The more I think about it, we should just use the output of
> > "repo-config -l".

> It wouldn't work. Subsection and value are (almost) free form, and
> they can contain '=' in them.

Sadly, yes.  It would be very nice if -l gave unambiguous output for
all cases, but perhaps -l is more for parsing by people than by seds.

> But I agree that XML is serious overkill...

I don't know if it was clear from my first mail, but I wasn't
suggesting --xml as a serious alternative.  My point was that if we're
going to go through all the fuss of adding all these switches for
outputting the configuration file in some fixed format, why not go
with one that at least is universal in some sense (not necessarily
XML).  And, as Johannes already pointed out, it's very disturbing
having to dump a configuration file so that it is more easily read by
other programs.  That would suggest that the ini-based format for
git's configuration file is suboptimal.

Of course, once git is librified (which is still a long-term goal,
right?), languages could create bindings to the git library, which
would provide access functions to the configuration file.  Then we
would truly have that one parser to rule them all.

  nikolai

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 22:17                     ` Nikolai Weibull
@ 2007-01-16 22:37                       ` Jakub Narebski
  2007-01-16 22:56                         ` Johannes Schindelin
  2007-01-16 22:42                       ` [RFC] Git config file reader in Perl (WIP) Johannes Schindelin
  1 sibling, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-16 22:37 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Johannes Schindelin, Junio C Hamano, Eric Wong, git

Nikolai Weibull wrote:
> On 1/16/07, Jakub Narebski <jnareb@gmail.com> wrote:

>> But I agree that XML is serious overkill...
> 
> I don't know if it was clear from my first mail, but I wasn't
> suggesting --xml as a serious alternative.  My point was that if we're
> going to go through all the fuss of adding all these switches for
> outputting the configuration file in some fixed format, why not go
> with one that at least is universal in some sense (not necessarily
> XML).  And, as Johannes already pointed out, it's very disturbing
> having to dump a configuration file so that it is more easily read by
> other programs.  That would suggest that the ini-based format for
> git's configuration file is suboptimal.

No, ini-based, or rather ini-like format for git configuration
is nice, but I think git is too forgiving in accepting input.
Examples: section header and key/value pair in the same line,
allowing multiple quotes in in value part.

Well, the idea I had was to have --dump switch to git-repo-config
to dump init file as if it was created by git-repo-config invocations,
without any hand editing (canonical format).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 22:17                     ` Nikolai Weibull
  2007-01-16 22:37                       ` Jakub Narebski
@ 2007-01-16 22:42                       ` Johannes Schindelin
  2007-01-17 18:08                         ` Nikolai Weibull
  1 sibling, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-16 22:42 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Jakub Narebski, Junio C Hamano, Eric Wong, git

Hi,

On Tue, 16 Jan 2007, Nikolai Weibull wrote:

> And, as Johannes already pointed out, it's very disturbing having to 
> dump a configuration file so that it is more easily read by other 
> programs.

I never pointed out such a thing. The configuration file is meant to be 
user-friendly, as the inventor did not mean to have a program like 
git-repo-config.

> That would suggest that the ini-based format for git's configuration 
> file is suboptimal.

Not at all. Again, git's configuration file is meant for human inspection. 
Therefore, an ini-style file is optimal.

And for scripts, we do have git-repo-config. But now some people want to 
be clever and read the whole config file in, to make it easier to have 
gazillions of configuration options for their script.

I was not happy when we introduced more relaxed section titles, and I am 
not happy now that I see what problems we introduced with that.

> Of course, once git is librified (which is still a long-term goal,
> right?), [...]

You are welcome to contribute to that goal! AFAICT nobody is seriously 
working on that.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 22:37                       ` Jakub Narebski
@ 2007-01-16 22:56                         ` Johannes Schindelin
  2007-01-16 23:24                           ` Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-16 22:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Hi,

On Tue, 16 Jan 2007, Jakub Narebski wrote:

> Nikolai Weibull wrote:
> > On 1/16/07, Jakub Narebski <jnareb@gmail.com> wrote:
> 
> >> But I agree that XML is serious overkill...
> > 
> > I don't know if it was clear from my first mail, but I wasn't
> > suggesting --xml as a serious alternative.  My point was that if we're
> > going to go through all the fuss of adding all these switches for
> > outputting the configuration file in some fixed format, why not go
> > with one that at least is universal in some sense (not necessarily
> > XML).  And, as Johannes already pointed out, it's very disturbing
> > having to dump a configuration file so that it is more easily read by
> > other programs.  That would suggest that the ini-based format for
> > git's configuration file is suboptimal.
> 
> No, ini-based, or rather ini-like format for git configuration
> is nice,

Exactly.

> but I think git is too forgiving in accepting input. Examples: section 
> header and key/value pair in the same line, allowing multiple quotes in 
> in value part.

But this is nice to the user!

> Well, the idea I had was to have --dump switch to git-repo-config to 
> dump init file as if it was created by git-repo-config invocations, 
> without any hand editing (canonical format).

My point still stands: if you already parse the user-friendly format, why 
not dump a parse friendly format? If it weren't for those darn non-alnums 
in the keys, out put of "git repo-config -l" would be perfectly 
acceptable.

So, how about a "git repo-config --dump" which outputs a stream of NUL 
separated keys and values? This should be really easy to "parse", and 
there are no ambiguities: No key or value can contain a NUL.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 22:56                         ` Johannes Schindelin
@ 2007-01-16 23:24                           ` Jakub Narebski
  2007-01-17  8:51                             ` Johannes Schindelin
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-16 23:24 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Dnia wtorek 16. stycznia 2007 23:56, Johannes Schindelin napisał:
> 
> On Tue, 16 Jan 2007, Jakub Narebski wrote:

>> Well, the idea I had was to have --dump switch to git-repo-config to 
>> dump init file as if it was created by git-repo-config invocations, 
>> without any hand editing (canonical format).
> 
> My point still stands: if you already parse the user-friendly format, why 
> not dump a parse friendly format? If it weren't for those darn non-alnums 
> in the keys, out put of "git repo-config -l" would be perfectly 
> acceptable.
> 
> So, how about a "git repo-config --dump" which outputs a stream of NUL 
> separated keys and values? This should be really easy to "parse", and 
> there are no ambiguities: No key or value can contain a NUL.

Good idea, although "\n" would work as well as NUL.

The only problem is with "key without value" case, i.e. something like

  [section]
  	noval

which shows as

  section.noval

in "git repo-config -l" output (note missing '=' !), and I guess differs
for some case from

  [section]
  	noval = 

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 23:24                           ` Jakub Narebski
@ 2007-01-17  8:51                             ` Johannes Schindelin
  2007-01-17  9:48                               ` Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-17  8:51 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Hi,

On Wed, 17 Jan 2007, Jakub Narebski wrote:

> Dnia wtorek 16. stycznia 2007 23:56, Johannes Schindelin napisa?:
> > 
> > On Tue, 16 Jan 2007, Jakub Narebski wrote:
> 
> >> Well, the idea I had was to have --dump switch to git-repo-config to 
> >> dump init file as if it was created by git-repo-config invocations, 
> >> without any hand editing (canonical format).
> > 
> > My point still stands: if you already parse the user-friendly format, why 
> > not dump a parse friendly format? If it weren't for those darn non-alnums 
> > in the keys, out put of "git repo-config -l" would be perfectly 
> > acceptable.
> > 
> > So, how about a "git repo-config --dump" which outputs a stream of NUL 
> > separated keys and values? This should be really easy to "parse", and 
> > there are no ambiguities: No key or value can contain a NUL.
> 
> Good idea, although "\n" would work as well as NUL.

No it would not:

	[someSection]
		thisKey = has\na\nvalue\with\nseveral\nnewlines

> The only problem is with "key without value" case, i.e. something like
> 
>   [section]
>   	noval
> 
> which shows as
> 
>   section.noval

but is equivalent to

	[section]
		noval = true

Since it is by definition a boolean value.

> in "git repo-config -l" output (note missing '=' !), and I guess differs
> for some case from
> 
>   [section]
>   	noval = 

Yes, this is not a boolean. The difference is that the callback function 
is called with NULL in the former case, and with "" in the latter.

Hth,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17  8:51                             ` Johannes Schindelin
@ 2007-01-17  9:48                               ` Jakub Narebski
  2007-01-17 10:44                                 ` Johannes Schindelin
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-17  9:48 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Johannes Schindelin wrote:
> 
> On Wed, 17 Jan 2007, Jakub Narebski wrote:
> 
>> Dnia wtorek 16. stycznia 2007 23:56, Johannes Schindelin napisa?:

>>> So, how about a "git repo-config --dump" which outputs a stream of NUL 
>>> separated keys and values? This should be really easy to "parse", and 
>>> there are no ambiguities: No key or value can contain a NUL.
>> 
>> Good idea, although "\n" would work as well as NUL.
> 
> No it would not:
> 
> 	[someSection]
> 		thisKey = has\na\nvalue\with\nseveral\nnewlines

$ fatal: bad config file line <nn> in <config>

The same with quoted:

 	[someSection]
 		qthisKey = "has\na\nvalue\with\nseveral\nnewlines"

There is no escaping besides escaping " and escape character
i.e. escaping \ in git config. Se "\n" would work as well as NUL.
(It is said explicitely that subsection names cannot contain "\n").

>> The only problem is with "key without value" case, i.e. something like
>> 
>>   [section]
>>   	noval
>> 
>> which shows as
>> 
>>   section.noval
> 
> but is equivalent to
> 
> 	[section]
> 		noval = true
> 
> Since it is by definition a boolean value.

But only for "git repo-config --bool --get section.noval" output.
Semantically equivalent to "true".

But without --bool it returns like it was "".

>> in "git repo-config -l" output (note missing '=' !), and I guess differs
>> for some case from
>> 
>>   [section]
>>   	noval = 
> 
> Yes, this is not a boolean. The difference is that the callback function 
> is called with NULL in the former case, and with "" in the latter.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17  9:48                               ` Jakub Narebski
@ 2007-01-17 10:44                                 ` Johannes Schindelin
  2007-01-17 12:11                                   ` Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-17 10:44 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Hi,

On Wed, 17 Jan 2007, Jakub Narebski wrote:

> Johannes Schindelin wrote:
> > 
> > On Wed, 17 Jan 2007, Jakub Narebski wrote:
> > 
> >> Dnia wtorek 16. stycznia 2007 23:56, Johannes Schindelin napisa?:
> 
> >>> So, how about a "git repo-config --dump" which outputs a stream of NUL 
> >>> separated keys and values? This should be really easy to "parse", and 
> >>> there are no ambiguities: No key or value can contain a NUL.
> >> 
> >> Good idea, although "\n" would work as well as NUL.
> > 
> > No it would not:
> > 
> > 	[someSection]
> > 		thisKey = has\na\nvalue\with\nseveral\nnewlines
> 
> $ fatal: bad config file line <nn> in <config>

Yeah, sorry. But you got the point.

> The same with quoted:
> 
>  	[someSection]
>  		qthisKey = "has\na\nvalue\with\nseveral\nnewlines"
> 
> There is no escaping besides escaping " and escape character
> i.e. escaping \ in git config. Se "\n" would work as well as NUL.
> (It is said explicitely that subsection names cannot contain "\n").

So you want "git-repo-config --dump" to output something which has to be 
scanned for escaping sequences?

If you call

	$ git repo-config -l

you will _no longer_ see "\n"s, but rather newlines.

I don't know why you insist on newlines, when a NUL makes perfect sense: 
Take everything until the next NUL. This is the key. Then take everything 
until the next NUL. This is the value. Repeat until EOF.

> >> The only problem is with "key without value" case, i.e. something like
> >> 
> >>   [section]
> >>   	noval
> >> 
> >> which shows as
> >> 
> >>   section.noval
> > 
> > but is equivalent to
> > 
> > 	[section]
> > 		noval = true
> > 
> > Since it is by definition a boolean value.
> 
> But only for "git repo-config --bool --get section.noval" output.
> Semantically equivalent to "true".
> 
> But without --bool it returns like it was "".

Yes, it returns "", but this is _wrong_. A single "[section] noval" _only_ 
makes sense as a boolean. The information lies in its _presence_, which is 
as good as saying "true".

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 10:44                                 ` Johannes Schindelin
@ 2007-01-17 12:11                                   ` Jakub Narebski
  2007-01-17 12:37                                     ` Johannes Schindelin
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-17 12:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Johannes Schindelin wrote:
> 
> On Wed, 17 Jan 2007, Jakub Narebski wrote:
> 
>> Johannes Schindelin wrote:
>>> 
>>> On Wed, 17 Jan 2007, Jakub Narebski wrote:
>>> 
>>>> Johannes Schindelin wrote:
>>>> 
>>>>> So, how about a "git repo-config --dump" which outputs a stream of NUL 
>>>>> separated keys and values? This should be really easy to "parse", and 
>>>>> there are no ambiguities: No key or value can contain a NUL.
>>>> 
>>>> Good idea, although "\n" would work as well as NUL.
>>> 
>>> No it would not:
>>> 
>>> 	[someSection]
>>> 		thisKey = has\na\nvalue\with\nseveral\nnewlines
>> 
>> $ fatal: bad config file line <nn> in <config>
> 
> Yeah, sorry. But you got the point.

No, I don't got the point. No key or value can contain "\n".

>>>> The only problem is with "key without value" case, i.e. something like
>>>> 
>>>>   [section]
>>>>   	noval
>>>> 
>>>> which shows as
>>>> 
>>>>   section.noval
>>> 
>>> but is equivalent to
>>> 
>>> 	[section]
>>> 		noval = true
>>> 
>>> Since it is by definition a boolean value.
>> 
>> But only for "git repo-config --bool --get section.noval" output.
>> Semantically equivalent to "true".
>> 
>> But without --bool it returns like it was "".
> 
> Yes, it returns "", but this is _wrong_. A single "[section] noval" _only_ 
> makes sense as a boolean. The information lies in its _presence_, which is 
> as good as saying "true".

With "\n" as separator you can simply rrturn NUL in the noval case.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 12:11                                   ` Jakub Narebski
@ 2007-01-17 12:37                                     ` Johannes Schindelin
  2007-01-17 14:00                                       ` Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-17 12:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Hi,

On Wed, 17 Jan 2007, Jakub Narebski wrote:

> No key or value can contain "\n".

I just tried this:

	$ cat > .git/config << EOF
	[section] key = "Hello\nWorld"
	EOF
	$ git-repo-config -l
	section.key=Hello
	World

So, values _can_ contain newlines.

> With "\n" as separator you can simply rrturn NUL in the noval case.

Which would buy you what exactly? You can tell that the user did not say 
"noval = true", but "noval". Great. But the _effect_ should be the same!

Anyway, I realize you don't like my solution, so I will just shut up.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 12:37                                     ` Johannes Schindelin
@ 2007-01-17 14:00                                       ` Jakub Narebski
  2007-01-19 12:10                                         ` Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-17 14:00 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Johannes Schindelin wrote:
> 
> On Wed, 17 Jan 2007, Jakub Narebski wrote:
> 
>> No key or value can contain "\n".
> 
> I just tried this:
> 
> 	$ cat > .git/config << EOF
> 	[section] key = "Hello\nWorld"
> 	EOF
> 	$ git-repo-config -l
> 	section.key=Hello
> 	World
> 
> So, values _can_ contain newlines.

Sorry, my mistake. I haven't noticed that your previous example
the error was "\w", not embedded newlines, and that embedded newlines
work.
 
>> With "\n" as separator you can simply rrturn NUL in the noval case.
> 
> Which would buy you what exactly? You can tell that the user did not say 
> "noval = true", but "noval". Great. But the _effect_ should be the same!
> 
> Anyway, I realize you don't like my solution, so I will just shut up.

I like your solution.

The only ambiguity is how to deal with '[section] noval' case. You
propose to treat it as if it was '[section] noval = true' for --dump,
and not as if it was '[section] noval = ' or '[section] noval = ""'.
Good. But this _has_ to be explained in documentation.

That said, I still think that having alternate parser for a format
is a good idea. Otherwise it is not a format, but "something that
parser parses".

BTW. it looks like C escape sequences are parsed, but not octal
escape sequences, nor no-op escaping other character.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-16 22:42                       ` [RFC] Git config file reader in Perl (WIP) Johannes Schindelin
@ 2007-01-17 18:08                         ` Nikolai Weibull
  2007-01-17 19:22                           ` Jakub Narebski
                                             ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Nikolai Weibull @ 2007-01-17 18:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, Junio C Hamano, Eric Wong, git

On 1/16/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> On Tue, 16 Jan 2007, Nikolai Weibull wrote:
>
> > And, as Johannes already pointed out, it's very disturbing having to
> > dump a configuration file so that it is more easily read by other
> > programs.
>
> I never pointed out such a thing. The configuration file is meant to be
> user-friendly, as the inventor did not mean to have a program like
> git-repo-config.

Then what did you mean in your mail from Jan 16, 2007 12:12 PM:

  How silly would that be: we parse an easy-to-read format,
  munge the easy-to-handle internal data format into another "easy-to-read"
  format which is then parsed by a script language into an easy-to-handle
  internal data format? No. NO.

?

> > That would suggest that the ini-based format for git's configuration
> > file is suboptimal.

> Not at all. Again, git's configuration file is meant for human inspection.
> Therefore, an ini-style file is optimal.

It is suboptimal because it's hard for computers to inspect.  An
optimal format would be accessible by all, whether human, machine,
robot, android, what have you.

I really don't like that people take my statements and somehow contort
them into meaning something else.  You can't take my statement and
then claim that because git's configuration file was meant for human
inspection my /suggestion/ about it being suboptimal is somehow flawed
because I didn't consider the original intent of the file in question.

Agreed, ini-style files are sweet for humans, but that's not what I
was saying.  My claim was that ini-style files aren't necessarily
optimal for parsing by computers.  I think that was made clear.  Also,
I only /suggested/ that it may be /suboptimal/ not that it in fact
/is/ suboptimal.

> And for scripts, we do have git-repo-config. But now some people want to
> be clever and read the whole config file in, to make it easier to have
> gazillions of configuration options for their script.

I maintain the git completion-definition for Zsh.  Being able to get
at the configuration settings is important, both for the completion of
git-repo-config, but also to provide completions of remote
repositories.  'git-repo-config -l' works fine, but it's not
guaranteed to be unambiguous, as the name of a remote repository can
contain characters like '=', so it can be difficult to decide where to
split the name of the remote repository from the url it represents as
the url can also contain '='.

> I was not happy when we introduced more relaxed section titles, and I am
> not happy now that I see what problems we introduced with that.

I'm with you on this one.

  nikolai

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 18:08                         ` Nikolai Weibull
@ 2007-01-17 19:22                           ` Jakub Narebski
  2007-01-17 20:01                             ` Nikolai Weibull
  2007-01-17 19:25                           ` Jakub Narebski
  2007-01-18  0:50                           ` Johannes Schindelin
  2 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-17 19:22 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Johannes Schindelin, Junio C Hamano, Eric Wong, git

Nikolai Weibull wrote:

> I maintain the git completion-definition for Zsh.

Could you add it to contrib/completion/ or at least put link
in http://git.or.cz/gitwiki/InterfacesFrontendsAndTools ?

TIA
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 18:08                         ` Nikolai Weibull
  2007-01-17 19:22                           ` Jakub Narebski
@ 2007-01-17 19:25                           ` Jakub Narebski
  2007-01-18  0:50                           ` Johannes Schindelin
  2 siblings, 0 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-17 19:25 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Johannes Schindelin, Junio C Hamano, Eric Wong, git

Nikolai Weibull wrote:
> On 1/16/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

>> I was not happy when we introduced more relaxed section titles, and I am
>> not happy now that I see what problems we introduced with that.
> 
> I'm with you on this one.

More relaxed section (actually _sub_section) titles are not the problem.
The fact that they are case sensitive (where section and key names
are not) can be.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 19:22                           ` Jakub Narebski
@ 2007-01-17 20:01                             ` Nikolai Weibull
  0 siblings, 0 replies; 54+ messages in thread
From: Nikolai Weibull @ 2007-01-17 20:01 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On 1/17/07, Jakub Narebski <jnareb@gmail.com> wrote:
> Nikolai Weibull wrote:
>
> > I maintain the git completion-definition for Zsh.
>
> Could you add it to contrib/completion/ or at least put link
> in http://git.or.cz/gitwiki/InterfacesFrontendsAndTools ?

I did announce it way back when, but since it's included with new
releases of Zsh there's really nothing interesting to add or link to.
It's also in Zsh's CVS repository, but I do my updates here:

  http://git.bitwi.se/?p=dot-home.git;a=tree

although I now realize that I haven't pushed some recent updates to
that repository yet.

  nikolai

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 18:08                         ` Nikolai Weibull
  2007-01-17 19:22                           ` Jakub Narebski
  2007-01-17 19:25                           ` Jakub Narebski
@ 2007-01-18  0:50                           ` Johannes Schindelin
  2 siblings, 0 replies; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-18  0:50 UTC (permalink / raw)
  To: Nikolai Weibull; +Cc: Jakub Narebski, Junio C Hamano, Eric Wong, git

Hi,

On Wed, 17 Jan 2007, Nikolai Weibull wrote:

> On 1/16/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> 
> > On Tue, 16 Jan 2007, Nikolai Weibull wrote:
> > 
> > > And, as Johannes already pointed out, it's very disturbing having to
> > > dump a configuration file so that it is more easily read by other
> > > programs.
> > 
> > I never pointed out such a thing. The configuration file is meant to be
> > user-friendly, as the inventor did not mean to have a program like
> > git-repo-config.
> 
> Then what did you mean in your mail from Jan 16, 2007 12:12 PM:
> 
>  How silly would that be: we parse an easy-to-read format,
>  munge the easy-to-handle internal data format into another "easy-to-read"
>  format which is then parsed by a script language into an easy-to-handle
>  internal data format? No. NO.
> 
> ?

It was about the human-readable -> internal -> human-readable -> internal
chain. Those are way too many transformations for little gain.

IMHO modern programs spend 99% of the time are spent transforming data. 
Most of them is unnecessary. That's bad.

> > > That would suggest that the ini-based format for git's configuration
> > > file is suboptimal.
> 
> > Not at all. Again, git's configuration file is meant for human inspection.
> > Therefore, an ini-style file is optimal.
> 
> It is suboptimal because it's hard for computers to inspect.  An
> optimal format would be accessible by all, whether human, machine,
> robot, android, what have you.

There's a reason we program in C, Java, etc. and not in Assembler. 
Sometimes computers and humans are too different to be able to cater for 
both of them at the same time.

I know, I know. The geek inside me disagrees, too. But my experience 
doesn't.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-17 14:00                                       ` Jakub Narebski
@ 2007-01-19 12:10                                         ` Jakub Narebski
  2007-01-19 12:25                                           ` Jakub Narebski
  2007-01-19 13:20                                           ` Johannes Schindelin
  0 siblings, 2 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-19 12:10 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Jakub Narebski wrote:

> BTW. it looks like C escape sequences are parsed, but not octal
> escape sequences, nor no-op escaping other character.
 
From a bit of testing, as documentation of config file format
is woefully incomplete, (yes, I know I should use the source)
_some_ of C escape sequences aka. character escape codes (CEC)
are parsed:
 - "\t", tab character (HT, TAB)
 - "\n", newline (NL)
 - "\b", backspace (BS)
It's a bit strange that "\b" is parsed...

Other CEC cause "fatal: bad config file line 41 <nn> in <file>"
error:
 - "\r", return (CR)
 - "\f", form feed (FF)
 - "\a", alarm (bell) (BEL)
 - "\e", escape (ESC)
 - "\v", vertical tab (VT)
I think supporting "\e" could be useful, but also dangerous.

Octal char sequences (like \040 or \0 for NUL) are not supported
and result in "bad config file" error.

Literal/unknown escape sequences like "\w" or "\." also result
in "bad config file" error, with the obvious exception of escaping
quote character i.e. \" and escaping escape character i.e. \\

Values of configuration variables can span multiple lines by escaping
newline, i.e. putting \ as the last character. Nice. (This doesn't work
for comments, of course). This doesn't work for section headers nor key 
names.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-19 12:10                                         ` Jakub Narebski
@ 2007-01-19 12:25                                           ` Jakub Narebski
  2007-01-19 13:20                                           ` Johannes Schindelin
  1 sibling, 0 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-19 12:25 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Jakub Narebski wrote:
> Jakub Narebski wrote:
> 
>> BTW. it looks like C escape sequences are parsed, but not octal
>> escape sequences, nor no-op escaping other character.
>  
> From a bit of testing, as documentation of config file format
> is woefully incomplete, (yes, I know I should use the source)
> _some_ of C escape sequences aka. character escape codes (CEC)
> are parsed:
>  - "\t", tab character (HT, TAB)
>  - "\n", newline (NL)
>  - "\b", backspace (BS)
> It's a bit strange that "\b" is parsed...
[...]
> Values of configuration variables can span multiple lines by escaping
> newline, i.e. putting \ as the last character. Nice.

>From config.c:parse_value:73

		if (c == '\\') {
			c = get_next_char();
			switch (c) {
			case '\n':
				continue;
			case 't':
				c = '\t';
				break;
			case 'b':
				c = '\b';
				break;
			case 'n':
				c = '\n';
				break;
			/* Some characters escape as themselves */
			case '\\': case '"':
				break;
			/* Reject unknown escape sequences */
			default:
				return NULL;
			}

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-19 12:10                                         ` Jakub Narebski
  2007-01-19 12:25                                           ` Jakub Narebski
@ 2007-01-19 13:20                                           ` Johannes Schindelin
  2007-01-19 22:44                                             ` Jakub Narebski
  1 sibling, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-19 13:20 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Hi,

On Fri, 19 Jan 2007, Jakub Narebski wrote:

> From a bit of testing, as documentation of config file format is 
> woefully incomplete, (yes, I know I should use the source) _some_ of C 
> escape sequences aka. character escape codes (CEC) are parsed:

No, you should not just use the source. You should use the source _and_ 
complete the documentation.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-19 13:20                                           ` Johannes Schindelin
@ 2007-01-19 22:44                                             ` Jakub Narebski
  2007-01-20  0:08                                               ` Johannes Schindelin
  2007-01-20  0:19                                               ` Junio C Hamano
  0 siblings, 2 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-19 22:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nikolai Weibull, Junio C Hamano, Eric Wong, git

Johannes Schindelin wrote:
> 
> On Fri, 19 Jan 2007, Jakub Narebski wrote:
> 
>> From a bit of testing, as documentation of config file format is 
>> woefully incomplete, (yes, I know I should use the source) _some_ of C 
>> escape sequences aka. character escape codes (CEC) are parsed:
> 
> No, you should not just use the source. You should use the source _and_ 
> complete the documentation.

Something like the patch below? Untested! ("make doc" up to 
git-repo-config.txt compiles, though).

I'm not sure how to tell that you can have [section] if you have
[section "subsection"], but you don't need to. And I probably forgot
to add some information.

And I'm not sure if some behavior should not be changed, for example
allowing _any_ line to be continued with `\`, or that other character
escape sequences and perhaps also octal character sequences should be
allowed (either that or `\b` should not be parsed).

I can send proper patch if requested, but I'd rather above issues
were resolved first.

diff --git a/Documentation/config.txt b/Documentation/config.txt
index da7fde5..9544308 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -14,14 +14,53 @@ dot-separated segment and the section name is everything before the last
 dot. The variable names are case-insensitive and only alphanumeric
 characters are allowed. Some variables may appear multiple times.
 
+Syntax
+~~~~~~
+
 The syntax is fairly flexible and permissive; whitespaces are mostly
-ignored. The '#' and ';' characters begin comments to the end of line,
-blank lines are ignored, lines containing strings enclosed in square
-brackets start sections and all the other lines are recognized
-as setting variables, in the form 'name = value'. If there is no equal
-sign on the line, the entire line is taken as 'name' and the variable
-is recognized as boolean "true". String values may be entirely or partially
-enclosed in double quotes; some variables may require special value format.
+ignored.  The '#' and ';' characters begin comments to the end of line,
+blank lines are ignored.
+
+The file consists of sections and variables.  A section begins with
+the name of the section in square brackets and continues until the next
+section begins.  Section names are not case sensitive.  Each variable
+must belong to some section, which means that there must be section
+header before first setting of a variable.
+
+Sections can be further divided into subsections.  To begin a subsection
+put it name in double quotes, separated by space from the section name,
+in the section header, like in example below
+
+	[section "subsection"]
+
+Subsection names can contain whitespace and are case sensitive.  Variables
+may belong directly to a section, or to a given subsection.
+
+All the other lines are recognized as setting variables, in the form
+'name = value'. If there is no equal sign on the line, the entire line
+is taken as 'name' and the variable is recognized as boolean "true".
+Variable names are case insensitive.  There can be more than one value
+for a given variable; we say then that variable is multivalued.
+
+Leading and trailing whitespace in a variable value is discarded.
+Internal whitespace within a variable value is retained verbatim.
+
+String values may be entirely or partially enclosed in double quotes.
+You need to enclose variable value in double quotes if you want to
+preserve leading or trailing whitespace, or if variable value contains
+beginning of comment characters, it means if it contains `#` or `;`.
+Double quote `"` and backslash `\` characters in variable value must
+be escaped: use `\"` for `"`, and `\\` for `\`.
+
+The following escape sequences (beside `\"` and `\\`) are recognized:
+`\n` for newline character (NL), `\t` for horizontal tabulation (HT, TAB)
+and `\b` for backspace (BS).  No other character escape codes, nor octal
+char sequences are valid.
+
+Variable value ending in a `\` is continued on the next line in the 
+customary UNIX fashion.
+
+Some variables may require special value format.
 
 Example
 ~~~~~~~

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-19 22:44                                             ` Jakub Narebski
@ 2007-01-20  0:08                                               ` Johannes Schindelin
  2007-01-20  0:59                                                 ` Jakub Narebski
  2007-01-20  0:19                                               ` Junio C Hamano
  1 sibling, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-20  0:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, git

Hi,

On Fri, 19 Jan 2007, Jakub Narebski wrote:

> +Sections can be further divided into subsections.  To begin a subsection
> +put it name in double quotes, separated by space from the section name,

       ^ "its"

> +in the section header, like in example below
> +
> +	[section "subsection"]

I wonder if we should also mention the (case insensitive) alternative 
"[section.subsection]", to give a better idea to people why we actually 
check for "section.subsection" in the code.

> +All the other lines are recognized as setting variables, in the form
> +'name = value'. If there is no equal sign on the line, the entire line
> +is taken as 'name' and the variable is recognized as boolean "true".
> +Variable names are case insensitive.

They cannot contain anything else than alphanumeric characters, in 
particular no whitespace.

>					 There can be more than one value
> +for a given variable; we say then that variable is multivalued.

Maybe give the example of "remote.<name>.fetch" to explain why? Or maybe 
not.

> +The following escape sequences (beside `\"` and `\\`) are recognized:
> +`\n` for newline character (NL), `\t` for horizontal tabulation (HT, TAB)
> +and `\b` for backspace (BS).  No other character escape codes, nor octal
> +char sequences are valid.

I did not even know about BS! Does it make sense to allow it, really?

> +Some variables may require special value format.

I think you can safely skip that; it should be evident that the format of 
the variables depends on the purpose.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-19 22:44                                             ` Jakub Narebski
  2007-01-20  0:08                                               ` Johannes Schindelin
@ 2007-01-20  0:19                                               ` Junio C Hamano
  2007-01-20  1:25                                                 ` [PATCH] config_set_multivar(): disallow newlines in keys Johannes Schindelin
  2007-01-20 14:03                                                 ` [PATCH] Documentation/config.txt: Document config file syntax better Jakub Narebski
  1 sibling, 2 replies; 54+ messages in thread
From: Junio C Hamano @ 2007-01-20  0:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Nikolai Weibull, Johannes Schindelin, Eric Wong, git

Jakub Narebski <jnareb@gmail.com> writes:

> I'm not sure how to tell that you can have [section] if you have
> [section "subsection"], but you don't need to.

s/I'm not .*that//; would be enough, I think.

> And I'm not sure if some behavior should not be changed, for example
> allowing _any_ line to be continued with `\`, or that other character
> escape sequences and perhaps also octal character sequences should be
> allowed (either that or `\b` should not be parsed).

I do not think we need continuation line for [] headers, if that
is what you mean.  I offhand do not think anybody would scream
if we start parsing full c-quote.

One thing that left me puzzled after reading the description was
what a user can do with "subsection".  It is unclear from the
description if [section "sub.section"], [section"sub.sec=ti.on"]
or worse yet, [section "sub\nsection with an embbedded LF"] are
allowed.  The rest seemed sane.

I think the current repo-config handles sane cases alright, but
it is still fragile in error cases.  For example:

	$ git repo-config 'foo.bar=bzz
          baz.boo' foobar

does not currently barf, but results in a corrupted config file.

	$ git repo-config 'foo.bar=bzz
          baz.boo'
	fatal: bad config file line 56 in .git/config

        $ sed -ne '55,$p' .git/config
        [foo "bar=bzz
        baz"]
                boo = foobar

The only way we use the subsection names for (almost arbitrary)
end user string is to store branch names, so LF is not an issue
and we could forbid it (if need arises loosening the restriction
while updating the code to behave sanely is easier than leaving
it open without properly checking).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFC] Git config file reader in Perl (WIP)
  2007-01-20  0:08                                               ` Johannes Schindelin
@ 2007-01-20  0:59                                                 ` Jakub Narebski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-20  0:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git

Johannes Schindelin wrote:
> 
> On Fri, 19 Jan 2007, Jakub Narebski wrote: 

>> +in the section header, like in example below
>> +
>> +	[section "subsection"]
> 
> I wonder if we should also mention the (case insensitive) alternative 
> "[section.subsection]", to give a better idea to people why we actually 
> check for "section.subsection" in the code.

I'm not sure. "section.subsection" can be treated as section with that
name, strange because it has '.' in it, and not subsection "subsection"
of section "section". git-repo-config writes section with subsection
in above format.
 
>> +All the other lines are recognized as setting variables, in the form
>> +'name = value'. If there is no equal sign on the line, the entire line
>> +is taken as 'name' and the variable is recognized as boolean "true".
>> +Variable names are case insensitive.
> 
> They cannot contain anything else than alphanumeric characters, in 
> particular no whitespace.

Contrary to for example also ini-like smb.conf:
  Leading, trailing and _internal_ whitespace in section and parameter
  names is  irrelevant.

>>					 There can be more than one value
>> +for a given variable; we say then that variable is multivalued.
> 
> Maybe give the example of "remote.<name>.fetch" to explain why? Or maybe 
> not.

I put it because in this git config differs from standard ini-format.

>> +The following escape sequences (beside `\"` and `\\`) are recognized:
>> +`\n` for newline character (NL), `\t` for horizontal tabulation (HT, TAB)
>> +and `\b` for backspace (BS).  No other character escape codes, nor octal
>> +char sequences are valid.
> 
> I did not even know about BS! Does it make sense to allow it, really?

That is the question I'm also asking...

>> +Some variables may require special value format.
> 
> I think you can safely skip that; it should be evident that the format of 
> the variables depends on the purpose.

This was in the original, and I think it is better left (at least for now).

I wonder if to write about --bool and --int formats...
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-20  0:19                                               ` Junio C Hamano
@ 2007-01-20  1:25                                                 ` Johannes Schindelin
  2007-01-20  1:40                                                   ` Junio C Hamano
  2007-01-22 15:06                                                   ` Alex Riesen
  2007-01-20 14:03                                                 ` [PATCH] Documentation/config.txt: Document config file syntax better Jakub Narebski
  1 sibling, 2 replies; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-20  1:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


This will no longer work:

$ git repo-config 'key.with
newline' some-value

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
	On Fri, 19 Jan 2007, Junio C Hamano wrote:
	
	> I think the current repo-config handles sane cases alright, but
	> it is still fragile in error cases.  For example:
	> 
	> 	$ git repo-config 'foo.bar=bzz
	>           baz.boo' foobar
	> 
	> does not currently barf, but results in a corrupted config file.

	Now it barfs.

 config.c               |    5 +++++
 t/t1300-repo-config.sh |    6 ++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/config.c b/config.c
index b6082f5..c08c668 100644
--- a/config.c
+++ b/config.c
@@ -661,6 +661,11 @@ int git_config_set_multivar(const char* key, const char* value,
 				goto out_free;
 			}
 			c = tolower(c);
+		} else if (c == '\n') {
+			fprintf(stderr, "invalid key (newline): %s\n", key);
+			free(store.key);
+			ret = 1;
+			goto out_free;
 		}
 		store.key[i] = c;
 	}
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 60acdd3..eb7455b 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -418,5 +418,11 @@ EOF
 
 test_expect_success 'quoting' 'cmp .git/config expect'
 
+test_expect_failure 'key with newline' 'git repo-config key.with\\\
+newline 123'
+
+test_expect_success 'value with newline' 'git repo-config key.sub value.with\\\
+newline'
+
 test_done
 
-- 
1.5.0.rc1.g5a400-dirty

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-20  1:25                                                 ` [PATCH] config_set_multivar(): disallow newlines in keys Johannes Schindelin
@ 2007-01-20  1:40                                                   ` Junio C Hamano
  2007-01-22 15:06                                                   ` Alex Riesen
  1 sibling, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2007-01-20  1:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Thanks.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH] Documentation/config.txt: Document config file syntax better
  2007-01-20  0:19                                               ` Junio C Hamano
  2007-01-20  1:25                                                 ` [PATCH] config_set_multivar(): disallow newlines in keys Johannes Schindelin
@ 2007-01-20 14:03                                                 ` Jakub Narebski
  2007-01-22 15:25                                                   ` Jakub Narebski
  1 sibling, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-20 14:03 UTC (permalink / raw)
  To: git, Junio C Hamano, Johannes Schindelin
  Cc: Nikolai Weibull, Eric Wong, Jakub Narebski

Separate part of Documentation/config.txt which deals with git config file
syntax into "Syntax" subsection, and expand it.  Add information about
subsections, boolean values, escaping and escape sequences in string
values, and continuing variable value on the next line.

Add also proxy settings to config file example to show example of
partially enclosed in double quotes string value.

Parts based on comments by Junio C Hamano, Johannes Schindelin,
and the smb.conf(5) man page.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
On Sat, 20 Jan 2007, Johannes Schindelin wrote:
> On Fri, 19 Jan 2007, Jakub Narebski wrote:

>> +in the section header, like in example below
>> +
>> +	[section "subsection"]
> 
> I wonder if we should also mention the (case insensitive) alternative 
> "[section.subsection]", to give a better idea to people why we actually 
> check for "section.subsection" in the code.

Added one line note about this.

>> +All the other lines are recognized as setting variables, in the form
>> +'name = value'. If there is no equal sign on the line, the entire line
>> +is taken as 'name' and the variable is recognized as boolean "true".
>> +Variable names are case insensitive.
> 
> They cannot contain anything else than alphanumeric characters, in 
> particular no whitespace.

It is mentioned above "Syntax" section, but perhaps it should be repeated.
I haven't took a look at code to check what values for section names and
for key/variable names are allowed.

>> +Some variables may require special value format.
> 
> I think you can safely skip that; it should be evident that the format of 
> the variables depends on the purpose.

This was in the original, and I think it is better left (at least for now).


Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> I'm not sure how to tell that you can have [section] if you have
>> [section "subsection"], but you don't need to.
> 
> s/I'm not .*that//; would be enough, I think.

Thanks for suggestion. I have used it (although perhaps the preceding
sentence is now not needed).

> One thing that left me puzzled after reading the description was
> what a user can do with "subsection".  It is unclear from the
> description if [section "sub.section"], [section "sub.sec=ti.on"]
> or worse yet, [section "sub\nsection with an embbedded LF"] are
> allowed.  The rest seemed sane.

I'm not sure what is allowed in section name, and in subsection name,
so for now I have left it as is. I can amend this commit, or add new
commit explaining this.


BTW. currently one of examples from git-repo-config(1) doesn't work:

  To add a new proxy, without altering any of the existing ones, use
  
  ------------
  % git repo-config core.gitproxy '"proxy" for example.com'
  ------------

I think it would be better instead of adding quotes if needed, just
do _not_ escape quotes. But that leaves the problem what to do if one
puts value with trailing or leading whitespace, or comment character
outside quotes using git-repo-config...


 Documentation/config.txt |   69 +++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index da7fde5..03133e2 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -14,14 +14,65 @@ dot-separated segment and the section name is everything before the last
 dot. The variable names are case-insensitive and only alphanumeric
 characters are allowed. Some variables may appear multiple times.
 
+Syntax
+~~~~~~
+
 The syntax is fairly flexible and permissive; whitespaces are mostly
-ignored. The '#' and ';' characters begin comments to the end of line,
-blank lines are ignored, lines containing strings enclosed in square
-brackets start sections and all the other lines are recognized
-as setting variables, in the form 'name = value'. If there is no equal
-sign on the line, the entire line is taken as 'name' and the variable
-is recognized as boolean "true". String values may be entirely or partially
-enclosed in double quotes; some variables may require special value format.
+ignored.  The '#' and ';' characters begin comments to the end of line,
+blank lines are ignored.
+
+The file consists of sections and variables.  A section begins with
+the name of the section in square brackets and continues until the next
+section begins.  Section names are not case sensitive.  Each variable
+must belong to some section, which means that there must be section
+header before first setting of a variable.
+
+Sections can be further divided into subsections.  To begin a subsection
+put its name in double quotes, separated by space from the section name,
+in the section header, like in example below:
+
+--------
+	[section "subsection"]
+
+--------
+
+Subsection names can contain whitespace and are case sensitive.  Variables
+may belong directly to a section, or to a given subsection.  You can have
+`[section]` if you have `[section "subsection"]`, but you don't need to.
+
+There is also (case insensitive) alternative `[section.subsection]` syntax.
+
+All the other lines are recognized as setting variables, in the form
+'name = value'. If there is no equal sign on the line, the entire line
+is taken as 'name' and the variable is recognized as boolean "true".
+Variable names are case insensitive.  There can be more than one value
+for a given variable; we say then that variable is multivalued.
+
+Leading and trailing whitespace in a variable value is discarded.
+Internal whitespace within a variable value is retained verbatim.
+
+The values following the equals sign in variable assign are all either
+a string, an integer, or a boolean.  Boolean values may be given as yes/no,
+0/1 or true/false.  Case is not significant in boolean values, when
+converting value to the canonical form using '--bool' type specifier;
+git-repo-config will ensure that the output is "true" or "false".
+
+String values may be entirely or partially enclosed in double quotes.
+You need to enclose variable value in double quotes if you want to
+preserve leading or trailing whitespace, or if variable value contains
+beginning of comment characters (if it contains '#' or ';').
+Double quote '`"`' and backslash '`\`' characters in variable value must
+be escaped: use '`\"`' for '`"`' and '`\\`' for '`\`'.
+
+The following escape sequences (beside '`\"`' and '`\\`') are recognized:
+'`\n`' for newline character (NL), '`\t`' for horizontal tabulation (HT, TAB)
+and '`\b`' for backspace (BS).  No other char escape sequence, nor octal
+char sequences are valid.
+
+Variable value ending in a '`\`' is continued on the next line in the
+customary UNIX fashion.
+
+Some variables may require special value format.
 
 Example
 ~~~~~~~
@@ -40,6 +91,10 @@ Example
 		remote = origin
 		merge = refs/heads/devel
 
+	# Proxy settings
+	[core]
+		gitProxy="ssh" for "ssh://kernel.org/"
+		gitProxy=default-proxy ; for the rest
 
 Variables
 ~~~~~~~~~
-- 
1.4.4.3

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-20  1:25                                                 ` [PATCH] config_set_multivar(): disallow newlines in keys Johannes Schindelin
  2007-01-20  1:40                                                   ` Junio C Hamano
@ 2007-01-22 15:06                                                   ` Alex Riesen
  2007-01-22 15:21                                                     ` Johannes Schindelin
  1 sibling, 1 reply; 54+ messages in thread
From: Alex Riesen @ 2007-01-22 15:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git

On 1/20/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> --- a/config.c
> +++ b/config.c
> @@ -661,6 +661,11 @@ int git_config_set_multivar(const char* key, const char* value,
>                                 goto out_free;
>                         }
>                         c = tolower(c);
> +               } else if (c == '\n') {
> +                       fprintf(stderr, "invalid key (newline): %s\n", key);

BTW, why config.c never uses error() or warn()?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-22 15:06                                                   ` Alex Riesen
@ 2007-01-22 15:21                                                     ` Johannes Schindelin
  2007-01-22 15:33                                                       ` Alex Riesen
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-22 15:21 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git

Hi,

On Mon, 22 Jan 2007, Alex Riesen wrote:

> On 1/20/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > --- a/config.c
> > +++ b/config.c
> > @@ -661,6 +661,11 @@ int git_config_set_multivar(const char* key, const
> > char* value,
> >                                 goto out_free;
> >                         }
> >                         c = tolower(c);
> > +               } else if (c == '\n') {
> > +                       fprintf(stderr, "invalid key (newline): %s\n", key);
> 
> BTW, why config.c never uses error() or warn()?

Mainly because error() was meant to be used as "return error("blabla");", 
and we tried to discern different failures by different return values.

But yeah, I think it would be acceptable to use error() instead of 
fprintf() even then.

BTW IMHO we will probably never libify git; too many too complicated cases 
exist already.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH] Documentation/config.txt: Document config file syntax better
  2007-01-20 14:03                                                 ` [PATCH] Documentation/config.txt: Document config file syntax better Jakub Narebski
@ 2007-01-22 15:25                                                   ` Jakub Narebski
  2007-01-24 14:14                                                     ` [PATCH 2/1] Documentation/config.txt: Correct info about subsection name Jakub Narebski
  0 siblings, 1 reply; 54+ messages in thread
From: Jakub Narebski @ 2007-01-22 15:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jakub Narebski

Separate part of Documentation/config.txt which deals with git config file
syntax into "Syntax" subsection, and expand it.  Add information about
subsections, boolean values, escaping and escape sequences in string
values, and continuing variable value on the next line.

Add also proxy settings to config file example to show example of
partially enclosed in double quotes string value.

Parts based on comments by Junio C Hamano, Johannes Schindelin,
config.c, and the smb.conf(5) man page.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
Junio C Hamano wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
>  
>> What about my Documentation/config.txt changes?
> 
> I was not sure about that one, given a lot of commentary in your
> message, suggesting more research and revision is needed, like
> these...
> 
>>>> +All the other lines are recognized as setting variables, in the form
>>>> +'name = value'. If there is no equal sign on the line, the entire line
>>>> +is taken as 'name' and the variable is recognized as boolean "true".
>>>> +Variable names are case insensitive.
>>> 
>>> They cannot contain anything else than alphanumeric characters, in 
>>> particular no whitespace.
>>
>> It is mentioned above "Syntax" section, but perhaps it should be repeated.
>> I haven't took a look at code to check what values for section names and
>> for key/variable names are allowed.
>> ...
>>> One thing that left me puzzled after reading the description was
>>> what a user can do with "subsection".  It is unclear from the
>>> description if [section "sub.section"], [section "sub.sec=ti.on"]
>>> or worse yet, [section "sub\nsection with an embbedded LF"] are
>>> allowed.  The rest seemed sane.
>>
>> I'm not sure what is allowed in section name, and in subsection name,
>> so for now I have left it as is. I can amend this commit, or add new
>> commit explaining this.

I hope that this is satisfactory.


I haven't wrote about current limits on the lengths: 256/2 for section
plus subsection name length, 256 for fully qualified variable name,
1024 for value length; I think this does not belong to end user
documentation.

 Documentation/config.txt |   76 +++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f1f409d..77a2b16 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -14,14 +14,72 @@ dot-separated segment and the section name is everything before the last
 dot. The variable names are case-insensitive and only alphanumeric
 characters are allowed. Some variables may appear multiple times.
 
+Syntax
+~~~~~~
+
 The syntax is fairly flexible and permissive; whitespaces are mostly
-ignored. The '#' and ';' characters begin comments to the end of line,
-blank lines are ignored, lines containing strings enclosed in square
-brackets start sections and all the other lines are recognized
-as setting variables, in the form 'name = value'. If there is no equal
-sign on the line, the entire line is taken as 'name' and the variable
-is recognized as boolean "true". String values may be entirely or partially
-enclosed in double quotes; some variables may require special value format.
+ignored.  The '#' and ';' characters begin comments to the end of line,
+blank lines are ignored.
+
+The file consists of sections and variables.  A section begins with
+the name of the section in square brackets and continues until the next
+section begins.  Section names are not case sensitive.  Only alphanumeric
+characters, '`-`' and '`.`' are allowed in section names.  Each variable
+must belong to some section, which means that there must be section
+header before first setting of a variable.
+
+Sections can be further divided into subsections.  To begin a subsection
+put its name in double quotes, separated by space from the section name,
+in the section header, like in example below:
+
+--------
+	[section "subsection"]
+
+--------
+
+Subsection names can contain any characters (doublequote '`"`', backslash
+'`\`' and newline have to be entered escaped as '`\"`', '`\\`' and '`\n`',
+respecitvely) and are case sensitive.  Section header cannot span multiple
+lines.  Variables may belong directly to a section or to a given subsection.
+You can have `[section]` if you have `[section "subsection"]`, but you
+don't need to.
+
+There is also (case insensitive) alternative `[section.subsection]` syntax.
+In this syntax subsection names follow the same restrictions as for section
+name.
+
+All the other lines are recognized as setting variables, in the form
+'name = value'.  If there is no equal sign on the line, the entire line
+is taken as 'name' and the variable is recognized as boolean "true".
+The variable names are case-insensitive and only alphanumeric
+characters and '`-`' are allowed.  There can be more than one value
+for a given variable; we say then that variable is multivalued.
+
+Leading and trailing whitespace in a variable value is discarded.
+Internal whitespace within a variable value is retained verbatim.
+
+The values following the equals sign in variable assign are all either
+a string, an integer, or a boolean.  Boolean values may be given as yes/no,
+0/1 or true/false.  Case is not significant in boolean values, when
+converting value to the canonical form using '--bool' type specifier;
+`git-repo-config` will ensure that the output is "true" or "false".
+
+String values may be entirely or partially enclosed in double quotes.
+You need to enclose variable value in double quotes if you want to
+preserve leading or trailing whitespace, or if variable value contains
+beginning of comment characters (if it contains '#' or ';').
+Double quote '`"`' and backslash '`\`' characters in variable value must
+be escaped: use '`\"`' for '`"`' and '`\\`' for '`\`'.
+
+The following escape sequences (beside '`\"`' and '`\\`') are recognized:
+'`\n`' for newline character (NL), '`\t`' for horizontal tabulation (HT, TAB)
+and '`\b`' for backspace (BS).  No other char escape sequence, nor octal
+char sequences are valid.
+
+Variable value ending in a '`\`' is continued on the next line in the
+customary UNIX fashion.
+
+Some variables may require special value format.
 
 Example
 ~~~~~~~
@@ -40,6 +98,10 @@ Example
 		remote = origin
 		merge = refs/heads/devel
 
+	# Proxy settings
+	[core]
+		gitProxy="ssh" for "ssh://kernel.org/"
+		gitProxy=default-proxy ; for the rest
 
 Variables
 ~~~~~~~~~
-- 
1.4.4.4

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-22 15:21                                                     ` Johannes Schindelin
@ 2007-01-22 15:33                                                       ` Alex Riesen
  2007-01-22 15:44                                                         ` Johannes Schindelin
  0 siblings, 1 reply; 54+ messages in thread
From: Alex Riesen @ 2007-01-22 15:33 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On 1/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> BTW IMHO we will probably never libify git; too many too complicated cases
> exist already.

We probably don't need it libified completely. Just reading and writing of
the object database and tags/references and very simple revision walking
will be a very good start and possibly enough for anything which _uses_ the
database, like importers and exporters.from other VCS or just programs which
need a versioned storage. Comparing, archeology and maintenance tasks are
more often done manually and can wait (they do now, don't they?) until the
rest of diff, patch and revision walking machinery libified properly.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-22 15:33                                                       ` Alex Riesen
@ 2007-01-22 15:44                                                         ` Johannes Schindelin
  2007-01-22 16:09                                                           ` Alex Riesen
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-22 15:44 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git

Hi,

On Mon, 22 Jan 2007, Alex Riesen wrote:

> On 1/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > BTW IMHO we will probably never libify git; too many too complicated cases
> > exist already.
> 
> We probably don't need it libified completely. Just reading and writing of
> the object database and tags/references and very simple revision walking
> will be a very good start [...]

... and even these use xmalloc(), which die()s on out-of-memory. Note that 
it would be a really horrible work to libify that, since you basically 
have to insert gazillions of free() calls at the right point, which we 
don't have to, since exit() cleans up after us anyway.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-22 15:44                                                         ` Johannes Schindelin
@ 2007-01-22 16:09                                                           ` Alex Riesen
  2007-01-23 11:26                                                             ` Johannes Schindelin
  0 siblings, 1 reply; 54+ messages in thread
From: Alex Riesen @ 2007-01-22 16:09 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On 1/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > BTW IMHO we will probably never libify git; too many too complicated cases
> > > exist already.
> >
> > We probably don't need it libified completely. Just reading and writing of
> > the object database and tags/references and very simple revision walking
> > will be a very good start [...]
>
> ... and even these use xmalloc(), which die()s on out-of-memory. Note that
> it would be a really horrible work to libify that, since you basically
> have to insert gazillions of free() calls at the right point, which we
> don't have to, since exit() cleans up after us anyway.

Didn't say it'd be simple. Just not impossible.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-22 16:09                                                           ` Alex Riesen
@ 2007-01-23 11:26                                                             ` Johannes Schindelin
  2007-01-23 12:47                                                               ` Alex Riesen
  0 siblings, 1 reply; 54+ messages in thread
From: Johannes Schindelin @ 2007-01-23 11:26 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git

Hi,

On Mon, 22 Jan 2007, Alex Riesen wrote:

> On 1/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > > BTW IMHO we will probably never libify git; too many too 
> > > > complicated cases exist already.
> > >
> > > We probably don't need it libified completely. Just reading and writing of
> > > the object database and tags/references and very simple revision walking
> > > will be a very good start [...]
> > 
> > ... and even these use xmalloc(), which die()s on out-of-memory. Note that
> > it would be a really horrible work to libify that, since you basically
> > have to insert gazillions of free() calls at the right point, which we
> > don't have to, since exit() cleans up after us anyway.
> 
> Didn't say it'd be simple. Just not impossible.

I thought that was what you meant by "very simple revision walking".

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH] config_set_multivar(): disallow newlines in keys
  2007-01-23 11:26                                                             ` Johannes Schindelin
@ 2007-01-23 12:47                                                               ` Alex Riesen
  0 siblings, 0 replies; 54+ messages in thread
From: Alex Riesen @ 2007-01-23 12:47 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On 1/23/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > > We probably don't need it libified completely. Just reading and writing of
> > > > the object database and tags/references and very simple revision walking
> > > > will be a very good start [...]
> > >
> > > ... and even these use xmalloc(), which die()s on out-of-memory. Note that
> > > it would be a really horrible work to libify that, since you basically
> > > have to insert gazillions of free() calls at the right point, which we
> > > don't have to, since exit() cleans up after us anyway.
> >
> > Didn't say it'd be simple. Just not impossible.
>
> I thought that was what you meant by "very simple revision walking".
>

that'd be "just a part of revision walking". Dunno what part yet,
because I still have to come up with convincing example of where
only a libgit could be used: just committers and history browsers
should not have any problems just using plumbing (no, stupid
microsoft's createprocess is not a reason enough).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 2/1] Documentation/config.txt: Correct info about subsection name
  2007-01-22 15:25                                                   ` Jakub Narebski
@ 2007-01-24 14:14                                                     ` Jakub Narebski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Narebski @ 2007-01-24 14:14 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski

Contrary to variable values, in subsection names parsing character
escape codes (besides literal escaping of " as \", and \ as \\)
is not performed; subsection name cannot contain newlines.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
 Documentation/config.txt |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 77a2b16..d8244b1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -37,8 +37,8 @@ in the section header, like in example below:
 
 --------
 
-Subsection names can contain any characters (doublequote '`"`', backslash
-'`\`' and newline have to be entered escaped as '`\"`', '`\\`' and '`\n`',
+Subsection names can contain any characters except newline (doublequote
+'`"`' and backslash have to be escaped as '`\"`' and '`\\`',
 respecitvely) and are case sensitive.  Section header cannot span multiple
 lines.  Variables may belong directly to a section or to a given subsection.
 You can have `[section]` if you have `[section "subsection"]`, but you
-- 
1.4.4.4

^ permalink raw reply related	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2007-01-24 14:14 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-15  0:44 [RFC] Git config file reader in Perl (WIP) Jakub Narebski
2007-01-15  7:08 ` Eric Wong
2007-01-15  9:03   ` Jakub Narebski
2007-01-15  9:56     ` Eric Wong
2007-01-15 10:01       ` Shawn O. Pearce
2007-01-15 10:32       ` Jakub Narebski
2007-01-15 11:26         ` Eric Wong
2007-01-15 12:15           ` Johannes Schindelin
2007-01-15 15:34             ` Nikolai Weibull
2007-01-15 15:44               ` Johannes Schindelin
2007-01-15 16:22                 ` Nikolai Weibull
2007-01-15 16:00               ` Jakub Narebski
2007-01-16 10:45               ` Junio C Hamano
2007-01-16 11:12                 ` Johannes Schindelin
2007-01-16 14:14                   ` Jakub Narebski
2007-01-16 22:17                     ` Nikolai Weibull
2007-01-16 22:37                       ` Jakub Narebski
2007-01-16 22:56                         ` Johannes Schindelin
2007-01-16 23:24                           ` Jakub Narebski
2007-01-17  8:51                             ` Johannes Schindelin
2007-01-17  9:48                               ` Jakub Narebski
2007-01-17 10:44                                 ` Johannes Schindelin
2007-01-17 12:11                                   ` Jakub Narebski
2007-01-17 12:37                                     ` Johannes Schindelin
2007-01-17 14:00                                       ` Jakub Narebski
2007-01-19 12:10                                         ` Jakub Narebski
2007-01-19 12:25                                           ` Jakub Narebski
2007-01-19 13:20                                           ` Johannes Schindelin
2007-01-19 22:44                                             ` Jakub Narebski
2007-01-20  0:08                                               ` Johannes Schindelin
2007-01-20  0:59                                                 ` Jakub Narebski
2007-01-20  0:19                                               ` Junio C Hamano
2007-01-20  1:25                                                 ` [PATCH] config_set_multivar(): disallow newlines in keys Johannes Schindelin
2007-01-20  1:40                                                   ` Junio C Hamano
2007-01-22 15:06                                                   ` Alex Riesen
2007-01-22 15:21                                                     ` Johannes Schindelin
2007-01-22 15:33                                                       ` Alex Riesen
2007-01-22 15:44                                                         ` Johannes Schindelin
2007-01-22 16:09                                                           ` Alex Riesen
2007-01-23 11:26                                                             ` Johannes Schindelin
2007-01-23 12:47                                                               ` Alex Riesen
2007-01-20 14:03                                                 ` [PATCH] Documentation/config.txt: Document config file syntax better Jakub Narebski
2007-01-22 15:25                                                   ` Jakub Narebski
2007-01-24 14:14                                                     ` [PATCH 2/1] Documentation/config.txt: Correct info about subsection name Jakub Narebski
2007-01-16 22:42                       ` [RFC] Git config file reader in Perl (WIP) Johannes Schindelin
2007-01-17 18:08                         ` Nikolai Weibull
2007-01-17 19:22                           ` Jakub Narebski
2007-01-17 20:01                             ` Nikolai Weibull
2007-01-17 19:25                           ` Jakub Narebski
2007-01-18  0:50                           ` Johannes Schindelin
2007-01-16 19:09                   ` Eric Wong
2007-01-16  9:51             ` Eric Wong
2007-01-16 10:47               ` Johannes Schindelin
2007-01-16 19:53                 ` Eric Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.