Git Mailing List Archive on lore.kernel.org
From: Phillip Wood <phillip.wood123@gmail.com>
To: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	"Philip Oakley" <philipoakley@iee.email>
Subject: Re: [PATCH v5 6/8] config: parse more robust format in GIT_CONFIG_PARAMETERS
Date: Wed, 16 Dec 2020 20:01:41 +0000
Message-ID: <ccb476b8-9835-3810-c272-b74822fe74eb@gmail.com>
In-Reply-To: <d832f3dedf5bde4cd9389ddab734703ff2dbd5a1.1608104755.git.ps@pks.im>

Hi Patrick/Peff

On 16/12/2020 07:57, Patrick Steinhardt wrote:
> From: Jeff King <peff@peff.net>
> 
> When we stuff config options into GIT_CONFIG_PARAMETERS, we shell-quote
> each one as a single unit, like:
> 
>    'section.one=value1' 'section.two=value2'
> 
> On the reading side, we de-quote to get the individual strings, and then
> parse them by splitting on the first "=" we find. This format is
> ambiguous, because an "=" may appear in a subsection. So the config
> represented in a file by both:
> 
>    [section "subsection=with=equals"]
>    key = value
> 
> and:
> 
>    [section]
>    subsection = with=equals.key=value
> 
> ends up in this flattened format like:
> 
>    'section.subsection=with=equals.key=value'
> 
> and we can't tell which was desired. We have traditionally resolved this
> by taking the first "=" we see starting from the left, meaning that we
> allowed arbitrary content in the value, but not in the subsection.

I was just wondering what happens if a subsection name contains a single
quote. Can we handle that now, and how is it affected by this change?
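
To make the question concrete, here is a minimal shell sketch of what such
an entry would look like, assuming the writing side quotes each key the way
sq_quote() does (wrap in single quotes, turn each embedded ' into '\'').
The key name is a hypothetical example, and the sketch only shows the
quoting round trip in a POSIX shell; it does not check how the parser in
this patch treats the result.

```shell
#!/bin/sh
# Hypothetical key whose subsection name contains a single quote.
key="section.sub'name.key"

# Quote it the way sq_quote() does: wrap in single quotes and turn
# each embedded ' into '\'' (close quote, escaped quote, reopen quote).
quoted="'$(printf '%s' "$key" | sed "s/'/'\\\\''/g")'"
printf '%s\n' "$quoted"      # 'section.sub'\''name.key'

# A POSIX shell reads the quoted form back as the original key.
eval "roundtrip=$quoted"
printf '%s\n' "$roundtrip"   # section.sub'name.key
```

The quoted key contains several quote characters in a row, so any reader
that looks at the character right after a closing quote to pick between
the old and new formats has to finish de-quoting the whole key first.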

Best Wishes

Phillip

> Let's make our environment format a bit more robust by separately
> quoting the key and value. That turns those examples into:
> 
>    'section.subsection=with=equals.key'='value'
> 
> and:
> 
>    'section.subsection'='with=equals.key=value'
> 
> respectively, and we can tell the difference between them. We can detect
> which format is in use for any given element of the list based on the
> presence of the unquoted "=". That means we can continue to allow the
> old format to work to support any callers which manually used the old
> format, and we can even intermingle the two formats. The old format
> wasn't documented, and nobody was supposed to be using it. But it's
> likely that such callers exist in the wild, so it's nice if we can avoid
> breaking them. Likewise, it may be possible to trigger an older version
> of "git -c" that runs a script that calls into a newer version of "git
> -c"; that new version would see the intermingled format.
> 
> This does create one complication, which is that the obvious format in
> the new scheme for
> 
>    [section]
>    some-bool
> 
> is:
> 
>    'section.some-bool'
> 
> with no equals. We'd mistake that for an old-style variable. And it even
> has the same meaning in the old style, but:
> 
>    [section "with=equals"]
>    some-bool
> 
> does not. It would be:
> 
>    'section.with=equals=some-bool'
> 
> which we'd take to mean:
> 
>    [section]
>    with = equals=some-bool
> 
> in the old, ambiguous style. Likewise, we can't use:
> 
>    'section.some-bool'=''
> 
> because that's ambiguous with an actual empty string. Instead, we'll
> again use the shell-quoting to give us a hint, and use:
> 
>    'section.some-bool'=
> 
> to show that we have no value.
> 
> Note that this commit just expands the reading side. We'll start writing
> the new format via "git -c" in a future patch. In the meantime, the
> existing "git -c" tests will make sure we didn't break reading the old
> format. But we'll also add some explicit coverage of the two formats to
> make sure we continue to handle the old one after we move the writing
> side over.
> 
> And one final note: since we're now using the shell-quoting as a
> semantically meaningful hint, this closes the door to us ever allowing
> arbitrary shell quoting, like:
> 
>    'a'shell'would'be'ok'with'this'.key=value
> 
> But we have never supported that (only what sq_quote() would produce),
> and we are probably better off keeping things simple, robust, and
> backwards-compatible, than trying to make it easier for humans. We'll
> continue not to advertise the format of the variable to users, and
> instead keep "git -c" as the recommended mechanism for setting config
> (even if we are trying to be kind not to break users who may be relying
> on the current undocumented format).
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   config.c          | 69 +++++++++++++++++++++++++++++++++++------------
>   t/t1300-config.sh | 52 +++++++++++++++++++++++++++++++++++
>   2 files changed, 104 insertions(+), 17 deletions(-)
> 
> diff --git a/config.c b/config.c
> index 53ed048689..60a7261807 100644
> --- a/config.c
> +++ b/config.c
> @@ -541,14 +541,62 @@ int git_config_parse_parameter(const char *text,
>   	return ret;
>   }
>   
> +static int parse_config_env_list(char *env, config_fn_t fn, void *data)
> +{
> +	char *cur = env;
> +	while (cur && *cur) {
> +		const char *key = sq_dequote_step(cur, &cur);
> +		if (!key)
> +			return error(_("bogus format in %s"),
> +				     CONFIG_DATA_ENVIRONMENT);
> +
> +		if (!cur || isspace(*cur)) {
> +			/* old-style 'key=value' */
> +			if (git_config_parse_parameter(key, fn, data) < 0)
> +				return -1;
> +		}
> +		else if (*cur == '=') {
> +			/* new-style 'key'='value' */
> +			const char *value;
> +
> +			cur++;
> +			if (*cur == '\'') {
> +				/* quoted value */
> +				value = sq_dequote_step(cur, &cur);
> +				if (!value || (cur && !isspace(*cur))) {
> +					return error(_("bogus format in %s"),
> +						     CONFIG_DATA_ENVIRONMENT);
> +				}
> +			} else if (!*cur || isspace(*cur)) {
> +				/* implicit bool: 'key'= */
> +				value = NULL;
> +			} else {
> +				return error(_("bogus format in %s"),
> +					     CONFIG_DATA_ENVIRONMENT);
> +			}
> +
> +			if (config_parse_pair(key, value, fn, data) < 0)
> +				return -1;
> +		}
> +		else {
> +			/* unknown format */
> +			return error(_("bogus format in %s"),
> +				     CONFIG_DATA_ENVIRONMENT);
> +		}
> +
> +		if (cur) {
> +			while (isspace(*cur))
> +				cur++;
> +		}
> +	}
> +	return 0;
> +}
> +
>   int git_config_from_parameters(config_fn_t fn, void *data)
>   {
>   	const char *env = getenv(CONFIG_DATA_ENVIRONMENT);
>   	int ret = 0;
>   	char *envw;
> -	const char **argv = NULL;
> -	int nr = 0, alloc = 0;
> -	int i;
>   	struct config_source source;
>   
>   	if (!env)
> @@ -561,21 +609,8 @@ int git_config_from_parameters(config_fn_t fn, void *data)
>   
>   	/* sq_dequote will write over it */
>   	envw = xstrdup(env);
> +	ret = parse_config_env_list(envw, fn, data);
>   
> -	if (sq_dequote_to_argv(envw, &argv, &nr, &alloc) < 0) {
> -		ret = error(_("bogus format in %s"), CONFIG_DATA_ENVIRONMENT);
> -		goto out;
> -	}
> -
> -	for (i = 0; i < nr; i++) {
> -		if (git_config_parse_parameter(argv[i], fn, data) < 0) {
> -			ret = -1;
> -			goto out;
> -		}
> -	}
> -
> -out:
> -	free(argv);
>   	free(envw);
>   	cf = source.prev;
>   	return ret;
> diff --git a/t/t1300-config.sh b/t/t1300-config.sh
> index 36a60879f6..35a1a6e8b1 100755
> --- a/t/t1300-config.sh
> +++ b/t/t1300-config.sh
> @@ -1294,6 +1294,58 @@ test_expect_success 'git -c is not confused by empty environment' '
>   	GIT_CONFIG_PARAMETERS="" git -c x.one=1 config --list
>   '
>   
> +test_expect_success 'GIT_CONFIG_PARAMETERS handles old-style entries' '
> +	v="${SQ}key.one=foo${SQ}" &&
> +	v="$v  ${SQ}key.two=bar${SQ}" &&
> +	v="$v ${SQ}key.ambiguous=section.whatever=value${SQ}" &&
> +	GIT_CONFIG_PARAMETERS=$v git config --get-regexp "key.*" >actual &&
> +	cat >expect <<-EOF &&
> +	key.one foo
> +	key.two bar
> +	key.ambiguous section.whatever=value
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'GIT_CONFIG_PARAMETERS handles new-style entries' '
> +	v="${SQ}key.one${SQ}=${SQ}foo${SQ}" &&
> +	v="$v  ${SQ}key.two${SQ}=${SQ}bar${SQ}" &&
> +	v="$v ${SQ}key.ambiguous=section.whatever${SQ}=${SQ}value${SQ}" &&
> +	GIT_CONFIG_PARAMETERS=$v git config --get-regexp "key.*" >actual &&
> +	cat >expect <<-EOF &&
> +	key.one foo
> +	key.two bar
> +	key.ambiguous=section.whatever value
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'old and new-style entries can mix' '
> +	v="${SQ}key.oldone=oldfoo${SQ}" &&
> +	v="$v ${SQ}key.newone${SQ}=${SQ}newfoo${SQ}" &&
> +	v="$v ${SQ}key.oldtwo=oldbar${SQ}" &&
> +	v="$v ${SQ}key.newtwo${SQ}=${SQ}newbar${SQ}" &&
> +	GIT_CONFIG_PARAMETERS=$v git config --get-regexp "key.*" >actual &&
> +	cat >expect <<-EOF &&
> +	key.oldone oldfoo
> +	key.newone newfoo
> +	key.oldtwo oldbar
> +	key.newtwo newbar
> +	EOF
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'old and new bools with ambiguous subsection' '
> +	v="${SQ}key.with=equals.oldbool${SQ}" &&
> +	v="$v ${SQ}key.with=equals.newbool${SQ}=" &&
> +	GIT_CONFIG_PARAMETERS=$v git config --get-regexp "key.*" >actual &&
> +	cat >expect <<-EOF &&
> +	key.with equals.oldbool
> +	key.with=equals.newbool
> +	EOF
> +	test_cmp expect actual
> +'
> +
>   test_expect_success 'detect bogus GIT_CONFIG_PARAMETERS' '
>   	cat >expect <<-\EOF &&
>   	env.one one
> 
