git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Improve the documentation and test coverage for whitespace and comments
@ 2024-03-12 15:55 Dragan Simic
  2024-03-12 15:55 ` [PATCH 1/3] config.txt: describe whitespace characters further and more accurately Dragan Simic
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Dragan Simic @ 2024-03-12 15:55 UTC (permalink / raw)
  To: git; +Cc: gitster, rsbecker, github

Following up a recent discussion, [1] this small series addresses some
inaccuracy spotted in the git-config(1) documentation, performs some more
improvements and small cleanups, and adds a couple of automated tests for
improved whitespace and comment coverage.

[1] https://lore.kernel.org/git/pull.1681.v2.git.1709824540636.gitgitgadget@gmail.com/T/#m6372f6b5f2a26f7b1a6297241f5cbfa7a29f4fe5

Dragan Simic (3):
  config.txt: describe whitespace characters further and more accurately
  config.txt: perform some minor reformatting
  t1300: add tests for internal whitespace and inline comments

 Documentation/config.txt | 29 ++++++++++++++++-------------
 t/t1300-config.sh        | 20 ++++++++++++++++++++
 2 files changed, 36 insertions(+), 13 deletions(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] config.txt: describe whitespace characters further and more accurately
  2024-03-12 15:55 [PATCH 0/3] Improve the documentation and test coverage for whitespace and comments Dragan Simic
@ 2024-03-12 15:55 ` Dragan Simic
  2024-03-14  1:18   ` Junio C Hamano
  2024-03-12 15:55 ` [PATCH 2/3] config.txt: perform some minor reformatting Dragan Simic
  2024-03-12 15:55 ` [PATCH 3/3] t1300: add tests for internal whitespace and inline comments Dragan Simic
  2 siblings, 1 reply; 14+ messages in thread
From: Dragan Simic @ 2024-03-12 15:55 UTC (permalink / raw)
  To: git; +Cc: gitster, rsbecker, github

Make it more clear what the whitespace characters are in the context of git
configuration, improve the description of the trailing whitespace handling,
and correct the description of the value-internal whitespace handling.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---
 Documentation/config.txt | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 782c2bab906c..4480bb44203b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -22,9 +22,10 @@ multivalued.
 Syntax
 ~~~~~~
 
-The syntax is fairly flexible and permissive; whitespaces are mostly
-ignored.  The '#' and ';' characters begin comments to the end of line,
-blank lines are ignored.
+The syntax is fairly flexible and permissive.  Whitespace characters,
+which in this context are the space character (SP) and the horizontal
+tabulation (HT), are mostly ignored.  The '#' and ';' characters begin
+comments to the end of line.  Blank lines are ignored.
 
 The file consists of sections and variables.  A section begins with
 the name of the section in square brackets and continues until the next
@@ -64,12 +65,14 @@ The variable names are case-insensitive, allow only alphanumeric characters
 and `-`, and must start with an alphabetic character.
 
 A line that defines a value can be continued to the next line by
-ending it with a `\`; the backslash and the end-of-line are
-stripped.  Leading whitespaces after 'name =', the remainder of the
+ending it with a `\`; the backslash and the end-of-line are stripped.
+Leading whitespace characters after 'name =', the remainder of the
 line after the first comment character '#' or ';', and trailing
-whitespaces of the line are discarded unless they are enclosed in
-double quotes.  Internal whitespaces within the value are retained
-verbatim.
+whitespace characters of the line are discarded unless they are enclosed
+in double quotes.  This discarding of the trailing whitespace characters
+also applies after the remainder of the line after the comment character
+is discarded.  Any number of internal whitespace characters found within
+the value are converted to the same number of space (SP) characters.
 
 Inside double quotes, double quote `"` and backslash `\` characters
 must be escaped: use `\"` for `"` and `\\` for `\`.

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] config.txt: perform some minor reformatting
  2024-03-12 15:55 [PATCH 0/3] Improve the documentation and test coverage for whitespace and comments Dragan Simic
  2024-03-12 15:55 ` [PATCH 1/3] config.txt: describe whitespace characters further and more accurately Dragan Simic
@ 2024-03-12 15:55 ` Dragan Simic
  2024-03-14  1:58   ` Junio C Hamano
  2024-03-12 15:55 ` [PATCH 3/3] t1300: add tests for internal whitespace and inline comments Dragan Simic
  2 siblings, 1 reply; 14+ messages in thread
From: Dragan Simic @ 2024-03-12 15:55 UTC (permalink / raw)
  To: git; +Cc: gitster, rsbecker, github

Reformat a few lines a bit, to utilize the available horizontal space better.
There are no changes to the actual contents of the documentation.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---
 Documentation/config.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4480bb44203b..2fc4a52d8d76 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -58,11 +58,11 @@ compared case sensitively. These subsection names follow the same
 restrictions as section names.
 
 All the other lines (and the remainder of the line after the section
-header) are recognized as setting variables, in the form
-'name = value' (or just 'name', which is a short-hand to say that
-the variable is the boolean "true").
-The variable names are case-insensitive, allow only alphanumeric characters
-and `-`, and must start with an alphabetic character.
+header) are recognized as setting variables, in the form 'name = value'
+(or just 'name', which is a short-hand to say that the variable is the
+boolean "true").  The variable names are case-insensitive, allow only
+alphanumeric characters and `-`, and must start with an alphabetic
+character.
 
 A line that defines a value can be continued to the next line by
 ending it with a `\`; the backslash and the end-of-line are stripped.

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] t1300: add tests for internal whitespace and inline comments
  2024-03-12 15:55 [PATCH 0/3] Improve the documentation and test coverage for whitespace and comments Dragan Simic
  2024-03-12 15:55 ` [PATCH 1/3] config.txt: describe whitespace characters further and more accurately Dragan Simic
  2024-03-12 15:55 ` [PATCH 2/3] config.txt: perform some minor reformatting Dragan Simic
@ 2024-03-12 15:55 ` Dragan Simic
  2024-03-14  2:18   ` Junio C Hamano
  2 siblings, 1 reply; 14+ messages in thread
From: Dragan Simic @ 2024-03-12 15:55 UTC (permalink / raw)
  To: git; +Cc: gitster, rsbecker, github

Add a couple of additional automated tests, to improve the coverage of
configuration file entries whose values contain internal whitespace, or have
an additional inline comment.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---
 t/t1300-config.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 31c387868708..589af5e81d61 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -11,6 +11,26 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+cat > .git/config << EOF
+[section]
+	sparse = big 		 blue
+	annotated = big blue	# to be discarded
+EOF
+
+echo 'big    blue' > expect
+
+test_expect_success 'internal whitespace' '
+	git config --get section.sparse > output &&
+	test_cmp expect output
+'
+
+echo 'big blue' > expect
+
+test_expect_success 'inline comment' '
+	git config --get section.annotated > output &&
+	test_cmp expect output
+'
+
 test_expect_success 'clear default config' '
 	rm -f .git/config
 '

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] config.txt: describe whitespace characters further and more accurately
  2024-03-12 15:55 ` [PATCH 1/3] config.txt: describe whitespace characters further and more accurately Dragan Simic
@ 2024-03-14  1:18   ` Junio C Hamano
  2024-03-14  6:20     ` Dragan Simic
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2024-03-14  1:18 UTC (permalink / raw)
  To: Dragan Simic; +Cc: git, rsbecker, github

Dragan Simic <dsimic@manjaro.org> writes:

> Make it more clear what the whitespace characters are in the context of git
> configuration, improve the description of the trailing whitespace handling,
> and correct the description of the value-internal whitespace handling.
>
> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
> ---
>  Documentation/config.txt | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 782c2bab906c..4480bb44203b 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -22,9 +22,10 @@ multivalued.
>  Syntax
>  ~~~~~~
>  
> -The syntax is fairly flexible and permissive; whitespaces are mostly
> -ignored.  The '#' and ';' characters begin comments to the end of line,
> -blank lines are ignored.
> +The syntax is fairly flexible and permissive.  Whitespace characters,
> +which in this context are the space character (SP) and the horizontal
> +tabulation (HT), are mostly ignored.  The '#' and ';' characters begin
> +comments to the end of line.  Blank lines are ignored.

OK, except for "whitespace characters"---do we need to say
"whitespace characters", after we already listed HT and SP are the
ones, instead of just "whitespaces"?

> @@ -64,12 +65,14 @@ The variable names are case-insensitive, allow only alphanumeric characters
>  and `-`, and must start with an alphabetic character.
>  
>  A line that defines a value can be continued to the next line by
> +ending it with a `\`; the backslash and the end-of-line are stripped.
> +Leading whitespace characters after 'name =', the remainder of the
>  line after the first comment character '#' or ';', and trailing
> +whitespace characters of the line are discarded unless they are enclosed
> +in double quotes.  This discarding of the trailing whitespace characters
> +also applies after the remainder of the line after the comment character
> +is discarded.

"also" makes it sound as if we do it twice, once to remove trailing
whitespaces after the remainder of the line after '#", and then trim
the trailing whitespaces after we removed the comment.

I wonder if we can make it clearer by following the step-by-step
nature of the earlier part of the paragraph through.  We already say
the folded line processing is done first, so break things down in
conceptual phases/steps, perhaps like

 * The backslash at the end-of-line is removed, together with the
   end-of-line, to form a single long line.

 * Anything that come after the first unquoted comment character,
   either '#' or ';', are discarded.

 * The leading and trailing whitespaces around the value part
   (i.e. what follows 'name =') are discarded.

 * Remaining unquoted whitespaces inside the value part are munged.

> +Any number of internal whitespace characters found within
> +the value are converted to the same number of space (SP) characters.

The last one sounds like a bug to me, by the way.

At least the very original 17712991 (Add ".git/config" file parser,
2005-10-10) squashed a run of whitespace characters into a single
SP, which makes sense as a "clean-up".

But ebdaae37 (config: Keep inner whitespace verbatim, 2009-07-30),
while claiming to "Keep" inner whitespaces, broke it by replacing
any isspace() bytes that are not SP with SP, contradicting its
stated purpose.

As the latest change by the author of that change is from more than
10 years ago, I do not expect that he is still interested in this
part of the codebase, but thanks to a very clearly written log
message, we can read what the motivation behind that change was, and
seeing that what the code does contradicts with the stated
motivation we can safely declare that this is an ancient bug.

Fixing that bug can of course be left outside the series.  For those
who are looking for microproject ideas who discovered this message
by searching for the #leftoverbits keyword, one possible fix would
be to revert ebdaae37, make sure a value with any whitespace in it
gets quoted, and document clearly that an unquoted run of
whitespaces is squashed into a single SP.  Another way that is
milder is to finish what ebdaae37 wanted to do and retain the
whitespaces "verbatim".

Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] config.txt: perform some minor reformatting
  2024-03-12 15:55 ` [PATCH 2/3] config.txt: perform some minor reformatting Dragan Simic
@ 2024-03-14  1:58   ` Junio C Hamano
  2024-03-14  6:20     ` Dragan Simic
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2024-03-14  1:58 UTC (permalink / raw)
  To: Dragan Simic; +Cc: git, rsbecker, github

Dragan Simic <dsimic@manjaro.org> writes:

> Reformat a few lines a bit, to utilize the available horizontal space better.
> There are no changes to the actual contents of the documentation.

I was a bit surprised to see such a "preliminary clean-up" step to
come before the main change, not after, but separating this from the
change to the next paragraph, which is the main change in this series,
is nevertheless a very good idea.

>
> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
> ---
>  Documentation/config.txt | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 4480bb44203b..2fc4a52d8d76 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -58,11 +58,11 @@ compared case sensitively. These subsection names follow the same
>  restrictions as section names.
>  
>  All the other lines (and the remainder of the line after the section
> -header) are recognized as setting variables, in the form
> -'name = value' (or just 'name', which is a short-hand to say that
> -the variable is the boolean "true").
> -The variable names are case-insensitive, allow only alphanumeric characters
> -and `-`, and must start with an alphabetic character.
> +header) are recognized as setting variables, in the form 'name = value'
> +(or just 'name', which is a short-hand to say that the variable is the
> +boolean "true").  The variable names are case-insensitive, allow only
> +alphanumeric characters and `-`, and must start with an alphabetic
> +character.
>  
>  A line that defines a value can be continued to the next line by
>  ending it with a `\`; the backslash and the end-of-line are stripped.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] t1300: add tests for internal whitespace and inline comments
  2024-03-12 15:55 ` [PATCH 3/3] t1300: add tests for internal whitespace and inline comments Dragan Simic
@ 2024-03-14  2:18   ` Junio C Hamano
  2024-03-14  6:20     ` Dragan Simic
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2024-03-14  2:18 UTC (permalink / raw)
  To: Dragan Simic; +Cc: git, rsbecker, github

Dragan Simic <dsimic@manjaro.org> writes:

> Add a couple of additional automated tests, to improve the coverage of
> configuration file entries whose values contain internal whitespace, or have
> an additional inline comment.

While this may document the current behaviour, I am not sure of the
value of carving the current behaviour in stone, especially after
checking if the current behaviour is a bug.

> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
> ---
>  t/t1300-config.sh | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/t/t1300-config.sh b/t/t1300-config.sh
> index 31c387868708..589af5e81d61 100755
> --- a/t/t1300-config.sh
> +++ b/t/t1300-config.sh
> @@ -11,6 +11,26 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>  TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  
> +cat > .git/config << EOF
> +[section]
> +	sparse = big 		 blue
> +	annotated = big blue	# to be discarded
> +EOF
> +
> +echo 'big    blue' > expect
> +
> +test_expect_success 'internal whitespace' '
> +	git config --get section.sparse > output &&
> +	test_cmp expect output
> +'
> +
> +echo 'big blue' > expect
> +
> +test_expect_success 'inline comment' '
> +	git config --get section.annotated > output &&
> +	test_cmp expect output
> +'
> +
>  test_expect_success 'clear default config' '
>  	rm -f .git/config
>  '

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] config.txt: describe whitespace characters further and more accurately
  2024-03-14  1:18   ` Junio C Hamano
@ 2024-03-14  6:20     ` Dragan Simic
  2024-03-14 16:45       ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Dragan Simic @ 2024-03-14  6:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, rsbecker, github

On 2024-03-14 02:18, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
>> -The syntax is fairly flexible and permissive; whitespaces are mostly
>> -ignored.  The '#' and ';' characters begin comments to the end of 
>> line,
>> -blank lines are ignored.
>> +The syntax is fairly flexible and permissive.  Whitespace characters,
>> +which in this context are the space character (SP) and the horizontal
>> +tabulation (HT), are mostly ignored.  The '#' and ';' characters 
>> begin
>> +comments to the end of line.  Blank lines are ignored.
> 
> OK, except for "whitespace characters"---do we need to say
> "whitespace characters", after we already listed HT and SP are the
> ones, instead of just "whitespaces"?

I also spent some time thinking about that.  To me, the plural form, 
i.e.
"whitespaces", simply doesn't sound very good, because "whitespace" 
feels
to me more like a mass noun, and I really haven't seen it used in plural
form in other projects.

>>  A line that defines a value can be continued to the next line by
>> +ending it with a `\`; the backslash and the end-of-line are stripped.
>> +Leading whitespace characters after 'name =', the remainder of the
>>  line after the first comment character '#' or ';', and trailing
>> +whitespace characters of the line are discarded unless they are 
>> enclosed
>> +in double quotes.  This discarding of the trailing whitespace 
>> characters
>> +also applies after the remainder of the line after the comment 
>> character
>> +is discarded.
> 
> "also" makes it sound as if we do it twice, once to remove trailing
> whitespaces after the remainder of the line after '#", and then trim
> the trailing whitespaces after we removed the comment.

Good point, I also felt it the same way, but went with such wording 
simply
because I thought it should be more understandable to the users, despite
being technically a bit incorrect.

> I wonder if we can make it clearer by following the step-by-step
> nature of the earlier part of the paragraph through.  We already say
> the folded line processing is done first, so break things down in
> conceptual phases/steps, perhaps like
> 
>  * The backslash at the end-of-line is removed, together with the
>    end-of-line, to form a single long line.
> 
>  * Anything that come after the first unquoted comment character,
>    either '#' or ';', are discarded.
> 
>  * The leading and trailing whitespaces around the value part
>    (i.e. what follows 'name =') are discarded.
> 
>  * Remaining unquoted whitespaces inside the value part are munged.

Hmm, I'm not really sure that such a description would be more clear
to the users, despite being technically more correct.  I'll think a bit
more about it.

>> +Any number of internal whitespace characters found within
>> +the value are converted to the same number of space (SP) characters.
> 
> The last one sounds like a bug to me, by the way.
> 
> At least the very original 17712991 (Add ".git/config" file parser,
> 2005-10-10) squashed a run of whitespace characters into a single
> SP, which makes sense as a "clean-up".
> 
> But ebdaae37 (config: Keep inner whitespace verbatim, 2009-07-30),
> while claiming to "Keep" inner whitespaces, broke it by replacing
> any isspace() bytes that are not SP with SP, contradicting its
> stated purpose.

Thank you for the investigation.  The ebdaae37 commit certainly
introduced a bug to the value parsing, which presumably has remained
undected because the included test passes.

The way I see it, fixing the bug may actually be a breaking change,
because some user configurations may actually rely on the current
(mis)behavior.  This makes me somewhat afraid that fixing this bug,
which I already thought about, may actually do more harm than good.
However, fixing this bug seems to be only right thing to do, which
I'll explain further below.

> As the latest change by the author of that change is from more than
> 10 years ago, I do not expect that he is still interested in this
> part of the codebase, but thanks to a very clearly written log
> message, we can read what the motivation behind that change was, and
> seeing that what the code does contradicts with the stated
> motivation we can safely declare that this is an ancient bug.

Agreed, the evidence is clear.

> Fixing that bug can of course be left outside the series.  For those
> who are looking for microproject ideas who discovered this message
> by searching for the #leftoverbits keyword, one possible fix would
> be to revert ebdaae37, make sure a value with any whitespace in it
> gets quoted, and document clearly that an unquoted run of
> whitespaces is squashed into a single SP.  Another way that is
> milder is to finish what ebdaae37 wanted to do and retain the
> whitespaces "verbatim".

I already though about fixing the bug so the value parser actually does
what git-config(1) currently says, but as I already noted above, I'm
afraid a bit that fixing this bug may actually do more harm than good.

Though, further investigation shows that setting a configuration value,
by invoking git-config(1), converts value-internal tabs into "\t" escape
sequences, which the value-parsing logic doesn't "squash" into spaces.
That's why the test included in the ebdaae37 commit passes.  On the 
other
hand, value-internal literal tab characters, found in a configuration
file, do get "squashed" by the value-parsing logic, so I'd say that the
only right thing to do is to fix this bug by making the value-internal
whitespace characters preserved verbatim.

I'd be happy to include the bugfix into this series, if my 
above-mentioned
fears prove to be unnecessary.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] config.txt: perform some minor reformatting
  2024-03-14  1:58   ` Junio C Hamano
@ 2024-03-14  6:20     ` Dragan Simic
  2024-03-14 16:22       ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Dragan Simic @ 2024-03-14  6:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, rsbecker, github

On 2024-03-14 02:58, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>> Reformat a few lines a bit, to utilize the available horizontal space 
>> better.
>> There are no changes to the actual contents of the documentation.
> 
> I was a bit surprised to see such a "preliminary clean-up" step to
> come before the main change, not after, but separating this from the
> change to the next paragraph, which is the main change in this series,
> is nevertheless a very good idea.

The reason why this patch went as the second in the series was simply
because it's a somewhat unrelated cleanup that performs no actual 
changes
to the contents of the documentation.

Thus, the first patch in the series brings the changes, the second patch
stays in the same domain but only as a cleanup, and the third patch 
moves
to another domain.  I find it quite logical.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] t1300: add tests for internal whitespace and inline comments
  2024-03-14  2:18   ` Junio C Hamano
@ 2024-03-14  6:20     ` Dragan Simic
  0 siblings, 0 replies; 14+ messages in thread
From: Dragan Simic @ 2024-03-14  6:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, rsbecker, github

On 2024-03-14 03:18, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>> Add a couple of additional automated tests, to improve the coverage of
>> configuration file entries whose values contain internal whitespace, 
>> or have
>> an additional inline comment.
> 
> While this may document the current behaviour, I am not sure of the
> value of carving the current behaviour in stone, especially after
> checking if the current behaviour is a bug.

Please see my comments for the first patch in the series.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] config.txt: perform some minor reformatting
  2024-03-14  6:20     ` Dragan Simic
@ 2024-03-14 16:22       ` Junio C Hamano
  2024-03-14 18:40         ` Dragan Simic
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2024-03-14 16:22 UTC (permalink / raw)
  To: Dragan Simic; +Cc: git, rsbecker, github

Dragan Simic <dsimic@manjaro.org> writes:

>> I was a bit surprised to see such a "preliminary clean-up" step to
>> come before the main change, not after, but separating this from the
>> change to the next paragraph, which is the main change in this series,
>> is nevertheless a very good idea.
>
> The reason why this patch went as the second in the series was simply
> because it's a somewhat unrelated cleanup that performs no actual
> changes
> to the contents of the documentation.

It would have been understandable if it were left at the end, as
"after the dust settles".  It would made even more sense if it were
at the front, "before doing anything else, let's clean up the
mess--we do not intend to change the behaviour with this change at
all".  Having it in the middle was what made me surprised.

Generally, the order of preference is to do "preliminary clean-up"
first, followed by the real change.  That way, trivial clean-up that
is designed not to change any behaviour can go ahead and merged down
even before the real change solidifies.

Unrelated changes has no place in a series with a real purpose.
Unless the series is about "assorted clean-ups that are not related
with each other", that is.

Thanks.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] config.txt: describe whitespace characters further and more accurately
  2024-03-14  6:20     ` Dragan Simic
@ 2024-03-14 16:45       ` Junio C Hamano
  2024-03-14 18:48         ` Dragan Simic
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2024-03-14 16:45 UTC (permalink / raw)
  To: Dragan Simic; +Cc: git, rsbecker, github

Dragan Simic <dsimic@manjaro.org> writes:

> Though, further investigation shows that setting a configuration value,
> by invoking git-config(1), converts value-internal tabs into "\t" escape
> sequences, which the value-parsing logic doesn't "squash" into spaces.

Correct.  It would have been nicer to just quote values that had
whitespaces in them, but replacing HT to SP while turning HT that
comes from our tool into "\t" would still let the value round-trip,
while breaking anything written manually in editors.  If you stay
within Git without using any editor, what ebdaae37 (config: Keep
inner whitespace verbatim, 2009-07-30) left us is at least
internally consistent.

> I'd be happy to include the bugfix into this series, if my
> above-mentioned
> fears prove to be unnecessary.

Documenting status quo is a good place to stop for now.  I do not
know if it is a good idea to add too many tests to etch the current
behaviour that we know is wrong and we'll need to update when we fix
the bug, though.

Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] config.txt: perform some minor reformatting
  2024-03-14 16:22       ` Junio C Hamano
@ 2024-03-14 18:40         ` Dragan Simic
  0 siblings, 0 replies; 14+ messages in thread
From: Dragan Simic @ 2024-03-14 18:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, rsbecker, github

On 2024-03-14 17:22, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>>> I was a bit surprised to see such a "preliminary clean-up" step to
>>> come before the main change, not after, but separating this from the
>>> change to the next paragraph, which is the main change in this 
>>> series,
>>> is nevertheless a very good idea.
>> 
>> The reason why this patch went as the second in the series was simply
>> because it's a somewhat unrelated cleanup that performs no actual
>> changes
>> to the contents of the documentation.
> 
> It would have been understandable if it were left at the end, as
> "after the dust settles".  It would made even more sense if it were
> at the front, "before doing anything else, let's clean up the
> mess--we do not intend to change the behaviour with this change at
> all".  Having it in the middle was what made me surprised.
> 
> Generally, the order of preference is to do "preliminary clean-up"
> first, followed by the real change.  That way, trivial clean-up that
> is designed not to change any behaviour can go ahead and merged down
> even before the real change solidifies.

After thinking a bit more about it, I'd agree, especially because
such an approach makes accepting patches easier.  Thank you for
pointing that out!

> Unrelated changes has no place in a series with a real purpose.
> Unless the series is about "assorted clean-ups that are not related
> with each other", that is.

Having all in mind, especially the addition of a bugfix for the
value parsing into the series, I think it's the best if I take the
cleanup patch out of the series and send it separately.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] config.txt: describe whitespace characters further and more accurately
  2024-03-14 16:45       ` Junio C Hamano
@ 2024-03-14 18:48         ` Dragan Simic
  0 siblings, 0 replies; 14+ messages in thread
From: Dragan Simic @ 2024-03-14 18:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, rsbecker, github

On 2024-03-14 17:45, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>> Though, further investigation shows that setting a configuration 
>> value,
>> by invoking git-config(1), converts value-internal tabs into "\t" 
>> escape
>> sequences, which the value-parsing logic doesn't "squash" into spaces.
> 
> Correct.  It would have been nicer to just quote values that had
> whitespaces in them, but replacing HT to SP while turning HT that
> comes from our tool into "\t" would still let the value round-trip,
> while breaking anything written manually in editors.  If you stay
> within Git without using any editor, what ebdaae37 (config: Keep
> inner whitespace verbatim, 2009-07-30) left us is at least
> internally consistent.

Yes, but we already support unquoted values that contain whitespace
characters, and people use editors to configure variables.  For example,
I never use git-config(1) to make changes to my ~/.gitconfig file.

>> I'd be happy to include the bugfix into this series, if my
>> above-mentioned
>> fears prove to be unnecessary.
> 
> Documenting status quo is a good place to stop for now.  I do not
> know if it is a good idea to add too many tests to etch the current
> behaviour that we know is wrong and we'll need to update when we fix
> the bug, though.

But I already started to work on a bugfix?  I'm pretty much close to
completing the bugfix and doing some testing.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-03-14 18:48 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-12 15:55 [PATCH 0/3] Improve the documentation and test coverage for whitespace and comments Dragan Simic
2024-03-12 15:55 ` [PATCH 1/3] config.txt: describe whitespace characters further and more accurately Dragan Simic
2024-03-14  1:18   ` Junio C Hamano
2024-03-14  6:20     ` Dragan Simic
2024-03-14 16:45       ` Junio C Hamano
2024-03-14 18:48         ` Dragan Simic
2024-03-12 15:55 ` [PATCH 2/3] config.txt: perform some minor reformatting Dragan Simic
2024-03-14  1:58   ` Junio C Hamano
2024-03-14  6:20     ` Dragan Simic
2024-03-14 16:22       ` Junio C Hamano
2024-03-14 18:40         ` Dragan Simic
2024-03-12 15:55 ` [PATCH 3/3] t1300: add tests for internal whitespace and inline comments Dragan Simic
2024-03-14  2:18   ` Junio C Hamano
2024-03-14  6:20     ` Dragan Simic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).