* [PATCH] doc: clarify the filename encoding in git diff
@ 2021-04-19 13:27 Andrey Bienkowski via GitGitGadget
2021-04-19 21:33 ` Junio C Hamano
2021-04-20 11:24 ` [PATCH v2] " Andrey Bienkowski via GitGitGadget
From: Andrey Bienkowski via GitGitGadget @ 2021-04-19 13:27 UTC
To: git; +Cc: Andrey Bienkowski, Andrey Bienkowski
From: Andrey Bienkowski <hexagonrecursion@gmail.com>
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and
diff --name-status was not documented.
I asked on the mailing list and got this:
https://public-inbox.org/git/YGx2EMHnwXWbp4ET@coredump.intra.peff.net/
> There's some discussion in Documentation/i18n.txt, which is included in
> various manpages (e.g., https://git-scm.com/docs/git-log#_discussion)
> but it doesn't seem to be mentioned in git-diff.
>
> The short answer is: mostly utf8, but historically on platforms that
> don't care (like Linux) you could get away with other encodings.
>
> -Peff
My takeaway was to always parse it as utf8 regardless of platform or
environment.
Signed-off-by: Andrey Bienkowski <hexagonrecursion@gmail.com>
---
Changes since v1:
* Replace "always" with "often"
* Add a link to https://git-scm.com/docs/git-log
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-996%2Fhexagonrecursion%2Futf8-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-996/hexagonrecursion/utf8-v1
Pull-Request: https://github.com/git/git/pull/996
Documentation/diff-options.txt | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index aa2b5c11f20b..4ce36ef535ba 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -293,11 +293,14 @@ explained for the configuration variable `core.quotePath` (see
linkgit:git-config[1]).
--name-only::
- Show only names of changed files.
+ Show only names of changed files. The file names are usually encoded in UTF-8.
+ For more information see the discussion about encoding in the linkgit:git-log[1]
+ manual page.
--name-status::
Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean.
+ Just like `--name-only` the file names are usually encoded in UTF-8.
--submodule[=<format>]::
Specify how differences in submodules are shown. When specifying
base-commit: 48bf2fa8bad054d66bd79c6ba903c89c704201f7
--
gitgitgadget
* Re: [PATCH] doc: clarify the filename encoding in git diff
From: Junio C Hamano @ 2021-04-19 21:33 UTC
To: Andrey Bienkowski via GitGitGadget; +Cc: git, Andrey Bienkowski
"Andrey Bienkowski via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> Signed-off-by: Andrey Bienkowski <hexagonrecursion@gmail.com>
> ---
> My takeaway was to always parse it as utf8 regardless of platform or
> environment.
>
> Changes since v1:
I do not think the readers on the list have seen the "v1", but
anyway, the
> * Replace "always" with "often"
"often" here sounds more measured than ...
> --name-only::
> - Show only names of changed files.
> + Show only names of changed files. The file names are usually encoded in UTF-8.
> + For more information see the discussion about encoding in the linkgit:git-log[1]
> + manual page.
... "usually" here ...
> --name-status::
> Show only names and status of changed files. See the description
> of the `--diff-filter` option on what the status letters mean.
> + Just like `--name-only` the file names are usually encoded in UTF-8.
... and here.
Thanks.
* [PATCH v2] doc: clarify the filename encoding in git diff
From: Andrey Bienkowski via GitGitGadget @ 2021-04-20 11:24 UTC
To: git; +Cc: Andrey Bienkowski, Andrey Bienkowski
From: Andrey Bienkowski <hexagonrecursion@gmail.com>
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files
modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and
diff --name-status was not documented.
I asked on the mailing list and got this:
https://public-inbox.org/git/YGx2EMHnwXWbp4ET@coredump.intra.peff.net/
> There's some discussion in Documentation/i18n.txt, which is included
> in various manpages (e.g., https://git-scm.com/docs/git-log#_discussion)
> but it doesn't seem to be mentioned in git-diff.
>
> The short answer is: mostly utf8, but historically on platforms that
> don't care (like Linux) you could get away with other encodings.
>
> -Peff
My takeaway was to always parse it as utf8 regardless of platform or
environment.
Signed-off-by: Andrey Bienkowski <hexagonrecursion@gmail.com>
---
Changes since v1:
* Replace "always" with "usually"
* Add a link to https://git-scm.com/docs/git-log
* Replace "usually" with "often"
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-996%2Fhexagonrecursion%2Futf8-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-996/hexagonrecursion/utf8-v2
Pull-Request: https://github.com/git/git/pull/996
Range-diff vs v1:
1: 4f1987e5e09c ! 1: 6daa652b7b15 doc: clarify the filename encoding in git diff
@@ Commit message
doc: clarify the filename encoding in git diff
AFAICT parsing the output of `git diff --name-only master...feature`
- is the intended way of programmatically getting the list of files modified
+ is the intended way of programmatically getting the list of files
+ modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and
diff --name-status was not documented.
I asked on the mailing list and got this:
https://public-inbox.org/git/YGx2EMHnwXWbp4ET@coredump.intra.peff.net/
- > There's some discussion in Documentation/i18n.txt, which is included in
+ > There's some discussion in Documentation/i18n.txt, which is included
+ in
various manpages (e.g., https://git-scm.com/docs/git-log#_discussion)
but it doesn't seem to be mentioned in git-diff.
>
@@ Documentation/diff-options.txt: explained for the configuration variable `core.q
--name-only::
- Show only names of changed files.
-+ Show only names of changed files. The file names are usually encoded in UTF-8.
++ Show only names of changed files. The file names are often encoded in UTF-8.
+ For more information see the discussion about encoding in the linkgit:git-log[1]
+ manual page.
--name-status::
Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean.
-+ Just like `--name-only` the file names are usually encoded in UTF-8.
++ Just like `--name-only` the file names are often encoded in UTF-8.
--submodule[=<format>]::
Specify how differences in submodules are shown. When specifying
Documentation/diff-options.txt | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index aa2b5c11f20b..69de49f977b6 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -293,11 +293,14 @@ explained for the configuration variable `core.quotePath` (see
linkgit:git-config[1]).
--name-only::
- Show only names of changed files.
+ Show only names of changed files. The file names are often encoded in UTF-8.
+ For more information see the discussion about encoding in the linkgit:git-log[1]
+ manual page.
--name-status::
Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean.
+ Just like `--name-only` the file names are often encoded in UTF-8.
--submodule[=<format>]::
Specify how differences in submodules are shown. When specifying
base-commit: 48bf2fa8bad054d66bd79c6ba903c89c704201f7
--
gitgitgadget
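
Editor's note: the use case described in the commit message (getting the
list of files changed on a feature branch from a script) can be sketched
roughly as follows. This is a minimal illustration, not something taken
from the patch. It assumes Python 3, assumes `git` is on PATH, and uses
the `-z` option so that path names are printed verbatim and NUL-terminated
instead of being quoted according to `core.quotePath`. The helper names
`parse_name_only` and `changed_files` are hypothetical. Because the names
are only "often" UTF-8, the sketch decodes with `surrogateescape` so that
a non-UTF-8 name survives a round trip rather than raising an error.

```python
import subprocess
from typing import List


def parse_name_only(raw: bytes) -> List[str]:
    """Split NUL-terminated `git diff --name-only -z` output into names.

    Names are decoded as UTF-8. Bytes that are not valid UTF-8 are kept
    as lone surrogates (surrogateescape), so the original bytes can be
    recovered with name.encode("utf-8", "surrogateescape").
    """
    return [
        chunk.decode("utf-8", errors="surrogateescape")
        for chunk in raw.split(b"\0")
        if chunk  # skip the empty piece after the final NUL
    ]


def changed_files(base: str, feature: str) -> List[str]:
    """Return the files changed on `feature` since it diverged from `base`.

    Hypothetical helper: runs git in the current working directory.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", "-z", f"{base}...{feature}"],
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_name_only(result.stdout)
```

Calling `changed_files("master", "feature")` inside a repository would
correspond to the `git diff --name-only master...feature` invocation
mentioned above. Whether strict UTF-8 decoding or a lenient fallback is
the better choice depends on the platforms and repositories involved.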