From: "Andrey Bienkowski via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Andrey Bienkowski <hexagonrecursion@gmail.com>,
Andrey Bienkowski <hexagonrecursion@gmail.com>
Subject: [PATCH v2] doc: clarify the filename encoding in git diff
Date: Tue, 20 Apr 2021 11:24:37 +0000
Message-ID: <pull.996.v2.git.git.1618917877881.gitgitgadget@gmail.com>
In-Reply-To: <pull.996.git.git.1618838856399.gitgitgadget@gmail.com>
From: Andrey Bienkowski <hexagonrecursion@gmail.com>
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files
modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and
diff --name-status was not documented.
I asked on the mailing list and got this:
https://public-inbox.org/git/YGx2EMHnwXWbp4ET@coredump.intra.peff.net/
> There's some discussion in Documentation/i18n.txt, which is included
> in various manpages (e.g.,
> https://git-scm.com/docs/git-log#_discussion) but it doesn't seem to
> be mentioned in git-diff.
>
> The short answer is: mostly utf8, but historically on platforms that
> don't care (like Linux) you could get away with other encodings.
>
> -Peff
My takeaway was to always parse it as utf8 regardless of platform or
environment.
Signed-off-by: Andrey Bienkowski <hexagonrecursion@gmail.com>
---
doc: clarify the filename encoding in git diff --name-only and
--name-status
AFAICT parsing the output of git diff --name-only master...feature is
the intended way of programmatically getting the list of files modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and diff
--name-status was not documented.
I asked on the mailing list and got this:
https://public-inbox.org/git/YGx2EMHnwXWbp4ET@coredump.intra.peff.net/
> There's some discussion in Documentation/i18n.txt, which is included
> in various manpages (e.g.,
> https://git-scm.com/docs/git-log#_discussion) but it doesn't seem to
> be mentioned in git-diff.
>
> The short answer is: mostly utf8, but historically on platforms that
> don't care (like Linux) you could get away with other encodings.
>
> -Peff
My takeaway was to always parse it as utf8 regardless of platform or
environment.
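For illustration only (this is not part of the patch), here is a minimal
Python sketch of the kind of parsing described above. It assumes the
caller passes `-z` so that paths are NUL-terminated and not quoted, and it
decodes as UTF-8 with the `surrogateescape` error handler so that names in
some other encoding do not raise. The function names `parse_name_only_z`
and `changed_files` are hypothetical.

```python
import subprocess
from typing import List


def parse_name_only_z(raw: bytes) -> List[str]:
    """Split the output of `git diff --name-only -z` into path strings."""
    # With -z every path is followed by a NUL byte, so splitting leaves
    # one empty trailing chunk; drop empty chunks.
    chunks = [chunk for chunk in raw.split(b"\0") if chunk]
    # Assumption: names are usually UTF-8. surrogateescape keeps any
    # non-UTF-8 bytes recoverable instead of raising UnicodeDecodeError.
    return [chunk.decode("utf-8", errors="surrogateescape") for chunk in chunks]


def changed_files(rev_range: str) -> List[str]:
    """Run git for a revision range and return the changed paths."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "-z", rev_range],
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_name_only_z(result.stdout)
```

For example, `changed_files("master...feature")` would list the paths
touched by the feature branch, assuming `git` is on PATH and the current
directory is inside the repository.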
Changes since v1:
* Replace "always" with "usually"
* Add a link to https://git-scm.com/docs/git-log
* Replace "usually" with "often"
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-996%2Fhexagonrecursion%2Futf8-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-996/hexagonrecursion/utf8-v2
Pull-Request: https://github.com/git/git/pull/996
Range-diff vs v1:
1: 4f1987e5e09c ! 1: 6daa652b7b15 doc: clarify the filename encoding in git diff
@@ Commit message
doc: clarify the filename encoding in git diff
AFAICT parsing the output of `git diff --name-only master...feature`
- is the intended way of programmatically getting the list of files modified
+ is the intended way of programmatically getting the list of files
+ modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and
diff --name-status was not documented.
I asked on the mailing list and got this:
https://public-inbox.org/git/YGx2EMHnwXWbp4ET@coredump.intra.peff.net/
- > There's some discussion in Documentation/i18n.txt, which is included in
+ > There's some discussion in Documentation/i18n.txt, which is included
+ in
various manpages (e.g., https://git-scm.com/docs/git-log#_discussion)
but it doesn't seem to be mentioned in git-diff.
>
@@ Documentation/diff-options.txt: explained for the configuration variable `core.q
--name-only::
- Show only names of changed files.
-+ Show only names of changed files. The file names are usually encoded in UTF-8.
++ Show only names of changed files. The file names are often encoded in UTF-8.
+ For more information see the discussion about encoding in the linkgit:git-log[1]
+ manual page.
--name-status::
Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean.
-+ Just like `--name-only` the file names are usually encoded in UTF-8.
++ Just like `--name-only` the file names are often encoded in UTF-8.
--submodule[=<format>]::
Specify how differences in submodules are shown. When specifying
Documentation/diff-options.txt | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index aa2b5c11f20b..69de49f977b6 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -293,11 +293,14 @@ explained for the configuration variable `core.quotePath` (see
linkgit:git-config[1]).
--name-only::
- Show only names of changed files.
+ Show only names of changed files. The file names are often encoded in UTF-8.
+ For more information see the discussion about encoding in the linkgit:git-log[1]
+ manual page.
--name-status::
Show only names and status of changed files. See the description
of the `--diff-filter` option on what the status letters mean.
+ Just like `--name-only` the file names are often encoded in UTF-8.
--submodule[=<format>]::
Specify how differences in submodules are shown. When specifying
base-commit: 48bf2fa8bad054d66bd79c6ba903c89c704201f7
--
gitgitgadget
Thread overview: 3+ messages
2021-04-19 13:27 [PATCH] doc: clarify the filename encoding in git diff Andrey Bienkowski via GitGitGadget
2021-04-19 21:33 ` Junio C Hamano
2021-04-20 11:24 ` Andrey Bienkowski via GitGitGadget [this message]