linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Woodhouse <dwmw2@infradead.org>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	Edward Cree <ecree.xilinx@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	linux-kernel@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
	alsa-devel@alsa-project.org, coresight@lists.linaro.org,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-wired-lan@lists.osuosl.org, keyrings@vger.kernel.org,
	kvm@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-edac@vger.kernel.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-fpga@vger.kernel.org, linux-hwmon@vger.kernel.org,
	linux-iio@vger.kernel.org, linux-input@vger.kernel.org,
	linux-integrity@vger.kernel.org, linux-media@vger.kernel.org,
	linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-sgx@vger.kernel.org, linux-usb@vger.kernel.org,
	mjpeg-users@lists.sourceforge.net, netdev@vger.kernel.org,
	rcu@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII
Date: Tue, 11 May 2021 10:19:40 +0100	[thread overview]
Message-ID: <899f0da2727a131c3216ad0e03e9380b33b2d4ad.camel@infradead.org> (raw)
In-Reply-To: <20210511110002.2f187f01@coco.lan>

[-- Attachment #1: Type: text/plain, Size: 2207 bytes --]

On Tue, 2021-05-11 at 11:00 +0200, Mauro Carvalho Chehab wrote:
> Yet, this series has two positive side effects:
> 
>  - it helps people needing to touch the documents using non-utf8 locales[1];
>  - it makes easier to grep for a text;
> 
> [1] There are still some widely used distros nowadays (LTS ones?) that
>     don't set UTF-8 as default. Last time I installed a Debian machine
>     I had to explicitly set UTF-8 charset after install as the default
>     were using ASCII encoding (can't remember if it was Debian 10 or an
>     older version).

This whole line of thinking is fundamentally wrong.

A given set of characters in a "text file" are encoded with a specific
character set / encoding. To interpret that file and convert the bytes
back to characters, we need to use the *same* charset.

That charset is a property of the text file, and each text file or
piece of text in a system (like this email, which will contain a
Content-Type: header indicating the charset) might be encoded with a
*different* character set.

In the days before you could connect computers together — or before you
could exchange data between computers in different countries, at least
— perhaps it made sense to store 'text' files without explicitly noting
their encoding. And to interpret them using some kind of "default"
character set.

Those days are long gone. You're trying to work around an egregiously
stupid bug, if you're trying to pander to "default" encodings. There
*is* no default encoding that even makes sense, except perhaps UTF-8.
To *speak* of them as you did shows a misunderstanding of how broken
they are. It's *precisely* that kind of half-baked thinking which
always used to lead to stupid assumptions and double conversions and
Mojibake. Before we just standardised on UTF-8 everywhere and it
stopped mattering so much.

Just don't.

Now, you *can* make this work if you really insist on it, even for
systems with EBCDIC as their default encoding. Just make git do the
"convert to local charset" on checkout, precisely the same way as it
does CRLF for Windows systems. But it's stupid and anachronistic, so I
don't really see the point.


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

  reply	other threads:[~2021-05-11  9:20 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-10 10:26 [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII Mauro Carvalho Chehab
2021-05-10 10:26 ` [PATCH 06/53] docs: admin-guide: avoid using UTF-8 chars Mauro Carvalho Chehab
2021-05-10 18:40   ` Gabriel Krisman Bertazi
2021-05-12  8:44     ` Mauro Carvalho Chehab
2021-05-12  9:25       ` David Woodhouse
2021-05-12 10:22         ` Mauro Carvalho Chehab
2021-05-10 10:52 ` [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII Thorsten Leemhuis
2021-05-10 11:19   ` Mauro Carvalho Chehab
2021-05-10 12:27     ` Mauro Carvalho Chehab
2021-05-10 10:54 ` David Woodhouse
2021-05-10 11:55   ` Mauro Carvalho Chehab
2021-05-10 13:16     ` Edward Cree
2021-05-10 13:38       ` Mauro Carvalho Chehab
2021-05-10 13:58         ` Edward Cree
2021-05-10 13:59       ` Matthew Wilcox
2021-05-10 14:33         ` Edward Cree
2021-05-11  9:00           ` Mauro Carvalho Chehab
2021-05-11  9:19             ` David Woodhouse [this message]
2021-05-10 13:49     ` David Woodhouse
2021-05-10 19:22       ` Theodore Ts'o
2021-05-11  9:37         ` Mauro Carvalho Chehab
2021-05-11  9:25       ` Mauro Carvalho Chehab
2021-05-10 14:00     ` Ben Boeckel
2021-05-10 21:57 ` Adam Borowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=899f0da2727a131c3216ad0e03e9380b33b2d4ad.camel@infradead.org \
    --to=dwmw2@infradead.org \
    --cc=alsa-devel@alsa-project.org \
    --cc=corbet@lwn.net \
    --cc=coresight@lists.linaro.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=ecree.xilinx@gmail.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=keyrings@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fpga@vger.kernel.org \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=linux-iio@vger.kernel.org \
    --cc=linux-input@vger.kernel.org \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-sgx@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=mjpeg-users@lists.sourceforge.net \
    --cc=netdev@vger.kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).