All of lore.kernel.org
 help / color / mirror / Atom feed
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: Git List <git@vger.kernel.org>,
	Lars Schneider <larsxschneider@gmail.com>,
	Rich Felker <dalias@libc.org>, Junio C Hamano <gitster@pobox.com>,
	Kevin Daudt <me@ikke.info>
Subject: Re: [PATCH] utf8: handle systems that don't write BOM for UTF-16
Date: Sun, 10 Feb 2019 18:14:30 +0000	[thread overview]
Message-ID: <20190210181430.GA28510@genre.crustytoothpaste.net> (raw)
In-Reply-To: <CAPig+cRyzZMOM19ztgR_wqvk68P_1eNNVBBj5pbY=MhQm08WAw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1957 bytes --]

On Sat, Feb 09, 2019 at 08:45:16PM -0500, Eric Sunshine wrote:
> On Sat, Feb 9, 2019 at 3:08 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > [...]
> > Add a Makefile and #define knob, ICONV_NEEDS_BOM, that can be set if the
> > iconv implementation has this behavior. When set, Git will write a BOM
> > manually for UTF-16 and UTF-32 and then force the data to be written in
> > UTF-16BE or UTF-32BE. We choose big-endian behavior here because the
> > tests use the raw "UTF-16" encoding, which will be big-endian when the
> > implementation requires this knob to be set.
> 
> The name ICONV_NEEDS_BOM makes it sound as if we must feed a BOM
> _into_ 'iconv', which is quite confusing since the actual intention is
> that 'iconv' doesn't emit a BOM and we need to make up for the
> deficiency. Using a name such as ICONV_OMITS_BOM or ICONV_NEGLECTS_BOM
> makes it somewhat clearer that there is some deficiency with which we
> need to deal.

That does sound like a better name.

> > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> > ---
> > diff --git a/Makefile b/Makefile
> > @@ -259,6 +259,9 @@ all::
> > +# Define ICONV_NEEDS_BOM if your iconv implementation does not write a
> > +# byte-order mark (BOM) when writing UTF-16 or UTF-32.
> 
> Not a big deal, but I wonder if it would be helpful to tack on "...,
> in which case it outputs big-endian unconditionally." or something.

Sure, I can add that.

> Stray blank line before the closing brace.

Will fix.

> It's probably doesn't matter much with these two tiny functions, but I
> was wondering if it would make sense to maintain the &&-chain, perhaps
> like this:
> 
>     if test test_have_prereq NO_UTF32_BOM
>     then
>         printf '\x00\x00\xfe\xff'
>     fi &&
>     iconv -f UTF-8 -t UTF-32

Yeah, that sounds like a good idea.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

  reply	other threads:[~2019-02-10 18:14 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-07 21:59 t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux) Kevin Daudt
2019-02-08  0:17 ` brian m. carlson
2019-02-08  6:04   ` Rich Felker
2019-02-08 11:45     ` brian m. carlson
2019-02-08 11:55       ` Kevin Daudt
2019-02-08 13:51         ` brian m. carlson
2019-02-08 17:50           ` Junio C Hamano
2019-02-08 20:23             ` Kevin Daudt
2019-02-08 20:42               ` brian m. carlson
2019-02-08 23:12                 ` Junio C Hamano
2019-02-09  0:24                   ` brian m. carlson
2019-02-09 14:57                 ` Kevin Daudt
2019-02-09 20:08                   ` [PATCH] utf8: handle systems that don't write BOM for UTF-16 brian m. carlson
2019-02-10  1:45                     ` Eric Sunshine
2019-02-10 18:14                       ` brian m. carlson [this message]
2019-02-10  8:04                     ` Torsten Bögershausen
2019-02-10 18:55                       ` brian m. carlson
2019-02-11 17:14                         ` Junio C Hamano
2019-02-11  0:23                     ` [PATCH v2] " brian m. carlson
2019-02-11  1:16                       ` Eric Sunshine
2019-02-11  1:20                         ` brian m. carlson
2019-02-11  1:26                     ` [PATCH v3] " brian m. carlson
2019-02-11 21:43                       ` Kevin Daudt
2019-02-11 23:58                         ` brian m. carlson
2019-02-12  0:31                           ` Junio C Hamano
2019-02-12  0:53                             ` brian m. carlson
2019-02-12  2:43                               ` Junio C Hamano
2019-02-12  0:52                     ` [PATCH v4] " brian m. carlson
2019-02-08 16:13         ` t0028-working-tree-encoding.sh failing on musl based systems (Alpine Linux) Rich Felker
2019-02-09  8:09     ` Torsten Bögershausen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190210181430.GA28510@genre.crustytoothpaste.net \
    --to=sandals@crustytoothpaste.net \
    --cc=dalias@libc.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=me@ikke.info \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.