From: Denis Kenzior <denkenz@gmail.com>
To: ell@lists.01.org
Subject: Re: ctype.h undefined behaviour on signed char platforms, needs cast to (unsigned char)
Date: Sat, 21 Nov 2020 21:23:38 -0600
Message-ID: <b1bf6af2-5163-65b7-6fa1-71d3e7640f4b@gmail.com>
In-Reply-To: <6174F377-F1D9-40D1-9DD5-567AFD638B17@bluewin.ch>

Hi Phil,

On 11/21/20 4:49 PM, Phil wrote:
> The defect is already in HEAD and not just in that patch: ell/util.c calls isprint and toupper without a cast.

Ok, now I gotcha.  I thought I eradicated all uses of ctype.h, but guess we 
still had some hiding.  Thanks for pointing this out.

I removed these in commit ef25e0072d283217fc12e422f628f1af0920242a.

>
> [My quick search for ctype did not find the macro definitions in
> "utf8.h". I didn't bother to look for the reserved identifiers to[a-z]+
> and is[a-z]+ because it is quite fiddly to refine the regexp to not also
> match a lot of other identifiers. So I had no idea that the l_ascii_is*
> macros in utf8.h even existed, thanks for the tip. But utf8.h is only a
> partial replacement for <ctype.h> because there are no l_ascii_to*
> variants, and the names are off-putting, it looks like they are just for
> ascii not utf8. What value do they add to the ctype originals anyway,
> apart from providing the cast?]

To answer your question about why the l_ascii stuff is in utf8.h: ASCII is a
subset of UTF-8, so it seemed logical, and it didn't seem worth adding another
header.

We avoid the ctype.h 'originals' like the plague because they're locale-based,
which we never need or want.  Then there's the casting issue that you already
pointed out.  It is also too easy to forget that the behavior changes depending
on locale, which can lead to subtle bugs.
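
To make the two points concrete, here is a rough sketch of what
locale-independent helpers can look like.  The names sketch_ascii_isprint and
sketch_ascii_toupper are made up for illustration; they are not the l_ascii_*
macros from utf8.h, and this is not how ell necessarily implements them.  The
range checks never consult the locale, and the conversion to unsigned char
means a byte >= 0x80 is never seen as a negative value on platforms where
plain char is signed.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not ell's l_ascii_* macros. */

/*
 * True for printable ASCII (0x20..0x7e), independent of the current locale.
 * Converting to unsigned char first keeps bytes >= 0x80 out of the negative
 * range when plain char is signed.
 */
static inline bool sketch_ascii_isprint(char c)
{
	unsigned char u = (unsigned char) c;

	return u >= 0x20 && u <= 0x7e;
}

/*
 * Upper-cases ASCII 'a'..'z' only; every other byte, including anything
 * >= 0x80, is returned unchanged.  Assumes an ASCII execution character set.
 */
static inline char sketch_ascii_toupper(char c)
{
	unsigned char u = (unsigned char) c;

	if (u >= 'a' && u <= 'z')
		return (char) (u - ('a' - 'A'));

	return c;
}
```

If a <ctype.h> function has to be called directly, the argument still needs
the cast, e.g. isprint((unsigned char) c), and the result still depends on the
current locale.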

Regards,
-Denis

Thread overview: 6+ messages
     [not found] <814B09B7-CAAB-487B-9F42-8C0A2169A015@bluewin.ch>
2020-11-21  9:52 ` ctype.h undefined behaviour on signed char platforms, needs cast to (unsigned char) Phil
2020-11-21 20:29   ` Denis Kenzior
2020-11-21 20:57     ` Andrew Zaborowski
2020-11-21 22:49       ` Phil
2020-11-21 23:18         ` Andrew Zaborowski
2020-11-22  3:23         ` Denis Kenzior [this message]
