linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gabriel Paubert <paubert@iram.es>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-toolchains@vger.kernel.org,
	Masahiro Yamada <masahiroy@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH] kbuild: treat char as always signed
Date: Sat, 22 Oct 2022 08:06:21 +0200	[thread overview]
Message-ID: <Y1OIXdh3vWOMUlQK@lt-gp.iram.es> (raw)
In-Reply-To: <CAHk-=wiYtSvjyz5xz2Sbnmxgzg_=AL2OyTiRueUem3xzCzM8VA@mail.gmail.com>

On Fri, Oct 21, 2022 at 03:46:01PM -0700, Linus Torvalds wrote:
> On Thu, Oct 20, 2022 at 3:41 AM Gabriel Paubert <paubert@iram.es> wrote:
> >
> > I must miss something, the strcmp man page says:
> >
> > "The comparison is done using unsigned characters."
> 
> You're not missing anything, I just hadn't looked at strcmp() in forever.
> 
> Yeah, strcmp clearly doesn't care about the signedness of 'char', and
> arguably an unsigned char argument makes more sense considering the
> semantics of the funmction.
> 
> > But it's not for this that I wrote this message. Has anybody considered
> > using transparent unions?
> 
> I don't love the transparent union-as-argument syntax, but you're
> right, that would fix the warning.

I'm not in love with the syntax either.

> 
> Except it then doesn't actually *work* very well.
> 
> Try this:
> 
>         #include <sys/types.h>
> 
>         #if USE_UNION
>         typedef union {
>                 const char *a;
>                 const signed char *b;
>                 const unsigned char *c;
>         } conststring_arg __attribute__ ((__transparent_union__));
>         size_t strlen(conststring_arg);
>         #else
>         size_t strlen(const char *);
>         #endif
> 
>         int test(char *a, unsigned char *b)
>         {
>                 return strlen(a)+strlen(b);
>         }
> 
>         int test2(void)
>         {
>                 return strlen("hello");
>         }
> 
> and now compile it both ways with
> 
>         gcc -DUSE_UNION -Wall -O2 -S t.c
>         gcc -Wall -O2 -S t.c
> 

Ok, I´ve just tried it, except that I had something slightly different in
mind, but perhaps should have been clearer in my first post.

I have change your code to the following:


#include <sys/types.h>

#if USE_UNION
typedef union {
	const char *a;
	const signed char *b;
	const unsigned char *c;
} conststring_arg __attribute__ ((__transparent_union__));
static inline size_t strlen(conststring_arg p)
{
	return __builtin_strlen(p.a);
}
#else
size_t strlen(const char *);
#endif

int test(char *a, unsigned char *b)
{
	return strlen(a)+strlen(b);
}

int test2(void)
{
	return strlen("hello");
}

> and notice how yes, the "-DUSE_UNION" one silences the warning about
> using 'unsigned char *' for strlen. So it seems to work fine.
> 
> But then look at the code it generates for 'test2()" in the two cases.

Now test2 looks properly optimized.

This is a bit exploiting a compiler loophole, it calls an external
function which has been defined with the same name!

Depending on how you look at it, it's either disgusting or clever.

I don´t have clang installed, so I don't know whether it would swallow
this code or react with a strong allergy.

	Gabriel
> 
> The transparent union version actually generates a function call to an
> external 'strlen()' function.
> 
> The regular version uses the compiler builtin, and just compiles
> test2() to return the constant value 5.
> 
> So playing games with anonymous union arguments ends up also disabling
> all the compiler optimizations we do want, becaue apparently gcc then
> decides "ok, I'm not going to warn about you declaring this
> differently, but I'm also not going to use the regular one because you
> declared it differently".
> 
> This, btw, is also the reason why we don't use --freestanding in the
> kernel. We do want the basic <string.h> things to just DTRT.
> 
> For the sockaddr_in games, the above isn't an issue. For strlen() and
> friends, it very much is.
> 
>                        Linus


 


  reply	other threads:[~2022-10-22 11:01 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-19 16:26 [PATCH] kbuild: treat char as always signed Jason A. Donenfeld
2022-10-19 16:54 ` Segher Boessenkool
2022-10-19 17:14   ` Linus Torvalds
2022-10-19 17:26     ` Linus Torvalds
2022-10-19 18:10       ` Nick Desaulniers
2022-10-19 18:35         ` Linus Torvalds
2022-10-19 19:23           ` Andy Shevchenko
2022-10-19 19:36             ` Linus Torvalds
2022-10-19 17:43     ` Segher Boessenkool
2022-10-19 18:11       ` Linus Torvalds
2022-10-19 18:20         ` Nick Desaulniers
2022-10-19 18:56           ` Linus Torvalds
2022-10-19 19:11             ` Kees Cook
2022-10-19 19:30               ` Linus Torvalds
2022-10-19 20:35                 ` Jason A. Donenfeld
2022-10-20  0:10                   ` Linus Torvalds
2022-10-20  3:11                     ` Jason A. Donenfeld
2022-10-19 20:15             ` Segher Boessenkool
2022-10-19 21:07         ` David Laight
2022-10-19 21:26           ` Segher Boessenkool
2022-10-20 10:41         ` Gabriel Paubert
2022-10-21 22:46           ` Linus Torvalds
2022-10-22  6:06             ` Gabriel Paubert [this message]
2022-10-22 18:16               ` Linus Torvalds
2022-10-23 20:23                 ` Gabriel Paubert
2022-10-25 23:00                   ` Kees Cook
2022-10-26  0:04                     ` Jason A. Donenfeld
2022-10-26 15:41                       ` Kees Cook
2022-10-19 19:54 ` Linus Torvalds
2022-10-19 20:23   ` Jason A. Donenfeld
2022-10-19 20:30     ` [PATCH v2] kbuild: treat char as always unsigned Jason A. Donenfeld
2022-10-19 23:56       ` Linus Torvalds
2022-10-20  0:02         ` Jason A. Donenfeld
2022-10-20  0:38           ` Linus Torvalds
2022-10-20  2:59             ` Jason A. Donenfeld
2022-10-20 18:41             ` Kees Cook
2022-10-21  1:01               ` Jason A. Donenfeld
2022-10-20 20:24         ` Segher Boessenkool
2022-10-24  9:24       ` Dan Carpenter
2022-10-24  9:30         ` Dan Carpenter
2022-10-24 16:33           ` Jason A. Donenfeld
2022-10-24 17:10             ` Linus Torvalds
2022-10-24 17:17               ` Jason A. Donenfeld
2022-10-25 19:22                 ` Kalle Valo
2022-10-25 10:16               ` David Laight
2022-10-24 15:17         ` Jason A. Donenfeld
2022-12-21 14:53       ` Guenter Roeck
2022-12-21 15:05         ` Geert Uytterhoeven
2022-12-21 15:23           ` Guenter Roeck
2022-12-21 15:29           ` Rasmus Villemoes
2022-12-21 15:56             ` Guenter Roeck
2022-12-21 17:06               ` Linus Torvalds
2022-12-21 17:19                 ` Guenter Roeck
2022-12-21 18:46                   ` Linus Torvalds
2022-12-21 19:08                     ` Linus Torvalds
2022-12-21 21:01                     ` Guenter Roeck
2022-12-22 13:05                     ` Geert Uytterhoeven
2022-12-22 10:41                 ` David Laight
     [not found]                   ` <f02e0ac7f2d805020a7ba66803aaff3e31b5eeff.camel@t-online.de>
2022-12-24  9:47                     ` Geert Uytterhoeven
2022-12-30 11:39                     ` David Laight
2022-12-30 13:13                       ` David Laight
2023-01-02  8:29                       ` Geert Uytterhoeven
2022-12-21 17:49               ` Andreas Schwab
2022-12-21 16:57             ` Geert Uytterhoeven
2022-10-19 20:58   ` [PATCH] kbuild: treat char as always signed David Laight
2022-10-26  0:10   ` make ctype ascii only? (was [PATCH] kbuild: treat char as always signed) Rasmus Villemoes
2022-10-26 18:10     ` Linus Torvalds
2022-10-27  7:59       ` Rasmus Villemoes
2022-10-27 18:28         ` Linus Torvalds
     [not found] ` <202210201618.8XhEGsLd-lkp@intel.com>
2022-10-20 16:33   ` [PATCH] kbuild: treat char as always signed Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1OIXdh3vWOMUlQK@lt-gp.iram.es \
    --to=paubert@iram.es \
    --cc=Jason@zx2c4.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=keescook@chromium.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=segher@kernel.crashing.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).