All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gabriel Paubert <paubert@iram.es>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-toolchains@vger.kernel.org,
	Masahiro Yamada <masahiroy@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH] kbuild: treat char as always signed
Date: Sat, 22 Oct 2022 08:06:21 +0200	[thread overview]
Message-ID: <Y1OIXdh3vWOMUlQK@lt-gp.iram.es> (raw)
In-Reply-To: <CAHk-=wiYtSvjyz5xz2Sbnmxgzg_=AL2OyTiRueUem3xzCzM8VA@mail.gmail.com>

On Fri, Oct 21, 2022 at 03:46:01PM -0700, Linus Torvalds wrote:
> On Thu, Oct 20, 2022 at 3:41 AM Gabriel Paubert <paubert@iram.es> wrote:
> >
> > I must miss something, the strcmp man page says:
> >
> > "The comparison is done using unsigned characters."
> 
> You're not missing anything, I just hadn't looked at strcmp() in forever.
> 
> Yeah, strcmp clearly doesn't care about the signedness of 'char', and
> arguably an unsigned char argument makes more sense considering the
> semantics of the funmction.
> 
> > But it's not for this that I wrote this message. Has anybody considered
> > using transparent unions?
> 
> I don't love the transparent union-as-argument syntax, but you're
> right, that would fix the warning.

I'm not in love with the syntax either.

> 
> Except it then doesn't actually *work* very well.
> 
> Try this:
> 
>         #include <sys/types.h>
> 
>         #if USE_UNION
>         typedef union {
>                 const char *a;
>                 const signed char *b;
>                 const unsigned char *c;
>         } conststring_arg __attribute__ ((__transparent_union__));
>         size_t strlen(conststring_arg);
>         #else
>         size_t strlen(const char *);
>         #endif
> 
>         int test(char *a, unsigned char *b)
>         {
>                 return strlen(a)+strlen(b);
>         }
> 
>         int test2(void)
>         {
>                 return strlen("hello");
>         }
> 
> and now compile it both ways with
> 
>         gcc -DUSE_UNION -Wall -O2 -S t.c
>         gcc -Wall -O2 -S t.c
> 

Ok, I´ve just tried it, except that I had something slightly different in
mind, but perhaps should have been clearer in my first post.

I have change your code to the following:


#include <sys/types.h>

#if USE_UNION
typedef union {
	const char *a;
	const signed char *b;
	const unsigned char *c;
} conststring_arg __attribute__ ((__transparent_union__));
static inline size_t strlen(conststring_arg p)
{
	return __builtin_strlen(p.a);
}
#else
size_t strlen(const char *);
#endif

int test(char *a, unsigned char *b)
{
	return strlen(a)+strlen(b);
}

int test2(void)
{
	return strlen("hello");
}

> and notice how yes, the "-DUSE_UNION" one silences the warning about
> using 'unsigned char *' for strlen. So it seems to work fine.
> 
> But then look at the code it generates for 'test2()" in the two cases.

Now test2 looks properly optimized.

This is a bit exploiting a compiler loophole, it calls an external
function which has been defined with the same name!

Depending on how you look at it, it's either disgusting or clever.

I don´t have clang installed, so I don't know whether it would swallow
this code or react with a strong allergy.

	Gabriel
> 
> The transparent union version actually generates a function call to an
> external 'strlen()' function.
> 
> The regular version uses the compiler builtin, and just compiles
> test2() to return the constant value 5.
> 
> So playing games with anonymous union arguments ends up also disabling
> all the compiler optimizations we do want, becaue apparently gcc then
> decides "ok, I'm not going to warn about you declaring this
> differently, but I'm also not going to use the regular one because you
> declared it differently".
> 
> This, btw, is also the reason why we don't use --freestanding in the
> kernel. We do want the basic <string.h> things to just DTRT.
> 
> For the sockaddr_in games, the above isn't an issue. For strlen() and
> friends, it very much is.
> 
>                        Linus


 


  reply	other threads:[~2022-10-22 11:02 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-19 16:26 [PATCH] kbuild: treat char as always signed Jason A. Donenfeld
2022-10-19 16:54 ` Segher Boessenkool
2022-10-19 17:14   ` Linus Torvalds
2022-10-19 17:26     ` Linus Torvalds
2022-10-19 18:10       ` Nick Desaulniers
2022-10-19 18:35         ` Linus Torvalds
2022-10-19 19:23           ` Andy Shevchenko
2022-10-19 19:36             ` Linus Torvalds
2022-10-19 17:43     ` Segher Boessenkool
2022-10-19 18:11       ` Linus Torvalds
2022-10-19 18:20         ` Nick Desaulniers
2022-10-19 18:56           ` Linus Torvalds
2022-10-19 19:11             ` Kees Cook
2022-10-19 19:30               ` Linus Torvalds
2022-10-19 20:35                 ` Jason A. Donenfeld
2022-10-20  0:10                   ` Linus Torvalds
2022-10-20  3:11                     ` Jason A. Donenfeld
2022-10-19 20:15             ` Segher Boessenkool
2022-10-19 21:07         ` David Laight
2022-10-19 21:26           ` Segher Boessenkool
2022-10-20 10:41         ` Gabriel Paubert
2022-10-21 22:46           ` Linus Torvalds
2022-10-22  6:06             ` Gabriel Paubert [this message]
2022-10-22 18:16               ` Linus Torvalds
2022-10-23 20:23                 ` Gabriel Paubert
2022-10-25 23:00                   ` Kees Cook
2022-10-26  0:04                     ` Jason A. Donenfeld
2022-10-26 15:41                       ` Kees Cook
2022-10-19 19:54 ` Linus Torvalds
2022-10-19 20:23   ` Jason A. Donenfeld
2022-10-19 20:30     ` [PATCH v2] kbuild: treat char as always unsigned Jason A. Donenfeld
2022-10-19 23:56       ` Linus Torvalds
2022-10-20  0:02         ` Jason A. Donenfeld
2022-10-20  0:38           ` Linus Torvalds
2022-10-20  2:59             ` Jason A. Donenfeld
2022-10-20 18:41             ` Kees Cook
2022-10-21  1:01               ` Jason A. Donenfeld
2022-10-20 20:24         ` Segher Boessenkool
2022-10-24  9:24       ` Dan Carpenter
2022-10-24  9:30         ` Dan Carpenter
2022-10-24 16:33           ` Jason A. Donenfeld
2022-10-24 17:10             ` Linus Torvalds
2022-10-24 17:17               ` Jason A. Donenfeld
2022-10-25 19:22                 ` Kalle Valo
2022-10-25 10:16               ` David Laight
2022-10-24 15:17         ` Jason A. Donenfeld
2022-12-21 14:53       ` Guenter Roeck
2022-12-21 15:05         ` Geert Uytterhoeven
2022-12-21 15:23           ` Guenter Roeck
2022-12-21 15:29           ` Rasmus Villemoes
2022-12-21 15:56             ` Guenter Roeck
2022-12-21 17:06               ` Linus Torvalds
2022-12-21 17:19                 ` Guenter Roeck
2022-12-21 18:46                   ` Linus Torvalds
2022-12-21 19:08                     ` Linus Torvalds
2022-12-21 21:01                     ` Guenter Roeck
2022-12-22 13:05                     ` Geert Uytterhoeven
2022-12-22 10:41                 ` David Laight
     [not found]                   ` <f02e0ac7f2d805020a7ba66803aaff3e31b5eeff.camel@t-online.de>
2022-12-24  9:47                     ` Geert Uytterhoeven
2022-12-30 11:39                     ` David Laight
2022-12-30 13:13                       ` David Laight
2023-01-02  8:29                       ` Geert Uytterhoeven
2022-12-21 17:49               ` Andreas Schwab
2022-12-21 16:57             ` Geert Uytterhoeven
2022-10-19 20:58   ` [PATCH] kbuild: treat char as always signed David Laight
2022-10-26  0:10   ` make ctype ascii only? (was [PATCH] kbuild: treat char as always signed) Rasmus Villemoes
2022-10-26 18:10     ` Linus Torvalds
2022-10-27  7:59       ` Rasmus Villemoes
2022-10-27 18:28         ` Linus Torvalds
2022-10-20  8:39 ` [PATCH] kbuild: treat char as always signed kernel test robot
2022-10-20 16:33   ` Jason A. Donenfeld
2022-10-20 16:33     ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1OIXdh3vWOMUlQK@lt-gp.iram.es \
    --to=paubert@iram.es \
    --cc=Jason@zx2c4.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=keescook@chromium.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=segher@kernel.crashing.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.