Date: Sat, 17 Apr 2021 13:46:23 +0200
From: Willy Tarreau
To: Peter Zijlstra
Cc: Matthew Wilcox, Miguel Ojeda, Wedson Almeida Filho, Miguel Ojeda,
	Linus Torvalds, Greg Kroah-Hartman, rust-for-linux@vger.kernel.org,
	Linux Kbuild mailing list, Linux Doc Mailing List, linux-kernel
Subject: Re: [PATCH 00/13] [RFC] Rust support
Message-ID: <20210417114623.GA15120@1wt.eu>
References: <20210414184604.23473-1-ojeda@kernel.org>
	<20210416161444.GA10484@1wt.eu>
	<20210416180829.GO2531743@casper.infradead.org>

On Sat, Apr 17, 2021 at 01:17:21PM +0200, Peter Zijlstra wrote:
> Well, I think the rules actually make sense, at the point in the syntax
> tree where + happens, we have 'unsigned char' and 'int', so at that
> point we promote to 'int'. Subsequently 'int' gets shifted and bad
> things happen.

That's always the problem caused by signedness being applied to the
type while modern machines do not care about that and use it during
(or even after) the operation instead :-/

We'd need to define some macros to zero-extend and sign-extend some
values to avoid such issues. I'm sure this would be more intuitive
than trying to guess how many casts (and in what order) to place to
make sure an operation works as desired.

> The 'unsigned long' doesn't happen until quite a bit later.
>
> Anyway, the rules are imo fairly clear and logical, but yes they can be
> annoying. The really silly thing here is that << and >> have UB at all,
> and I would love a -fwrapv style flag that simply defines it. Yes it
> will generate worse code in some cases, but having the UB there is just
> stupid.

I'd also love to have a UB-less mode with well defined semantics for
plenty of operations that are known to work well on modern machines,
like integer wrapping, bit shifts ignoring higher bits etc. Lots of
stuff we often have to write useless code for, just to please the
compiler.

> That of course doesn't help your case here, it would simply misbehave
> and not be UB.
>
> Another thing the C rules cannot really express is a 32x32->64
> multiplication, some (older) versions of GCC can be tricked into it, but
> mostly it just doesn't want to do that sanely and the C rules are
> absolutely no help there.

For me the old trick of casting one side as long long still works:

  unsigned long long mul3264(unsigned int a, unsigned int b)
  {
        return (unsigned long long)a * b;
  }

i386:
  00000000 <mul3264>:
     0: 8b 44 24 08             mov    0x8(%esp),%eax
     4: f7 64 24 04             mull   0x4(%esp)
     8: c3                      ret

x86_64:
  0000000000000000 <mul3264>:
     0: 89 f8                   mov    %edi,%eax
     2: 89 f7                   mov    %esi,%edi
     4: 48 0f af c7             imul   %rdi,%rax
     8: c3                      retq

Or maybe you had something else in mind?

Willy
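
To make the zero-extend/sign-extend idea above a bit more concrete, here
is a rough sketch of what such helpers could look like. It assumes the
fixed-width types from <stdint.h> and a 32-bit int; the names zext8(),
sext8(), zext32(), sext32() and byte_to_bits_24_31() are made up for
illustration and are not existing kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: the caller states the extension explicitly
 * instead of relying on the implicit promotion to int. */

/* Zero-extend an 8-bit value to 64 bits. */
static inline uint64_t zext8(uint8_t v)
{
	return v;
}

/* Sign-extend an 8-bit value to 64 bits. The xor/subtract form avoids
 * the implementation-defined conversion of an out-of-range value to a
 * narrower signed type. */
static inline int64_t sext8(uint8_t v)
{
	return (int64_t)(v ^ 0x80) - 0x80;
}

/* Zero-extend a 32-bit value to 64 bits. */
static inline uint64_t zext32(uint32_t v)
{
	return v;
}

/* Sign-extend a 32-bit value to 64 bits, same trick as sext8(). */
static inline int64_t sext32(uint32_t v)
{
	return (int64_t)(v ^ UINT32_C(0x80000000)) - INT64_C(0x80000000);
}

/* Place a byte in bits 24..31 of a 64-bit word. Written as plain
 * `b << 24`, b would first be promoted to int, and for b >= 0x80 the
 * result would not fit in a 32-bit int (undefined behaviour under the
 * C11 rules). Widening first keeps the shift well defined. */
static inline uint64_t byte_to_bits_24_31(uint8_t b)
{
	return zext8(b) << 24;
}

/* Small usage example for the helpers above. */
void extend_selfcheck(void)
{
	assert(zext8(0xff) == 0xff);
	assert(sext8(0xff) == -1);
	assert(byte_to_bits_24_31(0x80) == UINT64_C(0x80000000));
}
```

With helpers of this kind the intended width and signedness are visible
at the call site, so there is no need to guess how many casts are
required or in which order.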